1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138
|
pandocfilters
=============
A python module for writing pandoc filters. Pandoc filters
are pipes that read a JSON serialization of the Pandoc AST
from stdin, transform it in some way, and write it to stdout.
They can be used with pandoc (>= 1.12) either using pipes:::
pandoc -t json -s | ./caps.py | pandoc -f json
or using the ``--filter`` (or ``-F``) command-line option:::
pandoc --filter ./caps.py -s
For more on pandoc filters, see the pandoc documentation under ``--filter``
and `the tutorial on writing filters`__.
__ http://johnmacfarlane.net/pandoc/scripting.html
To install::
python setup.py install
The ``pandocfilters`` module exports the following functions:
``walk(x, action, format, meta)``
Walk a tree, applying an action to every object.
Returns a modified tree.
``toJSONFilter(action)``
Converts an action into a filter that reads a JSON-formatted
pandoc document from stdin, transforms it by walking the tree
with the action, and returns a new JSON-formatted pandoc document
to stdout. The argument is a function action(key, value, format, meta),
where key is the type of the pandoc object (e.g. 'Str', 'Para'),
value is the contents of the object (e.g. a string for 'Str',
a list of inline elements for 'Para'), format is the target
output format (which will be taken for the first command line
argument if present), and meta is the document's metadata.
If the function returns None, the object to which it applies
will remain unchanged. If it returns an object, the object will
be replaced. If it returns a list, the list will be spliced in to
the list to which the target object belongs. (So, returning an
empty list deletes the object.)
``stringify(x)``
Walks the tree ``x`` and returns concatenated string content,
leaving out all formatting.
``attributes(attrs)``
Returns an attribute list, constructed from the
dictionary ``attrs``.
Most users will only need ``toJSONFilter``. Here is a simple example
of its use:::
#!/usr/bin/env python
"""
Pandoc filter to convert all regular text to uppercase.
Code, link URLs, etc. are not affected.
"""
from pandocfilters import toJSONFilter, Str
def caps(key, value, format, meta):
if key == 'Str':
return Str(value.upper())
if __name__ == "__main__":
toJSONFilter(caps)
Examples
--------
The examples subdirectory in the source repository contains the
following filters. These filters should provide a useful starting point
for developing your own pandocfilters.
- ``abc.py``
Pandoc filter to process code blocks with class ``abc`` containing ABC
notation into images. Assumes that abcm2ps and ImageMagick's convert
are in the path. Images are put in the abc-images directory.
- ``caps.py``
Pandoc filter to convert all regular text to uppercase. Code, link
URLs, etc. are not affected.
- ``comments.py``
Pandoc filter that causes everything between
``<!-- BEGIN COMMENT -->`` and ``<!-- END COMMENT -->`` to be ignored.
The comment lines must appear on lines by themselves, with blank
lines surrounding
- ``deemph.py``
Pandoc filter that causes emphasized text to be displayed in ALL
CAPS.
- ``deflists.py``
Pandoc filter to convert definition lists to bullet lists with the
defined terms in strong emphasis (for compatibility with standard
markdown).
- ``graphviz.py``
Pandoc filter to process code blocks with class ``graphviz`` into
graphviz-generated images.
- ``metavars.py``
Pandoc filter to allow interpolation of metadata fields into a
document. ``%{fields}`` will be replaced by the field's value, assuming
it is of the type ``MetaInlines`` or ``MetaString``.
- ``myemph.py``
Pandoc filter that causes emphasis to be rendered using the custom
macro ``\myemph{...}`` rather than ``\emph{...}`` in latex. Other output
formats are unaffected.
- ``theorem.py``
Pandoc filter to convert divs with ``class="theorem"`` to LaTeX theorem
environments in LaTeX output, and to numbered theorems in HTML
output.
- ``tikz.py``
Pandoc filter to process raw latex tikz environments into images.
Assumes that pdflatex is in the path, and that the standalone
package is available. Also assumes that ImageMagick's convert is in
the path. Images are put in the ``tikz-images`` directory.
|