<html>
<head>
  <title>mistletoe | version 0.5.2</title>
  <meta charset="UTF-8" />
  <meta name="description" content="A fast, extensible Markdown parser in Python." />
  <meta name="keywords" content="Markdown,Python,LaTeX,HTML" />
  <meta name="author" content="Mi Yu" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <link rel="stylesheet" href="style.css" type="text/css" />
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/styles/github.min.css">
  <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/highlight.min.js"></script>
  <script>hljs.initHighlightingOnLoad();</script>
</head>
<body><h1>mistletoe<img src='https://cdn.rawgit.com/miyuchina/mistletoe/master/resources/logo.svg' align='right'></h1>
<p><a href="https://travis-ci.org/miyuchina/mistletoe"><img src="https://img.shields.io/travis/miyuchina/mistletoe.svg?style=flat-square" title="" alt="Build Status"></a>
<a href="https://coveralls.io/github/miyuchina/mistletoe?branch=master"><img src="https://img.shields.io/coveralls/miyuchina/mistletoe.svg?style=flat-square" title="" alt="Coverage Status"></a>
<a href="https://pypi.python.org/pypi/mistletoe"><img src="https://img.shields.io/pypi/v/mistletoe.svg?style=flat-square" title="" alt="PyPI"></a>
<a href="https://pypi.python.org/pypi/mistletoe"><img src="https://img.shields.io/pypi/wheel/mistletoe.svg?style=flat-square" title="" alt="is wheel"></a>
</p>
<p>mistletoe is a Markdown parser in pure Python, designed to be fast, modular
and fully customizable.
</p>
<p>mistletoe is not simply a Markdown-to-HTML transpiler. It is designed, from
the start, to parse Markdown into an abstract syntax tree. You can swap out
renderers for different output formats, without touching any of the core
components.
</p>
<p>Remember to spell mistletoe in lowercase!
</p>
<h2>Features
</h2>
<ul>
<li><p><strong>Fast</strong>: mistletoe is as fast as the <a href="https://github.com/lepture/mistune">fastest implementation</a>
  currently available: that is, over 4 times faster than
  <a href="https://github.com/waylan/Python-Markdown">Python-Markdown</a>, and much faster than
  <a href="https://github.com/trentm/python-markdown2">Python-Markdown2</a>.
  
  See the <a href="#performance">performance</a> section for details.
</p>
</li>
<li><p><strong>Modular</strong>: mistletoe is designed with modularity in mind. Its initial
  goal is to provide a clear and easy API to extend upon.
</p>
</li>
<li><strong>Customizable</strong>: as of now, mistletoe can render Markdown documents to LaTeX, HTML and an abstract syntax tree out of the box. Writing a new renderer for mistletoe is a relatively trivial task.</li>
</ul>
<h2>Installation
</h2>
<p>mistletoe requires Python 3.3 and above, including Python 3.7, the current
development branch. It is also tested on PyPy 5.8.0. Install mistletoe with
pip:
</p>
<pre><code class="lang-sh">pip3 install mistletoe
</code></pre>
<p>Alternatively, clone the repo:
</p>
<pre><code class="lang-sh">git clone https://github.com/miyuchina/mistletoe.git
cd mistletoe
pip3 install -e .
</code></pre>
<p>See the <a href="contributing.html">contributing</a> doc for how to contribute to mistletoe.
</p>
<h2>Usage
</h2>
<h3>Basic usage</h3>
<p>Here&#x27;s how you can use mistletoe in a Python script:
</p>
<pre><code class="lang-python">import mistletoe

with open(&#x27;foo.md&#x27;, &#x27;r&#x27;) as fin:
    rendered = mistletoe.markdown(fin)
</code></pre>
<p><code>mistletoe.markdown()</code> uses mistletoe&#x27;s default settings: allowing HTML mixins
and rendering to HTML. The function also accepts an additional argument
<code>renderer</code>. To produce LaTeX output:
</p>
<pre><code class="lang-python">import mistletoe
from mistletoe.latex_renderer import LaTeXRenderer

with open(&#x27;foo.md&#x27;, &#x27;r&#x27;) as fin:
    rendered = mistletoe.markdown(fin, LaTeXRenderer)
</code></pre>
<p>Finally, here&#x27;s how you would manually specify extra tokens and a renderer
for mistletoe. In the following example, we render the AST with <code>HtmlRenderer</code>,
which adds the <code>HtmlBlock</code> and <code>HtmlSpan</code> tokens to the normal parsing
process.
</p>
<pre><code class="lang-python">from mistletoe import Document, HtmlRenderer

with open(&#x27;foo.md&#x27;, &#x27;r&#x27;) as fin:
    with HtmlRenderer() as renderer:
        rendered = renderer.render(Document(fin))
</code></pre>
<h3>From the command-line</h3>
<p>Installing mistletoe with pip also installs its command-line utility. Type the following
directly into your shell:
</p>
<pre><code class="lang-sh">mistletoe foo.md
</code></pre>
<p>This will transpile <code>foo.md</code> into HTML, and dump the output to stdout. To save
the HTML, direct the output into a file:
</p>
<pre><code class="lang-sh">mistletoe foo.md &gt; out.html
</code></pre>
<p>You can pass in custom renderers by including the full path to your renderer
class after a <code>-r</code> or <code>--renderer</code> flag:
</p>
<pre><code class="lang-sh">mistletoe foo.md --renderer custom_renderer.CustomRenderer
</code></pre>
<p>Running <code>mistletoe</code> without specifying a file will land you in interactive
mode.  Like Python&#x27;s REPL, interactive mode allows you to test how your
Markdown will be interpreted by mistletoe:
</p>
<pre><code>mistletoe [version 0.5.2] (interactive)
Type Ctrl-D to complete input, or Ctrl-C to exit.
&gt;&gt;&gt; some **bold text**
... and some *italics*
... ^D
&lt;html&gt;
&lt;body&gt;
&lt;p&gt;some &lt;strong&gt;bold text&lt;/strong&gt; and some &lt;em&gt;italics&lt;/em&gt;&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;
&gt;&gt;&gt;
</code></pre>
<p>The interactive mode also accepts the <code>--renderer</code> flag.
</p>
<h2>Performance
</h2>
<p>mistletoe is the fastest Markdown parser implementation available in pure
Python; that is, on par with <a href="https://github.com/lepture/mistune">mistune</a>. Try the benchmarks yourself by
running:
</p>
<pre><code class="lang-sh">python3 test/benchmark.py
</code></pre>
<p>One significant bottleneck of mistletoe compared to mistune, however,
is function call overhead. Unlike mistune, mistletoe splits its
functionality into modules, so function lookups can take significantly longer
than they do in mistune.
</p>
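<p>The cost of an extra function call can be observed with a small, standard-library-only experiment. This is a rough sketch and not the mistletoe benchmark: the function names and the workload are made up for illustration, and the timings will vary by machine and interpreter.
</p>

```python
import timeit

def add_inline(values):
    # All work happens inside one function body.
    total = 0
    for v in values:
        total += v + 1
    return total

def increment(v):
    return v + 1

def add_with_calls(values):
    # Same result, but every step pays for a name lookup and a call.
    total = 0
    for v in values:
        total += increment(v)
    return total

values = list(range(1000))
inline_time = timeit.timeit(lambda: add_inline(values), number=200)
calls_time = timeit.timeit(lambda: add_with_calls(values), number=200)
print(f"inline: {inline_time:.4f}s, with calls: {calls_time:.4f}s")
```

<p>Both functions return the same value; only the time spent differs.
</p>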
<p>To boost performance further, consider running mistletoe on PyPy.
Benchmark results show that on PyPy, mistletoe is about <strong>twice as fast</strong> as
mistune:
</p>
<pre><code class="lang-sh">$ pypy3 test/benchmark.py mistune mistletoe
Test document: test/samples/syntax.md
Test iterations: 1000
Running tests with mistune, mistletoe...
========================================
mistune: 13.524028996936977
mistletoe: 6.477352762129158
</code></pre>
<p>The above result was achieved on PyPy 5.8.0-beta0, on a 13-inch Retina MacBook
Pro (Early 2015).
</p>
<h2>Developer&#x27;s Guide
</h2>
<p>Here&#x27;s an example to add GitHub-style wiki links to the parsing process,
and provide a renderer for this new token.
</p>
<h3>A new token</h3>
<p>GitHub wiki links are span-level tokens, meaning that they reside inline,
and don&#x27;t really look like chunky paragraphs. To write a new span-level
token, all we need to do is make a subclass of <code>SpanToken</code>:
</p>
<pre><code class="lang-python">from mistletoe.span_token import SpanToken

class GithubWiki(SpanToken):
    pass
</code></pre>
<p>mistletoe uses regular expressions to search for span-level tokens in the
parsing process. As a refresher, a GitHub wiki link looks something like this:
<code>[[alternative text | target]]</code>. We define a class variable, <code>pattern</code>,
that stores the compiled regex:
</p>
<pre><code class="lang-python">import re

class GithubWiki(SpanToken):
    # group 1: alternative text; group 2: link target
    pattern = re.compile(r&quot;\[\[ *(.+?) *\| *(.+?) *\]\]&quot;)
    def __init__(self, match_obj):
        pass
</code></pre>
<p>For spiritual guidance on regexes, refer to <a href="https://xkcd.com/208/">xkcd</a> classics. For an
actual representation of this author parsing Markdown with regexes, refer
to this brilliant <a href="http://www.greghendershott.com/img/grumpy-regexp-parser.png">meme</a> by <a href="http://www.greghendershott.com/2013/11/markdown-parser-redesign.html">Greg Hendershott</a>.
</p>
<p>mistletoe&#x27;s span-level tokenizer will search for our pattern. When it finds
a match, it will pass in the match object as argument into our constructor.
We have defined our regex so that the first match group is the alternative
text, and the second one is the link target.
</p>
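<p>To see the two match groups concretely, the pattern can be tried on its own with the standard <code>re</code> module. This is only a sketch: the sample string and the variable names are made up for illustration.
</p>

```python
import re

# The same pattern as in the GithubWiki class.
pattern = re.compile(r"\[\[ *(.+?) *\| *(.+?) *\]\]")

sample = "See [[the manual | Manual-Page]] for details."  # hypothetical input
match = pattern.search(sample)

alt_text = match.group(1)  # first group: alternative text
target = match.group(2)    # second group: link target
print(alt_text, "->", target)
```

<p>The optional spaces around the pipe are matched outside the groups, so neither group carries leading or trailing whitespace.
</p>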
<p>Note that alternative text can also contain other span-level tokens.  For
example, <code>[[*alt*|link]]</code> is a GitHub link with an <code>Emphasis</code> token as its
child.  To parse child tokens, simply pass <code>match_obj</code> to the <code>super</code>
constructor (which assumes children to be in <code>match_obj.group(1)</code>),
and save off all the additional attributes we need:
</p>
<pre><code class="lang-python">import re
from mistletoe.span_token import SpanToken

class GithubWiki(SpanToken):
    pattern = re.compile(r&quot;\[\[ *(.+?) *\| *(.+?) *\]\]&quot;)
    def __init__(self, match_obj):
        super().__init__(match_obj)
        self.target = match_obj.group(2)
</code></pre>
<p>There you go: a new token in 7 lines of code.
</p>
<h3>A new renderer</h3>
<p>Adding a custom token to the parsing process usually involves a lot
of nasty implementation details. Fortunately, mistletoe takes care
of most of them for you. Simply passing your custom token class to
<code>super().__init__()</code> does the trick:
</p>
<pre><code class="lang-python">from mistletoe.html_renderer import HtmlRenderer

class GithubWikiRenderer(HtmlRenderer):
    def __init__(self):
        super().__init__(GithubWiki)
</code></pre>
<p>We then only need to tell mistletoe how to render our new token:
</p>
<pre><code class="lang-python">def render_github_wiki(self, token):
    template = &#x27;&lt;a href=&quot;{target}&quot;&gt;{inner}&lt;/a&gt;&#x27;
    target = token.target
    inner = self.render_inner(token)
    return template.format(target=target, inner=inner)
</code></pre>
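<p>Judging from the names in this example, the render method is named after the token class: <code>GithubWiki</code> is handled by <code>render_github_wiki</code>. The helper below is a hypothetical sketch of such a CamelCase-to-snake_case mapping, written with the standard library only. It is not mistletoe&#x27;s actual lookup code.
</p>

```python
import re

def render_method_name(class_name):
    # Hypothetical helper: split "GithubWiki" into ["Github", "Wiki"],
    # then join the lowercased parts with underscores.
    words = re.findall(r"[A-Z][a-z0-9]*", class_name)
    return "render_" + "_".join(word.lower() for word in words)

print(render_method_name("GithubWiki"))
```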
<p>Cleaning up, we have our new renderer class:
</p>
<pre><code class="lang-python">from mistletoe.html_renderer import HtmlRenderer, escape_url

class GithubWikiRenderer(HtmlRenderer):
    def __init__(self):
        super().__init__(GithubWiki)

    def render_github_wiki(self, token):
        template = &#x27;&lt;a href=&quot;{target}&quot;&gt;{inner}&lt;/a&gt;&#x27;
        target = escape_url(token.target)
        inner = self.render_inner(token)
        return template.format(target=target, inner=inner)
</code></pre>
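<p>The cleaned-up version escapes the link target before placing it in the <code>href</code> attribute. The sketch below shows why that matters. It assumes <code>urllib.parse.quote</code> as a stand-in for <code>escape_url</code>, whose exact behavior may differ, and <code>render_link</code> is a made-up name.
</p>

```python
from urllib.parse import quote

def render_link(target, inner):
    # Stand-in for escape_url: percent-encode characters that are
    # unsafe in a URL, such as spaces and double quotes.
    template = '<a href="{target}">{inner}</a>'
    return template.format(target=quote(target, safe="/#:"), inner=inner)

print(render_link("my page", "<em>alt</em>"))
```

<p>Without escaping, a double quote in the target would end the attribute early and let the rest of the target leak into the tag.
</p>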
<h3>Take it for a spin?</h3>
<p>It is preferred that all mistletoe&#x27;s renderers be used as context managers.
This is to ensure that your custom tokens are cleaned up properly, so that
you can parse other Markdown documents with different token types in the
same program.
</p>
<pre><code class="lang-python">from mistletoe import Document
from contrib.github_wiki import GithubWikiRenderer

with open(&#x27;foo.md&#x27;, &#x27;r&#x27;) as fin:
    with GithubWikiRenderer() as renderer:
        rendered = renderer.render(Document(fin))
</code></pre>
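<p>The clean-up idea behind the context manager can be illustrated with a toy example. Everything here is hypothetical: <code>active_tokens</code> and <code>ToyRenderer</code> are made-up names, and mistletoe&#x27;s real bookkeeping may work differently.
</p>

```python
# Toy registry standing in for a parser's list of active span tokens.
active_tokens = ["Emphasis", "Strong"]

class ToyRenderer:
    def __init__(self, *extras):
        self.extras = list(extras)

    def __enter__(self):
        # Register the extra tokens when the block starts.
        active_tokens.extend(self.extras)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Remove them again, even if rendering raised an exception.
        for name in self.extras:
            active_tokens.remove(name)
        return False

with ToyRenderer("GithubWiki"):
    inside = list(active_tokens)
after = list(active_tokens)
print(inside, after)
```

<p>Inside the <code>with</code> block the extra token is active; once the block exits, the registry is back to its original state, so a later document is parsed without it.
</p>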
<p>For more info, take a look at the <code>base_renderer</code> module in mistletoe.
The docstrings might give you a more granular idea of customizing mistletoe
to your needs.
</p>
<h2>Why mistletoe?
</h2>
<p>For me, the question becomes: why not <a href="https://github.com/lepture/mistune">mistune</a>? My original
motivation really has nothing to do with starting a competition. Here&#x27;s a list
of reasons I created mistletoe in the first place:
</p>
<ul>
<li>I am interested in a Markdown-to-LaTeX transpiler in Python.</li>
<li>I want to write more Python.</li>
<li>&quot;How hard could it be?&quot;</li>
<li>&quot;For fun,&quot; says David Beazley.</li>
</ul>
<p>Here are two things mistune inspired mistletoe to do:
</p>
<ul>
<li>Markdown parsers should be fast, and other parser implementations in Python leave much to be desired.</li>
<li>A parser implementation for Markdown does not need to restrict itself to one flavor of Markdown.</li>
</ul>
<p>Here are three things mistletoe does differently from mistune:
</p>
<ul>
<li>Per its <a href="https://github.com/lepture/mistune">readme</a>, mistune will always be a single-file script. mistletoe breaks its functionality into modules.</li>
<li>mistune, as of now, can only render Markdown into HTML. It is relatively trivial to write a new renderer for mistletoe.</li>
<li>Unlike mistune, mistletoe is pushing for some extent of spec compliance with CommonMark.</li>
</ul>
<p>The implications of these are quite profound, and there&#x27;s no definite
this-is-better-than-that answer. Mistune is near perfect if one wants what
it provides: I have used mistune extensively in the past, and had a great
experience. If you want more control, however, give mistletoe a try.
</p>
<h2>Copyright &amp; License
</h2>
<ul>
<li>mistletoe&#x27;s logo uses artwork by Daniele De Santis, under <a href="https://creativecommons.org/licenses/by/3.0/us/">CC BY 3.0</a>.</li>
<li>mistletoe is released under <a href="LICENSE">MIT</a>.</li>
</ul>
</body></html>