1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208
|
# HTML
`fpdf2` supports basic rendering from HTML.
This is implemented by using `html.parser.HTMLParser` from the Python standard library.
The whole HTML 5 specification is **not** supported, and neither is CSS,
but bug reports & contributions are very welcome to improve this.
_cf._ [Supported HTML features](#supported-html-features) below for details on its current limitations.
For a more robust & feature-full HTML-to-PDF converter in Python,
you may want to check [Reportlab](https://www.reportlab.com) (or [xhtml2pdf](https://pypi.org/project/xhtml2pdf/) based on it), [WeasyPrint](https://weasyprint.org)
or [borb](https://github.com/jorisschellekens/borb-examples/#76-exporting-html-as-pdf).
## write_html usage example
HTML rendering requires the use of [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html):
```python
from fpdf import FPDF
pdf = FPDF()
pdf.add_page()
pdf.write_html("""
<dl>
<dt>Description title</dt>
<dd>Description Detail</dd>
</dl>
<h1>Big title</h1>
<section>
<h2>Section title</h2>
<p><b>Hello</b> world. <u>I am</u> <i>tired</i>.</p>
<p><a href="https://github.com/py-pdf/fpdf2">py-pdf/fpdf2 GitHub repo</a></p>
<p align="right">right aligned text</p>
<p>i am a paragraph <br>in two parts.</p>
<font color="#00ff00"><p>hello in green</p></font>
<font size="7"><p>hello small</p></font>
<font face="helvetica"><p>hello helvetica</p></font>
<font face="times"><p>hello times</p></font>
</section>
<section>
<h2>Other section title</h2>
<ul type="circle">
<li>unordered</li>
<li>list</li>
<li>items</li>
</ul>
<ol start="3" type="i">
<li>ordered</li>
<li>list</li>
<li>items</li>
</ol>
<br>
<br>
<pre>i am preformatted text.</pre>
<br>
<blockquote>hello blockquote</blockquote>
<table width="50%">
<thead>
<tr>
<th width="30%">ID</th>
<th width="70%">Name</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Alice</td>
</tr>
<tr>
<td>2</td>
<td>Bob</td>
</tr>
</tbody>
</table>
</section>
""")
pdf.output("html.pdf")
```
### Styling HTML tags globally
_New in [:octicons-tag-24: 2.7.9](https://github.com/py-pdf/fpdf2/blob/master/CHANGELOG.md)_
The style of several HTML tags (`<a>`, `<blockquote>`, `<code>`, `<pre>`, `<h1>`, `<h2>`, `<h3>`...) can be set globally, for the whole HTML document, by passing `tag_styles` to `FPDF.write_html()`:
```python
from fpdf import FPDF, FontFace
pdf = FPDF()
pdf.add_page()
pdf.write_html("""
<h1>Big title</h1>
<section>
<h2>Section title</h2>
<p>Hello world!</p>
</section>
""", tag_styles={
"h1": FontFace(color="#948b8b", size_pt=32),
"h2": FontFace(color="#948b8b", size_pt=24),
})
pdf.output("html_styled.pdf")
```
Similarly, the indentation of several HTML tags (`<blockquote>`, `<dd>`, `<li>`) can be set globally, for the whole HTML document, by passing `tag_styles` to `FPDF.write_html()`:
```python
from fpdf import FPDF, TextStyle
pdf = FPDF()
pdf.add_page()
pdf.write_html("""
<dl>
<dt>Term</dt>
<dd>Definition</dd>
</dl>
<blockquote>
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed non risus.
Suspendisse lectus tortor, dignissim sit amet, adipiscing nec, ultricies sed, dolor.
Cras elementum ultrices diam.
</blockquote>
""", tag_styles={
"dd": TextStyle(l_margin=5),
"blockquote": TextStyle(color="#ccc", font_style="I",
t_margin=5, b_margin=5, l_margin=10),
})
pdf.output("html_dd_indented.pdf")
```
⚠️ Note that this styling is currently only supported for a subset of all HTML tags,
and that some [`FontFace`](https://py-pdf.github.io/fpdf2/fpdf/fonts.html#fpdf.fonts.FontFace) or [`TextStyle`](https://py-pdf.github.io/fpdf2/fpdf/fonts.html#fpdf.fonts.TextStyle) properties may not be honored.
However, **Pull Request are welcome** to implement missing features!
### Default font
_New in [:octicons-tag-24: 2.8.0](https://github.com/py-pdf/fpdf2/blob/master/CHANGELOG.md)_
The default font used by [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html) is **Times**.
You can change this default font by passing `font_family` to this method:
```python
from fpdf import FPDF
pdf = FPDF()
pdf.add_page()
pdf.write_html("""
<h1>Big title</h1>
<section>
<h2>Section title</h2>
<p>Hello world!</p>
</section>
""", font_family="Helvetica")
pdf.output("html_helvetica.pdf")
```
## Supported HTML features
* `<h1>` to `<h6>`: headings (and `align` attribute)
* `<p>`: paragraphs (and `align`, `line-height` attributes)
* `<br>` & `<hr>` tags
* `<b>`, `<i>`, `<s>`, `<u>`: bold, italic, strikethrough, underline
* `<font>`: (and `face`, `size`, `color` attributes)
* `<center>` for aligning
* `<a>`: links (and `href` attribute) to a file, URL, or page number.
* `<pre>` & `<code>` tags
* `<img>`: images (and `src`, `width`, `height` attributes)
* `<ol>`, `<ul>`, `<li>`: ordered, unordered and list items (can be nested)
* `<dl>`, `<dt>`, `<dd>`: description list, title, details (can be nested)
* `<sup>`, `<sub>`: superscript and subscript text
* `<table>`: (with `align`, `border`, `width`, `cellpadding`, `cellspacing` attributes) those tags are rendered using [fpdf2 Tables layout](https://py-pdf.github.io/fpdf2/Tables.html) and the following sub-tags are supported:
+ `<thead>`: optional tag, wraps the table header row
+ `<tfoot>`: optional tag, wraps the table footer row
+ `<tbody>`: optional tag, wraps the table rows with actual content
+ `<tr>`: rows (with `align`, `bgcolor` attributes)
+ `<th>`: heading cells (with `align`, `bgcolor`, `width` attributes)
* `<td>`: cells (with `align`, `bgcolor`, `width`, `rowspan`, `colspan` attributes)
### Page breaks
_New in [:octicons-tag-24: 2.8.0](https://github.com/py-pdf/fpdf2/blob/master/CHANGELOG.md)_
Page breaks can be triggered explicitly using the [break-before](https://developer.mozilla.org/en-US/docs/Web/CSS/break-before) or [break-after](https://developer.mozilla.org/en-US/docs/Web/CSS/break-after) CSS properties.
For example you can use:
```html
<br style="break-after: page">
```
or:
```html
<p style="break-before: page">
Top of a new page.
</p>
```
## Known limitations
`fpdf2` HTML renderer does not support some configurations of nested tags.
For example:
* `<table>` cells can contain `<td><b><em>nested tags forming a single text block</em></b></td>`, but **not** `<td><b>arbitrarily</b> nested <em>tags</em></td>` - _cf._ [issue #845](https://github.com/py-pdf/fpdf2/issues/845)
You can also check the currently open GitHub issues with the tag `html`:
[label:html is:open](https://github.com/py-pdf/fpdf2/issues?q=is%3Aopen+label%3Ahtml)
## Using Markdown
Check the dedicated page: [Combine with Markdown](CombineWithMarkdown.md)
|