File: ch2a_fonts.py

package info (click to toggle)
python-reportlab 3.6.12-1%2Bdeb12u1
  • links: PTS
  • area: main
  • in suites: bookworm
  • size: 12,344 kB
  • sloc: python: 96,651; ansic: 19,106; xml: 1,494; makefile: 337; sh: 100
file content (513 lines) | stat: -rw-r--r-- 19,934 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
#Copyright ReportLab Europe Ltd. 2000-2017
#see license.txt for license details
#history https://hg.reportlab.com/hg-public/reportlab/log/tip/docs/userguide/ch2a_fonts.py
from tools.docco.rl_doc_utils import *
from reportlab.lib.codecharts import SingleByteEncodingChart
from reportlab.platypus import Image
import reportlab

heading1("Fonts and encodings")

disc("""
This chapter covers fonts, encodings and Asian language capabilities.
If you are purely concerned with generating PDFs for Western
European languages, you can just read the "Unicode is the default" section
below and skip the rest on a first reading.
We expect this section to grow considerably over time. We
hope that Open Source will enable us to give better support for
more of the world's languages than other tools, and we welcome
feedback and help in this area.
""")

heading2("Unicode and UTF8 are the default input encodings")

disc("""
Starting with reportlab Version 2.0 (May 2006), all text input you
provide to our APIs should be in UTF8 or as Python Unicode objects.
This applies to arguments to canvas.drawString and related APIs,
table cell content, drawing object parameters, and paragraph source
text.  
""")


disc("""
We considered making the input encoding configurable or even locale-dependent,
but decided that "explicit is better than implicit".""")

disc("""
This simplifies many things we used to do previously regarding greek
letters, symbols and so on.  To display any character, find out its
unicode code point, and make sure the font you are using is able
to display it.""")

disc("""
If you are adapting a ReportLab 1.x application, or reading data from
another source which contains single-byte data (e.g. latin-1 or WinAnsi),
you need to do a conversion into Unicode.  The Python codecs package now
includes converters for all the common encodings, including Asian ones.
""")



disc(u"""
If your data is not encoded as UTF8, you will get a UnicodeDecodeError as
soon as you feed in a non-ASCII character.  For example, this snippet below is
attempting to read in and print a series of names, including one with a French
accent:  ^Marc-Andr\u00e9 Lemburg^.  The standard error is quite helpful and tells you
what character it doesn't like:
""")

eg(u"""
>>> from reportlab.pdfgen.canvas import Canvas
>>> c = Canvas('temp.pdf')
>>> y = 700
>>> for line in file('latin_python_gurus.txt','r'):
...     c.drawString(100, y, line.strip())
...
Traceback (most recent call last):
...
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 9-11: invalid data
-->\u00e9 L<--emburg
>>> 
""")


disc("""
The simplest fix is just to convert your data to unicode, saying which encoding
it comes from, like this:""")

eg("""
>>> for line in file('latin_input.txt','r'):
...     uniLine = unicode(line, 'latin-1')
...     c.drawString(100, y, uniLine.strip())
>>>
>>> c.save()
""")


heading2("Automatic output font substitution")

disc("""
There are still a number of places in the code, including the rl_config
defaultEncoding parameter, and arguments passed to various Font constructors,
which refer to encodings.  These were useful in the past when people needed to
use glyphs in the Symbol and ZapfDingbats fonts which are supported by PDF
viewing devices.

By default the standard fonts (Helvetica, Courier, Times Roman)
will offer the glyphs available in Latin-1.  However, if our engine detects
a character not in the font, it will attempt to switch to Symbol or ZapfDingbats to
display these.   For example, if you include the Unicode character for a pair of 
right-facing scissors, \\u2702, in a call to ^drawString^, you should see them (there is
an example in ^test_pdfgen_general.py/pdf^).  It is not
necessary to switch fonts in your code.

""")


heading2("Using non-standard Type 1 fonts")

disc("""
As discussed in the previous chapter, every copy of Acrobat Reader
comes with 14 standard fonts built in.  Therefore, the ReportLab
PDF Library only needs to refer to these by name.  If you want
to use other fonts, they must be available to your code and
will be embedded in the PDF document.""")

disc("""
You can use the mechanism described below to include arbitrary
fonts in your documents. We have an open source
font named <i>DarkGardenMK</i> which we may
use for testing and/or documenting purposes (and which you may
use as well). It comes bundled with the ReportLab distribution in the
directory $reportlab/fonts$.
""")

disc("""
Right now font-embedding relies on font description files in the Adobe
AFM ('Adobe Font Metrics') and PFB ('Printer Font Binary') format. The
former is an ASCII file and contains information about the characters
('glyphs') in the font such as height, width, bounding box info and
other 'metrics', while the latter is a binary file that describes the
shapes of the font. The $reportlab/fonts$ directory contains the files
$'DarkGardenMK.afm'$ and $'DarkGardenMK.pfb'$ that are used as an example
font.
""")

disc("""
In the following example locate the folder containing the test font and
register it for future use with the $pdfmetrics$ module,
after which we can use it like any other standard font.
""")


eg("""
import os
import reportlab
folder = os.path.dirname(reportlab.__file__) + os.sep + 'fonts'
afmFile = os.path.join(folder, 'DarkGardenMK.afm')
pfbFile = os.path.join(folder, 'DarkGardenMK.pfb')

from reportlab.pdfbase import pdfmetrics
justFace = pdfmetrics.EmbeddedType1Face(afmFile, pfbFile)
faceName = 'DarkGardenMK' # pulled from AFM file
pdfmetrics.registerTypeFace(justFace)
justFont = pdfmetrics.Font('DarkGardenMK',
                           faceName,
                           'WinAnsiEncoding')
pdfmetrics.registerFont(justFont)

canvas.setFont('DarkGardenMK', 32)
canvas.drawString(10, 150, 'This should be in')
canvas.drawString(10, 100, 'DarkGardenMK')
""")


disc("""
Note that the argument "WinAnsiEncoding" has nothing to do with the input;
it's to say which set of characters within the font file will be active
and available.
""")

illust(examples.customfont1, "Using a very non-standard font")

disc("""
The font's facename comes from the AFM file's $FontName$ field.
In the example above we knew the name in advance, but quite
often the names of font description files are pretty cryptic
and then you might want to retrieve the name from an AFM file
automatically.
When lacking a more sophisticated method you can use some
code as simple as this:
""")

eg("""
class FontNameNotFoundError(Exception):
    pass


def findFontName(path):
    "Extract a font name from an AFM file."

    f = open(path)

    found = 0
    while not found:
        line = f.readline()[:-1]
        if not found and line[:16] == 'StartCharMetrics':
            raise FontNameNotFoundError, path
        if line[:8] == 'FontName':
            fontName = line[9:]
            found = 1

    return fontName
""")

disc("""
In the <i>DarkGardenMK</i> example we explicitely specified
the place of the font description files to be loaded.
In general, you'll prefer to store your fonts in some canonic
locations and make the embedding mechanism aware of them.
Using the same configuration mechanism we've already seen at the
beginning of this section we can indicate a default search path
for Type-1 fonts.
""")

disc("""
Unfortunately, there is no reliable standard yet for such
locations (not even on the same platform) and, hence, you might
have to edit one of the files $reportlab_settings.py$ or $~/.reportlab_settings$ to modify the
value of the $T1SearchPath$ identifier to contain additional
directories.  Our own recommendation is to use the ^reportlab/fonts^
folder in development; and to have any needed fonts as packaged parts of
your application in any kind of controlled server deployment.  This insulates
you from fonts being installed and uninstalled by other software or system
administrator.
""")

heading3("Warnings about missing glyphs")
disc("""If you specify an encoding, it is generally assumed that
the font designer has provided all the needed glyphs.  However,
this is not always true.  In the case of our example font,
the letters of the alphabet are present, but many symbols and
accents are missing.  The default behaviour is for the font to
print a 'notdef' character - typically a blob, dot or space -
when passed a character it cannot draw.  However, you can ask
the library to warn you instead; the code below (executed
before loading a font) will cause warnings to be generated
for any glyphs not in the font when you register it.""")

eg("""
import reportlab.rl_config
reportlab.rl_config.warnOnMissingFontGlyphs = 0
""")



heading2("Standard Single-Byte Font Encodings")
disc("""
This section shows you the glyphs available in the common encodings.
""")


disc("""The code chart below shows the characters in the $WinAnsiEncoding$.
This is the standard encoding on Windows and many Unix systems in America
and Western Europe.  It is also knows as Code Page 1252, and is practically
identical to ISO-Latin-1 (it contains one or two extra characters). This
is the default encoding used by the Reportlab PDF Library. It was generated from
a standard routine in $reportlab/lib$, $codecharts.py$,
which can be used to display the contents of fonts.  The index numbers
along the edges are in hex.""")

cht1 = SingleByteEncodingChart(encodingName='WinAnsiEncoding',charsPerRow=32, boxSize=12)
illust(lambda canv: cht1.drawOn(canv, 0, 0), "WinAnsi Encoding", cht1.width, cht1.height)

disc("""The code chart below shows the characters in the $MacRomanEncoding$.
as it sounds, this is the standard encoding on Macintosh computers in
America and Western Europe.  As usual with non-unicode encodings, the first
128 code points (top 4 rows in this case) are the ASCII standard and agree
with the WinAnsi code chart above; but the bottom 4 rows differ.""")
cht2 = SingleByteEncodingChart(encodingName='MacRomanEncoding',charsPerRow=32, boxSize=12)
illust(lambda canv: cht2.drawOn(canv, 0, 0), "MacRoman Encoding", cht2.width, cht2.height)

disc("""These two encodings are available for the standard fonts (Helvetica,
Times-Roman and Courier and their variants) and will be available for most
commercial fonts including those from Adobe.  However, some fonts contain non-
text glyphs and the concept does not really apply.  For example, ZapfDingbats
and Symbol can each be treated as having their own encoding.""")

cht3 = SingleByteEncodingChart(faceName='ZapfDingbats',encodingName='ZapfDingbatsEncoding',charsPerRow=32, boxSize=12)
illust(lambda canv: cht3.drawOn(canv, 0, 0), "ZapfDingbats and its one and only encoding", cht3.width, cht3.height)

cht4 = SingleByteEncodingChart(faceName='Symbol',encodingName='SymbolEncoding',charsPerRow=32, boxSize=12)
illust(lambda canv: cht4.drawOn(canv, 0, 0), "Symbol and its one and only encoding", cht4.width, cht4.height)


CPage(5)
heading2("TrueType Font Support")
disc("""
Marius Gedminas ($mgedmin@delfi.lt$) with the help of Viktorija Zaksiene ($vika@pov.lt$)
have contributed support for embedded TrueType fonts.  TrueType fonts work in Unicode/UTF8
and are not limited to 256 characters.""")


CPage(3)
disc("""We use <b>$reportlab.pdfbase.ttfonts.TTFont$</b> to create a true type
font object and register using <b>$reportlab.pdfbase.pdfmetrics.registerFont$</b>.
In pdfgen drawing directly to the canvas we can do""")
eg("""
# we know some glyphs are missing, suppress warnings
import reportlab.rl_config
reportlab.rl_config.warnOnMissingFontGlyphs = 0

from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
pdfmetrics.registerFont(TTFont('Vera', 'Vera.ttf'))
pdfmetrics.registerFont(TTFont('VeraBd', 'VeraBd.ttf'))
pdfmetrics.registerFont(TTFont('VeraIt', 'VeraIt.ttf'))
pdfmetrics.registerFont(TTFont('VeraBI', 'VeraBI.ttf'))
canvas.setFont('Vera', 32)
canvas.drawString(10, 150, "Some text encoded in UTF-8")
canvas.drawString(10, 100, "In the Vera TT Font!")
""")
illust(examples.ttffont1, "Using a the Vera TrueType Font")
disc("""In the above example the true type font object is created using""")
eg("""
    TTFont(name,filename)
""")
disc("""so that the ReportLab internal name is given by the first argument and the second argument
is a string(or file like object) denoting the font's TTF file. In Marius' original patch the filename
was supposed to be exactly correct, but we have modified things so that if the filename is relative
then a search for the corresponding file is done in the current directory and then in directories
specified by $reportlab.rl_config.TTFSearchpath$!""")

from reportlab.lib.styles import ParagraphStyle

from reportlab.pdfbase.pdfmetrics import registerFontFamily
registerFontFamily('Vera',normal='Vera',bold='VeraBd',italic='VeraIt',boldItalic='VeraBI')

disc("""Before using the TT Fonts in Platypus we should add a mapping from the family name to the
individual font names that describe the behaviour under the $&lt;b&gt;$ and $&lt;i&gt;$ attributes.""")

eg("""
from reportlab.pdfbase.pdfmetrics import registerFontFamily
registerFontFamily('Vera',normal='Vera',bold='VeraBd',italic='VeraIt',boldItalic='VeraBI')
""")

disc("""If we only have a Vera regular font, no bold or italic then we must map all to the
same internal fontname.  ^&lt;b&gt;^ and ^&lt;i&gt;^ tags may now be used safely, but
have no effect.
After registering and mapping
the Vera font as above we can use paragraph text like""")
parabox2("""<font name="Times-Roman" size="14">This is in Times-Roman</font>
<font name="Vera" color="magenta" size="14">and this is in magenta <b>Vera!</b></font>""","Using TTF fonts in paragraphs")




heading2("Asian Font Support")
disc("""The Reportlab PDF Library aims to expose full support for Asian fonts.
PDF is the first really portable solution for Asian text handling. There are
two main approaches for this:  Adobe's Asian Language Packs, or TrueType fonts.
""")

heading3("Asian Language Packs")
disc("""
This approach offers the best performance since nothing needs embedding in the PDF file;
as with the standard fonts, everything is on the reader.""")

disc("""
Adobe makes available add-ons for each main language.  In Adobe Reader 6.0 and 7.0, you
will be prompted to download and install these as soon as you try to open a document
using them.  In earlier versions, you would see an error message on opening an Asian document
and had to know what to do.   
""")

disc("""
Japanese, Traditional Chinese (Taiwan/Hong Kong), Simplified Chinese (mainland China)
and Korean are all supported and our software knows about the following fonts:
""")
bullet("""
$chs$ = Chinese Simplified (mainland): '$STSong-Light$'
""")
bullet("""
$cht$ = Chinese Traditional (Taiwan): '$MSung-Light$', '$MHei-Medium$'
""")
bullet("""
$kor$ = Korean: '$HYSMyeongJoStd-Medium$','$HYGothic-Medium$'
""")
bullet("""
$jpn$ = Japanese: '$HeiseiMin-W3$', '$HeiseiKakuGo-W5$'
""")


disc("""Since many users will not have the font packs installed, we have included
a rather grainy ^bitmap^ of some Japanese characters.  We will discuss below what is needed to
generate them.""")
# include a bitmap of some Asian text
I=os.path.join(os.path.dirname(reportlab.__file__),'docs','images','jpnchars.jpg')
try:
    getStory().append(Image(I))
except:
    disc("""An image should have appeared here.""")

disc("""Prior to Version 2.0, you had to specify one of many native encodings
when registering a CID Font. In version 2.0 you should a new UnicodeCIDFont
class.""")

eg("""
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.cidfonts import UnicodeCIDFont
pdfmetrics.registerFont(UnicodeCIDFont('HeiseiMin-W3'))
canvas.setFont('HeiseiMin-W3', 16)

# the two unicode characters below are "Tokyo"
msg = u'\\u6771\\u4EAC : Unicode font, unicode input'
canvas.drawString(100, 675, msg)
""")
#had to double-escape the slashes above to get escapes into the PDF

disc("""The old coding style with explicit encodings should still work, but is now
only relevant if you need to construct vertical text.  We aim to add more readable options
for horizontal and vertical text to the UnicodeCIDFont constructor in future.
The following four test scripts generate samples in the corresponding languages:""")
eg("""tests/test_multibyte_jpn.py
tests/test_multibyte_kor.py
tests/test_multibyte_chs.py
tests/test_multibyte_cht.py""")

## put back in when we have vertical text...
##disc("""The illustration below shows part of the first page
##of the Japanese output sample.  It shows both horizontal and vertical
##writing, and illustrates the ability to mix variable-width Latin
##characters in Asian sentences.  The choice of horizontal and vertical
##writing is determined by the encoding, which ends in 'H' or 'V'.
##Whether an encoding uses fixed-width or variable-width versions
##of Latin characters also depends on the encoding used; see the definitions
##below.""")
##
##Illustration(image("../images/jpn.gif", width=531*0.50,
##height=435*0.50), 'Output from test_multibyte_jpn.py')
##
##caption("""
##Output from test_multibyte_jpn.py
##""")




disc("""In previous versions of the ReportLab PDF Library, we had to make
use of Adobe's CMap files (located near Acrobat Reader if the Asian Language
packs were installed).  Now that we only have one encoding to deal with, the
character width data is embedded in the package, and CMap files are not needed
for generation.  The CMap search path in ^rl_config.py^ is now deprecated
and has no effect if you restrict yourself to UnicodeCIDFont.
""")


heading3("TrueType fonts with Asian characters")
disc("""
This is the easy way to do it.  No special handling at all is needed to
work with Asian TrueType fonts.  Windows users who have installed, for example,
Japanese as an option in Control Panel, will have a font "msmincho.ttf" which
can be used.  However, be aware that it takes time to parse the fonts, and that
quite large subsets may need to be embedded in your PDFs.  We can also now parse
files ending in .ttc, which are a slight variation of .ttf.

""")


heading3("To Do")
disc("""We expect to be developing this area of the package for some time.accept2dyear
Here is an outline of the main priorities.  We welcome help!""")

bullet("""
Ensure that we have accurate character metrics for all encodings in horizontal and
vertical writing.""")

bullet("""
Add options to ^UnicodeCIDFont^ to allow vertical and proportional variants where the font permits it.""")


bullet("""
Improve the word wrapping code in paragraphs and allow vertical writing.""")



CPage(5)
heading2("RenderPM tests")

disc("""This may also be the best place to mention the test function of $reportlab/graphics/renderPM.py$,
which can be considered the cannonical place for tests which exercise renderPM (the "PixMap Renderer",
as opposed to renderPDF, renderPS or renderSVG).""")

disc("""If you run this from the command line, you should see lots of output like the following.""")

eg("""C:\\code\\reportlab\\graphics>renderPM.py
wrote pmout\\renderPM0.gif
wrote pmout\\renderPM0.tif
wrote pmout\\renderPM0.png
wrote pmout\\renderPM0.jpg
wrote pmout\\renderPM0.pct
...
wrote pmout\\renderPM12.gif
wrote pmout\\renderPM12.tif
wrote pmout\\renderPM12.png
wrote pmout\\renderPM12.jpg
wrote pmout\\renderPM12.pct
wrote pmout\\index.html""")

disc("""This runs a number of tests progressing from a "Hello World" test, through various tests of
Lines; text strings in a number of sizes, fonts, colours and alignments; the basic shapes; translated
and rotated groups; scaled coordinates; rotated strings; nested groups; anchoring and non-standard fonts.""")

disc("""It creates a subdirectory called $pmout$, writes the image files into it, and writes an
$index.html$ page which makes it easy to refer to all the results.""")

disc("""The font-related tests which you may wish to look at are test #11 ('Text strings in a non-standard font')
and test #12 ('Test Various Fonts').""")




##### FILL THEM IN