File: 67-utf8.txt

package info (click to toggle)
debmake-doc 1.14-1
  • links: PTS, VCS
  • area: main
  • in suites: buster
  • size: 7,256 kB
  • sloc: sh: 674; makefile: 469; python: 146; ansic: 114; sed: 16
file content (34 lines) | stat: -rw-r--r-- 1,028 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
[[utf-8-build]]
=== Build under UTF-8

The default locale of the build environment is *C*.

Some programs such as the *read* function of Python3 change their behavior depending on the locale.

Adding the following code to the *debian/rules* file ensures building the program under the *C.UTF-8* locale.

----
LC_ALL := C.UTF-8
export LC_ALL
----

[[utf-8-conv]]
=== UTF-8 conversion

If upstream documents are encoded in old encoding schemes, converting them to https://en.wikipedia.org/wiki/UTF-8[UTF-8] is a good idea.

Use the *iconv* command in the *libc-bin* package to convert encodings of plain text files.

----
 $ iconv -f latin1 -t utf8 foo_in.txt > foo_out.txt
----

Use *w3m*(1) to convert from HTML files to UTF-8 plain text files. When you do this, make sure to execute it under UTF-8 locale.
----
 $ LC_ALL=C.UTF-8 w3m -o display_charset=UTF-8 \
        -cols 70 -dump -no-graph -T text/html \
        < foo_in.html > foo_out.txt
----

Run these scripts in the *override_dh_** target of the *debian/rules* file.