[[utf-8-build]] === Build under UTF-8 The default locale of the build environment is *C*. Some programs such as the *read* function of Python3 change their behavior depending on the locale. Adding the following code to the *debian/rules* file ensures building the program under the *C.UTF-8* locale. ---- LC_ALL := C.UTF-8 export LC_ALL ---- [[utf-8-conv]] === UTF-8 conversion If upstream documents are encoded in old encoding schemes, converting them to https://en.wikipedia.org/wiki/UTF-8[UTF-8] is a good idea. Use the *iconv* command in the *libc-bin* package to convert encodings of plain text files. ---- $ iconv -f latin1 -t utf8 foo_in.txt > foo_out.txt ---- Use *w3m*(1) to convert from HTML files to UTF-8 plain text files. When you do this, make sure to execute it under UTF-8 locale. ---- $ LC_ALL=C.UTF-8 w3m -o display_charset=UTF-8 \ -cols 70 -dump -no-graph -T text/html \ < foo_in.html > foo_out.txt ---- Run these scripts in the *override_dh_** target of the *debian/rules* file.