File: text-encoding.rst

package info (click to toggle)
copyq 3.7.3-1~bpo9+1
  • links: PTS, VCS
  • area: main
  • in suites: stretch-backports
  • size: 10,480 kB
  • sloc: cpp: 51,894; sh: 734; python: 211; xml: 57; makefile: 34
file content (26 lines) | stat: -rw-r--r-- 1,131 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
Text Encoding
=============

This page serves as concept for adding additional CopyQ command line
switch to print and read texts in UTF-8 (i.e. without using system
encoding).

Every time the bytes are read from a command (standard output or
arguments from client) the input is expected to be either just series of
bytes or text in system encoding (possibly Latin1 on Windows). But
texts/strings in CopyQ and in clipboard are UTF-8 formatted (except some
MIME types with specified encoding).

When reading system-encoded text (MIME starts with "text/") CopyQ
re-encodes the data from system encoding to UTF-8. That's not a problem
if the received data is really in system encoding. But if you send data
from Perl with the UTF-8 switch, CopyQ must also know that UTF-8 is used
instead of system encoding.

The same goes for other way. CopyQ sends texts back to client or to a
command in system encoding so it needs to convert these texts from
UTF-8.

As for the re-encoding part, Qt does nice job transforming characters
from UTF-8 but of course for lot of characters in UTF-8 there is no
alternative in Latin1 and other encodings.