% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/unicode.R
\name{unicode-width-workaround}
\alias{unicode-width-workaround}
\title{Working around incorrect Unicode character widths}
\description{
R 3.6.2, and also the upcoming 3.6.3 and 4.0.0 releases, use the Unicode 8
standard to calculate the display width of Unicode characters.
Unfortunately, the widths of most emojis are incorrect in this standard:
width 1 is reported instead of the correct width of 2.
}
\details{
cli implements a workaround for this. The package includes a table of
all Unicode ranges that contain wide characters (display width 2).

On first use of one of the workaround wrappers (\code{ansi_nchar()}, etc.),
we check the width that the current version of R reports for these
characters, and then create a regex (\code{re_bad_char_width}) that matches
the ones R gets wrong.
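
The check described above can be sketched roughly as follows. This is a
minimal illustration, not cli's actual code: the \code{wide} table holds only
two example ranges, \code{re_bad} is a hypothetical stand-in for
\code{re_bad_char_width}, and only the first character of each range is
probed. It assumes a UTF-8 session.

\preformatted{
# Hypothetical table of wide ranges, for illustration only (not cli's table).
wide <- data.frame(
  from = c(0x4E00L, 0x1F600L),
  to   = c(0x9FFFL, 0x1F64FL)
)

# Ask this R session for the display width of the first character of
# each range (a simplification: a full check would probe more characters).
first <- intToUtf8(wide$from, multiple = TRUE)
last <- intToUtf8(wide$to, multiple = TRUE)
bad <- nchar(first, type = "width") != 2L

# Build a character class from the ranges that R gets wrong, if any.
re_bad <- if (any(bad)) {
  paste0("[", paste0(first[bad], "-", last[bad], collapse = ""), "]")
} else {
  NULL
}
}

On an R version that already reports the correct widths, \code{re_bad} is
\code{NULL} in this sketch and no workaround would be needed.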

We then use this regex to duplicate every problematic character in the
wrapper function's input string before calling the real string
manipulation function (\code{nchar()}, \code{strwrap()}, etc.). At the end,
we undo the duplication before returning the result.
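
The duplicate-then-undo step can be sketched as follows. This is a hedged
illustration rather than cli's implementation: \code{re_bad},
\code{double_bad()} and \code{undouble_bad()} are hypothetical names, and the
pattern covers a single emoji block that is simply assumed to be
mis-measured. It assumes a UTF-8 session.

\preformatted{
# Hypothetical pattern for illustration: one emoji block assumed to be
# mis-measured. cli derives its own pattern at run time.
re_bad <- paste0("[", intToUtf8(0x1F600L), "-", intToUtf8(0x1F64FL), "]")

# Double every matching character, so a reported width of 1 counts twice.
double_bad <- function(x) {
  m <- gregexpr(re_bad, x, perl = TRUE)
  regmatches(x, m) <- lapply(regmatches(x, m), function(ch) paste0(ch, ch))
  x
}

# Undo the doubling: each pair of matching characters collapses to one.
undouble_bad <- function(x) {
  m <- gregexpr(paste0(re_bad, "{2}"), x, perl = TRUE)
  regmatches(x, m) <- lapply(regmatches(x, m), function(ch) substr(ch, 1L, 1L))
  x
}

# Wrap the doubled string with base R, then restore the original characters.
x <- paste0("status ", intToUtf8(0x1F600L), " ok")
wrapped <- undouble_bad(strwrap(double_bad(x), width = 10))
}

Because \code{strwrap()} only breaks lines at whitespace, a doubled pair is
never split across lines, so the undo step can safely collapse each pair.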

This workaround is fine for \code{nchar()} and \code{strwrap()}, and consequently
\code{ansi_align()} and \code{ansi_strtrim()} as well.

The rest of the \verb{ansi_*()} functions work on characters, and do not
deal with character width.
}
\keyword{internal}