% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/length.R
\name{stri_length}
\alias{stri_length}
\title{Count the Number of Code Points}
\usage{
stri_length(str)
}
\arguments{
\item{str}{character vector, or an object coercible to one}
}
\value{
Returns an integer vector of the same length as \code{str}.
}
\description{
This function returns the number of code points
in each string.
}
\details{
Note that the number of code points is
not the same as the \emph{width} of the string when
printed on the console; see \code{\link{stri_width}}.

If a given string is in UTF-8 and has not been properly normalized
(e.g., by \code{\link{stri_trans_nfc}}), the returned counts may be
misleading: a character stored as a base letter followed by combining marks
contributes more than one code point. See \code{\link{stri_count_boundaries}}
for a method to count \emph{Unicode characters}. Moreover, if an incorrect
UTF-8 byte sequence is detected, a warning is generated and the corresponding
output element is set to \code{NA}; see \code{\link{stri_enc_toutf8}} for
a method to deal with such cases.

Missing values in \code{str} yield \code{NA} in the corresponding output elements.
Strings marked with the \code{bytes} encoding result in an error, as usual.
}
\examples{
stri_length(LETTERS)
stri_length(c('abc', '123', '\u0105\u0104'))
stri_length('\u0105') # length is one, but...
stri_numbytes('\u0105') # 2 bytes are used
stri_numbytes(stri_trans_nfkd('\u0105')) # 3 bytes here but...
stri_length(stri_trans_nfkd('\u0105')) # ...two code points (!)
stri_count_boundaries(stri_trans_nfkd('\u0105'), type='character') # ...and one Unicode character
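
# The lines below are an illustrative sketch of further cases mentioned
# in Details; the inputs are chosen here only for demonstration.
stri_length(c('abc', NA, '')) # a missing value yields NA, the empty string has length 0
stri_length(stri_trans_nfc(stri_trans_nfkd('\u0105'))) # NFC recomposes the pair into one code point
stri_length('\u4e2d\u6587') # two code points, whereas...
stri_width('\u4e2d\u6587') # ...the console width of wide (CJK) characters is typically larger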

}
\seealso{
The official online manual of \pkg{stringi} at \url{https://stringi.gagolewski.com/}

Gagolewski M., \pkg{stringi}: Fast and portable character string processing in R, \emph{Journal of Statistical Software} 103(2), 2022, 1-59, \doi{10.18637/jss.v103.i02}

Other length: 
\code{\link{\%s$\%}()},
\code{\link{stri_isempty}()},
\code{\link{stri_numbytes}()},
\code{\link{stri_pad_both}()},
\code{\link{stri_sprintf}()},
\code{\link{stri_width}()}
}
\concept{length}
\author{
\href{https://www.gagolewski.com/}{Marek Gagolewski} and other contributors
}