% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/search_split_bound.R
\name{stri_split_lines}
\alias{stri_split_lines}
\alias{stri_split_lines1}
\title{Split a String Into Text Lines}
\usage{
stri_split_lines(str, omit_empty = FALSE)

stri_split_lines1(str)
}
\arguments{
\item{str}{character vector (\code{stri_split_lines})
or a single string (\code{stri_split_lines1})}

\item{omit_empty}{logical vector; determines whether empty
strings should be removed from the result
   [\code{stri_split_lines} only]}
}
\value{
\code{stri_split_lines} returns a list of character vectors.
If any input string is \code{NA}, then the corresponding list element
is a single \code{NA} string.

\code{stri_split_lines1(str)} behaves like
\code{stri_split_lines(str[1])[[1]]} (with default parameters)
and therefore returns a character vector, with one difference:
if the input string ends with a newline sequence, the trailing
empty string is omitted from the result.
}
\description{
These functions split each character string in a given vector
into text lines.
}
\details{
Vectorized over \code{str} and \code{omit_empty}.

\code{omit_empty} is applied when splitting. If set to \code{TRUE},
then empty strings will never appear in the resulting vector.

Newlines are represented with the Carriage Return
(CR, 0x0D), Line Feed (LF, 0x0A), CRLF, or Next Line (NEL, 0x85) characters,
depending on the platform.
Moreover, the Unicode Standard defines two unambiguous separator characters,
the Paragraph Separator (PS, 0x2029) and the Line Separator (LS, 0x2028).
Sometimes also the Vertical Tab (VT, 0x0B) and the Form Feed (FF, 0x0C)
are used for this purpose.

These \pkg{stringi} functions follow the UTS#18 rules,
where a newline sequence
corresponds to the following regular expression:
\code{(?:\\u\{D A\}|(?!\\u\{D A\})[\\u\{A\}-\\u\{D\}\\u\{85\}\\u\{2028\}\\u\{2029\}])}.
Each match serves as a text line separator.
}
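\examples{
# A minimal sketch of the behaviour described above. The input strings
# are made up for illustration and are not taken from the package sources.
library("stringi")

# LF, CRLF, and CR all act as line separators; CRLF counts as one separator.
x <- "alpha\nbeta\r\ngamma\rdelta"
stri_split_lines(x)

# Consecutive separators produce empty strings unless omit_empty = TRUE.
y <- "one\n\ntwo"
stri_split_lines(y)
stri_split_lines(y, omit_empty = TRUE)

# An NA input yields a single NA string in the corresponding list element.
stri_split_lines(c("a\nb", NA))

# stri_split_lines1 takes a single string and returns a character vector.
stri_split_lines1("first\nsecond\n")
}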
\references{
\emph{Unicode Newline Guidelines} -- Unicode Technical Report #13,
\url{https://www.unicode.org/standard/reports/tr13/tr13-5.html}

\emph{Unicode Regular Expressions} -- Unicode Technical Standard #18,
\url{https://www.unicode.org/reports/tr18/}
}
\seealso{
The official online manual of \pkg{stringi} at \url{https://stringi.gagolewski.com/}

Gagolewski M., \pkg{stringi}: Fast and portable character string processing in R, \emph{Journal of Statistical Software} 103(2), 2022, 1-59, \doi{10.18637/jss.v103.i02}

Other search_split: 
\code{\link{about_search}},
\code{\link{stri_split_boundaries}()},
\code{\link{stri_split}()}

Other text_boundaries: 
\code{\link{about_search_boundaries}},
\code{\link{about_search}},
\code{\link{stri_count_boundaries}()},
\code{\link{stri_extract_all_boundaries}()},
\code{\link{stri_locate_all_boundaries}()},
\code{\link{stri_opts_brkiter}()},
\code{\link{stri_split_boundaries}()},
\code{\link{stri_trans_tolower}()},
\code{\link{stri_wrap}()}
}
\concept{search_split}
\concept{text_boundaries}
\author{
\href{https://www.gagolewski.com/}{Marek Gagolewski} and other contributors
}