File: tokenizers.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tokenizers-package.r
\docType{package}
\name{tokenizers}
\alias{tokenizers}
\title{Tokenizers}
\description{
A collection of functions with a consistent interface to convert natural
language text into tokens.
}
\details{
The tokenizers in this package share a consistent interface. They all take
either a character vector of any length, or a list where each element is a
character vector of length one; in either case, each element represents a
single text. Each function returns a list of the same length as the input,
where each element of the list contains the tokens generated from the
corresponding text. If the input character vector or list is named, the
names are preserved; see the sketch in the examples below.
}
}
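\examples{
# A minimal sketch of the shared interface described in Details, using
# tokenize_words() as a representative tokenizer from this package; the
# sample texts are illustrative only.
texts <- c(roses = "Roses are red, violets are blue.",
           jack  = "Jack and Jill went up the hill.")
tokens <- tokenize_words(texts)
length(tokens)  # 2: one list element per input text
names(tokens)   # "roses" "jack": input names are preserved
tokens$roses    # character vector of word tokens for the first text
}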