File: control

r-cran-tokenizers 0.3.0-2
  • area: main
  • in suites: sid
  • size: 824 kB
  • sloc: cpp: 143; sh: 13; makefile: 2
Source: r-cran-tokenizers
Standards-Version: 4.7.3
Maintainer: Debian R Packages Maintainers <r-pkg-team@alioth-lists.debian.net>
Uploaders:
 Andreas Tille <tille@debian.org>,
Section: gnu-r
Testsuite: autopkgtest-pkg-r
Build-Depends:
 debhelper-compat (= 13),
 dh-r,
 r-base-dev,
 r-cran-stringi,
 r-cran-rcpp,
 r-cran-snowballc,
 architecture-is-64-bit,
 architecture-is-little-endian,
Vcs-Browser: https://salsa.debian.org/r-pkg-team/r-cran-tokenizers
Vcs-Git: https://salsa.debian.org/r-pkg-team/r-cran-tokenizers.git
Homepage: https://cran.r-project.org/package=tokenizers
Rules-Requires-Root: no

Package: r-cran-tokenizers
Architecture: any
Depends:
 ${R:Depends},
 ${shlibs:Depends},
 ${misc:Depends},
Recommends:
 ${R:Recommends},
Suggests:
 ${R:Suggests},
Description: GNU R fast, consistent tokenization of natural language text
 Convert natural language text into tokens. Includes tokenizers for
 shingled n-grams, skip n-grams, words, word stems, sentences,
 paragraphs, characters, shingled characters, lines, tweets, Penn
 Treebank, regular expressions, as well as functions for counting
 characters, words, and sentences, and a function for splitting longer
 texts into separate documents, each with the same number of words.
 The tokenizers have a consistent interface, and the package is built
 on the 'stringi' and 'Rcpp' packages for fast yet correct
 tokenization in 'UTF-8'.