File: control

package info
r-cran-ff 4.5.2+ds-2
  • links: PTS, VCS
  • area: main
  • in suites: sid
  • size: 1,924 kB
  • sloc: ansic: 7,297; cpp: 2,329; perl: 24; sh: 19; makefile: 5
Source: r-cran-ff
Standards-Version: 4.7.3
Maintainer: Debian R Packages Maintainers <r-pkg-team@alioth-lists.debian.net>
Uploaders:
 Steffen Moeller <moeller@debian.org>,
Section: gnu-r
Testsuite: autopkgtest-pkg-r
Build-Depends:
 debhelper-compat (= 13),
 dh-r,
 r-base-dev,
 r-cran-bit,
 architecture-is-64-bit,
 architecture-is-little-endian,
Vcs-Browser: https://salsa.debian.org/r-pkg-team/r-cran-ff
Vcs-Git: https://salsa.debian.org/r-pkg-team/r-cran-ff.git
Homepage: https://cran.r-project.org/package=ff
Rules-Requires-Root: no

Package: r-cran-ff
Architecture: any
Depends:
 ${R:Depends},
 ${shlibs:Depends},
 ${misc:Depends},
Recommends:
 ${R:Recommends},
Suggests:
 ${R:Suggests},
Description: Memory-Efficient Storage of Large Data on Disk and Fast Access Functions
 The ff package provides data structures that are stored on disk
 but behave (almost) as if they were in RAM by transparently mapping only
 a section (pagesize) in main memory - the effective virtual memory consumption
 per ff object. ff supports R's standard atomic data types 'double',
 'logical', 'raw' and 'integer' and non-standard atomic types boolean
 (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed
 with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort
 (2 byte unsigned), single (4 byte float with NAs). For example 'quad' allows
 efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned
 types support 'circular' arithmetic. There is also support for close-to-atomic
 types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types.
 ff has native C support for vectors, matrices and arrays with flexible
 dimorder (major column-order, major row-order and generalizations for
 arrays). There is also an ffdf class not unlike data.frames, plus
 import/export filters for csv files. ff objects store raw data in binary
 flat files in
 native encoding, and complement this with metadata stored in R as physical
 and virtual attributes. ff objects have well-defined hybrid copying semantics,
 which gives rise to certain performance improvements through virtualization.
 ff objects can be stored and reopened across R sessions. ff files can be
 shared by multiple ff R objects (using different data en/de-coding schemes)
 in the same process or from multiple R processes to exploit parallelism.
 A wide choice of finalizer options makes it possible to work with
 'permanent' files as well as to create and remove 'temporary' ff files
 completely transparently to the user. On certain OS/filesystem combinations,
 creating ff files incurs no notable delay thanks to sparse file allocation.
 Several access optimization techniques such as Hybrid Index Preprocessing
 and Virtualization are implemented to achieve good performance even with
 large datasets, for example virtual matrix transpose without touching a single
 byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data
 types are stored natively and compactly in binary flat files, e.g. logicals
 take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access
 functions,
 the ff package also provides compatibility functions that facilitate writing
 code for ff and ram objects and support for batch processing on ff objects
 (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from
 package 'bit': chunked looping, fast bit operations and coercions between
 different objects that can store subscript information ('bit', 'bitwhich', ff
 'boolean', ri range index, hi hybrid index). This makes it possible to work
 interactively with selections of large datasets and to quickly modify
 selection criteria.
 Further high-performance enhancements can be made available upon request.