File: NEWS

CHANGES IN VERSION 0.8.0
------------------------

NEW FEATURES

    o Add get/setAutoBlockSize(), getAutoBlockLength(),
      get/setAutoBlockShape() and get/setAutoGridMaker().

    o Add rowGrid() and colGrid(), in addition to blockGrid().

    o Add get/setAutoBPPARAM() to control the automatic 'BPPARAM' used by
      blockApply().

    o Reduce memory usage when realizing a sparse DelayedArray to disk.

      On-disk realization of a DelayedArray object that is reported to be sparse
      (by is_sparse()) to a "sparsity-optimized" backend (i.e. to a backend with
      a memory-efficient write_sparse_block(), like the TENxMatrix backend
      implemented in the HDF5Array package) now preserves the sparse
      representation of the data all the way. More precisely, each block of data
      is now kept in sparse form during the 3 steps that it goes through: read
      from seed, realize in memory, and write to disk.

    o showtree() now displays whether a tree node or leaf is considered sparse
      or not.

    o Enhance "aperm" method and dim() setter for DelayedArray objects. In
      addition to allowing dropping "ineffective dimensions" (i.e. dimensions
      equal to 1) from a DelayedArray object, aperm() and the dim() setter now
      allow adding "ineffective dimensions" to it.
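
      As a rough illustration of what adding and dropping "ineffective
      dimensions" means, the sketch below applies the dim() setter to an
      ordinary array in base R. It assumes that the setter for DelayedArray
      objects behaves analogously (but as a delayed operation); no DelayedArray
      code is involved.

```r
# Base R only: an ordinary 3 x 4 matrix stands in for a DelayedArray object.
a <- matrix(1:12, nrow = 3)

# Add an "ineffective dimension" (extent 1) in the middle.
b <- a
dim(b) <- c(3L, 1L, 4L)

# Drop it again; the data and its order are untouched.
d <- b
dim(d) <- c(3L, 4L)
```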

    o Enhance subassignment to a DelayedArray object.
    
      So far subassignment to a DelayedArray object only supported the **linear
      form** (i.e. x[i] <- value) with strong restrictions (the subscript 'i'
      must be a logical DelayedArray of the same dimensions as 'x', and 'value'
      must be an ordinary vector of length 1).
    
      In addition to this linear form, subassignment to a DelayedArray object
      now supports the **multi-dimensional form** (e.g. x[3:1, , 6] <- 0). In
      this form, one subscript per dimension is supplied, and each subscript
      can be missing or be anything that multi-dimensional subassignment to
      an ordinary array supports. The replacement value (a.k.a. the right
      value) can be an array-like object (e.g. ordinary array, dgCMatrix object,
      DelayedArray object, etc...) or an ordinary vector of length 1. Like the
      linear form, the multi-dimensional form is also implemented as a delayed
      operation.
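
      The two forms can be tried on an ordinary array in base R, as in the
      sketch below. The assumption is that a DelayedArray object accepts the
      same calls and yields the same values once realized; the ordinary array
      used here is only a stand-in, and nothing is delayed in this example.

```r
# Base R only: an ordinary 4 x 5 x 6 array stands in for a DelayedArray object.
x <- array(1:120, dim = c(4, 5, 6))

# Multi-dimensional form: one subscript per dimension, the 2nd one missing.
x[3:1, , 6] <- 0

# Linear form: a logical subscript with the same dimensions as 'x' and a
# replacement value of length 1.
x[x > 100] <- NA
```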

    o Re-implement internal helper simple_abind() in C and support long arrays.
      simple_abind() is the workhorse behind realization of arbind() and
      acbind() operations on DelayedArray objects.

    o Add "table" and (restricted) "unique" methods for DelayedArray objects,
      both block-processed.

    o range() (block-processed) now supports the 'finite' argument on a
      DelayedArray object.
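
      The idea behind a block-processed range() can be sketched in base R:
      compute the range of each block, then take the range of the per-block
      results. This is only an illustration on an ordinary vector with
      hand-made blocks of length 2, not the package's implementation.

```r
# Base R only: split an ordinary vector into blocks of 2 elements each.
x <- c(3, NA, -Inf, 7, 1, Inf)
blocks <- split(x, ceiling(seq_along(x) / 2))

# With finite=TRUE, NA and infinite values are ignored within each block.
per_block <- lapply(blocks, range, finite = TRUE)

# Combining the per-block ranges gives the same answer as a single call.
block_range <- range(unlist(per_block))
direct_range <- range(x, finite = TRUE)
```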

    o %*% (block-processed) now works between a DelayedMatrix object and an
      ordinary vector.

    o Improve support for DelayedArray of type "list".

    o Add TENxMatrix to list of supported realization backends.

    o Add backend-agnostic RealizationSink() constructor.

    o Add linearInd() utility for turning array indices into linear indices.
      Note that linearInd() performs the reverse transformation of
      base::arrayInd().
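
      The transformation itself can be sketched in base R. The helper
      lin_ind_sketch() below is hypothetical (it is not the package's
      linearInd() and its argument names are assumptions); it only shows the
      column-major arithmetic that undoes base::arrayInd().

```r
# Hypothetical helper: turn a matrix of array indices (one row per element)
# into linear indices, using R's column-major layout.
lin_ind_sketch <- function(aind, dims) {
    aind <- matrix(aind, ncol = length(dims))
    # Stride of each dimension: 1, dims[1], dims[1] * dims[2], ...
    strides <- c(1, cumprod(dims)[-length(dims)])
    as.vector((aind - 1) %*% strides) + 1
}

dims <- c(4L, 3L, 2L)
aind <- arrayInd(c(1L, 5L, 24L), dims)  # linear -> array indices
lind <- lin_ind_sketch(aind, dims)      # array -> linear indices again
```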

    o Add low-level utilities mapToGrid() and mapToRef() for mapping reference
      array positions to grid positions and vice-versa.

    o Add downsample() for reducing the "resolution" of an ArrayGrid object.

    o Add maxlength() generic and methods for ArrayGrid objects.

SIGNIFICANT USER-VISIBLE CHANGES

    o Multi-dimensional subsetting is no longer delayed when drop=TRUE and the
      result has only one dimension. In this case the result now is returned
      as an **ordinary** vector (atomic or list). This is the only case of
      multi-dimensional single bracket subsetting that is not delayed.

    o Rename defaultGrid() -> blockGrid(). The 'max.block.length' argument
      is replaced with the 'block.length' argument. 2 new arguments are
      added: 'chunk.grid' and 'block.shape'.

    o Major improvements to the block processing mechanism.
      All block-processed operations (except realization by block) now support
      blocks of **arbitrary** geometry instead of column-oriented blocks only.
      'blockGrid(x)', which is called by the block-processed operations to get
      the grid of blocks to use on 'x', has the following new features:
      1) It's "chunk aware". This means that, when the chunk grid is known (i.e.
         when 'chunkGrid(x)' is not NULL), 'blockGrid(x)' defines blocks that
         are "compatible" with the chunks, i.e. such that any chunk is fully
         contained in a block. In other words, blocks are chosen so that chunks
         don't cross their boundaries.
      2) When the chunk grid is unknown (i.e. when 'chunkGrid(x)' is NULL),
         blocks are "isotropic", that is, they're as close as possible to a
         hypercube instead of being "column-oriented" (column-oriented blocks,
         also known as "linear blocks", are elongated along the 1st dimension,
         then along the 2nd dimension, etc...)
      3) The returned grid has the lowest "resolution" compatible with
         'getAutoBlockSize()', that is, the blocks are made as big as possible
         as long as their size in memory doesn't exceed 'getAutoBlockSize()'.
         Note that this is not a new feature. What is new though is that an
         exception now is made when the chunk grid is known and some chunks
         are >= 'getAutoBlockSize()', in which case 'blockGrid(x)' returns a
         grid that is the same as the chunk grid.
      These new features are supposed to make the returned grid "optimal" for
      block processing. (Some benchmarks still need to be done to
      confirm/quantify this.)
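
      To make the difference between "column-oriented" and "isotropic" blocks
      concrete, here is a naive base R sketch. Both helpers are hypothetical
      and are not the algorithm used by blockGrid(); in particular the
      isotropic one simply takes the same side length along every dimension
      and does not redistribute unused room.

```r
# Hypothetical helper: fill the 1st dimension, then the 2nd, etc., while the
# block length (number of elements) stays <= max_len.
linear_shape_sketch <- function(dims, max_len) {
    shape <- rep(1, length(dims))
    for (i in seq_along(dims)) {
        room <- max_len %/% prod(shape)   # extent still affordable along dim i
        shape[i] <- min(dims[i], max(room, 1))
        if (shape[i] < dims[i]) break     # dim i not fully covered: stop here
    }
    shape
}

# Hypothetical helper: same side length along every dimension, capped by the
# array dimensions.
isotropic_shape_sketch <- function(dims, max_len) {
    side <- floor(max_len^(1 / length(dims)))
    pmin(dims, max(side, 1))
}

dims <- c(1000, 500)
linear_shape <- linear_shape_sketch(dims, 2600)        # 1000 x 2
isotropic_shape <- isotropic_shape_sketch(dims, 2600)  # 50 x 50
```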

    o The automatic block size now is set to 100 Mb (instead of 4.5 Mb
      previously) at package startup. Use setAutoBlockSize() to change the
      automatic block size.
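
      For a sense of scale, the arithmetic below converts both block sizes into
      a maximum number of array elements per block. It assumes that 1 Mb means
      1e6 bytes and that elements take 4 bytes (integer) or 8 bytes (double),
      as in ordinary R vectors; it does not call the package.

```r
# Assumptions: 1 Mb = 1e6 bytes; element sizes are those of ordinary R vectors.
bytes_per_element <- c(integer = 4, double = 8)
block_size <- c(previous = 4.5e6, current = 100e6)

# Rows: block size setting; columns: element type.
max_block_length <- outer(block_size, bytes_per_element, "/")
```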

    o No more 'BPREDO' argument to blockApply().

    o Replace block_APPLY_and_COMBINE() with blockReduce().

BUG FIXES

    o No-op operations on a DelayedArray derivative really act like no-ops.
      Operating on a DelayedArray derivative (e.g. RleArray, HDF5Array or
      GDSArray) will now return an object of the original class if the result
      is "pristine" (i.e. if it doesn't carry delayed operations) instead of
      degrading the object to a DelayedArray instance. This applies for example
      to 't(t(x))' or 'dimnames(x) <- dimnames(x)' etc...