CHANGES IN VERSION 1.34.0
-------------------------
NEW FEATURES
o Add 'as.vector' argument to h5mread().
SIGNIFICANT USER-VISIBLE CHANGES
o Improvements to coercions from CSC_H5SparseMatrixSeed, H5SparseMatrix,
TENxMatrix, or H5ADMatrix to SparseArray:
- should be significantly more efficient, thanks to various tweaks that
happened in the SparseArray and DelayedArray packages;
- support coercing an object with more than 2^31 nonzero values.
o Coercion from any of the classes above to a sparseMatrix derivative
now fails early if the object to coerce has >= 2^31 nonzero values.
o All *Seed classes in the package now extend the new OutOfMemoryObject
class defined in BiocGenerics (virtual class with no slots).
BUG FIXES
o Fix long-standing bug in t() methods for CSC_H5SparseMatrixSeed and
CSR_H5SparseMatrixSeed objects.
o Replace internal calls to rhdf5::H5Fopen(), rhdf5::H5Dopen(), and
rhdf5::H5Gopen(), with calls to new internal helpers .H5Fopen(),
.H5Dopen(), and .H5Gopen(), respectively.
See commit 31a7e06 for more information.
CHANGES IN VERSION 1.32.0
-------------------------
NEW FEATURES
o Some light refactoring of the HDF5 dump management utilities:
- All the settings controlled by the get/setHDF5Dump*() functions are
now formally treated as global options (i.e. they're stored in the
global .Options vector). The benefit is that the settings will always
get passed to the workers in the context of parallel evaluation, even
when using a parallel back-end like BiocParallel::SnowParam.
In other words, all the workers are now guaranteed to use the same
settings as the main R process.
- In addition, getHDF5DumpFile() was further modified to make sure that
it will generate unique "automatic dump files" across workers.
SIGNIFICANT USER-VISIBLE CHANGES
o Change 'with.dimnames' default to TRUE (was FALSE) in writeHDF5Array().
BUG FIXES
o Make sure that chunkdim(x) on a TENxRealizationSink,
CSC_H5SparseMatrixSeed, or CSR_H5SparseMatrixSeed object 'x'
**always** returns dimensions that are at most dim(x), even
when 'x' has 0 rows and/or columns.
CHANGES IN VERSION 1.30.0
-------------------------
NEW FEATURES
o Add 'dim' and 'sparse.layout' args to H5SparseMatrixSeed().
SIGNIFICANT USER-VISIBLE CHANGES
o HDF5Array now imports S4Arrays.
CHANGES IN VERSION 1.28.0
-------------------------
- No changes in this version.
CHANGES IN VERSION 1.26.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Try harder to find and load the matrix rownames of a 10x Genomics dataset.
See commit abafbb9e99ad54a64e5013305486b97daa9442bc.
BUG FIXES
o Handle HDF5 sparse matrices where shape is not an integer vector.
When the shape returned by internal helper .read_h5sparse_dim() is a
double vector it is now coerced to an integer vector. Integer overflows
resulting from this coercion trigger an error with an informative error
message.
See GitHub issue #48.
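The coercion described in this entry can be sketched in base R as follows. This is only an illustration of the idea: coerce_shape() is a hypothetical helper written for this note, not the package's internal .read_h5sparse_dim(), and the error message is made up.

```r
# Hypothetical helper (NOT part of HDF5Array): coerce a sparse matrix
# shape to an integer vector, failing loudly on overflow.
coerce_shape <- function(shape) {
    if (is.integer(shape))
        return(shape)
    stopifnot(is.numeric(shape))
    # Values above .Machine$integer.max (2^31 - 1) cannot be stored
    # as R integers, so raise an informative error instead of
    # silently producing NAs.
    if (anyNA(shape) || any(shape > .Machine$integer.max))
        stop("sparse matrix dimensions are too big to be stored ",
             "as integers")
    as.integer(shape)
}

# A shape stored as doubles is coerced to integers.
ok <- coerce_shape(c(33538, 1306127))

# A dimension beyond 2^31 - 1 triggers the error.
msg <- tryCatch(coerce_shape(c(3e9, 10)),
                error = function(e) conditionMessage(e))
```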
CHANGES IN VERSION 1.24.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Improve error reporting in internal helper .h5openlocalfile()
BUG FIXES
o Make sure updateObject() handles very old HDF5ArraySeed instances.
CHANGES IN VERSION 1.22.0
-------------------------
- No changes in this version.
CHANGES IN VERSION 1.20.0
-------------------------
NEW FEATURES
o Implement the H5SparseMatrix class and H5SparseMatrix() constructor
function. H5SparseMatrix is a DelayedMatrix subclass for representing
and operating on an HDF5 sparse matrix stored in CSR/CSC/Yale format.
o Implement the H5ADMatrix class and H5ADMatrix() constructor function.
H5ADMatrix is a DelayedMatrix subclass for representing and operating
on the central matrix of an ‘h5ad’ file, or any matrix in its '/layers'
group.
o Implement H5File objects. The H5File class provides a formal
representation of an HDF5 file (local or remote, including a file
stored in an Amazon S3 bucket).
o HDF5Array objects now work with files on Amazon S3 (via use of H5File()).
BUG FIXES
o Remove "global counter" files at unload time (commit f7913043).
CHANGES IN VERSION 1.18.0
-------------------------
NEW FEATURES
o Add 'as.sparse' argument to h5mread(), HDF5Array(), HDF5ArraySeed(),
writeHDF5Array(), saveHDF5SummarizedExperiment(), and
HDF5RealizationSink().
Even though it won't change how the data is stored in the HDF5 file
(data will still be stored the usual dense way), the 'as.sparse'
argument allows the user to control whether the HDF5 dataset should
be considered sparse (and treated as such) or not. More precisely,
when HDF5Array() is called with 'as.sparse=TRUE', the returned object
will be considered sparse i.e. blocks in the object will be loaded as
sparse objects during block processing. This should lead to less
memory usage and hopefully overall better performance.
o Add is_sparse() setter for HDF5Array and HDF5ArraySeed objects.
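A minimal sketch of how the 'as.sparse' argument and the is_sparse() setter might be used, assuming the HDF5Array package is installed. The temporary file, the dataset name "counts", and the toy matrix are made up for illustration.

```r
library(HDF5Array)

# A mostly-zero integer matrix, written to a temporary HDF5 file.
# The data is stored on disk the usual dense way.
m <- matrix(0L, nrow = 100, ncol = 20)
m[sample(length(m), 50)] <- 1L
path <- tempfile(fileext = ".h5")
writeHDF5Array(m, filepath = path, name = "counts")

# Ask for the dataset to be treated as sparse: blocks will be loaded
# as sparse objects during block processing.
A <- HDF5Array(path, "counts", as.sparse = TRUE)
sparse_before <- is_sparse(A)

# The setter flips the flag on the object; the file is not modified.
is_sparse(A) <- FALSE
sparse_after <- is_sparse(A)
```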
SIGNIFICANT USER-VISIBLE CHANGES
o Change default value of 'verbose' argument from FALSE to NA for
writeHDF5Array(), saveHDF5SummarizedExperiment(), and writeTENxMatrix().
BUG FIXES
o Fix handling of logical NAs in h5mread().
o Fix bug in saveHDF5SummarizedExperiment() when 'chunkdim' is specified.
CHANGES IN VERSION 1.16.0
-------------------------
NEW FEATURES
o New h5writeDimnames()/h5readDimnames() functions for writing/reading
the dimnames of an HDF5 dataset to/from the HDF5 file.
See ?h5writeDimnames for more information.
o Add full support for HDF5Array objects of type "raw":
- writeHDF5Array() now works on a DelayedArray object of type "raw" (it
creates an H5 dataset of type H5T_STD_U8LE).
- The HDF5Array() constructor now should return an HDF5Array object of
type "raw" when pointed to an H5 dataset with an 8-bit width type (e.g.
H5T_STD_U8LE, H5T_STD_U8BE, H5T_STD_I8LE, H5T_STD_I8BE, H5T_STD_B8LE,
H5T_STD_B8BE, etc.).
o Add 'H5type' argument to writeHDF5Array().
o h5mread() now supports contiguous (i.e. unchunked) string data.
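The sketch below illustrates the 'H5type' argument and the dimnames support, assuming the HDF5Array package is installed. The dataset name "expr", the toy matrix, and the choice of a 32-bit float type are illustrative assumptions only.

```r
library(HDF5Array)

m <- matrix(runif(12), nrow = 3,
            dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))
path <- tempfile(fileext = ".h5")

# Request a 32-bit float on-disk type and write the dimnames
# alongside the data.
A <- writeHDF5Array(m, filepath = path, name = "expr",
                    H5type = "H5T_IEEE_F32LE", with.dimnames = TRUE)

# The returned HDF5Array object finds its dimnames in the file ...
dn <- dimnames(A)

# ... and they can also be read directly from the file.
dn_from_file <- h5readDimnames(path, "expr")
```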
SIGNIFICANT USER-VISIBLE CHANGES
o HDF5Array objects now find their dimnames in the HDF5 file.
writeHDF5Array() and as(x, "HDF5Array") know how to write the dimnames
to the HDF5 file, and the HDF5Array() constructor knows how to find
them. See ?writeHDF5Array for more information.
BUG FIXES
o Fix bug causing character data to be truncated when written to HDF5 file.
o Fix h5mread() inefficiency when the user selection covers full chunks.
o h5mread() now handles character NAs consistently with rhdf5::h5read().
o Fix writeHDF5Array() error on character array filled with NAs.
CHANGES IN VERSION 1.14.0
-------------------------
NEW FEATURES
o Add coercions from TENxMatrix (or TENxMatrixSeed) to dgCMatrix
SIGNIFICANT USER-VISIBLE CHANGES
o h5mread() argument 'starts' now defaults to NULL
BUG FIXES
o h5mread() now supports datasets with contiguous layout (i.e. not chunked)
CHANGES IN VERSION 1.12.0
-------------------------
NEW FEATURES
o Add 'prefix' arg to save/loadHDF5SummarizedExperiment()
o Add quickResaveHDF5SummarizedExperiment() for fast re-saving after
initial saveHDF5SummarizedExperiment().
See ?quickResaveHDF5SummarizedExperiment for more information.
o Add h5mread() as a faster alternative to rhdf5::h5read(). It is now
the workhorse behind the extract_array() method for HDF5ArraySeed
objects. This change should significantly speed up block processing
of HDF5ArraySeed-based DelayedArray objects (including HDF5Array
objects).
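As an illustration, a direct call to h5mread() might look like the sketch below, assuming the HDF5Array package is installed. The file and the dataset name "m" are made up; 'starts' is given as a list with one element per dimension, where NULL selects the whole dimension.

```r
library(HDF5Array)

m <- matrix(1:60, nrow = 6)
path <- tempfile(fileext = ".h5")
writeHDF5Array(m, filepath = path, name = "m")

# Read rows 2 and 5 across all columns.
sub <- h5mread(path, "m", starts = list(c(2L, 5L), NULL))
```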
CHANGES IN VERSION 1.10.0
-------------------------
NEW FEATURES
o Implement the TENxMatrix container (DelayedArray backend for the
HDF5-based sparse matrix representation used by 10x Genomics).
Also add writeTENxMatrix() and coercion to TENxMatrix.
SIGNIFICANT USER-VISIBLE CHANGES
o By default automatic HDF5 datasets (e.g. the dataset that gets written
to disk when calling 'as(x, "HDF5Array")') now are created with chunks
of 1 million array elements (previous default was 1/75 of
'getAutoBlockLength(x)'). This can be controlled with new low-level
utilities get/setHDF5DumpChunkLength().
o By default automatic HDF5 datasets now are created with chunks of
shape "scale" instead of "first-dim-grows-first". This can be
controlled with new low-level utilities get/setHDF5DumpChunkShape().
o getHDF5DumpChunkDim() loses the 'type' and 'ratio' arguments (only 'dim'
is left).
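The low-level utilities mentioned above could be used along these lines. This is a sketch that assumes the HDF5Array package is installed; the chunk length of 250000 and the array dimensions are arbitrary values chosen for illustration.

```r
library(HDF5Array)

# Inspect the current settings for automatic HDF5 datasets.
getHDF5DumpChunkLength()
getHDF5DumpChunkShape()

# Use smaller chunks that grow along the first dimension first.
setHDF5DumpChunkLength(250000L)
setHDF5DumpChunkShape("first-dim-grows-first")

# Chunk dimensions that would be used for a 5000 x 2000 dataset.
cd <- getHDF5DumpChunkDim(c(5000L, 2000L))

# Calling the setters with no argument restores the defaults.
setHDF5DumpChunkLength()
setHDF5DumpChunkShape()
```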