Source: kerchunk
Section: python
Maintainer: Debian GIS Project <pkg-grass-devel@lists.alioth.debian.org>
Uploaders: Antonio Valentino <antonio.valentino@tiscali.it>
Build-Depends: debhelper-compat (= 13),
dh-sequence-python3,
dh-sequence-sphinxdoc <!nodoc>,
pybuild-plugin-pyproject,
python3-aiohttp <!nocheck>,
python3-all,
python3-astropy <!nodoc>,
python3-dask <!nocheck>,
python3-cfgrib,
python3-cftime,
python3-eccodes <!nodoc>,
python3-fsspec,
python3-h5netcdf <!nocheck>,
python3-h5py,
python3-netcdf4 <!nocheck>,
python3-numcodecs,
python3-numpy,
python3-numpydoc <!nodoc>,
python3-pytest <!nocheck>,
python3-s3fs <!nocheck>,
python3-scipy,
python3-setuptools,
python3-setuptools-scm,
python3-sphinx <!nodoc>,
python3-sphinx-rtd-theme <!nodoc>,
python3-tifffile <!nodoc>,
python3-ujson,
python3-xarray,
python3-zarr
Standards-Version: 4.7.3
Testsuite: autopkgtest-pkg-pybuild
Homepage: https://github.com/fsspec/kerchunk
Vcs-Browser: https://salsa.debian.org/debian-gis-team/kerchunk
Vcs-Git: https://salsa.debian.org/debian-gis-team/kerchunk.git
Description: Cloud-friendly access to archival data
Kerchunk is a library that provides a unified way to represent a
variety of chunked, compressed data formats (e.g. NetCDF, HDF5, GRIB),
allowing efficient access to the data from traditional file systems or
cloud object storage. It also provides a flexible way to create
virtual datasets from multiple files. It does this by extracting the
byte ranges, compression information and other information about the
data and storing this metadata in a new, separate object.
This means that you can create a virtual aggregate dataset over
potentially many source files, for efficient, parallel and
cloud-friendly *in-situ* access without having to copy or translate
the originals. It is a gateway to in-the-cloud massive data processing
while the data providers still insist on using legacy formats for
archival storage.
.
Features:
.
* completely serverless architecture
* metadata consolidation, so you can understand a many-file dataset
(metadata plus physical storage) in a single read
* read from all of the storage backends supported by fsspec,
including object storage (s3, gcs, abfs, alibaba), http, cloud user
storage (dropbox, gdrive) and network protocols (ftp, ssh, hdfs,
smb...)
* loading of various file types (currently netcdf4/HDF, grib2, tiff,
fits, zarr), potentially heterogeneous within a single dataset,
without a need to go via the specific driver (e.g., no need for
h5py)
* asynchronous concurrent fetch of many data chunks in one go,
amortizing the cost of latency
* parallel access with a library like zarr without any locks
* logical datasets spanning many (potentially millions of) data
files, with direct access to and subselection of them via
coordinate indexing across an arbitrary number of dimensions

Package: python3-kerchunk
Architecture: all
Depends: ${python3:Depends},
${misc:Depends}
Recommends: python3-cfgrib,
python3-cftime,
python3-h5py,
python3-scipy,
python3-xarray
Suggests: python3-aiohttp,
python3-dask,
python3-netcdf4,
python3-s3fs
Description: ${source:Synopsis}
${source:Extended-Description}

Package: python-kerchunk-doc
Section: doc
Architecture: all
Multi-Arch: foreign
Depends: ${sphinxdoc:Depends},
${misc:Depends}
Suggests: www-browser
Description: ${source:Synopsis} (documentation)
${source:Extended-Description}
.
This package provides the HTML documentation for kerchunk.