.. _Glossary:

Glossary
--------

* HDF5 is a general-purpose binary container format for large scientific datasets.

* h5py is a Python library providing low-level bindings to the libhdf5 C library and a high-level, NumPy-aware API to interact with HDF5 files on disk.

* The cooler **data model** is a flexible sparse data model for Hi-C and other genomically-labeled arrays.

* The cooler **schema** describes an implementation of the cooler data model using HDF5 as the underlying storage layer.

* Cooler files store one or more cooler **data collections**, each representing a genomically-labeled sparse array.

* Single-resolution cooler files are conventionally given the extension ``.cool``. Multi-resolution files are usually suffixed ``.mcool``.

* The *cooler* Python package provides an API to create cooler files and to interact with them both as data frames and sparse matrices.

* A genomic **pairs** list provides pointwise 2-tuples of single-bp genomic locations. In Hi-C this is also called a contact list.

* A genomic **matrix**, 2D array, or heatmap assigns unique quantitative values to pairs of genomic intervals taken from a bin segmentation of a genome assembly. In Hi-C, a contact matrix is obtained by aggregating pairs.
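
The relationship between a pairs list and a matrix can be illustrated with a minimal sketch. It uses plain Python rather than the *cooler* API, and everything in it is an assumption made for illustration: the toy chromosome sizes, the fixed 250 bp bin width, the 0-based positions, and the helper names ``bin_offsets`` and ``aggregate_pairs``. Each pair is mapped to a genome-wide pair of bin indices, and the counts are kept only for nonzero entries, which is one simple form of a sparse array.

```python
from collections import Counter

# Hypothetical toy assembly: chromosome name -> length in bp.
CHROMSIZES = {"chr1": 1000, "chr2": 600}
BINSIZE = 250  # fixed-width bin segmentation (an assumption of this sketch)


def bin_offsets(chromsizes, binsize):
    """Return the genome-wide index of each chromosome's first bin,
    plus the total number of bins."""
    offsets, total = {}, 0
    for chrom, length in chromsizes.items():
        offsets[chrom] = total
        total += -(-length // binsize)  # ceiling division
    return offsets, total


def aggregate_pairs(pairs, chromsizes, binsize):
    """Count pairs per (bin1, bin2), storing only the upper triangle."""
    offsets, _ = bin_offsets(chromsizes, binsize)
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        i = offsets[chrom1] + pos1 // binsize
        j = offsets[chrom2] + pos2 // binsize
        # A contact is unordered, so order the bin indices.
        counts[(min(i, j), max(i, j))] += 1
    return counts


# Each pair is (chrom1, pos1, chrom2, pos2) with 0-based positions.
pairs = [
    ("chr1", 10, "chr1", 300),
    ("chr1", 120, "chr1", 260),
    ("chr1", 900, "chr2", 50),
    ("chr2", 40, "chr1", 990),
]
matrix = aggregate_pairs(pairs, CHROMSIZES, BINSIZE)
print(dict(matrix))  # {(0, 1): 2, (3, 4): 2}
```

Here the first two pairs fall into the same pair of bins on ``chr1``, and the last two describe the same inter-chromosomal bin pair in opposite orders, so the aggregated matrix has two nonzero entries with a count of 2 each.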