Quickstart
==========

Install `pairtools` and all of its dependencies using the 
`conda <https://conda.io/docs/user-guide/install/index.html>`_ package manager and 
the `bioconda <https://bioconda.github.io/index.html>`_ channel for bioinformatics 
software.

.. code-block:: bash

    $ conda install -c conda-forge -c bioconda pairtools

Set up a new test folder and download a small Hi-C dataset mapped to the sacCer3 genome:

.. code-block:: bash

    $ mkdir /tmp/test-pairtools
    $ cd /tmp/test-pairtools
    $ wget https://github.com/open2c/distiller-test-data/raw/master/bam/MATalpha_R1.bam

Additionally, we will need a .chromsizes file, a TAB-separated plain text table describing the names, sizes and the order of chromosomes in the genome assembly used during mapping:

.. code-block:: bash

    $ wget https://raw.githubusercontent.com/open2c/distiller-test-data/master/genome/sacCer3.reduced.chrom.sizes
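
To make the expected layout concrete, here is a minimal sketch that writes a toy chromsizes table and checks that every line has exactly two TAB-separated fields with an integer size. The file name ``toy.chrom.sizes`` and the chromosome names and sizes in it are made up for illustration; they are not the real sacCer3 values.

.. code-block:: bash

    # Toy table: hypothetical chromosome names and sizes, one chromosome per line.
    printf 'chrA\t1000\nchrB\t2500\n' > toy.chrom.sizes

    # Count lines that do not consist of exactly two TAB-separated fields
    # with a non-negative integer in the second field.
    bad=$(awk -F'\t' 'NF != 2 || $2 !~ /^[0-9]+$/ {n++} END {print n+0}' toy.chrom.sizes)
    echo "malformed lines: $bad"

The same ``awk`` check can be pointed at the downloaded file instead of the toy one as a quick sanity test before parsing.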

With `pairtools parse`, we can convert paired-end sequence alignments stored in .sam/.bam format into .pairs, a TAB-separated table of Hi-C ligation junctions:

.. code-block:: bash

    $ pairtools parse -c sacCer3.reduced.chrom.sizes -o MATalpha_R1.pairs.gz --drop-sam MATalpha_R1.bam 

Inspect the resulting table:

.. code-block:: bash

    $ less MATalpha_R1.pairs.gz
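
If ``less`` shows compressed binary data on your system, decompress the stream first with ``gzip -dc``. The sketch below shows one way to separate the header from the pair records, on the assumption that header lines start with ``#`` and that each remaining line describes one pair. So that it runs without the real data, it uses a toy stand-in file, ``toy.pairs.gz``, whose read IDs, positions and pair types are invented.

.. code-block:: bash

    # Toy stand-in for MATalpha_R1.pairs.gz; all values below are made up.
    printf '%s\n' \
      '## pairs format v1.0' \
      '#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2 pair_type' \
      > toy.pairs
    printf 'read1\tchrA\t100\tchrA\t900\t+\t-\tUU\n' >> toy.pairs
    printf 'read2\tchrA\t250\tchrB\t40\t-\t+\tUU\n' >> toy.pairs
    gzip -f toy.pairs

    # Header lines start with '#'; every other line is one pair.
    n_header=$(gzip -dc toy.pairs.gz | grep -c '^#')
    n_pairs=$(gzip -dc toy.pairs.gz | grep -vc '^#')
    echo "header lines: $n_header, pairs: $n_pairs"

Replacing ``toy.pairs.gz`` with ``MATalpha_R1.pairs.gz`` applies the same counts to the parsed dataset.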