************
Introduction
************

**sideRETRO** is a bioinformatic tool devoted to the detection
of somatic (*de novo*) **retrocopy insertions** in whole genome
and whole exome sequencing data (WGS, WES). The program is
written from scratch in C and uses the `HTSlib <http://www.htslib.org/>`_
and `SQLite3 <https://www.sqlite.org>`_ libraries to
manage SAM/BAM/CRAM reading and data analysis. The source code is
distributed under the **GNU General Public License**.

Wait, what is a retrocopy?
==========================

I can tell you now that a retrocopy is the result of the
**reverse transcription** of a mature **mRNA** molecule into
**cDNA** and its insertion into a new position in the genome.

.. image:: images/retrocopy.png
   :scale: 50%
   :align: center

Interested? For a more detailed explanation of what a retrocopy
is, please see our section :ref:`Retrocopy in a
nutshell <chap_retrocopy>`.

Features
========

When detecting retrocopy mobilization, sideRETRO can annotate
several other features related to the event:

Parental gene
  The **gene** that **underwent retrotransposition**,
  giving rise to the retrocopy.

Genomic position
  The genome **coordinate** where the retrocopy **integration**
  occurred (chromosome:start-end). It includes the
  **insertion point**.

Strandness
  The orientation of the insertion (+/-), i.e., whether the
  retrocopy was inserted in the **leading** (+) or **lagging** (-)
  DNA strand.

Genomic context
  The context of the retrocopy integration site: whether the
  retrotransposition event occurred in an **intergenic** or
  **intragenic** region. The latter can be split into **exonic**
  and **intronic** according to the host gene.

Genotype
  When **multiple** individuals are analysed, the events are
  annotated for each one. That way, it is possible to
  **distinguish** whether an event is **exclusive** to one
  individual or **shared** among the cohort.

Haplotype
  Our tool provides information about the ploidy of the event,
  i.e., whether it occurs in one or both **homologous** chromosomes
  (heterozygous or homozygous).
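
To make the *Genomic context* categories concrete, here is a minimal
shell sketch that classifies an insertion point against a single
candidate host gene. It is only an illustration of the definitions, not
sideRETRO's implementation: the function name :code:`classify_context`
and its argument layout are hypothetical, and it assumes 1-based,
inclusive coordinates on the same chromosome.

.. code-block:: sh

   # classify_context POS GENE_START GENE_END [EXON_START-EXON_END ...]
   # Hypothetical helper: prints intergenic, exonic or intronic.
   classify_context() {
     pos="$1"
     gene_start="$2"
     gene_end="$3"
     shift 3

     # Outside the host gene boundaries: intergenic
     if [ "$pos" -lt "$gene_start" ] || [ "$pos" -gt "$gene_end" ]; then
       echo intergenic
       return
     fi

     # Inside the gene: exonic if the point falls within any exon
     for exon in "$@"; do
       exon_start="${exon%-*}"
       exon_end="${exon#*-}"
       if [ "$pos" -ge "$exon_start" ] && [ "$pos" -le "$exon_end" ]; then
         echo exonic
         return
       fi
     done

     # Inside the gene, but in none of its exons: intronic
     echo intronic
   }

For example, :code:`classify_context 300 100 500 100-200 400-500` prints
:code:`intronic`, since position 300 lies inside the gene but between
its two exons.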

How it works
============

sideRETRO compiles to an executable called :code:`sider`,
which has three subcommands: :code:`process-sample`,
:code:`merge-call` and :code:`make-vcf`. The :code:`process-sample`
subcommand reads a list of SAM/BAM/CRAM files and captures
**abnormal reads** that may be related to a retrocopy event.
All these data are saved to a **SQLite3 database**. The second
step, :code:`merge-call`, **processes** the database
and **annotates** all the retrocopies found. Finally, the
subcommand :code:`make-vcf` generates an annotated retrocopy
`VCF <https://samtools.github.io/hts-specs/VCFv4.2.pdf>`_.

.. code-block:: sh

   # List of BAM files
   $ cat 'my-bam-list.txt'
   /path/to/file1.bam
   /path/to/file2.bam
   /path/to/file3.bam
   ...

   # Run process-sample step
   $ sider process-sample \
     --annotation-file='my-annotation.gtf' \
     --input-file='my-bam-list.txt'

   $ ls -1
   my-genome.fa
   my-annotation.gtf
   my-bam-list.txt
   out.db

   # Run merge-call step
   $ sider merge-call --in-place out.db

   # Run make-vcf step
   $ sider make-vcf \
     --reference-file='my-genome.fa' out.db

Take a look at the manual pages for :ref:`installation <chap_installation>`
and :ref:`usage <chap_usage>` information. For more details about
the algorithm, see our :ref:`methodology <chap_methodology>`.
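
The three steps shown in the example session can also be chained in a
small wrapper that stops at the first failing step. This is a
convenience sketch, not part of sideRETRO: the function name
:code:`run_sider_pipeline` and the :code:`SIDER` variable are
hypothetical, only the options shown in the session are used, and the
database is assumed to be named :code:`out.db` in the current directory,
as that listing suggests.

.. code-block:: sh

   # Hypothetical override for the executable name (defaults to 'sider')
   SIDER="${SIDER:-sider}"

   # run_sider_pipeline ANNOTATION_GTF BAM_LIST GENOME_FASTA [DATABASE]
   run_sider_pipeline() {
     annotation="$1"    # gene annotation in GTF format
     bam_list="$2"      # text file with one SAM/BAM/CRAM path per line
     genome="$3"        # reference genome in FASTA format
     db="${4:-out.db}"  # assumed database name (see the listing above)

     # Each step runs only if the previous one succeeded
     "$SIDER" process-sample \
       --annotation-file="$annotation" \
       --input-file="$bam_list" &&
     "$SIDER" merge-call --in-place "$db" &&
     "$SIDER" make-vcf --reference-file="$genome" "$db"
   }

With the files from the session, the call would be
:code:`run_sider_pipeline my-annotation.gtf my-bam-list.txt my-genome.fa`.
Chaining with :code:`&&` keeps a failed :code:`process-sample` run from
being followed by :code:`merge-call` on an incomplete database.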

Obtaining sideRETRO
===================

The source code for the program can be obtained from the `GitHub
<https://github.com/galantelab/sideRETRO>`_ page. From the command
line, you can clone our repository::

  $ git clone https://github.com/galantelab/sideRETRO.git

No Warranty
===========

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
`GNU General Public License
<https://www.gnu.org/licenses/gpl-3.0.en.html>`_
for more details.

Reporting Bugs
==============

If you find a bug, or have any issue, please report it in the
`GitHub issues tab <https://github.com/galantelab/sideRETRO/issues>`_.
All bug reports should include:

- The version number of sideRETRO
- A description of the bug behavior

Citation
========

If sideRETRO was somehow useful in your research, please cite it:

.. code-block:: bib

   @article{10.1093/bioinformatics/btaa689,
     author = {Miller, Thiago L A and Orpinelli, Fernanda and Buzzo, José Leonel L and Galante, Pedro A F},
     title = "{sideRETRO: a pipeline for identifying somatic and polymorphic insertions of processed pseudogenes or retrocopies}",
     journal = {Bioinformatics},
     year = {2020},
     month = {07},
     issn = {1367-4803},
     doi = {10.1093/bioinformatics/btaa689},
     url = {https://doi.org/10.1093/bioinformatics/btaa689},
     note = {btaa689},
   }

Further Information
===================

If you need additional information, or closer contact with the authors -
*we are always looking for coffee and good company* - contact us by email;
see :ref:`authors <chap_authors>`.

Our bioinformatics group has a website; feel free to pay us a visit:
https://www.bioinfo.mochsl.org.br/.