File: chrysalis.notes

package info (click to toggle)
trinityrnaseq 2.2.0%2Bdfsg-2
  • links: PTS, VCS
  • area: main
  • in suites: stretch
  • size: 212,452 kB
  • ctags: 5,067
  • sloc: perl: 45,552; cpp: 19,678; java: 11,865; sh: 1,485; makefile: 613; ansic: 427; python: 313; xml: 83
file content (30 lines) | stat: -rwxr-xr-x 1,480 bytes parent folder | download | duplicates (5)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
there are two parts for the graph, one is finalgraph.out: for each component, you have the identifier

Component 144

followed by the list of nodes:

477     476     850     TATCTTTGTTTAGATTAATTCTGC        0
478     477     576     ATCTTTGTTTAGATTAATTCTGCG        0
479     478     291     TCTTTGTTTAGATTAATTCTGCGC        0
480     479     235     CTTTGTTTAGATTAATTCTGCGCT        0
481     480     190     TTTGTTTAGATTAATTCTGCGCTA        0

listing, in each line:
- the node ID
- the incoming (left) node ID
- the edge weight, i.e. the 25-mer count going into this node (in the example above, node 478 lists the number of 25-mers TATCTTTGTTTAGATTAATTCTGCG  since the incoming node, 477, starts with "A").
- Last column, ignore (a debug flag)

Nodes can list the same 24-mer (up to 4 times). The second part is finalgraph.out.reads, which basically lists all the reads that contribute at least one edge to the component:

>SRR039231.29_FC42DB6AAXX:2:1:2:1297/1  0       324     50      16522           AAATTTTTGAGAAGTTCTCATTTACAAAATAAATTGCGACCTCGATGTTGGATTAAGAACAACTTATAAAGAAAA

- read name
- first k-mer position used in component in read coordinates
- first node id in graph corresponding to first used k-mer read
- last k-mer position used in component in read coordinates (always less than read length - k)
- last node id in graph corresponding to last used k-mer in read
- read sequence

The transformation from inchworm to finalgraph is lossless, modulo breaking and joining