## Compression rate comparison

No taxids are stored.

1. Prepare a genome sequence. I used human chromosome X (`t_chrX.fa.gz`).

2. Computation

    f=t_chrX.fa.gz
    ./cr2.sh "$f" > table.tsv
    
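3. Plot: convert each size to a percentage of the plain-text size, reshape the table to long format, then run `plot.R`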
    
    cat table.tsv \
        | csvtk -t mutate2 -L 1 -n r_gzip -e '$gzip/$plain*100' \
        | csvtk -t mutate2 -L 1 -n r_unik.default -e '$unik/$plain*100' \
        | csvtk -t mutate2 -L 1 -n r_unik.compact -e '$cunik/$plain*100' \
        | csvtk -t mutate2 -L 1 -n r_unik.sorted -e '$sunik/$plain*100' \
        | csvtk -t cut -F -f 'k,num,r_*' \
        | csvtk -t gather -k group -v value -F -f 'r_*' \
        | csvtk -t replace -f group -p 'r_' \
        | csvtk -t replace -f num -p '^(.+)$' -k size.tsv -r '{kv} k-mers' \
        > table.r.tsv
    
    ./plot.R
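
The ratio step above can also be sketched with plain `awk`, which may help when checking the numbers by hand. This is only an illustration, not part of the original workflow: the function name `ratio_table` is hypothetical, the column names (`k`, `num`, `plain`, `gzip`, `unik`, `cunik`, `sunik`) are taken from the `csvtk` expressions above, and rounding to one decimal place assumes that is what `-L 1` requests.

```shell
# Hypothetical helper: print each size as a percentage of the plain size.
# Assumes a tab-separated file with a header row containing the columns
# k, num, plain, gzip, unik, cunik and sunik (names taken from the pipeline).
ratio_table() {
    awk -F'\t' -v OFS='\t' '
        NR == 1 {
            # Map column names to positions so column order does not matter.
            for (i = 1; i <= NF; i++) col[$i] = i
            print "k", "num", "r_gzip", "r_unik.default", "r_unik.compact", "r_unik.sorted"
            next
        }
        {
            p = $(col["plain"])
            printf "%s\t%s\t%.1f\t%.1f\t%.1f\t%.1f\n",
                $(col["k"]), $(col["num"]),
                $(col["gzip"]) / p * 100, $(col["unik"]) / p * 100,
                $(col["cunik"]) / p * 100, $(col["sunik"]) / p * 100
        }
    ' "$1"
}
```

Calling `ratio_table table.tsv` prints a wide table of percentages. It does not perform the `gather` and `replace` steps, so it is not a drop-in replacement for the pipeline that feeds `plot.R`.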