# sketches

[![Go Reference](https://pkg.go.dev/badge/github.com/shenwei356/bio/sketches.svg)](https://pkg.go.dev/github.com/shenwei356/bio/sketches)


This package provides iterators for k-mers and k-mer sketches
([Minimizer](https://academic.oup.com/bioinformatics/article/20/18/3363/202143),
 [Scaled MinHash](https://f1000research.com/articles/8-1006),
 [Closed Syncmers](https://peerj.com/articles/10805/)).
K-mers are either encoded (k<=32) or hashed (arbitrary k, using [ntHash](https://github.com/will-rowe/nthash)) into `uint64`.
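
To make the encoding and the minimizer idea concrete, here is a minimal, self-contained sketch. It does not use this package's API: `EncodeKmers`, `Minimizers`, and the `Minimizer` type are hypothetical names, and the A=0, C=1, G=2, T=3 mapping is an assumption made for illustration. K-mers are ordered by their encoded value to keep the example short, whereas real implementations often order them by hash value instead.

```go
package main

import (
	"errors"
	"fmt"
)

// Minimizer is a hypothetical type: a selected k-mer code and its start position.
type Minimizer struct {
	Code uint64
	Pos  int
}

// base2bit maps a base to 2 bits. The A=0, C=1, G=2, T=3 mapping is assumed.
func base2bit(b byte) (uint64, error) {
	switch b {
	case 'A', 'a':
		return 0, nil
	case 'C', 'c':
		return 1, nil
	case 'G', 'g':
		return 2, nil
	case 'T', 't':
		return 3, nil
	}
	return 0, errors.New("invalid base")
}

// EncodeKmers packs every k-mer of seq (1 <= k <= 32) into a uint64,
// updating the code in O(1) per base instead of re-encoding each k-mer.
func EncodeKmers(seq []byte, k int) ([]uint64, error) {
	if k < 1 || k > 32 {
		return nil, errors.New("k must be in [1, 32]")
	}
	if len(seq) < k {
		return nil, nil
	}
	// mask keeps the lowest 2*k bits; for k == 32 that is all 64 bits.
	mask := ^uint64(0)
	if k < 32 {
		mask = (uint64(1) << (2 * uint(k))) - 1
	}
	codes := make([]uint64, 0, len(seq)-k+1)
	var code uint64
	for i, b := range seq {
		v, err := base2bit(b)
		if err != nil {
			return nil, fmt.Errorf("position %d: %w", i, err)
		}
		code = (code<<2 | v) & mask
		if i >= k-1 {
			codes = append(codes, code)
		}
	}
	return codes, nil
}

// Minimizers picks, in every window of w consecutive k-mers, the k-mer with
// the smallest code (leftmost on ties) and reports each pick only once.
func Minimizers(seq []byte, k, w int) ([]Minimizer, error) {
	if w < 1 {
		return nil, errors.New("w must be >= 1")
	}
	codes, err := EncodeKmers(seq, k)
	if err != nil {
		return nil, err
	}
	var out []Minimizer
	last := -1 // position of the most recently reported minimizer
	for start := 0; start+w <= len(codes); start++ {
		best := start
		for j := start + 1; j < start+w; j++ {
			if codes[j] < codes[best] {
				best = j
			}
		}
		if best != last {
			out = append(out, Minimizer{Code: codes[best], Pos: best})
			last = best
		}
	}
	return out, nil
}

func main() {
	ms, err := Minimizers([]byte("GATTACA"), 3, 3)
	if err != nil {
		panic(err)
	}
	for _, m := range ms {
		fmt.Printf("pos=%d code=%06b\n", m.Pos, m.Code)
	}
}
```

For `GATTACA` with k=3 and w=3, the five k-mers encode to 35, 15, 60, 49 and 4, so the sketch keeps `ATT` (position 1) and `ACA` (position 4). The inner window scan costs O(w) per position, which is fine for a demonstration. A monotone queue would bring that down to amortized O(1).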

Related projects:

- [kmers](https://github.com/shenwei356/kmers) provides manipulations for bit-packed k-mers (k<=32, encoded in `uint64`).
- [kmcp](https://github.com/shenwei356/kmcp) uses this package.
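
As a rough illustration of what manipulating a bit-packed k-mer can look like, the sketch below decodes a code back to bases and computes its reverse complement and canonical form. The function names `Decode`, `RevComp`, and `Canonical` are hypothetical and are not taken from the kmers package. The A=0, C=1, G=2, T=3 mapping is again an assumption.

```go
package main

import "fmt"

// Decode turns a 2-bit packed k-mer back into its bases
// (assumed mapping: A=0, C=1, G=2, T=3).
func Decode(code uint64, k int) string {
	buf := make([]byte, k)
	for i := k - 1; i >= 0; i-- {
		buf[i] = "ACGT"[code&3]
		code >>= 2
	}
	return string(buf)
}

// RevComp returns the reverse complement of a packed k-mer. Under the
// assumed mapping, complementing a base is XOR with 3 (A<->T, C<->G).
func RevComp(code uint64, k int) uint64 {
	var rc uint64
	for i := 0; i < k; i++ {
		rc = rc<<2 | (code&3 ^ 3)
		code >>= 2
	}
	return rc
}

// Canonical returns the smaller of a k-mer and its reverse complement,
// so that both strands map to the same value.
func Canonical(code uint64, k int) uint64 {
	if rc := RevComp(code, k); rc < code {
		return rc
	}
	return code
}

func main() {
	const k = 3
	code := uint64(6) // "ACG" under the assumed mapping
	rc := RevComp(code, k)
	fmt.Println(Decode(code, k), Decode(rc, k), Decode(Canonical(code, k), k))
	// prints "ACG CGT ACG"
}
```

The loop is the simplest correct form. Production code typically replaces it with a fixed sequence of bit swaps.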

## Benchmark

CPU: AMD Ryzen 7 2700X Eight-Core Processor, 3.7 GHz

    $ go test . -bench=Bench* -benchmem \
        | grep Bench \
        | perl -pe 's/\s\s+/\t/g' \
        | csvtk cut -Ht -f 1,3-5 \
        | csvtk add-header -t -n test,time,memory,allocs \
        | csvtk pretty -t -r
 
                                          test           time     memory        allocs
    ------------------------------------------   ------------   --------   -----------
              BenchmarkKmerIterator/1.00_KB-16    11445 ns/op     0 B/op   0 allocs/op
              BenchmarkHashIterator/1.00_KB-16     7974 ns/op    24 B/op   1 allocs/op
           BenchmarkSimHashIterator/1.00_KB-16    79477 ns/op    48 B/op   1 allocs/op
           BenchmarkProteinIterator/1.00_KB-16    17852 ns/op   432 B/op   2 allocs/op
    
           BenchmarkMinimizerSketch/1.00_KB-16    56071 ns/op    48 B/op   2 allocs/op
             BenchmarkSyncmerSketch/1.00_KB-16   101310 ns/op   977 B/op   7 allocs/op
    BenchmarkProteinMinimizerSketch/1.00_KB-16    29914 ns/op   736 B/op   5 allocs/op


## History

This package was originally maintained in [unikmer](https://github.com/shenwei356/unikmer).