File: README.md

package info (click to toggle)
golang-github-will-rowe-nthash 0.4.0-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, sid, trixie
  • size: 88 kB
  • sloc: makefile: 2
file content (62 lines) | stat: -rw-r--r-- 2,243 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
<div align="center">
  <h1>ntHash</h1>
  <h3><a href="ntHash">ntHash</a> implementation in Go</h3>
  <hr>
  <a href="https://travis-ci.org/will-rowe/ntHash"><img src="https://travis-ci.org/will-rowe/ntHash.svg?branch=master" alt="travis"></a>
  <a href="https://godoc.org/github.com/will-rowe/nthash"><img src="https://godoc.org/github.com/will-rowe/nthash?status.svg" alt="GoDoc"></a>
  <a href="https://goreportcard.com/report/github.com/will-rowe/ntHash"><img src="https://goreportcard.com/badge/github.com/will-rowe/ntHash" alt="goreportcard"></a>
  <a href="https://codecov.io/gh/will-rowe/ntHash"><img src="https://codecov.io/gh/will-rowe/ntHash/branch/master/graph/badge.svg" alt="codecov"></a>
</div>

---

## Overview

This is a Go implementation of the [ntHash](https://github.com/bcgsc/ntHash) recursive hash function for hashing all possible k-mers in a DNA/RNA sequence.

For more information, read the ntHash [paper](http://dx.doi.org/10.1093/bioinformatics/btw397) by Mohamadi et al. or check out their C++ [implementation](https://github.com/bcgsc/ntHash).

This implementation was inspired by [Luiz Irber](https://luizirber.org/) and his recent [blog post](https://blog.luizirber.org/2018/09/13/nthash/) on his cool [Rust ntHash implementation](https://github.com/luizirber/nthash).

I have coded this up in Go so that ntHash can be used in my [HULK](https://github.com/will-rowe/hulk) and [GROOT](https://github.com/will-rowe/groot) projects but feel free to use it for yourselves.

## Installation

```go
go get github.com/will-rowe/nthash
```

## Example usage

### range over ntHash values for a sequence

```go
package main

import (
    "log"
    "github.com/will-rowe/nthash"
)

var (
    sequence = []byte("ACGTCGTCAGTCGATGCAGTACGTCGTCAGTCGATGCAGT")
    kmerSize = 11
)

func main() {

    // create the ntHash iterator using a pointer to the sequence and a k-mer size
    hasher, err := ntHash.New(&sequence, kmerSize)

    // check for errors (e.g. bad k-mer size choice)
    if err != nil {
        log.Fatal(err)
    }

    // collect the hashes by ranging over the hash channel produced by the Hash method
    canonical := true
    for hash := range hasher.Hash(canonical) {
        log.Println(hash)
    }
}
```