![CI workflow](https://github.com/gmarcais/Jellyfish/actions/workflows/c-cpp.yml/badge.svg)

# Jellyfish

## Overview


Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. Jellyfish counts k-mers with an order of magnitude less memory and an order of magnitude faster than other k-mer counting packages. It does so by using an efficient encoding of a hash table and by exploiting the "compare-and-swap" CPU instruction to increase parallelism.
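
As a toy illustration of what "counting k-mers" means (not of how Jellyfish works internally), the sketch below counts the 3-mers of a short made-up sequence with `awk`. The sequence and the variable names are assumptions chosen for the example.

```Shell
# Toy k-mer counter: slide a window of length k over the sequence
# and tally each substring. Illustration only; Jellyfish itself uses
# a lock-free hash table, not awk.
seq="ACGTACGT"   # made-up input sequence
k=3              # k-mer length

kmer_counts=$(printf '%s\n' "$seq" | awk -v k="$k" '
  {
    # i is the 1-based start of the current window
    for (i = 1; i + k - 1 <= length($0); i++)
      count[substr($0, i, k)]++
  }
  END {
    for (mer in count)
      print mer, count[mer]
  }' | sort)

printf '%s\n' "$kmer_counts"
```

For this input the windows are `ACG`, `CGT`, `GTA`, `TAC`, `ACG`, `CGT`, so `ACG` and `CGT` are each counted twice and `GTA` and `TAC` once.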

Jellyfish is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format with the `jellyfish dump` command, or queried for specific k-mers with `jellyfish query`. See the [documentation](doc/Readme.md) for details.
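
A minimal sketch of that count, dump, and query cycle is shown below. The file names `toy.fa` and `toy.jf`, the sequence, and the parameter values are assumptions made up for the example; see the documentation for the full set of options.

```Shell
# Create a tiny FASTA file with made-up data (toy.fa is a hypothetical name).
printf '>toy\nACGTACGTTGCA\n' > toy.fa

if command -v jellyfish >/dev/null 2>&1; then
  # Count canonical 5-mers (-C) with one thread; -s is the initial hash size
  # and -o names the binary output file.
  jellyfish count -m 5 -s 1M -t 1 -C -o toy.jf toy.fa
  # Translate the binary output into text, one "k-mer count" pair per line.
  jellyfish dump -c toy.jf
  # Look up the count of a single k-mer.
  jellyfish query toy.jf ACGTA
else
  echo "jellyfish not found on PATH; install it first" >&2
fi
```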

If you use Jellyfish in your research, please cite:

  Guillaume Marcais and Carl Kingsford, A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics (2011) 27(6): 764-770 ([first published online January 7, 2011](http://bioinformatics.oxfordjournals.org/cgi/content/abstract/27/6/764 "Paper on Oxford Bioinformatics website")) doi:10.1093/bioinformatics/btr011

## Installation

### Linux Binaries

On Debian and Ubuntu with `apt`:
```Shell
sudo apt update
sudo apt install jellyfish
```

On Arch, it is available from [AUR](https://aur.archlinux.org/packages/jellyfish/).

### FreeBSD

Jellyfish can be installed on FreeBSD via the FreeBSD ports system.

To install via the binary package, simply run:
```Shell
pkg install Jellyfish
```

To install from source:
```Shell
cd /usr/ports/biology/jellyfish
make install
```

### Windows

With [Cygwin](https://www.cygwin.com/), Jellyfish can be compiled from source as [explained below](#from-source).
A simpler way on Windows 10 is to first install [WSL](https://docs.microsoft.com/en-us/windows/wsl/install-win10) and then, from the Windows Store, a Linux distribution that carries Jellyfish (e.g., Ubuntu).
Finally, install Jellyfish inside that distribution with:
```Shell
sudo apt update
sudo apt install jellyfish
```

### From source

To get a packaged tarball of the source code that is easier to compile, download a release from the [GitHub releases page][3]. You need make and g++ version 4.4 or higher. To install in your home directory, do:
```Shell
./configure --prefix=$HOME
make -j 4
make install
```

To compile from the git tree, you will also need autoconf, automake, libtool, gettext, pkg-config and [yaggo](https://github.com/gmarcais/yaggo/releases "Yaggo release on github"). Then compile and install (in `/usr/local` in this example) with:
```Shell
autoreconf -i
./configure
make -j 4
sudo make install
```

If the software is installed in system directories (hint: you needed `sudo` to install), as in the example above, then the system library cache must be updated:
```Shell
sudo ldconfig
```

Usage
-----

Instructions for use are available in the [doc](https://github.com/gmarcais/Jellyfish/tree/master/doc) directory.

Extra / Examples
----------------

The examples directory contains potentially useful extra programs to query or manipulate the output files of Jellyfish, using the Jellyfish shared library from C++ or from scripting languages. The examples are not compiled by default. Each subdirectory of examples is independent and is compiled with a simple invocation of `make`.
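
Since each subdirectory is built on its own, a small loop can compile them all in one go. The sketch below assumes it is run from the top of the source tree and that Jellyfish is already installed where the examples' Makefiles can find it; `build_examples` is a hypothetical helper name, not part of Jellyfish.

```Shell
# Build every example subdirectory that contains a Makefile.
# build_examples is a hypothetical helper, not something Jellyfish ships.
build_examples() {
  examples_dir=${1:-examples}
  for dir in "$examples_dir"/*/; do
    # Skip an unexpanded glob and directories without a Makefile.
    [ -f "${dir}Makefile" ] || continue
    # Stop at the first example that fails to compile.
    make -C "$dir" || return 1
  done
}
```

Calling `build_examples` with no argument builds the subdirectories of `examples`; a different directory can be passed as the first argument.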


Binding to script languages
---------------------------

Bindings to Ruby, Python and Perl are provided. These bindings allow reading the output file of Jellyfish directly from a scripting language. Compilation of the bindings is easier from the [release tarball][3]. The development files of the target scripting language are required.

Compilation of the bindings from the git tree requires [SWIG][2] version 3 and adding the switch `--enable-swig` to the configure command lines shown below.

To compile all three bindings, configure and compile with:

```Shell
./configure --enable-ruby-binding --enable-python-binding --enable-perl-binding
make -j 4
sudo make install
```

By default, Jellyfish is installed in `/usr/local` and the bindings are installed in the proper system location. When the `--prefix` switch is passed, the bindings are installed in the given directory. For example:

```Shell
./configure --prefix=$HOME --enable-python-binding
make -j 4
make install
```

This will install the python binding in `$HOME/lib/python2.7/site-packages` (adjust based on your Python version).

Then, for Python, Ruby or Perl to find the binding, an environment variable may need to be adjusted (`PYTHONPATH`, `RUBYLIB` and `PERL5LIB` respectively). For example:

```Shell
export PYTHONPATH=$HOME/lib/python2.7/site-packages
```

See the [swig directory](../../tree/master/swig) for examples on how to use the bindings.

[1]: http://www.genome.umd.edu/jellyfish.html "Genome group at University of Maryland"
[2]: http://www.swig.org/
[3]: https://github.com/gmarcais/Jellyfish/releases "Jellyfish release"