File: fingerprint.lua

genometools 1.6.2+ds-3
print ([[

If neither option '-check' nor option '-duplicates' is used, the fingerprints
for all sequences are shown on stdout.

The fingerprint of a sequence is case-insensitive. Thus the MD5 fingerprints
of two identical sequences are the same even if one of them is soft-masked.

Examples
--------

Compute a sorted list of unique fingerprints:

    $ gt fingerprint U89959_ests.fas | sort | uniq > U89959_ests.checklist_uniq

Compare the fingerprints in a checklist with those of a sequence file:

    $ gt fingerprint -check U89959_ests.checklist_uniq U89959_ests.fas
    950b7715ab6cc030a8c810a0dba2dd33 only in sequence_file(s)

Make sure a sequence file contains no duplicates (not the case here):

    $ gt fingerprint -duplicates U89959_ests.fas
    950b7715ab6cc030a8c810a0dba2dd33        2
    gt fingerprint: error: duplicates found: 1 out of 200 (0.500%)

Extract the sequence with a given fingerprint:

    $ gt fingerprint -extract 6d3b4b9db4531cda588528f2c69c0a57 U89959_ests.fas
    >SQ;8720010
    TTTTTTTTTTTTTTTTTCCTGACAAAACCCCAAGACTCAATTTAATCAATCCTCAAATTTACATGATAC
    CAACGTAATGGGAGCTTAAAAATA

Return values
-------------

- 0  everything went fine ('-check': the comparison was successful;
                           '-duplicates': no duplicates found)
- 1  an error occurred    ('-check': the comparison was not successful;
                           '-duplicates': duplicates found)]])
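
--[==[
The help text above says that fingerprints are case-insensitive and that
'-duplicates' exits with status 1 when duplicates are found. The block below
is a minimal sketch of that behaviour, not the tool's actual implementation.
It assumes that case-insensitivity is achieved by upper-casing the sequence
before hashing. The names `fingerprint`, `duplicates`, `seqs` and
`exit_status` are hypothetical and exist only for this illustration.

```python
import hashlib
from collections import Counter


def fingerprint(sequence: str) -> str:
    """Return the MD5 hex digest of the upper-cased sequence.

    Upper-casing is an assumption made for this sketch; it is what makes
    a soft-masked (lower-case) copy hash to the same value.
    """
    return hashlib.md5(sequence.upper().encode("ascii")).hexdigest()


def duplicates(sequences):
    """Map every fingerprint that occurs more than once to its count."""
    counts = Counter(fingerprint(s) for s in sequences)
    return {fp: n for fp, n in counts.items() if n > 1}


# The second sequence is a partly soft-masked copy of the first one.
seqs = ["ACGTACGT", "acgtACGT", "TTTTGGGG"]
dups = duplicates(seqs)

# Mirror the documented return values: 0 if no duplicates, 1 otherwise.
exit_status = 1 if dups else 0
```

With these inputs the first two sequences share one fingerprint, so `dups`
has a single entry and `exit_status` is 1.
]==]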