# Getting Started #
```
git clone https://bitbucket.org/genomicepidemiology/kmerresistance.git
cd kmerresistance && make;
./kmerresistance -i reads_se.fq.gz -o output/name -t_db templates -s_db species
./kmerresistance -ipe reads_1.fq.gz reads_2.fq.gz -o output/name -t_db templates -s_db species
```
# Introduction #
KmerResistance correlates mapped genes with the predicted species of WGS
samples. This allows identification of genes in samples that have been poorly
sequenced, and high-accuracy predictions for samples with contamination.
KmerResistance has one dependency, namely KMA to perform the mapping, which is
also freely available.
If you use KmerResistance for your published research, then please cite:
Philip T.L.C. Clausen, Ea Zankari, Frank M. Aarestrup & Ole Lund,
"Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data",
J Antimicrob Chemother. 2016 Sep;71(9):2484-8.
Philip T.L.C. Clausen, Frank M. Aarestrup & Ole Lund,
"Rapid and precise alignment of raw reads against redundant databases with KMA",
BMC Bioinformatics, 2018;19:307.
# Indexing #
Databases have to be set up with KMA. The resistance genes need standard
indexing, while the species database needs sparse indexing (the "-Sparse" option).
# Templates, e.g. resistance genes: #
```
kma index -i ResFinder.fsa -o ResFinder
```
# Species database (or other host database): #
```
kma index -i bacteria.fsa -o bacteria -Sparse ATG
```
# Gene identification #
Aligning reads against resistance genes:
```
kmerresistance -i sample_1.fastq sample_2.fastq -o out -t_db ResFinder -s_db bacteria
```
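When several paired-end samples need processing, the call can be wrapped in a small loop. The following is a minimal sketch and not part of KmerResistance: the helper name kmerres_cmd, the _1.fastq / _2.fastq naming convention and the out/ directory are assumptions made for illustration. It uses the "-ipe" option shown in Getting Started and only prints the commands, so they can be inspected before anything is run.

```shell
#!/bin/sh
# Hypothetical helper (not part of KmerResistance): print the command for one
# paired-end sample. Assumes reads are named <sample>_1.fastq / <sample>_2.fastq
# and that the databases are called ResFinder and bacteria, as indexed above.
kmerres_cmd() {
    fwd=$1                          # forward reads, e.g. reads/sampleA_1.fastq
    rev="${fwd%_1.fastq}_2.fastq"   # matching reverse reads
    name=$(basename "${fwd%_1.fastq}")
    printf 'kmerresistance -ipe %s %s -o out/%s -t_db ResFinder -s_db bacteria\n' \
        "$fwd" "$rev" "$name"
}

# Dry run: list the commands for every forward-read file in the current directory.
for f in ./*_1.fastq; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    kmerres_cmd "$f"
done
```

Once the printed commands look right, the output of the loop can be piped to sh to execute them.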
# Output: #
1. *.KmerRes KmerResistance output.
2. *.res Result file, containing a summary of the output.
3. *.aln Consensus alignment.
4. *.fsa Consensus sequences drawn from mappings.
5. *.frag Information about each mapped read, containing: read, number of matches, alignment score, start, end, template, read name.
6. *.mat.gz Count of each called nucleotide at each position in all mapped templates. Requires that the "-matrix" option is enabled when mapping. The columns are: Ref. nucleotide, #A, #T, #C, #G, #N, #-.
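The matrix file can be summarised with standard tools. The sketch below is an illustration and not part of KmerResistance. It assumes a particular layout that should be checked against your own output: tab-separated columns in the order listed for *.mat.gz, one row per position, and each template introduced by a header line starting with "#". The function name mat_depth is hypothetical. For each template it prints the number of rows and the mean of the summed counts per row.

```shell
#!/bin/sh
# Hypothetical helper: read an uncompressed matrix on stdin and print
# "<template><TAB><rows><TAB><mean depth>" for each template.
# Assumed layout: "#template" header lines, then tab-separated rows of
# Ref. nucleotide, #A, #T, #C, #G, #N, #-.
mat_depth() {
    awk -F '\t' '
        # Print the summary for the template collected so far, if any.
        function report() {
            if (name != "" && rows > 0)
                printf "%s\t%d\t%.2f\n", name, rows, total / rows
        }
        # A header line starts a new template.
        /^#/    { report(); name = substr($0, 2); rows = 0; total = 0; next }
        # A data row has seven fields; sum the six count columns.
        NF == 7 { total += $2 + $3 + $4 + $5 + $6 + $7; rows++ }
        END     { report() }
    '
}

# Usage (assuming a run wrote out.mat.gz):
#   gzip -dc out.mat.gz | mat_depth
```

Blank lines and rows with an unexpected number of fields are ignored rather than reported, which keeps the sketch short but can hide a layout mismatch.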
# Installation Requirements #
To run KmerResistance, you need to install KMA, which is available here:
*https://bitbucket.org/genomicepidemiology/kma.git*
# Help #
Usage and options are available with the "-h" option on all three programs.
If in doubt, please mail any concerns or problems to: *plan@dtu.dk*.
# Citation #
1. Philip T.L.C. Clausen, Ea Zankari, Frank M. Aarestrup & Ole Lund, "Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data", J Antimicrob Chemother. 2016 Sep;71(9):2484-8.
2. Philip T.L.C. Clausen, Frank M. Aarestrup & Ole Lund, "Rapid and precise alignment of raw reads against redundant databases with KMA", BMC Bioinformatics, 2018;19:307.
# License #
Copyright (c) 2017, Philip Clausen, Technical University of Denmark
All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.