.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.16.
.TH DEFINECLONES.PY "1" "October 2020" "DefineClones.py 1.0.1" "User Commands"
.SH NAME
DefineClones.py \- Repertoire clonal assignment toolkit (Python 3)
.SH DESCRIPTION
.nf
usage: DefineClones.py [\-\-version] [\-h] \fB\-d\fR DB_FILES [DB_FILES ...]
                       [\-o OUT_FILES [OUT_FILES ...]] [\-\-outdir OUT_DIR]
                       [\-\-outname OUT_NAME] [\-\-log LOG_FILE] [\-\-failed]
                       [\-\-format {airr,changeo}] [\-\-nproc NPROC]
                       [\-\-sf SEQ_FIELD] [\-\-vf V_FIELD] [\-\-jf J_FIELD]
                       [\-\-gf GROUP_FIELDS [GROUP_FIELDS ...]]
                       [\-\-mode {allele,gene}] [\-\-act {first,set}]
                       [\-\-model {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}]
                       [\-\-dist DISTANCE] [\-\-norm {len,mut,none}]
                       [\-\-sym {avg,min}] [\-\-link {single,average,complete}]
                       [\-\-maxmiss MAX_MISSING]
.fi
.PP
Assign Ig sequences to clones.
.SS "help:"
.TP
\fB\-\-version\fR
show program's version number and exit
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.SS "standard arguments:"
.TP
\fB\-d\fR DB_FILES [DB_FILES ...]
A list of tab\-delimited database files. (default:
None)
.TP
\fB\-o\fR OUT_FILES [OUT_FILES ...]
Explicit output file name. Note, this argument cannot
be used with the \fB\-\-failed\fR, \fB\-\-outdir\fR, or \fB\-\-outname\fR
arguments. If unspecified, then the output filename
will be based on the input filename(s). (default:
None)
.TP
\fB\-\-outdir\fR OUT_DIR
Changes the output directory to the
location specified. The input file directory is used
if this is not specified. (default: None)
.TP
\fB\-\-outname\fR OUT_NAME
Changes the prefix of the successfully processed
output file to the string specified. May not be
specified with multiple input files. (default: None)
.TP
\fB\-\-log\fR LOG_FILE
Specify to write verbose logging to a file. May not be
specified with multiple input files. (default: None)
.TP
\fB\-\-failed\fR
If specified, create files containing records that fail
processing. (default: False)
.TP
\fB\-\-format\fR {airr,changeo}
Specify input and output format. (default: airr)
.TP
\fB\-\-nproc\fR NPROC
The number of simultaneous computational processes to
execute (CPU cores to utilize). (default: 8)
.SS "cloning arguments:"
.TP
\fB\-\-sf\fR SEQ_FIELD
Field to be used to calculate distance between
records. Defaults to junction (airr) or JUNCTION
(changeo). (default: None)
.TP
\fB\-\-vf\fR V_FIELD
Field containing the germline V segment call. Defaults
to v_call (airr) or V_CALL (changeo). (default: None)
.TP
\fB\-\-jf\fR J_FIELD
Field containing the germline J segment call. Defaults
to j_call (airr) or J_CALL (changeo). (default: None)
.TP
\fB\-\-gf\fR GROUP_FIELDS [GROUP_FIELDS ...]
Additional fields to use for grouping clones aside
from V, J and junction length. (default: None)
.TP
\fB\-\-mode\fR {allele,gene}
Specifies whether to use the V(D)J allele or gene for
initial grouping. (default: gene)
.TP
\fB\-\-act\fR {first,set}
Specifies how to handle multiple V(D)J assignments for
initial grouping. The "first" action will use only the
first gene listed. The "set" action will use all gene
assignments and construct a larger gene grouping
composed of any sequences sharing an assignment or
linked to another sequence by a common assignment
(similar to single\-linkage). (default: set)
.TP
\fB\-\-model\fR {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}
Specifies which substitution model to use for
calculating distance between sequences. The "ham"
model is nucleotide Hamming distance and "aa" is amino
acid Hamming distance. The "hh_s1f" and "hh_s5f"
models are human specific single nucleotide and 5\-mer
content models, respectively, from Yaari et al, 2013.
The "mk_rs1nf" and "mk_rs5nf" models are mouse
specific single nucleotide and 5\-mer content models,
respectively, from Cui et al, 2016. The "m1n_compat"
and "hs1f_compat" models are deprecated models
provided for backwards compatibility with the "m1n" and
"hs1f" models in Change\-O v0.3.3 and SHazaM v0.1.4.
Both 5\-mer models should be considered experimental.
(default: ham)
.TP
\fB\-\-dist\fR DISTANCE
The distance threshold for clonal grouping. (default:
0.0)
.TP
\fB\-\-norm\fR {len,mut,none}
Specifies how to normalize distances. One of none (do
not normalize), len (normalize by length), or mut
(normalize by number of mutations between sequences).
(default: len)
.TP
\fB\-\-sym\fR {avg,min}
Specifies how to combine asymmetric distances. One of
avg (average of A\->B and B\->A) or min (minimum of A\->B
and B\->A). (default: avg)
.TP
\fB\-\-link\fR {single,average,complete}
Type of linkage to use for hierarchical clustering.
(default: single)
.TP
\fB\-\-maxmiss\fR MAX_MISSING
The maximum number of non\-ACGT characters (gaps or Ns)
to permit in the junction sequence before excluding
the record from clonal assignment. Note, under single
linkage, non\-informative positions can create
artifactual links between unrelated sequences. Use
with caution. (default: 0)
.SS "output files:"
.TP
clone\-pass
database with assigned clonal group numbers.
.TP
clone\-fail
database with records failing clonal grouping.
.SS "required fields:"
.IP
sequence_id, v_call, j_call, junction
.SS "output fields:"
.IP
clone_id
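.SH EXAMPLES
The following is an illustrative sketch only, assembled from the options
documented above. The input file name db.tsv and the threshold 0.16 are
hypothetical placeholders, not recommended values; choose a threshold
suited to your own data.
.PP
Group records by V gene, J gene and junction length, then cluster each
group using length\-normalized nucleotide Hamming distance:
.PP
.RS
.nf
DefineClones.py \-d db.tsv \-\-act set \-\-model ham \-\-norm len \-\-dist 0.16
.fi
.RE
.PP
Records that pass are written to a clone\-pass file with the clone_id
field added. Add \fB\-\-failed\fR to also write a clone\-fail file.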
.SH AUTHOR
This manpage was written by Nilesh Patra for the Debian distribution and
can be used for any other usage of the program.