File: changelog.txt

______ 1.0.17 -> 1.0.18 ______

* Bug fix for rare cases when sampling to get coverage/GC stats.

* When writing broken assembly, now by default include all bin
contigs >= 1000 bases long in the main assembly. Can change this
cutoff with -m option. Short contigs still written to the bin.
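
The length cutoff described above can be sketched roughly as follows. This is an illustration only, not REAPR's own code: the function name, the dictionary input and the example contig names are hypothetical, and `min_length` merely stands in for the -m option.

```python
def split_bin_contigs(bin_contigs, min_length=1000):
    """Split bin contigs into those moved to the main assembly and those left in the bin.

    bin_contigs: dict mapping contig name -> sequence (hypothetical input format).
    min_length:  cutoff in bases; stands in for the -m option.
    """
    to_main = {}
    to_bin = {}
    for name, seq in bin_contigs.items():
        if len(seq) >= min_length:
            # Long enough: goes into the main assembly.
            to_main[name] = seq
        else:
            # Too short: stays in the bin.
            to_bin[name] = seq
    return to_main, to_bin


# Made-up contigs: one exactly at the cutoff, one a single base short of it.
example = {"REAPR_bin.a": "A" * 1000, "REAPR_bin.b": "C" * 999}
main_part, bin_part = split_bin_contigs(example)
print(sorted(main_part), sorted(bin_part))
```

The comparison is inclusive, matching the ">= 1000 bases" wording of the entry.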

______ 1.0.16 -> 1.0.17 ______

* Expose number of threads option in smalt map

* Expose all task options when running the pipeline

* Bug fix with some options not working in stats task.

* Add comma to the list of bad characters in facheck

* Update manual to reflect changes, and extend the examples
to show what to do with just one library. Fix a couple
of typos in examples.

* Speed up fa2gc stage of preprocess by about 4 times.


______ 1.0.15 -> 1.0.16 ______

* Added option -t to task 'break'. This can be used to trim
bases off contig ends, wherever a contig is broken at an
FCD error with the -a option.

* Change default of -l option of break to output sequences that are at
least 100bp long (default was 1bp).

* install.sh now checks that the required Perl modules are installed, and
checks that R is in the path.

* install.sh checks that the OS appears to be Linux and dies if
it's not. Added option to try to force the install anyway regardless of OS.

* Bug fix in smaltmap: depending on the OS, bam header was not getting
made correctly.

* smaltmap now starts by running samtools faidx on the assembly fasta file.
A common cause of the pipeline falling over is a fasta file that makes
samtools faidx segfault. Print a nice error message about this
if samtools faidx ends badly.
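
A check of this kind might look like the sketch below. It is not how smaltmap is implemented: the function names and the wording of the message are hypothetical, and the only external command assumed is `samtools faidx <fasta>`.

```python
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable message."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N
        # (a segfault is usually signal 11).
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


def check_faidx(fasta_path, samtools="samtools"):
    """Run 'samtools faidx' on the assembly and raise a readable error on failure."""
    result = subprocess.run([samtools, "faidx", fasta_path])
    if result.returncode != 0:
        raise RuntimeError(
            f"samtools faidx failed on {fasta_path} "
            f"({describe_exit(result.returncode)}); "
            "check the fasta file before running the rest of the pipeline"
        )
```

Calling `check_faidx("assembly.fa")` before any mapping step would stop early with a clear message instead of failing later in the run.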

______ 1.0.14 -> 1.0.15 ______

* Added task 'seqrename' to rename all the sequences
in a BAM file. This saves remapping the reads to make a
new BAM that will be OK for the pipeline.

* Added task 'smaltmap' to map reads using SMALT.

* Updated the plots task to make tabix indexed plots, since
Artemis (version 15.0.0) can now read these.

* Bug fix in task 'break', where the -l option for min length
of sequence to output didn't always work.

______ 1.0.13 -> 1.0.14 ______

* Fixed Makefiles for tabix and reapr because they didn't work
on some systems (e.g. Ubuntu).

* Change sequence names output by break: use underscores instead
of : and -, so that the output is compatible with REAPR itself.

* Added -b option to break, which will ignore FCD and low fragment
coverage errors within contigs (i.e. those that don't contain a
gap).
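
The renaming of sequence names noted above amounts to a simple character substitution. The sketch below is an assumption about the general idea, not the code used by break: the function name and the example name format are hypothetical, and it replaces every ':' and '-' in the name, which may be broader than what REAPR does.

```python
def make_compatible_name(name):
    """Replace ':' and '-' with underscores in a sequence name."""
    return name.replace(":", "_").replace("-", "_")


# Hypothetical name of the form <scaffold>:<start>-<end>.
print(make_compatible_name("scaffold1:100-200"))  # scaffold1_100_200
```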

______ 1.0.12 -> 1.0.13 ______

* Bug fix: off-by-one error in coordinates in
errors gff file made by 'score'.

* pipeline now starts by running facheck
on the assembly.

* pipeline changed so that it writes a bash
script of all the commands it's going to run,
then runs that bash script. Useful if it dies
and you want to know the commands needed to finish
the pipeline.

* Change in perfectmap: added --variance to be
0.5 * fragment size in the call to snpomatic.
The previous default was 0.25 * fragment size.

* Added -a option to 'break' for aggressive breaking:
it breaks contigs at errors (as well as breaking at
gaps).
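
Off-by-one errors such as the one fixed in the errors gff file above often come from mixing coordinate conventions. GFF coordinates are 1-based and inclusive. The sketch below assumes, purely for illustration, that positions are held internally as 0-based half-open intervals; the changelog does not say what the actual cause was, and the function name is hypothetical.

```python
def to_gff_coords(start0, end0):
    """Convert a 0-based, half-open interval [start0, end0) to 1-based inclusive."""
    if not 0 <= start0 < end0:
        raise ValueError("expected 0 <= start0 < end0")
    # Only the start shifts: the exclusive 0-based end equals the inclusive 1-based end.
    return start0 + 1, end0


print(to_gff_coords(0, 10))  # first ten bases -> (1, 10)
```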


______ 1.0.11 -> 1.0.12 ______

* Bug fix where in rare cases the 'break'
task would incorrectly make a broken fasta file
with duplicated sequences, or sequences continuing
right through to the end of the scaffold, instead of
stopping at the appropriate gap.

* Prefix the name of every bin contig
made when running break with 'REAPR_bin.'.

* In facheck, added brackets (){} and various
other characters to the list of characters that
break the pipeline.

* More verbose error message in preprocess when
something goes wrong at the point of sampling the
fragment coverage vs GC content.

* Fix typo in report.txt file made by summary, should be
'low score' not 'high score'. Also now writes the same
information in a report.tsv file, for ease of putting
results into spreadsheets.

______ 1.0.10 -> 1.0.11 ______

* Switch meaning of score to be more intuitive,
so that a score of 1 means perfect, down to
0 for bad.  Give all gaps a score of -1.
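
Under the new convention, a score value could be interpreted as in the sketch below. The function name and the labels are hypothetical; only the values 1, 0 and -1 and their meanings come from the entry above.

```python
def describe_score(score):
    """Label a score under the 1.0.11 convention: 1 = perfect, 0 = bad, -1 = gap."""
    if score == -1:
        return "gap"
    if not 0 <= score <= 1:
        raise ValueError(f"score out of range: {score}")
    if score == 1:
        return "perfect"
    if score == 0:
        return "bad"
    return "intermediate"


print([describe_score(s) for s in (1, 0.5, 0, -1)])
```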


______ 1.0.9 -> 1.0.10 ______

* Bug fix with counting perfect bases.  It was slightly
overestimating, by counting gaps which were too long
to call as perfect.


______ 1.0.8 -> 1.0.9 ______

* Added task 'perfectfrombam' to use as an alternative to
perfectmap.  perfectmap maps reads with SNP-o-matic, which
is very fast but also very high memory.  perfectfrombam
takes a BAM file as input, and generates a file of perfect
and uniquely mapped reads, same format as for perfectmap,
for use in the REAPR pipeline.  Intended use case is
large genomes.

* Fix bug where facheck was writing .fa and .info files
when just an assembly fasta was given as input, with no
output files prefix.

* Bug fix of link reporting.  The coords needed 1 adding
to them in the Note... section of the gff file made by score.

* Remove superfluous double-quotes in the note section
of the gff errors file made by score.

* For each plot file, now additionally writes the data in a .dat file.
The R plots truncate the x axis, so the .R files don't have all the
data in them, but the .dat files do, should anyone want it.

* Add option -u to stats task, to just run on a given
list of chromosomes.

* Added -f to every system call to tabix

* 'break' now also outputs a prefix.broken_assembly_bin.fa
fasta file of the parts of the genome which were replaced
with Ns.