File: Changes-1.3.rdoc

= Incompatible and important changes since the BioRuby 1.2.1 release

Many changes have been made to BioRuby since version 1.2.1 was released.

== New features

=== Support for sequence output with improvements of Bio::Sequence

Output of EMBL- and GenBank-formatted text is now supported in the
Bio::Sequence class. See the documentation of Bio::Sequence#output for
details. You can also create Bio::Sequence objects from many kinds of data,
such as Bio::GenBank, Bio::EMBL, and Bio::FastaFormat, by using the
to_biosequence method.

=== BioSQL support

BioSQL support has been completely rewritten using ActiveRecord.

=== Bio::Blast

Bio::Blast.reports can now parse the NCBI default (-m 0) format and the
tabular (-m 8) format, in addition to the XML (-m 7) format.

Bio::Blast::Report now supports XML format with multiple query sequences
generated by blastall 2.2.14 or later.

Bio::Blast.remote now supports DDBJ in addition to GenomeNet.
A list of the BLAST databases available on remote sites can be obtained
with the Bio::Blast::Remote::DDBJ.databases and
Bio::Blast::Remote::GenomeNet.databases methods. Note that these
remote BLAST methods may change in the future to support NCBI.

Bio::Blast::RPSBlast::Report, a parser for NCBI RPS Blast
(Reversed Position Specific Blast) default (-m 0 option) results,
is newly added.

=== Bio::GFF::GFF2 and Bio::GFF::GFF3

Output of GFF2/GFF3-formatted text is now supported. However, many
incompatible changes have been made (see below for details).

=== Bio::Hinv

A client class for the H-Invitational Database web service (REST) is newly
added.

=== Bio::NCBI::REST

A client class for NCBI E-Utilities is newly added.

=== Bio::PAML::Codeml and Bio::PAML::Codeml::Report

Bio::PAML::Codeml, a wrapper for the PAML codeml program, and
Bio::PAML::Codeml::Report, a parser for codeml results, are newly added,
though parts of them are still under construction and are too specific to
particular use cases.

=== Bio::Locations

A new method, Bio::Locations#to_s, is added to support output of features.

=== Bio::TogoWS::REST

A TogoWS REST client class is newly added. Information about the TogoWS REST
service can be found at http://togows.dbcls.jp/site/en/rest.html.

== Deprecated classes

=== Bio::Features

Bio::Features is obsolete and has been changed to an array of Bio::Feature
objects with some backward-compatibility methods.  These methods will be
removed in the near future.

=== Bio::References

Bio::References is obsolete and has been changed to an array of
Bio::Reference objects with some backward-compatibility methods.  These
methods will be removed in the near future.

== Incompatible changes

=== Bio::BIORUBY_VERSION

The definition of the constant Bio::BIORUBY_VERSION has moved from lib/bio.rb
to lib/bio/version.rb. Normally, Ruby's autoload mechanism loads version.rb
correctly, but special scripts that use bio.rb directly may need to be
changed.

Bio::BIORUBY_VERSION is now frozen.

New constants Bio::BIORUBY_EXTRA_VERSION and Bio::BIORUBY_VERSION_ID are
added. See their RDoc for details.

=== Bio::Sequence

Bio::Sequence#date is removed.  Use date_created or date_modified instead.

Bio::Sequence#taxonomy is now an alias of classification, and
its data type is changed to an array of strings.

=== Bio::Locations and Bio::Location

A caret in a location (e.g. "123^124") is now parsed, instead of being
replaced by "..".  To distinguish it from a normal "..", the new attribute
Bio::Location#carat is used.

"order(...)" and "group(...)" are also parsed, instead of being regarded
as "join(...)".  To distinguish them from "join(...)", the new attribute
Bio::Locations#operator is used.  For "order(...)" or "group(...)",
the attribute is set to :order or :group, respectively.  Note that
"group(...)" is already deprecated in EMBL/GenBank/DDBJ.
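
The distinction described above can be illustrated with a small standalone
sketch.  It is not BioRuby's parser: SimpleLocation and
parse_simple_locations are hypothetical names, only plain integer spans are
handled, and returning nil as the operator for a bare span is a choice made
for this sketch, not documented behavior.

```ruby
# Standalone sketch; NOT BioRuby's parser. All names are hypothetical.
SimpleLocation = Struct.new(:from, :to, :carat)

def parse_simple_locations(text)
  operator = nil            # assumption: no operator for a bare span
  body = text.strip
  if (m = body.match(/\A(join|order|group)\((.*)\)\z/))
    operator = m[1].to_sym  # :join, :order, or :group
    body = m[2]
  end
  spans = body.split(",").map do |span|
    case span.strip
    when /\A(\d+)\^(\d+)\z/     # "123^124": a site between two bases
      SimpleLocation.new($1.to_i, $2.to_i, true)
    when /\A(\d+)\.\.(\d+)\z/   # "10..20": an ordinary range
      SimpleLocation.new($1.to_i, $2.to_i, false)
    when /\A(\d+)\z/            # "5": a single base
      SimpleLocation.new($1.to_i, $1.to_i, false)
    else
      raise ArgumentError, "unsupported span: #{span.inspect}"
    end
  end
  { operator: operator, locations: spans }
end

result = parse_simple_locations("order(10..20,123^124)")
puts result[:operator].inspect           # the operator is kept, not folded into :join
puts result[:locations].last.carat       # the caret flag survives parsing
```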

=== Bio::Blast

The return value of the Bio::Blast#exec_* methods is now a String instead of
a Report object. The string is now parsed in the Bio::Blast#query method.

Bio::Blast#exec_genomenet_tab and Bio::Blast#server="genomenet_tab" are
deprecated.

Bio::Blast#options=() can now change the following attributes: program, db,
format, matrix, and filter.

Bio::Blast.reports now supports default (-m 0) and tabular (-m 8) formats.
The old implementation (which supports only XML) is renamed to
Bio::Blast.reports_xml, to keep compatibility with older BLAST XML documents
that might not be parsed by the new Bio::Blast.reports or by Bio::FlatFile,
although we are not sure whether such documents really exist.

=== Bio::Blast::Default::Report and Bio::Blast::WU::Report

Iteration#lambda, #kappa, #entropy, #gapped_lambda, #gapped_kappa,
and #gapped_entropy, and the same methods in the Report class, now
return Float or nil instead of String or nil.

=== Bio::Blat

When reading BLAT psl (or pslx) data by using Bio::FlatFile, it checks
each query name and returns a new entry object when the query name
changes from that of the previous queries. That is, data is stored in two
or more Bio::Blat::Report objects, instead of the previous behavior
(all data was always read at once and stored in a single Bio::Blat::Report
object).
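
The splitting rule can be sketched in plain Ruby as follows.  This is not the
Bio::FlatFile implementation: group_psl_by_query is a hypothetical helper,
the input is assumed to be PSL data lines with the header already removed,
and a new group is assumed to start whenever two consecutive lines differ in
qName (the 10th tab-separated column of the PSL format).

```ruby
# Standalone sketch; NOT the Bio::FlatFile implementation.
QNAME_COLUMN = 9 # 0-based index of qName in a tab-separated PSL data line

# Splits PSL data lines into runs of consecutive lines sharing a query name.
def group_psl_by_query(lines)
  data = lines.map(&:chomp).reject(&:empty?)
  data.slice_when do |prev, cur|
    prev.split("\t")[QNAME_COLUMN] != cur.split("\t")[QNAME_COLUMN]
  end.to_a
end
```

Each returned group corresponds to the lines that would form one report
object under this assumption; a query name that reappears later in the file
starts another group.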

=== Bio::GFF, Bio::GFF::GFF2 and Bio::GFF::GFF3

Bio::GFF::Record#comments is renamed to #comment, and #comments= is
renamed to #comment=, because they only allow a single String (or nil)
and the plural form "comments" may be confusing.  The "comments" and
"comments=" methods can still be used, but warning messages will be
shown when they are used with GFF2::Record and GFF3::Record objects.

See below for GFF2- and GFF3-specific changes.

=== Bio::GFF::GFF2 and Bio::GFF::GFF3

Bio::GFF::GFF2::Record.new and Bio::GFF::GFF3::Record.new can also
take 9 arguments corresponding to the GFF columns, which makes it possible
to create a Record object directly, without formatted text.

Bio::GFF::GFF2::Record#start, #end, and #frame now return Integer or nil,
and #score returns Float or nil, instead of String or nil.
The same changes are also made to Bio::GFF::GFF3::Record.
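
The kind of conversion involved can be sketched without BioRuby.  The helper
names below are hypothetical, and treating "." or an empty column as nil is
an assumption of this sketch based on the GFF convention for undefined
fields, not a statement about how the Record classes are implemented.

```ruby
# Standalone sketch; NOT BioRuby's Record class. Names are hypothetical.
def blank_column?(field)
  field.nil? || field.empty? || field == "."
end

def int_or_nil(field)
  blank_column?(field) ? nil : Integer(field, 10)
end

def float_or_nil(field)
  blank_column?(field) ? nil : Float(field)
end

# Splits one GFF line into its 9 columns and converts the numeric ones.
def parse_gff_columns(line)
  cols = line.chomp.split("\t", 9)
  {
    seqname: cols[0], source: cols[1], feature: cols[2],
    start: int_or_nil(cols[3]), end: int_or_nil(cols[4]),
    score: float_or_nil(cols[5]), strand: cols[6],
    frame: int_or_nil(cols[7]), attributes: cols[8]
  }
end
```

Code that previously compared these values with strings (for example
start == "100") needs to compare with numbers after this change.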

Bio::GFF::GFF2::Record#attributes and Bio::GFF::GFF3::Record#attributes
now return a nested Array containing [ tag, value ] pairs, in order to
support multiple attributes with the same tag name.  If you want
a Hash, use the Record#attributes_to_hash method, though some
tag-value pairs that share a tag name may be lost.  Note that
Bio::GFF::Record#attribute still returns a Hash for compatibility.
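
Why a Hash can lose information is easy to see in a standalone sketch.  The
two helpers below are hypothetical and use GFF3-style "tag=value;..." text;
which of the duplicated values survives in BioRuby's attributes_to_hash is
not stated here, so keeping the last one is only an assumption of this
sketch.

```ruby
# Standalone sketch; NOT BioRuby's implementation. Names are hypothetical.

# Keeps every tag-value pair, in order, as a nested Array.
def parse_attribute_pairs(column9)
  column9.split(";").reject(&:empty?).map do |item|
    tag, value = item.split("=", 2)
    [tag, value]
  end
end

# Collapses the pairs into a Hash; a repeated tag overwrites the earlier
# value (assumption: the last value wins).
def pairs_to_hash(pairs)
  pairs.each_with_object({}) { |(tag, value), hash| hash[tag] = value }
end

pairs = parse_attribute_pairs("ID=gene1;Note=first;Note=second")
puts pairs.size               # number of pairs kept in the Array form
puts pairs_to_hash(pairs).size # number of keys left in the Hash form
```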

New methods for getting, setting and manipulating attributes are added
to Bio::GFF::GFF2::Record and Bio::GFF::GFF3::Record classes:
attribute, get_attribute, get_attributes, set_attribute, replace_attributes,
add_attribute, delete_attribute, delete_attributes, sort_attributes_by_tag!.
It is recommended to use these methods instead of directly manipulating
the array returned by Record#attributes.

Bio::GFF::GFF2#to_s, Bio::GFF::GFF3#to_s, Bio::GFF::GFF2::Record#to_s,
and Bio::GFF::GFF3::Record#to_s are added to support output of
GFF2/GFF3 data.

=== Bio::GFF::GFF2

GFF2 attribute values are now automatically unescaped.  In addition,
if the value of an attribute consists of two or more tokens delimited
by spaces, an object of the new class Bio::GFF::GFF2::Record::Value is
returned instead of a String.  The new class Bio::GFF::GFF2::Record::Value
stores the parsed value of an attribute.  If you really want to get the
unparsed string, Bio::GFF::GFF2::Record::Value#to_s can be used.

The metadata (lines beginning with "##") are parsed into
Bio::GFF::GFF2::MetaData objects and stored in Bio::GFF::GFF2#metadata
as an array, except for the "##gff-version" line.  The "##gff-version"
version string is stored in Bio::GFF::GFF2#gff_version as a string.

=== Bio::GFF::GFF3

Aliases of columns which are renamed in the GFF3 specification are added
to the Bio::GFF::GFF3::Record class: seqid (column 1; alias of "seqname"),
feature_type (column 3; alias of "feature"; in the GFF3 spec, it is
called "type", but because "type" is already used by Ruby, we use
"feature_type"), phase (column 8; formerly "frame"). Original names can
still be used because they are only aliases.

Sequences bundled within GFF3 after "##FASTA" are now supported
(Bio::GFF::GFF3#sequences).

GFF3 attribute keys and values are automatically unescaped. Each attribute
value is stored as a string, except for special attributes listed below:
* Bio::GFF::GFF3::Record::Target to store a "Target" attribute.
* Bio::GFF::GFF3::Record::Gap to store a "Gap" attribute.
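
A minimal sketch of what automatic unescaping means for callers is given
below.  It is not BioRuby's code: both helpers are hypothetical, only %XX
sequences are decoded, and every other character is left untouched, which is
an assumption of this sketch.  Unescaping is applied after splitting on ";"
and "=", so that escaped delimiters inside a value are not mistaken for
separators.

```ruby
# Standalone sketch; NOT BioRuby's implementation. Names are hypothetical.

# Decodes %XX escapes and nothing else.
def unescape_gff3(text)
  text.b.gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }.force_encoding(text.encoding)
end

# Splits column 9 first, then unescapes each key and value.
def parse_and_unescape(column9)
  column9.split(";").reject(&:empty?).map do |item|
    tag, value = item.split("=", 2)
    [unescape_gff3(tag), unescape_gff3(value.to_s)]
  end
end
```

With unescaping done by the parser, code that used to decode values itself
should stop doing so, otherwise a literal "%" in the data could be decoded
twice.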

The metadata (lines beginning with "##") are parsed into
Bio::GFF::GFF3::MetaData objects and stored in Bio::GFF::GFF3#metadata
as an array, except for the "##gff-version", "##sequence-region", "###",
and "##FASTA" lines.
* "##gff-version" version string is stored to Bio::GFF::GFF3#gff_version.
* "##sequence-region" lines are parsed to Bio::GFF::GFF3::SequenceRegion
  objects and stored to Bio::GFF::GFF3#sequence_regions as an array.
* "###" lines are parsed to Bio::GFF::GFF3::RecordBoundary objects.
* "##FASTA" is regarded as the beginning of bundled sequences.
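
The dispatch described in the list above can be pictured with the following
standalone sketch.  classify_gff3_directive and the symbols it returns are
hypothetical; the sketch only shows how the special lines are separated from
generic metadata, not how the BioRuby classes store them.

```ruby
# Standalone sketch; NOT BioRuby's parser. Names and return values are
# hypothetical.
def classify_gff3_directive(line)
  case line.chomp
  when /\A##gff-version\s+(\S+)/
    [:gff_version, $1]
  when /\A##sequence-region\s+(\S+)\s+(\d+)\s+(\d+)/
    [:sequence_region, { seqid: $1, start: $2.to_i, end: $3.to_i }]
  when /\A###\s*\z/            # must be tested before the generic "##" case
    [:record_boundary, nil]
  when /\A##FASTA\b/
    [:fasta, nil]
  when /\A##(\S+)\s*(.*)\z/    # any other "##" line is generic metadata
    [:metadata, { directive: $1, data: $2 }]
  else
    [:other, nil]
  end
end
```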

=== Bio::Pathway

Bio::Pathway#cliquishness now calculates cliquishness (clustering
coefficient) for directed graphs as well as undirected graphs.

In the Bio::Pathway#to_matrix, dump_matrix, dump_list, and
depth_first_search methods, Bio::Pathway#index is now used to determine the
order of nodes in a graph, to avoid depending on the iteration order of
Hash#each (and each_key etc.).
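
For orientation, the textbook local clustering coefficient can be sketched as
follows.  This is not the Bio::Pathway code: local_clustering is a
hypothetical function, and the normalization used here (links among the k
neighbors divided by the k * (k - 1) possible ordered pairs) is the common
definition, assumed rather than confirmed to match what
Bio::Pathway#cliquishness computes.

```ruby
# Standalone sketch of the textbook definition; NOT Bio::Pathway's code.
# edges is an Array of [from, to] pairs; node is the node of interest.
def local_clustering(edges, node, directed: false)
  adj = {}
  edges.each do |a, b|
    next if a == b                       # ignore self-loops
    (adj[a] ||= {})[b] = true
    (adj[b] ||= {})[a] = true unless directed
  end

  # Neighbors are nodes linked to or from the node of interest.
  outgoing = adj.fetch(node, {}).keys
  incoming = adj.select { |_, targets| targets[node] }.keys
  neighbors = (outgoing + incoming).uniq - [node]

  k = neighbors.size
  return 0.0 if k < 2

  # Count ordered neighbor pairs (u, v) joined by an edge u -> v.
  # An undirected edge is stored in both directions, so it counts twice,
  # which makes this equal to 2E / (k * (k - 1)) in the undirected case.
  links = neighbors.sum do |u|
    neighbors.count { |v| u != v && adj.fetch(u, {})[v] }
  end
  links.to_f / (k * (k - 1))
end
```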

=== Bio::SQL and BioSQL related classes

BioSQL support has been completely rewritten using ActiveRecord. See the
documentation in lib/bio/io/sql.rb, lib/bio/io/biosql, and lib/bio/db/biosql
for details of the changes and usage of the classes/modules.