= Incompatible and important changes since the BioRuby 1.2.1 release
Many changes have been made to BioRuby since version 1.2.1 was
released.
== New features
=== Support for sequence output with improvements of Bio::Sequence
Output of EMBL- and GenBank-formatted text is now supported in the
Bio::Sequence class. See the documentation of Bio::Sequence#output for details.
You can also create Bio::Sequence objects from many kinds of data such as
Bio::GenBank, Bio::EMBL, and Bio::FastaFormat by using the to_biosequence
method.
=== BioSQL support
BioSQL support has been completely rewritten using ActiveRecord.
=== Bio::Blast
Bio::Blast#reports can parse NCBI default (-m 0) format and tabular (-m 8)
format, in addition to XML (-m 7) format.
Bio::Blast::Report now supports XML format with multiple query sequences
generated by blastall 2.2.14 or later.
Bio::Blast.remote supports DDBJ, in addition to GenomeNet.
In addition, a list of available blast databases on remote sites
can be obtained by using Bio::Blast::Remote::DDBJ.databases and
Bio::Blast::Remote::GenomeNet.databases methods. Note that the above
remote blast methods may be changed in the future to support NCBI.
Bio::Blast::RPSBlast::Report, a parser for NCBI RPS Blast
(Reverse Position Specific Blast) default (-m 0 option) results, is newly added.
=== Bio::GFF::GFF2 and Bio::GFF::GFF3
The outputting of GFF2/GFF3-formatted text is now supported. However, many
incompatible changes have been made (See below for details).
=== Bio::Hinv
H-Invitational Database web service (REST) client class is newly added.
=== Bio::NCBI::REST
NCBI E-Utilities client class is newly added.
=== Bio::PAML::Codeml and Bio::PAML::Codeml::Report
Bio::PAML::Codeml, a wrapper for the PAML codeml program, and
Bio::PAML::Codeml::Report, a parser for codeml results, are newly added,
though parts of them are still under construction and too specific to
particular use cases.
=== Bio::Locations
New method Bio::Locations#to_s is added to support output of features.
=== Bio::TogoWS::REST
A TogoWS REST client class is newly added. Information about the TogoWS REST
service can be found at http://togows.dbcls.jp/site/en/rest.html.
== Deprecated classes
=== Bio::Features
Bio::Features is obsolete and has been changed to an array of Bio::Feature
objects with some backward compatibility methods. The backward compatibility
methods will be removed in the near future.
=== Bio::References
Bio::References is obsolete and has been changed to an array of Bio::Reference
objects with some backward compatibility methods. The backward compatibility
methods will be removed in the near future.
== Incompatible changes
=== Bio::BIORUBY_VERSION
The definition of the constant Bio::BIORUBY_VERSION is moved from lib/bio.rb to
lib/bio/version.rb. Normally, the autoload mechanism of Ruby correctly loads
version.rb, but special scripts that use bio.rb directly may need to be
changed.
Bio::BIORUBY_VERSION is now frozen.
New constants Bio::BIORUBY_EXTRA_VERSION and Bio::BIORUBY_VERSION_ID are
added. See their RDoc for details.
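
As a rough illustration of what a frozen constant means for calling code, the sketch below uses a stand-in module. Demo, VERSION, and modification_rejected? are hypothetical names, not part of BioRuby, and the sketch assumes the version is held as an array of integers with arbitrary example values. Read-only use keeps working, while in-place modification raises an error.

```ruby
# Stand-in for a library module; Demo and VERSION are hypothetical names
# and the numbers are arbitrary example values.
module Demo
  VERSION = [1, 2, 3].freeze
end

# Returns true when an in-place change to the frozen constant is rejected.
# Frozen objects raise FrozenError on Ruby 2.5+, which is a RuntimeError.
def modification_rejected?
  Demo::VERSION << 4
  false
rescue RuntimeError
  true
end

puts Demo::VERSION.join(".")  # read-only access works as usual
puts modification_rejected?
```

Scripts that previously modified the version value in place would need to work on a copy (for example, one made with dup) instead.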
=== Bio::Sequence
Bio::Sequence#date is removed. Use date_created or date_modified
instead.
Bio::Sequence#taxonomy is changed to be an alias of classification, and
its data type is changed to an array of strings.
=== Bio::Locations and Bio::Location
A caret in a location (e.g. "123^124") is now parsed, instead of being
replaced by "..". To distinguish it from a normal "..", a new attribute
Bio::Location#carat is used.
"order(...)" and "group(...)" are also parsed, instead of being regarded
as "join(...)". To distinguish them from "join(...)", a new attribute
Bio::Locations#operator is used. For "order(...)" or "group(...)",
the attribute is set to :order or :group, respectively. Note that
"group(...)" is already deprecated in EMBL/GenBank/DDBJ.
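
To make the distinction concrete, the following sketch classifies a location string with plain Ruby regular expressions. It is not BioRuby's parser. The helper names location_operator and caret_site? are hypothetical, and only the outermost operator of the string is inspected.

```ruby
# Returns :join, :order, or :group when the location string starts with
# that operator, or nil when there is no such wrapper.
def location_operator(location)
  match = location.strip.match(/\A(join|order|group)\(/)
  match && match[1].to_sym
end

# True when the string is a single site between two bases, such as "123^124".
def caret_site?(location)
  !!(location.strip =~ /\A\d+\^\d+\z/)
end

["123^124", "123..124", "order(1..10,20..30)", "join(1..10,20..30)"].each do |loc|
  puts "#{loc}: operator=#{location_operator(loc).inspect} caret=#{caret_site?(loc)}"
end
```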
=== Bio::Blast
The return value of Bio::Blast#exec_* is changed to a String instead of a
Report object. The string is now parsed in the Bio::Blast#query method.
Bio::Blast#exec_genomenet_tab and Bio::Blast#server="genomenet_tab" are
deprecated.
Bio::Blast#options=() can now change the following attributes: program, db,
format, matrix, and filter.
Bio::Blast.reports now supports default (-m 0) and tabular (-m 8) formats.
The old implementation (which supports only XML) is renamed to
Bio::Blast.reports_xml to keep compatibility with older BLAST XML documents
that might not be parsed by the new Bio::Blast.reports or Bio::FlatFile,
although we are not sure whether such documents really exist.
=== Bio::Blast::Default::Report and Bio::Blast::WU::Report
Iteration#lambda, #kappa, #entropy, #gapped_lambda, #gapped_kappa,
and #gapped_entropy, and the same methods in the Report class are
changed to return float or nil instead of string or nil.
=== Bio::Blat
When reading BLAT psl (or pslx) data by using Bio::FlatFile, it checks
each query name and returns a new entry object when the query name
changes from that of the previous queries. That is, the data is stored in
two or more Bio::Blat::Report objects, instead of the previous version's
behavior (always reading all data at once and storing it in a single
Bio::Blat::Report object).
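
The grouping behavior can be pictured with a few lines of plain Ruby. This is only a sketch of the idea, not the Bio::FlatFile implementation: rows are simplified to hypothetical [query_name, target_name] pairs, group_by_consecutive_query is a made-up helper name, and a new group starts whenever the query name differs from that of the preceding row.

```ruby
# Splits consecutive rows into groups that share the same query name.
# Each row is a simplified [query_name, target_name] pair.
def group_by_consecutive_query(rows)
  rows.slice_when { |previous, current| previous[0] != current[0] }.to_a
end

rows = [
  ["query1", "chr1"],
  ["query1", "chr2"],
  ["query2", "chr1"],
  ["query1", "chr3"]  # in this sketch, a reappearing name starts a new group
]

group_by_consecutive_query(rows).each do |group|
  puts "#{group.first[0]}: #{group.length} hit(s)"
end
```

Code that expected a single report for a whole file would, under this model, need to iterate over all returned entries.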
=== Bio::GFF, Bio::GFF::GFF2 and Bio::GFF::GFF3
Bio::GFF::Record#comments is renamed to #comment, and #comments= is
renamed to #comment=, because they only allow a single String (or nil)
and the plural form "comments" may be confusing. The "comments" and
"comments=" methods can still be used, but warning messages will be
shown when they are used in GFF2::Record and GFF3::Record objects.
See below about GFF2 and/or GFF3 specific changes.
=== Bio::GFF::GFF2 and Bio::GFF::GFF3
Bio::GFF::GFF2::Record.new and Bio::GFF::GFF3::Record.new can also
take 9 arguments corresponding to the GFF columns, which helps to create
a Record object directly without formatted text.
Bio::GFF::GFF2::Record#start, #end, and #frame return Integer or nil,
and #score returns Float or nil, instead of String or nil.
The same changes are also made to Bio::GFF::GFF3::Record.
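
The effect of the new return types can be sketched without the library. The helpers below (integer_or_nil, float_or_nil, and parse_gff_columns, all hypothetical names that are not part of BioRuby) split one tab-separated GFF line and convert the numeric columns, assuming that an undefined column is written as "." as in the GFF specifications.

```ruby
# Converts a column to Integer, or nil when it is missing or ".".
# Base 10 is given explicitly so that a leading zero is not read as octal.
def integer_or_nil(text)
  (text.nil? || text == ".") ? nil : Integer(text, 10)
end

# Converts a column to Float, or nil when it is missing or ".".
def float_or_nil(text)
  (text.nil? || text == ".") ? nil : Float(text)
end

# Extracts the numeric columns (start, end, score, frame) of one GFF line.
def parse_gff_columns(line)
  cols = line.chomp.split("\t", 9)
  {
    :start => integer_or_nil(cols[3]),
    :end   => integer_or_nil(cols[4]),
    :score => float_or_nil(cols[5]),
    :frame => integer_or_nil(cols[7])
  }
end

line = "chr1\texample\texon\t100\t200\t.\t+\t0\tnote"
p parse_gff_columns(line)
```

Callers that compared these values with strings (for example, start == "100") would need to compare with numbers instead.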
Bio::GFF::GFF2::Record#attributes and Bio::GFF::GFF3::Record#attributes
are changed to return a nested Array containing [ tag, value ] pairs,
in order to support multiple attributes with the same tag name. If you
want to get a Hash, use the Record#attributes_to_hash method, though some
tag-value pairs with the same tag name may be lost. Note that
Bio::GFF::Record#attribute still returns a Hash for compatibility.
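
The reason for the nested Array can be seen with plain Ruby data. In the sketch below the pairs are hypothetical sample values, and values_for and pairs_to_hash are made-up helper names. The Hash conversion uses a simple last-value-wins rule, which is only one possible way to resolve duplicates and is not necessarily what Record#attributes_to_hash does.

```ruby
# Attribute pairs as a nested Array; the tag "Note" appears twice.
ATTRIBUTE_PAIRS = [
  ["ID", "gene1"],
  ["Note", "first note"],
  ["Note", "second note"]
].freeze

# All values for one tag, in their original order.
def values_for(pairs, tag)
  pairs.select { |t, _| t == tag }.map { |_, v| v }
end

# Naive Hash conversion: later pairs overwrite earlier ones with the same tag.
def pairs_to_hash(pairs)
  pairs.each_with_object({}) { |(tag, value), hash| hash[tag] = value }
end

p values_for(ATTRIBUTE_PAIRS, "Note")  # both notes are kept
p pairs_to_hash(ATTRIBUTE_PAIRS)       # only one "Note" survives
```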
New methods for getting, setting and manipulating attributes are added
to Bio::GFF::GFF2::Record and Bio::GFF::GFF3::Record classes:
attribute, get_attribute, get_attributes, set_attribute, replace_attributes,
add_attribute, delete_attribute, delete_attributes, sort_attributes_by_tag!.
It is recommended to use these methods instead of directly manipulating
the array returned by Record#attributes.
Bio::GFF::GFF2#to_s, Bio::GFF::GFF3#to_s, Bio::GFF::GFF2::Record#to_s,
and Bio::GFF::GFF3::Record#to_s are added to support output of
GFF2/GFF3 data.
=== Bio::GFF::GFF2
GFF2 attribute values are now automatically unescaped. In addition,
if the value of an attribute consists of two or more tokens delimited
by spaces, an object of the new class Bio::GFF::GFF2::Record::Value is
returned instead of a String. The new class Bio::GFF::GFF2::Record::Value
stores the parsed value of an attribute. If you really want to get the
unparsed string, Bio::GFF::GFF2::Record::Value#to_s can be used.
The metadata (lines beginning with "##") are parsed to
Bio::GFF::GFF2::MetaData objects and are stored to Bio::GFF::GFF2#metadata
as an array, except the "##gff-version" line. The "##gff-version" version
string is stored to the Bio::GFF::GFF2#gff_version as a string.
=== Bio::GFF::GFF3
Aliases of columns which are renamed in the GFF3 specification are added
to the Bio::GFF::GFF3::Record class: seqid (column 1; alias of "seqname"),
feature_type (column 3; alias of "feature"; in the GFF3 spec, it is
called "type", but because "type" is already used by Ruby, we use
"feature_type"), phase (column 8; formerly "frame"). Original names can
still be used because they are only aliases.
Sequences bundled within GFF3 after "##FASTA" are now supported
(Bio::GFF::GFF3#sequences).
GFF3 attribute keys and values are automatically unescaped. Each attribute
value is stored as a string, except for special attributes listed below:
* Bio::GFF::GFF3::Record::Target to store a "Target" attribute.
* Bio::GFF::GFF3::Record::Gap to store a "Gap" attribute.
The metadata (lines beginning with "##") are parsed to
Bio::GFF::GFF3::MetaData objects and stored to Bio::GFF::GFF3#metadata
as an array, except "##gff-version", "##sequence-region", "###",
and "##FASTA" lines.
* "##gff-version" version string is stored to Bio::GFF::GFF3#gff_version.
* "##sequence-region" lines are parsed to Bio::GFF::GFF3::SequenceRegion
objects and stored to Bio::GFF::GFF3#sequence_regions as an array.
* "###" lines are parsed to Bio::GFF::GFF3::RecordBoundary objects.
* "##FASTA" is regarded as the beginning of bundled sequences.
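
The dispatch described above can be sketched as a small classifier. This is a plain-Ruby illustration rather than the library's parser. classify_gff3_line and the returned symbols are hypothetical names chosen for the example.

```ruby
# Classifies one line of a GFF3 file by its leading directive.
# The special directives are tested before the generic "##" case.
def classify_gff3_line(line)
  text = line.chomp
  case text
  when /\A##gff-version(\s|\z)/     then :gff_version
  when /\A##sequence-region(\s|\z)/ then :sequence_region
  when /\A###\s*\z/                 then :record_boundary
  when /\A##FASTA\s*\z/             then :fasta_start
  when /\A##/                       then :metadata
  when /\A#/                        then :comment
  else                                   :record
  end
end

[
  "##gff-version 3",
  "##sequence-region chr1 1 1000",
  "##species example",
  "chr1\t.\tgene\t1\t10\t.\t+\t.\tID=gene1",
  "###",
  "##FASTA"
].each do |line|
  puts "#{classify_gff3_line(line)}: #{line.inspect}"
end
```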
=== Bio::Pathway
Bio::Pathway#cliquishness is changed to calculate cliquishness (clustering
coefficient) not only for undirected graphs but also for directed graphs.
In the Bio::Pathway#to_matrix, dump_matrix, dump_list, and
depth_first_search methods, Bio::Pathway#index is used to specify the
preferred order of nodes in a graph, to avoid dependence on the order of
objects in Hash#each (and each_key etc.).
=== Bio::SQL and BioSQL related classes
BioSQL support has been completely rewritten using ActiveRecord. See the
documentation in lib/bio/io/sql.rb, lib/bio/io/biosql, and lib/bio/db/biosql
for details of the changes and usage of the classes/modules.