File: Spork.slides

package info (click to toggle)
libbio-chado-schema-perl 0.10010-1
  • links: PTS, VCS
  • area: main
  • in suites: wheezy
  • size: 1,800 kB
  • sloc: perl: 15,043; makefile: 7; sql: 3
file content (257 lines) | stat: -rw-r--r-- 6,238 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
----
presentation_topic: bio-chado-schema
presentation_title: Bio::Chado::Schema
presentation_subtitle: a standard Perl ORM layer for Chado
presentation_place: NESCent, Durham, NC
presentation_date: March 5-7, 2011
----
= What is DBIx::Class?

* Object-relational mapping framework for Perl 5
* is now the de-facto standard
* very powerful query builder

----
= What is DBIx::Class?

* query building (the magic of chainable ResultSets)
* cross-database deployment (using |SQL::Translator| in the backend)
* lots and lots of plugins availabe (dates, testing, ...)

see http://search.cpan.org/dist/DBIx-Class/lib/DBIx/Class/Manual.pod

----
= Data Model

* a set of classes
* encapsulate the underlying storage
* providing a nicer, maintainable API for working with the data

This isn't really middleware, it's a code library.

----
= And therefore, Bio::Chado::Schema 

== As of 0.08: 205 Perl 5 classes
* every table and view, every module
* Chado docs mirrored in the POD

http://search.cpan.org/dist/Bio-Chado-Schema

+  ( So why do we need this? )

----
= Chado needs encapsulation

== Chado's design makes for:
* complex queries
* steep learning curve

----
= Chado needs encapsulation

== complex queries
* generating
* storing
* automating

----
= Chado needs encapsulation

== steep learning curve
* codifying best practices
* unified, high-level documentation
* can help with database administration

----
= BCS Usage

* open the schema.  ( actual DB connection is deferred )
+  my $chado = Bio::Chado::Schema->connect( 'dbi:Pg:...', $user, $pass );

+* get a ResultSet object representing a set of Rows
+  $chado->resultset('Sequence::Feature');
+        ->search({ name => 'something' });
+        ->search_related('organism')

+* get an actual data object
+  $chado->resultset('Sequence::Feature')
        ->find( 232432 );

----
= BCS Usage

+* iterate over the actual data in a set
+  my $stuff = $chado->blahblahblah;
  while( my $row = $stuff->next ) {
    # do something
  }

+* traverse relationships
+ say $some_feature->type->name;

----
= BCS Usage: Joined Select

  # get features via the potato organism, also joining in the cvterms table
+  my $potato_bacs =
       $chado->resultset('Organism::Organism')
+             ->search({ species => 'Solanum tuberosum' })
+             ->search_related( 'features',
                               { 'type.name' => 'BAC_clone'},
                               { 'join' => 'type' },
                             );

----
= BCS Usage: Joined Select

  # the equivalent bare SQL
  my $potato_bacs = $dbh->selectall_arrayref( <<EOS, undef, 'Solanum tuberosum', 'BAC_clone');
  SELECT features.feature_id
       , features.dbxref_id
       , features.organism_id
       , features.name
       , features.uniquename
       , features.residues
       , features.seqlen
       , features.md5checksum
       , features.type_id
       , features.is_analysis
       , features.is_obsolete
       , features.timeaccessioned
       , features.timelastmodified
  FROM organism me
  LEFT JOIN feature features
         ON features.organism_id = me.organism_id
  JOIN cvterm type
         ON type.cvterm_id = features.type_id
  WHERE type.name = 'BAC_clone' AND species = 'Solanum tuberosum'
  EOS

----
= BCS Usage: Loading

  $chado->resultset( 'Cv::Cv' )
        ->find_or_create({ name => 'My Fake Ontology' })
        ->create_related(  'cvterm',
                           { name => 'MyFakeTerm' });

makes the SQL:

  SELECT me.cv_id
       , me.name
       , me.definition
  FROM cv me
  WHERE ( me.name = 'my fake ontology' )

  INSERT INTO cv ( name )
          VALUES ( ?    )

----
= BCS Usage: Transactions

  $chado->txn_do(sub {

      $chado->resultset('Cv::Cv')
            ->find_or_create({ name => 'My Fake Ontology' })
            ->create_related( 'cvterm', { name => 'MyFakeTerm' } );

      $chado->do_all_kinds_of_other_stuff;
  });

----
= The Real Advantages of DBIC

* easier to manipulate and assemble queries
* Don't Repeat Yourself

----
= The Real Advantages of DBIC

* it's all objects.  you can delegate to them, pass them around, etc.
* HOWEVER:
+** usually you don't want to subclass them
** but, see |DBIx::Class::Manual::Cookbook|

----
= The Real Advantages of DBIC

* complex joined queries (Chado queries) are very easy and compact

----
= The Real Advantages of DBIC

* SQL syntax errors are much more difficult to make

----
= Using DBIC with your own tables

* use |DBIx::Class::Schema::Loader| to dump a whole set
* make your own definitions

* your table: other_thing, foreign key feature_id to Chado feature table

  package My::DBIC:::Layer::OtherThing;
  use base 'DBIx::Class::Core';

  __PACKAGE__->table('other_thing');
  __PACKAGE__->add_columns(
    'other_thing_id' => { ... },
    'name'           => { ... },
    'definition'     => { ... },
    'feature_id'     => { ... },
  );
  __PACKAGE__->set_primary_key('other_thing_id');
  __PACKAGE__->add_unique_constraint('ot_c1', ['name']);

  __PACKAGE__->belongs_to(
    'feature',
    'Bio::Chado::Schema::Sequence::Feature',
    { 'foreign.feature_id' => 'self.feature_id' },
  );

----
= "Duct tape" your own schema onto BCS

* make an accessor 'other_things' that ties your own DBIC class to BCS

+  Bio::Chado::Schema::Sequence::Feature->has_many(
    'other_things',
    'My::DBIC::Layer::OtherThing',
    { 'foreign.feature_id' => 'self.feature_id' },
  );

+* add it to the BCS schema dynamically
+  Bio::Chado::Schema->register_source('OtherThing', 'My::DBIC::Layer::OtherThing');

+* use it with the rest
+  $chado->resultset('Sequence::Feature')->other_things;

----
= Making a composite schema

  package My::Schema;
  # load our own classes
  __PACKAGE__->load_namespaces;
  __PACKAGE__->load_namespaces(
      result_namespace    => '+Bio::Chado::Schema::Result',
      resultset_namespace => '+Bio::Chado::Schema::ResultSet',
    );

----
= Development Roadmap

* continue to fill in many_to_many rels as needed
* continue to add more (generic!) convenience methods
** automate more common querying and loading patterns

----
= Acknowledgments

* Jonathan Leto (SGN)
* Aure Bombarely (SGN)
* Naama Menda (SGN)
* Siddhartha Basu (dictybase)
* Lukas Mueller (SGN)

----