File: TODO

package info (click to toggle)
r-bioc-summarizedexperiment 1.4.0-2
  • links: PTS, VCS
  • area: main
  • in suites: stretch
  • size: 1,908 kB
  • sloc: sh: 3; makefile: 2
file content (175 lines) | stat: -rw-r--r-- 7,600 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
Tentative roadmap for the SummarizedExperiment to RangedSummarizedExperiment
transition
============================================================================

Where we are:

(1) We have a new class, RangedSummarizedExperiment (defined in
    the SummarizedExperiment package), that is *functionally*
    equivalent to the old SummarizedExperiment class (defined in
    the GenomicRanges package).
    In particular, it supports the rowRanges() setter and getter,
    for annotating the rows with a range-based object (only GRanges
    or GRangesList at the moment).
    Note that, just like the old SummarizedExperiment class, it
    also supports the mcols() setter and getter for additional
    annotations on the rows. More precisely, mcols(), like for
    any vector-like object defined in our core infrastructure,
    takes or returns a DataFrame. In the case of a
    RangedSummarizedExperiment object, this DataFrame must have
    one row per row in the object.

(2) We have a new class, SummarizedExperiment0, that is the direct
    parent of RangedSummarizedExperiment. The only difference with
    the latter is that the former doesn't support the rowRanges()
    getter. In other words, the rows can only be annotated via the
    mcols() accessor, which behaves exactly as in the case of a
    RangedSummarizedExperiment object.
    Note however that the rowRanges() setter DOES work on a
    SummarizedExperiment0 object but turns it into a
    RangedSummarizedExperiment object (thus violating the somewhat
    general expectation that a setter should behave as an endomorphism).

(3) SummarizedExperiment0 extends Vector:

                 Annotated    (defined in S4Vectors)
                     ^
                     |
                   Vector     (defined in S4Vectors)
                     ^
                     |
            SummarizedExperiment0
                     ^
                     |
         RangedSummarizedExperiment

    And therefore RangedSummarizedExperiment also extends Vector!
    Note that the old SummarizedExperiment class didn't. As a
    consequence, a noticeable difference between a
    RangedSummarizedExperiment object and an old SummarizedExperiment
    object is that the former has a length whereas the length was
    always 1 for the latter. More precisely, the length of a
    SummarizedExperiment0 (or RangedSummarizedExperiment) object is
    the same as its number of rows, and thus the same as the number
    of rows in its mcols().

(4) The old SummarizedExperiment class is deprecated in favor of
    the new RangedSummarizedExperiment class.

(5) Because SummarizedExperiment0 and RangedSummarizedExperiment
    now extend the Annotated class (via Vector), they inherit the
    'metadata' slot and accessor. Therefore the 'exptData' slot and
    accessor of the old SummarizedExperiment class are not needed
    anymore. Note that exptData() still works on SummarizedExperiment0
    but is deprecated in favor of metadata(). A noticeable difference
    though is that metadata() returns an ordinary list whereas
    exptData() was returning a SimpleList.

(6) There are coercion methods to switch back and forth between
    RangedSummarizedExperiment and SummarizedExperiment0. Doing
    SummarizedExperiment0 -> RangedSummarizedExperiment ->
    SummarizedExperiment0 is a no-op, but not the other way
    around.

(7) rowData() is defunct on old SummarizedExperiment objects.
    It could (and should) be brought back on SummarizedExperiment0
    objects as an alias for mcols(). This will restore the
    rowData/colData combo which is more intuitive than the
    mcols/colData combo. That cannot be done until the old
    defunct rowData() has completely disappeared from the
    GenomicRanges package though (see (e) below).

What should happen before the BioC 3.2 release:

(a) Remove test_SummarizedExperiment-rowData-methods.R from the
    GenomicRanges package (but first make sure that no testing
    is lost by checking that the unit tests in SummarizedExperiment
    cover everything that was in
    test_SummarizedExperiment-rowData-methods.R).

(b) Write the vignette in new SummarizedExperiment package.

What should happen in BioC 3.3 (preferably at the beginning of the
devel cycle i.e. soon after the BioC 3.2 release):

(c) Remove the old SummarizedExperiment class entirely from the
    GenomicRanges package. Right now it's deprecated only so that
    means we skip the 6-month defunct cycle which I think is OK.

(d) In the SummarizedExperiment package, rename SummarizedExperiment0
    -> SummarizedExperiment. This renaming should NOT break any
    serialized RangedSummarizedExperiment (or derived) objects!
    It would break serialized SummarizedExperiment0 instances but
    AFAIK we don't have any in the software or data-experiment
    repos (because the class is so new). However, this renaming
    will possibly impact some packages that depend on the
    SummarizedExperiment package and contain occurrences of
    the SummarizedExperiment0 string in their source. All these
    occurrences need to be replaced with SummarizedExperiment.
    If this blunt renaming turns out to be more damaging than
    anticipated we can always temporarily re-introduce the
    SummarizedExperiment0 class as an alias for SummarizedExperiment
    but hopefully that won't be necessary.

(e) Remove rowData() entirely from the GenomicRanges package and
    re-introduce it in the SummarizedExperiment package with a
    method for SummarizedExperiment0 that is an alias for mcols().

(f) Note that the user should also be able to specify the rowData
    component at construction-time with something like:

        # return a SummarizedExperiment0
        SummarizedExperiment(SimpleList(counts=counts), rowData=DF)

    Something Sonali was expecting to work in a recent email on
    devteam-bioc. Makes a lot of sense given that the user can
    already specify the rowRanges and/or colData components with
    something like:

        # return a RangedSummarizedExperiment
        SummarizedExperiment(SimpleList(counts=counts), rowRanges=GR)

    or

        # return a SummarizedExperiment0
        SummarizedExperiment(SimpleList(counts=counts), colData=DF)

    Question: should we support the case where the user specifies both
    rowData and rowRanges?

        # return a RangedSummarizedExperiment
        SummarizedExperiment(..., rowData=DF, rowRanges=GR)

    Maybe at least when GR has no mcols on it?

(g) Defunct exptData().

(h) Move any remaining generic from
      GenomicRanges/R/SummarizedExperiment-class.R
    to the appropriate place in SummarizedExperiment (either
    SummarizedExperiment0-class.R or RangedSummarizedExperiment-class.R)
    Except the value() generic and methods, which can go away.

(i) The "clone" methods defined in
      GenomicRanges/R/SummarizedExperiment-class.R
    can also go away.

(j) Move helper classes Assays, ShallowData, and ShallowSimpleListAssays
    from
      GenomicRanges/R/SummarizedExperiment-class.R
    to
      SummarizedExperiment/R/Assays-class.R

(k) Remove the SummarizedExperiment-class.R and
    SummarizedExperiment-rowData-methods.R files from GenomicRanges.

(l) Update the SummarizedExperiment vignette (including completing section
    on extending SummarizedExperiment).

I'm probably missing a few things that we'll discover along the road.
I'll take care of (a) in September and can take care of (c)-(l) after the
BioC 3.2 release. I'm happy to leave (b) (SummarizedExperiment vignette) to
someone else...

H.P. - August 2015