```python
from functools import partial
from rpy2.ipython import html
html.html_rdataframe=partial(html.html_rdataframe, table_class="docutils")
```
# Basic handling
The S4 system is one of the OOP systems in R.
Its largest use might be in the Bioconductor collection of packages
for bioinformatics and computational biology.
We use the Bioconductor package `Biobase`:
```python
from rpy2.robjects.packages import importr
biobase = importr('Biobase')
```
The R package contains constructors for the S4 classes it defines. They
are ordinary functions, and can be called as such through `rpy2`:
```python
eset = biobase.ExpressionSet()
```
The object `eset` is an R object of type `S4`:
```python
type(eset)
```
It has a class as well:
```python
tuple(eset.rclass)
```
In R, the attributes of an S4 object are also known as slots. The attribute
names can be listed with:
```python
tuple(eset.slotnames())
```
The attributes can also be accessed through the `rpy2` property `slots`.
`slots` is a mapping between attribute names (keys) and their associated
R objects (values). It can be used like a Python `dict`:
```python
# print keys
print(tuple(eset.slots.keys()))
# fetch `phenoData`
phdat = eset.slots['phenoData']
# phdat is an S4 object itself
pheno_dataf = phdat.slots['data']
```
# Mapping S4 classes to Python classes
Writing one's own Python class extending rpy2's `RS4` is straightforward.
That class can be used to wrap our `eset` object:
```python
from rpy2.robjects.methods import RS4

class ExpressionSet(RS4):
    pass

eset_myclass = ExpressionSet(eset)
```
## Custom conversion
The conversion system can also be made aware of our new class by customizing
the handling of S4 objects.
A simple implementation is a factory function that conditionally wraps
the object in our Python class `ExpressionSet`:
```python
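def rpy2py_s4(obj):
    # Wrap only the objects whose R class includes 'ExpressionSet'.
    if 'ExpressionSet' in obj.rclass:
        res = ExpressionSet(obj)
    else:
        # Anything else is returned unchanged.
        res = obj
    return res

# try it
rpy2py_s4(eset)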
```
That function can be registered with a `Converter`:
```python
from rpy2.robjects import default_converter
from rpy2.robjects.conversion import Converter
from rpy2.rinterface import SexpS4

my_converter = Converter('ExpressionSet-aware converter',
                         template=default_converter)
my_converter.rpy2py.register(SexpS4, rpy2py_s4)
```
When using that converter, the matching R objects are returned as
instances of our Python class `ExpressionSet`:
```python
with my_converter.context() as cv:
    eset = biobase.ExpressionSet()
    print(type(eset))
```
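The conditional wrapping done by `rpy2py_s4` can be extended to several R classes by replacing the `if`/`else` with a lookup table. The sketch below shows only that pattern, in pure Python so that it runs without an R session. `FakeS4`, `WrappedExpressionSet`, `RCLASS_TO_PYCLASS` and `wrap_by_rclass` are hypothetical names introduced for illustration and are not part of `rpy2`.

```python
# Pure-Python sketch of class-name-based wrapping; no R session is needed.
# All names below are hypothetical stand-ins, not rpy2 API.

class FakeS4:
    """Stand-in for an rpy2 S4 object: it only carries an `rclass` tuple."""
    def __init__(self, rclass):
        self.rclass = tuple(rclass)


class WrappedExpressionSet:
    """Stand-in for a Python class wrapping an S4 object."""
    def __init__(self, obj):
        self.obj = obj
        self.rclass = obj.rclass


# Map R class names to the Python classes that should wrap them.
RCLASS_TO_PYCLASS = {
    'ExpressionSet': WrappedExpressionSet,
}


def wrap_by_rclass(obj, registry=RCLASS_TO_PYCLASS):
    """Wrap `obj` in the first registered class matching one of its R classes.

    Objects without a registered class are returned unchanged.
    """
    for name in obj.rclass:
        cls = registry.get(name)
        if cls is not None:
            return cls(obj)
    return obj


known = wrap_by_rclass(FakeS4(['ExpressionSet']))
unknown = wrap_by_rclass(FakeS4(['AnnotatedDataFrame']))
print(type(known).__name__)
print(type(unknown).__name__)
```

With real `rpy2` objects, a function of this shape could presumably be registered for `SexpS4` in the same way as `rpy2py_s4`; that step is not exercised here.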
## Class attributes
The R attribute `assayData` can be accessed
through the accessor method `exprs()` in R.
We can make it a property in our Python class:
```python
class ExpressionSet(RS4):

    def _exprs_get(self):
        return self.slots['assayData']

    def _exprs_set(self, value):
        self.slots['assayData'] = value

    exprs = property(_exprs_get,
                     _exprs_set,
                     None,
                     "R attribute `exprs`")

eset_myclass = ExpressionSet(eset)
eset_myclass.exprs
```
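When several slots need the same treatment, the getter and setter pair can be produced by a small helper rather than written out for each slot. The following is a minimal sketch of that idea, assuming only that the object exposes a dict-like `slots` attribute. `slot_property` and `FakeExpressionSet` are hypothetical names, and the class is a plain-Python stand-in so that the example runs without R.

```python
def slot_property(slot_name, doc=None):
    """Build a property that reads and writes `self.slots[slot_name]`."""
    def getter(self):
        return self.slots[slot_name]

    def setter(self, value):
        self.slots[slot_name] = value

    return property(getter, setter, None, doc or "R slot `%s`" % slot_name)


class FakeExpressionSet:
    """Stand-in for an `RS4` subclass; `slots` is a plain dict here."""
    assay_data = slot_property('assayData')
    pheno_data = slot_property('phenoData')

    def __init__(self, **slots):
        self.slots = dict(slots)


obj = FakeExpressionSet(assayData='assay', phenoData='pheno')
obj.assay_data = 'new assay'
# The property and the underlying mapping stay in sync.
print(obj.assay_data, obj.slots['assayData'])
```

The same helper should work unchanged in a class extending `RS4`, since it relies only on item access through `slots`.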
## Methods
In R, S4 methods are generic functions served by a multiple dispatch system.
A natural way to expose an S4 method to Python is to use the
`multipledispatch` package:
```python
from multipledispatch import dispatch
from functools import partial

my_namespace = dict()
dispatch = partial(dispatch, namespace=my_namespace)

@dispatch(ExpressionSet)
def rowmedians(eset,
               na_rm=False):
    res = biobase.rowMedians(eset,
                             na_rm=na_rm)
    return res

res = rowmedians(eset_myclass)
```
The R method `rowMedians` is also defined for matrices, which we can expose
on the Python end as well:
```python
from rpy2.robjects.vectors import Matrix

@dispatch(Matrix)
def rowmedians(m,
               na_rm=False):
    res = biobase.rowMedians(m,
                             na_rm=na_rm)
    return res
```
While this works, note that both decorated Python functions call the same
R function, `rowMedians()` in the package `Biobase`. The actual dispatch
is still performed by R.
If this ever becomes a performance issue, the specific R function
selected by the dispatch can be prefetched and called explicitly in the
Python function. For example:
```python
from rpy2.robjects.methods import getmethod
from rpy2.robjects.vectors import StrVector

_rowmedians_matrix = getmethod(StrVector(["rowMedians"]),
                               signature=StrVector(["matrix"]))

@dispatch(Matrix)
def rowmedians(m,
               na_rm=False):
    res = _rowmedians_matrix(m,
                             na_rm=na_rm)
    return res
```
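In the `rowmedians` definitions above, the implementation is selected by the type of the first argument only. In that situation the standard library's `functools.singledispatch` is a possible alternative to `multipledispatch`. The sketch below illustrates that swapped-in technique with pure-Python stand-ins (`FakeMatrix` and `FakeExpressionSet` are hypothetical classes) and computes the medians in Python instead of calling R, so it shows the dispatch pattern and not the `Biobase` method.

```python
from functools import singledispatch
from statistics import median


class FakeMatrix:
    """Stand-in for an R matrix: a list of rows."""
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]


class FakeExpressionSet:
    """Stand-in for an ExpressionSet holding one matrix of values."""
    def __init__(self, exprs):
        self.exprs = exprs


@singledispatch
def rowmedians(obj):
    # Fallback used when no implementation is registered for the type.
    raise NotImplementedError("no rowmedians() for %s" % type(obj).__name__)


@rowmedians.register(FakeMatrix)
def _(m):
    return [median(row) for row in m.rows]


@rowmedians.register(FakeExpressionSet)
def _(eset):
    # Delegate to the matrix implementation.
    return rowmedians(eset.exprs)


m = FakeMatrix([[1, 2, 3], [4, 6, 8]])
print(rowmedians(m))
print(rowmedians(FakeExpressionSet(m)))
```

Dispatching on more than one argument type, as R's S4 system can, is beyond `singledispatch`; `multipledispatch` remains the closer match in that case.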