Structural Data Advanced
------------------------

This section covers more advanced structural entity handling tasks.

Adding an entity when it exists
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A PyCogent ``Entity`` is a subclass of a dictionary. Adding children is essentially the same as updating a dictionary, with a minimal amount of book-keeping. It is equivalent to the following:

::

    child.setParent(parent)
    child_id = child.getId()
    parent[child_id] = child
    parent.setModified(True, False)

This points the child entity to its new parent (line 1) and adds the child to the parent dictionary (line 3). The call to ``setModified`` notifies all parents of the parent of the modification. A dictionary has unique keys, so the children of a parent have unique ids. If you add a child whose id clashes with that of an existing child, the new child replaces the previous one, just as assigning to an existing key replaces the value in a dictionary.
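The overwrite-on-clash behaviour can be illustrated without PyCogent. The sketch below uses a hypothetical ``ToyEntity`` class (a plain ``dict`` subclass with made-up ``short_id``, ``parent`` and ``modified`` attributes and an ``add_child`` method); it only mimics the book-keeping described above and is not PyCogent's implementation.

```python
# Hypothetical stand-in for an entity; NOT PyCogent's Entity class.
class ToyEntity(dict):
    def __init__(self, short_id):
        dict.__init__(self)
        self.short_id = short_id
        self.parent = None
        self.modified = False

    def add_child(self, child):
        child.parent = self             # point the child to its new parent
        self[child.short_id] = child    # an id clash overwrites the old child
        # flag this entity and all of its ancestors as modified
        node = self
        while node is not None:
            node.modified = True
            node = node.parent


chain = ToyEntity(('A',))
first = ToyEntity((('GLY', 209, ' '),))
second = ToyEntity((('GLY', 209, ' '),))    # same id as `first`

chain.add_child(first)
chain.add_child(second)

# Only one child remains under that id, and it is the one added last.
replaced = chain[(('GLY', 209, ' '),)] is second
n_children = len(chain)
```

After both calls ``chain`` holds a single child, so anything stored on ``first`` is no longer reachable through the parent.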

Why are the short ids inside a tuple?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Short ids are the components of a long id, and the long id is a tuple. Because each short id is itself wrapped in a tuple, short ids can be concatenated to form a long id. This would not be possible if short ids were not within a tuple initially. For example:

.. doctest::

    >>> (0,) + ('A',) + (('GLY', 209, ' '),) + (('C', ' '),)
    (0, 'A', ('GLY', 209, ' '), ('C', ' '))

The output here is a valid long id of an atom for use in ``AtomHolder`` instances.
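The following plain-Python sketch shows what goes wrong without the wrapping tuple. It uses no PyCogent code; the variable names are illustrative only.

```python
short_ids = [0, 'A', ('GLY', 209, ' '), ('C', ' ')]

# Bare short ids of different types cannot be added together.
try:
    0 + 'A'
    bare_concatenation_works = True
except TypeError:
    bare_concatenation_works = False

# Unwrapped tuple ids do concatenate, but they are flattened and the
# boundary between the residue id and the atom id is lost.
flattened = ('GLY', 209, ' ') + ('C', ' ')

# Wrapped in one-element tuples, each short id stays a single item.
long_id = ()
for short_id in short_ids:
    long_id += (short_id,)
```

Here ``long_id`` has four items, one per hierarchy level, whereas ``flattened`` has five unrelated items.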

Select children of a ``MultiEntity`` instance by a feature
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Selection is a common task and ``PyCogent`` has a unified syntax for this via the ``selectChildren`` method. The idea behind it is as follows:

    #. gather "requested data" from all children.
    #. compare each returned child value to the template "value" using the
       "operator"
    #. return children for which the comparison is ``True``

The signature of this method is ``selectChildren("value", "operator", "requested data")``. In the first step all children return the "requested data". The request might be an attribute, a value corresponding to a key in the ``parent.xtra`` dictionary, or any other query supported by the ``getData`` method.
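The three steps can be sketched in plain Python. The function ``select_children`` and the class ``ToyResidue`` below are hypothetical stand-ins written for this illustration; they assume the requested data is a simple attribute and are not the PyCogent implementation.

```python
import operator


class ToyResidue(object):
    """Hypothetical child object with a single `name` attribute."""
    def __init__(self, name):
        self.name = name


def select_children(children, value, op_name, requested):
    """Return the children whose requested attribute passes the comparison."""
    compare = getattr(operator, op_name)     # e.g. 'eq' -> operator.eq
    selected = {}
    for child_id, child in children.items():
        got = getattr(child, requested)      # step 1: gather requested data
        if compare(value, got):              # step 2: compare with the value
            selected[child_id] = child       # step 3: keep the matches
    return selected


children = {
    (('ALA', 1, ' '),): ToyResidue('ALA'),
    (('GLY', 2, ' '),): ToyResidue('GLY'),
    (('ALA', 3, ' '),): ToyResidue('ALA'),
}
alanines = select_children(children, 'ALA', 'eq', 'name')
alanine_ids = sorted(alanines)
```

The result is keyed by the same ids as the input, so a selection can be used wherever a dictionary of children is expected.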

.. doctest::

    >>> from cogent.parse.pdb import PDBParser
    >>> pdb_fh = open('data/1HQF.pdb')
    >>> pdb_structure = PDBParser(pdb_fh)
    >>> model = pdb_structure[(0,)]
    >>> chainA = model[('A',)]

Example 1: select all alanines from a chain.
""""""""""""""""""""""""""""""""""""""""""""

.. doctest::

    >>> alanines = chainA.selectChildren('ALA', 'eq', 'name')

This requests the "name" attribute from all children in chain A and uses the "eq" (equals) operator to compare this to "ALA". It returns the residues which have this name.

Example 2: select all residues that are not amino acids or nucleic acids.
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

.. doctest::

    >>> selection = chainA.selectChildren('H', 'eq', 'h_flag')

This requests the "h_flag", i.e. the hetero-atom flag, from all residues. For amino acids and nucleic acids this should be "", and for all other molecular entities "H", so the function returns only ligands, waters etc.

Example 3: What if only some children have data to return?
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

First we pick out a residue and modify its xtra dictionary to contain some
custom data. We mark ``lys39`` as a catalytic residue.

.. doctest::

    >>> lys39 = chainA[(('LYS', 39, ' '),)]
    >>> lys39.xtra['CATALYTIC'] = True

All other residues do not have a value corresponding to the "CATALYTIC" key, but we can still select all "CATALYTIC" residues in chain A.

.. doctest::

    >>> catalytic = chainA.selectChildren(True, 'eq', 'CATALYTIC', xtra=True)
    >>> catalytic
    {(('LYS', 39, ' '),): <Residue LYS resseq=39 icode= >}

The difference is that we have requested a value from the "xtra" dictionary instead of a hypothetical "CATALYTIC" attribute.
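One way to picture a selection over partly missing data is sketched below. The names ``ToyResidue`` and ``select_by_xtra`` are hypothetical, and the rule that children lacking the key are skipped is an assumption made for this illustration; it is not taken from the PyCogent source.

```python
class ToyResidue(object):
    """Hypothetical child object carrying an `xtra` dictionary."""
    def __init__(self, name):
        self.name = name
        self.xtra = {}


_MISSING = object()   # sentinel, so that a stored None still counts as data


def select_by_xtra(children, value, key):
    """Return children whose xtra[key] equals value."""
    selected = {}
    for child_id, child in children.items():
        got = child.xtra.get(key, _MISSING)
        if got is _MISSING:
            continue                  # assumption: no data, no match
        if got == value:
            selected[child_id] = child
    return selected


children = {
    (('LYS', 39, ' '),): ToyResidue('LYS'),
    (('GLY', 40, ' '),): ToyResidue('GLY'),
}
children[(('LYS', 39, ' '),)].xtra['CATALYTIC'] = True

catalytic = select_by_xtra(children, True, 'CATALYTIC')
catalytic_ids = list(catalytic)
```

Only the residue that was explicitly marked ends up in ``catalytic``.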

Comparison "operators" supported for the ``selectChildren`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The "operator" can be either a string or a custom function. A string must correspond to a function from the ``operator`` module of the Python standard library. The list of currently supported operators is: ``gt``, ``ge``, ``lt``, ``le``, ``eq``, ``ne``, ``or_``, ``and_``, ``contains``, ``is_``, ``is_not``. A custom function must have the signature ``operator(value, got)``, where "got" is the value returned by the child and "value" is what it is compared to.
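The sketch below shows how a string name can be resolved to a standard-library function and how a custom function with the ``(value, got)`` signature fits the same slot. ``resolve_operator`` and ``within_five`` are hypothetical helpers; that the string operators receive their arguments in the same ``(value, got)`` order is an assumption of this sketch.

```python
import operator

SUPPORTED = ('gt', 'ge', 'lt', 'le', 'eq', 'ne',
             'or_', 'and_', 'contains', 'is_', 'is_not')


def resolve_operator(op):
    """Turn a supported string into a function; pass callables through."""
    if callable(op):
        return op
    if op not in SUPPORTED:
        raise ValueError('unsupported operator: %r' % (op,))
    return getattr(operator, op)


def within_five(value, got):
    """Custom operator: True if `got` lies within 5 of `value`."""
    return abs(got - value) <= 5


# Under the assumed argument order, 'gt' means "value > got" ...
value_is_greater = resolve_operator('gt')(40, 39)
# ... and 'contains' means "got in value".
name_in_list = resolve_operator('contains')(['ALA', 'GLY'], 'GLY')
# A custom function is used unchanged.
near = resolve_operator(within_five)(100, 103)
far = resolve_operator(within_five)(100, 110)
```

Note the argument order: with "value" first, ``contains`` is convenient for testing a child's data against a list of accepted values.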

Copying or serializing an entity
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

PyCogent ``MultiEntity`` and ``Entity`` instances are Python objects. They can be pickled, and they support the copy and deepcopy protocols.

.. doctest::

    >>> import cPickle
    >>> pickledA = cPickle.dumps(chainA)
    >>> unpickledA = cPickle.loads(pickledA)
    >>> unpickledA is chainA
    False
    >>> unpickledA == chainA
    True

In the above we have pickled and unpickled a ``MultiEntity`` instance. This results in a new instance, "unpickledA", which compares equal to "chainA" but is a different object (the identity test fails).

If you only need a copy of an ``Entity`` instance and do not need to share entities between Python sessions, you can use the functions from the ``copy`` module. Please note that copies and deep copies are the same.

.. doctest::

    >>> from copy import copy, deepcopy
    >>> otherA = copy(chainA)
    >>> otherA is chainA
    False
    >>> otherA == chainA
    True
    >>> cys119 = chainA[(('CYS', 119, ' '),)]
    >>> cys119_other = otherA[(('CYS', 119, ' '),)]
    >>> cys119 is cys119_other
    False
    >>> cys119 == cys119_other
    True
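For contrast, a plain dictionary copy shares its values with the original. The sketch below shows one way a ``dict`` subclass could make ``copy`` behave like ``deepcopy``, by defining ``__copy__``. The ``ToyEntity`` class is hypothetical, and whether PyCogent achieves the behaviour shown above in this way is not stated here.

```python
from copy import copy, deepcopy


class ToyEntity(dict):
    """Hypothetical entity whose shallow copy is really a deep copy."""
    def __init__(self, name):
        dict.__init__(self)
        self.name = name

    def __copy__(self):
        # delegate to deepcopy so that children are copied as well
        return deepcopy(self)


key = (('CYS', 119, ' '),)
chain = ToyEntity('A')
chain[key] = ToyEntity('CYS')

other = copy(chain)
is_same_object = other is chain              # a new container ...
is_equal = other == chain                    # ... with equal contents ...
child_is_shared = other[key] is chain[key]   # ... and copied children

# A plain dict copy, in contrast, shares the child objects.
plain = dict(chain)
plain_child_is_shared = plain[key] is chain[key]
```

With this design a modification made to a child of ``other`` never leaks back into ``chain``.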