1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331
|
Queries
================================================================================
.. currentmodule:: ost.mol
OpenStructure includes a powerful query system that allows you to perform custom
selections on a molecular entity in a convenient way.
The Basics
--------------------------------------------------------------------------------
It is often convenient to highlight or focus certain parts of the structure.
OpenStructure includes a powerful query system that allows you to perform custom
selections in a convenient way. Selections are carried out mainly by calling the Select method made available by EntityHandle and :class:`EntityView` objects while providing a query string. Queries are written using a dedicated mini-language. For example, to select all arginine residues of a given structure, one would write:
.. code-block:: python
arginines = model.Select('rname=ARG')
A simple selection query (called a predicate) consists of a property (here,
`rname`), a comparison operator (here, `=`) and an argument (here, `ARG`). The
return value of a call to the :meth:`EntityHandle.Select` method is always an
:class:`EntityView`. The :class:`EntityView` always contains a full hierarchy of
elements, never standalone separated elements. In the above example, the
:class:`EntityView` called `arginines` will contain all chains from the
structure called `model` that have at least one arginine. In turn these chains
will contain all residues that have been identified as arginines. The residues
themselves will contain references to all of their atoms. Of course, queries are
not limited to selecting residues based on their type, it is also possible to
select atom by name:
.. code-block:: python
c_betas = model.Select('aname=CB')
As before, `c_betas` is an instance of an :class:`EntityView` object and contains
a full hierarchy. The main difference to the previous example is that the
selected residues do not contain a list of all of their atoms but only the
C-beta. These examples clarify why the name 'view' was chosen for this result of
a :meth:`~EntityHandle.Select` statement. It represents a reduced, restrained
way of looking at the original structure.
Both the selection statements that have been used so far take strings as their arguments. However, selection properties such as `rnum` (residue number), take numeric arguments. With numeric arguments it is possible to use identity operators ( `!=` and `=`). It is also possible to compare them using the `>`, `<`, `>=` and `<=` operators. For example, the 20 N-terminal residues of a protein can be selected with:
.. code-block:: python
n_term = model.Select('rnum<=20')
If you want to supply arguments with special characters they need to be put in
quotation marks (' or "). For instance, this is needed for any chain name
containing spaces as in:
.. code-block:: python
model.Select('cname=" "')
Almost any name can be quoted with :meth:`QueryQuoteName`.
Combining predicates
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Selection predicates can be combined with boolean operators. For example , you might want to select all C atoms with crystallographic occupancy higher than 50. These atoms must match the predicate `ele=C` in addition to the predicate `occ>50`. In the query language this can be written as:
.. code-block:: python
model.Select('ele=C and occ>50')
Compact forms are available for several selection statements. For example, to select all arginines and aspargines, one could use a statement like:
.. code-block:: python
arg_and_asn = model.Select('rname=ARG or rname=ASN')
However, this is rather cumbersome as it requires the word `rname` to be typed twice. Since the only difference between the two parts of the selection is the argument that follows the word `rname`, the statement can also be written in an abbreviated form:
.. code-block:: python
arg_and_asn = model.Select('rname=ARG,ASN')
Another example: to select residues with numbers in the range 130 to 200, one could use the following statement
.. code-block:: python
center = model.Select('rnum>=130 and rnum<=200')
or alternatively use the much nicer syntax:
.. code-block:: python
center = model.Select('rnum=130:200')
This last statement is completely equivalent to the previous one. This syntax
can be used when the selection statement requires a range of integer values
within a closed interval.
Distance Queries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The query
.. code-block:: python
around_center = model.Select('5 <> {0,0,0}')
selects all chains, residues and atoms that lie with 5 Å to the origin of the
reference system ({0,0,0}). The `<>` operator is called the 'within' operator.
Instead of a point, the within statements can also be used to return a view
containing all chains, residues and atoms within a radius of another selection
statement applied to the same entity. Square brackets are used to delimit the
inner query statement.
.. code-block:: python
around_hem = model.Select('5 <> [rname=HEM]')
model.Select('5 <> [rname=HEM and ele=C] and rname!=HEM')
Bonds and Queries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When an :class:`EntityView` is generated by a selection, it includes by default
only bonds for which both connected atoms satisfy the query statement. This can
be changed by passing the parameters `EXCLUSIVE_BONDS` or `NO_BONDS` when
calling the Select method. `EXCLUSIVE_BONDS` adds bonds to the
:class:`EntityView` when at least one of the two atoms falls within the boundary
of the selection. `NO_BONDS` suppresses the bond inclusion step completely.
Whole Residue Queries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the parameter `MATCH_RESIDUES` is passed when the Select method is called,
the resulting :class:`EntityView` will include whole residues for which at least
one atom satisfies the query. This means that if at least one atom in the
residue falls within the boundaries of the selection, all atoms of the residue
will be included in the View.
More Query Usage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The high level interface for queries are the Select methods of the
EntityHandle and :class:`EntityView` classes. By passing in a query string, a view
consisting of a subset of the elements is returned.
Queries also offer a second interface: `IsAtomSelected()`,
`IsResidueSelected()` and `IsChainSelected()` take an atom, residue or
chain as their argument and return true or false, depending on whether the
element fulfills the predicates.
.. _genprop-in-queries:
Generic Properties in Queries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The query language can also be used for numeric generic properties (i.e.
float and int), but the syntax is slightly different. To access any generic
properties, it needs to be specified that they are generic and at which level
they are defined. Therefore, all generic properties start with a `g`, followed by an `a`, `r` or `c` for atom, residue or chain level respectively.
.. code-block:: python
# set generic properties for atom, residue, chain
atom_handle.SetFloatProp("testpropatom", 5.2)
resid_handle.SetFloatProp("testpropres", 1.1)
chain_handle.SetIntProp("testpropchain", 10)
# query statements
sel_a = e.Select("gatestpropatom<=10.0")
sel_r = e.Select("grtestpropres=1.0")
sel_c = e.Select("gctestpropchain>5")
Since generic properties do not need to be defined for all parts of an entity
(e.g. it could be specified for one single :class:`AtomHandle`), the query
statement will throw an error unless you specify a default value in the query
statement which can be done using a ':' character:
.. code-block:: python
# if one or more atoms have no generic properties
sel = e.Select("gatestprop=5")
# this will throw an error
# you can specify a default value:
sel = e.Select("gatestprop:1.0=5")
# this will run through smoothly and use 1.0 as
# the default value for all atoms that do not
# have the generic property 'testprop'
Using this method, you will be warned if a generic property is not set for all
atoms, residues or chains unless you specify a default value. So, be careful
when you do.
Available Properties
--------------------------------------------------------------------------------
The following properties may be used in predicates. The type is given for each property.
Properties of Chains
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**cname/chain** (str) :attr:`Chain name<ChainHandle.name>`
Properties of Residues
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**rname** (str): :attr:`Residue name<ResidueHandle.name>`
**rnum** (int): :attr:`Residue number<ResidueHandle.number>`. Currently only the
numeric part is honored.
**rtype** (str): Residue type as given by the DSSP code (e.g. "H" for alpha
helix, "E" for extended), "helix" for all helix types, "ext" or "strand" for
all beta sheets or "coil" for any type of coil (see :class:`SecStructure`).
**rindex** (int): :attr:`Index<ResidueHandle.index>` of residue handle in chain.
This index is the same for views and handles.
**peptide** (bool): Whether the residue is :attr:`peptide linking
<ResidueHandle.peptide_linking>`.
**nucleotide** (bool): Whether the residue is :attr:`nucleotide linking
<ResidueHandle.nucleotide_linking>`.
**protein** (bool): Whether the residue is considered to be
:attr:`part of a connected protein <ResidueHandle.is_protein>`.
**rbfac** (float): average B (temperature) factor of residue
**ligand** (bool) Whether the residue is a :meth:`ligand
<ResidueHandle.is_ligand>`.
**water** (bool) Whether the residue is water.
Properties of Atoms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**aname** (str): :attr:`Atom name<AtomHandle.name>`
**ele** (str): :attr:`Atom element<AtomHandle.element>`
**occ** (float): :attr:`Atom occupancy<AtomHandle.occupancy>`
**abfac** (float): :attr:`Atom B-factor<AtomHandle.b_factor>`
**x** (float): :attr:`X<AtomHandle.pos>` coordinate of atom.
**y** (float): :attr:`Y<AtomHandle.pos>` coordinate of atom.
**z** (float): :attr:`Z<AtomHandle.pos>` coordinate of atom.
**aindex** (int): :attr:`Atom index<AtomHandle.index>`
**ishetatm** (bool): Whether the atom is a :attr:`heterogenous
atom<AtomHandle.is_hetatom>`.
**acharge** (float): :attr:`Atom charge<AtomHandle.charge>`
Query API documentation
--------------------------------------------------------------------------------
In the following, the interface of the query class is described. In general, you will not have to use this interface but will pass the query as string directly.
.. class:: Query(string='')
Create a new query from the given string. The constructor does not throw any
error in case the query contains syntax errors. Use :attr:`valid` to check
whether the query was valid.
.. attribute:: string
The string used to create the query.
:type: str
.. attribute:: valid
True, when the query could be compiled without syntax errors.
:type: bool
.. attribute:: error
If :attr:`valid` is false, this attribute contains the error message.
Otherwise it is set to an empty string
:type: str
.. method:: IsAtomSelected(atom)
Returns true, when the given atom handle fulfills the predicates, false if
not.
.. method:: IsChainSelected(chain)
Return true if at least one of the atomso of the chain matches the
predicates.
.. method:: IsResidueSelected(residue)
Returns true, when at least one atom of the residue matches the predicates.
.. class:: QueryFlag
Defines flags to change default behaviour of Select queries. Possible values:
* ``EXCLUSIVE_BONDS`` - adds bonds to the :class:`EntityView` when at least
one of the two bonded atoms was selected (by default both must be selected)
* ``NO_BONDS`` - do not include any bonds (by default bonds are included)
* ``MATCH_RESIDUES`` - include all atoms of a residue if any of its atoms is
selected (by default only selected atoms are included)
.. method:: QueryQuoteName(name)
Adds appropriate quotation marks to use *name* in a :class:`Query`. For
instance the following code snippet would generate a query string selecting
all chains from a list of chain names:
.. code-block:: python
query = "cname=" + ','.join(mol.QueryQuoteName(name) for name in names)
Note that there is some limited support of wild card symbols (* and ?) which
may have undesired effects in a query such as the code above.
:param name: Name to put in quotation marks
:type name: :class:`str`
:rtype: :class:`str`
:raises: :exc:`~exceptions.Exception` if *name* cannot be used in queries.
This happens if *name* includes both ' and " or if it ends with \\.
|