File: properties.md

package info (click to toggle)
python-pubchempy 1.0.5-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 648 kB
  • sloc: python: 1,619; makefile: 13; sh: 5
file content (47 lines) | stat: -rw-r--r-- 2,482 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
(properties)=

# Properties

The {func}`~pubchempy.get_properties` function allows the retrieval of specific properties without having to deal with entire compound records. This is especially useful for retrieving the properties of a large number of compounds at once:

```python
p = pcp.get_properties("SMILES", "CC", "smiles", searchtype="superstructure")
```

Multiple properties may be specified in a list, or in a comma-separated string. The available properties are: MolecularFormula, MolecularWeight, ConnectivitySMILES, SMILES, InChI, InChIKey, IUPACName, XLogP, ExactMass, MonoisotopicMass, TPSA, Complexity, Charge, HBondDonorCount, HBondAcceptorCount, RotatableBondCount, HeavyAtomCount, IsotopeAtomCount, AtomStereoCount, DefinedAtomStereoCount, UndefinedAtomStereoCount, BondStereoCount, DefinedBondStereoCount, UndefinedBondStereoCount, CovalentUnitCount, Volume3D, XStericQuadrupole3D, YStericQuadrupole3D, ZStericQuadrupole3D, FeatureCount3D, FeatureAcceptorCount3D, FeatureDonorCount3D, FeatureAnionCount3D, FeatureCationCount3D, FeatureRingCount3D, FeatureHydrophobeCount3D, ConformerModelRMSD3D, EffectiveRotorCount3D, ConformerCount3D.

## Synonyms

Get a list of synonyms for a given input using the {func}`~pubchempy.get_synonyms` function:

```python
pcp.get_synonyms("Aspirin", "name")
pcp.get_synonyms("Aspirin", "name", "substance")
```

Inputs that match more than one SID/CID will have multiple, separate synonyms lists returned.

## CAS Registry Numbers

CAS Registry Numbers are not officially supported by PubChem, but they are often present in the synonyms associated with a compound. Therefore it is straightforward to retrieve them by filtering the synonyms to just those with the CAS Registry Number format:

```python
for result in pcp.get_synonyms("Aspirin", "name"):
    cid = result["CID"]
    cas_rns = []
    for syn in result.get("Synonym", []):
        match = re.match(r"(\d{2,7}-\d\d-\d)", syn)
        if match:
            cas_rns.append(match.group(1))
    print(f"CAS registry numbers for CID {cid}: {cas_rns}")
```

## Identifiers

There are three functions for getting a list of identifiers for a given input:

- {func}`~pubchempy.get_cids`
- {func}`~pubchempy.get_sids`
- {func}`~pubchempy.get_aids`

For example, passing a CID to {func}`~pubchempy.get_sids` will return a list of SIDs corresponding to the {class}`~pubchempy.Substance` records that were standardised and merged to produce the given {class}`~pubchempy.Compound`.