File: README

package info (click to toggle)
rdkit 202009.4-1
  • links: PTS, VCS
  • area: main
  • in suites: bullseye
  • size: 129,624 kB
  • sloc: cpp: 288,030; python: 75,571; java: 6,999; ansic: 5,481; sql: 1,968; yacc: 1,842; lex: 1,254; makefile: 572; javascript: 461; xml: 229; fortran: 183; sh: 134; cs: 93
file content (49 lines) | stat: -rw-r--r-- 2,093 bytes parent folder | download | duplicates (7)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
Contributions from Markus Kossner

Descriptions:

#-------------------------------
Frames.py:

Basically what the code does is a walk from every Atom that is the
neighbor of a ring away from that ring.  If we can not reach another
ring on the way, the atom chain will be deleted. However, as soon as
we reach another ring, the chain will become a connecting chain in the
scaffold. Ring Atoms [found by GetRingInfo()] as well as connecting
non-ring chains are stored in a Chem.EditableMol-Object representing
the Scaffold of the molecule.

There is a function, that can return this scaffold in two forms that
differ in the interpretation of what a 'Scaffold' is:

- with mode='RedScaff' a Scaffold will be the pure molecular core
  ignoring bond or atom types!  this is done by running over the
  Scaffold (...the editable mol from above) and changing any atom to
  carbon and any bond to a single bond

- with mode='Scaff' a Scaffold will be sensitive of element and bond
  type.  With mode='Scaff' Cyclohexyl-, phenyl or pyridyl- mojeties
  will be different ring species.  mode='RedScaff' will interpret them
  as six-membered rings and make all of them become a cyclohexane in
  the scaffold representation.

Usage of the script:
- simply supply a .sdf file after the script name.
- The program will generate a .sdf file called frames.sdf containing the
cluster ID of the frame it belongs to as a property tag.
- Additionally the raw frames are output to 'frames.smi'. This is a
simple enumeration of the frames detected.

Please note, that the data is preliminary stored in memory. This
means, that very large input files may cause the consumption of large
amounts of memory at runtime!

#-------------------------------
BaseFeatures_DIP2_NoMicrospecies.fdef

After I talked to Nik at the Sheffield Conference about the difficulty
of defining Positive/Negative ionizable features by SMARTS he asked me
to to send my approach towards the problem.  Attached you will find
the .fdef file I use. Hopefully there is something in there that might
help in setting up robust feature definitions.