//
// Copyright (C) 2003-2022 Greg Landrum and other RDKit contributors
//
// @@ All Rights Reserved @@
// This file is part of the RDKit.
// The contents are covered by the terms of the BSD license
// which is included in the file license.txt, found at the root
// of the RDKit source tree.
//
/*! \file Subgraphs.h

   \brief functionality for finding subgraphs and paths in molecules

   Difference between _subgraphs_ and _paths_ :
      Subgraphs are potentially branched, whereas paths (in our
      terminology at least) cannot be.  So, the following graph:
\verbatim
            C--0--C--1--C--3--C
                  |
                  2
                  |
                  C
\endverbatim
      has 3 _subgraphs_ of length 3: (0,1,2),(0,1,3),(2,1,3)
      but only 2 _paths_ of length 3: (0,1,3),(2,1,3)
*/
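/*
  Illustration only, not part of the RDKit API: a minimal, self-contained
  sketch that reproduces the subgraph/path counts quoted above by brute force.
  The atom numbering in kBonds and the helper names (sharesAtom, isConnected,
  isUnbranched, connectedBondSets) are assumptions made for this sketch; the
  library functions declared below are expected to be far more efficient. The
  degree test in isUnbranched is sufficient only because the toy graph has no
  rings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy graph from the diagram: bond index -> (atom, atom).
// Atom indices are hypothetical; only the bond indices come from the diagram.
static const std::vector<std::pair<int, int>> kBonds = {
    {0, 1},  // bond 0
    {1, 2},  // bond 1
    {1, 4},  // bond 2 (the branch)
    {2, 3},  // bond 3
};

// True if bonds a and b have an atom in common.
bool sharesAtom(int a, int b) {
  assert(a >= 0 && b >= 0);
  const std::pair<int, int> &x = kBonds[a];
  const std::pair<int, int> &y = kBonds[b];
  return x.first == y.first || x.first == y.second ||
         x.second == y.first || x.second == y.second;
}

// True if the chosen bonds form a single connected piece.
bool isConnected(const std::vector<int> &bonds) {
  if (bonds.empty()) return false;
  std::vector<int> seen{bonds.front()};
  bool grew = true;
  while (grew) {
    grew = false;
    for (int b : bonds) {
      bool inSeen = std::find(seen.begin(), seen.end(), b) != seen.end();
      bool touches = std::any_of(seen.begin(), seen.end(),
                                 [b](int s) { return sharesAtom(b, s); });
      if (!inSeen && touches) {
        seen.push_back(b);
        grew = true;
      }
    }
  }
  return seen.size() == bonds.size();
}

// True if no atom carries more than two of the chosen bonds.
bool isUnbranched(const std::vector<int> &bonds) {
  std::map<int, int> degree;
  for (int b : bonds) {
    ++degree[kBonds[b].first];
    ++degree[kBonds[b].second];
  }
  return std::all_of(
      degree.begin(), degree.end(),
      [](const std::pair<const int, int> &d) { return d.second <= 2; });
}

// All connected sets of `size` bonds, found by trying every bond subset.
std::vector<std::vector<int>> connectedBondSets(std::size_t size) {
  std::vector<std::vector<int>> res;
  for (unsigned mask = 0; mask < (1u << kBonds.size()); ++mask) {
    std::vector<int> chosen;
    for (std::size_t b = 0; b < kBonds.size(); ++b) {
      if (mask & (1u << b)) chosen.push_back(static_cast<int>(b));
    }
    if (chosen.size() == size && isConnected(chosen)) res.push_back(chosen);
  }
  return res;
}
```

  Calling connectedBondSets(3) on this toy graph yields the three bond sets
  listed above; filtering them with isUnbranched() drops (0,1,2), whose three
  bonds all meet at one atom, and leaves the two paths.
*/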
#include <RDGeneral/export.h>
#ifndef RD_SUBGRAPHS_H
#define RD_SUBGRAPHS_H
#include <vector>
#include <list>
#include <map>
#include <unordered_map>
namespace RDKit {
class ROMol;
// NOTE: before replacing the defn of PATH_TYPE: be aware that
// we do occasionally use reverse iterators on these things, so
// replacing with a slist would probably be a bad idea.
typedef std::vector<int> PATH_TYPE;
typedef std::list<PATH_TYPE> PATH_LIST;
typedef PATH_LIST::const_iterator PATH_LIST_CI;
typedef std::map<int, PATH_LIST> INT_PATH_LIST_MAP;
typedef INT_PATH_LIST_MAP::const_iterator INT_PATH_LIST_MAP_CI;
typedef INT_PATH_LIST_MAP::iterator INT_PATH_LIST_MAP_I;
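/*
  Illustration only: a small sketch of how the container typedefs above nest,
  and why PATH_TYPE needs reverse iterators. The typedefs are repeated so the
  sketch stands alone; reversedPath and totalPaths are hypothetical helpers
  that do not exist in the RDKit.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <vector>

// Same shapes as the typedefs above.
typedef std::vector<int> PATH_TYPE;
typedef std::list<PATH_TYPE> PATH_LIST;
typedef std::map<int, PATH_LIST> INT_PATH_LIST_MAP;

// Hypothetical helper: renders one path back to front using the
// reverse iterators that std::vector provides.
std::string reversedPath(const PATH_TYPE &path) {
  std::string out;
  for (PATH_TYPE::const_reverse_iterator it = path.rbegin();
       it != path.rend(); ++it) {
    if (!out.empty()) out += ' ';
    out += std::to_string(*it);
  }
  return out;
}

// Hypothetical helper: counts the paths stored under every size key.
std::size_t totalPaths(const INT_PATH_LIST_MAP &bySize) {
  std::size_t n = 0;
  for (INT_PATH_LIST_MAP::const_iterator it = bySize.begin();
       it != bySize.end(); ++it) {
    n += it->second.size();
  }
  return n;
}
```

  A map filled by hand with one path under key 2 and two paths under key 3
  would give totalPaths() == 3; each value is an ordinary std::list, so the
  usual list operations apply to it.
*/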
// --- --- --- --- --- --- --- --- --- --- --- --- ---
//
//
// --- --- --- --- --- --- --- --- --- --- --- --- ---
//! \brief find all bond subgraphs in a range of sizes
/*!
* \param mol - the molecule to be considered
* \param lowerLen - the minimum subgraph size to find
* \param upperLen - the maximum subgraph size to find
* \param useHs - if set, hydrogens in the graph will be considered
* eligible to be in paths. NOTE: this will not add
* Hs to the graph.
* \param rootedAtAtom - if non-negative, only subgraphs that start at
* this atom will be returned.
*
* The result is a map from subgraph size -> list of paths
* (i.e. list of list of bond indices)
*/
RDKIT_SUBGRAPHS_EXPORT INT_PATH_LIST_MAP findAllSubgraphsOfLengthsMtoN(
const ROMol &mol, unsigned int lowerLen, unsigned int upperLen,
bool useHs = false, int rootedAtAtom = -1);
//! \brief find all bond subgraphs of a particular size
/*!
* \param mol - the molecule to be considered
* \param targetLen - the length of the subgraphs to be returned
* \param useHs - if set, hydrogens in the graph will be considered
* eligible to be in paths. NOTE: this will not add
* Hs to the graph.
* \param rootedAtAtom - if non-negative, only subgraphs that start at
* this atom will be returned.
*
*
* The result is a list of paths (i.e. list of list of bond indices)
*/
RDKIT_SUBGRAPHS_EXPORT PATH_LIST
findAllSubgraphsOfLengthN(const ROMol &mol, unsigned int targetLen,
bool useHs = false, int rootedAtAtom = -1);
//! \brief find unique bond subgraphs of a particular size
/*!
* \param mol - the molecule to be considered
* \param targetLen - the length of the subgraphs to be returned
* \param useHs - if set, hydrogens in the graph will be considered
* eligible to be in paths. NOTE: this will not add
* Hs to the graph.
* \param useBO - if set, bond orders will be considered when uniquifying
* the paths
* \param rootedAtAtom - if non-negative, only subgraphs that start at
* this atom will be returned.
*
* The result is a list of paths (i.e. list of list of bond indices)
*/
RDKIT_SUBGRAPHS_EXPORT PATH_LIST findUniqueSubgraphsOfLengthN(
const ROMol &mol, unsigned int targetLen, bool useHs = false,
bool useBO = true, int rootedAtAtom = -1);
//! \brief find all paths of a particular size
/*!
* \param mol - the molecule to be considered
* \param targetLen - the length of the paths to be returned
* \param useBonds - if set, the path indices will be bond indices,
* not atom indices
* \param useHs - if set, hydrogens in the graph will be considered
* eligible to be in paths. NOTE: this will not add
* Hs to the graph.
* \param rootedAtAtom - if non-negative, only paths that start at
* this atom will be returned.
* \param onlyShortestPaths - if set, only paths that are no longer than the
* shortest path between their begin and end atoms will be
* included in the results
*
* The result is a list of paths (i.e. list of list of bond indices, or of
* atom indices if useBonds is false)
*/
RDKIT_SUBGRAPHS_EXPORT PATH_LIST findAllPathsOfLengthN(
const ROMol &mol, unsigned int targetLen, bool useBonds = true,
bool useHs = false, int rootedAtAtom = -1, bool onlyShortestPaths = false);
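//! \brief find all paths in a range of sizes
/*!
* \param mol - the molecule to be considered
* \param lowerLen - the minimum path size to find
* \param upperLen - the maximum path size to find
* \param useBonds,useHs,rootedAtAtom,onlyShortestPaths - as for
* findAllPathsOfLengthN()
*
* The result is a map from path size -> list of paths
*/
RDKIT_SUBGRAPHS_EXPORT INT_PATH_LIST_MAP findAllPathsOfLengthsMtoN(
const ROMol &mol, unsigned int lowerLen, unsigned int upperLen,
bool useBonds = true, bool useHs = false, int rootedAtAtom = -1,
bool onlyShortestPaths = false);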
//! \brief Find bond subgraphs of a particular radius around an atom.
//! Return empty result if there is no bond at the requested radius.
/*!
* \param mol - the molecule to be considered
* \param radius - the radius of the subgraphs to be considered
* \param rootedAtAtom - the atom to consider
* \param useHs - if set, hydrogens in the graph will be considered
* eligible to be in paths. NOTE: this will not add
* Hs to the graph.
* \param enforceSize - If false, all the bonds within the requested radius
* (<= radius) are collected. Otherwise, at least one bond
* located at the requested radius must be found and
* added.
* \param atomMap - Optional: if provided, it is filled with the minimum
* distance of each atom from the rooted atom (the rooted atom itself has
* distance 0). Each entry is a pair of the atom ID and the distance.
*
* The result is a path (a vector of bond indices)
*/
RDKIT_SUBGRAPHS_EXPORT PATH_TYPE findAtomEnvironmentOfRadiusN(
const ROMol &mol, unsigned int radius, unsigned int rootedAtAtom,
bool useHs = false, bool enforceSize = true,
std::unordered_map<unsigned int, unsigned int> *atomMap = nullptr);
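/*
  Illustration only: a layer-by-layer sketch of the behaviour documented for
  findAtomEnvironmentOfRadiusN, run on a plain bond list instead of an ROMol.
  This is one reading of the documentation, not the RDKit implementation:
  BondList and bondsWithinRadius are hypothetical names, the useHs flag is
  left out, and the handling of radius 0 is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical toy molecule: bond index -> (atom, atom).
typedef std::vector<std::pair<unsigned, unsigned>> BondList;

std::vector<int> bondsWithinRadius(
    const BondList &bonds, unsigned radius, unsigned root,
    bool enforceSize = true,
    std::unordered_map<unsigned, unsigned> *atomMap = nullptr) {
  std::vector<int> res;
  std::vector<bool> used(bonds.size(), false);
  std::unordered_map<unsigned, unsigned> dist;  // atom -> min distance to root
  dist[root] = 0;
  std::vector<unsigned> frontier{root};
  bool addedAtLastLayer = false;

  for (unsigned layer = 1; layer <= radius; ++layer) {
    addedAtLastLayer = false;
    std::vector<unsigned> next;
    for (unsigned atom : frontier) {
      for (std::size_t b = 0; b < bonds.size(); ++b) {
        if (used[b]) continue;
        unsigned other;
        if (bonds[b].first == atom) {
          other = bonds[b].second;
        } else if (bonds[b].second == atom) {
          other = bonds[b].first;
        } else {
          continue;  // bond does not touch this frontier atom
        }
        used[b] = true;
        res.push_back(static_cast<int>(b));
        addedAtLastLayer = true;
        // emplace keeps the first (smallest) distance seen for an atom
        if (dist.emplace(other, layer).second) next.push_back(other);
      }
    }
    frontier.swap(next);
  }

  // With enforceSize, an environment that never reaches the requested
  // radius is reported as empty.
  if (enforceSize && radius > 0 && !addedAtLastLayer) {
    res.clear();
    dist.clear();
  }
  if (atomMap) *atomMap = dist;
  return res;
}
```

  On the five-atom toy graph {0-1, 1-2, 1-4, 2-3} rooted at atom 0, a radius
  of 2 collects bonds 0, 1 and 2. A radius of 4 finds no bond that far out, so
  the result is empty unless enforceSize is false, in which case all four
  bonds are returned.
*/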
} // namespace RDKit
#endif