1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261
|
.. _self_referential:
Adjacency List Relationships
-----------------------------
The **adjacency list** pattern is a common relational pattern whereby a table
contains a foreign key reference to itself. This is the most common
way to represent hierarchical data in flat tables. Other methods
include **nested sets**, sometimes called "modified preorder",
as well as **materialized path**. Despite the appeal that modified preorder
has when evaluated for its fluency within SQL queries, the adjacency list model is
probably the most appropriate pattern for the large majority of hierarchical
storage needs, for reasons of concurrency, reduced complexity, and that
modified preorder has little advantage over an application which can fully
load subtrees into the application space.
In this example, we'll work with a single mapped
class called ``Node``, representing a tree structure::
class Node(Base):
__tablename__ = 'node'
id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey('node.id'))
data = Column(String(50))
children = relationship("Node")
With this structure, a graph such as the following::
root --+---> child1
+---> child2 --+--> subchild1
| +--> subchild2
+---> child3
Would be represented with data such as::
id parent_id data
--- ------- ----
1 NULL root
2 1 child1
3 1 child2
4 3 subchild1
5 3 subchild2
6 1 child3
The :func:`.relationship` configuration here works in the
same way as a "normal" one-to-many relationship, with the
exception that the "direction", i.e. whether the relationship
is one-to-many or many-to-one, is assumed by default to
be one-to-many. To establish the relationship as many-to-one,
an extra directive is added known as :paramref:`~.relationship.remote_side`, which
is a :class:`.Column` or collection of :class:`.Column` objects
that indicate those which should be considered to be "remote"::
class Node(Base):
__tablename__ = 'node'
id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey('node.id'))
data = Column(String(50))
parent = relationship("Node", remote_side=[id])
Where above, the ``id`` column is applied as the :paramref:`~.relationship.remote_side`
of the ``parent`` :func:`.relationship`, thus establishing
``parent_id`` as the "local" side, and the relationship
then behaves as a many-to-one.
As always, both directions can be combined into a bidirectional
relationship using the :func:`.backref` function::
class Node(Base):
__tablename__ = 'node'
id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey('node.id'))
data = Column(String(50))
children = relationship("Node",
backref=backref('parent', remote_side=[id])
)
There are several examples included with SQLAlchemy illustrating
self-referential strategies; these include :ref:`examples_adjacencylist` and
:ref:`examples_xmlpersistence`.
Composite Adjacency Lists
~~~~~~~~~~~~~~~~~~~~~~~~~
A sub-category of the adjacency list relationship is the rare
case where a particular column is present on both the "local" and
"remote" side of the join condition. An example is the ``Folder``
class below; using a composite primary key, the ``account_id``
column refers to itself, to indicate sub folders which are within
the same account as that of the parent; while ``folder_id`` refers
to a specific folder within that account::
class Folder(Base):
__tablename__ = 'folder'
__table_args__ = (
ForeignKeyConstraint(
['account_id', 'parent_id'],
['folder.account_id', 'folder.folder_id']),
)
account_id = Column(Integer, primary_key=True)
folder_id = Column(Integer, primary_key=True)
parent_id = Column(Integer)
name = Column(String)
parent_folder = relationship("Folder",
backref="child_folders",
remote_side=[account_id, folder_id]
)
Above, we pass ``account_id`` into the :paramref:`~.relationship.remote_side` list.
:func:`.relationship` recognizes that the ``account_id`` column here
is on both sides, and aligns the "remote" column along with the
``folder_id`` column, which it recognizes as uniquely present on
the "remote" side.
.. versionadded:: 0.8
Support for self-referential composite keys in :func:`.relationship`
where a column points to itself.
Self-Referential Query Strategies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Querying of self-referential structures works like any other query::
# get all nodes named 'child2'
session.query(Node).filter(Node.data=='child2')
However extra care is needed when attempting to join along
the foreign key from one level of the tree to the next. In SQL,
a join from a table to itself requires that at least one side of the
expression be "aliased" so that it can be unambiguously referred to.
Recall from :ref:`ormtutorial_aliases` in the ORM tutorial that the
:func:`.orm.aliased` construct is normally used to provide an "alias" of
an ORM entity. Joining from ``Node`` to itself using this technique
looks like:
.. sourcecode:: python+sql
from sqlalchemy.orm import aliased
nodealias = aliased(Node)
{sql}session.query(Node).filter(Node.data=='subchild1').\
join(nodealias, Node.parent).\
filter(nodealias.data=="child2").\
all()
SELECT node.id AS node_id,
node.parent_id AS node_parent_id,
node.data AS node_data
FROM node JOIN node AS node_1
ON node.parent_id = node_1.id
WHERE node.data = ?
AND node_1.data = ?
['subchild1', 'child2']
:meth:`.Query.join` also includes a feature known as
:paramref:`.Query.join.aliased` that can shorten the verbosity self-
referential joins, at the expense of query flexibility. This feature
performs a similar "aliasing" step to that above, without the need for
an explicit entity. Calls to :meth:`.Query.filter` and similar
subsequent to the aliased join will **adapt** the ``Node`` entity to
be that of the alias:
.. sourcecode:: python+sql
{sql}session.query(Node).filter(Node.data=='subchild1').\
join(Node.parent, aliased=True).\
filter(Node.data=='child2').\
all()
SELECT node.id AS node_id,
node.parent_id AS node_parent_id,
node.data AS node_data
FROM node
JOIN node AS node_1 ON node_1.id = node.parent_id
WHERE node.data = ? AND node_1.data = ?
['subchild1', 'child2']
To add criterion to multiple points along a longer join, add
:paramref:`.Query.join.from_joinpoint` to the additional
:meth:`~.Query.join` calls:
.. sourcecode:: python+sql
# get all nodes named 'subchild1' with a
# parent named 'child2' and a grandparent 'root'
{sql}session.query(Node).\
filter(Node.data=='subchild1').\
join(Node.parent, aliased=True).\
filter(Node.data=='child2').\
join(Node.parent, aliased=True, from_joinpoint=True).\
filter(Node.data=='root').\
all()
SELECT node.id AS node_id,
node.parent_id AS node_parent_id,
node.data AS node_data
FROM node
JOIN node AS node_1 ON node_1.id = node.parent_id
JOIN node AS node_2 ON node_2.id = node_1.parent_id
WHERE node.data = ?
AND node_1.data = ?
AND node_2.data = ?
['subchild1', 'child2', 'root']
:meth:`.Query.reset_joinpoint` will also remove the "aliasing" from filtering
calls::
session.query(Node).\
join(Node.children, aliased=True).\
filter(Node.data == 'foo').\
reset_joinpoint().\
filter(Node.data == 'bar')
For an example of using :paramref:`.Query.join.aliased` to
arbitrarily join along a chain of self-referential nodes, see
:ref:`examples_xmlpersistence`.
.. _self_referential_eager_loading:
Configuring Self-Referential Eager Loading
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Eager loading of relationships occurs using joins or outerjoins from parent to
child table during a normal query operation, such that the parent and its
immediate child collection or reference can be populated from a single SQL
statement, or a second statement for all immediate child collections.
SQLAlchemy's joined and subquery eager loading use aliased tables in all cases
when joining to related items, so are compatible with self-referential
joining. However, to use eager loading with a self-referential relationship,
SQLAlchemy needs to be told how many levels deep it should join and/or query;
otherwise the eager load will not take place at all. This depth setting is
configured via :paramref:`~.relationships.join_depth`:
.. sourcecode:: python+sql
class Node(Base):
__tablename__ = 'node'
id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey('node.id'))
data = Column(String(50))
children = relationship("Node",
lazy="joined",
join_depth=2)
{sql}session.query(Node).all()
SELECT node_1.id AS node_1_id,
node_1.parent_id AS node_1_parent_id,
node_1.data AS node_1_data,
node_2.id AS node_2_id,
node_2.parent_id AS node_2_parent_id,
node_2.data AS node_2_data,
node.id AS node_id,
node.parent_id AS node_parent_id,
node.data AS node_data
FROM node
LEFT OUTER JOIN node AS node_2
ON node.id = node_2.parent_id
LEFT OUTER JOIN node AS node_1
ON node_2.id = node_1.parent_id
[]
|