Relationship Configuration¶

This section describes the relationship() function and discusses its usage in depth. The reference material here continues into the next section, Collection Configuration and Techniques, which has additional detail on configuration of collections via relationship().

Basic Relational Patterns¶

A quick walkthrough of the basic relational patterns.

The imports used for each of the following sections are as follows:

from sqlalchemy import Table, Column, Integer, String, ForeignKey, ForeignKeyConstraint
from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

One To Many¶

A one to many relationship places a foreign key on the child table referencing the parent. relationship() is then specified on the parent, as referencing a collection of items represented by the child:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

To establish a bidirectional relationship in one-to-many, where the “reverse” side is a many to one, specify the backref option:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", backref="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

Child will get a parent attribute with many-to-one semantics.
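As a minimal sketch of the bidirectional behavior, the mapping above can be exercised purely in Python, with no database connection. The variable names (p, c1, c2, children_in_sync) are illustrative only:

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", backref="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

p = Parent()
c1, c2 = Child(), Child()

# appending on the one-to-many side sets the many-to-one side
p.children.append(c1)

# setting the many-to-one side appends to the one-to-many side
c2.parent = p

children_in_sync = p.children == [c1, c2]
```

Both operations keep the two attributes in agreement before anything is flushed.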

Many To One¶

Many to one places a foreign key in the parent table referencing the child. relationship() is declared on the parent, where a new scalar-holding attribute will be created:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey('child.id'))
    child = relationship("Child")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)

Bidirectional behavior is achieved by setting backref to the value "parents", which will place a one-to-many collection on the Child class:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey('child.id'))
    child = relationship("Child", backref="parents")

One To One¶

One To One is essentially a bidirectional relationship with a scalar attribute on both sides. To achieve this, the uselist flag indicates the placement of a scalar attribute instead of a collection on the “many” side of the relationship. To convert one-to-many into one-to-one:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child = relationship("Child", uselist=False, backref="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

Or to turn a one-to-many backref into one-to-one, use the backref() function to provide arguments for the reverse side:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey('child.id'))
    child = relationship("Child", backref=backref("parent", uselist=False))

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
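The following sketch uses the backref() form above and checks, in Python only, that the reverse side is a scalar rather than a collection. The variable names are illustrative:

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import backref, relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey('child.id'))
    child = relationship("Child", backref=backref("parent", uselist=False))

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)

p = Parent()
c = Child()

# a fresh Child has a scalar None on the reverse side, not an empty list
initial_reverse = c.parent

# setting the scalar on Parent populates the scalar on Child
p.child = c
reverse_side = c.parent
```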

Many To Many¶

Many to Many adds an association table between two classes. The association table is indicated by the secondary argument to relationship(). Usually, the Table uses the MetaData object associated with the declarative base class, so that the ForeignKey directives can locate the remote tables with which to link:

association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),
    Column('right_id', Integer, ForeignKey('right.id'))
)

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary=association_table)

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

For a bidirectional relationship, both sides of the relationship contain a collection. The backref keyword will automatically use the same secondary argument for the reverse relationship:

association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),
    Column('right_id', Integer, ForeignKey('right.id'))
)

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary=association_table,
                    backref="parents")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)
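A short sketch of the bidirectional many-to-many mapping in use, assuming an in-memory SQLite database (the engine URL, session setup, and variable names are assumptions made for illustration). Links made from either side produce rows in the association table on flush:

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),
    Column('right_id', Integer, ForeignKey('right.id'))
)

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary=association_table,
                    backref="parents")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

engine = create_engine('sqlite://')  # in-memory database, for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

p1, p2 = Parent(), Parent()
c = Child()
p1.children.append(c)   # link from the Parent side
c.parents.append(p2)    # link from the Child side

session.add_all([p1, p2])  # c is cascaded into the session
session.commit()

# one row per Parent/Child link was inserted automatically
link_count = len(session.execute(association_table.select()).fetchall())
```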

The secondary argument of relationship() also accepts a callable that returns the ultimate argument, which is evaluated only when mappers are first used. Using this, we can define the association_table at a later point, as long as it’s available to the callable after all module initialization is complete:

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary=lambda: association_table,
                    backref="parents")

With the declarative extension in use, the traditional “string name of the table” is accepted as well, matching the name of the table as stored in Base.metadata.tables:

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary="association",
                    backref="parents")
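The late-resolution behavior can be sketched as follows: the classes refer to the table by its string name, and the Table itself is created afterwards. This is only a sketch; the point is that the name needs to be present in Base.metadata.tables by the time the mappers are first used, which here is the first instantiation:

```python
from sqlalchemy import Column, ForeignKey, Integer, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    # "association" does not exist yet; it is resolved later
    children = relationship("Child",
                    secondary="association",
                    backref="parents")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

# the table is defined after the classes that refer to it by name
association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),
    Column('right_id', Integer, ForeignKey('right.id'))
)

# first use of the mappers resolves the string name
p = Parent()
c = Child()
p.children.append(c)
reverse_collection = c.parents
```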

Deleting Rows from the Many to Many Table¶

A behavior which is unique to the secondary argument to relationship() is that the Table which is specified here is automatically subject to INSERT and DELETE statements, as objects are added or removed from the collection. There is no need to delete from this table manually. The act of removing a record from the collection will have the effect of the row being deleted on flush:

# row will be deleted from the "secondary" table
# automatically
myparent.children.remove(somechild)
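A runnable sketch of this behavior, assuming an in-memory SQLite database; count_links() is a hypothetical helper written for this example, not part of SQLAlchemy:

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),
    Column('right_id', Integer, ForeignKey('right.id'))
)

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=association_table)

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

engine = create_engine('sqlite://')  # in-memory database, for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

def count_links():
    # hypothetical helper: number of rows in the "secondary" table
    return len(session.execute(association_table.select()).fetchall())

myparent = Parent()
somechild = Child()
myparent.children.append(somechild)
session.add(myparent)
session.flush()
links_before = count_links()

# removing from the collection deletes the association row on flush;
# the Child row itself is left in place
myparent.children.remove(somechild)
session.flush()
links_after = count_links()
children_remaining = session.query(Child).count()
```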

A question which often arises is how the row in the “secondary” table can be deleted when the child object is handed directly to Session.delete():

session.delete(somechild)

There are several possibilities here:

If there is a relationship() from Parent to Child, but there is not a reverse-relationship that links a particular Child to each Parent, SQLAlchemy will not have any awareness that when deleting this particular Child object, it needs to maintain the “secondary” table that links it to the Parent. No delete of the “secondary” table will occur.
If there is a relationship that links a particular Child to each Parent, suppose it’s called Child.parents, SQLAlchemy by default will load in the Child.parents collection to locate all Parent objects, and remove each row from the “secondary” table which establishes this link. Note that this relationship does not need to be bidirectional; SQLAlchemy is strictly looking at every relationship() associated with the Child object being deleted.
A higher performing option here is to use ON DELETE CASCADE directives with the foreign keys used by the database. Assuming the database supports this feature, the database itself can be made to automatically delete rows in the “secondary” table as referencing rows in “child” are deleted. SQLAlchemy can be instructed to forego actively loading in the Child.parents collection in this case using the passive_deletes directive on relationship(); see Using Passive Deletes for more details on this.
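The second possibility above, where a Child.parents relationship exists, can be sketched as follows. The example assumes an in-memory SQLite database and uses a backref to create Child.parents; count_links() is a hypothetical helper. It does not cover the ON DELETE CASCADE / passive_deletes variant:

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),
    Column('right_id', Integer, ForeignKey('right.id'))
)

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary=association_table,
                    backref="parents")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

engine = create_engine('sqlite://')  # in-memory database, for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

def count_links():
    # hypothetical helper: number of rows in the "secondary" table
    return len(session.execute(association_table.select()).fetchall())

someparent = Parent()
somechild = Child()
someparent.children.append(somechild)
session.add(someparent)
session.commit()
links_before = count_links()

# Child.parents exists, so deleting the Child also removes its
# rows from the "secondary" table; the Parent row is untouched
session.delete(somechild)
session.commit()
links_after = count_links()
parents_remaining = session.query(Parent).count()
```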

Note again, these behaviors are only relevant to the secondary option used with relationship(). If dealing with association tables that are mapped explicitly and are not present in the secondary option of a relevant relationship(), cascade rules can be used instead to automatically delete entities in reaction to a related entity being deleted - see Cascades for information on this feature.

Association Object¶

The association object pattern is a variant on many-to-many: it’s used when your association table contains additional columns beyond those which are foreign keys to the left and right tables. Instead of using the secondary argument, you map a new class directly to the association table. The left side of the relationship references the association object via one-to-many, and the association class references the right side via many-to-one. Below we illustrate an association table mapped to the Association class which includes a column called extra_data, which is a string value that is stored along with each association between Parent and Child:

class Association(Base):
    __tablename__ = 'association'
    left_id = Column(Integer, ForeignKey('left.id'), primary_key=True)
    right_id = Column(Integer, ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child")

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Association")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

The bidirectional version adds backrefs to both relationships:

class Association(Base):
    __tablename__ = 'association'
    left_id = Column(Integer, ForeignKey('left.id'), primary_key=True)
    right_id = Column(Integer, ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child", backref="parent_assocs")

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Association", backref="parent")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

Working with the association pattern in its direct form requires that child objects are associated with an association instance before being appended to the parent; similarly, access from parent to child goes through the association object:

# create parent, append a child via association
p = Parent()
a = Association(extra_data="some data")
a.child = Child()
p.children.append(a)

# iterate through child objects via association, including association
# attributes
for assoc in p.children:
    print(assoc.extra_data)
    print(assoc.child)

To enhance the association object pattern such that direct access to the Association object is optional, SQLAlchemy provides the Association Proxy extension. This extension allows the configuration of attributes which will access two “hops” with a single access, one “hop” to the associated object, and a second to a target attribute.
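A minimal sketch of the two-hop access, using association_proxy() from sqlalchemy.ext.associationproxy for read access only. The attribute name child_objects is a hypothetical choice for this example; appending through the proxy would additionally need a creator function, which is not shown:

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Association(Base):
    __tablename__ = 'association'
    left_id = Column(Integer, ForeignKey('left.id'), primary_key=True)
    right_id = Column(Integer, ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child")

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Association")

    # hypothetical name: a view of the "child" attribute of each
    # Association in the "children" collection
    child_objects = association_proxy("children", "child")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

p = Parent()
c = Child()
p.children.append(Association(extra_data="some data", child=c))

# one access, two hops: Parent -> Association -> Child
proxied = list(p.child_objects)
```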

Note

When using the association object pattern, it is advisable that the association-mapped table not be used as the secondary argument on a relationship() elsewhere, unless that relationship() contains the option viewonly set to True. SQLAlchemy otherwise may attempt to emit redundant INSERT and DELETE statements on the same table, if similar state is detected on the related attribute as well as the associated object.

Adjacency List Relationships¶

The adjacency list pattern is a common relational pattern whereby a table contains a foreign key reference to itself. This is the most common way to represent hierarchical data in flat tables. Other methods include nested sets, sometimes called “modified preorder”, as well as materialized path. Despite the appeal that modified preorder has when evaluated for its fluency within SQL queries, the adjacency list model is probably the most appropriate pattern for the large majority of hierarchical storage needs, for reasons of concurrency and reduced complexity, and because modified preorder has little advantage over an application which can fully load subtrees into the application space.

In this example, we’ll work with a single mapped class called Node, representing a tree structure:

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node.id'))
    data = Column(String(50))
    children = relationship("Node")

With this structure, a graph such as the following:

root --+---> child1
       +---> child2 --+--> subchild1
       |              +--> subchild2
       +---> child3

Would be represented with data such as:

id       parent_id     data
---      -------       ----
1        NULL          root
2        1             child1
3        1             child2
4        3             subchild1
5        3             subchild2
6        1             child3

The relationship() configuration here works in the same way as a “normal” one-to-many relationship, with the exception that the “direction”, i.e. whether the relationship is one-to-many or many-to-one, is assumed by default to be one-to-many. To establish the relationship as many-to-one, an extra directive is added known as remote_side, which is a Column or collection of Column objects that indicate those which should be considered to be “remote”:

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node.id'))
    data = Column(String(50))
    parent = relationship("Node", remote_side=[id])

Where above, the id column is applied as the remote_side of the parent relationship(), thus establishing parent_id as the “local” side, and the relationship then behaves as a many-to-one.

As always, both directions can be combined into a bidirectional relationship using the backref() function:

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node.id'))
    data = Column(String(50))
    children = relationship("Node",
                backref=backref('parent', remote_side=[id])
            )

There are several examples included with SQLAlchemy illustrating self-referential strategies; these include Adjacency List and XML Persistence.

Composite Adjacency Lists¶

A sub-category of the adjacency list relationship is the rare case where a particular column is present on both the “local” and “remote” side of the join condition. An example is the Folder class below; using a composite primary key, the account_id column refers to itself, to indicate sub folders which are within the same account as that of the parent; while folder_id refers to a specific folder within that account:

class Folder(Base):
    __tablename__ = 'folder'
    __table_args__ = (
      ForeignKeyConstraint(
          ['account_id', 'parent_id'],
          ['folder.account_id', 'folder.folder_id']),
    )

    account_id = Column(Integer, primary_key=True)
    folder_id = Column(Integer, primary_key=True)
    parent_id = Column(Integer)
    name = Column(String)

    parent_folder = relationship("Folder",
                        backref="child_folders",
                        remote_side=[account_id, folder_id]
                  )

Above, we pass account_id into the remote_side list. relationship() recognizes that the account_id column here is on both sides, and aligns the “remote” column along with the folder_id column, which it recognizes as uniquely present on the “remote” side.

New in version 0.8: Support for self-referential composite keys in relationship() where a column points to itself.
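The sketch below persists a parent and a sub-folder with the Folder mapping, assuming an in-memory SQLite database and explicitly assigned key values. It illustrates that parent_id on the sub-folder is populated from the parent's folder_id during the flush:

```python
from sqlalchemy import (Column, ForeignKeyConstraint, Integer, String,
                        create_engine)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

class Folder(Base):
    __tablename__ = 'folder'
    __table_args__ = (
      ForeignKeyConstraint(
          ['account_id', 'parent_id'],
          ['folder.account_id', 'folder.folder_id']),
    )

    account_id = Column(Integer, primary_key=True)
    folder_id = Column(Integer, primary_key=True)
    parent_id = Column(Integer)
    name = Column(String)

    parent_folder = relationship("Folder",
                        backref="child_folders",
                        remote_side=[account_id, folder_id]
                  )

engine = create_engine('sqlite://')  # in-memory database, for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

root = Folder(account_id=1, folder_id=1, name='root')
sub = Folder(account_id=1, folder_id=2, name='sub', parent_folder=root)
session.add(root)   # sub is cascaded in via the child_folders backref
session.commit()

# parent_id was copied from root.folder_id during the flush
sub_parent_id = sub.parent_id
child_names = [folder.name for folder in root.child_folders]
```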

Self-Referential Query Strategies¶

Querying of self-referential structures works like any other query:

# get all nodes named 'child2'
session.query(Node).filter(Node.data=='child2')

However extra care is needed when attempting to join along the foreign key from one level of the tree to the next. In SQL, a join from a table to itself requires that at least one side of the expression be “aliased” so that it can be unambiguously referred to.

Recall from Using Aliases in the ORM tutorial that the orm.aliased() construct is normally used to provide an “alias” of an ORM entity. Joining from Node to itself using this technique looks like:

from sqlalchemy.orm import aliased

nodealias = aliased(Node)
session.query(Node).filter(Node.data=='subchild1').\
                join(nodealias, Node.parent).\
                filter(nodealias.data=="child2").\
                all()

SELECT node.id AS node_id,
        node.parent_id AS node_parent_id,
        node.data AS node_data
FROM node JOIN node AS node_1
    ON node.parent_id = node_1.id
WHERE node.data = ?
    AND node_1.data = ?
['subchild1', 'child2']
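For a self-contained version of the explicit-alias query, the sketch below loads the example tree into an in-memory SQLite database and runs the same join. The Node mapping here declares parent directly with a children backref so that Node.parent is available on the class; that arrangement and the variable names are assumptions made for this example:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import aliased, relationship, sessionmaker

Base = declarative_base()

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node.id'))
    data = Column(String(50))
    parent = relationship("Node", remote_side=[id], backref="children")

engine = create_engine('sqlite://')  # in-memory database, for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

root = Node(data='root')
child2 = Node(data='child2', parent=root)
session.add_all([
    root,
    Node(data='child1', parent=root),
    child2,
    Node(data='subchild1', parent=child2),
    Node(data='subchild2', parent=child2),
    Node(data='child3', parent=root),
])
session.commit()

# alias the "parent" side of the join so it can be filtered separately
nodealias = aliased(Node)
matches = session.query(Node).filter(Node.data == 'subchild1').\
    join(nodealias, Node.parent).\
    filter(nodealias.data == 'child2').\
    all()
names = [node.data for node in matches]
```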

Query.join() also includes a feature known as Query.join.aliased that can reduce the verbosity of self-referential joins, at the expense of query flexibility. This feature performs a similar “aliasing” step to that above, without the need for an explicit entity. Calls to Query.filter() and similar subsequent to the aliased join will adapt the Node entity to be that of the alias:

session.query(Node).filter(Node.data=='subchild1').\
        join(Node.parent, aliased=True).\
        filter(Node.data=='child2').\
        all()

SELECT node.id AS node_id,
        node.parent_id AS node_parent_id,
        node.data AS node_data
FROM node
    JOIN node AS node_1 ON node_1.id = node.parent_id
WHERE node.data = ? AND node_1.data = ?
['subchild1', 'child2']

To add criteria at multiple points along a longer join, add Query.join.from_joinpoint to the additional join() calls:

# get all nodes named 'subchild1' with a
# parent named 'child2' and a grandparent 'root'
session.query(Node).\
        filter(Node.data=='subchild1').\
        join(Node.parent, aliased=True).\
        filter(Node.data=='child2').\
        join(Node.parent, aliased=True, from_joinpoint=True).\
        filter(Node.data=='root').\
        all()

SELECT node.id AS node_id,
        node.parent_id AS node_parent_id,
        node.data AS node_data
FROM node
    JOIN node AS node_1 ON node_1.id = node.parent_id
    JOIN node AS node_2 ON node_2.id = node_1.parent_id
WHERE node.data = ?
    AND node_1.data = ?
    AND node_2.data = ?
['subchild1', 'child2', 'root']

Query.reset_joinpoint() will also remove the “aliasing” from filtering calls:

session.query(Node).\
        join(Node.children, aliased=True).\
        filter(Node.data == 'foo').\
        reset_joinpoint().\
        filter(Node.data == 'bar')

For an example of using Query.join.aliased to arbitrarily join along a chain of self-referential nodes, see XML Persistence.

Configuring Self-Referential Eager Loading¶

Eager loading of relationships occurs using joins or outerjoins from parent to child table during a normal query operation, such that the parent and its immediate child collection or reference can be populated from a single SQL statement, or a second statement for all immediate child collections. SQLAlchemy’s joined and subquery eager loading use aliased tables in all cases when joining to related items, so are compatible with self-referential joining. However, to use eager loading with a self-referential relationship, SQLAlchemy needs to be told how many levels deep it should join and/or query; otherwise the eager load will not take place at all. This depth setting is configured via join_depth:

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node.id'))
    data = Column(String(50))
    children = relationship("Node",
                    lazy="joined",
                    join_depth=2)

session.query(Node).all()

SELECT node_1.id AS node_1_id,
        node_1.parent_id AS node_1_parent_id,
        node_1.data AS node_1_data,
        node_2.id AS node_2_id,
        node_2.parent_id AS node_2_parent_id,
        node_2.data AS node_2_data,
        node.id AS node_id,
        node.parent_id AS node_parent_id,
        node.data AS node_data
FROM node
    LEFT OUTER JOIN node AS node_2
        ON node.id = node_2.parent_id
    LEFT OUTER JOIN node AS node_1
        ON node_2.id = node_1.parent_id
[]
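One way to observe the effect of join_depth is to load a node, close the session so that no further lazy loading is possible, and then walk two levels of children. The sketch below assumes an in-memory SQLite database and a three-level chain of nodes; the variable names are illustrative:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node.id'))
    data = Column(String(50))
    children = relationship("Node",
                    lazy="joined",
                    join_depth=2)

engine = create_engine('sqlite://')  # in-memory database, for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(
    Node(data='root', children=[
        Node(data='child2', children=[
            Node(data='subchild1')
        ])
    ])
)
session.commit()
session.close()   # discard the objects created above

loaded = session.query(Node).filter(Node.data == 'root').one()
session.close()   # detach: lazy loading is no longer available

# both levels were populated by the single eager-loading query
child = loaded.children[0]
grandchild = child.children[0]
```

Collections deeper than join_depth are not part of the eager load and would be loaded lazily on access, which requires the object to still be attached to a session.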

Linking Relationships with Backref¶

The backref keyword argument was first introduced in Object Relational Tutorial, and has been mentioned throughout many of the examples here. What does it actually do? Let’s start with the canonical User and Address scenario:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    addresses = relationship("Address", backref="user")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey('user.id'))

The above configuration establishes a collection of Address objects on User called User.addresses. It also establishes a .user attribute on Address which will refer to the parent User object.

In fact, the backref keyword is only a common shortcut for placing a second relationship() onto the Address mapping, including the establishment of an event listener on both sides which will mirror attribute operations in both directions. The above configuration is equivalent to:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    addresses = relationship("Address", back_populates="user")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey('user.id'))

    user = relationship("User", back_populates="addresses")

Above, we add a .user relationship to Address explicitly. On both relationships, the back_populates directive tells each relationship about the other one, indicating that they should establish “bidirectional” behavior between each other. The primary effect of this configuration is that the relationship adds event handlers to both attributes which have the behavior of “when an append or set event occurs here, set ourselves onto the incoming attribute using this particular attribute name”. The behavior is illustrated as follows. Start with a User and an Address instance. The .addresses collection is empty, and the .user attribute is None:

>>> u1 = User()
>>> a1 = Address()
>>> u1.addresses
[]
>>> print(a1.user)
None

However, once the Address is appended to the u1.addresses collection, both the collection and the scalar attribute have been populated:

>>> u1.addresses.append(a1)
>>> u1.addresses
[<__main__.Address object at 0x12a6ed0>]
>>> a1.user
<__main__.User object at 0x12a6590>

This behavior of course works in reverse for removal operations, as well as for equivalent operations on both sides. For example, when .user is set again to None, the Address object is removed from the reverse collection:

>>> a1.user = None
>>> u1.addresses
[]

The manipulation of the .addresses collection and the .user attribute occurs entirely in Python without any interaction with the SQL database. Without this behavior, the proper state would only become apparent on both sides once the data has been flushed to the database, and later reloaded after a commit or expiration operation occurs. The backref/back_populates behavior has the advantage that common bidirectional operations can reflect the correct state without requiring a database round trip.

Remember, when the backref keyword is used on a single relationship, it’s exactly the same as if the above two relationships were created individually using back_populates on each.

Backref Arguments¶

We’ve established that the backref keyword is merely a shortcut for building two individual relationship() constructs that refer to each other. Part of the behavior of this shortcut is that certain configurational arguments applied to the relationship() will also be applied to the other direction - namely those arguments that describe the relationship at a schema level, and are unlikely to be different in the reverse direction. The usual case here is a many-to-many relationship() that has a secondary argument, or a one-to-many or many-to-one which has a primaryjoin argument (the primaryjoin argument is discussed in Specifying Alternate Join Conditions). For example, suppose we limit the list of Address objects to those whose email starts with “tony”:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    addresses = relationship("Address",
                    primaryjoin="and_(User.id==Address.user_id, "
                        "Address.email.startswith('tony'))",
                    backref="user")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey('user.id'))

We can observe, by inspecting the resulting property, that both sides of the relationship have this join condition applied:

>>> print(User.addresses.property.primaryjoin)
"user".id = address.user_id AND address.email LIKE :email_1 || '%%'
>>>
>>> print(Address.user.property.primaryjoin)
"user".id = address.user_id AND address.email LIKE :email_1 || '%%'
>>>

This reuse of arguments should pretty much do the “right thing” - it uses only arguments that are applicable, and in the case of a many-to-many relationship, will reverse the usage of primaryjoin and secondaryjoin to correspond to the other direction (see the example in Self-Referential Many-to-Many Relationship for this).

It’s very often the case however that we’d like to specify arguments that are specific to just the side where we happened to place the “backref”. This includes relationship() arguments like lazy, remote_side, cascade and cascade_backrefs. For this case we use the backref() function in place of a string:

# <other imports>
from sqlalchemy.orm import backref

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    addresses = relationship("Address",
                    backref=backref("user", lazy="joined"))

Where above, we placed a lazy="joined" directive only on the Address.user side, indicating that when a query against Address is made, a join to the User entity should be made automatically which will populate the .user attribute of each returned Address. The backref() function formatted the arguments we gave it into a form that is interpreted by the receiving relationship() as additional arguments to be applied to the new relationship it creates.
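To confirm that the option lands only on the reverse side, the configured relationships can be inspected once the mappers are set up. This sketch assumes that reading the lazy attribute of each relationship property is an acceptable way to check the loader strategy; configure_mappers() is called so that the backref exists on Address before it is inspected:

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import backref, configure_mappers, relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    addresses = relationship("Address",
                    backref=backref("user", lazy="joined"))

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey('user.id'))

# the backref attribute Address.user is created at configure time
configure_mappers()

user_side = User.addresses.property.lazy     # default loader strategy
address_side = Address.user.property.lazy    # value given to backref()
```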

One Way Backrefs¶

An unusual case is that of the “one way backref”. This is where the “back-populating” behavior of the backref is only desirable in one direction. An example of this is a collection which contains a filtering primaryjoin condition. We’d like to append items to this collection as needed, and have them populate the “parent” object on the incoming object. However, we’d also like to have items that are not part of the collection, but still have the same “parent” association - these items should never be in the collection.

Taking our previous example, where we established a primaryjoin that limited the collection only to Address objects whose email address started with the word tony, the usual backref behavior is that all items populate in both directions. We wouldn’t want this behavior for a case like the following:

>>> u1 = User()
>>> a1 = Address(email='mary')
>>> a1.user = u1
>>> u1.addresses
[<__main__.Address object at 0x1411910>]

Above, the Address object that doesn’t match the criterion of “starts with ‘tony’” is present in the addresses collection of u1. After these objects are flushed, the transaction committed and their attributes expired for a re-load, the addresses collection will hit the database on next access and no longer have this Address object present, due to the filtering condition. But we can do away with this unwanted side of the “backref” behavior on the Python side by using two separate relationship() constructs, placing back_populates only on one side:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    addresses = relationship("Address",
                    primaryjoin="and_(User.id==Address.user_id, "
                        "Address.email.startswith('tony'))",
                    back_populates="user")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey('user.id'))
    user = relationship("User")

With the above scenario, appending an Address object to the .addresses collection of a User will always establish the .user attribute on that Address:

>>> u1 = User()
>>> a1 = Address(email='tony')
>>> u1.addresses.append(a1)
>>> a1.user
<__main__.User object at 0x1411850>

However, applying a User to the .user attribute of an Address, will not append the Address object to the collection:

>>> a2 = Address(email='mary')
>>> a2.user = u1
>>> a2 in u1.addresses
False

Of course, we’ve disabled some of the usefulness of backref here, in that when we do assign a User to the .user attribute of an Address whose email corresponds to the criterion of email.startswith('tony'), that Address won’t show up in the User.addresses collection until the session is flushed, and the attributes reloaded after a commit or expire operation. While we could consider an attribute event that checks this criterion in Python, this starts to cross the line of duplicating too much SQL behavior in Python. The backref behavior itself is only a slight transgression of this philosophy - SQLAlchemy tries to keep these to a minimum overall.

Configuring how Relationship Joins¶

relationship() will normally create a join between two tables by examining the foreign key relationship between the two tables to determine which columns should be compared. There are a variety of situations where this behavior needs to be customized.

Handling Multiple Join Paths¶

One of the most common situations to deal with is when there is more than one foreign key path between two tables.

Consider a Customer class that contains two foreign keys to an Address class:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    billing_address_id = Column(Integer, ForeignKey("address.id"))
    shipping_address_id = Column(Integer, ForeignKey("address.id"))

    billing_address = relationship("Address")
    shipping_address = relationship("Address")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    street = Column(String)
    city = Column(String)
    state = Column(String)
    zip = Column(String)

The above mapping, when we attempt to use it, will produce the error:

sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join
condition between parent/child tables on relationship
Customer.billing_address - there are multiple foreign key
paths linking the tables.  Specify the 'foreign_keys' argument,
providing a list of those columns which should be
counted as containing a foreign key reference to the parent table.

The above message is pretty long. There are many potential messages that relationship() can return, which have been carefully tailored to detect a variety of common configurational issues; most will suggest the additional configuration that’s needed to resolve the ambiguity or other missing information.

In this case, the message asks us to qualify each relationship() by stating which foreign key column it should use. The appropriate form is as follows:

class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    billing_address_id = Column(Integer, ForeignKey("address.id"))
    shipping_address_id = Column(Integer, ForeignKey("address.id"))

    billing_address = relationship("Address", foreign_keys=[billing_address_id])
    shipping_address = relationship("Address", foreign_keys=[shipping_address_id])

Above, we specify the foreign_keys argument, which is a Column or list of Column objects which indicate those columns to be considered “foreign”, or in other words, the columns that contain a value referring to a parent table. Loading the Customer.billing_address relationship from a Customer object will use the value present in billing_address_id in order to identify the row in Address to be loaded; similarly, shipping_address_id is used for the shipping_address relationship. The linkage of the two columns also plays a role during persistence; the newly generated primary key of a just-inserted Address object will be copied into the appropriate foreign key column of an associated Customer object during a flush.

When specifying foreign_keys with Declarative, we can also use string names; however, when a list is used, the list must be part of the string:

billing_address = relationship("Address", foreign_keys="[Customer.billing_address_id]")

In this specific example, the list is not necessary in any case as there’s only one Column we need:

billing_address = relationship("Address", foreign_keys="Customer.billing_address_id")

Changed in version 0.8: relationship() can resolve ambiguity between foreign key targets on the basis of the foreign_keys argument alone; the primaryjoin argument is no longer needed in this situation.
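As a rough end-to-end check of the mapping above, the sketch below persists one Customer with two Address rows and confirms that each relationship() populated its own foreign key column. The in-memory SQLite engine, the Session usage and the sample values are assumptions made for illustration, and Address is trimmed to a single city column:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

Base = declarative_base()


class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    billing_address_id = Column(Integer, ForeignKey("address.id"))
    shipping_address_id = Column(Integer, ForeignKey("address.id"))

    # each relationship() names the single column it treats as "foreign"
    billing_address = relationship("Address", foreign_keys=[billing_address_id])
    shipping_address = relationship("Address", foreign_keys=[shipping_address_id])


class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    city = Column(String)


# in-memory SQLite database, assumed here only so the example can run
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

session = Session(engine)
customer = Customer(
    name="ed",
    billing_address=Address(city="Boston"),
    shipping_address=Address(city="Chicago"),
)
session.add(customer)  # the two Address objects are added by cascade
session.commit()

# during the flush, each new Address primary key was copied into the
# foreign key column named by the corresponding relationship()
print(customer.billing_address_id, customer.billing_address.city)
print(customer.shipping_address_id, customer.shipping_address.city)
```

If the foreign_keys arguments are removed, mapper configuration should fail with the AmbiguousForeignKeysError shown earlier.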

Specifying Alternate Join Conditions¶

The default behavior of relationship() when constructing a join is that it equates the value of primary key columns on one side to that of foreign-key-referring columns on the other. We can change this criterion to be anything we’d like using the primaryjoin argument, as well as the secondaryjoin argument in the case when a “secondary” table is used.

In the example below, using the User class as well as an Address class which stores a street address, we create a relationship boston_addresses which will only load those Address objects which specify a city of “Boston”:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    boston_addresses = relationship("Address",
                    primaryjoin="and_(User.id==Address.user_id, "
                        "Address.city=='Boston')")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('user.id'))

    street = Column(String)
    city = Column(String)
    state = Column(String)
    zip = Column(String)

Within this string SQL expression, we made use of the and_() conjunction construct to establish two distinct predicates for the join condition - joining both the User.id and Address.user_id columns to each other, as well as limiting rows in Address to just city='Boston'. When using Declarative, rudimentary SQL functions like and_() are automatically available in the evaluated namespace of a string relationship() argument.

The custom criteria we use in a primaryjoin are generally only significant when SQLAlchemy is rendering SQL in order to load or represent this relationship. That is, it’s used in the SQL statement that’s emitted in order to perform a per-attribute lazy load, or when a join is constructed at query time, such as via Query.join(), or via the eager “joined” or “subquery” styles of loading. When in-memory objects are being manipulated, we can place any Address object we’d like into the boston_addresses collection, regardless of what the value of the .city attribute is. The objects will remain present in the collection until the attribute is expired and re-loaded from the database where the criterion is applied. When a flush occurs, the objects inside of boston_addresses will be flushed unconditionally, assigning the value of the primary key user.id column onto the foreign-key-holding address.user_id column for each row. The city criterion has no effect here, as the flush process only cares about synchronizing primary key values into referencing foreign key values.
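The difference between in-memory state and loaded state can be seen in a small experiment. The sketch below assumes an in-memory SQLite engine and sample city values chosen for illustration, and trims Address to the columns it needs. It places a non-Boston Address into boston_addresses, flushes, and then reloads the collection after the commit has expired it:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    boston_addresses = relationship(
        "Address",
        primaryjoin="and_(User.id==Address.user_id, "
                    "Address.city=='Boston')")


class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('user.id'))
    city = Column(String)


engine = create_engine("sqlite://")  # assumed engine, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)

user = User(name="ed")
# in memory, the collection accepts any Address, whatever its city
user.boston_addresses.append(Address(city="Boston"))
user.boston_addresses.append(Address(city="Chicago"))
cities_before = sorted(a.city for a in user.boston_addresses)

session.add(user)
session.commit()  # flushes both rows, then expires loaded attributes

# the next access re-loads the collection using the primaryjoin criteria
cities_after = [a.city for a in user.boston_addresses]

# the flush wrote user_id on every row, Boston or not
stored_user_ids = [a.user_id for a in
                   session.query(Address).order_by(Address.id)]

print(cities_before, cities_after, stored_user_ids)
```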

Creating Custom Foreign Conditions¶

Another element of the primary join condition is how those columns considered “foreign” are determined. Usually, some subset of Column objects will specify ForeignKey, or otherwise be part of a ForeignKeyConstraint that’s relevant to the join condition. relationship() looks to this foreign key status as it decides how it should load and persist data for this relationship. However, the primaryjoin argument can be used to create a join condition that doesn’t involve any “schema” level foreign keys. We can combine primaryjoin along with foreign_keys and remote_side explicitly in order to establish such a join.

Below, a class HostEntry joins to itself, equating the string content column to the ip_address column, which is a PostgreSQL type called INET. We need to use cast() in order to cast one side of the join to the type of the other:

from sqlalchemy import cast, String, Column, Integer
from sqlalchemy.orm import relationship
from sqlalchemy.dialects.postgresql import INET

from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class HostEntry(Base):
    __tablename__ = 'host_entry'

    id = Column(Integer, primary_key=True)
    ip_address = Column(INET)
    content = Column(String(50))

    # relationship() using explicit foreign_keys, remote_side
    parent_host = relationship("HostEntry",
                        primaryjoin=ip_address == cast(content, INET),
                        foreign_keys=content,
                        remote_side=ip_address
                    )

The above relationship will produce a join like:

SELECT host_entry.id, host_entry.ip_address, host_entry.content
FROM host_entry JOIN host_entry AS host_entry_1
ON host_entry_1.ip_address = CAST(host_entry.content AS INET)

An alternative syntax to the above is to use the foreign() and remote() annotations, inline within the primaryjoin expression. This syntax represents the annotations that relationship() normally applies by itself to the join condition given the foreign_keys and remote_side arguments; the functions are provided in the API in the rare case that relationship() can’t determine the exact location of these features on its own:

from sqlalchemy.orm import foreign, remote

class HostEntry(Base):
    __tablename__ = 'host_entry'

    id = Column(Integer, primary_key=True)
    ip_address = Column(INET)
    content = Column(String(50))

    # relationship() using explicit foreign() and remote() annotations
    # in lieu of separate arguments
    parent_host = relationship("HostEntry",
                        primaryjoin=remote(ip_address) == \
                                cast(foreign(content), INET),
                    )
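The HostEntry mapping needs a PostgreSQL database because of the INET type. To try the same foreign() / remote() technique without one, the sketch below swaps in a hypothetical Host class whose Integer code column stands in for ip_address and whose String parent_code column stands in for content; these names, the SQLite engine and the sample rows are all assumptions made for illustration. As in the original, there is no schema-level ForeignKey:

```python
from sqlalchemy import Column, Integer, String, cast, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, foreign, relationship, remote

Base = declarative_base()


class Host(Base):
    # hypothetical stand-in for HostEntry, using types SQLite understands
    __tablename__ = 'host'

    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    code = Column(Integer)            # plays the role of ip_address
    parent_code = Column(String(50))  # plays the role of content

    # no ForeignKey exists, so the annotations tell relationship() which
    # column is "foreign" and which one is on the remote side of the join
    parent_host = relationship(
        "Host",
        primaryjoin=remote(code) == cast(foreign(parent_code), Integer))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

# rows are linked only by column values: the string '10' refers to code 10
session.add_all([
    Host(name="root", code=10),
    Host(name="leaf", code=20, parent_code="10"),
])
session.commit()

leaf = session.query(Host).filter_by(name="leaf").one()
parent_name = leaf.parent_host.name  # lazy load through the CAST comparison
print(parent_name)
```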

Using custom operators in join conditions¶

Another use case for relationships is the use of custom operators, such as PostgreSQL’s “is contained within” << operator when joining with types such as postgresql.INET and postgresql.CIDR. For custom operators we use the Operators.op() function:

inet_column.op("<<")(cidr_column)

However, if we construct a primaryjoin using this operator, relationship() will still need more information. This is because when it examines our primaryjoin condition, it specifically looks for operators used for comparisons, and this is typically a fixed list containing known comparison operators such as ==, <, etc. So for our custom operator to participate in this system, we need it to register as a comparison operator using the is_comparison parameter:

inet_column.op("<<", is_comparison=True)(cidr_column)

A complete example:

class IPA(Base):
    __tablename__ = 'ip_address'

    id = Column(Integer, primary_key=True)
    v4address = Column(INET)

    network = relationship("Network",
                        primaryjoin="IPA.v4address.op('<<', is_comparison=True)"
                            "(foreign(Network.v4representation))",
                        viewonly=True
                    )

class Network(Base):
    __tablename__ = 'network'

    id = Column(Integer, primary_key=True)
    v4representation = Column(CIDR)

Above, a query such as:

session.query(IPA).join(IPA.network)

Will render as:

SELECT ip_address.id AS ip_address_id, ip_address.v4address AS ip_address_v4address
FROM ip_address JOIN network ON ip_address.v4address << network.v4representation

New in version 0.9.2: Added the Operators.op.is_comparison flag to assist in the creation of relationship() constructs using custom operators.

Non-relational Comparisons / Materialized Path¶

Warning

This section details an experimental feature.

Using custom expressions means we can produce unorthodox join conditions that don’t obey the usual primary/foreign key model. One such example is the materialized path pattern, where we compare strings for overlapping path tokens in order to produce a tree structure.

Through careful use of foreign() and remote(), we can build a relationship that effectively produces a rudimentary materialized path system. Essentially, when foreign() and remote() are on the same side of the comparison expression, the relationship is considered to be “one to many”; when they are on different sides, the relationship is considered to be “many to one”. For the comparison we’ll use here, we’ll be dealing with collections so we keep things configured as “one to many”:

class Element(Base):
    __tablename__ = 'element'

    path = Column(String, primary_key=True)

    descendants = relationship('Element',
                           primaryjoin=
                                remote(foreign(path)).like(
                                        path.concat('/%')),
                           viewonly=True,
                           order_by=path)

Above, if given an Element object with a path attribute of "/foo/bar2", we seek for a load of Element.descendants to look like:

SELECT element.path AS element_path
FROM element
WHERE element.path LIKE ('/foo/bar2' || '/%') ORDER BY element.path

New in version 0.9.5: Support has been added to allow a single-column comparison to itself within a primaryjoin condition, as well as for primaryjoin conditions that use Operators.like() as the comparison operator.
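The Element mapping can be exercised against SQLite, which supports both LIKE and the || concatenation operator. The sketch below assumes an in-memory engine and a handful of sample paths chosen for illustration, and loads the descendants of the "/foo/bar2" element:

```python
from sqlalchemy import Column, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, foreign, relationship, remote

Base = declarative_base()


class Element(Base):
    __tablename__ = 'element'

    path = Column(String, primary_key=True)

    # foreign() and remote() are on the same side, so this is "one to many"
    descendants = relationship(
        'Element',
        primaryjoin=remote(foreign(path)).like(path.concat('/%')),
        viewonly=True,
        order_by=path)


engine = create_engine("sqlite://")  # assumed engine, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)

sample_paths = [
    "/foo",
    "/foo/bar1",
    "/foo/bar2",
    "/foo/bar2/bat1",
    "/foo/bar2/bat2",
    "/foo/bar3",
]
session.add_all([Element(path=p) for p in sample_paths])
session.commit()

bar2 = session.query(Element).filter_by(path="/foo/bar2").one()
# lazy load: WHERE element.path LIKE ('/foo/bar2' || '/%') ORDER BY path
descendant_paths = [e.path for e in bar2.descendants]
print(descendant_paths)
```

Because the relationship is viewonly, the tree is maintained entirely through the path values; appending to descendants would not change any row.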

Self-Referential Many-to-Many Relationship¶

Many to many relationships can be customized by one or both of primaryjoin and secondaryjoin - the latter is significant for a relationship that specifies a many-to-many reference using the secondary argument. A common situation which involves the usage of primaryjoin and secondaryjoin is when establishing a many-to-many relationship from a class to itself, as shown below:

from sqlalchemy import Integer, ForeignKey, String, Column, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

node_to_node = Table("node_to_node", Base.metadata,
    Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
)

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String)
    right_nodes = relationship("Node",
                        secondary=node_to_node,
                        primaryjoin=id==node_to_node.c.left_node_id,
                        secondaryjoin=id==node_to_node.c.right_node_id,
                        backref="left_nodes"
    )

Above, SQLAlchemy can’t know automatically which columns should connect to which for the right_nodes and left_nodes relationships. The primaryjoin and secondaryjoin arguments establish how we’d like to join to the association table. In the Declarative form above, as we are declaring these conditions within the Python block that corresponds to the Node class, the id variable is available directly as the Column object we wish to join with.

Alternatively, we can define the primaryjoin and secondaryjoin arguments using strings, which is suitable in the case that our configuration does not have either the Node.id column object available yet or the node_to_node table perhaps isn’t yet available. When referring to a plain Table object in a declarative string, we use the string name of the table as it is present in the MetaData:

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String)
    right_nodes = relationship("Node",
                        secondary="node_to_node",
                        primaryjoin="Node.id==node_to_node.c.left_node_id",
                        secondaryjoin="Node.id==node_to_node.c.right_node_id",
                        backref="left_nodes"
    )

A classical mapping situation here is similar, where node_to_node can be joined to node.c.id:

from sqlalchemy import Integer, ForeignKey, String, Column, Table, MetaData
from sqlalchemy.orm import relationship, mapper

metadata = MetaData()

node_to_node = Table("node_to_node", metadata,
    Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
)

node = Table("node", metadata,
    Column('id', Integer, primary_key=True),
    Column('label', String)
)

class Node(object):
    pass

mapper(Node, node, properties={
    'right_nodes':relationship(Node,
                        secondary=node_to_node,
                        primaryjoin=node.c.id==node_to_node.c.left_node_id,
                        secondaryjoin=node.c.id==node_to_node.c.right_node_id,
                        backref="left_nodes"
                    )})

Note that in both examples, the backref keyword specifies a left_nodes backref - when relationship() creates the second relationship in the reverse direction, it’s smart enough to reverse the primaryjoin and secondaryjoin arguments.
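To see both directions at work, the sketch below uses the Declarative form of the mapping with an in-memory SQLite engine and three sample nodes (the engine, session and labels are assumptions made for illustration). It links one node to two others through right_nodes and then reads the generated left_nodes backref and the association rows:

```python
from sqlalchemy import (Column, ForeignKey, Integer, String, Table,
                        create_engine)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

Base = declarative_base()

node_to_node = Table(
    "node_to_node", Base.metadata,
    Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
)


class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String)
    right_nodes = relationship(
        "Node",
        secondary=node_to_node,
        primaryjoin=id == node_to_node.c.left_node_id,
        secondaryjoin=id == node_to_node.c.right_node_id,
        backref="left_nodes")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

n1, n2, n3 = Node(label="n1"), Node(label="n2"), Node(label="n3")
n1.right_nodes = [n2, n3]
session.add(n1)  # n2 and n3 are added by cascade
session.commit()

# the backref is the same association read from the other direction
left_of_n2 = [n.label for n in n2.left_nodes]

# one association row per link, with n1 always on the "left"
link_rows = [tuple(row) for row in session.execute(node_to_node.select())]
print(left_of_n2, link_rows)
```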

Composite “Secondary” Joins¶

Note

This section features some new and experimental features of SQLAlchemy.

Sometimes, when one seeks to build a relationship() between two tables there is a need for more than just two or three tables to be involved in order to join them. This is an area of relationship() where one seeks to push the boundaries of what’s possible, and often the ultimate solution to many of these exotic use cases needs to be hammered out on the SQLAlchemy mailing list.

In more recent versions of SQLAlchemy, the secondary parameter can be used in some of these cases in order to provide a composite target consisting of multiple tables. Below is an example of such a join condition (requires version 0.9.2 at least to function as is):

class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    b_id = Column(ForeignKey('b.id'))

    d = relationship("D",
                secondary="join(B, D, B.d_id == D.id)."
                            "join(C, C.d_id == D.id)",
                primaryjoin="and_(A.b_id == B.id, A.id == C.a_id)",
                secondaryjoin="D.id == B.d_id",
                uselist=False
                )

class B(Base):
    __tablename__ = 'b'

    id = Column(Integer, primary_key=True)
    d_id = Column(ForeignKey('d.id'))

class C(Base):
    __tablename__ = 'c'

    id = Column(Integer, primary_key=True)
    a_id = Column(ForeignKey('a.id'))
    d_id = Column(ForeignKey('d.id'))

class D(Base):
    __tablename__ = 'd'

    id = Column(Integer, primary_key=True)

In the above example, we provide all three of secondary, primaryjoin, and secondaryjoin, in the declarative style referring to the named tables a, b, c, d directly. A query from A to D looks like:

sess.query(A).join(A.d).all()

SELECT a.id AS a_id, a.b_id AS a_b_id
FROM a JOIN (
    b AS b_1 JOIN d AS d_1 ON b_1.d_id = d_1.id
        JOIN c AS c_1 ON c_1.d_id = d_1.id)
    ON a.b_id = b_1.id AND a.id = c_1.a_id JOIN d ON d.id = b_1.d_id

In the above example, we take advantage of being able to stuff multiple tables into a “secondary” container, so that we can join across many tables while still keeping things “simple” for relationship(), in that there’s just “one” table on both the “left” and the “right” side; the complexity is kept within the middle.

New in version 0.9.2: Support is improved for allowing a join() construct to be used directly as the target of the secondary argument, including support for joins, eager joins and lazy loading, as well as support within declarative to specify complex conditions such as joins involving class names as targets.
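The sketch below loads A.d lazily against an in-memory SQLite database. Two things are assumptions added for the demonstration rather than part of the mapping above: the relationship is marked viewonly=True, and the rows are written through the plain foreign key columns with explicit primary key values. An A only has a related D when matching B and C rows both exist:

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

Base = declarative_base()


class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    b_id = Column(ForeignKey('b.id'))

    d = relationship(
        "D",
        secondary="join(B, D, B.d_id == D.id)."
                  "join(C, C.d_id == D.id)",
        primaryjoin="and_(A.b_id == B.id, A.id == C.a_id)",
        secondaryjoin="D.id == B.d_id",
        uselist=False,
        viewonly=True)  # assumption: used for loading only in this sketch


class B(Base):
    __tablename__ = 'b'

    id = Column(Integer, primary_key=True)
    d_id = Column(ForeignKey('d.id'))


class C(Base):
    __tablename__ = 'c'

    id = Column(Integer, primary_key=True)
    a_id = Column(ForeignKey('a.id'))
    d_id = Column(ForeignKey('d.id'))


class D(Base):
    __tablename__ = 'd'

    id = Column(Integer, primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

# a1 is linked to d1 through both b1 and c1; a2 has no C row at all
session.add_all([
    D(id=1),
    B(id=1, d_id=1),
    A(id=1, b_id=1),
    A(id=2, b_id=1),
    C(id=1, a_id=1, d_id=1),
])
session.commit()

a1 = session.query(A).filter_by(id=1).one()
a2 = session.query(A).filter_by(id=2).one()
d_for_a1 = a1.d  # loaded through the composite "secondary" join
d_for_a2 = a2.d  # no row satisfies A.id == C.a_id
print(d_for_a1, d_for_a2)
```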

Relationship to Non Primary Mapper¶

In the previous section, we illustrated a technique where we used secondary in order to place additional tables within a join condition. There is one complex join case where even this technique is not sufficient: when we seek to join from A to B, making use of any number of C, D, etc. in between, while there are also join conditions between A and B directly. In this case, the join from A to B may be difficult to express with just a complex primaryjoin condition, as the intermediary tables may need special handling, and it is also not expressible with a secondary object, since the A->secondary->B pattern does not support any references between A and B directly. When this extremely advanced case arises, we can resort to creating a second mapping as a target for the relationship. This is where we use mapper() in order to make a mapping to a class that includes all the additional tables we need for this join. In order to produce this mapper as an “alternative” mapping for our class, we use the non_primary flag.

Below illustrates a relationship() with a simple join from A to B, however the primaryjoin condition is augmented with two additional entities C and D, which also must have rows that line up with the rows in both A and B simultaneously:

class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    b_id = Column(ForeignKey('b.id'))

class B(Base):
    __tablename__ = 'b'

    id = Column(Integer, primary_key=True)

class C(Base):
    __tablename__ = 'c'

    id = Column(Integer, primary_key=True)
    a_id = Column(ForeignKey('a.id'))

class D(Base):
    __tablename__ = 'd'

    id = Column(Integer, primary_key=True)
    c_id = Column(ForeignKey('c.id'))
    b_id = Column(ForeignKey('b.id'))

# 1. set up the join() as a variable, so we can refer
# to it in the mapping multiple times.
j = join(B, D, D.b_id == B.id).join(C, C.id == D.c_id)

# 2. Create a new mapper() to B, with non_primary=True.
# Columns in the join with the same name must be
# disambiguated within the mapping, using named properties.
B_viacd = mapper(B, j, non_primary=True, properties={
    "b_id": [j.c.b_id, j.c.d_b_id],
    "d_id": j.c.d_id
    })

A.b = relationship(B_viacd, primaryjoin=A.b_id == B_viacd.c.b_id)

In the above case, our non-primary mapper for B will emit additional columns when we query; these can be ignored:

sess.query(A).join(A.b).all()

SELECT a.id AS a_id, a.b_id AS a_b_id
FROM a JOIN (b JOIN d ON d.b_id = b.id JOIN c ON c.id = d.c_id) ON a.b_id = b.id

Building Query-Enabled Properties¶

Very ambitious custom join conditions may fail to be directly persistable, and in some cases may not even load correctly. To remove the persistence part of the equation, use the flag viewonly on the relationship(), which establishes it as a read-only attribute (data written to the collection will be ignored on flush()). However, in extreme cases, consider using a regular Python property in conjunction with Query as follows:

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)

    def _get_addresses(self):
        return object_session(self).query(Address).with_parent(self).filter(...).all()
    addresses = property(_get_addresses)
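A fuller version of this pattern might look like the sketch below. Since the filter criteria above are elided, the sketch substitutes a hypothetical rule (email addresses ending in "@example.com") and a hypothetical property name, example_addresses. It also relates the rows with an explicit comparison on Address.user_id in place of with_parent(), so that no relationship() needs to be configured at all:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, object_session

Base = declarative_base()


class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('user.id'))
    email = Column(String)


class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)

    @property
    def example_addresses(self):
        # hypothetical criteria; a new query is emitted on every access,
        # and nothing assigned to the returned list is ever persisted
        return (object_session(self).query(Address)
                .filter(Address.user_id == self.id)
                .filter(Address.email.like('%@example.com'))
                .order_by(Address.id)
                .all())


engine = create_engine("sqlite://")  # assumed engine, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)

user = User()
session.add(user)
session.flush()  # assigns user.id

session.add_all([
    Address(user_id=user.id, email="ed@example.com"),
    Address(user_id=user.id, email="ed@elsewhere.org"),
])
session.commit()

matching_emails = [a.email for a in user.example_addresses]
print(matching_emails)
```

The property only works while the object is attached to a Session; object_session() returns None for a detached or transient instance.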

Rows that point to themselves / Mutually Dependent Rows¶

This is a very specific case where relationship() must perform an INSERT and a second UPDATE in order to properly populate a row (and vice versa an UPDATE and DELETE in order to delete without violating foreign key constraints). The two use cases are:

A table contains a foreign key to itself, and a single row will have a foreign key value pointing to its own primary key.
Two tables each contain a foreign key referencing the other table, with a row in each table referencing the other.

For example:

          user
---------------------------------
user_id    name   related_user_id
   1       'ed'          1

Or:

             widget                                                  entry
-------------------------------------------             ---------------------------------
widget_id     name        favorite_entry_id             entry_id      name      widget_id
   1       'somewidget'          5                         5       'someentry'     1

In the first case, a row points to itself. Technically, a database that uses sequences such as PostgreSQL or Oracle can INSERT the row at once using a previously generated value, but databases which rely upon autoincrement-style primary key identifiers cannot. The relationship() always assumes a “parent/child” model of row population during flush, so unless you are populating the primary key/foreign key columns directly, relationship() needs to use two statements.

In the second case, the “widget” row must be inserted before any referring “entry” rows, but then the “favorite_entry_id” column of that “widget” row cannot be set until the “entry” rows have been generated. In this case, it’s typically impossible to insert the “widget” and “entry” rows using just two INSERT statements; an UPDATE must be performed in order to keep foreign key constraints fulfilled. The exception is if the foreign keys are configured as “deferred until commit” (a feature some databases support) and if the identifiers were populated manually (again essentially bypassing relationship()).

To enable the usage of a supplementary UPDATE statement, we use the post_update option of relationship(). This specifies that the linkage between the two rows should be created using an UPDATE statement after both rows have been INSERTed; it also causes the rows to be de-associated with each other via UPDATE before a DELETE is emitted. The flag should be placed on just one of the relationships, preferably the many-to-one side. Below we illustrate a complete example, including two ForeignKey constructs, one of which specifies use_alter to help with emitting CREATE TABLE statements:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Entry(Base):
    __tablename__ = 'entry'
    entry_id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey('widget.widget_id'))
    name = Column(String(50))

class Widget(Base):
    __tablename__ = 'widget'

    widget_id = Column(Integer, primary_key=True)
    favorite_entry_id = Column(Integer,
                            ForeignKey('entry.entry_id',
                            use_alter=True,
                            name="fk_favorite_entry"))
    name = Column(String(50))

    entries = relationship(Entry, primaryjoin=
                                    widget_id==Entry.widget_id)
    favorite_entry = relationship(Entry,
                                primaryjoin=
                                    favorite_entry_id==Entry.entry_id,
                                post_update=True)

When a structure against the above configuration is flushed, the “widget” row will be INSERTed minus the “favorite_entry_id” value, then all the “entry” rows will be INSERTed referencing the parent “widget” row, and then an UPDATE statement will populate the “favorite_entry_id” column of the “widget” table (it’s one row at a time for the time being):

>>> w1 = Widget(name='somewidget')
>>> e1 = Entry(name='someentry')
>>> w1.favorite_entry = e1
>>> w1.entries = [e1]
>>> session.add_all([w1, e1])
>>> session.commit()
BEGIN (implicit)
INSERT INTO widget (favorite_entry_id, name) VALUES (?, ?)
(None, 'somewidget')
INSERT INTO entry (widget_id, name) VALUES (?, ?)
(1, 'someentry')
UPDATE widget SET favorite_entry_id=? WHERE widget.widget_id = ?
(1, 1)
COMMIT
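The statement ordering can also be checked programmatically. The sketch below repeats the Widget / Entry mapping against an in-memory SQLite database and records each statement through the engine's before_cursor_execute event. The engine, the listener and the dml summary list are assumptions added for illustration; on SQLite the use_alter constraint is not expected to be emitted, since that backend has no ALTER TABLE ... ADD CONSTRAINT:

```python
from sqlalchemy import (Column, ForeignKey, Integer, String, create_engine,
                        event)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

Base = declarative_base()


class Entry(Base):
    __tablename__ = 'entry'
    entry_id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey('widget.widget_id'))
    name = Column(String(50))


class Widget(Base):
    __tablename__ = 'widget'

    widget_id = Column(Integer, primary_key=True)
    favorite_entry_id = Column(Integer,
                               ForeignKey('entry.entry_id',
                                          use_alter=True,
                                          name="fk_favorite_entry"))
    name = Column(String(50))

    entries = relationship(Entry,
                           primaryjoin=widget_id == Entry.widget_id)
    favorite_entry = relationship(
        Entry,
        primaryjoin=favorite_entry_id == Entry.entry_id,
        post_update=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context,
                     executemany):
    # keep the SQL text of everything sent to the database from now on
    statements.append(statement)


session = Session(engine)
w1 = Widget(name='somewidget')
e1 = Entry(name='someentry')
w1.favorite_entry = e1
w1.entries = [e1]
session.add_all([w1, e1])
session.commit()

# reduce each INSERT / UPDATE to its first three words for comparison
dml = [" ".join(s.split()[:3]) for s in statements
       if s.startswith(("INSERT", "UPDATE"))]
print(dml)
```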

An additional configuration we can specify is to supply a more comprehensive foreign key constraint on Widget, such that it’s guaranteed that favorite_entry_id refers to an Entry that also refers to this Widget. We can use a composite foreign key, as illustrated below:

from sqlalchemy import Integer, ForeignKey, String, \
        Column, UniqueConstraint, ForeignKeyConstraint
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Entry(Base):
    __tablename__ = 'entry'
    entry_id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey('widget.widget_id'))
    name = Column(String(50))
    __table_args__ = (
        UniqueConstraint("entry_id", "widget_id"),
    )

class Widget(Base):
    __tablename__ = 'widget'

    widget_id = Column(Integer, autoincrement='ignore_fk', primary_key=True)
    favorite_entry_id = Column(Integer)

    name = Column(String(50))

    __table_args__ = (
        ForeignKeyConstraint(
            ["widget_id", "favorite_entry_id"],
            ["entry.widget_id", "entry.entry_id"],
            name="fk_favorite_entry", use_alter=True
        ),
    )

    entries = relationship(Entry, primaryjoin=
                                    widget_id==Entry.widget_id,
                                    foreign_keys=Entry.widget_id)
    favorite_entry = relationship(Entry,
                                primaryjoin=
                                    favorite_entry_id==Entry.entry_id,
                                foreign_keys=favorite_entry_id,
                                post_update=True)

The above mapping features a composite ForeignKeyConstraint bridging the widget_id and favorite_entry_id columns. To ensure that Widget.widget_id remains an “autoincrementing” column we specify autoincrement to the value "ignore_fk" on Column, and additionally on each relationship() we must limit those columns considered as part of the foreign key for the purposes of joining and cross-population.

Mutable Primary Keys / Update Cascades¶

When the primary key of an entity changes, related items which reference the primary key must also be updated. For databases which enforce referential integrity, it’s required to use the database’s ON UPDATE CASCADE functionality in order to propagate primary key changes to referenced foreign keys - the values cannot be out of sync at any moment.

For databases that don’t support this, such as SQLite and MySQL without their referential integrity options turned on, the passive_updates flag can be set to False, most preferably on a one-to-many or many-to-many relationship(), which instructs SQLAlchemy to issue UPDATE statements individually for objects referenced in the collection, loading them into memory if not already locally present. The passive_updates flag can also be False in conjunction with ON UPDATE CASCADE functionality, although in that case the unit of work will be issuing extra SELECT and UPDATE statements unnecessarily.

A typical mutable primary key setup might look like:

class User(Base):
    __tablename__ = 'user'

    username = Column(String(50), primary_key=True)
    fullname = Column(String(100))

    # passive_updates=False *only* needed if the database
    # does not implement ON UPDATE CASCADE
    addresses = relationship("Address", passive_updates=False)

class Address(Base):
    __tablename__ = 'address'

    email = Column(String(50), primary_key=True)
    username = Column(String(50),
                ForeignKey('user.username', onupdate="cascade")
            )

passive_updates is set to True by default, indicating that ON UPDATE CASCADE is expected to be in place in the usual case for foreign keys that expect to have a mutating parent key.

A passive_updates setting of False may be configured on any direction of relationship, i.e. one-to-many, many-to-one, and many-to-many, although it is much more effective when placed just on the one-to-many or many-to-many side. Configuring the passive_updates to False only on the many-to-one side will have only a partial effect, as the unit of work searches only through the current identity map for objects that may be referencing the one with a mutating primary key, not throughout the database.
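The effect of passive_updates=False on the one-to-many side can be observed on SQLite, which does not enforce foreign keys in its default configuration, so the ON UPDATE CASCADE clause has no effect there. The sketch below assumes an in-memory engine and sample values chosen for illustration. It renames a User and then reads the address table directly to confirm that the unit of work updated the referencing row:

```python
from sqlalchemy import Column, ForeignKey, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = 'user'

    username = Column(String(50), primary_key=True)
    fullname = Column(String(100))

    # the ORM, not the database, propagates primary key changes
    addresses = relationship("Address", passive_updates=False)


class Address(Base):
    __tablename__ = 'address'

    email = Column(String(50), primary_key=True)
    username = Column(String(50),
                      ForeignKey('user.username', onupdate="cascade"))


engine = create_engine("sqlite://")  # foreign keys not enforced by default
Base.metadata.create_all(engine)
session = Session(engine)

user = User(username='jack', fullname='Jack Jones')
user.addresses.append(Address(email='jack@example.com'))
session.add(user)
session.commit()

# change the primary key; the flush updates the user row and then
# issues an UPDATE for each Address in the collection
user.username = 'ed'
session.commit()

stored_addresses = [tuple(row) for row in
                    session.execute(Address.__table__.select())]
print(stored_addresses)
```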

Relationships API¶

sqlalchemy.orm.relationship(argument, secondary=None, primaryjoin=None, secondaryjoin=None, foreign_keys=None, uselist=None, order_by=False, backref=None, back_populates=None, post_update=False, cascade=False, extension=None, viewonly=False, lazy=True, collection_class=None, passive_deletes=False, passive_updates=True, remote_side=None, enable_typechecks=True, join_depth=None, comparator_factory=None, single_parent=False, innerjoin=False, distinct_target_key=None, doc=None, active_history=False, cascade_backrefs=True, load_on_pending=False, strategy_class=None, _local_remote_pairs=None, query_class=None, info=None)¶

Provide a relationship between two mapped classes.

This corresponds to a parent-child or associative table relationship. The constructed class is an instance of RelationshipProperty.

A typical relationship(), used in a classical mapping:

mapper(Parent, properties={
  'children': relationship(Child)
})

Some arguments accepted by relationship() optionally accept a callable function, which when called produces the desired value. The callable is invoked by the parent Mapper at “mapper initialization” time, which happens only when mappers are first used, and is assumed to be after all mappings have been constructed. This can be used to resolve order-of-declaration and other dependency issues, such as if Child is declared below Parent in the same file:

mapper(Parent, properties={
    "children":relationship(lambda: Child,
                        order_by=lambda: Child.id)
})

When using the Declarative extension, the Declarative initializer allows string arguments to be passed to relationship(). These string arguments are converted into callables that evaluate the string as Python code, using the Declarative class-registry as a namespace. This allows the lookup of related classes to be automatic via their string name, and removes the need to import related classes at all into the local module space:

from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", order_by="Child.id")
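The callable form works with Declarative as well. In the sketch below, Parent is declared before Child and refers to it only through lambdas, which are not called until the mappers are configured. The descending order_by, the explicit configure_mappers() call and the SQLite session are assumptions added so that the deferred evaluation has a visible effect:

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, configure_mappers, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)

    # Child is not defined yet when this line runs; the lambdas defer
    # the name lookup until mapper initialization
    children = relationship(lambda: Child,
                            order_by=lambda: Child.id.desc())


class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))


# normally triggered implicitly on first use of the mappers
configure_mappers()
target_class = Parent.children.property.mapper.class_

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

parent = Parent(children=[Child(), Child(), Child()])
session.add(parent)
session.commit()  # expires the collection so that it is re-loaded

# the re-load applies ORDER BY child.id DESC from the deferred order_by
child_ids = [c.id for c in parent.children]
print(target_class.__name__, child_ids)
```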

See also

Relationship Configuration - Full introductory and reference documentation for relationship().

Building a Relationship - ORM tutorial introduction.

Parameters:

argument¶ –
a mapped class, or actual Mapper instance, representing the target of the relationship.

argument may also be passed as a callable function which is evaluated at mapper initialization time, and may be passed as a Python-evaluable string when using Declarative.

See also

Configuring Relationships - further detail on relationship configuration when using Declarative.

secondary¶ –
for a many-to-many relationship, specifies the intermediary table, and is typically an instance of Table. In less common circumstances, the argument may also be specified as an Alias construct, or even a Join construct.

secondary may also be passed as a callable function which is evaluated at mapper initialization time. When using Declarative, it may also be a string argument noting the name of a Table that is present in the MetaData collection associated with the parent-mapped Table.

The secondary keyword argument is typically applied in the case where the intermediary Table is not otherwise expressed in any direct class mapping. If the “secondary” table is also explicitly mapped elsewhere (e.g. as in Association Object), one should consider applying the viewonly flag so that this relationship() is not used for persistence operations which may conflict with those of the association object pattern.
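
As a minimal sketch of the typical case, the following maps a many-to-many relationship through an association Table that has no class of its own. The table and class names, and the in-memory SQLite engine, are assumptions chosen for illustration only:

```python
from sqlalchemy import Table, Column, Integer, ForeignKey, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

# Illustrative association table; it is not mapped to any class.
association_table = Table(
    'association', Base.metadata,
    Column('parent_id', Integer, ForeignKey('parent.id'), primary_key=True),
    Column('child_id', Integer, ForeignKey('child.id'), primary_key=True),
)

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=association_table)

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)

engine = create_engine('sqlite://')  # in-memory database, for the sketch only
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(Parent(children=[Child(), Child()]))
session.commit()

# The relationship itself wrote the rows of the association table.
association_rows = session.execute(association_table.select()).fetchall()
```

Here the two association rows are inserted during the flush without the application ever touching association_table directly.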

See also

Many To Many - Reference example of “many to many”.

Building a Many To Many Relationship - ORM tutorial introduction to many-to-many relationships.

Self-Referential Many-to-Many Relationship - Specifics on using many-to-many in a self-referential case.

Configuring Many-to-Many Relationships - Additional options when using Declarative.

Association Object - an alternative to secondary when composing association table relationships, allowing additional attributes to be specified on the association table.

Composite “Secondary” Joins - a lesser-used pattern which in some cases can enable complex relationship() SQL conditions to be used.

New in version 0.9.2: secondary works more effectively when referring to a Join instance.
active_history=False¶ – When True, indicates that the “previous” value for a many-to-one reference should be loaded when replaced, if not already loaded. Normally, history tracking logic for simple many-to-ones only needs to be aware of the “new” value in order to perform a flush. This flag is available for applications that make use of attributes.get_history() which also need to know the “previous” value of the attribute.
backref¶ –
indicates the string name of a property to be placed on the related mapper’s class that will handle this relationship in the other direction. The other property will be created automatically when the mappers are configured. Can also be passed as a backref() object to control the configuration of the new relationship.

See also

Linking Relationships with Backref - Introductory documentation and examples.

back_populates - alternative form of backref specification.

backref() - allows control over relationship() configuration when using backref.
back_populates¶ –
Takes a string name and has the same meaning as backref, except the complementing property is not created automatically, and instead must be configured explicitly on the other mapper. The complementing property should also indicate back_populates to this relationship to ensure proper functioning.
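
A short sketch of the explicit form, assuming the same Parent and Child classes used in the basic patterns; each side names the attribute on the other side:

```python
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    # Names the complementing attribute on Child.
    children = relationship("Child", back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    # Must be declared explicitly; it is not created automatically.
    parent = relationship("Parent", back_populates="children")

parent = Parent()
child = Child()
parent.children.append(child)

# The append on one side is mirrored in Python on the other side.
mirrored = child.parent is parent
```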

See also

Linking Relationships with Backref - Introductory documentation and examples.

backref - alternative form of backref specification.
cascade¶ –
a comma-separated list of cascade rules which determines how Session operations should be “cascaded” from parent to child. This defaults to False, which means the default cascade should be used - this default cascade is "save-update, merge".

The available cascades are save-update, merge, expunge, delete, delete-orphan, and refresh-expire. An additional option, all indicates shorthand for "save-update, merge, refresh-expire, expunge, delete", and is often used as in "all, delete-orphan" to indicate that related objects should follow along with the parent object in all cases, and be deleted when de-associated.
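
The effect of "all, delete-orphan" can be sketched as follows. The in-memory SQLite engine and the variable names are assumptions made for the example:

```python
from sqlalchemy import Column, Integer, ForeignKey, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", cascade="all, delete-orphan")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# save-update: adding the parent also adds its children.
parent = Parent(children=[Child(), Child()])
session.add(parent)
session.commit()
count_after_insert = session.query(Child).count()

# delete-orphan: a child removed from the collection is deleted.
orphan = parent.children[0]
parent.children.remove(orphan)
session.commit()
count_after_orphan = session.query(Child).count()

# delete: deleting the parent deletes the remaining children.
session.delete(parent)
session.commit()
count_after_delete = session.query(Child).count()
```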

See also

Cascades - Full detail on each of the available cascade options.

Configuring delete/delete-orphan Cascade - Tutorial example describing a delete cascade.
cascade_backrefs=True¶ –
a boolean value indicating if the save-update cascade should operate along an assignment event intercepted by a backref. When set to False, the attribute managed by this relationship will not cascade an incoming transient object into the session of a persistent parent, if the event is received via backref.

See also

Controlling Cascade on Backrefs - Full discussion and examples on how the cascade_backrefs option is used.
collection_class¶ –
a class or callable that returns a new list-holding object. It will be used in place of a plain list for storing elements.
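
For instance, passing the built-in set makes the collection behave as a set rather than a list. This is a minimal sketch using illustrative Parent and Child classes:

```python
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    # The collection is created from set instead of list.
    children = relationship("Child", collection_class=set)

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

parent = Parent()
child = Child()
parent.children.add(child)
parent.children.add(child)  # adding the same object again has no effect

collection_size = len(parent.children)
```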

See also

Customizing Collection Access - Introductory documentation and examples.
comparator_factory¶ –
a class which extends RelationshipProperty.Comparator which provides custom SQL clause generation for comparison operations.

See also

PropComparator - some detail on redefining comparators at this level.

Operator Customization - Brief intro to this feature.
distinct_target_key=None¶ –
Indicate if a “subquery” eager load should apply the DISTINCT keyword to the innermost SELECT statement. When left as None, the DISTINCT keyword will be applied in those cases when the target columns do not comprise the full primary key of the target table. When set to True, the DISTINCT keyword is applied to the innermost SELECT unconditionally.

It may be desirable to set this flag to False when the DISTINCT is reducing performance of the innermost subquery beyond that of what duplicate innermost rows may be causing.

New in version 0.8.3: distinct_target_key allows the subquery eager loader to apply a DISTINCT modifier to the innermost SELECT.

Changed in version 0.9.0: distinct_target_key now defaults to None, so that the feature enables itself automatically for those cases where the innermost query targets a non-unique key.

See also

Relationship Loading Techniques - includes an introduction to subquery eager loading.
doc¶ – docstring which will be applied to the resulting descriptor.
extension¶ –
an AttributeExtension instance, or list of extensions, which will be prepended to the list of attribute listeners for the resulting descriptor placed on the class.

Deprecated since version 0.7: Please see AttributeEvents.
foreign_keys¶ –
a list of columns which are to be used as “foreign key” columns, or columns which refer to the value in a remote column, within the context of this relationship() object’s primaryjoin condition. That is, if the primaryjoin condition of this relationship() is a.id == b.a_id, and the values in b.a_id are required to be present in a.id, then the “foreign key” column of this relationship() is b.a_id.

In normal cases, the foreign_keys parameter is not required. relationship() will automatically determine which columns in the primaryjoin condition are to be considered “foreign key” columns based on those Column objects that specify ForeignKey, or are otherwise listed as referencing columns in a ForeignKeyConstraint construct. foreign_keys is only needed when:
1. There is more than one way to construct a join from the local table to the remote table, as there are multiple foreign key references present. Setting foreign_keys will limit the relationship() to consider just those columns specified here as “foreign”.
  
  Changed in version 0.8: A multiple-foreign key join ambiguity can be resolved by setting the foreign_keys parameter alone, without the need to explicitly set primaryjoin as well.
2. The Table being mapped does not actually have ForeignKey or ForeignKeyConstraint constructs present, often because the table was reflected from a database that does not support foreign key reflection (MySQL MyISAM).
3. The primaryjoin argument is used to construct a non-standard join condition, which makes use of columns or expressions that do not normally refer to their “parent” column, such as a join condition expressed by a complex comparison using a SQL function.
The relationship() construct will raise informative error messages that suggest the use of the foreign_keys parameter when presented with an ambiguous condition. In typical cases, if relationship() doesn’t raise any exceptions, the foreign_keys parameter is usually not needed.

foreign_keys may also be passed as a callable function which is evaluated at mapper initialization time, and may be passed as a Python-evaluable string when using Declarative.
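
The first case above, multiple foreign key paths to the same table, might look like the following sketch. The Customer and Address classes and their column names are hypothetical:

```python
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, configure_mappers

Base = declarative_base()

class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)
    # Two foreign keys to the same table make the join ambiguous.
    billing_address_id = Column(Integer, ForeignKey('address.id'))
    shipping_address_id = Column(Integer, ForeignKey('address.id'))

    # foreign_keys tells each relationship which column to join on.
    billing_address = relationship(
        "Address", foreign_keys=[billing_address_id])
    shipping_address = relationship(
        "Address", foreign_keys=[shipping_address_id])

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)

# Raises an error here if the relationships were still ambiguous.
configure_mappers()

billing_columns = sorted(
    c.name for c in Customer.billing_address.property.local_columns)
shipping_columns = sorted(
    c.name for c in Customer.shipping_address.property.local_columns)
```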

See also

Handling Multiple Join Paths

Creating Custom Foreign Conditions

foreign() - allows direct annotation of the “foreign” columns within a primaryjoin condition.

New in version 0.8: The foreign() annotation can also be applied directly to the primaryjoin expression, which is an alternate, more specific system of describing which columns in a particular primaryjoin should be considered “foreign”.
info¶ –
Optional data dictionary which will be populated into the MapperProperty.info attribute of this object.

New in version 0.8.
innerjoin=False¶ –
when True, joined eager loads will use an inner join to join against related tables instead of an outer join. The purpose of this option is generally one of performance, as inner joins generally perform better than outer joins.

This flag can be set to True when the relationship references an object via many-to-one using local foreign keys that are not nullable, or when the reference is one-to-one or a collection that is guaranteed to have one or at least one entry.

If the joined-eager load is chained onto an existing LEFT OUTER JOIN, innerjoin=True will be bypassed and the join will continue to chain as LEFT OUTER JOIN so that the results don’t change. As an alternative, specify the value "nested". This will instead nest the join on the right side, e.g. using the form “a LEFT OUTER JOIN (b JOIN c)”.

New in version 0.9.4: Added innerjoin="nested" option to support nesting of eager “inner” joins.

See also

What Kind of Loading to Use ? - Discussion of some details of various loader options.

joinedload.innerjoin - loader option version
join_depth¶ –
when non-None, an integer value indicating how many levels deep “eager” loaders should join on a self-referring or cyclical relationship. The number counts how many times the same Mapper shall be present in the loading condition along a particular join branch. When left at its default of None, eager loaders will stop chaining when they encounter the same target mapper which is already higher up in the chain. This option applies both to joined- and subquery- eager loaders.

See also

Configuring Self-Referential Eager Loading - Introductory documentation and examples.
lazy=’select’¶ –
specifies how the related items should be loaded. Default value is select. Values include:
- select - items should be loaded lazily when the property is first accessed, using a separate SELECT statement, or identity map fetch for simple many-to-one references.
- immediate - items should be loaded as the parents are loaded, using a separate SELECT statement, or identity map fetch for simple many-to-one references.
- joined - items should be loaded “eagerly” in the same query as that of the parent, using a JOIN or LEFT OUTER JOIN. Whether the join is “outer” or not is determined by the innerjoin parameter.
- subquery - items should be loaded “eagerly” as the parents are loaded, using one additional SQL statement, which issues a JOIN to a subquery of the original statement, for each collection requested.
- noload - no loading should occur at any time. This is to support “write-only” attributes, or attributes which are populated in some manner specific to the application.
- dynamic - the attribute will return a pre-configured Query object for all read operations, onto which further filtering operations can be applied before iterating the results. See the section Dynamic Relationship Loaders for more details.
- True - a synonym for ‘select’
- False - a synonym for ‘joined’
- None - a synonym for ‘noload’
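
The difference between the select and joined values can be observed with a sketch like the following. The Pet class, the in-memory SQLite engine and the variable names are assumptions made for illustration:

```python
from sqlalchemy import Column, Integer, ForeignKey, create_engine, inspect
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    # Loaded in the same SELECT as the parent, using a JOIN.
    children = relationship("Child", lazy='joined')
    # Loaded by a separate SELECT on first access (the default).
    pets = relationship("Pet", lazy='select')

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

class Pet(Base):
    __tablename__ = 'pet'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(Parent(children=[Child()], pets=[Pet()]))
session.commit()
session.expunge_all()  # start from an empty identity map

parent = session.query(Parent).one()
state = inspect(parent)
unloaded_before = set(state.unloaded)  # attributes with no loaded value

pets = parent.pets  # first access emits the lazy SELECT
unloaded_after = set(state.unloaded)
```
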
See also

Relationship Loading Techniques - Full documentation on relationship loader configuration.

Dynamic Relationship Loaders - detail on the dynamic option.
load_on_pending=False¶ –
Indicates loading behavior for transient or pending parent objects.

When set to True, causes the lazy-loader to issue a query for a parent object that is not persistent, meaning it has never been flushed. This may take effect for a pending object when autoflush is disabled, or for a transient object that has been “attached” to a Session but is not part of its pending collection.

The load_on_pending flag does not improve behavior when the ORM is used normally - object references should be constructed at the object level, not at the foreign key level, so that they are present in an ordinary way before a flush proceeds. This flag is not intended for general use.

See also

Session.enable_relationship_loading() - this method establishes “load on pending” behavior for the whole object, and also allows loading on objects that remain transient or detached.
order_by¶ –
indicates the ordering that should be applied when loading these items. order_by is expected to refer to one of the Column objects to which the target class is mapped, or the attribute itself bound to the target class which refers to the column.

order_by may also be passed as a callable function which is evaluated at mapper initialization time, and may be passed as a Python-evaluable string when using Declarative.
passive_deletes=False¶ –
Indicates loading behavior during delete operations.

A value of True indicates that unloaded child items should not be loaded during a delete operation on the parent. Normally, when a parent item is deleted, all child items are loaded so that they can either be marked as deleted, or have their foreign key to the parent set to NULL. Marking this flag as True usually implies an ON DELETE <CASCADE|SET NULL> rule is in place which will handle updating/deleting child rows on the database side.

Additionally, setting the flag to the string value ‘all’ will disable the “nulling out” of the child foreign keys, when there is no delete or delete-orphan cascade enabled. This is typically used when a triggering or error raise scenario is in place on the database side. Note that the foreign key attributes on in-session child objects will not be changed after a flush occurs so this is a very special use-case setting.
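
A sketch of passive_deletes=True combined with ON DELETE CASCADE follows. It assumes an in-memory SQLite database, where foreign key enforcement has to be switched on per connection; the listener function name is hypothetical:

```python
from sqlalchemy import Column, Integer, ForeignKey, create_engine, event
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", cascade="all", passive_deletes=True)

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    # The database, not the ORM, removes child rows with the parent.
    parent_id = Column(Integer, ForeignKey('parent.id', ondelete='CASCADE'))

engine = create_engine('sqlite://')

@event.listens_for(engine, 'connect')
def enable_sqlite_foreign_keys(dbapi_connection, connection_record):
    # SQLite enforces foreign keys only when this pragma is set.
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()

Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(Parent(children=[Child(), Child()]))
session.commit()
session.expunge_all()

# The children are not loaded here; only the parent row is deleted
# by the ORM, and the ON DELETE CASCADE rule handles the rest.
parent = session.query(Parent).one()
session.delete(parent)
session.commit()

remaining_children = session.query(Child).count()
```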

See also

Using Passive Deletes - Introductory documentation and examples.
passive_updates=True¶ –
Indicates loading and INSERT/UPDATE/DELETE behavior when the source of a foreign key value changes (i.e. an “on update” cascade), which are typically the primary key columns of the source row.

When True, it is assumed that ON UPDATE CASCADE is configured on the foreign key in the database, and that the database will handle propagation of an UPDATE from a source column to dependent rows. Note that with databases which enforce referential integrity (e.g. PostgreSQL, MySQL with InnoDB tables), ON UPDATE CASCADE is required for this operation. The relationship() will update the value of the attribute on related items which are locally present in the session during a flush.

When False, it is assumed that the database does not enforce referential integrity and will not be issuing its own CASCADE operation for an update. The relationship() will issue the appropriate UPDATE statements to the database in response to the change of a referenced key, and items locally present in the session during a flush will also be refreshed.

This flag should probably be set to False if primary key changes are expected and the database in use doesn’t support CASCADE (e.g. SQLite, MySQL MyISAM tables).

See also

Mutable Primary Keys / Update Cascades - Introductory documentation and examples.

mapper.passive_updates - a similar flag which takes effect for joined-table inheritance mappings.
post_update¶ –
this indicates that the relationship should be handled by a second UPDATE statement after an INSERT or before a DELETE. Currently, it also will issue an UPDATE after the instance was UPDATEd as well, although this technically should be improved. This flag is used to handle saving bi-directional dependencies between two individual rows (i.e. each row references the other), where it would otherwise be impossible to INSERT or DELETE both rows fully since one row exists before the other. Use this flag when a particular mapping arrangement will incur two rows that are dependent on each other, such as a table that has a one-to-many relationship to a set of child rows, and also has a column that references a single child row within that list (i.e. both tables contain a foreign key to each other). If a flush operation returns an error that a “cyclical dependency” was detected, this is a cue that you might want to use post_update to “break” the cycle.
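
The mutually dependent arrangement described above can be sketched as a mapping only; no tables are created here. The Widget and Entry names are illustrative, and a real schema with such a cycle may additionally need the use_alter option on one of the foreign keys:

```python
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, configure_mappers

Base = declarative_base()

class Entry(Base):
    __tablename__ = 'entry'
    id = Column(Integer, primary_key=True)
    widget_id = Column(Integer, ForeignKey('widget.id'))

class Widget(Base):
    __tablename__ = 'widget'
    id = Column(Integer, primary_key=True)
    # Points back at one particular row among the widget's entries.
    favorite_entry_id = Column(Integer, ForeignKey('entry.id'))

    # One-to-many: entry.widget_id refers to widget.id.
    entries = relationship(Entry, primaryjoin=id == Entry.widget_id)

    # Many-to-one: set by a second UPDATE once both rows exist.
    favorite = relationship(
        Entry,
        primaryjoin=favorite_entry_id == Entry.id,
        post_update=True)

configure_mappers()

entries_direction = Widget.entries.property.direction.name
favorite_direction = Widget.favorite.property.direction.name
```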

See also

Rows that point to themselves / Mutually Dependent Rows - Introductory documentation and examples.
primaryjoin¶ –
a SQL expression that will be used as the primary join of this child object against the parent object, or in a many-to-many relationship the join of the primary object to the association table. By default, this value is computed based on the foreign key relationships of the parent and child tables (or association table).

primaryjoin may also be passed as a callable function which is evaluated at mapper initialization time, and may be passed as a Python-evaluable string when using Declarative.

See also

Specifying Alternate Join Conditions
remote_side¶ –
used for self-referential relationships, indicates the column or list of columns that form the “remote side” of the relationship.

relationship.remote_side may also be passed as a callable function which is evaluated at mapper initialization time, and may be passed as a Python-evaluable string when using Declarative.
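
As a brief sketch, a self-referential many-to-one on a hypothetical Node class names its own primary key column as the remote side:

```python
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, configure_mappers

Base = declarative_base()

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node.id'))
    # id is the "remote" column, so this side is the many-to-one.
    parent = relationship("Node", remote_side=[id], backref="children")

configure_mappers()

root = Node()
leaf = Node()
leaf.parent = root

parent_direction = Node.parent.property.direction.name
```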

Changed in version 0.8: The remote() annotation can also be applied directly to the primaryjoin expression, which is an alternate, more specific system of describing which columns in a particular primaryjoin should be considered “remote”.

See also

Adjacency List Relationships - in-depth explanation of how remote_side is used to configure self-referential relationships.

remote() - an annotation function that accomplishes the same purpose as remote_side, typically when a custom primaryjoin condition is used.
query_class¶ –
a Query subclass that will be used as the base of the “appender query” returned by a “dynamic” relationship, that is, a relationship that specifies lazy="dynamic" or was otherwise constructed using the orm.dynamic_loader() function.

See also

Dynamic Relationship Loaders - Introduction to “dynamic” relationship loaders.
secondaryjoin¶ –
a SQL expression that will be used as the join of an association table to the child object. By default, this value is computed based on the foreign key relationships of the association and child tables.

secondaryjoin may also be passed as a callable function which is evaluated at mapper initialization time, and may be passed as a Python-evaluable string when using Declarative.

See also

Specifying Alternate Join Conditions
single_parent¶ –
when True, installs a validator which will prevent objects from being associated with more than one parent at a time. This is used for many-to-one or many-to-many relationships that should be treated either as one-to-one or one-to-many. Its usage is optional, except for relationship() constructs which are many-to-one or many-to-many and also specify the delete-orphan cascade option. The relationship() construct itself will raise an error indicating when this option is required.

See also

Cascades - includes detail on when the single_parent flag may be appropriate.
uselist¶ –
a boolean that indicates if this property should be loaded as a list or a scalar. In most cases, this value is determined automatically by relationship() at mapper configuration time, based on the type and direction of the relationship - one to many forms a list, many to one forms a scalar, many to many is a list. If a scalar is desired where normally a list would be present, such as a bi-directional one-to-one relationship, set uselist to False.

The uselist flag is also available on an existing relationship() construct as a read-only attribute, which can be used to determine if this relationship() deals with collections or scalar attributes:
```
>>> User.addresses.property.uselist
True
```
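
A minimal sketch of a bi-directional one-to-one using uselist=False, with illustrative Parent and Child classes:

```python
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, configure_mappers

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    # Would be a list by default; uselist=False makes it a scalar.
    child = relationship("Child", uselist=False, backref="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

configure_mappers()

parent = Parent()
child = Child()
parent.child = child  # plain assignment, no collection involved

is_list = Parent.child.property.uselist
```
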
See also

One To One - Introduction to the “one to one” relationship pattern, which is typically when the uselist flag is needed.
viewonly=False¶ – when set to True, the relationship is used only for loading objects, and not for any persistence operation. A relationship() which specifies viewonly can work with a wider range of SQL operations within the primaryjoin condition, including operations that feature the use of a variety of comparison operators as well as SQL functions such as cast(). The viewonly flag is also of general use when defining any kind of relationship() that doesn’t represent the full set of related objects, to prevent modifications of the collection from resulting in persistence operations.
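
A sketch of the last use mentioned above, a relationship that covers only part of the related objects. The active column, the active_children attribute and the in-memory SQLite engine are assumptions for the example:

```python
from sqlalchemy import Column, Integer, Boolean, ForeignKey, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    # Full collection, used for persistence.
    children = relationship("Child")
    # Filtered, load-only view of the same rows.
    active_children = relationship(
        "Child",
        primaryjoin="and_(Parent.id == Child.parent_id, "
                    "Child.active == True)",
        viewonly=True)

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    active = Column(Boolean, default=True)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(Parent(children=[Child(active=True), Child(active=False)]))
session.commit()
session.expunge_all()

parent = session.query(Parent).one()
all_count = len(parent.children)
active_count = len(parent.active_children)
```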

sqlalchemy.orm.backref(name, **kwargs)¶

Create a back reference with explicit keyword arguments, which are the same arguments one can send to relationship().

Used with the backref keyword argument to relationship() in place of a string argument, e.g.:

'items':relationship(
    SomeItem, backref=backref('parent', lazy='subquery'))

sqlalchemy.orm.relation(*arg, **kw)¶: A synonym for relationship().

sqlalchemy.orm.dynamic_loader(argument, **kw)¶

Construct a dynamically-loading mapper property.

This is essentially the same as using the lazy='dynamic' argument with relationship():

dynamic_loader(SomeClass)

# is the same as

relationship(SomeClass, lazy="dynamic")

See the section Dynamic Relationship Loaders for more details on dynamic loading.

sqlalchemy.orm.foreign(expr)¶

Annotate a portion of a primaryjoin expression with a ‘foreign’ annotation.

See the section Creating Custom Foreign Conditions for a description of use.
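
As a hedged sketch, foreign() can mark the referencing column when the table carries no ForeignKey construct at all. The classes and the in-memory SQLite engine below are illustrative assumptions:

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, foreign, sessionmaker

Base = declarative_base()

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    # Deliberately declared without ForeignKey.
    parent_id = Column(Integer)

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    # foreign() marks child.parent_id as the referencing column.
    children = relationship(
        Child, primaryjoin=id == foreign(Child.parent_id))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

parent = Parent()
child = Child()
parent.children.append(child)
session.add(parent)
session.commit()

# The flush copied parent.id into the annotated column.
keys_match = child.parent_id == parent.id
```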

New in version 0.8.

SQLAlchemy 0.9 Documentation
