1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309
|
=================================
Additional Persistence Techniques
=================================
.. _flush_embedded_sql_expressions:
Embedding SQL Insert/Update Expressions into a Flush
=====================================================
This feature allows the value of a database column to be set to a SQL
expression instead of a literal value. It's especially useful for atomic
updates, calling stored procedures, etc. All you do is assign an expression to
an attribute::
class SomeClass(object):
pass
mapper(SomeClass, some_table)
someobject = session.query(SomeClass).get(5)
# set 'value' attribute to a SQL expression adding one
someobject.value = some_table.c.value + 1
# issues "UPDATE some_table SET value=value+1"
session.commit()
This technique works both for INSERT and UPDATE statements. After the
flush/commit operation, the ``value`` attribute on ``someobject`` above is
expired, so that when next accessed the newly generated value will be loaded
from the database.
.. _session_sql_expressions:
Using SQL Expressions with Sessions
====================================
SQL expressions and strings can be executed via the
:class:`~sqlalchemy.orm.session.Session` within its transactional context.
This is most easily accomplished using the
:meth:`~.Session.execute` method, which returns a
:class:`~sqlalchemy.engine.ResultProxy` in the same manner as an
:class:`~sqlalchemy.engine.Engine` or
:class:`~sqlalchemy.engine.Connection`::
Session = sessionmaker(bind=engine)
session = Session()
# execute a string statement
result = session.execute("select * from table where id=:id", {'id':7})
# execute a SQL expression construct
result = session.execute(select([mytable]).where(mytable.c.id==7))
The current :class:`~sqlalchemy.engine.Connection` held by the
:class:`~sqlalchemy.orm.session.Session` is accessible using the
:meth:`~.Session.connection` method::
connection = session.connection()
The examples above deal with a :class:`~sqlalchemy.orm.session.Session` that's
bound to a single :class:`~sqlalchemy.engine.Engine` or
:class:`~sqlalchemy.engine.Connection`. To execute statements using a
:class:`~sqlalchemy.orm.session.Session` which is bound either to multiple
engines, or none at all (i.e. relies upon bound metadata), both
:meth:`~.Session.execute` and
:meth:`~.Session.connection` accept a ``mapper`` keyword
argument, which is passed a mapped class or
:class:`~sqlalchemy.orm.mapper.Mapper` instance, which is used to locate the
proper context for the desired engine::
Session = sessionmaker()
session = Session()
# need to specify mapper or class when executing
result = session.execute("select * from table where id=:id", {'id':7}, mapper=MyMappedClass)
result = session.execute(select([mytable], mytable.c.id==7), mapper=MyMappedClass)
connection = session.connection(MyMappedClass)
.. _session_partitioning:
Partitioning Strategies
=======================
Simple Vertical Partitioning
----------------------------
Vertical partitioning places different kinds of objects, or different tables,
across multiple databases::
engine1 = create_engine('postgresql://db1')
engine2 = create_engine('postgresql://db2')
Session = sessionmaker(twophase=True)
# bind User operations to engine 1, Account operations to engine 2
Session.configure(binds={User:engine1, Account:engine2})
session = Session()
Above, operations against either class will make usage of the :class:`.Engine`
linked to that class. Upon a flush operation, similar rules take place
to ensure each class is written to the right database.
The transactions among the multiple databases can optionally be coordinated
via two phase commit, if the underlying backend supports it. See
:ref:`session_twophase` for an example.
Custom Vertical Partitioning
----------------------------
More comprehensive rule-based class-level partitioning can be built by
overriding the :meth:`.Session.get_bind` method. Below we illustrate
a custom :class:`.Session` which delivers the following rules:
1. Flush operations are delivered to the engine named ``master``.
2. Operations on objects that subclass ``MyOtherClass`` all
occur on the ``other`` engine.
3. Read operations for all other classes occur on a random
choice of the ``slave1`` or ``slave2`` database.
::
engines = {
'master':create_engine("sqlite:///master.db"),
'other':create_engine("sqlite:///other.db"),
'slave1':create_engine("sqlite:///slave1.db"),
'slave2':create_engine("sqlite:///slave2.db"),
}
from sqlalchemy.orm import Session, sessionmaker
import random
class RoutingSession(Session):
def get_bind(self, mapper=None, clause=None):
if mapper and issubclass(mapper.class_, MyOtherClass):
return engines['other']
elif self._flushing:
return engines['master']
else:
return engines[
random.choice(['slave1','slave2'])
]
The above :class:`.Session` class is plugged in using the ``class_``
argument to :class:`.sessionmaker`::
Session = sessionmaker(class_=RoutingSession)
This approach can be combined with multiple :class:`.MetaData` objects,
using an approach such as that of using the declarative ``__abstract__``
keyword, described at :ref:`declarative_abstract`.
Horizontal Partitioning
-----------------------
Horizontal partitioning partitions the rows of a single table (or a set of
tables) across multiple databases.
See the "sharding" example: :ref:`examples_sharding`.
.. _bulk_operations:
Bulk Operations
===============
.. note:: Bulk Operations mode is a new series of operations made available
on the :class:`.Session` object for the purpose of invoking INSERT and
UPDATE statements with greatly reduced Python overhead, at the expense
of much less functionality, automation, and error checking.
As of SQLAlchemy 1.0, these features should be considered as "beta", and
additionally are intended for advanced users.
.. versionadded:: 1.0.0
Bulk operations on the :class:`.Session` include :meth:`.Session.bulk_save_objects`,
:meth:`.Session.bulk_insert_mappings`, and :meth:`.Session.bulk_update_mappings`.
The purpose of these methods is to directly expose internal elements of the unit of work system,
such that facilities for emitting INSERT and UPDATE statements given dictionaries
or object states can be utilized alone, bypassing the normal unit of work
mechanics of state, relationship and attribute management. The advantages
to this approach is strictly one of reduced Python overhead:
* The flush() process, including the survey of all objects, their state,
their cascade status, the status of all objects associated with them
via :func:`.relationship`, and the topological sort of all operations to
be performed is completely bypassed. This reduces a great amount of
Python overhead.
* The objects as given have no defined relationship to the target
:class:`.Session`, even when the operation is complete, meaning there's no
overhead in attaching them or managing their state in terms of the identity
map or session.
* The :meth:`.Session.bulk_insert_mappings` and :meth:`.Session.bulk_update_mappings`
methods accept lists of plain Python dictionaries, not objects; this further
reduces a large amount of overhead associated with instantiating mapped
objects and assigning state to them, which normally is also subject to
expensive tracking of history on a per-attribute basis.
* The set of objects passed to all bulk methods are processed
in the order they are received. In the case of
:meth:`.Session.bulk_save_objects`, when objects of different types are passed,
the INSERT and UPDATE statements are necessarily broken up into per-type
groups. In order to reduce the number of batch INSERT or UPDATE statements
passed to the DBAPI, ensure that the incoming list of objects
are grouped by type.
* The process of fetching primary keys after an INSERT also is disabled by
default. When performed correctly, INSERT statements can now more readily
be batched by the unit of work process into ``executemany()`` blocks, which
perform vastly better than individual statement invocations.
* UPDATE statements can similarly be tailored such that all attributes
are subject to the SET clase unconditionally, again making it much more
likely that ``executemany()`` blocks can be used.
The performance behavior of the bulk routines should be studied using the
:ref:`examples_performance` example suite. This is a series of example
scripts which illustrate Python call-counts across a variety of scenarios,
including bulk insert and update scenarios.
.. seealso::
:ref:`examples_performance` - includes detailed examples of bulk operations
contrasted against traditional Core and ORM methods, including performance
metrics.
Usage
-----
The methods each work in the context of the :class:`.Session` object's
transaction, like any other::
s = Session()
objects = [
User(name="u1"),
User(name="u2"),
User(name="u3")
]
s.bulk_save_objects(objects)
For :meth:`.Session.bulk_insert_mappings`, and :meth:`.Session.bulk_update_mappings`,
dictionaries are passed::
s.bulk_insert_mappings(User,
[dict(name="u1"), dict(name="u2"), dict(name="u3")]
)
.. seealso::
:meth:`.Session.bulk_save_objects`
:meth:`.Session.bulk_insert_mappings`
:meth:`.Session.bulk_update_mappings`
Comparison to Core Insert / Update Constructs
---------------------------------------------
The bulk methods offer performance that under particular circumstances
can be close to that of using the core :class:`.Insert` and
:class:`.Update` constructs in an "executemany" context (for a description
of "executemany", see :ref:`execute_multiple` in the Core tutorial).
In order to achieve this, the
:paramref:`.Session.bulk_insert_mappings.return_defaults`
flag should be disabled so that rows can be batched together. The example
suite in :ref:`examples_performance` should be carefully studied in order
to gain familiarity with how fast bulk performance can be achieved.
ORM Compatibility
-----------------
The bulk insert / update methods lose a significant amount of functionality
versus traditional ORM use. The following is a listing of features that
are **not available** when using these methods:
* persistence along :func:`.relationship` linkages
* sorting of rows within order of dependency; rows are inserted or updated
directly in the order in which they are passed to the methods
* Session-management on the given objects, including attachment to the
session, identity map management.
* Functionality related to primary key mutation, ON UPDATE cascade
* SQL expression inserts / updates (e.g. :ref:`flush_embedded_sql_expressions`)
* ORM events such as :meth:`.MapperEvents.before_insert`, etc. The bulk
session methods have no event support.
Features that **are available** include:
* INSERTs and UPDATEs of mapped objects
* Version identifier support
* Multi-table mappings, such as joined-inheritance - however, an object
to be inserted across multiple tables either needs to have primary key
identifiers fully populated ahead of time, else the
:paramref:`.Session.bulk_save_objects.return_defaults` flag must be used,
which will greatly reduce the performance benefits
|