Connections / Engines
=====================
.. contents::
:local:
:class: faq
:backlinks: none
How do I configure logging?
---------------------------
See :ref:`dbengine_logging`.
How do I pool database connections? Are my connections pooled?
----------------------------------------------------------------
SQLAlchemy performs application-level connection pooling automatically
in most cases. With the exception of SQLite, a :class:`_engine.Engine` object
refers to a :class:`.QueuePool` as a source of connectivity.
For more detail, see :ref:`engines_toplevel` and :ref:`pooling_toplevel`.
How do I pass custom connect arguments to my database API?
----------------------------------------------------------
The :func:`_sa.create_engine` call accepts additional arguments either
directly via the ``connect_args`` keyword argument::
e = create_engine(
"mysql://scott:tiger@localhost/test", connect_args={"encoding": "utf8"}
)
Or for basic string and integer arguments, they can usually be specified
in the query string of the URL::
e = create_engine("mysql://scott:tiger@localhost/test?encoding=utf8")
.. seealso::
:ref:`custom_dbapi_args`
"MySQL Server has gone away"
----------------------------
The primary cause of this error is that the MySQL connection has timed out
and has been closed by the server. The MySQL server closes connections
that have been idle for a period of time, which defaults to eight hours.
To accommodate this, the immediate fix is to enable the
:paramref:`_sa.create_engine.pool_recycle` setting, which will ensure that a
connection which is older than a set number of seconds will be discarded
and replaced with a new connection when it is next checked out.
For the more general case of accommodating database restarts and other
temporary loss of connectivity due to network issues, connections that
are in the pool may be recycled in response to more generalized disconnect
detection techniques. The section :ref:`pool_disconnects` provides
background on both "pessimistic" (e.g. pre-ping) and "optimistic"
(e.g. graceful recovery) techniques. Modern SQLAlchemy tends to favor
the "pessimistic" approach.
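
As a minimal sketch of the two settings discussed above, the following combines ``pool_recycle`` with the "pessimistic" ``pool_pre_ping`` option. The helper name ``make_resilient_engine`` is hypothetical, the one-hour recycle interval is an arbitrary choice below MySQL's default timeout, and an in-memory SQLite URL is used only so that the snippet runs without a server; a MySQL URL would be substituted in practice.

```python
from sqlalchemy import create_engine, text


def make_resilient_engine(url):
    """Hypothetical helper: build an Engine with both disconnect safeguards."""
    return create_engine(
        url,
        # test each connection with a lightweight "ping" upon checkout
        pool_pre_ping=True,
        # discard connections older than one hour (assumed value) on checkout
        pool_recycle=3600,
    )


# SQLite in-memory URL is a stand-in for e.g. "mysql://scott:tiger@localhost/test"
engine = make_resilient_engine("sqlite://")

with engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()
```

The recycle interval only needs to be shorter than the server-side idle timeout; pre-ping additionally covers server restarts that happen well before that interval elapses.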
.. seealso::
:ref:`pool_disconnects`
.. _mysql_sync_errors:
"Commands out of sync; you can't run this command now" / "This result object does not return rows. It has been closed automatically"
------------------------------------------------------------------------------------------------------------------------------------
The MySQL drivers have a fairly wide class of failure modes whereby the
connection to the server is left in an invalid state. Typically, when the connection
is used again, one of these two error messages will occur. The reason is that
the server has been placed into a state that the client library
does not expect, such that when the client library emits a new statement
on the connection, the server does not respond as expected.
In SQLAlchemy, because database connections are pooled, the issue of the messaging
being out of sync on a connection becomes more important: if an operation
fails and leaves the connection in an unusable state, a connection that goes back into the
connection pool will malfunction when checked out again. The mitigation
for this issue is that the connection is **invalidated** when such a failure
mode occurs so that the underlying database connection to MySQL is discarded.
This invalidation occurs automatically for many known failure modes and can
also be called explicitly via the :meth:`_engine.Connection.invalidate` method.
There is also a second class of failure modes within this category where a context manager
such as ``with session.begin_nested():`` wants to "roll back" the transaction
when an error occurs; however within some failure modes of the connection, the
rollback itself (which can also be a RELEASE SAVEPOINT operation) also
fails, causing misleading stack traces.
Originally, the cause of this error was fairly simple: it meant that
a multithreaded program was invoking commands on a single connection from more
than one thread. This applied to the original "MySQLdb" native-C driver that was
pretty much the only driver in use. However, with the introduction of pure Python
drivers like PyMySQL and MySQL-connector-Python, as well as increased use of
tools such as gevent/eventlet, multiprocessing (often with Celery), and others,
there is a whole series of factors that has been known to cause this problem, some of
which have been improved across SQLAlchemy versions but others which are unavoidable:
* **Sharing a connection among threads** - This is the original reason these kinds
of errors occurred. A program used the same connection in two or more threads at
the same time, meaning multiple sets of messages got mixed up on the connection,
putting the server-side session into a state that the client no longer knows how
to interpret. However, other causes are usually more likely today.
* **Sharing the filehandle for the connection among processes** - This usually occurs
when a program uses ``os.fork()`` to spawn a new process, and a TCP connection
that is present in the parent process gets shared into one or more child processes.
As multiple processes are now emitting messages to essentially the same filehandle,
the server receives interleaved messages and breaks the state of the connection.
This scenario can occur very easily if a program uses Python's "multiprocessing"
module and makes use of an :class:`_engine.Engine` that was created in the parent
process. It's common that "multiprocessing" is in use when using tools like
Celery. The correct approach should be either that a new :class:`_engine.Engine`
is produced when a child process first starts, discarding any :class:`_engine.Engine`
that came down from the parent process; or, the :class:`_engine.Engine` that's inherited
from the parent process can have its internal pool of connections disposed by
calling :meth:`_engine.Engine.dispose`.
* **Greenlet Monkeypatching w/ Exits** - When using a library like gevent or eventlet
that monkeypatches the Python networking API, libraries like PyMySQL are now
working in an asynchronous mode of operation, even though they are not developed
explicitly against this model. A common issue is that a greenthread is interrupted,
often due to timeout logic in the application. This results in the ``GreenletExit``
exception being raised, and the pure-Python MySQL driver is interrupted from
its work, which may have been that it was receiving a response from the server
or preparing to otherwise reset the state of the connection. When the exception
cuts all that work short, the conversation between client and server is now
out of sync and subsequent usage of the connection may fail. SQLAlchemy
as of version 1.1.0 guards against this: if a database operation
is interrupted by a so-called "exit exception", which includes ``GreenletExit``
and any other subclass of Python ``BaseException`` that is not also a subclass
of ``Exception``, the connection is invalidated.
* **Rollbacks / SAVEPOINT releases failing** - Some classes of error cause
the connection to be unusable within the context of a transaction, as well
as when operating in a "SAVEPOINT" block. In these cases, the failure
on the connection has rendered any SAVEPOINT as no longer existing, yet
when SQLAlchemy, or the application, attempts to "roll back" this savepoint,
the "RELEASE SAVEPOINT" operation fails, typically with a message like
"savepoint does not exist". In this case, under Python 3 there will be
a chain of exceptions output, where the ultimate "cause" of the error
will be displayed as well. Under Python 2, there are no "chained" exceptions,
however recent versions of SQLAlchemy will attempt to emit a warning
illustrating the original failure cause, while still throwing the
immediate error which is the failure of the ROLLBACK.
.. _faq_execute_retry:
How Do I "Retry" a Statement Execution Automatically?
-------------------------------------------------------
The documentation section :ref:`pool_disconnects` discusses the strategies
available for pooled connections that have been disconnected since the last
time a particular connection was checked out. The most modern feature
in this regard is the :paramref:`_sa.create_engine.pool_pre_ping` parameter, which
causes a "ping" to be emitted on a database connection when it's retrieved
from the pool, reconnecting if the current connection has been disconnected.
It's important to note that this "ping" is only emitted **before** the
connection is actually used for an operation. Once the connection is
delivered to the caller, per the Python :term:`DBAPI` specification it is now
subject to an **autobegin** operation, which means it will automatically BEGIN
a new transaction when it is first used that remains in effect for subsequent
statements, until the DBAPI-level ``connection.commit()`` or
``connection.rollback()`` method is invoked.
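
The autobegin behavior described above can be observed directly on a DBAPI connection. The sketch below uses the standard library ``sqlite3`` driver only because it needs no server; note as an assumption of this example that ``sqlite3`` in its default mode begins the implicit transaction on the first data-modifying statement, and other drivers may differ in exactly when BEGIN is emitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()

# no statement has modified data yet, so no transaction is open
before = conn.in_transaction

# the driver implicitly begins a transaction for this INSERT
conn.execute("INSERT INTO t (x) VALUES (1)")
during = conn.in_transaction

# the transaction stays open until commit() or rollback() is called
conn.rollback()
after = conn.in_transaction

# the rolled-back INSERT left nothing behind
remaining = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
conn.close()
```

The same principle is why a lost connection loses everything since the implicit BEGIN: the uncommitted work exists only within that connection's transaction.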
In modern use of SQLAlchemy, a series of SQL statements are always invoked
within this transactional state, assuming
:ref:`DBAPI autocommit mode <dbapi_autocommit>` is not enabled (more on that in
the next section), meaning that no single statement is automatically committed;
if an operation fails, the effects of all statements within the current
transaction will be lost.
The implication that this has for the notion of "retrying" a statement is that
in the default case, when a connection is lost, **the entire transaction is
lost**. There is no useful way that the database can "reconnect and retry" and
continue where it left off, since data is already lost. For this reason,
SQLAlchemy does not have a transparent "reconnection" feature that works
mid-transaction, for the case when the database connection has disconnected
while being used. The canonical approach to dealing with mid-operation
disconnects is to **retry the entire operation from the start of the
transaction**, often by using a custom Python decorator that will
"retry" a particular function several times until it succeeds, or to otherwise
architect the application in such a way that it is resilient against
transactions that are dropped that then cause operations to fail.
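
A custom retry decorator of the kind mentioned above might look like the following minimal sketch. Everything here is hypothetical: the name ``retry_transaction``, its parameters, and the use of the built-in ``ConnectionError`` as the retryable exception. In a SQLAlchemy application the ``retry_on`` tuple would more likely name a database error class, and the decorated function would open, use and commit its own transaction so that each attempt starts from scratch.

```python
import functools
import time


def retry_transaction(num_retries=3, retry_interval=0.0, retry_on=(ConnectionError,)):
    """Re-run the whole decorated function when a retryable error occurs."""

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(num_retries + 1):
                try:
                    # the function is expected to begin and commit its own
                    # transaction, so every attempt is a complete unit of work
                    return fn(*args, **kwargs)
                except retry_on:
                    if attempt >= num_retries:
                        raise  # out of attempts; surface the error
                    time.sleep(retry_interval)

        return wrapper

    return decorate


attempts = []


@retry_transaction(num_retries=2)
def transfer_funds():
    """Stand-in for a transactional operation that fails twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated disconnect")
    return "committed"


outcome = transfer_funds()
```

Because the retry wraps the entire function rather than a single statement, no partially applied work is carried over between attempts.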
There is also the notion of extensions that can keep track of all of the
statements that have proceeded within a transaction and then replay them all in
a new transaction in order to approximate a "retry" operation. SQLAlchemy's
:ref:`event system <core_event_toplevel>` does allow such a system to be
constructed, however this approach is also not generally useful as there is
no way to guarantee that those
:term:`DML` statements will be working against the same state, as once a
transaction has ended the state of the database in a new transaction may be
totally different. Architecting "retry" explicitly into the application
at the points at which transactional operations begin and commit remains
the better approach since the application-level transactional methods are
the ones that know best how to re-run their steps.
Otherwise, if SQLAlchemy were to provide a feature that transparently and
silently "reconnected" a connection mid-transaction, the effect would be that
data is silently lost. By trying to hide the problem, SQLAlchemy would make
the situation much worse.
However, if we are **not** using transactions, then there are more options
available, as the next section describes.
.. _faq_execute_retry_autocommit:
Using DBAPI Autocommit Allows for a Readonly Version of Transparent Reconnect
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
With the rationale for not having a transparent reconnection mechanism stated,
the preceding section rests upon the assumption that the application is in
fact using DBAPI-level transactions. As most DBAPIs now offer :ref:`native
"autocommit" settings <dbapi_autocommit>`, we can make use of these features to
provide a limited form of transparent reconnect for **read only,
autocommit only operations**. A transparent statement retry may be applied to
the ``cursor.execute()`` method of the DBAPI, however it is still not safe to
apply to the ``cursor.executemany()`` method of the DBAPI, as the statement may
have consumed any portion of the arguments given.
.. warning:: The following recipe should **not** be used for operations that
write data. Users should carefully read and understand how the recipe
works and test failure modes very carefully against the specifically
targeted DBAPI driver before making production use of this recipe.
The retry mechanism does not guarantee prevention of disconnection errors
in all cases.
A simple retry mechanism may be applied to the DBAPI level ``cursor.execute()``
method by making use of the :meth:`_events.DialectEvents.do_execute` and
:meth:`_events.DialectEvents.do_execute_no_params` hooks, which will be able to
intercept disconnections during statement executions. It will **not**
intercept connection failures during result set fetch operations, for those
DBAPIs that don't fully buffer result sets. The recipe requires that the
database support DBAPI level autocommit and is **not guaranteed** for
particular backends. A single function ``reconnecting_engine()`` is presented
which applies the event hooks to a given :class:`_engine.Engine` object,
returning an always-autocommit version that enables DBAPI-level autocommit.
A connection will transparently reconnect for single-parameter and no-parameter
statement executions::

    import time

    from sqlalchemy import event


    def reconnecting_engine(engine, num_retries, retry_interval):
        def _run_with_retries(fn, context, cursor_obj, statement, *arg, **kw):
            for retry in range(num_retries + 1):
                try:
                    fn(cursor_obj, statement, context=context, *arg)
                except engine.dialect.dbapi.Error as raw_dbapi_err:
                    connection = context.root_connection
                    if engine.dialect.is_disconnect(raw_dbapi_err, connection, cursor_obj):
                        # give up once the final permitted attempt has failed
                        if retry >= num_retries:
                            raise
                        engine.logger.error(
                            "disconnection error, retrying operation",
                            exc_info=True,
                        )
                        connection.invalidate()

                        # use SQLAlchemy 2.0 API if available
                        if hasattr(connection, "rollback"):
                            connection.rollback()
                        else:
                            trans = connection.get_transaction()
                            if trans:
                                trans.rollback()

                        time.sleep(retry_interval)
                        context.cursor = cursor_obj = connection.connection.cursor()
                    else:
                        raise
                else:
                    # tell SQLAlchemy the statement was already executed
                    return True

        e = engine.execution_options(isolation_level="AUTOCOMMIT")

        @event.listens_for(e, "do_execute_no_params")
        def do_execute_no_params(cursor_obj, statement, context):
            return _run_with_retries(
                context.dialect.do_execute_no_params, context, cursor_obj, statement
            )

        @event.listens_for(e, "do_execute")
        def do_execute(cursor_obj, statement, parameters, context):
            return _run_with_retries(
                context.dialect.do_execute, context, cursor_obj, statement, parameters
            )

        return e

Given the above recipe, a reconnection mid-transaction may be demonstrated
using the following proof of concept script. Once run, it will emit a
``SELECT 1`` statement to the database every five seconds::

    import time

    from sqlalchemy import create_engine
    from sqlalchemy import select

    # assumes reconnecting_engine() from the recipe above is defined in this module

    if __name__ == "__main__":

        def do_a_thing(engine):
            with engine.begin() as conn:
                while True:
                    print("ping: %s" % conn.execute(select(1)).scalar())
                    time.sleep(5)

        e = reconnecting_engine(
            create_engine("mysql://scott:tiger@localhost/test", echo_pool=True),
            num_retries=5,
            retry_interval=2,
        )

        do_a_thing(e)

Restart the database while the script runs to demonstrate the transparent
reconnect operation::
$ python reconnect_test.py
ping: 1
ping: 1
disconnection error, retrying operation
Traceback (most recent call last):
...
MySQLdb._exceptions.OperationalError: (2006, 'MySQL server has gone away')
2020-10-19 16:16:22,624 INFO sqlalchemy.pool.impl.QueuePool Invalidate connection <_mysql.connection open to 'localhost' at 0xf59240>
ping: 1
ping: 1
...
.. versionadded:: 1.4 the above recipe makes use of 1.4-specific behaviors and will
   not work as given on previous SQLAlchemy versions.
The above recipe is tested for SQLAlchemy 1.4.
Why does SQLAlchemy issue so many ROLLBACKs?
--------------------------------------------
SQLAlchemy currently assumes DBAPI connections are in "non-autocommit" mode -
this is the default behavior of the Python database API, meaning it
must be assumed that a transaction is always in progress. The
connection pool issues ``connection.rollback()`` when a connection is returned.
This is so that any transactional resources remaining on the connection are
released. On a database like PostgreSQL or MSSQL where table resources are
aggressively locked, this is critical so that rows and tables don't remain
locked within connections that are no longer in use. An application can
otherwise hang. It's not just for locks, however, and is equally critical on
any database that has any kind of transaction isolation, including MySQL with
InnoDB. Any connection that is still inside an old transaction will return
stale data, if that data was already queried on that connection within
isolation. For background on why you might see stale data even on MySQL, see
https://dev.mysql.com/doc/refman/5.1/en/innodb-transaction-model.html
I'm on MyISAM - how do I turn it off?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The connection pool's behavior when a connection is returned can be
configured using ``reset_on_return``, which is passed to
:func:`_sa.create_engine` as ``pool_reset_on_return``::

    from sqlalchemy import create_engine

    engine = create_engine(
        "mysql://scott:tiger@localhost/myisam_database",
        pool_reset_on_return=None,
    )

I'm on SQL Server - how do I turn those ROLLBACKs into COMMITs?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``reset_on_return`` accepts the values ``commit``, ``rollback`` in addition
to ``True``, ``False``, and ``None``. Setting to ``commit`` will cause
a COMMIT as any connection is returned to the pool::

    from sqlalchemy import create_engine

    engine = create_engine(
        "mssql://scott:tiger@mydsn", pool_reset_on_return="commit"
    )

I am using multiple connections with a SQLite database (typically to test transaction operation), and my test program is not working!
----------------------------------------------------------------------------------------------------------------------------------------------------------
If using a SQLite ``:memory:`` database, or a version of SQLAlchemy prior
to version 0.7, the default connection pool is the :class:`.SingletonThreadPool`,
which maintains exactly one SQLite connection per thread. So two
connections in use in the same thread will actually be the same SQLite
connection. Make sure you're not using a ``:memory:`` database and
use :class:`.NullPool`, which is the default for non-memory databases in
current SQLAlchemy versions.
.. seealso::
:ref:`pysqlite_threading_pooling` - info on PySQLite's behavior.
.. _faq_dbapi_connection:
How do I get at the raw DBAPI connection when using an Engine?
--------------------------------------------------------------
With a regular SA engine-level Connection, you can get at a pool-proxied
version of the DBAPI connection via the :attr:`_engine.Connection.connection` attribute on
:class:`_engine.Connection`, and for the really-real DBAPI connection you can access the
:attr:`._ConnectionFairy.dbapi_connection` attribute on that. On regular sync drivers
there is usually no need to access the non-pool-proxied DBAPI connection,
as all methods are proxied through::
engine = create_engine(...)
conn = engine.connect()
# pep-249 style ConnectionFairy connection pool proxy object
connection_fairy = conn.connection
# typically to run statements one would get a cursor() from this
# object
cursor_obj = connection_fairy.cursor()
# ... work with cursor_obj
# to bypass "connection_fairy", such as to set attributes on the
# unproxied pep-249 DBAPI connection, use .dbapi_connection
raw_dbapi_connection = connection_fairy.dbapi_connection
# the same thing is available as .driver_connection (more on this
# in the next section)
also_raw_dbapi_connection = connection_fairy.driver_connection
.. versionchanged:: 1.4.24 Added the
:attr:`._ConnectionFairy.dbapi_connection` attribute,
which supersedes the previous
:attr:`._ConnectionFairy.connection` attribute which still remains
available; this attribute always provides a pep-249 synchronous style
connection object. The :attr:`._ConnectionFairy.driver_connection`
attribute is also added which will always refer to the real driver-level
connection regardless of what API it presents.
Accessing the underlying connection for an asyncio driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When an asyncio driver is in use, there are two changes to the above
scheme. The first is that when using an :class:`_asyncio.AsyncConnection`,
the :class:`._ConnectionFairy` must be accessed using the awaitable method
:meth:`_asyncio.AsyncConnection.get_raw_connection`. The
returned :class:`._ConnectionFairy` in this case retains a sync-style
pep-249 usage pattern, and the :attr:`._ConnectionFairy.dbapi_connection`
attribute refers to a
SQLAlchemy-adapted connection object which adapts the asyncio
connection to a sync style pep-249 API; in other words, there are *two* levels
of proxying going on when using an asyncio driver. The actual asyncio connection
is available from the :attr:`._ConnectionFairy.driver_connection` attribute.
The previous example, restated in terms of asyncio, looks like::

    async def main():
        engine = create_async_engine(...)
        conn = await engine.connect()

        # pep-249 style ConnectionFairy connection pool proxy object
        # presents a sync interface
        connection_fairy = await conn.get_raw_connection()

        # beneath that proxy is a second proxy which adapts the
        # asyncio driver into a pep-249 connection object, accessible
        # via .dbapi_connection as is the same with a sync API
        sqla_sync_conn = connection_fairy.dbapi_connection

        # the really-real innermost driver connection is available
        # from the .driver_connection attribute
        raw_asyncio_connection = connection_fairy.driver_connection

        # work with raw asyncio connection
        result = await raw_asyncio_connection.execute(...)

.. versionchanged:: 1.4.24 Added the
:attr:`._ConnectionFairy.dbapi_connection`
and :attr:`._ConnectionFairy.driver_connection` attributes to allow access
to pep-249 connections, pep-249 adaption layers, and underlying driver
connections using a consistent interface.
When using asyncio drivers, the above "DBAPI" connection is actually a
SQLAlchemy-adapted form of connection which presents a synchronous,
pep-249 style API. The actual
asyncio driver connection, which presents the original asyncio API
of the driver in use, is available via the
:attr:`._ConnectionFairy.driver_connection` attribute of
:class:`._ConnectionFairy`.
For a standard pep-249 driver, :attr:`._ConnectionFairy.dbapi_connection`
and :attr:`._ConnectionFairy.driver_connection` are synonymous.
You must ensure that you revert any isolation level settings or other
operation-specific settings on the connection back to normal before returning
it to the pool.
As an alternative to reverting settings, you can call the
:meth:`_engine.Connection.detach` method on either :class:`_engine.Connection`
or the proxied connection, which will de-associate the connection from the pool
such that it will be closed and discarded when :meth:`_engine.Connection.close`
is called::
conn = engine.connect()
conn.detach() # detaches the DBAPI connection from the connection pool
conn.connection.<go nuts>
conn.close() # connection is closed for real, the pool replaces it with a new connection
How do I use engines / connections / sessions with Python multiprocessing, or os.fork()?
----------------------------------------------------------------------------------------
This is covered in the section :ref:`pooling_multiprocessing`.