File: engines.rst

package info (click to toggle)
sqlalchemy 1.3.22%2Bds1-1
links: PTS, VCS
area: main
in suites: bullseye
size: 16,916 kB
sloc: python: 263,220; ansic: 1,342; makefile: 255; xml: 17
file content (450 lines) | stat: -rw-r--r-- 18,116 bytes
.. _engines_toplevel:

====================
Engine Configuration
====================

The :class:`_engine.Engine` is the starting point for any SQLAlchemy application. It's
"home base" for the actual database and its :term:`DBAPI`, delivered to the SQLAlchemy
application through a connection pool and a :class:`.Dialect`, which describes how
to talk to a specific kind of database/DBAPI combination.

The general structure can be illustrated as follows:

.. image:: sqla_engine_arch.png

Where above, an :class:`_engine.Engine` references both a
:class:`.Dialect` and a :class:`_pool.Pool`,
which together interpret the DBAPI's module functions as well as the behavior
of the database.

Creating an engine is just a matter of issuing a single call,
:func:`_sa.create_engine()`::

    from sqlalchemy import create_engine
    engine = create_engine('postgresql://scott:tiger@localhost:5432/mydatabase')

The above engine creates a :class:`.Dialect` object tailored towards
PostgreSQL, as well as a :class:`_pool.Pool` object which will establish a DBAPI
connection at ``localhost:5432`` when a connection request is first received.
Note that the :class:`_engine.Engine` and its underlying :class:`_pool.Pool` do **not**
establish the first actual DBAPI connection until the :meth:`_engine.Engine.connect`
method is called, or an operation which is dependent on this method such as
:meth:`_engine.Engine.execute` is invoked. In this way, :class:`_engine.Engine` and
:class:`_pool.Pool` can be said to have a *lazy initialization* behavior.

The :class:`_engine.Engine`, once created, can either be used directly to interact with the database,
or can be passed to a :class:`.Session` object to work with the ORM.   This section
covers the details of configuring an :class:`_engine.Engine`.   The next section, :ref:`connections_toplevel`,
will detail the usage API of the :class:`_engine.Engine` and similar, typically for non-ORM
applications.

.. _supported_dbapis:

Supported Databases
===================

SQLAlchemy includes many :class:`.Dialect` implementations for various
backends.   Dialects for the most common databases are included with SQLAlchemy; a handful
of others require an additional install of a separate dialect.

See the section :ref:`dialect_toplevel` for information on the various backends available.

.. _database_urls:

Database Urls
=============

The :func:`_sa.create_engine` function produces an :class:`_engine.Engine` object based
on a URL.  These URLs follow `RFC-1738
<http://rfc.net/rfc1738.html>`_, and usually can include username, password,
hostname, database name as well as optional keyword arguments for additional configuration.
In some cases a file path is accepted, and in others a "data source name" replaces
the "host" and "database" portions.  The typical form of a database URL is::

    dialect+driver://username:password@host:port/database

Dialect names include the identifying name of the SQLAlchemy dialect,
a name such as ``sqlite``, ``mysql``, ``postgresql``, ``oracle``, or ``mssql``.
The drivername is the name of the DBAPI to be used to connect to
the database using all lowercase letters. If not specified, a "default" DBAPI
will be imported if available - this default is typically the most widely
known driver available for that backend.

As the URL is like any other URL, special characters such as those that
may be used in the password need to be URL encoded.   Below is an example
of a URL that includes the password ``"kx%jj5/g"``::

  postgresql+pg8000://dbuser:kx%25jj5%2Fg@pghost10/appdb

The encoding for the above password can be generated using ``urllib``::

  >>> import urllib.parse
  >>> urllib.parse.quote_plus("kx%jj5/g")
  'kx%25jj5%2Fg'

Examples for common connection styles follow below.  For a full index of
detailed information on all included dialects as well as links to third-party
dialects, see :ref:`dialect_toplevel`.

PostgreSQL
----------

The PostgreSQL dialect uses psycopg2 as the default DBAPI.  pg8000 is
also available as a pure-Python substitute::

    # default
    engine = create_engine('postgresql://scott:tiger@localhost/mydatabase')

    # psycopg2
    engine = create_engine('postgresql+psycopg2://scott:tiger@localhost/mydatabase')

    # pg8000
    engine = create_engine('postgresql+pg8000://scott:tiger@localhost/mydatabase')

More notes on connecting to PostgreSQL at :ref:`postgresql_toplevel`.

MySQL
-----

The MySQL dialect uses mysql-python as the default DBAPI.  There are many
MySQL DBAPIs available, including MySQL-connector-python and OurSQL::

    # default
    engine = create_engine('mysql://scott:tiger@localhost/foo')

    # mysqlclient (a maintained fork of MySQL-Python)
    engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')

    # PyMySQL
    engine = create_engine('mysql+pymysql://scott:tiger@localhost/foo')

More notes on connecting to MySQL at :ref:`mysql_toplevel`.

Oracle
------

The Oracle dialect uses cx_oracle as the default DBAPI::

    engine = create_engine('oracle://scott:tiger@127.0.0.1:1521/sidname')

    engine = create_engine('oracle+cx_oracle://scott:tiger@tnsname')

More notes on connecting to Oracle at :ref:`oracle_toplevel`.

Microsoft SQL Server
--------------------

The SQL Server dialect uses pyodbc as the default DBAPI.  pymssql is
also available::

    # pyodbc
    engine = create_engine('mssql+pyodbc://scott:tiger@mydsn')

    # pymssql
    engine = create_engine('mssql+pymssql://scott:tiger@hostname:port/dbname')

More notes on connecting to SQL Server at :ref:`mssql_toplevel`.

SQLite
------

SQLite connects to file-based databases, using the Python built-in
module ``sqlite3`` by default.

As SQLite connects to local files, the URL format is slightly different.
The "file" portion of the URL is the filename of the database.
For a relative file path, this requires three slashes::

    # sqlite://<nohostname>/<path>
    # where <path> is relative:
    engine = create_engine('sqlite:///foo.db')

And for an absolute file path, the three slashes are followed by the absolute path::

    # Unix/Mac - 4 initial slashes in total
    engine = create_engine('sqlite:////absolute/path/to/foo.db')

    # Windows
    engine = create_engine('sqlite:///C:\\path\\to\\foo.db')

    # Windows alternative using raw string
    engine = create_engine(r'sqlite:///C:\path\to\foo.db')

To use a SQLite ``:memory:`` database, specify an empty URL::

    engine = create_engine('sqlite://')

More notes on connecting to SQLite at :ref:`sqlite_toplevel`.

Others
------

See :ref:`dialect_toplevel`, the top-level page for all additional dialect
documentation.

.. _create_engine_args:

Engine Creation API
===================

.. autofunction:: sqlalchemy.create_engine

.. autofunction:: sqlalchemy.engine_from_config

.. autofunction:: sqlalchemy.engine.url.make_url


.. autoclass:: sqlalchemy.engine.url.URL
    :members:

Pooling
=======

The :class:`_engine.Engine` will ask the connection pool for a
connection when the ``connect()`` or ``execute()`` methods are called. The
default connection pool, :class:`~.QueuePool`, will open connections to the
database on an as-needed basis. As concurrent statements are executed,
:class:`.QueuePool` will grow its pool of connections to a
default size of five, and will allow a default "overflow" of ten. Since the
:class:`_engine.Engine` is essentially "home base" for the
connection pool, it follows that you should keep a single
:class:`_engine.Engine` per database established within an
application, rather than creating a new one for each connection.

.. note::

   :class:`.QueuePool` is not used by default for SQLite engines.  See
   :ref:`sqlite_toplevel` for details on SQLite connection pool usage.

For more information on connection pooling, see :ref:`pooling_toplevel`.


.. _custom_dbapi_args:

Custom DBAPI connect() arguments / on-connect routines
=======================================================

For cases where special connection methods are needed, in the vast majority
of cases, it is most appropriate to use one of several hooks at the
:func:`_sa.create_engine` level in order to customize this process. These
are described in the following sub-sections.

Special Keyword Arguments Passed to dbapi.connect()
---------------------------------------------------

For special arguments that must be passed to the DBAPI for which the
SQLAlchemy dialect does not parse from the query string correctly,
the :paramref:`_sa.create_engine.connect_args` dictionary can be used.
This is often when special sub-structures or objects must be passed to
the DBAPI, or sometimes it's just that a particular flag must be sent as
the ``True`` symbol and the SQLAlchemy dialect is not aware of this keyword
argument.  Below illustrates the use of a psycopg2 "connection factory"
that replaces the underlying implementation the connection::


    engine = create_engine(
        "postgresql://user:pass@hostname/dbname",
        connect_args={"connection_factory": MyConnectionFactory}
    )


Controlling how parameters are passed to the DBAPI connect() function
---------------------------------------------------------------------

At the next level, we can customize how the DBAPI ``connect()`` function
itself is called using the :meth:`.DialectEvents.do_connect` event hook.
This hook is passed the full ``*args, **kwargs`` that the dialect would
send to ``connect()``.  These collections can then be modified in place
to alter how they are used::

    from sqlalchemy import event

    engine = create_engine("postgresql://user:pass@hostname/dbname")

    @event.listens_for(engine, "do_connect")
    def receive_do_connect(dialect, conn_rec, cargs, cparams):
        cparams['connection_factory'] = MyConnectionFactory

Modifying the DBAPI connection after connect, or running commands after connect
-------------------------------------------------------------------------------

For a DBAPI connection that SQLAlchemy creates without issue, but where we
would like to modify the completed connection before it's actually used, such
as for setting special flags or running certain commands, the
:meth:`.PoolEvents.connect` event hook is the most appropriate hook.  This
hook is called for every new connection created, before it is used by
SQLAlchemy::

    from sqlalchemy import event

    engine = create_engine(
        "postgresql://user:pass@hostname/dbname"
    )

    @event.listens_for(engine, "connect")
    def connect(dbapi_connection, connection_record):
        cursor = dbapi_connection.cursor()
        cursor.execute("SET some session variables")
        cursor.close()


Fully Replacing the DBAPI ``connect()`` function
------------------------------------------------

Finally, the :meth:`.DialectEvents.do_connect` event hook can also allow us to take
over the connection process entirely by establishing the connection
and returning it::

    from sqlalchemy import event

    engine = create_engine(
        "postgresql://user:pass@hostname/dbname"
    )

    @event.listens_for(engine, "do_connect")
    def receive_do_connect(dialect, conn_rec, cargs, cparams):
        # return the new DBAPI connection with whatever we'd like to
        # do
        return psycopg2.connect(*cargs, **cparams)

The :meth:`.DialectEvents.do_connect` hook supersedes the previous
:paramref:`_sa.create_engine.creator` hook, which remains available.
:meth:`.DialectEvents.do_connect` has the distinct advantage that the
complete arguments parsed from the URL are also passed to the user-defined
function which is not the case with :paramref:`_sa.create_engine.creator`.

.. _dbengine_logging:

Configuring Logging
===================

Python's standard `logging
<http://docs.python.org/library/logging.html>`_ module is used to
implement informational and debug log output with SQLAlchemy. This allows
SQLAlchemy's logging to integrate in a standard way with other applications
and libraries.   There are also two parameters
:paramref:`_sa.create_engine.echo` and :paramref:`_sa.create_engine.echo_pool`
present on :func:`_sa.create_engine` which allow immediate logging to ``sys.stdout``
for the purposes of local development; these parameters ultimately interact
with the regular Python loggers described below.

This section assumes familiarity with the above linked logging module. All
logging performed by SQLAlchemy exists underneath the ``sqlalchemy``
namespace, as used by ``logging.getLogger('sqlalchemy')``. When logging has
been configured (i.e. such as via ``logging.basicConfig()``), the general
namespace of SA loggers that can be turned on is as follows:

* ``sqlalchemy.engine`` - controls SQL echoing.  Set to ``logging.INFO`` for
  SQL query output, ``logging.DEBUG`` for query + result set output.  These
  settings are equivalent to ``echo=True`` and ``echo="debug"`` on
  :paramref:`_sa.create_engine.echo`, respectively.

* ``sqlalchemy.pool`` - controls connection pool logging.  Set to
  ``logging.INFO`` to log connection invalidation and recycle events; set to
  ``logging.DEBUG`` to additionally log all pool checkins and checkouts.
  These settings are equivalent to ``pool_echo=True`` and ``pool_echo="debug"``
  on :paramref:`_sa.create_engine.echo_pool`, respectively.

* ``sqlalchemy.dialects`` - controls custom logging for SQL dialects, to the
  extend that logging is used within specific dialects, which is generally
  minimal.

* ``sqlalchemy.orm`` - controls logging of various ORM functions to the extent
  that logging is used within the ORM, which is generally minimal.  Set to
  ``logging.INFO`` to log some top-level information on mapper configurations.

For example, to log SQL queries using Python logging instead of the
``echo=True`` flag::

    import logging

    logging.basicConfig()
    logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

By default, the log level is set to ``logging.WARN`` within the entire
``sqlalchemy`` namespace so that no log operations occur, even within an
application that has logging enabled otherwise.

.. note::

   The SQLAlchemy :class:`_engine.Engine` conserves Python function call
   overhead by only emitting log statements when the current logging level is
   detected as ``logging.INFO`` or ``logging.DEBUG``.  It only checks this
   level when a new connection is procured from the connection pool.  Therefore
   when changing the logging configuration for an already-running application,
   any :class:`_engine.Connection` that's currently active, or more commonly a
   :class:`~.orm.session.Session` object that's active in a transaction, won't
   log any SQL according to the new configuration until a new
   :class:`_engine.Connection` is procured (in the case of
   :class:`~.orm.session.Session`, this is after the current transaction ends
   and a new one begins).

More on the Echo Flag
---------------------

As mentioned previously, the :paramref:`_sa.create_engine.echo` and :paramref:`_sa.create_engine.echo_pool`
parameters are a shortcut to immediate logging to ``sys.stdout``::


    >>> from sqlalchemy import create_engine, text
    >>> e = create_engine("sqlite://", echo=True, echo_pool='debug')
    >>> with e.connect() as conn:
    ...    print(conn.scalar(text("select 'hi'")))
    ...
    2020-10-24 12:54:57,701 DEBUG sqlalchemy.pool.impl.SingletonThreadPool Created new connection <sqlite3.Connection object at 0x7f287819ac60>
    2020-10-24 12:54:57,701 DEBUG sqlalchemy.pool.impl.SingletonThreadPool Connection <sqlite3.Connection object at 0x7f287819ac60> checked out from pool
    2020-10-24 12:54:57,702 INFO sqlalchemy.engine.Engine select 'hi'
    2020-10-24 12:54:57,702 INFO sqlalchemy.engine.Engine ()
    hi
    2020-10-24 12:54:57,703 DEBUG sqlalchemy.pool.impl.SingletonThreadPool Connection <sqlite3.Connection object at 0x7f287819ac60> being returned to pool
    2020-10-24 12:54:57,704 DEBUG sqlalchemy.pool.impl.SingletonThreadPool Connection <sqlite3.Connection object at 0x7f287819ac60> rollback-on-return

Use of these flags is roughly equivalent to::

    import logging
    logging.basicConfig()
    logging.getLogger("sqlalchemy.engine").setLevel(logging.INFO)
    logging.getLogger("sqlalchemy.pool").setLevel(logging.DEBUG)

It's important to note that these two flags work **independently** of any
existing logging configuration, and will make use of ``logging.basicConfig()``
unconditionally.  This has the effect of being configured **in addition** to
any existing logger configurations. Therefore, **when configuring logging
explicitly, ensure all echo flags are set to False at all times**, to avoid
getting duplicate log lines.

Setting the Logging Name
-------------------------

The logger name of instance such as an :class:`~sqlalchemy.engine.Engine` or
:class:`~sqlalchemy.pool.Pool` defaults to using a truncated hex identifier
string. To set this to a specific name, use the
:paramref:`_sa.create_engine.logging_name` and
:paramref:`_sa.create_engine.pool_logging_name`  with
:func:`sqlalchemy.create_engine`::

    >>> from sqlalchemy import create_engine
    >>> from sqlalchemy import text
    >>> e = create_engine("sqlite://", echo=True, logging_name='myengine')
    >>> with e.connect() as conn:
    ...     conn.execute(text("select 'hi'"))
    ...
    2020-10-24 12:47:04,291 INFO sqlalchemy.engine.Engine.myengine select 'hi'
    2020-10-24 12:47:04,292 INFO sqlalchemy.engine.Engine.myengine ()

Hiding Parameters
------------------

The logging emitted by :class:`_engine.Engine` also indicates an excerpt
of the SQL parameters that are present for a particular statement.  To prevent
these parameters from being logged for privacy purposes, enable the
:paramref:`_sa.create_engine.hide_parameters` flag::

    >>> e = create_engine("sqlite://", echo=True, hide_parameters=True)
    >>> with e.connect() as conn:
    ...     conn.execute(text("select :some_private_name"), {"some_private_name": "pii"})
    ...
    2020-10-24 12:48:32,808 INFO sqlalchemy.engine.Engine select ?
    2020-10-24 12:48:32,808 INFO sqlalchemy.engine.Engine [SQL parameters hidden due to hide_parameters=True]