File: persistence_techniques.rst

package info (click to toggle)
sqlalchemy 1.4.46%2Bds1-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 22,444 kB
  • sloc: python: 341,434; ansic: 1,760; makefile: 226; xml: 17; sh: 7
file content (1061 lines) | stat: -rw-r--r-- 44,517 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
=================================
Additional Persistence Techniques
=================================



.. _flush_embedded_sql_expressions:

Embedding SQL Insert/Update Expressions into a Flush
====================================================

This feature allows the value of a database column to be set to a SQL
expression instead of a literal value. It's especially useful for atomic
updates, calling stored procedures, etc. All you do is assign an expression to
an attribute::

    class SomeClass(Base):
        __tablename__ = "some_table"

        # ...

        value = Column(Integer)


    someobject = session.query(SomeClass).get(5)

    # set 'value' attribute to a SQL expression adding one
    someobject.value = SomeClass.value + 1

    # issues "UPDATE some_table SET value=value+1"
    session.commit()

This technique works both for INSERT and UPDATE statements. After the
flush/commit operation, the ``value`` attribute on ``someobject`` above is
expired, so that when next accessed the newly generated value will be loaded
from the database.

The feature also has conditional support to work in conjunction with
primary key columns.  A database that supports RETURNING, e.g. PostgreSQL,
Oracle, or SQL Server, or as a special case when using SQLite with the pysqlite
driver and a single auto-increment column, a SQL expression may be assigned
to a primary key column as well.  This allows both the SQL expression to
be evaluated, as well as allows any server side triggers that modify the
primary key value on INSERT, to be successfully retrieved by the ORM as
part of the object's primary key::


    class Foo(Base):
        __tablename__ = 'foo'
        pk = Column(Integer, primary_key=True)
        bar = Column(Integer)

    e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
    Base.metadata.create_all(e)

    session = Session(e)

    foo = Foo(pk=sql.select(sql.func.coalesce(sql.func.max(Foo.pk) + 1, 1))
    session.add(foo)
    session.commit()

On PostgreSQL, the above :class:`.Session` will emit the following INSERT:

.. sourcecode:: sql

    INSERT INTO foo (foopk, bar) VALUES
    ((SELECT coalesce(max(foo.foopk) + %(max_1)s, %(coalesce_2)s) AS coalesce_1
    FROM foo), %(bar)s) RETURNING foo.foopk

.. versionadded:: 1.3
    SQL expressions can now be passed to a primary key column during an ORM
    flush; if the database supports RETURNING, or if pysqlite is in use, the
    ORM will be able to retrieve the server-generated value as the value
    of the primary key attribute.

.. _session_sql_expressions:

Using SQL Expressions with Sessions
===================================

SQL expressions and strings can be executed via the
:class:`~sqlalchemy.orm.session.Session` within its transactional context.
This is most easily accomplished using the
:meth:`~.Session.execute` method, which returns a
:class:`~sqlalchemy.engine.CursorResult` in the same manner as an
:class:`~sqlalchemy.engine.Engine` or
:class:`~sqlalchemy.engine.Connection`::

    Session = sessionmaker(bind=engine)
    session = Session()

    # execute a string statement
    result = session.execute("select * from table where id=:id", {"id": 7})

    # execute a SQL expression construct
    result = session.execute(select(mytable).where(mytable.c.id == 7))

The current :class:`~sqlalchemy.engine.Connection` held by the
:class:`~sqlalchemy.orm.session.Session` is accessible using the
:meth:`~.Session.connection` method::

    connection = session.connection()

The examples above deal with a :class:`_orm.Session` that's
bound to a single :class:`_engine.Engine` or
:class:`_engine.Connection`. To execute statements using a
:class:`_orm.Session` which is bound either to multiple
engines, or none at all (i.e. relies upon bound metadata), both
:meth:`_orm.Session.execute` and
:meth:`_orm.Session.connection` accept a dictionary of bind arguments
:paramref:`_orm.Session.execute.bind_arguments` which may include "mapper"
which is passed a mapped class or
:class:`_orm.Mapper` instance, which is used to locate the
proper context for the desired engine::

    Session = sessionmaker()
    session = Session()

    # need to specify mapper or class when executing
    result = session.execute(
        text("select * from table where id=:id"),
        {"id": 7},
        bind_arguments={"mapper": MyMappedClass},
    )

    result = session.execute(
        select(mytable).where(mytable.c.id == 7), bind_arguments={"mapper": MyMappedClass}
    )

    connection = session.connection(MyMappedClass)

.. versionchanged:: 1.4 the ``mapper`` and ``clause`` arguments to
   :meth:`_orm.Session.execute` are now passed as part of a dictionary
   sent as the :paramref:`_orm.Session.execute.bind_arguments` parameter.
   The previous arguments are still accepted however this usage is
   deprecated.

.. _session_forcing_null:

Forcing NULL on a column with a default
=======================================

The ORM considers any attribute that was never set on an object as a
"default" case; the attribute will be omitted from the INSERT statement::

    class MyObject(Base):
        __tablename__ = "my_table"
        id = Column(Integer, primary_key=True)
        data = Column(String(50), nullable=True)


    obj = MyObject(id=1)
    session.add(obj)
    session.commit()  # INSERT with the 'data' column omitted; the database
    # itself will persist this as the NULL value

Omitting a column from the INSERT means that the column will
have the NULL value set, *unless* the column has a default set up,
in which case the default value will be persisted.   This holds true
both from a pure SQL perspective with server-side defaults, as well as the
behavior of SQLAlchemy's insert behavior with both client-side and server-side
defaults::

    class MyObject(Base):
        __tablename__ = "my_table"
        id = Column(Integer, primary_key=True)
        data = Column(String(50), nullable=True, server_default="default")


    obj = MyObject(id=1)
    session.add(obj)
    session.commit()  # INSERT with the 'data' column omitted; the database
    # itself will persist this as the value 'default'

However, in the ORM, even if one assigns the Python value ``None`` explicitly
to the object, this is treated the **same** as though the value were never
assigned::

    class MyObject(Base):
        __tablename__ = "my_table"
        id = Column(Integer, primary_key=True)
        data = Column(String(50), nullable=True, server_default="default")


    obj = MyObject(id=1, data=None)
    session.add(obj)
    session.commit()  # INSERT with the 'data' column explicitly set to None;
    # the ORM still omits it from the statement and the
    # database will still persist this as the value 'default'

The above operation will persist into the ``data`` column the
server default value of ``"default"`` and not SQL NULL, even though ``None``
was passed; this is a long-standing behavior of the ORM that many applications
hold as an assumption.

So what if we want to actually put NULL into this column, even though the
column has a default value?  There are two approaches.  One is that
on a per-instance level, we assign the attribute using the
:obj:`_expression.null` SQL construct::

    from sqlalchemy import null

    obj = MyObject(id=1, data=null())
    session.add(obj)
    session.commit()  # INSERT with the 'data' column explicitly set as null();
    # the ORM uses this directly, bypassing all client-
    # and server-side defaults, and the database will
    # persist this as the NULL value

The :obj:`_expression.null` SQL construct always translates into the SQL
NULL value being directly present in the target INSERT statement.

If we'd like to be able to use the Python value ``None`` and have this
also be persisted as NULL despite the presence of column defaults,
we can configure this for the ORM using a Core-level modifier
:meth:`.TypeEngine.evaluates_none`, which indicates
a type where the ORM should treat the value ``None`` the same as any other
value and pass it through, rather than omitting it as a "missing" value::

    class MyObject(Base):
        __tablename__ = "my_table"
        id = Column(Integer, primary_key=True)
        data = Column(
            String(50).evaluates_none(),  # indicate that None should always be passed
            nullable=True,
            server_default="default",
        )


    obj = MyObject(id=1, data=None)
    session.add(obj)
    session.commit()  # INSERT with the 'data' column explicitly set to None;
    # the ORM uses this directly, bypassing all client-
    # and server-side defaults, and the database will
    # persist this as the NULL value

.. topic:: Evaluating None

  The :meth:`.TypeEngine.evaluates_none` modifier is primarily intended to
  signal a type where the Python value "None" is significant, the primary
  example being a JSON type which may want to persist the JSON ``null`` value
  rather than SQL NULL.  We are slightly repurposing it here in order to
  signal to the ORM that we'd like ``None`` to be passed into the type whenever
  present, even though no special type-level behaviors are assigned to it.

.. versionadded:: 1.1 added the :meth:`.TypeEngine.evaluates_none` method
   in order to indicate that a "None" value should be treated as significant.

.. _orm_server_defaults:

Fetching Server-Generated Defaults
===================================

As introduced in the sections :ref:`server_defaults` and :ref:`triggered_columns`,
the Core supports the notion of database columns for which the database
itself generates a value upon INSERT and in less common cases upon UPDATE
statements.  The ORM features support for such columns regarding being
able to fetch these newly generated values upon flush.   This behavior is
required in the case of primary key columns that are generated by the server,
since the ORM has to know the primary key of an object once it is persisted.

In the vast majority of cases, primary key columns that have their value
generated automatically by the database are  simple integer columns, which are
implemented by the database as either a so-called "autoincrement" column, or
from a sequence associated with the column.   Every database dialect within
SQLAlchemy Core supports a method of retrieving these primary key values which
is often native to the Python DBAPI, and in general this process is automatic,
with the exception of a database like Oracle that requires us to specify a
:class:`.Sequence` explicitly.   There is more documentation regarding this
at :paramref:`_schema.Column.autoincrement`.

For server-generating columns that are not primary key columns or that are not
simple autoincrementing integer columns, the ORM requires that these columns
are marked with an appropriate ``server_default`` directive that allows the ORM to
retrieve this value.   Not all methods are supported on all backends, however,
so care must be taken to use the appropriate method. The two questions to be
answered are, 1. is this column part of the primary key or not, and 2. does the
database support RETURNING or an equivalent, such as "OUTPUT inserted"; these
are SQL phrases which return a server-generated value at the same time as the
INSERT or UPDATE statement is invoked. Databases that support RETURNING or
equivalent include PostgreSQL, Oracle, and SQL Server.  Databases that do not
include SQLite and MySQL.

Case 1: non primary key, RETURNING or equivalent is supported
-------------------------------------------------------------

In this case, columns should be marked as :class:`.FetchedValue` or with an
explicit :paramref:`_schema.Column.server_default`.   The
:paramref:`_orm.mapper.eager_defaults` parameter
may be used to indicate that these
columns should be fetched immediately upon INSERT and sometimes UPDATE::


    class MyModel(Base):
        __tablename__ = "my_table"

        id = Column(Integer, primary_key=True)
        timestamp = Column(DateTime(), server_default=func.now())

        # assume a database trigger populates a value into this column
        # during INSERT
        special_identifier = Column(String(50), server_default=FetchedValue())

        __mapper_args__ = {"eager_defaults": True}

Above, an INSERT statement that does not specify explicit values for
"timestamp" or "special_identifier" from the client side will include the
"timestamp" and "special_identifier" columns within the RETURNING clause so
they are available immediately. On the PostgreSQL database, an INSERT for the
above table will look like:

.. sourcecode:: sql

   INSERT INTO my_table DEFAULT VALUES RETURNING my_table.id, my_table.timestamp, my_table.special_identifier


Case 2: non primary key, RETURNING or equivalent is not supported or not needed
--------------------------------------------------------------------------------

This case is the same as case 1 above, except we don't specify
:paramref:`.orm.mapper.eager_defaults`::

    class MyModel(Base):
        __tablename__ = "my_table"

        id = Column(Integer, primary_key=True)
        timestamp = Column(DateTime(), server_default=func.now())

        # assume a database trigger populates a value into this column
        # during INSERT
        special_identifier = Column(String(50), server_default=FetchedValue())

After a record with the above mapping is INSERTed, the "timestamp" and
"special_identifier" columns will remain empty, and will be fetched via
a second SELECT statement when they are first accessed after the flush, e.g.
they are marked as "expired".

If the :paramref:`.orm.mapper.eager_defaults` is still used, and the backend
database does not support RETURNING or an equivalent, the ORM will emit this
SELECT statement immediately following the INSERT statement.   This is often
undesirable as it adds additional SELECT statements to the flush process that
may not be needed.  Using the above mapping with the
:paramref:`.orm.mapper.eager_defaults` flag set to True against MySQL results
in SQL like this upon flush (minus the comment, which is for clarification only):

.. sourcecode:: sql

    INSERT INTO my_table () VALUES ()

    -- when eager_defaults **is** used, but RETURNING is not supported
    SELECT my_table.timestamp AS my_table_timestamp, my_table.special_identifier AS my_table_special_identifier
    FROM my_table WHERE my_table.id = %s

Case 3: primary key, RETURNING or equivalent is supported
----------------------------------------------------------

A primary key column with a server-generated value must be fetched immediately
upon INSERT; the ORM can only access rows for which it has a primary key value,
so if the primary key is generated by the server, the ORM needs a way for the
database to give us that new value immediately upon INSERT.

As mentioned above, for integer "autoincrement" columns as well as
PostgreSQL SERIAL, these types are handled automatically by the Core; databases
include functions for fetching the "last inserted id" where RETURNING
is not supported, and where RETURNING is supported SQLAlchemy will use that.

However, for non-integer values, as well as for integer values that must be
explicitly linked to a sequence or other triggered routine,  the server default
generation must be marked in the table metadata.

For an explicit sequence as we use with Oracle, this just means we are using
the :class:`.Sequence` construct::

    class MyOracleModel(Base):
        __tablename__ = "my_table"

        id = Column(Integer, Sequence("my_sequence"), primary_key=True)
        data = Column(String(50))

The INSERT for a model as above on Oracle looks like:

.. sourcecode:: sql

    INSERT INTO my_table (id, data) VALUES (my_sequence.nextval, :data) RETURNING my_table.id INTO :ret_0

Where above, SQLAlchemy renders ``my_sequence.nextval`` for the primary key column
and also uses RETURNING to get the new value back immediately.

For datatypes that generate values automatically, or columns that are populated
by a trigger, we use :class:`.FetchedValue`.  Below is a model that uses a
SQL Server TIMESTAMP column as the primary key, which generates values automatically::

    class MyModel(Base):
        __tablename__ = "my_table"

        timestamp = Column(TIMESTAMP(), server_default=FetchedValue(), primary_key=True)

An INSERT for the above table on SQL Server looks like:

.. sourcecode:: sql

    INSERT INTO my_table OUTPUT inserted.timestamp DEFAULT VALUES

Case 4: primary key, RETURNING or equivalent is not supported
--------------------------------------------------------------

In this area we are generating rows for a database such as SQLite or MySQL
where some means of generating a default is occurring on the server, but is
outside of the database's usual autoincrement routine. In this case, we have to
make sure SQLAlchemy can "pre-execute" the default, which means it has to be an
explicit SQL expression.

.. note::  This section will illustrate multiple recipes involving
   datetime values for MySQL and SQLite, since the datetime datatypes on these
   two  backends have additional idiosyncratic requirements that are useful to
   illustrate.  Keep in mind however that SQLite and MySQL require an explicit
   "pre-executed" default generator for *any* auto-generated datatype used as
   the primary key other than the usual single-column autoincrementing integer
   value.

MySQL with DateTime primary key
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Using the example of a :class:`.DateTime` column for MySQL, we add an explicit
pre-execute-supported default using the "NOW()" SQL function::

    class MyModel(Base):
        __tablename__ = "my_table"

        timestamp = Column(DateTime(), default=func.now(), primary_key=True)

Where above, we select the "NOW()" function to deliver a datetime value
to the column.  The SQL generated by the above is:

.. sourcecode:: sql

    SELECT now() AS anon_1
    INSERT INTO my_table (timestamp) VALUES (%s)
    ('2018-08-09 13:08:46',)

MySQL with TIMESTAMP primary key
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When using the :class:`_types.TIMESTAMP` datatype with MySQL, MySQL ordinarily
associates a server-side default with this datatype automatically.  However
when we use one as a primary key, the Core cannot retrieve the newly generated
value unless we execute the function ourselves.  As :class:`_types.TIMESTAMP` on
MySQL actually stores a binary value, we need to add an additional "CAST" to our
usage of "NOW()" so that we retrieve a binary value that can be persisted
into the column::

    from sqlalchemy import cast, Binary


    class MyModel(Base):
        __tablename__ = "my_table"

        timestamp = Column(TIMESTAMP(), default=cast(func.now(), Binary), primary_key=True)

Above, in addition to selecting the "NOW()" function, we additionally make
use of the :class:`.Binary` datatype in conjunction with :func:`.cast` so that
the returned value is binary.  SQL rendered from the above within an
INSERT looks like:

.. sourcecode:: sql

    SELECT CAST(now() AS BINARY) AS anon_1
    INSERT INTO my_table (timestamp) VALUES (%s)
    (b'2018-08-09 13:08:46',)

SQLite with DateTime primary key
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For SQLite, new timestamps can be generated using the SQL function
``datetime('now', 'localtime')`` (or specify ``'utc'`` for UTC),
however making things more complicated is that this returns a string
value, which is then incompatible with SQLAlchemy's :class:`.DateTime`
datatype (even though the datatype converts the information back into a
string for the SQLite backend, it must be passed through as a Python datetime).
We therefore must also specify that we'd like to coerce the return value to
:class:`.DateTime` when it is returned from the function, which we achieve
by passing this as the ``type_`` parameter::

    class MyModel(Base):
        __tablename__ = "my_table"

        timestamp = Column(
            DateTime,
            default=func.datetime("now", "localtime", type_=DateTime),
            primary_key=True,
        )

The above mapping upon INSERT will look like:

.. sourcecode:: sql

    SELECT datetime(?, ?) AS datetime_1
    ('now', 'localtime')
    INSERT INTO my_table (timestamp) VALUES (?)
    ('2018-10-02 13:37:33.000000',)


.. seealso::

    :ref:`metadata_defaults_toplevel`

Notes on eagerly fetching client invoked SQL expressions used for INSERT or UPDATE
-----------------------------------------------------------------------------------

The preceding examples indicate the use of :paramref:`_schema.Column.server_default`
to create tables that include default-generation functions within their
DDL.

SQLAlchemy also supports non-DDL server side defaults, as documented at
:ref:`defaults_client_invoked_sql`; these "client invoked SQL expressions"
are set up using the :paramref:`_schema.Column.default` and
:paramref:`_schema.Column.onupdate` parameters.

These SQL expressions currently are subject to the same limitations within the
ORM as occurs for true server-side defaults; they won't be eagerly fetched with
RETURNING when using :paramref:`_orm.mapper.eager_defaults` unless the
:class:`.FetchedValue` directive is associated with the
:class:`_schema.Column`, even though these expressions are not DDL server
defaults and are actively rendered by SQLAlchemy itself. This limitation may be
addressed in future SQLAlchemy releases.

The :class:`.FetchedValue` construct can be applied to
:paramref:`_schema.Column.server_default` or
:paramref:`_schema.Column.server_onupdate` at the same time that a SQL
expression is used with :paramref:`_schema.Column.default` and
:paramref:`_schema.Column.onupdate`, such as in the example below where the
``func.now()`` construct is used as a client-invoked SQL expression
for :paramref:`_schema.Column.default` and
:paramref:`_schema.Column.onupdate`.  In order for the behavior of
:paramref:`_orm.mapper.eager_defaults` to include that it fetches these
values using RETURNING when available, :paramref:`_schema.Column.server_default` and
:paramref:`_schema.Column.server_onupdate` are used with :class:`.FetchedValue`
to ensure that the fetch occurs::

    class MyModel(Base):
        __tablename__ = "my_table"

        id = Column(Integer, primary_key=True)

        created = Column(DateTime(), default=func.now(), server_default=FetchedValue())
        updated = Column(
            DateTime(),
            onupdate=func.now(),
            server_default=FetchedValue(),
            server_onupdate=FetchedValue(),
        )

        __mapper_args__ = {"eager_defaults": True}

With a mapping similar to the above, the SQL rendered by the ORM for
INSERT and UPDATE will include ``created`` and ``updated`` in the RETURNING
clause::

  INSERT INTO my_table (created) VALUES (now()) RETURNING my_table.id, my_table.created, my_table.updated

  UPDATE my_table SET updated=now() WHERE my_table.id = %(my_table_id)s RETURNING my_table.updated



.. _orm_dml_returning_objects:


Using INSERT, UPDATE and ON CONFLICT (i.e. upsert) to return ORM Objects
==========================================================================

.. deepalchemy:: The feature of linking ORM objects to RETURNING is a new and
   experimental feature.

.. versionadded:: 1.4.0

The :term:`DML` constructs :func:`_dml.insert`, :func:`_dml.update`, and
:func:`_dml.delete` feature a method :meth:`_dml.UpdateBase.returning` which on
database backends that support RETURNING (PostgreSQL, SQL Server, some MariaDB
versions) may be used to return database rows generated or matched by
the statement as though they were SELECTed. The ORM-enabled UPDATE and DELETE
statements may be combined with this feature, so that they return rows
corresponding to all the rows which were matched by the criteria::

    from sqlalchemy import update

    stmt = (
        update(User)
        .where(User.name == "squidward")
        .values(name="spongebob")
        .returning(User.id)
    )

    for row in session.execute(stmt):
        print(f"id: {row.id}")

The above example returns the ``User.id`` attribute for each row matched.
Provided that each row contains at least a primary key value, we may opt to
receive these rows as ORM objects, allowing ORM objects to be loaded from the
database corresponding atomically to an UPDATE statement against those rows. To
achieve this, we may combine the :class:`_dml.Update` construct which returns
``User`` rows with a :func:`_sql.select` that's adapted to run this UPDATE
statement in an ORM context using the :meth:`_sql.Select.from_statement`
method::

    stmt = (
        update(User)
        .where(User.name == "squidward")
        .values(name="spongebob")
        .returning(User)
    )

    orm_stmt = select(User).from_statement(stmt).execution_options(populate_existing=True)

    for user in session.execute(orm_stmt).scalars():
        print("updated user: %s" % user)

Above, we produce an :func:`_dml.update` construct that includes
:meth:`_dml.Update.returning` given the full ``User`` entity, which will
produce complete rows from the database table as it UPDATEs them; any arbitrary
set of columns to load may be specified as long as the full primary key is
included. Next, these rows are adapted to an ORM load by producing a
:func:`_sql.select` for the desired entity, then adapting it to the UPDATE
statement by passing the :class:`_dml.Update` construct to the
:meth:`_sql.Select.from_statement` method; this special ORM method, introduced
at :ref:`orm_queryguide_selecting_text`, produces an ORM-specific adapter that
allows the given statement to act as though it were the SELECT of rows that is
first described.   No SELECT is actually emitted in the database, only the
UPDATE..RETURNING we've constructed.

Finally, we make use of :ref:`orm_queryguide_populate_existing` on the
construct so that all the data returned by the UPDATE, including the columns
we've updated, are populated into the returned objects, replacing any
values which were there already.  This has the same effect as if we had
used the ``synchronize_session='fetch'`` strategy described previously
at :ref:`orm_expression_update_delete_sync`.

Using PostgreSQL ON CONFLICT with RETURNING to return upserted ORM objects
---------------------------------------------------------------------------

The above approach can be used with INSERTs with RETURNING as well. As a more
advanced example, below illustrates how to use the PostgreSQL
:ref:`postgresql_insert_on_conflict` construct to INSERT or UPDATE rows in the
database, while simultaneously producing those objects as ORM instances::

    from sqlalchemy.dialects.postgresql import insert

    stmt = insert(User).values(
        [
            dict(name="sandy", fullname="Sandy Cheeks"),
            dict(name="squidward", fullname="Squidward Tentacles"),
            dict(name="spongebob", fullname="Spongebob Squarepants"),
        ]
    )

    stmt = stmt.on_conflict_do_update(
        index_elements=[User.name], set_=dict(fullname=stmt.excluded.fullname)
    ).returning(User)

    orm_stmt = select(User).from_statement(stmt).execution_options(populate_existing=True)
    for user in session.execute(
        orm_stmt,
    ).scalars():
        print("inserted or updated: %s" % user)

To start, we make sure we are using the PostgreSQL variant of the
:func:`_postgresql.insert` construct.   Next, we construct a multi-values
INSERT statement, where a single INSERT statement will provide multiple rows
to be inserted.  On the PostgreSQL database, this syntax provides the most
efficient means of sending many hundreds of rows at once to be INSERTed.

From there, we could if we wanted add the ``RETURNING`` clause to produce
a bulk INSERT.  However, to make the example even more interesting, we will
also add the PostgreSQL specific ``ON CONFLICT..DO UPDATE`` syntax so that
rows which already exist based on a unique criteria will be UPDATEd instead.
We assume there is an INDEX or UNIQUE constraint on the ``name`` column of the
``user_account`` table above, and then specify an appropriate :meth:`_postgresql.Insert.on_conflict_do_update`
criteria that will update the ``fullname`` column for rows that already exist.

Finally, we add the :meth:`_dml.Insert.returning` clause as we did in the
previous example, and select our ``User`` objects using the same
:meth:`_sql.Select.from_statement` approach as we did earlier. Supposing the
database only a row for ``(1, "squidward", NULL)`` present; this row will
trigger the ON CONFLICT routine in our above statement, in other words perform
the equivalent of an UPDATE statement. The other two rows,
``(NULL, "sandy", "Sandy Cheeks")`` and
``(NULL, "spongebob", "Spongebob Squarepants")`` do not yet exist in the
database, and will be inserted using normal INSERT semantics; the primary key
column ``id`` uses either ``SERIAL`` or ``IDENTITY`` to auto-generate new
integer values.

Using this above form, we see SQL emitted on the PostgreSQL database as:


.. sourcecode:: pycon+sql

    {opensql}INSERT INTO user_account (name, fullname)
    VALUES (%(name_m0)s, %(fullname_m0)s), (%(name_m1)s, %(fullname_m1)s), (%(name_m2)s, %(fullname_m2)s)
    ON CONFLICT (name) DO UPDATE SET fullname = excluded.fullname
    RETURNING user_account.id, user_account.name, user_account.fullname
    {'name_m0': 'sandy', 'fullname_m0': 'Sandy Cheeks', 'name_m1': 'squidward', 'fullname_m1': 'Squidward Tentacles', 'name_m2': 'spongebob', 'fullname_m2': 'Spongebob Squarepants'}{stop}

    inserted or updated: User(id=2, name='sandy', fullname='Sandy Cheeks')
    inserted or updated: User(id=3, name='squidward', fullname='Squidward Tentacles')
    inserted or updated: User(id=1, name='spongebob', fullname='Spongebob Squarepants')

Above we can also see that the INSERTed ``User`` objects have a
newly generated primary key value as we would expect with any other ORM
oriented INSERT statement.

.. seealso::

  :ref:`orm_queryguide_selecting_text` - introduces the
  :meth:`_sql.Select.from_statement` method.

.. _session_partitioning:

Partitioning Strategies (e.g. multiple database backends per Session)
=====================================================================

Simple Vertical Partitioning
----------------------------

Vertical partitioning places different classes, class hierarchies,
or mapped tables, across multiple databases, by configuring the
:class:`.Session` with the :paramref:`.Session.binds` argument. This
argument receives a dictionary that contains any combination of
ORM-mapped classes, arbitrary classes within a mapped hierarchy (such
as declarative base classes or mixins), :class:`_schema.Table` objects,
and :class:`_orm.Mapper` objects as keys, which then refer typically to
:class:`_engine.Engine` or less typically :class:`_engine.Connection` objects as targets.
The dictionary is consulted whenever the :class:`.Session` needs to
emit SQL on behalf of a particular kind of mapped class in order to locate
the appropriate source of database connectivity::

    engine1 = create_engine("postgresql://db1")
    engine2 = create_engine("postgresql://db2")

    Session = sessionmaker()

    # bind User operations to engine 1, Account operations to engine 2
    Session.configure(binds={User: engine1, Account: engine2})

    session = Session()

Above, SQL operations against either class will make usage of the :class:`_engine.Engine`
linked to that class.     The functionality is comprehensive across both
read and write operations; a :class:`_query.Query` that is against entities
mapped to ``engine1`` (determined by looking at the first entity in the
list of items requested) will make use of ``engine1`` to run the query.   A
flush operation will make use of **both** engines on a per-class basis as it
flushes objects of type ``User`` and ``Account``.

In the more common case, there are typically base or mixin classes that  can be
used to distinguish between operations that are destined for different database
connections.  The :paramref:`.Session.binds` argument can accommodate any
arbitrary Python class as a key, which will be used if it is found to be in the
``__mro__`` (Python method resolution order) for a particular  mapped class.
Supposing two declarative bases are representing two different database
connections::

    BaseA = declarative_base()

    BaseB = declarative_base()

    class User(BaseA):
        # ...

    class Address(BaseA):
        # ...


    class GameInfo(BaseB):
        # ...

    class GameStats(BaseB):
        # ...


    Session = sessionmaker()

    # all User/Address operations will be on engine 1, all
    # Game operations will be on engine 2
    Session.configure(binds={BaseA:engine1, BaseB:engine2})

Above, classes which descend from ``BaseA`` and ``BaseB`` will have their
SQL operations routed to one of two engines based on which superclass
they descend from, if any.   In the case of a class that descends from more
than one "bound" superclass, the superclass that is highest in the target
class' hierarchy will be chosen to represent which engine should be used.

.. seealso::

    :paramref:`.Session.binds`


Coordination of Transactions for a multiple-engine Session
----------------------------------------------------------

One caveat to using multiple bound engines is in the case where a commit
operation may fail on one backend after the commit has succeeded on another.
This is an inconsistency problem that in relational databases is solved
using a "two phase transaction", which adds an additional "prepare" step
to the commit sequence that allows for multiple databases to agree to commit
before actually completing the transaction.

Due to limited support within DBAPIs,  SQLAlchemy has limited support for two-
phase transactions across backends.  Most typically, it is known to work well
with the PostgreSQL backend and to  a lesser extent with the MySQL backend.
However, the :class:`.Session` is fully capable of taking advantage of the two
phase transaction feature when the backend supports it, by setting the
:paramref:`.Session.use_twophase` flag within :class:`.sessionmaker` or
:class:`.Session`.  See :ref:`session_twophase` for an example.


.. _session_custom_partitioning:

Custom Vertical Partitioning
----------------------------

More comprehensive rule-based class-level partitioning can be built by
overriding the :meth:`.Session.get_bind` method.   Below we illustrate
a custom :class:`.Session` which delivers the following rules:

1. Flush operations, as well as bulk "update" and "delete" operations,
   are delivered to the engine named ``leader``.

2. Operations on objects that subclass ``MyOtherClass`` all
   occur on the ``other`` engine.

3. Read operations for all other classes occur on a random
   choice of the ``follower1`` or ``follower2`` database.

::

    engines = {
        "leader": create_engine("sqlite:///leader.db"),
        "other": create_engine("sqlite:///other.db"),
        "follower1": create_engine("sqlite:///follower1.db"),
        "follower2": create_engine("sqlite:///follower2.db"),
    }

    from sqlalchemy.sql import Update, Delete
    from sqlalchemy.orm import Session, sessionmaker
    import random


    class RoutingSession(Session):
        def get_bind(self, mapper=None, clause=None):
            if mapper and issubclass(mapper.class_, MyOtherClass):
                return engines["other"]
            elif self._flushing or isinstance(clause, (Update, Delete)):
                return engines["leader"]
            else:
                return engines[random.choice(["follower1", "follower2"])]

The above :class:`.Session` class is plugged in using the ``class_``
argument to :class:`.sessionmaker`::

    Session = sessionmaker(class_=RoutingSession)

This approach can be combined with multiple :class:`_schema.MetaData` objects,
using an approach such as that of using the declarative ``__abstract__``
keyword, described at :ref:`declarative_abstract`.

.. seealso::

    `Django-style Database Routers in SQLAlchemy <https://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/>`_  - blog post on a more comprehensive example of :meth:`.Session.get_bind`

Horizontal Partitioning
-----------------------

Horizontal partitioning partitions the rows of a single table (or a set of
tables) across multiple databases.    The SQLAlchemy :class:`.Session`
contains support for this concept, however to use it fully requires that
:class:`.Session` and :class:`_query.Query` subclasses are used.  A basic version
of these subclasses are available in the :ref:`horizontal_sharding_toplevel`
ORM extension.   An example of use is at: :ref:`examples_sharding`.

.. _bulk_operations:

Bulk Operations
===============

.. tip::

   Bulk operations are essentially lower-functionality versions
   of the Unit of Work's facilities for emitting INSERT and UPDATE statements
   on primary key targeted rows.   These routines were added to suit some
   cases where many rows being inserted or updated could be run into the
   database without as much of the usual unit of work overhead, by bypassing
   a large portion of the functionality that the unit of work provides.

   SQLAlchemy 2.0 features new and improved bulk techniques with clarified
   behavior, better integration with ORM objects as well as INSERT/UPDATE/DELETE
   statements, and new capabilities.  They additionally repair some long lived
   performance issues that plagued both regular unit of work and "bulk" routines,
   most notably in the area of INSERT operations.

   For these reasons, the previous bulk methods move into legacy status, which
   is revised from the original plan that "bulk" features were to be deprecated
   entirely.

   When using the legacy 1.4 versions of these features, please read all
   caveats at :ref:`bulk_operations_caveats`, as they are not always obvious.

.. note:: Bulk INSERT and UPDATE should not be confused with the
   more common feature known as :ref:`orm_expression_update_delete`.   This
   feature allows a single UPDATE or DELETE statement with arbitrary WHERE
   criteria to be emitted.    There is also an option on some backends to
   use true "upsert" with the ORM, such as on PostgreSQL.  See the section
   :ref:`orm_dml_returning_objects` for examples.

.. seealso::

    :ref:`orm_expression_update_delete` - using straight multi-row UPDATE and DELETE statements
    in an ORM context.

    :ref:`orm_dml_returning_objects` - use UPDATE, INSERT or upsert operations to
    return ORM objects

.. versionadded:: 1.0.0

Bulk INSERT/per-row UPDATE operations on the :class:`.Session` include
:meth:`.Session.bulk_save_objects`, :meth:`.Session.bulk_insert_mappings`, and
:meth:`.Session.bulk_update_mappings`. The purpose of these methods is to
directly expose internal elements of the unit of work system, such that
facilities for emitting INSERT and UPDATE statements given dictionaries or
object states can be utilized alone, bypassing the normal unit of work
mechanics of state, relationship and attribute management.   The advantages to
this approach is strictly one of reduced Python overhead:

* The flush() process, including the survey of all objects, their state,
  their cascade status, the status of all objects associated with them
  via :func:`_orm.relationship`, and the topological sort of all operations to
  be performed is completely bypassed.  This reduces a great amount of
  Python overhead.

* The objects as given have no defined relationship to the target
  :class:`.Session`, even when the operation is complete, meaning there's no
  overhead in attaching them or managing their state in terms of the identity
  map or session.

* The :meth:`.Session.bulk_insert_mappings` and :meth:`.Session.bulk_update_mappings`
  methods accept lists of plain Python dictionaries, not objects; this further
  reduces a large amount of overhead associated with instantiating mapped
  objects and assigning state to them, which normally is also subject to
  expensive tracking of history on a per-attribute basis.

* The set of objects passed to all bulk methods are processed
  in the order they are received.   In the case of
  :meth:`.Session.bulk_save_objects`, when objects of different types are passed,
  the INSERT and UPDATE statements are necessarily broken up into per-type
  groups.  In order to reduce the number of batch INSERT or UPDATE statements
  passed to the DBAPI, ensure that the incoming list of objects
  are grouped by type.

* The process of fetching primary keys after an INSERT also is disabled by
  default.   When performed correctly, INSERT statements can now more readily
  be batched by the unit of work process into ``executemany()`` blocks, which
  perform vastly better than individual statement invocations.

* UPDATE statements can similarly be tailored such that all attributes
  are subject to the SET clause unconditionally, again making it much more
  likely that ``executemany()`` blocks can be used.

The performance behavior of the bulk routines should be studied using the
:ref:`examples_performance` example suite.  This is a series of example
scripts which illustrate Python call-counts across a variety of scenarios,
including bulk insert and update scenarios.

.. seealso::

  :ref:`examples_performance` - includes detailed examples of bulk operations
  contrasted against traditional Core and ORM methods, including performance
  metrics.

Usage
-----

The methods each work in the context of the :class:`.Session` object's
transaction, like any other::

    s = Session()
    objects = [User(name="u1"), User(name="u2"), User(name="u3")]
    s.bulk_save_objects(objects)
    s.commit()

For :meth:`.Session.bulk_insert_mappings`, and :meth:`.Session.bulk_update_mappings`,
dictionaries are passed::

    s.bulk_insert_mappings(User, [dict(name="u1"), dict(name="u2"), dict(name="u3")])
    s.commit()

.. seealso::

    :meth:`.Session.bulk_save_objects`

    :meth:`.Session.bulk_insert_mappings`

    :meth:`.Session.bulk_update_mappings`


Comparison to Core Insert / Update Constructs
---------------------------------------------

The bulk methods offer performance that under particular circumstances
can be close to that of using the core :class:`_expression.Insert` and
:class:`_expression.Update` constructs in an "executemany" context (for a description
of "executemany", see :ref:`tutorial_multiple_parameters` in the Core tutorial).
In order to achieve this, the
:paramref:`.Session.bulk_insert_mappings.return_defaults`
flag should be disabled so that rows can be batched together.   The example
suite in :ref:`examples_performance` should be carefully studied in order
to gain familiarity with how fast bulk performance can be achieved.

.. _bulk_operations_caveats:

ORM Compatibility / Caveats
----------------------------

.. warning::  Be sure to familiarize with these limitations before using the
   bulk routines.

The bulk insert / update methods lose a significant amount of functionality
versus traditional ORM use.   The following is a listing of features that
are **not available** when using these methods:

* persistence along :func:`_orm.relationship` linkages

* sorting of rows within order of dependency; rows are inserted or updated
  directly in the order in which they are passed to the methods

* Session-management on the given objects, including attachment to the
  session, identity map management.

* Functionality related to primary key mutation, ON UPDATE cascade -
  **mutation of primary key columns will not work** - as the original PK
  value of each row is not available, so the WHERE criteria cannot be
  generated.

* SQL expression inserts / updates (e.g. :ref:`flush_embedded_sql_expressions`) -
  having to evaluate these would prevent INSERT and UPDATE statements from
  being batched together in a straightforward way for a single executemany()
  call as they alter the SQL compilation of the statement itself.

* ORM events such as :meth:`.MapperEvents.before_insert`, etc.  The bulk
  session methods have no event support.

Features that **are available** include:

* INSERTs and UPDATEs of mapped objects

* Version identifier support

* Multi-table mappings, such as joined-inheritance - however, an object
  to be inserted across multiple tables either needs to have primary key
  identifiers fully populated ahead of time, else the
  :paramref:`.Session.bulk_save_objects.return_defaults` flag must be used,
  which will greatly reduce the performance benefits