====================
Import workflow
====================

This document describes the import data workflow in detail, with hooks that enable
customization of the import process.

Methods highlighted in yellow in the sequence diagram indicate public methods which can
be overridden.

.. image:: _static/images/import_workflow.svg
  :alt: Import workflow sequence diagram
  :scale: 75%

The :meth:`~import_export.resources.Resource.import_data` method of
:class:`~import_export.resources.Resource` is responsible for importing data
from a given dataset.  Refer to the method documentation for parameters to this method.

This is what happens when the method is invoked:

#. First, a new :class:`~import_export.results.Result` instance, which holds
   errors and other information gathered during the import, is initialized.

   Then, an :class:`~import_export.instance_loaders.BaseInstanceLoader` responsible
   for loading existing instances is initialized. A different
   :class:`~import_export.instance_loaders.BaseInstanceLoader` can be specified
   via :class:`~import_export.options.ResourceOptions`'s
   ``instance_loader_class`` attribute. A
   :class:`~import_export.instance_loaders.CachedInstanceLoader` can be used to
   reduce the number of database queries. See the
   :mod:`~import_export.instance_loaders` module for available implementations.

#. The :meth:`~import_export.resources.Resource.before_import` hook is called.
   By implementing this method in your resource, you can customize the import process.

#. Each row of the to-be-imported dataset is processed according to the
   following steps:

   * The :meth:`~import_export.resources.Resource.before_import_row` hook is
     called to allow for row data to be modified before it is imported.

   * :meth:`~import_export.resources.Resource.get_or_init_instance` is called
     with the current :class:`~import_export.instance_loaders.BaseInstanceLoader`
     and the current row of the dataset, returning an object and a Boolean
     declaring whether the object is newly created or not.

     If no object can be found for the current row,
     :meth:`~import_export.resources.Resource.init_instance` is invoked to
     initialize an object.

     As always, you can override the implementation of
     :meth:`~import_export.resources.Resource.init_instance` to customize
     how the new object is created (e.g. to set default values).

   * :meth:`~import_export.resources.Resource.for_delete` is called to
     determine if the passed ``instance`` should be deleted. If so, the
     import process for the current row stops at this point.

   * If the instance was not deleted in the previous step,
     :meth:`~import_export.resources.Resource.import_row` is called with
     ``instance`` as the current object instance and ``row`` as the current row.

     :meth:`~import_export.resources.Resource.import_field` is called for
     each field in :class:`~import_export.resources.Resource`, skipping
     many-to-many fields. Many-to-many fields are skipped because they require
     instances to have a primary key, so their assignment is postponed until
     the object has been saved.

     :meth:`~import_export.resources.Resource.import_field` in turn calls
     :meth:`~import_export.fields.Field.save`, if ``Field.attribute`` is set
     and ``Field.column_name`` exists in the given row.

   * It is then determined whether the newly imported object differs from
     the already present object, and therefore whether the given row should
     be skipped. This is handled by calling
     :meth:`~import_export.resources.Resource.skip_row` with ``original`` as
     the original object and ``instance`` as the current object from the dataset.

     If the current row is to be skipped, ``row_result.import_type`` is set
     to ``IMPORT_TYPE_SKIP``.

   * If the current row is not to be skipped,
     :meth:`~import_export.resources.Resource.save_instance` is called and
     actually saves the instance when ``dry_run`` is not set.

     There are two hook methods (that by default do nothing) giving you the
     option to customize the import process:

       * :meth:`~import_export.resources.Resource.before_save_instance`
       * :meth:`~import_export.resources.Resource.after_save_instance`

   * :meth:`~import_export.resources.Resource.save_m2m` is called to save
     many-to-many fields.

   * :class:`~import_export.results.RowResult` is assigned a diff
     between the original and the imported object fields, as well as an
     ``import_type`` attribute which states whether the row is new, updated,
     skipped or deleted.

     If an exception is raised during row processing and
     :meth:`~import_export.resources.Resource.import_row` was invoked with
     ``raise_errors=False`` (which is the default), the traceback
     is appended to :class:`~import_export.results.RowResult` as well.

     If either the row was not skipped or the
     :class:`~import_export.resources.Resource` is configured to report
     skipped rows, the :class:`~import_export.results.RowResult` is appended
     to the :class:`~import_export.results.Result`.

   * The :meth:`~import_export.resources.Resource.after_import_row` hook is called.

#. The :class:`~import_export.results.Result` is returned.
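
The order of the per-row steps above can be sketched in plain Python. This is a
simplified model for orientation only, not the library's code: ``ToyResource``,
``ToyRowResult``, the ``report_skipped`` flag, the dict used as a "database" and
the string import types are all hypothetical stand-ins, and the hook signatures
are reduced to the minimum needed to show the control flow.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import Optional


    @dataclass
    class ToyRowResult:
        # Hypothetical stand-in for a per-row result record.
        import_type: str = ""
        diff: dict = field(default_factory=dict)
        error: Optional[str] = None


    class ToyResource:
        """Simplified model of the per-row flow (assumption, not library code)."""

        report_skipped = False  # hypothetical flag for this sketch

        def __init__(self, store):
            self.store = store  # dict acting as the "database", keyed by id

        # Hooks that a subclass could override; they do nothing by default.
        def before_import_row(self, row):
            pass

        def after_import_row(self, row, row_result):
            pass

        def for_delete(self, row, instance):
            return row.get("delete") == "1"

        def skip_row(self, instance, original):
            return instance == original

        def get_or_init_instance(self, row):
            existing = self.store.get(row["id"])
            if existing is not None:
                return dict(existing), False  # copy of the stored object
            return {"id": row["id"]}, True  # freshly initialized object

        def import_row(self, row, dry_run=False):
            result = ToyRowResult()
            try:
                self.before_import_row(row)
                instance, new = self.get_or_init_instance(row)
                original = dict(instance)
                if self.for_delete(row, instance):
                    result.import_type = "delete"
                    if not dry_run:
                        self.store.pop(row["id"], None)
                else:
                    # Copy row values onto the instance, field by field.
                    for key, value in row.items():
                        if key != "delete":
                            instance[key] = value
                    if not new and self.skip_row(instance, original):
                        result.import_type = "skip"
                    else:
                        result.import_type = "new" if new else "update"
                        if not dry_run:
                            self.store[row["id"]] = instance
                        result.diff = {
                            key: (original.get(key), value)
                            for key, value in instance.items()
                            if original.get(key) != value
                        }
                self.after_import_row(row, result)
            except Exception as exc:
                # Errors are recorded on the row result instead of propagating.
                result.import_type = "error"
                result.error = repr(exc)
            return result

        def import_data(self, rows, dry_run=False):
            results = []
            for row in rows:
                row_result = self.import_row(row, dry_run=dry_run)
                if row_result.import_type != "skip" or self.report_skipped:
                    results.append(row_result)
            return results


    store = {1: {"id": 1, "name": "Ann"}}
    rows = [
        {"id": 1, "name": "Ann"},   # identical to the stored object
        {"id": 2, "name": "Bob"},   # no stored object with this id
        {"id": 1, "name": "Anna"},  # differs from the stored object
    ]
    results = ToyResource(store).import_data(rows)
    print([r.import_type for r in results])

In this model the first row is skipped and therefore left out of the results,
because ``report_skipped`` is ``False``; the other two rows are reported as new
and updated.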

Transaction support
-------------------

If transaction support is enabled, the whole import process is wrapped inside a
transaction, which is then either committed or rolled back.
All methods called from inside :meth:`~import_export.resources.Resource.import_data`
(create / delete / update) receive ``False`` for the ``dry_run`` argument.
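
The general pattern behind this, performing real writes inside a transaction and
discarding them by rolling back, can be illustrated with the standard library's
``sqlite3`` module. This is only an analogy under stated assumptions, not how
the library is implemented: the ``import_rows`` function and the ``book`` table
are hypothetical names made up for the example.

.. code-block:: python

    import sqlite3


    def import_rows(conn, rows, dry_run=False):
        """Insert rows in one transaction; roll back on error or dry run."""
        conn.execute("BEGIN")
        try:
            for row in rows:
                # The write itself is always real, even for a dry run.
                conn.execute("INSERT INTO book (id, name) VALUES (?, ?)", row)
            # Row count as seen from inside the transaction.
            count = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
        except sqlite3.Error:
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            raise
        # A dry run discards the writes; a normal run keeps them.
        conn.execute("ROLLBACK" if dry_run else "COMMIT")
        return count


    # isolation_level=None leaves transaction control to the explicit
    # BEGIN / COMMIT / ROLLBACK statements above.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT)")

    rows = [(1, "A"), (2, "B")]
    seen_in_dry_run = import_rows(conn, rows, dry_run=True)
    after_dry_run = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
    import_rows(conn, rows)
    after_real_run = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
    print(seen_in_dry_run, after_dry_run, after_real_run)

During the dry run the inserted rows are visible inside the transaction but are
gone after the rollback; only the second call leaves rows in the table.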

.. _Dataset: https://tablib.readthedocs.io/en/stable/api/#dataset-object