File: mp_tree.rst

package info (click to toggle)
python-django-treebeard 4.7.1-1
  • links: PTS, VCS
  • area: main
  • in suites: sid, trixie
  • size: 988 kB
  • sloc: python: 5,317; javascript: 258; makefile: 180; sh: 6
file content (255 lines) | stat: -rw-r--r-- 8,392 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
Materialized Path trees
=======================

.. module:: treebeard.mp_tree

This is an efficient implementation of Materialized Path
trees for Django, as described by `Vadim Tropashko`_ in `SQL Design
Patterns`_. Materialized Path is probably the fastest way of working with
trees in SQL without the need of extra work in the database, like Oracle's
``CONNECT BY`` or sprocs and triggers for nested intervals.

In a materialized path approach, every node in the tree will have a
:attr:`~MP_Node.path` attribute, where the full path from the root
to the node will be stored. This has the advantage of needing very simple
and fast queries, at the risk of inconsistency because of the
denormalization of ``parent``/``child`` foreign keys. This can be prevented
with transactions.

``django-treebeard`` uses a particular approach: every step in the path has
a fixed width and has no separators. This makes queries predictable and
faster at the cost of using more characters to store a step. To address
this problem, every step number is encoded.

Also, two extra fields are stored in every node:
:attr:`~MP_Node.depth` and :attr:`~MP_Node.numchild`.
This makes the read operations faster, at the cost of a little more
maintenance on tree updates/inserts/deletes. Don't worry, even with these
extra steps, materialized path is more efficient than other approaches.

.. warning::

   As with all tree implementations, please be aware of the
   :doc:`caveats`.

.. note::

   The materialized path approach makes heavy use of ``LIKE`` in your
   database, with clauses like ``WHERE path LIKE '002003%'``. If you think
   that ``LIKE`` is too slow, you're right, but in this case the
   :attr:`~MP_Node.path` field is indexed in the database, and all
   ``LIKE`` clauses that don't **start** with a ``%`` character will use
   the index. This is what makes the materialized path approach so fast.

.. inheritance-diagram:: MP_Node
.. autoclass:: MP_Node
  :show-inheritance:

  .. warning::

     Do not change the values of :attr:`path`, :attr:`depth` or
     :attr:`numchild` directly: use one of the included methods instead.
     Consider these values *read-only*.

  .. warning::

     Do not change the values of the :attr:`steplen`, :attr:`alphabet` or
     :attr:`node_order_by` after saving your first object. Doing so will
     corrupt the tree.

  .. warning::

     If you need to define your own
     :py:class:`~django.db.models.Manager` class,
     you'll need to subclass
     :py:class:`~MP_NodeManager`.

     Also, if in your manager you need to change the default
     queryset handler, you'll need to subclass
     :py:class:`~MP_NodeQuerySet`.


  Example:

  .. code-block:: python

     class SortedNode(MP_Node):
        node_order_by = ['numval', 'strval']

        numval = models.IntegerField()
        strval = models.CharField(max_length=255)

  Read the API reference of :class:`treebeard.models.Node` for info on methods
  available in this class, or read the following section for methods with
  particular arguments or exceptions.

  .. attribute:: steplen

     Attribute that defines the length of each step in the :attr:`path` of
     a node.  The default value of *4* allows a maximum of
     *1679615* children per node. Increase this value if you plan to store
     large trees (a ``steplen`` of *5* allows more than *60M* children per
     node). Note that increasing this value, while increasing the number of
     children per node, will decrease the max :attr:`depth` of the tree (by
     default: *63*). To increase the max :attr:`depth`, increase the
     max_length attribute of the :attr:`path` field in your model.

  .. attribute:: alphabet

     Attribute: the alphabet that will be used in base conversions
     when encoding the path steps into strings. The default value,
     ``0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ`` is the most optimal possible
     value that is portable between the supported databases (which means:
     their default collation will order the :attr:`path` field correctly).

     .. note::

        In case you know what you are doing, there is a test that is
        disabled by default that can tell you the optimal default alphabet
        in your enviroment. To run the test you must enable the
        :envvar:`TREEBEARD_TEST_ALPHABET` enviroment variable:

        .. code-block:: console

          $ TREEBEARD_TEST_ALPHABET=1 py.test -k test_alphabet

        In OS X Mavericks, good readable values for the three supported
        databases in their *default* configuration:

         ================ ================ ====
         Database         Optimal Alphabet Base
         ================ ================ ====
         MySQL 5.6.17     0-9A-Z           36
         PostgreSQL 9.3.4 0-9A-Za-z        62
         Sqlite3          0-9A-Za-z        62
         ================ ================ ====

        The default value is MySQL's since it will work in all DBs,
        but when working with a better database, changing the
        :attr:`alphabet` value is recommended in order to increase the
        density of the paths.

        For an even better approach, change the collation of the
        :attr:`path` column in the database to handle raw ASCII, and
        use the printable ASCII characters (0x20 to 0x7E) as the
        :attr:`alphabet`.


  .. attribute:: node_order_by

     Attribute: a list of model fields that will be used for node
     ordering. When enabled, all tree operations will assume this ordering.

     Example:

     .. code-block:: python

       node_order_by = ['field1', 'field2', 'field3']

  .. attribute:: path

     ``CharField``, stores the full materialized path for each node. The
     default value of it's max_length, *255*, is the max efficient and
     portable value for a ``varchar``. Increase it to allow deeper trees (max
     depth by default: *63*)

     .. note::

       `django-treebeard` uses Django's abstract model inheritance, so
       to change the ``max_length`` value of the path in your model, you
       have to redeclare the path field in your model:

       .. code-block:: python

         class MyNodeModel(MP_Node):
             path = models.CharField(max_length=1024, unique=True)

     .. note::

       For performance, and if your database allows it, you can safely
       define the path column as ASCII (not utf-8/unicode/iso8859-1/etc) to
       keep the index smaller (and faster). Also note that some databases
       (mysql) have a small index size limit. InnoDB for instance has a
       limit of 765 bytes per index, so that would be the limit if your path
       is ASCII encoded. If your path column in InnoDB is using unicode,
       the index limit will be 255 characters since in MySQL's indexes,
       unicode means 3 bytes per character.

     .. note::

        ``django-treebeard`` uses `numconv`_ for path encoding.


  .. attribute:: depth

     ``PositiveIntegerField``, depth of a node in the tree. A root node
     has a depth of *1*.

  .. attribute:: numchild

     ``PositiveIntegerField``, the number of children of the node.

  .. automethod:: add_root

     See: :meth:`treebeard.models.Node.add_root`

  .. automethod:: add_child

     See: :meth:`treebeard.models.Node.add_child`

  .. automethod:: add_sibling

     See: :meth:`treebeard.models.Node.add_sibling`

  .. automethod:: move

     See: :meth:`treebeard.models.Node.move`

  .. automethod:: get_tree

     See: :meth:`treebeard.models.Node.get_tree`

     .. note::

        This method returns a queryset.

  .. automethod:: find_problems

     .. note::

        A node won't appear in more than one list, even when it exhibits
        more than one problem. This method stops checking a node when it
        finds a problem and continues to the next node.

     .. note::

        Problems 1, 2 and 3 can't be solved automatically.

     Example:

     .. code-block:: python

        MyNodeModel.find_problems()

  .. automethod:: fix_tree

     Example:

     .. code-block:: python

        MyNodeModel.fix_tree()



.. autoclass:: MP_NodeManager
  :show-inheritance:

.. autoclass:: MP_NodeQuerySet
  :show-inheritance:



.. _`Vadim Tropashko`: http://vadimtropashko.wordpress.com/
.. _`Sql Design Patterns`:
   http://www.rampant-books.com/book_2006_1_sql_coding_styles.htm
.. _numconv: https://tabo.pe/projects/numconv/