File: migration_from_1_to_2.rst

package info (click to toggle)
django-haystack 3.3.0-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 2,504 kB
  • sloc: python: 23,475; xml: 1,708; sh: 74; makefile: 71
file content (285 lines) | stat: -rw-r--r-- 10,121 bytes parent folder | download | duplicates (7)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
.. _ref-migration_from_1_to_2:

===========================================
Migrating From Haystack 1.X to Haystack 2.X
===========================================

Haystack introduced several backward-incompatible changes in the process of
moving from the 1.X series to the 2.X series. These were done to clean up the
API, to support new features & to clean up problems in 1.X. At a high level,
they consisted of:

* The removal of ``SearchSite`` & ``haystack.site``.
* The removal of ``handle_registrations`` & ``autodiscover``.
* The addition of multiple index support.
* The addition of ``SignalProcessors`` & the removal of ``RealTimeSearchIndex``.
* The removal/renaming of various settings.

This guide will help you make the changes needed to be compatible with Haystack
2.X.


Settings
========

Most prominently, the old way of specifying a backend & its settings has changed
to support the multiple index feature. A complete Haystack 1.X example might
look like::

    HAYSTACK_SEARCH_ENGINE = 'solr'
    HAYSTACK_SOLR_URL = 'http://localhost:9001/solr/default'
    HAYSTACK_SOLR_TIMEOUT = 60 * 5
    HAYSTACK_INCLUDE_SPELLING = True
    HAYSTACK_BATCH_SIZE = 100

    # Or...
    HAYSTACK_SEARCH_ENGINE = 'whoosh'
    HAYSTACK_WHOOSH_PATH = '/home/search/whoosh_index'
    HAYSTACK_WHOOSH_STORAGE = 'file'
    HAYSTACK_WHOOSH_POST_LIMIT = 128 * 1024 * 1024
    HAYSTACK_INCLUDE_SPELLING = True
    HAYSTACK_BATCH_SIZE = 100

    # Or...
    HAYSTACK_SEARCH_ENGINE = 'xapian'
    HAYSTACK_XAPIAN_PATH = '/home/search/xapian_index'
    HAYSTACK_INCLUDE_SPELLING = True
    HAYSTACK_BATCH_SIZE = 100

In Haystack 2.X, you can now supply as many backends as you like, so all of the
above settings can now be active at the same time. A translated set of settings
would look like::

    HAYSTACK_CONNECTIONS = {
        'default': {
            'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
            'URL': 'http://localhost:9001/solr/default',
            'TIMEOUT': 60 * 5,
            'INCLUDE_SPELLING': True,
            'BATCH_SIZE': 100,
        },
        'autocomplete': {
            'ENGINE': 'haystack.backends.whoosh_backend.WhooshEngine',
            'PATH': '/home/search/whoosh_index',
            'STORAGE': 'file',
            'POST_LIMIT': 128 * 1024 * 1024,
            'INCLUDE_SPELLING': True,
            'BATCH_SIZE': 100,
        },
        'slave': {
            'ENGINE': 'xapian_backend.XapianEngine',
            'PATH': '/home/search/xapian_index',
            'INCLUDE_SPELLING': True,
            'BATCH_SIZE': 100,
        },
    }

You are required to have at least one connection listed within
``HAYSTACK_CONNECTIONS``, it must be named ``default`` & it must have a valid
``ENGINE`` within it. Bare minimum looks like::

    HAYSTACK_CONNECTIONS = {
        'default': {
            'ENGINE': 'haystack.backends.simple_backend.SimpleEngine'
        }
    }

The key for each backend is an identifier you use to describe the backend within
your app. You should refer to the :ref:`ref-multiple_index` documentation for
more information on using the new multiple indexes & routing features.

Also note that the ``ENGINE`` setting has changed from a lowercase "short name"
of the engine to a full path to a new ``Engine`` class within the backend.
Available options are:

* ``haystack.backends.solr_backend.SolrEngine``
* ``haystack.backends.whoosh_backend.WhooshEngine``
* ``haystack.backends.simple_backend.SimpleEngine``

Additionally, the following settings were outright removed & will generate
an exception if found:

* ``HAYSTACK_SITECONF`` - Remove this setting & the file it pointed to.
* ``HAYSTACK_ENABLE_REGISTRATIONS``
* ``HAYSTACK_INCLUDE_SPELLING``


Backends
========

The ``dummy`` backend was outright removed from Haystack, as it served very
little use after the ``simple`` (pure-ORM-powered) backend was introduced.

If you wrote a custom backend, please refer to the "Custom Backends" section
below.


Indexes
=======

The other major changes affect the ``SearchIndex`` class. As the concept of
``haystack.site`` & ``SearchSite`` are gone, you'll need to modify your indexes.

A Haystack 1.X index might've looked like::

    import datetime
    from haystack.indexes import *
    from haystack import site
    from myapp.models import Note


    class NoteIndex(SearchIndex):
        text = CharField(document=True, use_template=True)
        author = CharField(model_attr='user')
        pub_date = DateTimeField(model_attr='pub_date')

        def get_queryset(self):
            """Used when the entire index for model is updated."""
            return Note.objects.filter(pub_date__lte=datetime.datetime.now())


    site.register(Note, NoteIndex)

A converted Haystack 2.X index should look like::

    import datetime
    from haystack import indexes
    from myapp.models import Note


    class NoteIndex(indexes.SearchIndex, indexes.Indexable):
        text = indexes.CharField(document=True, use_template=True)
        author = indexes.CharField(model_attr='user')
        pub_date = indexes.DateTimeField(model_attr='pub_date')

        def get_model(self):
            return Note

        def index_queryset(self, using=None):
            """Used when the entire index for model is updated."""
            return self.get_model().objects.filter(pub_date__lte=datetime.datetime.now())

Note the import on ``site`` & the registration statements are gone. Newly added
are is the ``NoteIndex.get_model`` method. This is a **required** method &
should simply return the ``Model`` class the index is for.

There's also a new, additional class added to the ``class`` definition. The
``indexes.Indexable`` class is a simple mixin that serves to identify the
classes Haystack should automatically discover & use. If you have a custom
base class (say ``QueuedSearchIndex``) that other indexes inherit from, simply
leave the ``indexes.Indexable`` off that declaration & Haystack won't try to
use it.

Additionally, the name of the ``document=True`` field is now enforced to be
``text`` across all indexes. If you need it named something else, you should
set the ``HAYSTACK_DOCUMENT_FIELD`` setting. For example::

    HAYSTACK_DOCUMENT_FIELD = 'pink_polka_dot'

Finally, the ``index_queryset`` method should supplant the ``get_queryset``
method. This was present in the Haystack 1.2.X series (with a deprecation warning
in 1.2.4+) but has been removed in Haystack v2.

Finally, if you were unregistering other indexes before, you should make use of
the new ``EXCLUDED_INDEXES`` setting available in each backend's settings. It
should be a list of strings that contain the Python import path to the indexes
that should not be loaded & used. For example::

    HAYSTACK_CONNECTIONS = {
        'default': {
            'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
            'URL': 'http://localhost:9001/solr/default',
            'EXCLUDED_INDEXES': [
                # Imagine that these indexes exist. They don't.
                'django.contrib.auth.search_indexes.UserIndex',
                'third_party_blog_app.search_indexes.EntryIndex',
            ]
        }
    }

This allows for reliable swapping of the index that handles a model without
relying on correct import order.


Removal of ``RealTimeSearchIndex``
==================================

Use of the ``haystack.indexes.RealTimeSearchIndex`` is no longer valid. It has
been removed in favor of ``RealtimeSignalProcessor``. To migrate, first change
the inheritance of all your ``RealTimeSearchIndex`` subclasses to use
``SearchIndex`` instead::

    # Old.
    class MySearchIndex(indexes.RealTimeSearchIndex, indexes.Indexable):
        # ...


    # New.
    class MySearchIndex(indexes.SearchIndex, indexes.Indexable):
        # ...

Then update your settings to enable use of the ``RealtimeSignalProcessor``::

    HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'


Done!
=====

For most basic uses of Haystack, this is all that is necessary to work with
Haystack 2.X. You should rebuild your index if needed & test your new setup.


Advanced Uses
=============

Swapping Backend
----------------

If you were manually swapping the ``SearchQuery`` or ``SearchBackend`` being
used by ``SearchQuerySet`` in the past, it's now preferable to simply setup
another connection & use the ``SearchQuerySet.using`` method to select that
connection instead.

Also, if you were manually instantiating ``SearchBackend`` or ``SearchQuery``,
it's now preferable to rely on the connection's engine to return the right
thing. For example::

    from haystack import connections
    backend = connections['default'].get_backend()
    query = connections['default'].get_query()


Custom Backends
---------------

If you had written a custom ``SearchBackend`` and/or custom ``SearchQuery``,
there's a little more work needed to be Haystack 2.X compatible.

You should, but don't have to, rename your ``SearchBackend`` & ``SearchQuery``
classes to be more descriptive/less collide-y. For example,
``solr_backend.SearchBackend`` became ``solr_backend.SolrSearchBackend``. This
prevents non-namespaced imports from stomping on each other.

You need to add a new class to your backend, subclassing ``BaseEngine``. This
allows specifying what ``backend`` & ``query`` should be used on a connection
with less duplication/naming trickery. It goes at the bottom of the file (so
that the classes are defined above it) and should look like::

    from haystack.backends import BaseEngine
    from haystack.backends.solr_backend import SolrSearchQuery

    # Code then...

    class MyCustomSolrEngine(BaseEngine):
        # Use our custom backend.
        backend = MySolrBackend
        # Use the built-in Solr query.
        query = SolrSearchQuery

Your ``HAYSTACK_CONNECTIONS['default']['ENGINE']`` should then point to the
full Python import path to your new ``BaseEngine`` subclass.

Finally, you will likely have to adjust the ``SearchBackend.__init__`` &
``SearchQuery.__init__``, as they have changed significantly. Please refer to
the commits for those backends.