Metadata-Version: 2.1
Name: pysolr
Version: 3.8.1
Summary: Lightweight python wrapper for Apache Solr.
Home-page: https://github.com/django-haystack/pysolr/
Author: Daniel Lindsley
Author-email: daniel@toastdriven.com
License: BSD
Description: ======
pysolr
======
``pysolr`` is a lightweight Python wrapper for `Apache Solr`_. It provides an
interface that queries the server and returns results based on the query.
.. _`Apache Solr`: http://lucene.apache.org/solr/
Status
======

.. image:: https://secure.travis-ci.org/django-haystack/pysolr.png
   :target: https://secure.travis-ci.org/django-haystack/pysolr

`Changelog <https://github.com/django-haystack/pysolr/blob/master/CHANGELOG.rst>`_

Features
========
* Basic operations such as selecting, updating & deleting.
* Index optimization.
* `"More Like This" <http://wiki.apache.org/solr/MoreLikeThis>`_ support (if set up in Solr).
* `Spelling correction <http://wiki.apache.org/solr/SpellCheckComponent>`_ (if set up in Solr).
* Timeout support.
* SolrCloud awareness.
Requirements
============
* Python 2.7 - 3.6
* Requests 2.9.1+
* **Optional** - ``simplejson``
* **Optional** - ``kazoo`` for SolrCloud mode
Installation
============
pysolr is on PyPI:

.. code-block:: console

    $ pip install pysolr

To install directly from a checkout of the repository instead, run ``python setup.py install``, or drop the ``pysolr.py`` file anywhere on your ``PYTHONPATH``.
Usage
=====
Basic usage looks like:

.. code-block:: python

    # If on Python 2.X
    from __future__ import print_function

    import pysolr

    # Set up a Solr instance. The timeout is optional. ``auth`` is optional
    # too; when needed, pass an authentication object as ``auth=...``
    # (see "Custom Authentication").
    solr = pysolr.Solr('http://localhost:8983/solr/', timeout=10)

    # How you'd index data.
    solr.add([
        {
            "id": "doc_1",
            "title": "A test document",
        },
        {
            "id": "doc_2",
            "title": "The Banana: Tasty or Dangerous?",
            "_doc": [
                { "id": "child_doc_1", "title": "peel" },
                { "id": "child_doc_2", "title": "seed" },
            ]
        },
    ])

    # Note that the add method has commit=True by default, so this is
    # immediately committed to your index.

    # You can index a parent/child document relationship by
    # associating a list of child documents with the special key '_doc'. This
    # is helpful for queries that join together conditions on children and parent
    # documents.

    # Later, searching is easy. In the simple case, just a plain Lucene-style
    # query is fine.
    results = solr.search('bananas')

    # The ``Results`` object stores the total number of results found, the
    # top ten most relevant results (by default) and any additional data
    # such as facets, highlighting and spelling suggestions.
    print("Saw {0} result(s).".format(len(results)))

    # Just loop over it to access the results.
    for result in results:
        print("The title is '{0}'.".format(result['title']))

    # For a more advanced query, say involving highlighting, you can pass
    # additional options to Solr.
    results = solr.search('bananas', **{
        'hl': 'true',
        'hl.fragsize': 10,
    })

    # You can also perform More Like This searches, if your Solr is configured
    # correctly.
    similar = solr.more_like_this(q='id:doc_2', mltfl='text')

    # Finally, you can delete individual documents...
    solr.delete(id='doc_1')

    # ...or several documents in one batch...
    solr.delete(id=['doc_1', 'doc_2'])

    # ...or all documents.
    solr.delete(q='*:*')

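
The difference between the total number of matches and the documents actually returned can be sketched as follows. This is a minimal illustration, not part of pysolr: ``summarize`` is a hypothetical helper, and it assumes the results object exposes ``hits`` (total matches) and ``docs`` (the returned page), as pysolr's ``Results`` class does. ``FakeResults`` is a stand-in so that the snippet runs without a Solr server.

.. code-block:: python

    def summarize(results, field="title"):
        """Return (total hits, documents on this page, values of ``field``)."""
        values = [doc[field] for doc in results.docs if field in doc]
        return results.hits, len(results.docs), values


    class FakeResults(object):
        """Stand-in for pysolr.Results, for illustration only."""

        def __init__(self, docs, hits):
            self.docs = docs
            self.hits = hits


    page = FakeResults(
        docs=[{"id": "doc_2", "title": "The Banana: Tasty or Dangerous?"}],
        hits=42,
    )
    total, returned, titles = summarize(page)
    print("Showing {0} of {1} result(s): {2}".format(returned, total, titles))

With a real client, the object returned by ``solr.search(...)`` would take the place of ``page``.
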

.. code-block:: python

    # For SolrCloud mode, initialize your Solr like this.
    # As with ``pysolr.Solr``, ``auth`` is optional; pass an authentication
    # object as ``auth=...`` when needed.
    zookeeper = pysolr.ZooKeeper("zkhost1:2181,zkhost2:2181,zkhost3:2181")
    solr = pysolr.SolrCloud(zookeeper, "collection1")

Multicore Index
~~~~~~~~~~~~~~~
Simply point the URL to the index core:

.. code-block:: python

    # Setup a Solr instance. The timeout is optional.
    solr = pysolr.Solr('http://localhost:8983/solr/core_0/', timeout=10)

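
When several cores live under one base URL, the per-core URLs can be derived rather than hard-coded. The following is a minimal sketch; ``core_url`` is a hypothetical helper (not part of pysolr) and the core names are examples only.

.. code-block:: python

    def core_url(base_url, core):
        """Join a Solr base URL and a core name, with a single trailing slash."""
        return "{0}/{1}/".format(base_url.rstrip("/"), core.strip("/"))


    base = "http://localhost:8983/solr/"
    urls = [core_url(base, name) for name in ("core_0", "core_1")]
    print(urls)

    # With pysolr installed, each URL would get its own client, for example:
    #     solr = pysolr.Solr(core_url(base, "core_0"), timeout=10)
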
Custom Request Handlers
~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Setup a Solr instance. The trailing slash is optional.
    solr = pysolr.Solr('http://localhost:8983/solr/core_0/', search_handler='/autocomplete', use_qt_param=False)

If ``use_qt_param`` is ``True``, the handler name must match exactly what is configured in ``solrconfig.xml``,
including the leading slash if any (although Solr itself does not require a leading slash when the ``qt``
parameter is used). If ``use_qt_param`` is ``False`` (the default), the leading and trailing slashes can be
omitted.

If ``search_handler`` is not specified, pysolr will default to ``/select``.

The handlers for MoreLikeThis, Update, Terms etc. all default to the values set in the ``solrconfig.xml`` that Solr
ships with: ``mlt``, ``update``, ``terms`` etc. The specific methods of pysolr's ``Solr`` class (like
``more_like_this``, ``suggest_terms`` etc.) allow for a kwarg ``handler`` to override that value. This includes the
``search`` method. Setting a handler in ``search`` explicitly overrides the ``search_handler`` setting (if any).

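
The per-call override described above can be sketched like this. It is an illustration under assumptions: ``find_similar`` is a hypothetical wrapper, ``mlt_titles`` is a made-up handler name that would have to exist in your ``solrconfig.xml``, and ``RecordingSolr`` is a stand-in that records calls so the snippet runs without a Solr server.

.. code-block:: python

    def find_similar(solr, doc_id, fields="text", handler="mlt"):
        """Run a More Like This query for one document via the given handler."""
        query = "id:{0}".format(doc_id)
        return solr.more_like_this(q=query, mltfl=fields, handler=handler)


    class RecordingSolr(object):
        """Stand-in client that records the call instead of contacting Solr."""

        def __init__(self):
            self.calls = []

        def more_like_this(self, q, mltfl, handler="mlt", **kwargs):
            self.calls.append((q, mltfl, handler))
            return []


    client = RecordingSolr()
    find_similar(client, "doc_2")                        # default handler
    find_similar(client, "doc_2", handler="mlt_titles")  # hypothetical custom handler
    print(client.calls)
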
Custom Authentication
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Set up a Solr instance in a Kerberized environment
    from requests_kerberos import HTTPKerberosAuth, OPTIONAL
    kerberos_auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL, sanitize_mutual_error_response=False)

    solr = pysolr.Solr('http://localhost:8983/solr/', auth=kerberos_auth)

.. code-block:: python

    # Set up a SolrCloud instance in a Kerberized environment
    from requests_kerberos import HTTPKerberosAuth, OPTIONAL
    kerberos_auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL, sanitize_mutual_error_response=False)

    zookeeper = pysolr.ZooKeeper("zkhost1:2181/solr, zkhost2:2181,...,zkhostN:2181")
    solr = pysolr.SolrCloud(zookeeper, "collection", auth=kerberos_auth)

If your Solr servers run off https
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Set up a Solr instance in an https environment
    solr = pysolr.Solr('https://localhost:8983/solr/', verify='path/to/cert.pem')

.. code-block:: python

    # Set up a SolrCloud instance in an https environment
    zookeeper = pysolr.ZooKeeper("zkhost1:2181/solr, zkhost2:2181,...,zkhostN:2181")
    solr = pysolr.SolrCloud(zookeeper, "collection", verify='path/to/cert.pem')

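
The optional ``timeout``, ``auth`` and ``verify`` arguments shown in the sections above can be collected in one place. The sketch below is an assumption-laden illustration: ``connection_kwargs`` and the ``SOLR_*`` setting names are hypothetical, and it assumes that pysolr hands ``auth`` to Requests unchanged, in which case a ``(user, password)`` tuple is treated as HTTP Basic authentication.

.. code-block:: python

    import os


    def connection_kwargs(env=None):
        """Build optional keyword arguments for pysolr.Solr from settings."""
        env = os.environ if env is None else env
        kwargs = {"timeout": int(env.get("SOLR_TIMEOUT", "10"))}

        user, password = env.get("SOLR_USER"), env.get("SOLR_PASSWORD")
        if user and password:
            # Requests interprets a (user, password) tuple as HTTP Basic auth.
            kwargs["auth"] = (user, password)

        ca_bundle = env.get("SOLR_CA_BUNDLE")
        if ca_bundle:
            # Path to the certificate used to verify the https connection.
            kwargs["verify"] = ca_bundle
        return kwargs


    settings = {
        "SOLR_USER": "indexer",
        "SOLR_PASSWORD": "s3cret",
        "SOLR_CA_BUNDLE": "path/to/cert.pem",
    }
    kwargs = connection_kwargs(settings)
    print(kwargs)

    # With pysolr installed:
    #     solr = pysolr.Solr('https://localhost:8983/solr/', **kwargs)
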
Custom Commit Policy
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Setup a Solr instance. The trailing slash is optional.
    # All requests to Solr will result in a commit.
    solr = pysolr.Solr('http://localhost:8983/solr/core_0/', search_handler='/autocomplete', always_commit=True)

``always_commit`` tells the ``Solr`` object whether or not to commit by default on any Solr request.
Be sure to set it to ``True`` if you are upgrading from a version in which the default policy was to always commit.

Functions like ``add`` and ``delete`` still provide a way to override the default by passing the ``commit`` kwarg.

It is generally good practice to limit the number of commits to Solr.
Excessive commits risk opening too many searchers or using too many system resources.

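
One way to limit commits is to add documents in batches with ``commit=False`` and commit once at the end. The following is a minimal sketch rather than a recommended implementation: ``index_in_batches`` and the batch size are hypothetical, and ``CountingSolr`` is a stand-in exposing only ``add`` and ``commit`` so that the snippet runs without a Solr server.

.. code-block:: python

    def index_in_batches(solr, docs, batch_size=500):
        """Add docs in batches without committing, then commit once.

        Returns the number of batches sent.
        """
        batches = 0
        for start in range(0, len(docs), batch_size):
            solr.add(docs[start:start + batch_size], commit=False)
            batches += 1
        if batches:
            solr.commit()
        return batches


    class CountingSolr(object):
        """Stand-in client that counts adds and commits."""

        def __init__(self):
            self.added = []
            self.commits = 0

        def add(self, docs, commit=None, **kwargs):
            self.added.append((len(docs), commit))

        def commit(self, **kwargs):
            self.commits += 1


    client = CountingSolr()
    docs = [{"id": "doc_{0}".format(i)} for i in range(5)]
    count = index_in_batches(client, docs, batch_size=2)
    print(count, client.added, client.commits)

Passing a real ``pysolr.Solr`` instance instead of ``client`` would send the same sequence of calls to the server.
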
LICENSE
=======
``pysolr`` is licensed under the New BSD license.
Running Tests
=============
The ``run-tests.py`` script will automatically perform the steps below and is recommended for testing by
default unless you need more control.
Running a test Solr instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Downloading, configuring and running Solr 4 looks like this::

    ./start-solr-test-server.sh

Running the tests
~~~~~~~~~~~~~~~~~
The test suite requires the unittest2 library:

Python 2::

    python -m unittest2 tests

Python 3::

    python3 -m unittest tests

Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 3
Provides-Extra: solrcloud