.. _tutorial:

Tutorial
------------------------------------------------------------------------------

This tutorial demonstrates how to take advantage of :ref:`Rtree <home>` for 
querying data that have a spatial component that can be modeled as bounding 
boxes.


Creating an index
..............................................................................

The following section describes the basic instantiation and usage of 
:ref:`Rtree <home>`.

Import
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After :ref:`installing <installation>` :ref:`Rtree <home>`, you should be able to 
open up a Python prompt and issue the following::

  >>> from rtree import index

:py:mod:`rtree` is organized as a Python package with a couple of modules
and two major classes - :py:class:`rtree.index.Index` and
:py:class:`rtree.index.Property`. Users manipulate these classes to interact
with the index.

Construct an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After importing the index module, construct an index with the default
settings::

  >>> idx = index.Index()

.. note::

    While the default construction is useful in many cases, if you want to
    manipulate how the index is constructed you will need to pass in a
    :py:class:`rtree.index.Property` instance when creating the index.
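
As a minimal sketch of the note above, the following passes a
:py:class:`rtree.index.Property` instance to the constructor. The specific
values are assumptions chosen for illustration, not tuning recommendations::

  from rtree import index

  # Illustrative settings only; the values are assumptions, not advice.
  p = index.Property()
  p.dimension = 2        # 2D bounding boxes
  p.leaf_capacity = 50   # hypothetical value picked for this sketch

  # Build an in-memory index that uses the customized properties.
  custom_idx = index.Index(properties=p)
  custom_idx.insert(0, (0.0, 0.0, 1.0, 1.0))

  # Query with a window that overlaps the inserted box.
  hits = list(custom_idx.intersection((0.5, 0.5, 2.0, 2.0)))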

Create a bounding box
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After instantiating the index, create a bounding box that we can 
insert into the index::

  >>> left, bottom, right, top = (0.0, 0.0, 1.0, 1.0)

.. note::

    The coordinate ordering for all functions is sensitive to the index's
    :py:attr:`~rtree.index.Index.interleaved` data member. If
    :py:attr:`~rtree.index.Index.interleaved` is False, the coordinates must
    be in the form [xmin, xmax, ymin, ymax, ..., ..., kmin, kmax]. If
    :py:attr:`~rtree.index.Index.interleaved` is True, the coordinates must be
    in the form [xmin, ymin, ..., kmin, xmax, ymax, ..., kmax].
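
The two orderings described in the note can be converted into one another
with plain Python. The helpers below are hypothetical names introduced for
this sketch, not part of the Rtree API::

  def to_interleaved(coords):
      """Reorder [xmin, xmax, ymin, ymax, ...] as [xmin, ymin, ..., xmax, ymax, ...]."""
      if len(coords) % 2 != 0:
          raise ValueError("expected an even number of coordinates")
      mins = list(coords[0::2])   # first element of every (min, max) pair
      maxs = list(coords[1::2])   # second element of every (min, max) pair
      return mins + maxs


  def to_non_interleaved(coords):
      """Reorder [xmin, ymin, ..., xmax, ymax, ...] as [xmin, xmax, ymin, ymax, ...]."""
      if len(coords) % 2 != 0:
          raise ValueError("expected an even number of coordinates")
      k = len(coords) // 2        # number of dimensions
      out = []
      for lo, hi in zip(coords[:k], coords[k:]):
          out.extend((lo, hi))
      return out


  box = [0.0, 1.0, 10.0, 20.0]                  # xmin, xmax, ymin, ymax
  interleaved_box = to_interleaved(box)         # xmin, ymin, xmax, ymax
  round_trip = to_non_interleaved(interleaved_box)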

Insert records into the index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Insert an entry into the index::

  >>> idx.insert(0, (left, bottom, right, top))

.. note::

    Entries that are inserted into the index are not unique in either the 
    sense of the `id` or of the bounding box that is inserted with index 
    entries. If you need to maintain uniqueness, you need to manage that before 
    inserting entries into the Rtree.
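
One way to manage uniqueness yourself is to track ids on the Python side
before they reach the index. The wrapper below is a hypothetical sketch, not
something Rtree provides, and it only knows about ids inserted through it::

  class UniqueIdIndex:
      """Hypothetical wrapper that rejects ids it has already inserted."""

      def __init__(self, idx):
          self._idx = idx       # any object with an insert(id, bounds) method
          self._seen = set()    # ids inserted through this wrapper

      def insert(self, entry_id, bounds):
          # Refuse duplicates before they reach the underlying index.
          if entry_id in self._seen:
              raise KeyError("id %r is already in the index" % (entry_id,))
          self._idx.insert(entry_id, bounds)
          self._seen.add(entry_id)

Wrapping a fresh index, ``UniqueIdIndex(index.Index())``, a first call to
``insert(0, (left, bottom, right, top))`` succeeds and a second call with id
``0`` raises ``KeyError``.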

.. note::

    Inserting a point, i.e. a box where ``left == right`` and
    ``top == bottom``, will essentially insert a single point entry into the
    index instead of copying extra coordinates and inserting them. There is
    no shortcut to explicitly insert a single point, however.
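
A point can therefore be inserted by repeating its coordinates. The helper
name ``point_bounds`` is an assumption made for this sketch::

  from rtree import index


  def point_bounds(x, y):
      """Return a degenerate (left, bottom, right, top) box for the point (x, y)."""
      return (x, y, x, y)


  point_idx = index.Index()
  point_idx.insert(0, point_bounds(0.5, 0.5))

  # The point lies inside this query window, so its id is returned.
  hits = list(point_idx.intersection((0.0, 0.0, 1.0, 1.0)))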

Query the index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are three primary methods for querying the index. Two of them are
covered below: :py:meth:`rtree.index.Index.intersection` returns index
entries that *cross* or are *contained* within the given query window, and
:py:meth:`rtree.index.Index.nearest` returns the entries closest to the
given bounds.

Intersection
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Given a query window, return the ids of entries that intersect it. Here the
entry inserted above touches the window at the corner ``(1.0, 1.0)``::

  >>> list(idx.intersection((1.0, 1.0, 2.0, 2.0)))
  [0]

A query window that lies beyond the bounds of the data in the index returns
no ids::

  >>> list(idx.intersection((1.0000001, 1.0000001, 2.0, 2.0)))
  []
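
The two results above are consistent with a closed-interval overlap test, in
which boxes that merely touch still count as intersecting. The function below
is a plain-Python sketch of that test under this assumption; it is not Rtree's
implementation, and ``boxes_intersect`` is a hypothetical name::

  def boxes_intersect(a, b):
      """Return True if two (left, bottom, right, top) boxes overlap or touch."""
      a_left, a_bottom, a_right, a_top = a
      b_left, b_bottom, b_right, b_top = b
      # The boxes intersect only if their extents overlap on both axes.
      return (a_left <= b_right and b_left <= a_right and
              a_bottom <= b_top and b_bottom <= a_top)


  entry = (0.0, 0.0, 1.0, 1.0)

  # Shares only the corner (1.0, 1.0) with the entry.
  touching = boxes_intersect(entry, (1.0, 1.0, 2.0, 2.0))

  # Starts just past the entry's upper-right corner.
  disjoint = boxes_intersect(entry, (1.0000001, 1.0000001, 2.0, 2.0))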

Nearest Neighbors
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The following finds the 1 nearest item to the given bounds. If multiple items
are of equal distance to the bounds, all of them are returned::
  
  >>> idx.insert(1, (left, bottom, right, top))
  >>> list(idx.nearest((1.0000001, 1.0000001, 2.0, 2.0), 1))
  [0, 1]
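
For contrast, here is a small sketch with no ties, using a separate index so
that it does not depend on the entries inserted above. The coordinates are
made up for illustration::

  from rtree import index

  near_idx = index.Index()
  near_idx.insert(0, (0.0, 0.0, 1.0, 1.0))   # close to the query bounds
  near_idx.insert(1, (5.0, 5.0, 6.0, 6.0))   # farther from the query bounds

  query = (1.5, 1.5, 2.0, 2.0)

  # Ask for the single nearest entry, then for the two nearest entries.
  closest = list(near_idx.nearest(query, 1))
  two_closest = list(near_idx.nearest(query, 2))

Because the entries sit at different distances from ``query``, ``closest``
holds only id ``0``, while ``two_closest`` holds both ids.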


.. _clustered:

Using Rtree as a cheapo spatial database
..............................................................................

Rtree also supports inserting any object you can pickle into the index (called
a clustered index in `libspatialindex`_ parlance). The following inserts the
picklable object ``42`` into the index with the id ``2``::

  >>> idx.insert(2, (left, bottom, right, top), obj=42)

You can then return a list of objects by giving the ``objects=True`` flag
to intersection::

  >>> [n.object for n in idx.intersection((left, bottom, right, top), objects=True)]
  [None, None, 42]
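
Each item returned with ``objects=True`` also carries the id it was inserted
with, so stored objects can be paired with their ids. The sketch below uses a
separate index and made-up dictionaries as the stored objects::

  from rtree import index

  obj_idx = index.Index()
  obj_idx.insert(0, (0.0, 0.0, 1.0, 1.0), obj={"name": "unit square"})
  obj_idx.insert(1, (5.0, 5.0, 6.0, 6.0), obj={"name": "far away"})

  # Only the first box overlaps this query window.
  items = list(obj_idx.intersection((0.5, 0.5, 2.0, 2.0), objects=True))

  # Pair each hit's id with a field of the object stored alongside it.
  pairs = [(item.id, item.object["name"]) for item in items]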

.. warning::
    `libspatialindex`_'s clustered indexes were not designed to be a database.
    You get none of the data integrity protections that a database would
    purport to offer, but this behavior of :ref:`Rtree <home>` can be useful
    nonetheless. Consider yourself warned. Now go do cool things with it.

Serializing your index to a file
..............................................................................

One of :ref:`Rtree <home>`'s most useful properties is the ability to 
serialize Rtree indexes to disk. These include the clustered indexes 
described :ref:`here <clustered>`::
  
  >>> file_idx = index.Rtree('rtree')
  >>> file_idx.insert(1, (left, bottom, right, top))
  >>> file_idx.insert(2, (left - 1.0, bottom - 1.0, right + 1.0, top + 1.0))
  >>> [n for n in file_idx.intersection((left, bottom, right, top))]
  [1, 2]

.. note::

    By default, if an index file with the given name `rtree` in the example
    above already exists on the file system, it will be opened in append mode
    and not be re-created. You can control this behavior with the
    :py:attr:`rtree.index.Property.overwrite` property of the index property
    that can be given to the :py:class:`rtree.index.Index` constructor.

.. seealso::

    :ref:`performance` describes some parameters you can tune to make
    file-based indexes run a bit faster. The choices you make for the
    parameters depend entirely on your usage.

Modifying file names
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Rtree uses the extensions `dat` and `idx` by default for the two index files
that are created when serializing index data to disk. These file extensions
are controllable using the :py:attr:`rtree.index.Property.dat_extension` and
:py:attr:`rtree.index.Property.idx_extension` index properties.

::

    >>> p = index.Property()
    >>> p.dat_extension = 'data'
    >>> p.idx_extension = 'index'
    >>> file_idx = index.Index('rtree', properties=p)

3D indexes
..............................................................................

As of Rtree version 0.5.0, you can create 3D (actually kD) indexes. The
following is a 3D index that is to be stored on disk. Persisted indexes are
stored on disk using two files -- an index file (.idx) and a data (.dat) file.
You can modify the extensions these files use by altering the properties of
the index at instantiation time. The following creates a 3D index that is
stored on disk as the files ``3d_index.data`` and ``3d_index.index``::

  >>> from rtree import index
  >>> p = index.Property()
  >>> p.dimension = 3
  >>> p.dat_extension = 'data'
  >>> p.idx_extension = 'index'  
  >>> idx3d = index.Index('3d_index',properties=p)
  >>> idx3d.insert(1, (0, 0, 23.0, 60, 60, 42.0))
  >>> list(idx3d.intersection((-1, -1, 22, 62, 62, 43)))
  [1]

ZODB and Custom Storages
..............................................................................

https://mail.zope.org/pipermail/zodb-dev/2010-June/013491.html contains a custom
storage backend for `ZODB`_.

.. _ZODB: http://www.zodb.org/

.. _`libspatialindex`: http://libspatialindex.github.com