<html><head><title>PHP bindings for Xapian</title></head>
<body>
<h1>PHP bindings for Xapian</h1>
<p>
The PHP bindings for Xapian are packaged in the <code>xapian</code>
extension. The PHP API provided by this extension largely follows Xapian's C++
API. This document lists the differences and additions.
</p>
<p>
As of Xapian version 1.1.0, these bindings require PHP5 (PHP4 is no longer
supported).
</p>
<p>
PHP strings, arrays, etc., are converted automatically to and from the
corresponding C++ types in the bindings, so generally you can pass arguments as
you would expect. One thing to be aware of, though, is that SWIG implements
dispatch functions for overloaded methods based on the types of the parameters,
so you can't always pass in a string containing a number (e.g.
<code>"42"</code>) where a number is expected, as you usually can in PHP.
You need to convert explicitly to the required type: use <code>(int)</code> to
convert to an integer, <code>(string)</code> to a string, and
<code>(double)</code> to a floating point number.
</p>
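<p>
As a minimal sketch of such explicit conversion, assume two hypothetical string
inputs (for example taken from a query string) which are destined for
<code>XapianEnquire::get_mset()</code>:
</p>
<pre>
// Hypothetical inputs: request parameters always arrive as strings.
$first = "0";
$maxitems = "10";

// Cast explicitly so that SWIG's overload dispatch sees integers.
$first = (int)$first;
$maxitems = (int)$maxitems;

// With an existing XapianEnquire object in $enquire you could then call:
// $mset = $enquire->get_mset($first, $maxitems);
</pre>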
<p>
PHP has a lot of reserved words of various sorts, which sadly clash with common
method names. Because of this, the <code>empty()</code> methods of various
container-like classes are wrapped as <code>is_empty()</code> for PHP,
and the <code>clone()</code> method of the <code>XapianWeight</code>
class and its subclasses is wrapped as <code>clone_object()</code>.
</p>
<p>
The <code>examples</code> subdirectory contains examples showing how to use the
PHP bindings based on the simple examples from <code>xapian-examples</code>:
<a href="examples/simpleindex.php5">simpleindex.php5</a>,
<a href="examples/simplesearch.php5">simplesearch.php5</a>,
<a href="examples/simpleexpand.php5">simpleexpand.php5</a>.
</p>
<p>
Note that these examples are written to work with the command line (CLI)
version of the PHP interpreter, not through a webserver. Xapian's PHP
bindings may of course also be used under CGI, Apache's mod_php, ISAPI,
etc.
</p>
<h2>Installation</h2>
<p>
Assuming you have a suitable version of PHP installed, running
configure will automatically enable the PHP bindings, and
<code>make install</code> will install the extension shared library in
the location reported by <code>php-config --extension-dir</code>.
</p>
<p>
Check that php.ini has a line like <code>extension_dir =
"<i>&lt;location reported by php-config --extension-dir&gt;</i>"</code>.
</p>
<p>
Then add this line to php.ini: <code>extension = xapian.so</code> (or
whatever the library is called - not all UNIX systems use <code>.so</code>
as the extension, and MS Windows uses <code>.dll</code>).
</p>
<p>
If you're using PHP as a webserver module (e.g. mod_php with Apache), you
may need to restart the webserver for this change to take effect.
</p>
<p>
Alternatively, you can get scripts which use Xapian to explicitly load it.
This approach is useful if you don't have root access and so can't make
changes to php.ini. The simplest set up is to copy <code>xapian.so</code> into
the same directory as your PHP script, and then add the following line to the
start of your PHP scripts which use Xapian: <code>dl('xapian.so');</code>
</p>
<p>
You can put <code>xapian.so</code> elsewhere (and it's probably better to),
but note that <code>dl()</code> requires a <b>relative</b> path, so you
might have to use something unwieldy like:
<code>dl('../../../../usr/lib/php5/20051025/xapian.so');</code>
</p>
<p>
You also need to add <code>include "xapian.php"</code>
to your PHP scripts which use Xapian in order to get the PHP class wrappers.
</p>
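<p>
Putting the run-time loading steps together, the top of a script might look
something like the following sketch. It assumes that <code>xapian.so</code>
sits in the same directory as the script and that <code>dl()</code> is
available in your PHP setup, which is not the case everywhere:
</p>
<pre>
// Load the extension at run time only if php.ini hasn't already loaded it.
if (!extension_loaded('xapian')) {
    dl('xapian.so');
}

// Pull in the PHP class wrappers (XapianDatabase, XapianQuery, etc.).
include "xapian.php";
</pre>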
<h2>Exceptions</h2>
<p>
Exceptions thrown by Xapian are translated into PHP Exception objects
which are thrown into the PHP script.
</p>
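<p>
So a Xapian error can be handled with an ordinary <code>try</code>/<code>catch</code>
block. The sketch below wraps opening a database in a helper function; the name
<code>open_or_null()</code> is made up for this illustration and is not part of
the bindings:
</p>
<pre>
// open_or_null() is a hypothetical helper, not part of the Xapian API.
function open_or_null($path) {
    try {
        // Throws if the database at $path can't be opened.
        return new XapianDatabase($path);
    } catch (Exception $e) {
        // The Xapian error arrives as a PHP Exception object.
        print "Xapian error: " . $e->getMessage() . "\n";
        return null;
    }
}
</pre>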
<h2>Object orientation</h2>
<p>
As of Xapian 0.9.7, the PHP bindings use a PHP object-oriented style.
</p>
<p>
In order to construct an object, use
<code>$object = new XapianClassName(...);</code>. Objects are destroyed
when they go out of scope. To destroy an object explicitly, you can use
<code>unset($object);</code> or <code>$object = null;</code>.
</p>
<p>
You invoke a method on an object using <code>$object->method_name()</code>.
</p>
<h2>Unicode Support</h2>
<p>
In Xapian 1.0.0 and later, the Xapian::Stem, Xapian::QueryParser, and
Xapian::TermGenerator classes all assume text is in UTF-8. If you want
to index strings in a different encoding, use the PHP
<a href="http://php.net/iconv"><code>iconv</code> function</a>
to convert them to UTF-8 before passing them to Xapian, and to convert
values back to your own encoding when reading them from Xapian.
</p>
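<p>
For instance, assuming your source text is in ISO-8859-1 (the encoding and the
sample string here are only for illustration), the round trip might look like
this:
</p>
<pre>
// "café" encoded as ISO-8859-1: the e-acute is the single byte 0xE9.
$latin1 = "caf\xE9";

// Convert to UTF-8 before handing the text to Xapian.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

// ... index $utf8, or pass it to the query parser ...

// Convert text read back from Xapian to the original encoding.
$back = iconv('UTF-8', 'ISO-8859-1', $utf8);
</pre>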
<h2>Iterators</h2>
<p>
All iterators support <code>next()</code> and <code>equals()</code> methods
to move through and test iterators (as for all language bindings).
MSetIterator and ESetIterator also support <code>prev()</code>.
</p>
<h2>Iterator dereferencing</h2>
<p>
C++ iterators are often dereferenced to get information, e.g.
<code>(*it)</code>. With PHP these are all mapped to named methods, as
follows:
</p>
<table title="Iterator dereferencing methods">
<thead><td>Iterator</td><td>Dereferencing method</td></thead>
<tr><td>PositionIterator</td> <td><code>get_termpos()</code></td></tr>
<tr><td>PostingIterator</td> <td><code>get_docid()</code></td></tr>
<tr><td>TermIterator</td> <td><code>get_term()</code></td></tr>
<tr><td>ValueIterator</td> <td><code>get_value()</code></td></tr>
<tr><td>MSetIterator</td> <td><code>get_docid()</code></td></tr>
<tr><td>ESetIterator</td> <td><code>get_term()</code></td></tr>
</table>
<p>
Other methods, such as <code>MSetIterator::get_document()</code>, are
available unchanged.
</p>
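<p>
As a sketch of how the iterator methods fit together, the function below walks
an MSet from <code>begin()</code> to <code>end()</code>. The function name
<code>print_matches()</code> is made up for this illustration, and
<code>$mset</code> is assumed to be a <code>XapianMSet</code> returned by
<code>get_mset()</code>:
</p>
<pre>
// print_matches() is a hypothetical helper, not part of the Xapian API.
function print_matches($mset) {
    $i = $mset->begin();
    while (!$i->equals($mset->end())) {
        // get_docid() takes the place of the C++ dereference (*i).
        $docid = $i->get_docid();
        $data = $i->get_document()->get_data();
        print "$docid: $data\n";
        $i->next();
    }
}
</pre>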
<h2>MSet</h2>
<p>
MSet objects have some additional methods to simplify access (these
work using the C++ array dereferencing):
</p>
<table title="MSet additional methods">
<thead><td>Method name</td><td>Explanation</td></thead>
<tr><td><code>get_hit(index)</code></td><td>returns MSetIterator at index</td></tr>
<tr><td><code>get_document_percentage(index)</code></td><td><code>convert_to_percent(get_hit(index))</code></td></tr>
<tr><td><code>get_document(index)</code></td><td><code>get_hit(index)->get_document()</code></td></tr>
<tr><td><code>get_docid(index)</code></td><td><code>get_hit(index)->get_docid()</code></td></tr>
</table>
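<p>
These methods let you look at a hit by its position without handling an
iterator yourself. A minimal sketch, using a made-up function name
<code>print_hit()</code> and assuming <code>$index</code> is an integer below
the number of hits in <code>$mset</code>:
</p>
<pre>
// print_hit() is a hypothetical helper, not part of the Xapian API.
function print_hit($mset, $index) {
    $docid = $mset->get_docid($index);
    $percent = $mset->get_document_percentage($index);
    $data = $mset->get_document($index)->get_data();
    print "$docid [$percent%] $data\n";
}
</pre>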
<h2>Database Factory Functions</h2>
<ul>
<li> <code>Xapian::Auto::open_stub(<i>file</i>)</code> is wrapped as <code>Xapian::auto_open_stub(<i>file</i>)</code>
<li> <code>Xapian::Brass::open()</code> is wrapped as <code>Xapian::brass_open()</code>
<li> <code>Xapian::Chert::open()</code> is wrapped as <code>Xapian::chert_open()</code>
<li> <code>Xapian::Flint::open()</code> is wrapped as <code>Xapian::flint_open()</code>
<li> <code>Xapian::InMemory::open()</code> is wrapped as <code>Xapian::inmemory_open()</code>
<li> <code>Xapian::Remote::open(...)</code> is wrapped as <code>Xapian::remote_open(...)</code> (both
the TCP and "program" versions are wrapped - the SWIG wrapper checks the parameter list to
decide which to call).
<li> <code>Xapian::Remote::open_writable(...)</code> is wrapped as <code>Xapian::remote_open_writable(...)</code> (both
the TCP and "program" versions are wrapped - the SWIG wrapper checks the parameter list to
decide which to call).
</ul>
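<p>
Because the wrapper chooses between the TCP and "program" versions by looking
at the parameter types, it is worth casting the port number explicitly. The
sketch below uses a made-up function name <code>open_remote_tcp()</code>, and
the host and port are supplied by the caller:
</p>
<pre>
// open_remote_tcp() is a hypothetical helper, not part of the Xapian API.
function open_remote_tcp($host, $port) {
    // An integer second argument should select the TCP version; a string
    // there would instead look like the arguments of the "program" version.
    return Xapian::remote_open($host, (int)$port);
}
</pre>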
<h2>Constants</h2>
<p>
Constants are wrapped as <code>const</code> members of the appropriate class.
So the C++ constant <code>Xapian::DB_CREATE_OR_OPEN</code> is available as
<code>Xapian::DB_CREATE_OR_OPEN</code> (a constant of the PHP class
<code>Xapian</code>), <code>Xapian::Query::OP_OR</code> is
available as <code>XapianQuery::OP_OR</code>, and so on.
</p>
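<p>
The two sketches below show these constants in use. Both function names are
made up for this illustration:
</p>
<pre>
// Hypothetical helper: open a database for writing, creating it if needed.
function open_for_indexing($path) {
    return new XapianWritableDatabase($path, Xapian::DB_CREATE_OR_OPEN);
}

// Hypothetical helper: build a query matching either of two terms.
function either_term($a, $b) {
    return new XapianQuery(XapianQuery::OP_OR, (string)$a, (string)$b);
}
</pre>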
<h2>Functions</h2>
<p>
Non-class functions are wrapped in the natural way, so the C++
function <code>Xapian::version_string</code> is wrapped under the same
name in PHP.
</p>
<h2>Query</h2>
<p>
In C++ there's a Xapian::Query constructor which takes a query operator and
start/end iterators specifying a number of terms or queries, plus an optional
parameter. In PHP, this is wrapped to accept an array listing the terms
and/or queries (you can specify a mixture of terms and queries if you wish).
For example:
</p>
<pre>
$subq = new XapianQuery(XapianQuery::OP_AND, "hello", "world");
$q = new XapianQuery(XapianQuery::OP_AND, array($subq, "foo", new XapianQuery("bar", 2)));
</pre>
<h3>MatchAll and MatchNothing</h3>
<p>
These aren't yet wrapped for PHP, but you can use <code>XapianQuery("")</code>
instead of MatchAll and <code>XapianQuery()</code> instead of MatchNothing.
</p>
<h2>Enquire</h2>
<p>
There is an additional method <code>get_matching_terms()</code> which takes
an MSetIterator and returns a list of terms in the current query which
match the document given by that iterator. You may find this
more convenient than using the TermIterator directly.
</p>
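<p>
As a sketch, the function below joins the matching terms for one hit into a
single string. The function name is made up for this illustration, and it
assumes the returned list behaves as a PHP array:
</p>
<pre>
// matching_terms_line() is a hypothetical helper, not part of the Xapian API.
// $enquire is a XapianEnquire and $i an MSetIterator from one of its MSets.
function matching_terms_line($enquire, $i) {
    $terms = $enquire->get_matching_terms($i);
    return join(" ", $terms);
}
</pre>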
<address>
Last updated $Date: 2009-12-22 12:37:51 +0000 (Tue, 22 Dec 2009) $
</address>
</body>
</html>