Xapian Java Bindings
********************

How to build the bindings:
##########################

If you want to install from source you'll need to `download the source
code <https://xapian.org/download>`_ if you haven't already done so.

Running "make" and then "make install" builds a JNI glue shared library and a
jar file, and places both in a "built" subdirectory of the java build
directory rather than installing them system-wide.

You can copy these two files into your Java installation, or just use them
in place.

How to compile the examples:
############################

::

  cd java
  javac -classpath built/xapian.jar:. docs/examples/SimpleIndex.java
  javac -classpath built/xapian.jar:. docs/examples/SimpleSearch.java

How to run the examples:
########################

To run the examples, you need to set the Java system property
"java.library.path" to the directory containing the JNI library,
libxapian_jni.so (the file extension varies by platform).

::

 java -Djava.library.path=built -classpath built/xapian.jar:docs/examples \
      SimpleIndex ./test.db index words like java

 java -Djava.library.path=built -classpath built/xapian.jar:docs/examples \
      SimpleSearch ./test.db index words like java

Alternatively, you can avoid needing the `-Djava.library.path` setting by
setting the `LD_LIBRARY_PATH` environment variable (or your platform's
equivalent), or by installing the JNI library in a directory where your JVM
finds it automatically.  For example, on macOS you can copy it into
`/Library/Java/Extensions/`; if you also copy the `.jar` file there, you no
longer need to specify it via `-classpath`.
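
If the JVM fails to load the JNI library, it can help to check what it is
actually searching.  The following is a minimal diagnostic sketch that uses
only the JDK; the class name ``FindJniLibrary`` and its ``locate`` helper are
hypothetical, and the library name ``xapian_jni`` is an assumption inferred
from the ``libxapian_jni.so`` filename mentioned above::

  import java.io.File;

  public class FindJniLibrary {
      // Return the first file called fileName found in a directory listed
      // on searchPath, or null if no listed directory contains it.
      static File locate(String searchPath, String fileName) {
          if (searchPath == null) {
              return null;
          }
          for (String dir : searchPath.split(File.pathSeparator)) {
              if (dir.isEmpty()) {
                  continue;
              }
              File candidate = new File(dir, fileName);
              if (candidate.isFile()) {
                  return candidate;
              }
          }
          return null;
      }

      public static void main(String[] args) {
          // Assumption: the library's base name is "xapian_jni".
          // mapLibraryName adds the platform's usual prefix and suffix,
          // which may differ from the name your build actually produced.
          String fileName = System.mapLibraryName("xapian_jni");
          String searchPath = System.getProperty("java.library.path");
          File found = locate(searchPath, fileName);
          System.out.println("java.library.path = " + searchPath);
          System.out.println(found != null
              ? "Found " + found.getPath()
              : "No " + fileName + " on java.library.path");
      }
  }

Running it with the same ``-Djava.library.path`` value you pass to the
examples shows whether that directory contains a file with the expected name.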

The Java bindings have been tested with OpenJDK versions 1.8.0_77,
1.7.0_03, and 1.6.0_38, but they should work with any Java toolchain with
suitable JNI support.  Please report success stories or any problems to the
development mailing list: xapian-devel@lists.xapian.org

Strings and binary data
#######################

The Xapian C++ API is largely agnostic about character encoding, and uses
the `std::string` type as an opaque container for a sequence of bytes.
In places where the bytes represent text (for example, in the
`Stem`, `QueryParser` and `TermGenerator` classes), UTF-8 encoding is used.
In Java, the `String` class uses UTF-16 encoding, and can't hold arbitrary
binary data.
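
The mismatch can be seen using only the JDK, without Xapian at all.  In this
small sketch (the class and method names are hypothetical), text survives a
round trip through UTF-8 bytes, but a byte sequence that is not valid UTF-8
does not survive a round trip through ``String``::

  import java.nio.charset.StandardCharsets;
  import java.util.Arrays;

  public class StringVersusBytes {
      // Encode a String as UTF-8 bytes, then decode those bytes again.
      static String roundTripText(String s) {
          byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
          return new String(utf8, StandardCharsets.UTF_8);
      }

      // Decode bytes as UTF-8 into a String, then encode that String again.
      static byte[] roundTripBytes(byte[] b) {
          String decoded = new String(b, StandardCharsets.UTF_8);
          return decoded.getBytes(StandardCharsets.UTF_8);
      }

      public static void main(String[] args) {
          String text = "na\u00efve caf\u00e9";
          // prints true
          System.out.println(text.equals(roundTripText(text)));

          // 0xff and 0xfe can never occur in valid UTF-8, so decoding
          // replaces them and the original bytes are lost.
          byte[] binary = { (byte) 0xff, (byte) 0xfe, 0x00, 0x01 };
          // prints false
          System.out.println(Arrays.equals(binary, roundTripBytes(binary)));
      }
  }

This is why data that may be binary is better kept as ``byte[]`` from end to
end than passed through a ``String``.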

The approach taken to this problem by these bindings (in Xapian 1.4.4 and
later) is to map C++ `std::string` to/from Java byte arrays (`byte[]`) in
places where the data is inherently binary (serialisation functions) or likely
to be binary (document values).

This loses a bit of generality compared to the C++ API.  For example, in C++
you can add a term containing arbitrary binary data, but in Java a term has to
be a Unicode string.  Users rarely need that generality, though, and giving it
up means that you can just work with Java `String` for terms.

Document values work best when the values are compactly encoded, so a binary
encoding is usually appropriate.  However, if you really want to put a text
value in a document value slot you can explicitly convert `String` to/from
a byte array of UTF-8 data like so::

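  import java.nio.charset.StandardCharsets;

  // ...

  // Store the text as UTF-8 bytes in value slot 1.
  doc.addValue(1, someString.getBytes(StandardCharsets.UTF_8));

  // ...

  // Decode the UTF-8 bytes read back from value slot 1.
  String value = new String(doc.getValue(1), StandardCharsets.UTF_8);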
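
To illustrate what "compactly encoded" can mean, here is a sketch of one
possible binary encoding for an integer, using only the JDK.  The
``CompactValue`` class and its fixed-width big-endian format are assumptions
made for illustration, not something these bindings provide or require; the
resulting ``byte[]`` is the kind of value that could be stored with
``doc.addValue`` as in the snippet above::

  import java.nio.ByteBuffer;
  import java.nio.charset.StandardCharsets;

  public class CompactValue {
      // Pack an int into 4 bytes (ByteBuffer is big-endian by default).
      static byte[] encode(int n) {
          return ByteBuffer.allocate(Integer.BYTES).putInt(n).array();
      }

      // Unpack an int from the first 4 bytes written by encode().
      static int decode(byte[] b) {
          return ByteBuffer.wrap(b).getInt();
      }

      public static void main(String[] args) {
          int n = 1234567890;
          byte[] binary = encode(n);
          byte[] text = Integer.toString(n).getBytes(StandardCharsets.UTF_8);
          // The binary form takes 4 bytes; the decimal text takes 10.
          System.out.println(binary.length + " bytes vs " + text.length);
          // The value round-trips unchanged.
          System.out.println(decode(binary) == n);
      }
  }

Whether a fixed-width format like this is the right choice depends on how
the values are used, for example whether they need to sort in a particular
order.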

Like terms, document data and user metadata currently have to be text when
using these bindings.

Naming of wrapped methods:
##########################

Methods are renamed to match Java's naming conventions, so ``get_mset``
becomes ``getMSet``, and so on.  In addition, ``get_description`` is wrapped
as ``toString``.

MatchAll and MatchNothing
#########################

In Xapian 1.3.0 and later, these are wrapped as static constants
``Query.MatchAll`` and ``Query.MatchNothing``.

If you want to be compatible with earlier versions, you can continue to use
``new Query("")`` instead of ``Query.MatchAll`` and ``new Query()`` instead of
``Query.MatchNothing``.

TODO list:
##########

* Fix string passing to be zero-byte clean:
  https://trac.xapian.org/ticket/46

* These were missing in the JNI bindings - it would be good to add them to
  SmokeTest.java:

  - Optional parameter "parameter" for the Query constructor.

  - Changes to the Enquire sorting API.

  - New method ESet::back().

  - Third (optional) argument to Document::add_posting().

  - Xapian::Weight and standard subclasses.