Text zones
==========

.. testsetup::

   from djvu.const import *
   from djvu.decode import *
   from djvu.sexpr import *
   from pprint import pprint

.. seealso::
   |djvu3ref|_ (8.3.5 *Text Chunk*).

   Representing text zones as S-expressions is DjVuLibre-specific; see |djvused|_
   for reference.

.. currentmodule:: djvu.decode
.. class:: PageText(page[, details=TEXT_DETAILS_ALL])


   A wrapper around page text.

   `details` controls the level of detail in the returned S-expression:

   * :data:`~TEXT_DETAILS_PAGE`, or
   * :data:`~TEXT_DETAILS_COLUMN`, or
   * :data:`~TEXT_DETAILS_REGION`, or
   * :data:`~TEXT_DETAILS_PARAGRAPH`, or
   * :data:`~TEXT_DETAILS_LINE`, or
   * :data:`~TEXT_DETAILS_WORD`, or
   * :data:`~TEXT_DETAILS_CHARACTER`, or
   * :data:`~TEXT_DETAILS_ALL`.

   .. method:: wait()

      Wait until the associated S-expression is available.

   .. attribute:: page

      :rtype: :class:`Page`

   .. attribute:: sexpr

      :rtype: :class:`djvu.sexpr.Expression`
      :raise NotAvailable:
         if the S-expression is not available; then, :class:`PageInfoMessage`
         messages with empty :attr:`~Message.page_job` may be emitted.
      :raise JobFailed:
         on failure.
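
   The S-expression follows the zone layout used by ``djvused``: each zone is a
   list of the form ``(type xmin ymin xmax ymax ...)``, where the tail is either
   a single text string or a sequence of nested zones. The sketch below walks
   such a structure and collects the leaf zones with their bounding boxes. It is
   only an illustration and does not call the library: plain nested Python lists
   stand in for :class:`djvu.sexpr.Expression`, and ``iter_leaves`` and
   ``sample`` are hypothetical names.

   .. code-block:: python

      def iter_leaves(zone):
          """Yield (zone_type, text, bbox) for every zone that holds a string."""
          zone_type, xmin, ymin, xmax, ymax, *rest = zone
          if len(rest) == 1 and isinstance(rest[0], str):
              # A leaf zone carries its text as a single string.
              yield zone_type, rest[0], (xmin, ymin, xmax, ymax)
          else:
              # Otherwise the tail consists of nested zones.
              for child in rest:
                  yield from iter_leaves(child)

      # Hypothetical sample data; coordinates are made up.
      sample = ['page', 0, 0, 2550, 3300,
                ['line', 300, 3000, 900, 3060,
                 ['word', 300, 3000, 520, 3060, 'Text'],
                 ['word', 540, 3000, 900, 3060, 'zones']]]

      leaves = list(iter_leaves(sample))

   With a coarser `details` level the leaves would be larger zones (for example
   whole lines), which is why the sketch tests for a string tail rather than for
   a particular zone type.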

.. currentmodule:: djvu.const
.. class:: TextZoneType

    The type of a text zone.

    To create objects of this class, use the :func:`get_text_zone_type()` function.

.. currentmodule:: djvu.const
.. function:: get_text_zone_type(symbol)

   Return one of the following text zone types:

      .. data:: TEXT_ZONE_PAGE

         >>> get_text_zone_type(Symbol('page')) is TEXT_ZONE_PAGE
         True

      .. data:: TEXT_ZONE_COLUMN

         >>> get_text_zone_type(Symbol('column')) is TEXT_ZONE_COLUMN
         True

      .. data:: TEXT_ZONE_REGION

         >>> get_text_zone_type(Symbol('region')) is TEXT_ZONE_REGION
         True

      .. data:: TEXT_ZONE_PARAGRAPH

         >>> get_text_zone_type(Symbol('para')) is TEXT_ZONE_PARAGRAPH
         True

      .. data:: TEXT_ZONE_LINE

         >>> get_text_zone_type(Symbol('line')) is TEXT_ZONE_LINE
         True

      .. data:: TEXT_ZONE_WORD

         >>> get_text_zone_type(Symbol('word')) is TEXT_ZONE_WORD
         True

      .. data:: TEXT_ZONE_CHARACTER

         >>> get_text_zone_type(Symbol('char')) is TEXT_ZONE_CHARACTER
         True

   You can compare text zone types using the ``>`` operator:

   >>> TEXT_ZONE_PAGE > TEXT_ZONE_COLUMN > TEXT_ZONE_REGION > TEXT_ZONE_PARAGRAPH
   True
   >>> TEXT_ZONE_PARAGRAPH > TEXT_ZONE_LINE > TEXT_ZONE_WORD > TEXT_ZONE_CHARACTER
   True

.. currentmodule:: djvu.decode
.. function:: cmp_text_zone(zonetype1, zonetype2)

   :return: a negative integer if `zonetype1` is more concrete than `zonetype2`.
   :return: zero if `zonetype1` is the same as `zonetype2`.
   :return: a positive integer if `zonetype1` is more general than `zonetype2`.
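
   The sign convention (negative for "more concrete", zero for "same", positive
   for "more general") can be illustrated with a small stand-in. The sketch
   below does not call the library. It assumes a hypothetical rank table
   ``RANK`` keyed by zone type names, ordered from most concrete to most
   general, and a hypothetical helper ``cmp_by_rank`` that follows the same
   convention.

   .. code-block:: python

      from functools import cmp_to_key

      # Hypothetical ranks: a higher number means a more general zone.
      # Only the relative order matters, not the actual values.
      RANK = {name: rank for rank, name in enumerate(
          ['char', 'word', 'line', 'para', 'region', 'column', 'page'])}

      def cmp_by_rank(name1, name2):
          """Return a negative, zero or positive integer (three-way compare)."""
          return RANK[name1] - RANK[name2]

      # A three-way comparison plugs into sorted() via cmp_to_key.
      ordered = sorted(['page', 'word', 'para', 'char'],
                       key=cmp_to_key(cmp_by_rank))

   Sorting with such a comparison orders zone types from the most concrete to
   the most general.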

.. currentmodule:: djvu.const
.. data:: TEXT_ZONE_SEPARATORS

   A dictionary that maps text zone types to their separators.

   >>> pprint(TEXT_ZONE_SEPARATORS)
   {<djvu.const.TextZoneType: char>: '',
    <djvu.const.TextZoneType: word>: ' ',
    <djvu.const.TextZoneType: line>: '\n',
    <djvu.const.TextZoneType: para>: '\x1f',
    <djvu.const.TextZoneType: region>: '\x1d',
    <djvu.const.TextZoneType: column>: '\x0b',
    <djvu.const.TextZoneType: page>: '\x0c'}
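
   As a rough illustration of how these separators might be used, the sketch
   below flattens a zone tree into a plain string by joining sibling zones with
   the separator of their type. It rests on several assumptions: nested Python
   lists of the form ``[type, xmin, ymin, xmax, ymax, ...]`` stand in for
   S-expressions, the hypothetical ``SEPARATORS`` dictionary copies the values
   shown above but is keyed by zone type name rather than by
   :class:`TextZoneType`, and ``flatten`` is not a library function.

   .. code-block:: python

      # Hypothetical copy of the separator table, keyed by plain strings.
      SEPARATORS = {
          'char': '', 'word': ' ', 'line': '\n', 'para': '\x1f',
          'region': '\x1d', 'column': '\x0b', 'page': '\x0c',
      }

      def flatten(zone):
          """Join the text of a zone tree using per-type separators."""
          zone_type, _xmin, _ymin, _xmax, _ymax, *rest = zone
          if len(rest) == 1 and isinstance(rest[0], str):
              # Leaf zone: the tail is the text itself.
              return rest[0]
          # Assumes all children of a zone share one type,
          # so a single separator is enough for the join.
          pieces = [flatten(child) for child in rest]
          separator = SEPARATORS[rest[0][0]] if rest else ''
          return separator.join(pieces)

      # Hypothetical sample data; coordinates are made up.
      page = ['page', 0, 0, 100, 100,
              ['line', 0, 50, 100, 60,
               ['word', 0, 50, 40, 60, 'Text'],
               ['word', 50, 50, 100, 60, 'zones']],
              ['line', 0, 30, 100, 40,
               ['word', 0, 30, 60, 40, 'example']]]

      text = flatten(page)

   Here words are joined with a space and lines with a newline; the control
   characters for paragraphs, regions, columns and pages would appear only if
   the tree contained several sibling zones of those types.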