File: encode.html

<HTML>
<H1>minidjvu: how to compress into DjVu</H1>

This file describes the minidjvu 0.3 library;
it also applies fully to version 0.33.
<P>
The library interface is <B>unstable</B>.
<P>
See also: <a href="decode.html">how to decode a DjVu page</a>
<HR>
<H3>Step 0: get things working</H3>
Add this include line to your source files that use minidjvu:
<PRE>
    #include &lt;minidjvu.h&gt;
</PRE>
I'll assume that your compiler can find the minidjvu headers
and your linker can link against the library. If not, read INSTALL
or README, or add the parent release directory to the header search path.
<P>
The examples also require
<PRE>
    #include &lt;assert.h&gt;
    #include &lt;stdio.h&gt;
    #include &lt;stdlib.h&gt;
</PRE>

<HR>
<H3>Step 1: get the bitmap to compress</H3>

There are many ways to get a bitmap, but only loading from files is demonstrated here.
<P>
To load a Windows BMP file, use <I>mdjvu_load_bmp()</I>; here's an example with
error handling:

<PRE>
    const char *input = "your_input_file_name_here.bmp";
    mdjvu_error_t error;
    mdjvu_bitmap_t bitmap = mdjvu_load_bmp(input, &amp;error);
    if (!bitmap)
    {
        fprintf(stderr, "%s: %s\n", input, mdjvu_get_error_message(error));
        exit(1);
    }
</PRE>

PBM files are read in the same way with <I>mdjvu_load_pbm()</I>:

<PRE>
    const char *input = "your_input_file_name_here.pbm";
    mdjvu_error_t error;
    mdjvu_bitmap_t bitmap = mdjvu_load_pbm(input, &amp;error);
    if (!bitmap)
    {
        fprintf(stderr, "%s: %s\n", input, mdjvu_get_error_message(error));
        exit(1);
    }
</PRE>

TIFF files are a bit different: the function <I>mdjvu_load_tiff()</I>
takes an additional argument, a pointer to the resolution:

<PRE>
    const char *input = "your_input_file_name_here.tiff";
    mdjvu_error_t error;
    int32 resolution = 300; // always set a default value in case TIFF has no dpi recorded
    mdjvu_bitmap_t bitmap = mdjvu_load_tiff(input, &amp;resolution, &amp;error);
    if (!bitmap)
    {
        fprintf(stderr, "%s: %s\n", input, mdjvu_get_error_message(error));
        exit(1);
    }
</PRE>
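<P>
Since each format has its own loader, a program that accepts any of the three
has to decide which one to call. The following is only a minimal sketch of one
way to do that, by looking at the file extension. The names
<I>guess_format()</I>, <I>equal_ignore_case()</I> and <I>enum input_format</I>
are hypothetical helpers invented for this illustration; they are not part of minidjvu.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper types, not part of minidjvu. */
enum input_format { FORMAT_UNKNOWN, FORMAT_BMP, FORMAT_PBM, FORMAT_TIFF };

/* Compare two strings, ignoring ASCII case. Returns 1 if they are equal. */
static int equal_ignore_case(const char *a, const char *b)
{
    while (*a && *b)
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Guess the format from the text after the last dot in the path. */
enum input_format guess_format(const char *path)
{
    const char *dot;

    assert(path);
    dot = strrchr(path, '.');
    if (!dot)
        return FORMAT_UNKNOWN;
    if (equal_ignore_case(dot, ".bmp"))
        return FORMAT_BMP;
    if (equal_ignore_case(dot, ".pbm"))
        return FORMAT_PBM;
    if (equal_ignore_case(dot, ".tif") || equal_ignore_case(dot, ".tiff"))
        return FORMAT_TIFF;
    return FORMAT_UNKNOWN;
}
```

The caller would then switch on the result and call <I>mdjvu_load_bmp()</I>,
<I>mdjvu_load_pbm()</I> or <I>mdjvu_load_tiff()</I> as shown in the examples above.
An extension is only a hint; the loader's own error reporting still decides whether the file is usable.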

<HR>
<H3>Step 2 (optional): smooth the bitmap</H3>
"Smoothing" is a filter applied to the bitmap before it is split into letters.
The idea is to remove pixels that are probably noise. Right now, the
implementation is very simple, but it still saves up to 5% of file size
(on scanned documents). Use
<PRE>
    mdjvu_smooth(bitmap);
</PRE>
to smooth it.
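<P>
To give a feeling for what "removing pixels that are probably noise" can mean,
here is a toy despeckle filter. It is <B>not</B> the algorithm used by
<I>mdjvu_smooth()</I>; the function <I>toy_despeckle()</I> and its byte-per-pixel
bitmap layout are assumptions made only for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy despeckle filter: NOT the algorithm used by mdjvu_smooth().
 * pixels is a row-major array of width * height bytes (nonzero = black).
 * A black pixel with no black pixel among its 8 neighbors is turned white.
 * Returns the number of pixels removed, or -1 if memory runs out. */
int toy_despeckle(unsigned char *pixels, int width, int height)
{
    unsigned char *copy;
    int x, y, dx, dy, removed = 0;
    size_t size;

    assert(pixels && width > 0 && height > 0);
    size = (size_t) width * (size_t) height;
    copy = malloc(size);
    if (!copy)
        return -1;
    memcpy(copy, pixels, size);   /* read from the copy, write to pixels */

    for (y = 0; y < height; y++)
    {
        for (x = 0; x < width; x++)
        {
            int neighbors = 0;
            if (!copy[y * width + x])
                continue;         /* white pixels are left alone */
            for (dy = -1; dy <= 1; dy++)
            {
                for (dx = -1; dx <= 1; dx++)
                {
                    int nx = x + dx, ny = y + dy;
                    if (dx == 0 && dy == 0)
                        continue;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height)
                        continue; /* outside the bitmap counts as white */
                    if (copy[ny * width + nx])
                        neighbors++;
                }
            }
            if (neighbors == 0)
            {
                pixels[y * width + x] = 0;
                removed++;
            }
        }
    }
    free(copy);
    return removed;
}
```

Working from a copy keeps the result independent of the scan order: every decision is based on the original bitmap, not on pixels already changed.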


<HR>
<H3>Step 3: split the bitmap</H3>
We now have the bitmap, but we need a split image.
A split image, or simply an image, is a sequence of commands "put (a bitmap)
at point x = (an integer), y = (an integer)". An image is obtained from a bitmap by splitting.
<P>
You have to supply the resolution (in dots per inch) and a pointer to options,
which may be NULL.
<PRE>
    int32 dpi = 300;    // change the resolution if necessary
    mdjvu_image_t image = mdjvu_split(bitmap, dpi, NULL);
    assert(image);
</PRE>
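<P>
The "put a bitmap at (x, y)" description of a split image can be modelled in a few
lines of C. This is only a sketch of the idea: <I>mdjvu_image_t</I> is an opaque type,
and the structures <I>toy_shape</I> and <I>toy_blit</I> and the function
<I>toy_render()</I> below are hypothetical names that do not reflect how minidjvu
stores an image internally.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical types for illustration only; minidjvu's mdjvu_image_t
 * is opaque and is not defined like this. */
struct toy_shape { int width, height; const unsigned char *pixels; };
struct toy_blit  { int x, y; int shape; };  /* "put shape at (x, y)" */

/* Replay the blits onto a white page (row-major, 1 = black).
 * Pixels that fall outside the page are clipped. */
void toy_render(unsigned char *page, int page_w, int page_h,
                const struct toy_shape *shapes,
                const struct toy_blit *blits, int n_blits)
{
    int i, sx, sy;

    assert(page && shapes && blits);
    assert(page_w > 0 && page_h > 0);
    memset(page, 0, (size_t) page_w * (size_t) page_h);

    for (i = 0; i < n_blits; i++)
    {
        const struct toy_shape *s = &shapes[blits[i].shape];
        for (sy = 0; sy < s->height; sy++)
        {
            for (sx = 0; sx < s->width; sx++)
            {
                int px = blits[i].x + sx, py = blits[i].y + sy;
                if (px < 0 || py < 0 || px >= page_w || py >= page_h)
                    continue;
                if (s->pixels[sy * s->width + sx])
                    page[py * page_w + px] = 1;
            }
        }
    }
}
```

In this model two blits may refer to the same shape index, so a letter shape that occurs twice on the page only has to be stored once.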

<HR>
<H3>Step 4: call compression routine</H3>

The main compression function is called <I>mdjvu_compress_image()</I>.
It takes two arguments: the image and the options. For lossless compression,
passing NULL as the options will do:
<PRE>
    mdjvu_compress_image(image, NULL);
</PRE>

Lossy compression is trickier: you have to create an options structure
and options for the pattern matcher. Suppose you want to compress with
an aggression of 110, with cleaning enabled and verbose messages printed to stdout;
here's an example of doing it:

<PRE>
    mdjvu_matcher_options_t m_options = mdjvu_matcher_options_create();
    mdjvu_compression_options_t options = mdjvu_compression_options_create();
    mdjvu_set_aggression(m_options, 110);
    mdjvu_set_matcher_options(options, m_options);
    mdjvu_set_clean(options, 1);
    mdjvu_set_verbose(options, 1);
    
    mdjvu_compress_image(image, options);

    mdjvu_compression_options_destroy(options);
</PRE>

You don't have to destroy the matcher options yourself,
since destroying the compression options destroys them too.

<HR>
<H3>Step 5: save the image</H3>
Just one call to <I>mdjvu_save_djvu_page()</I> does the job.
The file is silently overwritten if it exists.
<P>
The function <I>mdjvu_save_djvu_page()</I> takes an extra parameter: an erosion flag.
<P>
Here's an example of dealing with possible errors:
<PRE>
    int erosion = 0;
    const char *output = "your_output_file_name_here.djvu";
    mdjvu_error_t error;
    if (!mdjvu_save_djvu_page(image, output, &amp;error, erosion))
    {
        fprintf(stderr, "%s: %s\n", output, mdjvu_get_error_message(error));
        exit(1);
    }
</PRE>
For the sake of completeness, this example declares <I>error</I> again
(it was already declared in Step 1). Remove the second declaration
if you plan to compile both fragments in the same scope.
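<P>
Because an existing output file is replaced without warning, a program may want to
check for it before saving. The helper below is a minimal sketch using only the
standard C library; <I>file_exists()</I> is a hypothetical name and is not part of minidjvu.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of minidjvu.
 * Returns 1 if the file can be opened for reading, 0 otherwise.
 * A file that exists but cannot be read is reported as absent. */
int file_exists(const char *path)
{
    FILE *f;

    assert(path);
    f = fopen(path, "rb");
    if (!f)
        return 0;
    fclose(f);
    return 1;
}
```

A caller could refuse to save, or ask the user, when <I>file_exists(output)</I> returns 1.
The check is only advisory: another process can still create the file between the check and the save.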
<HR>
<H3>Step 6: clean up</H3>
If you no longer need the image and the bitmap, destroy them:
<PRE>
    mdjvu_image_destroy(image);
    mdjvu_bitmap_destroy(bitmap);
</PRE>
You could just as well destroy the bitmap immediately after splitting it.
</HTML>