=head1 NAME
puzzle_init_cvec, puzzle_init_dvec, puzzle_fill_dvec_from_file, puzzle_fill_cvec_from_file, puzzle_fill_cvec_from_dvec, puzzle_free_cvec, puzzle_free_dvec, puzzle_init_compressed_cvec, puzzle_free_compressed_cvec, puzzle_compress_cvec, puzzle_uncompress_cvec, puzzle_vector_normalized_distance - compute comparable signatures of bitmap images
=head1 SYNOPSIS
#include <puzzle.h>
int puzzle_init_context(PuzzleContext *I<context>);
int puzzle_free_context(PuzzleContext *I<context>);
int puzzle_init_cvec(PuzzleContext *I<context>, PuzzleCvec *I<cvec>);
int puzzle_init_dvec(PuzzleContext *I<context>, PuzzleDvec *I<dvec>);
void puzzle_fill_dvec_from_file(PuzzleContext *I<context>, PuzzleDvec *I<dvec>, const char *I<file>);
void puzzle_fill_cvec_from_file(PuzzleContext *I<context>, PuzzleCvec *I<cvec>, const char *I<file>);
void puzzle_fill_cvec_from_dvec(PuzzleContext *I<context>, PuzzleCvec *I<cvec>, const PuzzleDvec *I<dvec>);
void puzzle_free_cvec(PuzzleContext *I<context>, PuzzleCvec *I<cvec>);
void puzzle_free_dvec(PuzzleContext *I<context>, PuzzleDvec *I<dvec>);
void puzzle_init_compressed_cvec(PuzzleContext *I<context>, PuzzleCompressedCvec *I<compressed_cvec>);
void puzzle_free_compressed_cvec(PuzzleContext *I<context>, PuzzleCompressedCvec *I<compressed_cvec>);
int puzzle_compress_cvec(PuzzleContext *I<context>, PuzzleCompressedCvec *I<compressed_cvec>, const PuzzleCvec *I<cvec>);
int puzzle_uncompress_cvec(PuzzleContext *I<context>, PuzzleCompressedCvec *I<compressed_cvec>, PuzzleCvec * const I<cvec>);
double puzzle_vector_normalized_distance(PuzzleContext *I<context>, const PuzzleCvec *I<cvec1>, const PuzzleCvec *I<cvec2>, const int I<fix_for_texts>);
=head1 DESCRIPTION
The Puzzle library computes a signature out of a bitmap picture. Signatures are comparable and similar pictures have similar signatures.
After a picture has been loaded and uncompressed, featureless parts of the image are skipped (autocrop), unless that step has been explicitly disabled; see L<puzzle_set(3)>.
=head1 LIBPUZZLE CONTEXT
Every public function requires a I<PuzzleContext> object, which stores all the tunables the library needs.
Any application using libpuzzle should initialize a I<PuzzleContext> object with B<puzzle_init_context()> and free it after use with B<puzzle_free_context()>.
=over 2
PuzzleContext context;
puzzle_init_context(&context);
[...]
puzzle_free_context(&context);
=back
=head1 DVEC AND CVEC VECTORS
Once the image has been cropped, the next step is to divide it into a grid and to compute the average intensity of soft-edged pixels in every block. The result is a I<PuzzleDvec> object.
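The grid step can be pictured with the following minimal sketch. It is not the library's code: B<block_averages()> is a hypothetical helper that uses plain box averaging over an 8-bit grayscale buffer, and it leaves out both the soft-edge weighting and the autocrop step mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of libpuzzle.
 * Average the intensities of a width x height grayscale image over a
 * grid x grid layout of blocks. `out` must hold grid * grid doubles.
 * Plain box averaging: no soft edges, no autocrop.
 */
void block_averages(const unsigned char *pixels,
                    size_t width, size_t height,
                    size_t grid, double *out)
{
    size_t bx, by, x, y;

    assert(pixels != NULL && out != NULL);
    assert(grid > 0 && width >= grid && height >= grid);

    for (by = 0; by < grid; by++) {
        /* Rows covered by this block: [y0, y1) */
        const size_t y0 = by * height / grid;
        const size_t y1 = (by + 1) * height / grid;

        for (bx = 0; bx < grid; bx++) {
            /* Columns covered by this block: [x0, x1) */
            const size_t x0 = bx * width / grid;
            const size_t x1 = (bx + 1) * width / grid;
            double sum = 0.0;

            for (y = y0; y < y1; y++) {
                for (x = x0; x < x1; x++) {
                    sum += pixels[y * width + x];
                }
            }
            out[by * grid + bx] = sum / (double) ((x1 - x0) * (y1 - y0));
        }
    }
}
```

The output array plays the role of the I<vec> field of a I<PuzzleDvec> object: one average per block, stored row by row.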
I<PuzzleDvec> objects should be initialized before use, with B<puzzle_init_dvec()> and freed after use with B<puzzle_free_dvec()>.
The I<PuzzleDvec> structure has two important fields: I<vec> is the pointer to the first element of the array containing the average intensities, and I<sizeof_compressed_vec> is the number of elements.
I<PuzzleDvec> objects are not comparable, so what you usually want is to transform these objects into I<PuzzleCvec> objects.
A I<PuzzleCvec> object is a vector that describes the relationships between adjacent blocks of a I<PuzzleDvec> object.
The B<puzzle_fill_cvec_from_dvec()> function fills a I<PuzzleCvec> object from a I<PuzzleDvec> object.
Like I<PuzzleDvec> objects, I<PuzzleCvec> objects must be initialized with B<puzzle_init_cvec()> before use and freed with B<puzzle_free_cvec()> after use.
I<PuzzleCvec> objects hold a vector whose first element is pointed to by the I<vec> field; the number of elements is in the I<sizeof_vec> field.
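To illustrate what "relationships between adjacent blocks" can mean, here is a simplified sketch. It assumes that each relationship is reduced to one of five levels, from -2 (much darker) to +2 (much lighter), which this page does not state. The cut-off values and both function names are hypothetical, and only the right-hand neighbour is considered; the library chooses its own limits and neighbourhood.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed cut-offs; the library derives its own limits. */
#define SKETCH_SAME_LIMIT 2.0   /* |diff| <= this: "same" (0)        */
#define SKETCH_MUCH_LIMIT 50.0  /* |diff| >  this: "much" (-2 or +2) */

/* Map an intensity difference to one of five levels, -2..+2. */
signed char quantize_relation(double diff)
{
    if (diff < -SKETCH_MUCH_LIMIT) return -2;
    if (diff < -SKETCH_SAME_LIMIT) return -1;
    if (diff <= SKETCH_SAME_LIMIT) return 0;
    if (diff <= SKETCH_MUCH_LIMIT) return 1;
    return 2;
}

/*
 * Fill `out` with the relation between each block and its right-hand
 * neighbour, row by row. `avg` holds grid * grid block averages and
 * `out` must hold grid * (grid - 1) elements.
 * Returns the number of elements written.
 */
size_t relations_to_right(const double *avg, size_t grid, signed char *out)
{
    size_t row, col, n = 0;

    assert(avg != NULL && out != NULL && grid > 1);

    for (row = 0; row < grid; row++) {
        for (col = 0; col + 1 < grid; col++) {
            out[n++] = quantize_relation(avg[row * grid + col + 1] -
                                         avg[row * grid + col]);
        }
    }
    return n;
}
```

Because only relative differences are kept, two pictures that differ in overall brightness can still yield the same vector, which is what makes this kind of signature comparable.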
=head1 LOADING PICTURES
I<PuzzleDvec> and I<PuzzleCvec> objects can be computed from a bitmap picture file, with B<puzzle_fill_dvec_from_file()> and B<puzzle_fill_cvec_from_file()>.
I<GIF>, I<PNG> and I<JPEG> file formats are currently supported and automatically recognized.
Here's a simple example that creates a I<PuzzleCvec> object out of a file.
=over 2
PuzzleContext context;
PuzzleCvec cvec;
puzzle_init_context(&context);
puzzle_init_cvec(&context, &cvec);
puzzle_fill_cvec_from_file(&context, &cvec, "test-picture.jpg");
[...]
puzzle_free_cvec(&context, &cvec);
puzzle_free_context(&context);
=back
=head1 COMPARING VECTORS
In order to check whether two pictures are similar, you need to compare their I<PuzzleCvec> signatures, using B<puzzle_vector_normalized_distance()>.
That function returns a distance between 0.0 and 1.0. The smaller the distance, the more similar the pictures.
Tests on common pictures show that a normalized distance below 0.6 (also defined as I<PUZZLE_CVEC_SIMILARITY_THRESHOLD>) means that both pictures are visually similar.
If that threshold is not right for your set of pictures, you can experiment with I<PUZZLE_CVEC_SIMILARITY_HIGH_THRESHOLD>, I<PUZZLE_CVEC_SIMILARITY_LOW_THRESHOLD> and I<PUZZLE_CVEC_SIMILARITY_LOWER_THRESHOLD> or with your own value.
If the I<fix_for_texts> argument of B<puzzle_vector_normalized_distance()> is I<1>, a fix is applied to the computation in order to deal with bitmap pictures that contain text. That fix is recommended, as it allows using the same threshold for that kind of picture as for generic pictures.
If I<fix_for_texts> is I<0>, that special way of computing the normalized distance is disabled.
=over 2
PuzzleContext context;
PuzzleCvec cvec1, cvec2;
double d;
puzzle_init_context(&context);
puzzle_init_cvec(&context, &cvec1);
puzzle_init_cvec(&context, &cvec2);
puzzle_fill_cvec_from_file(&context, &cvec1, "test-picture-1.jpg");
puzzle_fill_cvec_from_file(&context, &cvec2, "test-picture-2.jpg");
d = puzzle_vector_normalized_distance(&context, &cvec1, &cvec2, 1);
if (d < PUZZLE_CVEC_SIMILARITY_THRESHOLD) {
  puts("Pictures are similar");
}
puzzle_free_cvec(&context, &cvec2);
puzzle_free_cvec(&context, &cvec1);
puzzle_free_context(&context);
=back
=head1 CVEC COMPRESSION
In order to reduce storage needs, I<PuzzleCvec> objects can be compressed to 1/3 of their original size.
I<PuzzleCompressedCvec> structures hold the compressed data. These structures must be initialized with B<puzzle_init_compressed_cvec()> before use and released with B<puzzle_free_compressed_cvec()> after use.
B<puzzle_compress_cvec()> compresses a I<PuzzleCvec> object into a I<PuzzleCompressedCvec> object.
B<puzzle_uncompress_cvec()> uncompresses a I<PuzzleCompressedCvec> object into a I<PuzzleCvec> object.
=over 2
PuzzleContext context;
PuzzleCvec cvec;
PuzzleCompressedCvec c_cvec;
[...]
puzzle_init_compressed_cvec(&context, &c_cvec);
puzzle_compress_cvec(&context, &c_cvec, &cvec);
[...]
puzzle_free_compressed_cvec(&context, &c_cvec);
=back
The I<PuzzleCompressedCvec> structure has two important fields: I<vec>, a pointer to the first element of the compressed data, and I<sizeof_compressed_vec>, the number of elements.
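A ratio of 1/3 is plausible if each element of the vector can only take five distinct values, since three such values give 125 combinations and fit in a single byte. The sketch below shows that idea under that assumption, which this page does not state. B<pack3()> and B<unpack3()> are hypothetical helpers and do not describe the layout the library actually stores.

```c
#include <assert.h>

/* Pack three values from -2..+2 into one byte as a base-5 number (0..124). */
unsigned char pack3(signed char a, signed char b, signed char c)
{
    assert(a >= -2 && a <= 2);
    assert(b >= -2 && b <= 2);
    assert(c >= -2 && c <= 2);
    return (unsigned char) ((a + 2) + 5 * (b + 2) + 25 * (c + 2));
}

/* Reverse of pack3(): recover the three values from one packed byte. */
void unpack3(unsigned char packed, signed char out[3])
{
    assert(packed < 125);
    out[0] = (signed char) (packed % 5 - 2);
    out[1] = (signed char) (packed / 5 % 5 - 2);
    out[2] = (signed char) (packed / 25 - 2);
}
```

Applications should keep using B<puzzle_compress_cvec()> and B<puzzle_uncompress_cvec()> rather than a scheme of their own, so that stored signatures remain readable by the library.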
=head1 RETURN VALUE
Functions that return an I<int> return I<0> on success, and I<-1> if something went wrong.
=head1 AUTHORS
Frank DENIS libpuzzle at pureftpd dot org
=head1 ACKNOWLEDGMENTS
Xerox Research Center, H. CHI WONG, Marschall BERN, David GOLDBERG, Sameh SCHAFIK
=head1 SEE ALSO
L<puzzle_set(3)>, L<puzzle-diff(8)>
=cut