File: CHANGELOG

clara 20031214-2
[search "->" to find people]

Dec 06 2003 -> Giulio Lunati added heuristic to find splitting points
            in partial matching.
            - minor changes in preproc.c
            - added P_SPL[] and p_spl_comp() in symbol.c
            - modified classify(), step 8 in symbol.c
            - enabled partial matching (p_match=1 in clara.c)

March 25 2003 Added additional djvu information at char level,
              suggested by -> Imre Simon.
              Copied the development files to the net (20030318).

March 18 2003 -> John Wehle changed some barcode parameters and updated
              obd.c: 

              1) Finished checksum support.  A side effect is
                 that the c39 start / stop character (*) is no
                 longer included in the decoded barcode.

              2) Finished c128 support.

              3) Added interleaved 2 of 5 support.

              4) Default to autodetecting the symbology.

              5) Generate placeholders for missing widths
                 when converting the scanline.
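
Remark (illustration only, not taken from obd.c): Code 39 defines an
optional modulo-43 check character. The sketch below computes it under
that standard definition; the function name c39_check_char() is
hypothetical, and this file does not say how obd.c itself implements
its checksum support.

```c
#include <string.h>

/* The 43 Code 39 data characters in value order (0..42). */
static const char C39_CHARSET[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%";

/* Return the modulo-43 check character for data, or 0 if data
   contains a character outside the Code 39 set. */
char c39_check_char(const char *data)
{
    int sum = 0;

    for (; *data != '\0'; ++data) {
        const char *p = strchr(C39_CHARSET, *data);
        if (p == NULL)
            return 0;
        sum += (int)(p - C39_CHARSET);
    }
    return C39_CHARSET[sum % 43];
}
```

The start / stop character (*) is not part of the data and does not
enter the sum.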

February 26 2003 - Integrated -> John Wehle's decoder into Clara OCR.
                   To read a barcode, scan it (background must be white)
                   and try "clara -y res searchb=1 -b -f file.pbm",
                   where res is the resolution in dots per inch.
                   Obs. currently only one barcode per file is read.
                   Obs. use searchb=2 to display the pixels instead
                   of trying to decode.
                   Added -> John Wehle's c39.pbm sample to the tarball,
                   the command "clara -y 204 searchb=1 -b -f c39.pbm"
                   should produce "*THE QUICK BROWN*".
                   Applied -> John Wehle's patches.
                   Added NO_RINTF flag to Makefile. For now this is the
                   default, but it can be undefined on systems that
                   provide rintf().
                   Copied the development files to the net (20030226).

February 14 2003 - Fixed a bug at skel(), some skeletons were
                   truncated.
                   More fixes on PAGE_FATBITS display. The modes that
                   do require FSxFS now force this window size.
                   Replaced "indo-arabic" by "decimal" throughout the
                   source code.

February 13 2003 - Finished first operational implementation of
                   classifier 4.
                   Fixed the fatbits display, broken since
                   PAGE_FATBITS enlarged (see Dec 3 2002).
                   Added skeleton computation method #7.

February 10 2003 - Added classes.txt (hardcopy of "patterns (list)"
                   tab, generated when batch processing).
                   Fixed: "changed" flags were not being set on some
                   cases.
                   More variables can be set on command line (patterns=,
                   bf_auto=, see checkvar for full listing).
                   Copied the development files to the net (20030210).

February 3 2003 - Added build_internal_patterns().
                  Classifier 4 now uses the same structure as
                  the others (cf. Dec. 14 and the note on clara.c
                  on calling build_internal_patterns()).
                  -> Imre Simon announced the availability of
                  http://www.ime.usp.br/~is/atc/atc.djvu a
                  300-page Computer Science textbook. The text
                  frame was generated by claraocr.

January 29 2003 - Added pp_save and pp_only. Now it's possible to
                  invoke the program as a preprocessor only, using
                  -b and these two variables. The preprocessed image
                  will be saved as pgm, with "_pp" added to the
                  original filename.
                  Added variable report to inhibit the creation of
                  report.txt (use report=0 on the command line).

January 28 2003 - Copied the development files to the net (20030128).

January 27 2003 - Giulio Lunati announced a new classifier.

January 23 2003 - Dot-related fixes (thanks -> T. Ribbrock and
                  -> Berend Reitsma).
                  Added restrictors start_at and stop_at. See the
                  documentation of -f to know how to use them.
                  Re-scan all patterns (Edit menu) on by default.

January 21 2003 - Now -b implies -u (easy way to avoid a crash).

January 20 2003 - Added djvu output format '-o d' (so the
                  text output can be inserted in a djvu document,
                  see http://www.djvuzone.org/ for details on
                  djvu). Oops.. iso to utf-8 post-filtering
                  is required. Thanks -> Imre Simon for suggesting
                  this feature, studying djvu details and performing
                  tests.
                  Disabled blockfinding by default. To enable use
                  dontblock=0 on the command line.

January 16 2003 - -> Mark F. Heiman reported a successful build
                  on OS X, but behaviour isn't ok (endianness problems?).

January 13 2003 - -> Thomas Ribbrock reported/commented a tutorial
                  problem.

January 8 2003 - Consolidated preprocessing support for image
                 compressors. For now this works as follows: load
                 an image, OCR it asking to "build the bookfont
                 automatically", ask "save replacing symbols"
                 (File menu). The tozip.pbm file is the
                 preprocessed image. It'll achieve better compression
                 rates than the original one.
                 Copied the development files to the net (20030108).

January 2 2003 - -> Berend Reitsma sent comments concerning
                 diacritics (added directly to the code).

December 17 2002 - -> Benjamin Wong sent patches (applied).

December 14 2002 - Added a new bitmap comparison method (#4), referred to
                   as "shape" in the source code (see the function
                   bmpcmp_shape). It's partly based on -> Rodrigo
                   Readi's suggestions.

December 4 2002 - -> Adriano Nagelschmidt Rodrigues donated a 15"
                  monitor to the project.

December 3 2002 - Implemented/activated features suggested by
                  -> Rodrigo Readi: Buffered I/O became default to
                  avoid flickering (obs. this is experimental stuff,
                  use -u to switch off).
                  PAGE_FATBITS now displays areas larger than FSxFS. It
                  enlarges as needed to cover the entire plate. Special
                  display features (like "Show border") are restricted
                  to the top-left FSxFS rectangle, though.
                  Copied the development files to the net (20021203).

November 23 2002 - More suggestions by -> Rodrigo Readi. Added his
                   suggestions to the file Rodrigo.Readi.

November 9 2002 - -> Ed Snible reported fatal errors (UNCHECKED).
                  -> Rodrigo Readi sent various good suggestions (to
                  be worked on). He reported good results using fraktur.

November 4 2002 - -> Rodrigo Meza reported problems recognizing trained
                  characters (UNCHECKED).

September 28 2002 - -> Andy Miller reported problems on X-Win32
                    displays.

September 8 2002 - -> Eduardo Maan renamed selthresh.pl to selthresh
                   and wrote the man page selthresh.1.

July 23 2002 - Copied the development files to the net (20020723).
               Fixed claraocr-devel list (subscriptions made between
               approx. 20020522 and 20020723 may have silently
               failed). Thanks -> Charles Davant.

July 22 2002 - Uploaded CF doubts to www.claraocr.org.

July 19 2002 - Added doubts counter to the web interface.
               Fixed PAGE (doubts) initialization bug.
               Fixed pattern typer initialization bug.
               Set various regeneration/changed flags throughout the code.
               Images added to PATTERN (types).

July 17 2002 - Finished new web interface.

July 16 2002 - Started writing the new web interface.
               Added registration/authentication.

July 11 2002 - Replaced "obs." by "remark:" along all files (thanks
               -> Roy).
               Added draw_thin flag (see redraw.c).

July 10 2002 - Fixed a manual typo (thanks -> Roy).
               Perfect matching tests for grabbed images (thanks -> Roy),
               fixed classify() bug ("direct" variable not being reset),
               added dump_bm() to debug classify(), fixed computation
               of skeleton limits for algorithms 3, 4, 5 and 6.
               Diaeresis support a bit better now (thanks -> Werner Speer
               and -> Kolja Brix).

July 8 2002 - -> Charles Davant fixed an alias problem reported by
              -> Kolja Brix.

July 4 2002 - Added page (doubts) window.
              First procedure for classification of doubts.

July 3 2002 - Implemented Abagar debug mode.
              Diagnosed a local binarization problem (fragments that
              disappear - search "dropped fragments" at pbm2cl.c).
              Fixed references to "pp_thresh" and "gl_thresh" (now
              h_thresh is referred).

July 2 2002 - Added horizontal disjunction criterion to cmpln()
              (bad line ordering).

June 28 2002 - Tests using -> Werner Speer file.
               Partial matching off by default.

May 22 2002 - Copied the development files to the net (20020522).

May 20 2002 - Added dict_sel.

May 18 2002 - First dictionary interface.

May 16 2002 - First win32 binary, by -> Brian G.

May 14 2002 - Applied -> Franz Bakan patches and added to the tarball the
              additional files required by OS/2.
              Copied the development files to the net (20020514).

May 7 2002 - Fixed two initialization bugs reported by -> Thomas Klausner.
             Added -pedantic to the makefile.
             -> Franz Bakan reported that Clara OCR compiles and runs
             on OS/2 with minor changes.

May 5 2002 - -> Brian G. reported that Clara OCR compiles cleanly on
             win32 with cygwin. Next step is linking.

May 3 2002 - Released 0.9.9.

May 2 2002 - Fixed local binarizer bug (see the new variable "eaten" at
             pbm2bm()).
             Fixed a tutorial error.
             Fixed -geometry (thanks -> "groggy").
             All names cited by the CHANGELOG were acknowledged by the
             main documentation (end of CREDITS section).

April 30 2002 - Fixed the hyperlink edit/n.
                Switched off partial matching during binarization for
                faster operation (see step_5()).
                Copied the development files to the net (20020430).
                Fixed presentation problem.
                -> Brian G. is trying a win32 port.

April 29 2002 - Added a more informative message for some non-supported
                file formats (thanks -> "groggy").
                As the PAGE behaviour for graymaps is confusing, now it
                enters three-state automatically after segmentation.
                Dropped -A (use -allow_pres instead).
                Removed heuristic 5 from skeleton auto-tune (see
                the CSP array).
                Developer's guide fixed.

April 27 2002 - Added clipping to draw_rb().
                Added symbol display on HTML windows: use IMG SRC=symbol/n,
                see an example at mk_pattern_action().
                Added dynamic registration of GUI buttons. To see the
                test button, use the command-line switch -pp_test.
                Finished 'PATTERN (action)'.

April 27 2002 - Added "avoid-links" feature in segmentation step.
                Changed reference to hqbin() in clara.c.
                Reactivated Test button (only for my purposes :-).
                Wrote pm_fl() in html.c (not used!).
                Added "threshold factor" and "try avoid links"
                parameters in Tune tab.
                Improved balance code (now works fine);
                other internal changes in preproc.c
                Added 3 obs. to book.c [-> Giulio Lunati]

April 26 2002 - Dropped -C (use dmmode= instead).
                Dropped -S (use RW= instead).
                Initial window now tries to approach height/width
                ratio 3/4.
                Added SL status. Dropped -G (use TC=, SL=, etc).
                Implemented 'PATTERN (action)' window.

April 25 2002 - Added status tskel_ready.
                Now all menus (not only those on the menu bar) dismiss
                when unfocused.
                Added check to skel() to detect code that changes skeleton
                params without calling consist_skel().
                Added auto-tune checkbox to TUNE (skel).
                Disabled auto classification of patterns by type.
                Removed st_auto_tune status (using the old status
                st_auto instead).
                Removed skeleton auto-tune entry from TUNE tab (the
                new checkbox and submit button at TUNE (skel) tab
                do a much better job).

April 24 2002 - Applied -> Giulio's patch (interpolation and others).
                Fixed skeleton bug: not calling consist_skel()
                after some changes of skeleton parameters.
                Changed skeleton quality coverage criterion, and
                documented carefully skel_quality().
                Fixed skeleton auto-tune bug: not calling pskel().

April 23 2002 - Fixed pp_thresh() (thanks -> Giulio Lunati).
                Copied the development files to the net (20020423).

April 22 2002 - Finished updating the user's manual.
                Updated preproc.c (thanks -> Giulio Lunati).
                Added (naive) double resolution, interpolation
                to come.

April 20 2002 - Double resolution tests.

April 16 2002 - Copied the development files to the net (20020416).
                Added translation of long options and a command-line interface
                to define internal variables (see checkvar() and
                process_cl()).
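
Remark (illustration only): the name=value form used to set internal
variables on the command line can be parsed as sketched below. The
struct and the function set_var() are hypothetical; the two variable
names are borrowed from other entries in this file, and the actual
checkvar() and process_cl() may work quite differently.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for two of the variables named in this file. */
struct settings {
    int searchb;
    int report;
};

/* Apply one "name=value" argument; return 1 if handled, 0 otherwise. */
int set_var(struct settings *s, const char *arg)
{
    const char *eq = strchr(arg, '=');
    size_t n;
    int v;

    if (eq == NULL || eq == arg)
        return 0;               /* no '=' or empty name */
    n = (size_t)(eq - arg);     /* length of the name part */
    v = atoi(eq + 1);           /* integer value after '=' */

    if (n == 7 && strncmp(arg, "searchb", n) == 0) {
        s->searchb = v;
        return 1;
    }
    if (n == 6 && strncmp(arg, "report", n) == 0) {
        s->report = v;
        return 1;
    }
    return 0;                   /* unknown variable name */
}
```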

April 15 2002 - Removed buttons "bold", "italic" and "test".
                Finished fixing the tutorial.
                Added balance checkbox to the tune tab.
                Fixed a thresholding bug (the code 20020409 was
                unable to handle PBMs).
                The link "nullify and remove" is back to the PATTERN
                tab (so now it's possible to remove one pattern).
                TUNE tab entries reorganized into the same sequence
                as the OCR steps.
                The four binarization methods were collected into
                one radio at the TUNE tab (manual global, histogram,
                local and local strong).
                "How many types" removed from TUNE tab (now the
                array of types grows automatically when needed).
                Fixed bugs related to the Debug menu (the PAGE
                menu became broken and the Debug menu wasn't
                working).
                Reorganized various menus and changed some labels.
                Constant MAX_MT now is 45.
                Fixed segfault related to "show words" option.

April 9 2002 - Added compositions / /, / a, / o, / A, / O, " s (thanks
               -> Jeroen Ruigrok).
               Changed some hardcoded constants to try avoiding
               partial matching mistakes (thanks -> Thomas Klausner).
               Copied the development files to the net (20020409).

April 6 2002 - Applied -> Giulio Lunati's patch (renamed deskew.c to
               preproc.c, added balance relative black, test
               button, and code cleanup).
               Removed deskewer name, as requested.
               Added "Debug" menu.
               Added "Show unaligned symbols" feature.
               Moved "Show lines (geometrical)" to "Debug" menu.

April 2 2002 - Copied the development files to the net (20020402).

April 1 2002 - Added field "bs" to symbol structure to store the base
               symbol of accents.
               Support for double quotes enhanced, partially working.
               Added macro trans().

March 30 2002 - Added lfa reset at rmvotes().
                Partial matching restricted to symbol classification (mode
                1 at classify()).
                Fixed a bug on the local binarizer (training from
                grayscale was not working).
                Tested -> Giulio's code on a 24-bit padded display (works).
                Fixed seed search code at pbm2bm(). Some letters were
                being dropped by the local binarizer due to this bug.
                Implemented auto-tune of skeleton parameters (global),
                it's the function tune_skel_global().
                Changed the semantics of -a: it now uses the st_auto_global
                status instead of st_auto. Same for "auto tune skeleton
                parameters" on the TUNE tab.
                Integrated tune_skel_global() to the engine (see
                prepare_patterns()) and dropped the old calls to
                tune_skel().
                Linked -> Giulio Lunati's histogram-based thresholder
                with Clara OCR. It's used by default when converting
                grayscale to black-and-white.
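
Remark (illustration only): this file does not describe how the
histogram-based thresholder picks its threshold. As a stand-in, the
sketch below uses Otsu's method, a well-known histogram-based
technique; it is not claimed to be the algorithm in preproc.c, and the
name otsu_threshold() is hypothetical.

```c
#include <stddef.h>

/* Return a threshold t (0..255) chosen by Otsu's method: pixels with
   value <= t form one class, the rest the other, and t maximizes the
   between-class variance. */
int otsu_threshold(const unsigned char *pix, size_t n)
{
    double hist[256] = {0};
    double total_sum = 0.0, sum0 = 0.0, w0 = 0.0, best = -1.0;
    int t, best_t = 0;
    size_t i;

    /* Build the gray-level histogram. */
    for (i = 0; i < n; ++i)
        hist[pix[i]] += 1.0;
    for (t = 0; t < 256; ++t)
        total_sum += t * hist[t];

    for (t = 0; t < 256; ++t) {
        double w1, m0, m1, var;

        w0 += hist[t];              /* pixels in the dark class */
        sum0 += t * hist[t];
        w1 = (double)n - w0;        /* pixels in the bright class */
        if (w0 == 0.0 || w1 == 0.0)
            continue;               /* one class is empty */
        m0 = sum0 / w0;
        m1 = (total_sum - sum0) / w1;
        var = w0 * w1 * (m0 - m1) * (m0 - m1);
        if (var > best) {
            best = var;
            best_t = t;
        }
    }
    return best_t;
}
```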

March 28 2002 - -> Giulio Lunati added support for sparse 24bpp displays. Such
                displays have a depth of 24bpp, but pixel colors are padded
                with an additional (unused) byte to provide 32-bit
                alignment.
                Added automatic conversion from PBM to PGM on loading,
                basically to simplify the program at the expense of system
                memory (now black-and-white images are internally handled
                as 8bpp graymaps before segmentation).
                Fixed a problem concerning partial matches (dashes
                being recognized as a sequence of dots).
                Diagnosed and fixed a re-training bug: every time a class
                was re-trained, class data was lost. Thanks -> De Clarke
                for reporting it.
                One rule for validation of dots based on alignment was
                commented out (see the function recog_validation()).
                The usage of the term "lineart" was carefully reviewed
                on the manuals.
                Removed some test code that was causing problems to
                spyhole().
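
Remark (illustration only): the PBM to PGM conversion on loading
mentioned above amounts to expanding packed bits into one byte per
pixel. The sketch assumes the raw (P4) PBM layout and a graymap with
maximum value 255; the name pbm_to_gray() is hypothetical and the
actual loader may differ.

```c
#include <stddef.h>

/* Expand a raw (P4) PBM bitmap into an 8bpp graymap.
   src: packed rows, (w + 7) / 8 bytes each, most significant bit
        first, 1 meaning black;
   dst: w * h bytes, 0 for black and 255 for white. */
void pbm_to_gray(const unsigned char *src, unsigned char *dst,
                 size_t w, size_t h)
{
    size_t row_bytes = (w + 7) / 8;
    size_t x, y;

    for (y = 0; y < h; ++y) {
        for (x = 0; x < w; ++x) {
            unsigned char byte = src[y * row_bytes + x / 8];
            int black = (byte >> (7 - x % 8)) & 1;
            dst[y * w + x] = black ? 0 : 255;
        }
    }
}
```

The cost is memory: the graymap takes about eight times the space of
the packed bitmap.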

March 26 2002 - Added code to handle double quotes (unfinished).
                Added -o command-line switch and the ability to generate
                text output (thanks -> De Clarke).
                Now using rintf() instead of roundf() (thanks -> Thomas
                Klausner).
                Copied the development files to the net (20020326).

March 21 2002 - Scanned Anchieta's Grammar, p. 48-117

March 18 2002 - Fixed a bug at classify().
                Copied the development files to the net (20020318).
                Scanned Anchieta's Grammar, p. 1-48.

March 16 2002 - Finished linking -> Giulio's deskewer.
                Changed the copyright notice displayed by the GUI.
                Added the modified version of selthresh.pl (by -> Tyler
                Akins) to the tarball.
                Now partial matching can be called by the OCR engine.
                Robustified cmpln().

March 11 2002 - Fixed some partial matching bugs.
                Copied the development files to the net (20020311).

March 9 2002 - Partial matching now works.

March 8 2002 - Started writing code to perform partial matching (to solve
               horizontal links).

March 6 2002 - Fixed a segmentation bug, at new_cl() (thanks -> Giulio Lunati).

March 4 2002 - Trying to link -> Giulio's deskewer with Clara OCR.

February 28 2002 - -> Giulio Lunati contributed some filters.

February 27 2002 - -> Tyler Akins reported various problems.

February 13 2002 - Added auto-classify, as suggested some time ago by
                   -> Adriano Nagelschmidt Rodrigues. Added "Auto classify" entry
                   to the Edit menu.
                   Copied the development files to the net (20020213).

February 11 2002 - Changed the directory scanning code. Now the PAGE (list)
                   tab sorts the filenames as numbers (e.g. "2.pgm" before
                   "10.pgm").

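Remark (illustration only): sorting filenames as numbers, as in the
February 11 2002 entry, can be done with a qsort() comparator like the
one sketched here. The name cmp_numeric_names() is hypothetical and
the directory scanning code may use a different approach.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of char *: order by the leading
   decimal number, then by plain string comparison. */
int cmp_numeric_names(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    long na = strtol(sa, NULL, 10);
    long nb = strtol(sb, NULL, 10);

    if (na != nb)
        return (na < nb) ? -1 : 1;
    return strcmp(sa, sb);
}
```

Names without a leading number parse as 0 and fall back to strcmp()
ordering among themselves.
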
February 9 2002 - Applied -> Giulio Lunati's patch to handle apostrophe,
                  colon, semicolon and multichar alignment.

February 7 2002 - Added checkboxes "use local binarizer" and "strong mode"
                  to the TUNE tab.
                  Copied the development files to the net (20020207).

February 6 2002 - Local weak threshold tests.

February 5 2002 - Added relaxed mode to spyhole().

February 4 2002 - The local binarizer is now integrated to the main OCR
                  engine in "weak" and "strong" fashions.
                  Added operation mode 4 to spyhole(), it's a faster
                  version of mode 0.
                  Local binarizer measurements and bugfixes.

February 2 2002 - Wrote find_thing() to integrate the local binarizer to
                  the OCR engine.
                  Prepared new_cl() to accept fat bitmaps from find_thing().

January 30 2002 - Replaced the screenshot 7.
                  Copied the development files to the net (20020130).

January 29 2002 - Added more entries to the glossary.

January 28 2002 - Added "expected threshold" parameter to spyhole(). Now
                  the "best" threshold is first searched around the expected
                  threshold.
                  Added to spyhole() a heuristic to compute an alternative,
                  larger threshold (called "next"), based on detection of
                  merging of fragments. It's an attempt to solve segmentation
                  problems (broken symbols). The pixels added by this larger
                  threshold are displayed gray by the spyhole.
                  Added relaxed mode to border_path().
                  Documented carefully the detection of merging of fragments
                  (referred to as "spare pixels" in the source code), see the
                  large comment before the main loop of spyhole().

January 26 2002 - Destroyed the binding between pattern and symbol ID
                  (field "e" of structure pdesc, it's still there but
                  out of use).
                  Added the concept of pattern transliteration submission
                  (REV_PATT and review_patt()), because this cannot be
                  done anymore through symbol transliteration submission.
                  Changed update_pattern() to accept a bitmap submission,
                  besides symbol submissions.
                  Added the ability to include a pattern from the bitmap
                  stored by the spyhole buffer (just press the key
                  corresponding to the symbol transliteration while the
                  spyhole is active).
                  Started writing a small glossary specific to Clara OCR.
                  Added initial support for mkdoc.pl to generate the
                  glossary.

January 24 2002 - Received "Que sais-je? - La Terminologie. Noms et
                  Notions" (Alain Rey), sent by -> Daniel Merigoux (Clara
                  OCR is strongly related to dictionaries and
                  vocabularies).
                  -> Ron Young reported more mirror problems.

January 23 2002 - Copied the development files to the net (20020123).

January 22 2002 - Fixed minor bugs at spyhole() and finished a first
                  strategy for choosing per-symbol thresholds.
                  New screenshot spyhole.jpeg added.

January 21 2002 - Added thresholding loop to spyhole().

January 19 2002 - More documentation for border_path(). Adapted border_path()
                  to be used to span connected components.
                  Now spyhole() uses border_path() to select the connected
                  component close to the pointer.

January 18 2002 - Added "What is PBM/PGM/PPM/PNM?" to the FAQ.
                  Split the service avoid() into avoid_geo() and
                  avoid_context(). Moved the context avoidance tests from
                  bmpcmp_skel() to classify_symbol() (in fact, this is a
                  bugfix).
                  More documentation for bmpcmp_skel() and classify().
                  bmpcmp_skel() now supports the comparison of any
                  given bitmap with a given pattern (feature added to
                  implement the new binarizer).
                  Changed the service classify() to support patterns
                  and any bitmap. Dropped compare_patterns().

January 15 2002 - First implementation of spyhole().

January 14 2002 - Re-scanned CF pages 49-110 to work on a new version
                  of the case study http://www.claraocr.org/cf-test/
                  Copied the development files to the net (20020115).

December 26 2001 - Finished attaching to each menu item its availability
                   conditions.

December 24 2001 - Implemented availability tests, short help and
                   diagnostics for menu items.
                   -> Charles Davant fixed the mirrors.

December 21 2001 - Implemented the unavailable state for menu items.
                   -> Imre Simon donated 200 CD-R discs.
                   -> Erich Mueller reported mirror problems.

December 15 2001 - Detection of extremities tests. Good results for
                   sans-serif.
                   Copied the development files to the net (20011217).

December 14 2001 - Finished detection of extremities. Detached is_extr()
                   from dx().

December 13 2001 - Returned the new motherboard (failure on keyboard
                   detection).

December 10 2001 - Copied the development files to the net (20011210).

December 7 2001 - Purchased a new motherboard and Duron 1GHz CPU to replace
                  an old, damaged board.
                  -> Sergei Andrievskii provided some explanations about Russian
                  and Ukrainian Cyrillic.

December 6 2001 - Implemented the service dx() and the circ* stuff.
                  Separated add-closure() from pixel_mlist().
                  Added "detect extremities" menu option (PAGE_FATBITS).

December 3 2001 - Copied the development files to the net (20011203).

November 30 2001 - Implemented manager.c test mode (no operator, have_oper==0).

November 29 2001 - Wrote and tested burn_cd().
                   Added a very crude sound interface to manager.c.

November 28 2001 - Bandwidth tests. It's a hard problem to make the scanner
                   station able to scan and write CDs at the same time.

November 27 2001 - Scanning tests using manager.c. Finished preparing the
                   hardware of the scanner station.

November 26 2001 - Fixed the regeneration of PAGE (symbol) to follow the current
                   symbol when arrow keys are used.
                   Fixed a small webclip-related regeneration problem.
                   Tested carefully the full PGM cycle and zones support.
                   Small documentation updates to reflect the new features.
                   Copied the development files to the net (20011126).
                   Began preparing the scanner station manager (manager.c).

November 24 2001 - Finished the multiple zones stuff.

November 23 2001 - Added "Deskewing" OCR step (currently empty).
                   Changed position of "detect blocks" OCR step. For now using it
                   to activate the CF PGM blockfind.
                   Added "Segmentation" OCR step (for now, it starts pbm2bm reading
                   from the PGM buffer).
                   Added PBM support to pgmload(). Now the PBM loader pbm2bm()
                   will be used only to perform segmentation, and the same handling
                   applies to both PGM and PBM files.

November 22 2001 - Added zfgetc.
                   Added to the z* internal I/O API the ability to threshold
                   the PGM buffer and "read" it as if it were a PBM file.
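
A minimal sketch of the idea, not the actual z* code: one row of 8-bit gray
samples is thresholded into packed PBM-style bits (1 = black, most significant
bit first, as in raw PBM). The name threshold_row and the rule "gray below the
threshold is black" are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: threshold one row of w gray samples into packed
   bits, 1 = black, most significant bit first. `out` must hold
   (w + 7) / 8 bytes. Assumes that samples below `thresh` are black. */
void threshold_row(const unsigned char *gray, size_t w,
                   unsigned char thresh, unsigned char *out)
{
    size_t i;

    assert(gray != NULL && out != NULL);

    /* clear the output row, including the padding bits of the last byte */
    for (i = 0; i < (w + 7) / 8; ++i)
        out[i] = 0;

    /* set one bit per dark sample */
    for (i = 0; i < w; ++i)
        if (gray[i] < thresh)
            out[i / 8] |= (unsigned char)(0x80 >> (i % 8));
}
```

A reader built this way can hand the packed row to code that expects PBM data
without writing an intermediate file.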

November 21 2001 - Changed draw_zone to support non-rectangular zones.
                   Added zfread and zfwrite to prepare the z* internal I/O API
                   to become a fuller-featured I/O selector supporting file,
                   compressed file, internal buffer and TCP I/O.

November 20 2001 - Replaced 'button 2' with 'button 3' throughout the source code.
                   Added handler for mouse button 2. For now, it toggles the
                   max/min view on some windows.
                   Disabled show_hint on waiting_key state.
                   Implemented "Instant thresholding" feature.
                   Added the geometric service inside().

November 19 2001 - Finished reimplementing the CF blockfinder (for now, it can
                   be requested using "C-x p" after loading a PGM file).
                   Changed search_barcode to use the service clusterize().
                   Added initial support for multiple zones.
                   Copied the development files to the net (20011119).
                   Conformed loadpgm() to the partial execution model.

November 17 2001 - Partially reimplemented the old CF blockfinder.
                   Wrote the service clusterize(), to be used by the new
                   blockfinder and also by the barcode searcher.

November 16 2001 - Now pgmblock.c is linked with Clara OCR.
                   Implemented PGM visualization.
                   Now any mouse buttonpress clears the message line.

November 15 2001 - Scanned pages 1-32 from the Candido de Figueiredo
                   Dictionary using the HR5 scanner and SANE 1.0.4
                   (600 dpi).
                   Fixed floating-point comparison problems due to inexact binary
                   representation in selthresh.pl (see warning #6 in the
                   script).
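
The kind of problem is easy to reproduce in any language: values such as 0.1
have no exact binary representation, so equality tests and loops that step a
threshold by a fraction can misbehave. The sketch below is in C rather than
Perl and is not taken from selthresh.pl; nearly_equal and count_steps are
hypothetical names showing two common remedies (tolerance comparison and
integer step counting).

```c
#include <assert.h>

/* Hypothetical helper: compare two doubles within an absolute tolerance
   instead of using ==. */
int nearly_equal(double a, double b, double eps)
{
    double d = a - b;

    assert(eps >= 0.0);
    if (d < 0.0)
        d = -d;
    return d <= eps;
}

/* Hypothetical helper: number of thresholds lo, lo+step, ..., hi.
   The count is rounded to the nearest integer, so a loop can run on an
   integer index i and compute lo + i * step, which does not accumulate
   rounding error. Assumes hi >= lo and step > 0. */
int count_steps(double lo, double hi, double step)
{
    assert(step > 0.0 && hi >= lo);
    return (int)((hi - lo) / step + 0.5) + 1;
}
```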

November 14 2001 - Added "Show pixel" and "Show pattern type" to the "View"
                   menu.
                   Tests using the HR5 scanner. The bundled controller
                   (Domex) didn't work for us (defining SANE_DEBUG_UMAX=128,
                   scanimage stops on the message "waiting scanner"). Using
                   a 2940 instead.

November 13 2001 - Fixed background redraw on the junction PAGE/PAGE_OUTPUT.
                   Added mode "PAGE only" (see menu "Options"). When active,
                   the windows "PAGE (output)" and "PAGE (symbol)" become
                   hidden (useful when you need to visualize a larger portion
                   of the scanned image).
                   Fixed a bug in the computation of the mean bar skew.

November 12 2001 - Wrote search_barcode.
                   Added "Search barcode" to the "Edit" menu.
                   Enhanced closure_at (added the parameter u).
                   Implemented the laserbeam.
                   Changed the behaviour of button ZOOM at tab PAGE (now
                   it'll zoom the PAGE window almost always).
                   Copied the development files to the net (20011112).

November 10 2001 - Fixed a symbol selection problem at PAGE window:
                   those black margins on scanned documents (if any) act as
                   a gigantic symbol that contains almost every pixel. As the
                   symbol selection was based only on checking if the pixel
                   is inside the bounding box, the frames were selected
                   instead of the desired symbol (reported by -> Harold van
                   Oostrom).
                   Added isbar() service.
                   Fixed menu placement problem (the status line was being
                   drawn over the PAGE_FATBITS context menu).
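
The selection fix can be sketched as follows. This is an illustration under
assumed data structures, not Clara's closure code: symbol_t, symbol_hit and
pick_symbol are hypothetical names, and the bitmap is assumed to be packed
1 bit per pixel, most significant bit first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol record: bounding box plus a packed 1-bit bitmap. */
typedef struct {
    int x, y;                   /* top-left corner, page coordinates */
    int w, h;                   /* bounding box size in pixels */
    int stride;                 /* bytes per bitmap row */
    const unsigned char *bits;  /* 1 = black, MSB first */
} symbol_t;

/* Return 1 only if (px,py) lies in the bounding box AND the
   corresponding bitmap bit is set. */
int symbol_hit(const symbol_t *s, int px, int py)
{
    int dx, dy;

    assert(s != NULL && s->bits != NULL);
    dx = px - s->x;
    dy = py - s->y;
    if (dx < 0 || dy < 0 || dx >= s->w || dy >= s->h)
        return 0;   /* outside the bounding box */
    return (s->bits[dy * s->stride + dx / 8] >> (7 - dx % 8)) & 1;
}

/* Return the index of the first symbol hit by (px,py), or -1. */
int pick_symbol(const symbol_t *syms, size_t n, int px, int py)
{
    size_t i;

    for (i = 0; i < n; ++i)
        if (symbol_hit(&syms[i], px, py))
            return (int)i;
    return -1;
}
```

With the bit test in place, a page-sized frame no longer wins the selection
just because its bounding box encloses the clicked pixel.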

November 9 2001 - Enhanced closure_at (no more simple bounding box
                  inclusion, bit state is also tested).
                  Various tests of straight borderlines detection using
                  skewed barcodes.

November 8 2001 - Finished segment extension at closure_border_slines().
                  Finished pixel extension at closure_border_slines().
                  Now the window PAGE_FATBITS displays on the message
                  line the closure-relative pixel coordinates, its
                  parameter (correlation or distance to the interpolated
                  line) and slope (after requesting "search straight lines").

November 7 2001 - Finished correlation code at closure_border_slines().
                  Split "Search straight lines on border" into options
                  "linear" (based on linear distances) and "quadratic"
                  (based on correlation).

November 6 2001 - Fixed display bug (PAGE_FATBITS scrolling).
                  Added "Centralize" and "Search straight lines on border"
                  to the PAGE_FATBITS context menu.
                  Began writing closure_border_slines().
                  The pbm2cl.c code was reviewed to avoid allocating too
                  much memory on dark pages, or pages with large images. On
                  some tests, the new code allocates 6 times less memory
                  and runs 50% faster.

November 5 2001 - Added all three contributed specfiles to the
                  distribution tarball (not a very good idea, but
                  it's the only thing I can do for now). See the
                  file README.RPM for details.
                  Fixed a small regeneration bug (TUNE_PATTERN
                  window).
                  Copied the development files to the net (20011105).

November 3 2001 - Finished border_path().
                  3-bit optimization (see border_path()).

November 1 2001 - A service to compute a border path is available
                  (see border_path()).
                  New context menu: PAGE_FATBITS options (pops up
                  when the mouse button 2 is pressed on the
                  PAGE_FATBITS window).
                  Added "See in fatbits" item to PAGE options menu.
                  Added regeneration control to PAGE_FATBITS to make
                  it easier to implement visualization of
                  pixel-level heuristics.
                  Added "the flea", a visualization feature. If a
                  "flea path" is defined and the fun code is 3, an
                  'x' will be drawn by the interface walking along
                  the flea path.

October 30 2001 - Began implementation of barcode detection.

October 29 2001 - -> R P Herrold contributed a spec file for RH 7.2.

October 27 2001 - More compression tests. Tried to adapt tic98.
                  Copied the development files to the net (20011026).

October 26 2001 - Tested various remappings to try to achieve better PGM
                  compression rates (code available at pgmblock.c). Beats
                  gzip but not bzip2.

October 24 2001 - Dumper bug fixed (reported by -> Stuart Yeates).
                  Crash when allocating large buffers using alloca (reported
                  by -> Stuart Yeates). Fixed, now using malloc instead.
                  Added checks to review_tr() to refuse symbols that are
                  too large to be entered as patterns.
                  Added the answer to -> Ho Chak Hung to the Developer's Guide.
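
The alloca fix follows a general pattern: alloca takes memory from the stack
and has no way to report failure, so a large request can simply crash, while
malloc or calloc return NULL and let the caller recover. A minimal sketch,
with alloc_work_buffer as a hypothetical name (not a Clara function):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: return a zeroed w*h byte work buffer from the
   heap, or NULL if the size is zero, would overflow size_t, or the
   allocation fails. The caller must free() the result. */
unsigned char *alloc_work_buffer(size_t w, size_t h)
{
    if (w == 0 || h == 0 || w > (size_t)-1 / h)
        return NULL;
    return calloc(w, h);
}
```

The caller tests the result against NULL and reports an error instead of
continuing, then releases the buffer with free() when done.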

October 23 2001 - Crash on doubleclicking an anchor at PAGE_LIST (fixed).
                  Fixed an HTML parsing problem ('>' as part of the value,
                  reported by -> Stuart Yeates).

October 22 2001 - Applied -> Harold van Oostrom's patch to initialize
                  the grid separation and avoid crashing on some
                  unexpected user actions. -> Harold van Oostrom also
                  contributed a spec file for RH and SuSE.

October 20 2001 - Added "test" target to the Makefile.
                  Added sselect(), fselect() and bselect(). These are required
                  to implement an auto-test feature. A prototype
                  is already available (try "make test").
                  Fixed a display bug when filling text input fields.
                  Release 0.9.8.

October 18 2001 - Copied the development files to the net (20011018).
                  Fixed an i64 bug.

October 17 2001 - Fixed a bug at recog_validation() (reported by
                  -> Stuart Yeates).
                  Fixed a bug at event.c (zoom- now correctly
                  repositions the page).

October 16 2001 - Skeleton code became a further 20% faster due to
                  optimizations at skel() and cb_border() (the W and H
                  parameters, and the i64 optimization flag, used
                  there to handle 8 pixels at a time when converting
                  to 8bpp).
                  Added BIG_ENDIAN compilation flag (see clara.c), however
                  this is a work in progress.

October 15 2001 - Changed BC to MBB throughout the manual.
                  Skeleton code became 20% faster due to optimizations at
                  bmpcmp_skel().

October 12 2001 - Gave up using the Brazilian Constitution (BC) files as
                  example. Due to the small clearance, it's hard to obtain
                  good results (a larger resolution should solve the
                  problems). Now trying Manuel Bernardes Branco
                  Dictionary (MBB).
                  Added a new strategy to selthresh.pl (see the variables
                  'clean' and 'small').

October 8 2001 - Finished "A first OCR Project".
                 Copied the development files to the net (20011008).

October 6 2001 - -> Romeu's RPMs added to the download page.
                 Fixed a pattern comparison bug.
                 Optimized memory copies at bmpcmp_skel.

October 5 2001 - Added ab_mem and test_ab_mem.
                 Added switches -l and -y to selthresh.pl.
                 Added -T command-line switch.
                 Made selthresh.pl more robust.

October 4 2001 - Purchased a Genius HR5 scanner. This scanner is supported
                 by the SANE UMAX backend.

October 3 2001 - -> Romeu Mantovani Jr contributed a clara.spec file (to
                 produce RPMs).

October 2 2001 - Fixed a problem at mk_page_output, reported by -> Laurent-jan.

October 1 2001 - Computed the thresholds for "Exclamations" (Therese of Avila)
                 using selthresh.pl.
                 Began reviewing the documentation to release 0.9.8.

September 28 2001 - Copied the development files to the net (20010928).

September 27 2001 - Tests using -> Emile's binarizer.

September 26 2001 - Uploaded the new results for the Candido de Figueiredo
                    Dictionary.
                    Various optimizations at bmpcmp_pd.

September 25 2001 - Fixed bug at cml.c (the release 20010924 is unable to
                    read dumps).
                    Adopted per-classifier minimum scores.
                    New feature: "re-scan all patterns" ("Edit" menu).
                    Tests using the pixel distance classifier to handle
                    symbols not classified using skeletons (good results,
                    but slow).

September 24 2001 - Copied the development files to the net (20010924).
                    New feature: "Show line (geometrical)" ("View" menu).
                    New feature: "Display boxes instead of symbols"
                    ("Options" menu).
                    As the "View" menu became too large, we've moved some
                    entries to the "Options" menu.

September 21 2001 - Made the word alignment tests more robust.

September 20 2001 - Tested -> Emile Snider's deskewer.
                    Added the pattern bitmaps to the Pattern (types)
                    tab (first step to create a manual baseline adjusting
                    tool).
                    Added (horizontal) support to CELLSPACING attribute
                    of TABLE elements.

September 19 2001 - New feature: "set pattern type" (Edit menu).
                    Fixed some rendering problems when
                    dismissing menus.
                    Changed the structure ptdesc.

September 18 2001 - Four buttons became read-only (alphabet, pattern
                    type, bold and italic).
                    The service enter_wait now supports mode 4 to
                    read a string.

September 17 2001 - Fixed the web interface. The section "how to
                    use the web interface" became more detailed
                    (thanks -> Erich Mueller).
                    Skeleton parameters are global again, but it's
                    still possible to use the per-pattern behaviour
                    (see the PATT_SKEL compilation macro).

September 15 2001 - Case study based on the recent tests available
                    at http://www.claraocr.org/cf-test/
                    Copied the development files to the net (20010915).

September 10-15 2001 - Tests using Candido de Figueiredo Dictionary,
                       4th edition. Various small fixes or
                       adjustments: pgmblock improved, display
                       the type 0 absent symbols, avoidance of common
                       false positives (the classification now must be
                       performed at least two times), alignment
                       problems diagnosed, etc.

September 8 2001 - Fixed: segfaults caused by changing properties of
                   untransliterated symbols (reported by -> Erich Mueller).
                   Copied the development files to the net (20010903).

September 7 2001 - Added "search unexpected mismatches" feature to
                   prepare_patterns and to the Options menu. This
                   is a tool to detect cases where the behaviour
                   of the classifier is bad.
                   Fixed a classifier bug. This bug
                   was producing occasional false positives.

September 6 2001 - -> Bruno Barbieri Gnecco will adapt pgmblock to
                   be used through libgocr. This is a first attempt
                   to make Clara OCR compatible with libgocr.

September 5 2001 - pgmblock, a simple text block locator
                   for PGM files, is working.

September 4 2001 - Added -a command line option.
                   selthresh.pl, a simple script for
                   selecting the best threshold when converting PGM
                   to PBM, is working.

September 3 2001 - -> Terran Melconian reported a bug on the Makefile
                   and contributed a bugfix.
                   Copied the development files to the net (20010903).

August 31 2001 - Added new skeleton heuristic (#6), based on
                 removing the border until only isolated
                 pixels remain.

August 30 2001 - Now it's possible to inhibit the usage of a
                 given classifier for some letters through the
                 pattern types form.
                 The reset service became partially operational.
                 Added block separation heuristic based on the
                 detection of vertical separation lines.
                 Fixed bad behaviour of from_asc when the
                 string field to read is absent.

August 28 2001 - -> Erich Mueller reported geometry problems on dumps.

August 27 2001 - Added new classifier, based on pixel distances.

August 24 2001 - Changed 'setmode' to 'setview' (thanks -> Bruce Momjian).
                 Tests using FreeBSD.

August 22 2001 - Finished pattern types form.

August 15 2001 - Changed geometry of "tune (skel)" window.

August 14 2001 - Added distance-based skeleton heuristic (#5).
                 Better version of skel_quality, based on distance from
                 the pixel to the border.
                 Skeleton parameters now are per-pattern.

August 13 2001 - Tested -> Adriano's bitmap.

August 10 2001 - First version of skel_quality, based on distance from
                 the pixel to the skeleton.

August 8 2001 - Service compare_patterns finished.
                Partially fixed the behaviour of "display comparisons".

August 7 2001 - Began implementation of skeleton auto-tune.
                Fixed a bug in bmpcmp_skel.

August 4 2001 - Manual adjustment of pattern types now works.

August 3 2001 - Started reworking pattern types.

August 2 2001 - -> Nathalie Vielmas told us about the needs of blind people.

July 26 2001 - -> Tim McNerney told us about the NIST OCR.

July 18 2001 - Added service dump_cb.
               Added detection and handling of C-x prefix.

July 16 2001 - Release 0.9.7 (first release announced at large).

July 4 2001 - Release 0.9.6.

June 22 2001 - Release 0.9.5.


** historic - recovered from my agenda **

April 2000 - Version 0.9.c available on the web.

November 11-12 1999 - Showed it to various friends.

September 19 1999 - First version of the Xlib interface.

September 18 1999 - Named it "clara".

February 15 1999 - Tests trying to write a web interface, running
                   as a CGI.

December 18 1998 - Tests trying to write an interface based on GTK.

November 9 1998 - First brute force tests, using manually-built
                  fonts.