#============================================================================
# Enca v1.21 (2025-10-20) guess and convert encoding of text files
# Copyright (C) 2000-2003 David Necas (Yeti) <yeti@physics.muni.cz>
# Copyright (C) 2009-2016 Michal Cihar <michal@cihar.com>
#============================================================================
List of user-visible changes in Enca
More detailed log can be obtained from older changelogs or git log.
Legend: + new feature
* change of behaviour (including disappearing of a feature)
- bugfix
enca-1.21
+ Add support for Finnish language with CP1267 charset.
+ Added --without-librecode to configure script
- Fixed --with-libiconv and --without-libiconv options
- Removed old travis ci link from readme.
- Update source path in enca.spec.in
- Fixed language tools build warnings and error.
enca-1.20
- fix crosscompilation issues
- fix documentation build
- fix compiler warnings
- fix librecode detection
- fix normalize.py input format
+ with --disable-gtk-doc docs are not installed
+ add support for testing on MSYS2 MINGW
- fixed normalize.pl input format.
+ add usage help to normalize.pl and countall
enca-1.19 2016-09-05
- fix possible memory leak
- make utf-8 detection work even on one character
enca-1.18 2016-01-07
- fix installation of devhelp documentation
enca-1.17 2016-01-04
- Fixed conversion of GB2312 encoding with iconv
- Fixed iconv conversion on OSX
- Documentation improvements
- Fixed execution of external converters with ACLs
- Improved test coverage to 80%
enca-1.16 2014-10-20
- Fixed typo in Belarusian language name
- Added aliases for Chinese and Yugoslavian languages
enca-1.15 2013-09-30
- Documentation improvement
- Development moved to GitHub
- Do not use deprecated autoconf macros
enca-1.14 2012-09-11
- Allow standard names for belarusian and slovenian languages, thanks
to Branislav Geržo for suggestion.
- Reset strictness when the check buffer is smaller than the file size, thanks to
Sam Liao.
- Fixed typos in man page, thanks to A. Costa.
enca-1.13 2010-02-09
- Reverse usage of temp file while converting using recode to prevent
file truncation (bug #1135).
enca-1.12 2009-10-29
- Fixes some minor memory leaks.
- Fixes little problems in autoconf scripts.
enca-1.11 2009-09-25
- Dropped scanf configure test which is not used at all.
- Fixes some wrong format strings.
enca-1.10 2009-08-25
+ Enca is back alive or at least in maintenance mode.
* Enca now lives in git repository, see <http://gitorious.org/enca>.
- Add missing charset koi8u to belarusian language.
- Fixed some typos in program and documentation.
enca-1.9 2005-12-18
+ support for HZ encoding
* Big5 and GBK detection improved
- enca.spec no longer installs docs to world-unreadable directory
enca-1.8 2005-11-24
+ Chinese (Big5 and GBK) support (thanks to Zuxy)
* deb/ subdirectory is gone as there is finally an Enca package in Debian
(thanks to Michal Cihar)
- manual page clean-up (thanks to Michal Cihar)
enca-1.7 2005-02-27
+ new name type: preferred MIME name (option -m)
- broken iconv detection on some systems was fixed
enca-1.6 2004-09-01
* English language names (--list=languages, enca_language_english_name())
were changed to lowercase to match common locale aliases
- Win32, i.e. MinGW and Cygwin, build problems were fixed
enca-1.5 2004-05-30
- crash on impossible recovery after iconv failure in pipe was fixed
- rpm building problems on Mandrake Linux were fixed
enca-1.4 2004-05-12
- dependency of guessing API on locales (via ctype functions) was fixed
- --help text generation failure on some systems was fixed
enca-1.3 2003-12-24
+ [libenca] it's possible to get analyser option values, not just set them
* a good BOM (byte order mark) increases the chance of being recognized for
UCS-4 and UTF-8 too
* external converter wrappers were moved from bin to libexec and the b-
prefix was removed (though it still works)
* external converters are no longer searched in PATH, nonstandard ones
have to be specified with full path
enca-1.2 2003-11-26
- fixed segfault in language detection for some locale setups
enca-1.1 2003-11-17
- fixed losing data at the end of file when using external converters in a
pipe (and maybe in other situations)
- [libenca] enca_analyser_free() not freeing analyser completely was fixed
enca-1.0 2003-11-06
* deprecated options -T, -R, -S, -u, -U, -m, and -M were finally removed
* default HTML API docs installation path changed to the new gtk-doc style
(DATADIR/gtk-doc/html/enca)
* debian/ subdir moved to deb/ to allow official deb creation w/o too much
hassle
enca-0.99.4 2003-07-15
- several race conditions in librecode and iconv interfaces were fixed
- temporary file names are much less predictable now
enca-0.99.3 2003-06-30
* Debian package is back from death
* failure to find external converter is now fatal
- fixed build problems on FreeBSD (and probably other Unices)
- libiconv is not used for `conversion to ASCII' since it never does the
Right Thing, whatever it is
- when conversion with libiconv fails, the file should now survive intact
- fixed build problems on systems w/o libiconv (hopefully)
- fixed distclean and uninstall targets to really clean and uninstall
everything
- fixed builds with separate source (read-only) and build directories
- fixed builds with --without-libiconv and --without-librecode on GNU/Linux
- external converter is not checked when it's not going to be used
enca-0.99.2 2003-06-25
+ EOL type is used to decide ambiguous cases, e.g. CP1250 is reported
instead of ISO-8859-2/CRLF
* --list languages by default prints English names, instead of ISO-639a
codes, use -e or -r to get the old listing
* if LC_CTYPE is something like en_US, more locale categories are examined
to detect the language
* cork charset was modified to contain \n, \r and \t in the same places as
ASCII
* some heuristics tuning
enca-0.99.1 2003-06-22
+ libenca pkg-config support
* all libenca tuning parameters (-T, -R, -S, -u, -U, -m, and -M) were
marked deprecated and are noop, Enca should DWIM
* ambiguity is now always OK when the sample has the same meaning in all the
charsets
* deprecated `built-in-encodings' and `encodings' lists were removed
* PAGER feature was removed
- exchanged `latvian' and `lithuanian' language names were fixed (`lv' and
`lt' were always OK)
- missing tests for the new languages were added to the test suite
enca-0.99.0 2003-06-14
+ added some support for: Bulgarian, Croatian, Estonian, Hungarian, Latvian,
Lithuanian, Slovene
+ a new algorithm for 8bit-dense languages (cyrillics), the old one is used
as a fallback
* removed support for non-transitive iconv (such a thing should not exist)
* auxiliary tools in data are no longer built in regular builds,
use --enable-maintainer-mode to rebuild them, create dists, etc.
- fixed iconv interface surface check pickier than iconv itself inhibiting
some otherwise possible conversions
- fixed u+x permissions on temporary files (from 0.10.7)
- fixed not deleting temporary files in iconv interface
- fixed broken iconv interface behaviour in pipes
- fixed iconvcap misdetecting Latin5 as ISO-8859-5
- fixed casual `make distclean' failures
enca-0.10.7 2003-01-28
- fixed interchanged iconv and cstocs encoding names
- corrected(?) librecode surface interaction
- fixed a temporary file creation race condition
* added tex and utf8 to cstocs (names and b-cstocs)
enca-0.10.6 2002-10-22
+ enconv uses DEFAULT_CHARSET variable, exactly as recode
- ENCAOPT works everywhere, albeit imperfectly
- options -P and -p no longer imply -M too
- ambiguous mode (-M) works again
- pager is run so that help text doesn't disappear
- standard input is printed as STDIN with -d, not as null
- make check works again
- it compiles without recode again
enca-0.10.5 2002-10-13
+ UTF-8 recognition in binary and otherwise messy files
+ detection of double-encoding from some 8bit charset to UTF-8
+ Cork encoding conversion
* librecode interaction was (hopefully) improved
- fixed some build-time problems
enca-0.10.4 2002-10-10
+ added Cork encoding support for Czech, Slovak and Polish
- empty files are now considered convertible to any encoding
- removed the so-called faster (in fact slower) I/O
- fixed some more compile-time search path issues
enca-0.10.3 2002-09-22
* added support for perl umap as external converter
- fixed external converter wrappers to work with standard sh
- fixed some compile-time library search path issues
enca-0.10.2 2002-09-15
+ target charset is automatically obtained from locales when called as
enconv, new options --guess, --auto-convert
+ English language names can be used instead of ISO-639 codes everywhere
- cs_SK and ru_UA locales are properly recognised as Slovak and Ukrainian
enca-0.10.1 2002-08-29
+ faster I/O
* external converters can be disabled at build time
- `-' is accepted for standard input
- fixed broken built-in converter
- fixed crashing on an unknown language
- trivial (identity) conversions are not performed any more
- help is now printed when input is a terminal and no argument specified
- changed braindamaged <STDIN>, <STDOUT> to STDIN, STDOUT in messages
- various small fixes and build-time improvements
enca-0.10.0 2002-08-26
+ added support for Ukrainian (CP1251, IBM855, ISO-8859-5, KOI8-U, maccyr,
CP1125), Belarusian (CP1251, IBM866, ISO-8859-5, KOI8-UNI, maccyr,
IBM855) and Polish (ISO-8859-2, ISO-8859-13, ISO-8859-16, Baltic, macce,
IBM852, CP1250)
+ Enca library introduced
* dropped native Debian package
* --details no longer prints guessing details (now is mostly like --human)
* --list=encodings, --list=built-in-encodings corrected to --list=charsets,
--list-built-in charsets (old names supported with a warning)
* improved Czech and Slovak charsets detection
enca-0.9.4: 2002-03-03
- built-in converter didn't convert more than first 64kB of a file
enca-0.9.3: 2001-07-16
+ a native Debian package
- fixed random reporting of nonsense results
- fixed self-contradictory --details output when file was quoted-printable
encoded
- fixed poor performance on non-GNU/Linux
- made pager less intrusive (instead of intrusive `less' ;-)
- --list=encodings prints only `known' encodings
- fixed several compile-time/portability problems
enca-0.9.2: 2001-07-13
* --help and --license are displayed through pager (when possible)
- fixed broken language hooks--they were never activated (from 0.9.1)
- fixed reporting ASCII when a 7bit encoding was detected
- fixed boundary-case behaviour when recovering from librecode failures
enca-0.9.1: 2001-06-25
+ support for Macintosh Cyrillic, including conversion
+ support for unusual UCS-4 byte orders (3412 and 2143)
+ new option --license printing full enca license
* exit codes now make sense (0, 1, 2; where 2 means serious troubles)
- temporary files are no longer world-readable
enca-0.9.0: 2001-03-26
Serious incompatibilities:
* -E and -C option letters exchanged (much better mnemonics)
* converter wrappers renamed to b-cstocs and b-recode
* finding only 7bit ASCII is no longer considered failure
* need to use --language to set language (sometimes)
* dull converter behaviour no longer supported, -x syntax changed
* option -g removed (try --name=aliases)
* option -c changed to --list=converters, listing format changed
* option -l changed to --list=encodings, listing format changed
* converter names are no longer case insensitive
* no longer uses cstocs names as canonical
* external converters are called with Enca's names, not cstocs's
Other changes:
+ support for slovak and russian (and `none') language
+ support for CP1251, IBM866, ISO-8859-5 and KOI8-R, including conversion
+ UCS-2, UCS-4, UTF-8, UTF-7 and LaTeX encoding recognition
+ many more encoding aliases accepted
+ long `GNU style' command line options
+ new output types: --enca-name, --iconv-name
+ output type --name=WORD allowing to select output type by name
+ ENCAOPT environment variable
+ language detection from locales
+ support for surfaces (experimental)
+ new option --list printing various listings
+ new converter wrapper b-map (for perl `map')
+ new option -m to reset -M back
+ new language filters
+ new options -u and -U to control multibyte encoding checks
+ included [generated] enca.spec into the tarball to allow `rpm -tb'
* -d output improved
* read limit changed to 16MB
* librecode now run with flags diacritics_only and ascii_graphics
- fixed broken -P options
- fixed several build problems on non-GNU/Linux systems
- fixed some missing and wrong characters in Unicode data
- temporary copy of damaged original file is not deleted when rescue fails
enca-0.8.x: Since features planned for 0.8 and 0.9 happened to be developed
simultaneously, this version number has been skipped.
enca-0.7.7: 2001-01-01
+ ability to use UNIX98 iconv conversion functions
+ the word `none' can be used as -E parameter causing clearing of converter
list
- fixed disarranged help text, misspelled word `European' in macce long
name, obsolete statements in manual page and other stuff of this kind
enca-0.7.6: 2000-11-20
+ any converter combination/order can be now specified with -E, old -E
meaning is no longer valid
+ new option -c (list all valid converter names)
* cork encoding not supported anymore
* better verbosity
* `/' is added to recode recoding requests thus partially solving the
surface problem---surface never changes
* some errors like specifying invalid value of threshold are no longer fatal,
the bad values are ignored instead
* handling of some exotic characters in built-in converter slightly changed
- fixed several fatal bugs regarding stdin to stdout conversion
- stdin is copied to stdout in case of failure whenever possible/applicable
enca-0.7.5: 2000-10-25
* license changed to GNU GPL Version 2 (i.e. license version is explicitly
specified)
* prints error message when conversion is impossible
* binary data filter improved/changed
- falls back to external converter when GNU recode library cannot convert
due to erroneous request
- '' no longer causes enca to read from stdin
- tries to restore files damaged by GNU recode library
enca-0.7.4: 2000-10-12
+ box-drawing characters are (carefully) filtered out when guessing
- fixed intermixed behaviour in SMS/nonSMS modes
enca-0.7.3: 2000-10-09
+ blocks of probably binary data are filtered out when guessing
* standard input is copied to standard output when its encoding is unknown
- fixed reading only 4096 bytes from pipe (from 0.7.1)
enca-0.7.2: has never been released
+ GNU recode recoding chains made possible by starting -x (convert) parameter
with `..'
+ second best guess is marked with `-' in -d (print details) output
enca-0.7.1: 2000-10-02
* in case of nonfatal i/o failure enca continues processing remaining files
enca-0.7.0: 2000-09-26
+ standard input to standard output conversion
+ short message mode -M
+ ability to use GNU recode library
+ new output type -r (encoding name after RFC1345)
+ ability to convert cork internally
+ new external converter brecode (recode wrapper)
+ new output type -g (list of aliases)
+ new option -V (verbose)
* -x (convert) parameter syntax changed to in_enc..out_enc (old syntax still
supported, will be removed in 0.8.x)
* option -e (disable external) no longer supported, empty string as -C
(external converter) parameter can be used instead
* encoding names specified as -x (convert) parameters are case insensitive
* ascii is not considered unknown encoding (i.e. failure) so enca returns 0
* -d (print details) output improved/changed/updated
* -p (prefix result with file name) no longer prints conversion details
* by default result is prefixed by file name when enca is run on more than
one file
enca-0.6.2: 2000-08-17
+ help texts (-h and -v) made usable (thanx to Halef)
enca-0.6.1: 2000-08-15
- tarball bugfix
enca-0.6.0: 2000-07-20
+ built-in converter
+ -x (convert) can now take form -x in_enc,out_enc causing enca to behave
like a dull converter
+ new options -e and -E (disable internal/external converter)
+ new option -l (print internally-convertible encodings)
enca-0.5.0: 2000-07-17
* -p (prefix result with file name) causes enca to print what is converted
and how
* iso8859-2/cp1250 recognition improved
- doesn't spawn external converters as fast as is possible, but waits for
them to return
- fixed `Unrecognized encoding' when winner is 1250 (from 0.4.3)
- corrected -d (print details) table alignment
enca-0.4.3: 2000-07-14
* -d (print details) prints encodings alphabetically sorted
- corrected short encoding name t1 -> cork
- division-by-zero bugfixes
enca-0.4.2: has never been released
* options -m/-M ([don't] use iso8859-2/cp1250 hack) no longer supported
- fixed showing standard input as empty string (<STDIN> is printed now)
enca-0.4.1: 2000-07-12
* default of 60 significant characters changed to 10
enca-0.4.0: 2000-07-10
+ first public release