1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517
|
2.21.1 2025-01-27
[Maarten van Gompel]
* Fix for segfault on edge cases without /etc/services
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1094197)
2.21 2024-12-16
[Ko van der Sloot]
* needs latest ticcutils
* code rework and cleanup
- better introspection (isinstance() and isSubClass())
- select() functions
- type consistency. (e.g. all Enum's are Classes)
* improved debugging possibilities
* introduced an AbstractWord als base for (Hidden-)Word
* added more sanity tests
* more strictly forbid empty attributes in FoLiA
* improved GitHub CI builds
2.20 2024-09-12
[Ko van der Sloot]
* require C++17 now
* small refactorings
* cleanup in GitHub CI file
2.19 2024-05-27
[Ko van der Sloot]
* again bumped the .so version, as we break the ABI
* Refactored the Class hierarchy for clearer code
- introducing an AbstracFeature class as base for all derived Features
* improved exception and error handling, including line numbers in messages
when possible
* added code to detect mismatch between annotators and processors (were not
detected until now)
* cleaner en better C++ code, const correctness and such...
2.18 2024-04-26
[Ko van der Sloot]
* buming the .so version due to ABI breaks
* fix for --canonical option in folialint
* fix for https://github.com/LanguageMachines/libfolia/issues/56
* improved checking for emtpy <t> nodes
* several code improvements. const correctness etc.
* fix for https://github.com/LanguageMachines/libfolia/issues/55
* better check for illegal Correction's:
https://github.com/proycon/folia/issues/77
* added Doxygen config
* better handling of XML comment nodes
2.17 2023-10-21
[Ko van der Sloot]
* assume ticcutils >= 0.34 to force NFC normalization
* refactored str() and unicode() text extraction functions.
* a lot of work on code quality
2.16 2023-09-21
[Ko van der Sloot]
* fix for https://github.com/LanguageMachines/libfolia/issues/54
* Clearer error messages (adding filename, if present)
* some code cleaned/clarified
* added code to parse, store and output XML's Processing Instruction nodes
* added Etymology annotation
2.15 2023-05-08
[Ko van der Sloot]
* fixed a terrible typo/bug in subclasses.cxx:
using el-referable(), where el->referable() was meant
* plugged a small potential memory-leak
* fixed some offset problems in text handling
* fixed https://github.com/LanguageMachines/libfolia/issues/52
* foliadiff script now returns better message on failure
* switching to C++14
* code polishing
* updated GitHub action
2.14 2023-02-20
[Ko van der Sloot]
* implemented an ADD_FORMATTING TextPolicy to extract otherwise hidden
text from <t-hspace> and <t-hyph> Markup elements
(see https://github.com/proycon/foliapy/issues/25)
* fix for https://github.com/LanguageMachines/libfolia/issues/51
* General improvements:
- include a filename when throwing during Document processing
- added setutest() member for XmlText class
- general code quality
2.13 2023-01-23
[Ko van der Sloot]
* removed dependency on libtar
* quick fix for ignoring text inside <t-hbr> https://github.com/proycon/foliapy/issues/25
[Maarten van Gompel]
* updated minimum required libxml2 version
2.12 2023-01-02
[Ko van der Sloot]
* fix for https://github.com/LanguageMachines/libfolia/issues/49
* ABI breached, so bunped the .so file version
* cleaner C++ code, more C++11 now, removing CppCheck warnings
* using more recent TiccUtils (for enum_flags.h)
* several small improvements
* improved GitHub action
2.11 2022-07-22
[Ko vd Sloot]
* Significant refactoring, code cleaning, code reduction, and extra comments
* fixed memory leaks in the test (and also tests destroy() function now)
* Added some safeguards against multiple setnames for text_annotation. This is a limitation discussed in https://github.com/proycon/folia/issues/104
* added code to handle text extraction for "empty" rows.
* implemented a fix for empty cell's. https://github.com/proycon/foliatools/issues/41
* added a fix for text offsets in embedded elements in a structure that may NOT carry text itself. Like cell inside a table.
[Maarten van Gompel]
* codemeta.json: updated metadata according to (proposed) CLARIAH requirements
2.10 2021-12-15
[Ko vd Sloot]
* several code improvements, suggested by CPPcheck and scan-build
* start using TextPolicy::debug
[Maarten van Gompel]
* impemented implicitspace logic for whitespace issue proycon/folia#101
2.9 2021-07-12
[Ko vd Sloot]
* Reworked the FoliaElement class hierarchy. Much clearer now
* re-arranged file structure. Separating some files into smaller files
* text extraction:
- numereous changes and additions to handle spaces better.
- refactored the code, using a new TextPolicy class for clarity
- added code for handling 'tag' attributes using callbacks
* improved handling of Correction
* numerous code refactorings for clearity and speed
* adapted and improved documentation
2.8.1 2021-04-07
[Ko vd Sloot]
* re-added the ltrim() function for backward compatibility
2.8 2021-04-07
* implements FoLia v2.5, with a new 'model' for whitespaces in texts.
* bumped the .so version to 17
[Maarten van Gompel]
* added <t-lang> TextMarkup tag
* added <t-hspace> TextMarkup tag
* added tag attribute
* fix for proycon/folia#88, proycon/folia#92, proycon/folia#93, proycon/folia#94
* added text normalization functions to support the new text model,
maintaining backward compatibility.
[Ko vd Sloot]
* parse and preserve the xml:space attribute.
* added a 'space' normalizer. ALL exotic spaces (like em-space and en-space)
are replaced by the standard ascii space
* fixed https://github.com/LanguageMachines/libfolia/issues/48
* code cleanup/refactoring
* ditch TravisCI and implemented a GitHub action
2.7 2021-01-07
* implemented a more relaxed MetaData scheme, allowing mixing 'foreign'
and 'native' MetaData
* bumped the .so version to 15
* features may be present in <t> and <t-*> nodes now
2.6.1 2020-12-11
[Maarten van Gompel]
* Updated for FoLiA v2.4.1: strip leading/trailing whitespace in text content (proycon/folia#88)
[Ko vd Sloot]
* Fixed problem with text-conststency errors for <t-str> within <t>
2.6.0 2020-11-16
[Maarten van Gompel]
* Updated for FoLiA v2.4
* Revised external implementation
* Implemented Modality annotation
[Ko vd Sloot]
* cleanup and extra sanity tests
* Implemented an 'explicit' mode for Document (FoLiA v2.3) and in folialint
2.5.1 2020-09-15
[Maarten van Gompel]
* Bugfix: Fixed handling of control characters, strip control characters by default
[Ko vd Sloot]
* fix in date handling (lookup table for month -> integer conversion )
* minor refactoring
* some documentation
2.5 2020-09-02
[Maarten van Gompel]
* Adapted to FoLiA v2.3
* Support parsing of the new explicit form
[Ko vd Sloot]
* folialint: updated usage() and man page
* minor refactoring
2.4 2020-04-15
[Ko van der Sloot]
* comment in Doxygen format added
* bumped the library version to 14
* fix for https://github.com/proycon/folia/issues/82
* fix for https://github.com/proycon/folia/issues/42
* fixed problem with using new tag names on pre 1.6 documents
* better checks in folia_engine on text inconsistencies and such
(https://github.com/LanguageMachines/libfolia/issues/43)
* confidence output is more consistent now
* removed the folia_builder (was not used)
* code refactorings and cleanup, removing unused functions
2.3.2 2020-01-13
[Ko van der Sloot]
Bug fix release
* fix for https://github.com/LanguageMachines/foliautils/issues/37
* fix for https://github.com/LanguageMachines/foliautils/issues/38
* fixes in Correction handling.
* fixed a Multi-Threading problem with the static reverse_old map
2.3.1 2019-10-21
[Maarten van Gompel]
* Bug fix release for gcc 9.1
It stumbles upon some inline functions
[Ko van der Sloot]
* replaced call to unsafe 'tmpnam()' by 'TiCC::tempname()'
2.3 2019-08-29
[Ko vd Sloot]
new features:
* autodeclare mode introduced (as in FoLiApy)
* folialint by default doesn't autodeclare. use -a or --autodeclare to use it
* Better detecting of declaration errors in general
* the select function now also enables the possibility to search recursively
upto the first matching sibling.
* bumped library version to 13
other changes:
* some exceptions are changed.
* less exceptions are thrown. An empty result is returned instead.
* folialint now accept bote -d and --debug
* real fix for issue 70
* small bug fixes and refactorings
* accept empty Comment and Description nodes
2.2.1 2019-07-22
[Ko vd Sloot]
Bug fix release:
* There were some problems handling NO setname vs. EMPTY setname, during
incremental parsing in folia::Engine. This was sorted out now:
https://github.com/proycon/folia/issues/74
This related to some ucto and frog issues too:
https://github.com/LanguageMachines/ucto/issues/70
https://github.com/LanguageMachines/frog/issues/72
2.2 2019-07-15
[Ko vd Sloot]
Bug fix release.
* Folia::Engine choked on some complex FoLiA. Solved by refactoring and in fact
simplifying some code. (Frog issue #77 revealed this)
* added flush() on document output to streams. (frog issue #72)
* improved output in debugging mode
2.1 2019-06-19
[Ko vd Sloot]
Bug fixes and enhancements:
* provenance:
- added 'generate_id' attribute with 'auto()' and 'next()' values
- some code improvements
* bugs:
- When using the FoLiA-engine, we have to save the ORIGINAL annotationdefaults,
and use these when parsing.
2.0 2019-05-22
[Ko vd Sloot]
Major release.
* Supports the new FoLiA 2.0 features:
- provenance support
- more stricter checking on annotation declarations
- added the new TextMarkupReference class
- supports Hidden Words.
- All structure elements can have the 'space' attribute
- support for groupannotations
- many more.
* API and ABI breaches:
- libray version bumped to version 10
- many functions are renamed
- the text() functions have an ENUM parameter now to select for STRICT,
RETAIN or HIDDEN
* bug fixes
- support for xlink: improved
- there as a rare mixup between <text> nodes and <text-annotation> nodes in
the folia::Engine
- all <wref> nodes get a 't' attribute now on serializing.
- reading Extrenal Folia could getint an endless loop
* code refactoring and cleanup
1.16 2019-05-15
[Ko van der Sloot]
stabilizing release for folia1.5. Next release wil support the new FoLiA 2.0
Changes:
* renamed folia::Processor to folia::Engine
* extended and improved Engine code a lot
* avoid spurious newline on Document output
* Will read and ignore some FoLiA 2.0 additions
* numerous small additions and fixes
* make sure that the XmlParser uses the HUGE model everywhere
1.15 2018-11-29
[Ko van der Sloot]
* added (still experimental) code for a FoLiA Builder, Processor and
TextProcessor class.
Use with care. The API may change unannounced!
* a foliadiff script (using folialint) is installed now
* several refactorings, to make the code more clear.
* the 'ref' attribute was not serialized for TextContent
* several smaller small bug fixes
* the .so version is bumped to 9 because of a lot of API/ABI changes
1.14 never released
1.13 2018-05-16
[Ko van der Sloot]
* disabled WordRefrence test. It was incomplete, and hard to do
* use icu:: namespace
[Maarten van Gompel]
* added codemeta.json
* fix spelling errors in error messages
1.12 2018-02-19
* configuration cleanup. MacOSX is better supported now.
* folialint now supports --fixtext (handle with care!)
* library version bumped to 8.0, due to changes in the API
* regenerated FoLiA properties (to FoLiA version 1.5.1)
* several small bug fixes
1.11 2017-12-04
Bug fix release:
* handling of <comment> tags within <t> nodes
* better handling of <wref> tags. Forbid forward references
1.10.1 2017-11-06
Minor fix
* bumped the .so version to 7.0
1.10 2017-10-17
Major Release, implementing FoLiA spec 1.5
* added text checking for all 1.5 documents and up
* added offset and ref checking for Text in all 1.5 documents and up
* 'empty' text inside TextContent, PhonContent and Textmarkup is significamt
* better version checking
* text checking can be dis/enabled using FOLIA_TEXT_CHECK environment variable
* added submetadata mechanism
* implemented aliasses for annotation setnames
* added an xmlstring() serializer for Document
* bug fixes:
- in LineBreak serializing
- XmlComment is textless.
- miscelaneous small fixes
1.9 2017-08-30
Bug fix release
* accept ICU 50 too (was 52) to make CentOS happy
* XmlComment INSIDE <t> lead to crashes. fixed.
* code changes in code that is only executed for documents in folia 1.5 format
(that shouldn't exist in the wild)
1.8 2017-00-16
Implements FoLiA spec 1.4.3
* adding textclass attribute
API changed. Bumped library version to 6.2.0
[Ko van der Sloot]
* added experimental textchecking code. only working for FoLiA documents
according to spec 1.5. NOT RELEASED YET!
Work in Progress
* fix in generate_id. AUTO_GENERATE_ID property was ignored.
* numereous bug fixes
1.7 2017-04-04
API changed so bumped library version to 6.1.0
[Ko van der Sloot]
* textcontent() and phoncontent() return const pointers, and also
work for TexContent adn PhonContent elements now
* some reactoring, as suggested by CPPCHECK
* typos
* added dangerous functions to manipulate the class of a TextContent
* added reference countion on annotations.
This allows to remove unneeded declarations.
* small bug fixes:
- str() should never throw.
- avoid memory leak
[maarten van Gompel]
* fixes in folia_properties for FoliA spec 1.4.1
1.6 2017-01-05
* We now implement FoLiA spec 1.4
* ABI breakage. .so name bumped to 6.0.0
reason:
- new properties added
- implementation of generateId() is changed
* enhancements to folialint. Saving a document with --strip also
implies canonical output
* some bug fixes
1.5 2016-11-14
[Ko van der Sloot]
* Bumped the .so name. Should have been done in 1.4!
* addition: text() mebmer for document-
* minor bug fixes:
- isNCname test now conforms to XML definition
- improved am error messag in Document
- check empty attributes in Feature() construction
1.4 2016-10-18
[Ko van der Sloot]
* Now fully implements Folia spec 1.3.2
- multiple ForeignData nodes
- added more Feature nodes, like Polarity, Strenght
- Source, Target, Relation, Predicate, Sentiment Statement,
Observation Annotations and Layers.
- Comment node
- better version checking.(and a bit relaxed too)
* some bug fixes and code improvement.
- str() works more as expected
- fixup ref 'id' vs. 'xml:id'
- improved sanity check to better test errors in the specs.
* added language getter and setter.
1.3 2016-07-11
[Ko van der Sloot]
Maintenance release:
- long options --help aan --version added
- fix in LineBreak: text() generates a newline
1.2 2016-05-23
[Ko van der Sloot]
* now based on Folia spec 1.2
- ForeignData node added
- Foreign metadata in document
- relaxed aref and ref type, implementing full xlink syntax for
'simple' and 'locator' type.
- Linebreak nows supports 'linenr', 'pagenr' and 'newpage' attributes
- enhanced folialint:
- added a '--strip' option to strip 'instable'
information like dates and version numbers.
- added '--output' option to speciy an file
- added '--nooutput'option to suppres output
- document outputs annotations in same order as read in.
- we no longer output the set for AnnotationLayers
- alien atributes are totally ignored now.
- small bug fixes
1.0 2016-03-09
[Ko van der Sloot]
* totally reworked the implementation:
it is based on code generated from generic definitions that
assure that the Python and C++ versions are always inline
(Most of that work done by Maarten van Gompel)
* simplified and overhauld the class API.
* a lot of bug fixes for cornercases
0.13 2016-01-14
[Ko van der Sloot]
* repository moved to GitHub
* added Travis support
* much smaller memory footprint
* Document deletion was very slow due to a brainfarth
* no more initialization problems in MT cases
* use XML_PARSE_HUGE to be able to handle VERY LARGE documents
* text() functions in line with the Python version
* added new "tags" to keep in sync with the Python version
* a lot of small code updates and refactorings
0.12 2014-09-23
[Ko van der Sloot]
* release 0.12
* library version bumped to 3.0.0 because ABI and API have heavily changed
0.11 2014-09-16
* now implements nodes like: Ref, Note, External
* Correction may be added to a lot more nodes
* several bug fixes
* better MT safe
* major API and ABI changes: Added a true virtual base,
using virtual inheritance.
0.10
* some XML stuff is moved to ticcutils
* also use some other goodies form ticcutils
* now implements new nodes like Metric, Coreferences and Semroles
(following the FoLiA 0.9 specs)
* a lot of code improvement too, including some bug fixes
0.9
* lost in tranistion
0.8 2012-03-29
* reworked and improved handling of (default) annotation sets.
We are more strict now.
* some refactoring to get a more uniform handling of folia::classes
0.7 2012-02-13
* some bugs fixed in annotation declaration handling
* added GAP annotation to gap
0.6 2012-01-09
* fixed a problem with escaping in arguments, "att='\\'" failed
0.5 2011-12-21
* 0.4 is lost in space
* rather extensive rework. API and ABI changed.
More to do, but releasing now is desirable
0.3 2011-11-02
[Ko van der Sloot]
* Get ready for first (beta) release as a package
|