# C Left-Right Parser (libcleri)
Language parser for the C/C++ programming language. Initially created for [SiriDB](https://github.com/SiriDB/siridb-server).
---------------------------------------
* [Installation](#installation)
* [Related projects](#related-projects)
* [Quick usage](#quick-usage)
* [API](#api)
  * [cleri_t](#cleri_t)
  * [cleri_grammar_t](#cleri_grammar_t)
  * [cleri_parse_t](#cleri_parse_t)
  * [cleri_node_t](#cleri_node_t)
  * [cleri_children_t](#cleri_children_t)
  * [cleri_olist_t](#cleri_olist_t)
* [Elements](#elements)
  * [cleri_keyword_t](#cleri_keyword_t)
  * [cleri_regex_t](#cleri_regex_t)
  * [cleri_choice_t](#cleri_choice_t)
  * [cleri_sequence_t](#cleri_sequence_t)
  * [cleri_optional_t](#cleri_optional_t)
  * [cleri_prio_t](#cleri_prio_t)
  * [cleri_repeat_t](#cleri_repeat_t)
  * [cleri_list_t](#cleri_list_t)
  * [cleri_token_t](#cleri_token_t)
  * [cleri_tokens_t](#cleri_tokens_t)
  * [Forward reference](#forward-reference)
  * [cleri_dup_t](#cleri_dup_t)
* [Miscellaneous functions](#miscellaneous-functions)
---------------------------------------
## Installation
>Note: libcleri requires [pcre2](http://www.pcre.org/)
>
>On Ubuntu:
>
>`sudo apt install libpcre2-dev`
>
>On macOS:
>
>`brew install pcre2`
>
Install the release version.
```
$ cd Release
```
Compile libcleri
>Note: On macOS you might need to set environment variables:
>
>`export CFLAGS="-I/usr/local/include" && export LDFLAGS="-L/usr/local/lib"`
>
```
$ make all
```
Install libcleri
```
$ sudo make install
```
> Note: run `sudo make uninstall` for removal.
## Related projects
- [pyleri](https://github.com/transceptor-technology/pyleri): Python parser (can export grammar to pyleri, libcleri, goleri and jsleri)
- [jsleri](https://github.com/transceptor-technology/jsleri): JavaScript parser
- [goleri](https://github.com/transceptor-technology/goleri): Go parser
## Quick usage
>The recommended way to create a grammar is to use [pyleri](https://github.com/transceptor-technology/pyleri) for
>writing the grammar and then export the grammar to libcleri or other languages.
This is a simple example using libcleri:
```c
#include <stdio.h>
#include <cleri/cleri.h>

void test_str(cleri_grammar_t * grammar, const char * str)
{
    cleri_parse_t * pr = cleri_parse(grammar, str);
    printf("Test string '%s': %s\n", str, pr->is_valid ? "true" : "false");
    cleri_parse_free(pr);
}

int main(void)
{
    /* define grammar */
    cleri_t * k_hi = cleri_keyword(0, "hi", 0);
    cleri_t * r_name = cleri_regex(0, "^(?:\"(?:[^\"]*)\")+");
    cleri_t * start = cleri_sequence(0, 2, k_hi, r_name);

    /* compile grammar */
    cleri_grammar_t * my_grammar = cleri_grammar(start, NULL);

    /* test some strings */
    test_str(my_grammar, "hi \"Iris\"");  // true
    test_str(my_grammar, "bye \"Iris\""); // false

    /* cleanup grammar */
    cleri_grammar_free(my_grammar);

    return 0;
}
```
Although libcleri is written for C, it can be used with C++ too:
```c++
#include <iostream>
#include <cleri/cleri.h>

void test_str(cleri_grammar_t * grammar, const char * str)
{
    cleri_parse_t * pr = cleri_parse(grammar, str);
    std::cout << "Test string " << str << ": " <<
        (pr->is_valid ? "true" : "false") << std::endl;
    cleri_parse_free(pr);
}

int main()
{
    /* define grammar */
    cleri_t * k_hi = cleri_keyword(0, "hi", 0);
    cleri_t * r_name = cleri_regex(0, "^(?:\"(?:[^\"]*)\")+");
    cleri_t * start = cleri_sequence(0, 2, k_hi, r_name);

    /* compile grammar */
    cleri_grammar_t * my_grammar = cleri_grammar(start, NULL);

    /* test some strings */
    test_str(my_grammar, "hi \"Iris\"");  // true
    test_str(my_grammar, "bye \"Iris\""); // false

    /* cleanup grammar */
    cleri_grammar_free(my_grammar);

    return 0;
}
```
## API
### `cleri_t`
Cleri type is the base object for each element.
*Public members*
- `uint32_t gid`: Global Identifier for the element. The GID is not required and
as a rule it should be set to 0 if not used. You can use the GID for identifying
an element in a parse result. When exporting a Pyleri grammar, each *named* element
automatically gets a unique GID assigned. (readonly)
- `cleri_tp tp`: Type for the cleri object. (readonly)
    - `CLERI_TP_SEQUENCE`
    - `CLERI_TP_OPTIONAL`
    - `CLERI_TP_CHOICE`
    - `CLERI_TP_LIST`
    - `CLERI_TP_REPEAT`
    - `CLERI_TP_PRIO`
    - `CLERI_TP_RULE`
    - `CLERI_TP_THIS`
    - `CLERI_TP_KEYWORD`
    - `CLERI_TP_TOKEN`
    - `CLERI_TP_TOKENS`
    - `CLERI_TP_REGEX`
    - `CLERI_TP_END_OF_STATEMENT`
- `cleri_via_t via`: Element. (readonly)
    - `cleri_sequence_t * sequence`
    - `cleri_optional_t * optional`
    - `cleri_choice_t * choice`
    - `cleri_list_t * list`
    - `cleri_repeat_t * repeat`
    - `cleri_prio_t * prio`
    - `cleri_rule_t * rule`
    - `cleri_keyword_t * keyword`
    - `cleri_regex_t * regex`
    - `cleri_token_t * token`
    - `cleri_tokens_t * tokens`
    - `void * dummy` (place holder, this, eof)
#### `cleri_t * cleri_new(uint32_t gid, cleri_tp tp, cleri_free_object_t free_object, cleri_parse_object_t parse_object)`
Create and return a new cleri object. A unique gid is not required but can help
you identify the element in a [parse result](#cleri_parse_t). As a rule
you should assign 0 in case no specific gid is required. This function should only
be used in case you want to create your own custom element.
#### `void cleri_incref(cleri_t * cl_object)`
Increment the reference counter for a cleri object. Should only be used in case you
want to write your own custom element.
#### `void cleri_decref(cleri_t * cl_object)`
Decrement the reference counter for a cleri object. If no references are left the
object will be destroyed. Do not use this function after the element has
successfully been added to another element or grammar. Should only be used in
case you want to write your own custom element.
#### `int cleri_free(cleri_t * cl_object)`
Decrement reference counter for a cleri object. When there are no more references
left the object will be destroyed. Use this function to cleanup after errors
have occurred. Do not use this function after the element has successfully been
added to another element or grammar.
Example strict error handling:
```c
cleri_grammar_t * compile_grammar(void)
{
    cleri_t * k_hello = cleri_keyword(0, "hello", 0);
    if (k_hello == NULL) {
        return NULL;
    }
    cleri_t * k_world = cleri_keyword(0, "world", 0);
    if (k_world == NULL) {
        cleri_free(k_hello);  // must cleanup k_hello
        return NULL;
    }
    cleri_t * hello_world = cleri_sequence(0, 2, k_hello, k_world);
    if (hello_world == NULL) {
        cleri_free(k_hello);
        cleri_free(k_world);
        return NULL;
    }
    cleri_t * opt = cleri_optional(0, hello_world);
    if (opt == NULL) {
        /* we now must only cleanup hello_world since this sequence will
         * cleanup both keywords too. */
        cleri_free(hello_world);
        return NULL;
    }
    cleri_grammar_t * grammar = cleri_grammar(opt, NULL);
    if (grammar == NULL) {
        cleri_free(opt);
    }
    /* when your program has finished, the grammar including all elements can
     * be destroyed using cleri_grammar_free() */
    return grammar;
}
```
>Note: Usually a grammar is only compiled at the startup of your program, so
>memory allocation errors during grammar creation are unlikely to occur.
>If NULL is passed as an argument instead of an element, then the function
>to which the argument is passed will return NULL. Following this
>chain, the final grammar returns NULL in case an error has occurred somewhere.
>In this case you should usually abort the program.
### `cleri_grammar_t`
Compiled libcleri grammar.
*No public members*
#### `cleri_grammar_t * cleri_grammar(cleri_t * start, const char * re_keywords)`
Create and return a compiled grammar. Argument `start` must be the entry element
for the grammar. Argument `re_keywords` should be a regular expression starting
with character `^` for matching keywords in a grammar. When a grammar is created,
each defined [keyword](#cleri_keyword_t) should match this regular expression.
`re_keywords` is allowed to be `NULL` in which case the default
`CLERI_DEFAULT_RE_KEYWORDS` is used.
#### `void cleri_grammar_free(cleri_grammar_t * grammar)`
Cleanup grammar. This will also destroy all elements which are used by the
grammar. Make sure all parse results are destroyed before destroying the grammar
because a [parse result](#cleri_parse_t) depends on elements from the grammar.
### `cleri_parse_t`
Parse result containing the parse tree and other information about the parse
result.
*Public members*
- `int cleri_parse_t.is_valid`: Boolean. Value is 1 (TRUE) in case the parse string is valid or 0 (FALSE) if not. (readonly)
- `size_t cleri_parse_t.pos`: Position in the string up to which the string was successfully parsed. This value is equal to the length of the string in case `cleri_parse_t.is_valid` is TRUE. (readonly)
- `const char * cleri_parse_t.str`: Pointer to the provided string. (readonly)
- `cleri_node_t * tree`: Parse tree. Even when `is_valid` is false the parse tree is returned, but it will only contain results as far as parsing has succeeded. The tree is the root node which can include several `children` nodes. The structure is further clarified by an example that shows a way of visualizing the parse tree. This example can be found in the "examples/tree_and_expect/tree" folder. Run this code and it will output a parse tree in JSON format. (see also [cleri_node_t](#cleri_node_t) and [cleri_children_t](#cleri_children_t)) (readonly)
- `const cleri_olist_t * expect`: Linked list of the elements which are possible at position `cleri_parse_t.pos` in `cleri_parse_t.str`. Even if `is_valid` is true there might be elements in this list, for example when an `Optional()` element could be added to the string. Expecting is useful if you want to implement things like auto-completion, syntax error handling, auto-syntax-correction etc. An example of this can be found in the "examples/tree_and_expect/expect" folder. (see [cleri_olist_t](#cleri_olist_t) for more information)
#### `cleri_parse_t * cleri_parse(cleri_grammar_t * grammar, const char * str)`
Create and return a parse result. The parse result contains pointers to the
provided string (`str`) so make sure the string is available while using the
parse result.
#### `void cleri_parse_free(cleri_parse_t * pr)`
Cleanup a parse result.
#### `void cleri_parse_expect_start(cleri_parse_t * pr)`
Can be used to reset the expect list to start. Usually you are not required to
use this function since the expect list is already at the start position.
#### `int cleri_parse_strn(char * s, size_t n, cleri_parse_t * pr, cleri_translate_t * translate)`
Can be used to generate a textual parse result. The first argument `s` should be able to hold
the complete message, and writing is restricted by `n`. The return value is the number of characters which
are (or would be) written to `s`, excluding the terminator char. This behavior is similar to functions like `snprintf`.
One could for example use `NULL` for `s` with `n` equal to `0` to get the size which is required. Then you could
`malloc` the size plus one for the terminator and call the function again. A negative value indicates an error.
Argument `pr` should be a parse result or `NULL` and `translate` a translation function or `NULL`.
Example:
```c
// Option 1: in case a translation function returns an empty string, no text is used
const char * translate_quiet(cleri_t * o) {
    return "";  // a possible result might be: `error at position x`
}

// Option 2: text may be returned based on gid
const char * translate_by_gid(cleri_t * o) {
    switch (o->gid) {
    case 1: return "A";  // error at position x, expecting: A
    case 2: return "";   // gid 2 will be ignored
    }
    return NULL;  // normal parsing for everything else
}
```
### `cleri_node_t`
Node object. A parse result has a parse tree which consists of nodes. Each node
may have children.
*Public members*
- `const char * cleri_node_t.str`: Pointer to the position in the parse string where this node starts. (readonly)
- `size_t cleri_node_t.len`: Length of the string which is applicable for this node. (readonly)
- `cleri_t * cleri_node_t.cl_obj`: Element from the grammar which matches this node. Note that the `cl_obj` is `NULL` for the root node and the first can be found in its children. (readonly)
- `cleri_children_t * cleri_node_t.children`: Optional children for this node. (readonly)
#### `bool cleri_node_has_children(cleri_node_t * node)`
Macro function for checking if a node has children.
### `cleri_children_t`
Children from a node in a linked list.
*Public members*
- `cleri_node_t * cleri_children_t.node`: Child node. (readonly)
- `struct cleri_children_s * cleri_children_t.next`: Next child node or `NULL` if there are no other children. (readonly)
Example looping over all children within a node:
```c
/* we assume having a node (cleri_node_t*) */
if (cleri_node_has_children(node)) {
    cleri_children_t * child = node->children;
    while (child != NULL) {
        // do something with child->node
        child = child->next;
    }
}
```
### `cleri_olist_t`
Linked list holding libcleri objects. A `cleri_olist_t` type is used for
expected elements in a parse result.
*Public members*
- `cleri_t * cl_obj`: Object (holding an element, readonly)
- `cleri_olist_t * next`: Next object. (readonly)
Example looping over `cleri_parse_t.expect`:
```c
/* we assume having a pr (cleri_parse_t*)
 *
 * Notes:
 *   pr->expect is NULL if nothing is expected and it is safe to
 *   change pr->expect. If required the linked list can be reset to start
 *   using cleri_parse_expect_start(). */
while (pr->expect != NULL) {
    // do something with pr->expect->cl_obj
    pr->expect = pr->expect->next;
}
```
## Elements
Elements are objects used to define a grammar.
### `cleri_keyword_t`
Keyword element. The parser needs a match with the keyword.
*Type (`cleri_t.tp`)*: `CLERI_TP_KEYWORD`
*Public members*
- `const char * cleri_keyword_t.keyword`: Contains the keyword string. (readonly)
- `int cleri_keyword_t.ign_case`: Boolean. (readonly)
- `size_t cleri_keyword_t.len`: Length of the keyword string. (readonly)
#### `cleri_t * cleri_keyword(uint32_t gid, const char * keyword, int ign_case)`
Create and return a new [object](#cleri_t) containing a keyword element.
Argument `ign_case` can be set to 1 for a case insensitive keyword match.
Example:
```c
/* define case insensitive keyword */
cleri_t * k_tictactoe = cleri_keyword(
0, // gid, not used in this example
"tic-tac-toe", // keyword
1); // case insensitive
/* create grammar with custom keyword regular expression match */
cleri_grammar_t * grammar = cleri_grammar(k_tictactoe, "^[A-Za-z-]+");
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "Tic-Tac-Toe");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_regex_t`
Regular expression element. The parser needs a match with the regular
expression.
*No public members*
#### `cleri_t * cleri_regex(uint32_t gid, const char * pattern)`
Create and return a new [object](#cleri_t) containing a regular
expression element. Argument `pattern` should contain the regular expression.
Each pattern must start with character `^` and the pattern should be checked
before calling this function.
See [Quick usage](#quick-usage) for a `cleri_regex_t` example.
### `cleri_choice_t`
Choice element. The parser must choose one of the child elements.
*Public members*
- `int cleri_choice_t.most_greedy`: Boolean. (readonly)
- `cleri_olist_t * cleri_choice_t.olist`: Children. (readonly)
#### `cleri_t * cleri_choice(uint32_t gid, int most_greedy, size_t len, ...)`
Create and return a new [object](#cleri_t) containing a choice element.
Argument `most_greedy` can be set to 1 in which case the parser will select the
most greedy match. When 0, the parser will select the first match.
Example:
```c
/* define grammar */
cleri_t * k_hello = cleri_keyword(0, "hello", 0);
cleri_t * k_goodbye = cleri_keyword(0, "goodbye", 0);
cleri_t * choice = cleri_choice(
0, // gid, not used in this example
0, // stop at first match
2, // number of elements
k_hello, k_goodbye); // elements
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(choice, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "goodbye");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_sequence_t`
Sequence element. The parser must match each element in the specified order.
*Public members*
- `cleri_olist_t * cleri_sequence_t.olist`: Elements. (readonly)
#### `cleri_t * cleri_sequence(uint32_t gid, size_t len, ...)`
Create and return a new [object](#cleri_t) containing a sequence element.
Example:
```c
cleri_t * sequence = cleri_sequence(
0, // gid, not used in the example
3, // number of elements
cleri_keyword(0, "Tic", 0), // first element
cleri_keyword(0, "Tac", 0), // second element
cleri_keyword(0, "Toe", 0)); // third element
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(sequence, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "Tic Tac Toe");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_optional_t`
Optional element. The parser looks for an optional element.
*Public members*
- `cleri_t * cleri_optional_t.cl_obj`: Optional element. (readonly)
#### `cleri_t * cleri_optional(uint32_t gid, cleri_t * cl_obj)`
Create and return a new [object](#cleri_t) containing an optional element.
Example:
```c
/* define grammar */
cleri_t * k_hello = cleri_keyword(0, "hello", 0);
cleri_t * k_there = cleri_keyword(0, "there", 0);
cleri_t * optional = cleri_optional(
0, // gid, not used in this example
k_there); // optional element
cleri_t * greet = cleri_sequence(
0, // gid, not used in this example
2, // number of elements
k_hello, optional); // elements
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(greet, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "hello");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_prio_t`
Prio element. The parser must match one element. Inside the prio element it
is possible to use `CLERI_THIS` which is a reference to itself.
>Note: Use a [forward reference](#forward-reference) when possible.
>A prio is required when the same position in a string is potentially checked
>more than once.
*Public members*
- `cleri_olist_t * cleri_prio_t.olist`: Elements. (readonly)
#### `cleri_t * cleri_prio(uint32_t gid, size_t len, ...)`
Create and return a new [object](#cleri_t) containing a prio element.
Example:
```c
/*
 * define grammar.
 *
 * Note: The third and fourth element are using a reference to the prio
 *       element at the same position in the string as the prio element.
 *       This is why a forward reference cannot be used for this example.
 */
cleri_t * prio = cleri_prio(
    0,                              // gid, not used in the example
    4,                              // number of elements
    cleri_keyword(0, "ni", 0),      // first element
    cleri_sequence(0, 3,            // second element
        cleri_token(0, "("),
        CLERI_THIS,
        cleri_token(0, ")")),
    cleri_sequence(0, 3,            // third element
        CLERI_THIS,
        cleri_keyword(0, "or", 0),
        CLERI_THIS),
    cleri_sequence(0, 3,            // fourth element
        CLERI_THIS,
        cleri_keyword(0, "and", 0),
        CLERI_THIS));
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(prio, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "(ni or ni) and (ni or ni)");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_repeat_t`
Repeat element. The parser must match at least `cleri_repeat_t.min` elements and
at most `cleri_repeat_t.max`. An unlimited amount is allowed in case `cleri_repeat_t.max`
is set to 0 (zero).
*Public members*
- `cleri_t * cleri_repeat_t.cl_obj`: Element to repeat. (readonly)
- `size_t cleri_repeat_t.min`: Minimum times an element is expected. (readonly)
- `size_t cleri_repeat_t.max`: Maximum times an element is expected or 0 for unlimited. (readonly)
#### `cleri_t * cleri_repeat(uint32_t gid, cleri_t * cl_obj, size_t min, size_t max)`
Create and return a new [object](#cleri_t) containing a repeat element.
Argument `max` should be greater or equal to `min` or 0.
Example:
```c
/* define grammar */
cleri_t * repeat = cleri_repeat(
0, // gid, not used in this example
cleri_keyword(0, "ni", 0), // repeated element
0, // min n times
0); // max n times (0 for unlimited)
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(repeat, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "ni ni ni ni ni");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_list_t`
List element. Like [repeat](#cleri_repeat_t) but with a delimiter.
*Public members*
- `cleri_t * cleri_list_t.cl_obj`: Element to repeat. (readonly)
- `cleri_t * cleri_list_t.delimiter`: Delimiter between repeating element. (readonly)
- `size_t cleri_list_t.min`: Minimum times an element is expected. (readonly)
- `size_t cleri_list_t.max`: Maximum times an element is expected or 0 for unlimited. (readonly)
- `int cleri_list_t.opt_closing`: Allow or disallow ending with a delimiter.
#### `cleri_t * cleri_list(uint32_t gid, cleri_t * cl_obj, cleri_t * delimiter, size_t min, size_t max, int opt_closing)`
Create and return a new [object](#cleri_t) containing a list element.
Argument `max` should be greater or equal to `min` or 0. Argument `opt_closing`
can be 1 (TRUE) to allow or 0 (FALSE) to disallow a list to end with a delimiter.
Example:
```c
/* define grammar */
cleri_t * list = cleri_list(
0, // gid, not used in this example
cleri_keyword(0, "ni", 0), // repeated element
cleri_token(0, ","), // delimiter element
0, // min n times
0, // max n times (0 for unlimited)
0); // disallow ending with a delimiter
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(list, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "ni, ni, ni, ni, ni");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_token_t`
Token element. The parser must match a token exactly. A token can be one or more
characters and is usually used to match operators like `+`, `-`, `*` etc.
*Public members*
- `const char * cleri_token_t.token`: Token string. (readonly)
- `size_t cleri_token_t.len`: Length of the token string. (readonly)
#### `cleri_t * cleri_token(uint32_t gid, const char * token)`
Create and return a new [object](#cleri_t) containing a token element.
Example:
```c
/* define grammar */
cleri_t * token = cleri_token(
0, // gid, not used in this example
"-"); // token string (dash)
cleri_t * ni = cleri_keyword(0, "ni", 0);
cleri_t * list = cleri_list(0, ni, token, 0, 0, 0);
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(list, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "ni-ni - ni- ni -ni");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_tokens_t`
Tokens element. Can be used to register multiple tokens at once.
#### `cleri_t * cleri_tokens(uint32_t gid, const char * tokens)`
Create and return a new [object](#cleri_t) containing a tokens element.
Argument `tokens` must be a string with tokens separated by spaces. If the given
tokens differ in size, the parser will try to match the longest tokens
first.
Example:
```c
/* define grammar */
cleri_t * tokens = cleri_tokens(
0, // gid, not used in this example
"+ - -="); // tokens string '+', '-' and '-='
cleri_t * ni = cleri_keyword(0, "ni", 0);
cleri_t * list = cleri_list(0, ni, tokens, 0, 0, 0);
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(list, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "ni + ni -= ni - ni");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
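
Why the longest tokens are tried first can be seen in a small stand-alone sketch. This is not libcleri's implementation: `match_token` is a hypothetical helper that uses only the C standard library to pick the longest token that prefixes a string.

```c
#include <string.h>

/* Hypothetical helper: returns the length of the longest token in `tokens`
 * (an array of `n` strings) that is a prefix of `str`, or 0 when no token
 * matches. */
size_t match_token(const char * str, const char * const * tokens, size_t n)
{
    size_t best = 0;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(tokens[i]);
        /* only accept a token that is longer than the best match so far */
        if (len > best && strncmp(str, tokens[i], len) == 0) {
            best = len;
        }
    }
    return best;
}
```

With the tokens `+`, `-` and `-=`, the input `-= ni` matches `-=` (length 2). If `-` were accepted first, the remaining `=` would be left over as an unexpected character.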
### `Forward reference`
Forward reference to a libcleri object. There is no specific type for a
reference.
>Warning: A reference is not protected against testing the same position
>in a string. This could potentially lead to an infinite loop.
>For example:
>```c
>cleri_ref_set(ref, cleri_optional(0, ref)); // DON'T DO THIS
>```
>Use [prio](#cleri_prio_t) if such recursive construction is required.
#### `cleri_t * cleri_ref(void)`
Create and return a new [object](#cleri_t) as reference element.
Once the reference is created, it can be used as an element in your grammar. Do not
forget to actually set the reference using `cleri_ref_set()`.
#### `void cleri_ref_set(cleri_t * ref, cleri_t * cl_obj)`
Set a reference. For every created forward reference, this function must be
called exactly once. Argument `ref` must be created with `cleri_ref()`. Argument
`cl_obj` cannot be used outside the reference. Since the reference becomes
the `cl_obj`, it is the reference you should use.
Example:
```c
/* define grammar */
cleri_t * ref = cleri_ref();
cleri_t * choice = cleri_choice(
    0, 0, 2, cleri_keyword(0, "ni", 0), ref);
cleri_ref_set(ref, cleri_sequence(
    0,
    3,
    cleri_token(0, "["),
    cleri_list(0, choice, cleri_token(0, ","), 0, 0, 0),
    cleri_token(0, "]")));
/* create grammar */
cleri_grammar_t * grammar = cleri_grammar(ref, NULL);
/* parse some test string */
cleri_parse_t * pr = cleri_parse(grammar, "[ni, ni, [ni, [], [ni, ni]]]");
printf("Valid: %s\n", pr->is_valid ? "true" : "false"); // true
/* cleanup */
cleri_parse_free(pr);
cleri_grammar_free(grammar);
```
### `cleri_dup_t`
Duplicate an object. The type is an extension to `cleri_t`.
#### `cleri_t * cleri_dup(uint32_t gid, cleri_t * cl_obj)`
Duplicate a libcleri object with a different gid but using the same element.
>Note: Only the object is duplicated. The element (`cleri_via_t via`)
>is a pointer to the original object.
The following [pyleri](https://github.com/transceptor-technology/pyleri) code
will use `cleri_dup()` when exported to c:
```python
elem = Repeat(obj, mi=1, ma=1)
```
Use the code below if you want similar behavior without duplication:
```python
elem = Sequence(obj)
```
### Miscellaneous functions
#### `const char * cleri_version(void)`
Returns the version of libcleri.