1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103
|
<img src="https://raw.githubusercontent.com/maciejhirsz/logos/master/logos.svg?sanitize=true" alt="Logos logo" width="250" align="right">
# Logos

[](https://crates.io/crates/logos2)
[](https://docs.rs/logos2)
[](https://crates.io/crates/logos2)
_Create ridiculously fast Lexers._
**Logos** has two goals:
+ To make it easy to create a Lexer, so you can focus on more complex problems.
+ To make the generated Lexer faster than anything you'd write by hand.
To achieve those, **Logos**:
+ Combines all token definitions into a single [deterministic state machine](https://en.wikipedia.org/wiki/Deterministic_finite_automaton).
+ Optimizes branches into [lookup tables](https://en.wikipedia.org/wiki/Lookup_table) or [jump tables](https://en.wikipedia.org/wiki/Branch_table).
+ Prevents [backtracking](https://en.wikipedia.org/wiki/ReDoS) inside token definitions.
+ [Unwinds loops](https://en.wikipedia.org/wiki/Loop_unrolling), and batches reads to minimize bounds checking.
+ Does all of that heavy lifting at compile time.
## Example
```rust
use logos::Logos;
#[derive(Logos, Debug, PartialEq)]
#[logos(skip r"[ \t\n\f]+")] // Ignore this regex pattern between tokens
enum Token {
// Tokens can be literal strings, of any length.
#[token("fast")]
Fast,
#[token(".")]
Period,
// Or regular expressions.
#[regex("[a-zA-Z]+")]
Text,
}
fn main() {
let mut lex = Token::lexer("Create ridiculously fast Lexers.");
assert_eq!(lex.next(), Some(Ok(Token::Text)));
assert_eq!(lex.span(), 0..6);
assert_eq!(lex.slice(), "Create");
assert_eq!(lex.next(), Some(Ok(Token::Text)));
assert_eq!(lex.span(), 7..19);
assert_eq!(lex.slice(), "ridiculously");
assert_eq!(lex.next(), Some(Ok(Token::Fast)));
assert_eq!(lex.span(), 20..24);
assert_eq!(lex.slice(), "fast");
assert_eq!(lex.next(), Some(Ok(Token::Text)));
assert_eq!(lex.slice(), "Lexers");
assert_eq!(lex.span(), 25..31);
assert_eq!(lex.next(), Some(Ok(Token::Period)));
assert_eq!(lex.span(), 31..32);
assert_eq!(lex.slice(), ".");
assert_eq!(lex.next(), None);
}
```
For more examples and documentation, please refer to the
[Logos handbook](https://maciejhirsz.github.io/logos/) or the
[crate documentation](https://docs.rs/logos2/latest/logos/).
## How fast?
Ridiculously fast!
```norust
test identifiers ... bench: 647 ns/iter (+/- 27) = 1204 MB/s
test keywords_operators_and_punctators ... bench: 2,054 ns/iter (+/- 78) = 1037 MB/s
test strings ... bench: 553 ns/iter (+/- 34) = 1575 MB/s
```
## Acknowledgements
+ [Pedrors](https://pedrors.pt/) for the **Logos** logo.
## Thank you
**Logos** is very much a labor of love. If you find it useful, consider
[getting me some coffee](https://github.com/sponsors/maciejhirsz). ☕
If you'd like to contribute to Logos, then consider reading the
[Contributing guide](https://maciejhirsz.github.io/logos/contributing).
## License
This code is distributed under the terms of both the MIT license
and the Apache License (Version 2.0), choose whatever works for you.
See [LICENSE-APACHE](LICENSE-APACHE) and [LICENSE-MIT](LICENSE-MIT) for details.
|