File: js_extractor.md

# JavaScript Extractor Scripts

KItinerary provides a JavaScript engine for writing extractors that parse tickets and other travel documents and
output structured data.
Other applications then use this data to present useful information to the user.

All extractor scripts are written in JavaScript and [stored in
`/src/lib/scripts`](https://invent.kde.org/pim/kitinerary/-/tree/master/src/lib/scripts).

# How to make your own extractor script

It's highly recommended to use the [KItinerary Workbench](https://invent.kde.org/pim/kitinerary-workbench) to develop
and test your extractor scripts. For installation instructions and more information, see
the [KItinerary Workbench README](https://invent.kde.org/pim/kitinerary-workbench/-/blob/master/README.md).

## Creating a new extractor script

To create a new extractor script, create its files in the `$XDG_DATA_DIRS/kitinerary/extractors`
(`~/.local/share/kitinerary/extractors`) directory.

> **Note:** For easier management and later collaboration in Git we recommend linking extractor scripts from the
> directory to a Git repository (`ln -s $(pwd)/src/lib/scripts ~/.local/share/kitinerary/extractors`).

## Script declaration

KItinerary uses a JSON file to declare extractor scripts. This file defines the filter rules that determine which
extractor to run, and names the script itself.

> **Note:** Multiple extractors can run on a single document. If more than one extractor outputs valid data, the
> results are merged into a single output.
>
> **Note:** Multiple script declarations can exist for one JS extractor file. This is useful if there are many types
> of documents but the same script can be used to extract data from them.

### Extractor declaration
 
An extractor declaration contains the MIME type of the document to be ingested, the filters defining when the
extractor script should run, and the script file together with the function that will be called in it.

```json
{
  "mimeType": "application/pdf",
  // MIME type of the document that is going to be ingested (email, PDF, or other)
  "filter": [
    {
      ...
    }
  ],
  // Rules that decide whether a document should be parsed by this script
  "script": "my-extractor-script.js",
  // Name of the extractor script itself
  "function": "extractTicket"
  // Name of the function in the script that will be called
}
```

Extractor scripts are run against a document node if all of the following conditions are met:

- The `mimeType` of the script matches that of the node.
- At least one of the script's `filter` entries matches the node.

### Extractor filters

Extractor filters are evaluated against document nodes (the content of the document).

An extractor filter consists of the following four properties:

```json
{
  "[...]": "[...]",
  "filter": [
    {
      "mimeType": "text/plain",
      // Specifies the type of document node you're looking for (e.g., QR code in document, plain text).
      "field": "",
      // For complex documents (like an email with header fields), sets which field of the document "match" is run on. This is ignored for nodes containing basic types such as plain text or binary data.
      "match": "KDE.org/airlines",
      // A regular expression or exact string to look for.
      "scope": "Current"
      // Defines the relation to the node the script should be run on (Current, Parent, Children, Ancestors or Descendants).
    }
    // [...]
  ]
}
```

`scope` defines which nodes the filter is matched against, relative to the node the script runs on. The following values are supported:

- `Current`: The filter is applied to the node itself (mimeType in filter).
- `Parent`: The filter is applied to the direct parent node of the current node (only one back).
- `Children`: The filter is applied to the direct child nodes of the current node (only one forward).
- `Ancestors`: The filter is applied to all parent nodes of the current node (all the way back).
- `Descendants`: The filter is applied to all child nodes of the current node (all the way forward).

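The scope values above can be pictured as different ways of walking the document tree from the node the script runs on. The following is a minimal plain-JavaScript sketch of that idea, not KItinerary's actual implementation: `nodesInScope`, `makeNode` and the `parent` link on the sample nodes are hypothetical names introduced only for this illustration. It returns the nodes a filter with a given scope would be checked against.

```js
// Simplified model of a document node: { mimeType, parent, childNodes }.
// Hypothetical helper, not part of the KItinerary API.
function nodesInScope(node, scope) {
    switch (scope) {
    case "Current":
        return [node];
    case "Parent":
        return node.parent ? [node.parent] : [];
    case "Children":
        return [...node.childNodes];
    case "Ancestors": {
        const result = [];
        for (let n = node.parent; n; n = n.parent) {
            result.push(n);
        }
        return result;
    }
    case "Descendants": {
        const result = [];
        for (const child of node.childNodes) {
            result.push(child, ...nodesInScope(child, "Descendants"));
        }
        return result;
    }
    default:
        throw new Error("Unknown scope: " + scope);
    }
}

// Builds a sample node and links its children back to it (illustration only).
function makeNode(mimeType, childNodes = []) {
    const node = { mimeType, parent: null, childNodes };
    for (const child of childNodes) {
        child.parent = node;
    }
    return node;
}

// An email with a PDF attachment that contains a QR code and some text.
const qrText = makeNode("text/plain");
const qrImage = makeNode("internal/qimage", [qrText]);
const pdf = makeNode("application/pdf", [qrImage, makeNode("text/plain")]);
const email = makeNode("message/rfc822", [pdf]);

console.log(nodesInScope(pdf, "Parent").map(n => n.mimeType));
console.log(nodesInScope(pdf, "Descendants").map(n => n.mimeType));
console.log(nodesInScope(qrText, "Ancestors").map(n => n.mimeType));
```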
<details>
<summary>Scope examples</summary>

---

We have an email with a PDF attachment that contains the ticket details.
In the document tree, the PDF is a child of the email.

```tree
└── message/rfc822 // Email sent by the booking operator
    ├── text/html // HTML content of the email
    │   └── text/plain // extracted only text from the HTML
    ├── application/pdf // PDF attached to the email
    │   ├── internal/qimage // Image of the QR code
    │   │   └── text/plain // Decoded text from the QR code
    │   └── text/plain // Usually text inside the PDF
    └── internal/qimage // Image of the booking company logo - we don't care
```

```json
{
  "mimeType": "message/rfc822",
  // The filter is matched against the email, which is the parent of the PDF
  "field": "From",
  // Email (message/rfc822) has a "From" field containing the sender
  "match": "^booking@example-operator\\.com$",
  "scope": "Parent"
  // We look at the parent of the PDF, which is the email
}
```

```js
function parser(pdfTicket, node, matched) {
    [...]
}
```

The script runs on the PDF node, but the filter matches on its parent, the email, by checking whether it was sent
from "booking@example-operator.com".
As a result, the first argument of the parser function is the PDF content, the second argument is the node (PDF), and
the third argument is the node that matched the filter (message/rfc822).

</details>


<details>
<summary>Examples</summary>

---

Anything attached to an email sent by "booking@example-operator.com". The field matched against here
is the `From` header of the MIME message.

```json
{
  "mimeType": "message/rfc822",
  // MIME type of an email
  "field": "From",
  // We look at the "From" field of the email, which is the sender
  "match": "^booking@example-operator\\.com$",
  // We match exactly "booking@example-operator.com"
  "scope": "Ancestors"
}
```

---

Documents containing a barcode of the format "F12345678". Note that the scope here is `Descendants`
rather than `Children` as the direct child nodes tend to be the images containing the barcode.

```json
{
  "mimeType": "text/plain",
  // We look at plain text
  "scope": "Descendants",
  "match": "^F\\d{8}$"
  // We look for exactly "F" followed by 8 digits
}
```

---

Apple Wallet passes issued by "org.kde.travelAgency".

```json
{
  "mimeType": "application/vnd.apple.pkpass",
  // We look at Apple Wallet passes
  "field": "passTypeIdentifier",
  // We look at field "passTypeIdentifier" which is the issuer
  "match": "org.kde.travelAgency",
  "scope": "Current"
  // We look only at this document
}
```

---

iCal events with an organizer email address of the "kde.org" domain. Note that the field here accesses
a property of a property. This works at arbitrary depth, as long as the corresponding types are
introspectable by Qt.

```json
{
  "mimeType": "internal/event",
  "field": "organizer.email",
  "match": "@kde.org$",
  "scope": "Current"
}
```

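The nested `field` lookup in the iCal example can be pictured as following a dotted path through an object. The sketch below is a plain-JavaScript model of that idea only; `resolveField` and the sample `event` object are hypothetical and not part of KItinerary, which resolves fields on Qt-introspectable types instead.

```js
// Hypothetical helper: follows a dotted field path such as "organizer.email".
function resolveField(content, fieldPath) {
    let value = content;
    for (const key of fieldPath.split(".")) {
        if (value === null || value === undefined) {
            return undefined; // path leads nowhere, so the filter cannot match
        }
        value = value[key];
    }
    return value;
}

// Plain-object stand-in for the content of an iCal event node.
const event = {
    summary: "Akademy",
    organizer: { name: "Konqi", email: "konqi@kde.org" },
};

// Same pattern as the "match" value in the filter above.
const organizerEmail = resolveField(event, "organizer.email");
console.log(organizerEmail, /@kde.org$/.test(organizerEmail));
```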
---

A (PDF) document containing an IATA boarding pass barcode of the airline "AB". Triggering
vendor-specific UIC or ERA railway tickets can be done very similarly, matching on the corresponding
carrier ids.

```json
{
  "mimeType": "internal/iata-bcbp",
  "field": "operatingCarrierDesignator",
  "match": "AB",
  "scope": "Descendants"
}
```

---

A node that has already existing results containing a reservation from "My Transport Operator".
This is useful for scripts that want to augment or fix schema.org annotation already provided by
the source. Note that the mimeType "application/ld+json" is special here as it doesn't only trigger
on the document node content itself, but also matches against the result of nodes of any type.

```json
{
  "mimeType": "application/ld+json",
  "field": "reservationFor.provider.name",
  "match": "My Transport Operator",
  "scope": "Current"
}
```

---

**NOT RECOMMENDED** This should be used as a last resort only, as matching against the full PDF document content can be
expensive.

PDF documents containing the string "My Ferry Booking" anywhere.

```json
{
  "mimeType": "application/pdf",
  "field": "text",
  "match": "My Ferry Booking",
  "scope": "Current"
}
```

</details>

## Extractor script

Extractor scripts are run inside a QJSEngine. **This isn't a full JS environment**, and not everything is supported.
There are some additional APIs available to extractor scripts (technical docs can be found
in [KItinerary::JsApi](https://api.kde.org/kdepim/kitinerary/html/namespaceKItinerary_1_1JsApi.html)).

### Objects of a document

#### ExtractorDocumentNode (node)

It is an object that represents a node in the document tree:

- `content`: Value of the node (e.g. text, barcode content, etc.)
- `childNodes`: List of the child nodes of this node; they are also ExtractorDocumentNode objects.
- `mimeType`: MIME type of the node (eg. text/plain, application/pdf, internal/qimage etc)

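Since every child is itself a node with the same three properties, a small recursive helper can search the tree by MIME type instead of relying on fixed child indices. This is a minimal sketch under stated assumptions: `findNodes` is a hypothetical helper, the sample tree is a plain-object stand-in for real nodes, and `childNodes` is assumed to behave like a regular JavaScript array.

```js
// Collects all nodes of a given MIME type, starting at `node` itself.
// Only uses the `mimeType` and `childNodes` properties described above.
function findNodes(node, mimeType) {
    const found = node.mimeType === mimeType ? [node] : [];
    for (const child of node.childNodes) {
        found.push(...findNodes(child, mimeType));
    }
    return found;
}

// Plain-object stand-in for a PDF ticket with a decoded QR code.
const pdfNode = {
    mimeType: "application/pdf",
    content: null,
    childNodes: [
        {
            mimeType: "internal/qimage",
            content: null,
            childNodes: [
                { mimeType: "text/plain", content: "F12345678", childNodes: [] },
            ],
        },
        { mimeType: "text/plain", content: "Konqi -> Katie West", childNodes: [] },
    ],
};

const texts = findNodes(pdfNode, "text/plain").map(n => n.content);
console.log(texts);
```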
<details>
<summary>Examples</summary>

```tree
└── application/pdf // Ticket in PDF format
    ├── internal/qimage // Image of the QR code
    │   └── text/plain // Decoded text from the QR code
    └── text/plain // Usually text inside the PDF
```

```js
function main(pdf, node) {
    console.log(pdf.content); // Automagically extracted PDF content, no need to point at it.
    let imageOfQR = node.childNodes[0];
    let textFromQR = imageOfQR.childNodes[0].content;
}
```

</details>

#### DocumentNode types

A ticket can come in different formats, and each format has its own object:

<details id="PDF - PDF document">
<summary>PDF - PDF document</summary>
PdfDocument is an object that represents a PDF document; it has the following properties:

- `text`: The text of the whole PDF document. On a single page (an entry of `pages`), `text` returns the text of that
  page only.
- `pages`: List of pages in the PDF.
- `textInRect`: Extracts text from a given rectangle on a PDF page. Uses normalized coordinates (0-1) in the order
  "Left, Top, Right, Bottom".

> More:
> [PdfDocument](https://api.kde.org/kdepim/kitinerary/html/classKItinerary_1_1PdfDocument.html)

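Normalized coordinates mean that each edge is expressed as a fraction of the page width or height. If you measured a region in absolute page units, for example with a PDF viewer, it has to be converted first. The helper below is a hypothetical plain-JavaScript sketch of that conversion; `toNormalizedRect` and the page size used here are assumptions for illustration and not part of KItinerary.

```js
// Hypothetical helper: converts a rectangle measured in page units
// (for example points or millimetres) into normalized 0-1 coordinates.
function toNormalizedRect(rect, pageWidth, pageHeight) {
    return {
        left: rect.left / pageWidth,
        top: rect.top / pageHeight,
        right: rect.right / pageWidth,
        bottom: rect.bottom / pageHeight,
    };
}

// Top-left region of a page that is 200 units wide and 400 units high.
const area = toNormalizedRect({ left: 0, top: 0, right: 60, bottom: 100 }, 200, 400);
console.log(area.left, area.top, area.right, area.bottom);

// Inside an extractor script the values would then be passed on, e.g.:
// const text = page.textInRect(area.left, area.top, area.right, area.bottom);
```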
  <details>
  <summary>Examples</summary>

```js
// If the ticket is a PDF, the first argument is the `PdfDocument` object
function main(contentPDF, node) {
    const allText = contentPDF.text; // Extracts all text from the PDF document
    const firstPage = contentPDF.pages[0].text; // Extracts text from the first page only
    const textInRect = contentPDF.pages[0].textInRect(0, 0, 0.3, 0.25); // "Passenger: Kandalf"
}
```

  </details>
</details>

<details id="Html - HTML document">
<summary>Html - HTML document</summary>
HtmlDocument is an object that represents an HTML document consisting of HtmlElements; it has the following properties and methods:

- `rawData()`: Returns the raw textual HTML data.
- `root()`: Returns the root element of the document.
- `eval(xpath)`: Evaluates an XPath expression relative to the document root and returns matching elements.

HtmlElement represents an element within an HTML document; it has the following properties and methods:

- `name`: Returns the element name (tag).
- `isNull`: Checks if the element is null/invalid.
- `attribute`: Returns the value of the specified attribute.
- `hasAttribute`: Checks whether an attribute with the given name exists.
- `attributes`: Returns a list of all attributes of this element.
- `content`: Returns the immediate text content of this element (trimmed of whitespace).
- `recursiveContent`: Returns the text content of this element and all its children.
- `parent`: Returns the parent element of this node.
- `firstChild`: Returns the first child element of this node.
- `nextSibling`: Returns the next sibling element of this node.
- `eval`: Evaluates an XPath expression relative to this element.

> More:
> [HtmlDocument](https://api.kde.org/kdepim/kitinerary/html/classKItinerary_1_1HtmlDocument.html),
> [HtmlElement](https://api.kde.org/kdepim/kitinerary/html/classKItinerary_1_1HtmlElement.html)

  <details>
  <summary>Examples</summary>

  ```js
// A simple HTML document used as sample input
const simpleHtml = `
    <html><head>
    <title>Flight Details</title>
    </head><body>

    <div class="flight-info">
    <h1>Flight KD1996</h1>
    <div class="departure">
    <span class="code">KDQ</span>
    <span class="time">2025-02-20 08:30</span>
    </div>

    <div class="arrival">
    <span class="code">KDA</span>
    <span class="time">2025-02-22 16:45</span>
    </div>

    <div class="passenger" id="traveler">
    <span class="name">Kandalf the wizard</span>
    <span class="seat">12A</span>

    </div>
    </div>
    </body>
    </html>
    `;

// Wrapped in a function so that `return` is valid; the name `main` is only a placeholder.
function main() {
    const html = ExtractorEngine.extract(simpleHtml, "text/html").content;

    const res = JsonLd.newFlightReservation();

    // Get flight number from h1
    const flightHeader = html.eval("//h1")[0];
    console.log(flightHeader.content);
    if (typeof flightHeader.content == "string") {
        // Two-letter airline code followed by the flight number, e.g. "KD" and "1996"
        const flightNumber = flightHeader.content.match(/Flight ([A-Z]{2})(\d+)/);
        if (flightNumber) {
            res.reservationFor.airline.iataCode = flightNumber[1];
            res.reservationFor.flightNumber = flightNumber[2];
        }
    }

    // Get departure info (skipped if the element was not found)
    const departureElement = html.eval("//div[@class='departure']")[0];
    if (departureElement) {
        const codeElement = departureElement.eval("span[@class='code']")[0];
        const timeElement = departureElement.eval("span[@class='time']")[0];

        res.reservationFor.departureAirport.iataCode = codeElement.content;
        res.reservationFor.departureTime = JsonLd.toDateTime(
            timeElement.content,
            "yyyy-MM-dd HH:mm",
            "en",
        );
    }

    // Get arrival info (skipped if the element was not found)
    const arrivalElement = html.eval("//div[@class='arrival']")[0];
    if (arrivalElement) {
        const codeElement = arrivalElement.eval("span[@class='code']")[0];
        const timeElement = arrivalElement.eval("span[@class='time']")[0];

        res.reservationFor.arrivalAirport.iataCode = codeElement.content;
        res.reservationFor.arrivalTime = JsonLd.toDateTime(
            timeElement.content,
            "yyyy-MM-dd HH:mm",
            "en",
        );
    }

    // Get passenger info using element navigation
    const passengerDiv = html.eval("//div[@id='traveler']")[0];
    const nameSpan = passengerDiv.firstChild;
    const seatSpan = nameSpan.nextSibling;

    res.underName = {
        "@type": "Person",
        name: nameSpan.content,
    };

    res.reservedTicket = {
        "@type": "Ticket",
        ticketedSeat: {
            "@type": "Seat",
            seatNumber: seatSpan.content,
        },
    };

    return res;
}
  ```

  </details>

</details>

<details id="PKPASS">
<summary>PKPASS</summary>
It is an object that exposes the fields inside a PKPASS file:

- `field[X]`: Object with labels and values

  <details>
  <summary>Example - pkpass</summary>

  ```js
  function main(pass, node) {
    // pass.json has "boardingPass" with keys "depar" "arrir" "arrirTime" "deparTime" "code"
  
    var f = JsonLd.newFlightReservation(); // https://schema.org/FlightReservation
    f.reservationFor.departureAirport.name = pass.field["depar"].label;
    f.reservationFor.arrivalAirport.name = pass.field["arrir"].label;
    f.reservationFor.departureTime = JsonLd.toDateTime(
      pass.field["deparTime"].value,
      "hh:mm dd.MM.yyyy",
      "en",
    );
    f.reservationFor.arrivalTime = JsonLd.toDateTime(
      pass.field["arrirTime"].value,
      "hh:mm dd.MM.yyyy",
      "en",
    );
    f.reservationFor.airline.iataCode = "KD";
    f.reservationFor.flightNumber = pass.field["code"].label;
    return f; // Returns the flight reservation object later used by other apps
  }
  ```

  </details>

</details>

### Additional API available to extractor scripts

#### JSON-LD API

API for supporting schema.org output:

- `JsonLd`: factory functions for schema.org objects, date/time parsing, etc

> More: [JsonLd](https://api.kde.org/kdepim/kitinerary/html/classKItinerary_1_1JsApi_1_1JsonLd.html)

<details>
<summary>Examples</summary>

```js
var f = JsonLd.newFlightReservation(); // https://schema.org/FlightReservation
f.reservationFor.departureAirport.name = "KDE Konqi Airport (KDQ)"; // https://schema.org/FlightReservation -> https://schema.org/Flight -> https://schema.org/Place -> https://schema.org/Airport
f.reservationFor.arrivalAirport.name = "KDE Katie City Airport (KDA)";
f.reservationFor.departureTime = JsonLd.toDateTime(
    "08:36 20.02.2025",
    "hh:mm dd.MM.yyyy",
    "en",
);
f.reservationFor.arrivalTime = JsonLd.toDateTime(
    "09:56 20.02.2025",
    "hh:mm dd.MM.yyyy",
    "en",
);
f.reservationFor.airline.iataCode = "KD";
f.reservationFor.flightNumber = "KD 1096";
return f; // Returns the flight reservation object later used by other apps
```

</details>

#### ByteArray, BitArray, Barcode

API for handling specific types of input data:

- `ByteArray`: functions for dealing with byte-aligned binary data, including decompression, Base64 decoding, Protocol
  Buffer decoding, etc.
- `BitArray`: functions for dealing with non byte-aligned binary data, such as reading numerical data at arbitrary bit
  offsets. Often used if binary data uses a nonstandard encoding (e.g. 6 bits per character).
- `Barcode`: functions for manual barcode decoding. This should be rarely needed nowadays, with the extractor engine
  doing this automatically and creating corresponding document nodes.

> More:
> [ByteArray](https://api.kde.org/kdepim/kitinerary/html/classKItinerary_1_1JsApi_1_1ByteArray.html),
> [BitArray](https://api.kde.org/kdepim/kitinerary/html/classKItinerary_1_1JsApi_1_1BitArray.html),
> [Barcode](https://api.kde.org/kdepim/kitinerary/html/classKItinerary_1_1JsApi_1_1Barcode.html)

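To make "reading numerical data at arbitrary bit offsets" more concrete, here is a stand-alone plain-JavaScript sketch of MSB-first bit reading and of a 6-bits-per-character text encoding. It is an illustration only: `readBitsMSB` and `decodeSixBitText` are hypothetical helpers, and the "character code = value + 32" mapping is an assumed example encoding. Inside an extractor script, `BitArray` provides the bit reading for you.

```js
// Reads `size` bits starting at bit offset `startBit`, most significant bit first.
// Stand-alone illustration; not KItinerary's BitArray implementation.
function readBitsMSB(bytes, startBit, size) {
    let value = 0;
    for (let i = 0; i < size; ++i) {
        const bit = startBit + i;
        const byte = bytes[bit >> 3]; // byte that holds this bit
        const bitValue = (byte >> (7 - (bit & 7))) & 1; // bit 0 is the MSB of the byte
        value = (value << 1) | bitValue;
    }
    return value;
}

// Decodes text packed as 6 bits per character, where value 0 maps to ASCII 32.
function decodeSixBitText(bytes, charCount) {
    let text = "";
    for (let i = 0; i < charCount; ++i) {
        text += String.fromCharCode(readBitsMSB(bytes, i * 6, 6) + 32);
    }
    return text;
}

// The 6-bit values 43, 36, 37, 17 ("K", "D", "E", "1") packed into three bytes.
const packed = Uint8Array.from([0xAE, 0x49, 0x51]);
console.log(decodeSixBitText(packed, 4)); // KDE1
```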
<details>
<summary>Examples</summary>

```js
const KonqiPersonality = ByteArray.toBase64("Cheerful"); // "Q2hlZXJmdWw="
const KatieMessage = ByteArray.fromBase64("UmVtZW1iZXIgdG8gdGFrZSBicmVha3M="); // "Remember to take breaks"

const theQR = node.childNodes[1].childNodes[0].content; // Base64 encoded data
const decodedQR = ByteArray.fromBase64(theQR); // binary blob
const bitsOfQR = ByteArray.toBitArray(decodedQR); // Convert the decoded data to a BitArray so it can be read bit-by-bit
let outputString = "";
for (let i = 0; i < 5; ++i) {
    let magicalNumber = bitsOfQR.readNumberMSB(i * 6, 6); // Reads 6 **bits** at bit offset i * 6, e.g. 43
    outputString += String.fromCharCode(magicalNumber + 32); // 43 + 32 = 75, which is "K"
}
console.log(outputString); // e.g. "KONQI"

// Usually not needed, as the extractor engine will create barcode nodes automatically
const QRCode = ImageOfAztecQRCodeNotDecodedByExtractorEngine;
const DecodedAztec = Barcode.decodeAztec(
    ImageOfAztecQRCodeNotDecodedByExtractorEngine,
);
console.log(DecodedAztec); // ["KDE airlines", "KDE Konqi Airport (KDQ)", "KDE Katie City Airport (KDA)", "20.02.2025", "08:36", "20.02.2025", "09:56", "KD 1096", "magicalstringsoweknowthisticketwasnottamperedwithbyevilwizards"]
```

</details>

#### Extractor API

API for interacting with the extractor engine itself:

- `ExtractorEngine`: Allows performing extraction recursively.
  It can be useful for elements that need custom decoding in an extractor script first,
  but that contain otherwise generally supported data formats. Standard barcodes encoded
  in URL arguments are such an example.

> More: [ExtractorEngine](https://api.kde.org/kdepim/kitinerary/html/classKItinerary_1_1ExtractorEngine.html)

<details>
<summary>Examples</summary>

```js
const XMLdataIncorrectlyInterpretedAsText = "<xml><data>42</data></xml>";
const CorrectlyInterpretedXML = ExtractorEngine.extract(
    XMLdataIncorrectlyInterpretedAsText,
    "application/xml",
);

var f = JsonLd.newFlightReservation();
ExtractorEngine.extractPrice("13 EUR", f); // Adds to ticket price
```

</details>

### Extractor scripts

The script entry point is called with three arguments:

- The first argument is the content of the node that is processed. The data type of that argument
  depends on the node type as described in the document node types section above. This is usually
  what extractor scripts are most concerned with.
- The second argument is the document node being processed (KItinerary::ExtractorDocumentNode, see the examples below).
  It can be useful to access already extracted results on a node (e.g. coming from generic extraction)
  in order to augment those.
- The third argument is the document node that matched the filter. This can be the same as the second
  argument (for filters with `scope` = Current), but it doesn't have to be. It is most useful when
  triggering on descendant nodes such as barcodes, the content of which will then be incorporated into
  the extraction result by the script.

The output of your JS function should be one of:

- A JS object following the schema.org ontology (JsonLd) with a single extraction result.
- A JS array containing one or more schema.org/JsonLd objects. Useful if a ticket document has multiple tickets.

> Script errors and empty arrays are both treated as "[]" (i.e. nothing was returned).

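The array form of the return value can be sketched in plain JavaScript as follows. This is a minimal illustration under stated assumptions: `extractLegs` is a hypothetical function, the input format "one `<from> -> <to>` line per leg" is made up for the example, and the result objects are built by hand here, whereas a real extractor script would normally start from `JsonLd.newTrainReservation()`.

```js
// Hypothetical sketch: returns one schema.org-style object per leg found in the text.
function extractLegs(text) {
    const results = [];
    for (const line of text.split("\n")) {
        const leg = line.match(/^(.+) -> (.+)$/);
        if (!leg) {
            continue; // not a leg line
        }
        results.push({
            "@type": "TrainReservation",
            reservationFor: {
                "@type": "TrainTrip",
                departureStation: { "@type": "TrainStation", name: leg[1] },
                arrivalStation: { "@type": "TrainStation", name: leg[2] },
            },
        });
    }
    return results; // an empty array means nothing was extracted
}

const legs = extractLegs("Konqi -> Katie West\nKatie West -> Konqi\n");
console.log(legs.length); // 2
console.log(legs[0].reservationFor.arrivalStation.name); // Katie West
```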
<details>
<summary>Examples</summary>
Let's assume we want to create an extractor script for a railway ticket which comes with a simple
tabular layout for a single leg per page, and contains a QR code with a 10 digit number for each leg.

```
Konqi -> Katie West
Departure: 21 Jun 18:42
Arrival: 21 Jun 23:12

[Big QR code]
```

As a filter we'd use something similar to example 2 above, triggering on the barcode content.

```js
function extractTicket(pdf, node, barcode) {
    // text for the PDF page containing the barcode that triggered this
    const text = pdf.pages[barcode.location].text;

    // empty http://schema.org/TrainReservation object for the result
    let res = JsonLd.newTrainReservation();

    // when using regular expressions, matching on things that don't change in different
    // language variants is usually preferable, but might not always be possible
    // when creating regular expressions consider that various special characters might occur in names
    // of people or locations (in the above example spaces and parenthesis)
    const leg = text.match(/(.*) -> (.*)/); // leg[1] = "Konqi", leg[2] = "Katie West"

    // this can throw an error if the regular expression didn't match
    // that's fine though, the script is aborted here and considered not to have any result
    // ie. handling this case explicitly is unnecessary here
    res.reservationFor.departureStation.name = leg[1]; // Konqi
    res.reservationFor.arrivalStation.name = leg[2]; // Katie West

    // date/time parsing can recover missing year numbers from context, if available
    // In our example it would consider the PDF creation time for that, and the resulting
    // date would be the first occurrence of the given day and month following that.
    // https://doc.qt.io/qt-6/qdate.html#fromString-1
    res.reservationFor.departureTime = JsonLd.toDateTime(
        text.match(/Departure: (.*)/)[1],
        "dd MMM hh:mm",
        "en",
    );

    // for supporting different language formats, both the format string and the locale
    // argument can be lists. All combinations are then tried until one yields a valid result.
    res.reservationFor.arrivalTime = JsonLd.toDateTime(
        text.match(/(?:Arrival|Arrivé|Ankunft): (.*)/)[1],
        ["dd MMM hh:mm", "dd MMM hh.mm"],
        ["en", "fr", "de"],
    );

    // the node that triggered this script (the barcode) can be accessed and integrated into the result
    res.reservedTicket.ticketToken = "qrCode:" + barcode.content;

    return res;
}
```

The above example produces an entirely new result. Another common case is a script that
merely augments an existing result. Let's assume an Apple Wallet pass for a flight where the
automatically extracted result is correct but misses the boarding group. The filter for
this would be similar to example 3 above, triggering on the pass issuer.

```js
// unused arguments can be omitted
function extractBoardingPass(pass, node) {
    // use the existing result as a starting point
    // generally this can be more than one, but specific types of documents
    // might only produce a deterministic amount (like 1 in this case).
    let res = node.result[0];

    // modify the result as necessary
    res.boardingGroup = pass.field["group"].label;

    // returning a result here will replace the existing results for this node
    return res;
}
```

</details>