Getting started with YAQL
=========================

Introduction to YAQL
--------------------

YAQL (Yet Another Query Language) is an embeddable and extensible query
language that allows performing complex queries against arbitrary data structures.
`Embeddable` means that you can easily integrate a YAQL query processor in your code. Queries come
from your DSLs (domain-specific languages), user input, JSON, and so on. YAQL has a
vast and comprehensive standard library of functions that can be used to query data of any complexity.
Also, YAQL can be extended even further with user-specified functions.
YAQL is written in Python and is distributed through PyPI.

YAQL was inspired by Microsoft LINQ for Objects and its first aim is to execute expressions
on the data in memory. A YAQL expression plays the same role as an SQL query to a database:
search and operate on the data. In general, any SQL query can be transformed to a YAQL expression,
but YAQL can also be used for computational statements. For example, `2 + 3*4` is a valid
YAQL expression.

Moreover, in YAQL, the following operations are supported out of the box:

* Complex data queries
* Creation and transformation of lists, dicts, and arrays
* String operations
* Basic math operations
* Conditional expressions
* Date and time operations (will be supported in yaql 1.1)

An interesting thing in YAQL is that everything is a function and any function can
be customized or overridden. This is true even for built-in functions.
YAQL cannot call any function that was not explicitly registered to be accessible
by YAQL. The same is true for operators.

YAQL can be used in two different ways: as an independent CLI tool, and as a
Python module.

Installation
------------

You can install YAQL in two different ways:

#. Using PyPI:

   .. code-block:: console

        pip install yaql

#. Using your system package manager (for example, on Ubuntu):

   .. code-block:: console

        sudo apt-get install python-yaql

HowTo: Use YAQL in Python
-------------------------

You can operate with YAQL from Python in three easy steps:

* Create a YAQL engine
* Parse a YAQL expression
* Execute the parsed expression

.. NOTE::
    The engine should be created once for a set of operators and parser rules. It can
    be reused for all queries.

Here is an example of how this can be done with a YAML file, ``shop.yaml``, that looks like this:

.. code-block:: yaml

      customers_city:
        - city: New York
          customer_id: 1
        - city: Saint Louis
          customer_id: 2
        - city: Mountain View
          customer_id: 3
      customers:
        - customer_id: 1
          name: John
          orders:
            - order_id: 1
              item: Guitar
              quantity: 1
        - customer_id: 2
          name: Paul
          orders:
            - order_id: 2
              item: Banjo
              quantity: 2
            - order_id: 3
              item: Piano
              quantity: 1
        - customer_id: 3
          name: Diana
          orders:
            - order_id: 4
              item: Drums
              quantity: 1

.. code-block:: python

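    import yaql
    import yaml

    # safe_load avoids constructing arbitrary Python objects from YAML,
    # and the context manager closes the file after reading.
    with open('shop.yaml', 'r') as f:
        data_source = yaml.safe_load(f)

    # Create the engine once; it can be reused for all queries.
    engine = yaql.factory.YaqlFactory().create()

    # Parse the YAQL expression.
    expression = engine(
        '$.customers.orders.selectMany($.where($.order_id = 4))')

    # Execute the parsed expression against the loaded data.
    order = expression.evaluate(data=data_source)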

The content of ``order`` will be the following:

.. code-block:: console

    [{u'item': u'Drums', u'order_id': 4, u'quantity': 1}]
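
To make the query easier to read, here is a rough plain-Python equivalent of the
same lookup. It is only a sketch of what the expression computes, not of how YAQL
evaluates it. The inline ``data_source`` literal is an assumption: a trimmed stand-in
for the parsed ``shop.yaml``.

.. code-block:: python

    # Trimmed stand-in for the data loaded from shop.yaml (assumption:
    # only the fields used by the query are kept).
    data_source = {
        'customers': [
            {'name': 'John',
             'orders': [{'order_id': 1, 'item': 'Guitar', 'quantity': 1}]},
            {'name': 'Paul',
             'orders': [{'order_id': 2, 'item': 'Banjo', 'quantity': 2},
                        {'order_id': 3, 'item': 'Piano', 'quantity': 1}]},
            {'name': 'Diana',
             'orders': [{'order_id': 4, 'item': 'Drums', 'quantity': 1}]},
        ]
    }

    # $.customers.orders -> one list of orders per customer
    orders_per_customer = [c['orders'] for c in data_source['customers']]

    # selectMany($.where($.order_id = 4)) -> filter each list, then flatten
    order = [o
             for orders in orders_per_customer
             for o in orders
             if o['order_id'] == 4]

    print(order)
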

YAQL grammar
------------

YAQL has a very simple grammar:

* Three keywords as in JSON: true, false, null
* Numbers, such as 12 and 34.5
* Strings: `'foo'` and `"bar"`
* Access to the data: $variable, $
* Binary and unary operators: 2 + 2, -1, 1 != 2, $list[1]

Data access
~~~~~~~~~~~

Although YAQL expressions may be self-sufficient, the most important value of YAQL
is its ability to operate on user-passed data. Such data is placed into variables
which are accessible in a YAQL expression as `$<variable_name>`. The `variable_name`
can contain numbers, English alphabetic characters, and underscore symbols. The `variable_name`
can also be empty, in which case you use just `$`. Variables can be set prior to executing
a YAQL expression or can be changed during the execution of some functions.

According to the convention in YAQL, function parameters, including input data,
are stored in variables like `$1`, `$2`, and so on. The `$` stands for `$1`.
For most cases, all function parameters are passed in one piece and can be accessed
using `$`, which is why this variable is the most used one in YAQL expressions.
Besides, some functions expect to get a YAQL expression as one of the
parameters (for example, a predicate for collection sorting). In this case,
the passed expression is granted access to the data by `$`.

Strings
~~~~~~~

In YAQL, strings can be enclosed in `"` and `'`. Both forms are equivalent and
support all standard escape symbols, including Unicode code points. Having both types
of quotes is useful when you need to include one type of quote inside the
other. In addition, backticks can be used to create a string in which the only possible
escape sequence is an escaped backtick (a backslash followed by a backtick).
This is especially suitable for regular expressions.

If a string does not start with a digit or `__` and contains only digits, `_`, and English letters,
it is called an identifier and can be used without quotes at all. An identifier can be used
as the name of a function, parameter, or property, as in `$obj.property`.

Functions
~~~~~~~~~

A function call has the syntax `functionName(functionParameters)`. Parentheses are required
even if there are no parameters. In YAQL, there are two types of parameters:

* Positional parameters
   ``foo(1, 2, someValue)``
* Named parameters
   ``foo(paramName1 => value1, paramName2 => 123)``

Also, a function can be called using both positional and named parameters: ``foo(1, false, param => null)``.
In this case, named arguments must be written after positional arguments. In
``name => value``, `name` must be a valid identifier and must match the name of
a parameter in the function definition. Usually, arguments can be passed in both ways,
but named-only parameters are also supported in YAQL, since Python 3 supports them.

Parameters can have default values. Named parameters are a good way to pass only the needed
parameters and skip arguments that can use default values. You can also simply
skip parameters in a function call: ``foo(1,,3)``.

In YAQL, there are three types of functions:

* Regular functions: ``max(1,2)``
* Method-like functions, which are called by specifying an object for which the
   function is called, followed by a dot and a function call: ``stringValue.toUpper()``
* Extension methods, which can be called both ways: ``len(string)``, ``string.len()``

The YAQL standard library contains hundreds of functions which belong to one of these types.
Moreover, applications can add new functions and override functions from the standard library.

Operators
~~~~~~~~~

YAQL supports the following types of operators out of the box:

* Arithmetic: `+`, `-`, `*`, `/`, `mod`
* Logical: `=`, `!=`, `>=`, `<=`, `and`, `or`, `not`
* Regexp operations: `=~`, `!~`
* Method call, call to the attribute: `.`, `?.`
* Context pass: `->`
* Indexing: `[ ]`
* Membership test operations: `in`

Data structures
~~~~~~~~~~~~~~~

YAQL supports these types out of the box:


* Scalars

   YAQL supports such types as string, int, and boolean. Datetime and timespan
   will be available after the yaql 1.1 release.

* Lists

   List creation: ``[1, 2, value, true]``.
   Alternative syntax: ``list(1, 2, value, true)``.
   List elements can be accessed by index: ``$list[0]``.

* Dictionaries

   Dict creation: ``{key1 => value1, true => 1, 0 => false}``
   Alternative syntax: ``dict(key1 => value1, true => 1, 0 => false)``
   Dictionaries can be indexed by keys: ``$dict[key]``. An exception is raised
   if the key is missing from the dictionary. You can also specify a value to
   be returned if the key is not in the dictionary: ``dict.get(key, default)``.

   .. NOTE::
      During iteration through the dictionary, `key` can be called like: ``$.key``

* (Optional) Sets

   Set creation: ``set(1, 2, value, true)``

.. NOTE::
   YAQL is designed to keep input data unchanged. All functions that
   look as if they change data actually return an updated copy and keep the original
   data unchanged. This is one reason why YAQL is thread-safe.

Basic YAQL query operations
---------------------------

YAQL can naturally be compared with SQL, as both are designed to solve
similar tasks. Here we will take a look at the YAQL functions that have a direct
equivalent in SQL.

We will use the YAML from `HowTo: Use YAQL in Python`_ as the data source in our examples.


Filtering
~~~~~~~~~

.. NOTE::

    Analog is SQL WHERE

The most common query against data sets is filtering. This type of
query returns only the elements for which the filtering condition is true. In YAQL,
we use ``where`` to apply filtering queries.

.. code-block:: console

    yaql> $.customers.where($.name = John)

.. code-block:: yaml

      - customer_id: 1
        name: John
        orders:
          - order_id: 1
            item: Guitar
            quantity: 1
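
For readers more familiar with Python than with query languages, the filter above
can be read as a list comprehension. This is a plain-Python analogue under the
assumption that the data has already been loaded into a list of dicts; it does
not use YAQL itself, and the ``customers`` literal is trimmed to the fields that
matter here.

.. code-block:: python

    # Trimmed copy of the sample data (assumption: orders omitted for brevity).
    customers = [
        {'customer_id': 1, 'name': 'John'},
        {'customer_id': 2, 'name': 'Paul'},
        {'customer_id': 3, 'name': 'Diana'},
    ]

    # Analogue of $.customers.where($.name = John):
    # keep only the elements for which the predicate is true.
    johns = [c for c in customers if c['name'] == 'John']

    print(johns)
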


Ordering
~~~~~~~~

.. NOTE::

    Analog is SQL ORDER BY

It may be necessary to sort the data returned by a YAQL query. The ``orderBy`` clause causes
the elements in the returned sequence to be sorted according to the default comparer
for the type being sorted. For example, the following query sorts
the customers by the ``name`` property.

.. code-block:: console

    yaql> $.customers.orderBy($.name)

.. code-block:: yaml

      - customer_id: 3
        name: Diana
        orders:
          - order_id: 4
            item: Drums
            quantity: 1
      - customer_id: 1
        name: John
        orders:
          - order_id: 1
            item: Guitar
            quantity: 1
      - customer_id: 2
        name: Paul
        orders:
          - order_id: 2
            item: Banjo
            quantity: 2
          - order_id: 3
            item: Piano
            quantity: 1
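
In plain Python, the same ordering could be sketched with ``sorted`` and a key
function. This is only an analogue of what the query returns, with a trimmed
``customers`` list assumed in place of the full sample data.

.. code-block:: python

    # Trimmed copy of the sample data (assumption: orders omitted for brevity).
    customers = [
        {'customer_id': 1, 'name': 'John'},
        {'customer_id': 2, 'name': 'Paul'},
        {'customer_id': 3, 'name': 'Diana'},
    ]

    # Analogue of $.customers.orderBy($.name):
    # sort by the key computed for each element; the input list is untouched.
    by_name = sorted(customers, key=lambda c: c['name'])

    print([c['name'] for c in by_name])
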

Grouping
~~~~~~~~

.. NOTE::

    Analog is SQL GROUP BY

The ``groupBy`` clause allows you to group the results according to the key you specify.
For example, the sample data can be grouped by ``name``.

.. code-block:: console

    yaql> $.customers.groupBy($.name)

.. code-block:: yaml

        - Diana:
          - customer_id: 3
            name: Diana
            orders:
              - order_id: 4
                item: Drums
                quantity: 1
        - Paul:
          - customer_id: 2
            name: Paul
            orders:
              - order_id: 2
                item: Banjo
                quantity: 2
              - order_id: 3
                item: Piano
                quantity: 1
        - John:
          - customer_id: 1
            name: John
            orders:
              - order_id: 1
                item: Guitar
                quantity: 1

Here you can see the difference between ``groupBy`` and ``orderBy``. We use
the same parameter `name` for both operations, but in the output of ``groupBy``
the `name` value additionally appears as the group key, before the grouped elements.
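
The grouping idea can be sketched in plain Python by collecting elements under
their key. This is an analogue only: the exact container that YAQL returns and the
order of the groups may differ, and the ``customers`` literal is a trimmed stand-in
for the sample data.

.. code-block:: python

    # Trimmed copy of the sample data (assumption: orders omitted for brevity).
    customers = [
        {'customer_id': 1, 'name': 'John'},
        {'customer_id': 2, 'name': 'Paul'},
        {'customer_id': 3, 'name': 'Diana'},
    ]

    # Analogue of $.customers.groupBy($.name):
    # every element is appended to the list that belongs to its key.
    groups = {}
    for customer in customers:
        groups.setdefault(customer['name'], []).append(customer)

    for name, members in groups.items():
        print(name, members)
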

Selecting
~~~~~~~~~

.. NOTE::

    Analog is SQL SELECT

The ``select`` method allows building new objects out of objects of some collection.
In the following example, the result will contain a list of name/orders pairs.

.. code-block:: console

    yaql> $.customers.select([$.name, $.orders])

.. code-block:: yaml

        - John:
          - order_id: 1
            item: Guitar
            quantity: 1
        - Paul:
          - order_id: 2
            item: Banjo
            quantity: 2
          - order_id: 3
            item: Piano
            quantity: 1
        - Diana:
          - order_id: 4
            item: Drums
            quantity: 1
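
A plain-Python way to think about this projection is a comprehension that builds
one new object per element. The sketch below assumes the sample data is already
available as a list of dicts and does not call YAQL.

.. code-block:: python

    # Copy of the relevant part of the sample data.
    customers = [
        {'name': 'John',
         'orders': [{'order_id': 1, 'item': 'Guitar', 'quantity': 1}]},
        {'name': 'Paul',
         'orders': [{'order_id': 2, 'item': 'Banjo', 'quantity': 2},
                    {'order_id': 3, 'item': 'Piano', 'quantity': 1}]},
        {'name': 'Diana',
         'orders': [{'order_id': 4, 'item': 'Drums', 'quantity': 1}]},
    ]

    # Analogue of $.customers.select([$.name, $.orders]):
    # build a [name, orders] pair for every customer.
    pairs = [[c['name'], c['orders']] for c in customers]

    for name, orders in pairs:
        print(name, [o['item'] for o in orders])
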

Joining
~~~~~~~

.. NOTE::

    Analog is SQL JOIN

The ``join`` method creates a new collection by joining two other collections by
some condition.

.. code-block:: console

    yaql> $.customers.join($.customers_city, $1.customer_id = $2.customer_id, {customer=>$1.name, city=>$2.city, orders=>$1.orders})

.. code-block:: yaml

      - customer: John
        city: New York
        orders:
          - order_id: 1
            item: Guitar
            quantity: 1
      - customer: Paul
        city: Saint Louis
        orders:
          - order_id: 2
            item: Banjo
            quantity: 2
          - order_id: 3
            item: Piano
            quantity: 1
      - customer: Diana
        city: Mountain View
        orders:
          - order_id: 4
            item: Drums
            quantity: 1
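
The three arguments of ``join`` (the second collection, the predicate, and the
selector) map onto a nested comprehension in plain Python. The following is a
sketch of that reading, not of YAQL's implementation; both literals are trimmed
stand-ins for the sample data.

.. code-block:: python

    # Trimmed copies of the two collections (assumption: only order items kept).
    customers = [
        {'customer_id': 1, 'name': 'John', 'orders': ['Guitar']},
        {'customer_id': 2, 'name': 'Paul', 'orders': ['Banjo', 'Piano']},
        {'customer_id': 3, 'name': 'Diana', 'orders': ['Drums']},
    ]
    customers_city = [
        {'city': 'New York', 'customer_id': 1},
        {'city': 'Saint Louis', 'customer_id': 2},
        {'city': 'Mountain View', 'customer_id': 3},
    ]

    joined = [
        # selector: {customer=>$1.name, city=>$2.city, orders=>$1.orders}
        {'customer': c['name'], 'city': cc['city'], 'orders': c['orders']}
        for c in customers            # $1 iterates over the first collection
        for cc in customers_city      # $2 iterates over the second collection
        # predicate: $1.customer_id = $2.customer_id
        if c['customer_id'] == cc['customer_id']
    ]

    for row in joined:
        print(row['customer'], row['city'], row['orders'])
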


Take an element from collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

YAQL supports two general methods that help you take elements from a collection:
``skip`` and ``take``.

.. code-block:: console

    yaql> $.customers.skip(1).take(2)

.. code-block:: yaml

      - customer_id: 2
        name: Paul
        orders:
          - order_id: 2
            item: Banjo
            quantity: 2
          - order_id: 3
            item: Piano
            quantity: 1
      - customer_id: 3
        name: Diana
        orders:
          - order_id: 4
            item: Drums
            quantity: 1
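
Chaining ``skip`` and ``take`` corresponds to taking a window of the sequence. In
plain Python this could be sketched with ``itertools.islice``; the ``customers``
literal is a trimmed stand-in for the sample data, and the code does not call YAQL.

.. code-block:: python

    from itertools import islice

    # Trimmed copy of the sample data (assumption: orders omitted for brevity).
    customers = [
        {'customer_id': 1, 'name': 'John'},
        {'customer_id': 2, 'name': 'Paul'},
        {'customer_id': 3, 'name': 'Diana'},
    ]

    # Analogue of $.customers.skip(1).take(2):
    # drop the first element, then keep at most two of the remaining ones.
    page = list(islice(customers, 1, 1 + 2))

    print([c['name'] for c in page])
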

First element of collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``first`` method will return the first element of a collection.

.. code-block:: console

    yaql> $.customers.first()

.. code-block:: yaml

    - customer_id: 1
      name: John
      orders:
        - order_id: 1
          item: Guitar
          quantity: 1