File: queryset.rst

package info (click to toggle)
python-cassandra-driver 3.29.2-6
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 5,144 kB
  • sloc: python: 51,532; ansic: 768; makefile: 138; sh: 13
file content (419 lines) | stat: -rw-r--r-- 12,857 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
==============
Making Queries
==============

.. module:: cqlengine.queryset

Retrieving objects
==================
Once you've populated Cassandra with data, you'll probably want to retrieve some of it. This is accomplished with QuerySet objects. This section will describe how to use QuerySet objects to retrieve the data you're looking for.

Retrieving all objects
----------------------
The simplest query you can make is to return all objects from a table.

This is accomplished with the ``.all()`` method, which returns a QuerySet of all objects in a table

Using the Person example model, we would get all Person objects like this:

.. code-block:: python

    all_objects = Person.objects.all()

.. _retrieving-objects-with-filters:

Retrieving objects with filters
-------------------------------
Typically, you'll want to query only a subset of the records in your database.

That can be accomplished with the QuerySet's ``.filter(\*\*)`` method.

For example, given the model definition:

.. code-block:: python

    class Automobile(Model):
        manufacturer = columns.Text(primary_key=True)
        year = columns.Integer(primary_key=True)
        model = columns.Text()
        price = columns.Decimal()
        options = columns.Set(columns.Text)

...and assuming the Automobile table contains a record of every car model manufactured in the last 20 years or so, we can retrieve only the cars made by a single manufacturer like this:


.. code-block:: python

    q = Automobile.objects.filter(manufacturer='Tesla')

You can also use the more convenient syntax:

.. code-block:: python

    q = Automobile.objects(Automobile.manufacturer == 'Tesla')

We can then further filter our query with another call to **.filter**

.. code-block:: python

    q = q.filter(year=2012)

*Note: all queries involving any filtering MUST define either an '=' or an 'in' relation to either a primary key column, or an indexed column.*

Accessing objects in a QuerySet
===============================

There are several methods for getting objects out of a queryset

* iterating over the queryset
    .. code-block:: python

        for car in Automobile.objects.all():
            #...do something to the car instance
            pass

* list index
    .. code-block:: python

        q = Automobile.objects.all()
        q[0] #returns the first result
        q[1] #returns the second result

    .. note::

        * CQL does not support specifying a start position in it's queries. Therefore, accessing elements using array indexing will load every result up to the index value requested
        * Using negative indices requires a "SELECT COUNT()" to be executed. This has a performance cost on large datasets.

* list slicing
    .. code-block:: python

        q = Automobile.objects.all()
        q[1:] #returns all results except the first
        q[1:9] #returns a slice of the results

    .. note::

        * CQL does not support specifying a start position in it's queries. Therefore, accessing elements using array slicing will load every result up to the index value requested
        * Using negative indices requires a "SELECT COUNT()" to be executed. This has a performance cost on large datasets.

* calling :attr:`get() <query.QuerySet.get>` on the queryset
    .. code-block:: python

        q = Automobile.objects.filter(manufacturer='Tesla')
        q = q.filter(year=2012)
        car = q.get()

    this returns the object matching the queryset

* calling :attr:`first() <query.QuerySet.first>` on the queryset
    .. code-block:: python

        q = Automobile.objects.filter(manufacturer='Tesla')
        q = q.filter(year=2012)
        car = q.first()

    this returns the first value in the queryset

.. _query-filtering-operators:

Filtering Operators
===================

:attr:`Equal To <query.QueryOperator.EqualsOperator>`

The default filtering operator.

.. code-block:: python

    q = Automobile.objects.filter(manufacturer='Tesla')
    q = q.filter(year=2012)  #year == 2012

In addition to simple equal to queries, cqlengine also supports querying with other operators by appending a ``__<op>`` to the field name on the filtering call

:attr:`in (__in) <query.QueryOperator.InOperator>`

.. code-block:: python

    q = Automobile.objects.filter(manufacturer='Tesla')
    q = q.filter(year__in=[2011, 2012])


:attr:`> (__gt) <query.QueryOperator.GreaterThanOperator>`

.. code-block:: python

    q = Automobile.objects.filter(manufacturer='Tesla')
    q = q.filter(year__gt=2010)  # year > 2010

    # or the nicer syntax

    q.filter(Automobile.year > 2010)

:attr:`>= (__gte) <query.QueryOperator.GreaterThanOrEqualOperator>`

.. code-block:: python

    q = Automobile.objects.filter(manufacturer='Tesla')
    q = q.filter(year__gte=2010)  # year >= 2010

    # or the nicer syntax

    q.filter(Automobile.year >= 2010)

:attr:`< (__lt) <query.QueryOperator.LessThanOperator>`

.. code-block:: python

    q = Automobile.objects.filter(manufacturer='Tesla')
    q = q.filter(year__lt=2012)  # year < 2012

    # or...

    q.filter(Automobile.year < 2012)

:attr:`<= (__lte) <query.QueryOperator.LessThanOrEqualOperator>`

.. code-block:: python

    q = Automobile.objects.filter(manufacturer='Tesla')
    q = q.filter(year__lte=2012)  # year <= 2012

    q.filter(Automobile.year <= 2012)

:attr:`CONTAINS (__contains) <query.QueryOperator.ContainsOperator>`

The CONTAINS operator is available for all collection types (List, Set, Map).

.. code-block:: python

    q = Automobile.objects.filter(manufacturer='Tesla')
    q.filter(options__contains='backup camera').allow_filtering()

Note that we need to use allow_filtering() since the *options* column has no secondary index.

:attr:`LIKE (__like) <query.QueryOperator.LikeOperator>`

The LIKE operator is available for text columns that have a SASI secondary index.

.. code-block:: python

    q = Automobile.objects.filter(model__like='%Civic%').allow_filtering()

:attr:`IS NOT NULL (IsNotNull(column_name)) <statement.IsNotNull>`

The IS NOT NULL operator is not yet supported for C*.

.. code-block:: python

    q = Automobile.objects.filter(IsNotNull('model'))

Limitations:

- Currently, cqlengine does not support SASI index creation. To use this feature, you need to create the SASI index using the core driver.
- Queries using LIKE must use allow_filtering() since the *model* column has no standard secondary index. Note that the server will use the SASI index properly when executing the query.

TimeUUID Functions
==================

In addition to querying using regular values, there are two functions you can pass in when querying TimeUUID columns to help make filtering by them easier. Note that these functions don't actually return a value, but instruct the cql interpreter to use the functions in it's query.

.. class:: MinTimeUUID(datetime)

    returns the minimum time uuid value possible for the given datetime

.. class:: MaxTimeUUID(datetime)

    returns the maximum time uuid value possible for the given datetime

*Example*

.. code-block:: python

    class DataStream(Model):
        id      = columns.UUID(partition_key=True)
        time    = columns.TimeUUID(primary_key=True)
        data    = columns.Bytes()

    min_time = datetime(1982, 1, 1)
    max_time = datetime(1982, 3, 9)

    DataStream.filter(time__gt=functions.MinTimeUUID(min_time), time__lt=functions.MaxTimeUUID(max_time))

Token Function
==============

Token functon may be used only on special, virtual column pk__token, representing token of partition key (it also works for composite partition keys).
Cassandra orders returned items by value of partition key token, so using cqlengine.Token we can easy paginate through all table rows.

See http://cassandra.apache.org/doc/cql3/CQL-3.0.html#tokenFun

*Example*

.. code-block:: python

    class Items(Model):
        id      = columns.Text(primary_key=True)
        data    = columns.Bytes()

    query = Items.objects.all().limit(10)

    first_page = list(query);
    last = first_page[-1]
    next_page = list(query.filter(pk__token__gt=cqlengine.Token(last.pk)))

QuerySets are immutable
=======================

When calling any method that changes a queryset, the method does not actually change the queryset object it's called on, but returns a new queryset object with the attributes of the original queryset, plus the attributes added in the method call.

*Example*

.. code-block:: python

    #this produces 3 different querysets
    #q does not change after it's initial definition
    q = Automobiles.objects.filter(year=2012)
    tesla2012 = q.filter(manufacturer='Tesla')
    honda2012 = q.filter(manufacturer='Honda')

Ordering QuerySets
==================

Since Cassandra is essentially a distributed hash table on steroids, the order you get records back in will not be particularly predictable.

However, you can set a column to order on with the ``.order_by(column_name)`` method.

*Example*

.. code-block:: python

    #sort ascending
    q = Automobiles.objects.all().order_by('year')
    #sort descending
    q = Automobiles.objects.all().order_by('-year')

*Note: Cassandra only supports ordering on a clustering key. In other words, to support ordering results, your model must have more than one primary key, and you must order on a primary key, excluding the first one.*

*For instance, given our Automobile model, year is the only column we can order on.*

Values Lists
============

There is a special QuerySet's method ``.values_list()`` - when called, QuerySet returns lists of values instead of model instances. It may significantly speedup things with lower memory footprint for large responses.
Each tuple contains the value from the respective field passed into the ``values_list()`` call — so the first item is the first field, etc. For example:

.. code-block:: python

    items = list(range(20))
    random.shuffle(items)
    for i in items:
        TestModel.create(id=1, clustering_key=i)

    values = list(TestModel.objects.values_list('clustering_key', flat=True))
    # [19L, 18L, 17L, 16L, 15L, 14L, 13L, 12L, 11L, 10L, 9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L, 0L]

Per Query Timeouts
===================

By default all queries are executed with the timeout defined in `~cqlengine.connection.setup()`
The examples below show how to specify a per-query timeout.
A timeout is specified in seconds and can be an int, float or None.
None means no timeout.


.. code-block:: python

    class Row(Model):
        id = columns.Integer(primary_key=True)
        name = columns.Text()


Fetch all objects with a timeout of 5 seconds

.. code-block:: python

    Row.objects().timeout(5).all()

Create a single row with a 50ms timeout

.. code-block:: python

    Row(id=1, name='Jon').timeout(0.05).create()

Delete a single row with no timeout

.. code-block:: python

    Row(id=1).timeout(None).delete()

Update a single row with no timeout

.. code-block:: python

    Row(id=1).timeout(None).update(name='Blake')

Batch query timeouts

.. code-block:: python

    with BatchQuery(timeout=10) as b:
        Row(id=1, name='Jon').create()


NOTE: You cannot set both timeout and batch at the same time, batch will use the timeout defined in it's constructor.
Setting the timeout on the model is meaningless and will raise an AssertionError.


.. _ttl-change:

Default TTL and Per Query TTL
=============================

Model default TTL now relies on the *default_time_to_live* feature, introduced in Cassandra 2.0. It is not handled anymore in the CQLEngine Model (cassandra-driver >=3.6). You can set the default TTL of a table like this:

Example:

.. code-block:: python

    class User(Model):
        __options__ = {'default_time_to_live': 20}

        user_id = columns.UUID(primary_key=True)
        ...

You can set TTL per-query if needed. Here are a some examples:

Example:

.. code-block:: python

    class User(Model):
        __options__ = {'default_time_to_live': 20}

        user_id = columns.UUID(primary_key=True)
        ...

    user = User.objects.create(user_id=1)  # Default TTL 20 will be set automatically on the server

    user.ttl(30).update(age=21)            # Update the TTL to 30
    User.objects.ttl(10).create(user_id=1)  # TTL 10
    User(user_id=1, age=21).ttl(10).save()  # TTL 10


Named Tables
===================

Named tables are a way of querying a table without creating an class.  They're useful for querying system tables or exploring an unfamiliar database.


.. code-block:: python

    from cassandra.cqlengine.connection import setup
    setup("127.0.0.1", "cqlengine_test")

    from cassandra.cqlengine.named import NamedTable
    user = NamedTable("cqlengine_test", "user")
    user.objects()
    user.objects()[0]

    # {u'pk': 1, u't': datetime.datetime(2014, 6, 26, 17, 10, 31, 774000)}