# Package redisearch Documentation
## Overview
`redisearch-py` is a Python search engine library that uses the RediSearch Redis module API.
It is the "official" client of RediSearch and should be regarded as its canonical client implementation.
The source code can be found at [http://github.com/RedisLabs/redisearch-py](http://github.com/RedisLabs/redisearch-py).
### Example: Using the Python Client
```py
from redisearch import Client, TextField, NumericField, Query
# Creating a client with a given index name
client = Client('myIndex')
# Creating the index definition and schema
client.create_index([TextField('title', weight=5.0), TextField('body')])
# Indexing a document
client.add_document('doc1', title = 'RediSearch', body = 'Redisearch implements a search engine on top of redis')
# Simple search
res = client.search("search engine")
# The result has the total number of results and a list of documents
print(res.total)  # 1
print(res.docs[0].title)
# Searching with snippets
res = client.search("search engine", snippet_sizes = {'body': 50})
# Searching with complex parameters:
q = Query("search engine").verbatim().no_content().paging(0,5)
res = client.search(q)
```
### Example: Using the Auto Completer Client
```py
from redisearch import AutoCompleter, Suggestion
# Using the auto-completer
ac = AutoCompleter('ac')
# Adding some terms
ac.add_suggestions(Suggestion('foo', 5.0), Suggestion('bar', 1.0))
# Getting suggestions
suggs = ac.get_suggestions('goo') # no stored term starts with 'goo', so nothing is returned
suggs = ac.get_suggestions('goo', fuzzy = True) # returns the 'foo' suggestion
```
### Installing
1. Install Redis 4.0 RC2 or later
2. [Install RediSearch](http://redisearch.io/Quick_Start/#building-and-running)
3. Install the python client
```sh
$ pip install redisearch
```
## Class AutoCompleter
A client for RediSearch's AutoCompleter API.
It provides prefix searches with optional fuzzy matching of prefixes.
### \_\_init\_\_
```py
def __init__(self, key, host='localhost', port=6379, conn=None)
```
Create a new AutoCompleter client for the given key, with an optional host and port.
If conn is not None, the client uses that existing Redis connection instead.
### add\_suggestions
```py
def add_suggestions(self, *suggestions, **kwargs)
```
Add suggestion terms to the AutoCompleter engine. Each suggestion has a score and a string.
If kwargs['increment'] is true and the terms are already in the server's dictionary, their scores are incremented.
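
The difference between adding with and without `increment` can be illustrated with a small pure-Python sketch. It runs locally and does not talk to Redis. The helper `add_suggestion` and the `scores` dictionary are hypothetical stand-ins for the server's dictionary, and the sketch assumes that adding an existing term without `increment` overwrites its score.

```py
def add_suggestion(scores, string, score, increment=False):
    """Store `score` for `string`, or add to the existing score when incrementing."""
    if increment and string in scores:
        scores[string] += score
    else:
        # Assumption: without increment, an existing score is replaced.
        scores[string] = score
    return scores[string]


scores = {}
add_suggestion(scores, 'foo', 5.0)
add_suggestion(scores, 'foo', 1.0)                  # replaces: 'foo' -> 1.0
add_suggestion(scores, 'foo', 2.5, increment=True)  # increments: 'foo' -> 3.5
print(scores)  # {'foo': 3.5}
```

With the real client, the equivalent call would look like `ac.add_suggestions(Suggestion('foo', 2.5), increment=True)`.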
### delete
```py
def delete(self, string)
```
Delete a string from the AutoCompleter index.
Returns 1 if the string was found and deleted, 0 otherwise
### get\_suggestions
```py
def get_suggestions(self, prefix, fuzzy=False, num=10, with_scores=False, with_payloads=False)
```
Get a list of suggestions from the AutoCompleter, for a given prefix
### Parameters:
- **prefix**: the prefix we are searching. **Must be valid ASCII or UTF-8**
- **fuzzy**: if set to true, the prefix search is done in fuzzy mode.
**NOTE**: Running fuzzy searches on short (<3 letters) prefixes can be very slow, and may even scan the entire index.
- **with_scores**: if set to true, we also return the (refactored) score of each suggestion.
This is normally not needed, and is NOT the original score inserted into the index.
- **with_payloads**: if set to true, we also return the payload of each suggestion.
- **num**: the maximum number of results we return. Note that we might return fewer, because the algorithm trims irrelevant suggestions.

Returns a list of Suggestion objects. If with_scores was False, the score of all suggestions is 1.
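
To make the `fuzzy` flag more concrete, here is a minimal pure-Python sketch of fuzzy prefix matching. It assumes that fuzzy mode accepts stored terms having some prefix within Levenshtein distance 1 of the query prefix. The helpers `edit_distance` and `fuzzy_prefix_match` are hypothetical and are not part of `redisearch-py`; the real matching happens server-side and is not implemented this way.

```py
def edit_distance(a, b):
    """Dynamic-programming Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution
        prev = curr
    return prev[-1]


def fuzzy_prefix_match(prefix, term, max_distance=1):
    """True if some prefix of `term` is within `max_distance` edits of `prefix`."""
    return any(edit_distance(prefix, term[:k]) <= max_distance
               for k in range(len(term) + 1))


terms = ['foo', 'bar']
exact = [t for t in terms if t.startswith('goo')]
fuzzy = [t for t in terms if fuzzy_prefix_match('goo', t)]
print(exact)  # []
print(fuzzy)  # ['foo']
```

The sketch also hints at why short prefixes are expensive in fuzzy mode: the shorter the prefix, the more stored terms fall within one edit of it.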
### len
```py
def len(self)
```
Return the number of entries in the AutoCompleter index
## Class Client
A client for the RediSearch module.
It abstracts the API of the module and lets you just use the engine
### \_\_init\_\_
```py
def __init__(self, index_name, host='localhost', port=6379, conn=None)
```
Create a new Client for the given index_name, with an optional host and port.
If conn is not None, the client uses that existing Redis connection instead.
### add\_document
```py
def add_document(self, doc_id, nosave=False, score=1.0, payload=None, replace=False, partial=False, language=None, **fields)
```
Add a single document to the index.
### Parameters
- **doc_id**: the id of the saved document.
- **nosave**: if set to true, we just index the document, and don't save a copy of it. This means that searches will just return ids.
- **score**: the document ranking, between 0.0 and 1.0
- **payload**: optional inner-index payload we can save for fast access in scoring functions
- **replace**: if True, and the document already is in the index, we perform an update and reindex the document
- **partial**: if True, the fields specified will be added to the existing document.
This has the added benefit that any fields specified with `no_index`
will not be reindexed again. Implies `replace`
- **language**: Specify the language used for document tokenization.
- **fields**: kwargs dictionary of the document fields to be saved and/or indexed.
NOTE: Geo points should be encoded as strings of "lon,lat"
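
The "lon,lat" encoding for geo points can be produced with a small helper. The sketch below is only an illustration: `encode_geo_point` is a hypothetical function, not part of `redisearch-py`, and its range checks are a generic sanity check on coordinates (the server may enforce tighter limits).

```py
def encode_geo_point(lon, lat):
    """Encode a coordinate pair as a "lon,lat" string (longitude first)."""
    if not -180.0 <= lon <= 180.0:
        raise ValueError('longitude out of range: %r' % (lon,))
    if not -90.0 <= lat <= 90.0:
        raise ValueError('latitude out of range: %r' % (lat,))
    return '%s,%s' % (lon, lat)


location = encode_geo_point(-122.4194, 37.7749)
print(location)  # -122.4194,37.7749
```

The resulting string could then be passed as a field value, for example `client.add_document('doc1', location=location)`, assuming the schema defines a geo field named `location`.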
### add\_document\_hash
```py
def add_document_hash(self, doc_id, score=1.0, language=None, replace=False)
```
Add a hash document to the index.
### Parameters
- **doc_id**: the document's id. This has to be an existing HASH key in Redis that will hold the fields the index needs.
- **score**: the document ranking, between 0.0 and 1.0
- **replace**: if True, and the document already is in the index, we perform an update and reindex the document
- **language**: Specify the language used for document tokenization.
### aggregate
```py
def aggregate(self, query)
```
Issue an aggregation query
### Parameters
- **query**: This can be either an `AggregateRequest` or a `Cursor`.

An `AggregateResult` object is returned. You can access the rows from its
`rows` property, which will always yield the rows of the result.
### alter\_schema\_add
```py
def alter_schema_add(self, fields)
```
Alter the existing search index by adding new fields. The index must already exist.
### Parameters:
- **fields**: a list of Field objects to add for the index
### batch\_indexer
```py
def batch_indexer(self, chunk_size=100)
```
Create a new batch indexer from the client with a given chunk size
### create\_index
```py
def create_index(self, fields, no_term_offsets=False, no_field_flags=False, stopwords=None)
```
Create the search index. The index must not already exist.
### Parameters:
- **fields**: a list of TextField or NumericField objects
- **no_term_offsets**: If true, we will not save term offsets in the index
- **no_field_flags**: If true, we will not save field flags that allow searching in specific fields
- **stopwords**: If not None, we create the index with this custom stopword list. The list can be empty
### delete\_document
```py
def delete_document(self, doc_id, conn=None, delete_actual_document=False)
```
Delete a document from the index.
Returns 1 if the document was deleted, 0 if not.
### Parameters
- **delete_actual_document**: if set to True, RediSearch also deletes the actual document if it is in the index
### drop\_index
```py
def drop_index(self)
```
Drop the index if it exists
### explain
```py
def explain(self, query)
```
### info
```py
def info(self)
```
Get info and stats about the current index, including the number of documents, memory consumption, etc.
### load\_document
```py
def load_document(self, id)
```
Load a single document by id
### search
```py
def search(self, query)
```
Search the index for a given query, and return a result of documents
### Parameters
- **query**: the search query. Either a text for simple queries with default parameters, or a Query object for complex queries.
See RediSearch's documentation on query format
- **snippet_sizes**: A dictionary of {field: snippet_size} used to trim and format the result, e.g. {'body': 500}
## Class BatchIndexer
A batch indexer allows you to automatically batch
document indexing in pipelines, flushing it every N documents.
### \_\_init\_\_
```py
def __init__(self, client, chunk_size=1000)
```
### add\_document
```py
def add_document(self, doc_id, nosave=False, score=1.0, payload=None, replace=False, partial=False, **fields)
```
Add a document to the batch query
### add\_document\_hash
```py
def add_document_hash(self, doc_id, score=1.0, language=None, replace=False)
```
Add a hash document to the batch query
### commit
```py
def commit(self)
```
Manually commit and flush the batch indexing query
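
The "flush every N documents, then commit the remainder" behaviour described for this class can be sketched in plain Python. `ChunkedBuffer` is a hypothetical stand-in used only to show the batching pattern; it is not the library's implementation and it does not use Redis pipelines.

```py
class ChunkedBuffer(object):
    """Hypothetical stand-in that illustrates chunked flushing."""

    def __init__(self, flush, chunk_size=1000):
        self._flush = flush            # callable that receives one batch
        self._chunk_size = chunk_size
        self._pending = []

    def add(self, item):
        self._pending.append(item)
        # Flush automatically once a full chunk has accumulated.
        if len(self._pending) >= self._chunk_size:
            self.commit()

    def commit(self):
        # Send whatever is pending, even if the chunk is not full.
        if self._pending:
            self._flush(self._pending)
            self._pending = []


batches = []
buf = ChunkedBuffer(batches.append, chunk_size=2)
for doc_id in ['doc1', 'doc2', 'doc3', 'doc4', 'doc5']:
    buf.add(doc_id)
buf.commit()  # flush the final, partial chunk
print(batches)  # [['doc1', 'doc2'], ['doc3', 'doc4'], ['doc5']]
```

With the real client the same pattern would be `indexer = client.batch_indexer(chunk_size=100)`, followed by repeated `indexer.add_document(...)` calls and a final `indexer.commit()` so that a trailing partial batch is not left unsent.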
## Class Document
Represents a single document in a result set
### \_\_init\_\_
```py
def __init__(self, id, payload=None, **fields)
```
## Class GeoField
GeoField is used to define a geo-indexing field in a schema definition
### \_\_init\_\_
```py
def __init__(self, name)
```
### redis\_args
```py
def redis_args(self)
```
## Class GeoFilter
### \_\_init\_\_
```py
def __init__(self, field, lon, lat, radius, unit='km')
```
## Class NumericField
NumericField is used to define a numeric field in a schema definition
### \_\_init\_\_
```py
def __init__(self, name, sortable=False, no_index=False)
```
### redis\_args
```py
def redis_args(self)
```
## Class NumericFilter
### \_\_init\_\_
```py
def __init__(self, field, minval, maxval, minExclusive=False, maxExclusive=False)
```
## Class Query
Query is used to build complex queries that have more parameters than just the query string.
The query string is set in the constructor, and other options have setter functions.
The setter functions return the query object, so they can be chained,
i.e. `Query("foo").verbatim().filter(...)` etc.
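
Chaining works because each setter returns the object it was called on. The following is a minimal sketch of that pattern under simplified assumptions: `SketchQuery` and its `describe` method are hypothetical and only mimic a few of the setters listed below; this is not the library's `Query` class.

```py
class SketchQuery(object):
    """Hypothetical, simplified builder; not the library's Query class."""

    def __init__(self, query_string):
        self._query_string = query_string
        self._verbatim = False
        self._no_content = False
        self._offset = 0
        self._num = 10

    def verbatim(self):
        self._verbatim = True
        return self  # returning self is what makes chaining possible

    def no_content(self):
        self._no_content = True
        return self

    def paging(self, offset, num):
        self._offset = offset
        self._num = num
        return self

    def describe(self):
        """Summarize the collected options as a plain dict."""
        return {
            'query': self._query_string,
            'verbatim': self._verbatim,
            'no_content': self._no_content,
            'offset': self._offset,
            'num': self._num,
        }


q = SketchQuery('search engine').verbatim().no_content().paging(0, 5)
print(q.describe())
```

Each call mutates and returns the same object, so the order of the setters in a chain does not matter here.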
### \_\_init\_\_
```py
def __init__(self, query_string)
```
Create a new query object.
The query string is set in the constructor, and other options have setter functions.
### add\_filter
```py
def add_filter(self, flt)
```
Add a numeric or geo filter to the query.
**Currently only one of each filter is supported by the engine**
- **flt**: A NumericFilter or GeoFilter object, used on a corresponding field
### get\_args
```py
def get_args(self)
```
Format the redis arguments for this query and return them
### highlight
```py
def highlight(self, fields=None, tags=None)
```
Apply specified markup to matched term(s) within the returned field(s)
- **fields**: if specified, only the mentioned fields are highlighted; otherwise all fields are highlighted
- **tags**: a list of two strings to surround the match.
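
To show what "two strings to surround the match" looks like in practice, here is a local imitation in plain Python. `highlight_terms` is a hypothetical helper and the default `<b>`/`</b>` tags are an assumption made for the example. The real highlighting is done by the engine, which has its own tokenization and stemming, so its output may differ.

```py
import re


def highlight_terms(text, terms, tags=('<b>', '</b>')):
    """Wrap whole-word, case-insensitive matches of `terms` in `tags`."""
    open_tag, close_tag = tags
    pattern = re.compile(
        r'\b(' + '|'.join(re.escape(t) for t in terms) + r')\b',
        re.IGNORECASE)
    return pattern.sub(lambda m: open_tag + m.group(0) + close_tag, text)


marked = highlight_terms('RediSearch is a search engine', ['search', 'engine'])
print(marked)  # RediSearch is a <b>search</b> <b>engine</b>
```

With the library itself, the corresponding request would be built as `Query('search engine').highlight(fields=['body'], tags=['<b>', '</b>'])`.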
### in\_order
```py
def in_order(self)
```
Match only documents where the query terms appear in the same order in the document.
For example, the query 'hello world' does not match 'world hello'.
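
The ordering rule can be illustrated with a short pure-Python check. This is a sketch of the semantics only, assuming simple whitespace tokenization; `terms_in_order` is a hypothetical helper and says nothing about how the engine evaluates the query internally.

```py
def terms_in_order(query, document):
    """True if every query term occurs in `document` in the same relative order."""
    tokens = iter(document.lower().split())
    # `term in tokens` consumes the iterator up to the match, so each
    # later term can only be found after the previous one.
    return all(term in tokens for term in query.lower().split())


print(terms_in_order('hello world', 'hello big world'))  # True
print(terms_in_order('hello world', 'world hello'))      # False
```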
### language
```py
def language(self, language)
```
Analyze the query as being in the specified language
- **language**: the language (e.g. `chinese` or `english`)
### limit\_fields
```py
def limit_fields(self, *fields)
```
Limit the search to specific TEXT fields only
- **fields**: A list of strings, case sensitive field names from the defined schema
### limit\_ids
```py
def limit_ids(self, *ids)
```
Limit the results to a specific set of pre-known document ids of any length
### no\_content
```py
def no_content(self)
```
Set the query to only return ids and not the document content
### no\_stopwords
```py
def no_stopwords(self)
```
Prevent the query from being filtered for stopwords.
Only useful in very big queries that you are certain contain no stopwords.
### paging
```py
def paging(self, offset, num)
```
Set the paging for the query (defaults to 0..10).
- **offset**: Paging offset for the results. Defaults to 0
- **num**: How many results to return
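
Since `paging` takes an offset, not a page number, a small conversion helper can be convenient. The sketch below assumes zero-based page numbers; `page_window` is a hypothetical name and is not part of the library.

```py
def page_window(page, page_size=10):
    """Translate a zero-based page number into an (offset, num) pair."""
    if page < 0 or page_size <= 0:
        raise ValueError('page must be >= 0 and page_size must be > 0')
    return page * page_size, page_size


print(page_window(0))      # (0, 10)
print(page_window(3, 25))  # (75, 25)
```

The pair can be unpacked straight into the setter, for example `Query('search engine').paging(*page_window(2))` to request results 20 through 29.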
### query\_string
```py
def query_string(self)
```
Return the query string of this query only
### return\_fields
```py
def return_fields(self, *fields)
```
Only return values from these fields
### slop
```py
def slop(self, slop)
```
Allow a maximum of N intervening non-matched terms between phrase terms (0 means exact phrase)
### sort\_by
```py
def sort_by(self, field, asc=True)
```
Add a sortby field to the query
- **field**: the name of the field to sort by
- **asc**: when `True`, sorting will be done in ascending order
### summarize
```py
def summarize(self, fields=None, context_len=None, num_frags=None, sep=None)
```
Return an abridged format of the field, containing only the segments of
the field which contain the matching term(s).
If `fields` is specified, then only the mentioned fields are
summarized; otherwise all results are summarized.
Server side defaults are used for each option (except `fields`) if not specified
- **fields**: List of fields to summarize. All fields are summarized if not specified
- **context_len**: Amount of context to include with each fragment
- **num_frags**: Number of fragments per document
- **sep**: Separator string to separate fragments
### verbatim
```py
def verbatim(self)
```
Set the query to be verbatim, i.e. use no query expansion or stemming
### with\_payloads
```py
def with_payloads(self)
```
Ask the engine to return document payloads
## Class Result
Represents the result of a search query, and has an array of Document objects
### \_\_init\_\_
```py
def __init__(self, res, hascontent, duration=0, has_payload=False)
```
- **snippets**: An optional dictionary of the form {field: snippet_size} for snippet formatting
## Class SortbyField
### \_\_init\_\_
```py
def __init__(self, field, asc=True)
```
## Class Suggestion
Represents a single suggestion being sent or returned from the auto complete server
### \_\_init\_\_
```py
def __init__(self, string, score=1.0, payload=None)
```
## Class TagField
TagField is a tag-indexing field with simpler compression and tokenization.
See http://redisearch.io/Tags/
### \_\_init\_\_
```py
def __init__(self, name, separator=',', no_index=False)
```
### redis\_args
```py
def redis_args(self)
```
## Class TextField
TextField is used to define a text field in a schema definition
### \_\_init\_\_
```py
def __init__(self, name, weight=1.0, sortable=False, no_stem=False, no_index=False)
```
### redis\_args
```py
def redis_args(self)
```