[[query-dsl-query-string-query]]
=== Query String Query

A query that uses a query parser in order to parse its content. Here is
an example:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "default_field" : "content",
        "query" : "this AND that OR thus"
    }
}
--------------------------------------------------

The `query_string` top level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`query` |The actual query to be parsed. See <<query-string-syntax>>.

|`default_field` |The default field for query terms if no prefix field
is specified. Defaults to the `index.query.default_field` index
setting, which in turn defaults to `_all`.

|`default_operator` |The default operator used if no explicit operator
is specified. For example, with a default operator of `OR`, the query
`capital of Hungary` is translated to `capital OR of OR Hungary`, and
with default operator of `AND`, the same query is translated to
`capital AND of AND Hungary`. The default value is `OR`.

|`analyzer` |The analyzer name used to analyze the query string.

|`allow_leading_wildcard` |When set, `*` or `?` are allowed as the first
character. Defaults to `true`.

|`lowercase_expanded_terms` |Whether terms of wildcard, prefix, fuzzy,
and range queries are automatically lower-cased (since they are not
analyzed). Defaults to `true`.

|`enable_position_increments` |Set to `true` to enable position
increments in result queries. Defaults to `true`.

|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
expand to. Defaults to `50`.

|`fuzziness` |Set the fuzziness for fuzzy queries. Defaults
to `AUTO`. See <<fuzziness>> for allowed settings.

|`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
is `0`.

|`phrase_slop` |Sets the default slop for phrases. If zero, then exact
phrase matches are required. Default value is `0`.

|`boost` |Sets the boost value of the query. Defaults to `1.0`.

|`analyze_wildcard` |By default, wildcard terms in a query string are
not analyzed. When this value is set to `true`, a best effort is made
to analyze them as well.

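|`auto_generate_phrase_queries` |Whether a phrase query is generated
automatically when analysis of a single whitespace-delimited term yields
multiple tokens. Defaults to `false`.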

|`max_determinized_states` |Limit on how many automaton states regexp
queries are allowed to create. This protects against too-difficult
(e.g. exponentially hard) regexps. Defaults to `10000`.

|`minimum_should_match` |A value controlling how many "should" clauses
in the resulting boolean query should match. It can be an absolute value
(`2`), a percentage (`30%`) or a
<<query-dsl-minimum-should-match,combination of
both>>.

|`lenient` |If set to `true`, format-based failures (like providing
text to a numeric field) are ignored.

|`locale` | Locale that should be used for string conversions.
Defaults to `ROOT`.

|`time_zone` | Time zone to be applied to any range query on dates. See also
http://www.joda.org/joda-time/apidocs/org/joda/time/DateTimeZone.html[JODA timezone].
|=======================================================================
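
As an illustrative sketch, several of these parameters can be combined in
a single query. The `content` field is carried over from the example
above; the query text and the parameter values are assumptions chosen
only for demonstration:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "default_field" : "content",
        "query" : "capital of Hungary",
        "default_operator" : "AND",
        "phrase_slop" : 2,
        "lenient" : true
    }
}
--------------------------------------------------

With `default_operator` set to `AND`, this query string is translated to
`capital AND of AND Hungary`.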

When a multi term query is being generated, one can control how it gets
rewritten using the
<<query-dsl-multi-term-rewrite,rewrite>>
parameter.
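
A minimal sketch, assuming the `scoring_boolean` rewrite method described
in <<query-dsl-multi-term-rewrite,rewrite>>. The wildcard pattern `th*`
is an illustrative value:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "default_field" : "content",
        "query" : "th*",
        "rewrite" : "scoring_boolean"
    }
}
--------------------------------------------------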

[float]
==== Default Field

When the query string does not explicitly name a field to search, the
`index.query.default_field` setting determines which field is used. It
defaults to the `_all` field.

If the `_all` field is disabled, it therefore makes sense to set a
different default field.
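
As a sketch, the setting could be supplied in the settings section of an
index creation request. This assumes documents in the index have a
`content` field; that field name is only an illustration:

[source,js]
--------------------------------------------------
{
    "settings" : {
        "index.query.default_field" : "content"
    }
}
--------------------------------------------------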

[float]
==== Multi Field

The `query_string` query can also run against multiple fields. Fields can be
provided via the `"fields"` parameter (example below).

The idea of running the `query_string` query against multiple fields is to
expand each query term to an OR clause like this:

    field1:query_term OR field2:query_term OR ...

For example, the following query

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["content", "name"],
        "query" : "this AND that"
    }
}
--------------------------------------------------

matches the same words as


[source,js]
--------------------------------------------------
{
    "query_string": {
      "query": "(content:this OR name:this) AND (content:that OR name:that)"
    }
}
--------------------------------------------------

Since several queries are generated from the individual search terms,
combining them can be automatically done using either a `dis_max` query or a
simple `bool` query. For example (the `name` is boosted by 5 using `^5`
notation):

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["content", "name^5"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true
    }
}
--------------------------------------------------

A simple wildcard can also be used to search "within" specific inner
elements of the document. For example, if we have a `city` object with
several fields (or an inner object with fields) in it, we can
automatically search on all "city" fields:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["city.*"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true
    }
}
--------------------------------------------------

Another option is to provide the wildcard fields search in the query
string itself (properly escaping the `*` sign), for example:
`city.\*:something`.
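
Written as a request body, this might look like the following sketch.
Note that the backslash is doubled, because a backslash inside a JSON
string must itself be escaped; `something` is a placeholder search term:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "query" : "city.\\*:something"
    }
}
--------------------------------------------------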

When running the `query_string` query against multiple fields, the
following additional parameters are allowed:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`use_dis_max` |Should the queries be combined using `dis_max` (set it
to `true`), or a `bool` query (set it to `false`). Defaults to `true`.

|`tie_breaker` |When using `dis_max`, the disjunction max tie breaker.
Defaults to `0`.
|=======================================================================
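
For illustration, both parameters can be set together as follows. The
`tie_breaker` value of `0.3` is an arbitrary assumption, not a
recommended setting:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["content", "name^5"],
        "query" : "this AND that",
        "use_dis_max" : true,
        "tie_breaker" : 0.3
    }
}
--------------------------------------------------

A tie breaker greater than `0` lets fields other than the best-matching
one contribute to the score.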

The `fields` parameter can also include pattern-based field names, which
are automatically expanded to the matching fields (including dynamically
introduced fields). For example:

[source,js]
--------------------------------------------------
{
    "query_string" : {
        "fields" : ["content", "name.*^5"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true
    }
}
--------------------------------------------------

include::query-string-syntax.asciidoc[]