[[analysis-stop-tokenfilter]]
=== Stop Token Filter

A token filter of type `stop` that removes stop words from token
streams.

The `stop` token filter accepts the following settings:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`stopwords` |A list of stop words to use. Defaults to the English stop
words.

|`stopwords_path` |A path (either relative to the `config` location, or
absolute) to a stopwords file. Each stop word must be on its own line
(separated by a line break). The file must be UTF-8 encoded.

|`ignore_case` |Set to `true` to lowercase all words first, so that stop
words match regardless of case. Defaults to `false`.

|`remove_trailing` |Set to `false` to keep the last term of a search even
if it is a stop word. This is useful for the completion suggester,
because a query like `green a` can then be extended to `green apple` even
though stop words are otherwise removed. Defaults to `true`.
|=======================================================================
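
The effect of `ignore_case` and `remove_trailing` can be illustrated with
a small Python model. This is only a simplified sketch of the behaviour
described in the table, not Elasticsearch's actual implementation; the
function name `apply_stop_filter` and its parameters are hypothetical.

```python
def apply_stop_filter(tokens, stopwords, ignore_case=False, remove_trailing=True):
    """Simplified, illustrative model of the `stop` filter settings."""
    # With ignore_case, compare lowercased forms; otherwise compare as-is.
    normalize = str.lower if ignore_case else (lambda word: word)
    stop_set = {normalize(word) for word in stopwords}

    kept = []
    last_index = len(tokens) - 1
    for index, token in enumerate(tokens):
        is_stop_word = normalize(token) in stop_set
        # With remove_trailing=False, the final token is kept even if it
        # is a stop word (assumed reading of the setting's description).
        is_protected = (not remove_trailing) and index == last_index
        if is_stop_word and not is_protected:
            continue
        kept.append(token)
    return kept


if __name__ == "__main__":
    stopwords = ["a", "the"]
    print(apply_stop_filter(["green", "a"], stopwords))
    print(apply_stop_filter(["green", "a"], stopwords, remove_trailing=False))
    print(apply_stop_filter(["The", "green", "apple"], stopwords, ignore_case=True))
```

In this model, `green a` loses its trailing `a` under the defaults but
keeps it when `remove_trailing` is `false`, which is what allows a
suggester to complete it to `green apple`.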

The `stopwords` setting also accepts predefined, language-specific stop
word lists. These are referenced using the `_lang_` notation (for
example, `_english_`), and the following languages are supported: arabic,
armenian, basque, brazilian, bulgarian, catalan, czech, danish, dutch,
english, finnish, french, galician, german, greek, hindi, hungarian,
indonesian, italian, norwegian, persian, portuguese, romanian, russian,
spanish, swedish, turkish.
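
As a rough sketch, the settings above might be combined in an index
settings body like the one built below. The filter and analyzer names
(`my_french_stop`, `my_custom_stop`, `my_analyzer`) are hypothetical, and
the sketch assumes that a `_lang_` value can be given directly as the
`stopwords` value. The Python code only assembles and prints the JSON; it
does not talk to a cluster.

```python
import json

# Hypothetical names throughout: my_french_stop, my_custom_stop, my_analyzer.
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                # Predefined language list, using the _lang_ notation.
                "my_french_stop": {
                    "type": "stop",
                    "stopwords": "_french_",
                },
                # Explicit list of stop words, lowercased before matching.
                "my_custom_stop": {
                    "type": "stop",
                    "stopwords": ["and", "is", "the"],
                    "ignore_case": True,
                },
            },
            "analyzer": {
                # Custom analyzer that applies one of the filters above.
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["my_custom_stop"],
                }
            },
        }
    }
}

settings_json = json.dumps(index_settings, indent=2)
print(settings_json)
```

The printed JSON is the kind of body that would be supplied when creating
an index; a `stopwords_path` entry could be used in place of `stopwords`
to load the words from a file instead.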