File: stop-tokenfilter.asciidoc

package info (click to toggle)

elasticsearch 1.0.3%2Bdfsg-5

links: PTS, VCS
area: main
in suites: jessie-kfreebsd
size: 37,220 kB
sloc: java: 365,486; xml: 1,258; sh: 714; python: 505; ruby: 354; perl: 134; makefile: 41

file content (35 lines) | stat: -rw-r--r-- 1,432 bytes

[[analysis-stop-tokenfilter]]
=== Stop Token Filter

A token filter of type `stop` that removes stop words from token
streams.

The following are settings that can be set for a `stop` token filter
type:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`stopwords` |A list of stop words to use. Defaults to english stop
words.

|`stopwords_path` |A path (either relative to `config` location, or
absolute) to a stopwords file configuration. Each stop word should be in
its own "line" (separated by a line break). The file must be UTF-8
encoded.

|`ignore_case` |Set to `true` to lower case all words first. Defaults to
`false`.

|`remove_trailing` |Set to `false` in order to not ignore the last term of
a search if it is a stop word. This is very useful for the completion
suggester as a query like `green a` can be extended to `green apple` even
though you remove stop words in general. Defaults to `true`.
|=======================================================================

stopwords allow for custom language specific expansion of default
stopwords. It follows the `_lang_` notation and supports: arabic,
armenian, basque, brazilian, bulgarian, catalan, czech, danish, dutch,
english, finnish, french, galician, german, greek, hindi, hungarian,
indonesian, italian, norwegian, persian, portuguese, romanian, russian,
spanish, swedish, turkish.