File: lang-analyzer.asciidoc

[[analysis-lang-analyzer]]
=== Language Analyzers

A set of analyzers for analyzing text in specific languages. The
following types are supported: `arabic`, `armenian`, `basque`,
`brazilian`, `bulgarian`, `catalan`, `chinese`, `cjk`, `czech`,
`danish`, `dutch`, `english`, `finnish`, `french`, `galician`, `german`,
`greek`, `hindi`, `hungarian`, `indonesian`, `italian`, `norwegian`,
`persian`, `portuguese`, `romanian`, `russian`, `spanish`, `swedish`,
`turkish`, `thai`.

All analyzers support custom `stopwords`, set either inline in the
config or loaded from an external stopwords file by setting
`stopwords_path`. Check <<analysis-stop-analyzer,Stop Analyzer>> for
more details.
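
As a rough illustration of the two stopword options, the sketch below
defines two custom analyzers on top of the `french` type. The analyzer
names `my_french` and `my_french_from_file`, the word list, and the file
path `stopwords/french.txt` are placeholders chosen for this example,
not defaults. The external file is assumed to contain one stopword per
line.

[source,yaml]
--------------------------------------------------
index :
    analysis :
        analyzer :
            # hypothetical analyzer name; stopwords listed inline
            my_french :
                type : french
                stopwords : [le, la, les]
            # hypothetical analyzer name; stopwords read from a file
            # (placeholder path, assumed relative to the config location)
            my_french_from_file :
                type : french
                stopwords_path : stopwords/french.txt
--------------------------------------------------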

The following analyzers support a custom `stem_exclusion` list of words that should not be stemmed:
`arabic`, `armenian`, `basque`, `brazilian`, `bulgarian`, `catalan`,
`czech`, `danish`, `dutch`, `english`, `finnish`, `french`, `galician`,
`german`, `hindi`, `hungarian`, `indonesian`, `italian`, `norwegian`,
`portuguese`, `romanian`, `russian`, `spanish`, `swedish`, `turkish`.
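
A minimal sketch of a `stem_exclusion` setting, assuming a custom
analyzer built on the `english` type. The analyzer name `my_english`
and the excluded words are placeholders for illustration only.

[source,yaml]
--------------------------------------------------
index :
    analysis :
        analyzer :
            # hypothetical analyzer name
            my_english :
                type : english
                # example words that should be left unstemmed
                stem_exclusion : [organization, organizations]
--------------------------------------------------

With a setting like this, the listed words are expected to be indexed
as written rather than reduced to a common stem.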