[[index-modules-analysis]]
== Analysis

The index analysis module acts as a configurable registry of analyzers.
Analyzers are used both to break analyzed fields into terms when a
document is indexed and to process query strings. Each registered
analyzer maps to a Lucene `Analyzer`.

Analyzers are (generally) composed of a single `Tokenizer` and zero or
more `TokenFilters`. A set of `CharFilters` can be associated with an
analyzer to process the characters before the other analysis steps. The
analysis module allows you to register `TokenFilters`, `Tokenizers` and
`Analyzers` under logical names that can then be referenced either in
mapping definitions or in certain APIs. The analysis module
automatically registers the built-in analyzers, token filters, and
tokenizers (*if not explicitly defined*).
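
The composition and registration described above can be sketched in a
few lines of Python. This is only an illustrative model, not the
Elasticsearch or Lucene API: the class names `Analyzer` and
`AnalysisRegistry`, the helper functions, and the logical names
`my_analyzer` and `whitespace` are all assumptions made for the example.
The sketch shows the order of the steps (char filters, then one
tokenizer, then token filters) and how an explicitly registered name
takes precedence over a built-in one.

```python
import re
from typing import Callable, Dict, List, Sequence

# Hypothetical type aliases for the three kinds of analysis components.
CharFilter = Callable[[str], str]
Tokenizer = Callable[[str], List[str]]
TokenFilter = Callable[[List[str]], List[str]]


class Analyzer:
    """Illustrative model: char filters, one tokenizer, token filters."""

    def __init__(self, tokenizer: Tokenizer,
                 token_filters: Sequence[TokenFilter] = (),
                 char_filters: Sequence[CharFilter] = ()) -> None:
        self.tokenizer = tokenizer
        self.token_filters = list(token_filters)
        self.char_filters = list(char_filters)

    def analyze(self, text: str) -> List[str]:
        # 1. Char filters rewrite the raw character stream.
        for char_filter in self.char_filters:
            text = char_filter(text)
        # 2. Exactly one tokenizer splits the text into tokens.
        tokens = self.tokenizer(text)
        # 3. Token filters transform the token stream, in order.
        for token_filter in self.token_filters:
            tokens = token_filter(tokens)
        return tokens


def strip_html(text: str) -> str:
    """Toy char filter: replace anything that looks like a tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def whitespace_tokenizer(text: str) -> List[str]:
    """Toy tokenizer: split on runs of whitespace."""
    return text.split()


def lowercase(tokens: List[str]) -> List[str]:
    """Toy token filter: lowercase every token."""
    return [token.lower() for token in tokens]


def remove_stopwords(tokens: List[str]) -> List[str]:
    """Toy token filter: drop a small, made-up stopword set."""
    stopwords = {"a", "an", "the"}
    return [token for token in tokens if token not in stopwords]


class AnalysisRegistry:
    """Illustrative registry mapping logical names to analyzers."""

    def __init__(self, built_in: Dict[str, Analyzer]) -> None:
        self._built_in = dict(built_in)
        self._explicit: Dict[str, Analyzer] = {}

    def register(self, name: str, analyzer: Analyzer) -> None:
        self._explicit[name] = analyzer

    def get(self, name: str) -> Analyzer:
        # Built-ins apply only if the name was not explicitly defined.
        if name in self._explicit:
            return self._explicit[name]
        return self._built_in[name]


registry = AnalysisRegistry(
    built_in={"whitespace": Analyzer(whitespace_tokenizer)})
registry.register(
    "my_analyzer",
    Analyzer(whitespace_tokenizer,
             token_filters=[lowercase, remove_stopwords],
             char_filters=[strip_html]))

custom_terms = registry.get("my_analyzer").analyze("<b>The</b> Quick brown Fox")
built_in_terms = registry.get("whitespace").analyze("The Quick brown Fox")
```

Looking analyzers up by logical name, rather than passing analyzer
objects around, mirrors how a mapping definition can refer to an
analyzer by name only. The same name can then be redefined without
touching the places that reference it.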

See <<analysis>> for configuration details.