File: pattern-replace-charfilter.asciidoc

package info (click to toggle)
elasticsearch 1.0.3%2Bdfsg-5
  • links: PTS, VCS
  • area: main
  • in suites: jessie-kfreebsd
  • size: 37,220 kB
  • sloc: java: 365,486; xml: 1,258; sh: 714; python: 505; ruby: 354; perl: 134; makefile: 41
file content (37 lines) | stat: -rw-r--r-- 1,348 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
[[analysis-pattern-replace-charfilter]]
=== Pattern Replace Char Filter

The `pattern_replace` char filter allows the use of a regex to
manipulate the characters in a string before analysis. The regular
expression is defined using the `pattern` parameter, and the replacement
string can be provided using the `replacement` parameter (supporting
referencing the original text, as explained
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,%20java.lang.String)[here]).
For more information check the
http://lucene.apache.org/core/4_3_1/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceCharFilter.html[lucene
documentation]

Here is a sample configuration:

[source,js]
--------------------------------------------------
{
    "index" : {
        "analysis" : {
            "char_filter" : {
                "my_pattern":{
                    "type":"pattern_replace",
                    "pattern":"sample(.*)",
                    "replacement":"replacedSample $1"
                }
            },
            "analyzer" : {
                "custom_with_char_filter" : {
                    "tokenizer" : "standard",
                    "char_filter" : ["my_pattern"]
                },
            }
        }
    }
}
--------------------------------------------------