File: StringSplitter.md

package info (click to toggle)
error-prone-java 2.18.0-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, forky, sid, trixie
  • size: 23,204 kB
  • sloc: java: 222,992; xml: 1,319; sh: 25; makefile: 7
file content (37 lines) | stat: -rw-r--r-- 1,793 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
`String.split(String)` and `Pattern.split(CharSequence)` have surprising
behaviour. For example, consider the following puzzler from
http://konigsberg.blogspot.com/2009/11/final-thoughts-java-puzzler-splitting.html:

```java
String[] nothing = "".split(":");
String[] bunchOfNothing = ":".split(":");
```

The result is `[""]` and `[]`!

More examples:

input    | `input.split(":")`  | `Pattern.compile(":").split(input)` | `Splitter.on(':').split(input)`
-------- | ------------------- | ----------------------------------- | -------------------------------
`""`     | `[""]`              | `[""]`                              | `[""]`
`":"`    | `[]`                | `[]`                                | `["", ""]`
`":::"`  | `[]`                | `[]`                                | `["", "", "", ""]`
`"a:::"` | `["a"]`             | `["a"]`                             | `["a", "", "", ""]`
`":::b"` | `["", "", "", "b"]` | `["", "", "", "b"]`                 | `["", "", "", "b"]`

Prefer either:

*   Guava's
    [`Splitter`](https://guava.dev/releases/23.0/api/docs/com/google/common/base/Splitter.html),
    which has less surprising behaviour and provides explicit control over the
    handling of empty strings and the trimming of whitespace with `trimResults`
    and `omitEmptyStrings`.

*   [`String.split(String, int)`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/String.html#split\(java.lang.String,int\))
    or
    [`Pattern.split(CharSequence, int)`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/String.html#split\(java.lang.String,int\))
    and setting an explicit 'limit' to `-1` to match the behaviour of
    `Splitter`.

TIP: if you use `Splitter`, consider extracting the instance to a `static`
`final` field.