[![Tests](https://github.com/php-gettext/Languages/actions/workflows/tests.yml/badge.svg)](https://github.com/php-gettext/Languages/actions/workflows/tests.yml)
# gettext language list automatically generated from CLDR data


## Static usage

To export the language data generated by this tool, use the `bin/export-plural-rules` command.

#### Export command line options
`export-plural-rules` supports the following options:
- `--us-ascii`
  If specified, the output will contain only US-ASCII characters.
  If not specified, the output charset is UTF-8.
- `--languages=<LanguageId>[,<LanguageId>,...]`
  `--language=<LanguageId>[,<LanguageId>,...]`
  Export only the specified language codes.
  Separate languages with commas; you can also use this option more than once. Language codes are case-insensitive, and both `_` and `-` are accepted as separators between locale chunks (for example, `it_IT` and `it-it` are equivalent).
  If this option is not specified, the result will contain all the available languages.
- `--reduce=yes|no`
  If set to `yes`, the output won't contain languages that share the same base language and rules.
  For instance `nl_BE` (`Flemish`) will be omitted because it's the same as `nl` (`Dutch`).
  Defaults to `no` if `--languages` is specified, to `yes` otherwise.
- `--parenthesis=yes|no`
  If set to `no`, extra parentheses will be omitted from the generated plural rule formulas.
  Those extra parentheses are needed to create a PHP-compatible formula.
  Defaults to `yes`.
- `--output=<file name>`
  If specified, the output will be saved to `<file name>`. If not specified, the output is written to standard output.
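
As an illustration of the matching rules described for `--languages`, here is a minimal sketch of how a language identifier could be normalized so that `it_IT` and `it-it` compare equal. The function name `normalizeLanguageId` and the casing rules (lowercase language, capitalized four-letter script, uppercase territory) are assumptions made for this example; they are not taken from the tool's source code.

```php
/**
 * Hypothetical helper (not part of gettext/languages): normalize a language identifier.
 */
function normalizeLanguageId($id)
{
    // Accept both '_' and '-' as separators between locale chunks
    $chunks = preg_split('/[-_]/', trim($id));
    // The first chunk is the language code: lowercase it
    $normalized = array(strtolower(array_shift($chunks)));
    foreach ($chunks as $chunk) {
        if (strlen($chunk) === 4) {
            // Assumption: four-letter chunks are script codes (eg Hans)
            $normalized[] = ucfirst(strtolower($chunk));
        } else {
            // Assumption: any other chunk is a territory code (eg IT)
            $normalized[] = strtoupper($chunk);
        }
    }

    return implode('_', $normalized);
}

var_dump(normalizeLanguageId('it-it') === normalizeLanguageId('it_IT')); // bool(true)
```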

#### Export formats
`export-plural-rules` can generate data in the following formats:

- `json`: compressed JSON data
  ```bash
  export-plural-rules json
  ```

- `prettyjson`: uncompressed JSON data
  ```bash
  export-plural-rules prettyjson
  ```

- `html`: HTML table ([see the result](https://php-gettext.github.io/Languages/))
  ```bash
  export-plural-rules html
  ```

- `php`: build a PHP file that can be included
  ```bash
  export-plural-rules --output=yourfile.php php
  ```
  Then you can use that generated file in your PHP scripts:
  ```php
  $languages = include 'yourfile.php';
  ```

- `ruby`: build a Ruby file that can be included
  ```bash
  export-plural-rules --parenthesis=no --output=yourfile.rb ruby
  ```
  Then you can use that generated file in your Ruby scripts:
  ```ruby
  require './yourfile.rb'
  PLURAL_RULES['en']
  ```

- `xml`: generate an XML document ([here you can find the xsd XML schema](https://php-gettext.github.io/Languages/GettextLanguages.xsd))
  ```bash
  export-plural-rules xml
  ```

- `po`: generate the gettext .po headers for a single language
  ```bash
  export-plural-rules po --language=YourLanguageCode
  ```


## Dynamic usage

#### With Composer
You can use [Composer](https://getcomposer.org/) to include this tool in your project.
Simply run `composer require gettext/languages` or add `"gettext/languages": "*"` to the `"require"` section of your `composer.json` file.

#### Without Composer
If you don't use Composer in your project, you can download this package into a directory of your project and include the autoloader file:
```php
require_once 'path/to/src/autoloader.php';
```

#### Main methods
The most useful functions of this tool are the following:
```php
$allLanguages = Gettext\Languages\Language::getAll();
// ...
$oneLanguage = Gettext\Languages\Language::getById('en_US');
// ...
```
`getAll` returns a list of `Gettext\Languages\Language` instances; `getById` returns a single `Gettext\Languages\Language` instance (or `null` if the specified language identifier is not valid).

The main properties of the `Gettext\Languages\Language` instances are:
- `id`: the normalized language ID (for instance `en_US`)
- `name`: the language name (for instance `American English` for `en_US`)
- `supersededBy`: the code of a language that supersedes this language code (for instance, `jw` is superseded by `jv` to represent the Javanese language)
- `script`: the script name (for instance, for `zh_Hans` - `Simplified Chinese` - the script is `Simplified Han`)
- `territory`: the name of the territory (for instance `United States` for `en_US`)
- `baseLanguage`: the name of the base language (for instance `English` for `en_US`)
- `formula`: the [gettext formula](https://www.gnu.org/savannah-checkouts/gnu/gettext/manual/html_node/Plural-forms.html) to distinguish between different plural rules. For instance `n != 1`
- `categories`: the plural cases applicable for this language. It's an array of `Gettext\Languages\Category` instances. Each instance has these properties:
  - `id`: can be (in this order) one of `zero`, `one`, `two`, `few`, `many` or `other`. The `other` case is always present.
  - `examples`: a representation of some values for which this plural case is valid (examples are simple numbers like `1` or complex ranges like `0, 2~16, 100, 1000, 10000, 100000, 1000000, …`)
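
The properties listed above can be read directly from a `Language` instance. The following sketch formats them as plain text; `describeLanguage` is a hypothetical helper written for this example, and it only assumes that the object it receives exposes the `id`, `name`, `formula` and `categories` properties described above.

```php
/**
 * Hypothetical helper: build a short text summary of a language.
 *
 * @param object $language an object exposing the properties described above
 *
 * @return string
 */
function describeLanguage($language)
{
    $lines = array();
    $lines[] = sprintf('%s (%s): %s', $language->name, $language->id, $language->formula);
    foreach ($language->categories as $category) {
        $lines[] = sprintf('  %s: %s', $category->id, $category->examples);
    }

    return implode("\n", $lines);
}

// Only run the demo when gettext/languages has been loaded (Composer or autoloader.php)
if (class_exists('Gettext\Languages\Language')) {
    $language = Gettext\Languages\Language::getById('en_US');
    // getById returns null when the identifier is not valid
    echo $language === null ? 'Unknown language' : describeLanguage($language), "\n";
}
```

The first line of the summary holds the name, the ID and the formula; each following line holds one plural category with its example values.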


## Is this data correct?

Yes - as far as you trust the [Unicode CLDR](http://cldr.unicode.org) project.

The conversion from CLDR to gettext also includes [a lot of tests](https://travis-ci.org/php-gettext/Languages) to check the results.
And they all pass :wink:.


## Reference

#### CLDR

The [CLDR specifications](https://unicode.org/reports/tr35/tr35-numbers.html#Language_Plural_Rules) define the following variables to be used in the CLDR plural formulas:
- `n`: absolute value of the source number (integer and decimals) (eg: `9.870` => `9.87`)
- `i`: integer digits of n (eg: `9.870` => `9`)
- `v`: number of visible fraction digits in n, with trailing zeros (eg: `9.870` => `3`)
- `w`: number of visible fraction digits in n, without trailing zeros (eg: `9.870` => `2`)
- `f`: visible fractional digits in n, with trailing zeros (eg: `9.870` => `870`)
- `t`: visible fractional digits in n, without trailing zeros (eg: `9.870` => `87`)
- `c`: exponent of the power of 10 used in compact decimal formatting (eg: `98c7` => `7`)
- `e`: synonym for `c`
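
The operands above can be derived from the textual form of a number. Below is a minimal sketch, assuming plain decimal notation (so the `c`/`e` operands are left out); `cldrOperands` is a hypothetical helper and is not part of this package.

```php
/**
 * Hypothetical helper: compute the CLDR plural operands of a decimal number.
 * The number is passed as a string so that trailing zeros are preserved.
 * Assumption: plain decimal notation only, so the c/e operands are not handled.
 */
function cldrOperands($number)
{
    // n is an absolute value: drop any leading sign
    $number = ltrim($number, '+-');
    $parts = explode('.', $number, 2);
    $fraction = isset($parts[1]) ? $parts[1] : '';
    $trimmedFraction = rtrim($fraction, '0');

    return array(
        'n' => (float) $number,
        'i' => (int) $parts[0],
        'v' => strlen($fraction),
        'w' => strlen($trimmedFraction),
        'f' => (int) $fraction,
        't' => (int) $trimmedFraction,
    );
}

print_r(cldrOperands('9.870'));
```

For `9.870` this yields `n` = 9.87, `i` = 9, `v` = 3, `w` = 2, `f` = 870 and `t` = 87, matching the examples in the list above.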

#### gettext

The [gettext specifications](https://www.gnu.org/savannah-checkouts/gnu/gettext/manual/html_node/Plural-forms.html) define the following variables to be used in the gettext plural formulas:
- `n`: unsigned long int

#### Conversion CLDR > gettext

| CLDR variable | gettext equivalent |
|---------------|--------------------|
| `n`           | `n`                |
| `i`           | `n`                |
| `v`           | `0`                |
| `w`           | `0`                |
| `f`           | *empty*            |
| `t`           | *empty*            |
| `c`           | *empty*            |
| `e`           | *empty*            |
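
To see the table in action, consider the CLDR rule for the `one` category in English, `i = 1 and v = 0`. Replacing `i` with `n` and `v` with `0` leaves `n == 1`, which is consistent with the gettext formula `n != 1` (index `0` for `one`, index `1` for `other`). The sketch below checks that equivalence for a few integer values; both function names are hypothetical and exist only for this example.

```php
// CLDR rule for the English "one" category: i = 1 and v = 0
function cldrEnglishIsOne($i, $v)
{
    return $i === 1 && $v === 0;
}

// gettext formula "n != 1": yields 0 (first plural form) or 1 (second plural form)
function gettextEnglishPluralIndex($n)
{
    return (int) ($n != 1);
}

for ($n = 0; $n <= 20; $n++) {
    // gettext only handles integers: i becomes n and v becomes 0
    $cldrIndex = cldrEnglishIsOne($n, 0) ? 0 : 1;
    if ($cldrIndex !== gettextEnglishPluralIndex($n)) {
        throw new Exception("Mismatch for n = $n");
    }
}
echo "CLDR and gettext agree for n = 0..20\n";
```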



## Parenthesis in ternary operators

The generated gettext formulas contain some extra parentheses, in order to avoid problems in some programming languages.
For instance, let's assume we have this formula:
`(0 == 0) ? 0 : (0 == 1) ? 1 : 2`
- [in C it evaluates to `0`](http://codepad.org/Epw5WkmJ) since it is the same as `(0 == 0) ? 0 : ((0 == 1) ? 1 : 2)`
- [in Java it evaluates to `0`](https://ideone.com/vbRHjW) since it is the same as `(0 == 0) ? 0 : ((0 == 1) ? 1 : 2)`
- [in JavaScript it evaluates to `0`](https://jsfiddle.net/7fnxa599/) since it is the same as `(0 == 0) ? 0 : ((0 == 1) ? 1 : 2)`
- [in PHP it evaluates to `2`](https://3v4l.org/QAAnA) since it is the same as `((0 == 0) ? 0 : (0 == 1)) ? 1 : 2`

So, in order to avoid problems, instead of a simple
`a ? 0 : b ? 1 : 2`
the resulting formulas will be in this format:
`a ? 0 : (b ? 1 : 2)`
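
The difference between the two groupings can be reproduced in PHP by writing the parentheses explicitly. Both groupings are spelled out in this sketch because PHP 7.4 deprecates nested ternaries without parentheses and PHP 8 rejects them with a compile error; the variable names are only illustrative.

```php
// Left-to-right grouping (how older PHP versions read "a ? 0 : b ? 1 : 2")
$leftGrouped = ((0 == 0) ? 0 : (0 == 1)) ? 1 : 2;

// Right-to-left grouping (how C, Java and JavaScript read it, and what the generated formulas use)
$rightGrouped = (0 == 0) ? 0 : ((0 == 1) ? 1 : 2);

var_dump($leftGrouped);  // int(2)
var_dump($rightGrouped); // int(0)
```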


## Contributing

### Generating the CLDR data
This repository uses the CLDR data, including American English (`en_US`) json files.
In order to generate this data, you can use Docker.
Start a new Docker container by running

```sh
docker run --rm -it -v path/to/src/cldr-data:/output alpine:3.13 sh
```

Then run the following script, setting the values of the variables according to your needs:

```sh
# The value of the CLDR version (eg 39, 38.1, ...)
CLDR_VERSION=39
# Your GitHub username (required since CLDR 38) - see http://cldr.unicode.org/development/maven#TOC-Introduction
GITHUB_USERNAME=
# Your GitHub personal access token (required since CLDR 38) - see http://cldr.unicode.org/development/maven#TOC-Introduction
GITHUB_TOKEN=

if ! test -d /output; then
    echo 'Missing output directory' >&2
    return 1
fi
apk -U upgrade
apk add --no-cache git git-lfs openjdk8 apache-ant maven
CLDR_MAJORVERSION="$(printf '%s' "$CLDR_VERSION" | sed -E 's/^([0-9]+).*/\1/')"
SOURCE_DIR="$(mktemp -d)"
DESTINATION_DIR="$(mktemp -d)"
git clone --single-branch --depth=1 "--branch=release-$(printf '%s' "$CLDR_VERSION" | tr '.' '-')" https://github.com/unicode-org/cldr.git "$SOURCE_DIR"
if test $CLDR_MAJORVERSION -lt 38; then
    git -C "$SOURCE_DIR" lfs pull --include tools/java || true
    ant -f "$SOURCE_DIR/tools/java/build.xml" jar
    JARFILE="$SOURCE_DIR/tools/java/cldr.jar"
    DESTINATION_DIR_LOCALE="$DESTINATION_DIR/en_US"
    DESTINATION_FILE_PLURALS="$DESTINATION_DIR/supplemental/plurals.json"
else
    if test -z "${GITHUB_USERNAME:-}"; then
        echo 'GITHUB_USERNAME is missing' >&2
        return 1
    fi
    if test -z "${GITHUB_TOKEN:-}"; then
        echo 'GITHUB_TOKEN is missing' >&2
        return 1
    fi
    printf '<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"><servers><server><id>githubicu</id><username>%s</username><password>%s</password></server></servers></settings>' "$GITHUB_USERNAME" "$GITHUB_TOKEN" > "$SOURCE_DIR/mvn-settings.xml"
    mvn --settings "$SOURCE_DIR/mvn-settings.xml" package -DskipTests=true --file "$SOURCE_DIR/tools/cldr-code/pom.xml"
    JARFILE="$SOURCE_DIR/tools/cldr-code/target/cldr-code.jar"
    DESTINATION_DIR_LOCALE="$DESTINATION_DIR"
    DESTINATION_FILE_PLURALS="$DESTINATION_DIR/supplemental/plurals/plurals.json"
fi
java -Duser.language=en -Duser.country=US "-DCLDR_DIR=$SOURCE_DIR" "-DCLDR_GEN_DIR=$DESTINATION_DIR_LOCALE" -jar "$JARFILE" ldml2json -t main -r true -s contributed -m en_US
java -Duser.language=en -Duser.country=US "-DCLDR_DIR=$SOURCE_DIR" "-DCLDR_GEN_DIR=$DESTINATION_DIR/supplemental" -jar "$JARFILE" ldml2json -s contributed -o true -t supplemental
mkdir -p /output/main/en-US
cp "$DESTINATION_DIR/en_US/languages.json" /output/main/en-US/
cp "$DESTINATION_DIR/en_US/scripts.json" /output/main/en-US/
cp "$DESTINATION_DIR/en_US/territories.json" /output/main/en-US/
mkdir -p /output/supplemental
cp "$DESTINATION_FILE_PLURALS" /output/supplemental/
```


## Support this project

You can offer me a [monthly coffee](https://github.com/sponsors/mlocati) or a [one-time coffee](https://paypal.me/mlocati) :wink: