File: charset.md

package info (click to toggle)
php-league-csv 9.24.1%2Bdfsg-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 39,744 kB
  • sloc: php: 13,447; javascript: 80; makefile: 33; xml: 29
file content (147 lines) | stat: -rw-r--r-- 5,671 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
---
layout: default
title: Converting Csv records character encoding
---

# Charset conversion

The `CharsetConverter` class converts your CSV records using the `mbstring` extension and its [supported character encodings](http://php.net/manual/en/mbstring.supported-encodings.php).

## Settings

```php
public CharsetConverter::inputEncoding(string $input_encoding): self
public CharsetConverter::outputEncoding(string $output_encoding): self
```

The `inputEncoding` and `outputEncoding` methods set the object encoding properties. By default, the input encoding and the output encoding are set to `UTF-8`.

When building a `CharsetConverter` object, the methods do not need to be called in any particular order, and may be called multiple times. Because the `CharsetConverter` is immutable, each time its setter methods are called they return a new object without modifying the current one.

<p class="message-warning">If the submitted charset is not supported by the <code>mbstring</code> extension an <code>OutOfRangeException</code> will be thrown.</p>

## Conversion

```php
public CharsetConverter::convert(iterable $records): iterable
```

`CharsetConverter::convert` converts the collection of records according to the encoding settings.

```php
use League\Csv\CharsetConverter;

$csv = new SplFileObject('/path/to/french.csv', 'r');
$csv->setFlags(SplFileObject::READ_CSV | SplFileObject::SKIP_EMPTY);

$encoder = (new CharsetConverter())->inputEncoding('iso-8859-15');
$records = $encoder->convert($csv);
```

The resulting data is converted from `iso-8859-15` to the default `UTF-8` since no output encoding charset was set using the `CharsertConverter::outputEncoding` method.

## CharsetConverter as a Writer formatter

```php
public CharsetConverter::__invoke(array $record): array
```

Using the `CharsetConverter::__invoke` method, you can register a `CharsetConverter` object as a record formatter using the [Writer::addFormatter](/9.0/writer/#record-formatter) method.

```php
use League\Csv\CharsetConverter;
use League\Csv\Writer;

$encoder = (new CharsetConverter())
    ->inputEncoding('utf-8')
    ->outputEncoding('iso-8859-15')
;

$writer = Writer::createFromPath('/path/to/your/csv/file.csv');
$writer->addFormatter($encoder);

$writer->insertOne(["foo", "bébé", "jouet"]);
//all 'utf-8' characters are now automatically encoded into 'iso-8859-15' charset
```

## CharsetConverter as a PHP stream filter

```php
public static CharsetConverter::addTo(AbstractCsv $csv, string $input_encoding, string $output_encoding): AbstractCsv
public static CharsetConverter::register(): void
public static CharsetConverter::getFiltername(string $input_encoding, string $output_encoding): string
```

### Usage with CSV objects

If your CSV object supports PHP stream filters then you can use the `CharsetConverter` class as a PHP stream filter using the library [stream filtering mechanism](/9.0/connections/filters/) instead.

The `CharsetConverter::addTo` static method:

- registers the `CharsetConverter` class under the generic filtername `convert.league.csv.*` if it is not registered yet;
- configures the stream filter using the supplied parameters;
- adds the configured stream filter to the submitted CSV object;

```php
use League\Csv\CharsetConverter;
use League\Csv\Writer;

$writer = Writer::createFromPath('/path/to/your/csv/file.csv');
CharsetConverter::addTo($writer, 'utf8', 'iso-8859-15');

$writer->insertOne(["foo", "bébé", "jouet"]);
//all 'utf-8' characters are now automatically encoded into 'iso-8859-15' charset
```

### Usage with PHP stream resources

To use this stream filter outside `League\Csv` objects you need to:

- register the stream filter using `CharsetConverter::register` method.
- use `CharsetConverter::getFiltername` with one of PHP's attaching stream filter functions with the correct arguments as shown below:

```php
use League\Csv\CharsetConverter;

CharsetConverter::register();

$resource = fopen('/path/to/my/file', 'r');
$filter = stream_filter_append(
    $resource,
    CharsetConverter::getFiltername('utf-8', 'iso-8859-15'),
    STREAM_FILTER_READ
);

while (false !== ($record = fgetcsv($resource))) {
    //$record is correctly encoded
}
```

<p class="message-info">If your system supports the <code>iconv</code> extension you should use PHP's built
in iconv stream filters instead for better performance.</p>

<p class="message-info">available since version <code>9.22.0</code></p>

When not mentioned, PHP will register the stream filter twice as a stream filter that can be used on read
and as a stream filter that can be used on write. This behaviour may introduce subtle issues if you are
not aware of that behaviour. To avoid such scenario we are introducing the following more strict methods:

- `CharsetConverter::appendOnReadTo`,
- `CharsetConverter::appendOnWriteTo`,
- `CharsetConverter::prependOnReadTo`,
- `CharsetConverter::prependOnWriteTo`

To better convey when the conversion will happen. So if you only want to convert the resource on read you
should use the following snippet

```php
use League\Csv\CharsetConverter;

$resource = fopen('/path/to/my/file', 'r+');
$filter = CharsetConverter::appendOnReadTo($resource, 'utf-8', 'iso-8859-15');
echo stream_get_contents($resource); // the return string is converted from 'utf-8' to 'iso-8859-15'
```

<p class="message-info">Even if the resource is writable, the stream filter will only be used when the file is read</p>
<p class="message-warning">The <code>appendTo</code> and <code>prependTo</code> static methods that were introduced in
version <code>9.17</code> are therefore deprecated as of version <code>9.22</code></p>