--TEST--
mb_substr()
--EXTENSIONS--
mbstring
--FILE--
<?php
ini_set('include_path','.');
include_once('common.inc');
// EUC-JP
$euc_jp = mb_convert_encoding('0123この文字列は日本語です。EUC-JPを使っています。日本語は面倒臭い。', 'EUC-JP', 'UTF-8');
// SJIS
$sjis = mb_convert_encoding('日本語テキストです。0123456789。', 'SJIS', 'UTF-8');
// ISO-2022-JP
$iso2022jp = "\x1B\$B\x21\x21!r\x1B(BABC";
// GB-18030
$gb18030 = mb_convert_encoding('密码用户名密码名称名称', 'GB18030', 'UTF-8');
// HZ
$hz = "The next sentence is in GB.~{<:Ky2;S{#,NpJ)l6HK!#~}Bye.";
// UTF-8
$utf8 = "Greek: Σὲ γνωρίζω ἀπὸ τὴν κόψη Russian: Зарегистрируйтесь";
// UTF-32
$utf32 = mb_convert_encoding($utf8, 'UTF-32', 'UTF-8');
// UTF-7
$utf7 = mb_convert_encoding($utf8, 'UTF-7', 'UTF-8');
echo "EUC-JP:\n";
print "1: " . bin2hex(mb_substr($euc_jp, 10, 10, 'EUC-JP')) . "\n";
print "2: " . bin2hex(mb_substr($euc_jp, 0, 100, 'EUC-JP')) . "\n";
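// Start offset beyond the end of the string: expect an empty string
$str = mb_substr($euc_jp, 100, 10, 'EUC-JP');
print ($str === "") ? "3 OK\n" : "BAD: " . bin2hex($str) . "\n";
// Negative start offset larger than the string length: expect the offset to be clamped to the start of the string
$str = mb_substr($euc_jp, -100, 10, 'EUC-JP');
print ($str !== "") ? "4 OK: " . bin2hex($str) . "\n" : "BAD: " . bin2hex($str) . "\n";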
echo "SJIS:\n";
print "1: " . bin2hex(mb_substr($sjis, 0, 3, 'SJIS')) . "\n";
print "2: " . bin2hex(mb_substr($sjis, -1, null, 'SJIS')) . "\n";
print "3: " . bin2hex(mb_substr($sjis, -5, 3, 'SJIS')) . "\n";
print "4: " . bin2hex(mb_substr($sjis, 1, null, 'SJIS')) . "\n";
print "5:" . bin2hex(mb_substr($sjis, 10, 0, 'SJIS')) . "\n";
echo "-- Testing illegal SJIS byte 0x80 --\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 3, 2, 'SJIS')) . "\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 0, 3, 'SJIS')) . "\n";
echo "SJIS-2004:\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 3, 2, 'SJIS-2004')) . "\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 0, 3, 'SJIS-2004')) . "\n";
echo "MacJapanese:\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 3, 2, 'MacJapanese')) . "\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 0, 3, 'MacJapanese')) . "\n";
echo "SJIS-Mobile#DOCOMO:\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 3, 2, 'SJIS-Mobile#DOCOMO')) . "\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 0, 3, 'SJIS-Mobile#DOCOMO')) . "\n";
echo "SJIS-Mobile#KDDI:\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 3, 2, 'SJIS-Mobile#KDDI')) . "\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 0, 3, 'SJIS-Mobile#KDDI')) . "\n";
echo "SJIS-Mobile#SoftBank:\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 3, 2, 'SJIS-Mobile#SoftBank')) . "\n";
print bin2hex(mb_substr("\x80abc\x80\xA1", 0, 3, 'SJIS-Mobile#SoftBank')) . "\n";
echo "ISO-2022-JP:\n";
print "1: " . bin2hex(mb_substr($iso2022jp, 0, 3, 'ISO-2022-JP')) . "\n";
print "2: " . bin2hex(mb_substr($iso2022jp, -1, null, 'ISO-2022-JP')) . "\n";
print "3: " . bin2hex(mb_substr($iso2022jp, -6, 3, 'ISO-2022-JP')) . "\n";
print "4: " . bin2hex(mb_substr($iso2022jp, -3, 2, 'ISO-2022-JP')) . "\n";
print "5: " . bin2hex(mb_substr($iso2022jp, 1, null, 'ISO-2022-JP')) . "\n";
print "6:" . bin2hex(mb_substr($iso2022jp, 10, 0, 'ISO-2022-JP')) . "\n";
print "7:" . bin2hex(mb_substr($iso2022jp, 100, 10, 'ISO-2022-JP')) . "\n";
echo "GB-18030:\n";
print "1: " . bin2hex(mb_substr($gb18030, 0, 3, 'GB-18030')) . "\n";
print "2: " . bin2hex(mb_substr($gb18030, -1, null, 'GB-18030')) . "\n";
print "3: " . bin2hex(mb_substr($gb18030, -5, 3, 'GB-18030')) . "\n";
print "4: " . bin2hex(mb_substr($gb18030, 1, null, 'GB-18030')) . "\n";
print "5:" . bin2hex(mb_substr($gb18030, 10, 0, 'GB-18030')) . "\n";
echo "HZ:\n";
print "1: " . mb_substr($hz, 0, 3, 'HZ') . "\n";
print "2: " . mb_substr($hz, -1, null, 'HZ') . "\n";
print "3: " . mb_substr($hz, -5, 3, 'HZ') . "\n";
print "4: " . mb_substr($hz, 1, null, 'HZ') . "\n";
print "5:" . mb_substr($hz, 10, 0, 'HZ') . "\n";
echo "UTF-8:\n";
print "1: " . mb_substr($utf8, 0, 3, 'UTF-8') . "\n";
print "2: " . mb_substr($utf8, -1, null, 'UTF-8') . "\n";
print "3: " . mb_substr($utf8, -5, 3, 'UTF-8') . "\n";
print "4: " . mb_substr($utf8, 1, null, 'UTF-8') . "\n";
print "5:" . mb_substr($utf8, 10, 0, 'UTF-8') . "\n";
echo "UTF-32:\n";
print "1: " . mb_convert_encoding(mb_substr($utf32, 0, 3, 'UTF-32'), 'UTF-8', 'UTF-32') . "\n";
print "2: " . mb_convert_encoding(mb_substr($utf32, -1, null, 'UTF-32'), 'UTF-8', 'UTF-32') . "\n";
print "3: " . mb_convert_encoding(mb_substr($utf32, -5, 3, 'UTF-32'), 'UTF-8', 'UTF-32') . "\n";
print "4: " . mb_convert_encoding(mb_substr($utf32, 1, null, 'UTF-32'), 'UTF-8', 'UTF-32') . "\n";
print "5:" . mb_convert_encoding(mb_substr($utf32, 10, 0, 'UTF-32'), 'UTF-8', 'UTF-32') . "\n";
echo "UTF-7:\n";
print "1: " . mb_convert_encoding(mb_substr($utf7, 0, 3, 'UTF-7'), 'UTF-8', 'UTF-7') . "\n";
print "2: " . mb_convert_encoding(mb_substr($utf7, -1, null, 'UTF-7'), 'UTF-8', 'UTF-7') . "\n";
print "3: " . mb_convert_encoding(mb_substr($utf7, -5, 3, 'UTF-7'), 'UTF-8', 'UTF-7') . "\n";
print "4: " . mb_convert_encoding(mb_substr($utf7, 1, null, 'UTF-7'), 'UTF-8', 'UTF-7') . "\n";
print "5:" . mb_convert_encoding(mb_substr($utf7, 10, 0, 'UTF-7'), 'UTF-8', 'UTF-7') . "\n";
?>
--EXPECT--
EUC-JP:
1: c6fccbdcb8eca4c7a4b9a1a34555432d
2: 30313233a4b3a4cecab8bbfacef3a4cfc6fccbdcb8eca4c7a4b9a1a34555432d4a50a4f2bbc8a4c3a4c6a4a4a4dea4b9a1a3c6fccbdcb8eca4cfccccc5ddbdada4a4a1a3
3 OK
4 OK: 30313233a4b3a4cecab8bbfacef3a4cf
SJIS:
1: 93fa967b8cea
2: 8142
3: 825582568257
4: 967b8cea8365834c8358836782c582b781423031323334825482558256825782588142
5:
-- Testing illegal SJIS byte 0x80 --
6380
806162
SJIS-2004:
6380
806162
MacJapanese:
6380
806162
SJIS-Mobile#DOCOMO:
6380
806162
SJIS-Mobile#KDDI:
6380
806162
SJIS-Mobile#SoftBank:
6380
806162
ISO-2022-JP:
1: 1b2442212121721b284241
2: 43
3: 1b2442212121721b284241
4: 4142
5: 1b244221721b2842414243
6:
7:
GB-18030:
1: c3dcc2ebd3c3
2: b3c6
3: c2ebc3fbb3c6
4: c2ebd3c3bba7c3fbc3dcc2ebc3fbb3c6c3fbb3c6
5:
HZ:
1: The
2: .
3: ~{!#~}By
4: he next sentence is in GB.~{<:Ky2;S{#,NpJ)l6HK!#~}Bye.
5:
UTF-8:
1: Gre
2: ь
3: йте
4: reek: Σὲ γνωρίζω ἀπὸ τὴν κόψη Russian: Зарегистрируйтесь
5:
UTF-32:
1: Gre
2: ь
3: йте
4: reek: Σὲ γνωρίζω ἀπὸ τὴν κόψη Russian: Зарегистрируйтесь
5:
UTF-7:
1: Gre
2: ь
3: йте
4: reek: Σὲ γνωρίζω ἀπὸ τὴν κόψη Russian: Зарегистрируйтесь
5: