Source: simdutf
Section: libs
Homepage: https://simdutf.github.io/simdutf/
Standards-Version: 4.7.3
Vcs-Git: https://salsa.debian.org/debian/simdutf.git
Vcs-Browser: https://salsa.debian.org/debian/simdutf
Maintainer: Mo Zhou <lumin@debian.org>
Uploaders: Jeremy Bícha <jbicha@ubuntu.com>
Build-Depends: debhelper-compat (= 13), cmake, python3,
Package: libsimdutf-dev
Architecture: any
Multi-Arch: same
Section: libdevel
Depends: libsimdutf29 (= ${binary:Version}),
${misc:Depends}
Description: Fast Unicode validation and transcoding - development files
Most modern software relies on the Unicode standard. In memory, Unicode
strings are represented using either UTF-8 or UTF-16. The UTF-8 format is the
de facto standard on the web (JSON, HTML, etc.) and it has been adopted as the
default in many popular programming languages (Go, Zig, Rust, Swift, etc.).
The UTF-16 format is standard in Java, C# and in many Windows technologies.
.
Not all sequences of bytes are valid Unicode strings. It is unsafe to use
Unicode strings in UTF-8 and UTF-16LE without first validating them.
Furthermore, we often need to convert strings from one encoding to another, by
a process called transcoding. For security purposes, such transcoding should
be validating: it should refuse to transcode incorrect strings.
.
This library provides fast Unicode functions such as
.
* ASCII, UTF-8, UTF-16LE/BE and UTF-32 validation, with and without error
identification,
* Latin1 to UTF-8 transcoding,
* Latin1 to UTF-16LE/BE transcoding,
* Latin1 to UTF-32 transcoding,
* UTF-8 to Latin1 transcoding, with or without validation, with and without
error identification,
* UTF-8 to UTF-16LE/BE transcoding, with or without validation, with and
without error identification,
* UTF-8 to UTF-32 transcoding, with or without validation, with and without
error identification,
* UTF-16LE/BE to Latin1 transcoding, with or without validation, with and
without error identification,
* UTF-16LE/BE to UTF-8 transcoding, with or without validation, with and
without error identification,
* UTF-32 to Latin1 transcoding, with or without validation, with and without
error identification,
* UTF-32 to UTF-8 transcoding, with or without validation, with and without
error identification,
* UTF-32 to UTF-16LE/BE transcoding, with or without validation, with and
without error identification,
* UTF-16LE/BE to UTF-32 transcoding, with or without validation, with and
without error identification,
* From a UTF-8 string, compute the size of the Latin1 equivalent string,
* From a UTF-8 string, compute the size of the UTF-16 equivalent string,
* From a UTF-8 string, compute the size of the UTF-32 equivalent string
(equivalent to UTF-8 character counting),
* From a UTF-16LE/BE string, compute the size of the Latin1 equivalent
string,
* From a UTF-16LE/BE string, compute the size of the UTF-8 equivalent
string,
* From a UTF-32 string, compute the size of the UTF-8 or UTF-16LE equivalent
string,
* From a UTF-16LE/BE string, compute the size of the UTF-32 equivalent
string (equivalent to UTF-16 character counting),
* UTF-8 and UTF-16LE/BE character counting,
* UTF-16 endianness change (UTF-16LE/BE to UTF-16BE/LE),
* WHATWG forgiving-base64 (with or without URL encoding) to binary,
* Binary to base64 (with or without URL encoding).
.
The functions are accelerated using SIMD instructions (e.g., ARM NEON, SSE,
AVX, AVX-512, RISC-V Vector Extension, Loongson, POWER, etc.). When your
strings contain hundreds of characters, we can often transcode them at speeds
exceeding a billion characters per second. You should expect high speeds not
only with English strings (ASCII) but also Chinese, Japanese, Arabic, and so
forth. We handle the full character range (including, for example, emojis).
.
The library compiles down to a few hundred kilobytes. Our functions are
exception-free and non-allocating. We have extensive tests and extensive
benchmarks.
.
This package ships the development files.
Package: libsimdutf29
Architecture: any
Multi-Arch: same
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: Fast Unicode validation and transcoding
Most modern software relies on the Unicode standard. In memory, Unicode
strings are represented using either UTF-8 or UTF-16. The UTF-8 format is the
de facto standard on the web (JSON, HTML, etc.) and it has been adopted as the
default in many popular programming languages (Go, Zig, Rust, Swift, etc.).
The UTF-16 format is standard in Java, C# and in many Windows technologies.
.
Not all sequences of bytes are valid Unicode strings. It is unsafe to use
Unicode strings in UTF-8 and UTF-16LE without first validating them.
Furthermore, we often need to convert strings from one encoding to another, by
a process called transcoding. For security purposes, such transcoding should
be validating: it should refuse to transcode incorrect strings.
.
This library provides fast Unicode functions such as
.
* ASCII, UTF-8, UTF-16LE/BE and UTF-32 validation, with and without error
identification,
* Latin1 to UTF-8 transcoding,
* Latin1 to UTF-16LE/BE transcoding,
* Latin1 to UTF-32 transcoding,
* UTF-8 to Latin1 transcoding, with or without validation, with and without
error identification,
* UTF-8 to UTF-16LE/BE transcoding, with or without validation, with and
without error identification,
* UTF-8 to UTF-32 transcoding, with or without validation, with and without
error identification,
* UTF-16LE/BE to Latin1 transcoding, with or without validation, with and
without error identification,
* UTF-16LE/BE to UTF-8 transcoding, with or without validation, with and
without error identification,
* UTF-32 to Latin1 transcoding, with or without validation, with and without
error identification,
* UTF-32 to UTF-8 transcoding, with or without validation, with and without
error identification,
* UTF-32 to UTF-16LE/BE transcoding, with or without validation, with and
without error identification,
* UTF-16LE/BE to UTF-32 transcoding, with or without validation, with and
without error identification,
* From a UTF-8 string, compute the size of the Latin1 equivalent string,
* From a UTF-8 string, compute the size of the UTF-16 equivalent string,
* From a UTF-8 string, compute the size of the UTF-32 equivalent string
(equivalent to UTF-8 character counting),
* From a UTF-16LE/BE string, compute the size of the Latin1 equivalent
string,
* From a UTF-16LE/BE string, compute the size of the UTF-8 equivalent
string,
* From a UTF-32 string, compute the size of the UTF-8 or UTF-16LE equivalent
string,
* From a UTF-16LE/BE string, compute the size of the UTF-32 equivalent
string (equivalent to UTF-16 character counting),
* UTF-8 and UTF-16LE/BE character counting,
* UTF-16 endianness change (UTF-16LE/BE to UTF-16BE/LE),
* WHATWG forgiving-base64 (with or without URL encoding) to binary,
* Binary to base64 (with or without URL encoding).
.
The functions are accelerated using SIMD instructions (e.g., ARM NEON, SSE,
AVX, AVX-512, RISC-V Vector Extension, Loongson, POWER, etc.). When your
strings contain hundreds of characters, we can often transcode them at speeds
exceeding a billion characters per second. You should expect high speeds not
only with English strings (ASCII) but also Chinese, Japanese, Arabic, and so
forth. We handle the full character range (including, for example, emojis).
.
The library compiles down to a few hundred kilobytes. Our functions are
exception-free and non-allocating. We have extensive tests and extensive
benchmarks.
.
This package ships the shared object.
Package: libsimdutf-tools
Architecture: any
Section: misc
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: Fast Unicode validation and transcoding - utilities
Most modern software relies on the Unicode standard. In memory, Unicode
strings are represented using either UTF-8 or UTF-16. The UTF-8 format is the
de facto standard on the web (JSON, HTML, etc.) and it has been adopted as the
default in many popular programming languages (Go, Zig, Rust, Swift, etc.).
The UTF-16 format is standard in Java, C# and in many Windows technologies.
.
Not all sequences of bytes are valid Unicode strings. It is unsafe to use
Unicode strings in UTF-8 and UTF-16LE without first validating them.
Furthermore, we often need to convert strings from one encoding to another, by
a process called transcoding. For security purposes, such transcoding should
be validating: it should refuse to transcode incorrect strings.
.
This library provides fast Unicode functions such as
.
* ASCII, UTF-8, UTF-16LE/BE and UTF-32 validation, with and without error
identification,
* Latin1 to UTF-8 transcoding,
* Latin1 to UTF-16LE/BE transcoding,
* Latin1 to UTF-32 transcoding,
* UTF-8 to Latin1 transcoding, with or without validation, with and without
error identification,
* UTF-8 to UTF-16LE/BE transcoding, with or without validation, with and
without error identification,
* UTF-8 to UTF-32 transcoding, with or without validation, with and without
error identification,
* UTF-16LE/BE to Latin1 transcoding, with or without validation, with and
without error identification,
* UTF-16LE/BE to UTF-8 transcoding, with or without validation, with and
without error identification,
* UTF-32 to Latin1 transcoding, with or without validation, with and without
error identification,
* UTF-32 to UTF-8 transcoding, with or without validation, with and without
error identification,
* UTF-32 to UTF-16LE/BE transcoding, with or without validation, with and
without error identification,
* UTF-16LE/BE to UTF-32 transcoding, with or without validation, with and
without error identification,
* From a UTF-8 string, compute the size of the Latin1 equivalent string,
* From a UTF-8 string, compute the size of the UTF-16 equivalent string,
* From a UTF-8 string, compute the size of the UTF-32 equivalent string
(equivalent to UTF-8 character counting),
* From a UTF-16LE/BE string, compute the size of the Latin1 equivalent
string,
* From a UTF-16LE/BE string, compute the size of the UTF-8 equivalent
string,
* From a UTF-32 string, compute the size of the UTF-8 or UTF-16LE equivalent
string,
* From a UTF-16LE/BE string, compute the size of the UTF-32 equivalent
string (equivalent to UTF-16 character counting),
* UTF-8 and UTF-16LE/BE character counting,
* UTF-16 endianness change (UTF-16LE/BE to UTF-16BE/LE),
* WHATWG forgiving-base64 (with or without URL encoding) to binary,
* Binary to base64 (with or without URL encoding).
.
The functions are accelerated using SIMD instructions (e.g., ARM NEON, SSE,
AVX, AVX-512, RISC-V Vector Extension, Loongson, POWER, etc.). When your
strings contain hundreds of characters, we can often transcode them at speeds
exceeding a billion characters per second. You should expect high speeds not
only with English strings (ASCII) but also Chinese, Japanese, Arabic, and so
forth. We handle the full character range (including, for example, emojis).
.
The library compiles down to a few hundred kilobytes. Our functions are
exception-free and non-allocating. We have extensive tests and extensive
benchmarks.
.
This package ships several command line tools.