NAME
    URI::Escape::XS - Drop-In replacement for URI::Escape

VERSION
    $Id: README,v 0.5 2013/02/25 17:24:25 dankogai Exp $

SYNOPSIS
      # use it instead of URI::Escape
      use URI::Escape::XS qw/uri_escape uri_unescape/;
      $safe = uri_escape("10% is enough\n");
      $verysafe = uri_escape("foo", "\0-\377");
      $str  = uri_unescape($safe);

      # or use encodeURIComponent and decodeURIComponent
      use URI::Escape::XS;
      $safe = encodeURIComponent("10% is enough\n");
      $str  = decodeURIComponent("10%25%20is%20enough%0A");

      # if you have Net::LibIDN or Net::IDN::Encode installed
      $safe = encodeURIComponentIDN("http://ドメイン名例.jp/dan/");
      $str  = decodeURIComponentIDN("http:%2F%2Fxn--eckwd4c7cu47r2wf.jp%2Fdan%2F");

EXPORT
  by default
    "encodeURIComponent" and "decodeURIComponent"

    "encodeURIComponentIDN" and "decodeURIComponentIDN" if either
    Net::LibIDN or Net::IDN::Encode is available

  on demand
    "uri_escape" and "uri_unescape"

FUNCTIONS
  encodeURIComponent
    Does what JavaScript's encodeURIComponent does.

      $uri = encodeURIComponent("http://www.example.com/");
      # http%3A%2F%2Fwww.example.com%2F

    Note that you cannot customize which characters are escaped. If you
    need to do so, use "uri_escape".
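
    As a rough illustration of the fixed character set, the pure-Perl
    sketch below escapes every byte except the ones JavaScript's
    encodeURIComponent leaves alone (A-Z a-z 0-9 - _ . ! ~ * ' ( )). The
    name encode_uri_component_sketch is hypothetical and is not part of
    this module, and the input is assumed to be a byte string.

```perl
use strict;
use warnings;

# Hypothetical pure-Perl stand-in, not the XS implementation.
# Assumes $bytes is a byte string (no characters above 0xFF).
sub encode_uri_component_sketch {
    my ($bytes) = @_;
    # Percent-encode everything except the characters that JavaScript's
    # encodeURIComponent leaves as-is.
    $bytes =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

print encode_uri_component_sketch("http://www.example.com/"), "\n";
# http%3A%2F%2Fwww.example.com%2F
```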

  decodeURIComponent
    Does what JavaScript's decodeURIComponent does.

      $str = decodeURIComponent("http%3A%2F%2Fwww.example.com%2F");
      # http://www.example.com/

    It decodes not only %HH sequences but also %uHHHH sequences, with
    surrogate pairs correctly decoded.

      $str = decodeURIComponent("%uD869%uDEB2%u5F3E%u0061");
      # \x{2A6B2}\x{5F3E}a

    This function UNCONDITIONALLY returns the decoded string with the utf8
    flag off. To get a utf8-decoded string, use Encode:

      decode_utf8(decodeURIComponent($uri));

    This is the correct behavior because you cannot tell whether the
    decoded bytes are actually UTF-8; they may just as well be in another
    encoding such as ISO-8859-1 or Shift_JIS.
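
    The ambiguity can be seen with the core Encode module alone. In the
    sketch below, the same two bytes (what a percent-decoder would produce
    for "%C3%A9") become different character strings depending on which
    encoding the caller assumes. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode decode_utf8);

# The bytes a percent-decoder would produce for "%C3%A9".
my $bytes = "\xC3\xA9";

# Interpreted as UTF-8: one character, U+00E9.
my $as_utf8 = decode_utf8($bytes);

# Interpreted as ISO-8859-1: two characters, U+00C3 and U+00A9.
my $as_latin1 = decode('ISO-8859-1', $bytes);

printf "UTF-8:      %d character(s)\n", length $as_utf8;
printf "ISO-8859-1: %d character(s)\n", length $as_latin1;
```

    Only the caller knows which interpretation is intended, which is why
    the decoding step is left to the caller.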

  encodeURIComponentIDN
    Same as "encodeURIComponent" except that the host part is encoded in
    punycode. Either Net::LibIDN or Net::IDN::Encode is required to use this
    function.

    URIs with Internationalized Domain Names require two encodings:
    Punycode for the host part and URI escaping for the rest.

    Currently only FULL URIs with "http:" or "https:" are supported.
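
    The two-encoding idea can be sketched in pure Perl as follows. This is
    only an assumption about the general approach, not the module's actual
    code: escape_idn_uri_sketch is a hypothetical name, the host converter
    is supplied by the caller as a code reference (for example one that
    wraps a Punycode library), and the input is assumed to be a byte
    string without a port or userinfo in the host part.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of URI::Escape::XS.
# $to_ascii is a caller-supplied code reference that converts a host
# name to its Punycode form.
sub escape_idn_uri_sketch {
    my ($uri, $to_ascii) = @_;

    # Only full http: / https: URIs are handled, as stated above.
    my ($scheme, $host, $rest) = $uri =~ m{\A(https?)://([^/?\#]*)(.*)\z}s
        or return;

    # Percent-encode everything outside the set that JavaScript's
    # encodeURIComponent leaves as-is.
    my $escape = sub {
        my ($bytes) = @_;
        $bytes =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf('%%%02X', ord($1))/ge;
        return $bytes;
    };

    # Host goes through the Punycode converter; the rest is URI-escaped.
    return "$scheme:" . $escape->('//') . $to_ascii->($host) . $escape->($rest);
}

# Stand-in converter for demonstration only; a real one would call a
# Punycode implementation.
my $fake_to_ascii = sub { 'xn--' . $_[0] };
print escape_idn_uri_sketch('http://example.jp/dan/', $fake_to_ascii), "\n";
# http:%2F%2Fxn--example.jp%2Fdan%2F
```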

  decodeURIComponentIDN
    Same as "decodeURIComponent" except that the host part is decoded from
    punycode. Either Net::LibIDN or Net::IDN::Encode is required to use this
    function.

  uri_escape
    Does exactly the same as URI::Escape::uri_escape() except when a
    utf8-flagged string is fed.

    URI::Escape::uri_escape() croaks and urges you to use
    "uri_escape_utf8()", but that is pointless because a URI itself has no
    such thing as a utf8 flag. The function in this module ALWAYS TREATS the
    string as a byte sequence. That way you can safely use this function
    without worrying about utf8 flags.

    Note this function is NOT EXPORTED by default. That way you can use
    URI::Escape and URI::Escape::XS simultaneously.

  uri_unescape
    Does exactly the same as URI::Escape::uri_unescape() except when %uHHHH
    is fed.

    URI::Escape::uri_unescape() simply ignores %uHHHH sequences, while the
    function in this module decodes them into the corresponding UTF-8 byte
    sequences.

    Like uri_escape, this function is NOT EXPORTED by default.

  Note on the %uHHHH sequence
    With this module the resulting strings never have the utf8 flag on. So
    if you want a perl utf8 string, you have to decode explicitly via
    Encode. Remember: URIs have always been byte sequences, not UTF-8
    characters.

    Had the %uHHHH sequence become standard, you could have safely told
    whether a given URI is in Unicode. But, more fortunately than
    unfortunately, the RFC proposal was rejected, so you cannot tell which
    encoding is used just by looking at the URI.

    <http://en.wikipedia.org/wiki/Percent-encoding#Non-standard_implementations>

    I said fortunately because %uHHHH can be nasty for non-BMP characters.
    Since each %uHHHH can hold only one 16-bit value, you need a *surrogate
    pair* to represent a character at U+10000 or above.
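
    The surrogate-pair arithmetic can be sketched in pure Perl as follows.
    This is a minimal illustration, not the XS implementation: the name
    decode_percent_u_sketch is hypothetical, and lone (unpaired)
    surrogates are not handled specially.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical pure-Perl sketch, not the XS implementation.
# Decodes %uHHHH escapes into UTF-8 bytes, joining surrogate pairs.
sub decode_percent_u_sketch {
    my ($str) = @_;

    # A high surrogate (D800-DBFF) followed by a low one (DC00-DFFF)
    # encodes a single code point at U+10000 or above.
    $str =~ s{%u(D[89AB][0-9A-F]{2})%u(D[C-F][0-9A-F]{2})}{
        my $cp = 0x10000 + ((hex($1) - 0xD800) << 10) + (hex($2) - 0xDC00);
        encode_utf8(chr $cp);
    }gie;

    # Any remaining %uHHHH is treated as a BMP character on its own.
    $str =~ s{%u([0-9A-F]{4})}{ encode_utf8(chr hex $1) }gie;

    return $str;
}

my $bytes = decode_percent_u_sketch("%uD869%uDEB2%u5F3E%u0061");
# $bytes now holds the UTF-8 bytes of \x{2A6B2}\x{5F3E}a
```

    For the pair %uD869%uDEB2 the arithmetic is 0x10000 + (0x69 << 10) +
    0x2B2 = 0x2A6B2.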

    In spite of that, there are a significant number of URIs with %uHHHH
    escapes. Therefore this module supports decoding only.

SPEED
    Since this module uses XS, it is really fast except for
    uri_escape("noop"), that is, input with nothing to escape.

    The regexp used in URI::Escape is really fast when nothing matches but
    slows down significantly when it has to replace parts of the string.
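
    The regexp effect can be observed with the core Benchmark module. The
    sketch below is an assumption-laden illustration, not the script used
    for the figures in this document: unescape_re is a hypothetical
    regexp-based unescaper, timed once on a string full of escapes and
    once on a string with none.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Regexp-based unescaping, similar in spirit to a pure-Perl implementation.
sub unescape_re {
    my ($str) = @_;
    $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $str;
}

my $escaped = 'http://www.google.co.jp/search?q=%E5%B0%8F%E9%A3%BC%E5%BC%BE';
my $plain   = 'www.example.com';

# Run each case for about one CPU second and print a comparison table.
cmpthese(-1, {
    with_escapes    => sub { unescape_re($escaped) },
    without_escapes => sub { unescape_re($plain) },
});
```

    Absolute rates depend on the machine and the perl version, so only the
    ratio between the two rows is meaningful.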

  BENCHMARK
    On Macbook Pro 2GHz, Perl 5.8.8.

     http://www.google.co.jp/search?q=%E5%B0%8F%E9%A3%BC%E5%BC%BE
     ============================================================
     Unescape it
     -----------
                   Rate     U::E U::E::XS
      U::E      58526/s       --     -88%
      U::E::XS 486968/s     732%       --

     Escape it back
     --------------
                   Rate     U::E U::E::XS
      U::E      30046/s       --     -78%
      U::E::XS 136992/s     356%       --

     www.example.com
     ===============
     Unescape it
     -----------
                   Rate     U::E U::E::XS
      U::E     821972/s       --      -4%
      U::E::XS 854732/s       4%       --

     Escape it back
     --------------
                   Rate U::E::XS     U::E
      U::E::XS 522969/s       --      -7%
      U::E     565112/s       8%       --

AUTHOR
    Dan Kogai, "<dankogai+cpan at gmail.com>"

BUGS
    Please report any bugs or feature requests to "bug-uri-escape-xs at
    rt.cpan.org", or through the web interface at
    <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=URI-Escape-XS>. I will
    be notified, and then you'll automatically be notified of progress on
    your bug as I make changes.

SUPPORT
    You can find documentation for this module with the perldoc command.

        perldoc URI::Escape::XS

    You can also look for information at:

    *   AnnoCPAN: Annotated CPAN documentation

        <http://annocpan.org/dist/URI-Escape-XS>

    *   CPAN Ratings

        <http://cpanratings.perl.org/d/URI-Escape-XS>

    *   RT: CPAN's request tracker

        <http://rt.cpan.org/NoAuth/Bugs.html?Dist=URI-Escape-XS>

    *   Search CPAN

        <http://search.cpan.org/dist/URI-Escape-XS>

ACKNOWLEDGEMENTS
    Gisle Aas for URI::Escape

    Koichi Taniguchi for URI::Escape::JavaScript

    Thomas Jacob for Net::LibIDN

    Claus Färber for Net::IDN::Encode

COPYRIGHT & LICENSE
    Copyright 2007-2012 Dan Kogai, all rights reserved.

    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.