File: 1003_utf8_flag.patch

libre-engine-re2-perl 0.18+ds-3
Description: force treating scanned string as UTF-8 when compiled regex is UTF-8
 re::engine::RE2 is documented (in the BUGS section of its README)
 as not handling UTF-8 correctly.
 .
 Without this patch,
 scanning a Latin-1 string with a UTF-8 regex reports wrong positions
 or potentially crashes,
 and misses e.g. "£" (which Perl's own regex engine matches correctly).
 .
 With this patch,
 scanning a UTF-8 string with a UTF-8 regex should behave correctly,
 but still misses e.g. "£".
 .
 Scanning should be safer and more correct for UTF-8 strings;
 the only known side effect is slower scanning of non-UTF-8 strings,
 because the string is always upgraded to UTF-8.
 For faster scanning of a string known to be ASCII, use an ASCII regex.
Origin: https://github.com/dgl/re-engine-RE2/pull/8
Author: Todd Richmond <trichmond@proofpoint.com>
Bug: https://rt.cpan.org/Public/Bug/Display.html?id=116747
Bug: https://rt.cpan.org/Public/Bug/Display.html?id=131618
Last-Update: 2023-06-21
---
This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
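
Illustration (not part of the applied change): the description mentions
wrong positions and the cost of always upgrading the scanned string to
UTF-8. The sketch below shows, under simplifying assumptions, why both
happen: each Latin-1 byte >= 0x80 becomes two bytes in UTF-8, so the
conversion copies the whole string and byte offsets after such a character
shift. The helper names latin1_to_utf8 and utf8_byte_offset are
hypothetical; they are not functions of Perl, RE2, or re::engine::RE2, and
this is not a claim about how the engine performs the upgrade.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: re-encode a Latin-1 byte string as UTF-8.
// Bytes below 0x80 are copied unchanged; every other byte expands to
// a two-byte sequence, so the result can be up to twice as long.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // lead byte
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // continuation byte
        }
    }
    return out;
}

// Hypothetical helper: byte offset, in the upgraded string, of the
// character that sits at char_index in the Latin-1 input.
std::size_t utf8_byte_offset(const std::string& latin1, std::size_t char_index) {
    assert(char_index <= latin1.size());
    std::size_t offset = 0;
    for (std::size_t i = 0; i < char_index; ++i) {
        offset += (static_cast<unsigned char>(latin1[i]) < 0x80) ? 1 : 2;
    }
    return offset;
}
```

For the Latin-1 input "a£b" (bytes 61 A3 62), the upgraded string is
61 C2 A3 62, so "b" moves from byte offset 2 to byte offset 3. A matcher
that assumes one encoding while the buffer holds the other would report
positions that are off by exactly this kind of shift.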
--- a/re2_xs.cc
+++ b/re2_xs.cc
@@ -101,10 +101,12 @@
     // XXX: Need to compile two versions?
     /* The pattern is not UTF-8. Tell RE2 to treat it as Latin1. */
 #ifdef RXf_UTF8
-    if (!(flags & RXf_UTF8))
+    if (flags & RXf_UTF8)
 #else
-    if (!SvUTF8(pattern))
+    if (SvUTF8(pattern))
 #endif
+        extflags |= RXf_MATCH_UTF8;
+    else
         options.set_encoding(RE2::Options::EncodingLatin1);
 
     options.set_log_errors(false);
@@ -311,7 +313,7 @@
     RE2::Options options;
     options.Copy(previous->options());
 
-    return new RE2 (re2::StringPiece(RX_WRAPPED(rx), RX_WRAPLEN(rx)), options);
+    return new RE2 (previous->pattern(), options);
 }
 
 SV *