(* Bindings for Perl-compatible Regular Expressions.
* Copyright (C) 2017 Red Hat Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*)
(** Lightweight bindings for the PCRE library.
Note this is {i not} Markus Mottl's ocaml-pcre, and doesn't
work like that library.
To match a regular expression:
{v
let re = PCRE.compile "(a+)b"
...
if PCRE.matches re "ccaaaabb" then (
let whole = PCRE.sub 0 in (* returns "aaaab" *)
let first = PCRE.sub 1 in (* returns "aaaa" *)
...
)
v}
Note that there is implicit global state stored between the
call to {!matches} and any subsequent calls to {!sub}. This is
stored in thread-local storage, so it is safe provided there are
no other calls to {!matches} in the same thread in between.
*)
exception Error of string * int
(** PCRE error raised by various functions.
The string is the printable error message.
The integer is one of the negative [PCRE_*] error codes
(see pcreapi(3) for a full list), {i or} one of the positive
error codes from [pcre_compile2]. It may also be 0 if there
was no error code information. *)
type regexp
(** The type of a compiled regular expression. *)
val compile : ?anchored:bool -> ?caseless:bool -> ?dotall:bool -> ?extended:bool -> ?multiline:bool -> string -> regexp
(** Compile a regular expression. This can raise {!Error}.
The flags [?anchored], [?caseless], [?dotall], [?extended],
[?multiline]
correspond to the [pcre_compile] flags [PCRE_ANCHORED] etc.
See pcreapi(3) for details of what they do.
All flags default to false. *)
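(* Usage sketch (illustrative only; the pattern and subject strings
 * below are made up for this example):
 *
 *   let re = PCRE.compile ~caseless:true "^hello" in
 *   if PCRE.matches re "Hello, world" then print_endline "matched"
 *
 * With [~caseless:true] the pattern is expected to match regardless
 * of letter case, as with PCRE_CASELESS. A malformed pattern makes
 * [compile] raise {!Error}, which a caller might handle like this:
 *
 *   try ignore (PCRE.compile "(unclosed")
 *   with PCRE.Error (msg, code) ->
 *     Printf.eprintf "bad regexp: %s (code %d)\n" msg code
 *)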
val matches : ?offset:int -> regexp -> string -> bool
(** Test whether the regular expression matches the string. This
returns true if the regexp matches or false otherwise.
This also saves any matched substrings in thread-local storage
until either the next call to {!matches} in the current thread
or the thread/program exits. You can call {!sub} to return
these substrings.
The optional [?offset] argument changes the start of the search,
which by default is at the beginning of the string (position 0).
This can raise {!Error} if PCRE returns an error. *)
val sub : int -> string
(** Return the nth substring (capture) matched by the previous call
to {!matches} in the current thread.
If [n = 0] it returns the whole matching part of the string.
If [n >= 1] it returns the nth substring.
If there was no nth substring then this raises [Not_found].
This can also raise {!Error} for other PCRE-related errors. *)
val subi : int -> int * int
(** Return the nth substring (capture) matched by the previous call
to {!matches} in the current thread.
This is the same as {!sub} but instead of copying the
substring out, it returns the indexes into the original string
of the first character of the substring and the first
character after the substring.
(See pcreapi(3) section "How pcre_exec() returns captured substrings"
for exact details).
If there was no nth substring then this raises [Not_found]. *)
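(* Usage sketch (illustrative; [subj] is a made-up subject string).
 * Assuming the returned pair is (start, end) as described above,
 * the capture can be rebuilt from the original string with
 * [String.sub]:
 *
 *   let subj = "ccaaaabb" in
 *   let re = PCRE.compile "(a+)b" in
 *   if PCRE.matches re subj then (
 *     let start, end_ = PCRE.subi 1 in
 *     let first = String.sub subj start (end_ - start) in
 *     (* [first] should equal [PCRE.sub 1], here "aaaa" *)
 *     ...
 *   )
 *)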
val replace : ?global:bool -> regexp -> string -> string -> string
(** [replace ?global patt subst subj] performs a search and replace
on the subject string ([subj]). Where [patt] matches the
string, [subst] is substituted. This works similarly to the
Perl function [s///].
The [?global] flag defaults to false, so only the first
instance of [patt] in the string is replaced. If set to true
then every instance of [patt] in the string is replaced.
Note that this function does not allow backreferences.
Any captures in [patt] are ignored. *)
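(* Usage sketch (illustrative; the strings are made up, and the
 * results in the comments are those expected from the description
 * above, not from a recorded run):
 *
 *   let re = PCRE.compile "a" in
 *   let once = PCRE.replace re "X" "banana" in              (* "bXnana" *)
 *   let all = PCRE.replace ~global:true re "X" "banana" in  (* "bXnXnX" *)
 *   ...
 *)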
val split : regexp -> string -> string * string
val nsplit : ?max:int -> regexp -> string -> string list
(** [split patt subj] splits the string at the first occurrence
of the regular expression [patt], returning the parts of the
string before and after the match (the matching part is not
returned). If the pattern does not match then the whole
input is returned in the first string, and the second string
is empty.
[nsplit patt subj] is the same but the string is split
on every occurrence of [patt]. Note that if the pattern
matches at the beginning or end of the string, then an
empty string element will be returned at the beginning or
end of the list.
[nsplit] has an optional [?max] parameter which controls
the maximum length of the returned list. The final element
contains the remainder of the string. *)
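(* Usage sketch (illustrative; the strings are made up, and the
 * results in the comments are those expected from the description
 * above, not from a recorded run):
 *
 *   let re = PCRE.compile "," in
 *   let before, after = PCRE.split re "a,b,c" in   (* "a", "b,c" *)
 *   let parts = PCRE.nsplit re "a,b,c" in          (* ["a"; "b"; "c"] *)
 *   let capped = PCRE.nsplit ~max:2 re "a,b,c" in  (* ["a"; "b,c"] *)
 *   let edge = PCRE.nsplit re ",a" in              (* [""; "a"] *)
 *   ...
 *)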