File: string.yo

package info (click to toggle)
bobcat 6.02.02-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 13,960 kB
  • sloc: cpp: 18,954; fortran: 5,617; makefile: 2,787; sh: 659; perl: 401; ansic: 26
file content (253 lines) | stat: -rw-r--r-- 11,972 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
includefile(include/header)

COMMENT(manpage, section, releasedate, archive, short name)
manpage(FBB::String)(3bobcat)(_CurYrs_)(libbobcat-dev__CurVers_)
                    (Operations on std::string objects)

manpagename(FBB::String)(Several operations on bf(std::string) objects)

manpagesynopsis()
    bf(#include <bobcat/string>)nl()
    Linking option: tt(-lbobcat)

manpagedescription()
    This class offers facilities for often used transformations on
tt(std::string) objects, which are not supported by the tt(std::string)
class itself. All members of bf(FBB::String) are static.

    Initially this class was derived from bf(std::string). Deriving from
tt(std::string), however, is considerd bad design as tt(std::string) was
not designed as a base-class.

    bf(FBB::String) offers a series of em(static) member functions
providing the facilities originally implemented as non-static members. One of
these members is the (overloaded) tt(split) member, splitting a string into
elements separated by one or more configurable characters. These elements may
contain or consist of double- or single-quoted (sub) strings and escape
characters. Escape characters are converted to their implied byte-values
(e.g., tt(\n) is converted to byte value 10) unless they are embedded in
single-quoted (sub) strings. Quotes surrounding double- and single-quoted
(sub) strings are removed from the elements returned by the tt(split)
members.

includefile(include/namespace)

manpagesection(INHERITS FROM)
    --

manpagesection(ENUMERATIONS)
    itemization(
    itb(Type)
        This enumeration indicates the nature of the content of an element in
the array returned by the overloaded tt(split) members (see below).

            bf(DQUOTE), a subset of the characters in the matching tt(string)
element was delimited by double quotes in the in the string that was parsed by
the tt(split) members.

            bf(DQUOTE_UNTERMINATED), the content of the string that was
parsed by the tt(split) members started at some point with a double quote, but
the matching ending double quote was lacking.

            bf(ESCAPED_END),  the content of the string that was
parsed by the tt(split) members ended in a mere backslash.

            bf(NORMAL), a normal string;

            bf(SEPARATOR), a separator;

            bf(SQUOTE), a subset of the characters in the matching tt(string)
element was delimited by quotes in the in the string that was parsed by
the tt(split) members.

            bf(SQUOTE_UNTERMINATED), the content of the string that was
parsed by the tt(split) members started at some point with a quote, but
the matching ending quote was lacking.
    itb(SplitType)
       This enumeration is used to specify how tt(split) members should
        split the information in the string objects that are passed to these
        members:

       bf(TOK): the tt(split) member acts like the standard bf(C) function
        bf(strtok)(3). The essence here is that no empty elements are
        returned. E.g., a string containing tt("a,,") which is processed using
        the tt(TOK) mode returns a tt(NORMAL) element containing tt("a").

       bf(TOKSEP): the tt(split) member acts like the standard bf(C) function
        bf(strtok)(3), also returning information about encountered
        separators. Since tt(strtok) doesn't return empty elements, tt(TOKSEP)
        uses empty elements to indicate the occurrence of separators. E.g., a
        string containing tt("a,,") which is processed using the tt(TOKSEP)
        mode returns a tt(NORMAL) element containing tt("a"), followed by two
        empty tt(SEPARATOR) elements.

       bf(STR): the tt(split) member acts like the standard bf(C) function
        bf(strstr)(3). The essence here is that empty elements are also
        returned. E.g., a string containing tt("a,,") which is processed using
        the tt(STR) mode returns an element containing tt("a"), followed by
        two empty tt(NORMAL) elements.

       bf(STRSEP): the tt(split) member acts like the standard bf(C) function
        bf(strstr)(3), also returning information about encountered
        separators.  E.g., a string containing tt("a,,") which is processed
        using the tt(STRSEP) mode returns a tt(NORMAL) element containing
        tt("a"), followed by a tt(SEPARATOR) element containing tt(","),
        followed by a tt(NORMAL) empty element, followed by a tt(SEPARATOR)
        element containing tt(","), and finally followed by a tt(NORMAL) empty
        element,
    )

manpagesection(TYPEDEF)

    The bf(typedef SplitPair) represents bf(std::pair<std::string,
String::Type>) and is used by some overloaded bf(split) members (see
below).

manpagesection(STATIC MEMBER FUNCTIONS)
    itemization(

    itb(char const **argv(std::vector<std::string> const &words))
        Returns a pointer to an allocated series of pointers to the bf(C)
strings stored in the vector tt(words). The caller is responsible for
returning the array of pointers to the common pool, but should em(not) delete
the bf(C)-strings to which the pointers point. The last element of the
returned array is guaranteed to be a 0-pointer.

    itb(int casecmp(std::string const &lhs, std::string const &rhs))
        Performs a case-insensitive comparison of the content of two
tt(std::string) objects. A negative value is returned if tt(lhs) should be
ordered before tt(rhs); 0 is returned if the two strings have identical
content; a positive value is returned if the tt(lhs) object should be ordered
beyond tt(rhs).

    itb(std::string escape(std::string const &str,
            char const *series = "'\"\\"))
       Returns a copy of tt(str) in which all characters in tt(series) are
        prefixed by a backslash character.

    itb(std::string join(std::vector<std::string> const &words, char sep))
       The elements of the tt(words) vector are returned as one string,
        separated from each other by the tt(sep) character;

    itb(std::string join(std::vector<SplitPair> const &entries, char sep,
                                                        bool all = true))
       The tt(first) fields of the elements in tt(entries) are returned as one
        string, separated from each other by the tt(sep) character. If the
        parameter tt(all) is specified as tt(false) then elements whose
        tt(second) fields are equal to tt(String::SEPARATOR) are ignored.

    itb(std::string lc(std::string const &str) const)
       Returns a copy of tt(str) in which all letters were transformed to
        lower case letters.

    itb(std::vector<String::SplitPair> split(std::string const &str, SplitType
        mode, char const *sep = " \t"))
       The string tt(str) is split into substrings, separated by any of the
        characters in tt(sep). The substrings are returned in a vector of
        tt(SplitPair) elements, using the specified tt(SplitType) mode
        (cf. the description of the various tt(SplitPair) values and their
        effects in the tt(ENUMERATIONS) section).

    itb(std::vector<String::SplitPair> split(std::string const &str, char
        const *separators = " \t", bool addEmpty = false))
       This member acts like the previous one, using tt(addEmpty == false)
        to select tt(mode TOK) and tt(addEmpty == true) to select tt(mode
        TOKSEP).


    itb(size_t split(std::vector<String::SplitPair> *entries, std::string
        const &str, SplitType mode, char const *sep = " \t"))
       Same functionality as the first tt(split) member, but this member
        stores the tt(SplitPair) elements in the vector pointed at by the
        tt(entries) parameter, first clearing the vector. This member returns
        the new value of tt(entries->size()).

    itb(size_t split(std::vector<String::SplitPair> *entries, std::string
        const &str, char const *sep = " \t", bool addEmpty = false))
       This member acts like the previous one, using tt(addEmpty == false)
        to select tt(mode TOK) and tt(addEmpty == true) to select tt(mode
        TOKSEP).


    itb(std::vector<std::string> split(Type *type, std::string const &str,
        SplitType stype, char const *sep = " \t"))
       Same functionality as the first tt(split) member, but this member
        merely stores the tt(first) fields of the tt(SplitPair) elements in
        the returned vector. The tt(String::Type) variable whose address is
        passed to the tt(type) parameter is set to tt(NORMAL) if the final
        entry was successfully determined; to tt(DQUOTE_UNTERMINATED) if a
        final closing double quote could not be found; to
        tt(SQUOTE_UNTERMINATED) if a final closing single quote could not be
        found; and to tt(ESCAPE_END) if the final character in tt(str) is a
        backslash character.

    itb(std::vector<std::string> split(Type *type, std::string
        const &str, char const *sep = " \t", bool addEmpty = false))
       This member acts like the previous one, using tt(addEmpty == false)
        to select tt(mode TOK) and tt(addEmpty == true) to select tt(mode
        TOKSEP).


    itb(size_t split(std::vector<std::string> *words, std::string const &str,
        SplitType stype, char const *sep = " \t"))
       Same functionality as the first tt(split) member, but this member
        merely stores the tt(first) fields of the encountered tt(SplitPair)
        elements in the vector pointed at by tt(words), first clearing the
        vector. This member returns the new value of tt(words->size()).

    itb(size_t split(std::vector<std::string> *words, std::string const &str,
        char const *sep = " \t", bool addEmpty = false))
       This member acts like the previous one, using tt(addEmpty == false)
        to select tt(mode TOK) and tt(addEmpty == true) to select tt(mode
        TOKSEP).



    itb(std::string trim(std::string const &str))
       Returns a copy of tt(str) from which leading and trailing blank
        characters were removed.

    itb(std::string uc(std::string const &str))
       Returns a copy of tt(str) in which all letters were capitalized.

    itb(std::string unescape(std::string const &str))
       Returns a copy of tt(str) in which the escaped (i.e., prefixed by a
        backslash) characters were interpreted. All standard escape characters
        (tt(\a), tt(\b), tt(\f), tt(\n), tt(\r), tt(\t), tt(\v)) are
        recognized. If an escape character is followed by tt(x) at most the
        next two characters are interpreted as a hexadecimal number. If an
        escape character is followed by an octal digit, then at most the next
        three characters following the em(backslash) are interpreted as an
        octal number. In all other cases, the backslash is removed and the
        character following the backslash is kept.

    itb(std::string urlDecode(std::string const &str))
       URL specifications use tt(%xx) encoding to encode characters, except
        for alpha-numeric characters and the characters tt(- _ .) and tt(~),
        which are kept as-is. Other characters are encode by a tt(%)
        character, followed by two hexadecimal characters representing those
        characters' byte value. E.g., a blank space is encoded as tt(%20), a
        plus character is encoded as tt(%2B). The member tt(urlDecode) returns
        a tt(std::string) containing the decoded characters of the url-encoded
        string that is passed as argument to this member.

    itb(std::string urlEncode(std::string const &str))
       See the member tt(urlDecode): tt(urlEncode) returns a tt(std::string)
        containing the url-encoded characters of the characters in the string
        that is passed as argument to this member.
    )

manpagesection(EXAMPLE)

    verbinclude(../../string/driver/driver.cc)

manpagefiles()
    em(bobcat/string) - defines the class interface

manpageseealso()
    bf(bobcat)(7)

manpagebugs()
    None Reported.

includefile(include/trailer)