NOUSERMACRO(CSV4180)
includefile(include/header)
COMMENT(replace 'classname' by the name of the new class)
COMMENT(manpage, section, releasedate, archive, short name)
manpage(FBB::CSV4180)(3bobcat)(_CurYrs_)(libbobcat-dev__CurVers_)
(CSV4180 converter)
manpagename(FBB::CSV4180)(Converter for comma separated values)
manpagesynopsis()
bf(#include <bobcat/csv>)nl()
Linking option: tt(-lbobcat)
manpagedescription()
Objects of the class bf(CSV4180) can be used to convert series of comma
separated values to the individual separated values (also called `fields'
below). The class implements RFC 4180
(cf. lurl(https://www.ietf.org/rfc/rfc4180.txt), section 2).
According to RFC 4180 lines contain comma separated values: comma separated
values on one line are processed together, as a series of values. The final
comma separated value on a line is not ended by a comma.
Comma separated values may be surrounded by double quotes. However, they
em(must) be surrounded by double quotes in these cases:
itemization(
it() if the values contain commas;
it() if the values contain double quotes (in which case the double quote
is `escaped' by doubling it, e.g., tt("a "" double quote"));
it() if the values extend over multiple lines. E.g.,
verb(
"First line
second line"
)
)
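The quoting rules above can be illustrated from the writer's side. The following is a minimal sketch using only the standard library; tt(quoteField) is a hypothetical helper and is not part of bobcat's interface. It quotes a value only when one of the three cases listed above applies.

```cpp
#include <string>

// Hypothetical helper (not part of bobcat): returns `value' in a form that
// can be written as a single RFC 4180 field.
std::string quoteField(std::string const &value, char fieldSep = ',')
{
    // quoting is required for separators, double quotes and line breaks
    bool mustQuote = value.find(fieldSep) != std::string::npos
                  || value.find_first_of("\"\r\n") != std::string::npos;

    if (not mustQuote)
        return value;

    std::string ret(1, '"');            // opening quote
    for (char ch: value)
    {
        if (ch == '"')
            ret += '"';                 // escape a double quote by doubling it
        ret += ch;
    }
    ret += '"';                         // closing quote
    return ret;
}
```

Values that do not need quoting are returned unchanged, although RFC 4180 also allows them to be quoted.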
Comma separated values may be empty: the following line defines three
empty comma separated values:
verb(
,,
)
The first empty value starts at the beginning of the line, and continues
up to the first comma; the second empty value starts beyond the first comma
and continues up to the second comma; the third empty value starts beyond the
second comma, and continues up to the end of the line. If the line ends in
blank space characters then the third value isn't empty, but contains those
blank space characters.
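How empty values, trailing blanks and quoted values are told apart can be sketched with a small splitter. This is an illustration under stated assumptions, not bobcat's implementation: tt(splitRecord) is a hypothetical function, it expects one complete record (including any embedded line breaks), and it does not diagnose malformed input such as a missing closing quote.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of bobcat): splits one complete record into
// its fields. Surrounding quotes are removed, doubled quotes are undoubled.
std::vector<std::string> splitRecord(std::string const &record,
                                     char fieldSep = ',')
{
    std::vector<std::string> fields(1);     // a record has at least one field
    bool quoted = false;                    // currently inside a quoted field?

    for (std::size_t idx = 0; idx != record.size(); ++idx)
    {
        char ch = record[idx];

        if (quoted)
        {
            if (ch != '"')
                fields.back() += ch;        // separators, line breaks: literal
            else if (idx + 1 != record.size() and record[idx + 1] == '"')
            {
                fields.back() += '"';       // two quotes represent one quote
                ++idx;
            }
            else
                quoted = false;             // closing quote
        }
        else if (ch == '"')
            quoted = true;                  // opening quote
        else if (ch == fieldSep)
            fields.emplace_back();          // start the next (empty) field
        else
            fields.back() += ch;
    }
    return fields;
}
```

With this sketch the record tt(,,) results in three empty strings, and a record consisting of two commas followed by one blank results in two empty strings followed by a string holding that blank.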
By default, values are interpreted as strings. The bf(CSV4180) class also
offers facilities to ignore specific fields, or to ensure that they can be
converted to integral or floating point values. The second constructor (below)
expects a tt(std::string) argument defining how to interpret fields. Options
are:
itemization(
it() tt(I): the field must be convertible to an integral value;
it() tt(D): the field must be convertible to a floating point value;
it() tt(S): the field is a string: it is used as-is;
it() tt(X): the field is omitted from the final set of comma separated
values. E.g., if lines contain three comma separated values and the
specification tt("SXS") is used, then each line results in two
values: the first and the third of the three values encountered on
that line.
it() tt(-): synonym of tt(X).
)
In addition, field specifications may contain blank spaces, which are
ignored.
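The effect of a field specification can be modelled with a few standard library calls. The sketch below is an assumption-laden illustration, not bobcat's code: the names tt(normalizeSpecs), tt(convertible) and tt(applySpecs) are hypothetical, specification characters are assumed to be upper case only, and the conversion checks (via tt(std::stoll) and tt(std::stod)) may accept or reject some inputs differently from the library.

```cpp
#include <cctype>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helpers (not part of bobcat) modelling the specification
// characters described above.

// Removes blanks, maps '-' to 'X', throws on any other unknown character.
std::string normalizeSpecs(std::string const &specs)
{
    std::string ret;
    for (char ch: specs)
    {
        if (std::isspace(static_cast<unsigned char>(ch)))
            continue;                       // blanks are ignored
        if (ch == '-')
            ch = 'X';                       // '-' is a synonym of 'X'
        if (std::string{"IDSX"}.find(ch) == std::string::npos)
            throw std::invalid_argument{"unknown field specification"};
        ret += ch;
    }
    return ret;
}

// True if the entire text can be converted to the type implied by `spec'.
bool convertible(char spec, std::string const &text)
{
    if (spec != 'I' and spec != 'D')
        return true;                        // S and X fields are not checked
    try
    {
        std::size_t used = 0;
        if (spec == 'I')
            std::stoll(text, &used);
        else
            std::stod(text, &used);
        return used == text.size();         // reject trailing characters
    }
    catch (std::exception const &)
    {
        return false;                       // empty or non-numeric text
    }
}

// Applies normalized specifications to the fields of one line. Returns
// false if the field count or a conversion check fails; otherwise the
// fields that are kept are stored in `dest'.
bool applySpecs(std::string const &specs,
                std::vector<std::string> const &fields,
                std::vector<std::string> &dest)
{
    if (fields.size() != specs.size())
        return false;

    std::vector<std::string> kept;
    for (std::size_t idx = 0; idx != specs.size(); ++idx)
    {
        if (not convertible(specs[idx], fields[idx]))
            return false;
        if (specs[idx] != 'X')
            kept.push_back(fields[idx]);    // X fields are dropped
    }
    dest = std::move(kept);
    return true;
}
```

In this model the specification tt("SXS") applied to the three fields a, b and c keeps a and c, matching the description of tt(X) above.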
When processing comma separated values the first line may be considered a
em(header) line. tt(X) specifications also apply to header lines, but all
other fields of a header line are handled as tt(S)-type fields. In addition,
when processing multiple input lines all non-header lines are made available
in a vector of vectors of fields, whereas the header line itself can be
accessed via a dedicated member (tt(header())).
includefile(include/namespace)
manpagesection(INHERITS FROM)
-
manpagesection(CONSTRUCTORS)
itemization(
itb(explicit CSV4180(size_t nFields = 0, bool header = false,
char fieldSep = ','))
The first parameter specifies the number of fields that must be
present on input lines. When using the default value the number of
fields encountered on the first line determines the number of fields
that must be present on subsequent lines. If the second parameter is
tt(true) then the first line is interpreted as the header line. The
third parameter specifies the character separating the fields. By
default it's a comma, but sometimes (not part of the RFC) a semicolon
is used. By specifying tt(fieldSep) any character other than a comma
can be used as field separator.
itb(explicit CSV4180(std::string const &specs, bool header = false,
char fieldSep = ','))
The first parameter defines the number and types of the comma separated
values on input lines. Specifications can be
quote(
itemization(
it() tt(D): the field must be convertible to a floating point
value;
it() tt(I): the field must be convertible to an integral value;
it() tt(S): the field is left as-is, and can be retrieved as a
tt(std::string).
it() tt(X) or tt(-): the field is ignored and is not stored inside
the bf(CSV4180) object.
it() blank space characters are ignored.
)
)
An exception is thrown when characters other than the ones mentioned
above are encountered.
If tt(I) or tt(D) fields cannot be properly converted, or if a line
contains too few or too many comma separated values the input stream's
fail status is set.
The last two parameters are interpreted as the last two parameters of
the previous constructor.
)
Copy and move constructors (and assignment operators) are available.
manpagesection(OVERLOADED OPERATORS)
itemization(
itb(std::istream &operator>>(std::istream &in, CSV4180 &csv))
One line of text is extracted from tt(in) and processed by the tt(csv)
object. The tt(csv) object may or may not already contain converted
comma separated values. When empty, the first line is processed
according to the specifications provided to the tt(csv) object at
construction time. Otherwise, the comma separated values on extracted
lines must match the number and types of the fields, as specified by
the tt(csv) object. When input lines do not match these specifications
tt(in's) fail status is set.
)
manpagesection(MEMBER FUNCTIONS)
itemization(
itb(void clear(size_t nFields = 0))
The internally stored data (referred to by the tt(data), tt(header), and
tt(lastLine) members) are erased. By default, the required number of
CSV fields is reset to 0, but it can be set to a specific value by
specifying a value for the tt(nFields) parameter.
itb(std::vector<std::vector<std::string>> const &data() const)
A reference to the vector of vectors of fields stored inside the
bf(CSV4180) object is returned. The vector returned by tt(data) does
not contain the header line. If a header line was requested it can be
retrieved from the tt(header()) member.
itb(std::vector<std::string> const &header() const)
If the constructor's tt(header) parameter was specified as tt(true)
then this member returns the fields encountered on the first line that
was processed by the tt(read1) member. Otherwise, tt(header) returns
a reference to an empty vector.
itb(std::string const &lastLine() const)
A reference to the last line that was successfully extracted from the
input stream by the tt(read1) member is returned. So once the lines
containing the comma separated values have been processed, the next
line on the input stream can be obtained from this member.
itb(size_t nValues() const)
After successfully calling tt(read1) for the first time this member
returns the required number of comma separated values that must be
encountered on subsequent input lines.
itb(size_t read(std::istream &in, size_t nLines = 0))
By default, all lines of tt(in) are read and are processed by the
tt(read1) member. By specifying a non-zero value for the tt(nLines)
parameter the specified number of lines is read from tt(in).
Reading stops once tt(in's) status is not tt(good). When tt(nLines) is
specified as zero, then tt(in's) status flags are cleared. The number
of successfully processed lines is returned.
itb(std::istream &read1(std::istream &in))
One line is read from tt(in) and is parsed for its comma separated
values. If parsing fails, tt(in's fail) status is set. After
successfully calling tt(read1) for the first time all subsequent lines
read by tt(read1) must have the same number of comma separated values
as encountered when calling tt(read1) for the first time. The parsed
fields are stored in a vector of tt(std::string) objects, and that
vector is added to the vector of vectors of strings that is returned
by the tt(data) member.
itb(std::vector<std::vector<std::string>> release())
The vector of vectors of fields stored inside the bf(CSV4180) object is
returned. After calling tt(release) the internally stored vector of
vectors of fields is empty. The vector returned by tt(release) does
not contain the header line. If a header line was requested it can be
retrieved from the tt(header()) member. Note that this member does not
reset the number of expected fields for subsequently processed
CSV-lines. If that's what you want, call tt(clear) after calling
tt(release).
)
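The difference between tt(release) and tt(clear) described above can be modelled with a small stand-in type. tt(Store) and its data members are hypothetical and only mirror the documented behavior: releasing hands over the stored lines but keeps the expected field count, whereas clearing resets both.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in (not bobcat's implementation) for the stored lines
// and the expected number of fields.
struct Store
{
    std::vector<std::vector<std::string>> data;
    std::size_t nValues = 0;

    // Hands the stored lines to the caller and leaves `data' empty;
    // the expected field count is deliberately left untouched.
    std::vector<std::vector<std::string>> release()
    {
        return std::exchange(data, {});
    }

    // Erases the stored lines and resets the expected field count.
    void clear(std::size_t nFields = 0)
    {
        data.clear();
        nValues = nFields;
    }
};
```

tt(std::exchange) is used here so that the stored vector is guaranteed to be empty afterwards, rather than merely left in a moved-from state.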
manpagesection(EXAMPLE)
verbinsert(//man1 ../../csv4180/driver/driver.cc)
verbinsert(//man2 ../../csv4180/driver/driver.cc)
verbinsert(//man3 ../../csv4180/driver/driver.cc)
manpagefiles()
em(bobcat/csv) - defines the class interface
manpageseealso()
bf(bobcat)(7)
manpagebugs()
None Reported.
includefile(include/trailer)