.\" hey, Emacs: -*- nroff -*-
.\" This program is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation; either version 2 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program; if not, write to the Free Software
.\" Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
.\"
.\" Please update the above date whenever this man page is modified.
.\"
.\" Some roff macros, for reference:
.\" .nh disable hyphenation
.\" .hy enable hyphenation
.\" .ad l left justify
.\" .ad b justify to both left and right margins (default)
.\" .nf disable filling
.\" .fi enable filling
.\" .br insert line break
.\" .sp <n> insert n+1 empty lines
.\" for manpage-specific macros, see man(7)
.TH "FFE" "1" "2007-05-22" "" ""
.SH "NAME"
ffe \- flat file extractor
.SH "SYNOPSIS"
.B ffe
.RI [ options ]...
.SH "DESCRIPTION"
\fBffe\fP is a program for extracting fields from flat file records and displaying them in different formats. \fBffe\fP relies on the configuration file to control input file structure and the output format.
.SH "OPTIONS"
\fBffe\fP accepts the following options:
.TP
.BR \-c ", " \-\-configuration=\fIfile\fP
Read the configuration from \fIfile\fP, default is ~/.fferc.
.TP
.BR \-s ", " \-\-structure=\fISTRUCTURE\fR
Use the structure \fISTRUCTURE\fR for input file.
.TP
.BR \-p ", " \-\-print=\fIFORMAT\fR
Use output format \fIFORMAT\fR for printing. All printing can be suppressed using format 'no'.
.TP
.BR \-o ", " \-\-output=\fINAME\fP
Write output to \fINAME\fP instead of standard output.
.TP
.BR \-f ", " \-\-field\-list=\fILIST\fP
Print only fields and constants defined in comma separated list \fILIST\fP.
.TP
.BR \-e ", " \-\-expression=\fIEXPRESSION\fR
Print only those records for which the \fIEXPRESSION\fR evaluates to true.
.TP
.BR \-a ", " \-\-and
Expressions are combined with logical and, default is logical or.
.TP
.BR \-v ", " \-\-invert\-match
Print only those records which don't match the expression.
.TP
.BR \-l ", " \-\-loose
An invalid input line does not cause the program to abort.
.TP
.BR \-r ", " \-\-replace=\fIFIELD\fR=\fIVALUE\fR
Replace the contents of \fIFIELD\fR with \fIVALUE\fR in output. \fIVALUE\fR can contain the same directives as the output option \fBdata\fR.
.TP
.BR \-? ", " \-\-help
List all available options and their meanings.
.TP
.BR \-V ", " \-\-version
Show version of program.
.BR
.PP
All remaining arguments are names of input files;
if no input files are specified, then the standard input is read.
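.PP
As an illustration of how these options combine, consider the following sketch. All names in it are hypothetical: the configuration file \fIpersonnel.fferc\fR, the structure \fIpersonnel\fR, the fields \fIFirstName\fR and \fIAge\fR and the input files are assumptions made for the example, not files shipped with \fBffe\fR.
.PP
.nf
# print two fields from two input files, tolerating invalid lines
ffe \-c personnel.fferc \-s personnel \-f FirstName,Age \-l january.txt february.txt
# read standard input and replace the contents of Age in the output
ffe \-c personnel.fferc \-s personnel \-r Age=00 < personnel.txt
.fi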
.SS Expressions (option \-e, \-\-expression)
Expressions can be used to select specific records by comparing field
values.
Expressions can be defined as:
.TP
.BR \fIfield\fR\fB=\fR\fIvalue\fR
A record will be selected if the field \fIfield\fR is equal to the value \fIvalue\fR.
.TP
.BR \fIfield\fR\fB^\fR\fIvalue\fR
A record will be selected if the field \fIfield\fR starts with the value \fIvalue\fR.
.TP
.BR \fIfield\fR\fB~\fR\fIvalue\fR
A record will be selected if the field \fIfield\fR contains the value \fIvalue\fR.
.TP
.BR \fIfield\fR\fB!\fR\fIvalue\fR
A record will be selected if the field \fIfield\fR is not equal to the value \fIvalue\fR.
.TP
.BR \fIfield\fR\fB?\fR\fIvalue\fR
A record will be selected if the field \fIfield\fR matches the regular expression in \fIvalue\fR.
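.PP
The operators above might be used as in the following sketch. The structure \fIpersonnel\fR, its fields and the file \fIpersonnel.txt\fR are hypothetical. The expressions are enclosed in double quotes so that the shell does not interpret characters such as '?'.
.PP
.nf
# LastName starts with "Ti" or Age equals 41 (default: logical or)
ffe \-s personnel \-e "LastName^Ti" \-e "Age=41" personnel.txt
# FirstName contains "ar" and Age equals 41 (\-a: logical and)
ffe \-s personnel \-a \-e "FirstName~ar" \-e "Age=41" personnel.txt
# records whose FirstName does not match the regular expression ^J
ffe \-s personnel \-v \-e "FirstName?^J" personnel.txt
.fi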
.SH "FFE CONFIGURATION"
\fBffe\fR uses the configuration file for extracting fields from the input file and for formatting the fields for output. Every line of the input file is considered as a record. The default configuration file is \fB~/.fferc\fR, but another file can be given with the '\-c' option.
.PP
The configuration file for \fBffe\fR is a text file. The file may contain empty lines. Commands are case\-sensitive. Comments begin with the #\-character and end at the end of the line. The \fIstring\fR and \fIchar\fR definitions can be enclosed in double quotation '"' characters. \fIchar\fR is a single character. \fIstring\fR and \fIchar\fR can contain the following escape codes: '\ea', '\eb', '\et', '\en', '\ev', '\ef', '\er', '\e"' and '\e#'. Character '\e' can be escaped as '\e\e'.
.SS Input file structure
.PP
Input file structures are defined with keyword \fBstructure\fR:
.PP
\fBstructure\fR \fIname\fR {options...}
.PP
Each option must be terminated by a newline. The options are:
.PP
.TP
\fBtype\fR \fBfixed\fR|\fBseparated\fR [\fIchar\fR] [\fB*\fR]
Fields in the input are fixed length fields or separated by \fIchar\fR. If * is defined, multiple sequential separators are considered as one. Default separator is comma.
.TP
\fBquoted\fR [\fIchar\fR]
Fields may be quoted with \fIchar\fR, default quotation mark is double quotation mark '"'.
A quotation mark is assumed to be escaped as \e\fIchar\fR or by doubling the mark as \fIchar\fR\fIchar\fR in input. Unescaped quotation marks are not preserved in output.
.TP
\fBheader\fR \fBfirst\fR|\fBall\fR|\fBno\fR
Controls the occurrence of the header line. Default is no. If set to first or all, the first line of the first input file is treated as a header line containing the names of the fields. First means that only the first file has a header; all means that all files have a header, although the names are still taken from the header of the first file. The header line is handled according to the record definition, meaning that the name positions, separators etc. are the same as for fields.
.TP
\fBoutput\fR \fIname\fR
All records belonging to this structure are printed according to output format \fIname\fR. Default is to use the output named 'default'.
.TP
\fBrecord\fR \fIname\fR {options...}
Defines one record for a structure. A structure can contain several record types.
.SS Record options:
.PP
.TP
\fBid\fR \fIposition\fR \fIstring\fR
Identifies a record in the input file. Records are identified by \fIstring\fR in input record position \fIposition\fR. For fixed length input the \fIposition\fR is the byte position in the input record, and for separated input the \fIposition\fR means the \fIposition\fR'th field of the input record. Positions start from one. \fBid\fRs are required only if the input structure contains several record types with equal lengths or field counts.
A record definition can contain several \fBid\fRs, in which case all of them must match the input line (\fBid\fRs are combined with logical and).
.TP
\fBfield\fR \fIname\fR|\fBFILLER\fR|\fB*\fR [\fIlength\fR]|\fB*\fR [\fIlookup\fR]
Defines one field in the input structure. \fIlength\fR is mandatory for a fixed length input structure.
Length is also used for printing fields in fixed length format using the \fB%D\fR directive. The order of fields in the configuration file is significant: it defines the field order in a record.
If '*' is given instead of the name, then the name will be the ordinal number of the field, or, if the 'header' option has the value 'first' or 'all', the name of the field will be taken from the header line (first line of the input).
If \fIlookup\fR is defined, then the field's contents are used to make a lookup in lookup table \fIlookup\fR. If length is not needed (separated format) but lookup is needed, use an asterisk (*) in place of the length definition.
Naming the field FILLER causes the field not to be printed in output.
.TP
\fBfields\-from\fR \fIrecord\fR
Fields for this record are the same as for record \fIrecord\fR.
.TP
\fBoutput\fR \fIname\fR
This record is printed according to output format \fIname\fR. Default is to use the output format defined in the structure.
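.PP
The structure and record options above could be combined as in the following sketch. It assumes a hypothetical semicolon\-separated input in which the first field tells the record type; the names \fIorders\fR, \fIorder\fR, \fIcancellation\fR and the field names are invented for the example. Because both record types have the same field count, each one carries an \fBid\fR.
.PP
.nf
structure orders {
    # fields are separated by semicolons and may be quoted with '"'
    type separated ";"
    quoted
    header no
    record order {
        # first field of the input line is "O"
        id 1 "O"
        field RecType
        field Customer
        field Amount
    }
    record cancellation {
        # first field of the input line is "C"
        id 1 "C"
        fields\-from order
    }
}
.fi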
.SS Output definitions
.PP
There can be several output definitions in the configuration file. The required format can be selected with the '\-p' option. The default format is named 'default'.
.TP
\fBoutput\fR \fIname\fR|\fBdefault\fR {options...}
Defines one output format. Output named as 'default' will be used if none is defined for structure or record, or none is given with option '\-p'.
.SS Output options:
.PP
Pictures in output definition can contain printf\-style %\-directives:
.LP
.TP
\fB%f\fR
Name of the input file.
.TP
\fB%s\fR
Name of the current structure.
.TP
\fB%r\fR
Name of the current record.
.TP
\fB%o\fR
Input record number in current file.
.TP
\fB%O\fR
Input record number starting from the first file.
.TP
\fB%n\fR
Field name.
.TP
\fB%t\fR
Field contents, without leading and trailing whitespace.
.TP
\fB%d\fR
Field contents.
.TP
\fB%D\fR
Field contents, right padded to the field length (requires length definition for the field).
.TP
\fB%l\fR
Value from lookup.
.TP
\fB%L\fR
Value from lookup, right padded to the field length (requires length definition for the field).
.TP
\fB%e\fR
Prints nothing, but still causes the "field empty" check to be performed. Can be used when only the names of non\-empty fields should be printed.
.TP
\fB%p\fR
Field's start position in a record. For a fixed structure this is the field's byte position in the input line, and for a separated structure this is the ordinal number of the field. Starts from one.
.TP
\fB%%\fR
Percent sign.
.TP
\fBfile_header\fR \fIpicture\fR
Picture is printed once before file contents.
.TP
\fBfile_trailer\fR \fIpicture\fR
Picture is printed once after file contents.
.TP
\fBheader\fR \fIpicture\fR
If defined, then the header line describing the field names is printed before records. Every field name is printed according to the \fIpicture\fR using the same separator and field lengths as defined for the fields. \fIPicture\fR can contain only the \fB%n\fR directive.
.TP
\fBdata\fR \fIpicture\fR
Field contents are printed according to \fIpicture\fR.
.TP
\fBlookup\fR \fIpicture\fR
If a field is mapped to a lookup table, this picture will be used instead of the picture from the \fBdata\fR option. If not defined, the picture from \fBdata\fR will be used.
.TP
\fBseparator\fR \fIstring\fR
All fields are terminated by \fIstring\fR, except the last field of the record. Default is not to print separator.
.TP
\fBrecord_header\fR \fIpicture\fR
All records are started by \fIpicture\fR. Default is not to print header.
.TP
\fBrecord_trailer\fR \fIpicture\fR
All records are ended with \fIpicture\fR. Default is newline.
.TP
\fBjustify\fR \fBleft\fR|\fBright\fR|\fIchar\fR
Fields are left or right justified. \fIchar\fR justifies output according to the first occurrence of \fIchar\fR in the data picture. Default is left.
.TP
\fBindent\fR \fIstring\fR
Record contents are indented by \fIstring\fR. Field contents are indented by two times the \fIstring\fR. Default is not to indent.
.TP
\fBfield\-list\fR \fIname1\fR,\fIname2\fR,...
Only fields or constants named \fIname1\fR,\fIname2\fR,... are printed; this has the same effect as the '\-f' option. Default is to print all the fields. Fields are printed in the order in which they are listed.
.TP
\fBno\-data\-print\fR \fByes\fR|\fBno\fR
When set as no and \fBfield\-list\fR is defined, suppresses printing of \fBrecord_header\fR and \fBrecord_trailer\fR in case where current record contains none of the fields defined in \fBfield\-list\fR.
.TP
\fBfield\-empty\-print\fR \fByes\fR|\fBno\fR
When set as no, nothing is printed for fields which consist entirely of characters from \fBempty\-chars\fR. If none of the fields of a record are printed then the printing of \fBrecord_trailer\fR is also suppressed. Default is yes.
.TP
\fBempty\-chars\fR \fIstring\fR
\fIstring\fR defines the set of characters which make up an "empty" field. Default is " \ef\en\er\et\ev" (space, form\-feed, newline, carriage return, horizontal tab and vertical tab).
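.PP
A minimal sketch of an output definition using some of the options above follows. The name \fIlisting\fR is hypothetical. The definition prints one name=value line per field, aligned on the '=' character, and skips empty fields.
.PP
.nf
output listing {
    # one heading line per record: record name, record number, file name
    record_header "%r, record %o of file %f\en"
    # field name and trimmed contents, one field per line
    data "%n=%t\en"
    # align the output on the first '=' of the data picture
    justify "="
    indent " "
    # do not print fields consisting only of empty\-chars
    field\-empty\-print no
}
.fi
.PP
Such a definition would be selected with '\-p listing' or with the \fBoutput\fR option of a structure or record.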
.SS Lookup definitions
Several lookup tables can be defined in a configuration.
.TP
\fBlookup\fR \fIname\fR {options...}
Defines one lookup table.
.SS Lookup options:
.TP
\fBsearch\fR \fBexact\fR|\fBlongest\fR
Defines the search type for lookup table.
.TP
\fBdefault\-value\fR \fIvalue\fR
If the lookup is not successful, \fIvalue\fR from this option is printed for directives \fB%l\fR and \fB%L\fR.
.TP
\fBpair\fR \fIkey\fR \fIvalue\fR
One key/value pair for the lookup table.
.TP
\fBfile\fR \fIname\fR [\fIseparator\fR]
Key/value pairs are read from file \fIname\fR. Every line is considered as a key/value pair separated
by \fIseparator\fR. Default separator is semicolon.
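.PP
As a sketch of how a lookup table ties into a record and an output definition, assume a hypothetical two\-byte field \fIDept\fR holding a department code. The table name \fIdepartment\fR and its keys and values are invented for the example.
.PP
.nf
lookup department {
    search exact
    default\-value "Unknown"
    pair 10 "Sales"
    pair 20 "Support"
}

# inside a record definition: map the field Dept to the table
field Dept 2 department

# inside an output definition: picture used for mapped fields
lookup "%n=%l\en"
.fi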
.SS Constants
In addition to input fields, constant values can be printed using option \fB\-f\fR, \fB\-\-field\-list\fR or
output option \fBfield\-list\fR. A constant is printed using the \fBdata\fR output option.
Constants are defined as
.TP
\fBconst\fR \fIname\fR \fIvalue\fR
When \fIname\fR appears in a field list, \fIvalue\fR will be printed for every record as if \fIname\fR were one of the input fields.
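.PP
For illustration, a constant could be defined and printed as sketched below. The constant \fICountry\fR, its value and the other names are hypothetical, and the sketch assumes the definition is placed at the top level of the configuration file, outside any structure or output definition.
.PP
.nf
# in the configuration file
const Country "Finland"

# on the command line: print two input fields followed by the constant
ffe \-c personnel.fferc \-s personnel \-f FirstName,LastName,Country personnel.txt
.fi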
.SH "EXAMPLES"
Example of fixed length flat file containing fields 'FirstName','LastName' and 'Age':
.PP
.nf
John     Ripper       23
Scott    Tiger        45
Mary     Moore        41
.fi
.PP
This file can be printed in XML with the following configuration:
.PP
structure personnel {
.br
type fixed
.br
output XML
.br
record person {
.br
field FirstName 9
.br
field LastName 13
.br
field Age 2
.br
}
.br
}
.br
output XML {
.br
file_header "<?xml version=\e"1.0\e" encoding=\e"ISO\-8859\-1\e"?>\en"
.br
data "<%n>%d</%n>\en"
.br
record_header "<%r>\en"
.br
record_trailer "</%r>\en"
.br
indent " "
.br
}
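.PP
Assuming the configuration above is saved as \fIpersonnel.fferc\fR and the data as \fIpersonnel.txt\fR (both file names are hypothetical), the conversion might be run as sketched here. The structure already names its output format, so '\-p' is not needed.
.PP
.nf
# write the XML to standard output
ffe \-c personnel.fferc \-s personnel personnel.txt
# write the XML to a file instead
ffe \-c personnel.fferc \-s personnel \-o personnel.xml personnel.txt
.fi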
.SH "SEE ALSO"
.LP
More examples are available in the Texinfo manual. If \fBinfo\fR and \fBffe\fR are properly installed, the command
.sp 3
\fBinfo\fR \fBffe\fR
.sp 3
should give more information.
.SH "AUTHOR"
Timo Savinen <tjsa@iki.fi>