'\" et
.TH CSPLIT "1P" 2013 "IEEE/The Open Group" "POSIX Programmer's Manual"
.SH PROLOG
This manual page is part of the POSIX Programmer's Manual.
The Linux implementation of this interface may differ (consult
the corresponding Linux manual page for details of Linux behavior),
or the interface may not be implemented on Linux.
.SH NAME
csplit
\(em split files based on context
.SH SYNOPSIS
.LP
.nf
csplit \fB[\fR\(miks\fB] [\fR\(mif \fIprefix\fB] [\fR\(min \fInumber\fB] \fIfile arg\fR...
.fi
.SH DESCRIPTION
The
.IR csplit
utility shall read the file named by the
.IR file
operand, write all or part of that file into other files as directed
by the
.IR arg
operands, and write the sizes of the files.
.SH OPTIONS
The
.IR csplit
utility shall conform to the Base Definitions volume of POSIX.1\(hy2008,
.IR "Section 12.2" ", " "Utility Syntax Guidelines".
.P
The following options shall be supported:
.IP "\fB\(mif\ \fIprefix\fR" 10
Name the created files
.IR prefix \c
.BR 00 ,
.IR prefix \c
.BR 01 ,
\&.\|.\|.,
.IR prefix \c
.IR n .
The default is
.BR xx00
\&.\|.\|.
.BR xx \c
.IR n .
If the
.IR prefix
argument would create a filename exceeding
{NAME_MAX}
bytes, an error shall result,
.IR csplit
shall exit with a diagnostic message, and no files shall be created.
.IP "\fB\(mik\fP" 10
Leave previously created files intact. By default,
.IR csplit
shall remove created files if an error occurs.
.IP "\fB\(min\ \fInumber\fR" 10
Use
.IR number
decimal digits to form filenames for the file pieces. The default
shall be 2.
.IP "\fB\(mis\fP" 10
Suppress the output of file size messages.
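.P
As an informal illustration of the
.BR \(mif ,
.BR \(min ,
and
.BR \(mis
options, the following shell sketch splits a small file into three pieces.
It is not part of the standard text: the names
.BR notes.txt
and
.BR piece
are made up for the example, and
.IR mktemp
is assumed to be available as a common (non-POSIX) utility.
.sp
.RS 4
.nf
```shell
# Scratch directory, so the generated pieces are easy to inspect and discard.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical six-line input file.
cat > notes.txt <<EOF
alpha
beta
gamma
delta
epsilon
zeta
EOF

# Split before line 3 and before line 5.
# -f piece  -> names start with "piece" instead of "xx"
# -n 3      -> three-digit suffixes instead of two
# -s        -> do not print the size of each piece
csplit -s -f piece -n 3 notes.txt 3 5

# Collect the names of the pieces that were created.
names=$(ls piece*)
```
.fi
.RE
.P
With this input the pieces should be named
.BR piece000 ,
.BR piece001 ,
and
.BR piece002 ,
holding lines 1 to 2, 3 to 4, and 5 to 6 respectively.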
.SH OPERANDS
The following operands shall be supported:
.IP "\fIfile\fR" 10
The pathname of a text file to be split. If
.IR file
is
.BR '\(mi' ,
the standard input shall be used.
.P
Each
.IR arg
operand can be one of the following:
.IP "/\fIrexp\fR/\fB[\fIoffset\fB]\fR" 10
.br
A file shall be created using the content of the lines from the current
line up to, but not including, the line that results from the
evaluation of the regular expression with
.IR offset ,
if any, applied. The regular expression
.IR rexp
shall follow the rules for basic regular expressions described in the Base Definitions volume of POSIX.1\(hy2008,
.IR "Section 9.3" ", " "Basic Regular Expressions".
The application shall use the sequence
.BR \(dq\e/\(dq
to specify a
<slash>
character within the
.IR rexp .
The optional offset shall be a positive or negative integer value
representing a number of lines. A positive integer value can be
preceded by
.BR '\(pl' .
If the selection of lines from an
.IR offset
expression of this type would create a file with zero lines, or one
with greater than the number of lines left in the input file, the
results are unspecified. After the section is created, the current line
shall be set to the line that results from the evaluation of the
regular expression with any offset applied. If the current line is the
first line in the file and a regular expression operation has not yet
been performed, the pattern match of
.IR rexp
shall be applied from the current line to the end of the file.
Otherwise, the pattern match of
.IR rexp
shall be applied from the line following the current line to the end of
the file.
.IP "%\fIrexp\fR%\fB[\fIoffset\fB]\fR" 10
.br
Equivalent to /\fIrexp\fR/\fB[\fIoffset\fB]\fR, except that no
file shall be created for the selected section of the input file. The
application shall use the sequence
.BR \(dq\e%\(dq
to specify a
<percent-sign>
character within the
.IR rexp .
.IP "\fIline_no\fR" 10
Create a file from the current line up to (but not including) the line
number
.IR line_no .
Lines in the file shall be numbered starting at one. The current line
becomes
.IR line_no .
.IP "{\fInum\fR}" 10
Repeat operand. This operand can follow any of the operands described
previously. If it follows a
.IR rexp
type operand, that operand shall be applied
.IR num
more times. If it follows a
.IR line_no
operand, the file shall be split every
.IR line_no
lines,
.IR num
times, from that point.
.P
An error shall be reported if an operand does not reference a line
between the current position and the end of the file.
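.P
The difference between the two regular expression operand forms can be
sketched as follows. This is an informal illustration under stated
assumptions, not part of the standard text: the file name
.BR book.txt
and its contents are hypothetical, and
.IR mktemp
is assumed to be available as a common (non-POSIX) utility.
.sp
.RS 4
.nf
```shell
# Scratch directory, so the generated pieces are easy to inspect and discard.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical input: a preamble followed by two chapters.
cat > book.txt <<EOF
title page
preface
CHAPTER one
text of one
CHAPTER two
text of two
EOF

# %^CHAPTER%  -> skip everything before the first CHAPTER line (no file written)
# /^CHAPTER/  -> write a piece that ends just before the next CHAPTER line
# Whatever remains after the last operand goes into a final piece.
csplit -s book.txt '%^CHAPTER%' '/^CHAPTER/'
```
.fi
.RE
.P
Because the second pattern is applied from the line following the current
line, it matches the second chapter heading rather than the first. The
preamble is discarded,
.BR xx00
should hold the first chapter, and
.BR xx01
the second.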
.SH STDIN
See the INPUT FILES section.
.SH "INPUT FILES"
The input file shall be a text file.
.SH "ENVIRONMENT VARIABLES"
The following environment variables shall affect the execution of
.IR csplit :
.IP "\fILANG\fP" 10
Provide a default value for the internationalization variables that are
unset or null. (See the Base Definitions volume of POSIX.1\(hy2008,
.IR "Section 8.2" ", " "Internationalization Variables"
for the precedence of internationalization variables used to determine
the values of locale categories.)
.IP "\fILC_ALL\fP" 10
If set to a non-empty string value, override the values of all the
other internationalization variables.
.IP "\fILC_COLLATE\fP" 10
.br
Determine the locale for the behavior of ranges, equivalence classes,
and multi-character collating elements within regular expressions.
.IP "\fILC_CTYPE\fP" 10
Determine the locale for the interpretation of sequences of bytes of
text data as characters (for example, single-byte as opposed to
multi-byte characters in arguments and input files) and the behavior of
character classes within regular expressions.
.IP "\fILC_MESSAGES\fP" 10
.br
Determine the locale that should be used to affect the format and
contents of diagnostic messages written to standard error.
.IP "\fINLSPATH\fP" 10
Determine the location of message catalogs for the processing of
.IR LC_MESSAGES .
.SH "ASYNCHRONOUS EVENTS"
If the
.BR \(mik
option is specified, created files shall be retained. Otherwise, the
default action occurs.
.SH STDOUT
Unless the
.BR \(mis
option is used, the standard output shall consist of one line per
file created, with a format as follows:
.sp
.RS 4
.nf
\fB
"%d\en", <\fIfile size in bytes\fR>
.fi \fR
.P
.RE
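.P
As an informal illustration of this output (not part of the standard text),
the following shell sketch captures the size report and compares it with the
sizes of the pieces as counted by
.IR wc .
The file name
.BR letters.txt
is hypothetical, and
.IR mktemp
is assumed to be available as a common (non-POSIX) utility.
.sp
.RS 4
.nf
```shell
# Scratch directory, so the generated pieces are easy to inspect and discard.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical input: four lines of two bytes each (a letter plus a newline).
cat > letters.txt <<EOF
a
b
c
d
EOF

# Split before line 2. Without -s, one byte count per piece goes to stdout:
# here 2 for xx00 (line 1) and 6 for xx01 (lines 2 to 4).
sizes=$(csplit letters.txt 2)

# Cross-check the report against the actual size of each piece.
actual=$(wc -c < xx00; wc -c < xx01)
```
.fi
.RE
.P
The values in
.BR sizes
and
.BR actual
should agree, apart from any padding that
.IR wc
adds around its numbers.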
.SH STDERR
The standard error shall be used only for diagnostic messages.
.SH "OUTPUT FILES"
Each output file shall contain a portion of the original input file;
the content of that portion shall be otherwise unchanged.
.SH "EXTENDED DESCRIPTION"
None.
.SH "EXIT STATUS"
The following exit values shall be returned:
.IP "\00" 6
Successful completion.
.IP >0 6
An error occurred.
.SH "CONSEQUENCES OF ERRORS"
By default, created files shall be removed if an error occurs. When the
.BR \(mik
option is specified, created files shall not be removed if an error
occurs.
.LP
.IR "The following sections are informative."
.SH "APPLICATION USAGE"
None.
.SH EXAMPLES
.IP " 1." 4
This example creates four files,
.BR cobol00
\&.\|.\|.
.BR cobol03 :
.RS 4
.sp
.RS 4
.nf
\fB
csplit \(mif cobol file '/procedure division/' /par5./ /par16./
.fi \fR
.P
.RE
.P
After editing the split files, they can be recombined as follows:
.sp
.RS 4
.nf
\fB
cat cobol0[0\(mi3] > file
.fi \fR
.P
.RE
.P
Note that this example overwrites the original file.
.RE
.IP " 2." 4
This example would split the file after the first 99 lines, and every
100 lines thereafter, up to 9\|999 lines; this is because lines in the
file are numbered from 1 rather than zero, for historical reasons:
.RS 4
.sp
.RS 4
.nf
\fB
csplit \(mik file 100 {99}
.fi \fR
.P
.RE
.RE
.IP " 3." 4
Assuming that
.BR prog.c
follows the C-language coding convention of ending routines with a
.BR '}'
at the beginning of the line, this example creates a file containing
each separate C routine (up to 21) in
.BR prog.c :
.RS 4
.sp
.RS 4
.nf
\fB
csplit \(mik prog.c '%main(%' '/^}/+1' {20}
.fi \fR
.P
.RE
.RE
.SH RATIONALE
The
.BR \(min
option was added to extend the range of filenames that could be
handled.
.P
Consideration was given to adding a
.BR \(mia
flag to use the alphabetic filename generation used by the historical
.IR split
utility, but the functionality added by the
.BR \(min
option was deemed to make alphabetic naming unnecessary.
.SH "FUTURE DIRECTIONS"
None.
.SH "SEE ALSO"
.IR "\fIsed\fR\^",
.IR "\fIsplit\fR\^"
.P
The Base Definitions volume of POSIX.1\(hy2008,
.IR "Chapter 8" ", " "Environment Variables",
.IR "Section 9.3" ", " "Basic Regular Expressions",
.IR "Section 12.2" ", " "Utility Syntax Guidelines"
.SH COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 7, Copyright (C) 2013 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group.
(This is POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online at
http://www.unix.org/online.html .
Any typographical or formatting errors that appear
in this page are most likely
to have been introduced during the conversion of the source files to
man page format. To report such errors, see
https://www.kernel.org/doc/man-pages/reporting_bugs.html .