.\" Process this file with
.\" groff -man -Tascii foo.1
.\"
.TH ONLY 1 "" Haskell ""
.SH NAME
only \- an advanced filter for words, lines, and more
.SH SYNOPSIS
.B only [\-[bcwlf]
.I EXPR
.B ] ...
.I file
.B ...
.SH DESCRIPTION
.B Only
is an advanced filtering tool, like
.B grep,
but instead of filtering only on lines,
it can also filter on characters or words, called
.I tokens
in general.
When tokens
.I match,
there are two options that allow for greater control than
.B grep.
They can appear before and/or after a
.I regex
and are called
.I absolute indices
and
.I relative indices,
respectively.
.I Absolute indices
refer to matches, whereas
.I relative indices
refer to tokens, with the match being token zero.
For example,
.B -l
.I N/regex/M
will show the M-th line after the N-th occurrence of
.I regex.
.P
For a more detailed description, see below.
.SH OPTIONS
.IP (-b|--bytes)=EXPR
Byte mode
.IP (-c|--chars)=EXPR
Character mode
.IP (-w|--words)=EXPR
Word mode
.IP (-l|--lines)=EXPR
Line mode
.IP (-f|--files)=EXPR
File mode
.SH EXTENDED DESCRIPTION
The original goal of
.B 'only'
was to combine the features of
.B head,
.B tail,
.B grep,
and
.B cut
into a single utility that was capable of all of their features,
but with the power to do so much more. For example,
.B head
and
.B tail
are good for selecting the first n lines or last n lines of a file,
but what if you want lines 10-30? Neither utility does this
well alone, and combining them to accomplish your goal is
cumbersome. Granted, one could probably construct a one-liner in
.B awk
or
.B perl
to achieve the desired effect, but at the expense of clarity.
.P
To summarize the features of
.B only,
there are two major kinds of inputs:
.I files
and
.I modes.
A file can
be either a filename or
.B '\-',
which means standard input.
The modes currently supported are:
.B bytes,
.B characters,
.B words,
.B lines,
and
.B files.
The modes differ in what the pattern /^.*$/ will match.
When no pattern is given, and a number is given instead, then
it will refer to the appropriate token type, for example, the first word,
the second line, etc.
.P
In byte mode, the input is broken up into octets (8-bit bytes),
so each pattern must match only a single byte. In character mode, the input
is broken up according to the specified encoding (or UTF-8 if unspecified),
where each character may be multiple bytes. In word mode, the separators
can be any whitespace, so the tool tries to remember which
separator was originally present
and puts it back before displaying.
In line mode,
.B only
behaves very similarly to
.B grep
but with a few extra features.
In file mode, the filenames are not shown (unless -F is used) but the
entire file is shown if it matches the pattern.
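.P
As an illustration of the word-mode behavior described above, the following
Python sketch pairs each word with the whitespace run that followed it,
so that selected words can be printed with their original separators.
It is only an assumption about how such bookkeeping could work,
not a description of how
.B only
is implemented; the names
.I tokenize_words
and
.I render
are hypothetical.
.nf
```python
from itertools import groupby


def tokenize_words(text):
    """Split text into (word, separator) pairs.

    The separator is the whitespace run that followed the word in the
    input, kept verbatim. Leading whitespace is dropped in this sketch.
    """
    tokens = []
    for is_space, run in groupby(text, key=str.isspace):
        chunk = "".join(run)
        if not is_space:
            tokens.append([chunk, ""])
        elif tokens:
            # Remember which separator followed the previous word.
            tokens[-1][1] = chunk
    return [tuple(t) for t in tokens]


def render(tokens):
    """Join the selected tokens, restoring each remembered separator."""
    return "".join(word + sep for word, sep in tokens)
```
.fi
A caller would tokenize the input once, pick the wanted pairs,
and pass only those to
.I render.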
.SS Syntax
.I Matching expressions
are expressions written in a small language
that forms a super-set of regular expressions.
The
.B syntax
of matching expressions
is the same regardless of the current mode.
This is true even of byte mode, where you must write "\\xFF" if you want a
non-printable character. Matching expressions can be as simple as a number
or a word. First,
.B only
tries to parse an expression as a number, then as an expression of the form
.I N/regex/M,
and if that fails, it treats the entire expression as a regex. Each N
and M may be a numeric expression.
.I Numeric expressions
have the syntax (in pseudo-Parsec):
.nf
    num     = [+\-]?[0-9]+
    numeric = sepBy numbers ','
    numbers = num ';' num ':' num   # from A to C step (A-B)
            | num ':' num ';' num   # from A to B step C
            | num ':' num           # from A to B
            | num
.fi
which means you can specify just a single number (3) or something as complicated
as multiple ranges (such as 3:5,100:109). These numeric expressions can occur on
either side of the regex, or both sides with a combined effect. The
syntax of the entire matching expression is:
.nf
    expr = do optional numeric
              c <- punct ; regex ; c
              (try c ; regex ; c ; optional num
               | optional numeric)
         | numeric
         | regex
.fi
where
.I punct
is any ASCII punctuation character except ".,:;",
and
.I regex
is a POSIX extended regular expression.
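.P
To make the numeric expressions more concrete, here is a minimal Python sketch
that parses a comma-separated list of ranges and expands it into plain numbers.
It covers only the forms "A", "A:B" and "A:B;C" (from A to B step C);
the "A;B:C" form is deliberately rejected.
Treating ranges as inclusive and defaulting the step to 1 are assumptions,
and the function names
.I parse_numeric
and
.I expand
are hypothetical, not part of
.B only.
.nf
```python
def parse_numeric(text):
    """Parse e.g. "3:5,100:109" into (start, stop, step) triples.

    Handles "A", "A:B" and "A:B;C". Anything else, including the
    "A;B:C" form, raises ValueError from int().
    """
    ranges = []
    for part in text.split(","):
        body, _, step = part.partition(";")
        start, colon, stop = body.partition(":")
        if not colon:
            stop = start  # a lone number is a one-element range
        ranges.append((int(start), int(stop), int(step) if step else 1))
    return ranges


def expand(ranges):
    """Expand (start, stop, step) triples into a flat list of numbers.

    Ranges are assumed to be inclusive at both ends.
    """
    numbers = []
    for start, stop, step in ranges:
        end = stop + (1 if step > 0 else -1)
        numbers.extend(range(start, end, step))
    return numbers
```
.fi
Under these assumptions, "3:5,7" expands to 3, 4, 5, 7 and "0:10;5" expands to 0, 5, 10.
Negative numbers are left as they are here, because their meaning depends on
which side of the regex they appear.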
.\" This serves to describe the syntax of matching expressions.
.SS Semantics
The
.B semantics
of matching expressions are a little harder to describe. However, a generalization
of the example given above should hold true:
"N/regex/M" means the M-th
.I tokens
relative to the N-th
.I matches.
The default for
.I N
is
.B 1:-1
and the default for
.I M
is
.B 0.
The
.I N
are known as
.I absolute indices,
and the
.I M
are known as
.I relative indices.
Absolute indices take the list of matches (the tokens that were matched by the regular expression)
and use the numbers in N as indices into this list.
This gives you the ability to select the first match (1) or the last match (-1).
Negative numbers count backwards from the end of the file, so (-2) is the second-to-last match.
Relative indices take the list of matches and the original list of tokens and,
for each match, form a virtual list in which 0 refers to the match's index in the list of tokens.
This allows one to emulate
.B grep's
\-A (after) and \-B (before) options.
.P
Here are some equivalent command lines for "after" a match:
.nf
    grep -A3 expr file.txt
    only -l/expr/0:3 file.txt
.fi
Here are some equivalent command lines for "before" a match:
.nf
    grep -B3 expr file.txt
    only -l/expr/-3:0 file.txt
.fi
Normally, these would be used to select lines around a point of interest,
for example when you get a compiler error in a file with 10 million lines
and just want to see the surrounding text.
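.P
The interplay of absolute and relative indices can be sketched in Python as follows.
This is an illustration of the semantics described above under stated assumptions,
not the actual implementation: absolute indices are taken to be 1-based,
out-of-range indices are silently skipped, and overlapping contexts are not merged.
The function name
.I select
and its parameters are hypothetical.
.nf
```python
def select(tokens, is_match, absolute=None, relative=(0,)):
    """Return the M-th tokens relative to the N-th matches.

    absolute: 1-based match numbers, negative counts from the last
              match; None stands for the default 1:-1 (every match).
    relative: offsets from each chosen match, 0 being the match itself.
    """
    hits = [i for i, tok in enumerate(tokens) if is_match(tok)]
    if absolute is None:
        chosen = hits
    else:
        chosen = []
        for n in absolute:
            k = n - 1 if n > 0 else len(hits) + n
            if 0 <= k < len(hits):  # skips 0 and out-of-range numbers
                chosen.append(hits[k])
    result = []
    for i in chosen:
        for m in relative:
            j = i + m
            if 0 <= j < len(tokens):
                result.append(tokens[j])
    return result
```
.fi
With the lines "a", "err", "b", "c", "err", "d" and a matcher for "err",
absolute index 1 with relative offsets 0 and 1 selects "err" and "b",
which mirrors the "after" example above on a smaller scale.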
.SH FILES
.I ~/.onlyrc
.RS
A user configuration file. [Not implemented yet]