.\"
.\" This file is part of the Detox package.
.\"
.\" Copyright (c) Doug Harple <detox.dharple@gmail.com>
.\"
.\" For the full copyright and license information, please view the LICENSE
.\" file that was distributed with this source code.
.\"
.Dd February 24, 2021
.Dt INLINE-DETOX 1
.Os
.Sh NAME
.Nm inline-detox
.Nd clean up filenames (stream-based)
.Sh SYNOPSIS
.Nm
.Op Fl f Pa configfile
.Op Fl s Ar sequence
.Op Fl v
.Nm
.Op Fl f Pa configfile
.Op Fl s Ar sequence
.Op Fl v
.Ar
.Nm
.Op Fl L
.Op Fl f Pa configfile
.Op Fl v
.Nm
.Op Fl h | -help
.Nm
.Op Fl V
.Sh DESCRIPTION
The
.Nm
utility generates new filenames to make them easier to work with under Unix and
Unix-like operating systems.
It replaces characters that make a filename hard to type, substituting dashes
and underscores for them.
It also provides transliteration-based filters, converting ISO 8859-1 or UTF-8
to ASCII, in part or in whole.
An additional filter unescapes CGI-escaped filenames.
.Pp
.Nm
reads filename(s) from the input stream and writes the updated filename(s) to
the output stream.
.Pp
If a filename is passed on the command line,
.Nm
reads this file and processes each line before writing it to the output stream.
.Pp
Running
.Cm detox
.Fl -inline
is identical to running
.Nm .
.Ss Sequences
.Nm
is driven by a configurable series of filters, called a sequence.
Sequences are covered in more detail in
.Xr detoxrc 5
and are discoverable with the
.Fl L
option.
The default sequence will run the
.Ar safe
and
.Ar wipeup
filters.
Other examples of pre-configured sequences are
.Ar iso8859_1
and
.Ar utf_8 ,
which both provide transliteration to ASCII and then finish with the
.Ar safe
and
.Ar wipeup
filters.
.Ss Options
.Bl -tag -width Fl
.It Fl f Pa configfile
Use
.Pa configfile
instead of the default configuration files for loading translation
sequences.
No other config file will be parsed.
.It Fl h , -help
Display helpful information.
.It Fl L
List the currently available sequences.
When paired with
.Fl v
this option shows what filters are used in each sequence and any
properties applied to the filters.
.It Fl s Ar sequence
Use
.Ar sequence
instead of
.Cm default .
.It Fl v
Be verbose about which filenames are being changed.
.It Fl V
Show the current version of
.Nm .
.El
.Sh FILES
.Bl -tag -width Fl
.It Pa /etc/detoxrc
The system-wide detoxrc file.
.It Pa ~/.detoxrc
A user's personal detoxrc.
Normally it extends the system-wide
.Pa detoxrc .
If
.Fl f
has been specified, it is ignored.
.It Pa /usr/share/detox/cp1252.tbl
The provided CP-1252 transliteration table.
.It Pa /usr/share/detox/iso8859_1.tbl
The provided ISO 8859-1 transliteration table.
.It Pa /usr/share/detox/safe.tbl
The provided safe character translation table.
.It Pa /usr/share/detox/unicode.tbl
The provided Unicode transliteration table, used by the UTF-8 filter.
.It Pa /usr/share/detox/unidecode.tbl
An additional Unicode transliteration table, based on
.Xr Text::Unidecode 3pm .
.El
.Sh EXAMPLES
.Bl -tag -width Fl
.It echo \(dqFoo Bar\(dq | Nm Fl s Ar lower Fl v
Will run the sequence
.Ar lower ,
listing any changes and returning the result to the output stream.
.El
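.Pp
The following invocations are a minimal sketch of the other forms shown in the
synopsis, using only the options and sequences described above.
The file
.Pa names.txt
is a hypothetical input file holding one filename per line; output is not
shown because it depends on the installed configuration.
.Bd -literal -offset indent
# Read one filename from the input stream, using the default sequence.
echo "My File (1).txt" | inline-detox

# Process each line of names.txt (hypothetical) with the utf_8 sequence.
inline-detox -s utf_8 names.txt

# List the available sequences along with the filters each one runs.
inline-detox -L -v
.Ed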
.Sh SEE ALSO
.Xr detox 1 ,
.Xr Text::Unidecode 3pm ,
.Xr detox.tbl 5 ,
.Xr detoxrc 5 ,
.Xr ascii 7 ,
.Xr iso_8859-1 7 ,
.Xr unicode 7 ,
.Xr utf-8 7
.Sh HISTORY
.Nm
was originally designed to clean up files that I had received from friends
which had been created using other operating systems.
It's trivial to create a filename with spaces, parentheses, brackets, and
ampersands under some operating systems.
These have special meaning within
.Fx
and Linux, and cause problems when you go to access them.
I created
.Nm
to clean up these files.
.Pp
Version 2.0 stepped back from transliteration out of the box, instead focusing
on ease of use.
The primary motivations for this were user-provided feedback, and the fact that
many modern Unix-like OSs use UTF-8 as their primary character set.
Transliterating from UTF-8 to ASCII in this scenario is lossy and pointless.
.Sh AUTHORS
.Nm
was written by
.An Doug Harple .