File: README

This package consists of two related programs.
The first, msort, is the actual sort program.
It has a command-line interface and is written in C.
The code is quite standard and few unusual libraries
are required, so it should compile and run on
any POSIX-compliant system.

The principal non-standard library required is
Ville Laurikari's TRE regular expression library,
available at http://laurikari.net/tre/. The others
are listed under DEPENDENCIES below.

The second program, msg, is a graphical front end
to msort. It isn't of any real use without msort,
but it doesn't literally depend on it. You can
run it on a system lacking msort. When it starts
up it will report that it cannot find msort, and
therefore of course it will not actually sort
anything, but if it amuses you, you can still play
with it. msg is written in Tcl and uses the
Tk toolkit. It is meant to be run under wish, the
Tcl/Tk windowing shell. So long as you have
Tcl/Tk/wish available, there is nothing much
to be done to install msg. Since Tcl is
interpreted, no compilation is necessary.

If you do not have Tcl/Tk, don't worry: it is easy
to obtain and install. For most platforms, the easiest
approach is to obtain the ActiveTcl distribution from:
http://www.activestate.com/Products/ActiveTcl
Further information is available at:
http://billposer.org/Software/msort.html


FURTHER DETAILS ON MSORT

Msort has been developed and tested primarily under
GNU/Linux. I also have access to a machine running
FreeBSD and am able to test it there. According to
reports from others, it compiles and runs under
Solaris and Mac OS X.

The man page only gives basic information.
The real reference manual is Doc/msort.pdf.

DEPENDENCIES

Msort makes use of several libraries that are not routinely installed.
The first is Ville Laurikari's TRE regular expression library, which may be
obtained from: http://laurikari.net/tre/. This library is reported to
work on pretty much all varieties of Unix, including Mac OS X, as well
as MS Windows XP.

Second, msort requires support for Unicode normalization. It can be compiled to
use either libicu (International Components for Unicode), which may be obtained
from http://www.icu-project.org/, or libutf8proc, which may be obtained from
http://www.flexiguided.de/publications.utf8proc.en.html. ICU is fairly widely
used, so you may already have it on your system. To use it, give the option
--disable-utf8proc to configure. msort defaults to using utf8proc because
utf8proc is smaller and easier to install.

Third, msort optionally uses libuninum to handle numbers in systems other than
the usual Indo-Arabic system. Libuninum is my own library and may be obtained
from http://billposer.org/Software/libuninum.html. Packages for a variety of systems
are available. If you do not need support for exotic number systems, you may build
msort without libuninum. To do this, give the option --disable-uninum to configure.

Libuninum in turn uses the GNU MP library for arbitrary precision arithmetic.
It is available from http://www.swox.com/gmp/. libgmp is required if libuninum is linked.

To summarize, if you want to build msort with the minimum of trouble,
you will need libtre and either libutf8proc or libicu. If the latter is
not already installed, you will probably find it easier to go with
libutf8proc, which is the default. If you do not need to handle exotic number
systems, you can forgo libuninum and libgmp. To build this minimal configuration,
call configure as follows:

./configure --disable-uninum
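
The choice of options described above can be worked out mechanically.
The following is only a sketch, not part of msort: the function names
have_lib and suggest_configure are made up for illustration, and it
assumes that the libraries are installed under the usual file names
(libtre, libutf8proc, libuninum, libgmp, and libicuuc for ICU). It looks
in the directories you name and prints a configure command that matches
what it finds.

```shell
# Hypothetical helper: succeed if lib<NAME>.so* or lib<NAME>.a exists
# in any of the directories given after NAME.
have_lib() {
    name=$1; shift
    for dir in "$@"; do
        for f in "$dir"/lib"$name".so* "$dir"/lib"$name".a; do
            [ -e "$f" ] && return 0
        done
    done
    return 1
}

# Hypothetical helper: print a configure command suited to the
# libraries found in the given directories.
suggest_configure() {
    opts=""
    # libuninum is only usable if libgmp is present as well.
    have_lib uninum "$@" && have_lib gmp "$@" || opts="$opts --disable-uninum"
    # Fall back to ICU only if utf8proc is missing and ICU is present.
    if ! have_lib utf8proc "$@" && have_lib icuuc "$@"; then
        opts="$opts --disable-utf8proc"
    fi
    echo "./configure$opts"
}
```

For example, "suggest_configure /usr/lib /usr/local/lib" prints the
command to run. It does not check for libtre, which is needed in
every configuration.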

On some systems, the autoconfiguration system will not detect the need to link
to libintl. If this happens to you, give the flag:

   LIBS="-lintl"

to configure, e.g.:

 	./configure LIBS="-lintl"


INSTALLATION

If you have the GNU autoconf system available, follow the
generic installation instructions in INSTALL. In short, these are:

./configure
make
make test
(su)
make install-strip

The last command arranges for the symbol table to be removed from
the executable file when it is installed, which results in a substantial
reduction in size. If you want to be able to use a debugger on msort
you will want to preserve the symbol table, in which case you should
give the command:

make install

instead.
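
If you build msort repeatedly, the steps above can be wrapped in a small
shell function. This is a sketch only: the names run, build_msort and
DRY_RUN are invented here and are not part of the msort distribution.
Each step runs only if the previous one succeeded, and setting DRY_RUN
prints the commands without executing them.

```shell
# run CMD...: print the command, then execute it unless DRY_RUN is set.
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# build_msort [CONFIGURE-OPTIONS...]: configure, compile and test,
# stopping at the first step that fails. Installation is left to you
# because it usually has to be done as root.
build_msort() {
    run ./configure "$@" &&
    run make &&
    run make test &&
    echo "now run 'make install-strip' or 'make install' as root"
}
```

For example, "DRY_RUN=1 build_msort --disable-uninum" shows what would
be run without doing anything.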

"make test" is optional. It executes a set of regression tests.
The tests run very quickly so don't hesitate to try it.
The results will be written to the file RegressionTests/TestResults.

There are a few additional tests that are not executed by "make test".
These are tests that depend on the correct functioning of the locale
system, including the ability to switch into certain particular locales.
They are kept separate because they can fail even if msort itself is
working perfectly. To execute these tests, give the command:

make localetest

The results will be written to the file RegressionTests/LocaleTestResults.


There are several non-standard options to configure:

--disable-allocaok

By default, in certain situations msort uses the alloca routine to allocate
storage on the stack, which is faster than allocating it on the heap.
However, alloca is buggy on some systems. If you give configure the option
--disable-allocaok, msort will not use alloca. If you know that alloca
is funky on your system, or if msort seems to behave strangely, configuring
msort with this flag is wise.

--disable-uninum

Build without reliance on libuninum. This eliminates the ability to handle
exotic number systems.

--disable-utf8proc

Use libicu rather than the default of utf8proc for Unicode normalization.

--disable-comparison-count

Eliminates the comparison count. In theory this will speed things up slightly,
but the speed-up is unlikely to be noticeable.

--enable-debugbuild

This replaces the default compiler options "-g -O2" with "-ggdb -g3",
causing the resulting executable to contain the maximum amount of useful
information for gdb and disabling optimization. This eliminates the need
for manual editing of the Makefile. It also defines the macro DEBUGBUILD
in the C files, allowing conditional compilation of code for debugging.


For generic details on installation using the autoconf system, see the
file INSTALL. The standard option you are most likely to be
interested in is: --prefix=foo, which changes the directories in which
msort is installed. For example, by default the executables will be installed
in /usr/local/bin. If you prefer to install the executables in your personal
bin, in my case, /home/poser/bin, you can configure msort using the command:

./configure --prefix=/home/poser

This will result in the executables being put in /home/poser/bin, the manual page
in /home/poser/man/man1, etc.

If you do not have autoconf/automake, or if a problem arises,
look in the Doc directory for the file OriginalMakefile
and make a copy of it in this directory named Makefile.

To compile, first see if there is anything in the Makefile
that you want to change.  You may wish to change the default
installation directories BINDIR, where the executable goes,
and MANDIR, where the manual page goes. The compiler is also
set to gcc.  If you don't have gcc, or want to use another
compiler, change the value of CC.

Then a simple "make" should suffice to compile msort.

To install, su if necessary, then "make install".

Msort uses the TRE regular expression library to match tags
and to perform substitutions on keys. This library is
available for a wide range of systems but in source form. It
must be compiled and installed. Clear instructions for
compiling and installing it are provided with the
package. However, those not experienced with installing
libraries may encounter difficulties.

One problem that you may encounter is that, even after you
install the library, the linker (part of the compilation
process) says that it cannot find it. This is probably the
result of the library having been installed in a directory
that the linker does not know about.

To remedy this, you need to run the ldconfig program. On
Linux systems this should be located in /sbin, a directory
that contains programs normally used only by the system
administrator. You will need to be root to run ldconfig.

Ldconfig indexes the standard directories /usr/lib and /lib,
any directories listed in the file /etc/ld.so.conf, and
directories listed on the command line.  If you install the
TRE library in a directory other than /lib or /usr/lib, such
as the default /usr/local/lib, you will need to tell
ldconfig to search that directory. You can do this either by
adding the name of the directory to /etc/ld.so.conf or
supplying the directory name on the command line, e.g.:

/sbin/ldconfig /usr/local/lib
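
If you take the first route and add the directory to /etc/ld.so.conf,
take care not to list it twice. The following sketch shows one way to do
that. The function name add_ld_dir is invented for illustration, and
the sketch assumes that the configuration file holds one directory per
line.

```shell
# add_ld_dir DIR [CONF]: append DIR to CONF (default /etc/ld.so.conf)
# unless a line consisting of exactly DIR is already there.
add_ld_dir() {
    dir=$1
    conf=${2:-/etc/ld.so.conf}
    grep -qxF "$dir" "$conf" 2>/dev/null || printf '%s\n' "$dir" >> "$conf"
}

# Typical use, as root:
#   add_ld_dir /usr/local/lib && /sbin/ldconfig
```

The optional second argument lets you try the function on a scratch
file before touching the real configuration.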

Another approach is to give the compiler options that it will
pass on to the linker to tell it where to look.
There are two such options: -L and -rpath.
On some systems -L is used for static libraries and -rpath
for shared libraries, but there is some variation. It appears
always to work if you just use both.

This is especially useful if you do not have root privileges
on the system. 

In the msort Makefile, the relevant portion looks like this:


msort:		${OBJS}
		${CC} -o msort ${OBJS} -ltre

This says that "msort" depends on the files listed in the
variable OBJS, namely msort.o, misc.o, etc., and that
"msort" is created from these files by running the command
that is the value of the variable CC.  The value of CC will
generally be "gcc". The flag -ltre indicates that the TRE
library should be linked in. To tell the linker that the files
for the TRE library are located in /usr/local/lib/, change
the second line above to the following. (When CC is gcc, the
-rpath option has to be handed to the linker in the form
-Wl,-rpath,DIR, since gcc does not accept a bare -rpath.)

	${CC} -o msort ${OBJS} -L /usr/local/lib -Wl,-rpath,/usr/local/lib -ltre

Of course, if you don't have root privileges you probably can't install
TRE in /usr/local/lib. If you install it in one of your own directories,
give that directory as argument to -L and -rpath instead, e.g.:

	${CC} -o msort ${OBJS} -L /home/wjposer/Src/lib -Wl,-rpath,/home/wjposer/Src/lib -ltre
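
If you are building with configure rather than with a hand-edited
Makefile, the same linker options can usually be passed in the LDFLAGS
variable, in the same way that LIBS is passed above. The following is a
sketch under the assumption that you are using gcc with the GNU linker,
where the run-time search path is written -Wl,-rpath,DIR. The function
name tre_ldflags is made up for illustration.

```shell
# tre_ldflags DIR: print the linker flags needed to find a library
# installed in DIR, both at link time (-L) and at run time (-rpath).
tre_ldflags() {
    printf '%s\n' "-L$1 -Wl,-rpath,$1"
}

# Typical use, with TRE installed under your home directory
# (the header location given in CPPFLAGS is likewise an assumption):
#   ./configure LDFLAGS="$(tre_ldflags "$HOME/lib")" CPPFLAGS="-I$HOME/include"
```

Other compilers and linkers spell the run-time path option differently,
so check your compiler's documentation if this does not work.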


Some sample sort order definition files are provided in the SortOrders subdirectory.
In addition to serving as examples, some of them may be useful, if, for example,
you need to sort country names in United Nations order, sort by the Chinese
Heavenly Stems, or handle traditional Armenian dates.