File: stellingwerff.html

package info (click to toggle)
lg-issue99 1-1
  • links: PTS
  • area: main
  • in suites: sarge
  • size: 496 kB
  • ctags: 71
  • sloc: sh: 61; makefile: 34
file content (589 lines) | stat: -rw-r--r-- 28,319 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589

<html>
<head>
<link href="../lg.css" rel="stylesheet" type="text/css" media="screen, projection"  />
<title>OCaml, an Introduction LG #99</title>

<style type="text/css" media="screen, projection">
<!--


.articlecontent {
	position:absolute;
	top:143px;
}


-->
</style>


</head>

<body>


<img src="../gx/2003/newlogo-blank-200-gold2.jpg" id="logo" alt="Linux Gazette"/>
<p id="fun">...making Linux just a little more fun!</p>


<div class="content articlecontent">

<div id="previousnexttop">
<A HREF="pramode.html" >&lt;-- prev</A>
</div>



<h1>OCaml, an Introduction</h1>
<p id="by"><b>By <A HREF="../authors/stellingwerff.html">Jurjen Stellingwerff</A></b></p>

<p>
<p>Object Caml is an ML type of language. For the non-gurus: it's a functional
language that can also be programmed in a non-functional and object-oriented way.
</p>
<p>This language is really easy to learn.  It's powerful and keeps impressing
me with its speed. Programs written in this language are almost always stable
by default. No segmentation faults, only occasional unending loops for the
programmers that still hang on to program their own loops. It is really not
needed to write most loops, since the libraries contain standard functions that
are good enough in 99% of the cases. So try to use those functions:  It really
pays off in terms of stability of your programs, and, unless you have intimate
knowledge of the inner works of this language, they tend to be better
optimised.
</p>
<p>The language can be obtained from the website <a href="http://caml.inria.fr">caml.inria.fr</a>. Here, they provide RPMs for the RedHat 7.2/8.0/9 and Mandrake 8.0 distributions. Also MS Windows binaries are available, but not all Unix library functions will work there, for some mysterious reason. The source tarball does compile flawlessly for me. It just has a somewhat unusual makefile layout:
</p>
<pre>
# ./configure
# make world; make opt; make install
</pre>
<p>The normal libraries include many usable data-structures like balanced trees, hash tables, and streams.
Their version of header files (.mli files) contain all the basic documentation you need, and those are directly converted into HTML and published on the Web in their OCaml manual. This manual is not very usable to study this language, so I'll try to explain here some of the basic language constructions. This is just to give you an impression of the power of this language.
</p>
<h3>Modules & Functions
</h3>
<p>
Now some real life examples. I wrote a program to help administrating a computer. It is a subset of a normal file finder, but is a command line tool and very fast. It helps locating large, not-recently-used files to be deleted from the system. It crawls through the directory tree and show the contents in different layouts.
</p>
<p>
Every module in OCaml has its own namespace. Specific definitions can be found by adding the module name, with the first character an upper-case character. You can also change the namespace of the current program to include a total module. Normally, only the standard module '<a href="http://caml.inria.fr/ocaml/htmlman/libref/Pervasives.html">pervasives.mli</a>' is included in the default namespace.
The example program 'show.ml' starts with:</p>
<pre>
open <a href="http://camlserv.sourceforge.net/Basics.html">Basics</a>
open <a href="http://caml.inria.fr/ocaml/htmlman/libref/Unix.html">Unix</a>
open <a href="http://caml.inria.fr/ocaml/htmlman/libref/Unix.LargeFile.html">Unix.LargeFile</a>
</pre>
<p>
This includes my own set of 'basics' functions and 2 standard libraries: 'Unix' and 'Unix.LargeFile'. A module normally consists of 2 files. The first file for exporting definitions 'module.mli' (like the C .h file), and the second one for actual code (the 'module.ml' file). The program uses the function 'string_sub' that provides a foolproof version of the 'String.sub' standard function (from the string.mli module).
The basics.mli file contains the lines:
</p>
<pre>
val string_sub: string -&gt; int -&gt; int -&gt; string
(** Get the sub string from a [string] from position [from] with [length].
This is the same function as String.sub, but it will never raise an exception.
And a negative [from] value is counted from the right side of the string. *)
</pre>
<p>
This gives the definition of this function and the description. There is an automatic documentation generator (ocamldoc) that reads .mli files and writes .html files as basic interface documentation. Normal comments start with (* but the documentation generator only writes comments that start with (** to the .html files. This document contains links to the documentation of the used modules.
This documentation is really helpful to start programming ocaml. The .mli files are all included in the distribution, but the complete <a href="http://caml.inria.fr/ocaml/htmlman/index.html">manual</a> and a <a href="http://caml.inria.fr/oreilly-book/">book</a> can be downloaded from the Web site <a href="http://caml.inria.fr">caml.inria.fr</a>
</p>
<p>
The function is followed by its type. It wants 3 parameters and provides a string. Normally we need to write 'Basics.string_sub' to use this function. But after the 'open Basics' instruction just 'string_sub' is enough.
</p>
<h3>Basic operations and function calls
</h3>
<p>
Now, back to the main program again. The first function is 'gettype'. It will try to return the type of a file. The file type is defined as the part of the filename following the last '.'. When there is no dot, the type is unknown and returned empty.
</p>
<pre>
let gettype file =
try
let pos = String.rindex file '.' in
String.sub file (pos+1) (String.length file-pos-1)
with Not_found -> ""
;;
</pre>
<p>
This function only uses standard functions. First, it catches the Not_found
exception in the 'try' 'with Not_found -&gt; ""' code. All other exceptions will
be passed to the caller to be handled, and can possibly stop the main program.
The local variable pos get is filled with the result of the function <a
href="http://caml.inria.fr/ocaml/htmlman/libref/String.html#VALrindex">rindex</a>.
This function is also the reason to catch the exception; otherwise, the main
program might stop on the first found file with no '.' in it. Local variables
can be declared everywhere inside ocaml with 'let &lt;variable&gt; = &lt;value&gt; in
&lt;code&gt;'. After the completion of the given code, the variable is out of
scope and will be forgotten. The data will be passed to the garbage collector
to be removed from memory.
Function calls do normally use brackets. The function call to '<a href="http://caml.inria.fr/ocaml/htmlman/libref/String.html#VALsub">String.sub</a>' gets 3 parameters the string 'file' the integer '(pos+1)' and the integer '(String.length file-pos-1)'.
The last parameter calls the function 'String.length' with a single parameter 'file'. So, the functions are eager for their parameters; brackets are needed only when the parameters are filled with calculations.
</p>
<p>
Also '<a href="http://caml.inria.fr/ocaml/htmlman/libref/Pervasives.html#VAL(+)">(+)</a>' and '<a href="http://caml.inria.fr/ocaml/htmlman/libref/Pervasives.html#VAL(+)">(-)</a>' are functions of the pervasives module. It is very easy to define your own operators; just add brackets around their definition, and they are ready.
</p>
<h3>If then else
</h3>
<p>
The next routine 'filesize' in the example code is far longer, but largely introduces sub-functions and 'if &lt;bool-expr&gt; then &lt;expr&gt; else &lt;expr&gt;' statements.
This function creates a string from an int64 number for human readable file and directory sizes. The types of parameters are normally not given; they are determined by ocaml through their usage. When something is not clear, the compiler or interpreter will complain about it before executing the code.
</p>
<pre>
let filesize s =
let tostr f =
  if f&gt;9.9 then
    string_of_int (int_of_float (f +. 0.5))
  else
    let res = string_of_float (floor (f *. 10.0 +. 0.5) /. 10.0) in
    if String.length res=2 then
      res ^ "0"
    else 
      res
in
let bytes = Int64.to_float s in
if bytes &gt; 512.0 then
  let kb = bytes /. 1024.0 in
  if kb &gt; 512.0 then
    let mb = kb /. 1024.0 in
    if mb &gt; 512.0 then
      let gb = mb /. 1024.0 in
      tostr gb ^ " Gb"
    else
      tostr mb ^ " Mb"
  else
    tostr kb ^ " kb"
else
  Int64.to_string s
;;
</pre>
<p>
The ocaml standard library has a set of conversion functions. These functions normally follow the form of 'int_of_float' and 'string_of_float'. Specific types like 'Int64' use shorthand notations like 'Int64.to_float'. String concatenations are done with the operation '(^)'. Normally, functions are defined for only one specific type, so there are new sets of arithmetic functions for floats like '(+.)', '(*.)' and '(/.)'. The 'tostr' sub-function has some extra calculation to change something like '5. Gb' into the nicer form of '5.0 Gb'.
</p>
<h3>List notation and type conversion
</h3>
<p>
The next function, 'converttime', converts a string into a float. OCaml uses floats for date for 2 reasons. The first is to prevent possible Year 2k problems, and can also be used for less than one-second time measurements. The function accepts English acronyms for month names. So let's introduce the list and the pair to create a translation of acronyms into numbers.
</p>
<pre>
let month = [("jan", 0); ("feb", 1); ("mar", 2); ("apr", 3); ("may", 4); ("jun", 5);
             ("jul", 6); ("aug", 7); ("sep", 8); ("oct", 9); ("nov", 10); ("dec", 11)]
;;
</pre>
<p>
This list is totally static, and can be used easily by the standard function <a href="http://caml.inria.fr/ocaml/htmlman/libref/List.html#VALassoc">List.assoc</a> to convert a string into the corresponding number.
</p>
<pre>
let converttime str =
try
begin match
  if str&gt;"a" &amp;&amp; str&lt;"z" then
  ( int_of_string (string_sub str (String.rindex str ' '+1) 99),
    List.assoc (string_sub str 0 3) month,
    1
  )
  else
  ( int_of_string (string_sub str 0 (
      try String.index str '-' with Not_found -&gt; 99
    )),
    ( try let pos=String.index str '-'+1 in
        int_of_string (string_sub str pos (
          try String.index_from str pos '-'-pos with err -&gt; 99
        ))-1
        with err -&gt; 0
    ),
    ( try let pos=String.index str '-'+1 in
        int_of_string (string_sub str (String.index_from str pos '-'+1) 99)
        with err -&gt; 1
    )
  )
with (yr,mn,md) -&gt;
(* print_string ("Last access before: "^
     string_of_int (if yr&lt;50 then yr+2000 else if yr&lt;100 then yr+1900 else yr)^"-"^
     string_of_int (mn+1)^"-"^
     string_of_int md^"\n"); 
*)
  fst (mktime 
  { tm_sec = 0; tm_min = 0; tm_hour = 0;
    tm_mday = md; tm_mon = mn;
    tm_year = if yr&lt;50 then yr+100 else if yr&lt;100 then yr else yr-1900;
    tm_wday = 0; tm_yday = 0; tm_isdst = false
  })
end with err -&gt;
  print_string ("Cannot decipher this date string '" ^ str ^ "'\n"); max_float
;;
</pre>
<p>
The new operation in this function is the 'match &lt;expr&gt; with &lt;template&gt; -&gt; expr'. This is one of the most versatile instructions of ocaml. It can be used to examine the contents of variables and get the needed information out of it. This function creates the triplet (year, month, day-of-month) out of 2 different date notations.
To debug this function the 'print_string' instruction is included but commented out to prevent clutter in the output of the program. Normally there is some logging mechanism to make the extra messages optional for the user.
The 'print_string' shows the ISO notation of the given date; it creates a 4-digits year and gives a month number with January=1 instead of the internal Unix use of January=0.
</p>
<p>
This function also shows the use of 'try &lt;expr&gt; with err -&gt; &lt;expr&gt;' that caches every possible exception and fills the variable 'err' with the details of the exception. This function can raise quite a lot of different exceptions, and frankly I am not very interested in the details. The routine just complains to the user about the given date string and gets over it. It returns the maximal possible float to include every filename.
</p>
<p>
The main standard function is the '<a href="http://caml.inria.fr/ocaml/htmlman/libref/Unix.html#VALmktime">Unix.mktime</a>' function. It wants to get a <a href="http://caml.inria.fr/ocaml/htmlman/libref/Unix.html#TYPEtm">record</a> filled with numbers about the current time. This function returns a pair with the needed float and a normalized record. With the pervasives function <a href="http://caml.inria.fr/ocaml/htmlman/libref/Pervasives.html#VALfst">fst</a> returns just the first parameter of the pair.
</p>
<p>
The ';' before the 'max_float' indicates that the expression results in a float, but the instructions before the ';' are calculated first. This is the first non-functional instruction inside the example code. OCaml is not strictly functional, but has the full power of other functional languages.
</p>
<h3>Dynamic data structure
</h3>
<p>
Now is the time for a real data structure that is dynamically build and can be used in a lot of different ways.
</p>
<pre>
type entrytype =
| Dir of entry list   (* directory with a list of files *)
| File of string      (* a file inside a directory *)

and

entry = {
	mutable e_name: string;   (* name of a file or directory *)
	e_type: entrytype;        (* what type is this together with type
                                     related information *)
	e_atime: float;           (* last access time *)
	e_size: int64;            (* size of the file or size of all the matching
                                     files in the directory *)
}
</pre>
<p>
The 'and' statement is used to glue the two definitions together. They are created at the same time so that 'entrytype' can include 'entry' and vice-versa. 'entrytype' can consist one of 2 things: a directory with a list of entries or a file with its type. The directory entry has a mutable name. This is can be used later on to change a filename info the full path to that file.
</p>
<p>
As with ANSI C, the operators for Boolean algebra are '(&amp;&amp;)' and '(||)'.
</p>
<h3>Recursion
</h3>
<pre>
let rec dirwrite el depth sortfn =
List.iter (
  fun e -&gt;
    match e.e_type with
    | Dir lst -&gt;
       if e.e_size &lt;&gt; Int64.of_int 0 then begin
         print_string ((String.make (depth*2) ' ') ^ "Directory " ^
           e.e_name ^ " = (" ^ filesize e.e_size ^ ")\n"); 
         dirwrite lst (depth+1) sortfn
       end
    | File string -&gt; 
       print_string ((String.make (depth*2) ' ') ^ e.e_name ^
         " (" ^ filesize e.e_size ^ ")\n")
  ) (List.sort sortfn el)
;;
</pre>
<p>
Here is the recursive ('rec') function 'dirwrite' that traverses a given tree 'el' and writes the result to the standard output. The parameter 'depth' indicates the amount of spaces to write a tree like structure of filenames. The function sorts all the lists with the given function 'sortfn'.
The new language structure here is 'fun &lt;parm-1&gt; ... &lt;parm-n&gt; -&gt; &lt;expr&gt;'. This construction creates a function without a name. The parameters of this function like construction can be used like a template to match pairs.
</p>
<p>
This function suppresses directories that are 0 bytes in size to reduce clutter.
</p>
<h3>Variables vs. definitions
</h3>
<pre>
(* List of global variables *)

let min_size = ref (Int64.of_int 0) and    (* minimum size of a file in bytes *)
    last_access = ref max_float and        (* last access time in seconds since 1970 *)
    has_type = ref "" and                  (* type of file to show or empty to
                                              show all *)
    name_match = ref "" and                (* regular expression to match the filename
                                              with; empty is show all *)
    name_regexp = ref (Str.regexp "") and  (* pre-calculated regular expression *)
    no_symlinks = ref false                (* don't follow symbolic links to
                                              directories *)
;;
</pre>
<p>
This is a list of variables that can be changed due to the 'ref &lt;expr&gt;' construction. Normally definitions are just a label to their contents. These definitions are pointers to the memory and can be read by '!&lt;variable&gt;' and written by '&lt;variable&gt; := &lt;expr&gt;'. The parameters given to the program can make changes to the way the files are read.
</p>
<pre>
let rec dirread path =
let list = ref [] and
    size = ref (Int64.of_int 0) in
try
let dh = opendir path in
while true do
  let file = readdir dh in
  if file&lt;&gt;".." &amp;&amp; file&lt;&gt;"." &amp;&amp; file&lt;&gt;"CVS" &amp;&amp; String.sub file 0 1 &lt;&gt; "." then
  let s=stat (path^"/"^file) in
  if s.st_kind = S_DIR &amp;&amp;
    (not !no_symlinks || (lstat (path^"/"^file)).st_kind &lt;&gt; S_LNK)
  then
    let dir = dirread (path^"/"^file) in
    list := 
    { e_name = file;
      e_type = Dir (fst dir);
      e_atime = s.st_atime;
      e_size = snd dir
    } :: !list;
    size := Int64.add !size (snd dir)
  else if 
    (!has_type = "" || gettype file = !has_type) &amp;&amp;
    s.st_size &gt; !min_size &amp;&amp; 
    s.st_atime &lt; !last_access &amp;&amp;
    (!name_match = "" || Str.string_match !name_regexp file 0)
  then begin
    list := 
    { e_name = file;
      e_type = File (gettype file);
      e_atime = s.st_atime;
      e_size = s.st_size;
    } :: !list;
    size := Int64.add !size s.st_size
  end
done;
(!list, !size)
with 
| End_of_file -&gt; (!list, !size)
| Unix_error (EACCES, err, parm) -&gt; (!list, !size)
;;
</pre>
<p>
The following functions are introduced in the function 'dirread':
<li>Unix.opendir to start reading a directory.
<li>Unix.readdir to read a filename.
<li>Unix.stat for a record (<a href="http://caml.inria.fr/ocaml/htmlman/libref/Unix.html#TYPEstats">Unix.stats</a>) of statistics on a file.
<li>Unix.lstat for statistics on a link.
<li>Int64.add to add two int64 type of variables
<li>Str.regexp to create a new interpreted regular expression
<li>Str.string_match to match a string against a regular expression
<li>Pervasives.(::) to create a list with an extra element in front of the old one
<li>Pervasives.true as a Boolean constant
<li>Pervasives.snd to return the second part of a pair
<li>exception Unix.Unix_error (EACCESS, err, parm) that is raised when an access denied is encountered.
</li>
There is also a new construction 'while &lt;boolean-expr&gt; do &lt;code&gt; done' it just does what it is supposed to do.
</p>
<h3>Small is beautiful
</h3>
<pre>
let rec flat el path =
List.fold_right (
  fun e ls -&gt;
    match e.e_type with
    | Dir lst -&gt; flat lst (path ^ "/" ^ e.e_name) @ ls
    | File string -&gt;
        e.e_name <- (path ^ "/" ^ e.e_name);
        e :: ls
  ) el []
;;
</pre>
<p>
This neat little routine 'flat' hits the tree 'el' flat on the ground. It takes every file from every branch and creates a single list of all the encountered files. This is done with one of the most versatile standard routines inside ocaml: the '<a href="http://caml.inria.fr/ocaml/htmlman/libref/List.html#VALfold_right">List.fold_right</a>' routine. This routine introduces a state machine (scarab) that crawls over a list and operates on every encountered element. It produces a new structure (droppings) as a result -- in this case, a flattened list.
</p>
<p>
The construction '&lt;record-field&gt; &lt;- &lt;expr&gt;' changes the contents of a mutable record field. Without mutable fields, you can mutate records only by creating a new one with lots of fields inherited from the old one. This is a shortcut for that.
</p>
<pre>
let name_order a b =
compare a.e_name b.e_name
;;

let type_order a b =
let typea = match a.e_type with Dir ls -&gt; "dir" | File tp -&gt; tp and
    typeb = match b.e_type with Dir ls -&gt; "dir" | File tp -&gt; tp in
if compare typea typeb = 0 then
  compare a.e_name b.e_name
else compare typea typeb
;;

let atime_order a b =
compare a.e_atime b.e_atime
;;
</pre>
<p>
A set of sorting functions to use inside 'dirwrite'. The function 'compare' results in the widely used values of -1 for lower than, 0 for equal and +1 for higher than.
</p>
<h3>Command line parameters
</h3>
<pre>
let dir = ref "." and
    sort = ref name_order and
    show = ref 0
    in

Arg.parse [
   ("-t",Arg.Unit (fun () -&gt; sort := type_order), 
     "Sort by type and filename");
   ("-l",Arg.Unit (fun () -&gt; sort := atime_order),
     "Sort by last access time");
   ("-n",Arg.Unit (fun () -&gt; show := 1),
     "List filenames");
   ("-b",Arg.Unit (fun () -&gt; show := 2),
     "List both filenames and sizes");
   ("-s",Arg.Unit (fun () -&gt; no_symlinks := true),
     "Don't follow symbolic links");
   ("--before",Arg.String (fun s -&gt; last_access := converttime s),
     "Last access older than give date (format 'yyyy-mm-dd' or 'mmm yyyy')");
   ("--size",Arg.Int (fun i -&gt;
        min_size := Int64.mul (Int64.of_int i) (Int64.of_int (1024*1024))
     ), "File size bigger than size in Mbytes");
   ("--type",Arg.String (fun s -&gt; has_type := s),
     "File is specific type");
   ("--name",Arg.String (fun s -&gt;
        name_match := s; name_regexp := Str.regexp (s ^ "$")
     ), "Filename matches regular expression")
] (fun d -&gt; dir := d) "show [DIR]";
let res = dirread !dir in
if !show=0 then begin
  dirwrite (fst res) 0 !sort;
  print_string ("Total size " ^ filesize (snd res) ^ "\n")
end else
  List.iter
    (fun e -&gt; 
      print_endline (e.e_name ^ if !show=2 then " ("^filesize e.e_size^")" else "")
    ) (List.sort !sort (flat (fst res) !dir))
;;
</pre>
<p>
And here is the main routine. It calls the Arg.parse routine to parse the parameters given to the program. But this is too much un-GNU for me. I wrote my own version of it that follows the GNU coding standards a bit more than the default one (<a href="http://camlserv.sourceforge.net/Gnuarg.html">Gnuarg</a>). The other version is a bit more complicated so I will include only the sources that use it.
</p>
<h3>Generating binaries
</h3>
<p>
The code can be obtained from <a href="http://camlserv.sourceforge.net/show.tar.gz">here</a>. Just unpack it somewhere with 'tar -xzf show.tar.gz' and move into the source directory with 'cd show/src'.
There is also a Makefile that compiles to machine code and installs everything. But Makefiles are too rough for sour eyes to show in this article. The nitty-gritty details are there in the source. The general compile form is.
</p>
<pre>
ocamlopt -o show unix.cmxa str.cmxa basics.cmx show.ml
</pre>
<p>
The only non-standard libraries in use here are unix.cmxa and str.cmxa.
</p>
<pre>
make
su
make install
exit
show --help
show -s ~ --size 3 --before "apr 2003"
</pre>
<p>
That concludes this example program.
</p>
<h3>Language features
</h3>
<dl>
<dt>Garbage collector
<dd>Just forget variables that contain complete data structures. Once it gets out of scope, the total structure will be eliminated from memory in due time.
<dt>Flexible data-structures
<dd>Any 2 data structures can be combined without hassle. Just create an array of records that contain 2 fields with hash tables of strings. No problem there... everything in a single variable than can be passed to functions or can be used globally in the program.
<dt>No pointers needed
<dd>A variable can have any type and when a new variable is created 
<dt>Flexible in language boundary checks
<dd>The language can check array and string boundaries automatically, or those checks can be turned off for an extra speed boost. Without it, the language can give a segmentation fault, but that is the programmer's choice.
<dt>High quality error handling
<dd>Totally integrated into the language and no notable performance hit.
<dt>Native code generator and byte code interpreter
<dd>All the tools are there -- interpreter (ocaml), byte code (ocamlc) and native code compiler (ocamlopt) -- every wish is granted. The package comes also with a documentation generator (ocamldoc) and a simple to use profiler (ocamlprof) that adds usage counts as comments to the original source code. The language is also compatible with the more sophisticated profilers around.
<dt>ANSI-C compatibility layer
<dd>It is possible to include ANSI C routines inside OCaml programs, and OCaml routines inside C programs. This has a very easy to use API. Slightly less easy is the creation of OCaml data structures inside C; for me, that was the source of many segfaults. So, my routines call exported OCaml routines to fill data structures and create only OCaml strings and numbers in C. That way I won't have the hassle to debug the C code... OCaml is much easier to debug for me.
<dt>Object orientation
<dd>Not my favourite programming paradigm, but it is possible to build object-oriented programs in this language. Those features are not part of this article. I can live without them.
<dt>An active mailing list
<dd>This list is at caml-list@inria.fr and is normally in English. Yes, this originally French project has taken the burden to translate almost everything they got. This is no easy feat for them, so be grateful.
</dl>
<h3>Cons:</h3>
<dl>
<dt>Duplicate efforts in libraries
<dd>There are separate libraries for different type of big arrays, big files, and extra long integers. This isn't a big problem, because you can always just start with the normal structures and drop in the special library when need arises. The naming of the different functions is very much standardized, so renaming of function calls isn't needed much. The extra long integers though are too much different from normal integers. That part of the standard functions really need some tuning.
<dt>Readability
<dd>You need to be familiar with the basis constructions of the language, to make any sense of the actual code. Some constructions can look really weird without intimate knowledge of the language. OCaml is not a very natural language and has a very powerful, short notation for things. But this not much worse than languages like ANSI C, Perl, or lisp.
<dt>Not known enough in the Linux world
<dd>This language has excellent interfaces to standard libraries and easy binding to ANSI C, but still isn't very known. I like to create some articles like this to change that a bit. This is a really great language to program in, and gives you real power without the pitfalls common in other languages. Programmers should give it a try and feel that power once.
</dl>


</p>


<!-- *** BEGIN author bio *** -->
<P>&nbsp;
<P>
<!-- *** BEGIN bio *** -->
<P>
<img ALIGN="LEFT" ALT="[BIO]" SRC="../gx/2002/note.png">
<em>
Developer at a small technology firm in the Netherlands called V&S bv. 
(<A HREF="http://www.v-s.nl/">www.v-s.nl</A>)
We sell firewall, anti-virus and spam boxes based on the Linux OS.
Using more and more the OCaml language to write my applications.
Busy writing a lightweight http server with an internal scripting language 
(<A HREF="http://camlserv.sourceforge.net/">camlserv.sourceforge.net</A>,
source code <A HREF="http://camlserv.sourceforge.net/show.tar.gz">here</A>)
Interested in writing AI based computer games. Always trying writing 
one, nothing ready yet.
</em>
<br CLEAR="all">
<!-- *** END bio *** -->

<!-- *** END author bio *** -->

<div id="articlefooter">

<p>
Copyright &copy; 2004, Jurjen Stellingwerff. Copying license 
<a href="http://linuxgazette.net/copying.html">http://linuxgazette.net/copying.html</a>
</p>

<p>
Published in Issue 99 of Linux Gazette, February 2004
</p>

</div>


<div id="previousnextbottom">
<A HREF="pramode.html" >&lt;-- prev</A>
</div>


</div>






<div id="navigation">

<a href="../index.html">Home</a>
<a href="../faq/index.html">FAQ</a>
<a href="../lg_index.html">Site Map</a>
<a href="../mirrors.html">Mirrors</a>
<a href="../mirrors.html">Translations</a>
<a href="../search.html">Search</a>
<a href="../archives.html">Archives</a>
<a href="../authors/index.html">Authors</a>
<a href="../contact.html">Contact Us</a>

</div>



<div id="breadcrumbs">

<a href="../index.html">Home</a> &gt; 
<a href="index.html">February 2004 (#99)</a> &gt; 
Article

</div>





<img src="../gx/2003/sit3-shine.7-2.gif" id="tux" alt="Tux"/>




</body>
</html>