<!--startcut  ==============================================-->
<!-- *** BEGIN HTML header *** -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML><HEAD>
<title>Learning Perl, part 5 LG #69</title>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
ALINK="#FF0000">
<!-- *** END HTML header *** -->

<CENTER>
<A HREF="http://www.linuxgazette.com/">
<IMG ALT="LINUX GAZETTE" SRC="../gx/lglogo.png" 
	WIDTH="600" HEIGHT="124" border="0"></A> 
<BR>

<!-- *** BEGIN navbar *** -->
<IMG ALT="" SRC="../gx/navbar/left.jpg" WIDTH="14" HEIGHT="45" BORDER="0" ALIGN="bottom"><A HREF="nielsen.html"><IMG ALT="[ Prev ]" SRC="../gx/navbar/prev.jpg" WIDTH="16" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="index.html"><IMG ALT="[ Table of Contents ]" SRC="../gx/navbar/toc.jpg" WIDTH="220" HEIGHT="45" BORDER="0" ALIGN="bottom" ></A><A HREF="../index.html"><IMG ALT="[ Front Page ]" SRC="../gx/navbar/frontpage.jpg" WIDTH="137" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="http://www.linuxgazette.com/cgi-bin/talkback/all.py?site=LG&article=http://www.linuxgazette.com/issue69/okopnik.html"><IMG ALT="[ Talkback ]" SRC="../gx/navbar/talkback.jpg" WIDTH="121" HEIGHT="45" BORDER="0" ALIGN="bottom"  ></A><A HREF="../faq/index.html"><IMG ALT="[ FAQ ]" SRC="./../gx/navbar/faq.jpg"WIDTH="62" HEIGHT="45" BORDER="0" ALIGN="bottom"></A><A HREF="orr.html"><IMG ALT="[ Next ]" SRC="../gx/navbar/next.jpg" WIDTH="15" HEIGHT="45" BORDER="0" ALIGN="bottom"  ></A><IMG ALT="" SRC="../gx/navbar/right.jpg" WIDTH="15" HEIGHT="45" ALIGN="bottom">
<!-- *** END navbar *** -->
<P>
</CENTER>

<!--endcut ============================================================-->

<H4 ALIGN="center">
"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>

<P> <HR> <P> 
<!--===================================================================-->

<center>
<H1><font color="maroon">Learning Perl, part 5</font></H1>
<H4>By <a href="mailto:ben-fuzzybear@yahoo.com">Ben Okopnik</a></H4>
</center>
<P> <HR> <P>  

<!-- END header -->




<br><tt>What is the sound of Perl? Is it not the sound of a wall that people
have stopped banging their heads against?</tt>
<br><tt>&nbsp;-- Larry Wall</tt>

<p><b>Overview</b>
<p>This month, we're going to cover some general Perl issues, look at a
way to use it in Real Life(tm), and take a quick look at a mechanism that
lets you leverage the power of O.P.C. - Other People's Code :). <i>Modules</i>
let you plug chunks of pre-written code into your own scripts, saving you
hours - or even days - of programming. That will wrap up this introductory
series, hopefully leaving you with enough of an idea of what Perl is to
write some basic scripts, and perhaps a desire to explore further.
<br>&nbsp;
<p><b>A Quick Correction</b>
<p>One of our readers, A. N. Onymous (he didn't want to give me his name;
I guess he decided that fame wasn't for him...), wrote in regarding a statement
that I made in last month's article - that "close" without any parameters
closes <i>all</i> filehandles. After cranking out a bit of sample code
and reading the docs a bit more closely, I found that he was right: it
only closes the currently selected filehandle (<tt>STDOUT</tt> by default).
Thanks very much - well spotted!
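<p>Here is a minimal sketch of that behavior for anyone who wants to check it
at home. It assumes the File::Temp module is available for creating two scratch
files, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two anonymous scratch files (the variable names are illustrative).
my $first  = tempfile();
my $second = tempfile();

my $previous = select $first;   # make $first the currently selected handle
close;                          # no argument: closes the selected handle
select $previous;               # restore the old default (normally STDOUT)

# fileno() returns undef for a handle that is no longer open.
my $first_open  = defined fileno $first;
my $second_open = defined fileno $second;

print "first:  ", ( $first_open  ? "open" : "closed" ), "\n";
print "second: ", ( $second_open ? "open" : "closed" ), "\n";
```

<p>Only the handle that was selected when the bare "close" ran ends up closed;
the other one is untouched.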
<br>&nbsp;
<p><b>Exercises</b>
<p>In the last article, I suggested a couple of script ideas that would
give you some practice in using what you'd learned previously. One of the
people who sent in a script was Tjabo Kloppenburg; a brave man. :) No worries,
Tjabo; either you did a good job, or you get to learn a few things... it's
a win-win situation.
<p>The idea was to write a script that read "/etc/services", counted UDP
and TCP ports, and wrote them out to separate files. Here was Tjabo's solution
(my comments are preceded with a '###'):
<br>
<hr WIDTH="100%"><tt>#!/usr/bin/perl -w</tt>
<br><tt>### Well done; let the computer debug your script!</tt>
<p><tt>$udp = $tcp = 0;</tt>
<br><tt>### Unnecessary: Perl does not require variable declaration.</tt>
<p><tt># open target files:</tt>
<br><tt>open (TCP, ">>tcp.txt") or die "Arghh #1 !";</tt>
<br><tt>open (UDP, ">>udp.txt") or die "Arghh #2 !";</tt>
<p><tt>### My fault here: in a previous article, I showed a quick hack
in which</tt>
<br><tt>### I used similar wording for the "die" string. Here is the proper
way</tt>
<br><tt>### to use it:</tt>
<br><tt>###</tt>
<br><tt>### open TCP, ">tcp.txt" or die "Can't open tcp.txt: $!\n";</tt>
<br><tt>###</tt>
<br><tt>### The '$!' variable gives the error returned by the system, and
should</tt>
<br><tt>### definitely be used; the "\n" at the end of the "die" string
prevents</tt>
<br><tt>### the error line number from being printed. Also, the ">>" (append)</tt>
<br><tt>### modifier is inappropriate: this will cause anything more than
one</tt>
<br><tt>### execution of the script to append (rather than overwrite) the</tt>
<br><tt>### contents of those files.</tt>
<p><tt># open data source:</tt>
<br><tt>open (SERV, "&lt;/etc/services") or die "Arghh #3 !";</tt>
<p><tt>while( &lt;SERV> ) {</tt>
<br><tt>&nbsp; if (/^ *([^# ]+) +(\d+)\/([tcpud]+)/) {</tt>
<p><tt>### The above regex has several problems, some of them minor</tt>
<br><tt>### (unnecessary elements) and one of them critical: it actually
misses</tt>
<br><tt>### out on most of the lines in "/etc/services". The killer is
the ' +'</tt>
<br><tt>### that follows the first capture: "/etc/services" uses a mix
of spaces</tt>
<br><tt>### and *tabs* to separate its elements.</tt>
<p><tt>&nbsp;&nbsp;&nbsp; $name&nbsp;&nbsp; = $1;</tt>
<br><tt>&nbsp;&nbsp;&nbsp; $port&nbsp;&nbsp; = $2;</tt>
<br><tt>&nbsp;&nbsp;&nbsp; $tcpudp = $3;</tt>
<br><tt>&nbsp;&nbsp;&nbsp; $tmp = "$name ($port)\n";</tt>
<p><tt>### The above assignments are unnecessary; $1, $2, etc. will keep
their</tt>
<br><tt>### values until the next successful match. Removing all of the
above</tt>
<br><tt>### and rewriting the "if" statement below as</tt>
<br><tt>###</tt>
<br><tt>### if ( $3 eq "udp" ) { print UDP "$1 ($2)\n"; $udp++; }</tt>
<br><tt>###</tt>
<br><tt>### would work just fine.</tt>
<p><tt>&nbsp;&nbsp;&nbsp; if ($tcpudp eq "udp") {</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; print UDP $tmp;</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; $udp++;</tt>
<br><tt>&nbsp;&nbsp;&nbsp; }</tt>
<p><tt>&nbsp;&nbsp;&nbsp; if ($tcpudp eq "tcp") {</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; print TCP $tmp;</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; $tcp++;</tt>
<br><tt>&nbsp;&nbsp;&nbsp; }</tt>
<br><tt>&nbsp; }</tt>
<br><tt>}</tt>
<p><tt># just learned :-) :</tt>
<br><tt>for ( qw/SERV TCP UDP/ ) { close $_ or die "can't close $_: $!\n";
}</tt>
<p><tt>print "TCP: $tcp, UDP: $udp\n";</tt>
<br>
<hr WIDTH="100%">
<p>The above script counted 14 TCPs and 11 UDPs in my "/etc/services" (which
actually contains 185 of one and 134 of the other). Let's see if we can
improve it a bit:
<br>
<hr WIDTH="100%">
<br><tt>#!/usr/bin/perl -w</tt>
<p><tt>open SRV, "&lt;/etc/services" or die "Can't read /etc/services:
$!\n";</tt>
<br><tt>open TCP, ">tcp.txt"&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; or die
"Can't write tcp.txt: $!\n";</tt>
<br><tt>open UDP, ">udp.txt"&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; or die
"Can't write udp.txt: $!\n";</tt>
<p><tt>for ( &lt;SRV> ) {</tt>
<br><tt>&nbsp;&nbsp;&nbsp; if ( s=^([^# ]+)(\s+\d+)/tcp.*$=$1$2= ) { print
TCP; $tcp++; }</tt>
<br><tt>&nbsp;&nbsp;&nbsp; if ( s=^([^# ]+)(\s+\d+)/udp.*$=$1$2= ) { print
UDP; $udp++; }</tt>
<br><tt>}</tt>
<p><tt>close $_ or die "Failed to close $_: $!\n" for qw/SRV TCP UDP/;</tt>
<p><tt>print "TCP: $tcp\t\tUDP: $udp\n";</tt>
<br>
<hr WIDTH="100%">
<br>In the "for" loop, where all the 'real' work is done, I perform the
following matches/substitutions:
<p><i>Starting at the beginning of the line, (begin capture into $1) match
any character that is not a '#' or a space and occurs one or more times
(end capture). (Begin capture into $2) Match any whitespace character that
occurs one or more times that is followed by one or more digits (end capture),
a forward slash, and the string 'tcp' followed by any
number of any character to the end of the line. Replace the matched
string (i.e., the entire line) with $1$2 (which contain the name of the
service, whitespace, and the port number.) Write the result to the TCP
filehandle, and increment the "$tcp" variable.</i>
<p><i>Repeat for 'udp'.</i>
<p>Note that I used the '=' symbol for the delimiter in the 's///' function.
'=' has no particular magic about it; it's just that I was trying to avoid
conflict with the '/' and the '#' characters which appear as part of the
regex (those being two commonly used delimiters), and there was a sale
on '=' at the neighborhood market. :) Any other character or symbol would
have done as well.
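<p>To see why the ' +' in the original regex was the killer, here is a small
sketch that runs both styles of pattern over three invented lines laid out the
way "/etc/services" usually is. The sample data and variable names are
assumptions for illustration only, and the second pattern is a simplified
relative of the ones in the rewritten script, not a copy of them.

```perl
use strict;
use warnings;

# Made-up sample lines: the first two separate name and port with tabs,
# the third uses spaces only.
my @lines = (
    "ftp\t\t21/tcp\n",
    "domain\t\t53/udp\n",
    "smtp    25/tcp    mail\n",
);
my $total = @lines;

# The original pattern insists on literal spaces after the service name.
my $spaces_only = grep { /^ *([^# ]+) +(\d+)\/([tcpud]+)/ } @lines;

# '\s+' accepts any mix of spaces and tabs.
my $whitespace  = grep { /^([^#\s]+)\s+(\d+)\/(tcp|udp)/ } @lines;

print "' +'  matched $spaces_only of $total lines\n";
print "'\\s+' matched $whitespace of $total lines\n";
```

<p>With this sample, the space-only pattern finds one line out of three, while
the whitespace version finds all of them.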
<br>&nbsp;
<p>Here are a couple of simple solutions for the other two problems:
<p>1. Open two files and exchange their contents.
<br>
<hr WIDTH="100%">
<br><tt>#!/usr/bin/perl -w</tt>
<br><tt># The files whose contents are to be exchanged are named "a" and
"b".</tt>
<p><tt>for ( A, B ) { open $_, "&lt;\l$_" or die "Can't open \l$_: $!\n";
}</tt>
<br><tt>@a = &lt;A>; @b = &lt;B>;</tt>
<p><tt>for ( A, B ) { open $_, ">\l$_" or die "Can't open \l$_: $!\n";
}</tt>
<br><tt>print A @b; print B @a;</tt>
<br>
<hr WIDTH="100%">
<br>Pretty conservative, basic stuff. A minor hack: I used the '\l' modifier,
which lowercases the character that follows it, to turn the handle names "A"
and "B" into the filenames "a" and "b". Note that re-opening a filehandle closes
it automatically - you don't have to close a handle between different "open"s.
Also, explicitly closing a file isn't always necessary: Perl will close
the handles for you on script exit (but be aware that some OSs have been
reported as leaving them open.) By the way, the current version of Perl
(5.6.1) has a neat mechanism that helps you do what I did above, but far
more gracefully:
<br>
<hr WIDTH="100%">
<br><tt>...</tt>
<br><tt>$FN = "/usr/X11R6/include/X11/Composite.h";</tt>
<br><tt>open FN or die "I choked on $FN: $!\n";</tt>
<br><tt># "FN" is now open as a filehandle to "Composite.h".</tt>
<br><tt>...</tt>
<br>
<hr WIDTH="100%">
<br>All the distros with which I'm familiar currently come with Perl version
5.005.003 installed. I suggest getting 5.6.1 from CPAN (see below) and
installing it; different versions of Perl coexist quite happily on the
same machine. (Note that <b>replacing</b> an installed version by anything
other than a distro package can be rather tricky, given how much system
stuff depends on Perl.)
<p>I'm sure that a number of folks figured out that renaming the files
would produce the same result. That wasn't the point of the exercise...
but here's a fun way to do that:
<br>
<hr WIDTH="100%">
<br><tt>#!/usr/bin/perl -w</tt>
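<br><tt>@names = ( "a", "$$.$$", "b", "a", "$$.$$", "b" );</tt>
<br><tt>rename $x, $y while ($x, $y) = splice @names, 0, 2;</tt>
<br>
<hr WIDTH="100%">
<br>Here, I created a list of the filenames and a temporary
name - "$$" in Perl, just as in the shell, is the current process ID,
and "$$.$$" is almost certainly a unique filename - and walked through
it two elements at a time with "splice", renaming the first name of each
pair to the second. The names are in double quotes because "qw//" does not
interpolate variables, and in a list rather than a hash because "each"
(which retrieves key/value pairs from hashes) returns them in no guaranteed
order, while the order of these three renames matters.
I suppose you could call it "round-robin renaming"...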
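<p>The order of the three renames is what makes the swap work, so here is a
sketch that keeps the steps in an ordered list and tries them out in a scratch
directory. The helper names ("write_file", "read_file") and the use of
File::Temp are assumptions for the demo, not part of the script above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir( CLEANUP => 1 );   # scratch directory, removed on exit

# Small helpers (hypothetical names): write a string to a file, read it back.
sub write_file {
    my ($name, $text) = @_;
    open my $fh, '>', $name or die "Can't write $name: $!\n";
    print $fh $text;
    close $fh or die "Can't close $name: $!\n";
}
sub read_file {
    my ($name) = @_;
    open my $fh, '<', $name or die "Can't read $name: $!\n";
    local $/;                        # slurp mode: read the whole file at once
    return scalar <$fh>;
}

write_file( "$dir/a", "originally in a\n" );
write_file( "$dir/b", "originally in b\n" );

# a goes to a temporary name, b goes to a, the temporary name goes to b.
my $tmp   = "$dir/$$.$$";
my @steps = ( "$dir/a", $tmp, "$dir/b", "$dir/a", $tmp, "$dir/b" );
while ( my ($from, $to) = splice @steps, 0, 2 ) {
    rename $from, $to or die "Can't rename $from to $to: $!\n";
}

print "a: ", read_file("$dir/a");
print "b: ", read_file("$dir/b");
```

<p>After the loop, each file holds what the other one started with.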
<br>&nbsp;
<p>2. Read "/var/log/messages" and print out any line that contains the
words "fail", "terminated/terminating", or " no " in it. Make it case-insensitive.
<p>This one is an easy one-liner:
<br>
<hr WIDTH="100%">
<br><tt>perl -wne 'print if /(fail|terminat(ed|ing)| no )/i' /var/log/messages</tt>
<br>
<hr WIDTH="100%">
<br>The interesting part there is the "alternation" mechanism in the match:
a pattern like "<tt>(abc|def|ghi)</tt>" matches any line that contains at
least one of the listed alternatives.
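<p>For a quick test that doesn't need read access to the system log, the same
pattern can be tried against a few invented lines. The log entries below are
made up for illustration.

```perl
use strict;
use warnings;

# Invented log lines, standing in for /var/log/messages.
my @log = (
    "kernel: Authentication FAILED for user bob",
    "sshd: session opened for user alice",
    "init: process terminating on signal 15",
    "named: found no such zone",
    "cron: job finished normally",
);

# Same alternation as the one-liner; /i makes it case-insensitive.
my @hits = grep { /(fail|terminat(ed|ing)| no )/i } @log;

print "$_\n" for @hits;
```

<p>Note that "finished normally" is not a hit: the " no " alternative needs a
space on both sides of the word.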
<br>&nbsp;
<p><b>Building Quick Tools</b>
<p>A few days ago, I needed to convert a text file into its equivalent
in the phonetic alphabet - a somewhat odd requirement. There may or may not
have been a program to do this, but I figured I could write my own in
less time than it would take me to find one:
<p>1) I grabbed a copy of the phonetic alphabet from the Web and saved
it to a file. I called the file "phon", and it looked like this:
<p><tt>Alpha</tt>
<br><tt>Bravo</tt>
<br><tt>Charlie</tt>
<br><tt>Delta</tt>
<br><tt>Echo</tt>
<br><tt>Foxtrot</tt>
<br><tt>Golf</tt>
<br><tt>...</tt>
<p>2) Then, I issued the following command:
<br>
<hr WIDTH="100%">
<br><tt>perl -i -wple's/^(.)(.*)$/\t"\l$1" => "$1$2",/' phon</tt>
<br>
<hr WIDTH="100%">
<br>Ta-daa! Magic. (See below for a breakdown of the substitute operation.)
The file now looked like this:
<p><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "a" => "Alpha",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "b" => "Bravo",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "c" => "Charlie",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "d" => "Delta",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "e" => "Echo",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "f" => "Foxtrot",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "g" => "Golf",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ...</tt>
<p>3) A few seconds later, I had the tool that I needed - a script with
exactly one function and one data structure in it:
<br>
<hr WIDTH="100%">
<br><tt>#!/usr/bin/perl -wlp</tt>
<br><tt># Created by Benjamin Okopnik on Sun May 27 13:07:49 2001</tt>
<p><tt>s/([a-zA-Z])/$ph{"\l$1"} /g;</tt>
<p><tt>BEGIN {</tt>
<br><tt>&nbsp;&nbsp;&nbsp; %ph = (</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "a" => "Alpha",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "b" => "Bravo",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "c" => "Charlie",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "d" => "Delta",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "e" => "Echo",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "f" => "Foxtrot",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "g" => "Golf",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "h" => "Hotel",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "i" => "India",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "j" => "Juliet",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "k" => "Kilo",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "l" => "Lima",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "m" => "Mike",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "n" => "November",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "o" => "Oscar",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "p" => "Papa",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "q" => "Quebec",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "r" => "Romeo",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "s" => "Sierra",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "t" => "Tango",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "u" => "Uniform",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "v" => "Victor",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "w" => "Whisky",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "x" => "X-ray",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "y" => "Yankee",</tt>
<br><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "z" => "Zulu",</tt>
<br><tt>&nbsp;&nbsp;&nbsp; );</tt>
<br><tt>}</tt>
<br>
<hr WIDTH="100%">
<br>The above script will accept either keyboard input or a file as a command-line
argument, and return the phonetic alphabet equivalent of the text.
<p>This is one of the most common ways I use Perl - building quick tools
that I need to do a specific job. Other people may have other uses for
it - after all, <a href="#1">TMTOWTDI [1]</a> - but for me, a computer
without Perl is only half-useable. To drive the point even further home,
a group of Perl Wizards have rewritten most of the system utilities in
Perl - take a look at &lt;http://language.perl.com/ppt/> - and have fixed
a number of annoying quirks in the process. As I understand it, they were
motivated by the three chief virtues of the programmer: Laziness, Impatience,
and Hubris (if that confuses you, see the Camel Book ["Programming Perl,
Third Edition"] for the explanation). If you want to see <b>well</b>-written
Perl code, there are very few better places. Do note that the project is
not yet complete, but a number of Unices are already catching on: Solaris
8 has a large number of Perl scripts as part of the system
executables, and doing a
<p><tt>file /sbin/* /usr/bin/* /usr/sbin/*|grep -c perl</tt>
<p>shows at least the Debian "potato" distro as having 82 Perl scripts
in the above directories.
<br>&nbsp;
<p>OK, now for the explanation of the two s///'s above. First, the "magic"
converter:
<p><tt>perl -i -wple's/^(.)(.*)$/\t"\l$1" => "$1$2",/' phon</tt>
<p>The "-i", "-w", "-p", and "-e" switches were described in the second
part of this series; as a quick overview, this will edit the contents of
the file by looping through it and acting on each line. The Perl "warn"
mechanism is enabled, and the script to be executed runs from the command
line. The "-l" enables end-of-line processing: the newline is stripped from
each line as it is read and added back when the line is printed, so a final
line that lacks one gets it. The substitution regex goes like
this:
<p><i>Starting at the beginning of the line, (begin capture into $1) match
one character (end capture, begin capture into $2). Capture any number
of any character (end capture) to the end of the line.</i>
<p>The replacement string goes like this:
<p><i>Print a tab, followed by the contents of $1 in lowercase* and surrounded
by double quotes. Print a space, the '=>' digraph, another space, $1$2
surrounded by double quotes and followed by a comma.</i>
<p>* This is done by the "\l" 'lowercase next character' operator (see
'Quote and Quote-like Operators' in the "perlop" page.)
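<p>The same substitution can be tried on a few words without touching a file.
This is only a sketch: the three-word list and the variable names are
assumptions for the example.

```perl
use strict;
use warnings;

my @entries;
for my $word ( qw(Alpha Bravo Charlie) ) {
    my $line = $word;
    # Same substitution as the one-liner: first letter into $1, rest into $2.
    $line =~ s/^(.)(.*)$/\t"\l$1" => "$1$2",/;
    push @entries, $line;
}

print "$_\n" for @entries;
```

<p>Each word comes out as a tab-indented line that is ready to paste into a
hash definition.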
<br>&nbsp;
<p>The second one is also worth studying, since it points up an interesting
feature - that of using a hash value (including modifying the key "on the
fly") in a substitution, a very useful method:
<p><tt>s/([a-zA-Z])/$ph{"\l$1"} /g;</tt>
<p>First, the regex:
<p><i>(Begin capture into $1) Match any character in the 'a-zA-Z' range
(end capture).</i>
<p>Second, the replacement string:
<p><i>Return a value from the "%ph" hash by using the lowercase version
of the contents of $1 as the key, followed by a space.</i>
<br>&nbsp;
<p>The BEGIN { ... } block makes populating the hash a one-time event,
despite the fact that the script may loop thousands of times. The mechanism
here is the same as in Awk, and was mentioned in the previous article.
So, all we do is use every character as a key in the "%ph" hash, and print
out the value associated with that key.
<p>Hashes are very useful structures in Perl, and are well worth studying
and understanding.
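<p>As a sketch of the hash-in-a-substitution trick on its own, the lookup can
be wrapped in a function. The function name "phonetic" is hypothetical, and the
table is built from a word list with "map" instead of being typed out, which is
a shortcut for this example and not how the script above does it.

```perl
use strict;
use warnings;

# Build the letter-to-word table: the key is each word's first letter,
# lowercased.
my %ph = map { ( lc substr( $_, 0, 1 ), $_ ) } qw(
    Alpha Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliet
    Kilo Lima Mike November Oscar Papa Quebec Romeo Sierra Tango
    Uniform Victor Whisky X-ray Yankee Zulu
);

sub phonetic {
    my ($text) = @_;
    # Each letter is replaced by its hash value plus a space; the key is
    # lowercased on the fly with \l, as in the script above.
    $text =~ s/([a-zA-Z])/$ph{"\l$1"} /g;
    return $text;
}

my $spelled = phonetic("Perl 5");
print "$spelled\n";
```

<p>Characters outside 'a-zA-Z', such as the space and the digit here, pass
through unchanged.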
<br>&nbsp;
<p><b>Modular Construction</b>
<p>One of the wonderful things about Perl - really, the thing that makes
it a living, growing language - is the community that has grown up around
it. A number of these folks have contributed useful chunks of code that
are made to be re-used; that, in fact, make Perl one of the most powerful
languages on the planet.
<p>Imagine a program that goes out on the Web, connects to a server, retrieves
the weather data - either the current or the forecast - for your city,
and prints the result to your screen. Now, imagine this entire Perl script
taking just <b>one</b> line.
<p><tt>perl -MGeo::WeatherNOAA -we 'print print_forecast( "Denver", "CO"
)'</tt>
<p>That's it. The whole thing. How is it possible?
<p>(Note that this will not work unless you have the 'Geo::WeatherNOAA'
module installed on your system.)
<p>The CPAN (Comprehensive Perl Archive Network) is your friend. :) If
you go to &lt;http://cpan.org/> and explore, you'll find lots and lots
(and LOTS) of modules designed to do almost every programming task you
could imagine. Do you want your Perl script converted to Klingon (or Morse
code)? Sure. Would you like to pull up your stock's performance from Deutsche
Bank Gruppe funds? Easy as pie. Care to send some SMS text messages? No
problem! With modules, these are short, easy tasks that can be coded in
literally seconds.
<p>The standard Perl distribution comes with a number of useful modules
(for short descriptions of what they do, see "Standard Modules" in 'perldoc
perlmodlib'); one of them is the CPAN module, which automates the module
downloading, unpacking, building, and installation process. To use it,
simply type
<p><tt>perl -MCPAN -eshell</tt>
<p>and follow the prompts. The manual process, which you should know about
just in case there's some complication, is described on the "How to install"
page at CPAN, &lt;http://cpan.org/modules/INSTALL.html>. I highly
recommend reading it. The difference between the two processes, by the
way, is exactly like that of using "apt" (Debian) or "rpm" (RedHat) and
trying to install a tarball by hand: 'CPAN' will get all the prerequisite
modules to support the one you've requested, and do all the tests and the
installation, while doing it manually can be rather painful. For specifics
of using the CPAN module - although the above syntax is the way you'll
use it 99.9% of the time - just type
<p><tt>perldoc CPAN</tt>
<p>The complete information for any module installed on your system can
be accessed the same way.
<p>As you've probably guessed by now, the "-M" command line switch tells
Perl to use the specified module. If we want to have that module in a script,
here's the syntax:
<br>
<hr WIDTH="100%">
<br><tt>#!/usr/bin/perl -w</tt>
<p><tt>use Finance::Quote;</tt>
<p><tt>$q = Finance::Quote->new;</tt>
<br><tt>my %stocks = $q->fetch("nyse","LNUX");</tt>
<br><tt>print "$k: $v\n" while ($k, $v) = each %stocks;</tt>
<br>
<hr WIDTH="100%">
<br>The above program (you'll need to install the "Finance::Quote" module
for it to work) tells me all about VA Linux on the New York Stock Exchange.
Not bad for five lines of code.
<p>The above is an example of the <i>object-oriented</i> style of module,
the type that's becoming very common. After telling Perl to use the module,
we create a new instance of an object from the "Finance::Quote" class and
assign it to <tt>$q</tt>. We then call the "fetch" method (the methods
are listed in the module's documentation) with "nyse" and "LNUX" as its arguments,
and print the results stored in the returned hash.
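<p>Finance::Quote needs a separate install and a network connection, so here
is a sketch of the same create-an-object, call-its-methods pattern using
Digest::MD5 instead. That module is a substitute chosen for this example on
the assumption that it is already present on your system (it has been bundled
with Perl for many years).

```perl
use strict;
use warnings;
use Digest::MD5;

my $ctx = Digest::MD5->new;        # create an object of the Digest::MD5 class
$ctx->add("Just another ");        # call methods on it to feed in data...
$ctx->add("Perl hacker");
my $digest = $ctx->hexdigest;      # ...and to ask for the result

print "MD5: $digest\n";
```

<p>The object remembers what has been added so far, which is why the data can
be fed in pieces.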
<p>A lot of modules are of the so-called <i>exporting</i> style; these
simply provide additional functions when "plugged in" to your program.
<br>
<hr WIDTH="100%">
<br><tt>#!/usr/bin/perl -w</tt>
<br><tt>use LWP::Simple;</tt>
<p><tt>$code = mirror( "http://slashdot.org", "slashdot.html" );</tt>
<br><tt>print "Slashdot returned a code of $code.\n";</tt>
<br>
<hr WIDTH="100%">
<br>In this case, "mirror" is a new function that came from the LWP::Simple
module. Somewhat obviously, it will copy ("mirror") a given web page to
a specified file, and return the HTTP status code (e.g., '404' for 'RC_NOT_FOUND').
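<p>The exporting style can also be tried without a network connection. The
sketch below uses List::Util, a module picked for this example on the
assumption that it came with your Perl; the list of port numbers is made up.

```perl
use strict;
use warnings;
use List::Util qw(sum max);   # import just these two functions

my @ports = ( 21, 25, 53, 80 );

# "sum" and "max" now behave like built-in functions.
my $total   = sum @ports;
my $highest = max @ports;

print "sum: $total, highest: $highest\n";
```

<p>Naming the functions in the "use" line keeps the rest of the module's
exports out of your namespace.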
<br>&nbsp;
<p><b>Wrapping It Up</b>
<p>Well, that was a quick tour through a few interesting parts of Perl.
Hopefully, this has whetted a few folks' tastebuds for more, and has shown
some of its capabilities. If you're interested in extending your Perl knowledge,
here are some recommendations for reading material:
<p><b><font size=-1>Learning Perl, 3rd Edition (coming out in July)</font></b>
<br><font size=-1>Randal Schwartz and Tom Phoenix</font>
<p><b><font size=-1>Programming Perl, 3rd Edition</font></b>
<br><font size=-1>Larry Wall, Tom Christiansen &amp; Jon Orwant</font>
<p><b><font size=-1>Perl Cookbook</font></b>
<br><font size=-1>By Tom Christiansen &amp; Nathan Torkington</font>
<p><b><font size=-1>Data Munging with Perl</font></b>
<br><font size=-1>By David Cross</font>
<p><b><font size=-1>Mastering Algorithms with Perl</font></b>
<br><font size=-1>By Jon Orwant, Jarkko Hietaniemi &amp; John Macdonald</font>
<p><b><font size=-1>Mastering Regular Expressions</font></b>
<br><font size=-1>By Jeffrey E. F. Friedl</font>
<p><b><font size=-1>Elements of Programming with Perl</font></b>
<br><font size=-1>By Andrew Johnson</font>
<br>&nbsp;
<p>Good luck with your Perl programming - and happy Linuxing!
<br>&nbsp;
<p><b>Ben Okopnik</b>
<br><tt>perl -we'print reverse split//,"rekcah lreP rehtona tsuJ"'</tt>
<br>
<hr WIDTH="100%">
<br><a NAME="1"></a>1. "There's More Than One Way To Do It" - the motto
of Perl. I find it applicable to all of Unix, as well.
<p><tt>References:</tt>
<p><tt>Relevant Perl man pages (available on any pro-Perl-y configured</tt>
<br><tt>system):</tt>
<p><tt>perl&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; - overview&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
perlfaq&nbsp;&nbsp; - Perl FAQ</tt>
<br><tt>perltoc&nbsp;&nbsp; - doc TOC&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
perldata&nbsp; - data structures</tt>
<br><tt>perlsyn&nbsp;&nbsp; - syntax&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
perlop&nbsp;&nbsp;&nbsp; - operators/precedence</tt>
<br><tt>perlrun&nbsp;&nbsp; - execution&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
perlfunc&nbsp; - builtin functions</tt>
<br><tt>perltrap&nbsp; - traps for the unwary&nbsp; perlstyle - style guide</tt>
<p><tt>"perldoc", "perldoc -q" and "perldoc -f"</tt>




<!-- *** BEGIN bio *** -->
<SPACER TYPE="vertical" SIZE="30">
<P> 
<H4><IMG ALIGN=BOTTOM ALT="" SRC="../gx/note.gif">Ben Okopnik</H4>
<EM>A cyberjack-of-all-trades, Ben wanders the world in his 38' sailboat, building
networks and hacking on hardware and software whenever he runs out of cruising
money. He's been playing and working with computers since the Elder Days
(anybody remember the Elf II?), and isn't about to stop any time soon.</EM>


<!-- *** END bio *** -->

<!-- *** BEGIN copyright *** -->
<P> <hr> <!-- P --> 
<H5 ALIGN=center>

Copyright &copy; 2001, Ben Okopnik.<BR>
Copying license <A HREF="../copying.html">http://www.linuxgazette.com/copying.html</A><BR> 
Published in Issue 69 of <i>Linux Gazette</i>, August 2001</H5>
<!-- *** END copyright *** -->

<!--startcut ==========================================================-->
</BODY></HTML>
<!--endcut ============================================================-->