File: defurne1.html

package info (click to toggle)
lg-issue36 2-4
  • links: PTS
  • area: main
  • in suites: woody
  • size: 2,920 kB
  • ctags: 242
  • sloc: makefile: 36; sh: 4
file content (526 lines) | stat: -rw-r--r-- 22,670 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
<!--startcut ==========================================================-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<title>Defining a Linux-based Production System LG #36</title>
   <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
   <META NAME="GENERATOR" CONTENT="Mozilla/4.07 [en] (X11; I; Linux 2.0.35 i486) [Netscape]">
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#A000A0"
ALINK="#FF0000">
<!--endcut ============================================================-->

<H4>
"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>

<P> <HR> <P> 
<!--===================================================================-->

<center>
<H1><font color="maroon">Defining a Linux-based Production System</font></H1>
<H4>By <a href="mailto:jurgen.defurne@scc.be">Jurgen Defurne</a></H4>
</center>
<P> <HR> <P>  

<H2>
<U>Introduction</U></H2>
In a previous article ("Thoughts About Linux", <I>LG</I>, October) I browsed upon
several topics to use Linux in business, not only for networking and communication,
but also for real production systems. There are several conditions which
need to be fulfilled, but it should be possible to define a basic database
system, which is rapidly deployed and has everything that is needed in
such a system.
<BR>The past two months I have been increasing my knowledge about Linux
and available tools on Linux. There are several points which need further
elaboration, but I have a fairly good idea of what is needed.
<H2>
<U>Goal</U></H2>
The goal is to have a look at the parts which are needed to implement a
reliable production database system, together with the tools needed to
provide for (rapid) (inhouse) development, but for a fairly lower cost
than is needed with traditional platforms. It must be reliable, in the
sense that all necessary procedures for a minimum downtime are available,
with the emphasis that a little downtime can be tolerated.
<BR>I do need to place a remark here. I worked on several projects, where
people tried to save time by asking for rapid development, or trying to
save money by reusing parts which lay around, or by converting systems.
What happened in all cases was that both money and time were lost, the
main reason being not understanding all aspects of the problems.
<BR>This is a mistake I don't want to make. I think that I now have enough
experience to show a way to achieve the above defined goal, describing
a Linux-based production platform which has lower deployment and exploitation
costs.
<H2>
<U>Basic recommendations</U></H2>
These are general guidelines. The first part in creating and exploiting
a successful production system is in constraining the amount of tools that
are needed on the platform. This leads to the second part of success, understanding
and knowing your tools. Experience is still the most valuable tool, but
depending on the amount and complexity of tools, much time can be wasted
trying to get to know the tools that are at hand. Good handbooks with clear
examples and thorough cross-references are a great help, as are courses
on the subjects that matter.
<H2>
<U>Hardware</U></H2>
At the moment I won't go very deep into hardware. The base platform should
be a standard PC with an IDE harddrive on a PCI controller which is fast
back-to-back capable. I tested the basic data rate of a Compaq Despro system
(166 MHz, Triton II chipset) and got a raw data-speed (unbuffered, using
write()) of 2.5 MB (megabytes)/s. I suppose that for a small entry platform
this is fairly reasonable. Further tests should be developed to test the
loading of the machine under production circumstances.
<BR>The most important part , however, is that all machines running Linux
with production data, should be equipped with a UPS. This is because the
e2fs file system (as most Un*x filesystems) is very vulnerable in the case
of an unexpected system shutdown. For this reason, a tape drive is also
indispensable, with good backup and restore procedures which must be run
from day 0.
<H2>
<U>Production tools</U></H2>
Our main engine is the database management system. For production purposes,
the following features must be available :
<UL>
<LI>
Fast query capability</LI>

<LI>
Batch job entry</LI>

<LI>
Printing</LI>

<LI>
Telecommunication</LI>

<LI>
Transaction monitoring</LI>

<LI>
Journaling</LI>

<LI>
User interfacing</LI>
</UL>

<H3>
Fast query capability</H3>
This feature is especially necessary for interactive applications. Your
clients shouldn't wait half a minute before the requested operation is
fulfilled. This capability can be enhanced by buffering, faster CPU's,
faster disk-drives, faster bus-systems and RAID.
<H3>
Batch job entry</H3>
This is a very valuable production tool. There are much jobs which depend
on the daily production changes, but which need much processing time afterwards.
These are mostly run at night, always on the same points of time, daily,
weekly, monthly or yearly.
<H3>
Printing</H3>
Printing is a very important action in every production system and should
be looked after from the design phase. This is because many companies have
several documents that are official. Not only printing on laser- or inkjet
printers should be supported, but also printing with heavy duty printing
equipment for invoices, multi-sheet paper, etc.
<H3>
Telecommunication</H3>
Telecommunication is not only about the Internet. There are still many
systems out there that work with direct lines between them. The main reason
is that this gives the people who are responsible for the services, a much
greater degree of control over access and implementation. In addition to
TCP/IP; e-mail and fax, support for X.25 should also be an option in this
area.
<BR>People should also have control over the messages and/or faxes they
send. A queue of messages should be available, where everybody can see
all messages globally (dest, time, etc) and where they have access to their
own messages.
<H3>
Transaction monitoring</H3>
With transaction monitoring, I mean the ability to rollback pending updates
on database tables. This feature is especially needed when one modification
concerns several tables. These modifications must all be committed at the
same time, or be rolled back into the previous state.
<H3>
Journaling</H3>
This capability is needed to repair files and filesystems which got corrupted
due to a system failure. After restarting the system, a special program
is used to undo all changes which couldn't be committed. In this sense,
journaling stands very close to transaction monitoring.
<H3>
User interfacing</H3>
This is a tricky part, because it is part development and part production.
On the production side, the interface system should give users rapid access
to their applications and also partition all applications between departments.
Most production systems I have seen do this with menu's. There are several
reasons. The main reason is that most production systems still work with
character-based applications. There are many GUI's out there, but production
systems will still be solely character based (except for graphics and printing,
but I consider these niche markets), even on a GUI. The second reson is
that a production system usually has lots and lots of large and little
program's. You just can't iconify them all and put them in maps. Then you
would only have a graphical menu, with all icons adding more to confusion
than clarity.
<H3>
What's available ?</H3>
Note : When I name or specify products, I will only specify those with
which I am already familiar. I presume that any one of you will have his/her
own choices. They serve as basic examples, and do not imply any preferences
on my side.
<P>The only database system on Linux I personally know for the moment,
is PostgreSQL. It supports SQL and transaction monitoring. Is it fast ?
I don't know. One should have a backup of a complete production database,
which can then be used to test against the real production machine, with
interactive, background and batch jobs running like they do in the real
world.
<P>For batch jobs, crond should always be started. In addition to this,
the <I>at</I> and <I>batch</I> commands can be used to have a more structured
implementation of business processes.
<P>For printing, I know (and use) the standard Linux tools lpd, Ghostscript
and TeTeX. There might be a catch however. In some places you need to merge
documents with data. The main reason for this is that a wordprocessing
package offers more control over the format and contents of the document,
instead of printing the document with a simple reporting application. On
my current workplace, a migration to HP is busy. The solution there is
WordPerfect. In the past, I have used this solution under DOS, to automatically
start WP and produce a merged document. Is this possible too with StarOffice
?
<BR>Are there other print solutions which offer more interactive control
over the printing process than lpd ? Users should have more easy access
to their printjobs and the printing process.
<P>Telecommunication is a real strong point of Linux. I won't enumerate
them all. Even if it doesn't support X.25, it is still possible to use
direct dial-up lines using SLIP or PPP.
<P>Journaling is the weakest point of Linux. I have worked with the following
filesystems : FAT, HPFS, NTFS, e2fs, the Novell filesystem and the filesystem
of the WANG VS minicomputer system. With all these systems, I have had
power-failures or crashes, but the only file-system that gives trouble
after this is e2fs. In all other cases, a filesystem check repairs all
damage and the computer continues. On WANG VS, index-sequential files are
available. When a crash occurs, the physical integrity of an indexed file
can be compromised. To defend against this, there are two solutions. The
first is reorganizing the file. This is copying the file on a record-by-record
basis. This rebuilds the complete file and its indices, and inserts or
deletes which were not committed are rejected. The second option is using
the built in transaction system. A file can flagged as belonging to a database.
Every modification to these files is logged until the transaction is completely
committed. After a failure has occurred, the files can be restored in their
original states using the existing journals. This is a matter of minutes.
<BR>I think that the only filesystem on PC which offers comparable functionality
is that of Novell.
<BR>The e2fs file system check works, but it does offer not enough explanation.
When there is a really bad crash, the filesystem is just a mess.
<H2>
<U>Development tools</U></H2>
I will describe here the kind of tools that I needed when I was maintaining
a production database in a previous job. The main theme here is that programmers
in a production environment should be productive. This means that they
should be offered a complete package, with all tools and documentation
necessary to start immediately (or in a relatively short time). This means
that for every package there should be a short, practical tutorial available.
<BR>I will divide this section into two parts, the first being necessary
tools, the second being optional tools. Also necessary for development
is a methodology. This methodology should be equal through all delivered
tools. The easiest way to do this is through an integrated development
environment.
<H2>
<U>Compulsory development tools</U></H2>
Which tools are the minimum needed to start coding without much hassle
? I found these tools to be invaluable on several platforms :
<UL>
<LI>
An integrated development environment</LI>

<LI>
A powerful editor</LI>

<LI>
An interactive screen development package</LI>

<LI>
A data dictionary</LI>

<LI>
A high-level language with DBMS preprocessor support</LI>

<LI>
A scripting language</LI>
</UL>

<H3>
Integrated development environment</H3>
Your IDE should give access to all your tools in an easy and consistent
manner. It should be highly customisable, but be delivered with in a configuration
which gives direct access to all installed tools.
<H3>
Editor</H3>
If you have a real good editor, it can act as an integrated development
environment. Features which enhance productivity are powerful search-and-replace
capabilities and macro features (even a simple record-only macro feature
is better than no macro features). Syntax colouring is nice, but one can
live easy without it. Syntax completion can be nice, but you have to learn
to live with it. Besides, the editor cannot know which parts of statement
you don't need, so ultimately you will have more clutter in your source,
or you waste your time erasing unnecessary source code.
<H3>
Screen development</H3>
This is an area where big savings can be done. For powerful screen development
you need the following parts in the development package :
<OL>
<LI>
Standard screens which are drawn upon information in the data dictionary</LI>

<LI>
Easy passing of variables between screens and applications</LI>

<LI>
A standard way of activating a screen in an application</LI>
</OL>
The savings are on several places. If you create a new screen, then you
should immediately get a default screen with all fields from the requested
table. After this, only some repositioning and formatting to local business
standards needs to be done. I worked with two such systems, FoxPro and
WANG PACE, and the savings are tremendous in all parts of the software
cycle (prototyping, implementation and maintenance).
<H3>
Data dictionary</H3>
A data dictionary is a powerful tool, from which much information can be
extracted for all parts of the development process. The screen builder
and the HL preprocessor should be able to extract their information from
the data dictionary. The ability to define field- and record-checking functions
in the data dictionary instead of the application program, eliminates the
need to propagate changes in this code through all applications. With the
proper reports, one should also be able to look at different angles into
the structure of the database.
<H3>
High level language with DBMS preprocessor support</H3>
You can't do complete development without a high-level language. There
are always functions needed which can't be implemented through the database
system. To make development easier, it should be possible to embed database
control statements in the source program. The compiler should be invoked
automatically.
<H3>
Scripting language</H3>
A scripting language is very useful in several aspects. Preparing batch
jobs is part of it. I also found out that a business system consists of
several reusable pieces, which can be easily strung together using a scripting
language. Also, the overall steering and maintenance of the system can
be greatly simplified.
<H2>
<U>Optional development tools</U></H2>
These are tools that were avalailable on several platforms, which can come
in handy, but aren't necessarily usable to deliver production environment
applications. I found out that these are little used.
<UL>
<LI>
Interactive query system</LI>

<LI>
Report editor</LI>
</UL>

<H3>
Interactive query system</H3>
This is often designed to be used by people which are not programmers.
Experience has thaught me however that people who are not programmers in
a business, don't have the time to learn these tools. It is a useful tool
for a programmer to test queries and views, but it isn't really useful
as a production tool. Only in some cases, for real quick and dirty work,
is it worth using.
<H3>
Report editor</H3>
This is even a more overestimated tool. I shared thoughts about this with
other programmers, and our conclusion was : bosses always ask reports which
are much more complicated than a simple report editor can handle. It would
be far better to use a programming language specifically designed for reporting
(any one know of such a thing ? Any experiences with Perl for extraction
and reporting ?).
<H3>
What's available ?</H3>
Note : I will direct my attention only at the compulsory development tools.
The rest of the environment will be centered around the features of PostgreSQL.
<P>As an integrated development environment, EMACS is probably the first
which comes to mind. It integrates even with my second subject, a powerful
editor. Is it even at all possible to draw a line between the two ? Is
EMACS a powerful editor which serves as a development environment, or is
it a development environment which is tightly integrated with its editor?
<P>The data dictionary, the screen development package and the DBMS preprocessor
are more thightly bound than other parts of the package. The screen editor
and the DBMS preprocessor should get their information from the data dictionary,
and the DBMS HL statements should also provide for interaction with screens.
It should be both possible to develop screens for X-windows, as well for
character-based terminals.
<BR>In the field of high level languages, there are several options, but
a business oriented language is still missing. Yes, I am talking COBOL
here, although an xBase dialect is also great for such applications. I
have programmed for eight years in several languages, only the last two
year in COBOL, and it IS better for expressing business programs than C/C++.
If anyone would ask me now to write business programs in C/C++, I think
the first thing I would do was write a preprocessor so that I could express
my programs with COBOL-like syntax.
<BR>I don't know how ADA goes for business programs, but a combination
of GNAT, with a provision to copy embedded SQL statements to the translated
C-source and then through the SQL preprocessor would maybe work.
<BR>I only had a small look at Perl, and from Tcl and Python I know absolutely
nothing, but while interactive languages are fine for interactive programs,
you should also bear in mind that some programs must process much data,
and that therefore access to a native code compiler is essential.
<BR>There is another point in which only COBOL is good. This is in financial
mathematics. This is due to the use of packed decimal numbers up to 18
digits long where the decimal point can be in any place. You should have
compiler support for that too. On the x86 platform this capability exists
in the numerical processor, which is capable of loading and storing 18
digit packed decimal numbers. Computations are carried out in the internal
80-bit floating point format of the co-processor.
<P>When you have a Linux system, the first scripting language you run into
is probably that of the bash shell. This should be sufficient for most
purposes, although my experiences with scripting languages is that they
benefit greatly from statements for simple interaction (prompting and simple
field entry).
<H2>
<U>What should be delivered ?</U></H2>
As I said before, this list doesn't present any endorsement from me towards
any of these products or programs. This list should be expanded with all
products which fit in either one of these categories, so all hints all
wellcome.
<BR>Another weak point in some areas of Linux is documentation. For a production
environment, the Linux documentation project is probably a must, preprinted
from the Postscript sources. For the commercial products, good documentation
is also not a problem. For other parts of Linux tools, the books from O'Reilly
&amp; Associates are very valuable. HOWTO's are NOT suited for a production
environment, but since they are more about implementation, they are suitable
for the people who put the system together. The catch is this : when a
system is delivered, all necessary documentation should be prepared and
delivered too. I worked with several on-line documentation systems, but
when in a production environment, nothing beats printed documentation.
<BR>&nbsp;
<CENTER><TABLE BORDER COLS=2 WIDTH="90%" NOSAVE >
<TR NOSAVE>
<TH COLSPAN="2" NOSAVE>Production system</TH>
</TR>

<TR NOSAVE>
<TD>DBMS
<BR>- Fast query/update
<BR>- Transaction processing
<BR>- Journaling</TD>

<TD VALIGN=TOP NOSAVE>postgreSQL
<BR>mySQL
<BR>mSQL
<BR>Adabas
<BR>c-tree Plus/Faircom Server
<BR>...</TD>
</TR>

<TR NOSAVE>
<TD VALIGN=TOP NOSAVE>Communication</TD>

<TD>ppp
<BR>slip
<BR>efax
<BR>...</TD>
</TR>

<TR NOSAVE>
<TD VALIGN=TOP NOSAVE>Batch job entry</TD>

<TD>crond
<BR>at
<BR>batch</TD>
</TR>

<TR>
<TD>Printing</TD>

<TD>lpd</TD>
</TR>

<TR>
<TD>User interfacing</TD>

<TD>?</TD>
</TR>

<TR NOSAVE>
<TH COLSPAN="2" NOSAVE>Development system</TH>
</TR>

<TR>
<TD>IDE</TD>

<TD>EMACS</TD>
</TR>

<TR NOSAVE>
<TD VALIGN=TOP NOSAVE>Editor</TD>

<TD>EMACS
<BR>vi</TD>
</TR>

<TR NOSAVE>
<TD VALIGN=TOP NOSAVE>Screen development</TD>

<TD>Depends on DBMS</TD>
</TR>

<TR>
<TD>Data dictionary</TD>

<TD>Depends on DBMS</TD>
</TR>

<TR NOSAVE>
<TD VALIGN=TOP NOSAVE>Application language</TD>

<TD>C
<BR>C++
<BR>Cobol ?
<BR>Perl
<BR>Tcl(/Tk)
<BR>Python
<BR>Java</TD>
</TR>

<TR>
<TD>Scripting language</TD>

<TD>bash</TD>
</TR>
</TABLE></CENTER>

<H2>
<U>Summary</U></H2>
I am still trying to drag Linux into business. If you want to do business
using Linux, you should be able to deliver a complete system to the customer.
In this article I outlined the components of such a system and some weaknesses
which should be overcome. As a result, I created a table enumerating the
needed components for such a system.
<BR>This table is absolutely <I><U>not</U></I> finished. I welcome all
references to programs and products to update this table. It should be
possible to publish an update once a month. What I also should do, is extend
the table with references to available documentation.
<BR>Another part which needs more attention is developing tests to assess
the power of the database system, ie. what can be expected in terms of
throughput and response under several load scenarios.
<BR>&nbsp;

<!--===================================================================-->
<P> <hr> <P> 
<center><H5>Copyright &copy; 1999, Jurgen Defurne <BR> 
Published in Issue 36 of <i>Linux Gazette</i>, January 1999</H5></center>

<!--===================================================================-->
<P> <hr> <P> 
<A HREF="./lg_toc36.html"><IMG ALIGN=BOTTOM SRC="../gx/indexnew.gif" 
ALT="[ TABLE OF CONTENTS ]"></A>
<A HREF="../lg_frontpage.html"><IMG ALIGN=BOTTOM SRC="../gx/homenew.gif"
ALT="[ FRONT PAGE ]"></A>
<A HREF="./larriera.html"><IMG SRC="../gx/back2.gif"
ALT=" Back "></A>
<A HREF="./marsden.html"><IMG SRC="../gx/fwd.gif" ALT=" Next "></A>
<P> <hr> <P> 
<!--startcut ==========================================================-->
</BODY>
</HTML>
<!--endcut ============================================================-->