File: http-analyze.1

http-analyze 1.9e-2



http-analyze(8L)				 http-analyze(8L)


NAME
       http-analyze - a real fast log analyzer for web servers

SYNOPSIS
       http-analyze [-{d|m|h}] [-nrstuvxz] [-c cfgfile] [-i logfile]
           [-o outdir] [-p privdir] [-N #s|u] [-H homepage]
           [-S srvname] [-T title] [file]

DESCRIPTION
       http-analyze analyzes logfiles of web servers and creates
       detailed statistics of the server's access load in graphical
       and tabular form.  http-analyze expects logfile entries in
       common logfile format, which is used by web servers such as
       Netscape's, NCSA's, and CERN's httpd.  If your server uses
       another format, http-analyze can't read the logfile.
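
       For reference, an entry in common logfile format consists of
       the fields host, ident, authuser, [date], "request", status,
       and bytes.  The following Python sketch shows one way such an
       entry could be split into fields.  It is an illustration only
       and not the parser used by http-analyze; the names CLF_PATTERN
       and parse_clf_line and the sample entry are hypothetical.

```python
import re

# Common logfile format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf_line(line):
    """Return a dict of fields, or None if the line does not match the format."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" in the size field means that no body was sent (common for 304).
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Hypothetical sample entry, made up for this illustration.
sample = '192.0.2.1 - - [10/Jan/1996:13:55:36 +0100] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_clf_line(sample)
print(entry["host"], entry["status"], entry["size"])
```

       A line for which parse_clf_line returns None is the kind of
       entry a log analyzer restricted to this format cannot use.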

       http-analyze has been highly optimized to process large
       logfiles at the maximum possible speed.  This is achieved by
       using a history mechanism to skip logfile entries which have
       been processed already in a previous run of the program, and
       by using two modes of operation (named after their maximum
       useful update interval) with a different detail level in the
       analysis of the logfile entries:

       daily mode (option -d):
              http-analyze generates a short summary showing the hits
              per day only.  By using a history to skip entries
              processed already and by avoiding detailed analysis of
              each log entry, http-analyze requires only a fraction
              of the time needed for a full report.

       monthly mode (option -m):
              In this mode, a full report with much more detail is
              generated.  The history is used to produce a summary
              for the last 12 months.

       If your logfiles are rather large, you can use an update
       interval in the range of one to 24 hours to generate short
       statistics more frequently and an update interval from one to
       30 days to generate a full report.  Since http-analyze
       maintains a history of the results from previous runs, you may
       rotate the logfile on a daily basis when generating short
       (daily) reports.  However, to generate a full (monthly) report
       you have to feed all logfiles of the appropriate summary
       period to http-analyze at once, because the program needs to
       do further analysis on all logfile entries.  After generating
       a detailed report for a month, you can save the corresponding
       logfile(s) on tape and remove them from your system.





			  Local Commands			1





   HTML OUTPUT FILES
       In daily mode, http-analyze writes the short summary into the
       output file stats.html and updates the daily values in the
       history file.  The short summary includes the following
       information by day (see the following section for an
       explanation of these numbers):
              - the total number of hits
              - the total number of 304's (Not Modified responses)
              - the total number of files transferred
              - the total number of unique sites
              - the amount of data sent by the server

       In monthly mode, http-analyze updates the short summary in
       stats.html and the monthly values in the history file.
       Additionally, it creates the following files:

       statsMMYY.html
              contains the detailed summary for the period determined
              by analyzing the logfile.  MM and YY are replaced by
              the month and the year respectively.

       filesMMYY.html
              lists the URLs of all documents sent by your server.
              This file is created by default, but you can suppress
              its creation with an option if you want to exclude the
              list from the statistics.

       sitesMMYY.html
              lists the hostnames of all sites accessing your server
              if the server could successfully resolve the IP
              address.  Again, this file is created by default unless
              you explicitly suppress its creation.

       statsYYYY.html or index.html
              contains a summary of the last 12 months.  Which name
              is chosen depends on the date of the last logfile entry
              processed: if the last entry indicates that
              http-analyze is analyzing the current month's log, the
              name index.html is used for easy reference of the
              statistics pages.  In all other cases the name
              statsYYYY.html is used.  This naming convention allows
              you to create reports for previous summary periods
              (e.g. for last year) without affecting the results for
              the current period.

       gr-icon.gif
              a small icon for your link to the statistics page
              (59x41 pixels).

       All files are created in the current directory unless you
       explicitly specify an output directory for the HTML files.
       Furthermore, the files containing the detailed lists of sites
       and URLs may be created in a private directory to protect them
       by authorization.

       The full summary (statsMMYY.html) contains the following
       information:
              - the total number of hits/304's/files/KB for this
                month
              - the amount of data requested/transferred/saved by
                cache
              - the total number of unique URLs/sites for this month
              - the numbers of response codes other than 200 (OK) or
                304 (NoMod)
              - the maximum/average hits per day/hour
              - the total number of hits/files/304's/sites/KB by day
              - the top 5 seconds, 5 minutes, and 24 hours of the
                summary period
              - the top 10 sites accessing your server most often
              - the top 30 most commonly accessed URLs
              - the 10 least frequently accessed URLs
              - the hits/304's/KB sent by country

       The following section describes the meaning of all entries in
       the summary report which are not self-explanatory:

       Hits      (color key: green) The total number of hits
                 processed by the server, including requests which
                 generated an invalid response.

       Files     (color key: blue) The total number of files sent by
                 the server (OK responses).  Here "file" means any
                 kind of file, thus including not only documents, but
                 also images, CGI scripts, audio and video clips,
                 etc.

       304's     (color key: yellow) A code 304 (Not Modified)
                 response is sent by the server if a document hasn't
                 been updated since the last time it was requested.
                 This field therefore contains the total number of
                 requests which didn't cause the transmission of a
                 file because of various caching mechanisms used by
                 proxies and browsers.

       Other responses
                 The total number of all answers from the server
                 which are not OK (200) or Not Modified (304)
                 responses.  The full summary includes a list of all
                 those other responses.

       Unique URLs
                 This field contains the total number of unique URLs
                 (not counting erroneous requests).

       Unique sites
                 (color key: red) In the Totals section, this is the
                 total number of unique sites per month, while in the
                 Hits by day section it reflects the number of unique
                 sites per day.  Therefore, the sum of all sites
                 shown in the "Hits by day" section is not equal to
                 the total number of unique sites.

       KBytes requested
                 The amount of data requested by the users of your
                 server.  http-analyze computes this number by adding
                 the values of the next two fields (see below).

       KBytes transferred
                 (color key: orange) The amount of data sent as
                 reported by the server.

       KBytes saved by cache
                 The amount of data saved by various caching
                 mechanisms.  Its value is computed by multiplying
                 the number of Not Modified responses per page by the
                 size of the document (if known).  Note: Because
                 http-analyze can determine the size of a page only
                 if the page has been requested successfully at least
                 once in the same summary period, the values for "KB
                 saved by cache" and "KB requested" are just
                 approximations of the real values.
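
       The relation between the three KBytes fields can be sketched
       in Python as follows.  This is a minimal illustration of the
       arithmetic described above, not the program's own code.  The
       function name cache_savings is hypothetical, sizes are kept in
       bytes for simplicity, and the sketch assumes that the size of
       a page is taken from the most recent successful (200) response
       seen for it.

```python
def cache_savings(entries):
    """Compute transferred, saved and requested bytes.

    entries is an iterable of (url, status, size_in_bytes) tuples.
    """
    known_size = {}    # url -> size seen in a successful (200) response
    not_modified = {}  # url -> number of 304 (Not Modified) responses
    transferred = 0
    for url, status, size in entries:
        transferred += size
        if status == 200:
            known_size[url] = size
        elif status == 304:
            not_modified[url] = not_modified.get(url, 0) + 1
    # Pages never requested successfully have no known size and count as 0,
    # which is why the result is only an approximation.
    saved = sum(count * known_size.get(url, 0)
                for url, count in not_modified.items())
    return {"transferred": transferred,
            "saved": saved,
            "requested": transferred + saved}


log = [
    ("/a.html", 200, 1000),
    ("/a.html", 304, 0),
    ("/a.html", 304, 0),
    ("/b.gif", 304, 0),   # size unknown: never sent in this period
]
print(cache_savings(log))
```

       In this made-up log, the two Not Modified responses for
       /a.html account for 2000 bytes saved, while the one for /b.gif
       contributes nothing because its size is unknown.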

OPTIONS
       -h     print a short help list explaining the usage of the
              options.

       -d     (daily mode) generate short statistics for the current
              month only.  If a history file exists, the values for
              previous days are read from this file and the
              corresponding logfile entries are skipped.  If the
              history file does not exist, the whole logfile will be
              processed and a history will be created.  (This option
              is set by default.)

       -m     (monthly mode) generate full statistics for a whole
              month.  Although the values from the history file are
              usually used to create a summary for the last 12
              months, the actual logfile entries always take
              precedence over any records in the history file.  This
              means that you should rotate your logfile at least on a
              monthly basis.  The option -m includes -d.

       -n     (no update) don't create or update the history file.
              Useful if you want to generate statistics for previous
              summary periods (before the last month) without
              overwriting the current state of the history.

       -r     don't create a list of all URLs for hidden items (if
              any) in the full statistics.

       -s     (no sitelist) don't create a list of all sites in the
              full statistics.

       -t     (no TOP lists) don't create the top
              seconds/minutes/hours lists.  Also suppresses the "Hits
              by hours" bar chart.

       -u     (no URL list) don't create a list of all requested URLs
              in the full statistics.

       -v     (verbose) comment ongoing processing.

       -x     Don't group images by default.  Normally, http-analyze
              sums up the values of all images (*.gif, *.jpg, *.ief,
              *.pcd, *.rgb, *.xbm, *.xpm, *.xwd, *.tif) and hides
              them under the item "All images" to keep the top lists
              from filling up with image URLs.  If -x is given,
              images are accounted for as single items.

       -z     don't create graphical representations of the results.

       -c cfgfile
              Use cfgfile as the configuration file.  By using a
              config file, http-analyze allows you to define some
              options and to tailor the basic HTML page layout
              somewhat.  See "CONFIGURATION FILE" below for a
              description of the config file format.

       -i logfile
              Use logfile as the server's logfile.  If `-' is given,
              stdin is processed.  See also the HTTPLogFile entry in
              the config file.

       -o outdir
              This is the name of the directory where the HTML output
              files should be created.  If no directory is given, the
              files are created in the current directory.  See also
              the HTMLDir entry in the config file.

       -p privdir
              Use this directory for the list of all URLs/sites
              (filesMMYY.html and sitesMMYY.html).  This is useful if
              you want to grant public access to your web server's
              statistics while restricting access to the detailed
              lists to staff only by using server authentication.
              See also the PrivateDir entry in the config file.

       -H homepage
              Use homepage as an alternate name for homepages.  If
              your index files are named index.html, there is no need
              to define this option.  However, if your server looks
              for more than one filename (e.g. index.html,
              Welcome.html, and home.html), you must define the
              latter two explicitly.  http-analyze truncates the URLs
              containing a homepage name so that they merge with `/'
              or their "base URL", respectively.  (For example, the
              "base URL" for /dir/index.html is /dir/.)  You can
              define up to three alternate names in addition to
              index.html.  See also the Homepage entry in the config
              file.

       -N #{sul}
              This option defines the number of entries in the top
              site (s or S), top URL (u or U), or last URL (l or L)
              list.  # is either a positive number or the value 0 to
              suppress the corresponding list.  Note that the list of
              least frequently accessed URLs is generated only if the
              number of all unique URLs is greater than the sum of
              the entries in the top and last URL lists.  See also
              the entries TopSites, TopURLs, and LastURLs in the
              config file.

       -S srvname
              Use srvname as the name of the server in the title of
              the HTML files.  If undefined, http-analyze tries to
              determine the server name itself.  Note: http-analyze
              uses either the uname(2) or the gethostname(2) function
              to determine the server name depending on what has been
              defined at compilation time.  On most System V
              implementations, uname returns the nodename (e.g.
              host), while gethostname often returns the fully
              qualified domain name (FQDN, e.g. host.my.domain).  See
              also the ServerName entry in the config file.

       -T title
              Use title as the document title and header for the HTML
              files.  http-analyze appends the server name and the
              current summary period to this string.  If left
              undefined, a default phrase is used.  See also the
              DocTitle entry in the config file.
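
       The truncation of homepage names described for the option -H
       can be sketched in Python as follows.  This is only an
       illustration of the idea, not the program's implementation.
       The function name base_url is hypothetical, and the sketch
       assumes that a homepage name is recognized only as the last
       path component of a URL.

```python
DEFAULT_HOMEPAGE = "index.html"


def base_url(url, alternates=()):
    """Strip a trailing homepage name so the URL merges with its base URL."""
    # index.html plus at most three alternate names, as stated for -H.
    names = (DEFAULT_HOMEPAGE,) + tuple(alternates)[:3]
    directory, _, last = url.rpartition("/")
    if last in names:
        return directory + "/"
    return url


print(base_url("/dir/index.html"))
print(base_url("/Welcome.html", alternates=("Welcome.html", "home.html")))
print(base_url("/dir/page.html"))
```

       With the alternates given, hits on /Welcome.html and on /
       would be counted under the same URL, while /dir/page.html is
       left unchanged.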

   CONFIGURATION FILE
       When specified with the option -c, http-analyze reads some
       defaults from the named configuration file.  Parameters
       defined with options always take precedence over the
       definitions in this configuration file.  The configuration
       file contains one entry per line.  Each entry has a name field
       and one or two value fields, which must be separated by one or
       more tabulator characters (not blanks!).  All names are
       case-insensitive.
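
       The entry format can be illustrated with a short Python
       sketch.  It is an assumption-laden illustration and not the
       reader used by http-analyze: the function name parse_config is
       hypothetical, names are folded to lower case to model
       case-insensitivity, and nothing is assumed about comment lines
       because they are not described here.  The HideURL value in the
       sample is made up.

```python
import re


def parse_config(text):
    """Parse tab-separated entries into a list of (name, values) pairs."""
    entries = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        # Fields are separated by one or more tabs; blanks do not separate.
        name, *values = re.split(r"\t+", line.strip())
        if not 1 <= len(values) <= 2:
            raise ValueError(f"line {number}: expected one or two value fields")
        entries.append((name.lower(), values))
    return entries


sample = (
    "ServerName\twww.myserver.com\n"
    "DefaultMode\tmonthly\n"
    "HideURL\t/private/*\t\tPrivate pages\n"
)
for name, values in parse_config(sample):
    print(name, values)
```

       An entry written with blanks instead of tabs, such as
       "TopSites 10", ends up as a single field in this sketch and is
       rejected, which mirrors the requirement stated above.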

       ServerName    The name of your server (same as option -S).

       HTTPLogFile   The name of the server's logfile.  Note that if
                     you define a default name for the logfile, this
                     file gets processed if no other file is
                     explicitly given at the invocation of
                     http-analyze.  Without this definition,
                     http-analyze processes stdin if no file is
                     given.  To process stdin even if a default name
                     has been defined, use `-' as the filename for
                     the logfile.

       DefaultMode   Defines the default operation mode of
                     http-analyze.  The value field contains either
                     the keyword daily or monthly.  If left
                     undefined, the default is the daily mode (-d).

       Homepage      Up to three alternate names for homepages in
                     addition to index.html (same as option -H).  All
                     URLs containing one of the homepage names will
                     get truncated so they merge with `/' or the base
                     URL respectively.

       HTMLDir       The name of the directory where the HTML output
                     files should be created (same as -o).  If left
                     undefined, files are created in the current
                     directory.

       PrivateDir    The name of a private directory where the
                     detailed site and URL lists should be created
                     (same as option -p).  Access to this private
                     directory may be granted to staff only by using
                     server authentication.  Pathnames not beginning
                     with a `/' are relative to HTMLDir.

       TopSites, TopURLs, and LastURLs
                     The number of entries in the top sites, top
                     URLs, and least frequently used URLs lists (same
                     as option -N).  If set to zero, the
                     corresponding list will be suppressed.

       DocTitle      The document title and header to use in the HTML
                     output files (same as option -T).  http-analyze
                     appends the server's name and the current
                     summary period to this string.

       HeadPrefix    The prefix string to output before the document
                     header (after the HTML <TITLE> tag).  If
                     HeadPrefix is defined, it must include the HTML
                     <BODY> tag.  If left undefined, HeadPrefix
                     defaults to:

                     HeadPrefix	<BODY BGCOLOR="#D6D6D6"><P><HR SIZE="8">

       HeadSuffix    The suffix string to output after the document
                     header (after DocTitle).  Useful if you define
                     left- or right-aligned images in HeadPrefix with
                     the headline floating around.

       DocTrailer    The trailer string to output at the end of the
                     page.  Useful to define a link back to your
                     homepage, as in

                     DocTrailer	<BR><FONT SIZE="-1"><A HREF="/">Back</A> to my homepage</FONT>


       HideSys and HideURL
                     These two entries let you define names of sites
                     or URLs which should be hidden under some
                     arbitrary text.  Hidden items are accounted for
                     separately, but in the summary they appear
                     grouped under the description defined here.
                     Both entries have two value fields: the first
                     field following the name defines a site or a URL
                     and the second field defines the text under
                     which this item is to be hidden.  The URL/site
                     may begin or end with a `*' as a wildcard.
                     However, inside strings, a `*' is taken
                     literally.  If the text an item is hidden under
                     begins with a `[' character, the item is not
                     shown in the top sites/URLs lists, but it will
                     always be shown in the detailed sites/URLs
                     lists.  Note that URLs are case-sensitive, while
                     sitenames are not.  Note also that images are
                     hidden automatically unless the option -x is
                     specified at invocation of http-analyze.  See
                     the sample.conf file for examples on how to use
                     HideSys and HideURL.
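
       The wildcard rule for HideSys and HideURL can be sketched in
       Python as follows.  This is an illustration of the rule as
       described above and not the matching code of http-analyze.
       The function name hide_match and the sample patterns are
       hypothetical, and the sketch assumes that a pattern may carry
       a wildcard at both ends at the same time.

```python
def hide_match(pattern, item, case_sensitive=True):
    """Match item against a pattern with an optional leading/trailing '*'."""
    if not case_sensitive:
        pattern, item = pattern.lower(), item.lower()
    if pattern == "*":
        return True
    leading = pattern.startswith("*")
    trailing = pattern.endswith("*")
    # Strip only the outermost wildcards; any '*' inside stays literal.
    core = pattern[1:] if leading else pattern
    core = core[:-1] if trailing else core
    if leading and trailing:
        return core in item
    if leading:
        return item.endswith(core)
    if trailing:
        return item.startswith(core)
    return item == core


# URLs are matched case-sensitively, sitenames are not.
print(hide_match("/cgi-bin/*", "/cgi-bin/search"))
print(hide_match("*.myserver.com", "WWW.MyServer.COM", case_sensitive=False))
print(hide_match("/Docs/*", "/docs/index.html"))
```

       The case_sensitive parameter models the difference between
       HideURL (case-sensitive) and HideSys (not case-sensitive).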

EXAMPLES
       First of all, you must know the name of your server's logfile.
       If, for example, the name is
       /usr/ns-home/httpd-80/logs/access, you can create full
       statistics for the current month with the following command:

	      http-analyze -vm -S www.myserver.com /usr/ns-home/httpd-80/logs/access

       This command will create a yearly summary in the file
       index.html (or statsYYYY.html for previous years) and a
       monthly summary in the file statsMMYY.html, where MM is
       replaced by the month and YY is replaced by the year.  If the
       period determined by analyzing the logfile is the current
       month, http-analyze also creates an up-to-date daily summary
       in the file stats.html.  All files are created in the current
       directory.

       Assuming that your old logfiles have been saved under the name
       logYYYY/access.MM in the server's log directory, use the
       commands

	      cd /usr/ns-home/httpd-80/logs
	      http-analyze -vmn -o /usr/htdocs/stats log1996/access.01

       to create full statistics for January '96 in the directory
       /usr/htdocs/stats preserving the current history (option -n).
       Note: Generating statistics for previous summary periods
       without the -n option will overwrite newer values in the
       history file.  To reconstruct the history, you would have to
       run http-analyze for each following month until the very last
       one (this situation may be avoided in a following version of
       the program).  Note also that immediately after generating the
       statistics for the last month you should run http-analyze -m
       on the current logfile to create an up-to-date index file
       (index.html).  Remember that this index file is created
       automatically only when creating a monthly summary for the
       current month.

       The following command creates statistics for a whole  year
       using  a customized configuration file and reading the log
       entries from a pipe:

	      gzcat log1996/access.0?.gz |
	      http-analyze -vm -c /usr/local/bin/sample.conf -


   REGULAR INVOCATION VIA CRON
       To have statistics generated on a regular basis, use the
       following scheme:

       1)     Optionally install a cron job which calls
              http-analyze -d frequently to create a daily summary.
              The execution interval may range from once per day up
              to twice per hour depending on the size of your logfile
              and the time needed to analyze it.  On my server, I run
              the daily statistics once per hour.

       2)     Install a cron job which calls http-analyze -m to
              create a monthly summary once per week or once per day
              (again depending on the size of your logfile).  Note
              that monthly summaries (statsMMYY.html) are created for
              the first time on the second day of a new month.  On my
              server, I create a monthly summary two times per day.

       3)     Create a script which rotates the server's logfile,
              restarts the http server, and creates the final summary
              for this period.  Have cron execute this script at
              00:00 on the first day of a new month.  See the script
              rotate-httpd for an example on how to do this for
              several virtual web servers running on the same
              machine.

       4)     Because of cron's scheduling overhead and delays in
              execution of the script which rotates the logfile,
              heavily used servers sometimes write a few entries for
              the new month into the old logfile.  http-analyze
              usually ignores this kind of "white noise" at the end
              of a month.  However, to get correct figures, in this
              last step you should run http-analyze -m on the logfile
              for the current month immediately after generating the
              statistics for the previous month.

       Note that the cron jobs must run with the uid of the owner of
       the directory where the HTML output files are going to be
       created, except for the rotate script, which usually must run
       with the uid of the server.  You should also take care to
       avoid running more than one of the cron jobs related to
       http-analyze at the same time.
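
       One possible way to keep such jobs from overlapping is an
       advisory lock file.  The Python sketch below is not part of
       http-analyze and is only an illustration for Unix systems; it
       relies on fcntl.flock, and the helper names try_lock and
       run_exclusive as well as the lock file path in the usage note
       are assumptions.

```python
import fcntl
import subprocess


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "a")
    try:
        # LOCK_NB makes the call fail at once instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def run_exclusive(lock_path, command):
    """Run command only if no other job holds the lock.

    Returns the exit status of the command, or None if it was skipped.
    """
    handle = try_lock(lock_path)
    if handle is None:
        return None
    try:
        return subprocess.call(command)
    finally:
        handle.close()  # closing the file releases the lock
```

       A cron job could then call, for example,
       run_exclusive("/var/tmp/http-analyze.lock",
       ["/usr/local/bin/http-analyze", "-d", "-c",
       "/usr/httpd/analyze.conf"]) and simply skip a run while
       another one is still in progress.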

       Here are some sample crontab(1) entries for the scheme
       described above:

	      # Generate a full report twice per day at 01:17 and 13:17
	      17  1,13 * * *  /usr/local/bin/http-analyze -m -c /usr/httpd/analyze.conf
	      # Generate a short summary each hour except at 01:17 or 13:17
	      17  2-12 * * *  /usr/local/bin/http-analyze -d -c /usr/httpd/analyze.conf
	      17 14-23 * * *  /usr/local/bin/http-analyze -d -c /usr/httpd/analyze.conf
	      # Rotate the HTTPD logfiles at the first day, 00:00 of a new month
	      0 0 1 * *	/usr/local/bin/rotate-httpd


COPYRIGHT
       Copyright 1996 by Stefan Stapelberg, RENT-A-GURU

       Permission to use, copy, modify, and distribute this software
       and its documentation for any purpose and without fee is
       hereby granted, provided that the above copyright notice
       appear in all copies and in all HTML output files, that both
       that copyright notice and this permission notice appear in the
       supporting documentation, and that the hypertext link to the
       homepage of http-analyze which the program produces is left
       intact.  This software is provided "as is" without express or
       implied warranty.

       Credit for http-analyze must be given to RENT-A-GURU in all
       derived works.  This does not affect your ownership of the
       derived work itself, and the intent is to assure proper credit
       for RENT-A-GURU, not to interfere with your use of this
       software.  If you have questions, ask.

       You may use this software at no cost on any installation, even
       at commercial sites.  However, IT IS STRICTLY FORBIDDEN to
       sell or lease this software in whole or in part or to include
       it in whole or in part in a commercial product.  If you plan
       to run http-analyze on a commercial installation and you need
       support, or if you would like to bundle the program with your
       products, you must sign an appropriate license agreement
       available from RENT-A-GURU.  Please send an email to
       <office@rent-a-guru.de>.

       RENT-A-GURU is a registered trademark of Martin Weitzel,
       Stefan Stapelberg, and Walter Mecky.

AUTHOR
       Stefan Stapelberg, <stefan@rent-a-guru.de>

CREDITS
       Thanks to the over 50 beta testers of http-analyze for their
       feedback.
       Special thanks to <Lars-Owe.Ivarsson@its.uu.se> for his
       suggestions to optimize the parser algorithm and the code he
       provided as an example.
       Thanks also to Thomas Boutell (http://www.boutell.com) for his
       great GD library for fast GIF creation, without which
       http-analyze couldn't produce such fancy graphics in the
       summary reports (gd 1.2 is copyright 1994, 1995, Quest Protein
       Database Center, Cold Spring Harbor Labs).

FILES
       Note: output files are always created in the directory given
       with the -o option, with the HTMLDir entry in the config file,
       or in the current directory (in this order).  See also HTML
       OUTPUT FILES above.

       index.html        summary report for the last 12 months
       statsYYYY.html    summary report for year YYYY
       stats.html        short summary (daily mode)
       statsMMYY.html    full summary for MM/YY (monthly mode)
       filesMMYY.html    list of all URLs requested in MM/YY
       sitesMMYY.html    list of all sites accessing the server in MM/YY
       stats.hist        the history file for the last 12 months and last N days
       avloadMMYY.gif    the Hits by hours bar chart image (492x190)
       statsMMYY.gif     the Hits/Files/Sites/KB by day bar chart image (492x317)
       cntryMMYY.gif     the Total transfers by Country pie chart image (492x320)
       graphMMYY.gif     the Hits/Files/Sites/KB graph image (490x317)
       sq_*.gif          icons for creating bars in the full summary (10x8)
       gr-icon.gif       an icon for making links to your statistics page (59x41)

NOTES
       If you are going to analyze different logfiles in one
       invocation of http-analyze, you must sort them in ascending
       order of their date, otherwise the logfiles being processed
       after the first logfile will be silently ignored.
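
       If the names of the logfiles do not already sort by date, the
       required order can be derived from the logfiles themselves.
       The Python sketch below is a hypothetical helper, not part of
       http-analyze.  It assumes that each file is non-empty, that
       its first line is an entry in common logfile format, and that
       month names are written in English.

```python
from datetime import datetime


def entry_time(line):
    """Extract the timestamp from a common logfile format entry."""
    stamp = line[line.index("[") + 1:line.index("]")]
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def sort_by_first_entry(paths):
    """Return paths ordered by the timestamp of each file's first entry."""
    def first_time(path):
        with open(path) as logfile:
            return entry_time(logfile.readline())
    return sorted(paths, key=first_time)
```

       The sorted list can then be passed to http-analyze in this
       order, or the files can be concatenated in this order and fed
       to the program on stdin.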

SEE ALSO
       3Dstats(8L)
              A 3D Access Statistics Generator
       http://www.netstore.de/Supply/http-analyze/
              The homepage of http-analyze

BUGS
       You tell me.