File: http-analyze.1




http-analyze(1)                  Version 2.4                   http-analyze(1)



NAME
     http-analyze - a fast log analyzer for web servers

SYNOPSIS
     http-analyze [-{hdmBVX}] [-3aefgnqvxyM] [-b bufsize] [-c cfgfile]
       [-i newcfg] [-l libdir] [-o outdir] [-p prvdir] [-s subopt,...]
       [-t num,...]  [-u time] [-w hits] [-F logfmt] [-L lang] [-C chrset]
       [-I date] [-E date] [-G suffix,...]  [-H idxfile,...]  [-O vname,...]
       [-P prolog] [-R docroot] [-S srvname] [-T TLDfile] [-U srvurl]
       [-W 3Dwin] [-Z showdom] [logfile[...]]

DESCRIPTION
     http-analyze analyzes the logfile of a web server and creates a detailed
     summary of the server's access load in graphical, tabular, and
     three-dimensional form.  The analyzer does this by
	  o  reading all logfiles specified on the command line,
	  o  saving all	unique (different) URLs, hostnames, referrer URLs and
	     user agents,
	  o  accounting	for hits (successful requests),	files sent, files
	     cached, data sent,	etc.,
	  o  and finally creating a statistics report for the period detected
	     in	the logfile(s).
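     The accounting steps listed above can be pictured with a small sketch.
     It is not the analyzer's actual code: the function name summarize and
     the dictionary keys are made up for illustration.  It follows the
     definitions given later in this page (every logged response is a hit,
     a Code 200 response is a file sent, a Code 304 response is a cached
     file) and assumes that the logged size of every entry counts as data
     sent.

```python
from collections import Counter


def summarize(entries):
    """Tally hits, files sent, cached files, data sent and unique items.

    `entries` is an iterable of dicts with the (hypothetical) keys
    "host", "uri", "status" and "length".
    """
    totals = Counter()
    urls, hosts = set(), set()
    for entry in entries:
        urls.add(entry["uri"])        # unique URLs
        hosts.add(entry["host"])      # unique hostnames
        totals["hits"] += 1           # every logged response is a hit
        if entry["status"] == 200:    # Code 200 (OK): a file was sent
            totals["files"] += 1
        elif entry["status"] == 304:  # Code 304 (Not Modified): cached
            totals["cached"] += 1
        # Assumption: the logged size of every entry counts as data sent.
        totals["bytes"] += entry["length"]
    return {
        "hits": totals["hits"],
        "files": totals["files"],
        "cached": totals["cached"],
        "bytes": totals["bytes"],
        "unique_urls": len(urls),
        "unique_hosts": len(hosts),
    }
```

     A page with two inline images that is fetched once therefore adds
     three hits and three files to the totals.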

     The resulting statistics report is a comprehensive view of the server's
     logfile.  The server writes a logfile entry for every response to a
     request from a browser or from a forwarding system such as a proxy
     server.  To understand the meaning of the terms in the report, you need
     a little knowledge about the type of data your web server records in its
     logfile.

   LOGFILE FORMATS
     NCSA Common Logfile Format (CLF)

     The basic logfile format supported by almost all servers is the NCSA
     Common Logfile Format.  It contains the following information for each
     request (hit):

	  dns-name - auth-user [date] "clf-request" clf-status ct-length

     where the fields have the following meaning:

     dns-name      The IP number of the system accessing the web server.  If
                   there is an entry in the Domain Name System (DNS) for this
                   IP number and the web server is configured to do DNS
                   lookups, the corresponding hostname is logged instead.

     -		   Unused.

     auth-user	   The username	provided by the	client if authentication was
		   required.

     [date]	   The date of the access in format
		   [DD/MMM/YYYY:HH:MM:SS +-ZZZZ].



Page 1                                                       (printed 11/1/99)

     "clf-request" The request in format "method URI proto", where method is
                   one of GET, HEAD, POST, PUT, BROWSE, OPTIONS, DELETE or
                   TRACE; URI is the Uniform Resource Identifier, and proto is
                   the HTTP version number.

     clf-status	   The (numerical) response code from the server.

     ct-length	   This	is either the size of the document or the data
		   actually sent over the wire.


     The following is an example of an entry in NCSA Common Logfile Format:

	  car.4rent.de - - [01/Aug/1999:00:00:02 +0100]	"GET /doc.html HTTP/1.1" 200 393
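     To make the field layout concrete, here is a minimal sketch of how such
     an entry could be split into its fields.  The pattern and the function
     name parse_clf are illustrative only and are not taken from
     http-analyze; accepting a hyphen in place of the size is an assumption.

```python
import re
from datetime import datetime

# dns-name - auth-user [date] "clf-request" clf-status ct-length
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+)\s+(?P<unused>\S+)\s+(?P<user>\S+)\s+'
    r'\[(?P<date>[^\]]+)\]\s+'
    r'"(?P<request>[^"]*)"\s+'
    r'(?P<status>\d{3})\s+(?P<length>\d+|-)'
)


def parse_clf(line):
    """Return a dict of CLF fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Date format: DD/MMM/YYYY:HH:MM:SS +-ZZZZ (English month names).
    entry["date"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    # The request has the form "method URI proto".
    parts = entry["request"].split()
    entry["method"] = parts[0] if parts else None
    entry["uri"] = parts[1] if len(parts) > 1 else None
    entry["proto"] = parts[2] if len(parts) > 2 else None
    entry["status"] = int(entry["status"])
    # Assumption: a hyphen in place of the size means that no data was sent.
    entry["length"] = 0 if entry["length"] == "-" else int(entry["length"])
    return entry
```

     Applied to the example entry above, the sketch yields the host
     car.4rent.de, the method GET, the URI /doc.html, the status 200 and a
     size of 393 bytes.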


     W3C Extended Logfile Format (ELF)

     The W3C Extended Logfile Format (ELF) is basically NCSA CLF plus
     user-agent and referrer URL information.  http-analyze supports two
     variants of this extended format:  DLF and ELF.

     The DLF format adds the referrer URL and the user-agent in this order
     with or without surrounding double quotes:

	  CLF "referrer_URL" "user_agent"
	  CLF referrer_URL user_agent

     This is an example of an entry in DLF format (wrapped on two lines for
     readability):

	  car.4rent.de - - [01/Aug/1999:00:00:02 +0100]	"GET /doc.html HTTP/1.1" 200 393
	  "http://inet-tv.net/hot.html"	"Mozilla/4.05 (X11; I; IRIX64 6.4 IP30)"


     The ELF format also adds the referrer URL and the user-agent, but in the
     opposite order and without the double quotes:

	  CLF user_agent referrer_URL


     This is an example of an entry in ELF format (wrapped on two lines for
     readability):

	  car.4rent.de - - [01/Aug/1999:00:00:02 +0100]	"GET /doc.html HTTP/1.1" 200 393
	  Mozilla/4.05 (X11; I;	IRIX64 6.4 IP30) http://inet-tv.net/index.html


     The ELF variant is the preferred method to pass referrer URL and
     user-agent information.  When this format is used, http-analyze searches
     backwards for the protocol specification of the referrer URL (to be
     precise, it looks for the colon in http:) and then for the preceding
     blank.  This ensures that broken referrer URLs which contain blanks or
     double quotes are handled correctly.
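
     The backward search can be sketched as follows.  This only illustrates
     the idea described above and is not the analyzer's code; the function
     name split_elf_tail is made up, and the fallback for entries without an
     http: scheme is an assumption.

```python
def split_elf_tail(tail):
    """Split the text following the CLF fields into (user_agent, referrer).

    The referrer URL is located by searching backwards for "http:" and then
    for the blank that precedes it, so blanks inside the user-agent or
    inside a broken referrer URL do not confuse the split.
    """
    tail = tail.strip()
    scheme = tail.rfind("http:")
    if scheme == -1:
        # Assumption: without an "http:" scheme the last blank-separated
        # token is taken as the referrer field (for example a hyphen).
        agent, _, referrer = tail.rpartition(" ")
        return agent, referrer
    blank = tail.rfind(" ", 0, scheme)
    agent = tail[:blank] if blank != -1 else ""
    return agent, tail[blank + 1:]
```

     For the ELF example above this returns the user-agent "Mozilla/4.05
     (X11; I; IRIX64 6.4 IP30)" and the referrer URL
     "http://inet-tv.net/index.html".  A referrer URL that itself embeds a
     second "http:" would be split at the wrong place by this simple sketch.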

     To select either logfile format, edit the configuration file of your web
     server and define the fields to be logged.  See the web server's
     documentation for information on how to customize logging.


     Automatic detection of the logfile format

     http-analyze tries to automatically detect the correct logfile format by
     analyzing the first few entries of a logfile (this works only if your
     server records a hyphen (`-') for empty referrer URL or user-agent
     fields).  If http-analyze detects referrer URL and user-agent
     information, it assumes the ELF variant of the W3C Extended Logfile
     Format.  To process the DLF variant, specify the logfile format
     explicitly using the option -F.
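
     One plausible way to picture such a detection is sketched below.  It is
     a guess at the general idea, not the heuristic http-analyze really
     uses; the function name guess_format and the sample size are invented.
     The sketch also shows why the hyphen matters: an empty field that is
     logged as nothing at all would leave fewer than two trailing fields.

```python
import re
from itertools import islice

# The CLF part of an entry; anything after it is extra (ELF/DLF) data.
CLF_PREFIX = re.compile(
    r'^\S+\s+\S+\s+\S+\s+\[[^\]]+\]\s+"[^"]*"\s+\d{3}\s+(?:\d+|-)'
)


def guess_format(lines, sample=10):
    """Guess "CLF" or "ELF" from the first few entries, or None if unknown."""
    checked = extended = 0
    for line in islice(lines, sample):
        match = CLF_PREFIX.match(line)
        if match is None:
            continue  # not a recognizable entry, ignore it
        checked += 1
        # Two or more trailing fields suggest user-agent and referrer URL.
        if len(line[match.end():].split()) >= 2:
            extended += 1
    if checked == 0:
        return None
    # Like the analyzer, the sketch assumes ELF whenever extra data is
    # present; it makes no attempt to recognize the DLF variant.
    return "ELF" if extended == checked else "CLF"
```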

     Logfile data used by http-analyze

     The statistics report shows a summary of the information which has been
     recorded into the logfile by the web server.  For each logfile entry
     http-analyze processes the origin (sitename) and date of the request, the
     request method, the URL of the requested object, the server's response
     to the request, the size of the requested object and optionally the
     user-agent and the referrer URL if sent by the client.

     Note that http-analyze does not recognize visitors, email addresses of
     users visiting your server, the path a user took through your web site,
     the last page visited by a user before leaving your site, or anything
     else not recorded in the server's logfile.  Although hostnames are
     recorded for each request, they need not correspond to the real system
     actually used by a visitor; the request could be forwarded through a
     dialup service, for example.  Furthermore, no request may get logged by
     your server at all while someone is surfing through cached copies of
     parts of your site, depending on the configuration of his/her browser.

   BASIC OPERATION
     By default, http-analyze creates a full statistics report for a whole
     month, which contains complete details for the period determined by the
     timestamps of the first and last logfile entry processed.  It is
     therefore extremely important to always feed all logfiles for a whole
     month into http-analyze, no matter how frequently you rotate (save) the
     logfiles.

     The recommended way of providing an up-to-date statistics report for a
     web server is to have a script run http-analyze automatically on a
     regular basis, say twice per day, and have it process the current logfile
     of the web server from the beginning of the current month until today.
     On the first day of a new month, the logfile should be saved elsewhere
     and the web server should be restarted to create a new logfile for the
     new month.  Then run http-analyze on the old (saved) logfile to create a
     final statistics report for the previous month.  A history file is used
     to produce a summary for the last 12 months on the main page of the
     statistics report without having to analyze logfiles for those older
     periods again.

     If you rotate the logfiles more often so that you can compress them (for
     example, once per day), you must uncompress and concatenate all separate
     logfiles for the whole month into one chronologically ordered data
     stream, which can then be processed by http-analyze.
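
     A minimal sketch of this concatenation step, assuming gzip-compressed
     daily logfiles whose names sort in chronological order (the naming
     scheme access.YYYYMMDD.gz and the function name concat_month are
     hypothetical):

```python
import gzip
import shutil
from pathlib import Path


def concat_month(log_dir, pattern, out_path):
    """Uncompress all matching logfiles into one file, in sorted name order.

    Assumption: sorting the file names also sorts the logfiles
    chronologically, as with the hypothetical names access.YYYYMMDD.gz.
    Returns the list of input files in the order they were written.
    """
    parts = sorted(Path(log_dir).glob(pattern))
    with open(out_path, "wb") as out:
        for part in parts:
            with gzip.open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # append uncompressed data
    return parts
```

     The resulting file can then be named as the logfile argument when
     http-analyze is run.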

     Full statistics report

     For technical reasons, a full statistics report will not be created
     before the second day of a new month, although the totals for the first
     day of the new month on the summary main page of the report will be
     updated.  A full statistics report contains a detailed summary including
     the following items (see the section Interpretation of the results for an
     explanation of the terms):

       o  the number of	hits, files sent/cached, pageviews, sessions and the
	  amount of data sent
       o  the total amount of data requested, transferred, and saved by
	  caching mechanisms
       o  the total number of unique URLs, sites, sessions, browser types and
	  referrer URLs
       o  the total number of all response codes other than Code 200 (OK)
       o  the total number of requests which required authentication
       o  the average load per week, day, hour,	minute and second
       o  the Top 7 days, 24 hours, 5 minutes and 5 seconds
       o  the Top 30 most commonly accessed URLs (hits,	files, pageviews,
	  sessions, data sent)
       o  the 10 least frequently accessed URLs	(hits, files, pageviews,
	  sessions, data sent)
       o  the Top 30 client domains, browser types, and	referrer hosts
       o  an overview and a detailed list of all files,	sitenames, browser
	  types	and referrer URLs
       o  a list of all Code 404 (Not Found) responses

     Short statistics report

     Since analyzing the complete logfile for a whole month increases
     processing time on heavily accessed web servers, you can instruct
     http-analyze to create a short statistics report for the current day
     only.  In this mode, http-analyze updates only the daily totals for the
     current month in the Hits by Day section of the report and saves the
     results in a history file.  If the analyzer is then run a second time to
     update the short statistics report, it skips all logfile entries from the
     beginning of the month until it detects any entries for the current day,
     which are then processed to produce an up-to-date Hits by Day section in
     the statistics report.

     In short statistics mode, http-analyze needs only a fraction of the
     processing time required for a full statistics report, but it updates
     only a very small part of the statistics report, so this mode should be
     considered an additional feature rather than a replacement for the full
     statistics mode.  The recommended way of using this feature is to have
     http-analyze generate a full statistics report once per day or week,
     while generating an up-to-date short statistics report as often as once
     per hour or day.

   USER INTERFACES
     Two user interfaces exist for access to the statistics report:  a
     conventional interface suitable for any browser and a frames-based
     interface which requires JavaScript.

     The conventional interface

     The conventional interface appears as in version 1.9 if JavaScript is
     disabled in your browser or the option -g was specified at invocation of
     http-analyze.  If JavaScript is enabled, the following separate windows
     are used for different parts of the report to allow for easy navigation:

     The Main window
          This window is used for most parts of the report such as the yearly,
          monthly, daily and weekly summaries, the Top N lists and the
          overviews.  Hotlinks in the Top N lists most often point to the
          corresponding page, which is then displayed in the Viewer window if
          the link is followed, while hotlinks in the overviews point to the
          detailed lists, which will show up in the List window.

     The Navigation window
	  If JavaScript	is enabled in your browser and a summary for a year or
	  a month is loaded into the main window, a small window containing a
	  navigation panel will	pop up.	 If JavaScript is disabled, the
	  navigation links appear at the bottom	of the monthly summary pages.
	  In the latter	case, use the _B_a_c_k button of your browser for
	  navigation.

     The List window
          This window is used for the detailed lists of URLs, sites, browser
          types and referrer URLs.  A separate window for those (often large)
          lists causes them to be loaded only once if you follow any link in
          the Main window while the List window is still open.

     The Viewer window
          This window is used for external pages which are loaded by following
          the hotlinks in the statistics report.  This way, you can visit the
          pages referred to in the report without having to go back and forth
          between the report itself and the pages listed there.

     The 3D window
          This window is used for the 3D (VRML) model of the statistics.  If
          you have JavaScript enabled, the window's size will be set to the
          smallest possible size so that the 3D model fits onto the screen or
          to the dimensions specified with the 3DWinSize directive.

     The frames-based interface

     The frames-based interface	requires a JavaScript-enabled browser.	It
     contains the following frames and windows:

     The Navigation frame
          This frame contains navigation buttons and text.  You can specify
          its width using the NavigFrame directive in the configuration file.

     The Main frame
          This frame is used for most parts of the report such as the yearly,
          monthly, daily and weekly summaries, the Top N lists and the
          overviews.  Hotlinks in the Top N lists most often point to the
          corresponding page, which is displayed in the Viewer window if the
          link is followed, while hotlinks in the overviews point to the
          detailed lists, which show up in the List window.

     The List window
          This window is used for the detailed lists of URLs, sites, browser
          types and referrer URLs.  A separate window for those (often large)
          lists causes them to be loaded only once if the links in the Main
          window are followed and the List window is still open.

     The Viewer window
          This (separate) window is used for external pages which are loaded
          by following the hotlinks in the statistics report.  This way, you
          can visit the pages referred to in the report without having to go
          back and forth between the report and the pages listed there.

     The 3D window
          This window is used for the 3D (VRML) model of the statistics.
          Depending on the setting of the 3DWindow directive in the
          configuration file, this is either a separate window (external) or a
          new frame (internal) inside the Main frame (actually, two frames are
          created which replace the former Main frame when the 3D model is
          being displayed).  In case of a separate (external) 3D window, you
          can specify its dimensions using the 3DWinSize directive.

     The 3D model

     The 3D model requires a VRML 2.0 plug-in such as CosmoPlayer from Cosmo
     Software (http://cosmosoftware.com/).  Using this plug-in, which is
     available for Silicon Graphics, Windows and Macintosh platforms, you can
     "walk" or "fly" through the model and view the scene from all sides.
     If you look at the models, don't forget to touch the buddha appearing in
     our 3D logo on top of the statistics report in the yearly summary pages!

     The 3D model contains two scenes (models): one shows the hits, files,
     cached files, sites and the amount of data sent by day, and the other
     one shows the server's access load by weekday and hour.  To view the
     second scene, click on the scene switch at the top right of the model.
     To navigate through the 3D space, use the pre-defined Viewpoints (camera
     positions) and CosmoPlayer's Navigation panel.  For customization, use
     the CosmoPlayer pop-up menu.

     The 3D representation of hits by weekday and hour in the second scene
     allows easy identification of the times at which your server has been
     busiest serving requests.

   INTERPRETATION OF THE RESULTS
     http-analyze creates a summary of the information found in the server's
     logfile.  The analyzer counts the requests, saves the unique URLs,
     sitenames, browser types and referrer URLs and creates a comprehensive
     statistics report.  The following terms are used in this report:

     Hits (color: green) A hit is any response from the web server on behalf
          of a request sent by a browser, such as text (HTML) files, images,
          applets, audio/movie clips and even error messages.  For example, if
          a page is requested which contains two inline images, the server
          would generate three hits: one hit for the HTML page itself and two
          hits for the images.  If an invalid URL is requested, the server
          would respond with a Code 404 (Not Found) status code, which is also
          a response accounted for as a hit.

     Files
          (color: blue) If the server sends back a file for a request, this
          is accounted for as a Code 200 (OK) response.  Such a response is
          classified as a file sent.  Again, file here means any kind of
          file, no matter whether it contains text (HTML documents) or binary
          data (images, applets, movies, etc.).  Note that if you configured
          the web server to log only accesses to HTML files, but not images
          or any other binary data, the number of files would directly
          correspond to the number of documents served.

     Cached
          (color: yellow) A cached file is a Code 304 (Not Modified) response.
          This response is generated by the server if a document hasn't
          changed since the last time it was transferred to the site
          requesting it.  If the browser has access to a local copy of a
          document requested by a user (either through its local disk cache
          or through a caching server), it sends out a conditional request,
          which asks for the document to be sent only if it has changed
          since it was last requested.  If the document hasn't changed
          since then, the server sends back a Code 304 response to
          inform the browser that it can use its local copy.

          While this caching mechanism can significantly reduce network
          traffic, it causes an inaccuracy in the statistics report regarding
          the number of times a file is requested, for two reasons:
          First, the browser can be configured to send conditional requests
          every time, once per session or never if a cached file is requested.
          Second, online services, ISPs, companies and many other
          organizations use so-called caching servers or proxies, which
          fulfill requests themselves if the file is found in the cache.
          Since proxies can serve hundreds to thousands of users, requests
          from certain sites could be caused by thousands of users requesting
          a cached file or by just one person with his/her browser configured
          to not cache anything at all.

          The ratio between files sent and cached files therefore reflects the
          efficiency of caching mechanisms, but only for those requests which
          were handled by your web server.
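
          The counting rules for hits, files and cached files described
          above can be sketched in a few lines of Python.  This is only an
          illustration of the definitions, not the analyzer's own code; the
          function name tally_responses is hypothetical.

```python
from collections import Counter

def tally_responses(status_codes):
    """Count hits, files sent (Code 200) and cached files (Code 304)."""
    totals = Counter()
    for code in status_codes:
        totals["hits"] += 1            # every response counts as a hit
        if code == 200:
            totals["files"] += 1       # Code 200 (OK): a file was sent
        elif code == 304:
            totals["cached"] += 1      # Code 304 (Not Modified): cached file
    return totals

# A page with two inline images, one conditional request answered
# from the cache, and one request for an invalid URL.
print(dict(tally_responses([200, 200, 200, 304, 404])))
```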

     Pageviews
          (color: magenta) The pageview mechanism can be used to separate
          requests for text or HTML files from all other types of requests.
          If a filename pattern has been defined, http-analyze classifies all
          URLs matching this pattern as pageviews (text files), which allows
          you to estimate the number of "real" text documents transmitted by
          your web server.  Filename patterns may be defined using the option
          -G or the PageView directive in the configuration file.  The suffix
          .html is already pre-defined.

     KBytes transferred
          (color: orange) This is the amount of data sent during the whole
          summary period as reported by the server.  Note that some servers
          record the size of a document instead of the actual number of bytes
          transferred.  In most cases the two are the same, but if a user
          interrupts the transmission by pressing the browser's stop button
          before the page has been received completely, some servers (for
          example all Netscape web servers) log the size of the file instead
          of the amount of data actually transmitted.

     KBytes requested
          This is the amount of data requested during the whole summary
          period.  http-analyze computes this number by summing up the values
          of KBytes transferred and KBytes saved by cache (see below).

     KBytes saved by cache
          The amount of data saved by various caching mechanisms.  This value
          is computed by multiplying the number of cached file (Code 304)
          responses by the size of the corresponding file.  Because
          http-analyze can determine the size of a file only if the file has
          been transmitted at least once in the same summary period, the
          values for KBytes saved by cache and KBytes requested are just
          approximations of the real values.
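
          As a rough illustration of this computation, the following Python
          sketch derives the three KBytes values from a list of log
          entries.  The tuple layout and the function name are assumptions
          made for the example, and taking the size from the most recent
          Code 200 response is one possible choice, not necessarily what
          the analyzer does.

```python
def kbyte_totals(entries):
    """entries: iterable of (url, status, bytes_sent) for one summary period.

    Returns (transferred, saved_by_cache, requested) in KBytes.
    """
    size = {}          # url -> size seen in a Code 200 response
    cached = {}        # url -> number of Code 304 responses
    transferred = 0
    for url, status, nbytes in entries:
        transferred += nbytes
        if status == 200:
            size[url] = nbytes
        elif status == 304:
            cached[url] = cached.get(url, 0) + 1
    # A file that was never transmitted in this period has no known size
    # and contributes nothing, hence the result is only an approximation.
    saved = sum(count * size.get(url, 0) for url, count in cached.items())
    return transferred / 1024, saved / 1024, (transferred + saved) / 1024

log = [("/a.html", 200, 2048), ("/a.html", 304, 0),
       ("/a.html", 304, 0), ("/b.html", 304, 0)]
print(kbyte_totals(log))
```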

     Unique URLs
          The total number of unique URLs is the number of different URLs
          (files) on your web server which have been requested at least once
          in the corresponding summary period.

     Referrer URLs
          If a user follows a link to your web site and his/her browser sends
          the URL of the page containing the link to the server, this URL is
          logged as the referrer URL (the location referring to your
          document).  Note that the browser does not necessarily send a
          referrer URL and even if it does, a proxy server may alter or delete
          it before forwarding the request to a web server.  Such requests
          appear under Unknown in the referrer URL list.

     Self-referrer URLs
          As soon as the browser detects any inline objects (images, applets,
          etc.) in a page just loaded, it sends out separate requests for
          those objects.  If the objects reside on the same server as the page
          referring to them, the corresponding referrer URLs contain the URL
          of the page on your server.  Such referrer URLs are called
          self-referrer URLs.  If configured correctly, http-analyze separates
          all self-referrer URLs from the rest of the referrer URLs in the
          report.  This allows you to separate accesses which originated from
          inline objects in a text page from the remaining (external)
          accesses.
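
          One way to picture this separation is to compare the host part
          of each referrer URL with the names of your own server.  The
          Python sketch below does this; the function name and the example
          hostnames are hypothetical, and the analyzer itself may use a
          different test.

```python
from urllib.parse import urlsplit

def is_self_referrer(referrer, own_hosts):
    """True if the referrer URL points to one of the server's own hostnames.

    own_hosts is a set of lower-case hostnames of your server.
    """
    host = urlsplit(referrer).hostname   # None for empty or relative values
    return host is not None and host in own_hosts

own = {"www.example.com"}
for ref in ("http://www.example.com/index.html",
            "http://search.example.org/?q=stats",
            ""):
    kind = "self" if is_self_referrer(ref, own) else "external/unknown"
    print(repr(ref), "->", kind)
```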

     Unique sites
          This is the number of all different hostnames or IP addresses found
          in the logfile.  Each different hostname is counted only once per
          period, so this number shows how many systems sent requests to
          your server.

     Sessions
          (color: red) Similar to unique sites, this is the number of
          different hostnames or IP addresses accessing the server during a
          certain time-window, which defaults to one day for backward
          compatibility.  Accesses from a known hostname outside this
          time-window get accounted for as a new session.  You can increase or
          decrease the time-window for sessions using the option -u or the
          Session directive in the configuration file.  For example, if you
          set the time-window to 2 hours, all accesses from the same host in
          less than 2 hours are accounted for as the same session, while any
          access more than 2 hours apart from the first one is accounted for
          as a new session.
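
          The session rule can be sketched as follows.  The example assumes
          that accesses arrive in chronological order and that an access
          exactly one time-window after the first one already starts a new
          session; the manual does not specify this boundary case, and the
          function name is hypothetical.

```python
def count_sessions(accesses, window_seconds):
    """accesses: (hostname, unix_time) pairs in chronological order."""
    session_start = {}    # hostname -> time of first access in its session
    sessions = 0
    for host, when in accesses:
        start = session_start.get(host)
        if start is None or when - start >= window_seconds:
            session_start[host] = when    # outside the window: new session
            sessions += 1
    return sessions

two_hours = 2 * 60 * 60
log = [("a.example.com", 0), ("b.example.com", 100),
       ("a.example.com", 3600), ("a.example.com", 7300)]
print(count_sessions(log, two_hours))
```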




     Request Method
          The browser uses a certain method to request a document from a web
          server.  For example, documents, images, applets, etc. are usually
          requested using the GET method.  Other frequently used methods are
          the HEAD method, which requests information about a document such
          as its size without having the server send its actual content, and
          the POST method, a special way to transfer user input from forms
          into CGI scripts.

          Although all logfile entries with a valid request method are
          accounted for as hits, only URLs requested using either the GET or
          the POST method are processed further.  The remaining hits are
          summarized under Request Methods other than GET/POST.

     Response Codes
          In reply to a request from a browser, the server sends back a status
          code such as a Code 200 (OK) or Code 404 (Not Found) response.
          Similar to the request methods, the analyzer accounts any valid
          response code as a hit, but it only processes those URLs which
          caused a Code 200 (OK), Code 304 (Not Modified), or Code 404 (Not
          Found) response from the server.  All other responses are summarized
          in the monthly summary page under Other Response Codes.  See the
          current HTTP specification at http://www.w3.org/ for information
          about all valid response codes and their meaning.  http-analyze
          recognizes HTTP/1.1 responses according to RFC2616.
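
          Taken together, the request method and response code rules
          decide whether the URL of a hit is processed further or only
          summarized.  The Python sketch below illustrates this; checking
          the method before the response code is an assumption of the
          example, and route_entry is a hypothetical name.

```python
PROCESSED_METHODS = {"GET", "POST"}
PROCESSED_CODES = {200, 304, 404}

def route_entry(method, status):
    """Return where a valid logfile entry ends up in the report."""
    if method not in PROCESSED_METHODS:
        return "Request Methods other than GET/POST"
    if status not in PROCESSED_CODES:
        return "Other Response Codes"
    return "processed"

for method, status in (("GET", 200), ("HEAD", 200), ("GET", 302)):
    print(method, status, "->", route_entry(method, status))
```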

     Unresolved
          A system identifies itself to a web server using an IP number.
          Depending on the configuration, the web server might perform a DNS
          lookup to resolve the IP number into a hostname.  If no hostname has
          been assigned to this IP number, only the IP number is logged.  Such
          requests are accounted for under Unresolved in the country list of
          the statistics report.  Since some systems intentionally have no
          hostname, a percentage of up to 35% for unresolved IP numbers is
          absolutely normal.

          If the country list shows 100% unresolved IP numbers, either
          enable the DNS lookup in your web server or have a DNS resolver
          utility preprocess the logfile before feeding the data into
          http-analyze.  For our Commercial Service Licensees, we offer a fast
          DNS resolver utility with negative caching and a history mechanism.
          Visit the support site at http://support.netstore.de/ for more
          information.
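
          To check the share of unresolved IP numbers in a logfile before
          running the analyzer, a small script along the following lines
          may help.  It is only a sketch: it treats every host field that
          parses as an IP address as unresolved, and the function names
          are hypothetical.

```python
import ipaddress

def is_unresolved(host):
    """True if the logged host field is a bare IP number."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False

def unresolved_share(hosts):
    """Percentage of host fields which are unresolved IP numbers."""
    hosts = list(hosts)
    if not hosts:
        return 0.0
    return 100.0 * sum(map(is_unresolved, hosts)) / len(hosts)

hosts = ["192.0.2.17", "www.example.com", "a.example.de", "b.example.de"]
print(unresolved_share(hosts))
```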


     What the report does NOT show ...

     Due to the nature of the HTTP protocol used for communication between the
     browser and the server and due to the type of information available in
     the server's logfile, the analyzer cannot:

          o  identify a person as a visitor of your server,
	  o  count the number of visitors of your server,
	  o  find out the email	address	of a visitor,
	  o  track the path a visitor takes through your site,
	  o  measure the time a	visitor	sees a page of your server,
	  o  determine the last	page someone saw before	leaving	your site,
	  o  inform you	about the sudden death of the visitor while looking at
	     your homepage,
	  o  nor show any other	information not	recorded in the	server's
	     logfile.


     Even if you classify certain URLs as pageviews or use a specific
     time-window to count sessions, this in no way tells you anything about
     the number of real visitors of your server.

     However, if you use an appropriate server structure with files grouped by
     their content or if you use the HideURL directive to group unstructured
     files together, the statistics report does show you at least a trend or a
     tendency.  Following the numbers for some time, you soon get a feeling
     for which documents are most interesting to the visitors of your site.

   OUTPUT FILES
     A statistics report is created in the current directory or in the output
     directory specified at invocation of http-analyze.  All output files are
     placed into separate subdirectories to reduce the number of directory
     entries per report.  Those subdirectories are named wwwYYYY, where YYYY
     is the year of the period covered by the report.
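
     If you need to locate a report from a script, the naming scheme can be
     reproduced as in the following Python sketch.  It covers the wwwYYYY
     subdirectory and the statsMMYY.html file listed later in this section.
     Writing MM and YY as zero-padded two-digit numbers is an assumption of
     the example, and the output directory shown is hypothetical.

```python
def report_paths(outdir, year, month):
    """Return (subdirectory, monthly summary file) for a given month."""
    subdir = f"{outdir}/www{year:04d}"
    mmyy = f"{month:02d}{year % 100:02d}"     # assumed zero-padded MMYY
    return subdir, f"{subdir}/stats{mmyy}.html"

print(report_paths("/var/www/stats", 1999, 3))
```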

     The analyzer can be instructed to place files with "private" data such
     as overviews and detailed lists of files, hosts, browser types, and
     referrer URLs in a separate ("private") subdirectory.  The web server
     can then be configured to request authentication for access to files in
     this directory.  See also the option -p and the PrivateDir directive in
     the configuration file.

     Note: for protection of the whole report, you would configure your web
     server to request authentication for any file in the statistics output
     directory.  A separate private area is needed only if you want to secure
     certain lists while granting access to the rest of the statistics report.

     The following list shows all output files of a full statistics report in
     a wwwYYYY directory:

     index.html
          is the main page for the year and contains the total numbers of
          hits, files sent, cached files, pageviews, sessions and data sent
          per month in tabular and graphical form for the last 12 months.  At
          the end of the year, this file contains the values for the whole
          year, while the values for the last 12 months are then continued
          in the index file for the new year.  This page is displayed in the
          Main window.

     statsMMYY.html and totalsMMYY.html
          contain the total summary for the month MM of year YY in tabular
          form.  The file totalsMMYY.html is the frames version of the report
          in statsMMYY.html.  In the conventional interface, this page is
          displayed in the Main window.

     jsnav.html and navMMYY.html
          navigation panels for JavaScript-enabled browsers, shown in the
          Navigation window.

     daysMMYY.html
          contains the numbers of hits, files sent, cached files, pageviews,
          sessions and data sent per day for the month MM of year YY.  This
          report is displayed in the Main window.

     avloadMMYY.html
          contains a graphical representation of the average hits per
          weekday/hour and the top seconds, minutes, hours, and days of the
          period.  This list appears in the Main window.

     countryMMYY.html
          contains the list of all countries the visitors of your web server
          came from.  This information is determined by analyzing the
          top-level domain (TLD) of the hostname assigned to a system in the
          Domain Name System (DNS).  The country report is displayed in the
          Main window.
          Note that the country list is meaningful only for hostnames with ISO
          two-letter domains.  All other domains (.com, .org, .net, etc.) are
          used by organizations world-wide, so they are not assigned a
          country, but listed literally in the charts.  The ISO country code
          for the U.S. is .us, by the way, not .com ...
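
          The classification by top-level domain can be pictured with the
          following Python sketch.  It only looks at the form of the last
          label and does not check it against the ISO country list, which
          the analyzer presumably does; the function name is hypothetical.

```python
def country_key(hostname):
    """Classify a host by the last label of its name.

    Returns ('country', cc) for two-letter domains, ('domain', tld) for
    other top-level domains, and ('unresolved', None) for IP numbers.
    """
    label = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    if label.isdigit():
        return ("unresolved", None)      # dotted IP number, no hostname
    if len(label) == 2 and label.isalpha():
        return ("country", label)        # candidate ISO two-letter domain
    return ("domain", label)             # .com, .org, .net, ...

for host in ("www.example.de", "www.example.com", "192.0.2.17"):
    print(host, "->", country_key(host))
```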

     3DstatsMMYY.html, 3DstatsMMYY.wrl.gz, 3DstatsYYYY.html, 3DstatsYYYY.wrl.gz
          are pre-requisite files for the 3D statistics models in the Virtual
          Reality Modeling Language (VRML).  Those models are created if the
          option -3 is given at invocation of http-analyze.  To view those
          models, you need a VRML2.0 compatible plug-in such as the free
          CosmoPlayer from Cosmo Software, which is currently available for
          Silicon Graphics, Windows and Macintosh systems.  See
          http://cosmosoftware.com/ for more information.  All 3D models are
          displayed in the 3D window, so that you can compare them with the
          graphs in the conventional report.

     topurlMMYY.html, topdomMMYY.html, topuagMMYY.html, toprefMMYY.html
          These files contain the Top Ten lists (actually it's Top N, where N
          is a configurable number) of the files, sites, browser types and
          referrer URLs.  The URLs shown in topurlMMYY.html are either the
          real URLs requested by the visitor or an item (arbitrary text) you
          chose to collect certain file names under (see the HideURL
          directive in the configuration file).
          The domain names shown in topdomMMYY.html are either the
          second-level domains of the hosts accessing your server if the DNS
          name is available or an item you chose to collect certain hostnames
          under (see the HideSys directive in the configuration file).
          Unresolved IP numbers show up as Unresolved.
          The file topuagMMYY.html contains a list of all different user
          agents which have been used by visitors to access your web site.
          The user agent information is an identification sent by the browser
          and logged by the web server.  Although the format for this
          identification is well-defined, it isn't obeyed by any browser.  If
          possible, http-analyze reduces the name of the user agent in the Top
          lists to the browser type including the first digit of its version
          number.  If it is not possible to isolate the browser type from the
          user agent, the full identification string as sent by the browser is
          stored.
          A referrer URL is the URL of the page containing a link to your web
          site, which has been followed by someone to reach your site.  Note
          that for manually entered URLs no referrer URL gets logged.  Also,
          some browsers do not send a referrer URL or send a faked one.
          Entries without a referrer URL are collected under Unknown in the
          referrer list.  The list of referrer URLs is displayed in the Main
          window.
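
          The reduction of user agent strings might look roughly like the
          Python sketch below.  The pattern, the output form "name/digit"
          and the function name are assumptions made for illustration;
          the exact rules used by the analyzer are not described here.

```python
import re

# Product name, a slash, then the first digit of the version number.
_AGENT = re.compile(r"^([^/\s]+)/(\d)")

def reduce_agent(user_agent):
    """Reduce a user agent string to browser type plus first version digit."""
    m = _AGENT.match(user_agent)
    if m is None:
        return user_agent     # cannot isolate the type: keep the full string
    return f"{m.group(1)}/{m.group(2)}"

for agent in ("Mozilla/4.61 [en] (X11; I; IRIX 6.5 IP32)",
              "SomeRobot (no version)"):
    print(repr(agent), "->", repr(reduce_agent(agent)))
```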




     filesMMYY.html, sitesMMYY.html, agentsMMYY.html, refersMMYY.html
          Those files contain a complete overview of the files, sites, browser
          types and referrer URLs, similar to the Top N lists.

     lfilesMMYY.html, lsitesMMYY.html, lagentsMMYY.html, lrefersMMYY.html
          Those files contain the detailed lists of all files, sites, browser
          types and referrer URLs, similar to the previous lists, but sorted
          by item (if any) and hits.  On frequently accessed sites those lists
          can become rather large, so they are shown in the separate List
          window.

     rfilesMMYY.html
          contains all invalid URLs which caused the server to respond with a
          Code 404 (Not Found) status.  If there is a large number of hits for
          certain files the server couldn't find, it's probably due to missing
          inline images or other HTML objects embedded in other pages.  This
          report is displayed in the Main window.

     rsitesMMYY.html
          contains the list of reverse domains.  This report is displayed in
          the Main window.

     frames.html, header.html
          These two files are required for the frames-based user interface.
          All other files are shared with the ones for the non-frames UI.  In
          the frames-based UI, the Main window is inside the frame, while the
          List window is still an external window.  The 3D window may be
          inside the frame or an external window (see the 3DWindow directive).

     gr-icon.png
          This small icon showing the graph from the main page is displayed on
          the main page under the base directory for each statistics report.

   OPTIONS
     -h   print a short help list explaining the meaning of the options.  Use
          -hh to print an even more detailed help.

     -d   (daily mode) generate a short statistics report for the current day
          only.  If a history file exists, the values for the previous days
          will be read from this history file and the corresponding logfile
          entries are skipped.  If no history exists, the whole logfile will
          be processed and a history file will be created (unless -n is also
          given).

     -m   (monthly mode) generate a full statistics report for a whole month.
          In this mode, the values from the history file are used only to
          create a summary page for the last 12 months.  The timestamps from
          the logfile entries fed into http-analyze always take precedence
          over any records in the history unless the option -e is specified.

     -B   create buttons only and exit.  The analyzer copies or links the
          required files and buttons from the central directory HA_LIBDIR into
          the output directory specified by -o.

     -V   (version) print the version number of http-analyze and exit.

     -X   print the URL to file a bug report.  Use command substitution or cut
          & paste to pass this URL to your favourite browser, complete the
          form and submit it.

     -3   create a VRML2.0-compliant 3D model of the statistics in addition to
          the regular statistics report.  You need a VRML2.0 compliant plug-in
          such as CosmoPlayer from Cosmo Software to view the resulting model.

     -a   ignore all requests for URLs which required authentication.  If your
          statistics report is publicly available, you probably do not want to
          have "secret URLs" listed in the report.  See also the AuthURL
          directive in the configuration file.

     -e   use the history file even in full statistics (-m) mode.  If this
          option is specified and you analyze the logfiles for several months
          in one run, http-analyze uses the results recorded in the history
          file for previous months and skips all logfile entries up to the
          first day of a new month not recorded in the history (usually the
          current month).  This option is useful if you rotate your logfile
          once per quarter and want http-analyze to skip all entries for
          previous months which have been processed already.

     -f   create an additional frames-based user interface for the statistics
          report.  This interface requires JavaScript.

     -g   (generic interface) create a conventional (non-frames) user
          interface for the statistics report without the optional
          JavaScript-based navigation window.  By default, the conventional
          interface includes JavaScript enhancements for window control, which
          only become active if the user has enabled JavaScript in his/her
          browser.  Use this option only to completely disable JavaScript
          enhancements in the report even if the user has enabled JavaScript
          in the browser.

     -n   (no update) do not update the history file.  Since the history is
          used in the statistics report to create the main summary page with
          the results of the last 12 months, this option must be used to avoid
          messing up the statistics report when analyzing logfiles for
          previous months (before the last one).

     -q   do not strip arguments to CGI scripts.  By default, http-analyze
          strips arguments from CGI URLs to be able to lump them together.  If
          your server creates dynamic HTML files through a CGI script, they
          are reduced to the URL of the script.  If -q is specified, those
          argument lists are left intact and CGI URLs with different arguments
          are treated as different URLs.  Note that this only works for
          requests to scripts which receive their arguments using the GET
          method, but not the POST method.  See the section Interpretation of
          the results for an explanation of the request methods and the
          StripCGI directive.
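
          The effect of stripping CGI arguments can be illustrated with a
          short Python sketch.  It assumes that the argument list is
          everything after the first '?' in the URL; the function name
          and the example URLs are hypothetical.

```python
from collections import Counter

def normalize_cgi_url(url, keep_args=False):
    """Strip the argument list from a URL unless keep_args (as with -q)."""
    if keep_args:
        return url
    return url.split("?", 1)[0]

urls = ["/cgi-bin/search?q=apples", "/cgi-bin/search?q=pears", "/index.html"]
print(Counter(normalize_cgi_url(u) for u in urls))
print(Counter(normalize_cgi_url(u, keep_args=True) for u in urls))
```

          Arguments sent with the POST method are not part of the logged
          URL, so they cannot be told apart either way.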

     -v   (verbose) comment ongoing processing.  Warnings are printed only in
          verbose mode.  Use this option to see how http-analyze processes the
          logfile.  If -v is doubled, a dot is printed for each new day in the
          logfile.

     -x   list each image URL literally rather than lumping them together
          under the item All images.  Without this option, http-analyze
          collects all requests for images (*.gif, *.png, *.jpg, *.ief, *.pcd,
          *.rgb, *.xbm, *.xpm, *.xwd, *.tif) under the item All images to
          avoid cluttering up the lists with lots of image URLs.  If -x is
          given, each image URL is listed literally unless matched by an
          explicit HideURL directive in the configuration file.
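
          In terms of the URL lists, the default behaviour corresponds
          roughly to the following Python sketch.  The suffix list is taken
          from the description above; the function name is hypothetical and
          the matching shown here is case-sensitive, which may differ from
          the analyzer.

```python
IMAGE_SUFFIXES = (".gif", ".png", ".jpg", ".ief", ".pcd",
                  ".rgb", ".xbm", ".xpm", ".xwd", ".tif")

def list_item(url, list_images=False):
    """Return the name under which a URL appears in the lists.

    list_images=True corresponds to giving the option -x.
    """
    if not list_images and url.endswith(IMAGE_SUFFIXES):
        return "All images"
    return url

for url in ("/img/logo.gif", "/index.html"):
    print(url, "->", list_item(url), "|", list_item(url, list_images=True))
```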

     -M   MS IIS-Mode: use case-insensitive matching for URLs.  This violates
          the standard, but is necessary for logfiles produced by IIS servers
          to correctly identify unique URLs.
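
          The difference this makes to the number of unique URLs can be
          seen in a minimal Python sketch, in which the hypothetical
          parameter ignore_case plays the role of the option -M:

```python
def unique_urls(urls, ignore_case=False):
    """Count distinct URLs, optionally ignoring upper/lower case."""
    return len({u.lower() if ignore_case else u for u in urls})

urls = ["/Default.htm", "/default.htm", "/DEFAULT.HTM"]
print(unique_urls(urls), unique_urls(urls, ignore_case=True))
```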

     -b bufsize
          defines the size of the I/O buffer for reading the logfile (default:
          64KB).  Usually, the best size for I/O buffers is reported on a
          per-file basis by the operating system, but some operating systems
          report the logical blocksize instead.  If http-analyze -v reports a
          "Best buffer size for I/O" less than or equal to 8KB, you should
          specify a size of 16KB for pipes and up to 64KB for disk files to
          increase the processing speed.

     -c cfgfile
          use cfgfile as the configuration file.  A configuration file allows
          you to define the behaviour of http-analyze and to define the "look
          & feel" of the statistics report.  See the section Configuration
          File for a description of possible settings, which are called
          directives in the following text.

     -i newcfg
          create a new configuration file named newcfg.  If an old
          configuration file was also specified using the -c option, older
          settings are retained in the new file.  Any command line options
          take precedence over old configuration file entries and will be
          transformed into the corresponding directive if possible.  For
          example, specifying the output directory using the option -o outdir
          will produce an entry OutputDir outdir in the new configuration
          file.

     -l libdir
          use libdir as the central library directory where http-analyze looks
          for the pre-requisite files, buttons, and license information
          (usually /usr/local/lib/http-analyze).  This location can also be
          specified using the environment variable HA_LIBDIR.

     -o outdir
          use outdir instead of the current directory as the output directory
          for the statistics report.  http-analyze checks automatically for
          the required files and buttons in outdir.  If they are missing or
          out of date, the analyzer copies them from HA_LIBDIR into the output
          directory.  See also the OutputDir and the BtnSymlink directives.

     -p prvdir
          defines the name of a "private" directory for the detailed lists
          of files, sites, browsers and referrer URLs.  Because prvdir must
          reside directly under the output directory, its name may not contain
          any slashes ('/').  A private directory for detailed lists may be
          useful to restrict access to those lists if the rest of the
          statistics report is publicly available.  Note that for restricting
          access to the complete statistics report, you do not need to place
          the detailed lists in a private directory.  See also the PrivateDir
          directive.

     -s subopt,...
          suppress certain lists in the report.  See also the Suppress
          directive.  subopt may be:
               AVLoad      to suppress the average load report (top seconds/minutes/hours),
               URLs        to suppress the overview and list of URLs/items,
               URLList     to suppress the list of URLs/items only,
               Code404     to suppress the list of Code 404 (Not Found) responses,
               Sites       to suppress the overview and list of client domains,
               RSites      to suppress the overview of reverse client domains,
               SiteList    to suppress the list of all client domains/hostnames,
               Agents      to suppress the overview and list of browser types,
               Referrer    to suppress the overview and list of referrer URLs,
               Country     to suppress the list of countries,
               Pageviews   to suppress pageview rating (cached files are shown instead),
               AuthReq     to suppress requests which required authentication,
               Graphics    to suppress images such as graphs and pie charts,
               Hotlinks    to suppress hotlinks in the list of all URLs,
               Interpol    to suppress interpolation of values in graphs.

     -t num
          defines the size of certain lists.  num is either a positive number
          or the value 0 to suppress the corresponding list.  You specify the
          list by appending one of the following characters to the number
          shown here as '#' (note that the characters are case-sensitive):
             #U          # is the number of entries in the Top URL list (default: 30),
             #L          # is the number of entries in the Least URL list (default: 10),
             #S          # is the number of entries in the Top domain list (default: 30),
             #A          # is the number of entries in the Top agent/browser list (default: 30),
             #R          # is the number of entries in the Top referrer URL list (default: 30),
             #d          # is the number of entries in the Top days table (default: 7),
             #h          # is the number of entries in the Top hours table (default: 24),
             #m          # is the number of entries in the Top minutes table (default: 5),
             #s          # is the number of entries in the Top seconds table (default: 5),
             #N          # is the size of the navigation frame (default: 120 pixels).
          You can specify more than one num with a single -t option by
          separating them with commas as in -t 20U,0L,20S.  See also the Top*
          directives in the configuration file.
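
          For scripts which build a -t argument, the syntax can be
          checked with a small Python sketch such as the one below.  It
          only mirrors the format described above (a number followed by a
          case-sensitive list character, separated by commas); the
          function name is hypothetical and the analyzer's own parser may
          accept or reject other input.

```python
import re

KNOWN_LISTS = set("ULSARdhmsN")     # list characters are case-sensitive

def parse_top_spec(spec):
    """Parse a -t argument such as '20U,0L,20S' into {character: size}."""
    sizes = {}
    for part in spec.split(","):
        m = re.fullmatch(r"(\d+)([A-Za-z])", part.strip())
        if m is None or m.group(2) not in KNOWN_LISTS:
            raise ValueError(f"bad -t item: {part!r}")
        sizes[m.group(2)] = int(m.group(1))    # 0 suppresses the list
    return sizes

print(parse_top_spec("20U,0L,20S"))
```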

     -u time
          defines the time-window for counting sessions.  See Sessions in the
          section Interpretation of the results for an explanation of this
          term.

     -w hits
          sets the noise-level to hits.  If a noise-level is defined, all
          URLs, sites, agents and referrer URLs with hits below this level are
          collected under the item Noise in the Top N lists and overviews to
          avoid cluttering up those lists.  See also the NoiseLevel directive.

     -I date
          skip all logfile entries until this day (exclusive).  The date may
          be specified as DD/MM/YYYY or MM/YYYY, where MM is the number or
          the name of a month.  Note that in full statistics mode, DD defaults
          to the first day of the month if absent.  If you specify any other
          day in this mode, unpredictable results may occur.  For example,
          -I Feb restricts the analysis to February of the current year.

     -E date
          skip all logfile entries starting from this day on (inclusive).  The
          date format is the same as in -I.  To restrict analysis to a certain
          period, specify the starting date using -I and the first date to be
          ignored using -E.  For example, -I Jan/99 -E Feb/99 restricts the
          analysis to January 1999.
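
          Taken together, -I and -E describe a half-open interval: the -I day
          is the first one analyzed and the -E day is the first one ignored.
          A minimal sketch of that rule, assuming dates have already been
          parsed (the helper name in_window is hypothetical):

```python
from datetime import date

def in_window(day, start=None, end=None):
    """True if `day` lies in [start, end).

    start models -I (first day analyzed), end models -E (first day ignored).
    Either bound may be None, meaning "not specified".
    """
    if start is not None and day < start:
        return False
    if end is not None and day >= end:
        return False
    return True

# -I Jan/99 -E Feb/99, with DD defaulting to the first day of the month
jan, feb = date(1999, 1, 1), date(1999, 2, 1)
print(in_window(date(1999, 1, 31), jan, feb))  # True
print(in_window(date(1999, 2, 1), jan, feb))   # False
```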

     -F logfmt
          the logfile format to use.  Valid keywords for logfmt are auto for
          auto-sensing the logfile format, clf for the Common Logfile Format,
          or dlf and elf for the two variants of the W3C Extended Logfile
          Format.  See also the section Logfile Formats above.

     -L lang
          use the language lang for warning messages and for the statistics
          report.  See also the directive Language and the section
          Multi-National Language Support for more information about
          localization of http-analyze.

     -C chrset
          force use of chrset for the browser's encoding when displaying the
          statistics report.  This is needed for languages which require
          special character sets such as Chinese.  See also HTMLCharSet and
          the section about Multi-National Language Support.

     -G pattern,...
          defines additional pageview patterns.  All URLs matching one of the
          patterns are classified as pageviews (text files).  If pattern
          starts with a slash (`/'), it is treated as a prefix each URL is
          compared with; otherwise it is treated as a suffix.  The suffix
          .html is pre-defined by default.  You can add up to 9 more patterns
          here, for example .shtml, .text and /cgi-bin/.  To specify more than
          one pattern with a single -G option, use commas to separate them.
          See also the PageView directive.
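
          The prefix/suffix rule for pageview patterns can be sketched as
          follows.  This is an illustration of the rule as described here,
          not the analyzer's code; the function name is_pageview is an
          assumption.

```python
def is_pageview(url, patterns=(".html",)):
    """Classify a URL using the documented prefix/suffix rule.

    A pattern starting with '/' is compared against the start of the URL,
    any other pattern against its end.  ".html" is the pre-defined suffix.
    """
    for pattern in patterns:
        if pattern.startswith("/"):
            if url.startswith(pattern):
                return True
        elif url.endswith(pattern):
            return True
    return False

patterns = (".html", ".shtml", ".text", "/cgi-bin/")
print(is_pageview("/cgi-bin/search", patterns))  # True
print(is_pageview("/img/logo.gif", patterns))    # False
```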

     -H idxfile,...
          defines additional directory index filenames.  The name index.html
          is pre-defined by default.  http-analyze truncates URLs containing
          an index filename so that they merge with `/' (their "base URL").
          For example, /dir/index.html is truncated to /dir/.  You can add up
          to 9 more names for directory index files, for example Welcome.html
          or home.html.  To specify more than one name with a single -H
          option, use commas to separate them.  See also the IndexFiles
          directive.

     -O vname,...
          defines additional (virtual) names for this server to be classified
          as self-referrer URLs.  The server's primary name (from -S or -U) is
          pre-defined already.  If vname doesn't include a protocol specifier,
          two URLs with the http and the https protocol specifier are added
          for each name.  To specify more than one server name with a single
          -O option, use commas to separate them.  See also the VirtualNames
          directive.
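
          The expansion of names without a protocol specifier can be pictured
          like this.  The sketch assumes that "://" marks a protocol
          specifier; the function name self_referrer_urls is hypothetical.

```python
def self_referrer_urls(vnames):
    """Expand a comma-separated -O argument into self-referrer URLs.

    Names without a protocol specifier yield both an http and an https URL;
    names that already carry one are kept unchanged.
    """
    urls = []
    for name in vnames.split(","):
        if "://" in name:  # assumption: this is what "protocol specifier" means
            urls.append(name)
        else:
            urls += [f"http://{name}", f"https://{name}"]
    return urls

print(self_referrer_urls("www.mycompany.com"))
# ['http://www.mycompany.com', 'https://www.mycompany.com']
```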

     -P prolog
          use prolog as the prolog file for a yearly VRML model (optional).
          The file 3Dprolog.wrl is included in the distribution as an example.
          Note that the resulting VRML model for a whole year may be suitable
          only for viewing on a fast system such as a workstation.  The
          monthly VRML models do not need a prolog file and can be viewed on
          any platform without problems.  See also the VRMLProlog directive.

     -R docroot
          restrict logfile analysis to the given Document Root.  If docroot is
          prefixed by a `!', analysis takes place for all directories except
          docroot.  If docroot does not start with a slash (`/'), it is
          interpreted as the name of a virtual server, which is matched
          against the normally unused second field of a logfile entry.
          Intended for use with virtual servers with a separate Document Root
          or for which the hostname is recorded in the second field of a
          logfile entry.  See also the DocRoot directive.

     -S srvname
          use srvname for the server name.  If no server name is defined,
          http-analyze uses the hostname of the system it is running on.  The
          server name must be a fully qualified domain name, not a URL.  See
          also the ServerName directive.

     -T TLDfile
          use TLDfile for the list of valid top-level domains (TLDs).  This
          list currently includes all ISO two-letter country domains, the
          well-known domains .net, .int, .org, .com, .edu, .gov, .mil, .arpa,
          .nato, and the new CORE top-level domains .firm, .info, .shop,
          .arts, .web, .rec, and .nom.  The length of a top-level domain in
          the TLD file may not exceed 6 characters.  Since http-analyze uses
          its built-in defaults if no TLD file is specified, you will rarely
          need this option.  See also the TLDFile directive and the sample TLD
          file included in the distribution.

     -U srvurl
          defines srvurl as the URL of the server to be used for hotlinks in
          URL lists.  Useful if the report for your web server is published on
          another server.  Also necessary for virtual servers to have
          http-analyze generate correct hypertext links in the report.  See
          also the ServerURL directive.

     -W 3Dwin
          defines the window for the VRML model.  The keyword 3Dwin may be
          either extern or intern for display of the VRML model in a new,
          external window or in the lower half of the main frame respectively
          (meaningful only in the frames-based interface).

     -Z showdom
          defines showdom as the number of components in a domain name which
          make up the organizational part.  This is usually the second-level
          domain, so that the last two components of the domain name (for
          example, company.com) are used as the organizational part.
          However, some countries prefer to use third-level domains, so that
          the hostnames use 4 or more components, where the last 3 are used
          for the organizational part (as in company.co.uk).  To recognize
          such third-level domains, showdom can be set to the value 3.
          Hostnames with exactly 3 components will still be reduced to their
          second-level domain if showdom is set to 3.
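
          A rough sketch of this reduction follows.  It covers the cases
          spelled out above; how hostnames shorter than showdom are handled
          in general is an assumption of the sketch (it falls back to the
          second-level domain), and the name organizational_part is
          hypothetical.

```python
def organizational_part(hostname, showdom=2):
    """Reduce a hostname to its organizational part.

    If the hostname has more components than `showdom`, keep the last
    `showdom` of them.  Otherwise fall back to the second-level domain
    (assumption generalizing the documented 3-component case).
    """
    parts = hostname.lower().split(".")
    keep = showdom if len(parts) > showdom else min(len(parts), 2)
    return ".".join(parts[-keep:])

print(organizational_part("www.company.com"))         # company.com
print(organizational_part("www.company.co.uk", 3))    # company.co.uk
print(organizational_part("www.company.com", 3))      # company.com
```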

     logfile(s)
          These are the names of the logfiles to process.  If more than one
          file is given, they are processed in the order in which their names
          appear on the command line.  http-analyze checks for the existence
          of all files before processing them.  If a `-' is specified as the
          filename, standard input is read.  If no file is given, the analyzer
          either processes the default logfile specified in the configuration
          file or the standard input.


     Typical Usage

     This is an example of the typical use of http-analyze on Unix systems:

	  $ http-analyze -v3f -o /usr/web/htdocs/stats /usr/ns-home/logs/access.log



     On Windows systems, open a DOS window, change into the directory where
     you installed http-analyze and run a command similar to the following:

	  C:> http-analyze -v3f	-o c:\web\htdocs\stats c:\programs\msiis\access.log


     Note that on Windows systems, http-analyze searches for the required
     buttons and files in the subdirectory files of the current directory it
     is running in.  Therefore, if you get error messages about missing
     buttons, make sure you changed into the directory the analyzer is
     installed in (by default the installation directory is
     C:\Programs\RENT-A-GURU\http-analyze2.4).


   CONFIGURATION FILE
     You can define server-specific configuration settings for http-analyze in
     an analyzer configuration file.  To have the analyzer use such a
     configuration file, specify its name with the option -c cfgfile or the
     environment variable HA_CONFIG.  Note that command line options always
     take precedence over settings in a configuration file.

     If the option -i newcfg is specified, http-analyze creates a
     configuration template in the file newcfg.  Any other command line
     options present will be transformed into their appropriate definitions in
     the new configuration file.  The settings can then be customized further
     by manually editing the configuration definitions using a standard text
     editor.

     To update an old configuration file into a new format, specify its name
     using the option -c in addition to -i.  This will instruct the analyzer
     to retain any settings from the old file.

     The configuration file contains a single directive per line.  Except for
     IndexFiles, PageView, AddDomain, VirtualNames, Ign*, and Hide*, each
     directive may appear only once in the configuration file.  Following a
     directive field there are one or two value fields, which must be
     separated from the directive and each other by one or more tabs.
     Blanks are considered part of the string in an optional third field only.
     All directive names are case-insensitive.  Comment lines starting with a
     hash character (#) are ignored.
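
     The file format described above is simple enough to read with a few
     lines of code.  The following is a sketch of such a reader under the
     stated rules (tab-separated fields, `#' comments, case-insensitive
     names, a fixed set of repeatable directives); it is not the parser used
     by http-analyze, and the names parse_config, REPEATABLE_EXACT and
     REPEATABLE_PREFIXES are made up for the example.

```python
import re

# Directives that may appear more than once, per the paragraph above.
REPEATABLE_EXACT = {"indexfiles", "pageview", "adddomain", "virtualnames"}
REPEATABLE_PREFIXES = ("ign", "hide")  # Ign* and Hide*

def parse_config(text):
    """Return {directive_name_lowercased: [list_of_value_fields, ...]}."""
    settings = {}
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        name, *values = re.split(r"\t+", line)  # fields are tab-separated
        key = name.lower()                      # names are case-insensitive
        repeatable = key in REPEATABLE_EXACT or key.startswith(REPEATABLE_PREFIXES)
        if key in settings and not repeatable:
            raise ValueError(f"line {lineno}: {name} may appear only once")
        settings.setdefault(key, []).append(values)
    return settings

sample = (
    "# sample configuration\n"
    "ServerName\twww.mycompany.com\n"
    "HideURL\t*.map\t[All image maps]\n"
    "HideURL\t/robots.txt\t[Robot control file]\n"
)
print(parse_config(sample)["hideurl"])
# [['*.map', '[All image maps]'], ['/robots.txt', '[Robot control file]']]
```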


     3DWinSize widthxheight
         Defines the size of the 3D window (default: 520x420 pixels).
         Example:
	       3DWinSize	540x450

     3DWindow keyword
         Defines the 3D window the VRML model is displayed in (same as option
         -W).  The keyword may be either extern (default) or intern for
         display of the VRML model in a new, external window or in the lower
         half of the main frame respectively.  Example:
	       3DWindow		intern

     AddDomain domain string
         Add entries to the domain table causing certain domains to be
         allocated to the mock domain string.  Wildcards in domain are
         ignored.  This directive is useful to collect certain hostnames (for
         example the hosts of world-wide operating online services) under
         some string (item) instead of the country associated with the
         top-level domain.  Example:
	       AddDomain	.compuserve.com	CompuServe
	       AddDomain	.aol.com	AOL

     AuthURL boolean value
         Defines whether accesses which required authentication should be
         skipped.  By default, such URLs appear in the report just like
         ordinary URLs.  If AuthURL is set to Off, No, None, False, or 0, the
         analyzer skips authenticated requests in the logfile, so that they
         are omitted from the statistics report.  Example:
	       AuthURL		No

     BtnSymlink boolean value
         Creates symbolic links to the required buttons and files in HA_LIBDIR
         instead of copying them into the output directory.  If you are going
         to analyze a large number of virtual servers which reside on the same
         host, you can probably save disk space by avoiding copies of all
         buttons and files into each output directory.  Note that this
         directive can be used only on systems which support symbolic links.
         Example:
	       BtnSymlink	Yes

     CustLogoW image srvurl and CustLogoB image srvurl
         Defines images for use as customer logos in the statistics report.
         This feature is available only in the commercial version of the
         analyzer.  image is the name of the image file relative to the output
         directory OutputDir and srvurl is the URL to be followed if the user
         clicks on the image.  To use your own logos create two images - one
         for use on white backgrounds (CustLogoW) and another one for use on
         black backgrounds (CustLogoB).  The images should be approximately
         72x72 pixels in size and must be placed into the buttons subdirectory
         of the central libdir (HA_LIBDIR/btn).  Next time a report is
         generated, the analyzer copies those logos into the output directory
         and includes them in the report.  Example:
	       CustLogoW  btn/mycompany_sw.png	http://www.mycompany.com/
	       CustLogoB  btn/mycompany_sb.png	http://www.mycompany.com/

     DefaultMode mode
         The default operation mode of http-analyze.  The value field contains
         either the keyword daily for short statistics mode or monthly for
         full statistics mode (see also options -d and -m).  If left
         undefined, the default is full statistics mode (monthly).  Example:
	       DefaultMode	daily

     DocRoot docroot
         Restricts logfile analysis to the given Document Root (same as option
         -R).  If docroot is prefixed by a `!', analysis takes place for all
         directories except docroot.  If docroot does not start with a slash
         (`/'), it is interpreted as the name of a virtual server, which is
         matched against the normally unused second field of a logfile entry.
         Useful for virtual servers with a separate Document Root.  Note: Do
         not define this directive to analyze the whole server.  Explicitly
         setting DocRoot to `/' (the default) only increases processing time.
         Example:
	       DocRoot		/customer/
	       DocRoot		www.customer.com

     HTMLCharSet chrset
         Force use of chrset for the browser's encoding when displaying the
         statistics report (same as option -C).  This is needed for languages
         which require special character sets such as Chinese.  See also the
         section about Multi-National Language Support.  Example:
	       HTMLCharSet	iso-8859-1

     HTMLPrefix prefix and HTMLTrailer trailer
         The HTML prefix and trailer to be inserted into the statistics output
         files at the top and bottom of the page.  If defined, the HTMLPrefix
         string must include the <BODY> tag.  To read the HTML code from a
         file, specify its name as the prefix or trailer.  Example:
	       HTMLPrefix	<BODY BGCOLOR="#FF0000">
	       HTMLTrailer	<A HREF="/intern/">Back</A> to the internal page.

     HeadFont fontlist, TextFont fontlist and ListFont fontlist
         The fonts to use for headers, for regular text, and for the detailed
         lists.  If unset, the analyzer uses a list of common sans-serif fonts
         for headers and regular text and a monospaced (fixed) font for the
         detailed lists.  To force the navigator's default for fonts, use the
         keyword default as the fontname.  Example:
	       HeadFont		Helvetica,Arial,Geneva,sans-serif
	       TextFont		Helvetica,Arial,Geneva,sans-serif
	       ListFont		Courier,monospaced

     HeadSize size, TextSize size, SmallSize size and ListSize size
         The font sizes for headings (navigator default, usually 3), regular
         text (default: 2), small text (default: 1) and lists (default: 2).
         TextSize replaces the former FontSize, which is still recognized for
         backward compatibility with older configuration files.  Example:
	       HeadSize		4
	       SmallSize	2

     HideAgent agent string
         Hide a browser type under an arbitrary string (item).  Needed only
         for a certain browser whose vendor still can't spell its name
         correctly.  Only the leading part of the browser type is compared
         against agent, so no wildcards are needed in the second field.
         Example:
	       HideAgent	Mozilla/4.0 (compatible; MSIE 4.   MSIE	4.*
	       HideAgent	Mozilla/3.0 (compatible; MSIE 3.   MSIE	3.*

     HideRefer referrer string
         Hide certain referrer URLs under an arbitrary string (item).  Useful
         to map different referrer URLs for a given host to a common name.
         Since only the leading string of the referrer URL is compared against
         referrer, there is no need to specify wildcards.  As in HideAgent, a
         wildcard suffix is removed from the string, while a wildcard prefix
         is taken literally.

         If the second argument contains a string in square brackets, this
         defines the CGI parameter which specifies the search key for search
         engines.  In this case, the search key will be extracted from the
         argument list and prominently displayed after the name of the search
         engine/web server.  See also the configuration file template produced
         by http-analyze -i for more examples of how to use the HideRefer
         directive.  Example:
	       HideRefer	http://www.altavista.com/	AltaVista [q=]
	       HideRefer	http://lycospro.lycos.com/	Lycos [query=]
	       HideRefer	http://www.excite.com/		Excite [search=]
	       HideRefer	http://www.dino-online.de/	Dino Online [query=]
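
         To make the bracket notation concrete, here is a sketch of how a
         search key could be pulled out of a referrer URL.  The referrer URL
         in the example (path and parameters) and the function name
         search_key are invented for illustration; only the prefix and the
         `AltaVista [q=]' string come from the example above.

```python
from urllib.parse import urlsplit, parse_qs

def search_key(referrer, prefix, string):
    """Return (label, key) for a referrer matching `prefix`, else None.

    `string` is the second field of a HideRefer line, e.g. "AltaVista [q=]".
    The bracketed part names the CGI parameter that holds the search key.
    """
    if not referrer.startswith(prefix):  # only the leading part is compared
        return None
    label, _, key = string.partition(" [")
    if not key:
        return label, None               # no search parameter defined
    values = parse_qs(urlsplit(referrer).query).get(key.rstrip("=]"))
    return label, (values[0] if values else None)

# Hypothetical referrer URL, for illustration only
url = "http://www.altavista.com/cgi-bin/query?q=web+statistics"
print(search_key(url, "http://www.altavista.com/", "AltaVista [q=]"))
# ('AltaVista', 'web statistics')
```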

     HideSys hostname string
         Hide a hostname under an arbitrary string (item).  The string may
         contain blanks.  If the first character of string is a `[', this item
         is suppressed in the Top N lists.  Hidden items are accounted for
         separately, but in the summary they are collected under the
         description defined with this directive.  You may use the wildcard
         character `*' as either a prefix or as a suffix of the hostname (as
         in *.host.com and 192.168.12.*), but not as both.  Hostnames are
         case-insensitive.

         When building the list of countries, http-analyze determines the
         country from the top-level domain given in hostname.  If hostname is
         an IP number, you can optionally define the top-level domain to be
         accounted for by appending it in square brackets to the string as
         shown in the last example below.  Example:
	       HideSys		*.mycompany.com	MY COMPANY
	       HideSys		192.168.12.*	MY COMPANY [US]
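
         The wildcard rule shared by the Hide* and Ign* directives (a `*' as
         prefix or as suffix, never both) can be sketched as below.  The
         function name matches and its ignore_case switch are assumptions;
         the switch models the case-insensitive comparison of hostnames.

```python
def matches(pattern, value, ignore_case=False):
    """Match `value` against a pattern with an optional leading or trailing '*'."""
    if ignore_case:
        pattern, value = pattern.lower(), value.lower()
    if len(pattern) > 1 and pattern.startswith("*") and pattern.endswith("*"):
        raise ValueError("'*' may be used as a prefix or a suffix, not both")
    if pattern.startswith("*"):
        return value.endswith(pattern[1:])    # e.g. *.host.com
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])  # e.g. 192.168.12.*
    return value == pattern

print(matches("*.mycompany.com", "WWW.MyCompany.COM", ignore_case=True))  # True
print(matches("192.168.12.*", "192.168.12.7"))                            # True
print(matches("/subdir/*", "/SubDir/page.html"))                          # False
```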

     HideURL url string
         Hide a URL under an arbitrary string (item).  The string may contain
         blanks.  If the first character of string is a `[', this item is
         suppressed in the Top N lists.  Hidden items are accounted for
         separately, but in the summary they are collected under the
         description defined with this directive.  You may use the wildcard
         character `*' as either a prefix or as a suffix of the URL (as in
         *.map and /subdir/*), but not as both.  URLs are case-sensitive as
         required by the HTTP standard.  If the option -M is specified, URLs
         will become case-insensitive for compatibility with non-compliant web
         servers.  Note that images are hidden automatically under All images
         by default unless -x is specified.  Example:
	       HideURL		*.map		[All image maps]
	       HideURL		/robots.txt	[Robot control file]
	       HideURL		/newsletter/*	MyCompany Monthly Newsletter
	       HideURL		/products/*	MyCompany Products
	       HideURL		/~delta-t/	DELTA-t	Homepage
	       HideURL		/~delta-t/*	DELTA-t	more pages

     IgnURL url and IgnSys hostname
         Ignore entries with a specific URL or accesses from a certain system.
         You may use the wildcard character `*' as either a prefix or as a
         suffix of the URL or the hostname (as in *.png, /subdir/file* and
         *.host.com), but not as both.  Note that all logfile entries are
         compared against this list while http-analyze reads the logfile,
         as opposed to the HideURL and HideSys directives, which are applied
         only after all entries have been reduced to the set of unique URLs
         and hostnames, respectively.  Therefore, many IgnURL/IgnSys
         definitions will significantly increase processing time of
         http-analyze.  Example:
	       IgnURL		*.gif,*.png,*.jpg,*.jpeg
	       IgnURL		/stats/

     IndexFiles idxfile[,idxfile...]
         Defines additional directory index filenames (same as option -H).
         The name index.html is pre-defined by default.  http-analyze
         truncates URLs containing an index filename so that they merge with
         `/' (their "base URL").  For example, /dir/index.html is truncated
         to /dir/.  You can add up to 9 more names for directory index files.
         Note that each name requires another table lookup, which may
         significantly increase processing time.  Example:
	       IndexFiles	Welcome.html,home.html,index.htm
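
         The truncation of index filenames can be sketched as follows.  This
         is only an illustration of the documented effect; the function name
         truncate_index is hypothetical.

```python
def truncate_index(url, index_files=("index.html",)):
    """Merge a URL ending in an index filename with its base URL."""
    head, _, name = url.rpartition("/")
    if name in index_files:
        return head + "/"
    return url

names = ("index.html", "Welcome.html", "home.html", "index.htm")
print(truncate_index("/dir/index.html", names))  # /dir/
print(truncate_index("/dir/page.html", names))   # /dir/page.html
```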

     Language lang
         Use the language lang for warning messages and for the statistics
         report (same as option -L).  See the section Multi-National Language
         Support for more information about localization of http-analyze.
         Example:
	       Language		de

     LogFile filename
         The name of the server's logfile.  If you define a default name for
         the logfile, this file is processed if no other filenames are
         explicitly specified on the command line.  If no logfile is
         specified, http-analyze always reads stdin.  Example:
	       LogFile		/usr/ns-home/logs/access

     LogFormat logfmt
         Use this logfile format.  Valid values for logfmt are auto for
         auto-sensing the logfile format, clf for the NCSA Common Logfile
         Format, or dlf and elf for the two supported variants of the W3C
         Extended Logfile Format.  See the section Logfile Formats for a
         detailed description of those formats.  Example:
	       LogFormat	clf

     MSIISmode boolean value
         Use case-insensitive string comparison for URLs.  Needed for MS IIS,
         which does not distinguish between upper- and lower-case characters.
         MS users may regard this as an enhancement, while for the rest of the
         world this is just a violation of the RFC2616 HTTP standard and
         should be ignored.  Example:
	       MSIISmode	Yes

     NavWinSize widthxheight
         Defines the size of the navigation window which pops up in the
         conventional interface if JavaScript is enabled.  Useful if the
         browser displays scrollbars when using the default size of 420x190
	 pixels.  Example:
	       NavWinSize	440x200

     NavigFrame size
         Defines the size of the navigation frame in pixels.  Useful if the
         browser displays scrollbars when using the default size of 120
         pixels.  Example:
	       NavigFrame	140

     NoiseLevel hits
         Sets the noise-level to hits.  If a noise-level is defined, all URLs,
         sites, agents and referrer URLs with hits below this level are
         collected under the item Noise in the Top N lists and overviews to
         avoid cluttering up those lists.  Example:
	       NoiseLevel	7
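
         The effect of a noise-level on a list can be pictured with a small
         sketch.  The data and the function name apply_noise_level are
         invented for illustration; only the rule (items with hits below the
         level are collected under Noise) is taken from the text above.

```python
def apply_noise_level(hits_by_item, level):
    """Collect all items with fewer than `level` hits under "Noise"."""
    result, noise = {}, 0
    for item, hits in hits_by_item.items():
        if hits < level:
            noise += hits
        else:
            result[item] = hits
    if noise:
        result["Noise"] = noise
    return result

hits = {"/": 120, "/products/": 40, "/old.html": 3, "/tmp.html": 2}
print(apply_noise_level(hits, 7))
# {'/': 120, '/products/': 40, 'Noise': 5}
```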

     OutputDir directory
         The name of the directory where the output files of the statistics
         report should be created (same as option -o).  By default, the output
         directory is the current directory.  Example:
	       OutputDir	/usr/web/htdocs/stats

     PageView pattern[,pattern...]
         Defines additional pageview patterns (same as option -G).  All URLs
         matching one of the patterns are classified as pageviews (text
         files).  If pattern starts with a slash (`/'), it is treated as a
         prefix each URL is compared with; otherwise it is treated as a
         suffix.  The suffix .html is pre-defined by default.  You can add up
         to 9 more patterns here, for example .shtml, .text and /cgi-bin/.
         Note that each pattern requires another table lookup, which may
         significantly increase processing time.  Example:
	       PageView		.shtml,.text,/cgi-bin/

     PrivateDir prvdir
         Defines the name of a "private" directory for the detailed lists of
         files, sites, browsers and referrer URLs (same as option -p).
         Because prvdir must reside directly under the output directory, its
         name may not contain any slashes (`/').  A private directory for
         detailed lists may be useful to restrict access to those lists if the
         rest of the statistics report is publicly available.  Note that for
         restricting access to the complete statistics report, you do not need
         to place the detailed lists in a private directory.  Example:
	       PrivateDir	lists

     RegInfo customer_name registration_ID
         Defines the customer's name and the registration ID, which are both
         shown on the main page in the summary report.  Example:
	       RegInfo		MyCompany	3745JMJZ00000311300000682344

     ReportTitle title
         The document title to use in the statistics report.  Example:
	       ReportTitle	Access Statistics for MyCompany



     ServerName srvname
         The official name of the server (same as option -S).  If no server
         name is defined, http-analyze uses the hostname of the system it is
         running on.  The server name must be a fully qualified domain name,
         not a URL.  Example:
	       ServerName	www.mycompany.com

     ServerURL srvurl
         The URL of the server to be used for hotlinks in URL lists (same as
         option -U).  Useful if the report for your web server is published on
         another server.  Also necessary for virtual servers to have
         http-analyze generate correct hypertext links in the report.  Example:
	       ServerURL	http://www.mycompany.com

     Session time
         The time-window for counting sessions.  All unique hosts accessing
         your server more than once inside this time-window are accounted for
         as the same session.  If the distance between two adjacent accesses
         from the same host is greater than the time-window, the accesses from
         this host are accounted for as different sessions.  Example:
	       Session		4 hours
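
         The session rule can be sketched as follows: a host starts a new
         session whenever the gap to its previous access exceeds the
         time-window.  The function name count_sessions and the sample data
         are assumptions; this is a sketch of the rule, not the analyzer's
         implementation.

```python
from datetime import datetime, timedelta

def count_sessions(accesses, window):
    """Count sessions in an iterable of (host, timestamp) pairs."""
    last_seen = {}
    sessions = 0
    for host, when in sorted(accesses, key=lambda access: access[1]):
        previous = last_seen.get(host)
        if previous is None or when - previous > window:
            sessions += 1          # first access, or gap larger than window
        last_seen[host] = when
    return sessions

t0 = datetime(1999, 1, 11, 8, 0)
log = [
    ("a.host.com", t0),
    ("a.host.com", t0 + timedelta(hours=1)),  # same session
    ("b.host.com", t0 + timedelta(hours=2)),  # new host, new session
    ("a.host.com", t0 + timedelta(hours=6)),  # 5 hours later: new session
]
print(count_sessions(log, timedelta(hours=4)))  # 3
```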

     ShowDomain number
         Defines the number of components in a domain name which make up the
         organizational part (same as option -Z).  This is usually the
         second-level domain, so that the last two components of the domain
         name (for example, company.com) are used as the organizational part.
         However, some countries prefer to use third-level domains, so that
         the hostnames use 4 or more components, where the last 3 are used for
         the organizational part (as in company.co.uk).  To recognize such
         third-level domains, ShowDomain can be set to the value 3.  Hostnames
         with exactly 3 components will still be reduced to their second-level
         domain if ShowDomain is set to 3.  Example:
	       ShowDomain	3

     StripCGI boolean value
         Defines whether arguments to CGI scripts are stripped (see also
         option -q).  By default, http-analyze strips arguments from CGI URLs
         to be able to lump them together.  If your server creates dynamic
         HTML files through a CGI script, they are reduced to the URL of the
         script.  If StripCGI is set to Off, No, None, False or 0, those
         argument lists are left intact and CGI URLs with different arguments
         are treated as different URLs.  Note that this only works for
         requests to scripts which receive their arguments using the GET
         method, but not the POST method.  See the section Interpretation of
         the results for an explanation of the request methods.  Example:
	       StripCGI		No

     Suppress subopt,...
         Suppress certain lists in the report (same as -s).  subopt may be one
         of:
              AVLoad      to suppress the average load report (top seconds/minutes/hours),
	      URLs	  to suppress the overview and list of URLs/items,
	      URLList	  to suppress the list of URLs/items only,
              Code404     to suppress the list of Code 404 (Not Found) responses,
              Sites       to suppress the overview and list of client domains,
              RSites      to suppress the overview of reverse client domains,
              SiteList    to suppress the list of all client domains/hostnames,
              Agents      to suppress the overview and list of browser types,
              Referrer    to suppress the overview and list of referrer URLs,
	      Country	  to suppress the list of countries,
	      Pageviews	  to suppress pageview rating (cached files are	shown instead),
	      AuthReq	  to suppress requests which required authentication,
	      Graphics	  to suppress images such as graphs and	pie charts,
	      Hotlinks	  to suppress hotlinks in the list of all URLs,
	      Interpol	  to suppress interpolation of values in graphs.
	 Example:
	       Suppress		Country,Interpol

     TLDFile filename
         Use filename for the list of top-level domains (same as option -T).
         This list includes all ISO two-letter country domains, the well-known
         domains .net, .int, .org, .com, .edu, .gov, .mil, .arpa, .nato, and
         the new CORE top-level domains .firm, .info, .shop, .arts, .web,
         .rec, and .nom.  The length of a domain in the TLD file may not
         exceed 6 characters.  Since http-analyze uses its built-in defaults
         if no TLD file is specified, you will rarely need this directive.
         Example:
	       TLDFile		/usr/local/lib/http-analyze/TLD

     TblFormat tblname specifier
         Defines the layout of tables in the statistics report.  The argument
         tblname may be one of:
              Month       for the statistics of the last 12 months (main page)
              Day         for the daily statistics in the short and full summaries
              Load        for the average load by weekday, hour, minute, second
              Country     for the list of countries
              TopTen      for all Top N lists
              Overview    for all overviews
              Lists       for all detailed lists (preformatted text)
              NotFound    for the list of NotFound responses
         The specifier string defines the items to be shown in the table:
              n, N        an index number or label (don't touch!)
              h, H        the number of hits
              f, F        the number of files sent
              c, C        the number of cached files
              p, P        the number of pageviews
              s, S        the number of sessions
              k, K        the amount of data sent in Kbytes (integer value)
              B           the amount of data sent in bytes (float value)
              L           a dynamically created label (don't touch!)
         If a format specifier is used in upper-case, the value displayed in
         the report will include the percentage for this number.  Example:
               TblFormat        Month   n h f c p s k
               TblFormat        Day     n H F C P S k
	       TblFormat	Country	N H F P	S k L
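
         A specifier string can be read item by item.  The following shell
         sketch spells out the meaning of each item using the list above; the
         function name describe_tblformat is hypothetical.  Because the text
         does not say how the percentage rule applies to N, B and L, the
         sketch marks only H, F, C, P, S and K as carrying a percentage.

```shell
#!/bin/sh
# Hypothetical helper: explain each item of a TblFormat specifier string.
describe_tblformat() {
    for item in "$@"; do
        case $item in
            n|N) label='index number or label' ;;
            h|H) label='hits' ;;
            f|F) label='files sent' ;;
            c|C) label='cached files' ;;
            p|P) label='pageviews' ;;
            s|S) label='sessions' ;;
            k|K) label='data sent in Kbytes' ;;
            B)   label='data sent in bytes' ;;
            L)   label='dynamically created label' ;;
            *)   label='unknown specifier' ;;
        esac
        # Upper-case counters additionally show the percentage.
        case $item in
            H|F|C|P|S|K) pct=' (with percentage)' ;;
            *)           pct='' ;;
        esac
        printf '%s: %s%s\n' "$item" "$label" "$pct"
    done
}

# The specifier from the "Day" example above.
describe_tblformat n H F C P S k
```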

     Top{Days,Hours,Minutes,Seconds,URLs,Sites,Agents,Refers}, LeastURLs
         Defines the size of certain Top N tables and lists.  If set to zero,
         the corresponding list will be suppressed.  Example:
	       TopURLs		20
	       LeastURLs	0
	       TopDays		14

     VirtualNames vname,...
         The list of additional (>>virtual<<) names for this server to be
         classified as self-referrer URLs.  The server's primary name (from
         ServerName or ServerURL) is pre-defined already.  If vname doesn't
         include a protocol specifier, two URLs with the http and the https
         protocol specifier will be added for each name.  Since self-referrers
         are suppressed from the list of referrer URLs, the remaining entries
         give a good impression of external pages referring to some
         document on your site.  Example:
	       VirtualNames	www2.mycompany.com,mycompany.com
	       VirtualNames	www.customer.com,customer.com
	       VirtualNames	http://www.other.com,https://secure.other.com
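
         The expansion rule for names without a protocol specifier can be
         sketched in shell as follows.  This only illustrates the rule as
         described above and is not the analyzer's own code; the function
         name expand_vname is hypothetical, and a protocol specifier is
         assumed to be recognizable by the string `://'.

```shell
#!/bin/sh
# Hypothetical helper: print the self-referrer URL(s) for one vname.
expand_vname() {
    case $1 in
        *://*) printf '%s\n' "$1" ;;                       # protocol given: keep as is
        *)     printf 'http://%s\nhttps://%s\n' "$1" "$1" ;; # add both protocols
    esac
}

expand_vname mycompany.com
expand_vname https://secure.other.com
```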

     VRMLProlog file
         The name of a prolog file for a yearly VRML model (same as option
         -P).  Pathnames not beginning with a `/' are relative to OutputDir.
         If a prolog file is given, an additional yearly model with all
         12 monthly models embedded as inlines is created.  See the section
         Output files for further information about this yearly model.
         Example:
	       VRMLProlog	3Dprolog.wrl

MULTI-NATIONAL LANGUAGE SUPPORT
     http-analyze supports Multi-National-Language-Support (MNLS) according to
     the X/Open Portability Guide (XPG4) and the System V Interface Definition
     (SVR4).  For systems without MNLS support, a simple native implementation
     is used.  See the file INSTALL included in the distribution for
     information about installation of the appropriate MNLS support for your
     system.  The option -V displays the type of MNLS support compiled into a
     binary.

     All text strings and messages of http-analyze are contained in a separate
     message catalog, which is read at start-up of the program.  If a message
     catalog is installed in the system, you can select the language to be
     used for warning messages and for the statistics report by setting the
     appropriate locale.  This can be done by defining the LANG (XPG4/SVR4
     MNLS) or the HA_LANG (native MNLS) environment variable or by using the
     option -L.  When using -L, the analyzer switches to the specified
     language as soon as it has recognized the option.  If no message catalog
     exists for the specified locale, http-analyze uses built-in messages in
     English.

     Certain languages require a specific character set to be used by the
     browser when displaying the statistics report.  This can be defined using
     the option -c or the CharSet directive.  The following table summarizes
     the most common combinations of languages and character sets.  Note that
     the name of the locale is system-specific (for example, de could be
     de-iso8859 on some systems).

          Country             Locale    Encoding
	  Standard C	      C		us-ascii
	  Arabic Countries    ar	iso-8859-6
	  Belarus	      be	iso-8859-5
	  Bulgaria	      bg	iso-8859-5
	  Czech	Republic      cs	iso-8859-2
	  Denmark	      da	iso-8859-1
	  Germany	      de	iso-8859-1
	  Greece	      el	iso-8859-7
	  Spain		      es	iso-8859-1
	  Mexico	      es_MX	iso-8859-1
	  Finland	      fi	iso-8859-1
	  France	      fr	iso-8859-1
	  Switzerland	      fr_CH	iso-8859-1
	  Croatia	      hr	iso-8859-2
	  Hungary	      hu	iso-8859-2
	  Iceland	      is	iso-8859-1
	  Italy		      it	iso-8859-1
	  Israel	      iw	iso-8859-8
	  Japan		      ja	Shift_JIS or iso-2022-jp
	  Korea		      ko	EUC-kr or iso-2022-kr
	  Netherlands	      nl	iso-8859-1
	  Belgium	      nl_BE	iso-8859-1
	  Norway	      no	iso-8859-1
	  Poland	      pl	iso-8859-2
	  Portugal	      pt	iso-8859-1
	  Russia	      ru	KOI8-R or iso-8859-5
	  Sweden	      sv	iso-8859-1
	  Chinese	      zh	big5
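
     As a rough illustration of how the table might be used, the following
     shell sketch looks up the encoding for the current locale and prints a
     matching CharSet line.  The function name charset_for_locale is
     hypothetical, only rows with a single encoding are covered, and the
     printed line merely follows the `name value' pattern of the other
     directive examples, which is an assumption.

```shell
#!/bin/sh
# Hypothetical lookup covering the rows of the table that list exactly
# one encoding; locale names are system-specific and may differ.
charset_for_locale() {
    case $1 in
        C)                                echo us-ascii ;;
        da|de|es|fi|fr|is|it|nl|no|pt|sv) echo iso-8859-1 ;;
        cs|hr|hu|pl)                      echo iso-8859-2 ;;
        be|bg)                            echo iso-8859-5 ;;
        ar)                               echo iso-8859-6 ;;
        el)                               echo iso-8859-7 ;;
        *)                                return 1 ;;
    esac
}

locale=${LANG:-C}
if charset=$(charset_for_locale "$locale"); then
    echo "CharSet $charset"
else
    echo "no single encoding listed for locale '$locale'" >&2
fi
```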


     Since the message catalogs are independent from the base software, more
     languages may become available without having to re-compile or re-install
     the software.  Please visit the homepage of http-analyze for up-to-date
     information about the available languages.  For more information about
     localization, see environ(5) and setlocale(3) in the online manual.

EXAMPLES
     After successful compilation of http-analyze you can test-run the
     analyzer before installing it permanently.  Just create a subdirectory
     for the output files and run http-analyze either on one of the sample
     logfiles included in the distribution (as shown below) or on your web
     server's logfile.  For example, to create full statistics including a
     frames-based interface and a 3D VRML model in the subdirectory testd, use
     the following commands:

	 $ cd http-analyze2.4
	 $ mkdir testd
	 $ http-analyze	-vm3f -o testd files/logfmt.elf
	 http-analyze 2.4 (IP22; IRIX 6.2; XPG4	MNLS; PNG)
	 Copyright 1999	by RENT-A-GURU(TM)
	 Generating full statistics in output directory	`testd'
	 Reading data from `files/logfmt.elf'
	 Best blocksize	for I/O	is set to 64 KB
	 Hmm, looks like Extended Logfile Format (ELF)
	 Start new period at 01/Jan/1999
	 Creating VRML model for January 1999
	 Creating full statistics for January 1999
	 ... processing	URLs
	 ... processing	hostnames
	 ... processing	user agents
	 ... processing	referrer URLs
	 Total entries read: 8,	processed: 8
	 Clear almost all counters at 03/Jan/1999
	 Start new period at 01/Feb/1999
	 No more hits since 02/Feb/1999
	 Creating VRML model for February 1999
	 Creating full statistics for February 1999
	 ... processing	URLs
	 ... processing	hostnames
	 ... processing	user agents
	 ... processing	referrer URLs
	 ... updating `www1999/index.html': last report	is for February	1999
	 Total entries read: 3,	processed: 3
	 Statistics complete until 28/Feb/1999
	 $

     To view the statistics report, start your browser and open the file
     testd/index.html.

     For permanent installation of http-analyze, run make install to copy
     the required files into the appropriate directory.  The executable is
     usually installed in /usr/local/bin, while the required buttons and files
     are placed under /usr/local/lib/http-analyze unless this has been changed
     by defining the HA_LIBDIR make macro during installation.

     Note that you no longer need to install files in a new statistics output
     directory if they have been installed in HA_LIBDIR; this is now
     done automatically by http-analyze when it runs for the first time on
     this output directory.

     Following are some more examples, which assume that the analyzer has been
     installed permanently.  The first command processes an archived logfile
     logYYYY/access.MM from the server's log directory to create a report for
     January 1999 in the directory /usr/web/htdocs/stats:

	 $ cd /usr/ns-home/logs
	 $ http-analyze	-vm3f -o /usr/web/htdocs/stats log1999/access.01


     The next command uncompresses the logfiles for a whole year and feeds the
     data via a pipe into the analyzer, which then creates a statistics report
     for this period.  All options are passed to the analyzer through a
     customized configuration file specified with -c:

	 $ gzcat log1998/access.[01]?.gz | http-analyze	-c /usr/httpd/analyze.conf -
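
     The pipeline relies on the shell expanding the pattern access.[01]?.gz in
     sorted order, so that the months reach the analyzer in ascending order.
     The following sketch checks this with empty stand-in files in a temporary
     directory; it does not call http-analyze, and the dummy files are
     assumptions made for the demonstration.  Note that the pattern would
     also match names such as access.00.gz or access.13.gz if they existed.

```shell
#!/bin/sh
# Create empty stand-ins for the twelve archived logfiles, named after
# the logYYYY/access.MM scheme (the files themselves are dummies).
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/log1998"
for m in 01 02 03 04 05 06 07 08 09 10 11 12; do
    : > "$tmp/log1998/access.$m.gz"
done

# Expand the same pattern as in the gzcat command; the shell sorts the
# matching names, which puts the months in ascending order.
files=$(cd "$tmp" && echo log1998/access.[01]?.gz)
echo "$files"

rm -rf "$tmp"
```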


     The following command creates a configuration file template with the name
     sample.conf.  Any additional options will be transformed into the
     appropriate directives in the new configuration file.  In this example,
     the server's name specified with -S is transformed into a ServerName
     directive and the output directory specified with -o is transformed into
     an OutputDir directive.  All other directives are set to their respective
     default value.  To further customize any settings, use a standard text
     editor.

	 $ http-analyze	-i sample.conf -S www.myserver.com -o /usr/web/htdocs/stats



     To update an old configuration file into the new format while retaining
     any old settings, specify its name when creating the new file.  Again,
     command line options may be used to alter certain settings; they take
     precedence over definitions in the old configuration file.  The
     following command reads the file oldfile.conf and transforms its content
     into a new file named newfile.conf:

	 $ http-analyze	-c oldfile.conf	-i newfile.conf


   REGULAR INVOCATION VIA CRON
     Although http-analyze can be run manually to process logfiles, it usually
     is executed automatically on a regular basis.  On Unix systems you use the
     cron(1) utility, while Windows systems provide similar functionality
     with the AT command.  To have your statistics report updated
     automatically, use the following scheme:

         1)  Install a cron job which calls http-analyze -m3f to create a full
             statistics report once per hour or twice per day depending on the
             processing load caused by analyzing the logfile.  Note that the
             full statistics report is created for the first time on the
             second day of a new month.

         2)  Optionally install a cron job which calls http-analyze -d more
             often to create a short statistics report.  Although this will
             only update the Hits by day section of the report, the advantage
             of the short statistics mode is that http-analyze needs only a
             fraction of the time required to create a full statistics report.
             However, this is only needed if creating a full statistics
             report takes more than 15 minutes.

         3)  Install a shell script which rotates (saves) the server's
             logfile, restarts the web server, and then creates the final
             summary for this period.  Have cron execute this script at 00:00
             on the first day of a new month.  See the script rotate-httpd for
             an example of how to do this for several virtual web servers at
             once.

         4)  Because of delays in execution of the script which rotates the
             logfile, heavily used servers sometimes write a few entries for
             the new month into the old logfile.  http-analyze usually detects
             and ignores such >>noise<< appearing at the end of a logfile.
             However, to initialize the files for the new month, you should
             run http-analyze -m3f on the logfile for the current month
             immediately after the statistics for the previous month have been
             generated.

     Note that all cron jobs must run with the user ID of the owner of the
     output directory except for rotate-httpd, which must run with the user ID
     of the server user.  This is a sample crontab(1) for the scheme described
     above:

	 # Generate a full statistics report twice per day at 01:17 and	13:17
	 17  1,13 * * *	 /usr/local/bin/http-analyze -m3f -c /usr/httpd/analyze.conf
	 # Generate a short statistics report each hour	except at 01:17	or 13:17
	 17  2-12 * * *	 /usr/local/bin/http-analyze -d	-c /usr/httpd/analyze.conf
	 17 14-23 * * *	 /usr/local/bin/http-analyze -d	-c /usr/httpd/analyze.conf
	 # Rotate the logfiles at the first day	of a new month at 00:00
	 0 0 1 * *	 /usr/local/bin/rotate-httpd
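
     One detail of the monthly rotation job is working out the archive name
     for the month that has just ended.  The following shell sketch shows one
     way to do this under the logYYYY/access.MM naming scheme used in the
     examples; it is not the rotate-httpd script from the distribution, and
     the function name prev_month_archive is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given the current year and month (as printed by
# `date +%Y` and `date +%m`), print the archive name for the previous
# month in the form logYYYY/access.MM.
prev_month_archive() {
    year=$1
    month=${2#0}                  # "01" -> "1", avoids octal parsing
    month=$(( month - 1 ))
    if [ "$month" -eq 0 ]; then   # January: go back to December
        month=12
        year=$(( year - 1 ))
    fi
    printf 'log%04d/access.%02d\n' "$year" "$month"
}

archive=$(prev_month_archive "$(date +%Y)" "$(date +%m)")
echo "would move the server's logfile to $archive"
# A real rotation script would now move the logfile, restart the web
# server and have http-analyze create the final summary for the period.
```

     Run at 00:00 on the first day of a month, the helper names the archive
     for the month that has just ended, including the change of year in
     January.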


PERFORMANCE CONSIDERATIONS
     The processing time needed to create full statistics reports depends on
     many factors:

           o The size of the I/O buffer (reported by http-analyze when -v is
             given) should be as big as possible.  For example, a buffer size
             of 64KB can significantly reduce disk activity when reading the
             logfile.
           o If many Ign* directives are defined, the analyzer must compare
             each logfile entry against each entry in the corresponding Ign*
             list.  The recommended way to suppress certain parts of the web
             server in the statistics report is to have the server not record
             any accesses to those areas in the logfile.  Similarly, many Hide*
             directives may also require additional table lookups, although
             this will happen only once for each unique (different) URL,
             sitename, browser type or referrer URL.
           o If StripCGI is set to No, this will require more memory.
           o Some systems impose a memory limit on a per-process basis (see
             ulimit(1) and setrlimit(3)).  There are no unusual requirements
             regarding main memory needed by http-analyze - to be precise that
             means >>the bigger, the better<< -, but you should make sure that
             about 5-10MB is available for processing of a medium-size
             logfile.

TROUBLESHOOTING
     If you discover any problems using the analyzer you may find the verbose
     mode helpful.  Each -v option increases the verbosity level.  In verbosity
     level 1, http-analyze comments on ongoing processing; in level 2 it
     indicates progress by printing a dot for each new day discovered in the
     logfile.  In level 3, a debug message for each logfile entry parsed
     successfully is printed and in level 4 an even more detailed message
     appears on standard error.  Furthermore, compiling http-analyze without
     the macro NDEBUG includes various assertion checks in the executable.

       $ http-analyze -vvvm3f -o testd files/logfmt.elf
       http-analyze 2.4	(IP22; IRIX 6.2; XPG4 MNLS; PNG)
       Copyright 1999 by RENT-A-GURU(TM)
       Generating full statistics in output directory `testd'
       Reading data from `files/logfmt.elf'
       Best blocksize for I/O is set to	64 KB
       Hmm, looks like Extended	Logfile	Format (ELF)
	     1 01/Jan/1999:16:37:25 [298971279], req="GET /", sz=280 <-	OK (Code 200), PAGEVIEW
       Start new period	at 01/Jan/1999
	     2 01/Jan/1999:16:38:39 [298971355], req="GET /def/", sz=910 <- OK (Code 200), PAGEVIEW
	     3 02/Jan/1999:16:39:39 [299060697], req="GET /abc/", sz=910 <- OK (Code 200), PAGEVIEW
       ...


     Filing bug reports

     If you want to file a bug report, use the option -X to have http-analyze
     generate a URL of a bug reporting form with some information already
     filled in.  You can pass this URL to your favourite browser using
     cut&paste or - on Unix systems - using command substitution as in:

       $ netscape `http-analyze	-X`


     This addresses a bug report form on http://support.netstore.de/ with the
     following information filled in already:

           o the customer's name as specified in the registration
           o the registration ID with licensing information
             (Personal/Commercial License)
           o the version number of http-analyze
           o the platform the program was compiled for.

     Using this interface to submit bug reports will ensure proper handling
     and timely response.  Please note that although we gladly accept bug
     reports from everyone, only Commercial Service Licensees are entitled to
     request technical assistance or open a support call.

REGISTRATION
     http-analyze is available through our web site for evaluation purposes.
     In	the evaluation version an >>unregistered version<< button will show up
     in	the statistics report.	To replace this	button with the	Netstore(R)
     logo of the free version for personal and educational use,	just click on
     the >>unregistered	version<< button to follow the link to our online
     registration form on our web site and register for	a free,	non-commercial
     version.

   NON-COMMERCIAL VERSION
     After registration you will receive a registration ID and two
     registration images as replacements for the >>unregistered version<<
     button by email.  In the free version, the Netstore(R) logo, a copyright
     note and a link to the homepage of http-analyze appear in the statistics
     report; they must be left intact according to the license under which
     this software is made available to you.

   COMMERCIAL VERSION
     If you use http-analyze for commercial purposes such as providing
     statistics services for your customers, you must buy a Commercial Service
     License available from RENT-A-GURU(R) and its authorized resellers.  You
     will receive a registration ID and two registration images as
     replacements for the >>unregistered version<< button by email from our
     office.

     In the commercial version, the Netstore(R) logo, the copyright note and
     the link to the homepage of http-analyze are suppressed from the
     statistics report - except for the logo and copyright note, which appear
     only once on the main page and inside the navigation frame.  On all other
     pages, your company's name is shown.  Additionally, you can add your
     company's logo to the report using the CustLogoW and CustLogoB directives
     in the configuration file, which are enabled in the commercial version
     only.  Except for this feature and the individual support for Commercial
     Service Licensees, both versions of the software have identical
     functionality.

   BRANDING THE SOFTWARE
     For all license types, you have to brand your copy of http-analyze with
     the registration ID and the registration images.  The registration ID may
     be set either in a system-wide file (usually
     /usr/local/lib/http-analyze/REGID) or via the RegInfo directives in an
     analyzer configuration file.  The latter method requires specification of
     the configuration file each time http-analyze is invoked.  If you create
     a system-wide registration file, the registration information applies to
     all virtual servers being analyzed.

     To brand the software, detach the registration images we sent to you from
     the email.  After detaching them, there should be two files
     free-netstore_s[bw].png for the free version and comm-netstore_s[bw].png
     for the commercial version.  Next, define the HA_LIBDIR environment
     variable if you chose another directory for the central libdir rather
     than the default (/usr/local/lib/http-analyze).  For example, if you
     can't become root, you would choose a directory for which you have write
     permissions, install the analyzer files there and then use the HA_LIBDIR
     variable to pass its name to http-analyze.  Finally, brand the software
     by executing the following command as root:

         # http-analyze -r "Customer Name" regID type
	 Registration information saved	in file	`/usr/local/lib/http-analyze/REGID'
	 #

     where Customer Name is the name of the organization this license is
     registered for, regID is the registration ID of the license and type is
     either the keyword free or comm according to the type of the license.
     Now run the analyzer to have the new buttons appear in the statistics
     report.

     Note that running the analyzer the first time will install or update any
     older buttons and files in the statistics output directory automatically;
     there is no need to run a helper application as was the case in
     previous versions of http-analyze.

YEAR 2000 COMPLIANCE
     All versions 2.X and above of http-analyze are fully Year 2000 compliant.
     There will be no problems with date-related functions after the year 1999
     as long as the operating system itself is also Year 2000 compliant.  Year
     2000 compliant means that the software does not produce errors in
     date-related data or calculations or experience loss of functionality as
     a result of the transition to the year 2000.  This Year 2000 compliance
     statement is not a product warranty.  http-analyze is provided under the
     terms of the license agreement included in each distribution.

     Please see http://www.netstore.de/Supply/http-analyze/year2000.html for
     more information about the Year 2000 compliance real-time tests we
     ran with http-analyze.


   DATE USAGE IN HTTP-ANALYZE
     The analyzer depends on the timestamp found in the logfile entries
     produced by a web server.  For the NCSA Common Logfile Format and the
     W3C Extended Logfile Format a Year 2000 compliant date format was chosen
     from the beginning.  This unique date format is - and always was -
     required by http-analyze to be able to generate a statistics report, so
     there are no problems except those caused by your operating system (see
     below).

     To retain compatibility with previous versions of the log analyzer,
     http-analyze generates two-digit years in some output filenames.
     However, those files are placed in a subdirectory containing the year in
     four digits, which makes all output filenames fully Year 2000 compliant.

     The date format in the -I and -E options allows specification of a year
     using only two digits.  http-analyze interprets values greater than or
     equal to 69 as years in the 1900s and values lower than 69 as years in
     the 2000s.  This way, the analyzer covers the whole range of the time
     representation in modern operating systems.  However, any year can always
     be specified unambiguously by using four digits.
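
     The two-digit rule can be sketched in shell as follows.  This is an
     illustration of the rule as stated above, not the analyzer's own code;
     the function name expand_year is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: expand a two-digit year using the pivot value 69;
# four-digit years are passed through unchanged.
expand_year() {
    case $1 in
        [0-9][0-9][0-9][0-9]) echo "$1"; return 0 ;;
        [0-9][0-9])           ;;
        *) echo "invalid year: $1" >&2; return 1 ;;
    esac
    yy=$(( 1$1 - 100 ))       # "08" -> 8 without octal surprises
    if [ "$yy" -ge 69 ]; then
        echo $(( 1900 + yy ))
    else
        echo $(( 2000 + yy ))
    fi
}

for y in 69 99 00 68 1999; do
    echo "$y -> $(expand_year "$y")"
done
```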


   DATE USAGE IN THE OPERATING SYSTEM
     Rumor has it that some systems don't recognize the year 2000 as a leap
     year.  Although http-analyze itself computes leap years correctly, it
     maps dates into weekdays using the localtime(3) function, which might
     fail if the OS doesn't recognize the year 2000 as a leap year.
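
     For reference, the Gregorian leap year rule that makes the year 2000 a
     leap year (divisible by 400) while 1900 is not can be written in shell as
     shown below.  This is the general calendar rule, not code taken from
     http-analyze; the function name is_leap_year is hypothetical.

```shell
#!/bin/sh
# Gregorian rule: divisible by 4, except centuries not divisible by 400.
is_leap_year() {
    [ $(( $1 % 4 )) -eq 0 ] &&
        { [ $(( $1 % 100 )) -ne 0 ] || [ $(( $1 % 400 )) -eq 0 ]; }
}

for y in 1900 1996 1999 2000; do
    if is_leap_year "$y"; then
        echo "$y: leap year"
    else
        echo "$y: not a leap year"
    fi
done
```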

     Actually, there is a date-related function in modern operating systems
     which may cause problems after the year 2037.  For those interested in
     the technical details, here's why:

     In operating systems the date is often represented in seconds since a
     certain date.  For example, in Unix systems the date is represented as
     seconds since the birth of the OS on January 1st, 1970.  This value is
     stored in a signed long (4-byte) data object, so it can represent as much
     as 2147483647 seconds, which equals 35791394 minutes = 596523 hours =
     24855 days = 68 years.  Therefore, most clocks in traditional Unix
     systems will overflow on January 19th, 2038 if the OS is not updated
     before this date.  Since http-analyze uses several data structures
     depending on the operating system's idea of the time (for example, the
     tm_year variable contains the years since 1900), the software has to be
     updated also before the year 2038 in order to take advantage of the time
     representation in future OS versions.
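
     The arithmetic behind these figures can be reproduced with integer
     division in the shell.  The sketch starts from the largest value of a
     signed 4-byte counter, 2^31 - 1, and approximates a year as 365 days.

```shell
#!/bin/sh
# Largest value of a signed 32-bit seconds counter (2^31 - 1).
max=2147483647

echo "minutes: $(( max / 60 ))"          # 35791394
echo "hours:   $(( max / 3600 ))"        # 596523
echo "days:    $(( max / 86400 ))"       # 24855
echo "years:   $(( max / 86400 / 365 ))" # 68 (365-day years)
```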

ENVIRONMENT VARIABLES
     Environment variables might work only in the Unix version of
     http-analyze.

     HA_LIBDIR         name of the library directory (default: /usr/local/lib/http-analyze)
     HA_CONFIG         name of the configuration file for http-analyze (no default)
     LANG              language to use if XPG4 MNLS support is compiled in (see -V)
     HA_LANG           language to use if native MNLS support is compiled in (see -V)

FILES
     The following required files are installed in the library directory as
     defined by the environment variable HA_LIBDIR or the hard-coded default
     defined at compile-time.  See also the section Statistics Report above
     for the names of the HTML output files.

     btn/*.png         button files used in the statistics report
     TLD               list of all top-level domains
     ha2.0*.png        http-analyze logos for your web site (for black and white bg)
     logfmt.[cde]lf    sample logfiles in CLF, DLF and ELF format
     3D*               required files for VRML model

SEE ALSO
     rotate-httpd      shell script to rotate the web server's logfiles
     http://www.netstore.de/Supply/http-analyze/  homepage of http-analyze
     http://support.netstore.de/                  support site of http-analyze

NOTES
     Logfile entries must be sorted in chronological order (ascending date)
     when fed into the analyzer.  If http-analyze detects logfile entries
     from an older month between newer ones, it prints a warning and skips all
     entries up to the date of the last entry processed.  To sort the data
     from several different logfiles into a chronologically sorted data
     stream, we provide a utility ha-sort to our Commercial Service Licensees.

     To reduce the response time of web servers, DNS lookups are often
     disabled.  In this case http-analyze does not see any hostname, but only
     numerical IP addresses.  To resolve the IP addresses into hostnames, we
     provide a very fast DNS resolver ipresolve to our Commercial Service
     Licensees, which does negative caching and saves all data in a history
     file.

     Please visit our support site at http://support.netstore.de/ for more
     information about the available helper applications.

COPYRIGHT
     Copyright (C) 1996-1999 by Stefan Stapelberg, RENT-A-GURU(R),
     <stefan@rent-a-guru.de>

     Please see the file LICENSE included in the distribution for the license
     terms under which this program is made available to you in	the free,
     non-commercial version.

     RENT-A-GURU(R) is a registered trademark of Martin	Weitzel, Stefan
     Stapelberg, and Walter Mecky.
     Netstore(R) is a registered trademark of Stefan Stapelberg.

CREDITS
     Thanks to the numerous users of http-analyze for their valuable
     feedback.  Special thanks to Lars-Owe Ivarsson for his suggestions to
     optimize the parser algorithm and for the code he provided as an example.
     Many thanks also to Thomas Boutell (http://www.boutell.com/) for his
     great GD library for fast image creation, without which http-analyze
     couldn't produce such fancy graphics in the statistics report.