.\" $Header: /home/amb/wwwoffle/doc/RCS/wwwoffle.conf.man.template 2.84 2008/08/26 18:00:31 amb Exp $
.\"
.\" WWWOFFLE - World Wide Web Offline Explorer - Version 2.9e.
.\"
.\" Manual page for wwwoffle.conf
.\"
.\" Written by Andrew M. Bishop
.\"
.\" This file Copyright 1997-2008 Andrew M. Bishop
.\" It may be distributed under the GNU Public License, version 2, or
.\" any higher version. See section COPYING of the GNU Public license
.\" for conditions under which this file may be redistributed.
.\"
.TH wwwoffle.conf 5 "August 26, 2008"
.SH NAME
wwwoffle.conf \- The configuration file for the proxy server for the World Wide Web Offline Explorer.
.SH Introduction
The configuration file
.RI ( wwwoffle\.conf )
specifies all of the parameters that
control the operation of the proxy server\. The file is split into sections
each containing a series of parameters as described below\. The file
.I CHANGES\.CONF
explains the changes in the configuration file between this
version of the program and previous ones\.
.LP
The file is split into sections, each of which can be empty or contain one or
more lines of configuration information\. The sections are named, and the order
in which they appear in the file is not important\.
.LP
The general format of each of the sections is the same\. The name of the
section is on a line by itself to mark the start\. The contents of the section
are enclosed between a pair of lines containing only the \'{\' and \'}\'
characters or the \'[\' and \']\' characters\. When the \'{\' and \'}\' characters are
used the lines between contain configuration information\. When the \'[\' and
\']\' characters are used then there must only be a single non\-empty line
between them that contains the name of a file (in the same directory)
containing the configuration information for the section\.
.LP
Comments are marked by a \'#\' character at the start of the line and they are
ignored\. Blank lines are also allowed and ignored\.
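.LP
As an illustration of this layout, the following sketch shows one section
given inline and one read from a separate file\. The file name
.I dontget\.conf
is hypothetical and the parameter values are only examples\.
.RS
.nf
# Parameters given inline
StartUp
{
 http\-port = 8080
 wwwoffle\-port = 8081
}

# Parameters read from a file in the same directory
DontGet
[
 dontget\.conf
]
.fi
.RE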
.LP
The phrases
.I URL\-SPECIFICATION
(or
.I URL\-SPEC
for short) and
.I WILDCARD
have
specific meanings in the configuration file and are described at the end\. In
the descriptions, an item enclosed in \'(\' and \')\' is a parameter supplied by
the user, anything enclosed in \'[\' and \']\' is optional, and the \'|\' symbol
separates alternative choices\. Some options apply only to specific URLs; this
is indicated by a
.I URL\-SPECIFICATION
enclosed between \'<\' and \'>\' in the option\. The first
.I URL\-SPECIFICATION
to match is used\. If no
.I URL\-SPECIFICATION
is given then it matches all URLs\.
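.LP
As a sketch of how a
.I URL\-SPECIFICATION
restricts an option, the lines below give one host a longer refresh
interval than all other URLs\. The host name is hypothetical and the
.I URL\-SPECIFICATION
shown is an assumed example of the syntax described at the end\. The
restricted line comes first because the first match is used\.
.RS
.nf
OnlineOptions
{
 <*://*\.example\.com/*> request\-changed = 1h
 request\-changed = 10m
}
.fi
.RE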
.SH StartUp
This contains the parameters that are used when the program starts, changes to
these are ignored if the configuration file is re\-read while the program is
running\.
.TP
.B bind\-ipv4 = (hostname) | (ip\-address) | none
Specify the hostname or IP address to bind the HTTP proxy and WWWOFFLE
control port sockets to using IPv4 (default=\'0\.0\.0\.0\')\. If \'none\' is
specified then no IPv4 socket is bound\. If this is changed from the
default value then the first entry in the LocalHost section may need
to be changed to match\.
.TP
.B bind\-ipv6 = (hostname) | (ip\-address) | none
Specify the hostname or IP address to bind the HTTP proxy and WWWOFFLE
control port sockets to using IPv6 (default=\'::\')\. If \'none\' is
specified then no IPv6 socket is bound\. This requires the IPv6
compilation option\. If this is changed from the default value then
the first entry in the LocalHost section may need to be changed to
match\.
.TP
.B http\-port = (port)
An integer specifying the port number for connections to access the
internal WWWOFFLE pages and for HTTP/HTTPS/FTP proxying
(default=8080)\. This is the port number that must be specified in the
client to connect to the WWWOFFLE proxy for HTTP/HTTPS/FTP proxying\.
.TP
.B https\-port = (port)
An integer specifying the port number for encrypted connections to
access the internal WWWOFFLE pages and for HTTP/FTP proxying
(default=8443)\. Requires GnuTLS compilation option\.
.TP
.B wwwoffle\-port = (port)
An integer specifying the port number for the WWWOFFLE control
connections to use (default=8081)\.
.TP
.B spool\-dir = (dir)
The full pathname of the top level cache directory (spool directory)
(default=/var/spool/wwwoffle or whatever was used when the program was
compiled)\.
.TP
.B run\-uid = (user) | (uid)
The username or numeric uid to change to when the WWWOFFLE server is
started (default=none)\. This option only works if the server is
started by the root user on UNIX\-like systems\.
.TP
.B run\-gid = (group) | (gid)
The group name or numeric gid to change to when the WWWOFFLE server is
started (default=none)\. This option only works if the server is
started by the root user on UNIX\-like systems\.
.TP
.B use\-syslog = yes | no
Whether to use the syslog facility for messages or not (default=yes)\.
.TP
.B password = (word)
The password used for authentication of the control pages, for
deleting cached pages etc (default=none)\. For the password to be
secure the configuration file must be set so that only authorised
users can read it\.
.TP
.B max\-servers = (integer)
The maximum number of server processes that are started for online and
automatic fetching (default=8)\.
.TP
.B max\-fetch\-servers = (integer)
The maximum number of server processes that are started to fetch pages
that were marked in offline mode (default=4)\. This value must be less
than max\-servers or you will not be able to use WWWOFFLE interactively
online while fetching\.
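.LP
A minimal sketch of a StartUp section that accepts connections only on the
IPv4 loopback address is shown below\. The values are illustrative choices,
not recommendations, and changing bind\-ipv4 in this way may mean that the
first LocalHost entry needs to change to match\.
.RS
.nf
StartUp
{
 bind\-ipv4 = 127\.0\.0\.1
 bind\-ipv6 = none
 http\-port = 8080
 wwwoffle\-port = 8081
 spool\-dir = /var/spool/wwwoffle
 use\-syslog = yes
 max\-servers = 8
 max\-fetch\-servers = 4
}
.fi
.RE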
.SH Options
Options that control how the program works\.
.TP
.B log\-level = debug | info | important | warning | fatal
The minimum log level for messages in syslog or stderr
(default=important)\.
.TP
.B socket\-timeout = (time)
The time in seconds that WWWOFFLE will wait for data on a socket
connection before giving up (default=120)\.
.TP
.B dns\-timeout = (time)
The time in seconds that WWWOFFLE will wait for a DNS (Domain Name
Service) lookup before giving up (default=60)\.
.TP
.B connect\-timeout = (time)
The time in seconds that WWWOFFLE will wait for the socket connection
to be made before giving up (default=30)\.
.TP
.B connect\-retry = yes | no
If a connection cannot be made to a remote server then WWWOFFLE should
try again after a short delay (default=no)\.
.TP
.B dir\-perm = (octal int)
The directory permissions to use when creating spool directories
(default=0755)\. This option overrides the umask of the user and must
be in octal starting with a \'0\'\.
.TP
.B file\-perm = (octal int)
The file permissions to use when creating spool files (default=0644)\.
This option overrides the umask of the user and must be in octal
starting with a \'0\'\.
.TP
.B run\-online = (filename)
The full pathname of a program to run when WWWOFFLE is switched to
online mode (default=none)\. The program is started in the background
with a single parameter set to the current mode name "online"\.
.TP
.B run\-offline = (filename)
The full pathname of a program to run when WWWOFFLE is switched to
offline mode (default=none)\. The program is started in the background
with a single parameter set to the current mode name "offline"\.
.TP
.B run\-autodial = (filename)
The full pathname of a program to run when WWWOFFLE is switched to
autodial mode (default=none)\. The program is started in the background with
a single parameter set to the current mode name "autodial"\.
.TP
.B run\-fetch = (filename)
The full pathname of a program to run when a WWWOFFLE fetch starts or
stops (default=none)\. The program is started in the background with two
parameters, the first is the word "fetch" and the second is "start" or
"stop"\.
.TP
.B lock\-files = yes | no
Enable the use of lock files to stop more than one WWWOFFLE process from
downloading the same URL at the same time (default=no)\. Disabling the
lock\-files may result in incomplete pages being displayed or many copies
being downloaded if multiple requests are made for the same URL at the
same time\.
.TP
.B reply\-compressed\-data = yes | no
If the replies that are made to the client are to contain compressed
data when requested (default=no)\. Requires zlib compilation option\.
.TP
.B reply\-chunked\-data = yes | no
If the replies that are made to the client are to use chunked encoding
when possible (default=yes)\.
.TP
.B exec\-cgi = (pathname)
Enable the use of CGI scripts for the local pages on the WWWOFFLE
server that match the wildcard pathname (default=none)\.
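.LP
The following sketch of an Options section combines several of the
parameters above\. The program path is hypothetical; it stands for any
executable that accepts the mode name as its single parameter\.
.RS
.nf
Options
{
 log\-level = important
 socket\-timeout = 120
 connect\-retry = yes
 dir\-perm = 0755
 file\-perm = 0644
 run\-online = /usr/local/bin/wwwoffle\-mode\-changed
 lock\-files = yes
}
.fi
.RE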
.SH OnlineOptions
Options that control how WWWOFFLE behaves when it is online\.
.TP
.B [<URL\-SPEC>] pragma\-no\-cache = yes | no
Whether to request a new copy of a page if the request from the client
has \'Pragma: no\-cache\' (default=yes)\. This option takes precedence
over the request\-changed and request\-changed\-once options\.
.TP
.B [<URL\-SPEC>] cache\-control\-no\-cache = yes | no
Whether to request a new copy of a page if the request from the client
has \'Cache\-Control: no\-cache\' (default=yes)\. This option takes
precedence over the request\-changed and request\-changed\-once options\.
.TP
.B [<URL\-SPEC>] cache\-control\-max\-age\-0 = yes | no
Whether to request a new copy of a page if the request from the client
has \'Cache\-Control: max\-age=0\' (default=yes)\. This option takes
precedence over the request\-changed and request\-changed\-once options\.
.TP
.B [<URL\-SPEC>] cookies\-force\-refresh = yes | no
Whether to force the refresh of a page if the request from the client
contains a cookie (default=no)\. This option takes precedence over
the request\-changed and request\-changed\-once options\.
.TP
.B [<URL\-SPEC>] request\-changed = (time)
While online, pages will only be fetched if the cached version is older
than the specified time in seconds (default=600)\. Setting this value
negative indicates that cached pages are always used while online\.
Longer times can be specified with a \'m\', \'h\', \'d\' or \'w\' suffix for
minutes, hours, days or weeks (e\.g\. 10m=600)\.
.TP
.B [<URL\-SPEC>] request\-changed\-once = yes | no
While online, pages will only be fetched if the cached version has not
already been fetched once this session online (default=yes)\. This
option takes precedence over the request\-changed option\.
.TP
.B [<URL\-SPEC>] request\-expired = yes | no
While online, pages that have expired will always be requested again
(default=no)\. This option takes precedence over the request\-changed
and request\-changed\-once options\.
.TP
.B [<URL\-SPEC>] request\-no\-cache = yes | no
While online, pages that ask not to be cached will always be requested
again (default=no)\. This option takes precedence over the
request\-changed and request\-changed\-once options\.
.TP
.B [<URL\-SPEC>] request\-redirection = yes | no
While online, pages that redirect the client to another URL temporarily
will be requested again (default=no)\. This option takes precedence
over the request\-changed and request\-changed\-once options\.
.TP
.B [<URL\-SPEC>] request\-conditional = yes | no
While online, requests that are made to the server for pages will be
conditional requests so that the server only sends data if the page
has changed (default=yes)\.
.TP
.B [<URL\-SPEC>] validate\-with\-etag = yes | no
When making a conditional request to a server enable the use of the
HTTP/1\.1 cache validator \'ETag\' as well as modification time
\'If\-Modified\-Since\' (default=yes)\. The request\-conditional option
must also be selected for this option to take effect\.
.TP
.B [<URL\-SPEC>] try\-without\-password = yes | no
If a request is made for a URL that contains a username and password
then a request is made for the same URL without a username and
password specified (default=yes)\. This allows for requests for the
URL without a password to re\-direct the client to the passworded
version\.
.TP
.B [<URL\-SPEC>] intr\-download\-keep = yes | no
If the client closes the connection while online the currently
downloaded incomplete page should be kept (default=no)\.
.TP
.B [<URL\-SPEC>] intr\-download\-size = (integer)
If the client closes the connection while online the page should
continue to download if it is smaller than this size in kB
(default=1)\.
.TP
.B [<URL\-SPEC>] intr\-download\-percent = (integer)
If the client closes the connection while online the page should
continue to download if it is more than this percentage complete
(default=80)\.
.TP
.B [<URL\-SPEC>] timeout\-download\-keep = yes | no
If the server connection times out while reading then the currently
downloaded incomplete page should be kept (default=no)\.
.TP
.B [<URL\-SPEC>] keep\-cache\-if\-not\-found = yes | no
If the remote server replies with an error message or a redirection
while there is a cached version with status 200 the previously cached
version should be kept (default=no)\.
.TP
.B [<URL\-SPEC>] request\-compressed\-data = yes | no
If the requests that are made to the server are to request compressed
data (default=yes)\. Requires zlib compilation option\.
.TP
.B [<URL\-SPEC>] request\-chunked\-data = yes | no
If the requests that are made to the server are to request chunked
encoding (default=yes)\.
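.LP
As a sketch, the section below uses a time suffix for request\-changed\.
By the same rule that makes 10m equal to 600 seconds, 1h is 3600 seconds,
1d is 86400 seconds and 1w is 604800 seconds\. The values chosen are
illustrative only\.
.RS
.nf
OnlineOptions
{
 request\-changed = 1d
 request\-changed\-once = no
 request\-conditional = yes
 validate\-with\-etag = yes
 intr\-download\-keep = no
 intr\-download\-size = 4
 intr\-download\-percent = 80
}
.fi
.RE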
.SH OfflineOptions
Options that control how WWWOFFLE behaves when it is offline\.
.TP
.B [<URL\-SPEC>] pragma\-no\-cache = yes | no
Whether to request a new copy of a page if the request from the client
has \'Pragma: no\-cache\' (default=yes)\. This option should be set to
\'no\' if when browsing offline all pages are re\-requested by a \'broken\'
browser\.
.TP
.B [<URL\-SPEC>] cache\-control\-no\-cache = yes | no
Whether to request a new copy of a page if the request from the client
has \'Cache\-Control: no\-cache\' (default=yes)\. This option should be
set to \'no\' if when browsing offline all pages are re\-requested by a
\'broken\' browser\.
.TP
.B [<URL\-SPEC>] cache\-control\-max\-age\-0 = yes | no
Whether to request a new copy of a page if the request from the client
has \'Cache\-Control: max\-age=0\' (default=yes)\. This option should be
set to \'no\' if when browsing offline all pages are re\-requested by a
\'broken\' browser\.
.TP
.B [<URL\-SPEC>] confirm\-requests = yes | no
Whether to return a page requiring user confirmation instead of
automatically recording requests made while offline (default=no)\.
.TP
.B [<URL\-SPEC>] dont\-request = yes | no
Do not request any URLs that match this when offline (default=no)\.
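.LP
A possible OfflineOptions section for use with a browser that always asks
for fresh copies is sketched below\. The host name is hypothetical and the
.I URL\-SPECIFICATION
is an assumed example of the syntax described at the end\.
.RS
.nf
OfflineOptions
{
 pragma\-no\-cache = no
 cache\-control\-no\-cache = no
 cache\-control\-max\-age\-0 = no
 confirm\-requests = yes
 <*://ads\.example\.com/*> dont\-request = yes
}
.fi
.RE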
.SH SSLOptions
Options that control how WWWOFFLE behaves when a connection is made to it for an
https or Secure Sockets Layer (SSL) server\. Normally only tunnelling (with no
decryption or caching of the data) is possible\. When WWWOFFLE is compiled with
the GnuTLS library it is possible to configure WWWOFFLE to decrypt, cache and
re\-encrypt the connections\.
.TP
.B quick\-key\-gen = yes | no
Normally generation of secret keys for the SSL/https functions uses the
default GnuTLS option for random number source\. This can be slow on
some machines so this option selects a quicker but less secure random
number source (default = no)\. Requires GnuTLS compilation option\.
.TP
.B expiration\-time = (age)
The length of time after creation that each certificate will expire
(default = 1y)\. Requires GnuTLS compilation option\.
.TP
.B enable\-caching = yes | no
If caching (involving decryption and re\-encryption) of SSL/https
server connections is allowed (default = no)\. Requires GnuTLS
compilation option\.
.TP
.B allow\-tunnel = (host[:port])
A hostname and port number (a
.I WILDCARD
match) for an https/SSL server
that can be connected to using WWWOFFLE as a tunnelling proxy (no
caching or decryption of the data) (default is no hosts or ports
allowed)\. This option should be set to *:443 to allow https to the
default port number\. There can be more than one option for other
ports or hosts as required\. This option takes precedence over the
allow\-cache option\. The host value is matched against the URL as
presented, no hostname to IP or IP to hostname lookups are performed
to find alternative equivalent names\.
.TP
.B disallow\-tunnel = (host[:port])
A hostname and port number (a
.I WILDCARD
match) for an https/SSL server
that cannot be connected to using WWWOFFLE as a tunnelling proxy\.
There can be more than one option for other ports or hosts as
required\. This option takes precedence over the allow\-tunnel option\.
The host value is matched against the URL as presented, no hostname to
IP or IP to hostname lookups are performed to find alternative
equivalent names\.
.TP
.B allow\-cache = (host[:port])
A hostname and port number (a
.I WILDCARD
match) for an https/SSL server
that can be connected to using WWWOFFLE as a caching proxy (decryption
of the data) (default is no hosts or ports allowed)\. This option
should be set to *:443 to allow https to the default port number\.
There can be more than one option for other ports or hosts as
required\. The host value is matched against the URL as presented, no
hostname to IP or IP to hostname lookups are performed to find
alternative equivalent names\. Requires GnuTLS compilation option\.
.TP
.B disallow\-cache = (host[:port])
A hostname and port number (a
.I WILDCARD
match) for an https/SSL server
that cannot be connected to using WWWOFFLE as a caching proxy\. This
option takes precedence over the allow\-cache option\. The host value
is matched against the URL as presented, no hostname to IP or IP to
hostname lookups are performed to find alternative equivalent names\.
Requires GnuTLS compilation option\.
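.LP
A minimal sketch of an SSLOptions section that permits tunnelling to the
default https port but refuses one host is shown below\. The refused host
name is hypothetical\.
.RS
.nf
SSLOptions
{
 enable\-caching = no
 allow\-tunnel = *:443
 disallow\-tunnel = ads\.example\.com:443
}
.fi
.RE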
.SH FetchOptions
Options that control what linked elements are downloaded when fetching pages
that were requested while offline\.
.TP
.B [<URL\-SPEC>] stylesheets = yes | no
If style sheets are to be fetched (default=no)\.
.TP
.B [<URL\-SPEC>] images = yes | no
If images are to be fetched (default=no)\.
.TP
.B [<URL\-SPEC>] webbug\-images = yes | no
If images that are declared in the HTML to be 1 pixel square are also to
be fetched (default=yes)\. This requires the images option to also be
selected\. If these images are not fetched then the
replace\-webbug\-images option in the ModifyHTML section can be used to
stop browsers requesting them\.
.TP
.B [<URL\-SPEC>] icon\-images = yes | no
If icons (also called favourite icons or shortcut icons) as used by
browsers for bookmarks are to be fetched (default=no)\.
.TP
.B [<URL\-SPEC>] only\-same\-host\-images = yes | no
If the only images that are fetched are the ones that are on the same
host as the page that references them (default=no)\. This requires the
images option to also be selected\.
.TP
.B [<URL\-SPEC>] frames = yes | no
If frames are to be fetched (default=no)\.
.TP
.B [<URL\-SPEC>] iframes = yes | no
If inline frames (iframes) are to be fetched (default=no)\.
.TP
.B [<URL\-SPEC>] scripts = yes | no
If scripts (e\.g\. Javascript) are to be fetched (default=no)\.
.TP
.B [<URL\-SPEC>] objects = yes | no
If objects (e\.g\. Java class files) are to be fetched (default=no)\.
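.LP
As an illustration, the following FetchOptions section fetches what is
needed to display a page offline while skipping scripts and 1 pixel square
images\. The selection is only an example\.
.RS
.nf
FetchOptions
{
 stylesheets = yes
 images = yes
 webbug\-images = no
 only\-same\-host\-images = yes
 frames = yes
 iframes = yes
 scripts = no
 objects = no
}
.fi
.RE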
.SH IndexOptions
Options that control what is displayed in the indexes\.
.TP
.B create\-history\-indexes = yes | no
Enables creation of the lasttime/prevtime and lastout/prevout indexes
(default=yes)\. The cycling of the indexes is always performed and
they will flush even if this option is disabled\.
.TP
.B cycle\-indexes\-daily = yes | no
Cycles the lasttime/prevtime and lastout/prevout indexes daily instead
of each time online or fetching (default = no)\.
.TP
.B <URL\-SPEC> list\-outgoing = yes | no
Choose if the URL is to be listed in the outgoing index (default=yes)\.
.TP
.B <URL\-SPEC> list\-latest = yes | no
Choose if the URL is to be listed in the lasttime/prevtime and
lastout/prevout indexes (default=yes)\.
.TP
.B <URL\-SPEC> list\-monitor = yes | no
Choose if the URL is to be listed in the monitor index (default=yes)\.
.TP
.B <URL\-SPEC> list\-host = yes | no
Choose if the URL is to be listed in the host indexes (default=yes)\.
.TP
.B <URL\-SPEC> list\-any = yes | no
Choose if the URL is to be listed in any of the indexes (default=yes)\.
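.LP
A sketch of an IndexOptions section that keeps image files out of the
history indexes and hides one host from every index is shown below\. The
host name is hypothetical and each
.I URL\-SPECIFICATION
is an assumed example of the syntax described at the end\.
.RS
.nf
IndexOptions
{
 cycle\-indexes\-daily = yes
 <*://*/*\.gif> list\-latest = no
 <*://*/*\.jpg> list\-latest = no
 <*://*\.example\.com/*> list\-any = no
}
.fi
.RE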
.SH ModifyHTML
Options that control how the HTML that is provided from the cache is modified\.
.TP
.B [<URL\-SPEC>] enable\-modify\-html = yes | no
Enable the HTML modifications in this section (default=no)\. With this
option disabled the following HTML options will not have any effect\.
With this option enabled there is a small speed penalty\.
.TP
.B [<URL\-SPEC>] add\-cache\-info = yes | no
At the bottom of all of the spooled pages the date that the page was
cached and some navigation buttons are to be added (default=no)\.
.TP
.B [<URL\-SPEC>] anchor\-cached\-begin = (HTML code) |
Anchors (links) in the spooled page that are in the cache are to have
the specified HTML inserted at the beginning (default="")\.
.TP
.B [<URL\-SPEC>] anchor\-cached\-end = (HTML code) |
Anchors (links) in the spooled page that are in the cache are to have
the specified HTML inserted at the end (default="")\.
.TP
.B [<URL\-SPEC>] anchor\-requested\-begin = (HTML code) |
Anchors (links) in the spooled page that are not in the cache but have
been requested for download are to have the specified HTML inserted at
the beginning (default="")\.
.TP
.B [<URL\-SPEC>] anchor\-requested\-end = (HTML code) |
Anchors (links) in the spooled page that are not in the cache but have
been requested for download are to have the specified HTML inserted at
the end (default="")\.
.TP
.B [<URL\-SPEC>] anchor\-not\-cached\-begin = (HTML code) |
Anchors (links) in the spooled page that are not in the cache or
requested are to have the specified HTML inserted at the beginning
(default="")\.
.TP
.B [<URL\-SPEC>] anchor\-not\-cached\-end = (HTML code) |
Anchors (links) in the spooled page that are not in the cache or
requested are to have the specified HTML inserted at the end
(default="")\.
.TP
.B [<URL\-SPEC>] disable\-script = yes | no
Removes all scripts and scripted events (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-applet = yes | no
Removes all Java applets (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-style = yes | no
Removes all stylesheets and style references (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-blink = yes | no
Removes the <blink> tag from HTML but does not disable blink in
stylesheets (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-marquee = yes | no
Removes the <marquee> tag from HTML to stop scrolling text
(default=no)\.
.TP
.B [<URL\-SPEC>] disable\-flash = yes | no
Removes any Shockwave Flash animations (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-iframe = yes | no
Removes any inline frames (the <iframe> tag) from HTML (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-meta\-refresh = yes | no
Removes any meta tags in the HTML header that re\-direct the client to
change to another page after an optional delay (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-meta\-refresh\-self = yes | no
Removes any meta tags in the HTML header that re\-direct the client to
reload the same page after a delay (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-meta\-set\-cookie = yes | no
Removes any meta tags in the HTML header that cause cookies to be set
(default=no)\.
.TP
.B [<URL\-SPEC>] disable\-dontget\-links = yes | no
Disables any links to URLs that are in the DontGet section of the
configuration file (default=no)\.
.TP
.B [<URL\-SPEC>] disable\-dontget\-iframes = yes | no
Disables inline frame (iframe) URLs that are in the DontGet section of
the configuration file (default=no)\.
.TP
.B [<URL\-SPEC>] replace\-dontget\-images = yes | no
Replaces image URLs that are in the DontGet section of the
configuration file with a static URL (default=no)\.
.TP
.B [<URL\-SPEC>] replacement\-dontget\-image = (URL)
The replacement image to use for URLs that are in the DontGet section
of the configuration file (default=/local/dontget/replacement\.gif)\.
.TP
.B [<URL\-SPEC>] replace\-webbug\-images = yes | no
Replaces image URLs that are 1 pixel square with a static URL
(default=no)\. The webbug\-images option in the FetchOptions section
can be used to stop these images from being automatically downloaded\.
.TP
.B [<URL\-SPEC>] replacement\-webbug\-image = (URL)
The replacement image to use for images that are 1 pixel square
(default=/local/dontget/replacement\.gif)\.
.TP
.B [<URL\-SPEC>] demoronise\-ms\-chars = yes | no
Replaces strange characters that some Microsoft applications put into
HTML with character equivalents that most browsers can display
(default=no)\. The idea for this comes from the public domain
Demoroniser perl script\.
.TP
.B [<URL\-SPEC>] fix\-mixed\-cyrillic = yes | no
Replaces punctuation characters in cp\-1251 encoding that are combined
with text in koi\-8 encoding that appears in some Cyrillic web pages\.
.TP
.B [<URL\-SPEC>] disable\-animated\-gif = yes | no
Disables the animation in animated GIF files (default=no)\.
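.LP
The sketch below enables HTML modification and marks links that are not in
the cache\. The HTML fragments are hypothetical choices; any matching pair
of opening and closing markup could be used instead\.
.RS
.nf
ModifyHTML
{
 enable\-modify\-html = yes
 add\-cache\-info = yes
 anchor\-not\-cached\-begin = <i>
 anchor\-not\-cached\-end = </i>
 disable\-blink = yes
 disable\-meta\-refresh\-self = yes
 replace\-webbug\-images = yes
}
.fi
.RE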
.SH LocalHost
A list of hostnames that the host running the WWWOFFLE server may be known by\.
This is so that the proxy does not need to contact itself if the request has a
different name for the same server\.
.TP
.B (host)
A hostname or IP address that in connection with the port number (in
the StartUp section) specifies the WWWOFFLE proxy HTTP server\. The
hostnames must match exactly, it is not a
.I WILDCARD
match\. The first
named host is used as the server name for several features so must be
a name that will work from any client host on the network\. The
entries can be hostnames, IPv4 addresses or IPv6 addresses enclosed
within \'[\.\.\.]\'\. None of the hosts named here are cached or fetched
via a proxy\.
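.LP
A possible LocalHost section is sketched below\. The first host name is
hypothetical; it is listed first on the assumption that it resolves from
every client on the network\.
.RS
.nf
LocalHost
{
 wwwoffle\.home\.example
 localhost
 127\.0\.0\.1
}
.fi
.RE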
.SH LocalNet
A list of hostnames whose web servers are always accessible even when offline
and are not to be cached by WWWOFFLE because they are on a local network\.
.TP
.B (host)
A hostname or IP address that is always available and is not to be
cached by WWWOFFLE\. The host name matching uses
.IR WILDCARD s\.
A host can be excluded by putting a \'!\' at the start of the name\. The host
value is matched against the URL as presented, no hostname to IP or IP
to hostname lookups are performed to find alternative equivalent
names\. The entries can be hostnames, IPv4 addresses or IPv6 addresses
enclosed within \'[\.\.\.]\'\. All entries here are assumed to be reachable
even when offline\. None of the hosts named here are cached or fetched
via a proxy\.
.SH AllowedConnectHosts
A list of client hostnames that are allowed to connect to the server\.
.TP
.B (host)
A hostname or IP address that is allowed to connect to the server\.
The host name matching uses
.IR WILDCARD s\.
A host can be excluded by
putting a \'!\' at the start of the name\. If the IP address or
hostname (if available) of the machine connecting matches then it is
allowed\. The entries can be hostnames, IPv4 addresses or IPv6
addresses enclosed within \'[\.\.\.]\'\. All of the hosts named in
LocalHost are also allowed to connect\.
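.LP
As a sketch, the section below admits every host in one domain and one
private address range except a single excluded machine\. All names and
addresses are hypothetical, and the excluded entry is placed first on the
assumption that entries are checked in the order given\.
.RS
.nf
AllowedConnectHosts
{
 !guest\.home\.example
 *\.home\.example
 192\.168\.1\.*
}
.fi
.RE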
.SH AllowedConnectUsers
A list of the users that are allowed to connect to the server and their
passwords\.
.TP
.B (username):(password)
The username and password of the users that are allowed to connect to
the server\. If this section is left empty then no user authentication
is done\. The username and password are both stored in plaintext
format\. This requires the use of clients that handle the HTTP/1\.1
proxy authentication standard\.
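.LP
A minimal sketch with a single user is shown below\. The username and
password are hypothetical\. Because they are stored in plaintext, the same
care over file permissions applies as for the password option in the
StartUp section\.
.RS
.nf
AllowedConnectUsers
{
 alice:example\-password
}
.fi
.RE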
.SH DontCache
A list of URLs that are not to be cached by WWWOFFLE\.
.TP
.B [!]URL\-SPECIFICATION
Do not cache any URLs that match this\. The
.I URL\-SPECIFICATION
can be
negated to allow matches to be cached\. The URLs that are not cached
will not have requests recorded if offline or fetched automatically\.
.SH DontGet
A list of URLs that are not to be got by WWWOFFLE when it is fetching and not
to be served from the WWWOFFLE cache even if they exist\.
.TP
.B [!]URL\-SPECIFICATION
Do not get any URLs that match this\. The
.I URL\-SPECIFICATION
can be
negated to allow matches to be got\.
.TP
.B [<URL\-SPEC>] replacement = (URL)
The URL to use to replace any URLs that match the
.IR URL\-SPECIFICATION s
instead of using the standard error message (default=none)\. The URLs
in /local/dontget/ are suggested replacements (e\.g\. replacement\.gif or
replacement\.png which are 1x1 pixel transparent images or
replacement\.js which is an empty javascript file)\.
.TP
.B <URL\-SPEC> get\-recursive = yes | no
Choose whether to get URLs that match this when doing a recursive
fetch (default=yes)\.
.TP
.B <URL\-SPEC> location\-error = yes | no
When a URL reply contains a \'Location\' header that redirects to a URL
that is not got (specified in this section) then the reply is modified
to be an error message instead (default=no)\. This will stop ISP
proxies from redirecting users to adverts if the advert URLs are
in this section\.
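.LP
The options above might be combined as in the following sketch\. The host
names are hypothetical, and the replacement files are the suggested ones
from /local/dontget/\. Here blocked URLs are replaced by a transparent image,
or by an empty script for paths ending in \.js, redirections to blocked URLs
become errors, and one host is skipped during recursive fetches\.
.LP
.nf
DontGet
{
 replacement = /local/dontget/replacement.gif
 <*://*/*.js> replacement = /local/dontget/replacement.js
 <*://*/*> location-error = yes
 <*://archive.foo.com/*> get-recursive = no
 *://ads.foo.com/*
}
.fi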
.SH DontCompress
A list of MIME types and file extensions that are not to be compressed by
WWWOFFLE (because they are already compressed or not worth compressing)\.
Requires zlib compilation option\.
.TP
.B mime\-type = (mime\-type)/(subtype)
The MIME type of a URL that is not to be compressed in the cache (when
purging) or when providing pages to clients\.
.TP
.B file\-ext = \.(file\-ext)
The file extension of a URL that is not to be requested compressed
from a server\.
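.LP
A short sketch of this section, using a few formats that are already
compressed; the particular types and extensions chosen are only examples\.
.LP
.nf
DontCompress
{
 mime-type = image/gif
 mime-type = image/jpeg
 file-ext = .gz
 file-ext = .zip
}
.fi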
.SH CensorHeader
A list of HTTP header lines that are to be removed from the requests sent to
web servers and the replies that come back from them\.
.TP
.B [<URL\-SPEC>] (header) = yes | no | (string)
A header field name (e\.g\. From, Cookie, Set\-Cookie, User\-Agent) and
the string to replace the header value with (default=no)\. The header
is case sensitive, and does not have a \':\' at the end\. The value of
"no" means that the header is unmodified, "yes" or no string can be
used to remove the header or a string can be used to replace the
header\. This only replaces headers it finds; it does not add any new
ones\. An option for Referer here will take precedence over the
referer\-self and referer\-self\-dir options\.
.TP
.B [<URL\-SPEC>] referer\-self = yes | no
Sets the Referer header to the same as the URL being requested
(default=no)\. This will add the Referer header if none is contained
in the original request\.
.TP
.B [<URL\-SPEC>] referer\-self\-dir = yes | no
Sets the Referer header to the directory name of the URL being
requested (default=no)\. This will add the Referer header if none is
contained in the original request\. This option takes precedence over
referer\-self\.
.TP
.B [<URL\-SPEC>] referer\-from = yes | no
Removes the Referer header based on a match of the referring URL
(default=no)\.
.TP
.B [<URL\-SPEC>] force\-user\-agent = yes | no
Forces a User\-Agent header to be inserted into all requests that are
made by WWWOFFLE (default=no)\. This User\-Agent is added only if there
is not an existing User\-Agent header and is set to the value
WWWOFFLE/<version\-number>\. This header is inserted before censoring
and may be changed by the normal header censoring method\.
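.LP
As a sketch, the section below removes the From header, replaces the
User\-Agent value with a fixed string, removes Set\-Cookie for one
(hypothetical) domain and sets the Referer header to the directory of the
requested URL\. The User\-Agent string is a placeholder\. Since
force\-user\-agent inserts its header before censoring, the inserted header
would then be rewritten by the User\-Agent entry\.
.LP
.nf
CensorHeader
{
 From = yes
 User-Agent = ExampleBrowser/1.0
 <*://*.foo.com/*> Set-Cookie = yes
 referer-self-dir = yes
 force-user-agent = yes
}
.fi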
.SH FTPOptions
Options to use when fetching files using the ftp protocol\.
.TP
.B anon\-username = (string)
The username to use for anonymous ftp (default=anonymous)\.
.TP
.B anon\-password = (string)
The password to use for anonymous ftp (default determined at run
time)\. If using a firewall then this may contain a value that is not
valid to the FTP server and may need to be set to a different value\.
.TP
.B <URL\-SPEC> auth\-username = (string)
The username to use on a host instead of the default anonymous
username\.
.TP
.B <URL\-SPEC> auth\-password = (string)
The password to use on a host instead of the default anonymous
password\.
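.LP
For illustration, with a hypothetical FTP server and placeholder
credentials, anonymous and per\-host logins could be set up as follows\.
.LP
.nf
FTPOptions
{
 anon-username = anonymous
 anon-password = user@foo.com
 <ftp://ftp.foo.com/*> auth-username = alice
 <ftp://ftp.foo.com/*> auth-password = secret1
}
.fi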
.SH MIMETypes
MIME Types to use when serving files that were not fetched using HTTP or for
files on the built\-in web\-server\.
.TP
.B default = (mime\-type)/(subtype)
The default MIME type (default=text/plain)\.
.TP
.B \.(file\-ext) = (mime\-type)/(subtype)
The MIME type to associate with a file extension\. The \'\.\' must be
included in the file extension\. If more than one extension matches
then the longest one is used\.
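.LP
The sketch below shows the longest\-match rule: a file named foo\.tar\.gz
matches both the \.gz and the \.tar\.gz entries, and the \.tar\.gz entry is
the one used\. The MIME type values are illustrative\.
.LP
.nf
MIMETypes
{
 default = text/plain
 .gif = image/gif
 .html = text/html
 .gz = application/gzip
 .tar.gz = application/x-tar-gz
}
.fi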
.SH Proxy
This contains the names of the HTTP (or other) proxies to use external to the
WWWOFFLE server machine\.
.TP
.B [<URL\-SPEC>] proxy = (host[:port])
The hostname, and optionally the port on it, to use as the proxy\.
.TP
.B <URL\-SPEC> auth\-username = (string)
The username to use on a proxy host to authenticate WWWOFFLE to it\.
The
.I URL\-SPEC
in this case refers to the proxy and not the URL being
retrieved\.
.TP
.B <URL\-SPEC> auth\-password = (string)
The password to use on a proxy host to authenticate WWWOFFLE to it\.
The
.I URL\-SPEC
in this case refers to the proxy and not the URL being
retrieved\.
.TP
.B [<URL\-SPEC>] ssl = (host[:port])
A proxy server that should be used for https or Secure Socket Layer
(SSL) connections\. Note that for the
.I <URL\-SPEC>
only the host is
checked and the other parts must be \'*\'
.IR WILDCARD s\.
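.LP
A sketch combining these options, with hypothetical host names and
placeholder credentials\. Writing the proxy itself as a
.I URL\-SPEC
that includes its port is an assumption made here\.
.LP
.nf
Proxy
{
 <http://*/*> proxy = www-cache.foo.com:8080
 <*://www-cache.foo.com:8080/*> auth-username = alice
 <*://www-cache.foo.com:8080/*> auth-password = secret1
 <*://*.bar.com/*> ssl = ssl-cache.foo.com:8080
}
.fi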
.SH Alias
A list of aliases that are used to replace the server name and path with
another server name and path\.
.TP
.B URL\-SPECIFICATION = URL\-SPECIFICATION
Any requests that match the first
.I URL\-SPECIFICATION
are replaced by
the second
.I URL\-SPECIFICATION.
The first
.I URL\-SPECIFICATION
is a
wildcard match for the protocol and host/port, the path must match the
start of the requested URL exactly and includes all subdirectories\.
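.LP
For example, with hypothetical hosts, the alias below would apply to
http://www\.bar\.com/docs/ and everything beneath it\. On the reading that the
rest of the path is carried over, a request for
http://www\.bar\.com/docs/index\.html would be replaced by
http://mirror\.foo\.com/bar\-docs/index\.html\.
.LP
.nf
Alias
{
 http://www.bar.com/docs/ = http://mirror.foo.com/bar-docs/
}
.fi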
.SH Purge
The method to determine which pages to purge, the default age, the host
specific maximum age of the pages in days, and the maximum cache size\.
.TP
.B use\-mtime = yes | no
Whether to use the last modification time (mtime) instead of the last
access time (atime) to decide which files to purge (default=no)\.
.TP
.B max\-size = (size)
The maximum size for the cache in MB after purging (default=\-1)\. A
maximum cache size of \-1 (or 0 for backwards compatibility) means
there is no limit to the size\. If this and the min\-free options are
both used the smaller cache size is chosen\. This option takes into
account the URLs that are never purged when measuring the cache size
but will not purge them\.
.TP
.B min\-free = (size)
The minimum amount of free disk space in MB after purging
(default=\-1)\. A minimum disk free of \-1 (or 0) means there is no
limit to the free space\. If this and the max\-size options are both
used the smaller cache size is chosen\. This option takes into account
the URLs that are never purged when measuring the cache size but will
not purge them\.
.TP
.B use\-url = yes | no
If true then use the URL to decide on the purge age, otherwise use the
protocol and host only (default=no)\.
.TP
.B del\-dontget = yes | no
If true then delete the URLs that match the entries in the DontGet
section (default=no)\.
.TP
.B del\-dontcache = yes | no
If true then delete the URLs that match the entries in the DontCache
section (default=no)\.
.TP
.B [<URL\-SPEC>] age = (age)
The maximum age in the cache for URLs that match this (default=14)\.
An age of zero means always to delete, negative means not to delete\.
The
.I URL\-SPECIFICATION
matches only the protocol and host unless
use\-url is set to true\. Longer times can be specified with a \'w\', \'m\'
or \'y\' suffix for weeks, months or years (e\.g\. 2w=14)\.
.TP
.B [<URL\-SPEC>] compress\-age = (age)
The maximum age in the cache for URLs that match this to be stored
uncompressed (default=\-1)\. Requires zlib compilation option\. An age
of zero means always to compress, negative means never to compress\.
The
.I URL\-SPECIFICATION
matches only the protocol and host unless
use\-url is set to true\. Longer times can be specified with a \'w\', \'m\'
or \'y\' suffix for weeks, months or years (e\.g\. 2w=14)\.
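.LP
A sketch that uses the age suffixes and the special age values; the host
name and the sizes are hypothetical\. With use\-url set, the path is taken
into account, so pages under /news/ are always deleted and pages under
/docs/ are never deleted, while everything else is kept for 4w (28 days) and
is stored compressed once it is more than 1w (7 days) old\.
.LP
.nf
Purge
{
 use-mtime = no
 use-url = yes
 max-size = 500
 min-free = 100
 age = 4w
 <http://www.foo.com/news/*> age = 0
 <http://www.foo.com/docs/*> age = -1
 compress-age = 1w
}
.fi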
.SH WILDCARD
A
.I WILDCARD
match is one that uses the \'*\' character to represent any group of
characters\.
.LP
This is basically the same as the command line file matching expressions in
DOS or the UNIX shell, except that the \'*\' can match the \'/\' character\.
.LP
For example
.TP
.B *.gif
matches foo.gif and bar.gif
.TP
.B *.foo.com
matches www.foo.com and ftp.foo.com
.TP
.B /foo/*
matches /foo/bar.html and /foo/bar/foobar.html
.SH URL-SPECIFICATION
When specifying a host and protocol and pathname in many of the sections a
.I URL\-SPECIFICATION
can be used; this is a way of recognising a URL\.
.LP
For the purposes of this explanation a URL is considered to be made up of five
parts\.
.TP
.B proto
The protocol that is used (e.g. 'http', 'ftp').
.TP
.B host
The server hostname (e.g. 'www.gedanken.demon.co.uk').
.TP
.B port
The port number on the host (e.g. default of 80 for HTTP).
.TP
.B path
The pathname on the host (e.g. '/bar.html') or a directory name
(e\.g\. \'/foo/\')\.
.TP
.B args
Optional arguments with the URL used for CGI scripts etc.
(e\.g\. \'search=foo\')\.
.LP
For example the WWWOFFLE homepage: http://www\.gedanken\.demon\.co\.uk/wwwoffle/
The protocol is \'http\', the host is \'www\.gedanken\.demon\.co\.uk\', the port is
the default (in this case 80), and the pathname is \'/wwwoffle/\'\.
.LP
In general this is written as (proto)://(host)[:(port)]/[(path)][?(args)]
.LP
Where [] indicates an optional feature, and () indicates a user supplied name
or number\.
.LP
Some example
.I URL\-SPECIFICATION
options are the following:
.TP
.B *://*/*
Any protocol, Any host, Any port, Any path, Any args
(This is the default for options that can have a
.I <URL\-SPEC>
prefix when none is specified)\.
.TP
.B *://*/(path)
Any protocol, Any host, Any port, Named path, Any args
.TP
.B *://*/*?
Any protocol, Any host, Any port, Any path, No args
.TP
.B *://*/(path)?*
Any protocol, Any host, Any port, Named path, Any args
.TP
.B *://(host)
Any protocol, Named host, Any port, Any path, Any args
.TP
.B (proto)://*/*
Named proto, Any host, Any port, Any path, Any args
.TP
.B (proto)://(host)/*
Named proto, Named host, Any port, Any path, Any args
.TP
.B (proto)://(host):/*
Named proto, Named host, Default port, Any path, Any args
.TP
.B *://(host):(port)/*
Any protocol, Named host, Named port, Any path, Any args
.LP
The matching of the host, the path and the args use the
.I WILDCARD
matching that
is described above\. The matching of the path has the special condition that a
.I WILDCARD
of \'/*/foo\' will match \'/foo\' and \'/any/path/foo\', in other words it
matches any path prefix\.
.LP
In some sections that accept
.IR URL\-SPECIFICATION s
they can be negated by
inserting the \'!\' character before them\. This will mean that the comparison
of a URL with the
.I URL\-SPECIFICATION
will return the logically opposite value
to what would be returned without the \'!\'\. If all of the
.IR URL\-SPECIFICATION s
in a section are negated and \'*://*/*\' is added to the end then the sense of
the whole section is negated\.
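.LP
As a sketch of this last rule (the host names are hypothetical), the DontGet
section below negates every entry and ends with \'*://*/*\', so its sense is
inverted: only URLs on the two named domains would be got\.
.LP
.nf
DontGet
{
 !*://*.foo.com/*
 !*://*.bar.com/*
 *://*/*
}
.fi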
.LP
In all sections that accept
.IR URL\-SPECIFICATION s
the comparison can be made case
insensitive for the path and arguments part by inserting the \'~\' character
before it\. (The host and the protocol comparisons are always case
insensitive\.)
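.LP
For instance, with a hypothetical host, the entry below is intended to match
/Ads/banner\.gif as well as /ads/banner\.gif and /ADS/banner\.gif, because the
path comparison ignores case\.
.LP
.nf
DontGet
{
 ~http://www.foo.com/Ads/*
}
.fi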
.SH EXAMPLE
.nf
StartUp
{
    bind-ipv4 = 0.0.0.0
    bind-ipv6 = ::
    http-port = 8080
    https-port = 8443
    wwwoffle-port = 8081
    spool-dir = /var/spool/wwwoffle
    use-syslog = yes
    password =
}
Options
{
    add-info-refresh = no
    request-changed = 3600
}
SSLOptions
{
    enable-caching = no
    allow-tunnel = *:443
}
FetchOptions
{
    images = yes
    frames = yes
    iframes = yes
}
LocalHost
{
    wwwoffle.foo.com
    localhost
    127.0.0.1
    ::ffff:127.0.0.1
    ip6-localhost
    ::1
}
DontGet
[
    wwwoffle.DontGet.conf
]
LocalNet
{
    *.foo.com
}
AllowedConnectHosts
{
    *.foo.com
}
Proxy
{
    <http://foo.com/*> proxy = www-cache.foo.com:8080
}
Purge
{
    max-size = 10
    age = 28
    <http://*.bar.com/*> age = 7
}
.fi
.SH FILES
.TP
.B CONFDIR/wwwoffle.conf
The wwwoffled(8) configuration file.
.TP
.B SPOOLDIR
The WWWOFFLE spool directory.
.SH SEE ALSO
wwwoffle(1), wwwoffled(8).
.SH AUTHOR
Andrew M. Bishop 1996-2007 (amb@gedanken.demon.co.uk)