<HTML>
<HEAD>
<TITLE>SQUID Frequently Asked Questions: The Cache Manager</TITLE>
</HEAD>
<BODY>
<A HREF="FAQ-7.html">Previous</A>
<A HREF="FAQ-9.html">Next</A>
<A HREF="FAQ.html#toc8">Table of Contents</A>
<HR>
<H2><A NAME="s8">8. The Cache Manager</A></H2>

<P>Contributed by Jonathan Larmour &lt;JLarmour@origin-at.co.uk&gt;</P>

<H2><A NAME="ss8.1">8.1 What is the cache manager?</A></H2>

<P>The cache manager (<EM>cachemgr.cgi</EM>) is a CGI utility for
displaying statistics about the <EM>squid</EM> process as it runs.
The cache manager is a convenient way to manage the cache and view
statistics without logging into the server.</P>


<H2><A NAME="ss8.2">8.2 How do you set it up?</A></H2>

<P>That depends on which web server you're using.  Below you will
find instructions for configuring the CERN and Apache servers
to permit <EM>cachemgr.cgi</EM> usage.</P>
<P><EM>EDITOR'S NOTE: readers are encouraged to submit instructions
for configuration of cachemgr.cgi on other web server platforms, such
as Netscape.</EM></P>

<P>After you edit the server configuration files, you will probably
need to either restart your web server or send it a <CODE>SIGHUP</CODE> signal
to tell it to re-read its configuration files.</P>

<P>When you're done configuring your web server, you'll connect to
the cache manager with a web browser, using a URL such as:
<PRE>
        http://www.example.com/Squid/cgi-bin/cachemgr.cgi/
</PRE>
</P>


<H2><A NAME="ss8.3">8.3 Cache manager configuration for CERN httpd 3.0</A></H2>

<P>First, you should ensure that only specified workstations can access
the cache manager.  That is done in your CERN <EM>httpd.conf</EM>, not in
<EM>squid.conf</EM>.</P>
<P>
<PRE>
        Protection MGR-PROT {
                 Mask    @(workstation.example.com)
        }
</PRE>
</P>
<P>Wildcards are acceptable, IP addresses are acceptable, and additional
hosts can be added as a comma-separated list of IP addresses. There
are many other ways to set up protection; your server documentation has
details.</P>

<P>You also need to add:
<PRE>
        Protect         /Squid/*        MGR-PROT
        Exec            /Squid/cgi-bin/*.cgi    /usr/local/squid/bin/*.cgi
</PRE>

This marks the script as executable to those in <CODE>MGR-PROT</CODE>.</P>


<H2><A NAME="ss8.4">8.4 Cache manager configuration for Apache</A></H2>

<P>First, make sure the cgi-bin directory you're using is listed with a
<CODE>ScriptAlias</CODE> in your Apache <EM>srm.conf</EM> file like this:
<PRE>
        ScriptAlias /Squid/cgi-bin/ /usr/local/squid/cgi-bin/
</PRE>

It's probably a <B>bad</B> idea to <CODE>ScriptAlias</CODE>
the entire <EM>/usr/local/squid/bin/</EM> directory where all the
Squid executables live.</P>
<P>Next, you should ensure that only specified workstations can access
the cache manager.  That is done in your Apache <EM>access.conf</EM>,
not in <EM>squid.conf</EM>.  At the bottom of the <EM>access.conf</EM>
file, insert:
<PRE>
        &lt;Location /Squid/cgi-bin/cachemgr.cgi&gt;
        order deny,allow
        deny from all
        allow from workstation.example.com
        &lt;/Location&gt;
</PRE>
</P>
<P>You can have more than one allow line, and you can allow
domains or networks.</P>
<P> 
Alternately, <EM>cachemgr.cgi</EM> can be password-protected.  You'd
add the following to <EM>access.conf</EM>:</P>
<P>
<PRE>
        &lt;Location /Squid/cgi-bin/cachemgr.cgi&gt;
        AuthUserFile /path/to/password/file
        AuthGroupFile /dev/null
        AuthName User/Password Required
        AuthType Basic
        &lt;Limit GET&gt;
        require user cachemanager
        &lt;/Limit&gt;
        &lt;/Location&gt;
</PRE>
</P>
<P>Consult the Apache documentation for information on using <EM>htpasswd</EM>
to set a password for this ``user.''</P>


<H2><A NAME="ss8.5">8.5 Cache manager ACLs in <EM>squid.conf</EM></A></H2>

<P>The default cache manager access configuration in <EM>squid.conf</EM> is:</P>
<P>
<PRE>
        acl manager proto cache_object
        acl localhost src 127.0.0.1/255.255.255.255
        acl all src 0.0.0.0/0.0.0.0
</PRE>
</P>
<P>With the following rules:</P>
<P>
<PRE>
        http_access deny manager !localhost
        http_access allow all
</PRE>
</P>

<P>The first ACL is the most important as the cache manager program
interrogates squid using a special <CODE>cache_object</CODE> protocol.
Try it yourself by doing:</P>
<P>
<PRE>
        telnet mycache.example.com 3128
        GET cache_object://mycache.example.com/info HTTP/1.0
</PRE>
</P>
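<P>The telnet session above can also be scripted. The following is a
minimal Python sketch, not part of Squid: the function names are made up
for illustration, and the host and port are the placeholders from the
example. It sends the same request line, followed by the empty line that
ends an HTTP/1.0 request, and returns whatever the cache sends back.</P>

```python
import socket


def build_cache_object_request(cache_host, page="info"):
    """Build the request that the telnet example types by hand."""
    # An HTTP/1.0 request ends with an empty line, hence the doubled CRLF.
    request = "GET cache_object://{}/{} HTTP/1.0\r\n\r\n".format(cache_host, page)
    return request.encode("ascii")


def fetch_cache_object(cache_host, port=3128, page="info", timeout=10.0):
    """Send the request to the cache and return the raw reply bytes."""
    with socket.create_connection((cache_host, port), timeout=timeout) as sock:
        sock.sendall(build_cache_object_request(cache_host, page))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                # The cache closes the connection when the reply is complete.
                break
            chunks.append(data)
    return b"".join(chunks)
```

<P>Calling <CODE>fetch_cache_object("mycache.example.com")</CODE> from a
host that the ACLs do not permit should return an access-denied reply
rather than the statistics page.</P>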
<P>The default ACLs say that if the request is for a
<CODE>cache_object</CODE>, and it isn't the local host, then deny
access; otherwise allow access.</P>

<P>In fact, only allowing localhost access means that on the
initial <EM>cachemgr.cgi</EM> form you can only specify the cache
host as <CODE>localhost</CODE>. We recommend the following:</P>
<P>
<PRE>
        acl manager proto cache_object
        acl localhost src 127.0.0.1/255.255.255.255
        acl example src 123.123.123.123/255.255.255.255
        acl all src 0.0.0.0/0.0.0.0
</PRE>
</P>
<P>Where <CODE>123.123.123.123</CODE> is the IP address of your web server.
Then modify the rules like this:</P>
<P>
<PRE>
        http_access allow manager localhost
        http_access allow manager example
        http_access deny manager
        http_access allow all
</PRE>

If you're using <EM>miss_access</EM>, then don't forget to also add
a <EM>miss_access</EM> rule for the cache manager:
<PRE>
        miss_access allow manager
</PRE>
</P>


<P>The default ACLs assume that your web server is on the same machine
as <EM>squid</EM>. Remember that the connection from the cache
manager program to squid originates at the web server, not the
browser. So if your web server lives somewhere else, you should
make sure that the IP address of the web server that has <EM>cachemgr.cgi</EM>
installed on it is in the <CODE>example</CODE> ACL above.</P>
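<P>To see how the recommended rules behave, here is a small Python model
of them. It is only a sketch under two assumptions: that
<EM>http_access</EM> lines are checked in order and the first line whose
ACLs all match decides the outcome, and that a request can be reduced to
a protocol and a source address. The dictionary layout and function
names are invented for this illustration, and the masks are written as
prefix lengths (/32 and /0) instead of the dotted form used in
<EM>squid.conf</EM>.</P>

```python
import ipaddress


def src_acl(network):
    """Return a matcher for the request's source address."""
    net = ipaddress.ip_network(network)
    return lambda request: ipaddress.ip_address(request["src"]) in net


# The ACLs from the recommended configuration.
ACLS = {
    "manager": lambda request: request["proto"] == "cache_object",
    "localhost": src_acl("127.0.0.1/32"),
    "example": src_acl("123.123.123.123/32"),  # address of the web server
    "all": src_acl("0.0.0.0/0"),
}

# The http_access lines, in the order they appear in squid.conf.
RULES = [
    ("allow", ["manager", "localhost"]),
    ("allow", ["manager", "example"]),
    ("deny", ["manager"]),
    ("allow", ["all"]),
]


def http_access(request):
    """Return the action of the first rule whose ACLs all match."""
    for action, acl_names in RULES:
        if all(ACLS[name](request) for name in acl_names):
            return action
    # Never reached with these rules because "all" matches everything;
    # returning "deny" here is an assumption of this sketch.
    return "deny"
```

<P>With this model, a <CODE>cache_object</CODE> request from
<CODE>123.123.123.123</CODE> is allowed, the same request from any other
remote address is denied, and ordinary requests are allowed from
anywhere.</P>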

<P>Always be sure to send a <CODE>SIGHUP</CODE> signal to <EM>squid</EM>
any time you change the <EM>squid.conf</EM> file.</P>


<H2><A NAME="ss8.6">8.6 Why does it say I need a password and a URL?</A></H2>

<P>If you ``drop'' the list box and browse it, you will see that the
password is only required to shut down the cache, and the URL is
required to refresh an object (i.e., retrieve it from its original
source again).  Otherwise these fields can be left blank:  a password
is not required to obtain access to the informational aspects of
<EM>cachemgr.cgi</EM>.</P>


<H2><A NAME="ss8.7">8.7 I want to shut down the cache remotely. What's the password?</A></H2>

<P>See the <CODE>cachemgr_passwd</CODE> directive in <EM>squid.conf</EM>.</P>


<H2><A NAME="ss8.8">8.8 How do I make the cache host default to <EM>my</EM> cache?</A></H2>

<P>Edit Makefile.in. Look at the line
<PRE>
        HOST_OPT        = # -DCACHEMGR_HOSTNAME=&quot;getfullhostname()&quot;
</PRE>

If the webserver that <EM>cachemgr.cgi</EM> runs from is the same
machine as Squid runs on, just remove the <CODE>#</CODE>.  If your web server
is somewhere else use:
<PRE>
        HOST_OPT        = -DCACHEMGR_HOSTNAME=\&quot;mycache.example.com\&quot;
</PRE>
</P>
<P>If you change this, you will need to recompile and reinstall
cachemgr.cgi before the changes take effect.</P>


<H2><A NAME="ss8.9">8.9 What's the difference between Squid TCP connections and Squid UDP connections?</A></H2>

<P>Browsers and caches use TCP connections to retrieve web objects
from web servers or caches.  UDP connections are used when another
cache using you as a sibling or parent wants to find out if you
have an object in your cache that it's looking for.  The UDP
connections are ICP queries.</P>
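<P>For readers curious what such a UDP query looks like on the wire, the
Python sketch below packs one ICP query message. The layout is taken
from the ICP version 2 specification (RFC 2186), not from this FAQ: a
20-byte header, then a 4-byte requester address and the URL terminated
by a zero byte. The function name is made up, and the address fields are
simply left as zero here.</P>

```python
import struct

ICP_OP_QUERY = 1      # opcode for a query, per RFC 2186
ICP_VERSION = 2       # ICP version 2
ICP_HEADER_SIZE = 20  # opcode, version, length, request number,
                      # options, option data, sender host address


def build_icp_query(url, request_number=1):
    """Pack a single ICP query for the given URL."""
    # Query payload: requester host address (zero here), then the
    # URL as a null-terminated string.
    payload = struct.pack("!I", 0) + url.encode("ascii") + b"\x00"
    message_length = ICP_HEADER_SIZE + len(payload)
    header = struct.pack(
        "!BBHIIII",
        ICP_OP_QUERY,
        ICP_VERSION,
        message_length,
        request_number,
        0,  # options
        0,  # option data
        0,  # sender host address
    )
    return header + payload
```

<P>A neighbor cache would send these bytes in one UDP datagram to the
ICP port of the cache it is asking, and match the reply to the query by
its request number.</P>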


<H2><A NAME="ss8.10">8.10 It says the storage expiration will happen in 1970!</A></H2>

<P>Don't worry. The default (and sensible) behavior of <EM>squid</EM>
is to expire an object when it happens to overwrite it.  It doesn't
explicitly garbage collect (unless you tell it to in other ways).</P>


<H2><A NAME="ss8.11">8.11 What do the Meta Data entries mean?</A></H2>

<P>
<DL>

<DT><B>StoreEntry</B><DD><P>Entry describing an object in the cache.</P>

<DT><B>IPCacheEntry</B><DD><P>An entry in the DNS cache.</P>

<DT><B>Hash link</B><DD><P>Link in the cache hash table structure.</P>

<DT><B>URL strings</B><DD><P>The strings of the URLs themselves that map to
an object number in the cache, allowing access to the
StoreEntry.</P>

</DL>
</P>

<P>Basically, these are just like the <CODE>log</CODE> file in your cache directory.</P>
<P>
<DL>

<DT><B>Pool MemObject structures</B><DD><P>Info about objects currently in memory
(e.g., in the process of being transferred).</P>

<DT><B>Pool for Request structures</B><DD><P>Information about each request as it happens.</P>

<DT><B>Pool for in-memory object</B><DD><P>Space for object data as it is retrieved.</P>

</DL>
</P>


<H2><A NAME="huge-memory-pool"></A> <A NAME="ss8.12">8.12 The pool for in-memory objects is <B>huge</B>, and it doesn't get smaller!  Is this a memory leak?</A></H2>


<P>No. This pool only grows, it doesn't shrink. It reflects the
largest object cached by <EM>squid</EM> in its lifetime. If you don't
want it to be so large, reduce your <CODE>cache_mem</CODE> and object
size limits for gopher, http and ftp in <EM>squid.conf</EM>.</P>


<H2><A NAME="ss8.13">8.13 The ``Total accounted'' field in the meta data isn't the same as the size of my <EM>squid</EM>!</A></H2>

<P>If it's close to the size mentioned, don't worry. If <EM>squid</EM>
is much larger than this field, it is probably a memory leak, and
all you can do is watch for new patches and occasionally restart
<EM>squid</EM>.</P>

<P>If <EM>squid</EM> is much smaller than this field, run for cover!
Something is very wrong, and you should probably restart <EM>squid</EM>.</P>


<H2><A NAME="ss8.14">8.14 In the utilization section, what is <CODE>Other</CODE>?</A></H2>


<P><CODE>Other</CODE> is a default category to track objects which
don't fall into one of the defined categories.</P>


<H2><A NAME="ss8.15">8.15 In the utilization section, why is the <CODE>Transfer KB/sec</CODE> column always zero?</A></H2>

<P>This column contains gross estimations of data transfer rates
averaged over the entire time the cache has been running.  These
numbers are unreliable and mostly useless.</P>


<H2><A NAME="ss8.16">8.16 In the utilization section, what is the <CODE>Object Count</CODE>?</A></H2>

<P>The number of objects of that type in the cache right now.</P>


<H2><A NAME="ss8.17">8.17 In the utilization section, what is the <CODE>Max/Current/Min KB</CODE>?</A></H2>

<P>These are the largest total size the objects of this type have grown
to, their current total size, and the smallest total size they have
shrunk to.</P>


<H2><A NAME="ss8.18">8.18 What is the <CODE>I/O</CODE> section about?</A></H2>

<P>These are histograms on the number of bytes read from the network
per <CODE>read(2)</CODE> call.  Somewhat useful for determining
maximum buffer sizes.</P>


<H2><A NAME="ss8.19">8.19 What is the <CODE>Objects</CODE> section for?</A></H2>

<P><B><EM>Warning:</EM></B> this will download to your browser
a list of every URL in the cache and statistics about it. It can
be very, very large.  <B><EM>Sometimes it will be larger than
the amount of available memory in your client!</EM></B> You
probably don't need this information anyway.</P>


<H2><A NAME="ss8.20">8.20 What is the <CODE>VM Objects</CODE> section for?</A></H2>

<P><CODE>VM Objects</CODE> are the objects which are in Virtual Memory.
These are objects which are currently being retrieved and
those which were kept in memory for fast access (accelerator
mode).</P>


<H2><A NAME="ss8.21">8.21 What does <CODE>AVG RTT</CODE> mean?</A></H2>

<P>Average Round Trip Time. This is the average time between sending
an ICP ping and receiving the reply.</P>


<H2><A NAME="ss8.22">8.22 In the IP cache section, what's the difference between a hit, a negative hit and a miss?</A></H2>


<P>A HIT means that the name was found in the cache. A
MISS means that it wasn't found in the cache. A negative hit
means that it was found in the cache, but the cached answer is
that it doesn't exist.</P>


<H2><A NAME="ss8.23">8.23 What do the IP cache contents mean anyway?</A></H2>


<P>The hostname is the name that was requested to be resolved.</P>

<P>For the <CODE>Flags</CODE> column:</P>
<P>
<UL>
<LI><CODE>C</CODE> Means positively cached.</LI>
<LI><CODE>N</CODE> Means negatively cached.</LI>
<LI><CODE>P</CODE> Means the request is pending being dispatched.</LI>
<LI><CODE>D</CODE> Means the request has been dispatched and we're waiting for an answer.</LI>
<LI><CODE>L</CODE> Means it is a locked entry because it represents a parent or sibling.</LI>
</UL>
</P>
<P>The <CODE>TTL</CODE> column represents ``Time To Live'' (i.e., how long
the cache entry is valid).  (May be negative if the document has
expired.)</P>

<P>The <CODE>N</CODE> column is the number of IP addresses from which
the cache has documents.</P>

<P>The rest of the line lists all the IP addresses that have been associated
with that IP cache entry.</P>
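<P>If you post-process the IP cache listing with a script, the
<CODE>Flags</CODE> column can be expanded with a simple lookup. This is
a minimal Python sketch; the table restates the list above, and the
function name is invented for the example.</P>

```python
# Meanings of the single-letter flags, as listed above.
FLAG_MEANINGS = {
    "C": "positively cached",
    "N": "negatively cached",
    "P": "pending, not yet dispatched",
    "D": "dispatched, waiting for an answer",
    "L": "locked (represents a parent or sibling)",
}


def describe_flags(flags):
    """Expand a Flags value such as "DL" into readable descriptions."""
    return [FLAG_MEANINGS.get(letter, "unknown flag " + letter) for letter in flags]
```

<P>For instance, <CODE>describe_flags("DL")</CODE> yields the
descriptions for a dispatched request on a locked entry.</P>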



<H2><A NAME="analyze-memory-usage"></A> <A NAME="ss8.24">8.24 How do I analyze memory usage from <EM>cachemgr.cgi</EM>'s output?</A></H2>


<P>Look at your <EM>cachemgr.cgi</EM> <CODE>Cache
Information</CODE> page.  For example:
<PRE>
        Memory usage for squid via mallinfo():
               Total space in arena:   94687 KB
               Ordinary blocks:        32019 KB 210034 blks
               Small blocks:           44364 KB 569500 blks
               Holding blocks:             0 KB   5695 blks
               Free Small blocks:       6650 KB
               Free Ordinary blocks:   11652 KB
               Total in use:           76384 KB 81%
               Total free:             18302 KB 19%

        Meta Data:
        StoreEntry                246043 x 64 bytes =  15377 KB
        IPCacheEntry              971 x   88 bytes  =     83 KB
        Hash link                 2 x   24 bytes    =      0 KB
        URL strings                                 =  11422 KB
        Pool MemObject structures 514 x  144 bytes  =     72 KB (    70 free)
        Pool for Request structur 516 x 4380 bytes  =   2207 KB (  2121 free)
        Pool for in-memory object 6200 x 4096 bytes =  24800 KB ( 22888 free)
        Pool for disk I/O         242 x 8192 bytes =   1936 KB (  1888 free)
        Miscellaneous                              =   2600 KB
        total Accounted                            =  58499 KB
</PRE>
</P>

<P>First note that <CODE>mallinfo()</CODE> reports 94M in ``arena.''  This
is pretty close to what <EM>top</EM> says (97M).</P>

<P>Of that 94M, 81% (76M) is actually being used at the moment.  The
rest has been freed, or pre-allocated by <CODE>malloc(3)</CODE>
and not yet used.</P>

<P>Of the 76M in use, we can account for 58.5M (76%).  There are some
calls to <CODE>malloc(3)</CODE> for which we can't account.</P>

<P>The <CODE>Meta Data</CODE> list gives the breakdown of where the
accounted memory has gone.  45% has gone to <CODE>StoreEntry</CODE>
and URL strings.  Another 42% has gone to buffers that hold objects
in VM while they are fetched and relayed to the clients (<CODE>Pool
for in-memory object</CODE>).</P>
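<P>The arithmetic behind these figures can be reproduced from the sample
output. The Python sketch below only uses numbers copied from the
listing above; the variable and function names are made up for
illustration.</P>

```python
# Figures copied from the sample "Cache Information" output, in KB.
ARENA_KB = 94687
IN_USE_KB = 76384
ACCOUNTED_KB = 58499
STORE_ENTRY_KB = 15377
URL_STRINGS_KB = 11422
IN_MEMORY_POOL_KB = 24800


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def entries_to_kb(count, size_in_bytes):
    """Size of count structures in whole KB, as shown in the Meta Data lines."""
    return count * size_in_bytes // 1024


print("in use, of arena:          %.1f%%" % percent(IN_USE_KB, ARENA_KB))
print("accounted, of in use:      %.1f%%" % percent(ACCOUNTED_KB, IN_USE_KB))
print("StoreEntry + URL strings:  %.1f%%"
      % percent(STORE_ENTRY_KB + URL_STRINGS_KB, ACCOUNTED_KB))
print("in-memory object pool:     %.1f%%"
      % percent(IN_MEMORY_POOL_KB, ACCOUNTED_KB))
print("StoreEntry KB:             %d" % entries_to_kb(246043, 64))
```

<P>The same <CODE>entries_to_kb</CODE> calculation applies to every
<CODE>count x size</CODE> line in the listing, which is a quick way to
check which structure dominates on your own cache.</P>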

<P>The pool sizes are specified by <EM>squid.conf</EM> parameters.
In version 1.0, these pools are somewhat broken:  we keep a stack
of unused pages instead of freeing the block.  In the <CODE>Pool
for in-memory object</CODE>, the unused stack size is 1/2 of
<CODE>cache_mem</CODE>.  The <CODE>Pool for disk I/O</CODE> is
hardcoded at 200.  For <CODE>MemObject</CODE> and <CODE>Request</CODE>
it's 1/8 of your system's <CODE>FD_SETSIZE</CODE> value.</P>

<P>If you need to lower your process size, we recommend lowering the
max object sizes in the 'http', 'ftp' and 'gopher' config lines.
You may also want to lower <CODE>cache_mem</CODE> to suit your
needs. But if you make <CODE>cache_mem</CODE> too low, then some
objects may not get saved to disk during high-load periods.  Newer
Squid versions allow you to set <CODE>memory_pools off</CODE> to
disable the free memory pools.</P>


<H2><A NAME="ss8.25">8.25 What is the fqdncache and how is it different from the ipcache?</A></H2>

<P>IPCache contains data for the Hostname to IP-Number mapping, and
FQDNCache does it the other way round.  For example:</P>
<P><EM>IP Cache Contents:</EM>
<PRE>
        Hostname                      Flags lstref    TTL  N [IP-Number]
        gorn.cc.fh-lippe.de               C       0  21581 1 193.16.112.73
        lagrange.uni-paderborn.de         C       6  21594 1 131.234.128.245
        www.altavista.digital.com         C      10  21299 4 204.123.2.75  ...
        2/ftp.symantec.com                DL   1583 -772855 0  
        
        Flags:  C --&gt; Cached
                D --&gt; Dispatched
                N --&gt; Negative Cached
                L --&gt; Locked
        lstref: Time since last use
        TTL:    Time-To-Live until information expires
        N:      Count of addresses
</PRE>
</P>

<P><EM>FQDN Cache Contents:</EM>
<PRE>
        IP-Number                    Flags    TTL N Hostname
        130.149.17.15                    C -45570 1 andele.cs.tu-berlin.de
        194.77.122.18                    C -58133 1 komet.teuto.de
        206.155.117.51                   N -73747 0
        
        Flags:  C --&gt; Cached
                D --&gt; Dispatched
                N --&gt; Negative Cached
                L --&gt; Locked
        TTL:    Time-To-Live until information expires
        N:      Count of names
</PRE>
</P>


<H2><A NAME="ss8.26">8.26 What does ``Page faults with physical i/o: 4897'' mean?</A></H2>

<P>This question was asked on the <EM>squid-users</EM> mailing list, to which
there were three excellent replies.</P>

<P>by 
<A HREF="mailto:JLarmour@origin-at.co.uk">Jonathan Larmour</A></P>

<P>You get a ``page fault'' when your OS tries to access something in memory
which is actually swapped to disk. The term ``page fault'' while correct at
the kernel and CPU level, is a bit deceptive to a user, as there's no
actual error - this is a normal feature of operation.</P>

<P>Also, this doesn't necessarily mean your squid is swapping by that much.
Most operating systems also implement paging for executables, so that only
sections of the executable which are actually used are read from disk into
memory. Also, whenever squid needs more memory, the fact that the memory
was allocated will show up in the page faults.</P>

<P>However, if the number of faults is unusually high, and getting bigger,
this could mean that squid is swapping. Another way to verify this is using
a program called ``vmstat'' which is found on most UNIX platforms. If you run
this as ``vmstat 5'' this will update a display every 5 seconds. This can
tell you if the system as a whole is swapping a lot (see your local man
page for vmstat for more information).</P>

<P>It is very bad for squid to swap, as every single request will be blocked
until the requested data is swapped in. It is better to tweak the <EM>cache_mem</EM>
and/or <EM>memory_pools</EM> setting in squid.conf, or switch to the NOVM versions
of squid, than allow this to happen.</P>

<P>by 
<A HREF="mailto:peter@spinner.dialix.com.au">Peter Wemm</A></P>

<P>There are two different operations at work: paging and swapping.  Paging
is when individual pages are shuffled (either discarded or swapped
to/from disk), while ``swapping'' <EM>generally</EM> means the entire
process got sent to/from disk.</P>

<P>Needless to say, swapping a process is a pretty drastic event, and usually 
only reserved for when there's a memory crunch and paging out cannot free 
enough memory quickly enough.  Also, there's some variation on how 
swapping is implemented in OS's.  Some don't do it at all or do a hybrid 
of paging and swapping instead.</P>

<P>As you say, paging out doesn't necessarily involve disk IO, eg: text (code)
pages are read-only and can simply be discarded if they are not used (and
reloaded if/when needed).  Data pages are also discarded if unmodified, and
paged out if there's been any changes.  Allocated memory (malloc) is always
saved to disk since there's no executable file to recover the data from.
mmap() memory is variable..  If it's backed from a file, it uses the same
rules as the data segment of a file - ie: either discarded if unmodified or
paged out.</P>

<P>There's also ``demand zeroing'' of pages as well that cause faults..  If you
malloc memory and it calls brk()/sbrk() to allocate new pages, the chances
are that you are allocated demand zero pages.  Ie: the pages are not
``really'' attached to your process yet, but when you access them for the
first time, the page fault causes the page to be connected to the process
address space and zeroed - this saves unnecessary zeroing of pages that are
allocated but never used. </P>

<P>The ``page faults with physical IO'' comes from the OS via getrusage(). It's
highly OS dependent on what it means.  Generally, it means that the process
accessed a page that was not present in memory (for whatever reason) and
there was disk access to fetch it.  Many OS's load executables by demand
paging as well, so the act of starting squid implicitly causes page faults
with disk IO - however, many (but not all) OS's use ``read ahead'' and
``prefault'' heuristics to streamline the loading.  Some OS's maintain
``intent queues'' so that pages can be selected as pageout candidates ahead
of time.  When (say) squid touches a freshly allocated demand zero page and
one is needed, the OS can page out one of the candidates on the spot, 
causing a 'fault with physical IO' with demand zeroing of allocated memory 
which doesn't happen on many other OS's.  (The other OS's generally put 
the process to sleep while the pageout daemon finds a page for it).</P>
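<P>You can look at the same counters for any process of your own.
The Python sketch below calls <CODE>getrusage()</CODE> through the
standard <CODE>resource</CODE> module (Unix only) and reports the
process's own fault counts. It assumes that the figure Squid prints
corresponds to the <CODE>ru_majflt</CODE> field, which counts page
faults that required I/O; the function and key names are invented for
this example.</P>

```python
import resource


def page_fault_counts():
    """Return this process's page fault counters from getrusage()."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        # Faults that needed disk access to bring the page in.
        "with_physical_io": usage.ru_majflt,
        # Faults satisfied without disk access (e.g. demand-zero pages).
        "without_physical_io": usage.ru_minflt,
    }


counts = page_fault_counts()
print("Page faults with physical i/o:", counts["with_physical_io"])
print("Page faults without physical i/o:", counts["without_physical_io"])
```

<P>As noted above, what each counter includes is highly OS dependent, so
compare values only on the same system.</P>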

<P>The meaning of ``swapping'' varies.  On FreeBSD for example, swapping out is
implemented as unlocking upages, kernel stack, PTD etc for aggressive
pageout with the process.  The only thing left of the process in memory is
the 'struct proc'.  The FreeBSD paging system is highly adaptive and can
resort to paging in a way that is equivalent to the traditional swapping
style operation (ie: entire process).  FreeBSD also tries stealing pages
from active processes in order to make space for disk cache.  I suspect
this is why setting 'memory_pools off' on the non-NOVM squids on FreeBSD is
reported to work better - the VM/buffer system could be competing with
squid to cache the same pages.  It's a pity that squid cannot use mmap() to
do file IO on the 4K chunks in its memory pool (I can see that this is not
a simple thing to do though, but that won't stop me wishing. :-).</P>

<P>by 
<A HREF="mailto:webadm@info.cam.ac.uk">John Line</A></P>

<P>The comments so far have been about what paging/swapping figures mean in
a ``traditional'' context, but it's worth bearing in mind that on some systems
(Sun's Solaris 2, at least), the virtual memory and filesystem handling are 
unified and what a user process sees as reading or writing a file, the system 
simply sees as paging something in from disk or a page being updated so it 
needs to be paged out. (I suppose you could view it as similar to the operating 
system memory-mapping the files behind-the-scenes.)</P>

<P>The effect of this is that on Solaris 2, paging figures will also include file 
I/O. Or rather, the figures from vmstat certainly appear to include file I/O, 
and I presume (but can't quickly test) that figures such as those quoted by 
Squid will also include file I/O. </P>

<P>To confirm the above (which represents an impression from what I've read and 
observed, rather than 100% certain facts...), using an otherwise idle Sun Ultra
1 system I just tried using cat (small, shouldn't need to page) to copy
(a) one file to another, (b) a file to /dev/null, (c) /dev/zero to a file, and
(d) /dev/zero to /dev/null (interrupting the last two with control-C after a
while!), while watching with vmstat. 300-600 page-ins or page-outs per second
when reading or writing a file (rather than a device), essentially zero in
other cases (and when not cat-ing).</P>

<P>So ... beware assuming that all systems are similar and that paging figures 
represent *only* program code and data being shuffled to/from disk - they 
may also include the work in reading/writing all those files you were 
accessing...</P>

<H3>Ok, so what is unusually high?</H3>

<P>You'll probably want to compare the number of page faults to the number of
HTTP requests.  If this ratio is close to or exceeds&nbsp;1, then
Squid is paging too much.</P>
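<P>As a rough illustration of that rule of thumb, the Python sketch
below computes the ratio from two counters. The function names are made
up, the request count in the usage note is hypothetical, and the default
threshold of 1.0 is an assumption; since ``close to 1'' is a judgment
call, you may prefer a lower value.</P>

```python
def page_fault_ratio(page_faults, http_requests):
    """Page faults with physical I/O per HTTP request served."""
    if http_requests == 0:
        raise ValueError("no HTTP requests served yet")
    return page_faults / http_requests


def paging_too_much(page_faults, http_requests, threshold=1.0):
    """Apply the rule of thumb: a ratio at or above the threshold is bad."""
    return page_fault_ratio(page_faults, http_requests) >= threshold
```

<P>For example, 4897 page faults against a hypothetical 250000 HTTP
requests gives a ratio of about 0.02, which is nowhere near the danger
zone.</P>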




<HR>
<A HREF="FAQ-7.html">Previous</A>
<A HREF="FAQ-9.html">Next</A>
<A HREF="FAQ.html#toc8">Table of Contents</A>
</BODY>
</HTML>