File: FAQ-6.html

<HTML>
<HEAD>
<TITLE>SQUID Frequently Asked Questions: Squid Log Files</TITLE>
</HEAD>
<BODY>
<A HREF="FAQ-5.html">Previous</A>
<A HREF="FAQ-7.html">Next</A>
<A HREF="FAQ.html#toc6">Table of Contents</A>
<HR>
<H2><A NAME="s6">6. Squid Log Files</A></H2>

<P>The logs are a valuable source of information about Squid workloads
and performance. The logs record not only access information, but
also system configuration errors and resource consumption (e.g.,
memory, disk space).</P>

<H2><A NAME="ss6.1">6.1 <EM>access.log</EM></A></H2>

<P>There are basically two formats for the <EM>access.log</EM> file: ``native'' and
``common.''  The
<A HREF="http://www.w3.org/pub/WWW/Daemon/User/Config/Logging.html#common-logfile-format">Common Logfile Format</A>
is used by numerous HTTP servers.  This format consists of the following
seven fields:
<PRE>
        remotehost rfc931 authuser [date] &quot;method URL&quot; status bytes
</PRE>
</P>
<P>The native format is different for different major versions of Squid.
For Squid-1.0 it is:
<PRE>
        time elapsed remotehost code/status/peerstatus bytes method URL
</PRE>
</P>

<P>For Squid-1.1, the information from the <EM>hierarchy.log</EM> was moved 
into <EM>access.log</EM>.  The format is:
<PRE>
        time elapsed remotehost code/status bytes method URL rfc931 peerstatus/peerhost
</PRE>
</P>
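<P>As a rough illustration of the Squid-1.1 native format above, the
following Python sketch splits one line into its named fields.  The
sample line, the <EM>AccessEntry</EM> type and the function name are
made up for this example and are not part of Squid:
<PRE>
```python
from typing import NamedTuple


class AccessEntry(NamedTuple):
    time: float       # Unix timestamp with millisecond resolution
    elapsed: int      # milliseconds
    remotehost: str
    code: str         # cache result code, e.g. TCP_MISS
    status: int       # HTTP status code
    nbytes: int       # the "bytes" field
    method: str
    url: str
    rfc931: str
    peerstatus: str
    peerhost: str


def parse_native_line(line):
    """Split one Squid-1.1 native access.log line into its fields."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError("expected 9 fields, got %d" % len(fields))
    code, status = fields[3].split("/", 1)
    peerstatus, peerhost = fields[8].split("/", 1)
    return AccessEntry(float(fields[0]), int(fields[1]), fields[2],
                       code, int(status), int(fields[4]), fields[5],
                       fields[6], fields[7], peerstatus, peerhost)


# Hypothetical sample line, not taken from a real log.
SAMPLE = ("876543210.123 1500 10.0.0.1 TCP_MISS/200 4321 GET "
          "http://www.example.com/ - DIRECT/www.example.com")
entry = parse_native_line(SAMPLE)
print(entry.code, entry.status, entry.peerstatus, entry.peerhost)
```
</PRE>
</P>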



<H2><A NAME="ss6.2">6.2 <EM>hierarchy.log</EM></A></H2>

<P>This logfile exists for Squid-1.0 only.  The format is
<PRE>
        [date] URL peerstatus peerhost
</PRE>
</P>


<H2><A NAME="ss6.3">6.3 <EM>store.log</EM></A></H2>

<P>The <EM>store.log</EM> consists of the following fields:</P>
<P>
<PRE>
    time       The time this entry was logged.  The value is the
               raw Unix time plus milliseconds.

    action     One of RELEASE, SWAPIN, or SWAPOUT.
               RELEASE means the object has been removed from the cache.
               SWAPOUT means the object has been saved to disk.
               SWAPIN  means the object existed on disk and has been
                       swapped into memory.

    status     The HTTP reply code.

    The following three fields are timestamps parsed from the HTTP
    reply headers.  All are expressed in Unix time.  A missing header
    is represented with -2 and an unparsable header is represented as -1.

    datehdr    The value of the HTTP Date: reply header.

    lastmod    The value of the HTTP Last-Modified: reply header.

    expires    The value of the HTTP Expires: reply header.

    type       The HTTP Content-Type reply header.

    expect-len The value of the HTTP Content-Length reply header.
               Zero if Content-Length was missing.

    real-len   The number of bytes of content actually read.  If the
               expect-len is non-zero, and not equal to the real-len,
               the object will be released from the cache.

    method     HTTP request method

    key        The cache key.  Often this is simply the URL.  Cache objects
               which never become public will have cache keys that include
               a unique integer sequence number, the request method, and
               then the URL.
</PRE>
</P>
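<P>The special values described above can be handled as in the
following Python sketch.  The function names are hypothetical; only
the meanings of -2, -1 and the length comparison come from the field
list:
<PRE>
```python
from datetime import datetime, timezone

MISSING = -2      # the header was absent from the reply
UNPARSABLE = -1   # the header was present but could not be parsed


def describe_header_time(value):
    """Interpret a datehdr, lastmod or expires value from store.log."""
    value = int(value)
    if value == MISSING:
        return "missing"
    if value == UNPARSABLE:
        return "unparsable"
    moment = datetime.fromtimestamp(value, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def length_mismatch(expect_len, real_len):
    """True when expect-len is non-zero and differs from real-len,
    the condition under which the object is released."""
    return expect_len != 0 and expect_len != real_len


print(describe_header_time(-2))
print(length_mismatch(100, 90))
```
</PRE>
</P>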



<H2><A NAME="ss6.4">6.4 Field Definitions</A></H2>

<P>These are the definitions for the various log format components:
<DL>
<DT><B>remotehost</B><DD><P>The IP address of the client host.  In Squid-1.1, if the
<EM>log_fqdn</EM> option is enabled, full hostnames will
be logged when available.</P>
<DT><B>rfc931</B><DD><P>The username associated with the client connection, determined
from an Ident (RFC 931) server running on the client host.
By default Ident lookups are not made, but may be enabled with
the <EM>ident_lookup</EM> option.</P>
<DT><B>authuser</B><DD><P>Always NULL ("-") for Squid logs.</P>
<DT><B>method</B><DD><P>GET, HEAD, POST, etc. for HTTP requests.  ICP_QUERY for ICP requests.</P>
<DT><B>URL</B><DD><P>The requested URL.</P>
<DT><B>code</B><DD><P>The ``cache result'' of the request.  This describes if the
request was a cache hit or miss, and if the object was refreshed.
See the full list of 
<A HREF="#cache-result-codes">cache result codes</A>.</P>
<DT><B>status</B><DD><P>HTTP status code: 200 for successful actions, 000 for UDP
requests, 302 for redirects, 500 for server errors, etc.
See the 
<A HREF="#http-status-codes">HTTP status codes</A>
for a complete list.</P>
<DT><B>bytes</B><DD><P>The number of bytes delivered to the client.</P>
<DT><B>peerstatus</B><DD><P>A status code that explains how the request was forwarded, either
to your peer (neighbor) caches, or directly to the origin server.</P>
<DT><B>peerhost</B><DD><P>The host where the request was forwarded to.</P>
<DT><B>time</B><DD><P>Unix timestamp (since Jan 1, 1970) with millisecond resolution.</P>
<DT><B>date</B><DD><P>HTTP date format: <CODE>dd/mmm/yyyy:hh:mm:ss TZ-offset</CODE></P>
<DT><B>elapsed</B><DD><P>The time elapsed (milliseconds) during the client connection.
For HTTP requests,
this is the time between the accept() and close() system calls
for the TCP socket.  For ICP requests, this represents the 
time between scheduling the reply message for sending and actually
sending it.</P>
</DL>
</P>
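<P>Using these definitions, a Squid-1.1 native line can be rewritten
in the common format, as in the Python sketch below.  It assumes the
nine-field native layout from section 6.1, prints dates in UTC, and
always writes "-" for <EM>authuser</EM>.  The function names and the
sample line are invented for illustration:
<PRE>
```python
from datetime import datetime, timezone


def clf_date(unix_time):
    """Format a Unix timestamp as dd/mmm/yyyy:hh:mm:ss TZ-offset (UTC)."""
    moment = datetime.fromtimestamp(float(unix_time), tz=timezone.utc)
    return moment.strftime("%d/%b/%Y:%H:%M:%S %z")


def native_to_common(line):
    """Convert one Squid-1.1 native line to a common-format line."""
    (time, elapsed, remotehost, code_status, nbytes,
     method, url, rfc931, peer) = line.split()
    status = code_status.split("/", 1)[1]
    return '%s %s - [%s] "%s %s" %s %s' % (
        remotehost, rfc931, clf_date(time), method, url, status, nbytes)


# Hypothetical sample line, not taken from a real log.
SAMPLE = ("86400.500 1500 10.0.0.1 TCP_MISS/200 4321 GET "
          "http://www.example.com/ - DIRECT/www.example.com")
print(native_to_common(SAMPLE))
```
</PRE>
</P>
<P>The <EM>elapsed</EM>, <EM>code</EM> and peer fields have no
counterpart in the common format, so they are dropped.</P>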


<H2><A NAME="cache-result-codes"></A> <A NAME="ss6.5">6.5 Cache Result Codes</A></H2>

<P>Note, <B>TCP_</B> refers to requests on the HTTP port (3128).</P>
<P>
<DL>
<DT><B>TCP_HIT</B><DD><P>A valid copy of the requested object was
in the cache.</P>
<DT><B>TCP_MISS</B><DD><P>The requested object was not in the cache.</P>
<DT><B>TCP_REFRESH_HIT</B><DD><P>The object was in the cache, but STALE.
An If-Modified-Since request was made and
a "304 Not Modified" reply was received.</P>
<DT><B>TCP_REF_FAIL_HIT</B><DD><P>The object was in the cache, but STALE.
The request to validate the object failed,
so the old (stale) object was returned.</P>
<DT><B>TCP_REFRESH_MISS</B><DD><P>The object was in the cache, but STALE.
An If-Modified-Since request was made and
the reply contained new content.</P>
<DT><B>TCP_CLIENT_REFRESH</B><DD><P>The client issued a request with the
"no-cache" pragma.</P>
<DT><B>TCP_IMS_HIT</B><DD><P>The client issued an If-Modified-Since
request and the object was in the cache
and still fresh.</P>
<DT><B>TCP_IMS_MISS</B><DD><P>The client issued an If-Modified-Since
request for a stale object.</P>
<DT><B>TCP_SWAPFAIL</B><DD><P>The object was believed to be in the cache,
but could not be accessed.</P>
<DT><B>TCP_DENIED</B><DD><P>Access was denied for this request.</P>
</DL>
</P>
<P>
<PRE>
&quot;UDP_&quot; refers to requests on the ICP port (3130)

        UDP_HIT         A valid copy of the requested object was in the cache.
        UDP_HIT_OBJ     Same as UDP_HIT, but the object data was small enough
                        to be sent in the UDP reply packet.  Saves the
                        following TCP request.
        UDP_MISS        The requested object was not in the cache.
        UDP_DENIED      Access was denied for this request.
        UDP_INVALID     An invalid request was received.
        UDP_RELOADING   The ICP request was &quot;refused&quot; because the cache is
                        busy reloading its metadata.

&quot;ERR_&quot; refers to various types of errors for HTTP requests.
</PRE>
</P>
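<P>One common use of these codes is estimating a hit ratio.  The
Python sketch below counts result codes and treats as hits those codes
whose descriptions above say an object was served from the cache.
That grouping is an assumption of this example, not a definition taken
from Squid:
<PRE>
```python
from collections import Counter

# Assumed grouping: codes described above as serving a cached object.
HIT_CODES = {
    "TCP_HIT", "TCP_REFRESH_HIT", "TCP_REF_FAIL_HIT", "TCP_IMS_HIT",
    "UDP_HIT", "UDP_HIT_OBJ",
}


def summarize(codes):
    """Return (per-code counts, fraction of requests counted as hits)."""
    counts = Counter(codes)
    total = sum(counts.values())
    hits = sum(n for code, n in counts.items() if code in HIT_CODES)
    ratio = hits / total if total else 0.0
    return counts, ratio


counts, ratio = summarize(
    ["TCP_HIT", "TCP_MISS", "TCP_REFRESH_HIT", "UDP_MISS"])
print(counts["TCP_MISS"], ratio)
```
</PRE>
</P>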


<H2><A NAME="ss6.6">6.6 Peer Status Codes</A></H2>

<P>
<PRE>
Hierarchy Data Tags

        DIRECT                  The object has been requested from the origin
                                server.
        FIREWALL_IP_DIRECT      The object has been requested from the origin
                                server because the origin host IP address is
                                inside your firewall.
        FIRST_PARENT_MISS       The object has been requested from the
                                parent cache with the fastest weighted round
                                trip time.
        FIRST_UP_PARENT         The object has been requested from the first
                                available parent in your list.
        LOCAL_IP_DIRECT         The object has been requested from the origin
                                server because the origin host IP address 
                                matched your 'local_ip' list.
        SIBLING_HIT             The object was requested from a sibling cache
                                which replied with a UDP_HIT.
        NO_DIRECT_FAIL          The object could not be requested because
                                of firewall restrictions and no parent caches
                                were available.
        NO_PARENT_DIRECT        The object was requested from the origin server
                                because no parent caches exist for the URL.
        PARENT_HIT              The object was requested from a parent cache
                                which replied with a UDP_HIT.
        SINGLE_PARENT           The object was requested from the only
                                parent cache appropriate for this URL.
        SOURCE_FASTEST          The object was requested from the origin server
                                because the 'source_ping' reply arrived first.
        PARENT_UDP_HIT_OBJ      The object was received in a UDP_HIT_OBJ reply
                                from a parent cache.
        SIBLING_UDP_HIT_OBJ     The object was received in a UDP_HIT_OBJ reply
                                from a sibling cache.
        PASSTHROUGH_PARENT      The neighbor or proxy defined in the config
                                option 'passthrough_proxy' was used.
        SSL_PARENT_MISS         The neighbor or proxy defined in the config
                                option 'ssl_proxy' was used.
        DEFAULT_PARENT          No ICP queries were sent to any parent
                                caches.  This parent was chosen because
                                it was marked as 'default' in the config
                                file.
        ROUNDROBIN_PARENT       No ICP queries were sent to any parent
                                caches.  This parent was chosen because
                                it was marked as 'round-robin' in the
                                config file and it had the lowest
                                round-robin use count.
        CLOSEST_PARENT_MISS     This parent was selected because it
                                included the lowest RTT measurement to
                                the origin server.  This only appears
                                with 'query_icmp on' set in the config
                                file.
        CLOSEST_DIRECT          The object was fetched directly from the
                                origin server because this cache measured
                                a lower RTT than any of the parent caches.
</PRE>
</P>

<P>Almost any of these may be preceded by 'TIMEOUT_' if the two-second
(default) timeout occurs waiting for all ICP replies to arrive from
neighbors.</P>
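<P>When processing the <EM>peerstatus/peerhost</EM> field it can help
to separate the 'TIMEOUT_' prefix from the tag itself.  A minimal
Python sketch, with a hypothetical function name and made-up host
names:
<PRE>
```python
TIMEOUT_PREFIX = "TIMEOUT_"


def split_peer_field(field):
    """Return (timed_out, peerstatus, peerhost) for 'peerstatus/peerhost'."""
    peerstatus, _, peerhost = field.partition("/")
    timed_out = peerstatus.startswith(TIMEOUT_PREFIX)
    if timed_out:
        # Drop the prefix so the tag matches the table above.
        peerstatus = peerstatus[len(TIMEOUT_PREFIX):]
    return timed_out, peerstatus, peerhost


print(split_peer_field("TIMEOUT_DIRECT/www.example.com"))
print(split_peer_field("PARENT_HIT/cache.example.net"))
```
</PRE>
</P>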


<H2><A NAME="http-status-codes"></A> <A NAME="ss6.7">6.7 HTTP status codes</A></H2>

<P>These are taken from
<A HREF="http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2068.txt">RFC 2068</A>.
<PRE>
100  Continue
101  Switching Protocols
200  OK
201  Created
202  Accepted
203  Non-Authoritative Information
204  No Content
205  Reset Content
206  Partial Content
300  Multiple Choices
301  Moved Permanently
302  Moved Temporarily
303  See Other
304  Not Modified
305  Use Proxy
400  Bad Request
401  Unauthorized
402  Payment Required
403  Forbidden
404  Not Found
405  Method Not Allowed
406  Not Acceptable
407  Proxy Authentication Required
408  Request Time-out
409  Conflict
410  Gone
411  Length Required
412  Precondition Failed
413  Request Entity Too Large
414  Request-URI Too Large
415  Unsupported Media Type
500  Internal Server Error
501  Not Implemented
502  Bad Gateway
503  Service Unavailable
504  Gateway Time-out
505  HTTP Version not supported
</PRE>
</P>


<H2><A NAME="swaplog"></A> <A NAME="ss6.8">6.8 <EM>cache/log</EM></A></H2>

<P>This file has a rather unfortunate name.  It is also often called the <EM>swap log</EM>.
It is a record of every cache object written to disk.  It is read when Squid starts
up to ``reload'' the cache.  If you remove this file,
you will effectively wipe out your cache contents.</P>

<P>For Squid-1.1, there are six fields:
<OL>
<LI><B>fileno</B>:
The swap file number holding the object data.  This is mapped to a pathname on your filesystem.
</LI>
<LI><B>timestamp</B>:
This is the time when the object was last verified to be current.  The time is a
hexadecimal representation of Unix time.
</LI>
<LI><B>expires</B>:
This is the value of the Expires header in the HTTP reply.  If an Expires header
was not present, this will be -2 or fffffffe.  If the Expires header was
present, but invalid (unparsable), this will be -1 or ffffffff.
</LI>
<LI><B>lastmod</B>:
Value of the HTTP reply Last-Modified header.  If missing it will be -2, 
if invalid it will be -1.
</LI>
<LI><B>size</B>:
Size of the object, including headers.
</LI>
<LI><B>url</B>:
The URL naming this object.
</LI>
</OL>
</P>
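<P>The six fields can be read with a short Python sketch such as the
one below.  It assumes the fields are separated by whitespace and that
<EM>lastmod</EM> is written in the same 32-bit hexadecimal form as
<EM>timestamp</EM> and <EM>expires</EM>.  Because the text above does
not state how <EM>fileno</EM> and <EM>size</EM> are encoded, they are
kept as plain strings.  The sample line is invented:
<PRE>
```python
def hex_to_signed32(text):
    """Decode a 32-bit hex value, so that fffffffe becomes -2."""
    if text.startswith("-"):
        return int(text)  # already written as a signed decimal
    raw = bytes.fromhex(text.zfill(8))
    return int.from_bytes(raw, "big", signed=True)


def parse_swaplog_line(line):
    """Split one cache/log line into its six fields."""
    fileno, timestamp, expires, lastmod, size, url = line.split(None, 5)
    return {
        "fileno": fileno,                        # left undecoded
        "timestamp": hex_to_signed32(timestamp),
        "expires": hex_to_signed32(expires),     # -2 missing, -1 invalid
        "lastmod": hex_to_signed32(lastmod),     # -2 missing, -1 invalid
        "size": size,                            # left undecoded
        "url": url.strip(),
    }


# Hypothetical sample line, not taken from a real swap log.
SAMPLE = "0000002a 34567890 fffffffe ffffffff 1234 http://www.example.com/"
record = parse_swaplog_line(SAMPLE)
print(record["expires"], record["lastmod"], record["url"])
```
</PRE>
</P>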


<H2><A NAME="ss6.9">6.9 Which log files can I delete safely?</A></H2>

<P>The best way to maintain Squid log files is to send the
<EM>squid</EM> process a USR1 signal.  This causes the current log
files to be closed and renamed.  You can then remove any of
the old log files.  For example, if your <EM>squid.pid</EM> file
is <EM>/usr/local/squid/logs/squid.pid</EM> (as defined in your
<EM>squid.conf</EM> file) you would do:</P>
<P>
<PRE>
        kill -USR1 `cat /usr/local/squid/logs/squid.pid`
</PRE>
</P>

<P><EM>NOTE:</EM> The <CODE>logfile_rotate</CODE>
line in <EM>squid.conf</EM> makes it generally unnecessary to delete
logfiles by hand.  Just set <CODE>logfile_rotate</CODE> to the
number of old logs you want saved.  Each time the logs are rotated,
the oldest log beyond that count will be deleted automatically.
You may find it useful to simply set
<CODE>logfile_rotate</CODE> to the number of old logs you want,
and then set up a crontab to send <EM>squid</EM> the <CODE>SIGUSR1</CODE> signal.
The following crontab entry would tell Squid to rotate the logs
every day at midnight:
<PRE>
        0 0 * * * /bin/kill -USR1 `cat /usr/local/squid/logs/squid.pid`
</PRE>
</P>

<P>The only logfile you should <B><EM>never</EM></B> delete
is the file cleverly named <CODE>log</CODE> which normally exists
in the first <CODE>cache_dir</CODE> directory.  This file contains
the metadata needed to rebuild the cache when Squid starts up.
<B><EM>Deleting this file effectively wipes out your
cache.</EM></B></P>


<H2><A NAME="ss6.10">6.10 Why do I get ERR_NO_CLIENTS_BIG_OBJ messages so often?</A></H2>

<P>This message means that the requested object was in ``Delete Behind''
mode and the user aborted the transfer.  An object will go into
``Delete Behind'' mode if either of the following is true:
<UL>
<LI>It is larger than <EM>maximum_object_size</EM></LI>
<LI>It is being fetched from a neighbor which has the <EM>proxy-only</EM> option set.</LI>
</UL>
</P>


<H2><A NAME="ss6.11">6.11 What does ERR_LIFETIME_EXP mean?</A></H2>

<P>This means that a timeout occurred while the object was being transferred.  Most
likely the retrieval of this object was very slow (or it stalled before finishing)
and the user aborted the request.  However, depending on your settings for
<EM>quick_abort</EM>, Squid may have continued to try retrieving the object.
Squid imposes a maximum amount of time on all open sockets, so after some amount
of time the stalled request was aborted and logged with an ERR_LIFETIME_EXP 
message.</P>


<H2><A NAME="ss6.12">6.12 Retrieving ``lost'' files from the cache</A></H2>

<P>
<BLOCKQUOTE>
<I>I've been asked to retrieve an object which was accidentally
destroyed at the source for recovery. 
So, how do I figure out where the things are so I can copy
them out and strip off the headers?</I>
</BLOCKQUOTE>
</P>
<P>The following method applies only to the Squid-1.1 versions:</P>
<P>Use <EM>grep</EM> to find the named object (URL) in the
<A HREF="#swaplog">cache/log</A> file.  The first field in
this file is an integer <EM>file number</EM>.</P>

<P>Then, find the file <EM>fileno-to-pathname.pl</EM> from the ``scripts''
directory of the Squid source distribution.  The usage is
<PRE>
        perl fileno-to-pathname.pl [-c squid.conf]
</PRE>

File numbers are read on stdin, and pathnames are printed on
stdout.</P>
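<P>The lookup step can also be scripted.  The Python sketch below
stands in for the <EM>grep</EM> step: it scans swap log lines and
returns the first field of every line whose last field is exactly the
wanted URL.  It assumes whitespace-separated fields with the URL last,
as listed in the cache/log section; the function name and sample lines
are hypothetical:
<PRE>
```python
def find_filenos(swaplog_lines, url):
    """Return the file numbers of all swap log entries for url."""
    matches = []
    for line in swaplog_lines:
        fields = line.split()
        if fields and fields[-1] == url:
            matches.append(fields[0])
    return matches


# Hypothetical swap log lines, not taken from a real cache.
LINES = [
    "0000002a 34567890 fffffffe ffffffff 1234 http://www.example.com/",
    "0000002b 34567891 fffffffe ffffffff 999 http://www.example.com/lost.gif",
]
for fileno in find_filenos(LINES, "http://www.example.com/lost.gif"):
    print(fileno)
```
</PRE>
</P>
<P>The printed file numbers can then be fed to
<EM>fileno-to-pathname.pl</EM> on stdin as described above.</P>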



<HR>
<A HREF="FAQ-5.html">Previous</A>
<A HREF="FAQ-7.html">Next</A>
<A HREF="FAQ.html#toc6">Table of Contents</A>
</BODY>
</HTML>