1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57
|
Intent
======
This note is about running search(1) as a daemon under Solaris
where it will be expected to service a large number of requests
per unit time.
Background
==========
The first side of a TCP connection to close it goes into the
TIME-WAIT state. From RFC 1122:
The graceful close algorithm of TCP requires
that the connection state remain defined on (at
least) one end of the connection, for a timeout
period of 2xMSL, i.e., 4 minutes. During this
period, the (remote socket, local socket) pair
that defines the connection is busy and cannot
be reused.
The request/response class of servers, such as web servers and
search(1), tend to close the connection first. From the above,
this means that such a server's socket goes into the TIME-WAIT
state.
Under sustained heavy load, the server will run out of
resources for sockets because they will be needed for new
requests faster than they are coming out of TIME-WAIT.
Solaris
=======
The Solaris TCP stack implements the quoted section of RFC 1122
above to the letter. The 4-minute wait can be considered too
long by some. To change that to, say, 1 minute, you can do:
ndd -set /dev/tcp tcp_time_wait_interval 60000
Solaris also starts ephemeral port allocation unusually high at
32768. Ephemeral ports can start at 1024. To change this, you
can do:
ndd -set /dev/tcp tcp_smallest_anon_port 1024
The above two changes will buy the server more time before it
runs out of socket resources, but under sustained heavy load,
it will still run out eventually.
The best thing to do is to ensure that any search(1) clients
you write call shutdown(2) after they have finished sending
their request. This will cause them to go into TIME-WAIT
instead of the server.
Note: this isn't a problem under Linux because, as far as I can
tell, Linux violates RFC 1122 by allowing sockets in TIME-WAIT
to be used for new connections. Despite the violation, it
works out for the best in the case of servers.
|