<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.52
     from spec on 25 November 2000 -->

<TITLE>Exim Specification - 3. How Exim delivers mail</TITLE>
</HEAD>
<body bgcolor="#FFFFFF" text="#00005A" link="#FF6600" alink="#FF9933" vlink="#990000">
Go to the <A HREF="spec_1.html">first</A>, <A HREF="spec_2.html">previous</A>, <A HREF="spec_4.html">next</A>, <A HREF="spec_59.html">last</A> section, <A HREF="spec_toc.html">table of contents</A>.
<P><HR><P>


<H1><A NAME="SEC10" HREF="spec_toc.html#TOC10">3. How Exim delivers mail</A></H1>

<P>



<H2><A NAME="SEC11" HREF="spec_toc.html#TOC11">3.1 Philosophy</A></H2>

<P>
Exim is designed to work efficiently on systems that are permanently connected
to the Internet and are handling a general mix of mail. In such circumstances,
most messages can be delivered immediately. Consequently, Exim does not
maintain independent queues of messages for specific domains or hosts, though
it does try to send several messages in a single SMTP connection after a host
has been down, and it also maintains per-host retry information.

</P>



<H2><A NAME="SEC12" HREF="spec_toc.html#TOC12">3.2 Message reception</A></H2>

<P>
<A NAME="IDX30"></A>
<A NAME="IDX31"></A>
When Exim receives a message, it writes two files in its spool directory. The
first contains the <EM>envelope</EM> information, the current status of the message,
and the headers, while the second contains the body of the message.

</P>
<P>
The envelope information consists of the address of the message's sender and
the address(es) of the recipient(s). This information is entirely separate from
any addresses contained in the headers. The status of the message includes a
list of recipients who have already received the message. The format of the
first spool file is described in chapter 56.

</P>
<P>
<font color=green>
Address rewriting that is specified in the rewrite section of the configuration
(see chapter 34) is done once and for all on incoming addresses,
both in the header and the envelope, at the time the message is received. If
during the course of delivery additional addresses are generated (for example,
via aliasing), these new addresses get rewritten as soon as they are generated.
At the time a message is actually delivered (transported) further rewriting can
take place; because this is a transport option, it can be different for
different forms of delivery. It is also possible to specify the addition or
removal of certain headers at the time the message is delivered (see chapters
14 and 20).
</font>

</P>
<P>
<A NAME="IDX32"></A>
<A NAME="IDX33"></A>
Every message handled by Exim is given a <EM>message id</EM> which is sixteen
characters long. It is divided into three parts, separated by hyphens. Each
part is a sequence of letters and digits, representing a number in base 62:

</P>

<UL>

<LI>

The first six characters are the time the message was received, as a number of
seconds since the start of the Unix epoch -- the normal Unix way of
representing a time.
If the clock goes backwards (due to resetting) in a process that is receiving
more than one message, the later time is retained.

<LI>

After the first hyphen, the next six characters are the id of the process that
received the message.

<LI>

The final two characters, after the second hyphen, are used to ensure
uniqueness of the id. There are two different formats:


<OL>

<LI>

<A NAME="IDX34"></A>
If the <EM>localhost_number</EM> option is not set, uniqueness is required only
within the local host. This portion of the id is `00' except when a process
receives more than one message in a single second, when the number is
incremented for each additional message.

<LI>

If the <EM>localhost_number</EM> option is set, uniqueness among a set of hosts is
required. This portion of the id is set to the base 62 encoding of

<PRE>
&#60;<EM>sequence number</EM>&#62; * 256 + &#60;<EM>host number</EM>&#62;
</PRE>

where &#60;<EM>sequence number</EM>&#62; is the count of messages received by the current
process within the current second. Two base 62 characters can represent values
up to 3843 (62 * 62 - 1), so with a maximum host number of 255 the sequence
number can be at most 14 (14 * 256 + 255 = 3839). If this
limit is reached, a delay of one second is imposed before reading the next
message, in order to allow the clock to tick and the sequence number to get
reset.
</OL>

</UL>
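
<P>
As an illustration of the layout described above, here is a minimal Python
sketch (not Exim's own code) that assembles an id from its three parts. The
names <TT>to_base62</TT> and <TT>make_message_id</TT> are hypothetical, and the
digit ordering (digits, then upper case letters, then lower case letters) is an
assumption, since this chapter does not specify it.

</P>
<PRE>
```python
# Assumed digit ordering for base 62: 0-9, then A-Z, then a-z.
BASE62_DIGITS = (
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
)

# Largest value that fits in two base 62 characters, and the largest
# sequence number that still fits when the host number is 255.
MAX_TWO_CHARS = 62 * 62 - 1
MAX_SEQUENCE_WITH_HOST = (MAX_TWO_CHARS - 255) // 256


def to_base62(value, width):
    """Encode a non-negative integer as exactly width base 62 characters."""
    chars = []
    for _ in range(width):
        value, digit = divmod(value, 62)
        chars.append(BASE62_DIGITS[digit])
    if value != 0:
        raise ValueError("value does not fit in the requested width")
    return "".join(reversed(chars))


def make_message_id(received_seconds, process_id, sequence, host_number=None):
    """Build a sixteen-character id: time, process id, uniqueness part.

    host_number=None models the case where localhost_number is not set,
    so the final part is just the per-second sequence number.
    """
    if host_number is None:
        last_part = sequence
    else:
        last_part = sequence * 256 + host_number
    return "-".join([
        to_base62(received_seconds, 6),
        to_base62(process_id, 6),
        to_base62(last_part, 2),
    ])
```
</PRE>
<P>
Under these assumptions, <TT>make_message_id(0, 0, 0)</TT> gives
<TT>000000-000000-00</TT>, and a sequence number of 15 combined with host number
255 is rejected because it no longer fits in two characters.

</P>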

<P>
The names of the two spool files consist of the message id, followed by -H
for the file containing the envelope and headers, and -D for the data
file.

</P>
<P>
By default all these spool files are held in a single directory called
<TT>`input'</TT> inside the general Exim spool directory. Some operating systems do
not perform very well if the number of files in a directory gets very large; to
improve performance in such cases, the <EM>split_spool_directory</EM> option can be
used. This causes Exim to split up the input files into 62 sub-directories
whose names are single letters or digits.

</P>
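<P>
The naming scheme can be sketched in Python as follows. This is only an
illustration: the function name <TT>spool_file_paths</TT> and the example spool
directory are hypothetical, and the choice of the sixth character of the
message id as the sub-directory name is an assumption that this chapter does
not confirm.

</P>
<PRE>
```python
import posixpath


def spool_file_paths(spool_directory, message_id, split_spool_directory=False):
    """Return the paths of the -H and -D files for one message."""
    input_directory = posixpath.join(spool_directory, "input")
    if split_spool_directory:
        # Assumption: the sub-directory is named after the sixth character
        # of the message id (the fastest-changing character of the time).
        input_directory = posixpath.join(input_directory, message_id[5])
    header_path = posixpath.join(input_directory, message_id + "-H")
    data_path = posixpath.join(input_directory, message_id + "-D")
    return header_path, data_path
```
</PRE>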
<P>
Exim can be configured not to start a delivery process when a message is
received; this can be unconditional, or depend on the number of incoming SMTP
connections or the system load. In these situations, new messages wait on the
queue until a queue-runner process picks them up. However, in standard
configurations under normal conditions, delivery is started as soon as a
message is received.

</P>


<H2><A NAME="SEC13" HREF="spec_toc.html#TOC13">3.3 Life of a message</A></H2>

<P>
A message remains in the spool directory until it is completely delivered to
its recipients or to an error address, or until it is deleted by an
administrator or by the user who originally created it. When delivery
cannot proceed -- for example, when a message can neither be delivered to its
recipients nor returned to its sender -- the message is marked `frozen' on the
spool, and no more deliveries are attempted.

</P>
<P>
<A NAME="IDX35"></A>
An administrator can `thaw' such messages when the problem has been corrected,
and can also freeze individual messages by hand if necessary. In addition, an
administrator can force a delivery error, causing an error message to be sent.

</P>
<P>
<A NAME="IDX36"></A>
There is also an <EM>auto_thaw</EM> option, which can be used to cause Exim to retry
frozen messages after a certain time. When this is set, no message will remain
on the queue for ever, because the delivery timeout will eventually be reached.
<A NAME="IDX37"></A>
Delivery failure reports that reach this timeout are discarded.

</P>
<P>
<A NAME="IDX38"></A>
<A NAME="IDX39"></A>
As delivery proceeds, Exim writes timestamped information about each address to
a per-message log file; this includes any delivery error messages. This log is
solely for the benefit of the administrator, and is normally deleted with the
spool files when processing of a message is complete. However, Exim can be
configured to retain it (a dangerous option, as the files can accumulate
rapidly on a busy system). Exim also writes delivery messages to its main log
file, whose contents are described in chapter 51.

</P>
<P>
<A NAME="IDX40"></A>
All the information Exim itself needs to set up a delivery is kept in the first
spool file with the headers. When a successful delivery occurs, the address is
immediately written at the end of a journal file, whose name is the message id
followed by -J. At the end of a delivery run, if there are some addresses
left to be tried again later, the first spool file is updated to indicate which
these are, and the journal file is then deleted. Updating the spool file is
done by writing a new file and renaming it, to minimize the possibility of data
loss.

</P>
<P>
Should the system or the program crash after a successful delivery but before
the spool file has been updated, the journal is left lying around. The next
time Exim attempts to deliver the message, it reads the journal file and
updates the spool file before proceeding. This minimizes the chances of double
deliveries caused by crashes.

</P>
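<P>
The recovery step can be sketched in Python as below. The file format used here
is a deliberately simplified, hypothetical one (one pending address per line in
the spool file, one delivered address per line in the journal); the real format
is described in chapter 56. The function name <TT>apply_journal</TT> is also
hypothetical.

</P>
<PRE>
```python
import os


def apply_journal(spool_path, journal_path):
    """Fold a leftover journal into the spool file, then delete the journal."""
    if not os.path.exists(journal_path):
        return  # no crash happened, nothing to recover
    with open(journal_path) as journal:
        delivered = {line.strip() for line in journal if line.strip()}
    with open(spool_path) as spool:
        pending = [line.strip() for line in spool if line.strip()]
    remaining = [address for address in pending if address not in delivered]

    # Write a complete new file first and rename it over the old one, so
    # that a crash at any point leaves either the old or the new version.
    new_path = spool_path + ".new"
    with open(new_path, "w") as new_file:
        new_file.write("".join(address + "\n" for address in remaining))
    os.replace(new_path, spool_path)

    # Only once the spool file is up to date is the journal removed.
    os.remove(journal_path)
```
</PRE>
<P>
Because the journal is deleted last, running the function again after a crash
in the middle of recovery simply repeats the same update.

</P>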



<H2><A NAME="SEC14" HREF="spec_toc.html#TOC14">3.4 Drivers</A></H2>

<P>
<A NAME="IDX41"></A>
<A NAME="IDX42"></A>
<A NAME="IDX43"></A>
<A NAME="IDX44"></A>
The main delivery processing elements of Exim are called <EM>directors</EM>,
<EM>routers</EM>, and <EM>transports</EM>, and collectively these are known as
<EM>drivers</EM>. Code for a number of them is provided; compile-time options
specify which ones are included in the binary, and run-time options specify
which ones are actually used.

</P>
<P>
A <EM>transport</EM> is a driver that transmits a copy of the message from Exim's
spool to some destination. There are two kinds of transport: for a <EM>local</EM>
transport, the destination is a file or a pipe on the local host, whereas for a
<EM>remote</EM> transport the destination is some other host. A message is passed
to a specific transport as a result of successful directing or routing. If a
message has several recipients, it may be passed to a number of different
transports.

</P>
<P>
A <EM>director</EM> is a driver that operates on a local address, either
determining how its delivery should happen, or converting the address into one
or more new addresses (for example, via an alias file). A local address is one
whose domain matches an entry in the list given in the <EM>local_domains</EM> option,
or has been determined to be local by a router -- see below. The fact that an
address is local does not imply that the message has to be delivered locally;
it can be directed either to a local or to a remote transport.

</P>
<P>
A <EM>router</EM> is a driver that operates on an apparently remote address, that
is, an address whose domain does not match anything in the list given in
<EM>local_domains</EM>. When a router succeeds it can route an address either to a
local or to a remote transport, or it can change the domain, and pass the
address on to subsequent routers.

</P>
<P>
In exceptional cases, a router may determine that an address is local after
all, and cause it to be passed to the directors. This happens automatically if
a host lookup expands an abbreviated domain into one that is local. It can also
optionally be made to happen if an MX record or other routing information
points to the local host, though by default this situation is treated as a
configuration error.
This is the only case in which the directors are used to process an address
that may not match anything in <EM>local_domains</EM>. The diagram below illustrates
the relationship between the three kinds of driver.
<img src="drivers.gif" alt="Driver interactions"><br>
As new features have been added to Exim, the distinction between routers and
directors has become less clear-cut than it once was.
<font color=green>
It is possible that in some future release the difference will be abolished and
they will be merged into one type of driver. However, at present, they remain
distinct.
</font>

</P>



<H2><A NAME="SEC15" HREF="spec_toc.html#TOC15">3.5 Delivery in detail</A></H2>

<P>
When a message is to be delivered, the sequence of events is roughly as
follows:

</P>

<UL>

<LI>

If a system-wide filter file is specified, the message is passed to it. The
filter may add recipients to the message, replace the recipients, discard the
message, cause a new message to be generated, or cause the message delivery to
fail. The format of the filter file is the same as for user filter files,
described in the separate document entitled <EM>Exim's interface to mail
filtering</EM>. Some additional features are available in system filters -- see
chapter 47 for details. Note that a message is passed to the
system filter only once per delivery attempt, however many recipients it has.
However, if there are several delivery attempts because one or more addresses
could not be immediately delivered, the system filter is run each time. The
filter condition <EM>first_delivery</EM> can be used to detect this.

<LI>

Each recipient address is parsed and a check is made to see if it is local, by
comparing the domain with the list in the <EM>local_domains</EM> option. This can
contain wildcards and file lookups.

<LI>

If an address is local, it is offered to each configured director in turn
until one is able to handle it.
<font color=green>
When a director cannot handle an address, it is said to <EM>decline</EM>. If no
directors can handle the address, that is, if they all decline,
</font>
the address is failed. Directors can be targeted at particular local domains,
so several local domains can be processed entirely independently of each other.

<LI>

<A NAME="IDX45"></A>
<A NAME="IDX46"></A>
A director that accepts an address may set up a local or a remote transport for
it. The transport is not run at this time; the address is placed on a queue for
the particular transport, to be run later. Alternatively, the director may
generate one or more new addresses (typically from alias, forward, or filter
files). New addresses are fed back into this process from the top, but in order
to avoid loops, a director ignores any address which has an identically-named
ancestor that was processed by itself.

<LI>

If an address is not local, it is offered to each configured router in turn,
until one is able to handle it. If none can, the address is failed.

<LI>

A router that accepts an address may set up a transport for it, or may pass an
altered address to subsequent routers, or it may discover that the address is a
local address after all. This typically happens when a partial domain name is
used and (for example) the DNS lookup is configured to try to extend such
names. In this case, the address is passed to the directors. Exim can
also be configured to do this for any domain whose lowest MX record or other
routing information
points to the local host.

<LI>

Routers normally set up remote transports for messages that are to be delivered
to other machines. However, a router can pass a message to a local transport,
and by this means such messages can be routed to transport mechanisms other
than SMTP by means of pipes or files.

<LI>

When all the directing and routing is done, addresses that have been
successfully handled are passed to their assigned transports. When local
transports are doing real local deliveries, they handle only one address at a
time, but if a local transport is being used as a pseudo-remote transport (for
example, to collect batched SMTP messages for transmission by some other means)
multiple addresses can be handled. Remote transports can always handle more
than one address at once, but can be configured not to do so, or to restrict
multiple addresses to the same domain.

<LI>

Each local delivery runs in a separate process under a non-privileged uid, and
they are run in sequence. Exim can be configured so that remote deliveries run
under a uid that is private to Exim, instead of running as root. By default the
remote deliveries run one at a time in the main Exim process, but a
configuration option is available to allow multiple remote deliveries for a
single message to be run simultaneously, each in its own sub-process.

<LI>

When it is doing a queue run, Exim checks its retry database to see if there
has been a previous temporary delivery failure for the address before running
any local transport. If it finds one, it does not attempt a new delivery until
the retry time for the address is reached. However, this happens only for
delivery attempts that are part of a queue run. Local deliveries are always
attempted when delivery immediately follows message reception.

<LI>

Remote transports do their own retry handling, since an address may be
deliverable to one of a number of hosts, each of which may have a different
retry time. If there have been previous temporary failures and no host has
reached its retry time, no delivery is attempted, whether in a queue run or
not. See chapter 33 for details of retry strategies.

<LI>

If there were any errors, a message is returned to an appropriate address (the
sender in the common case), with details of the error for each failing address.
Exim can be configured to send copies of error messages to other addresses.

<LI>

If one or more addresses suffered a temporary failure, the message is left on
the queue, to be tried again later. Otherwise the spool files and message log
are deleted, though the message log can optionally be preserved if required.
</UL>
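
<P>
The directing and routing steps above can be sketched in Python as follows.
This is a simplified model rather than Exim's implementation: drivers are
plain functions, <TT>local_domains</TT> is a plain set with no wildcards or
lookups, and all names (<TT>process_addresses</TT>, <TT>aliases</TT>,
<TT>localuser</TT>, <TT>lookuphost</TT>, the transport names) are hypothetical.
A driver returns <TT>DECLINE</TT>, a transport name, or a list of newly
generated addresses.

</P>
<PRE>
```python
from collections import deque

DECLINE = None  # returned by a driver that cannot handle an address


def process_addresses(recipients, local_domains, directors, routers):
    """Return (queued, failed); queued maps a transport name to addresses."""
    queued, failed = {}, []
    # Each work item carries the (driver name, address) pairs of its ancestors.
    work = deque((address, frozenset()) for address in recipients)
    while work:
        address, ancestors = work.popleft()
        domain = address.rsplit("@", 1)[-1].lower()
        drivers = directors if domain in local_domains else routers
        for name, driver in drivers:
            if (name, address) in ancestors:
                continue  # loop avoidance: identical ancestor seen by this driver
            outcome = driver(address)
            if outcome is DECLINE:
                continue
            kind, value = outcome
            if kind == "transport":
                # The transport is not run now; the address is queued for it.
                queued.setdefault(value, []).append(address)
            else:
                # Generated addresses are fed back into the process from the top.
                lineage = ancestors | {(name, address)}
                work.extend((child, lineage) for child in value)
            break
        else:
            failed.append(address)  # every driver declined
    return queued, failed


# Hypothetical example configuration.
ALIASES = {"postmaster@example.com": ["alice@example.com", "bob@elsewhere.example"]}
LOCAL_USERS = {"alice@example.com"}


def aliases(address):
    targets = ALIASES.get(address)
    return ("addresses", targets) if targets else DECLINE


def localuser(address):
    return ("transport", "local_delivery") if address in LOCAL_USERS else DECLINE


def lookuphost(address):
    return ("transport", "remote_smtp")


DIRECTORS = [("aliases", aliases), ("localuser", localuser)]
ROUTERS = [("lookuphost", lookuphost)]
```
</PRE>
<P>
With this configuration, a message for <TT>postmaster@example.com</TT> ends up
queued for one local and one remote transport, while an unknown local address
fails because every director declines it.

</P>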

<P>
<A NAME="IDX47"></A>
<A NAME="IDX48"></A>
Delivery is said to be <EM>deferred</EM> when the message remains on the queue for a
subsequent delivery attempt after a temporary failure. Such messages get
processed again by queue-runner processes that are periodically started, either
by an Exim daemon or via <EM>cron</EM> or by hand.

</P>
<P>
Temporary failures may be detected during routing and directing as well as
during the transport stage. Exim uses a set of configured rules to determine
when next to retry the failing address (see chapter 33).
<A NAME="IDX49"></A>
These rules also specify when Exim should give up trying to deliver to the
address, at which point it generates a failure report.

</P>
<P>
When a delivery is not part of a queue run (typically an immediate delivery
on receipt of a message), the directors are always run for local addresses, and
local deliveries are always attempted, even if retry times are set for them.
This makes for better behaviour if one particular message is causing problems
(for example, causing quota overflow, or provoking an error in a filter file).
If such a delivery suffers a temporary failure, the retry data gets updated as
usual, for use by the next queue-runner process.

</P>
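<P>
The rule for local deliveries can be summarized by the following Python sketch.
The function name and the representation of retry times as plain numbers of
seconds are assumptions made for the illustration; <TT>retry_time</TT> is
<TT>None</TT> when no temporary failure has been recorded for the address.

</P>
<PRE>
```python
def should_attempt_local_delivery(is_queue_run, retry_time, now):
    """Decide whether a local delivery is attempted for an address."""
    if not is_queue_run:
        # Immediate deliveries are always attempted, whatever the retry data.
        return True
    if retry_time is None:
        # No previous temporary failure is recorded for this address.
        return True
    # In a queue run, wait until the retry time has been reached.
    return now >= retry_time
```
</PRE>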
<P>
<A NAME="IDX50"></A>
When a message cannot be delivered to some or all of its intended recipients, a
delivery failure report is generated. All the addresses that failed in a given
delivery attempt are listed in a single failure report. If a message has many
recipients, it is possible for some addresses to fail in one delivery attempt
and others to fail subsequently, giving rise to more than one failure report
for a single message. The wording of delivery failure reports can be customized
by the administrator. See chapter 39 for details.

</P>
<P>
<A NAME="IDX51"></A>
Delivery failure messages contain an <EM>X-Failed-Recipients:</EM> header,
listing all failed addresses, for the benefit of programs that try to analyse
such messages automatically.

</P>
<P>
A failure report is normally sent to the sender of the original message, as
obtained from the message's envelope. For incoming SMTP messages, this is the
address given in the MAIL command. However, when an address is
expanded via a forward or alias file, an alternative address can be specified
for delivery failures of the generated addresses. For a mailing list expansion
(see chapter 42) it is common to direct failure reports to the
manager of the list.

</P>
<P>
If a failure report (either locally generated or received from a remote host)
itself suffers a delivery failure, the message is left on the queue, but is
`frozen', awaiting the attention of an administrator. There are options which
can be used to make Exim discard such failure reports, or to keep them for only
a short time.

</P>


<H2><A NAME="SEC16" HREF="spec_toc.html#TOC16">3.6 Temporary delivery failures</A></H2>

<P>
There are many reasons why a message may not be immediately deliverable to a
particular address. Failure to connect to a remote machine (because it, or the
connection to it, is down) is one of the most common. Local deliveries may also
be delayed if NFS files are unavailable, or if a mailbox is on a file system
where the user is over quota. Exim can be configured to impose its own quotas
on local mailboxes; where system quotas are set they will also apply.

</P>
<P>
A machine that is connected to the Internet can normally deliver most
mail straight away (the usual figure at Cambridge University is 98%). In its
default configuration, Exim starts a delivery process whenever it receives a
message, and usually this completes the entire delivery. This is a lightweight
approach, avoiding the need for any centralized queue managing software. There
are those who argue that a central message manager would be able to batch up
messages for the same host and send them in a single SMTP call. I do not myself
believe this would occur much in general, unless messages were significantly
delayed in order to create a batch.

</P>
<P>
However, if a host is unreachable for a period of time, a number of messages
may be waiting for it by the time it recovers, and sending them in a single
SMTP connection is clearly beneficial. Whenever a delivery to a remote host is
deferred, Exim makes a note in its hints database, and whenever a successful
SMTP delivery has happened, it looks to see if any other messages are waiting
for the same host. If any are found, they are sent over the same SMTP
connection, subject to a configuration limit as to the maximum number in any
one connection.

</P>
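<P>
The bookkeeping described above can be modelled with a small in-memory Python
sketch. It is not Exim's hints database: the class <TT>WaitingHints</TT>, its
methods, and the parameter <TT>max_per_connection</TT> are hypothetical names,
and counting the message that has just been delivered against the limit is an
assumption.

</P>
<PRE>
```python
from collections import defaultdict


class WaitingHints:
    """Remember which messages are waiting for which remote host."""

    def __init__(self, max_per_connection):
        self.max_per_connection = max_per_connection
        self.waiting = defaultdict(list)

    def note_deferred(self, host, message_id):
        """Record that a delivery of message_id to host was deferred."""
        if message_id not in self.waiting[host]:
            self.waiting[host].append(message_id)

    def take_batch(self, host, already_sent=1):
        """After a successful SMTP delivery, pick more messages for host.

        already_sent counts messages already sent on this connection, so
        the batch never pushes the connection past the configured limit.
        """
        room = max(self.max_per_connection - already_sent, 0)
        queue = self.waiting.get(host, [])
        batch = queue[:room]
        del queue[:room]
        return batch
```
</PRE>
<P>
Messages that do not fit within the limit stay recorded and can be picked up
by a later connection to the same host.

</P>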

<P><HR><P>
Go to the <A HREF="spec_1.html">first</A>, <A HREF="spec_2.html">previous</A>, <A HREF="spec_4.html">next</A>, <A HREF="spec_59.html">last</A> section, <A HREF="spec_toc.html">table of contents</A>.
</BODY>
</HTML>