Postfix Debugging Howto
-------------------------------------------------------------------------------
Purpose of this document
This document describes how to debug parts of the Postfix mail system when
things do not work according to expectation. The methods vary from making
Postfix log a lot of detail, to running some daemon processes under control of
a call tracer or debugger.
The text assumes that the Postfix main.cf and master.cf configuration files are
stored in directory /etc/postfix. You can use the command "postconf
config_directory" to find out the actual location of this directory on your
machine.
Listed in order of increasing invasiveness, the debugging techniques are as
follows:
* Look for obvious signs of trouble
* Debugging Postfix from inside
* Try turning off chroot operation in master.cf
* Verbose logging for specific SMTP connections
* Record the SMTP session with a network sniffer
* Making Postfix daemon programs more verbose
* Manually tracing a Postfix daemon process
* Automatically tracing a Postfix daemon process
* Running daemon programs with the interactive ddd debugger
* Running daemon programs with the interactive gdb debugger
* Running daemon programs under a non-interactive debugger
* Unreasonable behavior
* Reporting problems to postfix-users@postfix.org
Look for obvious signs of trouble
Postfix logs all failed and successful deliveries to a logfile.
* When Postfix uses syslog logging (the default), the file is usually called
/var/log/maillog, /var/log/mail, or something similar; the exact pathname
is configured in a file called /etc/syslog.conf, /etc/rsyslog.conf, or
something similar.
* When Postfix uses its own logging system (see MAILLOG_README), the location
of the logfile is configured with the Postfix maillog_file parameter.
When Postfix does not receive or deliver mail, the first order of business is
to look for errors that prevent Postfix from working properly:
% grep -E '(warning|error|fatal|panic):' /some/log/file | more
Note: the most important message is near the BEGINNING of the output. Error
messages that come later are less useful.
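Because the earliest messages matter most, the search above can be wrapped in a small helper that shows only the first few matches. This is a minimal sketch and not part of Postfix: the function name first_problems and the default of 10 lines are assumptions chosen for illustration.

```shell
# Hypothetical helper (not a Postfix command): print the first few
# warning/error/fatal/panic records from a logfile.
# Usage: first_problems /some/log/file [count]
first_problems() {
    logfile=$1
    count=${2:-10}      # assumed default: show the first 10 matches
    grep -E '(warning|error|fatal|panic):' "$logfile" | head -n "$count"
}
```

Calling first_problems with the pathname of your logfile prints at most 10 records, oldest first, assuming the logfile is written in chronological order.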
The nature of each problem is indicated as follows:
* "panic" indicates a problem in the software itself that only a programmer
can fix. Postfix cannot proceed until this is fixed.
* "fatal" is the result of missing files, incorrect permissions, or incorrect
configuration file settings that you can fix. Postfix cannot proceed until
this is fixed.
* "error" reports an error condition. For safety reasons, a Postfix process
will terminate when more than 13 of these happen.
* "warning" indicates a non-fatal error. These are problems that you may not
be able to fix (such as a broken DNS server elsewhere on the network) but
may also indicate local configuration errors that could become a problem
later.
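To get a quick overview of how many records of each kind a logfile contains, the four severities above can be tallied with a short helper. This is a sketch under stated assumptions: the function name count_by_severity is made up for illustration, and the pattern assumes the common "program[pid]: severity: text" record layout, which may differ on your system.

```shell
# Hypothetical helper (not a Postfix command): count log records per
# severity. Assumes the usual "program[pid]: severity: text" layout.
# Usage: count_by_severity /some/log/file
count_by_severity() {
    for level in panic fatal error warning; do
        # grep -c prints 0 and exits non-zero when nothing matches;
        # "|| true" keeps the helper usable under "set -e".
        n=$(grep -c -E "\]: ${level}: " "$1" || true)
        printf '%s %s\n' "$level" "$n"
    done
}
```

A non-zero count for "panic" or "fatal" is the place to start, since Postfix cannot proceed until those are fixed.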
Debugging Postfix from inside
Postfix version 2.1 and later can produce mail delivery reports for debugging
purposes. These reports not only show sender/recipient addresses after address
rewriting and alias expansion or forwarding, they also show information about
delivery to mailbox, delivery to non-Postfix command, responses from remote
SMTP servers, and so on.
Postfix can produce two types of mail delivery reports for debugging:
* What-if: report what would happen, but do not actually deliver mail. This
mode of operation is requested with:
% /usr/sbin/sendmail -bv address...
Mail Delivery Status Report will be mailed to <your login name>.
* What happened: deliver mail and report successes and/or failures, including
replies from remote SMTP servers. This mode of operation is requested with:
% /usr/sbin/sendmail -v address...
Mail Delivery Status Report will be mailed to <your login name>.
These reports contain information that is generated by Postfix delivery agents.
Since these run as daemon processes that cannot interact with users directly,
the result is sent as mail to the sender of the test message. The format of
these reports is practically identical to that of ordinary non-delivery
notifications.
For a detailed example of a mail delivery status report, see the debugging
section at the end of the ADDRESS_REWRITING_README document.
Try turning off chroot operation in master.cf
A common mistake is to turn on chroot operation in the master.cf file without
going through all the necessary steps to set up a chroot environment. This
causes Postfix daemon processes to fail due to all kinds of missing files.
The example below shows an SMTP server that is configured with chroot turned
off:
/etc/postfix/master.cf:
    # =============================================================
    # service type  private unpriv  chroot  wakeup  maxproc command
    #               (yes)   (yes)   (yes)   (never) (100)
    # =============================================================
    smtp      inet  n       -       n       -       -       smtpd
Inspect master.cf for any services that do not have chroot operation turned
off. If you find any, save a copy of the master.cf file, and edit the entries
in question. After executing the command "postfix reload", see if the problem
has gone away.
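The inspection step can be partly automated. The sketch below lists every service whose chroot column is not explicitly "n"; the function name chroot_candidates is hypothetical, and the parsing assumes the eight-column layout shown in the example above, with continuation lines starting with whitespace. Entries with "-" in the chroot column are also listed, because they use the built-in default.

```shell
# Hypothetical helper (not a Postfix command): list master.cf services
# whose chroot column (the fifth field) is anything other than "n".
# Usage: chroot_candidates /etc/postfix/master.cf
chroot_candidates() {
    awk '
        /^[ \t]*#/ { next }   # skip comments
        /^[ \t]/   { next }   # skip continuation lines of a previous entry
        NF >= 8 && $5 != "n" { print $1 "/" $2 }
    ' "$1"
}
```

Each output line names one service as "service/type", for example "pickup/unix". These are the entries to review and edit.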
If turning off chrooted operation made the problem go away, then
congratulations. Leaving Postfix running in this way is adequate for most
sites. If you prefer chrooted operation, see the Postfix
BASIC_CONFIGURATION_README file for information about how to prepare Postfix
for chrooted operation.
Verbose logging for specific SMTP connections
In /etc/postfix/main.cf, list the remote site name or address in the
debug_peer_list parameter. For example, in order to make the software log a lot
of information to the syslog daemon for connections from or to the loopback
interface:
/etc/postfix/main.cf:
debug_peer_list = 127.0.0.1
You can specify one or more hosts, domains, addresses or net/masks. To make the
change effective immediately, execute the command "postfix reload".
Record the SMTP session with a network sniffer
This example uses tcpdump. In order to record a conversation you need to
specify a large enough capture length with the "-s" option or else you will
miss some or all of the packet payload.
# tcpdump -w /file/name -s 0 host example.com and port 25
Older tcpdump versions don't support "-s 0"; in that case, use "-s 2000"
instead.
Run this for a while, stop with Ctrl-C when done. To view the data use a binary
viewer, ethereal (since renamed Wireshark), or good old less.
Making Postfix daemon programs more verbose
Append one or more "-v" options to selected daemon definitions in
/etc/postfix/master.cf and type "postfix reload". This will cause a lot of
activity to be
logged to the syslog daemon. For example, to make the Postfix SMTP server
process more verbose:
/etc/postfix/master.cf:
smtp inet n - n - - smtpd -v
To diagnose problems with address rewriting specify a "-v" option for the
cleanup(8) and/or trivial-rewrite(8) daemon, and to diagnose problems with mail
delivery specify a "-v" option for the qmgr(8) or oqmgr(8) queue manager, or
for the lmtp(8), local(8), pipe(8), smtp(8), or virtual(8) delivery agent.
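If you prefer not to edit master.cf by hand, the change can be previewed with a small filter. This is only a sketch: the function name add_verbose is hypothetical, it assumes plain service names and types without regular expression metacharacters, and it writes the modified text to standard output instead of changing the file.

```shell
# Hypothetical helper (not a Postfix command): print a copy of master.cf
# with "-v" appended to the first line of one service entry.
# Usage: add_verbose /etc/postfix/master.cf smtp inet
add_verbose() {
    sed -E "/^$2[[:space:]]+$3[[:space:]]/ s/[[:space:]]*\$/ -v/" "$1"
}
```

Review the output, install it as master.cf after saving a copy of the original file, and then type "postfix reload".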
Manually tracing a Postfix daemon process
Many systems allow you to inspect a running process with a system call tracer.
For example:
# trace -p process-id (SunOS 4)
# strace -p process-id (Linux and many others)
# truss -p process-id (Solaris, FreeBSD)
# ktrace -p process-id (generic 4.4BSD)
Even more informative are traces of system library calls. Examples:
# ltrace -p process-id (Linux, also ported to FreeBSD and BSD/OS)
# sotruss -p process-id (Solaris)
See your system documentation for details.
Tracing a running process can give valuable information about what a process is
attempting to do. This is as much information as you can get without running an
interactive debugger program, as described in a later section.
Automatically tracing a Postfix daemon process
Postfix can attach a call tracer whenever a daemon process starts. Call tracers
come in several kinds.
1. System call tracers such as trace, truss, strace, or ktrace. These show the
communication between the process and the kernel.
2. Library call tracers such as sotruss and ltrace. These show calls of
library routines, and give a better idea of what is going on within the
process.
Append a -D option to the suspect command in /etc/postfix/master.cf, for
example:
/etc/postfix/master.cf:
smtp inet n - n - - smtpd -D
Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
the call tracer of your choice, for example:
/etc/postfix/main.cf:
debugger_command =
PATH=/bin:/usr/bin:/usr/local/bin;
(truss -p $process_id 2>&1 | logger -p mail.info) & sleep 5
Type "postfix reload" and watch the logfile.
Running daemon programs with the interactive ddd debugger
If you have X Windows installed on the Postfix machine, then an interactive
debugger such as ddd can be convenient.
Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
ddd:
/etc/postfix/main.cf:
debugger_command =
PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin
ddd $daemon_directory/$process_name $process_id & sleep 5
Be sure that gdb is in the command search path, and export XAUTHORITY so that X
access control works, for example:
% setenv XAUTHORITY ~/.Xauthority (csh syntax)
$ export XAUTHORITY=$HOME/.Xauthority (sh syntax)
Append a -D option to the suspect daemon definition in /etc/postfix/master.cf,
for example:
/etc/postfix/master.cf:
smtp inet n - n - - smtpd -D
Stop and start the Postfix system. This is necessary so that Postfix runs with
the proper XAUTHORITY and DISPLAY settings.
Whenever the suspect daemon process is started, a debugger window pops up and
you can watch in detail what happens.
Running daemon programs with the interactive gdb debugger
If you have the screen command installed on the Postfix machine, then you can
run an interactive debugger such as gdb as follows.
Edit the debugger_command definition in /etc/postfix/main.cf so that it runs
gdb inside a detached screen session:
/etc/postfix/main.cf:
debugger_command =
PATH=/bin:/usr/bin:/sbin:/usr/sbin; export PATH; HOME=/root;
export HOME; screen -e^tt -dmS $process_name gdb
$daemon_directory/$process_name $process_id & sleep 2
Be sure that gdb is in the command search path.
Append a -D option to the suspect daemon definition in /etc/postfix/master.cf,
for example:
/etc/postfix/master.cf:
smtp inet n - n - - smtpd -D
Execute the command "postfix reload" and wait until a daemon process is started
(you can see this in the maillog file).
Then attach to the screen, and debug away:
# HOME=/root screen -r
(gdb) continue
(gdb) where
Running daemon programs under a non-interactive debugger
If you do not have X Windows installed on the Postfix machine, or if you are
not familiar with interactive debuggers, then you can try to run gdb in
non-interactive mode, and have it print a stack trace when the process crashes.
Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
the gdb debugger:
/etc/postfix/main.cf:
debugger_command =
PATH=/bin:/usr/bin:/usr/local/bin; export PATH; (echo cont; echo
where; sleep 8640000) | gdb $daemon_directory/$process_name
$process_id 2>&1
>$config_directory/$process_name.$process_id.log & sleep 5
Append a -D option to the suspect daemon in /etc/postfix/master.cf, for
example:
/etc/postfix/master.cf:
smtp inet n - n - - smtpd -D
Type "postfix reload" to make the configuration changes effective.
Whenever a suspect daemon process is started, an output file is created, named
after the daemon and process ID (for example, smtpd.12345.log). When the
process crashes, a stack trace (with output from the "where" command) is
written to its logfile.
Unreasonable behavior
Sometimes the behavior exhibited by Postfix just does not match the source
code. Why can a program deviate from the instructions given by its author?
There are two possibilities.
* The compiler has erred. This rarely happens.
* The hardware has erred. Does the machine have ECC memory?
In both cases, the program being executed is not the program that was supposed
to be executed, so anything could happen.
There is a third possibility:
* Bugs in system software (kernel or libraries).
Hardware-related failures usually do not reproduce in exactly the same way
after power cycling and rebooting the system. There's little Postfix can do
about bad hardware. Be sure to use hardware that at the very least can detect
memory errors. Otherwise, Postfix will just be waiting to be hit by a bit
error. Critical systems deserve real hardware.
When a compiler makes an error, the problem can be reproduced whenever the
resulting program is run. Compiler errors are most likely to happen in the code
optimizer. If a problem is reproducible across power cycles and system reboots,
it can be worthwhile to rebuild Postfix with optimization disabled, and to see
if optimization makes a difference.
In order to compile Postfix with optimizations turned off:
% make tidy
% make makefiles OPT=
This produces a set of Makefiles that do not request compiler optimization.
Once the makefiles are set up, build the software:
% make
% su
Password:
# make install
If the problem goes away, then it is time to ask your vendor for help.
Reporting problems to postfix-users@postfix.org
The people who participate on postfix-users@postfix.org are very helpful,
especially if YOU provide them with sufficient information. Remember, these
volunteers are willing to help, but their time is limited.
When reporting a problem, be sure to include the following information.
* A summary of the problem. Please do not just send some logging without
explanation of what YOU believe is wrong.
* Complete error messages. Please use cut-and-paste, or use attachments,
instead of reciting information from memory.
* Postfix logging. See the text at the top of the DEBUG_README document to
find out where logging is stored. Please do not frustrate the helpers by
word wrapping the logging. If the logging is more than a few kbytes of
text, consider posting a URL on a web or ftp site.
* Consider using a test email address so that you don't have to reveal email
addresses or passwords of innocent people.
* If you can't use a test email address, please anonymize email addresses and
host names consistently. Replace each letter by "A", each digit by "D" so
that the helpers can still recognize syntactical errors.
* Command output from:
o "postconf -n". Please do not send your main.cf file, or 1000+ lines of
postconf command output.
o "postconf -Mf" (Postfix 2.9 or later).
* Better, provide output from the postfinger tool. This can be found at
https://github.com/ford--prefect/postfinger.
* If the problem is SASL related, consider including the output from the
saslfinger tool. This can be found at
https://packages.debian.org/search?keywords=sasl2-bin.
* If the problem is about too much mail in the queue, consider including
output from the qshape tool, as described in the QSHAPE_README file.
* If the problem is protocol related (connections time out, or an SMTP server
complains about syntax errors etc.) consider recording a session with
tcpdump, as described in the DEBUG_README document.
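The anonymization rule in the list above (each letter becomes "A", each digit becomes "D") is easy to apply mechanically. The following is a minimal sketch; the function name anonymize is an assumption for illustration and is not a Postfix tool.

```shell
# Hypothetical helper (not a Postfix command): replace each letter with
# "A" and each digit with "D", leaving punctuation such as "@" and "."
# intact. Reads from standard input, writes to standard output.
anonymize() {
    LC_ALL=C sed -e 's/[A-Za-z]/A/g' -e 's/[0-9]/D/g'
}
```

Apply it to individual addresses and host names, not to whole log records, because it would also rewrite timestamps, program names and message text that the helpers need to see.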