<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Language" content="en" />
<title>s6: service startup notifications</title>
<meta name="Description" content="s6: service startup notifications" />
<meta name="Keywords" content="s6 ftrig notification notifier writer libftrigw ftrigw startup U up svwait s6-svwait" />
<!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
</head>
<body>
<p>
<a href="index.html">s6</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">skarnet.org</a>
</p>
<h1> Service startup notifications </h1>
<p>
It is easy for a process supervision suite to know when a service that was <em>up</em>
is now <em>down</em>: the long-lived process implementing the service is dead. The
supervisor, running as the daemon's parent, is instantly notified via a SIGCHLD.
When this happens, <a href="s6-supervise.html">s6-supervise</a> sends a 'd' event
to its <tt>./event</tt> <a href="fifodir.html">fifodir</a>, so every subscriber
knows that the service is down. All is well.
</p>
<p>
It is much trickier for a process supervision suite to know when a service
that was <em>down</em> is now <em>up</em>. The supervisor forks and execs the
daemon, and knows when the exec has succeeded; but after that point, it's all
up to the daemon itself. Some daemons do a lot of initialization work before
they're actually ready to serve, and it is impossible for the supervisor to
know exactly <em>when</em> the service is really ready.
<a href="s6-supervise.html">s6-supervise</a> sends a 'u' event to its
<tt>./event</tt> <a href="fifodir.html">fifodir</a> when it successfully
spawns the daemon, but any subscriber
reacting to 'u' is subject to a race condition: the service provided by the
daemon may not be ready yet.
</p>
<p>
Reliable startup notifications need support from the daemons themselves.
Daemons should do two things to signal the outside world that they are
ready:
</p>
<ol>
<li> Update a state file, so other processes can get a snapshot
of the daemon's state </li>
<li> Send an event to processes waiting for a state change. </li>
</ol>
<p>
This is complex to implement in every single daemon, so s6 provides
tools to make it easier for daemon authors, without any need to link
against the s6 library or use any s6-specific construct:
daemons can simply write a line to a file descriptor of their choice,
then close that file descriptor, when they're ready to serve. This is
a generic mechanism that some daemons already implement.
</p>
<p>
s6 supports that mechanism natively: when the
<a href="servicedir.html">service directory</a> for the daemon contains
a valid <tt>notification-fd</tt> file, the daemon's supervisor, i.e. the
<a href="s6-supervise.html">s6-supervise</a> program, will properly catch
the daemon's message, update the status file (<tt>supervise/status</tt>),
then notify all the subscribers
with a <tt>'U'</tt> event, meaning that the service is now up and ready.
</p>
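<p>
To make the mechanism concrete, here is a minimal sketch, in C, of what the
reading side of such a notification pipe could look like. It is <em>not</em>
the code of s6-supervise: the names <tt>wait_for_newline</tt>,
<tt>spawn_with_notif_fd</tt> and <tt>demo_daemon</tt> are made up for the
illustration, a real supervisor would exec the daemon instead of calling a
function, and it would multiplex the read with its other events instead of
blocking.
</p>

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read from the read end of a notification pipe until a newline shows up.
 * Returns 1 if a newline was seen (ready), 0 on EOF without a newline
 * (the writer closed the fd or died first), -1 on a read error.
 */
int wait_for_newline(int rfd)
{
    char buf[64];

    for (;;) {
        ssize_t n = read(rfd, buf, sizeof buf);
        if (n == -1) {
            if (errno == EINTR) continue;   /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0) return 0;               /* EOF before any newline */
        if (memchr(buf, '\n', (size_t)n)) return 1;
    }
}

/*
 * Fork a child that sees the write end of a fresh pipe as file descriptor
 * notif_fd, then runs child_main (hypothetical stand-in for exec'ing a daemon).
 * Returns the read end of the pipe in the parent, or -1 on error.
 */
int spawn_with_notif_fd(int notif_fd, void (*child_main)(void), pid_t *pid)
{
    int p[2];

    if (pipe(p) == -1) return -1;
    *pid = fork();
    if (*pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (*pid == 0) {
        close(p[0]);
        if (p[1] != notif_fd) {
            if (dup2(p[1], notif_fd) == -1) _exit(111);
            close(p[1]);
        }
        child_main();
        _exit(0);
    }
    close(p[1]);    /* keep only the read end, or EOF would never arrive */
    return p[0];
}

/* Hypothetical daemon body: announces readiness on fd 3, then closes it. */
void demo_daemon(void)
{
    /* ... initialization work would happen here ... */
    if (write(3, "ready\n", 6) != 6) _exit(1);
    close(3);
}
```

<p>
In this sketch, the number passed as <tt>notif_fd</tt> plays the role of the
contents of the <tt>notification-fd</tt> file, and a return value of 1 from
<tt>wait_for_newline</tt> corresponds to the moment a <tt>'U'</tt> event
would be sent.
</p>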
<p>
This method should really be implemented in every long-running
program providing a service. When that is not the case, it's impossible
to provide reliable startup notifications, and subscribers have to
make do with the unreliable <tt>'u'</tt> events provided by s6-supervise.
</p>
<p>
Unfortunately, a lot of long-running programs do not offer that
functionality; instead, they provide a way to poll them: an external
program that runs and checks whether the service is ready. This is a
<a href="//skarnet.org/software/s6/ftrig.html">bad</a> mechanism, for
<a href="//skarnet.org/lists/supervision/1606.html">several</a>
reasons. Nevertheless, until all daemons are patched to notify their
own readiness, s6 provides a way to run such a check program to poll
for readiness, and route its result into the s6 notification system:
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a>.
</p>
<h2> How to use a check program with s6 (i.e. readiness checking via polling) </h2>
<ul>
<li> Let's say you have a daemon <em>foo</em>, started under s6 via a
<tt>/run/service/foo</tt> service directory, and that comes with a
<tt>foo-check</tt> program that exhibits different behaviours when
<em>foo</em> is ready and when it is not. </li>
<li> Create an executable script <tt>/run/service/foo/data/check</tt>
that calls <tt>foo-check</tt>. Make sure this script exits 0 when
<em>foo</em> is ready and nonzero when it's not. </li>
<li> In your <tt>/run/service/foo/run</tt> script that starts <em>foo</em>,
instead of executing into <tt>foo</tt>, execute into
<tt>s6-notifyoncheck foo</tt>. Read the
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a> page if you need to
give it options to tune the polling. </li>
<li> Run <tt>echo 3 &gt; /run/service/foo/notification-fd</tt>. If file descriptor
3 is already open when your run script executes <em>foo</em>, replace 3 with
a file descriptor you <em>know</em> is not already open. </li>
<li> That's it.
<ul>
<li> Your check script will be automatically invoked by
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a>, until it succeeds. </li>
<li> <a href="s6-notifyoncheck.html">s6-notifyoncheck</a> will send the
readiness notification to the file descriptor given in the <tt>notification-fd</tt>
file. </li>
<li> <a href="s6-supervise.html">s6-supervise</a> will receive it and will
mark <em>foo</em> as ready. </li>
</ul> </li>
</ul>
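<p>
For illustration only, here is a sketch in C of what a <tt>foo-check</tt>
program might do, under the assumption that <em>foo</em> serves clients on
a Unix domain socket: it reports success as soon as something accepts
connections on that socket. The function name <tt>foo_is_ready</tt> and the
idea that <em>foo</em> listens on such a socket are hypothetical; a real
check depends entirely on what the daemon provides.
</p>

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Return 0 if something is accepting connections on the Unix domain
 * socket at path, and 1 otherwise. The 0 / nonzero convention matches
 * the exit code expected from a check script.
 */
int foo_is_ready(const char *path)
{
    struct sockaddr_un sa;
    int fd, r;

    if (strlen(path) >= sizeof sa.sun_path) return 1;   /* path too long */
    memset(&sa, 0, sizeof sa);
    sa.sun_family = AF_UNIX;
    strcpy(sa.sun_path, path);

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) return 1;

    /* connect() fails while the socket is missing or nobody listens on it */
    r = connect(fd, (struct sockaddr *)&sa, sizeof sa);
    close(fd);
    return r == 0 ? 0 : 1;
}
```

<p>
A <tt>main()</tt> that simply returns <tt>foo_is_ready(<em>path</em>)</tt>
would give a program whose exit code is already what the
<tt>data/check</tt> script needs to pass along.
</p>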
<h2> How to design a daemon so it uses the s6 mechanism <em>without</em> resorting to polling (i.e. readiness notification) </h2>
<p>
The <a href="s6-notifyoncheck.html">s6-notifyoncheck</a> mechanism was
made to accommodate daemons that provide a check program but do not notify
readiness themselves; it works, but is suboptimal.
If you are writing the <em>foo</em> daemon, here is how you can make things better:
</p>
<ul>
<li> Readiness notification should be optional, so you should guard all
the following with a run-time option to <em>foo</em>. </li>
<li> Assume a file descriptor other than 0, 1 or 2 is going to be open.
You can hardcode 3 (or 4); or you can make it configurable via a command line
option. See for instance the <tt>-D <em>notif</em></tt> option to the
<a href="//skarnet.org/software/mdevd/mdevd.html">mdevd</a> program. It
really doesn't matter what this number is; the important thing is that your
daemon knows that this fd is already open, and is not using it for another
purpose. </li>
<li> Do nothing with this file descriptor until your daemon is ready. </li>
<li> When your daemon is ready, write a newline to this file descriptor.
<ul>
<li> If you like, you may write other data before the newline, just in
case it is printed to the terminal. It is not necessary, and it is best to
keep that data short. If the line is read by
<a href="s6-supervise.html">s6-supervise</a>, it will be entirely ignored;
only the newline is important. </li>
</ul> </li>
<li> Then close that file descriptor. </li>
</ul>
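<p>
The steps above can be sketched in a few lines of C. This is a minimal
example, not a prescribed implementation: <tt>notify_ready</tt> is a
hypothetical helper name, and the caller is assumed to invoke it only when
the run-time option enabling readiness notification was given, with the fd
number it was told to use.
</p>

```c
#include <errno.h>
#include <unistd.h>

/*
 * Announce readiness on an already-open notification fd, then close it.
 * Returns 0 on success, -1 on error with errno set.
 * notify_ready is a hypothetical name, not part of any s6 API.
 */
int notify_ready(int fd)
{
    ssize_t r;

    /* Only the newline matters to the reader; retry if a signal interrupts. */
    do {
        r = write(fd, "\n", 1);
    } while (r == -1 && errno == EINTR);

    if (r != 1) {
        int e = errno;
        close(fd);          /* do not keep the fd around after a failure */
        errno = e;
        return -1;
    }

    /* Closing the fd tells the reader that nothing more will come. */
    return close(fd);
}
```

<p>
A daemon would call, for instance, <tt>notify_ready(3)</tt> exactly once,
right after it has finished initializing. If the fd is not actually open,
<tt>write()</tt> fails with <tt>EBADF</tt>, which is one reason to keep the
whole thing behind an option.
</p>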
<p>
The user who then makes <em>foo</em> run under s6 just has to do the
following:
</p>
<ul>
<li> Write 3, or the file descriptor the <em>foo</em> daemon uses
to notify readiness, to the <tt>/run/service/foo/notification-fd</tt> file. </li>
<li> In the <tt>/run/service/foo/run</tt> script, invoke <tt>foo</tt>
with the option that activates the readiness notification. If <em>foo</em>
makes the notification fd configurable, the user needs to make sure that
the number that is given to this option is the same as the number that is
written in the <tt>notification-fd</tt> file. </li>
<li> And that is all. <strong>Do not</strong> use <tt>s6-notifyoncheck</tt>
in this case, because you do not need to poll to know whether <em>foo</em>
is ready; instead, <em>foo</em> will directly communicate its readiness to
<a href="s6-supervise.html">s6-supervise</a>, and that is a much more efficient
mechanism. </li>
</ul>
<h2> What does <a href="s6-supervise.html">s6-supervise</a> do with this
readiness information? </h2>
<ul>
<li> <a href="s6-supervise.html">s6-supervise</a> maintains a readiness
state for other programs to read. You can check for it, for instance, via
the <a href="s6-svstat.html">s6-svstat</a> program. </li>
<li> <a href="s6-supervise.html">s6-supervise</a> also broadcasts the
readiness event to programs that are waiting for it - for instance the
<a href="s6-svwait.html">s6-svwait</a> program. This can be used to
make sure that other programs only start when the daemon is ready. For
instance, the
<a href="//skarnet.org/software/s6-rc/">s6-rc</a> service manager uses
that mechanism to bring sets of services up or down: a service starts as
soon as all its dependencies are ready, but never earlier. </li>
</ul>
</body>
</html>