<!--startcut ======================================================= -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html><head>
<META NAME="generator" CONTENT="lgazmail v1.1pre9c">
<TITLE>The Answer Guy 32:
Web server clustering project
</TITLE>
<!-- ORIGINAL SUBJECT:
Web server clustering project
JTD SUBTITLE:
-->
</head>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#A000A0"
ALINK="#FF0000">
<H4>"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>
<P> <hr> <P>
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<H1 align="center"><A NAME="answer">
<img src="../gx/dennis/qbubble.gif" alt="" border="0" align="middle">
<a href="./lg_toc32.html">The Answer Guy</a>
<img src="../gx/dennis/bbubble.gif" alt="" border="0" align="middle">
</A></H1>
<BR>
<H4 align="center">By James T. Dennis,
<a href="mailto:answerguy@ssc.com">answerguy@ssc.com</a>
<BR>Starshine Technical Services, <A HREF="http://www.starshine.org/">http://www.starshine.org/</A>
</H4>
<p><hr><p>
<!--endcut ========================================================= -->
<H3><img src="../gx/dennis/qbub.gif" alt="(?)" width="50" height="28"
align="left" border="0">Web server clustering project</H3>
<p><strong>From Jim Kinney
in the</strong>
<a href="news:comp.unix.questions">comp.unix.questions</a>
<strong>newsgroup on 22 Jul 1998 </strong></p>
<!-- begin body -->
<STRONG><P>
I am starting the research into the design and implementation of a 3
node cluster to provide high availability web, database, and support
services to a computer based physics lab. As envisioned, the primary
interface machine will be the web server. The database that provides
the dynamic web pages will be on a separate machine. Some other
processes that accept input from the web process and output to the
database will be on the third machine.
</P></STRONG>
<BLOCKQUOTE><IMG SRC="../gx/dennis/bbub.gif" ALT="(!)" width="50" height="28" border="0" align="bottom"
>Have you looked at the "High Availability HOWTO"?
</blockquote>
<code><blockquote><blockquote
><A HREF="http://sunsite.unc.edu/pub/Linux/ALPHA/linux-ha/High-Availability-HOWTO.html"
>http://sunsite.unc.edu/pub/Linux/ALPHA/linux-ha/High-Availability-HOWTO.html</A>
</blockquote></blockquote></code>
<blockquote>There's also the common "round robin DNS" model ---
which is already used by many service providers. It
has its limitations --- but it's the first thing
to try if the clients can be configured to gracefully
retry transactions on failure.
</blockquote>
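<blockquote>Here's a rough sketch of what "gracefully retry" could look
like on the client side, assuming the name resolves to several
addresses. The function names are my own invention for illustration,
and the <tt>connect</tt> parameter is only there so the logic can be
exercised without a network.
</blockquote>

```python
import socket


def resolve_all(host, port):
    """Return every distinct (ip, port) pair the resolver reports for host."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        address = sockaddr[:2]  # drop the extra IPv6 fields, keep (ip, port)
        if address not in seen:
            seen.append(address)
    return seen


def connect_first_available(addresses, connect=socket.create_connection,
                            timeout=3.0):
    """Try each address in turn; return (address, connection) on first success."""
    last_error = None
    for address in addresses:
        try:
            return address, connect(address, timeout)
        except OSError as err:
            last_error = err  # this server is down or unreachable; try the next
    raise ConnectionError("no server reachable") from last_error
```

<blockquote>A client would call something like
<tt>connect_first_available(resolve_all(host, 80))</tt>. Note that this
only helps with new connections; a transaction that was in flight when
a server died still has to be repeated by the client.
</blockquote>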
<blockquote>There's also the MOSIX project which was developed
under BSD/OS and is allegedly being ported to Linux.
This provides for process migration (again, more of
a performance clustering and load balancing feature set).
</blockquote>
<code><blockquote><blockquote
><A HREF="http://www.cs.huji.ac.il/mosix/">http://www.cs.huji.ac.il/mosix/</A>
</blockquote></blockquote></code>
<blockquote>However, there is another concept called "checkpointing."
You can think of this as having regular, transparent,
non-terminal "core dumps" (snapshots) taken of each
process (or process group). These are written to
disk and can be reloaded and restarted at the point where
they left off. I'm not aware of any projects to
provide checkpointing to Linux (or checkpointing subsystems).
(Obviously any application can do its own checkpointing
in a non-transparent fashion --- roughly equivalent to the
periodic automatic saves performed by '<TT>emacs</TT>' and other editors).
</blockquote>
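<blockquote>To make the non-transparent (do it yourself) variety
concrete, here is a minimal sketch of an application saving its own
state and resuming from it. The file format (JSON), the function names,
and the toy "sum a list" workload are all assumptions made up for the
example.
</blockquote>

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically, so a crash never leaves half a file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_path, path)  # atomic rename over the previous checkpoint


def load_checkpoint(path, default):
    """Return the last saved state, or default if no checkpoint exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default


def process(items, path, every=2):
    """Sum items, checkpointing as it goes and resuming where a prior run stopped."""
    state = load_checkpoint(path, {"next": 0, "total": 0})
    for index in range(state["next"], len(items)):
        state["total"] += items[index]
        state["next"] = index + 1
        if state["next"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

<blockquote>If the process dies partway through, the next run picks up
at the saved <tt>next</tt> index rather than starting over. Transparent
checkpointing would do the equivalent for the whole process image
without the application's cooperation.
</blockquote>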
<blockquote>I have a pointer to some miscellaneous notes on checkpointing:
</blockquote>
<code><blockquote><blockquote
><A HREF="http://warp.dcs.st-and.ac.uk/warp/systems/checkpoint/"
>http://warp.dcs.st-and.ac.uk/warp/systems/checkpoint/</A>
</blockquote></blockquote></code>
<blockquote>The implication here is that you could create a hybrid
checkpointing <EM>and</EM> process migration model that would
provide high availability. In a client/server context
this would probably only be suitable for situations where
the communications protocols were very robust --- and
it might still require some IP and/or MAC address assumption
or some specialized routing tricks.
</blockquote>
<blockquote>One such routing trick might be the IP NAT project.
IP masquerading is one form of NAT (allowing many
clients to masquerade as a single proxy system).
</blockquote>
<code><blockquote><blockquote
><A HREF="http://www.csn.tu-chemnitz.de/~mha/"
>http://www.csn.tu-chemnitz.de/~mha/</A>
</blockquote></blockquote></code>
<blockquote>Another form of NAT is many-to-many. Let's say you
connected two disconnected sites that both chose
<tt>10.1.*.*</tt> addresses for their use --- you could put
a NAT router between them that would bidirectionally
translate the <tt>10.1.*.*</tt> to corresponding <tt>172.16.*.*</tt>
addresses. Thus the two sites would be able to
interoperate over a broader range of protocols than
would be the case for IP masquerading --- since the
TCP/UDP ports would not be re-written --- each <tt>10.1.*.*</tt>
address corresponds on a one-to-one basis with a <tt>172.16.*.*</tt>
counterpart.
</blockquote>
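<blockquote>The address arithmetic behind that one-to-one mapping can be
sketched in a few lines. This is only the translation rule, not a
router; the /16 prefix lengths and the function name are my assumptions
for the example.
</blockquote>

```python
import ipaddress

INSIDE = ipaddress.ip_network("10.1.0.0/16")
OUTSIDE = ipaddress.ip_network("172.16.0.0/16")


def translate(address, source=INSIDE, target=OUTSIDE):
    """Map an address in source to the same host offset in target."""
    addr = ipaddress.ip_address(address)
    if addr not in source:
        raise ValueError(f"{addr} is not in {source}")
    offset = int(addr) - int(source.network_address)
    return str(target.network_address + offset)
```

<blockquote>Calling <tt>translate("10.1.2.3")</tt> gives
<tt>172.16.2.3</tt>, and swapping the <tt>source</tt> and
<tt>target</tt> arguments maps it back. Only the address changes, which
is why the ports can be left alone.
</blockquote>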
<blockquote>The one other form of NAT is one-to-many (or "load
balancing"). This makes one simple router look like a server.
In actuality that "server" is just dispatching the packets it
receives to any one of the backend servers it chooses
(statistically or based on metrics that they communicate
amongst themselves, privately). Cisco has a product called
"Local Director" that does exactly this. One of the experimental
versions of the Linux IP NAT code also appeared to do this
with some success. I don't know if any further work has
progressed along these lines.
</blockquote>
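<blockquote>As a sketch of the dispatching decision only (no packet
rewriting), here is one way such a front end might pick a backend. The
plain round robin policy, the class name, and the <tt>healthy</tt> set
are assumptions of mine; I am not describing how Local Director or the
Linux NAT code actually chooses.
</blockquote>

```python
import itertools


class Dispatcher:
    """Pick a backend for each new client and remember the choice."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)
        self._count = len(backends)
        self.healthy = set(backends)
        self.table = {}  # (client_ip, client_port) mapped to its backend

    def route(self, client):
        # Packets of an existing connection must keep going to the same server.
        if client in self.table and self.table[client] in self.healthy:
            return self.table[client]
        # New connection (or its server died): take the next healthy backend.
        for _ in range(self._count):
            backend = next(self._cycle)
            if backend in self.healthy:
                self.table[client] = backend
                return backend
        raise RuntimeError("no healthy backend")
```

<blockquote>Removing an address from <tt>healthy</tt> takes that server
out of rotation, which is where this sort of NAT starts to overlap with
failover.
</blockquote>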
<blockquote>Yet another approach that might make sense is to provide
for replication of the data (files) across servers and
to use protocols that transparently select among available
servers (mirrors). This sounds just like CODA.
</blockquote>
<code><blockquote><blockquote>
<A HREF="http://www.coda.cs.cmu.edu/top.html"
>http://www.coda.cs.cmu.edu/top.html</A>
</blockquote></blockquote></code>
<BLOCKQUOTE>A less sophisticated approach to replication is to
use the <TT>rsync</TT> package to maintain some failover servers
(mirrors) --- and require that writes all go to one
active server.
</blockquote>
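<blockquote>A minimal sketch of that arrangement, assuming the mirrors
are reachable over rsync's usual remote shell transport. The host names
and paths you would pass in are hypothetical, and the <tt>run</tt>
parameter exists only so the loop can be tried without touching real
machines.
</blockquote>

```python
import subprocess


def mirror_command(source_dir, mirror_host, dest_dir):
    """Build an rsync command that makes dest_dir on mirror_host match source_dir."""
    return [
        "rsync",
        "-a",        # archive mode: recurse, keep permissions, times, symlinks
        "--delete",  # remove files on the mirror that are gone from the master
        source_dir.rstrip("/") + "/",  # trailing slash: copy the contents
        f"{mirror_host}:{dest_dir}",
    ]


def push_to_mirrors(source_dir, mirrors, dest_dir, run=subprocess.run):
    """Push from the one writable server to each mirror; return hosts that failed."""
    failed = []
    for host in mirrors:
        result = run(mirror_command(source_dir, host, dest_dir))
        if result.returncode != 0:
            failed.append(host)
    return failed
```

<blockquote>Run from cron on the active server, this keeps the mirrors
no more than one interval behind. Anything written to a mirror directly
would be wiped out by the next push, hence the single writer rule.
</blockquote>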
<STRONG><P><IMG SRC="../gx/dennis/qbub.gif" ALT="(?)" width="50" height="28" border="0" align="bottom"
>So, I am open to suggestions, comments, info, links to sites, book
titles, etc. I have proposed a one year development time for the whole
cluster, with a single machine application prototype of the user
visible/used portion by around January 1999. I love my job!
</p></strong>
<P><STRONG>Jim Kinney M.S.
<BR>Educational Technology Specialist
<BR>Department of Physics
<BR>Emory University
</strong></p>
<BLOCKQUOTE><IMG SRC="../gx/dennis/bbub.gif" ALT="(!)" width="50" height="28" border="0" align="bottom"
>Web, mail, DNS and a number of other Internet services
are naturally robust. With DNS you normally list up to
three servers per host (in the <TT>/etc/resolv.conf</TT>) and all
of these will be checked before a name lookup will fail.
With SMTP the client will try each of the hosts listed
in the results of an MX query. With round robin DNS, most
clients will end up trying several different IP addresses on
failure most of the time.
</blockquote>
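<blockquote>Both of those fallback lists are simple to picture in code.
The sketch below pulls the nameserver entries out of resolv.conf style
text and puts a set of MX answers into the order a sender would try
them (lowest preference value first). The function names and the sample
data you would feed it are my own assumptions; a real mailer also
shuffles exchangers of equal preference, which this does not.
</blockquote>

```python
MAXNS = 3  # resolvers traditionally honor only the first three entries


def nameservers(resolv_conf_text):
    """Return the nameserver addresses a resolver would consult, in order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        if line.startswith(("#", ";")):
            continue  # comment line
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers[:MAXNS]


def mx_order(mx_records):
    """Sort (preference, host) pairs so the most preferred exchanger comes first."""
    return [host for _preference, host in sorted(mx_records)]
```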
<BLOCKQUOTE>However, the applications that really need HA (fail over)
and clustering for performance are things like db servers.
</blockquote>
<BLOCKQUOTE>Having two systems monitor and process something like
a set of db transactions in parallel (one active, the
other "mimicking" the first but not returning results)
would be very interesting. The "mimic" would attempt
to maintain the same application state as the server
--- and would assume the server's IP and MAC (ethernet
media access control) addresses on failure --- to then
transparently continue the transaction processing
that was going on.
</blockquote>
<BLOCKQUOTE>You might prototype such a system using web and ftp
(the FTP application is a more dramatic demonstration
--- since a web server involves many short transactions
and mostly operates in a "disconnected" fashion).
</blockquote>
<BLOCKQUOTE>One approach might be to have a custom ethernet
driver that can be instructed to throw all of its
output into the bit bucket. Thus the mimic is
normally silent, but following a failure on the
server it does the address assumption and rips off
the muzzle. I suspect you'd have to have another
interface between the two servers, one which is
dedicated to maintaining the same state between
the server and the mimic.
</blockquote>
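<blockquote>The muzzle idea is easier to see at the application level
than in a driver, so here is a toy model of it. Everything in it (the
class, the <tt>handler</tt> and <tt>transmit</tt> callables, the
<tt>take_over</tt> trigger) is hypothetical; a real version would live
in the kernel and would also have to do the address assumption and the
state synchronization, which this ignores.
</blockquote>

```python
class Mimic:
    """Standby that processes the same requests as the server but stays silent."""

    def __init__(self, handler, transmit):
        self.handler = handler    # application logic shared with the server
        self.transmit = transmit  # callable that actually sends a reply
        self.muzzled = True
        self.discarded = 0

    def on_request(self, request):
        reply = self.handler(request)  # state advances exactly as on the server
        if self.muzzled:
            self.discarded += 1        # reply goes into the bit bucket
        else:
            self.transmit(reply)

    def take_over(self):
        """Called once the server is judged dead; address assumption goes here."""
        self.muzzled = False
```

<blockquote>Because the mimic ran every request while muzzled, its first
audible reply after <tt>take_over()</tt> reflects all of the earlier
traffic, which is the whole point of the exercise.
</blockquote>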
<BLOCKQUOTE>(For example, if the server gets a collision or
an error that wasn't sensed by the mimic -- or
vice versa -- the two might get horribly out
of sync when the upper layer protocols require
a resend. With special drivers the two systems
might resolve these discrepancies at the kernel/driver
layer --- so that the applications will always get
the same data on their sockets).
</blockquote>
<BLOCKQUOTE>I really have no idea how much tweaking this would
take and whether or not it's even feasible.
</blockquote>
<BLOCKQUOTE>However, it seems that your intent is to provide
failover that is transparent to the applications
layer. So, the work obviously has to happen below that.
</blockquote>
<BLOCKQUOTE>It is unclear whether you are primarily interested
in deploying a set of servers for use by your
Physics team or whether you are interested in
doing research and development in computer science.
</blockquote>
<blockquote>In any event your project will probably involve a
hybrid of several of these approaches:
</blockquote>
<ul>
<li>Round robin DNS
<li>Failover with IP and MAC address assumption.
(and/or load balancing NAT).
<li>Replication and/or "mirroring" (more failover).
<li>Multi-initiator SCSI (where a single SCSI
bus has multiple computers active on it, such
that these computers have shared access to
the attached peripheral devices).
</ul>
<blockquote>It would be very interesting to see someone develop
process migration and checkpointing features for Linux
though there doesn't seem to be any active work going
on now.
</blockquote>
<BLOCKQUOTE>I'd also love to see a
"<a href="http://google.stanford.edu/linux?query=doc:cesdis.gsfc.nasa.gov/linux-web/beowulf/beowulf.html">Beowulf</a> enabled" SQL dbserver
(where a couple of failover capable "dispatchers" could
farm out transactions to multiple clustered Linux boxes
in some sensible manner). I'm not even sure if that's
feasible --- but it sure would knock down the scalability
walls that I hear about from those dbadmins.
</blockquote>
<!-- end body -->
<!--startcut ======================================================= -->
<P> <hr> <P>
<H5 align="center"><a href="http://www.linuxgazette.com/ssc.copying.html"
>Copyright ©</a> 1998, James T. Dennis <BR>
Published in <I>Linux Gazette</I> Issue 32 September 1998</H5>
<P> <hr> <P>
<!--::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::-->
<table width="98%"><tr valign="center" align="center">
<td rowspan="3"><A HREF="../lg_answer32.html"><IMG
SRC="../gx/dennis/answernew.gif"
ALT="[ Answer Guy Index ]"></A></td>
<td><A HREF="tag_phreak.html">phreak</A>
<td><A HREF="tag_abandon.html">abandon</A>
<td><A HREF="tag_javaterm.html">javaterm</A>
<td><A HREF="tag_BBS.html">BBS</A>
<td><A HREF="tag_flaws.html">flaws</A>
<td><A HREF="tag_doslinux.html">doslinux</A>
<td><A HREF="tag_resume.html">resume</A>
</tr><tr valign="center" align="center">
<td><A HREF="tag_softwindows.html">softwindows</A>
<td><A HREF="tag_convert.html">convert</A>
<td><A HREF="tag_apache.html">apache</A>
<td><A HREF="tag_emulate.html">emulate</A>
<td><A HREF="tag_database.html">database</A>
<td><A HREF="tag_distrib.html">distrib</A>
<td><A HREF="tag_proxy.html">proxy</A>
</tr><tr valign="center" align="center">
<td><A HREF="tag_disable.html">disable</A>
<td><A HREF="tag_DVI.html">DVI</A>
<td><A HREF="tag_superblock.html">superblock</A>
<td><A HREF="tag_serial.html">serial</A>
<td><A HREF="tag_permission.html">permission</A>
<td><A HREF="tag_detach.html">detach</A>
<td><A HREF="tag_cdr.html">cdr</A>
</tr><tr valign="center" align="center">
<td><A HREF="tag_rs422.html">rs422</A>
<td><A HREF="tag_modem.html">modem</A>
<td><A HREF="tag_notfound.html">notfound</A>
<td><A HREF="tag_tuning.html">tuning</A>
<td><A HREF="tag_libc5.html">libc5</A>
<td><A HREF="tag_startup.html">startup</A>
<td><A HREF="tag_clock.html">clock</A>
<td><A HREF="tag_ping.html">ping</A>
</tr><tr valign="center" align="center">
<td><A HREF="tag_accounts.html">accounts</A>
<td><A HREF="tag_lilo.html">lilo</A>
<td><A HREF="tag_NDS.html">NDS</A>
<td><A HREF="tag_95slow.html">95slow</A>
<td><A HREF="tag_nonlinux.html">nonlinux</A>
<td><A HREF="tag_progenv.html">progenv</A>
<td><A HREF="tag_cluster.html">cluster</A>
<td><A HREF="tag_ftpd.html">ftpd</A>
</tr></table>
<P> <hr> <P>
<!--::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::-->
<A HREF="./lg_toc32.html"><IMG SRC="../gx/indexnew.gif"
ALT="[ Table Of Contents ]"></A>
<A HREF="../index.html"><IMG SRC="../gx/homenew.gif"
ALT="[ Front Page ]"></A>
<A HREF="lg_bytes32.html"><IMG SRC="../gx/back2.gif"
ALT="[ Previous Section ]"></A>
<A HREF="./stemen.html"><IMG SRC="../gx/fwd.gif"
ALT="[ Next Section ]"></A>
<!--::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::-->
</body>
</html>
<!--endcut ========================================================= -->