File: FAQ

Q: What documents should I read to learn more about mod_backhand?
A: The papers, presentations and assorted documentation:
<UL><LI><A HREF="FAQ.shtml">Frequently Asked Questions (FAQ)</A> (<I>This document</I>)
<LI>mod_backhand: A load balancing module for the Apache web server -- Technical white paper (<a href="http://www.cnds.jhu.edu/pub/papers/cnds-2000-2.rtf">rtf</a>,
<a href="http://www.cnds.jhu.edu/pub/papers/cnds-2000-2.ps">ps</a>,
<a href="http://www.cnds.jhu.edu/pub/papers/cnds-2000-2.ps.gz">ps.gz</a>)
<LI><A HREF="http://www.backhand.org/ApacheCon2000/">mod_backhand: A load balancing module for the Apache web server (ApacheCon2000 in Orlando, FL) -- Presentation</A>
<LI><A HREF="compiling.shtml">Compilation Help</A>
<LI><A HREF="installing.shtml">Installation Help</A>
<LI><A HREF="configuring.shtml">Configuration Help</A>
<LI><A HREF="performance.shtml">Performance Tuning Tips</A>
</UL>

Q: Are there mailing lists?
A: Yes.  Visit <A HREF="http://lists.backhand.org/mailman/listinfo/">http://lists.backhand.org/mailman/listinfo/</A> to join the users or developers lists.

Q: Does mod_backhand induce much overhead when it is not enabled for a
directory?
A: Very little.  The same as any other module.

Q: How does mod_backhand know about the resources on other machines?
A: When you configure the web server, you supply a mechanism for distributing
resource information.  The currently supported mechanisms are UDP based via
Ethernet broadcast or IP multicast.

Q: How does mod_backhand decide which is the "best" server?
A: It uses what we call candidacy functions.  A list of ALL the webservers is
passed to the first function, which modifies that list.  The result is then
passed to the next function, and so on.  The functions are executed in the
order in which they appear in the configuration.
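
The chaining described above can be sketched in a few lines of Python.  This
is only an illustration, not the module's C implementation: the dictionary
fields and the two example functions (drop_stale, sort_by_load) are
hypothetical stand-ins, and it assumes that the server at the front of the
final list is the one that ends up being used.

```python
def run_candidacy_chain(servers, functions):
    """Pass the candidate list through each function in configuration order."""
    candidates = list(servers)
    for fn in functions:
        candidates = fn(candidates)
    return candidates


def drop_stale(candidates):
    # Hypothetical filter: keep servers heard from within the last 20 seconds.
    return [s for s in candidates if s["age"] <= 20]


def sort_by_load(candidates):
    # Hypothetical reordering: lowest load first.
    return sorted(candidates, key=lambda s: s["load"])


servers = [
    {"name": "www1", "age": 3, "load": 1.5},
    {"name": "www2", "age": 45, "load": 0.1},
    {"name": "www3", "age": 5, "load": 0.4},
]

best = run_candidacy_chain(servers, [drop_stale, sort_by_load])
print([s["name"] for s in best])  # ['www3', 'www1']
```

Note that www2 has the lowest load but is dropped by the first function, so
the order of the functions matters.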

Q: Does mod_backhand only proxy?  Can it HTTP redirect?
A: Versions 1.0.9 and later have the ability to issue HTTP redirects to
browsers instead of proxying them internally.  See the HTTPRedirectTo{IP,Name}
directives described in the "What do the built-in candidacy functions do?"
section in this FAQ.

Q: What do the built-in candidacy functions do?
A: There are currently a number of built-in candidacy functions.
<UL>
<LI><B>off</B> -- this disables mod_backhand for a directory.
<LI><B>addSelf</B> -- adds the local server to the end of the list of
candidates if that server does not already exist in the list.  If you are
using a peer-based topology, you most often want to consider the local server
as a candidate, and some configurations (byRandom, byLogWindow) could remove it.
<LI><B>removeSelf</B> -- removes the local server from the list of candidates
if and only if that server already exists in the list.  If you are
using a tier-based topology, front-end servers most often should not
include themselves.  Of course, if you have multiple front-end machines you
will need to eliminate all of them using a custom candidacy function or
something like byHostname.  removeSelf is great for testing.
<LI><B>byAge [time in seconds]</B> -- eliminates servers that you have not
received resource information for in a certain amount of time.  The default is
20 seconds, but you can pass a parameter that is interpreted as an integer
number of seconds.
<LI><B>byLoad [bias]</B> -- Reorders the list of candidates from
the lowest load to the highest load.
The bias (a floating point number) is used to
prefer yourself over proxying the request and can be used to
approximate the effort involved in proxying a request.  It is the
amount of load subtracted from your own load before the servers are sorted.
<LI><B>byBusyChildren [bias]</B> -- like the byLoad function, it reorders
the list of candidates from least loaded to most loaded.  Instead of using
the system load (which has poor granularity), it uses the number of busy
Apache children.  This should be a decent approximation of the length of the
system run queue.
The bias (an integer) is used to
prefer yourself over proxying the request and can be used to
approximate the effort involved in proxying a request.  It is the
amount of load (in run-queue units) subtracted from your own load before
the servers are sorted.  A good number is between 1 and 5.
<LI><B>byCPU</B> -- like byLoad, except that it prefers the servers with the
absolute highest CPU idle.  This is mostly useless.  Don't use it
unless you really know <I>why</I> you are using it.
<LI><B>byLogWindow</B> -- eliminates all but the first log base 2 of
the n servers passed in.  So, if 17 servers are passed in, the first 4 remain.
<LI><B>byRandom</B> -- this function randomly (pseudo-randomly, of course)
reorders the list of servers given as input.
<LI><B>byCost</B> -- this function attempts to assign a cost to the assignment
of a request to each machine in the cluster.  It then chooses the assignment
that costs the least.  The method of cost assignment is based on a
cost-benefit framework as discussed in the paper titled
<A HREF="http://www.cnds.jhu.edu/pub/papers/dss99.ps">"A Cost-Benefit
Framework for Online Management of a Metacomputing System"</A> by
Amir, Awerbuch, and Borgstrom.
<LI><B>HTTPRedirectToIP</B> -- Tell mod_backhand that it is to send clients
to servers within the cluster via an HTTP redirect of the form
http://1.2.3.4/request/uri rather than the default method of proxying.
<LI><B>HTTPRedirectToName [format string]</B> -- Tell mod_backhand that it
is to send clients to servers within the cluster via an HTTP redirect of the
form http://format string/request/uri rather than the default method of
proxying.  If the format string is omitted, the ServerName of the chosen
Apache server is used.  As this is not always a desirable choice, the format
string provides a means for more intelligent hostname creation.  It allows
the construction of the new hostname based on the left portion of the
ServerName (the <B>Hostname</B> on the backhand-handler page) and the right
portion of the hostname provided from the <I>Host:</I> header.  This
facilitates clustered name-based virtual hosting setups.  A more detailed
example is outlined in the "How do I use the HTTPRedirectToName directive's
format string?" section of this FAQ.
<LI><B>bySession [identifier]</B> -- this function will attempt to find a
cookie named [identifier] or a querystring variable named [identifier].  It will
then attempt to hex decode the first 8 bytes of its content into an IPv4 style
IP address.  It will attempt to find this IP address in the list of candidates
and if it is found it will make the server in question the only remaining
candidate.  If any of the above steps fails, it will not alter the candidacy
list.  This, plus a bit of server-side application code, can be used to
implement sticky user sessions -- where a given user will always be delivered
to the same server once a session has been established.  [identifier] defaults
to "PHPSESSID=" as it was originally written to support PHP sessions by Martin
Domig.  For more information see:
<A HREF="README.bySession.shtml">README.bySession file included in the
distribution</A>.
<LI><B>byHostname &lt;regexp&gt;</B> -- this function is an example of a user
built function that mod_backhand loads at run-time; it is not actually "built
in".  It eliminates all servers whose hostnames do not match the regular
expression given as the parameter.
</UL>
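
To make the bySession description more concrete, here is a rough Python
sketch of the decoding step.  It is not taken from the module source: the
function names are hypothetical, candidates are represented as plain IP
strings, and it assumes the eight hex characters encode the address in
network byte order (so "0a000004" means 10.0.0.4).

```python
import socket


def session_to_ip(value):
    """Decode the first 8 hex characters of a session value into dotted-quad form.

    Returns None when the value is too short or is not valid hex.
    """
    prefix = value[:8]
    if len(prefix) < 8:
        return None
    try:
        raw = bytes.fromhex(prefix)
    except ValueError:
        return None
    if len(raw) != 4:  # fromhex() skips whitespace, so re-check the length
        return None
    return socket.inet_ntoa(raw)


def by_session(candidates, session_value):
    """Narrow the list to the server named by the session, else leave it alone."""
    ip = session_to_ip(session_value) if session_value else None
    if ip in candidates:
        return [ip]
    return candidates


cluster = ["10.0.0.3", "10.0.0.4", "10.0.0.5"]
print(by_session(cluster, "0a000004c0ffee"))  # ['10.0.0.4']
print(by_session(cluster, "not-a-session"))   # list is left untouched
```

The server-side application code would have to generate session identifiers
that begin with the hex-encoded address of the server that created them.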
<BR>
These can be cascaded to do fairly complicated things:<BR><BR>
Say you want to redirect all database-oriented CGI requests to your 64-bit
architecture machines.  The real names of your machines are intel1
through intel10, sun1 through sun10, and alpha1 through alpha10.  Of course,
you want to choose a random window of those machines (so as not to have
EVERYONE go to the least loaded machine... because it would get clobbered).
Of that window you want to pick the machine with the lowest load.  Oh
yeah, of course you would like to avoid any machine you have not heard from in
the last 6 seconds.
<PRE>
Backhand byAge 6
BackhandFromSO libexec/byHostname.so byHostname (sun|alpha)
Backhand byRandom
Backhand byLogWindow
Backhand byLoad
</PRE>
Pretty easy, huh?
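
For the curious, the cascade above can be simulated in Python.  This is a
sketch under stated assumptions rather than the real implementation: the
server records, the made-up load figures and the function names are all
hypothetical, and byLogWindow is assumed to keep floor(log2(n)) servers,
which is what the "17 servers in, 4 remain" example suggests.

```python
import random
import re


def by_age(cands, max_age=20):
    # Drop servers we have not heard from within max_age seconds.
    return [s for s in cands if s["age"] <= max_age]


def by_hostname(cands, pattern):
    # Keep only servers whose hostname matches the regular expression.
    rx = re.compile(pattern)
    return [s for s in cands if rx.search(s["host"])]


def by_random(cands, rng):
    shuffled = list(cands)
    rng.shuffle(shuffled)
    return shuffled


def by_log_window(cands):
    # floor(log2(n)) via bit_length: 17 -> 4.  Read literally, a list of
    # one server yields an empty window; the real module may differ there.
    keep = len(cands).bit_length() - 1 if cands else 0
    return cands[:keep]


def by_load(cands, self_host=None, bias=0.0):
    # Lowest load first; the bias is subtracted from the local server's load.
    def effective(s):
        return s["load"] - bias if s["host"] == self_host else s["load"]
    return sorted(cands, key=effective)


rng = random.Random(42)
cluster = [
    {"host": f"{arch}{i}", "age": 2, "load": rng.uniform(0.0, 4.0)}
    for arch in ("intel", "sun", "alpha")
    for i in range(1, 11)
]
cluster[10]["age"] = 30  # pretend sun1 has gone quiet

cands = by_age(cluster, 6)
cands = by_hostname(cands, r"(sun|alpha)")
cands = by_random(cands, rng)
cands = by_log_window(cands)
cands = by_load(cands)
print([s["host"] for s in cands])
```

Of the 30 machines, 19 survive the age and hostname filters, a random 4 of
those make it through the window, and the least loaded of the 4 ends up
first.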

Q: How do I use the HTTPRedirectToName directive's format string?
A: This directive is used to issue HTTP redirect responses to requests.  When
will it service the request rather than redirect?  How will it preserve my
domain name if the ServerName doesn't have the same domain name?  These are
all really a rephrasing of the same question.  Here is the answer.<BR>
The format string is just like a C format string except that it only has two
insertion tokens: %#S and %#H (where # is a number, optionally preceded by a
minus sign).
<UL><LI>%-#S will be the server name with the right # parts chopped off.  If
your server name is www-1.jersey.domain.com, %-3S will yield www-1
<LI>%#S will be the server name with only the # left parts preserved.  If
your server name is www-1.jersey.domain.com, %2S will yield www-1.jersey
<LI>%-#H will be the Host: with only the right # parts preserved.  If the
Host: is www.client.com, %-2H will yield client.com
<LI>%#H will be the Host: with the left # parts chopped off.  If the Host:
is www.client.com, %1H will yield client.com
</UL>
For example, say you run a hosting company, hosting.com, and you have 5
machines named www[1-5].sanfran.hosting.com.  You host www.client1.com and
www.client2.com.  You also add appropriate DNS names for
www[1-5].sanfran.client[12].com.<BR>
<PRE>Backhand HTTPRedirectToName %-2S.%-2H</PRE>
This will redirect requests for www.client#.com to one of the
www[1-5].sanfran.client#.com hosts.
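
The four token rules can be checked with a small Python sketch.  This
re-implements the rules as they are described in this FAQ; it is not the
module's own parser, the function name expand_redirect_format is
hypothetical, and it ignores details such as a port number in the Host:
header.

```python
import re

# %#S, %-#S, %#H and %-#H, where # is a decimal number.
_TOKEN = re.compile(r"%(-?)(\d+)([SH])")


def expand_redirect_format(fmt, server_name, host_header):
    """Expand the S and H tokens in fmt using the rules listed in the FAQ."""

    def substitute(match):
        negative = match.group(1) == "-"
        count = int(match.group(2))
        if match.group(3) == "S":
            parts = server_name.split(".")
            # %-#S: chop # parts off the right; %#S: keep the left # parts.
            kept = parts[:max(0, len(parts) - count)] if negative else parts[:count]
        else:
            parts = host_header.split(".")
            # %-#H: keep the right # parts; %#H: chop # parts off the left.
            kept = parts[max(0, len(parts) - count):] if negative else parts[count:]
        return ".".join(kept)

    return _TOKEN.sub(substitute, fmt)


print(expand_redirect_format("%-3S", "www-1.jersey.domain.com", ""))  # www-1
print(expand_redirect_format("%2S", "www-1.jersey.domain.com", ""))   # www-1.jersey
print(expand_redirect_format("%-2H", "", "www.client.com"))           # client.com
print(expand_redirect_format("%-2S.%-2H", "www3.sanfran.hosting.com",
                             "www.client1.com"))  # www3.sanfran.client1.com
```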

Q: Can I write my own candidacy functions?
A: byHostname.c is supplied as an example.  The entire serverstats structure
is available to you.  This provides things like load, number of CPUs,
available memory, total memory, number of Apache servers running (and number
occupied) and some other things.<BR>
Apparently there have been many reported problems on various systems regarding
dynamically loaded objects from dynamically loaded modules in Apache.  If you
plan on making heavy use of run-time loadable candidacy functions, you should
compile the mod_backhand module into Apache statically.<BR><BR>
Oh yeah, the answer to this question is "Yes."

Q: Why does the load on my machine soar when I start Apache, and why does it
take a few seconds (or minutes) to actually be available to serve pages?
A: You are (a) running mod_backhand version 1.1.0 or lower, (b) running
mod_backhand for the first time, or (c) have deleted your mod_backhand-Arriba
file.  There is something we call "Arriba".  This is a measure of how fast
your machine's processor(s) are, combined.  We calculate this when Apache
starts... and yes, it hurts.  If you don't like it, you can modify the source
so that it does not recalculate this value at start up.  Maybe later this will
be a run-time option.
<BR>
Graceful restarts use the existing mod_backhand-Arriba file to determine the
Arriba of your machine.  So it doesn't hurt at all.

Q: mod_backhand won't work..  what is wrong?
A: Yikes! What a question!  Well, here is a little troubleshooting guide.
<UL>
<LI>mod_backhand uses SysV IPC.  Make sure you have shared memory and AF_UNIX
domain sockets available to you (by loading the modules or compiling them into
your kernel).
<LI>mod_backhand uses AF_UNIX domain sockets and they are stored in the
UnixSocketDir (set in httpd.conf).  Make sure this directory is owned by the
Apache UID/GID pair, usually nobody/nobody.  I suggest permissions of
0700.
<LI>mod_backhand logs to the ErrorLog, so look there.
<LI>mod_backhand has a diagnostic handler (like server-status).  Do a
<PRE>
&lt;Location /backhand-status&gt;
  SetHandler backhand-handler
  ... access permissions here ...
&lt;/Location&gt;
</PRE>
and then visit it.  It is actually a nice monitoring tool for your cluster.
I use it for that quite often.  Think of it as something like perftool for a cluster.
<LI>Make sure that you have Backhand directives in the directory that you want to
be balanced by backhand.  It doesn't try to redirect requests if you don't ask
it to.
</UL>

Q: I start backhand, but I can only see myself on the status page.  Why?
A: Hmm, this is complicated and could be caused by many misconfigurations.  The
most common involves IP addresses.  mod_backhand servers announce themselves
to each other via IP datagrams (broadcast or multicast).  In that packet, each
server says: "my IP is aaa.bbb.ccc.ddd".  mod_backhand has a directive called
AcceptStats that allows you to set up ACLs that control the inclusion
of other servers in your "candidates list".<BR><BR>
Well, that leads us to: <I>How does mod_backhand determine the IP address to
advertise?</I>  Unless told otherwise, mod_backhand will call
<B>gethostbyname(gethostname())</B>, so whatever IP the output of the
'hostname' command resolves to will be stuffed into that packet.<BR><BR>
With that said, how do you change it?  The MulticastStats directive takes two
forms:<BR>
<PRE>
MulticastStats &lt;dest addr&gt;:&lt;port&gt;[,ttl]
MulticastStats &lt;myip addr&gt; &lt;dest addr&gt;:&lt;port&gt;[,ttl]
</PRE>
Say your hostname is www3 and it resolves to 1.2.3.4, but you want to announce
stats to 10.0.0.255 port 4445 and only accept stats from 10.0.0.0/24.  By
default, it will stuff 1.2.3.4 into the packet and the announcement will be
trashed because 1.2.3.4 doesn't match 10.0.0.0/24.  Also, if it announces
1.2.3.4, other servers will try to proxy to 1.2.3.4 (and you want to use your
high speed private network).
For this type of (common) configuration, use the two-argument form of
MulticastStats, for example:
<PRE>
MulticastStats 10.0.0.4 10.0.0.255:4445
</PRE>
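
To see why the default announcement gets trashed, here is a small Python
sketch of the kind of check an ACL implies.  It is an assumption about the
behaviour, not the module's code: AcceptStats is modelled as a simple "is the
advertised address inside one of the accepted networks" test, and the
function name stats_accepted is hypothetical.

```python
import ipaddress


def stats_accepted(advertised_ip, accepted_networks):
    """Return True if the advertised address falls inside any accepted network."""
    addr = ipaddress.ip_address(advertised_ip)
    return any(addr in ipaddress.ip_network(net) for net in accepted_networks)


acl = ["10.0.0.0/24"]
print(stats_accepted("1.2.3.4", acl))   # False: the default, resolved from the hostname
print(stats_accepted("10.0.0.4", acl))  # True: the address given to MulticastStats
```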

Q: I run Solaris and I get this weird error upon start up: "[error]
(22)Invalid argument: shmctl(., IPC_SET, [-1,-1]) could not set segment
#<I>number</I>", what is wrong?
A: The uid and gid have not yet been set for Apache in the configuration file.
We don't know why Apache exhibits this behaviour on Solaris and not on Linux,
but, more importantly, we know how to fix it.  Simply move the
<PRE>
User nobody
Group nobody
</PRE>
directives above the AddModule/LoadModule sections in httpd.conf.  Weird, huh?

Q: When I start up Apache/mod_backhand, the backhand-handler tells me that the
number of available servers and the total number of servers is 0.  This isn't
true, why is it wrong and how do I fix it?
A: We don't actually know why this is happening; the internals of Apache are
somewhat confusing.  I believe it has to do with the order in which modules
are loaded.  The fact that mod_backhand attaches to the scoreboard file in the
parent process, NOT in the child, has something to do with it.  But, again, we
know how to work around this problem.  Simply gracefully restart Apache and the
number will, from then on, be correct and current.