Issues for ProxyBackendSelection 'roundRobin':
1. Selection strategy is per-vhost.
2. Selection lists are per-vhost.
This means that doing the selection in a 'core.fork' event listener, before
the vhost is known, would not work easily. We don't want to simply increment
a single counter/index shared across all vhosts, as that would not actually
give the expected "round robin" behavior per vhost.
To implement round robin, we need:
1. An ordered list of backend servers, whose order is preferably stable.
How would healthchecks, with servers moving into and out of the "live"
list, affect this?
What happens if the "previous selection" or "next selection" (whichever
we persist) is not available in the "live" list for the next connection?
2. Knowledge of what the "next" server should be (or, conversely,
what the previous server was)
3. Persistence of #2 in some storage accessible across connections to
that vhost (i.e. vhost-specific storage).
Possibilities: SysV shm, file, SQL, memcache, external process.
What about an mmap'd file? Still needs locking (with retries?)
Any of these would require a postparse (startup?) event listener, to see
if any vhost has RoundRobin selection configured, to create/prep the
shared storage area.
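A rough sketch of what such a listener might look like, assuming ProFTPD's
pr_event_register()/find_config() APIs and mod_proxy's usual includes; the
prep_selection_storage() helper is hypothetical:

  /* Sketch only: prep_selection_storage() is a hypothetical helper. */
  static void proxy_select_postparse_ev(const void *event_data,
      void *user_data) {
    server_rec *s;

    for (s = (server_rec *) server_list->xas_list; s != NULL; s = s->next) {
      config_rec *c;

      c = find_config(s->conf, CONF_PARAM, "ProxyBackendSelection", FALSE);
      if (c != NULL) {
        /* If this vhost uses the roundRobin policy, create/prep its
         * shared storage area.
         */
        prep_selection_storage(s, c);
      }
    }
  }

  /* Registered in the module init function: */
  pr_event_register(&proxy_module, "core.postparse",
    proxy_select_postparse_ev, NULL);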
What if there are THREE backend server lists:
configured: conf/
live:       live/
dead:       dead/
The "configured" list would be static, wouldn't change, would have stable
ordering. That could then be the reference server/index for round robin.
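A rough C sketch of those three lists, per vhost (struct and field names are
hypothetical):

  /* Sketch only; names are hypothetical. */
  struct proxy_backend_lists {
    unsigned int sid;              /* vhost SID */

    /* Static, stably-ordered list from the config; never changes. */
    const char **configured;
    unsigned int configured_count;

    /* Indices into 'configured' for servers currently considered up/down;
     * these change as healthchecks run.
     */
    unsigned int *live;
    unsigned int live_count;

    unsigned int *dead;
    unsigned int dead_count;
  };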
proxy_select_backends:
  vhost_sid INTEGER,
  name TEXT,           -- e.g. "server1.example.com"
  backend_id INTEGER,  -- used for ordering
  healthy BOOLEAN,     -- used for healthchecks
  nconns INTEGER       -- used for least conns

  SELECT name FROM proxy_select_backends WHERE vhost_sid = {...} ORDER BY backend_id ASC;
proxy_select_roundrobin:
vhost_sid INTEGER,
current_backend_id INTEGER (FK into proxy_select_backends.backend_id?)
proxy_select_shuffle:
insert rows for used backend IDs; delete them all on _reset()
(OR insert rows for UNUSED backend IDs; delete as used)
a selection policy object, with callbacks into the database
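That policy object might look something like this sketch (the callback set
and names are hypothetical; 'pool' is ProFTPD's memory pool type):

  /* Sketch only; the callbacks and names are hypothetical. */
  struct proxy_select_policy {
    const char *name;  /* e.g. "roundRobin", "shuffle" */

    /* Called at startup to create/prep the policy's tables/files. */
    int (*init)(pool *p, unsigned int vhost_sid);

    /* Returns (via idx) the index, into the configured list, of the
     * chosen backend.
     */
    int (*next)(pool *p, unsigned int vhost_sid, unsigned int *idx);

    /* Called after a backend has been used, e.g. to update nconns or
     * response times in the database.
     */
    int (*used)(pool *p, unsigned int vhost_sid, unsigned int idx,
      unsigned long response_ms);
  };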
Basic data structure:
vhost1
index (into 'configured' list) of current/selected backend server
max index value (i.e. length of 'configured' list minus 1)
...
vhostN
Could use SID to identify vhost.
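So the per-vhost round-robin state could be as small as (sketch, hypothetical
names):

  /* Sketch only; names are hypothetical. */
  struct roundrobin_state {
    unsigned int sid;       /* vhost SID, as the lookup key */
    unsigned int curr_idx;  /* index into the 'configured' list */
    unsigned int max_idx;   /* length of 'configured' list minus 1 */
  };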
  unsigned int idx;
  int res;

  res = proxy_roundrobin_get_index(main_server->sid, &idx);

  /* Get/use the backend at index idx here. */

  /* Advance, wrapping past max_idx (length of 'configured' minus 1). */
  idx++;
  if (idx > max_idx) {
    idx = 0;
  }

  res = proxy_roundrobin_set_index(main_server->sid, idx);
OR:
  unsigned int idx;
  int res;

  res = proxy_roundrobin_incr_index(main_server->sid, &idx);
This would "atomically" return the current index, and
increment (with wraparound) the index for the next call.
Callers, then, don't need to know about the max_idx.
With this arrangement, an on-disk mmap'd file would have range
locking, and a "row" would be:
uint32_t sid
uint32_t backend_server_count
uint32_t backend_server_idx
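A sketch of proxy_roundrobin_incr_index() against such a file, using fcntl()
byte-range locking on the row. The signature here takes an fd and the row's
offset (rather than the SID) purely to keep the sketch self-contained, and
on-disk byte order is ignored for brevity (see the big-endian note below):

  #include <fcntl.h>
  #include <stdint.h>
  #include <string.h>
  #include <unistd.h>

  /* Row layout, per the sketch above: sid, backend_server_count,
   * backend_server_idx (each a uint32_t).
   */
  static int proxy_roundrobin_incr_index(int fd, off_t row_off,
      unsigned int *idx) {
    struct flock lock;
    uint32_t row[3];
    int res = -1;

    /* Lock just this row, blocking until the lock is granted. */
    memset(&lock, 0, sizeof(lock));
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = row_off;
    lock.l_len = sizeof(row);

    if (fcntl(fd, F_SETLKW, &lock) < 0) {
      return -1;
    }

    if (pread(fd, row, sizeof(row), row_off) == sizeof(row) &&
        row[1] > 0) {

      /* Return the current index, then advance it with wraparound, so
       * callers don't need to know about max_idx.
       */
      *idx = row[2];
      row[2] = (row[2] + 1) % row[1];

      if (pwrite(fd, row, sizeof(row), row_off) == sizeof(row)) {
        res = 0;
      }
    }

    lock.l_type = F_UNLCK;
    (void) fcntl(fd, F_SETLK, &lock);
    return res;
  }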
Alternatively, the entire selection database could be a single JSON file;
the "locking" would be done on the basis of the entire file.
OR, alternatively, the selection database could be a SQL database (e.g.
SQLite), a la mod_auth_otp.
Store these databases in a policy-specific, vhost-specific file (to reduce
contention)? This would make it easy for each policy to have its own
format, as needed.
CONF_ROOT|CONF_VIRTUAL, NO <Global>.
Leaning toward SQLite.
SELECT ...
INSERT sid, ...
UPDATE ...
Per Policy! Hrm. Maybe have mod_proxy create its own tables, etc.
(just configure a path). That'd work nicely. On startup, delete
table if it exists (warn about this!), create needed schema. Much
less fuss for the admin.
Make lib/db/sqlite.c file, use sqlite3 directly (NOT via
mod_sql+mod_sql_sqlite).
This would be the "select.dat" file/table, the basis for backend
selection.
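A minimal sketch of that startup path, calling sqlite3 directly (error
handling trimmed; the schema follows the proxy_select_backends sketch above):

  #include <sqlite3.h>
  #include <stdio.h>

  static int proxy_db_init(const char *path) {
    sqlite3 *db = NULL;
    char *errmsg = NULL;
    const char *stmts =
      /* Per the note above, a real implementation would warn before
       * dropping an existing table.
       */
      "DROP TABLE IF EXISTS proxy_select_backends;"
      "CREATE TABLE proxy_select_backends ("
      "  vhost_sid INTEGER,"
      "  name TEXT,"
      "  backend_id INTEGER,"
      "  healthy BOOLEAN,"
      "  nconns INTEGER"
      ");";

    if (sqlite3_open(path, &db) != SQLITE_OK) {
      sqlite3_close(db);
      return -1;
    }

    if (sqlite3_exec(db, stmts, NULL, NULL, &errmsg) != SQLITE_OK) {
      fprintf(stderr, "sqlite error: %s\n", errmsg);
      sqlite3_free(errmsg);
      sqlite3_close(db);
      return -1;
    }

    sqlite3_close(db);
    return 0;
  }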
For health checks:
uint32_t sid;
uint32_t backend_server_count;
uint32_t live_servers[backend_server_count];
uint32_t dead_servers[backend_server_count];
...
Note: Use big-endian values for all numeric values on disk, as if writing
to the network, for file "reuse" on other servers?
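E.g. writing one vhost's health-check record in network byte order might look
like this sketch (the function name is hypothetical; the layout follows the
field list above, with both arrays backend_server_count long):

  #include <arpa/inet.h>
  #include <stdint.h>
  #include <unistd.h>

  static int write_healthcheck_record(int fd, uint32_t sid,
      uint32_t backend_server_count, const uint32_t *live_servers,
      const uint32_t *dead_servers) {
    uint32_t i, val;

    val = htonl(sid);
    if (write(fd, &val, sizeof(val)) != sizeof(val)) return -1;

    val = htonl(backend_server_count);
    if (write(fd, &val, sizeof(val)) != sizeof(val)) return -1;

    /* live_servers[] and dead_servers[] are each backend_server_count
     * entries long, per the layout above (presumably backend indices).
     */
    for (i = 0; i < backend_server_count; i++) {
      val = htonl(live_servers[i]);
      if (write(fd, &val, sizeof(val)) != sizeof(val)) return -1;
    }

    for (i = 0; i < backend_server_count; i++) {
      val = htonl(dead_servers[i]);
      if (write(fd, &val, sizeof(val)) != sizeof(val)) return -1;
    }

    return 0;
  }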
Note: Would be nice, given the above format, to NOT have to scan the
entire file to find the vhost in question, given the variable length
of the server lists per vhost.
Could deal with that by having a header, which would be the offset of
that SID into the file. I.e.:
off_t sid_offs[vhost_count];
sid 1: off = 0
sid 2: off = 42
and remember that vhost SIDs start at 1!
off_t sid_off = sid_offs[main_server->sid-1];
lseek(fd, sid_off, SEEK_SET);
readv(...)
readv(...)
Note: have a version uint32_t as the first value in the file, for indicating
the file format version.
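Pulling the version header, the per-SID offset table, and the big-endian note
together, the read path might look like this sketch (a vhost_count header
field and 32-bit offsets are assumptions made to keep the sketch simple):

  #include <arpa/inet.h>
  #include <stdint.h>
  #include <unistd.h>

  static int seek_to_sid(int fd, unsigned int sid) {
    uint32_t hdr[2];   /* format version, vhost_count (assumed) */
    uint32_t vhost_count, off_be;

    if (pread(fd, hdr, sizeof(hdr), 0) != sizeof(hdr)) {
      return -1;
    }

    /* hdr[0] is the format version; a real implementation would check it. */
    vhost_count = ntohl(hdr[1]);

    /* Remember that vhost SIDs start at 1! */
    if (sid < 1 || sid > vhost_count) {
      return -1;
    }

    /* The per-SID offset table starts right after the header. */
    if (pread(fd, &off_be, sizeof(off_be),
        sizeof(hdr) + ((off_t) (sid - 1) * sizeof(off_be))) != sizeof(off_be)) {
      return -1;
    }

    if (lseek(fd, (off_t) ntohl(off_be), SEEK_SET) == (off_t) -1) {
      return -1;
    }

    return 0;
  }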
Note: have ftpdctl proxy action for dumping out (as JSON) the contents of
the select.dat file?
Note: files/structure:
select.c
select-random.c
select-roundrobin.c
select-shuffle.c
...
policy/
  random/
    {sid}.json
  roundrobin/
    {sid}.json
  shuffle/
    {sid}.json
When writing out the file initially, scan each vhost, figure out its
position, then write out header, then vhost entry.
And, to support the leastConns, equalConns, and leastResponseTime policies,
we'll need a separate table (also mmap'd):
unsigned int idx
unsigned long curr_conns
unsigned long curr_response_ms
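I.e. each row might be (sketch; names follow the list above):

  /* Sketch only; one row per configured backend. */
  struct proxy_conn_stats {
    unsigned int idx;                /* index into the 'configured' list */
    unsigned long curr_conns;        /* current connection count */
    unsigned long curr_response_ms;  /* most recent response time */
  };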
OR, even better, to handle the leastConns/leastResponseTime etc strategies,
use a fourth list:
configured
ranked
live
dead
Where "ranked" is a list of indices (into the configured list) in
preference/ranked order. In the case of least conns, this ranking is done
by number of current connections. Hmm, no, we'd still need to know that
number of current connections somewhere, not just the relative rank of
that server versus others -- consider that the number of connections may
change, changing the relative ranking, and we'd need to know the number of
current connections in order to do the re-ranking.
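So "ranked" is derived data. A sketch of re-ranking by current connection
count, reusing the hypothetical proxy_conn_stats struct from the sketch above
(function names here are also hypothetical):

  #include <stdlib.h>

  static int by_least_conns(const void *a, const void *b) {
    const struct proxy_conn_stats *s1 = a, *s2 = b;

    if (s1->curr_conns < s2->curr_conns) return -1;
    if (s1->curr_conns > s2->curr_conns) return 1;
    return 0;
  }

  /* stats[] has one entry per configured backend; ranked[] receives the
   * backend indices in preference order (fewest current conns first).
   */
  static void rerank_least_conns(struct proxy_conn_stats *stats,
      unsigned int *ranked, unsigned int count) {
    unsigned int i;

    qsort(stats, count, sizeof(*stats), by_least_conns);
    for (i = 0; i < count; i++) {
      ranked[i] = stats[i].idx;
    }
  }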
To make this more general (e.g. for other selection policies), maybe:
int proxy_reverse_select_index_next(main_server->sid, &idx, policy);
Where policy could be:
POLICY_RANDOM
POLICY_ROUND_ROBIN
POLICY_LEAST_CONNS
POLICY_EQUAL_CONNS
POLICY_LOWEST_RESPONSE_TIME
If that backend is chosen/used successfully, then:
int proxy_reverse_select_index_used(main_server->sid, idx, response_ms);
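As a sketch, the policy constants and the two calls might be declared like
this (names follow the notes above; the exact signatures are still open):

  /* Sketch only; the enum values and prototypes follow the notes above. */
  typedef enum {
    POLICY_RANDOM,
    POLICY_ROUND_ROBIN,
    POLICY_LEAST_CONNS,
    POLICY_EQUAL_CONNS,
    POLICY_LOWEST_RESPONSE_TIME
  } proxy_select_policy_e;

  /* Returns (via idx) the index of the next backend to try for this vhost. */
  int proxy_reverse_select_index_next(unsigned int sid, unsigned int *idx,
    proxy_select_policy_e policy);

  /* Records that the backend at idx was used successfully, along with its
   * response time, so stateful policies can update their bookkeeping.
   */
  int proxy_reverse_select_index_used(unsigned int sid, unsigned int idx,
    unsigned long response_ms);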
Note that we would need to maintain an fd on the database until the session
ends (e.g. for leastConns), so that we could decrement the number of
connections for a given SID when that connection ends. This is not necessary
for roundRobin, which doesn't care about the number of current connections
per SID, only the next index for that SID. LeastConns, on the other hand,
DOES care about the number of current connections.