File: NOTES.load-balancing


ProxyReverseConnectPolicy Random RoundRobin LeastConns PerUser PerHost
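For context, an illustrative reverse-proxy configuration fragment using one of
these policies (directive names per the mod_proxy documentation; the backend
addresses are placeholders):

```
# Illustrative fragment only; backend addresses are placeholders.
<IfModule mod_proxy.c>
  ProxyEngine on
  ProxyRole reverse
  ProxyReverseConnectPolicy RoundRobin
  ProxyReverseServers ftp://192.0.2.10:21 ftp://192.0.2.11:21
</IfModule>
```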

  on startup event, iterate through configured backend servers, mark which
  ones cannot be connected to as "dead".
    Means backend server list management, with attributes:
      server:
        connCount
        connectTime
        lastChecked

    Or, better: TWO lists: live list, dead list
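    A minimal sketch of that bookkeeping, assuming linked lists; the type and
    function names here (proxy_backend, mark_dead) are illustrative, not the
    module's actual API:

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical per-server record; field names follow the notes above. */
struct proxy_backend {
  const char *addr;
  unsigned int connCount;     /* current connection count */
  long connectTime;           /* last connect latency, in millisecs */
  time_t lastChecked;         /* when liveness was last probed */
  struct proxy_backend *next;
};

/* Two lists, per the "better" approach above. */
static struct proxy_backend *live_list = NULL;
static struct proxy_backend *dead_list = NULL;

/* Unlink a backend from the live list and push it onto the dead list. */
static void mark_dead(struct proxy_backend **livep,
    struct proxy_backend **deadp, struct proxy_backend *s) {
  struct proxy_backend **iter;

  for (iter = livep; *iter != NULL; iter = &(*iter)->next) {
    if (*iter == s) {
      *iter = s->next;
      s->next = *deadp;
      *deadp = s;
      return;
    }
  }
}
```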

    For tracking connectTime (time to connect, in millisecs):

      /* Requires <sys/time.h> for gettimeofday(2), <unistd.h> for sleep(3),
       * and <stdio.h> for printf(3).
       */
      long timevaldiff(struct timeval *starttime, struct timeval *finishtime) {
        long msec;

        msec = (finishtime->tv_sec - starttime->tv_sec) * 1000;
        msec += (finishtime->tv_usec - starttime->tv_usec) / 1000;
        return msec;
      }

      struct timeval start, finish;
      long msec;

      gettimeofday(&start, NULL);
      sleep(1);
      gettimeofday(&finish, NULL);
      msec = timevaldiff(&start, &finish);

      /* Note: %ld, not %d, since msec is a long. */
      printf("Elapsed time for sleep(1) is: %ld milliseconds.\n", msec);

    Some platforms have a timersub(3) macro for this, but it'll be more
    portable to do our own.

Advanced balancing (selection at USER time):

  user-specific (lookup per user)
    Randomly select N backend servers for user if not already assigned?
 
  host-specific (lookup per HOST)
    Randomly select N backend servers for host if not already assigned?


For stickiness, we have two cases: one where the backend(s) are administratively
assigned, and one where they are randomly chosen/assigned.  Assume the former
is the more common case.

  Could do:
    ConnectPolicy PerUser/PerHost
    Servers ...

    with no specific assignments.  In this case, the username/client IP
    is hashed into an index of one of the servers, and kept that way.  Let's
    start with that as the first iteration.
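    A sketch of that first iteration.  The hash function here is FNV-1a,
    chosen purely for illustration; the notes above leave the actual
    algorithm open (possibly the same one used by the Table API):

```c
/* FNV-1a string hash; an illustrative choice, not necessarily what
 * mod_proxy itself uses.
 */
static unsigned long str_hash(const char *key) {
  unsigned long h = 2166136261UL;

  while (*key) {
    h ^= (unsigned char) *key++;
    h *= 16777619UL;
  }

  return h;
}

/* PerUser/PerHost stickiness with no explicit assignments: hash the
 * username (or client IP string) into a stable index into the configured
 * server list.
 */
static unsigned int backend_index(const char *key, unsigned int nservers) {
  return (unsigned int) (str_hash(key) % nservers);
}
```

    The same key always maps to the same index, so the user/host stays
    "stuck" to one backend for as long as the server list is unchanged.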

  Next:
    ConnectPolicy PerUser/PerHost

    <IfUser>
      ReverseServers ...
    </IfUser>

FAQ: Why no PerClass stickiness?
A: It's the same as PerHost, with the added use of <IfClass>+ReverseServers

FAQ: Why no per-SSL session/ticket/SNI stickiness?
A: It's too late; TLS is hop-by-hop, so the SSL session on the frontend,
with the client, has no relation to the SSL session on the backend.  Thus
any routing to a backend based on the client's SSL session ID comes too
late -- and that client session belongs to the proxy anyway.

We should do load balancing based on TLS protocol version, weak ciphers, etc.,
but that would select from a pool of backend servers, NOT a "sticky" one.

Complex/adaptive balancing:

  Client IP address/class (connect time selection)
  FTP USER (i.e. user affinity to specific server(s)) (USER time selection)
    lookup AND hashed (algo?  Same algo used by Table API?)
  Backend response time (connect time selection)
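  A sketch of connect-time selection combining the last two criteria:
  pick the live backend with the fewest open connections, breaking ties on
  measured connect time.  The struct and function names are illustrative:

```c
#include <stddef.h>

/* Hypothetical backend record; illustrative only. */
struct backend {
  const char *addr;
  unsigned int conn_count;  /* for LeastConns */
  long connect_ms;          /* measured connect time, for response-time balancing */
};

/* Return the backend with the fewest open connections; ties go to the
 * lower measured connect time.
 */
static struct backend *pick_least_conns(struct backend *servers, size_t n) {
  struct backend *best = NULL;
  size_t i;

  for (i = 0; i < n; i++) {
    if (best == NULL ||
        servers[i].conn_count < best->conn_count ||
        (servers[i].conn_count == best->conn_count &&
         servers[i].connect_ms < best->connect_ms)) {
      best = &servers[i];
    }
  }

  return best;
}
```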

Balancing Versus Stickiness

For reverse proxy configurations, there is a choice between "balancing"
strategies and "sticky" strategies when it comes to selecting the backend 
server that will handle the incoming connection.

Which should you use, and why?

All of the "balancing" strategies are able to select the backend server *at
connect time*.  Most of the "sticky" strategies, on the other hand, require
more knowledge about the user (e.g. USER name, HOST name, SSL session ID),
thus backend server selection is delayed until that information is obtained.

Balancing is best when all of your backend servers are identical with regard to
the content they have, AND when it doesn't matter which server handles a
particular client.  Maybe all of your backend servers use a shared filesystem
via NFS or similar, thus directory listings will be the same for a user no
matter which backend server is used, and uploading files to one server means
that those files can be downloaded/seen by the other servers.

Stickiness is best when your backend servers are NOT identical, and some
users/clients should only ever go to a particular set of backend servers.
Thus the user/client needs to be "sticky" to a given backend server (or
servers) -- that's when you need the sticky selection strategies.