EQL Driver: Serial IP Load Balancing HOWTO
Simon "Guru Aleph-Null" Janes, simon@ncm.com
v1.2, April 1, 1995

This is the manual for the EQL device driver. EQL is a software device
that lets you load-balance IP serial links (SLIP or uncompressed PPP)
to increase your bandwidth. It will not reduce your latency (i.e. ping
times) except in the case where you already have lots of traffic on
your link, in which case it will help.

This driver has been tested with the 1.2.1 kernel and should patch
cleanly in future 1.2.x kernels. This driver is expected to be merged
into the 1.3.x kernel very shortly. The eql-1.2.patch file was
generated against the v1.2.2 kernel.

1. Introduction

Which is worse? A huge fee for a 56K leased line or two phone lines?
It's probably the former. If you find yourself craving more bandwidth,
and have an ISP that is flexible, it is now possible to bind modems
together to work as one point-to-point link to increase your
bandwidth. All this without any need of special black box routers.

The eql driver has been tested with the Livingston PortMaster-2e
terminal server and with another Linux box running the eql driver in
the reverse direction. Other terminal servers and routers are expected
to be "proven" with the eql driver very shortly.

ISPs should be more than happy to just charge you for the cost of
using a second port, line and modem for your load balancing
connections. If they are not, find a more flexible and open minded
provider.

2. Kernel Configuration

Here I describe the general steps of getting a kernel up and working
with the eql driver, from patching and building to installing.

2.1. Patching The Kernel

2.1.1. Obtaining the Patches

As of this writing, the eql driver is not yet available in a kernel
source tree. The driver works fine with only the eql_enslave program,
although there are hooks for an eql_emancipate program and some other
configuration programs. I do not know at this time if they will be
implemented.
If not, they will be cut out of the driver to conserve space.

This documentation, driver, sample configs, and enslaving utility can
be FTP'ed from:

ftp://slaughter.ncm.com/pub/Linux/LOAD_BALANCING/eql-1.2.tar.gz

Unpack this archive someplace obvious like /usr/local/lib/. It will
create the following files (more or less):

______________________________________________________________________
-rw-r--r-- guru/ncm      198 Mar 31 23:35 1995 eql-1.2/NO-WARRANTY
-rw-r--r-- guru/ncm    58226 Apr  1 04:15 1995 eql-1.2/eql-1.2.patch
-rw-r--r-- guru/ncm    20540 Apr  1 17:48 1995 eql-1.2/eql-driver.txt
-rwxr-xr-x guru/ncm    16111 Mar 31 23:35 1995 eql-1.2/eql_enslave
-rw-r--r-- guru/ncm     2195 Mar 31 23:35 1995 eql-1.2/eql_enslave.c
______________________________________________________________________

Unpack a fresh kernel where you usually work on your kernels. You may
want to move your "working" kernel sources out of the way in case you
need to revert to a good source tree for other reasons.

Apply the patch by running the commands:

______________________________________________________________________
cd /usr/src
patch -p0 /dev/null if you don't want to get any error messages if the scripts have any problems.

______________________________________________________________________
# Run the ppp-eql-x scripts every minute of every day, etc...
* * * * * /etc/ppp/ppp-eql-1
* * * * * /etc/ppp/ppp-eql-2
______________________________________________________________________

o The PPP-eql Script

You will want to change DEVICE for each line, naturally, and the phone
number too if you aren't dialing into a rotary setup on the remote
side. Note the >/dev/tty8 redirection. That is a status display which
will show errors from the scripts and the dialogue between the modem
and expect script. You will want to change this for each line you
have.
______________________________________________________________________
#!/bin/sh

PATH=$PATH:/usr/etc
LOCKFILE_DIR=/var/spool/locks
DEVICE=cua1
LOGIN_NAME=Pname
PASSWORD=password
PHONE_NO=555-1212
LOCAL_IP=199.199.199.1

cd /etc/ppp

# Do nothing if the serial line is already locked by another process.
if [ -f /var/spool/lock/LCK..$DEVICE ]
then
    # N.B. This will not work if you have a stale
    # lock file in the lock dir.
    exit 0
fi

DIALER_CMD="/etc/ppp/dialout/Zoom-V.34.xp $PHONE_NO \
    $LOGIN_NAME $PASSWORD $DEVICE"

setserial /dev/$DEVICE spd_vhi
stty `cat $DEVICE.stty` /dev/tty8
______________________________________________________________________

o eql-options

The -vj option is probably the most important one. Always keep in mind
that Van Jacobson compression depends on packets arriving in serial
order, which just does not happen when you have more than one path
between two points.

______________________________________________________________________
modem
crtscts
lcp-echo-interval 10
lcp-echo-failure 6
-vj
______________________________________________________________________

o Zoom-V.34.xp

I like to use Expect for modem dial scripts. This one prints status
messages to stderr so I can redirect them and have a handy monitoring
screen to watch the modem dial. If everyone would write up nifty
little Expect scripts, it would be a good thing. The UUCP style
monsters just aren't flexible enough.

______________________________________________________________________
#!/usr/local/bin/expect -f

# Zoom-V.34 Modem Dialer

set DialDesc "Zoom V.34 Modem Dialer"

puts -nonewline stderr "\0330;0r\033\[46;30m\033\[H\033\[J"
puts -nonewline stderr "$DialDesc: "

if { $argc != 4 } {
    puts stderr "usage: DIALER.xp "
    exit
}

set PhoneNumber [lindex $argv 0]
set LoginName [lindex $argv 1]
set Password [lindex $argv 2]
set DeviceName [lindex $argv 3]

puts stderr "Dialing $PhoneNumber on $DeviceName. Login $LoginName."
puts stderr "\033\[2;2H\033\[2;25r\033\[44;33;1m\033\[1;1d"

for {set i 0} {$i < 56} {incr i 1} {
    puts stderr "~"
}

set InitString "AT&C1&D2W1L1S0=0%E2"

# Print a time-stamped status message on stderr.
proc print {string} {
    set CurrentDate [exec /bin/date "+%a %d %r"]
    puts stderr "$CurrentDate - $string"
}

# Timing parameters used by "send -h" (human-like typing).
set send_human { .05 .05 .05 .05 .05 }

# Send the Hayes escape sequence and wait briefly for OK.
proc hayes_escape {} {
    print "+++"
    send "\r+++"
    set timeout 10
    expect {
        "OK\r\n" {}
        timeout {}
    }
}

print "+++/ATH -- hanging up modem"
hayes_escape

print "AT -- sending modem attention"
send -h "AT\r\n"
set timeout 5
expect "OK\r\n" {} timeout { exit 1 }

print "ATZ -- resetting modem"
send -h "ATZ\r"
set timeout 30
expect {
    "OK\r\n" {}
    timeout { exit 1 }
}
print {OK -- modem alive}

print "$InitString -- initialization string"
send -h $InitString
send -h "\r"
set timeout 12
expect "OK\r\n" {} timeout { exit 1 }
print {OK -- modem configured}

print "ATDT$PhoneNumber -- dialing"
send -h "ATDT"
send $PhoneNumber
send "\r"

set timeout 90
expect {*CONNECT*} {
    print "\n-----\n$expect_out(buffer)\n-----\n"
} \
{*BUSY*} {
    print "\n-----\n$expect_out(buffer)\n-----\n"
    exit 1
} \
{*VOICE*} {
    print "\n-----\n$expect_out(buffer)\n-----\n"
    exit 1
} \
{*NO CARRIER*} {
    print "\n-----\n$expect_out(buffer)\n-----\n"
    exit 1
} \
{*NO ANSWER*} {
    print "\n-----\n$expect_out(buffer)\n-----\n"
    exit 1
} \
timeout {
    print "\n-----\n$expect_out(buffer)\n-----\n"
    exit 1
}

set timeout 20
print "$LoginName -- sending login name"
expect {*ogin*} {
    print "\n-----\n$expect_out(buffer)\n-----\n"
    send $LoginName
    send "\r"
    expect {*assword*} {
        print "\n-----\n$expect_out(buffer)\n-----\n"
    } \
    timeout { exit 1 }
    print {... -- sending password}
    send $Password
    send "\r"
} timeout { exit 1 }

exit 0
______________________________________________________________________

o ip-up

This is a very important script: it is what makes your eql
configuration work automatically whenever ppp devices come up after
redialing.
______________________________________________________________________
#!/bin/sh

INTERFACE=$1
DEVICE=$2
SPEED=$3
LOCAL_IP=$4
REMOTE_IP=$5

# Load Balancing Configuration For Client Side
if [ "$LOCAL_IP" = "204.180.7.41" ]
then
    # this deletes the route ppp creates, and allows us to load balance
    # the traffic going directly to that host.
    /sbin/route del 198.67.33.16
    /etc/ppp/eql_enslave eql $INTERFACE $SPEED
    # This re-adds the route in case it's lost for one reason or another.
    /sbin/route add default dev eql
fi
______________________________________________________________________

3.3. eql's eql_enslave Syntax

The syntax for enslaving a device is "eql_enslave ". Here are some
example enslavings:

______________________________________________________________________
eql_enslave eql sl0 28800
eql_enslave eql ppp0 14400
eql_enslave eql sl1 57600
______________________________________________________________________

When you want to free a device from its life of slavery, you can just
bring the device down with ifconfig, and the eql master will
automatically bury the dead slave and remove it from its scheduling
queue.

4. About the Slave Scheduler Algorithm

The slave scheduler probably could be replaced with a dozen other
things and push traffic much faster. The formula in the current setup
of the driver was tuned to handle slaves with wildly different
bits-per-second "priorities".

All testing I have done was with two 28.8 V.FC modems, one connecting
at 28800 bps or slower, and the other connecting at 14400 bps all the
time.

One version of the scheduler was able to push 5.3 K/s through the
28800 and 14400 connections, but when the priorities on the links were
very wide apart (57600 vs. 14400) the "faster" modem received all
traffic and the "slower" modem starved.

5. Tester's Reports

Some people have experimented with the eql device with kernels newer
than 1.1.75.
I have since updated the driver to patch cleanly in newer kernels
because of the removal of the old "slave-balancing" driver config
option. The latest patch was generated against the v1.2.2 kernel.

o _Anarchy_ (aka Alan Cox) reported 117 K/s running eql over two ISDN
  B channels. Would someone from the U.K. explain what "dead funky" is
  supposed to mean?

o icee from LinuxNET patched 1.1.86 without any rejects and was able
  to boot the kernel and enslave a couple of ISDN PPP links.

5.1. Randolph Bentson's Test Report

From bentson@grieg.seaslug.org Wed Feb 8 19:08:09 1995
Date: Tue, 7 Feb 95 22:57 PST
From: Randolph Bentson
To: guru@ncm.com
Subject: EQL driver tests

I have been checking out your eql driver. (Nice work, that!)
Although you may already have done this performance testing, here
are some data I've discovered.

Randolph Bentson
bentson@grieg.seaslug.org

---------------------------------------------------------

A pseudo-device driver, EQL, written by Simon Janes, can be used
to bundle multiple SLIP connections into what appears to be a
single connection. This allows one to improve dial-up network
connectivity gradually, without having to buy expensive DSU/CSU
hardware and services.

I have done some testing of this software, with two goals in
mind: first, to ensure it actually works as described and
second, as a method of exercising my device driver.

The following performance measurements were derived from a set
of SLIP connections run between two Linux systems (1.1.84) using
a 486DX2/66 with a Cyclom-8Ys and a 486SLC/40 with a Cyclom-16Y.
(Ports 0,1,2,3 were used. A later configuration will distribute
port selection across the different Cirrus chips on the boards.)
Once a link was established, I timed a binary ftp transfer of
289284 bytes of data. If there were no overhead (packet headers,
inter-character and inter-packet delays, etc.)
the transfers would take the following times:

bits/sec  seconds
345600    8.3
234600    12.3
172800    16.7
153600    18.8
76800     37.6
57600     50.2
38400     75.3
28800     100.4
19200     150.6
9600      301.3

A single line running at the lower speeds and with large packets
comes to within 2% of this. Performance is limited for the higher
speeds (as predicted by the Cirrus databook) to an aggregate of
about 160 kbits/sec. The next round of testing will distribute
the load across two or more Cirrus chips.

The good news is that one gets nearly the full advantage of the
second, third, and fourth line's bandwidth. (The bad news is
that the connection establishment seemed fragile for the higher
speeds. Once established, the connection seemed robust enough.)

#lines  speed     mtu  seconds   theory  actual   %of
        kbit/sec       duration  speed   speed    max
3       115200    900  _         345600
3       115200    400  18.1      345600  159825   46
2       115200    900  _         230400
2       115200    600  18.1      230400  159825   69
2       115200    400  19.3      230400  149888   65
4       57600     900  _         234600
4       57600     600  _         234600
4       57600     400  _         234600
3       57600     600  20.9      172800  138413   80
3       57600     900  21.2      172800  136455   78
3       115200    600  21.7      345600  133311   38
3       57600     400  22.5      172800  128571   74
4       38400     900  25.2      153600  114795   74
4       38400     600  26.4      153600  109577   71
4       38400     400  27.3      153600  105965   68
2       57600     900  29.1      115200  99410.3  86
1       115200    900  30.7      115200  94229.3  81
2       57600     600  30.2      115200  95789.4  83
3       38400     900  30.3      115200  95473.3  82
3       38400     600  31.2      115200  92719.2  80
1       115200    600  31.3      115200  92423    80
2       57600     400  32.3      115200  89561.6  77
1       115200    400  32.8      115200  88196.3  76
3       38400     400  33.5      115200  86353.4  74
2       38400     900  43.7      76800   66197.7  86
2       38400     600  44        76800   65746.4  85
2       38400     400  47.2      76800   61289    79
4       19200     900  50.8      76800   56945.7  74
4       19200     400  53.2      76800   54376.7  70
4       19200     600  53.7      76800   53870.4  70
1       57600     900  54.6      57600   52982.4  91
1       57600     600  56.2      57600   51474    89
3       19200     900  60.5      57600   47815.5  83
1       57600     400  60.2      57600   48053.8  83
3       19200     600  62        57600   46658.7  81
3       19200     400  64.7      57600   44711.6  77
1       38400     900  79.4      38400   36433.8  94
1       38400     600  82.4      38400   35107.3  91
2       19200     900  84.4      38400   34275.4  89
1       38400     400  86.8      38400   33327.6  86
2       19200     600  87.6      38400   33023.3  85
2       19200     400  91.2      38400   31719.7  82
4       9600      900  94.7      38400   30547.4  79
4       9600      400  106       38400   27290.9  71
4       9600      600  110       38400   26298.5  68
3       9600      900  118       28800   24515.6  85
3       9600      600  120       28800   24107    83
3       9600      400  131       28800   22082.7  76
1       19200     900  155       19200   18663.5  97
1       19200     600  161       19200   17968    93
1       19200     400  170       19200   17016.7  88
2       9600      600  176       19200   16436.6  85
2       9600      900  180       19200   16071.3  83
2       9600      400  181       19200   15982.5  83
1       9600      900  305       9600    9484.72  98
1       9600      600  314       9600    9212.87  95
1       9600      400  332       9600    8713.37  90

5.2. Antony Healey's Report

Date: Mon, 13 Feb 1995 16:17:29 +1100 (EST)
From: Antony Healey
To: Simon Janes
Subject: Re: Load Balancing

Hi Simon,
I've installed your patch and it works great. I have trialed
it over twin SL/IP lines, just over null modems, but I was
able to transfer data at over 48Kb/s [ISDN link -Simon]. I
managed a transfer of up to 7.5 Kbyte/s in one go, but averaged
around 6.4 Kbyte/s, which I think is pretty cool. :)

6. Load Balancing Futures

In the future, this driver may no longer be needed, because of
proposed extensions to PPP called "Multilink PPP". But now, eql is
here, and it's here today. Lock and load! :)
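
A footnote on the figures in Randolph Bentson's report (section 5.1):
the zero-overhead times, the "actual speed" column and the "%of max"
column appear to follow from simple arithmetic on the 289284-byte
transfer, provided each byte is counted as 10 bits on the wire. The 10
bits per byte (8 data bits plus a start and a stop bit) is an
assumption; the report does not state it, but it matches the published
values. The function names below are made up for illustration. A
minimal sketch:

```shell
#!/bin/sh
# Sketch only: reproduces the arithmetic behind the report's tables.
# ASSUMPTION: 10 bits on the wire per data byte (8 data + start + stop).
# This is inferred from the published figures, not stated in the report.
BYTES=289284                  # size of the timed ftp transfer (from the report)
BITS_PER_BYTE=10              # assumed async serial framing
TOTAL_BITS=$(( BYTES * BITS_PER_BYTE ))

# ideal_tenths BPS
#   Zero-overhead transfer time, in tenths of a second (integer, truncated).
ideal_tenths() {
    echo $(( TOTAL_BITS * 10 / $1 ))
}

# actual_bps TENTHS
#   Effective bit rate for a measured duration given in tenths of a second.
actual_bps() {
    echo $(( TOTAL_BITS * 10 / $1 ))
}

# percent_of_max ACTUAL_BPS THEORY_BPS
#   Share of the theoretical aggregate rate actually achieved (truncated).
percent_of_max() {
    echo $(( $1 * 100 / $2 ))
}

# One line at 9600 bits/sec with no overhead.
echo "9600 bps ideal: $(ideal_tenths 9600) tenths of a second"

# Three lines at 115200, mtu 400, measured at 18.1 seconds (181 tenths).
rate=$(actual_bps 181)
echo "3 x 115200: $rate bits/sec, $(percent_of_max "$rate" 345600)% of max"
```

Integer shell arithmetic truncates rather than rounds, which happens
to agree with how the report's figures are printed: "ideal_tenths 9600"
yields 3013 (the 301.3 seconds in the first table), and a measured
18.1 seconds yields 159825 bits/sec, which is 46% of 345600.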