# RabbitMQ Consistent Hash Exchange Type
## What It Does
This plugin adds a consistent-hash exchange type to RabbitMQ.
In various scenarios, you may wish to ensure that messages sent to an
exchange are consistently and equally distributed across a number of
different queues based on the routing key of the message, a nominated
header (see "Routing on a header" below), or a message property (see
"Routing on a message property" below). You could arrange for this to
occur yourself by using a direct or topic exchange, binding queues
to that exchange and then publishing messages to that exchange that
match the various binding keys.
However, arranging things this way can be problematic:
1. It is difficult to ensure that all queues bound to the exchange
will receive a (roughly) equal number of messages without baking into
the publishers quite a lot of knowledge about the number of queues and
their bindings.
2. If the number of queues changes, it is not easy to ensure that the
new topology still distributes messages between the different queues
evenly.
[Consistent Hashing](http://en.wikipedia.org/wiki/Consistent_hashing)
is a hashing technique whereby each bucket appears at multiple points
throughout the hash space, and the bucket selected is the nearest
higher (or lower; it doesn't matter, provided it's consistent) bucket
to the computed hash (and the hash space wraps around). The effect of
this is that when a new bucket is added or an existing bucket is
removed, only a small fraction of hashes change which bucket they are
routed to.
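
The technique described above can be sketched in a few lines of Erlang. This is a minimal illustration, not the plugin's implementation: the module name `ring_sketch`, the use of `erlang:phash2/1` as the hash function, and the way points are derived from `{Bucket, I}` are all assumptions made for the example.

```erlang
-module(ring_sketch).
-export([new/1, add/3, remove/2, lookup/2, demo/0]).

%% A ring is a sorted list of {Point, Bucket} pairs.

%% Build a ring from a list of {Bucket, NumberOfPoints}.
new(Buckets) ->
    lists:foldl(fun({B, N}, Ring) -> add(B, N, Ring) end, [], Buckets).

%% Place Bucket at N points. Points are derived by hashing {Bucket, I},
%% an assumption made here so that the sketch is repeatable.
add(Bucket, N, Ring) ->
    Points = [{erlang:phash2({Bucket, I}), Bucket} || I <- lists:seq(1, N)],
    lists:sort(Points ++ Ring).

%% Drop every point that belongs to Bucket.
remove(Bucket, Ring) ->
    [P || {_, B} = P <- Ring, B =/= Bucket].

%% Select the nearest point at or above the key's hash, wrapping around
%% to the first point when the hash is above every point.
lookup(Key, [{_, First} | _] = Ring) ->
    H = erlang:phash2(Key),
    case lists:dropwhile(fun({P, _}) -> P < H end, Ring) of
        [{_, Bucket} | _] -> Bucket;
        []                -> First
    end.

%% Fraction of 10000 keys whose bucket changes when a fourth bucket joins.
demo() ->
    Keys = lists:seq(1, 10000),
    R3 = new([{a, 100}, {b, 100}, {c, 100}]),
    R4 = add(d, 100, R3),
    Moved = [K || K <- Keys, lookup(K, R3) =/= lookup(K, R4)],
    length(Moved) / length(Keys).
```

In this sketch, every key that moves when `d` is added moves to `d`; keys never shuffle between the existing buckets. With four equally weighted buckets, `ring_sketch:demo()` should return a fraction somewhere around a quarter, though the exact value depends on where the points happen to fall.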
## How It Works
In the case of Consistent Hashing as an exchange type, the hash is
computed from the routing key of each message received. Thus messages
that have the same routing key will have the same hash, and will
therefore be routed to the same queue, assuming no bindings have
changed.
When you bind a queue to a consistent-hash exchange, the binding key
is a number-as-a-string which indicates the number of points in the
hash space at which you wish the queue to appear. The actual points
are generated randomly.
The hashing distributes *routing keys* among queues, not *messages*
among queues; all messages with the same routing key will go to the
same queue. So, if you want queue A to receive twice as many routing
keys as queue B, bind queue A with a binding key of twice the number
(as a string, since binding keys are always strings) used in the
binding to queue B. Note that this holds only if your routing keys are
evenly distributed in the hash space. If, for example, only two
distinct routing keys are used on all the messages, there's a chance
both keys will route (consistently!) to the same queue, even though
other queues have higher values in their binding keys. With a larger
set of routing keys, the statistical distribution of routing keys
approaches the ratios of the binding keys.
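
To make the effect of the binding key concrete, here is a small self-contained simulation. It is a sketch under stated assumptions, not the plugin's code: the module name `weight_sketch`, the use of `rand:uniform/1` to place points, and `erlang:phash2/1` as the hash function are all choices made for illustration.

```erlang
-module(weight_sketch).
-export([bind/3, route/2, tally/2, demo/0]).

-define(SPACE, 134217728). %% 2^27, the range of erlang:phash2/1

%% Bind Queue with a binding key such as <<"20">>. The key is parsed as
%% the number of points, and each point is placed at random.
bind(Queue, BindingKey, Ring) ->
    N = binary_to_integer(BindingKey),
    Points = [{rand:uniform(?SPACE) - 1, Queue} || _ <- lists:seq(1, N)],
    lists:sort(Points ++ Ring).

%% Pick the queue owning the nearest point at or above the key's hash,
%% wrapping around to the first point if there is none.
route(RoutingKey, [{_, First} | _] = Ring) ->
    H = erlang:phash2(RoutingKey),
    case lists:dropwhile(fun({P, _}) -> P < H end, Ring) of
        [{_, Queue} | _] -> Queue;
        []               -> First
    end.

%% Count how many of the given routing keys land on each queue.
tally(RoutingKeys, Ring) ->
    lists:foldl(
      fun(K, Acc) ->
              maps:update_with(route(K, Ring), fun(C) -> C + 1 end, 1, Acc)
      end, #{}, RoutingKeys).

%% Queue a is bound with twice as many points as queue b.
demo() ->
    Ring = bind(b, <<"100">>, bind(a, <<"200">>, [])),
    Keys = [integer_to_binary(I) || I <- lists:seq(1, 50000)],
    tally(Keys, Ring).
```

Calling `weight_sketch:demo()` returns a map of per-queue counts. Because the points are random, the split will differ from run to run, but with many distinct routing keys it should tend towards roughly two keys on `a` for every one on `b`.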
Each message gets delivered to at most one queue. Normally, each
message gets delivered to exactly one queue, but there is a race
between choosing the queue for a message and the deletion or death of
that queue: a message can be sent to a queue which then disappears
before the message is processed. Hence, in general, at most one queue.
The exchange type is "x-consistent-hash".
## Supported RabbitMQ Versions
This plugin supports RabbitMQ 3.3.x and later versions.
## Examples
### Erlang
Here is an example using the Erlang client:
```erlang
%% The module name is arbitrary; it is only here so the file compiles.
-module(consistent_hash_example).
-export([test/0]).

-include_lib("amqp_client/include/amqp_client.hrl").

test() ->
    {ok, Conn} = amqp_connection:start(#amqp_params_network{}),
    {ok, Chan} = amqp_connection:open_channel(Conn),
    Queues = [<<"q0">>, <<"q1">>, <<"q2">>, <<"q3">>],
    %% Declare the consistent-hash exchange and the four queues.
    amqp_channel:call(Chan,
                      #'exchange.declare' {
                        exchange = <<"e">>, type = <<"x-consistent-hash">>
                      }),
    [amqp_channel:call(Chan, #'queue.declare' { queue = Q }) || Q <- Queues],
    %% q0 and q1 get 10 points each in the hash space.
    [amqp_channel:call(Chan, #'queue.bind' { queue = Q,
                                             exchange = <<"e">>,
                                             routing_key = <<"10">> })
     || Q <- [<<"q0">>, <<"q1">>]],
    %% q2 and q3 get 20 points each.
    [amqp_channel:call(Chan, #'queue.bind' { queue = Q,
                                             exchange = <<"e">>,
                                             routing_key = <<"20">> })
     || Q <- [<<"q2">>, <<"q3">>]],
    %% Publish 100,000 empty messages with random routing keys.
    %% rand (OTP 18+) is used in place of the deprecated random module.
    Msg = #amqp_msg { props = #'P_basic'{}, payload = <<>> },
    [amqp_channel:call(Chan,
                       #'basic.publish'{
                         exchange = <<"e">>,
                         routing_key = list_to_binary(
                                         integer_to_list(
                                           rand:uniform(1000000)))
                       }, Msg) || _ <- lists:seq(1, 100000)],
    amqp_connection:close(Conn),
    ok.
```
As you can see, the queues `q0` and `q1` are each bound to the exchange
`e` with 10 points in the hash space, which means they'll each get
roughly the same number of routing keys. The queues `q2` and `q3`,
however, get 20 points each, which means they'll also get roughly the
same number of routing keys as each other, but approximately twice as
many as `q0` and `q1`. We then publish 100,000 messages to our
exchange with random routing keys; each queue gets a share of the
messages roughly proportional to its binding key. After this has
completed, running `rabbitmqctl list_queues` should show that the
messages have been distributed approximately as desired.
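
If you would rather check the result from Erlang than from `rabbitmqctl`, one option is a passive `queue.declare`, whose reply carries the queue's current message count. The following is a sketch: the module name `queue_counts` and the function `counts/2` are made up for this example, and it assumes an open channel and queues that already exist.

```erlang
-module(queue_counts).
-export([counts/2]).

-include_lib("amqp_client/include/amqp_client.hrl").

%% Return [{QueueName, MessageCount}] for the given queues.
%% A passive declare does not create or modify a queue; it only reports
%% on it, and the broker closes the channel if the queue does not exist.
counts(Chan, Queues) ->
    [begin
         #'queue.declare_ok'{message_count = N} =
             amqp_channel:call(Chan, #'queue.declare'{queue = Q,
                                                      passive = true}),
         {Q, N}
     end || Q <- Queues].
```

For the example above, this would be called as `queue_counts:counts(Chan, Queues)` before the connection is closed. The counts reflect messages the broker has already routed, so they may lag slightly behind the publishes.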
Note the `routing_key`s in the bindings are numbers-as-strings. This
is because AMQP specifies the routing_key must be a string.
The more points in the hash space each binding has, the closer the
actual distribution will be to the desired distribution (as indicated
by the ratio of points by binding). However, large numbers of points
(many thousands) will substantially decrease performance of the
exchange type.
Equally, it is important to ensure that the messages being published
to the exchange have a range of different `routing_key`s: if a very
small set of routing keys is being used then there's a possibility of
messages not being evenly distributed between the various queues. If
the routing key is a pseudo-random session ID or similar, then good
results should follow.
## Routing on a header
Under most circumstances the routing key is a good choice for something to
hash. However, in some cases you need to use the routing key for some other
purpose (for example with more complex routing involving exchange to
exchange bindings). In this case you can configure the consistent hash
exchange to route based on a named header instead. To do this, declare the
exchange with a string argument called "hash-header" naming the header to
be used. For example, using the Erlang client as above:
```erlang
amqp_channel:call(
  Chan, #'exchange.declare' {
          exchange = <<"e">>,
          type = <<"x-consistent-hash">>,
          arguments = [{<<"hash-header">>, longstr, <<"hash-me">>}]
        }).
```
If you specify "hash-header" and then publish messages without the named
header, they will all get routed to the same (arbitrarily-chosen) queue.
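
On the publishing side, the header has to be set on each message. A minimal sketch follows, assuming the exchange `e` declared with `"hash-header"` set to `hash-me` as above; the module name `hash_header_publish`, the function `publish/3`, and the placeholder routing key are hypothetical.

```erlang
-module(hash_header_publish).
-export([publish/3]).

-include_lib("amqp_client/include/amqp_client.hrl").

%% Publish Payload to exchange "e" with HashValue (a binary) in the
%% "hash-me" header, so that the header is available for hashing.
publish(Chan, HashValue, Payload) ->
    Props = #'P_basic'{headers = [{<<"hash-me">>, longstr, HashValue}]},
    Msg = #amqp_msg{props = Props, payload = Payload},
    amqp_channel:cast(Chan,
                      #'basic.publish'{exchange = <<"e">>,
                                       %% placeholder; free for other uses
                                       routing_key = <<"any-key">>},
                      Msg).
```

For example, `hash_header_publish:publish(Chan, <<"session-42">>, <<"hello">>)` publishes one message whose `hash-me` header is `session-42`.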
## Routing on a message property
In addition to routing on a header, you can also route on the
`message_id`, `correlation_id`, or `timestamp` message property. To do so,
declare the exchange with a string argument called "hash-property" naming the
property to be used. For example, using the Erlang client as above:
```erlang
amqp_channel:call(
  Chan, #'exchange.declare' {
          exchange = <<"e">>,
          type = <<"x-consistent-hash">>,
          arguments = [{<<"hash-property">>, longstr, <<"message_id">>}]
        }).
```
Note that you cannot declare an exchange that routes on both "hash-header" and
"hash-property". If you specify "hash-property" and then publish messages without
a value in the named property, they will all get routed to the same
(arbitrarily-chosen) queue.
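
Publishers then need to fill in the chosen property on every message. Here is a minimal sketch for an exchange `e` declared with `"hash-property"` set to `message_id`; the module name `hash_property_publish`, the function `publish/3`, and the placeholder routing key are hypothetical.

```erlang
-module(hash_property_publish).
-export([publish/3]).

-include_lib("amqp_client/include/amqp_client.hrl").

%% Publish Payload to exchange "e" with the message_id property set to
%% MessageId (a binary), so that the property is available for hashing.
publish(Chan, MessageId, Payload) ->
    Props = #'P_basic'{message_id = MessageId},
    Msg = #amqp_msg{props = Props, payload = Payload},
    amqp_channel:cast(Chan,
                      #'basic.publish'{exchange = <<"e">>,
                                       %% placeholder; free for other uses
                                       routing_key = <<"any-key">>},
                      Msg).
```

Routing on `correlation_id` would look the same with the `correlation_id` field of `#'P_basic'{}` set instead.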
## Getting Help
Comments and feedback are welcome on the
[RabbitMQ mailing list](https://groups.google.com/forum/#!forum/rabbitmq-users).
## Continuous Integration
[Build status on Travis CI](https://travis-ci.org/rabbitmq/rabbitmq-consistent-hash-exchange)
## Copyright and License
(c) 2013-2015 Pivotal Software Inc.
Released under the Mozilla Public License 1.1, same as RabbitMQ.
See [LICENSE](https://github.com/rabbitmq/rabbitmq-consistent-hash-exchange/blob/master/LICENSE) for
details.