File: LogicalAddresses.txt

Logical addresses
=================

Author:  Bela Ban
JIRA:    https://jira.jboss.org/jira/browse/JGRP-129

The address chosen by each node is essentially the IP address and port of the receiver socket. However, for the
following reasons, this is not good enough:

- Reincarnation: if we use fixed ports (bind_port is set), then a restarted (or shunned) node will have the same
  address. If other nodes in the cluster don't clear their states before the reincarnated node comes up again, we'll
  have issues (see JGRP-130 for details)

- NIC failover: if a NIC goes down, we want to continue sending/receiving on a different NIC

- The sender sends on all available NICs (send_on_all_interfaces="true"). This means that, if we take the source
  address of the received datagram packet to be the identity of the sender, we get N different identities: one for
  each interface the message is sent on

- Network Address Translation: the sender's address might get changed by the NAT

DESIGN:

- A logical address consists of a unique identifier (UUID) and a logical name. The name is passed to JGroups when a
  channel is created (new JChannel(String logical_name, String props)). If logical_name is null, JGroups picks a
  logical name (which is not guaranteed to be unique though). The logical name stays with the channel until the latter
  is destroyed

- A UUID is represented by org.jgroups.util.UUID, which is a subclass of Address and consists of the least and
  most significant bits variables copied from java.util.UUID. All other instance variables are omitted.
  - The isMulticastAddress() method uses 1 bit out of the 128 bits of UUID
    (Maybe we can remove this method and always represent multicast addresses as nulls, so UUIDs would only be used
     to represent non-multicast addresses)

- All UUIDs have a reference to a static table which contains the mappings between logical names and UUIDs
  (classloader issues ?)

- The logical name is used in UUID.toString(), and the least and most significant bits are used for equals() and
  hashCode()

- A UUID is created on channel connect, deleted on channel disconnect and re-created on the next connect.
  Since a new UUID is generated on every connect(), a reconnecting node never reuses an old address, which prevents
  reincarnation issues
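
The design points above can be illustrated with a minimal sketch. It is not the code of org.jgroups.util.UUID; the
class name LogicalAddressSketch and the methods create() and remove() are hypothetical. It only shows the idea that
identity rests on the 128 bits while the logical name, kept in a static table, is used for display.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch (not org.jgroups.util.UUID): identity is the 128 bits, the name is display-only
final class LogicalAddressSketch {
    // Static table mapping addresses to logical names
    private static final Map<LogicalAddressSketch, String> NAMES = new ConcurrentHashMap<>();

    private final long mostSigBits;
    private final long leastSigBits;

    private LogicalAddressSketch(long mostSigBits, long leastSigBits) {
        this.mostSigBits = mostSigBits;
        this.leastSigBits = leastSigBits;
    }

    // Would be called on connect(): a fresh random identity every time
    static LogicalAddressSketch create(String logicalName) {
        UUID u = UUID.randomUUID();
        LogicalAddressSketch addr =
            new LogicalAddressSketch(u.getMostSignificantBits(), u.getLeastSignificantBits());
        if (logicalName != null)
            NAMES.put(addr, logicalName);
        return addr;
    }

    // Would be called on disconnect(): forget the name mapping
    static void remove(LogicalAddressSketch addr) {
        NAMES.remove(addr);
    }

    // equals() and hashCode() use only the least and most significant bits
    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof LogicalAddressSketch)) return false;
        LogicalAddressSketch other = (LogicalAddressSketch) o;
        return mostSigBits == other.mostSigBits && leastSigBits == other.leastSigBits;
    }

    @Override public int hashCode() {
        return Long.hashCode(mostSigBits ^ leastSigBits);
    }

    // toString() prefers the logical name and falls back to the raw UUID
    @Override public String toString() {
        String name = NAMES.get(this);
        return name != null ? name : new UUID(mostSigBits, leastSigBits).toString();
    }
}
```

Two addresses created with the same logical name print the same but are not equal, which is what keeps a
reconnected node from being mistaken for its previous incarnation.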

Transport (TP)
--------------

TP maintains a cache of mappings between UUIDs and physical addresses. Whenever a message is sent, the physical address
of the receiver is looked up from the cache.

A UUID can have more than 1 physical address. We envisage providing a pluggable strategy which picks a physical
address given a UUID. For example, a plugin could load balance between physical addresses.
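
As a rough sketch of what such a pluggable strategy might look like: the interface PhysicalAddressPicker and the
class RoundRobinPicker below are hypothetical names, and java.net.InetSocketAddress stands in for whatever type
ends up representing a physical address.

```java
import java.net.InetSocketAddress;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical plugin interface: choose one physical address out of those known for a UUID
interface PhysicalAddressPicker {
    InetSocketAddress pick(List<InetSocketAddress> candidates);
}

// Example strategy: round-robin load balancing across the candidates
final class RoundRobinPicker implements PhysicalAddressPicker {
    private final AtomicInteger next = new AtomicInteger();

    @Override public InetSocketAddress pick(List<InetSocketAddress> candidates) {
        if (candidates.isEmpty())
            throw new IllegalArgumentException("no physical address known");
        // floorMod keeps the index non-negative even after the counter overflows
        int index = Math.floorMod(next.getAndIncrement(), candidates.size());
        return candidates.get(index);
    }
}
```

A failover strategy could implement the same interface and always return the first candidate that is still
reachable.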

To exchange cache information, we'll use the existing Discovery protocol (see below).

When a physical address for a given UUID is not present, a sender discards (or queues, TBD) the message and broadcasts
a WHO-HAS message. Receivers then broadcast (or unicast) the UUID-physical address mapping.

The discovery phase needs to return logical *and* physical addresses: on reception of a discovery request, we return
our logical address (UUID) and the physical address(es) associated with it, plus the logical name. On reception
of a discovery response, the transport places the physical addresses returned into its cache (if not yet present).
See the scenarios below for details as to why this is needed.


UDP startup
-----------
- The joiner multicasts a discovery request with its UUID, logical name and physical address(es)
- The receivers' transports add this information into their caches
- Each receiver unicasts a discovery response, containing the coordinator's address, plus its own UUID, logical
  name and physical address(es)
- On reception of a discovery response, the transport adds this information to its cache if not yet there


TCPPING:TCP startup
-------------------
- The problem is that TCPPING's initial_hosts lists *physical* (not logical) addresses !
- The joiner sends a discovery request to all physical addresses listed in initial_hosts
- The dest_addr field of a discovery request message is the physical address
- The transport usually expects UUIDs as addresses and finds the associated physical address in the cache
- However, in this case, the transport re-uses the physical address (dest_addr) in the message, bypasses the
  translation UUID --> physical address, and nulls dest_addr in Message
- The destination address is now *null*. This works because Discovery/PING/TCPPING don't check the message's
  dest_addr field ! (Otherwise 'null' would be interpreted as a multicast destination !)
- On the receiver side, the destination is not used for discovery
- The response is sent back to the sender (Message.getSrc()); this is a UUID and will be translated back
  into a physical address
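
The bypass described above might be sketched as follows. DestinationResolverSketch is a hypothetical name,
java.util.UUID stands in for the logical address and java.net.InetSocketAddress for the physical one. Nulling
dest_addr in the Message is not modelled here.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of how a transport could pick the socket address for an outgoing message
final class DestinationResolverSketch {
    private final Map<UUID, InetSocketAddress> cache = new HashMap<>();

    void add(UUID logical, InetSocketAddress physical) {
        cache.put(logical, physical);
    }

    // dest is either a UUID (normal case) or a physical address (discovery request sent to initial_hosts).
    // Returns the address to send to, or null if the UUID is not (yet) in the cache.
    InetSocketAddress resolve(Object dest) {
        if (dest instanceof InetSocketAddress)
            return (InetSocketAddress) dest;      // already physical: bypass the translation
        if (dest instanceof UUID)
            return cache.get(dest);               // normal UUID --> physical address lookup
        throw new IllegalArgumentException("unsupported destination: " + dest);
    }
}
```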


TCPGOSSIP:TCP startup
---------------------
- The joiner asks the GossipRouter for a list of nodes for a given cluster name
- The information returned contains a list of nodes, for each node:
  - The logical address (UUID)
  - The logical name
  - The physical address(es) associated with the UUID
- The joiner's transport adds this information to its cache
- Then each node of the initial membership is sent a discovery request
- The rest is the same as for UDP


TCPGOSSIP:TUNNEL startup
------------------------
- Same as for TCPGOSSIP:TCP, but here we don't really need the physical addresses, because we send every request
  to the GossipRouter anyway (via our TCP connection to it)
- The physical address will simply be ignored


ARP-like functionality
----------------------
- This is handled by Discovery
- We'll add WHO-HAS, I-HAVE and INVALIDATE message handling to Discovery
- The Discovery and transport protocols communicate via events
- When the transport wants to send a message whose destination UUID is not in its cache, it sends a (non-blocking)
  WHO-HAS up the stack which is handled by Discovery. Meanwhile the message is queued (bounded queue).
- Discovery sends a WHO-HAS(UUID) message (multicast if PING, sent to initial_hosts in TCPPING, or sent to the
  GossipRouter if TCPGOSSIP)
- On reception of I-HAVE, Discovery sends the result down the stack via an I-HAVE event
- When the transport receives an I-HAVE event, it updates its local cache and then tries to send the queued messages
- A discovery request also ships the logical name and physical address(es)
- A discovery response also contains these items
- When a discovery request or response is received, the cache is updated if necessary
- When a channel is closed, we send an INVALIDATE(UUID) message around, which removes the UUID from UUID.cache and
  the transport's cache in all cluster nodes
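
A single-threaded sketch of the queue-and-resolve behaviour listed above. All names are hypothetical: the list
whoHasRequests stands in for WHO-HAS events passed up to Discovery, the list sent stands in for the socket, and
payloads are plain strings. Raising WHO-HAS only once per unknown destination and dropping queued messages on
INVALIDATE are assumptions of this sketch, not decisions taken in this document.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical sketch: queue messages for unknown UUIDs until an I-HAVE arrives
final class ArpLikeTransportSketch {
    private final Map<UUID, InetSocketAddress> cache = new HashMap<>();
    private final Map<UUID, Queue<String>> pending = new HashMap<>();
    private final int maxQueued;

    final List<UUID> whoHasRequests = new ArrayList<>(); // stands in for WHO-HAS events sent up the stack
    final List<String> sent = new ArrayList<>();         // stands in for writes to the socket

    ArpLikeTransportSketch(int maxQueued) {
        this.maxQueued = maxQueued;
    }

    // Returns false if the message was dropped because the bounded queue is full
    boolean send(UUID dest, String payload) {
        if (cache.containsKey(dest)) {
            sent.add(payload);                 // physical address known: send right away
            return true;
        }
        Queue<String> queue = pending.get(dest);
        if (queue == null) {
            queue = new ArrayBlockingQueue<String>(maxQueued);
            pending.put(dest, queue);
            whoHasRequests.add(dest);          // assumption: ask only once per unknown destination
        }
        return queue.offer(payload);           // non-blocking; false when the queue is full
    }

    // I-HAVE event from Discovery: update the cache, then flush what was queued (FIFO)
    void onIHave(UUID member, InetSocketAddress physical) {
        cache.put(member, physical);
        Queue<String> queue = pending.remove(member);
        if (queue != null)
            sent.addAll(queue);
    }

    // INVALIDATE(UUID): forget the mapping; dropping queued messages is an assumption
    void onInvalidate(UUID member) {
        cache.remove(member);
        pending.remove(member);
    }
}
```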



Runtime scenarios
-----------------

Startup
-------
- The transport stores the logical name (generated if user didn't pass one to the JChannel() constructor)
- On connect:
  - The UUID is generated (local address) and associated with the logical name (in UUID.cache)
  - The local socket is created and associated with the UUID in the transport's cache
- On disconnect: the local_addr (UUID) is nulled and removed from the transport's cache and UUID.cache
- On close: an INVALIDATE(UUID) message is broadcast so every node can remove UUID from the transport's cache
  and from UUID.cache

Discovery
---------
- Discovery fetches the local_addr (UUID), logical name and physical address(es) from the transport via a
  GET_PHYSICAL_ADDRESS
- A discovery request containing this info is sent out (multicast, unicast via TCP or sent to the GossipRouter)
- The receivers (Discovery protocols) fetch this info from the message and send it down to the transport
  (via a SET_PHYSICAL_ADDRESS), which adds it to its cache and to UUID.cache
- The receivers then fetch their own local information from the transport and send it along with the discovery
  response
- On reception of the discovery response, the requester extracts this info from the message and sends it
  down to its transport, which adds it to its own local cache

Sending of a message with no physical address available for UUID
----------------------------------------------------------------
- The transport queues the message
- The transport sends a GET_PHYSICAL_ADDRESS *event* up the stack
- The Discovery protocol sends out a GET_MBRS_REQ *message* (via multicast, TCP unicast, or to the GossipRouter)
- The receivers fetch their local information from the transport (exception: the GossipRouter has this information
  in its local lookup cache), and return it with a GET_MBRS_RSP message
- On reception of the GET_MBRS_RSP message, a SET_PHYSICAL_ADDRESS event is sent down the stack to the transport
- The transport updates its local cache from the information
- The transport sends all queued messages whose destination UUIDs can now be resolved


On view change
--------------
- The coordinator's Discovery protocol broadcasts an INVALIDATE(UUID) message for each node which left the cluster
- On reception, every receiver sends down a REMOVE_PHYSICAL_ADDRESS(UUID)
- The transport then removes the mapping for the given UUID
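
The view change handling above boils down to a set difference followed by cache removals. A minimal sketch, in
which ViewChangeSketch is a hypothetical name, views are plain lists of java.util.UUID, and the transport cache is
a map to strings:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of the coordinator-side and receiver-side steps on a view change
final class ViewChangeSketch {
    // Coordinator: members present in the old view but missing from the new one
    static List<UUID> leftMembers(List<UUID> oldView, List<UUID> newView) {
        List<UUID> left = new ArrayList<>(oldView);
        left.removeAll(newView);
        return left;
    }

    // Receiver: what the transport would do for each REMOVE_PHYSICAL_ADDRESS(UUID)
    static void invalidate(Map<UUID, String> transportCache, List<UUID> left) {
        for (UUID member : left)
            transportCache.remove(member);
    }
}
```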


IDs instead of UUIDs
--------------------
- Implemented in TP or a separate protocol (ID ?)
- The coordinator dishes out IDs (shorts) for new nodes
- The IDs are always increasing
- Every new ID is broadcast across the cluster, so everyone knows the highest ID
- An ID is associated with a UUID, and UUIDs are also associated with physical addresses (2 tables)
- When we send a message to dest UUID, we look up the ID associated with UUID. If found, we send the ID (a short)
  rather than the UUID
- On reception, if an ID is found, we create an IdAddress, which works on the short field for equals() and hashCode()
- IdAddress.equals() and IdAddress.compareTo() can compare to both IdAddress *and* UUIDs: in the latter case, they
  fetch the ID from the table given the UUID as key and do the comparison. UUIDs can also compare to IdAddresses
- We should probably either add a PhysicalAddress interface, which inherits from Address, and have physical addresses
  implement PhysicalAddress (so we can do an instanceof PhysicalAddress), or have an isPhysicalAddress() method
- ID canonicalization should be configurable: we can enable or disable it
- This could be used by an ID protocol (sitting on top of the transport), which maintains a UUID-ID table and *replaces*
  dest and src addresses for messages coming in and going out
- This protocol would replace a UUID dest with an IdAddress and provide the physical address as well (?)


Shared transport and UUIDs
--------------------------
- With a shared transport, every channel has a local_addr: the transport itself cannot have a local_addr anymore. The
  reason is that a channel could get shunned, leave the cluster and then reconnect. However, because the address
  of the shared transport hasn't changed, the rejoined member has the same address, thus chances of reincarnation are
  higher
- When a channel connects, it creates a new local_addr (UUID) and sends it down via SET_LOCAL_ADDRESS. The transport
  then adds the UUID/physical address mapping to its cache
- When we send a multicast message (dest == null), but don't have a multicast capable transport (e.g.
  UDP.ip_mcast=false, or TCP/TCP_NIO), then we simply send it to all *physical* addresses in the transport's UUID cache.
  If we don't currently have all physical addresses, that's fine because MERGE2 will eventually fetch all physical
  addresses in the cluster
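
The fan-out for a multicast message on a transport without multicast support could look roughly like this.
FanOutSketch is a hypothetical name and the cache is simplified to one physical address per UUID. Collapsing
duplicate physical addresses, so that each one is written to only once, is an assumption of the sketch.

```java
import java.net.InetSocketAddress;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch: targets of a message with dest == null when IP multicast is not available
final class FanOutSketch {
    static Set<InetSocketAddress> targets(Map<UUID, InetSocketAddress> cache) {
        // A set, so that a physical address shared by several UUIDs is only used once (assumption)
        return new LinkedHashSet<>(cache.values());
    }
}
```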


TODOs
-----
- The destination of multicast messages should always be null, so Address.isMulticastAddress() should not be needed
  anymore
- GossipRouter: we need to register logical *and* physical addresses, plus the logical name
- Find all uses of IpAddress and change them to SocketAddress (if possible), or try to use Address rather than
  IpAddress
- UUID: generate a nice name from UUID if the logical name is not defined. Alternative: don't use the UUID to generate
  the logical name, but maybe the host name and a random short
- Util.writeAddress()/writeAddresses(): currently optimized for IpAddress, change to UUID.
  Example: View.writeTo()/readFrom()
- How are we going to associate logical names to channels when a shared transport is used ?
- Marshalled size of UUID: 18 bytes, IpAddress: 9 bytes. Size in memory (without superclass, additional_data, on
  64 bit machine): 16 bytes for UUID, 32 bytes for IpAddress ! So while marshalling takes more space for UUID,
  it is quite compact in memory !
  If it turns out that UUIDs generate too much overhead on the wire (bandwidth), we should think about canonicalizing
  UUIDs to shorts: run an agreement protocol which assigns cluster-wide unique shorts to UUIDs, and send the shorts
  rather than the UUIDs around !
- UDP.initial_hosts also needs to send to physical addresses
- Implement getPhysicalAddress() in all transports
- BasicTCP.handleDownEvent(): view has to be handled in ProtocolAdapter and connections have to be closed there, too
  --> The semantics of retainAll() have to be inspected: do we handle UUIDs or PhysicalAddresses ?
  --> Should we switch to a model where ConnectionTable never reaps connections based on view changes, but based on
      idle time ? I.e. reap a connection that hasn't been used for more than 30 seconds
- TCPPING.initial_hosts="A,B": how will we find out about C, D and E ?


- Shared transport
  - Move setSourceAddress() from TP to JChannel ?
  - Check whether thread names in a shared transport are still correct.
  - Do thread names use the logical or UUID name of a channel ?
  - Is Global.DUMMY still needed when we set the local address top down ?
  - The 'members' variable in TP is a union of all memberships in the shared transport case
    - In BasicTCP, we need to change retainAll() to use physical addresses rather than UUIDs stored in TP.members
    - Ditto for suspected_mbrs
    - In general, change all addresses in TCP to PhysicalAddresses. Or should we use logical addresses ?