File: failover-and-high-availability-with-mariadb-connector-j.creole

package info (click to toggle)
mariadb-connector-java 2.7.6-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, forky, sid, trixie
  • size: 5,564 kB
  • sloc: java: 64,842; xml: 646; sql: 445; makefile: 2
file content (265 lines) | stat: -rw-r--r-- 19,076 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265

//**This guide will teach you:**//

 * The load balancing and high availability concepts in Mariadb java connector
 * the different options

Failover and high availability were introduced in 1.2.0.


= Load balancing and failover distinction
Failover occurs when a  connection to a primary database server fails and  the connector will open up a connection to another database server.\\
For example, server A has the current connection. After a failure (server crash, network down …) the connection will switch to another server (B).

Load balancing allows load (read and write) to be distributed over multiple servers.
\\
= Replication cluster type
In MariaDB (and MySQL) replication, there are 2 different replication roles:
* Master role: Database server that permits read and write operations
* Replica role: Database server that permits only read operations

This document describes configuration and implementation for 3 types of clusters:
* Multi-Master replication cluster. All hosts have a master replication role. (example : Galera)
* Master/replica cluster: one host has the master replication role with multiple hosts in replica replication role.
* Hybrid cluster: multiple hosts in master replication role with multiple hosts in replica role.

= Load balancing implementation
== Random picking
When initializing a connection or after a failed connection, the connector will attempt to connect to a host with a certain role (replica/master).
The connection is selected randomly among the valid hosts. Thereafter, all statements will run on that database server until the connection will be closed (or fails).

The load-balancing will includes a pooling mechanism. 
Example: when creating a pool of 60 connections, each one will use a random host. With 3 master hosts, the pool will have about 20 connections to each host.

== Master/replica distributed load

For a cluster composed of masters and replicas on connection initialization, there will be 2 underlying connections: one with a master host, another with a replica host. Only one connection is used at a time. \\
For a cluster composed of master hosts only, each connection has only one underlying connection. \\
The load will be distributed due to the random distribution of connections..\\

== Master/replica connection selection

It’s the application that has to decide to use master or replica connection (the master connection is set by default).\\
Switching the type of connection is done by using JDBC [[http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#setReadOnly(boolean)|connection.setReadOnly(boolean readOnly)]] method. Setting read-only to true will use the replica connection, false, the master connection.\\

Example in standard java:
{{{
connection = DriverManager.getConnection("jdbc:mariadb:replication://master1,replica1/test");
stmt = connection.createStatement();
stmt.execute("SELECT 1"); // will execute query on the underlying master1 connection
connection.setReadOnly(true);
stmt.execute("SELECT 1"); // will execute query on the underlying replica1 connection
}}}

Some frameworks render this kind of operation easier, as for example Spring [[http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/transaction/annotation/Transactional.html#readOnly--|@transactionnal]] readOnly parameter (since spring 3.0.1).
In this example, setting readOnly to false will call the connection.setReadOnly(false) and therefore use the master connection.
{{{
@Autowired
private EntityManager em;

@Transactional(readOnly = false, propagation = Propagation.REQUIRED)
public void createContacts() {
  Contact contact1 = new Contact();
  contact1.setGender("M");
  contact1.setName("JIM");
  em.persist(contact1);
}
}}}

Generated Spring Data repository objects use the same logic: the find* method will use the replica connection, other use master connection without having to explicitly set that for each method.

On a cluster with master hosts only, the use of connection.setReadOnly(true) does not change the connection, but if the database version is 10.0.0 or higher, the session is set to readOnly if option assureReadOnly is set to true, which means that any write query will throw an exception.

=Failover behaviour
==Basic failover
When no failover/high availability parameter is set, the failover support is basic. Before executing a query, if the connection with the host is discarded, the connection will be reinitialized if parameter “autoReconnect” is set to true.

==Standard failover
When a failover /high availability parameter is set.Check the [[configuration]] section for an overview on how to set the parameters.

There can be multiple fail causes. When a failure occurs many things will be done: 
* The fail host address will be put on a blacklist (shared by JVM). This host will not be used for the amount of time defined by the “loadBalanceBlacklistTimeout” parameter (default to 50 seconds). The only time a blacklisted address can be used is if all host of the same type (master/replica) are blacklisted.
* The connector will check the connection (with the mysql [[https://dev.mysql.com/doc/internals/en/com-ping.html|ping protocol]]). If the connection is back, is not read-only, and is in a transaction, the transaction will be rollbacked (there is no way to know if the last query has been received by the server and executed).
* If the failure relates to a replica connection
  *     If the master connection is still active, the master connection will be used immediately. 
        The query that was read-only will be relaunched and the connector will not throw any exception. 
        A "failover" thread will be launched to attempt to reconnect a replica host.
        (if the query was a prepared query, this query will be re-prepared before execution)
  *     If the master connection is not active, the driver will attempt to create a new master or replica connection with a [[#connection-loop|connection loop]].
        if any connection is found, the query will be relaunched, if not, an SQLException with sqlState like “08XXX” will be thrown.  
* If the failure relates to a master connection, the driver will attempt to create a new master connection with a [[#connection-loop|connection loop]], so the connection object will be immediately reusable.\\
  *       on failure, an SQLException with be thrown with SQLState "08XXX". If using a pool, this connection will be discarded.
  *       on success,    
    *       if possible query will be relaunched without throwing error (if was using a replica connection, or was a SELECT query not in a transaction for example).
    *       if not possible, an SQLException with be thrown with SQLState "25S03".  
* When throwing an SQLException with SQLState "08XXX", the connection will be marked as closed. 
* A “failover” thread will be launched to attempt to reconnect failing connection if connection is not closed.  

It’s up to the application to take measures to handle SQLException. See details in [[#application-concerns|application concerns]].

#Connection loop
When initializing a connection or after a failure, the driver will launch a connection loop the only case when this connection loop will not be executed is when the failure occurred on a replica with an active master.
This connection loop will try to connect to a valid host until finding a new connection or until the number of connections exceed the parameter “retriesAllDown” value (default to 120).

This loop will attempt to connect sequentially to hosts in the following order:

For a master connection : 
* random connect to master host not blacklisted
* random connect to master blacklisted

For a replica connection :
* random connect to replica host not blacklisted
* random connect to master host not blacklisted (if no active master connection)
* random connect to replica blacklisted
* random connect to master host blacklisted (if no active master connection)
The sequence stops as soon as all the underlying needed connections are found. Every time an attempt fails, the host will be blacklisted. 
If after an entire loop a master connection is missing, the connection will be marked as closed. 

=Additional threads

==Failover reconnection threads
A thread pool is created in case of a master/replica cluster, the size is defined according to the number of connection.
After a failure on a replica connection, readonly operations are temporary executed on the master connection. Some “failover threads” will try to reconnect the failed underlying connections.
When a new replica connection is retrieved, this one will be immediately used if connection was still in read-only mode.\\
More details in [[failover_loop.creole|Failover loop threads]].


==Connection validation thread
An additional thread is created when setting the option "validConnectionTimeout".
This thread will very that connections are all active. 
This is normally done by pool that call [[https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#isValid(int)|Connection.isValid()]].
 
=Application concerns
When a failover happen a SQLException with sqlState like "08XXX" or "25S03" may be thrown.

Here are the different connection error codes:

|=Code       |=Condition |
|08000      | connection exception|
|08001      |SQL client unable to establish SQL connection|
|08002      |connection name in use|
|08003      |connection does not exist|
|08004      |SQL server rejected SQL connection|
|08006      |connection failure|
|08007      |transaction resolution unknown|
|25S03      |invalid transaction state-transaction is rolled back|

A connection pool will detect connection error in SQLException (SQLState begin with "08"), and this connection will be discarded from pool.\\

When a failover occur the connector cannot know if the last request has been received by the database server and executed. Applications may have failover design to handle these particular cases: 
If the application was in autoCommit mode (not recommended), the last query may have been executed and committed. The application will have no possibility to know that but the application will be functional.
If not in autoCommit mode, the query has been launched in a transaction that will not be committed. Depending of what caused the exception, the host may have the connection open on his side during a certain amount of time. Take care of [[https://mariadb.com/kb/en/mariadb/set-transaction/|transaction isolation]] level that may lock too much rows.
  


=Configuration

(See [[about-mariadb-connector-j.creole|About MariaDB java connector]] for all connection parameters)
JDBC connection string format is :
{{{
jdbc:(mysql|mariadb):[replication:|failover:|loadbalance:|aurora:]//<hostDescription>[,<hostDescription>...]/[database][?<key1>=<value1>[&<key2>=<value2>]...]
}}}

The standard option "connectTimeout" defined the socket connection timeout. by default, these option is set to 0 (no timeout).\\
Since there are many servers, setting this option to a small amount of time make sense.\\
During the [[#connection-loop|connection loop phase]], the driver will try to connect to server sequentially until the creation of an active connection.
Set this option to a small value (like 2000ms - to be set according to your environment) will permit to reject faulty server quickly.   


==Failover / high availability parameters

Each parameter corresponds to a specific use case:

|=Failover option|=Description|
| **failover** | High availability (random picking connection initialisation) with failover support for master replication cluster (exemple Galera). \\* Since 1.2.0*|
| **sequential** |Failover support for master replication cluster (for example Galera) **without** High availability. \\the host will be connected in the order in which they were declared.\\\\Example when using the jdbc url string "jdbc:mariadb:replication:host1,host2,host3/test" : \\When connecting, the driver will always try first host1, and if not available host2 and following. After a host fail, the driver will reconnect according to this order.\\*Since 1.3.0*|
| **replication** | High availability (random picking connection initialisation) with failover support for master/replica replication cluster (one or multiple master).\\* Since 1.2.0*|
| **aurora** | High availability (random picking connection initialisation) with failover support for Amazon Aurora replication cluster.\\* Since 1.2.0*|


==Failover / high availability options

|=Option|=Description|
|autoReconnect|With basic failover only, if true, will attempt to recreate connection after a failover.\\*Default is false. Since 1.1.7*|
|retriesAllDown|When searching a valid host, maximum number of connection attempts before throwing an exception.\\*Default: 120. Since 1.2.0|
|failoverLoopRetries|When searching silently for a valid host, maximum number of connection attempts.\\This differ from "retriesAllDown" parameter, because this silent search is for example used after a disconnection of a replica connection when using the master connection.\\*Default: 120. Since 1.2.0*|
|validConnectionTimeout|With multiple hosts, after this time in seconds has elapsed it’s verified that the connections haven’t been lost.\\When 0, no verification will be done.\\*Default:120 seconds. Since 1.2.0*|
|loadBalanceBlacklistTimeout|When a connection fails, this host will be blacklisted during the "loadBalanceBlacklistTimeout" amount of time.\\When connecting to a host, the driver will try to connect to a host in the list of not blacklisted hosts and after that only on blacklisted ones if none has been found before that.\\This blacklist is shared inside the classloader.\\*Default: 50 seconds. Since 1.2.0*|
|assureReadOnly|If true, in high availability, and switching to a read-only host, assure that this host is in read-only mode by setting session read-only.\\alias "readOnlyPropagatesToServer" worked to for compatibility\\*Default to false.\\ Since 1.3.0*|


=Specifics for Amazon Aurora

Amazon Aurora is a Master/Replicas cluster composed of one master instance with a maximum of 15 replica instances. Amazon Aurora includes automatic promotion of a replica instance in case of the master instance failing. The MariaDB connector/J implementation for Aurora is specific to handle this automatic failover.\\

To permit development/integration on a single-node cluster, only one host can be defined.  
In this case, the driver behaves as for the configuration **failover**. 
    

==Aurora failover implementation
Aurora failover management steps : 
* Instance A is in write replication mode, instance B and C are in read replication mode.
* Instance A fails.
* Aurora detects A failure, and promote instance B in write mode. Instance C will change his master to use B. 
* Cluster end-point will change to instance B end-point.
* Instance A will recover and be in read replication mode.  

==Aurora configuration

===Aurora endpoints and discovery

Every aurora instance has a specific endpoint, i.e. an URL that identify the host. Those endpoints look like `xxx.yyy.zzz.rds.amazonaws.com`.

There is another endpoint named "cluster endpoint" (format `xxx.cluster-yyy.zzz.rds.amazonaws.com`) which is assigned to the current master instance and will change when a new master is promoted.

In version before 1.5.1, cluster endpoint use was discouraged, since when a failover occur, this cluster endpoint can point for a limited time to a host that isn't the current master anymore. Old recommandation was to list all specific end-points.
This kind of url string will still work, but now, recommended url string has to use only cluster endpoint.

Driver will automatically discover master and replicas of this cluster from current cluster end-point during connection time. This permit to add new replicas to the cluster instance will be discovered without changing driver configuration.

This discovery append at connection time, so if you are using pool framework, check if this framework as a property that controls the maximum lifetime of a connection in the pool, and set a value to avoid infinite lifetime. When this lifetime is reached, pool will discarded the current connection, and create a new one (if needed). New connections will use the new replicas.
(If connections are never discarded, new replicas will begin be used only when a failover occur)

===JDBC connection string

The implementation is activated by specifying the “aurora” failover parameter.
Recommended connection string is using cluster end-point:
{{{
jdbc:(mysql|mariadb):aurora://[clusterInstanceEndPoint[:port]]/[database][?<key1>=<value1>[&<key2>=<value2>]...]
}}}

Before driver version 1.5.1, connection string has to list all end-point:
{{{
jdbc:(mysql|mariadb):aurora://[instanceEndPoint[:port]][,instanceEndPoint[:port]...]/[database][?<key1>=<value1>[&<key2>=<value2>]...]
}}}

If setting many endpoint, the replication role of each instance must not be defined for Aurora, because the role of each instance changes over time. The driver will check the instance role after connection initialisation.

Example of connection string
{{{
    jdbc:mariadb:aurora://cluster.cluster-xxxx.us-east-1.rds.amazonaws.com/db
}}}

Another difference is the option "socketTimeout" that defaults to 10 seconds, meaning that - if not changed - queries exceeding 10 seconds will throw exceptions.  
Note that the option "createDatabaseIfNotExist=true" causes reader instances to be blacklisted during the creation of a new connection, because this option sends a "CREATE DATABASE" statement to the reader which is not read-only and therefore is rejected.

==Aurora connection loop

When searching for the master instance and connect to a replica instance, the connection order will be:
* Every Aurora instance knows the hostname of the current master. If the host has been described using their instance endpoint, that will permit to know the master instance and connect directly to it.  
* If this isn’t the current master (because using IP, or possible after a failover between step 2 and 3), the loop will connect randomly the other not blacklisted instance (minus the current replica instance)
* Connect randomly to a blacklisted instance.

When searching for a replica instance, the loop will connection order will be:
* random not blacklisted instances (excluding the current host if connected)
* random blacklisted instances 
The loop will retry until the connections are found or parameter “retriesAllDown” is exceeded.

==Aurora master verification
Without any query during the time defined by the parameter validConnectionTimeout (default to 120s) and if not set to 0, a verification will be done that the replication role of the underlying connections haven’t changed.

==Aurora connection validation thread
Aurora as a specific [[#connection-validation-thread|connection validation thread]] implementation.
Since role of each instance can change over time, this will validate that connection are active AND role have not changed.