1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145
|
################################################################################
# This test causes to consecutive group updates on recovery in a failure
# situation. First the only donor leaves (first view change) and then as a
# consequence the joiner also leaves as he cannot recover.
#
# Test:
# 0) The test requires two servers.
# 1) Start group replication on server1. Add some data for recovery.
# 2) Set a bad password on server2 so it cannot recover.
# Also set only 2 recovery attempts with a sleep time of 10 seconds
# 3) Start group replication on server2 and wait for it to fail once.
# 4) Make the donor (server1) leave.
# 5) Wait for the joiner (server2) to be in ERROR state.
# 6) Check all is fine.
# 7) Clean up.
################################################################################
--source include/big_test.inc
--source include/have_debug.inc
--source include/have_debug_sync.inc
--source include/have_group_replication_xcom_communication_stack.inc
--let $group_replication_group_name= 6749cab0-93ae-11e5-a837-0800200c9a66
--source include/have_group_replication_plugin.inc
--let $rpl_skip_group_replication_start= 1
--source include/group_replication.inc
--echo #
--echo # Start group replication on server 1
--echo #
--connection server1
--echo server1
--source include/start_and_bootstrap_group_replication.inc
# Add some data for recovery
CREATE TABLE t1 (c1 INT NOT NULL PRIMARY KEY) ENGINE=InnoDB;
--echo #
--echo # Set server 2 options to fail on recovery
--echo #
--connection server2
--echo server2
SET @debug_save_rec_int= @@GLOBAL.group_replication_recovery_reconnect_interval;
SET @debug_save_ret_count= @@GLOBAL.group_replication_recovery_retry_count;
--disable_warnings
CHANGE REPLICATION SOURCE TO SOURCE_PASSWORD='bad_password' FOR CHANNEL 'group_replication_recovery';
--enable_warnings
--eval SET GLOBAL group_replication_recovery_reconnect_interval= 10 # seconds
--eval SET GLOBAL group_replication_recovery_retry_count= 2
--echo #
--echo # Set server 2 making sure 2 views are received when recovery fails
--echo #
set session sql_log_bin=0;
call mtr.add_suppression("There was an error when connecting to the donor*");
call mtr.add_suppression("For details please check performance_schema.replication_connection_status table and error log messages of Replica I/O for channel group_replication_recovery.");
call mtr.add_suppression("Maximum number of retries when*");
call mtr.add_suppression("Fatal error during the incremental recovery process of Group Replication.*");
call mtr.add_suppression("The member is leaving a group without being on one");
call mtr.add_suppression("Skipping leave operation: concurrent attempt to leave the group is on-going.");
call mtr.add_suppression("The server was automatically set into read only mode after an error was detected.");
# on slow runs (valgrind) it can happen because the connection
# of the recovery thread may be slower than 5 seconds and the
# server1 may actually be stopped before the reconnection attempt
# TODO: the sleeps below should be replaced by debug sync points
# For now ensure that we get coverage in most of the runs and
# do not fail on valgrind
call mtr.add_suppression("All donors left. Aborting group replication incremental recovery.*");
set session sql_log_bin=1;
SET @debug_save= @@GLOBAL.DEBUG;
SET @@GLOBAL.DEBUG='d,recovery_thread_wait_before_cleanup';
--eval SET GLOBAL group_replication_group_name= "$group_replication_group_name"
--source include/start_group_replication_command.inc
--echo #
--echo # Give time for server 2 to fail the connection once
--echo # Stop recovery on the donor (server1)
--echo #
# give it time to fail
--sleep 5
--connection server1
--echo server1
--source include/stop_group_replication.inc
#give it time to fail and leave as there are no donors.
--sleep 10
--echo #
--echo # Watch server 2 enter an error state.
--echo #
--connection server2
--echo server2
SET DEBUG_SYNC= "now SIGNAL signal.recovery_end_end";
SET @@GLOBAL.DEBUG= @debug_save;
--let $group_replication_member_state= ERROR
--source include/gr_wait_for_member_state.inc
--echo #
--echo # Check all is fine
--echo #
--connection server1
--echo server1
--source include/start_and_bootstrap_group_replication.inc
--connection server2
--echo server2
--source include/stop_group_replication.inc
--disable_warnings
CHANGE REPLICATION SOURCE TO SOURCE_USER='root', SOURCE_PASSWORD='' FOR CHANNEL 'group_replication_recovery';
--enable_warnings
--source include/start_group_replication.inc
INSERT INTO t1 VALUES (1);
--source include/rpl_sync.inc
--let $assert_text= The table should contain 1 elements
--let $assert_cond= [SELECT COUNT(*) FROM t1] = 1;
--source include/assert.inc
DROP TABLE t1;
--echo #
--echo # Cleanup
--echo #
SET @@GLOBAL.group_replication_recovery_reconnect_interval= @debug_save_rec_int;
SET @@GLOBAL.group_replication_recovery_retry_count= @debug_save_ret_count;
--source include/group_replication_end.inc
|