1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178
|
###############################################################################
#
# group_replication_exit_state_action sysvar specifies which action is taken by
# a server once it has involuntarily left the group. Currently there are only
# two actions: either the server continues running but with super_read_only
# enabled (READ_ONLY) or it aborts (ABORT_SERVER).
#
# This test shall verify that the correct exit state action is executed when
# an error occurs during the catch-up phase of distributed recovery.
#
# Test:
# 0) Setup group of 2 members (M1 and M2).
# 1) Force error during the catch-up phase of M1.
# 1.1) Verify that M1 goes into ERROR state and to super_read_only mode.
# 2) Set group_replication_exit_state_action to ABORT_SERVER on M1.
# 3) Force another error during the catch-up phase of M1.
# 3.1) Verify that M1 aborted.
# 4) Relaunch M1 and join the group.
# 5) Cleanup.
#
################################################################################
--source include/big_test.inc
--source include/have_group_replication_plugin.inc
--echo
--echo #########################################################################
--echo # 0) Setup group of 2 members (M1 and M2) but only start GR on M2.
--echo #########################################################################
--echo
--let $rpl_skip_group_replication_start= 1
--source include/group_replication.inc
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--let $member1_uuid= `SELECT @@GLOBAL.server_uuid`
--let $local_address_server1= `SELECT @@GLOBAL.group_replication_local_address`
--let $group_seeds_server1= `SELECT @@GLOBAL.group_replication_group_seeds`
# Create a table on M1 while it isn't part of the group
SET sql_log_bin = 0;
CREATE TABLE t1 (a INT PRIMARY KEY);
INSERT INTO t1 VALUES (1);
SET sql_log_bin = 1;
# Bootstrap the rest of the group (minus M1)
--let $rpl_connection_name= server2
--source include/rpl_connection.inc
--source include/start_and_bootstrap_group_replication.inc
# Suppress expected errors and warnings
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
SET sql_log_bin = 0;
call mtr.add_suppression("read failed");
call mtr.add_suppression("Replica SQL for channel 'group_replication_recovery': Error 'Table 't1'*");
call mtr.add_suppression("Replica SQL for channel 'group_replication_recovery': Worker [0-9] failed executing transaction*");
call mtr.add_suppression("Replica SQL for channel 'group_replication_recovery': ... The replica coordinator and worker threads are stopped,*");
call mtr.add_suppression("Replica: Table 't1' already exists Error_code:*");
call mtr.add_suppression("Maximum number of retries when trying to connect to a donor reached. Aborting group replication incremental recovery.");
call mtr.add_suppression("Fatal error during the incremental recovery process of Group Replication. The server will leave the group.");
call mtr.add_suppression("The server was automatically set into read only mode after an error was detected.");
call mtr.add_suppression("Skipping leave operation: concurrent attempt to leave the group is on-going.");
call mtr.add_suppression("The plugin encountered a critical error and will abort: Fatal error in the recovery module of Group Replication.");
call mtr.add_suppression("Error while starting the group replication incremental recovery receiver/applier threads");
SET sql_log_bin = 1;
--echo
--echo #########################################################################
--echo # 1) Force error during the catch-up phase of M1.
--echo #########################################################################
--echo
# Create and replicate a table on the group
--let $rpl_connection_name= server2
--source include/rpl_connection.inc
CREATE TABLE t1 (a INT PRIMARY KEY);
# Add M1 to the group
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--let $group_replication_start_member_state= ERROR
--source include/start_group_replication.inc
--echo
--echo #########################################################################
--echo # 1.1) Verify that M1 goes into ERROR state and to super_read_only mode.
--echo #########################################################################
--echo
# Firstly verify that the member entered an error state
--let $group_replication_member_state= ERROR
--let $group_replication_member_id= $member1_uuid
--source include/gr_wait_for_member_state.inc
# Then verify that it enabled super_read_only
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--let $assert_text= super_read_only should be enabled
--let $assert_cond= [SELECT @@GLOBAL.super_read_only] = 1;
--source include/assert.inc
# Lastly, verify that the member is not viewed as part of the group on M2 and
# M3
--let $rpl_connection_name= server2
--source include/rpl_connection.inc
--let $group_replication_number_of_members= 1
--source include/gr_wait_for_number_of_members.inc
--echo
--echo #########################################################################
--echo # 2) Set group_replication_exit_state_action to ABORT_SERVER on M1.
--echo #########################################################################
--echo
# Stop GR
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--source include/stop_group_replication.inc
# Set the exit state action sysvar to ABORT_SERVER
SET @@GLOBAL.group_replication_exit_state_action = ABORT_SERVER;
--echo
--echo #########################################################################
--echo # 3) Force another error during the catch-up phase of M1.
--echo #########################################################################
--echo
# Inform MTR that we are expecting an abort and that it should wait before
# restarting the aborting member
--exec echo "wait" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
# Join the group again
--let $group_replication_start_member_state= ERROR
--source include/start_group_replication.inc
--echo
--echo #########################################################################
--echo # 3.1) Verify that M1 aborted.
--echo #########################################################################
--echo
# For simplicity, let's assume that once the group size is 1, then member 1 has
# abort()'ed
--let $rpl_connection_name= server2
--source include/rpl_connection.inc
--let $group_replication_number_of_members = 1
--source include/gr_wait_for_number_of_members.inc
# Also, the member should not be in the group view of any of the other members
--let $assert_text = Member 1 should have aborted
--let $assert_cond = COUNT(*) = 0 FROM performance_schema.replication_group_members WHERE MEMBER_ID = "$member1_uuid"
--source include/assert.inc
--echo
--echo #########################################################################
--echo # 4) Relaunch M1 and join the group.
--echo #########################################################################
--echo
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--source include/wait_until_disconnected.inc
# Inform MTR that it should restart the aborted member
--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
# Reconnect
--let $rpl_server_number= 1
--source include/rpl_reconnect.inc
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
# Remove conflicting trx so M1 can stay in the group
SET SESSION sql_log_bin= 0;
DROP TABLE t1;
SET SESSION sql_log_bin= 1;
RESET MASTER;
--replace_result $group_seeds_server1 GROUP_SEEDS_SERVER1
--eval SET @@global.group_replication_group_seeds="$group_seeds_server1"
--replace_result $local_address_server1 LOCAL_ADDRESS_SERVER1
--eval SET @@global.group_replication_local_address="$local_address_server1"
--source include/start_group_replication.inc
# Wait for group to stabilize
--let $group_replication_number_of_members = 2
--source include/gr_wait_for_number_of_members.inc
--echo
--echo #########################################################################
--echo # 5) Cleanup.
--echo #########################################################################
--echo
DROP TABLE t1;
--source include/group_replication_end.inc
|