.avif)
.avif)
MySQL 8 Asynchronous Replication Failover
MySQL 8 provides several replication types, but asynchronous replication remains the most popular choice because it is simple and flexible. The downside has always been managing downtime: when a source node fails, fixing it usually requires manual intervention.
In this post, we will look at Asynchronous Connection Failover, a feature in MySQL 8 that helps minimize downtime by automatically switching a replica to a new source when the primary connection drops.
Asynchronous Replication in MySQL 8
This feature allows a replica to automatically establish a replication connection to a new source server from a pre-configured list if the current connection fails.
It isn't just for total failures; the connection also switches if the currently connected source is no longer the highest-weighted priority in the group. The details of these source lists are maintained in the mysql system tables:
- mysql.replication_asynchronous_connection_failover
- mysql.replication_asynchronous_connection_failover_managed
How Asynchronous Replication works
When the current source fails, the replica doesn't switch immediately. First, it tries to reconnect to the same source based on the MASTER_RETRY_COUNT setting. The time between these attempts is controlled by MASTER_CONNECT_RETRY.
Once the replica exhausts these retry attempts, the asynchronous connection failover mechanism activates and picks a new source from the list.
Important Configuration Note:
By default, MASTER_RETRY_COUNT is set to 86400 (24 hours) and MASTER_CONNECT_RETRY is 60 seconds.
If you want the failover to happen quickly during a network outage or server crash, you must reduce the retry count. If you leave it at the default, the replica will keep trying the dead server for a full day before attempting to switch.
Asynchronous Replication Functions
To configure this, MySQL provides specific User Defined Functions (UDFs) to manage the source lists.
- asynchronous_connection_failover_add_source(): Add a specific source server IP and weight to the replication channel.
- asynchronous_connection_failover_delete_source(): Remove a source server from the list.
- asynchronous_connection_failover_add_managed(): Add a managed group (like Group Replication members) to the list.
- asynchronous_connection_failover_delete_managed(): Remove a managed group.
- asynchronous_connection_failover_reset(): Wipe all failover settings.
Configuring Asynchronous Replication
Let's look at a practical example using three instances.
The Architecture:
- Master: 10.50.59.218 (Server ID: 1)
- Hot Standby: 10.53.19.45 (Server ID: 2)
- Backup (Replica): 10.52.162.158 (Server ID: 3)
1. Prerequisites (my.cnf)
Add the following configurations to the my.cnf file on the respective servers. Ensure GTID is enabled on all nodes.
Master :10.50.59.218
[mysqld]
server-id=1
binlog_format=ROW
gtid_mode=ON
enforce_gtid_consistency=ONHotstandby : 10.53.19.45
[mysqld]
server-id=2
binlog_format=ROW
gtid_mode=ON
enforce_gtid_consistency=ONBackup : 10.52.162.158
[mysqld]
server-id=3
binlog_format=ROW
gtid_mode=ON
enforce_gtid_consistency=ON2. User Creation
Create a replication user on the Master (and the Hot Standby, so the replica can connect there later).
CREATE USER 'repl'@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
FLUSH PRIVILEGES;3. Configure Replication on the Backup Node
We will configure the Backup node to replicate from the Master. We must enable SOURCE_CONNECTION_AUTO_FAILOVER=1 and lower the retry count to ensure failover happens quickly.
set global gtid_purged = 'Executed_Gtid_Set of master';
CHANGE MASTER TO master_host='10.50.59.218',master_user='repl',master_password='password',MASTER_AUTO_POSITION=1,MASTER_RETRY_COUNT=30,MASTER_CONNECT_RETRY=1,source_connection_auto_failover=1,SOURCE_SSL=1;4. Populate the Source List
Now, on the Backup node, we add our potential sources using the UDFs. We will give the Master a weight of 100 and the Hot Standby a weight of 80.
mysql> select asynchronous_connection_failover_add_source('', '10.50.59.218', 3306, '', 100);
+--------------------------------------------------------------------------------+
| asynchronous_connection_failover_add_source('', '10.50.59.218', 3306, '', 100) |
+--------------------------------------------------------------------------------+
| The UDF asynchronous_connection_failover_add_source() executed successfully. |
+--------------------------------------------------------------------------------+
mysql> select asynchronous_connection_failover_add_source('','10.53.19.45', 3306, '', 80);
+------------------------------------------------------------------------------+
| asynchronous_connection_failover_add_source('','10.53.19.45', 3306, '', 80) |
+------------------------------------------------------------------------------+
| The UDF asynchronous_connection_failover_add_source() executed successfully. |
+------------------------------------------------------------------------------+Verify the configuration:
mysql> select * from mysql.replication_asynchronous_connection_failover;
+--------------+--------------+------+-------------------+--------+--------------+
| Channel_name | Host | Port | Network_namespace | Weight | Managed_name |
+--------------+--------------+------+-------------------+--------+--------------+
| | 10.50.59.218 | 3306 | | 100 | |
| | 10.53.19.45 | 3306 | | 80 | |
+--------------+--------------+------+-------------------+--------+--------------+Testing the Failover
Now, let's stop the MySQL service on the Master (10.50.59.218) to simulate a crash.
service mysql stopPhase 1: Retrying the Old Connection
Immediately after stopping the master, the Backup node attempts to reconnect to .218. It will do this 30 times (as per MASTER_RETRY_COUNT).
Checking SHOW SLAVE STATUS \G:
mysql> show slave status \G system ip r ;
*************************** 1. row ***************************
Slave_IO_State: Connecting to source
Master_Host: 10.50.59.218
Master_User: repl
Master_Port: 3306
Connect_Retry: 1
Master_Log_File:
Read_Master_Log_Pos: 1042
Relay_Log_File: warehouse-b2b-db-playground-none-8815633-relay-bin.000002
Relay_Log_Pos: 620
Relay_Master_Log_File: 1-mysql-binary.000001
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1042
Relay_Log_Space: 865
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: Yes
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 2003
Last_IO_Error: Error connecting to source 'repl@10.50.59.218:3306'. This was attempt 29/30, with a delay of 1 seconds between attempts. Message: Can't connect to MySQL server on '10.50.59.218:3306' (111)
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: e17bfcf9-ba60-11f0-949d-02010a323bda
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 30
Master_Bind:
Last_IO_Error_Timestamp: 251105 23:25:12
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: e17bfcf9-ba60-11f0-949d-02010a323bda:4
Executed_Gtid_Set: e17bfcf9-ba60-11f0-949d-02010a323bda:1-4
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.00 sec)
default via 10.52.160.1 dev enp1s0
10.52.160.0/20 dev enp1s0 proto kernel scope link src 10.52.162.158
169.254.1.1 dev enp1s0 scope link
mysql>Phase 2: Switching to the Hot Standby
Once the 30 retry attempts are exhausted, the failover mechanism kicks in. It checks the weight list, sees that 10.53.19.45 is the next available source, and connects to it.
Checking SHOW SLAVE STATUS \G again:
mysql> show slave status \G system ip r ;
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 10.53.19.45
Master_User: repl
Master_Port: 3306
Connect_Retry: 1
Master_Log_File: 1-mysql-binary.000001
Read_Master_Log_Pos: 1048
Relay_Log_File: warehouse-b2b-db-playground-none-8815633-relay-bin.000003
Relay_Log_Pos: 475
Relay_Master_Log_File: 1-mysql-binary.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 349
Relay_Log_Space: 1882
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: Yes
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 25
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: 68c00b57-ba61-11f0-9a87-02010a35132d
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 30
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 68c00b57-ba61-11f0-9a87-02010a35132d:1-3,
e17bfcf9-ba60-11f0-949d-02010a323bda:4
Executed_Gtid_Set: 68c00b57-ba61-11f0-9a87-02010a35132d:1-3,
e17bfcf9-ba60-11f0-949d-02010a323bda:1-4
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set, 1 warning (0.00 sec)
default via 10.52.160.1 dev enp1s0
10.52.160.0/20 dev enp1s0 proto kernel scope link src 10.52.162.158
169.254.1.1 dev enp1s0 scope link
mysql>As you can see, the Master_Host automatically updated from .218 to .45 without any manual CHANGE MASTER command.
The asynchronous failover mechanism is a powerful tool for keeping a replica in sync within a multisource environment. It offers a lightweight approach to high availability. However, remember that while it automates the replica's connection, effective HA requires careful configuration of retry timers and weight priorities to match your network stability.
Configuring failover is just one aspect of a healthy database environment. At Mydbops, we provide end-to-end database excellence. Whether you need Remote DBA support, Managed Services, or high-level Consulting, we have you covered.

.avif)
.avif)
.avif)

.avif)
