I have recently migrated to Amazon RDS from MySQL on Azure VM.
This percona article about migrating to RDS is actually very helpful. I have a few things to add though.
My setup is a self-hosted MySQL 5.6 server in VM on Azure and a new Amazon RDS for MySQL 5.6.
I have tried AWS Database Migration Service (DMS). It is reasonably fast. The main problem is it does not look like mature technology at all. I got a “mismatched records” warning on migration of my staging MySQL server which was not very busy. This means I cannot use DMS on production server. Another problem is that I don’t see how well the ongoing replication is doing (e.g. seconds behind master in master-slave setup). There is no binlog position or any other metrics to show whether it is time to stop the replication and change master to RDS. DMS is not viable for me at all.
Let’s get back to some reliable technology. MySQL master-slave replication is battle-tested and gives fewer surprises.
For backup, mysqldump is slower on backup and restoration, and has a performance impact on the server. Of course it could be done on slave and make RDS a slave of the slave (i.e. multi-level), but it makes things more complicated.
I opt for XtraBackup for a physical backup mainly because it is reliable and fast. To compress, the percona article uses
xtrabackup --backup --stream=tar | gzip -c although
pigz should be faster than
xbstream should be faster than
tar because of more parallelism, for example:
xtrabackup --backup --parallel=4 --stream=xbstream --compress --compress-threads=4
--prepare is not needed since RDS import will prepare the backup.
RDS for MySQL does not support replicating using SSL. To avoid communicating directly over the internet, I establish a Site-to-Site (S2S) IPSec VPN across Azure and AWS such that MySQL master and slave could communicate over a secure channel. Here is a tutorial about connecting an AWS VPC to Azure VNet.
After RDS imports and creates the DB, it has to start replicating as a slave. The corresponding master binlog position of the DB can be found either in the stderr of xtrabackup command after backup finishes or in ‘Recent events’ in RDS panel. They are useful for the
mysql_binary_log_file_position in the following statement.
The statements to execute on RDS:
CALL mysql.rds_set_external_master ( "<host_name>", <host_port>, "<replication_user_name>", "<replication_password>", "<mysql_binary_log_file_name>", <mysql_binary_log_file_position>, 0 ); CALL mysql.rds_start_replication;
To verify that the slave catches up with the master, check whether
SHOW SLAVE STATUS on slave is equal to
SHOW MASTER STATUS on master.
When it is almost there, execute these statements on master MySQL:
FLUSH TABLES WITH READ LOCK; SET GLOBAL read_only = ON;
Verify RDS slave master log position again. When it is fine, execute this on RDS MySQL to stop replication:
Then kill connections and shutdown master MySQL.
Notify the application about the change of MySQL host. Optionally, use ProxySQL to make this transparent to application and minimize downtime.
Although the backup task and replication may take time, there was only 2 minutes of downtime for such a huge and error-prone operation. Sure it could be further minimized given better preparation work, but it was good enough for me.