blog
How to Automate Migration from Standalone MySQL to Galera Cluster using Ansible
Database migrations don’t scale well. Typically you need to perform a great deal of tests before you can pull the trigger and switch from old to new. Migrations are usually done manually, as most of the process does not lend itself to automation. But that doesn’t mean there is no room for automation in the migration process. Imagine setting up a number of nodes with new software, provisioning them with data and configuring replication between old and new environments by hand. This takes days. Automation can be very useful when setting up a new environment and provisioning it with data. In this blog post, we will take a look at a very simple migration – from standalone Percona Server 5.7 to a 3-node Percona XtraDB Cluster 5.7. We will use Ansible to accomplish that.
Environment Description
First of all, one important disclaimer – what we are going to show here is only a draft of what you might like to run in production. It does work on our test environment but it may require modifications to make it suitable for your environment. In our tests we used four Ubuntu 16.04 VM’s deployed using Vagrant. One contains standalone Percona Server 5.7, remaining three will be used for Percona XtraDB Cluster nodes. We also use a separate node for running ansible playbooks, although this is not a requirement and the playbook can also be executed from one of the nodes. In addition, SSH connectivity is available between all of the nodes. You have to have connectivity from the host where you run ansible, but having the ability to ssh between nodes is useful (especially between master and new slave – we rely on this in the playbook).
Playbook Structure
Ansible playbooks typically share common structure – you create roles, which can be assigned to different hosts. Each role will contain tasks to be executed on it, templates that will be used, files that will be uploaded, variables which are defined for this particular playbook. In our case, the playbook is very simple.
.
├── inventory
├── playbook.yml
├── roles
│ ├── first_node
│ │ ├── my.cnf.j2
│ │ ├── tasks
│ │ │ └── main.yml
│ │ └── templates
│ │ └── my.cnf.j2
│ ├── galera
│ │ ├── tasks
│ │ │ └── main.yml
│ │ └── templates
│ │ └── my.cnf.j2
│ ├── master
│ │ └── tasks
│ │ └── main.yml
│ └── slave
│ └── tasks
│ └── main.yml
└── vars
└── default.yml
We defined a couple of roles – we have a master role, which is intended to do some sanity checks on the standalone node. There is slave node, which will be executed on one of the Galera nodes to configure it for replication, and set up the asynchronous replication. Then we have a role for all Galera nodes and a role for the first Galera node to bootstrap the cluster from it. For Galera roles, we have a couple of templates that we will use to create my.cnf files. We will also use local .my.cnf to define a username and password. We have a file containing a couple of variables which we may want to customize, just like passwords. Finally we have an inventory file, which defines hosts on which we will run the playbook, we also have the playbook file with information on how exactly things should be executed. Let’s take a look at the individual bits.
Inventory File
This is a very simple file.
[galera]
10.0.0.142
10.0.0.143
10.0.0.144
[first_node]
10.0.0.142
[master]
10.0.0.141
We have three groups, ‘galera’, which contains all Galera nodes, ‘first_node’, which we will use for the bootstrap and finally ‘master’, which contains our standalone Percona Server node.
Playbook.yml
The file playbook.yml contains the general guidelines on how the playbook should be executed.
- hosts: master
gather_facts: yes
become: true
pre_tasks:
- name: Install Python2
raw: test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)
vars_files:
- vars/default.yml
roles:
- { role: master }
As you can see, we start with the standalone node and we apply tasks related to the role ‘master’ (we will discuss this in details further down in this post).
- hosts: first_node
gather_facts: yes
become: true
pre_tasks:
- name: Install Python2
raw: test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)
vars_files:
- vars/default.yml
roles:
- { role: first_node }
- { role: slave }
Second, we go to node defined in ‘first_node’ group and we apply two roles: ‘first_node’ and ‘slave’. The former is intended to deploy a single node PXC cluster, the later will configure it to work as a slave and set up the replication.
- hosts: galera
gather_facts: yes
become: true
pre_tasks:
- name: Install Python2
raw: test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)
vars_files:
- vars/default.yml
roles:
- { role: galera }
Finally, we go through all Galera nodes and apply ‘galera’ role on all of them.
Variables
Before we begin to look into roles, we want to mention default variables that we defined for this playbook.
sst_user: "sstuser"
sst_password: "pa55w0rd"
root_password: "pass"
repl_user: "repl_user"
repl_password: "repl1cati0n"
As we stated, this is a very simple playbook without much options for customization. You can configure users and passwords and this is basically it. One gotcha – please make sure that the standalone node’s root password matches ‘root_password’ here as otherwise the playbook wouldn’t be able to connect there (it can be extended to handle it but we did not cover that).
This file is without much of a value but, as a rule of thumb, you want to encrypt any file which contains credentials. Obviously, this is for the security reasons. Ansible comes with ansible-vault, which can be used to encrypt and decrypt files. We will not cover details here, all you need to know is available in the documentation. In short, you can easily encrypt files using passwords and configure your environment so that the playbooks can be decrypted automatically using password from file or passed by hand.
Roles
In this section we will go over roles that are defined in the playbook, summarizing what they are intended to perform.
Master role
As we stated, this role is intended to run a sanity check on the configuration of the standalone MySQL. It will install required packages like percona-xtrabackup-24. It also creates replication user on the master node. A configuration is reviewed to ensure that the server_id and other replication and binary log-related settings are set. GTID is also enabled as we will rely on it for replication.
First_node role
Here, the first Galera node is installed. Percona repository will be configured, my.cnf will be created from the template. PXC will be installed. We also run some cleanup to remove unneeded users and to create those, which will be required (root user with the password of our choosing, user required for SST). Finally, cluster is bootstrapped using this node. We rely on the empty ‘wsrep_cluster_address’ as a way to initialize the cluster. This is why later we still execute ‘galera’ role on the first node – to swap initial my.cnf with the final one, containing ‘wsrep_cluster_address’ with all the members of the cluster. One thing worth remembering – when you create a root user with password you have to be careful not to get locked off MySQL so that Ansible could execute other steps of the playbook. One way to do that is to provide .my.cnf with correct user and password. Another would be to remember to always set correct login_user and login_password in ‘mysql_user’ module.
Slave role
This role is all about configuring replication between standalone node and the single node PXC cluster. We use xtrabackup to get the data, we also check for executed gtid in xtrabackup_binlog_info to ensure the backup will be restored properly and that replication can be configured. We also perform a bit of the configuration, making sure that the slave node can use GTID replication. There is a couple of gotchas here – it is not possible to run ‘RESET MASTER’ using ‘mysql_replication’ module as of Ansible 2.7.10, it should be possible to do that in 2.8, whenever it will come out. We had to use ‘shell’ module to run MySQL CLI commands. When rebuilding Galera node from external source, you have to remember to re-create any required users (at least user used for SST). Otherwise the remaining nodes will not be able to join the cluster.
Galera role
Finally, this is the role in which we install PXC on remaining two nodes. We run it on all nodes, the initial one will get “production” my.cnf instead of its “bootstrap” version. Remaining two nodes will have PXC installed and they will get SST from the first node in the cluster.
Summary
As you can see, you can easily create a simple, reusable Ansible playbook which can be used for deploying Percona XtraDB Cluster and configuring it to be a slave of standalone MySQL node. To be honest, for migrating a single server, this will probably have no point as doing the same manually will be faster. Still, if you expect you will have to re-execute this process a couple of times, it will definitely make sense to automate it and make it more time efficient. As we stated at the beginning, this is by no means production-ready playbook. It is more of a proof of concept, something you may extend to make it suitable for your environment. You can find archive with the playbook here: https://staging1.severalnines.com/sites/default/files/ansible.tar.gz
We hope you found this blog post interesting and valuable, do not hesitate to share your thoughts.