Thursday, April 13, 2017

Setting up a MySQL Cluster in the Cloud

In my previous post I covered how to install MySQL Cluster on
a Red Hat VM.

In order to run MySQL Cluster in a cloud environment I use 4 VMs.
This is sufficient for a demo or proof of concept of MySQL Cluster.
For a production environment it is likely that one would want at least
6 VMs. It is generally a good idea to use one VM per process.
This has the advantage that the processes can be upgraded one
at a time, in the order required when upgrading MySQL Cluster.

I used one VM to handle the NDB management server. When this
node is down no new nodes can join the cluster, but the nodes
already running will continue to operate as normal. So in a demo
environment it is quite sufficient to use one NDB management server. In a production
environment it is quite likely that one wants at least two
management server VMs.

I used two VMs for data nodes. This is the minimal setup for a
highly available MySQL Cluster. Since MySQL Cluster is designed
for high availability it makes sense to use a replicated setup even in
a demo environment.

I used one VM for the MySQL Server, the MySQL client and the
NDB management client. This is good enough for a demo, but in
a production environment it is likely that one wants at least two
MySQL Server VMs, both for failover handling and for load
balancing, and normally probably even more than two VMs for
MySQL Servers.

The first step is to create these 4 small instances. This is very straightforward
and I used the Red Hat 7.3 Linux OS.
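
If you prefer the command line to the web console, the instances can also be
created with the AWS CLI. This is only a sketch; the AMI id, instance type and
key pair name are placeholders that need to be replaced with your own values:

aws ec2 run-instances --image-id ami-xxxxxxxx --count 4 --instance-type t2.micro --key-name my-key-pair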

Each instance comes with three ways to address it. One is the public
IP address used to SSH into the instance, next there is the public
hostname of the machine and finally there is a private IP address.

The public IP address is used to connect with SSH; it isn't intended for
communication between the VMs inside the cloud. It is possible to use
the public hostnames for this, but it is better to use the private IP
addresses for communication inside the cluster. The reason is that otherwise
the cluster also depends on a DNS service for its availability.
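
On an EC2 instance the private IP address can for example be fetched from the
instance metadata service (assuming the standard EC2 metadata endpoint is
reachable):

curl http://169.254.169.254/latest/meta-data/local-ipv4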

I will call PrivIP_MGM the private IP address of the NDB management
server VM, PrivIP_DN1 the private IP address of the first NDB data node,
PrivIP_DN2 that of the second data node and PrivIP_Server that of the
MySQL Server VM.

On each instance you log in as the user ec2-user, so I simply created a directory
called ndb under /home/ec2-user as the data directory on all VMs.
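
So on each of the VMs:

mkdir /home/ec2-user/ndb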

So the first step towards setting up the cluster is to create a configuration
file for the cluster called config.ini and place this in
/home/ec2-user/ndb/config.ini on the NDB management server VM.

Here is the content of this file:
[ndb_mgmd]
nodeid = 49
hostname=PrivIP_MGM
datadir=/home/ec2-user/ndb

[ndbd default]
NoOfReplicas=2
ServerPort=11860
datadir=/home/ec2-user/ndb

[ndbd]
nodeid=1
hostname=PrivIP_DN1

[ndbd]
nodeid=2
hostname=PrivIP_DN2

[mysqld]
nodeid=53
[mysqld]
nodeid=54
[mysqld]
nodeid=55

We set ServerPort to 11860 to ensure that we always use the same port number
to connect to the NDB data nodes. Otherwise it is hard to set up a
secure environment with firewalls.

A good rule for node ids is to use 1 through 48 for data nodes, 49 through
52 for NDB management servers and 53 to 255 for API nodes and MySQL
Servers. This will work in almost all cases.

In addition we need to create a configuration file for the MySQL Server,
called my.cnf and placed in /home/ec2-user/ndb/my.cnf on the MySQL
Server VM. This file has the following content:

[mysqld]
ndbcluster
datadir=/home/ec2-user/ndb/data
socket=/home/ec2-user/ndb/mysql.sock
log-error=/home/ec2-user/ndb/mysqld.log
pid-file=/home/ec2-user/ndb/mysqld.pid
port=3316

We provide a socket file, a log file, a pid file and a data directory
to house the data of the MySQL Server under the ndb data directory.

The reason I use port 3316 is that I wanted to avoid any problems
with other MySQL Server installations. Oftentimes there is already
a MySQL Server installed for various purposes on a machine, so
I decided to make it easy and use port 3316. This is absolutely not
a necessity, more out of laziness on my part.

Now before we move on to start the cluster there is one important
part missing.

The Linux instances are not going to be able to communicate with
each other unless we set them up for that.

To set up Linux instances to communicate with each other one
uses a concept called Security Group. I created a special security
group that I called NDB Cluster and that opens up the following ports
for TCP traffic: 1186 (NDB management server port), 3306
(MySQL Server port), 3316 (extra MySQL Server port),
8081 (MySQL Cluster Auto-Installer port), 11860
(MySQL Cluster data node port) and 33060 (MySQL X protocol port).
It is important to open up all those ports for both inbound traffic
as well as outbound traffic.
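
As a sketch, assuming the AWS CLI is set up, the inbound rules can also be
added from the command line. The security group id sg-xxxxxxxx and the CIDR
range (the private address range of the VPC) are placeholders here:

for port in 1186 3306 3316 8081 11860 33060; do
  aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx \
    --protocol tcp --port $port --cidr 172.31.0.0/16
done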

Now we are ready to start things up.

First we start the NDB management server in its VM
using the command:
ndb_mgmd -f /home/ec2-user/ndb/config.ini --initial \
  --configdir=/home/ec2-user/ndb \
  --ndb-nodeid=49

Next I started the two data nodes in their VMs using the commands:
ndbd --ndb-connectstring=PrivIP_MGM --ndb-nodeid=1
and
ndbd --ndb-connectstring=PrivIP_MGM --ndb-nodeid=2

Now after a short time we have a working cluster.
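
It is easy to verify this from the NDB management client, for example from the
fourth VM (the show command lists all nodes and whether they are connected
and started):

ndb_mgm --ndb-connectstring=PrivIP_MGM -e show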

So the next step is to start up the MySQL Server.
This is done in two steps. The first step is a bootstrap step that
creates the MySQL data directory and the first user.
This is just a demo, so we will use the simple insecure
method. In a production environment one should use a
proper initialisation that sets passwords.
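
As a side note, a proper initialisation with the same my.cnf would use
--initialize instead of --initialize-insecure (assuming MySQL Cluster 7.5,
which is based on MySQL 5.7). This generates a random password for
root@localhost and writes it to the error log, which is
/home/ec2-user/ndb/mysqld.log in this setup:

mysqld --defaults-file=/home/ec2-user/ndb/my.cnf --initialize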

For the demo, the first command is:
mysqld --defaults-file=/home/ec2-user/ndb/my.cnf --initialize-insecure

This command will bootstrap the server. Next we start the MySQL Server
using the command:

mysqld --defaults-file=/home/ec2-user/ndb/my.cnf --ndb-connectstring=PrivIP_MGM

Now we have a setup with a management server, two data nodes
and a MySQL Server up and running. So now one can connect a MySQL client
to the MySQL Server and perform more experiments or run some other MySQL
application. Obviously, to run a more realistic application it is likely that
a configuration with more memory, more disk space and more CPU resources
is needed.
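
As a quick smoke test one can for example connect from the MySQL Server VM
with the MySQL client and create a first table in the NDB storage engine (the
database and table names here are just examples):

mysql --socket=/home/ec2-user/ndb/mysql.sock -u root

CREATE DATABASE ndb_test;
USE ndb_test;
CREATE TABLE t1 (id INT PRIMARY KEY, data VARCHAR(100)) ENGINE=NDB;
INSERT INTO t1 VALUES (1, 'hello');
SELECT * FROM t1;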
