February 8, 2012

High availability for MySQL on Amazon EC2 – Part 4 – The instance restart script

This post is the fourth of a series that started here.

From the previous post of this series, we now have resources configured, but instead of starting MySQL, Pacemaker invokes a script to start (or restart) the EC2 instance running MySQL. This blog post describes that instance restart script. Remember, I am more of a DBA than a script writer, so it might not be written in the most optimal way.

First, let’s recap what the script has to do (the full script is given below).

  1. Kill the MySQL EC2 instance if running
  2. Make sure the MySQL EC2 instance is stopped
  3. Prepare the user-data script for the new MySQL EC2 instance
  4. Launch the new MySQL instance
  5. Make sure it is running
  6. Reconfigure local heartbeat
  7. Broadcast the new MySQL instance IP to the application servers

Kill the MySQL EC2 instance

In order to kill the existing MySQL EC2 instance, we first have to identify it. This is done by:

OLD_INSTANCE_ID=`ec2-describe-instances -K $PK -C $CERT | /usr/local/bin/filtre_instances.pl | grep $AMI_HA_MYSQL | egrep "running|pending" | tail -n 1 | cut -d'|' -f3`

by filtering on the AMI type of the instance. Since an instance can still be listed in the “stopped” state, it is mandatory to filter for the states “running” or “pending”. Then the instance is terminated with:

ec2-terminate-instances -K $PK -C $CERT $OLD_INSTANCE_ID > /dev/null
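
As a hedged sketch, the selection logic can be wrapped in a small function that also guards against an empty result, so that ec2-terminate-instances is never called without an instance ID. The function name pick_instance_id and the sample lines are hypothetical. The only thing taken from the pipeline above is that filtre_instances.pl emits one '|'-separated line per instance with the instance ID in field 3; the position of the other fields in the sample is made up for illustration.

```shell
# Hypothetical helper: read '|'-separated instance lines on stdin and print the
# ID (field 3) of the last instance of the given AMI that is running or pending.
pick_instance_id() {
	ami="$1"
	grep -F -- "$ami" | grep -E 'running|pending' | tail -n 1 | cut -d'|' -f3
}

# Made-up sample standing in for "ec2-describe-instances | filtre_instances.pl".
SAMPLE='ami-84a74fed|10.0.0.5|i-aaaa1111|terminated
ami-84a74fed|10.0.0.6|i-bbbb2222|running
ami-00000000|10.0.0.7|i-cccc3333|running'

OLD_INSTANCE_ID=$(printf '%s\n' "$SAMPLE" | pick_instance_id ami-84a74fed)

if [ -n "$OLD_INSTANCE_ID" ]; then
	# The real script would call ec2-terminate-instances here.
	echo "would terminate $OLD_INSTANCE_ID"
else
	echo "no running instance, nothing to terminate"
fi
```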

Make sure the MySQL EC2 instance is stopped

Terminating an EC2 instance is not instantaneous. We can confirm that an instance is really stopped by monitoring its status and waiting until it is actually “terminated”. The code below is how the script performs this task.

#wait until the old instance is terminated, it takes a few seconds to stop
done="false"
while [ $done == "false" ]
do
	status=`ec2-describe-instances -K $PK -C $CERT $OLD_INSTANCE_ID | /usr/local/bin/filtre_instances.pl | grep -c terminated`
	if [ "$status" -eq "1" ]; then
		done="true"
	else
		ec2-terminate-instances -K $PK -C $CERT $OLD_INSTANCE_ID > /dev/null
		sleep 5
	fi
done
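
A loop of this kind polls forever if the instance never reaches the expected state. A minimal sketch of a bounded variant is shown here. The names wait_until and is_terminated, the argument order, and the stub check are hypothetical and only for illustration; in the real script the check would be the ec2-describe-instances pipeline.

```shell
# Hypothetical helper: run a check command up to MAX_TRIES times, sleeping
# INTERVAL seconds after each failed attempt. Returns 0 as soon as the check
# succeeds, 1 if it never does.
wait_until() {
	max_tries="$1"
	interval="$2"
	shift 2
	try=1
	while [ "$try" -le "$max_tries" ]; do
		if "$@"; then
			return 0
		fi
		try=$((try + 1))
		sleep "$interval"
	done
	return 1
}

# Stub standing in for:
#   ec2-describe-instances ... $OLD_INSTANCE_ID | filtre_instances.pl | grep -q terminated
# It reports success on the third call.
calls=0
is_terminated() {
	calls=$((calls + 1))
	[ "$calls" -ge 3 ]
}

if wait_until 10 0 is_terminated; then
	echo "instance terminated after $calls checks"
else
	echo "gave up waiting" >&2
fi
```

With a real check, something like wait_until 60 5 would give up after roughly five minutes instead of blocking the cluster resource indefinitely.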

Prepare the user-data script for the new MySQL EC2 instance

The new MySQL instance will be running Heartbeat. Since we can use neither Ethernet broadcast nor multicast, we need to configure the new instance so that it communicates through unicast with its partner node in the cluster, the node on which the restart script is run. This configuration is achieved by providing a user-data script (see the hamysql.user-data below) which completes the Heartbeat configuration of the new instance. The hamysql.user-data script only performs a search and replace operation on the /etc/ha.d/ha.cf file and then starts the Heartbeat service. For this to work properly, we just have to put the IP of the current instance in the script, as shown here:

OUR_IP=`/sbin/ifconfig eth0 | grep 'inet addr' | cut -d':' -f2 | cut -d' ' -f1`
#Now, modify the user-data script, we need to put our IP address in
if [ "$OUR_IP" == "" ]
then
	echo "Error getting Our IP"
else
	perl -pi -e "s/ucast eth0 (\d+)(\.\d+){3}/ucast eth0 $OUR_IP/g" $USER_DATA_SCRIPT
fi
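
When the address lookup fails, the code above only prints a message, and the full script then carries on with whatever IP was already in the user-data file. As a sketch, a stricter check could be used to stop the script instead. is_ipv4 is a hypothetical helper; it only checks the dotted-quad shape, not that each octet is at most 255.

```shell
# Hypothetical helper: succeed only if the argument looks like a dotted-quad
# IPv4 address. Shape check only; octets above 255 are not rejected.
is_ipv4() {
	printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
}

# Example values standing in for the output of the ifconfig pipeline.
for candidate in "10.220.230.18" "" "10.220.230"; do
	if is_ipv4 "$candidate"; then
		echo "ok: '$candidate'"
	else
		# In the restart script this branch would exit 1 instead of continuing.
		echo "rejected: '$candidate'" >&2
	fi
done
```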

Launch the new MySQL instance

Once things are ready, a new MySQL instance can be launched with:

#Now we are ready to start a new one
INSTANCE_INFO=`ec2-run-instances -K $PK -C $CERT $AMI_HA_MYSQL -n 1 -g $HA_SECURITY_GROUP -f $USER_DATA_SCRIPT -t $INSTANCE_TYPE -z $INSTANCE_ZONE -k $INSTANCE_KEY | /usr/local/bin/filtre_instances.pl`

#extract the instance_id of the new instance from the output
NEW_INSTANCE_ID=`echo $INSTANCE_INFO | cut -d'|' -f3`

Out of this operation, we retrieve the new instance “instance_id”.

Make sure it is running

Since we know the “instance_id” of the new instance, checking if it is running is easy:

done="false"
while [ $done == "false" ]
do
	INSTANCE_INFO=`ec2-describe-instances -K $PK -C $CERT  $NEW_INSTANCE_ID | /usr/local/bin/filtre_instances.pl`
	status=`echo $INSTANCE_INFO | grep -c running`
	if [ "$status" -eq "1" ]; then
		done="true"
	else
		sleep 5
	fi
done

Reconfigure local heartbeat

Now Heartbeat, on the monitoring host, must be informed of the IP address of its new partner. To achieve this, a search and replace operation in the local ha.cf file followed by a reload of Heartbeat is sufficient.

#Set the IP in /etc/ha.d/ha.cf and ask heartbeat to reload its config
MYSQL_IP=`ec2-describe-instances -K $PK -C $CERT  $NEW_INSTANCE_ID | /usr/local/bin/filtre_instances.pl | cut -d'|' -f2`
perl -pi -e "s/ucast eth0 (\d+)(\.\d+){3}/ucast eth0 $MYSQL_IP/g" /etc/ha.d/ha.cf
/etc/init.d/heartbeat reload
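
The substitution silently does nothing if ha.cf has no matching "ucast eth0" line, in which case Heartbeat would be reloaded with the old partner address. A hedged sketch of a variant that verifies the result before reloading is given here. set_ucast_peer is a hypothetical helper, it uses sed on a temporary copy instead of perl -pi, it assumes the directive sits alone on its line without leading whitespace, and the demo runs against a throwaway file rather than /etc/ha.d/ha.cf.

```shell
# Hypothetical helper: point the "ucast eth0" directive of a Heartbeat config
# file at a new peer IP, and fail if the file does not end up with that line.
set_ucast_peer() {
	conf="$1"
	peer_ip="$2"
	tmp=$(mktemp) || return 1
	sed -E "s/ucast eth0 [0-9]+(\.[0-9]+){3}/ucast eth0 $peer_ip/" "$conf" > "$tmp" || return 1
	cat "$tmp" > "$conf"   # overwrite in place, keeping the file's owner and mode
	rm -f "$tmp"
	grep -qxF "ucast eth0 $peer_ip" "$conf"
}

# Throwaway config standing in for /etc/ha.d/ha.cf.
DEMO_CONF=$(mktemp)
printf '%s\n' 'keepalive 2' 'ucast eth0 10.220.230.18' > "$DEMO_CONF"

MYSQL_IP=10.0.0.42   # example value
if set_ucast_peer "$DEMO_CONF" "$MYSQL_IP"; then
	RESULT="updated"       # the real script would reload heartbeat here
else
	RESULT="not updated"   # and would skip the reload here
fi
UCAST_LINE=$(grep '^ucast' "$DEMO_CONF")
rm -f "$DEMO_CONF"
echo "$RESULT: $UCAST_LINE"
```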

Broadcast the new MySQL instance IP to the application servers

The final phase is to inform the application servers that the IP of the MySQL instance has changed. The best way to list those application servers is through a security group and, provided the appropriate ssh keys have been exchanged, this code will push the IP update.

TMPFILE=`mktemp`
ec2-describe-instances -K $PK -C $CERT | /usr/local/bin/filtre_instances.pl | grep $CLIENT_SECURITY_GROUP > $TMPFILE

while read line
do
	IP=`echo $line | cut -d'|' -f2`
	#-n keeps ssh from reading stdin, otherwise it would swallow the rest of $TMPFILE
	ssh -n -i /usr/local/bin/update_mysql ubuntu@$IP sudo ./updated_xinetd.sh $MYSQL_IP
done < $TMPFILE

rm $TMPFILE
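
Any command inside a "while read" loop that reads standard input, as ssh does unless told otherwise, can swallow the remaining lines of the list, so that only the first server gets updated. As an alternative sketch, the list can be read on a separate file descriptor. push_ip_to_clients and fake_push are hypothetical names, and the sample lines are made up except for the client IP being in field 2.

```shell
# Hypothetical wrapper: run a command once per client listed in a
# '|'-separated file (client IP in field 2). The list is read on file
# descriptor 3 so the per-client command cannot consume the rest of it.
push_ip_to_clients() {
	list="$1"
	new_ip="$2"
	shift 2
	while IFS='|' read -r first client_ip rest <&3; do
		"$@" "$client_ip" "$new_ip"
	done 3< "$list"
}

# Stub standing in for the ssh call. Like ssh, it drains its standard input.
PUSHED=""
fake_push() {
	cat > /dev/null
	PUSHED="$PUSHED $1"
}

CLIENT_LIST=$(mktemp)
printf '%s\n' 'ami-11111111|10.0.1.10|i-11110000|running' \
	'ami-11111111|10.0.1.11|i-22220000|running' > "$CLIENT_LIST"

# stdin is /dev/null in this demo so the stub's cat returns immediately.
push_ip_to_clients "$CLIENT_LIST" 10.0.0.42 fake_push < /dev/null
rm -f "$CLIENT_LIST"
echo "pushed to:$PUSHED"
```

In the restart script, the stub would be replaced by a small function wrapping the ssh command shown above.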

The full script:

#!/bin/bash
HA_SECURITY_GROUP=testyves
CLIENT_SECURITY_GROUP=hamysql-client
CLIENT_SCRIPT=/usr/local/bin/update_client.sh
AMI_HA_MYSQL=ami-84a74fed
EBS_DATA_VOL=vol-aefawf
USER_DATA_SCRIPT=/usr/local/bin/hamysql.user-data
PK=/usr/local/bin/pk-FNMBRRABFRKVICBDZ4IOOSF7YROYZRZW.pem
CERT=/usr/local/bin/cert-FNMBRRABFRKVICBDZ4IOOSF7YROYZRZW.pem
INSTANCE_TYPE=m1.small
INSTANCE_ZONE=us-east-1c
INSTANCE_KEY=yves-key

OUR_IP=`/sbin/ifconfig eth0 | grep 'inet addr' | cut -d':' -f2 | cut -d' ' -f1`
#Now, modify the user-data script, we need to put our IP address in
if [ "$OUR_IP" == "" ]
then
	echo "Error getting Our IP"
else
	perl -pi -e "s/ucast eth0 (\d+)(\.\d+){3}/ucast eth0 $OUR_IP/g" $USER_DATA_SCRIPT
fi

while [ 1 ]; do

	#First thing to do, terminate the other instance ID
	OLD_INSTANCE_ID=`ec2-describe-instances -K $PK -C $CERT | /usr/local/bin/filtre_instances.pl | grep $AMI_HA_MYSQL | egrep "running|pending" | tail -n 1 | cut -d'|' -f3`

	if [ "$OLD_INSTANCE_ID" == "" ]
	then
		#no running instance
		:
	else
		ec2-terminate-instances -K $PK -C $CERT $OLD_INSTANCE_ID > /dev/null

		#wait until the old instance is terminated, it takes a few seconds to stop
		done="false"
		while [ $done == "false" ]
		do
			status=`ec2-describe-instances -K $PK -C $CERT $OLD_INSTANCE_ID | /usr/local/bin/filtre_instances.pl | grep -c terminated`
			if [ "$status" -eq "1" ]; then
				done="true"
			else
				ec2-terminate-instances -K $PK -C $CERT $OLD_INSTANCE_ID > /dev/null
				sleep 5
			fi
		done
	fi

	#Now we are ready to start a new one
	INSTANCE_INFO=`ec2-run-instances -K $PK -C $CERT $AMI_HA_MYSQL -n 1 -g $HA_SECURITY_GROUP -f $USER_DATA_SCRIPT -t $INSTANCE_TYPE -z $INSTANCE_ZONE -k $INSTANCE_KEY | /usr/local/bin/filtre_instances.pl`

	#wait until the new instance is running, it takes a few seconds to start
	NEW_INSTANCE_ID=`echo $INSTANCE_INFO | cut -d'|' -f3` 

	if [ "$NEW_INSTANCE_ID" == "" ]
	then
		echo "Error creating the new instance"
	else

		done="false"
		while [ $done == "false" ]
		do
			INSTANCE_INFO=`ec2-describe-instances -K $PK -C $CERT $NEW_INSTANCE_ID | /usr/local/bin/filtre_instances.pl`
			status=`echo $INSTANCE_INFO | grep -c running`
			if [ "$status" -eq "1" ]; then
				done="true"
			else
				sleep 5
			fi
		done

		#Set the IP in /etc/ha.d/ha.cf and ask heartbeat to reload its config
		MYSQL_IP=`ec2-describe-instances -K $PK -C $CERT  $NEW_INSTANCE_ID | /usr/local/bin/filtre_instances.pl | cut -d'|' -f2`
		perl -pi -e "s/ucast eth0 (\d+)(\.\d+){3}/ucast eth0 $MYSQL_IP/g" /etc/ha.d/ha.cf

		TMPFILE=`mktemp`
		ec2-describe-instances -K $PK -C $CERT | /usr/local/bin/filtre_instances.pl | grep $CLIENT_SECURITY_GROUP > $TMPFILE

		while read line
		do
			IP=`echo $line | cut -d'|' -f2`
			#-n keeps ssh from reading stdin, otherwise it would swallow the rest of $TMPFILE
			ssh -n -i /usr/local/bin/update_mysql ubuntu@$IP sudo ./updated_xinetd.sh $MYSQL_IP
		done < $TMPFILE

		rm $TMPFILE

		/etc/init.d/heartbeat reload
	fi

        sleep 300 # 5 min before attempting again. Normally heartbeat should kill the script before
done

The hamysql.user-data script:

The script sets the IP of the monitor host in the Heartbeat ha.cf configuration file and then finishes up some missing configuration settings of the AMI.

#!/bin/bash
sudo hostname hamysql
sudo perl -pi -e "s/ucast eth0 (\d+)(\.\d+){3}/ucast eth0 10.220.230.18/g" /etc/ha.d/ha.cf

# to eventually be added to the ebs image
sudo perl -pi -e 's/bind-address/#bind-address/g' /etc/mysql/my.cnf
sudo service mysql restart
sleep 5
/usr/bin/mysql -u root -proot -e "grant all on *.* to root@'%' identified by 'root'"
sudo /etc/init.d/heartbeat start

About Yves Trudeau

Yves is a Principal Consultant at Percona, specializing in technologies such as MySQL Cluster and DRBD. He was previously a senior consultant for MySQL and Sun Microsystems. He holds a Ph.D. in Experimental Physics.

Comments

  1. Yves Trudeau says:

    If you look in the Pacemaker configuration here:
    http://www.mysqlperformanceblog.com/2010/07/12/high-availability-for-mysql-on-amazon-ec2-%E2%80%93-part-3-%E2%80%93-configuring-the-ha-resources/

    You’ll see that you can define the path. The current config points to /usr/local/bin/mysql.

  2. Luke says:

    where do the scripts go?

  3. Colin says:

    Any idea when you’ll put up Part 4?

  4. Yves Trudeau says:

    Hi Colin,
    I am guessing you are talking about part 5. The draft is partly written, I have been very busy recently but I’ll find some time to complete the series.

  5. Colin says:

    Yeah, as soon as I hit “Submit Comment” I realized I hit the wrong number! :-/

    Thanks, these are really interesting and I’m really curious about your instance monitoring script. :)

  6. Rickm says:

    I found these series of notes very interesting and waiting to see your next part hopefully soon.
    Thanks a lot.
