Q&A: Putting MySQL Fabric to use

Martin Arrieta and I gave an online presentation last week on “Putting MySQL Fabric To Use.” If you missed it, you can find a recording and the slides here, and the vagrant environment we used plus a transcript of the commands we ran here (be sure to check out the ‘sharding’ branch, as that’s what we used during the webinar).

Thank you all for attending and asking interesting questions. We were unable to answer all of them in the scheduled time, so here are our replies to all the questions.

What is GTID? And how does it relate to MySQL Fabric?
GTID stands for Global Transaction Identifier, and MySQL Fabric requires MySQL replication to be configured to used this in order to work. This means that it won’t work with MySQL versions previous to 5.6. You can find more information in the manual, and in the upcoming Percona webinar, “Using MySQL Global Transaction IDs in Production,” that our colleague Stephane Combaudon will present on August 27.

During any issue, does MySQL Fabric internally do a switch over or failover?
MySQL Fabric will only detect failures and initiate failovers for a group that has been activated. Otherwise, any changes must be made manually through the mysqlfabric command.

For an alter table, how do you avoid replication lag?
We think using pt-online-schema for the alter table is a good way to avoid or minimize this problem.

Are there benchmarks available for MySQL Fabric?
We’re not aware of any, but once we wrap up our current series of blog posts on the topic, we plan to do some benchmarks ourselves. Our interest lies in the overhead the connectors can place vs a standard connector accessing a single MySQL Server, but if you have other ideas, let us know in the comments and we’ll consider adding them to our list.

Can MySQL Fabric be used to handle Big Data?
We think the HA features are usable regardless of data set size, with the caveat that MySQL Fabric does not (currently) handle provisioning the data of the managed instances. It just configures and points replication as required, but you’re responsible for making sure all servers have the same set of data.

We feel that the sharding features, however, may not be a good fit for working with very large data sets yet, specially because of the use of mysqldump/mysql on the shard split/move operations. However, since sharding usually goes hand in hand with big data (a data set size too large is one of the reasons to shard after all) we’re sure this will get improved in the future. Or, someone with some python skills can adjust the scripts to use a more efficient mechanism to move data from one server to another, like Percona XtraBackup.

Does sharding require many changes to application coding (database code on shards etc) at the MySQL Fabric level? Should we sit with the developers/architects to help them understand tables, their usage, and to plan sharding?
Sharding with MySQL Fabric is not transparent (and we’d say sharding, in general, is typically not transparent) so some application coding and DBAs/Devs interaction will be needed. But you should be working close to your developers and architects (or DBAs and sysadmins, depending on which side of the counter you stand) anyway 🙂

Would you say that MySQL Fabric is a good HA replacement removing the need for VIPs?
We don’t see them as mutually exclusive. Simplifying, the HA features of MySQL Fabric are just another way to manage replication between a set of MySQL servers. It doesn’t force any way to make writable and readable nodes available to clients. With some coding, you can use MySQL Fabric together with VIPs, or dynamic DNS entries, or a load balancer like HA Proxy.

Does global group mean fabric node?
If you did not attend the webinar, we’d recommend you take a look at slide 7 to get the required context for this question.

Global group means a group that has the schema for all tables, all the data for global tables, and no data for sharded tables. Since it’s a group, it can be one or more nodes, but at any given time, there will only be one PRIMARY node in the group, and the PRIMARY nodes on all shard groups will be replication slaves of this node.

Is the config file at the application side or on the MySQL Fabric node?
There is some configuration at both ends. The MySQL Fabric node (the ‘store’ server in our Vagrant setup) has some configuration, but applications need their own too. The separation is rather clean though. Typically, the application only needs to know how to reach the MySQL Fabric server (host, port, credentials), while the MySQL Fabric node needs to know how to connect to the nodes it manages, how long to wait before starting a failover, etc. The configuration on MySQL Fabric’s side is done by the fabric.cfg file, while on the application side, it depends on the connector used.

When setting a faulty to secondary, does it also automatically come back in replication as a slave?
Yes, though remember you’ll first have to set it to SPARE 🙂

When we set primary and secondary and it sets the replication, it never copies data and builds replication? It only starts replication from that side and we will lose all the data which is not available on secondary
Yes to the first question. As to the second, we reiterate the fact that MySQL Fabric does not handle provisioning data for instances. When you’re using it to manage nodes, it (currently) assumes the nodes all have a copy of the data. It is your responsibility to make sure that is the case, though this may change in the future. To give you a specific example, suppose you run the following commands:

mysqlfabric group create ha mysqlfabric group add ha node1 mysqlfabric group add ha node2
Then, on node1’s MySQL’s CLI:
create database test;
and finally run:
mysqlfabric group promote ha --slave_id=<uuid_for_node2>
You’ll end up with node2 being PRIMARY, node1 being SECONDARY, without the ‘test’ database on the PRIMARY node.

Are there any limitations around adding shards? Is it possible to re-configure the sharding keys to handle the additional shard(s)?
We don’t know about any limitations, and yes, it is possible to add more shards and configure the mapping. We did not have enough time to do this at the webinar, but the transcript on the repo includes examples for splitting and moving shards. What you want in this case is to add a new group (can be a single node) and then split a shard (optionally providing a pivot value), keeping one part on the original group and the rest on the new one. Fernando encountered one problem when doing this on a busy server, though the bug is not yet verified so that may have been a user error on his side, but we’d recommend you do similar tests yourself before attempting this in production.

Is the addition of Connectors at the MySQL Fabric level or even at application side? Is an application code change a huge effort?
It’s at the application side. It’s difficult to estimate the effort for the code changes, but if your database access is reasonably modularized/abstracted from your application logic, it shouldn’t be huge. In fact, for HA, if you don’t want to split read and writes, you can migrate an application to a MySQL Fabric-aware connector with just a couple of lines changed (changing the connection string and then requesting a MODE_READWRITE connection always).

What is the minimum server requirements to setup mysqlfabric and does it use any multithreading if i have more than one core for each node. Can we get the transcripts with examples..
We don’t know of the official minimum requirements, but it is a python script which in our experience has been lightweight. Remember it does *not* act as a middle man/proxy between your app and the servers, it is only involved when a client establishes a new connection or when a change in status happens in a server. As for multithreading, we know it’s concurrent (i.e. multiple clients can establish connections at the same time) but we don’t know if it’s parallel.

The transcript with examples can be found here.

How does MySQL Fabric handle latency issues? Does latency increase with mysql traffic?
MySQL Fabric uses regular MySQL connections so latency will affect it the same as any other app using MySQL, and yes, it will increase with traffic (but not due to Fabric).

What if MySQL Fabric is hung? Will connections going to the primary stop? And how can we deduce if there is an issue at the fabric side?
If MySQL Fabric is hung, the following scenarios will be problematic:

New connections come in. Clients will try to connect to Fabric before trying to connect any hosts (they won’t know what hosts to connect to otherwise!) so if MySQL Fabric is down, no new connections can be established.
A failure happens in the managed servers. MySQL Fabric is responsible for monitoring for failures and taking the appropriate actions (including promoting another server, repointing replicas, etc), so if something fails while MySQL Fabric is down, there will be nobody to take action.

Are there any scenarios where a data loss can happen when promoting/demoting nodes (or nodes failure happens) in a production environment?
Given the use of gtid-enforce-consistency and the fact that MySQL Fabric won’t promote a node to PRIMARY until it has caught up with any pending changes, we feel this is unlikely, but we’re planning a blog post specifically focused on evaluating the potential of data loss during failover.

How to configure MySQL Fabric to use synchronous / semi-synchronous mechanisms?
We have not attempted this, but from this manual page, we think currently only async replication is supported. I would expect sync to be in the roadmap though. If this is something you’d like to see, another section on the same page has some suggestions on how to make it happen.

Thanks again for attending the webinar, and feel free to post any further questions in the comments section.

12 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Moody

9 years ago

Nice Q&A .. will any tests be made integrating PXC with Fabric? i.e replacing the replication topologies with xtradb cluster? I’d imagine it would give PXC the ability for write scaling and auto sharding..

Fernando Ipar

Author

9 years ago

@Moody:

I think trying to integrate Fabric’s HA parts with PXC would not add any benefit (that’s my personal opinion), and it would be complicated anyway, since PXC relies on Galera and not on MySQL replication.

Sharding could work though, I don’t see a reason not to have each shard group be a PXC, with Fabric setting up async M/S replication between the global PXC and the shards. I would only use Fabric for shard routing though, and I’d never promote or activate any of the groups (as again, the HA features wouldn’t work with Galera anyway).

Finally, to answer your specific question, I don’t have any short term plans to test this kind of integration, but of course that does not mean it may not happen if the needs comes up while working on a case 🙂

Longxing

9 years ago

Hi Fernando,
I have a question about the fault-tolerant ability of fabric server, would you please do me a favor.

I used to read a article about fabric which says:
(http://www.clusterdb.com/mysql-fabric/mysql-fabric-adding-high-availability-and-scaling-to-mysql)

The MySQL Fabric process itself is not fault-tolerant and must be restarted in the event of it failing.
Note that this does not represent a single-point-of-failure for the server farm (HA and/or sharding) as
the connectors are able to continue routing operations using their local caches while the MySQL
Fabric process is unavailable.

So I do some tests about the fault-tolerant ability. I stop the fabirc server and do some queries
to the connector. But a error happens. The error message is as follows:

[root@JSSIOEtest01 python]# python test_ddl.py
Traceback (most recent call last):
File “test_ddl.py”, line 7, in
autocommit=True
File “/usr/lib/python2.6/site-packages/mysql/connector/__init__.py”, line 144, in connect
return mysql.connector.fabric.connect(*args, **kwargs)
File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/__init__.py”, line 53, in connect
return MySQLFabricConnection(**kwargs)
File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 809, in __init__
self.store_config(**kwargs)
File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 948, in store_config
self._configure_fabric(config[‘fabric’])
File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 929, in _configure_fabric
self._fabric.seed()
File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 332, in seed
fabinst.connect()
File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 763, in connect
self._proxy = self._xmlrpc_get_proxy()
File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 747, in _xmlrpc_get_proxy
proxy._some_nonexisting_method() # pylint: disable=W0212
File “/usr/lib64/python2.6/xmlrpclib.py”, line 1199, in __call__
return self.__send(self.__name, args)
File “/usr/lib64/python2.6/xmlrpclib.py”, line 1489, in __request
verbose=self.__verbose
File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 272, in request
raise InterfaceError(“Connection with Fabric failed: ” + msg)
mysql.connector.errors.InterfaceError: Connection with Fabric failed:

It seems that connector need to connect fabric server first rather than using the local cache.
Am I right? Or something else?

Regards
Longxing

Martin Arrieta

9 years ago

@Longxing,

Correct, the first time that you use the connector it will need to read the server information from the store server, however after that it will use the local cache. Try the following example:

1- Start your script, open the connector and keep running queries.
2- Stop the store server
3- Check if the status of your script

You will see that your script will keep running without problems using the local cache even with the store server down.

Regards,

Martin
@MartinArrietaC

Shen Longxing

Reply to Martin Arrieta

9 years ago

@martin,

Grateful for your reply, it does helpful to me.

And I have another question, would you please do me a favor. Thank you!
I want to reshard the shards, and it seems that there is no api to achieve that.

I used to read an article from Oracle saying there are tools using to reshard, but I do not know what are they.
Can you please give me some advice?

Regards
Longxing

Fernando Ipar

Author

9 years ago

@Longxing,

To my knowledge, the only currently existing operations that change shards are split and move. You can see an example of a split here: https://github.com/martinarrieta/vagrant-fabric/blob/sharding/webinar-commands/session-0.org#splitting-a-shard

FYI, at this moment there are two open bugs that I know about that affect the split and move functionality: http://bugs.mysql.com/bug.php?id=73212 and http://bugs.mysql.com/bug.php?id=73546

Regards,
Fernando.

Shen Longxing

Reply to Fernando Ipar

9 years ago

@Fernando Ipar

Grateful for your reply, it does helpful to me.
I do the split according to your example:https://github.com/martinarrieta/vagrant-fabric/blob/sharding/webinar-commands/session-0.org#splitting-a-shard.

But, I got a error when execute “mysqlfabric sharding split_shard 1 salaries-4”.
The log is as follows:
[root@JSSIOEtest05 python]# mysqlfabric sharding split_shard 1 group_id-3 –split_v alue=10000
Password for admin:
Procedure :
{ uuid = 2cf79d32-feaa-4355-a6d4-b644d92704f1,
finished = True,
success = False,
return = BackupError: (‘Error while restoring the backup using the mysql client\n, %s’, “ERROR 1840 (HY000) at line 24 in file: ‘MySQL_132.228.239.24_3317.sql’ : @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.\n”),
activities =
}

And you told me that there is a open bug about the split:http://bugs.mysql.com/bug.php?id=73212.
Is it the reason for the error that I met?

Regards
Longxing

Fernando Ipar

Author

9 years ago

@Longxing,

Yes, I believe that bug is the reason for the error you’re getting. In my test case for the bug, I have ongoing write activity to the shard while doing the split. Is that your case too? If you get this error even with no activity, please add that as a comment to the bug report to help the engineers working on it.

Regards,
Fernando.

Shen Longxing

9 years ago

@Fernando Ipar

I just split the shard without any activity. And i will add this to the bug report. Thank you！

Regards
Longxing

Mahesh Patil

9 years ago

mysqlfabric v1.5.2 is a shit product from Oracle I have never seen worst quality product than this. It’s PITA .. can’t get setup done. Always throws errors on Index size is too large and connection failures.

Fernando Ipar

Author

9 years ago

@Mahesh:

I have not hit any errors like the one you mention, but I did experience odd errors (including ‘mysqlfabric manage start’ being unable to work at all) after upgrading, which where resolved by completely uninstalling the previous version of mysql-utilities and then doing a fresh install of the new one.

If you’re using our vagrant environment, you can achieve that by doing vagrant destroy; vagrant up.
If you’re hitting your problems in our vagrant env please file a bug report here https://github.com/percona/vagrant-fabric/issues in case the problem is our fault.

If you’re running this on your own test systems I’d recommend filing a bug report at https://bugs.mysql.com.

Fernando Ipar

Author

9 years ago

I see you’ve reported this already as http://bugs.mysql.com/bug.php?id=74492 so you can disregard my last comment.

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Q&A: Putting MySQL Fabric to use

Related

Related Blog Articles

RECOMMENDED ARTICLES

Valkey/Redis: Not-So-Good Practices

Choosing the Right Database: Comparing MariaDB vs. MySQL, PostgreSQL, and MongoDB

Valkey/Redis: Configuration Best Practices

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Q&A: Putting MySQL Fabric to use

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

Valkey/Redis: Not-So-Good Practices

Choosing the Right Database: Comparing MariaDB vs. MySQL, PostgreSQL, and MongoDB

Valkey/Redis: Configuration Best Practices

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation