On Friday, February 15, 2013 10:00am Pacific Standard Time, I will be delivering a webinar entitled “Building a highly scaleable distributed row, document or column store with MySQL and Shard-Query” The first part of this webinar will focus on why distributed databases are needed, and on the techniques employed by Shard-Query to implement a distributed […]
Webinar: Building a highly scaleable distributed row, document or column store with MySQL and Shard-Query
Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]
Infobright and InnoDB AMI images are now available There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here […]
Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]
Preamble: On performance, workload and scalability: MySQL has always been focused on OLTP workloads. In fact, both Percona Server and MySQL 5.5.7rc have numerous performance improvements which benefit workloads that have high concurrency. Typical OLTP workloads feature numerous clients (perhaps hundreds or thousands) each reading and writing small chunks of data. The recent improvements to […]
I wrote a post a while back that said why you don’t want to shard.Â In that post that I tried to explain that hardware advances such as 128G of RAM being so cheap is changing the point at which you need to shard, and that the (often omitted) operational issues created by sharding can […]
Note: This blog post is part 1 of 4 on building our training workshop.
The Percona training workshop will not cover sharding. If you follow our blog, you’ll notice we don’t talk much about the subject; in some cases it makes sense, but in many we’ve seen that it causes architectures to be prematurely complicated.
So let me state it: You don’t want to shard.
Optimize everything else first, and then if performance still isn’t good enough, it’s time to take a very bitter medicine. The reason you need to shard basically comes down to one of these two reasons
For large number of online applications once you implemented proper sharding you can consider your scaling problems solved – by getting more and more hardware you can grow. As I recently wrote it however does not mean it is the most optimal way by itself to do things. The “classical” sharding involves partitioning by user_id,site_id […]
ScaleArc recently hired Percona to perform various tests on its database traffic management product. This post is the outcome of the benchmarks carried out by Uday Sawant (ScaleArc) and myself. You can also download the report directly as a PDF here. The goal of these benchmarks is to identify the potential overhead of the ScaleArc […]
Facebook, the undisputed king of online social networks, has 1.23 billion monthly active users collectively contributing to an ocean of data-intensive tasks – making the company one of the world’s top MySQL users. A small army of Facebook MySQL experts will be converging on Santa Clara, Calif. next week where several of them are leading […]