July 23, 2014

Q&A: Even More Deadly Mistakes of MySQL Development

On Wednesday I gave a presentation on “How to Avoid Even More Common (but Deadly) MySQL Development Mistakes” for Percona MySQL Webinars.  If you missed it, you can still register to view the recording and my slides. Thanks to everyone who attended, and especially to folks who asked the great questions.  I answered as many as we had […]

Using UDFs for geo-distance search in MySQL

In my previous post about geo-spatial search in MySQL I described (along with other things) how to use geo-distance functions. In this post I will describe the geo-spatial distance functions in more details. If you need to calculate an exact distance between 2 points on Earth in MySQL (very common for geo-enabled applications) you have at […]

Parallel Query for MySQL with Shard-Query

While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node.  Shard-Query can add parallelism to queries which use partitioned tables.  Very large tables can often be partitioned fairly easily. Shard-Query can leverage partitioning to add paralellism, because each partition can be queried independently. Because MySQL 5.6 supports the […]

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from  MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]

Advanced MySQL Query Tuning: Webinar followup Q&A

Thanks to all who attended my “MySQL Query Tuning” webinar on July 24.  If you missed it, you can you can download the slides and also watch the recorded video. Thank you for the excellent questions after the webinar as well. Query tuning is a big topic and, due to the limited time, I had […]

Schema Design in MongoDB vs Schema Design in MySQL

For people used to relational databases, using NoSQL solutions such as MongoDB brings interesting challenges. One of them is schema design: while in the relational world, normalization is a good way to start, how should we design our collections when creating a new MongoDB application? Let’s see with a simple example how we would create […]

When it’s faster to use SQL in MySQL NDB Cluster over memcache API

Memcache access for MySQL Cluster (or NDBCluster) provides faster access to the data because it avoids the SQL parsing overhead for simple lookups – which is a great feature. But what happens if I try to get multiple records via memcache API (multi-GET) and via SQL (SELECT with IN())? I’ve encountered this a few times […]

More on MySQL transaction descriptors optimization

Since my first post on MySQL transaction descriptors optimization introduced in Percona Server 5.5.30-30.2 and a followup by Dimitri Kravchuk, we have received a large number of questions on why the benchmark results in both posts look rather different. We were curious as well, so we tried to answer that question by retrying benchmarks on […]

MySQL 5.6.10 Optimizer Limitations: Index Condition Pushdown

While preparing the webinar I will deliver this Friday, I ran into a quite interesting (although not very impacting) optimizer issue: a “SELECT *” taking half the time to execute than the same “SELECT one_indexed_column” query in MySQL 5.6.10. This turned into a really nice exercise for checking the performance and inner workings of one […]

MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark

So far most of the benchmarks posted about MySQL 5.6 use the sysbench OLTP workload.  I wanted to test a set of queries which, unlike sysbench, utilize joins.  I also wanted an easily reproducible set of data which is more rich than the simple sysbench table.  The Star Schema Benchmark (SSB) seems ideal for this. […]