April 23, 2014

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from  MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]

A conversation with 5 Facebook MySQL gurus

Facebook, the undisputed king of online social networks, has 1.23 billion monthly active users collectively contributing to an ocean of data-intensive tasks – making the company one of the world’s top MySQL users. A small army of Facebook MySQL experts will be converging on Santa Clara, Calif. next week where several of them are leading […]

Creating GEO-enabled applications with MySQL 5.6

In my previous post I’ve showed some new MySQL 5.6 features which can be very helpful when creating geo-enabled applications. In this post I will show how we can obtain open-source GIS data, convert it to MySQL and use it in our GEO-enabled applications. I will also present at the upcoming Percona Live conference on this […]

Q&A: Common (but deadly) MySQL Development Mistakes

On Wednesday I gave a presentation on “How to Avoid Common (but Deadly) MySQL Development Mistakes” for Percona MySQL Webinars. If you missed it, you can still register to view the recording and my slides. Thanks to everyone who attended, and especially to folks who asked the great questions. I answered as many as we had time […]

QA: Advanced Option Combinatorics (Pairwise Testing): Combinatorial mysqld Option Test Case Generation

How do we ensure that, when we have 35+ testable option combinations for mysqld, we test each and every combination of them? For example: will a different innodb_log_file_size combined with more innodb_log_files_in_group and a modified innodb_fast_shutdown setting truly not affect Percona’s log archiving feature? Most option-related bugs are caused by the setting of 1 or […]

InnoDB scalability issues due to tables without primary keys

Each day there is probably work done to improve performance of the InnoDB storage engine and remove bottlenecks and scalability issues. Hence there was another one I wanted to highlight: Scalability issues due to tables without primary keys This scalability issue is caused by the usage of tables without primary keys. This issue typically shows […]

How InnoDB promotes UNIQUE constraints

The other day I was running pt-duplicate-key-checker on behalf of a customer and noticed some peculiar recommendations on an InnoDB table with an odd structure (no PRIMARY key, but multiple UNIQUE constraints). This got me thinking about how InnoDB promotes UNIQUE constraints to the role of PRIMARY KEYs. The documentation is pretty clear: [DOCS] When […]

Crash-resistant replication: How to avoid MySQL replication errors

Percona Server’s “crash-resistant replication” feature is useful in versions 5.1 through 5.5. However, in Percona Server 5.6 it’s replaced with Oracle MySQL 5.6′s “crash safe replication” feature, which has it’s own implementation (you can read more about it here). A MySQL slave normally stores its position in files master.info and relay-log.info which are updated by […]

Implementing SchemaSpy in your MySQL environment

Lately I have been working with a set of customers on a longer term basis which has given me time to explore new tools using their environments.  One tool that I am finding very helpful is called SchemaSpy. SchemaSpy is a Java-based tool (requires Java 5 or higher) that analyzes the metadata of a schema in […]

The small improvements of MySQL 5.6: Duplicate Index Detection

Here at the MySQL Performance Blog, we’ve been discussing the several new features that MySQL 5.6 brought: GTID-based replication, InnoDB Fulltext, Memcached integration, a more complete performance schema, online DDL and several other InnoDB and query optimizer improvements. However, I plan to focus on a series of posts on the small but handy improvements – […]