July 22, 2014

Working with large data sets in MySQL

What does working with large data sets in MySQL teach you? Of course you have to learn a lot about query optimization, the art of building summary tables, and tricks for executing queries exactly as you want. I already wrote about the development and configuration side of the problem, so I will not go into details […]

Sysbench Benchmarking of Tesora’s Database Virtualization Engine

Tesora, previously called Parelastic, asked Percona to do a sysbench benchmark evaluation of its Database Virtualization Engine on specific architectures on Amazon EC2. The focus of Tesora is to provide a scalable Database As A Service platform for OpenStack. The Database Virtualization Engine (DVE) plays a part in this as it aims at allowing databases […]

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]

Upcoming Webinar “Managing Big Data with Percona Server, XtraBackup and Tungsten”

On February 10, we will be giving a webinar, “Managing Big Data with Percona Server, XtraBackup and Tungsten”. This is a joint Continuent and Percona webinar, with the full description: Big data is a big problem for growing SaaS businesses and large web applications. In this webinar, we’ll teach you how to set up Percona Server, XtraBackup, […]

Sample datasets for benchmarking and testing

Sometimes you just need some data to test and stress things. But randomly generated data is awful — it doesn’t have realistic distributions, and it isn’t easy to understand whether your results are meaningful and correct. Real or quasi-real data is best. Whether you’re looking for a couple of megabytes or many terabytes, the following […]

MongoDB Approach to database synchronization

I went to MongoSF today – quite an event, and I hope to have a chance to write more about it. This post is about one replication problem and how MongoDB solves it. If you’re using MySQL Replication, when your master goes down it is possible for some writes to be executed on the master, […]

Converting Character Sets

The web is going the way of utf8. Drizzle has chosen it as the default character set, most back-ends to websites use it to store text data, and those who are still using latin1 have begun to migrate their databases to utf8. Googling for “mysql convert charset to utf8” results in a plethora of sites, […]

Recovery beyond data restore

Quite frequently I see customers looking at MySQL recovery as an ability to restore data from backup, which can be far from enough to restore the whole system to an operating state, especially for complex systems. Instead of looking just at the data restore process, you had better look at the whole process which is required to […]

Large result sets vs. compression protocol

The mysql_connect() function in PHP’s MySQL interface (which for reference maps to the mysql_real_connect() function in the MySQL C API) has had a $client_flags parameter since PHP 4.3.0. This parameter is barely known and almost always overlooked, but in some cases it could provide a nice boost to your application. There’s a number of different flags that can be […]

How fast can you sort data with MySQL?

I took the same table as I used for MySQL Group by Performance Tests to see how fast MySQL can sort 1,000,000 rows, or rather return the top 10 rows from a sorted result set, which is the most typical way sorting is used in practice. I tested that a full table scan of the table completes in 0.22 […]