April 24, 2014

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from  MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]

When should you store serialized objects in the database?

A while back Friendfeed posted a blog post explaining how they changed from storing data in MySQL columns to serializing data and just storing it inside TEXT/BLOB columns. It seems that since then, the technique has gotten more popular with Ruby gems now around to do this for you automatically.

How fast can you sort data with MySQL ?

I took the same table as I used for MySQL Group by Performance Tests to see how much MySQL can sort 1.000.000 rows, or rather return top 10 rows from sorted result set which is the most typical way sorting is used in practice. I tested full table scan of the table completes in 0.22 […]

Database problems in MySQL/PHP Applications

Article about database design problems is being discussed by Kristian. Both article itself and responce cause mixed feellings so I decided it is worth commenting: 1. Using mysql_* functions directly This is probably bad but I do not like solutions proposed by original article ether. PEAR is slow as well as other complex conectors. I […]

PERFORMANCE_SCHEMA vs Slow Query Log

A couple of weeks ago, shortly after Vadim wrote about Percona Cloud Tools and using Slow Query Log to capture the data, Mark Leith asked why don’t we just use Performance Schema instead? This is an interesting question and I think it deserves its own blog post to talk about. First, I would say main […]

Followup questions to ‘What’s new in Percona XtraDB Cluster 5.6′ webinar

Thanks to all who attended my webinar yesterday.  The slides and recording are available on the webinar’s page.  I was a bit overwhelmed with the amount of questions that came in and I’ll try to answer them the best I can here. Q: Does Percona XtraDB Cluster support writing to multiple master? Yes, it does.  However, […]

MySQL server memory usage troubleshooting tips

There are many blog posts already written on topics related to “MySQL server memory usage,” but nevertheless there are some who still get confused when troubleshooting issues associated with memory usage for MySQL. As a Percona support engineer, I’m seeing many issues regularly related to heavy server loads – OR OOM killer got invoked and […]

The ARCHIVE Storage Engine – does it do what you expect?

Sometimes there is a need for keeping large amounts of old, rarely used data without investing too much on expensive storage. Very often such data doesn’t need to be updated anymore, or the intent is to leave it untouched. I sometimes wonder what I should really suggest to our Support customers. For this purpose, the […]

InnoDB scalability issues due to tables without primary keys

Each day there is probably work done to improve performance of the InnoDB storage engine and remove bottlenecks and scalability issues. Hence there was another one I wanted to highlight: Scalability issues due to tables without primary keys This scalability issue is caused by the usage of tables without primary keys. This issue typically shows […]

MySQL webinar: ‘Introduction to open source column stores’

Join me Wednesday, September 18 at 10 a.m. PDT for an hour-long webinar where I will introduce the basic concepts behind column store technology. The webinar’s title is: “Introduction to open source column stores.” What will be discussed? This webinar will talk about Infobright, LucidDB, MonetDB, Hadoop (Impala) and other column stores I will compare […]