Recently I had a case when a customer deleted the InnoDB main table space – ibdata1 – and redo logs – ib_logfile*. MySQL keeps InnoDB files open all the time. The following recovery technique is based on this fact and it allowed to salvage the database. Actually, the files were deleted long time ago – [...]
How to recover a single InnoDB table from a Full Backup
Sometimes we need to restore only some tables from a full backup maybe because your data loss affect a small number of your tables. In this particular scenario is faster to recover single tables than a full backup. This is easy with MyISAM but if your tables are InnoDB the process is a little bit [...]
Improved InnoDB fast index creation
One of the serious limitations in the fast index creation feature introduced in the InnoDB plugin is that it only works when indexes are explicitly created using ALTER TABLE or CREATE INDEX. Peter has already blogged about it before, here I’ll just briefly reiterate other cases that might benefit from that feature: when ALTER TABLE [...]
A recovery trivia or how to recover from a lost ibdata1 file
A few day ago, a customer came to Percona needing to recover data. Basically, while doing a transfer from one SAN to another, something went wrong and they lost the ibdata1 file, where all the table meta-data is stored. Fortunately, they were running with innodb_file_per_table so the data itself was available. What they could provide [...]
The case for getting rid of duplicate “sets”
The most useful feature of the relational database is that it allows us to easily process data in sets, which can be much faster than processing it serially. When the relational database was first implemented, write-ahead-logging and other technologies did not exist. This made it difficult to implement the database in a way that matched [...]
Distributed Set Processing with Shard-Query
Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: [...]
MySQL caching methods and tips
“The least expensive query is the query you never run.” Data access is expensive for your application. It often requires CPU, network and disk access, all of which can take a lot of time. Using less computing resources, particularly in the cloud, results in decreased overall operational costs, so caches provide real value by avoiding [...]
Flexviews – part 3 – improving query performance using materialized views
Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually [...]
What’s up with HandlerSocket?
I’ve presented at two different venues about HandlerSocket recently and the number one question that always arises is: Why hasn’t HandlerSocket become more popular than it is? Considering how fast and awesome HandlerSocket is, it’s not seeing as rapid adoption as some might expect. I theorize that there are five reasons for this:
Ultimate MySQL variable and status reference list
I am constantly referring to the amazing MySQL manual, especially the option and variable reference table. But just as frequently, I want to look up blog posts on variables, or look for content in the Percona documentation or forums. So I present to you what is now my newest Firefox toolbar bookmark: an option and [...]

