April 25, 2014

Which Linux distribution for a MySQL database server? A specific point of view.

One of the more common questions I get asked is which Linux distribution I would use for a MySQL database server. Bearing the responsibility for someone else’s success means I should advise something that is stable, reliable, easy to manage and has plenty of resources available online. It should also allow running MySQL without too […]

Preprocessing Data

There are many ways of improving response times for users. There are some people that spend a lot of time, energy and money on trying to have the application respond as fast as possible at the time when the users made the request. Those people may miss out on an opportunity to do some or […]

Upcoming Webinar “Managing Big Data with Percona Server, XtraBackup and Tungsten”

On February 10, We will be giving a webinar “Managing Big Data with Percona Server, XtraBackup and Tungsten”. This is joint Continuent and Percona Webinar with full description: Big data is a big problem for growing SaaS businesses and large web applications. In this webinar, we’ll teach you how to set up Percona Server, XtraBackup, […]

Sample datasets for benchmarking and testing

Sometimes you just need some data to test and stress things. But randomly generated data is awful — it doesn’t have realistic distributions, and it isn’t easy to understand whether your results are meaningful and correct. Real or quasi-real data is best. Whether you’re looking for a couple of megabytes or many terabytes, the following […]

Data mart or data warehouse?

This is part two in my six part series on business intelligence, with a focus on OLAP analysis. Part 1 – Intro to OLAP Identifying the differences between a data warehouse and a data mart. (this post) Introduction to MDX and the kind of SQL which a ROLAP tool must generate to answer those queries. […]

Checking for a live database connection considered harmful

It is very common for me to look at a customer’s database and notice a lot of overhead from checking whether a database connection is active before sending a query to it. This comes from the following design pattern, written in pseudo-code:

Many of the popular development platforms do something similar to this. Two […]

MongoDB Approach to database synchronization

I went to MongoSF today – quite an event, and I hope to have a chance to write more about it. This post is about one replication problem and how MongoDB solves it. If you’re using MySQL Replication when your master goes down it is possible for some writes to be executed on the master, […]

When should you store serialized objects in the database?

A while back Friendfeed posted a blog post explaining how they changed from storing data in MySQL columns to serializing data and just storing it inside TEXT/BLOB columns. It seems that since then, the technique has gotten more popular with Ruby gems now around to do this for you automatically.

Recovery beyond data restore

Quite frequently I see customers looking at recovery as on ability to restore data from backup which can be far from being enough to restore the whole system to operating state, especially for complex systems. Instead of looking just at data restore process you better look at the whole process which is required to bring […]

Missing Data – rows used to generate result set

As Baron writes it is not the number of rows returned by the query but number of rows accessed by the query will most likely be defining query performance. Of course not all row accessed are created equal (such as full table scan row accesses may be much faster than random index lookups row accesses […]