October 19, 2009

MySQL-Memcached or NOSQL Tokyo Tyrant – part 3

Posted by matt |

This is part 3 of our series.  In part 1 we talked about boosting performance with memcached on top of MySQL, in Part 2 we talked about running 100% outside the data with memcached, and now in Part 3 we are going to look at a possible solution to free you from the database.  The solution I am going to discuss here is Tokyo Cabinet and Tyrant.

I am not going to give you a primer  or Tutorial on Tyrant and Cabinet, there are plenty of these out there already.  Instead I want to see what sort of performance we can see compared to MySQL and Memcached, and later on other NoSQL solutions.  Tokyo actually allows you to use several types of databases that are supported, there are hash databases which are very similar to memcached, a table database which is similar to your classic database tables where you can add a where clause and search individual columns, and a ton more “database options”  beyond just those two.  Again my goal is to not make this a Tokyo Tyrant tutorial but rather show one potential role it can play.

More details can be read about here:
http://1978th.net/tokyotyrant/
http://1978th.net/tokyocabinet/

So if we can get performance similar to memcached with Tokoyo Tyrant when using disk based hash tables it would be a compelling replacement for our application here.  It should provide the interface and the same access we saw in memcached but with disk persistence. So let’s look at the numbers:

Tyrant -vs- memcached

Tyrant’s disk based hash was almost 2x faster then combining memcached and MySQL, and about 20% slower then the all memory memcached approach.  So for this particular application I would have been much better off not storing my data in MySQL and instead looking outside the database for an answer.  Now sure there are other reasons you may want to keep data in the database… but I am trying to get you to think about your application and if those reasons are really valid.  Helping clients pick the right solution is one of the things we do here at Percona.  If an application requires a database great, but if there is a better solution we want to suggest it.  It’s our goal to make your application perform optimally.

Finally, one concern you have to have is the scalability of your storage solution.  As load, number of threads, and data size increases how does performance differ or change?  One knock on Tokyo -vs- Memcached is Tokyo is not distributed by default.  Now that’s not to say we could not shard it based on a hash, or even build an api with the capability built in ( or use the memcached clients which works! )…  but native support is lacking.  It does support replication which could make some rather interesting architectures in the future.

So lets look at some scalability benchmarks, my server resources are rather limited but I thought I should try throwing more threads and work at the server until it hit its limit and fell over dead.  It’s interesting to see the number of transactions that occur with a given number of threads.  let’s look at some of these:

Tyrant/MySQL/Memcached Thread Benchmark performance

As expected the smaller buffer pool struggled ( why a smaller buffer pool?  This simulates a much larger data set.  A BP of 256M with 1GB of data, can give similar performance to 20GB of data and a 5GB BP ).  So with 256M BP and 4GB of memcached we were well off the numbers we hit with a 4GB BP+4Gb of memcached ( which is expected ).   Adding more threads even up to 128 threads increased overall throughput but my load average on the server hit 40 and my CPU was pegged.  At 128 threads I was pegging out my CPU across the board.  Also interesting is I started to hit bottlenecks in MySQL/Innodb when I had enough memory but I increased the threads from 64 to 128.  As time permits I should revisit this and look at increased datasets, and look for area’s where Tyrant may stumble a bit.

Bottom line given a specific application and data pattern sometimes a relational database is not the appropriate place for storing data.  A tool like Tokyo Tyrant may not be for everyone or every application, but neither is a relational database.  Before building your next application try and understand whether an RDBMS is really needed or not.

How did I do these tests:

The above number were run with 32 Threads, Tyrant was started with 8 threads and 128M of memory,  memached was started with 16 threads ( 1.4 memcached ), mysql was 5.1 XtraDB.  Each environment had 2 tables each with 2 million rows.  The data was identical. memcached and Tyrant stored a comma delimited string to represent the row.   Mysql was running with 256M allocated to the innodb buffer unless otherwise noted.

What’s next?  Well next I am going to try and continue this series by exploring and benchmarking other NOSQL options and comparing them to database based solutions.  I think showing the performance of a couple of different Tokyo database formats would also be interesting.  What other solutions are people interested in?  I know I have gotten a lot of requests for cassandra #’s, but what else?  Drop a comment and let me know!

October 16, 2009

How (not) to find unused indexes

Posted by Morgan Tocker |

I've seen a few people link to an INFORMATION_SCHEMA query to be able to find any indexes that have low cardinality, in an effort to find out what indexes should be removed.  This method is flawed - here's the first reason why:

SQL:
  1. CREATE TABLE `sales` (
  2. `id` int(11) NOT NULL AUTO_INCREMENT,
  3. `customer_id` int(11) DEFAULT NULL,
  4. `status` enum('archived','active') DEFAULT NULL,
  5. PRIMARY KEY (`id`),
  6. KEY `status` (`status`)
  7. ) ENGINE=MyISAM AUTO_INCREMENT=65691 DEFAULT CHARSET=latin1;
  8.  
  9. mysql> SELECT count(*), STATUS FROM sales GROUP BY STATUS;
  10. +----------+---------+
  11. | count(*) | STATUS  |
  12. +----------+---------+
  13. |    65536 | archived |
  14. |      154 | active  |
  15. +----------+---------+
  16. 2 rows IN SET (0.17 sec)
  17.  
  18. mysql> EXPLAIN SELECT * FROM sales WHERE STATUS='active'; # query 1
  19. +----+-------------+-------+------+---------------+--------+---------+-------+------+-------------+
  20. | id | select_type | TABLE | type | possible_keys | KEY    | key_len | ref   | rows | Extra       |
  21. +----+-------------+-------+------+---------------+--------+---------+-------+------+-------------+
  22. 1 | SIMPLE      | sales | ref  | STATUS        | STATUS | 2       | const |  196 | USING WHERE |
  23. +----+-------------+-------+------+---------------+--------+---------+-------+------+-------------+
  24. 1 row IN SET (0.06 sec)
  25.  
  26. mysql> EXPLAIN SELECT * FROM sales WHERE STATUS='archived'; # query 2
  27. +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
  28. | id | select_type | TABLE | type | possible_keys | KEY  | key_len | ref  | rows  | Extra       |
  29. +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
  30. 1 | SIMPLE      | sales | ALL  | STATUS        | NULL | NULL    | NULL | 65690 | USING WHERE |
  31. +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
  32. 1 row IN SET (0.01 sec)

The cardinality of status index is woeful, but provided that the application is always only sending query 1 to MySQL it's actually a pretty good index!  It's not always like this, but there are a lot of cases where applications have good selectivity with some queries despite what cardinality shows.

Not convinced?  Here's reason number two:

SQL:
  1. CREATE TABLE `Country` (
  2. `Code` char(3) NOT NULL DEFAULT '',
  3. `Name` char(52) NOT NULL DEFAULT '',
  4. `Continent` enum('Asia','Europe','North America','Africa','Oceania','Antarctica','South America') NOT NULL DEFAULT 'Asia',
  5. `Region` char(26) NOT NULL DEFAULT '',
  6. `SurfaceArea` float(10,2) NOT NULL DEFAULT '0.00',
  7. `IndepYear` smallint(6) DEFAULT NULL,
  8. `Population` int(11) NOT NULL DEFAULT '0',
  9. `LifeExpectancy` float(3,1) DEFAULT NULL,
  10. `GNP` float(10,2) DEFAULT NULL,
  11. `GNPOld` float(10,2) DEFAULT NULL,
  12. `LocalName` char(45) NOT NULL DEFAULT '',
  13. `GovernmentForm` char(45) NOT NULL DEFAULT '',
  14. `HeadOfState` char(60) DEFAULT NULL,
  15. `Capital` int(11) DEFAULT NULL,
  16. `Code2` char(2) NOT NULL DEFAULT '',
  17. PRIMARY KEY (`Code`),
  18. KEY `Population` (`Population`)
  19. ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
  20.  
  21. mysql> SELECT count(*) FROM Country;
  22. +----------+
  23. | count(*) |
  24. +----------+
  25. |      239 |
  26. +----------+
  27. 1 row IN SET (0.00 sec)
  28.  
  29. mysql> SELECT count(DISTINCT(population)) FROM Country;
  30. +-----------------------------+
  31. | count(DISTINCT(population)) |
  32. +-----------------------------+
  33. |                         226 |
  34. +-----------------------------+
  35. 1 row IN SET (0.05 sec)
  36.  
  37. mysql> EXPLAIN SELECT * FROM country WHERE population> 1000; # query 3
  38. +----+-------------+---------+------+---------------+------+---------+------+------+-------------+
  39. | id | select_type | TABLE   | type | possible_keys | KEY  | key_len | ref  | rows | Extra       |
  40. +----+-------------+---------+------+---------------+------+---------+------+------+-------------+
  41. 1 | SIMPLE      | country | ALL  | Population    | NULL | NULL    | NULL239 | USING WHERE |
  42. +----+-------------+---------+------+---------------+------+---------+------+------+-------------+
  43. 1 row IN SET (0.04 sec)
  44.  
  45. mysql> EXPLAIN SELECT * FROM country WHERE population> 100000000; # query 4
  46. +----+-------------+---------+-------+---------------+------------+---------+------+------+-------------+
  47. | id | select_type | TABLE   | type  | possible_keys | KEY        | key_len | ref  | rows | Extra       |
  48. +----+-------------+---------+-------+---------------+------------+---------+------+------+-------------+
  49. 1 | SIMPLE      | country | range | Population    | Population | 4       | NULL |   23 | USING WHERE |
  50. +----+-------------+---------+-------+---------------+------------+---------+------+------+-------------+
  51. 1 row IN SET (0.00 sec)

The index on query 3 had high cardinality but should not be used since too many countries have a population greater than 1000.  An automated search for low cardinality indexes wouldn't have revealed it's uselessness.  For range scans, it's very easy to lead yourself into a trap where your index can not filter out enough rows to be effective.  I see this a lot in consulting issues where customers have queries that use a BETWEEN on a date, but the window of time it is searching in is too wide.

Side Note: In some texts you'll see people quote the numbers "20-30%" as the minimum amount of rows you have to filter down to for an index to be useful (that is, eliminate 70-80% of rows).  It's not quite correct to quote this as an exact percentage, since this value is not fixed in MySQL and can be a much wider window (15-60%) depending on the circumstances.  In this case, MySQL flipped from tablescan to index at about 34%.

How am I supposed to find unused indexes then?
You really have to run queries against your server - there is no other way.  From there, there's a helpful patch in 5.0-percona called INDEX_STATISTICS that can then show you which indexes were touched and which were not.

If you are not running a patched server, then the alternative is to either use a proxy that checks EXPLAIN information (like QUAN) or set your slow query log to zero microseconds (5.1 feature) and then find someway to parse and EXPLAIN all results, then subtract the indexes that were mentioned from all indexes known.  There's an old tool called mysqlidxchx which should be able to do this.

MySQL-Memcached or NOSQL Tokyo Tyrant – part 2

Posted by matt |

Part 1 of our series set-up our "test"  application and looked at boosting performance of the application by buffer MySQL with memcached.  Our test application is simple and requires only 3 basic operations per transaction 2 reads and 1 write.  Using memcached combined with MySQL we ended up nearly getting a 10X performance boost from the application.  Now we are going to look at what we could achieve if we did not have to write to the database at all.  So let's look at what happens if we push everything including writes into memcached.

Benchmarks if everything is in memcached

Wow that's shockingly fast isn't it! I guess being completely in memory helps for this app.  What is very interesting is accessing 100% of the data in memcached gives very similar numbers to accessing 100% of the data in memory in the DB ( part 1 benchmarked a 4GB bp as being able to handle 7K TPS)... something is not 100% right here.  It stands to reason that memcached should be faster for this application then the DB.  Its just doing two gets via key and 1 set.  So why the similar numbers?

Well glad you asked.  It's the API.  The api in this case was Cache::Memcached, by switching to using Cache::Memcached::Fast look what happens:

Memcached API - Fast

That is a nice jump in performance!

Using Memcached::Fast was kind of a mixed bag when looking at the benchmarks for mixing MySQL and Memcached in my tests:

Sometimes Api changes can make a huge difference

In this case I think the Fast api was slower when working with MySQL with a 256m BP because the slower returns from memcached acted as a bottleneck to thin the demands on MySQL to write data, smoothing out the work load.  When we eliminate this bottleneck with the Fast api, MySQL gets hammered.  This type of thing happens a lot.  For example an application is CPU bound, so you add more processing power, but then you hit disks harder and  now your disk bound.

A couple of good things to remember here:  #1 resolving 1 bottleneck can open another bottleneck that is much worse.  #2  is to understand that not all API's are created equal.  Additionally the configuration and setup that works well on one system may not work well on another.  Because of this people often leave lots of performance on the table.  Don't just trust that your current API or config is optimal, test and make sure it fits your application.

So adding Memcached on top of MySQL for our test application can significantly boost performance. But you notice that if we were running 100% in memcached and could cut out MySQL we could get 2.5x more performance over a mixed solution and 100X over just stock MySQL.  As the number of writes against the database increase this gap will increase.  So let's ditch the database!  But wait!  you need the DB for  persistence, right?

It depends.  A database may not be the best fit for every application.  There are several “NOSQL”  solutions out in the open source space that can give you some of the ease of a Memcached but with persistence most people use their database for.   Each application is different and understanding the application's requirements is key to picking an appropriate solution.   I am going to look at several database alternatives over the next few months.  I need to start somewhere, so I decided to start with Tokyo Tyrant and Cabinet.    So stop in next time for part 3 of this series where we will focus on running the same tests against Tokyo Tyrant.

Wow that's shockingly fast isn't it! I guess being completely in memory helps for this app. What is very interesting is accessing 100% of the data in memcached gives very similar numbers to accessing 100% of the data in memory in the DB... something is not 100% right here. It stands to reason that memcached should be faster for this application then the DB, two gets via key and 1 set. So why the similar numbers?

Well glad you asked. It's the API. The api in this case was Cache::Memcached, by switching to using Cache::Memcached::Fast look what happens:

That is a nice jump in performance!

Using Memcached::Fast was kind of a mixed bag when looking at the benchmarks for mixing MySQL and Memcached in my tests:

In this case I think Fast was slower when working with MySQL with a 256m BP because the slower returns from memcached acted as a bottleneck to thin the demands on MySQL to write data, smoothing out the work load. When we eliminate this bottleneck with the Fast api, MySQL gets hammered.

A couple of good things to remember here: #1 resolving 1 bottleneck can open another bottleneck that is much worse. #2 is to understand that not all API's are created equal. Additionally the configuration and setup that works well on one system may not work well on another. Because of this people often leave lots of performance on the table. Don't just trust that your current API or config is optimal, test and make sure it fits your application.

So adding Memcached on top of MySQL for our test application can significantly boost performance. But you notice that if we were running 100% in memcached and could cut out MySQL we could get 2.5x more performance. As the number of writes against the database increase this gap will increase. So let's ditch the database! But wait! you need the DB for persistence, right?

It depends. A database may not be the best fit for every application. There are several “NOSQL” solutions out in the open source space that can give you some of the ease of a Memcached but with persistence most people use their database for. Each application is different and understanding the application's requirements is key to picking an appropriate solution. I am going to look at several database alternatives over the next few months. I need to start somewhere, so I decided to start with Tokyo Tyrant and Cabinet.

October 15, 2009

MySQL-Memcached or NOSQL Tokyo Tyrant – part 1

Posted by matt |

All to often people force themselves into using a database like MySQL with no thought into whether if its the best solution to there problem. Why?  Because their other applications use it, so why not the new application?  Over the past couple of months I have been doing a ton of work for clients who use their database like most people use memcached .  Lookup a row based on a key, update the data in the row, stuff the row back in the database.  Rinse and repeat.  Sure these setups vary sometimes, throwing in a “lookup” via username, or even the rare count.  But for the most part they are designed to be simple.

A classic example is a simple online game.  An online game may only require that an application retrieve a single record from the database.  The record may contain all the vital stats for the game, be updated and stuffed back into the database.  You would be surprised how many people use this type of system as I run into this type of application frequently.  Keeping it simple, ensures that application is generally mean and lean and performs well.  The issue is even this simple design can start to have issues as the data size increases and you blow through your available memory.  Is there a better architecture?  Is there a way to get more scalability out of your database?  Is the database even the best place for this data?

I decided to walk through setting up a very simple application that does what I have seen many clients do.  Using this application I can then compare using MySQL to using MySQL + Memcached, and then to other solutions like Tokyo Tyrant or Cassandra.   My Application does the following:

A.)  read a row from a database based on an integer based primary key
B.)  Update data from that row and replace the stored contents on disk
C.)  Use the data from that row to lookup up a row in another table based on a text field ( called email address ).

Seems simple enough right?  My two tables each contain 5M rows of data.  let's see what happens:

DB Fits into Memory

Chart of TPS for benchmark application

You can see a dramatic drop off in performance as my data falls out of memory, that's not cool is it?  After all database sizes tend to always grow and very rarely shrink.  Which leads to a challenge faced by almost everyone how do you maintain your performance in the face of increasing data size?

Here is where people start to scratch their heads.  They naturally assume they need to scale more, we need more memory!   If performance sucks, we must need more.  So here comes the bigger boxes, the read-only slaves,  the complex sharding systems, the discussions on cluster, more memcached.  We need to cover up the databases inefficiencies to ensure that our application scales.

The problem is for some applications, we are fixing symptoms, not the problem itself.  No matter how much you want it to fit,  some things may not work (like the Godfather 3).    The issue is people assume that data storage has to be in the database.  “It's data, it needs to go into the database.” is often the battle cry.   But hold on to your hat,  I  am going to shock you.  For some applications, putting your data in the database is silly.  Yes the guy who blogs on bigdbahead.com and is writing this on the mysqlperformanceblog is saying you may not want to use a database.  Heresy I know!  But for many of us we already accept storing data ( at least temporarily ) outside the DB.  Think memcached.

Almost everyone loves memcached, it's simple, fast, and just works.  When your dataset exceeds your memory limitations or the database can simply not keep up any more this solution can really boost performance.  I know you're thinking my simple key lookup should really benefit from memcached. So let's try it!  I took the simple app I created that reads two rows, and update one of them to read from memcached if available, remove on update, and read from the db only when required.  I tested with a memcached size of 1GB, 2GB, and 4GB.  For these tests I left Innodb with a 256M buffer pool, or roughly with 9% of the total data in memory.

let's look at the 1GB Setting:

Ensure you have enough memory for memcached

What, a performance regression?  But we threw more memory at it!!   How can that be!

Memcached is not a cure all.  I have talked to many client's who say “we will just throw memcached at it”.   Sometimes an app will scream other times it won't... and yet others require lots and lots of memory allocated to memcached to be successful.    This application selects a random # between 1 and 2 Million and looks up the result via that key.  It then uses data from that random row to look up a second piece of information via email address.  Because the entire dataset  is about 4GB and only 1G is in memcached, I keep pushing data out of memcached to make room for new records I am reading from the database. Remember memcached needs repeatability to be helpful.   I am still getting a really solid # of hits in memcached, but the # of writes in MySQL coupled with the still large # of reads takes its toll.  Another place where I have seen this kill clients is in apps that do some sequential scanning and do not have enough memory for memcached.  For instance, if you have 1,000,000 rows of data, but enough memory to only store 500,000 rows... sequentially accessing this data will destroy the use of cache:

get record 1, miss, read from disk, cache record 1
….
get record 500,001, miss, read from disk, expunge record 1, cache record 500,001
....
get record 1, miss, read from disk, expunge record 500,001, cache record 1

you keep overwriting the cache before you can use it.  So here the complexity of adding memcached hurts us, because the cache is not actually buying us anything.

Now bumping this up to 2GB actually makes the TPS jump around a lot, sometimes hitting 400 or 500 TPS and other times hitting as high as 1800 TPS.  My guess is the movement is caused by the random #'s being generated and simply the luck of the draw.

Finally let's look when we have 4GB of memory allocated to memcached (full dataset fits ):

Transactions with and without Memcached

Here you can see that our “transactions”  per second for this app increased almost 10Xby using memcached.  The TPS I get here vary from 1100 TPS to just under 2000TPS with the average around 1400TPS.   I think we would all be very happy if we could get a 10X performance boost from your application.

But wouldn't it be great if we could get more?  I mean our reads are going pretty fast, but our writes leave a lot to be desired:

Read -vs- write times with memcached + mysql mixed

Over 17 MS to do an update.  Wouldn't be great to just eliminate all the updates as well?  What sort of throughput would we get?   I will show you in part 2.  Part 2 of this post will talk about performance in a 100% pure memcached environment. Part 3 will focus on these same benchmarks in Tokyo tyrant.

All to often people force themselves into using a database like MySQL with no thought into whether if its the best solution to there problem. Why? Because their other applications use it, so why not the new application? Over the past couple of months I have been doing a ton of work for clients who use their database like most people use memcached . Lookup a row based on a key, update the data in the row, stuff the row back in the database. Rinse and repeat. Sure these setups vary sometimes, throwing in a “lookup” via username, or even the rare count. But for the most part they are designed to be simple.

A classic example is a simple online game. An online game may only require that an application retrieve a single record from the database. The record may contain all the vital stats for the game, be updated and stuffed back into the database. You would be surprised how many people use this type of system as I run into this type of application frequently. Keeping it simple, ensures that application is generally mean and lean and performs well. The issue is even this simple design can start to have issues as the data size increases and you blow through your available memory. Is there a better architecture? Is there a way to get more scalability out of your database? Is the database even the best place for this data?

I decided to walk through setting up a very simple application that does what I have seen many clients do. Using this application I can then compare using MySQL to using MySQL + Memcached, and then to other solutions like Tokyo Tyrant or Cassandra. My Application does the following:

A.) read a row from a database based on an integer based primary key

B.) Update data from that row and replace the stored contents on disk

C.) Use the data from that row to lookup up a row in another table based on a text field ( called email address ).

Seems simple enough right? My two tables each contain 5M rows of data. let's see what happens:

You can see a dramatic drop off in performance as my data falls out of memory, that's not cool is it? After all database sizes tend to always grow and very rarely shrink. Which leads to a challenge faced by almost everyone how do you maintain your performance in the face of increasing data size?

Here is where people start to scratch their heads. They naturally assume they need to scale more, we need more memory! If performance sucks, we must need more. So here comes the bigger boxes, the read-only slaves, the complex sharding systems, the discussions on cluster, more memcached. We need to cover up the databases inefficiencies to ensure that our application scales.

The problem is for some applications, we are fixing symptoms, not the problem itself. No matter how much you want it to fit, somethings may not work (like the Godfather 3). The issue is people assume that data storage has to be in the database. “It's data, it needs to go into the database.” is often the battle cry. But hold on to your hat, I am going to shock you. For some applications, putting your data in the database is silly. Yes the guy who blogs on bigdbahead.com and is writing this on the mysqlperformanceblog is saying you may not want to use a database. Heresy I know! But for many of us we already accept storing data ( at least temporarily ) outside the DB. Think memcached.

Almost everyone loves memcached, it's simple, fast, and just works. When your dataset exceeds your memory limitations or the database can simply not keep up any more this solution can really boost performance. I know you're thinking my simple key lookup should really benefit from memcached. So let's try it! I took the simple app I created that reads two rows, and update one of them to read from memcached if available, remove on update, and read from the db only when required. I tested with a memcached size of 1GB, 2GB, and 4GB. For these tests I left Innodb with a 256M buffer pool, or roughly with 9% of the total data in memory.

let's look at the 1GB Setting:

What, a performance regression? But we threw more memory at it!! How can that be!

Memcached is not a cure all. I have talked to many client's who say “we will just throw memcached as it”. Sometimes an app will scream other times it won't... and yet others require lots and lots of memory allocated to memcached to be successful. This application selects a random # between 1 and 2 Million and looks up the result via that key. It then uses data from that random row to look up a second piece of information via email address. Because the entire dataset is about 4GB and only 1G is in memcached, I keep pushing data out of memcached to make room for new records I am reading from the database. Remember memcached needs repeatability to be helpful. I am still getting a really solid # of hits in memcached, but the # of writes in MySQL coupled with the still large # of reads takes its toll. Another place where I have seen this kill clients is in apps that do some sequential scanning and do not have enough memory for memcached. For instance, if you have 1,000,000 rows of data, but enough memory to only store 500,000 rows... sequentially accessing this data will destroy the use of cache:

  • get record 1, miss, read from disk, cache record 1

  • ….

  • get record 500,001, miss, read from disk, expunge record 1, cache record 500,001

  • ....

  • get record 1, miss, read from disk, expunge record 500,001, cache record 1

    you keep overwriting the cache before you can use it. So here the complexity of adding memcached hurts us, because the cache is not actually buying us anything.

Now bumping this up to 2GB actually makes the TPS jump around a lot, sometimes hitting 400 or 500 TPS and other times hitting as high as 1800 TPS. My guess is the movement is caused by the random #'s being generated and simply the luck of the draw.

Finally let's look when we have 4GB of memory allocated to memcached (full dataset fits ):

Here you can see that our “transactions” per second for this app increased almost 10Xby using memcached. The TPS I get here vary from 1100 TPS to just under 2000TPS with the average around 1400TPS. I think we would all be very happy if we could get a 10X performance boost from your application.

But wouldn't it be great if we could get more? I mean our reads are going pretty fast, but our writes leave a lot to be desired:

Over 17 MS to do an update. Wouldn't be great to just eliminate all the updates as well? What sort of throughput would we get?

September 16, 2009

How to generate per-database traffic statistics using mk-query-digest

Posted by Ryan Lowe |

We often encounter customers who have partitioned their applications among a number of databases within the same instance of MySQL (think application service providers who have a separate database per customer organization ... or wordpress-mu type of apps). For example, take the following single MySQL instance with multiple (identical) databases:
[read more...]

September 15, 2009

Which adaptive should we use?

Posted by Yasufumi |

As you may know, InnoDB has 2 limits for unflushed modified blocks in the buffer pool. The one is from physical size of the buffer pool. And the another one is oldness of the block which is from the capacity of transaction log files.

In the case of heavy updating workload, the modified ages of the many blocks are clustered. And to reduce the maximum of the modified ages InnoDB needs to flush many of the blocks in a short time, if these are not flushed at all. Then the flushing storm affect the performance seriously.

We suggested the "adaptive_checkpoint" option of constant flushing to avoid such a flushing storm. And finally, the newest InnoDB Plugin 1.0.4 has the new similar option "adaptive_flushing" as native.

Let's check the adaptive flushing options at this post.
[read more...]

September 12, 2009

3 ways MySQL uses indexes

Posted by peter |

I often see people confuse different ways MySQL can use indexing, getting wrong ideas on what query performance they should expect. There are 3 main ways how MySQL can use the indexes for query execution, which are not mutually exclusive, in fact some queries will use indexes for all 3 purposes listed here.
[read more...]

August 29, 2009

Give me 8 hours, and I’ll help you build a better application

Posted by Baron Schwartz |

I have run into a number of cases recently that all had a similar look and feel. In most of these cases, the symptoms were very complicated, but they boiled down to just a few problems that can be prevented in very simple ways.

If you are not doing any of the following three simple things, you probably should. These are essential practices for building successful applications, which everyone should be doing. And the best part is how easy it is to do them -- read on for the details.

[read more...]

August 13, 2009

XtraDB: The Top 10 enhancements

Posted by Morgan Tocker |

Note: This post is part 2 of 4 on building our training workshop.

Last week I talked about why you don't want to shard. This week I'm following up with the top 10 enhancements that XtraDB has over the built-in InnoDB included in MySQL 5.0 and 5.1.  Building this list was not really a scientific process - It's always difficult to say which feature is better than another, because a lot of it depends on the individual workload.  My ranking method was to pick the features that have the highest impact and are most applicable to all workloads first:

  1. CPU scalability fixes - XtraDB improves performance on systems with multi-cpus (see docs 1, 2).
  2. Import/Export Tables - XtraDB allows you to import an arbitrary  table from one server to another, by backing up the .ibd file with Xtrabackup (see docs).
  3. IO scalability fixes - A lot of the internal algorithms of InnoDB are based on the non-configurable assumption that the server has only a single disk installed (100 iops).  One problem in particular that this causes, is that InnoDB the algorithm which chooses if InnoDB is too busy to flush dirty pages can consider it's self busy very easily.  Keeping a large percentage of pages dirty increases recovery time, and will lead to more work when checkpoints are eventually forced at the end of a log file.  XtraDB improves this with innodb_io_capacity, as well as configuration items for innodb_read_threads, innodb_write_threads (see docs).
  4. Better Diagnostics - The SHOW ENGINE INNODB STATUS command in XtraDB shows a lot more information than the standard InnoDB status (see docs).  The built-in InnoDB status also has some problems with the placement of items (a long transaction list will prevent the rest of the information shown).  In addition to this, XtraDB diagnostics include the ability to see the contents of the buffer pool (see docs), and InnoDB row statistics are inserted into the slow query log (see docs).
  5. Fast Crash Recovery - In the built-in InnoDB, the crash recovery process is sometimes best measured in hours and days - this restricts users to using very small transaction log files (innodb_log_file_size), which is worse for performance.  In a simple test, XtraDB recovered ten times faster (see docs).
  6. InnoDB Plugin Features - XtraDB is derived from the InnoDB plugin, which has fast index creation (as opposed to recreating the whole table!) and page compression.
  7. Adaptive Checkpointing - The built-in InnoDB can have erratic dips in performance as it approaches the end of a log file and needs to checkpoint - which can cause a denial of service to your application (this can be seen in any benchmark - such as this one).  In XtraDB, adaptive checkpointing can smooth out the load, and checkpoint data more aggressively as you approach the end of a log file (see docs).
  8. Insert Buffer control - The insert buffer is a great feature of InnoDB that is not often discussed.  It allows you to delay the writing of non-unique secondary index pages, which can often lead to a lot of merged requests and reduced IO.  The  problem with the insert buffer in the built-in InnoDB, is that there are no options to tweak it.  It can grow to 1/2 the size of your buffer pool, and when it does, it doesn't try to aggressively free entries (a full buffer provides no use) or reduce its size (see docs).
  9. Data dictionary control - Once an InnoDB table is opened it is never freed from the in-memory data dictionary (which is unlimited in size).  XtraDB introduces a patch to be able to control this, which is useful in cases where users have a large number of tables. (see docs).
  10. Additional undo slots - In the built-in InnoDB, the number of open transactions is limited to 1023 (see bug report).  XtraDB allows this to be expanded to 4072 (Warning: Incompatible change!).  (see docs).

All of these 10 items will be covered in our Training workshops for InnoDB and XtraDB.  In Santa Clara / San Francisco between 14-16 September?  Come along!

My next post in this series will be on XtraDB: The Top 10 Configuration Parameters.

August 6, 2009

Why you don’t want to shard.

Posted by Morgan Tocker |

Note: This blog post is part 1 of 4 on building our training workshop.

The Percona training workshop will not cover sharding. If you follow our blog, you'll notice we don't talk much about the subject; in some cases it makes sense, but in many we've seen that it causes architectures to be prematurely complicated.

So let me state it: You don't want to shard.

Optimize everything else first, and then if performance still isn't good enough, it's time to take a very bitter medicine. The reason you need to shard basically comes down to one of these two reasons:

  1. Very large working set - The amount of memory you require to keep your frequently accessed data loaded exceeds what you can (economically) fit in a commodity machine. 5 years ago this was 4GB, today it is 128GB or even 256GB.  Defining "working set" is always an interesting concept here, since with good schema and indexing it normally doesn't need to be the same size as your entire database.
  2. Too many writes - Either the IO system, or a slave can't keep up with the amount of writes being sent to the server.  While the IO system can be improved with a RAID 10 controller w/battery backed write cache, the slave delay problem is actually very hard to solve. Maatkit has a partial-solution (via Paul Tuckfield), but it doesn't work for all workloads.

(Yes, I am simplifying some of the scalability issues with MySQL on big machines, but I have faith that Yasufumi is making this better).

What types of Sharding are there?

Despite my cautions, if you have established that you need to shard there are quite a few options available to you:

  1. Sharding Partitioning by Application Function - This is usually the best way to fix any of the problems mentioned above. What you do is pick a few very busy tables, and move them onto their own MySQL server.  Partition-by-function keeps the architecture still simple, and should work for most cases unless you have a single table which by itself can't fit into the above constraints.
  2. Sharding by hash or key - This method works by picking a column on a table and try and divide up your data based on it.  You can choose any column to hash on, you just need to make sure that it will equally distribute the data equally. In practice this method can be really hard to get working right, since even if each shard has the same amount of 'customers', demanding users tend to by far exceed average users and some servers are overloaded while others are not.

    (Tip: There are a few famous cases of both (a) bad hashing algorithms and (b) users becoming unequal all of the sudden;  You don't want to shard based on the first character of a username - as there will be a lot more 'M' than 'Z'.  For users becoming unequal all of the sudden, it's always interesting to think of what scaling challenges Flickr would have had for the official Obama photographer in the lead up to the 08 election.)

  3. Sharding via a Lookup Service - This method works by having some sort of directory service which you query first to ask "what shard number will this users data exist on?".  It's a highly scalable architecture, and once you write scripts to be able to migrate users to/from shards you can tweak and rebalanced to make sure that all your hardware is utilized efficiently.  The only problem with this method is what I stated at the start: it's complicated.

(Note: I've left out some of the more complicated sharding architectures.  For example; another solution is to have shards all store fragments of data, and to cross backup those fragments across shards.)

Why is it so complex?

The reason it's complex comes down to two reasons:

  1. The application developer has to write more code to be able to handle sharding logic (this is actually lessened with projects such as HiveDB.)
  2. Operational issues become more difficult (backing up, adding indexes, changing schema).

I think that a lot of people remember (1), but (2) can be a real pain point.  It can take a lot of work to build an application that works correctly when you are rolling through an upgrade where the schema will not be the same on all nodes.  A lot of these tasks remain only semi-automated, so from an operations perspective there can often be a lot more work to be done.

This concludes Part 1 - I hope I've justified why we are not covering sharding.  In Part 2, I will write about something that is going to be in the course - "XtraDB: The top 10 enhancements", and in Part 3 "XtraDB: The top 10 parameters".