Several days ago MySQL AB made new storage engine Falcon available for wide auditory. We cannot miss this event and executed several benchmarks to see how Falcon performs in comparison to InnoDB and MyISAM.
The second goal of benchmark was a popular myth that MyISAM is faster than InnoDB in reads, as InnoDB is transactional, supports Foreign Key and has an operational overhead. As you will see it is not always true.
For benchmarks I used our PHPTestSuite which allows to test wide range tables and queries.
The script and instruction are available here:
http://www.mysqlperformanceblog.com/files/benchmarks/phptestsuite.stable.tar.gz
We used table “normal” table structure which corresponds to typical structure you would see in OLTP or Web applications – medium size rows, auto increment primary key and couple of extra indexes.
CREATE TABLE IF NOT EXISTS `$tableName` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varchar(64) NOT NULL default '',
`email` varchar(64) NOT NULL default '',
`password` varchar(64) NOT NULL default '',
`dob` date default NULL,
`address` varchar(128) NOT NULL default '',
`city` varchar(64) NOT NULL default '',
`state_id` tinyint(3) unsigned NOT NULL default '0',
`zip` varchar(8) NOT NULL default '',
`country_id` smallint(5) unsigned NOT NULL default '0',
PRIMARY KEY (`id`),
UNIQUE KEY `email` (`email`),
KEY `country_id` (`country_id`,`state_id`,`city`)
)
In this benchmark we used only read (SELECT) queries with different typical data access patterns:
primary key single row lookup, primary key range lookup, same access types for primary key and full table scans.
To highlight different properties of storage engines we tested ranges with and without LIMIT clause, and tested queries which
need to read the data or can only be satisfied by reading the index.
This benchmark is so called “micro” benchmark which concentrates on particular simple storage engine functions and we use it to see performance and scalability in this simple cases. We also use CPU bound workload in this case (no disk IO) to see how efficient storage engines are in terms of CPU usage. In real life workload results are likely to be very different.
The schema and queries are described here
Used hardware
CentOS release 4.4 (Final)
2 Ñ… Dual Core Intel XEON 5130model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
stepping : 6
cpu MHz : 1995.004
cache size : 4096 KB16GB of RAM
MySQL version
We used MySQL 5.1.14-beta sources for MyISAM / InnoDB
and MySQL 5.1.14-falcon bitkeeper tree
bk://mysql.bkbits.net/mysql-5.1-falcon for Falcon
(Please note this is a first release of Falcon and it is still in alpha stage and performance parameters may vary a lot in next releases)
Compilation parameters:
For MyISAM / InnoDB ./configure --prefix=/usr/local/mysqltest/mysql---with-innodb For Falcon ./configure --prefix=/usr/local/mysqltest/mysql- --with-falcon
mysqld startup params:
Falcon: libexec/mysqld --no-defaults --user=root --falcon_min_record_memory=1G --falcon_max_record_memory=2GB --falcon_page_cache_size=1500M --max-connections=1500 --table-cache=512 --net_read_timeout=30 --net_write_timeout=30 --backlog=128 MyISAM / InnoDB: libexec/mysqld --no-defaults --user=root --key-buffer-size=1500M --innodb-buffer-pool-size=1500M --innodb-log-file-size=100M --innodb-thread-concurrency=8 --max-connections=1500 --table-cache=512 --net_read_timeout=30 --net_write_timeout=30 --back_log=128
Method of benchmark:
1. Prepare table with 1,000,000 records (about 350Mb of data on disk)
2. Run each query for 1, 4, 16, 64, 128, 256 concurrent threads.
3. For each thread perform a warm-up run (duration 180 sec), and then
run three effective runs (duration of each is 60 sec).
As the final result we get a maximal result of three runs.
The raw numbers are available here:
http://www.mysqlperformanceblog.com/files/benchmarks/innodb-myisam-falcon.html
(Note: This benchmark is synthetic micro benchmarks focusing on particular simple data access patterns. Results for your workload are likely to be different.)
There are interesting results I want to show graphics with comments
READ_PK_POINT

Query: SELECT name FROM $tableName WHERE id = %d
The very common query with access by primary key.
InnoDB is faster than MyISAM by 6-9%.
Falcon shows very bad scalabilty.
READ_KEY_POINT

Query: SELECT name FROM $tableName WHERE country_id = %d
In this case Falcon is the best, because Falcon uses a tricky technic to retrieve rows (more
details with Jim Starkey’s comments in Part 2).
There MyISAM shows bad scalability with increasing count of thread. I think the reason is pread system
call MyISAM uses to access data and retrieving from OS cache is not scaled.
READ_KEY_POINT_LIMIT

Query: SELECT name FROM $tableName WHERE country_id = %d LIMIT 5
The same query as previous but with LIMIT clause.
Due to Falcon’s way of key access Falcon cannot handle LIMIT properly and that is why
we see bad performance. We hope the performance of LIMIT queries will be fixed before release.
MyISAM shows stable result.
InnoDB is better than MyISAM by 58% in case with 4 threads, but does not scale good enough.
Perhaps there is still a problem with InnoDB mutexes.
READ_KEY_POINT_NO_DATA

Query: SELECT state_id FROM $tableName WHERE country_id = %d
This query is similar to previous READ_KEY_POINT with only different the values of accessed column is stored in key. MyISAM and InnoDB handle this case and retrive the value only from key.
InnoDB is better by 25-30%.
Falcon needs an access to data beside key access, and most likely this will not be fixed, as this is
specific Falcon’s way to handle multi-versioning. I think this is a big weakness of Falcon, as ‘using index’ is very common optimization we use in our practice.
READ_KEY_POINT_NO_DATA_LIMIT

Query: SELECT state_id FROM $tableName WHERE country_id = %d LIMIT 5
The previous query but with LIMIT.
Again the LIMIT is bad for Falcon.
InnoDB is better than MyISAM by 87% in case with 4 threads but drops down very fast.
READ_PK_POINT_INDEX

Query: SELECT id FROM $tableName WHERE id = %d
Simple but very quick query to retrieve value from PK.
The results for InnoDB and MyISAM are comparable and I think this shows both engines are maximally optimized and the result is maximal that can be reached for this query.
Falcon scales pretty bad and there is a big room for optimization.
READ_PK_RANGE

Query: SELECT min(dob) FROM $tableName WHERE id between %d and %d
Access by range of PK values.
MyISAM scales very bad, and reason is the same as for READ_KEY_POINT queries.
InnoDB is better than MyISAM by 2-26 times
and than Falcon by 1.64 – 3.85 times.
READ_PK_RANGE_INDEX

Query: SELECT count(id) FROM $tableName WHERE id between %d and %d
MyISAM scales good here, because of access only to key column and ‘pread’ syscall is not used.
READ_KEY_RANGE

Query: SELECT name FROM $tableName WHERE country_id = %d and state_id between %d and %d
As in case with READ_KEY_RANGE Falcon is the best here.
Falcon’s resuts better than InnoDB by 10-30%
MyISAM drops down with 128-256 threads
READ_KEY_RANGE_LIMIT

Query: SELECT name FROM $tableName WHERE country_id = %d and state_id between %d and %d LIMIT 50
Again Falcon does not hanle LIMIT and the results are much worse.
READ_KEY_RANGE_NO_DATA

Query: SELECT city FROM $tableName WHERE country_id = %d and state_id between %d and %d
READ_KEY_RANGE_NO_DATA_LIMIT

Query: SELECT city FROM $tableName WHERE country_id = %d and state_id between %d and %d LIMIT 50
READ_FTS

Query: SELECT min(dob) FROM $tableName
The hardest query performs a scan of all million rows.
InnoDB is better than MyISAM by ~30% with 4-16 threads, but MyISAM scales a bit better in this case.
InnoDB is better than Falcon by 2-3 times.


Need to know whether SolidDB is better/worse than InnoDB. I am evaluating SolidDB ?
Vadim,
In general: each tool has an area to be applied for, and all ‘universal’ is called ‘horilla’ in MySQL forums: Oracle, DB2. Want to compare test results for a simple query SELECT COUNT(*) FROM …?
Theoretically, InnoDB is faster than MyISAM (or MyISAM-based transactional SolidDB) for multi-user concurrent access because of physical data files format (in InnoDB, data is written into fixed-size blocks; in MyISAM, data is usually written at the end of file).
My database currently consists from mixed InnoDB and MyISAM datafiles. I need MyISAM for dictionary-like tables (like as list of countries, cities, postal codes), and I need InnoDB for frequently updated data.
One weakness: JDBC driver + Prepared Statements.
Thanks
Try this test SELECT COUNT(*) FROM $tableName;
Any chance for “Part 2″ based on more recent releases?
RDG
Yes, We are going to test next release of Falcon
Great information.
I was having trouble deciding between MyIsam and InnoDB. I wanted to create a log for everything that happened on my website, so I created a log table. But…what if a user inserts/updates something and then the PHP script fails before updating the log table?
I realized that I should use transactions, but unfortunately I had to give up the “speed” of MyISAM for InnoDB. Your benchmarks make me feel much better about my decision. =)
Hi,
I measured the performance of InnoDB vs MyISAM for insertion queries for up to 100 threads (see presentation “Measuring MySQL Server Performance for the Sensor Data Stream Processing” at MySQL User Conference 2006. Unless you have very short records (less than 200 bytes) MyISAM outpreforms InnoDB for insertion speed. For the recird sizes between 1KB and 10MB MyISAM is about twice faster.
Inserts by themselves aren’t all that interesting. What is interesting is what happens to 20 reading threads when one inserting thread shows up.
Or how do reading threads behave when updating threads show up.
Or when you’ve got say 20 inserting threads, and then you start one long reading thread that hits a lot of data.
Each of those use cases simulates some kind of real world behaviour.
100 inserting threads, with no readers simulates a real world behaviour too, so I’m not slamming it. It’s a pretty common behavior in things like petroleum exploration.
But, the situation where you have mixed selects, updates, and inserts if far more common, and usually far more interesting.
I don’t think that statement “inserts by themselves aren’t all that interesting” is correct. What about logging? What if you have sensors writing into a database? Is the database able to keep up? Those are VERY interesting and important cases.
For me pure reading is not not interesting at all – you have to insert before you read.
Inserts are much slower than reads, so if you two times slower with inserts your reading speed might be not important – you have to wait for inserts to get your data.
QUOTE:
What about logging? What if you have sensors writing into a database?
UNQUOTE:
Um, you did read my whole post, right? I made the point that write only is still used, and an example of petroleum exploration, which is exactly that, sensors writing into the database.
Either table handler can likely keep up just fine with the insert load. But what happens if insert performance falls 50% on one table handler when a large query runs against your logs, while another table handler drops by only 2%? That’s far more interesting. Or what happens when you have a few small selects against that heavily written table running all the time?
You ARE going to query the log table at some point, and if you’re like me, you’re likely going to need to query it at the worst possible times, in the middle of the day when the system is acting up. You need to read the logs, and if your table handler has a tendency to explode in sheets of flame when you do that, you really need to know THAT, so you can change table handlers (or databases etc…) before you find out in production. After all, what good would logs be that you can’t read?
The point I’m making is that the INTERACTION of reads and writes is what makes for an interesting thing to study. The individual bits, just reads, or just writes are a nice baseline. It’s like chemistry class. Water, kinda intersting. Sodium, kinda interesting. Water AND sodium, much much more interesting.
“I realized that I should use transactions, but unfortunately I had to give up the “speed†of MyISAM for InnoDB. Your benchmarks make me feel much better about my decision. =)”
Allthough using transactions is desirable in many cases, you should also consider calling stored procedures if you are worried about some kind of client-failure, like a php-script crashing etc.
If all the PHP-script did was to call a Stored Procedure with the required parameters, the crash would not hurt the database-operations.
A real-world example:
A typical user-signup process may call several checks, updates and inserts.
In a php-script, this could look like (pseudo code)
- Check validity
- Insert user-data (which can be one or more rows in different tables)
- Possibly fetch the new “auto-increment” id for the user, and use this to further insert rows in other tables
- Other logic
- Check and verify that everything went ok
- Send feedback to the client
Here you can easily make a stored procedure that handles everything, from checking, inserting, updating etc.
The PHP-script would then simply look like
- Check validity
- Call the stored procedure with the required parameters
- Send feedback to the client
So, you could stick to MyIsam here, and still make sure crashing scripts do not cause problems.
Just my 2 cents
“The point I’m making is that the INTERACTION of reads and writes is what makes for an interesting thing to study. The individual bits, just reads, or just writes are a nice baseline. It’s like chemistry class. Water, kinda intersting. Sodium, kinda interesting. Water AND sodium, much much more interesting.”
Indeed. More often than not, real-word usage consists of reading and writing to the database simultanously, and how one affects the other is of utmost importance. If Storeage-Engine A is faster when inserting data, this means nothing if the speed drops a lot when 30 threads are inserting at the same time.
This is what makes Benchmarking difficult, and forces us to interpret results carefully.
The point I’m making is that the INTERACTION of reads and writes is what makes for an interesting thing to study. The individual bits, just reads, or just writes are a nice baseline. It’s like chemistry class. Water, kinda intersting. Sodium, kinda interesting. Water AND sodium, much much more interesting.â€
Indeed. More often than not, real-word usage consists of reading and writing to the database simultanously
Interesting but for selects.
Can anyone suggest if which engine out of InnoDB and MyISAM should be used for a forum based site?
I can understood how the best is InnoDB from ur practical example.
This is what makes Benchmarking difficult, and forces us to interpret results carefully.
Very interesting benchmark. Anyway, when will the InnoDB vs MyISAM vs Falcon benchmarks – part 2 availabe ? I’m very curious to see the progress after 1 year from part 1.
I found nice application stress test at spamassassin
http://wiki.apache.org/spamassassin/BayesBenchmarkResults
In this test MyISAM won, anyway quite interesting numbers.
The InnoDB vs. MyISAM performance numbers were just as interesting as the performance of Falcon.
Thanks for sharing.
Results are interesting, especialy MyISAM vs InnoDB.
Try increasing it to 100 threads and then 300 threads. I think you’ll see a difference.
Vadim,
I only have one question. Which type of MYISAM rows did you use ? Fixed or Dynamic ?
I am pretty sure that the performance will be completely different when you use one or other !
Thanks,
Renato
Thank you for information,
I useing MyISAM!
Once we query the log table we read over the logs. Our table handler did very well with it. For each thread we performed the warm-up run (duration 180 sec), and then ran three effective runs (duration of each is 60 sec). Only problem our final result was much different.
Once we query the log table we read over the logs. Our table handler did very well with it. For each thread we performed the warm-up run (duration 180 sec), and then ran three effective runs (duration of each is 60 sec). Only problem our final result was much different.
InnoDB tables are mostly faster than MyISAM. The only thing MyISAM supports is full-text search which is crappy and slow on large datasets anyway
Very intresting, i think i should migrate to InnoDB with my site’s.
How about a speed test with a MyISAM-packed table vs InnoDB? InnoDB’s great, but if you’re uploading massive tables back and forth to your server, MyISAM will be less of a hassle due to much smaller files.
This test is not neutral. MyISAM is disadvantaged from the beginning, since MyISAM is optimized to be used in combination with CHAR fields in contrast to VARCHAR fields. This given, MyISAM is optimized for read access when fixed row size is given.
To make a statement comparing the InnoDB and MyISAM engine the table has to be set up differently for MyISAM. Then it would be very interesting to see the results especially concerning reads for MyISAM compared to InnoDB and Falcon. Likely MyISAM will beat InnoDB is at least some of the stated queries above.
In my case, I need to insert/update 40,000+ rows quite often
InnoDB will take me around 1 hr
compare with MyISAM only will take no more than 70 “SECONDS”
Eric, you should open a transaction (BEGIN TRAN) before inserting a large number of rows, them COMMIT.
Falcon attack
Can I second 54. Shaun?
“I realized that I should use transactions, but unfortunately I had to give up the “speed†of MyISAM for InnoDB. Your benchmarks make me feel much better about my decision. =)”
This is exactly the same thing I was led to believe… I keep thinking… mmm I NEED Transactions but I don’t want to lose the performance! Damn where did I get this idea from? I’ve only started looking into it deeply since MySQL stated they were moving to InnoDB as their default engine and that there are so many myths surrounding MyISAM being a ‘faster reader’ etc.
Thanks for the great article… now I have to get ahold of your book as I feel there is so much I do not know and I should
Thank you for information,
I useing MyISAM!
thank you for researching.
amazing results
Great work but you just can’t compare myisam with innodb without a group by statement and 10 updates/seccond:)
I wish I could do the test by my core duo is just too slow.
hey Vadim,
You have posted some extremely useful piece of work.
I appreciate your efforts.
many thanks.
My ISAM InnoDB
Required full text Search Yes
Require Transactions Yes
frequent select queries Yes
frequent insert,update,delete Yes
Row Locking (multi processing on single table) Yes
Relational base design Yes
Hi, I have briefly discuss this matter by table so you can conclude which has to be chosen either innodb or MyISAM.
http://developer99.blogspot.com/2011/07/mysql-innodb-vs-myisam.html