Sun is aggressively pushing T2000 as Scalable MySQL Platforms, and it is indeed scalable for high-concurrency workloads: it can execute many concurrent threads, so the speed gain from 1 thread to, say, 32 threads will be significant.

But the thing a lot of people miss is that being scalable is not enough. You need to scale from a reasonable base to claim good performance, and this is where the T2000 performs subpar in many cases.
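To make the point concrete: aggregate throughput is roughly single-thread throughput multiplied by the speedup gained from concurrency. The sketch below uses made-up numbers purely for illustration (they are not measurements of any real machine, and the function name is hypothetical). Platform A scales 20x from a base of 10 queries per second, while platform B scales only 4x from a base of 100.

```python
def aggregate_qps(single_thread_qps: float, speedup: float) -> float:
    """Rough aggregate throughput: single-thread rate times concurrency speedup."""
    return single_thread_qps * speedup

# Hypothetical platforms, chosen only to illustrate the argument.
platform_a = aggregate_qps(single_thread_qps=10, speedup=20)   # scales well, slow base
platform_b = aggregate_qps(single_thread_qps=100, speedup=4)   # scales poorly, fast base

print(f"Platform A: {platform_a:.0f} queries/sec")
print(f"Platform B: {platform_b:.0f} queries/sec")
```

In this made-up case platform A scales five times better yet delivers half the throughput, and every individual query on it still runs at the slow single-thread speed.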

I often hear people complain that queries take much longer on the T2000 than on recent Intel or AMD CPUs when there is no concurrent load. Reports put the T2000 at as much as 5-15 times slower in this case, depending on the workload.

Here is an example run of the purely CPU-bound “Benchmark” function on a 2.6GHz Intel Xeon vs. a T2000:

As you can see, this is a hell of a lot of difference!

Depending on your application, single-thread performance may or may not matter to you. It certainly matters for the slave if you have active replication, if you run time-sensitive, long-running CPU-bound queries, or if queries account for a significant share of the time needed to generate a web page.

For example, if the queries needed to generate a page take 50ms on a Xeon, the MySQL latency you see on a T2000 may be as high as 500ms, which would be well above the performance guidelines for many web applications.
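The arithmetic behind that example is simple enough to sketch. The snippet below is an illustration, not a measurement: the function name is hypothetical, and the factors are just the ends and middle of the 5-15x range reported above.

```python
def estimated_latency_ms(baseline_ms: float, slowdown: float) -> float:
    """Scale a single-thread query time by a slowdown factor."""
    if baseline_ms < 0 or slowdown <= 0:
        raise ValueError("baseline must be >= 0 and slowdown must be > 0")
    return baseline_ms * slowdown

xeon_ms = 50.0  # query time per page on the Xeon, from the example above

for factor in (5, 10, 15):
    t2000_ms = estimated_latency_ms(xeon_ms, factor)
    print(f"{factor}x slower: {t2000_ms:.0f} ms of MySQL time per page")
```

The 10x case gives the 500ms figure; at the low end of the reported range the same page would still spend 250ms in MySQL.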

I hear Sun is working on new CPUs that would offer significantly higher single-thread performance, but for now I have to be very careful about recommending this platform to customers.

12 Comments
Matt Ingenthron

#1, “Sun” is not, to my knowledge, “aggressively pushing T2000 as Scalable MySQL Platforms”. Could you provide a reference?

That’s not to say there couldn’t be individuals talking about T2000 for MySQL, and there are scenarios where any of the CMT systems are good for this, but there’s not any kind of aggressive pushing. I would know. 🙂

#2, the T2000 is the first generation CoolThread CMT CPU, and the second generation has been out for over a year. The respin of the second generation is out now. I recognize you may not have had one of these available for testing, but there are certainly differences.

#3, Sun has always been clear that the CMT systems are not the right thing for all jobs. Just have a look at the profiling tools and discussions to help find the right apps at http://cooltools.sunsource.net

If you have one user doing selects as fast as possible, you have a weird reason for using a database at all. If you have a million users connecting, selecting, disconnecting, or you have a large number of databases or applications, that’s something else entirely.

There are certainly clear cases where the CMT systems are perfectly reasonable, and even the best performer, for a workload. Just have a look here:
http://www.sun.com/servers/coolthreads/benchmarks/index.jsp

What happens when you have a lot of concurrency with a Xeon? The OS has to context switch. This means spending a bunch of those 2.6GHz cycles moving data out of registers and onto the stack. As you add more concurrent workload, the response time per request will start degrading, and the number of responses within an acceptable timeframe (i.e. under 1sec, the way many benchmarks test) will drop.

What happens when you have a lot of concurrency with a CMT system? It just switches between strands of execution. As you add more concurrent workload, the response time stays pretty constant, and the number of requests just goes up.

It’s called efficiency. It’s why you hire a moving van and workers to help you load your furniture, rather than rent a Corvette for the day. 🙂

Denis Sheahan

Hi

You are showing a 9x difference in performance here, which is way higher than expected. Even with the frequency differences I would only expect a 3-4x difference.

Is it possible to get the data from your benchmark table so we can run the workload in-house and determine why it is so slow?

Also, on the T2000, where does the database reside? Local disk?

What OS is running on both boxes?

Luke Monahan

I have been testing MySQL 5.0.45 as distributed by Sun on a T5120 over the last few days. The T5120 is essentially a next-gen T2000. Twice as many threads per core means the OS sees 64 logical processors. Another big change is the addition of an FPU per core rather than a single FPU for the whole chip as in the T2000. For comparison, I’ve been testing against a Sun X4100 with 2 dual-core AMD CPUs. Both machines have 16GB of RAM, but I’ve been testing with and without a large cache enabled to see the difference. My tests all use InnoDB and Sysbench (latest versions). I’ve mainly been tuning through the MySQL config, and haven’t delved into filesystem and OS configuration or source code changes (eek!) yet.

Essentially I am getting very similar results from each machine. The main difference is the resource utilization on the T5120 is much lower: 20% CPU versus 80-85% on the X4100. I have a while to go to see if I can do any better on both machines, but I am sure the Sun Niagara chips — especially in their latest incarnation — are very capable.

Luke Monahan

Peter:

The Niagara ran best at 32-thread concurrency, which I believe shows a limit in MySQL’s ability to scale beyond that many threads. Disk IO at this stage was fine (as expected: separate disk IO tests showed the Niagara to excel here), so I am continuing to search for other bottlenecks. With low numbers of threads the Niagara was (predictably) slow. I only have results here from 4 threads upwards, but I can do some more tests for you on Monday if you like. I’ve still got a few days before we send the box back, so any suggestions on the most worthwhile benchmarks would help.

The Opteron ran its best at 8-12 threads.

Using Sysbench on 1M rows:

sysbench --test=oltp --num-threads= --max-time=60 --max-requests=0 --oltp-read-only=on run

T5120:
X=4: 400 TPS
X=8: 716 TPS
X=16: 1178 TPS
X=32: 1935 TPS
X=48: 1869 TPS
X=64: 1674 TPS
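
One way to read these numbers is as speedup and per-thread throughput relative to the 4-thread run. A minimal sketch, assuming X is the sysbench thread count and using only the figures listed above (the variable and function names are illustrative):

```python
# TPS figures copied from the T5120 results above, keyed by X
# (assumed here to be the sysbench thread count).
t5120_tps = {4: 400, 8: 716, 16: 1178, 32: 1935, 48: 1869, 64: 1674}

def scaling_report(tps_by_threads):
    """Return (threads, tps, tps_per_thread, speedup, efficiency) rows.

    Speedup and efficiency are relative to the lowest thread count measured,
    since no single-thread result is available.
    """
    base_threads = min(tps_by_threads)
    base_tps = tps_by_threads[base_threads]
    rows = []
    for threads in sorted(tps_by_threads):
        tps = tps_by_threads[threads]
        speedup = tps / base_tps
        ideal_speedup = threads / base_threads
        rows.append((threads, tps, tps / threads, speedup, speedup / ideal_speedup))
    return rows

report = scaling_report(t5120_tps)
for threads, tps, per_thread, speedup, efficiency in report:
    print(f"X={threads:2d}: {tps:4d} TPS, {per_thread:5.1f} TPS/thread, "
          f"{speedup:.2f}x vs X=4, {efficiency:.0%} of linear")
```

Throughput peaks at X=32 at roughly 4.8x the 4-thread figure, while per-thread throughput falls from 100 TPS to about 60 TPS.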

I do have some more R/W benchmarks and benchmarks with different configs, but I’m not at work to dig them out.

As far as getting your T2000 to work a bit better: http://hell.jedicoder.net/?p=88 contains some tuning resources at the bottom. I find the latest Coolstack release of MySQL has many of these applied already, so make sure you use that to test.

Mikael Ronstrom

Peter,
Your blog is usually very interesting, but this type of benchmark is about as informative as benchmarking SELECT COUNT(*) FROM t on InnoDB vs. MyISAM, where MyISAM will beat InnoDB by a large factor.

A blog gives you the ability to report findings quickly, but this post certainly lacked the proper technical research that I would normally expect from you.

Rgrds Mikael

Luke Monahan

Hi Peter,

Sorry I haven’t got back to this, but I’ll hopefully post some more benchmarks tomorrow. However, the single-threaded benchmarks aren’t going to do anything other than confirm what is already known: The Niagara isn’t aimed at single threaded workloads, and has never been advertised as such AFAIK. We are finding it to be well positioned for most web-based workloads (short queries, lots of them), but to do any data mining or long reporting processes we replicate to a more suitable server.