When dealing with high-performance, low-latency storage devices such as SSD cards, one finds bottlenecks in new places. This is a story about such a bottleneck and how to work around it.
Oracle has now launched MySQL-5.6.10-GA, so it is time to come up with some new benchmark results. The test candidates in this benchmark run are
The 5.5 versions are in because I wanted to check for any regressions. In the past we have often seen performance regressions in newer versions which were caused by new features.
This time the benchmark was run on a different box. The main difference is that this box does not have an SSD but a high-performance RAID-5 with 512M of battery-backed cache. Besides that, the machine has 16 cores, of which 12 were used for mysqld and the other 4 for sysbench.
The benchmark uses sysbench-0.5 OLTP with 8 tables and 10G worth of data. The InnoDB buffer pool was 16G and the InnoDB log group capacity 4G (the maximum for MySQL-5.5). The different disk system required a different InnoDB configuration:
- innodb_io_capacity = 1000 (was 20000 for SSD)
- innodb_flush_neighbors = 1 (was 0 for SSD)
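For anyone rerunning the test, the InnoDB settings named above can be collected in one place and rendered as a `[mysqld]` fragment. The following is only a minimal sketch: the dictionary layout and the helper `render_mysqld_section` are illustrative names, not part of the published benchmark scripts. The redo log settings are left out on purpose, because the post does not say how the 4G log group capacity was split across log files.

```python
# InnoDB settings as stated in the post. The dict and the helper below are
# illustrative only; they are not taken from the benchmark scripts.
INNODB_SETTINGS = {
    "innodb_buffer_pool_size": "16G",
    "innodb_io_capacity": 1000,     # was 20000 on the SSD box
    "innodb_flush_neighbors": 1,    # was 0 on the SSD box
}


def render_mysqld_section(settings):
    """Render a dict of server variables as a [mysqld] config fragment."""
    lines = ["[mysqld]"]
    for name, value in settings.items():
        lines.append(f"{name} = {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_mysqld_section(INNODB_SETTINGS), end="")
```

Keeping the values in one structure makes it easy to switch between the SSD and RAID variants of the configuration by changing only the two values that differ.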
Now for the results. OLTP read only comes first:
And here is the first surprise: MySQL-5.6 behaves significantly differently. It competes well up to 8 threads, and it even wins at 16 threads. But at higher concurrency performance drops off rapidly, even compared to MySQL-5.5. MariaDB-10.0 also shows a slight drop in performance compared to MariaDB-5.5, but it’s much less pronounced.
The response time graph is nice and smooth though:
Both MySQL-5.6 and MariaDB-10.0 look a little better, which means they distribute CPU cycles more evenly across concurrent requests.
Disclaimer: no thread pool was used in this benchmark. The Oracle implementation of the thread pool is closed source and thus cannot be benchmarked or used by anybody. It seemed a bit unfair to use the MariaDB thread pool under those circumstances.
If you want to see the impact of the MariaDB thread pool, have a look at the benchmarks published previously:
Next stop: OLTP read/write:
The picture is very similar. Both MySQL-5.6 and MariaDB-10.0 show a performance drop compared to the 5.5 versions. For MySQL the drop is more than 10% and thus rather heavy.
However, it is well known that MySQL-5.5 exhibits severe write stalls under high load when InnoDB starts synchronous flushing. The response time graph is a good place to spot this:
This is the good news. While the 5.5 versions both show heavy write stalls at 64 threads and more, the behavior is much less pronounced with MySQL-5.6 and MariaDB-10.0. So it seems the new adaptive flushing algorithm is working well.
There is, however, one problem here: if you use multiple buffer pool instances, you see write stalls more often. For the above results I ran the read-only tests with 16 buffer pools and the read-write tests with only 1.
- MySQL-5.6 shows a rather severe performance regression, especially at higher concurrency levels. This does not match the results published by Oracle. I can only speculate why the results are so different, but I guess it’s the (closed source) thread pool and maybe the fact that Oracle benchmarks were done on much bigger hardware.
- With a single buffer pool you don’t have to be afraid of write stalls any more. Also, MySQL-5.6 now allows up to 512G of redo log capacity, which further reduces the odds of running into synchronous flushing (MariaDB-5.5 already lifted this limit with XtraDB).
As always, the scripts used for the benchmark as well as the results are available from Launchpad:
I invite anybody to rerun this benchmark and share the results.
Recently I tested our new segmented key cache feature for MyISAM in MariaDB 5.2.2-gamma for performance gains. You can read about the new features in MariaDB 5.2 in our Ask Monty Knowledge Base.
You will also find the details about the segmented key cache feature in our Knowledge Base at:
We wrote a test in Lua for SysBench v0.5, called select_random_points.lua, to figure out the performance gain of splitting the key cache’s global mutex into several mutexes under multi-user load.
You can find all the details about the benchmark in our Knowledge Base article here:
The results were quite surprising: we found a performance gain of up to 250%, depending on the number of concurrent users. On a side note, we also found that our new Intel system with 24 virtual cores has more than a tenfold performance advantage over our older 4-core system.