Oracle has now launched MySQL-5.6.10-GA, so it is time to come up with some new benchmark results. The test candidates in this benchmark run are

  • MySQL-5.5.29
  • MySQL-5.6.10
  • MariaDB-5.5.28a
  • MariaDB-10.0.1

The 5.5 versions are in because I wanted to check for any regressions. In the past we have often seen performance regressions in newer versions which were caused by new features.

This time the benchmark was run on a different box. The main difference is that this box does not have SSD but a high performance RAID-5 with 512M of battery-backed cache. Besides that the machine has 16 cores out of which 12 were used for mysqld and the other 4 for sysbench.

The benchmark uses sysbench-0.5 OLTP with 8 tables and 10G worth of data. InnoDB buffer pool was 16G, InnoDB log group capacity 4G (the maximum for MySQL-5.5). The different disk system required different InnoDB configuration:

  • innodb_io_capacity = 1000 (was 20000 for SSD)
  • innodb_flush_neighbors = 1 (was 0 for SSD)

Now for the results. OLTP read only comes first:

20130213-sb-ro-tps

And here is the first surprise: MySQL-5.6 behaves significantly different. It competes well up to 8 threads, it even wins 16 threads. But at higher concurrency performance drops off rapidly, even compared to MySQL-5.5. MariaDB-10.0 shows also a slight drop in performance compared to MariaDB-5.5, but it’s much less pronounced.

The response time graph is nice and smooth though:

20130213-sb-ro-rt

Both MySQL-5.6 and MariaDB-10.0 look a little better which means they distribute cpu cycles more evenly on concurrent requests.

Disclaimer: no thread pool was used in this benchmark. The Oracle implementation of the thread pool is closed source and thus cannot be benchmarked or used by anybody. It seemed a bit unfair to use the MariaDB thread pool under those cirumstances.

If you want to see the impact of the MariaDB thread pool, have a look at the benchmarks published previously:

Next stop: OLTP read/write:

20130213-sb-rw-tps

The picture is very similar. Both MySQL-5.6 and MariaDB-10.0 show a performance drop, compared to the 5.5 versions. For MySQL the drop is more than 10% and thus rather heavy.

However it’s a well known fact that MySQL-5.5 exhibits severe write stalls under high load when InnoDB starts synchronous flushing. The response time graph is good to spot this:

20130213-sb-rw-rt

This is the good news. While the 5.5 versions both show heavy write stalls at 64 threads and more, the behavior is much less pronounced with MySQL-5.6 and MariaDB-10.0. So it seems the new adaptive flushing algorithm is working well.

There is however one problem here: if you use multiple buffer pool instances, then you see write stalls more often. For the above results I have run the read-only tests with 16 buffer pools and the read-write tests with only 1.

Conclusions:

  • MySQL-5.6 shows a rather severe performance regression, especially at higher concurrency levels. This does not match the results published by Oracle. I can only speculate why the results are so different, but I guess it’s the (closed source) thread pool and maybe the fact that Oracle benchmarks were done on much bigger hardware.
  • with a single buffer pool you don’t have to be afraid of write stalls any more. Also MySQL-5.6 allows now up to 512G redo log capacity which further reduces the odds to run into synchronous flushing (MariaDB-5.5 lifted this limit with XtraDB already)

As always the scripts used for the benchmark as well as the results are available from launchpad:

http://bazaar.launchpad.net/~ahel/maria/mariadb-benchmarks/revision/20

I invite anybody to rerun this benchmark and share the results.

MariaDB-5.5.21-beta is the first MariaDB release featuring the new thread pool. Oracle offers a commercial thread pool plugin for MySQL Enterprise, but now MariaDB brings a thread pool implementation to the community!

If you are not familiar with the term, please read the Knowledge Base article about it.

The main design goal of the thread pool is to increase the scalability of the MariaDB server with many concurrent connections. In order to test and demonstrate this, I have run the sysbench OLTP RO benchmark with up to 4096 threads to compare the new pool-of-threads and the traditional thread-per-connection scheduler:

OLTP(ro) MariaDB-5.5.21 pool-of-threads vs. thread-per-connection

Benchmark description:

  • sysbench multi table OLTP, readonly
  • 16 tables, totaling 40 mio rows (~10G of data)
  • 16G buffer pool – result is independent of disk performance
  • mysqld bound to 16 cpu cores, sysbench to the other 8

Read/write OLTP benchmark results will be published as soon as they are available.

Raw benchmark results and the scripts used can be downloaded here

PS:

Let me add few more words about binding cpu cores and thread pool configuration.

Sysbench-0.5 which was used for this benchmark, turned out to be quite a cpu hog; especially for high concurrency levels. If both mysqld and sysbench are allowed to run on all cpu cores, they will fight for resources – which will result in artificially bad results with many threads. With 4 cores for sysbench and 20 for mysqld, sysbench itself became the bottleneck. So I ended with 8 cores for sysbench (leaving some room here) and 16 for mysqld.

On UNIX, the MariaDB thread pool normally needs no configuration because it autoconfigures thread-pool-size to the number of cpu cores (this depends on sysconf(3) to implement _SC_NPROCESSORS_ONLN) which is in almost all cases the optimal size. If one binds mysqld to a subset of cpu cores, one should set thread-pool-size manually to the number of cpu cores given to mysqld.

Recently I tested our new segmented key cache feature for MyISAM in MariaDB 5.2.2-gamma for performance gains. You can check our new features in MariaDB 5.2 in our Ask Monty Knowledge Base

You will also find the details about the segmented key cache feature in our Knowledge Base at:

We wrote a test in LUA for SysBench v0.5 called select_random_points.lua, to figure out the performance gain of splitting the key cache’s global mutex into several mutex under multi user load.

You can find all the details about the benchmark in our Knowledge Base article here:

The results were quite surprising, because we found up to 250% performance gain depending on the amount of concurrent users. On a side note we also found out that our new Intel system with 24 virtual cores has more than a 10 times performance advantage compared to our older 4 core system.