Back when the first version of the MariaDB Java Client was released, someone asked in the comments about the performance characteristics of the driver compared to ConnectorJ. I answered with hand-waving, saying that nobody does anything stupid, the performance of the drivers would be roughly the same, but I promised to measure it and tell the world one day. And now that day has come. The day where three MySQL JDBC drivers (ConnectorJ, MariaDB JDBC, and Drizzle JDBC) are compared against each other. Unlike the server, which gets benchmarking attention all the time, there is no standard benchmark for connectors, so I needed to improvise, while trying to keep the overhead of the server minimal. So I did something very primitive to start. I used my two favorite queries:
DO 1— this one does not retrieve a result set, and thus can be seen as a small “update”.
SELECT 1— the minimal SELECT query.
The test program runs a query N times, and if the query was a select, it retrieves all values from the result set, using
ResultSet.getObject(i), and calculates the queries-per-second value. (The best thing is that the test program is single-threaded, and how often does one get to run single-threaded tests? the test was run on my own workstation, which runs Windows Server 2008 R2, and I have
useConfigs=maxPerformance in the URL for ConnectorJ.
Results (Queries per second, unprepared)
MariaDB JDBC appears to be a little faster (~10%) than ConnectorJ, and much faster (~30%) than Drizzle JDBC.
Can ConnectorJ do better? I bet it can. Looking into profiler output – CPU profiling, instrumentation mode in NetBeans – for a test that executes “SELECT 1″ in a loop, shows
com.mysql.jdbc.StatementImpl.findStartOfStatement() taking 7.5% of runtime. Ok, instrumentation results should be taken with a grain of salt, however the single reason string search is used, is because - if an update (DML) statement is executed inside
ResultSet.executeQuery(), it is rejected with an exception. This can be done differenty, I believe. If absolutely necessary, throwing an exception can be delayed, until the client finds out that the server sent an OK packet instead of a result set.
Even more interesting is the case with Drizzle JDBC. In theory, since the MariaDB driver has a Drizzle JDBC heritage, the performance characteristics should be similar, but they are not, so there must be a bug somewhere. It appears very easy to find, as according to profiler, 50.2% CPU time (take that number with a big grain of salt) is spent in a function that constructs a hexdump from a byte buffer. Looking at the source code, we find following line that is unconditionally executed:
log.finest("Sending : " + MySQLProtocol.hexdump(byteHeader, 0));
While the result of the hexdump is never used (unless logging level is FINEST), the dump string is still created, using relatively expensive
Formatter routines, concatenated with the String “Sending:”, and then thrown away… In Markus’ defense,
hexdump() is not his fault, it was contributed 3 years ago. But it remained undetected for 3 years. This bug is now filed https://github.com/krummas/DrizzleJDBC/issues/21 [UPDATE: this bug was resolved within hours after reporting]
So, let’s check how much we can gain by putting the offending code into an
if (log.getLevel() == java.util.logging.Level.FINEST) condition.
The QPS from “DO 1″ raises from 15288 to 19968 (30%), and for “SELECT 1″ we have increase from 13410 to respectable 16824 (25%). Not bad for a single line fix.
While the one-liner makes the Drizzle JDBC faster, with slightly better numbers than ConnectorJ, it is still not as fast as MariaDB.
In the MariaDB JDBC connector, there were a couple of improvements to performance which were made since forking. One of the early improvements was to avoid copying data unnecessarily when sending, and to decrease the number of byte buffers. Another improvement came recently, after profiling and finding that parsing Field packets is expensive (mostly due to the construction of Strings for column name, aliases, and etc…). The improvement was lazy parsing, delaying string construction, and avoiding it entirely in most cases. For example, if column names are not used, and rows are accessed using integer indexes in
ResultSet.getXXX(int i), the metadata won’t be fully parsed. Also, perhaps there were some other fixes that I do not remember anymore.
Can we further increase the QPS?
We can try. First, statements can be prepared. MariaDB and Drizzle so far only provide client-side prepared statements (ConnectorJ can do both client and server-side prepared statements) but using them saves having to convert the query to bytes, and JDBC escapes preprocessing. From now on I’ll stay just with “DO 1″ which proved to be the fastest query. Trying it on MariaDB driver shows some minimal QPS increase 22104 (not prepared) vs 22183 (prepared), or 0.3%. Slightly more on ConnectorJ (19543 vs 20096, or 2.9%). Nothing revolutionary so far.
But, We still have not used all of the options in this (admittedly silly) quest for maximizing the performance of “DO 1″. Recall that ConnectorJ can support named pipes on Windows, which are allegedly much faster than TCP connections. Restart server with named pipe, set JDBC URL to “jdbc:mysql:///?socketFactory=com.mysql.jdbc.NamedPipeSocketFactory&namedPipePath=\\\\.\\Pipe\\MySQL&user=root&useConfigs=maxPerformance”, and rerun the test with 1000000 prepared queries. Now the QPS grew to 29542! That is strong, and is a 33% improvement compared to the best result seen so far. Yet, unfortunately, still no cigar, since JVM dumps a stack trace when the named pipe connection is closed. This is a “Won’t fix” (chalked off as a JVM problem) MySQL bug Bug#62518, which renders named pipe support almost useless – though maybe there is a trick to shut up th JVM somehow in this case, but I do not know of such a trick.
How fast is C client library in comparison?
Out of curiosity, I also tested how the native client compares to JDBC. With the TCP protocol, it does slightly better than the fastest JDBC (MariaDB, prepared), but it is not a huge margin – 24063 QPS vs 22183 (8.5% difference), and I believe Java drivers could improve further.
With named pipe, QPS is 33122, which is ~12% better than what ConnectorJ could do, if pipes worked properly there.
Accessing benchmark program
I put the benchmark program on Launchpad, together with the drivers. If you’re on Windows, and if you have a server running on port 3306, and the ‘root’ user doesn’t have a password, you can just branch the repository and run bench_all.bat. Those of you who are using other operating systems, I trust you to be able to quickly rewrite the batch files as shell scripts.