The MariaDB Project is at OSCON 2011. We’ve got a booth, and we also plan to hold a BoF session on Wednesday, 27 July 2011, at 8pm in room E142.
There will as usual be lots of black vodka (there was some yesterday at the MySQL BoF as well), and we’re going to talk about and celebrate the release of MariaDB 5.3.0 beta.
Come drop by the booth… we clearly have an interesting booth giveaway. And feel free to say hi to Kurt von Finck, Michael “Monty” Widenius, or Colin Charles who will be present, walking around, etc. Spot them in a MariaDB t-shirt of course!
There are many new features in MariaDB 5.3. I’m looking forward to many of them, but one of the ones I’m most excited about is Progress Reporting.
It’s a fact of life in the database world that some commands take longer to run than others. Commands like ALTER TABLE, LOAD DATA INFILE, and adding and dropping an index simply take time to run, depending (of course) on your data and schema. I have always hated waiting for those commands to run with no indication of how much progress has been made or how much is left to do. All of that changes with the upcoming release of MariaDB 5.3.
MariaDB 5.3 adds a “Progress” column to the output of SHOW PROCESSLIST (it can be turned off). It also adds three columns to INFORMATION_SCHEMA.PROCESSLIST (STAGE, MAX_STAGE, and PROGRESS_DONE), which show the stage the process is currently in, the total number of stages, and how much of the current stage is complete. Additionally, the mysqld server sends progress reports to clients at a configurable interval (the default is every 5 seconds; a value of 0 disables them). In the MariaDB 5.3 mysql command-line client, these progress reports look like this:
MariaDB [test]> alter table my_mail engine=maria;
Stage: 1 of 2 'copy to tmp table' 5.37% of stage done
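As a rough illustration of how a client-side tool might use such a line, here is a minimal Python sketch that parses it and estimates overall progress. The regular expression is inferred from the single sample above, the function names are hypothetical, and the overall figure assumes every stage takes the same amount of time, which the server does not promise.

```python
import re

# Pattern inferred from the one sample line shown above; the real client
# output may vary, so treat this format as an assumption.
PROGRESS_RE = re.compile(
    r"Stage:\s*(\d+)\s+of\s+(\d+)\s+'([^']*)'\s+([\d.]+)%\s+of stage done"
)


def parse_progress(line):
    """Return (stage, max_stage, stage_name, stage_percent) or None."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    stage, max_stage, name, percent = match.groups()
    return int(stage), int(max_stage), name, float(percent)


def overall_percent(stage, max_stage, stage_percent):
    """Estimate total progress, assuming all stages are equally long."""
    completed_stages = stage - 1
    return (completed_stages + stage_percent / 100.0) / max_stage * 100.0


if __name__ == "__main__":
    sample = "Stage: 1 of 2 'copy to tmp table' 5.37% of stage done"
    stage, max_stage, name, percent = parse_progress(sample)
    print(name, round(overall_percent(stage, max_stage, percent), 2))
```

For the sample line, 5.37% of stage 1 of 2 works out to a little under 3% of the whole operation under the equal-stage assumption.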
Currently, the following commands can send progress report messages to the client:
- ALTER TABLE
- ADD INDEX
- DROP INDEX
- LOAD DATA INFILE (not LOAD DATA LOCAL INFILE, as in that case we don’t know the size of the file).
Some Aria storage engine operations also support progress messages:
- CHECK TABLE
- REPAIR TABLE
- ANALYZE TABLE
- OPTIMIZE TABLE
Third-party clients and storage engines can easily add support for receiving and sending (respectively) these progress reports. Details are on the Progress Reporting page in the AskMonty Knowledgebase. One application which has been updated to support progress reports (in addition to the MariaDB mysql command-line client) is the ‘mytop’ program, which will be included in source and binary packages of MariaDB 5.3.
Progress reporting doesn’t speed up ALTER TABLE or any of the other supported commands, but at least I now have some visibility into them, and that is “a very good thing”.
During my years at MySQL AB I had the unfortunate task of manually maintaining the download page for enterprise customers. This involved a ton of boring, error-prone work and almost always led to some sort of mistake in every release. Some of our downloads were eventually replaced with an automated system written by the web team, but the memory of all that wasted time still hurts. So when I joined Monty Program and saw that our downloads were manually maintained in MediaWiki, I knew something had to change.
Most of the websites for Monty Program and the MariaDB project are written with Django, so this is where I started. I used our existing website code base and created a new Django application for downloads. There are many models / tables involved in the system, but the important ones are:
- Releases: A list of all the releases we have made, e.g. MariaDB 5.2.7, MariaDB 5.1.55, etc.
- Files: The individual files that make up a release.
- Mirrors: The information (name, url, location) of the MariaDB mirrors.
- Rules: This is the heart of the system and controls how a file name gets assigned to a release and its various other attributes such as OS and release.
When a MariaDB release is ready to publish, our release coordinator pushes the files to our primary mirror and tells the download management system to check for a new release. The system scans the mirror and captures the information (name, size, directory) of new files. The system then loops through each rule in order and checks whether it applies.
A rule is basically a regular expression plus a snippet of Python code to run. Massive regular expressions are always a pain to work with, so we try to keep the rules as simple as possible. For example, this is one of our rules:
Name: CPU – x86_64
Code: file.cpu = 'x86_64'
Some rules are obviously more complex, but this is a good example of what we aim for. It is easy to understand, and if something needs to change, it can be changed easily. The file object in the code section is a helper object that makes writing the rules easier by hiding the complexities of the underlying objects. I considered using some sort of rules engine but decided that it added unneeded complexity (the top answer on this question helped shape my opinion: http://stackoverflow.com/questions/467738/implementing-a-rules-engine-in-python).
Once all the rules have been applied, the release coordinator takes a final look and publishes the release. If there is a problem later, the whole release or individual files can be pulled.
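To make the idea concrete, here is a minimal sketch of an ordered rule loop in plain Python. It is not our actual code: the FileInfo class stands in for the file helper object, the regular expressions, the "OS - Linux" rule, and the file names are made up for illustration, and the rule code is written as ordinary functions rather than stored snippets.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FileInfo:
    """Hypothetical stand-in for the `file` helper object."""
    name: str
    cpu: Optional[str] = None
    os: Optional[str] = None


@dataclass
class Rule:
    name: str
    pattern: str                        # regex tested against the file name
    action: Callable[[FileInfo], None]  # code to run when the regex matches


def set_cpu_x86_64(file: FileInfo) -> None:
    file.cpu = 'x86_64'


def set_os_linux(file: FileInfo) -> None:
    file.os = 'Linux'


# Illustrative rules only; the patterns are assumptions, not the real ones.
RULES: List[Rule] = [
    Rule('CPU - x86_64', r'x86_64|amd64', set_cpu_x86_64),
    Rule('OS - Linux', r'linux', set_os_linux),
]


def apply_rules(file: FileInfo, rules: List[Rule]) -> List[str]:
    """Run every matching rule in order; return the names of those applied."""
    applied = []
    for rule in rules:
        if re.search(rule.pattern, file.name, re.IGNORECASE):
            rule.action(file)
            applied.append(rule.name)
    return applied


if __name__ == "__main__":
    f = FileInfo('mariadb-5.2.7-Linux-x86_64.tar.gz')  # made-up file name
    print(apply_rules(f, RULES), f.cpu, f.os)
```

Keeping each rule to one small pattern and one small action is what makes the list easy to read and reorder; a later rule can simply overwrite what an earlier one set.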
The front end is fairly straightforward and there isn’t much to discuss but here are a few highlights:
- The file listing is loaded via Ajax, so applying filters is fast.
- Your mirror is picked by looking first at your country, then at your continent. If someone tries to download from Antarctica, a random mirror is chosen.
That, in a nutshell, is how our downloads system works. If anyone has any questions about it, I’m happy to answer, either in the comments or in #maria on Freenode.
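The mirror selection described in the list above can be sketched roughly as follows. This is an illustration, not our actual code: the Mirror fields, the country and continent codes, and the example URLs are assumptions, and picking at random among several mirrors in the same country or continent is my guess at a reasonable tie-break.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Mirror:
    """Hypothetical mirror record; field names are assumptions."""
    name: str
    url: str
    country: str    # e.g. an ISO country code
    continent: str  # e.g. a two-letter continent code


def pick_mirror(mirrors: List[Mirror], country: str, continent: str,
                rng=random) -> Mirror:
    """Prefer a mirror in the visitor's country, then continent, else any."""
    same_country = [m for m in mirrors if m.country == country]
    if same_country:
        return rng.choice(same_country)
    same_continent = [m for m in mirrors if m.continent == continent]
    if same_continent:
        return rng.choice(same_continent)
    # No geographic match (the Antarctica case): fall back to a random mirror.
    return rng.choice(mirrors)


if __name__ == "__main__":
    mirrors = [
        Mirror('Example DE', 'http://de.mirror.example.org/', 'DE', 'EU'),
        Mirror('Example US', 'http://us.mirror.example.org/', 'US', 'NA'),
    ]
    print(pick_mirror(mirrors, 'FR', 'EU').name)  # continent match
    print(pick_mirror(mirrors, 'AQ', 'AN').name)  # random fallback
```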