There are many new features in MariaDB 5.3. I’m looking forward to many of them, but one of the ones I’m most excited about is Progress Reporting.
It’s a fact of life in the database world that some commands take longer to run than others. Commands like ALTER TABLE, LOAD DATA INFILE, and adding and dropping an index simply take time to run, depending (of course) on your data and schema. I always have hated having to wait for those commands to run with no indication of how much progress has been made or how much is left to do. All of that changes with the upcoming release of MariaDB 5.3.
In MariaDB 5.3 there is a new “Progress” column in the output of SHOW PROCESSLIST (this can be turned off) which shows the progress, and there are three new columns (STAGE, MAX_STAGE, and PROGRESS_DONE) in INFORMATION_SCHEMA.PROCESSLIST which allow you to see which process stage you are in, the number of stages, and how much of the current stage is completed. Additionally, the mysqld server sends progress reports to clients at a specified interval (the default is every 5 seconds, 0 will disable it). In the MariaDB 5.3 mysql command-line client, these progress reports look like this:
MariaDB [test]> alter table my_mail engine=maria;
Stage: 1 of 2 'copy to tmp table' 5.37% of stage done
Currently, the following commands can send progress report messages to the client:
- ALTER TABLE
- ADD INDEX
- DROP INDEX
- LOAD DATA INFILE (not LOAD DATA LOCAL INFILE, as in that case we don’t know the size of the file).
Some Aria storage engine operations also support progress messages:
- CHECK TABLE
- REPAIR TABLE
- ANALYZE TABLE
- OPTIMIZE TABLE
Third-party clients and storage engines can easily add support for receiving and sending (respectively) these progress reports. Details are on the Progress Reporting page in the AskMonty Knowledgebase. One application which has been updated to support progress reports (in addition to the MariaDB mysql command-line client) is the ‘mytop‘ program, which will be included in source and binary packages of MariaDB 5.3.
Progress reporting doesn’t speed up ALTER TABLE or any of the other supported commands, but at least I now have some visibility into them, and that is “a very good thing”.
During my years at MySQL AB I had the unfortunate task of manually maintaining the download page for enterprise customers. This involved a ton of boring, error prone work and almost always led to some sort of error every release. Some of our downloads were eventually replaced with an automated system written by the web team but the memory of all that time wasted still hurts me. So when I joined Monty Program and saw our downloads were manually maintained in mediawiki I knew something had to change.
Most of the websites for Monty Program and the MariaDB project are written with Django so this is where I started. I used our existing website code base and just created a new django application for downloads. There are many models / tables involved in the system but the important ones are:
- Releases: A list of all the releases we have made, i.e. MariaDB 5.2.7, MariaDB 5.1.55, etc
- Files: The individual files that make up a release.
- Mirrors: The information (name, url, location) of the MariaDB mirrors.
- Rules: This is the heart of the system and controls how a file name gets assigned to a release and its various other attributes such as OS and release.
When a MariaDB release is ready to publish our release coordinator pushes the files to our primary mirror and tells the download management system to check for a new release. The system scans the mirror and captures the information (name, size, directory) of new files. The system then loops through each rule in order and checks if it applies.
A rule is basically a regular expression and then a snippet of python code to run. Massive regular expressions are always a pain to work with so we try to keep the rules as simple as possible. For example, this is one of our rules.
Name: CPU – x86_64
Code: file.cpu = ‘x86_64’
Some rules obviously are more complex, but this is a good example of what we aim for. It is easy to understand and if something needs to be changed it can be done easily. The file object in the code section is a helper object to make writing the rules easier by hiding the actual complexities of the underlying objects. I considered using some sort of rules engine but decided that added unneeded complexity (the top answer on this question helped shape my opinion: http://stackoverflow.com/questions/467738/implementing-a-rules-engine-in-python)
Once all the rules have been applied the release coordinator takes a final look and publishes the release. If there is a problem later, the whole release or individual files can be pulled.
The front end is fairly straightforward and there isn’t much to discuss but here are a few highlights:
- The file listing is loaded via ajax so applying filters is fast.
- Your mirror is picked by first looking at your country then your continent. If we have someone trying to download from Antarctica a random mirror will be chosen.
That in a nut shell is how our downloads system works. If anyone has any questions about it I’m happy to answer, either in the comments or Freenode #maria.
OLX, a free classifieds site, is serving up 40 million pages a day using MariaDB. Not an insignificant task.
There’s a nice write-up in the MariaDB knowledgebase with particulars. In short, the 5.2 series of MariaDB and some of the unique features of the project have made a migration easy and valuable.
It’s nice to hear such stories. Both because we like interesting sites and projects, as well as our natural interest in larger scale or larger visibility deployments. Got a story to share? Please create a KB entry, or e-mail the community team.