During my years at MySQL AB I had the unfortunate task of manually maintaining the download page for enterprise customers. This involved a ton of boring, error prone work and almost always led to some sort of error every release. Some of our downloads were eventually replaced with an automated system written by the web team but the memory of all that time wasted still hurts me. So when I joined Monty Program and saw our downloads were manually maintained in mediawiki I knew something had to change.
Most of the websites for Monty Program and the MariaDB project are written with Django so this is where I started. I used our existing website code base and just created a new django application for downloads. There are many models / tables involved in the system but the important ones are:
- Releases: A list of all the releases we have made, i.e. MariaDB 5.2.7, MariaDB 5.1.55, etc
- Files: The individual files that make up a release.
- Mirrors: The information (name, url, location) of the MariaDB mirrors.
- Rules: This is the heart of the system and controls how a file name gets assigned to a release and its various other attributes such as OS and release.
When a MariaDB release is ready to publish our release coordinator pushes the files to our primary mirror and tells the download management system to check for a new release. The system scans the mirror and captures the information (name, size, directory) of new files. The system then loops through each rule in order and checks if it applies.
A rule is basically a regular expression and then a snippet of python code to run. Massive regular expressions are always a pain to work with so we try to keep the rules as simple as possible. For example, this is one of our rules.
Name: CPU – x86_64
Code: file.cpu = ‘x86_64′
Some rules obviously are more complex, but this is a good example of what we aim for. It is easy to understand and if something needs to be changed it can be done easily. The file object in the code section is a helper object to make writing the rules easier by hiding the actual complexities of the underlying objects. I considered using some sort of rules engine but decided that added unneeded complexity (the top answer on this question helped shape my opinion: http://stackoverflow.com/questions/467738/implementing-a-rules-engine-in-python)
Once all the rules have been applied the release coordinator takes a final look and publishes the release. If there is a problem later, the whole release or individual files can be pulled.
The front end is fairly straightforward and there isn’t much to discuss but here are a few highlights:
- The file listing is loaded via ajax so applying filters is fast.
- Your mirror is picked by first looking at your country then your continent. If we have someone trying to download from Antarctica a random mirror will be chosen.
That in a nut shell is how our downloads system works. If anyone has any questions about it I’m happy to answer, either in the comments or Freenode #maria.