Introduction

When you delete rows, for example, they are only marked as deleted; they are not physically removed from the indexes, and the space they occupied is not returned to the operating system for later reuse. The purge thread will physically delete the index keys and rows, but the freed space is still not returned to the operating system, and this operation can leave holes in pages. If you have variable-length rows, this free space may be unusable for new rows (if the new rows are larger than the old ones). You can use OPTIMIZE TABLE or ALTER TABLE <table> ENGINE=InnoDB to rebuild the table.
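As a rough illustration of the rebuild mentioned above, the following sketch checks how much free space InnoDB reports for a table and then rebuilds it. The schema name test and the table name t1 are placeholders, not names from this post. Keep in mind that for a table stored in the shared tablespace, DATA_FREE reports the free space of the whole tablespace, not of that single table.

```sql
-- Placeholder names: schema `test`, table `t1`.
SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH, DATA_FREE
  FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_SCHEMA = 'test'
   AND TABLE_NAME = 't1';

-- Either of these statements rebuilds the table and its indexes;
-- you only need one of them.
OPTIMIZE TABLE test.t1;
ALTER TABLE test.t1 ENGINE=InnoDB;
```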

Unfortunately, running OPTIMIZE TABLE against an InnoDB table stored in the shared tablespace file ibdata1 does two things:

  • Makes the table’s data and indexes contiguous inside ibdata1
  • Makes ibdata1 grow because the contiguous data and index pages are appended to ibdata1

New defragmentation

In MariaDB 10.1 we have merged Facebook’s defragmentation code, prepared for MariaDB by Matt, Seong Uck Lee from Kakao. The only major difference from Facebook’s code and Matt’s patch is that in MariaDB we decided not to introduce new literals to SQL and not to change the server code. Instead we use the already existing OPTIMIZE TABLE, and all code changes are inside the InnoDB/XtraDB storage engines. To enable this new feature you need to set innodb_defragment=1 in the [mysqld] section of my.cnf (this requirement keeps the original behavior of OPTIMIZE TABLE for those users that need it).

This new defragmentation feature works in place: no new tables are created and there is no need to copy data from the old table to a new one. Instead, it loads n pages at a time and tries to move records so that pages are full of records, then frees the pages that are left completely empty after the operation.

New configuration variables

  • innodb_defragment: Enable/disable InnoDB defragmentation. When set to FALSE, all existing defragmentation is paused and new defragmentation commands fail. Paused defragmentation commands resume when this variable is set to TRUE. Default value is FALSE.
  • innodb_defragment_n_pages: Number of pages considered at once when merging multiple pages to defragment. Range is 2–32 and default is 7.
  • innodb_defragment_stats_accuracy: How many defragment stats changes there are before the stats are written to persistent storage. Setting it to 0 disables defragment stats tracking. Default is 0.
  • innodb_defragment_fill_factor_n_recs: How many records of space defragmentation should leave on the page. This variable, together with innodb_defragment_fill_factor, is introduced so that defragmentation won’t pack the page too full and cause a page split on the next insert on every page. Whichever of the two variables indicates more defragmentation gain is the one that takes effect. Range is 1–100 and default is 20.
  • innodb_defragment_fill_factor: A number in the range [0.7, 1] that tells defragmentation how full it should fill a page. Default is 0.9. Numbers below 0.7 don’t make much sense. This variable, together with innodb_defragment_fill_factor_n_recs, is introduced so that defragmentation won’t pack the page too full and cause a page split on the next insert on every page. Whichever of the two variables indicates more defragmentation gain is the one that takes effect.
  • innodb_defragment_frequency: Do not defragment a single index more than this number of times per second. This controls the number of times the defragmentation thread can request X_LOCK on an index. The defragmentation thread checks whether 1/defragment_frequency seconds have passed since it last worked on this index, and puts the index back in the queue if not enough time has passed. The actual frequency can only be lower than this number.
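As a sketch of how these settings might be inspected and adjusted from a client session: the example assumes the variables can be changed at runtime with SET GLOBAL in your build and that your account has the privilege to do so. The values shown are illustrative only, chosen within the ranges listed above, and are not recommendations.

```sql
-- Inspect the current defragmentation settings.
SHOW GLOBAL VARIABLES LIKE 'innodb_defragment%';

-- Enable defragmentation for the running server
-- (a my.cnf entry is still needed to keep it enabled after a restart).
SET GLOBAL innodb_defragment = ON;

-- Illustrative values only.
SET GLOBAL innodb_defragment_n_pages = 16;
SET GLOBAL innodb_defragment_fill_factor = 0.9;
SET GLOBAL innodb_defragment_stats_accuracy = 100;
```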

New status variables

  • Innodb_defragment_compression_failures: Number of defragment re-compression failures.
  • Innodb_defragment_failures: Number of defragment failures.
  • Innodb_defragment_count: Number of defragment operations.
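These counters can be read like any other status variable. A minimal example:

```sql
-- Lists Innodb_defragment_compression_failures,
-- Innodb_defragment_failures and Innodb_defragment_count.
SHOW GLOBAL STATUS LIKE 'Innodb_defragment%';
```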

Example

After CREATE TABLE and INSERT operations we can see the following from INFORMATION_SCHEMA:
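The original listing is not reproduced here. As a stand-in, the sketch below shows one way such a setup could look; the table tb_defragment and its columns are hypothetical names chosen for illustration. The query counts only pages that are currently cached in the buffer pool, so treat the numbers as an approximation, and note that querying INNODB_BUFFER_PAGE can be expensive on a busy server.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE tb_defragment (
  pk1  BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  fld1 VARCHAR(255),
  KEY idx_fld1 (fld1)
) ENGINE=InnoDB;

-- Seed a few rows, then double the row count; repeat the second
-- statement until the table spans many pages.
INSERT INTO tb_defragment (fld1)
VALUES (REPEAT('a', 200)), (REPEAT('b', 200)),
       (REPEAT('c', 200)), (REPEAT('d', 200));
INSERT INTO tb_defragment (fld1) SELECT fld1 FROM tb_defragment;

-- Count the pages of each index that are cached in the buffer pool.
SELECT INDEX_NAME, COUNT(*) AS pages, SUM(NUMBER_RECORDS) AS records
  FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE
 WHERE TABLE_NAME LIKE '%tb_defragment%'
 GROUP BY INDEX_NAME;
```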

Now we delete 3/4 of the records, which leaves holes in the pages, and then run OPTIMIZE TABLE to execute defragmentation:
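A sketch of this step, assuming a hypothetical table tb_defragment with an integer primary key pk1 and with innodb_defragment already enabled:

```sql
-- Remove roughly three out of every four rows, leaving holes in the pages.
DELETE FROM tb_defragment WHERE pk1 % 4 <> 0;

-- With innodb_defragment enabled, this runs the in-place defragmentation
-- instead of rebuilding the table.
OPTIMIZE TABLE tb_defragment;
```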

After this we can see that some pages are freed and some pages merged:
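The original output is not reproduced here. One way to look for the effect, sketched under assumptions: the table name tb_defragment is hypothetical, and the stat names n_pages_freed, n_page_split and n_leaf_pages_defrag are the defragmentation statistics as I recall them, so verify them against your server. They are only written when innodb_defragment_stats_accuracy is nonzero.

```sql
-- Persistent defragmentation statistics (stat names are an assumption).
SELECT index_name, stat_name, stat_value
  FROM mysql.innodb_index_stats
 WHERE table_name = 'tb_defragment'
   AND stat_name IN ('n_pages_freed', 'n_page_split', 'n_leaf_pages_defrag');

-- Number of defragment operations performed so far.
SHOW GLOBAL STATUS LIKE 'Innodb_defragment_count';
```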

Links

WebScaleSQL Git repository https://github.com/webscalesql/webscalesql-5.6

Facebook Percona Live presentation: https://www.percona.com/live/mysql-conference-2014/sites/default/files/slides/defragmentation.pdf

One of the most popular plugin types in both MariaDB and MySQL is the INFORMATION_SCHEMA plugin type. INFORMATION_SCHEMA plugins add new tables to the INFORMATION_SCHEMA. There are lots of INFORMATION_SCHEMA plugins, because they can be used to show just about anything to the user and are very easy to write.

MariaDB 10.1.1 comes with nine INFORMATION_SCHEMA plugins:

  • Feedback: Shows the anonymised server usage information and can optionally send it to the configured URL.
  • Locales: Lists compiled-in server locales. Implemented by Roberto Spadim.
  • METADATA_LOCK_INFO: Lists metadata locks in the server. Implemented by Kentoku Shiba.
  • QUERY_CACHE_INFO: Lists queries in the query cache. Originally by Roland Bouman.
  • QUERY_RESPONSE_TIME: Shows the distribution of query response times. Originally implemented in Percona.
  • CLIENT_STATISTICS: Part of the Percona “userstat” patch.
  • USER_STATISTICS: Part of the Percona “userstat” patch.
  • INDEX_STATISTICS: Part of the Percona “userstat” patch.
  • TABLE_STATISTICS: Part of the Percona “userstat” patch.

There are also many INFORMATION_SCHEMA plugins that come together with storage engines and show various information about them. InnoDB comes with 28 such plugins, XtraDB with 32, and TokuDB with 12. And if you google, you can find many more INFORMATION_SCHEMA plugins, for example a plugin that shows various system information or a plugin that lists all created user variables. INFORMATION_SCHEMA plugins are indeed numerous and very popular.

If you look at these plugins, you can see that some display the current state of something, while others accumulate and display statistics. For this second group it is sometimes desirable to reset the statistics, but there was no natural way for a user to do that. For status variables (as in the SHOW STATUS statement) one can use FLUSH STATUS, but for plugins there was nothing similar. Now MariaDB 10.1.1 extends the INFORMATION_SCHEMA plugin API by introducing the reset_table callback. For example, consider the QUERY_RESPONSE_TIME plugin initialization function.

This initialization function sets the reset_table callback to the query_response_time_flush function. When this plugin is loaded, MariaDB will automatically start supporting a new statement, FLUSH QUERY_RESPONSE_TIME, and it will invoke this callback, which will in turn reset the query response time statistics. The “userstat” tables CLIENT_STATISTICS, USER_STATISTICS, INDEX_STATISTICS, and TABLE_STATISTICS support the FLUSH statement in a similar way.
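From the user’s side, the new statement is FLUSH followed by the plugin table name. The sketch below assumes the plugin library is available under the name query_response_time and that statistics collection is controlled by the query_response_time_stats variable; check both against your installation.

```sql
-- Load the plugin and turn on collection (needed only once).
INSTALL SONAME 'query_response_time';
SET GLOBAL query_response_time_stats = ON;

-- Look at the accumulated distribution.
SELECT * FROM INFORMATION_SCHEMA.QUERY_RESPONSE_TIME;

-- Reset it; this is the statement that invokes the reset_table callback.
FLUSH QUERY_RESPONSE_TIME;
```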

MariaDB 10.1.1 also implements another enhancement for INFORMATION_SCHEMA plugins: the SHOW statement. Until 10.1.1, only native server INFORMATION_SCHEMA tables could have a SHOW counterpart. To query a plugin INFORMATION_SCHEMA table one had to type a lengthy SELECT and then watch long lines wrap around in the terminal window, making it impossible to see which column each value belongs to. You all know it very well:

So we’ve introduced a SHOW statement for INFORMATION_SCHEMA plugins that is quick to type and is supposed to provide just enough columns to fit nicely on the screen:
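To make the comparison concrete, the sketch below runs both forms against the LOCALES plugin. It assumes the plugin library is available under the name locales; the output is omitted because it depends on the server.

```sql
-- Load the plugin (needed only once).
INSTALL SONAME 'locales';

-- The long form: every column, usually too wide for a terminal.
SELECT * FROM INFORMATION_SCHEMA.LOCALES;

-- The short form: only the columns the plugin exposes for SHOW.
SHOW LOCALES;
```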

The API is very simple: a plugin does not need to implement anything special to support this. It only needs to decide what subset of columns will be visible in the SHOW statement and specify names for these columns when declaring INFORMATION_SCHEMA table fields. For example, in the LOCALES plugin:

The names given in the field declarations become the column names for the SHOW statement, and only these columns are visible in SHOW. It is, of course, the sole responsibility of the plugin writer to pick a reasonable subset of columns so that the resulting table is not too wide but still contains enough information to be useful. If a plugin does not specify any column names for SHOW, the SHOW statement will not work for this plugin.


An inquisitive reader may note that the “userstat” tables CLIENT_STATISTICS, USER_STATISTICS, INDEX_STATISTICS, and TABLE_STATISTICS already supported FLUSH and SHOW in 10.0 and in earlier versions of MariaDB. That’s right, but before 10.1.1 these tables were not plugins, and the syntax support was hard-coded into the server. Now they are plugins, which required INFORMATION_SCHEMA plugins to support FLUSH and SHOW. That’s why these extensions were implemented.

Every now and then there is a need to execute certain SQL statements conditionally. Easy, if you do it from your PHP (or Java or whatever) application. But what if all you have is pure SQL? There are two techniques that MariaDB and MySQL use in the mysql_fix_privilege_tables.sql script (applied by the mysql_upgrade tool).

  1. Create a stored procedure with IF statements inside, call it once and drop it. This requires the user to have the CREATE ROUTINE privilege, and the mysql.proc table must exist and be usable (which is not necessarily true; we’re doing it from mysql_upgrade, right?).
  2. Use dynamic SQL, like

    which is not very readable and doesn’t work well if you need to execute many statements conditionally.
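A minimal sketch of what the dynamic SQL technique from item 2 can look like. The condition, the variable names and the table demo_stats are hypothetical and chosen only for illustration; this is not the statement from the actual upgrade script.

```sql
-- Hypothetical condition: is a usable InnoDB engine present?
SET @have_innodb = (SELECT COUNT(engine)
                      FROM information_schema.engines
                     WHERE engine = 'InnoDB' AND support != 'NO');

-- Pick the statement text: the real one, or a harmless no-op.
SET @str = IF(@have_innodb != 0,
              'CREATE TABLE IF NOT EXISTS demo_stats (id INT PRIMARY KEY) ENGINE=InnoDB',
              'SET @dummy = 0');

PREPARE stmt FROM @str;
EXECUTE stmt;
DROP PREPARE stmt;
```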

This may be standard, but I never understood why one cannot simply use SQL control statements (besides CALL that is) directly from the mysql command line tool prompt. Imagine a bash variant that only supports if and while in scripts, but not in the interactive mode from the command line — how would you like it?

Maybe Antony Curtis was asking himself similar questions when he contributed a patch for compound statements in MDEV-5317. Either way, we thought it was a great idea and implemented this feature, based on Antony’s contribution.

Now one can use BEGIN, IF, CASE, WHILE, LOOP, REPEAT statements directly in SQL scripts and from the mysql command line prompt — outside of stored programs. For example, one can rewrite the above as
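As a sketch of a conditional statement written this way, using a hypothetical condition and a hypothetical table demo_stats: the DELIMITER lines are commands of the mysql command line client, needed so that the semicolon inside the block does not end the statement early.

```sql
-- Hypothetical condition: is a usable InnoDB engine present?
SET @have_innodb = (SELECT COUNT(engine)
                      FROM information_schema.engines
                     WHERE engine = 'InnoDB' AND support != 'NO');

DELIMITER |
IF @have_innodb THEN
  CREATE TABLE IF NOT EXISTS demo_stats (id INT PRIMARY KEY) ENGINE=InnoDB;
END IF|
DELIMITER ;
```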

One can use BEGIN ... END blocks and loops without having the CREATE ROUTINE privilege, or with a corrupted (or missing) mysql.proc table. This all works as you would expect it to, with no artificial “standard says so” limitations.

Still, there are some limitations to keep in mind:

  • You cannot use a simple BEGIN to start a block; this is historically used to start a transaction. Use the standard syntax BEGIN NOT ATOMIC.
  • Compound statements from the mysql command line prompt cannot start with a label.
  • Not all statements that can be used in stored programs are supported from the mysql command line prompt; only those listed above are.

These limitations, though, only apply to the top-level statement. For example, if you need a labeled loop or SIGNAL, you start a block and put your statement inside it:
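A sketch of that workaround: the label counting, the user variable @i and the message text are made up for illustration. The labeled loop and SIGNAL sit inside a BEGIN NOT ATOMIC block, so neither of them is the top-level statement.

```sql
DELIMITER |
BEGIN NOT ATOMIC
  SET @i = 0;
  -- A labeled loop is fine here because it is not the top-level statement.
  counting: LOOP
    SET @i = @i + 1;
    IF @i >= 3 THEN
      LEAVE counting;
    END IF;
  END LOOP counting;
  -- SIGNAL is likewise allowed inside the block.
  IF @i <> 3 THEN
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'unexpected loop count';
  END IF;
  SELECT @i AS iterations;
END|
DELIMITER ;
```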

Enjoy!