Introduction

For the moment, the only engines that fully support encryption are XtraDB and InnoDB. The Aria storage engine also supports encryption, but only for temporary tables.

MariaDB supports two different ways to encrypt data in InnoDB/XtraDB:

  1. Specified table encryption: Only tables which you create with PAGE_ENCRYPTION=1 are encrypted. This feature was created by eperi.
  2. Tablespace encryption: Everything is encrypted (including log files). This feature was created by Google and is based on their MySQL branch.

InnoDB Specified Table Encryption

Specified Table encryption means that you choose which tables to encrypt. This allows you to balance security with speed. To use table encryption, you have to:

  • Set the value of encryption-algorithm to the algorithm of your choice.
  • Load the file-key-management-plugin (or similar)
  • Define the location of the key file
  • Create keys

Example:

Keys can be generated using OpenSSL with the following command:
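The command itself was not preserved in this post; one way to generate a random hex-encoded IV and key with OpenSSL is (a sketch, not necessarily the original command):

```shell
# 128-bit IV, hex encoded (32 characters)
openssl rand -hex 16
# 256-bit key, hex encoded (64 characters)
openssl rand -hex 32
```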

The key file is a text file containing a key id, the hex-encoded IV and the hex-encoded key. Example keys.txt using the key generated above:
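The original file contents are not preserved here; assuming the semicolon-separated format described above (key id; hex IV; hex key), an entry in keys.txt might look like this (the values are illustrative, not a real key):

```
1;2D6A2CA9F02463FDDE1B411BC83BCCBE;E2CE37EE5B36F1DDEE57E40D0AD98C36BF25108F1B4D4B5064BEB5CB4FF59087
```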

After this, it is up to the database designer to select the tables that contain sensitive data for encryption. Encryption can be enabled for a table at creation time or with ALTER TABLE. For example:
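The original statements are not preserved here; based on the PAGE_ENCRYPTION syntax described above, they might look like this (table names and the key id are hypothetical):

```sql
-- Encrypt a new table with key id 2 from the key file
CREATE TABLE t1 (id INT PRIMARY KEY, secret VARCHAR(100))
  ENGINE=InnoDB PAGE_ENCRYPTION=1 PAGE_ENCRYPTION_KEY=2;

-- Enable encryption on an existing table
ALTER TABLE t2 PAGE_ENCRYPTION=1 PAGE_ENCRYPTION_KEY=2;

-- Decrypt it again
ALTER TABLE t2 PAGE_ENCRYPTION=0;
```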

With table encryption, the encryption keys themselves currently cannot be changed, but the key used by a table can be changed with ALTER TABLE. If no key identifier is provided, a default key is used. The default key can be set either in my.cnf or dynamically as a global setting, and it is used e.g. when PAGE_ENCRYPTION=1 is given without an explicit key identifier.
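The original snippets are not preserved here; a sketch of both options, assuming the innodb_default_page_encryption_key variable used by this feature (key id 3 is illustrative):

```ini
# my.cnf
[mysqld]
innodb-default-page-encryption-key = 3
```

```sql
-- or dynamically:
SET GLOBAL innodb_default_page_encryption_key = 3;

-- the default key is then used when no PAGE_ENCRYPTION_KEY is given:
CREATE TABLE t3 (id INT) ENGINE=InnoDB PAGE_ENCRYPTION=1;
```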

InnoDB Tablespace Encryption

In tablespace encryption all InnoDB tables are encrypted. Additionally, you may encrypt InnoDB log files, Aria tables (ROW_FORMAT=PAGE) and Aria temporary tables. To use tablespace encryption, you have to:

  • Set the value of encryption-algorithm to the algorithm of your choice.
  • Load the example-key-management-plugin (or similar)

Example:
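The original example is not shown here; a my.cnf sketch under the assumption of the 10.1 option names (the plugin and the values are illustrative):

```ini
[mysqld]
# encryption algorithm of your choice
encryption-algorithm = aes_cbc
# example key management plugin (for demonstration only)
plugin-load-add = example_key_management_plugin
# also encrypt InnoDB log files
innodb-encrypt-log = 1
# background threads performing encryption and key rotation
innodb-encryption-threads = 4
# re-encrypt a page if its key is older than this many seconds
innodb-encryption-rotate-key-age = 1800
```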

In tablespace encryption, keys are not static; instead, so-called key rotation is used: the encryption key of a page is changed if the key used on that page is older than innodb-encryption-rotate-key-age seconds.

InnoDB Tablespace Scrubbing

Scrubbing means that there is a background process that regularly scans through all tables and upgrades the encryption keys for the pages. This happens either as part of purge (for non-compressed tables) or by scanning whole tablespaces (added to the key rotation threads). Purge is a type of garbage collection that InnoDB runs internally to improve performance. Configuration for this feature might look as follows:
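A sketch of such a configuration, assuming the background-scrubbing variable names of the 10.1 feature (the values are illustrative):

```ini
[mysqld]
# enable background scrubbing of uncompressed and compressed data
innodb-background-scrub-data-uncompressed = 1
innodb-background-scrub-data-compressed = 1
# check for scrubbing work every 60 seconds
innodb-background-scrub-data-check-interval = 60
# re-scrub data that has not been scrubbed within one week
innodb-background-scrub-data-interval = 604800
```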

Performance Impact

Encrypting tables or tablespaces naturally has some effect on the overall performance of the system. How large the effect is depends on the hardware, the workload and the encryption method used. The goal of this section is to give some indication of how much table encryption and tablespace encryption affect performance compared to a setup with no encryption.

All experiments were conducted on an Intel Xeon E5-2690 @ 2.9GHz with 2 sockets of 8 cores each and hyper-threading, thus 32 logical cores in total, running Linux 3.4.12 with 132G of main memory. The database is stored on a Fusion-io ioDrive2 Duo 2.41TB (Firmware v7.2.5, rev 110646, Driver 3.3.4 build 5833069). The database filesystem uses NVMFS, and all test logs and outputs are stored on a second ioDrive using EXT4. We use the On-Line Transaction Processing (OLTP) benchmark from Percona (https://code.launchpad.net/~percona-dev/perconatools/tpcc-mysql). This TPC-C-like workload involves a mix of five concurrent transaction types executed on-line or queued for deferred execution. The database is comprised of nine tables with a wide range of record and population sizes. Results are measured in transactions per minute (tpmC). We use 1000 warehouses, producing a ~100G database, with a 50G buffer pool, so that the full database does not fit into the buffer pool. Additionally, we use only the InnoDB plugin as a storage engine. Finally, we use a three-hour measurement time.

In the first graph we compare the resulting tpmC on normal InnoDB tables (unencrypted), on page-encrypted tables, and on tablespace encryption using passive key rotation and scrubbing (both intervals set larger than the test time; labelled "google full encrypted" in the graph).

[Graph: encryption_tpcc2 — tpmC results for the compared configurations]

Conclusions

MariaDB Corporation would like to thank eperi and Google for their contributions to MariaDB.

The MariaDB project is pleased to announce the immediate availability of MariaDB 10.0.17 and MariaDB 10.1.3. MariaDB 10.0.17 is a Stable (GA) release. MariaDB 10.1.3 is a Beta release.

Download MariaDB 10.0.17

Release Notes | Changelog | What is MariaDB 10.0?


Download MariaDB 10.1.3

Release Notes | Changelog | What is MariaDB 10.1?

MariaDB APT and YUM Repository Configuration Generator

See the Release Notes and Changelogs for detailed information on these releases.

Thanks, and enjoy MariaDB!

Introduction

Causal consistency [1] is one of the consistency criteria that can be used in distributed databases.

A distributed database provides causal consistency if read and write operations that are causally related are seen by every node of the distributed system in the same order. Concurrent writes may be seen in different orders by different nodes. Causal consistency is weaker than sequential consistency [2] but stronger than eventual consistency [3]. See an earlier blog post for a more detailed description of eventual consistency: https://blog.mariadb.org/eventually-consistent-databases-state-of-the-art/.

When a transaction performs a read operation followed later by a write operation, even on a different object, the read is said to be causally ordered before the write, because the value created by the write may have depended on the result of the read. Similarly, a read operation is causally ordered after the earlier write on the same object that stored the data retrieved by the read. Two write operations performed by the same node are also defined to be causally ordered, in the order they were performed: intuitively, after writing value v into object x, a node knows that a read of x would return v, so a later write can be said to be (potentially) causally related to the earlier one. Finally, causal order is transitive: if operation A is (causally) ordered before B, and B is ordered before C, then A is ordered before C.

Operations that are not causally related, even through other operations, are said to be concurrent. Consider the following example:
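The original schedule is not reproduced here; a history matching the discussion below, with hypothetical values a and b, could be:

```
T1: w1[x=a]
T2: w2[x=b]            (w1 and w2 are concurrent)
T3: r3[x=b]            (r3 returns the value written by w2)
T4: r4[x=a]            (r4 returns the value written by w1)
```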

This execution is causally consistent because the read operation r3[x] is causally dependent on the write w2[x], and the read operation r4[x] is causally dependent on the write w1[x]. Note that concurrent write operations may be seen in different orders by different transactions or nodes.

Implementing Causal Consistency

Causal consistency can be achieved using Lamport clocks [4] or version vectors [5]. The causal consistency model is implemented using multi-part timestamps assigned to each object. These timestamps are stored in a vector that contains the version number of the object at each replica. This vector must be included (in the form of dependencies) in all update and query requests so that operations respect the causal ordering: an operation A can be processed at a given node only if all operations on which A causally depends have already been applied at that node.

Each element in the vector clock corresponds to one host, and the value of an element indicates the number of messages sent from that host. The vector clock information is used to order or delay the delivery of messages when necessary, thus maintaining the required consistency. However, for maintaining the consistency of data items we need information about the writes on each data item, and maintaining a clock per data item helps. Therefore, instead of a vector clock of size N (the number of hosts), we maintain a vector of size M (the number of objects). The value v[i] in the vector v is the number of writes on data item i.

Each host maintains a vector of size M. Whenever it updates the value of an item, it increments the corresponding element and sends the vector, along with the data item and its new value, to every site that has a copy of the replica. When a host receives an update message, it delays its delivery until each element in its own vector is greater than or equal to the corresponding piggybacked element. After that, the update to the data item is applied. The message overhead is thus O(M) and independent of the number of hosts in the system.
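A minimal Python sketch of this protocol (one interpretation of the scheme above: an update on item i is deliverable once the receiver has applied exactly one fewer write on item i than the message records, and at least as many writes on every other item; all names are illustrative):

```python
class Replica:
    """A replica holding M objects with a per-object write-count vector."""

    def __init__(self, num_objects):
        self.v = [0] * num_objects       # writes applied per object
        self.data = [None] * num_objects # current value per object
        self.pending = []                # delayed update messages

    def local_write(self, i, value):
        """Apply a local write on object i; return the update message to broadcast."""
        self.v[i] += 1
        self.data[i] = value
        return (i, value, list(self.v))  # piggyback a copy of the vector

    def _deliverable(self, msg):
        i, _, mv = msg
        # All causally earlier writes must already be applied locally.
        return (self.v[i] == mv[i] - 1 and
                all(self.v[j] >= mv[j] for j in range(len(mv)) if j != i))

    def receive(self, msg):
        """Buffer the update; apply it (and any updates it unblocks) when causally safe."""
        self.pending.append(msg)
        progress = True
        while progress:
            progress = False
            for m in list(self.pending):
                if self._deliverable(m):
                    i, value, mv = m
                    self.data[i] = value
                    self.v[i] = mv[i]
                    self.pending.remove(m)
                    progress = True
```

For example, if replica A writes object 0 and then object 1, and replica B receives the second update first, B buffers it until the first update arrives, then applies both in causal order.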

Since each message is an update message, it carries the new value of the data item rather than instructions. The delivery of an update on a data item therefore does not need to wait for previous updates on the same item. This would not have been possible with host-based vector clocks: in that case, the delivery of a message would have been delayed even for previous messages that are causally overwritten.

Applications and Databases using Causal Consistency

COPS and Eiger

The COPS system [6] (Clusters of Order-Preserving Servers) introduces causal+ consistency and is designed to support complex online applications hosted in a small number of large-scale data centers, each of which is composed of front-end servers (clients of COPS) and back-end key-value data stores. Eiger [7] has a similar design but a different implementation.

COPS and Eiger support causality through a client library. Both systems replicate writes to geographically distributed data centers and enforce observed ordering. The observed ordering is enforced by delaying the write operations until all causally previous operations have been already applied at that data center.

COPS executes all read and write operations in the local data center in a linearizable [8] fashion, and then replicates data across data centers in a causal+ consistent order in the background. Figure 1 describes the high level architecture of COPS.

 


Figure 1: Architecture of COPS and Eiger [6].

Similarly, the Eiger system provides the linearizability inside each data center and the causally-consistent data store based on a column-family data model to achieve better performance in a geo-distributed setting. All operations from clients are served from the local data center using a client library. The library mediates access to nodes in the local data center, executes the read and write transaction algorithms, and tracks causality and attaches dependencies to write operations. Each replica stores full replica of the database, and operations are handled locally. After an operation is executed locally, the operation is asynchronously pushed to remote data centers, but committed only after all causally dependent operations have been previously committed.

Bolt-on

Bailis et al. [9] propose a client-side middleware called Bolt-on. This middleware guarantees only application-defined dependencies, as an alternative to full causal consistency. Figure 2 describes the architecture of the Bolt-on middleware.

 


Figure 2: Architecture of Bolt-on [9].

The Bolt-on architecture assumes that the underlying data store handles most aspects of data management, including replication, availability, and convergence. In the architecture, the underlying data store locally uses the eventual consistency and allows a large space of read and write histories; the middle-ware handles causal dependencies, and consists of a set of rules, which limit the possible histories to the histories that obey the desired consistency model.

Application: MMORPG

In Massively Multi-player Online Role-Playing Games (MMORPG), players can cooperate with others in a virtual game world, and both the players and the different game worlds are naturally distributed. These systems manage large amounts of data, and the biggest problem is how to support data consistency. According to the CAP theorem, we have to sacrifice one of two properties: consistency or availability [10]. If an online game does not guarantee availability, players' requests may fail. If data is inconsistent, players may get data that does not conform to the game logic, and this data can affect their operations. It is therefore important for the MMORPG environment to find a balance between data consistency and system availability, and to find that balance we must analyze the data consistency requirements of MMORPGs [10].

Diao [10] has studied different consistency models for MMORPGs and found that there is indeed a part of the data for which causal consistency is an appealing choice: game data. The game data contains e.g. the world appearance, the metadata of non-player characters (characters created by the game developers and controlled only by the game logic), the system configuration and the game rules. This data is used by players and the game engine throughout the game, but can be modified only by the game developers. Consistency requirements for the game data are not as strict as e.g. for account data, because e.g. a change of a non-player character's name or of the duration of a bird animation may not be noticed by players.

Furthermore, some changes to the game data need to be delivered to all online players synchronously, e.g. a change of the world appearance, weapon powers, non-player characters, game rules or scripts. Inconsistency in these areas would cause display errors and logic errors for players. Therefore, some data needs to be stored on the server side and some on the client side. The game data on the client side can only synchronize with the servers when a player logs in to or starts a game. For this reason, causal consistency is required [10].

This could mean that when player A uses a browser to connect to the game server, the server checks the current local data and updates the client-side game data in the form of data packets. After the update, all future local accesses will return the updated value. Player B, who has not communicated with the game server, will still retain the outdated game data.

Game servers maintain the primary version of the game data and transfer it to the client sides. Additionally, players in different game worlds cannot talk to each other. Thus, the only requirement is that the game data is consistent within one game world at a time, so that all players in that game world are treated equally. This calls for strong consistency locally within a game world and causal consistency among different game worlds. When the game data is modified by developers, the updated value should be delivered synchronously to all replicas in that game world, and asynchronously to the other game worlds.

While the possibility of using causal consistency in MMORPGs has been identified in research [10], to the authors' knowledge there are no publications or other information showing that causal consistency is actually used in MMORPG games (e.g. Lord of the Rings Online).

Application: Facebook

Facebook is an online Social networking service. After registering to use the site, users can create a user profile, add other users as friends, exchange messages, post status updates and photos, share videos and receive notifications when others update their profiles.

When you log into your account on Facebook, the server will show your own status messages and your friends’ status messages at that point in time. Status messages on Facebook may contain pictures, shared links and stories or your own messages. Naturally, your account data requires a strong consistency, but for status data the weaker consistency models are acceptable. During the time the user is online, the status updates of a user’s friends and of the user do not need to be strictly ordered, and the causal ordering is enough.

Thus when a user A sends a status update and a user B replies to that update, there is a causal order on the two updates. However, when users C and D do a totally unrelated update, the order these updates appear to users A and B is not relevant. This is because users A and B do not know in which order updates are performed.

The reason why eventual consistency is not enough for Facebook status updates is that eventual consistency does not require any ordering between writes. Consider a case where user A first sends a status update, and after a few seconds A edits that update. With eventual consistency, all friends of A could see only the first version, because eventual consistency does not guarantee that the first write is applied before the second one. With causal consistency, since there is a read (by user A) of the first update followed by a write (the edited status from user A), the two are causally related, and all of user A's friends will naturally see the second version.

Although causal consistency is a possible consistency model for Facebook status updates and for several similar distributed services with status updates, like LinkedIn, Twitter and Yahoo, to the author's knowledge there is no scientific or other literature showing that causal consistency is really used.

Conclusions

The causal consistency model can be enforced with Lamport clocks. Transactions using the causal consistency are executed in an order that reflects their causally-related read/write operations’ order. Concurrent operations may be committed in different orders and their results can be read also in different orders.

Actually, causal consistency can solve many problems that cannot be solved under eventual consistency, such as ordering operations. Causal consistency ensures that every node sees operations in the same causal order, which makes it stronger than eventual consistency. However, causal consistency cannot support e.g. distributed integrity constraints.

Although a few promising systems have been developed in research, to the authors' knowledge there are no commercial or mature systems using the causal consistency model. For a more thorough discussion see [11].

References

[1] Mustaque Ahamad, Gil Neiger, James E. Burns, Prince Kohli, P. W. Hutto: Causal Memory: Definitions, Implementation and Programming, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.3356

[2] Leslie Lamport: How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs, IEEE Trans. Comput. C-28,9 (Sept. 1979), 690-691.

[3] Werner Vogels: Eventually consistent. Communications of the ACM 52: 40. doi:10.1145/1435417.1435432

[4] Leslie Lamport: Time, clocks, and the ordering of events in a distributed system, Communications of the ACM, 21: 7, pp. 558–565, 1978.

[5] C. J. Fidge: Timestamps in message passing systems that preserve the partial ordering, in Theoretical Computer Science, 1988.

[6] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen: Don’t settle for eventual: scalable causal consistency for wide-area storage with COPS, in Proceedings of the 23rd ACM Symposium on Operating Systems Principles, pp. 401–416, Portugal, October 23-26, 2011.

[7] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen: Stronger semantics for low-latency geo-replicated storage, in Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation, pp. 313–328, Lombard, IL, USA, April 2-5, 2013.

[8] M. Herlihy and J. M. Wing, Linearizability: A correctness condition for concurrent objects, ACM
Trans. Program. Lang. Syst., 12:3, pp. 463–492, 1990.

[9] P. Bailis, A. Ghodsi, J. M. Hellerstein, and I. Stoica: Bolt-on causal consistency, in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 761–772, New York, NY, USA, June 22-27, 2013.

[10] Z. Diao: Consistency models for cloud-based on-line games: the storage system's perspective, in 25th Workshop on Grundlagen von Datenbanken, pp. 16–21, Ilmenau, Germany, May 28–31, 2013.

[11] Mawahib Musa Elbushra, Jan Lindström: Causal Consistent Databases, Open Journal of Databases (OJDB), 2:1, http://www.ronpub.com/publications/OJDB_2015v2i1n02_Elbushra.pdf