Introduction

Fortran (FORmula TRANslating System) is a general-purpose, imperative programming language that is especially suited to numeric computation and scientific computing. The history of FORTRAN can be traced back to late 1953, when John W. Backus submitted a proposal to his superiors at IBM. The first FORTRAN compiler appeared in April 1957.

Some notable historical steps were:

  • FORTRAN II in 1958
  • FORTRAN III in 1958
  • FORTRAN IV in 1962
  • FORTRAN 66 (X3.9-1966), which became the first industry standard
  • FORTRAN 77 (X3.9-1978). This is the version of Fortran I learned in 1996.
  • Fortran 90, released as ISO/IEC standard 1539:1991 and as an ANSI standard in 1992
  • Fortran 95, released as ISO/IEC standard 1539-1:1997
  • Fortran 2003, released as ISO/IEC standard 1539-1:2004
  • Fortran 2008, released as ISO/IEC standard 1539-1:2010, the most recent standard
  • Fortran 2015, planned for late 2016

A more comprehensive history and introduction can be found e.g. at http://en.wikipedia.org/wiki/Fortran.

Thus, the Fortran programming language is not dead! I used Fortran on the same day as I started writing this blog (05/07/2015). There is a historical reason why I decided to learn Fortran. In the Department of Computer Science, University of Helsinki, there is a course named Software Project, where students design, implement and test a larger application. I participated in this course in 1996, and my application was ringing software for the Ringing Centre, Natural History Museum, University of Helsinki. Their original software used magnetic tapes and FORTRAN 66/77 programs. Our assignment was to transform this to use an Oracle database and UNIX. At that time we decided to use Fortran 77 (with some Fortran 90 extensions, mainly structures) and the Pro*FORTRAN precompiler from Oracle.

Compilers

There is a version of GNU Fortran named GFortran. The GFortran compiler is fully compliant with the Fortran 95 standard and includes legacy F77 support. In addition, a significant number of Fortran 2003 and Fortran 2008 features are implemented.

In my experience GFortran is a very good compiler and really includes most of the legacy support you need (and a lot of new stuff I really do not need). However, I found one feature that is useful but not supported: variable-length formats.
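
Consider the following sketch, which uses a variable format expression, a VAX/DEC extension where a repeat count inside a FORMAT is computed at runtime (the variable names are only illustrative):

    ! The repeat count <ILOPPU-IALKU+1> is evaluated at runtime;
    ! this DEC/Intel extension is rejected by GFortran
    INTEGER :: I, IALKU, ILOPPU
    IALKU  = 1971
    ILOPPU = 2014
    WRITE (*, 900) 'YEARS:', (I, I = IALKU, ILOPPU), '.'
900 FORMAT (A, <ILOPPU-IALKU+1>(2X, I4), A)
    END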

Here the format (2X,I4) is repeated a number of times that depends on runtime values. This can be transformed to a form that GFortran accepts.
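
A sketch of the transformation, which builds the format into a character variable at runtime:

    CHARACTER(LEN=40) :: FRMT
    INTEGER :: I, IALKU, ILOPPU
    IALKU  = 1971
    ILOPPU = 2014
    ! Build the repeat count into the format string at runtime
    WRITE (FRMT, '(A,I0,A)') '(A,', ILOPPU-IALKU+1, '(2X,I4),A)'
    WRITE (*, FRMT) 'YEARS:', (I, I = IALKU, ILOPPU), '.'
    END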

This works because the format can be a string variable, and the above produces the format (A,44(2X,I4),A) (assuming years 1971 and 2014). But in my opinion the first one is clearer and simpler. Additionally, I learned to use the pre-Fortran90 STRUCTURE and RECORD extensions.
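
For example, something along these lines (the field names are hypothetical):

    STRUCTURE /RINGING/
      CHARACTER*10 RING
      INTEGER      SPECIES
      INTEGER      YEAR
    END STRUCTURE
    RECORD /RINGING/ R

    R.RING    = 'A123456'
    R.SPECIES = 101
    R.YEAR    = 1996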

This can naturally be expressed using TYPE.
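
A sketch of the same record as a standard Fortran 90 derived type:

    TYPE RINGING
      CHARACTER(LEN=10) :: RING
      INTEGER           :: SPECIES
      INTEGER           :: YEAR
    END TYPE RINGING
    TYPE (RINGING) :: R

    R%RING    = 'A123456'
    R%SPECIES = 101
    R%YEAR    = 1996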

I mostly use Fortran 90 and free form (longer line lengths than allowed by standard Fortran 77), but only a limited amount of the new features. Thus the code mostly looks like Fortran 77.
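
A sketch of the style: free-form source with IMPLICIT NONE and attribute declarations, but otherwise plain Fortran 77 constructs:

    PROGRAM KESKIARVO
    IMPLICIT NONE
    INTEGER :: I, N
    REAL    :: S
    N = 10
    S = 0.0
    DO 100 I = 1, N
       S = S + REAL(I)
100 CONTINUE
    S = S / REAL(N)
    WRITE (*, 900) S
900 FORMAT (' MEAN: ', F8.2)
    END PROGRAM KESKIARVO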

Naturally, there is a number of commercial Fortran compilers, like Intel Fortran (https://software.intel.com/en-us/fortran-compilers) and NAG (http://www.nag.com/nagware/np.asp).

Clearly one of the bad features of Fortran is implicit typing. If a variable is undeclared, Fortran 77 uses a set of implicit rules to establish its type: all variables starting with the letters i-n are integers and all others are real. Many old Fortran 77 programs use these implicit rules, but you should not! The probability of errors in your program grows dramatically if you do not consistently declare your variables. Therefore, always put the following at the start of your Fortran program:
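
    IMPLICIT NONE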

SQL and Fortran

Fortran does not natively understand SQL-clauses, but you can use e.g. embedded SQL. Embedded SQL means SQL-clauses written inside a host language like Fortran. Let's take an example.
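
A sketch in the Oracle Pro*FORTRAN style (the credentials, table and column names are hypothetical):

    PROGRAM EMBSQL
    EXEC SQL BEGIN DECLARE SECTION
    CHARACTER*20 UID
    CHARACTER*20 PWD
    INTEGER      NRINGS
    EXEC SQL END DECLARE SECTION
    EXEC SQL INCLUDE SQLCA
    UID = 'scott'
    PWD = 'tiger'
    EXEC SQL CONNECT :UID IDENTIFIED BY :PWD
    EXEC SQL SELECT COUNT(*) INTO :NRINGS FROM RINGINGS
    WRITE (*, *) 'NUMBER OF RINGINGS: ', NRINGS
    EXEC SQL COMMIT WORK RELEASE
    END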

Naturally, a normal Fortran compiler will not understand clauses starting with EXEC SQL. Thus, you first need to use a precompiler. The precompiler transforms the embedded SQL-clauses (above, the INCLUDE clauses are copied into the resulting file) and the other SQL-clauses into CALL-clauses to the database server's API. Naturally, this means that your software will work only with the database provider it was precompiled (and then compiled) for.

Currently, there are precompilers at least for the Oracle and DB2 databases (see http://en.wikipedia.org/wiki/Embedded_SQL). However, OS support is diminishing. E.g. the Oracle Fortran Precompiler no longer works on 64-bit Linux when using Oracle versions newer than 10g. This in my opinion is bad, because porting your Fortran software from Oracle to e.g. DB2 is not trivial, especially if you have an application with 100000 lines of Fortran code.

In my experience, this fact has led to situations where part of the system is re-implemented using Java, and part of the code is modified to pure Fortran, so that it reads its input from files (generated using plain SQL), with all embedded SQL-clauses removed.

Fortran and MariaDB

There are no connectors from Fortran to MariaDB/MySQL. You could use ODBC; however, the free ODBC modules FLIBS and fodbc failed to compile on my 64-bit Linux, and after some hacking with fodbc, it did not really work. Naturally, you could write your own fodbc for MariaDB/MySQL, but currently I have neither a real need nor enough free time to do so. An alternative way of doing this is to create a C-language interface between the Fortran code and the ODBC driver.

Let's take a very simple example where a Fortran program connects to a MariaDB database, selects the version and disconnects.
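
A minimal sketch; the MARIADB_* subroutines are hypothetical wrappers implemented in the C-interface shown below:

    PROGRAM MDEMO
    IMPLICIT NONE
    CHARACTER(LEN=32)  :: HOST, USER, PASSWD
    CHARACTER(LEN=128) :: VERSION
    INTEGER :: I
    HOST    = 'localhost'
    USER    = 'test'
    PASSWD  = 'test'
    VERSION = ' '
    ! Add the C end-of-string character before calling the C-routines
    I = INDEX(HOST, ' ')
    HOST(I:I) = CHAR(0)
    I = INDEX(USER, ' ')
    USER(I:I) = CHAR(0)
    I = INDEX(PASSWD, ' ')
    PASSWD(I:I) = CHAR(0)
    CALL MARIADB_CONNECT(HOST, USER, PASSWD)
    CALL MARIADB_VERSION(VERSION)
    ! Remove the trailer before using the string in Fortran again
    I = INDEX(VERSION, CHAR(0))
    IF (I .GT. 0) VERSION(I:) = ' '
    WRITE (*, *) 'SERVER VERSION: ', TRIM(VERSION)
    CALL MARIADB_DISCONNECT()
    END PROGRAM MDEMO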

As you can note, string variables need special handling, because Fortran strings are fixed-length and not null-terminated. Therefore, we need to add the C end-of-string character before calling the C-routines, and then remove the trailer before using the string in Fortran again. And then we need a simple C-interface (no real error handling).
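
A sketch of the C side; for simplicity it calls the MariaDB/MySQL client library directly rather than an ODBC driver. Note that GFortran appends an underscore to external names and passes a hidden length argument for each CHARACTER argument:

    /* mariadb_f.c: compile e.g. with gcc -c mariadb_f.c $(mysql_config --cflags) */
    #include <stdio.h>
    #include <mysql.h>

    static MYSQL *con;

    /* Hidden string-length arguments follow the character pointers. */
    void mariadb_connect_(char *host, char *user, char *passwd,
                          int lhost, int luser, int lpasswd)
    {
        con = mysql_init(NULL);
        if (mysql_real_connect(con, host, user, passwd,
                               NULL, 0, NULL, 0) == NULL)
            fprintf(stderr, "connect failed: %s\n", mysql_error(con));
    }

    void mariadb_version_(char *version, int lversion)
    {
        /* Copy the null-terminated server version into the Fortran buffer */
        snprintf(version, (size_t) lversion, "%s", mysql_get_server_info(con));
    }

    void mariadb_disconnect_(void)
    {
        mysql_close(con);
    }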

And if you compile these and run the resulting program, you might see something like the following (the exact version string depends on your server):
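
    SERVER VERSION: 10.0.16-MariaDB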

Future of Fortran?

There is clearly a need for languages like Fortran. It has some very nice features, like formatted I/O and mathematical functions. However, learning Fortran might be up to you, because it is not taught as a first (or second) programming language at most universities or similar schools. Thus, the number of people who can program in Fortran or teach it is decreasing rapidly. However, my experience is that learning Fortran is simple if you have mastered at least one programming language (OK, I had already learned C/C++, COBOL, PL/I and Basic in my earlier studies). So you want to learn Fortran? If Internet resources are not enough, there is a number of books. The book I have used is obsolete (Fortran 77, and a Finnish-language edition for Fortran 90/95), but e.g. http://www.amazon.com/Introduction-Programming-Fortran-With-Coverage/dp/0857292323 is a good one.

Introduction

For the moment, the only engines that fully support encryption are XtraDB and InnoDB. The Aria storage engine also supports encryption, but only for temporary tables.

MariaDB supports two different ways to encrypt data in InnoDB/XtraDB:

  1. Specified table encryption: Only tables which you create with PAGE_ENCRYPTION=1 are encrypted. This feature was created by eperi.
  2. Tablespace encryption: Everything is encrypted (including log files). This feature was created by Google and is based on their MySQL branch.

InnoDB Specified Table Encryption

Specified table encryption means that you choose which tables to encrypt. This allows you to balance security with speed. To use table encryption, you have to:

  • Set the value of encryption-algorithm to the algorithm of your choice.
  • Load the file-key-management-plugin (or similar).
  • Define the location of the key file.
  • Create keys.

Example:
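
A sketch of the my.cnf settings; the option names below follow the 10.1-era releases, so verify them against the documentation of your release:

    [mysqld]
    encryption-algorithm = aes_cbc
    # load the key file plugin and point it at the key file
    # (the filename option name below is an assumption)
    plugin-load-add = file_key_management.so
    file-key-management-filename = /home/mdb/keys.txt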

Keys can be generated using OpenSSL with the following command, which prompts for a password and prints a hex-encoded key and iv:
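
    openssl enc -aes-256-cbc -P -md sha1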

The key file is a text file containing a key id, the hex-encoded iv and the hex-encoded key. An example keys.txt using a key generated as above could look like this (one key per line, fields separated by semicolons; the values here are made up):
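
    1;B5EA210A4124AFC7F54DE32A88F9DB18;8A7E82F2194B4B9A83C6D11BC4CB6D8AF54DE32A88F9DB18B5EA210A4124AFC7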

After this, it is up to the database designer to select the tables that contain sensitive data for encryption. Encryption can be enabled for a table at table creation time or using ALTER TABLE. As an example:
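
A sketch, where PAGE_ENCRYPTION_KEY is assumed to select one of the key ids defined in the key file, and the tables themselves are hypothetical:

    CREATE TABLE customer (
      id   BIGINT NOT NULL PRIMARY KEY,
      name VARCHAR(80),
      ssn  VARCHAR(16)
    ) ENGINE=InnoDB PAGE_ENCRYPTION=1 PAGE_ENCRYPTION_KEY=1;

    ALTER TABLE invoice PAGE_ENCRYPTION=1 PAGE_ENCRYPTION_KEY=2;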

With table encryption, the keys themselves currently can't be rotated, but the key used by a table can be changed using ALTER TABLE. If no key identifier is provided, a default key is used. The default key can be set either in my.cnf with:
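
A sketch, assuming the 10.1-era variable name:

    [mysqld]
    # assumed variable name; verify against the documentation
    innodb-default-page-encryption-key = 5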

or dynamically using the global setting:
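
(again assuming the 10.1-era variable name)

    SET GLOBAL innodb_default_page_encryption_key = 5; -- assumed variable name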

The default key is used e.g. when a table is created with PAGE_ENCRYPTION=1 but without an explicit key identifier:
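
    -- no PAGE_ENCRYPTION_KEY given, so the default key is used
    CREATE TABLE t1 (a INT NOT NULL) ENGINE=InnoDB PAGE_ENCRYPTION=1;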

InnoDB Tablespace Encryption

In tablespace encryption all InnoDB tables are encrypted. Additionally, you may encrypt InnoDB log files, Aria tables (ROW_FORMAT=PAGE) and Aria temporary tables. To use tablespace encryption, you have to:

  • Set the value of encryption-algorithm to the algorithm of your choice.
  • Load the example-key-management-plugin (or similar).

Example:
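
A sketch; encryption-algorithm and the example key management plugin are named above, while the remaining option names are assumptions from the 10.1-era releases:

    [mysqld]
    encryption-algorithm = aes_cbc
    plugin-load-add = example_key_management.so
    # assumed 10.1-era option names; verify against the documentation
    innodb-encrypt-tables
    innodb-encrypt-log
    innodb-encryption-threads = 4
    innodb-encryption-rotate-key-age = 1800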

In tablespace encryption, keys are not static; instead, so-called key rotation is used. In key rotation, the encryption key of a page is changed if the key used on that page is older than innodb-encryption-rotate-key-age seconds.

InnoDB Tablespace Scrubbing

Scrubbing means that there is a background process that regularly scans through all tables and upgrades the encryption keys of the pages. This happens either as part of purge (for non-compressed tables) or by scanning whole tablespaces (added into the key rotation threads). Purge is a type of garbage collection that InnoDB runs internally to improve performance. The configuration for this feature might look as follows:
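
(option names as in the 10.1-era releases; an assumption to verify against the documentation)

    [mysqld]
    # intervals are in seconds
    innodb-background-scrub-data-uncompressed = 1
    innodb-background-scrub-data-compressed = 1
    innodb-background-scrub-data-check-interval = 3600
    innodb-background-scrub-data-interval = 604800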

Performance Impact

Encrypting tables or tablespaces naturally has some effect on the overall performance of the system. The amount of the effect depends on the hardware, the workload and the encryption method used. The goal of this section is to give some indication of how much effect on performance table encryption and tablespace encryption have, compared to a setup where no encryption is used.

All experiments were conducted on an Intel Xeon E5-2690 @ 2.9GHz CPU with 2 sockets of 8 cores each using hyper-threading, thus 32 logical cores in total, running Linux 3.4.12 with 132G of main memory. The database is stored on a Fusion-io ioDrive2 Duo 2.41TB (Firmware v7.2.5, rev 110646, Driver 3.3.4 build 5833069). The database filesystem is NVMFS, and all test logs and outputs are stored on a second ioDrive using EXT4. We use the On-Line Transaction Processing (OLTP) benchmark from Percona (https://code.launchpad.net/~percona-dev/perconatools/tpcc-mysql). This TPC-C-like workload involves a mix of five concurrent transaction types executed on-line or queued for deferred execution. The database is comprised of nine tables with a wide range of record and population sizes. Results are measured in terms of transactions per minute (tpmC). We use 1000 warehouses, producing a ~100G database, and a buffer pool size of 50G, so that the full database does not fit in the buffer pool. Additionally, we use only the InnoDB plugin as a storage engine. Finally, we use a 3 hour measuring time.

In the first graph we compare the resulting tpmC for normal InnoDB tables (unencrypted), for page-encrypted tables, for a setup using passive key rotation and scrubbing (both intervals set longer than the test time), and for tablespace encryption ("google full encrypted" in the graph).

[Graph: tpmC results for the unencrypted, page-encrypted and tablespace-encrypted configurations]

Conclusions

MariaDB Corporation would like to thank eperi and Google for their contributions to MariaDB.

Introduction

Causal consistency [1] is one of the consistency criteria that can be used in distributed databases.

A distributed database provides causal consistency if read and write operations that are causally related are seen by every node of the distributed system in the same order. Concurrent writes may be seen in a different order on different nodes. Causal consistency is weaker than sequential consistency [2] but stronger than eventual consistency [3]. See an earlier blog for a more detailed description of eventual consistency: https://blog.mariadb.org/eventually-consistent-databases-state-of-the-art/.

When a transaction performs a read operation followed later by a write operation, even on a different object, the read is said to be causally ordered before the write. This is because the value created by the write may have been dependent upon the result of the read. Similarly, a read operation is causally ordered after an earlier write on the same object that stored the data retrieved by the read. Also, two write operations performed by the same node are defined to be causally ordered, in the order they were performed. Intuitively, after writing value v into object x, a node knows that a read of x would give v, so a later write can be said to be (potentially) causally related to the earlier one. Finally, the causal order is transitive: if operation A is (causally) ordered before B, and B is ordered before C, then A is ordered before C.

Operations that are not causally related, even through other operations, are said to be concurrent. Consider the following example:
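
A sketch of such an execution, where wi[x=v] writes value v into x and ri[x=v] reads value v from x; the writes w1 and w2 are issued by different transactions with no causal relation between them:

    T1: w1[x=1]
    T2: w2[x=2]
    T3:          r3[x=2]   r4[x=1]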

This execution is causally consistent, because read operation r3[x] is causally dependent on write w2[x] and read operation r4[x] is causally dependent on write w1[x]. Note that concurrent write operations may be seen in a different order by different transactions or nodes.

Implementing Causal Consistency

Causal consistency can be reached by using Lamport clocks [4] or version vectors [5]. The causal consistency model is implemented using multi-part timestamps, which are assigned to each object. These timestamps are stored in a vector that contains the version number of the object at each replica. This vector must be included (in the form of dependencies) in all update and query requests, so that operations respect the causal ordering: an operation A can only be processed at a given node if all operations on which A causally depends have already been applied at that node.

Each element in the vector clock corresponds to one host, and the value of the element indicates the number of messages sent from that host. The vector clock information is used to order or delay the delivery of messages if necessary, thus maintaining the required consistency. However, for maintaining the consistency of data items, we need information about the writes on each data item, and maintaining a clock per data item helps here. Therefore, instead of a vector clock of size N (the number of hosts), we maintain a vector of size M (the number of objects). The value v[i] in the vector v contains the number of writes on data item i.

Each host maintains a vector of size M. Whenever it updates the value of an item, it increments the corresponding element and sends the vector, along with the message containing the data item and its new value, to every site that has a copy of the replica. When a host receives an update message, it delays the delivery of the message until each element in its own vector is greater than or equal to the corresponding piggybacked one. After that, the updates to the data items are applied. The message overhead is thus O(M), and therefore independent of the number of hosts in the system.

If each message is an update message, it carries the new value of the data item rather than instructions. Then the delivery of an update on a data item does not need to wait for the previous updates on the same item. This would not have been possible if plain vector clocks had been used: in that case, the delivery of a message would have been delayed even for previous messages that are causally overwritten.
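
A minimal sketch in C of this per-item scheme, under the stated assumptions (M data items, every host replicates all items, reliable message transport; all names are hypothetical):

    #include <stdbool.h>
    #include <string.h>

    #define M 4                       /* number of data items */

    typedef struct {
        int item;                     /* data item written          */
        int value;                    /* new value of the item      */
        int deps[M];                  /* sender's vector at write   */
    } update_t;

    static int vec[M];                /* writes applied locally */
    static int store[M];              /* current item values    */

    /* Local write: bump the item's counter and piggyback the vector. */
    update_t local_write(int item, int value)
    {
        update_t u;
        store[item] = value;
        vec[item]++;
        u.item = item;
        u.value = value;
        memcpy(u.deps, vec, sizeof u.deps);
        return u;                     /* sent to every replica */
    }

    /* Deliverable once all causally earlier writes to OTHER items have
       been applied; earlier writes to the same item need not be waited
       for, because the message carries the full new value. */
    bool deliverable(const update_t *u)
    {
        for (int i = 0; i < M; i++)
            if (i != u->item && u->deps[i] > vec[i])
                return false;
        return true;
    }

    void deliver(const update_t *u)   /* call only when deliverable() */
    {
        if (u->deps[u->item] > vec[u->item]) {
            store[u->item] = u->value;
            vec[u->item] = u->deps[u->item];
        }                             /* else: causally overwritten, drop */
    }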

Applications and Databases using Causal Consistency

COPS and Eiger

The COPS system [6] (Clusters of Order-Preserving Servers) introduces causal+ consistency and is designed to support complex online applications that are hosted in a small number of large-scale data centers, each of which is composed of front-end servers (clients of COPS) and back-end key-value data stores. Eiger [7] has a similar design but a different implementation.

COPS and Eiger support causality through a client library. Both systems replicate writes to geographically distributed data centers and enforce the observed ordering. The observed ordering is enforced by delaying write operations until all causally previous operations have already been applied at that data center.

COPS executes all read and write operations in the local data center in a linearizable [8] fashion, and then replicates data across data centers in a causal+ consistent order in the background. Figure 1 describes the high level architecture of COPS.

 


Figure 1: Architecture of COPS and Eiger [6].

Similarly, the Eiger system provides linearizability inside each data center and a causally-consistent data store based on a column-family data model, to achieve better performance in a geo-distributed setting. All operations from clients are served from the local data center using a client library. The library mediates access to nodes in the local data center, executes the read and write transaction algorithms, and tracks causality and attaches dependencies to write operations. Each replica stores a full copy of the database, and operations are handled locally. After an operation is executed locally, it is asynchronously pushed to the remote data centers, but committed only after all causally dependent operations have been committed before it.

Bolt-on

Bailis et al. in [9] propose client-side middleware called Bolt-on. This middleware guarantees only application-defined dependencies, as an alternative to full causal consistency. Figure 2 describes the architecture of the Bolt-on middleware.

 


Figure 2: Architecture of Bolt-on [9].

The Bolt-on architecture assumes that the underlying data store handles most aspects of data management, including replication, availability, and convergence. In this architecture, the underlying data store locally uses eventual consistency and allows a large space of read and write histories; the middleware handles the causal dependencies, and consists of a set of rules which limit the possible histories to those that obey the desired consistency model.

Application: MMORPG

In Massively Multiplayer Online Role-Playing Games (MMORPG), players can cooperate with others in a virtual game world, and both the players and the different game worlds are naturally distributed. These systems manage large amounts of data, and the biggest problem is how to support data consistency. According to the CAP theorem, we have to sacrifice one of two properties: consistency or availability [10]. If an online game does not guarantee availability, players' requests may fail. If data is inconsistent, players may get data that does not conform to the game logic, and this data can affect their operations. Therefore, it is important in the MMORPG environment to find a balance between data consistency and system availability. For this reason, we must analyze the data consistency requirements of MMORPGs so as to find this balance [10].

Diao [10] has studied different consistency models for MMORPGs and found that there is indeed a part of the data for which causal consistency is an appealing choice: game data. The game data contains e.g. the world appearance, the metadata of non-player characters (characters created by the game developers and controlled only by the game logic), the system configuration and the game rules. This data is used by the players and the game engine during the entire game, but can be modified only by the game developers. Consistency requirements for the game data are not as strict as e.g. for the account data, because e.g. a change of a non-player character's name or of the duration of a bird animation may not be noticed by players.

Furthermore, some changes to the game data need to be delivered to all online players synchronously, e.g. a change of the world appearance, the weapon power, non-player characters, game rules and scripts. If there is inconsistency in these areas, it will cause errors in the game display and logic errors for the players. Therefore, some data needs to be stored on the server side and some on the client side. The game data on the client side can only synchronize with the servers when a player logs in to or starts a game. For this reason, causal consistency is required [10].

This could mean that when a player A uses the browser to connect to the game server, the game server will check the current local data and update the game data on the client side in the form of data packets. After updating, all future local accesses will return the updated value. Player B, who has not communicated with the game server, will still retain the outdated game data.

Game servers maintain the primary version of the game data and transfer it to the client side. Additionally, players in different game worlds cannot talk to each other. Thus, the only requirement is to make sure that the game data is consistent within one game world at a time, so that all players in that game world are treated equally. This requires strong consistency locally within a game world, and causal consistency among different game worlds. When the game data is modified by the developers, the updated value should be delivered synchronously to all replicas in that game world, and asynchronously to the other game worlds.

While the possibility of using causal consistency in MMORPGs has been identified in research [10], to the authors' knowledge there are no publications or other information indicating that causal consistency is actually used in MMORPG games (e.g. Lord of the Rings Online).

Application: Facebook

Facebook is an online social networking service. After registering to use the site, users can create a user profile, add other users as friends, exchange messages, post status updates and photos, share videos, and receive notifications when others update their profiles.

When you log into your account on Facebook, the server will show your own status messages and your friends' status messages as of that point in time. Status messages on Facebook may contain pictures, shared links and stories, or your own messages. Naturally, your account data requires strong consistency, but for status data weaker consistency models are acceptable. While a user is online, the status updates of the user and of the user's friends do not need to be strictly ordered; causal ordering is enough.

Thus, when a user A posts a status update and a user B replies to that update, there is a causal order between the two updates. However, when users C and D make totally unrelated updates, the order in which these updates appear to users A and B is not relevant, because users A and B do not know in which order the updates were performed.

The reason why eventual consistency is not enough for Facebook status updates is that eventual consistency does not require any ordering between writes. Consider a case where the user A first posts a status update, and after a few seconds A edits that status update. With eventual consistency, all friends of A could see only the first version, because eventual consistency does not guarantee that the first update is applied before the second one. With causal consistency, as there is a read (by user A) of the first update and then a write (the updated status from user A), these are causally related, and all of user A's friends will see the second update.

Although causal consistency is a possible consistency model for Facebook status updates and for several similar distributed services containing status updates, like LinkedIn, Twitter and Yahoo, to the authors' knowledge there is no scientific or other literature that would show causal consistency really being used.

Conclusions

The causal consistency model can be enforced with Lamport clocks. Transactions using causal consistency are executed in an order that reflects the order of their causally-related read/write operations. Concurrent operations may be committed in different orders, and their results may also be read in different orders.

Causal consistency can solve many problems that cannot be solved with eventual consistency, such as ordering operations. Causal consistency ensures that every node sees causally related operations in the same order, and this makes causal consistency stronger than eventual consistency. However, causal consistency cannot support e.g. distributed integrity constraints.

Although a few promising systems have been developed in research, to the authors' knowledge there are no commercial or mature systems using the causal consistency model. For a more thorough discussion, see [11].

References

[1] Mustaque Ahamad, Gil Neiger, James E. Burns, Prince Kohli, P.W. Hutto: Causal Memory: Definitions, Implementation and Programming, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.3356

[2] Leslie Lamport: How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs, IEEE Trans. Comput., C-28:9, pp. 690–691, Sept. 1979.

[3] Werner Vogels: Eventually consistent, Communications of the ACM, 52:1, pp. 40–44, 2009. doi:10.1145/1435417.1435432

[4] Leslie Lamport: Time, clocks, and the ordering of events in a distributed system, Communications of the ACM, 21:7, pp. 558–565, 1978.

[5] C. J. Fidge: Timestamps in message passing systems that preserve the partial ordering, in Theoretical Computer Science, 1988.

[6] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen: Don’t settle for eventual: scalable causal consistency for wide-area storage with COPS, in Proceedings of the 23rd ACM Symposium on Operating Systems Principles, pp. 401–416, Portugal, October 23-26, 2011.

[7] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen: Stronger semantics for low-latency geo-replicated storage, in Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation, pp. 313–328, Lombard, IL, USA, April 2-5, 2013.

[8] M. Herlihy and J. M. Wing: Linearizability: a correctness condition for concurrent objects, ACM Trans. Program. Lang. Syst., 12:3, pp. 463–492, 1990.

[9] P. Bailis, A. Ghodsi, J. M. Hellerstein, and I. Stoica: Bolt-on causal consistency, in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 761–772, New York, NY, USA, June 22-27, 2013.

[10] Z. Diao: Consistency models for cloud-based online games: the storage system's perspective, in 25th Workshop on Grundlagen von Datenbanken, pp. 16–21, Ilmenau, Germany, May 28-31, 2013.

[11] Mawahib Musa Elbushra, Jan Lindström: Causal Consistent Databases, Open Journal of Databases (OJDB), 2:1, http://www.ronpub.com/publications/OJDB_2015v2i1n02_Elbushra.pdf