Matthew Dublin is a senior writer at Genome Technology.
A group of researchers at Yale University may have leveled the playing field for database performance. What's interesting is that their solution is not, technically, even a database.
Called "Calvin," their new "database" is actually a transaction scheduling and replication coordination service that its developers say could provide a nice alternative to the pricey distributed relational databases offered by Oracle and IBM.
Two of Calvin's developers, Daniel Abadi and Alexander Thomson, raise some interesting questions on their blog that were also part of the impetus for developing Calvin:
Why are Oracle's 11g and 10g databases, as well as versions of IBM's DB2 database (technologies that were developed several decades ago), still at the top of the TPC-C list? What about all the new general-purpose database management system technologies that can supposedly scale easily?
The reason is that scalability cannot be achieved without a few sacrifices — some quite painful — which they detail in their blog post.
So how does Calvin compare to SQL, "NewSQL" (various new scalable/high performance SQL database vendors), and NoSQL?
Abadi and Thomson write that Calvin should not be compared to any of the three, as they "designed the system to integrate with any data storage layer, relational or otherwise. Calvin allows user transaction code to access the data layer freely, using any data access language or interface supported by the underlying storage engine (so long as Calvin can observe which records user transactions access)."
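To see why a scheduling layer would care which records each transaction touches, consider the toy Python sketch below. It is not Calvin's code or API: the `Txn` type, the `schedule` and `run_replica` functions, and the account data are all hypothetical, and the storage layer is just a dictionary. It assumes each transaction declares its keys up front and that every replica receives the same ordered log. Under those assumptions, transactions that touch disjoint records can be grouped together, conflicting ones keep their log order, and every replica ends in the same state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Store = Dict[str, int]  # stand-in for any storage layer (assumption)


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction that declares the records it accesses."""
    txn_id: int
    keys: FrozenSet[str]
    logic: Callable[[Store], None]


def schedule(batch: List[Txn]) -> List[List[Txn]]:
    """Split an ordered batch into waves of non-conflicting transactions.

    A transaction is placed one wave after the latest wave that touched
    any of its keys, so conflicting transactions keep their log order.
    """
    waves: List[List[Txn]] = []
    last_wave: Dict[str, int] = {}  # key -> index of last wave touching it
    for txn in batch:
        idx = max((last_wave[k] + 1 for k in txn.keys if k in last_wave),
                  default=0)
        if idx == len(waves):
            waves.append([])
        waves[idx].append(txn)
        for k in txn.keys:
            last_wave[k] = idx
    return waves


def run_replica(batch: List[Txn], store: Store) -> Store:
    """Apply the same ordered log to one replica's store."""
    for wave in schedule(batch):
        for txn in wave:  # keys are disjoint within a wave
            txn.logic(store)
    return store


def make_transfer(txn_id: int, src: str, dst: str, amount: int) -> Txn:
    """Build a transfer that only runs if the source has enough funds."""
    def logic(store: Store) -> None:
        if store[src] >= amount:
            store[src] -= amount
            store[dst] += amount
    return Txn(txn_id, frozenset({src, dst}), logic)


log = [
    make_transfer(1, "alice", "bob", 30),
    make_transfer(2, "carol", "dave", 10),
    make_transfer(3, "bob", "carol", 50),  # conflicts with both 1 and 2
]
initial = {"alice": 100, "bob": 40, "carol": 20, "dave": 0}

replica_a = run_replica(log, dict(initial))
replica_b = run_replica(log, dict(initial))
print(replica_a == replica_b)  # both replicas saw the same log
```

In this sketch, transactions 1 and 2 land in the same wave because their key sets do not overlap, while transaction 3 waits for both. If the layer could not see the key sets, it would have no basis for that grouping.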
Worthy of note is that Calvin can reportedly run 500,000 transactions per second on 100 Amazon EC2 instances in the cloud and can maintain strongly-consistent, up-to-date 100-node replicas in Amazon’s Europe and US West data centers — at no cost to throughput, they write. Not too shabby.
For more information on Calvin, check out their paper here.