One of my earlier blog posts (on the GigaSpaces Blog) was "Persistence and the reliability myth". Since the inception of GigaSpaces, I've been talking about alternative approaches to reliability and scalability in stateful environments, other than the traditional database-centric approach.
It is very encouraging to see that what used to be said behind closed doors is now becoming common knowledge.
I recently came across a piece by Michael Stonebraker. Michael was the main architect of the Ingres prototype project at UC Berkeley, which began nine years before Oracle was released and more than a decade before the commercial version of Ingres was released. In One size fits all: A concept whose time has come and gone, he writes:
"What I see happening is that the large database vendors, whom I’ll call the elephants, are selling a one-size-fits-all, 30-year-old architecture that dates from somewhere in the late 1970s. Way back then the technological requirements were very different; the machines and hardware architectures that we were running on were very different. Also, the only application was business data processing. For example, there was no embedded database market to speak of. And there was no data warehouse market. Today, there are a variety of different markets with very different computing requirements, and the vendors are still selling the same one-size-fits-all architecture from 25 years ago. There are at least half a dozen or so vertical markets in which the one-size-fits-all technology can be beaten by one to two orders of magnitude, which is enough to make it interesting for a startup. So I think the aging legacy code lines that the major elephants have are presenting a great opportunity, as are the substantial number of new markets that are becoming available."
Arnon Rotem-Gal-Oz recently wrote a few interesting blog posts on this topic, including The RDBMS is legacy, in which he reviews Stonebraker's article. I particularly like how Arnon summarizes the topic in his The RDBMS is dead post:
RDBMSs succeeded in becoming the de-facto standard for building systems because they offer some very compelling attributes - ACID brings a lot of peace of mind. Large-scale systems, low-latency systems and fault-tolerant systems opt for another set of compelling attributes (BASE). The point is that when you design your next solution, conventional database thinking is something you should at least give another thought to, instead of just following dogma.
As you can imagine, I'm in violent agreement with both Arnon and Stonebraker. Although the database is not going to disappear, it is clearly not the best fit for the scalability requirements of many of today's applications. Having said that, the question is where and how it will fit into the architecture of this new class of applications.
To answer that, we need to look at what the database brings to the table in this context:
- Transactions – the database manages our ACID transactions
- Durability – the database serves as a transaction-safe persistent storage on top of a file system
I believe that the first role can be addressed today without the database, by using an In-Memory Data Grid (IMDG). This means that I can manage transactions without even going to the disk. See my recent post: Lessons from Pat Helland: Life Beyond Distributed Transactions. In any case, databases should still be used for durability purposes, as they do a reasonably good job at that.
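To make the first point concrete, here is a deliberately minimal sketch of the idea: transactional state lives in memory, and the commit path never touches a disk. This is not the GigaSpaces API - the class, its methods, and the coarse-grained locking are all illustrative; a real IMDG adds partitioning, replication, and a proper transaction manager.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Toy in-memory "data grid" partition. All state is held in memory,
 * so committing a transfer involves no disk I/O at all.
 */
public class InMemoryAccounts {
    private final Map<String, Long> balances = new ConcurrentHashMap<>();

    public void deposit(String account, long amount) {
        balances.merge(account, amount, Long::sum);
    }

    /** Atomic in-memory "transaction": either both balances change or neither does. */
    public synchronized boolean transfer(String from, String to, long amount) {
        long fromBalance = balances.getOrDefault(from, 0L);
        if (fromBalance < amount) {
            return false; // "rollback": nothing was changed
        }
        balances.put(from, fromBalance - amount);
        balances.merge(to, amount, Long::sum);
        return true;
    }

    public long balanceOf(String account) {
        return balances.getOrDefault(account, 0L);
    }
}
```

The point of the sketch is the commit path, not the data structure: atomicity here comes from in-memory coordination, and the grid's replication (not shown) is what stands in for the redo log.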
The challenge is how to combine the database and the Data Grid in a way that fits a large-scale environment. I suggest the concept of Persistence as a Service (PaaS), in which the database plays a role similar to that of a Data Warehouse. In other words, it moves a step back to play more of a background role in the data-processing food chain.
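One common way to realize this "database in the background" idea is a write-behind pattern: updates commit in memory, and a background worker drains them to the database off the caller's critical path. The sketch below is an assumption about how such a combination could look, not a description of any product; the DatabaseWriter interface is hypothetical, standing in for any JDBC-backed store.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Write-behind sketch: puts commit in memory immediately; durability
 * happens asynchronously, so the database never gates the caller.
 */
public class WriteBehindStore {
    /** Hypothetical sink for the background persistence service. */
    public interface DatabaseWriter {
        void persist(String key, String value);
    }

    private final Map<String, String> grid = new ConcurrentHashMap<>();
    private final BlockingQueue<String> dirtyKeys = new LinkedBlockingQueue<>();
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();

    public WriteBehindStore(DatabaseWriter db) {
        // Background flusher: drains dirty keys to the database.
        flusher.submit(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    String key = dirtyKeys.take();
                    db.persist(key, grid.get(key));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
    }

    /** Commits in memory; the database write happens later, in the background. */
    public void put(String key, String value) {
        grid.put(key, value);
        dirtyKeys.offer(key);
    }

    public String get(String key) {
        return grid.get(key); // reads are served from memory, never the database
    }

    public void shutdown() {
        flusher.shutdownNow();
    }
}
```

Note the trade-off this sketch makes visible: the database lags the grid slightly, which is exactly the data-warehouse-like background role described above - acceptable for durability and reporting, not for serving live reads.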
I'll cover this in more detail in my next post.