At the TheServerSide Java Symposium in Barcelona two weeks ago, I took part in the High Performance Architecture panel.
Some of the questions raised in the panel's discussion strengthened my realization that the term "scalability" means different things to different people. For example, people often confuse performance and scalability. Others refer to scalability as a measure of how well an application can be optimized to use additional processing power given to it in the form of more CPUs or cores. Grid vendors often refer to scalability as a measure of parallelizing your application across different machines. Data Grid vendors refer to scalability as a way to remove the data bottleneck by scaling out the data.
In a sense, all of them are correct – scalability is a multidimensional topic. What many fail to realize is that each of the different solutions (additional hardware, grid, data grids) is just an optimization that can improve application scalability, but doesn't really address the scalability issue at its source.
In his excellent book Pro-JavaEE, Steve Haines discusses the topic of scalability and performance. Here's Steve's definition of Scalability vs Performance:
"The terms “performance” and “scalability” are commonly used interchangeably, but the two are distinct: performance measures the speed with which a single request can be executed, while scalability measures the ability of a request to maintain its performance under increasing load. For example, the performance of a request may be reported as generating a valid response within three seconds, but the scalability of the request measures the request’s ability to maintain that three-second response time as the user load increases."
Steve also provides various real-life examples of the implications of scalability in the chapter Performance and Scalability Testing. The clear message that comes out of it is the impact of the architecture on scalability: when it comes to scalability, you're only as strong as your weakest link.
Achieving true linear scalability:
A linearly scalable application is an application that can scale just by adding more machines and/or CPUs, without changing the application code.
So how do we achieve true linear scalability?
Dan Creswell provides a good summary in his recent post Amazon on Data Storage, which covers how Amazon approached this challenge. In an earlier post, Dodging the Concurrency Bullet, he provided a good summary of some of the core principles for achieving scalability in a stateful environment:
Any time we have a piece of state that needs to be accessed concurrently we hit problems. One can hide this problem using messaging (or similar) but the key aspect in these solutions is that we can partition operations into streams against discrete elements of data (a discrete element could be a group of things) that don’t interfere with each other. Partitioning however can be problematic:
1. Our data has to be amenable to partitioning via hashing or some other method.
2. It gets tricky when we need to deal with availability and disaster recovery.
3. Getting the correct granularity of partitioning can be challenging.
What Dan is describing is closely related to Amdahl's Law, which says that the serialized portion of a program caps the speedup you can get from adding processors. For example, suppose only 10% of a given function is synchronized and its throughput on a single CPU is 100 messages per second. Without any synchronization, 10 CPUs would be enough to increase throughput by a factor of 10, to 1,000 msg/sec. With 10% synchronized, 10 CPUs yield only about a 5.3x gain, 100 CPUs yield about 9.2x, and the 10x mark is a ceiling that can be approached but never actually reached, no matter how much hardware is added. In reality most existing applications are stateful and therefore require some synchronization in their code. This means that the throughput gain these applications can expect from throwing more hardware at the problem is going to be fairly low. In other words, if you want to achieve true linear scalability in a stateful environment, you must design your application in a distributed/partitioned fashion.
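To make the arithmetic concrete, here is a minimal sketch of Amdahl's Law in Java. The class and method names (AmdahlSketch, speedup) are mine, made up for illustration; the 10% serial fraction and the 100 msg/sec baseline are the figures from the example above. It assumes the idealized form of the law, speedup = 1 / (s + (1 - s) / n), and ignores any coordination overhead between CPUs.

```java
public class AmdahlSketch {

    /**
     * Idealized Amdahl's Law: the speedup on `cpus` processors when
     * `serialFraction` of the work (0.0 to 1.0) cannot run in parallel.
     */
    public static double speedup(double serialFraction, int cpus) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / cpus);
    }

    public static void main(String[] args) {
        double serialFraction = 0.10;  // 10% of the function is synchronized
        double baseThroughput = 100.0; // msg/sec on a single CPU

        for (int cpus : new int[] {1, 10, 100, 1000}) {
            double s = speedup(serialFraction, cpus);
            System.out.printf("%5d CPUs -> speedup %.2f, about %.0f msg/sec%n",
                    cpus, s, s * baseThroughput);
        }
    }
}
```

Running it shows the curve flattening: roughly 5.3x at 10 CPUs and 9.2x at 100 CPUs, with the gain creeping toward 10x but never getting there.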
Dan then makes a comment that summarizes well the challenges in applying such a pattern:
...we got rid of our concurrency problem and swapped it for a partitioning problem which then turned into something of an exotic problem. Are we any better off?
While it's not possible to completely eliminate these challenges, we can certainly simplify the solution for overcoming them. The key is to hide the complexity of partitioning and the details required to achieve parallelization as much as possible from the business-logic and push it to the middleware stack.
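As a rough illustration of what "hiding the partitioning" can look like, here is a small Java sketch. It is not the GigaSpaces API or anyone's production design; PartitionedRouter, partitionFor and add are hypothetical names, and it runs in a single JVM with one thread standing in for each partition. The point is only that the caller passes a key and never sees a lock: the router hashes the key to a partition, and each partition's state is touched by exactly one thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: routes work by key so each partition's state has a single writer. */
public class PartitionedRouter implements AutoCloseable {

    private final List<ExecutorService> workers = new ArrayList<>();
    private final List<Map<String, Long>> state = new ArrayList<>();

    public PartitionedRouter(int partitions) {
        for (int i = 0; i < partitions; i++) {
            workers.add(Executors.newSingleThreadExecutor());
            // A plain HashMap is enough: only this partition's worker thread uses it.
            state.add(new HashMap<>());
        }
    }

    /** Maps a key to a partition by hashing; floorMod keeps the index non-negative. */
    public int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), workers.size());
    }

    /** Business-logic call: adds `amount` to the key's running total, with no locks here. */
    public Future<Long> add(String key, long amount) {
        int p = partitionFor(key);
        Map<String, Long> local = state.get(p);
        return workers.get(p).submit(() -> local.merge(key, amount, Long::sum));
    }

    @Override
    public void close() {
        workers.forEach(ExecutorService::shutdown);
    }

    public static void main(String[] args) throws Exception {
        try (PartitionedRouter router = new PartitionedRouter(4)) {
            router.add("order-1", 5);
            router.add("order-2", 7);
            // Same key, same partition, same thread: updates are applied in order.
            System.out.println(router.add("order-1", 3).get()); // prints 8
        }
    }
}
```

Operations on different keys can proceed in parallel as long as they hash to different partitions, while operations on the same key are serialized by the partition's single thread. The challenges Dan lists (keys that are amenable to hashing, availability, choosing the right granularity) are still there; the sketch only shows where they would live once they are moved out of the business logic.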
Obviously, if the middleware itself is not inherently designed to solve this sort of challenge, as is the case with most existing tier-based implementations, there is only so much you can hide. We can't assume that we can simply take existing middleware implementations that were designed for a tier-based approach and make them fit the partitioned/scale-out model. We have to approach the implementation of the middleware itself in a different way.
I recently published the Scalability Revolution white paper, which covers in depth a proposed pattern and architecture, named Space-Based Architecture, for achieving linear scalability in a stateful environment. The paper discusses the existing bottlenecks within today's typical middleware stack and how those bottlenecks can be overcome by changing the underlying implementation of the core middleware components, namely messaging, data and processing.
To make the transition simple, I tried to keep the way the application interacts with those middleware components pretty much the same as it does today: this is one of the keys for easing the transition from tier-based implementations to a scale-out model.
The principles behind this pattern are not new, and are based on the same principles that Google, Amazon, eBay and others have used to achieve linear scalability. The big guys have all built their own middleware stacks to address this. Knowing that most of us can't afford the same investment, our focus at GigaSpaces has been to provide ways to make the building of scalable applications as simple as possible, as simple as Spring. If you're interested in trying it out and writing your own Amazon/Google-like application in an hour, try out our new release, GigaSpaces eXtreme Application Platform 6.0.