So once again Twitter was down last week for a good chunk of time.
A lot has been said already about Twitter's scalability issues. Many have given Twitter as an anti-pattern of how not to deal
with scalability and have suggested different solutions for scaling it. As Twitter is famously a Ruby-on-Rails
deployment, this case has also been used as a weapon in the language/platform wars between the RoR and Java camps, and to a lesser degree, also with the
LAMP (PHP) camp.
Reading through this summary, a pattern emerges. Twitter is no different than many other web apps that have become overnight successes.
They had a good idea and went to
implement it as quickly as possible. Ruby seemed to be a perfect tool for them
to get there quickly. Success was much bigger and faster
than they imagined. Not surprisingly, the architecture was not
designed for scalability and they are now forced to go through the painful process
of scaling their architecture. There are similar stories about eBay, LinkedIn, MySpace and many other notable web companies.
What's somewhat surprising is that they had to hit almost every
possible bottleneck before realizing that they still
have a scaling issue. For example, they used memcached to address the scalability of their (single!) MySQL database. They tried various messaging solutions (including a Linda-based implementation), and threw a lot of hardware at the problem. Now
it seems they are looking to completely re-write the application, as noted
here.
I hate to repeat myself with this cliché, but with
scalability you're only as strong as your weakest link. Trying
to bypass this problem by plugging in point solutions is not going to cut it. It is
therefore not surprising that those who dealt with scalability challenges
successfully -- such as eBay, Amazon, Google and Yahoo -- went through architecture changes
and eventually reached a similar pattern to the one I described in my summary of the
Qcon conference: Architecture You Always Wondered About: Lessons Learned at Qcon
Scalability -- Doing It Right
- Asynchronous event-driven design: Avoid as much as possible any synchronous interaction with the data or business logic tier. Instead, use an event-driven approach and workflow
- Partitioning/Shards: Design the data model to fit the partitioning model
- Parallel execution: Use parallel execution to get the most out of available resources. A good place to use parallel execution is the processing of user requests. Multiple instances of each service can take the requests from the messaging system and execute them in parallel. Another good place for
parallel processing is using MapReduce for performing aggregated requests on partitioned data
- Replication (read-mostly): In read-mostly scenarios (LinkedIn seems to fall into this category well), database replication can help load-balance the read load by splitting the
read requests among the replicated database nodes
- Consistency without distributed transactions: This was one of the hot topics of the conference,
which also sparked some discussion during one of the panels I participated in. An argument was made that to reach scalability you had to sacrifice consistency and handle consistency in your applications using things such as optimistic locking and asynchronous error-handling. It also assumes that you will need to handle idempotency in your code. My argument was that while this pattern addresses scalability, it creates complexity and is therefore error-prone. During another panel, Dan Pritchett argued that there are ways to avoid this level of complexity and still achieve
the same goal, as I outlined in this blog post.
- Move the database to the background: There was violent agreement that the database
bottleneck can only be solved if database interactions happen in the background. (NOTE: I recently wrote a more detailed post explaining how you can effectively move the database to the background.)
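To make a few of these principles concrete, here is a minimal Python sketch of the first three working together: requests are submitted as fire-and-forget events, routed by key to a partition, and processed in parallel by one worker per partition. It is only an illustration under my own assumptions, not how any of the companies above built it. The names (`Partition`, `submit`, `NUM_PARTITIONS`) are hypothetical, and in-process queues and threads stand in for a real messaging system and separate machines.

```python
import queue
import threading
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical value, chosen only for illustration


def partition_for(key):
    """Map a routing key to a partition index using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


class Partition:
    """One shard: a private inbox, private state, and a single worker thread."""

    def __init__(self):
        self.inbox = queue.Queue()
        self.counts = defaultdict(int)  # touched only by this shard's worker
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        while True:
            event = self.inbox.get()
            try:
                if event is None:  # shutdown sentinel
                    return
                user, _message = event
                # Stand-in for business logic: count messages per user.
                self.counts[user] += 1
            finally:
                self.inbox.task_done()


def submit(partitions, user, message):
    """Asynchronous entry point: enqueue the event and return immediately."""
    partitions[partition_for(user)].inbox.put((user, message))


def run_demo():
    partitions = [Partition() for _ in range(NUM_PARTITIONS)]
    for i in range(1000):
        submit(partitions, f"user{i % 10}", "hello")
    for p in partitions:
        p.inbox.join()  # wait until every queued event has been processed
    totals = {}
    for p in partitions:
        totals.update(p.counts)  # each user lives in exactly one partition
    for p in partitions:
        p.inbox.put(None)  # stop the workers
    return totals
```

Because all events for a given user land on the same partition, each worker can update its own state without locks or distributed transactions. That is the sense in which the data model is designed to fit the partitioning model.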
This brings me to an interesting
story about a startup company that has an ad engine using PHP.
Like many others, when they became successful, they ran into scalability issues and faced a
similar dilemma to the one faced by Twitter. Their story, written by the implementor, a consulting company called Rocketier, is a good example of how
to do it right (in this case, using GigaSpaces):
"As part of a project for a Web 2.0 company, which focuses on the Affiliate Junction market, Rocketier developed a generic connector from PHP to GigaSpaces XAP. The company's product serves as a hub which connects advertisers and publishers in a unique proprietary model, tracking and billing ad views, ad clicks and resulting sales in affiliate networks. The software was originally developed using PHP, however, once the number of customers started to grow, the product hit severe performance and scalability issues. The main bottleneck in the product was the ad view counting mechanism which effectively limited the software to an upper bound of 1M hits per day. Instead of replacing the entire product, Rocketier focused on the performance critical business processes and developed a backend GigaSpaces module, responsible for counting the ad views and clicks. This module was integrated into the software using a custom designed PHP connector, employing COM+ technology. The new solution is easily scaled up and out and can support the company's growing customer base up to 200M hits per day...
...This solution enabled the company to meet its business needs in a short period of time, while keeping a low risk factor due to the fact that only the performance critical business processes were replaced. The chosen technological solution added grid support to the PHP-based product, including: PHP and GS mediation (the aforementioned connector), application persistence, session based continuity, asynchronous programming capabilities and presentation and business logic separation.
To sum up, the company was able to leverage its existing investment and migrate evolutionarily from an initial prototype to a heavy load, production level solution (from Beta to a Post Digg Effect system), in a short time to market and with minimal risk"
Rocketier was kind enough to
publish part of their work on OpenSpaces.org.
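To be clear, what follows is not Rocketier's code and has nothing to do with the GigaSpaces API. It is just a minimal Python sketch of the general idea behind taking a hit counter off the database's critical path: count in memory, and write aggregated totals to the database in the background. The class name `WriteBehindCounter`, the `ad_views` table, and the use of SQLite as a stand-in database are all my own assumptions for illustration.

```python
import sqlite3
import threading
from collections import Counter


class WriteBehindCounter:
    """Counts hits in memory and writes aggregated deltas to the database later."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()
        self._pending = Counter()
        with self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS ad_views ("
                "ad_id TEXT PRIMARY KEY, views INTEGER NOT NULL)"
            )

    def record_view(self, ad_id):
        """Hot path: an in-memory increment, no database round trip."""
        with self._lock:
            self._pending[ad_id] += 1

    def flush(self):
        """Background path: one batched transaction for all pending deltas."""
        with self._lock:
            batch, self._pending = self._pending, Counter()
        if not batch:
            return 0
        with self._conn:  # commits on success, rolls back on error
            for ad_id, delta in batch.items():
                self._conn.execute(
                    "INSERT OR IGNORE INTO ad_views (ad_id, views) VALUES (?, 0)",
                    (ad_id,),
                )
                self._conn.execute(
                    "UPDATE ad_views SET views = views + ? WHERE ad_id = ?",
                    (delta, ad_id),
                )
        return len(batch)


def run_demo():
    conn = sqlite3.connect(":memory:")
    counter = WriteBehindCounter(conn)
    for i in range(10_000):
        counter.record_view(f"ad{i % 3}")
    rows_written = counter.flush()  # in practice, called from a timer or worker
    stored = dict(conn.execute("SELECT ad_id, views FROM ad_views"))
    return rows_written, stored
```

In this sketch ten thousand hits turn into a handful of row updates, which is where the headroom comes from. A production version would also have to decide what happens to the pending counts if a flush fails or the process dies, which is exactly the kind of problem a replicated in-memory data grid is meant to take off your hands.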
We seem to take it for
granted that dealing with scalability is complex. When we
start a new application it's hard to know whether we're ever going to be
successful to the point where the investment in scalability is worth the
effort. At this initial stage the important thing is time-to-market. We want to get
our idea out there as quickly as possible. This is a reasonable desire
as indeed most projects don't take off.
Now imagine what would happen if dealing with scalability wasn't that difficult. That would change the entire decision-making process, and would enable Twitter and many others to start with a scalable architecture from day one, avoiding this painful process.
So the question is: what is required to simplify building a scalable application to the point where it is as simple as building it for a single machine?
From my experience, most challenges have already been faced and dealt with by others -- so the first thing that I did was look at how others (not necessarily in the same industry) addressed this issue.
In this case, storage virtualization is a good example. At first, we used local disks. Local disks tend to fill up quite fast. It was very hard to deal with this problem as it required replacing the disk with a bigger one every time full capacity was reached. IT had to go through this process for every user and every application -- very painful and costly. The solution came in the form of networked storage: NAS (network-attached storage) and SAN (storage area networks). Instead of using a local disk, we use a virtual disk that resides somewhere on the network. The user and the application don't need to be aware of it, because they use a local disk driver that virtualizes the network devices to make them look as if they were just another local disk. The application scales without having to change. We can add and remove devices as we wish with no changes to the application. Later on, if there is a more cost-effective solution available, we can easily replace the devices.
We can apply the same concept of virtualization to the middleware stack -- namely the data, the messaging and the processing -- with the same degree of simplicity. The application interacts with a "proxy" that hides the details of how a message or update operation is routed, how fail-over is handled, how data is partitioned and so on.
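As a rough illustration of what such a proxy might look like from the application's side, here is a minimal Python sketch. The application only ever calls `put` and `get`; partition routing and fail-over happen behind that interface. Everything here is hypothetical (`DataProxy`, `Node`, `NodeDown`, the primary/backup layout), and in-memory dictionaries stand in for remote nodes. It is not the API of any particular product.

```python
import zlib


class NodeDown(Exception):
    """Raised when a node cannot serve a request."""


class Node:
    """Stand-in for a remote data node; `alive` simulates a failure."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

    def put(self, key, value):
        if not self.alive:
            raise NodeDown(self.name)
        self.data[key] = value

    def get(self, key):
        if not self.alive:
            raise NodeDown(self.name)
        return self.data.get(key)


class DataProxy:
    """Looks like a single local store; hides partitioning and fail-over."""

    def __init__(self, partitions):
        # partitions: list of [primary, backup] node pairs
        self._partitions = partitions

    def _replicas(self, key):
        index = zlib.crc32(key.encode("utf-8")) % len(self._partitions)
        return self._partitions[index]

    def put(self, key, value):
        written = 0
        for node in self._replicas(key):  # write to every live replica
            try:
                node.put(key, value)
                written += 1
            except NodeDown:
                continue
        if written == 0:
            raise NodeDown(f"no replica available for {key!r}")

    def get(self, key):
        for node in self._replicas(key):  # primary first, then backup
            try:
                return node.get(key)
            except NodeDown:
                continue
        raise NodeDown(f"no replica available for {key!r}")


def run_demo():
    partitions = [
        [Node("p0-primary"), Node("p0-backup")],
        [Node("p1-primary"), Node("p1-backup")],
    ]
    proxy = DataProxy(partitions)
    for i in range(20):
        proxy.put(f"user{i}", i)
    for primary, _backup in partitions:
        primary.alive = False  # simulate losing every primary
    return [proxy.get(f"user{i}") for i in range(20)]
```

The calling code does not change when partitions are added or a primary goes down, which is the same property the local disk driver gives us in the storage example. The sketch deliberately leaves out the hard parts, such as bringing a recovered node back in sync or repartitioning data when the number of partitions changes.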
With services such as Amazon EC2, and other cloud environments, this can be made even simpler, as we can have a pre-configured image and hardware ready for deployment. All we need to do is just deploy our business logic. (See an EC2 example here).