Last week I took part in an interesting discussion with a group of architects, and the question of build vs. buy came up, specifically in the context of the recent experience of many new Internet companies. I was wondering why so many of them seem to spend so much on developing their own proprietary infrastructure, when it's clear that their needs are not that unique and that such development is not really part of their core IP. Many of them seem to go through one difficult experience after another until they get their infrastructure right, and they all seem to stumble into the same pitfalls along the way.
The typical answers as to why they build vs. buy were:
- It's core to our intellectual property, and therefore we have to own all of our infrastructure
- We didn't find a solution that fits our needs, since our needs are very unique
- We had a bad experience with Product FooBar, which made us reevaluate build vs. buy
I can see how I'd react in exactly the same way; the most basic human instinct, when entering uncharted territory, is to rely only on yourself.
But looking at the amount of repeated failure over the past few years, it's pretty clear that this pattern isn't proving itself too well either. Even when we choose to build in-house, according to our own specific requirements, we still fall into the same trap over and over again.
Where do we draw the line between build and buy?
To answer that question, I went back to Fred Brooks's article "No Silver Bullet," which one of our lead architects had pointed out to me again a few weeks ago.
One of the interesting points was the drastic impact of the economy on the build vs. buy decision pattern:
"The development of the mass market is, I believe, the most profound long-run trend in software engineering. The cost of software has always been development cost, not replication cost. Sharing that cost among even a few users radically cuts the per-user cost. Another way of looking at it is that the use of N copies of a software system effectively multiplies the productivity of its developers by N. That is an enhancement of the productivity of the discipline and of the nation.
The key issue, of course, is applicability. Can I use an available off-the-shelf package to perform my task? A surprising thing has happened here. During the 1950's and 1960's, study after study showed that users would not use off-the-shelf packages for payroll, inventory control, accounts receivable, and so on. The requirements were too specialized, the case-to-case variation too high. During the 1980's, we find such packages in high demand and widespread use.
What has changed? Not the packages, really. They may be somewhat more generalized and somewhat more customizable than before, but not much. Not the applications, either. If anything, the business and scientific needs of today are more diverse and complicated than those of 20 years ago.
The big change has been in the hardware/software cost ratio. In 1960, the buyer of a two-million dollar machine would have felt that he could afford $250,000 more for a customized payroll program, one that slipped easily and nondisruptively into the computer-hostile social environment. Today, the buyer of a $50,000 office machine cannot conceivably afford a customized payroll program, so he adapts the payroll procedure to the packages available. Computers are now so commonplace, if not yet so beloved, that the adaptations are accepted as a matter of course."
The impact of cloud computing on the build vs. buy decision
I think Fred's analysis above is much more than just a historic curiosity. Exactly the same process is playing out today, with the advent of cloud computing and virtualization techniques that are turning IT infrastructure into a commodity, on the road to becoming a utility, and dramatically reducing its total cost.
As Fred says in his paper, when the hardware gets cheap, development becomes the dominant cost. Under these new conditions, we're all going to have to change how we evaluate off-the-shelf products against the alternative of developing in-house. Proper TCO (total cost of ownership) measurements need to be put in place at an early stage of the decision-making process.
For example, it will no longer be sufficient to choose a product based on "best performance" or even "best reliability," because each of those factors carries a direct cost. Instead, we are forced to build a clearer picture of the business requirements, so that we can choose the right product to meet our business needs. The best product from a technical perspective won't always be the right product, and neither will the cheapest one.
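To make this concrete, here is a minimal sketch of what a TCO-driven comparison might look like. All product names and numbers below are hypothetical assumptions, invented purely for illustration; the point is that business fit is checked first, and cost is totaled over a planning horizon rather than read off the license price.

```python
# Hypothetical sketch: comparing candidate products by total cost of
# ownership (TCO) over a planning horizon, not by raw technical merit.
# Every name and number here is an illustrative assumption.

def tco(license_cost, yearly_ops_cost, integration_cost, years=3):
    """Total cost of ownership over the planning horizon."""
    return license_cost + integration_cost + yearly_ops_cost * years

candidates = {
    # product: (license, yearly ops, integration, meets business needs?)
    "best-of-breed": (200_000, 50_000, 30_000, True),
    "cheapest":      (20_000, 90_000, 120_000, False),  # hidden ops cost
    "good-enough":   (80_000, 40_000, 40_000, True),
}

# Filter out products that fail the business requirements first,
# then pick the lowest TCO among the rest.
viable = {name: tco(lic, ops, integ)
          for name, (lic, ops, integ, fits) in candidates.items() if fits}
best = min(viable, key=viable.get)
print(best, viable[best])  # the technically "best" product loses on TCO
```

Note that the "cheapest" option never even enters the comparison: a product that fails the business requirements has no meaningful TCO, which is exactly why the requirements picture has to come first.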
"The hardest single part of building a software system is deciding precisely what to build. No other part of the conceptual work is as difficult as establishing the detailed technical requirements, including all the interfaces to people, to machines, and to other software systems. No other part of the work so cripples the resulting system if done wrong. No other part is more difficult to rectify later."
It is quite surprising how much of the current decision-making process is not based on real business requirements. It is even more surprising how little we, as architects and business people, know about our own system requirements and real application behavior.
A good example given in the architect meeting concerned user experience. One participant said that at one point he focused on serving his site's pages with the lowest possible latency, and did a good job of it. Yet when measured against a competing site whose pages loaded more slowly, users felt the competing site performed better. The reason was simple: the other site had focused on the overall user experience, which meant fewer clicks to complete a task, rather than on how quickly any single request executed.
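The anecdote above reduces to simple arithmetic. The numbers here are made up for illustration, but they show how what the user perceives is roughly clicks-per-task times latency-per-click, so a site with slower pages but fewer clicks can still feel faster end to end:

```python
# Illustrative numbers only: the user's perceived time to finish a task
# is roughly (clicks needed) * (latency per click).
fast_site = {"latency_ms": 200, "clicks_per_task": 6}  # fast pages, many clicks
slow_site = {"latency_ms": 350, "clicks_per_task": 2}  # slow pages, few clicks

def perceived_ms(site):
    """End-to-end time the user experiences for one task."""
    return site["latency_ms"] * site["clicks_per_task"]

print(perceived_ms(fast_site))  # 1200 ms end to end
print(perceived_ms(slow_site))  # 700 ms: the "slower" site feels faster
```

Optimizing single-request latency optimizes only one factor of that product; the requirement that actually matters to the business is the whole task.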
If using off-the-shelf products can cut costs dramatically, why are there so many product failures?
Fred provides an interesting answer to that question as well:
"Much of present-day software-acquisition procedure rests upon the assumption that one can specify a satisfactory system in advance, get bids for its construction, have it built, and install it. I think this assumption is fundamentally wrong, and that many software-acquisition problems spring from that fallacy. Hence, they cannot be fixed without fundamental revision--revision that provides for iterative development and specification of prototypes and products."
Final words
You might be thinking by now that these are all new lessons learned from the recent changes in the economy, right? Wrong. Go check when Fred Brooks's article was written.
If anything, I would strongly recommend that everyone reading this post spend time reading Fred's article from start to finish, because I've only covered a small part of the philosophy behind his paper. I think the paper's viewpoint is extremely relevant today -- perhaps even more relevant than it was when he originally wrote it.