Docker started as just a software container on top of the Linux operating system, which seemed like a simple optimization over a fat hypervisor.
Its disruptive force, however, comes from the fact that it forces us to rethink many layers of the cloud stack, starting from the way we handle configuration management, through the way we handle networking and build systems, and even microservices. Not all of this is directly related to Docker per se, but that’s the difference between thinking of Docker as a container and thinking of Docker as a change agent, or a movement.
Aside from Docker’s traditional analogy to the shipping world and how containerization changed the maritime landscape, to me a similar analogy is the move from brick-and-mortar to glass-and-metal buildings. You could think of glass and metal as just another way of constructing the same buildings and houses we’ve always had. However, the introduction of glass-and-metal design changed the entire landscape and standard of city life. The fact that with this new approach we can build entire buildings in a fraction of the time, rising to heights far beyond what the former construction methods allowed, is much more than an optimization; it has led to a complete disruption and a renewed way of thinking about architecture and design principles.
In this way, we could choose to think of Docker as a simple optimization for building a PaaS - instead of using buildpacks to ship our software into a PaaS, we can now use Docker images instead, right?
The problem with that thinking lies at its root - thinking of Docker and containers as yet another software packaging tool is analogous to thinking that glass and metal are just another form of bricks.
Anyway, what brought me to write about this again is an interesting Twitter thread that developed on the subject over the previous weekend, triggered by one of my earlier posts on the subject - Why do I need PaaS if I use Docker?
A basic summary of what was discussed: can Docker be considered a platform without supporting services? Is orchestration the missing link? Is Docker a viable VM replacement? Is PaaS just a buzzword, or does it actually constitute anything beyond IaaS with orchestration implemented at various layers? And what really is the difference between abstraction and automation?
So, why is this *such* a heated debate?
To understand why this is such a heated debate, we need to understand the various players in the discussion, as their perspectives are very much influenced by whether they are a cloud provider, a PaaS provider, an orchestration provider, or a container provider.
Mapping the different players and their approach to this trend.
The cloud providers’ perspective:
The most interesting perspective, IMO, is that of the cloud providers. Cloud providers like Amazon and Google already provide a PaaS - Elastic Beanstalk and GAE respectively - yet both recently announced new offerings for orchestration and containers as a service that are not tied to their PaaS. Judging by the market reaction, many users have been quite enthusiastic about and in favor of this new direction.
What can we learn from this?
Cloud providers look at PaaS as yet another tool to drive workloads onto their cloud. In the case of AWS, they don’t even charge extra for their PaaS beyond the cost of the infrastructure instances that it uses. Quoting from the AWS pricing page:
“There is no additional charge for Elastic Beanstalk – you only pay for the underlying AWS resources (e.g. Amazon EC2, Amazon S3) that your application consumes.”
That puts them in a much more pragmatic and unbiased position: they can offer containerization as part of their PaaS offering, or as a new service, as long as it meets their users’ demand and thus drives more utilization onto their infrastructure.
The fact that they decided to offer orchestration as an independent service and not as an extension to their PaaS offering is, IMO, the strongest validation that this is probably the approach that best meets users’ needs.
PaaS providers, on the other hand, make a good amount of money selling PaaS platforms and services. It is therefore clear that when they approach this question they are biased, by definition. They view Docker as a threat, and as a result they try to minimize its real value and position it as a natural evolution of their existing platforms. Their strategy is to declare support for Docker as an underlying container, and to offer containers as a packaging format for applications, similar to buildpacks. Red Hat’s OpenShift has taken this a step further and is planning to switch its underlying orchestration engine to Kubernetes.
To me, all this is a fine progression, but the main question that still remains open is whether PaaS is indeed the right tool for handling more complex application workloads.
This opened up another interesting question - is there enough value left in PaaS if we can use containers with orchestration (the most obvious being Docker orchestration) as an automation and management tool?
That question sparked an interesting debate, which I found fairly surprising, as it seemed to reflect what I view as a somewhat narrow, PaaS-centric view held by many of the PaaS providers, who fail to realize that there is more than one approach to managing applications beyond putting a container abstraction on top of our apps.
What is the difference between PaaS (abstraction) and orchestration (automation)?
Both PaaS and orchestration aim to solve the complexity challenge of deploying and managing apps. Having said that, there is a fundamental difference between the two approaches; let me explain.
PaaS takes an abstraction approach: with abstraction we’re basically hiding complexity by exposing a simpler interface. That approach also comes with an opinionated architecture, i.e. in order to provide a simple interface, applications need to be built and written in a certain way that fits the assumptions behind the design of that platform.
In the case of PaaS you don’t have much control over many of the operational aspects associated with managing your application, for example the way it handles scaling, high availability, performance, monitoring, logging, updates. There is also a much stronger dependency on the platform provider in the choice of language and stack. Of course, some of these are open source and provide a range of plug-ins that allow some degree of extensibility, but at the end of the day you still have to make sure that this fits with the core design of the platform.
The main advantage of this approach is that, as long as the app fits the PaaS design principles, you get a simple way to deploy applications without worrying about the operational aspects. It’s also much simpler to guarantee the behavior of your applications once they have been deployed.
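As a concrete illustration of such an opinionated interface, consider a Heroku-style Procfile: the application only declares its entry point, and the platform decides how the process is run, scaled, monitored, and restarted (the process command below is illustrative):

```
web: gunicorn app:app --bind 0.0.0.0:$PORT
```

Everything that isn’t in this one line - instance sizing, load balancing, log shipping - belongs to the platform, which is exactly the simplicity-for-control trade described above.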
With automation we’re basically taking the same steps that we would have performed manually, and scripting them. By scripting them, we’re achieving a similar outcome, i.e. we can run a complex processes such as application deployment in one command, however the fact that the end result may be similar doesn’t make the two approaches the same as is often argued, let me explain. Kind of like the end doesn’t justify the means, but more to the effect of - the end doesn’t necessarily account for the means.
With automation we run a script, and as such we can actually read the script, and quite often understand the underlying steps that will be executed when we run it. As those steps will often follow the same steps that we would do ourselves, it’s also easier to follow up on these steps and retrace them for troubleshooting purposes. All this is fine, but that’s not the main difference. A script is something that can be shared, cloned, modified or rewritten completely so the degree of control and flexibility is significantly higher than with that of a PaaS/abstraction approach.
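To make this concrete, here is a minimal sketch of the automation approach (the image and container names are hypothetical placeholders): the deployment is just an ordered list of the same Docker commands we would type by hand, so it can be read, retraced for troubleshooting, cloned, and modified like any other script:

```python
import subprocess

# The deployment is just the manual steps, written down in order.
# The "myapp" image/container names are hypothetical placeholders.
STEPS = [
    ["docker", "build", "-t", "myapp:latest", "."],   # package the app
    ["docker", "rm", "-f", "myapp"],                  # remove any old instance
    ["docker", "run", "-d", "--name", "myapp",        # start the new one
     "-p", "8080:8080", "myapp:latest"],
]

def deploy(runner=subprocess.run):
    """Run each step in order; 'runner' is injectable so the script
    itself can be inspected or dry-run without touching Docker."""
    for step in STEPS:
        runner(step)
```

Because the steps are plain data, the whole process can be shared and rewritten, which is exactly the flexibility (and, as discussed below, the fragility) of the automation approach.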
That flexibility also comes at a cost. With automation/scripting it’s much harder to guarantee portability and the behavior of an application, as it often relies on many external dependencies that can break at any given point in time. So in the end, we may still end up with too much complexity. (Unfortunately, when it comes to technology, the usual price of flexibility is complexity.)
So to put it in @nukemberg’s terms: the difference between PaaS and automation/orchestration can be summed up as Magic vs. Black Magic.
Containers to the rescue
This is where the combination of containers and automation comes in handy. Containers allow us to strike a better balance between the degree of control and simplicity, and to reduce the complexity that results from the number of moving parts and dependencies. The right balance is to use automation mostly to handle the dependencies between the services and tiers of an application and to handle policies such as scaling and failover, and less so to install and configure software on RHEL, Ubuntu, Windows, or whatever the environment may be.
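A minimal sketch of that division of labor (the service names and dependency map are hypothetical): the automation only knows which services depend on which and in what order to start them, while everything about how each service is installed and configured lives inside its container image:

```python
# Hypothetical service topology: each service lists what it depends on.
# OS-level install/configuration lives in the container images, not here.
SERVICES = {
    "db": [],
    "api": ["db"],
    "web": ["api"],
}

def start_order(services):
    """Return a start order in which every service follows its dependencies
    (a simple depth-first topological sort over the dependency map)."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in services[name]:
            visit(dep)
        order.append(name)

    for name in services:
        visit(name)
    return order
```

Starting each service in this order would then be a single `docker run` per name, and a scaling or failover policy becomes another small function over the same map, rather than configuration management of the host OS.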
Putting PaaS, orchestration/automation and containers together
PaaS, orchestration/automation, and containers shouldn’t be viewed as alternatives to one another, but rather as complementary layers of a stack, much the way Google and Amazon have approached the same challenge. The diagram below is taken from my previous post on Docker vs. PaaS and outlines how each of the layers is ordered in this new stack.
It is important to note that the PaaS box in the diagram refers to the more traditional PaaS implementations, i.e. Elastic Beanstalk, Heroku, GAE and the like. Both Pivotal/Cloud Foundry and Red Hat/OpenShift are building new versions of their PaaS that will expand into more advanced orchestration and container support.
The diagram above illustrates how I believe the new application deployment stack will shape up when we add orchestration/automation and containers to the mix. This layered approach enables users to choose which layer of the stack to use based on their specific use case, e.g. a PaaS for use cases where they just want to deploy simple apps and not worry about how those apps are managed, or automation/orchestration if they want tighter control over the operational aspects of their apps. Whether this new stack will be packaged under a single platform is less important for the sake of this discussion.
Why is this still disruptive?
So if we can put PaaS, automation/orchestration and containers together why is this still disruptive?
I think the answer is that the combination with containers now allows us to remove a fairly big chunk of the complexity of automating our application deployments. As a result, the difference in complexity between using the PaaS/abstraction approach and the automation/orchestration approach to deploy even a simple application has narrowed significantly.
I therefore think that, given those two options, most users would prefer an approach that is simple enough but leaves them with a higher degree of control. Because of all this, I expect that we will see much broader adoption of containers/orchestration to manage apps, rather than PaaS.