I first heard the term NFV a year ago, while working with Alcatel-Lucent on a writeup about CloudBand titled Carrier Grade PaaS that covered our collaboration. Since then the term seems to have risen steadily in popularity, with seemingly every network or infrastructure provider now launching a new NFV initiative.
As with any new hype, the term NFV quickly became overloaded and confusing, so I thought it would be beneficial to try and clarify it a bit.
What is NFV (technical definition)?
NFV stands for Network Functions Virtualization - it basically means that the routers, load balancers and firewalls that are currently shipped as boxes will become virtual entities rather than physical ones. The process is much the same as with compute or storage virtualization: we take each function (box) and turn it into a software function that can run on any given box. Which box serves each function can change at runtime based on the desired SLA, without changing the software itself.
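To make that last point a bit more concrete, here is a minimal Python sketch of choosing a box for a function at runtime. Everything in it (the Host and NetworkFunction types, the place function, the host names and the numbers) is hypothetical and made up for illustration; real placement engines weigh many more constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """A commodity box (hypothetical fields, for illustration only)."""
    name: str
    free_cpus: int
    latency_ms: float  # measured latency from this host to the traffic it serves


@dataclass(frozen=True)
class NetworkFunction:
    """A network function packaged as software rather than as a physical box."""
    name: str
    cpus: int
    max_latency_ms: float  # the SLA this function must meet


def place(fn: NetworkFunction, hosts: list[Host]) -> Host:
    """Pick the lowest-latency host that has enough capacity and meets the SLA."""
    candidates = [
        h for h in hosts
        if h.free_cpus >= fn.cpus and h.latency_ms <= fn.max_latency_ms
    ]
    if not candidates:
        raise LookupError(f"no host satisfies the SLA for {fn.name}")
    return min(candidates, key=lambda h: h.latency_ms)


firewall = NetworkFunction("firewall", cpus=4, max_latency_ms=20.0)

# At first only the core site has room, so the firewall lands there.
hosts = [Host("edge-1", free_cpus=2, latency_ms=2.0),
         Host("core-1", free_cpus=16, latency_ms=12.0)]
first_choice = place(firewall, hosts)

# Later, capacity frees up at the edge; the same software simply moves.
hosts_later = [Host("edge-1", free_cpus=8, latency_ms=2.0),
               Host("core-1", free_cpus=16, latency_ms=12.0)]
second_choice = place(firewall, hosts_later)

print(first_choice.name, "->", second_choice.name)  # core-1 -> edge-1
```

The firewall definition never changes between the two calls; only the placement decision does, which is the whole point of decoupling the function from the box.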
What does that really mean?
For most people, network functions do not mean a lot. In addition, most of the definitions of NFV that I found on the internet come from people who tend to speak in a language that only a small community can really understand.
To understand what NFV really means we need to take a step back and look at the overall change that the IT industry as a whole is undergoing.
The IT industry is going through a big industrialization change that is very similar to the change that the car industry underwent when Ford came out with the Model T. The change with the Model T wasn’t in making a faster engine; quite the contrary, the Model T was inferior to other cars that were produced in those days. The industrial revolution with the Model T was in the ability to mass-produce cars. This revolution changed the entire automotive industry and then propagated to other industries as well, from consumer electronics to the food industry, which adopted the same mass production principles.
The IT industry is going through a fairly similar change - until now we used to build our IT manually, and we’re now shifting into mass production of IT, as well. In this analogy cloud is the manufacturing floor, and DevOps is the process for optimizing and automating the production pipeline.
How is IT industrialization related to NFV?
The carrier industry today relies on highly customized infrastructure; no two carriers run their infrastructure in the same way. Once upon a time carriers saw this as an advantage, and they ended up with layers and layers of fairly proprietary and costly setups for running their business.
As competition in this industry became tighter, this operational model became not just less economical but a huge threat, as it limits the speed at which carriers can adapt to new economic and market changes. This has already caused them to lose ground quite rapidly to competition from internet shops such as Google, Amazon, Microsoft, and others.
Carriers realized that in order to survive in this new world they needed to reduce their operational costs and increase the speed in which they can introduce new services, as well as scale their business.
This is where NFV comes into play. NFV is basically a better operational model for running the carrier backbone. Instead of a highly customized and costly backbone, we’ll use a commodity-based infrastructure and open source frameworks, and with that we can innovate faster and scale, all at a much lower cost.
What is NFV comprised of?
NFV isn’t a standard (yet) or a product. Even though there are various bodies such as ETSI that try to define standards for NFV, it is very unlikely that we’ll see any real standard emerging from this sort of initiative. This is simply due to the fact that standards often emerge when an industry reaches a certain level of maturity and we’re just not there yet.
At this point NFV is a set of “de-facto” standards bred from best practices, mostly from the cloud providers, who have proven able to deliver a more efficient and agile operational model for running their infrastructure.
Quite often that includes the following core components:
Cloud-based infrastructure - OpenStack is currently the most popular choice for this purpose.
Software-defined network functions - This is a mix of existing network functions provided as software packages and new, purely open source players that are making their entrance into this world.
Orchestration engine - Responsible for provisioning the network functions on a cloud-based infrastructure. TOSCA, defined by the OASIS organisation, provides a standard modeling language for orchestration. OpenStack Heat for infrastructure orchestration, in combination with Cloudify as an application orchestration tool, is a good reference for this.
Analytics engine - Analytics engines are basically the feedback loop. They are an essential piece for measuring whether our services meet their desired SLAs, and also a means to analyze and optimize workloads. In the context of NFV, where many of the insights and decisions need to happen in real time, the analytics engine will be heavily real-time based. One real-time analytics engine that is more specific to the operational monitoring domain is Riemann.
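As a rough illustration of what the orchestration engine does with a model of the deployment, here is a small Python sketch. The blueprint below is a plain dictionary that is only loosely inspired by the idea of a topology model; it is not TOSCA syntax, and the node names, the "requires" key and both functions are hypothetical. The only real API used is graphlib.TopologicalSorter from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical blueprint: each node lists the nodes it depends on.
blueprint = {
    "tenant_network": {"type": "network", "requires": []},
    "firewall": {"type": "network_function", "requires": ["tenant_network"]},
    "load_balancer": {
        "type": "network_function",
        "requires": ["tenant_network", "firewall"],
    },
}


def provisioning_order(blueprint: dict) -> list[str]:
    """Order the nodes so that every dependency is created before its dependents."""
    graph = {name: spec["requires"] for name, spec in blueprint.items()}
    return list(TopologicalSorter(graph).static_order())


def provision(blueprint: dict) -> list[str]:
    """Pretend to provision each node; a real engine would call the cloud API here."""
    return [
        f"create {blueprint[name]['type']} '{name}'"
        for name in provisioning_order(blueprint)
    ]


for step in provision(blueprint):
    print(step)
```

The design choice worth noting is that the blueprint only declares what depends on what; the order of operations is derived by the engine rather than scripted by a human.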
What makes NFV different from any other cloud-based infrastructure?
For the most part, network functions are no different from a database or any other software function that isn’t defined as a network function. Having said that, there are a number of characteristics that are more specific to network functions, such as:
Highly distributed deployment - Carrier data centers tend to span multiple sites.
Deterministic latency and performance - Network functions are more sensitive to latency and non-deterministic behavior that can be quite common in virtual data centers.
Support for third-party virtual appliances - Many of the network functions are packaged as VMs. Those third-party VMs are mostly treated as a black box and can be accessed only through custom interfaces that are specific to the network function. As a result, managing those functions can be fairly different from managing other software services. The main difference, though, is that most management systems assume they can install an agent to control each managed service. In the context of network functions this assumption is no longer valid, and therefore virtual appliances can be managed only remotely and not through a local agent.
Support for legacy network services - Many of the existing network functions were written in a pre-cloud world. As such, they were built with specific assumptions about the high availability and scalability model, as well as a configuration model that is usually very much human-driven. Changing all that in one day isn’t realistic, and therefore there needs to be a more gradual path for transitioning those services into the new world, or even replacing them with a more modern infrastructure.
High degree of security - Carrier networks assume a high degree of isolation on the network level. This often maps to a fairly sophisticated network setup of VLANs and a separate network hierarchy that is not yet supported by most of the existing cloud infrastructures.
The Role of Orchestration in NFV
In a pre-NFV world, provisioning the carrier infrastructure was fairly human-driven. In an NFV world those human operations need to be automated through software. In that context the orchestrator is a software-defined operator: it’s the piece that manages the deployment of each network function in the right location and on the right hardware. It is also responsible for orchestrating the network firewalls, routers and such, to fit a specific service deployment and its security constraints. Like the human operator, it is also responsible for continuously monitoring the deployed services and ensuring that they meet their desired SLA.
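The monitoring part of that job can be sketched as a simple feedback loop. The Python below is only a sketch under my own assumptions: the reconcile function, its thresholds (scale out when mean latency exceeds the SLA, scale in when it drops below half of it) and the sample numbers are hypothetical and not taken from any particular orchestrator.

```python
from statistics import mean


def reconcile(instances: int, latencies_ms: list[float],
              sla_ms: float, max_instances: int) -> int:
    """Return the instance count the orchestrator should converge to.

    Hypothetical policy: add an instance when the mean latency violates the SLA,
    remove one when there is plenty of headroom, otherwise leave things alone.
    """
    if not latencies_ms:
        return instances  # no measurements yet, so take no action
    observed = mean(latencies_ms)
    if observed > sla_ms and instances < max_instances:
        return instances + 1
    if observed < 0.5 * sla_ms and instances > 1:
        return instances - 1
    return instances


# Three monitoring rounds for a function with a 25 ms SLA, starting at 2 instances.
instances = 2
history = []
for samples in ([30.0, 40.0, 50.0], [20.0, 22.0], [5.0, 6.0]):
    instances = reconcile(instances, samples, sla_ms=25.0, max_instances=5)
    history.append(instances)

print(history)  # [3, 3, 2]
```

A human operator does the same thing by watching dashboards and filing change requests; the orchestrator just runs the loop continuously and without the tickets.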
NFV is part of a broader shift toward the industrialization of IT. In this particular case NFV is simply a way to make the operational model of carriers more efficient by adopting the best practices and tools that the major cloud providers, such as Amazon, Google, Microsoft and others, have already proven at scale.
On the business side, NFV has disrupted the entire networking industry, which used to sell proprietary boxes. In the new virtualized world, boxes have become commoditized and the software on top becomes the main play. This opens the door to new players with pure software solutions that are specifically designed for a virtualized setup.
As with any disruptive force, we will see big players that find it difficult to adapt to this change phased out to make room for new players who will take over, in a similar fashion to the way the internet wave brought Google, Amazon, PayPal and other players who were complete newcomers at the time and became the new market leaders.
On the technical side of things, virtualizing the network function is only the first step. In order to experience the full potential of a virtualized carrier, we need to have the entire deployment and management completely automated. What we are missing in this picture is the software equivalent of the human operator, which is also known as the orchestrator. The orchestrator will become the major piece in putting all of this together.
In the next post I’ll describe in more depth the role of an orchestrator in the context of NFV.