Traditional networking solutions are inflexible and no longer adequate for the emerging Multicloud services.
On the Cloud, with just a few simple steps, users can create highly resilient, secure and near infinite-scale services and applications. This ease of use is driving adoption of Public Cloud and Private Cloud solutions at an unprecedented scale everywhere.
Networking services need to evolve to become automated and simpler in order to be easily consumed by Cloud services and allow migration and expansion to the Multicloud.
Software Defined Networks (SDN) together with Network Telemetry Streaming are the two underlying technologies enabling seamless Multicloud networking. When combined, these technologies permit Closed-loop control of the network, resulting in greater automation, simpler operation across domains and high availability.
Network Telemetry Streaming for Software Defined Networks
Network telemetry streaming is a new technology that provides a granular (per-second or faster), real-time view of network performance in a scalable way. Devices are configured to stream (push) network performance data to a collector at a configured frequency. This yields much more accurate performance data than the traditional model, in which a management station periodically collects (polls) data from devices, generally at 5-minute intervals.
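To illustrate why the collection interval matters, the following minimal sketch compares what a 5-minute poll and a per-second stream would report for the same link. The utilisation figures (a 10% baseline with a 20-second burst at 100%) are invented for illustration, and the poll is assumed to report the average over its interval.

```python
# Illustrative only: the traffic pattern below is an assumption, not measured data.
POLL_INTERVAL_S = 300    # traditional 5-minute polling
STREAM_INTERVAL_S = 1    # per-second streaming


def samples_per_hour(interval_s: int) -> int:
    """Number of data points one metric produces in an hour."""
    return 3600 // interval_s


# One 5-minute window of link utilisation (%), sampled every second:
# 280 s at a 10% baseline followed by a 20 s burst at 100%.
per_second = [10.0] * 280 + [100.0] * 20

# A poll that averages over the whole window hides the burst.
polled_view = sum(per_second) / len(per_second)

# The per-second stream exposes the peak.
streamed_peak = max(per_second)

print(f"polling:   {samples_per_hour(POLL_INTERVAL_S)} points/hour, "
      f"reports {polled_view:.1f}% for the window")
print(f"streaming: {samples_per_hour(STREAM_INTERVAL_S)} points/hour, "
      f"peak seen {streamed_peak:.1f}%")
```

In this made-up case the polled figure is 16% while the stream shows the link saturating, which is the kind of short-lived event a 5-minute average cannot reveal.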
Currently, equipment vendors are providing Streaming Telemetry from their SDN and new Data Centre fabric solutions, but unfortunately many enterprises are not yet capable of benefiting from this.
Measuring the performance of the network in real-time is critical to support new enterprise services, primarily because:
• Cloud networks are dynamic, and any service issue in a network stretching across domains needs to be addressed immediately, not several minutes later;
• If networks are monitored in real-time, Closed-loop control becomes possible, delivering automated self-healing and elasticity.
Closed-Loop Control with Network Telemetry Streaming
A Closed-loop control system fundamentally requires a real-time view of the network status. Automated control systems cannot operate properly on old or delayed information: while the situation is still evolving, they would apply actions that no longer match the actual state of the network.
There are three main challenges when building automated real-time network control systems:
• Traditional monitoring tools are simply unable to cope with the amount of data generated by streaming telemetry (for comparison, a 2-second streaming interval produces 150 times as many samples as a 5-minute polling cycle, since 300 s / 2 s = 150);
• The streaming telemetry data needs to be processed by closed-loop control applications, in real-time and as it arrives, in order to generate control actions;
• It is necessary to observe in real-time the effects of those control actions to be able to deliver an effective Closed-loop system.
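The three points above can be sketched as a small simulation. This is only a toy model under stated assumptions: the Link class, the thresholds and the "add or remove a parallel path" action are hypothetical stand-ins, not part of any SDN controller API. Each telemetry sample is processed as it arrives, a rule produces a control action, and the next sample shows the effect of that action.

```python
from dataclasses import dataclass

# 300 s polling cycle / 2 s streaming interval = 150 samples per poll.
SAMPLES_PER_POLL = 300 // 2


@dataclass
class Link:
    """Toy stand-in for a set of parallel paths between two sites (hypothetical)."""
    offered_load: float = 0.0  # traffic, as a percentage of one path's capacity
    paths: int = 1             # number of active parallel paths

    def utilisation(self) -> float:
        return self.offered_load / self.paths


def control_action(utilisation: float, high: float = 80.0, low: float = 30.0) -> int:
    """Rule: +1 adds a path when overloaded, -1 removes one when idle, 0 does nothing."""
    if utilisation > high:
        return 1
    if utilisation < low:
        return -1
    return 0


def run_closed_loop(link: Link, offered_loads: list) -> list:
    """Process one telemetry sample per tick and apply the resulting action."""
    history = []
    for tick, load in enumerate(offered_loads):
        link.offered_load = load
        sample = link.utilisation()      # telemetry sample for this tick
        action = control_action(sample)  # decided as the sample arrives
        if action < 0 and link.paths == 1:
            action = 0                   # never remove the last path
        link.paths += action             # effect is visible in the next sample
        history.append((tick, sample, action, link.paths))
    return history


history = run_closed_loop(Link(), [50.0, 90.0, 90.0, 90.0, 20.0, 20.0])
for tick, sample, action, paths in history:
    print(f"t={tick} utilisation={sample:5.1f}% action={action:+d} paths={paths}")
```

At tick 1 the overload triggers a scale-out, and the sample at tick 2 confirms the action worked because utilisation has dropped to 45%. With 5-minute polling, that confirmation would arrive long after the decision was taken.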
New open-source technologies from the big data field now provide a framework for building an effective network control system with streaming telemetry.
Building a Network Closed-Loop Control System
The goal is to build a system able to collect the streaming telemetry, process it and apply rules in real-time to control the network. Apache open-source projects such as Kafka, Druid, Flink, NiFi and Superset can be assembled into a Closed-loop control solution with relatively little effort. Standards for network telemetry streaming are also important, as they ease the collection and processing of the data. YANG models and the NETCONF/RESTCONF protocols, from the Internet Engineering Task Force (IETF), provide the basic networking standards framework.
Figure: Components of a Network Closed-Loop Control system for SDN and Multicloud.
In large networks, it is impractical to monitor the whole network at a very high frequency all of the time. Transporting all of the monitoring data through the network, then collecting and storing it, can be a very significant task, even an unrealistic one for many organisations. One option is to increase or decrease the streaming frequency of the data in a particular area for a period of time, depending on current needs. This can be achieved by controlling the rate of streaming data on the SDN through the standard configuration interfaces (NETCONF/RESTCONF and YANG models). Another option is to deploy distributed collectors that send only relevant information to the centralised monitoring and control applications.
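A rough sketch of the first option follows. The area names, sensor counts and the two intervals are illustrative assumptions. The code only computes a per-area streaming plan and the data volume it implies; pushing the chosen intervals to the devices would happen separately through the configuration interfaces and is not shown.

```python
# All names and numbers below are hypothetical, chosen only for illustration.
BASELINE_INTERVAL_S = 60.0  # relaxed interval for areas not under investigation
FOCUS_INTERVAL_S = 1.0      # high-frequency interval for areas of interest


def plan_intervals(areas, focus_areas):
    """Map each network area to a streaming interval in seconds."""
    return {
        area: FOCUS_INTERVAL_S if area in focus_areas else BASELINE_INTERVAL_S
        for area in areas
    }


def samples_per_minute(plan, sensors_per_area):
    """Total telemetry samples per minute implied by a plan."""
    return sum(
        sensors_per_area[area] * 60.0 / interval
        for area, interval in plan.items()
    )


sensors = {"dc-east": 1000, "dc-west": 1000, "wan-core": 1000}

everything_fast = plan_intervals(sensors, focus_areas=set(sensors))
focused = plan_intervals(sensors, focus_areas={"dc-east"})

print("all areas at 1 s:    ", samples_per_minute(everything_fast, sensors))
print("only dc-east at 1 s: ", samples_per_minute(focused, sensors))
```

With these assumed numbers, focusing on one area cuts the volume from 180,000 to 62,000 samples per minute while keeping per-second visibility where it is currently needed.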
Another important feature is making the streaming telemetry data from the SDN available to the users of the service in real-time. Again, this is a fundamental building block of Cloud IT services. The data bus that receives the network streaming telemetry can be used to provide that same data to the consumers of the services. Data filtering and policies (such as telemetry threshold breach alerting) can be applied to the data streams in real-time by reusing the above framework and open-source components. This gives users of the service greater insight into the performance of the underlying network and the available resources.
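As a minimal sketch of such filtering and threshold alerting, the generators below process records one at a time, as a stream processor would. The record fields (tenant, metric, value) and the in-memory list standing in for the data bus are assumptions made for illustration; in a real deployment this logic would more likely live in a stream-processing job reading from the bus.

```python
from typing import Dict, Iterable, Iterator

Record = Dict[str, object]  # hypothetical telemetry record layout


def for_tenant(stream: Iterable[Record], tenant: str) -> Iterator[Record]:
    """Pass through only the records that belong to one service consumer."""
    for record in stream:
        if record["tenant"] == tenant:
            yield record


def threshold_breaches(stream: Iterable[Record], metric: str,
                       limit: float) -> Iterator[Record]:
    """Yield records where the given metric exceeds the limit."""
    for record in stream:
        if record["metric"] == metric and record["value"] > limit:
            yield record


# An in-memory list stands in for the data bus in this sketch.
bus = [
    {"tenant": "team-a", "metric": "latency_ms", "value": 4.0},
    {"tenant": "team-b", "metric": "latency_ms", "value": 55.0},
    {"tenant": "team-a", "metric": "latency_ms", "value": 42.0},
    {"tenant": "team-a", "metric": "utilisation_pct", "value": 95.0},
]

alerts = list(threshold_breaches(for_tenant(bus, "team-a"), "latency_ms", 20.0))
for alert in alerts:
    print("ALERT", alert)
```

Because the filters are chained generators, each consumer sees only its own data, and an alert is raised as soon as the offending record arrives rather than after a batch completes.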
Closed-loop control systems for networking, based on Network Telemetry Streaming, are a novel approach to creating automated and self-healing SDN solutions for interconnecting Multicloud environments. This approach can deliver elasticity, high availability and consistency across the various Multicloud domains of the Software Defined Network.