Phase Two — Adaptive Placement
In this next phase, carriers could introduce dynamic network slices for multiple use cases: for example, numerous eMBB deployments that deliver different levels of performance, along with low-latency slices that support various flavors of URLLC. During this phase, CSPs could let their business customers self-provision slices, which the system would then create dynamically. The NF orchestration system would need to instantiate the appropriate NFs and dynamically place them in different locations to support the associated SLAs. This is where CSPs can start offering Industry 4.0 types of services, including real-time video surveillance and augmented reality using edge platforms. Once we've confirmed that the orchestration system continues to perform well, we can move on to real-time services like connected vehicles, fleet and logistics use cases, and eventually autonomous driving.
With numerous requests for new slices, the orchestration system will need to determine whether there are sufficient resources to fulfill the promised SLAs. It may need to adapt by placing NFs in public cloud edges, which brings the added complexity of providing ongoing assurance that SLAs can be met even when the carrier doesn't own the underlying NFVI.
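The resource check described above could look something like the following minimal sketch. It is illustrative only: the `Site` and `SliceRequest` types, the use of vCPUs and latency as the only constraints, and the preference for carrier-owned NFVI over a public cloud edge are all assumptions made for this example, not details from any particular orchestrator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """Hypothetical placement target (telco edge or public cloud edge)."""
    name: str
    free_vcpus: int
    latency_ms: float    # assumed measured latency from the site to the served users
    carrier_owned: bool  # False for a public cloud edge


@dataclass
class SliceRequest:
    """Hypothetical slice request reduced to two SLA-relevant numbers."""
    slice_id: str
    vcpus: int
    max_latency_ms: float


def admit(request: SliceRequest, sites: List[Site]) -> Optional[Site]:
    """Return the site chosen to host the request, or None if the SLA cannot be met."""
    feasible = [
        s for s in sites
        if s.free_vcpus >= request.vcpus and s.latency_ms <= request.max_latency_ms
    ]
    if not feasible:
        return None  # reject rather than promise an SLA that cannot be fulfilled
    # Assumed policy: prefer carrier-owned NFVI, then the lowest latency.
    best = min(feasible, key=lambda s: (not s.carrier_owned, s.latency_ms))
    best.free_vcpus -= request.vcpus  # reserve the capacity
    return best
```

Under these assumptions, requests spill over to the public cloud edge only once the carrier-owned site is full, and a request whose latency bound no site can satisfy is rejected up front.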
This is also the stage in which the telemetry feeding into the orchestration system is assessed for sufficient granularity and accuracy, so that it can be used in optimization decisions for dynamic NF placement to meet SLAs. One of the biggest challenges in this phase will be managing resources to fulfill dynamic slice creation while meeting SLAs. This will likely require ongoing provisioning and de-provisioning of NFs, as well as migration of NFs in real time as resource availability changes in the infrastructure. There may even be situations where slices degrade enough to put SLAs at risk, and the system has to adapt by prioritizing VIP customers over others to minimize the business fallout.
Regardless, the data gathered in this phase, as well as the closed-loop systems, will prove valuable for AI/ML training and ongoing AI/ML inferencing as we advance into the next stage.
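The VIP prioritization mentioned in this phase could be sketched as follows. This is an assumed, simplified policy: the `ActiveSlice` type, the numeric `priority` tiers, and the single bandwidth figure per slice are hypothetical stand-ins for whatever SLA attributes and business rules a real system would use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ActiveSlice:
    """Hypothetical running slice with a business priority tier."""
    slice_id: str
    priority: int       # assumed convention: lower number = more important (0 = VIP)
    demand_mbps: float


def protect_by_priority(
    slices: List[ActiveSlice], capacity_mbps: float
) -> Tuple[List[str], List[str]]:
    """Split slices into (protected, degraded) given the capacity currently available."""
    protected: List[str] = []
    degraded: List[str] = []
    remaining = capacity_mbps
    # Serve the most important tiers first; within a tier, smaller demands first.
    for s in sorted(slices, key=lambda s: (s.priority, s.demand_mbps)):
        if s.demand_mbps <= remaining:
            protected.append(s.slice_id)
            remaining -= s.demand_mbps
        else:
            degraded.append(s.slice_id)
    return protected, degraded
```

A greedy pass like this is easy to reason about but not optimal in general; it is meant only to show where a priority signal would enter the decision.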
Phase Three — Autonomous Placement (with AI/ML)
This is the phase in which we incorporate AI/ML, as needed, to help fine-tune dynamic placement. Here we verify our ability to dynamically orchestrate and re-orchestrate NF placement to further optimize and improve:
- Costs — The system should use the most cost-effective resources while ensuring SLAs are met. This could entail opportunistically using cloud-based spot instances to lower the overall cost for network slices that are less sensitive to latency or loss, while saving more expensive telco NFVI for slices with tighter performance bounds. CSPs could even route over non-owned network paths if expenses are lower and SLAs permit it.
- Reliability and Resiliency — AI/ML can drive sophisticated proactive provisioning and placement actions that meet the currently requested SLAs and improve overall system reliability, even in the face of component failures. AI/ML systems can also quickly remap NFs and the network when failures occur, providing rapid, automated recovery that reduces downtime.
- Predictability — AI/ML-enhanced orchestration can observe patterns in requests for slices and potentially even pre-provision necessary resources to reduce the dynamic network slice setup time if the system learns that the requests are periodic or triggered by other related events.
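As a baseline for the cost objective in the list above, the following minimal sketch picks the cheapest resource pool that still satisfies a slice's SLA. It is a static rule, not an AI/ML method; the `Pool` and `SliceSla` types, the `preemptible` flag standing in for spot-style capacity, and any prices fed into it are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pool:
    """Hypothetical resource pool the orchestrator can place NFs in."""
    name: str
    cost_per_hour: float  # illustrative unit cost
    latency_ms: float
    preemptible: bool     # True for spot-style capacity that can be reclaimed


@dataclass
class SliceSla:
    """Hypothetical SLA reduced to a latency bound and a preemption tolerance."""
    max_latency_ms: float
    tolerates_preemption: bool


def cheapest_pool(sla: SliceSla, pools: List[Pool]) -> Optional[Pool]:
    """Return the lowest-cost pool that meets the SLA, or None if none qualifies."""
    candidates = [
        p for p in pools
        if p.latency_ms <= sla.max_latency_ms
        and (sla.tolerates_preemption or not p.preemptible)
    ]
    return min(candidates, key=lambda p: p.cost_per_hour, default=None)
```

A learned policy would replace the fixed filter-and-minimize rule with predictions of future load, spot reclamation risk, or failure likelihood, but the constraint structure (meet the SLA first, then minimize cost) would likely stay the same.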
The complexity and scope of the new generation of NF placement favor the use of AI/ML. With the full range of underlying infrastructure choices and SLA variations, it's unlikely that any human or static algorithm, even one based on heuristics, will be able to optimize sufficiently. This is an opportunity where AI/ML can shine. As part of getting to this phase, we need to ensure that telemetry from the NFs and the underlying NFVI, as well as overall E2E QoS metrics, is accurate. AvidThink anticipates that this is an area that will see significant innovation over the next few years. Many areas of research and development are ripe for investment: understanding what makes an appropriate training data set for machine learning, figuring out the roles that network engineers will play in training, and identifying where AI augments human operators and where it can be completely autonomous.