The Increasingly Critical Role of Dynamic NF Placement
We'll dig into the importance of NF placement shortly, but first, let's define what we mean by NFs. For the rest of this paper, NFs collectively include PNFs (physical network functions running on specialized hardware), VNFs, and CNFs. For PNFs, placement is more about locating and utilizing existing legacy physical appliances as part of the overall orchestration. The reality is that many CSPs have a sizable investment in legacy PNFs, and these can and should continue to play a role in today's networks. Next, we'll walk through a real-world scenario that demonstrates the complexity of the NF placement problem and shows why getting placement right is critical.
Let's look at a city that just experienced a natural disaster. The disaster takes out parts of the network, including a number of cell towers, some of the mobile switching offices, and potentially some cloud data centers as well. In the meantime, we have an ambulance with paramedics tending to a trauma patient as they rush to get to the emergency room (ER). With the availability of telemedicine applications, the paramedics are using real-time video conferencing for ongoing consultation, and the patient's vitals are being streamed to the doctor at the ER. The 5G network can be used to provide an end-to-end slice between the ER and the mobile ambulance. That slice must have ultra-low latency, high reliability, and sufficient bandwidth to support real-time communications.
There might also be a continuous transmission of vitals and video to edge or cloud locations for ongoing analysis, or for archival as part of patient record-keeping and post-analysis. As the ambulance traverses the crowded city to reach the ER as quickly as possible, it may cross multiple cell sites and be served by various edge locations.
First Responder Scenario — NF Placement Challenges
As described, this is a complex real-time problem. Even before the ambulance sets out, the orchestration system will already have been working overtime to adapt to the disaster. Damaged base stations will have to be decommissioned, and user equipment will end up connecting to whichever base stations remain reachable and operational. The inventory of functional radios, switching offices, edge locations, and data centers, along with the availability of external cloud services, will need to be rapidly updated. Likewise, the topology of the surviving network will need to be revised. Finally, the orchestration system will need to relocate critical NFs that were running in damaged areas, placing them into new locations, perhaps with a less-than-ideal match between NFs and NFVI, and possibly in higher-cost and higher-latency cloud locations.
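To make the relocation step concrete, here is a minimal sketch of a greedy re-placement pass over the surviving inventory. It is an illustration only, not a description of any particular orchestrator: the `Site` and `NF` types, the site and NF names, and the single CPU/latency/cost model are all assumptions made for the example. A production placement engine would weigh many more constraints.

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    cpu_free: int        # free vCPUs (hypothetical unit)
    latency_ms: float    # latency from the service area to this site
    cost: float          # relative hosting cost
    healthy: bool = True

@dataclass
class NF:
    name: str
    cpu: int
    max_latency_ms: float

def place(nfs, sites):
    """Greedy pass: most latency-sensitive NFs first, each onto the
    cheapest healthy site that satisfies its CPU and latency needs."""
    placement = {}
    for nf in sorted(nfs, key=lambda n: n.max_latency_ms):
        candidates = [s for s in sites
                      if s.healthy
                      and s.cpu_free >= nf.cpu
                      and s.latency_ms <= nf.max_latency_ms]
        if not candidates:
            placement[nf.name] = None   # no feasible site survives
            continue
        best = min(candidates, key=lambda s: (s.cost, s.latency_ms))
        best.cpu_free -= nf.cpu         # reserve capacity on the chosen site
        placement[nf.name] = best.name
    return placement

# Hypothetical post-disaster inventory: edge-a has been damaged.
sites = [
    Site("edge-a", cpu_free=8, latency_ms=2.0, cost=3.0, healthy=False),
    Site("edge-b", cpu_free=4, latency_ms=4.0, cost=3.0),
    Site("regional-dc", cpu_free=32, latency_ms=12.0, cost=1.0),
    Site("public-cloud", cpu_free=64, latency_ms=30.0, cost=2.0),
]
nfs = [
    NF("upf", cpu=4, max_latency_ms=5.0),
    NF("smf", cpu=2, max_latency_ms=20.0),
    NF("analytics", cpu=16, max_latency_ms=50.0),
    NF("video-opt", cpu=4, max_latency_ms=5.0),
]
result = place(nfs, sites)
print(result)
```

In this made-up inventory, the one surviving edge site can host only one of the two latency-sensitive NFs, so the second is left unplaced. That is exactly the situation in which the operator must free up or add capacity near the disaster area.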
Further, due to the criticality of the first responder situation, resources may have to be proactively rearranged so that edge and network resources close to the disaster area, or along the path of rescue vehicles, are freed up for rapid, on-demand provisioning of services as needed.
Then, as the ambulance traverses the city, sending video, patient vitals, and record-keeping information across the different slices, the orchestration system will be monitoring KPIs to ensure that SLAs can be met. If SLAs are at risk because of the impaired network, the orchestration system (likely powered by AI/ML) will need to adapt in real time. The system could take action by deprioritizing less critical traffic and reallocating resources across the radio spectrum and the network core, including calling upon the NF placement engine to place critical NFs in new locations. If insufficient resources are available, the operator may need to dynamically instantiate services in public clouds and public edge clouds, both to meet the processing needs of the overall system and to ensure that, topologically, the traffic can still achieve the end-to-end SLAs.
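The monitor-and-adapt loop described above can be sketched roughly as follows. This is a simplified, rule-based illustration under stated assumptions: the `SLA` and `KPI` fields, the 90% early-warning margin, and the action names are all hypothetical, and a real system would more likely drive these decisions from richer telemetry and AI/ML models.

```python
from dataclasses import dataclass

@dataclass
class SLA:
    max_latency_ms: float
    min_throughput_mbps: float
    max_loss_pct: float

@dataclass
class KPI:
    latency_ms: float
    throughput_mbps: float
    loss_pct: float

def at_risk(kpi, sla, margin=0.9):
    """Return the KPIs that have reached `margin` of their SLA limit."""
    risks = []
    if kpi.latency_ms >= sla.max_latency_ms * margin:
        risks.append("latency")
    if kpi.throughput_mbps * margin <= sla.min_throughput_mbps:
        risks.append("throughput")
    if kpi.loss_pct >= sla.max_loss_pct * margin:
        risks.append("loss")
    return risks

def plan_actions(risks, operator_capacity_free):
    """Map detected risks to the escalation steps named in the text."""
    if not risks:
        return []
    actions = ["deprioritize_less_critical_traffic",
               "reallocate_radio_and_core_resources"]
    if "latency" in risks:
        actions.append("invoke_nf_placement_engine")
    if not operator_capacity_free:
        # Last resort: burst into public cloud or public edge cloud.
        actions.append("instantiate_in_public_cloud")
    return actions

# Hypothetical slice SLA and one telemetry sample from the ambulance slice.
sla = SLA(max_latency_ms=10.0, min_throughput_mbps=20.0, max_loss_pct=0.1)
sample = KPI(latency_ms=9.5, throughput_mbps=25.0, loss_pct=0.02)

risks = at_risk(sample, sla)
actions = plan_actions(risks, operator_capacity_free=False)
print(risks, actions)
```

Flagging a KPI before it actually breaches its limit gives the orchestrator time to act, which matters when relocating an NF takes longer than the SLA can tolerate.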
The problem of keeping the network up during disasters is one that we experience today, so the scenario above isn't unique. Today, we ensure that first responder networks have higher-priority traffic markings and use preemption to make sure critical voice and data traffic gets through. In evolving 5G networks that feature network slices with many more grades of service, orchestration systems will need to step up. New systems need to take into account a significant number of goals, constraints, and telemetry streams reflecting the ongoing health and performance of the network. Real-time adaptation will be necessary, since simple predictive models based on time-of-day or day-of-week patterns may not be sufficient in a disaster situation, which makes the NF placement problem significantly harder.
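As a point of reference, the priority-and-preemption approach used today can be sketched as a simple admission check on a single shared link. The `Flow` type, the flow names, and the convention that a lower number means higher priority are assumptions for illustration only. They are not taken from any specific standard or product.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    name: str
    priority: int   # lower number = more important (assumed convention)
    mbps: float

def admit(new, active, capacity_mbps):
    """Try to admit `new`, preempting strictly lower-priority flows if needed.

    Returns (admitted, preempted). `active` is modified only on success.
    """
    free = capacity_mbps - sum(f.mbps for f in active)
    # Preemption candidates, least important first.
    victims = sorted((f for f in active if f.priority > new.priority),
                     key=lambda f: f.priority, reverse=True)
    preempted = []
    for f in victims:
        if new.mbps <= free:
            break
        preempted.append(f)
        free += f.mbps
    if new.mbps > free:
        return False, []            # cannot fit even after preemption
    for f in preempted:
        active.remove(f)
    active.append(new)
    return True, preempted

# Hypothetical 100 Mbps link that is already 90% utilized.
active = [Flow("video-streaming", priority=3, mbps=60.0),
          Flow("iot-telemetry", priority=2, mbps=30.0)]
ok, dropped = admit(Flow("ambulance-telemedicine", priority=0, mbps=40.0),
                    active, capacity_mbps=100.0)
print(ok, [f.name for f in dropped], [f.name for f in active])
```

A single priority ordering like this is easy to reason about, but it captures only one goal on one resource, which is why slices with many grades of service call for a richer placement and orchestration model.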