RFC 8735: Scenarios and Simulation Results of PCE in a Native IP Network

RFC 8735	CCDR Scenario and Simulation Results	February 2020
Wang, et al.	Informational	[Page]

Abstract

Requirements for providing the End-to-End (E2E) performance assurance are emerging within the service provider networks. While there are various technology solutions, there is no single solution that can fulfill these requirements for a native IP network. In particular, there is a need for a universal E2E solution that can cover both intra- and inter-domain scenarios.¶

One feasible E2E traffic-engineering solution is the addition of central control in a native IP network. This document describes various complex scenarios and simulation results when applying the Path Computation Element (PCE) in a native IP network. This solution, referred to as Centralized Control Dynamic Routing (CCDR), integrates the advantage of using distributed protocols and the power of a centralized control technology, providing traffic engineering for native IP networks in a manner that applies equally to intra- and inter-domain scenarios.¶

1. Introduction

A service provider network is composed of thousands of routers that run distributed protocols to exchange reachability information. The path for the destination network is mainly calculated, and controlled, by the distributed protocols. These distributed protocols are robust enough to support most applications; however, they have some difficulties supporting the complexities needed for traffic-engineering applications, e.g., E2E performance assurance, or maximizing the link utilization within an IP network.¶

Multiprotocol Label Switching (MPLS) using Traffic-Engineering (TE) technology (MPLS-TE) [RFC3209] is one solution for TE networks, but it introduces an MPLS network along with related technology, which would be an overlay of the IP network. MPLS-TE technology is often used for Label Switched Path (LSP) protection and setting up complex paths within a domain. It has not been widely deployed for meeting E2E (especially in inter-domain) dynamic performance assurance requirements for an IP network.¶

Segment Routing [RFC8402] is another solution that integrates some advantages of using a distributed protocol and central control technology, but it requires the underlying network, especially the provider edge router, to do an in-depth label push and pop action while adding complexity when coexisting with the non-segment routing network. Additionally, it can only maneuver the E2E paths for MPLS and IPv6 traffic via different mechanisms.¶

Deterministic Networking (DetNet) [RFC8578] is another possible solution. It is primarily focused on providing bounded latency for a flow and introduces additional requirements on the domain edge router. The current DetNet scope is within one domain. The use cases defined in this document do not require the additional complexity of deterministic properties and so differ from the DetNet use cases.¶

This document describes several scenarios for a native IP network where a Centralized Control Dynamic Routing (CCDR) framework can produce qualitative improvement in efficiency without requiring a change to the data-plane behavior on the router. Using knowledge of the Border Gateway Protocol (BGP) session-specific prefixes advertised by a router, the network topology and the near-real-time link-utilization information from network management systems, a central PCE is able to compute an optimal path and give the underlying routers the destination address to use to reach the BGP nexthop, such that the distributed routing protocol will use the computed path via traditional recursive lookup procedure. Some results from simulations of path optimization are also presented to concretely illustrate a variety of scenarios where CCDR shows significant improvement over traditional distributed routing protocols.¶

This document is the base document of the following two documents: the universal solution document, which is suitable for intra-domain and inter-domain TE scenario, is described in [PCE-NATIVE-IP]; and the related protocol extension contents is described in [PCEP-NATIVE-IP-EXT].¶

3. CCDR Scenarios

The following sections describe various deployment scenarios where applying the CCDR framework is intuitively expected to produce improvements based on the macro-scale properties of the framework and the scenario.¶

3.1. QoS Assurance for Hybrid Cloud-Based Application

With the emergence of cloud computing technologies, enterprises are putting more and more services on a public-oriented cloud environment while keeping core business within their private cloud. The communication between the private and public cloud sites spans the WAN. The bandwidth requirements between them are variable, and the background traffic between these two sites varies over time. Enterprise applications require assurance of the E2E QoS performance on demand for variable bandwidth services.¶

CCDR, which integrates the merits of distributed protocols and the power of centralized control, is suitable for this scenario. The possible solution framework is illustrated below:¶

                         +------------------------+
                         | Cloud-Based Application|
                         +------------------------+
                                     |
                               +-----------+
                               |    PCE    |
                               +-----------+
                                     |
                                     |
                            //--------------\\
                       /////                  \\\\\
  Private Cloud Site ||       Distributed          |Public Cloud Site
                      |       Control Network      |
                       \\\\\                  /////
                            \\--------------//

Figure 1: Hybrid Cloud Communication Scenario

As illustrated in Figure 1, the source and destination of the "Cloud-Based Application" traffic are located at "Private Cloud Site" and "Public Cloud Site", respectively.¶

By default, the traffic path between the private and public cloud site is determined by the distributed control network. When an application requires E2E QoS assurance, it can send these requirements to the PCE and let the PCE compute one E2E path, which is based on the underlying network topology and real traffic information, in order to accommodate the application's QoS requirements. Section 4.4 of this document describes the simulation results for this use case.¶

3.2. Link Utilization Maximization

Network topology within a Metro Area Network (MAN) is generally in a star mode as illustrated in Figure 2, with different devices connected to different customer types. The traffic from these customers is often in a tidal pattern with the links between the Core Router (CR) / Broadband Remote Access Server (BRAS) and CR/Service Router (SR) experiencing congestion in different periods due to subscribers under BRAS often using the network at night and the leased line users under SR often using the network during the daytime. The link between BRAS/SR and CR must satisfy the maximum traffic volume between them, respectively, which causes these links to often be underutilized.¶

                         +--------+
                         |   CR   |
                         +----|---+
                              |
                  |-------|--------|-------|
                  |       |        |       |
               +--|-+   +-|+    +--|-+   +-|+
               |BRAS|   |SR|    |BRAS|   |SR|
               +----+   +--+    +----+   +--+

Figure 2: Star-Mode Network Topology within MAN

If we consider connecting the BRAS/SR with a local link loop (which is usually lower cost) and control the overall MAN topology with the CCDR framework, we can exploit the tidal phenomena between the BRAS/CR and SR/CR links, maximizing the utilization of these central trunk links (which are usually higher cost than the local loops).¶

                                  +-------+
                              -----  PCE  |
                              |   +-------+
                         +----|---+
                         |   CR   |
                         +----|---+
                              |
                  |-------|--------|-------|
                  |       |        |       |
               +--|-+   +-|+    +--|-+   +-|+
               |BRAS-----SR|    |BRAS-----SR|
               +----+   +--+    +----+   +--+

Figure 3: Link Utilization Maximization via CCDR

3.3. Traffic Engineering for Multi-domain

Service provider networks are often comprised of different domains, interconnected with each other, forming a very complex topology as illustrated in Figure 4. Due to the traffic pattern to/from the MAN and IDC, the utilization of the links between them are often asymmetric. It is almost impossible to balance the utilization of these links via a distributed protocol, but this unbalance can be overcome utilizing the CCDR framework.¶

               +---+                +---+
               |MAN|----------------|IDC|
               +---+       |        +---+
                 |     ----------     |
                 |-----|Backbone|-----|
                 |     ----|-----     |
                 |         |          |
               +---+       |        +---+
               |IDC|----------------|MAN|
               +---+                +---+

Figure 4: Traffic Engineering for Complex Multi-domain Topology

A solution for this scenario requires the gathering of NetFlow information, analysis of the source/destination autonomous system (AS), and determining what the main cause of the congested link(s) is. After this, the operator can use the external Border Gateway Protocol (eBGP) sessions to schedule the traffic among the different domains according to the solution described in the CCDR framework.¶

3.4. Network Temporal Congestion Elimination

In more general situations, there is often temporal congestion within the service provider's network, for example, due to daily or weekly periodic bursts or large events that are scheduled well in advance. Such congestion phenomena often appear regularly, and if the service provider has methods to mitigate it, it will certainly improve their network operation capabilities and increase satisfaction for customers. CCDR is also suitable for such scenarios, as the controller can schedule traffic out of the congested links, lowering their utilization during these times. Section 4.5 describes the simulation results of this scenario.¶

4. CCDR Simulation

The following sections describe a specific case study to illustrate the workings of the CCDR algorithm with concrete paths/metrics, as well as a procedure for generating topology and traffic matrices and the results from simulations applying CCDR for E2E QoS (assured path and congestion elimination) over the generated topologies and traffic matrices. In all cases examined, the CCDR algorithm produces qualitatively significant improvement over the reference (OSPF) algorithm, suggesting that CCDR will have broad applicability.¶

The structure and scale of the simulated topology is similar to that of the real networks. Multiple different traffic matrices were generated to simulate different congestion conditions in the network. Only one of them is illustrated since the others produce similar results.¶

4.1. Case Study for CCDR Algorithm

In this section, we consider a specific network topology for case study: examining the path selected by OSPF and CCDR and evaluating how and why the paths differ. Figure 5 depicts the topology of the network in this case. There are eight forwarding devices in the network. The original cost and utilization are marked on it as shown in the figure. For example, the original cost and utilization for the link (1, 2) are 3 and 50%, respectively. There are two flows: f1 and f2. Both of these two flows are from node 1 to node 8. For simplicity, it is assumed that the bandwidth of the link in the network is 10 Mb/s. The flow rate of f1 is 1 Mb/s and the flow rate of f2 is 2 Mb/s. The threshold of the link in congestion is 90%.¶

If the OSPF protocol, which adopts Dijkstra's algorithm (IS-IS is similar because it also uses Dijkstra's algorithm), is applied in the network, the two flows from node 1 to node 8 can only use the OSPF path (p1: 1->2->3->8). This is because Dijkstra's algorithm mainly considers the original cost of the link. Since CCDR considers cost and utilization simultaneously, the same path as OSPF will not be selected due to the severe congestion of the link (2, 3). In this case, f1 will select the path (p2: 1->5->6->7->8) since the new cost of this path is better than that of the OSPF path. Moreover, the path p2 is also better than the path (p3: 1->2->4->7->8) for flow f1. However, f2 will not select the same path since it will cause new congestion in the link (6, 7). As a result, f2 will select the path (p3: 1->2->4->7->8).¶

      +----+      f1                +-------> +-----+ ----> +-----+
      |Edge|-----------+            |+--------|  3  |-------|  8  |
      |Node|---------+ |            ||+-----> +-----+ ----> +-----+
      +----+         | |       4/95%|||              6/50%     |
                     | |            |||                   5/60%|
                     | v            |||                        |
      +----+       +-----+ -----> +-----+      +-----+      +-----+
      |Edge|-------|  1  |--------|  2  |------|  4  |------|  7  |
      |Node|-----> +-----+ -----> +-----+7/60% +-----+5/45% +-----+
      +----+  f2      |     3/50%                              |
                      |                                        |
                      |   3/60%   +-----+ 5/55%+-----+   3/75% |
                      +-----------|  5  |------|  6  |---------+
                                  +-----+      +-----+
                (a) Dijkstra's Algorithm (OSPF/IS-IS)


      +----+      f1                          +-----+ ----> +-----+
      |Edge|-----------+             +--------|  3  |-------|  8  |
      |Node|---------+ |             |        +-----+ ----> +-----+
      +----+         | |       4/95% |               6/50%    ^|^
                     | |             |                   5/60%|||
                     | v             |                        |||
      +----+       +-----+ -----> +-----+ ---> +-----+ ---> +-----+
      |Edge|-------|  1  |--------|  2  |------|  4  |------|  7  |
      |Node|-----> +-----+        +-----+7/60% +-----+5/45% +-----+
      +----+  f2     ||     3/50%                              |^
                     ||                                        ||
                     ||   3/60%   +-----+5/55% +-----+   3/75% ||
                     |+-----------|  5  |------|  6  |---------+|
                     +----------> +-----+ ---> +-----+ ---------+
                   (b) CCDR Algorithm

Figure 5: Case Study for CCDR's Algorithm

4.2. Topology Simulation

Moving on from the specific case study, we now consider a class of networks more representative of real deployments, with a fully linked core network that serves to connect edge nodes, which themselves connect to only a subset of the core. An example of such a topology is shown in Figure 6 for the case of 4 core nodes and 5 edge nodes. The CCDR simulations presented in this work use topologies involving 100 core nodes and 400 edge nodes. While the resulting graph does not fit on this page, this scale of network is similar to what is deployed in production environments.¶

                                +----+
                               /|Edge|\
                              | +----+ |
                              |        |
                              |        |
                +----+    +----+     +----+
                |Edge|----|Core|-----|Core|---------+
                +----+    +----+     +----+         |
                        /  |    \   /   |           |
                  +----+   |     \ /    |           |
                  |Edge|   |      X     |           |
                  +----+   |     / \    |           |
                        \  |    /   \   |           |
                +----+    +----+     +----+         |
                |Edge|----|Core|-----|Core|         |
                +----+    +----+     +----+         |
                            |          |            |
                            |          +------\   +----+
                            |                  ---|Edge|
                            +-----------------/   +----+

Figure 6: Topology of Simulation

For the simulations, the number of links connecting one edge node to the set of core nodes is randomly chosen between two and thirty, and the total number of links is more than 20,000. Each link has a congestion threshold, which can be arbitrarily set, for example, to 90% of the nominal link capacity without affecting the simulation results.¶

4.3. Traffic Matrix Simulation

For each topology, a traffic matrix is generated based on the link capacity of the topology. It can result in many kinds of situations such as congestion, mild congestion, and non-congestion.¶

In the CCDR simulation, the dimension of the traffic matrix is 500*500 (100 core nodes plus 400 edge nodes). About 20% of links are overloaded when the Open Shortest Path First (OSPF) protocol is used in the network.¶

4.4. CCDR End-to-End Path Optimization

The CCDR E2E path optimization entails finding the best path, which is the lowest in metric value, as well as having utilization far below the congestion threshold for each link of the path. Based on the current state of the network, the PCE within CCDR framework combines the shortest path algorithm with a penalty theory of classical optimization and graph theory.¶

Given a background traffic matrix, which is unscheduled, when a set of new flows comes into the network, the E2E path optimization finds the optimal paths for them. The selected paths bring the least congestion degree to the network.¶

The link Utilization Increment Degree (UID), when the new flows are added into the network, is shown in Figure 7. The first graph in Figure 7 is the UID with OSPF, and the second graph is the UID with CCDR E2E path optimization. The average UID of the first graph is more than 30%. After path optimization, the average UID is less than 5%. The results show that the CCDR E2E path optimization has an eye-catching decrease in UID relative to the path chosen based on OSPF.¶

While real-world results invariably differ from simulations (for example, real-world topologies are likely to exhibit correlation in the attachment patterns for edge nodes to the core, which are not reflected in these results), the dramatic nature of the improvement in UID and the choice of simulated topology to resemble real-world conditions suggest that real-world deployments will also experience significant improvement in UID results.¶

       +-----------------------------------------------------------+
       |                *                               *    *    *|
     60|                *                             * * *  *    *|
       |*      *       **     * *         *   *   *  ** * *  * * **|
       |*   * ** *   * **   *** **  *   * **  * * *  ** * *  *** **|
       |* * * ** *  ** **   *** *** **  **** ** ***  **** ** *** **|
     40|* * * ***** ** ***  *** *** **  **** ** *** ***** ****** **|
 UID(%)|* * ******* ** ***  *** ******* **** ** *** ***** *********|
       |*** ******* ** **** *********** *********** ***************|
       |******************* *********** *********** ***************|
     20|******************* ***************************************|
       |******************* ***************************************|
       |***********************************************************|
       |***********************************************************|
      0+-----------------------------------------------------------+
      0    100   200   300   400   500   600   700   800   900  1000
       +-----------------------------------------------------------+
       |                                                           |
     60|                                                           |
       |                                                           |
       |                                                           |
       |                                                           |
     40|                                                           |
 UID(%)|                                                           |
       |                                                           |
       |                                                           |
     20|                                                           |
       |                                                          *|
       |                                     *                    *|
       |        *         *  *    *       *  **                 * *|
      0+-----------------------------------------------------------+
      0    100   200   300   400   500   600   700   800   900  1000
                            Flow Number

Figure 7: Simulation Results with Congestion Elimination

4.5. Network Temporal Congestion Elimination

During the simulations, different degrees of network congestion were considered. To examine the effect of CCDR on link congestion, we consider the Congestion Degree (CD) of a link, defined as the link utilization beyond its threshold.¶

The CCDR congestion elimination performance is shown in Figure 8. The first graph is the CD distribution before the process of congestion elimination. The average CD of all congested links is about 20%. The second graph shown in Figure 8 is the CD distribution after using the congestion elimination process. It shows that only twelve links among the total 20,000 exceed the threshold, and all the CD values are less than 3%. Thus, after scheduling the traffic away from the congested paths, the degree of network congestion is greatly eliminated and the network utilization is in balance.¶

            Before congestion elimination
        +-----------------------------------------------------------+
        |                *                            ** *   ** ** *|
      20|                *                     *      **** * ** ** *|
        |*      *       **     * **       **  **** * ***** *********|
        |*   *  * *   * **** ****** *  ** *** **********************|
      15|* * * ** *  ** **** ********* *****************************|
        |* * ******  ******* ********* *****************************|
  CD(%) |* ********* ******* ***************************************|
      10|* ********* ***********************************************|
        |*********** ***********************************************|
        |***********************************************************|
       5|***********************************************************|
        |***********************************************************|
        |***********************************************************|
       0+-----------------------------------------------------------+
           0            0.5            1            1.5            2

                     After congestion elimination
       +-----------------------------------------------------------+
       |                                                           |
     20|                                                           |
       |                                                           |
       |                                                           |
     15|                                                           |
       |                                                           |
 CD(%) |                                                           |
     10|                                                           |
       |                                                           |
       |                                                           |
     5 |                                                           |
       |                                                           |
       |        *        **  * *  *  **   *  **                 *  |
     0 +-----------------------------------------------------------+
        0            0.5            1            1.5            2
                         Link Number(*10000)

Figure 8: Simulation Results with Congestion Elimination

It is clear that by using an active path-computation mechanism that is able to take into account observed link traffic/congestion, the occurrence of congestion events can be greatly reduced. Only when a preponderance of links in the network are near their congestion threshold will the central controller be unable to find a clear path as opposed to when a static metric-based procedure is used, which will produce congested paths once a single bottleneck approaches its capacity. More detailed information about the algorithm can be found in [PTCS].¶

RFC 8735

Scenarios and Simulation Results of PCE in a Native IP Network

Abstract

Status of This Memo

Copyright Notice

Table of Contents

1. Introduction

2. Terminology

3. CCDR Scenarios

3.1. QoS Assurance for Hybrid Cloud-Based Application

3.2. Link Utilization Maximization

3.3. Traffic Engineering for Multi-domain

3.4. Network Temporal Congestion Elimination

4. CCDR Simulation

4.1. Case Study for CCDR Algorithm

4.2. Topology Simulation

4.3. Traffic Matrix Simulation

4.4. CCDR End-to-End Path Optimization

4.5. Network Temporal Congestion Elimination

5. CCDR Deployment Consideration

6. Security Considerations

7. IANA Considerations

8. References

8.1. Normative References

8.2. Informative References

Acknowledgements

Contributors

Authors' Addresses