RFC 9855: Topology Independent Fast Reroute Using Segment Routing
- A. Bashandy,
- S. Litkowski,
- C. Filsfils,
- P. Francois,
- B. Decraene,
- D. Voyer
Abstract
This document presents Topology Independent Loop-Free Alternate
(TI-LFA) Fast Reroute (FRR), which is aimed at providing protection of
node and Adjacency segments within the Segment Routing (SR)
framework. This FRR behavior builds on proven IP FRR concepts, namely
LFAs, Remote LFAs (RLFAs), and Directed Loop-Free Alternates (DLFAs).
It extends these concepts to provide guaranteed coverage in any
two-connected network using a link-state IGP. An important aspect of
TI-LFA is the FRR path selection approach establishing protection over
the expected post
Status of This Memo
This is an Internet Standards Track document.¶
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.¶
Information about the current status of this document, any
errata, and how to provide feedback on it may be obtained at
https://
Copyright Notice
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://
1. Introduction
This document outlines a local repair mechanism that leverages Segment Routing (SR) to restore end-to-end connectivity in the event of a failure involving a directly connected network component. This mechanism is designed for standard link-state Interior Gateway Protocol (IGP) shortest path scenarios. Non-SR mechanisms for local repair are beyond the scope of this document. Non-local failures are addressed in a separate document [SR-LOOP].¶
The term Topology Independent (TI) describes the capability of providing a loop-free backup path that is effective across all network topologies. This is a major improvement over LFA [RFC5286] and RLFA [RFC7490], which cannot provide complete protection coverage in some topologies, as described in [RFC6571].
When the network reconverges after failure, micro-loops [RFC5715] can form due to transient inconsistencies in the forwarding tables of different routers. If it is determined that micro-loops are a significant issue in the deployment, then a suitable loop-free convergence method should be implemented, such as one of those described in [RFC5715], [RFC6976], [RFC8333], or [SR-LOOP].¶
TI-LFA operates locally at the Point of Local Repair (PLR) upon detecting a failure in one of its direct links. Consequently, this local operation does not influence:¶
TI-LFA paths are activated from the instant the PLR detects a failure in a local link and remain in effect until the IGP convergence at the PLR is fully achieved. Consequently, they are not susceptible to micro-loops that may arise due to variations in the IGP convergence times across the different nodes that these paths traverse. This ensures a stable and predictable routing environment, minimizing disruptions typically associated with asynchronous network behavior. However, an early (relative to the other nodes) IGP convergence at the PLR and the consequent "early" release of TI-LFA paths may cause micro-loops, especially if these paths have been computed using the methods described in Sections 5.2, 5.3, or 5.4 of this document. One of the possible ways to prevent such micro-loops is local convergence delay [RFC8333].
TI-LFA procedures are complementary to the application of any micro-loop avoidance procedures in the case of link or node failure:¶
For each destination (as specified by the IGP) in the network, TI-LFA pre-installs a backup forwarding entry that is ready to be activated upon detection of the failure of a link used to reach that destination. TI-LFA provides protection in the event of any one of the following: single link failure, single node failure, or single Shared Risk Link Group (SRLG) failure. In link protection mode, the destination is protected assuming the failure of the link. In node protection mode, the destination is protected assuming that the neighbor connected to the primary link (see Section 2) has failed. In SRLG protection mode, the destination is protected assuming that a configured set of links sharing fate with the primary link has failed (e.g., a linecard or a set of links sharing a common transmission pipe).
Protection techniques outlined in this document are limited to protecting links, nodes, and SRLGs that are within a link-state IGP area. Protecting domain exit routers and/or links attached to another routing domain is beyond the scope of this document.¶
By utilizing SR, TI-LFA eliminates the need to establish Targeted Label Distribution Protocol sessions with remote nodes for leveraging the benefits of Remote Loop-Free Alternates (RLFAs) [RFC7490] [RFC7916] or Directed Loop-Free Alternates (DLFAs) [IPFRR-TUNNELS]. All the Segment Identifiers (SIDs) required are present within the Link State Database (LSDB) of the IGP. Consequently, there is no longer a necessity to prefer LFAs over RLFAs or DLFAs, nor is there a need to minimize the number of RLFA or DLFA repair nodes.¶
Utilizing SR also eliminates the need to establish an additional state within the network for enforcing explicit Fast Reroute (FRR) paths. This spares the nodes from maintaining a supplementary state and frees the operator from the necessity to implement additional protocols or protocol sessions solely to augment protection coverage.¶
TI-LFA also brings the benefit of the ability to provide a backup
path that follows the expected post
This document is structured as follows:¶
2. Terminology
2.1. Abbreviations and Notations
- DLFA: Directed Loop-Free Alternate
- FRR: Fast Reroute
- IGP: Interior Gateway Protocol
- LFA: Loop-Free Alternate
- LSDB: Link State Database
- PLR: Point of Local Repair
- RL: Repair List
- RLFA: Remote Loop-Free Alternate
- SID: Segment Identifier
- SPF: Shortest Path First
- SPT: Shortest Path Tree
- SR: Segment Routing
- SRLG: Shared Risk Link Group
- TI-LFA: Topology Independent Loop-Free Alternate
The main notations used in this document are defined as follows:¶
2.2. Conventions Used in This Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
3. Base Principle
The basic algorithm to compute the repair path is to pre-compute SPT_new(R, X) and, for each destination, encode the repair path as a loop-free segment list. One way to provide a loop-free segment list is to use Adjacency SIDs only. However, this approach may create very long SID lists that hardware may not be able to handle due to Maximum SID Depth (MSD) limitations.¶
An implementation is free to use any local optimization to provide smaller segment lists by combining Node-SIDs and Adjacency SIDs. In addition, the usage of Node-SIDs allows for maximizing ECMPs over the backup path. These optimizations are out of scope of this document; however, the subsequent sections provide some guidance on how to leverage P-spaces and Q-spaces to optimize the size of the segment list.
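The basic algorithm can be illustrated with a minimal sketch, which is not a normative procedure. It runs an SPF on the topology with the protected link removed and encodes the resulting path using Adjacency SIDs only. The topology, the function names, and the `Adj-SID(a-b)` string notation are assumptions made for illustration; ECMP and MSD limits are ignored.

```python
import heapq

def spf(graph, root, failed_links=frozenset()):
    """Dijkstra SPF from root, skipping every link listed in failed_links.

    graph: {node: {neighbor: metric}}; failed_links: set of frozenset({a, b}).
    Returns (dist, parent); parent records one shortest-path predecessor.
    """
    dist = {root: 0}
    parent = {}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, metric in graph[u].items():
            if frozenset((u, v)) in failed_links:
                continue  # resource X is assumed to have failed
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def adjacency_only_repair_list(graph, src, dst, failed_links):
    """Encode the post-failure path src -> dst as a list of Adjacency SIDs."""
    _, parent = spf(graph, src, failed_links)
    if dst != src and dst not in parent:
        return None  # dst is unreachable once the resource is removed
    hops = []
    node = dst
    while node != src:
        hops.append((parent[node], node))
        node = parent[node]
    hops.reverse()
    return [f"Adj-SID({a}-{b})" for a, b in hops]

# Hypothetical topology: S reaches D via F (primary) or via A and B (backup).
TOPOLOGY = {
    "S": {"F": 1, "A": 1},
    "F": {"S": 1, "D": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "D": 1},
    "D": {"F": 1, "B": 1},
}
REPAIR = adjacency_only_repair_list(TOPOLOGY, "S", "D", {frozenset(("S", "F"))})
```

The list grows with the hop count of the backup path, which is why combining Node-SIDs and Adjacency SIDs is attractive when the hardware limits the SID depth.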
4. Intersecting P-space and Q-space with Post-Convergence Paths
One of the challenges of defining an SR path following the expected
post
4.1. Extended P-space Property Computation for a Resource X over Post-Convergence Paths
The objective is to determine which nodes on the post
This can be found by:¶
4.2. Q-space Property Computation for a Resource X over Post-Convergence Paths
The goal is to determine which nodes on the post
This can be found by intersecting the set of nodes belonging to the
post
4.3. Scaling Considerations When Computing Q-space
[RFC7490] raises scaling concerns about computing a Q-space per destination. Similar concerns may affect TI-LFA computation if an implementation tries to compute a reverse Shortest Path Tree (SPT) [RFC7490] for every destination in the network to determine the Q-space. It is up to each implementation to determine the right trade-off between scaling and accuracy of the optimization.
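As an illustration only, the following sketch computes a plain P-space and Q-space for a single protected link, assuming the definitions of [RFC7490]: a node is in P(S, X) if no shortest path from S to it traverses X, and in Q(D, X) if no shortest path from it to D traverses X. It does not compute the extended P-space of Section 4.1, and it runs one SPF per node, which is exactly the kind of cost the scaling discussion above is concerned with. The topology and all function names are hypothetical.

```python
import heapq

def distances(graph, root):
    """Plain Dijkstra on the pre-failure topology; returns {node: cost}."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, metric in graph[u].items():
            if d + metric < dist.get(v, float("inf")):
                dist[v] = d + metric
                heapq.heappush(heap, (d + metric, v))
    return dist

def on_shortest_path(dist, graph, src, dst, link):
    """True if at least one shortest path src -> dst uses link (either direction)."""
    a, b = link
    total = dist[src][dst]
    return (dist[src][a] + graph[a][b] + dist[b][dst] == total
            or dist[src][b] + graph[b][a] + dist[a][dst] == total)

def p_space(dist, graph, s, link):
    """Nodes that s reaches without any shortest path crossing link."""
    return {n for n in graph if not on_shortest_path(dist, graph, s, n, link)}

def q_space(dist, graph, d, link):
    """Nodes that reach d without any shortest path crossing link."""
    return {n for n in graph if not on_shortest_path(dist, graph, n, d, link)}

# Hypothetical topology with integer metrics (avoids float comparisons).
TOPOLOGY = {
    "S": {"F": 1, "A": 1},
    "F": {"S": 1, "D": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "D": 1},
    "D": {"F": 1, "B": 1},
}
DIST = {n: distances(TOPOLOGY, n) for n in TOPOLOGY}
LINK = ("S", "F")
P = p_space(DIST, TOPOLOGY, "S", LINK)
Q = q_space(DIST, TOPOLOGY, "D", LINK)
PQ = P & Q
```

In this example, A and B are PQ nodes for destination D when link S-F is protected, so either one could serve as a repair node.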
5. TI-LFA Repair Path
The TI-LFA repair path consists of an outgoing interface and a list
of segments (a Repair List (RL)) to insert on the SR header in
accordance with the data plane used. The RL encodes the
explicit (and possibly post
The TI-LFA repair path is found by intersecting P(S, X) and Q(D, X) with
the post
As an example, in Figure 1, the focus is on the TI-LFA backup from S to D, considering the failure of node N1.¶
As a result, the TI-LFA RL of S for destination D
considering the failure of node N1 is: <Node-SID(R1), Adj-SID(R1-R2),
Adj
Most often, the TI-LFA RL has a simpler form, as described in the following sections. Appendix B provides statistics for the number of SIDs in the explicit path to protect against various failures.¶
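The simpler forms can be sketched as a scan along the post-convergence path. This is a hedged illustration, not the specified procedure. It assumes the usual encodings: an empty RL when the first hop is itself in both P and Q, a single Node-SID for a remote PQ node, and a Node-SID followed by an Adjacency SID for adjacent P and Q nodes. The function name, the return shape, and the SID strings are hypothetical.

```python
def select_repair_list(path, p_space, q_space):
    """Choose a repair list along a post-convergence path.

    path: nodes from S to D, S first; p_space, q_space: sets of nodes.
    Returns (outgoing neighbor, repair list), or (neighbor, None) when the
    P and Q nodes are not adjacent and a longer list is needed.
    """
    first_hop = path[1]
    # Direct neighbor or remote PQ node on the path.
    for node in path[1:]:
        if node in p_space and node in q_space:
            if node == first_hop:
                return first_hop, []  # direct neighbor: nothing to push
            return first_hop, [f"Node-SID({node})"]
    # Adjacent P and Q nodes on the path.
    for p, q in zip(path[1:], path[2:]):
        if p in p_space and q in q_space:
            return first_hop, [f"Node-SID({p})", f"Adj-SID({p}-{q})"]
    return first_hop, None  # distant P and Q nodes are not handled here

DIRECT = select_repair_list(["S", "A", "B", "D"], {"A", "B"}, {"A", "B", "D"})
REMOTE_PQ = select_repair_list(["S", "A", "B", "C", "D"], {"A", "B"}, {"B", "C", "D"})
ADJACENT = select_repair_list(["S", "A", "B", "C", "D"], {"A", "B"}, {"C", "D"})
```

The scan prefers the shortest encoding it can find, which matches the goal of keeping the RL within the SID depth that the hardware supports.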
5.1. FRR Path Using a Direct Neighbor
When a direct neighbor is in P(S, X) and Q(D, X), and the link to that
direct neighbor is on the post
This is comparable to a post
5.2. FRR Path Using a PQ Node
When a remote node R is in P(S, X) and Q(D, X) and on the
post
This is comparable to a post
5.3. FRR Path Using a P Node and Q Node That Are Adjacent
When a node P is in P(S, X) and a node Q is in Q(D, X), and both are on
the post
This is comparable to a post
5.4. Connecting Distant P and Q Nodes Along Post-Convergence Paths
In some cases, there is no adjacent P and Q node along the
post‑convergenc
6. Building TI-LFA Repair Lists for SR Segments
The following sections describe how to build the RLs using the terminology defined in [RFC8402]. The procedures described in this section are equally applicable to both the Segment Routing over MPLS (SR-MPLS) and the Segment Routing over IPv6 (SRv6) data plane, while the data plane-specific considerations are described in Section 7.¶
This section explains the process by which a protecting router S handles the active segment of a packet upon the failure of S-F, its primary outgoing interface for the packet. The failure of the primary outgoing interface may occur due to various triggers, such as link failure, neighbor node failure, and others.
6.1. The Active Segment Is a Node Segment
The active segment MUST be kept on the SR header unchanged and the RL MUST be added. The active segment becomes the first segment after the RL. The way the RL is added depends on the data plane used (see Section 7).¶
6.2. The Active Segment Is an Adjacency Segment
This section defines the FRR behavior applied by S for any packet received with an active Adjacency segment S-F for which protection was enabled. Since protection has been enabled for the segment S-F and signaled in the IGP (for instance, using protocol extensions from [RFC8667] and [RFC8665]), a calculator of any SR policy utilizing this segment is aware that it may be transiently rerouted out of S-F in the event of an S-F failure.¶
The simplest approach for link protection of an Adjacency segment S-F is to create an RL that will carry the traffic to F. To do so, one or more "PUSH" operations are performed. If the RL, while avoiding S-F, terminates on F, S only pushes segments of the RL. Otherwise, S pushes a node segment of F, followed by the segments of the RL. For details on the "NEXT" and "PUSH" operations, refer to [RFC8402].¶
This method, which merges back the traffic at the remote end of the Adjacency segment, has the advantage of keeping as much traffic as possible on the pre-failure path. When SR policies are involved and strict compliance with the policy is required, an end-to-end protection (beyond the scope of this document) should be preferred over the local repair mechanism described above.¶
Note, however, that when the SR source node is using Traffic
Engineering (TE), it will generally not be possible for the PLR to know
what post
The case where the active segment is followed by another Adjacency segment is distinguished from the case where it is followed by a node segment. Repair techniques for the respective cases are provided in the following subsections.¶
6.2.1. Protecting [Adjacency, Adjacency] Segment Lists
If the next segment in the list is an Adjacency segment, then the packet has to be conveyed to F.¶
To do so, S MUST apply a "NEXT" operation on Adj-SID(S-F) and then one or more "PUSH" operations. If the RL, while avoiding S-F, terminates on F, S only pushes the segments of the RL. Otherwise, S pushes a node segment of F, followed by the segments of the RL. For details on the "NEXT" and "PUSH" operations, refer to [RFC8402].¶
Upon failure of S-F, a packet reaching S with a segment list matching [adj-sid(S-F), adj-sid(F-M), ...] will thus leave S with a segment list matching [RL(F), node(F), adj-sid(F-M), ...], where RL(F) is the RL for destination F.¶
6.2.2. Protecting [Adjacency, Node] Segment Lists
If the next segment in the stack is a node segment, say for node T, the segment list on the packet matches [adj-sid(S-F), node(T), ...].¶
In this case, S MUST apply a "NEXT" operation on the Adjacency segment related to S-F, followed by a "PUSH" of an RL redirecting the traffic to a node Q, whose path to node segment T is not affected by the failure.¶
Upon failure of S-F, packets reaching S with a segment list matching [adj-sid(S-F), node(T), ...] would leave S with a segment list matching [RL(Q), node(T), ...].¶
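The three cases of this section (active node segment, [Adjacency, Adjacency], and [Adjacency, Node]) can be summarized in a short sketch. The tuple encoding of segments, the `rl_for` lookup table, and the function name are assumptions made for illustration. Treating an Adjacency segment with no following segment like the [Adjacency, Adjacency] case is also an assumption of this sketch.

```python
def protect(segments, protected_link, rl_for, rl_ends_on_f=False):
    """Return the segment list a packet leaves S with after S-F has failed.

    segments: list of ("node", N) or ("adj", A, B) tuples, active segment first.
    protected_link: (S, F).
    rl_for: hypothetical table {target node: repair list used toward it}.
    rl_ends_on_f: True if the repair list toward F already terminates on F.
    """
    s, f = protected_link
    active, rest = segments[0], segments[1:]
    if active[0] == "node":
        # Section 6.1: keep the active segment unchanged, add the RL in front.
        return rl_for[active[1]] + segments
    if active != ("adj", s, f):
        raise ValueError("active segment does not use the protected link")
    if rest and rest[0][0] == "node":
        # Section 6.2.2: NEXT on adj(S-F), then PUSH the RL toward a node Q
        # whose path to T is unaffected (stored here under key T).
        return rl_for[rest[0][1]] + rest
    # Section 6.2.1: NEXT on adj(S-F), PUSH node(F) unless the RL already
    # terminates on F, then PUSH the RL toward F.
    tail = rest if rl_ends_on_f else [("node", f)] + rest
    return rl_for[f] + tail

# Hypothetical repair lists computed beforehand by S.
RL_FOR = {
    "F": [("node", "R1")],
    "T": [("node", "Q")],
    "D": [("node", "R1"), ("adj", "R1", "R2")],
}
ADJ_ADJ = protect([("adj", "S", "F"), ("adj", "F", "M")], ("S", "F"), RL_FOR)
ADJ_NODE = protect([("adj", "S", "F"), ("node", "T")], ("S", "F"), RL_FOR)
NODE = protect([("node", "D")], ("S", "F"), RL_FOR)
```

How the resulting list is written into the packet depends on the data plane, as described in Section 7.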
7. Data Plane-Specific Considerations
7.1. MPLS Data Plane Considerations
The MPLS data plane for SR is described in [RFC8660].¶
The following data plane behaviors apply when creating an RL using an MPLS data plane:¶
7.2. SRv6 Data Plane Considerations
SRv6 data plane and programming instructions are described respectively in [RFC8754] and [RFC8986].¶
The TI-LFA path computation algorithm is the same as in the SR-MPLS data plane. Note, however, that the Adjacency SIDs are typically globally routed. In such a case, there is no need for preceding an Adjacency SID with a Prefix-SID [RFC8402], and the resulting RL is likely shorter.¶
If the traffic is protected at a Transit Node, then an SRv6 SID list is added on the packet to apply the RL. The addition of the RL follows the head-end behaviors as specified in Section 5 of [RFC8986].¶
If the traffic is protected at an SR Segment Endpoint Node, first the Segment Endpoint packet processing is executed. Then, the packet is protected as if it were a transit packet.¶
8. TI-LFA and SR Algorithms
SR allows an operator to bind an algorithm to a Prefix-SID (as defined in [RFC8402]). The algorithm value dictates how the path to the prefix is computed. The SR default algorithm is known as the "Shortest Path" algorithm. The SR default algorithm allows an operator to override the IGP shortest path by using local policies. When TI-LFA uses Node-SIDs associated with the default algorithm, there is no guarantee that the path will be loop-free, as a local policy may have overridden the expected IGP path. As the local policies are defined by the operator, it becomes the responsibility of this operator to ensure that the deployed policies do not affect the TI-LFA deployment. It should be noted that such a situation can already happen today with existing mechanisms such as RLFA.¶
[RFC9350] defines a Flexible Algorithm
framework to be associated with Prefix-SIDs. A Flexible Algorithm allows a user to
associate a constrained path to a Prefix-SID rather than using the
regular IGP shortest path. An implementation MAY support TI-LFA to
protect Node-SIDs associated with a Flexible Algorithm. In such a case, rather
than computing the expected post
9. Usage of Adjacency Segments in the Repair List
The RL of segments computed by TI-LFA may contain one or more Adjacency segments. An Adjacency segment may be protected or not protected.¶
In Figure 2, all the metrics
are equal to 1 except R2
To avoid the possibility of this double FRR activation, an implementation of TI-LFA MAY pick only unprotected Adjacency segments when building the RL. However, it is important to note that FRR in general is intended to protect against a single pre-planned failure. If the failure that happens is worse than expected or multiple failures happen, FRR is not guaranteed to work. In such a case, fast IGP convergence remains important to restore traffic as quickly as possible.
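A minimal sketch of such a selection is shown below. It assumes that each link may advertise both a protected and an unprotected Adjacency SID. The `AdjSid` record, its field names, the label values, and the `strict` option are all hypothetical; falling back to a protected SID when `strict` is false is a design choice of this sketch, not a requirement of this document.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

@dataclass(frozen=True)
class AdjSid:
    link: Tuple[str, str]   # (local node, neighbor)
    label: int              # SID value, illustrative only
    protected: bool         # True if the SID is eligible for protection

def pick_adj_sid(candidates: Sequence[AdjSid], link: Tuple[str, str],
                 strict: bool = True) -> Optional[AdjSid]:
    """Pick an Adjacency SID for link, preferring an unprotected one.

    With strict=True, only unprotected SIDs are acceptable and None is
    returned if the link advertises none. With strict=False, a protected
    SID is used as a fallback.
    """
    on_link = [sid for sid in candidates if sid.link == link]
    for sid in on_link:
        if not sid.protected:
            return sid
    if strict or not on_link:
        return None
    return on_link[0]

# Hypothetical advertisements for two links.
ADVERTISED = [
    AdjSid(("R1", "R2"), 24001, protected=True),
    AdjSid(("R1", "R2"), 24002, protected=False),
    AdjSid(("R2", "R3"), 24003, protected=True),
]
CHOSEN = pick_adj_sid(ADVERTISED, ("R1", "R2"))
```

With `strict=True`, a link that only advertises protected Adjacency SIDs yields no candidate, and the implementation would need another encoding for that hop.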
10. Security Considerations
The techniques described in this document are internal functionalities
to a router that can guarantee an upper bound on the time taken to
restore traffic flow upon the failure of a directly connected link or
node. As these techniques steer traffic to the post
The security considerations described in [RFC5286] and [RFC7490] apply to this document. Similarly, as the solution described in this document is based on SR technology, the reader should be aware of the security considerations related to this technology (see [RFC8402]) and its data plane instantiations (see [RFC8660], [RFC8754], and [RFC8986]). However, this document does not introduce additional security concerns.¶
11. IANA Considerations
This document has no IANA actions.¶
12. References
12.1. Normative References
- [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
- [RFC7916] Litkowski, S., Ed., Decraene, B., Filsfils, C., Raza, K., Horneffer, M., and P. Sarkar, "Operational Management of Loop-Free Alternates", RFC 7916, DOI 10.17487/RFC7916, <https://www.rfc-editor.org/info/rfc7916>.
- [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
- [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, <https://www.rfc-editor.org/info/rfc8402>.
- [RFC8660] Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing with the MPLS Data Plane", RFC 8660, DOI 10.17487/RFC8660, <https://www.rfc-editor.org/info/rfc8660>.
- [RFC8754] Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J., Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header (SRH)", RFC 8754, DOI 10.17487/RFC8754, <https://www.rfc-editor.org/info/rfc8754>.
- [RFC8986] Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer, D., Matsushima, S., and Z. Li, "Segment Routing over IPv6 (SRv6) Network Programming", RFC 8986, DOI 10.17487/RFC8986, <https://www.rfc-editor.org/info/rfc8986>.
12.2. Informative References
- [IPFRR-TUNNELS] Bryant, S., Filsfils, C., Previdi, S., and M. Shand, "IP Fast Reroute using tunnels", Work in Progress, Internet-Draft, draft-bryant-ipfrr-tunnels-03, <https://datatracker.ietf.org/doc/html/draft-bryant-ipfrr-tunnels-03>.
- [RFC5286] Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for IP Fast Reroute: Loop-Free Alternates", RFC 5286, DOI 10.17487/RFC5286, <https://www.rfc-editor.org/info/rfc5286>.
- [RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC 5714, DOI 10.17487/RFC5714, <https://www.rfc-editor.org/info/rfc5714>.
- [RFC5715] Shand, M. and S. Bryant, "A Framework for Loop-Free Convergence", RFC 5715, DOI 10.17487/RFC5715, <https://www.rfc-editor.org/info/rfc5715>.
- [RFC6571] Filsfils, C., Ed., Francois, P., Ed., Shand, M., Decraene, B., Uttaro, J., Leymann, N., and M. Horneffer, "Loop-Free Alternate (LFA) Applicability in Service Provider (SP) Networks", RFC 6571, DOI 10.17487/RFC6571, <https://www.rfc-editor.org/info/rfc6571>.
- [RFC6976] Shand, M., Bryant, S., Previdi, S., Filsfils, C., Francois, P., and O. Bonaventure, "Framework for Loop-Free Convergence Using the Ordered Forwarding Information Base (oFIB) Approach", RFC 6976, DOI 10.17487/RFC6976, <https://www.rfc-editor.org/info/rfc6976>.
- [RFC7490] Bryant, S., Filsfils, C., Previdi, S., Shand, M., and N. So, "Remote Loop-Free Alternate (LFA) Fast Reroute (FRR)", RFC 7490, DOI 10.17487/RFC7490, <https://www.rfc-editor.org/info/rfc7490>.
- [RFC8333] Litkowski, S., Decraene, B., Filsfils, C., and P. Francois, "Micro-loop Prevention by Introducing a Local Convergence Delay", RFC 8333, DOI 10.17487/RFC8333, <https://www.rfc-editor.org/info/rfc8333>.
- [RFC8665] Psenak, P., Ed., Previdi, S., Ed., Filsfils, C., Gredler, H., Shakir, R., Henderickx, W., and J. Tantsura, "OSPF Extensions for Segment Routing", RFC 8665, DOI 10.17487/RFC8665, <https://www.rfc-editor.org/info/rfc8665>.
- [RFC8667] Previdi, S., Ed., Ginsberg, L., Ed., Filsfils, C., Bashandy, A., Gredler, H., and B. Decraene, "IS-IS Extensions for Segment Routing", RFC 8667, DOI 10.17487/RFC8667, <https://www.rfc-editor.org/info/rfc8667>.
- [RFC9256] Filsfils, C., Talaulikar, K., Ed., Voyer, D., Bogdanov, A., and P. Mattes, "Segment Routing Policy Architecture", RFC 9256, DOI 10.17487/RFC9256, <https://www.rfc-editor.org/info/rfc9256>.
- [RFC9350] Psenak, P., Ed., Hegde, S., Filsfils, C., Talaulikar, K., and A. Gulko, "IGP Flexible Algorithm", RFC 9350, DOI 10.17487/RFC9350, <https://www.rfc-editor.org/info/rfc9350>.
- [SR-LOOP] Bashandy, A., Filsfils, C., Litkowski, S., Decraene, B., Francois, P., and P. Psenak, "Loop avoidance using Segment Routing", Work in Progress, Internet-Draft, draft-bashandy-rtgwg-segment-routing-uloop-17, <https://datatracker.ietf.org/doc/html/draft-bashandy-rtgwg-segment-routing-uloop-17>.
Appendix A. Advantages of Using the Expected Post-Convergence Path During FRR
[RFC7916] raises several operational considerations when using LFA or RLFA. Section 3 of [RFC7916] presents a case where a high bandwidth link between two core routers is protected through a Provider Edge (PE) router connected with low bandwidth links. In such a case, congestion may happen when the FRR backup path is activated. [RFC7916] introduces a local policy framework to let the operator manually tune the best alternate election based on its own requirements.
From a network capacity planning point of view, it is often assumed
for simplicity that if a link L fails on a particular node X, the
bandwidth consumed on L will be spread over some of the remaining links
of X. The remaining links to be used are determined by the IGP routing
considering that the link L has failed (we assume that the traffic uses
the post
In Figure 3, considering that the source of traffic is only from PE1 and PE4, when the link L fails, depending on the convergence speed of the nodes, X may reroute its forwarding entries to the remote PEs onto X-H or X-D; however, in a similar timeframe, PE1 will also reroute a subset of its traffic (the subset destined to PE2) out of its nominal path, reducing the quantity of traffic received by X. The capacity planning rule presented previously has the drawback of oversizing the network; however, it allows for preventing any transient congestion (for example, when X reroutes traffic before PE1 does).¶
Based on this assumption, in order to facilitate the operation of
FRR and limit the implementation of local FRR policies, traffic can be
steered by the PLR onto its expected post
It should be noted that some networks may have a different capacity
planning rule, leading to an allocation of less bandwidth on X-H and X-D
links. In such a case, using the post
Readers should be aware that FRR protection is pre-computing a backup
path to protect against a particular type of failure (link, node, or SRLG).
When using the post
Another consideration to take into account is as follows: while using
the expected post
Appendix B. Analysis Based on Real Network Topologies
This section presents an analysis performed on real service provider and large enterprise network topologies. The objective of the analysis is to assess the number of SIDs required in an explicit path when the mechanisms described in this document are used to protect against the failure scenarios within the scope of this document. The numbers of segments described in this section apply to instantiating SR over the MPLS forwarding plane.
The measurements below indicate that, for link and local SRLG protection, a repair path of 1 SID or fewer delivers more than 99% coverage. For node protection, a repair path of 2 SIDs or fewer yields 99% coverage.
Table 1 below lists the characteristics of the networks used in our measurements. The number of links refers to the number of "bidirectional" links (not directed edges of the graph). The measurements are carried out as follows:¶
The rest of this section presents the measurements done on the actual topologies. The conventions that we use are as follows:¶
Tables 2 and 3 below summarize the measurements on the number of SIDs needed for link protection.¶
Tables 4 and 5 summarize the measurements on the number of SIDs needed for local SRLG protection.¶
The remaining two tables summarize the measurements on the number of SIDs needed for node protection.¶
Acknowledgments
The authors would like to thank Les Ginsberg, Stewart Bryant, Alexander Vainsthein, Chris Bowers, Shraddha Hedge, Wes Hardaker, Gunter Van de Velde, and John Scudder for their valuable comments.¶
Contributors
In addition to the authors listed on the front page, the following co-authors have also contributed to this document:¶