RFC 9524: Segment Routing Replication for Multipoint Service Delivery
- D. Voyer, Ed.,
- C. Filsfils,
- R. Parekh,
- H. Bidgoli,
- Z. Zhang
This RFC was updated
Abstract
This document describes the Segment Routing Replication segment for multipoint service delivery. A Replication segment allows a packet to be replicated from a replication node to downstream nodes.¶
Status of This Memo
This is an Internet Standards Track document.¶
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.¶
Information about the current status of this document, any
errata, and how to provide feedback on it may be obtained at
https://
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://
1. Introduction
The Replication segment is a new type of segment for Segment Routing (SR) [RFC8402], which allows a node (henceforth called a "replication node") to replicate packets to a set of other nodes (called "downstream nodes") in an SR domain. A Replication segment can replicate packets to directly connected nodes or to downstream nodes (without the need for state on the transit routers). This document focuses on specifying the behavior of a Replication segment for both Segment Routing with Multiprotocol Label Switching (SR-MPLS) [RFC8660] and Segment Routing with IPv6 (SRv6) [RFC8986]. The examples in Appendix A illustrate the behavior of a Replication segment in an SR domain. The use of two or more Replication segments stitched together to form a tree using a control plane is left to be specified in other documents. The management of IP multicast groups, building IP multicast trees, and performing multicast congestion control are out of scope of this document.¶
1.1. Terminology
This section defines terms introduced and used frequently in this document. Refer to the Terminology sections of [RFC8402], [RFC8754], and [RFC8986] for other terms used in SR.¶
- Replication segment:
- A segment in an SR domain that replicates packets. See Section 2 for details.¶
- Replication node:
- A node in an SR domain that replicates packets based on a Replication segment.¶
- Downstream nodes:
- A Replication segment replicates packets to a set of nodes. These nodes are downstream nodes.¶
- Replication state:
- State held for a Replication segment at a replication node. It is conceptually a list of Replication branches to downstream nodes. The list can be empty.¶
- Replication-SID:
- Data plane identifier of a Replication segment. This is an SR-MPLS label or SRv6 Segment Identifier (SID).¶
- SRH:
- IPv6 Segment Routing Header [RFC8754].¶
- Point-to-Multipoint (P2MP) Service:
- A service that has one ingress node and one or more egress nodes. A packet is delivered to all the egress nodes.¶
- Root node:
- An ingress node of a P2MP service.¶
- Leaf node:
- An egress node of a P2MP service.¶
- Bud node:
- A node that is both a replication node and a leaf node.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
1.2. Use Cases
In the simplest use case, a single Replication segment includes the ingress node of a multipoint service and the egress nodes of the service as all the downstream nodes. This achieves Ingress Replication [RFC7988], which has been widely used for Multicast VPN (MVPN) [RFC6513] and Ethernet VPN (EVPN) [RFC7432] bridging of Broadcast, Unknown Unicast, and Multicast (BUM) traffic. This Replication segment on ingress and egress nodes can be provisioned either locally or by using dynamic autodiscovery procedures for MVPN and EVPN. Note that SRv6 [RFC8986] has End.DT2M replication behavior for EVPN BUM traffic.¶
Replication segments can also be used to form trees by stitching Replication segments on a root node, intermediate replication nodes, and leaf nodes for efficient delivery of MVPN and EVPN BUM traffic.¶
2. Replication Segment
In an SR domain, a Replication segment is a logical construct that connects a replication node to a set of downstream nodes. A Replication segment is a local segment instantiated at a Replication node. It can be either provisioned locally on a node or programmed by a control plane.¶
Replication segments can be stitched together to form a tree by either local provisioning on nodes or using a control plane. The procedures for doing this are out of scope of this document. One such control plane using a PCE with the SR P2MP policy is specified in [P2MP-POLICY]. However, if local provisioning is used to stitch Replication segments, then a chain of Replication segments SHOULD NOT form a loop. If a control plane is used to stitch Replication segments, the control plane specification MUST prevent loops or detect and mitigate loops in steady state.¶
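The loop requirement above can be checked before locally provisioned segments are activated. The following is a minimal sketch, not a procedure defined by this document: it assumes a hypothetical model in which the stitching is a mapping from each Replication segment name to the names of the downstream Replication segments it replicates to, and it uses a depth-first search to look for a cycle.

```python
from typing import Dict, List


def has_replication_loop(stitching: Dict[str, List[str]]) -> bool:
    """Return True if the stitched Replication segments contain a cycle.

    `stitching` is a hypothetical model: it maps a segment name to the
    names of the downstream segments that segment replicates to.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, done
    colour: Dict[str, int] = {}

    def visit(segment: str) -> bool:
        colour[segment] = GREY
        for downstream in stitching.get(segment, []):
            state = colour.get(downstream, WHITE)
            if state == GREY:
                # The downstream segment is already on the current path.
                return True
            if state == WHITE and visit(downstream):
                return True
        colour[segment] = BLACK
        return False

    return any(colour.get(s, WHITE) == WHITE and visit(s) for s in stitching)
```

A provisioning tool could run such a check on the intended stitching and refuse a configuration for which it returns True. The recursive search is adequate for small topologies; a large deployment would likely want an iterative variant.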
A Replication segment is identified by the tuple <Replication
- Replication-ID:
- An identifier for a Replication segment that is unique in context of the replication node.¶
- Node-ID:
- The address of the replication node for the Replication segment. Note that the root of a multipoint service is also a Replication node.¶
Replication-ID is a variable-length field. In the simplest case, it
can be a 32-bit number, but it can be extended or modified as required
based on the specific use of a Replication segment. This is out of scope
for this document. The length of the Replication-ID is specified in the
signaling mechanism used for the Replication segment. Examples of such
signaling and extensions are described in [P2MP-POLICY]. When the PCE signals a
Replication segment to its node, the <Replication
A Replication segment includes the following elements:¶
- Replication-SID:
- The Segment Identifier of a Replication segment. This is an SR-MPLS label or an SRv6 SID [RFC8402].¶
- Downstream nodes:
- Set of nodes in an SR domain to which a packet is replicated by the Replication segment.¶
- Replication state:
- See below.¶
The downstream nodes and Replication state (RS) of a Replication segment can change over time, depending on the network state and leaf nodes of a multipoint service that the segment is part of.¶
The Replication-SID identifies the Replication segment in the forwarding plane. At a replication node, the Replication-SID operates on the RS of the Replication segment.¶
RS is a list of Replication branches to the downstream
nodes. In this document, each branch is abstracted to a <downstream
node, downstream Replication
A packet is steered into a Replication segment at a replication node in two ways:¶
In either case, the packet is replicated to each downstream node in the associated RS.¶
If a downstream node is an egress (leaf) of the multipoint service, no further replication is needed. The leaf node's Replication segment has an indicator for the leaf role, and it does not have any RS (i.e., the list of Replication branches is empty). The Replication-SID at a leaf node MAY be used to identify the multipoint service. Notice that the segment on the leaf node is still referred to as a "Replication segment" for the purpose of generalization.¶
A node can be a bud node (i.e., it is a replication node and a leaf node of a multipoint service [P2MP-POLICY]). The Replication segment of a bud node has a list of Replication branches as well as a leaf role indicator.¶
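The elements described in this section can be pictured as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, SIDs are kept as opaque strings, and each Replication branch is assumed to carry the downstream node together with the Replication-SID used at that node.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ReplicationBranch:
    """One branch of the Replication state (field names are hypothetical)."""
    downstream_node: str   # address or name of the downstream node
    downstream_sid: str    # Replication-SID used at that downstream node


@dataclass
class ReplicationSegment:
    """A Replication segment as held by the node that instantiates it."""
    replication_id: int    # unique in the context of the replication node
    node_id: str           # address of the replication node
    replication_sid: str   # SR-MPLS label or SRv6 SID, kept as text here
    is_leaf: bool = False  # leaf role indicator
    branches: List[ReplicationBranch] = field(default_factory=list)

    @property
    def is_bud(self) -> bool:
        # A bud node has Replication branches and the leaf role indicator.
        return self.is_leaf and bool(self.branches)

    def replicate(self, payload: bytes) -> List[Tuple[str, str, bytes]]:
        """Return one (node, SID, payload) copy per Replication branch."""
        return [(b.downstream_node, b.downstream_sid, payload)
                for b in self.branches]
```

With this model, a leaf is a segment with `is_leaf=True` and an empty `branches` list, so `replicate()` returns no copies, while a bud node returns one copy per branch and is also expected to deliver the payload locally.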
In principle, it is possible for different Replication segments to replicate packets to the same Replication segment on a downstream node. However, such usage is intentionally left out of scope of this document.¶
2.1. SR-MPLS Data Plane
When the active segment is a Replication
The operation performed on the incoming Replication-SID is NEXT [RFC8402] at a leaf or bud node where delivery of payload off the tree is per local configuration. For some usages, this may involve looking at the next SID, for example, to get the necessary context.¶
When the root of a multipoint service steers a packet to a Replication segment, it results in a replication to each downstream node in the associated RS. The operation is a PUSH of the Replication-SID and an optional segment list onto the packet, which is forwarded to the downstream node.¶
The following applies to a Replication-SID in MPLS encapsulation:¶
2.2. SRv6 Data Plane
For SRv6 [RFC8986], this document specifies "Endpoint with replication and/or decapsulate" behavior (End.Replicate for short) to replicate a packet and forward the replicas according to an RS.¶
When processing a packet destined to a local Replication
A local application on root (e.g., MVPN [RFC6513] or EVPN [RFC7432]) may also apply H.Encaps.Red and then steer the resulting traffic into the Replication segment. Again, note that H.Encaps.Red is independent of the Replication segment: it is the action of the application (e.g., MVPN or EVPN service). If the service is on a root node, then the two H.Encaps mentioned, one for the service and the other in the previous paragraph for replication to the downstream node, SHOULD be combined for optimization (to avoid extra IPv6 encapsulation).¶
When processing a packet destined to a local Replication
For leaf and bud nodes, local delivery off the tree is per Replication-SID or the next SID (if present in the SRH). For some usages, this may involve getting the necessary context either from the next SID (e.g., MVPN with a shared tree) or from the Replication-SID itself (e.g., MVPN with a non-shared tree). In both cases, the context association is achieved with signaling and is out of scope of this document.¶
The following applies to a Replication-SID in SRv6 encapsulation:¶
2.2.1. End.Replicate: Replicate and/or Decapsulate
The "Endpoint with replication and/or decapsulate" (End.Replicate for short) is a variant of End behavior. The pseudocode in this section follows the convention introduced in [RFC8986].¶
An RS conceptually contains the following elements:¶
Below is the Replicate function on a packet for Replication state (RS).¶
Notes:¶
When N receives a packet whose IPv6 DA is S and S is a local End.Replicate SID, N does:¶
The processing of the Upper-Layer header of a packet matching the End.Replicate SID at a leaf or bud node is as follows:¶
Notes:¶
If configured to process TLVs, processing the Replication-SID may
modify the "variable
2.2.1.1. Hashed Message Authentication Code (HMAC) SRH TLV
If a root node encodes a context-SID in an SRH with an optional HMAC SRH TLV [RFC8754], it MUST set the 'D' bit as defined in Section 2.1.2 of [RFC8754] because the Replication-SID is not part of the segment list in the SRH.¶
HMAC generation and verification is as specified in [RFC8754]. Verification of an HMAC TLV is determined by local configuration. If verification fails, an implementation of a Replication-SID MUST NOT originate an ICMPv6 Parameter Problem message with code 0. The failure SHOULD be logged (rate-limited) and the packet SHOULD be discarded.¶
2.2.2. OAM Operations
[RFC9259] specifies procedures for Operations, Administration, and Maintenance (OAM) like ping and traceroute on SRv6 SIDs.¶
Assuming the source node knows the Replication-SID a priori, it
is possible to ping a Replication-SID of a leaf or bud node directly by
putting it in the IPv6 DA without an SRH or in an
SRH as the last segment. While it is not possible to ping a
Replication-SID of a transit node because transit nodes do not
process Upper-Layer headers, it is still possible to ping a
Replication-SID of a leaf or bud node of a tree via the Replication-SID
of intermediate transit nodes. The source of the ping
MUST compute the ICMPv6 Echo Request checksum using
the Replication-SID of the leaf or bud node as the DA. The
source can then send the Echo Request packet to a transit node's
Replication
Traceroute to a leaf or bud node Replication-SID is not possible due to restrictions prohibiting the origination of the ICMPv6 Time Exceeded error message for a Replication-SID as described in Section 2.2.3.¶
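The checksum rule for pinging a leaf or bud node through a transit node can be sketched as follows. This is an illustrative example, not part of the specification: the function names are hypothetical, and the addresses used with it are expected to be placeholders such as those from the 2001:db8::/32 documentation prefix. The point it shows is that the IPv6 pseudo-header used for the ICMPv6 checksum carries the Replication-SID of the leaf or bud node as the Destination Address, regardless of which SID the packet is first sent to.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of `data`, padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def icmpv6_echo_request(src: str, final_da: str, ident: int, seq: int,
                        data: bytes = b"") -> bytes:
    """Build an ICMPv6 Echo Request whose checksum covers `final_da`.

    `final_da` is the Replication-SID of the leaf or bud node being
    pinged, not the SID of the transit node the packet is sent to first.
    """
    # Type 128 (Echo Request), code 0, checksum placeholder 0.
    body = struct.pack("!BBHHH", 128, 0, 0, ident, seq) + data
    # IPv6 pseudo-header: source, destination, upper-layer length,
    # three zero bytes, next header 58 (ICMPv6).
    pseudo = (ipaddress.IPv6Address(src).packed
              + ipaddress.IPv6Address(final_da).packed
              + struct.pack("!I3xB", len(body), 58))
    checksum = ~ones_complement_sum(pseudo + body) & 0xFFFF
    return body[:2] + struct.pack("!H", checksum) + body[4:]
```

A receiver validates the message by summing the same pseudo-header and the received message; a result of 0xFFFF indicates a correct checksum. If the source had used the transit node's SID in the pseudo-header instead, validation at the leaf or bud node would fail.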
2.2.3. ICMPv6 Error Messages
Section 2.4 of [RFC4443] states an ICMPv6 error message MUST NOT be originated as a result of receiving a packet destined to an IPv6 multicast address. This is to prevent a source node from being overwhelmed by a storm of ICMPv6 error messages resulting from replicated IPv6 packets. There are two exceptions:¶
An implementation of a Replication segment for SRv6 MUST enforce these same restrictions and exceptions.¶
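One way to picture this rule is as a filter applied before any ICMPv6 error is originated for a packet that matched a Replication-SID. The sketch below is illustrative and uses hypothetical names. It takes the two exceptions from Section 2.4 of [RFC4443]: the Packet Too Big message, and the Parameter Problem message with Code 2 that reports an unrecognized IPv6 option whose Option Type has its two highest-order bits set to 10.

```python
# ICMPv6 error message types and the code relevant to the exceptions.
PACKET_TOO_BIG = 2
PARAMETER_PROBLEM = 4
UNRECOGNIZED_IPV6_OPTION = 2   # Parameter Problem code 2


def may_originate_icmpv6_error(icmp_type: int, code: int,
                               option_type: int = 0) -> bool:
    """Decide whether an error may be sent for a Replication-SID packet.

    `option_type` is the 8-bit Option Type of the offending option and
    only matters for Parameter Problem code 2.
    """
    if icmp_type == PACKET_TOO_BIG:
        return True
    if icmp_type == PARAMETER_PROBLEM and code == UNRECOGNIZED_IPV6_OPTION:
        # Only options whose two highest-order bits are 10 qualify.
        return (option_type >> 6) == 0b10
    return False
```

Under this filter, a Time Exceeded message (type 3) and a Parameter Problem message with code 0 are both suppressed.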
3. IANA Considerations
IANA has assigned the following codepoint for End.Replicate behavior in the "SRv6 Endpoint Behaviors" registry in the "Segment Routing" registry group.¶
4. Security Considerations
The SID behaviors defined in this document are deployed within an SR domain [RFC8402]. An SR domain needs protection from outside attackers (as described in [RFC8754]). The following is a brief reminder of the same:¶
Failure to protect the SR-MPLS domain by correctly provisioning MPLS support per interface permits attackers from outside the domain to send packets that use the replication services provisioned within the domain.¶
Failure to protect the SRv6 domain with IACLs on external interfaces combined with failure to implement the recommendations of BCP 38 [RFC2827] or apply IACLs on nodes provisioning SIDs permits attackers from outside the SR domain to send packets that use the replication services provisioned within the domain.¶
Given the definition of the Replication segment in this document, an attacker subverting the ingress filters above cannot take advantage of a stack of Replication segments to perform amplification attacks nor link exhaustion attacks. Replication segment trees always terminate at a leaf or bud node resulting in a decapsulation. However, this does allow an attacker to inject traffic to the receivers within a P2MP service.¶
This document introduces an SR segment endpoint behavior that replicates and decapsulates an inner payload for both the MPLS and IPv6 data planes. Similar to any MPLS end-of-stack label, or SRv6 END.D* behavior, if the protections described above are not implemented, an attacker can perform an attack via the decapsulating segment (including the one described in this document).¶
Incorrect provisioning of Replication segments can result in a chain of Replication segments forming a loop. This can happen if Replication segments are provisioned on SR nodes without using a control plane. In this case, replicated packets can create a storm until MPLS TTL (for SR-MPLS) or IPv6 Hop Limit (for SRv6) decrements to zero. A control plane such as PCE can be used to prevent loops. The control plane protocols (like Path Computation Element Communication Protocol (PCEP), BGP, etc.) used to instantiate Replication segments can leverage their own security mechanisms such as encryption, authentication filtering, etc.¶
For SRv6, Section 2.2.3 describes an
exception for the ICMPv6 Parameter Problem message with Code 2. If an attacker sends a packet destined to a Replication-SID
with the source address of a node and with an extension header using the
unknown option type marked as mandatory, then a large number of ICMPv6
Parameter Problem messages can cause a denial
If an attacker can forge an IPv6 packet with:¶
then these nodes can cause a storm of ICMPv6 error packets to overwhelm the source node under attack. The IPv6 Hop Limit Threshold check described in Section 2.2 can help mitigate such attacks.¶
5. References
5.1. Normative References
- [RFC2119]
- Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
- [RFC4443]
- Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", STD 89, RFC 4443, DOI 10.17487/RFC4443, <https://www.rfc-editor.org/info/rfc4443>.
- [RFC8174]
- Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
- [RFC8402]
- Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, <https://www.rfc-editor.org/info/rfc8402>.
- [RFC8754]
- Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J., Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header (SRH)", RFC 8754, DOI 10.17487/RFC8754, <https://www.rfc-editor.org/info/rfc8754>.
- [RFC8986]
- Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer, D., Matsushima, S., and Z. Li, "Segment Routing over IPv6 (SRv6) Network Programming", RFC 8986, DOI 10.17487/RFC8986, <https://www.rfc-editor.org/info/rfc8986>.
- [RFC9259]
- Ali, Z., Filsfils, C., Matsushima, S., Voyer, D., and M. Chen, "Operations, Administration, and Maintenance (OAM) in Segment Routing over IPv6 (SRv6)", RFC 9259, DOI 10.17487/RFC9259, <https://www.rfc-editor.org/info/rfc9259>.
5.2. Informative References
- [P2MP-POLICY]
- Voyer, D., Ed., Filsfils, C., Parekh, R., Bidgoli, H., and Z. J. Zhang, "Segment Routing Point-to-Multipoint Policy", Work in Progress, Internet-Draft, draft-ietf-pim-sr-p2mp-policy-07, <https://datatracker.ietf.org/doc/html/draft-ietf-pim-sr-p2mp-policy-07>.
- [PGM-ILLUSTRATION]
- Filsfils, C., Camarillo, P., Ed., Li, Z., Matsushima, S., Decraene, B., Steinberg, D., Lebrun, D., Raszuk, R., and J. Leddy, "Illustrations for SRv6 Network Programming", Work in Progress, Internet-Draft, draft-filsfils-spring-srv6-net-pgm-illustration-04, <https://datatracker.ietf.org/doc/html/draft-filsfils-spring-srv6-net-pgm-illustration-04>.
- [RFC2827]
- Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827, <https://www.rfc-editor.org/info/rfc2827>.
- [RFC3704]
- Baker, F. and P. Savola, "Ingress Filtering for Multihomed Networks", BCP 84, RFC 3704, DOI 10.17487/RFC3704, <https://www.rfc-editor.org/info/rfc3704>.
- [RFC6513]
- Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, <https://www.rfc-editor.org/info/rfc6513>.
- [RFC7432]
- Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, <https://www.rfc-editor.org/info/rfc7432>.
- [RFC7988]
- Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress Replication Tunnels in Multicast VPN", RFC 7988, DOI 10.17487/RFC7988, <https://www.rfc-editor.org/info/rfc7988>.
- [RFC8660]
- Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing with the MPLS Data Plane", RFC 8660, DOI 10.17487/RFC8660, <https://www.rfc-editor.org/info/rfc8660>.
- [RFC9256]
- Filsfils, C., Talaulikar, K., Ed., Voyer, D., Bogdanov, A., and P. Mattes, "Segment Routing Policy Architecture", RFC 9256, DOI 10.17487/RFC9256, <https://www.rfc-editor.org/info/rfc9256>.
- [RFC9350]
- Psenak, P., Ed., Hegde, S., Filsfils, C., Talaulikar, K., and A. Gulko, "IGP Flexible Algorithm", RFC 9350, DOI 10.17487/RFC9350, <https://www.rfc-editor.org/info/rfc9350>.
- [SIDS-SRv6]
- Krishnan, S., "SRv6 Segment Identifiers in the IPv6 Addressing Architecture", Work in Progress, Internet-Draft, draft-ietf-6man-sids-06, <https://datatracker.ietf.org/doc/html/draft-ietf-6man-sids-06>.
Appendix A. Illustration of a Replication Segment
This section illustrates an example of a single Replication segment. Examples showing Replication segments stitched together to form a P2MP tree (based on SR P2MP policy) are in [P2MP-POLICY].¶
Consider the following topology:¶
A.1. SR-MPLS
In this example, the Node-SID of a node Rn is N-SIDn and the Adj-SID from node Rm to node Rn is A-SIDmn. The interface between Rm and Rn is Lmn. The state representation uses "R-SID->Lmn" to represent a packet replication with outgoing Replication-SID R-SID sent on interface Lmn.¶
Assume a Replication segment identified with R-ID at Replication node R1 and downstream nodes R2, R6, and R7. The Replication-SID at node n is R-SIDn. A packet replicated from R1 to R7 has to traverse R4.¶
The Replication segments at nodes R1, R2, R6, and R7 are shown below. Note nodes R3, R4, and R5 do not have a Replication segment.¶
Replication segment at R1:¶
Replication to R2 steers the packet directly to R2 on interface L12. Replication to R6, using N-SID6, steers the packet via the shortest path to that node. Replication to R7 is steered via R4, using N-SID4 and then adjacency SID A-SID47 to R7.¶
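The three branches described above can be modeled as label stacks. The following sketch is illustrative only. It assumes, as a modeling choice not stated in this paragraph, that the optional segment list is pushed on top of the downstream Replication-SID so that the steering labels are consumed before the replica reaches the downstream node. The type and function names are hypothetical, and intermediate forwarding behavior (for example, label popping along the path) is not modeled.

```python
from typing import Dict, List, NamedTuple, Optional


class Branch(NamedTuple):
    downstream: str             # downstream node of the branch
    replication_sid: str        # Replication-SID at the downstream node
    segment_list: List[str]     # optional steering segments, outermost first
    interface: Optional[str]    # outgoing interface when directly connected


# Replication state at R1, following the description in the text.
R1_STATE = [
    Branch("R2", "R-SID2", [], "L12"),                    # directly on L12
    Branch("R6", "R-SID6", ["N-SID6"], None),             # shortest path
    Branch("R7", "R-SID7", ["N-SID4", "A-SID47"], None),  # via R4
]


def pushed_labels(branch: Branch) -> List[str]:
    """Labels pushed on a replica, top of stack first (assumed order)."""
    return branch.segment_list + [branch.replication_sid]


def replicate(payload: str, state: List[Branch]) -> Dict[str, List[str]]:
    """Return the label stack plus payload for each downstream node."""
    return {b.downstream: pushed_labels(b) + [payload] for b in state}
```

For example, `replicate("payload", R1_STATE)["R7"]` yields the stack N-SID4, A-SID47, R-SID7 followed by the payload, which matches the path via R4 described above.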
Replication segment at R2:¶
Replication segment at R6:¶
Replication segment at R7:¶
When a packet is steered into the Replication segment at R1:¶
A.2. SRv6
For SRv6, we use the SID allocation scheme, reproduced below, from "Illustrations for SRv6 Network Programming" [PGM-ILLUSTRATION]:¶
Each node k has:¶
Assume a Replication segment identified with R-ID at Replication
node R1 and downstream nodes R2, R6, and R7. The Replication-SID at
node k, bound to an End.Replicate function, is
2001
The Replication segments at nodes R1, R2, R6, and R7 are shown below. Note nodes R3, R4, and R5 do not have a Replication segment. The state representation uses "R-SID->Lmn" to represent a packet replication with outgoing Replication-SID R-SID sent on interface Lmn. "SL" represents an optional segment list used to steer a replicated packet on a specific path to a downstream node.¶
Replication segment at R1:¶
Replication to R2 steers the packet directly to R2 on interface
L12. Replication to R6, using 2001
Replication segment at R2:¶
Replication segment at R6:¶
Replication segment at R7:¶
When a packet, (A,B2), is steered into the Replication segment at R1:¶
A.2.1. Pinging a Replication-SID
This section illustrates the ping of a Replication
Node R1 pings the Replication-SID of node R6 directly by sending the following packet:¶
Node R1 pings the Replication-SID of R7 via R4 by sending the following packet with the SRH:¶
Assume node R4 is a transit replication node with Replication-SID
2001
Acknowledgements
The authors would like to acknowledge Siva Sivabalan, Mike Koldychev, Vishnu Pavan Beeram, Alexander Vainshtein, Bruno Decraene, Thierry Couture, Joel Halpern, Ketan Talaulikar, Darren Dukes, and Jingrong Xie for their valuable input.¶