Internet Engineering Task Force Takumi Kimura
INTERNET-DRAFT NTT
Expires in: April 2004 Jerry Perser
Spirent
October 2003
Benchmarking Terminology for Protection Performance
<draft-kimura-protection-term-02.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document addresses common terminology and metrics for the
performance benchmarking of sub-IP layer protection technologies:
Automatic Protection Switching (APS) for SONET/SDH, Fast Reroute for
Multi-Protocol Label Switching (MPLS), and Resilient Packet Ring
(RPR) standardized by the IEEE. The benchmarks describe performance
in terms of its effects at the IP layer, to avoid dependence on a
specific sub-IP layer protection technology.
Table of Contents
1. Introduction .............................................. 2
2. Existing definitions ...................................... 3
Kimura & Perser Expires April 2004 [Page 1]
INTERNET-DRAFT Protection Performance Terminology October 2003
3. Term definitions .......................................... 3
3.1 Path
3.1.1 Path ............................................... 3
3.1.2 Working Path ....................................... 4
3.1.3 Ordinary Path ...................................... 4
3.1.4 Recovery Path ...................................... 5
3.1.5 Recovery Span ...................................... 5
3.2 Protection
3.2.1 Path Failure ....................................... 6
3.2.2 Failure Detection .................................. 6
3.2.3 Switch Over ........................................ 7
3.2.4 Protection Switching ............................... 7
3.2.5 Protection-Capable Node ............................ 8
3.2.6 Protection System .................................. 8
3.3 Reference Model for Protection Benchmarking
3.3.1 Pseudo-Failure Equipment ........................... 9
3.3.2 Trigger for Failure Protection ..................... 9
3.3.3 Reference Model for Protection Benchmarking ........ 10
3.4 Metrics
3.4.1 Errored Packet ..................................... 11
3.4.2 Lost Packet ........................................ 12
3.4.3 Sequence-Error Period .............................. 12
3.4.4 Loss Period ........................................ 13
3.4.5 Base Latency ....................................... 13
3.4.6 Additive Latency ................................... 14
3.4.7 Induced Latency .................................... 14
3.4.8 Unstable-latency Period ............................ 15
3.4.9 Recovery Time ...................................... 15
4. Security Considerations ................................... 16
5. Acknowledgements .......................................... 16
6. References ................................................ 17
7. Authors' Addresses ........................................ 17
8. Full Copyright Statement .................................. 17
1. Introduction
Reliability is needed in today's IP networks, because the Internet
has already become an important communication infrastructure, and
quality-sensitive applications are being used on it. Protection
technologies have been implemented in sub-IP layers to improve
IP-layer reliability: Automatic Protection Switching (APS) for
SONET/SDH, Fast Reroute for Multi-Protocol Label Switching (MPLS),
and Resilient Packet Ring (RPR), standardized by the IEEE. The
recovery time observed in the IP layer differs from that in sub-IP
layers because of the mechanism by which IP routers recognize
interfaces going up and down and because of the buffering effect of
IP routers. Protection performance
benchmarks and methodologies for testing them are required to allow
an objective comparison of implementations.
These benchmark definitions are based on the effects in the IP layer,
so that they can be developed independently of protection
technologies and used to compare different protection technologies.
2. Existing definitions
This document draws on existing terminology defined in other BMWG
work. Examples include, but are not limited to:
Latency [RFC 1242, section 3.8]
Frame Loss Rate [RFC 1242, section 3.6]
Throughput [RFC 1242, section 3.17]
Device Under Test (DUT) [RFC 2285, section 3.1.1]
System Under Test (SUT) [RFC 2285, section 3.1.2]
Out-of-sequence Packet [Ref.[4], section 3.3.1]
Out-of-order Packet [Ref.[4], section 3.3.2]
Duplicate Packet [Ref.[4], section 3.3.3]
This document adopts the definition format in Section 2 of RFC 1242.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.
3. Term definitions
3.1 Path
3.1.1 Path
Definition:
A sequence of nodes, <R1, ..., Rn>, with the following
properties:
- R1 is the ingress node and forwards IP packets, which are
entered into DUT/SUT, to R2 as sub-IP frames.
- Ri is a node which forwards data frames to R[i+1] for all i,
1<i<n, based on information in the sub-IP layer.
- Rn is the egress node and it passes sub-IP frames to its IP
layer for forwarding.
Discussion:
In this document, the term "path" means a sub-IP layer path,
unlike the IP path in RFC 2330. Examples are the SONET/SDH path
and the label-switched path for MPLS. A path may
be regarded as being equivalent to one IP link between two IP
nodes, i.e., R1 and Rn. The two IP nodes may have multiple
paths between them for redundancy. A packet will travel on only
one path between the nodes. Packets belonging to a microflow
(RFC 2474) will traverse one or more paths. The path is
unidirectional.
Measurement units:
n/a
Issues:
"A bidirectional path", which transmits traffic in both
directions along the same nodes, consists of two unidirectional
paths. Therefore, the two unidirectional paths belonging to
"one bidirectional path" will be treated independently when
benchmarking for "a bidirectional path".
See Also:
3.1.2 Working Path
Definition:
The current path that the DUT/SUT is using to forward packets.
Discussion:
An ordinary path (3.1.3) is a working path before failure
protection, while a recovery path (3.1.4) becomes a working path
after failure protection.
Measurement units:
n/a
Issues:
See Also:
Path (3.1.1)
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
3.1.3 Ordinary Path
Definition:
A path which is a working path before failure protection.
Discussion:
Measurement units:
n/a
Issues:
See Also:
Path (3.1.1)
Working Path (3.1.2)
Path Failure (3.2.1)
3.1.4 Recovery Path
Definition:
A path which is prepared for the eventuality of ordinary path
failure, and used to forward packets as a working path after
failure protection.
Discussion:
There are various types of recovery paths: a dedicated recovery
path (1+1), which has 100% redundancy for a specific ordinary
path, a shared recovery path (1:N), which is dedicated to the
protection of N specific ordinary paths, and an associated
shared recovery path (M:N), for which a specific set of M recovery
paths protects a specific set of N ordinary paths.
Measurement units:
n/a
Issues:
See Also:
Path (3.1.1)
Working Path (3.1.2)
Ordinary Path (3.1.3)
Path Failure (3.2.1)
3.1.5 Recovery Span
Definition:
A section of an ordinary path that includes a failed link or
node and is replaced by other link(s) and node(s) for protection.
Discussion:
There are two types of recovery spans: a full recovery span,
which is a recovery span prepared between the ingress and egress
nodes of DUT/SUT, and a partial recovery span, which is a
recovery span prepared for only parts of an ordinary path
between ingress and egress nodes of DUT/SUT. For a full
recovery span, the whole of an ordinary path is changed to a
recovery span for protection, and the ordinary and recovery
paths do not overlap. For a partial recovery span, only a part
of an ordinary path is changed to a recovery span for
protection, and parts of the ordinary and recovery paths may
overlap (as in ring restorations).
Measurement units:
n/a
Issues:
See Also:
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Path Failure (3.2.1)
3.2 Protection
3.2.1 Path Failure
Definition:
A condition that prevents packets from being forwarded on an
ordinary path as a working path, caused by fault(s) with link(s)
or node(s) in a sub-IP layer.
Discussion:
Measurement units:
n/a
Issues:
See Also:
Working Path (3.1.2)
Ordinary Path (3.1.3)
3.2.2 Failure Detection
Definition:
The operation of identifying working-path failure which is
caused by fault(s) with link(s) or node(s) in a sub-IP layer.
Discussion:
Measurement units:
n/a
Issues:
See Also:
Working Path (3.1.2)
Ordinary Path (3.1.3)
Path Failure (3.2.1)
3.2.3 Switch Over
Definition:
The operation that changes a working path from an ordinary path
to a recovery path.
Discussion:
For a partial recovery span, switch over replaces only part of
an ordinary path with other link(s) and node(s), not the entire
path. This operation can be initiated automatically or manually
in cases of failure.
Measurement units:
n/a
Issues:
See Also:
Working Path (3.1.2)
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Recovery Span (3.1.5)
Path Failure (3.2.1)
3.2.4 Protection Switching
Definition:
The operation of detecting working-path failure(s) and switching
over in response to the detected failure(s).
Discussion:
A protection-switching scheme includes both the mechanisms for
failure detection and switch over.
Measurement units:
n/a
Issues:
See Also:
Working Path (3.1.2)
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Recovery Span (3.1.5)
Path Failure (3.2.1)
Failure Detection (3.2.2)
Switch Over (3.2.3)
3.2.5 Protection-Capable Node
Definition:
A node that includes functional elements to perform protection
switching.
Discussion:
Both end nodes of a recovery span for an ordinary path must be
protection-capable nodes.
Measurement units:
n/a
Issues:
See Also:
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Recovery Span (3.1.5)
Protection Switching (3.2.4)
3.2.6 Protection System
Definition:
A system which consists of two or more protection-capable nodes
connected to each other by link(s) and node(s) constructing
ordinary paths and recovery paths.
Discussion:
When a working-path failure occurs, the system detects the
failure and switches the working path from the failed ordinary
path to the recovery path. Some technologies for this are in
sub-IP layers, e.g., MPLS-based recovery and SONET/SDH-based
recovery.
Measurement units:
n/a
Issues:
See Also:
Working Path (3.1.2)
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Recovery Span (3.1.5)
Path Failure (3.2.1)
Failure Detection (3.2.2)
Switch Over (3.2.3)
Protection Switching (3.2.4)
Protection-Capable Node (3.2.5)
3.3 Reference Model for Protection Benchmarking
3.3.1 Pseudo-Failure Equipment
Definition:
Equipment which emulates a path failure after receiving a
trigger-signal from test equipment.
Discussion:
Pseudo-failure equipment is used in benchmarking protection
systems, since it provides more reliable and reproducible
testing than actual path failure.
Measurement units:
n/a
Issues:
The delay between receiving a trigger signal and producing the
failure condition is a potential source of measurement error if
the trigger time is used as the start time of the metrics.
See Also:
Path Failure (3.2.1)
Trigger for Failure Protection (3.3.2)
3.3.2 Trigger for Failure Protection
Definition:
A signal which is sent from test equipment to make a piece of
pseudo-failure equipment create a pseudo-failure in an ordinary
path.
Discussion:
Measurement units:
n/a
Issues:
See Also:
Working Path (3.1.2)
Ordinary Path (3.1.3)
Path Failure (3.2.1)
Pseudo-Failure Equipment (3.3.1)
3.3.3 Reference Model for Protection Benchmarking
Definition:
A fundamental model that is used in benchmarking protection
systems. A System Under Test (SUT) consists of two protection-
capable nodes connected by both an ordinary path and a recovery
path. Pseudo-failure equipment is placed at a point along the
ordinary path. Test equipment is set outside the two nodes and
generates IP traffic. The test equipment also sends the
triggers for protection that cause the piece of pseudo-failure
equipment to simulate path failures.
                  +----------------+
+-----------------| Test Equipment |<-----------------+
|                 +----------------+                  |
|                          | Trigger                  |
|                          | for Protection           |
|             Ordinary     v                          |
|    +--------+ Path  +---------+       +--------+    |
|    |        |-------| Failure |------>|        |    |
+--->| Node 1 |       +---------+       | Node 2 |----+
     |        |- - - - - - - - - - - - >|        |
     +--------+      Recovery Path      +--------+
     |                                           |
     +-------------------------------------------+
                System Under Test (SUT)
                       Figure 1
Discussion:
A reference model for protection benchmarking is shown in Figure 1.
A SUT consists of two protection-capable nodes connected by both
an ordinary path and a recovery path. The ordinary path has
pseudo-failure equipment. Test equipment, which is placed
outside the two nodes, continuously sends IP packets that
include sequence numbers and time stamps to one of the nodes and
receives packets from the other node. After the test equipment
has sent a trigger for protection to the pseudo-failure
equipment, the system detects the failure and switches from the
failed ordinary path to the recovery path. The test equipment
records the sequence numbers and time stamps in the IP packets
as well as the packet-reception times, during the time it takes
protection switching to detect and finish responding to a
failure.
Measurement units:
n/a
Issues:
See Also:
Working Path (3.1.2)
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Path Failure (3.2.1)
Failure Detection (3.2.2)
Switch Over (3.2.3)
Protection Switching (3.2.4)
Protection-Capable Node (3.2.5)
Pseudo-Failure Equipment (3.3.1)
Trigger for Failure Protection (3.3.2)
3.4 Metrics
Performance metrics for protection benchmarking will include
Lost Packets (related to Frame Loss Rate in RFC 1242) including
Errored Packets, Out-of-order Packets (Ref.[4]), Duplicate
Packets (Ref.[4]), Induced Latency, and Recovery Time.
3.4.1 Errored Packet
Definition:
A received packet that fails at least one error detection scheme
in a sub-IP layer (e.g., FCS) or in the IP layer (IP checksum).
Discussion:
Packets may have these errors due to failure or protection
switching in a sub-IP layer. Such packets with one or more
errors are equivalent to lost packets in upper-layers, because
the errors are detected in IP or lower layers.
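As an illustration of the IP-layer check named in the definition,
the following minimal Python sketch verifies an IPv4 header checksum
with the usual one's-complement sum. The function name and the
sample header bytes are assumptions made for this example and are
not defined by this document. Sub-IP checks such as FCS are
technology-specific and are not shown.

```python
def ipv4_header_checksum_ok(header: bytes) -> bool:
    """Return True if the one's-complement sum of the header is 0xFFFF."""
    if len(header) < 20 or len(header) % 2:
        return False  # too short, or not a whole number of 16-bit words
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


# Hypothetical 20-byte sample header whose checksum field is valid.
good = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
# The same header with one bit flipped, standing in for corruption
# that might occur during a failure or protection switching.
bad = bytes([good[0], good[1] ^ 0x01]) + good[2:]
```

A packet for which this check fails would be counted as an errored
packet and, in turn, as a lost packet.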
Measurement units:
Packet count
Issues:
See Also:
Lost Packet (3.4.2)
3.4.2 Lost Packet
Definition:
A packet which either has one or more errors or is dropped from
the buffer in a DUT/SUT node.
Discussion:
The input traffic rate SHOULD be less than or equal to the
smaller of the two Throughputs (RFC 1242) for the paths before
and after protection switching. This metric is related to the
Frame Loss Rate defined in RFC 1242, but here the quantity of
interest is the number of packets lost during testing.
Measurement units:
Packet count
Issues:
Lost packets cannot be directly observed because they cannot be
received by test equipment.
See Also:
Throughput (RFC 1242)
Frame Loss Rate (RFC 1242)
Errored Packet (3.4.1)
3.4.3 Sequence-Error Period
Definition:
The time duration between the first time and the last time when
Out-of-sequence Packets (Ref.[4]) are observed at the egress of
the DUT/SUT during the entire test.
Discussion:
Observing out-of-sequence packets makes it possible to track
all lost packets (including errored packets), out-of-order
packets, and duplicate packets.
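A minimal Python sketch of how test equipment might derive this
period from its reception log follows. It assumes, for illustration
only, that a packet is out-of-sequence when its sequence number is
not the next one expected after the highest number seen so far; the
authoritative definition is in Ref.[4]. The function name and the
log format are hypothetical.

```python
from typing import Iterable, Tuple


def sequence_error_period(rx: Iterable[Tuple[int, float]]) -> float:
    """rx holds (sequence_number, reception_time_s) in arrival order."""
    expected = None          # next sequence number expected (assumed rule)
    first = last = None      # times of first/last out-of-sequence packet
    for seq, t in rx:
        if expected is not None and seq != expected:
            if first is None:
                first = t
            last = t
        # Track the highest sequence number seen so far.
        expected = seq + 1 if expected is None else max(expected, seq + 1)
    return 0.0 if first is None else last - first


# Hypothetical log: packet 3 is lost and packet 4 arrives late.
log = [(1, 0.000), (2, 0.001), (5, 0.004), (6, 0.005),
       (4, 0.0055), (7, 0.006)]
period = sequence_error_period(log)
```

In this log the first out-of-sequence packet is observed at 0.004 s
and the last at 0.0055 s.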
Measurement units:
Seconds
Issues:
See Also:
Out-of-sequence Packet (Ref.[4])
Errored Packet (3.4.1)
Lost Packet (3.4.2)
Out-of-order Packet (Ref.[4])
Duplicate Packet (Ref.[4])
3.4.4 Loss Period
Definition:
The time duration calculated as dt*(Ns-Nr) if Ns > Nr, or 0 if
Ns <= Nr, where dt is the constant inter-packet time at which the
test equipment sends packets, Ns is the number of packets sent
from the test equipment, and Nr is the number of packets received
by the test equipment.
Discussion:
Test packets do not need to carry sequence numbers to measure
this metric.
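The calculation can be sketched in Python as below. The sketch
assumes that the Loss Period is the number of lost packets (Ns - Nr)
multiplied by the constant inter-packet time dt, so that the result
is in seconds. The function name is hypothetical.

```python
def loss_period(dt: float, ns: int, nr: int) -> float:
    """Loss Period in seconds from send interval dt and packet counts."""
    lost = ns - nr
    # No loss (or more packets received than sent, e.g. duplicates)
    # yields a Loss Period of zero.
    return dt * lost if lost > 0 else 0.0


# Example: 1 ms send interval, 10000 packets sent, 9950 received.
example = loss_period(0.001, 10000, 9950)
```

With these illustrative numbers, 50 lost packets at a 1 ms interval
correspond to a Loss Period of about 50 ms.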
Measurement units:
Seconds
Issues:
See Also:
Lost Packet (3.4.2)
3.4.5 Base Latency
Definition:
Latency measured while there are no network changes: no path
failures, no route changes, and no traffic overload.
Discussion:
Base latencies before path failure and after protection
switching are the latencies in an ordinary path and in a
recovery path respectively. If a recovery path takes more hops
than an ordinary path, the base latency is increased by
protection switching. Base latency in the interval between path
failure and protection switching cannot be determined under the
above definition, because the working path changes during this
interval. For this interval, base latency is defined as the base
latency before path failure. Base latency therefore changes
during testing.
Measurement units:
Seconds
Issues:
See Also:
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Latency (RFC 1242)
3.4.6 Additive Latency
Definition:
The difference between the base latency of the recovery path and
the base latency of the ordinary path.
Discussion:
If a recovery path takes more hops than an ordinary path, the
latency is increased by protection switching.
Measurement units:
Seconds
Issues:
See Also:
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Latency (RFC 1242)
Base Latency (3.4.5)
3.4.7 Induced Latency
Definition:
The difference between the latency measured during testing and
the base latency.
Discussion:
This latency may be induced by buffering in nodes during
protection switching and it may vary with time.
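A minimal Python sketch of the per-packet calculation follows. It
assumes that each test packet carries its transmit timestamp, that
the sending and receiving sides of the test equipment share a clock,
and, as a simplification, that a single base latency value applies
to all samples. The function name and sample values are hypothetical.

```python
from typing import Iterable, List, Tuple


def induced_latencies(samples: Iterable[Tuple[float, float]],
                      base_latency: float) -> List[float]:
    """samples holds (tx_timestamp_s, rx_time_s) per received packet."""
    # Measured latency is rx - tx; induced latency is what exceeds
    # the base latency.
    return [(rx - tx) - base_latency for tx, rx in samples]


# Hypothetical samples with a 2 ms base latency: the second packet
# was buffered for an extra 0.5 ms.
values = induced_latencies([(0.000, 0.002), (0.001, 0.0035)], 0.002)
```

Because induced latency may vary with time, the result is a series
of values, one per received packet, and not a single number.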
Measurement units:
Seconds
Issues:
It is necessary to write a timestamp in every packet to measure
this metric.
See Also:
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Latency (RFC 1242)
Base Latency (3.4.5)
3.4.8 Unstable-latency Period
Definition:
The time duration between the first time and the last time when
test packets injected with a constant period dt are received
with a time interval which is not equal to dt at the end of
DUT/SUT during the entire test.
Discussion:
An observed inter-packet time T is regarded as equal to dt if
dt - s < T < dt + s, where s bounds the measurement error. The
test equipment measures the inter-packet times of the packets it
receives, because the value of induced latency itself is not
needed. The observation of packet intervals can indirectly track
induced latency.
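The tolerance test described above can be sketched in Python as
follows. The sketch assumes that an irregular interval is attributed
to the reception time of the later packet of the pair; this choice,
the function name, and the sample values are assumptions made for
illustration.

```python
from typing import Sequence


def unstable_latency_period(rx_times: Sequence[float],
                            dt: float, s: float) -> float:
    """Period between first and last irregular inter-packet interval."""
    first = last = None
    for prev, cur in zip(rx_times, rx_times[1:]):
        gap = cur - prev
        # An interval is regular only if dt - s < gap < dt + s.
        if not (dt - s < gap < dt + s):
            if first is None:
                first = cur
            last = cur
    return 0.0 if first is None else last - first


# Hypothetical reception times for packets sent every 1 ms.
times = [0.000, 0.001, 0.002, 0.0045, 0.0048, 0.0058, 0.0068]
period = unstable_latency_period(times, dt=0.001, s=0.0001)
```

Here the intervals ending at 0.0045 s and 0.0048 s fall outside the
tolerance, so the period spans those two reception times.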
Measurement units:
Seconds
Issues:
See Also:
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Latency (RFC 1242)
3.4.9 Recovery Time
Definition:
The time duration from the earlier of the start times of the
unstable-latency and sequence-error periods to the later of the
end times of these periods.
Discussion:
Recovery time could be the sum of failure detection time,
switch-over time, and the time taken for the system to be
stabilized. It ends when protection switching in response to
path failure has finished and stability is restored, so that
packets are forwarded normally, i.e., abnormally received
packets (lost, errored, out-of-order, and duplicated) are no
longer present, induced latency has decreased, and latency has
become stable.
The Loss Period may be used as an alternative metric for the
recovery time, although it may be less accurate. If Loss Period
is used as an alternative to the recovery time, it MUST be
referred to as "Recovery Time by Loss Period".
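A minimal Python sketch of combining the two periods into the
recovery time follows. Each period is represented as a (start, end)
pair of times in seconds, or None when that kind of event was never
observed; this representation and the function name are assumptions
made for illustration.

```python
from typing import Optional, Tuple

Period = Optional[Tuple[float, float]]


def recovery_time(unstable_latency: Period,
                  sequence_error: Period) -> float:
    """Span from the earlier start to the later end of the periods."""
    observed = [p for p in (unstable_latency, sequence_error)
                if p is not None]
    if not observed:
        return 0.0  # nothing abnormal was observed during the test
    start = min(p[0] for p in observed)
    end = max(p[1] for p in observed)
    return end - start


# Hypothetical periods: sequence errors start first, unstable
# latency ends last.
result = recovery_time((1.004, 1.060), (1.002, 1.050))
```

In this example the recovery time runs from 1.002 s to 1.060 s.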
Measurement units:
Seconds
Issues:
See Also:
Ordinary Path (3.1.3)
Recovery Path (3.1.4)
Failure Detection (3.2.2)
Switch Over (3.2.3)
Protection Switching (3.2.4)
Latency (RFC 1242)
Errored Packet (3.4.1)
Lost Packet (3.4.2)
Out-of-order Packet (Ref.[4])
Duplicate Packet (Ref.[4])
Induced Latency (3.4.7)
Sequence-Error Period (3.4.3)
Loss Period (3.4.4)
Unstable-latency Period (3.4.8)
4. Security Considerations
This document only addresses terminology for the performance
benchmarking of protection systems, and the information contained in
this document shall have no effect on the security of the Internet.
5. Acknowledgements
The editors gratefully acknowledge the contribution of Al Morton in
reviewing this document.
6. References
[1] Bradner, S., "The Internet Standards Process -- Revision 3",
RFC 2026, October 1996.
[2] Bradner, S., Editor, "Benchmarking Terminology for
Network Interconnection Devices", RFC 1242, July 1991.
[3] Mandeville, R., "Benchmarking Terminology for LAN
Switching Devices", RFC 2285, February 1998.
[4] Perser, J., et al., "Terminology for Benchmarking Network-layer
Traffic Control Mechanisms",
Internet Draft, Work in Progress,
draft-ietf-bmwg-dsmterm-07.txt, June 2003.
[5] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, March 1997.
[6] Paxson, V., et al., "Framework for IP Performance Metrics",
RFC 2330, May 1998.
7. Authors' Addresses
Takumi Kimura
NTT Service Integration Laboratories
3-9-11 Midori-cho
Musashino-shi, Tokyo 180-8585
Japan
Phone: +81 422 59 3026
EMail: takumi.kimura@lab.ntt.co.jp
Jerry Perser
Spirent Communications
26750 Agoura Road
Calabasas, CA 91302
USA
Phone: +1 818 676 2300
EMail: jerry.perser@spirentcom.com
8. Full Copyright Statement
Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and
furnished to others, and derivative works that comment on or
otherwise explain it or assist in its implementation may be
prepared, copied, published and distributed, in whole or in
part, without restriction of any kind, provided that the
above copyright notice and this paragraph are included on all
such copies and derivative works. However, this document
itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or
other Internet organizations, except as needed for the
purpose of developing Internet standards in which case the
procedures for copyrights defined in the Internet Standards
process must be followed, or as required to translate it into
languages other than English.
The limited permissions granted above are perpetual and will
not be revoked by the Internet Society or its successors or
assigns. This document and the information contained herein
is provided on an "AS IS" basis and THE INTERNET SOCIETY AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY
THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY
RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.