Internet Engineering Task Force                           Takumi Kimura
INTERNET-DRAFT                                            NTT
Expires in: October 2003                                  Jerry Perser
                                                          Spirent
                                                          April 2003


          Benchmarking Terminology for Protection Performance

                 <draft-kimura-protection-term-01.txt>


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


Abstract

   This document addresses common terminology and metrics for the
   performance benchmarking of sub-IP layer protection technologies:
   Automatic Protection Switching (APS) for SONET/SDH, Resilient Packet
   Ring (RPR) for Ethernet, and Fast Reroute for Multi-Protocol Label
   Switching (MPLS).  The benchmarks describe the performance based on
   the effects in the IP-layer, to avoid dependence on a specific sub-IP
   layer protection technology.


Table of Contents

    1. Introduction  ..............................................  2
    2. Existing definitions  ......................................  3



Kimura & Perser           Expires October 2003                  [Page 1]


INTERNET-DRAFT     Protection Performance Terminology         April 2003


    3. Term definitions  ..........................................  3
      3.1 Path
        3.1.1 Path  ...............................................  3
        3.1.2 Working Path  .......................................  4
        3.1.3 Ordinary Path  ......................................  4
        3.1.4 Recovery Path  ......................................  5
        3.1.5 Recovery Span  ......................................  5
      3.2 Protection
        3.2.1 Path Failure  .......................................  6
        3.2.2 Failure Detection  ..................................  6
        3.2.3 Switch-Over  ........................................  7
        3.2.4 Protection Switching  ...............................  7
        3.2.5 Protection-Capable Node  ............................  8
        3.2.6 Protection System  ..................................  8
      3.3 Reference Model for Protection Benchmarking
        3.3.1 Pseudo-Failure Equipment  ...........................  9
        3.3.2 Trigger for Protection  .............................  9
        3.3.3 Reference Model for Protection Benchmarking  ........ 10
      3.4 Metrics
        3.4.1 Lost Packet  ........................................ 11
        3.4.2 Errored Packet  ..................................... 12
        3.4.3 Additive Latency  ................................... 12
        3.4.4 Induced Latency  .................................... 12
        3.4.5 Recovery Time  ...................................... 13
    4. Security Considerations  ................................... 13
    5. References  ................................................ 14
    6. Authors' Addresses  ........................................ 14


1. Introduction

   Reliability is needed in today's IP networks, because the Internet
   has become an important communication infrastructure and
   quality-sensitive applications are being used on it.  To improve
   IP-layer reliability, protection technologies have been implemented
   in sub-IP layers: Automatic Protection Switching (APS) for
   SONET/SDH, Resilient Packet Ring (RPR) for Ethernet, and Fast
   Reroute for Multi-Protocol Label Switching (MPLS).  Recovery time
   in the IP layer differs from that in sub-IP layers because of the
   mechanism by which IP routers recognize that interfaces have gone
   up or down and because of the buffering effect of IP routers.
   Protection performance specifications and methodologies for testing
   them are required to allow an objective comparison of
   implementations.

   Performance metrics are based on the effects in the IP layer, so that
   they can be defined independently of any protection technology and
   used to compare different protection technologies.




2.  Existing definitions

   This document draws on existing terminology defined in other BMWG
   work.  Examples include, but are not limited to:

        Latency                   [RFC 1242, section 3.8]
        Frame Loss Rate           [RFC 1242, section 3.6]
        Throughput                [RFC 1242, section 3.17]
        Device Under Test (DUT)   [RFC 2285, section 3.1.1]
        System Under Test (SUT)   [RFC 2285, section 3.1.2]
        Out-of-order Packet       [Ref.[4], section 3.3.2]
        Duplicate Packet          [Ref.[4], section 3.3.3]

   This document adopts the definition format in Section 2 of RFC 1242.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.


3. Term definitions

3.1 Path

3.1.1 Path

    Definition:
        A sequence of nodes, <R1, ..., Rn>, with the following
        properties:
        - R1 is the ingress node; it forwards IP packets that are input
        into the DUT/SUT to R2 as sub-IP frames.
        - Ri is a node that forwards data frames to R[i+1] for all i,
        1<i<n, based on information in the sub-IP layer.
        - Rn is the egress node; it outputs sub-IP frames from the
        DUT/SUT as IP packets.

    Discussion:
        The path is defined in the sub-IP layer in this document, unlike
        an IP path in RFC 2330.  Examples are the SONET/SDH path, the
        label switched path for MPLS, and the optical path.  One path
        may be regarded as being equivalent to one IP link between two
        IP nodes, i.e., R1 and Rn.  The two IP nodes may have multiple
        paths for protection.  A packet will travel on only one path
        between the nodes.  Packets belonging to a microflow (RFC 2474)
        will traverse one or more paths.  The path is unidirectional.

    Measurement units:
        n/a

    Issues:
        A "bidirectional path", which transmits traffic in both
        directions along the same nodes, consists of two unidirectional
        paths.  Therefore, the two unidirectional paths belonging to
        one bidirectional path will be treated independently when
        benchmarking a bidirectional path.

    See Also:


3.1.2 Working Path

    Definition:
        A path that the DUT/SUT is using to forward packets.

    Discussion:
        An ordinary path (3.1.3) is a working path before protection,
        while a recovery path (3.1.4) becomes a working path after
        protection.

    Measurement units:
        n/a

    Issues:

    See Also:
        Path (3.1.1)
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)


3.1.3 Ordinary Path

    Definition:
        A path which is a working path before protection.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:
        Path (3.1.1)
        Working Path (3.1.2)
        Path Failure (3.2.1)




3.1.4 Recovery Path

    Definition:
        A path which is prepared against the eventuality of path
        failure, and used to forward packets as a working path.

    Discussion:
        There are various types of recovery paths: a dedicated recovery
        path (1+1), which provides 100% redundancy for a specific
        ordinary path; a shared recovery path (1:N), which is dedicated
        to the protection of more than one specific ordinary path; and
        an associated shared recovery path (M:N), in which a specific
        set of recovery paths protects a specific set of more than one
        ordinary path.

    Measurement units:
        n/a

    Issues:

    See Also:
        Path (3.1.1)
        Working Path (3.1.2)
        Ordinary Path (3.1.3)
        Path Failure (3.2.1)


3.1.5 Recovery Span

    Definition:
        A fraction of an ordinary path that includes a failed link or
        node and is replaced by other link(s) and node(s) for
        protection.

    Discussion:
        There are two types of recovery spans: a full recovery span,
        which is prepared between the ingress and egress nodes of the
        DUT/SUT, and a partial recovery span, which is prepared for only
        part of an ordinary path between the ingress and egress nodes of
        the DUT/SUT.  For a full recovery span, the whole of the
        ordinary path is replaced for protection, and the ordinary and
        recovery paths do not overlap.  For a partial recovery span,
        only part of the ordinary path is replaced for protection, and
        parts of the ordinary and recovery paths may overlap.

    Measurement units:
        n/a

    Issues:

    See Also:
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Path Failure (3.2.1)
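
        As an illustration only, a recovery span can be derived from the
        node sequences of an ordinary path and a recovery path.  The
        sketch below assumes that both paths share the same ingress and
        egress nodes and differ in one contiguous stretch.  The function
        name and the node labels are hypothetical and are not defined by
        this document.

```python
def recovery_span(ordinary, recovery):
    """Return the part of `ordinary` that `recovery` bypasses.

    The result includes the two end nodes that both paths share, since
    those are the nodes that must be protection-capable.
    """
    shortest = min(len(ordinary), len(recovery))

    # Length of the common prefix of the two node sequences.
    head = 0
    while head < shortest and ordinary[head] == recovery[head]:
        head += 1

    # Length of the common suffix, not overlapping the prefix.
    tail = 0
    while (tail < shortest - head
           and ordinary[-1 - tail] == recovery[-1 - tail]):
        tail += 1

    start = max(head - 1, 0)
    end = len(ordinary) - max(tail - 1, 0)
    return ordinary[start:end]


# Partial recovery span: only the stretch B-C-D is bypassed.
ordinary_path = ["A", "B", "C", "D", "E"]
partial = recovery_span(ordinary_path, ["A", "B", "X", "Y", "D", "E"])
print(partial)                      # ['B', 'C', 'D']

# Full recovery span: the span equals the whole ordinary path.
full = recovery_span(["A", "B", "C"], ["A", "X", "C"])
print(full == ["A", "B", "C"])      # True
```

        In this sketch a span equal to the whole ordinary path
        corresponds to a full recovery span; any shorter span
        corresponds to a partial recovery span.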


3.2 Protection

3.2.1 Path Failure

    Definition:
        A condition that prevents packets from being forwarded on an
        ordinary path as a working path, caused by fault(s) with link(s)
        or node(s) in a sub-IP layer.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:
        Working Path (3.1.2)
        Ordinary Path (3.1.3)


3.2.2 Failure Detection

    Definition:
        The act of detecting a working-path failure that is caused by
        fault(s) with link(s) or node(s) in a sub-IP layer.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:
        Working Path (3.1.2)
        Ordinary Path (3.1.3)
        Path Failure (3.2.1)





3.2.3 Switch-Over

    Definition:
        To change a working path, in cases of failure, from an ordinary
        path to a recovery path.

    Discussion:
        For a partial recovery span, switch-over replaces only part of
        the ordinary path with other link(s) and node(s), not the entire
        path.

    Measurement units:
        n/a

    Issues:

    See Also:
        Working Path (3.1.2)
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Recovery Span (3.1.5)
        Path Failure (3.2.1)


3.2.4 Protection Switching

    Definition:
        The detection of working-path failures and response to these by
        switching the working path from the ordinary to the recovery
        path.

    Discussion:
        A protection-switching scheme includes both the mechanisms for
        failure detection and switch-over.

    Measurement units:
        n/a

    Issues:

    See Also:
        Working Path (3.1.2)
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Recovery Span (3.1.5)
        Path Failure (3.2.1)
        Failure Detection (3.2.2)
        Switch-Over (3.2.3)




3.2.5 Protection-Capable Node

    Definition:
        A node that includes functional elements to handle protection
        switching.

    Discussion:
        Both end nodes of a recovery span for an ordinary path must be
        protection-capable nodes.

    Measurement units:
        n/a

    Issues:

    See Also:
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Recovery Span (3.1.5)
        Protection Switching (3.2.4)


3.2.6 Protection System

    Definition:
        A system which consists of two or more protection-capable nodes
        connected to each other by link(s) and node(s) constructing
        ordinary paths and recovery paths.

    Discussion:
        When a working-path failure occurs, the system detects the
        failure and switches the working path from the failed ordinary
        path to the recovery path.  Some technologies for this operate
        in sub-IP layers, e.g., MPLS-based recovery, SONET/SDH-based
        recovery, and optical path recovery.

    Measurement units:
        n/a

    Issues:

    See Also:
        Working Path (3.1.2)
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Recovery Span (3.1.5)
        Path Failure (3.2.1)
        Failure Detection (3.2.2)
        Switch-Over (3.2.3)
        Protection Switching (3.2.4)
        Protection-Capable Node (3.2.5)


3.3 Reference Model for Protection Benchmarking

3.3.1 Pseudo-Failure Equipment

    Definition:
        Equipment which creates a pseudo path failure after receiving a
        signal from a tester.

    Discussion:
        Pseudo-failure equipment is used in benchmarking protection
        systems, since it provides more reliable and reproducible
        testing than an actual path failure.

    Measurement units:
        n/a

    Issues:

    See Also:
        Path Failure (3.2.1)
        Trigger for Protection (3.3.2)


3.3.2 Trigger for Protection

    Definition:
        A signal which is sent from a tester to make a piece of pseudo-
        failure equipment create a pseudo failure in an ordinary path.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:
        Working Path (3.1.2)
        Ordinary Path (3.1.3)
        Path Failure (3.2.1)
        Pseudo-Failure Equipment (3.3.1)





3.3.3 Reference Model for Protection Benchmarking

    Definition:
        A fundamental model that is used in benchmarking protection
        systems.  A System Under Test (SUT) consists of two
        protection-capable nodes connected by both an ordinary path and
        a recovery path.  Pseudo-failure equipment is placed at a point
        along the ordinary path.  A tester is placed outside the two
        nodes and generates IP traffic.  The tester also sends the
        triggers for protection that cause the pseudo-failure equipment
        to simulate path failures.


                                 +-----------+
            +--------------------|  Tester   |<-------------------+
            |                    +-----------+                    |
            |                          | Trigger                  |
            |                          |   for Protection         |
            |              Ordinary    v                          |
            |    +--------+  Path +---------+       +--------+    |
            |    |        |-------| Failure |------>|        |    |
            +--->| Node 1 |       +---------+       | Node 2 |----+
                 |        |- - - - - - - - - - - - >|        |
                 +--------+      Recovery Path      +--------+

                 |                                           |
                 +-------------------------------------------+
                              System Under Test (SUT)

                                    Figure 1


    Discussion:
        Figure 1 shows a reference model for protection benchmarking.  A
        SUT consists of two protection-capable nodes connected by both
        an ordinary path and a recovery path.  The ordinary path has
        pseudo-failure equipment.  A tester, which is placed outside the
        two nodes, continuously sends IP packets that include sequence
        numbers and time stamps to one of the nodes and receives packets
        from the other node.  After the tester has sent a trigger for
        protection to the pseudo-failure equipment, the system detects
        the failure and switches from the failed ordinary path to the
        recovery path.  The tester records the sequence numbers and time
        stamps in the IP packets as well as the packet-reception times,
        during the time it takes protection switching to detect and
        finish responding to a failure.

    Measurement units:
        n/a

    Issues:

    See Also:
        Working Path (3.1.2)
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Path Failure (3.2.1)
        Failure Detection (3.2.2)
        Switch-Over (3.2.3)
        Protection Switching (3.2.4)
        Protection-Capable Node (3.2.5)
        Pseudo-Failure Equipment (3.3.1)
        Trigger for Protection (3.3.2)
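
        As an illustration only, the sequence number and time stamp
        that the tester places in each IP packet could be carried in a
        small fixed-size payload.  The layout below (a 32-bit sequence
        number followed by a 64-bit transmit time in nanoseconds, both
        in network byte order) and all names are assumptions made for
        this sketch; this document does not define a payload format.

```python
import struct

# Hypothetical test payload: 32-bit sequence number, 64-bit time stamp (ns).
TEST_PAYLOAD = struct.Struct("!IQ")


def build_payload(seq, tx_time_ns):
    """Encode the fields the tester writes into each outgoing packet."""
    return TEST_PAYLOAD.pack(seq, tx_time_ns)


def parse_payload(data):
    """Decode (seq, tx_time_ns) from the start of a received payload."""
    return TEST_PAYLOAD.unpack_from(data)


payload = build_payload(7, 1_500_000_000)
print(len(payload))             # 12
print(parse_payload(payload))   # (7, 1500000000)
```

        On reception, the tester would log the decoded fields together
        with its own packet-reception time for later analysis.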


3.4 Metrics

        Performance metrics for protection benchmarking will include
        Lost Packet (related to Frame Loss Rate in RFC 1242), Errored
        Packet, Induced Latency, Out-of-order Packet (Ref.[4]),
        Duplicate Packet (Ref.[4]), and Recovery Time.


3.4.1 Lost Packet

    Definition:
        A packet that is lost during the time it takes protection
        switching to detect and finish responding to a failure.

    Discussion:
        The input traffic rate SHOULD be less than or equal to the
        smaller of the two Throughputs (RFC 1242) of the paths before
        and after protection switching.  This metric is related to the
        Frame Loss Rate defined in RFC 1242, but here the quantity of
        interest is the number of packets lost during testing.

    Measurement units:
        Packet count

    Issues:

    See Also:
        Protection Switching (3.2.4)
        Throughput (RFC 1242)
        Frame Loss Rate (RFC 1242)
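
        As an illustration only, the lost packet count can be obtained
        by comparing the sequence numbers that the tester sent with
        those it received.  The function name and the sample data are
        hypothetical.

```python
def count_lost_packets(sent_seq, received_seq):
    """Return how many sent sequence numbers never arrived.

    Duplicates in received_seq are ignored, because a duplicate packet
    is not a lost packet.
    """
    return len(set(sent_seq) - set(received_seq))


# Hypothetical trial: 10 packets sent, packets 4-6 dropped during
# protection switching, packet 7 delivered twice.
sent = range(10)
received = [0, 1, 2, 3, 7, 7, 8, 9]
print(count_lost_packets(sent, received))   # 3
```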




3.4.2 Errored Packet

    Definition:
        A received packet that fails at least one error detection
        scheme.

    Discussion:
        The error detection scheme can be sub-IP (FCS), IP (IP
        checksum), or other layers (TCP checksum).

    Measurement units:
        Packet count

    Issues:

    See Also:
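
        As an illustration of one of the error detection schemes named
        above, the sketch below verifies an IPv4 header checksum using
        the 16-bit one's-complement sum.  The function name is
        hypothetical, and the sample header is an arbitrary 20-byte
        IPv4 header chosen for this example.

```python
import struct


def ipv4_header_checksum_ok(header):
    """Return True if the IPv4 header checksum verifies.

    The 16-bit one's-complement sum over the whole header, including
    the checksum field itself, must equal 0xFFFF.
    """
    if len(header) < 20 or len(header) % 2:
        return False
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    total = sum(words)
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


good = bytes.fromhex("4500 0073 0000 4000 4011 b861 c0a8 0001 c0a8 00c7")
bad = bytearray(good)
bad[8] ^= 0x01                              # flip one bit in the TTL field

print(ipv4_header_checksum_ok(good))        # True
print(ipv4_header_checksum_ok(bytes(bad)))  # False
```

        A received packet whose header fails this check would be
        counted as an errored packet.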


3.4.3 Additive Latency

    Definition:
        The difference between the recovery-path latency and the
        ordinary-path latency.

    Discussion:
        If a recovery path has more hops than the ordinary path,
        protection switching increases the latency; this increase is the
        additive latency.

    Measurement units:
        Seconds

    Issues:

    See Also:
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Latency (RFC 1242)
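
        As an illustration only, additive latency is a simple
        difference of two latency measurements.  The function name and
        the sample values are hypothetical.

```python
def additive_latency(ordinary_latency_s, recovery_latency_s):
    """Recovery-path latency minus ordinary-path latency, in seconds."""
    return recovery_latency_s - ordinary_latency_s


# Hypothetical steady-state latencies: 2.0 ms before protection
# switching and 3.5 ms after it, giving roughly 1.5 ms of additive
# latency.
print(additive_latency(0.0020, 0.0035))
```

        The result is negative when the recovery path happens to have
        lower latency than the ordinary path.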


3.4.4 Induced Latency

    Definition:
        The difference between the maximum latency measured during
        testing and the larger of the ordinary-path and recovery-path
        latencies.

    Discussion:
        This latency may be induced by buffering in nodes during
        protection switching.

    Measurement units:
        Seconds

    Issues:

    See Also:
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Latency (RFC 1242)
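
        As an illustration only, induced latency can be computed from
        the maximum latency observed during the test and the two
        steady-state path latencies.  The function name and the sample
        values are hypothetical.

```python
def induced_latency(max_measured_s, ordinary_latency_s, recovery_latency_s):
    """Maximum measured latency minus the larger path latency, in seconds."""
    return max_measured_s - max(ordinary_latency_s, recovery_latency_s)


# Hypothetical trial: the worst packet took 10 ms, while the ordinary
# and recovery paths normally take 2 ms and 3 ms, giving roughly 7 ms
# of induced latency.
print(induced_latency(0.010, 0.002, 0.003))
```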


3.4.5 Recovery Time

    Definition:
        The time taken for protection switching in response to a path
        failure to finish and for stability to be restored so that
        packets are forwarded normally, i.e., abnormally received
        packets (lost, errored, out-of-order, and duplicated) are no
        longer present, induced latency has subsided, and latency has
        become stable.

    Discussion:
        Recovery time may be the sum of failure detection time, switch-
        over time and the time taken for the system to be stabilized.

    Measurement units:
        Seconds

    Issues:

    See Also:
        Ordinary Path (3.1.3)
        Recovery Path (3.1.4)
        Failure Detection (3.2.2)
        Switch-Over (3.2.3)
        Protection Switching (3.2.4)
        Latency (RFC 1242)
        Lost Packet (3.4.1)
        Errored Packet (3.4.2)
        Out-of-order Packet (Ref.[4])
        Duplicate Packet (Ref.[4])
        Induced Latency (3.4.4)
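
        As an illustration only, recovery time could be estimated from
        the tester's log of received packets.  The sketch below makes
        several assumptions that this document does not state: the
        clock starts when the trigger for protection is sent, a packet
        is treated as abnormal if it is errored, arrives out of
        sequence (which also reveals loss or duplication), or has a
        latency that differs from the final latency by more than a
        chosen tolerance, and the log ends after the system has
        stabilized.  All names are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Record(NamedTuple):
    """One received test packet as logged by a hypothetical tester."""
    seq: int               # sequence number carried in the packet
    tx_time: float         # time stamp written at transmission, seconds
    rx_time: float         # packet-reception time, seconds
    errored: bool = False  # True if an error detection scheme failed


def recovery_time(records: Sequence[Record], trigger_time: float,
                  latency_tolerance: float = 1e-4) -> Optional[float]:
    """Time from the trigger to the last abnormal reception, in seconds.

    records must be sorted by rx_time.  Returns None for an empty log
    and 0.0 if nothing abnormal was seen after the trigger.
    """
    if not records:
        return None
    final_latency = records[-1].rx_time - records[-1].tx_time
    highest_seq = None
    last_abnormal_rx = trigger_time
    for rec in records:
        in_sequence = highest_seq is None or rec.seq == highest_seq + 1
        latency = rec.rx_time - rec.tx_time
        stable = abs(latency - final_latency) <= latency_tolerance
        abnormal = rec.errored or not in_sequence or not stable
        if rec.rx_time >= trigger_time and abnormal:
            last_abnormal_rx = rec.rx_time
        if highest_seq is None or rec.seq > highest_seq:
            highest_seq = rec.seq
    return last_abnormal_rx - trigger_time


# Hypothetical log: packets 5-7 are lost, packets 8 and 9 are delayed
# by buffering, and latency settles at 3 ms from packet 10 onward.
log = [
    Record(0, 0.00, 0.002), Record(1, 0.01, 0.012), Record(2, 0.02, 0.022),
    Record(3, 0.03, 0.032), Record(4, 0.04, 0.042),
    Record(8, 0.08, 0.090), Record(9, 0.09, 0.096),
    Record(10, 0.10, 0.103), Record(11, 0.11, 0.113),
]
estimate = recovery_time(log, trigger_time=0.045)
print(round(estimate, 6))   # 0.051
```

        If the last logged packet is itself abnormal, the system had
        not stabilized when the log ended and the estimate should not
        be reported as a recovery time.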


4. Security Considerations

   This document only addresses terminology for the performance
   benchmarking of protection systems, and the information contained in
   this document has no effect on the security of the Internet.


5. References

   [1]  Bradner, S., "The Internet Standards Process -- Revision 3",
        RFC 2026, October 1996.

   [2]  Bradner, S., Editor, "Benchmarking Terminology for
        Network Interconnection Devices", RFC 1242, July 1991.

   [3]  Mandeville, R., "Benchmarking Terminology for LAN
        Switching Devices", RFC 2285, February 1998.

   [4]  Perser, J., et al., "Terminology for Benchmarking Network-layer
        Traffic Control Mechanisms",
        Internet Draft, Work in Progress,
        draft-ietf-bmwg-dsmterm-05.txt, February 2003.

   [5]  Bradner, S., "Key words for use in RFCs to Indicate
        Requirement Levels", RFC 2119, March 1997.

   [6]  Paxson, V., et al., "Framework for IP Performance Metrics",
        RFC 2330, May 1998.


6. Authors' Addresses

   Takumi Kimura
   NTT Service Integration Laboratories
   3-9-11 Midori-cho
   Musashino-shi, Tokyo 180-8585
   Japan
   Phone: +81 422 59 3026
   EMail: takumi.kimura@lab.ntt.co.jp

   Jerry Perser
   Spirent Communications
   26750 Agoura Road
   Calabasas, CA 91302
   USA
   Phone: +1 818 676 2300
   EMail: jerry.perser@spirentcom.com







