Reading Tom Hollingsworth’s post Network Consumer Reports got me thinking about the time my boss came to me and said, “The Vendor says we need to use router X to support customers at our 100 Mbit/s sites, but I think we can save some money and use router Y instead. What do you think?” Well, who am I to argue with The Vendor? Nobody really, but I like a good argument nonetheless…
There are seven really good questions from Juniper that you should ask yourself and The Vendor when it comes to latency testing, and they apply to most switched Ethernet networks. The next time latency figures are offered, be sure to ask the following questions:
1. Is the number switch latency or packet latency?
2. What size packets were used?
3. What latency methodology was used (that is, FIFO, LIFO, or LILO)?
4. On how many ports was this measured?
5. What was the testing topology (that is, port pair, partial mesh, or full mesh)?
6. Under what load was this tested (that is, 10 percent, 25 percent, 50 percent, or 100 percent)?
7. Is this the minimum, average, or maximum latency number?
Juniper has written a white paper based on the seven questions above. A few key points to take from the white paper, Understanding How to Measure Latency on Ethernet Switches:
- You need to understand the equipment that is under test and the way the test is being conducted. “For a latency test case, the tester has the option of choosing the latency measurement methodology—FIFO, LIFO, or LILO. RFC 1242 defines a store-and-forward switch—which should be measured as LIFO—whereas for a cut-through switch, latency should be measured as FIFO.” The white paper defines these methods in more detail, and there is a short sketch after this list.
- The white paper also notes that you need to make sure you are comparing “apples-to-apples” during the testing. “In theory, the RFC 1242 testing methodology is correct, but it can be misleading when comparing store-and-forward switches to cut-through switches. In fact, a more accurate way of measuring latency is FIFO. At the end of the day, the switch is a black box to the customer that is just forwarding traffic. From a latency perspective, what customers really care about is the time it takes for a packet to be switched across the device.”
- The test setup is key to understanding the test results as well: “There are other elements that can affect overall switching latency such as the number of connected ports, traffic patterns, and traffic load. As stated earlier, RFC 2544 provides guidelines about how switches should be tested, but traffic testers do not enforce these guidelines. There are many permutations as to how tests can be set up. Traffic flow can be unidirectional or bidirectional; and traffic patterns can be between port pairs, partial mesh, or full mesh. The more complex the testing methodology, the more likely switching latency increases.”
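To make the FIFO/LIFO/LILO distinction concrete, here is a minimal Python sketch of my own (not from the white paper). It computes all three measurements from the four timestamps a tester records for a frame, using made-up numbers for a store-and-forward switch:

```python
# Latency from the four timestamps a traffic tester records for one frame.
# fi = first bit in, li = last bit in, fo = first bit out, lo = last bit out.

def latencies(fi, li, fo, lo):
    """Return the three common latency measurements, in seconds."""
    return {
        "FIFO": fo - fi,  # first in, first out: RFC 1242 cut-through latency
        "LIFO": fo - li,  # last in, first out: RFC 1242 store-and-forward latency
        "LILO": lo - li,  # last in, last out
    }

# Hypothetical store-and-forward switch, 1518-byte frame on 10 Gb/s links.
# Serialization time = 1518 * 8 bits / 10e9 bit/s, about 1.2144 microseconds.
SER = 1518 * 8 / 10e9

fi = 0.0
li = fi + SER            # frame must be fully received before forwarding
fo = li + 0.8e-6         # assume 0.8 us internal forwarding delay
lo = fo + SER            # frame fully retransmitted

for method, seconds in latencies(fi, li, fo, lo).items():
    print(f"{method}: {seconds * 1e6:.3f} us")
# FIFO: 2.014 us   (includes the ingress serialization time)
# LIFO: 0.800 us   (internal forwarding delay only)
# LILO: 2.014 us
```

Note how LIFO hides the ingress serialization delay of the store-and-forward switch while FIFO charges it to the device; that is exactly the apples-to-apples problem the white paper is warning about.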
Moving on to the RFCs that are available for further detail on benchmarking: the original benchmarking RFC 2544 is a few years old but still applies when it comes to performance testing. It gives you a framework for the kinds of tests and results you should be looking for when evaluating gear and connections.
“Vendors often engage in “specsmanship” in an attempt to give their products a better position in the marketplace. This often involves “smoke & mirrors” to confuse the potential users of the products.” – RFC 2544, Benchmarking Methodology for Network Interconnect Devices. A prerequisite to reading RFC 2544 is “Benchmarking Terminology for Network Interconnect Devices” (RFC 1242), which defines much of the terminology referenced in RFC 2544.
“Basic connectivity testing is generally sufficient for best-effort service such as residential Internet access which has no implicit performance guarantees. For corporate customers who require services with specific performance objectives, it is common to employ the RFC 2544 tests.” “RFC 2544, published in 1999 by the IETF, defines the “Benchmarking Methodology for Network Interconnect Devices”. It was originally designed to allow the standardized testing and benchmarking of a single interconnect device such as a router or a switch (known as the DUT or Device Under Test). This methodology has become the de facto standard performed routinely in QA labs and verification labs in order to quantify the performance of network devices.” These quotes are from an informative video and white paper from Sunrise Telecom. I have never used their product; the links are included only as another resource. The video is an overview of RFC 2544 and covers some aspects of the benchmarking tests, including the fact that The Vendor “SHOULD” do tests in a certain fashion but it is not a “MUST”. You should be aware of this when The Vendor sells you a product, or when you buy or sell a service and agree to an SLA (Service Level Agreement) that defines how your equipment and circuit should perform.
RFC 2544 provides a set of tests as well as test conditions that should be included during testing:
26. Benchmarking tests:
26.1 Throughput
Objective: To determine the DUT throughput as defined in RFC 1242.
26.2 Latency
Objective: To determine the latency as defined in RFC 1242.
26.3 Frame loss rate
Objective: To determine the frame loss rate, as defined in RFC 1242, of a DUT throughout the entire range of input data rates and frame sizes.
26.4 Back-to-back frames
Objective: To characterize the ability of a DUT to process back-to-back frames as defined in RFC 1242.
26.5 System recovery
Objective: To characterize the speed at which a DUT recovers from an overload condition.
26.6 Reset
Objective: To characterize the speed at which a DUT recovers from a device or software reset.
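As a feel for how the first of those tests is actually run: RFC 2544 says to send frames at a given rate, and if any are lost, reduce the rate and retry; the throughput is the fastest rate with zero loss. Testers commonly implement this as a binary search, roughly like the Python sketch below (send_frames() is a hypothetical stand-in for your traffic generator's API):

```python
def send_frames(rate_pct, frame_size, duration=60):
    """Hypothetical driver: transmit at rate_pct % of line rate for
    `duration` seconds and return the number of frames lost."""
    raise NotImplementedError("wire this up to your traffic generator")

def rfc2544_throughput(frame_size, resolution=0.1):
    """Binary-search the highest rate (% of line rate) with zero frame
    loss, per the RFC 2544 throughput definition."""
    low, high, best = 0.0, 100.0, 0.0
    while high - low > resolution:
        rate = (low + high) / 2
        if send_frames(rate, frame_size) == 0:
            best, low = rate, rate   # no loss: try a faster rate
        else:
            high = rate              # loss seen: back off
    return best

# RFC 2544 recommends repeating the search at each standard Ethernet
# frame size: 64, 128, 256, 512, 1024, 1280, and 1518 bytes.
```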
Appendix A: Testing Considerations
A.1 Scope Of This Appendix
This appendix discusses certain issues in the benchmarking
methodology where experience or judgment may play a role in the tests
selected to be run or in the approach to constructing the test with a
particular DUT. As such, this appendix MUST not be read as an
amendment to the methodology described in the body of this document
but as a guide to testing practice.
1. Typical testing practice has been to enable all protocols to be
tested and conduct all testing with no further configuration of
protocols, even though a given set of trials may exercise only one
protocol at a time. This minimizes the opportunities to "tune" a
DUT for a single protocol.
2. The least common denominator of the available filter functions
should be used to ensure that there is a basis for comparison
between vendors. Because of product differences, those conducting
and evaluating tests must make a judgment about this issue.
3. Architectural considerations may need to be considered. For
example, first perform the tests with the stream going between
ports on the same interface card and then repeat the tests with the
stream going into a port on one interface card and out of a port
on a second interface card. There will almost always be a best
case and worst case configuration for a given DUT architecture.
4. Testing done using traffic streams consisting of mixed protocols
has not shown much difference between testing with individual
protocols. That is, if protocol A testing and protocol B testing
give two different performance results, mixed protocol testing
appears to give a result which is the average of the two.
5. Wide Area Network (WAN) performance may be tested by setting up
two identical devices connected by the appropriate short-haul
versions of the WAN modems. Performance is then measured between
a LAN interface on one DUT to a LAN interface on the other DUT.
The maximum frame rate to be used for LAN-WAN-LAN configurations is a
judgment that can be based on known characteristics of the overall
system including compression effects, fragmentation, and gross link
speeds. Practice suggests that the rate should be at least 110% of
the slowest link speed. Substantive issues of testing compression
itself are beyond the scope of this document.
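Since the appendix raises the question of what frame rate to offer, note that the theoretical maximum is easy to compute yourself. On the wire, every Ethernet frame also carries an 8-byte preamble and a minimum 12-byte inter-frame gap, which gives the quick calculation below (my own sketch, not from the RFC):

```python
# Theoretical maximum Ethernet frame rate for the RFC 2544 frame sizes.
# Each frame on the wire costs frame_size + 8 (preamble) + 12 (inter-frame
# gap) bytes, so max fps = link_bps / ((frame_size + 20) * 8).

def max_frame_rate(frame_size, link_bps):
    return link_bps / ((frame_size + 20) * 8)

for size in (64, 128, 256, 512, 1024, 1280, 1518):
    print(f"{size:>4}-byte frames: {max_frame_rate(size, 1e9):>12,.0f} fps at 1 Gb/s")
# 64-byte frames give the familiar 1,488,095 fps Gigabit Ethernet line rate.
```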
While RFC 2544 is the “benchmark,” there have been additions over the years as technology changes and updates to testing are needed. A Google search for benchmarking RFCs will show more RFCs covering many aspects of benchmarking for equipment and protocols. The two from that search I want to point out are Benchmarking Methodology for LAN Switching Devices (RFC 2889) and IPv6 Benchmarking Methodology for Network Interconnect Devices (RFC 5180). IPv6 is covered in the latter, and there is a nice white paper from Cisco covering the considerations for testing IPv6/IPv4 dual-stack environments – Cisco whitepaper.
There is an IETF Working Group looking at new standards for IP Performance Metrics (ippm); its charter states, “The IPPM WG has developed a set of standard metrics that can be applied to the quality, performance, and reliability of Internet data delivery services.”
The ITU has also created ITU-T Y.1564.
ITU-T Y.1564 is designed around three key objectives:
- To serve as a network service level agreement (SLA) validation tool, ensuring that a service meets its guaranteed performance settings in a controlled test time.
- To ensure that all services carried by the network meet their SLA objectives at their maximum committed rate, proving that under maximum load network devices and paths can support all the traffic as designed.
- To perform medium- and long-term service testing, confirming that network elements can properly carry all services while under stress during a soaking period.
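The SLA-validation idea in the first objective boils down to checking every service's measured key performance indicators against its committed objectives at once. Here is a hedged Python sketch of that pass/fail check; the KPI names follow Y.1564's frame delay (FD), frame delay variation (FDV), and frame loss ratio (FLR), but every threshold and measurement below is invented for illustration:

```python
# Y.1564-style SLA check: a service passes only if every measured KPI meets
# its committed objective. All numbers here are made up for illustration.

SLA = {  # service: (max FD ms, max FDV ms, max FLR)
    "voice": (10.0, 2.0, 0.001),
    "video": (20.0, 5.0, 0.001),
    "data":  (50.0, 20.0, 0.010),
}

measured = {  # pretend these came from a soak test at committed rates
    "voice": (7.2, 1.1, 0.0000),
    "video": (14.9, 3.7, 0.0004),
    "data":  (61.3, 8.2, 0.0021),
}

for service, (fd, fdv, flr) in measured.items():
    max_fd, max_fdv, max_flr = SLA[service]
    ok = fd <= max_fd and fdv <= max_fdv and flr <= max_flr
    print(f"{service}: {'PASS' if ok else 'FAIL'} "
          f"(FD {fd} ms, FDV {fdv} ms, FLR {flr})")
# "data" fails here: its 61.3 ms frame delay exceeds the 50 ms objective.
```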
There are many commercial tools that can cost big bucks to do benchmark testing based on RFC 2544 and other tests like IMIX: “IMIX, short for Internet Mix, is not officially defined by a standards organization, [but] it has become an increasingly popular concern in the networking test arena. Its origins are derived in part out of a need to identify and simulate Internet network traffic according to frame size usage.” But to get your feet wet in the world of benchmarking and real-world performance testing, a great free tool to use is iperf, along with its Java-based front end, jperf. Configuration and use of iperf/jperf is out of the scope of this post, but it’s very easy to find tutorials with a quick web search.
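As a small taste anyway, the sketch below (mine; it assumes iperf3 is installed and that an `iperf3 -s` server is already listening on the far end) runs a 10-second TCP test from Python and pulls the received throughput out of iperf3's JSON output:

```python
import json
import subprocess

def iperf3_throughput(server, seconds=10):
    """Run a TCP test against a listening iperf3 server; return Mbit/s received.
    Assumes iperf3 is on PATH and `iperf3 -s` is running on `server`."""
    out = subprocess.run(
        ["iperf3", "-c", server, "-t", str(seconds), "-J"],  # -J = JSON output
        capture_output=True, text=True, check=True,
    ).stdout
    result = json.loads(out)
    return result["end"]["sum_received"]["bits_per_second"] / 1e6

# Example against a hypothetical server address:
# print(f"{iperf3_throughput('192.0.2.10'):.1f} Mbit/s")
```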
Wrapping up this post with the answer I gave to my boss: “It depends.” I then went into basically what I have covered in this post about the RFCs, packet sizes, protocols, and services that would be running on router X and router Y at different customer sites depending on their needs, and how all these things would influence performance. He said, as usual, “that is all good stuff, but…” So, in the end we went with the cheaper model because it worked for the basic Internet services we were providing to most sites. We did have to upgrade to router Y at some other sites, but that was due to VPN requirements. Benchmarking is important both as a standalone test of a device and for getting a baseline SLA when all the devices are connected end-to-end. Without these test results, you really have no basis for comparison when there is a performance problem.