Just days before 2012 arrives and ushers in the Mayan apocalypse, I thought I’d do something ridiculous. Like shoot heroin or test the airbags in my car by driving into a wall at high speed. But then something even more ridiculous happened: I lost a few more hours of my life to tuning a DLSw+ head-end router. I should have stuck with the heroin. At least it would have felt good for a little while. For the sake of posterity and as an act of kindness during this most wonderful time of the year, I’m going to post a quick guide to tuning DLSw+. Some lost soul may stumble upon it and save themselves some time and frustration.
To improve DLSw+ performance, particularly on a head-end device with lots of peers, you need to do three things that are tightly interdependent:
1. Improve CPU performance.
2. Reduce the likelihood of a packet getting dropped, either in the network or by the DLSw routers themselves.
3. Reduce the number of packets the router has to process.
CoS Markings (head-end and remote)
First, you need to figure out where DLSw+ fits in your QoS model. By default, DLSw+ marks its traffic IP precedence 5. If you are running DLSw in your network and you have mysterious VoIP issues, then it’s very possible that DLSw is passing through your priority queue. If you don’t have a priority queue, then DLSw is going into the best-effort queue or possibly into your garbage queue (a garbage queue being a 1% queue commonly used to choke out certain kinds of traffic). We chose and stuck with IP precedence 3:
Router(config)#dlsw tos map high 3 medium 3 normal 3 low 3
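Once DLSw+ is re-marked, your queuing policy has to actually do something with precedence 3. Here is a rough sketch of what that could look like. The class and policy names (DLSW, WAN-EDGE), the 20 percent figure and the Serial0/0 interface are all made up for illustration; they are not our production values, so size the queue for your own traffic.
Router(config)#class-map match-any DLSW
Router(config-cmap)#match ip precedence 3
Router(config)#policy-map WAN-EDGE
Router(config-pmap)#class DLSW
Router(config-pmap-c)#bandwidth percent 20
Router(config)#interface Serial0/0
Router(config-if)#service-policy output WAN-EDGE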
Packet Size (head-end and remote)
Next is DLSw packet size. This one is quite important actually. Many of you are familiar with the negative impact of post-encrypted or post-encapsulated tunnel packets being fragmented. The receiving router must reassemble those packets before it can decrypt/un-encapsulate the packet within. This can negatively impact the performance of a router, particularly in the face of packet loss.
DLSw as a tunneling protocol is no different. By default, DLSw uses 576-byte packets. If the SNA frame exceeds this size, then two DLSw packets are required and the remote router must wait for the second packet before it can reassemble the frame. Unfortunately, you cannot configure the DLSw packet size directly. Instead, you must enable TCP path MTU discovery (PMTUD) on the router.
Router(config)#ip tcp path-mtu-discovery
You may actually be using the “crypto ipsec df-bit clear” global command in your network if you are using IPSec. You can’t use this command if you are using DLSw. The alternative approach is to use a policy route-map on the data-center/LAN side of your router. Use an ACL to match on any IP traffic except TCP traffic destined to or coming from ports 2065 and 2067. This way you don’t break PMTUD for DLSw.
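A minimal sketch of that policy route-map follows. It assumes the LAN-facing interface is GigabitEthernet0/1 and uses the made-up name CLEAR-DF for both the ACL and the route-map; neither comes from our config. The deny entries keep DLSw traffic out of the route-map so its DF bit is left alone, and everything else gets the DF bit cleared.
Router(config)#ip access-list extended CLEAR-DF
Router(config-ext-nacl)#deny tcp any any eq 2065
Router(config-ext-nacl)#deny tcp any eq 2065 any
Router(config-ext-nacl)#deny tcp any any eq 2067
Router(config-ext-nacl)#deny tcp any eq 2067 any
Router(config-ext-nacl)#permit ip any any
Router(config)#route-map CLEAR-DF permit 10
Router(config-route-map)#match ip address CLEAR-DF
Router(config-route-map)#set ip df 0
Router(config)#interface GigabitEthernet0/1
Router(config-if)#ip policy route-map CLEAR-DF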
Selective Acknowledgements (head-end and remote)
No matter how much you tune your router, packets will still occasionally be lost. Enabling TCP selective acknowledgements will reduce the number of packets that are retransmitted when packet loss occurs. You will want to do this particularly if you have a large number of peers or if you enable the TCP Window Scaling feature.
Router(config)#ip tcp selective-ack
TCP Window Scaling (head-end and remote)
A larger window size potentially means more efficient throughput. On IOS, setting the window size above 65,535 bytes is what turns on window scaling. If you enable this feature, be sure to enable selective acknowledgements.
Router(config)#ip tcp window-size 128000
Single TCP connect (remote only)
In DLSw+ v1 when two peers initially communicate, two TCP sessions are actually opened: one for each direction. If both sides agree to it, they will knock down one of the sessions. If you have a large number of peers and a network event occurs that causes some number of DLSw peers to bounce, then having the DLSw v2 “Single TCP” feature enabled will help. With this feature enabled on the remote routers (not the central, promiscuous DLSw router), only a single TCP session will be established in the beginning.
Router(config)#dlsw remote-peer 0 tcp 18.104.22.168 v2-single-tcp
DLSw UDP packets (head-end and remote)
An unholy abomination if there ever was one: Someone, probably an ex-SNA network person, actually decided to implement a feature in DLSw that uses UDP port 0. And this is enabled by default. If it’s not needed (and it’s probably not), then just disable it everywhere on all DLSw routers.
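For reference, the global command that turns the UDP feature off should be the one below. I'm going from the IOS command reference here rather than pasting from our config, so confirm it against your IOS version first.
Router(config)#dlsw udp-disable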
Hold Queues and SPD thresholds (head-end only)
If you have a head-end DLSw device with hundreds of peers, then you really should be using a 7200 w/NPE-G2 processor and 2G of RAM. At a minimum. It should be dedicated to this purpose (DLSw head-end). Since all DLSw packets are sent to the processor, go ahead and increase the hold-queue (in and out) to 4096 packets on all interfaces. With hundreds of peers, you will in fact be completely overrunning the hold-queue with its default settings. The hold queue is configured on the physical interface.
There is no point in increasing the hold-queue without also increasing the SPD threshold values. What is SPD? The interface inbound hold-queue is really a queue for process-switched traffic waiting for the CPU’s attention. SPD (Selective Packet Discard) is effectively RED for this queue. SPD will randomly drop packets from the inbound hold-queue once a minimum threshold is reached, and it will drop all packets once a maximum threshold is reached. If you increase the size of the hold-queue, you need to adjust the SPD thresholds to match.
Router(config-if)#hold-queue 4096 in
Router(config-if)#hold-queue 4096 out
Router(config)#ip spd queue threshold minimum 2000 maximum 4096
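To check whether the new values are doing any good, look at the queue counters before and after the change. A quick sketch, assuming GigabitEthernet0/1 is one of your head-end interfaces (a placeholder name): the first command shows the input queue size, maximum, drops and flushes for that interface, and the second shows the current SPD state and thresholds.
Router#show interfaces GigabitEthernet0/1 | include queue
Router#show ip spd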
Buffers (head-end only)
Increase the amount of available buffers. We use the following settings on our head-end DLSw router (a 7206vxr w/NPE-G2 and 2G of RAM):
Router(config)#buffers small permanent 2000
Router(config)#buffers small max-free 2500
Router(config)#buffers small min-free 500
Router(config)#buffers middle permanent 800
Router(config)#buffers middle max-free 1000
Router(config)#buffers middle min-free 200
Router(config)#buffers big permanent 800
Router(config)#buffers big max-free 1000
Router(config)#buffers big min-free 200
Router(config)#buffers verybig permanent 100
Router(config)#buffers verybig max-free 125
Router(config)#buffers verybig min-free 25
Router(config)#buffers large permanent 100
Router(config)#buffers large max-free 125
Router(config)#buffers large min-free 25
Router(config)#buffers huge permanent 100
Router(config)#buffers huge max-free 125
Router(config)#buffers huge min-free 25
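Don’t copy these numbers blindly. They suit our box and our peer count. To size the pools for your own router, watch the counters for each pool. If misses and failures keep climbing for a pool, that pool is a candidate for more permanent buffers.
Router#show buffers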
Scheduler Allocate (head-end and remote)
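DLSw is process-switched. If either the head-end or remote router is CEF-switching a significant amount of traffic, it could starve DLSw of the CPU time that it needs. This command will dedicate more CPU time to process-switching. The first value is the maximum time, in microseconds, that the router spends handling network interrupts before yielding. The second is the minimum time, in microseconds, guaranteed to process-level tasks such as DLSw.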
Router(config)#scheduler allocate 3000 1000
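Before and after changing the scheduler, it’s worth checking how the CPU load actually splits. In the first line of the output below, the five-second figure is shown as two percentages: total utilization, then the portion spent at interrupt level (fast-switched traffic). A large second number next to struggling DLSw processes suggests this tuning may help. Treat it as a rough check, not a sizing formula.
Router#show processes cpu sorted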
IOS version (head-end and remote)
Lastly, you may want to consider upgrading to IOS 15.0(1)M7 (or perhaps the upcoming M8). Significant improvements to the overall performance of IOS have been made starting in this code. We went from the 12.4(15)T train to the 15.0(1)M train and noticed across-the-board improvements in CPU utilization. In some cases, the CPU utilization was cut by two-thirds. It might be worth waiting for the M8 release, which should be out soon. This code train is a bug-fix-only train and we’ve had great luck with it. As always, TEST FIRST.