Cisco ASA: High CPU in Dispatch Unit

I ran into unexpectedly high CPU utilization on a Cisco ASA firewall running 8.4.x family code; the CPU was running above 90% when less than 25% was normal. The culprit was the “Dispatch Unit”. A little googling suggests that the ASA dispatch unit is the process through which the majority of packets flow for inspection; what else it accomplishes, I’m not sure. (I did a lot of poking to find something that officially described the dispatch unit, but I came up empty. That said, let’s assume that the dispatch unit is more or less the ASA “big brother” process, monitoring traffic flowing through the ASA.)

The next step was to determine what might be pumping so much data into the ASA that the CPU was aggravated. I knew that it wasn’t…

  1. Volume related. The Mbps rates were normal.
  2. Connection related. I do historical graphing of connections transiting the firewall, and they were at a normal level as well. I did poke a bit deeper at connections using “sh local | in host|count/limit” per a recommendation I found on a Cisco forum, but that didn’t find anything unusual. Just the mail servers getting flogged, per normal.
  3. NAT related. The NAT translation count was at normal levels.
  4. Encryption related. This firewall didn’t handle any VPN traffic.

The next stop was the log server. I happen to have access to Cisco Security Manager’s Event Viewer at this particular site, which gives me functionality similar to Check Point’s SmartView Tracker. Doing a real-time dump of firewall logs, CSM quickly revealed that an old and untended Linux FTP host in a DMZ was absolutely pounding a pair of internal DNS servers with lookups (permitted via pinhole), and at the same rate trying to connect to an external host on tcp/3303 (denied). I also saw some attempts to send mail via SMTP, tcp/25. As DNS lookups (udp/53) are very short-lived, these didn’t build up in the ASA connection table, even though they were coming at a rate of hundreds per second.
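Without a tool like CSM, a similar "who is pounding what" view can be approximated from raw syslog text. The following is a minimal Python sketch under stated assumptions: the regex expects lines shaped like ASA deny messages (`... tcp src IFACE:IP/PORT dst IFACE:IP/PORT ...`), the sample lines are invented for illustration using documentation IP ranges, and `top_talkers` is a hypothetical helper name. Adjust the pattern to whatever your firewall actually logs.

```python
import re
from collections import Counter

# Assumed line shape (modeled on ASA "Deny" syslog messages); tune for real logs.
SRC_DST = re.compile(
    r"(?P<proto>tcp|udp) src (?P<src_if>[^:\s]+):(?P<src_ip>[\d.]+)/\d+ "
    r"dst (?P<dst_if>[^:\s]+):(?P<dst_ip>[\d.]+)/(?P<dst_port>\d+)"
)

def top_talkers(lines, n=5):
    """Count log events per (source IP, protocol, destination port)."""
    counts = Counter()
    for line in lines:
        m = SRC_DST.search(line)
        if m:  # silently skip lines that do not match the assumed shape
            key = (m["src_ip"], m["proto"], int(m["dst_port"]))
            counts[key] += 1
    return counts.most_common(n)

# Invented sample lines, for illustration only.
SAMPLE = [
    'Deny tcp src dmz:192.0.2.10/40001 dst outside:203.0.113.5/3303 by access-group "dmz_in"',
    'Deny tcp src dmz:192.0.2.10/40002 dst outside:203.0.113.5/3303 by access-group "dmz_in"',
    'Deny tcp src dmz:192.0.2.10/40003 dst outside:203.0.113.5/3303 by access-group "dmz_in"',
    'Deny tcp src dmz:192.0.2.10/40004 dst outside:198.51.100.7/25 by access-group "dmz_in"',
    'Deny tcp src dmz:192.0.2.10/40005 dst outside:198.51.100.7/25 by access-group "dmz_in"',
    'Deny udp src dmz:192.0.2.20/5353 dst outside:198.51.100.9/53 by access-group "dmz_in"',
    'some unrelated log line',
]

for key, hits in top_talkers(SAMPLE, n=3):
    print(key, hits)
```

A host that dominates the top of this list at hundreds of events per second is the kind of anomaly that stood out here, even though short-lived flows never accumulated in the connection table.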

I was curious as to what tcp/3303 was, and don’t have a strong conclusion as yet. Based on googling, it seems plausible that tcp/3303 could be used for a command/control network via a chat protocol. Considering the behavior of this DMZ box, it seems a reasonable conclusion that the system was trying to connect to home base, where it would receive further instructions from the botnet overlords. The furious DNS lookups were for hosts in advertising-related domain names. I didn’t spend much more time on the specifics.

I notified the appropriate parties about the badly behaving box. While waiting for a resolution, I traced down the switchport it was hanging off of using its MAC and switch bridging tables. (I found the MAC on the ASA using “show arp”. On the switches, I used “show mac address-table address”.) Since it was a physical host and not a VM, I shut the switchport down. The ASA CPU returned to normal in seconds.

Concluding Thoughts

Baseline and historical data is helpful. You get a sense of what’s normal now in comparison to what’s been normal in the past. Without history, you’re guessing whether the current state of affairs is normal or not. It’s very hard to catch an anomaly if you don’t know normality. Ask any couch sitting on the highly improbable starship, Heart of Gold.

You have to be alerted when there are anomalies, like when your CPU is about to catch on fire. I’m embarrassed to admit that this firewall running so hot didn’t send up a flare on my NMS. The CPU issue was discovered by accident – we weren’t experiencing issues of any kind, as the firewall was working fine and not dropping packets unexpectedly. I’ve since configured an NMS alert that is triggered when the ASA CPU is running hotter than normal for longer than 10 minutes. The alert first logs into the firewall and runs a script that pulls connection count, xlate count, cpu-hog, and other possibly interesting stats; the script then e-mails that information to me. The only reason it hadn’t been done before is that all of the ASAs I manage at this site run at different baseline CPU utilization rates, and so I hadn’t taken the time to custom craft all of the alerts. Sometimes you have to take the time, even when you don’t have it.
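The "hotter than normal for longer than 10 minutes" condition, with a different baseline per firewall, can be sketched in a few lines of Python. This is an illustration of the idea, not the actual NMS configuration: the firewall names, baseline values, `MARGIN`, and the function names are all hypothetical, and samples are assumed to arrive as time-ordered `(timestamp_seconds, cpu_percent)` pairs.

```python
def sustained_high(samples, threshold, hold_seconds=600):
    """Return True if CPU stayed above threshold for one unbroken run
    lasting at least hold_seconds. Samples must be sorted by time."""
    run_start = None
    for ts, cpu in samples:
        if cpu > threshold:
            if run_start is None:
                run_start = ts          # a new hot run begins here
            if ts - run_start >= hold_seconds:
                return True
        else:
            run_start = None            # any dip resets the timer
    return False

# Hypothetical per-firewall baselines (percent); each ASA runs differently.
BASELINES = {"asa-dmz": 25.0, "asa-core": 40.0}
MARGIN = 20.0  # assumed headroom above baseline before CPU counts as "hot"

def should_alert(name, samples):
    """Apply the sustained-high check with this firewall's own threshold."""
    return sustained_high(samples, BASELINES[name] + MARGIN)

# One sample per minute: 11 hot samples span 600 seconds.
hot = [(t * 60, 92.0) for t in range(11)]
# A five-minute spike that then cools off.
spike = [(t * 60, 92.0 if t < 5 else 20.0) for t in range(11)]

print(should_alert("asa-dmz", hot))    # True
print(should_alert("asa-dmz", spike))  # False
```

Resetting the timer on any dip keeps short bursts from paging anyone, at the cost of missing a load that hovers right at the threshold; a rolling average is one alternative if that matters.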

Logging is very helpful. If you lack log detail or are missing logs completely, a real-time packet capture might also reveal the issue. The ASA can do packet capture from the CLI or ASDM. The ASDM interface is my favorite choice here; ASDM allows you to capture traffic and download it to your workstation as a PCAP, which you can then examine in Wireshark.

Remember to limit what DMZ hosts have access to outside your network as well as inside it. It’s a bad design to allow DMZ hosts to reach anything out on the public Internet. If the DMZ host is compromised, you are protecting your company’s business most effectively by making sure that DMZ host can only talk to what is absolutely necessary, whether inside your network or outside. In this case, the host couldn’t get to much at all, and as such it was unable to connect to the presumably malicious tcp/3303 or deliver mail. Therefore, the damage was contained.

I was disappointed not to find a Cisco ASA architecture document on cisco.com that explained ASA packet flows inside the box or ASA processes & their use. I even have the Cisco Firewalls book put out by Cisco Press, and as far as I could find, it does not address this topic. If there is such an architecture document somewhere, please let me know. Such a reference would be invaluable, and it’s possible I just didn’t ask the search engines the right questions when looking for it.

 

Ethan Banks
Ethan Banks, CCIE #20655, has been managing networks for higher ed, government, financials and high tech since 1995. Ethan co-hosts the Packet Pushers Podcast, which has seen over 2M downloads and reaches over 10K listeners. With whatever time is left, Ethan writes for fun & profit, studies for certifications, and enjoys science fiction. @ecbanks
  • http://marathon-networks.com/ Dan Shechter

    Nice troubleshooting!

    Can you share what ASA module was that?

    • http://packetpushers.net/author/ecbanks Ethan Banks

      5520

  • Robert Harper

    Check out brksec-3020 via ciscolive for the asa packet flow info.

    • http://packetpushers.net/author/ecbanks Ethan Banks

      Perfect, syncing to Evernote right now. Thanks!

  • http://packetpushers.net/author/ecbanks Ethan Banks

    Thanks – the “Packet Flow” one is especially good; I think I’ve seen that sometime in the past, but lost track of it.

  • thepacketologist

    Ethan,

    Don’t forget, you can get those same packet captures that you run from the command line as a pcap as well. Point your web browser to the ASA’s IP address along with the path to your capture. Something like:

    https://192.168.0.1/admin/capture/DMZ_Cap/pcap

    This will allow you to save the capture as a pcap so you can open it in Wireshark or another traffic analyzer. Without the “/pcap” at the end, the URL displays the captured packets in your web browser window.

    Nice job tracking the issue down and thanks for sharing!

    Regards,

    Keith

  • Andy Litzinger

    Does it seem at all odd that a single server hitting an acl deny rule would peg the cpu of the ASA? I would have imagined it could handle that in its sleep.

  • costa

    Saw this behavior also with an old PIX running 6.2. It turned out that failed and repeated DNS requests from the very same hosts toward the very same server overwhelmed the DNS ALG in the firewall.

    Long story short, disable the ALG (no fixup in 6, MPF in 7+) or change the DNS port used (not always applicable; we were lucky enough to use a BIND hierarchy, and thus we just reconfigured the “inside” BIND to point to port 5353 of the outside ones).

    We also filed a bug with the TAC, but they of course did nothing in 2 major releases.

  • Chris

    Do you have a link to the script you use to generate the automated email with all the ASA info you monitor?
