A scientist is never certain. We all know that. We know that all our statements are approximate statements with different degrees of certainty; that when a statement is made, the question is not whether it is true or false but rather how likely it is to be true or false. – Richard P. Feynman
Recently a listener emailed me for my opinion on the topic of URL categorization and classification. He lamented how the security team at his organization had no doubts regarding the efficacy of these products and was pushing for implementation, while he was absolutely convinced that they were always a waste of money. He seemed to expect absolutes from me regarding the failure of this technology. Unfortunately, I think the answer is much more complicated.
The business of information security seems to be more about superstition than science. The CISO is just another slob who desperately wants to believe that some sexy, new technology will keep the Big, Bad Wolf of APT from blowing the organizational house down. Eventually, the inevitable intrusion occurs and the latest acquisition winds up on the dung heap along with all the other broken toys. But is it the fault of the product or unrealistic expectations?
I don’t believe any security technology will prevent APT or a similarly impactful incident. First, APT is a “black swan event” and hence unpredictable. If you can’t predict with perfect accuracy, then you can’t prevent with absolute certainty. Like the financial markets, dealing with APT is about educated guesses using heuristics, or as I like to call it, thin-slicing a black swan. In general, the main problem I have with the way security groups perform incident response and intrusion detection is that they try to address an “open world” problem using “closed world” assumptions. Gerd Gigerenzer, Director of the Center for Adaptive Behavior and Cognition at the Max Planck Institute and a leader in the field of Bounded Rationality, proposes that while more information is good for hindsight, less information is better for prediction, or foresight. People use smart heuristics, or “rules of thumb,” because the human brain economizes for faster response times.

That being said, I have mixed feelings about URL categorization/classification and web reputation techniques. Know that if you purchase such a product, the vendor often doesn’t do the work in-house. For example, Zvelo resells its URL classification service to many OEMs. So what are you actually buying from the vendor? What offends me most about this type of technology isn’t the intrusiveness of inspecting everyone’s traffic so much as the low ROI. Unless you man-in-the-middle all your SSL traffic, I’m not sure how well many classification/categorization systems will work. To bypass them, one could simply run Tor, use DNSCurve, or tunnel over SSH. Granted, these systems might catch the low-hanging fruit, but if we’re talking about APT, threats like Flame and Red October, then wouldn’t the attacker have already figured out a way around this sort of technique?
How about the numerous false positives that make users scream and administrators pull out their hair in frustration? I wonder whether the effort is worth it. I feel the same way about endpoint security. Maybe the energy is better spent deploying desktop virtualization and treating the user’s system as something unclean. In EDU and GOV, you have access to various Information Sharing and Analysis Centers (ISACs), which are data clearinghouses of known malicious hosts and threats. This data can be used in various ways to protect an organization, and I wish the model were more widely used. One of the most efficient techniques I’ve seen for filtering malware and inappropriate content is DNS-based web filtering using DNS sinkholes. There’s no user chokepoint, and it’s pretty effective at catching many malicious sites. I don’t know how the commercial products build their databases, but the open source versions seem to use ISAC data. It’s probably more reputation than classification, but when I was part of a team that implemented such a product, it was incredibly effective, with little impact on users.
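The sinkhole idea is simple enough to sketch in a few lines: the resolver answers queries for known-bad domains with the address of an internal “walled garden” host instead of the real record, and everything else passes through untouched. The blocklist entries and sinkhole address below are hypothetical placeholders, not from any particular product; a real deployment would wire this logic into something like BIND RPZ or Unbound rather than hand-rolling a resolver.

```python
# Minimal DNS-sinkhole lookup sketch (illustrative, not a full resolver).

SINKHOLE_IP = "10.0.0.53"  # hypothetical internal walled-garden host

# Hypothetical blocklist; in practice this would be fed from ISAC or
# commercial reputation data.
BLOCKLIST = {"evil.example.com", "malware.example.net"}

def resolve(domain: str, upstream) -> str:
    """Return the sinkhole address for known-bad domains,
    otherwise defer to the real (upstream) resolver."""
    # Match the queried name itself or any parent zone on the blocklist,
    # so subdomains of a listed zone are sinkholed too.
    labels = domain.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in BLOCKLIST:
            return SINKHOLE_IP
    return upstream(domain)

# Usage: any lookup under a listed zone is redirected to the sinkhole.
fake_upstream = lambda d: "93.184.216.34"  # stand-in for a real lookup
print(resolve("cdn.evil.example.com", fake_upstream))  # -> 10.0.0.53
print(resolve("example.org", fake_upstream))           # -> 93.184.216.34
```

Because the decision happens at name resolution, users never hit a proxy chokepoint, which is part of why this approach tends to be low-friction.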
If the threat reports from Trustwave and Verizon are to be trusted and more than 80% of intrusions are detected by outsiders, then the products we’re buying clearly aren’t working. My feeling is that we need to gather LESS information for faster response times: build systems implementing Fast and Frugal Trees, like those used by first responders, to find attacks in real time and shut them down. This means shifting to a Bounded Rationality model using stopping rules. That’s a difficult adjustment for security vendors to make, especially when their financial well-being is built on the current way of doing things.
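A Fast and Frugal Tree checks one cue at a time and exits with a decision as soon as a cue fires — that’s the stopping rule — rather than weighing every available signal before acting. A hypothetical triage sketch (the cues, field names, and thresholds here are invented for illustration, not drawn from any real product or playbook):

```python
# Fast-and-frugal tree for alert triage (illustrative sketch).
# Each cue is checked in order; the first one that fires is an exit,
# so later cues are never even consulted.

def triage(alert: dict) -> str:
    # Cue 1: destination already on a known-bad list? Exit: block.
    if alert.get("dest_on_blocklist"):
        return "block"
    # Cue 2: outbound transfer far above the host's baseline? Exit: escalate.
    if alert.get("bytes_out", 0) > 100 * alert.get("baseline_bytes_out", 1):
        return "escalate"
    # Cue 3: connection during the host's normal active hours? Exit: ignore.
    if alert.get("during_business_hours", True):
        return "ignore"
    # Final exit for anything that fell through every cue.
    return "escalate"

# Usage: a huge off-hours transfer to an unlisted host still escalates.
print(triage({"dest_on_blocklist": False,
              "bytes_out": 5_000_000,
              "baseline_bytes_out": 10_000,
              "during_business_hours": False}))  # -> escalate
```

The point isn’t the specific cues; it’s that a handful of ordered checks with early exits can produce a defensible decision in microseconds, while a model that waits to correlate every available feed cannot.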