Network Complexity Bites Back

Let me tell you a story.

The cell phone service at my house stinks. On a good day, if I walk out on the porch and lean against the rail, I can get one bar of signal. If it’s raining in the middle of summer, when the trees have a good covering of leaves, well, maybe I can get a signal, maybe I can’t. AT&T says they can’t build new towers in the area, and since I sometimes get one bar, I’m covered.

But then along comes this brilliant little idea for a local cell that uses VOIP to carry your calls back into the AT&T network: the AT&T Microcell (a type of femtocell). It’s made by Cisco (says so right on the side), so I thought, “how bad could this be?” Twenty years in the networking industry, and I’m still asking, “how bad could this be?” As if I don’t already know the answer.

I picked up a Microcell and connected it behind my router. Of course it didn’t work. How long did that take, you wonder? Long enough to serialize a 128-octet packet onto a Gig/E link.

The friendly AT&T folks said it must be some problem with my router, “you need to open some ports up so it can communicate.” I scratched my head over why a Microcell might be hosting inbound connections of any sort, and scratched my head over why this thing doesn’t have any sort of a management interface (so I could at least assign it a static address on my little network), configured a static DHCP mapping, put the Microcell in the DMZ, and tried again. Nothing.

So I went out and bought another router. I tried three different Cisco routers and one Netgear. Nothing. I called AT&T. “Apparently, your Microcell is only going to work if it’s connected in front of the router.” This raised a new host of questions. How much throughput does this little device support? How often does it fail? Is it a router? If it consumes the IP address my provider gives me, it must be a NAT device, so how does that work? But AT&T doesn’t know the answer to any of these questions, so it’s not going in front of my router.

Next step: I went through months of trying to convince someone, anyone, at my cable provider to let me pay for a second IP address, so I could connect the Microcell directly to their network. After getting that all set up, I rebooted everything and plugged the Microcell in.

In case you haven’t guessed already: nope. It still won’t work.

So I called AT&T again. “Sorry, it’s not connected to the Internet.” I scratched my head again, and plugged a PC into the back of it. I could browse through the device (though it did cut down my connection speed a good bit). I reached someone else at AT&T, who then told me… Microcells don’t work with the brand of cable modem I have installed.

“It’s not getting free flow through the Internet, sir, you need to put it behind a router.” Huh? What, precisely, is “free flow”? I’m a network engineer; tell me what protocols are failing to make it through the network. The AT&T techs don’t know. I even sent packet traces to some Cisco folks. They don’t know what it’s doing, either. And it didn’t work the last time I put it behind a router. Does anyone read case notes?

Why is this all so hard?

Because the protocols required to get VOIP running are complex. Because IPsec is complex, and VOIP on top of IPsec is even more complex. Because there is no way to actually see what the Microcell is doing on the network, so there’s no way to determine where it’s getting hung up. Because no one at AT&T knows enough about networks to look at what’s happening on their end and say, “the Microcell isn’t accepting HTTPS connections,” or provide any other sort of proper diagnosis. Because no one at my local provider knows enough about Microcells to know how to do whatever is needed on their network to allow the right sort of traffic through.
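
To put a rough number on what one of those layers costs, here is a minimal sketch of how much an IPsec ESP tunnel adds to each packet. The header sizes are the standard ones, but everything else is an assumption for illustration: I don’t know which cipher, integrity algorithm, or encapsulation the Microcell actually uses, so the defaults below (AES-CBC with a 16-byte IV, a 12-byte HMAC-SHA1-96 ICV, UDP encapsulation for NAT traversal) and the function names are hypothetical.

```python
# Illustrative only: the cipher, ICV, and NAT-T defaults are assumptions,
# not what the Microcell is known to use.

def esp_tunnel_size(inner_len, block=16, iv=16, icv=12, nat_t=True):
    """Outer IPv4 packet size for an inner packet carried in ESP tunnel mode."""
    # ESP trailer = padding + 1 byte pad length + 1 byte next header,
    # padded so the encrypted portion is a multiple of the cipher block size.
    pad = (-(inner_len + 2)) % block
    encrypted = inner_len + pad + 2
    esp = 8 + iv + encrypted + icv   # 8 = SPI + sequence number
    udp = 8 if nat_t else 0          # UDP encapsulation for NAT traversal
    return 20 + udp + esp            # 20 = outer IPv4 header


def max_inner_size(path_mtu, **kw):
    """Largest inner packet that fits in path_mtu without fragmentation."""
    size = path_mtu
    while size > 0 and esp_tunnel_size(size, **kw) > path_mtu:
        size -= 1
    return size


if __name__ == "__main__":
    voice = 20 + 8 + 12 + 160        # IPv4 + UDP + RTP + 20 ms of G.711 audio
    print(esp_tunnel_size(voice))    # size of one tunneled voice packet
    print(esp_tunnel_size(1500))     # a full-size inner packet after tunneling
    print(max_inner_size(1500))      # biggest inner packet a 1500-byte path carries
```

Under these assumptions a 200-byte voice packet grows to 272 bytes, a full 1500-byte inner packet grows to 1568 bytes, and the largest inner packet that crosses a 1500-byte path in one piece is 1422 bytes. Anything bigger has to be fragmented somewhere, which is exactly the sort of thing you can’t diagnose from outside an encrypted tunnel.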

In short, this is a case of complexity biting back. Complexity lets us do all sorts of neat things with networks. But it also allows us to break things in a way that simply makes them impossible to fix. Maybe it’s time to start thinking in terms of making networks simple (again). Maybe there’s a limit to how much complexity we can layer on top of complexity and still expect things to actually work.

Maybe we should take to heart RFC 1925. “(6) It is easier to move a problem around (for example, by moving the problem to a different part of the overall network architecture) than it is to solve it. … (6a) (corollary). It is always possible to add another level of indirection.” Indirection is another form of complexity, after all.

The Microcell? Because I really want my cell phones to work, I’ve ordered another cable modem. Will it work? Who knows? It’s almost embarrassing being reduced to random acts of engineering.

About Russ White

Russ White is a Network Architect who's currently looking for a new challenge. He's scribbled a basket of books, penned a plethora of patents, written a raft of RFCs, taught a trencher of classes, and done a lot of other stuff you either already know about, or don't really care about. If you want letters, well... BSIT/MSIT (Capella University), CCIE #2635, CCDE 2007:001, CCAr. So there.

  • http://twitter.com/anthony_ge Anthony

    Are you sure the Microcell isn’t DOA?

    • Russ White

      This is the third one I’ve tried, so I don’t think that’s it, either… My last effort fizzled as well: for various reasons, the provider can’t provision my personal modem with voice service (and my voice service is tangled up in this mess, too). So, now I’m waiting on another modem to see if the modem will fix the problem, though I can’t imagine why it would, given the SA is coming up. If there’s a fragmentation problem, I imagine that would be on the head-end router (an N7k or 6500 or 3800 or something like that), but I don’t have access there to see what it’s doing at all. Same thing with an MTU problem, but again, the folks at the provider can’t tell me (they won’t talk to me at that level), and the folks at AT&T can’t tell me (they don’t have anyone at their end to look at the packet trace and diagnose it).

  • Willy

    Bravo. An example of complexity, but we get complex systems to work (most of the time); I’d say it’s more a failure of customer service. They provide a product they simply can’t support or troubleshoot. Once the sale is made, you (the customer) aren’t a profit center any longer; you are a cost, and must be reduced.

  • http://twitter.com/christalsness christalsness

    Love the phrase “random acts of engineering.” I’ll have to make use of that.

  • GS

    Sorry for all your trouble.  I went through much the same thing with AT&T.  My microcell worked great at 2 houses, not at all at another. 

    I love the “random acts of engineering” line, much like the “part-swapping” school of auto repair.

  • http://twitter.com/cloudtoad Derick Winkworth

    Russ White?  *The* Russ White?  WHAT?  

    • Russ White

      Does that mean I’ve sunk this low, or I’ve drawn Packet Pushers up a notch? :-) Yes, that one…

      • http://twitter.com/cloudtoad Derick Winkworth

        Awesome.  I have all your books.  Glad you are here blogging!  Any chance of seeing another Inside IOS Architecture?  For the G2s or ASR1ks?  

  • Automaton

    The femtocells I’ve seen like to be behind a UPnP router that does IPsec passthrough and have an MTU of 1500 for the whole path.

    The third of these is the most critical, and of course the hardest to solve…

  • Tomstrr

    Back when I installed our AT&T Microcell (on FIOS, using the FIOS router) so we too wouldn’t have to go out on the lawn to get cell service, after some long “discussions” with folks at AT&T, one emailed me a document that she had gotten from further inside, which listed about a half dozen things you needed to do or not do that were not documented elsewhere. I don’t recall the list (I might still have it), but the key one for me was that one had to allow packet fragments. THEN it worked. I know not why.

    Stay tuned for the real fun, as the management web page was a real abomination and, as I had to use it for the first time again recently, though better, still is.

    • Russ White

      Yep, if you’re connected to a router you can get to the console of… After trying to get it working behind all sorts of routers using that set of instructions, I finally gave up and asked for a second IP address so I could connect it directly to the ISP’s network, but still no dice. That’s the problem, really: this thing is layers on layers, and the complexity of the layers on layers is really making it next to impossible to troubleshoot.

  • Gonnason

    Sniff the traffic and see what it is trying to do. Why throw random hardware at it if you don’t understand the problem?

    I would be fired if I attempted “Random Engineering” like this.

    • Russ White

      The entire session is in an IPsec SA… So all I see on the packet trace is encrypted traffic. :-(

      • Jordan Urie

        Pop the case, grind down the chip packages, and strip the keys out with a scanning electron microscope. Decrypt and troubleshoot at will ;-)