In the service-provider world, there is the concept of a “circuit design review,” also called a “path scrub” or “Class A.” Typically you would ask for this to be done when you have chronic issues with a circuit and you need a closer look at every device in the path to determine what is going on. This is an important concept, and one that applies just as well to IP networks and to enterprise engineers.
Frequently people get hung up on the idea of conducting a root-cause analysis (RCA) when there really is no value in it. Sometimes the problem is that somewhere in the network path, or multiple places in the network path, the configuration or performance just isn’t up to standards or best practices. The best thing to do is to clear these problems in the path first. Ensure that counters aren’t incrementing by the hundreds or thousands every second. Ensure that your QoS configuration is up to snuff. If that clears the problem, you can summarize the issue as a “circuit/out-of-compliance” issue.
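To make the counter check concrete, here is a minimal sketch of what a path scrub could automate. It assumes you have already collected two snapshots of the cumulative error counters for each hop, by whatever means you normally use (SNMP, CLI scraping, telemetry). The `Snapshot` type, the function names, the hop and counter names, and the threshold of 100 increments per second are all hypothetical choices for illustration, not part of any vendor tool.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Snapshot:
    """Cumulative counter values read from one interface at one moment."""
    timestamp: float           # seconds; any monotonic reference works
    counters: Dict[str, int]   # counter name -> cumulative value


def incrementing_counters(before: Snapshot, after: Snapshot,
                          max_per_second: float = 100.0) -> Dict[str, float]:
    """Return the counters whose rate of increase meets or exceeds the threshold.

    The threshold is an assumed value; pick one that matches your own standards.
    """
    elapsed = after.timestamp - before.timestamp
    if elapsed <= 0:
        raise ValueError("snapshots must be in chronological order")

    flagged = {}
    for name, end_value in after.counters.items():
        start_value = before.counters.get(name)
        if start_value is None:
            continue  # counter missing from the first snapshot; nothing to compare
        delta = end_value - start_value
        if delta < 0:
            continue  # counter was cleared or wrapped between snapshots
        rate = delta / elapsed
        if rate >= max_per_second:
            flagged[name] = rate
    return flagged


def scrub_path(path: Dict[str, Tuple[Snapshot, Snapshot]],
               max_per_second: float = 100.0) -> Dict[str, Dict[str, float]]:
    """Check every hop in the path and report only the hops with noisy counters."""
    report = {}
    for hop, (before, after) in path.items():
        flagged = incrementing_counters(before, after, max_per_second)
        if flagged:
            report[hop] = flagged
    return report


if __name__ == "__main__":
    # Made-up readings taken 10 seconds apart on two hypothetical hops.
    path = {
        "edge-rtr-1 Gi0/1": (
            Snapshot(0.0, {"crc_errors": 10, "output_drops": 500}),
            Snapshot(10.0, {"crc_errors": 12, "output_drops": 5500}),
        ),
        "core-sw-2 Te1/1": (
            Snapshot(0.0, {"crc_errors": 0, "output_drops": 3}),
            Snapshot(10.0, {"crc_errors": 0, "output_drops": 3}),
        ),
    }
    print(scrub_path(path))
```

With these sample readings, only the output drops on the first hop cross the threshold, so that is the hop you would clean up first. Slowly incrementing counters are deliberately ignored here; whether that is appropriate depends on the circuit and on your own standards.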
There is no need to conduct a series of endless and painful conference calls to figure out precisely what happened to every bit and byte at every moment the problem was occurring. If clearing the path solved the problem, MOVE ON. This would, I think, resolve 80% of all network issues and potentially save a huge amount of time.
I think, in general, it’s time to “up our game” in network engineering. Don’t feel good about finding that packets are not being marked correctly or that FEBEs are incrementing on a circuit. This should be trivial to discover. We should be doing better than this at this point, and managers should be managing these issues appropriately.
We should adopt the “Class A” approach to issues in the IP world. Is it really an issue, or is it just a QoS configuration problem or a cabling problem somewhere in the path?