Please note: the following post is the musing of a madman, so take that into consideration.
I am also aware that someone may have thought about this already; if so, I apologize in advance.
I’ve been thinking about router convergence recently, and how convergence is not just about how the routers exchange updates.
I have been looking at convergence from a user perspective, in that the real aim of convergence is to provide a stable network for the users traversing it, or more specifically for the user traffic traversing it.
As such, I found myself asking this question:
“What is the minimum diameter (or radius) of a network so that the ‘loss’ of traffic from a TCP/UDP ‘stream’ seen locally would indicate a network outage FASTER than a routing update?”
My thinking is that during everyday network transactions, sessions or flows are formed between clients and servers that just so happen to traverse a router. The router “sees” these sessions and records some information about the flows in its CEF/Netflow/MPLS/Whatever table.
If an outage were to occur on the far side of the network, then the flow of TCP/UDP traffic would slow or stop.
So, with the correct level of intelligence in the routing device, could this lack of flow be used to signify a change in the network?
We do something like this with SD-WAN/iWAN, but we don’t look at it from the client-to-server perspective.
Would this lack of flow information be reported to me more quickly than a routing update?
That is, by the time the flow has dissipated enough to indicate a failure, could the router detect this faster than it would receive a routing update?
- Is this information more trustworthy than a routing update, since it is a direct observation of network flow rather than a message that could be spoofed or otherwise dubious?
- How many administrative domains or how many routing domains would need to be between the source and destination to make a loss of user data pertinent to router convergence?
- Most paths through a network carry multiple flows, so watching several flows together could provide an accurate picture of the reliability of the destination.
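To make the idea concrete, here is a minimal sketch of what such a flow-health monitor might look like. Everything here is hypothetical: the class, the timeout, and the "require several quiet flows" threshold are illustrative choices, not an existing feature of any router.

```python
import time
from collections import defaultdict

FLOW_TIMEOUT = 2.0   # seconds of silence before a single flow counts as "quiet"
MIN_FLOWS = 3        # require several quiet flows before raising an alarm

class FlowHealthMonitor:
    """Track last-seen times per flow and flag a destination prefix as
    suspect when every flow towards it has gone quiet at once."""

    def __init__(self):
        # dest prefix -> {flow_id: last_seen_timestamp}
        self.flows = defaultdict(dict)

    def packet_seen(self, dest_prefix, flow_id, now=None):
        """Record that a packet for this flow passed through the router."""
        self.flows[dest_prefix][flow_id] = now if now is not None else time.time()

    def suspect_destinations(self, now=None):
        """Return prefixes whose flows have ALL gone quiet, provided there
        were enough flows to make the silence meaningful."""
        now = now if now is not None else time.time()
        suspects = []
        for prefix, flows in self.flows.items():
            quiet = [f for f, t in flows.items() if now - t > FLOW_TIMEOUT]
            if len(flows) >= MIN_FLOWS and len(quiet) == len(flows):
                suspects.append(prefix)
        return suspects
```

The point of the `MIN_FLOWS` check is the third bullet above: one flow going quiet means nothing (the user may simply have finished), but every flow to a prefix going quiet simultaneously is a much stronger signal.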
As an example, I live in AUS and most of my internet traffic comes from the US.
It will go via multiple ASes before it reaches its destination. My routing update/user data will likely traverse multiple IGP and BGP domains before it reaches me.
The likelihood that I will get an update about the source failure before my router notices the decline in user traffic may be quite small.
Maybe the Internet is a bad example, but what about a large service provider network or a big enterprise?
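A back-of-envelope comparison suggests the gap could be real. All numbers below are assumptions for illustration (the RTT, the AS-path length, and treating the classic 30-second eBGP MRAI as a per-hop worst-case delay), not measurements:

```python
# Assumed AUS <-> US round-trip time, in seconds.
rtt = 0.2
# Wait a few RTTs of silence before declaring a flow dead
# (roughly the spirit of a TCP retransmission timeout).
flow_timeout_rtts = 4
local_detection = rtt * flow_timeout_rtts          # seconds

# Assumed AS path length between me and the source.
as_hops = 5
# Classic eBGP MinRouteAdvertisementInterval, seconds.
mrai = 30.0
# Pessimistic case: each AS hop delays the withdrawal by up to one MRAI.
worst_case_bgp = as_hops * mrai                    # seconds

print(f"local flow-loss detection: ~{local_detection:.1f}s")
print(f"worst-case BGP propagation: ~{worst_case_bgp:.0f}s")
```

Even if the BGP figure is wildly pessimistic (withdrawals are often propagated faster than advertisements), the local signal arrives in under a second while the update crosses several administrative domains.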
Ok, so I can hear your comments already, as they are mine as well.
Yes, we can intelligently route packets based on some network criteria, but is it enough? Can it provide the intelligence we need in the “network of everything”?
- Dynamic routing doesn’t work that way
Correct, we would need to change the way we write protocols.
I have thought about a two-stage protocol: the "standard" adjacency mechanism we have today for the initial setup, then a separate process that monitors "streams" and kicks off a convergence process based on user activity.
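As a sketch of the two stages, something like the state machine below. Again, all names and states here are hypothetical; this is just the shape of the idea, not a protocol design:

```python
from enum import Enum

class PeerState(Enum):
    DOWN = 0
    ADJACENT = 1      # stage one complete: normal adjacency established
    MONITORING = 2    # stage two: watching user streams through this peer

class TwoStagePeer:
    """Stage one is the familiar adjacency handshake; stage two watches
    flow activity and can trigger convergence without waiting for an
    UPDATE from the peer."""

    def __init__(self, name):
        self.name = name
        self.state = PeerState.DOWN
        self.converge_requested = False

    def adjacency_up(self):
        # Stage one: the standard adjacency mechanism we have today.
        self.state = PeerState.ADJACENT
        self.start_monitoring()

    def start_monitoring(self):
        # Stage two: hand off to a separate stream-monitoring process.
        self.state = PeerState.MONITORING

    def streams_quiet(self):
        # Loss of user activity kicks off convergence directly,
        # rather than waiting for a routing update from the peer.
        if self.state is PeerState.MONITORING:
            self.converge_requested = True
```

The key design choice is that the stream monitor never touches the adjacency itself; it only requests convergence, so a false alarm can be rate-limited or vetoed without flapping the session.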
- What about security?
These problems could be addressed in the same way we are looking into them now.
- The router needs to inspect every packet?
Sure, much like NBAR/Netflow does now. Routers and x86 platforms have the processing power; why not use it?
There has been a large amount of change in the networking industry recently, and although this idea may well be a fool’s errand, looking at alternatives is something I’m told all engineers should try at least once.
So here is my attempt. Please be gentle.