A single change or link flap can cause a traffic drop of an hour or more. It sounds weird, right? But it is true.
This article shows BGP Path Hunting (Path Exploration) behavior, explains BGP route flap dampening and its variants, and covers how a single interface flap can cause very long downtime for connections between Autonomous Systems.
I assume readers have basic BGP knowledge, such as path vector/distance vector routing, IBGP and EBGP neighbors, and some common BGP abbreviations.
Before explaining how an hour or more of downtime is possible with just one update or link/route flap, let me explain some concepts; then we can combine everything together.
- BGP Route Flap Dampening, based on the RIPE definition
“Each time a prefix is withdrawn, the router will increment the damping penalty by a fixed amount.
When the number of withdrawals/announcements (=flap) is exceeded in a given time frame (cutoff threshold) the path is no longer used and not advertised to any BGP neighbor for a predetermined period starting from when the prefix stops flapping.
Any more flaps happening after the prefix enters suppressed state will attract additional penalty. Once the prefix stops flapping, the penalty is decremented over time using a half-life parameter until the penalty is below a reuse threshold. Once below this reuse threshold the suppressed path is then re-used and re-advertised to BGP neighbors.”
Different implementations can use different default timers and even different names for the parameters. The Cisco implementation names these parameters half-life, reuse, suppress and max-suppress-time, with default values of 15 minutes, 750, 2000 and 60 minutes respectively.
BGP dampens paths, not prefixes. Consider two routers connected with two links and running BGP with each other: if one of the paths is unstable and causes instability in the BGP domain, only that path is dampened. IP event dampening is a similar idea designed for use with the IGPs, and it is implemented on almost every platform. Although BGP Route Flap Dampening (RFD) is good for the stability of the overall core BGP domain, it can do more harm than good because of BGP's path vector behavior.
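To make these parameters concrete, here is a minimal sketch of the decay arithmetic using the Cisco defaults quoted above. The per-flap penalty of 1000 is an assumption that is not stated in this article, and the function names are only illustrative; treat this as a rough model, not as how any router actually implements it.

```python
import math

# Cisco defaults quoted in the text.
HALF_LIFE_MIN = 15.0
REUSE = 750.0
SUPPRESS = 2000.0
MAX_SUPPRESS_MIN = 60.0
# Assumption (not from the text): penalty added for each flap.
PENALTY_PER_FLAP = 1000.0


def decayed(penalty: float, minutes: float) -> float:
    """Penalty remaining after `minutes` without any further flap."""
    return penalty * 0.5 ** (minutes / HALF_LIFE_MIN)


def minutes_until_reuse(penalty: float) -> float:
    """Time needed for `penalty` to decay down to the reuse threshold."""
    if penalty <= REUSE:
        return 0.0
    return HALF_LIFE_MIN * math.log2(penalty / REUSE)


# Largest penalty that still decays to REUSE within MAX_SUPPRESS_MIN.
PENALTY_CEILING = REUSE * 2 ** (MAX_SUPPRESS_MIN / HALF_LIFE_MIN)

if __name__ == "__main__":
    print(f"penalty ceiling: {PENALTY_CEILING:.0f}")
    for flaps in (3, 4, 12):
        penalty = min(flaps * PENALTY_PER_FLAP, PENALTY_CEILING)
        wait = minutes_until_reuse(penalty)
        print(f"{flaps} back-to-back flaps: reusable after about {wait:.1f} min")
```

Under these assumptions, the 60-minute max-suppress-time works as a cap on the penalty: with a 15-minute half-life, a penalty can halve four times in 60 minutes, so anything above 16 times the reuse threshold would keep a path suppressed longer than the limit allows.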
- BGP Path Hunting/Exploration
To understand BGP Path Hunting behavior, let's use Figure-1.
As you can see, when the AS1-AS2 link fails, AS2 sends a withdrawal to AS5 and AS3. AS5 receives the withdrawal, runs the best path algorithm and sends an announcement to its upstream peer (not shown).
The result of the first run is that AS5 finds a path through AS3. AS5 holds three paths in its BGP table, but it installs only one of them in its routing table and advertises only that path to its other BGP neighbors, based on the BGP best path algorithm.
In this example, until AS5 receives a withdrawal from every neighbor, it tries all the possible paths to the network behind AS1, and only then does it finally converge. This may take a long time because of processing, queuing and propagation delays, the Minimum Route Advertisement Interval (MRAI), and so on. This is how BGP behaves as a path vector protocol, and it is why all paths are explored.
Let's make the example concrete: here AS1 can be an enterprise that is connected to its upstream provider, which in turn is connected to two peers or other upstream providers. Although this example is explained from AS5's point of view, the same path hunting behavior can happen at every AS in the topology. For example, if the AS2-AS3 link fails, AS3 will explore the paths 5 2 1 and 4 5 2 1 and then converge.
What if AS5 in this topology implements aggressive BGP route flap dampening parameters?
BGP cannot differentiate these path hunting withdrawals and announcements from real flaps, so these legitimate updates can be dampened if the parameters are too aggressive, which can cause very long downtime.
In this particular topology there is only one link between AS5 and each of the other ASes. In real life we will most probably have more than one link for redundancy and/or load balancing, maybe even at multiple points. In that case, withdrawals and announcements are sent over each link between the ASes and, depending on the propagation delay of the links, a single flap between AS1 and AS2 is multiplied by the routers. The penalty for that one flap can then grow until the suppression time reaches the max-suppress-time limit.
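The path hunting sequence described above can be sketched in a few lines. The three AS paths below are an assumed reconstruction of what AS5 holds in Figure-1, and the best path choice is reduced to "shortest AS path wins", which ignores every other BGP tie-breaker. The names `adj_rib_in`, `best_path` and `hunt` are hypothetical.

```python
# Assumed view from AS5: one path per neighbor toward the prefix behind AS1.
adj_rib_in = {
    "AS2": ["2", "1"],
    "AS3": ["3", "2", "1"],
    "AS4": ["4", "3", "2", "1"],
}


def best_path(rib):
    """Pick the shortest AS path (the only decision step modelled here)."""
    if not rib:
        return None
    return min(rib.values(), key=len)


def hunt(rib, withdrawal_order):
    """Apply withdrawals one by one and record every best-path change."""
    rib = dict(rib)  # work on a copy so the caller's table is untouched
    history = [best_path(rib)]
    for neighbor in withdrawal_order:
        rib.pop(neighbor, None)
        new_best = best_path(rib)
        if new_best != history[-1]:
            history.append(new_best)
    return history


if __name__ == "__main__":
    # The AS1-AS2 link fails; withdrawals reach AS5 from AS2, AS3, then AS4.
    for step in hunt(adj_rib_in, ["AS2", "AS3", "AS4"]):
        print(" ".join(step) if step else "unreachable (withdrawal sent upstream)")
```

Every entry in the returned history after the first is an update that AS5 sends to its own neighbors, so in this sketch one link failure turns into three updates as seen from AS5's upstream.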
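A rough way to see this multiplication effect is to feed a burst of update arrival times into a simple penalty model. The sketch below uses the Cisco default thresholds mentioned earlier and assumes that every update costs the same penalty of 1000; that value and the function name `suppression_minutes` are assumptions for illustration, and real implementations may weight different update types differently.

```python
import math

HALF_LIFE_MIN = 15.0
REUSE, SUPPRESS = 750.0, 2000.0
MAX_SUPPRESS_MIN = 60.0
# Assumption: every withdrawal or re-announcement adds the same penalty.
PENALTY_PER_UPDATE = 1000.0
# Penalty cap implied by the max-suppress-time.
CEILING = REUSE * 2 ** (MAX_SUPPRESS_MIN / HALF_LIFE_MIN)


def suppression_minutes(update_times_min):
    """Minutes the path stays suppressed after the last update arrives.

    update_times_min: arrival times (in minutes) of updates for one path.
    """
    penalty, last_t, suppressed = 0.0, None, False
    for t in sorted(update_times_min):
        if last_t is not None:
            # Exponential decay since the previous update.
            penalty *= 0.5 ** ((t - last_t) / HALF_LIFE_MIN)
            if suppressed and penalty < REUSE:
                suppressed = False
        penalty = min(penalty + PENALTY_PER_UPDATE, CEILING)
        if penalty > SUPPRESS:
            suppressed = True
        last_t = t
    if not suppressed:
        return 0.0
    return HALF_LIFE_MIN * math.log2(penalty / REUSE)


if __name__ == "__main__":
    print("one update:", suppression_minutes([0.0]))
    print("same flap seen as three updates:", suppression_minutes([0.0, 0.5, 1.0]))
    print("same flap seen as twenty updates:", suppression_minutes([0.0] * 20))
```

In this model a single update never crosses the suppress threshold, while the same event seen three times within a minute does, and a large enough burst runs into the 60-minute ceiling.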
By default, Cisco applies a 60-minute max-suppress-time to every prefix regardless of the prefix length. This is called the flat & gentle approach.
Longer and shorter prefixes are penalized the same way. Another approach, the progressive approach, penalizes route flap events based on prefix length: while a /24 can be suppressed for a maximum of 60 minutes, the maximum for a /19 might be 45 minutes.
Short prefixes such as a /8 should not be dampened as aggressively as a /24, and some networks should not be dampened at all. A good example is the networks that host the root DNS and Top Level Domain servers. In BGP terminology these are generally called “Golden Networks”.
A Provider Aggregatable (PA) address is given to the company by its provider. If the company receives its address space directly from an Internet registry instead, the space is called Provider Independent (PI). PA space is good for overall Internet stability, since providers aggregate these addresses and we see fewer routes in the default-free zone (you can think of it as the full Internet table).
If the prefixes are PA, the company's upstream provider does not need to dampen them, since they are already aggregated and a flap cannot affect overall Internet stability. This is similar to a flooding domain boundary hiding the reachability information of unstable prefixes in multi-area OSPF and multi-level IS-IS domains.
There are at least two implemented workarounds for this behavior. One is the flat/gentle approach with less aggressive timers. However, you then depend on the interconnections between the ASes: you cannot know how many paths exist between all the ASes, and real flaps can still cause problems for the overall Internet.
For example, AS2 in this topology can implement less aggressive parameters while AS5 implements more aggressive ones. Then 10 flaps may not be dampened by AS2 but will be dampened at AS5. Troubleshooting this problem can be hard for AS1, since they will see that the connection is up and that they are sending the prefixes. Coordinating the route flap dampening parameters between the ASes is therefore also important.
A better approach to this problem may be selective route flap dampening, which is a progressive approach. Yet which parameters are the best, or make the most sense?
RIPE published some best practices for Route Flap Dampening with the progressive approach:
* don’t start damping until the 4th flap
* /24 and longer prefixes: max=min outage 60 minutes
* /22 and /23 prefixes: max outage 45 minutes; min outage of 30 minutes
* all other prefix lengths: max outage 30 minutes; min outage 10 minutes
If a specific damping implementation does not allow configuration of prefix-dependent parameters the least aggressive set should be used:
* Don’t start damping before the 4th flap in a row
* Max outage 30 minutes; min outage 10 minutes
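As a small illustration, the prefix-length tiers listed above can be written as a lookup function. The function name `ripe_outage_window` is hypothetical, the sketch only handles IPv4 prefixes, and it only encodes the outage windows from the list, not the rule about the 4th flap.

```python
import ipaddress
from typing import Tuple


def ripe_outage_window(prefix: str) -> Tuple[int, int]:
    """Return (min_outage_minutes, max_outage_minutes) for an IPv4 prefix,
    following the progressive tiers listed above."""
    length = ipaddress.ip_network(prefix, strict=False).prefixlen
    if length >= 24:
        # /24 and longer: max = min outage of 60 minutes
        return (60, 60)
    if length >= 22:
        # /22 and /23: min 30 minutes, max 45 minutes
        return (30, 45)
    # all other prefix lengths: min 10 minutes, max 30 minutes
    return (10, 30)


if __name__ == "__main__":
    for p in ("192.0.2.0/24", "198.51.100.0/23", "10.0.0.0/8"):
        print(p, ripe_outage_window(p))
```

An implementation that cannot vary its parameters by prefix length would, per the fallback rule above, use the (10, 30) window for every prefix.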
Conclusion: BGP is a path vector protocol, and because of how it works it tries all the path advertisements received from all of its neighbors. This is called BGP path hunting or path exploration. Route flap dampening was introduced to protect the global Internet core from instabilities and can only be enabled between EBGP neighbors. The effect of BGP path hunting on route flap dampening can be huge: a single route/link flap or policy update can cause prefixes to be dampened for a long time.