There are design tools that we should consider for every design. LAN, WAN and data center designs are all places where these common design tools and attributes should be considered. Many of the principles in this article series apply not only to the network part of the design; compute, virtualization and storage technologies can also be evaluated with them.
Of course I will not cover all of the tools in this article, since it would be too long. In this post I will try to explain ‘reliability’ and the components of a reliable design, and I will also start to define ‘resiliency’ as a common design tool.
Reliability is delivering the legitimate packets from source to destination within a reasonable amount of time, where what is reasonable depends on the application type and architecture. This time is known as delay or latency, and it is the first of the packet delivery parameters. Variation in delay is known as jitter, and it is very important for some types of applications such as voice and video; jitter is our second delivery parameter. The third packet delivery parameter is packet loss or drop; voice and video traffic in particular is more sensitive to packet loss than data traffic. Check this blog post also to see the effect of packet loss on video traffic.
Tolerance for packet loss is application dependent, and some applications are very sensitive to drops. Generally accepted best practices for delay, jitter and packet loss ratio have been defined, and knowing and considering them is important from the network design point of view. For example, for voice packets the one-way delay should be less than 150 ms. This is also known as mouth-to-ear delay.
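To make the three delivery parameters concrete, here is a minimal Python sketch that computes them from a list of measured one-way delays. The function names, the sample values and the choice to measure jitter as the average difference between consecutive delays are my assumptions for illustration; real tools may calculate jitter differently. Only the 150 ms voice budget comes from the text above.

```python
from statistics import mean


def delivery_stats(delays_ms, sent):
    """Return (average delay, jitter, loss ratio) for one measurement run.

    delays_ms: one-way delay in milliseconds of each packet that arrived,
               in arrival order (hypothetical measurement data).
    sent:      number of packets that were sent.
    """
    received = len(delays_ms)
    loss_ratio = (sent - received) / sent
    avg_delay = mean(delays_ms)
    # Assumption: jitter is taken as the mean absolute difference
    # between the delays of consecutive packets.
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    jitter = mean(diffs) if diffs else 0.0
    return avg_delay, jitter, loss_ratio


def meets_voice_delay_budget(delays_ms, budget_ms=150.0):
    """True if every packet stayed under the one-way delay budget."""
    return max(delays_ms) < budget_ms


# Hypothetical run: 5 packets sent, 4 arrived.
samples = [40.0, 42.0, 39.0, 45.0]
avg_delay, jitter, loss_ratio = delivery_stats(samples, sent=5)
voice_ok = meets_voice_delay_budget(samples)
print(avg_delay, jitter, loss_ratio, voice_ok)
```

In this made-up run the delay is well inside the voice budget, but one packet out of five was lost, which would be far too much for voice or video. This is why all three parameters have to be checked together.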
Reliability should not be considered only at the link level. Network links and devices such as switches, routers, firewalls, application delivery controllers, servers, storage systems and others should be reliable; the components of these devices also need to be reliable. For example, if you carry voice traffic over unreliable serial links, you will likely encounter packet drops because of link flaps.
But whichever device, link or component you choose, eventually it will fail. Vendors share their MTBF (mean time between failures) numbers. Even if you choose the most reliable devices, links, components, protocols and architecture, you still need to plan for unavoidable failure. This brings us to resiliency. Resiliency is how the network behaves once a failure happens. Is it highly available? Will it converge, and when?
Resiliency can be considered as a combination of high availability and convergence. There are at least three subcomponents of resiliency: redundancy, fast convergence and fast reroute.
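To put rough numbers on MTBF and on redundancy, the sketch below uses the common steady-state formula availability = MTBF / (MTBF + MTTR). MTTR (mean time to repair) is not discussed in this article; I introduce it here as an assumption, and all of the figures are hypothetical. The redundancy formula also assumes that the two devices fail independently, which is not always true in real networks.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a single device or link.

    mtbf_hours: mean time between failures (for example from the vendor).
    mttr_hours: mean time to repair (assumption, not from the article).
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


def parallel_availability(single, copies):
    """Availability of `copies` redundant units when one working unit is enough.

    Assumes the units fail independently of each other.
    """
    return 1.0 - (1.0 - single) ** copies


# Hypothetical device: fails on average every 50,000 hours, 4 hours to repair.
single = availability(mtbf_hours=50_000, mttr_hours=4)
pair = parallel_availability(single, copies=2)
print(f"single device: {single:.6f}")
print(f"redundant pair: {pair:.9f}")
```

Redundancy raises the availability figure, but it says nothing about how long the network takes to start using the surviving device after a failure. That part is covered by fast convergence and fast reroute.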
I want to give an analogy. Assume your manager promises to promote you after some achievement, maybe a project. If he or she is reliable, you will be promoted after that project, but now the question is when you will be promoted. In 1 year? 2 years? This can be thought of as latency. If you are not promoted within a reasonable amount of time, packet loss will start and you will probably consider leaving.
Assume you decide to leave; you need to find another job to survive. If you consider not being promoted a failure and you decide to take action, this is resilience. Fast convergence and fast reroute, which will be discussed in the next article, can be explained by how long it takes you to find another job.
But let me give a very easy and brief example of them here. If you start to look some time after the project is completed, this can be considered fast convergence. If you start to look, and maybe even reach an agreement with the new place, before the project is complete, this is fast reroute, since you knew that the failure would happen, took early action, and your life didn’t change much. We will look at fast convergence and fast reroute in detail in the next article, and the analogy will be much clearer then.
I hope this has been informative for you and I would like to thank you for reading.