Men have become the tools of their tools.
– Henry David Thoreau
This will be a three-part series. This first article introduces the subject, the general techniques involved, and general vendor information. The second and third installments will cover real-world examples and advanced techniques.
There is a lot of data flowing over our networks out there. Networks are getting larger and faster, and more depends on them every year. More institutions – whether enterprise, service provider, education, or government – are expanding their networks in every direction. Network vendors are brandishing faster speeds (10G, 40G, 100G), multi-path/fabric technologies at L2 (MLAG, vPC, TRILL, etc.), and more overlay/encapsulation techniques (VXLAN, whatever-they-can-think-of-oMPLS, OTV, etc.). Fancy services gear is becoming ever more popular in network topologies: WAN optimizers, load balancers/accelerators, and of course firewalls/IDS/IPS. We as network professionals are increasingly called upon to troubleshoot both basic network flow and application performance (yet another hat!). That means good, old-fashioned packet capture and analysis, but at a much larger scale.
For basic troubleshooting, gone are the days of a network administrator walking up to the device with a laptop loaded with the latest Wireshark and performing a capture to see what the heck is going on when user A complains that server B “is slow.” There are countless vendors out there with application performance analysis, threat prevention/detection services, and so on. We need every packet collected together – in some cases from multiple paths to a single location. We also sometimes need to filter that data to sift out only what we are looking for. We need a packet capture network!
Enter the Visibility Fabric
There are plenty of clever names for it: visibility fabric, SPAN network, tap network, packet capture network, etc. In the end, it’s a network topology designed exclusively for transporting raw capture traffic to the locations where various examinations can be done. A few examples of real work someone might need:
- A customer is complaining about a poorly responding application with the typical “the app is not working.” This is an HTTP app that is load-balanced through multiple edge switches, but isolating each server doesn’t seem to find the issue. You’ve brought in a vendor to do analysis through an appliance, but the appliance has only one connection. You need to filter out only the HTTP traffic going to these servers and deliver it over a single link to that appliance.
- The security group has come to the networking team asking for raw data feeds from the 18 supplier connections an enterprise has. They don’t want to deploy 18 IDS devices, as that will blow out the budget. Plus, the vendor they selected for IDS can only do 1G connections, and many of these supplier connections are gigabit as well. Much to our dismay, many of the routers supporting these connections are spread out across a 50,000 sq.ft. datacenter due to legacy buildup over 10 years of doing business.
- The public internet router is a Cisco device with a 10G connection to the internet. You have four devices that need the same traffic from the PI ingress/egress: an IDS, a security threat analysis engine, a long-term packet capture system, and a route optimization appliance. The device only has two local SPAN sessions, and we can’t really use ERSPAN because the link sometimes runs at 50% capacity.
- A trading floor app prone to microbursts is experiencing out-of-order delivery across some client endpoints. We need a time-synced capture from the various hops through the network, and we don’t have administrative access to several machines at those hops.
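To put the SPAN limitation from the internet-router scenario above in concrete terms, here is roughly what exhausting those two local SPAN sessions looks like on a Catalyst-style IOS device (all interface numbers are hypothetical):

```
! Hypothetical interfaces: Te1/1 is the internet uplink.
! Session 1 copies the uplink to the IDS.
monitor session 1 source interface TenGigabitEthernet1/1 both
monitor session 1 destination interface TenGigabitEthernet1/2
! Session 2 copies the same uplink to the long-term capture system.
monitor session 2 source interface TenGigabitEthernet1/1 both
monitor session 2 destination interface TenGigabitEthernet1/3
! With both local sessions consumed, the threat-analysis engine and the
! route optimizer can't be fed without something else in the middle.
```

With a visibility fabric, one SPAN session (or a passive tap) feeds the fabric, and the fabric replicates that single copy out to all four devices.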
We can think of dozens more scenarios where using an inline hub or being limited to two port-mirror sessions just won’t cut it. Even small businesses with limited footprints are growing beyond these basics. We need devices that can aggregate this data together and that have enough density and filtering capability to do it at line rate.
As a network engineer or architect, I strongly recommend looking at installing a network of this type. Whether you have a greenfield, evergreen, or brownfield/retrofit environment (yes, I speak marketecture), it’s increasingly becoming a requirement to have this view into your network. With the expansion of the features network gear provides, as well as the wider push towards cloud architectures, not having these types of devices in place makes it almost impossible to get that necessary view. There are of course other ways to get at some of this analysis via flow export (NetFlow/sFlow). That may work for many people, and can usually be had with less overall investment. But nothing beats raw data capture for troubleshooting and deep packet metrics.
Fortunately, there are several vendors in this space that provide solutions to these problems. Each of the major players has a general feature set that provides the basics: one-to-many/many-to-one packet distribution, pre/post filters, and CLI/GUI management.
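Before we get to vendor specifics, the one-to-many/many-to-one distribution and pre-filtering ideas can be made concrete with a toy model. This is purely illustrative Python – not any vendor’s API – with all names invented for the sketch:

```python
# Toy model of the packet-distribution primitives a visibility fabric
# provides -- not any vendor's API, just the concepts.

def http_only(pkt):
    """Pre-filter: pass only TCP traffic to port 80 (our stand-in for HTTP)."""
    return pkt.get("proto") == "tcp" and pkt.get("dport") == 80

class Fabric:
    def __init__(self):
        # Each connection: (input ports, output ports, optional pre-filter)
        self.connections = []

    def connect(self, network_ports, tool_ports, pre_filter=None):
        """Define a many-to-one, one-to-many, or many-to-many distribution."""
        self.connections.append((set(network_ports), list(tool_ports), pre_filter))

    def ingress(self, port, pkt):
        """A packet arrives on an input port; copy it to every matching output port."""
        delivered = []
        for nets, tools, flt in self.connections:
            if port in nets and (flt is None or flt(pkt)):
                delivered.extend(tools)
        return delivered

fabric = Fabric()
# Many-to-one: three edge-switch feeds aggregated, HTTP only, to one appliance.
fabric.connect(["net1", "net2", "net3"], ["tool1"], pre_filter=http_only)
# One-to-many, unfiltered: the internet feed copied to IDS and capture system.
fabric.connect(["net4"], ["tool2", "tool3"])

print(fabric.ingress("net2", {"proto": "tcp", "dport": 80}))  # ['tool1']
print(fabric.ingress("net2", {"proto": "udp", "dport": 53}))  # []
print(fabric.ingress("net4", {"proto": "udp", "dport": 53}))  # ['tool2', 'tool3']
```

The real devices do exactly this in hardware at line rate, which is what makes them more than a glorified hub.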
The four large vendors in this space are Gigamon, APCON, Ixia Anue, and Net Optics Inc. There may be others, but these are the dominant players. Much of the differentiation these vendors offer lies in port speed (some have 40G interfaces now), port density (some do 256+ 10G ports in a single chassis), footprint (single device or modular slot chassis), stacking (interconnecting chassis together), and the management features via CLI/GUI. I’m not going to review the specifics of each vendor simply because each one has plenty of brochure-ware on their sites to read, and all are responsive to phone/email queries.
However, we will focus on one of them for the remainder of this article series, as I have had the opportunity to use their gear at a customer site.
Disclaimer: I am not being paid by any vendor, nor am I under any obligation to produce a favorable result from the testing. However, I did consult with their management team to ensure the content of these articles did not violate any agreements covering the use of their products.
One of the largest and most seasoned vendors in this space is Gigamon. If you’ve been to a networking show (Cisco Live, NFD, etc.), you have most likely noticed their presence, with the large booths and the screaming Halloween-orange color of their devices. Whenever I see one, I want to break out a black Sharpie and fill in three black triangles and a jagged mouth.
They have a range of products all the way from the single 1U chassis GigaVUE-212…
…to the 14U, 8 slot modular HD8 chassis.
Nearly all of their products have similar functions, but the standout is the GigaVUE-2404, which features the GigaSmart module performing a vast array of operations on packets, such as slicing, masking, and other advanced features (more on that in part 2 of the series).
The Basics of Visibility Fabrics using Gigamon
As discussed earlier, all of the vendors provide a basic set of features. Gigamon prides itself on offering this capability across its entire line, with the main variations being density and some oversubscription on the HD4/HD8 (depending on what you are doing). Here are the specific terms and techniques Gigamon uses:
- Packet Distribution. This is the main feature of these devices. You have many-to-one (combining multiple input ports to a single output port), one-to-many (one input port to many output ports), pre-filtering (L2-L4 packet filtering on input ports), post-filtering (L2-L4 packet filtering on output ports), and pass-all (a one-to-many or many-to-one connection that bypasses any filters).
- Network Port. This is the input port. It can only receive data.
- Tool Port. This is the output port. It can only send data.
- Connections. Associating ports many-to-one (combining multiple Network ports to a single Tool port) or one-to-many (a single Network port to multiple Tool ports). You can’t really say “mapping” here, as that implies another technique.
- Filter. A basic L2-L4 ACL applied to a Network port or Tool port. Examples include source/destination IP/MAC, L4 port information, ToS, etc. Pretty much any portion of the header for a packet, IPv4/IPv6, and the general range of protocols (ICMP, TCP, UDP, etc) are covered.
- Maps. A more advanced technique that combines the filters and the connections into a single operation that can be reused. This is important in the Gigamon world because you can have over 1,000 maps in a standard system, but only 100 filters available to assign directly to ports.
- Pass-all. A technique where you can have a connection defined of Network and Tool ports, but bypass any maps or filters associated on those individual ports. This way, you can ensure a basic connection is not disturbed by someone trying to put a pre or post filter on a port. For example, you have a single network port feeding multiple tool ports, but someone wants to pre-filter (at the Network port) for HTTP. Having a pass-all set up will ensure that the filter will not apply to the pass-all connection, while the other connection/map will work with the filter in place.
- Stacking. You can string multiple chassis together to make a massive distributed, interconnected system. The reason stacking gets a separate definition (isn’t it really just a Network/Tool port connection anyway?) is that the CLI/GUI management understands the difference and allows global management from a single chassis. If you have four chassis, you only need to interact with the CLI/GUI on one of them to manage all the ports in the system.
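As a rough sketch of how a few of these pieces fit together, here is approximately what declaring Network/Tool ports and a rule-based map looks like in GigaVUE-style CLI. Treat this as illustrative only: the port IDs are hypothetical, and the exact command syntax varies between product lines and software releases.

```
# Declare port roles: one input, one output (port IDs hypothetical).
port 1/1/x1 type network
port 1/1/x8 type tool
# A map ties the connection and an L4 filter into one reusable operation:
# pass only TCP traffic destined to port 80 from the Network port to the Tool port.
map alias http-to-probe
  from 1/1/x1
  to 1/1/x8
  rule add pass protocol tcp portdst 80
exit
```

A pass-all connection on the same Network port would be defined separately and, as described above, would deliver the full feed regardless of this rule.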
Again, most of the vendors provide these capabilities; they just have different names for them.
Coming up next
In the next article, we’ll focus on a real-world deployment of both the visibility fabric and the devices that receive the data. We’ll explore a three-building campus deployment topology with multiple WAN feeds aggregating back to a central location. We’ll explore the G and H series devices, including a GigaSmart module. We’ll see the CLI and GUI management configuration and walk through multiple connection scenarios. We’ll also look briefly at the capture devices being used on the Tool ports.