Cumulus Networks is rolling out an agent called NetQ, which streams network telemetry in real time to provide system-wide visibility into network state and make it easier for network operators to validate configuration changes and troubleshoot problems.
The NetQ agent, which is now available on Cumulus Linux 3.3 and higher, runs on Cumulus switches to collect detailed operational information and stream it to a server running a Redis database that collects and aggregates the data.
Rather than log in to switches individually to get configuration status, operators can get a complete picture of network state using NetQ. They can also query the telemetry server to ensure that configuration changes were actually made, and that network links are up.
The agent gathers information such as the current switch configuration, protocols in use, link state, and general system information such as licenses and the status of hardware components.
The agent can also be deployed on servers that run Ubuntu or Red Hat Enterprise Linux (either bare metal or within a VM) to collect host-based network information.
Cumulus is positioning NetQ for IT shops that use tools such as Ansible, Puppet, and Chef to automate configuration changes across a large number of devices.
If an operator pushes a change across 100 switches, instead of manually sampling a few to see if the change was made, “We can go out and say ‘Yes, all these changes took, or these three systems didn’t get updated,’” said Cumulus CEO Josh Leslie in an interview.
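To make the idea concrete, here is a minimal Python sketch of that kind of fleet-wide check. It is not NetQ's interface: the function name, the hostnames, and the MTU values are all hypothetical, and the snapshot dictionary simply stands in for whatever the telemetry server reports.

```python
from typing import Dict, List, Tuple


def find_unapplied(expected: str, reported: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split hosts into those reporting the expected value and those that are not."""
    applied = sorted(h for h, v in reported.items() if v == expected)
    missed = sorted(h for h, v in reported.items() if v != expected)
    return applied, missed


# Hypothetical telemetry snapshot: hostname -> reported MTU setting.
reported = {f"leaf{i:02d}": "9216" for i in range(1, 101)}
for host in ("leaf07", "leaf42", "leaf88"):
    reported[host] = "1500"  # these three did not take the change

applied, missed = find_unapplied("9216", reported)
print(f"{len(applied)} switches updated; not updated: {missed}")
```

The point of the sketch is that every device is checked against the intended value, rather than a hand-picked sample.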
The system can also make incremental checks, on a pod-by-pod basis, to ensure there are no errors in a configuration update. If a problem is detected, the change can be rolled back in the affected pods and then corrected before the change continues.
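The pod-by-pod workflow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Cumulus code: `staged_rollout` and the three callbacks are hypothetical names, with the callbacks standing in for an automation tool that makes the change and a telemetry query that verifies it.

```python
from typing import Callable, Dict, List


def staged_rollout(
    pods: Dict[str, List[str]],
    apply_change: Callable[[str], None],
    verify: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> List[str]:
    """Apply a change one pod at a time; roll back and stop at the first pod that fails."""
    completed = []
    for pod, switches in pods.items():
        for sw in switches:
            apply_change(sw)
        if not all(verify(sw) for sw in switches):
            # Roll back only the affected pod and halt before touching later pods.
            for sw in switches:
                rollback(sw)
            break
        completed.append(pod)
    return completed


# Hypothetical in-memory stand-ins for the automation tool and the telemetry check.
state: Dict[str, str] = {}
broken = {"leaf21"}  # a switch that silently fails to take the change


def apply_change(sw: str) -> None:
    state[sw] = "old" if sw in broken else "new"


def verify(sw: str) -> bool:
    return state[sw] == "new"


def rollback(sw: str) -> None:
    state[sw] = "old"


pods = {
    "pod1": ["leaf11", "leaf12"],
    "pod2": ["leaf21", "leaf22"],
    "pod3": ["leaf31", "leaf32"],
}
completed = staged_rollout(pods, apply_change, verify, rollback)
print(completed)
```

In this run only the first pod completes: the second pod is rolled back as a unit, and the third is never touched, which limits the blast radius of a bad change.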
The company hopes this capability will encourage organizations to take greater advantage of automation by helping to address the fear of an outage caused by a configuration error that is pushed across the whole network.
And because NetQ maintains a full picture of network state, operators can also use it to help troubleshoot network problems by replaying network events. NetQ can also integrate with third-party tools such as PagerDuty to notify operators in real time if performance or connectivity problems occur.
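One way to picture event replay is as reconstructing state at a chosen moment from a timestamped log. The Python sketch below assumes a simple, made-up event format (timestamp, link name, status); it does not reflect how NetQ actually stores or replays events.

```python
from typing import Dict, List, Tuple

# Hypothetical event record: (timestamp in seconds, link name, "up" or "down").
Event = Tuple[int, str, str]


def state_at(events: List[Event], when: int) -> Dict[str, str]:
    """Replay events in time order and return each link's status as of `when`."""
    state: Dict[str, str] = {}
    for ts, link, status in sorted(events):
        if ts > when:
            break
        state[link] = status
    return state


events = [
    (100, "leaf01:swp1", "up"),
    (100, "leaf02:swp1", "up"),
    (250, "leaf01:swp1", "down"),
    (400, "leaf01:swp1", "up"),
]
print(state_at(events, 300))
```

Querying the log at time 300 shows the first link down and the second up, even though both are up by the end of the log. That is the kind of after-the-fact view that helps when a problem has already cleared.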
Note that NetQ itself isn’t used to configure switches running Cumulus Linux. It works in tandem with automation tools such as Ansible that make the actual changes. And at present, NetQ is CLI-based. For data visualization, Cumulus says NetQ integrates with visualization tools such as Splunk and Grafana.
NetQ is a software product, so customers will bring their own hardware to run the telemetry server. Hardware size and configuration will depend on the amount of information being streamed from NetQ agents, the number of switches and servers for which data is being collected, and the length of time organizations keep the data.
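Those three factors multiply together, so a rough back-of-the-envelope estimate is easy to sketch in Python. The figures below (200 nodes, 500 bytes per second per node, 30 days) are invented for illustration and are not numbers from Cumulus; the calculation also ignores compression, indexing overhead, and aggregation.

```python
def estimate_storage_gb(nodes: int, bytes_per_node_per_sec: float, retention_days: int) -> float:
    """Rough raw-storage estimate: nodes x per-node stream rate x retention period."""
    retention_seconds = retention_days * 24 * 60 * 60
    return nodes * bytes_per_node_per_sec * retention_seconds / 1e9


# Hypothetical inputs, for illustration only.
size_gb = estimate_storage_gb(nodes=200, bytes_per_node_per_sec=500, retention_days=30)
print(f"Approximate raw telemetry volume: {size_gb:.1f} GB")
```

Doubling any one input doubles the estimate, which is why retention policy matters as much as fleet size when sizing the server.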
NetQ is offered as a Debian package that will run on Cumulus Linux 3.3 and higher, as well as on Ubuntu and RHEL. It’s available as a perpetual license, but Cumulus did not provide additional pricing details.
Cumulus CEO Leslie noted that the company may extend NetQ to other server platforms, as well as third-party networking and storage systems—as long as they have a Linux OS under the covers.
“In the modern data center we believe an increasing percentage of devices are Linux, so that’s a natural place for us to go,” he said. “We’ll cover storage, servers, and networking as the big three.”
A core value proposition of Cumulus Linux is the operational benefit of scalability and automation that comes from running your switch infrastructure with the same tools and methods you use to run your Linux servers.
NetQ helps extend that value proposition by beginning to provide a global view of network state and a more centralized mechanism for validating changes and updates.
It also helps Cumulus compete better against companies such as Big Switch and Arista, both of which have made a priority of telemetry and visibility for network operators.
And last but certainly not least, NetQ helps make Cumulus stickier in customer environments. This is extremely useful for a disaggregated NOS that can be removed from the switch if the customer wants something different.