This column is available in a weekly newsletter called IT Best Practices.
You've heard the saying "Necessity is the mother of invention." Stefano Gridelli took that saying to heart and helped create exactly what he needed when he couldn’t get what he wanted out of existing network monitoring tools.
When Gridelli was a network engineer for one of the largest healthcare organizations in western Pennsylvania, he needed to understand network and application performance in remote locations such as hospitals, doctors' offices, pharmacies and labs. The existing tools could tell him if the switches and routers were running properly, but they couldn't tell him much about problems end users were encountering, such as application outages or slow Internet response time. When trouble tickets came in, support teams lost valuable time trying to diagnose the root causes.
Gridelli figured he couldn't be the only network engineer with this problem, so he teamed up with a couple of computer science Ph.D.'s, Panickos Neophytou and Panos Vouzis, and the three budding entrepreneurs talked to people in numerous other companies to confirm Gridelli’s suspicion. Ultimately their research led them to start NetBeez in 2013 in the Pittsburgh accelerator AlphaLab.
NetBeez is a distributed network monitoring solution that monitors the network from the end user perspective. Other monitoring tools typically stop short of understanding what a user is seeing or experiencing, especially in remote or branch locations. NetBeez fills this gap.
"Here's a common scenario," Gridelli says. "When there is a problem in the network, an end user opens a ticket with the helpdesk and tries to verbally communicate what he is experiencing to the helpdesk operator. The user might say, 'I can't access the inventory application.' The operator collects the information and then it gets escalated to, say, the network operations center that is the control center for the enterprise. But the technician assigned to the ticket doesn't know if this is a true network problem, an application problem, or even something that is specific to that user's workstation.
"The detection of remote application issues is still something that is human driven and is more reactive than proactive, where sometimes the user is no better than the existing management tools to explain what is happening," says Gridelli.
The NetBeez approach is to install a small hardware agent – a compact appliance about the size of a man's wallet – in each WAN location. These agents simulate end user activity in a very simple way by continuously testing network services and reporting back to a central server. By doing so they provide end-to-end verification of reachability and performance from the user’s perspective. They also check that the network as a system can successfully forward traffic, which in turn verifies the application delivery from the data center to the remote office location. The agent performs these tasks 24x7 to constantly test availability for the end users.
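The agent behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NetBeez's actual implementation: the function names (`tcp_reachable`, `run_checks`), the report format, and the choice of a TCP connection as the test are all invented for the example.

```python
import socket
import time
from typing import Callable, Dict


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, dict]:
    """Run each named check once and record its outcome and duration.

    A real agent would repeat this on a schedule and send the report
    to a central server; both steps are omitted from this sketch.
    """
    report = {}
    for name, check in checks.items():
        start = time.perf_counter()
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a check that raises counts as a failed test
        report[name] = {"ok": ok, "seconds": time.perf_counter() - start}
    return report
```

An agent might call `run_checks({"inventory app": lambda: tcp_reachable("inventory.example.com", 443)})` once a minute; the hostname is a placeholder.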
The illustration below shows how NetBeez can help determine if a problem is local or global, and whether it affects the network layer or the application layer.
By polling information from the remote hardware agents (i.e., the "beez"), a network engineer can determine when there is an outage and whether it is detected at one location or at several. Based on that information, troubleshooting can begin.
Gridelli likens it to being in a coffee shop and being unable to connect to the free Wi-Fi. You ask other customers if they can connect. If they can, the problem is likely with your system; if they can't, the problem is likely with the shop's Wi-Fi system. "That's exactly what it's like in a distributed multi-site environment," says Gridelli. "When there's a problem or ticket, you want to understand if it's a local versus global issue, and you want to know immediately without waiting for users to call to complain. Those are precious minutes when your business operations are being affected."
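The local-versus-global question can be expressed as a small classification over per-site test results. The sketch below is hypothetical: the function name `classify_scope` and the rule that "global" means every reporting site failed are assumptions for illustration, and a production tool would likely use finer-grained thresholds.

```python
from typing import Dict


def classify_scope(results: Dict[str, bool]) -> str:
    """Classify one test across sites as 'ok', 'local', or 'global'.

    results maps a site name to True when that site's agent passed the test.
    """
    if not results:
        raise ValueError("no agent results to classify")
    failed = [site for site, passed in results.items() if not passed]
    if not failed:
        return "ok"
    if len(failed) == len(results):
        return "global"  # every agent sees the failure
    return "local"       # only some sites are affected
```

With results from three sites where only one fails, the function returns "local", which matches the coffee shop case where other customers can still connect.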
Another important answer the beez can provide is whether the problem is with the network or the application. "That's important to know because if it's at the network level, you'll forward that ticket to the network group, while if the problem's on the application, that ticket will be sent to the application group," Gridelli explains. "As soon as possible you want to escalate the ticket to the right group that can try to resolve the issue."
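The escalation rule Gridelli describes amounts to a two-step decision. The following sketch assumes each agent reports one network-level result (for example, a reachability test) and one application-level result (for example, an HTTP request); the function name `route_ticket` and the group labels are hypothetical.

```python
def route_ticket(network_ok: bool, application_ok: bool) -> str:
    """Pick the group that should receive a ticket, given two test results."""
    if not network_ok:
        # Without network reachability, the application test cannot be trusted,
        # so the network group looks first.
        return "network group"
    if not application_ok:
        return "application group"
    return "no fault detected"
```

Checking the network result first reflects the assumption that an application test failing over a broken network says nothing about the application itself.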
The NetBeez solution consists of a central server delivered as a virtual appliance and multiple inexpensive hardware agents. The solution was built to be plug-and-play and to scale cost-effectively to hundreds of WAN locations. A "bee" can simply be shipped to a remote location and plugged into the network. Via a dashboard on the central server application, each bee can be customized in terms of what actions it performs. Gridelli describes the remote device as a Swiss Army knife that can be programmed to perform whatever network and application monitoring tasks the organization needs for a particular site.
One popular use case for NetBeez is to understand if major configuration changes have any impact on remote locations. Many companies apply their network configuration changes in the middle of the night. Then they wait until workers arrive for the regular workday to discover any problems that might have occurred as a result of a change. With NetBeez, a local bee will simulate user behavior and alert on problems in the field long before end users first log in in the morning.
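One way to picture the change-verification use case is as a comparison of test results taken before and after the maintenance window. The sketch below is an assumption about how such a comparison could work, not a description of the product; `find_regressions` and the "site/test" naming scheme are invented for the example.

```python
from typing import Dict, List


def find_regressions(before: Dict[str, bool], after: Dict[str, bool]) -> List[str]:
    """Return tests that passed before a change but fail, or are missing, after it."""
    return sorted(
        name
        for name, passed in before.items()
        if passed and not after.get(name, False)
    )
```

Tests that were already failing before the change are deliberately excluded, so the result lists only problems the change may have introduced.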
One of the original beta testers and early commercial users of NetBeez is a non-profit organization in Pittsburgh that provides network services to 75 physical locations in and around the city. Among other things, the organization provides public access computers and staff computers at these locations for Internet access and some other business applications. Ernest is a manager of network services for the organization.
"We saw a demonstration of NetBeez in its early stage and we saw a lot of value in the product," says Ernest. "We jumped on the opportunity to get a commercial version of the product. The primary reason is the ability to have this end user perspective in connections that are being made and to be able to track the information about those connections, whether we are looking for latency, connectivity issues, connections going down, or anything like that. With so many locations all around the city, it's a big deal for us to be able to do this without a whole bunch of work involved." Ernest also says the NetBeez devices were easy to set up and put into production, and they've been very reliable.
"We were really concerned about our ability to have a grasp of what the end users are experiencing on a day-to-day basis with any of the applications and services we are providing, whether it's just Internet access or one of our public applications," says Ernest. "For instance, we have a search application for an online catalog and we were really concerned about how well that was performing. NetBeez has mechanisms to create a task that is actually performing a search using a URL string so we were able to get some measurement of the application response time and do that across all of our locations. Capturing that data for historical reference was very important. We also monitor Internet performance. Now we have good metrics about whether, during any particular times throughout the day, there are any performance issues."
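The kind of task Ernest describes, running a search through a URL string and timing the response, might look roughly like the sketch below. It uses only Python's standard library; the query parameter name `q`, the example URL, and the function names are assumptions, since the article does not describe the catalog's actual search interface.

```python
import time
import urllib.parse
import urllib.request
from typing import Callable


def build_search_url(base_url: str, term: str) -> str:
    """Append a URL-encoded search term as the (assumed) 'q' query parameter."""
    return base_url + "?" + urllib.parse.urlencode({"q": term})


def fetch(url: str) -> bytes:
    """Download the full response body, giving up after 10 seconds."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def time_request(url: str, fetcher: Callable[[str], bytes] = fetch) -> float:
    """Return the seconds taken to fetch url; errors propagate to the caller."""
    start = time.perf_counter()
    fetcher(url)
    return time.perf_counter() - start
```

Calling `time_request(build_search_url("https://catalog.example.org/search", "moby dick"))` from each location and storing the results would give the kind of historical response-time record mentioned in the quote. The `fetcher` parameter exists so the timing logic can be exercised without network access.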
Though NetBeez is a relatively new company, it has a solid roadmap for where the solution is headed. The company has just released performance alerts, and an SLA report is coming soon to the management dashboard. The report will show network uptime and performance for each remote location, which an enterprise can then use to enforce an SLA with, for example, an Internet service provider.
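The arithmetic behind an uptime report is simple enough to sketch. The example below assumes uptime is measured as the share of periodic test samples that passed, which is one common convention; the article does not say how the NetBeez report computes it, and the function names are hypothetical.

```python
from typing import Sequence


def uptime_percent(samples: Sequence[bool]) -> float:
    """Return the percentage of samples that passed (True means the test passed)."""
    if not samples:
        raise ValueError("no samples to evaluate")
    return 100.0 * sum(samples) / len(samples)


def meets_sla(samples: Sequence[bool], target_percent: float) -> bool:
    """Return True when measured uptime is at or above the SLA target."""
    return uptime_percent(samples) >= target_percent
```

For instance, three passing samples out of four give 75 percent uptime, which would fall short of a 99.9 percent target.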
Other features on the near-term product roadmap include bandwidth analysis and baselining, and wireless, virtual and external agents. Virtual agents will enable NetBeez to also monitor private cloud infrastructures, while external agents will verify online services from the perspective of an Internet user.