You might think that implementing a network monitoring tool is like every other rollout. You would be wrong.
Oh, so you’re installing a new network monitoring tool, huh? No surprise there, right? What, was it time for a rip-and-replace? Is your team finally moving away from monitoring in silos? Perhaps there were a few too many ‘Let me Google that for you’ moments with the old vendor’s support line?
I’ve found there are three primary areas that are often overlooked when it comes to deploying a network monitoring application. This isn’t an exhaustive list, but taking your time with these three things will pay off in the end.
Scope–First, consider how far and how deep you need the monitoring to go. This will affect every other aspect of your rollout, so take your time thinking this through. When deciding how far, ask yourself the following questions:
- Do I need to monitor all sites, or just the primary data center?
- How about the development, test or quality assurance systems?
- Do I need to monitor servers or just network devices?
- If I do need to include servers, should I cover every OS or just the main one(s)?
- What about devices in DMZs?
- What about small remote sites across low-speed connections?
And when considering how deep to go, ask these questions:
- Do I also need to monitor up/down for non-routable interfaces (e.g., EtherChannel connections or multiprotocol label switching links)?
- Do I need to monitor items that are normally down and alert when they’re up (e.g., cold standby servers or cellular wide area network links)?
- Do I need to be concerned about virtual elements like host resource consumption by virtual machine, storage, security, log file aggregation and custom, home-grown applications?
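One way to make those answers concrete is to write them down as a filter over your device inventory. What follows is only a minimal sketch: the `Device` and `Scope` classes, the site and environment labels, and the sample inventory are all made up for illustration and don't come from any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    site: str          # hypothetical labels, e.g. "primary-dc", "remote-branch"
    environment: str   # "prod", "dev", "test" or "qa"
    kind: str          # "network" or "server"
    in_dmz: bool = False


@dataclass(frozen=True)
class Scope:
    """The 'how far' decisions, recorded as explicit allow-lists."""
    sites: frozenset
    environments: frozenset
    kinds: frozenset
    include_dmz: bool

    def covers(self, device: Device) -> bool:
        # DMZ devices are excluded unless the scope opts in explicitly.
        if device.in_dmz and not self.include_dmz:
            return False
        return (device.site in self.sites
                and device.environment in self.environments
                and device.kind in self.kinds)


# Illustrative inventory; every name here is invented.
inventory = [
    Device("core-sw-01", "primary-dc", "prod", "network"),
    Device("web-01", "primary-dc", "prod", "server", in_dmz=True),
    Device("build-01", "primary-dc", "dev", "server"),
    Device("branch-rtr-07", "remote-branch", "prod", "network"),
]

# Example decision: production only, primary data center only, no DMZ.
scope = Scope(
    sites=frozenset({"primary-dc"}),
    environments=frozenset({"prod"}),
    kinds=frozenset({"network", "server"}),
    include_dmz=False,
)

in_scope = [d.name for d in inventory if scope.covers(d)]
out_of_scope = [d.name for d in inventory if not scope.covers(d)]
print("monitor:", in_scope)
print("skip:   ", out_of_scope)
```

Seeing the "skip" list spelled out is often what prompts the conversation about whether the DMZ or the remote sites really can be left out.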
Protocols and permissions–After you’ve decided which systems to monitor and what data to collect, you need to consider the methods to use. Protocols such as Simple Network Management Protocol (SNMP), Windows Management Instrumentation (WMI), syslog and NetFlow each have their own permissions and connection points in the environment.
For example, many organizations plan to use SNMP for hardware monitoring, only to discover it’s not enabled on dozens, or hundreds, of systems. Alternatively, they find out it is enabled, but the community strings are inconsistent, undocumented or unset. Then they go to monitor in the DMZ and realize that the security policy won’t allow SNMP across the firewall.
Additionally, remember that different collection methods have different access schemes. For example, WMI uses a Windows account on the target machine. If the account is missing, has the wrong permissions or is locked, monitoring won’t work. Meanwhile, SNMP uses a simple community string that can be different on each machine.
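A quick audit of whatever inventory data you already have can surface these surprises before the rollout does. The sketch below assumes a hypothetical list of device records with made-up field names (`snmp_enabled`, `community`, `in_dmz`, `firewall_allows_snmp`); it doesn't talk to any device, it only sorts records by the first problem that would block SNMP polling.

```python
from collections import defaultdict


def audit_snmp_readiness(devices):
    """Group device names by the first problem that would block SNMP polling.

    Each device is a dict with hypothetical keys: name, snmp_enabled,
    community, in_dmz, firewall_allows_snmp.
    """
    problems = defaultdict(list)
    for d in devices:
        if not d.get("snmp_enabled"):
            problems["snmp_disabled"].append(d["name"])
        elif not d.get("community"):
            problems["community_unset"].append(d["name"])
        elif d.get("in_dmz") and not d.get("firewall_allows_snmp"):
            problems["blocked_by_firewall"].append(d["name"])
    return dict(problems)


def distinct_communities(devices):
    """Return the set of community strings in use, to spot inconsistency."""
    return {d["community"] for d in devices if d.get("community")}


# Illustrative records only; names and strings are invented.
devices = [
    {"name": "core-sw-01", "snmp_enabled": True, "community": "monitor-ro"},
    {"name": "edge-rtr-02", "snmp_enabled": False, "community": ""},
    {"name": "dist-sw-03", "snmp_enabled": True, "community": ""},
    {"name": "dmz-fw-04", "snmp_enabled": True, "community": "dmz-ro",
     "in_dmz": True, "firewall_allows_snmp": False},
]

report = audit_snmp_readiness(devices)
for problem, names in report.items():
    print(f"{problem}: {', '.join(names)}")
print("community strings in use:", len(distinct_communities(devices)))
```

The same idea applies to WMI: swap the checks for "account exists," "account has the right permissions" and "account is not locked."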
Architecture–Finally, consider the architecture of the tools you’re considering. This breaks down to connectivity and scalability.
First, let’s consider connectivity. Agent-based platforms have on-device agents that collect and store data locally, then forward large data sets at regular intervals. Each collector bundles and sends this data to a manager-of-managers, which passes it to the repository. Meanwhile, agentless solutions use a collector that directly polls source devices and forwards the information to the data store.
You need to understand the connectivity architecture of these various tools so you can effectively handle DMZs, remote sites, secondary data centers and the like. You also need to look at the connectivity limitations of various tools, such as how many devices each collector can support and how much data will be traversing the wire, so you can design a monitoring implementation that doesn’t cripple your network or collapse under its own weight.
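Even a back-of-the-envelope calculation helps here. The sketch below is just that: the per-collector device limit, payload size and polling interval are placeholder numbers I made up, so substitute whatever your vendor documents and whatever you measure in a proof of concept.

```python
import math


def collectors_needed(device_count, devices_per_collector):
    """Minimum number of collectors, rounding up partial collectors."""
    return math.ceil(device_count / devices_per_collector)


def polling_bandwidth_bps(device_count, bytes_per_poll, poll_interval_s):
    """Average bits per second generated by polling, ignoring protocol overhead."""
    return device_count * bytes_per_poll * 8 / poll_interval_s


# Placeholder inputs; none of these figures come from a real product.
DEVICES = 2_500
DEVICES_PER_COLLECTOR = 1_000   # assumed vendor limit
BYTES_PER_POLL = 20_000         # assumed payload per device per poll
POLL_INTERVAL_S = 300           # assumed five-minute polling cycle

collectors = collectors_needed(DEVICES, DEVICES_PER_COLLECTOR)
bandwidth = polling_bandwidth_bps(DEVICES, BYTES_PER_POLL, POLL_INTERVAL_S)

print(f"collectors needed: {collectors}")
print(f"average polling traffic: {bandwidth / 1e6:.2f} Mbit/s")
```

Run the same numbers per site rather than in aggregate if remote locations sit behind low-speed links, since that is where the average hides the pain.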
Next comes scalability. Understand what kind of load the monitoring application will tolerate, and what your choices are to expand when (yes, when, not if) you hit that limit. To be honest, this is a tough one, and many vendors hope you’ll accept some form of an “it really depends” response.
In all fairness, it really does depend, and some things are simply impossible to predict. For example, I once had a client who wanted to implement syslog monitoring on 4,000 devices. It ended up generating upwards of 20 million messages per hour. That was not a foreseeable outcome.
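To put that syslog figure in perspective, here is the arithmetic worked through. The device count and hourly message volume are the ones from the story above; the average message size is purely my assumption for illustration, so treat the storage number as a rough order of magnitude.

```python
DEVICES = 4_000
MESSAGES_PER_HOUR = 20_000_000          # "upwards of", so a lower bound
ASSUMED_BYTES_PER_MESSAGE = 200         # assumption, not a measured value

per_device_per_hour = MESSAGES_PER_HOUR / DEVICES
per_second = MESSAGES_PER_HOUR / 3600
per_day = MESSAGES_PER_HOUR * 24
gb_per_day = per_day * ASSUMED_BYTES_PER_MESSAGE / 1e9

print(f"messages per device per hour: {per_device_per_hour:,.0f}")
print(f"messages per second overall:  {per_second:,.0f}")
print(f"messages per day:             {per_day:,}")
print(f"raw storage per day (assumed message size): {gb_per_day:.0f} GB")
```

That works out to 5,000 messages per device every hour and more than 5,500 messages per second hitting the collector, which is why a proof of concept on a representative slice of devices is worth the time.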
By taking these key elements of a monitoring tool implementation into consideration, you should be able to avoid most of the major missteps many monitoring rollouts suffer from. And the good news is that from there, the same techniques that serve you well during other implementations will help here. You want to ask lots of questions; meet with customers whose situations resemble yours in environment size, business sector and so on; set up a proof of concept first; engage experienced professionals to assist as necessary; and be prepared, both financially and psychologically, to adapt as wrinkles crop up. Because they will.
(this originally appeared on SearchNetworking)