(this originally appeared on GeekSpeak)
Want to know a secret?
I’m going to start at the end.
If your environment collects syslog and trap messages, no matter what vendor solution you are using, create a filtration layer that will take all those messages, process them, and forward just the useful ones along.
Now, moving from the end back to the beginning, here’s what you want to do: Get some copies of Kiwi Syslog Server, set up a load balancer like an F5 to do UDP round robin between all those servers, and set rules on the first server to filter out everything but the alerts you want to keep. For the messages you want to keep, set up rules to transparently forward them to the system(s) that will process and act on them. Export that rule set and import it to the other servers sitting behind the load balancer. Finally, update all of the devices in your enterprise to send their trap and syslog messages to the VIP presented by the load balancer.
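To make the "filter out everything but the alerts you want to keep" step concrete, here is a minimal sketch of the kind of keep-or-drop decision a filtering rule makes. This is not Kiwi's rule format (you build those in its own interface); the severity threshold, the pattern list, and the `should_forward` name are all hypothetical choices for illustration. It relies only on the standard syslog convention that a message starts with `<PRI>` and that severity is the priority value modulo 8.

```python
import re

# Hypothetical rule set: keep anything at "warning" severity or worse,
# plus any message matching one of these patterns.
MAX_SEVERITY = 4  # syslog severities run from 0 (emergency) to 7 (debug)
KEEP_PATTERNS = [re.compile(r"link down", re.IGNORECASE)]

# A syslog message conventionally starts with its priority, e.g. "<187>".
PRI_RE = re.compile(r"^<(\d{1,3})>")


def should_forward(raw_message: str) -> bool:
    """Return True if the message is worth passing on to the monitoring system."""
    match = PRI_RE.match(raw_message)
    if match:
        severity = int(match.group(1)) % 8  # severity is the low three bits of PRI
        if severity <= MAX_SEVERITY:
            return True
    # Low-severity (or unparseable) messages survive only if a pattern matches.
    return any(p.search(raw_message) for p in KEEP_PATTERNS)
```

Everything that returns False gets dropped at the filtration layer, so the monitoring system never has to look at it.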
That’s the secret! Now that I’ve given away the trick and the bottom line, are you curious to know WHY I am telling you all this?
This is why: I’ve seen the following scenario a half-dozen times. I’m brought in to consult on a monitoring project and someone announces, “My monitoring sucks! It’s dog slow and just doesn’t work. Find me something else!” So, I poke around and realize that all of their traps and syslog messages are going to a single system, which also happens to be the monitoring system. In SolarWinds terms, that’s the primary poller.
In my experience, network devices generate a metric buttload (yes, that’s a scientifically accurate measurement) of messages per hour. In more boring terms, we’re talking about roughly 4,000 messages per hour per machine.
If you have a server that is trying to manage pinging a set of devices (and collecting and storing those metrics) along with pulling SNMP or WMI data from that same set of devices (again, and storing that data), along with presenting that information in the form of views and reports, and checking the database for exceeded thresholds to create alerts, and analyzing that data to provide baselines, and… Well, you get the point. Polling engines have a lot of work to do. And one of the ways they stay on top of that work is having a finely tuned scheduler that manages all those polling cycles.
If you then start throwing a few million spontaneous messages at it, all of which must be processed in real time, what you have is a VERY unhappy system. What you have is a monitoring solution that “sucks” through no fault of its own.
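To put rough numbers on that, here is the back-of-the-envelope math. The 4,000 messages per hour per machine is the figure I quoted above; the 1,000-device count is purely a hypothetical environment size, so plug in your own.

```python
MESSAGES_PER_HOUR_PER_DEVICE = 4_000  # rough figure quoted in the article
DEVICE_COUNT = 1_000                  # hypothetical environment size


def message_load(devices: int, per_device_per_hour: int = MESSAGES_PER_HOUR_PER_DEVICE):
    """Return (messages per hour, messages per second) for a given device count."""
    per_hour = devices * per_device_per_hour
    per_second = per_hour / 3600
    return per_hour, per_second


per_hour, per_second = message_load(DEVICE_COUNT)
print(f"{per_hour:,} messages per hour, about {per_second:,.0f} per second")
```

With those assumptions, that is 4 million unscheduled messages an hour landing on a server that already has a full polling schedule to keep.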
Once I am able to point this out to clients, the next question is, “Should we turn off syslog or traps?” Of course not. That is a rich and vital source of information. What you need is to put something in front of the monitoring system to filter those messages before they arrive.
Which brings me back to the “filtration system.”
BUT… there’s a catch! The catch is that most syslog and trap receivers expect to also process those messages themselves – to create alerts, to store the data, etc. What is needed in my example is to be able to ignore the messages that are unimportant, but then FORWARD the ones that matter to another system that is able to act upon them. The challenge here is to forward them without changing the source machine.
Many trap and syslog handlers can forward messages, but they replace the original machine with themselves as the source. That’s not helpful when you want to correlate a syslog message with data collected another way, say SNMP polling, for example. To do that, you need to perform what is called “transparent” forwarding, which keeps the original source machine information intact.
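If you are wondering what "keeping the source intact" looks like on the wire, here is a rough sketch. A normal UDP socket always stamps the sender's own address on outgoing packets, so one way to preserve the original source is to build the IPv4 and UDP headers by hand and write the original device's address into the source field. I am not claiming this is how Kiwi (or any other product) implements it; the `build_forwarded_datagram` name and the addresses in the usage note are hypothetical.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """Internet checksum: one's-complement sum of the header's 16-bit words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:  # fold any carry bits back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_forwarded_datagram(orig_src_ip: str, dst_ip: str,
                             src_port: int, dst_port: int,
                             payload: bytes) -> bytes:
    """Build an IPv4/UDP packet whose source is the ORIGINAL device, not the forwarder."""
    udp_len = 8 + len(payload)
    # A UDP checksum of 0 means "not computed", which IPv4 permits.
    udp_header = struct.pack("!HHHH", src_port, dst_port, udp_len, 0)

    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                            # version 4, header length 5 words (20 bytes)
        0,                               # type of service
        20 + udp_len,                    # total packet length
        0,                               # identification
        0,                               # flags and fragment offset
        64,                              # time to live
        socket.IPPROTO_UDP,              # protocol number 17
        0,                               # checksum placeholder
        socket.inet_aton(orig_src_ip),   # the original device's address is preserved here
        socket.inet_aton(dst_ip),
    )
    checksum = ipv4_checksum(header)
    header = header[:10] + struct.pack("!H", checksum) + header[12:]
    return header + udp_header + payload
```

For example, `build_forwarded_datagram("10.1.1.5", "10.9.9.9", 514, 514, b"<187>fan failure")` yields a packet that the receiving system would attribute to 10.1.1.5 rather than to the filtering server. Actually putting it on the wire takes a raw socket and elevated privileges, the details vary by platform, and some networks drop packets with a forged source, so treat this as an illustration of the idea rather than a drop-in forwarder.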
Kiwi Syslog has done this for syslog messages for years. But not so with SNMP traps. For a variety of reasons, which I won’t get into now, that capability didn’t exist until 9.6, the latest version.
Now that this essential function within your monitoring infrastructure is available (not to mention really, REALLY affordable) you can impact the performance of your monitoring system in a great big, positive way.
So, take a minute and check out the new version. Forwarding traps transparently isn’t the only new feature, by the way. There’s also IPv6 support, SNMP v3 support, use of VarBinds in output, logging to Papertrail, and more! Try it and let me know what you think in the comments below.