(This post originally appeared on The Kentik Blog)
In my last post, I waxed poetic (or at least long-winded) on how to add a custom SNMP object ID (OID) into Kentik NMS. Despite the fact that NMS collects a metric butt-tonne (that’s a highly technical form of measurement) of data, there are always custom elements needed by various folks for specific use cases.
However, in sharing my example – collecting a custom OID for CPU temperature – I omitted a critical piece of context: Most modern systems have more than one CPU and, therefore, more than one temperature value.
I left out that detail so that I could explain the mechanics of adding custom OIDs clearly, without overwhelming the audience with a bunch of additional complexities.
With the basic process out of the way now, I felt it was important to circle back and talk about how you can add multiple custom SNMP metrics at the same time. And when I say “multiple,” there are two different scenarios I’m going to cover:
- When a single OID returns multiple values
- When collecting several different, unrelated metrics at the same time
A quick review of custom OID collection
Just to review the process for custom OIDs in Kentik NMS:
- Move to (or create, if it doesn’t exist) the dedicated folder on the system where the Kentik agent (kagent) is running:
  /opt/kentik/components/ranger/local/config
- In that directory, create directories for /sources, /reports, and /profiles.
- Create three specific files:
  - Under /sources, a file that lists the custom OID to be collected
  - Under /reports, a file that associates the custom OID with the data category it will appear under within the Kentik portal
  - Under /profiles, a file that describes a type of device (using the SNMP System Object ID) and the report(s) to be associated with that device type
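If you would rather script the directory setup than type it, here is a minimal Python sketch of the first two steps. The function name make_config_dirs is my own invention for illustration, not part of kagent, and it only creates the three directories; it does not write any YAML. Writing under /opt will most likely need elevated permissions.

```python
from pathlib import Path

# Location named in the post. Pass a different base to experiment safely.
DEFAULT_BASE = Path("/opt/kentik/components/ranger/local/config")


def make_config_dirs(base=DEFAULT_BASE):
    """Create sources/, reports/ and profiles/ under base.

    Hypothetical helper: existing directories are left alone, and the
    list of the three directory paths is returned.
    """
    created = []
    for sub in ("sources", "reports", "profiles"):
        path = Path(base) / sub
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling make_config_dirs() with no arguments targets the path from the steps above; passing a scratch directory lets you try it without touching a live agent.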
For just one temperature setting, it would look like this:
sources/linux.yml
version: 1
metadata:
  name: local-linux
  kind: sources
sources:
  temp: !snmp
    value: 1.3.6.1.4.1.2021.13.16.2.1.3.1
    interval: 60s
reports/linux_temps_report.yml
version: 1
metadata:
  name: local-temp
  kind: reports
reports:
  /device/linux/temp:
    fields:
      CPUTemp: !snmp
        value: 1.3.6.1.4.1.2021.13.16.2.1.3.1
        metric: true
    interval: 60s
profiles/local-net-snmp.yml
version: 1
metadata:
  name: local-net-snmp
  kind: profile
profile:
  match:
    sysobjectid:
      - 1.3.6.1.4.1.8072.*
  reports:
    - local-temp
  include:
    - device_name_ip
Collecting multiple unrelated OIDs
Once you understand the process of collecting a single OID, adding others is pretty simple.
For our example, we also want to collect icmpInEchos (the number of pings received) for Linux/net-snmp type devices. The OID for this is 1.3.6.1.2.1.5.8.0. Using the same files from above, I’d make the following modifications:
sources/linux.yml
version: 1
metadata:
  name: local-linux
  kind: sources
sources:
  temp: !snmp
    value: 1.3.6.1.4.1.2021.13.16.2.1.3.1
    interval: 60s
  icmpInEchos: !snmp
    value: 1.3.6.1.2.1.5.8.0
    interval: 60s
reports/ping-count.yml
version: 1
metadata:
  name: local-ping-count
  kind: reports
reports:
  /device/linux/pings:
    fields:
      ping-count: !snmp
        value: 1.3.6.1.2.1.5.8.0
        metric: true
    interval: 60s
profiles/local-net-snmp.yml
version: 1
metadata:
  name: local-net-snmp
  kind: profile
profile:
  match:
    sysobjectid:
      - 1.3.6.1.4.1.8072.*
  reports:
    - local-temp
    - local-ping-count
  include:
    - device_name_ip
Let’s unpack some of the things you see there, and how they differ from the collection of a single OID:
- sources/linux.yml has two different sources: one for temp (temperature) and one for icmpInEchos.
- An entirely new file under /reports, named ping-count.yml, describes the OID to be collected and specifies that its data will appear under /device/linux/pings within the Kentik portal.
- Finally, profiles/local-net-snmp.yml (which was already present for the single temperature OID) has been modified to also associate the report named “local-ping-count”.
I want to emphasize two other points: First, the file name doesn’t matter at all. The key is to make sure the “name: ” element within the YAML files is correct. Second, the directory structure is just a housekeeping mechanism. As long as the “kind: ” element within the YAML file is correct, you can have everything in the same folder if you prefer.
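To make the “names, not file names” point concrete, here is a small Python sketch of the kind of sanity check you could run before deploying a change. It is not how kagent itself resolves files; unresolved_reports and the already-parsed dictionaries are assumptions for illustration, and each document is trimmed down to just the keys the check needs.

```python
def unresolved_reports(documents):
    """Return (filename, report_name) pairs a profile lists but nothing defines.

    `documents` is a list of (filename, parsed_yaml_dict) pairs. File names
    are carried along only for display; matching uses metadata.name and
    metadata.kind, never the file name or its directory.
    """
    defined = {
        doc["metadata"]["name"]
        for _, doc in documents
        if doc["metadata"]["kind"] == "reports"
    }
    missing = []
    for filename, doc in documents:
        if doc["metadata"]["kind"] != "profile":
            continue
        for report_name in doc["profile"]["reports"]:
            if report_name not in defined:
                missing.append((filename, report_name))
    return missing


# Trimmed-down stand-ins for the files in this section. The first file name
# is deliberately arbitrary to show that it plays no part in the lookup.
docs = [
    ("any-name-you-like.yml",
     {"metadata": {"name": "local-temp", "kind": "reports"}}),
    ("ping-count.yml",
     {"metadata": {"name": "local-ping-count", "kind": "reports"}}),
    ("local-net-snmp.yml",
     {"metadata": {"name": "local-net-snmp", "kind": "profile"},
      "profile": {"reports": ["local-temp", "local-ping-count"]}}),
]

print(unresolved_reports(docs))
```

An empty list means every report the profile references is defined somewhere; a typo in either “name: ” element would show up as a (file, report name) pair.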
The result is that the Kentik portal now receives, and can display, data for both temperature and ICMP echoes received.
Collecting a table of OIDs
(or, “I Contain Multitudes”)
Some SNMP OIDs return a single value, like the icmpInEchos metric in our last example.
But others are effectively the tip of an iceberg of metrics. Examples of this type of OID include CPU, temperature, fans, disks, and even the SNMP OID that displays running processes.
Which brings us back to the original OID – You’ll recall that in my previous post, the OID I used was 1.3.6.1.4.1.2021.13.16.2.1.3.1, which gave me one temperature stat. But if I used “.2” instead of “.1” at the end, I would see another temperature reading.
In fact, I can do that for five different OIDs.
This is because the humble little Raspberry Pi I’m monitoring in this example has four cores (stick with me; I’ll explain in a minute why that gives five readings rather than four). While I’d like to monitor all of them, I have to recognize that not every device has the same number of cores, and therefore, I need something that will flexibly collect whatever number I have. Luckily, SNMP handles that more or less automatically. At the command line, an snmpwalk (instead of an snmpget) returns every one of those readings with a single command.
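If the difference between a get and a walk is fuzzy, this Python sketch models it with a plain dictionary standing in for the device. Nothing here actually talks SNMP: get, walk, and the agent dictionary are hypothetical, and the values are placeholders rather than real readings from my Pi.

```python
TEMP_COLUMN = "1.3.6.1.4.1.2021.13.16.2.1.3"


def oid_tuple(oid):
    """Turn '1.3.6.1' into (1, 3, 6, 1) so OIDs compare numerically."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def get(agent, oid):
    """Like snmpget: the value stored at exactly this OID, or None."""
    return agent.get(oid)


def walk(agent, prefix):
    """Like snmpwalk: every (oid, value) at or below prefix, in OID order."""
    root = oid_tuple(prefix)
    hits = [
        (oid, value)
        for oid, value in agent.items()
        if oid_tuple(oid)[: len(root)] == root
    ]
    return sorted(hits, key=lambda pair: oid_tuple(pair[0]))


# Stand-in for the device: five temperature rows plus one unrelated scalar.
# All values are placeholders, not data from a real device.
agent = {f"{TEMP_COLUMN}.{i}": f"<reading {i}>" for i in range(1, 6)}
agent["1.3.6.1.2.1.5.8.0"] = "<icmpInEchos count>"

print(get(agent, f"{TEMP_COLUMN}.1"))      # one specific row
for oid, value in walk(agent, TEMP_COLUMN):
    print(oid, value)                      # every row, however many exist
```

The point of the walk is that the caller never has to know how many rows there are, which is what makes it a good fit for devices with different core counts.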
What’s missing are the names of each of these elements. While it may not be a big deal for CPUs, it’s far more important when I’m collecting the same data point for disks, interfaces, and the like.
https://oidref.com/ tells me the names of the sensors can be found at the OID 1.3.6.1.4.1.2021.13.16.2.1.2.
Walking that OID shows me that the first entry is the aggregate temperature, and the next four are the temperatures for each of the four cores in my Pi.
With that information in hand, our goals are to:
- Collect all five temperature readings without having to call out each one explicitly.
- Associate the data values with labels.
Here is what the YAML files look like:
sources/linux.yml
version: 1
metadata:
  name: local-linux
  kind: sources
sources:
  CPUTemp: !snmp
    table: 1.3.6.1.4.1.2021.13.16.2
    interval: 60s
reports/temp.yml
version: 1
metadata:
  name: local-temp
  kind: reports
reports:
  /device/linux/temp:
    fields:
      name: !snmp
        table: 1.3.6.1.4.1.2021.13.16.2
        value: 1.3.6.1.4.1.2021.13.16.2.1.2
        metric: false
      CPUTemp: !snmp
        table: 1.3.6.1.4.1.2021.13.16.2
        value: 1.3.6.1.4.1.2021.13.16.2.1.3
        metric: true
    interval: 60s
profiles/local-net-snmp.yml
version: 1
metadata:
  name: local-net-snmp
  kind: profile
profile:
  match:
    sysobjectid:
      - 1.3.6.1.4.1.8072.*
  reports:
    - local-temp
  include:
    - device_name_ip
Once again, let’s unpack that:
- sources/linux.yml uses the OID 1.3.6.1.4.1.2021.13.16.2. Two things are notable:
  - This is a couple of levels “up” the OID chain from both the temperature (1.3.6.1.4.1.2021.13.16.2.1.3) and the labels (1.3.6.1.4.1.2021.13.16.2.1.2).
  - The original example used a metric type of “value”. This one is using the “table” type.
- reports/temp.yml describes two fields instead of just one:
  - A “name” field which pulls two data sets:
    - the overall table from the OID ending at 16.2
    - the actual values from the OID ending at 16.2.1.2
  - A “CPUTemp” field which pulls two data sets:
    - the overall table from the OID ending at 16.2
    - the actual values from the OID ending at 16.2.1.3
The result of this structure is that it will associate the labels from the 16.2.1.2 branch of the OID table with the temperature values in the 16.2.1.3 branch.
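Here is a rough Python sketch of that association, in case it helps to see it spelled out. This is my own illustration of the idea, not Kentik’s implementation: rows_by_index and label_values are hypothetical names, and the sensor names and readings are placeholders. The key idea is that both columns share the same trailing row index, so that index is what ties a label to its value.

```python
NAME_COLUMN = "1.3.6.1.4.1.2021.13.16.2.1.2"
TEMP_COLUMN = "1.3.6.1.4.1.2021.13.16.2.1.3"


def rows_by_index(walked, column):
    """Map row index -> value for every OID shaped like <column>.<index>."""
    prefix = column + "."
    return {
        oid[len(prefix):]: value
        for oid, value in walked.items()
        if oid.startswith(prefix)
    }


def label_values(walked):
    """Pair each temperature with the name that shares its row index."""
    names = rows_by_index(walked, NAME_COLUMN)
    temps = rows_by_index(walked, TEMP_COLUMN)
    return {names[i]: temps[i] for i in temps if i in names}


# Placeholder table walk: five rows, each with a name and a reading.
# None of these strings come from a real device.
walked = {}
for i in range(1, 6):
    walked[f"{NAME_COLUMN}.{i}"] = f"<sensor name {i}>"
    walked[f"{TEMP_COLUMN}.{i}"] = f"<reading {i}>"

for name, reading in label_values(walked).items():
    print(name, reading)
```

Because the pairing is done by index rather than by position, it keeps working whether a device reports two rows or twenty.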
Note that the profiles/local-net-snmp.yml is unchanged from our original example of collecting a single temperature value.
BONUS ROUND: Putting it all together
By this point, you should be getting pretty comfortable with the idea, if not the technique, of adding custom SNMP OIDs to Kentik NMS. In this last example, we’re going to combine both approaches: custom OIDs for every CPU temperature, along with the icmpInEchos data. Here’s what the files look like:
sources/linux.yml
version: 1
metadata:
  name: local-linux
  kind: sources
sources:
  icmpInEchos: !snmp
    value: 1.3.6.1.2.1.5.8.0
    interval: 60s
  CPUTemp: !snmp
    table: 1.3.6.1.4.1.2021.13.16.2
    interval: 60s
reports/temp.yml
version: 1
metadata:
  name: local-temp
  kind: reports
reports:
  /device/linux/temp:
    fields:
      name: !snmp
        table: 1.3.6.1.4.1.2021.13.16.2
        value: 1.3.6.1.4.1.2021.13.16.2.1.2
        metric: false  # a label for each row, not a metric (same as the earlier temp.yml)
      CPUTemp: !snmp
        table: 1.3.6.1.4.1.2021.13.16.2
        value: 1.3.6.1.4.1.2021.13.16.2.1.3
        metric: true
    interval: 60s
reports/ping-count.yml
version: 1
metadata:
  name: local-ping-count
  kind: reports
reports:
  /device/linux/pings:
    fields:
      ping-count: !snmp
        value: 1.3.6.1.2.1.5.8.0
        metric: true
    interval: 60s
profiles/local-net-snmp.yml
version: 1
metadata:
  name: local-net-snmp
  kind: profile
profile:
  match:
    sysobjectid:
      - 1.3.6.1.4.1.8072.*
  reports:
    - local-temp
    - local-ping-count
  include:
    - device_name_ip
That’s right. If you’re looking closely, you’ll see that these are mostly just the same files from our previous example, with the addition of reports/ping-count.yml and both report names combined in profiles/local-net-snmp.yml.
The mostly unnecessary conclusion
Two things should be clear at this point: Far from the shriveled, washed-up, has-been of a monitoring technique it’s often accused of being, SNMP continues to be a powerful, flexible, and valuable tool in your observability toolkit.
Moreover, Kentik NMS is an equally powerful, flexible, and useful solution for collecting and displaying those metrics alongside other data types, providing you with complete insight into the health and stability of your network.