(This article originally appeared on The Observatory)
In my blog post, Absolutely simple infrastructure monitoring, I wrote:
Depending on your background, the process of instrumenting your applications, systems, and even your coffee pot makes perfect sense. Requests to “pipe this curl command through BASH” are the kind of thing you do every day.
Or, you know, maybe not.
My tech journey started with managing servers (Novell and the nascent Windows NT) and network infrastructure. Monitoring—the traditional ping-and-trap kind—came later. I’m an old-school IT practitioner used to on-premises tools that ask some variation of “give me a list of IP addresses and SNMP strings and I’ll do the rest.” So a cloud-native interface like New Relic that automatically instruments my network didn’t feel as familiar or intuitive to me, at least at first. If you’re an old-school IT practitioner, too, you might feel the same way, and this blog post is for you.
You’ll learn how New Relic network monitoring works and why adding it to your stable of tools will be a win for you and the wider organization. And then I’ll show you how to install the monitoring agent and display your network data.
Monitoring your network with New Relic
Before we get going with this tutorial, let’s take a look at some of the key features in New Relic that are helpful for network engineers, including dashboards and alerts.
Here’s an example of the Hosts dashboard:
This dashboard gives you high-level information about your system’s CPU, RAM, storage (if it has any), network traffic, and so on.
If you want an aggregated view of data from multiple network devices at once, you can simply select them:
And then select View Selected to drill down into the metrics for those entities:
There are prebuilt dashboards you can view by selecting Dashboards from the upper navbar, such as Routers and Switches:
The Routers and Switches dashboard includes information on device inventory, interface inventory, and traffic.
Every data point and metric can potentially be used to generate an alert or notification if something goes wrong. Which is a heck of a lot better than waiting for a customer to call and ask “is the internet down?”
Prerequisites
The mark of a good cook, and of a good IT practitioner, is gathering your tools and ingredients before you start so you’re not scrambling in the middle of the process.
What we’re building is a low-powered system that will host the New Relic agent. This agent scans your network for devices, builds a list, and then continuously collects data from those devices—transmitting it back to the New Relic database. This data is what drives all the dashboards, insights, and alerts I described earlier in this post.
If you’re not already using New Relic, sign up for a free account.
The New Relic agent runs inside a Docker container, so you’ll need a physical or virtual Linux- or macOS-based system that can run Docker.
Here’s what you need to have handy:
- The read-only SNMP string for the device(s) you want to monitor
- The CIDR-notated network segment you want to monitor (or a list of individual IP addresses)
- A machine (it can be a virtual machine or container) running Linux. If you aren’t sure how to set this up, don’t panic. There are step-by-step instructions in the next section.
- That Linux system needs to have Docker installed. I’ve included instructions for this step, too.
For the rest of this tutorial, I’m going to refer to this Linux box running Docker as <NR1-agent>.
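If you’d like to confirm that your read-only string actually works before going any further, here’s a minimal sketch of a check you could run from any Linux machine on the same network. It assumes SNMP v2c and the net-snmp command-line tools (on Ubuntu, the snmp package). The function name snmp_check, the address, and the community string in the usage line are placeholders of mine, not anything New Relic provides.

```shell
# Ask a device for its system description (sysDescr.0) over SNMP v2c.
# Assumes the net-snmp tools are installed (on Ubuntu: sudo apt install snmp).
snmp_check() {
  local host="$1" community="$2"
  # -t 2 waits two seconds per attempt, -r 1 retries once before giving up.
  if snmpget -v2c -c "$community" -t 2 -r 1 "$host" 1.3.6.1.2.1.1.1.0 >/dev/null 2>&1; then
    echo "$host: SNMP OK"
  else
    echo "$host: no SNMP response"
    return 1
  fi
}

# Usage (placeholder address and community string):
#   snmp_check 192.0.2.10 public
```

A "no SNMP response" result usually means the string is wrong, SNMP isn’t enabled on the device, or something between the two machines is blocking UDP port 161.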
Setting up Linux and Docker
If you already have Linux and Docker set up on your machine, you can skip this step. If you don’t, you might be worried that this is going to be overly complicated. Don’t worry, it’s not—and you don’t need to know the ins and outs of Docker to follow along.
Almost any version and distribution of Linux will work. (macOS will work too, although macOS’s relationship with Docker is… complicated.) For this tutorial, I’m choosing to use Ubuntu 20.04. If you want to follow along step-by-step, you can download it from Ubuntu. Follow their instructions to complete the Ubuntu installation process.
When that’s done, you might want to connect to the server remotely via SSH. Why? If you’re setting this up as a VM, you likely won’t be able to copy-paste commands from this document directly into the VM screen, but you can through an SSH session.
To get your machine’s IP address, run this command:
ip address
The output will show you each of your interfaces along with their IP addresses. Now, install the following additional packages and applications:
OpenSSH server
Open a terminal window and type these three commands, pressing Enter after each (and entering your password if prompted):
sudo apt-get install openssh-server
sudo systemctl enable ssh
sudo systemctl start ssh
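To double-check that the SSH service is up, and to see what to type on your other machine, something like this sketch might help. The ssh_hint function is a hypothetical helper of mine. It assumes the first address reported by hostname -I is the one you want, which may not hold on a machine with several interfaces.

```shell
# Print the ssh command to run from another machine.
# $1 is the user name, $2 is a space-separated address list (as printed by hostname -I).
ssh_hint() {
  local user="$1" addresses="$2"
  local first="${addresses%% *}"   # keep only the first address in the list
  printf 'ssh %s@%s\n' "$user" "$first"
}

# Only probe the real system when systemctl exists and reports the ssh service as active.
if command -v systemctl >/dev/null 2>&1 && systemctl is-active --quiet ssh; then
  ssh_hint "$USER" "$(hostname -I)"
fi
```

If nothing prints, the service probably isn’t running yet; sudo systemctl status ssh should tell you more.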
Docker
In your terminal, enter the following command:
sudo apt install docker.io
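You can give your account permission to run Docker without sudo by using the following commands (if the first one reports that the docker group already exists, that’s fine, just carry on):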
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
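If you’re not sure the group change took effect, here’s a small sketch that checks whether your current session is in the docker group. The has_docker_group function is my own illustration and only looks at group names; it doesn’t prove the Docker service itself is healthy.

```shell
# Succeeds if the word "docker" appears in a space-separated list of group names.
has_docker_group() {
  case " $1 " in
    *" docker "*) return 0 ;;
    *) return 1 ;;
  esac
}

# id -nG prints the group names for the current session.
if has_docker_group "$(id -nG)"; then
  echo "docker group: yes"
else
  echo "docker group: no (log out and back in, or rerun newgrp docker)"
fi
```

For an end-to-end check, running docker run hello-world without sudo is the usual smoke test.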
Before we move on, here are a few Docker commands that you might need for your installation process. These commands are useful if you enter the wrong information during installation and need to try again or if you add more devices to the network and want to rescan.
You can see which containers are running (along with their container IDs) with the following command:
docker ps
You can watch what a Docker container is doing with this command (you get the container ID from the docker ps command):
docker logs --follow <container id>
Finally, if you have issues with any containers, including the command to build or run the New Relic agent, you can easily stop and remove a container with these commands:
docker stop <container id>
docker rm <container id>
Then you can try again with a new container.
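If you find yourself repeating that stop-and-remove dance, a sketch like the following could wrap it up. It looks up running containers by the image they were started from, so you don’t have to copy IDs around. The reset_containers name and the image name in the usage line are hypothetical; use whatever appears in the IMAGE column of docker ps on your machine.

```shell
# Stop and remove every running container that was started from the given image.
reset_containers() {
  local image="$1" id
  # --quiet prints only container IDs; --filter ancestor=... matches on the image.
  for id in $(docker ps --quiet --filter "ancestor=${image}"); do
    docker stop "$id" && docker rm "$id"
  done
}

# Usage (placeholder image name):
#   reset_containers example/agent-image
```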
Setting up New Relic
Next, you need a New Relic account. You can set one up for free, and it comes with 100 GB of data per month. You may even be tempted to start monitoring other stuff in New Relic, which wouldn’t be the worst decision you’ve ever made. Trust me on this.
Next, you need to set up the network performance monitoring agent. Log in and select Add More Data from the left-hand pane.
In the search bar, type network and then select SNMP from the Network Performance Monitoring section.
Choose your account and set the SNMP version. If you don’t know, select v2c. Leave the polling interval as-is unless you have a reason to change it.
Set your CIDR range to your network range, which will look like 10.1.2.3/16, 192.168.122.0/24, or something similar, and enter your SNMP community string. Then select Validate and continue.
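If you’re unsure which network a given range actually covers, here’s a small sketch in plain shell arithmetic that prints the network address for an IPv4 CIDR. The cidr_network function is my own illustration, and it assumes well-formed input: four decimal octets without leading zeros and a prefix between 0 and 32.

```shell
# Print the network address (with prefix) for an IPv4 CIDR such as 10.1.2.3/16.
cidr_network() {
  local ip="${1%/*}" prefix="${1#*/}" a b c d
  # Split the dotted address into its four octets.
  IFS=. read -r a b c d <<EOF
$ip
EOF
  local addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  # Build a 32-bit mask with the top "prefix" bits set.
  local mask=$(( prefix == 0 ? 0 : (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  local net=$(( addr & mask ))
  printf '%d.%d.%d.%d/%d\n' \
    $(( net >> 24 & 255 )) $(( net >> 16 & 255 )) $(( net >> 8 & 255 )) $(( net & 255 )) "$prefix"
}

# Usage:
#   cidr_network 10.1.2.3/16       # prints 10.1.0.0/16
#   cidr_network 192.168.122.0/24  # prints 192.168.122.0/24
```

Keep in mind that a /16 spans 65,536 addresses while a /24 spans 256, so the wider the range, the more there is to scan.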
Next, you need to run some fairly complex commands on the <NR1-agent> VM, so you should SSH into your machine using a utility that supports copy-and-paste, such as PuTTY or a Linux terminal.
Got your SSH session running? Then let’s keep rolling along.
In https://one.newrelic.com, you’ll see a command that will do an initial discovery of your network devices. Copy it from the browser and paste it into the <NR1-agent> terminal. You’ll need this command again later, so save it in your clipboard or somewhere else convenient.
Why would you need it? Because the agent will only detect devices that are currently in your environment. If you add more devices later, they don’t automatically show up in New Relic unless you do another scan. You can manually add devices without rescanning, but that’s a bit more involved.
After you run the command in your SSH session, you’ll see a line near the bottom that says 1 device was added (unless you’re going all out and adding a whole subnet. You wild child, you!)
Next, go back to one.newrelic.com and select Continue. You’ll receive another command to copy and paste into your SSH session.
Important: Just like the earlier command, save this one in a file or location where you’ll be able to get back to it. I’ll explain why later in this section.
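One way to keep those commands around is to drop each one into its own small script. Here’s a rough sketch; the save_command function and the file name in the usage line are placeholders of mine. I’m assuming the commands may contain account details, so the script is made readable and executable by your user only.

```shell
# Save whatever arrives on standard input as a private, rerunnable script.
save_command() {
  local file="$1"
  # Write a shebang line, then everything pasted or piped in.
  { echo '#!/usr/bin/env bash'; cat; } > "$file"
  chmod 700 "$file"   # owner-only read, write, and execute
}

# Usage (placeholder file name): run the line below, paste the command
# from the browser, press Enter, then press Ctrl-D.
#   save_command ~/nr-discovery.sh
```

Later, rerunning the command is just a matter of running that script.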
At this point, New Relic is monitoring your devices, and your Routers dashboard might look something like this. Note that it might take a minute (or fifteen) for data to show up. That delay is completely normal.
You’re now using New Relic to monitor a network device! Give yourself a high five!
Maintaining your instrumentation
If your network configuration remains the same, the agent will continue running inside its Docker container, collecting metrics and sending them to New Relic.
However, in most networks, devices will get added, removed, or upgraded fairly often. When that happens, you’ll need to instrument your environment again.
Unlike physical or virtual machines, containers are meant to be temporary. Just remove the container and launch a new one. Then follow the New Relic setup instructions again so that the agent picks up the new devices in your network.
With minimal setup and maintenance, you can quickly set up network monitoring for a larger slice of your network.