Between now and the end of Passover, I’m sharing excerpts from my book “The Four Questions Every Monitoring Engineer is Asked.” It blends wisdom and themes from Passover with the common questions (and their answers) you hear when setting up and running monitoring and observability solutions.
You can buy it as an ebook (Amazon Kindle, Barnes & Noble Nook, and more) or as a good old-fashioned physical book. You can even check it out through OverDrive.
Monitoring is not alerting. Some people confuse getting a ticket, page, email, or other alert with actual monitoring. Monitoring is nothing more, and nothing less, than the ongoing collection of data about a particular element or set of elements. Alerting is a happy by-product of monitoring: once you have metrics, you can notify people when a specific metric rises above or falls below a threshold. I mention this here, in connection with the first question, because customers sometimes ask (or demand) that you fix (or even turn off) “monitoring.” What they really want is for you to change something about the alert they received, such as its frequency or level of detail. Rarely do they mean that you should stop collecting metrics.
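To make the distinction concrete, here is a minimal Python sketch. It is not taken from any particular monitoring product; the `Monitor` and `AlertRule` classes, the `cpu_percent` metric, and the 90.0 threshold are all hypothetical names and values chosen for illustration. The point is that collection and notification are separate pieces, so an alert can be silenced while the data keeps flowing.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Monitor:
    """Monitoring: the ongoing collection of data about an element."""
    samples: Dict[str, List[float]] = field(default_factory=dict)

    def collect(self, metric: str, value: float) -> None:
        # Store every sample, whether or not anyone is alerted about it.
        self.samples.setdefault(metric, []).append(value)


@dataclass
class AlertRule:
    """Alerting: a by-product that reads the metrics already collected."""
    metric: str
    threshold: float
    enabled: bool = True

    def check(self, monitor: Monitor) -> Optional[str]:
        values = monitor.samples.get(self.metric, [])
        if not self.enabled or not values:
            return None
        latest = values[-1]
        if latest > self.threshold:
            return f"{self.metric} is {latest}, above {self.threshold}"
        return None


monitor = Monitor()
rule = AlertRule(metric="cpu_percent", threshold=90.0)  # hypothetical rule

monitor.collect("cpu_percent", 42.0)
first = rule.check(monitor)    # below the threshold, so no alert

monitor.collect("cpu_percent", 97.0)
second = rule.check(monitor)   # above the threshold, so an alert message

# The customer asks to "turn off monitoring"; what changes is the alert.
rule.enabled = False
monitor.collect("cpu_percent", 99.0)
third = rule.check(monitor)    # no alert, but the sample was still stored
```

After the rule is disabled, `monitor.samples["cpu_percent"]` still holds all three readings. In this sketch, that is what a request to “turn off monitoring” usually amounts to: the notification goes quiet while the collection carries on.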