April 13, 2017 | Written by: Joydeep Banerjee
DevOps and Application Performance Management (APM) go hand in hand. I want to take you through a simple journey that shows why APM is such a key part of DevOps today. Let’s take a look at the typical metrics that need to be tracked and measured, as well as the key features an APM solution needs in a DevOps environment.
When we talk about DevOps today, we often also mean cloud, microservices, and cloud-level availability, such as 99.999 percent, which allows roughly 26.3 seconds of downtime per month. Microservice behavior is critical to DevOps success. In a DevOps environment, microservices must be able to report the following about themselves:
- Am I healthy?
- What is my latency?
- How many times do I connect to my dependent systems?
- What is the latency of each of those dependent connections?
- How many of those dependent connections succeed and fail?
- Am I doing the work I am supposed to do?
- How many customers do I have?
- Am I gaining new customers?
Microservices need to be built carefully so that these types of metrics are available for every microservice instance. Why? If you want to stay under that budget of 26.3 seconds of downtime per month, these metrics will help you restore service faster. Some of them are easier to measure than others. Capturing “am I doing the work I am supposed to do” may need some development work, depending on what your microservice does.
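To make the list of questions above concrete, here is a minimal sketch of a microservice recording those numbers about itself. Everything in it is an assumption for illustration: the `ServiceMetrics` class, its method names, and the health rule are hypothetical and do not come from any particular APM product.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class ServiceMetrics:
    """Hypothetical in-process recorder; all names here are illustrative."""

    def __init__(self):
        self.request_latencies = []   # "What is my latency?"
        self.work_items = 0           # "Am I doing the work I am supposed to do?"
        self.customers = set()        # "How many customers do I have?"
        self.new_customers = 0        # "Am I gaining new customers?"
        # Per dependency: call counts by outcome plus per-call latencies.
        self.dependencies = defaultdict(
            lambda: {"ok": 0, "failed": 0, "latencies": []}
        )

    @contextmanager
    def dependency(self, name):
        """Time one call to a dependent system and count success or failure."""
        stats = self.dependencies[name]
        start = time.perf_counter()
        try:
            yield
        except Exception:
            stats["failed"] += 1
            raise
        else:
            stats["ok"] += 1
        finally:
            stats["latencies"].append(time.perf_counter() - start)

    def record_request(self, customer_id, latency_seconds, work_items=1):
        """Record one handled request for the given customer."""
        self.request_latencies.append(latency_seconds)
        self.work_items += work_items
        if customer_id not in self.customers:
            self.customers.add(customer_id)
            self.new_customers += 1

    def snapshot(self):
        """Return the current answers; resets the new-customer counter."""
        count = len(self.request_latencies)
        failed = sum(d["failed"] for d in self.dependencies.values())
        total = failed + sum(d["ok"] for d in self.dependencies.values())
        new_customers, self.new_customers = self.new_customers, 0
        return {
            # Assumed health rule: fewer than half of dependency calls failed.
            "healthy": total == 0 or failed / total < 0.5,
            "requests": count,
            "avg_latency_seconds": sum(self.request_latencies) / count if count else 0.0,
            "work_items": self.work_items,
            "customers": len(self.customers),
            "new_customers": new_customers,
            "dependencies": {name: dict(d) for name, d in self.dependencies.items()},
        }
```

A handler would wrap each outbound call in `with metrics.dependency("db"):`, call `metrics.record_request(...)` when it finishes, and expose `metrics.snapshot()` to whatever collects the data. What counts as a "work item" is the part that depends on your microservice.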
Let’s now talk about two key features that an APM solution must have.
First, developers tend to leave pointers in the application log, such as how many times a certain kind of error occurred. This can be problematic because logs take longer than metrics to reach the server, and at these demanding availability levels every second counts. A better practice is to measure values such as latency in the microservice code itself and push them to the APM. The APM system then accepts these custom metrics and transports them to the server. That way, they can be analyzed and visualized just like the regular APM metrics.
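As a rough sketch of that first feature, the code below measures a value inside the service and hands it to the APM as a custom metric instead of writing it to a log. The `CustomMetricReporter` class and the payload fields are hypothetical, and `send` is a stand-in for whatever agent or ingest call your APM product actually provides.

```python
import json
import time
from contextlib import contextmanager


class CustomMetricReporter:
    """Buffers custom metrics and passes them to a transport in batches."""

    def __init__(self, send, service_name):
        self._send = send                  # assumed: callable taking a JSON string
        self._service_name = service_name
        self._buffer = []

    def record(self, name, value):
        """Queue one metric sample with the time it was measured."""
        self._buffer.append({
            "service": self._service_name,
            "metric": name,
            "value": value,
            "timestamp": time.time(),
        })

    @contextmanager
    def timed(self, name):
        """Measure the wrapped block and record its duration in seconds."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(name, time.perf_counter() - start)

    def flush(self):
        """Send everything buffered so far; returns the number of samples sent."""
        if not self._buffer:
            return 0
        batch, self._buffer = self._buffer, []
        self._send(json.dumps(batch))
        return len(batch)
```

In a service this might look like `with reporter.timed("charge_card_seconds"): ...` followed by a periodic `reporter.flush()`. Batching keeps the per-request overhead small, at the cost of a short delay that you control through the flush interval.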
Second, if something fails in the cloud, the standard response is to restart the component in an automated fashion. However, there is one set of problems that is very nasty and cannot be solved by restarts: the latency of your microservice suddenly goes south. This is where the APM tool makes a difference, because it captures a broad range of metrics like the ones mentioned above. APM can show metrics from different microservices side by side and help users isolate the faulty microservice or other kinks in the process.
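One simple way to picture that isolation step is to compare each microservice's current latency against its own baseline and rank the ones that have slowed down the most. This is only a sketch under assumptions: the function name, the use of the median, and the default factor of 2.0 are illustrative choices, not how any specific APM tool works.

```python
from statistics import median


def find_latency_regressions(baseline, current, factor=2.0):
    """Return (service, slowdown_ratio) pairs, worst first.

    baseline and current map a service name to a list of latency samples
    in seconds. A service is flagged when its current median latency is
    at least `factor` times its baseline median.
    """
    suspects = []
    for service, samples in current.items():
        base_samples = baseline.get(service)
        if not base_samples or not samples:
            continue  # nothing to compare against
        base_median = median(base_samples)
        if base_median <= 0:
            continue  # avoid dividing by zero on degenerate baselines
        ratio = median(samples) / base_median
        if ratio >= factor:
            suspects.append((service, ratio))
    return sorted(suspects, key=lambda item: item[1], reverse=True)
```

If a `checkout` service looks slow only because the `inventory` service it depends on has slowed down far more, `inventory` shows up at the top of the list, which is the kind of hint a restart would never give you.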
If you are in this business for serious production deployments, your development team has probably already embedded monitoring into the process. If not, get it done soon. Without APM it is much more difficult to guarantee 99.999 percent service levels.
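For reference, the downtime budget behind the 99.999 percent figure can be worked out in a few lines. The function name is hypothetical, and the sketch assumes an average month of 365.25 / 12 days.

```python
def downtime_budget_seconds(availability_percent, period_days=365.25 / 12):
    """Seconds of allowed downtime in a period at the given availability."""
    period_seconds = period_days * 24 * 60 * 60
    return period_seconds * (1 - availability_percent / 100)


# Five nines over an average month.
print(round(downtime_budget_seconds(99.999), 1))  # prints 26.3
```

Each nine you drop multiplies that budget by ten, which is why detection and isolation time matter so much at this level.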
Want to dig deeper? Check out this blog from my colleague, Mike Mallo, who explores how to drive DevOps transformation when developers own application monitoring. And read APM and DevOps: A Winning Combination to learn more.