What is Application Performance Management (APM)?

Application performance management (APM) software helps an organization ensure that critical applications meet established expectations for performance, availability and customer or user experience. It enables organizations to predict and prevent performance issues before they impact users or the business.

APM does this by measuring application performance, alerting administrators when performance baselines aren’t met, providing visibility into the root causes of performance issues and automatically resolving many of those issues before they reach users.

APM is also an abbreviation for application performance monitoring. The terms are often used interchangeably, but application performance monitoring is a component of application performance management: you must monitor performance in order to manage it.

Increasingly, application performance management solutions are evolving from relying on traditional application performance monitoring tools to incorporating observability, a performance data collection and analysis technology better suited to the complexity of modern, distributed cloud-native applications.

How APM works

APM gathers application performance data, analyzes it to detect potential performance problems, and provides information about those problems or accelerates their resolution. How that data is gathered and analyzed is the chief difference between application performance monitoring and observability.

Application performance monitoring

In application performance monitoring, agents are deployed throughout the application environment and supporting infrastructure to monitor performance by sampling performance and performance-related metrics (sometimes called telemetry) as often as once every minute. Types of monitoring these agents perform include:

Digital experience monitoring gathers performance metrics, such as load time, response time, uptime and downtime, from the user interface on the user device. (This used to be called user experience monitoring, but the term was broadened to acknowledge that non-human entities, such as robots or other software components, also interact with the application and have performance expectations of their own.) Digital experience monitoring usually supports real-user monitoring, which tracks the experience of an actual user on the system, and synthetic monitoring, which is used for performance testing in production and non-production environments.
Application monitoring covers the entire application stack (application framework such as Java or .NET, operating system, database, APIs, middleware, web application server and UI) as well as IT infrastructure monitoring that samples factors such as CPU utilization, disk space and network performance. Stack monitoring typically includes code-level tracing, which can help spot portions of code that might be causing a performance bottleneck.
Database monitoring samples the performance of SQL queries or procedures, in addition to the database monitoring provided by application monitoring agents.
Availability monitoring tracks the actual availability of application and hardware components (because applications can generate performance data even when they aren't accessible to the user).

In addition to collecting performance data, these agents perform user-defined transaction profiling, tracing each transaction from the user interface or device through every application component or resource involved in the transaction. This information is used to determine application dependencies and to create a topology map: a visualization of the dependencies between application and infrastructure components, ideally across on-premises, private cloud, public cloud (including any software-as-a-service or SaaS solutions) and hybrid cloud environments.
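
As a rough illustration of how traced transactions can yield a dependency map, the Python sketch below turns a few transaction paths into a caller-to-callee mapping. The component names and the trace format (an ordered list of the components each transaction touched) are assumptions made for this example, not the data model of any particular APM product.

```python
from collections import defaultdict

# Hypothetical traces: each list is the ordered path one transaction took.
traces = [
    ["browser", "web-server", "orders-api", "database"],
    ["browser", "web-server", "auth-service"],
    ["mobile-app", "orders-api", "database"],
]


def build_topology(traces):
    """Return {component: sorted list of components it calls directly}."""
    edges = defaultdict(set)
    for hops in traces:
        # Pair each component with the next one on the transaction path.
        for caller, callee in zip(hops, hops[1:]):
            edges[caller].add(callee)
    return {caller: sorted(callees) for caller, callees in edges.items()}


topology = build_topology(traces)
for caller, callees in sorted(topology.items()):
    print(f"{caller} -> {', '.join(callees)}")
```

A real topology map would also carry infrastructure relationships and be rendered visually; the mapping here captures only the direct call dependencies seen in the sample traces.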

APM solutions typically provide a controller and centralized dashboard where the collected performance metrics are aggregated, analyzed and compared to established baselines. The dashboard alerts system administrators to deviations from baselines that indicate actual or potential performance issues; it also provides contextual information and actionable insights administrators can use to troubleshoot and resolve the issues.
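
The comparison of collected metrics against baselines can be sketched in a few lines of Python. The transaction names, baseline values and 20% tolerance below are hypothetical, chosen only to illustrate the idea; actual APM tools determine baselines and alert thresholds in their own ways.

```python
from statistics import mean

# Hypothetical baseline response times in milliseconds (assumed values).
BASELINE_MS = {"checkout": 250.0, "search": 120.0}
# Assumed tolerance: alert when the mean exceeds the baseline by more than 20%.
TOLERANCE = 0.20


def check_against_baseline(samples_ms, baselines=BASELINE_MS, tolerance=TOLERANCE):
    """Return one alert message per transaction whose mean exceeds its limit."""
    alerts = []
    for name, samples in samples_ms.items():
        if not samples or name not in baselines:
            continue  # nothing to compare
        observed = mean(samples)
        limit = baselines[name] * (1 + tolerance)
        if observed > limit:
            alerts.append(f"{name}: mean {observed:.0f} ms exceeds limit {limit:.0f} ms")
    return alerts


alerts = check_against_baseline({
    "checkout": [240, 410, 390],  # slower than usual
    "search": [110, 125, 118],    # within tolerance
})
for alert in alerts:
    print(alert)
```

With these sample values only the checkout transaction triggers an alert, because its mean response time is above the tolerated limit while search stays within it.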

Observability

Periodic sampling is effective enough for monitoring and troubleshooting monolithic applications or traditional distributed applications, where new code is released periodically and workflows and dependencies between application components, servers and related resources are well-known or easy to trace.

But today, as organizations are adopting modern development practices and cloud-native technologies—Agile and DevOps methodologies, microservices, Docker containers, Kubernetes and serverless functions—they're deploying new application components so often, in so many places, in so many languages and for such widely varying periods of time that the once-a-minute data sampling of traditional monitoring solutions can't keep up.
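
To see why periodic sampling can miss short-lived components, consider the simplified Python model below. It assumes a sampler that polls at fixed 60-second intervals starting at time zero, and the component names and lifetimes are invented for illustration.

```python
def seen_by_sampler(start_s, end_s, interval_s=60):
    """True if a poll at 0, interval_s, 2*interval_s, ... falls in [start_s, end_s)."""
    # Ceiling division gives the first poll time at or after start_s.
    first_poll = -(-start_s // interval_s) * interval_s
    return first_poll < end_s


# Hypothetical component lifetimes as (start, end) in seconds.
lifetimes = {
    "long-running-vm": (0, 3600),
    "batch-container": (65, 110),
    "serverless-fn": (130, 131),
}

missed = [name for name, (start, end) in lifetimes.items()
          if not seen_by_sampler(start, end)]
print("Never sampled:", missed)
```

In this model the container and the serverless function both start and finish between two polls, so the sampler never records them, while the long-running virtual machine is sampled many times.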

Observability replaces traditional monitoring agents with instrumentation that collects performance and contextual data continuously, and uses machine-learning techniques to correlate and analyze the data in real time. With an observability solution, development, IT operations and site reliability engineering (SRE) teams can:

Discover and address 'unknown unknowns.' Traditional monitoring looks only for known deviations from known baselines. An observability platform's machine-learning functionality can detect patterns in performance telemetry to identify new deviations that correlate with performance problems.
Catch and resolve issues early in development. Observability lets DevOps teams bake monitoring into the early phases of the software development process, so that they can test, identify and fix issues in new code before they impact the customer experience or service level agreements (SLAs).
Scale observability automatically. For example, developers can specify observability instrumentation as part of a Kubernetes cluster configuration, so that any new cluster starts gathering telemetry from the moment it spins up, until it spins down.
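
The idea of flagging deviations without a predefined baseline can be illustrated with a much simpler stand-in for machine learning: a rolling z-score, which compares each new value with the statistics of the values just before it. The Python sketch below is only that simplified stand-in; the window size, threshold and latency values are assumptions for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indexes whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` values."""
    anomalies = []
    for i in range(window, len(values)):
        recent = values[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        # Skip perfectly flat windows to avoid dividing by zero.
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical latency telemetry in milliseconds, with one spike.
latency_ms = [100, 102, 98, 101, 99, 100, 103, 250, 101, 100]
anomalies = find_anomalies(latency_ms)
print("Anomalous sample indexes:", anomalies)
```

No fixed limit is configured here; the spike stands out because it differs sharply from recent behavior. Observability platforms apply far richer models across many correlated signals.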

Observability doesn’t replace monitoring; it enables better monitoring, and better APM.

AI and AIOps: The future of APM

Today, APM tools use observability and AI to varying degrees. Some combine traditional application performance monitoring with AI to automate the discovery of changing transaction paths and application dependencies. Others combine observability with AI to automatically determine performance baselines, and to sift signals, or actionable insights, from the 'noise' of IT operations management (ITOM) data. Industry analyst Gartner finds that organizations can realize a "60% noise reduction in ITOM through use of AI-augmented tools."

The ultimate goal, and the future of APM and IT operations, is to combine observability with artificial intelligence for IT operations, or AIOps, to create self-healing, self-optimizing infrastructure. Together, the steady stream of real-time observability telemetry and AIOps machine learning and automation can predict application performance issues based on system outputs, resolve them before they impact the user experience or operations, and even take actions to optimize application performance, all without management intervention.