What is application performance management (APM)?
Application performance management (APM) enables organizations to predict and prevent performance issues before they impact users or the business.
What is application performance management?

Application performance management (APM) software helps an organization ensure that its critical applications meet established expectations for performance, availability and customer or end-user experience. It does this by measuring application performance, alerting administrators when performance baselines aren’t met, providing visibility into root causes of performance issues, and automatically resolving many performance issues before they impact users or the business.

APM is also an abbreviation for application performance monitoring. The terms are often used interchangeably, but application performance monitoring is actually a component of many application performance management solutions; after all, you have to monitor performance to manage it.

Increasingly, however, application performance management solutions are evolving from relying on traditional application performance monitoring tools to incorporating observability, a performance data collection and analysis technology better suited to the complexity of modern, distributed cloud-native applications. 


How APM works

As noted above, APM gathers software application performance data, analyzes it to detect potential performance problems, and provides information or takes action to accelerate resolution of those problems. Application performance monitoring and observability differ chiefly in how they gather and analyze that data.

Application performance monitoring

In application performance monitoring, agents are deployed throughout the application environment and supporting infrastructure to 'monitor' performance by sampling performance and performance-related metrics (sometimes called telemetry), usually as frequently as once every minute. Types of monitoring these agents perform include:

  • Digital experience monitoring gathers performance metrics - such as load time, response time, uptime, downtime - from the user interface on the end-user device. (This used to be called end-user experience monitoring, but was broadened to acknowledge that non-human entities, such as robots or other software components, also interact with the application and have performance expectations of their own.) Digital experience monitoring usually supports real-user monitoring, which monitors the experience of an actual user on the system, and synthetic monitoring, for performance testing in production and non-production environments.

  • Application monitoring includes monitoring of the entire application stack - application framework (e.g., Java or .NET), operating system, database, APIs, middleware, web application server, UI - as well as IT infrastructure monitoring that samples factors such as CPU utilization, disk space, and network performance. Stack monitoring typically includes code-level tracing, which can help spot portions of code that might be causing a performance bottleneck.

  • Database monitoring samples performance of SQL queries or procedures, in addition to the database monitoring provided by application monitoring agents.

  • Availability monitoring monitors the actual availability of application and hardware components (because applications can generate performance data even when they aren't accessible to the end user).
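
The periodic sampling these agents perform can be sketched roughly as follows. This is a minimal illustration, not any vendor's agent: the `run_agent` function, the `Sample` type and the metric names are hypothetical, and the clock and sleep functions are injectable only so the loop can be exercised without waiting.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Sample:
    timestamp: float
    metrics: Dict[str, float]


def run_agent(
    read_metrics: Callable[[], Dict[str, float]],
    interval_s: float,
    samples: int,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> List[Sample]:
    """Take `samples` readings, pausing `interval_s` seconds between them."""
    collected: List[Sample] = []
    for i in range(samples):
        collected.append(Sample(clock(), read_metrics()))
        if i < samples - 1:  # no need to wait after the final reading
            sleep(interval_s)
    return collected


if __name__ == "__main__":
    # Hypothetical metric reader; a real agent would query the OS or runtime.
    def fake_reader() -> Dict[str, float]:
        return {"cpu_pct": 42.0, "response_ms": 180.0}

    # A no-op sleep keeps the demo instant; a real agent would actually wait.
    for sample in run_agent(fake_reader, 60.0, 3, sleep=lambda s: None):
        print(sample)
```

Anything that starts and stops between two readings never appears in the collected samples, which is the limitation the observability section returns to.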

In addition to collecting performance data, these agents perform user-defined transaction profiling, tracing each transaction from the end-user UI or device through every application component or resource involved in the transaction. This information is used to determine application dependencies, and to create a topology map - a visualization of the dependencies between application and infrastructure components, ideally across on-premises, private cloud, public cloud (including any software-as-a-service or SaaS solutions) and hybrid cloud environments.
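
As a rough sketch of how traced transactions can be turned into a dependency map, the example below assumes each trace is simply an ordered list of the components a transaction passed through. That trace format, the `build_topology` function and the component names are all assumptions made for illustration; real APM tools record far richer trace data.

```python
from typing import Dict, Iterable, List, Set


def build_topology(traces: Iterable[List[str]]) -> Dict[str, Set[str]]:
    """Map each component to the set of components it was seen calling."""
    topology: Dict[str, Set[str]] = {}
    for path in traces:
        # Register every component, so leaf nodes appear with no dependencies.
        for component in path:
            topology.setdefault(component, set())
        # Each consecutive pair in the path is one caller -> callee edge.
        for caller, callee in zip(path, path[1:]):
            topology[caller].add(callee)
    return topology


if __name__ == "__main__":
    traces = [
        ["browser", "web", "orders", "db"],
        ["browser", "web", "auth"],
    ]
    for component, deps in build_topology(traces).items():
        print(component, "->", sorted(deps))
```

A dictionary of sets keeps the sketch small; the same edges could feed a graph library or a visualization layer.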

APM solutions typically provide a controller and centralized dashboard where the collected performance metrics are aggregated, analyzed and compared to established baselines. The dashboard alerts system administrators to deviations from baselines that indicate actual or potential performance issues; it also provides contextual information and actionable insights administrators can use to troubleshoot and resolve the issues.
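
The comparison against baselines can be illustrated with a small sketch. It assumes that higher values are worse for every metric and that a fixed percentage tolerance is acceptable; the `find_deviations` function, the metric names and the 20% default are hypothetical choices, not how any particular dashboard works.

```python
from typing import Dict


def find_deviations(
    current: Dict[str, float],
    baselines: Dict[str, float],
    tolerance: float = 0.20,
) -> Dict[str, float]:
    """Return metrics exceeding baseline by more than `tolerance`,
    mapped to their relative deviation (0.5 means 50% above baseline)."""
    deviations: Dict[str, float] = {}
    for name, baseline in baselines.items():
        value = current.get(name)
        if value is None or baseline <= 0:
            continue  # nothing to compare, or a baseline that cannot be scaled
        if value > baseline * (1 + tolerance):
            deviations[name] = (value - baseline) / baseline
    return deviations


if __name__ == "__main__":
    baselines = {"response_ms": 200.0, "cpu_pct": 50.0}
    current = {"response_ms": 300.0, "cpu_pct": 55.0}
    for metric, ratio in find_deviations(current, baselines).items():
        print(f"ALERT {metric}: {ratio:.0%} above baseline")
```

With these sample numbers only the response time is flagged, because CPU utilization stays within the assumed 20% tolerance.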

Observability

Periodic sampling is effective enough for monitoring and troubleshooting monolithic applications or traditional distributed applications, where new code is released periodically and workflows and dependencies between application components, servers and related resources are well-known or easy to trace.

But today, as organizations adopt modern development practices and cloud-native technologies (Agile and DevOps methodologies, microservices, Docker containers, Kubernetes and serverless functions), they're deploying new application components so often, in so many places, in so many languages and for such widely varying periods of time that the once-a-minute data sampling of traditional monitoring solutions can't keep up.
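
A small calculation shows why fixed-interval sampling can miss short-lived components. The sketch assumes samples are taken at exact multiples of the interval (0, 60, 120 seconds and so on) and that a component is seen only if a sample lands while it is running; the function names and lifetimes are invented for illustration.

```python
import math
from typing import Dict, List, Tuple


def is_sampled(start_s: int, end_s: int, interval_s: int) -> bool:
    """True if a sampling tick (0, interval, 2*interval, ...) falls in [start_s, end_s)."""
    first_tick = math.ceil(start_s / interval_s) * interval_s
    return first_tick < end_s


def missed_components(
    lifetimes: Dict[str, Tuple[int, int]], interval_s: int
) -> List[str]:
    """Names of components that start and stop between two sampling ticks."""
    return [
        name
        for name, (start_s, end_s) in lifetimes.items()
        if not is_sampled(start_s, end_s, interval_s)
    ]


if __name__ == "__main__":
    # Hypothetical lifetimes in seconds: (start, end).
    lifetimes = {
        "long-lived-service": (0, 3600),
        "batch-container": (50, 70),      # spans the tick at t=60
        "serverless-call": (65, 110),     # lives entirely between ticks
    }
    print(missed_components(lifetimes, interval_s=60))
```

Under these assumptions the function that runs from second 65 to second 110 is never observed by a once-a-minute sampler, even though it ran for 45 seconds.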

Observability replaces traditional monitoring agents with instrumentation that collects performance and contextual data non-stop, and uses machine-learning techniques to correlate and analyze the data in real time. With an observability solution, development, IT operations and site reliability engineering (SRE) teams can:

  • Discover and address 'unknown unknowns.' Traditional monitoring looks only for known deviations from known baselines. An observability platform's machine-learning functionality can detect patterns in performance telemetry to identify new deviations that correlate with performance problems.

  • Catch and resolve issues early in development. Observability lets DevOps teams bake monitoring into the early phases of the software development process, so they can test, identify and fix issues in new code before they impact the customer experience or service level agreements (SLAs).

  • Scale observability automatically. For example, developers can specify observability instrumentation as part of a Kubernetes cluster configuration, so that any new cluster starts gathering telemetry from the moment it spins up, until it spins down.

Observability doesn’t replace monitoring; it enables better monitoring, and better APM.
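
To make the idea of finding deviations without a predefined baseline more concrete, here is a deliberately simple statistical stand-in. It is not machine learning and not what any observability platform actually runs: it flags a value when it sits far from the mean of the readings just before it, using a rolling z-score. The function name, window size and threshold are assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def rolling_zscore_anomalies(
    values: Sequence[float], window: int = 5, threshold: float = 3.0
) -> List[int]:
    """Return indexes whose value is more than `threshold` standard
    deviations from the mean of the preceding `window` values."""
    anomalies: List[int] = []
    for i in range(window, len(values)):
        recent = values[i - window:i]
        spread = pstdev(recent)
        if spread == 0:
            continue  # perfectly flat history: z-score is undefined, so skip
        if abs(values[i] - mean(recent)) / spread > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical response times in milliseconds, with one spike.
    response_ms = [100, 102, 98, 101, 99, 100, 250, 101]
    print(rolling_zscore_anomalies(response_ms))
```

The baseline here is derived from the data itself rather than configured in advance, which is the property the sketch is meant to show; production systems use far more robust models.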


AI and AIOps: The future of APM

Today, APM tools are leveraging observability and AI to varying degrees. Some are combining traditional application performance monitoring with AI to automate discovery of changing transaction paths and application dependencies. Others are combining observability with AI to automatically determine performance baselines, and to sift signals, or actionable insights, from the 'noise' of IT operations management (ITOM) data. Industry analyst Gartner finds that organizations can realize a "60% noise reduction in ITOM through use of AI-augmented tools."

The ultimate goal—and the future of APM and IT operations—is to combine observability with artificial intelligence for IT operations, or AIOps, to create self-healing, self-optimizing infrastructure. Together, the steady stream of real-time observability telemetry and AIOps machine learning and automation can predict application performance issues based on system outputs, resolve them before they impact the end-user experience or operations, and even take actions to optimize application performance - all without management intervention.
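
The predict-then-resolve loop can be sketched in miniature. The example below assumes evenly spaced samples and a simple least-squares trend line as the "prediction"; real AIOps platforms use much richer models. The `forecast`, `auto_remediate` and `scale_out` names, and the thresholds, are hypothetical.

```python
from typing import Callable, Sequence


def forecast(samples: Sequence[float], steps_ahead: int) -> float:
    """Extrapolate a least-squares line through evenly spaced samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


def auto_remediate(
    samples: Sequence[float],
    threshold: float,
    steps_ahead: int,
    remediate: Callable[[], None],
) -> bool:
    """Run `remediate` if the trend is predicted to reach `threshold`."""
    if forecast(samples, steps_ahead) >= threshold:
        remediate()
        return True
    return False


if __name__ == "__main__":
    def scale_out() -> None:  # hypothetical remediation action
        print("adding capacity before users are affected")

    cpu_pct = [50.0, 55.0, 60.0, 65.0, 70.0]
    auto_remediate(cpu_pct, threshold=80.0, steps_ahead=3, remediate=scale_out)
```

Passing the remediation in as a callback keeps the prediction logic separate from whatever action an operations team decides is safe to automate.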


Related solutions

IBM Cloud Pak® for Watson AIOps

Innovate faster, reduce operational cost and transform IT operations (ITOps) with an AIOps platform that delivers visibility into performance data and dependencies across environments.


IBM Observability with Instana

Discover the leading enterprise observability platform for hybrid clouds. Improve application performance management and accelerate CI/CD pipelines no matter where applications reside.

