What is application performance management (APM)?
What is APM?

Application performance management (APM) software helps an organization ensure that critical applications meet established expectations for performance, availability and customer or user experience. It enables organizations to predict and prevent performance issues before they impact users or the business.

APM does this by measuring application performance, alerting administrators when performance baselines aren’t met, providing visibility into root causes of performance issues and automatically resolving many performance issues before they impact users or the business.

APM is also an abbreviation for application performance monitoring. The two terms are often used interchangeably, but application performance monitoring is really one component of application performance management: you must monitor performance before you can manage it.

Increasingly, application performance management solutions are evolving from relying on traditional application performance monitoring tools to incorporating observability, a performance data collection and analysis technology better suited to the complexity of modern, distributed cloud-native applications.

How APM works

APM gathers software application performance data, analyzes it to detect potential performance problems, and either provides information for resolving those problems or resolves them automatically. Application performance monitoring and observability differ chiefly in how they gather and analyze that data.

Application performance monitoring

In application performance monitoring, agents are deployed throughout the application environment and supporting infrastructure to 'monitor' performance by sampling performance and performance-related metrics (sometimes called telemetry) as frequently as once every minute. Types of monitoring these agents perform include:

  • Digital experience monitoring gathers performance metrics - such as load time, response time, uptime, downtime - from the user interface on the user device. (This used to be called user experience monitoring, but was broadened to acknowledge that non-human entities, such as robots or other software components, also interact with the application and have performance expectations of their own). Digital experience monitoring usually supports real-user monitoring, which monitors the experience of an actual user on the system, and synthetic monitoring for performance testing in production and non-production environments.

  • Application monitoring includes monitoring of the entire application stack - application framework (for example, Java or .NET), operating system, database, APIs, middleware, web application server, UI - as well as IT infrastructure monitoring that samples factors such as CPU utilization, disk space and network performance. Stack monitoring typically includes code-level tracing, which can help spot portions of code that might be causing a performance bottleneck.

  • Database monitoring samples performance of SQL queries or procedures, in addition to the database monitoring provided by application monitoring agents.

  • Availability monitoring monitors the actual availability of application and hardware components (because applications can generate performance data even when they aren't accessible to the user).

In addition to collecting performance data, these agents perform user-defined transaction profiling, tracing each transaction from the user interface or device through every application component or resource involved in the transaction. This information is used to determine application dependencies and to create a topology map - a visualization of the dependencies between application and infrastructure components, ideally across on-premises, private cloud, public cloud (including any software-as-a-service or SaaS solutions) and hybrid cloud environments.
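
As a rough illustration of how traced transactions can be turned into a topology map, the sketch below derives caller-to-callee dependencies from a few transaction paths. The trace format, the component names and the build_topology helper are all hypothetical, chosen only for illustration; real APM agents record far richer trace data.

```python
from collections import defaultdict

# Hypothetical trace records: each transaction is the ordered list of
# components it passed through (all names are illustrative only).
traces = [
    ["browser", "web-server", "orders-api", "orders-db"],
    ["browser", "web-server", "orders-api", "payment-gateway"],
    ["mobile-app", "web-server", "catalog-api", "catalog-db"],
]


def build_topology(traces):
    """Return a mapping of component -> sorted list of components it calls directly."""
    topology = defaultdict(set)
    for path in traces:
        # Pair each component with the next one in the transaction path.
        for caller, callee in zip(path, path[1:]):
            topology[caller].add(callee)
    return {component: sorted(callees) for component, callees in topology.items()}


topology = build_topology(traces)
for component, callees in topology.items():
    print(f"{component} -> {', '.join(callees)}")
```

Each key in the resulting dictionary is a node in the map and each listed callee is an edge, which is the minimum a dashboard would need to draw the dependency graph.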

APM solutions typically provide a controller and centralized dashboard where the collected performance metrics are aggregated, analyzed and compared to established baselines. The dashboard alerts system administrators to deviations from baselines that indicate actual or potential performance issues; it also provides contextual information and actionable insights administrators can use to troubleshoot and resolve the issues.
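
The baseline comparison described above can be sketched in a few lines. This is a minimal example under stated assumptions: the metric names, the threshold values and the check_against_baselines helper are hypothetical, and a real APM controller would typically use richer baselines (for example, time-of-day profiles) than a single fixed limit per metric.

```python
from statistics import mean

# Hypothetical baselines: the maximum acceptable average for each metric.
BASELINES = {
    "response_time_ms": 300.0,
    "error_rate_pct": 1.0,
    "cpu_util_pct": 85.0,
}


def check_against_baselines(samples, baselines=BASELINES):
    """Average each metric's samples and return alerts for metrics above baseline."""
    alerts = []
    for metric, values in samples.items():
        if metric not in baselines or not values:
            continue  # no baseline defined, or nothing sampled yet
        observed = mean(values)
        if observed > baselines[metric]:
            alerts.append((metric, observed, baselines[metric]))
    return alerts


# Illustrative samples collected by agents over one reporting interval.
samples = {
    "response_time_ms": [280.0, 350.0, 420.0],
    "error_rate_pct": [0.2, 0.4, 0.3],
    "cpu_util_pct": [70.0, 75.0, 80.0],
}

alerts = check_against_baselines(samples)
for metric, observed, limit in alerts:
    print(f"ALERT {metric}: observed {observed:.1f} exceeds baseline {limit:.1f}")
```

With these sample values only the response time averages above its limit, so it is the only metric that raises an alert.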

Observability

Periodic sampling is effective enough for monitoring and troubleshooting monolithic applications or traditional distributed applications, where new code is released periodically and workflows and dependencies between application components, servers and related resources are well-known or easy to trace.

But today, as organizations are adopting modern development practices and cloud-native technologies—Agile and DevOps methodologies, microservices, Docker containers, Kubernetes and serverless functions—they're deploying new application components so often, in so many places, in so many languages and for such widely varying periods of time that the once-a-minute data sampling of traditional monitoring solutions can't keep up.
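
A small simulation can make the sampling gap concrete. The sketch below assumes an idealized sampler that ticks at fixed intervals starting at time zero; the workload names and lifetimes are hypothetical. Any component that starts and stops between two ticks is never observed.

```python
def seen_by_sampler(start_s, end_s, interval_s=60):
    """Return True if a sample tick (0, interval, 2*interval, ...) falls in [start_s, end_s)."""
    # Round start_s up to the next tick using integer ceiling division.
    first_tick_at_or_after_start = -(-start_s // interval_s) * interval_s
    return first_tick_at_or_after_start < end_s


# Hypothetical workloads with (start, end) lifetimes in seconds.
workloads = {
    "long-running-service": (0, 3600),
    "batch-job": (70, 200),
    "serverless-fn-a": (5, 20),
    "serverless-fn-b": (61, 119),
}

missed = [
    name
    for name, (start, end) in workloads.items()
    if not seen_by_sampler(start, end)
]
print("Never sampled:", missed)
```

Under these assumptions both short-lived functions fall entirely between ticks and are never sampled, while the longer-lived workloads are each caught at least once.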

Observability replaces traditional monitoring agents with instrumentation that collects performance and contextual data continuously, and uses machine-learning techniques to correlate and analyze the data in real time. With an observability solution, development, IT operations and site reliability engineering (SRE) teams can:

  • Discover and address 'unknown unknowns.' Traditional monitoring looks only for known deviations from known baselines. An observability platform's machine-learning functionality can detect patterns in performance telemetry to identify new deviations that correlate with performance problems.

  • Catch and resolve issues early in development. Observability lets DevOps teams bake monitoring into the early phases of the software development process, so that they can test, identify and fix issues in new code before they impact the customer experience or service level agreements (SLAs).

  • Scale observability automatically. For example, developers can specify observability instrumentation as part of a Kubernetes cluster configuration, so that any new cluster starts gathering telemetry from the moment it spins up, until it spins down.
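
To give a feel for detecting deviations without a predefined baseline, the sketch below flags points in a continuous telemetry stream that sit far from the rolling mean of the preceding values. This is a deliberately simple statistical stand-in, not the machine-learning approach an observability platform would actually use; the detect_anomalies function, the window size, the threshold and the latency values are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(stream, window=5, threshold=3.0):
    """Return (index, value) pairs that deviate strongly from the preceding window."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(stream):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Flag the point if it lies more than `threshold` standard
            # deviations from the rolling mean of the previous values.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append((i, value))
        recent.append(value)
    return anomalies


# Illustrative latency telemetry in milliseconds, with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 103, 97, 250, 101, 99]
anomalies = detect_anomalies(latencies_ms)
print(anomalies)
```

The learned "normal" here is just the recent window, so no one has to specify in advance what a healthy latency looks like. Note that the spike itself enters the window afterward and temporarily widens the tolerance, one of several limitations a production system would need to handle.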

Observability doesn’t replace monitoring; it enables better monitoring, and better APM.

AI and AIOps: The future of APM

Today, APM tools use observability and AI to varying degrees. Some combine traditional application performance monitoring with AI to automate the discovery of changing transaction paths and application dependencies. Others combine observability with AI to automatically determine performance baselines, and to sift signals, or actionable insights, from the 'noise' of IT operations management (ITOM) data. Industry analyst Gartner finds that organizations can realize a "60% noise reduction in ITOM through use of AI-augmented tools."

The ultimate goal—and the future of APM and IT operations—is to combine observability with artificial intelligence for IT operations, or AIOps, to create self-healing, self-optimizing infrastructure. Together, the steady stream of real-time observability telemetry and AIOps machine learning and automation can predict application performance issues based on system outputs, resolve them before they impact the user experience or operations and even take actions to optimize application performance - all without management intervention.
