Application Performance Management (APM)
What is Application Performance Management (APM)?
Application performance management (APM) software helps an organization ensure that its critical applications meet established expectations for performance, availability and customer or end-user experience. It does this by measuring application performance, alerting administrators when performance baselines aren’t met, providing visibility into root causes of performance issues, and automatically resolving many performance issues before they impact users or the business.
APM is also an abbreviation for application performance monitoring. The terms are often used interchangeably, but application performance monitoring is actually a component of many application performance management solutions; after all, you have to monitor performance to manage it.
Increasingly, however, application performance management solutions are evolving from relying on traditional application performance monitoring tools to incorporating observability, a performance data collection and analysis technology better suited to the complexity of modern, distributed cloud-native applications.
How APM works
APM gathers software application performance data, analyzes it to detect potential performance problems, and provides information or takes action to accelerate resolution of those problems. How a solution gathers and analyzes that data is the chief difference between application performance monitoring and observability.
Application performance monitoring
In application performance monitoring, agents are deployed throughout the application environment and supporting infrastructure to 'monitor' performance by sampling performance and performance-related metrics (sometimes called telemetry), usually as frequently as once every minute. These agents perform several types of monitoring:
- Digital experience monitoring gathers performance metrics - such as load time, response time, uptime and downtime - from the user interface on the end-user device. (This used to be called end-user experience monitoring, but was broadened to acknowledge that non-human entities, such as robots or other software components, also interact with the application and have performance expectations of their own.) Digital experience monitoring usually supports real-user monitoring, which monitors the experience of an actual user on the system, and synthetic monitoring, for performance testing in production and non-production environments.
- Application monitoring includes monitoring of the entire application stack - application framework (e.g., Java or .NET), operating system, database, APIs, middleware, web application server, UI - as well as IT infrastructure monitoring that samples factors such as CPU utilization, disk space, and network performance. Stack monitoring typically includes code-level tracing, which can help spot portions of code that might be causing a performance bottleneck.
- Database monitoring samples performance of SQL queries or procedures, in addition to the database monitoring provided by application monitoring agents.
- Availability monitoring monitors the actual availability of application and hardware components (because applications can generate performance data even when they aren't accessible to the end user).
In addition to collecting performance data, these agents perform user-defined transaction profiling, tracing each transaction from the end-user UI or device through every application component or resource involved in the transaction. This information is used to determine application dependencies, and to create a topology map - a visualization of the dependencies between application and infrastructure components, ideally across on-premises, private cloud, public cloud (including any software-as-a-service or SaaS solutions) and hybrid cloud environments.
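As a rough illustration of how traced transactions can yield a dependency map, the sketch below assumes each trace is recorded as an ordered list of component names, where each component calls the next. That trace format, the component names and the `build_topology` helper are all hypothetical; real APM agents emit much richer data.

```python
from collections import defaultdict


def build_topology(traces):
    """Derive a caller -> callees dependency map from traced transactions.

    Each trace is an ordered list of component names (a hypothetical,
    simplified format) in which every component calls the next one.
    """
    topology = defaultdict(set)
    for trace in traces:
        # Pair each component with the one it calls next in the transaction.
        for caller, callee in zip(trace, trace[1:]):
            topology[caller].add(callee)
    # Plain dict with sorted lists, for stable display.
    return {caller: sorted(callees) for caller, callees in topology.items()}


# Hypothetical traces spanning on-premises and SaaS components.
traces = [
    ["browser", "web-server", "orders-api", "orders-db"],
    ["browser", "web-server", "payments-api", "payments-saas"],
    ["mobile-app", "orders-api", "orders-db"],
]

topology = build_topology(traces)
for caller, callees in sorted(topology.items()):
    print(f"{caller} -> {', '.join(callees)}")
```

Components that only ever appear at the end of a trace, such as the database here, show up as dependencies of other components but have no outgoing edges of their own.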
APM solutions typically provide a controller and centralized dashboard where the collected performance metrics are aggregated, analyzed and compared to established baselines. The dashboard alerts system administrators to deviations from baselines that indicate actual or potential performance issues; it also provides contextual information and actionable insights administrators can use to troubleshoot and resolve the issues.
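The baseline comparison described above might look something like the following sketch. The `Baseline` type, the metric names and the tolerance values are assumptions made for illustration, not part of any particular APM product.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    metric: str
    expected: float   # established baseline value (assumed non-zero)
    tolerance: float  # allowed fractional deviation, e.g. 0.25 = 25%


def check_against_baselines(samples, baselines):
    """Return alert messages for metrics that deviate beyond tolerance."""
    alerts = []
    for baseline in baselines:
        observed = samples.get(baseline.metric)
        if observed is None:
            continue  # no sample was collected for this metric
        deviation = abs(observed - baseline.expected) / baseline.expected
        if deviation > baseline.tolerance:
            alerts.append(
                f"{baseline.metric}: observed {observed}, "
                f"baseline {baseline.expected} ({deviation:.0%} deviation)"
            )
    return alerts


# Hypothetical baselines and one round of collected samples.
baselines = [
    Baseline("response_time_ms", expected=200.0, tolerance=0.25),
    Baseline("cpu_utilization_pct", expected=50.0, tolerance=0.40),
]
samples = {"response_time_ms": 320.0, "cpu_utilization_pct": 55.0}

alerts = check_against_baselines(samples, baselines)
for alert in alerts:
    print("ALERT", alert)
```

Here only the response time triggers an alert; the CPU reading is within its tolerance band and stays silent.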
Periodic sampling is effective enough for monitoring and troubleshooting monolithic applications or traditional distributed applications, where new code is released periodically and workflows and dependencies between application components, servers and related resources are well-known or easy to trace.
But today, as organizations adopt modern development practices and cloud-native technologies - Agile and DevOps methodologies, microservices, Docker containers, Kubernetes and serverless functions - they're deploying new application components so often, in so many places, in so many languages and for such widely varying periods of time that the once-a-minute data sampling of traditional monitoring solutions can't keep up.
Observability replaces traditional monitoring agents with instrumentation that collects performance and contextual data non-stop, and uses machine-learning techniques to correlate and analyze the data in real time. With an observability solution, development, IT operations and site reliability engineering (SRE) teams can:
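To make the sampling gap concrete, the sketch below checks which components are alive at any sampling instant. The component names and lifetimes (in seconds) are invented for illustration, and the model assumes a sampler that polls at fixed instants starting at time zero.

```python
def observed_by_sampling(lifetimes, interval_s, horizon_s):
    """Return names of components alive at one or more sampling instants.

    lifetimes maps a component name to (start, end) in seconds; a
    component counts as alive at time t when start <= t < end.
    """
    sample_times = range(0, horizon_s + 1, interval_s)
    seen = set()
    for name, (start, end) in lifetimes.items():
        if any(start <= t < end for t in sample_times):
            seen.add(name)
    return seen


# Hypothetical component lifetimes over a five-minute window.
lifetimes = {
    "monolith": (0, 300),      # long-running application
    "container-a": (65, 110),  # lives 45 seconds between two samples
    "function-b": (121, 123),  # serverless function, 2 seconds
    "container-c": (170, 245),
}

seen = observed_by_sampling(lifetimes, interval_s=60, horizon_s=300)
missed = sorted(set(lifetimes) - seen)
print("never sampled:", missed)
```

With a 60-second interval, the two short-lived components start and finish between samples, so a periodic sampler would never record them.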
- Discover and address 'unknown unknowns.' Traditional monitoring looks only for known deviations from known baselines. An observability platform's machine-learning functionality can detect patterns in performance telemetry to identify new deviations that correlate with performance problems.
- Catch and resolve issues early in development. Observability lets DevOps teams bake monitoring into the early phases of the software development process, so they can test, identify and fix issues in new code before they impact the customer experience or SLAs.
- Scale observability automatically. For example, developers can specify observability instrumentation as part of a Kubernetes cluster configuration, so that any new cluster starts gathering telemetry from the moment it spins up, until it spins down.
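The idea of spotting deviations without a predefined baseline can be sketched with a simple rolling z-score over a telemetry stream. This is a basic statistical stand-in, not the machine-learning models an observability platform would actually use; the class name, window size and threshold are assumptions.

```python
from collections import deque
from statistics import mean, stdev


class StreamingAnomalyDetector:
    """Flag values far from the recent window's mean (rolling z-score)."""

    def __init__(self, window=20, threshold=3.0):
        self.values = deque(maxlen=window)  # recent "normal" observations
        self.threshold = threshold          # z-score above which we flag

    def observe(self, value):
        """Return True if value is anomalous relative to the recent window."""
        is_anomaly = False
        if len(self.values) >= 5:  # wait for a few points before judging
            mu = mean(self.values)
            sigma = stdev(self.values)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            self.values.append(value)  # learn only from non-flagged points
        return is_anomaly


# Hypothetical latency telemetry with one spike.
detector = StreamingAnomalyDetector(window=20, threshold=3.0)
latencies_ms = [100, 102, 98, 101, 99, 103, 97, 100, 250, 101, 99]
anomalies = [(i, v) for i, v in enumerate(latencies_ms) if detector.observe(v)]
print(anomalies)
```

The detector learns what "normal" looks like from the stream itself, so no one has to define the baseline in advance. Excluding flagged points from the window keeps a single spike from distorting later judgments.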
Observability doesn’t replace monitoring; it enables better monitoring, and better APM.
AI and AIOps: The future of APM
Today, APM tools are leveraging observability and AI to varying degrees. Some are combining traditional application performance monitoring with AI to automate discovery of changing transaction paths and application dependencies. Others are combining observability with AI to automatically determine performance baselines, and to sift signals, or actionable insights, from the 'noise' of IT operations management (ITOM) data. Industry analyst Gartner finds that organizations can realize a "60% noise reduction in ITOM through use of AI-augmented tools."(1)
The ultimate goal - and the future of APM and IT operations - is to combine observability with artificial intelligence for IT operations, or AIOps, to create self-healing, self-optimizing infrastructure. Together, the steady stream of real-time observability telemetry and AIOps machine learning and automation can predict application performance issues based on system outputs, resolve them before they impact the end-user experience or operations, and even take actions to optimize application performance - all without management intervention.
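A heavily simplified sketch of the predict-then-act loop is shown below. It assumes a naive linear extrapolation in place of real AIOps machine learning, and the `remediate` function, the action name and the thresholds are hypothetical placeholders rather than any product's API.

```python
def predict_breach(samples, limit, horizon):
    """Linearly extrapolate a metric; True if it would exceed limit.

    samples are evenly spaced readings; horizon is how many further
    intervals to look ahead.
    """
    if len(samples) < 2:
        return False  # not enough data to estimate a trend
    slope = (samples[-1] - samples[0]) / (len(samples) - 1)
    forecast = samples[-1] + slope * horizon
    return forecast > limit


def remediate(action_log, action):
    # Placeholder: a real system would call an orchestration API here.
    action_log.append(action)


action_log = []
memory_pct = [60, 64, 69, 73, 78]  # hypothetical memory utilization readings

# Act before the limit is reached rather than after an alert fires.
if predict_breach(memory_pct, limit=90, horizon=3):
    remediate(action_log, "scale_out:orders-api")

print(action_log)
```

The point of the sketch is the ordering: the action is triggered by a forecast, while current utilization is still below the limit, which is what distinguishes prediction from threshold alerting.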
APM and IBM Cloud®
Application performance management is just one part of modernizing your organization as the need for automation widens across business and IT operations. A move toward greater automation should start with small, measurably successful projects, which you can then scale and optimize for other processes and in other parts of your organization.
Working with IBM, you’ll have access to AI-powered automation capabilities, including prebuilt workflows, to make every IT services process more intelligent, freeing up teams to focus on the most important IT issues and accelerate innovation.
Take the next step:
- IBM Cloud Pak® for Watson AIOps uses machine learning and natural language understanding to correlate structured and unstructured data across your operations toolchain in real time to uncover hidden insights and help identify root causes faster. Eliminating the need for multiple dashboards, IBM Cloud Pak for Watson AIOps feeds insights and recommendations directly into your team workflows to speed incident resolution.
- See IBM Cloud Pak for Watson AIOps in action by watching the demos.
- IBM Observability with Instana offers industry-leading AI-powered automation capabilities to manage the complexity of modern applications that span hybrid cloud landscapes. Instana combines with IBM Cloud Pak for Watson AIOps to provide a leading observability platform for automated remediation, powered by a continuous stream of contextualized telemetry data.
- Register to download the Gartner report and discover how to future-proof your IT operations with AI.
- Download the IBM Cloud infographic (PDF, 467 KB) that shows the benefits of AI-powered automation for IT operations.
- Read about the five “must-haves” for automation success (link resides outside IBM) in this HFS Research report.
Get started with an IBM Cloud account today.