What is AWS monitoring?

Published 30 October 2025

Authors

Gregg Lindemulder

Staff Writer

IBM Think

Annie Badman

Staff Writer

IBM Think

What is AWS monitoring?

AWS monitoring is the process of tracking, collecting and analyzing data from the Amazon Web Services (AWS) cloud computing platform. It helps optimize AWS resources, identify performance issues, manage costs and maintain secure cloud infrastructure.

With over 30% of the cloud market, AWS is the largest cloud infrastructure provider.¹ Companies use AWS for web hosting, data storage, big data analytics, mobile app development and enterprise IT services.

For these organizations, monitoring their workloads on AWS is critical. It enables them to track performance metrics, system logs and events to help ensure that AWS environments perform as expected.

Without AWS monitoring, performance issues can multiply undetected, costs can spiral from overprovisioned resources and security vulnerabilities can remain exposed. For example, excessive network traffic might overload a central processing unit (CPU) and cause bottlenecks. Or misconfigured cloud storage containers might expose sensitive data through public access.

AWS monitoring tools can identify these issues and trigger automated responses—such as invoking AWS Lambda functions for remediation—or alert teams for manual troubleshooting. This helps organizations maintain optimal performance, reduce costs, strengthen their security posture and make data-driven decisions about their infrastructure.

Why is AWS monitoring important?

AWS monitoring helps ensure the reliability, availability and performance of AWS cloud resources. It provides visibility into AWS infrastructures so organizations can proactively detect and resolve problems, reducing disruptions and minimizing downtime.

Maintain reliability and availability

Monitoring helps reveal the operational health of AWS cloud-based resources, including compute capacity, storage systems and network infrastructure.

For example, if a server experiences heavy load from customer traffic, monitoring triggers automatic scaling to add more servers. This helps prevent application crashes and ensure consistent response times even during traffic spikes, so applications remain accessible to end users.
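
As a rough illustration of the scaling example above, the sketch below builds the settings for a target tracking policy that keeps average CPU near a chosen level. It assumes the servers run in an EC2 Auto Scaling group; the group name `web-asg` and the 60% target are hypothetical, and the actual API call is left as a comment because it needs boto3 and AWS credentials.

```python
def cpu_target_tracking_policy(group_name: str,
                               target_cpu_percent: float = 60.0) -> dict:
    """Build keyword arguments for the Auto Scaling put_scaling_policy call."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            # Predefined metric: average CPU across the instances in the group.
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu_percent,
        },
    }


# "web-asg" is a made-up group name used only for illustration.
policy = cpu_target_tracking_policy("web-asg")

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```

With a policy like this, the group adds servers when average CPU rises above the target and removes them when it falls back.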

Improve performance

By collecting metrics on AWS cloud resources, monitoring can help optimize configurations for increased speed and efficiency.

For example, if users in Asia experience slow page loads, monitoring can surface the added latency so teams can cache content in Singapore, closer to those users. This helps reduce latency, enabling faster page loads and smoother streaming for a better user experience.

Manage AWS costs

Cloud costs are often one of the biggest line items on an organization’s IT budget.

Monitoring can identify underutilized or unnecessary AWS resources to help optimize cloud costs. For example, a virtual server that uses only 10% of its allocated memory leaves roughly 90% of the memory it pays for unused. AWS monitoring can help automatically right-size instances and shut down idle resources during off-peak hours.
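
The right-sizing check can be sketched in a few lines of Python. This is only an illustration: the instance names and utilization figures are invented, the 10% cutoff echoes the example above, and in practice the averages would come from collected metrics rather than a hard-coded dictionary.

```python
from typing import Dict, List


def underutilized(avg_utilization: Dict[str, float],
                  threshold_percent: float = 10.0) -> List[str]:
    """Return names of resources whose average utilization is at or below the threshold."""
    return sorted(
        name for name, percent in avg_utilization.items()
        if percent <= threshold_percent
    )


# Hypothetical average memory utilization (percent) over the last two weeks.
memory_utilization = {
    "web-1": 62.0,
    "batch-7": 4.5,
    "report-2": 10.0,
    "cache-3": 35.0,
}
candidates = underutilized(memory_utilization)
```

Here `candidates` lists the servers worth reviewing for a smaller instance size or an off-peak shutdown schedule.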

Enhance security

Monitoring detects suspicious activities or unusual changes to AWS infrastructures that might indicate security threats, such as unauthorized API calls, unusual data transfers and configuration changes.

For example, when monitoring tools detect repeatedly failed login attempts, they can block the source IP address and trigger security notifications.

Anticipate issues

Monitoring can identify potential problems before they occur and impact end users. This can include approaching capacity limits, degrading performance trends and expiring SSL certificates.

For example, when a database exceeds 90% of its storage capacity, monitoring can alert developers to add more space before the database fails.

How does AWS monitoring work?

AWS monitoring continuously collects and analyzes data from AWS infrastructure.

Monitoring data typically includes performance metrics, resource utilization and error rates from servers, databases and applications, along with system logs such as API calls, configuration changes, network activity and security events.

Tools analyze this data to identify trends, detect anomalies and visualize performance in real time through dashboards.

AWS monitoring tools can then alert teams to potential issues for troubleshooting and root cause analysis. They can also automatically resolve some issues, including adding resources or restarting AWS services.

For example, when customers abandon online shopping carts due to failed checkouts, AWS monitoring can identify payment gateway errors and alert the DevOps team. The team can then investigate and correct the cause, such as an outdated timeout setting, minimizing revenue loss.
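
The collect, analyze and alert loop described above can be sketched in plain Python. This is a simplified illustration rather than how any AWS service is implemented: the function name `should_alarm`, the sample error rates and the 5% threshold are all assumptions. It alerts only when several recent datapoints breach the threshold, which avoids paging a team over a single noisy reading.

```python
from typing import Sequence


def should_alarm(datapoints: Sequence[float], threshold: float,
                 datapoints_to_alarm: int, evaluation_periods: int) -> bool:
    """Return True when enough of the most recent datapoints exceed the threshold."""
    if evaluation_periods <= 0:
        raise ValueError("evaluation_periods must be positive")
    recent = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in recent if value > threshold)
    return breaching >= datapoints_to_alarm


# Hypothetical payment gateway error rates (percent), one datapoint per minute.
error_rates = [0.4, 0.6, 0.5, 7.2, 8.1, 9.4]

# Alert when at least 3 of the last 5 datapoints are above 5%.
alert = should_alarm(error_rates, threshold=5.0,
                     datapoints_to_alarm=3, evaluation_periods=5)
```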

AWS monitoring pricing typically scales with the number of custom metrics, log ingestion volumes and analysis frequency, depending on the services used.

Key metrics for AWS monitoring

To provide comprehensive coverage, AWS monitoring typically tracks metrics across four core areas:

Infrastructure performance and usage
Application performance and behavior
Security and compliance
Operational and business metrics

Infrastructure performance and usage

AWS monitoring automatically tracks performance and resource consumption across AWS infrastructure components.

These metrics help teams identify overprovisioned resources, predict capacity needs and detect performance degradation before it impacts users.

Amazon CloudWatch, the foundation of AWS monitoring, is the primary tool for gathering this data. It automatically collects, aggregates and analyzes metrics across all AWS services. Other AWS tools like X-Ray and CloudTrail integrate with CloudWatch to provide a unified view of system health, performance and security.

Amazon EC2 (Elastic Compute Cloud)

EC2 provides virtual servers that run applications in the cloud with on-demand computing capacity.

Key EC2 metrics include:

CPU utilization: Percentage of allocated compute units being used
Disk I/O: Rate of reads and writes from attached storage volumes
Network traffic: Data flowing in and out of the EC2 instance
Status checks: System and instance health checks that verify reachability and detect hardware or software issues
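
As one way to retrieve an EC2 metric such as CPU utilization, the sketch below builds a query for the CloudWatch `get_metric_statistics` operation. The instance ID and time range are placeholders, and the call itself is shown only as a comment because it requires boto3 and AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def ec2_cpu_query(instance_id: str, end: datetime, hours: int = 1) -> dict:
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # one datapoint per five minutes
        "Statistics": ["Average", "Maximum"],
    }


# Placeholder instance ID and end time, chosen only for illustration.
params = ec2_cpu_query("i-0123456789abcdef0",
                       datetime(2025, 1, 1, 12, tzinfo=timezone.utc))

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("cloudwatch").get_metric_statistics(**params)
```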

ECS (Elastic Container Service) and EKS (Elastic Kubernetes Service)

ECS and EKS, Amazon’s native container services, deploy and orchestrate containerized applications at scale.

Key container metrics include:

Resource utilization: CPU, memory and network usage
Performance logs: Diagnostic information such as container restart failures for root-cause analysis
Network performance: Network traffic between containers, services and external resources

Amazon RDS (Relational Database Service)

RDS provides managed databases in AWS cloud environments.

Key RDS metrics include:

Database connections: Number of active connections to each RDS database instance
Query latency: Time to complete database queries
Storage usage: Disk space consumed by databases
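
A storage alert of the kind described earlier, where a database passes 90% of its capacity, could be expressed as a CloudWatch alarm on the RDS `FreeStorageSpace` metric, which is reported in bytes. The sketch below only builds the alarm settings; the database name `orders-db`, the 100 GiB allocation and the evaluation settings are assumptions for illustration.

```python
def rds_low_storage_alarm(db_instance_id: str, allocated_gib: int,
                          free_fraction: float = 0.10) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm fires when free storage drops below `free_fraction` of the
    allocated size, which is the same as usage rising above 90% by default.
    """
    bytes_per_gib = 1024 ** 3
    return {
        "AlarmName": f"{db_instance_id}-low-free-storage",
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 3,
        "Threshold": allocated_gib * bytes_per_gib * free_fraction,
        "ComparisonOperator": "LessThanThreshold",
    }


# "orders-db" and its 100 GiB allocation are hypothetical.
alarm = rds_low_storage_alarm("orders-db", allocated_gib=100)

# With boto3 installed and credentials configured, this could be created as:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```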

AWS Lambda

Lambda provides serverless computing for tasks such as resizing images or updating databases.

Key Lambda metrics include:

Invocation count: Number of times a function executes
Duration: Time to complete each function
Errors: Number of function execution failures
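
Invocation and error counts are often more useful combined into an error rate. The sketch below shows the arithmetic and, as an assumption about how a team might query it, a set of metric math queries in the shape accepted by the CloudWatch `get_metric_data` operation. The function name `resize-images` and the sample counts are invented.

```python
def lambda_error_rate(invocations: float, errors: float) -> float:
    """Return the error rate as a percentage, or 0.0 when nothing ran."""
    if invocations <= 0:
        return 0.0
    return 100.0 * errors / invocations


def lambda_error_rate_queries(function_name: str, period: int = 300) -> list:
    """Build MetricDataQueries that compute the error rate with metric math."""
    def stat(query_id: str, metric_name: str) -> dict:
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": period,
                "Stat": "Sum",
            },
            "ReturnData": False,  # inputs only; return the computed rate instead
        }

    return [
        stat("invocations", "Invocations"),
        stat("errors", "Errors"),
        {
            "Id": "error_rate",
            "Expression": "100 * errors / invocations",
            "Label": "Error rate (%)",
            "ReturnData": True,
        },
    ]


# Hypothetical totals for one hour.
rate = lambda_error_rate(invocations=2000, errors=30)
queries = lambda_error_rate_queries("resize-images")
```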

Elastic Load Balancing (ELB)

ELB distributes incoming traffic to appropriate cloud resources to help maintain high availability.

Key ELB metrics include:

Latency: Time requests take to process through the load balancer
Request count: Total requests the load balancer handled
Active connection count: Concurrent connections from clients to targets
Processed bytes: Total bytes processed by the load balancer

Application performance and behavior

AWS performance monitoring can track application behavior but only with manual configuration and code instrumentation.

Applications rarely exist in isolation—they connect to external payment processors, third-party APIs and non-AWS databases.

Monitoring these interactions reveals whether performance problems stem from AWS resources or external dependencies. It also provides insight into how end users interact with applications.

AWS X-Ray, a distributed tracing service, is the primary tool for collecting these metrics. When code is instrumented with its SDK, X-Ray traces requests as they move through applications, providing visibility into latency, errors and bottlenecks across AWS services.

Key application performance monitoring metrics include:

Request rates: Volume and patterns of incoming requests, revealing traffic spikes and usage trends

Error rates: Failed transactions, crashed processes and broken API connections

User monitoring: Real user experience data, such as page load times, JavaScript errors and click-through rates

Distributed tracing: End-to-end request journeys across microservices, pinpointing where bottlenecks or failures occur
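
To show how trace data can point to a bottleneck, the sketch below picks the slowest step from one request's trace. The records are invented and heavily simplified; real trace segments carry many more fields, and the service names here are hypothetical.

```python
from typing import Dict, List


def slowest_segment(segments: List[Dict]) -> Dict:
    """Return the segment with the longest duration (end_time minus start_time)."""
    if not segments:
        raise ValueError("trace has no segments")
    return max(segments, key=lambda seg: seg["end_time"] - seg["start_time"])


# One hypothetical checkout request, with times in seconds from the request start.
trace = [
    {"name": "auth-service", "start_time": 0.00, "end_time": 0.05},
    {"name": "inventory-db", "start_time": 0.05, "end_time": 0.17},
    {"name": "payment-processor", "start_time": 0.17, "end_time": 0.95},
]
bottleneck = slowest_segment(trace)
```

In this made-up trace the external payment processor accounts for most of the request time, which suggests the slowdown lies outside the AWS resources.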

Security and compliance

AWS monitoring automatically tracks activity and configuration changes within an AWS account. These metrics help identify who accessed sensitive data, detect policy violations and prove regulatory compliance.

Different AWS security services provide different benefits, such as audit trails, configuration tracking and threat detection. Some, like CloudTrail, log basic activity by default, while others, such as GuardDuty and Config, require explicit setup.

Key AWS-native security services include:

AWS CloudTrail: Tracks every API call across AWS environments, logging who did what and when—essential for auditing and incident investigation.

AWS Config: Records how AWS resources are configured and tracks changes over time, helping prove compliance with policies.

Amazon Inspector: A security and vulnerability management service that automatically scans EC2 instances and containers for known vulnerabilities and deviations from security best practices.

Amazon GuardDuty: A continuous threat detection service that uses machine learning to detect malicious activity, including cryptocurrency mining, unauthorized access attempts and unusual API calls across an AWS account.

Key security metrics include:

API calls: API calls are communications between AWS APIs and software programs to request data, services or actions. Source, destination and timing are recorded for all calls.

Unauthorized access: Any unusual or malicious login activity.

Configuration changes: Changes to AWS resource configurations for compliance and auditing. For example, changes to user permissions or data encryption might affect compliance with security policies or data protection regulations.

Network traffic analysis: Network traffic across virtual private clouds (VPCs) to help spot security threats such as misconfigurations or data exfiltration attempts.
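
Unauthorized access detection can be illustrated with a small filter over audit records. The sketch below counts failed console sign-ins per source IP address, assuming records shaped like CloudTrail `ConsoleLogin` events. The sample events, the IP addresses (taken from documentation-reserved ranges) and the threshold of three failures are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def failed_login_sources(events: Iterable[dict], min_failures: int = 3) -> Dict[str, int]:
    """Return source IPs with at least `min_failures` failed console sign-ins."""
    failures = Counter(
        event["sourceIPAddress"]
        for event in events
        if event.get("eventName") == "ConsoleLogin"
        and (event.get("responseElements") or {}).get("ConsoleLogin") == "Failure"
    )
    return {ip: count for ip, count in failures.items() if count >= min_failures}


def login(ip: str, outcome: str) -> dict:
    """Helper that builds a minimal, hypothetical sign-in record."""
    return {
        "eventName": "ConsoleLogin",
        "sourceIPAddress": ip,
        "responseElements": {"ConsoleLogin": outcome},
    }


events = [
    login("203.0.113.10", "Failure"),
    login("203.0.113.10", "Failure"),
    login("198.51.100.7", "Failure"),
    login("203.0.113.10", "Failure"),
    login("203.0.113.10", "Success"),
]
suspicious = failed_login_sources(events)
```

The resulting addresses could then feed a notification or a blocking rule, depending on the team's response process.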

Operational and business metrics

AWS monitoring can track business metrics but requires custom configuration, where organizations define and send them to AWS monitoring tools as custom metrics.

These metrics, such as revenue, order fulfillment time or customer satisfaction, can help connect technical performance to business outcomes, justify cloud spending and identify how system issues impact revenue.

Key operational and business metrics include:

Customer service response time: How quickly systems respond to customer requests

Order fulfillment time: Average time for e-commerce orders to be processed and shipped

User activity: Daily and monthly active users, new sign-ups and session counts

Revenue: Total monetary value of purchases over a specific period

AWS billing: Anomaly detection can identify unusual usage spikes so teams can make cost-effective adjustments and meet budget thresholds
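
Sending business metrics as custom metrics might look like the sketch below, which builds a payload for the CloudWatch `put_metric_data` operation. The namespace `Example/Business`, the metric names, the `Store` dimension and the sample values are all assumptions; an organization would define its own.

```python
def order_metrics(revenue_usd: float, fulfillment_seconds: float, store: str) -> dict:
    """Build keyword arguments for CloudWatch put_metric_data."""
    dimensions = [{"Name": "Store", "Value": store}]
    return {
        # Custom namespaces are chosen by the organization.
        "Namespace": "Example/Business",
        "MetricData": [
            {
                "MetricName": "Revenue",
                "Dimensions": dimensions,
                "Value": revenue_usd,
                "Unit": "None",
            },
            {
                "MetricName": "OrderFulfillmentTime",
                "Dimensions": dimensions,
                "Value": fulfillment_seconds,
                "Unit": "Seconds",
            },
        ],
    }


# Hypothetical figures for a single order.
payload = order_metrics(revenue_usd=129.99, fulfillment_seconds=5400.0, store="eu-west")

# With boto3 installed and credentials configured, this could be published as:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(**payload)
```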

Monitoring vs observability

Monitoring helps answer what’s wrong, while observability helps explain why it’s happening.

AWS monitoring solutions typically include AWS observability capabilities. Both work together to solve problems and maintain reliability.

Monitoring

Monitoring captures predefined metrics such as CPU usage and network latency, offering a snapshot of system performance but little insight into why an issue occurs. For example, monitoring might identify high CPU utilization on a web server but not the root cause.

Observability

Observability correlates multiple telemetry data types—metrics, events, logs and traces (MELT data)—to provide a real-time view of AWS environments. It evolved from traditional monitoring to handle the complexity of cloud-native architectures.
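
A small sketch can make the difference concrete: monitoring reports when a metric spiked, and correlating that moment with nearby log events starts to explain why. The timestamps, log messages and two-minute window below are invented for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def logs_near(spike_time: datetime, log_events: List[Dict],
              window: timedelta = timedelta(minutes=2)) -> List[Dict]:
    """Return log events recorded within `window` of a metric spike."""
    return [
        event for event in log_events
        if abs(event["timestamp"] - spike_time) <= window
    ]


# Hypothetical CPU spike and log events from the same web server.
spike = datetime(2025, 1, 1, 12, 0)
logs = [
    {"timestamp": datetime(2025, 1, 1, 11, 30), "message": "INFO cache warmed"},
    {"timestamp": datetime(2025, 1, 1, 11, 59), "message": "ERROR upstream timeout, retrying"},
    {"timestamp": datetime(2025, 1, 1, 12, 1), "message": "WARN retry queue growing"},
    {"timestamp": datetime(2025, 1, 1, 12, 10), "message": "INFO health check ok"},
]
related = logs_near(spike, logs)
```

Here the two log lines closest to the spike hint that retries after an upstream timeout, not customer traffic, drove the CPU usage.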

Footnotes

¹ AWS, Microsoft, Google Fight For USD 90B Q4 2024 Cloud Market Share, CRN.com, 13 February 2025.
