Date: 2025-08-29

Site reliability engineering (SRE) and DevOps teams are exhausted. Sprawling IT estates, tool overload and the job’s on-call nature all play a role in an overarching issue: alert fatigue.

Alert fatigue (sometimes called alarm fatigue) refers to “a state of mental and operational exhaustion caused by an overwhelming number of alerts.” It is a widespread, consequential problem that erodes the responsiveness and efficacy of DevOps, security operations center (SOC), SRE and other teams responsible for IT performance and security.

Vectra’s “2023 State of Threat Detection” report (based on a survey of 2,000 IT security analysts at firms with 1,000 or more employees) found that SOC teams field an average of 4,484 alerts per day. Of these, 67% are ignored due to a high volume of false positives and alert fatigue. The report also found that 71% of analysts believed that their organization might already have been “compromised without their knowledge, due to lack of visibility and confidence in threat detection capabilities.”



While the Vectra report takes a security-specific focus, teams charged with monitoring application and infrastructure performance face a similar overload. For example, a single misconfiguration can cause hundreds or thousands of performance alerts, an “alert storm” that can distract or desensitize IT teams and cause delayed responses to critical alerts and real issues. Those real issues can be costly.
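To make the “alert storm” idea concrete, here is a minimal sketch of how hundreds of alerts with a shared cause could be collapsed into a handful of groups. It is an illustration under stated assumptions, not any particular tool's implementation: the `Alert` fields (`service`, `check`, `config_rev`) and the `group_alerts` helper are hypothetical names, and grouping by check name plus configuration revision is just one plausible correlation key.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    service: str     # hypothetical field: service that emitted the alert
    check: str       # hypothetical field: name of the failing check
    config_rev: str  # hypothetical field: config revision active when it fired


def group_alerts(alerts):
    """Collapse alerts that share the same (check, config_rev) key."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert.check, alert.config_rev)].append(alert)
    return dict(groups)


# Simulated storm: one bad config revision trips the same check on
# 500 services, alongside a single unrelated alert.
storm = [Alert(f"svc-{i}", "latency_p99", "rev-42") for i in range(500)]
storm.append(Alert("billing", "disk_full", "rev-41"))

groups = group_alerts(storm)
for (check, rev), members in groups.items():
    print(f"{check} @ {rev}: {len(members)} alerts")
```

In this toy case, 501 raw alerts reduce to two groups for a responder to review. Real correlation is harder, because the shared cause is rarely available as a clean field on every alert.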



What’s driving this burnout, and can agentic AI be part of a scalable solution?