
Updated: 24 May 2024
Contributor: Matthew Kosinski

What is a data breach?

A data breach is any security incident in which unauthorized parties access sensitive or confidential information, including personal data (Social Security numbers, bank account numbers, healthcare data) and corporate data (customer records, intellectual property, financial information). 

The terms "data breach" and "breach" are often used interchangeably with "cyberattack." However, not all cyberattacks are data breaches. Data breaches include only those security breaches where someone gains unauthorized access to data. 

For example, a distributed denial of service (DDoS) attack that overwhelms a website is not a data breach. A ransomware attack that locks up a company's customer data and threatens to leak it unless the company pays a ransom is a data breach. The physical theft of hard drives, USB flash drives or even paper files containing sensitive information is also a data breach.

An expensive problem

According to the IBM® Cost of a Data Breach 2023 report, the global average cost of a data breach is USD 4.45 million. While organizations of every size and kind are vulnerable to breaches, the severity of these breaches and the costs to remediate them can vary.

For example, the average cost of a data breach in the United States is USD 9.48 million, more than 4 times the cost of a breach in India (USD 2.18 million).

Breach consequences tend to be especially severe for organizations in highly regulated fields like healthcare, finance and the public sector, where steep fines and penalties can compound the costs. For example, according to the IBM report, the average healthcare data breach costs USD 10.93 million, more than twice the average cost of all breaches.

Data breach costs arise from several factors, with IBM's report noting four key ones: lost business, detection and containment, post-breach response and notification. 

The loss of business, revenue and customers resulting from a breach costs organizations USD 1.30 million on average. The price of detecting and containing the breach is even higher at USD 1.58 million. Post-breach expenses—including fines, settlements, legal fees, providing free credit monitoring to affected customers and similar expenditures—cost the average breach victim USD 1.20 million. 

Notification costs, which include reporting breaches to customers, regulators and other third parties, are the lowest at USD 370,000. However, reporting requirements can still be onerous and time-consuming. 

  • The US Cyber Incident Reporting for Critical Infrastructure Act of 2022 (CIRCIA) requires organizations in national security, finance and other designated industries to report cybersecurity incidents affecting personal data or business operations to the Department of Homeland Security within 72 hours. 

  • US organizations subject to the Health Insurance Portability and Accountability Act (HIPAA) must notify the US Department of Health and Human Services, affected individuals and sometimes the media if protected health information is breached. 

  • All 50 US states also have their own data breach notification laws. 

  • The General Data Protection Regulation (GDPR) requires companies doing business with EU citizens to notify authorities of breaches within 72 hours.  
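The four cost components quoted earlier in this section can be checked against the headline figure with a few lines of arithmetic. The sketch below only restates the report's numbers as cited in this article; the dictionary layout and the `share` helper are illustrative assumptions, not part of the report.

```python
# Figures in USD millions, as quoted in this article from the 2023 report.
components = {
    "detection and containment": 1.58,
    "lost business": 1.30,
    "post-breach response": 1.20,
    "notification": 0.37,
}

# Round to two decimals to avoid floating-point noise in the sum.
total = round(sum(components.values()), 2)


def share(name: str) -> float:
    """Return one component's share of the total, as a percentage."""
    return round(100 * components[name] / total, 1)


for name, cost in components.items():
    print(f"{name:28s} USD {cost:.2f}M  {share(name):5.1f}%")
print(f"{'total':28s} USD {total:.2f}M")
```

The components sum to the USD 4.45 million global average, with detection and containment as the largest share and notification as the smallest.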

Why data breaches happen

Data breaches are caused by:

  • Innocent mistakes, such as an employee emailing confidential information to the wrong person. 

  • Malicious insiders, including angry or laid-off employees who want to hurt the company and greedy employees who want to profit off the company's data. 

  • Hackers, malicious outsiders who commit intentional cybercrimes to steal data. Hackers can act as lone operators or part of an organized ring.  

Financial gain is the primary motivation for most malicious data breaches. Hackers steal credit card numbers, bank accounts or other financial information to directly drain funds from people and companies.

Some attackers steal personally identifiable information (PII), such as Social Security numbers and phone numbers, to commit identity theft, taking out loans and opening credit cards in their victims' names. Cybercriminals can also sell stolen PII and account information on the dark web, where bank login credentials can fetch as much as USD 500.[1]

A data breach can also be the first phase of a larger attack. For example, hackers might steal the email account passwords of corporate executives and use those accounts to conduct business email compromise scams. 

Data breaches might have objectives other than personal enrichment. Unscrupulous organizations might steal trade secrets from competitors, and nation-state actors might breach government systems to steal information about sensitive political dealings, military operations or national infrastructure.

Some breaches are purely destructive, with hackers accessing sensitive data to destroy or deface it. According to the Cost of a Data Breach report, such destructive attacks account for 25% of malicious breaches. These attacks are often the work of nation-state actors or hacktivist groups seeking to damage an organization.

How data breaches happen

Most intentional data breaches caused by internal or external threat actors follow the same basic pattern:

  1. Research: The threat actor identifies a target and looks for weaknesses that they can use to break into the target's system. These weaknesses can be technical, such as inadequate security controls, or human, such as employees susceptible to social engineering.

  2. Attack: The threat actor starts an attack on the target by using their chosen method. The attacker might send a spear-phishing email, directly exploit vulnerabilities in the system, use stolen login credentials to take over an account or leverage other common data breach attack vectors. 

  3. Compromise data: Inside the system, the attacker locates the data they want and does what they came to do. Common tactics include exfiltrating data for sale or use, destroying the data or locking up data to demand a ransom. 

Common data breach attack vectors

Malicious actors can use various attack vectors or methods to carry out data breaches. Some of the most common include:

Stolen or compromised credentials 

According to the Cost of a Data Breach 2023 report, stolen or compromised credentials are the second most common initial attack vector, accounting for 15% of data breaches.

Hackers can compromise credentials by using brute force attacks to crack passwords, buying stolen credentials off the dark web or tricking employees into revealing their passwords through social engineering attacks.

Social engineering attacks 

Social engineering is the act of psychologically manipulating people into unwittingly compromising their own information security. 

Phishing, the most common type of social engineering attack, is also the most common data breach attack vector, accounting for 16% of breaches. Phishing scams use fraudulent emails, text messages, social media content or websites to trick users into sharing credentials or downloading malware.


Ransomware

Ransomware, a type of malware that holds data hostage until a victim pays a ransom, is involved in 24% of malicious breaches according to the Cost of a Data Breach report. These breaches also tend to be more expensive, costing an average of USD 5.13 million. This figure does not include ransom payments, which can run to tens of millions of dollars.

System vulnerabilities 

Cybercriminals can gain access to a target network by exploiting weaknesses in websites, operating systems, endpoints, APIs, common software such as Microsoft Office and other IT assets.

Threat actors don't need to hit their targets directly. In supply chain attacks, hackers exploit vulnerabilities in the networks of a company's service providers and vendors to steal its data.  

When hackers locate a vulnerability, they often use it to plant malware in the network. Spyware, which records a victim's keystrokes and other sensitive data and sends it back to a server that the hackers control, is a common type of malware used in data breaches.

SQL injection  

Another method of directly breaching target systems is SQL injection, which takes advantage of weaknesses in the Structured Query Language (SQL) databases of unsecured websites.

Hackers enter malicious code into user-facing fields, such as search bars and login windows. This code causes the database to divulge private data like credit card numbers or customers' personal details.
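The difference between a vulnerable query and a safer one can be shown with a minimal sketch. It uses Python's built-in `sqlite3` module and an in-memory database. The `customers` table, its sample rows and the function names are hypothetical and exist only for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, card_number TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("alice", "4111-0000-0000-0001"), ("bob", "4111-0000-0000-0002")],
    )
    return conn


def search_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # VULNERABLE: user input is spliced directly into the SQL text,
    # so quotes in the input can change the structure of the query.
    query = f"SELECT name, card_number FROM customers WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def search_safe(conn: sqlite3.Connection, name: str) -> list:
    # The input is passed as a bound parameter and is treated as data only.
    return conn.execute(
        "SELECT name, card_number FROM customers WHERE name = ?", (name,)
    ).fetchall()


conn = make_demo_db()
payload = "nobody' OR '1'='1"  # typed into a search field by the attacker

leaked = search_unsafe(conn, payload)   # the OR clause matches every row
blocked = search_safe(conn, payload)    # no customer has this literal name
print(len(leaked), len(blocked))
```

In this sketch the concatenated query returns every row in the table, while the parameterized query returns nothing because the payload is compared as an ordinary string.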

Human error and IT failures

Threat actors can take advantage of employees' mistakes to gain access to confidential information. 

For example, misconfigured or outdated systems can let unauthorized parties access data they shouldn't be able to. Employees can expose data by storing it in unsecured locations, misplacing devices with sensitive information saved on their hard drives or mistakenly granting network users excessive access privileges. Cybercriminals can use IT failures, such as temporary system outages, to sneak into sensitive databases.

According to the Cost of a Data Breach report, cloud misconfigurations account for 11% of breaches. Known, unpatched vulnerabilities account for 6% of breaches. Accidental data loss, including lost or stolen devices, accounts for another 6%. Altogether, these errors are behind nearly a quarter of all breaches. 

Physical security compromises 

Threat actors may break into company offices to steal employees' devices, paper documents and physical hard drives containing sensitive data. Attackers can also place skimming devices on physical credit and debit card readers to collect payment card information.

Notable data breaches 

TJX

The 2007 breach of TJX Corporation, the parent company of retailers TJ Maxx and Marshalls, was at that time the largest and costliest consumer data breach in US history. As many as 94 million customer records were compromised, and the company suffered more than USD 256 million in financial losses.

Hackers gained access to the data by planting traffic sniffers on the wireless networks of two stores. The sniffers allowed the hackers to capture information as it was transmitted from the stores' cash registers to back-end systems.


Yahoo

In 2013, Yahoo suffered what might be the largest data breach in history. Hackers exploited a weakness in the company's cookie system to access the names, birthdates, email addresses and passwords of all 3 billion Yahoo users.

The full extent of the breach was revealed in 2016 while Verizon was in talks to buy the company. As a result, Verizon reduced its acquisition offer by USD 350 million.


Equifax

In 2017, hackers breached the credit reporting agency Equifax and accessed the personal data of more than 143 million Americans.

Hackers exploited an unpatched weakness in Equifax's website to gain access to the network. The hackers then moved laterally to other servers to find Social Security numbers, driver's license numbers and credit card numbers. The attack cost Equifax USD 1.4 billion between settlements, fines and other costs associated with repairing the breach.


SolarWinds

In 2020, Russian threat actors executed a supply chain attack by hacking the software vendor SolarWinds. Hackers used the organization's network monitoring platform, Orion, to covertly distribute malware to SolarWinds' customers.

Russian spies gained access to the confidential information of various US government agencies, including the Treasury, Justice and State Departments, that use SolarWinds' services. 

Colonial Pipeline

In 2021, hackers infected Colonial Pipeline's systems with ransomware, forcing the company to temporarily shut down the pipeline that supplies 45% of the US East Coast's fuel. 

Hackers breached the network by using an employee's password that they found on the dark web. The Colonial Pipeline Company paid a USD 4.4 million ransom in cryptocurrency, but federal law enforcement recovered roughly USD 2.3 million of that payment.


23andMe

In the fall of 2023, hackers stole the data of 6.9 million 23andMe users. The breach was notable for a couple of reasons. First, because 23andMe conducts genetic testing, the attackers obtained some unconventional and highly personal information, including family trees and DNA data.

Second, the hackers breached user accounts through a technique called "credential stuffing." In this kind of attack, hackers use credentials exposed in previous leaks from other sources to break into users' unrelated accounts on different platforms. These attacks work because many people reuse the same username and password combinations across sites.  
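Credential stuffing tends to leave a recognizable trace: one source trying many different usernames, most of which fail. The following is a minimal detection sketch under that assumption. The event format `(source_ip, username, success)`, the threshold and the function name are hypothetical, and nothing here describes how 23andMe or any specific product detects such attacks.

```python
from collections import defaultdict


def flag_stuffing_sources(events, min_distinct_users=3):
    """Return source IPs whose failed logins span many distinct usernames."""
    failed_users = defaultdict(set)
    for source_ip, username, success in events:
        if not success:
            failed_users[source_ip].add(username)
    return sorted(
        ip
        for ip, users in failed_users.items()
        if len(users) >= min_distinct_users
    )


# Hypothetical login events using documentation-only IP addresses.
events = [
    ("203.0.113.7", "alice", False),
    ("203.0.113.7", "bob", False),
    ("203.0.113.7", "carol", False),
    ("203.0.113.7", "dave", True),    # one reused password finally works
    ("198.51.100.4", "erin", False),  # a single user mistyping a password
    ("198.51.100.4", "erin", False),
    ("198.51.100.4", "erin", True),
]

suspects = flag_stuffing_sources(events)
print(suspects)
```

Counting distinct usernames rather than raw failures keeps a single forgetful user from being flagged. A real deployment would also need time windows and would have to account for attackers who rotate source addresses.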

Data breach prevention and mitigation 

According to the Cost of a Data Breach report, it takes organizations an average of 277 days to identify and contain an active breach. Deploying the right security solutions can help organizations detect and respond to these breaches faster. 

Standard measures, such as regular vulnerability assessments, scheduled backups, timely patching and proper database configurations, can help prevent some breaches and soften the blow of those that occur.

However, many organizations today implement more advanced controls and best practices to stop more breaches and significantly mitigate the damage they cause.


Data security tools

Organizations can deploy specialized data security solutions to automatically discover and classify sensitive data, apply encryption and other protections and gain real-time insight into data usage. 
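As a rough illustration of what automated discovery and classification can involve, the sketch below scans free text for strings shaped like US Social Security numbers and payment card numbers. The patterns, label names and functions are simplified assumptions for this example. Commercial tools use far richer detection logic, and this is not a description of any particular product.

```python
import re

# Simplified, illustrative patterns; real classifiers are more thorough.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits


def luhn_valid(digits: str) -> bool:
    """Check a digit string against the Luhn checksum used by card numbers."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def classify(text: str) -> set:
    """Return the sensitive-data labels that appear to apply to the text."""
    labels = set()
    if SSN_PATTERN.search(text):
        labels.add("ssn")
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):  # filters out most random digit runs
            labels.add("payment_card")
    return labels


print(classify("Customer SSN 123-45-6789, card 4111 1111 1111 1111"))
print(classify("Order reference 1234 5678 9012 3456"))
```

The Luhn check is there to cut false positives: a 16-digit order reference matches the card pattern but usually fails the checksum, so it is not labeled as payment data.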

Incident response plans

Organizations can mitigate breach damage by adopting formal incident response plans for detecting, containing and eradicating cyberthreats. According to the Cost of a Data Breach report, organizations with regularly tested incident response plans and dedicated response teams reduce the time it takes to contain breaches by an average of 54 days.

AI and automation

Organizations that extensively integrate artificial intelligence (AI) and automation into security operations resolve breaches 108 days faster than those that don't, according to the Cost of a Data Breach report. The report also found that security AI and automation reduce the cost of an average breach by USD 1.76 million, or 40%.

Many data security, data loss prevention and identity and access management tools now incorporate AI and automation.

Employee training

Because social engineering and phishing attacks are leading causes of breaches, training employees to recognize and avoid these attacks can reduce a company's risk of a data breach. In addition, training employees to handle data properly can help prevent accidental data breaches and data leaks. 

Identity and access management (IAM)

Password managers, two-factor authentication (2FA) or multifactor authentication (MFA), single sign-on (SSO) and other identity and access management (IAM) tools can protect employee accounts and credentials from theft.
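One widely used second factor is the time-based one-time password (TOTP) shown by authenticator apps. The article does not name a specific mechanism, so the sketch below is only one possible example of 2FA, written against the published HOTP (RFC 4226) and TOTP (RFC 6238) algorithms with Python's standard library. The function names are illustrative.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA-1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    return hotp(secret, unix_time // step, digits)


def verify(secret: bytes, submitted: str, unix_time: int) -> bool:
    """Compare the submitted code in constant time with the expected one."""
    return hmac.compare_digest(totp(secret, unix_time), submitted)


# Shared secret taken from the RFC test vectors, not a real credential.
secret = b"12345678901234567890"
print(totp(secret, 59, digits=8))
```

Because the code depends on both a shared secret and the current time, a stolen password alone is not enough to log in. A production verifier would typically also accept adjacent time steps to tolerate clock drift and would rate-limit attempts.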

Organizations can also enforce role-based access controls and the principle of least privilege to limit employee access to only the data that they need for their roles. These policies can help stop both insider threats and hackers who hijack legitimate accounts.
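Role-based access control with least privilege can be sketched as a deny-by-default lookup: each role lists only the permissions it needs, and anything unlisted is refused. The role names, permission strings and `is_allowed` function below are hypothetical and only illustrate the idea.

```python
# Hypothetical role-to-permission mapping; each role gets only what it needs.
ROLE_PERMISSIONS = {
    "support_agent": {"customer_contact:read"},
    "billing_analyst": {"customer_contact:read", "payment_data:read"},
    "dba": {"customer_contact:read", "payment_data:read", "payment_data:write"},
}


def is_allowed(role: str, permission: str) -> bool:
    """Deny by default: unknown roles and unlisted permissions are refused."""
    return permission in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("support_agent", "customer_contact:read"))  # granted
print(is_allowed("support_agent", "payment_data:read"))      # refused
```

Under this model, an attacker who hijacks a support agent's account can reach contact details but not payment data, which limits how much a single compromised account can expose.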

[1] How Much Do Hackers Make From Stealing Your Data?, Nasdaq, 16 October 2023.