What is a data breach?
A data breach is any security incident that results in unauthorized access to confidential information.

A data breach is any security incident in which unauthorized parties gain access to sensitive data or confidential information, including personal data (Social Security numbers, bank account numbers, healthcare data) or corporate data (customer data records, intellectual property, financial information).

The terms ‘data breach’ and ‘breach’ are often used interchangeably with ‘cyberattack.’ But not all cyberattacks are data breaches—and not all data breaches are cyberattacks. Data breaches include only those security breaches in which the confidentiality of data is compromised. So, for example, a distributed denial of service (DDoS) attack that overwhelms a website is not a data breach. But a ransomware attack that locks up a company’s customer data, and threatens to sell it if a ransom is not paid, is a data breach. So is the physical theft of hard drives, thumb drives, or even paper files containing sensitive information.

An expensive problem

According to IBM's Cost of a Data Breach 2022 report, the average data breach costs a company USD 4.35 million, and 83 percent of organizations have experienced more than one data breach.

Organizations of every size and type are vulnerable to breaches—large and small businesses, public and private companies, federal, state and local governments, non-profit organizations. But the consequences of a data breach are especially severe for organizations in fields such as healthcare, finance, and the public sector. The value of the data these companies handle—government secrets, patient health information, bank account numbers and log-in credentials—and the strict regulatory fines and penalties these organizations face in the event of a breach make their breach costs even higher. For example, according to the IBM report, the average healthcare data breach cost USD 10.10 million—more than twice the average cost of all breaches.

Data breach costs arise from several factors, some more surprising than others. Lost business, revenue and customers cost data breach victims USD 1.42 million on average. Detecting and containing a breach costs slightly more, averaging USD 1.44 million. And post-breach expenses, which include everything from fines, settlements, and legal fees to reporting costs and free credit monitoring for affected customers, cost the average data breach victim USD 1.49 million. Data breach reporting requirements can be particularly costly and time-consuming.

  • The U.S. Cyber Incident Reporting for Critical Infrastructure Act of 2022 (CIRCIA) requires organizations in national security, finance, critical manufacturing, and other designated industries to report cybersecurity incidents affecting either personal data or business operations to the Department of Homeland Security within 72 hours.

  • U.S. organizations subject to the Health Insurance Portability and Accountability Act (HIPAA) must notify the U.S. Department of Health and Human Services, affected individuals, and (in some cases) the media if protected health information is breached.

  • All 50 U.S. states also have their own data breach notification laws.

  • The General Data Protection Regulation (GDPR) requires companies doing business with EU citizens to notify authorities of breaches within 72 hours.
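
The three average cost figures quoted above (lost business, detection and containment, and post-breach expenses) add up to the reported overall average of USD 4.35 million. The short Python sketch below checks that arithmetic and the "more than twice" healthcare comparison. The variable and function names are illustrative, and treating the three figures as a complete breakdown is an assumption based only on the numbers quoted in this article.

```python
from decimal import Decimal

# Average cost components in USD millions, as quoted in this article
# from the Cost of a Data Breach 2022 report.
COST_COMPONENTS = {
    "lost business": Decimal("1.42"),
    "detection and containment": Decimal("1.44"),
    "post-breach expenses": Decimal("1.49"),
}

AVERAGE_BREACH_COST = Decimal("4.35")
AVERAGE_HEALTHCARE_BREACH_COST = Decimal("10.10")


def total_cost(components):
    """Sum the cost components (USD millions)."""
    return sum(components.values(), Decimal("0"))


def share_of_total(components):
    """Return each component as a percentage of the total, to one decimal."""
    total = total_cost(components)
    return {
        name: (value / total * 100).quantize(Decimal("0.1"))
        for name, value in components.items()
    }


total = total_cost(COST_COMPONENTS)
healthcare_ratio = (
    AVERAGE_HEALTHCARE_BREACH_COST / AVERAGE_BREACH_COST
).quantize(Decimal("0.01"))

print(f"Sum of components: USD {total} million")
print(f"Healthcare vs. overall average: {healthcare_ratio}x")
for name, share in share_of_total(COST_COMPONENTS).items():
    print(f"{name}: {share}% of the total")
```

`Decimal` is used instead of floats so that the sum compares exactly against the published figure.
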
Why data breaches happen

Data breaches can be caused by:

  • Innocent mistakes—e.g., an employee emailing confidential information to the wrong person

  • Malicious insiders—angry or laid-off employees, or a greedy employee susceptible to an outsider’s bribe

  • Hackers—malicious outsiders committing intentional cybercrimes to steal data.

Most malicious attacks are motivated by financial gain. Hackers may steal credit card numbers, bank accounts, or other financial information to drain funds from people and companies directly. They may steal personally identifiable information (PII), such as Social Security numbers and phone numbers, for identity theft (taking out loans and opening credit cards in their victims' names) or for sale on the dark web, where it can fetch as much as USD 1 per Social Security number and USD 2,000 per passport number. Cybercriminals may also sell personal details or stolen credentials to other hackers on the dark web, who may use them for their own malicious purposes.

Data breaches may have other objectives. Unscrupulous organizations may steal trade secrets from competitors. Nation-state actors may breach government systems to steal information about sensitive political dealings, military operations, or national infrastructure. Some breaches are purely destructive, with hackers accessing sensitive data only to destroy or deface it. Such destructive attacks, which account for 17 percent of breaches according to the Cost of a Data Breach 2022 report, are often the work of nation-state actors or hacktivist groups seeking to damage an organization.

How data breaches happen

According to the Cost of a Data Breach 2022 report, the average data breach lifecycle is 277 days—meaning it takes that long for organizations to identify and contain an active breach. Data breaches can take many forms, but most external breaches follow the same basic pattern:

  1. Research: Hackers look for a target, and then look for weaknesses they can exploit in the target's computer systems or employees. They may also purchase previously stolen information or malware that will grant them access to the target's network.

  2. Attack: With a target and method identified, the hacker launches the attack. The hacker may begin a social engineering campaign, directly exploit vulnerabilities in the target system, use stolen log-in credentials, or leverage any of the other common data breach attack vectors (see below).

  3. Compromise data: The hacker locates the data they're after and takes action. This may mean exfiltrating data for use or sale, destroying data, or locking the data up with ransomware and demanding payment.

Common data breach attack vectors

Malicious actors can use a number of attack vectors, or methods, to carry out data breaches. Some of the most common include:

Stolen or compromised credentials. According to the Cost of a Data Breach 2022 report, stolen or compromised credentials are the most common initial attack vector, accounting for 19 percent of data breaches. Hackers may steal or compromise credentials by using brute force attacks, buying stolen credentials on the dark web, or tricking employees into revealing credentials through social engineering attacks.

Social engineering. Social engineering is the act of psychologically manipulating people into unwittingly compromising their own information security. Phishing, the most common type of social engineering attack, is also the second most-common data breach attack vector, accounting for 16 percent of breaches. Phishing scams use fraudulent emails, text messages, social media content or web sites to trick users into sharing credentials or downloading malware.

Ransomware. According to the Cost of a Data Breach 2022 report, it takes a company 326 days on average to identify and contain a ransomware breach. The average cost of a ransomware-related breach is USD 4.54 million, a figure that does not include ransom payments.

Directly exploiting system vulnerabilities. Cybercriminals may gain access to a target network by exploiting weaknesses in IT assets such as websites, operating systems, endpoints, and commonly used software like Microsoft Office or web browsers. Once hackers have located a vulnerability, they'll often use it to inject malware into the network. Spyware, which records a victim's keystrokes and other sensitive data and sends it back to a command and control server operated by the hackers, is a common type of malware used in data breaches.

SQL injection. Another method of breaching target systems directly, SQL injection takes advantage of weaknesses in the Structured Query Language (SQL) databases of unsecured websites. Hackers enter malicious code into the website's search field, prompting the database to return private data like credit card numbers or customers' personal details.
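
To make the mechanism concrete, here is a minimal Python sketch using the standard library's `sqlite3` module and an in-memory database. The table, column names, sample data and function names are all invented for illustration. The first search function builds its query by pasting user input into the SQL text, which is the weakness SQL injection exploits. The second passes the input as a bound parameter, a common defense.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, card_number TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("alice", "4111-0000-0000-0001"), ("bob", "4111-0000-0000-0002")],
    )
    return conn


def search_unsafe(conn, term):
    # VULNERABLE: the search term becomes part of the SQL statement itself,
    # so a crafted term can change what the query does.
    query = "SELECT name FROM customers WHERE name = '" + term + "'"
    return conn.execute(query).fetchall()


def search_safe(conn, term):
    # Safer: the term is sent as a bound parameter and is only ever
    # treated as a value, never as SQL.
    return conn.execute(
        "SELECT name FROM customers WHERE name = ?", (term,)
    ).fetchall()


conn = make_demo_db()

# A search term that closes the string literal and appends a second query.
payload = "x' UNION SELECT card_number FROM customers --"

leaked = search_unsafe(conn, payload)  # rows now contain card numbers
blocked = search_safe(conn, payload)   # no customer has this literal name

print("unsafe search returned:", leaked)
print("safe search returned:", blocked)
```

With the unsafe function, a search box meant to return names returns card numbers instead. With the parameterized version, the same input is simply a name that matches nothing.
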

Human error and IT failures. Hackers can take advantage of employees' mistakes to gain access to confidential information. For example, according to IBM's Cost of a Data Breach 2022 report, cloud misconfigurations served as the initial attack vector in 15 percent of breaches. Employees may also expose data to attackers by storing it in unsecured locations, misplacing devices with sensitive information saved on their hard drives, or mistakenly granting network users excessive data access privileges. Cybercriminals may also use IT failures, such as temporary system outages, to sneak into sensitive databases.

Physical security failures. Attackers may steal an employee's work or personal device to gain access to the sensitive data it contains, break into company offices to steal paper documents and physical hard drives, or place skimming devices on physical credit and debit card readers to collect individuals' payment card information.

Notable data breaches

A handful of examples demonstrate the range of data breach causes and costs.

  • TJX: The 2007 breach of TJX Corporation, the parent company of retailers TJ Maxx and Marshalls, was at that time the largest and costliest consumer data breach in U.S. history, with as many as 94 million compromised customer records and upwards of USD 256 million in financial losses. Hackers gained access to the data by decrypting the wireless network connecting a store’s cash registers to back-end systems.

  • Yahoo: In 2013, Yahoo suffered what may be the largest data breach in history. Hackers exploited a weakness in the company's cookie system to gain access to the names, birthdates, email addresses, and passwords of all 3 billion of Yahoo's users. The full extent of the breach didn't come to light until 2016, while Verizon was in talks to buy the company. As a result, Verizon reduced its acquisition offer by USD 350 million.

  • Equifax: In 2017, hackers breached the credit reporting agency Equifax and accessed the personal data of more than 143 million Americans. Hackers exploited an unpatched weakness in Equifax's website to gain access to the network and then moved laterally to other servers to find social security numbers, driver's license numbers, and credit card numbers. The attack cost Equifax USD 1.4 billion between settlements, fines, and other costs associated with repairing the breach.

  • SolarWinds: In 2020, Russian state actors executed a supply chain attack by hacking the software vendor SolarWinds. Hackers used the organization's network monitoring platform, Orion, to covertly distribute malware to SolarWinds' customers. Russian spies were able to gain access to the confidential information of various U.S. government agencies using SolarWinds' services, including the Treasury, Justice, and State Departments.

  • Colonial Pipeline: In 2021, hackers infected Colonial Pipeline's systems with ransomware, forcing the company to temporarily shut down the pipeline supplying 45 percent of the U.S. East Coast's fuel. Hackers used an employee's password, found on the dark web, to breach the network. The Colonial Pipeline Company paid a USD 4.4 million ransom in cryptocurrency, but the US Department of Justice was able to recover roughly USD 2.3 million of that payment.

Data breach prevention and mitigation

Standard security measures, such as regular vulnerability assessments, scheduled backups, encryption of data at rest and in transit, proper database configurations, and timely patching of systems and software, can help prevent data breaches and soften the blow when they occur. But organizations today may also implement more specific data security controls, technologies and best practices to better prevent data breaches and mitigate the damage they cause.

Incident response plans

An organization’s incident response plan (IRP)—a blueprint for detecting, containing and eradicating cyberthreats—is one of the most effective ways to mitigate the damage of a data breach. According to the Cost of a Data Breach 2022 report, organizations with regularly tested incident response plans and formal incident response teams have an average data breach cost of USD 3.26 million—USD 2.66 million less than the average cost of a data breach for organizations without incident response teams and plans.

Security AI and automation

The Cost of a Data Breach 2022 report also found that organizations that apply high levels of artificial intelligence (AI) and automation for threat detection and response have an average data breach cost that is 55.3 percent lower than organizations applying lower levels of those technologies. Technologies such as SOAR (security orchestration, automation and response), UEBA (user and entity behavior analytics), EDR (endpoint detection and response) and XDR (extended detection and response) leverage AI and advanced analytics to identify threats early, even before they lead to data breaches, and provide automation capabilities that enable a faster, cost-saving response.

Employee training

Because social engineering and phishing attacks are leading causes of breaches, training employees to recognize and avoid these attacks can reduce a company’s risk of a data breach. In addition, training employees to handle data properly can help prevent accidental data breaches and data leaks.

Identity and access management (IAM)

Strong password policies, password managers, two-factor authentication (2FA) or multi-factor authentication (MFA), single sign-on (SSO) and other identity and access management (IAM) technologies and practices can help organizations better defend against hackers using stolen or compromised credentials, the most common data breach attack vector.
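
As one illustration of how a second authentication factor can work, the sketch below implements time-based one-time passwords (TOTP, standardized in RFC 6238, which builds on HOTP from RFC 4226) using only the Python standard library. This is a teaching sketch of one common 2FA mechanism, not a description of any particular product. The function names and the one-step drift window are illustrative choices, and a production system would normally rely on a vetted library and add rate limiting.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    return hotp(secret, int(unix_time) // step, digits)


def verify_totp(secret: bytes, submitted: str, unix_time: float,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current time step or `window` steps either side."""
    counter = int(unix_time) // step
    for drift in range(-window, window + 1):
        if counter + drift < 0:
            continue
        expected = hotp(secret, counter + drift)
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), submitted.encode()):
            return True
    return False


# Shared secret taken from the test vectors in the RFCs.
demo_secret = b"12345678901234567890"
print(totp(demo_secret, 59))                   # code for the step containing t=59
print(verify_totp(demo_secret, "287082", 59))  # correct code for that step
print(verify_totp(demo_secret, "000000", 59))  # wrong code
```

Because the code depends on a secret shared in advance and on the current time, a stolen password alone is not enough to log in.
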

A zero trust security approach

A zero trust security approach is one that never trusts and continuously verifies all users or entities, whether they’re outside or already inside the network. Specifically, zero trust requires:

  • Continuous authentication, authorization and validation: Anyone or anything trying to access the network or a network resource is treated as potentially compromised or malicious, and must pass continuous, contextual authentication, authorization and validation challenges to gain or maintain access.

  • Least privilege access: Upon successful validation, users or entities are granted the lowest level of access and permissions necessary to complete their task or fulfill their role.

  • Comprehensive monitoring of all network activity: Zero trust implementations require visibility into every aspect of an organization's hybrid network ecosystem, including how users and entities interact with resources based on roles and where potential vulnerabilities exist.

These controls can help thwart data breaches and other cyberattacks by identifying and stopping them at the outset, and by limiting the movement and progression of hackers and attacks that do gain access to the network.
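
The three requirements above can be sketched as a small deny-by-default authorization check. Everything in this Python example is hypothetical: the role names, the resources, and the two context signals (`mfa_verified`, `device_trusted`) are stand-ins chosen for illustration. Real zero trust deployments evaluate far richer context through dedicated policy engines.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission map. Each role gets only the
# (resource, action) pairs it needs: least privilege.
ROLE_PERMISSIONS = {
    "support_agent": {("customer_record", "read")},
    "billing_clerk": {
        ("customer_record", "read"),
        ("invoice", "read"),
        ("invoice", "write"),
    },
}


@dataclass(frozen=True)
class AccessRequest:
    user: str
    role: str
    resource: str
    action: str
    mfa_verified: bool    # illustrative context signal
    device_trusted: bool  # illustrative context signal


def authorize(request: AccessRequest, audit_log: list) -> bool:
    """Evaluate a single request. Anything not explicitly allowed is denied."""
    allowed = (
        request.mfa_verified
        and request.device_trusted
        and (request.resource, request.action)
        in ROLE_PERMISSIONS.get(request.role, set())
    )
    # Every decision is recorded, allowed or not, to support monitoring.
    audit_log.append(
        (request.user, request.resource, request.action,
         "allow" if allowed else "deny")
    )
    return allowed


log = []
read_ok = authorize(
    AccessRequest("dana", "support_agent", "customer_record", "read", True, True), log
)
write_denied = authorize(
    AccessRequest("dana", "support_agent", "invoice", "write", True, True), log
)
untrusted_denied = authorize(
    AccessRequest("dana", "support_agent", "customer_record", "read", True, False), log
)
print(read_ok, write_denied, untrusted_denied)
print(log)
```

The check runs on every request rather than once at login. A user who was allowed a moment ago is denied as soon as the context changes, and unknown roles receive no permissions at all.
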
