What is a Data Breach?

A data breach is any security incident in which unauthorized parties gain access to sensitive or confidential information, including personal data (Social Security numbers, bank account numbers, healthcare data) or corporate data (customer data records, intellectual property, financial information).

The terms ‘data breach’ and ‘breach’ are often used interchangeably with ‘cyberattack.’ But not all cyberattacks are data breaches—and not all data breaches are cyberattacks.

Data breaches include only those security breaches in which data confidentiality is compromised. So, for example, a distributed denial of service (DDoS) attack that overwhelms a website is not a data breach. But a ransomware attack that locks up a company’s customer data and threatens to sell it unless a ransom is paid is a data breach. So is the physical theft of hard drives, thumb drives, or even paper files containing sensitive information.

An expensive problem

According to the IBM Cost of a Data Breach 2022 report, the global average cost of a data breach is USD 4.35 million. The average cost in the United States is more than twice that amount, at USD 9.44 million. Eighty-three percent of organizations surveyed in the report experienced more than one data breach.

Organizations of every size and type are vulnerable to breaches: large and small businesses, public and private companies, federal, state and local governments and non-profit organizations. The consequences of a data breach are especially severe for organizations in fields such as healthcare, finance and the public sector.

The value of the data these organizations handle (government secrets, patient health information, bank account numbers and log-in credentials) and the strict regulatory fines and penalties they face both raise the cost of a breach. For example, according to the IBM report, the average healthcare data breach cost USD 10.10 million, more than twice the average cost of all breaches.

Data breach costs arise from several factors, some more surprising than others. The resulting loss of business, revenue and customers costs data breach victims USD 1.42 million on average. The average cost of detecting and containing a breach is slightly higher, at USD 1.44 million. And post-breach expenses, including everything from fines, settlements and legal fees to reporting costs and free credit monitoring for affected customers, cost the average data breach victim USD 1.49 million. Data breach reporting requirements can be particularly costly and time-consuming.

The U.S. Cyber Incident Reporting for Critical Infrastructure Act of 2022 (CIRCIA) requires organizations in national security, finance, critical manufacturing, and other designated industries to report cybersecurity incidents affecting either personal data or business operations to the Department of Homeland Security within 72 hours.
U.S. organizations subject to the Health Insurance Portability and Accountability Act (HIPAA) must notify the U.S. Department of Health and Human Services, affected individuals and sometimes the media if protected health information is breached.
All 50 U.S. states also have their own data breach notification laws.
The General Data Protection Regulation (GDPR) requires companies doing business with EU citizens to notify authorities of breaches within 72 hours.

This reporting and other post-breach responsibilities, from paying fines, settlements and legal fees to providing free credit monitoring for affected customers, cost the average data breach victim USD 1.49 million.

Why data breaches happen

Data breaches are caused by:

Innocent mistakes—e.g., an employee emailing confidential information to the wrong person
Malicious insiders—angry or laid-off employees, or a greedy employee susceptible to an outsider’s bribe
Hackers—malicious outsiders committing intentional cybercrimes to steal data

Financial gain is the primary motivation for most malicious attacks. Hackers may steal credit card numbers, bank accounts, or other financial information to drain funds from people and companies directly.

They could steal personally identifiable information (PII), such as Social Security numbers and phone numbers, for identity theft (taking out loans and opening credit cards in their victims' names) or for sale on the dark web, where it can fetch as much as USD 1 per Social Security number and USD 2,000 for a passport number. Cybercriminals may also sell personal details or stolen credentials to other hackers on the dark web, who may use them for their own malicious purposes.

Data breaches may have other objectives. Unscrupulous organizations may steal trade secrets from competitors. Nation-state actors may breach government systems to steal information about sensitive political dealings, military operations, or national infrastructure.

Some breaches are purely destructive, with hackers accessing sensitive data only to destroy or deface it. Such destructive attacks, which account for 17% of breaches according to the Cost of a Data Breach 2022 report, are often the work of nation-state actors or hacktivist groups seeking to damage an organization.

How data breaches happen

According to the Cost of a Data Breach 2022 report, the average data breach lifecycle is 277 days, which means it takes that long for organizations to identify and contain an active breach.

Intentional data breaches caused by internal or external threat actors follow the same basic pattern:

Research: Hackers look for a target and then look for weaknesses that they can exploit in the target's computer system or employees. They may also purchase previously stolen information or malware that will grant them access to the target's network.
Attack: With a target and method identified, the hacker launches the attack. The hacker could begin a social engineering campaign, directly exploit vulnerabilities in the target system, use stolen log-in credentials, or leverage any of the other common data breach attack vectors.
Compromise data: The hacker locates the data they're after and takes action. This may mean exfiltrating data for use or sale, destroying data, or locking up the data with ransomware and demanding payment.

Common data breach attack vectors

Malicious actors can use various attack vectors, or methods, to carry out data breaches. Some of the most common include:

Stolen or compromised credentials

According to the Cost of a Data Breach 2022 report, stolen or compromised credentials are the most common initial attack vector, accounting for 19% of data breaches. Hackers may steal or compromise credentials by using brute force attacks, buying stolen credentials on the dark web, or tricking employees into revealing credentials through social engineering attacks.

Social engineering attacks

Social engineering is the act of psychologically manipulating people into unwittingly compromising their own information security. Phishing, the most common type of social engineering attack, is also the second most common data breach attack vector, accounting for 16% of breaches. Phishing scams use fraudulent emails, text messages, social media content or websites to trick users into sharing credentials or downloading malware.

Ransomware

According to the Cost of a Data Breach 2022 report, it takes a company 326 days on average to identify and contain a ransomware breach. This is particularly troubling because, according to the X-Force Threat Intelligence Index 2023, the average time to execution for ransomware dropped from 60+ days in 2019 to just 3.85 days in 2021. The average cost of a ransomware-related breach is USD 4.54 million, a figure that does not include ransom payments, which can run to tens of millions of dollars.

System vulnerabilities

Cybercriminals may gain access to a target network by exploiting weaknesses in IT assets like websites, operating systems, endpoints and commonly used software like Microsoft Office or web browsers. Once hackers locate a vulnerability, they will often use it to inject malware into the network. Spyware, which records a victim's keystrokes and other sensitive data and sends it back to a command and control server that the hackers operate, is a common type of malware used in data breaches.

SQL injection

Another method of breaching target systems directly, SQL injection takes advantage of weaknesses in the Structured Query Language (SQL) databases of unsecured websites. Hackers enter malicious code into the website's search field, prompting the database to return private data like credit card numbers or customers' personal details.
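
The mechanism described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory SQLite database with a made-up customers table and placeholder card values; none of these names come from a real system. The first search function builds the query by string concatenation, which is the weakness SQL injection exploits. The second passes the search term as a bound parameter, so the database treats it strictly as data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, card_number TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("alice", "0000-TEST-0001"), ("bob", "0000-TEST-0002")],
    )
    return conn


def search_unsafe(conn: sqlite3.Connection, term: str) -> list:
    # Vulnerable: the user's input becomes part of the SQL text itself.
    query = "SELECT name, card_number FROM customers WHERE name = '" + term + "'"
    return conn.execute(query).fetchall()


def search_safe(conn: sqlite3.Connection, term: str) -> list:
    # Parameterized: the input is bound to the ? placeholder as a plain value.
    query = "SELECT name, card_number FROM customers WHERE name = ?"
    return conn.execute(query, (term,)).fetchall()


conn = build_demo_db()
payload = "nobody' OR '1'='1"  # typed into the hypothetical search field

print(search_unsafe(conn, payload))  # the OR clause matches every row
print(search_safe(conn, payload))    # no customer has this literal name
```

With the concatenated query, the injected OR condition is always true, so the search returns every customer record. With the parameterized query, the same input matches nothing. Parameterized queries are one widely used defense, not a complete one; input validation and least-privilege database accounts are typically used alongside them.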

Human error and IT failures

Hackers can take advantage of employees' mistakes to gain access to confidential information. For example, according to the IBM Cost of a Data Breach 2022 report, cloud misconfigurations served as the initial attack vector in 15% of breaches. Employees may also expose data to attackers by storing it in unsecured locations, misplacing devices with sensitive information saved on their hard drives, or mistakenly granting network users excessive data access privileges. Cybercriminals may also use IT failures, such as temporary system outages, to sneak into sensitive databases.

Physical or site security errors

Attackers may steal an employee's work or personal device to gain access to the sensitive data it contains, break into company offices to steal paper documents and physical hard drives, or place skimming devices on physical credit and debit card readers to collect individuals' payment card information.

Notable data breaches

A handful of examples demonstrate the range of data breach causes and costs.

TJX: The 2007 breach of TJX Corporation, the parent company of retailers TJ Maxx and Marshalls, was at that time the largest and costliest consumer data breach in U.S. history, with as many as 94 million compromised customer records and more than USD 256 million in financial losses. Hackers gained access to the data by cracking the encryption on the wireless network that connected a store’s cash registers to back-end systems.
Yahoo: In 2013, Yahoo suffered what may be the largest data breach in history. Hackers exploited a weakness in the company's cookie system to gain access to the names, birthdates, email addresses and passwords of all 3 billion of Yahoo's users. The full extent of the breach didn't come to light until 2016, while Verizon was in talks to buy the company. As a result, Verizon reduced its acquisition offer by USD 350 million.

Equifax: In 2017, hackers breached the credit reporting agency Equifax and accessed the personal data of more than 143 million Americans. Hackers exploited an unpatched weakness in Equifax's website to gain access to the network and then moved laterally to other servers to find social security numbers, driver's license numbers and credit card numbers. The attack cost Equifax USD 1.4 billion between settlements, fines and other costs associated with repairing the breach.

SolarWinds: In 2020, Russian threat actors executed a supply chain attack by hacking the software vendor SolarWinds. Hackers used the organization's network monitoring platform, Orion, to covertly distribute malware to SolarWinds' customers. Russian spies were able to gain access to the confidential information of various U.S. government agencies that use SolarWinds' services, including the Treasury, Justice and State Departments.

Colonial Pipeline: In 2021, hackers infected Colonial Pipeline's systems with ransomware, forcing the company to temporarily shut down the pipeline supplying 45% of the U.S. East Coast's fuel. Hackers used an employee's password, found on the dark web, to breach the network. The Colonial Pipeline Company paid a USD 4.4 million ransom in cryptocurrency, but federal law enforcement was able to recover roughly USD 2.3 million of that payment.

Data breach prevention and mitigation

Standard security measures (regular vulnerability assessments, scheduled backups, encryption of data at rest and in transit, proper database configurations, timely application of system and software patches) can help prevent data breaches and soften the blow when data breaches occur. But today organizations may implement more specific data security controls, technologies and best practices to better prevent data breaches and mitigate the damage they cause.

Incident response plans. An organization’s incident response plan (IRP), a blueprint for detecting, containing and eradicating cyberthreats, is one of the most effective ways to mitigate the damage of a data breach. According to the Cost of a Data Breach 2022 report, organizations with regularly tested incident response plans and dedicated response teams have an average data breach cost of USD 3.26 million, which is USD 2.66 million less than the average for organizations without them.

AI and automation. The Cost of a Data Breach 2022 report also found that organizations that apply high levels of artificial intelligence (AI) and automation for threat detection and response have an average data breach cost that is 55.3% lower than organizations applying lower levels of those technologies. Technologies such as security orchestration, automation and response (SOAR), user and entity behavior analytics (UEBA), endpoint detection and response (EDR) and extended detection and response (XDR) leverage AI and advanced analytics to identify threats early, even before they lead to data breaches, and provide automation capabilities that enable a faster, cost-saving response.

Employee training. Because social engineering and phishing attacks are leading causes of breaches, training employees to recognize and avoid these attacks can reduce a company’s risk of a data breach. In addition, training employees to handle data properly can help prevent accidental data breaches and data leaks.

Identity and access management (IAM). Strong password policies, password managers, two-factor authentication (2FA) or multi-factor authentication (MFA), single sign-on (SSO) and other identity and access management (IAM) technologies and practices can help organizations better defend against hackers that use stolen or compromised credentials, the most common data breach attack vector.
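
To make the second factor in 2FA and MFA concrete, here is a minimal sketch of a time-based one-time password (TOTP) check of the kind authenticator apps commonly use, following the algorithm published in RFC 6238. The function names and the demo secret are assumptions for illustration only; the article does not prescribe a specific MFA method, and a production system would normally rely on a vetted library, accept a small window of adjacent time steps and limit repeated attempts.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Return the time-based one-time password for the given moment."""
    if at_time is None:
        at_time = time.time()
    # Number of whole time steps since the Unix epoch, as an 8-byte counter.
    counter = struct.pack(">Q", int(at_time // step))
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_second_factor(secret: bytes, submitted: str,
                         at_time: Optional[float] = None) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, at_time), submitted)


# Demo secret only: this is the published RFC 6238 test key, never a real one.
DEMO_SECRET = b"12345678901234567890"
print(totp(DEMO_SECRET, at_time=59, digits=8))  # RFC 6238 lists 94287082
```

Because the code depends on both a shared secret and the current time, a password stolen in a breach or bought on the dark web is not enough on its own to log in.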

A zero trust security approach. A zero trust security approach is one that never trusts and continuously verifies all users or entities, whether they’re outside or already inside the network. Specifically, zero trust requires:

Continuous authentication, authorization and validation: Anyone or anything trying to access the network or a network resource is treated as potentially compromised or malicious and must pass continuous, contextual authentication, authorization and validation challenges to gain or maintain access.
Least privileged access: Upon successful validation, users or entities are granted the lowest level of access and permissions necessary to complete their task or fulfill their role.
Comprehensive monitoring of all network activity: Zero trust implementations require visibility into every aspect of an organization's hybrid network ecosystem, including how users and entities interact with resources based on roles and where potential vulnerabilities exist.
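
The three requirements above can be sketched as a single authorization check. This is an illustrative toy, not a zero trust product: the roles, resources, context signals and in-memory audit log are all hypothetical names chosen for the example. Every request is re-evaluated against its context (continuous validation), access is denied unless the role explicitly allows the action (least privilege), and every decision is recorded (monitoring).

```python
from dataclasses import dataclass

# Hypothetical role-to-permission map. Anything not listed is denied.
ROLE_PERMISSIONS = {
    "support_agent": {("customer_records", "read")},
    "billing_admin": {
        ("customer_records", "read"),
        ("payment_data", "read"),
        ("payment_data", "write"),
    },
}


@dataclass(frozen=True)
class AccessRequest:
    user: str
    role: str
    resource: str
    action: str
    mfa_verified: bool    # assumed signal from the identity provider
    device_trusted: bool  # assumed signal from device management


audit_log = []  # stand-in for a real monitoring pipeline


def authorize(request: AccessRequest) -> bool:
    """Evaluate one request; no earlier approval is carried forward."""
    context_ok = request.mfa_verified and request.device_trusted
    allowed_actions = ROLE_PERMISSIONS.get(request.role, set())
    permitted = (request.resource, request.action) in allowed_actions
    decision = context_ok and permitted
    # Record allowed and denied requests alike so activity stays visible.
    audit_log.append((request.user, request.resource, request.action, decision))
    return decision


agent_read = AccessRequest("dana", "support_agent", "customer_records", "read", True, True)
agent_pay = AccessRequest("dana", "support_agent", "payment_data", "read", True, True)
admin_untrusted = AccessRequest("lee", "billing_admin", "payment_data", "write", True, False)

print(authorize(agent_read))       # role allows it and context checks pass
print(authorize(agent_pay))        # outside the role's permissions
print(authorize(admin_untrusted))  # role allows it, but the device is untrusted
```

The design choice worth noting is the default: an unknown role or an unlisted action results in denial, so a mistake in the permission map tends to fail closed rather than open.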

These controls can help thwart data breaches and other cyberattacks by identifying and stopping them at the outset and by limiting the movement and progression of hackers and attacks that do gain access to the network.