What is data exfiltration?

Data exfiltration—also known as data extrusion or data exportation—is data theft: the intentional, unauthorized, covert transfer of data from a computer or other device. Data exfiltration may be conducted manually, or automated using malware.

For targets ranging from average users to major businesses and government agencies, data exfiltration attacks rank among the most destructive and damaging cybersecurity threats. Preventing data exfiltration and protecting company data are crucial for several reasons:

  • Maintaining business continuity: Data exfiltration can disrupt operations, damage customer trust, and lead to financial losses.

  • Regulatory compliance: Many industries have specific regulations regarding data privacy and protection. Data exfiltration often results from or exposes a failure to comply with these regulations and can result in severe penalties and lasting reputational damage.

  • Safeguarding intellectual property: Data exfiltration can compromise trade secrets, research and development, and other proprietary information essential to an organization’s profitability and competitive advantage.

For cybercriminals, sensitive data has become an extremely valuable target. Stolen customer data, personally identifiable information (PII), social security numbers, or any other type of confidential information might be sold on the black market, used to execute further cyber attacks, or held hostage in exchange for exorbitant fees as part of a ransomware attack.

Data exfiltration vs. data leakage vs. data breach

While often used interchangeably, data leakage, data breach and data exfiltration are different, if related, concepts.

Data leakage is the accidental exposure of sensitive data. Data leakage can result from a technical security vulnerability or procedural security error.

A data breach is any security incident that results in unauthorized access to confidential or sensitive information. In other words, someone who shouldn’t have access to sensitive data gains access to it.

Data exfiltration is the discrete act of stealing the data. All data exfiltration requires a data leak or a data breach, but not all data leaks or data breaches lead to data exfiltration—a threat actor may choose instead to encrypt the data, as part of a ransomware attack, or use it to hijack an executive’s email account. It’s not data exfiltration until the data is copied or moved to some other storage device under the attacker’s control.

The distinction is important. Search Google for ‘data exfiltration costs’ and you’ll find lots of information about the costs of data breaches—largely because there’s very little data available on costs directly attributable to data exfiltration. But many data breach cost calculations do not include costs related specifically to exfiltration, such as the often substantial cost of ransom payments to prevent the sale or release of exfiltrated data, or the cost of subsequent attacks enabled by exfiltrated data.

How does data exfiltration happen?

In most cases, data exfiltration is caused by one of the following:

  • An outside attacker—a hacker, cybercriminal, foreign adversary or other malicious actor.

  • A careless insider threat—an employee, business partner or other authorized user who inadvertently exposes data through human error, poor judgement (e.g., falling for a phishing scam) or ignorance of security controls, policies and best practices (e.g., transferring sensitive data to a thumb drive, portable hard drive or other unsecure device).

In rarer cases, the cause is a malicious insider threat—a bad actor with authorized access to the network, such as a disgruntled employee.

Common data exfiltration techniques and attack vectors

Outside attackers and malicious insiders exploit careless or poorly trained insiders, as well as technical security vulnerabilities, to access and steal sensitive data.

Phishing and other social engineering attacks

Social engineering attacks exploit human psychology to manipulate or trick individuals into taking actions that compromise their own security or their organization’s security.

The most common type of social engineering attack is phishing, the use of email, text or voice messages that impersonate a trusted sender and convince users to download malware (such as ransomware), click links to malicious web sites, give up personal information (e.g., log-in credentials), or in some cases directly hand over the data that the attacker wants to exfiltrate.

Phishing attacks can range from impersonal bulk phishing messages that appear to come from trusted brands or organizations, to highly personalized spear phishing, whale phishing and business email compromise (BEC) attacks that target specific individuals with messages that appear to come from close colleagues or authority figures.

But social engineering can be far less technical. One social engineering technique, called baiting, is as simple as leaving a malware-infected thumb drive where a user will pick it up. Another technique, called tailgating, is no more complex than following an authorized user into a room or other physical location where data is stored.

Vulnerability exploits

A vulnerability exploit takes advantage of a security flaw or opening in the hardware, software, or firmware of a system or device. Zero-day exploits take advantage of security flaws that hackers discover before software or device vendors know about them or are able to fix them. DNS tunneling uses Domain Name System (DNS) requests to evade firewall defenses and create a virtual tunnel for exfiltrating sensitive information.
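DNS tunneling typically has to pack stolen data into the query name itself, so tunneled queries often carry unusually long or random-looking labels. The following is a minimal Python sketch of that heuristic, not a production detector: the function names, the 40-character length limit, the 16-character minimum, and the 3.5-bit entropy threshold are all illustrative assumptions that would need tuning against real traffic.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of the string, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunneling(query_name: str,
                         max_label_len: int = 40,
                         entropy_threshold: float = 3.5) -> bool:
    """Flag a DNS query whose leftmost label is very long or random-looking.

    Thresholds are assumptions for illustration, not recommended values.
    """
    labels = query_name.rstrip(".").split(".")
    first = labels[0].lower()
    if len(first) > max_label_len:
        return True
    # Entropy is only meaningful on labels with enough characters to score.
    return len(first) >= 16 and shannon_entropy(first) >= entropy_threshold
```

A check like this would flag a query such as a1b2c3d4e5f6g7h8.tunnel.example.com while leaving www.example.com alone. Legitimate services (for example, some content delivery networks) also use long encoded labels, so a heuristic like this is better treated as one signal among several than as a verdict.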

The cost of data exfiltration

For individuals, data stolen through exfiltration can result in costly consequences such as identity theft, credit card or bank fraud, and blackmail or extortion. For organizations, particularly those in highly regulated industries such as healthcare and finance, the consequences are more costly by orders of magnitude. They include:

  • Disrupted operations resulting from lost business-critical data;

  • Loss of customers’ trust or business;

  • Compromise of valuable trade secrets, such as product developments/inventions, unique application code or manufacturing processes;

  • Severe regulatory fines, fees, and other sanctions for organizations required by law to adhere to strict data protection and privacy protocols and precautions when dealing with customers’ sensitive data;

  • Subsequent attacks made possible by the exfiltrated data.

Reports or studies of costs attributable directly to data exfiltration are difficult to find, but the incidence of data exfiltration is rising rapidly. Today most ransomware attacks are double-extortion attacks: the cybercriminal encrypts the victim’s data and exfiltrates it, then demands one ransom to unlock the data (so the victim can resume business operations) and a subsequent ransom to prevent sale or release of the data to third parties.

In 2020, cybercriminals exfiltrated hundreds of millions of customer records from Microsoft and Facebook alone. In 2022, the Lapsus$ hacking group exfiltrated 1 terabyte of sensitive data from chipmaker Nvidia, and leaked source code for the company’s deep learning technology. If hackers follow the money, the money in data exfiltration must be good and getting better.

Data exfiltration prevention

Organizations use a combination of best practices and security solutions to prevent data exfiltration.

Security awareness training. Because phishing is such a common data exfiltration attack vector, training users to recognize phishing scams can help block hackers’ attempts at data exfiltration. Schooling users on best practices for remote work, password hygiene, use of personal devices at work, and handling, transferring and storing company data can help organizations reduce their risk of data exfiltration.

Identity and access management (IAM). IAM systems allow companies to assign and manage a single digital identity and a single set of access privileges for each user on the network, in a way that streamlines access for authorized users while keeping unauthorized users, including hackers, out. IAM can combine technologies such as:

  • Multi-factor authentication: requiring one or more log-on credentials in addition to a username and password.

  • Role-based access control (RBAC): granting access permissions based on the user’s role in the organization.

  • Adaptive authentication: requiring users to re-authenticate when context changes (e.g., they switch devices or attempt to access particularly sensitive applications or data).

  • Single sign-on: an authentication scheme that enables users to log in once using a single set of login credentials, and then access multiple related on-premises or cloud services during that session without logging in again.
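How these controls combine into a single access decision can be sketched in a few lines of Python. This is a hedged illustration only: the role names, permission strings, `Session` fields, and the `decide` function are hypothetical and do not reflect any particular IAM product’s API.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping (RBAC). Real IAM systems keep
# this in a policy store rather than in application code.
ROLE_PERMISSIONS = {
    "analyst": {"read:reports"},
    "engineer": {"read:reports", "read:source"},
    "admin": {"read:reports", "read:source", "export:customer_data"},
}

# Permissions assumed sensitive enough to require fresh re-authentication.
SENSITIVE = {"export:customer_data"}


@dataclass
class Session:
    user: str
    role: str
    mfa_passed: bool     # multi-factor authentication completed at login
    known_device: bool   # device previously seen for this user
    recent_reauth: bool  # user re-authenticated shortly before this request


def decide(session: Session, permission: str) -> str:
    """Return 'allow', 'deny', or 'step_up' (ask the user to re-authenticate)."""
    if not session.mfa_passed:
        return "deny"
    if permission not in ROLE_PERMISSIONS.get(session.role, set()):
        return "deny"
    # Adaptive authentication: a changed context triggers re-authentication.
    if not session.known_device or (
        permission in SENSITIVE and not session.recent_reauth
    ):
        return "step_up"
    return "allow"
```

In this sketch an admin on a known device who requests `export:customer_data` without a recent re-authentication receives `step_up` rather than `allow`, which is the kind of friction that can slow an attacker using a hijacked session.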

Data loss prevention (DLP). DLP solutions monitor and inspect sensitive data in any state—at rest (in storage), in motion (moving through the network), and in use (being processed)—for signs of exfiltration, and block exfiltration if those signs are detected. For example, DLP technology can block data from being copied to an unauthorized cloud storage service, or from being processed by an unauthorized application (e.g., an app a user downloads from the web).
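As a rough illustration of content inspection for data in motion, the Python sketch below scans outbound text for patterns that resemble US social security numbers and 16-digit payment card numbers, and blocks the transfer unless the destination is on an approved list. The patterns, function names, and example hostnames are assumptions made for illustration; commercial DLP products rely on much richer classification than two regular expressions.

```python
import re

# Illustrative patterns only (assumptions, not a complete PII taxonomy).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")


def luhn_valid(number: str) -> bool:
    """Check a candidate card number with the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return len(digits) >= 13 and checksum % 10 == 0


def find_sensitive(text: str) -> list:
    """Return (kind, matched_text) pairs for data that looks sensitive."""
    findings = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    findings += [
        ("card", m.group())
        for m in CARD_PATTERN.finditer(text)
        if luhn_valid(m.group())  # drop digit runs that fail the checksum
    ]
    return findings


def allow_transfer(text: str, destination: str, approved: set) -> bool:
    """Allow the transfer if the destination is approved or nothing is flagged."""
    if destination in approved:
        return True
    return not find_sensitive(text)
```

For example, `allow_transfer("SSN 123-45-6789", "personal-drive.example", {"corp-storage.example"})` returns `False`, while the same text sent to `corp-storage.example` is allowed. The Luhn check is there to cut false positives from order numbers and other 16-digit strings that are not card numbers.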

Threat detection and response technologies. A growing class of cybersecurity technologies continuously monitor and analyze corporate network traffic and user activity and help overburdened security teams detect cyberthreats in real or near-real time and respond with minimal manual intervention. These technologies include intrusion detection systems (IDSs) and intrusion prevention systems (IPSs), security information and event management (SIEM) and security orchestration, automation and response (SOAR) software, and endpoint detection and response (EDR) and extended detection and response (XDR) solutions.
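One simple form of the user-activity analysis described above is comparing a user’s outbound data volume with that user’s own baseline. The Python sketch below uses a z-score for this; the function name, the threshold of 3.0, and the 1 MB floor on the standard deviation are illustrative assumptions, and real SIEM or XDR analytics are considerably more sophisticated.

```python
from statistics import mean, stdev


def is_anomalous(history_mb: list, today_mb: float,
                 z_threshold: float = 3.0,
                 min_sigma_mb: float = 1.0) -> bool:
    """Flag today's outbound volume if it sits far above the user's baseline.

    history_mb holds previous daily outbound totals in megabytes.
    Threshold and floor values are assumptions for illustration.
    """
    if len(history_mb) < 2:
        return False  # not enough history to estimate a baseline
    baseline = mean(history_mb)
    # Floor the spread so a perfectly flat history does not flag tiny changes.
    sigma = max(stdev(history_mb), min_sigma_mb)
    return (today_mb - baseline) / sigma > z_threshold
```

With a history of `[100, 120, 90, 110, 105]` megabytes, a day with 5,000 MB of outbound traffic is flagged while a day with 115 MB is not. In practice a signal like this would feed an alert queue for analyst review instead of triggering an automatic block.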
