What is data exfiltration?

Data exfiltration—also known as data extrusion or data exportation—is data theft: the intentional, unauthorized, covert transfer of data from a computer or other device. Data exfiltration can be conducted manually, or automated using malware.

For targets ranging from average users to major businesses and government agencies, data exfiltration attacks rank among the most destructive and damaging cybersecurity threats. Preventing data exfiltration and protecting company data are crucial for several reasons:

  • Maintaining business continuity: Data exfiltration can disrupt operations, damage customer trust and lead to financial losses.

  • Complying with regulations: Many industries have specific data privacy and protection regulations. Data exfiltration often results from or exposes a failure to comply with these regulations and can lead to severe penalties and lasting reputational damage.

  • Safeguarding intellectual property: Data exfiltration can compromise trade secrets, research and development and other proprietary information essential to an organization’s profitability and competitive advantage.

For cybercriminals, sensitive data is an extremely valuable target. Stolen customer data, personally identifiable information (PII), social security numbers, or any other type of confidential information might be sold on the black market. The stolen data can also be used to execute further cyberattacks or held hostage in exchange for exorbitant fees as part of a ransomware attack.

Data exfiltration versus data leakage versus data breach

While often used interchangeably, data leakage, data breach and data exfiltration are different, if related, concepts.

Data leakage is the accidental exposure of sensitive data. Data leakage can result from a technical security vulnerability or procedural security error.

A data breach is any security incident that results in unauthorized access to confidential or sensitive information. In other words, someone who shouldn’t have access to sensitive data gains access to it.

Data exfiltration is the discrete act of stealing the data. All data exfiltration requires a data leak or a data breach, but not all data leaks or data breaches lead to data exfiltration. For example, a threat actor can choose instead to encrypt the data as part of a ransomware attack or use it to hijack an executive’s email account. It’s not data exfiltration until the data is copied or moved to some other storage device under the attacker’s control.

The distinction is important. A Google search for ‘data exfiltration costs’ typically returns general information about the costs of data breaches but little about the costs of data exfiltration specifically. Exfiltration costs often include substantial ransom payments to prevent the sale or release of exfiltrated data and further ransoms to prevent possible subsequent attacks.

How does data exfiltration happen?

Usually, data exfiltration is caused by one of the following:

  • An outside attacker: a hacker, cybercriminal, foreign adversary or other malicious actor.

  • A careless insider threat: an employee, business partner or other authorized user who inadvertently exposes data through human error, poor judgment (for example, falling for a phishing scam) or ignorance of security controls, policies and best practices. For example, a user who transfers sensitive data to a USB flash drive, portable hard disk drive or other unsecured device poses a threat.

In rarer cases, the cause is a malicious insider threat: a bad actor with authorized access to the network, such as a disgruntled employee.

Common data exfiltration techniques and attack vectors

Outside attackers and malicious insiders exploit careless or poorly trained insiders and technical security vulnerabilities to access and steal sensitive data.

Phishing and other social engineering attacks

Social engineering attacks exploit human psychology to manipulate or trick an individual into compromising their own security or their organization’s security.

The most common type of social engineering attack is phishing, the use of email, text or voice messages that impersonate a trusted sender and convince users to do any of the following actions:

  • Download malware (such as ransomware)
  • Click links to malicious websites
  • Give up personal information (for example, log-in credentials)
  • Directly hand over the data that the attacker wants to exfiltrate

Phishing attacks can range from impersonal bulk phishing messages that appear to come from trusted brands or organizations, to highly personalized spear phishing, whale phishing and business email compromise (BEC) attacks. BEC attacks target specific individuals with messages that appear to come from close colleagues or authority figures.

But social engineering can be far less technical. One social engineering technique, called baiting, is as simple as leaving a malware-infected thumb drive where a user will pick it up. Another technique, called tailgating, is simply following an authorized user into a room or a physical location where data is stored.

Vulnerability exploits

A vulnerability exploit takes advantage of a security flaw or opening in a system’s or device’s hardware, software or firmware. Zero-day exploits take advantage of security flaws that hackers discover before software or device vendors know about them or are able to fix them. DNS tunneling uses Domain Name System (DNS) requests to evade firewall defenses and create a virtual tunnel for exfiltrating sensitive information.
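DNS tunneling tends to leave a recognizable trace: because the stolen bytes are encoded into the query name, the leftmost label is often unusually long and random-looking. As a rough illustration of how a defender might flag such queries, the sketch below scores that label by length and Shannon entropy. The function names and both thresholds (`max_label_len`, `min_entropy`) are assumptions chosen for illustration, not values from any product or standard, and a real detector would need tuning against normal traffic.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunneling(query_name: str,
                         max_label_len: int = 40,
                         min_entropy: float = 3.5) -> bool:
    """Heuristic: flag a DNS query whose leftmost label is long or high-entropy.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    labels = query_name.rstrip(".").split(".")
    first = labels[0].lower()
    if len(first) > max_label_len:
        return True
    # Short labels give unreliable entropy estimates, so require some length.
    return len(first) >= 16 and shannon_entropy(first) >= min_entropy
```

A heuristic like this produces false positives (content delivery networks also use long generated hostnames), so it is better suited to prioritizing queries for review than to blocking them outright.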

The cost of data exfiltration

For individuals, data that is stolen through exfiltration can result in costly consequences such as identity theft, credit card or bank fraud and blackmail or extortion. For organizations—particularly organizations in highly regulated industries such as healthcare and finance—the consequences are more costly by orders of magnitude. The following consequences are examples of what may occur:

  • Disrupted operations resulting from lost business-critical data

  • Loss of customers’ trust or business

  • Compromised trade secrets, such as product developments or inventions, proprietary application code or manufacturing processes

  • Severe regulatory fines, fees and other sanctions for organizations that are required by law to adhere to strict data protection and privacy protocols and precautions when dealing with customers’ sensitive data

  • Subsequent attacks that are made possible by the exfiltrated data

Reports or studies of costs attributable directly to data exfiltration are difficult to find, but data exfiltration incidents are increasing rapidly. Today, most ransomware attacks are double-extortion attacks—the cybercriminal encrypts the victim’s data and exfiltrates it. Next, the cybercriminal demands a ransom to unlock the data (so the victim can resume business operations) and subsequent ransoms to prevent sale or release of the data to third parties.

In 2020, cybercriminals exfiltrated hundreds of millions of customer records from Microsoft and Facebook alone. In 2022, the Lapsus$ hacking group exfiltrated one terabyte of sensitive data from chipmaker Nvidia, and leaked source code for the company’s deep learning technology. If hackers follow the money, the money in data exfiltration must be good and getting better.

Data exfiltration prevention

Organizations use a combination of best practices and security solutions to prevent data exfiltration.

Security awareness training. Because phishing is such a common data exfiltration attack vector, training users to recognize phishing scams can help block hackers' attempts at data exfiltration. Schooling users on best practices for remote work, password hygiene, use of personal devices at work and handling/transferring/storing company data can help organizations reduce their risk of data exfiltration.

Identity and access management (IAM). IAM systems allow companies to assign and manage a single digital identity and a single set of access privileges for each user on the network. These systems streamline access for authorized users while keeping unauthorized users and hackers out. IAM can combine the following technologies:

  • Multi-factor authentication: requiring one or more log-on credentials in addition to a username and password.

  • Role-based access control (RBAC): providing access permissions based on the user’s role in the organization.

  • Adaptive authentication: requiring users to reauthenticate when context changes (for example, when they switch devices or attempt to access particularly sensitive applications or data).

  • Single sign-on: enabling users to log in to a session once using a single set of login credentials, and to access multiple related on-premises or cloud services during that session without logging in again.
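To make two of these ideas concrete, here is a minimal sketch of how role-based access control and adaptive authentication could combine into a single access decision. Everything in it is hypothetical: the role names, the permission strings, the `Session` type and the `decide` function are invented for illustration and do not correspond to any particular IAM product.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping; real IAM systems store this centrally.
ROLE_PERMISSIONS = {
    "analyst": {"reports:read"},
    "engineer": {"reports:read", "source:read"},
    "admin": {"reports:read", "source:read", "customer_pii:read"},
}

# Permissions assumed sensitive enough to always trigger reauthentication.
SENSITIVE_PERMISSIONS = {"customer_pii:read"}


@dataclass
class Session:
    user: str
    role: str
    known_devices: set = field(default_factory=set)


def decide(session: Session, permission: str, device_id: str) -> str:
    """Return 'deny', 'reauthenticate' or 'allow' for an access request."""
    # RBAC: the role, not the individual user, determines what is permitted.
    if permission not in ROLE_PERMISSIONS.get(session.role, set()):
        return "deny"
    # Adaptive authentication: a new device or sensitive data changes the context.
    if device_id not in session.known_devices or permission in SENSITIVE_PERMISSIONS:
        return "reauthenticate"
    return "allow"
```

For example, an engineer on a known laptop would be allowed to read source code, asked to reauthenticate from an unrecognized phone, and denied access to customer PII regardless of device.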

Data loss prevention (DLP). DLP solutions monitor and inspect sensitive data in any state—at rest (in storage), in motion (moving through the network), and in use (being processed)—for signs of exfiltration, and block exfiltration accordingly. For example, DLP technology can block data from being copied to an unauthorized cloud storage service, or from being processed by an unauthorized application (for example, an app a user downloads from the web).
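As a simplified illustration of the "in motion" case, the sketch below inspects outgoing text for two kinds of sensitive data, US Social Security numbers and payment card numbers, and blocks the transfer unless the destination is on an approved list. The regular expressions, function names and the approved-destination set are assumptions made for this example; commercial DLP products rely on far richer detection than pattern matching.

```python
import re

# Candidate patterns (illustrative only; real DLP detectors are far richer).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum used by payment cards."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # Double every second digit, counting from the right.
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def find_sensitive(text: str) -> list:
    """Return (kind, matched_text) pairs for sensitive values found in text."""
    findings = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    for match in CARD_PATTERN.finditer(text):
        # The checksum filters out most digit runs that are not card numbers.
        if luhn_valid(match.group()):
            findings.append(("card", match.group()))
    return findings


def allow_transfer(text: str, destination: str, approved: set) -> bool:
    """Allow a transfer to an approved destination, or when nothing sensitive is found."""
    return destination in approved or not find_sensitive(text)
```

The Luhn check is included because a bare digit pattern would flag order numbers and timestamps; validating the checksum is one common way to reduce that noise.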

Threat detection and response technologies. A growing class of cybersecurity technologies continuously monitors and analyzes corporate network traffic and user activity. These technologies help overburdened security teams detect cyberthreats in real or near-real time and respond with minimal manual intervention. These technologies include the following:

  1. Intrusion detection systems (IDSs)
  2. Intrusion prevention systems (IPSs)
  3. Security information and event management (SIEM)
  4. Security orchestration, automation and response (SOAR) software
  5. Endpoint detection and response (EDR)
  6. Extended detection and response (XDR) solutions
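
One signal that monitoring tools of this kind can look for is a host suddenly sending far more data out of the network than it normally does. The sketch below shows the general idea with a simple z-score against the host's own history. The function name, the megabyte units and the default threshold of 3.0 are assumptions for illustration; the products listed above use much more sophisticated analytics.

```python
from statistics import mean, stdev


def unusual_upload(history_mb: list, today_mb: float, z_threshold: float = 3.0) -> bool:
    """Flag today's outbound volume if it sits far above the host's own baseline."""
    if len(history_mb) < 2:
        return False  # Not enough history to estimate a baseline.
    baseline = mean(history_mb)
    spread = stdev(history_mb)
    if spread == 0:
        # Perfectly flat history: any increase stands out (a noisy rule in practice).
        return today_mb > baseline
    return (today_mb - baseline) / spread > z_threshold
```

With a history of roughly 100 MB per day, a 5,000 MB day would be flagged while a 115 MB day would not.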