What is Data Exfiltration?

Data exfiltration—also known as data extrusion or data exportation—is data theft: the intentional, unauthorized, covert transfer of data from a computer or other device. Data exfiltration can be conducted manually, or automated using malware.

For targets ranging from average users to major businesses and government agencies, data exfiltration attacks rank among the most destructive and damaging cybersecurity threats. Preventing data exfiltration and protecting company data are crucial for several reasons:

Maintaining business continuity: Data exfiltration can disrupt operations, damage customer trust and lead to financial losses.
Complying with regulations: Many industries have specific data privacy and protection regulations. Data exfiltration often results from or exposes a failure to comply with these regulations and can cause severe penalties and lasting reputational damage.
Safeguarding intellectual property: Data exfiltration can compromise trade secrets, research and development and other proprietary information essential to an organization’s profitability and competitive advantage.

For cybercriminals, sensitive data is an extremely valuable target. Stolen customer data, personally identifiable information (PII), Social Security numbers or any other type of confidential information might be sold on the black market. The stolen data can also be used to execute further cyberattacks or held hostage in exchange for exorbitant fees as part of a ransomware attack.

Data exfiltration versus data leakage versus data breach

While often used interchangeably, data leakage, data breach and data exfiltration are different, if related, concepts.

Data leakage is the accidental exposure of sensitive data. Data leakage can result from a technical security vulnerability or procedural security error.

A data breach is any security incident that results in unauthorized access to confidential or sensitive information. In other words, someone who shouldn’t have access to sensitive data gains access to it.

Data exfiltration is the discrete act of stealing the data. All data exfiltration requires a data leak or a data breach, but not all data leaks or data breaches lead to data exfiltration. For example, a threat actor can choose instead to encrypt the data as part of a ransomware attack or use it to hijack an executive’s email account. It’s not data exfiltration until the data is copied or moved to some other storage device under the attacker’s control.

The distinction is important. A Google search for ‘data exfiltration costs’ typically shows general information about the costs of data breaches but not much about the costs of data exfiltration specifically. Exfiltration costs often include substantial ransom payments to prevent the sale or release of exfiltrated data and further ransoms to prevent possible subsequent attacks.

How does data exfiltration happen?

Usually, data exfiltration is caused by one of the following:

An outside attacker—a hacker, cybercriminal, foreign adversary or other malicious actor.
A careless insider threat—an employee, business partner or other authorized user who inadvertently exposes data through human error, poor judgment (for example, falling for a phishing scam) or ignorance of security controls, policies and best practices. For example, a user transferring sensitive data to a USB flash drive, portable hard disk drive or other unsecure device poses a threat.

In rarer cases, the cause is a malicious insider threat—a bad actor with authorized access to the network, such as a disgruntled employee.

Common data exfiltration techniques and attack vectors

Outside attackers and malicious insiders exploit careless or poorly trained insiders and technical security vulnerabilities to access and steal sensitive data.

Phishing and other social engineering attacks

Social engineering attacks exploit human psychology to manipulate or trick an individual into compromising their own security or their organization’s security.

The most common type of social engineering attack is phishing, the use of email, text or voice messages that impersonate a trusted sender and convince users to do any of the following actions:

Download malware (such as ransomware)
Click links to malicious websites
Give up personal information (for example, log-in credentials)
Directly hand over the data that the attacker wants to exfiltrate

Phishing attacks can range from impersonal bulk phishing messages that appear to come from trusted brands or organizations, to highly personalized spear phishing, whale phishing and business email compromise (BEC) attacks. BEC attacks target specific individuals with messages that appear to come from close colleagues or authority figures.

But social engineering can be far less technical. One social engineering technique, called baiting, is as simple as leaving a malware-infected thumb drive where a user will pick it up. Another technique, called tailgating, is simply following an authorized user into a room or a physical location where data is stored.

Vulnerability exploits

A vulnerability exploit takes advantage of a security flaw or opening in a system’s or device’s hardware, software or firmware. Zero-day exploits take advantage of security flaws that hackers discover before software or device vendors know about them or are able to fix them. DNS tunneling uses Domain Name System (DNS) requests to evade firewall defenses and create a virtual tunnel for exfiltrating sensitive information.
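As a rough illustration of how a defender might look for DNS tunneling, the Python sketch below flags query names that contain an unusually long or unusually random-looking label, since tunneled data is commonly encoded into subdomain labels. The thresholds and function names are assumptions chosen for illustration, not values taken from any product; practical detection would also weigh query volume, record types and domain reputation.

```python
import math
from collections import Counter

# Illustrative thresholds (assumptions, not tuned values).
MAX_LABEL_LENGTH = 40     # characters in a single DNS label
MAX_ENTROPY_BITS = 3.8    # Shannon entropy, bits per character


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunneling(query_name: str) -> bool:
    """Flag a query name whose longest label is very long or very random."""
    labels = [label for label in query_name.rstrip(".").split(".") if label]
    if not labels:
        return False
    longest = max(labels, key=len)
    return (
        len(longest) > MAX_LABEL_LENGTH
        or shannon_entropy(longest) > MAX_ENTROPY_BITS
    )
```

A heuristic like this will produce false positives (for example, on some content delivery network hostnames), so it is better suited to prioritizing queries for review than to blocking them outright.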

The cost of data exfiltration

For individuals, data that is stolen through exfiltration can result in costly consequences such as identity theft, credit card or bank fraud and blackmail or extortion. For organizations—particularly organizations in highly regulated industries such as healthcare and finance—the consequences are more costly by orders of magnitude. The following consequences are examples of what may occur:

Disrupted operations resulting from lost business-critical data
Loss of customers’ trust or business
Compromised trade secrets, such as product developments and inventions, unique application code or manufacturing processes
Severe regulatory fines, fees and other sanctions for organizations that are required by law to adhere to strict data protection and privacy protocols and precautions when dealing with customers’ sensitive data
Subsequent attacks that are made possible by the exfiltrated data

Reports or studies of costs attributable directly to data exfiltration are difficult to find, but data exfiltration incidents are increasing rapidly. Today, most ransomware attacks are double-extortion attacks—the cybercriminal encrypts the victim’s data and exfiltrates it. Next, the cybercriminal demands a ransom to unlock the data (so the victim can resume business operations) and subsequent ransoms to prevent sale or release of the data to third parties.

In 2020, cybercriminals exfiltrated hundreds of millions of customer records from Microsoft and Facebook alone. In 2022, the Lapsus$ hacking group exfiltrated one terabyte of sensitive data from chipmaker Nvidia, and leaked source code for the company’s deep learning technology. If hackers follow the money, the money in data exfiltration must be good and getting better.

Data exfiltration prevention

Organizations use a combination of best practices and security solutions to prevent data exfiltration.

Security awareness training. Because phishing is such a common data exfiltration attack vector, training users to recognize phishing scams can help block hackers' attempts at data exfiltration. Schooling users on best practices for remote work, password hygiene, use of personal devices at work and handling, transferring and storing company data can help organizations reduce their risk of data exfiltration.

Identity and access management (IAM). IAM systems allow companies to assign and manage a single digital identity and a single set of access privileges for each user on the network. These systems streamline access for authorized users while keeping unauthorized users and hackers out. IAM can combine the following technologies:

Multi-factor authentication: requiring one or more log-on credentials in addition to a username and password.
Role-based access control (RBAC): providing access permissions based on the user’s role in the organization.
Adaptive authentication: requiring users to reauthenticate when context changes (for example, when they switch devices or attempt to access particularly sensitive applications or data).
Single sign-on: enabling users to log in once with a single set of login credentials and access multiple related on-premises or cloud services during that session without logging in again.
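
To make role-based access control more concrete, here is a minimal Python sketch. The role names, permission strings and the is_allowed helper are hypothetical, invented purely for illustration; a real IAM system would typically load these mappings from a directory or policy service instead of hard-coding them.

```python
from typing import Iterable

# Hypothetical role-to-permission mapping (illustrative only).
ROLE_PERMISSIONS = {
    "analyst": {"reports:read"},
    "engineer": {"reports:read", "source:read", "source:write"},
    "admin": {"reports:read", "source:read", "source:write", "users:manage"},
}


def is_allowed(user_roles: Iterable[str], permission: str) -> bool:
    """Return True if any of the user's roles grants the permission.

    Unknown roles grant nothing, so access is denied by default.
    """
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in user_roles
    )


print(is_allowed({"analyst"}, "reports:read"))  # True
print(is_allowed({"analyst"}, "source:read"))   # False
```

The deny-by-default behavior is the relevant design choice for exfiltration: a user who is never granted read access to sensitive data cannot be tricked or coerced into handing it over.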

Data loss prevention (DLP). DLP solutions monitor and inspect sensitive data in any state—at rest (in storage), in motion (moving through the network), and in use (being processed)—for signs of exfiltration, and block exfiltration accordingly. For example, DLP technology can block data from being copied to an unauthorized cloud storage service, or from being processed by an unauthorized application (for example, an app a user downloads from the web).
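
One building block of DLP is content inspection: scanning outbound text for patterns that look like sensitive data. The Python sketch below is a simplified illustration under stated assumptions. It checks only for US Social Security number formatting and for payment card numbers that pass the Luhn checksum, and the function names and category labels are hypothetical. Commercial DLP products combine many more detectors with policy and context.

```python
import re
from typing import List

# Simplified patterns (assumptions): SSN as ddd-dd-dddd, and card numbers
# as 13 to 19 digits optionally separated by spaces or hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def passes_luhn(digits: str) -> bool:
    """Return True if the digit string satisfies the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_sensitive_data(text: str) -> List[str]:
    """Return the categories of sensitive data detected in text."""
    findings = []
    if SSN_PATTERN.search(text):
        findings.append("ssn")
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and passes_luhn(digits):
            findings.append("card_number")
            break
    return findings
```

A gateway or endpoint agent could call find_sensitive_data on an outgoing message body and block or quarantine the transfer when the result is non-empty. The Luhn check reduces false positives from arbitrary digit runs such as order numbers.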

Threat detection and response technologies. A growing class of cybersecurity technologies continuously monitors and analyzes corporate network traffic and user activity. These technologies help overburdened security teams detect cyberthreats in real or near-real time and respond with minimal manual intervention.
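
As a simplified illustration of the kind of analysis such monitoring tools perform, the Python sketch below compares a user's outbound data volume for the current day against that user's own recent baseline and flags large deviations. The function name, the input format (a list of daily byte counts) and the threshold of three standard deviations are assumptions for illustration only; real products use far richer behavioral models.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous_upload(
    history_bytes: Sequence[float],
    today_bytes: float,
    threshold: float = 3.0,
) -> bool:
    """Flag today's outbound volume if it is far above the user's baseline.

    history_bytes holds the user's outbound byte counts for previous days.
    """
    if len(history_bytes) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history_bytes)
    spread = pstdev(history_bytes)
    if spread == 0:
        # Perfectly flat history: any increase stands out.
        return today_bytes > baseline
    return (today_bytes - baseline) / spread > threshold


history = [100, 120, 90, 110, 105]          # hypothetical daily totals, in MB
print(is_anomalous_upload(history, 5000))   # True
print(is_anomalous_upload(history, 115))    # False
```

Comparing each user against their own history, instead of against a single global limit, lets the same check work for users whose normal data volumes differ widely.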
