What is a data breach?

By IBM Services

What is a data breach and how do you defend against one?

A data breach is also known as a data spill or data leak.

According to Techopedia, a data breach is “an incident which involves the unauthorized or illegal viewing, access or retrieval of data by an individual, app, or service.” ⁽¹⁾ This is a type of security breach specifically for stealing sensitive information and can be performed physically by accessing a computer or network or remotely, by bypassing network security.

Data breaches commonly occur after a hacker or similar unauthorized user accesses a secure database or data repository. Frequently conducted via the Internet or network connection, data breaches usually revolve around the pursuit of logical or digital data.

According to Symantec, the most common form of data lost to data breaches was personally identifiable information — such as full names, credit card numbers, and Social Security numbers — with personal financial information close behind. ⁽²⁾

After they have acquired this data, hackers may use it to commit identity theft and other cybercrimes, including using the stolen information to gain administrator access to your entire network.

In addition to data loss, a data breach harms both a business and its customers in other ways. The damage extends to the cost of boosting cyber security and repairing and updating the exploitable vulnerability, as well as the long-term damage to the enterprise’s reputation and the harm to the customers who had their private information stolen.

How does a data breach occur?

Trend Micro describes a general data breach as a process that moves through the following phases:

  • Research: The hacker probes the computer or network for vulnerabilities.
  • Attack: The hacker begins the attack, making contact using a network or social attack.
    • Network attack: This attack involves network manipulation. The hacker uses infrastructure, system and applications weaknesses to infiltrate the victim’s computer or network.
    • Social attack: This attack involves social manipulation. The hacker tricks or baits employees into giving them access to the computer or network. This includes tricking an employee into revealing their login credentials or duping them into opening a malicious attachment.
  • Exfiltration: Once they have broken into a computer, the hacker can then attack the network or pilfer the company’s data. After the network is damaged or the data is extracted, the attack is considered successful. ⁽³⁾

Why do data breaches occur?

Malwarebytes argues that “a data breach isn’t a threat or attack in its own right and instead comes as a result of a cyber attack that allows hackers to gain unauthorized access to a computer system or network and steal its data.” ⁽⁴⁾ Despite the common phrase, cybercrime does pay. As more content is digitized and the cloud continues to grow, data breaches will continue to occur.

Targeted data breaches typically occur for the following reasons:  

  • Exploiting system vulnerabilities
  • Weak passwords
  • Structured Query Language (SQL) injection
  • Spyware
  • Phishing
  • Drive-by downloads
  • Broken or misconfigured access controls

Exploiting system vulnerabilities

An exploit is an attack that uses vulnerabilities in software or a system to give hackers unauthorized access to a computer or network and its data. Exploitable vulnerabilities are commonly found in operating systems, Internet browsers, and a variety of apps.

Hidden within a system’s code, these vulnerabilities are sought out by hackers as well as by cyber security experts and researchers. For example, older operating systems can have built-in vulnerabilities that today’s hackers can easily exploit to access a computer’s data.

While the hackers want to use the exploits for their own malicious gain, the cyber security experts want to better understand the exploits and how they can be patched or otherwise modified to prevent data breaches and boost cyber security.

To make their dubious work easier, some cybercriminal groups package multiple exploits into automated exploit kits. These kits allow criminals with little to no technical knowledge to take advantage of exploits.

Weak passwords

As its name implies, a weak password is a password that is easy to determine by both humans and computers. These are often passwords that contain the name of the user’s spouse, children, pets or address, since these are easy things for the user to remember. These passwords may not be case sensitive or just generally fail to use capital letters or symbols.

Hackers can easily guess weak passwords or uncover them with brute force attacks or spidering. Also, never leave your password written down on your desk, and be aware of anyone who may be shoulder surfing when you enter a password.
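
To illustrate why short, single-case passwords fall quickly to brute force, the following minimal sketch estimates how many candidates an attacker would need to cover. The function name search_space and the character-class sizes are assumptions made for this illustration; the result is a rough upper bound, not a real measure of password strength.

```python
import string


def search_space(password: str) -> int:
    """Estimate how many candidates a naive brute-force attack must cover.

    Assumption: the attacker tries every string of the same length built
    from the character classes that appear in the password.
    """
    alphabet = 0
    if any(c in string.ascii_lowercase for c in password):
        alphabet += 26
    if any(c in string.ascii_uppercase for c in password):
        alphabet += 26
    if any(c in string.digits for c in password):
        alphabet += 10
    if any(c in string.punctuation for c in password):
        alphabet += len(string.punctuation)  # ASCII symbols
    return alphabet ** len(password)


# Compare a short pet name with a longer mixed-class password.
print(search_space("rover"))
print(search_space("R0ver!Xq7#"))
```

Names and dictionary words are usually found far faster than this estimate suggests, because attackers try likely guesses first rather than every combination.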

SQL injection

SQL injection attacks exploit vulnerabilities in an unsecure website’s SQL database management software. To execute a SQL injection attack, a hacker embeds malicious code into a vulnerable site or application, then pivots to the backend database.

For example, a hacker changes the code in a retailer’s website so that when they perform a search for “best-selling headphones,” instead of yielding results for great headphones, the retailer’s website provides the hacker with a list of customers and their credit card information.

A less sophisticated type of cyber attack, SQL injection can be performed using automated programs similar to those used for exploits.
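
The headphone search scenario can be sketched with Python’s built-in sqlite3 module. The table names, sample rows, and function names below are invented for illustration; the point is only the contrast between pasting user input into the SQL text and passing it as a bound parameter.

```python
import sqlite3

# In-memory database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (name TEXT);
    CREATE TABLE customers (name TEXT, card TEXT);
    INSERT INTO products VALUES ('best-selling headphones');
    INSERT INTO customers VALUES ('Alice Example', '0000-0000-0000-0000');
""")


def search_unsafe(term: str) -> list:
    # Vulnerable: user input becomes part of the SQL statement itself.
    query = f"SELECT name FROM products WHERE name LIKE '%{term}%'"
    return conn.execute(query).fetchall()


def search_safe(term: str) -> list:
    # Safer: the driver binds the input as data, never as SQL.
    query = "SELECT name FROM products WHERE name LIKE ?"
    return conn.execute(query, (f"%{term}%",)).fetchall()


malicious = "' UNION SELECT card FROM customers --"
print(search_unsafe(malicious))  # includes rows from the customers table
print(search_safe(malicious))    # input is treated as a literal search term
```

Parameterized queries are one common defense; they do not replace input validation or least-privilege database accounts.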


Spyware

Spyware is a type of malware that infects your computer or network to “spy” on you and gather information about you, your computer, your Internet usage, and any other valuable data it can reach.

Victims often get spyware by downloading or installing something that seems benign but has spyware bundled with it (also known as bundleware). You can also get spyware by clicking a malicious link or as a secondary infection from a virus or a Trojan such as Emotet. As reported on the Malwarebytes Labs blog, Emotet, TrickBot, and other banking Trojans have found new life as delivery tools for spyware and other types of malware.

After spyware has infected your computer and collected information about you, it forwards that information to a remote location, such as command and control (C&C) servers or a similar repository, where cyber criminals can access it.


Phishing

Phishing attacks usually use social engineering to manipulate their victims’ emotions against logic and reasoning and get them to share sensitive information. They are often performed using email spoofing or cloned websites, which function similarly.

Attackers employing phishing and spam email tactics will trick users into doing the following:

  • revealing their user and password credentials
  • downloading malicious attachments
  • visiting malicious websites


For example, you could get an email that looks like it’s from your credit card company, asking you to verify made-up charges to your account and prompting you to log in using a link to a fake version of the credit card site. Unsuspecting victims attempt to log in to the fake site using their real usernames and passwords. Once the attackers have that information, they can log in to your credit card account and use it for identity theft and similar cyber crime.
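
One simple check against a fake login link is to compare the link’s actual host with the domain you expect. The sketch below is illustrative only: is_trusted_link is a hypothetical helper and examplebank.com is a made-up domain. A check like this will not catch every phishing technique.

```python
from urllib.parse import urlparse


def is_trusted_link(url: str, trusted_domain: str) -> bool:
    """Return True only if the link uses HTTPS and its host is the
    trusted domain or one of its subdomains."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    trusted = trusted_domain.lower()
    return parts.scheme == "https" and (
        host == trusted or host.endswith("." + trusted)
    )


# The genuine site, a lookalike host, and a non-HTTPS link.
print(is_trusted_link("https://www.examplebank.com/login", "examplebank.com"))
print(is_trusted_link("https://examplebank.com.verify-account.net/login", "examplebank.com"))
print(is_trusted_link("http://examplebank.com/login", "examplebank.com"))
```

The second link starts with the trusted name but actually belongs to a different domain, which is a common trick in cloned-website attacks.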

Drive-by downloads

Drive-by downloads are cyber attacks that can install spyware, adware, malware and similar software onto a user’s computer without the user’s authorization. They allow hackers to take advantage of exploits and security flaws in browsers, applications and operating systems.

This cyber attack doesn’t necessarily need to trick the users into enabling it. Unlike phishing and spoofing attacks where the user needs to click a malicious link or download a malicious attachment, drive-by downloads just engage with a computer or device without the user’s permission. 

Broken or misconfigured access controls

If a website administrator isn’t careful, they could establish access controls that leave parts of a system that are meant to be private open to the public. This could be something as careless as neglecting to set certain back-end folders that contain sensitive data to private. General users tend to remain unaware of broken or misconfigured access controls. However, hackers who perform specific Google searches can locate these folders and access them. A good comparison is a burglar entering a house through an unlocked window as opposed to breaking in through a locked door.
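
As one narrow, file-system-level analogue of this problem, the sketch below lists files that any local user can read. It assumes a POSIX system, and the function name and sample file names are invented for illustration. Web server and application access controls need their own review.

```python
import os
import stat
import tempfile


def world_readable_files(root: str) -> list:
    """List files under root whose permissions let any user read them."""
    exposed = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            mode = os.stat(path).st_mode
            if mode & stat.S_IROTH:  # the "others" read bit is set
                exposed.append(path)
    return sorted(exposed)


# Demonstration with two throwaway files in a temporary folder.
with tempfile.TemporaryDirectory() as root:
    private = os.path.join(root, "customers.csv")
    public = os.path.join(root, "index.html")
    for path in (private, public):
        open(path, "w").close()
    os.chmod(private, 0o600)  # owner only
    os.chmod(public, 0o644)   # readable by everyone
    exposed = [os.path.basename(p) for p in world_readable_files(root)]

print(exposed)
```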

Benevolent hackers and data breaches

Data breach, similar to most types of cyber theft, involves hackers attempting to gain unauthorized access to your computer or network and steal your private information. However, there are some instances where this theft is performed with benevolent intentions.

Like many cyber security researchers, white hat hackers and other benevolent hackers will attempt to break into your computer or network, to discover exploits and vulnerabilities, and then to make others aware so that they can create a solution that remedies the exploit.

For example, after nine months of reverse-engineering work, an academic hacker team from KU Leuven University in Belgium published a paper in September 2018 that revealed how they defeated Tesla’s encryption for the Model S. ⁽⁵⁾ Their work helped Tesla to create new cyber security technology for its vehicles that remedied the exploit the KU Leuven team discovered and used to clone the Model S’s key fob.

How can you detect a data breach?

Unlike many common types of cyber attacks, data breaches are notoriously hard to detect, and it’s very common for organizations to discover the breach days, weeks, or sometimes months after it has occurred. This large gap between when the data breach occurs and when it is discovered is incredibly problematic, as the hacker will have a large head start on using or selling the data they’ve stolen. By the time the data breach is finally discovered and the vulnerability that allowed it is fixed, the damage has already been done.

In his article for SecurityIntelligence, Koen Van Impe notes that there are two signs of a data breach:

  • Precursors
  • Indicators ⁽⁶⁾


Precursors signal an imminent threat based on public information, such as security blogs and vendor advisories, and on similar information from threat analysis and intelligence sources or threat detection. Cyber security professionals use precursors to prepare for an anticipated cyber attack and to adjust their system’s security and cyber resilience according to the threat level. Precursors tend to occur rarely, especially when compared to indicators.


Indicators show that a data breach may have happened or that one is currently happening. Security alerts, suspicious behavior and reports or alerts submitted by people from inside or outside of a business are all examples of indicators. Indicators frequently occur at a high volume, a factor that contributes to the incident response process’s inefficiencies.

What indicators should you look for?

Here are several indicators that you should be aware of in the event of a possible data breach or similar cyber attack:

  • Irregularly high activity for your system, disk, or network. This is particularly worrisome if this occurs during what would normally be an idle period.
  • Activity on network ports or applications that are usually inactive, or applications listening on network ports that they wouldn’t usually use.
  • Unrecognized software is installed or odd system preferences are established.
  • Unrecognized and untraceable system configuration changes, including firewall changes, services reconfigurations, new startup program installations or scheduled tasks.
  • Spikes in activity in a cloud service’s “last activity” overview that tracks abnormal behavior. This includes logging in at unusual times, from unusual locations, or from multiple locations in a short time period, and other anomalous user activity.
  • Unanticipated user account lockouts, password resets or group membership deviations.
  • Frequent system crashes or application crashes.
  • Alerts from malware or antivirus protections, including notifications that they have been disabled.
  • Frequent pop-ups or unexpected redirects while browsing the internet, or browser configuration changes such as a new home page or search engine preferences.
  • Contacts report receiving unusual emails or social media direct messages from you that you didn’t send.
  • You receive a message from an attacker demanding money, such as from ransomware.
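
One of the indicators above, logins from multiple locations in a short time period, can be sketched as a simple check over login records. The record layout, the one-hour window, and the function name are assumptions made for this illustration. Real monitoring tools use richer signals.

```python
from datetime import datetime, timedelta


def flag_location_hops(logins, window=timedelta(hours=1)):
    """Flag consecutive logins by the same user from different
    locations that happen within the given time window.

    logins: iterable of (user, timestamp, location) tuples.
    """
    flagged = []
    last_seen = {}  # user -> (timestamp, location) of previous login
    for user, when, where in sorted(logins, key=lambda entry: entry[1]):
        previous = last_seen.get(user)
        if previous and previous[1] != where and when - previous[0] <= window:
            flagged.append((user, previous[1], where))
        last_seen[user] = (when, where)
    return flagged


# Made-up sample records for two users.
sample = [
    ("alice", datetime(2019, 4, 8, 9, 0), "Paris"),
    ("alice", datetime(2019, 4, 8, 9, 20), "Singapore"),
    ("bob", datetime(2019, 4, 8, 9, 5), "Austin"),
    ("bob", datetime(2019, 4, 8, 17, 30), "Austin"),
]
hops = flag_location_hops(sample)
print(hops)
```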

What can you do to detect and respond to a data breach?

In addition to the precursors and indicators, there are several guiding principles that can bolster your ability to detect and respond to an intrusion into your system. 

1. No changes, no red flags

If you can avoid making any changes to your computer or network, then do that. Making changes in a system where there’s a suspected intrusion risks damaging or destroying evidence, or even worsening the situation. The obvious trade-off here is the weight of the incident and the hacker’s intent, as well as your business objectives and the breach’s impact on them. 

2. Gather evidence

Be sure to collect evidence of what you suspect to be an intrusion and ensure that the evidence is stored somewhere with little risk of data loss. This will help with incident analysis and post-incident decision making, and it can form part of a forensic data collection.

Log files, disk and memory information, malware samples, running process lists, user activity lists and active network connections are all data that can be collected for evidence.

In adhering to the No changes, no red flags rule, don’t make any changes to the system while collecting this information. And as with the first rule, consider your situation, the weight of the incident, and other relevant factors when weighing the advantages or disadvantages of your actions.

If you can access them, consider using remote forensics tools, and work closely with your IT operations or cyber security team. If you don’t have central logging, then ensure that logs are copied to a read-only location on a different computer or system from the attacked one.
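
Copying a log to a read-only location can be sketched as follows. This is a minimal illustration on a POSIX system: preserve_log is a hypothetical helper, and the local evidence folder stands in for a location that would normally sit on a separate system. Recording a SHA-256 digest lets you later show that the copy has not changed.

```python
import hashlib
import os
import shutil
import stat
import tempfile


def preserve_log(source: str, evidence_dir: str) -> str:
    """Copy a log file into evidence_dir, mark the copy read-only,
    and return the SHA-256 digest of the copy."""
    os.makedirs(evidence_dir, exist_ok=True)
    copy_path = os.path.join(evidence_dir, os.path.basename(source))
    shutil.copy2(source, copy_path)    # copy2 also keeps timestamps
    os.chmod(copy_path, stat.S_IRUSR)  # owner may read, nobody may write
    with open(copy_path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


# Demonstration with a throwaway log file.
with tempfile.TemporaryDirectory() as workdir:
    log = os.path.join(workdir, "auth.log")
    with open(log, "w") as handle:
        handle.write("failed login for admin\n")
    evidence = os.path.join(workdir, "evidence")
    digest = preserve_log(log, evidence)
    copy_mode = stat.S_IMODE(os.stat(os.path.join(evidence, "auth.log")).st_mode)

print(digest)
```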

3. Record everything

Note taking during incident response can provide a treasure trove of data. Try to record every action that’s taken, including the verification, correlation and pivoting actions. Take notes and ensure that you haven’t missed anything now that might be important later. Your notes can help establish timelines and determine system areas that need support.

4. Confer with your peers

Once you have established a general understanding of everything that is occurring with your system, confer with your peers and verify your findings. This includes referencing threat intelligence sources, as well as industry information sharing and analysis centers (ISACs) and national computer security incident response teams (CSIRTs). This step helps you to establish what others have already done, what steps are still needed to contain the intrusion, and how to reverse the damage it caused.

5. Create an internal report

In addition to reporting observed incidents, you should also report any critical ongoing incidents that may impact your business to your stakeholders. A high-level analysis of the attack should include the following facts:

  • Whether the attack was targeted.
  • Whether the attack was observed before.
  • Whether other companies or organizations have experienced similar attacks.
  • What damage it has caused to date and the damage it’s expected to cause in the future.
  • What the intent of the attack was.

Be sure to include in your report any mitigation actions that were taken, whether they were effective, and what additional actions you expect to take in the future. While it behooves you to include the appropriate technical details, be sure to focus on how this will impact the business and its employees.

Spread awareness about reports

Indicators can include reports from people within your organization. These internal reports can supply essential information for raising awareness of unusual behavior or situations. Streamline the reporting process and spread awareness about the reports among your employees. Consider establishing a “report an incident” button on your organization’s internal homepage. 

Make sure that your employees are aware of your cyber security team or IT support team. Make it so your employees can easily contact these teams if they have any questions or suggestions. Create help desk questions for these teams to ask to help them collect information.  

Foster transparency and a sense of ownership with the reports. This can mean following up with each individual who submitted a report and providing an update regarding the incident specific to their report.

By incorporating this process into your workplace, not only will you help to cultivate an IT security culture and potentially boost your cyber resilience and security, but employees will be much more likely to report anything they feel is unusual. This combined process and culture can help you to shut down intrusions when they start.    

What can I do to prevent a data breach?

There is no golden solution for preventing a data breach outside of never going on the Internet, never booting up your computer, or never getting your network online. Obviously, these aren’t acceptable solutions for anyone.

Fortunately, when it comes to reducing the risk of data breach, there are several things that you can do to bolster your cyber security and cyber resilience.

  • Use strong passwords: Consider using a password generator that creates random combinations of upper- and lower-case letters, numbers, and symbols. Consider using a password tracking program that helps manage these passwords for you.
  • Monitor your finances: Regularly review your bank and similar financial account activity. If possible, use activity alerts that inform you of any unusual activity. 
  • Monitor your credit report: If someone tries to use your private information to open a credit card or bank account using your name, this will show it. AnnualCreditReport.com offers a credit report every 12 months at no charge.
  • Act immediately: As soon as you see any unusual activity, take immediate action and contact the respective credit card company, bank or similar financial institution. If you were the victim of a data breach, then be sure to inform them of this fact.
  • Make your phone security-rich: Always create either a short numerical password or a swipe password for your phone. If you have a fingerprint scanner on your phone, then you should use that too. This provides a line of defense against unauthorized access to your phone and all the personal information stored on it in the event that it’s lost or stolen.    
  • Pay attention to URLs: Try to only use secure URLs. These all begin with “https://”. The “s” stands for secure, and it means the HTTP request is sent over Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL), a protocol used for secure communication between two parties.
  • Up-to-date anti-virus: Depending on what software you are using and how your network is set up, this may also include a firewall. It should go without saying that having reliable anti-virus software with up-to-date definitions generally boosts your cyber security and cyber resilience and improves your resistance to cyber attacks.
  • Regularly back up your files: Establish a regular schedule for backing up your files and storing these backups in a security-rich environment. This will help you with creating recovery point objectives in the event of data loss or corruption.
  • Format or destroy your old hard drives: If you are retiring old systems and you are planning on cannibalizing the components, then be sure to format the hard drives before installing them into new computers. If you are simply getting rid of these systems and don’t plan on reusing the components, then first make sure that you have backed up your files. Secondly, dispose of your hard drives in such a way that it ensures no one will be able to make use of them. The simplest solution here is often to take a hammer to them.
  • Don’t post important information online: This is a practical step that shouldn’t require much explanation. Don’t post private, sensitive or otherwise very important information online, including on your social media accounts. It’s also generally a good idea to set your social media accounts to “private” as this limits who can view your social media account’s content.
  • Identity theft protection and credit monitoring services: Consider using identity theft protection and credit monitoring services, as they help prevent identity theft and can notify you in the event that it does occur.   
  • Use secure payment services: PayPal is a great example of this, as it doesn’t require you to give your credit card information to make a payment. Instead it helps you make secure payments using your accounts and without requiring you to input sensitive information.
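
As a minimal sketch of the password generator idea from the first tip in this list, the example below uses Python’s secrets module, which is designed for security-sensitive randomness. The function name and the default length of 16 characters are assumptions made for this illustration.

```python
import secrets
import string

# Character classes that each generated password must draw from.
CLASSES = [
    string.ascii_lowercase,
    string.ascii_uppercase,
    string.digits,
    string.punctuation,
]


def generate_password(length: int = 16) -> str:
    """Return a random password that contains at least one lower-case
    letter, upper-case letter, digit, and symbol."""
    if length < len(CLASSES):
        raise ValueError("length is too short to include every class")
    alphabet = "".join(CLASSES)
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every character class is represented.
        if all(any(c in group for c in candidate) for group in CLASSES):
            return candidate


print(generate_password())
```

A password manager would normally generate and store such passwords for you, so you don’t have to remember them.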

2018: Year of the data breach

Because of the vast amount of data they contain, enterprises and large organizations are exceptionally attractive targets for cybercriminals who are looking to steal data.

In the Malwarebytes Labs blog post 2018: The year of the data breach tsunami, author Logan Strain notes that more data breaches occurred during 2017 than in 2018. However, the 2018 data breaches were more massive in scale and featured victims that included some of the biggest tech companies, retailers, and hospitality providers such as Facebook, Under Armour, Quora, and Panera Bread. ⁽⁷⁾

According to the Ponemon Institute’s 2018 Cost of a Data Breach study, a data breach goes undiscovered for an average of 197 days. The study puts the average total cost to a company of a data breach at USD 3.86 million, a 6.4 percent increase over 2017. The global average cost for each lost or stolen record also increased by 4.8 percent, to approximately USD 148 per record. ⁽⁸⁾

The amount of data lost is further compounded by data breaches being notoriously difficult to detect, often going undiscovered for months, with an additional 69 days needed to reverse the damage and recuperate from the losses.

Facebook data breaches, exposures and cyber attack

Facebook experienced several data breaches, data exposures and cyber attacks that were made public during 2018 and 2019.

Facebook’s data exposures involve data stored online and publicly without a password. Unlike a data breach or cyber attack, these exposures don’t necessarily involve malicious intent. They are instead tied to human error and represent a security problem.


The first data breach

When did the breach occur: Between 2013 and 2015

When was the breach discovered: Unknown

When was the breach made public: The breach was exposed on 17 March 2018 by reports from The New York Times and The Guardian.

What was stolen:  

  • Facebook user profile data
  • Facebook user preferences and interests

How much data was stolen: Although it was initially reported that 50 million Facebook profiles were accessed by Cambridge Analytica, multiple reports later confirmed that the figure was actually closer to 87 million users.

How did the data breach occur: A loophole in Facebook’s application programming interface allowed third-party developers to collect data. Cambridge Analytica exploited this loophole and was able to steal data from the Facebook app users, as well as from all the people in those users’ friends networks on Facebook.

What happened to the data: Cambridge Analytica used the data from these profiles to help identify swing voters in the 2016 U.S. presidential election. ⁽⁹⁾

Technicality: Technically, this isn’t a data breach, and is instead a misuse of user data.


The second data breach

When did the breach occur: The breach took place between July 2017 and the end of September 2018.

When was the breach discovered: The breach was discovered on 25 September 2018.

When was the breach made public: This breach was publicly disclosed on 28 September 2018.

What was stolen:

  • Names
  • Phone numbers
  • Email addresses
  • Other personal information

How much data was stolen: Facebook initially reported that the breach exposed the information of approximately 50 million users, a figure that was later revised to 30 million users, with 14 million of them having their usernames and Facebook search history accessed.

How did the data breach occur: Using a flaw in the code for Facebook’s “view as” feature, hackers stole Facebook access tokens, then used the tokens to access users’ accounts, potentially gaining control of them.


The Instagram Nasty List attack

When did the attack occur: Unknown

When was the attack discovered: During March and April 2019

When was the attack made public: 8 April 2019

What was stolen:

  • Instagram login information:
    • User names and passwords
    • Email addresses
    • Phone numbers

First reported on Reddit, compromised Instagram accounts would message non-compromised accounts that they followed, telling them that they were on a “Nasty List” or something similar, and including a malicious link.

How did this attack occur: This was a phishing attack. The malicious link would take the user to a cloned or otherwise fake Instagram page and prompt them to log in.

How much data was stolen: The amount of stolen Instagram user information as a result of this attack is unknown. ⁽¹⁰⁾


Instagram passwords plaintext file data exposure

When did this data exposure occur: Unknown

When was this data exposure discovered: During March and April 2019

When was this data exposure made public: 18 April 2019

What may have been exposed:

  • Millions of Instagram passwords

How did this data exposure occur: Following the Instagram Nasty List attacks, Facebook confirmed more password security issues, noting that millions of Instagram accounts’ passwords were being stored in a plain text file. Although Facebook said “our investigation has determined that these stored passwords were not internally abused or improperly accessed,” ⁽¹¹⁾ users whose information was in the plain text file were encouraged to perform a password reset.


Facebook unsecure databases data exposure

When did this data exposure occur: Unknown

When was this data exposure discovered: Unknown

When was this data exposure made public: 4 September 2019

What may have been exposed:

  • Phone numbers linked to 419 million user accounts from multiple databases across several geographies including:
    • 133 million records on U.S.-based Facebook users
    • 18 million records of users in the U.K.
    • more than 50 million records on users in Vietnam
  • In addition to Facebook user IDs and phone numbers, information about each account’s username, gender, and country location was included.

How did this data exposure occur: Unsecure databases across several countries contained Facebook account IDs, phone numbers and additional user information. ⁽¹²⁾


  1. Data Breach, Techopedia. https://www.techopedia.com/definition/13601/data-breach
  2. What is a data breach, Norton. https://us.norton.com/internetsecurity-privacy-data-breaches-what-you-need-to-know.html
  3. Data Breaches 101: How They Happen, What Gets Stolen, and Where It All Goes, Trend Micro. 10 August 2018. https://www.trendmicro.com/vinfo/us/security/news/cyber-attacks/data-breach-101
  4. Data Breach, Malwarebytes. https://www.malwarebytes.com/data-breach/
  5. Greenberg, Andy. Hackers Can Steal a Tesla Model S in Seconds by Cloning Its Key Fob, Wired. 10 September 2018. https://www.wired.com/story/hackers-steal-tesla-model-s-seconds-key-fob/
  6. Van Impe, Koen. Don’t Dwell On It: How to Detect a Breach on Your Network More Efficiently, SecurityIntelligence. 22 October 2018. https://securityintelligence.com/dont-dwell-on-it-how-to-detect-a-breach-on-your-network-more-efficiently/
  7. Strain, Logan. 2018: The year of the data breach tsunami, Malwarebytes Labs. 4 April 2019. https://blog.malwarebytes.com/101/2018/12/2018-the-year-of-the-data-breach-tsunami/
  8. Ponemon Institute's 2018 Cost of a Data Breach Study: Global Overview. https://www.ibm.com/account/reg/us-en/signup?formid=urx-33316
  9. Katz, Eitan. The 20 Biggest Data Breaches of 2018, Dashlane blog. 2 January 201 https://blog.dashlane.com/data-breaches-2018/
  10. Winder, Davey. Hackers Are Using Instagram 'Nasty List' To Steal Passwords -- Here's What You Need To Know, Forbes. 14 April 2019. https://www.forbes.com/sites/daveywinder/2019/04/14/hackers-are-using-instagram-nasty-list-to-steal-passwords-heres-what-you-need-to-know/#c9473b669dd2
  11. Keeping Passwords Secure, Facebook. 21 March 2019. https://about.fb.com/news/2019/03/keeping-passwords-secure/
  12. Winder, Davey. Unsecured Facebook Databases Leak Data Of 419 Million Users, Forbes. 5 September 2019. https://www.forbes.com/sites/daveywinder/2019/09/05/facebook-security-snafu-exposes-419-million-user-phone-numbers/#62bac4231ab7