Data sovereignty is the principle that nations have legal and regulatory authority over data that is generated or processed within their national borders. Data residency refers to the geographical location of data—the physical place where the data centers, servers or other systems that store or handle the data are located.
The core distinction is that data sovereignty is a legal concept and data residency is a geographical category. But the two concepts are deeply related.
Data residency often determines data sovereignty. For example, if a business stores data in a data center in Ireland, then that data resides in Ireland. Because the data resides in Ireland, Ireland has sovereignty over it. The business must comply with any data protection laws, data privacy laws and other regulatory requirements mandated by the Irish government.
That said, residency is not the only factor in determining who has jurisdiction over data. Other factors, such as where the data was originally collected or who it relates to, can also play a role.
Data sovereignty and data residency are important data governance concepts for organizations today. Businesses collect and process more data than ever before, and they often use cloud computing and software as a service (SaaS) apps to do it.
The result is a massive increase in international data flows. Businesses might collect data from people in one country, store it in a data center in a second country and process it with a cloud-based application running in a third country. Data might have to meet different legal requirements in each of these locations.
Organizations need a firm grip on where their data resides at every point in its lifecycle and the rules it must follow in each locale. Businesses can suffer significant penalties for breaking local data laws.
Data residency is the physical location of data. Data is said to reside in a particular nation, state or place if the data centers, servers or other machines that house or handle the data are physically located in that place.
Because a business’s data can move around a lot, a single organization’s data can have multiple residencies.
Say that a business is based in the US, collects personal data from US consumers and stores data on servers in the US. Clearly, the data resides in the US.
Now say that the same organization uses a SaaS app to process this data, and the app’s servers are located in Canada. Any data transferred to the Canadian servers for processing might now reside in Canada, and it might fall under Canadian data laws.
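The US and Canada scenario above can be sketched as a small record of where a data asset sits at each stage of its lifecycle. This is only an illustrative sketch: the `DataAsset` class, the stage names and the two-letter country codes are assumptions made for the example, not part of any particular governance tool.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """Hypothetical record of where one data asset is held over its lifecycle."""
    name: str
    # Each entry is (lifecycle stage, country code where the data sits at that stage).
    locations: list[tuple[str, str]] = field(default_factory=list)

    def record(self, stage: str, country: str) -> None:
        """Note that the asset is held in `country` during `stage`."""
        self.locations.append((stage, country))

    def residencies(self) -> set[str]:
        """Return every country in which the asset resides at some stage."""
        return {country for _, country in self.locations}


asset = DataAsset("customer_profiles")
asset.record("collection", "US")
asset.record("storage", "US")
asset.record("saas_processing", "CA")  # the SaaS app's servers are in Canada

print(sorted(asset.residencies()))  # ['CA', 'US']
```

Even in this toy form, the record makes the point that one data set can have more than one residency once a third-party service enters the picture.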
Data residency requirements often originate from an organization’s internal policy requirements or contractual commitments, independent of any regulatory requirement to localize data.
However, organizations do not always have a choice over where their data resides. Some regions have laws with data localization requirements, which mandate that organizations keep or process their data in a particular place.
While the terms data residency and data localization are sometimes used interchangeably, they refer to two distinct concepts. Data residency describes where data is held. Data localization refers to legal requirements to keep data where it was created, that is, to keep data local.
Some countries have data localization requirements, under which organizations must keep data created in that country within the country’s borders. These requirements can range from merely keeping a copy of the data in the country to bans on data transfers outside the country.
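The range of localization requirements described above, from keeping a local copy to banning transfers outright, can be expressed as a simple placement check. This is a minimal sketch under stated assumptions: the `LocalizationRule` categories and the `placement_allowed` function are hypothetical names invented for illustration, and real laws carry conditions and exceptions that a three-way enum does not capture.

```python
from enum import Enum


class LocalizationRule(Enum):
    """Hypothetical, simplified categories of localization requirement."""
    NONE = "none"              # no localization requirement
    LOCAL_COPY = "local_copy"  # a copy must be kept in the country of origin
    NO_EXPORT = "no_export"    # data may not be stored outside the country of origin


def placement_allowed(origin: str, storage_countries: set[str],
                      rule: LocalizationRule) -> bool:
    """Check whether a set of storage locations satisfies a localization rule."""
    if rule is LocalizationRule.NONE:
        return True
    if rule is LocalizationRule.LOCAL_COPY:
        # Copies abroad are acceptable as long as one copy stays at home.
        return origin in storage_countries
    # NO_EXPORT: the only permitted storage location is the country of origin.
    return storage_countries == {origin}


# "XX" and "YY" are placeholder country codes.
print(placement_allowed("XX", {"XX", "YY"}, LocalizationRule.LOCAL_COPY))  # True
print(placement_allowed("XX", {"XX", "YY"}, LocalizationRule.NO_EXPORT))   # False
```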
Data sovereignty is the concept that data is subject to the laws of the country or region where it is generated or processed. If a country has "sovereignty over" a piece of data, that means the country has legal authority over that data, including for the purpose of national security.
Data sovereignty is often determined by residency. If data resides in a place, it is usually subject to that place’s laws.
Some data sovereignty laws follow data around, applying to the data regardless of where it moves. For example, the European Union’s (EU) General Data Protection Regulation (GDPR) can apply to data held or processed outside of the EU if that data pertains to EU residents.
So it’s not just where the data resides that can be relevant, but also where it was collected or who it relates to.
In much the same way that data can have multiple residencies, it can also fall under multiple sovereignties. For example, data that resides in an EU country must abide by that country’s local laws and the EU-wide GDPR.
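The idea that data can fall under several sovereignties at once, based on both where it resides and who it relates to, can be sketched as a lookup. The function name, the regime labels and the three-country `EU_MEMBERS_SAMPLE` set below are assumptions for illustration only. The rule is also deliberately simplified: it treats GDPR as applying whenever the data resides in, or relates to people in, an EU country, whereas real applicability depends on further legal conditions.

```python
# Illustrative subset of EU member states, not the full list.
EU_MEMBERS_SAMPLE = {"IE", "DE", "FR"}


def applicable_regimes(residencies: set[str], subject_countries: set[str]) -> set[str]:
    """Return a simplified set of legal regimes that might apply to a data set.

    residencies: countries where the data is stored or processed.
    subject_countries: countries of residence of the people the data relates to.
    """
    # Data is usually subject to the laws of each place it resides.
    regimes = {f"{country} national law" for country in residencies}
    # Simplifying assumption: GDPR follows the data if it sits in the EU
    # or relates to EU residents.
    if (residencies | subject_countries) & EU_MEMBERS_SAMPLE:
        regimes.add("GDPR")
    return regimes


# Data stored in Ireland: Irish law and the EU-wide GDPR.
print(sorted(applicable_regimes({"IE"}, {"IE"})))  # ['GDPR', 'IE national law']
# Data stored in the US about German residents: GDPR can still follow it.
print(sorted(applicable_regimes({"US"}, {"DE"})))  # ['GDPR', 'US national law']
```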
Data sovereignty requirements can vary from one jurisdiction to the next.
Failure to comply with local data laws can lead to fines or other legal penalties. It can also cause reputational damage. If an organization flouts data privacy regulations, customers might take their business elsewhere.
Data residency and data sovereignty requirements can shape an organization’s decisions about the kinds of data it collects, the way it uses data and the IT infrastructure it builds.
Today, organizations collect more types of data (customer data, operational data, transactional data) from more data sources (web apps, business systems, Internet of Things devices) around the world. Many organizations use cloud services for data storage, processing, analytics and other key workloads.
As data moves through an organization's cloud-connected IT infrastructure, it can cross many borders. Wherever the data goes, it can be subject to new laws. When working with cloud service providers, organizations need to be aware of where their data goes for storage, backups and processing.
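One way to stay aware of where data goes for storage, backups and processing is to audit an inventory of cloud resources against the countries an organization has approved. The sketch below assumes a hypothetical inventory format (plain dictionaries with `storage`, `backup` and `processing` keys holding country codes) and a hypothetical `find_violations` helper. Real cloud providers expose location data through their own APIs, which are not shown here.

```python
def find_violations(resources: list[dict], allowed_countries: set[str]) -> list[str]:
    """List every resource location that falls outside the approved countries."""
    violations = []
    for resource in resources:
        for purpose in ("storage", "backup", "processing"):
            country = resource.get(purpose)  # None if the purpose does not apply
            if country is not None and country not in allowed_countries:
                violations.append(f"{resource['name']}: {purpose} in {country}")
    return violations


# Hypothetical inventory; names and country codes are made up for the example.
inventory = [
    {"name": "orders-db", "storage": "DE", "backup": "DE", "processing": "DE"},
    {"name": "analytics", "storage": "DE", "backup": "US", "processing": "IE"},
]

for line in find_violations(inventory, allowed_countries={"DE", "IE"}):
    print(line)  # analytics: backup in US
```

Checking backups and processing separately from primary storage matters because, as noted above, each of them can move data across a border.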
Organizations might choose to work with public cloud providers that have infrastructure in the same place as the organization. Some organizations rely on private clouds with hardware located where they need it to be.
Many organizations take a hybrid multicloud approach, using multiple public and private cloud environments and providers. This hybrid approach can help the organization build the infrastructure it needs to comply with different data laws in different locations.
The complexities of data residency and sovereignty in the cloud have led to the development of sovereign cloud, a type of cloud computing designed to help organizations comply with the legal and regulatory requirements of different regions.
Some organizations opt for on-premises data systems instead of cloud storage and processing. Keeping data on-premises can help reduce certain compliance issues, but these arrangements can also be costly and less scalable than the cloud.
Some countries mandate that organizations take certain steps to secure data, such as applying specific access controls and threat detection technologies.
While preventing unauthorized access to sensitive information is already a priority for most organizations, data residency and sovereignty can dictate the specific data security steps they must take.
Some data laws dictate what organizations can do with the data they have.
For example, some laws forbid the use of sensitive data unless specific restrictive conditions are met. Some laws grant people considerable rights over their personal data, including the right to have it deleted upon request.
Organizations that are subject to these laws must put mechanisms in place to ensure that data is used appropriately and that consumers can exercise their rights with ease.
Data residency and sovereignty requirements can have implications for artificial intelligence (AI) and machine learning (ML) workloads.
Some nations restrict certain uses of AI on certain types of data. For example, the EU AI Act prohibits practices such as social scoring systems and AI systems that exploit people's vulnerabilities, such as those due to age or disability.
Moreover, few organizations today build their own AI models from scratch. Many use AI systems from third-party providers, hosted in the cloud. These systems can introduce the same complexities as other cloud services.
Concerns about safe AI use have led to increasing interest in sovereign AI—that is, the efforts that nations undertake to develop their own AI systems and govern AI within their borders.
AI governance tools can help organizations gain more visibility into and control over how and where AI is deployed in their IT stacks. This increased visibility and control positions organizations to ensure that their AI and ML applications deliver value while complying with relevant regulations.
Disclaimer: The client is responsible for ensuring compliance with all applicable laws and regulations. IBM does not provide legal advice nor represent or warrant that its services or products will ensure that the client is compliant with any law or regulation.