What is a primary key?

Authors

Alice Gomstyn

Staff Writer

IBM Think

Alexandra Jonker

Staff Editor

IBM Think

What is a primary key?

A primary key is a column or columns in a database table with values that uniquely identify each row or record. For example, an employee ID column could be a primary key in a table of employee information.

Primary keys play an important role in relational databases by reinforcing data integrity and enabling successful data retrieval. Additionally, primary keys can be referenced by another type of key to define relationships between tables in relational databases.

Often-cited examples of primary keys found in various databases include:

Driver’s license numbers
Employee ID numbers
Product ID numbers
Social security numbers
Student ID numbers

Understanding keys and relational databases

Before exploring the particulars of primary keys, it’s helpful to understand relational database systems and the roles that keys such as primary keys play within database design.

Relational databases store data in multiple tables, with related data points linked across those tables. Such databases are managed through relational database management systems (RDBMS) such as IBM Db2, Oracle Database and Microsoft SQL Server, as well as open source systems such as PostgreSQL and MySQL.

Structured query language (SQL) is a common programming language used for database interactions, although some database management systems support other languages.

In the context of databases, a key is a column or an ordered collection of columns used to identify rows of data in a table. Keys help organize databases by showing relationships between related tables, identifying unique records and ensuring data accuracy and integrity. These benefits support successful, efficient SQL database queries.

What are the characteristics of a primary key?

A primary key, sometimes called an SQL primary key, gives each record in a table a unique identifier: no two records share the same primary key value. Important characteristics of primary keys include:

No null values
No duplicate values
May take the form of an existing column in a table (natural key) or be added as a new column (surrogate key)
May include a single column or multiple columns (as a composite primary key)

A primary key might be an existing column in a table, in which case it is known as a natural key. However, sometimes no single column in the table meets the primary key constraint, which is the rule that values must be unique and not null.
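The primary key constraint described above can be sketched in a few lines. This is a minimal illustration that uses SQLite through Python's built-in sqlite3 module as a stand-in for a full RDBMS; the table and column names are hypothetical. NOT NULL is declared explicitly because SQLite, unlike most systems, does not imply it for a text primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """
    CREATE TABLE employee (
        employee_id TEXT NOT NULL PRIMARY KEY,  -- hypothetical column names
        last_name   TEXT NOT NULL
    )
    """
)
conn.execute("INSERT INTO employee VALUES ('E001', 'Rivera')")


def try_insert(employee_id, last_name):
    """Attempt an insert; return 'ok' or the database's error message."""
    try:
        conn.execute(
            "INSERT INTO employee VALUES (?, ?)", (employee_id, last_name)
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


new_row = try_insert("E002", "Chen")      # new, non-null key: accepted
duplicate = try_insert("E001", "Okafor")  # key already in use: rejected
missing = try_insert(None, "Haddad")      # null key: rejected
```

Only the first of the three attempts is stored. The other two fail with integrity errors, one for the duplicate value and one for the null.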

In such circumstances, a new column of unique values, not derived from the table’s existing data, may be generated to serve as the primary key. This artificially generated primary key is known as a surrogate key.

Relational database management systems commonly feature the capability to generate unique values for a column, which can then be used as a surrogate key. One example is the AUTO_INCREMENT attribute in MySQL.
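As a rough sketch of a generated surrogate key, the example below uses SQLite, where a column declared INTEGER PRIMARY KEY is assigned a value automatically when none is supplied. This is SQLite's counterpart to auto-increment features in other systems, not the MySQL syntax itself, and the table and column names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,  -- surrogate key, assigned by SQLite
        last_name   TEXT NOT NULL,
        birth_date  TEXT NOT NULL
    )
    """
)

# Two customers share the same attributes, so no existing column (or
# combination of columns) could tell them apart; the generated key can.
people = [("Chen", "1990-04-12"), ("Chen", "1990-04-12")]
generated_ids = []
for last_name, birth_date in people:
    cur = conn.execute(
        "INSERT INTO customer (last_name, birth_date) VALUES (?, ?)",
        (last_name, birth_date),
    )
    generated_ids.append(cur.lastrowid)  # key value the database generated
```

The insert statements never mention customer_id, yet each row receives its own distinct value.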

Primary keys can also be composite keys, meaning they consist of more than a single column of values.

Composite primary keys are unique combinations of columns that can be used to identify records, and they can serve as an alternative to generating surrogate keys. For example, a column of customer last names might be paired with a column containing dates of birth to form a composite primary key, provided that no two customers share both values.
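The last name and date of birth pairing can be sketched as follows, again using SQLite through Python's sqlite3 module. The table layout and sample values are hypothetical, and the sketch assumes the pair really is unique in the data being modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        last_name  TEXT NOT NULL,
        birth_date TEXT NOT NULL,
        email      TEXT,
        PRIMARY KEY (last_name, birth_date)  -- composite primary key
    )
    """
)
conn.execute("INSERT INTO customer VALUES ('Chen', '1990-04-12', 'a@example.com')")
# Same last name, different birth date: the pair is still unique.
conn.execute("INSERT INTO customer VALUES ('Chen', '1985-11-30', 'b@example.com')")

try:
    # Same last name AND birth date as the first row: the pair repeats.
    conn.execute("INSERT INTO customer VALUES ('Chen', '1990-04-12', 'c@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Uniqueness is enforced on the combination, so a repeated last name alone is accepted while a repeated pair is not.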

What data types are used in primary keys?

SQL supports many data types, from timestamp (representations of dates and times) to varchar (variable-length character strings). But not every data type is a good choice for primary keys.

It’s often recommended that columns containing numeric values, particularly integers, be used as primary keys since those are typically processed more quickly by relational database management systems.

What are other important types of keys?

Other keys used in database management include:

Unique keys
Candidate keys
Super keys
Foreign keys

Unique keys

Primary keys are a subset of unique keys. Unique keys follow what’s known as a unique constraint—the rule that the values of a key are valid only if they are unique. While all primary keys are unique keys, not all unique keys are primary keys. That’s because, unlike primary keys, unique keys may include null values.
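A small sketch of the difference, using SQLite through Python's sqlite3 module with a hypothetical badge_number column as the unique key. How nulls are treated under a unique constraint varies between database systems; SQLite, assumed here, accepts any number of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_id  TEXT NOT NULL PRIMARY KEY,
        badge_number TEXT UNIQUE  -- unique key: no duplicates, nulls allowed
    )
    """
)
conn.execute("INSERT INTO employee VALUES ('E001', 'B-17')")
conn.execute("INSERT INTO employee VALUES ('E002', NULL)")  # null accepted
conn.execute("INSERT INTO employee VALUES ('E003', NULL)")  # second null accepted in SQLite

try:
    # A repeated non-null value still violates the unique constraint.
    conn.execute("INSERT INTO employee VALUES ('E004', 'B-17')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_badges = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE badge_number IS NULL"
).fetchone()[0]
```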

Candidate keys

Candidate keys are keys that could serve as primary keys because they contain unique values and no null values. The difference between primary keys and candidate keys is that an existing table can have multiple candidate keys, but only one primary key.

Super keys

Super keys are similar to composite primary keys in that they may include multiple columns and be used to identify records. However, super keys might also include more columns or information than is strictly needed to uniquely identify a record, while composite primary keys do not include more columns or data than is necessary.
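The distinction between a super key and a minimal key can be sketched in plain Python. The helper names and sample rows below are made up for illustration, and checking sample data only shows that a column set could be a key for those rows; in practice, keys are declared as part of the schema.

```python
from itertools import combinations


def is_superkey(rows, columns):
    """True if the given columns together identify every row uniquely."""
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)


def is_minimal_key(rows, columns):
    """True if columns form a super key and no proper subset does."""
    if not is_superkey(rows, columns):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(columns))
        for subset in combinations(columns, size)
    )


rows = [
    {"id": 1, "last_name": "Chen", "dept": "D11"},
    {"id": 2, "last_name": "Chen", "dept": "D21"},
    {"id": 3, "last_name": "Okafor", "dept": "D11"},
]

wide = is_superkey(rows, ("id", "last_name"))             # identifies rows
wide_minimal = is_minimal_key(rows, ("id", "last_name"))  # but id alone suffices
narrow_minimal = is_minimal_key(rows, ("id",))
pair_minimal = is_minimal_key(rows, ("last_name", "dept"))
```

Here (id, last_name) is a super key that carries a redundant column, while (id) and (last_name, dept) are minimal and could each be considered for the primary key.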

Foreign keys

A foreign key in one table refers to a specific primary key in another table in order to define a relationship between the tables. For instance, a foreign key in a table with customer order information may refer to a primary key column of unique customer IDs that’s part of another table containing customer data. This way, an order by a given customer may be linked to important information about that customer, such as their email address and date of birth.
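The customer and order relationship described above might look like the following sketch, run against SQLite through Python's sqlite3 module. The table names, column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customer (customer_id),  -- foreign key to the parent
        item        TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'ana@example.com'), (2, 'li@example.com');
    INSERT INTO customer_order VALUES (100, 2, 'keyboard'), (101, 1, 'monitor');
    """
)

# Follow the foreign key from each order to the matching customer row.
orders_with_email = conn.execute(
    """
    SELECT o.order_id, o.item, c.email
    FROM customer_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
    """
).fetchall()
```

Each order row stores only the customer ID; the join uses that value to pull in the customer's email address from the other table.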

The significance of referential integrity constraints in defining table relationships

To understand how primary keys and foreign keys define relationships between tables in relational databases, it helps to consider referential integrity constraints.

Referential integrity constraints, also known as referential constraints or foreign key constraints, are rules requiring that each non-null foreign key value in one table match an existing primary key value in another table.

For example, in an employee database, a referential integrity constraint may specify that each employee in the EMPLOYEE table must be a member of an existing company department listed in the DEPARTMENT table.

In this case, the DEPARTMENT table’s primary key is a column storing unique department numbers, while the EMPLOYEE table’s foreign key is a column storing the same unique department numbers. These matching columns mean the tables adhere to the foreign key constraint.

Since the DEPARTMENT table is home to the primary key referenced by the EMPLOYEE table’s foreign key, the EMPLOYEE table is considered dependent on the DEPARTMENT table. In turn, the DEPARTMENT table is considered the “parent table” in this table relationship.

Referential integrity constraints can block erroneous data insertions into databases. Such constraints, for instance, would prevent a user from adding an employee record to the EMPLOYEE table (as pictured below) if the department number in that employee record did not exist in the DEPARTMENT table.

A diagram showing how primary key and foreign key constraints can prevent the insertion of an invalid record into a table.
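The blocked insertion can be reproduced in a short sketch. It uses SQLite through Python's sqlite3 module rather than Db2, so foreign key enforcement has to be switched on with a pragma, and the column names and sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE department (
        dept_no   TEXT NOT NULL PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_no    TEXT NOT NULL PRIMARY KEY,
        last_name TEXT NOT NULL,
        dept_no   TEXT NOT NULL REFERENCES department (dept_no)
    );
    INSERT INTO department VALUES ('D11', 'Manufacturing');
    """
)


def add_employee(emp_no, last_name, dept_no):
    """Insert an employee; return False if a constraint blocks the insert."""
    try:
        conn.execute(
            "INSERT INTO employee VALUES (?, ?, ?)", (emp_no, last_name, dept_no)
        )
        return True
    except sqlite3.IntegrityError:
        return False


valid = add_employee("000010", "Haas", "D11")   # department exists
invalid = add_employee("000020", "Lee", "Z99")  # no such department
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The second insert is rejected because Z99 does not appear in the parent table, so only one employee row is stored.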

What other concepts and practices relate to primary keys?

Indexes

Relational database management systems may create or require the creation of a unique index to correspond to each primary key in the database. In the context of database design, an index is a set of pointers that refer to rows in a table. Indexes can be used to improve performance, including the optimization of data retrieval.
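As one illustration of an index that backs a primary key, the sketch below asks SQLite (through Python's sqlite3 module) which indexes exist on a freshly created table and how it would run a lookup by key. The behavior shown is specific to SQLite and is assumed here only as an example; other systems name, create and report such indexes differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE emp (
        id       TEXT NOT NULL PRIMARY KEY,
        lastname TEXT NOT NULL
    )
    """
)

# No CREATE INDEX was issued, yet SQLite reports an index on the table.
# Each row is (seq, name, unique, origin, partial).
indexes = conn.execute("PRAGMA index_list('emp')").fetchall()
index_names = [row[1] for row in indexes]
all_unique = all(row[2] == 1 for row in indexes)

# Ask the query planner how it would look up a single employee by key.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT lastname FROM emp WHERE id = ?", ("E001",)
).fetchall()
plan_detail = plan[0][-1]  # human-readable description of the lookup
```

In this setup the planner's description refers to the automatically created unique index, meaning the lookup does not need to scan every row.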

Data modeling

Data modeling provides a visual representation of a database schema, or a blueprint for data relationships among tables in a database. Data models can include information on the primary keys and foreign keys that are used to maintain those relationships.

SQL statements

SQL statements are commands used to interact with relational databases. Under SQL syntax, primary keys for tables may be assigned or added using CREATE TABLE or ALTER TABLE statements.

For example, consider a CREATE TABLE statement for a table of employee names called EMP using IBM’s Db2. The column names are ID (for employee ID), and FIRSTNME and LASTNAME (with 15-character maximums for each). When ID is selected as the primary key, the statement could look like this:

CREATE TABLE EMP
    (ID INT NOT NULL,
    FIRSTNME VARCHAR(15) NOT NULL,
    LASTNAME VARCHAR(15) NOT NULL,
    PRIMARY KEY (ID));

In Db2 ALTER TABLE statements, which modify existing tables, the ADD PRIMARY KEY clause adds a primary key, while a foreign key is added through an ADD CONSTRAINT clause that names the foreign key columns and references the parent table.
