An entity relationship diagram (ER diagram or ERD) is a visual representation of how items in a database relate to each other. ERDs are a specialized type of flowchart that conveys the relationship types between different entities within a system. They use a defined set of symbols, including rectangles, ovals and diamonds, and link them with connecting lines.
Within the relational model of database design, ERDs establish how entries in a database are connected. ERDs are a high-level conceptual data model that sets the stage for more advanced database design and analysis.
Entity relationship modeling can also help distill narratives and insights from a seemingly disparate collection of data points.
Business analysts and database engineers use ER diagrams as data modeling tools to assess the scope of the databases their organizations need, then plan out how the data will be stored.
ERDs inform the software engineering portion of a database project by laying out the requirements for the information systems architecture and database structure. In the three-schema approach to software engineering for database management systems (DBMS), the ERD is the conceptual tier.
Data integration is a complex data engineering process consisting of many moving parts. An ERD can help data engineers conceptualize the overall system and reduce the potential for errors.
Comparing existing databases to an ERD can reveal database design missteps that might be causing problems. Complex databases with numerous tables require extensive SQL knowledge for the debugging process. An ERD summarizes the database so engineers can quickly identify potential errors.
When undertaking business process reengineering projects, it can be helpful to obtain a bird’s-eye view of all the data within an organization’s information systems. ERDs are used to draft newer, more efficient data architecture solutions that facilitate the other stages of the BPR process.
Entity relationship diagrams, database schemas and data flow diagrams all visually represent the way data is arranged in a system.
Entity relationship diagrams illustrate the entities within a database and their relationships to each other. ER diagrams often depict database schemas.
Database schemas establish how real-world entities will be modeled in a relational database. They contain the rules and guidelines that determine the organization of the database, such as table names, fields and data types.
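As a rough illustration of what a schema pins down, the sketch below defines one table in an in-memory SQLite database and reads its column definitions back. The `customer` table, its fields and its data types are hypothetical and chosen only for this example.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table: the name, fields and types are illustrative assumptions.
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        full_name   TEXT NOT NULL,
        email       TEXT NOT NULL UNIQUE
    )
    """
)

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]
print(columns)
# [('customer_id', 'INTEGER'), ('full_name', 'TEXT'), ('email', 'TEXT')]
```

The table name, field names and data types printed here are the kind of rules a schema records; an ERD would show the same table as a single entity with three attributes.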
Data flow diagrams are a type of flowchart that depicts the flow of data through a process or system. They show how data moves between processes and internal and external storage locations.
Entity relationship diagrams include entities, the attributes of those entities and the relationships between them. Some ERDs also convey cardinality, which quantifies the relationship between two entities.
An ERD entity is a definable thing, such as a person, role, event, concept or object, that can have information about it stored in a relational database. Many styles of entity relationship diagrams depict entities as rectangles.
Entities are similar to nouns in a grammatical sense. They are core items in the database, with attributes and relationships conveying information about these entities, just as adjectives and verbs provide more information about the nouns in a sentence.
Entity types are categories of entities. If entities are similar to nouns, then entity types are noun categories, such as foods, sports and countries. The individual entities within an entity type are known as instances. Within the entity type vegetables, the instances might be broccoli, carrot and asparagus.
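One loose way to picture the distinction is a class and its objects: the class plays the role of the entity type and each object is an instance. This is only an analogy sketched in Python; the `Vegetable` class and its `color` attribute are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vegetable:
    """Stands in for the entity type: the category every instance belongs to."""

    name: str
    color: str  # hypothetical attribute, added only for illustration


# Each object stands in for one instance of the entity type.
instances = [
    Vegetable("broccoli", "green"),
    Vegetable("carrot", "orange"),
    Vegetable("asparagus", "green"),
]

for vegetable in instances:
    print(type(vegetable).__name__, "->", vegetable.name)
```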
Entities are classified as either strong or weak. Strong entities contain enough identifying information in their attributes to not need further clarification. Meanwhile, weak entities exist only as an outcome or consequence of another entity. The strong entity associated with a given weak entity is known as its parent or owner entity.
Consider a database modeling customer orders in an e-commerce business. Each order is a strong entity because it can be defined as a unique instance based on the purchaser, time and date. However, the line items within each order are weak entities. They only have meaning within the context of their respective orders. This reliance is known as an existence dependency, which is a kind of participation constraint.
ERDs show strong entities as solid rectangles and weak entities as double rectangles.
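In a relational schema, one common way to express the order and line item example is to give the weak entity a composite key that borrows the owner's key. The sketch below assumes that approach and uses SQLite through Python; the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript(
    """
    -- Strong entity: identifiable on its own.
    CREATE TABLE customer_order (
        order_id   INTEGER PRIMARY KEY,
        purchaser  TEXT NOT NULL,
        ordered_at TEXT NOT NULL
    );
    -- Weak entity: its key includes the owner's key, and it is
    -- deleted together with the owner.
    CREATE TABLE line_item (
        order_id INTEGER NOT NULL
            REFERENCES customer_order(order_id) ON DELETE CASCADE,
        line_no  INTEGER NOT NULL,
        product  TEXT NOT NULL,
        quantity INTEGER NOT NULL,
        PRIMARY KEY (order_id, line_no)
    );
    """
)

conn.execute("INSERT INTO customer_order VALUES (1, 'alice', '2024-01-01T10:00:00')")
conn.executemany(
    "INSERT INTO line_item VALUES (?, ?, ?, ?)",
    [(1, 1, "keyboard", 1), (1, 2, "mouse", 2)],
)

# A line item that points at a nonexistent order is rejected.
try:
    conn.execute("INSERT INTO line_item VALUES (99, 1, 'cable', 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the owner removes its dependent line items.
conn.execute("DELETE FROM customer_order WHERE order_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM line_item").fetchone()[0]
print(orphan_rejected, remaining)  # True 0
```

The composite primary key and the cascading delete together capture the idea that a line item has no identity or existence apart from its order.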
An associative entity links the instances of two entity sets and has its own attributes that provide more information about that relationship. In an ERD used by a university, the entity sets students and professors have many connections to each other. The associative entity bridging the two would show which students are taking courses taught by which professors.
Relational databases use associative entities to inform junction tables, which combine fields from multiple other database tables. In ER diagrams, associative entities are depicted as a diamond within a rectangle.
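The university example can be sketched as a junction table. This is a minimal illustration using SQLite through Python, assuming hypothetical `student`, `professor` and `enrollment` tables, with `course` as the associative entity's own attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student   (student_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE professor (professor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Junction table for the associative entity. It references both
    -- sides and carries its own attribute (course).
    CREATE TABLE enrollment (
        student_id   INTEGER NOT NULL REFERENCES student(student_id),
        professor_id INTEGER NOT NULL REFERENCES professor(professor_id),
        course       TEXT NOT NULL,
        PRIMARY KEY (student_id, professor_id, course)
    );

    INSERT INTO student   VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO professor VALUES (10, 'Dr. Cho'), (11, 'Dr. Diaz');
    INSERT INTO enrollment VALUES
        (1, 10, 'Databases'),
        (1, 11, 'Statistics'),
        (2, 10, 'Databases');
    """
)

# Resolve which students take courses taught by which professors.
rows = conn.execute(
    """
    SELECT s.name, p.name, e.course
    FROM enrollment AS e
    JOIN student   AS s ON s.student_id   = e.student_id
    JOIN professor AS p ON p.professor_id = e.professor_id
    ORDER BY s.name, p.name
    """
).fetchall()

for row in rows:
    print(row)
```

Each row of `enrollment` is one instance of the associative entity, so the many-to-many link between students and professors becomes two one-to-many links into the junction table.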
Attributes are qualities, properties and characteristics that define an entity or entity type. In a classic ERD design, attributes are shown as ovals and are displayed next to the corresponding entity in an ERD.
Entity keys are the attributes that uniquely define each entity in a data set. Any attribute can be designated as a key, provided that it fills this role. For example, in a people entity set, an appropriate key attribute might be a national ID number. Conversely, surnames would not work as a key attribute in this context since more than one person can share the same surname.
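The surname example can be checked mechanically. The sketch below tests whether an attribute's values are unique across a small sample; the helper `is_candidate_key` and the sample records are hypothetical. Uniqueness in a sample can only rule an attribute out as a key; it does not prove the attribute will stay unique as data grows.

```python
def is_candidate_key(records, attribute):
    """Return True if no two records share a value for the attribute."""
    values = [record[attribute] for record in records]
    return len(values) == len(set(values))


# Hypothetical sample of a "people" entity set.
people = [
    {"national_id": "A-001", "surname": "Garcia"},
    {"national_id": "A-002", "surname": "Garcia"},
    {"national_id": "A-003", "surname": "Okafor"},
]

print(is_candidate_key(people, "national_id"))  # True
print(is_candidate_key(people, "surname"))      # False
```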
Relationships are the connections linking the entities in an ERD together. They indicate how entities within an ERD are associated with each other. If entities are nouns and attributes are adjectives, then relationships are verbs.
In a traditional ERD, relationships are depicted as diamonds. Weak relationships bind a weak entity with its owner and are shown as double diamonds.
Entity participation in a relationship might be total or partial. In total participation, every entity in the entity set is involved in the relationship. In partial participation, only some of the entities within the set need to be involved in the relationship at any specific time.
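The difference between the two can be sketched as a set comparison: participation is total when every entity in the set appears in at least one relationship instance. The helper `participation` and the sample data below are hypothetical, and the check describes only the data at hand, not the rule a designer intends.

```python
def participation(entity_ids, relationship_pairs, side):
    """Classify how an entity set participates in a relationship.

    relationship_pairs holds (left, right) tuples; side selects which
    position (0 or 1) the entity set occupies in each pair.
    """
    participating = {pair[side] for pair in relationship_pairs}
    return "total" if set(entity_ids) <= participating else "partial"


# Hypothetical "places" relationship between customers and orders.
customers = ["alice", "bob", "carol"]
orders = [1, 2]
places = [("alice", 1), ("bob", 2)]

print(participation(orders, places, side=1))     # total: every order has a customer
print(participation(customers, places, side=0))  # partial: carol has placed no order
```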
Cardinality is the quality of a relationship that defines the number of instances in one entity that relate to the instances of another.
ERDs represent cardinality through variations in the connecting lines between entities. The way cardinality is shown depends on the style of ERD used.
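To make the idea concrete, the sketch below infers the cardinality that a set of relationship instances exhibits by counting how many partners each instance has on the other side. The function `cardinality` and the sample pairs are hypothetical. Observed data shows only what currently occurs, while an ERD states what the design allows.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify (left, right) relationship instances by observed cardinality."""
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in pairs:
        left_to_right[left].add(right)
        right_to_left[right].add(left)

    # True when some left instance relates to more than one right instance.
    left_has_many = any(len(v) > 1 for v in left_to_right.values())
    # True when some right instance relates to more than one left instance.
    right_has_many = any(len(v) > 1 for v in right_to_left.values())

    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"


# Hypothetical relationship instances.
customer_orders = [("alice", 1), ("alice", 2), ("bob", 3)]
student_courses = [("ana", "db"), ("ana", "stats"), ("ben", "db")]
person_passport = [("ana", "P1"), ("ben", "P2")]

print(cardinality(customer_orders))  # one-to-many
print(cardinality(student_courses))  # many-to-many
print(cardinality(person_passport))  # one-to-one
```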
Most ERDs are drafted in one of three entity-relationship models: conceptual, logical and physical. All three depict entities along with their attributes and relationships, but their use cases and intended audiences differ. Conceptual ERDs are the least detailed, while physical ERDs offer the most granular information.
Since computer scientist and database theorist Peter Chen introduced ERDs in the 1970s, multiple types of diagrams have emerged to fill an increasing range of use cases.
Chen ERDs look similar to classical flowcharts, with various shapes connected by lines. Cardinality is shown with the characters 1, M and N along the connecting lines. M and N both represent “many” in a one-to-many or many-to-many relationship; depicting the latter with M:N or N:M notation implies that the number of entities in the relationship need not be equal on both sides.
The Chen style depicts total participation with a double connecting line and partial participation with a single connecting line.
Named for its three-pronged forked connecting line showing many relationships, crow’s foot notation replaces Chen’s symbols with tables. Each table represents an entity and contains all its attributes. Crow’s foot notation shows relationship cardinality with symbols at the ends of each connecting line: a ring for zero, a dash for one and the crow’s foot for many.
Charles Bachman’s data structure diagrams directly inspired Chen in the creation of the ERD. Bachman used lines with arrows to indicate cardinality in relationships.
The US Air Force introduced its Integration DEFinition for information modeling (IDEF1X) language in the 1980s to support the development of semantic data models. It took Chen’s work a step forward by displaying attributes within a shared table and introducing more options for cardinality.
Created by Richard Barker in 1981, the Barker style is the standard for use in Oracle. Barker notation shares the crow’s foot style for connecting lines while also using dashed lines to represent partial or optional participation.