Turing award recipient, IBM Fellow, and mathematician Ted Codd, known for his invention of the relational model for database management
Fifty years ago this month, IBM researcher and computing pioneer Edgar Frank Codd published the seminal paper “A Relational Model of Data for Large Shared Data Banks,” which became the foundation of Structured Query Language (SQL), a language originally built to manage structured data with relational properties. Today SQL is one of the world’s most popular programming languages, but few know that the paper behind it helped to spark, 50 years ago, an industry that is still focused on services that provide efficient, organized access to data.
The abstract of Dr. Codd’s landmark publication begins, “Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation).” This statement seems more relevant today than ever, as users of these “large data banks” may find their data in local storage, network attached storage, or even in the cloud, in a variety of database technologies built to satisfy the world’s insatiable demand for data. The paper then goes on to describe the need for this “data independence” and the relational model with which the user can interact without needing to know how the data is stored or when its representation changes.
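The idea of data independence can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in relational engine (not a system discussed in the paper), and the `employee` table, its columns, and the index name are hypothetical. The same query text is run before and after the physical access path changes, and the caller sees the same answer both times.

```python
import sqlite3

# The query the "user" writes. It names data, not storage details.
QUERY = "SELECT name FROM employee WHERE dept = ? ORDER BY name"


def run_query(conn, dept):
    """Return the names in one department using the fixed query text."""
    return [row[0] for row in conn.execute(QUERY, (dept,))]


def demo():
    """Run the same query before and after adding an index."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, used only for illustration.
    conn.execute("CREATE TABLE employee (name TEXT, dept TEXT)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?)",
        [("Ada", "research"), ("Grace", "systems"), ("Edgar", "research")],
    )
    before = run_query(conn, "research")

    # Change the internal representation: add an index on dept.
    # The query text above is not touched.
    conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
    after = run_query(conn, "research")

    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(before)
    print(after)
    print(before == after)
```

Adding the index may change how the engine locates rows, but the application code and its results stay the same, which is the separation the abstract argues for.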
After Dr. Codd’s paper on the relational model was published, it quickly took shape in SQL, developed by his fellow IBM researchers Don Chamberlin and Raymond Boyce. IBM and others leveraged this in the late 1970s and early 1980s in relational database products that are still in widespread use across the world today (DB2, etc.). Similar adoption occurred in the open-source community of the early 1990s with relational database projects such as PostgreSQL and MySQL, which continue to be used as primary datastores for business-critical workloads by today’s enterprises.
As the use of object-oriented programming grew in popularity in the 1980s and 1990s, object databases emerged as mechanisms to persist programming objects, and they remain popular datastores for multiple types of data. This was a precursor to an entire genre of datastores whose method of data storage and retrieval didn’t rely on the traditional tabular model of relational databases, but instead on structures such as key-value pairs, wide column stores, graphs, and documents of various encoding types (e.g. JSON). By this point, Dr. Codd’s relational model and SQL had become so pervasive that this new genre of datastores began to be identified in the industry simply as NoSQL databases.
Due to the sheer number of people skilled in Dr. Codd’s relational model and the SQL that followed from it, many data-related offerings that aren’t typically viewed as databases have now adopted an SQL interface to make these new technologies usable by the broadest base of users. One recent example is KSQL, the effort to position SQL as an interface to the popular open-source Apache Kafka distributed streaming platform. Another example is the growing number of SQL interface engines built to give clients access to, and the ability to process data in, popular cloud object storage offerings (IBM’s SQL Query, the open-source Presto project, etc.). Big Data vendors have also looked to spare users from having to understand complex parallel processing and MapReduce concepts by adopting SQL interfaces on top of popular Data Warehousing offerings (IBM’s BigSQL, the Apache Hive project, etc.). SQL has even been adopted by popular data manipulation software like Microsoft Excel to give users additional options for complex processing of spreadsheet data.
From the invention of the disk drive in the 1950s to the DB2 franchise launched in the early 1980s to the leading SQL and NoSQL offerings we provide in the IBM Cloud today, IBM has a rich heritage in driving innovation in the space of data services and bringing best of breed data technologies to our clients. Dr. Codd’s relational data model and the resulting technologies within which it has been leveraged have fundamentally changed how the world has organized and accessed data for the last fifty years. Fiftieth wedding anniversaries are called Golden Anniversaries. Fiftieth birthday celebrations are sometimes referred to as the Golden Jubilee. This month, at IBM, we recognize and celebrate Dr. Codd’s Relational Model’s Golden Anniversary, the impact it has had over the past 50 years, as well as what the next five decades may look like.
Also today, IBM has unveiled updates to the IBM Cloudant database to more closely align the serverless database with the strengths of the IBM public cloud. With greater scalability, control, and security, and an open design, clients will be able to access the cloud-native design they are used to with Cloudant while retaining the mission-critical benefits of a traditional relational database. For more information on the Cloudant database, please see a blog here.
Codd, E. F., A relational model of data for large shared data banks, Communications of the ACM, Vol. 13, No. 6, June 1970. https://doi.org/10.1145/362384.362685