June 25, 2020 By Michael Factor 3 min read

The benefits of using object storage to store relational data.

There are many different classes of data—voice recordings, text, pictures, and tabular, among others. Tabular data, which is also known as rectangular or structured data, is of critical importance to businesses, but also to individuals. This class of data is pervasive. It is the records the bank keeps of our transactions, a history of purchases in a store, the grades a teacher gave to a class, or the rate of transmission of COVID-19 by location. In all of these cases, we can have a table of records that defines a relationship.

For the example of a teacher’s grades, the table may have columns for the student’s name, the grade on the first exam, the grade on the second exam, and the final grade for the class. Each record in the table defines a relation among a set of values, where the value of each column has a particular type and each row is an n-tuple with a value for each column. For instance, the grades may be whole numbers and the name a string. There is no inherent ordering to the rows in the table, but one can query them based upon various properties. Returning to the example of grades, the teacher might want to find all of the students whose final grade was above 90. Structured Query Language (SQL) is the way these tables are typically queried.

The relational model

This model of organizing data that has relations was first introduced by Edgar Codd, from IBM, fifty years ago. Initially, it was realized in integrated databases that were responsible for everything from the management of the storage up through the processing of the queries. Such a top-to-bottom integrated system provides the maximum opportunities for performance optimization but limits the ability to share data between different tools.

Over the past decade or so—first with open source frameworks such as Map Reduce, then with Spark, and now with more traditional databases, such as IBM Db2 Event Store—engines that are supporting the processing of relational tables are delegating the storage and management of the data to an object store, typically supporting the S3 API. These engines use a range of formats to organize the tabular data, such as Parquet, ORC, or CSV, and then support the same SQL queries that were supported by the original integrated databases.

Object storage and relational data

This use of object storage for storing the relational data brings several benefits (although, at the price of providing less opportunities for optimization):

  • It reduces the cost of developing the overall system by separation of concerns; the query engine no longer needs to address storing the data.
  • It maps naturally to cloud deployments, as all clouds have highly scalable cost-efficient object stores.
  • It enables much higher scalability in data size, as object stores are essentially infinitely scalable.
  • If an open format is used, it allows having different engines and tools to process the same data, allowing you to use the right tool for the job.

IBM has a range of such tools that help data professionals wrangle data and fully harness the power of ANSI SQL over this effectively limitless, yet economical, storage medium. On the IBM Cloud, SQL Query provides serverless execution of SQL for rectangular data stored in IBM Cloud Object Storage. SQL Query is a great tool for data exploration, preparation, transformation, and analytics.

For self-managed capacity hybrid or private cloud workloads, Db2 Big SQL provides a Db2-compatible SQL interface to Object Storage APIs and other data stores through the IBM Cloud Pak for Data on Red Hat OpenShift. And Spark, available with Cloud Pak for Data and via IBM Analytics Engine on the IBM Cloud, has built-in integration with object storage.

To get you started, we have provided free data lake architecture patterns for Big Data Modernization, Logging, and IoT use cases.

Learn more about the 50th anniversary of the relational model and other announcements.

Was this article helpful?
YesNo

More from Cloud

Fortressing the digital frontier: A comprehensive look at IBM Cloud network security services

6 min read - The cloud revolution has fundamentally transformed how businesses operate. Its superior scalability, agility and cost-effectiveness have made it the go-to platform for organizations of all sizes. However, this shift to the cloud has introduced a new landscape of ever-evolving security threats. Data breaches and cyberattacks continue to hit organizations, making robust cloud network security an absolute necessity. IBM®, a titan in the tech industry, recognizes this critical need, provides a comprehensive suite of tools and offers unmatched expertise to fortify…

How well do you know your hypervisor and firmware?

6 min read - IBM Cloud® Virtual Private Cloud (VPC) is designed for secured cloud computing, and several features of our platform planning, development and operations help ensure that design. However, because security in the cloud is typically a shared responsibility between the cloud service provider and the customer, it’s essential for you to fully understand the layers of security that your workloads run on here with us. That’s why here, we detail a few key security components of IBM Cloud VPC that aim…

New IBM study: How business leaders can harness the power of gen AI to drive sustainable IT transformation

3 min read - As organizations strive to balance productivity, innovation and environmental responsibility, the need for sustainable IT practices is even more pressing. A new global study from the IBM Institute for Business Value reveals that emerging technologies, particularly generative AI, can play a pivotal role in advancing sustainable IT initiatives. However, successful transformation of IT systems demands a strategic and enterprise-wide approach to sustainability. The power of generative AI in sustainable IT Generative AI is creating new opportunities to transform IT operations…

IBM Newsletters

Get our newsletters and topic updates that deliver the latest thought leadership and insights on emerging trends.
Subscribe now More newsletters