As of last week, IBM Cloud SQL Query supports a wide variety of time series functions as native components of the service.

The library, developed by IBM Research, joins our geospatial functionality in being fully supported by the IBM Cloud SQL Query service and team.

SQL Query is a serverless, pay-per-query offering that can manipulate and analyze semi-structured and structured data in IBM Cloud Object Storage. Its new SQL-native time series support is industry-leading in the breadth of capability available directly inside a SQL engine. It significantly simplifies the definition and execution of time series processing problems by letting you express them in a declarative language (SQL) instead of writing hundreds of lines of custom code. The result is very high productivity and low time to value.

Key features of the time series functions

  • Full suite of SQL-style temporal joins on aperiodic and unaligned time series
  • Multi-typed and multi-time series support for numeric and categorical (string) data
  • Time Reference System (akin to Coordinate Reference Systems for geospatial) for handling timestamps at multiple granularities
  • Rich support for segmentation based on time, number of records, markers, and anchors
  • Built-in SQL constructs for interpolation, similarity, and forecasting
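
To make the idea of a temporal join on unaligned series more concrete, here is a minimal plain-Python sketch. It does not use SQL Query or its time series functions; the function name `asof_left_join` and the sample readings are hypothetical and only illustrate the concept of pairing each observation in one series with the most recent observation in another when their timestamps do not line up.

```python
from bisect import bisect_right


def asof_left_join(left, right):
    """Pair each (time, value) in `left` with the latest `right` value at or before that time.

    Both inputs are lists of (time, value) tuples sorted by time.
    Observations in `left` with no earlier `right` observation get None.
    """
    right_times = [t for t, _ in right]
    joined = []
    for t, left_value in left:
        # Index of the last right-hand observation whose time is <= t.
        idx = bisect_right(right_times, t) - 1
        right_value = right[idx][1] if idx >= 0 else None
        joined.append((t, left_value, right_value))
    return joined


# Hypothetical, unaligned sample series (time in seconds, value).
temperature = [(0, 20.5), (7, 21.0), (15, 21.4)]
humidity = [(3, 40), (10, 42)]

joined = asof_left_join(temperature, humidity)
for row in joined:
    print(row)
```

In the service itself this kind of logic is expressed declaratively in SQL rather than hand-written, which is the productivity gain the post describes.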

Why time series support? 

The growth of data volumes is dominated by machine-generated data, such as IoT sensor feeds, connected cars, or user behavior logs. All of this data is timestamped by nature, and time is a key dimension for deriving valuable business insights.

Capturing and analyzing the state of a system over time allows us to make more informed predictions about future events, observe real-time changes, and capture historical anomalies. Whether it be IoT devices, autonomous driving systems, or network performance, many of the systems and products we use today are constantly emitting time series data that can be used to optimize and improve performance, safety, and robustness.

Coupled with the massive scale-out capabilities of SQL Query and IBM Cloud Object Storage, we can now provide unlimited retention and analytics on this data at petabyte scale, at an extremely low cost and barrier to entry. Does your team know SQL? Great, they can now tap into time series insights on the IBM Cloud.

Adding sophisticated time series functionality to IBM Cloud SQL Query is an essential step toward realizing our vision of building a cloud-native, serverless data lake for our clients. We aim to deliver a platform that makes data simple, allowing you to seamlessly store, manage, life-cycle, and analyze data and develop analytic solutions at scale.

What does it do? 

Our time series functionality includes, but isn’t limited to, the following: 

  • Artifact creation
  • Exploding and flattening
  • Statistical insights 
  • Forecasting
  • Filtering
  • Temporal join and align
  • Interpolation 

Check out the full set of features and get going with sample queries in our UI.

Example of hourly segmentation of time series data
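
As a rough illustration of what hourly segmentation means, the following plain-Python sketch groups timestamped observations into one-hour windows and averages each window. It is a conceptual sketch only, not SQL Query syntax; the function name `segment_by_hour` and the sample readings are made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def segment_by_hour(observations):
    """Group (timestamp, value) pairs into segments keyed by the start of their hour."""
    segments = defaultdict(list)
    for ts, value in observations:
        # Truncate the timestamp to the top of its hour to get the segment key.
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        segments[hour_start].append(value)
    # Return segments ordered by time.
    return dict(sorted(segments.items()))


# Hypothetical sample readings.
readings = [
    (datetime(2019, 7, 1, 9, 5), 10.0),
    (datetime(2019, 7, 1, 9, 40), 14.0),
    (datetime(2019, 7, 1, 10, 15), 12.0),
]

hourly = segment_by_hour(readings)
hourly_avg = {hour: sum(vals) / len(vals) for hour, vals in hourly.items()}
for hour, avg in hourly_avg.items():
    print(hour.isoformat(), avg)
```

Once a series is split into segments like this, per-segment statistics (averages, counts, and so on) can be computed independently for each window.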

Benefits of using SQL Query’s time series support

Time series functions allow SQL Query to filter, cleanse, and analyze trillions of observational events per day at an order of magnitude lower cost than a typical database. For example, storing 1 TB of Parquet data and scanning all of that data 100 times would cost only:

Storage: 1,000 GB × $0.022/GB = $22/month

Queries: (100 TB / 38) × $5/TB scanned ≈ $13/month

Total ≈ $35/month
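
The same estimate can be reproduced with a few lines of Python, which makes it easy to plug in your own volumes. This is only a sketch of the arithmetic in this post: the prices and the 38x reduction are the figures quoted here, the variable names are ours, and actual pricing may differ.

```python
# Figures taken from the example in this post; adjust them for your own workload.
STORAGE_GB = 1_000             # 1 TB stored in Cloud Object Storage
STORAGE_PRICE_PER_GB = 0.022   # USD per GB per month
DATA_TB = 1                    # size of the data set
SCANS_PER_MONTH = 100          # number of full scans
PARQUET_REDUCTION = 38         # scan reduction observed after converting CSV to Parquet
PRICE_PER_TB_SCANNED = 5.0     # USD per TB scanned

storage_cost = STORAGE_GB * STORAGE_PRICE_PER_GB
scan_cost = (DATA_TB * SCANS_PER_MONTH / PARQUET_REDUCTION) * PRICE_PER_TB_SCANNED
total_cost = storage_cost + scan_cost

print(f"Storage: ${storage_cost:.2f}/month")
print(f"Queries: ${scan_cost:.2f}/month")
print(f"Total:   ${total_cost:.2f}/month")
```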

We recommend converting your data to Parquet using IBM Cloud SQL Query to dramatically decrease your total data scanned and improve the speed of your queries. For example, we’ve shown that converting to Parquet allowed us to scan 38x less data than the equivalent CSV required, which means that those scans are 38x less expensive!

Learn more about Data Lakes in the IBM Cloud

For Jupyter notebook users, we also have an in-depth tutorial on using this functionality for data science.

To get in touch with IBM Cloud about a time series use case, you can reach out to Josh Rosenkranz (jmrosenk@us.ibm.com) or me at Joshua.Mintz@ibm.com.
