Introducing IBM Cloud SQL Query

We are excited to announce that SQL Query is now publicly available in the IBM Cloud as a beta service. SQL Query supports using standard ANSI SQL to analyze CSV, Parquet, and JSON files stored in IBM Cloud Object Storage.

Because SQL Query operates in a server-less fashion, you do not have to worry about sizing a server of any kind: just author a SELECT statement and submit it.

SQL Query is tightly integrated in the IBM Cloud. For example:

  • Its user interface is available from within the IBM Cloud console, and can be used to author and experiment with queries interactively.

  • Its REST API is part of the IBM Cloud API, allowing for single sign on (SSO) across IBM Cloud Object Storage and SQL Query API calls.

A single query can reference any number of data sets. These data sets can be stored as CSV, JSON, or Parquet objects in one or more IBM Cloud Object Storage instances. SQL Query automatically infers the schema of the data sets before executing the query. You can use the full power of SQL to correlate, aggregate, transform, and filter data; merge data sets; carry out complex analytic computations; and more. The result of each query is written to an IBM Cloud Object Storage instance of your choice.

SQL Query complements IBM Cloud Object Storage perfectly because both are made for seamless elasticity. You can start with any data volume and grow at any rate to whatever volume you desire. IBM Cloud Object Storage charges only for the volume of data you have stored, and SQL Query charges only for the volume of data that you process.

Here is how easy it is to get started:

  1. Provision an instance of IBM Cloud Object Storage, if you haven’t done so yet.

  2. Provision an instance of IBM Cloud SQL Query. (Here is a short video for service provisioning)

  3. Open the SQL Query console.

  4. Select one of the samples in the top right of your screen and click the “Run” button.

There is no need to configure any server resources or data, because the samples use data that is provided out of the box, and the SQL Query service automatically creates a default target bucket for you. After a few seconds, you see the results of your query at the bottom of your screen. If you like, try running additional samples. Feel free to modify the samples to learn more about what the service can do.

After running a few of the samples, you can continue by querying your own data:

  • If necessary, create a new bucket in your IBM Cloud Object Storage instance to hold your input data.

  • Upload data to your bucket, or use another IBM Cloud service (such as the Streaming Analytics service) to add data to your bucket.

Then, write your own queries for your own data. A good starting point is always to explore the schema of your data and look at a few sample records. For this you can use this query pattern:

SELECT * from cos://<endpoint>/<bucket>/<data set prefix> STORED AS CSV LIMIT 10

Replace STORED AS CSV with STORED AS PARQUET or STORED AS JSON as necessary to reflect the format of your input data. Here is a short video that demonstrates how to use the SQL Query service console.

After you have tested your queries and have identified ones that you would like to use in an application or cloud solution, you can use the IBM Cloud REST API directly. This end-to-end demo video shows an example of how to use the API. If you are a Python developer, you can use the ibmcloudsql client package instead of the REST API. For example, you can use ibmcloudsql with a Jupyter notebook and combine it with powerful visualization libraries.

If you want to learn more about how you can upload your own data, you might find this video about using the AWS CLI with IBM Cloud Object Storage useful.

 

For more information, consult the IBM Cloud SQL Query documentation.

Categories

More from Analytics

Data science vs data analytics: Unpacking the differences

5 min read - Though you may encounter the terms “data science” and “data analytics” being used interchangeably in conversations or online, they refer to two distinctly different concepts. Data science is an area of expertise that combines many disciplines such as mathematics, computer science, software engineering and statistics. It focuses on data collection and management of large-scale structured and unstructured data for various academic and business applications. Meanwhile, data analytics is the act of examining datasets to extract value and find answers to…

Financial planning & budgeting: Navigating the Budgeting Paradox

5 min read - Budgeting, an essential pillar of financial planning for organizations, often presents a unique dilemma known as the “Budgeting Paradox.” Ideally, a budget should give the most accurate and timely idea of anticipated revenues and expenses. However, the traditional budgeting process, in its pursuit of precision and consensus, can take several months. By the time the budget is finalized and approved, it might already be outdated.In today's rapid pace of change and unpredictability, the conventional budgeting process is coming under scrutiny.It's…

How Macmillan Publishers authored success using IBM Cognos Analytics

5 min read - Macmillan Publishers is a global publishing company and one of the “Big Five” English language publishers. If you're a reader, chances are good you've read a book from Macmillan. They published many perennial favorites including Kristin Hannah’s The Nightingale, Bill Martin’s Brown Bear, Brown Bear, what do you see? and some of the more recent bestsellers such as The Silent Patient by Alex Michaelides, Identity by Nora Roberts and Razorblade Tears by S. A. Cosby. It’s no wonder then that Macmillan…

MLOps and the evolution of data science

7 min read - The advancement of computing power over recent decades has led to an explosion of digital data, from traffic cameras monitoring commuter habits to smart refrigerators revealing how and when the average family eats. Both computer scientists and business leaders have taken note of the potential of the data. The information can deepen our understanding of how our world works—and help create better and “smarter” products. Machine learning (ML), a subset of artificial intelligence (AI), is an important piece of data-driven…