How to query your log archives directly from object storage.
Today, we are thrilled to announce the release of a new feature that will change the way you access and analyze your log data. With IBM Cloud Data Engine, you can now query your log archives directly from object storage, without the need to download and store them locally.
How to use SQL for analyzing unstructured logs
Log data is typically generated by various systems and applications, and the schema of the data can change over time as new systems and applications are added and existing ones are updated. This makes it difficult to query log data using SQL because the schema is not fixed and can change from one log entry to the next.
To address these challenges, you can now query log archives as text. Let’s look at an example:
First, the query uses the WITH
clause to define a derived table named logs
that contains the structured log data extracted from the unstructured log archives. The logs
table is defined using the SELECT
clause and a series of get_json_object
functions, which are used to extract the specific fields of interest from the unstructured log data.
This is a powerful approach because it allows the query to work with structured data, which is much easier to query and analyze than unstructured data. By extracting the specific fields of interest and creating a structured table, the query can use standard SQL syntax to filter, sort and aggregate the data as needed.
One of the key benefits of this new capability is the ability to take advantage of parallel query processing for gzip-compressed data. The secret sauce of speed for the parallel processing of gzip is its ability to divide the data being compressed or decompressed into smaller chunks and process those chunks simultaneously — using multiple processors or cores — to achieve orders of magnitude better query execution times.
What insights can you gain from analyzing log archives with SQL?
Here are some examples of how you can use this new capability to gain insights from your historical IBM Cloud log archives:
- Identify and troubleshoot issues with your applications: By querying your log data, you can easily identify any errors or issues that may have occurred in the past and may be impacting the performance of your applications. This can help you troubleshoot problems and improve the reliability and stability of your applications.
- Monitor system performance and usage: By analyzing your log data, you can gain insights into how your systems were being used and how they were performing in the past. This can help you identify trends and patterns and make data-driven decisions to optimize your system performance and improve the user experience.
- Analyze user behavior and preferences: By querying your log data, you can gain insights into how your users were interacting with your applications and what features they were using in the past. This can help you understand their needs and preferences and make data-driven decisions to improve the user experience and drive engagement.
Check out more examples in the documentation.
Get started with Data Engine and log analysis archiving
Overall, this new capability of IBM Cloud Data Engine provides a powerful and flexible way to query and analyze your historical log data, helping you gain valuable insights and make data-driven decisions. We are excited to see how you will use it to unlock the value of your log data and drive success for your business.