2022-06-13

At IBM Instana™, we process and store every call collected by Instana tracers over the last seven days, with no sampling. Instana’s Unbounded Analytics feature lets you filter and group calls by arbitrary tags to gain insight into this unsampled, high-cardinality tracing data. We can provide accurate metrics (such as call count, latency percentiles, or error rate) and display the details of every single call.

For many of our large clients, we store over 1 billion calls every day. For our largest client, that number now reaches 18 billion calls per day and keeps growing. Calls are stored in a single ClickHouse table, with each call tag stored in its own column. Filtering this many calls, aggregating the metrics, and returning the result within a reasonable time has always been a challenge.

Previously, we created materialized views to pre-aggregate calls by some frequently used tags, such as application, service, and endpoint names or HTTP status code. However, we can’t include every tag in the view, especially those with high cardinality, because doing so would significantly increase the number of rows in the materialized view and, therefore, slow down the queries.
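To make the cardinality trade-off concrete, here is a minimal Python sketch rather than our actual ClickHouse setup. The synthetic calls, the tag names (`service`, `endpoint`, `user_id`) and the `pre_aggregate` helper are all hypothetical; the point is only to show how the number of pre-aggregated rows changes once a high-cardinality tag becomes part of the grouping key.

```python
from collections import defaultdict
from itertools import product

# Hypothetical synthetic data: 3 services x 4 endpoints, each called once by 1,000 users.
SERVICES = ["cart", "checkout", "search"]
ENDPOINTS = ["/a", "/b", "/c", "/d"]
calls = [
    {
        "service": service,
        "endpoint": endpoint,
        "user_id": f"user-{user}",      # high-cardinality tag
        "latency_ms": 10 + user % 50,
        "error": user % 100 == 0,
    }
    for service, endpoint, user in product(SERVICES, ENDPOINTS, range(1000))
]


def pre_aggregate(calls, tags):
    """Group calls by the given tags, keeping one summary row per distinct tag combination."""
    view = defaultdict(lambda: {"count": 0, "errors": 0, "latency_sum_ms": 0})
    for call in calls:
        key = tuple(call[tag] for tag in tags)
        row = view[key]
        row["count"] += 1
        row["errors"] += int(call["error"])
        row["latency_sum_ms"] += call["latency_ms"]
    return dict(view)


# Low-cardinality tags only: the view collapses to a handful of rows.
low = pre_aggregate(calls, ["service", "endpoint"])
# Adding a high-cardinality tag: the view has as many rows as the raw data.
high = pre_aggregate(calls, ["service", "endpoint", "user_id"])

print(len(calls), len(low), len(high))  # 12000 12 12000
```

In this toy data set, grouping by service and endpoint shrinks 12,000 calls to 12 rows, while adding `user_id` to the key leaves 12,000 rows, so the pre-aggregation no longer saves anything.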

Filtering on high-cardinality tags not included in the materialized view still requires a full scan of the calls table within the selected time frame, which could take over a minute.
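The following Python sketch illustrates why such a filter falls back to the raw data. It is an illustration under assumptions, not our query engine: the tuple layout, the `view` counter and both helper functions are hypothetical. A view keyed only by service answers a per-service question with a single lookup, but a filter on `user_id` has to touch every call in the selected time frame.

```python
from collections import Counter

# Hypothetical raw calls: (timestamp_s, service, user_id).
calls = [
    (ts, "checkout" if ts % 2 else "search", f"user-{ts % 500}")
    for ts in range(10_000)
]

# Pre-aggregated view keyed by service only; user_id is not part of the key.
view = Counter(service for _, service, _ in calls)


def count_by_service(service):
    """Answered from the view with a single lookup."""
    return view[service]


def count_by_user(user_id, start_s, end_s):
    """user_id is absent from the view, so every raw call in the time frame is scanned."""
    matched = 0
    scanned = 0
    for ts, _, uid in calls:
        if start_s <= ts < end_s:
            scanned += 1
            matched += uid == user_id
    return matched, scanned


print(count_by_service("checkout"))         # 5000
print(count_by_user("user-42", 0, 10_000))  # (20, 10000)
```

Only 20 of the 10,000 calls match the user filter, yet all 10,000 are read to find them. At billions of calls per day, that scan cost is what dominates the query time.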