Widget optimization for data visualization

Understanding how widget configuration affects performance and data access methods is crucial for efficient dashboard loading, especially when working with large data volumes.

The same widget result can often be obtained in more than one way. A widget's configuration determines how it accesses data, and some configurations perform considerably better than others. This matters most for processes with large data volumes, where the choice directly affects the loading time of your dashboards.

Example: Optimizing an Elastic widget for case counting by year

The following example shows two equivalent configurations of an Elastic widget that returns the number of cases grouped by the year of each case's last activity. Both produce the same result, but the performance difference is significant on large datasets.

Option 1: Using the eventlog (less efficient)

This configuration requires the following settings:

Select eventlog as the data source in the from field.
Select YEAR(End-Time) as the dimension.
Set the measure to Cases with Count(Distinct(CASEID)).
Turn on the Keep last event for each case option.

On a process with 100 million events, this configuration took 86 seconds to compute.

Option 2: Using case_stats (more efficient)

This configuration requires the following settings:

Select case_stats as the data source in the from field.
Select YEAR(End-Time) as the dimension.
Set the measure to Cases with Count(Distinct(CASEID)).

On a process with 100 million events, this configuration took less than 2 seconds to compute, which is more than 40 times faster than Option 1.

The eventlog table contains every process event, so a widget that reads from it must process all of that data. In contrast, the case_stats table holds aggregate case-level information that was precomputed during the mining phase. A widget that reads from case_stats does not need to recompute the aggregation at query time, so it obtains the same data far more efficiently.
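The difference between the two options can be illustrated with a small Python sketch. This is a toy model, not the product's query engine: the field names (case_id, end_time), the sample data, and all function names are hypothetical, and build_case_stats only stands in for the precomputation done during the mining phase. It shows that both paths return the same counts, while the eventlog path has to repeat the per-case aggregation on every query.

```python
from collections import Counter
from datetime import datetime

# Toy event log: one row per event (hypothetical field names and data).
EVENTS = [
    {"case_id": "A", "end_time": datetime(2021, 3, 1)},
    {"case_id": "A", "end_time": datetime(2022, 1, 15)},
    {"case_id": "B", "end_time": datetime(2021, 6, 10)},
    {"case_id": "C", "end_time": datetime(2022, 2, 2)},
    {"case_id": "C", "end_time": datetime(2022, 11, 30)},
]


def build_case_stats(events):
    """Reduce the event log to one row per case, keeping the last end time.

    Stands in for the aggregation that is precomputed once at mining time.
    """
    last_end = {}
    for event in events:
        case_id = event["case_id"]
        if case_id not in last_end or event["end_time"] > last_end[case_id]:
            last_end[case_id] = event["end_time"]
    return [{"case_id": c, "end_time": t} for c, t in last_end.items()]


def cases_by_year_from_eventlog(events):
    """Option 1: scan every event and redo the per-case reduction per query."""
    per_case = build_case_stats(events)  # cost grows with the number of events
    return Counter(row["end_time"].year for row in per_case)


def cases_by_year_from_case_stats(case_stats):
    """Option 2: group the precomputed case-level rows directly."""
    return Counter(row["end_time"].year for row in case_stats)


# Computed once, ahead of time, and reused by every query.
CASE_STATS = build_case_stats(EVENTS)

if __name__ == "__main__":
    print(cases_by_year_from_eventlog(EVENTS))
    print(cases_by_year_from_case_stats(CASE_STATS))
```

In this sketch the second function touches one row per case, while the first touches one row per event, which is why the gap widens as the number of events per case grows.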

Best practices for widget optimization

Follow these guidelines to make your widgets perform better:

Choose the appropriate data source. Based on the desired output, select the from option in this order of decreasing priority:
1. case_stats (highest priority - most efficient)
2. eventlog (medium priority)
3. eventlog + case_stats (lowest priority - least efficient)
Avoid combined sources when possible. Selecting eventlog + case_stats in the from field is the most computationally expensive option. Use it only after confirming that no alternative configuration achieves the same result.
Leverage precomputed data. Whenever possible, use case_stats to take advantage of aggregations that were computed during the mining phase rather than recalculating them at query time.
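The priority order above can be sketched as a small decision helper. This is only an illustration under an assumption not stated in this document: that a widget's requirements can be classified by whether it needs case-level fields, event-level fields, or both. The function name and parameters are hypothetical and are not part of any product API.

```python
CASE_STATS = "case_stats"
EVENTLOG = "eventlog"
COMBINED = "eventlog + case_stats"


def choose_source(needs_case_fields: bool, needs_event_fields: bool) -> str:
    """Return the cheapest 'from' option that covers the widget's needs.

    Follows the documented priority: case_stats first, then eventlog,
    and the combined source only when both kinds of field are required.
    """
    if needs_case_fields and needs_event_fields:
        return COMBINED  # most expensive: use only when unavoidable
    if needs_event_fields:
        return EVENTLOG
    return CASE_STATS  # precomputed at mining time, cheapest to query


if __name__ == "__main__":
    # Counting cases by year of last activity needs only case-level data.
    print(choose_source(needs_case_fields=True, needs_event_fields=False))
```

The helper returns the combined source only as a last resort, which mirrors the guidance to check for alternatives before selecting eventlog + case_stats.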