Planning to modify your analytics data

Review how you can optionally customize the analytics data in your IBM® API Connect Analytics deployment before it is stored.

You can create filters to customize the analytics data before it is written to internal or external storage. A filter can add new fields, modify existing fields, or remove fields from the data. Filters are global by default: they apply to the entire Analytics subsystem, so every record that flows through the pipeline is changed. To limit a change to a specific API, wrap it in a conditional.
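As a rough illustration of the difference between a global and a conditional change, the following Logstash filter sketch removes one field from every record and another field only from records for a single API. The field names (query_string, request_body, api_name) and the API name payments-api are assumptions for illustration; check them against API event record fields and your own log policies. The enclosing filter { } block is omitted on the assumption that the snippet is embedded in a larger filter definition.

```
# Global: no conditional, so this applies to every record in the pipeline.
mutate {
  remove_field => ["query_string"]
}

# Conditional: applies only to records for one API.
# "payments-api" is a hypothetical API name.
if [api_name] == "payments-api" {
  mutate {
    remove_field => ["request_body"]
  }
}
```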

For information on adding filters to the Analytics CR to modify your data, see Modifying your analytics data.

What fields can you modify?

The fields that are available for you to interact with are described in API event record fields. Each API defined in the Management subsystem and published to your Gateway has its own success and error log policies, and the log policy settings determine what data is available for customization. As a result, a field that is present for one API might be absent for another. Verify that a field exists before you try to modify it; otherwise your pipeline might fail.

Modifying fields is a complex operation and can cause problems with your data. If you need to sanitize data, remove the field entirely instead of modifying it. If you want to change the format of an existing field, leave the original field unchanged and create a new field with a unique name.
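The following Logstash filter sketch shows one way to follow this guidance: it checks that the source field exists, then writes the reformatted value to a new field and leaves the original untouched. The field name request_method and the new field name request_method_lower are assumptions for illustration. Two separate mutate blocks are used because add_field runs only after the other operations in the same mutate block have completed, so the lowercase step must come in a later block.

```
# Proceed only when the source field is present.
# Note: this test is also false when the field is null or false.
if [request_method] {
  # Copy the value into a new, uniquely named field (hypothetical name).
  mutate {
    add_field => { "request_method_lower" => "%{request_method}" }
  }
  # Reformat the copy; the original request_method is not changed.
  mutate {
    lowercase => ["request_method_lower"]
  }
}
```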

Important: Do not change the data type of existing fields, and do not modify fields that are critical to the Analytics subsystem's operations and access control. Doing so can cause problems with retrieving or storing data. The following fields are restricted; this list might not be exhaustive:
  • org_id
  • catalog_id
  • space_id
  • developer_org_id
  • datetime
  • @timestamp

The service for the data pipeline is based on Elastic Logstash, which provides a set of predefined filter plugins that you can use for modifying data. However, the plugins are third-party software that IBM does not control, so IBM cannot guarantee support for them.

Which pipeline is affected by the filter?

Whether the message queue and offloading are enabled determines which pipeline's data is affected by each filter. Table 1 shows the result for each combination of the two settings.

Table 1. Settings that determine which pipeline is affected by filters
  Message queue   Offloading   Offload.filter               Ingestion.filter
  Enabled         Enabled      Applies to offloaded data    Applies to internal data
  Enabled         Disabled     Ignored                      Applies to internal data
  Disabled        Enabled      Applies to all data          Applies to all data
  Disabled        Disabled     Ignored                      Applies to all data