Observe your data with Data Observability
With IBM Data Observability, you can observe your data and set up notifications for any problems that occur. You can investigate issues in data quality, integrity, and access.
Use Data Observability to:
- Make sure that your job runs behave as expected.
- Detect and resolve issues with your data before they lead to Service Level Agreement (SLA) misses.
- Identify problems in data quality.
- Quickly assess your observability coverage, including the number of detected issues, triggered alerts, open alerts, and their trends over the selected time frame.
With Data Observability, you can create alerts that inform your team of incidents immediately and provide the necessary tools to identify root causes.
On the Data Observability Dashboard, you can view the triggered alerts. Go to the Triggered alerts tab to view triggered alerts and affected jobs across all your projects. You can start investigating directly from the alert and update its status.
In addition, you can create and assign alert receivers to your alerts. Alert receivers are notification endpoints, such as email or Slack, to which alert payloads are sent when an alert is triggered. Alert receivers are paired with alert definitions to make sure that each triggered alert goes directly to the individuals or teams who need to see it, without engaging others in your organization.
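The pairing between alert definitions and alert receivers can be pictured with a small Python sketch. This is only a conceptual model, not the Data Observability API: the names `AlertReceiver`, `AlertDefinition`, and `dispatch`, and the payload fields, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical payload shape; the real alert payload format is not described here.
Payload = Dict[str, str]


@dataclass
class AlertReceiver:
    """Hypothetical notification endpoint, such as email or Slack."""
    name: str
    channel: str                      # for example "email" or "slack"
    send: Callable[[Payload], None]   # delivery function supplied by the caller


@dataclass
class AlertDefinition:
    """Hypothetical alert definition with its assigned receivers."""
    name: str
    receivers: List[AlertReceiver] = field(default_factory=list)


def dispatch(definition: AlertDefinition, payload: Payload) -> List[str]:
    """Send the payload only to receivers assigned to this definition."""
    notified = []
    for receiver in definition.receivers:
        receiver.send(payload)
        notified.append(receiver.name)
    return notified


# Usage: an in-memory list stands in for a real email or Slack integration.
outbox: List[Payload] = []
email = AlertReceiver("data-eng-email", "email", outbox.append)
definition = AlertDefinition("job-run-duration", [email])
dispatch(definition, {"job": "nightly_load", "alert": "Job run duration"})
```

Because receivers are attached to individual definitions in this model, a triggered alert reaches only the people assigned to it, which mirrors the routing behavior described above.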
Requirements
The following requirements exist for Data Observability:
- Required service
  IBM watsonx.data integration
- Supported tools
  You can use Data Observability to observe DataStage jobs.
- Data size
  Data Observability works with data of any size.
- Required permissions and roles
  Your role and permissions determine which data observability tasks you can perform. You need a different set of roles and permissions for working with alerts and working with alert receivers.
  For more information about permissions and roles necessary to work with alerts, see Data Observability roles and permissions.
  For more information about permissions and roles necessary to work with alert receivers, see Required permissions for alert receivers.
- Workspaces
  You can create alerts for DataStage jobs in projects.
Sample flow: Observing your data with alerts
The following graphic shows the relationship between alerts, alert receivers, and jobs. In this example, the alert is a Job run duration alert with a static value condition.
Fig. 1 Observing your data - a sample flow.
A data observability flow might have the following tasks:
- Create an alert definition.
  In the context of a project, a data engineer creates an alert definition within Data Observability for the DataStage job. They decide to create a Job run duration alert definition, which is used to track and report any anomalies in the job run duration based on the set condition. In this example, an alert is triggered when the job run duration differs from the defined value.
- Create and assign an alert receiver.
  A data engineer creates and assigns an email alert receiver to the created alert definition to get an email notification with the details of the triggered alert.
- Run a DataStage job in a project.
  To create an alert, a data engineer needs to first run a DataStage job in a project.
- The job fails.
- The system detects the time anomaly and triggers an alert.
  The system detects a difference between the job's actual run time and the expected time. As a result, an alert is triggered.
- The alert is sent to the assigned alert receiver.
  Because the data engineer assigned an email alert receiver to the created alert definition, they receive an email with the triggered alert and its details.
- Acknowledge the triggered alert.
  A data engineer opens the Triggered alerts tab to display triggered alerts and their logs. The data engineer can also go to the job run details section and, from there, directly to the DataStage project where the issue was found. To make sure that other users know that the issue is being worked on, a data engineer can mark the alert as Acknowledged.
- Go to the data source and fix the issue that caused the alert.
  From this tab, a data engineer can go directly to the DataStage canvas with the job where the issue was found.
- Mark the alert Resolved.
  When the problem is solved, a data engineer can mark the alert as Resolved to let others know that the issue is fixed.
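The sample flow can be sketched in Python to show how a static value condition and the alert statuses fit together. This is a conceptual model only, not how Data Observability is implemented: the class and function names, the `tolerance_seconds` parameter, and the use of seconds as the unit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AlertStatus(Enum):
    OPEN = "Open"
    ACKNOWLEDGED = "Acknowledged"
    RESOLVED = "Resolved"


@dataclass
class DurationAlertDefinition:
    """Hypothetical Job run duration alert with a static value condition."""
    job_name: str
    expected_seconds: float
    tolerance_seconds: float = 0.0  # assumption: allowed deviation before triggering

    def is_violated(self, actual_seconds: float) -> bool:
        # Triggers when the run duration differs from the defined value
        # by more than the tolerance.
        return abs(actual_seconds - self.expected_seconds) > self.tolerance_seconds


@dataclass
class TriggeredAlert:
    job_name: str
    actual_seconds: float
    status: AlertStatus = AlertStatus.OPEN

    def acknowledge(self) -> None:
        # Signals to other users that someone is working on the issue.
        self.status = AlertStatus.ACKNOWLEDGED

    def resolve(self) -> None:
        # Signals that the underlying issue is fixed.
        self.status = AlertStatus.RESOLVED


def evaluate_run(definition: DurationAlertDefinition,
                 actual_seconds: float) -> Optional[TriggeredAlert]:
    """Return a triggered alert if the run violates the condition, else None."""
    if definition.is_violated(actual_seconds):
        return TriggeredAlert(definition.job_name, actual_seconds)
    return None


# Usage: a run that takes 900 seconds against a defined value of 600 seconds.
definition = DurationAlertDefinition("nightly_load", expected_seconds=600.0)
alert = evaluate_run(definition, actual_seconds=900.0)
if alert is not None:
    alert.acknowledge()  # the issue is being worked on
    alert.resolve()      # the issue is fixed
```

In this sketch a run that matches the defined value produces no alert, while any other run produces an alert that starts as Open and moves through Acknowledged to Resolved.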