Transforming data (DataStage)

Use the DataStage® service to transform data into enriched, tailored information for your enterprise. DataStage is available in two editions: DataStage Enterprise and DataStage Enterprise Plus.

Service: This service is not available by default. An administrator must install this service on the IBM Cloud Pak® for Data platform, and you must be given access to the service. To determine whether the service is installed, open the Services catalog and check whether the service is enabled.

Figure: Palette and canvas in DataStage

With DataStage, you can create, edit, load, and run transformation jobs. Built-in search, automatic metadata propagation, and simultaneous highlighting of all compilation errors help developers work more productively.

  • Search: Find what you need fast by using the flexible Search feature.
  • Automatic metadata propagation: DataStage automatically propagates metadata from a stage to the stages that follow it in the job, which increases productivity.
  • Highlighting of all compilation errors: DataStage highlights all errors and gives you a way to see problems with a quick hover over each stage, so you can fix multiple problems at the same time before you recompile.
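As a rough illustration of what metadata propagation means, the following Python sketch carries a column schema forward through a chain of stages, so that each stage's output metadata is derived from the stage before it. The `Stage` class, the `propagate` function, and the column names are hypothetical and are not part of any DataStage API. The sketch only models the idea and does not show how DataStage implements it.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """Hypothetical stand-in for a job stage (not a DataStage API)."""
    name: str
    # Columns this stage adds to the schema it receives (name -> type).
    adds: dict[str, str] = field(default_factory=dict)
    # Columns this stage removes from the schema it receives.
    drops: set[str] = field(default_factory=set)


def propagate(source_schema: dict[str, str], stages: list[Stage]) -> dict[str, dict[str, str]]:
    """Derive each stage's output schema from the schema of the stage before it."""
    schemas: dict[str, dict[str, str]] = {}
    current = dict(source_schema)
    for stage in stages:
        # Start from the upstream schema, remove dropped columns, add new ones.
        current = {col: typ for col, typ in current.items() if col not in stage.drops}
        current.update(stage.adds)
        schemas[stage.name] = dict(current)
    return schemas


# Assumed example data: a source schema and a two-stage job.
source = {"customer_id": "int", "name": "string", "raw_phone": "string"}
job = [
    Stage("standardize", adds={"phone": "string"}, drops={"raw_phone"}),
    Stage("enrich", adds={"region": "string"}),
]
schemas = propagate(source, job)
```

In this sketch, the `enrich` stage never declares `customer_id`, `name`, or `phone`. It inherits them from `standardize`, which is the kind of manual redefinition that propagation is meant to remove.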

DataStage features the following tabs, which you use for quick access to essential actions:

  • Projects
  • Connections
  • Table definitions
  • Jobs
  • Parameter sets

DataStage Enterprise Plus adds features for data quality, including:

  • Cleansing data by identifying potential anomalies and metadata discrepancies.
  • Identifying duplicates by using data matching and probabilistic matching of data entities between two data sets.

Note: You must use DataStage Enterprise Plus to access the additional data quality functions.
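To give a sense of what probabilistic matching between two data sets involves, the following Python sketch scores record pairs with log-likelihood agreement weights, in the style of the Fellegi-Sunter model. This is a generic textbook approach and is not a description of how DataStage Enterprise Plus performs matching. The field names, the probabilities in `FIELD_PROBS`, the threshold, and the sample records are all assumed values chosen for illustration.

```python
import math

# Assumed per-field probabilities (illustrative values only):
#   m = P(field values agree | the two records are a true match)
#   u = P(field values agree | the two records are not a match)
FIELD_PROBS = {
    "name": (0.95, 0.01),
    "zip": (0.90, 0.05),
    "birth_year": (0.98, 0.02),
}


def match_weight(a: dict, b: dict) -> float:
    """Sum the log2 likelihood ratios over all fields for one record pair."""
    total = 0.0
    for field_name, (m, u) in FIELD_PROBS.items():
        if a.get(field_name) == b.get(field_name):
            total += math.log2(m / u)              # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def find_matches(left: list, right: list, threshold: float) -> list:
    """Return (left_index, right_index, weight) for pairs scoring at or above threshold."""
    matches = []
    for i, a in enumerate(left):
        for j, b in enumerate(right):
            weight = match_weight(a, b)
            if weight >= threshold:
                matches.append((i, j, weight))
    return matches


# Assumed sample records from two data sets.
left = [{"name": "ann lee", "zip": "10001", "birth_year": 1980}]
right = [
    {"name": "ann lee", "zip": "10002", "birth_year": 1980},  # zip differs
    {"name": "bob ray", "zip": "94105", "birth_year": 1975},  # all fields differ
]
matches = find_matches(left, right, threshold=5.0)
```

Unlike exact matching, the first pair can still be flagged as a likely duplicate even though the zip codes differ, because agreement on the other fields outweighs the single disagreement. The sketch compares every pair, which is only practical for small inputs.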