Transforming data with DataStage

DataStage® is a data integration tool that moves and transforms data between operational, transactional, and analytical target systems. Data integration specialists use DataStage to develop flows that process and transform data. Hundreds of prebuilt transformation functions, parallel processing capabilities, and platform connectivity are available, so flows can connect directly to enterprise applications, cloud data sources, relational and NoSQL systems, REST endpoints, and more. You can administer, manage, deploy, and reuse these flows to integrate data across many systems throughout your organization.

Required service
IBM watsonx.data integration
Data format
Tabular: CSV, JSON, Parquet, TSV (read only), or delimited text files
Data size
DataStage works with data of any size.
Connectors
Example connectors include: Db2®, Netezza® Performance Server, Microsoft SQL Server, Oracle, Teradata, Snowflake, Microsoft Azure File Storage, Amazon S3, and other Amazon Web Services and Google Cloud Platform services.

See DataStage connectors for the list of connectors that DataStage supports.

Stages
This service provides stages, each of which describes a particular process, such as accessing a database or transforming data in some way. DataStage stages provide common functions for moving and transforming data. QualityStage stages are used for tasks that include, but are not limited to, eliminating redundant, obsolete, or inaccurate data, standardizing data, and verifying address data.

See DataStage stages and Quality stages in DataStage for information on the stages that DataStage supports.
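
To make the idea of a stage concrete, the following sketch models a flow as an ordered chain of stages, where each stage receives rows and passes transformed rows on. It is a conceptual illustration in plain Python, not the DataStage API: the names `standardize_names`, `remove_duplicates`, and `run_flow`, and the sample rows, are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# A row is a simple mapping of column name to value (an assumption for this sketch).
Row = Dict[str, str]
# A stage takes a stream of rows and yields a stream of rows.
Stage = Callable[[Iterable[Row]], Iterable[Row]]


def standardize_names(rows: Iterable[Row]) -> Iterable[Row]:
    """Hypothetical standardization stage: collapse whitespace and title-case names."""
    for row in rows:
        yield {**row, "name": " ".join(row["name"].split()).title()}


def remove_duplicates(rows: Iterable[Row]) -> Iterable[Row]:
    """Hypothetical deduplication stage: keep the first row per (name, city) pair."""
    seen = set()
    for row in rows:
        key = (row["name"], row["city"])
        if key not in seen:
            seen.add(key)
            yield row


def run_flow(source: Iterable[Row], stages: List[Stage]) -> List[Row]:
    """Pass the source rows through each stage in order and collect the output."""
    rows = source
    for stage in stages:
        rows = stage(rows)
    return list(rows)


source_rows = [
    {"name": "  ada   lovelace", "city": "London"},
    {"name": "Ada Lovelace", "city": "London"},
    {"name": "alan turing", "city": "Wilmslow"},
]

result = run_flow(source_rows, [standardize_names, remove_duplicates])
```

Order matters in this sketch: standardizing before deduplicating lets the two differently formatted spellings of the same name be recognized as one record.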

Learn more

For more information about using DataStage, see the following topics: