Transforming data with DataStage
DataStage® is a data integration tool that moves and transforms data between operational, transactional, and analytical target systems. Data integration specialists use DataStage to develop flows that process and transform data. DataStage provides hundreds of prebuilt transformation functions, parallel processing capabilities, and connectivity to enterprise applications, cloud data sources, relational and NoSQL systems, REST endpoints, and more. You can administer, manage, deploy, and reuse these flows to integrate data across many systems throughout your organization.
- Required service
- IBM watsonx.data integration
- Data format
- Tabular: CSV, JSON, Parquet, TSV (read only), or delimited text files
- Data size
- DataStage works with data of any size.
- Connectors
- Example connectors include Db2®, Netezza® Performance Server, Microsoft SQL Server, Oracle, Teradata, Snowflake, Microsoft Azure File Storage, Amazon S3, and other Amazon Web Services and Google Cloud Platform services.
See DataStage connectors for the list of connectors that DataStage supports.
- Stages
- This service provides stages, each of which describes a particular process, such as accessing a database or transforming data. DataStage stages provide common functions for moving and transforming data. QualityStage stages support tasks that include, but are not limited to, eliminating redundant, obsolete, or inaccurate data, standardizing data, and verifying address data.
See DataStage stages and Quality stages in DataStage for information on the stages that DataStage supports.
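The idea of a flow built from stages can be illustrated with a minimal conceptual sketch in plain Python. It does not use DataStage or any DataStage API. The sample data and the function names (`read_stage`, `standardize_stage`, `dedupe_stage`, `run_flow`) are hypothetical and exist only to show how a source stage, a standardization stage, and a deduplication stage might be chained.

```python
import csv
import io

# Hypothetical sample input; not taken from any DataStage project.
RAW_CSV = """id,name,city
1, Alice Smith ,new york
2,Bob Jones,Boston
3,alice smith,New York
"""


def read_stage(text):
    """Source stage: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def standardize_stage(rows):
    """Standardize stage: trim whitespace and normalize capitalization."""
    return [
        {
            **row,
            "name": row["name"].strip().title(),
            "city": row["city"].strip().title(),
        }
        for row in rows
    ]


def dedupe_stage(rows):
    """Deduplicate stage: keep the first row for each (name, city) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["name"], row["city"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def run_flow(text):
    """Chain the stages in order, the way a flow connects its stages."""
    return dedupe_stage(standardize_stage(read_stage(text)))


result = run_flow(RAW_CSV)
for row in result:
    print(row["id"], row["name"], row["city"])
```

In this sketch, each function stands in for one stage, and `run_flow` stands in for the flow that links them. In DataStage itself, the equivalent work is done by the stages that the product provides rather than by hand-written functions.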
Learn more
For more information about using DataStage, see the following topics: