DataStage Anywhere

DataStage® Anywhere is an offering that runs DataStage jobs on a remote engine deployed in a location of your choice. You can run jobs on premises or in any cloud or data center, including Amazon Web Services, Google Cloud, or Azure.

DataStage-aaS Anywhere separates DataStage into an IBM-hosted control plane, where you manage projects, assets, and flows, and a data plane, where jobs are executed on a parallel engine. With a remote runtime engine, you can deploy a data plane in your own environment, whether on premises or in any cloud or data center. You continue to use a control plane hosted and managed by IBM Cloud, but you can deploy remote engines to execute jobs wherever your data resides, which can reduce costs and improve performance.
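The control plane and data plane split described above can be pictured with a small conceptual model. This is only an illustrative sketch, not IBM's API: the class names `ControlPlane` and `RemoteEngine`, the `register` and `submit` methods, and the location strings are all hypothetical. It shows the one idea from the text: the control plane keeps track of engines and dispatches work, while the job itself runs on the engine deployed where the data is.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteEngine:
    """Hypothetical stand-in for a data-plane engine deployed near the data."""
    name: str
    location: str
    executed: list = field(default_factory=list)

    def run(self, job_name: str) -> str:
        # The job executes here, in the data plane. Only a status
        # string is returned to the caller (the control plane).
        self.executed.append(job_name)
        return f"{job_name} finished on {self.name}"


@dataclass
class ControlPlane:
    """Hypothetical stand-in for the hosted control plane."""
    engines: dict = field(default_factory=dict)  # location -> RemoteEngine

    def register(self, engine: RemoteEngine) -> None:
        self.engines[engine.location] = engine

    def submit(self, job_name: str, data_location: str) -> str:
        # Dispatch to the engine deployed where the data resides.
        engine = self.engines.get(data_location)
        if engine is None:
            raise LookupError(f"no remote engine registered for {data_location!r}")
        return engine.run(job_name)
```

For example, after `cp = ControlPlane()` and `cp.register(RemoteEngine("engine-a", "on-prem"))`, calling `cp.submit("load_sales", "on-prem")` runs the job on `engine-a`, and submitting to a location with no registered engine raises `LookupError`.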

DataStage-aaS Anywhere supports public, private, and hybrid cloud architectures with remote engines deployed in your environment. By deploying your own data plane, you retain full control of your data and can execute all processing behind your own security measures without exposing critical metadata to IBM Cloud. You can use stages and connectors that are not available in DataStage-aaS, including user-defined stages and function libraries. For a comparison of supported features, see Feature differences between Cloud Pak for Data deployments. DataStage-aaS Anywhere also supports automatic load balancing, elastic scaling, and flexible core-based pricing.

A remote engine can be deployed as a container on any Docker or Kubernetes-based environment or another container management platform. Startup scripts are provided for Docker and Kubernetes.
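Before running a provided startup script, it can help to confirm that the target host actually has container tooling installed. The following is a minimal sketch of such a check, assuming the standard `docker` and `kubectl` command names; the function `available_container_tools` is a hypothetical helper and is not part of the provided scripts.

```python
import shutil


def available_container_tools(candidates=("docker", "kubectl")):
    """Return the candidate CLI names that are found on PATH."""
    return [tool for tool in candidates if shutil.which(tool) is not None]


if __name__ == "__main__":
    found = available_container_tools()
    if found:
        print("Container tooling found:", ", ".join(found))
    else:
        print("Neither docker nor kubectl was found on PATH.")
```

The check only looks for the executables on `PATH`; it does not verify that a Docker daemon is running or that a Kubernetes cluster is reachable.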

Features available in DataStage as a Service Anywhere

  • Java integration stage
  • Java library component
  • Generic JDBC connection
  • Excel
  • AVI
  • External source stage
  • External target stage
  • Wrapped stage
  • Build stage
  • Custom stage
  • Apache HBase connection
  • User-defined functions
  • User-created APT_CONFIG_FILEs
  • Db2 database sequence in Slowly Changing Dimension stage, Surrogate Key Generator stage, and Transformer stage
  • Apache Hive connection as a target (available when Use DataStage properties is selected in the connector)
  • Operational Decision Manager stage