Frequently asked questions for DataStage Anywhere
Find answers to frequently asked questions and use those solutions to resolve issues that you might encounter with DataStage® Anywhere with remote runtime engines.
Contents
- How to purchase DataStage-aaS?
- Can IBM Cloud see my data when jobs are run on a remote engine?
- Is Red Hat OpenShift required?
- Is the remote engine available for stand-alone or classic DataStage?
- What is the difference between the control plane and data plane?
- Are remote engines for DataStage Anywhere available in any geolocation?
- Is there a limit to the number of remote engines deployed?
- Is an inbound and outbound network required?
- Is the runtime environment a set size?
- Can Databand monitor DataStage Anywhere jobs?
- Is remote engine capability available for other IBM Cloud Pak for Data products?
- Do I control my egress charges with DataStage Anywhere?
- Does the Container Process run as a non-root user?
- What is the remote engine behavior when assets are imported?
How to purchase DataStage-aaS?
You can purchase DataStage-aaS from the IBM Cloud catalog at https://cloud.ibm.com/catalog/services/datastage. Purchasing one plan for DataStage-aaS provides access to the control plane and remote data planes. You are charged based on the maximum number of remote engines that are deployed in a month, even if they remain unused.
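As a rough illustration of the billing rule above, the following sketch computes the billable engine count for a month as the peak number of deployed engines. The function name and the idea of periodic samples are assumptions made for this example only; IBM Cloud performs the actual metering and might measure deployments differently.

```python
from typing import Iterable


def billable_engines(deployed_counts: Iterable[int]) -> int:
    """Return the peak number of deployed remote engines in a billing month.

    `deployed_counts` is a hypothetical series of samples (for example, one
    per day) of how many remote engines were deployed, whether used or not.
    """
    counts = list(deployed_counts)
    if any(c < 0 for c in counts):
        raise ValueError("engine counts cannot be negative")
    # The charge follows the maximum, not the average or the final count.
    return max(counts, default=0)


# Two engines for ten days, briefly scaled to five, then back down to one.
samples = [2] * 10 + [5] * 3 + [1] * 17
print(billable_engines(samples))  # 5
```

In this example, the three days at five engines set the billable count for the whole month, which is why removing idle engines promptly matters.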
Can IBM Cloud see my data when jobs are run on a remote engine?
No. With DataStage-aaS, IBM Cloud has no direct access to the remote engine. The remote engine checks a queue that is maintained by IBM Cloud for job runs and job information.
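The pull model described above can be sketched as follows. This is a conceptual illustration only, not the remote engine's actual implementation: an in-memory `queue.Queue` stands in for the queue that IBM Cloud maintains, and `drain_job_queue` and the job names are hypothetical.

```python
import queue
from typing import Callable, List


def drain_job_queue(control_plane_queue: "queue.Queue[str]",
                    run_job: Callable[[str], str]) -> List[str]:
    """Pull every pending job from the queue and run it locally.

    The engine initiates every exchange (an outbound poll); the queue
    never calls into the engine.
    """
    results = []
    while True:
        try:
            job = control_plane_queue.get_nowait()  # engine-initiated poll
        except queue.Empty:
            break  # nothing pending; a real engine would wait and poll again
        results.append(run_job(job))  # the job runs on the engine side
    return results


# Stand-in for the queue that the control plane maintains.
pending: "queue.Queue[str]" = queue.Queue()
pending.put("job-a")
pending.put("job-b")

print(drain_job_queue(pending, lambda job: f"{job}: finished"))
```

Because the engine is always the caller, this pattern needs only outbound connectivity from the environment where the engine runs.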
Is Red Hat OpenShift required?
No. The remote engine runs as a lightweight container and does not require Red Hat OpenShift Container Platform.
Is the remote engine available for stand-alone or classic DataStage?
No. Remote engine capability is available only for modern DataStage and IBM Cloud.
What is the difference between the control plane and data plane?
DataStage flows are created on the control plane and DataStage jobs are run on the data plane. The control plane (design time) sits on-premises or on IBM Cloud and is where you access the DataStage project to design and interact with your flows. The Dallas and Frankfurt data centers only affect the control plane. The data plane (runtime) is the parallel engine where completed DataStage jobs are executed.
Are remote engines for DataStage Anywhere available in any geolocation?
Yes. With DataStage Anywhere, you can run DataStage jobs on a remote engine in your own environment, anywhere in the world. You can deploy remote engines in any geolocation, data center, or cloud, or on-premises. On IBM Cloud, the control plane (design time) is located only in Dallas and Frankfurt, but the data plane is local to the user and can run on any Docker or Kubernetes-based deployment.
Is there a limit to the number of remote engines deployed?
No. There is no limit to the number of remote engines that you can deploy. You can deploy one remote engine per project, multiple remote engines per project, or tie multiple projects to one remote engine. However, once a remote engine is selected for a specific project, you cannot switch back to an IBM-hosted environment.
Is an inbound and outbound network required?
Only the outbound network is needed. There are no incoming calls to the remote engine.
Is the runtime environment a set size?
No. You can enable dynamic scaling and add or remove engines throughout the month.
Can Databand monitor DataStage Anywhere jobs?
Yes.
Is remote engine capability available for other IBM Cloud Pak for Data products?
Remote engine capability currently supports only the DataStage runtime. Other IBM Cloud Pak® for Data services, such as Watson Pipelines, are not currently supported. You can work around this limitation by refactoring file processing into the before-job or after-job properties of a DataStage flow.
Do I control my egress charges with DataStage Anywhere?
Yes. With DataStage Anywhere, data egress costs are under your control. You can deploy the DataStage runtime in any location or in multiple locations. Place the runtime as close as possible to your workload to minimize or eliminate egress charges.
Does the Container Process run as a non-root user?
Yes. On Red Hat OpenShift clusters, the deployment uses the restricted security context constraints (SCCs).
What is the remote engine behavior when assets are imported?
- Import behavior for an ODBC connection that uses a DSN from the ODBC configuration
- When you import an ISX or ZIP file into a project that is bound to a Cloud Pak for Data remote engine, and the ODBC platform connection uses a DSN from odbc.ini, the connection is imported as a flow (local) connection. All imported flows and subflows that reference the connection are updated to use flow connections.
- Import behavior for a JDBC connection
- When you import an ISX or ZIP file into a project that is bound to a Cloud Pak for Data remote engine, the connection is imported as a flow (local) connection. You can manually upload the driver to the remote engine's persistent volume. All imported flows and subflows that reference the connection are updated to use flow connections.
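Before an import, it can help to know which DSNs an odbc.ini file defines, because connections that rely on them are converted to flow connections. The following sketch lists the DSN sections of an INI-style ODBC configuration by using Python's standard `configparser` module. The sample file content, the DSN name, and the driver path are made up for illustration, and `list_dsns` is a hypothetical helper, not part of DataStage.

```python
import configparser
from typing import List

# Hypothetical odbc.ini content; the DSN name and driver path are made up.
SAMPLE_ODBC_INI = """
[ODBC Data Sources]
SalesDB = Example driver

[SalesDB]
Driver = /opt/example/lib/driver.so
HostName = db.example.com
"""

# Sections that hold general settings rather than a single DSN definition.
NON_DSN_SECTIONS = {"ODBC", "ODBC Data Sources"}


def list_dsns(ini_text: str) -> List[str]:
    """Return the names of the DSN sections defined in odbc.ini text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of keys
    parser.read_string(ini_text)
    return [name for name in parser.sections() if name not in NON_DSN_SECTIONS]


print(list_dsns(SAMPLE_ODBC_INI))  # ['SalesDB']
```

Any platform connection that references one of the listed DSNs is a candidate for the import behavior described above.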