GCE Deployments
Applies to: IBM StreamSets as a Service
You can create a Google Compute Engine (GCE) deployment for an active GCP environment.
When you create a GCE deployment, you define the engine type, version, and configuration to deploy to the Google Cloud project and VPC network specified in the environment. You also specify the number of engine instances to deploy. Each engine instance runs on a dedicated Google Compute Engine VM instance.
When you start a GCE deployment, Control Hub connects to the project and VPC network specified in the environment and then uses Google Cloud Deployment Manager to create a Google deployment. Google Cloud Deployment Manager provisions the group of VM instances in the VPC network and then deploys and launches one IBM StreamSets engine instance on each VM instance.
Google Cloud Deployment Manager manages the provisioning and monitoring of the VM instances. Control Hub simply receives the status of the deployed engine instances and sends any updates to Deployment Manager.
When you stop a GCE deployment, Deployment Manager deletes the existing VM instances.
For more information about Google Cloud Deployment Manager, see the Google Cloud Deployment Manager documentation.
Before you create a GCE deployment, you must complete several prerequisites.
VM Instance Details
Engine Type | Software |
---|---|
Data Collector 5.11.x and later | |
Data Collector 5.10.x and earlier | |
Transformer 6.0.x and later | |
Transformer 5.9.x | |
Transformer 5.8.x and earlier | |
Secrets Policy
- Authentication token that the deployment uses to communicate with IBM StreamSets.
- Proxy credentials, including the HTTP and HTTPS proxy user and password, when you configure engines to use a proxy server.
- Automatic
- A secret with an automatic replication policy has its payload data replicated without restriction. This configuration is recommended for most users.
- User Managed
- A secret with a user managed replication policy has its payload data replicated to a set of locations that you specify. The secret can be replicated to one or more supported locations.
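As an illustration of the two policy types, the following minimal sketch builds the `replication` field of a secret in the shape used by the Secret Manager REST API. Control Hub creates the secrets for you, so this is for orientation only. The function name `replication_policy` is hypothetical and the locations are example values.

```python
def replication_policy(locations=None):
    """Build the `replication` field of a Secret Manager secret.

    With no locations, the automatic policy is used. Otherwise the secret
    payload is replicated only to the listed locations (user managed).
    """
    if not locations:
        # Automatic: payload data is replicated without restriction.
        return {"automatic": {}}
    # User managed: one replica entry per location that you specify.
    return {"userManaged": {"replicas": [{"location": loc} for loc in locations]}}


automatic = replication_policy()
user_managed = replication_policy(["us-east1", "europe-west1"])
print(automatic)
print(user_managed)
```

A user managed policy needs at least one location, which is why an empty list falls back to the automatic policy in this sketch.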
Prerequisites
- Create a Google Cloud Platform (GCP) environment
- Create and activate a GCP environment in Control Hub, as described in GCP Environments.
- Create an instance service account
- Ask your Google Cloud administrator to create an instance service account in Google Cloud to associate with the provisioned VM instances. If a default instance service account is defined for the parent GCP environment, you can skip this prerequisite and simply use the default. If a default is not set or if you'd like to override the default for the deployment, see Create Instance Service Accounts for VM Instances.
- Optionally, create an SSH key pair
- Control Hub does not use or require an SSH key pair to access the VM instances. However, if you’d like to use an SSH key to access the provisioned VM instances, create an SSH key pair to associate with the VM instances.
- Optionally, set up an external resource archive
- When your pipelines require external resources and when you plan to deploy multiple engine instances, you must set up an external resource archive that all engine instances can access. When your pipelines do not require external resources or when using a single engine instance to get started with IBM StreamSets, you do not need to complete this prerequisite.
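The rule for the external resource archive can be summarized as a small check. This is a sketch of the decision described above, not part of any IBM StreamSets API. The function name and parameters are hypothetical.

```python
def needs_external_resource_archive(pipelines_use_external_resources: bool,
                                    engine_instances: int) -> bool:
    """Return True when the external resource archive prerequisite applies.

    The archive is required only when pipelines need external resources
    AND more than one engine instance will be deployed.
    """
    if engine_instances < 1:
        raise ValueError("engine_instances must be at least 1")
    return pipelines_use_external_resources and engine_instances > 1


print(needs_external_resource_archive(True, 3))   # multiple engines sharing resources
print(needs_external_resource_archive(True, 1))   # single engine to get started
print(needs_external_resource_archive(False, 3))  # no external resources needed
```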
Configuring a GCE Deployment
To create a new deployment, click the Create Deployment icon.
To edit an existing deployment, go to the Deployments view from the Navigation panel, click the deployment name, and then click Edit.
Define the Deployment
Define the deployment essentials, including the deployment name and type, the environment that the deployment belongs to, and the engine type and version to deploy.
Once saved, you cannot change the deployment type, the engine version, or the environment.
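The following sketch shows one way to think about the properties that are locked after the first save. It assumes deployments are represented as plain dictionaries. The key names and the `locked_changes` helper are hypothetical and do not correspond to a Control Hub API.

```python
# Properties that cannot be changed once the deployment is saved.
LOCKED_AFTER_SAVE = ("deployment_type", "engine_version", "environment")


def locked_changes(saved: dict, edited: dict) -> list:
    """Return the locked properties whose values differ from the saved ones."""
    return [key for key in LOCKED_AFTER_SAVE
            if key in edited and edited[key] != saved.get(key)]


saved = {
    "name": "gce-dev",
    "deployment_type": "GCE",
    "engine_version": "5.11.0",
    "environment": "gcp-env-1",
}
edited = dict(saved, name="gce-dev-renamed", engine_version="5.12.0")
print(locked_changes(saved, edited))
```

In this example, the name change is allowed, while the engine version change is reported as a change to a locked property.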
Configure the Engine
Define the configuration of the engine to deploy. You can use the defaults to get started.
Configure the GCE Region and Secrets Policy
Select the region to provision the Google Compute Engine VM instances in and the replication policy type for GCP Secret Manager secrets.
Configure the GCE Zone and Subnet
Select one or more zones and a subnet to provision the Google Compute Engine VM instances in. You can select from the available zones and subnets within the selected GCE region and VPC network.
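Google Compute Engine zone names consist of the region name followed by a hyphen and a zone letter, for example `us-central1-a` in region `us-central1`. The sketch below uses that naming convention to show which zones belong to a selected region. The helper names are hypothetical, and the UI performs this filtering for you.

```python
def region_of(zone: str) -> str:
    """Derive the region from a zone name, e.g. 'us-central1-a' -> 'us-central1'."""
    return zone.rsplit("-", 1)[0]


def selectable_zones(region: str, all_zones: list) -> list:
    """Keep only the zones that belong to the selected region."""
    return [zone for zone in all_zones if region_of(zone) == region]


zones = ["us-central1-a", "us-central1-b", "us-east1-b", "europe-west1-d"]
print(selectable_zones("us-central1", zones))
```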
Configure the GCE Autoscaling Group
Configure details about the Google Compute Engine VM instances that will be provisioned.
Configure GCE SSH Access
Optionally, configure SSH key access for the provisioned Google Compute Engine VM instances and whether to attach external IP addresses to the instances.
Share the Deployment
By default, the deployment can only be seen by you. Share the deployment with other users and groups to grant them access to it.
Review and Launch the Deployment
You've successfully finished creating the deployment.
Editing a GCE Deployment
You can edit a GCE deployment while it is deactivated or active.
When you stop a deployment, all existing VM instances are deleted. After you edit properties and then restart the deployment, Control Hub uses Google Cloud Deployment Manager to provision a new group of VM instances and launch a new IBM StreamSets engine instance on each VM instance.
- General deployment or engine properties
- When you edit general deployment or engine properties while the deployment is active, Google Cloud Deployment Manager continues running the existing VM instances. Changes to all engine instances are replicated on the next restart of the engines.
- GCE properties
- When you edit GCE properties while the deployment is active, Google Cloud Deployment Manager replaces all of the existing VM instances. This results in engine downtime while the new instances are being provisioned.
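The effect of an edit depends on the deployment state and on the kind of property edited. The following sketch encodes the behavior described above as a lookup. The function name, the property group labels, and the returned wording are hypothetical and serve only as a summary.

```python
def edit_effect(active: bool, property_group: str) -> str:
    """Summarize what happens after editing a GCE deployment.

    property_group is one of 'general', 'engine', or 'gce'.
    """
    if property_group not in ("general", "engine", "gce"):
        raise ValueError(f"unknown property group: {property_group}")
    if not active:
        # A stopped deployment has no VM instances; they are created on restart.
        return "new VM instances are provisioned when the deployment restarts"
    if property_group == "gce":
        # GCE changes force replacement of every VM instance.
        return "all VM instances are replaced, causing engine downtime"
    # General and engine changes leave the VM instances running.
    return "VM instances keep running; changes apply on the next engine restart"


for group in ("general", "engine", "gce"):
    print(group, "->", edit_effect(True, group))
```

Checking the outcome before editing an active deployment helps you schedule GCE property changes for a time when engine downtime is acceptable.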
To edit a deployment, locate the deployment in the Deployments view. In the Actions column, click the More icon and then click Edit.
Tracking URL
When you view the details of an active GCE deployment, you can access a tracking URL to the Google Cloud Console. Use the URL to view details about the Google Cloud resources automatically provisioned for the IBM StreamSets deployment.
To access the tracking URL, click a GCE deployment name in the Deployments view and then locate the Tracking URL property in the deployment details.
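The tracking URL opens the Google deployment overview page, which links to the following provisioned resources:
- VM instance template
- Managed instance group
- Autoscaler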
For example, the following image displays a sample overview page:
The following topics provide brief tips on finding the most useful information about the provisioned resources. For more details about monitoring a Google Cloud deployment, see the Google Cloud documentation.
VM Instance Template
In the Google deployment overview page, click the link to the VM instance template and then click Manage Resource on the right.
The Google Cloud Console displays details about the instance template. Use the details to verify that the IBM StreamSets parent environment and deployment are configured with the correct values, such as the networking information or the SSH key.
Managed Instance Group
In the Google deployment overview page, click the link to the managed instance group and then click Manage Resource on the right.
The Google Cloud Console displays details about the instance group, including the status of the instance group, the number of provisioned VM instances, and an Errors tab. The Errors tab lists errors that occurred while provisioning the managed instance group; however, the list is not necessarily comprehensive.
For example, the following image displays an instance group with a Ready status that includes one VM instance:
In the Instance Group Members section, click an instance name to view specific details about the VM instance. For example, click instance-5rz2 in the image above. The VM instance details page also allows you to use SSH to connect to the VM instance, even if you didn't provide an SSH key when creating the deployment.