Amazon EC2 Deployments
Applies to: IBM StreamSets as a Service
You can create an Amazon EC2 deployment for an active AWS environment.
When you create an EC2 deployment, you define the engine type, version, and configuration to deploy to the Amazon VPC specified in the environment. You also specify the number of engine instances to deploy. Each engine instance runs on a dedicated EC2 instance.
When you start an EC2 deployment, Control Hub connects to the Amazon VPC specified in the environment and then creates an AWS CloudFormation stack. AWS CloudFormation provisions the group of EC2 instances in the VPC and then deploys and launches one IBM StreamSets engine instance on each EC2 instance.
AWS CloudFormation manages the provisioning and monitoring of the EC2 instances. Control Hub simply receives the status of the deployed engine instances and sends any updates to CloudFormation.
When you stop an EC2 deployment, CloudFormation deletes the existing EC2 instances.
For more information about AWS CloudFormation, see the AWS CloudFormation documentation.
Before you create an Amazon EC2 deployment, you must complete several prerequisites.
EC2 Instance Details
Engine Type | Software |
---|---|
Data Collector 5.11.x and later | |
Data Collector 5.10.x and earlier | |
Transformer 6.0.x and later | |
Transformer 5.9.x | |
Transformer 5.8.x and earlier | |
Transformer for Snowflake - all versions. Applicable when your organization uses a deployed Transformer for Snowflake engine. | |
Secrets
- Authentication token that the deployment uses to communicate with IBM StreamSets.
- Proxy credentials, including the HTTP and HTTPS proxy user and password, when you configure engines to use a proxy server.
Prerequisites
- Create an AWS environment - Create and activate an AWS environment in Control Hub, as described in AWS Environments.
- Configure an instance profile - Ask your AWS administrator to configure an instance profile in AWS to associate with the provisioned EC2 instances. If a default instance profile is defined for the parent AWS environment, you can skip this prerequisite and use the default. If a default is not set or if you'd like to override the default for the deployment, see Configure Instance Profiles for EC2 Instances.
- Optionally, create an EC2 key pair - Control Hub does not use or require an EC2 key pair to access the EC2 instances. However, if you plan to connect to the instances using SSH, ask your AWS administrator to create an Amazon EC2 key pair to associate with the provisioned EC2 instances.
- Optionally, set up an external resource archive - When your pipelines require external resources and you plan to deploy multiple engine instances, you must set up an external resource archive that all engine instances can access. When your pipelines do not require external resources, or when you use a single engine instance to get started with IBM StreamSets, you do not need to complete this prerequisite.
Configuring an Amazon EC2 Deployment
Configure an Amazon EC2 deployment to define the group of engine instances to deploy to an AWS environment.
To create a new deployment, click the Create Deployment icon.
To edit an existing deployment, open the Deployments view in the Navigation panel, click the deployment name, and then click Edit.
Define the Deployment
Define the deployment essentials, including the deployment name and type, the environment that the deployment belongs to, and the engine type and version to deploy.
Once saved, you cannot change the deployment type, the engine version, or the environment.
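The rule above can be illustrated with a small sketch. The `DeploymentEssentials` class and `apply_edit` function are hypothetical names invented for this example; they are not part of Control Hub or any IBM StreamSets SDK. The sketch only models the stated rule that the deployment type, engine version, and environment are fixed after the deployment is saved.

```python
from dataclasses import dataclass, replace

# Properties that the documentation says cannot change once the deployment is saved.
IMMUTABLE_FIELDS = frozenset({"deployment_type", "engine_version", "environment"})


@dataclass(frozen=True)
class DeploymentEssentials:
    """Hypothetical model of the deployment essentials (not a Control Hub type)."""
    name: str
    deployment_type: str
    environment: str
    engine_type: str
    engine_version: str


def apply_edit(saved: DeploymentEssentials, **changes) -> DeploymentEssentials:
    """Return an edited copy, rejecting changes to properties fixed at save time."""
    blocked = sorted(
        field for field, value in changes.items()
        if field in IMMUTABLE_FIELDS and value != getattr(saved, field)
    )
    if blocked:
        raise ValueError(f"cannot change after saving: {', '.join(blocked)}")
    return replace(saved, **changes)
```

For example, `apply_edit(saved, name="renamed")` returns an updated copy, while `apply_edit(saved, engine_version="6.0.0")` raises `ValueError` when the saved version differs.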
Configure the Engine
Define the configuration of the engine to deploy. You can use the defaults to get started.
Configure the EC2 Autoscaling Group
Configure details about the EC2 instances that will be provisioned.
Configure EC2 SSH Access
Optionally, select the Amazon EC2 key pair to associate with the provisioned EC2 instances.
Share the Deployment
By default, the deployment can only be seen by you. Share the deployment with other users and groups to grant them access to it.
Review and Launch the Deployment
You've successfully finished creating the deployment.
Editing an Amazon EC2 Deployment
You can edit an Amazon EC2 deployment while it is deactivated or active.
When you stop a deployment, all existing EC2 instances are deleted. After you edit properties and then restart the deployment, Control Hub uses AWS CloudFormation to provision a new group of EC2 instances and launch a new IBM StreamSets engine instance on each EC2 instance.
When you edit a deployment while it is active, existing EC2 instances might be deleted, depending on the following types of edited properties:
- General deployment or engine properties - When you edit general deployment or engine properties while the deployment is active, AWS CloudFormation continues running the existing EC2 instances. Changes are replicated to all engine instances on the next restart of the engines.
- EC2 properties - When you edit EC2 properties while the deployment is active, AWS CloudFormation might replace all of the existing EC2 instances, depending on the change. If a replacement is needed, CloudFormation deletes the EC2 instances in batches to prevent engine downtime. Each batch can contain up to 25% of the total number of instances in the deployment.
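To get a feel for the 25% batch limit, the following sketch estimates how a replacement might be split into batches. It is an approximation only: the source does not say how the limit is rounded, so the sketch assumes the limit is rounded down with a minimum of one instance per batch. The function name `replacement_batches` is hypothetical.

```python
def replacement_batches(total_instances: int, max_fraction: float = 0.25) -> list[int]:
    """Estimate batch sizes when every instance in a deployment is replaced.

    Assumption (not stated in the documentation): the per-batch limit is
    rounded down, and a batch always contains at least one instance.
    """
    if total_instances < 0:
        raise ValueError("total_instances must not be negative")
    batch_limit = max(1, int(total_instances * max_fraction))
    batches = []
    remaining = total_instances
    while remaining > 0:
        size = min(batch_limit, remaining)
        batches.append(size)
        remaining -= size
    return batches


print(replacement_batches(8))   # [2, 2, 2, 2]
print(replacement_batches(10))  # [2, 2, 2, 2, 2]
print(replacement_batches(3))   # [1, 1, 1]
```

Under these assumptions, a deployment with eight instances is replaced in four batches of two, so at least six instances remain available at any time.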
To edit a deployment, locate the deployment in the Deployments view. In the Actions column, click the More icon and then click Edit.
Tracking URL
When you view the details of an active Amazon EC2 deployment, you can access a tracking URL to the AWS Management Console. Use the URL to view additional information about the AWS resources automatically provisioned for the IBM StreamSets deployment.
To access the tracking URL, click an Amazon EC2 deployment name in the Deployments view and then locate the Tracking URL property in the deployment details.
The AWS CloudFormation stack details page includes the following tabs that are useful for monitoring the deployment:
- Events - Displays status and error messages that help with troubleshooting.
- Resources - Displays the resources created for the deployment, including the EC2 template and the auto scaling group.
- Parameters - Displays some of the values entered in the Control Hub UI. Use to verify that the IBM StreamSets parent environment and deployment are configured with the correct values, such as the security group, subnet, and IAM instance profile.
For example, the following image displays the Events tab for a sample CloudFormation stack:
The following topic provides brief tips on finding the most useful information about the provisioned resources. For more details about monitoring an AWS CloudFormation stack, see the AWS CloudFormation documentation.
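The information on the Events, Resources, and Parameters tabs can also be read programmatically. The sketch below assumes you have already fetched responses with the boto3 CloudFormation client calls `describe_stack_events`, `describe_stack_resources`, and `describe_stacks`, and it works only on dictionaries shaped like those responses. The sample IDs, parameter keys, and values are hypothetical, as is the assumption that the EC2 template is an `AWS::EC2::LaunchTemplate` resource.

```python
# Sample data shaped like boto3 CloudFormation responses. Every ID, key, and
# value here is hypothetical and used only for illustration.
STACK_EVENTS = {"StackEvents": [
    {"LogicalResourceId": "EngineLaunchTemplate", "ResourceStatus": "CREATE_COMPLETE"},
    {"LogicalResourceId": "EngineAutoScalingGroup", "ResourceStatus": "CREATE_FAILED",
     "ResourceStatusReason": "Invalid IAM instance profile"},
]}
STACK_RESOURCES = {"StackResources": [
    {"ResourceType": "AWS::EC2::LaunchTemplate", "PhysicalResourceId": "lt-0123456789abcdef0"},
    {"ResourceType": "AWS::AutoScaling::AutoScalingGroup", "PhysicalResourceId": "example-asg"},
]}
STACK = {"Stacks": [{"Parameters": [
    {"ParameterKey": "SubnetId", "ParameterValue": "subnet-aaa"},
    {"ParameterKey": "SecurityGroupId", "ParameterValue": "sg-bbb"},
]}]}


def failed_events(events_response: dict) -> list[tuple[str, str]]:
    """Events tab: return (logical ID, reason) for every *_FAILED event."""
    return [
        (event["LogicalResourceId"], event.get("ResourceStatusReason", ""))
        for event in events_response.get("StackEvents", [])
        if event.get("ResourceStatus", "").endswith("_FAILED")
    ]


def physical_ids_by_type(resources_response: dict) -> dict[str, str]:
    """Resources tab: map each resource type to its physical resource ID."""
    return {
        resource["ResourceType"]: resource.get("PhysicalResourceId", "")
        for resource in resources_response.get("StackResources", [])
    }


def parameter_mismatches(stacks_response: dict, expected: dict) -> dict:
    """Parameters tab: return {key: (expected, actual)} for values that differ."""
    parameters = stacks_response["Stacks"][0].get("Parameters", [])
    actual = {p["ParameterKey"]: p.get("ParameterValue") for p in parameters}
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }


print(failed_events(STACK_EVENTS))
print(physical_ids_by_type(STACK_RESOURCES)["AWS::AutoScaling::AutoScalingGroup"])
print(parameter_mismatches(STACK, {"SubnetId": "subnet-aaa", "SecurityGroupId": "sg-ccc"}))
```

Keeping the helpers independent of the AWS client makes them easy to try against saved responses before pointing them at a real stack.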
Auto Scaling Group
In the AWS CloudFormation stack details page, click the Resources tab and then click the ASG link.
The Auto Scaling group details page includes the following tabs:
- Activity - Displays status messages.
- Instance management - Includes a link to each provisioned EC2 instance.
For example, the following image displays the Instance management tab that includes one EC2 instance with a Healthy status:
In the Instances section, click an instance ID to view specific details about the EC2 instance, such as the private IP address. For example, the following image displays a sample EC2 instance summary page:
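The same instance details can be checked from code. The sketch below assumes responses shaped like those of the boto3 calls `describe_auto_scaling_groups` (Auto Scaling client) and `describe_instances` (EC2 client); it does not call AWS itself. The group name, instance IDs, and IP address are hypothetical sample values.

```python
# Sample data shaped like boto3 responses. All names, IDs, and addresses are
# hypothetical and used only for illustration.
ASG_RESPONSE = {"AutoScalingGroups": [{
    "AutoScalingGroupName": "example-asg",
    "Instances": [
        {"InstanceId": "i-0aaa", "HealthStatus": "Healthy", "LifecycleState": "InService"},
        {"InstanceId": "i-0bbb", "HealthStatus": "Unhealthy", "LifecycleState": "Terminating"},
    ],
}]}
EC2_RESPONSE = {"Reservations": [{"Instances": [
    {"InstanceId": "i-0aaa", "PrivateIpAddress": "10.0.1.15"},
]}]}


def instances_needing_attention(asg_response: dict) -> list[str]:
    """Return IDs of instances that are not both Healthy and InService."""
    flagged = []
    for group in asg_response.get("AutoScalingGroups", []):
        for instance in group.get("Instances", []):
            healthy = instance.get("HealthStatus") == "Healthy"
            in_service = instance.get("LifecycleState") == "InService"
            if not (healthy and in_service):
                flagged.append(instance["InstanceId"])
    return flagged


def private_ips(ec2_response: dict) -> dict[str, str]:
    """Map each instance ID to its private IP address, when one is reported."""
    return {
        instance["InstanceId"]: instance["PrivateIpAddress"]
        for reservation in ec2_response.get("Reservations", [])
        for instance in reservation.get("Instances", [])
        if "PrivateIpAddress" in instance
    }


print(instances_needing_attention(ASG_RESPONSE))  # ['i-0bbb']
print(private_ips(EC2_RESPONSE))                  # {'i-0aaa': '10.0.1.15'}
```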