High Availability
You can run a single StreamSets Control Hub instance in a development environment. However, in a production environment, we recommend using multiple Control Hub instances and a load balancer to ensure that Control Hub is highly available.
To set up Control Hub as a highly available system, complete the following tasks:
- Use highly available database clusters:
  - For the relational database, use MariaDB Galera Cluster, MySQL Enterprise High Availability, or PostgreSQL with high availability enabled.
  - For the time series database, use InfluxEnterprise.
- Install Control Hub on multiple machines. Ensure that each Control Hub instance uses the same relational database, the same time series database, and the same SMTP account for emails.
- Set up a load balancer to distribute requests from users, registered Data Collectors, and Provisioning Agents across the Control Hub instances. These Control Hub clients use the load balancer URL to communicate with the Control Hub system. In addition, each Control Hub instance accesses the front end of the load balancer to communicate with the other Control Hub instances.
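The requirement that every instance shares the same databases and SMTP account can be checked before the instances are placed behind the load balancer. The following Python sketch is one possible way to do that. It assumes that the relevant settings of each instance have already been collected into a dictionary. The setting names (`relational_db_url`, `time_series_db_url`, `smtp_account`), the instance names, and the sample values are hypothetical and are not actual Control Hub configuration properties.

```python
from typing import Dict, List, Optional

# Hypothetical setting names, chosen for illustration only. They are not
# actual Control Hub configuration properties.
SHARED_KEYS = ("relational_db_url", "time_series_db_url", "smtp_account")


def find_mismatches(instances: Dict[str, Dict[str, str]]) -> List[str]:
    """Return one message per shared setting that is missing or differs."""
    problems: List[str] = []
    for key in SHARED_KEYS:
        # Map each instance name to its value for this setting (None if absent).
        values: Dict[str, Optional[str]] = {
            name: cfg.get(key) for name, cfg in instances.items()
        }
        distinct = set(values.values())
        if None in distinct:
            problems.append(f"{key} is missing on at least one instance: {values}")
        elif len(distinct) > 1:
            problems.append(f"{key} differs across instances: {values}")
    return problems


if __name__ == "__main__":
    # Sample data with made-up host names. The second instance points at a
    # different SMTP account, so exactly one problem is reported.
    sample = {
        "sch-1": {
            "relational_db_url": "jdbc:mysql://db.example.com:3306/sch",
            "time_series_db_url": "http://influx.example.com:8086",
            "smtp_account": "alerts@example.com",
        },
        "sch-2": {
            "relational_db_url": "jdbc:mysql://db.example.com:3306/sch",
            "time_series_db_url": "http://influx.example.com:8086",
            "smtp_account": "other@example.com",
        },
    }
    for problem in find_mismatches(sample):
        print(problem)
```

An empty result means that the instances agree on all three settings. How the settings are read from each machine is left out on purpose, because it depends on how Control Hub is installed and configured in your environment.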
The following image displays the components of a highly available Control Hub system: