Granting the Spark Cluster Access to Transformer
When Transformer works with a Spark installation that runs on a cluster, the Spark cluster must be able to access Transformer to send the status, metrics, and offsets for running pipelines.
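Granting the Spark cluster access to Transformer involves specifying a cluster callback URL that Spark uses to communicate with Transformer.
The steps that you complete depend on the type of deployment that the Transformer engine belongs to: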
- Self-managed deployment
- Cloud service provider deployment, including an Amazon EC2, Azure VM, or GCE deployment
- Kubernetes deployment
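As an illustration of what a cluster callback URL looks like, the following minimal Python sketch composes one from a host name and port. The helper name `build_callback_url`, the example host `transformer.example.com`, and the default port 19630 are assumptions made for this sketch and are not taken from this page. Substitute the host and port at which the Spark cluster can actually reach your Transformer engine.

```python
def build_callback_url(host: str, port: int = 19630, scheme: str = "http") -> str:
    """Compose a callback URL that cluster nodes could use to reach Transformer.

    The default port 19630 is an assumption for this sketch; use the port
    that your Transformer engine actually listens on.
    """
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    # A loopback address is only reachable from the Transformer machine itself,
    # so it cannot work as a callback target for remote cluster nodes.
    if not host or host in ("localhost", "127.0.0.1"):
        raise ValueError("host must be reachable from the Spark cluster nodes")
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return f"{scheme}://{host}:{port}"


if __name__ == "__main__":
    # Hypothetical host name, for illustration only.
    print(build_callback_url("transformer.example.com"))
```

The sketch rejects loopback addresses because the Spark cluster, not the Transformer machine, is the party that opens the connection.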
Granting Access for a Self-Managed Deployment
Complete the following steps when the Transformer engine belongs to a self-managed deployment.
Granting Access for Cloud Service Provider Deployments
Complete the following steps when the Transformer engine belongs to a cloud service provider deployment, including an Amazon EC2, Azure VM, or GCE deployment.
Granting Access for a Kubernetes Deployment
Granting the Spark cluster access to Transformer when using a Kubernetes deployment involves exposing the Transformer container outside the cluster using a Kubernetes service.
Optionally, you can associate an Ingress with the service. An Ingress can provide load balancing, SSL termination, and name-based virtual hosting to the services in a Kubernetes cluster.
For more information, see the Kubernetes services and Ingress documentation.
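As a rough sketch of exposing the Transformer container through a Kubernetes service, the Python snippet below builds a Service manifest as a dictionary and prints it as JSON. The service name `transformer`, the pod label `app: transformer`, the port 19630, and the `LoadBalancer` service type are all assumptions chosen for illustration. Adjust them to match the labels and port of your own Transformer pods and the service type that your cluster supports.

```python
import json


def transformer_service(
    name: str = "transformer",          # hypothetical service name
    app_label: str = "transformer",     # hypothetical pod label value
    port: int = 19630,                  # assumed Transformer port
    service_type: str = "LoadBalancer",
) -> dict:
    """Return a Kubernetes Service manifest that exposes the Transformer pods."""
    if service_type not in ("ClusterIP", "NodePort", "LoadBalancer"):
        raise ValueError("unsupported service type for this sketch")
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            # Routes traffic to pods that carry the label app=<app_label>.
            "selector": {"app": app_label},
            "ports": [
                {"name": "http", "protocol": "TCP", "port": port, "targetPort": port}
            ],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(transformer_service(), indent=2))
```

If the snippet is saved as a script, its output can be piped to `kubectl apply -f -`. The external address that Kubernetes assigns to the service is then a candidate for the cluster callback URL.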