Multiple connection managers
Multiple connection managers are a new capability designed to enhance scanning performance and enable parallel ingestion. The capability is especially valuable when data sources are geographically dispersed and must be scanned as remote sources.
With this new capability, users can take advantage of two primary deployment scenarios:
- The first scenario involves deploying multiple connection managers within the same cluster, allowing for efficient coordination and distribution of scanning tasks. It enables optimized resource utilization and faster processing of data.
- The second scenario involves adding external nodes to the system and deploying one or more connection managers on these nodes. This distributed setup further enhances scanning performance by using extra computing resources and enabling parallel execution of scanning operations across multiple nodes.
Note:
- Multiple connection managers improve scanning performance, but the increase in scanning speed does not increase the performance of indexing records into the database.
The indexing process might have its own limitations and dependencies that affect overall performance.
Example deployment scenario:
- Main cluster located in Mexico with 2 connection manager deployments.
- One remote worker located in France with 3 connection managers deployed to scan France data sources.
- One remote worker located in Canada with 2 connection managers deployed to scan Canada data sources.
Example:
kind: SpectrumDiscover
apiVersion: spectrum-discover.ibm.com/v1alpha1
metadata:
  name: spectrumdiscover-sample
  namespace: discover
spec:
  license:
    accept: true
  doInstall: true
  rwx_storage_class: ibmc-file-gold-gid
  connmgr:
    site: mexico
    replicas: 2
    extraLocations:
      - site: france
        locationType: remote
        replicas: 2
      - site: canada
        locationType: remote
        replicas: 3
        affinity:
          tolerations:
            - effect: PreferNoSchedule
              key: isd
              operator: Exists
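For comparison, the first scenario (all connection managers within the same cluster) might be expressed with the connmgr block alone. This is a minimal sketch, not a confirmed configuration: it reuses only the site and replicas fields from the example above, the replica count of 3 is illustrative, and it assumes that extraLocations can be omitted and that the remaining spec fields stay unchanged.

```yaml
# Hypothetical fragment: single-cluster deployment, no remote locations.
# Field names come from the example above; the nesting shown here and the
# optionality of extraLocations are assumptions.
spec:
  connmgr:
    site: mexico   # main location
    replicas: 3    # illustrative number of connection managers in the main cluster
```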
To change the main location, specify it in the site property as follows:
connmgr:
  site: france
Note: If the site does not exist in a statefulset (for instance, example.com) or is empty, the internal scheduler assigns it to any available connection manager of the local type.