Data distribution using grid synchronization

IBM® Spectrum Symphony Advanced Edition provides a grid synchronization feature that distributes collections of files or directories (called data sets) from the source (client) to resource groups on target hosts. Rather than transferring the same data over the network multiple times, grid synchronization uses a peer-to-peer pipelining algorithm in which data passes from one host to the next. This relieves the network and resource bottlenecks associated with distributing data from a single source to all targets, saving time and bandwidth and ultimately speeding up SOAM workloads.
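To illustrate why passing data from host to host can beat sending it repeatedly from one source, here is a minimal timing sketch. It is not the product's algorithm: it assumes a hypothetical model in which the data set is split into equal chunks, every link moves one chunk per time unit, and hosts form a simple chain. The function names and the chunk model are illustrative only.

```python
def direct_time(num_targets: int, num_chunks: int, chunk_time: float = 1.0) -> float:
    """Single source sends the full data set to each target in turn over one uplink."""
    return num_targets * num_chunks * chunk_time


def pipelined_time(num_targets: int, num_chunks: int, chunk_time: float = 1.0) -> float:
    """Hosts form a chain; each host forwards a chunk once it has fully received it."""
    # arrival[h] is the time at which host h finished receiving its latest chunk.
    # Index 0 is the source, which holds every chunk from time 0.
    arrival = [0.0] * (num_targets + 1)
    for _ in range(num_chunks):
        for h in range(1, num_targets + 1):
            # Host h can start receiving only when the upstream host has the chunk
            # and host h has finished receiving the previous chunk.
            arrival[h] = max(arrival[h - 1], arrival[h]) + chunk_time
    return arrival[num_targets]


if __name__ == "__main__":
    print("direct:   ", direct_time(4, 10))
    print("pipelined:", pipelined_time(4, 10))
```

Under these assumptions the pipelined transfer finishes after roughly (chunks + hosts - 1) time units, while the direct transfer grows with (chunks × hosts). Real clusters differ in topology and link speed, so treat the numbers as a rough intuition only.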

IBM Spectrum Symphony grid synchronization also provides a consistency check that validates the expected list of files for the data set, as recorded on the repository server, against the actual files on the target host. It also calculates a checksum for every file in the data set and compares it with the expected value that the repository server maintains.
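The two checks described above (file list, then per-file checksum) can be sketched generically as follows. This is an illustration of the idea, not the product's implementation: the manifest format (a mapping of relative path to hex digest), the use of SHA-256, and the function names are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in blocks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def check_data_set(root: Path, expected: dict[str, str]) -> dict[str, list[str]]:
    """Compare files under root with an expected {relative_path: checksum} manifest."""
    actual = {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}
    wanted = set(expected)
    return {
        # Listed in the manifest but absent on the host.
        "missing": sorted(wanted - actual),
        # Present on the host but not listed in the manifest.
        "unexpected": sorted(actual - wanted),
        # Present in both, but the content differs from the expected checksum.
        "mismatched": sorted(
            name for name in actual & wanted
            if file_checksum(root / name) != expected[name]
        ),
    }
```

A data set would be considered consistent in this sketch when all three lists come back empty.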

Enabling grid synchronization

By default, grid synchronization is not enabled. When enabled, IBM Spectrum Symphony keeps hosts (in the data set's resource groups) in sync with the latest files in the data set. IBM Spectrum Symphony streams any changes to the data set to target hosts immediately, as the repository server receives them. To enable synchronization:

Set the GS_ENABLE_GRIDSYNC environment variable to Y in the rsa.xml configuration file. For example:

On Windows, update the NTX64 section:

<sc:ActivityDescription>
   <ego:Attribute name="hostType" type="xsd:string">NTX64</ego:Attribute>
   <ego:ActivitySpecification>
      ...
             <ego:EnvironmentVariable name="GS_ENABLE_GRIDSYNC">Y</ego:EnvironmentVariable>

On Linux®, update the all section:

<sc:ActivityDescription>
   <ego:Attribute name="hostType" type="xsd:string">all</ego:Attribute>
   <ego:ActivitySpecification>
      ...
             <ego:EnvironmentVariable name="GS_ENABLE_GRIDSYNC">Y</ego:EnvironmentVariable>

Set the grid synchronization service profile to automatic (<sc:StartType>AUTOMATIC</sc:StartType>):
1. From the cluster management console, select System & Services > EGO Services > Service Profiles.
2. From the list of services, click gridsync to edit the service profile for the grid synchronization service.
3. In the sc:ServiceDefinition > sc:ControlPolicy > sc:StartType section, click to change MANUAL to AUTOMATIC.
4. Click Save. The service must be stopped to apply the change. Click OK to stop the service.
5. Refresh the System & Services > EGO Services > Services page and verify that the service has started.

When disabled (<ego:EnvironmentVariable name="GS_ENABLE_GRIDSYNC">N</ego:EnvironmentVariable> and <sc:StartType>MANUAL</sc:StartType>), the repository server holds any changes to the data set until you re-enable synchronization.
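To confirm which GS_ENABLE_GRIDSYNC value each host type section carries, a small script can read the configuration and report it. The sketch below is an assumption-laden illustration: it matches elements by local name so that it does not depend on the real namespace URIs, and the embedded SAMPLE uses placeholder namespaces, a placeholder wrapper element, and a simplified nesting (the intermediate elements that the excerpts above elide are simply left out). Adapt it before pointing it at a real rsa.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for rsa.xml; namespace URIs are placeholders.
SAMPLE = """
<root xmlns:sc="urn:example:sc" xmlns:ego="urn:example:ego">
  <sc:ActivityDescription>
    <ego:Attribute name="hostType" type="xsd:string">NTX64</ego:Attribute>
    <ego:ActivitySpecification>
      <ego:EnvironmentVariable name="GS_ENABLE_GRIDSYNC">Y</ego:EnvironmentVariable>
    </ego:ActivitySpecification>
  </sc:ActivityDescription>
  <sc:ActivityDescription>
    <ego:Attribute name="hostType" type="xsd:string">all</ego:Attribute>
    <ego:ActivitySpecification>
      <ego:EnvironmentVariable name="GS_ENABLE_GRIDSYNC">N</ego:EnvironmentVariable>
    </ego:ActivitySpecification>
  </sc:ActivityDescription>
</root>
""".strip()


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def gridsync_settings(xml_text: str) -> dict[str, str]:
    """Map each hostType to the GS_ENABLE_GRIDSYNC value found in its section."""
    settings: dict[str, str] = {}
    for section in ET.fromstring(xml_text).iter():
        if local_name(section.tag) != "ActivityDescription":
            continue
        host_type = value = None
        for elem in section.iter():
            name = local_name(elem.tag)
            if name == "Attribute" and elem.get("name") == "hostType":
                host_type = (elem.text or "").strip()
            elif name == "EnvironmentVariable" and elem.get("name") == "GS_ENABLE_GRIDSYNC":
                value = (elem.text or "").strip()
        if host_type is not None and value is not None:
            settings[host_type] = value
    return settings


if __name__ == "__main__":
    for host_type, value in gridsync_settings(SAMPLE).items():
        print(f"{host_type}: GS_ENABLE_GRIDSYNC={value}")
```

Sections that do not set the variable at all are omitted from the result, which makes a missing entry easy to spot.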

Grid synchronization samples

IBM Spectrum Symphony Developer Edition provides grid synchronization samples for Java™, C++, Python, and .NET. The samples and their readme files are located in the following directories, under the directory where you installed IBM Spectrum Symphony Developer Edition:

/7.3.2/samples/Java/SampleGridsyncApp
/7.3.2/samples/CPP/SampleGridsyncApp
/7.3.2/samples/Python/SampleGridsyncApp (available for Linux only)
\7.3.2\samples\DotNet\SampleGridsyncApp (available for Windows only)

Grid synchronization log files

All create, modify, and delete actions on data sets for grid synchronization are recorded in the following log files:

For the repository server service:
- On Linux: $EGO_TOP/eservice/rs/rs.host_name.log
- On Windows: Installation_top\eservice\rs\rs.host_name.log
For the repository server agent service:
- On Linux: $EGO_TOP/eservice/rs/rsa.host_name.log
- On Windows: Installation_top\eservice\rs\rsa.host_name.log
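When scripting log collection, the paths listed above can be assembled from the installation directory and the host name. The helper below is a hypothetical convenience, not part of the product. It assumes you pass in the value of $EGO_TOP (or the Windows installation directory) and the host name exactly as it appears in the log file names.

```python
import ntpath
import posixpath


def gridsync_log_paths(top: str, host_name: str, windows: bool = False) -> dict[str, str]:
    """Build the repository server and agent log paths for one host."""
    # Pick the path flavor explicitly so the helper works from any platform.
    path = ntpath if windows else posixpath
    log_dir = path.join(top, "eservice", "rs")
    return {
        "repository server": path.join(log_dir, f"rs.{host_name}.log"),
        "repository server agent": path.join(log_dir, f"rsa.{host_name}.log"),
    }


if __name__ == "__main__":
    # "/opt/ego" and "hostA" are made-up example values.
    for service, log_path in gridsync_log_paths("/opt/ego", "hostA").items():
        print(f"{service}: {log_path}")
```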

Synchronization of data sets

Within the grid synchronization feature, you can also enable or disable synchronization for individual data sets (collections of files or directories). See Enabling synchronization for data sets and Disabling synchronization for data sets for details.