What's New in Enterprise Replication for Informix, Version 12.10

This publication includes information about new features and changes in existing functionality.

For a complete list of what's new in this release, go to What's new in Informix®.

Table 1. What's New in IBM Informix Enterprise Replication Guide for 12.10.xC8
Overview Reference
Consistent sharded insert, update, and delete operations

When you run sharded operations that insert, update, or delete data, the transactions are now applied with the two-phase commit protocol instead of being eventually consistent. Data is moved to the appropriate shard server before the transaction is committed.

For sharding with Enterprise Replication commands, you must set the new USE_SHARDING session environment option to enable consistent sharded insert, update, and delete operations.
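
For example, a session that runs sharded DML statements might enable the option as follows. This is a minimal sketch; the exact SET ENVIRONMENT syntax and accepted values are in the USE_SHARDING reference topic.

    -- Enable consistent sharded insert, update, and delete for this session
    SET ENVIRONMENT USE_SHARDING ON;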

Creating a shard cluster
List Enterprise Replication definition commands

With the new cdr list catalog command, you can print a list of the commands that you ran to define replication servers, replicates, replicate sets, templates, or grids. You can use the list of commands to easily duplicate a system for troubleshooting or to move a test system into production.
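
For example, the following sketch connects to a replication server and prints the definition commands; the --connect option shown is the standard cdr connection option, and other output options are listed in the reference topic.

    cdr list catalog --connect=g_server_1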

cdr list catalog
Table 2. What's New in IBM Informix Enterprise Replication Guide for 12.10.xC7
Overview Reference
Quickly add or remove shard servers with consistent hashing

You can quickly add or remove a shard server by using the new consistent hashing distribution strategy to shard your data. With consistent hash-based sharding, the data is automatically distributed between shard servers in a way that minimizes the data movement when you add or remove shard servers. The original hashing algorithm redistributes all the data when you add or remove a shard server. You can specify the consistent hashing strategy when you run the cdr define shardCollection command.
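
The following sketch defines a sharded table that is distributed across three servers with consistent hashing; the collection name, database, table, key column, and server group names are placeholders.

    cdr define shardCollection collection_1 db_1:informix.customers \
        --type=delete --key=state --strategy=chash --partitions=3 \
        --versionCol=version \
        g_shard_server_A g_shard_server_B g_shard_server_C

With the chash strategy, the --partitions option sets the number of hashing partitions that are created for each shard server; a larger number of partitions generally distributes the data more evenly when servers are added or removed.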

Shard cluster definitions

cdr define shardCollection

Table 3. What's New in IBM Informix Enterprise Replication Guide for 12.10.xC6
Overview Reference
Parallel sharded queries

You can now run SELECT statements in sharded queries in parallel instead of serially on each shard. Parallel sharded queries return results faster and also have the following benefits:

  • Reduced memory consumption: Table consistency is enforced on the shard servers, which eliminates the processing of data dictionary information among the shard servers.
  • Reduced network traffic: Client connections are multiplexed over a common pipe instead of individual connections being created between each client and every shard server. Client connections are authenticated on only one shard server instead of on every shard server. Network traffic to check table consistency is eliminated.

To enable parallel sharded queries, set the new SHARD_ID configuration parameter in the onconfig file to a unique value on each shard server in the shard cluster. Also set the new sharding.parallel.query.enable=true and sharding.enable=true parameters in the wire listener configuration file for each shard server. You can customize how shared memory is allocated for parallel sharded queries on each shard server by setting the new SHARD_MEM configuration parameter. You can reduce latency between shard servers by increasing the number of pipes for SMX connections with the new SMX_NUMPIPES configuration parameter.
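
For example, the settings on one shard server might look like the following sketch; the values are placeholders, and the valid ranges are described in the configuration parameter topics.

    # onconfig file: each shard server needs a unique SHARD_ID value
    SHARD_ID       1
    SHARD_MEM      0
    SMX_NUMPIPES   2

    # wire listener configuration file on each shard server
    sharding.enable=true
    sharding.parallel.query.enable=true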

If you upgrade an existing shard cluster from a previous version of Informix 12.10, set the SHARD_ID configuration parameter on all the shard servers after the upgrade to enable parallel sharded queries.

Shard cluster setup

SHARD_ID configuration parameter

SHARD_MEM configuration parameter

Table 4. What's New in IBM Informix Enterprise Replication Guide for 12.10.xC4
Overview Reference
Replicate hertz and compressed time series data

You can now replicate hertz and compressed time series data with Enterprise Replication.

Replication of TimeSeries data types
Enhancements to the Enterprise Replication apply process and memory pool allocation

You can now choose between two new methods of memory pool allocation for Enterprise Replication. Set the new CDR_MEM configuration parameter to specify whether Enterprise Replication allocates memory pools for CPU virtual processors or uses a fixed-block memory pool allocation strategy.

Transaction apply performance is also faster in large-scale grid environments.
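
For example, in the onconfig file; the value shown is a placeholder, and the meaning of each supported value is described in the CDR_MEM reference topic.

    # Memory pool allocation strategy for Enterprise Replication
    CDR_MEM 1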

CDR_MEM configuration parameter
New event alarm for blocked replication transactions

The new event alarm 33003 appears if Enterprise Replication transactions are being blocked because a table is in alter mode.

Enterprise Replication Event Alarms
Table 5. What's New in IBM Informix Enterprise Replication Guide for 12.10.xC3
Overview Reference
Shard data across Enterprise Replication servers

Using Enterprise Replication, Informix can now horizontally partition (shard) a table or collection across multiple database servers. When you create a sharding definition through the cdr utility, rows from a table or documents from a collection can be distributed across the nodes of an Enterprise Replication system, reducing the number of rows or documents and the size of the index on each node. When you distribute data across database servers, you also distribute performance across hardware. As your database grows in size, you can scale up by adding more database servers.

Shard cluster setup
Easier configuration and cloning of a server for replication

If you create a server during installation, you can easily create an Enterprise Replication domain or a high-availability cluster. Previously, you had to configure connectivity manually on each server.

Run the cdr autoconfig serv command to configure connectivity and start Enterprise Replication.
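
A minimal sketch of the command is shown below; options for specifying the servers to configure are described in the cdr autoconfig serv reference topic.

    cdr autoconfig serv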

cdr autoconfig serv
Table 6. What's New in IBM Informix Enterprise Replication Guide for 12.10.xC2
Overview Reference
Set up and query time series data through a grid

If you plan to replicate time series data, you can set up time series through a grid. You can run the commands to set up time series on one grid server and propagate the commands to the other grid servers.

You can query time series data in the context of a grid. However, you can run a grid query only on a virtual table that is based on a table that has a TimeSeries column.

Replication of TimeSeries data types
Simplified schema changes for replicated tables

If you make many changes to the schema of replicated tables that belong to a replicate set, you can easily update the replicate definitions to reflect the schema changes. After you alter replicated tables, run the cdr define replicateset command with the --needRemaster option to derive a replicate set that consists of only the replicates that are affected by the alter operations. You remaster the derived replicate set by running the cdr remaster replicateset command. You do not need to update or remaster every replicate individually.

If the only change is to drop columns from multiple replicated tables, you can run the cdr remaster command with the --remove option.
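
For example, after you alter the replicated tables, the remastering sequence might look like the following sketch; the replicate set names are placeholders, and the exact option syntax is in the reference topics.

    cdr define replicateset --needRemaster=accounts_set accounts_set_derived
    cdr remaster replicateset accounts_set_derived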

Altering multiple tables in a replicate set

Removing replicated columns

Example of rolling out schema changes in a grid

Control the replication of large objects

By default, when any column in a replicate row is changed, Enterprise Replication replicates the entire row. However, to improve performance, columns that contain a large object are replicated only when the content of the large object changes. You can force the replication of large objects by including the --alwaysRepLOBs=y option with the cdr define replicate, cdr modify replicate, or cdr define template command. Always including large object columns in replicated rows can be useful if you have a workflow replication system.
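
For example, the following sketch forces the replication of large-object columns for an existing replicate; the replicate name is a placeholder.

    cdr modify replicate --alwaysRepLOBs=y repl_orders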

Controlling the replication of large objects
Custom checksum function for consistency checking

When you check the consistency of replicated rows, a checksum is generated for each row on each server and then the corresponding checksums are compared. You can write your own checksum function instead of using the checksum function that is supplied with the database server.

Implementing a custom checksum function
Shard tables across database servers

You can now shard, or horizontally partition, a table across multiple database servers. Rows from a table can be distributed across a cluster of database servers, which reduces the number of rows and the size of the index in the database on each server. When you distribute data across database servers, you also distribute performance across hardware, which can result in significant performance improvements. As your database grows in size, you can scale up by adding more database servers.

cdr define shardCollection
Table 7. What's New in IBM Informix Enterprise Replication Guide for 12.10.xC1
Overview Reference
Automatic space management for Enterprise Replication

If you have a storage pool, storage spaces are created automatically if needed when you define a replication server. Also, the CDR_DBSPACE and CDR_QDATA_SBSPACE configuration parameters are set automatically in the onconfig file. In earlier versions of Informix, you had to create the required spaces and set the configuration parameters before you could define a replication server.
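
For example, if a storage pool exists, defining a replication server can be as simple as the following sketch; the server group name is a placeholder.

    cdr define server --init g_server_1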

cdr define server

CDR_QDATA_SBSPACE Configuration Parameter

CDR_DBSPACE Configuration Parameter

Managing server connections on Windows operating systems

On Windows operating systems, you now configure connectivity information for Informix servers by using the sqlhosts file, not the Windows registry. The file is installed in %INFORMIXDIR%\etc\sqlhosts.%INFORMIXSERVER%, and it uses the same format as the sqlhosts file on UNIX operating systems. The sync_registry Scheduler task automatically converts the connection information between the sqlhosts file format and the Windows registry format. The task runs every 15 minutes. You can manually convert the connection information between the sqlhosts file format and the Windows registry format by running the syncsqlhosts utility.
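
The file uses the standard four-column sqlhosts format; the server, host, and service names below are placeholders.

    # dbservername   nettype    hostname             servicename
    ids_server1      onsoctcp   host1.example.com    9088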

Preparing the Network Environment
Reduce replication latency between Enterprise Replication and shared-disk secondary servers

If an Enterprise Replication server is a primary server for shared-disk secondary servers, you can reduce replication latency by reducing the number of transactions that are applied before the logs are flushed to disk. By default, the logs are flushed after 50 transactions are applied, or 5 seconds elapse. You can set the CDR_MAX_FLUSH_SIZE configuration parameter to 1 to flush the logs after every transaction and reduce replication latency.
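
For example, in the onconfig file of the Enterprise Replication primary server:

    # Flush logs after every applied replicated transaction
    CDR_MAX_FLUSH_SIZE 1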

CDR_MAX_FLUSH_SIZE configuration parameter
Apply transactions for a replicate serially

You can specify that replicated transactions for a specific replicate are applied serially. By default, replicated transactions are applied in parallel. If Enterprise Replication detects deadlock conditions, it automatically reduces the parallelism of the replication system until the problem is resolved. If you have a replicate that consistently reduces parallelism or your application requires serial processing, include the --serial option when you define or modify the replicate. By isolating a problematic replicate, you can improve the performance of the rest of the replication system. The onstat -g rcv full command displays the number of concurrent transactions and whether any replicate is preventing parallel processing.
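
For example, the following sketch switches an existing replicate to serial apply and then checks the apply status; the replicate name is a placeholder.

    cdr modify replicate --serial repl_inventory
    onstat -g rcv full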

cdr define replicate
Replicate tables without primary keys or ERKEY columns

Enterprise Replication requires a unique key to replicate data. Previously, Enterprise Replication required that the replicated table definition included a primary key or the ERKEY shadow columns. ERKEY columns require extra storage space. You can now specify the columns in a unique index as the replication key with the --key option, or allow Enterprise Replication to assign a primary key, ERKEY columns, or a unique index as the replication key with the --anyUniqueKey option.
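
The following sketch defines a replicate that lets Enterprise Replication choose the replication key; the replicate, database, table, and server group names are placeholders, and the conflict resolution rule is only an example.

    cdr define replicate --conflict=always --anyUniqueKey repl_customer \
        "db1@g_server_1:informix.customer" "select * from customer" \
        "db1@g_server_2:informix.customer" "select * from customer"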

Unique key for replication

cdr define replicate

cdr define template

Replicate time-series data

You can replicate time-series data with Enterprise Replication. For example, if you collect time-series data in multiple locations, you can consolidate the data to a central server.

Replication of TimeSeries data types
Grid queries for consolidating data from multiple grid servers

You can write a grid query to select data from multiple servers in a grid. Use the new GRID clause in the SELECT statement to specify the servers on which to run the query. After the query is run, the results that are returned from each of the servers are consolidated.
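
For example, the following sketch consolidates results from all the servers in a grid; the grid name, table, and columns are placeholders, and the full GRID clause syntax is in the Grid queries topic.

    SELECT region, SUM(amount)
        FROM sales GRID ALL 'grid_1'
        GROUP BY region;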

Grid queries
Defer the propagation of DDL statements in a grid

You can run DDL statements in a grid context on a local server but defer the propagation of the DDL statements to the other grid servers. After you test the effects of the DDL statement, you can propagate the deferred DDL statements or remove them. You specify whether to defer the propagation of DDL statements in the ifx_grid_connect() procedure, and whether to enable Enterprise Replication for the deferred DDL statements.
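
For example, a grid connection for running DDL statements might look like the following sketch; the grid name and tag are placeholders, and the optional argument that defers propagation and controls Enterprise Replication for the deferred statements is described in the ifx_grid_connect() topic.

    -- Open a grid connection, run the DDL statements, then disconnect
    EXECUTE PROCEDURE ifx_grid_connect('grid_1', 'schema_change_1');
    -- ... DDL statements ...
    EXECUTE PROCEDURE ifx_grid_disconnect();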

ifx_grid_connect() procedure
Replicates are mastered by default

By default, Enterprise Replication replicates are master replicates. If you do not specify a master server with the --master option, the master replicate is based on the first participant. A master replicate uses saved dictionary information about the attributes of replicated columns to verify that participants conform to the specified schema. To create a classic replicate, which does not verify the schemas of participants, include the --classic option in the cdr define replicate command.

cdr define replicate
Simplified setup of a data consolidation system

In a data consolidation system, multiple primary servers that contain different data replicate to one target server. The target server does not replicate any data. You can easily set up a data consolidation replication system by defining a replicate and specifying that the primary servers are participants that send only data. Previously, you would configure this type of data consolidation system by defining a different replicate for each primary server.

Data consolidation

Participant and participant modifier

Enterprise Replication supported among non-root servers

You can replicate data among database servers that have non-root installations and that do not have a user informix account. The servers must have the same owner. Previously, Enterprise Replication required servers to connect as user informix.

Enterprise Replication Server administrator
Easily propagate external files through a grid

You can propagate external files that are in a specific directory to other servers in the grid by running the ifx_grid_copy() procedure. For example, if a grid has 50 servers, you can copy an executable file from one server to the other 49 servers by running one procedure.
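
For example, a sketch of copying a file to the other servers in a grid; the grid name and file path are placeholders, and the directory that the path is relative to is described in the reference topic.

    EXECUTE PROCEDURE ifx_grid_copy('grid_1', 'bin/my_program');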

Propagating external files through a grid
Monitor the status of Enterprise Replication queues

You can check the status of Enterprise Replication queues by using the cdr check queue command. Check the queue status before you run a command that might have a dependency on a previously run command.

cdr check queue
Replicate light-append operations

Unlogged changes to a table, such as when data is added by a light append, can be replicated through Enterprise Replication. For example, you can use the express-load operation of the Informix High-Performance Loader (HPL).

Load and unload data