cdr check replicate

The cdr check replicate command compares the data on replication servers to create a report that lists data inconsistencies and can optionally repair the inconsistent data within a replicate.

Syntax

Read syntax diagramSkip visual syntax diagramcdr check replicateConnect Option 1 --master= data_server2--nomaster --repl= repl_nametarget_server --all --name= task_name --verbose --repair --extratargetrows= deletekeepmerge --timestamp --deletewins --firetrigger= offonfollow --inprogress= recheck_time --background --skipLOB --since= start_time --where= WHERE_Clause --excludeTimeSeries --ignoreHiddenTSElements --checksum= checksum_function
Notes:
Element Purpose Restrictions Syntax
checksum_function Name of the checksum function to use during consistency checking. The function must be installed and registered on all replication servers. Long Identifiers
data_server Name of the database server to use as the reference copy of the data. Must be the name of an existing database server group in SQLHOSTS. Long Identifiers
recheck_time The number of seconds to spend rechecking transactions that might be listed as inconsistent because they are not yet applied on the target server. Must be a positive integer.  
repl_name Name of the replicate to check. Must be an existing replicate. Long Identifiers
start_time The time from which to check updated rows. Can have one the following formats:
  • numberM = Include rows updated in the last specified number of minutes.
  • numberH = Include rows updated in the last specified number of hours.
  • numberD = Include rows updated in the last specified number of days.
  • numberW = Include rows updated in the last specified number of weeks.
  • "YYYY-MM-DD hh:mm:ss" = Include rows updated since this time stamp.
The time stamp format follows the convention of the DBTIME environment variable.
target_server Name of a database server group to check. Must be the name of an existing database server group in the sqlhosts file. Long Identifiers
task_name The name of the progress report task. If you use an existing task name, the information for that task is overwritten.

Maximum name length is 127 bytes.

Long Identifiers
WHERE_Clause Clause that specifies a subset of table rows to be checked.

You cannot include a TimeSeries column in the WHERE clause.

WHERE clause syntax

The following table describes the options for the cdr check replicate command.

Long Form Short Form Meaning
--all -a Specifies that all servers defined for the replicate are checked.
--background -B Specifies that the operation is run in the background as an SQL administration API command.

The command and its result are stored in the command_history table in the sysadmin database, under the name that is specified by the --name= option, or the time stamp for the command if --name= is not specified.

--checksum=   Specifies the name of an existing checksum function to use during consistency checking. By default, the checksum function that is provided with the database server is run.
--deletewins -d Specifies that the replicate uses the delete wins conflict resolution rule.

You cannot use this option for replicates that include TimeSeries columns.

--excludeTimeSeries   Specifies to prevent the checking of time series data.
--extratargetrows= -e Specifies how to handle rows that are found on the target servers that are not present on the server from which the data is being copied (data_server):
  • delete: (default) remove rows and dependent rows, based on referential integrity constraints, from the target servers
  • keep: retain rows on the target servers
  • merge: retain rows on the target servers and replicate them to the data source server. You cannot use this option for replicates that include TimeSeries columns.
Note: When cdr check replicate is used with -extratargetrows (or - e) option for SEND-ONLY replicate, server displays the following warning and continue the operation:
WARNING: Extra row option is not applicable to Send-Only participant. 
'--extratargetrows' option is ignored
Note: When cdr check replicate is used with -extratargetrows (or -e) option for RECV-ONLY participant as a MASTER node, server displays the following error and operation is aborted:
Error: Receive only participant '%s’ can not be a master 
node in data synchronization task
--firetrigger= -T Specifies how to handle triggers at the target servers while data is synchronizing:
  • off: (default) do not fire triggers at target servers during synchronization
  • on: always fire triggers at the target servers even if the replicate definition does not have the --firetrigger option
  • follow: fire triggers at target servers only if the replicate definition has the --firetrigger option
--ignoreHiddenTSElements   Specifies to avoid checking time series elements that are marked as hidden.
--inprogress= -i Specifies to spend more than the default time to recheck inconsistent rows that might be in the process of being applied on target servers. If the --inprogress= option is not set, inconsistent rows are rechecked for up to five seconds.
--master= -m Specifies the database server to use as the reference copy of the data.

You cannot use the --master option with the --timestamp option.

--name= -n Specifies that the progress of this command can be monitored. Information about the operation is stored under the specified progress report task name on the server on which the command was run.
--nomaster -N Specifies that the replicate is configured as a data consolidation system in which the multiple primary servers only send data and the single target server only receives data.
--repair -R Specifies that rows that are found to be inconsistent are repaired.
--repl= -r Specifies the name of the replicate to check.
--since= -S Specifies the time from which to check updated rows. The replicate must be using the time stamp or delete wins conflict resolution rule.

You cannot use this option for replicates that include TimeSeries columns.

--skipLOB -L Specifies that large objects are not checked.
--timestamp -t Specifies to repair inconsistent rows based on the latest time stamp among all the participants. The replicate must use the time stamp or delete wins conflict resolution rule.

You cannot use the --master option with the --timestamp option.

You cannot use this option for replicates that include TimeSeries columns.

--verbose -v Specifies that the consistency report shows specific inconsistent values.
--where= -w Specifies what data to check with a WHERE clause.

You cannot include a TimeSeries column in the WHERE clause.

Usage

Use the cdr check replicate command to check the consistency of data between multiple database servers for a specific replicate. The cdr check replicate command compares all rows on all specified database servers against the data in the reference server and produces a consistency report. If you include the --verbose option, the report lists every inconsistent row value; otherwise, the report summarizes inconsistent rows.

If you run this command as a DBSA instead of as user informix, you must have INSERT, UPDATE, and DELETE permission on the replicated tables on all the replication servers in the domain.

If you want to monitor the progress of the check operation, include the --name option and specify a name for the progress report task. Then, run the cdr stats check command and specify the progress report task name.

Depending on the state of the data in your database when you run the cdr check command, the system might also run an UPDATE STATISTICS command.

If replicated transactions are active when the cdr check replicate command is running, the consistency report might include rows that are temporarily inconsistent until those transactions are applied at the target server. By default, the cdr check replicate command rechecks inconsistent rows for up to five seconds after the initial check is completed. If you find your transaction latency is longer than five seconds, you can extend the recheck time period by using the --inprogress option to specify a longer interval. After the initial recheck, inconsistent transactions are rechecked until there are no inconsistent transactions or the number of seconds specified by the --inprogress option elapses. In general, set the recheck time to a little longer than your average transaction latency because if repairing inconsistencies causes spooling in the send queue, transaction latency might increase during a repair. View your transaction latency with the cdr view apply command.

You can improve the performance of consistency checks by limiting the amount of data that is checked by using one or more of the following options:

  • Check from a specific time with the --since option. If the replicate uses the time stamp or delete wins conflict resolution rule and you regularly check consistency, you can limit the data that is checked to the data that was updated since the last consistency check.
  • Check a subset of the data with the --where option. For example, if you have a corrupted table fragment on a server, you can specify to check only the data in that fragment.
  • Skip the checking of large objects with the --skipLOB option. If you find that your large objects do not change as much as other types of data, then skipping them can make a consistency check quicker.

You can run a consistency check as a background operation as an SQL administration API command if you include the --background option. This option is useful if you want to schedule regular consistency checks with the Scheduler. If you run a consistency check in the background, provide a name for the progress report task by using the --name option so that you can monitor the check with the cdr stats check command. You can also view the command and its results in the command_history table in the sysadmin database. If you use the --background option as a DBSA, you must have CONNECT privilege on the sysadmin database and INSERT privilege on the ph_task table.

If you have large tables, you can speed consistency checking by indexing the ifx_replcheck shadow column.

If your replication system is configured for data consolidation and the primary servers include the S option in their participant definitions, you must include the --nomaster option.

If you include the --repair option, the cdr check replicate command repairs inconsistent rows so that they match the rows on the reference server. The cdr check replicate command uses direct synchronization as a foreground process when repairing inconsistent rows. The cdr check replicate command with the--repair option does the following tasks:

  1. Creates a shadow replicate with the source server and target server as participants. The conflict resolution rule for the shadow replicate is always apply.
  2. Performs an index scan on the replication key index at both the source server and the target server to create a checksum and identify inconsistent rows.
  3. Replicates inconsistent rows from the source server to the target server by doing a placeholder update of the source server, which might result in increased logging activity. Rows are not replicated to participants that include the S option in the participant definition because those participants only send data.
  4. Runs a check to determine whether any rows remain inconsistent. Rows can be temporarily inconsistent if not all transactions are complete on the target server.
  5. If any rows are inconsistent, reruns the check for up to five seconds, or for up to the number of seconds specified by the --inprogress option.
  6. Deletes the shadow replicate.
  7. Displays the consistency report.

To repair replicate sets based on the latest time stamps among the participants instead of based on a master server, use the --repair option with the --timestamp option. If your replicates use the delete wins conflict resolution rule, also include the --deletewins option. A time stamp repair evaluates extra and mismatched rows according to the rules of the time stamp or delete wins conflict resolution rules. The reference server in a time stamp repair is the server with the lowest replication key.

The following table describes the columns of the consistency report.
Table 1. Consistency Report Description
Column name Description
Node The name of the replication server.
Rows The number of rows that are checked in the participant.

If you included the --since or --where options, this number indicates the number of rows that fit the filter conditions. The number of rows that are checked with the --since option might be different on different servers, because of replication latency. Some rows might be checked on a source server to verify target server rows even if those rows on the source server did not originally fit the filter conditions.

Elements For replicates that include TimeSeries columns the Elements column is shown instead of the Rows column. The Elements column shows the number of time series elements that are checked in the participant.
Extra The number of rows on the target server that do not exist on the reference server.

For the reference server, this number is always 0.

Missing The number of rows on the reference server that do not exist on the target server.

For the reference server, this number is always 0.

Mismatch The number of rows on the target server that are not consistent with the corresponding rows on the reference server.

For the reference server, this number is always 0.

Total Mismatch For replicates that include TimeSeries columns the Total Mismatch column is shown instead of the Mismatch column. The Total Mismatch column shows the number of rows on the target server that are not consistent with the corresponding rows on the reference server. If the number in this column is greater than the number in the TmSr Rltd Mismatch column, the additional rows are inconsistent in a way that does not involve a TimeSeries column.

For the reference server, this number is always 0.

TmSr Rltd Mismatch For replicates that include TimeSeries columns the TmSr Rltd Mismatch column shows the number rows on the target server that do not have the same time series properties as the corresponding rows on the reference server because of time series properties. The following time series properties are checked:
  • Whether the TimeSeries column is NULL
  • The origin of the time series
  • The calendar definition
  • Whether the time series is regular or irregular
  • The time series instance ID
  • The time series threshold

For the reference server, this number is always 0.

Other Mismatch For replicates that include TimeSeries columns the Other Mismatch column shows the number of elements that are in mismatched rows on the target server.

For the reference server, this number is always 0.

Processed The number of rows that are processed to correct inconsistent rows.

The number of processed rows on the reference server is equal to the number of mismatched rows plus missing rows on the target servers.

The number of processed rows for a target server is usually equal to the number of extra rows it has. If a row has child rows, then the number of processed rows can be greater than the number of extra rows because the child rows must be deleted as well.

If the --extratargetrows option is set to keep, then extra rows are not deleted from the target and those rows are not added to the Processed column. If the --extratargetrows option is set to merge, then those rows are replicated to the reference server and are listed in the Processed column for the target server.

For a time stamp repair, the time stamp or delete wins conflict resolution rule is used to determine how to process the row.

You can run this command from within an SQL statement by using the SQL administration API.

Return codes

A return code of 0 indicates that the command was successful.

If the command is not successful, one of the following error codes is returned: 1, 5, 17, 18, 31, 37, 48, 53, 54, 61, 75, 99, 101, 121, 172, 174, 178, 193, 194, 195, 200, 203, 204.

For information about these error codes, see Return Codes for the cdr Utility

Example 1: Summary consistency report

The following command generates a consistency report for a replicate named repl1, comparing the data on the server serv2 with the data on the server serv1:
cdr check replicate --master=g_serv1 --repl=repl1 g_serv2
The summary consistency report shows that the servers are consistent:
                 ------   Table scan for repl1 start  --------

------   Statistics for repl1 ------
Node                  Rows     Extra   Missing  Mismatch Processed
---------------- --------- --------- --------- --------- ---------
g_serv1                 52         0         0         0         0
g_serv2                 52         0         0         0         0

              ------   Table scan for repl1 end   ---------

This report indicates that the replicate is consistent on these servers.

Example 2: Summary consistency report with repair

The following command generates a consistency report and repairs inconsistent rows on all servers for a replicate named repl1:
cdr check replicate --master g_serv1 --repl=repl1 --all --repair
The consistency report shows that the target server has extra rows:
                ------   Table scan for repl1 start  --------

------   Statistics for repl1 ------
Node                  Rows     Extra   Missing  Mismatch Processed
---------------- --------- --------- --------- --------- ---------
g_serv1                 67         0         0         0         2
g_serv2                 67         2         2         0         2
g_serv3                 67         0         0         0         0

Validation of repaired rows failed.
WARNING: replicate is not in sync

               ------   Table scan for repl1 end   ---------
This report indicates that g_serv2 has two extra rows and is missing two rows. Two rows were processed on g_serv1 to replicate the missing rows to g_serv2. Also, two rows were processed on g_serv2 to delete the extra rows. Because the --extratargetrows option was not specified, the default behavior of deleting rows on the target servers that are not on the reference server occurred.

In this example, not all repaired rows were validated. Some rows might be still in the process of being applied on the target servers. Using the --inprogress option to extend the time of the validation check after the repair might prevent validation failures.

Example 3: Verbose consistency report with repair

The following command generates a verbose consistency report, creates a progress report task, and repairs inconsistent rows on all servers for a replicate named repl1:

cdr check replicate --master=g_srv1 --replicate=repl1 --all --name=task1 \
 --verbose --repair

The verbose consistency report shows details of the repaired rows:


             ------   Table scan for repl1 start  --------

------   Statistics for repl1 ------
Creating Shadow Repl sync_20104_1310721_1219952381
Node                  Rows     Extra   Missing  Mismatch Processed
---------------- --------- --------- --------- --------- ---------
g_srv1                 424         0         0         0        11
g_srv2                 416         3        11         0         3

The repair operation completed. Validating the repaired rows ...
Validation failed for the following rows:

row missing on <g_srv2>
key:c1:424 
------------------------------------------------------------------ 
row missing on <g_srv2> 
key:c1:425 
------------------------------------------------------------------ 
row missing on <g_srv2> 
key:c1:426 
------------------------------------------------------------------ 
marking completed on g_srv1 status 0

               ------   Table scan for repl1 end   ---------

This report indicates that the first check found three extra rows and 11 missing rows on the server g_srv2. After the repair operation and subsequent recheck, three rows were still missing on g_srv2. The progress report information can be accessed with the cdr stats check task1 command.

Example 4: Repeating verbose consistency report without repair

The following command generates a verbose consistency report for a replicate named repl1, comparing the data on the server serv2 with the data on the server serv1, and rechecks inconsistent rows for up to 20 seconds:
cdr check replicate --master g_serv1 --repl=repl_1 g_serv2 --all \
--verbose --inprogress=20
The verbose consistency report shows details for the inconsistent rows:
               ------   Table scan for repl1 start  --------

------   Statistics for repl1 ------
data mismatch between g_serv1 and g_serv2
item_num:1
order_num:1011
          lname
g_serv1   Pauly
g_serv2   Pauli
--------------------------------------------------
row missing on g_serv2
item_num:1
order_num:1014
--------------------------------------------------
row missing on g_serv2
item_num:2
order_num:1014
--------------------------------------------------
Node                  Rows     Extra   Missing  Mismatch Processed
---------------- --------- --------- --------- --------- ---------
g_serv1                 67         0         0         0         0
g_serv2                 65         0         2         1         0

WARNING: replicate is not in sync

               ------   Table scan for repl1 end   ---------
This report indicates that there is one inconsistent row on g_serv2. The replication key for that row is the combination of the item_num column and the order_num column. The row that is inconsistent is the one that has the item number 1 and the order number 1011. There are two rows that are missing on g_serv2, each identified by its replication key value.

Example 5: Summary consistency report with time filter

The following command generates a summary consistency report for the data that was updated in the last five minutes:

cdr check replicate --master=g_serv1 --repl=repl1 g_serv2 --since=5M

The consistency report shows that the servers are consistent:

            ------   Table scan for repl1 start  --------

------   Statistics for repl1 ------
Node                  Rows     Extra   Missing  Mismatch Processed
---------------- --------- --------- --------- --------- ---------
g_serv1                  2         0         0         0         0
g_serv2                  2         0         0         0         0

            ------   Table scan for repl1 end   ---------

Only two rows were checked on each server (the Rows column) because only two rows were updated in the last five minutes.

Example 6: Consistency check and repair with time filter

The following command generates a summary consistency report for the data that was updated since July 4, 2008 at 12:30:00 local time:

cdr check replicate --master=g_serv1 --repl=repl1 g_serv2 \
--since="2008-07-04 12:30:00"

Example 7: Summary consistency report and repair with data filters

The following command generates a consistency report and repairs the data where the region column equals East:

cdr check replicate --master=g_serv1 --repl=repl1 --repair g_serv2 \
--where="region = 'East'"

Example 8: Repair inconsistencies based on time stamp

The following command repairs inconsistencies based on the most recent time stamps for the repl1 replicate on all replication servers:

cdr check replicate --repl=repl1 --all --repair --timestamp 

The master server is not specified because the --timestamp option is used.

The consistency report shows that the three servers are not consistent:

           ------   Table scan for repl1 start  --------

------   Statistics for repl1 ------
Node                  Rows     Extra   Missing  Mismatch Processed
---------------- --------- --------- --------- --------- ---------
g_serv1                 67         0         0         4         10
g_serv2                 67         0         2         3         0
g_serv3                 67         0         5         0         4

WARNING: replicate is not in sync

          ------   Table scan for repl1 end   ---------

The value in the Extra column is always 0. In this example, seven rows are replicated from the g_serv1 server to fix missing rows. The g_serv1 server also replicated three rows to fix mismatched rows on the g_serv2 server. The g_serv3 server replicated four rows to resolve mismatched rows on the g_serv1 server.

Example 9: Check a replicate that includes a TimeSeries column

The following command checks the replicate named repl2, which includes a TimeSeries column:

cdr check replicate --repl=repl2 --master=g_3 --all

The following consistency report shows that the source server, g_3, has more elements than the target server, g_4:

           ------   Table scan for repl2 start  --------

                                                Total    TmSr Rltd
Node                  Rows     Extra   Missing  Mismatch  Mismatch Processed
---------------- --------- --------- --------- --------- --------- ---------
g_3                      1         0         0         0         0         0
g_4                      1         0         0         0         0         0

TimeSeries Column: raw_reads
                                                             Other 
Node              Elements     Extra   Missing  Mismatch  Mismatch Processed
---------------- --------- --------- --------- --------- --------- ---------
g_3                      2         0         0         0         0         0
g_4                      1         0         1         0         0         0

WARNING: replicate is not in sync
              ------   Table scan for repl2 end   ---------