The first step in recovering from a Cassandra data center failure is to process the stuck
files in the surviving data center.
Procedure
To process stuck files in the surviving data center, complete the following
steps:
-
If you are using adapter containers in your setup, stop the Sterling B2B Integrator adapter containers in
the failed data center using stopContainer.sh.
-
Redeliver the files that failed delivery within the surviving data center.
- Use Advanced Search in Sterling File Gateway to find all files where
Status=Failed delivery.
See "Redelivering processed files" in the Sterling File Gateway
IBM Knowledge Center for more information.
- Manually start the FileGatewayReroute business process so that file
reprocessing occurs immediately rather than at the next scheduled interval.
- Wait until all redelivery is complete before you proceed.
-
Replay the files that failed routing within the surviving data center.
- Use Advanced Search in Sterling File Gateway to find all files where
Status=Failed routing.
See "Replaying files" in the Sterling File Gateway IBM Knowledge Center for
more information.
- Manually start the FileGatewayReroute business process so that file
reprocessing occurs immediately rather than at the next scheduled interval.
- Wait until all routing is complete before you proceed.
-
Use the eventUtility script to resend events for incomplete messages.
- Run the eventUtility script to list the files in the failed data center
that did not finish processing:
./eventUtility.sh listEvents --appName=B2Bi --processStatus="PROCESSING" --sourceDC=<failed DC> --targetDC=<Surviving DC>
- Run the eventUtility script to resend events for files that were in the
middle of processing in the failed data center:
./eventUtility.sh resendEvents --appName=B2Bi --processStatus="PROCESSING" --sourceDC=<failed DC> --targetDC=<Surviving DC> --adminUser=<userid> --adminPassword=<password>
- Run the eventUtility script to resend events for unprocessed files in the
failed data center:
./eventUtility.sh resendEvents --appName=B2Bi --processStatus="UNPROCESSED" --sourceDC=<failed DC> --targetDC=<Surviving DC> --adminUser=<userid> --adminPassword=<password>
- Run the eventUtility script to evaluate rules for messages that do not have
an event associated with them:
./eventUtility.sh evaluateRules --appName=B2Bi --outputFile=<file name> --adminUser=admin --adminPassword=<password>
Tip: For more information about the eventUtility script and a list of
additional parameters, see the eventUtility script topic.
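The four eventUtility commands in the last step can be chained in a small wrapper so that they run in order and stop at the first failure. The following is a minimal sketch, not a supplied tool. The eventUtility.sh subcommands and flags are the ones shown in this procedure; the variable names, the data center names DC1 and DC2, the output file name, and the dry-run switch are assumptions made for illustration. By default the wrapper only prints the commands, with the password masked, so that they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: chains the four eventUtility commands from the procedure.
# Everything except the eventUtility.sh subcommands and flags is an
# assumption for illustration (variable names, defaults, dry-run mode).
set -eu

EVENT_UTILITY="${EVENT_UTILITY:-./eventUtility.sh}"
FAILED_DC="${FAILED_DC:-DC1}"                    # hypothetical data center name
SURVIVING_DC="${SURVIVING_DC:-DC2}"              # hypothetical data center name
ADMIN_USER="${ADMIN_USER:-admin}"
ADMIN_PASSWORD="${ADMIN_PASSWORD:-changeme}"     # placeholder, supply your own
OUTPUT_FILE="${OUTPUT_FILE:-evaluate_rules.out}" # hypothetical file name
DRY_RUN="${DRY_RUN:-1}"                          # 1 = print commands, 0 = run them

# Print the command (password masked) in dry-run mode, otherwise execute it.
run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    line=""
    for arg in "$@"; do
      case "$arg" in
        --adminPassword=*) arg="--adminPassword=****" ;;
      esac
      line="${line:+$line }$arg"
    done
    printf '%s\n' "$line"
  else
    "$@"
  fi
}

# Run the steps in the documented order; set -e aborts on the first failure.
recover_events() {
  run_step "$EVENT_UTILITY" listEvents --appName=B2Bi \
    --processStatus=PROCESSING \
    "--sourceDC=$FAILED_DC" "--targetDC=$SURVIVING_DC"

  run_step "$EVENT_UTILITY" resendEvents --appName=B2Bi \
    --processStatus=PROCESSING \
    "--sourceDC=$FAILED_DC" "--targetDC=$SURVIVING_DC" \
    "--adminUser=$ADMIN_USER" "--adminPassword=$ADMIN_PASSWORD"

  run_step "$EVENT_UTILITY" resendEvents --appName=B2Bi \
    --processStatus=UNPROCESSED \
    "--sourceDC=$FAILED_DC" "--targetDC=$SURVIVING_DC" \
    "--adminUser=$ADMIN_USER" "--adminPassword=$ADMIN_PASSWORD"

  run_step "$EVENT_UTILITY" evaluateRules --appName=B2Bi \
    "--outputFile=$OUTPUT_FILE" \
    "--adminUser=$ADMIN_USER" "--adminPassword=$ADMIN_PASSWORD"
}

recover_events
```

To execute the commands instead of printing them, set DRY_RUN=0 and supply real values for the data center names and credentials through the environment. Passing the password through an environment variable keeps it out of the script file, but it still appears on the eventUtility.sh command line, as it does in the documented commands.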
What to do next
Resolve the issue that caused the Cassandra node failure, and then complete the next task, Restoring a failed data center, to return the
failed data center to its original configuration.