Failover and restore operations at the remote site during an unplanned outage
Use this process to run failover and restore operations at your remote (C) site during an unplanned outage, using E volumes at the intermediate site.
Before you begin
If possible, before you issue a failover operation to the remote site, ensure that data processing has completely stopped at the local and intermediate sites. If I/O processing to the local site is not quiesced before you recover at the remote site, data can be lost.
About this task
Complete the following steps after a failure has been detected at the local site. (The steps in this scenario are examples.)
Procedure
- If the local site was not completely destroyed, it is essential that data from any surviving A and B volume pairs be copied and that a consistent copy be achieved at the remote site. If you are able to freeze write activity to the Metro Mirror primary volumes, complete the following steps:
- Freeze updates to the A volumes in Metro Mirror relationships across the affected LSSs. This process ensures that the B volumes are consistent at the time of the freeze. (One command per LSS is required.) Enter the freezepprc command at the dscli command prompt with the following parameters and variables:
dscli> freezepprc -dev IBM.2107-130165X -remotedev IBM.2107-75ALA2P 07:12
The following represents an example of the output:
CMUC00161W freezepprc: Remote Mirror and Copy consistency group 07:12 successfully created.
As a result of the freeze action, the following processing occurs:
- The established paths between the LSS pairs are deleted.
- The volume pairs that are associated with the source and target LSSs are suspended. During this time, the storage unit collects data that is sent to the A volumes that are in Metro Mirror relationships.
- I/O processing to the Metro Mirror volume pairs is temporarily queued during the time that updates are frozen.
- Resume operations following a freeze. Issue the unfreezepprc command to allow I/O activity to resume for the specified volume pairs.
Note: This activity is sometimes referred to as a thaw operation.
Enter the unfreezepprc command at the dscli command prompt with the following parameters and variables:
dscli> unfreezepprc -dev IBM.2107-130165X -remotedev IBM.2107-75ALA2P 07:12
The following represents an example of the output:
CMUC00198I unfreezepprc: Remote Mirror and Copy pair 07:12 successfully thawed.
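Because one freeze command is required per LSS, and each freeze is followed by a matching thaw, it can help to generate the command lines for every affected LSS pair up front. The following Python sketch only builds command strings in the form shown above; it does not run the DS CLI. The function name build_freeze_commands and the second LSS pair (08:13) are hypothetical and are used only for illustration.

```python
def build_freeze_commands(dev, remotedev, lss_pairs):
    """Return a list of (freeze, unfreeze) command strings, one tuple per LSS pair.

    Each LSS pair must look like '07:12' (source LSS:target LSS, two hex digits each).
    """
    commands = []
    for pair in lss_pairs:
        src, sep, tgt = pair.partition(":")
        if not sep or len(src) != 2 or len(tgt) != 2:
            raise ValueError(f"expected an LSS pair such as '07:12', got {pair!r}")
        # int(..., 16) raises ValueError when an LSS ID is not hexadecimal.
        int(src, 16)
        int(tgt, 16)
        args = f"-dev {dev} -remotedev {remotedev} {pair}"
        commands.append((f"freezepprc {args}", f"unfreezepprc {args}"))
    return commands


if __name__ == "__main__":
    # '08:13' is a made-up second LSS pair, added only to show the loop.
    for freeze, thaw in build_freeze_commands(
        "IBM.2107-130165X", "IBM.2107-75ALA2P", ["07:12", "08:13"]
    ):
        print("dscli>", freeze)
        print("dscli>", thaw)
```

Issue all of the freeze commands before any of the thaw commands if the B volumes must be consistent across several LSSs; the sketch pairs them only so that no LSS is left frozen by mistake.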
- Verify that the last data from the local site has been included in a Global Mirror consistency group. Monitor this activity by querying the B and C volumes to determine when at least two successful consistency groups have formed. The "Total Successful CG Count" field from the query output displays this information.
Note: When you use the showgmir command with the -metrics parameter, you can monitor the progress of the consistency group formation. While Global Mirror is running, the number of consistency groups grows steadily each time you issue the showgmir command.
Enter the showgmir -metrics command at the dscli command prompt with the following parameters and variables:
dscli> showgmir -metrics 10
The following represents an example of the output:
ID                                  IBM.2107-75ALA2P/10
Total Failed CG Count               23
Total Successful CG Count           139
Successful CG Percentage            85
Failed CG after Last Success        0
Last Successful CG Form Time        02/20/2006 11:33:56 MST
Coord. Time (milliseconds)          50
CG Interval Time (seconds)          0
Max CG Drain Time (seconds)         30
First Failure Control Unit          IBM.2107-75ALA2P
First Failure LSS                   0x12
First Failure Status                Error
First Failure Reason                Session or Session Members not in Correct State
First Failure Master State          Global Mirror Run in Progress
Last Failure Control Unit           IBM.2107-75ALA2P
Last Failure LSS                    Not Available
Last Failure Status                 Error
Last Failure Reason                 Max Drain Time Exceeded
Last Failure Master State           Drain in Progress
Previous Failure Control Unit       IBM.2107-75ALA2P
Previous Failure LSS                Not Available
Previous Failure Status             Error
Previous Failure Reason             Max Drain Time Exceeded
Previous Failure Master State       Drain in Progress
- Stop the Global Mirror session that includes the B and C volume pairs. Enter the rmgmir command at the dscli command prompt with the following parameters and variables:
dscli> rmgmir -quiet -lss 10 -session 31
The following represents an example of the output:
CMUC00165I rmgmir: Global Mirror for session 31 successfully stopped.
See Ending Global Mirror processing (script mode) or Ending Global Mirror processing (no script) for more information.
- Verify that the Global Mirror session has ended. Consistency groups do not form when Global Mirror processing is stopped.
See Querying Global Mirror processing for more information.
Enter the showgmir command at the dscli command prompt with the following parameters and variables:
dscli> showgmir 10
The following represents an example of the output:
ID                                  IBM.2107-75ALA2P/10
Master Count                        -
Master Session ID                   -
Copy State                          -
Fatal Reason                        -
CG Interval (seconds)               -
Coord. Time (milliseconds)          -
CG Drain Time (seconds)             -
Current Time                        -
CG Time                             -
Successful CG Percentage            -
FlashCopy Sequence Number           -
Master ID                           -
Subordinate Count                   -
Master/Subordinate Assoc.           -
- Delete the Global Copy relationships between the B and C volume pairs at the intermediate and remote sites. When the relationships between the B and C volumes are deleted, the cascade parameter is disabled for the B volumes and the B volumes are no longer detected as being in cascaded relationships. Enter the rmpprc command at the dscli command prompt with the following parameters and variables:
dscli> rmpprc -quiet -dev IBM.2107-75ALA2P -remotedev IBM.2107-1831760 1200-125f:0700-075f
The following represents an example of the output:
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 1200:0700 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 1201:0701 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 1202:0702 relationship successfully withdrawn.
See Deleting a Metro Mirror relationship for more information.
- Issue a failover command to the B volumes with the Cascade option. With this process, updates are collected using the change recording feature, which allows the later resynchronization of the B to A volumes. Enter the failoverpprc command at the dscli command prompt with the following parameters and variables:
dscli> failoverpprc -dev IBM.2107-75ALA2P -remotedev IBM.2107-130165X -type gcp -cascade 1200-125f:1a00-1a5f
The following represents an example of the output:
CMUC00196I failoverpprc: Remote Mirror and Copy pair 1200:1A00 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 1201:1A01 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 1202:1A02 successfully reversed.
See Running a recovery failover operation for more information.
- Create Global Copy relationships using the C and B volume pairs. Specify the NOCOPY option.
Note: You can specify the NOCOPY option with the following commands because the B and C volume pairs contain exact copies of data.
Enter the mkpprc command at the dscli command prompt with the following parameters and variables:
dscli> mkpprc -dev IBM.2107-1831760 -remotedev IBM.2107-75ALA2P -type gcp -mode nocp 0700-075f:1200-125f
The following represents an example of the output:
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 0700:1200 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 0701:1201 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 0702:1202 successfully created.
See Creating a Global Copy relationship for more information.
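The volume arguments in these commands are written as paired ranges (for example, 0700-075f:1200-125f), while the output reports one message per individual pair. As an aid for checking which pairs a range covers, the following Python sketch expands a paired range into individual source:target pairs. It assumes that pairs are matched in order, as the example output suggests (0700:1200, 0701:1201, and so on), and that both ranges must have the same length. The function name expand_volume_pairs is hypothetical.

```python
def expand_volume_pairs(spec):
    """Expand 'SSSS-EEEE:SSSS-EEEE' into a list of 'SRC:TGT' volume pairs.

    Volume IDs are four hexadecimal digits. A single ID without a dash is
    treated as a range of one volume.
    """
    src_range, sep, tgt_range = spec.partition(":")
    if not sep:
        raise ValueError(f"expected 'source:target', got {spec!r}")

    def expand(volume_range):
        start, dash, end = volume_range.partition("-")
        first = int(start, 16)
        last = int(end, 16) if dash else first
        if last < first:
            raise ValueError(f"range {volume_range!r} ends before it starts")
        return [f"{volume:04X}" for volume in range(first, last + 1)]

    sources = expand(src_range)
    targets = expand(tgt_range)
    if len(sources) != len(targets):
        raise ValueError("source and target ranges differ in length")
    return [f"{s}:{t}" for s, t in zip(sources, targets)]


if __name__ == "__main__":
    pairs = expand_volume_pairs("0700-075f:1200-125f")
    print(len(pairs), "pairs, from", pairs[0], "to", pairs[-1])
```

The expanded list can be compared against the pairs named in the command output to confirm that every relationship in the range was created.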
- Use FlashCopy to create a copy of the B source volumes on the E target volumes. Specify the following options: Persistent and Start Change Recording. This creates a backup copy of the consistency group. Enter the mkflash command at the dscli command prompt with the following parameters and variables:
dscli> mkflash -dev IBM.2107-75ALA2P -tgtinhibit -record -persist -nocp 1200-125f:1900-195f
The following represents an example of the output:
CMUC00137I mkflash: FlashCopy pair 1200:1900 successfully created.
CMUC00137I mkflash: FlashCopy pair 1201:1901 successfully created.
CMUC00137I mkflash: FlashCopy pair 1202:1902 successfully created.
See Creating FlashCopy relationships (Global Mirror setup) for more information.
- Create a Global Mirror session using the C volumes. Enter the mksession command at the dscli command prompt with the following parameters and variables:
dscli> mksession -lss 07 1
The following represents an example of the output:
CMUC00145I mksession: Session 1 opened successfully.
See Creating the Global Mirror session for more information.
- Start the Global Mirror session that includes the C, B, and E volumes. Enter the mkgmir command at the dscli command prompt with the following parameters and variables:
dscli> mkgmir -lss 07 -session 1
The following represents an example of the output:
CMUC00162I mkgmir: Global Mirror for session 1 successfully started.
See Starting Global Mirror processing for more information.
- Verify that the Global Mirror session has started. Enter the showgmir command at the dscli command prompt with the following parameters and variables:
dscli> showgmir 07
The following represents an example of the output:
ID                                  IBM.2107-75ALA2P/07
Master Count                        1
Master Session ID                   0x01
Copy State                          Running
Fatal Reason                        Not Fatal
CG Interval (seconds)               0
Coord. Time (milliseconds)          50
CG Drain Time (seconds)             30
Current Time                        02/20/2006 11:37:40 MST
CG Time                             02/20/2006 11:37:40 MST
Successful CG Percentage            100
FlashCopy Sequence Number           0x4357E3F4
Master ID                           IBM.2107-75ALA2P/07
Subordinate Count                   0
Master/Subordinate Assoc.           -
- Allow the I/O to run and monitor the formation of the consistency groups. Enter the showgmir command at the dscli command prompt with the following parameters and variables:
dscli> showgmir 07
The following represents an example of the output:
ID                                  IBM.2107-75ALA2P/07
Master Count                        1
Master Session ID                   0x01
Copy State                          Running
Fatal Reason                        Not Fatal
CG Interval (seconds)               0
Coord. Time (milliseconds)          50
CG Drain Time (seconds)             30
Current Time                        02/20/2006 11:37:40 MST
CG Time                             02/20/2006 11:37:40 MST
Successful CG Percentage            100
FlashCopy Sequence Number           0x4357E3F4
Master ID                           IBM.2107-75ALA2P/07
Subordinate Count                   0
Master/Subordinate Assoc.           -
- When the local site is ready to return, issue a failback command to the B and A volumes. Before the applications are started at the local site, data at the local site has to be copied from the intermediate site. Issue the failbackpprc command to start copying data from the B volumes at the intermediate site to the A volumes at the local site while hosts are running on the B volumes. When all data is copied, the A volumes are synchronized with the B volumes.
Note: In a DS CLI environment, where the local and intermediate sites use different management consoles, you have to use a different DS CLI session for the management console of the B volumes at the intermediate site.
Enter the failbackpprc command at the dscli command prompt with the following parameters and variables:
dscli> failbackpprc -dev IBM.2107-130165X -remotedev IBM.2107-75ALA2P -type gcp -cascade 1200-125f:1a00-1a5f
The following represents an example of the output:
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1200:1A00 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1201:1A01 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1202:1A02 successfully failed back.
See Running a recovery failback operation for more information.
- Wait for the copy operation of the B and A volumes to reach full duplex status (all out-of-sync tracks have completed copying). You can monitor this activity by querying the status of the B and A volume pairs. Enter the lspprc command at the dscli command prompt with the following parameters and variables:
dscli> lspprc -dev IBM.2107-75ALA2P -remotedev IBM.2107-130165X -l -fmt default 1200-125f
The following represents an example of the output:
ID         State         Reason  Type         Out Of Sync Tracks  Tgt Read  Src Cascade
1200:1a00  Copy Pending  -       Global Copy  0                   Disabled  Enabled
1201:1a01  Copy Pending  -       Global Copy  0                   Disabled  Enabled
1202:1a02  Copy Pending  -       Global Copy  0                   Disabled  Enabled

Tgt Cascade  Date Suspended  Source LSS  Timeout (secs)  Crit Mode  First Pass Status  Incremental Resync  Tgt Write
Invalid      -               12          Unknown         Disabled   True               Enabled             Enabled
Invalid      -               12          Unknown         Disabled   True               Enabled             Enabled
Invalid      -               12          Unknown         Disabled   True               Enabled             Enabled
- End I/O processing to the C volumes. Enter the rmgmir command at the dscli command prompt with the following parameters and variables:
dscli> rmgmir -quiet -lss 07 -session 1
The following represents an example of the output:
CMUC00165I rmgmir: Global Mirror for session 1 successfully stopped.
See Ending Global Mirror processing (script mode) or Ending Global Mirror processing (no script) for more information.
- Verify that at least two consistency groups have formed. Assuming that the consistency groups formed successfully, the A, B, C, and E volumes contain consistent data. (Data at the remote site is consistent to the last successful consistency group formed by the master storage unit.)
See Querying Global Mirror processing for more information.
Enter the showgmir -metrics command at the dscli command prompt with the following parameters and variables:
dscli> showgmir -metrics 07
The following represents an example of the output:
ID                                  IBM.2107-75ALA2P/10
Total Failed CG Count               0
Total Successful CG Count           55
Successful CG Percentage            100
Failed CG after Last Success        0
Last Successful CG Form Time        02/20/2006 11:38:25 MST
Coord. Time (milliseconds)          50
CG Interval Time (seconds)          0
Max CG Drain Time (seconds)         30
First Failure Control Unit          -
First Failure LSS                   -
First Failure Status                No Error
First Failure Reason                -
First Failure Master State          -
Last Failure Control Unit           -
Last Failure LSS                    -
Last Failure Status                 No Error
Last Failure Reason                 -
Last Failure Master State           -
Previous Failure Control Unit       -
Previous Failure LSS                -
Previous Failure Status             No Error
Previous Failure Reason             -
Previous Failure Master State       -
- End the Global Mirror session between the C, B, and E volumes. Enter the rmgmir command at the dscli command prompt with the following parameters and variables:
dscli> rmgmir -quiet -lss 07 -session 1
The resulting output is displayed:
CMUC00165I rmgmir: Global Mirror for session 1 successfully stopped.
See Ending Global Mirror processing (script mode) or Ending Global Mirror processing (no script) for more information.
- Verify that the Global Mirror session that includes the C, B, and E volumes has stopped. Enter the showgmir command at the dscli command prompt with the following parameters and variables:
dscli> showgmir 07
The following represents an example of the output:
ID                                  IBM.2107-75ALA2P/10
Total Failed CG Count               23
Total Successful CG Count           139
Successful CG Percentage            85
Failed CG after Last Success        0
Last Successful CG Form Time        02/20/2006 11:33:56 MST
Coord. Time (milliseconds)          50
CG Interval Time (seconds)          0
Max CG Drain Time (seconds)         30
First Failure Control Unit          IBM.2107-75ALA2P
First Failure LSS                   0x12
First Failure Status                Error
First Failure Reason                Session or Session Members not in Correct State
First Failure Master State          Global Mirror Run in Progress
Last Failure Control Unit           IBM.2107-75ALA2P
Last Failure LSS                    Not Available
Last Failure Status                 Error
Last Failure Reason                 Max Drain Time Exceeded
Last Failure Master State           Drain in Progress
Previous Failure Control Unit       IBM.2107-75ALA2P
Previous Failure LSS                Not Available
Previous Failure Status             Error
Previous Failure Reason             Max Drain Time Exceeded
Previous Failure Master State       Drain in Progress
See Querying Global Mirror processing for more information.
- At the remote site, remove the C volumes (or Global Copy secondary volumes) from the Global Mirror session that includes the C, B, and E volumes. Enter the chsession command at the dscli command prompt with the following parameters and variables:
dscli> chsession -dev IBM.2107-75ALA2P -action remove -volume 1200-125f -lss 07 1
The resulting output is displayed:
CMUC00147I chsession: Session 1 successfully modified.
See Removing volumes from a session (Global Mirror) for more information.
- Delete the Global Copy relationships between the C and B volume pairs at the intermediate and remote sites. Deleting these C to B relationships prepares for restoring the original Global Copy relationships between the B to C volume pairs. The cascaded relationship ends as well. Enter the rmpprc command at the dscli command prompt with the following parameters and variables:
dscli> rmpprc -quiet -dev IBM.2107-1831760 -remotedev IBM.2107-75ALA2P 0700-075f:1200-125f
The following represents an example of the output:
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 0700:1200 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 0701:1201 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 0702:1202 relationship successfully withdrawn.
See Deleting a Metro Mirror relationship for more information.
- Issue a failover command to the A to B volumes. This process ends the Metro Mirror relationships between the B and A volumes and establishes the Metro Mirror relationships between the A and B volume pairs. Enter the failoverpprc command at the dscli command prompt with the following parameters and variables:
dscli> failoverpprc -dev IBM.2107-130165X -remotedev IBM.2107-75ALA2P -type mmir 1a00-1a5f:1200-125f
The following represents an example of the output:
CMUC00196I failoverpprc: Remote Mirror and Copy pair 1A00:1200 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 1A01:1201 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 1A02:1202 successfully reversed.
- Reestablish paths that were disabled by the freeze operation between the local site LSS and intermediate site LSS that contain the B to A Metro Mirror volume pairs. Enter the mkpprcpath command at the dscli command prompt with the following parameters and variables:
dscli> mkpprcpath -dev IBM.2107-75ALA2P -remotedev IBM.2107-130165X -remotewwnn 5005076303FFC550 -srclss 61 -tgtlss 63 -consistgrp I0102:I0031 I0002:I0102
The resulting output is displayed:
CMUC00149I mkpprcpath: Remote Mirror and Copy path 61:63 successfully established.
See Reestablishing remote mirror and copy paths (site A to site B) for more information.
- Issue a failback command to the A and B volumes. This command copies the changes back to the A volumes that were made to the B volumes in Metro Mirror relationships while hosts were running on the B volumes. The A volumes are now synchronized with the B volumes. (In a DS CLI environment, where the local and intermediate sites use different management consoles, you have to use a different DS CLI session for the management console of the B volumes at the intermediate site.) Enter the failbackpprc command at the dscli command prompt with the following parameters and variables:
dscli> failbackpprc -dev IBM.2107-130165X -remotedev IBM.2107-75ALA2P -type mmir 1a00-1a5f:1200-125f
The following represents an example of the output:
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1A00:1200 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1A01:1201 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1A02:1202 successfully failed back.
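Commands that act on a range of volume pairs report one message per pair, so a pair that failed is easy to overlook in a long listing. The following Python sketch scans captured output for a given message ID and lists any expected pairs that did not report it. It assumes the message layout shown in the examples of this topic (a message ID, the command name, a colon, and text that contains the pair); the names pairs_reported and missing_pairs are hypothetical, and the treatment of the trailing letter (I, W, or E) as part of the message ID is an assumption.

```python
import re

# Message ID (for example CMUC00197I), command name, then the message text.
MESSAGE = re.compile(r"^(CMU[A-Z]\d{5}[IWE])\s+(\w+):\s+(.*)$")
# A volume pair such as 1A00:1200 (four hex digits on each side).
PAIR = re.compile(r"\b([0-9A-Fa-f]{4}:[0-9A-Fa-f]{4})\b")


def pairs_reported(output, message_id):
    """Return the volume pairs named in lines that carry message_id."""
    pairs = []
    for line in output.splitlines():
        message = MESSAGE.match(line.strip())
        if message and message.group(1) == message_id:
            pair = PAIR.search(message.group(3))
            if pair:
                pairs.append(pair.group(1).upper())
    return pairs


def missing_pairs(output, message_id, expected):
    """Return the expected pairs that did not report message_id."""
    seen = set(pairs_reported(output, message_id))
    return [pair for pair in expected if pair.upper() not in seen]


if __name__ == "__main__":
    sample = (
        "CMUC00197I failbackpprc: Remote Mirror and Copy pair 1A00:1200 successfully failed back.\n"
        "CMUC00197I failbackpprc: Remote Mirror and Copy pair 1A01:1201 successfully failed back.\n"
    )
    print(missing_pairs(sample, "CMUC00197I", ["1A00:1200", "1A01:1201", "1A02:1202"]))
```

An empty result means that every expected pair reported the message; any pair that is listed needs to be investigated before you continue to the next step.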
- Establish the B and C volume pairs in Global Copy relationships. Enter the mkpprc command at the dscli command prompt with the following parameters and variables:
dscli> mkpprc -dev IBM.2107-75ALA2P -remotedev IBM.2107-1831760 -type gcp -mode nocp -cascade 1200-125f:0700-075f
The following represents an example of the output:
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1200:0700 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1201:0701 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1202:0702 successfully created.
See Creating a Global Copy relationship for more information.
- Optionally, you can issue a FlashCopy operation to create a backup copy of all the C, B, and E volumes from which the last consistency group was created. If you need to preserve data from the set of volumes (or consistency group) that was created using the E volumes, allow the background copy from the FlashCopy process to complete before you continue to the next step, which describes removing the FlashCopy relationship between the B to E volume pairs. Enter the mkflash command at the dscli command prompt with the following parameters and variables:
dscli> mkflash -dev IBM.2107-75ALA2P -tgtinhibit -record -persist -nocp 1200-125f:1900-195f
The following represents an example of the output:
CMUC00137I mkflash: FlashCopy pair 1200:1900 successfully created.
CMUC00137I mkflash: FlashCopy pair 1201:1901 successfully created.
See Creating FlashCopy relationships (Global Mirror setup) for more information.
- Delete the FlashCopy relationship between the B and E volume pairs to end the relationship at the intermediate site. Enter the rmflash command at the dscli command prompt with the following parameters and variables:
dscli> rmflash -dev IBM.2107-75ALA2P -quiet 1200-125f:1900-195f
The following represents an example of the output:
CMUC00140I rmflash: FlashCopy pair 1200:1900 successfully removed.
CMUC00140I rmflash: FlashCopy pair 1201:1901 successfully removed.
See Removing FlashCopy relationships for more information.
- Resume Global Mirror at the intermediate site. This starts Global Mirror processing for the B, C, and D volumes. Enter the resumegmir command at the dscli command prompt with the following parameters and variables:
dscli> resumegmir -dev IBM.2107-75ALA2P -session 10 -lss 31
The resulting output is displayed:
CMUC00164I resumegmir: Global Mirror for session 10 successfully resumed.
See Resuming Global Mirror processing for more information.
- Resume I/O on A volumes.
- Verify that consistency groups are forming successfully. Enter the showgmir -metrics command at the dscli command prompt with the following parameters and variables:
dscli> showgmir -metrics 10
The following represents an example of the output:
ID                                  IBM.2107-75ALA2P/10
Total Failed CG Count               1
Total Successful CG Count           39
Successful CG Percentage            97
Failed CG after Last Success        0
Last Successful CG Form Time        02/20/2006 11:33:56 MST
Coord. Time (milliseconds)          50
CG Interval Time (seconds)          0
Max CG Drain Time (seconds)         30
First Failure Control Unit          IBM.2107-75ALA2P
First Failure LSS                   0x12
First Failure Status                Error
First Failure Reason                Session or Session Members not in Correct State
First Failure Master State          Global Mirror Run in Progress
Last Failure Control Unit           IBM.2107-75ALA2P
Last Failure LSS                    0x12
Last Failure Status                 Error
Last Failure Reason                 Session or Session Members not in Correct State
Last Failure Master State           Global Mirror Run in Progress
Previous Failure Control Unit       -
Previous Failure LSS                -
Previous Failure Status             No Error
Previous Failure Reason             -
Previous Failure Master State       Drain in Progress
See Querying Global Mirror processing for more information.
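Several steps in this procedure ask you to confirm that at least two successful consistency groups have formed. One way to make that check explicit is to record the "Total Successful CG Count" value from successive showgmir -metrics queries and compare the first and last readings. The following Python sketch does only that arithmetic; it assumes that you supply the counts yourself (it does not call the DS CLI), and the name consistency_groups_formed is hypothetical.

```python
def consistency_groups_formed(samples, minimum=2):
    """Return True when the successful CG count grew by at least `minimum`.

    `samples` holds "Total Successful CG Count" readings in the order taken.
    A reading lower than the one before it suggests that the counter was
    reset (for example, the session was restarted), so it is rejected.
    """
    if len(samples) < 2:
        return False
    for earlier, later in zip(samples, samples[1:]):
        if later < earlier:
            raise ValueError("count decreased between samples; start a new series")
    return samples[-1] - samples[0] >= minimum


if __name__ == "__main__":
    # Example readings taken a few seconds apart (illustrative values only).
    print(consistency_groups_formed([139, 140, 141]))
    print(consistency_groups_formed([139, 139]))
```

If the check does not succeed after a reasonable interval, review the failure fields in the showgmir -metrics output before you continue.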