The auto recovery operation can degrade I/O performance across the cluster. To avoid
this problem, you can stop auto recovery manually and restart it later when the cluster is less
busy. The disks that are not functioning must still be recovered to protect your data.
Run the mmlsdisk -e command to see the disks that do not have the Up
availability and the Ready status. If all the disks in the file system are functioning correctly,
the system displays the following message:
6027-623 All disks up and ready.
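The check above can be scripted. The following is a minimal sketch, not part of the product: the helper name all_disks_ready is hypothetical, and it assumes only that the mmlsdisk -e output contains message 6027-623 when every disk is up and ready, as described above. The device name fs1 in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: reads mmlsdisk -e output on standard input and
# succeeds (exit status 0) only if it contains message 6027-623.
all_disks_ready() {
    grep -q '6027-623'
}

# Usage sketch (fs1 is a placeholder device name, not taken from this document;
# 2>&1 is used because the output stream of the message is not assumed):
#   if mmlsdisk fs1 -e 2>&1 | all_disks_ready; then
#       echo "All disks are up and ready; nothing to recover."
#   else
#       echo "Some disks are not up and ready."
#   fi
```

A non-zero exit status means that the message was not found, so the full mmlsdisk -e output should be reviewed to see which disks need recovery.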
- To stop the auto recovery process, stop the tschdisk and tsrestripefs processes
on the file system manager node. Log in to the IBM Storage Scale file system manager node.
Retrieve the process IDs of the tschdisk and tsrestripefs commands
with the ps -elf | grep -e tschdisk -e tsrestripefs command.
Alternatively, check the IBM Storage Scale log
(/var/adm/ras/mmfs.log.latest) on the file system manager node to see whether a
tschdisk command is still running. When the restripefs command
is invoked by auto recovery and is still running, its log messages are redirected to
/var/adm/ras/restripefsOnDiskFailure.log.<timestamp> (IBM Storage Scale 4.1 and
IBM Storage Scale 4.1.1) or
/var/adm/ras/autorecovery.log.<timestamp> (IBM Storage Scale 4.1.1 PTF1 and later).
- Take the following steps to stop the tschdisk and
tsrestripefs command processes:
- Make a list of the file system manager nodes in the cluster. The list must include the file
system manager node of each file system that is affected. To list the file system manager nodes, go
to a file system manager node and issue the following command:
mmlsmgr
This command is in the directory /usr/lpp/mmfs/bin.
- Do the following actions for the tschdisk and tsrestripefs
processes on each of the file system manager nodes in your list:
- If you are not connected to a file system manager node, connect to it with ssh.
- Issue the following command to list the back-end processes that are running and their command
IDs:
mmfsadm command list all
In the following example, the tsrestripefs process is running in the back end (line 6)
and its command ID is #92 (line 5):
# mmfsadm command list all
CrHashTable 0x7F7E64001A08 n 4
cmd sock 75 cookie 3489916426 owner 12912 id 0x2D7ADC0785000064(#100) uses 1 type 14 start 1531294737.470181
flags 0x106 SG none line 'command list all'
cmd sock 70 cookie 2102087586 owner 4450 id 0x2D7ADC078500005C(#92) uses 1 type 13 start 1531294660.218091
flags 0x117 SG fpofs line 'tsrestripefs /dev/fpofs -r'
hold PIT/repair waitTime 6.082489
- If a back-end process is running, issue the following command to stop it:
mmfsadm command stop <commandID>
where <commandID> is the command ID of the back-end process from the previous step.
The following example uses command ID 92 from the example in the previous step:
mmfsadm command stop 92
- Run the mmfsadm command again to verify that the process is no longer
running:
mmfsadm command list all
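The list, stop, and verify steps above can be sketched as a small shell helper. This is an illustration under stated assumptions, not an IBM-supplied tool: the function name recovery_command_ids is hypothetical, and the parsing assumes that the output has the same two-line layout as the example above, with the "(#N)" command ID on one line and the quoted command on the following "flags" line.

```shell
#!/bin/sh
# Hypothetical helper: reads "mmfsadm command list all" output on standard
# input and prints the decimal command ID of every tschdisk or tsrestripefs
# back-end command. Assumes the layout shown in the example above.
recovery_command_ids() {
    awk '
        /\(#[0-9]+\)/ {
            # Remember the decimal ID found inside "(#N)" on this line.
            id = $0
            sub(/.*\(#/, "", id)
            sub(/\).*/, "", id)
        }
        # The "." matches the quote character that opens the command text.
        / line .(tschdisk|tsrestripefs)/ { if (id != "") print id }
    '
}

# Usage sketch on a file system manager node:
#   for id in $(mmfsadm command list all | recovery_command_ids); do
#       mmfsadm command stop "$id"
#   done
#   mmfsadm command list all | recovery_command_ids   # prints nothing when stopped
```

Because the helper only reads text, it can be tried against saved output before it is used on a live node. If the output layout differs on your release, the patterns need to be adjusted.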