Troubleshooting high availability issues in the core warehouse
If a problem occurs with the high availability (HA) configuration in the core warehouse, the cause might be a managed resource, the HA resource model, or the system configuration. As part of a problem determination process, IBM Support might ask you to temporarily disable automation for the HA configuration in the core warehouse.
- Troubleshooting a problem with a highly available resource in the core warehouse
Use the troubleshooting steps in this topic to identify and resolve problems with a highly available resource, and to clear failed resource states.
- Troubleshooting a problem with the high availability resource model in the core warehouse
Use the hachkconfig and hareset commands to identify and repair problems in the high availability (HA) resource model for the core warehouse.
- Troubleshooting a system configuration problem that affects the high availability configuration
A system configuration problem, such as an incorrect network configuration or an incorrect storage configuration, can cause a problem with the HA configuration for the core warehouse.
- Disabling and enabling high availability
Use the instructions for disabling and enabling high availability only when instructed to do so by IBM Support as part of the problem determination process. If you use these instructions to manually disable automation for prolonged periods, your system might become unstable and failovers might become unpredictable.
- Event monitoring is unavailable when the management host fails over or when the system console is unavailable
The system console is not included in the high availability configuration for the management host. If the management host fails, the system console application fails, or IBM® Systems Director fails (for IBM PureData® System for Operational Analytics version 1.1.0.0 only), the event monitoring provided by the system console will not be available and no alerts will be issued to indicate the loss of event monitoring.
- Monitoring the core warehouse database when the system console is not available
If the management host fails over to the standby management host, you can continue to monitor the core warehouse database from the standby management host as the database performance monitor database user (db2opm).
- Accessing the warehouse tools administration console when the system console is not available
If the management host fails over to the standby management host and the system console is not available, you can continue to access the warehouse tools administration console from the standby management host as the warehouse tools administrator (dweadmin).
- Unmounting GPFS file systems and stopping GPFS on a host
The automated monitoring and failover capability within the high availability configuration for the IBM PureData System for Operational Analytics depends on the availability of the file systems that are managed by IBM General Parallel File System (GPFS™) software. Manually unmount GPFS file systems or stop GPFS resources only as part of a troubleshooting process or if you need to reboot a node.
- Starting GPFS and mounting GPFS file systems on a host
Use this task to manually start the GPFS software or to manually mount the GPFS file systems when the core domain is offline as part of a troubleshooting process or to reboot a node.
- Unmounting a GPFS file system on all hosts
Use this task to manually unmount one GPFS file system on all hosts. Perform this task only as part of a troubleshooting process.