Troubleshooting link problems

When troubleshooting a link problem, start the analysis from the master domain manager. The loss of the "F" flag at an agent indicates that one of its links has a problem. A missing secondary link can be located by matching the "W" flags found on the full-status fault-tolerant agent on the other side.
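
As a minimal sketch of the flag check described above, the following shell helper tests whether a given flag letter appears in a flag string. It assumes that you have copied the flag string by hand from the conman output; it does not parse that output. The function name has_flag is hypothetical and is not part of the product.

```shell
#!/bin/sh
# has_flag FLAGS LETTER
# Succeeds (exit 0) if LETTER occurs anywhere in FLAGS, fails (exit 1) if not.
# FLAGS is assumed to be a flag string copied by hand from conman output.
has_flag() {
    # An empty LETTER would match everything, so reject it explicitly.
    [ -n "$2" ] || return 2
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

For example, has_flag "$flags" F || echo "not fully linked" prints the message when the string in the (illustrative) variable flags has no "F".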

Consider the network shown in Figure 1, where the workstation ACCT_FS, which is a full-status fault-tolerant agent, is not linked:
The key to Figure 1 is as follows. The colors of the text and labels are indicated in parentheses for readers who are viewing this guide online or who have printed it on a color printer; if you are viewing it without color, ignore the color information:
White text on dark (blue) labels
  CPUIDs of fault-tolerant agents in the master domain
Black text
  Operating systems
Black text on grey labels
  CPUIDs of standard agents in the master domain, or any agents in lower domains
Text (red) in "double quotation marks"
  Status of workstations obtained by running conman sc @!@ at the master domain manager. Only statuses of workstations that return a status value are shown.
Black double-headed arrows
  Primary links in master domain
Explosion
  Broken primary link to ACCT_FS
Dotted lines (red)
  Secondary links to ACCT_FS from the other workstations in the ACCT domain that could not be effected.
You might become aware of a network problem in a number of ways, but if you believe that a workstation is not linked, follow this procedure to troubleshoot the fault:
  1. Run the command conman sc @!@ on the master domain manager. The output shows that there is a problem with ACCT_FS, as in the example command output in Figure 2:
  2. From the ACCT_DM workstation, run conman sc. In this case, you see that all the writer processes are running, except for ACCT_FS. These are the primary links, shown by the solid lines in Figure 1. The output of the command in this example is as shown in Figure 3:
  3. From the ACCT_FS workstation, run conman sc. In this case, you see that there are no writer processes running. These are the secondary links, shown with the dotted lines in Figure 1. The output of the command in this example is as shown in Figure 4:
  4. If a network problem is preventing ACCT_FS from linking, resolve the problem.
  5. Wait for ACCT_FS to link.
  6. From the ACCT_FS workstation, run conman sc @!@. If the workstation has started to link, you can see that a writer process is running on many of the workstations indicated in Figure 1. Their secondary links have now been made to ACCT_FS. The workstations that have linked have an "F" instead of their previous setting. This view also shows that the master domain manager has started a writer process running on ACCT_FS. The output of the command in this example is as shown in Figure 5:
  7. Another way of checking which writer processes are running on ACCT_FS is to run the command: ps -ef | grep writer (use Task Manager on Windows™). The output of the ps command in this example is as shown in Figure 6:
  8. To determine if a workstation is fully linked, use the Monitor Workstations list in the Dynamic Workload Console.
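
The process check in step 7 can be sketched as two small shell helpers that filter ps -ef style text read from standard input. This is an illustrative sketch, not a product utility: the function names list_writers and count_writers are hypothetical, and the only assumption about the ps output is that each writer process line contains the word "writer". Writing the pattern as [w]riter keeps grep from matching its own command line.

```shell
#!/bin/sh
# Print only the lines that mention a writer process.
# The bracketed [w] stops the pattern from matching the grep command itself,
# whose command line contains "[w]riter" rather than "writer".
list_writers() {
    grep '[w]riter'
}

# Print the number of lines that mention a writer process.
# grep -c prints 0 but exits non-zero when nothing matches, so "|| true"
# keeps the function from reporting a failure in that case.
count_writers() {
    grep -c '[w]riter' || true
}
```

On UNIX, run ps -ef | list_writers on ACCT_FS to list the writer processes, or ps -ef | count_writers to count them. A count of 0 is consistent with the situation in step 3, where no writer processes are running.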