Topic
  • 11 replies
  • Latest Post - ‏2012-03-02T18:52:03Z by jacobjr
robinsguk
robinsguk
29 Posts

Pinned topic Cannot add node to VIOS cluster

‏2012-02-24T10:43:36Z |
Hello,

I have 2 x VIOS, one on a 750 and one on a 770, both running at ioslevel 2.2.1.3.

I had a cluster with both VIOS as cluster members.

There was an error with the storage and both VIOS lost their cluster configuration.

I'm rebuilding the cluster.

So far I've managed to get the 770 VIOS to create the cluster again, with itself as the only member.

The cluster is working fine and providing storage to the LPARs.

When I try to add the 750 VIOS to the cluster using cluster -addnode -clustername mycluster -hostname 750_vios_hostname I get an error:
Cluster subsystem failed to add partition, perhaps reporting an error during cleanup.
750_vios_hostname

Warning: User intervention may be required to complete operation.

CAA: No Entities were added or removed
Command did not complete.
I can resolve the hostname of the 750 VIOS from the 770.
The 750 VIOS is showing that it is not a member of a cluster.

1. Where can I see any error messages generated by the cluster -addnode command?
2. It looks like the 750 VIOS requires some kind of cleanup to clear any remnants of the previous cluster config. Is there a way to do this?
Thanks

Glenn
  • jacobjr
    jacobjr
    7 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-02-24T14:42:29Z  
    Hi Glenn,

    Can you provide more information on what happened with the storage?
    Are you reusing the same storage as before?
    What is the output when you run lscluster -m on each VIOS?

    - Jacob
  • robinsguk
    robinsguk
    29 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-02-24T21:35:23Z  
    First of all, this is a sandbox environment, so it is not mission critical.

    Each VIOS is connected via FC to 2 x V7000, each presenting LUNs to the VIOSs.

    Someone accidentally removed the fc cables from one of the V7000 and the clusters just didn't cope well.

    I tried to recover the cluster but couldn't do it.

    I recreated the whole environment, with the exception of the 750 VIOS.

    On the 750 VIOS, which I can't add to the cluster:
    Cluster services are not active.

    On the 770 VIOS:

    Calling node query for all nodes
    Node query number of nodes examined: 1
    Node name: xxxxxxxxxxxx
    Cluster shorthand id for node: 1
    uuid for node: c9065724-5716-11e1-a500-00145ee8a029
    State of node: UP NODE_LOCAL
    Smoothed rtt to node: 0
    Mean Deviation in network rtt to node: 0
    Number of clusters node is a member in: 1
    CLUSTER NAME TYPE SHID UUID
    xxxxxxx local c91e9690-5716-11e1-a500-00145ee8a029

    Number of points_of_contact for node: 0
    Point-of-contact interface & contact state
    n/a
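
For comparing this kind of output across nodes, a minimal Python sketch (not a VIOS tool) can pull the node name and state out of saved lscluster -m text. It assumes the "Node name:" and "State of node:" labels appear exactly as in the output above; the function name node_states is hypothetical.

```python
# Sketch only: parses text saved from `lscluster -m`, assuming the
# "Node name:" / "State of node:" labels shown in the thread.
SAMPLE_M = """\
Calling node query for all nodes
Node query number of nodes examined: 1
Node name: xxxxxxxxxxxx
Cluster shorthand id for node: 1
State of node: UP NODE_LOCAL
Number of points_of_contact for node: 0
"""


def node_states(lscluster_m_text):
    """Map each node name to the list of state flags reported for it."""
    states = {}
    node = None
    for raw in lscluster_m_text.splitlines():
        line = raw.strip()
        if line.startswith("Node name:"):
            node = line.split(":", 1)[1].strip()
        elif line.startswith("State of node:") and node is not None:
            states[node] = line.split(":", 1)[1].split()
    return states


if __name__ == "__main__":
    print(node_states(SAMPLE_M))
```

A node that only prints "Cluster services are not active." yields an empty mapping, which matches what the 750 reports here.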
  • jacobjr
    jacobjr
    7 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-02-24T21:47:22Z  
    Okay, you will need to make sure that the 750 has visibility to the repository disk and the storage pool disks.
    Run lscluster -d to get a list of these devices. Once you have the list, make sure the disks have been zoned to the node you wish to add.
  • robinsguk
    robinsguk
    29 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-02-25T07:38:28Z  
    On the 750, lscluster -d reports "Cluster services are not active."

    The 770 shows that there's 1 node in the cluster, 1 expected, 1 repository disk and 2 cluster disks.

    Glenn
  • jacobjr
    jacobjr
    7 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-02-25T16:28:33Z  
    You can verify that the 750 sees the disks that are in the cluster by getting the UDIDs from the output of lscluster -d on the 770 and checking the disk state there. Once you have the UDIDs, run lsdev -attr -dev <disk> | grep <UDID> on the 750.
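
The comparison described above could be scripted roughly as follows. This is a hedged Python sketch that works on saved command output rather than calling the VIOS CLI; it assumes the attribute listing contains a line starting with unique_id followed by the value, and the names extract_unique_id and udids_not_visible, plus the second UDID string, are hypothetical.

```python
# Sketch only: checks which cluster UDIDs are not reported by any disk
# on the node being added. Input is text captured from the lsdev
# attribute listing for each disk (format assumed, see lead-in).
UDID_FROM_THREAD = "3321360050768028180DD700000000000000504214503IBMfcp"
PLACEHOLDER_UDID = "EXAMPLE_UDID_NOT_ZONED"  # hypothetical, for illustration


def extract_unique_id(lsdev_attr_text):
    """Return the value on the 'unique_id' line, or None if absent."""
    for line in lsdev_attr_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "unique_id":
            return fields[1]
    return None


def udids_not_visible(cluster_udids, node_disk_outputs):
    """List cluster UDIDs that no disk on the node reports as unique_id."""
    seen = {extract_unique_id(text) for text in node_disk_outputs.values()}
    return [udid for udid in cluster_udids if udid not in seen]


if __name__ == "__main__":
    outputs_on_750 = {"hdisk4": "unique_id " + UDID_FROM_THREAD}
    wanted = [UDID_FROM_THREAD, PLACEHOLDER_UDID]
    print(udids_not_visible(wanted, outputs_on_750))
```

Any UDID in the returned list would point at a disk that still needs to be zoned to the node.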
  • robinsguk
    robinsguk
    29 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-02-27T08:35:31Z  
    So,

    I see three LUNs in the cluster on the 770, 1 quorum and 2 data.

    On the 770 the quorum disk is not showing as having a uDid:

    hdisk4
    state : UP
    uDid :
    uUid : 2c347d48-1289-e8b0-d542-8f234721b66e
    type : REPDISK
    On the 750 I can see all three disks presented, and the 2 data disks have matching uDids with the 770, but the quorum disk shows a uDid:

    unique_id 3321360050768028180DD700000000000000504214503IBMfcp

    When I check the PVID for this disk on both 770 and 750 it matches.

    One thing I have noticed is that the quorum disk is showing as belonging to a cluster VG.

    hdisk4 00f6ad6b7c3ab85d caavg_private active

    If I try to vary that VG on or off, I get an error saying it's not found on the system.

    Thanks

    Glenn
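
To spot a disk like the repository disk above, whose uDid field is empty, a small Python sketch can scan saved lscluster -d text. It assumes the output consists only of stanzas shaped like the hdisk4 excerpt in this post (a disk name on its own line, then "key : value" lines); real output may carry extra header lines that would need filtering. The function names are hypothetical.

```python
# Sketch only: parses disk stanzas in the shape quoted in the thread and
# reports disks whose uDid field is empty.
SAMPLE_D = """\
hdisk4
state : UP
uDid :
uUid : 2c347d48-1289-e8b0-d542-8f234721b66e
type : REPDISK
"""


def parse_disk_stanzas(text):
    """Return {disk_name: {field: value}} for each stanza in the text."""
    disks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" in line:
            if current is None:
                continue  # "key : value" line before any disk name
            key, _, value = line.partition(":")
            disks[current][key.strip()] = value.strip()
        else:
            current = line  # a bare line starts a new disk stanza
            disks[current] = {}
    return disks


def disks_without_udid(disks):
    """Names of disks whose uDid is missing or empty."""
    return [name for name, attrs in disks.items() if not attrs.get("uDid")]


if __name__ == "__main__":
    print(disks_without_udid(parse_disk_stanzas(SAMPLE_D)))
```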
  • jacobjr
    jacobjr
    7 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-03-02T14:48:45Z  
    Can you try to delete the cluster, and then create it on the 750?
    If that succeeds then you can attempt to add the other node.
  • robinsguk
    robinsguk
    29 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-03-02T17:44:10Z  
    cluster -delete -clustername xxxxxxxx

    "xxxxxx" is not a valid cluster.

    Command did not complete.
    The 750 says there is no cluster so it can't delete it.

    cluster -list returns nothing.

    lscluster

    Cluster services are not active.

    Glenn
  • jacobjr
    jacobjr
    7 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-03-02T17:48:45Z  
    Hi Glenn,

    Sorry, I should have been clearer. If possible, delete the cluster on the 770, then create it on the 750. If that succeeds, add the 770 as a node.
  • robinsguk
    robinsguk
    29 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-03-02T17:55:51Z  
    That could be difficult. I have the cluster up and running on the 770 with storage attached.

    Can I delete it and recreate it without losing my data?

    Glenn
  • jacobjr
    jacobjr
    7 Posts

    Re: Cannot add node to VIOS cluster

    ‏2012-03-02T18:52:03Z  
    If you already have data in the cluster, I would advise you not to delete it.
    The next step would be to attempt the addnode again and then collect snap data
    from both the 770 and the 750. The logs in there should detail the problem.