Comments (17)

1 BruceGillespie commented · Permalink

Hi Chris. I wonder if you will be doing an update on this article based on the changes introduced by AIX 7?

Bruce Gillespie

2 Jim_VandeVegt commented · Permalink

How can one go about debugging the authentication process? The description is great, but how do you check each step to see where it might be breaking down?

What strategies can be employed when the LPAR is behind a NAT? I have several systems that get registered on the HMC using their backend NAT address rather than the publicly advertised address.

Thanks,

3 AnthonyEnglish commented · Permalink

Chris,

Not sure which version of the HMC you're using, but I think the command:

hscroot@hmc1:~> lsLPAR -dlpar

should be:

hscroot@hmc1:~> lspartition -dlpar

Anthony

4 cggibbo commented · Permalink

Thanks Anthony. I have updated this post.

5 cggibbo commented · Permalink

Here's an interesting blog post on DLPAR issues relating to AIX 6.1 and 7.1 systems that have been cloned using alt_disk_copy:

DLPAR issues with cloned AIX LPAR:

The issue relates specifically to DLPAR and Cluster Aware AIX (CAA), i.e. the CAA unique cluster node id.

6 meyawi commented · Permalink

Thanks for writing about my blog post :)

7 ste37 commented · Permalink

Hi Chris,
I have a connection problem between a new p780 HMC and the same LPAR in a DMZ with IP NAT.
On our old p590 the communication is OK, but on the new p780 we have this problem.
You wrote: "On power5 and power6 partitions, the RMC connection will be IP based so this will list the IP address of the partition."
Do you think that the problem is there?
The communication is OK between the AIX 5.3 NAT LPAR and the HMC,
and OK between the AIX 6.1 no-NAT LPAR and the HMC,
but not between the AIX 6.1 NAT LPAR and the HMC.
If I run the command lsrsrc "IBM.ManagementServer", I get correct output on the AIX 5.3 NAT and AIX 6.1 no-NAT LPARs,
but on the AIX 6.1 NAT LPAR the output is... blank:

[XXX]$ lsrsrc "IBM.ManagementServer"
Resource Persistent Attributes for IBM.ManagementServer
[XXX]$

Can you help me find a workaround?
Thanks
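A blank listing like the one above is easy to test for in a script. The sketch below checks whether any resource stanza came back from `lsrsrc "IBM.ManagementServer"`; the sample outputs are illustrative only, not captured from a real LPAR:

```shell
#!/bin/sh
# Return success if captured `lsrsrc "IBM.ManagementServer"` output
# contains at least one resource stanza; a blank listing (header line
# only, as on the broken NAT LPAR above) returns failure.
has_mgmt_server() {
    printf '%s\n' "$1" | grep -q '^resource '
}

# Illustrative output from a healthy LPAR (field values are examples):
healthy='Resource Persistent Attributes for IBM.ManagementServer
resource 1:
        Name        = "10.1.1.10"
        Hostname    = "10.1.1.10"
        ManagerType = "HMC"'

# Output from the broken LPAR: header line only.
broken='Resource Persistent Attributes for IBM.ManagementServer'

has_mgmt_server "$healthy" && echo "healthy: HMC entry present"
has_mgmt_server "$broken"  || echo "broken: no HMC entry"
```

Wrapping the check like this makes it easy to sweep a fleet of LPARs over ssh and flag the ones whose RMC registration has gone missing.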

8 hillanes commented · Permalink

Thanks,
Chris, what about IVM (VIOS without an HMC)?
I can't start the RMC daemons. How can I troubleshoot my environment?

10 VEUT_xu_ma commented · Permalink

Thanks for your article. It resolved my problem.

11 Nolte commented · Permalink

Hi Chris, thanks for the article, very useful.
But I have a problem:
While adding a virtual Fibre Channel adapter to my VIO server via DLPAR, I received a communication error, but I clicked "OK" in the window.

After resetting the connection following this article, the problem is that it is not possible to remove the adapter from the VIO server's running profile because: "0931-009 You specified a drc_name for a resource which is not assigned to this partition." In fact, I don't have the vfchost on the VIO server (not even in an "unknown" state). It looks like an HMC error, plus my mistake in clicking "OK" rather than "Cancel" when the communication was bad.
Obviously, after the reset it is possible to perform DLPAR actions, but it is not possible to delete the adapter, even with the --force option on the chhwres command.
Thanks.

12 cggibbo commented · Permalink

Sounds like you might have "ghost" adapter info on the HMC and in the VIOS, so the two are now out of sync.
You could try the following (at your own risk) to resolve the problem:

From oem_setup_env on the VIOS:

# /usr/sbin/drmgr -a -c slot -s U911X.MXX.1234E8C-V1-C164 -d 5

where the location code matches your adapter/slot configuration.

Reconfigure the slot in the VIOS from padmin:

$ cfgdev

If the above works as expected, you should then be able to remove the VFC adapter, as padmin:

$ rmdev -dev vfchostXYZ

Then the HMC DLPAR remove on the slot should complete and leave the HMC and the VIOS partition in a consistent state.

13 rcotter commented · Permalink

Warning: the recfgct command referenced above is *not* supported for use by customers without direct IBM support instructions. It erases all RSCT configuration info and makes it look like the node was just installed. This may be fine for DLPAR recycling, but if you have any other products dependent on RSCT on the partition in question, you will be *broken*.

In particular, PowerHA 7 will crash, and Tivoli SAMP will have all its cluster info destroyed, partitioning it from the rest of the domain until it can be manually re-added (and it may also crash, depending on the presence of resources).

If you find that DLPAR is not working, and all other network checks and even the RMC recycling (-z/-A/-p) do not work, it is strongly recommended that you use the ctsnap command to gather data and contact IBM support. (Capturing an iptrace for a few minutes would not be a bad idea either. A complementary tcpdump on the HMC would also be good, but this may not be possible for most customers given the HMC's access restrictions.)

Then, if you wish to proceed with recfgct and find that it does resolve whatever the problem was, it would be equally wise to gather another ctsnap after the partition is once again connected to the HMC, to compare with the previous one.
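Before going as far as recfgct, it is also worth confirming that the RMC daemon is actually listening on its well-known port, 657. A minimal sketch that inspects captured `netstat -an` output (the sample lines are illustrative, not from a real system):

```shell
#!/bin/sh
# RMC uses well-known port 657 (tcp and udp). Given captured
# `netstat -an` output, report whether a 657/tcp listener is present.
rmc_listening() {
    printf '%s\n' "$1" | grep -q '\.657 .*LISTEN'
}

# Illustrative netstat lines (addresses and counters are examples):
sample='tcp4  0  0  *.657    *.*    LISTEN
udp4  0  0  *.657    *.*'

rmc_listening "$sample" && echo "rmcd listening on 657/tcp"
```

If nothing is listening on 657, there is no point analyzing the HMC side yet; the RMC subsystem on the partition needs attention first.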

14 POWERHAguy commented · Permalink

I just encountered a similar problem after cloning a primary node in PowerHA and making it a standby node. The environment, as far as PowerHA was concerned, worked, but errpt was recording the errors listed below every 60 seconds. I went through the procedures above, though I stopped PowerHA on the standby node before starting. After performing the steps, cthags was no longer found. I rebooted and everything worked, but I wasn't really happy with that.

So I recreated my environment again. The only difference this time was that when stopping PowerHA on the standby node I also stopped the CAA services, went through the steps, then restarted and told it to start the CAA services again. Note that this exact clmgr syntax only works with PowerHA 7.1.3 SP1 or above; earlier versions of CAA/HA have different options/commands to stop/start CAA individually. This seems to have worked for me; hopefully it works for others.

clmgr stop node dtcu0_stby WHEN=now MANAGE=offline STOP_CAA=yes

stopsrc -g rsct_rm; stopsrc -g rsct
/usr/bin/odmdelete -o CuAt -q 'attribute=node_uuid'
/usr/sbin/rsct/bin/mknodeid -f   (when I ran this step I got no output; I think it just pulls in the existing ID from the repository disk for the node, but I'm not sure)
lsattr -El cluster0
/usr/sbin/rsct/bin/lsnodeid
/usr/sbin/rsct/install/bin/recfgct

clmgr start node web WHEN=now MANAGE=auto START_CAA=yes

LABEL: CONFIGRM_ONLINEFAIL
IDENTIFIER: E509DBCA

LABEL: CONFIGRM_STARTED_ST
IDENTIFIER: DE84C4DB

LABEL: SRC_RSTRT
IDENTIFIER: CB4A951F

LABEL: CONFIGRM_EXIT_ONLIN
IDENTIFIER: 68FD23E8

15 POWERHAguy commented · Permalink

I did prove rcotter's comment: running recfgct does indeed crash a PowerHA node cluster. So, as my original notes state, take PowerHA and CAA down beforehand.
