We are using Ganglia for our POWER7 systems.
For all LPARs in a physical system we configured a "cluster".
If we use LPM (Live Partition Mobility), the cluster name would have to change.
Are there any best practices for handling that?
We were thinking of changing gmond.conf and restarting gmond after LPM.
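To make the idea concrete, here is roughly what we had in mind. This is only a sketch: the config path, the init script location, and the detail that only the cluster block's name attribute should be rewritten (metric blocks have name attributes too) are our assumptions:

```shell
#!/bin/sh
# Sketch: rewrite the cluster name in gmond.conf after an LPM move.
# The config path and init script location are assumptions; adjust to taste.

set_cluster_name() {
    conf=$1; new=$2
    # Only touch the 'name' attribute inside the cluster { ... } block;
    # metric blocks in gmond.conf carry 'name' attributes of their own.
    awk -v new="$new" '
        /^cluster[ \t]*{/ { inblock = 1 }
        inblock && /name[ \t]*=/ { sub(/=.*/, "= \"" new "\""); inblock = 0 }
        { print }
    ' "$conf" > "$conf.new" && mv "$conf.new" "$conf"
}

# Demo on a throwaway config; a real run would use /etc/ganglia/gmond.conf
# and finish with something like: /etc/rc.d/init.d/gmond restart
conf=/tmp/gmond.conf.demo
cat > "$conf" <<'EOF'
cluster {
  name = "POWER7-FRAME-A"
  owner = "unspecified"
}
metric {
  name = "load_one"
}
EOF
set_cluster_name "$conf" "POWER7-FRAME-B"
grep 'name' "$conf"
```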
Thanks in advance for any hints.
Pinned topic: Ganglia with LPM
3 replies · This topic has been locked · Latest post 2011-11-21T01:26:50Z by Jonesy1234
Bumbes · Re: Ganglia with LPM · 2011-10-31T16:16:38Z, in response to SystemAdmin

Hi,
I'm working on the same topic and would like to let you and the community know that you are not the only one. I found this entry while gathering ideas for a solution.
I thought about restarting gmond, too, with a cron job that regularly checks the system ID of the managed system and compares it against gmond.conf.
But you will lose all of the LPAR's historical data, because such a script creates a new node in Ganglia. So you also have to think about copying the RRD data to the new cluster.
Do you have any results in the meantime?
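To sketch the cron-job idea: the frame-&lt;serial&gt; naming scheme, the config path, and using uname -u for the managed system's ID are my assumptions, not anything Ganglia prescribes:

```shell
#!/bin/sh
# Sketch of the cron-job check: compare the managed system's serial number
# with the cluster name in gmond.conf and restart gmond on a mismatch.

current_cluster() {
    # Extract name = "..." from the cluster block of gmond.conf.
    awk -F'"' '/^cluster/ {inblock=1} inblock && /name[ \t]*=/ {print $2; exit}' "$1"
}

needs_restart() {
    conf=$1; serial=$2
    [ "$(current_cluster "$conf")" != "frame-$serial" ]
}

# Demo with a throwaway config; on AIX the serial would come from 'uname -u'
# and a cron entry would run this every few minutes.
conf=/tmp/gmond.conf.check
printf 'cluster {\n  name = "frame-06ABC12"\n}\n' > "$conf"

if needs_restart "$conf" "06XYZ99"; then
    echo "mismatch: would rewrite gmond.conf and restart gmond"
else
    echo "cluster name up to date"
fi
```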
paulob · Re: Ganglia with LPM · 2011-11-07T21:01:04Z, in response to SystemAdmin

Hello,
I'm in the same boat: LPM, Ganglia, restarting gmond, historical data. You can use a drmgr script to restart gmond after a post-migration event (see http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/SG245765.html). Customize the /etc/rc.d/init.d/gmond script and have it check the serial number of the LPAR; then you will know which frame it's on.
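A rough sketch of such a drmgr hook. The subcommand names follow the AIX DLPAR script framework, and the restart action is stubbed with an echo here; verify both against your AIX level before registering the script with drmgr:

```shell
#!/bin/sh
# Sketch of a DLPAR (drmgr) hook to restart gmond after a migration.
# Subcommand names are from the AIX DLPAR script framework (assumption);
# the echo stands in for the real restart command.

handle_dr_event() {
    case "$1" in
    postmigrate)
        # The LPAR has landed on the new frame: restart gmond so it
        # picks up the cluster for this frame's serial number.
        echo "restart gmond"   # real script: /etc/rc.d/init.d/gmond restart
        ;;
    scriptinfo)
        echo "DR_VERSION=1"
        ;;
    *)
        echo "ignored: $1"     # other DLPAR phases: nothing to do
        ;;
    esac
}

handle_dr_event postmigrate
```

A real script would be installed with drmgr -i and would perform the actual restart in place of the echo.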
I'm also trying to figure out an elegant solution for retaining historical data. I wish Ganglia could report to two different "clusters" with one agent running.
One option is to run two agents: one for historical-data tracking that reports to one cluster on, say, port 8649, and a second that reports to another cluster, depending on which frame the LPAR is on, via port 8650. You would have to create a new init.d script for the second agent.
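The two-agent layout could look something like the following pair of config files. The file paths, cluster names, and collector host are placeholders; only the two ports come from the suggestion above:

```
# /etc/ganglia/gmond-historical.conf -- agent 1: stable cluster for history
cluster { name = "lpar-history" }
udp_send_channel { host = collector.example.com  port = 8649 }
udp_recv_channel { port = 8649 }
tcp_accept_channel { port = 8649 }

# /etc/ganglia/gmond-frame.conf -- agent 2: cluster name follows the frame
cluster { name = "frame-06ABC12" }
udp_send_channel { host = collector.example.com  port = 8650 }
udp_recv_channel { port = 8650 }
tcp_accept_channel { port = 8650 }
```

Each agent would then be started with its own config, e.g. gmond -c /etc/ganglia/gmond-frame.conf from the second init.d script.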
Another option is to run one agent, reporting to the correct cluster based on the frame, and then somehow merge the RRD files of the two clusters to retain the historical data.
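For the copy part, something like this might serve as a starting point. The rrds path and the cluster/host directory layout are the usual gmetad defaults, but treat them as assumptions; also note that a plain copy only helps while the new cluster has no data for the host yet, since truly merging overlapping RRDs would need rrdtool dump/restore:

```shell
#!/bin/sh
# Sketch: carry an LPAR's RRD files over to its new cluster directory on
# the gmetad server (layout assumed: <rrds>/<cluster>/<host>/*.rrd).

carry_over_rrds() {
    rrds=$1; host=$2; old=$3; new=$4
    mkdir -p "$rrds/$new/$host"
    # Plain copy; merging RRDs that already overlap in time is much
    # more involved (rrdtool dump/restore).
    cp -Rp "$rrds/$old/$host/." "$rrds/$new/$host/"
}

# Demo on a throwaway tree; a real run would point at /var/lib/ganglia/rrds.
base=/tmp/rrds.demo
mkdir -p "$base/frame-A/lpar01"
touch "$base/frame-A/lpar01/load_one.rrd"
carry_over_rrds "$base" lpar01 frame-A frame-B
ls "$base/frame-B/lpar01"
```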
I've been testing a few approaches, but haven't had any success yet.
Anyone else figure something out?
Jonesy1234 · Re: Ganglia with LPM · 2011-11-21T01:26:50Z, in response to SystemAdmin

At my previous job I implemented a ksh script that ran every minute and looked for "migration successful" messages in errpt. Based on the pSeries serial number, it would pull the relevant cluster's gmond.conf file down from a central location via rsync, then restart gmond and clear the errpt message.
At my current place I have a more elegant solution: the whole process is controlled via cfengine, which manages the gmond.conf configuration file and the state of the daemon. In turn, any new LPARs I create get Ganglia configured as soon as I install cfengine on them.
If you're interested in getting started with cfengine, a good resource is http://www.ibm.com/developerworks/opensource/library/os-cfengine1/index.html?ca=drs-
Let me know if you want details of how I achieved this within cfengine.