3 replies. Latest post 2011-11-21T01:26:50Z by Jonesy1234

SystemAdmin
2404 Posts

Pinned topic Ganglia with LPM

2011-10-18T21:56:40Z
We are using Ganglia for our Power7 systems.
For all LPARs in a system we configured a "cluster".
When we use LPM (Live Partition Mobility), the cluster name has to change once an LPAR moves to another system.
Are there any best practices for handling that?
We were thinking of changing gmond.conf and restarting gmond after LPM.
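
For illustration, this is the piece of gmond.conf that would have to be rewritten after a migration (the cluster name below is just a placeholder for a frame name):

  # gmond.conf excerpt -- one "cluster" per frame
  cluster {
    name  = "frame1"       # would become the target frame's name after LPM
    owner = "unspecified"
  }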

Thanks in advance for any hints.
Updated on 2011-11-21T01:26:50Z by Jonesy1234
  • Bumbes
    1 Post

    Re: Ganglia with LPM

    2011-10-31T16:16:38Z in response to SystemAdmin
    Hi,

    I am working on the same topic and wanted to let you and the community know that you are not the only one. I found this entry while gathering ideas for a solution.
    I thought about restarting gmond too: a cronjob could regularly check the system ID of the managed system and compare it with gmond.conf.

    But you will lose all historical data for the LPAR, because this script effectively creates a new node in Ganglia. So you also have to think about copying the RRD data to the new cluster. A sketch of the cronjob is below.
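
    A minimal sketch, assuming the active config records its frame serial in a "# serial:" marker line, that per-frame configs live under /etc/ganglia/, and that gmond reads /etc/gmond.conf (all of these are assumptions for this sketch, not Ganglia defaults):

      #!/bin/ksh
      # Hypothetical cron job: if the managed system's serial no longer
      # matches the one recorded in gmond.conf, install the config for
      # the new frame and restart gmond.

      SERIAL=$(uname -u)              # machine ID, e.g. IBM,02XXXXXXX
      CONF=/etc/gmond.conf
      REPO=/etc/ganglia               # assumed: one config file per frame

      CURRENT=$(awk '/^# serial:/ {print $3}' "$CONF")

      if [ "$SERIAL" != "$CURRENT" ]; then
          cp "$REPO/gmond.conf.$SERIAL" "$CONF" &&
              /etc/rc.d/init.d/gmond restart
          # Note: on the gmetad host the old RRDs still sit under the
          # old cluster's directory and must be copied over separately.
      fi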

    Have you had any results in the meantime?

    Rgds
    Bumbes
  • paulob
    1 Post

    Re: Ganglia with LPM

    2011-11-07T21:01:04Z in response to SystemAdmin
    Hello,

    I'm in the same boat: LPM, Ganglia, restarting gmond, historical data. You can use a drmgr script to restart gmond after a post-migration event (see http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/SG245765.html). Customize the /etc/rc.d/init.d/gmond script to check the serial number of the LPAR, and then you will know which frame it's on. A sketch of such a drmgr script follows.
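
    A rough sketch, assuming the AIX DLPAR script interface (registered with drmgr -i and listening for the pmig resource; treat the details as assumptions and check the redbook above for the exact contract):

      #!/bin/ksh
      # DR script: restart gmond once a partition migration completes.
      # Install with: drmgr -i /path/to/this/script

      case "$1" in
          scriptinfo)
              echo "DR_SCRIPTINFO=restart gmond after LPM"
              echo "DR_VERSION=1"
              echo "DR_DATE=07112011"
              echo "DR_VENDOR=local"
              ;;
          register)
              echo "DR_RESOURCE=pmig"   # partition migration events
              ;;
          postmigrate)
              # Now on the target frame: restart gmond so the init
              # script can pick the cluster matching the new serial.
              /etc/rc.d/init.d/gmond restart
              ;;
      esac
      exit 0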

    I'm also trying to figure out an elegant solution for retaining historical data. I wish Ganglia could report to two different "clusters" with a single running agent.

    One option is to run two agents: one for historical data tracking, reporting to a fixed cluster on, say, port 8649, and a second one reporting to a per-frame cluster on port 8650, depending on which frame the LPAR is on. You would have to create a new init.d script for the second instance. A sketch of the two configs follows.
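
    A sketch of the two-agent layout (the cluster names, the gmetad host and the file paths are placeholders; each instance would be started with its own config, e.g. gmond -c /etc/ganglia/gmond-history.conf):

      # /etc/ganglia/gmond-history.conf -- fixed cluster, keeps history
      cluster {
        name = "power-history"
      }
      udp_send_channel {
        host = gmetad.example.com
        port = 8649
      }
      udp_recv_channel {
        port = 8649
      }
      tcp_accept_channel {
        port = 8649
      }

      # /etc/ganglia/gmond-frame.conf -- identical, except that the
      # cluster name reflects the frame the LPAR currently runs on and
      # all three channels use port 8650 instead of 8649.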

    Another option is to run one agent, reporting to the correct cluster based on the frame, and then somehow merge the RRD files of the two clusters to retain historical data (see the sketch below).
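
    The simplest case is a move rather than a true merge: if the host's RRDs only exist in the old cluster's directory, relocating them is enough. A sketch for the gmetad server, with placeholder paths and names:

      #!/bin/ksh
      # Move a migrated host's RRDs into the new frame's cluster
      # directory so the graphs keep their history. Only clean if it
      # runs before gmetad writes new RRDs for the host there.

      RRDS=/var/lib/ganglia/rrds        # default gmetad RRD tree
      HOST=lpar01.example.com
      OLD="$RRDS/frame-old/$HOST"
      NEW="$RRDS/frame-new/$HOST"

      if [ -d "$OLD" ] && [ ! -d "$NEW" ]; then
          mv "$OLD" "$NEW"
      else
          # RRDs on both sides: a real merge would need rrdtool
          # dump/restore surgery on each metric file.
          echo "manual merge needed for $HOST" >&2
      fi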

    I've been testing a few approaches, but haven't had any success yet.

    Anyone else figure something out?

    Paulo Baptista
  • Jonesy1234
    2 Posts

    Re: Ganglia with LPM

    2011-11-21T01:26:50Z in response to SystemAdmin
    At my previous job I implemented a ksh script that ran every minute and looked for successful-migration messages in errpt. Based on the pSeries serial number, it would pull down the matching cluster's gmond.conf from a central location via rsync, restart gmond, and then clear the errpt message. A sketch is below.
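
    A minimal sketch of that script; the errpt label for a completed migration and the rsync layout are assumptions here, so check errpt on your own systems after a test migration:

      #!/bin/ksh
      # Cron job: react to a completed LPM migration logged in errpt.

      LABEL=CLIENT_PMIG_DONE            # assumed "migration done" label
      SERIAL=$(uname -u)                # pSeries machine ID
      MASTER=confighost::ganglia        # hypothetical rsync module

      # errpt prints nothing when no entry matches the label
      if [ -n "$(errpt -J $LABEL)" ]; then
          rsync "$MASTER/gmond.conf.$SERIAL" /etc/gmond.conf &&
              /etc/rc.d/init.d/gmond restart
          errclear -J $LABEL 0          # clear the handled entries
      fi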

    At my current place I have a more elegant solution: the whole process is controlled via cfengine, which manages the gmond.conf configuration files and the state of the daemon. In turn, any new LPARs I create get Ganglia configured as soon as I install cfengine on them.

    If you're interested in getting started with cfengine, a good resource is http://www.ibm.com/developerworks/opensource/library/os-cfengine1/index.html?ca=drs-

    Let me know if you want details of how I achieved this within cfengine.

    Cheers