9 replies Latest Post - ‏2012-02-29T15:54:28Z by jjgarcia
jjgarcia
4 Posts

DLPAR Memory Add fails with Debian on Power 720 and 740

‏2012-02-14T15:54:00Z |
 Hi.
 
I installed Debian Squeeze on two Power servers, a 740 and a 720, and installed all the IBM productivity packages for Linux, such as RSCT, SRC, DevicesCHRP, and DynamicRM. I recompiled kernel 3.0.8 with a new configuration to enable DLPAR operations, and I am now able to perform DLPAR CPU add and remove, memory remove, and PCI add and remove. But the HMC throws an error when I attempt to add new memory to the partition dynamically:
 
The error is: 
 
Dynamic add of memory resources failed:

########## Jan 20 09:30:06 2012 ##########
: Could not online lmb.
......
...... 
..... 
: Could not online lmb.
: Could not online lmb.
Rotating logs...



Please issue the lshwres command to list the memory resources of the partition and to determine whether or not its pending and runtime memory values match. If they do not match, problems with future memory-related operations on the managed system may occur, and it is recommended that the rsthwres command to restore memory resources be issued on the partition to synchronize its pending memory value with its runtime memory value.
HSC02931
 
I issued both of the commands mentioned above, but everything looks OK.
 
This is what I get in the kernel log:
 
Feb  8 10:53:16 debian720 : drmgr: drslot_chrp_mem -a -c mem -q 2 -w 5 -d 1
Feb  8 10:53:16 debian720 kernel: [  630.555196] section number 384 page number 4097 not reserved, was it already online?
Feb  8 10:53:16 debian720 kernel: [  630.921371] section number 400 page number 4096 not reserved, was it already online?
Feb  8 10:53:17 debian720 kernel: [  631.307976] section number 416 page number 4097 not reserved, was it already online?
Feb  8 10:53:17 debian720 kernel: [  631.672336] section number 432 page number 4096 not reserved, was it already online?
Feb  8 10:53:17 debian720 kernel: [  632.035692] section number 448 page number 4097 not reserved, was it already online?
Feb  8 10:53:18 debian720 kernel: [  632.403525] section number 464 page number 4096 not reserved, was it already online?
Feb  8 10:53:18 debian720 kernel: [  632.792630] section number 480 page number 4097 not reserved, was it already online?
Feb  8 10:53:19 debian720 kernel: [  633.147243] section number 496 page number 4096 not reserved, was it already online?
Feb  8 10:53:19 debian720 kernel: [  633.515014] section number 512 page number 4097 not reserved, was it already online?
Feb  8 10:53:19 debian720 kernel: [  633.880325] section number 528 page number 4096 not reserved, was it already online?
Feb  8 10:53:20 debian720 kernel: [  634.232037] section number 544 page number 4097 not reserved, was it already online?
Feb  8 10:53:20 debian720 kernel: [  634.586849] section number 560 page number 4096 not reserved, was it already online?
Feb  8 10:53:20 debian720 kernel: [  634.941967] section number 576 page number 4097 not reserved, was it already online?
Feb  8 10:53:21 debian720 kernel: [  635.290772] section number 592 page number 4096 not reserved, was it already online?
Feb  8 10:53:21 debian720 kernel: [  635.653717] section number 608 page number 4097 not reserved, was it already online?
Feb  8 10:53:21 debian720 kernel: [  636.025669] section number 624 page number 4096 not reserved, was it already online?
Feb  8 10:53:22 debian720 kernel: [  636.390506] section number 640 page number 4097 not reserved, was it already online?
Feb  8 10:53:22 debian720 kernel: [  636.741041] section number 656 page number 4096 not reserved, was it already online?
Feb  8 10:53:22 debian720 kernel: [  637.092352] section number 672 page number 4097 not reserved, was it already online?
Feb  8 10:53:23 debian720 kernel: [  637.443083] section number 688 page number 4096 not reserved, was it already online?
Feb  8 10:53:23 debian720 kernel: [  637.800875] section number 704 page number 4097 not reserved, was it already online?
Feb  8 10:53:24 debian720 kernel: [  638.166836] section number 720 page number 4096 not reserved, was it already online?
Feb  8 10:53:24 debian720 kernel: [  638.525368] section number 736 page number 4097 not reserved, was it already online?
Feb  8 10:53:24 debian720 kernel: [  638.906531] section number 752 page number 4096 not reserved, was it already online?
Feb  8 10:53:25 debian720 kernel: [  639.257044] section number 768 page number 4097 not reserved, was it already online?
Feb  8 10:53:25 debian720 kernel: [  639.614637] section number 784 page number 4096 not reserved, was it already online?
Feb  8 10:53:25 debian720 kernel: [  639.970354] section number 800 page number 4097 not reserved, was it already online?
Feb  8 10:53:26 debian720 kernel: [  640.333856] section number 816 page number 4096 not reserved, was it already online?
Feb  8 10:53:26 debian720 kernel: [  640.721378] section number 832 page number 4097 not reserved, was it already online?
Feb  8 10:53:26 debian720 kernel: [  641.075491] section number 848 page number 4096 not reserved, was it already online?
Feb  8 10:53:27 debian720 kernel: [  641.422704] section number 864 page number 4097 not reserved, was it already online?
Feb  8 10:53:27 debian720 kernel: [  641.808307] section number 880 page number 4096 not reserved, was it already online?
Feb  8 10:53:28 debian720 kernel: [  642.178335] section number 896 page number 4097 not reserved, was it already online?
Feb  8 10:53:28 debian720 kernel: [  642.540730] section number 912 page number 4096 not reserved, was it already online?
Feb  8 10:53:28 debian720 kernel: [  642.904853] section number 928 page number 4097 not reserved, was it already online?
Feb  8 10:53:29 debian720 kernel: [  643.265897] section number 944 page number 4096 not reserved, was it already online?
Feb  8 10:53:29 debian720 kernel: [  643.639210] section number 960 page number 4097 not reserved, was it already online?
Feb  8 10:53:29 debian720 kernel: [  644.006150] section number 976 page number 4096 not reserved, was it already online?
Feb  8 10:53:30 debian720 kernel: [  644.373531] section number 992 page number 4097 not reserved, was it already online?
Feb  8 10:53:30 debian720 kernel: [  644.721492] section number 1008 page number 4096 not reserved, was it already online?

And this is part of the drmgr log:

########## Jan 20 11:09:44 2012 ##########
Validating CPU DLPAR capability...yes.
Validating Memory DLPAR capability...yes.
Validating I/O DLPAR capability...yes.
Validating PHB DLPAR capability...yes.
Validating HEA DLPAR capability...yes.
Validating partition migration capability...yes.
Validating partition hibernation capability...yes.
########## Jan 20 11:09:44 2012 ##########

########## Jan 20 11:11:45 2012 ##########
drmgr: drslot_chrp_mem -a -c mem -q 2 -w 5 -d 1
Validating Memory DLPAR capability...yes.
Found 63 lmbs
Found 19 owning lmbs
Adding 2 lmbs
AMS ballooning is not active
get-sensor for 80000013: 0, 2
Found available lmb, LMB20, drc index 0x80000013
Acquiring drc index 0x80000013
get-sensor for 80000013: 0, 2
setting allocation state to alloc usable
setting indicator state to unisolate
Updating of property
Attempting to online lmb.
Probing memory address 0x130000000
Marking /sys/devices/system/memory/memory19 online
Could not online /sys/devices/system/memory/memory19.
: Could not online lmb.
Updating of property
Releasing drc index 0x80000013
get-sensor for 80000013: 0, 1
setting isolation state to isolate
setting allocation state to alloc unusable
get-sensor for 80000013: 0, 2
drc_index 80000013 sensor-state: 2
Resource is not available to the partition.
AMS ballooning is not active
get-sensor for 80000014: 0, 2
Found available lmb, LMB21, drc index 0x80000014
Acquiring drc index 0x80000014
get-sensor for 80000014: 0, 2
setting allocation state to alloc usable
setting indicator state to unisolate
Updating of property
Attempting to online lmb.
Probing memory address 0x140000000
Marking /sys/devices/system/memory/memory20 online
Could not online /sys/devices/system/memory/memory20.
: Could not online lmb.
Updating of property
Releasing drc index 0x80000014
get-sensor for 80000014: 0, 1
setting isolation state to isolate
setting allocation state to alloc unusable
get-sensor for 80000014: 0, 2
drc_index 80000014 sensor-state: 2
Resource is not available to the partition.
 
Can anyone help me with this issue?
 
Thanks a lot. 
Updated on 2012-02-29T15:54:28Z by jjgarcia
  • Dave_Hansen
    6 Posts

    Re: DLPAR Memory Add fails with Debian on Power 720 and 740

    ‏2012-02-14T17:07:58Z  in response to jjgarcia
    First and foremost, Debian isn't supported on this hardware, as far as I know.  I'm saying that just so you know why you ran into this: you're probably the first to ever try it.
     
    My first guess is that there's some confusion somewhere between SECTION_SIZE (the compile-time constant for sparsemem), the /sys/devices/system/memory/* "section size", and the LMB size. There were some patches a while ago to decouple these, and I wonder whether you're hitting something in there, or whether your DLPAR tools are making a false correlation between LMB size and section size.
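
A rough way to sanity-check how those three sizes relate is plain address arithmetic. The sketch below assumes 256 MiB LMBs (and sysfs memory blocks) and 16 MiB sparsemem sections. Neither value is confirmed anywhere in this thread, although both are consistent with the logs in the first post: drmgr probes 0x130000000 and then marks memory19, and the kernel messages step through section numbers 16 at a time. The helper names are made up for illustration.

```python
# Sketch: relate addresses from the drmgr log to sysfs memory block
# numbers and sparsemem section numbers.
# ASSUMPTIONS (not stated in the thread): 256 MiB LMBs / memory blocks
# and 16 MiB sparsemem sections.

LMB_SIZE = 256 * 1024 * 1024      # assumed LMB and memory block size
SECTION_SIZE = 16 * 1024 * 1024   # assumed sparsemem section size


def block_index(addr, block_size=LMB_SIZE):
    """Index N of the /sys/devices/system/memory/memoryN block holding addr."""
    return addr // block_size


def section_number(addr, section_size=SECTION_SIZE):
    """Number of the sparsemem section that contains addr."""
    return addr // section_size


# How many sections make up one LMB under the assumed sizes.
sections_per_lmb = LMB_SIZE // SECTION_SIZE

# The two addresses probed in the drmgr log from the first post.
for addr in (0x130000000, 0x140000000):
    print(hex(addr), "-> memory%d," % block_index(addr),
          "first section", section_number(addr))
print("sections per LMB:", sections_per_lmb)
```

If the block index computed from the real block_size_bytes value does not match the memoryN file that drmgr touches, that would point at the kind of size mismatch described above.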
     
    Could you post your entire dmesg, or perhaps a URL to it?  The entire contents of /sys/devices/system/memory/ tarred up would also be interesting to see both before and after the attempted hotplug operation.  I'm most curious about what is in here: /sys/devices/system/memory/block_size_bytes
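
As an alternative to tarring up the directory, the same before-and-after comparison can be sketched in a few lines of Python. This assumes the usual sysfs layout, where block_size_bytes holds a hexadecimal string and each memoryN directory has a state file containing "online" or "offline". The function names are illustrative and are not part of any IBM tool.

```python
import os
import re

SYSFS_MEMORY = "/sys/devices/system/memory"


def read_block_size(root=SYSFS_MEMORY):
    """Return the memory block size in bytes (the file holds a hex string)."""
    with open(os.path.join(root, "block_size_bytes")) as f:
        return int(f.read().strip(), 16)


def snapshot(root=SYSFS_MEMORY):
    """Map each memoryN block index to the contents of its state file."""
    states = {}
    for name in os.listdir(root):
        m = re.fullmatch(r"memory(\d+)", name)
        if not m:
            continue  # skip block_size_bytes, probe, and other entries
        with open(os.path.join(root, name, "state")) as f:
            states[int(m.group(1))] = f.read().strip()
    return states


def diff(before, after):
    """Blocks that appeared, disappeared, or changed state between snapshots."""
    changed = {}
    for idx in sorted(set(before) | set(after)):
        if before.get(idx) != after.get(idx):
            changed[idx] = (before.get(idx), after.get(idx))
    return changed


# Typical use on the partition:
#   before = snapshot()
#   ... attempt the DLPAR memory add from the HMC ...
#   print(read_block_size(), diff(before, snapshot()))
```

An empty diff after a failed add would confirm that no block changed state, which matches the "Could not online" lines in the drmgr log.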
     
    Lastly, how much memory is in your system, and how much did you attempt to hotplug?
    Updated on 2012-02-14T17:07:58Z by Dave_Hansen
  • nfont
    2 Posts

    Re: DLPAR Memory Add fails with Debian on Power 720 and 740

    ‏2012-02-14T17:13:43Z  in response to jjgarcia
    Can you also let us know what version of powerpc-utils you have installed?
  • jjgarcia
    4 Posts

    Re: DLPAR Memory Add fails with Debian on Power 720 and 740

    ‏2012-02-27T16:36:46Z  in response to jjgarcia
    Hi,
     
    Well, I know this Linux distribution is not supported, but we are running some tests in order to meet some targets in the region. I'm using powerpc-utils version 1.2.12; I downloaded the source and compiled it into a Deb package.
     
    Here's a link to the files under /sys/devices/system/memory, in the file sys_memory.tar.gz. The complete drmgr, messages, and dmesg log files are there as well.
     
    Thanks to all.
    • This reply was deleted by RCJ 2012-02-27T18:28:15Z.
      • RCJ
        10 Posts

        Re: DLPAR Memory Add fails with Debian on Power 720 and 740

        ‏2012-02-27T18:10:53Z  in response to RCJ
         I lost track of the fact that this is not using the stock kernel.  I'll continue to take a look at this.  The first thing I notice in the drmgr log is that drmgr is attempting to online sections 32 - 63 using files /sys/devices/system/memory/memory[32 - 63] which don't exist in sysfs according to the tar file provided.
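
A check like this can be automated. The sketch below scans drmgr log text for the "Marking ... online" lines, whose format is taken from the log excerpt in the first post, and reports which of those memory blocks have no directory in sysfs. The function names are made up for illustration, and where the log text comes from is left to the caller.

```python
import os
import re

# Matches drmgr log lines such as:
#   Marking /sys/devices/system/memory/memory19 online
MARK_RE = re.compile(r"^Marking (\S+/memory(\d+)) online$")


def blocks_drmgr_tried(log_text):
    """Return (index, sysfs path) for every block drmgr tried to online."""
    found = []
    for line in log_text.splitlines():
        m = MARK_RE.match(line.strip())
        if m:
            found.append((int(m.group(2)), m.group(1)))
    return found


def missing_blocks(log_text, exists=os.path.isdir):
    """Indices of blocks drmgr tried to online that are absent from sysfs.

    `exists` is injectable so the check can also be run against an
    unpacked copy of the sysfs tree instead of the live system.
    """
    return [idx for idx, path in blocks_drmgr_tried(log_text)
            if not exists(path)]
```

Run on the partition itself, missing_blocks(log_text) returning a non-empty list would mean drmgr is computing memoryN names that the kernel never created.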
    • nfont
      2 Posts

      Re: DLPAR Memory Add fails with Debian on Power 720 and 740

      ‏2012-02-27T20:21:13Z  in response to jjgarcia
       This appears to be the same issue that was fixed by commit http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=54f23eb7ba7619de85d8edca6e5336bc33072dbd
       
      This has the same error message signature "section number 1008 page number 4096 not reserved, was it already online?" that I remembered seeing when making the update. You mentioned rebuilding the kernel; can you verify that this fix is in the source you built?
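
If the kernel was built from a git checkout, one way to verify this is to ask git whether the fix commit is an ancestor of the checked-out HEAD. The sketch below wraps `git merge-base --is-ancestor`. The function name and the injectable `run` parameter are illustrative, and this does not help if the source came from a tarball, which carries no history.

```python
import subprocess

# Commit hash taken from the link above.
FIX_COMMIT = "54f23eb7ba7619de85d8edca6e5336bc33072dbd"


def tree_contains(repo_dir, commit=FIX_COMMIT, run=subprocess.run):
    """True if `commit` is an ancestor of HEAD in the git checkout at repo_dir.

    `git merge-base --is-ancestor` exits 0 for "yes" and 1 for "no"; any
    other status (for example, an unknown commit) is raised as an error.
    """
    result = run(
        ["git", "-C", repo_dir, "merge-base", "--is-ancestor", commit, "HEAD"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if result.returncode in (0, 1):
        return result.returncode == 0
    raise RuntimeError(result.stderr.decode(errors="replace").strip())


# Typical use, with a hypothetical checkout path:
#   print(tree_contains("/usr/src/linux"))
```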
  • jjgarcia
    4 Posts

    Re: DLPAR Memory Add fails with Debian on Power 720 and 740

    ‏2012-02-28T19:15:58Z  in response to jjgarcia
    I finally solved the issue. As nfont said, there was a bug in the kernel version that I was using, so I downloaded and compiled the latest stable version of the kernel, which has this issue fixed, and every DLPAR operation now works perfectly with Debian on a POWER 720 and 740.
     
    Thanks to all for the help. 
    • jscheel
      45 Posts

      Re: DLPAR Memory Add fails with Debian on Power 720 and 740

      ‏2012-02-29T14:20:27Z  in response to jjgarcia
      I'm happy to hear that you have things working.  I would strongly encourage you to do a couple of key things:
      1. Review the Debian 6 on Power7 LPAR wiki to ensure that your insights are shared with others, especially in the area of using the DLPAR tooling.
      2. Make sure to reach out to the Debian development community to get fixes included, packages rebuilt, etc.
      Thank you for your efforts.  It takes contributions such as yours to make a community work!
      • jjgarcia
        4 Posts

        Re: DLPAR Memory Add fails with Debian on Power 720 and 740

        ‏2012-02-29T15:54:28Z  in response to jscheel
         Hi jscheel!
         
        I just updated the Debian wiki with the process to set up DLPAR on Debian.