IBM PureFlex System virtualization configuration for optimal functional capability

PureFlex virtualization configuration for IBM Power Systems

IBM PureFlex System is an integrated infrastructure solution. PureFlex System combines compute, storage, networking, virtualization, and management into a single infrastructure system.

IBM Flex System Manager is a systems management appliance that drives efficiency and cost savings in the data center. IBM Flex System Manager has full, built-in virtualization support of servers, storage, and networking to speed provisioning and increase resiliency.

Flex System Manager functions might fail because of incorrect or missing virtualization configuration, and environmental and configuration issues are difficult to diagnose and fix. This tutorial covers the common configuration validations for the following resources:

  • Basic configurations (Base Flex System Manager server validations)
  • Virtualization configurations (IBM Flex System Manager VMControl™ validations)
  • Other configurations (Other advanced manager validations)
  • Host and logical partitioning (LPAR) validations
  • Virtual I/O Server (VIOS) for IBM Power Systems™ validations

This tutorial focuses on the basic validations for the IBM Power® virtualization environment as supported in PureFlex System.

These configurations are required to use PureFlex System at its full capacity in a virtualized environment. IBM PureFlex System supports the following architectures:

  • Power Systems
  • KVM
  • VMware

This tutorial is intended for anyone who works with the PureFlex System solution using Flex System Manager, including customers, IBM independent software vendors (ISVs), development and test teams, and Lab Based Services (LBS) teams.

Figure 1. Pure Flex System

Figure 1 shows a rack that can hold multiple chassis. Each chassis has multiple compute nodes and one Flex System Manager, with storage and network switches connected to it.

Figure 2. Flex System Manager with SAN storage and Power node

In Figure 2, one Flex System Manager acts as the management entity and has both storage area network (SAN) storage and an IBM Power server discovered. The Power server has a VIOS partition with Common Agent Services (CAS) and the common repository subagent installed.

Enabling infrastructure

In the PureFlex System environment, the following components should be preinstalled:

  1. Flex System Manager
  2. IBM Flex System Manager VMControl
  3. Other advanced plug-ins such as network control, storage control, and so on

The following steps validate the enabling infrastructure.

  1. Flex System Manager status: First, check the status of the Flex System Manager server, which should be up and running. The smstatus command reports the status of the Flex System Manager server.

    Listing 1. smstatus command output

    USERID@c598n34:~> smstatus
    Active
    USERID@c598n34:~>

    Corrective action:

    • Inactive state: Run the smstart;smstatus -r command to start the server and monitor its status until it becomes Active.
    • Error state: Run the smstop -f;smstart;smstatus -r command to forcefully stop the server, start it again, and monitor its status.
  2. IBM Flex System Manager server version: Check whether the required Flex System Manager server version and build number are installed.

    Listing 2. Flex System Manager server version output

    USERID@c598n34:~> cat /opt/ibm/director/version.srv
    smcore.ver=6.3.4
    smcore.level=usmi13c-0033
    smcore.build=0001
    smcore.build.date=9-23-2013
    smprod.ver=6.3.4
    smprod.level=usmi13c-0033
    smprod.build=0001
    smprod.build.date=9-23-2013
    component=IBM Systems Director Server
    version=6.3.4
    version.displayable=6.3.4
    date=9-23-2013
    level=usmi13c-0033
    USERID@c598n34:~>

    Corrective action:

    If you are unable to find the version of the Flex System Manager server, check the network connection and ensure that the state of the management server is active.
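
The version check can also be scripted. The following Python sketch parses key=value text in the format shown in Listing 2 and compares the version entry against a required minimum. The function names and the required version passed in are illustrative assumptions, not part of Flex System Manager.

```python
def parse_properties(text):
    """Parse key=value lines (as in version.srv) into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def version_tuple(version):
    """Turn a dotted version such as '6.3.4' into (6, 3, 4) for comparison."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(props, required, key="version"):
    """Return True if props[key] exists and is at least the required version."""
    return key in props and version_tuple(props[key]) >= version_tuple(required)


# Abbreviated sample taken from Listing 2.
SAMPLE = """\
smcore.ver=6.3.4
smcore.level=usmi13c-0033
component=IBM Systems Director Server
version=6.3.4
level=usmi13c-0033
"""

if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    print(props["version"], meets_minimum(props, "6.3.4"))
```

On a live system, the text would come from reading /opt/ibm/director/version.srv rather than from the embedded sample.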

  3. Flex System Manager version: Check whether the required Flex System Manager version is installed.

    Listing 3. Flex System Manager version output

    USERID@c598n34:~> lsconfig -V
    "version= Version: 1
    Release: 3
    Service Pack: 1
    FSM Build level 20130923-1206 2013_266.Mon Sep 23 12:09:189 CDT 2013
    ","base_version=V1R3
    "
    USERID@c598n34:~>

    Corrective action:

    Verify the network connection and ensure that the state of the management node or appliance is active.

  4. Flex System Manager communication state, access state, and protocols for both server and OS MEP: Check whether the communication state, access state, and protocols for the Flex System Manager server and the operating system managed end points (MEPs) are in a good state.

    Listing 4. Expected Flex System Manager server and operating system MEP status

    USERID@c598n34:~> smcli lssys -i 10.32.73.34 -A AccessState,CommunicationState,Protocols
    c598n34.pokprv.stglabs.ibm.com: Unlocked, 2, Unsupported
    c598n34.pokprv.stglabs.ibm.com: Unlocked, 2, { 'CAS' }
    USERID@c598n34:~>

    Supported communication states:

    The following list provides the valid communication states that are supported for an MEP. The expected communication state is 2.

    0 – Unknown

    1 – Not available

    2 – Communication OK

    3 – Lost communication

    4 – No contact

    5 – Communication untrusted
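
The state codes listed above lend themselves to a simple lookup when the lssys output is checked by a script. The following Python sketch parses lines in the format shown in Listing 4 and reports any MEP whose access state is not Unlocked or whose communication state is not 2. The function names are illustrative, and the line format is assumed from the listing only.

```python
import re

# Communication state codes as listed in this tutorial.
COMMUNICATION_STATES = {
    0: "Unknown",
    1: "Not available",
    2: "Communication OK",
    3: "Lost communication",
    4: "No contact",
    5: "Communication untrusted",
}


def parse_lssys_line(line):
    """Split one 'name: AccessState, CommunicationState, Protocols' line."""
    name, _, rest = line.partition(": ")
    access, communication, protocols = (f.strip() for f in rest.split(",", 2))
    return {
        "name": name.strip(),
        "access": access,
        "communication": int(communication),
        # Protocols appear either as { 'CAS' } or as the word Unsupported.
        "protocols": re.findall(r"'([^']+)'", protocols),
    }


def find_problems(entry):
    """Return readable findings; an empty list means the MEP looks healthy."""
    problems = []
    if entry["access"] != "Unlocked":
        problems.append("access state is " + entry["access"])
    if entry["communication"] != 2:
        label = COMMUNICATION_STATES.get(entry["communication"], "unrecognized")
        problems.append(
            "communication state is %d (%s)" % (entry["communication"], label)
        )
    return problems


if __name__ == "__main__":
    for line in (
        "c598n34.pokprv.stglabs.ibm.com: Unlocked, 2, Unsupported",
        "c598n34.pokprv.stglabs.ibm.com: Unlocked, 2, { 'CAS' }",
    ):
        entry = parse_lssys_line(line)
        print(entry["name"], entry["protocols"], find_problems(entry) or "OK")
```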

    Corrective action:

    1. If the access state of the server or operating system MEP is Locked, request access to the locked MEP. You can use the smcli accesssys command to do this.
    2. If the communication state is not 2 (Communication OK), collect inventory on both the server and the operating system MEP. You can use the smcli collectinv command to do this.
    3. If the required protocols are not displayed, collect inventory, then revoke access and request access again.
  5. VMControl activation status: Check whether VMControl is activated or deactivated.

    Listing 5. VMControl activation output

    USERID@c598n34:~> smcli lsmgrs
    Network Control : Activated
    VMControl : Activated
    Storage Control : Activated
    USERID@c598n34:~>

    Corrective action:

    • Deactivated state: Run the smcli activatemgrs VMControl command to activate the VMControl plug-ins. Restart the Flex System Manager server after running this command.
  6. VMControl version: Check if the required version of VMControl is installed.

    Listing 6. VMControl version output

    USERID@c598n34:~> cat /opt/ibm/director/lwi/runtime/vmc/VMControl.properties
    #
    #Fri Sep 20 14:38:39 EDT 2013
    com.ibm.vmc.build.buildDate=September 20, 2013 2\:38\:39 PM EDT
    com.ibm.vmc.build.number=187
    com.ibm.vmc.build.timestamp=1379702319363
    com.ibm.vmc.install.version=2.4.4.0-201309201405
    USERID@c598n34:~>

    Corrective action:

    If the command above produces no output, there might be a problem with the installed Flex System Manager appliance.

    Check that the following directory exists on Flex System Manager: /opt/ibm/director/lwi/runtime/vmc/. If it does not exist, you might have to reinstall the Flex System Manager build.

  7. Network control activation status: Check if the network manager is in the active state or the deactivated state.

    Listing 7. Network manager activation output

    USERID@c598n34:~> smcli lsmgrs
    Network Control : Activated
    VMControl : Activated
    Storage Control : Activated
    USERID@c598n34:~>

    Corrective action:

    • Deactivated state: Run the smcli activatemgrs Network Manager command to activate the network manager plug-ins. Restart the Flex System Manager server after running this command.
  8. Storage control activation status: Check if the storage manager is in the active state or the deactivated state.

    Listing 8. Storage manager activation output

    USERID@c598n34:~> smcli lsmgrs
    Network Control : Activated
    VMControl : Activated
    Storage Control : Activated
    USERID@c598n34:~>

    Corrective action:

    Deactivated state: Run the smcli activatemgrs Storage Manager command to activate the storage manager plug-ins. Restart the Flex System Manager server after running this command.
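
Because the VMControl, network control, and storage control checks all read the same smcli lsmgrs output, they can be combined into one scripted check. The following Python sketch parses output in the format shown in Listings 5, 7, and 8 and returns the managers that are not activated. The function names are illustrative, and the input is assumed to be the command output only, without the shell prompt lines.

```python
# Manager names as they appear in the lsmgrs listings of this tutorial.
REQUIRED_MANAGERS = ("VMControl", "Network Control", "Storage Control")


def parse_lsmgrs(output):
    """Map each manager name to its reported state."""
    managers = {}
    for line in output.splitlines():
        name, sep, state = line.rpartition(":")
        if sep and name.strip():
            managers[name.strip()] = state.strip()
    return managers


def not_activated(managers, required=REQUIRED_MANAGERS):
    """Return required managers that are missing or not in the Activated state."""
    return [name for name in required if managers.get(name) != "Activated"]


# Sample in the format of Listing 5, with one manager shown as deactivated
# for illustration.
SAMPLE = """\
Network Control : Activated
VMControl : Deactivated
Storage Control : Activated
"""

if __name__ == "__main__":
    print(not_activated(parse_lsmgrs(SAMPLE)))
```

Any manager that the sketch reports would then need the corresponding smcli activatemgrs command followed by a restart of the Flex System Manager server, as described in the corrective actions.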

IBM Power compute node validations

IBM Power compute nodes in PureFlex System should be discovered by Flex System Manager and should have storage zoning configured. Validate the following for each Power node:

  1. Host inventory: Check whether the inventory of the host is current. Host inventory should not be too old.

    Listing 9: lsresource output

    USERID@c598n34:~> smcli lsresource Server 7346 | grep LastInventoryDate
    Property: Name: LastInventoryDate Type: dateTime Value: 2013-02-28T02:38:30-04:00
    USERID@c598n34:~>

    Corrective action: If the inventory is more than 1 hour old, collect inventory on the IBM Power compute node again to retrieve the latest information about the host.
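
The one-hour rule can be checked mechanically. The following Python sketch extracts the LastInventoryDate value from output in the format of Listing 9 and compares it with the current time. The function names are illustrative, and the sketch assumes that the value is an ISO 8601 timestamp with a UTC offset, as in the listing.

```python
import re
from datetime import datetime, timedelta, timezone


def last_inventory_date(output):
    """Return the LastInventoryDate value as an aware datetime, or None."""
    match = re.search(r"Value:\s*(\S+)", output)
    return datetime.fromisoformat(match.group(1)) if match else None


def is_stale(collected, now=None, max_age=timedelta(hours=1)):
    """Return True if the inventory is older than max_age."""
    now = now or datetime.now(timezone.utc)
    return now - collected > max_age


# Sample line taken from Listing 9.
SAMPLE = (
    "Property: Name: LastInventoryDate Type: dateTime "
    "Value: 2013-02-28T02:38:30-04:00"
)

if __name__ == "__main__":
    collected = last_inventory_date(SAMPLE)
    print(collected.isoformat(), "stale" if is_stale(collected) else "current")
```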

  2. Firmware level: Power hosts should have the supported firmware level.

    Listing 10: IBM Power compute node firmware level

    #lslic -t sys -m Server-7895-22X-SN10F599A | ecnumber
    ecnumber=01AF763

    Corrective action: If this command does not return the firmware level, verify it by logging in to the Advanced System Management (ASM) console of the Power host. If the firmware level is not the latest, download the latest level from the IBM Fix Central website and apply the update.

  3. Host state: The Power host should be in active state.

    Listing 11: Host state

    hscroot@xhm2109:~> lssyscfg -r sys -m "pfm3128 8233-E8B-SN100BB7P" -F state
    Operating

    Corrective action: If the state of the Power host is Power Off, start the host manually from the Flex System Manager console by right-clicking the host and clicking Start.

  4. Host has one or more active VIOS instances: The Power host should have one or more active VIOS instances.

    Listing 12: lssyscfg command output

    USERID@c598n34:~> lssyscfg -r lpar -m Server-7895-22X-SN10F599A -F lpar_id,name,state
    6,c612n75,Running
    5,c612n77,Not Activated
    4,c612n247,Not Activated
    3,c612n251,Not Activated
    1,VIOS_9-114-146-112_599A,Running
    2,GA41-9-114-146-248,Running
    7,mix-9-114-146-247,Not Activated
    8,edit-9-114-146-247,Not Activated
    9,defect-9-114-146-250,Running

    The command above displays the state of all LPARs on the host. You must identify the VIOS in the output manually, based on its naming convention, and check its state.

    Corrective action: If the VIOS state is Not Activated, start it manually from the Flex System Manager console and run the command again to check the state.
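
Picking the VIOS out of the lssyscfg output can be scripted when a naming convention is in place. The following Python sketch parses lpar_id,name,state lines in the format of Listing 12 and returns the running partitions whose names contain a marker string. The function names and the "VIOS" marker are assumptions made for illustration; substitute whatever convention your environment uses.

```python
def parse_lpars(output):
    """Parse 'lpar_id,name,state' lines into a list of dicts."""
    lpars = []
    for line in output.splitlines():
        parts = line.strip().split(",")
        # Ignore prompt lines and anything that is not id,name,state.
        if len(parts) == 3 and parts[0].isdigit():
            lpars.append({"id": int(parts[0]), "name": parts[1], "state": parts[2]})
    return lpars


def running_vios(lpars, marker="VIOS"):
    """Return names of Running partitions whose name contains the marker."""
    return [
        lpar["name"]
        for lpar in lpars
        if marker.lower() in lpar["name"].lower() and lpar["state"] == "Running"
    ]


# Abbreviated sample taken from Listing 12.
SAMPLE = """\
6,c612n75,Running
5,c612n77,Not Activated
1,VIOS_9-114-146-112_599A,Running
2,GA41-9-114-146-248,Running
"""

if __name__ == "__main__":
    found = running_vios(parse_lpars(SAMPLE))
    print(found if found else "No running VIOS found")
```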

VIOS validations

VIOS validations are among the most critical validations for the PureFlex System environment. The VIOS is software that runs in a logical partition and facilitates the sharing of physical I/O resources between client LPARs within the server. The VIOS validations are as follows:

  1. CIM service status

    This reports the status of the CIM service running on the VIOS. The CIM service should be up and running for the VIOS to be managed by the Flex System Manager.

    Listing 13: Process for cimserver

    # ps -ef | grep cimserver
    root 10767 1 0 Sep25 ? 00:01:56 /usr/sbin/cimserver
    root 20492 20445 0 03:14 pts/16 00:00:00 grep cimserver

    Corrective Action: If CIM service is not running:

    /usr/bin/ssh -l <loginId> <ip> /usr/ios/cli/ioscli startnetsvc cimserver
  2. RMC service status

    Check the status of the Resource Monitoring and Control (RMC) service, which the Flex System Manager server plug-ins use to communicate with the IBM Power compute node.

    Listing 14: RMC service

    # lssrc -s ctrmc
    Subsystem         Group            PID          Status
     ctrmc            rsct             6619344      active

    Corrective action: If RMC is not running:

    /usr/bin/ssh -l <loginId> <ip> startsrc -s ctrmc
  3. SLP service status

    This checks for the basic services that are required for the Flex System Manager server to communicate with the VIOS.

    Listing 15: slp_query command output

    # slp_query --type=* --address=<VIOS_IP>
    0
    8
    59
    URL: service:management-software.IBM:platform-agent://9.12.31.76
    URL: service:wbem:http://9.12.31.138:5988
    URL: service:wbem:https://9.12.31.138:5989
    URL: service:wbem:https://9.12.31.76:5989
    ATTR: (template-url-syntax=https://9.12.31.76:5989)
    URL: service:wbem:http://9.12.31.76:5988
    ATTR: (template-url-syntax=http://9.12.31.76:5988)
    URL: service:management-software.IBM:usma://pva1076.pok.stglabs.ibm.com
    ATTR: (ip-address=9.12.31.76),(mac-address=aa.76.54.5b.f3.a),
          (tivguid=A42AD1481C7011E394FFAA76545BF304),(uid=7f8e6792373c6e72),
          (vendor=IBM),(System-Name=pva1076.pok.stglabs.ibm.com),
          (timezone-offset=-300),(version=6.3.2),(port=9510),(manager=9.37.74.106)
    URL: service:TivoliCommonAgent://pva1076.pok.stglabs.ibm.com:9510
    ATTR: (ca-uid=file:///var/opt/tivoli/ep/runtime/agent),(am-host=9.37.74.106),
          (ca-ips=9.12.31.76),(ca-basic-port=9510),(ca-cert-port=9510),
          (ca-version=1.4.2.4),(os-uid=A42AD1481C7011E394FFAA76545BF304)
    URL: service:service-agent://9.12.31.76
    ATTR: (service-type=service:management-software.IBM:usma,service:service-agent)

    Corrective action:

    If there is an error, it indicates that there might be a problem with the network. Check if the SLP service is running on the VIOS with the following command:

    ps -ef | grep slp

    If the SLP service is not running, start it and run the query again.

  4. CAS service status

    This checks the status of the CAS service running on the VIOS that is managed by Flex System Manager. Flex System Manager uses this service to communicate with the VIOS when performing operations.

    Listing 16: slp query output

    # slp_query --type=service:management-software.IBM:usma --address=9.12.31.76
    0
    1
    66
    URL: service:management-software.IBM:usma://pva1076.pok.stglabs.ibm.com
    ATTR: (ip-address=9.12.31.76),(mac-address=aa.76.54.5b.f3.a),
          (tivguid=A42AD1481C7011E394FFAA76545BF304),(uid=7f8e6792373c6e72),
          (vendor=IBM),(System-Name=pva1076.pok.stglabs.ibm.com),
          (timezone-offset=-300),(version=6.3.2),(port=9510),(manager=9.37.74.106)

    Corrective action: If the command fails with the "Failed to call ICoreAgent" error, there might be a problem on the network. The problem might be caused by one of the following issues:

    • The time difference between the agent and the server is greater than the given timezone-offset value.
    • The agent connector is not active.
    • There is a problem with the agent manager, or port 9510 is not open.

    To check whether port 9510 is open on the management server or the management node, enter the following command at a command prompt:

    telnet <ServerIP> 9510

    This is the default port for this service. If you are using a non-default port, verify the CAS service with that port number.
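
As a scripted alternative to the telnet command above, the same reachability test can be done with a plain TCP connection attempt. The following Python sketch uses only the standard library. The function name, the timeout, and the example address are illustrative assumptions; a successful connection shows only that the port accepts connections, not that the CAS service behind it is healthy.

```python
import socket


def port_is_reachable(host, port=9510, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical address; replace with the management server IP address.
    server_ip = "127.0.0.1"
    state = "open" if port_is_reachable(server_ip, 9510, timeout=2.0) else "not reachable"
    print("Port 9510 on %s is %s" % (server_ip, state))
```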

  5. VIOS to Flex System Manager ping

    This checks the availability of the VIOS endpoint by pinging the VIOS from Flex System Manager.

    Listing 17: Ping output

    # ping 9.12.31.76
    PING 9.12.31.76 (9.12.31.76) 56(84) bytes of data.
    64 bytes from 9.12.31.76: icmp_seq=1 ttl=236 time=24.0 ms
    64 bytes from 9.12.31.76: icmp_seq=2 ttl=236 time=24.0 ms

    Corrective action: If the ping fails, the connection between the management node (IP address <IP_Address>) and the VIOS cannot be established. Verify the network connection and ensure that it is not blocked by a firewall.

  6. VIOS license:

    This check is to find whether the license of the VIOS has been accepted.

    Listing 18: License check

    $ license
    The license has been accepted
    en_US Sep 13 2013, 07:37:10 0(padmin)

    Corrective action: The license must be viewed and accepted before the VIOS is used. If the output of the command above indicates that the license is not accepted, perform the following actions:

    Step 1: To view the license in the en_US locale, enter:

    license -view

    Step 2: To accept the license in the fr_FR locale, enter:

    license -accept -lang fr_FR
  7. VIOS common repository subagent status

    This checks whether the common repository subagent is installed on the VIOS that is used as the Image Control Point (ICP) to host image repositories.

    Listing 19: subagent output

    # ./lwiupdatemgr.sh -listFeatures | grep im.cr
    com.ibm.director.im.cr.agent.installer_9.9.9.9-201308122037 Enabled

    Negative Scenario

    # ./lwiupdatemgr.sh -listFeatures | grep im.cr
    com.ibm.director.im.cr.agent.installer_9.9.9.9-201308122037 Disabled
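
The positive and negative scenarios differ only in the final word of the feature line, so a script can tell them apart easily. The following Python sketch parses feature lines in the format of Listing 19 and reports whether the common repository subagent is present and enabled. The function names and the feature-name prefix are assumptions based on the listing only.

```python
def feature_states(output, prefix="com.ibm.director.im.cr"):
    """Map each matching feature name to its state (last word on the line)."""
    states = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith(prefix):
            states[parts[0]] = parts[-1]
    return states


def subagent_enabled(output):
    """Return True only if at least one matching feature exists and all are Enabled."""
    states = feature_states(output)
    return bool(states) and all(state == "Enabled" for state in states.values())


# Samples taken from Listing 19 and its negative scenario.
ENABLED = "com.ibm.director.im.cr.agent.installer_9.9.9.9-201308122037 Enabled"
DISABLED = "com.ibm.director.im.cr.agent.installer_9.9.9.9-201308122037 Disabled"

if __name__ == "__main__":
    for sample in (ENABLED, DISABLED, ""):
        print(subagent_enabled(sample))
```

An empty result is treated as a failure because it suggests that the subagent is not installed at all.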

    Corrective action: If the common repository subagent is installed on the VIOS but is not enabled, restart the agent on the managed system to enable the subagent by using the following command:

    /opt/ibm/director/agent/runtime/agent/bin/endpoint.sh restart
  8. VIOS firmware

    This checks the firmware level that the VIOS machine is running.

    Listing 20: VIOS firmware

    # lsconf | grep "Platform Firmware level"
    Platform Firmware level: AL780_006

    Corrective action:

    Step 1: Use a PuTTY session to log in to the VIOS partition.

    Step 2: Run the lsfware command. The output of this command displays the status of the installed firmware.

    Step 3: Obtain the firmware image file you want to install from the Fix Central website.

    Step 4: Use the File Transfer Protocol (FTP) to copy the file to the VIOS from a PC, or download it directly from the Fix Central website to the VIOS.

    • To use FTP, open an MS-DOS prompt on the PC and change to the directory where the image file was downloaded. From this directory, run the FTP command to connect to the VIOS, and log in as padmin with a valid password. The remote directory is /home/padmin.
    • Use the bin command to set the transfer to the image mode (binary).
    • After the transfer is done, use the bye command to exit FTP.

    Step 5: Verify that the firmware file is on the VIOS.

    Resume the previous PuTTY session, or start a new one if you closed it.

    If you transferred the file from your PC using FTP, it is stored in the /home/padmin directory by default. If you used FTP to create a directory, the file is in the directory that you created. Run the ls -l command against the relevant directory (either /home/padmin or /home/padmin/firmware).

    Verify that the firmware image file is present.

    Step 6: Ensure that all of the running virtual servers other than the VIOS are shut down. If they are not shut down, take the necessary action to shut them down. Do not shut down the VIOS.

    Step 7: Install the firmware. Issue the ldfware command to load the new firmware. The command syntax is:

    ldfware -file /home/padmin/<firmwareFileName>.img

    Or,

    ldfware -file /home/padmin/firmware/<firmwareFileName>.img

    Confirm the firmware installation using option 1. The blade server will now shut down and load the firmware. This takes approximately 10 minutes.

    Step 8: Confirm that the firmware has been updated. After the blade server restarts, reconnect using PuTTY. Issue the lsfware command and verify that the new firmware has been installed.

  9. VIOS inventory

    Check whether the VIOS inventory is up to date. VIOS inventory should not be too old.

    Listing 21: VIOS inventory

    USERID@c598n34:~> smcli lsresources Server <VIOS_OID> | grep LastInventoryDate
    Property: Name: LastInventoryDate Type: dateTime Value: 2013-09-30T09:49:24-04:00

    Corrective action: If the inventory is more than 1 hour old, collect inventory again to retrieve the latest information about the VIOS.

  10. VIOS GUID uniqueness

    If there are multiple VIOS instances, each VIOS instance should have a different globally unique identifier (GUID).

    Listing 22: slp_query command

    # slp_query --type=service:management-software.IBM:usma --address=9.12.31.76
    0
    1
    66
    URL: service:management-software.IBM:usma://pva1076.pok.stglabs.ibm.com
    ATTR: (ip-address=9.12.31.76),(mac-address=aa.76.54.5b.f3.a),
          (tivguid=A42AD1481C7011E394FFAA76545BF304),(uid=7f8e6792373c6e72),
          (vendor=IBM),(System-Name=pva1076.pok.stglabs.ibm.com),
          (timezone-offset=-300),(version=6.3.2),(port=9510),(manager=9.37.74.106)

    If the server manages more than one VIOS, run this command against every VIOS to verify that each GUID (the tivguid attribute) is unique.

    Corrective action: If the GUID is not found, or is the same as the GUID of another VIOS, delete the VIOS from the management node and run discovery again.

    One of the agent services [cimserver/cimlistener/tier1slp/Director Common Agent] might not be running on the managed system. Restart agent services using the stopsvc director_agent and startsvc director_agent commands on the managed system.

  11. VIOS UID uniqueness

    If there are multiple VIOS instances, each VIOS instance should have a different unique identifier (UID).

    Listing 23: slp_query command

    # slp_query --type=service:management-software.IBM:usma --address=9.12.31.76
    0
    1
    66
    URL: service:management-software.IBM:usma://pva1076.pok.stglabs.ibm.com
    ATTR: (ip-address=9.12.31.76),(mac-address=aa.76.54.5b.f3.a),
          (tivguid=A42AD1481C7011E394FFAA76545BF304),(uid=7f8e6792373c6e72),
          (vendor=IBM),(System-Name=pva1076.pok.stglabs.ibm.com),
          (timezone-offset=-300),(version=6.3.2),(port=9510),(manager=9.37.74.106)

    If the server manages more than one VIOS, run this command against every VIOS to verify that each UID is unique.

    Corrective action: If the UID of a VIOS is the same as the UID of another VIOS, delete the VIOS from the server, verify that the managed system is active and not blocked by a firewall, and run discovery again. Then, check the logs for more details.
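
Comparing identifiers across many VIOS instances by eye is error prone, so the GUID and UID checks can be scripted. The following Python sketch extracts the parenthesized key=value attributes from slp_query output in the format shown in the listings and reports any value shared by more than one VIOS. The function names are illustrative, and the second address in the example is hypothetical.

```python
import re
from collections import defaultdict


def parse_attrs(slp_output):
    """Collect (key=value) pairs from the ATTR lines of slp_query output."""
    return dict(re.findall(r"\(([^=()]+)=([^()]*)\)", slp_output))


def find_duplicates(outputs_by_host, key):
    """Return {value: [hosts]} for every value of key seen on more than one host."""
    seen = defaultdict(list)
    for host, output in outputs_by_host.items():
        value = parse_attrs(output).get(key)
        if value is not None:
            seen[value].append(host)
    return {value: hosts for value, hosts in seen.items() if len(hosts) > 1}


# Abbreviated ATTR line taken from the slp_query listings.
SAMPLE = (
    "ATTR: (ip-address=9.12.31.76),(mac-address=aa.76.54.5b.f3.a),"
    "(tivguid=A42AD1481C7011E394FFAA76545BF304),(uid=7f8e6792373c6e72),"
    "(vendor=IBM)"
)

if __name__ == "__main__":
    # 9.12.31.77 is a hypothetical second VIOS that reuses the same GUID.
    outputs = {
        "9.12.31.76": SAMPLE,
        "9.12.31.77": SAMPLE.replace("uid=7f8e6792373c6e72", "uid=0123456789abcdef"),
    }
    print("Duplicate GUIDs:", find_duplicates(outputs, "tivguid"))
    print("Duplicate UIDs:", find_duplicates(outputs, "uid"))
```

Use the key tivguid for the GUID check and uid for the UID check. A VIOS whose output lacks the attribute entirely is skipped by this sketch and would need to be reported separately.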

  12. VIOS communication state, access state, and protocols for both server and OS MEP

    This verifies that, for both the server and the OS MEP, the access state is Unlocked, the communication state is OK, and the expected protocols are listed, so that all operations can complete successfully.

    Listing 24: lssys command output

    USERID@c598n34:~> smcli lssys -i 9.12.32.189 -A "AccessState,CommunicationState,Protocols" -t OperatingSystem
    c618b-m1b6.pok.stglabs.ibm.com: Unlocked, 2, { 'CIM', 'SSH' }

    Supported communication states:

    The following list provides the valid communication states that are supported for an MEP. The expected communication state is 2.

    0 – Unknown

    1 – Not Available

    2 – Communication OK

    3 – Lost Communication

    4 – No Contact

    5 – Communication Untrusted

    Corrective action: If the output of the above command did not contain any of the above information:

    • Verify if a managed resource exists for the management node.
    • Verify if the access state of the managed resource is in the OK state.
    • Collect inventory on the chassis and the management server.
    • Verify the network connection and request access again.
  13. Flex System Manager-to-VIOS repository subagent communication status

    Verify Flex System Manager-to-VIOS repository subagent communication.

    Listing 25: getAgentInfo command output

    USERID@c598n34:~> smcli getAgentInfo 10.32.55.64
    getAgentInfo is called....
    Version: 2.4.3.0-201305151104
    API level: 2
    Bundle Info: May 15, 2013 11:04:54 AM EDT@65397
    USERID@c598n34:~>

    Negative scenario

    USERID@c598n34:~> smcli getAgentInfo 10.32.55.48
    getAgentInfo is called....
    Error: No CAS agent!
    Error: Can not get the ICP agent: 10.32.55.48!
    USERID@c598n34:~>

    Corrective action:

    • Verify whether the CAS agent is installed.
    • Verify whether the common agent subagent is installed.
    • Verify whether the CAS service is running.
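
The difference between the positive and negative getAgentInfo scenarios can be detected by looking for Error lines in the output. The following Python sketch assumes that the command prints one item per line, as suggested by Listing 25, and returns the agent version together with any errors. The function names are illustrative.

```python
def agent_version(output):
    """Return the value of the 'Version:' line, or None if it is absent."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Version:"):
            return line.split(":", 1)[1].strip()
    return None


def agent_errors(output):
    """Return the text of every line that starts with 'Error:'."""
    errors = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Error:"):
            errors.append(line[len("Error:"):].strip())
    return errors


# Samples based on Listing 25 and its negative scenario.
GOOD = """\
getAgentInfo is called....
Version: 2.4.3.0-201305151104
API level: 2
"""
BAD = """\
getAgentInfo is called....
Error: No CAS agent!
Error: Can not get the ICP agent: 10.32.55.48!
"""

if __name__ == "__main__":
    for sample in (GOOD, BAD):
        print(agent_version(sample), agent_errors(sample))
```

If any errors are returned, work through the corrective actions listed above.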

Zone=AIX and UNIX
ArticleID=953249
ArticleTitle=IBM PureFlex System virtualization configuration for optimal functional capability
publish-date=11192013