
Shared Storage Pool (SSP) Best Practice

How To


Summary

Shared Storage Pools are a VIOS feature from around 2015, and this document contains many hints and tips to help you set them up well and avoid common pitfalls.
The SSP virtualizes Fibre Channel SAN disks on the Virtual I/O Servers (VIOS) and presents them to Virtual I/O Clients over vSCSI.
It provides many advanced disk functions without needing support from the underlying disk subsystems.

Objective


Shared Storage Pools (SSP) have no official Best Practice and Frequently Asked Questions (FAQ) document, so I wrote this one.

This article is a work in progress and covers new SSP releases as new features arrive.

Environment

Shared Storage Pools operate within Power Systems servers from:
  • POWER7,
  • POWER8 and
  • POWER9
SSP is implemented within PowerVM and the VIOS, but you can also control the SSP from the HMC or PowerVC.

Steps

Best Practice from the SSP developers

Storage Best Practice

The SSP has your important data for running virtual machines - look after your data and protect it.

100) Only use RAID-protected devices in the storage pool.

  • Why? RAID is hard to avoid on modern disk subsystems and is regarded as essential these days - otherwise, a single disk spindle failure could wipe out your data.

101) Configure failure group (failgrp) mirroring with SSP

  • Why? Mirroring allows the SSP to survive a whole disk failure.

102) Monitor for events and failures of storage devices (for pool or repository) and replace immediately.

  • Why? You need to take action if/when your disk subsystem has problems - don't ignore them.

103) Monitor Shared Storage Pool space utilization and over-commitment regularly.

  • Why?  Your SSP gives you advance notice of problems - don't ignore it.
  • Add storage to the pool as soon as the utilization threshold is reached, or remove unused resources like LUs or Snapshots (see the sketch after this list).
  • Determine your comfort zone for over-commitment.
  • Review free space after every new virtual machine is deployed and fully operational.
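  • A minimal monitoring sketch, assuming the cluster and pool are both named mySSP (placeholder names); check the alert options against your VIOS level before relying on them:
    lssp -clustername mySSP                                                   # pool size, free space and over-commitment
    alert -list -clustername mySSP -spname mySSP -type threshold             # show the current free-space alert threshold
    alert -set  -clustername mySSP -spname mySSP -type threshold -value 25   # log an alert when free space falls to 25%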

104) Configure all client virtual machines with multi-pathing for disk access (that is dual VIOS)

  • Why? Probably the largest source of storage access problems is faulty or damaged Fibre Channel cables or an FC Switch power off. Multi-pathing keeps the SSP available.
  • One path through each VIOS.

105) Configure all VIOS with multi-pathing across FC disk adapters (assumes you have enough adapters)

  • Monitor path failures and take corrective actions.

Network Best Practice

The SSP is a cluster of VIO Servers - as with all clusters, a healthy network and consistent clocks are vital - so get this right first time.

200) Keep cluster communication on a network adapter that is not congested.

  • Do not share cluster communication adapter with client partitions (SEA).
  • Do not share cluster communication adapter with storage traffic (FCoE).

201) Build redundancy into the network configuration.

  • Configure Etherchannel.
  • Bind the logical interface over a channel that has multiple ports and is connected to multiple switches.
  • Configure multi-path routing.

202) Perform network administration only during the maintenance window.

  • If necessary, bring storage pool and virtual machines down during networking configuration change.

203) Synchronise the time zone and Clocks across the VIO Servers

  • Why? It makes consistency checks and cross-VIOS diagnostics much easier.
  • You can have wildly different dates and times across your SSP, but it confuses diagnostics and could complicate recovery (what was the time zone and time on the old, destroyed VIOS?).
  • They don't need sub-second synchronization; within a minute is fine.
  • When adding a VIOS to the cluster, make sure it has the right time zone and time.
  • If changing the time zone, reboot the VIOS and then add it back to the cluster.

204) Do not change the time-of-day clock of a node while it is active in the cluster.  No longer necessary from VIOS 2.2.5 onwards.

  • The clstartstop procedure below applies to VIOS releases before 2.2.5. From that release onwards, you can change the date and time at any time.  NEW 
  • Use
    clstartstop -stop -m $(hostname)
  • Change the clock by starting the ntpd or xntpd service, or use the date or ntpdate command.
  • Use
    clstartstop -start -m $(hostname)

205) Synchronize the clocks of all VIOS nodes in the cluster via the ntpd or xntpd daemon with the -x flag.

  • Synchronized clocks are not required for the cluster to operate properly, but they make administration and servicing of the cluster more convenient (a minimal setup sketch follows).
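  • A minimal sketch from the root shell (oem_setup_env), assuming a site time server called ntp.example.com (a placeholder) - substitute your own NTP source:
    ntpdate ntp.example.com                    # one-off clock step (node not yet active in the cluster, or VIOS 2.2.5+)
    echo "server ntp.example.com" >> /etc/ntp.conf
    startsrc -s xntpd -a "-x"                  # start the daemon with -x so the clock slews rather than jumps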

206) Network Setup and Routing -  NEW 

  • Network setup - no special requirements beyond normal network best practices are needed.
  • The VIOS uses all the regular network mechanisms to find the other named VIO Servers when you use the "cluster -addnode" command.
  • So you can use hostname lookup, the default gateway, multiple paths, and route commands to direct packets to specific network adapters and LANs, etc. (a quick check sketch follows this list).
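  • A quick health-check sketch from the root shell (oem_setup_env), assuming the other node is called vios2 (a placeholder name):
    host vios2          # does the name resolve (via /etc/hosts or DNS)?
    ping -c 3 vios2     # is it reachable?
    netstat -rn         # which interface and gateway carry that traffic?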

207) Isolating your SSP network traffic (if your network is unreliable at times) -  NEW 

  • Some SSP admins like to use a separate private network to isolate SSP traffic.  A private network stops user packet floods or network admin mistakes from disrupting the SSP cluster communication. Alternatively, you can use a quiet admin network - don't use a high-throughput, busy backup network, for obvious reasons.
  • All clusters need the best network reliability you can organize.
  • If the SSP network is down or slow, the SSP cluster suffers and so does disk access.  If it needs to manage space, like allocating blocks to thin-provisioned storage, it requires resource locking, which is performed over the network. Thick provisioning is less affected, as it does not often need to coordinate space management across the cluster.

208) Changing the SSP hostname or IP address?  NEW 

  • Changing the hostname used in the "cluster -addnode" command, or the underlying IP address carrying your running SSP cluster network traffic, is painful as it needs total SSP downtime.
  • So get it correct at the start.

General Best Practice

300) When updating the VIOS or third-party software (for example, changing the PowerPath configuration), bring the VIOS node down for maintenance.

  • Stop cluster services: clstartstop -stop
  • Always reboot after the maintenance work is complete.
  • Follow up by starting cluster services: clstartstop -start (a sketch of the full sequence follows this list).
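  • A minimal sketch of that sequence, run as padmin on the node being serviced and assuming the cluster is named mySSP (a placeholder) - check the clstartstop syntax at your VIOS level:
    clstartstop -stop -n mySSP -m $(hostname)    # take this node out of the cluster
    # ... apply the VIOS or third-party software update here ...
    shutdown -restart                            # always reboot after the maintenance
    clstartstop -start -n mySSP -m $(hostname)   # rejoin the cluster after the reboot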

301) Perform maintenance on one VIOS node in a pair at a time.

  • Always have the redundant pair on that frame available while the maintenance is going on.
  • Shut down the ‘good’ node after its redundant pair is back online and fully functional.
  • For more efficiency, you could streamline and do all ‘odd’ nodes at one time, and once they are up, switch to the ‘even’ nodes.

302) Always take a backup of the cluster configuration if anything changes.

  • Add or remove nodes
  • Add or remove disks
  • Change repository disks
  • Create new LPARs
  • Put viosbr in cron to be run frequently and archive the resulting backup images off the VIOS.
  • Check out the viosbr -autobackup option (new in VIOS 2.2.5+) to create VIOS and SSP configuration backups regularly - and don't forget to get them off the VIOS (a viosbr sketch follows this list).  NEW 
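  • A minimal sketch as padmin, assuming the cluster is named mySSP (a placeholder); confirm the -frequency/-numfiles options and the default /home/padmin/cfgbackups location against your VIOS level:
    viosbr -backup -clustername mySSP -file mySSPconfig                                  # one-off cluster configuration backup
    viosbr -backup -clustername mySSP -file mySSPconfig -frequency daily -numfiles 10    # scheduled daily, keep the last 10
    # then copy the resulting files from /home/padmin/cfgbackups off the VIOS (NFS, scp, ...)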

303) Build your Disaster Recovery policy around VIOS backup

  • You can use storage replication and recover the cluster at a remote site.
  • Test your recovery procedures occasionally.

304) Always collect VIOS snaps across the cluster shortly after any issue.

  • With VIOS 2.2.5+, you can use the cluster-wide snap command to capture all the VIOS in one command - see the clffdc manual page.  NEW 
  • Why? Because VIOS / SSP Support asks for them.

 

Nigel's Best Practice

400) SSP is grand in concept but simple to operate with a few commands, so Read the Flaming Manual (RTFM)

  • Why? Because then you know most of the answers.
  • The commands are for the VIOS and padmin user:
  1. The cluster command for cluster create, list, and status
  2. The lssp command for the pool free space
  3. The lu command for virtual disk create and control
  4. The pv command for controlling the LUNs in the LUN pool
  5. The failgrp command for creating the pool mirror
  6. The lscluster command for high-level view of the hdisk / LUN names
  • and ignore those ghastly -clustername, -sp, and -spname options in the syntax after you have created your SSP, as they were made optional in later SSP releases.

401) Get your hostnames, DNS and time, date, time zone correct and then create your SSP cluster

  • Why? It is typical of clustering software that fixing the configuration later is painful and error-prone

402) Put the IP addresses and hostnames in /etc/hosts in all the VIOS of the cluster and set your /etc/netsvc.conf file  NEW 

  • My /etc/netsvc.conf file contains a line like this:
hosts = local, bind
  • So the hostname is looked up first in the /etc/hosts file and then in DNS; if DNS fails, the VIOS can still find the other VIO Servers' IP addresses.
  • Why? That way, a DNS server outage does not damage the operation of the SSP (a small example follows this list).
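  • For example (the addresses and names below are placeholders), every VIOS in the cluster might carry:
    # /etc/hosts - one entry per VIOS in the cluster
    10.1.1.11   vios1
    10.1.1.12   vios2
    # /etc/netsvc.conf - local file first, then DNS
    hosts = local, bind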

403) DUPLICATE OF 204

404) Always use Dual VIOS with your SSP

  • Why?  To avoid Client virtual machine downtime during VIOS upgrades
  • OK, your small server might run with one VIOS, which IMHO means no production workloads, so you can shut down the VIOS whenever you want to perform maintenance.

405) Don't forget you can have more than one SSP

  • Why? You could have more than one SSP to limit the failure domain.
  • You could have different SSPs for different classes of work or data importance.
  • You can have multiple SSPs on the same server!  A single VIOS, however, cannot be in more than one SSP at the same time.
  • On one server, you could have one VIOS pair for production with access to the "production" SSP and a further pair of VIOS for test+dev with access to the "test+dev" SSP

406) Knock up a test SSP today - don't be shy!  It's simple.

  • After the storage team creates the LUNs, zone them to the VIOS - it is then three VIOS commands.
  • Assuming two VIOS called vios1 and vios2, a 1 GB repository disk called hdiska, and two 32 GB data disks called hdiskb and hdiskc:
    • On vios1:
      cluster -create  -clustername mySSP  -spname mySSP  -repopvs hdiska  -sppvs hdiskb  -hostname vios1
    • On vios1:
      cluster -addnode -clustername mySSP  -hostname vios2 
    • Now we have a cluster with two Virtual I/O Servers
    • On either VIOS create a mirror:
      failgrp -create -fg mirror2: hdiskc
      This command creates the mirror
  • And you are ready to allocate space to a virtual machine - say the virtual machine has the configured vSCSI adapter vhost5:
    • lu -create -lu vm1boot -size 8G -vadapter vhost5
    • List the newly created LU virtual disks and any others you have:
    • lu -list
    • Start your virtual machine and install your Operating System. To check the new cluster, see the sketch below.
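  • A minimal verification sketch as padmin, assuming the cluster name used above:
    cluster -list                            # cluster name and ID
    cluster -status -clustername mySSP       # state of the pool and of each VIOS node
    lssp -clustername mySSP                  # pool size and free space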

407) How many LUNs for the Shared Storage Pool? Answer: many.

  • Have at least 8 LUNs (even for a simple test SSP so you practice with a non-trivial set of LUNs) & preferably more like 16 to 64 for each failgrp
  • Why? To allow concurrent I/O - when performing more than trivial I/O
  • 8 LUNs to 16 LUNs assumes a fairly small SSP of a handful of TBs.
  • We need to give the VIOS multiple concurrent paths for the I/O and definitely not have 1 or 2 gigantic LUNs.

408) Create a failgrp (mirror) onto a different back-end disk device and don't mix up the LUNs from one disk subsystem with another

  • Why? To protect your SSP against a whole back-end disk subsystem failure or a loss of communication with it.
  • Double-check that the LUNs from one disk subsystem are in one failure group and those from the second disk subsystem are in the other failure group.

410) Have a spare Repository disk on a different back-end device (for emergency use)

  • Why? If the SSP reports a Repository disk failure or an I/O error to it, then you want to rebuild the Repository on a spare LUN immediately. So have a spare LUN ready, rather than asking the storage team to provide a new LUN and zone it in - that could take days, weeks, or months!
  • You can also test the Repository disk failure recovery process whenever you like.

411) Make your Repository disks 1 GB and 2 GB

  • Why?  They don't need to be even that large but who can be bothered with less than 1 GB these days!
  • The disk sizes are easy to list:
padmin:  lspv -size
  • Keeping two repository disks of slightly different sizes makes me aware which one is which and reduces the risk of confusing them with much bigger data LUNs or even internal disks.

412) Monitor your VIOS error logs regularly.

  • Recent HMC event logging includes the important SSP errors and alerts so monitoring them can be simple.
  • The SSP informs you when there are problems with SSP data, SSP Repository, and general network issues.
  • On ALL the VIOS use:
    errlog -ls

413) Back up your VIOS SSP setup to an off-site location (you need to decide the right option for your environment):

  • It is simple to do but get the resulting files off the VIOS. I use NFS for the backup files.

414)  Always install the latest VIOS level

  • Why? a) It removes already-fixed bugs and b) it gives you the latest features.

415) Use the same SHORT hostnames for the cluster and pool: 

  • Why? The VIOS works out the fully qualified hostnames - that is why we have DNS!

416) Write your own lu -list command

  • Why? The default lu command output is painful, as the variable-length LU names (first column) mean the column output is a wreck.  Also, on what planet do you want the truncated LU UDID hexadecimal numbers?
  • Decide what you want - the lu -list output makes it simple for you to extract it (see the sketch after this list).
  • Take a look at nlu (see the AIXpert Blog: SSP Hands-On Fun with LU Virtual Disks by Example).
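  • A minimal filtering sketch, assuming the default lu -list column layout (LU name first, size in MB second) and that awk and sort are available in your shell - adapt the field numbers to your VIOS level:
    lu -list | awk '$2 ~ /^[0-9]+$/ {printf "%-40s %10s MB\n", $1, $2}' | sort    # name and size, sorted by name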

417) SSP is fast and can keep up with NPIV

  • So don't let those NPIV bigots look down on you & remember you get to go home earlier than them :-)

418) Converting an AIX virtual machine based on old vSCSI or internal disks to SSP is simple.

  • Why? The virtual machine gets much higher I/O and is LPM ready.
  1. Add an LU virtual disk of the same size as the current disks to the AIX virtual machine.
  2. Run the cfgmgr command to find the new LU.
  3. Add the disk to rootvg; I use smitty.
  4. Use the migratepv command to move the data live to the SSP LU.
  5. Run bosboot and set the bootlist.
  6. Remove the old disks from the rootvg volume group, then remove them completely with rmdev.
  7. For a simple 8 GB rootvg, it is about 10 minutes' work.
  8. See YouTube video 9 in the list below, and the command sketch that follows this list.
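  • A sketch of those steps inside the AIX virtual machine, assuming the old disk is hdisk0 and the new SSP LU appears as hdisk1 (placeholder names):
    cfgmgr                        # discover the new vSCSI disk (the SSP LU)
    extendvg rootvg hdisk1        # add it to rootvg
    migratepv hdisk0 hdisk1       # move all the data, live, to the SSP LU
    bosboot -ad /dev/hdisk1       # make the new disk bootable
    bootlist -m normal hdisk1     # boot from it next time
    reducevg rootvg hdisk0        # remove the old disk from rootvg
    rmdev -dl hdisk0              # delete the old disk definition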

418) You can manually restart an SSP-based virtual machine from a now-dead server with a couple of commands on an alternative server.

  • SSP could be regarded as a cheap but 100% manual alternative to high availability solutions.
  • You need to know the virtual machine size details and LU names plus have the resources on a VIOS in the same SSP.
  • A nice "Get out of jail free!" card during an emergency.
  • See YouTube video 8 in the list below, and the mapping sketch that follows this list.
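  • A minimal sketch, assuming the virtual machine's boot LU is called vm42boot and the replacement LPAR's vSCSI adapter on this VIOS is vhost7 (both placeholder names):
    lu -list                                  # confirm the LU is visible from this VIOS
    lu -map -lu vm42boot -vadapter vhost7     # attach the existing LU to the replacement LPAR
    # repeat the mapping on the second VIOS of the pair, then activate the LPAR from the HMC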

419) You are allowed to dd SSP LUs for backup and for moving them between different SSPs - look in /var/vio/SSP/spiral/D_E_F_A_U_L_T_061310/VOL1

  • Be careful and practice on a test SSP first!
  • Use a dd command block size of 64 MB for faster I/O speeds. Hint: dd bs=64M (see the sketch after this list).
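  • A minimal sketch from the root shell (oem_setup_env); the LU file name below is a placeholder - each LU appears in that directory as a file (typically named after the LU), so list the directory first:
    cd /var/vio/SSP/spiral/D_E_F_A_U_L_T_061310/VOL1
    ls -l                                                      # find the exact file for your LU
    dd if=vm42boot.1234abcd of=/backup/vm42boot.img bs=64M     # copy it out with a large block size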

420) No one knows with certainty what "061310" means!

  • I think it is a butchered American date format from the first SSP. So that is 13th June 2010 - it was not a Friday!

Frequently Asked Questions

 Question 1: What is the simplest way to increase the SSP pool size?

  • Assuming a reasonable number of LUNs, you can ask the storage team to double the size of all the data LUNs.
  • The SSP spots the change in a few minutes and makes use of the new space automatically.
  • No further moving of data chunks is necessary, and an extended LUN stays on the same disk spindles.
  • You can see the change in the free pool space stats with the "lssp" command or Nigel's "npool" script.

Question 2: When does SSP support NPIV?

  • We get this question a lot and we laugh as it is impossible.
  • With NPIV, the LUN is mapped straight to the client virtual machine. It is not online to the VIOS. The space on the LUN can't be shared across virtual machines by the VIOS.

Question 3: When is SSP phase ?? out and what are the prime features?

  • SSP phase 5 arrived in Q4 2015 with VIOS 2.2.4+; key features were:
    • SSP Tiers (multiple pools, only better)
      • 9 tiers (think disk grouping - not a hierarchy of levels)
      • System tier (with metadata) or User tier (data only)
      • failgrp (mirroring) at the tier level [was whole-pool level]
      • tier -create -tier blue: <list of LUNs>
      • lu -create -lu fred -size 32G -tier blue
    • SSP LU move - move a LU between tiers
      • Command: lu -move -lu vm42 -dsttier blue
    • SSP LU resize (grow only, saves admin time)
      • Command: lu -resize -lu vm42 -size 128G
    • SSP + Tier support in the HMC Enhanced+ view
      • VIOS alerts escalated to the HMC for Tiers
  • SSP phase 6 arrived in Q4 2016 with VIOS 2.2.5+; key features were:
    • Increased SSP VIOS nodes from 16 VIOS to 24 VIOS
    • Full support for the original DeveloperWorks SSP Disaster Recovery software (see the viosbr -dr command options - also viosbr -autobackup for automatic backups).  The DeveloperWorks website was removed in 2020.
    • RAS (Reliability, Availability & Serviceability)
      • Cluster-wide snap automation - the clffdc command
      • Asymmetric network handling - fewer VIOS node ejections from the cluster
      • Lease by clock tick - the VIOS date and time can be changed without stopping the SSP on that VIOS (no clstartstop needed)
    • lu -list -attr provisioned=true (or false) = shows which LUs are (or are not) mapped to a virtual machine
    • Further HMC GUI support for SSP - arrives with HMC 860
    • Performance & Capacity Metrics (SSP I/O performance stats at SSP and VIOS level)
  • Future SSP phases?  IBM does not discuss unannounced products in public.

Question 4: What is the overhead of SSP compared with NPIV?

  • Recent like-for-like testing between the two with 32 KB blocks by the IBM Montpellier benchmark centre came to a draw.
  • The latency to the disks was the same.
  • Note for small virtual machines I would expect an SSP to be faster because it has an advantage:
    • NPIV would have one SAN LUN at the back end and a single queue depth
    • Even a small SSP is likely to have 16 or more LUNs. More LUNs = more bandwidth.
  • When running tens of thousands of I/Os per second, the SSP does take more CPU cycles in the VIOS. You have to balance that against the SSP offering fast disk space allocation and simple disk management, compared with the more complex per-virtual-disk setup effort of NPIV.

Question 5: Which is strategic: LUN over vSCSI, NPIV over vFC or Shared Storage Pools?

  • They are all strategic and supported long term, regardless of what your storage team claims :-)
  • Shared Storage Pools was first available (public beta) in 2010 and IBM is continuing to develop new features and RAS with every release.

Question 6: How much does Shared Storage Pools cost?

  • SSP is a part of the Virtual I/O Server and no additional cost.

Question 7: How can I monitor SSP performance?

  • I recommend: njmon for VIOS, nimon for VIOS, or the older nmon tools.
  • Also, consider running the VIOS Advisor (the command is called "part") for instant tuning advice, and watch the SSP free space (see the sketch below).
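  • For example, as padmin (assuming a 30-minute recording interval; check the part options at your VIOS level):
    part -i 30     # run the VIOS Performance Advisor for 30 minutes, then download and open the report it produces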

Question 8: That single Repository is a single point of failure, surely?

  • Joke: "Please don't call me Shirley!"
  • No, it is not a single point of failure. It can be rebuilt from scratch easily and every VIOS has a copy on its own disk.
  • But it is worth monitoring the VIO Servers for Repository errors, so that you can run the command to make a new one - and have a spare LUN ready.
  • See Best Practices 410 and 411.
  • See AIXpert Blog 6 in the list below.

 

Question 9: What are the current limits?

  • Shared Storage Pools phase 4, which means from Q4 2014 and VIOS 2.2.3+
    • 16 VIOS in the cluster - normally up to 8 servers with the default dual VIOS.
  • Shared Storage Pools phase 6, which means from Q4 2016 and VIOS 2.2.5+ -  NEW 
    • 24 VIOS in the cluster - normally up to 12 servers with the default dual VIOS. -  NEW 
  • A VIOS can be in only one SSP. 
  • If you have different pairs of VIOS for different uses (like production and test), then each pair of VIOSs can be in a different SSP.

Feature                                      Minimum    Maximum
Number of VIOS nodes in the cluster          1          24 (VIOS 2.2.5+)  NEW 
Number of LUNs in the pool                   1          1024
Number of virtual disks (LUs) in the pool    1          8192
Number of client LPARs per VIOS node         1          200
Size of each LUN in the pool                 10 GB      16 TB
Total pool size                              10 GB      512 TB *
Virtual disk (LU) size from the pool         1 GB       4 TB
Number of repository disks                   1          1
Capacity of the repository disk              512 MB     1016 GB

* If you need more than 512 TB, get in touch - it is a testing limit.
  • These numbers can change in future releases.

Question 10: What disk types are supported for Shared Storage Pool use?

  • All disk subsystems with LUNs supported over Fibre Channel by the Virtual I/O Server (VIOS) are supported for Shared Storage Pools.

Question 11: What disk types are NOT supported for Shared Storage Pool use?

  • The following are known not to be supported:
    • Internal disks (all disks must be writable from every VIOS)
    • NFS is not supported
    • GPFS is not supported
    • iSCSI is not supported
  • These file systems or disks are noticed when you first try to create the SSP with them, and the command fails.

Question 12: What is the chunk or minimum allocation size of a LU virtual disk?

  • 1 MB  (was previously incorrectly stated as 64 MB)
  • Use a dd command block size of 64 MB for faster I/O speeds. Hint: dd bs=64M

Question 13: What is the value proposition of the Shared Storage Pool?

  • Rather a marketing question. The highlights are:
    1. Enormous reduction in administration time.
    2. Independence from underlying SAN technology & team!
    3. Subsecond: virtual disk allocate and connect with lu command: create, map, unmap, and remove
    4. Subsecond: Snapshot: create/delete/rollback to allow rapid removal of failed upgrades
    5. Autonomic disk mirrors & resilver with zero virtual machine effort
    6. Live Partition Mobility (LPM) ready by default
    7. HMC GUI for fast and accurate dual VIOS disk setup - No more: VIOS slot numbers, Cnn, or vhosts
    8. Simple Pool management: pv & failgrp, lssp, alert, and VIOS logs
    9. Disaster Recovery capability to rebuild a virtual machine quickly
    10. Adoption of older LPARs with older or internal disks - giving them an I/O speed boost and LPM
    11. PowerVC ready

Question 14: Sizing the VIOS for SSP?

  • The minimum VIOS sizing is something like 1 CPU and 4 GB of RAM for a VIOS with light disk I/O serving a few handfuls of virtual machines.
  • The same is true for a VIOS running SSP.
  • For high I/O disks rates, increase the VIOS CPU capacity - I have no "rule of thumb" specific to SSP.
  • Monitor the VIOS resource use as your workload increases and regularly use the VIOS Advisor too.

Question 15: Do failure groups (that is the SSP mirrors) add an overhead with double disk writes to the mirrors?

  • The simple answer is no.
  • The failure group gives you protection from a whole disk subsystem failure or the total failure of communication paths to it.
  • The alternative is two disks, as seen by the client virtual machine, with double writes through the VIOS - failure groups are more efficient.

Question 16: Do SSPs play in the PowerHA (HACMP) world?

  • Yes, they are fully compatible and work fine.
  • The SSP virtual disks (LUs) can be available and online on both sets of VIOS, so there is no problem at that level, and if there is an issue, the VIOS creates error messages.
  • The LU can be accessed by only one host server and the active virtual machine mapped to it.
  • PowerHA can organise which virtual machine has access to the LU as regular resource control during a takeover or failover.

Question 17: Can I migrate working virtual machines to use SSP with zero downtime?

  • For AIX, it is simple due to functions like LVM and migratepv.
  • See the YouTube video Migrating to Shared Storage Pool (SSP3) & then LPM
  • For Linux, it is more complicated, as there are many distributions and many disk volume managers to consider that may or may not support live disk changes, but your local Linux guru can help.

Question 18: Linux (including Linux on POWER) unfortunately has some "interesting challenges" with mirrored boot disks. It can be done, but it is not as trivial as with AIX (mirrorvg, bosboot, bootlist). Can SSP help Linux?

  • Absolutely.
  • As the SSP mirroring is done at the VIOS SSP level, invisibly to the client virtual machine, the Linux, AIX, or IBM i operating system is not aware of the mirroring.
  • You don't have to go checking for stale mirror copies on every virtual machine - SSP automatically recovers the mirror for you.
  • Thus mirroring is made simple and done only once (use the failgrp command) and NOT at every new client virtual machine.

Question 19: I have a virtual machine that already has a LUN via vSCSI or NPIV, and I would like to add the virtual machine and LUN to the SSP. Can I do this without migrating the data?

  • Yes, it is possible due to the new importpv command!
  • The importpv command works regardless of:
    • The operating system (AIX, Linux on POWER or IBM i).
    • The LUN content like raw access data or containing a file system.
  • Briefly:
  1. Stop the virtual machine that is running NPIV or vSCSI
  2. Make the LUN available to all VIO Servers of the SSP (if it was NPIV then remove the NPIV setting from the VIOSs)
    Warning: This operation can be significant SAN Zoning work.
  3. Use this command to check the LUN is available on all VIO servers:
     pv -list -capable
  4. Assuming that, in the VIOS you are working on, the LUN is hdisk42, you want the LU virtual disk to be called vm99boot, and the virtual machine has the vSCSI virtual adapter vhost44:
  5. Run as the padmin user:
    importpv hdisk42:vm99boot
  6. Once complete, hdisk42 is in the SSP pool and there is a new LU in it called vm99boot, which contains all the original disk blocks.
  7. Map the LU to the virtual machine with:
    lu -map -lu vm99boot -vadapter vhost44

    If you have a dual VIOS configuration: Remember to map this LU on the other VIOS.
  8. Then, you can boot the virtual machine and it now gets the virtual disk via the SSP.
  9. By the way, it is now LPM ready.

Question 20: What is a good or bad use of a Tier?
Disks with different attributes can be separated into groups = tiers.  NEW 

  • Good use: the tiers are groups of LUNs based on some attribute of the disks.  So it is about the underlying disks:
  1. Fast, medium, slow = highlight that the LUNs have different performance. Perhaps flash drives, regular LUNs from your current storage, and LUNs from older SAN disk units (a little slower but still good)
  2. Vendors = IBM, HDS, EMC = remind yourself of the underlying disk unit vendor and performance implications!
  3. Prod, in-house, test = make sure you know different policies are in place for the important data - might well match the vendor or performance categories in point 1 or 2.
  4. V7000a, V7000b = you know explicitly which SAN disk unit the data is in, to reduce risk and aid problem diagnosis.
  5. Room101 & Room102, datacentre1 & datacentre2 = clarify where (geographically) the LUNs are placed.
  6. Remote-mirror, local-mirror, unmirrored = separate disks that are remotely mirrored (a failgrp in SSP terms), locally mirrored, or not mirrored.
  7. Bad use of an SSP tier example: pointlessly splitting groups of disks into tiers based on their content - splitting reduces performance due to having fewer spindles to spread the I/O across.  I was asked: Is this data split a good idea?
    • The rootvg & data volume groups
    • The RDBMS data & logs
    • The answer is "no".  One or other tier's disks would be busier than the other tier meaning reduced performance for that data. One tier combining these disks operates faster.
  • If you want to note the contents of a disk? Include it in the LU name.  For examples, a boot disk or database log then add to the end of the LU name "_boot" or "_log".

Question 21: Questions on LU management like:  NEW 

  1. How can I rename a LU?
  2. Can I backup & recover a LU? 
  3. Can I backup & recover a point-in-time LU snapshot on the VIOS, and would that work with thick- and thin-provisioned LUs?
  4. Can I move a LU from one SSP to another SSP?

Question 22: Are there user-supplied useful commands or scripts?  NEW 

  • Absolutely - check the AIXpert Blog Shared Storage Pools - Hands-On Fun with Virtual Disks (LU) by Example
  • For details of:
    1.  ncluster - displays status of all VIOSs and their software levels
    2.  nlu - improved "lu" command with useful columns that can be sorted
    3.  npool - outputs the storage pool use statistics and explanation
    4.  nmap - details the LU online and mapped status on all VIOSs
    5.  nslim - copies a fat (thick) provisioned LU to a new thin-provisioned LU.  Warning: if you get the command options wrong, you can destroy your LU disk contents.
  • New tool - Shared Storage Pool config dump command
    • nsspconf - this command dumps the SSP configuration: SSP, disks, LUs, and VIOSs. It is useful in problem determination.

Question 23: Is it OK to run the Domain Name Service (DNS) server on an SSP-based virtual machine? -  NEW 

  • Do not do that, as there is an obvious bootstrap issue: during a total SSP restart after a power outage, your DNS cannot be running until your SSP is up.
  • You need network access to the computer room, HMC, and Virtual I/O Servers before DNS can start.  Some of this could be handled by settings in the AIX /etc/hosts and /etc/netsvc.conf files.
  • It is recommended to have DNS on an independent server that would boot quickly after a power failure.

Question 24: How can I monitor the whole SSP I/O and VIOS-level I/O?

  • See Question 7 for the monitoring tools; from VIOS 2.2.5 onwards, the Performance & Capacity Metrics provide SSP I/O performance statistics at both the SSP and VIOS level.

Additional Information

Other good places for information

  1. VIOS SSP phase 6 New Features NEW 
  2. SSP - Migrating to New Disk Subsystem  NEW 
  3. SSP - Question on two pools?   NEW 
  4. Shared Storage Pools - Hands-On Fun with Virtual Disks (LU) by Example -  NEW content 
  5. Shared Storage Pools - Advanced lu -list search continues
  6. Shared Storage Pools - Growing the Pool using LUN Expansion = saves you negotiating new LUNs and no FC Zones to get wrong
  7. Shared Storage Pools - Cheat Sheet = All the commands you really need by example
  8. How many Shared Storage Pools in the world? = interesting feedback on who is running what - PLEASE: add your SSP use
  9. VIOS Shared Storage Pool Single Repository Disk = Not a Problem = what you really need to know about the Repository Disk and why you don't need to worry.
  10. VIOS Shared Storage Pool phase - How fast is the disk I/O ? = SSP disks are fast.
  11. Shared Storage Pools and Disaster Recovery in 30 seconds

 

Nigel Griffiths' YouTube videos on Shared Storage Pools - Still available

  1. Shared Storage Pools Repository is bullet proof
  2. Shared Storage Pool Remote Pool Copy Activation for Disaster Recovery
  3. Shared Storage Pool in 3 Commands in 3 Minutes
  4. PowerVC 1.2.1.2 Export Import Image from a SSP
  5. PowerVC 1.2.1 with Shared Storage Pools
  6. VIOS "part" Performance Advisor for VIOS 2.2.3
  7. Shared Storage Pools 4 (SSP4) Hands On 
  8. SSP3 Recover a Crashed Machine's LPAR to Another Machine
  9. Migrating to Shared Storage Pool (SSP3) & then LPM
  10. Shared Storage Pool 4 (SSP4) Concepts
  11. Live Partition Mobility (LPM) with Shared Storage Pool SSP3
  12. Looking Around a Shared Storage Pool SSP3
  13. Shared Storage Pool (SSP) Intro
  14. VIOS Shared Storage Pool phase 5 & SSP Update - 90 minutes Q4 2015 (still available)

Other good Websites:

  1. VIOS SSP creation using the HMC Enhanced+ GUI
  2. HMC Enhanced+ GUI for SSP YouTube Video

A good worked example of the basics on the VIOS command line, based on VIOS 2.2:

  • Article by Karthikeyan Kannan, July 2013, originally from the DeveloperWorks website (now closed)
  • Here is the PDF of that page: au-aix-vios-clustering-pdf.pdf
  • The commands are the same for newer VIOS versions, but some of the odd options in the commands are now defaults and became optional.
  • Update about the Hints section:
    • "Can't resize a LU" - You can now with lu -resize (that is to grow the LU size but not shrink)
    • Listing disks with lspv - true but you can absolutely determine the SSP disk with lscluster -d (including the hdisk names on all the other VIOS)
    • NPIV can't ever be used as that maps the LUN straight to the virtual machine so the VIOS can't add it to the pool of LUNs it controls.
    • Alert works fine but I still think the name is back to front.

 - - - The End - - -


If you find errors or have a question, email me: 

  • Subject: SSP Best Practice
  • E-mail: n a g @ u k . i b m . c o m  

Document Location

Worldwide

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SWG10","label":"AIX"},"Component":"","Platform":[{"code":"PF002","label":"AIX"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB08","label":"Cognitive Systems"}},{"Business Unit":{"code":"BU054","label":"Systems w\/TPS"},"Product":{"code":"HW1W1","label":"Power ->PowerLinux"},"Component":"","Platform":[{"code":"PF016","label":"Linux"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"","label":""}},{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"HW1A1","label":"IBM Power Systems"},"Component":"","Platform":[{"code":"PF012","label":"IBM i"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB08","label":"Cognitive Systems"}}]

Document Information

Modified date:
03 May 2021

UID

ibm11111131