How To
Summary
More customers are using NPIV rather than traditional vSCSI-mapped storage. Typically the server team has little visibility of the SAN zoning, so it becomes difficult to trace where mismatches have occurred.
Objective
This How-To details some additional techniques using AIX commands to help examine and ensure that the NPIV connections through the VIOS are as expected and match the expected SAN zones.
Environment
VIOS systems with NPIV attached storage
Steps
The problem: the 'lsmap' command does not provide enough information to determine the real NPIV usage, or whether there are any issues in the system setup.
Glossary:
- WWPN - World Wide Port Name - a 64-bit (16 hex digit) identifier assigned to a SAN endpoint (host, storage or disk)
- FCID - Fibre Channel ID, sometimes called the SCSI ID
- Instance/guest/vfchost - virtual Fibre Channel host - IBM's implementation of a virtual NPIV connection
- NPIV - N_Port ID Virtualization - allows many logical connections to be established between physical host instances/guests and storage
Versions supported: VIOS 2.2.4 and 3.1.1 and greater
For clarity we will only examine one instance group, fcs0 and vfchost0 in the examples.
Step 1: the 'lsmap' command - what does it show?
lsmap is the primary command to show the status of NPIV connections:
padmin@vios1:/home/padmin$ lsmap -all -npiv
Name Physloc ClntID ClntName ClntOS
------------- ---------------------------------- ------ -------------- -------
vfchost0 U8408.44E.XXXXXXX-V1-C22 9 guest1 Linux
Status:LOGGED_IN
FC name:fcs0 FC loc code:U78C7.001.KKKKKKK-P1-C8-T1
Ports logged in:3
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:host1 VFC client DRC:U8408.44E.XXXXXXX-V9-C4
Name Physloc ClntID ClntName ClntOS
------------- ---------------------------------- ------ -------------- -------
vfchost1 U8408.44E.XXXXXXX-V1-C23 9 guest1 Linux
Status:LOGGED_IN
FC name:fcs1 FC loc code:U78C7.001.KKKKKKK-P1-C8-T2
Ports logged in:3
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:host2 VFC client DRC:U8408.44E.XXXXXXX-V9-C6
From examination of the output it looks like there is a problem here: for vfchost0 the 'Ports logged in' count is 3, but I know I have an even number of connections to everything in my SAN. Do I have a problem in the SAN somewhere (zoning, port access, etc.)?
Let's now examine in detail where this information is sourced from, and determine what the real situation is, in the following steps.
Step 2: What WWPNs is a vfchost using?
This is typically where your site documentation has a hole in it: it doesn't show this information, and it would be a nightmare to maintain (instances/guests get built and moved, so the WWPNs change).
In addition, when an instance is migrated (LPM), it uses the 2nd WWPN of the pair it was assigned.
Your SAN team (if you're friendly enough) might be able to give you the WWPNs a guest is using, but in the absence of being given them, we need to find them ourselves.
What we need is a method to get this information without writing an extensive program. Thinking about this, the AIX/VIOS 'snap' MustGather collects some of this information (or something close to it), so let's have a dig.
The AIX kernel debug command allows us to extract the information (done in a safe way using the '-script' flag to the 'kdb' command):
root@vios1:/tmp# echo "svfcCI; svfcPrQs; svfcva vfchost0" | kdb -script
read vscsi_scsi_ptrs OK, ptr = 0x0
vmcKdb_anchor_p=0x0000000000000000
vmc kdb command extension, 64 bit version, is loaded. Commands are:
vmc - load extension and show help text
vmcu - unload extension
vmcd - VMC dump anchor, adapter, connections
vmcfa - VMC fetch anchor from symbol table
vmcsa address - VMC set anchor
vmcdb - VMC dump connection buffers
vmcdm - VMC dump connection messages
vmcdq - VMC dump queue
vmct directoryname - VMC Internal Adapter trace
vmctbm directoryname - VMC buffer and message trace
vmcKdb_anchor_p=0x0000000000000000
(0)> svfcCI; svfcPrQs; svfcva vfchost0
Executing svfcCI command
End of execution for svfcCI command
Executing svfcPrQs command
End of execution for svfcPrQs command
Executing svfcva command
Target vFC Adapter Structure vfchost0 F1000C00377F6000
next ............... : 0
waiting_rsp ........ : 0 waiting_cmd ........ : 0
channels[0]: F1000C0150D14000 channels[2]: F1000C0150D14400
channels[1]: F1000C0150D14200 channels[3]: F1000C0150D14600
channel_numbers[0]: 0 channel_numbers[2]: 2
channel_numbers[1]: 1 channel_numbers[3]: 3
**state ............ : 10
SRP_PROCESSING
process_state ...... : 0 dds ................ : F1000C00377F6090
xmem.aspace_id ..... : FFFFFFFB
xmem.num_sids/rem_size ..................................... : 20000000
xmem.subspace_id/rem_addr .................................. : 0
xmem.subspace_ptr .. : 0 xmem.vaddr/rem_liobn : 13000016
xmem.xmemflags ..... : 0 xmem.prexflags ..... : 0
xmem.xp.total ...... : 0 xmem.xp.used ....... : 0
xmem.xp.s_vpn ...... : 0 xmem.xp.rpn ........ : 0
ras_cb ............. : f1000c0032264880 schedule_q ......... : 0
offlvl_q ........... : 0
buffer ............. : F1000C00378E4000 buffer_ioba ........ : 4000
**flags ............ : 2080001
INTERRUPT_IS_ENABLED
CLI_CAN_MIGRATE
VFC_FLUSH_ON_HALT
intr_lock._slock ... : 0
cmd_q.base_addr .... : F1000C003789B000
cmd_q.map_phys.total_iovecs ................................ : 1
cmd_q.map_phys.used_iovecs ................................. : 1
cmd_q.map_phys.bytes_done .................................. : 1000
cmd_q.map_phys.resid_iov ................................... : 0
cmd_q.map_phys.dvec : F1000C00204D3F60 cmd_q.mask ......... : FF
cmd_q.index ........ : 9F cmd_q.size ......... : 1
dk_wq_depth ........ : 0
dma_handle: F1000C0620275E00
mem ................ : F1000C0030BDC000 util_lock._slock ... : 0
&off_level ......... : f1000c00377f64f0 next_liobn ......... : 0
devno .............. : 8000001F00000001 sleep_tid .......... : FFFFFFFFFFFFFFFF
map_sleep .......... : FFFFFFFFFFFFFFFF initr_sleep ........ : FFFFFFFFFFFFFFFF
channel ............ : 10000016 **new_state ........ : 0
rsp_q_timer.timer .. : F1000C00002BE580 rsp_q_timer.vscsi .. : F1000C00377F6000
rsp_q_timer.timer_pops ..................................... : 0
rsp_q_timer.flags .. : 1 buffer_lock._slock . : 0
shutdown_state ..... : 1
migrate.wd.func .... : F1000000C03B2CB8 migrate.wd.count ... : 61000200000000
migrate.wd.restart . : 708 migrate.stream_id .. : 0
migrate.flags ...... : 0 migrate.client_flags : 0
migrate.changing ... : 0 migrate.effective .. : FFFFFFFF
migrate.call_count . : 0 migrate.timer_count : 0
login_tcnt ......... : 1 npiv_handle ........ : F1000C02305E1000
target_list ........ : F1000C01510C5000 active_q ........... : 0
idle.tid ........... : FFFFFFFFFFFFFFFF idle.flags ......... : 0
ucode.flags ........ : 0 &ucode ............. : f1000c00377f6668
resume_state ....... : 0 fc_devno ........... : 8000000E00000000
fc_fp .............. : F100111F500C8C00 fc_handle .......... : F1000C01C0B7C000
npiv_query ......... : 74C13E8 npiv_scsi .......... : 74C1430
npiv_admin ......... : 74C1418 npiv_chba .......... : 74C1400
fc_bus_id .......... : 90000380 fc_flags ........... : 7D
fc_max_xfer_size ... : 800000 fc_size_scsi_id .... : 4
fc_ileave_type ..... : 1 fc_link_speed ...... : 640
fc_map_buf_size .... : 0 fc_scsi_scratch .... : 2F0
fc_admin_scratch ... : 2A8 fc_dma_handle ...... : F1000C0690241800
sm_dma_handle ...... : 0 fc_adap_parms ...... : F1000C00377F6740
&async_cmd ......... : F1000C00377F6848 async_cmd.flags .... : 0
async_fp ........... : F1000000C03B2E38
&login_cmd ......... : F1000C00377F6980 login_cmd.flags .... : 0
&logout_cmd ........ : F1000C00377F6AB8 logout_cmd.flags ... : 0
osType ............. : 2 wwpn ............... : C0507609EF66123E
wwpn1 .............. : C0507609EF66123E wwpn2 .............. : C0507609EF66123F
node_name .......... : C0507609EF66123E scsi_id ............ : 670310
max_payload ........ : 20 max_response ....... : 80
actual_payload ..... : 74 actual_response .... : 80
max_dma_size ....... : 800000 client_part_num .... : 9
cancel_uniquifier .. : 1000100000000 async_ioba ......... : 18000000
async_next ......... : 18000090 async_len .......... : FFF0
async_free ......... : FD link_status ........ : 1
halt_status ........ : 0 halt_statix ........ : 0
validate.ptarget ... : 0 validate.pid ....... : 0
validate.flags ..... : 0
validate.cancel_uniquifier ................................. : 2000100000000
validate.syscall_count ..................................... : 0
validate.num_active_cmds ................................... : 0
validate.sleep_tid . : FFFFFFFFFFFFFFFF validate.sleep_idle : FFFFFFFFFFFFFFFF
validate.login_sleep : FFFFFFFFFFFFFFFF validate.sleep_stop : FFFFFFFFFFFFFFFF
validate.cancel_cmd : 0 validate.port_trans : FFFFFFFFFFFFFFFF
validate.prev_state : 0
validate.cmd_data.num ...................................... : 0
validate.cmd_data.flags .................................... : 0
v.cmd_data.cmds_buf : 0 v.cmd_data.v_cmd_buf : 0
v.cmd_data.fc_cmd_buf ...................................... : 0
v.cmd_data.map_buf . : 0 validate.cmds ...... : 0
validate.cancel_start ...................................... : 0
validate.instance .. : 0 resrv32 ............ : F1000C00377F75E0
nport_log.handle ... : 0 nport_log.flags .... : 0
nport_log.wwpn1 .... : 0 nport_log.wwpn2 .... : 0
nport_log.wwpn1_state ...................................... : 0
nport_log.wwpn2_state ...................................... : 0
nport_log.wwpn1_fail_type .................................. : 0
nport_log.wwpn2_fail_type .................................. : 0
nport_log.wwpn1_fail_reason ................................ : 0
nport_log.wwpn2_fail_reason ................................ : 0
portResName ........ : fcs0
portLocCode ........ : U78C7.001.KKKKKKK-P1-C8-T1
clientPartName ..... : guest1_full_name
clientName ......... : host1
clientDrc .......... : U8408.44E.XXXXXXX-V9-C4
ms_high ............ : 0 ms_range ........... : 0
num_cmds ........... : 0 drop_cnt ........... : 0
worst_time ......... : 0 phyp_time .......... : A525B1
kp_sleep_tid ....... : 8E0121 cfg_sleep_tid ...... : FFFFFFFFFFFFFFFF
proc_pid ........... : 45018E flags .............. : 3
lock ............... : 0 kproc->cmds ........ : 0
channel_scheduler_count .................................... : 0
start_channel ...... : 0 num_channels ....... : 4
disconnect_action .. : 0 disconnect_status .. : 0
cmdh_num ........... : 1 cmdq_num ........... : 1
chan_num ........... : 4
End of execution for svfcva command
(0)> Executing q command
From this output we can see the WWPNs and get the full guest name, not the truncated version from lsmap. For clarity I have copied the relevant lines here (as you'd probably write them down if you were debugging this live):
osType ............. : 2 wwpn ............... : C0507609EF66123E
wwpn1 .............. : C0507609EF66123E wwpn2 .............. : C0507609EF66123F
clientPartName ..... : guest1_full_name
You could also check vfchost1 in the same way - it belongs to the same client ('guest1') but carries its own WWPN pair.
What we also know is that these vfchosts '0' and '1' use the physical adapters 'fcs0' and 'fcs1' respectively.
So, we now have the WWPNs that the client is using - let's see where this piece of string is tied to.
Step 3 - Return of the King - the '/proc' filesystem
Many *nix variants use a pseudo filesystem to publish information that users often want but that is hard to access in the kernel (without committing suicidal feats of programming).
Most of these files are plain text and can be read easily, and future changes to the AIX or VIOS code can publish more information through them without anyone having to write C code to query the kernel (and who has time ...).
Given we know the physical adapter is 'fcs0' (or 'fcs1'), let's see what one of these special files can show us using the 'cat' command:
cat /proc/sys/adapter/fc/fcs0/connections
vp:F1000C0230610600 pn:C0507609EF66123E id:670310 vpi:8203 flags:0000027F NPIV
log:F1000C01C3FB0640 pn:250D00DEFB5B8380 id:FFFFFC rpi:1B0C flags:00E0
log:F1000C01C3FB0708 pn:251100DEFB5B8380 id:FFFC67 rpi:1C0C flags:00E1
log:F1000C01C3FB07D0 pn:500507680C31A9FA id:6700A0 rpi:1D0C flags:00E0
log:F1000C01C3FB0898 pn:500507680C31A36C id:6700E0 rpi:1E0C flags:00E0
vp:F1000C01C06D0500 pn:10000090FA8E842F id:670337 vpi:8103 flags:0000037F PHYS
log:F1000C01C3FB04B0 pn:250D00DEFB5B8380 id:FFFFFC rpi:180C flags:00E0
log:F1000C01C3FB0578 pn:500507680C53A9FA id:6700C0 rpi:190C flags:00E0
tot_vports: 2
tot_npiv: 1
tot_logins: 6
cat /proc/sys/adapter/fc/fcs1/connections
vp:F1000C0230610300 pn:C0507609EF661230 id:680310 vpi:0200 flags:0000027F NPIV
log:F1000C01C1C90708 pn:250D00DEFB5B8380 id:FFFFFC rpi:0C00 flags:00E0
log:F1000C01C1C907D0 pn:251100DEFB5B8380 id:FFFC68 rpi:0D00 flags:00E1
log:F1000C01C1C90898 pn:500507680C32A9FA id:680060 rpi:0E00 flags:00E0
log:F1000C01C1C90960 pn:500507680C32A36C id:680100 rpi:0F00 flags:00E0
vp:F1000C01C06D0400 pn:10000090FA8E8430 id:680337 vpi:0100 flags:0000037F PHYS
log:F1000C01C1C90578 pn:250D00DEFB5B8380 id:FFFFFC rpi:0900 flags:00E0
log:F1000C01C1C90640 pn:500507680C52A36C id:6800E0 rpi:0A00 flags:00E0
tot_vports: 2
tot_npiv: 1
tot_logins: 6
We now have the string tied from our client instance's WWPNs to the ones the system can see at the storage layer.
Looking at these tables, it also becomes apparent why the 'lsmap' command reports 3 connections rather than just the two we were expecting: the three entries with flags:00E0 are the clue for the NPIV entry.
Using 'fcs0' as the example, we can determine that from NPIV WWPN C0507609EF66123E there are two connections to storage (500507680C31A9FA id:6700A0 and 500507680C31A36C id:6700E0), and that the third reported entry (250D00DEFB5B8380 id:FFFFFC rpi:1B0C flags:00E0) brings the count to 3.
Why should we not count id:FFFC67 rpi:1C0C flags:00E1 for fcs0?
In Fibre Channel, FFFFFC is the well-known address of the fabric Directory (Name) Server, and seeing it tells us the port is logged in to the fabric correctly. FFFC67 is the Fabric Domain Controller, responsible in this case for a VSAN (number 67) - hence the '67' suffix on the FCIDs shown for fcs0 (VSAN 67), and '68' for fcs1 (VSAN 68). Entries that represent a login to the Fabric Domain Controller should not be counted.
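As a memory aid, this classification can be done mechanically. A minimal sketch follows; 'classify_fcid' is a hypothetical helper of our own (not part of any AIX/VIOS tooling), and the FFFCxx range is the Domain Controller address per the Fibre Channel standards:

```shell
# Sketch: classify a destination FCID from the connections file output.
# classify_fcid is our own illustrative helper, not an AIX/VIOS command.
classify_fcid() {
  case "$1" in
    FFFFFC) echo "Directory (Name) Server - port is logged in to the fabric" ;;
    FFFC??) echo "Fabric Domain Controller - do not count as a storage login" ;;
    *)      echo "real endpoint - counts toward 'Ports logged in'" ;;
  esac
}

classify_fcid FFFFFC
classify_fcid FFFC67
classify_fcid 6700A0
```

Applied to the fcs0 output above, this flags FFFC67 as the one entry to exclude.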
So our count of 3 x flags:00E0 entries EQUALS the 'Ports logged in' count of 3 - we have found where our connections are, and the exercise is complete.
The PHYS connection also produces fabric logins but does not consume NPIV resources.
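The counting logic just described can be sketched in a few lines of awk. This example runs against a pasted-in sample of the /proc connections text from above (so it can be tried anywhere); on a live VIOS you would read /proc/sys/adapter/fc/fcsN/connections instead:

```shell
# Sketch: tally the "flags:00E0" login entries under an NPIV virtual port,
# reproducing lsmap's 'Ports logged in' figure. Sample input is the fcs0
# NPIV block shown earlier in this article.
sample='vp:F1000C0230610600 pn:C0507609EF66123E id:670310 vpi:8203 flags:0000027F NPIV
log:F1000C01C3FB0640 pn:250D00DEFB5B8380 id:FFFFFC rpi:1B0C flags:00E0
log:F1000C01C3FB0708 pn:251100DEFB5B8380 id:FFFC67 rpi:1C0C flags:00E1
log:F1000C01C3FB07D0 pn:500507680C31A9FA id:6700A0 rpi:1D0C flags:00E0
log:F1000C01C3FB0898 pn:500507680C31A36C id:6700E0 rpi:1E0C flags:00E0'

result=$(printf '%s\n' "$sample" | awk '
  /NPIV/                  { wwpn = substr($2, 4) }  # pn:<wwpn> on the vp line
  /^log:/ && /flags:00E0/ { count++ }               # real fabric logins only
  END                     { printf "%s ports_logged_in:%d", wwpn, count }')
echo "$result"
```

For this sample the count comes out at 3, matching the lsmap output that started the investigation.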
It is worth explaining the counters from the connections file in more detail here, since they expose potential issues you may need to look at:
tot_vports: 2 <= total number of virtual ports currently in kernel space
tot_npiv: 1 <= total number of NPIV connections
tot_logins: 6 <= total number of FLOGI's used
SAN switches have a limit on the number of FLOGI entries they can store, so it is important to know how many FLOGIs your connections consume.
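To put a number against that limit, the tot_logins counters can be summed across adapters. A sketch, demonstrated on sample counter text (on a live VIOS the input would be 'cat /proc/sys/adapter/fc/fcs*/connections'):

```shell
# Sketch: total the FLOGIs in use across all FC adapters, to compare against
# the SAN switch's login limits. The printf stands in for the real /proc files.
total=$(printf 'tot_logins: 6\ntot_logins: 6\n' |
        awk '/^tot_logins:/ { sum += $2 } END { print sum }')
echo "total FLOGIs in use: $total"
```

With the fcs0 and fcs1 examples above (6 each), the VIOS is consuming 12 FLOGIs in total.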
Recap:
At this point, we have been able to follow the piece of string to trace:
- from the lsmap command, what it thinks is the right information
- from the kdb command, the WWPNs that the client is actually using, plus the full client name
- and finally, from the /proc filesystem, the actual connections to the storage layer
If you now see unexpected results coming back, at least you can ask better questions of the SAN team (for example), or of support, to understand why a connection is missing.
Step 4 - so ... let’s automate this
If you have an urgent need to work out why a production instance does not see all the connections you assume it should, typing all these commands every time is going to take a while - so let’s write a script that makes this somewhat easier.
Code example:
# for each vfchost get wwpn, wwpn1 and wwpn2 (combined into a single line, .'s removed)
WWPN=$(echo "svfcCI; svfcPrQs; svfcva ${VFCHOST}" | kdb -script | egrep "wwpn|wwpn1|wwpn2" | head -n 2 | awk '{ ORS = NR%2 ? " " : RS } 1' | tr -d . | sed -e 's/ *//g' | sed -e 's/:/: /g' | sed -e 's/wwpn/ wwpn/g')
WWPNB=$(echo $WWPN | awk '{print $4}')
WWPN1=$(echo $WWPN | awk '{print $6}')
WWPN2=$(echo $WWPN | awk '{print $8}')
#we check which wwpn is used out of the pair - and put ()'s around the non-active one
if [[ "${WWPNB}" == "${WWPN1}" ]]
then
WWPNDISP1="${WWPN1}"
WWPNDISP2="(${WWPN2})"
else
WWPNDISP1="(${WWPN1})"
WWPNDISP2="${WWPN2}"
fi
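To see what that pipeline actually produces, here is a sketch run against the two kdb output lines it matches, with the kdb output pasted in as a string so it can be tried without a live VIOS:

```shell
# The two kdb lines the pipeline cares about, pasted in as sample input.
kdb_out='osType ............. : 2                  wwpn ............... : C0507609EF66123E
wwpn1 .............. : C0507609EF66123E   wwpn2 .............. : C0507609EF66123F'

# Same transformation as the script: join the pair of lines, strip the dots
# and padding, then split back into labelled fields.
WWPN=$(printf '%s\n' "$kdb_out" | head -n 2 | awk '{ ORS = NR%2 ? " " : RS } 1' \
      | tr -d '.' | sed -e 's/ *//g' -e 's/:/: /g' -e 's/wwpn/ wwpn/g')

WWPNB=$(echo $WWPN | awk '{print $4}')   # WWPN currently in use
WWPN1=$(echo $WWPN | awk '{print $6}')   # first WWPN of the virtual pair
WWPN2=$(echo $WWPN | awk '{print $8}')   # second WWPN of the virtual pair
echo "active:$WWPNB pair:$WWPN1/$WWPN2"
```

For this sample, the active WWPN is C0507609EF66123E and the pair is C0507609EF66123E / C0507609EF66123F, so wwpn2 would be the one displayed in parentheses.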
A full example script "ls_vfchosts.sh" is included later in this document.
The output from the script should look similar to:
vfchost0 C0507609EF66123E (C0507609EF66123F) fcs0 guest1_full_name
storage login count:2 500507680C31A9FA id:6700A0 500507680C31A36C id:6700E0
vfchost1 ... (n)
* (wwpn) in parentheses is the non-active WWPN of the virtual pair
The example above shows the two actual storage logins for fcs0 against WWPN C0507609EF66123E, rather than the 3 (or more) reported by the 'lsmap -all -npiv' output.
Summary
This article provides some checks and hints for where to look if you have issues tracing what is actually using NPIV, explains why the total login counts from 'lsmap -all -npiv' are always higher than expected, and includes a small script example to display some of this information in a meaningful manner.
This HOWTO article is one of a series I am producing that provides insight into the workings of SAN functions.
The next articles will include:
An example of some AIX utilities that perform SAN functions such as:
- allow a method to provide the SAN Fabric Services with the actual name of the AIX port - this can then be used to create zone aliases and port descriptions without having to type these names in manually
- issue an RDP (Request Diagnostic Parameters) to a Fibre Channel connection (such as a port on a SAN switch)
- issue an ELS_ECHO (SAN ping), with capability to have various size payloads - this is useful for stress testing fibre links without having a running workload
- issue an ELS_BEACON - flashing both the HBA and SAN switch ports to identify correct cabling
- Advanced SFP problem determination
If there is a particular function or requirement that you have and feel I should include, please do email me at: dlancast@au1.ibm.com
Thanks to Chris Gibson for testing and validating.
Additional Information
ls_vfchosts.sh (use at own risk/Disclaimer)
Document Location
Worldwide
Document Information
Modified date:
02 June 2022
UID
ibm16391578