Troubleshooting
Problem
Cause
Environment
Diagnosing The Problem
Scenario #1: Hostname resolution on the VIOS
If DNS is set up and the /etc/resolv.conf file exists, make sure that it contains the correct name server and domain search entries.
In the /etc/hosts file, make sure that you have the loopback and the VIOS hostname entries. The format of the VIOS hostname should be: IP FQDN alias
EXAMPLE: vios2 has IP (VIOS_IP) and the domain is dfw.ibm.com
127.0.0.1 loopback localhost # loopback (lo0) name/address
(VIOS_IP) (VIOS_Fully_qualified_domain_name) (VIOS_alias)
Note: If you are using IPv4 only, you need to add the IPv4 loopback and remove or comment out the IPv6 loopback.
127.0.0.1 loopback localhost # loopback (lo0) name/address <---- IPv4 loopback
::1 loopback localhost # loopback (lo0) name/address <---- IPv6 loopback
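The /etc/hosts checks above can be scripted. The following is a minimal sketch, assuming a POSIX shell with awk; check_hosts_entries is a hypothetical helper name, not a VIOS command, and the file path and hostname are passed in by the caller.

```shell
# Sketch: check that a hosts file has an IPv4 loopback line and an
# "IP FQDN alias" line for the given fully qualified hostname.
# check_hosts_entries is a hypothetical helper, not part of the VIOS.
check_hosts_entries() {
    hosts_file=$1
    fqdn=$2
    awk -v fqdn="$fqdn" '
        { sub(/#.*/, "") }                  # drop comments; fields are re-split
        $1 == "127.0.0.1" { loopback = 1 }  # IPv4 loopback entry
        $2 == fqdn && NF >= 3 { host = 1 }  # IP, FQDN, and at least one alias
        END {
            if (!loopback) print "missing IPv4 loopback entry"
            if (!host)     print "missing or incomplete entry for " fqdn
            exit !(loopback && host)
        }
    ' "$hosts_file"
}
```

For example, check_hosts_entries /etc/hosts followed by the fully qualified name of the VIOS returns 0 only when both entries are present.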
Scenario #2: Missing the name resolution ordering in netsvc.conf file
In the /etc/netsvc.conf file, make sure that the name resolution order is specified. If you are using IPv4, add the following line:
hosts=local,bind4
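To confirm that an uncommented hosts line is present, a small helper such as the following can be used. This is a sketch under the assumption that the file uses the hosts=value form shown above, with optional blanks around the separators; show_hosts_order is a hypothetical name.

```shell
# Sketch: print the active "hosts" ordering from a netsvc.conf-style file.
# Returns non-zero when no uncommented hosts= line exists.
# show_hosts_order is a hypothetical helper, not part of the VIOS.
show_hosts_order() {
    awk '
        { sub(/#.*/, "") }                            # drop comments
        {
            line = ""
            for (i = 1; i <= NF; i++) line = line $i  # join fields, removing blanks
            if (line ~ /^hosts=/) order = substr(line, 7)
        }
        END { if (order == "") exit 1; print order }
    ' "$1"
}
```

Running show_hosts_order /etc/netsvc.conf prints the ordering that is in effect, or fails if the line is missing or commented out.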
Scenario #3: Updating the VIOS from 3.1.0.x to higher
You can encounter this error after updating your VIOS from 3.1.0.x to a higher version.
From the /home/ios/logs/viosvc.log.err log file on VIOS the following error is reported:
Could not load module /usr/ios/db/postgres13/lib/psqlodbcw.so.
Dependent module /usr/lib/libpq.a(libpq.so.5) could not be loaded.
The module has an invalid magic number.
Could not load module /usr/ios/db/postgres13/lib/psqlodbcw.
When the following commands are run under oem_setup_env, the output is similar to the following:
$ oem_setup_env
# ldd /usr/ios/db/postgres13/lib/psqlodbcw.so
/usr/ios/db/postgres13/lib/psqlodbcw.so needs:
/usr/lib/libc.a(shr.o)
/usr/lib/libiodbcinst.a(libiodbcinst.so.2)
/usr/lib/libpthreads.a(shr_xpg5.o)
/usr/ios/db/postgres13/lib/libpq.a(libpq.so.5)
/unix
/usr/lib/libcrypt.a(shr.o)
/usr/lib/libdl.a(shr.o)
/usr/lib/libpthreads.a(shr_comm.o)
# ar -tv /usr/ios/db/lib/libpq.a
rw-r----- 300/300 418709 May 22 11:33 2018 libpq.32so.5
# ar -tv /usr/lib/libpq.a
rw-r----- 300/300 418709 May 22 11:33 2018 libpq.32so.5
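A quick way to check whether the log contains this error is to search it for the two strings quoted above. The following is a minimal sketch; has_magic_number_error is a hypothetical helper name, and the default path is the log file named in this scenario.

```shell
# Sketch: succeed when the log contains both the "invalid magic number"
# message and a reference to libpq, as in the error shown for this scenario.
# has_magic_number_error is a hypothetical helper, not part of the VIOS.
has_magic_number_error() {
    log_file=${1:-/home/ios/logs/viosvc.log.err}
    grep -q 'The module has an invalid magic number' "$log_file" &&
        grep -q 'libpq' "$log_file"
}
```

The function returns 0 when both strings are found, so it can be used in an if statement before applying the resolution steps for this scenario.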
Scenario #4: Filesystem size is full on VIOS
When a filesystem on the VIOS is full, it can prevent the vio_daemon from starting and from updating the Configuration Management Database (CMDB), which can corrupt the database. For example, if the "/home" filesystem is full, the vio_daemon cannot access the CMDB to update it. Check filesystem usage by using the following command:
$ df -g
Filesystem GB blocks Free %Used Iused %Iused Mounted on
/dev/hd4 0.25 0.10 62% 3277 13% /
/dev/hd2 4.44 1.63 64% 59622 14% /usr
/dev/hd9var 0.75 0.70 8% 672 1% /var
/dev/hd3 4.69 4.69 1% 35 1% /tmp
/dev/hd1 10.00 9.89 2% 1556 1% /home
/dev/hd11admin 0.12 0.12 1% 5 1% /admin
/proc - - - - - /proc
/dev/hd10opt 0.81 0.77 5% 597 1% /opt
/dev/livedump 0.25 0.25 1% 4 1% /var/adm/ras/livedump
/ahafs - - - 37 1% /aha
If "/home", "/var", "/", or "/tmp" reports 100% used, clean out the respective filesystem to leave enough space for the vio_daemon and other processes to work correctly.
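The check described above can be automated by filtering the df -g output. The following is a sketch that assumes the column layout shown in the sample output (%Used in the fourth column, mount point in the last); full_filesystems is a hypothetical helper name.

```shell
# Sketch: read "df -g" style output on stdin and print any of
# /, /home, /var, /tmp that report 100% used.
# full_filesystems is a hypothetical helper, not part of the VIOS.
full_filesystems() {
    awk '
        NR == 1 { next }                   # skip the header row
        {
            mount = $NF                    # mount point is the last column
            used = $4                      # %Used is the fourth column
            if (used == "-") next          # pseudo filesystems such as /proc
            sub(/%/, "", used)
            if ((mount == "/" || mount == "/home" || mount == "/var" || mount == "/tmp") && used + 0 >= 100)
                print mount
        }
    '
}
```

For example, df -g | full_filesystems prints nothing when none of the four filesystems is full.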
Scenario #5: Permissions issue in /tmp
Check the vpgadmin user permissions by using the following command:
$ oem_setup_env
# lsuser vpgadmin
(Output)
Open the /home/ios/logs/vdba.log file in your preferred editor (for example, the vi editor that is part of the VIOS) and look for the following error:
could not remove old lock file "/tmp/.s.PGSQL.6080.lock": The file access permissions do not allow the specified action.
Alternatively, you can run the following grep command to filter the /home/ios/logs/vdba.log file and output the error if it exists:
$ grep -i '/tmp/.s.PGSQL.6080.lock' /home/ios/logs/vdba.log
Then, when you try to run su vpgadmin, you receive the following error:
$ oem_setup_env
# su vpgadmin
ksh: /tmp/sh29032950.13: 0403-005 Cannot create the specified file.
Scenario #6: Upgrading from 3.1.x to 4.1.x
After you upgrade the VIOS by running the viosupgrade command on the VIOS command line with both the -F and -g flags for system-specific files such as /etc/group, the HMC GUI reports that it cannot communicate with the VIOS database. This happens because the CMDB is not created on the VIOS after the upgrade process completes.
Check the vpgadmin user attributes by using the following command:
$ oem_setup_env
# lsuser vpgadmin
(Output)
Check whether the groups attribute is missing "db_users".
Check the vdba.log file for the following error: "Cannot set process credentials"
$ oem_setup_env
# grep -i 'Cannot set process credentials' /home/ios/logs/vdba.log
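The groups check can be scripted by parsing the lsuser output. The following is a sketch that assumes lsuser prints the user name followed by space-separated attr=value pairs on one line; has_group is a hypothetical helper name.

```shell
# Sketch: read "lsuser" style output (name attr=value ...) on stdin and
# succeed when the groups attribute contains the requested group.
# has_group is a hypothetical helper, not part of the VIOS.
has_group() {
    wanted=$1
    tr ' ' '\n' | awk -F= -v g="$wanted" '
        $1 == "groups" {
            n = split($2, list, ",")
            for (i = 1; i <= n; i++) if (list[i] == g) found = 1
        }
        END { exit !found }
    '
}
```

For example, lsuser vpgadmin | has_group db_users returns non-zero when vpgadmin is not a member of db_users.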
Resolving The Problem
Scenario #1: Hostname resolution on the VIOS
In the /etc/hosts file, make sure that you have the loopback and the VIOS hostname entries. The format of the VIOS hostname entry should be: IP FQDN alias
EXAMPLE: (VIOS_alias) has IP (VIOS_IP) and domain is dfw.ibm.com
127.0.0.1 loopback localhost # loopback (lo0) name/address
(VIOS_IP) (VIOS_Fully_qualified_domain_name) (VIOS_alias)
After editing the /etc/hosts file, you need to check that the resolution is correct by running the following commands:
$ oem_setup_env
# nslookup 1.2.3.4
# nslookup myhostname.mydomain
# host 1.2.3.4
# host myhostname.mydomain
EXAMPLE: (VIOS_alias) has IP (VIOS_IP) and domain is dfw.ibm.com
-- Run a reverse name lookup (query for IP in DNS database)
# nslookup (VIOS_IP)
Server: (server_IP)
Address: (server_IP)#53
-- Run a hostname lookup using FQDN
# nslookup (VIOS_Fully_qualified_domain_name)
Server: (server_IP)
Address: (server_IP)#53
Address: (VIOS_IP)
NOTE: The hostname lookup and the reverse name lookup return the same information. This is the expected result when DNS is configured (the /etc/resolv.conf file exists and is configured properly).
-- Also check local name resolution using the host command
# host (VIOS_IP)
(VIOS_Fully_qualified_domain_name) is (VIOS_IP)
# host (VIOS_Fully_qualified_domain_name)
(VIOS_Fully_qualified_domain_name) is (VIOS_IP)
NOTE: Both queries return the same information. Correct the /etc/hosts file and, if needed, have the DNS entries fixed until all checks return the same information when you look up the VIOS IP address and hostname.
After completing your edits in the /etc/hosts file, run the following commands in order:
$ oem_setup_env
# /usr/bin/stopsrc -s vio_daemon
Wait 300 seconds or until vio_daemon has stopped.
# /usr/sbin/slibclean
# rm -rf /home/ios/CM
# /usr/bin/startsrc -s vio_daemon -a '-d 4' (this will start vio_daemon and vio_chgmgt and database)
# ps -eaf |grep vio_chgmgt ---> Note down the process ID of vio_chgmgt
# kill -1 PID_of_vio_chgmgt
Then, wait 5 - 10 minutes for the CMDB to be repopulated, then try the HMC GUI query again.
If you are running an SSP environment, you must first stop the cluster, leaving the MFS node until the end, before running the commands mentioned:
$ clstartstop -stop -n clustername -m nodeA
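The step "Wait 300 seconds or until vio_daemon has stopped" in the procedure above can be sketched as a polling loop with a timeout. Both function names are hypothetical, and the check assumes that lssrc -s vio_daemon reports the status "inoperative" once the subsystem has stopped.

```shell
# Sketch: poll a command until it succeeds or the timeout (in seconds) expires.
# wait_until is a hypothetical helper, not part of the VIOS.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1   # give up after the timeout
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Assumption: lssrc prints "inoperative" for a stopped subsystem.
vio_daemon_stopped() {
    lssrc -s vio_daemon | grep -q inoperative
}
```

For example, wait_until 300 5 vio_daemon_stopped checks every 5 seconds and returns as soon as the daemon is reported as stopped, or fails after 300 seconds.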
Scenario #2: Missing the name resolution ordering in netsvc.conf file
In the /etc/netsvc.conf file, make sure that the resolution ordering is mentioned. If you are using IPv4, then add the following:
hosts=local,bind4
After completing your edits in the /etc/netsvc.conf file, run the following commands in order:
$ oem_setup_env
# /usr/bin/stopsrc -s vio_daemon
Wait 300 seconds or until vio_daemon has stopped.
# /usr/sbin/slibclean
# rm -rf /home/ios/CM
# /usr/bin/startsrc -s vio_daemon -a '-d 4' (this will start vio_daemon and vio_chgmgt and database)
# ps -eaf |grep vio_chgmgt ---> Note down the process ID of vio_chgmgt
# kill -1 PID_of_vio_chgmgt
Then, wait 5 - 10 minutes for the CMDB to be repopulated, then try the HMC GUI query again.
If you are running an SSP environment, you must first stop the cluster, leaving the MFS node until the end, before running the commands mentioned:
$ clstartstop -stop -n clustername -m nodeA
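Instead of noting down the process ID by hand, the ps output from the procedure above can be filtered so that the grep process itself is excluded. This is a sketch that assumes the PID is the second column of the ps -eaf output; pids_of is a hypothetical helper name.

```shell
# Sketch: read "ps -eaf" style output on stdin and print the PID (second
# column) of every line that mentions the given name, skipping grep itself.
# pids_of is a hypothetical helper, not part of the VIOS.
pids_of() {
    awk -v name="$1" 'index($0, name) && !index($0, "grep") { print $2 }'
}
```

For example, ps -eaf | pids_of vio_chgmgt prints the process ID to pass to kill -1. The match is a plain substring test, so review the result if other processes contain the same name.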
Scenario #3: Updating the VIOS from 3.1.0.x to higher
To resolve this issue, run the following commands in order:
$ oem_setup_env
# stopsrc -s vio_daemon
# rm /usr/lib/libpq.a
# startsrc -s vio_daemon
# lssrc -a | grep -i vio_daemon -> to get the vio_daemon’s PID
# kill -1 PID_of_vio_daemon
Then, wait 5 - 10 minutes for the CMDB to be repopulated, then try the HMC GUI query again.
If you are running an SSP environment, you must first stop the cluster, leaving the MFS node until the end, before running the commands mentioned:
$ clstartstop -stop -n clustername -m nodeA
Scenario #4: Filesystem size is full on VIOS
To resolve this issue, clean out the full filesystem as described in "Diagnosing the Full File Systems in PowerVM VIOS".
Then run the following commands in order:
$ oem_setup_env
# /usr/bin/stopsrc -s vio_daemon
Wait 300 seconds or until vio_daemon has stopped.
# /usr/sbin/slibclean
# rm -rf /home/ios/CM
# /usr/bin/startsrc -s vio_daemon -a '-d 4' (this will start vio_daemon and vio_chgmgt and database)
# ps -eaf |grep vio_chgmgt ---> Note down the process ID of vio_chgmgt
# kill -1 PID_of_vio_chgmgt
Then, wait 5 - 10 minutes for the CMDB to be repopulated, then try the HMC GUI query again.
If you are running an SSP environment, you must first stop the cluster, leaving the MFS node until the end, before running the commands mentioned:
$ clstartstop -stop -n clustername -m nodeA
Scenario #5: Permissions issue in /tmp
This is an issue with /tmp permissions. You need to make sure that /tmp has the following permissions by running the following command:
$ ls -ld /tmp
drwxrwxrwt bin bin tmp
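The comparison above can also be scripted by inspecting the mode string that ls -ld prints. The following is a minimal sketch; is_sticky_world_writable is a hypothetical helper name.

```shell
# Sketch: succeed when the directory mode string starts with drwxrwxrwt,
# that is, world-writable with the sticky bit set (mode 1777).
# is_sticky_world_writable is a hypothetical helper, not part of the VIOS.
is_sticky_world_writable() {
    mode=$(ls -ld "$1" | awk '{ print $1 }')   # first field is the mode string
    case $mode in
        drwxrwxrwt*) return 0 ;;               # trailing ACL markers are tolerated
        *) return 1 ;;
    esac
}
```

For example, is_sticky_world_writable /tmp returns non-zero when the permissions on /tmp differ from the expected value.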
If permissions do not match, run the following command to fix this:
$ oem_setup_env
# chmod 1777 /tmp
Then run the following commands in order:
$ oem_setup_env
# /usr/bin/stopsrc -s vio_daemon
Wait 300 seconds or until vio_daemon has stopped.
# /usr/sbin/slibclean
# rm -rf /home/ios/CM
# /usr/bin/startsrc -s vio_daemon -a '-d 4' (this will start vio_daemon and vio_chgmgt and database)
# ps -eaf |grep vio_chgmgt ---> Note down the process ID of vio_chgmgt
# kill -1 PID_of_vio_chgmgt
Then, wait 5 - 10 minutes for the CMDB to be repopulated, then try the HMC GUI query again.
If you are running an SSP environment, you must first stop the cluster, leaving the MFS node until the end, before running the commands mentioned:
$ clstartstop -stop -n clustername -m nodeA
Scenario #6: Upgrading from 3.1.x to 4.1.x
APAR IJ49629 (SSP OR CM DB DOES NOT START AFTER VIOSUPGRADE) states that using the viosupgrade flags -F and -g together for system-specific files such as /etc/group is not recommended. Instead, use only -g and manually merge the copy of /etc/group from backup_files after the migration.
As a workaround, run the following commands on the affected VIOS:
$ oem_setup_env
# stopsrc -s vio_daemon
# mkgroup -'A' id='202' users='vpgadmin,padmin' db_users
# startsrc -s vio_daemon -a "-d 4"
# kill -1 PID_of_vio_daemon
Then, wait 5 - 10 minutes for the CMDB to be repopulated, then try the HMC GUI query again.
If you are running an SSP environment, you must first stop the cluster, leaving the MFS node until the end, before running the commands mentioned:
$ clstartstop -stop -n clustername -m nodeA
Note: You can use your preferred editor to edit or read the files mentioned. For example, you can use the vi editor that is part of the VIOS to make changes and view file contents.
Author:
Aly Aboulgheit
Related Information
Document Location
Worldwide
Document Information
Modified date:
05 November 2025
UID
ibm16621345