IBM Support

db2start returns a warning message "libnuma: Warning: /sys not mounted or invalid."

Troubleshooting


Problem

db2start completes successfully but prints the warning message "libnuma: Warning: /sys not mounted or invalid."

Symptom

$ db2start
libnuma: Warning: /sys not mounted or invalid. Assuming one node: No such file or directory
2017-12-14 14:28:44 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.

db2diag.log:

2017-12-19-15.00.09.447117+480 E135107179E487 LEVEL: Warning (OS)
PID : 104183 TID : 140737016641312 PROC : db2star2
INSTANCE: db2inst1 NODE : 000
HOSTNAME: myhost1
FUNCTION: DB2 UDB, oper system services, sqloKADetermineNUMASupport, probe:50
CALLED : OS, -, open
OSERR : -1 "Unknown error 18446744073709551615"
DATA #1 : Hexdump, 65171 of 4028196206 bytes
Object not dumped: Address: 0x0000000000000000 Size: 4028196206 Reason: Address is NULL

Cause

db2start loads libnuma.so.1 and calls numa_node_to_cpus() as follows:

libHandle = dlopen( "/usr/lib64/libnuma.so.1", RTLD_NOW );
funcv1 = (dlvsym(libHandle, "numa_node_to_cpus", "libnuma_1.1"));
sysrc = (*funcv1)( 0, (long unsigned int*)&cpuSetNode, sizeof(cpuSetNode) ) ;

In this case, the numa_node_to_cpus() call fails because libnuma cannot read the CPU map of node 0 from /sys.

Environment

Red Hat Linux

Diagnosing The Problem

A db2trc trace shows that sqloKADetermineNUMASupport fails:


7047 | | | sqloKADetermineNUMASupport entry
7048 | | | | OSSHLibrary::load entry
7049 | | | | OSSHLibrary::load data [probe 10]
7050 | | | | OSSHLibrary::load exit
7051 | | | | OSSHLibrary::getFuncAddress entry
7052 | | | | OSSHLibrary::getFuncAddress data [probe 10]
7053 | | | | OSSHLibrary::getFuncAddress data [probe 100]
7054 | | | | OSSHLibrary::getFuncAddress exit
7055 | | | | OSSHLibrary::getFuncAddress entry
7056 | | | | OSSHLibrary::getFuncAddress data [probe 10]
7057 | | | | OSSHLibrary::getFuncAddress data [probe 20]
7058 | | | | OSSHLibrary::getFuncAddress data [probe 30]
7059 | | | | OSSHLibrary::getFuncAddress data [probe 100]
7060 | | | | OSSHLibrary::getFuncAddress exit
7061 | | | | pdLogSysRC entry
7062 | | | | | pdIsDiagLevelOk entry
<skipped>
7111 | | | sqloKADetermineNUMASupport SYSTEM ERROR [probe 50]
7112 | | | | | | sqloclose entry
7113 | | | | | | sqloclose exit
7114 | | | | | | sqloSigMask entry
7115 | | | | | | sqloSigMask exit
7116 | | | | | | sqloSigMask entry
7117 | | | | | | sqloSigMask exit
7118 | | | | | pdLogInternal exit
7119 | | | | pdLogSysRC exit


7111 SYSTEM ERROR DB2 UDB oper system services sqloKADetermineNUMASupport fnc (5.3.15.1286.0.50)
pid 104183 tid 140737016641312 cpid -1 node 0 probe 50
Func.Called: open
System Errno: 0
bytes 504
Data1 (PD_TYPE_DIAG_LOG_REC,488) Diagnostic log record:

2017-12-19-15.00.09.447117+480 E135107179E487 LEVEL: Warning (OS)
PID : 104183 TID : 140737016641312 PROC : db2star2
INSTANCE: db2inst1 NODE : 000
HOSTNAME: myhost1
FUNCTION: DB2 UDB, oper system services, sqloKADetermineNUMASupport, probe:50
CALLED : OS, -, open
OSERR : -1 "Unknown error 18446744073709551615"
DATA #1 : Hexdump, 65171 of 4028196206 bytes
Object not dumped: Address: 0x0000000000000000 Size: 4028196206 Reason: Address is NULL


To find out why sqloKADetermineNUMASupport fails, collect a Linux system call trace with strace:

$ cat start.sh
#!/bin/bash
# Print this shell's PID so that strace can attach to it before db2start runs.
echo "My process ID = $$"
read -p 'Enter to run db2start ...' temp
echo 'Run db2start ...'
/home/db2inst1/sqllib/adm/db2start

Session 1:
chmod a+x start.sh
./start.sh

#sample output:
$ ./start.sh
My process ID = 7200
Enter to run db2start ...

# Note the process ID; in this example it is 7200.


Session 2 (as root):
strace -o db2stat.strace.out -f -p 7200

Note: replace 7200 with the actual process ID reported in session 1.


Session 1:
Press Enter to start db2start.

Session 2:
As soon as db2start returns, press Ctrl+C to stop strace.


In the strace output, the numa_node_to_cpus() call fails because the following open() returns ENOENT:

68475 open("/sys/devices/system/node/node0/cpumap", O_RDONLY) = -1 ENOENT (No such file or directory)
68475 write(2, "libnuma: Warning: ", 18) = 18
68475 write(2, "/sys not mounted or invalid. Ass"..., 73) = 73
68475 write(2, "\n", 1)

The missing file /sys/devices/system/node/node0/cpumap appears to be the cause of the failure.
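
The same condition can be checked directly on the host without a trace. The sketch below walks the node directories under sysfs and reports any node that has no readable cpumap file. The function name check_node_cpumaps is hypothetical, and the optional argument exists only so that the check can be pointed at a different sysfs root.

```shell
#!/bin/sh
# Hypothetical helper: report NUMA node directories under sysfs that have no
# readable cpumap file. $1 = sysfs mount point; defaults to /sys.
# Returns 0 only if at least one node exists and every node has a cpumap.
check_node_cpumaps() {
    sysfs_root="${1:-/sys}"
    node_dir="$sysfs_root/devices/system/node"
    if [ ! -d "$node_dir" ]; then
        echo "MISSING directory: $node_dir"
        return 1
    fi
    status=0
    found=0
    for node in "$node_dir"/node[0-9]*; do
        # Skip the unexpanded glob pattern when no node directory exists.
        [ -d "$node" ] || continue
        found=1
        if [ -r "$node/cpumap" ]; then
            echo "OK: $node/cpumap"
        else
            echo "MISSING: $node/cpumap"
            status=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "No node directories found under $node_dir"
        return 1
    fi
    return "$status"
}
```

On an affected system, check_node_cpumaps would be expected to print a MISSING line for node0 and return a non-zero status.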

Resolving The Problem

The problem is outside of Db2. Contact your Linux support provider to determine why /sys/devices/system/node/node0/cpumap is missing. Before doing so, review Red Hat Bugzilla bug 998678:
https://bugzilla.redhat.com/show_bug.cgi?id=998678
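
Before contacting Linux support, it may also help to rule out the simpler case named in the warning text, namely that /sys is not mounted at all. The sketch below looks for a sysfs entry in the mounts table. The function name is_sysfs_mounted is hypothetical, and reading /proc/mounts is an assumption about how to check; it is not something the warning itself prescribes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given mount point appears as a sysfs
# mount in a mounts table.
# $1 = mount point (default /sys), $2 = mounts table (default /proc/mounts).
is_sysfs_mounted() {
    mount_point="${1:-/sys}"
    mounts_file="${2:-/proc/mounts}"
    # Fields in /proc/mounts: device, mount point, filesystem type, options, ...
    awk -v mp="$mount_point" \
        '$2 == mp && $3 == "sysfs" { found = 1 } END { exit !found }' \
        "$mounts_file"
}
```

For example, `is_sysfs_mounted && echo "/sys is mounted"` prints the message only when a sysfs mount at /sys is listed. If /sys is mounted but the cpumap file is still absent, the cause lies elsewhere and is a matter for the operating system vendor.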

Product: Db2 for Linux, UNIX and Windows
Component: Operating System / Hardware - Other OS/Hardware
Platform: Linux
Version: 10.5; 11.1

Document Information

Modified date:
07 December 2022

UID

swg22013988