APAR status
Closed as program error.
Error description
A logAssertFailed occurs when the pagepool is too small. The following messages appear in the MMFS logs:
[W] The pagepool size may be too small. Try increasing the pagepool size or adjusting pagepool usage with, for example, nsdBufSpace, nsdRAIDBufferPoolSizePct, or verbsSendBufferMemoryMB).
[X] logAssertFailed: !"More than 22 minutes searching for a free buffer in the pagepool"
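To check whether a node shows this signature, the two messages can be searched for in a saved copy of the MMFS log. The following is a minimal sketch, not an IBM tool: the function name find_pagepool_assert and the sample lines are illustrative assumptions, and the match strings are copied from the messages quoted above.

```python
import re

# Substrings taken from the two messages quoted in the error description.
WARNING_SIG = "The pagepool size may be too small"
ASSERT_SIG = re.compile(
    r"logAssertFailed:.*More than (\d+) minutes searching "
    r"for a free buffer in the pagepool"
)


def find_pagepool_assert(lines):
    """Return (warning_count, assert_minutes) for an iterable of log lines.

    assert_minutes holds one integer per matching assert line (the number
    of minutes reported in the message).
    """
    warnings = 0
    minutes = []
    for line in lines:
        if WARNING_SIG in line:
            warnings += 1
        match = ASSERT_SIG.search(line)
        if match:
            minutes.append(int(match.group(1)))
    return warnings, minutes


# Hypothetical sample input, shortened from the messages above.
sample = [
    "[W] The pagepool size may be too small.",
    '[X] logAssertFailed: !"More than 22 minutes searching for a free buffer in the pagepool"',
    "[I] unrelated line",
]
result = find_pagepool_assert(sample)
print(result)
```

To scan a real log, pass an open file object instead of the sample list; the function only needs an iterable of text lines.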
Local fix
Problem summary
The current DIO over RDMA code allocates memory for an NSD operation using the following logic:
1) When RDMA is enabled:
  1.1) Use the encryption temp buffer for regular DIO. Call newBuffer(SO_NO_LOG_WRITES) if encryption temp buffer initialization fails.
  1.2) Use the heap for mmap DIO page out.
2) When RDMA is disabled:
  2.1) Call newBuffer(SO_NO_LOG_WRITES) for regular DIO.
  2.2) Use the heap for mmap DIO page out.
The newBuffer call is risky because the SO_NO_LOG_WRITES flag means it cannot steal a dirty buffer. If the periodic sync and DioHandlerThread end up waiting on each other, a deadlock can result. (The clean thread and Per steal can break the interlock, but under some conditions both are completely inactive.)
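The pre-fix selection logic described above can be sketched as a small decision function. This is an illustration only, not the product's code: the enum BufferSource, the function pick_source_before_fix, and its parameters are hypothetical names introduced here; only newBuffer(SO_NO_LOG_WRITES) comes from the APAR text.

```python
from enum import Enum


class BufferSource(Enum):
    # Hypothetical labels for the three memory sources named in the summary.
    ENCRYPTION_TEMP_BUFFER = "encryption temp buffer"
    PAGEPOOL_NEW_BUFFER = "newBuffer(SO_NO_LOG_WRITES)"
    HEAP = "heap"


def pick_source_before_fix(rdma_enabled, is_mmap_pageout, enc_buf_init_ok=True):
    """Model of the pre-fix choice of memory for a DIO NSD operation."""
    if is_mmap_pageout:
        # Cases 1.2 and 2.2: mmap DIO page out always uses the heap.
        return BufferSource.HEAP
    if rdma_enabled and enc_buf_init_ok:
        # Case 1.1: regular DIO with RDMA uses the encryption temp buffer.
        return BufferSource.ENCRYPTION_TEMP_BUFFER
    # Case 1.1 fallback and case 2.1: a pagepool buffer that cannot steal
    # dirty buffers, which is the path that can wait indefinitely.
    return BufferSource.PAGEPOOL_NEW_BUFFER


print(pick_source_before_fix(rdma_enabled=True, is_mmap_pageout=False,
                             enc_buf_init_ok=False))
```

In this model the risky path is reached in two ways: regular DIO with RDMA disabled, and regular DIO with RDMA enabled when the encryption temp buffer could not be initialized.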
Problem conclusion
The allocation logic is changed as follows:
1) When RDMA is enabled:
  1.1) Use the encryption temp buffer for regular DIO. Use the heap if encryption temp buffer initialization fails.
  1.2) Use the heap for mmap DIO page out.
2) When RDMA is disabled:
  2.1) Call newBuffer(SO_NO_LOG_WRITES) for regular DIO.
  2.2) Use the heap for mmap DIO page out.
Note: the behavior when RDMA is disabled is the same as the pre-4.1 behavior. Alternatively, we could keep using the encryption temp buffer. The heap is preferred at this time because the encryption temp buffer consumes some space from the pagepool and its allocations can still block if the pool is depleted.
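The post-fix selection can be sketched the same way. Again this is a hedged illustration rather than the shipped code: MemorySource, pick_source_after_fix, and the parameter names are hypothetical. The only change modeled is the one stated in the conclusion, namely that the RDMA fallback goes to the heap instead of newBuffer(SO_NO_LOG_WRITES).

```python
from enum import Enum


class MemorySource(Enum):
    # Hypothetical labels for the three memory sources named in the conclusion.
    ENCRYPTION_TEMP_BUFFER = "encryption temp buffer"
    PAGEPOOL_NEW_BUFFER = "newBuffer(SO_NO_LOG_WRITES)"
    HEAP = "heap"


def pick_source_after_fix(rdma_enabled, is_mmap_pageout, enc_buf_init_ok=True):
    """Model of the post-fix choice of memory for a DIO NSD operation."""
    if is_mmap_pageout:
        # Cases 1.2 and 2.2 are unchanged: mmap DIO page out uses the heap.
        return MemorySource.HEAP
    if rdma_enabled:
        if enc_buf_init_ok:
            # Case 1.1: encryption temp buffer for regular DIO.
            return MemorySource.ENCRYPTION_TEMP_BUFFER
        # Case 1.1 fallback after the fix: heap, not a pagepool buffer.
        return MemorySource.HEAP
    # Case 2.1 is unchanged: same as the pre-4.1 behavior.
    return MemorySource.PAGEPOOL_NEW_BUFFER


print(pick_source_after_fix(rdma_enabled=True, is_mmap_pageout=False,
                            enc_buf_init_ok=False))
```

With RDMA enabled, no branch of this model returns the pagepool newBuffer path any more; it remains only for regular DIO with RDMA disabled.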
Temporary fix
Comments
APAR Information
APAR number
IV89732
Reported component name
SPECTRUM SCALE
Reported component ID
5725Q01AP
Reported release
411
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2016-10-06
Closed date
2016-10-06
Last modified date
2019-04-30
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPECTRUM SCALE
Fixed component ID
5725Q01AP
Applicable component levels
R411 PSY U884675
19/04/30 I 1000
Document Information
Modified date:
30 April 2019