Topic
  • 2 replies
  • Latest Post - 2010-06-24T16:48:02Z by dlmcnabb
wgwang
2 Posts

Pinned topic GPFS bad I/O performance for domain server

2010-06-18T16:03:12Z
This is a two-node AIX GPFS cluster.
GPFS version: 3.3.0.6
AIX version: 5300-09-01-0847

When the domain application starts running, GPFS accumulates a heavy list of I/O waiters, but the NSD disks are not very busy and the CPU shows about 50% I/O wait.
I have tried tuning some I/O-related parameters, but the problem still exists.
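For reference, the disk-busy and CPU-wait figures above can be sampled on AIX while the load is running; the interval and count here are only illustrative, not values taken from this cluster:

# extended per-disk statistics: %tm_act and average read/write service times (AIX 5.3)
iostat -D 5 6
# CPU breakdown: the "wa" column shows the I/O-wait percentage
vmstat 5 6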

  1. mmfsadm dump waiters
0x111CC19F0 waiting 0.017181613 seconds, Fsync handler: for I/O completion on disk hdisk3
0x111F03110 waiting 0.052621283 seconds, Writebehind worker: for I/O completion on disk hdisk4
0x111DED5B0 waiting 0.017005079 seconds, Fsync handler: for I/O completion on disk hdisk5
0x111DEA0D0 waiting 0.450624517 seconds, Fsync handler: for I/O completion on disk hdisk5
0x111508510 waiting 0.047789718 seconds, File block random read fetch handler: for I/O completion on disk hdisk3
0x111A3CD10 waiting 0.016830557 seconds, Fsync handler: for I/O completion on disk hdisk5
0x111A382D0 waiting 0.015906413 seconds, File block random read fetch handler: for I/O completion on disk hdisk5
0x11150ED30 waiting 0.367202196 seconds, Fsync handler: for I/O completion on disk hdisk5
0x1114FD9F0 waiting 0.248093785 seconds, Fsync handler: for I/O completion on disk hdisk5
0x1114FD870 waiting 0.012184826 seconds, File block random read fetch handler: for I/O completion on disk hdisk5
0x110E8C550 waiting 0.018406077 seconds, Fsync handler: for I/O completion on disk hdisk2
0x110E89030 waiting 0.037410467 seconds, Writebehind worker: for I/O completion on disk hdisk4
0x110E79230 waiting 0.010858462 seconds, File block random read fetch handler: for I/O completion on disk hdisk5
0x110B94F90 waiting 0.002746035 seconds, File block random read fetch handler: for I/O completion on disk hdisk2
0x110F08110 waiting 0.007616574 seconds, File block random write fetch handler: for I/O completion on disk hdisk3
0x110B52C10 waiting 0.181541293 seconds, Fsync handler: for I/O completion on disk hdisk5
0x110B485B0 waiting 0.009951285 seconds, Fsync handler: for I/O completion on disk hdisk4
0x110B42490 waiting 0.382403531 seconds, Fsync handler: for I/O completion on disk hdisk5
0x1101CB2D0 waiting 0.448739603 seconds, Fsync handler: for I/O completion on disk hdisk5
  2. mmfsadm dump config
allocNearPoolId -1
allocNearRegionId -1
allowDeleteAclOnChmod 1
allowDummyConnections 0
allowRemoteConnections 0
allowSambaCaseInsensitiveLookup 0
allowSynchronousFcntlRetries 1
assertOnStructureError 0
asyncSocketNotify 0
autoSgLoadBalance 0
cipherList EMPTY
clusterId 760611762772921775
clusterName eip_gpfs_cluster.eip_gpfsnode1
crashdump 0
dataStructureDump 1 /tmp/mmfs
dataStructureDumpOnSGPanic 0 /tmp/mmfs
dataStructureDumpWait 60
distributedTokenServer 1
dmapiEnable 1
dmapiEventBuffers 64
dmapiEventTimeout -1
dmapiFileHandleSize 32
dmapiMountEvent all
dmapiMountTimeout 60
dmapiSessionFailureTimeout 0
dmapiWorkerThreads 12
dmapiDataEventRetry 2
eeWatchDogHungThreadCutoff 60
eeWatchDogInterval 90
eeSGUtilThreshold 90
enableIPv6 0
enableLowspaceEvents 0
enableNFSCluster 0
enablePNFSmds 0
enableStatUIDremap 0
enableTreeBasedQuotas 0
enableUIDremap 0
enforceFilesetQuotaOnRoot 0
envVar
eventsExporterTcpPort 1191
failureDetectionTime -1
fgdlActivityTimeWindow 10
fgdlLeaveThreshold 1000
fineGrainDirLocks 1
flushedDataTarget 32
flushedInodeTarget 32
fragmentAllocatorThreshold 16
healthCheckInterval 10
hotListPct 10
idleSocketTimeout 3600
IgnoreNonDioInstCount 0
IgnoreReplicaSpaceOnStat 0
ioHistorySize 512
initPrefetchBuffers 1
isBalancerEnabled 1
joinTimeout 60
leaseDMSTimeout -1
leaseDuration -1
leaseRecoveryWait 35
listenOnAllInterfaces 1
logPingPongSector 1
logWrapBuffers -1
logWrapThreads 8
logWrapThreadsPerInvocation -1
logWrapThresholdPct 50
logWrapAmountPct 10
maxAllocPctToCache 0
maxBackgroundDeletionThreads 4
maxblocksize 1048576
maxBufferDescs -1
maxDataShipPoolSize 266240
maxDiskAddrBuffs -1
maxFcntlRangesPerFile 200
maxFeatureLevelAllowed 1105
maxFileCleaners 8
maxFilesToCache 5000
maxFragmentAllocatorBlocks 4
maxFragmentAllocatorsFree 8
maxInodeDeallocHistory 250
maxInodeDeallocPrefetch 8
maxMBpS 800
maxMissedPingTimeout 60
maxNFSDelegationTimeout 60
maxReceiverThreads 16
maxRelinquishAttrByteRangeThreads 2
maxSGDescIOBufSize 262144
maxStatCache 20000
maxTokenServers 128
minMissedPingTimeout 3
minQuorumNodes 1
minReleaseLevel 1105
mmapKprocs 3
mmapRangeLock 1
mmsdrservTimeout 10
mmsdrservWorkerPool 10
multiTMMountThreshold 2
nfsPrefetchStrategy 0
nsdBufSpace(% of PagePool) 30
nsdInlineWriteMax 1024
nsdClientMaxRetries 1000
nsdMaxWorkerThreads 26
nsdMinWorkerThreads 8
nsdServerCheckingIntervalForMount 10
nsdServerWaitConfig 2
nsdServerWaitTimeForMount 300
nsdServerWaitTimeWindowOnMount 600
nsdThreadsPerDisk 3
opensslLibName /opt/freeware/lib/libssl.a(libssl.so.0.9.7):/opt/freeware/lib/libssl.a(libssl.so.0):libssl.a(libssl.so.0)
pagepool 2617245696
pagepoolMaxPhysMemPct 75
pctTokenMgrStorageUse 25
pCacheLookupRefreshInterval 30
pCacheOpenRefreshInterval 30
pCacheAsyncDelay 15
pcacheParallelReadThresh 1073741824
pcacheSemantics cacheWins
pindaemon 0
pingPeriod 2
pinmaster stack:256K data:4096K
prefetchPct 20
prefetchThreads 818
prefetchTimeout 5
priority 40
privatesubnetoverride 0
psspVsd 0
qRevokeWatchThreshold 0
readReplicaPolicy default
reconnectBrokenSockets 1
retryFcntlTokenThreshold 3
SANergyLease 60
SANergyLeaseLong 3600
sendTimeout 10
seqDiscardThreshhold 1048576
setCtimeOnAttrChange 0
sharedMemLimit 0
shutdownOnLeaseExpiry 3600
sidAutoMapRangeLength 15000000
sidAutoMapRangeStart 15000000
socketkeepalive 0
socketRcvBufferSize 0
socketSndBufferSize 0
statCacheDirPct 10
statfsSyncInterval 60
statliteMaxAttrAge 30
statMaxAttrAge 0
stealHistLen 4096
subnets
syncInterval 70
syncSambaMetadataOps 0
takeOverSdrServ 0
takeovertimeout 600
tiebreakerdisks nsd1;nsd2;nsd3
tokenMemLimit 536870912
totalPingTimeout 120
treatOSyncLikeODSync 1
tscTcpPort 1191
tscWorkerPool 10
uidDomain eip_gpfs_cluster.eip_gpfsnode1
uidExpiration 36000
unmountOnDiskFail 0
usePersistentReserve 0
verbsLibName libibverbs.so
verbsPorts
verbsRdma disable
verbsRdmaMinBytes 8192
verbsRdmasPerConnection 8
verbsRdmasPerNode 0
verifyGpfsReady 0
watchdogtimeout 20
worker1Threads 656
worker3Threads 8
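For context, the I/O-related parameters above (pagepool, maxMBpS, prefetchThreads, worker1Threads, ...) are the ones normally adjusted with mmchconfig; the values below are only illustrative examples, not recommendations for this cluster:

# show the current non-default configuration
mmlsconfig
# illustrative only: enlarge the page pool (-i applies the change immediately on attributes that allow it)
mmchconfig pagepool=4G -i
# illustrative only: raise the throughput estimate used for prefetch/write-behind sizing
mmchconfig maxMBpS=1600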
Updated on 2010-06-24T16:48:02Z by dlmcnabb
  • wgwang
    2 Posts

    Re: GPFS bad I/O performance for domain server

    2010-06-18T16:10:41Z
    Sorry, it is for the Domino application.
  • dlmcnabb
    1012 Posts

    Re: GPFS bad I/O performance for domain server

    2010-06-24T16:48:02Z
    Waiting times of tens of milliseconds are OK for I/O. But times in the hundreds of milliseconds "waiting for I/O completion" mean your disk subsystem is either overloaded or sick. Check the disk subsystem and the system error logs for any indication of problems.
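    For example, checks of that kind can be done with standard AIX and GPFS tools; the interval, count, and disk name below are illustrative (hdisk5 is the disk most often seen in the waiters above):

    # AIX error log: summary, then full detail for one of the slow disks
    errpt | more
    errpt -a -N hdisk5
    # extended per-disk statistics: watch the average service times while the load runs
    iostat -D 5 6
    # GPFS I/O history: recent individual I/Os with their completion times
    mmfsadm dump iohist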