Problem with multi-filesystem cluster
We have a cluster with several filesystems, all served by the same set of NSD servers, and we are observing the following problem. When the disk backend of one of the filesystems saturates, we start to see large NSD thread I/O waiters on the NSD servers, which is expected. Unfortunately, this ends up consuming all of the NSD worker threads (nsdMaxWorkerThreads is 256), so operations on the other filesystems are delayed as well, since every NSD I/O thread is busy. Is there a smart way to handle this, apart from the obvious solution of using separate NSDs for different filesystems?
Updated on 2013-03-28T09:44:27Z by SystemAdmin
bdherr 060000C2N2 21 Posts
Re: problem with multi-filesystem cluster 2013-03-26T14:53:13Z
What version of GPFS are you running? You could throttle the clients by lowering maxMBpS or prefetchThreads, but this depends on your workload and I/O access patterns, and it may have other unintended consequences.
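As a rough sketch of what that throttling could look like on the command line: the node class name `clientNodes` and the numeric values below are placeholders chosen for illustration, not recommendations, and the right settings depend on your GPFS release and workload. Changing prefetchThreads may require restarting the GPFS daemon on the affected nodes before it takes effect, so check the documentation for your version first.

```shell
# Check the current settings (run on a node with the GPFS admin commands installed).
mmlsconfig nsdMaxWorkerThreads
mmlsconfig maxMBpS
mmlsconfig prefetchThreads

# On an NSD server, list the current waiters to confirm that
# NSD worker threads are stuck waiting on disk I/O.
mmdiag --waiters

# Lower the per-node throughput estimate on the client nodes only.
# "clientNodes" is a hypothetical node class; 500 is a placeholder value.
mmchconfig maxMBpS=500 -N clientNodes

# Reduce the number of prefetch threads on the same clients.
# 36 is a placeholder value; a daemon restart may be needed for it to apply.
mmchconfig prefetchThreads=36 -N clientNodes
```

Apply one change at a time and watch the waiters on the NSD servers again afterwards, so that any side effect on client throughput can be traced to a single setting.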
VincenzoVagnoni 27000328NS 112 Posts
Re: problem with multi-filesystem cluster 2013-03-27T10:21:35Z
SystemAdmin 110000D4XK 2092 Posts
Re: problem with multi-filesystem cluster 2013-03-28T09:44:27Z
A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for each node). Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.