Topic
  • 7 replies
  • Latest Post - 2012-06-05T22:42:29Z by db808
botemout
70 Posts

Pinned topic GPFS NSD server with samba running on it

2012-04-30T15:29:33Z
Greetings,

In general, would it be obvious that, if I wanted to provide CIFS access to my GPFS data (to Windows XP clients, of which we still have MANY), the Samba processes should NOT run on NSD servers (GPFS nodes that have physical access to the disks), but rather on regular GPFS nodes? When I was working with Lustre some time back they advised that the file servers shouldn't run any other applications, but I don't think I've seen this said about GPFS file servers (NSD servers). I'm using CTDB, so continued operation after the failure of a server isn't my concern; rather, I'm wondering about performance.

Thanks
JR
Updated on 2012-06-05T22:42:29Z by db808
  • botemout
    70 Posts

    Re: GPFS NSD server with samba running on it

    2012-05-14T17:45:56Z
    Hi all,

    I'm going to assume that, as I thought, it is obvious that there would be a benefit from having my NSD servers not also do double duty serving out CIFS. Either that, or I was very unclear in how I asked the question ;-)
  • SystemAdmin
    2092 Posts

    Re: GPFS NSD server with samba running on it

    2012-05-14T18:03:46Z
    • botemout
    • 2012-05-14T17:45:56Z
    Hi all,

    I'm going to assume that, as I thought, it is obvious that there would be a benefit from having my NSD servers not also do double duty serving out CIFS. Either that, or I was very unclear in how I asked the question ;-)
    I wouldn't bet on it either way.

    If you split the Samba server from the NSD server, then you have two servers consuming more total CPU and network bandwidth, doing protocols and moving data between them, than if you just had one server doing the job.

    So "it depends"...

    There are many variables, among which: how the storage is attached, and to how many NSD servers; what network hardware is being used; and how many spare CPU cycles your NSD servers have left over to handle the Samba protocol ...

    Best to run some tests and see which works best for you.
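    For example, something along these lines (the share name, paths and mount points below are placeholders) gives you a local baseline and one number per configuration to compare:

      # On the NSD server itself: raw GPFS sequential read, no Samba involved
      dd if=/gpfs/fs1/bench/bigfile of=/dev/null bs=4M

      # From a Linux test client: the same data through Samba on the NSD server,
      # then through Samba on a non-NSD GPFS node (use a fresh file per run to
      # avoid measuring the GPFS pagepool cache)
      mkdir -p /mnt/nsd1 /mnt/proto1
      mount -t cifs //nsd1/gpfsshare /mnt/nsd1 -o user=tester
      time cp /mnt/nsd1/bench/bigfile /dev/null
      mount -t cifs //proto1/gpfsshare /mnt/proto1 -o user=tester
      time cp /mnt/proto1/bench/bigfile2 /dev/null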
  • SystemAdmin
    2092 Posts

    Re: GPFS NSD server with samba running on it

    2012-05-15T07:09:56Z
    Clearly IBM don't think so, as the Storwize Unified product comes with just two nodes. They are beefy nodes with some 72GB of RAM, fast system disks and lots of CPU, but you only get two of them.
  • botemout
    70 Posts

    Re: GPFS NSD server with samba running on it

    2012-05-15T19:31:07Z
    I wouldn't bet on it either way.

    If you split the Samba server from the NSD server, then you have two servers consuming more total CPU and network bandwidth, doing protocols and moving data between them, than if you just had one server doing the job.

    So "it depends"...

    There are many variables, among which: how the storage is attached, and to how many NSD servers; what network hardware is being used; and how many spare CPU cycles your NSD servers have left over to handle the Samba protocol ...

    Best to run some tests and see which works best for you.
    Thanks, Marc. I'm in the process of doing some tests; I'll try to post my results.
  • botemout
    70 Posts

    Re: GPFS NSD server with samba running on it

    2012-05-15T19:31:22Z
    Clearly IBM don't think so, as the Storwize Unified product comes with just two nodes. They are beefy nodes with some 72GB of RAM, fast system disks and lots of CPU, but you only get two of them.
    Good point ;-)
  • SystemAdmin
    2092 Posts

    Re: GPFS NSD server with samba running on it

    2012-05-16T09:29:49Z
    • botemout
    • 2012-05-15T19:31:22Z
    Good point ;-)
    It might help that IBM are flipping on undocumented GPFS configuration options to help with Samba.

    allowSambaCaseInsensitiveLookup
    allowWriteWithDeleteChild
    cifsBypassShareLocksOnRename
    cifsBypassTraversalChecking
    syncSambaMetadataOps

    and possibly

    allowSynchronousFcntlRetries

    The very first of these is a CPU workload/performance switch, if you have Samba configured with case insensitive = yes. The next three change GPFS/POSIX semantics to behave more like Windows does, and do exactly what they say on the tin. The fifth option I have no idea what it does, or more precisely why I would want to do that. The last one I only include on the basis of its location in the mmchconfig Korn shell script; it is sandwiched in the middle of all the clearly Samba-related configuration options. I guess it gives more Windows-like semantics, but I don't know/understand in what way.
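    For what it's worth, if these behave like ordinary mmchconfig attributes they would presumably be set the same way as the documented ones. A sketch only: the option names come from the list above, but the yes values and the sambaNodes node class are my own guesses, and since the options are undocumented you should check with IBM support before touching them.

      # Enable the Samba-related switches on the nodes that run smbd
      mmchconfig allowSambaCaseInsensitiveLookup=yes,allowWriteWithDeleteChild=yes -N sambaNodes
      mmchconfig cifsBypassShareLocksOnRename=yes,cifsBypassTraversalChecking=yes -N sambaNodes
      mmchconfig syncSambaMetadataOps=yes -N sambaNodes

      # Confirm what actually got set
      mmlsconfig | grep -i -e samba -e cifs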
  • db808
    86 Posts

    Re: GPFS NSD server with samba running on it

    2012-06-05T22:42:29Z
    • botemout
    • 2012-05-14T17:45:56Z
    Hi all,

    I'm going to assume that, as I thought, it is obvious that there would be a benefit from having my NSD servers not also do double duty serving out CIFS. Either that, or I was very unclear in how I asked the question ;-)
    Hello all,

    I just want to chime in on the "it depends" note on the positive side.

    Yes, in summary for SAMBA performance, "it depends".

    For sequential access to large files, on robustly configured 64-bit contemporary Xeon-class servers with sufficient mechanical disk performance, doing large IO that is exploited by the sophisticated GPFS software, and with Linux explicitly configured to enable large IO, SAMBA can run very well.

    We have been designing and deploying medium-to-large (96 TB to 1.7 PB) GPFS clusters whose major purpose is to act as CIFS file servers for large, media-centric files. Our data sets have average file sizes that are typically gigabytes, but we also have some with an average file size as small as 2.5 MB.

    Our performance runs from stellar to excellent to very good ... typically limited by the mechanical speed of the disk systems that we are using. We typically use a GPFS maxblocksize of 4MB ... even when the average file size is 2.5 MB.
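    For anyone wondering how that block size gets put in place, it is roughly the following (the file system name and stanza file are invented for the example; on the GPFS 3.4/3.5 levels we use, the cluster-wide maxblocksize has to be raised before any file system can be created with a 4MB block size, and changing it means recycling GPFS on the affected nodes):

      # Raise the cluster-wide ceiling first
      mmchconfig maxblocksize=4M

      # Then create the file system with a 4MB block size
      mmcrfs gpfs1 -F /tmp/nsd.stanzas -B 4M -A yes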

    We also use 2 to 4 x 8 Gbit FC controllers per GPFS node on the storage side, and 1-4 10GbE NICs on the NAS side, to meet the throughput requirements. Using multi-gigabyte files, we have run over 700 MByte/sec from a GPFS node to a robust dual-Xeon 5500/5600 series engineering workstation ... through SAMBA. We have also run ~ 980 MB/sec to a similar dual-Xeon 5500/5600 series Linux server using a Linux CIFS client.

    We typically run 4-node GPFS clusters, with each node connected to ALL the visible LUNs, and each also running SAMBA over several 10 Gbit NICs.

    Our fastest 4-node cluster can do ~ 10,000 MB/sec to disk (limited by storage) through GPFS using ~ 20% of the 12-core CPU. That leaves 80% of the 12-core CPU available for SAMBA, FTP, and local processing.

    Key to this performance level is being able to balance the IO load across multiple FC controllers, multiple disk arrays, and multiple LUNs. Avoiding momentary "clumping" of IO, and the resulting momentary starvation of the other IO resources, allows you to scale performance. We have achieved 98% balance across 4 FC controllers, and 92% balance across 12 storage array processors. Without this balance you could not extract the performance, and building a high-throughput or "strong" GPFS node would not make sense. Many IBM GPFS topology discussions focus on a larger number of "weak" GPFS NSD nodes, and in that environment heavy-duty SAMBA processing intermingled with the NSD server work probably does not make sense.
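    One simple way to check whether the load really is balanced while a test is running (the device names are placeholders, and mmdiag needs GPFS 3.4 or later):

      # Per-LUN view: MB/s per device should be roughly equal across all GPFS LUNs
      iostat -xm 5 /dev/dm-*

      # GPFS's own view of recent IO per disk, and anything stuck waiting
      mmdiag --iohist | tail -40
      mmfsadm dump waiters | head -20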

    In our experience, if you can achieve a high-throughput "heavy" GPFS node, with excess CPU capacity, then running heavy-duty SAMBA processing can be done. And it does not cost a lot of server dollars to accomplish this using Nehalem-class or newer servers with appropriate hardware.

    We are running 900-1200 MB/sec of multi-user sequential SAMBA traffic per node, using large files. For smaller files with average file sizes of 2.5 - 8 MB/file, we are running 350-650 MB/sec of multi-user sequential SAMBA traffic per node .... most often limited by our storage.

    We try to use the same versions of SAMBA and GPFS that IBM is using in their scale-out-file-services (SOFS) product to improve stability. Due to more aggressive "tuning" and cost-aware provisioning, we typically run at higher performance levels using less equipment than SOFS would achieve (otherwise we would have just used SOFS).

    The basic philosophy is to cost-effectively avoid areas of congestion. We use 24GB GPFS nodes, for example. Why? They cost ~ $200 more than 4GB servers. For $200 we gave GPFS a 10GB buffer pool, and 10GB to SAMBA ... and we set the parameters to cache a lot of "stuff". We are probably somewhat "wasteful", but all the "waste" costs less than $200. For example, we buffer over 100,000 inodes. Why not? It costs < 2GB of memory ... about $20. We've reduced the amount of metadata activity to a trivial level by using 4MB block sizes, and then cached most of the metadata.
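    In GPFS terms that provisioning is mostly a couple of mmchconfig knobs. A sketch with values in the range we use (the nasNodes node class is a placeholder, and the right numbers depend on your memory and your file counts):

      # ~10GB GPFS buffer pool, plus room to cache well over 100,000 inodes
      mmchconfig pagepool=10G -N nasNodes
      mmchconfig maxFilesToCache=100000,maxStatCache=200000 -N nasNodes

      # Recycle GPFS on those nodes so the new pagepool takes effect
      mmshutdown -N nasNodes && mmstartup -N nasNodes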

    On the SAMBA side, we did similar things ... increasing the buffering to the equivalent of about 1 MB per open file, and empirically testing the impact of various settings.
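    The exact smb.conf settings vary with the workload, but as an illustration of the kind of knobs involved on the Samba 3.5/3.6 level of that era (the values below are examples to test against, not a recommendation):

      # smb.conf [global] excerpt - buffering / large-IO oriented examples
      use sendfile = yes
      aio read size = 1
      aio write size = 1
      write cache size = 1048576
      max xmit = 131072
      socket options = TCP_NODELAY SO_RCVBUF=262144 SO_SNDBUF=262144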

    In our experience, properly configured, SAMBA is reasonably CPU-efficient handling sequential IO to large files ... on contemporary dual-socket Xeon 5500/5600 series servers (Nehalem/Westmere). The new Intel Xeon E5-series (Sandy Bridge-EP with the Romley-EP platform) dual-socket servers have 2x the memory bandwidth and over 4x the IO bandwidth, with 50%-100% more CPU power.

    If you are trying to use SAMBA with small files and smaller GPFS maxblocksize values, especially if you don't have a lot of disks ... you are fighting an uphill battle, in which SAMBA is not the limiting element.

    Obviously, in a steady state environment, unless you are re-reading or re-writing the same files (hopefully cached in the GPFS buffer pool), you are limited by the aggregate performance of your storage system. SAMBA can't go faster than a local program accessing the GPFS files locally.

    Some SAMBA workflows (often with small files) are very difficult. If you look at the SAMBA statistics (or the file sharing statistics on a Windows file server), look at the ratio of "constructive" to "unconstructive" work. How many SMB reads and writes are being done per open/close? How many directory operations and other commands are not involved with moving data to/from the client? I've seen systems where only 20% of the SMB commands were reads and writes; 80% were "unconstructive" work. Tuning and capacity planning for this is much different. Eliminating the unconstructive requests from the client side is often the best first approach.
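    On the Samba side, one way to get at those numbers is the built-in profiling counters. This assumes smbd was built with profiling support, and the counter names in the output vary by Samba version, so treat the awk pattern here as a sketch to adapt rather than something to run blindly:

      # Enable per-operation counters, run the workload for a while, then dump them
      smbcontrol smbd profile on
      smbstatus --profile > /tmp/smb_profile.txt

      # Rough split: SMB read+write calls versus all SMB calls
      awk -F: 'tolower($1) ~ /smb.*(read|write).*count/ {rw += $2}
               tolower($1) ~ /smb.*count/               {all += $2}
               END {if (all) printf "reads+writes are %.0f%% of SMB calls\n", 100*rw/all}' \
          /tmp/smb_profile.txt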

    Yes, in summary for SAMBA performance, "it depends".

    But for sequential access to large files, on robustly configured 64-bit contemporary Xeon-class servers with sufficient mechanical disk performance, doing large IO that is exploited by the sophisticated GPFS software, and with Linux explicitly configured to enable large IO, SAMBA can run very well.

    Hope this helps.

    Dave B