37 replies Latest Post - ‏2015-05-14T02:27:58Z by gpfs@us.ibm.com
gpfs@us.ibm.com

Pinned topic GPFS V3.4 announcements

‏2010-07-30T18:50:00Z |
Watch this thread for announcements on the availability of updates for GPFS v3.4.
Updated on 2013-03-11T18:16:49Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2010-07-30T18:52:28Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.1 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral/?productGroup0=ibm/fcpower&productGroup1=ibm/ClusterSoftware&productGroup2=ibm/power/IBM+General+Parallel+File+System
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2010-09-10T15:47:37Z  in response to gpfs@us.ibm.com
    In order for GPFS tracing to function properly on a system running AIX 6.1 with the 6100-06 Technology Level, you must either install AIX 6100-06-02 Service Pack or open a PMR to obtain an iFix from IBM Service. If you are running GPFS on AIX 6.1 TL 6 without 6100-06-02 Service Pack or the iFix and have AIX tracing enabled (such as by using the GPFS mmtracectl command), you will experience a GPFS memory fault (coredump) or node crash with kernel panic.
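
The condition in this advisory can be expressed as a small check. The following is only a sketch: it assumes the AIX level is available as a string of the form `6100-06-01-1043` (the format `oslevel -s` is expected to print), and the function name and the `ifix_installed` parameter are hypothetical, not part of any IBM tool.

```python
import re


def aix_tracing_at_risk(oslevel: str, ifix_installed: bool = False) -> bool:
    """Return True if enabling tracing is unsafe per the advisory above.

    Assumption: `oslevel` looks like "6100-06-01-1043"
    (release, technology level, service pack, build date).
    """
    m = re.match(r"^(\d{4})-(\d{2})-(\d{2})", oslevel.strip())
    if m is None:
        raise ValueError(f"unrecognized AIX level: {oslevel!r}")
    release, tl, sp = m.group(1), int(m.group(2)), int(m.group(3))
    # Affected: AIX 6.1 TL 6 with a service pack older than 6100-06-02.
    affected = release == "6100" and tl == 6 and sp < 2
    # The iFix from IBM Service is the alternative remedy.
    return affected and not ifix_installed
```

A node for which this returns True should not have AIX tracing enabled (for example through mmtracectl) until the Service Pack or the iFix is installed.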
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2010-10-12T14:50:02Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.2 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.2

    Oct 07, 2010

    Note: This service level addresses the issues below. This is not a definitive list, as other minor corrections have been made which are not listed here.

    • FSCK checks log file inodes even if they have log group number set to -1.
    • gpfsInodeCache slab (and cpu) usage high due to NFS anon dentry allocations.
    • Fix rare occurrence of file fragment expansion happening during file sync that can cause the assert failure related to GETSUBBLOCKSPERFILEBLOC.
    • If a node cannot perform cNFS recovery for a failed node, it now shuts itself down so that another node can take over for both nodes.
    • Prevent FGDL kernel memory fault caused by very narrow race condition during directory lookup.
    • Fix assert related to RCTX.REPLIED, TSCOMM.C that occurs on the FS manager node if the FS manager is running GPFS release 3.2, and a release 3.3 client tries to mount the filesystem.
    • Linux IO: check mm_struct before pinning pages.
    • Improve performance of stat operations on Linux under certain multi-node access patterns.
    • Fixed an incompatibility between GPFS for Windows and the Interop Systems software bundles. This incompatibility caused Interop Systems bundle installation failure.
    • Fix hang between node join thread and events exporter request handler thread.
    • This update addresses the following APARs: IZ81230 IZ83798 IZ84008 IZ84016 IZ84039 IZ84041 IZ84045 IZ84145 IZ84161.
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2010-11-16T19:21:22Z  in response to gpfs@us.ibm.com
    Customers who enable GPFS RDMA on Linux x86_64 with GPFS 3.4 may experience I/O failures, with /var/log/messages reporting error 733 or 735. Customers should contact IBM Service for an efix for APAR IZ88828 until GPFS 3.4.0-3 is available. This forum will be updated when GPFS 3.4.0-3 is available.
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2011-01-13T14:54:29Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.3 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.3

    Jan 06, 2011

    Note: This service level addresses the issues below. This is not a definitive list, as other minor corrections have been made which are not listed here.

    • Fix Assert happening in setCachedRecAddr when the cached disk address is NULL and while the disk address read from disk is a real disk address. Modify the assert to allow this kind of change. And, update the cached address locally.
    • Fix a potential metadata allocation problem where the wrong disk may be selected; this occurs in the selectRandom code branch when invoked in getRandomTargetDisk.
    • Fix repeated RDMA connection attempts due to IBV_EVENT_PORT_ERR.
    • Fix AIX crash caused by kxFreeAllSharedMemory.
    • Fix mmcrfileset returning the misleading error message "InvalidOption" when the value of the "-p" option is invalid.
    • Fix mmbackup of fs with no changes sending files to TSM.
    • Fix mmbackup migration from straight TSM, older backup format (3.2), or recovery when shadow file not present would cause full backup.
    • Avoid very rare assert (BaseFilesetMetadataRecord) during restripe operations such as mmdeldisk.
    • Fix a deadlock between the tsdeldisk and the inode expansion.
    • Fix: An additional characteristic of pathnames with special characters present is they can cause TSM to exit with rc=4. Sometimes this was being mis-handled in mmbackup because the highest and "net status" error code from all the runs of TSM was not recorded/provided.
    • Permit TSM install to be in "bin64" for AIX and find needed config file (dsm.opt) there. Provide enhanced debugging in tsbackup33 using DEBUGmmbackup bits.
    • Add new functions to carefully split file list lines and notice if the split char is showing up in the file name as well.
    • Fix the allocation code which can cause a filesystem to panic with "Too many disks are unavailable" when running out of disk space.
    • Fix a rare race condition where a file system manager failure during a disk status change could cause temporary loss of file system access.
    • Fix kernel assert when dmapi event generator is accessing null sgP pointer.
    • Fix mmcheckquota showing a confusing error message when scanning a large fs. This was due to insufficient memory.
    • Provide inode space expansion during directory entry creation if new inode number is out of range.
    • Fix an erroneous assert check in fsck cleanup path.
    • Fix rare deadlock that occurs if NSD server fails during mmcrfs cmd.
    • Fix mmapplypolicy having an internal error. Or, as a workaround, avoid using -i and -g and -N together.
    • Provide support for recognizing DSM_CONFIG env variable in mmbackup.
    • Fix mmimgbackup failing when rebooting another node. Keep tsapolicy exit code at 0 when recoverable or expected errors occur. Important for scripted usage of mmapplypolicy that depends on the mmapplypolicy exit code being 0, even during multi-node operation or when one or more "helper" nodes fail.
    • Fix FSErrValidate error in ACL GC while inode expansion is also running.
    • Provide a --block-size option for the mmlssnapshot and mmlsfileset commands.
    • Fix race condition between two remove threads removing same file.
    • Fix deleting a snapshot containing sparse files that could, in some rare cases, cause temporary loss of file system access.
    • Fix race condition between deferred deletions and policy file creation.
    • Fix assert "aceLength > 0" in tsgetacl for default ACL on a dir in a remote fs.
    • Fix a rare problem in the mmrestorefs error code path: CHECK_CONTINUE_ON_ERROR was starting the next inode without advancing the inode block buffer to the next inode buffer pointer.
    • Fix asserts in fsck while trying to fix corrupt directories.
    • Fix deadlock when a preMount callback invokes mm commands.
    • memcmp now defined in rtnetlink.
    • Fix permitting FSET snap handles inappropriately modifying the fssnap->magic number prior to a tsfattr(GPFS_SYNC_FS).
    • Fix for quote error in the m4 command invocation when running mmapplypolicy.
    • Relocate mmbackup related temporary files from root to <gpfs>/.mmbackupCfg/.
    • Fix sort->$sort
    • Permit recovery after non-fatal TSM error codes.
    • Improve GPFS mmstartup time & other GPFS commands in adminMode=allToAll cluster.
    • mmexpel --wait command aborts the wait in case of quorum loss.
    • Avoid structure error assert after mmdelsnapshot when cached files in other snapshots are accessed.
    • Fix buffer length calculation for dmapi user event returned by dm_get_events call.
    • Fix dm_handle_to_path so that it can look up the directory name by its own handle.
    • Fix rare assert in fsync code path.
    • Fix problem of disk going off line with error 733.
    • Fix mmrestoreconfig failing. This occurs when quota command returns E_NO_QMGR while file system is being closed but, has not completely closed yet.
    • Fix problem of disk going off line with error 735.
    • Fix return code of appendOnly file checking routine.
    • Improve mmdeldisk progress time.
    • Add additional example files, tspgrep and tsprm, to samples/ilm.
    • Fix a remote cluster not always picking a local NSD server when readReplicaPolicy=local is set.
    • Fix a startup synchronization issue that prevented GPFS autoload from working on some Windows systems.
    • Fix to fputattrs to restore/clear xattrs and to fputattrs_withpathname to not overwrite pathname once copied into the kernel. Users of IBM Tivoli Storage Manager Backup Archive Client and GPFS storage pools with policy RESTORE rules for data placement should apply this fix before restoring data to the GPFS file system.
    • Improve performance of small writes (< 32k) over NFS to a file opened with O_DIRECT on the NFS client.
    • Fix Linux crashing in cxigetinodenum on lockd call.
    • Fixed mmchfs code path that is leaving filesystem mounted internally.
    • Add support for RHEL 4.X x86_64.
    • Fix mmlssnapshot with -d option not showing data and metadata usage.
    • mm commands now display the conflicting-program message while the command is waiting to run, instead of after it has finished.
    • mmapplypolicy now suppresses progress messages if a non-interactive device is being used.
    • Fix the fsck block compare operation that results in a SEGV due to buffer overrun.
    • Reduce message traffic when writing a file with NFS.
    • Fix synchronization of dmapi destroy event thread and dmapi event response thread.
    • Fix assert in dmapi event timeout handlers.
    • Fix forced unlink of a fileset (mmunlinkfileset -f) on Linux causing temporary loss of file system access if there were deleted files still open in the fileset at the time it was unlinked.
    • Use TRCBUFSIZE environment variable for trace buffer size and ensure it is not overwritten by config parameter.
    • Fix some 64 bit counters in GPFS SNMP.
    • Properly restore windows attributes.
    • Improve performance of mixed random read/write workloads on large files over NFS.
    • Fix an invalid conditional assert in the fsck orphan management code.
    • Fix mmrestoreconfig failing, with tsdefquotaon failed rc=245, if quota command returns E_NO_QMGR while file system is being closed, but has not completely closed yet.
    • Avoid asserts and deadlock by having mmlsfileset and mmlssnapshot commands wait while mmcrsnapshot command runs.
    • Fix logAssertFailed: rmr1 != rmr2 when using GPFS RDMA.
    • Fix enabling fs with diskea and overflow block feature.
    • Fix a SEGV problem in mmlsquota when printing user quotas for a fileset.
    • Fix a case in mmchfs which leaves the FS internally mounted for a small time period after mmchfs is over.
    • Initialize filesetNameP
    • Fix deadlock issue for flushing mmapped files.
    • FILESET_CMD_PERM for mmchfileset on Windows has been temporarily disabled.
    • Fix Linux mmdelacl returning E_OPNOTSUPP for files in a "-k nfs4" fs.
    • Add useDIOXW configuration variable to avoid Direct IO token thrashing when using some IO requests that match the GPFS blocksize.
    • On SLES11, a privileged user is now not allowed to create a file in a remote filesystem even though root squashing is enabled because the DAC_OVERRIDE capability was specified in the credential.
    • Fix mmbackup recording failed files incorrectly.
    • Update unlinked fileset handling code to properly cull paths from a new (3.4)-style shadow file and sort by inode number into the updated list file. Exempts the unlinked fileset contents from being expired from TSM.
    • Fix duplicate session id returned by dm_create_session due to clock out of sync problem.
    • Speed up snapshot creation and unmount on systems with a large amount of dirty data in the cache.
    • Suppress implicit file time updates after an explicit set time operation was performed on the same handle. These semantics apply only to Windows systems.
    • Allow dmapi clients to acquire access rights to a file that is being destroyed.
    • Add two new APIs which can be used to improve performance for applications that make many API calls on Linux.
    • Tweak TSM Query code to recover files more accurately when doing shadow file reconstruction, including file names with break chars in the name.
    • Improve performance of random updates to a large file from a single node, if that file was previously accessed by another node.
    • Fix an assert during multiple instances of restripe running in parallel.
    • Fix a surplus indirect block not being processed during restripe.
    • Fix a problem that could cause some files to become unreadable when running mmrestripefs on a system with a small page pool or a workload that causes high demand for page pool buffers.
    • Speed up the reclaim of unused GPFS inodes on Linux.
    • Do not allow GPFS internal extended attributes to be set using the gpfs_fputattrs API. Require root authority to set DMAPI external attributes or external namespace attributes other than "user."
    • Avoid problems preventing upgrading v3.3 filesystems with snapshots.
    • Fix problem of restoring EA of pre 3.4 file system without fastea enabled.
    • This update addresses the following APARs: IZ86550 IZ87150 IZ87153 IZ88685 IZ88701 IZ88704 IZ88745 IZ88747 IZ88748 IZ88749 IZ88752 IZ88828 IZ88900 IZ88904.
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2011-01-26T15:05:44Z  in response to gpfs@us.ibm.com
    A fix introduced in GPFS 3.3.0-11 and in GPFS 3.4.0-3 changed the returned buffer size for file attributes to include additional available information, affecting the TSM incremental backup process due to the selection criteria used by TSM. As a result of this buffer size change, TSM incremental backup will treat all previously backed up files as modified, causing the dsmc incremental backup process to initiate new backups of all previously backed up files. If the file system being backed up is HSM managed, this new backup can result in recall of all files which have been previously backed up. This effect is limited to files backed up using TSM incremental backup; there are no known effects on files backed up using either GPFS mmbackup or the TSM selective backup process.

    This issue is resolved in GPFS 3.3.0-12 (APAR IZ92779) and GPFS 3.4.0-4 (APAR IZ90535). Customers using the TSM Backup/Archive client to do incremental backup (via dsmc incremental command) should not apply GPFS 3.3.0-11 or GPFS 3.4.0-3, but should wait to apply GPFS 3.3.0-12 or GPFS 3.4.0-4. Any customer using TSM incremental backup and needing fixes in GPFS 3.3.0-11 or 3.4.0-3 should apply an ifix containing the corresponding APAR before executing dsmc incremental backup using these PTF levels, to avoid the additional file backup overhead, and (in the case of HSM-managed file systems) the potential for large scale recalls caused by the backup. Please contact IBM service to obtain the ifix, or to discuss your individual situation.
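
The guidance above reduces to a check on the installed PTF level. The sketch below is illustrative only: the function names and the `ifix_applied` parameter are hypothetical, and it assumes the level is supplied as a string in either of the two forms used in this thread (`3.4.0-3` or `3.4.0.3`).

```python
import re

# PTF levels that, per the advisory above, cause dsmc incremental
# to treat previously backed up files as modified.
AFFECTED_LEVELS = {(3, 3, 0, 11), (3, 4, 0, 3)}


def parse_gpfs_level(level: str) -> tuple:
    """Parse '3.4.0-3' or '3.4.0.3' into a tuple of four integers."""
    parts = re.split(r"[.-]", level.strip())
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unrecognized GPFS level: {level!r}")
    return tuple(int(p) for p in parts)


def safe_for_dsmc_incremental(level: str, ifix_applied: bool = False) -> bool:
    """Return True if TSM incremental backup is not expected to re-send all files."""
    # An ifix containing the corresponding APAR makes an affected level safe.
    return ifix_applied or parse_gpfs_level(level) not in AFFECTED_LEVELS
```

For example, `safe_for_dsmc_incremental("3.4.0-3")` is False, while the same call for `3.4.0-4`, or for `3.4.0-3` with `ifix_applied=True`, is True.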
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2011-03-03T14:34:44Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.4 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.4

    February 17, 2011

    Note: This service level addresses the issues below. This is not a definitive list, as other minor corrections have been made which are not listed here.

    • Update mmtracectl to have --format and --noformat flags which allow one greater control over whether to format traces.
    • Fix rare race between flushBuffer and mergeInode updating lastDataBlock.
    • Fix a problem when RDMA connecting between two clusters where the IB networks are not connected.
    • Fix an assertion at daemon startup on 32 bit linux nodes when prefetchThreads and worker1Threads configuration variables are set too high.
    • Fix a bug in mmchdisk caused by premature loop exiting if a metadata disk and a data disk are found.
    • Add a make parameter "LINUX_DISTRIBUTION" for non-standard Linux distributions. e.g. make LINUX_DISTRIBUTION=REDHAT_AS_LINUX Autoconfig.
    • Fix race between nfsWatchdog and delsnapshot.
    • Fix the allocation code which caused an assert on the filesystem manager node after encountering an I/O error.
    • Fix code in displaying of cmd "mmlsfileset dev -L -p". Let this command show all the fileset status.
    • Fix a bug in mmwindisk utility (called from mmdevdiscover, for instance) that could cause the program to fail when the Windows node has certain uncommon storage devices attached.
    • Fix cxiIsNFSLock erroneously returning FALSE for NFSv4 lockctl calls.
    • Detect and skip the continuously repeated lines (from pgalloc), enable the MaxLoopCheck_dumpBufQueue and set its default value to 2, so it will not impact performance greatly.
    • Fix allocation code causing delete disk to fail when deleting last disk of a failure group.
    • Fix copyInodeBlock to not try to copy data block if it is EOF.
    • Fix Windows code to handle, or enable access on, all Unix filesystem objects such as block devices, FIFOs, sockets, and symlinks. All such objects will be listed as 0 byte files on Windows. No other operation such as read/write/delete/modify will be allowed from Windows.
    • Fix code which caused long waiters and daemon shutdown on leaseloss.
    • Fix subblock to disk sector conversion routines handling invalid disk addresses. This is done by returning E_INVAL back to the caller during fsck scan.
    • Fix a race condition which can cause daemon shutdown when adding/deleting disk from the file system and when the inode prefetch thread is hitting a tiny time window.
    • Improve performance of file system metadata scan phases of mmrestripefs.
    • Add gpfs_set_times() and gpfs_set_times_path() API.
    • Avoid assert when a snapshot is created immediately after deleting a large file.
    • Fix subblock size not keeping consistency among the cluster.
    • Fix background activity continuing for a significant time after the mmrestripefs or mmdeldisk command has been killed.
    • Fix code which could cause daemon assert and segmentation fault during filesystem takeover.
    • Include mount event disposition in dm_get_disp() call.
    • Fix invalid assert in fcntl lock token relinquish path.
    • Fix lost quota code when calling tslsquota -j with fully qualified device name specified.
    • Prevent GPFS from starting while certain admin commands are running.
    • Fix assertion in mmcheckquota when it encounters quota entries with invalid fileset ids.
    • GPFS for Windows now disables SMB2 on the node during installation.
    • Fix unnecessary work during mmrestripefs and mmdeldisk commands.
    • Fix quorum formation when the /var/mmfs/gen/BallotFile file is too small.
    • Fix a long waiter problem caused by an infinite loop in mmdefragfs.
    • Fix kernel assert in gotVinfoLock when doing read/write mmap.
    • Fix a workload that continuously invokes operations that require exclusive inode locks, such as chmod or chown, could prevent mmrestripefs or mmdeldisk commands from making progress.
    • Fix rsh/rcp failing in mmcrcluster when the second interface is being used to create a node and the hostname interface does not have permission.
    • Fix connecting to server, when running mmsdrcli, in cluster with daemon and admin on two different interfaces.
    • Do not return windows attributes blindly for gpfs_fgetattrs() API.
    • Fix file creates and deletes being blocked for a long time after a node failure, especially in file systems with a large number of inodes.
    • Fix allocation code to prevent an assert that could occur while trying to delete/replace disk.
    • Fix the fsck block compare method to support variable data / metadata block size routines failing and resulting in asserts during the fsck compare operation.
    • Fix file system policy information obtained by mmsnmpagentd.
    • Fix mmsdrfs generation changed event in events exporter.
    • Ensure the scope of remaining configuration parameters is not changed as a result of a delete operation on a configuration parameter.
    • Always refresh session list that registered for mount. This avoids a mount failure scenario where the session could be deleted or added while another node is processing the mount event.
    • Improve GPFS admin commands initiated on a daemon down node, forcing the execution to run on an active node, in a large cluster.
    • Fix IcQueryDirectory implementation for FILE_ID_BOTH_DIR_INFORMATION and FILE_ID_FULL_DIR_INFORMATION to correctly assign the file ID field.
    • Fix performance when sequentially overwriting an existing file in contiguous sector-aligned pieces that are less than the block size, do not read or prefetch any data.
    • Add GPFS mount options nfsHashName and nonfsHashName. If nfsHashName is in effect, the NFS FH will include the hash value of the file name. This option will improve performance but is off by default. Turning it on might cause some ENOENT error with NFS.
    • Fix ACL garbage collector deleting auto-generated Windows SID mappings.
    • Fix race condition starting too many mmkprocs when using many mmapped files.
    • Change the allocation code to prevent looping when migrating blocks after a disk's failure group assignment, data type or storage pool has changed.
    • Use seqDiscardThreshhold for determining if a buffer stays in pagepool after it has been consumed by a reader, or after it has been flushed by a writebehind thread. This allows setting a large seqDiscardThreshhold to cache most files below that size. Setting a low writebehindThreshold will start flushing sequentially written files so that a large pagepool does not fill up with dirty buffers.
    • Improve performance of snapshot, copy-on-write in particular, immediately after the snapshot is created and a large number of files are updated from multiple nodes in the cluster.
    • Fix gpfsInodeCache slab (and cpu) usage high due to NFS anon dentry allocations.
    • Improve sync performance when there are many dirty buffers.
    • Fix race condition when performing extensive AIO on a node.
    • Add padding variable to SGPoolDataStored structure to prevent data misalignment.
    • Update gpfs_fcntl.h copyright statement.
    • This update addresses the following APARs: IZ90535 IZ90866 IZ92310 IZ92316 IZ92318 IZ92320 IZ92325 IZ92330 IZ92413 IZ92427 IZ92467.
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2011-04-14T17:29:42Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.5 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.5

    April 07, 2011

    Note: This service level addresses the issues below. This is not a definitive list, as other minor corrections have been made which are not listed here.

    • Avoid showing 'unknown' version in mmfslog.
    • Fix assignment causing unaligned access warning in Linux for IA64.
    • Fix logging recovery when data replication is enabled and metadata replication is not, or following a node failure.
    • Stop internal mount for lsquota command.
    • Change "mmwindisk initialize" to create the GPFS data partition with a 16MB alignment.
    • Allow a node which cannot externally mount a file system, due to page size restrictions, to still serve as file system manager.
    • Fix immutability flags being lost after a file is restored from TSM restore program.
    • Fix Signal 8 (divide by zero) error when processing a deleted inode for prefetch.
    • Write new filesystem device name to all disks to avoid the unwanted warning messages.
    • Added a post-install script in GPFS for Windows that runs "mmautoload postinstall".
    • Create gpfs init lock file on system startup. This ensures GPFS shutdown will be called during system shutdown on RHEL distributions.
    • Fix fileset metadata files with inode numbers outside the reserved range being handled as orphans.
    • Fix file system stopping with an assert on structure error related to an invalid number of disk pointers stored in the inode.
    • Fix quota file block expansion code resulting in an 'oldDiskAddrP == NULL' assert.
    • Allow user space attributes to be set by user commands.
    • Fix an inodeScan interface clean up error that could cause long waiters during unmount.
    • Fix panic in cxiStartIO when disk device drivers are configured for 1024 scatter gather lists.
    • Fix mmbackup failing to backup file with "NONE" in pathname.
    • Fix a fcntl retry message arriving and being processed before the original gpfs_v_lockctl operation completes. This could have resulted in the lock operation referencing the sleepElement after the message handler had freed it.
    • Rework the handling of options for mmapplypolicy to accept blank characters arguments.
    • Fix mmdeldisk allowing deleting a disk without moving existing data off the disk first. This would only have occurred on file systems with metadata replication enabled (-m 2), strict allocation enforced (-K always; the default is "whenpossible"), when running mmdeldisk shortly after creating a new snapshot, and if the only disks remaining are in a single failure group.
    • Performance improvements to mmdelsnapshot.
    • Disallow the colon character in a filename during create/open from Windows. Unix nodes can still create file names that contain a colon. Such files can be accessed on Windows using their 8.3 names.
    • Fix excessive acquires on TcpConnTab mutex. Every five seconds, every receiver thread checked for a broken connection timeout, when only one check is needed.
    • Fix a bug in the log recovery code that was forcing a filesystem panic while offline fsck is in progress.
    • dump functions now dump regionsPerPass for each storage pool.
    • Fix fsck code so that it is able to recover data from a filesystem with a fatal root inode which is a candidate for delete.
    • Fix problem where mmdf would not allow specifying more than one of the -d, -m and -F options.
    • Fix when using EXTERNAL POOL EXEC 'script' rules with an external storage manager that supports "premigration", such as TSM/HSM, mmapplypolicy mistakenly invoked the rules' EXEC 'script' with the subcommand MIGRATE instead of PREMIGRATE for files that should be pre-migrated.
    • Fix for a bug where small synchronous writes to a block pre-allocated using gpfs_prealloc may be lost.
    • Fix isStoragePoolIndexValid(poolIndex) asserts after a storage pool has been deleted.
    • This update addresses the following APARs: IZ94724 IZ95818 IZ95819 IZ95821 IZ95855 IZ96323 IZ96325.
    • mbacchi

      Re: GPFS V3.4 announcements

      ‏2011-06-21T13:26:36Z  in response to gpfs@us.ibm.com
      GPFS 3.4.0.6 is now available from IBM Fix Central:

      http://www-933.ibm.com/support/fixcentral

      Problems fixed in GPFS 3.4.0.6

      June 02, 2011

      Note: This service level addresses the issues below. This is not a definitive list, as other minor corrections have been made which are not listed here.

      • Fix the mmchpolicy code which caused a daemon assert after failing with an E_NOENT error.
      • Fix a quota manager assertion where it could be caching invalid quota file inode after restripe.
      • Simplify expel cmd execution when multiple nodes including clmgr are specified in the cmd.
      • When TSM mangles path names, always leave the old shadow database in place and indicate an error by returning 12. Suggest next mmbackup be run with -q to synchronise shadow file to TSM.
      • Fix assert "!ofP->destroyOnLastClose" or apparent hang under certain workloads with concurrent directory updates from multiple nodes.
      • Fix for a rare race condition involving simultaneous mounts of different file systems that could lead to conflicting tokens being issued.
      • mmtracectl --off will now reset all trace config variables.
      • replace mhAclStore assert when stale ACL data is discovered.
      • Made the Windows code more robust in validating various GPFS handles and data structures before making GPFS internal calls. An inflight shutdown can race and cause various GPFS data structures to go invalid.
      • Modify the reporting of various file system allocation units in terms of GPFS sub-blocks. This forces the Windows Explorer to round up the disk usage for files to the nearest sub-block, which is consistent with the actual allocation size reported for files.
      • Fix a long waiter problem where the quorum is lost while the update mtime handler is already waiting for node recovery.
      • Fix for ProbeClusterThread waiting indefinitely for outbound connect during startup (problem occurs mainly on Windows nodes).
      • Add functionality to suspend write operations on a filesystem.
      • tsdbfs: avoid displaying Inode's wide address fields on narrow disk address filesystems.
      • Disallow immutable flag setting on narrow da fs.
      • Performance improvement in mmdf.
      • Persistent Reserve fix for database clusters where the nsd are directly-attached with many paths.
      • Clean up exit code reporting throughout. Add date/time stamp on exit. If tssnapdir command fails, add output to explain failure. Modify count of failed files to include skipped files in audit log. Permit failure exit if count of errors does not match count of Failed and Skipped files in audit log. Skip over pathnames in audit log if mangled. Permit shadow file update to record progress if all the failed pathnames can be elided properly even if mangled ones were skipped.
      • Wait for recovery to finish before deldisk.
      • Fix long mmstartup delay on AIX after mmshutdown --force.
      • Fix code that could cause rare filesystem panic during sync.
      • Windows maximum shared segment size was increased from 256 MB to 1 GB and the maximum TM memory limit was increased from 1 GB to 128 GB.
      • Avoid extremely rare assert during unmount.
      • Replace portmap with rpcbind on SLES11 for CNFS.
      • Disable expiration time setting on narrow da file system.
      • Speed up node failure recovery for configurations with large values for maxFilesToCache.
      • Fix the Windows codepaths to call gpfsMmap specifying a valid byte-range.
      • Get rid of spurious "approaching limit for the maximum number of inodes" when a new FS manager takes over.
      • Ensure sdr server is not running on old configuration servers after they are rebooted.
      • Correctly set the list of registered RPC programs after restarting portmap.
      • Fix CNFS failover problem with SLES11 or later.
      • Fix a deadlock that may occur in the presence of node failures.
      • Fix a GPL build break on RHEL 4.X when LINUX_KERNEL_VERSION=2060900.
      • Fix the allocation code which caused signal 11 under certain error conditions.
      • Speed up traceback dumping on Linux by saving the symbol table the first time it is read.
      • Improve the fairness of the outbound RPC message queue, so that unlucky reply threads will not be stuck indefinitely under heavy communications loads.
      • Rare race condition causing spurious ENOSPC error when creating files.
      • Change the GPFS code that handles the conversion of a file handle to a dentry so that it returns ESTALE to NFSD, since ESTALE is the valid return code for file not found.
      • Change the SIGTERM and SIGINT signal handler and the methods it invokes so that all methods are "signal-safe", invoking only library calls that are Posix, specified as signal-safe. This code is still responsible for removing temporary files that tsapolicy creates.
      • There was a race/synchronization problem in the relatively new gpBS (buck slip) protocol that was introduced into rel 3.4 on 2011/02/25. This was fixed by adding code to synchronize on a new flag, p_beg2_sent, so that we are sure every session is bracketed by a begin... fini sequence.
      • Fix thread traceback stopping early and not translating some addresses to symbols.
      • Performance improvement in snapshot copy-on-write.
      • Fixed an invalid condition check that asserts the daemon instead of doing a proper cleanup when a filesystem manager takeover happens during a quota check or fsck operation.
      • Fix problem when writing replicated data with direct-IO, where GPFS recovery after a node failure could interfere with concurrent updates to the same file from other nodes.
      • Prevent GPFS from starting when registering the pagepool to infiniband fails.
      • Make multi-threaded execution procedure local static constructors and/or destructors thread safe by adding code to avoid multiple execution.
      • Improve handling of symbolic links in mmbackup.
      • Improve error handling in mmbackup.
      • Prevent a rare FSSTRUCT error accessing large directory in a snapshot from multi-threaded application on AIX.
      • Fix a memory leak when getting extended attributes of a file.
      • Performance improvement in snapshot copy-on-write.
      • On 64 bit big endian architectures with 64 bit (void*) type and 32 bit (unsigned int) type, an old incorrect ccListVector::append() method stores the uint in memory such that a subsequent vec[i] access method retrieves 0000 instead of the stored value. A first attempted fix did not work due to a bug in the optimizing compiler (did work at level -O0), so a final fix was made to ts/pc/fc/cclistvect.h on 2011/05/06 using an explicit union type instead of "funky" casting.
      • Clean up the mmnfsmonitor monitor process after removing a CNFS node.
      • Found TSM command exit codes were being discarded by execscript due to missing escape slash before $rc.
      • When exec script returns nonzero and mmapplypolicy returns EBADF(9) just print message and allow cleanup code to discern whether failure is fatal or not.
      • Change variable name to badPathCnt as it represents path names mangled by TSM, not files that were "skipped" by TSM because they were busy. Eliminate use of lstat() in determining this and simply see if backupDir is a common root of the failed path. If badPathCnt is nonzero, will have to fail the backup.
      • Upgraded the Windows build environment to WiX 3.5, which includes an updated DifxApp.
      • Fixed regression caused by token management performance improvements.
      • Re-bury the --choice-algorithm=fast option, so that program behaviour defaults to the old "exact" with sort choice algorithm. Also, fix the choice-algorithm option propagation bug.
      • This update addresses the following APARs: IZ98687 IZ98699 IZ98702 IZ98771 IZ98981 IZ99353 IZ99355 IZ99356.
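
The ccListVector::append() item above describes a 32-bit value stored into a 64-bit, pointer-sized slot on a big-endian machine and then read back as zero. The following is a minimal Python sketch of that class of bug, not the GPFS code. The helper names are hypothetical, and it assumes the faulty path wrote the 32-bit value at the start of the 8-byte slot while the reader took the low-order 32 bits of the whole slot; the announcement does not spell out the exact mechanism.

```python
import struct

SLOT_SIZE = 8  # bytes in a 64-bit (void *) sized slot


def write_u32_view(value, byteorder):
    """Write a uint32 at the start of an 8-byte slot, as a cast pointer would.

    byteorder is '>' for big endian or '<' for little endian.
    """
    slot = bytearray(SLOT_SIZE)
    struct.pack_into(byteorder + "I", slot, 0, value)
    return slot


def read_low32_of_slot(slot, byteorder):
    """Mismatched reader: load all 64 bits, then keep the low-order 32."""
    (whole,) = struct.unpack_from(byteorder + "Q", slot, 0)
    return whole & 0xFFFFFFFF


def read_u32_view(slot, byteorder):
    """Matched reader: use the same 32-bit view the writer used (union style)."""
    (value,) = struct.unpack_from(byteorder + "I", slot, 0)
    return value


if __name__ == "__main__":
    v = 0xDEADBEEF
    for order, name in ((">", "big endian"), ("<", "little endian")):
        slot = write_u32_view(v, order)
        print(name,
              "mismatched read:", hex(read_low32_of_slot(slot, order)),
              "matched read:", hex(read_u32_view(slot, order)))
```

On the simulated little-endian layout both readers return the stored value, which is one plausible reason a bug of this kind can go unnoticed until the code runs on a big-endian platform. An explicit union keeps writer and reader on the same view of the storage.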
      • gpfs@us.ibm.com
        gpfs@us.ibm.com
        216 Posts
        ACCEPTED ANSWER

        Re: GPFS V3.4 announcements

        ‏2011-08-03T13:51:47Z  in response to mbacchi
        GPFS 3.4.0.7 is now available from IBM Fix Central:

        http://www-933.ibm.com/support/fixcentral

        Problems fixed in GPFS 3.4.0.7

        August 02, 2011

        Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.
        • Reference any user handle passed to the kernel using the correct access mode and security check.
        • Fix code which can lead to assertion in a very rare case.
        • Solved rare race condition which may lead to a kernel crash for 64bit AIX boxes when starting GPFS and unloading GPFS kernel extension concurrently.
        • Fix deadlock involving fcntl locking operations that can occur on Linux systems with 2.6.18 based kernels under memory pressure.
        • Fix a problem in libmmantras that was causing application core dump.
        • Fixed a problem found when reading alloc sum file.
        • Fixed mmexpelnode to avoid 'Failed to locate a working cluster manager' errors.
        • Fixed bad assertion in InodeLkObj::token_revoke.
        • Fixed remount code that caused mmmount to still show success after remount failed.
        • Fix for a rare race condition when GPFS startup and shutdown race each other, resulting in a spurious assert.
        • Fix to allow policy rules files with extremely long lines, such as those that may be generated by a program or a script. A viable workaround would be to just insert a few newline characters to break long lines.
        • Command mmlsquota with option -d only allows one option among -u, -g, -j.
        • Fix 'unknown opcode 0x20' warning message in trace back of linux on x86_64.
        • Add protection to filesystem name in preUnmount event callback to prevent kernel panic.
        • Call putECred on exit from kxWinOps and kxSetTimes.
        • Fix a possible deadlock situation after a node which is the primary NSD server fails while I/O requests are being handled by that NSD server. Threads may be waiting for "Change SG desc".
        • Correct disk usage problem in recent Linux kernels with certain LANG setting.
        • Fix long waiters caused by error handling in mmpmonNodeListRequest message handler.
        • Serialize creation of grace period thread.
        • Fix log code which can cause filesystem cleanup to get stuck, which could then prevent other GPFS commands from executing.
        • Remove extra IOs when closing a sequentially written file that is larger than the writebehindThreshold.
        • Fix rare kernel memory race condition doing AIX IO when the disk max_transfer setting is smaller than the GPFS blocksize.
        • Address an issue with mmrestoreconfig in a mixed-version cluster.
        • Added new api to restripe a very big file in chunks.
        • Fix GPFS RDMA ibv_reg_mr error 12.
        • Fix segfault in tsapolicy due to destructor being called twice.
        • Fix code that can cause mmdefragfs to never finish. This could occur only when all disks are almost full.
        • Fix UTF8 encoding check routine to return correct path name length.
        • If socketMaxListenConnections is set to the correct value for the number of clients that can connect simultaneously, clients attempting to connect at the same time will not observe connection failures.
        • Fix to mmexpelnode to avoid indefinite blockage when a network partition is present and the target node is the cluster manager.
        • Fix EA limit calculation when metablock size is smaller than 64KB.
        • Fix mmapplypolicy in a mixed AIX/Linux cluster AND using the --sort-buffer-size option.
        • Fixed Windows node assert using gpfsSetAllocFileSize on a data-replicated file system.
        • Fix mmrepquota to show human readable form instead of scientific notation for negative number.
        • Fixed a problem during quorum type change and when tiebreaker type is set to callback.
        • Permit mmbackup to analyze audit log even if "Failed" messages are not detected in STDOUT from dsmc selective command. Should help in rare cases where TSM runs out of space and exits rc=12 without emitting any of the usual failure messages.
        • Fix problem in enable fastea path when some other file system error happens at the same time, which interrupted enable fast ea operation.
        • Fix exception in GetReturnAddr following a cNFS grace period.
        • Fix large inDoubt left unclaimed problem after running chgrp command.
        • Add checking code to make sure sgID and inode number passed by dmapi handle are valid.
        • Prevent intermittent not-found errors when looking up snapshot pathnames while other snapshots are being deleted.
        • Change allocation code to prevent a deadlock that could occur when rebalancing disks. This problem could only occur if there is a large number of disks (over 32) and most of them are 100% full.
        • Fix to a problem where when a file is opened with O_DIRECT, but there are other processes with the same file opened without this flag, the writes to the file do not get immediately updated on the disk.
        • Fix AIX crash or EINVAL error when calling gpfs_get_winattrs_path.
        • HoldDaemonSeg before referencing configuration flags in the segment.
        • Install this fix if you are using mmapplypolicy ... -i pathlist -g something ...
        • Eliminate possible FSSTRUCT error when accessing snapshots.
        • Restore quota file record size shortened by mistake in 3.4 and fix possible quota file checksum errors introduced by wrong record size.
        • This update addresses the following APARs: IV01082 IV01136 IV01138 IV01793 IV02033 IV02055 IV02253 IV02277 IZ99707.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2011-10-10T17:57:06Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.8 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.8

    October 07, 2011

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Fix a race condition between creating a fileset snapshot on one node and mounting the file system on another node where the file system was internally mounted.
    • Fix gpfs.base RPM to prevent errors during install when install_initd is not available.
    • Errors in an off-line hierarchical storage manager will result in errors when trying to access the data from the on-line "stub" file. The on-line operation will fail with the return code from the off-line hierarchical storage manager. Deleting an on-line "stub" file, which has a snapshot that also references the off-line data, requires that the off-line data be recalled on-line so that it may be captured in the snapshot. If the off-line error is due to a permanent data loss, then all retries will also obtain the same error, resulting in the user being unable to delete the on-line file. This change allows the on-line file to be deleted when there are permanent errors in the off-line storage. The snapshot file remains on-line, but access to the lost file data will result in EIO errors to indicate the data loss. This fix requires the most recent version of IBM Tivoli Storage Manager for Space Management.
    • Fix an assertion caused by expanding a quota file fragment.
    • Fix code which can cause daemon assert when mmdelfs is immediately followed by an mmcrfs command with the same FS name.
    • Attempt to restore a missing shadow file before resorting to rebuilding it from query data in all cases. The former behavior was inconsistent with -t full or -t incremental.
    • Invalid assertion in SFSCreateMetaFile.
    • Workaround ksh issue causing spurious command failure in RHEL6 x86_64.
    • Fix code which can cause deadlock when mmdeldisk is running together with a truncate operation on the helper node.
    • Fixed mmdelsnapshot not allowing non-root users to delete global snapshots.
    • Fix a double free problem when doing restripe.
    • Prevent quorum loss at the cluster manager when quorum nodes are being added or deleted in an environment where tie-breaker disks and persistent reserve are used.
    • mmsdrserv should not result in connection failure when multiple clients query the server simultaneously.
    • Fix NULL ptr exception in aioComplete.
    • In kSFSGetattr(), the check for pcache fileset needs to account for OpenFile being compact.
    • Don't print out "GPFS Deadman Switch timer has expired" in the system error log if there are no outstanding IOs.
    • OpLock token revoke can cause deadlock on cacheObjMutex.
    • Add debug statement in exec script to detail expire errors. Restore lost progress indication for every 30mins to output Backing up files. Limit error messages about policy nonzero return status to occur only for policy errors. In full backup plus query mode preserve the restored shadow file in .mmbackupShadow*.old. Repaired message formatting for TSM failed with RC... message. Repaired message formatting indicating Audit log missing info. Elide decorating double quotes from audit log lines while comparing file names to shadow file. This fixes the most common problem with quotation marks in file names when using the TSM 6.3.0 client (enhanced now at least print the correct path). Change exit status from tsbackup33 when all errors have been corrected to be "4" instead of $worstTSMrc.
    • Take note of any "severe" class error message from dsmc commands and count them. If a severe error occurs, or if ANS1999E message occurs indicating processing of a file list was aborted, then keep old shadow file.
    • Fixed rare deadlock during token recovery.
    • Fix mmfsck to use the correct calculation of the maximum number of disk addresses allowed in an inode to avoid false header corruption reports.
    • Fix mmapplypolicy ... --choice-algorithm=fast.
    • Prevent a node from being added to the cluster a second time under a different name.
    • Fix code to enforce that FILESET type quota entries belong to FILESET_ROOT.
    • Fix an assertion during "mmlsfs -Q" when the file system recovery has failed.
    • Fix mmstartpolicy to accept all valid low disk space event names.
    • Fix for segfault in mmapplypolicy under low or no space conditions.
    • Avoid very unlikely chance of errors after failed mmcrsnapshot command.
    • Free kernel buffer upon exit from kxWinOps.
    • Handle CNFS interface on vlan tagging devices.
    • Dynamically update callbacks that allow restricting to a particular nodeclass when the nodeclass's member nodes change.
    • Fix problem where a file is unnecessarily recalled before it is deleted. The file should not be recalled if there are no more snapshots.
    • Avoid very rare asserts in fileset and snapshot delete commands.
    • Fix a race condition between creating a fileset snapshot on one node and mounting the file system on another node where the file system was internally mounted.
    • Fixed the broken "sync" mechanism when kernel version is greater than 2.6.31.
    • Avoid problems accessing files in snapshots while running the mmunlinkfileset and mmdelsnapshot commands.
    • Fix invalid mmlsquota command syntax error causing signal 11.
    • Work around posix_unblock_lock() prototype changes in later Linux kernels.
    • This update addresses the following APARs: IV03588 IV03589 IV05690 IV06271 IV06329 IV06333 IV06438 IV06830 IV07501 IV07717
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2011-12-02T14:47:43Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.9 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.9

    November 11, 2011

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Fix to daemon assert when node is changed from a quorum to a non-quorum node.
    • Fix for unexpected mmfsd daemon shutdown after "Socket operation on non-socket" error.
    • Fix deadlock caused by some dmapi application running together with snapshot and fileset commands running at the same time.
    • Update reconnect and paxos code to reduce delay in disk election which can cause quorum loss when multiple quorum nodes fail at same time.
    • Fixed a group protocol problem which can cause daemon assert when communication link is down between two quorum nodes only.
    • Fixed code in mount code path to check for quorum loss before getting mount disposition in order to generate mount event.
    • Fix a rare problem which could cause mount to hang in "waiting for SG cleanup" after new FS manager takeover fails due to panic.
    • Rare filesystem hang due to SGPanic before setting filesystem uid.
    • In an environment where a tie-breaker disk is used, fix problem where command mmexpelnode (primarily used in a DB2 cluster) may take more than 2 minutes to run if target nodes are rebooted but come back quickly after the reboot.
    • Fixed assert caused by having extended attributes in the inode of a sparse file.
    • Speed up mmexpelnode execution when multiple nodes fail.
    • Separate audit logs into separate files for each TSM server. Make each loop that iterates over all TSM servers recoverable so that after an error that server's state is marked to indicate the error and the loop continues to the next server. Track shadow file updates, file backup errors, and general TSM command failures separately and compute the return code given to mmbackup.sh based on the degree of success attained over all TSM servers. Rewrite logic at beginning to obtain or recreate a shadow file for each TSM server depending on the command line arguments and state of shadow file. Track serverExits, serverFails, shadowUpdates all in hashes based on $server. Isolate all code for performing query from TSM server and execute this step when needed for missing shadow file from each server. Save query results into a file and refer to it later after file system scan if rebuild needed. Discard TSM messages that are warnings or information during inventory query. Recognize when TSM has nothing backed up yet for the specified file system and that this is not an error. Create new function mergeQuery() for use after file system scan to merge query data with file system scan data using mmcmi rebuildShadow. Eliminate use of lstat() during query rebuild of shadow file saving time. Create new function restoreShadow() to restore a single shadow file from TSM if it is in inventory. Permit rebuilding shadow files even without TSM present by use of --notsm switch. Support the --rebuild options from mmbackup.sh. Disallow use of -q and -t full at the same time. Corrected check of $? after close only of command pipes and not files.
    • Fix a small window where truncating a file could cause restripe to fail.
    • Fixed problem with i/o error handling on HSM file system that could leave locks on a file. The locked file would prevent the file system from being unmounted.
    • Fix deadlock when the cluster configuration manager fails around the time the file system manager is being re-assigned.
    • Add hardlink to /usr/lpp/mmfs/win/gpfs.dll from /usr/lpp/mmfs/bin.
    • Fix assertion occurred during clone file deletion in dmapi enabled file systems.
    • Fix timer tick calculations to avoid AIX times() bug when running on LPARS in shared processor mode (not dedicated processors).
    • Fix log migration code to prevent long waiter during mmdeldisk. The long waiter can occur while deleting working disk with other down disks present.
    • Leave vattr untouched when there is a failure collecting the attributes.
    • We can change fs data replica to max data replica even if it is higher than max metadata replica.
    • Fix read code which caused errno to be incorrectly set to EAGAIN even when O_NONBLOCK is used. This problem only occurs when reading past end of file.
    • Fix problem where the quorum reached event could be triggered twice for one event occurrence during node shutdown.
    • Added new log message to indicate when a sync user callback script finishes.
    • Ensure the presence of excluded disks is always reflected in the mount options string.
    • When a tie-breaker disk is not present, fix delay in completing mmexpelnode when multiple nodes fail and are the targets of the command.
    • Fix an assertion encountered in gpfs_prealloc when the file system panics.
    • Call ridSurplus irrespective of filesize==0. In the case where filesize is less than 1 block (DMAPI) it will also get rid of indirect blocks.
    • Fix hung dmapi dm_release_rights api call in DeclareResourceUsage.
    • Install this fix if you want to move progress messages to STDOUT.
    • NFS performance improvement for workloads that fit in the cache.
    • Added conditional compilation statement so that no error messages will appear when compiling GPL layer on SuSE 10 SP1.
    • Fix a condition that was causing mmsnmpagentd to terminate abnormally.
    • Improve NFS performance when file is already in the cache.
    • Fix quota file checksum errors caused by using wrong byte-ordering.
    • rmdir sets errno to EPERM when erroneously applied to a fileset junction.
    • Ensure the sg is not panicked before we trigger a logassert in updateDataBlockDiskAddr.
    • Fixed gpfs rpm dependence problem so that no error message will appear if rpm -ivh gpfs* is used.
    • Avoid deadlock in recovering from certain error conditions in mmlinkfileset.
    • Add mmapBufstructs configuration setting to allow more page requests which may avoid vmmiowait deadlock.
    • Fix a rare compatibility problem that can lead to file system corruption. The problem only occurs in the rare case where number of allocated inodes is not a multiple of inodes per inode block.
    • Fix a rare compatibility problem that can lead to file system corruption. The problem only occurs when certain file system with odd number of inodes was migrated to 3.4 format.
    • Ensure REMOVE, RMDIR and RENAME operations are committed to disk with 'nfssync' mount option specified.
    • Avoid assert when deleting a very large directory from multiple nodes.
    • Fix performance of directIO that has to go to NSD server.
    • File struct cleared during reclock revoke handler UNLCK.

    • This update addresses the following APARs: IV07043 IV07777 IV08201 IV08411 IV08731 IV08733 IV08734 IV08737 IV08980 IV08994 IV09017 IV09018 IV09023 IV09055.
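
The read fix above concerns errno being set to EAGAIN on a read past end of file. For reference, here is a small Python sketch of the behaviour applications normally expect from a regular file on a POSIX system: a read at or beyond end of file returns zero bytes rather than failing. The helper name and the temporary file are illustrative only, and the sketch exercises whatever local file system backs the temporary directory, not GPFS. It assumes a Unix-like platform where os.O_NONBLOCK is defined.

```python
import os
import tempfile


def read_past_eof(payload, nonblocking=True):
    """Write payload to a temporary file, then read starting beyond its end.

    Returns the bytes obtained from the read past end of file.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.close(fd)
        flags = os.O_RDONLY | (os.O_NONBLOCK if nonblocking else 0)
        fd = os.open(path, flags)
        try:
            # Position well beyond the last byte that was written.
            os.lseek(fd, len(payload) + 4096, os.SEEK_SET)
            return os.read(fd, 1024)
        finally:
            os.close(fd)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(read_past_eof(b"hello"))
    print(read_past_eof(b"hello", nonblocking=False))
```

An application that loops until read returns an empty result relies on this, which is why a spurious EAGAIN at end of file is disruptive.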
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2011-12-20T17:26:32Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.10 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.10

    December 16, 2011

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Fix a race between truncate and restripe that causes an E_HOLE error.
    • Fixed a problem with the tie-breaker disk logic, where a cluster manager would resign from its role because of a disk challenge coming from a node which was no longer a quorum node.
    • Fix missing synchronization between EA updates and logwrap.
    • Fixed the command parser for mmlsquota so that it handles incorrect mmlsquota usage.
    • Fix a file system mount problem when quota files cannot be created according to the metadata replication factor.
    • Defect 815857: Do not allow NumNFSDataObjects increment if NFSKProc is terminated.
    • Defect 815889: Recognize new TSM error when file list aborted. TSM corrected their original misspelling and grammar error with a new one and it also must be recognized.
    • Fix a segfault triggered by mmaddcallback and daemon shutdown at the same time.
    • Fix a deadlock problem when restriping the file system.
    • Fix the directory code which can cause a rare deadlock during mkdir and link fileset. The deadlock could occur when GPFS cannot write to all replicas due to a down disk.
    • Avoid filesystem corruption in rare cases when issuing multiple administrative commands in parallel.
    • Avoid a crash in rare cases involving creating files and snapshots while deleting disks and StripedLogs are enabled.
    • StealNFS() incorrectly takes an object out of the NFSData list even when it does not actually close it (either because it is referenced or belongs to a different filesystem). Subsequently, the background NFS watchdog does not find the vinfo for reaping (but invalidates corresponding gnode). On delayed NFS close, the stale gnode has a connected vinfo. Put NFSData back on the list if close isn't called.
    • Fix a rare assertion during a file system mount when quorum is lost while opening file system disks.
    • The changes ensure that fileset-related checks are not done for older filesystem formats that do not support filesets. This ensures fsck does not generate false positives on corruption.
    • Fix a resource leak problem when running mmfileid.
    • Improve NFS performance.
    • Fix a race condition between 'mmunlinkfileset -f' command and rename operation in AIX and Windows.
    • Fix a completion race in the parallel inode traverse.
    • Fix Linux fatal page fault at memcpy_c+0xb when running dsmmigrate.
    • Fix duplicated dmapi session problem caused in cluster manager node takeover process.
    • This update addresses the following APARs: IV10610 IV11351 IV11562.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2012-02-06T19:30:40Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.11 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.11

    February 03, 2012

    • Fixed code to prevent daemon assert which can occur after file system panic due to disk failure.
    • Fix a panic when gpfs shuts down.
    • Improve GPFS log file diagnostics for disk commands.
    • Run cbooP->occupancy callback when THRESHOLD(0,) or no THRESHOLD clause.
    • Fixed the logging code which can cause GPFS daemon assert under some race conditions.
    • Reduce TSBuf lock contention.
    • The code change fixes a race condition in the allocation code that could potentially cause a segment violation problem (SEGV) while online fsck is in progress.
    • Fix a race condition when unlinking a fileset on linux.
    • Fix error handling of tsapolicy: if a file rename fails, we print an error message. Also, if buckslip/bucket inodescan completes with an error, but without a slave disconnect, we do up to 3 retries.
    • Customers affected by this fix are GPFS 3.4 customers that have all nodes in their cluster upgraded to GPFS 3.4 and have upgraded their file system to GPFS 3.4 format so that the filesystem supports 64 bit inode numbers (older versions of GPFS did not support more than 2147483647 files in a single filesystem). Further, for the assert to trigger, the file being operated on must have a large inode number, larger than the older version of GPFS supported, that is, an inode number larger than 2,147,483,647. This can occur for large filesystems (with more than 2,147,483,647 files) or for customers that use filesets, as the newer version of GPFS filesets can cause large inode numbers to be generated because that code leverages the 64 bit inode number space and can partition it for fileset use. The fix applies to any GPFS node architecture (Linux, AIX, etc.).
    • Fix to a sporadic kernel crash with "kSynch.C" in the stack trace, due to an assert in the GPFS kernel module, which may occur around the time the daemon terminates.
    • Fixed problem in mmbackup that caused files larger than 100 GB to be backed up needlessly after rebuilding the shadow file.
    • Fix a problem where the old EA code path does not properly synchronize with log wrap.
    • Fixed mmfsck handling of extended attribute corruption.
    • Fix an assertion encountered during mmedquota command when the file system is not mounted.
    • Updated gpfs.wxs with a ServiceControl entry to uninstall the mmautoload service.
    • The distributed directory scan code in tsapolicy was augmented to limit the number of "request" messages that are allowed to be outstanding and unacknowledged.
    • Check for stale mount (privVfsP == NULL) before calling kBeginVnopRd.
    • Support relink operation of Linux Checkpoint-Restart feature.
    • Fix potential kernel panic when a dmapi mount event is waiting for a response while the daemon shuts down at the same time.
    • Fix a problem where GPFS does not properly handle cxiUXfer failure.
    • In some cases, in configurations where multiple NSD servers exist for a given NSD and none of the servers can successfully perform IO to that NSD, the client node can fail to take appropriate action due to being unable to write data to that disk. Fix this problem so that the client now takes appropriate error handling action(s).
    • Performance improvements for small concurrent reads and writes to a single file over NFS.
    • Speed up recovery when the GPFS daemon terminates at the cluster manager node.
    • Fix an assertion during umount caused by online checkquota that could leave filesets locked.
    • Fix a deadlock problem when deleting a file.
    • Fix a deadlock between the quota manager and inode prefetch in accessing quota files.
    • Fix to a problem where, with a file opened with the O_DSYNC flag, the data may not be written to disk before the write() system call returns.
    • Allow more prefetch and writebehind caching of sequential access files up to prefetchPct of the pagepool.
    • Fix a race condition. Mixed (mmap and regular) reads and writes to the same file on multiple nodes could cause some mmap writes to be lost.
    • Fix assert caused by concurrent mmap write and regular read/write.
    • Fix problem in mmlsquota which could cause daemon to crash when there are more than 32 quota enabled file systems.
    • The Windows installation script (mmwininstall.sh) was changed to properly set the "setuid" bit on mmcmdwrap. This program is used to allow ordinary user access to commands like mmdf.
    • Unauthorized chown command was not rejected and setid bits were cleared.
    • Fix to mmfsck to remove erroneous warnings about extended attribute overflow blocks not matching.
    • This update addresses the following APARs: IV11661 IV12202 IV12326 IV12359 IV12901 IV12930 IV12932 IV12936 IV12937 IV13273.
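
The inode number item above quotes 2,147,483,647 as the older limit, which equals 2^31 - 1, the largest value a signed 32-bit integer can hold. The following Python sketch shows what happens to a larger inode number when it is forced into such a field. That older GPFS code used a signed 32-bit field is an assumption made here for illustration, and the function names are hypothetical.

```python
import struct

MAX_INT32 = 2**31 - 1  # 2,147,483,647, the limit quoted in the note above


def fits_in_int32(inode_number):
    """Return True if the inode number can be stored in a signed 32-bit field."""
    try:
        struct.pack("<i", inode_number)
    except struct.error:
        return False
    return True


def truncate_to_int32(inode_number):
    """Return the value a signed 32-bit field holds after keeping the low 4 bytes.

    inode_number must be a non-negative integer below 2**64.
    """
    low_bytes = struct.pack("<Q", inode_number)[:4]
    (wrapped,) = struct.unpack("<i", low_bytes)
    return wrapped


if __name__ == "__main__":
    for n in (MAX_INT32, MAX_INT32 + 1, 2**40 + 7):
        print(n, fits_in_int32(n), truncate_to_int32(n))
```

Code paths that still assume the smaller range can misbehave only once an inode number beyond that range actually appears, which matches the conditions the note lists for the assert.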
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2012-03-16T19:29:30Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.12 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.12

    March 16, 2012

    • Updated policy rule to scan only the directory and its dependants.
    • AIX versions of libgpfs API returning errno in rc instead of separately.
    • Correct filesystem hang due to running out of mailboxes.
    • Fix issue where writing to a file with O_DIRECT in the presence of snapshots could sometimes fail to preserve the old file content in the snapshot.
    • Fix assert when many threads append at the beginning of the same file.
    • Fix quota manager locking order causing long waiters.
    • Unblock quota commands earlier in file system recovery.
    • Fix AIX fatal page fault at 0xFFFFF00010000000.
    • Fix a problem where writing to a previously read but not yet modified page in a memory-mapped file will sometimes fail after creating a snapshot.
    • Fix an E_WOULDBLOCK assert caused by mixed mmap/regular read/write on multiple nodes.
    • Fix incomplete cleanup of a failed mmcrfileset command that could cause an assertion at file system close time.
    • kBegin called on a duplicated KernelOperation.
    • Mmbackup fails with DEBUG=1. Finding the snapshot root dir with "mmsnapdir" when DEBUG is set resulted in mmbackup failing to parse the debug output while looking for the snaproot. Recode the function to look for the specific output needed, and change to use tssnapdir instead of mmsnapdir. Fix a few spelling typos in comments and fix comparison of errorFileCount.
    • Prevent problems after incomplete cleanup of failed snapshot create.
    • Fix issue where certain workloads that include frequent updates to large extended attributes could cause deadlock.
    • Fix a problem where a node may be incorrectly expelled from the active cluster in a scenario involving multiple node failures (including a change in cluster manager) and a TCP connection break.
    • Skip renamed/deleted name entry during a Windows query directory call, which could cause the entire query directory call to fail.
    • Fix cluster partition problem when a single-node cluster configured with a tiebreaker disk is expanded to multiple quorum nodes, and a network outage causes the cluster manager to lose contact with the other nodes.
    • Avoid assert after I/O error following quorum loss.
    • In some situations mmrepquota does not release all quota entry holds, which prevents the file system manager from migrating to another node.
    • Mmbackup shadow file rebuild process can result in some files being expired from the wrong server. This happens when there are multiple TSM servers and queried inventory remains from a server reported in the QueryShadow file with path names lexicographically after all the current file pathnames in the live file system. The mmcmi rebuildshadow command has finished processing the input files (QueryShadow and List), and the remaining data in the QueryShadow must be appended to the new shadow file being rebuilt. Remedy by duplicating the code which replaces the QueryShadow values for iAggregate, iRule, etc. with the data saved from the first line of the current list file, which contains valid iAggregate and iRule values pertaining to this TSM server.
    • Fix mmfileid to handle big disk addresses in 32 bit kernels.
    • In mmbackup, check for an unlinked fileset at the beginning and display the correct error if -f is not specified.
    • Fix assert that can be caused by truncating a file to less than one block in size and then expanding the file back to one block in size.
    • Fix mmcheckquota command cleanup on failure that was impeding further invocation of the command.
    • Mmbackup leaves audit log after success. Mmbackup needs to remove audit logs for each tsm server before running and after running each backup job.
    • Fix race condition in prealloc path to avoid assert when one thread is doing prealloc and another is truncating the file.
    • The change is in generating Policy Rule routine in tsbackup33 to add additional statement in where clause.
    • Fix buffer length calculation so that there is room for the complete error message.
    • Fixed segfault in SFSdmGetDirAttrsCode.
    • When scanning through the errors in the audit log file, if mmbackup comes to the end of the list file then it has now made a "reduced copy" of the original list, so it closes the original list, closes the reduced copy, and re-opens the reduced copy starting at the top, treating this reduced copy as the "list file" to continue the search for failures to remove. It continues searching for the audit failures in the ever-reducing list file until done. If it succeeds in finding all audit failures in the list, then it has successfully reduced the list to a proper "new shadow" file state and can exit with some measure of success (1). But if it fails to find all the audit failures in there, then it cannot update the shadow and record success. Repair this routine, which reopens the reduced list, to no longer discard the first 5 lines of the top of the list file. Originally the intermediate shadow file would have had a 5-line header, but the usual 5 lines are not present in this list file anymore as it is in a full shadow database. Simple removal of the code that was discarding the 5 top lines fixes the problem.
    • Fixed hang due to dmapi lock not being released.
    • The Windows installation script (mmwininstall.sh) was changed to properly set the "setuid" and "setgid" bit on tsusercmd. This program is used to allow ordinary users access to commands like mmdf.
    • Fix an issue where writing to a file with O_DSYNC did not correctly commit data if disk space for the file was pre-allocated via gpfs_prealloc().
    • Moved ODM script from root part to usr part in packaging script.
    • Support two more DWARF4 operation codes in GPFS traceback generation code.
    • Avoid a deadlock when quorum loss happens while a restripe is in progress.
    • Permit interpreting O_NOATIME open flag on Linux to set the value placed in openInstance updateATime member. O_NOATIME prevents updating the atime on a file when reading its data on the opened descriptor. This is aimed at supporting the preservelastaccessdate option in TSM for backup. TSM backup/archive client uses the O_NOATIME open flag when this option is selected.
    • Fix calculation and minor mistake in mmcmi rebuildShadow routine.
    • Deadlock when quiesce sneaks in between an NFS close (either through nfsWatchKproc or cleanupStaleNFS) that has already done a kBegin and a subsequent cxiPutOSNode that ends up calling back into mmfs through delete_inode and does another kBegin. Pass a flag to CloseNFS() to allow releasing reservation before calling cxiPutOSNode.
    • Fix a problem that, in certain rare instances, can cause a segfault when a filesystem manager fails or is migrating.
    • Fix a defect where a deleted session could reappear in the session list; when this session was registered for mount, the mount might hang forever.
    • Provide the nanosecond granularity support in gpfs_stat().
    • Avoid double mutex release when SGMount fails to get the up-to-date StripeGroupDesc information.
    • The TSM inventory contains records with unprintable or newline embedded characters. These were rendered by dsmc q b on multiple lines and confused the query interpreter. As a short-term fix, just omit any such records from appearing in the QueryShadow.
    • Fix cluster manager daemon hang when a tie-breaker disk is not used, and the daemon is restarting following a node crash which occurred right after the node had become the cluster manager.
    • Avoid daemon crash when accessing snapshot files while other snapshots are deleted.
    • This update addresses the following APARs: IV03071 IV13931 IV14312 IV14915 IV14917 IV14921 IV15097 IV15098 IV15296 IV15729 IV16015 IV16025 IV16026 IV16063.
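The O_NOATIME item above describes a reader that does not disturb a file's access time, as a backup client would want. A minimal sketch of such a reader on Linux follows. The function name `read_preserving_atime` is hypothetical, and the fallback for platforms without O_NOATIME is an assumption made for portability, not something the announcement describes.

```python
import os


def read_preserving_atime(path: str, chunk_size: int = 65536) -> bytes:
    """Read a whole file while asking the kernel not to update its atime."""
    # O_NOATIME is Linux-specific; elsewhere this degrades to a plain read.
    flags = os.O_RDONLY | getattr(os, "O_NOATIME", 0)
    try:
        fd = os.open(path, flags)
    except PermissionError:
        # Linux rejects O_NOATIME unless the caller owns the file or is
        # privileged, so retry without the flag.
        fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Comparing `os.stat(path).st_atime_ns` before and after the call is one way to check that the flag was honored on a given mount.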
    • gpfs@us.ibm.com
      gpfs@us.ibm.com
      216 Posts
      ACCEPTED ANSWER

      Re: GPFS V3.4 announcements

      ‏2012-05-03T19:20:56Z  in response to gpfs@us.ibm.com
      GPFS 3.4.0.13 is now available from IBM Fix Central:

      http://www-933.ibm.com/support/fixcentral

      Problems fixed in GPFS 3.4.0.13

      May 3, 2012

      • Fixed mmap deadlock due to MML_FLUSH
      • Fix multi-cluster assert when using a common directory.
      • Fixed slow file deletes. Now deletes are done in the background.
      • When current backup has 3.2 format, mmbackup -t full will expire 3.2 format files from TSM server.
      • When existing backup is 3.2 format, incremental mmbackup will keep the format.
      • Avoid very rare deadlock while multiple nodes rename directories concurrently in a deep, shared tree when a snapshot is created.
      • When part of a backup fails during the mmbackup process, a more explanatory error message is shown.
      • Fix a window where a node failure at a particular time during an mmrestripefs or mmdeldisk command could cause the command to leave log files on a suspended or deleted disk. This could subsequently cause mount to fail due to log recovery attempting to access logs on a deleted disk.
      • Load and free symbols and GPFS memory address ranges for each dump to print proper traceback on Linux.
      • mmbackup will exit 1 when auditlog file is not available for result analysis after backup transaction is done.
      • Reload GPFS memory address ranges before each dump to print proper traceback on Linux.
      • When mmbackup --rebuild exits with a syntax/usage error, the error message will be more accurate.
      • Take whether a node is an NSD server into account as a new criterion in deciding which node to expel when two nodes cannot communicate. Also add a new callback script to allow changing the original decision on which node to expel.
      • Fix a deadlock in alloc protocol. This tends to occur in SNC cluster.
      • Fixed mmfsck handling of duplicate address corruptions.
      • Fixed inode inconsistency struc error problem due to lastBlockSubblocks not being set correctly after truncate.
      • Fix to daemon failure ("sgiP->sgMgr == NONE_APPOINTED") after node loses and then regains quorum.
      • Fix a race window that causes msync to return the undocumented error EAGAIN on AIX.
      • After NSD server nodes change, fix ReadReplicaPolicy choice of a local NSD server
      • Fix a long waiter problem during file system umount caused by incomplete cleanup of gpfs_prealloc on error exit.
      • Fix an issue that could cause log recovery to fail if a node fails shortly after time-of-day on the node had been set backwards by more than a few seconds.
      • Fixed rare hang problem with dynamic mailboxes and dmapi event wait mailbox handler
      • Fix a problem that can lead to SAMBA create/rename permission denied
      • Fix a deadlock if GPFS cannot get proper file lock or inode because inode is invalid at some time.
      • Fixed a case where 3.2 filesystem is not resetting hasXAttr flag at file destroy time.
      • gpfsOpen asserts when getInodeStatus==INODE_DELETED but nLink!=0
      • Prevent hangs and crashes if a node fails immediately after running a snapshot create or delete command.
      • Fix a deadlock between the last local restripe thread and remote helpers.
      • mmbackup -t incremental will keep 3.2 backup format if existing backup was done with GPFS 3.2 or earlier version of mmbackup.
      • Fix a problem that makes the whole system unusable when we hit E_NOMEM. Also increase the shared segment hard limit from 1GB to a value based on physical memory and page pool size.
      • No longer provides an improper warning message after an mmchdisk that doesn't actually change anything.
      • Fix assert (callerSerialized == 0) OR (isAllocListChanging())
      • Files created from a Linux node via an AIX/NFSv4 server have bad mtimes
      • Exception in the AIX vattr_to_fattr3 routine.
      • Support DW_CFA_val_offset(0x14), DW_CFA_val_offset_sf(0x15) and DW_CFA_val_expression(0x16) DWARF operation codes for traceback on Linux.
      • Fix a window where a client node unmounting the file system at a particular time during an mmrestripefs or mmdeldisk command could cause the command to leave log files on a suspended or deleted disk. This could subsequently cause mount to fail due to log recovery attempting to access logs on a deleted disk.
      • mmbackup will not replace shadow file by .mmbackupShadow.*.old inadvertently.
      • "Incorrect ACL entry" from mmputacl -i due to overlapping memcpy
      • mmbackup will check detail error code and preserve .mmbackupCfg data for debug.
      • When a backup partially fails, mmbackup continues to compensate the shadow file even though there are multiple failures reported for the same file in the auditlog file.
      • Fix code that can leave filesystem in quiesced state on some nodes after filesystem manager node fails in the middle of a snapshot command.
      • When existing backup is 3.2 format, mmbackup with multiple TSM server will exit with error.
      • Fix calculation of the number of file systems managed by each node, which is used in deciding which node to expel.
      • Fix a rare quota manager deadlock that occurs when it needs to expand the last fragment of a quota file.
      • Fix an FSErrSnapInodeModified error caused by copying quota files to snapshots.
      • mmbackup without -S <snapshot> will not treat a fileset that is deleted but remains in unlinked status in a snapshot as an unlinked fileset.
      • When mmbackup fails before actual backup processing starts, it will exit with 2.
      • Fix bad "bufP == NULL" assertion when downlevel nodes used to do a remote allocation.
      • Added diskFailure event callback support. %fsName and %diskName can be used as parameters for the callback script to indicate which disk has failed. This callback event is a local event and is only triggered on the file system manager node.
      • mmdelsnapshot was hanging in cleanupStaleNFS/kBegin when there was NFS activity.
      • mmbackup will backup regular files or directories that happen to contain "mmbackup" in its path.
      • CNFS: get the correct vlan interface name in SLES.
      • mmbackup -S <snapshot> will skip files that match to exclude criteria.
      • mmbackup -t full on 3.2 backup format will successfully expire 3.2 backup format files from TSM server.
      • mmbackup will consider linked child fileset whose parent fileset is unlinked as unlinked fileset.
      • mmbackup will exit 1 when incremental backup partially fails and shadow file compensation succeeds.
      • CNFS: Fix recovery with multiple VLANs in CNFS IP list.
      • mmbackup will not stop processing even though there's no auditlog file if only expiration processing is done
      • mmbackup will display progress msg "Expiring files..." correctly if expiration transaction takes longer than 30 mins.
      • mmbackup with multiple TSM clients will catch all error messages from dsmc command output.
      • Fix printing of long fileset names in mmrepquota and mmlsquota commands.
      • mmbackup can backup files/directories with long pathname as long as GPFS and TSM support.
      • mmbackup will display a backup/expiration progress message at every interval specified by the MMBACKUP_PROGRESS_INTERVAL environment variable, if specified. Otherwise, mmbackup will display a backup/expiration progress message every 30 minutes.
      • This update addresses the following APARs: IV00000 IV16984 IV16990 IV17337 IV17341 IV17445 IV17454 IV17475 IV17494 IV17900 IV18090 IV18092 IV18095.
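The diskFailure callback item above says that %fsName and %diskName can be passed to a callback script. A minimal sketch of what such a script might look like follows. It assumes the callback was registered (for example with mmaddcallback and its --parms option) so that the two values arrive as positional arguments in that order. The script name, function names, and message format are all hypothetical.

```python
import sys
from datetime import datetime, timezone


def format_disk_failure(fs_name: str, disk_name: str) -> str:
    """Build a one-line description of a diskFailure event."""
    return f"diskFailure: disk {disk_name} failed in file system {fs_name}"


def handle_event(args: list) -> str:
    """Return a timestamped log line, or a usage hint if the arguments are wrong.

    Assumes args == [fsName, diskName], which is how the values would arrive
    if the callback parameters were given as "%fsName %diskName".
    """
    if len(args) != 2:
        return "usage: disk_failure_cb.py <fsName> <diskName>"
    stamp = datetime.now(timezone.utc).isoformat()
    return f"{stamp} {format_disk_failure(args[0], args[1])}"


if __name__ == "__main__":
    print(handle_event(sys.argv[1:]))
```

Because the event is local to the file system manager node, a script like this would only ever run there; forwarding the line to a central log is left out of the sketch.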
      • gpfs@us.ibm.com
        gpfs@us.ibm.com
        216 Posts
        ACCEPTED ANSWER

        Re: GPFS V3.4 announcements

        ‏2012-06-20T16:37:32Z  in response to gpfs@us.ibm.com
        GPFS 3.4.0.14 is now available from IBM Fix Central:

        http://www-933.ibm.com/support/fixcentral

        Problems fixed in GPFS 3.4.0.14

        June 20, 2012

        • mmbackup will filter ANS1361E Session Rejected: The specified node name is currently locked error and will exit error.
        • mmbackup will filter filename that contains unsupported characters by TSM.
        • Fix a problem stealing buffers in a large pagepool after installing 3.4.0.11.
        • Fix for the "iP->i_count == 0" kernel assert in super.c. This problem only affects Linux 2.6.36 and later.
        • Fix a rare deadlock where a kernel process gets blocked waiting for a free mailbox to send to the GPFS daemon.
        • Correct mmlsfileset output for junctions of deleted filesets in some cases.
        • Fix a memory allocation problem when online mmfsck runs on a node with a heavy mmap workload.
        • Prevent the cluster manager from being expelled as a consequence of some communication outage with another node.
        • Fixes problem where the 'expelnode' callback indicates that the chosen node had joined the cluster first.
        • Fix a problem with nBytesNonStealable accounting.
        • Fixed message handler for filesystem quiesce which caused a GPFS assert when the filesystem manager failed while the filesystem was being quiesced.
        • Fix mmap operations to go through the NSD server when direct access to disks is no longer possible.
        • Fix mmsetquota to handle numerical fileset names.
        • Fix an error message in mmchattr command with -M/R/m/r option.
        • Fix a problem where restripe fell into an infinite loop when the sg panicked on the busy node.
        • Fixed rare assert when deleting files in a fileset.
        • Fixed rare hang problem during sg or token recovery.
        • Fix deadlock when doing inode scan (mmapplypolicy/mmbackup) in small pagepool.
        • getxattr for ACLs may overwrite the kernel buffer if small buffer sizes (less than 8 bytes) are specified.
        • When mmbackup shadow file is rebuilt by --rebuild or -q option, mmbackup will get CTIME information from TSM server; hence files modified after the previous backup but before the shadow is rebuilt will be backed up by a subsequent incremental backup.
        • Fix a problem where certain errors from a pdisk, like media errors, caused RecoveryGroup open to fail. Change code to continue attempting to open the RecoveryGroup and simply discount the pdisk(s) returning media errors and unexpected error codes.
        • Prevent disks from being marked as 'down' when a node with the configuration option unmountOnDiskFail=yes receives an I/O error or loses connectivity to a disk.
        • Fix an assert when copying the inode block to the previous snapshot.
        • When mmbackup can't backup files, the message is more informational.
        • Fixed mailbox calls which can lead to deadlock during filesystem quiesce. The deadlock is most likely to happen on an extremely overloaded system.
        • When backup fails (partially or entirely) due to an error from the TSM client, mmbackup will display the error message from the TSM client for easy problem detection. But mmbackup will display the error message only once for the same error even if it occurs multiple times.
        • Make GPFS more resilient to sporadic errors during disk access. Upon an unexpected error during disk open, such as ERROR_INSUFFICIENT_BUFFER, GPFS now retries the call after a brief pause.
        • When compensating the shadow file takes a long time because a backup partially failed, mmbackup will show a progress message.
        • Added logic to reduce the chance of failure for "mmfsadm dump cfgmgr".
        • mmbackup Device -t incremental and --rebuild is valid syntax and will work properly.
        • Fix the problem that deldisk returned success even though it failed.
        • handleRevokeM loops when a lease callback is not responded to.
        • Fixed old bug in getSpareLogFileDA due to a typo.
        • Fix assertion failure when multiple threads use direct I/O to write to the same block of a file that has data replication enabled.
        • Fix daemon crash, during log recovery, when log file becomes corrupted.
        • Avoid deadlock creating files under extreme stress conditions.
        • Fix code to ensure E_ISDIR error get returned when FWRITE flag is used to open a directory.
        • Fix problems with using mmapped files after a filesystem has been force unmounted by a panic or cluster membership loss.
        • Fix quorum loss seen after a ppc64 node reboots.
        • This update addresses the following APARs: IV19040 IV19164 IV20127 IV20299 IV20612 IV20617 IV20626 IV20629 IV20633 IV21510 IV21655 IV21751 IV21757 IV22013.
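The E_ISDIR item above matches the usual POSIX rule that opening a directory for writing fails with EISDIR. A minimal sketch for checking that behavior from user space on a mounted file system follows. It uses the user-level O_WRONLY flag as a stand-in for the kernel-level FWRITE flag named in the fix, which is an assumption, and the function name is hypothetical.

```python
import errno
import os
import tempfile


def open_dir_for_write_errno(directory: str):
    """Try to open a directory write-only; return the errno raised, or None."""
    try:
        fd = os.open(directory, os.O_WRONLY)
    except OSError as exc:
        return exc.errno
    # Reaching this point means the open unexpectedly succeeded.
    os.close(fd)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        code = open_dir_for_write_errno(scratch)
        print(errno.errorcode.get(code, code))
```

Pointing the function at a directory inside the file system under test, rather than a temporary directory, would exercise that file system's open path.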
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2012-07-23T14:20:37Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.15 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.15

    July 23, 2012

    • Fixed potential live-lock in snapshot copy-on-write of the extended attribute overflow block when the next snapshot is being deleted. Problem occurred in rare cases after the inode file increases in size.
    • Prevent an assert accessing files via DIO.
    • When a tiebreaker disk is being used, avoid quorum loss under heavy load when the tiebreaker disk is down but all quorum nodes are still up.
    • Fix an infinite wait during delsnapshot.
    • When a tiebreaker disk is used, prevent situations where more than one cluster configuration manager is present simultaneously in the same cluster.
    • Fix assert "isValid()" that occurs during mmbackup of a snapshot.
    • Update mmrpldisk to issue a warning instead of an error when it cannot invalidate disk contents because the disk is in the down state.
    • mmbackup will report sever error if dsmc hit ANS1351E (Session rejected: All server sessions are currently in use).
    • Fix issue in multi-cluster environment, where nodes in different remote clusters updating the same set of files could cause deadlock under high load.
    • mmbackup will filter filename with newline correctly.
    • Improve error handling for completed tracks.
    • Fix a bug that causes slowness during mmautoload/mmstartup on systems with automount file system. The performance hit is noticeable on large clusters.
    • Prevent very rare race condition between fileset commands and mount.
    • Fixed rare assert in log migration.
    • Update mmlsquota -j Fileset usage message.
    • Fix allocation message handler to prevent a GPFS daemon assert. The assert could happen when a filesystem is being used by more than one remote cluster.
    • Block Linux NFS read of a file when CIFS holds a deny share lock.
    • Speed-up recovery when multiple nodes fail, and multiple mmexpelnode commands are invoked with each failed node as target. Applies mostly to DB2 environments.
    • Fix null pointer dereference in case of I/O failure on gw node.
    • Fix the mmcrfs command to handle the -n numNodes value greater than 8192.
    • Extend mmbackup's tolerance of TSM failures listed in the audit log even when paths are duplicate or unrequested. TSM frequently logs in the audit log a number of unexpected path names. Sometimes the path name is a duplicate due to repeated errors or due to TSM trying to back up objects in a different order than presented in the list file. Other times the object simply was not requested and it tries to back it up anyway. Make mmbackup ignore these log messages during shadow database compensation. Log all uncompensated error messages to files in backupStore (root) in mmbackup.auditUnresolved.<server> and mmbackup.auditBadPaths.<server> Add new debug bit to DEBUGmmbackup: 0x08 to cause a pause before backup activities commence and a second pause before analysis of audit logs. Correct minor errors in close() handling of various temp files.
    • Fix restripe code that could cause filesystem corruption. The problem only affects filesystems that were created without FASTEA enabled but were later upgraded to enable FASTEA via mmmigratefs with the --fastea option.
    • Loss of access to files with ACLs can occur if independent filesets are, or have been, created in the filesystem.
    • Disable NFS performance fix due to data integrity concerns.
    • This update addresses the following APARs: IV21759 IV22009 IV22131 IV22811 IV23809 IV23813 IV23815 IV23843 IV23878 IV24937.
    • IV24937 is documented further at the URL: http://www.ibm.com/developerworks/forums/thread.jspa?threadID=448578&tstart=0
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2012-09-05T18:00:01Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.16 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.16

    September 5, 2012

    • mmbackup will check if session between remote TSM client node and TSM server is healthy and will remove the combination from transaction if non-healthy situation is detected.
    • Fix snapshot creation code to prevent a possible GPFS daemon assert when filesystem is very low on disk space.
    • Fix restripe not properly handling errors returned by copyReplicas, which could cause data corruption.
    • Fix an assert that occurs during deleting an independent fileset.
    • Fixed hang problem when deleting HSM migrated file after creating a snapshot.
    • Fix a GPFS API gpfs_next_inode issue where it doesn't scan the file whose inode number is the maximum inode number of the file system or fileset.
    • Fixed assertion when generating read or destroy events.
    • Fix code that can cause a GPFS daemon assert when multiple threads working on the same file cause a race condition.
    • Fixed sig 11 when background deletion is trying to access an OpenFile object that was removed from cache while waiting for quiesce to finish.
    • Fixed race condition between FakeSync and RemoveOpenFile.
    • Fix a kernel panic caused by a race between two NFS reads.
    • This fix only applies to customers running GPFS on Linux/PowerPC, using WEIGHT clauses in their policy rules.
    • Fix mmdeldisk to ignore special files that do not have data in a pool.
    • Close a hole where gpfs_ireadx/ireadx64 cannot find more than 128 delts. Close a hole where calling gpfs_ireadx/ireadx64 for an overwritten file may hit an assert if the input offset is not 0.
    • Fixed a problem where 'mmchmgr -c' fails on a cluster configured with a tiebreaker disk, resulting in quorum loss.
    • EINVAL returned from gpfs_fputattrs when an empty NFSv4 ACL is included.
    • FSErrBadAclRef reported when lockGetattr called RetrieveAcl with a zero aclRef.
    • Deadlock resulting from out-of-order aclFile/buffer locking.
    • This fix only applies to customers who have set tscCmdPortRange, running mmapplypolicy, running a firewall that is preventing policy from exploiting multi-nodal operation.
    • Fix code to avoid unavailable disks when there is no metadata replication.
    • Fix rare race condition where a node failure while writing a replicated data block under certain workloads could lead to replica inconsistencies. A subsequent disk failure or disk recovery could cause reads to return stale data for the affected data block.
    • Fix hung AIX IO when the disk transfer size is smaller than the GPFS blocksize.
    • gpfs_i_unlink failed to release d_lock causing d_prune_aliases crash.
    • This fix only applies to customers who are on AIX and have gotten "no enough space" errors when running mmapplypolicy.
    • On Windows mask out ReadEA (which is the same as ReadNamed) from unallowed rights so that the lack of it is not interpreted as a denial. Only the presence of an explicit ACE can deny the ReadEA right.
    • This fix applies to any customer who needs to kill the scripts started by mmapplypolicy, or who is wondering why, on AIX, a faulty program started by mmapplypolicy "hangs" instead of aborting.
    • Fix assert "MSGTYPE == 34" that occurs in pre and post-3.4.0.7 mixed multicluster environment.
    • This update addresses the following APARs: IV24383 IV25183 IV25328 IV25392 IV25445 IV25462 IV25764 IV25770 IV26017.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2012-12-12T22:09:58Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.18 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.18

    December 12, 2012

    • Fix a race between lookup and mnode takeover which caused lookup to get inconsistent data.
    • Under heavy load, shrink dcache is called by kswapd and finds a candidate dentry to be pruned, but the attached inode has already been deleted (I_CLEAR). The problem occurs when a pcache NFS lookup finds an anonymous (no-name) dentry in the cache. This can happen if the dentry had been previously created through d_alloc_anon() when instantiating an inode from a cached fh instead of a regular lookup. To "repair" this dentry (fill in the name), we allocate a new dentry with the name being looked up and instantiate it with the same inode using d_materialise_unique() and release the old no-name dentry. d_materialise_unique() drops the inode count on success. When shrink dcache runs, it will find the freed no-name dentry with an attached inode with i_count 0 (and assert). Add an extra hold on the inode before calling d_materialise_unique().
    • Enhancement to handle device name changing upon FC cable broken/recovery on recent Linux distributions.
    • Fix race condition, where under certain stress loads, one or more threads could get stuck waiting for 'UpdateLogger::cleanup waiting for holdCount to go to zero'.
    • Fix EA code which caused GPFS daemon assert on filesystem with FASTEA enabled. This is mostly a problem on Windows.
    • Enhancement to better handle new DM-MP path appearance and improve the file system availability.
    • gpfsClose() called for NFS delayed-open instance of a deleted snapshot asserts that gnode is stale but vinfo still connected. On snapshot delete, disconnect any matching vinfo's for open NFS instances during sgmMsgQuiesceOps.
    • Fix a deadlock which occurs on GNR configurations in certain situations. This deadlock can occur when the active RecoveryGroup server fails, and the backup server experiences a SAS problem that prevents access to a sufficient number of disks, preventing RecoveryGroup recovery. In this situation, it is sometimes possible to see failure recovery blocked because the NSD transactions are waiting for the backup server to take over, when it cannot.
    • Add synchronization between filesystem manager resign and some ACL-related operations. This is needed to prevent a possible GPFS daemon assert while running mmchmgr command.
    • Fix range revoke handler to better handle error conditions such as IO error. Instead of causing GPFS daemon assert, just panic the filesystem.
    • Fixed a bug in new background deletion code where it is trying to queue the deletion instead of handling it when maxBackgroundDeletionThreads is zero.
    • Added additional info to noDiskSpace to distinguish the reason of the event. Reasons could be diskspace or inodespace. Added %storagePool to indicate the pool name when %reason is diskspace. Added %filesetName to indicate the fileset name when %reason is inodespace.
    • Fix code for mmrpldisk where it will migrate data off any suspended disk in addition to the disk being replaced. This can lead to both replicas being placed on the replacement disk.
    • Fixed problem in readpage/splice_read where it is returning EFAULT instead of ETIMEDOUT when accessing HSM migrated file from NFS client.
    • Avoid hang with long 'open snapInode0' waiters.
    • Fix duplicate message sometimes being received following an automatic TCP socket reconnect.
    • Fixed a bug when setting filesize with truncate file operation.
    • Fix a small window where a node failure while flushing a replicated data block could result in mismatched data replicas if log wrap was active just prior to the failure.
    • V7000 environment: prevent cluster manager node from losing cluster membership when the other node goes down or upon a network outage.
    • This fix only affects GPFS Linux users: On Linux operating systems, readdir() API on GPFS filesystem was not returning the valid file types in the d_type member of struct dirent output parameter. Modified the code to return valid file types in the d_type member of struct dirent output parameter of readdir() system API on GPFS file system.
    • Fix a timestamp issue for Linux AIO in which the modification timestamp of a file accessed by Linux native AIO interfaces might be set incorrectly in some cases.
    • This only affects GPFS GNR (Vdisk) users, and only those that are running 3.4.0.17 or later in combination with 3.4.0.16 or earlier, on the server pair for a single Recovery Group. If failover from a newer software version to an older software version occurs during a rolling upgrade, or if the newer software is downgraded back to the older software, then some Pdisks may become "missing", or the whole Recovery Group (and therefore all Vdisks in it) will not be recoverable. This fixes this problem.
    • Customer may have experienced asserts like "isMoreRecent && tailLsn >= other.tailLsn || !isMoreRecent && tailLsn <= other.tailLsn" during log recovery. This fix provides a more graceful workaround so that log recovery can proceed safely.
    • This update addresses the following APARs: IV30006 IV30612 IV30740 IV31102.
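
    The readdir() d_type fix above matters to tools that classify directory entries without issuing a stat() call per entry. A minimal Python sketch follows; the helper names classify_entries and demo are hypothetical and the code is not part of GPFS. os.scandir takes the entry type from the directory listing where the file system supplies it, and falls back to a stat call when it does not.

```python
import os
import tempfile

def classify_entries(path):
    """Return {name: kind} for the entries in path, without following symlinks."""
    kinds = {}
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_symlink():
                kinds[entry.name] = "symlink"
            elif entry.is_dir(follow_symlinks=False):
                kinds[entry.name] = "directory"
            elif entry.is_file(follow_symlinks=False):
                kinds[entry.name] = "file"
            else:
                kinds[entry.name] = "other"
    return kinds

def demo():
    """Build a small temporary tree and classify it (illustration only)."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "subdir"))
        with open(os.path.join(root, "data.txt"), "w") as handle:
            handle.write("x")
        return classify_entries(root)
```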
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2013-01-25T20:07:30Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.19 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.19

    January 25, 2013

    • Reduce unnecessary compensation passes. Duplicate entries in the TSM audit failure list cost mmbackup extra passes over the shadow DB to compensate the failures. Since we sort the fail list based on inode order, just use sort -u option to remove duplicates up front. Prevent throwing away entries that fail the grep by adding a number suffix on the inode order key to ensure best chance at getting it in the right order.
    • Update repair code to prevent replica mismatch on EA data after restarting a down disk. This only affects filesystems that have FASTEA enabled and have EA data that can no longer be stored inside the inode.
    • Correct corruption caused by fix during a rare race updating a shared directory.
    • Fix potential deadlock caused by reloading policy rules when file system manager nodes die and another node takes over.
    • Fixed problem when closing filesystem due to policyfile openfile object still in hash tables and looping in verifyAllGone.
    • Fix striped log file corruption due to snapshot restore.
    • syncnfs mount option not effective for some error tests.
    • Fixed rare deadlock in aclMsgFailureUpdate.
    • This fix applies to all supported releases of the mmapplypolicy command, but is only important if you ever run with -L 3 or higher and a SHOW value that is a very long character string.
    • CNFS: Ignore quorum loss event from remote cluster.
    • Fix race between AsyncRecovery and mmcrfileset that causes an assert.
    • This only affects GNR users, and only those that are running 3.4.0.17 or later in combination with 3.4.0.16 or earlier, on the server pair for a single RecoveryGroup. If failover from a newer software version to an older software version occurs during a rolling upgrade, or if the newer software is downgraded back to the older software, then some Pdisks may become "missing", or the whole Recovery Group (and therefore all Vdisks in it) will not be recoverable.
    • GNR is unable to recover after intermittent disk communication failures. Only affects Vdisk users, in situations where intermittent hardware failure causes multiple disks to temporarily report write errors, while the Vdisk server is writing primordial Vdisk Configuration Data (VCD), and then only if the Vdisk server has to restart (perhaps due to server failover) shortly after the temporary write errors occurred. The failure will be indicated in the log by showing either error 214 or checksum error when recovering. This fix corrects this problem, and allows the Vdisk server to recover cleanly. There is no other practical workaround that preserves user data. If all Vdisk NSDs in the affected recovery group can be destroyed, one can instead manually clear all the Pdisks in the failed recovery group (by overwriting the first 4 MiB with zeroes), then manually delete that affected recovery group (with the -p flag on mmdelrecoverygroup), then recreate the recovery group.
    • Fix a stale nfs file handle error under some conditions when listing in a per-directory snaplink dir.
    • Fix rare race condition in a multi-cluster environment that may cause the gpfs daemon to fail with "!oldDiskAddrFound.compAddr(*oldDiskAddrP)" assert when there are frequent conflicting accesses to the same file from different remote clusters.
    • Fix assert in getDatabuf as the blockOffset was not reset.
    • Fix a problem where only one node is working in restripefs.
    • Fixed error code path in updateLogger::intendWrite to fix self deadlock due to updatelogger mutex.
    • Fix a race condition which could cause the same file allocator to be used by two different files.
    • The fix ensures that mmfsck thoroughly cleans up a filesystem affected by inodes having cross-linked blocks.
    • Fix a race between NFS reads on same file which would cause kernel panic.
    • This update addresses the following APARs: IV33032 IV33109 IV33392 IV33610 IV33612 IV33614.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2013-03-11T18:16:49Z  in response to gpfs@us.ibm.com
    GPFS 3.4.0.20 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.20

    March 11, 2013

    • Update allocation code to reduce manager resource usage during replace disk and rebalance. This should reduce chance of long waiters during these two operations.
    • support for the Linux ATTR_FORCE flag to reset setid bits on truncate.
    • Fix COW code so that it does not try to copy the overflow block into the previous snapshot if the inode doesn't exist in the target snapshot.
    • Reject as invalid an NFS4 ACL that is appended to a posix ACL.
    • Fix a deadlock which can occur during recovery when there is the near simultaneous failure of an NSD server node and disks which are twin tailed to that failed server and another server. There is a narrow timing window within which these multiple near simultaneous failures can trigger the deadlock.
    • Changed the error message when mmclone is run on a filesystem without 'fastea' enabled for AIX.
    • Fix an error of "No such file or directory" after successful mmcrsnapshot.
    • Fixed the allocation code which caused a memory corruption that may lead to FSSTRUCT errors. The problem only occurs when mmadddisk fails due to some unexpected error such as running out of metadata space.
    • This fix applies to all releases of the mmapplypolicy command from 3.4 and onwards.
    • mmbackup will exclude Socket special files from backup.
    • Avoid deadlock writing files in some high stress situations.
    • Fix a problem in mmfileid command so that it can find the disk address of an xattr overflow block correctly.
    • Avoid crashes when snapshots are used with high update loads in a filesystem with at least 200M files.
    • This fix applies to all releases of the mmapplypolicy command from 3.1 and onwards.
    • Fix a restripefs hang by cleaning up the restripeFileP at the end of SFSRepairFileInit if no further repair is needed.
    • This fix applies to all releases of GPFS with an mmapplypolicy command that supports the -q option. The -q option is rather esoteric! Most customers will not use it directly, BUT the mmbackup command does invoke mmapplypolicy with -q.
    • Account for missing storage pool information in disks created prior to GPFS 3.4.
    • Fixed the behavior of resync and failover operations when the Queue State is Dropped, for an AFM fileset.
    • Fix the wrong return code of the dm_read_invis() call so that it returns the true error code instead of -1.
    • Ensure both SSL key files are restored with a single command invocation.
    • gpfs_putacl allows OWNER@ to be denied READ/WRITE_ACL and ATTR.
    • This update addresses the following APARs: IV34111 IV35514 IV35753 IV35759 IV35763 IV36645 IV36647 IV36670 IV36673 IV36674 IV36675 IV37164.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2013-04-22T14:08:31Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.21 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.21

    April 18, 2013

    * Loss of access to files with ACLs can occur if independent filesets are, or have been created in the filesystem.
    * An assert can occur if mmrestripefs runs while ACLs are being modified.
    * Fixed potential mount hang problem caused by multiple dmapi sessions on the same node managing the fs. This situation may happen after the cluster loses quorum, quorum is reestablished, and the cluster manager role then migrates to some other node.
    * Fix a problem of return code checking in DMAPI.
    * Allow mmlsquota to run when the user executing the command does not belong to any group.
    * Correct the GPFS kernel extension on Linux to check the inode's access time value to make sure that it is less than the maximum 32-bit unsigned integer.
    * Install this fix when convenient, unless you are bothered by "orphan" policy processes, in which case, sooner!
    * Fixed a problem with metanode where it is setting fragmentChanged flag without holding wa inode lock.
    * Changed the default value of the tscWorkerPool.
    * Fix a problem which causes remote mounts to time-out in only 2.5 seconds in the presence of a communication outage between the clusters.
    * Avoid a rare assert when mount runs while mmrestripefs is active.
    * Fix an assert on AIX when running ls .snapshots via NFS.
    * Fix mmsnmpagentd error (hang and core dump) due to lack of serialization among threads accessing the socket connection to GPFS daemon.
    * Fix GPFS daemon failure (assert getFragmentSubblocks(); mnode.C) when attempting to preallocate disk space for a file located in a snapshot. gpfs_prealloc() will now return with error EPERM when called on a file in a snapshot.
    * permission-denied when nfsd rebuilds dentry trees on RHEL 5.5.
    * This update addresses the following APARs: IV37403 IV37435 IV37605 IV38390 IV38481 IV38622 IV38638 IV38640 IV38641 IV38644 IV38645.
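
    The access time range check mentioned in the 3.4.0.21 list can be pictured with a short Python sketch. This is not the GPFS kernel code; the helper names are hypothetical, and treating the limit itself (2**32 - 1 seconds since the epoch) as acceptable is an assumption, since the note does not say how the boundary value is handled.

```python
import os

U32_MAX = 0xFFFFFFFF  # largest value a 32-bit unsigned integer can hold

def atime_fits_u32(seconds):
    """True if a timestamp in whole seconds fits in an unsigned 32-bit field.

    Assumption: the boundary value U32_MAX itself is accepted.
    """
    return 0 <= int(seconds) <= U32_MAX

def file_atime_fits_u32(path):
    """Apply the same check to the access time of an existing file."""
    return atime_fits_u32(os.stat(path).st_atime)
```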

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2013-06-10T20:44:17Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.22 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.22

    June 7, 2013

    * Fixed a rare mnode token request failure.
    * Fix assert "(((gpfsNode_t *)((lcl._bufP->b_vp))) == currGnP)" that may occur at mmap stress workloads.
    * RecLockModuleReset crashes in cxiFcntlLock/posix_lock_file
    * Using extended attributes without first running "mmmigratefs <SG> --fastea" and creating fileset snapshots will result in problems accessing snapshot data. This change helps prevent the situation from arising but cannot repair the problem.
    * Use suitable memory size for inode deallocHistory in each independent fileset.
    * Fix assert in flushAllBuffers when there's no logged update on this dirty data buffer.
    * Fix a flaw in the inode replicate compare code that fails to find mismatches in all inodes in a inode block. The fix ensures that all inodes in a inode block are compared, verified and fixed.
    * gpfs_quotactl() should not allow setting quotas for the root user.
    * Only allocate clean buffers for the log descriptor buffer in LogFile::prepareForIO(). Other buffers are allocated correctly already.
    * Fix restripe code that caused inodes to be incorrectly marked as fully repaired even after error has been encountered during repair. This can cause later restripe to skip repairing these incorrectly marked inodes.
    * Fix a deadlock and "22 minutes searching" assert when the FS manager node has to scan so many AllocSegs during initial mount processing that a thread acquiring a log buffer cannot find any buffers to steal.
    * Update mmchdisk to better handle disk usage change.
    * mmbackup improved file exclusion rule.
    * Fix access denied by cNFS RHEL6.3 server.
    * Reduce the number of concurrent quota prefetch requests, per quota client, for the same quota object.
    * Fix a problem where a buffer cannot be stolen within 22 minutes.
    * Make sure pending locks are not granted during mmshutdown cleanup by going into grace period before cleanup starts.
    * reduce distributed locking overhead when a large number of nodes start writing to the same file.
    * Fix assert in flushBuffer because a failed cxiUXfer left a dirty buffer but the last data block of the file was not updated.
    * CNFS: restart rpc.statd if inactive and return to normal operation.
    * When a file is first accessed locally, it gets a file operation table that includes splice_read, which is needed for mmap. This operation is not needed for NFS and is only used to improve performance for other file systems. With GPFS it hurts performance, so it is now disabled for NFS access.
    * Fix AIX mknod command to correctly initialize the kernel gnode.
    * Correct the socketMaxListenConnections when starting the sdr server. The problem only occurs if socketMaxListenConnections is not the default.
    * Work-around to avoid long waiters when configured with enforceFilesetQuotaOnRoot=yes, fileset quota is nearly exhausted and storage consumption occurs by growing directories with many very small or empty files.
    * Fix problem detecting network availability during node reboot.
    * Remove locale dependencies when displaying file system version numbers.
    * Change %diskName for diskFailure callback event from one disk Name to a list of comma separated disk Names.
    * Fix a create or delete snapshot hang caused by a leaked ro lock on an independent fileset.
    * This update addresses the following APARs: IV39231 IV39712 IV39714 IV39716 IV40230 IV40250 IV40465 IV40804 IV41179 IV41541 IV41543 IV41654 IV41862 IV42669 IV42753.
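
    Callback scripts that consume %diskName for the diskFailure event should now expect a comma-separated list rather than a single name. A minimal Python sketch of tolerant parsing follows; parse_disk_names is a hypothetical helper and the disk names in the usage note are made up.

```python
def parse_disk_names(disk_name_arg):
    """Split the %diskName value into a list of disk names.

    Accepts both the older single-name form and the newer
    comma-separated form; empty items are dropped.
    """
    return [name.strip() for name in disk_name_arg.split(",") if name.strip()]
```

    For example, parse_disk_names("nsd1,nsd2") yields two names, while a script written for the old format that treats the whole string as one disk name would look up a disk that does not exist.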

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2013-07-19T16:43:01Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.23 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.23

    July 18, 2013

    * Fix a "VMM iowait" deadlock encountered during mmap flush.
    * Fixed stripe group close issue in panic error which could prevent a node from rejoining the new cluster group after a quorum loss.
    * Apply immediately if you're seeing any segfaults or mysterious mmapplypolicy/tsapolicy crashes during the "execution" phase of the policy command. This defect is more likely to surface on PPC64 machines, but may occur on other architectures.
    * Add nodes with a panicked filesystem to the mountlist during the new sgmgr takeover query to allow the sgmgr to stay long enough to process pending sgmMsgSGPanicLocal requests.
    * The use of LO_XLOCKNOWAIT flag has been extended to ignore the value of the xnew_token_mode to prevent a token revoke from blocking a lock request in lock_vfs_m().
    * Important for customers running mmbackup.
    * Reduce deadlock caused by quota module, especially, when quota limits are close to usage.
    * Fixed a problem where a node attempting to mount a file system on a remote cluster which operates with a tie-breaker disk causes that cluster to lose cluster membership. Problem occurs if the cluster hosting the file system includes nodes that do not have APAR IV21133 (3.3) / IV21759 (3.4) / IV21760 (3.5), or if such nodes remote-mount that file system. Fix needs to be applied to the nodes performing the remote mount.
    * Avoid a rare assert when accessing snapshots during SG manager change.
    * Provide better suggestions if the user specified a conflicting set of parameters in mmcrfs.
    * Fix is highly recommended for GNR/GSS customers. Fix avoids a problem that can cause prolonged filesystem unavailability.
    * Fix a create or delete snapshot hang caused by a leaked ro lock on an independent fileset.
    * Fix mmrestripefs command failure when unmountOnDiskFail is set.
    * Rectify a situation for Linux where the io size exceeds the max hw sectors of the block device.
    * Resolved a kernel soft-lockup issue.
    * A flaw in the mmfsck code path can result in SEGV while scanning a filesystem that has inodes with metadata inconsistency across replicas, where the first copy is the bad copy and the second copy is the good copy.
    * fix stale advLkObjP that was holding an OpenFile.
    * Avoid potential kernel heap corruption if unexpectedly large symlink object is encountered.
    * kernel exception results from a zero-length on gpfs_getacl.
    * This update addresses the following APARs: IV34561 IV41232 IV43762 IV43765 IV43770 IV43772 IV43775 IV43845 IV44276 IV44413 IV44469 IV44972 IV44976.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2013-09-06T16:01:10Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.24 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.24

    September 4, 2013

    * Fixed hang due to COW and ditto resolution locking order
    * Update token code to prevent possible GPFS daemon assert when accessing files via DIO.
    * Fixed log assert during log recovery.
    * If two instances of mmbackup are running against same file system, the second instance of mmbackup should exit without disrupting the first instance of mmbackup.
    * Fix a SGPANIC issue during failure update.
    * Prevent deadlock between image backup and inode expansion.
    * Fix deadlock encountered during mmap flush.
    * Fixed a message timeout calculation error which could wrongly expel nodes.
    * kxFindCloseNFS has been modified to call a new function kxFindCloseSMB which finds the matching NFS instance and then calls LkObjSMB::SMBCloseM to remove its SMBOpen locks if refCount is 0 for the associated NFSDataobject. SMBlock_vfs_m has been changed to call kxFindCloseSMB instead of kxFindCloseNFS1 as well for the same reason. gpfsRead() has been modified to reacquire the missing SMBOpen read lock, similar to what gpfsWrite() has already done. This is necessary as an open NFS instance can now lose its SMBOpen locks due to a revoke.
    * Count the panicked nodes out in recoverRegions() even if they are on the mount list to allow new fsmgr takeover to succeed even when all other nodes panicked.
    * Fix assert in shrinkToFit when shrinking the directory data block.
    * Add defensive code to prevent /etc/fstab corruption when local fs is 100% full.
    * Call CXI_UPDATE_OSNODE in SFSTrunc() only when truncFile() returns E_OK.
    * Fixed 'inconsistency in file system metadata' after delete fileset.
    * Rebuild mmbackup shadow database will use -s <LocalWorkDirectory> during query shadow file sort.
    * Correct handling of mailbox worker creation when worker1Threads dynamically changed after daemon startup.
    * Honor env variable MMBACKUP_LOG_SUFFIX, plus two bug fixes. Provide a way for the common, persistent error log lists in mmbackup to be named with a user-provided suffix to aid in finding them after the run is complete. Usually the persistent error files are named mmbackup.auditBadPaths.<tsm server name>, mmbackup.auditUnresolved.<tsm server name> and mmbackup.unsupported.<tsm server name>; now they can be suffixed with "." followed by the value in the env variable. Also fixed two bugs with other env variables: integer-evaluate DEBUGmmbackup, and carefully check for an empty string with 'eq' on MMBACKUP_PROGRESS_CALLOUT.
    * The default mount point for a GPFS file system cannot be set to "/".
    * Desensitize mmbackup to number formatting. In non-US LOCALEs the number format varies and some dsmc commands persist in using thousands separators that confuse the code about the number of objects processed for various actions. This can result in mmbackup seeing a failure where none exists. Filter all non-numeric characters from portions of dsmc output that are tallied to arrive at success or failure determinations.
    * Added a new undocumented option to fsck -xf <dirName> which will allow user to specify an alternate name for 'lost+found' directory. This directory will then be created by fsck and used in place of 'lost+found'. This option is useful if user finds that existing 'lost+found' directory is corrupt.
    * Fixed a problem where offline fsck can sometimes hit segv in gpfs v3.4.0.20 and above.
    * This update addresses the following APARs: IV45552 IV45555 IV45942 IV46218 IV46547 IV46548 IV47144.
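
    The number-formatting fix in the 3.4.0.24 list amounts to stripping every non-digit character from a tallied count before it is interpreted. A minimal Python sketch of that idea follows, assuming a hypothetical helper parse_count; it is not the mmbackup code and is only suitable for whole-number counts, since a decimal separator would be stripped as well.

```python
import re

def parse_count(text):
    """Parse an object count that may contain locale-specific separators.

    Every non-digit character is removed before conversion, so
    "1,234,567", "1.234.567" and "1 234 567" all give 1234567.
    An input without digits is treated as zero (an assumption).
    """
    digits = re.sub(r"\D", "", text)
    return int(digits) if digits else 0
```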
     

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2013-10-28T19:46:01Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.25 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.25

    October 24, 2013

    * Fixed rare assert during token manager recovery while mount was in progress.
    * Change the locking mechanism of fcntl F_SETLEASE call to provide better support of Linux 3.0 and above kernel.
    * Fixed rare problem in mnode token revoke.
    * Fix assert in flushBuffer when user provided invalid data buffer to write a new preallocated data block.
    * Fixed mmbackup excluding files with SUID or SGID set.
    * Add a new configuration variable to limit number of prefetch buffer per instance. This will help tune single stream read performance.
    * Fixed problem in snapshot copyonwrite for the request that is not within the range of snapshot to which the data is being copied.
    * Avoid returning a stale buffer for read of an uncached file if background prefetch is running and there are WAN delays.
    * GNR systems (on the P7IH hardware with the P7IH disk enclosure) or GSS systems (GPFS storage server on xSeries servers) use mmchcarrier to replace Pdisks, and are vulnerable anytime an error occurs during mmchcarrier.
    * Allow file owner to use gpfs_set_winattrs (posix acl or mode).
    * Fix assert when an NFS client tries to access a file from a deleted independent fileset.
    * Refine the condition whether to grant the XW token for the RO request, if there was recent read activity on at least one of the nodes in the copyset, grant the RO token instead of XW token.
    * Fix the problem that HSM recall might hang due to incorrect return code interpretation from ABORT event response.
    * Fix a rare data corruption error which can be hit by Direct IO operation.
    * Fix problem which caused one of the nodes to remain fenced after a short-term network outage.
    * Ensure mmbackup internal use files and lock are intact when acquiring the lock fails.
    * This update addresses the following APARs: IV48793 IV48795 IV48797 IV49855 IV49859 IV49863 IV49865 IV49867.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2014-01-31T21:38:24Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.27 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.27

    January 31, 2014

    * Fixed problem in metanode optimization during directory lookup.
    * Fix offline fsck incorrectly patching inode fileset id.
    * fix a gpfs/nfs issue on kernel greater than 2.6.36.
    * Fix a problem where a non-smgr failure causes the whole deldisk to fail.
    * Ensures that the rename code gets the flush flag on objects it needs to modify. This change introduces an option in the lock tab entry for file object to optionally acquire flush flag.
    * Fixed fsck abnormal shutdown due to buffer overflow issues when there are a large number of inode and duplicate fragment corruptions in the file system.
    * Assign and verify fcntl sequence numbers in Linux NFS callback path.
    * fix a split-brain problem during cnfs recovery.
    * Rectify a situation for Linux where the io size exceeds the max hw sectors of the block device.
    * This update addresses the following APARs: IV51521 IV52867 IV54381 IV54461 IV54480.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2014-04-09T14:30:25Z  in response to gpfs@us.ibm.com

    Abstract: GPFS directory corruption with possible undetected data corruption

    Problem Summary: When multiple nodes are updating a shared directory concurrently, the problem could cause incorrect results from directory operations issued on one node, leading to orphaned inodes (files inaccessible from any directory entry), or directory entries pointing to deleted or incorrect files.   This problem could also cause silent data corruption, if any disk contains both GPFS metadata and data, and a stale buffer is written to a disk address that has been freed and reallocated for some other purpose.

    Users affected (both of the following conditions must apply for customer to be affected):
    1. GPFS service levels 3.4.0.24, 3.4.0.25, 3.4.0.26, 3.4.0.27, 3.5.0.13, 3.5.0.14, 3.5.0.15, or 3.5.0.16.
    2. Workload consists of concurrent directory updates from multiple nodes.

    Problem Description: See Problem Summary.

    Recommendation:  Customers who have run the affected service levels should upgrade to GPFS 3.5.0.17 or 3.4.0.28 (when available) service level updates (go to Fix Central http://www.ibm.com/eserver/support/fixes/), or should apply efixes for the affected service levels. Customers who have seen FSSTRUCT 1124 or 1122 messages, or EIO errors during directory operations, should also run off-line fsck to identify and repair possible directory damage.
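
    The first condition above can be checked mechanically against a list of service levels. A minimal Python sketch, assuming a hypothetical helper is_affected_level and that the levels have already been collected as strings (for example from an inventory of every level a cluster has run, since clusters that ran an affected level in the past are also covered by the recommendation). The second condition, concurrent directory updates from multiple nodes, depends on workload and is not checked here.

```python
# Service levels named in the notice above.
AFFECTED_LEVELS = {
    "3.4.0.24", "3.4.0.25", "3.4.0.26", "3.4.0.27",
    "3.5.0.13", "3.5.0.14", "3.5.0.15", "3.5.0.16",
}

def is_affected_level(level):
    """True if the given service level string is one of the affected levels."""
    return level.strip() in AFFECTED_LEVELS
```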

  • This reply was deleted by puneetc 2014-04-18T17:51:39Z.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2014-04-14T12:44:13Z  in response to gpfs@us.ibm.com

    GPFS V3.4 for Windows is NOT vulnerable to the OpenSSL Heartbleed vulnerability (CVE-2014-0160)

    See the Flash here - http://www-01.ibm.com/support/docview.wss?uid=isg3T1020686

    Abstract

    GPFS V3.4 for Windows is not vulnerable to the CVE-2014-0160 OpenSSL Heartbleed vulnerability

    Content

    GPFS V3.4 for Windows is NOT vulnerable to the OpenSSL Heartbleed vulnerability (CVE-2014-0160).
    Remediation: No action required.

     

  • This reply was deleted by puneetc 2014-04-22T17:08:07Z.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2014-04-22T17:31:28Z  in response to gpfs@us.ibm.com

    GPFS V3.4 and V3.5 for AIX, Linux on Power and Linux on x86 do not ship OpenSSL but action may be required due to the OpenSSL Heartbleed vulnerability (CVE-2014-0160)

    Flash (Alert)

    http://www-01.ibm.com/support/docview.wss?uid=isg3T1020713

    Abstract

    GPFS V3.4 and V3.5 for AIX, Linux on Power and Linux on x86 do not ship OpenSSL but action may be required due to the OpenSSL Heartbleed vulnerability (CVE-2014-0160)

    Content

    GPFS V3.4 and V3.5 for AIX, Linux on Power and Linux on x86 do not ship OpenSSL but action may be required due to the OpenSSL Heartbleed vulnerability (CVE-2014-0160)


    Remediation:

    If you configure your GPFS clusters to use OpenSSL, consult the licensor of the OpenSSL installed on your system for instructions.

    If you obtained OpenSSL from the Operating System, information can be found at these links:

    AIX: http://www14.software.ibm.com/webapp/set2/subscriptions/onvdq?mode=18&ID=3488&myns=pwraix61&mync=E

    Red Hat: https://access.redhat.com/site/solutions/781793

    SUSE/Novell: http://support.novell.com/security/cve/CVE-2014-0160.html

    After you deploy an unaffected level of OpenSSL on all nodes in your clusters, you should take the following actions:

    1. The following can be done on a small group of nodes at each time (ensuring that quorum is maintained) to maintain file system availability:

    a. Stop GPFS on the node

    b. Install the version of OpenSSL which contains the fix

    c. Restart GPFS on the node

    2. The following should be done only when all nodes, across the multiple clusters, are running an unaffected level of OpenSSL (i.e., when the above steps are completed):

    a. Change the security keys used for secure communications. Refer to the Advanced Administration Guide, Chapter 1: Accessing GPFS file systems from other GPFS clusters, Changing security keys section. The steps should be taken up to, and including, the procedure to ensure that the old key is no longer accepted

    b. If SSH is used to execute remote GPFS commands, then the SSH host keys must also be changed

    c. If SSH is used to execute remote GPFS commands, then SSH user keys/passwords must also be changed.
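
    Step 1 above asks for nodes to be handled in small groups while quorum is maintained. A minimal Python sketch of how such groups could be planned follows. The function name plan_batches and the node names are hypothetical, and the sketch assumes plain node quorum (a strict majority of the designated quorum nodes must stay up); clusters using tiebreaker disks follow different rules. It only plans the groups and does not stop or start GPFS.

```python
def plan_batches(nodes, quorum_nodes, batch_size):
    """Group nodes into batches that never take down too many quorum nodes.

    Assumption: quorum holds while a strict majority of quorum_nodes is up.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    quorum_nodes = set(quorum_nodes)
    needed = len(quorum_nodes) // 2 + 1           # quorum nodes that must stay up
    max_quorum_down = len(quorum_nodes) - needed  # quorum nodes allowed down at once
    batches, current, quorum_in_current = [], [], 0
    for node in nodes:
        is_quorum = node in quorum_nodes
        if is_quorum and max_quorum_down == 0:
            raise ValueError("no quorum node can be stopped without losing quorum")
        # Start a new batch when the current one is full or would
        # exceed the number of quorum nodes allowed down together.
        if len(current) >= batch_size or (
            is_quorum and quorum_in_current >= max_quorum_down
        ):
            batches.append(current)
            current, quorum_in_current = [], 0
        current.append(node)
        quorum_in_current += int(is_quorum)
    if current:
        batches.append(current)
    return batches
```

    With three quorum nodes, at most one of them ends up in any batch, so two remain up throughout. With only one or two quorum nodes the function raises an error, because stopping any of them would cost quorum.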

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2014-07-03T12:56:06Z  in response to gpfs@us.ibm.com

    Abstract
     A recent change in the UEFI driver update for the SAS HBA can result in damage to any disks used for GPFS which previously contained a GPT partition table (due to non-GPFS use) but are now assigned to GPFS, on upgrade.


    Content
    The UEFI firmware includes a function for Disk GPT Table Recovery.  This function will restore the GPT table from the backup GPT table which was stored at the end of the disk, and it is the default function. When a disk contains a backup GPT table but is later used as a GPFS NSD, the GPT Table Recovery action could rewrite GPFS NSD and Disk descriptor headers with the backup GPT table. Thus such NSDs will be lost after the GPT Table Recovery action.


    Users affected:
    This problem may affect Linux customers who have GPFS NSDs that were created with GPFS version 3.5 or earlier, and these disks were partitioned before they were used for a GPFS NSD.


    Remediation:

    An nsdcheck script is available to run against NSD devices to determine if there is a valid backup GPT table on the device. An NSD disk is at risk if the remarks display hasPrimaryGpt=no,hasSecondaryGpt=yes. If the backup table is not valid, the script can then be used to clear the backup GPT table on the NSD device, prior to any firmware updates, as soon as is possible.  

    Running this script is recommended for all Linux customers, as a precaution. The script when used to remove the secondary GPT will only remove it IF AND ONLY IF there is a GPT signature on the last sector of the NSD device but not at the beginning.

    The script is available:
              1.  In the samples directory with GPFS V3.4.0.29 and V3.5.0.19 from FixCentral
     http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.4.0&platform=All&function=all

    http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.5.0&platform=All&function=all

               2. As an attachment to this post.


    Note:  If you need to restore the secondary GPT signature from GptBackupFile to disk, contact IBM Service.  This should not be done without guidance.
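
    To illustrate the at-risk condition described above (no primary GPT signature, but a backup one in the last sector), here is a minimal read-only Python sketch. It is not the nsdcheck script and should not replace it. The function names are hypothetical, a 512-byte sector size is assumed, and the check relies on the standard GPT layout in which the header signature "EFI PART" sits at the start of LBA 1 and of the last sector. The demo runs against a temporary image file, not a real device.

```python
import os
import tempfile

GPT_SIGNATURE = b"EFI PART"  # signature at the start of a GPT header
SECTOR_SIZE = 512            # assumption: 512-byte logical sectors

def gpt_signature_status(path, sector_size=SECTOR_SIZE):
    """Return (has_primary, has_secondary) for a device or image file."""
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)
        if size < 2 * sector_size:
            raise ValueError("device too small to hold a GPT header")
        dev.seek(sector_size)  # primary header lives in LBA 1
        has_primary = dev.read(len(GPT_SIGNATURE)) == GPT_SIGNATURE
        dev.seek(size - sector_size)  # backup header lives in the last sector
        has_secondary = dev.read(len(GPT_SIGNATURE)) == GPT_SIGNATURE
    return has_primary, has_secondary

def at_risk(path, sector_size=SECTOR_SIZE):
    """True when only the backup GPT signature is present."""
    has_primary, has_secondary = gpt_signature_status(path, sector_size)
    return (not has_primary) and has_secondary

def demo():
    """Build an 8-sector image with only a backup signature and test it."""
    image = bytearray(8 * SECTOR_SIZE)
    image[7 * SECTOR_SIZE:7 * SECTOR_SIZE + len(GPT_SIGNATURE)] = GPT_SIGNATURE
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(image)
        name = handle.name
    try:
        return gpt_signature_status(name), at_risk(name)
    finally:
        os.remove(name)
```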

    Attachments

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.4 announcements

    ‏2014-09-10T12:54:08Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.30 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.30

    September 9, 2014

    * Fix two integer overflow problems in the GPFS block map allocation module which are caused by adding a larger disk into an existing file system. The problem can lead to lost blocks and data corruption.
    * Fixed a race condition that could lead to an assertion failure when mmpmon is used.
    * Improve handling and reporting of certain types of corrupted directory blocks.
    * Fix a memory leak in the GPFS daemon associated with Events Exporter, mmpmon, and SNMP support.
    * Fixed a memory overwrite problem caused by copying an uninitialized string in mmfs_dm_query_session().
    * Prevent mmimgbackup from accepting an --image option value that begins with an absolute path name such as /gpfs/. This option accepts only a file base name or prefix and is not meant for specifying the output directory.
    * This update addresses the following APARs: IV62542 IV62690 IV62691 IV63358.

  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2014-11-07T21:49:14Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.31 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.31

    November 7, 2014

    * Fixed a possible cause of deadlock when mmpmon or the GPFS SNMP subagent does node registration.
    * This fix will detect the failure and report that the mmchfs command has failed. The FS will retain its old name and still be usable.
    * Update restripe code to better handle both replicas being in the same failure group after the disk usage and failure group were changed via mmchdisk.
    * Fix code to prevent a GPFS daemon assert that could occur after automatic remount of file systems. The problem occurs only on Linux nodes when a user issues GPFS commands that access the file system before the automatic remount has completed.
    * Call LOGSHUTDOWN when the token manager cannot allocate a new BRTreeNode to avoid granting conflicting BR tokens.
    * Add additional checks to ensure the uniqueness of certain node attributes.
    * Fix a bug introduced in GPFS 3.4 PTF 30 and GPFS 3.5 PTF 20 where mmlsnsd -X does not display the persistent reserve information of the disk.
    * Correct the --single-instance option for mmapplypolicy runs against a directory.
    * Fix an i_count leak issue caused by NFS.
    * This update addresses the following APARs: IV63919 IV65455 IV65495 IV65496.

  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2015-03-13T21:45:33Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.32 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.32

    March 12, 2015

    * Allow disk addresses in inode 5 (Extended Attribute File) to be found by the mmfileid command.
    * Fix a problem where fsck hits signal 8 during inode validation.
    * Improve security of temporary and result files created by mmapplypolicy. Apply if secrecy of file metadata (pathnames, attributes and extended attributes) is a concern.
    * Protect fcntl kernel calls against non-privileged callers.
    * GPFS command hardening.
    * Enable dynamically switching from cipherList=EMPTY to cipherList=AUTHONLY without bringing down the entire cluster.
    * This update addresses the following APARs: IV68003 IV68006.

    Updated on 2015-03-13T22:15:02Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2015-03-16T12:52:48Z  in response to gpfs@us.ibm.com

    Security Bulletin: IBM General Parallel File System is affected by security vulnerabilities (CVE-2015-0197, CVE-2015-0198, CVE-2015-0199)

    View the complete Security Bulletin published on 2015-03-13 at http://www-01.ibm.com/support/docview.wss?uid=isg3T1022062

    Summary

    Security vulnerabilities have been identified in current levels of GPFS V4.1, V3.5, and V3.4:
    - could allow a local attacker who has only a non-privileged account to execute programs with root privileges (CVE-2015-0197)
    - may not properly authenticate network requests and could allow an attacker to execute programs remotely with root privileges (CVE-2015-0198)
    - allows attackers to cause kernel memory corruption by issuing specific ioctl calls to a character device provided by the mmfslinux kernel module and cause a denial of service (CVE-2015-0199)

    Vulnerability Details


    CVEID: CVE-2015-0197
    DESCRIPTION: IBM General Parallel File System could allow a local attacker who has only a non-privileged account to execute programs with root privileges.
    CVSS Base Score: 6.9
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/101224 for the current score
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:L/AC:M/Au:N/C:C/I:C/A:C)

    CVEID: CVE-2015-0198
    DESCRIPTION: IBM General Parallel File System may not properly authenticate network requests and could allow an attacker to execute programs remotely with root privileges.
    CVSS Base Score: 9.3
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/101225 for the current score
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:M/Au:N/C:C/I:C/A:C)

    CVEID: CVE-2015-0199
    DESCRIPTION: IBM General Parallel File System allows attackers to cause kernel memory corruption by issuing specific ioctl calls to a character device provided by the mmfslinux kernel module and cause a denial of service.
    CVSS Base Score: 4.7
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/101226 for the current score
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:L/AC:M/Au:N/C:N/I:N/A:C)
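
    As a cross-check, the three base scores above can be reproduced from their vectors with the public CVSS v2 base equations. The sketch below is only an illustration of that arithmetic; the metric weights come from the CVSS v2 specification, and the function name is hypothetical.

```python
# Metric weights as published in the CVSS v2 specification.
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
IMPACT_WEIGHT = {"N": 0.0, "P": 0.275, "C": 0.660}


def cvss2_base_score(vector):
    """Compute the CVSS v2 base score for a vector like '(AV:L/AC:M/Au:N/C:C/I:C/A:C)'."""
    metrics = dict(part.split(":") for part in vector.strip("()").split("/"))
    conf = IMPACT_WEIGHT[metrics["C"]]
    integ = IMPACT_WEIGHT[metrics["I"]]
    avail = IMPACT_WEIGHT[metrics["A"]]
    impact = 10.41 * (1 - (1 - conf) * (1 - integ) * (1 - avail))
    exploitability = (20 * ACCESS_VECTOR[metrics["AV"]]
                      * ACCESS_COMPLEXITY[metrics["AC"]]
                      * AUTHENTICATION[metrics["Au"]])
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


for cve, vec in [("CVE-2015-0197", "(AV:L/AC:M/Au:N/C:C/I:C/A:C)"),
                 ("CVE-2015-0198", "(AV:N/AC:M/Au:N/C:C/I:C/A:C)"),
                 ("CVE-2015-0199", "(AV:L/AC:M/Au:N/C:N/I:N/A:C)")]:
    print(cve, cvss2_base_score(vec))
```

    The computed values agree with the base scores listed in the bulletin (6.9, 9.3, and 4.7).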


    Affected Products and Versions

    GPFS V4.1.0.0 thru GPFS V4.1.0.6

    GPFS V3.5.0.0 thru GPFS V3.5.0.23

    GPFS V3.4.0.0 thru GPFS V3.4.0.31

    For CVE-2015-0198, you are not affected if either of the following is true:

        the cipherList configuration variable is set to AUTHONLY or to a cipher

    or

        only trusted nodes/processes/users can initiate connections to GPFS nodes
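
    The ranges and conditions above can be written down as a small lookup, which may help when auditing many nodes. This is a sketch only: the version ranges and fix levels are copied from this bulletin, while the function names are hypothetical and the treatment of an unset cipherList as equivalent to EMPTY is an assumption.

```python
# Release -> (last affected PTF number, fixing level), copied from this bulletin.
AFFECTED = {
    (4, 1, 0): (6, "4.1.0.7"),
    (3, 5, 0): (23, "3.5.0.24"),
    (3, 4, 0): (31, "3.4.0.32"),
}


def required_fix(version):
    """Return the fixing level for an affected 'V.R.M.F' version string.

    Returns None when the version is outside the ranges listed in this
    bulletin. None means "not listed here", not "known to be safe".
    """
    parts = tuple(int(p) for p in version.split("."))
    if len(parts) != 4:
        raise ValueError("expected a four-part version such as 3.4.0.31")
    entry = AFFECTED.get(parts[:3])
    if entry is None:
        return None
    last_affected, fix = entry
    return fix if parts[3] <= last_affected else None


def exposed_to_cve_2015_0198(cipher_list, only_trusted_can_connect):
    """Apply the two exemption conditions stated for CVE-2015-0198.

    Assumption: an unset cipherList (None or "") behaves like EMPTY.
    """
    authenticated = cipher_list not in (None, "", "EMPTY")
    return not (authenticated or only_trusted_can_connect)
```

    For example, required_fix("3.4.0.31") gives "3.4.0.32", and required_fix("3.4.0.32") gives None.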

    Remediation/Fixes

    Apply GPFS V4.1.0.7, GPFS V3.5.0.24, or GPFS V3.4.0.32, as appropriate for your level of GPFS, available from Fix Central at http://www-933.ibm.com/support/fixcentral/

    For CVE-2015-0198, after applying the appropriate PTF, set cipherList to AUTHONLY.

    To enable AUTHONLY without shutting down the daemon on all nodes:

        1. Install the PTF containing the fix on all nodes in the cluster, one node at a time.
        2. Generate SSL keys by running the mmauth genkey new command. This step is not needed if CCR is in effect (GPFS 4.1 only).
        3. Enable AUTHONLY by running the mmauth update . -l AUTHONLY command.

    If the mmauth update command fails, examine the messages, correct the problems (or shut down the daemon on the problem node) and repeat the mmauth update command above.

    Note: Applying the PTF for your level of GPFS (GPFS V4.1.0.7, GPFS V3.5.0.24, or GPFS V3.4.0.32) on all nodes in the cluster will allow you to switch cipherList dynamically without shutting down the GPFS daemons across the cluster. The mitigation step below will require all nodes in the cluster to be shut down.

    If there are any nodes running GPFS 3.4 on Windows, then switching the cipherList dynamically is only possible in one of the following two scenarios:

        The mmauth update command is initiated from one of the GPFS 3.4 Windows nodes

    or

        If the command is issued from another node in the cluster then GPFS must be down on all the GPFS 3.4 Windows nodes


    Workarounds and Mitigations


    For CVE-2015-0197 and CVE-2015-0199, there are no workarounds or mitigations.

    For CVE-2015-0198, set cipherList to AUTHONLY or to a real cipher. Follow the instructions above if the PTF was installed on all the nodes in the cluster. Otherwise:

        1. Generate SSL keys by running the mmauth genkey new command.
        2. Shut down the GPFS daemon on all nodes in the cluster.
        3. Enable AUTHONLY by running mmauth update . -l AUTHONLY
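
    The two ways of enabling AUTHONLY described in this bulletin differ only in whether the daemons must be shut down first. The sketch below lays out that decision as an ordered step list. It is an illustration, not an IBM tool: the function name is hypothetical, the two mmauth command strings are quoted from this bulletin, and the shutdown step is left as a description because the bulletin does not name a command for it. It assumes the PTF installation itself has already been done where applicable.

```python
def authonly_plan(ptf_on_all_nodes, ccr_in_effect=False):
    """Return the ordered steps for setting cipherList to AUTHONLY.

    ptf_on_all_nodes: True if the fixing PTF is installed on every node,
                      which permits the dynamic (no shutdown) path.
    ccr_in_effect:    True only on GPFS 4.1 clusters using CCR, where the
                      bulletin says key generation is not needed.
    """
    steps = []
    if ptf_on_all_nodes:
        # Dynamic path: daemons stay up.
        if not ccr_in_effect:
            steps.append("mmauth genkey new")
        steps.append("mmauth update . -l AUTHONLY")
    else:
        # Mitigation path: requires a cluster-wide daemon shutdown.
        steps.append("mmauth genkey new")
        steps.append("# shut down the GPFS daemon on all nodes in the cluster")
        steps.append("mmauth update . -l AUTHONLY")
    return steps


for step in authonly_plan(ptf_on_all_nodes=False):
    print(step)
```

    The sketch does not model the GPFS 3.4 Windows restriction noted above, which still applies to the dynamic path.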

     

    Get Notified about Future Security Bulletins

    Subscribe to My Notifications to be notified of important product support alerts like this.

    Acknowledgement

    The vulnerabilities were reported to IBM by Florian Grunow and Felix Wilhelm of ERNW.

    Updated on 2015-03-19T12:11:16Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com

    Re: GPFS V3.4 announcements

    ‏2015-05-14T02:27:58Z  in response to gpfs@us.ibm.com

    GPFS 3.4.0.33 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.4.0.33

    May 13, 2015

    * Fix a problem with block allocation code where an E_NOSPC error could be incorrectly returned after running out of disk space in one failure group. This problem only affects file systems with data replication.
    * Enforce the same DA name for the old pdisk and the corresponding new one when replacing a pdisk with mmaddpdisk --replace.
    * Fix a signal 11 problem in a multi-cluster environment that occurs when the GPFS daemon relays an fsync request through the metanode but the OpenFile is stolen on the metanode in the middle of the operation.
    * Potentially avoid crash on normal OS shutdown of CNFS nodes.
    * Fix poor command performance on clusters that have no security key.
    * Fix a problem where mmauth inadvertently changes cipherList to an invalid string.
    * This update addresses the following APARs: IV71609 IV71984 IV72015 IV72036 IV72702.