IBM Support

Readme and Release notes for release 3.3.0.10 General Parallel File System (GPFS) 3.3.0.10 GPFS-3.3.0.10-x86-Linux Readme

Fix Readme



Content

Readme file for: GPFS Readme header
Product/Component Release: 3.3.0.10
Update Name: GPFS-3.3.0.10-x86-Linux
Fix ID: GPFS-3.3.0.10-x86-Linux
Publication Date: 16 September 2011
Last modified date: 16 September 2011

Installation information

Download location

Below is a list of components, platforms, and file names that apply to this Readme file.

Fix Download for Linux

Product/Component Name: General Parallel File System (GPFS)
Platform: Linux 32-bit,x86 RHEL; Linux 32-bit,x86 SLES
Fix: GPFS-3.3.0.10-x86-Linux

Prerequisites and co-requisites

None

Known issues

  • - Problem discovered in earlier GPFS releases

    During internal testing, a rare but potentially serious problem has been discovered in GPFS. Under certain conditions, a read from a cached block in the GPFS pagepool may return incorrect data which is not detected by GPFS. The issue is corrected in GPFS 3.3.0.5 (APAR IZ70396) and GPFS 3.2.1.19 (APAR IZ72671). All prior versions of GPFS are affected.

    The issue was discovered during internal testing, where an MPI-IO application was used to generate a synthetic workload. IBM is not aware of any occurrences of this issue in customer environments or under any other circumstances. Because the issue is specific to accessing cached data, it does not affect applications using DirectIO (the IO mechanism that bypasses the file system cache, used primarily by databases such as DB2® or Oracle).

    This issue is limited to the following conditions:

    1. The workload consists of a mixture of writes and reads, to file offsets that do not fall on the GPFS file system block boundaries;
    2. The IO pattern is a mixture of sequential and random accesses to the same set of blocks, with the random accesses occurring on offsets not aligned on the file system block boundaries; and
    3. The active set of data blocks is small enough to fit entirely in the GPFS pagepool.

    The issue is caused by a race between an application IO thread doing a read from a partially filled block (such a block may be created by an earlier write to an odd offset within the block), and a GPFS prefetch thread trying to convert the same block into a fully filled one, by reading in the missing data, in anticipation of a future full-block read. Due to insufficient synchronization between the two threads, the application reader thread may read data that had been partially overwritten with the content found at a different offset within the same block. The issue is transient in nature: the next read from the same location will return correct data. The issue is limited to a single node; other nodes reading from the same file would be unaffected.
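
    The alignment condition in items 1 and 2 can be illustrated with a small shell sketch. It is not part of GPFS: the function names and the 256 KiB block size are assumptions chosen for illustration only (the real block size is a property of each file system).

```shell
# Hypothetical helper, not a GPFS command: succeeds (exit status 0) when the
# byte offset falls exactly on a file system block boundary.
# $1 = byte offset, $2 = file system block size in bytes
is_block_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# Assumed block size for illustration only: 256 KiB.
BLOCK_SIZE=262144

# Classify an offset as aligned or unaligned with respect to BLOCK_SIZE.
describe_offset() {
    if is_block_aligned "$1" "$BLOCK_SIZE"; then
        echo "offset $1: aligned"
    else
        echo "offset $1: unaligned"
    fi
}
```

    With this assumed block size, describe_offset 524288 reports an aligned offset and describe_offset 4097 reports an unaligned one. Only unaligned accesses match the conditions listed above.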


Korn Shell for SLES 10

The GPFS required level of Korn Shell for SLES 10 support is version ksh-93r-12.16 and can be obtained at the following architecture-specific link.

Installation information

  • - Installing a GPFS update for System x

    Complete the following steps to install the fix package:

    1. Unzip and extract the update package (<filename>.tar.gz file) with one of the following commands:

      gzip -d -c <filename>.tar.gz | tar -xvf -

      or

      tar -xzvf <filename>.tar.gz


    2. Verify the update's RPM images in the directory. Normally, the list of RPM images in this directory would be similar to one of the following:

      GPFS update
      gpfs.base-<update_version>.<arch>.update.rpm
      gpfs.docs-<update_version>.noarch.rpm
      gpfs.gpl-<update_version>.noarch.rpm
      gpfs.msg.en_US-<update_version>.noarch.rpm


      GPFS update with GPL licensed kernel module
      gpfs.base-<update_version>.<arch>.update.rpm
      gpfs.docs-<update_version>.noarch.rpm
      gpfs.gpl-<update_version>.noarch.gpl.rpm
      gpfs.msg.en_US-<update_version>.noarch.rpm


      where
      <update_version> specifies the version number of the update you downloaded, for example, 3.3.0-7.

      <arch> specifies the system architecture, for example i386 for 32-bit System x or x86_64 for 64-bit System x.

      For specific filenames, check the Readme for the GPFS update by clicking the "View" link for the update on the Download tab.

    3. Follow the installation and migration instructions in your GPFS Concepts, Planning and Installation Guide.
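
    The check in step 2 can be sketched as a small shell helper. This is an illustration only, not an IBM-supplied tool: the function names are hypothetical, and the file names follow the hyphenated pattern shown in the package contents listed under "Package information" (plain GPFS update, without the GPL licensed kernel module variant).

```shell
# Hypothetical helper: print the RPM file names expected for a plain GPFS
# update.
# $1 = update version (for example 3.3.0-10)
# $2 = architecture (for example i386)
expected_rpms() {
    printf '%s\n' \
        "gpfs.base-$1.$2.update.rpm" \
        "gpfs.docs-$1.noarch.rpm" \
        "gpfs.gpl-$1.noarch.rpm" \
        "gpfs.msg.en_US-$1.noarch.rpm"
}

# Print every expected RPM that is missing from directory $3 and return
# non-zero if at least one is missing.
check_rpms() {
    missing=0
    for f in $(expected_rpms "$1" "$2"); do
        if [ ! -f "$3/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

    For example, check_rpms 3.3.0-10 i386 . checks the current directory after the tar file has been extracted.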
  • - Upgrading GPFS nodes

    The node-by-node upgrade described below cannot be used to migrate from GPFS 2.3 to later releases. For example, upgrading from 2.3.x to 3.1.y requires a complete cluster shutdown, an upgrade install on all nodes, and then a cluster startup.

    GPFS can be upgraded either one node at a time or on all nodes in the cluster at once. When upgrading GPFS one node at a time, perform the steps below on each node in the cluster in sequence. When upgrading the entire cluster at once, GPFS must be shut down on all nodes in the cluster before upgrading.

    When upgrading nodes one at a time, you may need to plan the order in which nodes are upgraded. Verify that stopping a given node does not cause quorum to be lost and that the node is not the last available NSD server for some disks. Upgrade the quorum and manager nodes first. When upgrading the quorum nodes, upgrade the cluster manager last to avoid unnecessary cluster failover and election of new cluster managers.

    1. Prior to upgrading GPFS on a node, all applications that depend on GPFS (e.g. Oracle) must be stopped. Any GPFS file systems that are NFS exported must be unexported prior to unmounting GPFS file systems. If tracing was turned on, then tracing must be turned off before shutting down GPFS as well.
    2. Stop GPFS on the node. Verify that the GPFS daemon has terminated and that the kernel extensions have been unloaded (mmfsenv -u). If the command mmfsenv -u reports that it cannot unload the kernel extensions because they are "busy", then the install can proceed, but the node must be rebooted after the install. "Busy" means that some process has its current directory in a GPFS file system directory or has an open file descriptor. The freeware program lsof can identify the process, and the process can then be killed. Retry mmfsenv -u; if that succeeds, a reboot of the node can be avoided.
    3. Upgrade GPFS using the RPM command as follows:

      GPFS update
      rpm -U gpfs.base-<update_version>.<arch>.update.rpm
      rpm -U gpfs.docs-<update_version>.noarch.rpm
      rpm -U gpfs.gpl-<update_version>.noarch.rpm
      rpm -U gpfs.msg.en_US-<update_version>.noarch.rpm


      GPFS update with GPL licensed kernel module
      rpm -U gpfs.base-<update_version>.<arch>.update.rpm
      rpm -U gpfs.docs-<update_version>.noarch.rpm
      rpm -U gpfs.gpl-<update_version>.noarch.gpl.rpm
      rpm -U gpfs.msg.en_US-<update_version>.noarch.rpm


    4. Check the GPFS FAQ to see if any additional images or patches are required for your Linux installation: General Parallel File System FAQs (GPFS FAQs)
       
    5. Recompile any GPFS portability layer modules you may have previously compiled. The recompilation and installation procedure is outlined in the following file:
      /usr/lpp/mmfs/src/README
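
    Step 2 mentions using lsof to find the processes that keep the kernel extensions busy. The following is a minimal sketch under stated assumptions: /gpfs/fs1 is a hypothetical mount point, the helper name pids_from_lsof is not part of GPFS or lsof, and the default lsof output layout (header line first, process ID in the second column) is assumed.

```shell
# Hypothetical helper: read default lsof output on stdin and print the
# unique process IDs it contains, one per line, in ascending order.
pids_from_lsof() {
    # Skip the header line, keep the PID column, and drop duplicates.
    awk 'NR > 1 { print $2 }' | sort -n -u
}

# Example (assumed mount point, run as root):
#   lsof /gpfs/fs1 | pids_from_lsof
```

    Review the listed processes before stopping them, then retry mmfsenv -u as described in step 2.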

Additional information

  • - Notices

    [June 9, 2010]

    A build error caused an issue with the DMAPI function in the GPFS 3.2.1-20 package that was released on May 27, 2010. The corresponding packages have now been replaced on the service download site.

    If you installed the May 27 GPFS 3.2.1-20 package and mounted a DMAPI-enabled file system while running GPFS 3.2.1-20 (for example, -z yes by means of the HSM features of TSM), please contact IBM Support. The replacement 3.2.1-20 package works as designed, but does not fix a file system that was mounted with the problematic 3.2.1-20 package.

    Verify that you have the correct package installed by running the rpm -qi gpfs.base command. Make sure that the Build Date is Mon 07 Jun 2010.

    [June 2, 2010]

    A build error caused an issue with the DMAPI function in the GPFS 3.3.0-6 package that was released on May 22, 2010. The corresponding packages have now been replaced on the service download site.

    If you installed the May 22 GPFS 3.3.0-6 package and mounted a DMAPI-enabled file system while running GPFS 3.3.0-6 (for example, -z yes by means of the HSM features of TSM), please contact IBM Support. The replacement 3.3.0-6 package works as designed, but does not fix a file system that was mounted with the problematic 3.3.0-6 package.

    Verify that you have the correct package installed by running the rpm -qi gpfs.base command. Make sure that the Build Date is Thu 27 May 2010.
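
    The Build Date check described in the two notices above can be sketched in shell. The helper names are hypothetical, and the sketch assumes that rpm -qi prints a "Build Date" label followed by a colon and that the date is shown in the same English format as in the notices (the output depends on the rpm version and locale).

```shell
# Hypothetical helper: read "rpm -qi" output on stdin and print the text
# that follows the "Build Date" label.
build_date() {
    sed -n 's/.*Build Date *: *//p'
}

# Succeed when the build date read from stdin starts with the expected string.
# $1 = expected date prefix, for example "Mon 07 Jun 2010"
build_date_matches() {
    case "$(build_date)" in
        "$1"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example:
#   rpm -qi gpfs.base | build_date_matches "Mon 07 Jun 2010" && echo "package OK"
```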

    [April 1, 2010]

    During internal testing, a rare but potentially serious problem has been discovered in GPFS. Under certain conditions, a read from a cached block in the GPFS pagepool may return incorrect data which is not detected by GPFS. The issue is corrected in GPFS 3.3.0.5 (APAR IZ70396) and GPFS 3.2.1.19 (APAR IZ72671). All prior versions of GPFS are affected.

    See the "Known issues" section of this Readme for details.

    [March 31, 2010]

    Support for SLES 10 kernel beyond 2.6.16.60-0.58.1 has changed. GPFS 3.3 requires GPFS 3.3.0-5 and GPFS 3.2 requires 3.2.1-18.

    [December 17, 2009]

    Support for GPFS 3.1 has only been extended for AIX and Linux on POWER systems. Service updates will be made available for other Linux platforms, but support is not being extended.

    [November 9, 2009]

    GPFS 3.3.0-1 does not correctly operate with file systems created with GPFS V2.2 (or older). Such file systems can be identified by running "mmlsfs all -u": if "no" is shown for any file system, this file system uses the old format, and the use of GPFS 3.3.0-1 is not possible. GPFS 3.3.0-2 corrects this issue.

    [November 7, 2008]

    GPFS 3.2.1.7 contained a change that impacts the TSM HSM recall process for files with stub size > 0, causing hangs during recalls. To avoid this problem, set the configuration parameter dmapiDataEventRetry to 'no' with the command 'mmchconfig dmapiDataEventRetry=no -i'.

    [September 11, 2008]

    The 3.2.1-5 maintenance level had a data integrity problem using the mmap feature to write or update files on Linux and AIX. The 3.2.1-6 maintenance level is the recommended upgrade path from versions 3.2.0-0 through 3.2.1-4.

  • - Package information

    The update images listed below and contained in the tar image with this README are maintenance packages for GPFS. The update images are normal RPM images that can be applied directly to your system.

    The update images require a prior level of GPFS. Thus, the usefulness of this update is limited to installations that already have the GPFS product. Contact your IBM representative if you desire to purchase a fully installable product that does not require a prior level of GPFS.

    After all RPMs are installed, you have successfully updated your GPFS product.

    Update to Version:

    3.3.0-10

    Update from Version:

    3.3.0-0 through 3.3.0-9

    Update (tar file) contents:

    README
    changelog
    gpfs.base-3.3.0-10.i386.update.rpm
    gpfs.docs-3.3.0-10.noarch.rpm
    gpfs.gpl-3.3.0-10.noarch.rpm
    gpfs.msg.en_US-3.3.0-10.noarch.rpm
    gpfs.gui-3.3.0-10.i386.rpm
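
    The "Update from Version" range above can be expressed as a simple shell check. This is a sketch only: the function name is hypothetical, and it assumes the installed level is given in the same V.R.M-F form used in this Readme.

```shell
# Hypothetical helper: succeed when the installed level is one that this
# update can be applied to (3.3.0-0 through 3.3.0-9).
# $1 = installed GPFS level, for example 3.3.0-7
can_apply_update() {
    case "$1" in
        3.3.0-[0-9]) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example, assuming the level is read from the installed gpfs.base package:
#   can_apply_update "$(rpm -q --qf '%{VERSION}-%{RELEASE}' gpfs.base)"
```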

  • - Changelog for GPFS 3.3.x

    Unless specifically noted otherwise, this history of problems fixed for GPFS 3.3.x applies for all supported platforms.

    Problems fixed in GPFS 3.3.0.10 [October 28, 2010]

    • Fix a potential metadata allocation problem where wrong disk may be selected.
    • Fix GPFS automount so that it reads config value in /etc/sysconfig/autofs.
    • Fix "mmquotaon/mmquotaoff/mmdefquotaon/mmdefquotaoff" which provides misleading error msgs when another mounted node is disconnecting.
    • Alarm only if 3 continuous loops all detect the lease expires thread when system time is changed.
    • Fix a potential problem in create snapshot routine so that when create snapshot fails the new snap id won't get set into any of the snapshot files.
    • Fix an erroneous assert check in fsck cleanup path. Ensures that the assert is only checked under normal conditions and not during the cleanup code path, as the relevant data structures would have already been cleaned up or be in the process of being cleaned up.
    • mmapplypolicy internal error finding one of its internal sort files.
    • If node cannot do cNFS recovery for a failed node, then kill process so another node can do the takeover for both nodes.
    • Prevent logAssertFailed assert which can happen under a rare race condition. This occurred when the SG manager node was trying to resign while acl garbage collector thread was being started.
    • Update to displayed error message and the return code obtained when mmrestoreconfig fails while enabling/disabling default quotas.
    • Fix assert related to RCTX.REPLIED and TSCOMM.C that occurs on the FS manager node if the FS manager is running GPFS release 3.2, and a release 3.3 client tries to mount the filesystem.
    • Linux IO: check mm_struct before pinning pages.
    • Fix retest_path error checking.
    • Improve performance of stat operations on Linux under certain multi-node access patterns.
    • Fixed FSErrValidate error in ACL garbage collection. ACL garbage collection was running at the same time as an inode expansion and was attempting to process new (uninitialized) blocks at the end of the inode0 file.
    • Prevent a rare deadlock between mmcheckquota and FS manager recovery.
    • Corrected assert in mmgetacl/tsgetacl for default ACL on a directory in a remote fs.
    • Forcefully evict unused inodes that have been invalidated to keep the number of unused inodes from growing too large.
    • Fixed mmapplypolicy with SNAPID on AIX getting an SQL error.
    • Improve performance of applications using directIO or, if alignment test fails, go to the regular request path and skip trying to do direct IO.
    • Ensure mmstartup commands are properly serialized thereby avoiding interspersed messages in the mmfs.log file.
    • Fix asserts in fsck while trying to fix corrupt directories.
    • Fix hang between node join thread and events exporter request handler thread.
    • tsapolicy: reduce pathnames (e.g. xxx/. will now be xxx).
    • Fix deadlock when preMount callback invokes mm commands.
    • Fix quote error in mmapplypolicy macro processing.
    • Improve GPFS mmstartup time & other GPFS commands in adminMode=allToAll cluster.
    • Fix buffer length calculation for dmapi user event returned by dm_get_events call.
    • Fix dm_handle_to_path so that it can look up the directory name by its own handle.
    • Fix problem where a remote cluster does not always pick a local NSD server when readReplicaPolicy=local is set.
    • This update addresses the following APARs: IZ84015 IZ84040 IZ84160 IZ85218 IZ85446 IZ86146 IZ86153 IZ86164.

    Problems fixed in GPFS 3.3.0.9 [August 16, 2010]

    • Add T (for terabytes) and P (for petabytes) as suffix to mmedquota/mmdefedquota.
    • Fix an ENOMEM error in the Token Manager memory when multiple remote clusters are working with the same files.
    • Fixed race between tschpolicy thread and deferred deletion thread. This was causing an inconsistent inode state of that policy file being created both in disk inode bitmap and in-memory bitmap.
    • Fixed race between endUse thread and sgmMsgSGTakeoverQuery msg handler.
    • Removed erroneous assert encountered during delete snapshot.
    • Fix a deadlock caused by buffer steal during quota update.
    • Fixes an assert in the communication layer caused by an improper return code being sent back by the block map check message handler while fsck is in progress.
    • Fixes lock_vfs_f/releaseSlow asserts because it doesn't hold the mutex after DAEMON_DEATH.
    • Fix assert in dm_set_disp() path, return error if ccmgr changed in the middle of processing new disposition.
    • Fixes gpfsWrite vinfoUnlock being called when lock was not held and causing exception.
    • Fixes segfaults in various ThreadThing methods.
    • Fixed race between mount and garbage collector thread.
    • Fix problem in kSFSGetAttr call to handle compact file case.
    • Robustness improvements to better detect errors from TSM. Better tracking of return codes from mmapplypolicy. Periodic time-stamped output of "Backing up files..." during long-running TSM jobs. Better error diagnostics when remote TSM jobs fail. Corrected file tally for total files backed when remote TSM jobs executed. Cleaner verbose output when debug is not enabled.
    • Fix hang when executable run out of GPFS and mmapRangeLock=no.
    • Handle recovery when devices return E_NODEV. This fix applies to GPFS 3.2 and higher.
    • Fix problem in mmimgrestore when restoring immutable files.
    • Fixed code that allowed a regular file read to be performed on a directory, which led to an EIO error later. This happens on Linux only.
    • Fix code to always return EISDIR when regular file read is called on a directory.
    • When a migrated file is being deleted, generate dmapi READ event only when copy to snapshot is needed.
    • Fix problem where mmunlinkfileset would sometimes succeed even if there is a process that still has a file in the fileset open on one of the nodes.
    • Fix a deadlock during file system panic processing while ACL garbage collector is running.
    • Fix problem of reading snapshot file after file is migrated and recalled back.
    • Fix get_next_inode to retrieve specified inode information.
    • Improved error message for XATTR policy statement.
    • Fixes gpfsInodeCache slab (and cpu) usage high due to NFS anon dentry allocations.
    • Reduced time required while creating snapshot.
    • Add support for RDMA connections to nsdperf sample program.
    • Adding a trace message and return code for mmnfsrecovernode.
    • Add new path for trace commands access.
    • Fixes tsapolicy command on AIX. Does not produce "error: [X] Error".
    • Fix a problem when adding first disk to a new storage pool while file system is in sync process.
    • Fix a deadlock occurring in a mix of very heavy DIO workload and mmap on Linux.
    • Removed bad DBGASSERT(hasVinfoLock) and add additional maintenance for the local hasVinfoLock flag. Specifically, after kSFSWriteFast had released the lock when returning E_CDITTO_LOCK.
    • Correct a startup problem when migrating from GPFS 2.3.
    • FSCK checking log file inodes even if they have log group number set to -1.
    • Fixed a bug in Windows support that affected systems accessing GPFS through Windows file sharing (CIFS). In some scenarios, directory access could become very slow and could possibly return incomplete data.
    • Modified the Windows implementation to periodically flush unused executable files from memory.
    • Assert working with elements on the kxRecLockAcquires queue (needs to hold mutex).
    • Fix rare occurrence of file fragment expansion happening during file sync that can cause the assert failure.
    • Fixes asserts in fsck while trying to fix corrupt directories.
    • Fix assertion caused when deleting snapshots with very large files.
    • mmtracectl when running on Windows has been enhanced so that it utilizes the traceFileSize and traceBufferSizeForAIX configuration parameters.
    • Prevent a very narrow race condition during directory lookup.
    • This update addresses the following APARs: IZ80053 IZ80737 IZ80741 IZ80744 IZ80973 IZ81229 IZ81232 IZ82941 IZ83044 IZ83711 IZ83797 IZ83795 IZ84007 IZ84063.

    Problems fixed in GPFS 3.3.0.8 [July 27, 2010]

    • Image restore GXR execution inside policy now allows recognition of failures of remote jobs by incrementing f_errs on failure. This should cause tsapolicy to exit non-zero if a failure occurs during image restore.
    • Add error checking for ImagePath parameter and invoke usage message output if incorrect value provided.
    • Capture remote job data that was previously ignored. Detect failures and tally success counts from remote jobs. Display overall job summary at end of run.
    • Choose a bitmap size based on gpfs_statfs64() call to see how many inodes are actually in use.
    • Fix a rare assertion during quota file append. During AppendBlockOfRecords a wa lock is needed to append the quota file.
    • Correct a problem with mmlsfileset when Windows is the file system manager: the path name for the junction of the root fileset is missing in the tslsfileset output and the path names for the rest of the junction names are missing the part that represents the mount point.
    • Fix the AIX mmapplypolicy error "Missing or improper nodelist file", which was actually caused by a fork()ed process not terminating correctly and resulted in the bogus message "improper nodelist file...".
    • Catch exit codes from critical commands such as sort. Look at returned codes from close calls from pipelined commands. Keep final two lines of output from pipelined commands and display if close returns non-zero.
    • Fixed a rare problem in reading a file from a snapshot that resulted in the data for the last portion of the file being replaced with zeros. Problem occurred only when a node reads the file through a snapshot, then another node appends a small amount of data to the file in the active file system and creates a new snapshot, followed by the original node immediately reading the same file in the new snapshot.
    • Fix the file overwrite codepath on Windows and disallow any operations on symlink objects.
    • Fix mmsdrbackup user exit on Windows.
    • Catch and report errors during Pagepool size reduction on Windows.
    • Ensure that fsck handles orphans from deleted fileset appropriately and deletes them rather than letting them stay unfixed in the filesystem forever.
    • Always stop mmnfsmonitor after GPFS shutdown regardless of CNFS status.
    • When trace buffer size given to lxtrace daemon exceeds the lower or upper limit, it should use the minimum or maximum buffer size quietly instead of printing usage message. It should also print a message about what buffer size will be used.
    • Add missing initialization for gpfs32Version variable.
    • Fix bug in mmrestoreconfig on filesystem with filesets.
    • Pass a flag to tell underlying function (flushFile) if flushflag is already held or not.
    • Ensure tsapolicy command has correct exit code. Only affects use of mm image backup.
    • Do not delete *~ files when doing "make clean" in gpl-linux directory since the build process does not create them.
    • Fix assert caused by accessing a deleted inodemap.
    • Avoid rare daemon crash during heavy create load with low memory.
    • Set thread context to global operation context for every snapshot command.
    • putacl/getacl deadlocked on aclFile buffer lock.
    • When doing a trace cycle on linux nodes with RHEL5 and SLES10 or above, generate internaldump after trace cycle.
    • This update addresses the following APARs: IZ79664 IZ79674 IZ79675.

    Problems fixed in GPFS 3.3.0.7 [June 24, 2010]

    • Created new error message and exit function. Utilized in all error exit paths that previously had no failure message.
    • On failover/failback, (gratuitous) ARP requests from the node registering a new CNFS IP address are rejected by some switches (that have STP enabled and portfast disabled) for a short period. Subsequently the IP address may not be reachable from outside the subnet. Make sure the port is enabled (and outbound requests are accepted) by first ARPing the gateway (if one is configured) with a deadline of 30 seconds.
    • Serialize xattr registry initialization process.
    • Fix mmbackup to better handle file names with spaces and certain other metachars. Conditionally, replace use of awk with new mmcmi parsebackuprecord function when available. Mmcmi will decorate file paths with double quotes and write to expired and changed files for TSM. Update parsing of PDRs to strictly use file path length value. Allow tsbackup33 to notice if policy run failed and indicate error.
    • Fixed deadlock due to sg takeover failure.
    • Avoid corrupted snapshot files in unusual case of open unlinked files.
    • Avoid deadlock on Linux with small maxFilesToCache due to very frequent file creates and deletes.
    • Always shutdown GPFS when nfsmonitor detects unrecoverable problems such as statd is inactive.
    • Generate DMAPI read event when file is deleted and when copy to snapshot is needed.
    • Handle recovery for devices that return E_NODEV on connectivity loss instead of E_IO on AIX. VIO is an example of this.
    • Prevention of buffers being stolen from inodes that are low-level locked. The current fix ensures only dirty buffers are not stolen.
    • Search /usr/sbin for sm-notify under SLES11 after IP failover.
    • Prevent repeated "No space left on device" filesystem manager failure when snapshot copy-on-write gets triggered while the filesystem is running out of disk space.
    • Replace use of diff in mmbackup with new mmcmi mergeshadow function. If mmcmi version too old, fall back to diff command.
    • Ensure online fsck does a proper job of cleaning up stale, or failed, allocation message queues. Fixes resulting online fsck assert after finding stale AllocMsgQ.
    • Add (undocumented) --notsm/--tsm switch to permit skipping archiving of file contents for debugging/testing only. In tsbackup33 the switch is passed in as -k.
    • If the file system contains unbalanced big files, there is a small chance of file corruption after mmdeldisk is run. Fixed by adjusting PIT code.
    • Fixed problem in dm_set_dmattr() function so that attributes with the same first several bytes are set correctly.
    • synched disk address in a hyper-allocated file now matches the indirect block allocation address when allocation fails.
    • Fsck now prints verbose information about the range of regions and the storage pool it scans for each pass.
    • Online fsck handles cached bad disk addresses without causing any SIGFPE.
    • Hook page fault handler when accessing user data. Fixes fatal page fault at kxWaitCondvar+0xf8.
    • Corrected mmapplypolicy failing with "too many files open".
    • Reinitializes ea limit before needing to adjust it after remount of the file system.
    • Quota check operation prints appropriate error messages when conflicting programs are running.
    • Cleanup allocation message queues properly during a failed fsck operation as a result of stripe group panic. Subsequent online fsck will not assert checking for NULL allocation message queues during initialization.
    • Adds nfsd CAP_DAC_READ_SEARCH and CAP_DAC_OVERRIDE capabilities instead of setting fsuid/fsgid, thereby stopping permission-denied errors when nfsd rebuilds dentry trees on kernels 2.6.27 (or later).
    • Use mmapplypolicy ... -g -N ... to preempt disk space issues.
    • `kill -SIGINT ...` has been supported in all previous code releases. This update brings tsapolicy into compliance with the de facto standard for handling SIGTERM.
    • Fixes assert when a Windows node has to create lost+found during mmfsck.
    • Improve takeover time when using tiebreaker disks in certain cases.
    • Fix Signal 11 at QuotaMgr::Phase2OnlineQuotacheck caused by nDests not being initialized.
    • Disallow immutable flag to be changed on snapshot files.
    • Turn on the CXIUP_NOWAIT flag when we know that it is safe to use igrab().
    • Correct PIT RPC communication.
    • Fix of internal dmapi attr name comparison routine so that it can compare the string with its true length.
    • Create directory call now always passes in a valid name.
    • fsck code now does not assert trying to look into invalid disk addresses as a result of race with flush buffer operation.
    • Retry deadlocks on rlMutex when called from RecLockReset to cleanup advisory locks.
    • mmapplypolicy is scripted and examines the final command exit code ($?), distinguishing skips from errors.
    • Quick response to interruption of command mmlssnapshot.
    • mergeshadow function to emulate diff better thereby correctly specifying modified files needing to be backed up.
    • Extract interface name for networking configuration file on SLES11.
    • Fix assert(ofP->inodeLk.get_lock_state()).
    • Fix a problem in removing empty quota entries by online quota check.
    • CreateReservedFiles checks the number of blocks to be written to inode file before starting threads.
    • Change the way the running command lock is obtained.
    • Fix GPL build problem on BG for IIA. BG has the same kernel version as SLES10SP1, but it does not ship the new definition file of relayfs (relay.h) in the kernel source. Loosen the version check to use the old definition file (relayfs_fs.h) for this kernel version (2.6.16.46). This is also OK for SLES10SP1 since it ships both old and new definition files.
    • Solved rare race condition which may lead to a kernel crash for 64bit AIX boxes when uninstalling GPFS directly when it is still running.
    • Fix variable initialization that could cause "mmcheckquota -a" to terminate.
    • Check socket connection between command client node and fsmgr node.
    • Fix a long running/hang mmrestoreconfig command when running in a CWD which contains many files/subdirs.
    • Use new memory to pass parameter to new thread for memory reuse.
    • Disallow replication factor change on snapshots.
    • Avoid a crash in mmdeldisk for certain filesystem blocksizes and snapshots present.
    • Rename GPFS device names and remove any external reference to the string 'StripeGroups'.
    • Ensures to "reap" process forked.
    • Policy aputil filename handling.
    • Fixed the code to handle E_DAEMON_DEATH situation in gpfsWrite.
    • Detect missing definition for mmlsfileset utility and define it if needed. Earlier version of globfuncs do not have mmlsfileset utility defined.
    • Check for out of memory condition in token revoke handler.
    • Define I_LOCK if it is not already defined in Linux.
    • diff-replacement code in mmcmi not handling the difference between 3.2 style and 3.3 style shadow file lines smoothly. It calls out the file in the snapshot as needing expiration. Use the original diff code in case of 3.2 style file system backup on 3.3 code.
    • Cleanup allocation message queues properly during a failed offline fsck operation as a result of stripe group panic. Subsequent offline fsck will not assert checking for NULL allocation message queues during initialization.
    • Fix dmapi attribute name comparison routine to take the short length into account.
    • fsck does not assert trying to look into invalid disk addresses as a result of race with flush buffer operation.
    • Increase lower limit of tracedevbuffersize from 4k to 1m.
    • On Windows, some temporary files starting with the name /var/mmfs/tmp/popen.* may not be removed if GPFS shuts down abnormally. These temporary files are not cleaned up the next time GPFS starts. nsdperf source and README files added to the Windows installation package.
    • On Windows, the sample program nsdperf is shipped as an executable (nsdperf.exe). The README files and source code for this sample are now also included on Windows installations.
    • Update kernel code licensing info to reflect Dual BSD/GPL license.
    • Fix failure due to expel command being run during disk election in tiebreakerdisk cluster.
    • This update addresses the following APARs: IZ70721 IZ75258 IZ76614 IZ76615 IZ76798 IZ76810 IZ76834 IZ76837 IZ76939 IZ75549.

    Problems fixed in GPFS 3.3.0.6 [June 02, 2010]

    • Use the disk availability information from the daemon for the mmlsdisk -m/-M options.
    • Reject the mmmount request if the drive letter is in use.
    • Fixed a sig#11 problem during sg disk table update.
    • If there is mount failure to GPFS file system and you can only find "No child processes" message in mmfslog, apply this fix and you will see the real reason for the mount failure. This problem only affects Linux.
    • Add a method to the indirect block iterator to ignore the last change count so as to step to the next block when the current one is deleted. Use the new method instead of bumping down the global list change count.
    • Fix for a directory with FGDL enabled: when the mnode token is being revoked but not the inode token, and a thread holds the openfile, the CTF_FINE_GRAIN_DIR_MNODE flag was not reset, which could trigger an assertion the next time the node tries to become metanode.
    • Improve performance of inode allocation when running low on or out of free inodes.
    • Fix for a rare race condition on Windows which may result in conflicting auto-generated SID mappings.
    • Set Shared.processP.pid to -3 when mmfsd is killed by SIGKILL (kill -9) on AIX 64-bit. Otherwise, client commands such as tsctl or mmfsadm still think the daemon is alive and wait for the 5-minute timeout before exiting.
    • Fix rare race condition between lease thread and healthcheck thread, resulting in false lease thread stuck condition.
    • Fix race condition that could occur when an active NSD server is also running a workload that uses the NSDs served by that server, and GPFS is being shut down on that node.
    • Corrected a problem where concurrent threads running mmunlinkfileset and asynchronous recovery's SFSDoDeferredDeletions caused a file's open instance count to go negative.
    • Initialize dirLockNeed variable to avoid unknown id error during trace formatting.
    • Ensure that false compare mismatch errors are not reported and a relevant assert is not triggered when a compare operation is done on an inode with a bad file size.
    • Fix so that a failure to read an inode 0 file for any of the snapshots aborts fsck operation.
    • Change return code from E_PERM to E_INVAL for mmchfs.
    • When a panic causes an EAGAIN to be turned into ESTALE, a call to cxiFcntlUnblock must be made to clean up the fl_block list. This fixes the locks_free_lock BUG(fl_block) hit on an ESTALE return from an fcntl lock.
    • Fix mmrestoreconfig to correctly handle filesystem containing no fileset.
    • Fix mmfsck so that it handles badly damaged inodes better when the replica count is bad.
    • Fix a fileset restore for linked filesets whose config was backed up with mmbackupconfig.
    • Fix SLES 11 automount.
    • Allow mmchnode --cnfs-enable to accept trailing spaces in network config file.
    • Fix mmbackup processing of -s and -g switch arguments.
    • Fixed dmapi event timeout handler to correctly broadcast message to waiting threads.
    • Fix mmbackup to limit memory consumed in sort program by using --buffer-size=5% on Linux and by using -T switch on all platforms.
    • Fix incorrect error message after last CNFS node is deleted.
    • Fix the code which caused GPFS daemon to assert after filesystem panic on FS manager node.
    • Inherit ACL entries based on filemode (should be the default ACL mode).
    • Get a stronger lock when prefetching inodes (rf vs ro). Fix assert DE_IS_FREE(fP).
    • The mmauth command, which supports multi-cluster configurations, stopped working on Windows platforms beginning with GPFS 3.3.0.3. The problem was due to incompatibilities between the OpenSSL library and the WinSock library (which was new in this release). This issue is now resolved. GPFS for Windows uses a custom built OpenSSL library compatible with the WinSock library.
    • Correct a problem when running mmmount all_remote for a Windows node.
    • Enforce stricter device naming on Windows cluster.
    • Allow change directio flag for immutable files.
    • Fix a GPFS deadlock that occurs on Linux, under high load conditions, with memory pressure and memory mapped files.
    • A race condition in the Windows POSIX subsystem (SUA) makes it possible for process fork operations to hang. This problem can hang GPFS and require a restart. To avoid this problem, all fork/exec operations in the GPFS daemon, which are used to start a child process, have been replaced with native Windows APIs.
    • There are now long form option names for each command option, at least within the tsapolicy C program.
    • Fix code to remove stale object when deleting snapshot.
    • Fix mmdeldisk syntax error message.
    • Fix code that could cause assert after node fail while running fsck.
    • Change the type of parameter 'ino' to InodeNumber in cxiFillDir_t. If INODE64_PREP is defined, InodeNumber is Int64; otherwise it is Int32.
    • Fix code which caused an assert during filesystem manager takeover after manager node failed.
    • Fix problem with tsmigrated migrating to the new clmgr node when the clmgr node changes.
    • Fix module build errors with Linux kernel version 2.6.33.
    • Fix mmfileid command to scan user data for disk address check.
    • Fix a typo in the routine that retrieves node information causing mmsnmpagentd to terminate occasionally.
    • Fix for a rare race condition during disk address lookup of a newly allocated address under heavy load.
    • Install a page fault handler when user data is copied by kxReleaseMutex. This addresses long waiters and Oops:kxReleaseMutex on ppc64.
    • Fix an unexpected remote copy error from mmrestoreconfig command.
    • Ensure the mmsdrserv process is not killed if it uses its own separate TCP port.
    • Fix an extremely rare case where higher-level indirect blocks were not being flushed when they were supposed to be.
    • Add input validation for xattr value size.
    • Change mmfileid to find "invalid" disk addresses when using the :BROKEN keyword.
    • Fix excessive prefetching IO for random NFS read workload on large files after installing 3.3.0.5.
    • Ensure that all locks acquired during the lock file operation are released when the operation fails. This removes the need for an explicit file lock release, without which the code would assert.
    • Correct a problem when resetting a config parameter to its default value for a subset of the nodes.
    • Fix code to correctly respond to returning error from PIT parent node.
    • Fix problem where an NFS client gets "permission denied" when "subdir/.." is looked up internally.
    • Fix mmexectsmcmd to tolerate error return codes such as 4, 8 and interpret them as non-fatal.
    • Avoid GPL compiling warning. Use void* instead of struct inode* in cxi file and cast it in OS specific file.
    • Fix for a rare deadlock during recovery on Windows.
    • Avoid chance of deadlock when updating shared directory under high load.
    • Rework handling automount on RHEL 5 or SLES 11.
    • On AIX disable the filter process, and use the trace control file to format the trace instead of the merged ones.
    • Allow mmgetstate -s to continue even if there is an inaccessible node in the cluster.
    • Fix for a rare assert during multi-node create/delete races.
    • Improve handling of PIT worker nodes starting RPC.
    • Fix problem in fully replicated filesystem which will not mount if all the disks in one FG are stopped and suspended.
    • Fix mmwinserv to prevent a possible hang on Windows nodes during mmstartup.
    • Verify interface is up (IFF_UP) before processing it. Interfaces brought down using "ifconfig down" (unlike "ifdown ") are returned with SIOCGIFCONF.
    • Check whether mmfs.log.previous file exists before renaming it.
    • Fix a rare assert caused by RelinquishAttrByteRange thread.
    • Fix EA limit calculation.
    • Fix code to improve cache handling.
    • Close a timing window to eliminate an assert which can happen under heavy load when disks are being quiesced.
    • Fix a deadlock involving read-write mmap under heavy stress.
    • Fixed a Windows node failure that occurred when clusters were configured to provide SNMP events.
    • Fix a race condition between mmnfsdown and mmnfsup so that mmnfsdown can kill all nfsmonitor process.
    • Fix code to correct the order of initializing state of PIT nodes.
    • Fix problem where restripe/chdisk/rpldisk commands return error 'Invalid Argument' while processing user file.
    • Avoid a rare failure accessing a directory long after concurrent updates.
    • Fix array out of bound problem in eaRegistry dump function.
    • Prevent gpfs_iwritedir api from asking to open inodes that are fs metadata inodes.
    • Fix assert caused by rare CPU cache inconsistency situation on X86_64 hardware.
    • Fix GPL build problem on BG.
    • This update addresses the following APARs: IZ73346 IZ74517 IZ74539 IZ74542 IZ74544 IZ74547 IZ74549 IZ74550 IZ75250 IZ75252 IZ75259.
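
    The mmexectsmcmd item in the list above describes treating return codes such as 4 and 8 as non-fatal. A minimal sketch of that kind of classification follows. The function names and the exact set of tolerated codes are hypothetical and are not taken from the GPFS scripts; the release note names 4 and 8 only as examples.

    ```python
    # Hypothetical set of tolerated exit statuses (assumption, not from GPFS):
    # 0 for success, plus the 4 and 8 that the release note gives as examples.
    NON_FATAL_RETURN_CODES = frozenset({0, 4, 8})


    def is_fatal(return_code: int) -> bool:
        """Return True when an exit status should be treated as a hard failure."""
        return return_code not in NON_FATAL_RETURN_CODES


    def summarize(return_codes):
        """Split a list of exit statuses into (tolerated, fatal) lists."""
        tolerated = [rc for rc in return_codes if not is_fatal(rc)]
        fatal = [rc for rc in return_codes if is_fatal(rc)]
        return tolerated, fatal
    ```

    A caller would continue the run when `summarize` returns an empty fatal list and stop otherwise.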

    Problems fixed in GPFS 3.3.0.5 [April 1, 2010]

    • Fix problem where metadata-update intensive workloads (e.g., file creates and deletes) running on systems with large inode cache (large maxFilesToCache) would periodically pause for several seconds.
    • Fix spurious EIO errors accessing hidden .snapshots directories enabled via "mmsnapdir -a".
    • Allow case insensitive node identifier in specfile.
    • Fix fsck code so that it does not report the corrupt addresses problem that it claimed to have fixed during the previous fsck run.
    • Improve Linux trace performance.
    • Fix mmtrace to return non-zero value and report error when lxtrace binary for current kernel is not installed.
    • Fix performance problems when reading large files from NFS clients.
    • Correct problems with mmaddcallback -N clustermanager & mmlscallback.
    • Fix missing out-of-memory check in get inode routine.
    • Improve performance of large file create when DIO is used.
    • Fix buffer calculation in dm_get_events when buffer size is greater than 64K.
    • Fix potential loss of events in dm_get_event() call when buffer size is greater than 64K.
    • Notify dm_get_events that the session already failed after quorum lost in the cluster.
    • Fixed potential assert when writing small files via NFS under heavy load.
    • Fixed hardlink assertion problem when upgrading file system from 2.3 to 3.3 or later.
    • Remove stat() calls in the mmshutdown path.
    • Fixed an allocation loop which could occur during mount and rebalance of a filesystem.
    • Fix mmbackup to divide up the list of files that need backup based on filesize when numberOfProcessesPerClient=2 (or more).
    • A progress indicator is added in the case of mmchfs -F if it leads to the expansion of preallocated inodes.
    • Fix a race condition between deldisk and deallocation of surplus indirect blocks that could result in dangling block pointers.
    • Fix disklease inconsistencies between cluster manager resetting lastLeaseProcessed and the client resetting lastLeaseReplyReceived.
    • When token_revoke results in a downgrade for a device file, call invalidate so that device-specific cleanup occurs.
    • Fix allocation code which can cause "No space left on device" error on initial mount after filesystem creation.
    • Fixed a file sharing check that was causing an incorrect "access denied" error.
    • Restart mmsdrserv after installing new code on Windows.
    • Remove redundant preMount and Mount user callback events.
    • Stop GPFS trace automatically when doing upgrade to GPFS 3.3 on Linux. Added more detailed error messages when GPFS kernel extensions cannot be unloaded.
    • Fix a quota problem that fails to translate invalid fileset ids.
    • Fix initial run of mmbackup recall in UTC+ timezone to avoid recall of unchanged files that are already on the TSM server.
    • Fix assert s_magic == GPFS_SUPER_MAGIC on kernel 2.6.16.60-0.59.1 and above.
    • Fixed allocation code which caused an assert during daemon shutdown.
    • Load policy file on sgmgr when file system is mounted so that the low space threshold is always set when the file system is mounted.
    • Return EMEDIUMTYPE rather than ELNRNG for incompatible format errors on Linux.
    • Cleanup flag beingRestriped if inode is deleted while restriping.
    • Fix an assert in kxCommonReclock on AIX node.
    • Error conditions returned due to failed metadata flush operation are handled appropriately preventing the restripe operation from asserting due to failed checks.
    • Fix a problem in quota file creation when file system pool has metadataOnly disks.
    • Fix to avoid holding mutex twice while revoking token encounters SGPanic.
    • Fixed a rare race condition during Windows GPFS initialization that could cause a system fault.
    • Fix code to avoid unreasonable checking for socket.
    • Fix exception using a spin_lock in fasync_helper during fcntl revoke.
    • Fix server side token issue in failure cases.
    • Ensure that online fsck is not held up forever trying to steal buffers from an inode range that is currently locked for online fsck.
    • Fixed the parallel inode traversal code which can cause signal 11 during restripe and replace disk.
    • Fix code to remove unnecessary assertion when a token is revoked while a large file is being restriped.
    • Fix problem where in a file system with large snapshots a failure of the file system manager during the first phase of an mmrestripefs or mmdeldisk command could under certain timing conditions cause corruption.
    • Fix problem where filesystems created by GPFS release 2.3 or older were not mountable by GPFS release 3.2 or 3.3.
    • Fix for a very rare race condition where a non-DIO read from a cached buffer may transiently return partially incorrect data.
    • Fixes issues in minorityQuorum clusters that have leaseDuration set and have migrated from 2.3 to 3.2.
    • Fix to tolerate an inconsistent state of Windows security settings on an inode following a failed TSM restore.
    • This update addresses the following APARs: IZ68715 IZ68725 IZ69476 IZ70073 IZ70074 IZ70396 IZ70409 IZ70599.
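
    The mmbackup item in the list above mentions dividing the list of files to back up based on file size when numberOfProcessesPerClient is 2 or more. The release note does not say how the split is computed. The sketch below shows one common approach, a greedy largest-first assignment, purely as an illustration; `split_by_size` and its inputs are hypothetical and are not the mmbackup implementation.

    ```python
    import heapq


    def split_by_size(files, num_processes):
        """Greedy split: give each file, largest first, to the least-loaded bucket.

        files is an iterable of (name, size_in_bytes) pairs (hypothetical input).
        Returns a list of num_processes lists of file names.
        """
        if num_processes < 1:
            raise ValueError("num_processes must be at least 1")
        # Heap entries are (total_bytes, bucket_index); the lightest bucket pops first.
        heap = [(0, i) for i in range(num_processes)]
        heapq.heapify(heap)
        buckets = [[] for _ in range(num_processes)]
        for name, size in sorted(files, key=lambda f: f[1], reverse=True):
            total, idx = heapq.heappop(heap)
            buckets[idx].append(name)
            heapq.heappush(heap, (total + size, idx))
        return buckets
    ```

    Splitting by accumulated size rather than by file count keeps one process from receiving all of the large files.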

    Problems fixed in GPFS 3.3.0.4 [January 28, 2010]

    • Fix problem where mmrestripe command might not correctly detect I/O errors during the first phase of restripe.
    • Correct the resetting of config parameters to default on a subset of the nodes.
    • Replace usage of the lsvg command with getlvodm.
    • Fix function checkIntRange error message when checking negative numbers.
    • Clear the tiebreaker disk parameter after mmexportfs all.
    • Fix ioctl opcode conflict with FIGETBSZ on Linux kernel 2.6.31 and later.
    • Fix fsck to avoid incorrectly reporting and fixing of filesystem corruptions in a heterogeneous cluster.
    • If the file system is internally forced to unmount (file system panic), invoke the preunmount user exit if one is installed.
    • Avoid confusion when using a local fcntl lock versus an NLM one.
    • Give customers using mmbackup more flexibility by allowing alternate install location for TSM.
    • Fix determining filename length when filename contains invalid UTF8 characters.
    • Fix data corruption when using mmap.
    • Fix assert due to invalid fcntl acquire sleep element found on the kernel queue.
    • Keep FS descriptors off of excluded disks even if they come online.
    • Fixed a race condition by serializing the xattr object in inode properly.
    • Fix hang between node failure thread and events exporter request handler thread.
    • Fix mmapplypolicy to estimate correctly the number of GPFS storage pool bytes freed by migrating to an external/HSM pool. Introduce MM_POLICY_MIGRATION_STUBSIZE environment variable to allow users to directly control size for migration.
    • Fix mmbackup to avoid giving file name length and file size to TSM for inclusion in backup list.
    • Fix async recovery to let mounts succeed while also processing deferred deletions.
    • Fix assert failure on FS manager node when unmountOnDiskFailure=yes and a disk fails after 3.2.1.14-16 installed.
    • Prevent HSM and NFS from asking to open inodes that are system metadata nodes.
    • Do not let socket get stuck in reconn_cleanup state following repeated breaks that occur just after connection handshake completes.
    • Reduce the pagepool usage by inode allocation segments during FS manager initialization or recovery.
    • Fix a problem with cutting traces in a CNFS setup.
    • Fix filesystem panic when a failed disk holds a FS descriptor and returns unexpected error codes.
    • Fix problem with mmlsfileset when expanding inodes is running concurrently.
    • Ignore un-supported permission flags passed to gpfs_i_permission on SLES11.
    • Fix for a SIGSEGV on Windows caused by a race in accessing the ACL file.
    • Fix a condition where mm commands can exit with errors if CWD is unavailable.
    • Fix for a rare failed assert in the main process thread on Linux.
    • Fix a race condition where node may be deleted right after it started up.
    • Fix code to correct backward compatibility of non-blocking token request between gpfs 3.2 and gpfs 3.3.
    • Make trace recycle timeout message more descriptive and avoid recycle file being overwritten when trace recycles next time (Linux nodes only).
    • A subsequent tscrfs command would unexpectedly unset some flags even if it could not get permission to run, which caused a daemon assert. Flags are now cleared only if the command itself set them.
    • When open of the directory fails and not all fields are set, do not call back into GPFS to do close (release). This may cause an invalid assert due to attempting to reference uninitialized fields.
    • Fix signal 11 due to bad RDMA index and cookie received from the TcpConn in verbs::verbsClient_i.
    • Fix remote startup on Windows.
    • Fix a race condition between an mmexpelnode and mmchmgr.
    • Correctly cleanup tmp files on remote nodes.
    • Fix a problem in mmdf where number of free inodes may become negative.
    • Fix race condition that occurs due to disk failure during clmgr election while using tiebreaker disks.
    • Fixed inode expansion code which can cause restripe to fail with an assert. This problem only happens when restripe and inode expansion run concurrently.
    • Several sample script and configuration files are now included with the GPFS for Windows installation. These can be found in %SystemRoot%\SUA\usr\lpp\mmfs\samples. Only the files appropriate for use on Windows are included; additional samples are available with UNIX installations.
    • Fix assert "offset mappedLen" when reading dirs.
    • Fix allocation manager problem that caused pool to not be deleted when it should have been.
    • Initialize allocSize variable during the initialization phase of file repair to prevent assert.
    • Fix a rare bug that occurs during nsd config change along with earlier disk issues to another deleted nsd.
    • Fixed a GPFS on Windows failure that can occur on systems with a large number of cores (e.g. 8 or more) running a workload with thousands of threads. When this error occurs, /var/adm/ras/mmfs.log.* shows "logAssertFailed: tid >= 0 && tid = MAX_GPFS_KERNEL_TID". The fix for this problem removes any assumption on the maximum thread ID.
    • Fix a problem that can lead to loss of an intermediate SSL key file.
    • Fix mmbackup to accurately reflect the error encountered on the TSM server.
    • Fix a problem with interpreting the syncnfs mount option.
    • Fix fsck so that it reports duplicate fragments and their count correctly, and also prevent a possible fsck crash due to count overflow.
    • Added %myNode as callback parameters.
    • Fix an assertion during mount that could happen when quota management is enabled and snapshot is being used.
    • Fix fsck so that it detects problems and fixes them without encountering struct assert errors even if the 'assertOnStructureError' config option is turned on.
    • This update addresses the following APARs: IZ67659 IZ67660 IZ67661 IZ67662 IZ67663 IZ67664 IZ67665 IZ67666 IZ67667 IZ67723 IZ67746 IZ68028.

    Problems fixed in GPFS 3.3.0.3 [December 10, 2009]

    • On x86_64 Linux, when a special encoding flag is set in a function's debugging frame section, an extra offset should be added during decoding. Otherwise, the thread traceback cannot be decoded correctly.
    • Fix Stripe group configuration change so data block loss cannot occur if data is being ingested along with configuration changes.
    • On Windows, consolidated separate -msi and -sh installation logs into a single file (gpfs-install.2009.10.20.11.23.58.gpfs-n70-win.log). Also, eliminated Command window popups that were appearing during GPFS installation.
    • Fix problem where the new inode scan function gpfs_next_inode64() would return incorrect values for some gpfs_iattr64_t fields on file systems originally created prior to GPFS 2.3.
    • Change to GPFS inode scan API for gpfs_ireaddir64. The directory entry structure returned by the call now contains a flag field and allows directory entry names to be 1020 bytes (i.e., 255 characters encoded in UTF-16 or other Unicode encodings).
    • Fix policy handling of rules of the form "EXCLUDE FROM POOL" to prevent LOW_SPACE events from incorrectly being logged.
    • Extended Attributes (EAs) on Windows have been changed so that internally they are stored with a "user." prefix. This change supports compatibility with Linux and improves file system security.
    • Fix problem on systems configured with large maxFilesToCache that could cause file systems to be unmounted on some client nodes when running recovery after a manager node failure.
    • Fixed GPFS hang when recalling files from HSM.
    • When running lsvg do not wait for the volume group lock.
    • Fix mmbackup to give users early warning and exit when unlinked filesets are present during backup. It also prevents further processing of files that would otherwise give the user misleading error information.
    • Limit the number of attempts made to destroy nfsd threads in mmnfsquorumloss in case an nfsd thread is stuck waiting for IO to complete in GPFS.
    • Don't stop NFS or unexport fs on quorum loss, and kill NFSDs that are stuck during setNfsdProcs.
    • Fix mmdelnode syntax error checking.
    • Fixed the allocation code which caused a loop during metadata allocation. This problem only affects filesystem with metadata replication enabled.
    • Fix mmbackup incremental to handle conversion from short filename records to new longer records after upgrade to 3.2.1.14 or later.
    • Fix Linux "mmnfsinit start" command and return correct return code.
    • Fix problem where a multi-threaded workload reading extended attributes from a large number of files could cause accumulation of a large number of byte range tokens leading to slowdown and spurious ENOMEM errors.
    • Upon mmfsd daemon failure, change the way of debug data collection to asynchronous script execution.
    • Fix a quota initialization problem that could allow quota files in storage pools, while they really belong in system pool.
    • Fix for a rare race condition that may cause an assert in the invalid fileset object disposal path.
    • Added new command line option "--oneerror" to mmaddcallback command.
    • Enable mmapplypolicy on fs with 8MB blocksize.
    • Correct processing to prevent quota requests from being performed while the quota manager operations are being quiesced.
    • Converted GPFS to use the Windows sockets library (WinSock) rather than the SUA library. This change fixed an issue with large data transfers that appeared when a file system's block size was larger than 256KB. The WinSock library also significantly reduces the CPU consumption required to perform network data transfers.
    • Fixed error handling when registering pagepool memory to Infiniband.
    • Correct quota hard limit processing to check grace time.
    • Correct a rare problem (due to an error encountered writing quota files) that can prevent a newly created filesystem from being mounted.
    • Improve stability when encountering hostname resolve issues.
    • Prevent the callback command path from being in a GPFS file system.
    • Fixed a problem which prevented filesystem remount after a forced umount due to an error (for example, filesystem panic or quorum loss).
    • Fix panic handler code to ensure that the right fsck cleanup code path is chosen by looking at the workerNode flag in the fsck data structure.
    • Correct processing during restoring filesets to allow more than the KSH array limit of 1024.
    • Fix quota manager cleanup when file system manager migrates to another node.
    • Fix a file structure error caused by SetAllocationSize.
    • Enable kdump to retrieve kernel thread's backtrace on IA64.
    • Handle IB port event of LID change.
    • Correct a problem when verifying that the daemon is down from a Windows node.
    • Resolved an issue with mmwinservctl where the command would fail to set the account name and password. This error would occur if Windows is not installed in the same location on all nodes in the cluster (e.g. some nodes have Windows installed on the C: drive and others have it installed on the D: drive).
    • Fix to avoid assertion when calculating the next valid data block number of low level file.
    • Removed some commands and programs that were included in the Windows installation, but not supported on Windows.
    • Parse the 'ro' mount option and pass it explicitly to gpfsMount to prevent Windows write on readonly filesystem.
    • Fix mmbackupconfig in Windows to give the customers the correct mmbackupconfig behavior and exit gracefully.
    • Fix mmapplypolicy to run on a large number of files with the -g and -N flags.
    • Make filesystem restore messages for mmrestoreconfig more descriptive.
    • Fix mmbackup to backup snapshots that are older than the latest filesystem backup.
    • Fix mmbackup to ensure that the only desired TSM servers will be processed.
    • Package mmbackup32 for use with 3.2 clients.
    • Add support for -N nodeList option in mmbackup version 3.1 and 3.2.
    • Fix possible deadlock restriping a file system with data replication enabled under application load and with small pagepool.
    • Warning messages on conflicting operations are sent to stderr to avoid cluttering stdout.
    • Resolved an issue that in rare cases could cause GPFS to terminate when tracing is enabled.
    • Fixed some mmwinservctl operations which were causing GPFS to start inadvertently when GPFS was configured with autoload=yes.
    • Fix mmchconfig trace command to load kernel extensions if not already loaded.
    • Increase the default maximum size of the shared segment on 64-bit AIX to 1G (32-bit AIX is architecturally limited to 256M).
    • This update addresses the following APARs: IZ63333 IZ65179 IZ65379 IZ65416.
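
    The gpfs_ireaddir64 item in the list above quotes a 1020-byte limit for directory entry names, described as 255 characters. The arithmetic is consistent with allowing up to 4 bytes per character, as the sketch below illustrates. The constant and function names are hypothetical, and the sketch makes no claim about which encoding GPFS uses internally.

    ```python
    MAX_NAME_CHARS = 255   # character count quoted in the release note
    MAX_NAME_BYTES = 1020  # byte limit quoted in the release note


    def fits_in_entry(name: str, encoding: str = "utf-8") -> bool:
        """Return True when the encoded name is within the 1020-byte limit."""
        return len(name.encode(encoding)) <= MAX_NAME_BYTES


    # U+1F600 lies outside the Basic Multilingual Plane, so it needs 4 bytes in
    # UTF-8 and a 4-byte surrogate pair in UTF-16. 255 of them is the worst case.
    worst_case = "\U0001F600" * MAX_NAME_CHARS
    utf8_len = len(worst_case.encode("utf-8"))
    utf16_len = len(worst_case.encode("utf-16-le"))  # "-le" avoids the 2-byte BOM
    ```

    Both `utf8_len` and `utf16_len` come to 255 × 4 = 1020 bytes, so a name of 255 such characters just fits and one more byte does not.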

    Problems fixed in GPFS 3.3.0.2 [November 12, 2009]

    • On Windows, mmtracectl no longer requires ActiveState Python in order to collect and format traces.
    • During a stripe group mount, ignore the 'exclDisks' mount option for a filesystem in a remote cluster.
    • Fix direct I/O path to set the Windows archive bit only for writes, not reads.
    • Add more trace information and ifdef for problems with Linux NFS fast lookup.
    • Ensure mmexportfs does not remove tiebreaker disks unless device name is "all".
    • Fix kernel exception in fifo_open due to an invalid i_pipe pointer.
    • Increase prefetchThreads+worker1Threads+nsdMaxWorkerThreads to 1500 on AIX 64bit systems.
    • Added two new calls to the inode scan api to allow dmapi attributes to be backed up and restored for disaster recovery. The dmapi attributes are also saved in a snapshot of the original file on which they were set.
    • Collect the output of systeminfo.exe command instead of the command itself.
    • Fix trace record missing problem. The _STrace function should pass non-blocking flag as 0 instead of 1 to rl_trc_write.
    • Improve SMP scalability in the DIO code path.
    • Fix small window where message send will hang if destination list includes the local node and all other nodes reply before the local send can start.
    • Prevent force unmounts when disks in different failure groups (FGs) fail, but are in different pools. Prevent marking disks down in multiple FGs when disks die simultaneously.
    • Allow command line case insensitive hostname.
    • Add a new file system option --filesetdf.
    • Fixed assert when deleting files from a dmapi enabled filesystem.
    • Dmapi locks are ignored from lock file operation during offline fsck as offline fsck does not have any dmapi context. This will prevent offline fsck from crashing the daemon while fixing orphans in a dmapi enabled filesystem.
    • Fix mmapplypolicy to accept a policy that includes a 3-value THRESHOLD(hi,lo,pre) clause in a migrate to external pool rule.
    • Ensure mmfsctl syncFSconfig does not affect free disks unless device name is "all".
    • Ensure mmaddnode fails if the IP address already appears in the cluster.
    • Fix node crash from destroy event when accessing stalled stripe group.
    • Correct mmcrnsd disk sector size (>2T) error on 32bit Linux.
    • Fix mmpolicyExec-hsm.sample to handle characters \ " and ' in filenames properly so that they work in HSM file list.
    • Fix deadlock during FS manager takeover if previous FS manager and its disks (site failure) fail at the same time.
    • Fix DMAPI enabled filesystems when they are mounted on top of another GPFS filesystem.
    • Fixed performance problem while migrating files.
    • To avoid 32-bit integer overflow in case of huge sparse file, argument dataBlockNum in repairDataBlock is changed to Int64 instead of int.
    • Modified GPFS installation on Windows to prevent GPFS from being started after a package update, but before the mmwinserv service is configured. This problem can result in permission errors when running GPFS administrative operations.
    • Fixed a problem in GPL layer makefile so that no warning messages will appear when upgrading from 3.3 GA code on Linux platform when GPL layer has never been compiled before.
    • Acquire the stripe group descriptor mutex before changing the quota files inode information in the stripe group descriptor.
    • On Windows, error messages related to the mmwinserv service were improved to provide clearer indication of the problem and how it might be corrected.
    • Fixed the allocation code which caused an infinite loop when running out of full metadata block.
    • Fix for a rare race condition that may result in a "Busy inodes after unmount" syslog message on Linux.
    • Fsck generated false positives for bad replication status in user metadata files after a failed PIT operation. The fix ensures that fsck does not generate the false positives.
    • When fixing orphans, fsck now prints the fileset name from which the orphan is generated.
    • Fix mmbackupconfig to present a clearer error message when users attempt to run the command while the filesystem is mounted.
    • Fix mmapplypolicy command to indicate it is making progress.
    • Hold the stripe group descriptor mutex only while actually accessing/updating the stripe group descriptor when updating the quota file information in the descriptor.
    • Avoid crash in rare cases of concurrent multi-node file creates.
    • Prevent customer files that have been backed up to the TSM server from expiring in the same session.
    • Enabled global cluster wide events from remote cluster in user exit callbacks.
    • Fix "mmtrace noformat" to work on Linux nodes.
    • Correctly propagate authorization key files to new node in admin central cluster.
    • Fixed repair code that could cause snapshot file corruption when the filesystem contains filesets and snapshots and deleting a disk leaves too few failure groups for proper replication of metadata.
    • Clarify mmwinserv error messages.
    • Fix rare deadlock that occurs during recovery of large filesystem.
    • Fixed GPFS trace control on Windows which in some scenarios was not restarting trace collection correctly.
    • Fix backward compatibility problem that caused file creates to fail on file systems that were originally created with GPFS version 2.2 or earlier.
    • This update addresses the following APARs: IZ63058 IZ63080 IZ63171 IZ63307 IZ63308 IZ63320.
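
    The repairDataBlock item in the list above changes dataBlockNum from int to Int64 to avoid 32-bit overflow on huge sparse files. The sketch below illustrates the general failure mode with a C-style 32-bit signed integer. The block size and file offset are hypothetical values chosen only so that the block number reaches 2^31; they are not taken from GPFS.

    ```python
    import ctypes


    def as_int32(value: int) -> int:
        """Reinterpret value as a C 32-bit signed int (wraps on overflow)."""
        return ctypes.c_int32(value).value


    def as_int64(value: int) -> int:
        """Reinterpret value as a C 64-bit signed int."""
        return ctypes.c_int64(value).value


    def block_number(offset: int, block_size: int) -> int:
        """Index of the block containing the given byte offset."""
        return offset // block_size


    # Hypothetical example: 16 KiB blocks and a byte offset of 32 TiB in a sparse file.
    BLOCK_SIZE = 16 * 1024
    OFFSET = 32 * 1024**4
    block = block_number(OFFSET, BLOCK_SIZE)  # 2**31, one past the int32 maximum
    wrapped = as_int32(block)                 # wraps to a negative block number
    preserved = as_int64(block)               # fits comfortably in 64 bits
    ```

    With a 32-bit type the block number wraps to a negative value, while the 64-bit type holds it unchanged.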

Document Information

Modified date:
25 June 2021

UID

isg400000344