Topic
  • 25 replies
  • Latest Post - ‏2011-12-20T17:27:18Z by gpfs@us.ibm.com
ldmurphy
3 Posts

Pinned topic GPFS v3.2.1 Announcements

‏2009-03-20T12:54:35Z |
Watch this thread for announcements on the availability of updates for GPFS v3.2.1.
Updated on 2011-12-20T17:27:18Z by gpfs@us.ibm.com
  • ldmurphy
    3 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2009-03-20T13:01:02Z  
    GPFS v3.2.1-10 was released on March 12 and is available from the GPFS Corrective Service Page
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2009-04-27T19:55:24Z  
    The Linux and Windows versions of the GPFS 3.2.1.11 images built on 8 Apr 2009 had a problem with the 'mmfsadm dump all' command, and have been replaced on the service download site. New images that correct the issue have a build date of 23 Apr 2009. AIX images are not affected by this issue. Since 'mmfsadm dump' is a service command, it is not necessary to reinstall GPFS for normal operation even if the old 3.2.1.11 packages are installed. However, if GPFS debug data is requested by IBM service personnel, installing the corrected version is strongly recommended.
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2009-05-29T12:41:08Z  
    GPFS 3.2.1.12 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.12
    May 28, 2009
    • Correct handling of mmfsck command on remote node if neither -y nor -n flag was specified.
    • Fix case where filesystem recovery might run while a node was not yet completely failed.
    • Fix handling of no-space errors and obey the replica allocation policy setting for quota files.
    • Fix for the "privVfsP != NULL" kernel assert in inode.c.
    • Fix dereference of stale OpenFile pointer in gpfsMmap.
    • Fix mmap to avoid mmap related hangs in the Linux IPA program analysis, IMAP processing, and AIX binder.
    • Fix apparent file system hang due to mmdelsnapshot performance if a large number of snapshot files had been accessed recently.
    • Fix mmdumpkthreads/kdump for BlueGene/P IO nodes.
    • Use direct I/O rather than buffered I/O in multipath environment for tspreparedisk.
    • Avoid segmentation violation on 'mmfsadm dump all' command by adding argument validity checking.
    • Fix pagepool buffer deadlock when writing using invalid application buffer memory addresses.
    • Increase the maximum number of supplemental GIDs supported on Linux.
    • Skip fast directory lookup using fold value on older Linux kernels.
    • Correct 'df' command output on NFS exported file-set to report the same numbers as local.
    • Properly return open file pointer in cxiCheckOpen to avoid filesystem corruption.
    • Fix handling of pagepool buffers when snapshot files are scanned and the active file has both user extended attributes and DMAPI attributes.
    • Do not delete the entries from the quotaEntryTab unless all the modified entries have been written to the disks, avoiding an assert.
    • Avoid deadlock when multiple nodes empty a very large directory.
    • Do proper endian conversion on imported FskError inode problems when imported from a different endian machine.
    • The FakeSync() routine should check the cached flag of an open file pointer and skip it if the flag is not yet set.
    • Make mmrestripefile honor the filesystem's strictness option.
    • Fix handling of quota file indirect blocks.
    • Fixed code which can cause an assert during stripe group cleanup.
    • Add better checking of socket state so the sgmgr tsdefragfs thread detects client termination and exits as early as possible when the client program terminates.
    • Fix assert due to narrow race window when a token revoke arrives at a node which is panicking the related file system.
    • Fix assertion that happens when mmcrfs command is interrupted before completion.
    • Fix rare assert due to mmchmgr command running during node recovery.
    • Avoid Oops in datashipping, change to get data from the mailbox before freeing it.
    • Fixed incomplete error checking when acquiring a byte range token for fcntl locking.
    • fixDirBlock() should translate inode numbers outside the allocated range to INVALID_INODE_NUMBER.
    • Fix assert that can occur during a narrow race window in the daemon shutdown sequence.
    • Correct parsing of mmcrfs -i parameter.
    • Fix rare deadlock with low worker1threads on BlueGene/P and large number of openfiles.
    • Fix an assert caused by expanding the inode file while creating snapshots and deleting directories.
    • Fix small hole in synchronization between the idle disconnect thread and the receiver thread that can cause communications to hang with connection broken on one side only.
    • When openssl low level function returns error caused by invalid key file, shutdown mmfsd to avoid propagating the error to other code layers that may cause unexpected behavior.
    • Fix a problem in restripe code where a spurious error is generated when the most recent snapshot is in a state where it can't be opened.
    • Fix for a kernel stack overflow on RHEL4/i686 in mmgetacl/mmputacl path.
    • Fixed loop in error path during snapshot file creation.
    • Fix a problem in quota client startup and subsequent cleanup when file system mount can't succeed.
    • Fix double free in mmshutdown, change to use _exit() in linux_exit.
    • Validate timestamps when WatchDog thread is used to check hung thread periodically.
    • In the case of a non-'..' entry with an invalid inode number, fsck prompts the user with a message that the entry needs to be removed.
    • Fix asserts due to quorum loss when revokes are pending to local node in remote cluster context.
    • Fixed assert during processing of mmrestorefs command.
    • Do not allow -N to be specified with mmchconfig dmapiMountEvent=xxxx .
    • Account for a missing common line in the mmlsconfig output.
    • Fix incorrect number of inodes used or free shown by df -iv. Prevent premature inode file expansion.
    • Fix immutable file code which caused long waiters when running mmchattr -i command.
    • Fix assert that occurs in mmcrfs when it is run with a large (~2 billion) number of preallocated inodes specified.
    • Fix problem when system log files are left on a disk that was changed to be "descOnly" disk.
    • Correct Linux daemon shutdown sequence in order to wait for child processes (opened by user exits or pipe) to finish.
    • Fix a broken network connection problem between datashipping nodes, add NULL address processing.
    • Remove Linux vfsUserCleanup which kills processes with a current working directory in a file system being force-unmounted.
    • Fixed node crash when shutting down gpfs due to mmap counter problem.
    • This update addresses the following APARs: IZ43334 IZ48161 IZ48580
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2009-07-21T15:03:31Z  
    GPFS 3.2.1.13 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.13
    July 9, 2009

    Note: Unless specifically noted otherwise, this history of problems fixed for GPFS 3.2.x applies for all supported platforms.

    • Performance improvement to avoid deadlock when multiple nodes empty large directories.
    • Change mmdeldisk to copy blocks to retain replication while doing its scan. This removes the need to do a suspend and mmrestripefs before doing mmdeldisk.
    • Fix cluster manager migration code which can cause an assert when migration fails.
    • Fix a problem where a file system manager failure followed by another node failure within a very short time could lead to allocation map corruption under certain timing conditions.
    • Avoid hang after forced unmount interrupts addition of new inodes.
    • Prevent rare data corruption after node failure just before enabling log replication.
    • Issue error for mmchfs -v if the node is down.
    • Fix mmrepquota output to display large numerical user ids correctly.
    • Fix mmunlinkfileset problem on systems with large cache so file system operations do not block for an extended period.
    • Correct return value type of getMaxAcqSeqNo to prevent 32-bit overflow.
    • Fix for a deadlock between pdflush and GPFS code that occurs under heavy mmap load.
    • Fixed a problem where the sgmgr node thinks that fs is mounted on client node(s) when it is not.
    • Add defensive code to mmdumpkthreads to prevent infinite loop in case stack frame is corrupt.
    • Fix a small timing window where file access within a snapshot could deadlock if the original file is being deleted at the same time.
    • Verify /proc/fs/nfsd is mounted before trying to update thread count.
    • Fix a rare deadlock in quota management when a quota entry is removed from the quota file.
    • Correct extraneous line in mmgetstate -a output.
    • Fix a defect that could hang GPFS daemon during file system manager take over process.
    • On AIX gpfs_get_realfilename should return ENOSYS and not generate SIGILL.
    • Remove deadlock by passing proper kernel operation to gpfsClose wherever it is already in effect.
    • Fix a problem in the mmpmon histogram facility which can cause a kernel exception.
    • Fix a deadlock during token manager appointment during heavy workload.
    • Prevent deadlocks between HSM recall and filesystem quiesce operations for mmbackup, mmcrsnapshot, mmdelsnapshot, or mmfsctl suspend.
    • Fail unsupported disk sector size on nsd create operation.
    • Fixed a race condition between delsnapshot code and inode prefetch worker thread prefetching inodes from snapshot being deleted.
    • Corrected a long waiter in quota processing during unmount of a filesystem.
    • Avoid crash when deleting snapshot on fs with very small block size.
    • Fix problems related to preserving file flags (such as illReplicated) when blocks are moved into a snapshot.
    • Fix mmdelsnapshot to skip syncFS error from nodes that no longer have the file system mounted.
    • Fix for a rare assert that occurs if multiple mmchmgr commands are used to move file system manager around in a quick succession.
    • For file systems with very large number of quota ids, fix a performance problem with cleanup of cached quota entries in file system manager causing slowness in manager takeover.
    • Fix a problem in gpfs_getacl.
    • Fix a rare race condition caused by concurrent mmdelfileset and mmchmgr in a quota enabled file system.
    • Fixed error in mmaddcallback command with -N quorumNodes option to only add to quorum nodes.
    • GPFS on AIX should be able to terminate cleanly.
    • Fix code to correctly show nested fileset junction path for command mmlsfileset.
    • Fix the size_t data type of the length component in the mmap ne and pass the value appropriately.
    • Correct intermittent error in m4 expansion of policy.
    • Fix the problem where 'lxtrace off' or GPFS hangs on Linux nodes when the lxtrace daemon is killed by SIGKILL.
    • Suppress warning during GPL build.
    • Fix an assert caused by incomplete cleanup of a failed mmdelsnapshot command.
    • Fix for a race condition between inode expansion and file system manager migration.
    • Improve performance of GPFS daemon to kernel calls on Linux by using the unlocked_ioctl op on the ss device, reducing acquires and releases of the big kernel lock.
    • Fix code which can cause a signal 11 while adding, deleting or replacing a disk.
    • Avoid unexpected growth of the inode file during FS manager takeover.
    • Fix a problem that occurs when multiple token servers fail. If a second token server fails during log recovery after a first token server failure, there is the potential for an assert in the daemon.
    • Fix potential memory corruption in IO waiter handler.
    • Fix race that occurs if a node goes down and comes back before the peer recognizes the socket connection break.
    • setxattr(ACL) was setting aceLength incorrectly causing mmgetacl to fail.
    • Fixed a problem where a false no-space error is returned when creating files while disks still have space left.
    • Fix creation of allocation maps when some LUNs have more than 4G subblocks.
    • Make sure arping is done on the real interface when aliases are present.
    • During mount display an error message if new quota files failed to be created, due to inconsistency with current degree of replication, allocation strictness or number of failure groups.
    • Fix problems that occur when snapshots are being processed and the file system runs out of space to allocate the second replica but can allocate the first replica.
    • Fixed snapshot creation code so it respects the replication strictness setting (-K).
    • This update addresses the following APARs: IZ52682 IZ53013 IZ53014 IZ53044 IZ53089 IZ53134 IZ53142 IZ53489 IZ53548 IZ53953
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2009-08-26T19:38:15Z  
    GPFS 3.2.1.14 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.14
    August 20, 2009

    Note: Unless specifically noted otherwise, this history of problems fixed for GPFS 3.2.x applies for all supported platforms.

    • Fix memory leak that may occur when a node that serves NFS and has active locks fails over.
    • Have gpfs.snap gather disk data for disks that mmdevdiscover would find, not just the local disks.
    • Fix potential for infinite loop when nfsd/nfsmonitor encounters an error.
    • Remove deadlock by passing the kernel operation from NFS to gpfsClose.
    • Avoid hang under very heavy load when multiple filesystems are used.
    • Invoke the syncfsconfig user exit after mmchconfig disk related changes.
    • Fix problem that could cause deadlock in some rare cases after a node receives unrecoverable I/O errors from the disk subsystem.
    • Performance improvement for NFS clients.
    • Fixed potential memory corruption in fcntl lock area.
    • Fixed problem in dm_handle_to_path() function so that it returns path length correctly.
    • Fix snapshot code which caused an assert after delete snapshot failure.
    • Allow lowering the worker1Threads configuration setting dynamically.
    • Prevent force unmounts when disks in different failure groups (FGs) fail, but are in different pools. Prevent marking disks down in multiple FGs when disks die simultaneously.
    • Fix mmbackup to handle pathnames up to 4096 bytes long.
    • Support AIO interfaces for kernels earlier than 2.6.19. Note: this is actually synchronous; GPFS 3.2 doesn't support true AIO semantics.
    • Run node failback asynchronously and check for IP release using arping.
    • Fix sample script ilm/mmpolicyExec-hsm.sample to handle spaces (blanks) in filenames.
    • Fix mmapplypolicy statistic to report proper count of LIST candidates.
    • Fix problem where NFS clients accessing hidden .snapshots directories below the file system root (enabled "mmsnapdir -a") would receive spurious ENOENT or ESTALE errors after some snapshot is deleted.
    • Avoid hang under very heavy load when failures cause node recovery.
    • Correct behavior issuing mmdeldisk after an interrupted mmadddisk which could possibly leave a filesystem unmountable due to recovery errors.
    • Allow mmlsfileset command to wait for admin commands to finish.
    • Fix performance problem with set/get dmattr.
    • Improve some NFS sequential file performance.
    • Fix TSM restore by a non-root user failing the permission check in putOpaquePolicyAttrs.
    • Avoid hang when unmounting a very active filesystem.
    • Fix errors on various file ops during mmapplypolicy.
    • Fixed race condition in mmap directio operations.
    • Change setacl behavior so non-owner sees error when WRITE_ACL permission is not granted.
    • During quorum loss, force query of stripe group manager.
    • Fix assert encountered by stat during inode revoke handling.
    • Fixed a locking order problem in allocation manager recovery code.
    • Fixed GetSomeDataBlockDiskAddrs to synchronize with metanode takeover.
    • Fix the assert in dm_create_session() to handle error conditions better.
    • Prevent assert that occurs if quorum loss occurs during mmchmgr command.
    • Fix race condition during unmount of a quota enabled file system.
    • Fixed assert when restriping and closing of a file happens simultaneously.
    • Avoid mmdf preventing other management commands from executing after an mmdf command failure.
    • Add better check for local node address during quorum formation.
    • Fix structure error after mmdelsnapshot caused by incomplete dirty indirect block flush.
    • Fix to check for client socket when scanning inodes file.
    • Fix rare race condition between quorum loss thread and i/o threads.
    • Fix handling of an alloc manager recovery message reply that caused offline fsck to erroneously exit with too many disks unavailable message.
    • Fix mmap write to pass assigned disk address to doDirectio.
    • Fixed infinite loop due to bad hash key when looking for indirect blocks.
    • Fixed crash during unmount of dmapi file system on Linux.
    • Fix printing of mmfsadm eventsExporter responses that are larger than 4000 bytes.
    • Fix a SIGSEGV problem during events export.
    • Fix hang when changing the cluster manager to a new node.
    • Support Linux AIO interfaces for kernels earlier than 2.6.19.
    • Fix an assert encountered while executing command mmfileid.
    • Fixed deadlock during quiesce due to requests being missed because they were not in the GlobalWaitQueue.
    • Avoid an assert that occurs if an attempt is made to unmount the filesystem during a quotacheck operation.
    • This update addresses the following APARs: IZ54485 IZ55419 IZ56130 IZ56131 IZ56249.
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2009-08-27T14:32:14Z  
    A new release of IBM General Parallel File System (GPFS) was announced on July 14, 2009, via US letter numbers 209-105 (IBM GPFS for POWER, V3.3) and 209-106 (IBM GPFS Multiplatform, V3.3). Effective with the announcement is the introduction of a new pricing, licensing, and entitlement structure for Version 3.3 and Version 3.2. The new GPFS Server and GPFS Client pricing and licensing model will offer you better pricing options. However, this change will require your IBM Client Representative or Business Partner to assist you in identifying how you use or will use GPFS on each node where GPFS is installed and activated. According to the GPFS Licensing Agreement, each node in the cluster must be designated as a GPFS Server or GPFS Client, with the associated GPFS Server or GPFS Client license.

    The type of license that is associated with a node depends on the functional roles that the node has been designated to perform.

    A node is defined as an individual operating system image that may appear on a single computer within a cluster, on a system within a cluster, or on a partition.

    GPFS Server License:
    •The GPFS Server license permits the licensed node to mount GPFS file systems and access data from operating system block devices. The GPFS Server license permits the licensed node to perform GPFS management functions such as cluster configuration manager, quorum node, manager node, and NSD server.
    •The GPFS Server license also permits the licensed node to share GPFS data through any application, service, protocol or method, such as NFS (Network File System), CIFS (Common Internet File System), FTP (File Transfer Protocol), or HTTP (Hypertext Transfer Protocol).

    GPFS Client License:
    •The GPFS Client license permits a node to mount GPFS file systems and access data from operating system block devices as well as NSD servers.
    •The GPFS Client license only permits exchange of data between nodes that locally mount the same file system. No other export of the data is permitted.

    The GPFS Client license may not be used for nodes to share GPFS data through any application, service, protocol, or method, such as NFS, CIFS, FTP, or HTTP. For this use, a GPFS Server entitlement is required.

    See the GPFS V3.3 Concepts, Planning, and Installation Guide and the GPFS V3.3 Advanced Administration Guide for instructions on how to designate a GPFS Server or GPFS Client node.

    Prior to this announcement, GPFS was licensed based on the total number of processor cores or processor value units (PVUs) on the nodes where GPFS was installed. There was only one GPFS license regardless of whether the node was being used as a GPFS Server node or a GPFS Client node. With the July 14, 2009 announcement, GPFS is split into two license types: a GPFS Server license and a GPFS Client license. Each is priced separately, and both continue to be priced based on the total number of processors available for use by the GPFS node. For each node in a GPFS cluster, you will need to identify the number of GPFS Server licenses or GPFS Client licenses that correspond to the way GPFS is used on that node. Note: although there are two separate licenses, the code for the GPFS Server and GPFS Client ships on the same media or is available via the same tar ball download; there are no separate installs, just one package. There are no functional changes in the GPFS 3.2 version of the code. In GPFS 3.3, customers will have to use the mmchlicense command to identify the node licensing designation. Certain commands (mmchnode, mmcrnsd, etc.) may fail if the node does not have the correct license type.
    Migration of Entitlements for Current GPFS Customers

    There should be no impact to you until your Software Maintenance Agreement or Subscription and Support renewal date. At that time, your entitlements or customer record will need to be migrated or upgraded to the new structure. A customer with 64 GPFS entitlements today will continue to have 64 GPFS entitlements tomorrow. Those entitlements will be split between GPFS Server and GPFS Client entitlements depending upon your configuration or use of GPFS. For example, you may be using GPFS as a server on 24 processors (three eight-way nodes) and as a client on the remaining 40 processors (these could be ten four-way nodes or five eight-way nodes, or some other configuration). Ordering Examples are provided below.

    It will be important to identify the correct configuration prior to renewal since the price for renewing a GPFS Server license versus a GPFS Client license is very different.
    Ordering Examples:

    Example 1: GPFS for POWER:

    One GPFS license = one processor core
    Common small commercial Power Systems cluster, virtualization is used:
    •Four Power 570 systems, eight processor cores per physical system, each partitioned into two LPARs with four processor cores per LPAR. Each LPAR is a GPFS node. All nodes access the disk through a SAN.
    •Three LPARs are configured as quorum nodes: three nodes with four CPUs each: 12 server licenses
    •Five LPARs are configured as non-quorum nodes: five nodes with four CPUs each: 20 client licenses

    Example 2: GPFS Multiplatform:

    GPFS license = 10 Processor Value Units (PVUs), 1 AMD Opteron core requires 50 PVUs
    Common System x HPC setup, no virtualization:
    •Four x3655 machines (eight cores each), 32 x3455 machines (four cores each). Each physical machine is a GPFS node (no virtualization).
    •Four x3655 nodes are configured as NSD servers and quorum nodes: four nodes with eight cores each = 32 AMD Opteron cores * 50 PVUs = 1,600 PVUs / 10 PVUs per license = 160 server licenses.
    •32 x3455 nodes are configured as NSD clients: 32 nodes with four cores each = 128 AMD Opteron cores * 50 PVUs = 6,400 PVUs / 10 PVUs per license = 640 client licenses. (A worked calculation of both examples follows.)
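    For readers who want to double-check the arithmetic, here is a minimal sketch of the two calculations above. The helper names are illustrative assumptions, and the constants are simply the per-core and per-PVU figures quoted in these examples; this is not a general pricing tool.

# Illustrative license-count arithmetic for the two ordering examples above.
# Helper names are hypothetical; the constants mirror the figures quoted in
# this post only.

def power_licenses(nodes, cores_per_node):
    """GPFS for POWER: one license per processor core."""
    return nodes * cores_per_node

def multiplatform_licenses(nodes, cores_per_node,
                           pvus_per_core=50, pvus_per_license=10):
    """GPFS Multiplatform: total PVUs divided by PVUs per license."""
    total_pvus = nodes * cores_per_node * pvus_per_core
    return total_pvus // pvus_per_license

# Example 1: GPFS for POWER (one license = one processor core)
print(power_licenses(3, 4))   # quorum LPARs     -> 12 server licenses
print(power_licenses(5, 4))   # non-quorum LPARs -> 20 client licenses

# Example 2: GPFS Multiplatform (AMD Opteron core = 50 PVUs, 10 PVUs per license)
print(multiplatform_licenses(4, 8))   # NSD servers -> 160 server licenses
print(multiplatform_licenses(32, 4))  # NSD clients -> 640 client licenses
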
    For Additional Information:
    Refer to the GPFS Announcements for your country dated July 14, 2009.
    Refer to the GPFS V3.3 Concepts, Planning, and Installation Guide and the GPFS V3.3 Advanced Administration Guide
    Refer to the GPFS FAQ at: http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/com.ibm.cluster.gpfs.doc/gpfs_faqs/gpfs_faqs.html
    or send an email to General Parallel File System/Poughkeepsie/IBM (gpfs@us.ibm.com) with your specific question.
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2009-10-15T19:30:16Z  
    GPFS 3.2.1.15 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.15

    October 8, 2009

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Add more information about which filesystem a disk may already belong to if known.
    • Fix for recently created filesystems using a 384 KB block size, where a create in a very large directory could fail unexpectedly with EFBIG.
    • Fixed memory corruption when dmapi destroy event is enabled.
    • Collect more NFS export information during gpfs.snap.
    • Fix cancel blocking lock over NFS.
    • Fixed daemon assert when deleting files from a dmapi enabled filesystem.
    • Check that we have enough revoke threads.
    • Fix problem where mmunlinkfileset with the -f option killed programs using mmap.
    • During a stripe group mount, ignore the 'exclDisks' mount option for a filesystem in a remote cluster.
    • Prevent deadlock "In kernel waiting for operations to quiesce" when mmfsctl suspend, mmunlinkfileset, mmcrsnapshot, mmdelsnapshot commands try to quiesce all activity in the filesystem and there are pending mmap operations.
    • Fixed a problem where file migration did not start after the file system manager node changed.
    • Fix performance degradation for multi-node directory updates after a large number of deletes.
    • Fixed a tsmigrated problem handling lowSpace events so that it automatically restarts on the new clmgr node.
    • Use ethtool to monitor all interfaces other than infiniband and bonding.
    • Fix rare assert that occurs when a bad contact node is specified in remote cluster configuration.
    • Fix mmap file token revoke to relinquish and invalidate mapped pages in the correct ranges.
    • Fix a hang in certain scenarios following mmsetquota -j.
    • Correct code causing fsck problems to be mangled or lost when machines of different endianness are involved in shipping the problem between the worker node and the Stripe Group manager.
    • Fix a gpfs.snap error in clusters with similar hostnames and/or when collecting data on a shared filesystem.
    • Fix return from findRecoveryNode.
    • Fix looping allocation in restripe when some disks are suspended and some disks are nearly full.
    • Fix for mmapplypolicy with the -B parameter set to a very large number (100,000 or more), which can lead to a program failure cascading from a malloc failure.
    • Prevent offline fsck from crashing the daemon while fixing orphans in a dmapi enabled filesystem.
    • Fix mmapplypolicy to correctly handle pathnames with shell metacharacters anywhere within a GPFS filesystem.
    • Ensure mmfsctl syncFSconfig does not affect free disks unless device name is "all".
    • Fix small window where message send will hang if destination list includes the local node and all other nodes reply before the local send can start.
    • Ensure mmaddnode fails if the IP address already appears in the cluster.
    • Prevent quota quiescing operations from being performed again if the stripe group manager quiescing operations are restarted due to an exception.
    • Fix fsck so that it promptly locks the inode from the respective snapshot before it attempts a scan on the inode for the particular snapshot.
    • Ensure internal temporary files needed by mmgetstate are always present.
    • Validate user input for NSDid when using mmdelnsd with -p option.
    • Ensure mmsdrserv is started on the server nodes after an AIX install.
    • Fix daemon crash due to race condition between the fsck watcher thread and the master thread that could lead to the watcher thread accessing stale fsck pointers that would have already been freed.
    • Fix rare assert that occurs if a node joins in the middle of clmgr challenge-response sequence.
    • Fix snmp polling handler refresh scheme to get some GPFS state changes.
    • Fix restripe code to perform rebalance even if file is illreplicated.
    • Correct a problem with logging when one of the failure groups in a replicated file system is full.
    • Acquire the stripe group descriptor mutex before changing the quota files inode information in the stripe group descriptor.
    • Correct exception in lock_get_status during fcntl revokes on Linux.
    • Added code to synchronize the free allocator counter when the file system manager is exiting, to avoid an assertion in some rare cases.
    • Fix tsstatus -m to return rc 2 if the file system is not mounted anywhere.
    • Fix an assertion due to synchronization problem between events exporter initialization and its sendHandler thread.
    • Avoid daemon failure due to unusual pattern of directory updates.
    • Ensure that quiesce sleepers are woken up if the stripe group manager take over happens during quiesce operations.
    • Avoid slowdown in certain highly multi-threaded situations when GPFS block size is large.
    • Correct a rare logging problem in snapshot code.
    • Fix a SIGSEGV during file system panic message generation on AIX with 32-bit kernel.
    • Ensure that multi log directory updates are not held up waiting for spool done thread log wrap to complete.
    • Fix for a rare race condition in cleaning up cached data about fileset objects.
    • Fix a rare deadlock during fsmgr node failure recovery on a file system that has a policy file.
    • Fix for Assert (lockRangeNode != NODE_NONE) in line 4177 of file llio.C.
    • This update addresses the following APARs: IZ59355 IZ60281 IZ60287 IZ60289 IZ60335 IZ60583 IZ60595.
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2009-11-13T13:13:41Z  
    GPFS 3.2.1.16 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.16

    November 12, 2009

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Ensure mmexportfs does not remove tiebreaker disks unless device name is "all".
    • Fix kernel exception in fifo_open due to an invalid i_pipe pointer.
    • Increase prefetchThreads+worker1Threads+nsdMaxWorkerThreads to 1500 on AIX 64bit systems.
    • Improve SMP scalability in the DIO code path.
    • Fix small window where message send will hang if destination list includes the local node and all other nodes reply before the local send can start.
    • Prevent force unmounts when disks in different failure groups (FGs) fail, but are in different pools. Prevent marking disks down in multiple FGs when disks die simultaneously.
    • When a special encoding flag is set in a function's debugging frame section, an extra offset should be added during decoding. Otherwise, the thread traceback cannot be decoded correctly.
    • Add a new file system option, --filesetdf.
    • If persistent reserve is enabled, fix filesystems so they are not unmounted when the filesystem has disks already marked as "down" or encounters issues fencing a disk.
    • Fix mmpolicyExec-hsm.sample to handle characters \ " and ' in filenames properly so that they work in HSM file list.
    • Fix deadlock during FS manager takeover if previous FS manager and its disks (site failure) fail at the same time.
    • Fix DMAPI enabled filesystems when they are mounted on top of another GPFS filesystem.
    • Fixed GPFS hang when recalling files.
    • When running lsvg do not wait for the volume group lock.
    • Fixed performance problem while migrating files.
    • Acquire the stripe group descriptor mutex before changing the quota files inode information in the stripe group descriptor.
    • Fixed the allocation code which caused an infinite loop when running out of full metadata blocks.
    • Correct a problem when changing the admin node name of a server node.
    • Fsck generates false positives for bad replication status in user metadata files after a failed PIT operation. The fix ensures that fsck does not generate the false positives.
    • Avoid crash in rare cases of concurrent multi-node file creates.
    • Fixed repair code that could cause snapshot file corruption, which could happen when a filesystem contains filesets and snapshots and deleting a disk leaves too few failure groups for proper replication of metadata.
    • This update addresses the following APARs: IZ59644 IZ62776 IZ63168 IZ63169 IZ63170 IZ63206.
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-01-15T13:16:01Z  
    GPFS 3.2.1.17 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.17

    January 14, 2010

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    * Stripe group configuration change so data block loss cannot occur if data is being ingested along with the configuration changes.
    * Fix policy handling of rules of the form "EXCLUDE FROM POOL" to prevent LOW_SPACE events from incorrectly being logged.
    * Fix problem on systems configured with large maxFilesToCache that could cause file systems to be unmounted on some client nodes when running recovery after a manager node failure.
    * Replace the lsvg command with getlvodm.
    * Fix ioctl opcode conflict with FIGETBSZ on Linux kernel 2.6.31 and later.
    * Fix fsck to avoid incorrect reporting and fixing of filesystem corruption in a heterogeneous cluster.
    * Avoid confusion when using a local fcntl lock versus an NLM one.
    * This fix will give users early warning and exit when unlinked filesets are present. It also prevents further processing of files that would otherwise give the user misleading error information.
    * Limit the number of attempts made to destroy nfsd threads in mmnfsquorumloss in case an nfsd thread is stuck waiting for IO to complete in GPFS.
    * Do not stop NFS or unexport fs on quorum loss. Kill NFSDs that are stuck during setNfsdProcs.
    * Prevent occasional hang under high stress when several nodes concurrently share multiple directories.
    * Fix mmdelnode syntax error checking.
    * Fix data corruption when using mmap.
    * Avoid kthread waiters in cNFS clusters after failover.
    * Fixed the allocation code which caused a loop during metadata allocation. This problem only affects filesystems with metadata replication enabled.
    * Fix mmbackup incremental to handle conversion from short filename records to new longer records after upgrade to 3.2.1.14 or later.
    * Fix Linux "mmnfsinit start" command and return correct return code.
    * Fix problem where a multi-threaded workload reading extended attributes from a large number of files could cause accumulation of a large number of byte range tokens leading to slowdown and spurious ENOMEM errors.
    * Add support for -N nodeList option in mmbackup version 3.1 and 3.2.
    * Keep FS descriptors off of excluded disks even if they come online.
    * Fix assert failure on the FS manager node when unmountOnDiskFailure=yes and a disk fails after 3.2.1.14-16 is installed.
    * Reduce number of inodes copied to snapshots.
    * Do not let socket get stuck in reconn_cleanup state following repeated breaks that occur just after connection handshake completes.
    * Reduce the pagepool usage by inode allocation segments during FS manager initialization or recovery.
    * Fix a problem with cutting traces in a CNFS setup.
    * Fix filesystem panic when a failed disk holds a FS descriptor and returns unexpected error codes.
    * Ignore un-supported permission flags passed to gpfs_i_permission on SLES11.
    * Correct intermittent bug where mmlsfileset fails to show junction paths.
    * Fix for a rare race condition that may cause an assert in the invalid fileset object disposal path.
    * Fix for a SIGSEGV on Windows caused by a race in accessing the ACL file.
    * Correct processing to prevent quota requests from being performed while the quota manager operations are being quiesced.
    * Fixed error handling for ibv_reg_mr call.
    * Correct a rare problem (due to an error encountered writing quota files) that can prevent a newly created filesystem from being mounted.
    * Fixed a problem which prevents filesystem remount after a forced unmount due to an error (i.e., filesystem panic, quorum loss, etc.).
    * Fix quota manager cleanup when file system manager migrates.
    * Fix a file structure error caused by SetAllocationSize.
    * Handle IB port event of LID change.
    * Correct a problem when verifying that the daemon is down from a Windows node.
    * Fix signal 11 due to bad RDMA index and cookie received from the TcpConn in verbs::verbsClient_i.
    * Fix possible deadlock restriping a file system with data replication enabled under application load and with small pagepool.
    * Warning messages on conflicting operations are sent to stderr to avoid littering stdout.
    * Resolved an issue that in rare cases could cause GPFS to terminate when tracing is enabled.
    * Fix a problem in mmdf where number of free inodes may become negative.
    * Fix race condition that occurs due to disk failure during clmgr election while using tiebreaker disks.
    * Fix assert "offset < ddbP->mappedLen" when reading dirs.
    * Fix allocation manager problem that caused pool to not be deleted when it should have been.
    * Fix assert due to invalid fcntl acquire sleep element found on the kernel queue.
    * Fix a rare bug that occurs during nsd config change along with earlier disk issues to another deleted nsd.
    * Fix a problem that can lead to loss of an intermediate SSL key file.
    * Fix a problem with interpreting the syncnfs mount option.
    * Fix an assertion during mount that could happen when quota management is enabled and snapshot is being used.
    * When opening a directory fails and not all fields are set, do not call back into GPFS to do the close (release), as this may cause an invalid assert due to referencing uninitialized fields.
    * A subsequent tscrfs command will unexpectedly unset some flags even if it cannot get permission to run, causing a daemon assert. Clear flags only if the command set them earlier.
    * This update addresses the following APARs: IZ63351 IZ65194 IZ65380 IZ65414 IZ65614 IZ66577 IZ66881 IZ66894 IZ67312 IZ67542 IZ67543 IZ67545 IZ67548 IZ67624.
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-03-05T22:08:50Z  
    GPFS 3.2.1.18 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.18

    February 25, 2010

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    * Fix bug in mmchnode --cnfs-interface change option.
    * Fixed rare race condition in writing files from multiple nodes which could result in a file system corruption and data loss after a node failure.
    * Fix for a rare race condition that may result in a "Busy inodes after unmount" syslog message on Linux.
    * Fix character counting when a filename contains invalid UTF-8 characters.
    * Fix a race condition between deldisk and deallocation of surplus indirect blocks that could result in dangling block pointers.
    * Fix chrdev_open hitting a BUG call in list_add (device list is corrupted).
    * Fixed inode expansion code which can cause restripe to fail with an assert. This problem only happens when restripe and inode expansion run concurrently.
    * Fix allocation code which can cause "No space left on device" error on mount. This error is most likely to happen on initial mount after filesystem creation.
    * Prevent HSM and NFS from asking to open inodes that are system metadata nodes.
    * Load the policy file on the sgmgr node when the file system is mounted somewhere, so that the low space threshold is always set while the file system is mounted.
    * If the file system is internally forced to unmount (file system panic), invoke the preunmount user exit if one is installed.
    * 1. Fix buffer calculation in dm_get_events when the buffer size is greater than 64K. 2. Fix possible loss of events in the dm_get_event() call when the buffer size is greater than 64K.
    * Fix problem with mmlsfileset when expanding inodes is running.
    * Ensure raw trace files are preserved after a node reboot.
    * Return EMEDIUMTYPE rather than ELNRNG for incompatible format errors on Linux system.
    * Fix fsck so that it detects problems and fixes them without encountering struct assert errors even if the 'assertOnStructureError' config option is turned on.
    * Notify dm_get_events that the session already failed after quorum lost in the cluster.
    * Fixed potential assert when writing small files via NFS under heavy load.
    * Fix spurious EIO errors accessing hidden .snapshots directories enabled via "mmsnapdir -a".
    * Allow case insensitive node identifier in specfile.
    * Improve performance of large file create when DIO is used.
    * Error conditions returned due to failed metadata flush operation are handled appropriately preventing the restripe operation from asserting due to failed checks.
    * Fix fsck code so that it doesn't report the corrupt-addresses problem that it claimed to have fixed during the previous fsck run.
    * Fixed a hardlink assertion problem when upgrading a file system from 2.3 to 3.3 or later.
    * Fix code to avoid holding mutex twice while revoking token encounters SGPanic.
    * Remove stat() calls in the mmshutdown path.
    * Fix assert s_magic == GPFS_SUPER_MAGIC on kernel 2.6.16.60-0.59.1 and above.
    * Fixed an allocation loop which could occur during mount and rebalance of a filesystem.
    * Fix mmbackup to divide up the list of files that need backup based on filesize when numberOfProcessesPerClient=2 (or more).
    * This update addresses the following APARs: IZ67745 IZ68029 IZ68724 IZ68773 IZ69478.
    Updated on 2010-03-05T22:08:50Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-04-02T12:39:15Z  
    GPFS Service Advisory

    During internal testing, a rare but potentially serious problem has been discovered in GPFS. Under certain conditions, a read from a cached block in the GPFS pagepool may return incorrect data which is not detected by GPFS. The issue is corrected in GPFS 3.3.0.5 (APAR IZ70396) and GPFS 3.2.1.19 (APAR IZ72671). All prior versions of GPFS are affected.

    The issue has been discovered during internal testing, where an MPI-IO application was employed to generate a synthetic workload. IBM is not aware of any occurrences of this issue in customer environments or under any other circumstances. Since the issue is specific to accessing cached data, it does not affect applications using DirectIO (the IO mechanism that bypasses file system cache, used primarily by databases, such as DB2® or Oracle).

    This issue is limited to the following conditions:

    1. The workload consists of a mixture of writes and reads, to file offsets that do not fall on the GPFS file system block boundaries;
    2. The IO pattern is a mixture of sequential and random accesses to the same set of blocks, with the random accesses occurring on offsets not aligned on the file system block boundaries; and
    3. The active set of data blocks is small enough to fit entirely in the GPFS pagepool.

    The issue is caused by a race between an application IO thread doing a read from a partially filled block (such a block may be created by an earlier write to an odd offset within the block), and a GPFS prefetch thread trying to convert the same block into a fully filled one, by reading in the missing data, in anticipation of a future full-block read. Due to insufficient synchronization between the two threads, the application reader thread may read data that had been partially overwritten with the content found at a different offset within the same block. The issue is transient in nature: the next read from the same location will return correct data. The issue is limited to a single node; other nodes reading from the same file would be unaffected.
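
    As an illustration only, the sketch below generates the kind of I/O pattern the advisory describes: writes and reads at offsets that do not fall on file system block boundaries, mixing sequential and random access over a working set small enough to stay cached. It is not IBM's reproducer; the path, block size, and offsets are assumptions chosen for the example, and running it neither detects nor proves the defect.

# Illustrative only: produces unaligned, mixed sequential/random reads and
# writes over a small set of blocks, as described in the advisory above.
# The path and sizes are assumptions, not values taken from the advisory.
import os
import random

PATH = "/gpfs/fs1/unaligned_io_demo"  # hypothetical GPFS file path
BLOCK = 256 * 1024                    # assumed file system block size
ODD_OFFSET = 4097                     # deliberately not block-aligned
NBLOCKS = 8                           # small working set (fits in pagepool)
IO_SIZE = 8192

fd = os.open(PATH, os.O_CREAT | os.O_RDWR, 0o644)
try:
    # Writes at odd offsets create partially filled blocks in the cache.
    for i in range(NBLOCKS):
        os.pwrite(fd, b"A" * IO_SIZE, i * BLOCK + ODD_OFFSET)

    # Mix sequential reads with random reads at unaligned offsets over the
    # same blocks, the combination of conditions the advisory calls out.
    for i in range(NBLOCKS):
        os.pread(fd, IO_SIZE, i * BLOCK)               # sequential
        j = random.randrange(NBLOCKS)
        os.pread(fd, IO_SIZE, j * BLOCK + ODD_OFFSET)  # random, unaligned
finally:
    os.close(fd)
    os.unlink(PATH)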
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-04-02T12:43:18Z  
    GPFS 3.2.1.19 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.19

    April 1, 2010

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Fix for a rare race condition during disk address lookup of a newly allocated address under heavy load.
    • If a GPFS file system mount fails and only a "No child processes" message appears in the mmfs log, apply this fix to see the real reason for the mount failure. This problem only affects Linux.
    • Fix disklease inconsistencies between clmgr resetting lastLeaseProcessed and the client resetting lastLeaseReplyReceived.
    • For a directory with FGDL enabled, when the mnode token is being revoked but not the inode token, and there is a thread holding the openfile, the CTF_FINE_GRAIN_DIR_MNODE flag will not get reset, which may trigger an assertion the next time the node tries to become the metanode.
    • Correct the ineligibilityReason_id2name macro definition, which would otherwise result in an unknown id2name error during trace formatting.
    • The fix ensures that false compare-mismatch errors are not reported and a relevant assert is not triggered when a compare operation is done on an inode with a bad file size.
    • Remove spurious errors in the mmfs.log.latest file about Expanded inode file.
    • Fix a server-side token issue in failure cases. If a revoke triggered by serversideRevoke fails with errors like E_BUSY, the client does not drive the reset of the copyset state and transnode, leaving the token state fuzzy, and all future acquires will get queued forever.
    • The fix ensures that online fsck is not held forever trying to steal buffers from an inode range that is currently locked for an online fsck scan.
    • Fix a locks_free_lock BUG(fl_block) call on an ESTALE return from an fcntl lock. A heavy load of advisory locking on a single file range was causing lots of long lock waiters and retries.
    • Memory related problems from cppcheck were corrected.
    • Fix missing out-of-memory check in get inode routine.
    • Fix mmfsck so that it handles a badly damaged inode better when the replica count goes bad.
    • Fix SLES 11 automount. Only one automount daemon can be started in SLES 11.
    • Allow mmchnode --cnfs-enable to accept trailing spaces in network config file.
    • Fix problem where in a file system with large snapshots a failure of the file system manager during the first phase of an mmrestripefs or mmdeldisk command could under certain timing conditions cause corruption.
    • Fix problem where filesystems created by GPFS release 2.3 or older were not mountable by GPFS release 3.2 or 3.3.
    • Fix the code which caused GPFS daemon to assert after filesystem panic on FS manager node.
    • Inherit ACL entries based on filemode (should be the default ACL mode).
    • Correct a Windows problem when running mmmount all_remote for a Windows node.
    • Fix for a very rare race condition where a non-DIO read from a cached buffer may transiently return partially incorrect data.
    • Fixes issues in minorityQuorum clusters that have leaseDuration set and have migrated from 2.3 to 3.2.
    • Fix to tolerate an inconsistent state of Windows security settings on an inode following a failed TSM restore.
    • Fix mmdeldisk syntax error message.
    • This update addresses the following APARs: IZ72671 IZ72999 IZ72998 IZ72689 IZ73002 IZ73345
  • gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-05-21T02:32:55Z  
    GPFS 3.2.1.20 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.20

    May 20, 2010

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    * Use the disk availability information from the daemon for the mmlsdisk -m/-M options.
    * Fix a sig#11 problem during sg disk table update.
    * Add a method to indirect block iterator to ignore last change count in order to step to the next block when the current one is deleted.
    * Fix a rare assert caused by RelinquishAttrByteRange thread.
    * Fix some performance problems when reading large files from NFS clients.
    * Remove redundant message about loading policy rules in the file system manager node's mmfs.log file.
    * Fix fileset unlink to wait for asynchronous recovery to finish before checking the fileset's files for activity.
    * Exception using a spin_lock in fasync_helper during fcntl revoke.
    * Avoid corrupted snapshot files in unusual case of open unlinked files.
    * Fix assert DE_IS_FREE(fP) in direct-common.C.
    * Allow changing the directio flag for immutable files.
    * Fix a GPFS deadlock that occurs on Linux, under high load conditions, with memory pressure and memory mapped files.
    * Fix code that could cause assert after node fail while running fsck.
    * Fix code causing an assert after manager node failed during filesystem manager takeover.
    * Fix a problem so that tsmigrated migrates to the new clmgr node when the clmgr node changes.
    * Avoid deadlock on Linux with small maxFilesToCache and very frequent file creates and deletes.
    * Ensure the mmsdrserv process is not killed if it uses its own separate TCP port.
    * Fix a deadlock involving read-write mmap under heavy stress.
    * Fix a case where higher-level indirect blocks were not being flushed when they were supposed to be.
    * Add input validation for xattr value size.
    * Change mmfileid to find "invalid" disk addresses when using the :BROKEN keyword.
    * Ensure all locks acquired during the lock file operation are released if the operation fails. This prevents a later explicit file lock release from failing, which would cause the code to assert.
    * Correctly bypass the NFS client getting "permission denied" when "subdir/.." is looked up internally.
    * Fix mmexectsmcmd to tolerate error return codes such as 4 and 8 and interpret them as non-fatal.
    * Avoid a GPL compile warning: use void* instead of struct inode* in the cxi file and cast it in the OS-specific file.
    * Fix a race condition between mmnfsdown and mmnfsup so that mmnfsdown can kill all nfsmonitor processes.
    * Fix for a rare deadlock during recovery on Windows.
    * Avoid chance of deadlock when updating shared directory under high load.
    * Verify that an interface is up (IFF_UP) before processing it. Interfaces brought down using "ifconfig down" (unlike "ifdown") are still returned by SIOCGIFCONF (see the sketch after this list).
    * Rework handling automount on RHEL 5 or SLES 11.
    * Avoid a rare failure accessing a directory long after concurrent updates.
    * Fix array out of bound problem in eaRegistry dump function.
    * Fix problem where a fully replicated filesystem would not mount if all the disks in one failure group were stopped and suspended.
    * Fix assert caused by rare CPU cache inconsistency situation on X86_64 hardware.
    * Check whether mmfs.log.previous file exists before renaming it.
    * Fix a rare assert during multi-node create and delete races.
    * Defensive check to ensure that an entry with a zero-length name is never inserted into a directory.
    * Correct permission-denied errors when nfsd rebuilds dentry trees on 2.6.27 (or later) kernels.
    * Retry deadlocks on rlMutex when called from RecLockReset to cleanup advisory locks.
    * This update addresses the following APARs: IZ73005 IZ74241 IZ74537 IZ74541 IZ74545 IZ74546 IZ74548 IZ75128 IZ75249 IZ75251 IZ75257 IZ75771.
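
    The interface item above ("Verify that an interface is up (IFF_UP)...") can be illustrated with a minimal sketch, assuming Linux, the SIOCGIFCONF and SIOCGIFFLAGS ioctls, and an arbitrary cap of 32 interfaces. This is only an outline of the technique, not GPFS source code:

        /* Sketch: enumerate interfaces via SIOCGIFCONF and skip any whose
         * IFF_UP flag is clear (e.g. taken down with "ifconfig down").
         * The 32-entry table is an assumption for the example. */
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>
        #include <sys/ioctl.h>
        #include <sys/socket.h>
        #include <net/if.h>

        int main(void)
        {
            struct ifreq reqs[32];
            struct ifconf ifc;
            int i, n, sock = socket(AF_INET, SOCK_DGRAM, 0);
            if (sock < 0) { perror("socket"); return 1; }

            ifc.ifc_len = sizeof(reqs);
            ifc.ifc_req = reqs;
            if (ioctl(sock, SIOCGIFCONF, &ifc) < 0) {
                perror("SIOCGIFCONF"); close(sock); return 1;
            }

            n = ifc.ifc_len / sizeof(struct ifreq);
            for (i = 0; i < n; i++) {
                struct ifreq fr;
                memset(&fr, 0, sizeof(fr));
                strncpy(fr.ifr_name, reqs[i].ifr_name, IFNAMSIZ - 1);
                /* An interface can show up in the SIOCGIFCONF list even
                 * though it was brought down, so check IFF_UP explicitly. */
                if (ioctl(sock, SIOCGIFFLAGS, &fr) < 0 ||
                    !(fr.ifr_flags & IFF_UP)) {
                    printf("skipping %s (not IFF_UP)\n", reqs[i].ifr_name);
                    continue;
                }
                printf("processing %s\n", reqs[i].ifr_name);
            }
            close(sock);
            return 0;
        }

    Checking the flags per interface rather than trusting the SIOCGIFCONF list alone is the essence of that fix.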
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-06-30T13:27:16Z  
    GPFS 3.2.1.21 is now available from https://www14.software.ibm.com/webapp/set2/sas/f/gpfs/home.html

    Problems fixed in GPFS 3.2.1.21

    June 29, 2010

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    * On failover/failback, (gratuitous) ARP requests from the node registering a new CNFS IP address are rejected by some switches (those with STP enabled and portfast disabled) for a short period, so the IP address may not be reachable from outside the subnet for that time. Make sure the port is enabled (and outbound requests are accepted) by first ARPing the gateway (if one is configured) with a deadline of 30 seconds (see the sketch after this list).
    * Serialize xattr registry initialization process.
    * Fixed deadlock due to sg takeover failure.
    * Fix EA limit calculation.
    * Corrected GPL build break on IA64 platform when kernel version is greater than 2.6.16.
    * Always shut down GPFS when nfsmonitor detects unrecoverable problems, such as statd being inactive.
    * Generate DMAPI read event when file is deleted and copy to snapshot is needed.
    * fsck is allowed to steal buffers from files that are low level locked as long as the buffer is not dirty.
    * Prevent repeated filesystem manager failure with a "No space left on device" error when snapshot copy-on-write is triggered while the filesystem is running out of disk space.
    * Ensure online fsck does a proper job of cleaning up stale, or failed, allocation message queues. Fixes resulting online fsck assert after finding stale AllocMsgQ.
    * Fixed problem in the dm_set_dmattr() function so that attributes sharing the same first several bytes are set correctly.
    * Synced disk address in a hyper-allocated file now matches the indirect block allocation address when allocation fails.
    * Fsck prints verbose information about the range of regions and the storage pool it scans for each pass.
    * Reinitialize ea limit before adjusting it after remount of the file system.
    * Quota check operation prints appropriate error messages when conflicting programs are running.
    * Cleanup allocation message queues properly during a failed fsck operation as a result of stripe group panic. Subsequent online fsck will not assert checking for NULL allocation message queues during initialization.
    * `kill -SIGINT ...` has been supported in all previous code releases. This update just brings tsapolicy into compliance with the de facto standard for handling SIGTERM.
    * Turn on the CXIUP_NOWAIT flag when we know that it is safe to use igrab().
    * Fix internal dmapi attr name comparison routine so that it can compare the string with its true length.
    * Ensure the fsck code does not assert when examining invalid disk addresses produced by a race with a flush buffer operation.
    * Fix assert(ofP->inodeLk.get_lock_state()).
    * CreateReservedFiles checks the number of blocks to be written to inode file before starting threads.
    * Fix variable initialization that could cause "mmcheckquota -a" to terminate.
    * Detect a missing definition for the mmlsfileset utility and define it if needed. Earlier versions of globfuncs do not have the mmlsfileset utility defined.
    * Check for out of memory condition in token revoke handler.
    * Define I_LOCK if it is not already defined in Linux.
    * Fix dmapi attribute name comparison routine to take the short length into account.
    * Improve takeover time when using tiebreaker disks in certain cases.
    * Choose a bitmap size based on a gpfs_statfs64() call to see how many inodes are actually in use.
    * Fix failure due to the expel command being run during disk election in a tiebreaker-disk cluster.
    * Add gpfs32 support on SLES 11 SP1.
    * This update addresses the following APARs: IZ76391 IZ77298 IZ75263 IZ75575.
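
    The CNFS failover item at the top of this list (gratuitous ARP rejected while the switch port is blocked) amounts to probing the gateway until it answers or a 30-second deadline expires. A rough sketch of that retry loop follows; the interface name, gateway address, and the use of the iputils arping command are assumptions for the example, not the actual CNFS scripts:

        /* Sketch: ARP the gateway once per second until it replies or the
         * deadline passes, so the port is known to be forwarding before the
         * new CNFS IP address is announced. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>
        #include <unistd.h>

        static int arp_gateway(const char *ifname, const char *gw, int deadline)
        {
            time_t start = time(NULL);
            char cmd[256];

            /* iputils arping exits 0 when a reply is received. */
            snprintf(cmd, sizeof(cmd),
                     "arping -c 1 -I %s %s >/dev/null 2>&1", ifname, gw);

            while (time(NULL) - start < deadline) {
                if (system(cmd) == 0)
                    return 0;      /* gateway answered: port is forwarding */
                sleep(1);          /* e.g. STP listening/learning delay */
            }
            return -1;             /* no reply within the deadline */
        }

        int main(void)
        {
            /* Hypothetical interface and gateway, for illustration only. */
            if (arp_gateway("eth0", "192.168.1.1", 30) == 0)
                printf("gateway reachable, safe to announce the CNFS IP\n");
            else
                printf("gateway did not answer within 30 seconds\n");
            return 0;
        }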
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-08-13T13:34:37Z  
    GPFS 3.2.1.22 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral/?productGroup0=ibm/fcpower&productGroup1=ibm/ClusterSoftware&productGroup2=ibm/power/IBM+General+Parallel+File+System
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-09-10T15:50:34Z  
    In order for GPFS tracing to function properly on a system running AIX 6.1 with the 6100-06 Technology Level, you must either install AIX 6100-06-02 Service Pack or open a PMR to obtain an iFix from IBM Service. If you are running GPFS on AIX 6.1 TL 6 without 6100-06-02 Service Pack or the iFix and have AIX tracing enabled (such as by using the GPFS mmtracectl command), you will experience a GPFS memory fault (coredump) or node crash with kernel panic.
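
    As a rough pre-check only (not an IBM-supplied tool), something along the following lines could be run before enabling tracing with mmtracectl on an affected node. It assumes that "oslevel -s" reports the service level in the "6100-06-02-<build>" form and simply warns when the level is below 6100-06-02:

        /* Sketch: warn if this AIX 6100-06 system lacks Service Pack 2,
         * per the GPFS tracing restriction described above. */
        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            char level[64] = "";
            FILE *p = popen("oslevel -s", "r");

            if (p == NULL || fgets(level, sizeof(level), p) == NULL) {
                fprintf(stderr, "could not determine the AIX service level\n");
                if (p) pclose(p);
                return 1;
            }
            pclose(p);
            level[strcspn(level, "\n")] = '\0';

            if (strncmp(level, "6100-06", 7) == 0 &&
                strncmp(level, "6100-06-02", 10) < 0) {
                printf("WARNING: %s is below 6100-06-02; do not enable GPFS "
                       "tracing (mmtracectl) until the Service Pack or iFix "
                       "is installed\n", level);
                return 2;
            }
            printf("AIX level %s: the tracing restriction does not apply\n",
                   level);
            return 0;
        }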
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-09-30T04:01:18Z  
    GPFS 3.2.1.23 is now available from http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.2.1.23
    Sept 23, 2010

    • Fixed a GPFS on Windows failure that can occur on systems with a large number of cores (e.g. 8 or more) running a workload with thousands of threads. When this error occurs, /var/adm/ras/mmfs.log.* shows "logAssertFailed: tid >= 0 && tid <= MAX_GPFS_KERNEL_TID". The fix for this problem removes any assumption on the maximum thread ID.
    • Add T (for terabytes) and P (for petabytes) as suffix to mmedquota/mmdefedquota.
    • Fix an ENOMEM error in the Token Manager memory when multiple remote clusters are working with the same files.
    • Fixed a race between the tschpolicy thread and the deferred deletion thread. This condition caused an inconsistent inode state for the policy file being created, in both the on-disk inode bitmap and the in-memory bitmap.
    • Fix a deadlock caused by buffer steal during quota update.
    • Reduced time required while the file system is quiesced during create snapshot.
    • Avoid rare asserts when updating a small directory.
    • Add a trace message for the CNFS user exit.
    • Fix a problem when adding first disk to a new storage pool while file system is in sync process.
    • Fix to add additional maintenance for the local hasVinfoLock flag and remove bad DBGASSERTS.
    • Fix assert when working with elements on the kxRecLockAcquires queue (the mutex needs to be held).
    • Fix rare occurrence of file fragment expansion happening, during file sync, that can cause an assert failure.
    • The change fixes asserts in fsck while trying to fix corrupt directories.
    • Fix assertion caused when deleting snapshots with very large files.
    • If a node cannot perform cNFS recovery for a failed node, it shuts itself down so that another node can take over for both nodes.
    • Fix high cache slab (and CPU) usage due to NFS anonymous dentry allocations.
    • Fix assert that occurs on the FS manager node if the FS manager is running GPFS release 3.2, and a release 3.3 client tries to mount the filesystem.
    • Fix GPFS automount so that it reads config value in /etc/sysconfig/autofs.
    • Improve performance of stat operations on Linux under certain multi-node access patterns.
    • This update addresses the following APARs: IZ78118 IZ81253 IZ83749 IZ83794 IZ84006 IZ84014 IZ84019 IZ84038 IZ76610 IZ84096.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-10-14T20:47:21Z  
    End of Marketing (June 17, 2011) and End of Service (September 30, 2011) dates have been announced for GPFS 3.2/3.2.1.
    Please see:

    The Software Support Lifecycle page

    IBM Announcement Letters:

    http://www-01.ibm.com/common/ssi/rep_ca/3/897/ENUS910-243/ENUS910-243.PDF

    http://www-01.ibm.com/common/ssi/rep_ca/0/897/ENUS910-210/ENUS910-210.PDF
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-11-05T19:56:27Z  
    GPFS 3.2.1.24 is now available from http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.2.1.24

    Nov 04, 2010

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Fix AIX crash caused by kxFreeAllSharedMemory.
    • Fix the allocation code which can cause a filesystem to panic with "Too many disks are unavailable" when running out of disk space.
    • Fixed kernel assert when dmapi event generator is accessing null sgP pointer.
    • Fix FSErrValidate error in ACL GC while inode expansion is also running.
    • Prevent a rare deadlock between mmcheckquota and FS manager recovery.
    • Fix assert "aceLength > 0" in tsgetacl for default ACL on a directory in a remote fs.
    • Fixes asserts in fsck while trying to fix corrupt directories.
    • Fix hang between node join thread and events exporter request handler thread.
    • Fix buffer length calculation for dmapi user event returned by dm_get_events.
    • Fix problem where a remote cluster does not always pick a local NSD server when readReplicaPolicy=local is set.
    • Fix for cNFS cluster node failure: GPFS sometimes crashes when referencing a bad pointer after a node failure.
    • This update addresses the following APARs: IZ85087 IZ85445 IZ86270 IZ86546.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2010-12-20T18:59:23Z  
    GPFS 3.2.1.25 is now available from IBM Fix Central.

    Available at: http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.2.1.25

    Dec. 16, 2010

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Assert in setCachedRecAddr when the cached disk address is NULL while the disk address read from disk is a real disk address. Modify the assert to allow this kind of change, and update the cached address locally.
    • Fix repeated RDMA connection attempts on a down port due to IBV_EVENT_PORT_ERR.
    • Fix a race condition where a file system manager failure during a disk status change could cause temporary loss of file system access.
    • Fix race between two remove threads removing the same file. Check for a valid inode after acquiring the inode lock.
    • Fix race between deferred deletions and policy file creation. Changed runTSChangePolicy code to acquire file lock before calling finish allocation to synchronize with deferred deletions.
    • Fix rare assert in fsync code path. Fix SFSSyncFile to check inode status before updating mtime and mark inode dirty.
    • Reduce message traffic when writing a file with NFS.
    • Fix synchronization problem of dmapi destroy event thread and dmapi event response thread.
    • Use TRCBUFSIZE environment variable for trace buffer size and ensure it is not overwritten by config parameter.
    • Improve performance of mixed random read/write workloads on large files over NFS.
    • Avoid asserts and deadlock by having mmlsfileset and mmlssnapshot commands wait while mmcrsnapshot command runs.
    • Fix logAssertFailed: rmr1 != rmr2 when using GPFS RDMA.
    • Fix Assert exp((mappingP->kvaddr >= SharedSegmentKernelBase) ... on 32-bit Linux.
    • Fix Linux mmdelacl returning E_OPNOTSUPP for files in a "-k nfs4" filesystem.
    • Add useDIOXW configuration variable to avoid Direct IO token thrashing when using some IO requests that match the GPFS blocksize.
    • Correct a problem where Linux capabilities allow access even when root squashing is enabled.
    • Fix duplicated session id returned by dm_create_session due to clock out of sync problem.
    • Enable all dmapi clients to acquire access rights to a file that is being destroyed.
    • Fix sublock to disk sector conversion routines. These now handle invalid disk addresses by returning E_INVAL back to the caller during fsck scan.
    • Fix for an assert during multiple instances of restripe running in parallel.
    • Improve performance of file system metadata scan phases of mmrestripefs.
    • This update addresses the following APARs: IZ88750 IZ88901 IZ88903 IZ89184 IZ89757.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    231 Posts

    Re: GPFS v3.2.1 Announcements

    ‏2011-03-14T19:19:53Z  
    GPFS 3.2.1.26 is now available from IBM Fix Central.

    Available at: http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.2.1.26

    March 03, 2011

    Note: This service level addresses the below issues. This is not a definitive list as other minor corrections have been made which are not listed here.

    • Fix race condition between flushBuffer and mergeInode updating lastDataBlock.
    • Fix the allocation code causing an assert on filesystem manager node after encountering an I/O error.
    • Fix cxiIsNFSLock erroneously returning FALSE for NFSv4 lockctl calls.
    • Fix allocation code causing delete disk to fail when deleting last disk of a failure group.
    • Include mount event disposition in dm_get_disp() call.
    • Fix invalid assert in fcntl lock token relinquish path.
    • Fix quorum formation when the /var/mmfs/gen/BallotFile file is too small.
    • Fix allocation code to prevent an assert that could occur while trying to delete/replace disk.
    • Ensure the scope of config parameters is not changed as a result of a delete operation.
    • Fix DMAPI-enabled fs getting a mount error if one of the sessions that registered for mount got deleted while the mount event is in process.
    • Add additional tracing and assertions to help catch or prevent possible FPE in createClientFilelists2.
    • Remove compiler warnings for mem_cgroup_add_lru_list.
    • Add GPFS mount options nfsHashName and nonfsHashName. If nfsHashName is in effect, the NFS FH will include the hash value of the file name (a generic, hypothetical sketch follows this list).
    • Force the setting of numberOfProcesses in createClientFilelists2() to be greater than 0 and also avoid using a possibly unset field in backupHdr.
    • Fix letting the ACL garbage collector delete auto-generated Windows SID mappings.
    • Fix race condition starting too many mmkprocs when using many mmapped files.
    • Change allocation code to prevent looping when migrating blocks after a disk's failure group assignment, data type, or storage pool has changed.
    • Fix high gpfsInodeCache slab (and CPU) usage due to NFS anon dentry allocations.
    • Create the GPFS init lock file at system startup, thereby enabling GPFS shutdown scripts to be executed during system shutdown on RHEL distributions.
    • Fix gpfsFcntl referencing a freed sleep element while handling NFS requests.
    • This update addresses the following APARs: IZ91523 IZ92315 IZ92321 IZ92326 IZ92425 IZ93233 IZ94655 IZ94718 IZ94719 IZ94722.
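
    The nfsHashName item above describes the NFS file handle carrying a hash of the file name. The sketch below is purely hypothetical and only illustrates that general idea (storing a name hash in an opaque handle so the name can be re-checked later); the structure layout and FNV-1a hash are inventions for the example, not GPFS's actual file handle format:

        #include <stdint.h>
        #include <stdio.h>

        struct demo_fh {            /* hypothetical opaque handle */
            uint64_t inode_num;
            uint32_t generation;
            uint32_t name_hash;     /* present when a "hash the name" option is on */
        };

        static uint32_t fnv1a(const char *s)
        {
            /* Simple 32-bit FNV-1a hash of the file name. */
            uint32_t h = 2166136261u;
            while (*s) {
                h ^= (unsigned char)*s++;
                h *= 16777619u;
            }
            return h;
        }

        int main(void)
        {
            struct demo_fh fh = { 12345, 1, 0 };
            fh.name_hash = fnv1a("report.txt");

            /* When the handle is presented again, the stored hash lets the
             * server cheaply confirm the name it resolves still matches. */
            printf("name check: %s\n",
                   fh.name_hash == fnv1a("report.txt") ? "matches" : "differs");
            return 0;
        }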