gpfs@us.ibm.com

Pinned topic GPFS V3.5 announcements

2012-04-20T16:26:20Z
Watch this thread for announcements on the availability of updates for GPFS v3.5.
Updated on 2013-04-01T14:57:27Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2012-04-20T16:28:01Z  in response to gpfs@us.ibm.com
    GPFS 3.5.0.1 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral/
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2012-06-02T15:04:11Z  in response to gpfs@us.ibm.com
    GPFS 3.5.0.2 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.2

    June 1, 2012

    • mmbackup will exit 1 when auditlog file is not available for result analysis after backup transaction is done.
    • Fix a problem stealing buffers in a large pagepool after installing 3.4.0.11.
    • When backup partially fails, mmbackup continues to compensate the shadow file even though there are multiple failures reported for the same file in the auditlog file.
    • Fixed a bug in log recovery which could result in a "CmpMismatch" file system corruption problem.
    • Fix for the "iP->i_count == 0" kernel assert in super.c. This problem only affects Linux 2.6.36 and later.
    • Fix a rare deadlock where a kernel process gets blocked waiting for a free mailbox to send to the GPFS daemon.
    • mmbackup will exit 1 when incremental backup partially fails and shadow file compensation succeeds.
    • Correct mmlsfileset output for junctions of deleted filesets in some cases.
    • Fix a memory allocation problem when online mmfsck runs on a node with a heavy mmap workload.
    • mmbackup will not stop processing even though there's no auditlog file if only expiration processing is done.
    • mmbackup will display the progress message "Expiring files..." correctly if the expiration transaction takes longer than 30 minutes.
    • Prevent the cluster manager from being expelled as a consequence of some communication outage with another node.
    • mmbackup with multiple TSM clients will catch all error messages from dsmc command output.
    • Fixes problem where the 'expelnode' callback indicates that the chosen node had joined the cluster first.
    • Fix a problem with nBytesNonStealable accounting.
    • Fixed message handler for filesystem quiesce which caused a GPFS assert when filesystem manager failed while filesystem is being quiesced.
    • Fix printing of long fileset names in mmrepquota and mmlsquota commands.
    • Fix mmap operations to go through the NSD server when direct access to disks is no longer possible.
    • Fix mmsetquota to handle numerical fileset names.
    • mmbackup can back up files/directories with path names as long as GPFS and TSM support.
    • Fix an error message in mmchattr command with -M/R/m/r option.
    • Fix a problem where restripe fell into an infinite loop when the stripe group panicked on the busy node.
    • mmbackup will display backup/expiration progress messages at the interval specified by the MMBACKUP_PROGRESS_INTERVAL environment variable, if set. Otherwise, mmbackup will display backup/expiration progress messages every 30 minutes (see the sketch after this list).
    • Fixed rare assert when deleting files in a fileset.
    • Fixed rare hang problem during sg or token recovery.
    • Fix deadlock when doing inode scan (mmapplypolicy/mmbackup) in small pagepool.
    • getxattr for ACLs may overwrite the kernel buffer if small buffer sizes (less than 8 bytes) are specified.
    • When the mmbackup shadow file is rebuilt by the --rebuild or -q option, mmbackup will get CTIME information from the TSM server; hence files modified after the previous backup but before the shadow is rebuilt will be backed up by a subsequent incremental backup.
    • GNR: fix a problem where certain errors from a pdisk, like media errors, caused RecoveryGroup open to fail. Change code to continue attempting to open the RecoveryGroup and simply discount the pdisk(s) returning media errors (and unexpected error codes).
    • Prevent disks from being marked as 'down' when a node with the configuration option unmountOnDiskFail=yes receives an I/O error or loses connectivity to a disk.
    • When mmbackup can't back up files, the message is more informational.
    • Fixed mailbox calls which can lead to deadlock during filesystem quiesce. The deadlock is most likely to happen on an extremely overloaded system.
    • When backup fails (partially or entirely) due to an error from the TSM client, mmbackup will display the error message from the TSM client for easy problem detection. mmbackup will display the error message only once for the same error, even though it occurs multiple times.
    • Make GPFS more resilient to sporadic errors during disk access. Upon an unexpected error during disk open, such as ERROR_INSUFFICIENT_BUFFER, GPFS now retries the call after a brief pause.
    • When compensating the shadow file takes a long time because backup partially fails, mmbackup will show a progress message.
    • Fixed the backward compatibility error in reading data across nodes on different versions. This is needed if you are upgrading from 3.4.0.6 or a lower version to 3.4.0.12 or a higher GPFS version.
    • mmbackup Device -t incremental and --rebuild is valid syntax and will work properly.
    • Fix the problem that deldisk returned success even though it failed.
    • handleRevokeM loops when a lease callback is not responded to.
    • This update addresses the following APARs: IV19037 IV19165 IV20350 IV20610 IV20613 IV20615 IV20618 IV20619 IV20625 IV20627 IV20630 IV20634.
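
    As a quick illustration of the MMBACKUP_PROGRESS_INTERVAL behavior noted above, a minimal sketch (assuming the variable takes a value in seconds; the device name gpfs1 is a hypothetical example):

        # Report mmbackup progress every 10 minutes instead of the default 30
        export MMBACKUP_PROGRESS_INTERVAL=600
        mmbackup gpfs1 -t incremental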
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2012-08-29T14:23:52Z  in response to gpfs@us.ibm.com
    GPFS 3.5.0.3 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.3

    August 21, 2012

    • Fixed potential live-lock in snapshot copy-on-write of the extended attribute overflow block when the next snapshot is being deleted. Problem occurred in rare cases after the inode file increases in size.
    • mmbackup will check if session between remote TSM client node and TSM server is healthy and will remove the combination from transaction if non-healthy situation is detected.
    • Prevent an assert accessing files via DIO.
    • mmbackup will filter the "ANS1361E Session Rejected: The specified node name is currently locked" error and will exit with an error.
    • mmbackup will filter file names that contain characters not supported by TSM.
    • When a tiebreaker disk is being used, avoid quorum loss under heavy load when the tiebreaker disk is down but all quorum nodes are still up.
    • Fix the file close code to prevent a daemon assert which can occur on AIX with a DMAPI-enabled filesystem.
    • Fix an infinite wait during snapshot deletion.
    • Fix a problem where mmdf cannot return correct inode info in a mixed BigEndian and LittleEndian cluster.
    • Fix an assert when copying the inode block to the previous snapshot.
    • Added logic to reduce the chance of failure for "mmfsadm dump cfgmgr".
    • When a tiebreaker disk is used, prevent situations where more than one cluster configuration manager is present simultaneously in the same cluster.
    • Fixed old bug in getSpareLogFileDA due to a typo.
    • Fix assertion failure when multiple threads use direct I/O to write to the same block of a file that has data replication enabled.
    • Fix daemon crash, during log recovery, when log file becomes corrupted.
    • Fix a problem that would cause mmadddisk failure.
    • Fix assert "isValid()" that occurs during mmbackup a snapshot.
    • Fix an assertion caused by leftover "isBeingRestriped" bit after a failed restripe operation.
    • Update mmrpldisk to issue a warning instead of an error when it cannot invalidate disk contents due to the disk being in a down state.
    • Fix regression introduced in 3.4.0.13 and 3.5.0.1 that could in some cases cause "mmchdisk ... start" to fail with spurious "Inconsistency in file system metadata" error.
    • Avoid deadlock creating files under extreme stress conditions.
    • Fix code to ensure the E_ISDIR error gets returned when the FWRITE flag is used to open a directory.
    • Fix snapshot creation code to prevent a possible GPFS daemon assert when filesystem is very low on disk space.
    • Fix problems with using mmapped files after a filesystem has been force unmounted by a panic or cluster membership loss.
    • Fix regression where a race condition between restripe and unmount could cause the GPFS daemon to restart with error message "assert ... logInodeNum == descP->logDataP[i].logInodeNum" in the GPFS console log.
    • mmbackup will report severe error if dsmc hit ANS1351E (Session rejected: All server sessions are currently in use).
    • Fix issue in multi-cluster environment, where nodes in different remote clusters updating the same set of files could cause deadlock under high load.
    • mmbackup will filter file names with newlines correctly.
    • Improve error handling for completed tracks.
    • Fix a bug that causes slowness during mmautoload/mmstartup on systems with automount file systems. The performance hit is noticeable on large clusters.
    • Prevent very rare race condition between fileset commands and mount.
    • Fixed rare assert in log migration.
    • Fix assert "writeNSectors == nSectors" that occurs during "mmchfs --enable-fastea".
    • Update mmlsquota -j Fileset usage message.
    • Fix allocation message handler to prevent a GPFS daemon assert. The assert could happen when a filesystem is being used by more than one remote cluster.
    • Block Linux NFS read of a file when CIFS holds a deny share lock.
    • Speed-up recovery when multiple nodes fail, and multiple mmexpelnode commands are invoked with each failed node as target. Applies mostly to DB2 environments.
    • Fix rare assert under workload with concurrent updates to a small directory from multiple nodes.
    • Fix null pointer dereference in an I/O failure case on a gateway node.
    • Fixed hang problem when deleting HSM migrated file after creating a snapshot.
    • Fix a GPFS API gpfs_next_inode issue where it doesn't scan the file whose inode number is the maximum inode number of the file system or fileset.
    • Fixed assertion when generating read or destroy events.
    • Fix the mmcrfs command to handle the -n numNodes value greater than 8192.
    • Extend mmbackup's tolerance of TSM failures listed in the audit log even when paths are duplicate or unrequested. TSM frequently logs a number of unexpected path names in the audit log. Sometimes the path name is a duplicate due to repeated errors or due to TSM trying to back up objects in a different order than presented in the list file. Other times the object simply was not requested and TSM tries to back it up anyway. Make mmbackup ignore these log messages during shadow database compensation. Log all uncompensated error messages to files in backupStore (root) in mmbackup.auditUnresolved.<server> and mmbackup.auditBadPaths.<server>. Add a new debug bit to DEBUGmmbackup, 0x08, to cause a pause before backup activities commence and a second pause before analysis of audit logs (see the sketch after this list). Correct minor errors in close() handling of various temp files.
    • Fixed sig 11 when background deletion tries to access an OpenFile object that was removed from the cache while waiting for quiesce to finish.
    • Fixed race condition between FakeSync and RemoveOpenFile.
    • Fix a kernel panic caused by a race between two NFS reads.
    • Fix restripe code that could cause potential filesystem corruption. The problem only affects filesystems that were created without FASTEA enabled but were later upgraded to enable FASTEA via mmmigratefs with the --fastea option.
    • Loss of access to files with ACLs can occur if independent filesets are, or have been, created in the filesystem.
    • This fix only applies to customers running GPFS on Linux/PowerPC, using WEIGHT clauses in their policy rules.
    • Fix mmdeldisk to ignore special files that do not have data in a pool.
    • Close a hole where gpfs_ireadx/ireadx64 cannot find more than 128 deltas. Close a hole where calling gpfs_ireadx/ireadx64 on an overwritten file may hit an assert if the input offset is not 0.
    • Fixed a problem where 'mmchmgr -c' fails on a cluster configured with a tiebreaker disk, resulting in quorum loss.
    • EINVAL returned from gpfs_fputattrs when an empty NFSv4 ACL is included.
    • FSErrBadAclRef reported when lockGetattr called RetrieveAcl with a zero aclRef.
    • Fixed deadlock resulting from out-of-order aclFile/buffer locking.
    • This fix only applies to customers who have set tscCmdPortRange, run mmapplypolicy, and run a firewall that prevents policy from exploiting multi-nodal operation.
    • Fix code to avoid unavailable disks when there is no metadata replication.
    • Fix rare race condition where a node failure while writing a replicated data block under certain workloads could lead to replica inconsistencies. A subsequent disk failure or disk recovery could cause reads to return stale data for the affected data block.
    • Fix hung AIX IO when the disk transfer size is smaller than the GPFS blocksize.
    • gpfs_i_unlink failed to release d_lock causing d_prune_aliases crash.
    • This fix only applies to customers who are on AIX and have gotten "no enough space" errors when running mmapplypolicy.
    • This update addresses the following APARs: IV21750 IV21756 IV21758 IV21760 IV23290 IV23810 IV23812 IV23814 IV23842 IV23855 IV23877 IV23879 IV24151 IV24382 IV24426 IV24942 IV25185 IV25484 IV25487 IV25488 IV25762 IV25763 IV25771.
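
    A minimal sketch of the DEBUGmmbackup debug bit described above (the 0x08 value comes from the fix text; treating DEBUGmmbackup as an environment variable and using the hypothetical device name gpfs1):

        # Pause before backup starts and again before audit-log analysis,
        # leaving time to inspect mmbackup's working files
        export DEBUGmmbackup=0x08
        mmbackup gpfs1 -t incremental
        # Uncompensated errors are logged to mmbackup.auditUnresolved.<server>
        # and mmbackup.auditBadPaths.<server> in the backup root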
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2012-09-17T18:39:27Z  in response to gpfs@us.ibm.com
    GPFS 3.5.0.4 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.4

    September 14, 2012

    • Fix code to prevent a rare condition where many inode expansion threads can get started by periodic sync. This can cause the GPFS daemon to run out of resources for starting new threads.
    • Fix a segfault problem after a node takes over as cluster configuration manager and finishes the DMAPI session recovery process.
    • Fix code that can cause a GPFS daemon assert when multiple threads working on the same file cause a race condition to occur.
    • Add environment variable MMBACKUP_RECORD_ROOT, which specifies an alternate directory to store shadow files, list files, temp files, etc. (see the sketch after this list).
    • Fix an assert encountered during opening of NSDs. This assert occurs due to a rare race condition which requires the device backing particular NSDs to completely disappear from the operating system while opening the NSD.
    • This fix only applies to customers who want SUBSTR interpreted sensibly for negative indices.
    • Fix null pointer dereference when an RDMA connection breaks during memory buffer adjustment and verbsRdmaSend is enabled.
    • Mask out ReadEA (which is the same as ReadNamed) from unallowed rights so that the lack of it is not interpreted as a denial. Only the presence of an explicit ACE can deny the ReadEA right.
    • Fix an issue in a mixed version cluster, where a node running GPFS 3.4 or older failing in a small window during mount could cause spurious log recovery errors.
    • Fix CNFS to recognize GPFS filesystem in RHEL6.3.
    • Fixed an assert that happened in a trace statement after the xattr overflow block was copied to a snapshot.
    • This fix applies to any customer who needs to kill the scripts started by mmapplypolicy, or who is wondering why on AIX a faulty program started by mmapplypolicy "hangs" instead of aborting.
    • Fix assert "MSGTYPE == 34" that occurs in pre and post-3.4.0.7 mixed multicluster environment.
    • offline bit gets lost after CIFS calls gpfs_set_winattrs.
    • Fix a problem which occurs in GNR configurations with replicated file systems. Should an NSD checksum error occur between an NSD client and GNR server, the first such error on a transaction could be mistakenly ignored, resulting in no callback invocation or event generated for it. Additionally, if the checksum error is persistent on the same transaction, the code could attempt to retry the transaction one more time than allowed by the configuration settings.
    • Fix a sequential write performance problem and deadlock introduced in 3.4.0.12 and 3.5.
    • This fix applies to any customer who has policy rules that reference the PATH_NAME variables AND who might encounter a path_name whose length exceeds 1024 bytes.
    • Fix segfault in dm_getall_disp() functions.
    • This update addresses the following APARs: IV27283 IV27287 IV27288 IV27290 IV27291.
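
    A minimal sketch of the new MMBACKUP_RECORD_ROOT variable described above (the directory path and device name are hypothetical examples; the directory should be writable and persistent across runs):

        # Keep shadow, list, and temp files outside the file system being backed up
        export MMBACKUP_RECORD_ROOT=/var/mmbackup-records
        mmbackup gpfs1 -t incremental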
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2012-11-20T14:55:44Z  in response to gpfs@us.ibm.com
    GPFS 3.5.0.6 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.6

    November 16, 2012

    • This fix is only for GPFS Linux users. On Linux, the readdir() API on a GPFS filesystem was not returning valid file types in the d_type member of the struct dirent output parameter. The code was modified to return valid file types in d_type.
    • Fix a race between lookup and mnode takeover which caused lookup to get inconsistent data.
    • Fix EA code which caused GPFS daemon assert on filesystem with FASTEA enabled. This is mostly a problem on Windows.
    • Fix a deadlock which occurs on GNR configurations in certain situations. This deadlock can occur when the active RecoveryGroup server fails, and the backup server experiences a SAS problem that prevents access to a sufficient number of disks, preventing RecoveryGroup recovery. In this situation, it is sometimes possible to see failure recovery blocked because the NSD transactions are waiting for the backup server to take over, when it cannot.
    • If rebuilding shadow file encounters a severe problem when mmbackup is invoked with query option (-q), mmbackup will stop further backup processing against the TSM server.
    • Add synchronization between filesystem manager resign and some ACL related operations. This is needed to prevent a possible GPFS daemon assert while running mmchmgr command.
    • Fix automount problem on SELinux enforcing systems.
    • Release the fileset lock before jumping to noDataToCopy.
    • Fix range revoke handler to better handle error conditions such as IO error. Instead of causing the GPFS daemon to assert, this fix just panics the filesystem.
    • Fix a problem with the tsfattr() API where a kernel panic may occur when the GPFS_IREAD/GPFS_IREAD64 command is executed on the same file by multiple threads at the same time. This problem only occurs on AIX.
    • Fixed code that can cause GPFS daemon assert when multiple threads try to write to the same file after it has been truncated to size 0.
    • Fix CNFS problem on SELinux enforcing systems.
    • Close a hole in fileset snapshot restore tool when it restores a renamed file.
    • Fix slow sequential read performance of very large files in a file system with a 16K block size and a very large pagepool.
    • Fixed a bug in new background deletion code where it is trying to queue the deletion instead of handling it when maxBackgroundDeletionThreads is zero.
    • Force log writes for synchronous NFS unlink operations.
    • Fix rdwrFast.fastpathGetstate() == 0 assert after a cluster membership loss.
    • Fix a race condition where multiple threads appending to the same file with synchronous writes could cause a deadlock.
    • Added additional info to the noDiskSpace event to distinguish the reason for the event: 1. Added %reason, which can be either diskspace or inodespace. 2. Added %storagePool to indicate the pool name when %reason is diskspace. 3. Added %filesetName to indicate the fileset name when %reason is inodespace. (See the callback sketch after this list.)
    • Fix a problem where GPFS daemon assert can occur when restripe fails or aborts on very large files.
    • Fixed problem in readpage/splice_read where it returned EFAULT instead of ETIMEDOUT when accessing an HSM-migrated file from an NFS client.
    • Avoid hang with long 'open snapInode0' waiters.
    • Fixed a bug when setting filesize with truncate file operation.
    • This update addresses the following APARs: IV28097 IV28120 IV28672 IV28685 IV29929 IV29930 IV30005 IV30613 IV30738 IV30744 IV31645 IV31647 IV31684 IV31815.
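
    A hedged sketch of consuming the new noDiskSpace parameters described above via a callback (the callback identifier and script path are hypothetical):

        # Run a script whenever a storage pool or inode space runs out of space,
        # passing the new %reason, %storagePool, and %filesetName variables
        mmaddcallback lowSpaceAlert \
            --command /usr/local/bin/nodiskspace.sh \
            --event noDiskSpace \
            --parms "%reason %storagePool %filesetName"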
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2012-12-14T21:26:55Z  in response to gpfs@us.ibm.com
    GPFS 3.5.0.7 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.7

    December 14, 2012

    • Under heavy load, shrink dcache is called by kswapd and finds a candidate dentry to be pruned, but the attached inode has already been deleted (I_CLEAR). The problem occurs when a pcache NFS lookup finds an anonymous (no-name) dentry in the cache. This can happen if the dentry had been previously created through d_alloc_anon() when instantiating an inode from a cached fh instead of a regular lookup. To "repair" this dentry (fill in the name), we allocate a new dentry with the name being looked up and instantiate it with the same inode using d_materialise_unique() and release the old no-name dentry. d_materialise_unique() drops the inode count on success. When shrink dcache runs, it will find the freed no-name dentry with an attached inode with i_count 0 (and assert). Add an extra hold on the inode before calling d_materialise_unique().
    • gpfsClose() called for NFS delayed-open instance of a deleted snapshot asserts that gnode is stale but vinfo still connected. On snapshot delete, disconnect any matching vinfo's for open NFS instances during sgmMsgQuiesceOps.
    • Fix restripe code to prevent potential deadlock when delete/add disk is run at same time as restripe.
    • Fix an incompatibility issue between GPFS 3.5 and GPFS 3.4. When both versions coexist, nodes running GPFS 3.4 can experience assert like "logAssertFailed: this == newDesc sgdescio.C".
    • Reduce unnecessary compensation passes. Duplicate entries in the TSM audit failure list cost mmbackup extra passes over the shadow DB to compensate the failures. Since we sort the fail list based on inode order, just use the sort -u option to remove duplicates up front. Prevent throwing away entries that fail the grep by adding a number suffix on the inode order key to ensure the best chance of getting it in the right order.
    • Update repair code to prevent replica mismatch on EA data after restarting a down disk. This only affects filesystems with FASTEA enabled that have EA data that can no longer be stored inside the inode.
    • Fix code for mmrpldisk where it would migrate data off any suspended disk in addition to the disk being replaced. This could lead to both replicas being placed on the replacement disk.
    • Fix corruption caused during a rare race updating a shared directory.
    • Fix potential deadlock caused by reloading policy rules when the file system manager node dies and another node takes over.
    • Fixed problem when closing filesystem due to policyfile openfile object still in hash tables and looping in verifyAllGone.
    • Fix assert "secLsn == curLsn" that may occur under metadata intensive NFS workloads.
    • Fix striped log file corruption due to snapshot restore.
    • Fix a timestamp issue for Linux AIO in which the modification timestamp of a file accessed by Linux native AIO interfaces might be set wrong in some cases.
    • syncnfs mount option not effective for some error tests.
    • Fixed rare deadlock in aclMsgFailureUpdate.
    • This only affects GPFS GNR users, and only those that are running 3.4.0.17 or later in combination with 3.4.0.16 or earlier, on the server pair for a single Recovery Group. If failover from a newer software version to an older software version occurs during a rolling upgrade, or if the newer software is downgraded back to the older software, some Pdisks may become "missing", or the whole Recovery Group (and therefore all Vdisks in it) will not be recoverable. This PTF fixes this problem.
    • Customers may experience an assert like "isMoreRecent && tailLsn >= other.tailLsn || !isMoreRecent && tailLsn <= other.tailLsn" during log recovery. The fix provides a more graceful workaround so log recovery can proceed safely.
    • This fix applies to all supported releases of the mmapplypolicy command, but is only important if you ever run with -L 3 or higher and a SHOW value that is a very long character string.
    • Ignore quorum loss event from remote cluster.
    • GPFS Native RAID (GNR) is unable to recover after intermittent disk communication failures. Only affects GNR users, in situations where intermittent hardware failure causes multiple disks to temporarily report write errors, while the GNR server is writing primordial Vdisk Configuration Data (VCD), and then only if the GNR server has to restart (perhaps due to server failover) shortly after the temporary write errors occurred. The failure will be indicated in the log by showing either error 214 or a checksum error when recovering. This PTF fixes this problem, and allows the GNR server to recover cleanly. There is no other practical workaround that preserves user data. If all Vdisk NSDs in the affected recovery group can be destroyed, one can instead manually clear all the Pdisks in the failed recovery group (by overwriting the first 4 MiB with zeroes), then manually delete that affected recovery group (with the -p flag on mmdelrecoverygroup), then recreate the recovery group (see the sketch after this list).
    • This update addresses the following APARs: IV28687 IV31663 IV31816 IV32726 IV32729 IV32823 IV33246 IV33393 IV33394.
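
    A cautious sketch of the last-resort GNR recovery procedure described above (pdisk device names, the recovery group name, and the stanza file are hypothetical; this destroys all data in the recovery group and only applies when every Vdisk NSD in it can be sacrificed):

        # Zero the first 4 MiB of each pdisk in the failed recovery group
        for d in /dev/sdx /dev/sdy /dev/sdz; do
            dd if=/dev/zero of=$d bs=1M count=4
        done
        # Permanently delete the recovery group, then recreate it
        mmdelrecoverygroup rgL -p
        mmcrrecoverygroup rgL -F rgL.stanza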
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2013-02-20T19:23:16Z  in response to gpfs@us.ibm.com
    GPFS 3.5.0.8 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.8

    February 19, 2013

    • Fix COW code to not try to copy the overflow block into the previous snapshot if the inode doesn't exist in the target snapshot.
    • Reject as invalid an NFS4 ACL that is appended to a posix ACL.
    • Corrected a potential deadlock in the RAID reconstruction code.
    • Avoid very rare hang when concurrently modifying a shared directory.
    • Customer may experience assert like "lfVersion != other.lfVersion || lfVersion== 0 || tailLsn == other.tailLsn" during log recovery due to double allocation of the same log file to two nodes. The fix is avoid the log double allocation situations.
    • Fix race between AsyncRecovery and mmcrfileset that cause assert.
    • Fix a stale nfs file handle error under some conditions when listing in a per-directory snaplink dir.
    • Fix rare race condition in a multi-cluster environment that may cause the gpfs daemon to fail with "!oldDiskAddrFound.compAddr(*oldDiskAddrP)" assert when there are frequent conflicting accesses to the same file from different remote clusters.
    • Fix assert in getDatabuf as the blockOffset was not reset.
    • Fix a problem where only one node was doing work in restripefs.
    • Fix a race condition which could cause the same file allocator to be used by two different files.
    • The fix ensures that mmfsck thoroughly cleans up a filesystem affected by inodes having cross-linked blocks.
    • Fix a race between NFS reads on the same file which could cause a kernel panic.
    • Changed the error message when mmclone is run on a filesystem without 'fastea' enabled for AIX.
    • Don't allow more than one PIT job to run at a time if the cluster configuration version is less than 3.5.
    • Fix an error of "No such file or directory" after a successful mmcrsnapshot.
    • Fixed the allocation code which caused a memory corruption that may lead to FSSTRUCT errors. The problem only occurs when mmadddisk fails due to some unexpected error such as running out of metadata space.
    • This fix applies to all releases of the mmapplypolicy command from 3.4 and onwards.
    • mmbackup will exclude Socket special files from backup.
    • Fix a problem in the mmfileid command so that it can find the disk address of an xattr overflow block correctly.
    • This fix applies to all releases of the mmapplypolicy command from 3.1 and onwards.
    • Eliminate a bogus error message in "mmrepquota -a".
    • This fix applies to all releases of GPFS with an mmapplypolicy command that supports the -q option. The -q option is rather esoteric! Most customers will not use it directly, BUT the mmbackup command does invoke mmapplypolicy with -q.
    • This update addresses the following APARs: IV33613 IV35513 IV35748 IV35750 IV35751 IV35754 IV35760 IV35761 IV35762 IV35802.
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2013-04-01T14:57:27Z  in response to gpfs@us.ibm.com
    GPFS 3.5.0.9 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.9

    March 29, 2013

    • Support for the Linux ATTR_FORCE flag to reset setid bits on truncate.
    • Fix a deadlock which can occur during recovery when there is the near simultaneous failure of an NSD server node and disks which are twin tailed to that failed server and another server. There is a narrow timing window within which these multiple near simultaneous failures can trigger the deadlock.
    • Fixed potential mount hang problem due to multiple DMAPI sessions on the same node managing the fs. This situation may happen after the cluster loses quorum, then reestablishes it, and the cluster manager role then migrates to another node.
    • Fix problem where batched token release optimization introduced in 3.5 may cause inconsistent lock token state under certain race conditions in a multi-cluster environment.
    • Avoid crashes when snapshots are used with high update loads in a filesystem with at least 200M files.
    • Account for missing storage pool information in disks created prior to GPFS 3.4.
    • Fixed the behavior of resync and failover operations when the Queue State is Dropped, for an AFM fileset.
    • Fix the wrong return code of dm_read_invis() call so that it returns the true error code instead of -1.
    • Ensure both SSL key files are restored with a single command invocation.
    • Install this fix when convenient, unless you are bothered by "orphan" policy processes, in which case, sooner!
    • Fix mmdefedquota command usage.
    • Fix mmsnmpagentd error (hang and core dump) due to lack of serialization among threads accessing the socket connection to GPFS daemon.
    • This update addresses the following APARs: IV35916 IV36649 IV36671 IV37163 IV37402 IV37407 IV37410 IV37432 IV37434 IV38096 IV38480.
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2013-05-10T17:35:45Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.10 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.10

    May 10, 2013

    * Fix a potential hang during mmgetstate.
    * Fixed a rare mnode token request failure.
    * Fix a problem of return code checking in DMAPI.
    * Fix assert "(((gpfsNode_t *)((lcl._bufP->b_vp))) == currGnP)" that may occur at mmap stress workloads.
    * Allow mmlsquota to run when the user executing the command does not belong to any group.
    * Update the GPFS kernel extension on Linux to check the inode's access time value to make sure that it is less than the maximum 32-bit unsigned integer.
    * Using extended attributes without first running "mmmigratefs stripegroup --fastea" and creating fileset snapshots will result in problems accessing snapshot data. This change helps prevent the situation from arising but cannot repair the problem (see the sketch after this list).
    * Fix an issue of accessing user space data from kernel.
    * Fixed a problem with the metanode where it set the fragmentChanged flag without holding the wa inode lock.
    * Changed the default value of the tscWorkerPool.
    * Fix a problem which causes remote mounts to time-out in only 2.5 seconds in the presence of a communication outage between the clusters.
    * gpfs_putacl allows OWNER@ to be denied READ/WRITE_ACL and ATTR.
    * Fix a rare race condition that could cause long waiters when multiple nodes invoke operations that require an exclusive lock on the same inode.
    * Avoid a rare assert when mount runs while mmrestripefs is active.
    * Use a suitable memory size for inode deallocHistory in each independent fileset.
    * Fix a problem on AIX where an ls of .snapshots via NFS hits an assert.
    * Fix a flaw in the inode replica compare code that failed to find mismatches in all inodes in an inode block. The fix ensures that all inodes in an inode block are compared, verified, and fixed.
    * Fix GPFS daemon failure (assert getFragmentSubblocks()) when attempting to preallocate disk space for a file located in a snapshot. gpfs_prealloc() will now return error EPERM when called on a file in a snapshot.
    * Fixed the locking code to ensure that locks on symlink buffer are not dropped until after the symlink processing is completed and log updates spooled.
    * Changed the GPFS internal function ctAcquire and ctTellServer.
    * gpfs_quotactl() should not allow setting quotas for the root user.
    * Only allocate clean buffers for the log descriptor buffer in LogFile::prepareForIO(). Other buffers are allocated correctly already.
    * Fix restripe code that caused inodes to be incorrectly marked as fully repaired even after error has been encountered during repair. This can cause later restripe to skip repairing these incorrectly marked inodes.
    * Fix a deadlock and "22 minutes searching" assert when the FS manager node has to scan so many AllocSegs during initial mount processing that a thread acquiring a log buffer cannot find any buffers to steal.
    * Clean up error messages from lspv when run as a non-root user.
    * Update mmchdisk to better handle disk usage change.
    * Fix permission-denied errors when nfsd rebuilds dentry trees on RHEL 5.5.
    * Fix a problem where PIT master and/or slave objects were wrongly deleted from the job list by the dump thread.
    * mmbackup improved file exclusion rule.
    * Fix access denied by cNFS RHEL 6.3 server.
    * Reduce the number of concurrent quota prefetch requests, per quota client, for the same quota object.
    * Make sure pending locks are not granted during mmshutdown cleanup by going into grace period before cleanup starts.
    * Reduce distributed locking overhead when a large number of nodes start writing to the same file.
    * Fix assert in flushBuffer because a failed cxiUXfer left a dirty buffer but the last data block of the file was not updated.
    * When a file is first accessed locally it gets a file operation table that includes splice_read, which is needed for mmap. Since this operation is not needed for NFS and is only used to improve performance for other file systems, we disable it for NFS access with GPFS, where it hurts performance.
    * Correct socketMaxListenConnections when starting the SDR server. The problem only occurs if socketMaxListenConnections is not the default.
    * This update addresses the following APARs: IV37606 IV38315 IV38623 IV38639 IV38642 IV38669 IV39233 IV39710 IV39711 IV39713 IV39715 IV40248 IV40251 IV40254 IV40803 IV41653.
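
    A sketch of checking for and enabling fast extended attributes before creating fileset snapshots, per the warning above (gpfs1 is a hypothetical device name; the --fastea display flag on mmlsfs is an assumption, while mmmigratefs --fastea appears in the fix texts in this thread):

        # Show whether fast external attributes are enabled
        mmlsfs gpfs1 --fastea
        # If not, migrate the file system format (run while the fs is unmounted)
        mmmigratefs gpfs1 --fastea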

  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2013-07-01T13:51:38Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.11 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.11

    June 28, 2013

    * Fix a "VMM iowait" deadlock encountered during mmap flush.
    * Fixed a stripe group close issue in panic error handling which could prevent a node from rejoining the new cluster group after a quorum loss.
    * Apply immediately if you're seeing any segfaults or mysterious mmapplypolicy/tsapolicy crashes during the "execution" phase of the policy command. This defect is more likely to surface on PPC64 machines, but may occur on other architectures.
    * Add nodes with a panicked filesystem to the mountlist during the new sgmgr takeover query to allow the sgmgr to stay long enough to process pending sgmMsgSGPanicLocal requests.
    * The use of the LO_XLOCKNOWAIT flag has been extended to ignore the value of the xnew_token_mode to prevent a token revoke from blocking a lock request in lock_vfs_m().
    * Important for customers running mmbackup.
    * Reduce deadlock caused by the quota module, especially when quota limits are close to usage.
    * Fixed a problem where a node attempting to mount a file system on a remote cluster which operates with a tie-breaker disk causes that cluster to lose cluster membership. The problem occurs if the cluster hosting the file system includes nodes that do not have APAR IV21133 (3.3) / IV21759 (3.4) / IV21760 (3.5), or if such nodes remote-mount that file system. The fix needs to be applied to the nodes performing the remote mount.
    * Avoid a rare assert when accessing snapshots during SG manager change.
    * Provide better suggestions if the user specified a conflicting set of parameters in mmcrfs.
    * Fix is highly recommended for GNR/GSS customers. Fix avoids a problem that can cause prolonged filesystem unavailability.
    * Fix a problem where create or delete snapshot hangs, caused by a leaked ro lock on an independent fileset.
    * Fix mmrestripefs command failure when unmountOnDiskFail is set.
    * Rectify a situation for Linux where the IO size exceeds the max hw sectors of the block device.
    * Resolved a kernel soft-lockup issue.
    * A flaw in the mmfsck code path can result in SEGV while scanning a filesystem that has inodes with metadata inconsistency across replicas, where the first copy is the bad copy and the second copy is the good copy.
    * Fix stale advLkObjP that was holding an OpenFile.
    * Avoid potential kernel heap corruption if unexpectedly large symlink object is encountered.
    * Kernel exception results from a zero-length buffer on gpfs_getacl.
    * This update addresses the following APARs: IV40108 IV41180 IV42754 IV43544 IV43741 IV43754 IV43755 IV43760 IV43767 IV43771 IV43777 IV43778 IV43815 IV43831 IV43844 IV44505 IV44506.

  • gpfs@us.ibm.com

    Under certain conditions, an issue with IBM GPFS 3.5.0.11 using AFM Single Writer or Independent Writer modes may result in undetected data loss

    2013-07-24T20:49:31Z  in response to gpfs@us.ibm.com

    Abstract: Under certain conditions, an issue with IBM GPFS 3.5.0.11 using AFM Single Writer or Independent Writer modes may result in undetected data loss.

    Issue Description:
    Under certain conditions involving an AFM fileset using Single Writer or Independent Writer modes, a write to an uncached
    block in the AFM cache may be overwritten by data subsequently arriving from the AFM home resulting in undetected data loss.  
    This may occur when an application reads enough blocks of an uncached file to trigger AFM background prefetch, which will read
    data from home and write to cache.  While this prefetch is running, an application write to an uncached portion of the file
    (file blocks not yet pre-fetched to the AFM cache) may succeed, but at a later point the block so written may be overwritten
    with data from the AFM home which had not yet arrived in the AFM cache.   

    Users Affected:
    Only users of GPFS version 3.5.0.11 who are utilizing AFM Single Writer or  Independent Writer cache starting with non-empty AFM home. 

    Users not Affected:
    The following uses of GPFS version 3.5.0.11 should not be affected by this issue.

    1. Users not running AFM
    2. Read Only or Local Update AFM caches
    3. Single Writer or Independent Writer AFM caches with empty AFM home
    4. Files created in AFM Single Writer or Independent Writer caches

    Fix and Recommendations
    IBM plans to make the fix available in GPFS 3.5.0.12 (APAR IV46085). Customers using AFM should update to 3.5.0.12 as soon as it is available.
    Until 3.5.0.12 is available, IBM has an ifix available for those customers utilizing AFM. Customers using AFM should contact IBM service to
    request and apply the ifix (APAR IV46085) or observe the workaround recommended below as soon as possible.

    Workaround until fix is applied:
    Both of the below conditions must be met.

    1. Set afmPrefetchThreshold to 100, and
    2. If the AFM prepop command (mmafmctl Device prefetch -j FilesetName) is running, ensure that the command completes before issuing writes to the fileset (a sketch of these steps follows below).
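
    A sketch of the workaround above (the file system and fileset names are hypothetical; afmPrefetchThreshold is assumed to be settable per fileset through mmchfileset):

        # 1. Disable AFM background prefetch on the writable fileset
        mmchfileset gpfs1 swFileset -p afmPrefetchThreshold=100
        # 2. If a prepop (mmafmctl gpfs1 prefetch -j swFileset) is running,
        #    wait for it to complete before issuing writes to the fileset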

  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2013-09-06T16:03:08Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.12 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.12

    August 8, 2013

    * Fixed hang due to COW and ditto resolution locking order.
    * Apply immediately if you're seeing any messages like "[E] Error parsing work file ... /usr/bin/sort ... PIL.nn.* ... Service index:3".
    * Prevent deadlock between image backup and inode expansion.
    * Fixed a message timeout calculation error which could wrongly expel nodes.
    * Resolve a kernel oops issue due to illegal memory access.
    * Fix the directory code which caused an FSSTRUCT error to be issued and file lookup to fail. This is a rare problem that can only occur when a successful directory split is followed quickly by a filesystem panic or node failure.
    * Provide better error handling in mmclone command.
    * Apply immediately if you're seeing any segfaults or mysterious mmapplypolicy/tsapolicy crashes during the "execution" phase of the policy command. This defect is more likely to surface on PPC64 machines, but may occur on other architectures.
    * Fix a race between NFS writes on the same file which could cause a kernel panic.
    * Only applies to GNR and GSS systems. This issue is very unlikely to affect customers. The only case where this problem has been observed was caused by a hardware failure that physically damaged most of the SAS network within a storage enclosure, but for a few milliseconds some of the SAS connectivity continued to function well enough for the second log write to succeed. The change described here prevents such an unlikely hardware failure from causing data corruption in the first place. If the data corruption described here has already occurred, IBM support will have to use special tools to recover the data; those tools are now available in the same release.
    * Avoid long waiters for directory block splits when inode expansion occurs after snapshot creation.
    * Fixed problem where, if the local IP address used by GPFS is temporarily removed, commands (mmexpelnode, for example) will hang.
    * This defect only affects users of Vdisk (GNR and GSS). It causes corruption of the Recovery Group descriptor when disks are failing rapidly or intermittently during server startup. It only happens if, during server startup, disks are initially mostly functioning but failing at a high rate. The symptom of this defect is an assert during a future recovery, saying "requestedNBlocks > 0". Once this assert is observed, the defect has already destroyed the RGdesc, and the whole RG is not recoverable. Installing a PTF that contains the new code will not help after the RGdesc has already been corrupted.
    * kxFindCloseNFS has been modified to call a new function kxFindCloseSMB which finds the matching NFS instance and then calls LkObjSMB::SMBCloseM to remove its SMBOpen locks if refCount is 0 for the associated NFSData object. SMBlock_vfs_m has been changed to call kxFindCloseSMB instead of kxFindCloseNFS as well, for the same reason. gpfsRead() has been modified to reacquire the missing SMBOpen read lock, similar to what gpfsWrite() already does. This is necessary as an open NFS instance can now lose its SMBOpen locks due to a revoke.
    * Count the panicked nodes out in recoverRegions() even if they are on the mount list to allow new fsmgr takeover to succeed even when all other nodes panicked.
    * Avoid oiType == InstForVFS assert during restripe/delsnapshot and deadlock when SG manager recovery completes interrupted snapshot commands.
    * Fix a problem with per-fileset user and group quota entries initialization when default quotas are enabled.
    * Fix assert in shrinkToFit when shrinking the directory data block.
    * Add defensive code to prevent /etc/fstab corruption when local fs is 100% full.
    * Rebuilding the mmbackup shadow database will use -s <LocalWorkDirectory> during the query shadow file sort (see the sketch after this list).
    * Fix CNFS startup issue when bonded interfaces are used.
    * This update addresses the following APARs: IV44685 IV44977 IV44978 IV44979 IV45480 IV45502 IV45553 IV45556 IV46085 IV46213 IV46219.
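
    A sketch of the rebuild invocation touched by the fix above (device and directory names are hypothetical; -s is mmbackup's local work directory option):

        # Rebuild the shadow database, sorting query output in a roomy local directory
        mmbackup gpfs1 -t incremental --rebuild -s /var/tmp/mmbackup-work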

  • gpfs@us.ibm.com

    Under certain conditions, an issue with GPFS 3.5.0.10 - 3.5.0.12 using an AFM fileset may result in undetected in-memory data corruption.

    2013-09-13T15:14:48Z  in response to gpfs@us.ibm.com

    Abstract:   Under certain conditions, an issue with GPFS 3.5.0.10 - 3.5.0.12 using an AFM fileset may result in undetected in-memory data corruption.

    Issue Description:

    Under certain conditions involving an AFM fileset using any mode, a read to an uncached block in AFM cache while background prefetch is running may return wrong data (e.g., a buffer filled with zeros). The AFM background prefetch is triggered automatically when an application reads more than 2 blocks or based on the afmPrefetchThreshold  value set on the fileset. The file data on disk is always correct, so a reread of the file will return correct data once background prefetch of the file completes. 


    Users Affected:

    Only users of GPFS version 3.5.0.10 or later who are utilizing AFM cache and haven't explicitly disabled background prefetch.


    Users not Affected:

    The following uses of GPFS version 3.5.0.11 should not be affected by this issue.
    1. Users not running AFM
    2. GPFS users running 3.5.0.11 or later and having set afmPrefetchThreshold to 100.
    3. GPFS users running AFM using 'SW' (Single Writer) mode, where all data is generated by the cache site ( e.g. there are no read operations to the home cluster).


    Fix and Recommendations

     IBM plans to make the fix available in GPFS 3.5.0.13 (APAR IV48136). Customers using AFM should update to 3.5.0.13 as soon as it is available. Until 3.5.0.13 is available, IBM has an ifix available for those customers utilizing AFM. Customers using AFM should contact IBM service to request and apply the ifix (APAR IV48136) or observe the workaround recommended below as soon as possible.


    Workaround until fix is applied:

    Both of the below conditions must be met.

    1. Set afmPrefetchThreshold to 100  (Note: requires GPFS 3.5.0.11 or later).
    2. If the AFM prepop command (mmafmctl Device prefetch -j FilesetName) is running, ensure that the command completes before  issuing writes to the fileset.

    Updated on 2013-09-13T15:15:42Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2013-10-07T15:18:46Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.13 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.13

    October 3, 2013

    * Update token code to prevent possible GPFS daemon assert when accessing files via DIO.
    * Fixed log assert during log recovery.
    * Avoid a GPFS crash in high stress situations with many snapshot operations.
    * Fix mmlsattr to display extended attributes more than 1024 bytes long.
    * Allows offline fsck to continue even if the allocation map file is corrupt.
    * Fix a SGPANIC issue during failure update.
    * Change the locking mechanism of fcntl F_SETLEASE call to provide better support of Linux 3.0 and above kernel.
    * Fixed rare problem in mnode token revoke.
    * Call CXI_UPDATE_OSNODE in SFSTrunc() only when truncFile() returns E_OK.
    * Fixed 'Inconsistency in file system metadata' after delete fileset.
    * Fix signal 11 in findVerbsConn.
    * Correct handling of mailbox worker creation when worker1Threads dynamically changed after daemon startup.
    * Avoid loss of directory updates and mmfsck errors after file system panic.
    * Honor env variable MMBACKUP_LOG_SUFFIX, plus 2 bug fixes. Provide a way for the common, persistent error log lists in mmbackup to be named with a user-provided suffix to aid in finding them after the run is complete. Usually the persistent error files are named mmbackup.auditBadPaths.<tsm server name>, mmbackup.auditUnresolved.<tsm server name>, and mmbackup.unsupported.<tsm server name>; now they can be suffixed with "." followed by the value in the env variable (see the sketch after this list). Fixed two bugs with other env variables: integer-evaluate DEBUGmmbackup, and carefully check for an empty string with 'eq' on MMBACKUP_PROGRESS_CALLOUT.
    * Fixed deadlock in CopyOnWriteData.
    * When a file is being written to a GPFS FPO storage pool and the block group factor (BGF) of the file is larger than 1, if the GPFS daemon is killed manually or dies for other reasons, block pre-allocation may cause the file size to be inaccurate.
    * The default mount point for a GPFS file system cannot be set to "/".
    * Fixed fsck handling of corrupt dirs having data in inode.
    * Fix a problem with snapshots causing hang.
    * Added temporary work-around for LSI SAS adapter driver defect that incorrectly reports SCSI PI errors in GPFS Storage Server (GSS) systems. This work-around prevents the GPFS Native RAID disk hospital from incorrectly taking actions against disk drives in response to certain SAS fabric problems. Applies to GSS systems only.
    * Fix quota operation to avoid writing to quota files during heavy snapshot stress.
    * Desensitize mmbackup to number formatting. In non-US locales the number format varies, and some dsmc commands persist in using thousands separators that confuse the code about the number of objects processed for various actions. This can result in mmbackup seeing a failure where none exists. Filter all non-numeric characters from the portions of dsmc output that are tallied to arrive at success or failure determinations.
    * Added 3 missing ILM sample files to the gpfs.base package: mmpolicy-du.sample, mmpolicy-listall.sample, and mmpolicy-fileheat.sample.
    * Fixed a problem where offline fsck can sometimes hit segv in gpfs v3.4.0.20 and above.
    * Fix assert in flushBuffer when user provided invalid data buffer to write a new preallocated data block.
    * Fixed mmbackup Directory (mountpoint) -S <snapshot> problem.
    * Fix a recent problem that prevented the -i and -d options to mmlsfileset from reporting nonzero usage.
    * Fix openssl not found on Windows nodes.
    * Fixed problem in snapshot copyonwrite for the request that is not within the range of snapshot to which the data is being copied.
    * Avoid returning a stale buffer for a read of an uncached file if background prefetch is running and there are WAN delays.
    * GNR systems (on the P7IH hardware with the P7IH disk enclosure) or GSS systems (GPFS storage server on xSeries servers) use mmchcarrier to replace Pdisks, and are vulnerable anytime an error occurs during mmchcarrier.
    * Allow file owner to use gpfs_set_winattrs (posix acl or mode).
    * Fix assert when NFS client trying to access file from deleted independent fileset.
    * Fix a rare data corruption error which can be hit by Direct IO operation.
    * This update addresses the following APARs: IV46544 IV46546 IV47145 IV47623 IV47624 IV47627 IV47628 IV48136 IV48794 IV48796 IV48798.
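
    A minimal sketch of the MMBACKUP_LOG_SUFFIX variable described above (the suffix value and device name are hypothetical examples):

        # Tag this run's persistent error lists, e.g.
        # mmbackup.auditBadPaths.<server>.nightly
        export MMBACKUP_LOG_SUFFIX=nightly
        mmbackup gpfs1 -t incremental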

  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2013-12-02T19:14:38Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.15 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.15

    November 27, 2013

    * Fix a memory leak in Memory subpool 'Buffer'.


    * This update addresses the following APARs: IV51893.

  • gpfs@us.ibm.com

    Re: GPFS V3.5 announcements

    2014-02-10T13:20:48Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.16 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.16

    January 24, 2014

    * Information for disks belonging to a pool was not cached properly. This is fixed.
    * Fixed problem in metanode optimization during directory lookup.
    * The name of the fsck worker node that failed is shown on the command line when fsck aborts.
    * Fix the problem that WADFG rule '*,*,*' does not stripe replicas cross locality groups in FPO environment.
    * Fixed a file content integrity problem when using mmap to read a file.
    * Account for QNumExec correctly in disconnected mode.
    * Fix code to prevent a GPFS daemon assert that could occur after an extended attribute was added to a file where the file is small enough for data to reside in the inode.
    * Customers using IW failback are affected by this bug; they need to update to GPFS containing this fix to come out of this state.
    * On AIX, serialize the file system unmount operation for a specific file system to protect against a potential system crash when there is a long-running sync callback for the pre-unmount event.
    * Fix failure when adding files to a directory that has 4K block size when inode size is also 4K.
    * Code fix ensures an invalid condition check is not done that can lead to asserts in the inode merge code path.
    * If the customer has experienced assertion failures on ThRWXLock's or ThSXLock's, then this change is recommended. The changes are not extensive and the fix is considered to be low risk.
    * FPO: fix a FSSTRUCT error which happens at block deallocation time.
    * Fix the snapshot code to prevent a possible deadlock. The deadlock would only occur if there is snapshot command waiting to start when filesystem panics.
    * Ensure that inode scans (and policy) find all inodes in an independent fileset.
    * In the fileset snapshot restore tool, restore EAs before copying data for a deleted file which has a hard link.
    * Support GPFS 3.5 per-ACL flags for opaque gpfs_getacl and gpfs_putacl.
    * Fix a rare case which can lead to a Linux AIO read returning less data than requested.
    * Fix an assert in reading the last data block of an mmapped file.
    * Fix mmclone split failure when the parent file is so small that its data fits in the inode.
    * Avoid assert when running hadoop map/reduce job and DIO workload at the same time.
    * gpfs_putacl of an ACL obtained via gpfs_getacl from a "-k nfs4" fs failed.
    * Fixed a rare problem where an inode lock was left held.
    * Fixed mmap read/write from/to a file with preallocated blocks.
    * Fix a symlink loop problem when running multiple gpfs.snap instances at the same time.
    * Code fix ensures the locking order is not violated in COW code path. This fixes deadlocks in COW code and kernel panics in directio code path.
    * Fix the signing of the MSI packages to include their timestamping.
    * Fix a GPFS/NFS issue on kernels greater than 2.6.36.
    * Fixed problem in snapshot code due to COW failure after successfully finishing COW operation on some blocks in the specified range.
    * Make the number of total and free inodes reported by SNMP consistent with 'df -i' numbers.
    * Fix mmtracectl hangs.
    * Fix a problem that write on unlinked clone file failed with E_NOENT.
    * Fix a problem where a non-SG-manager failure causes the whole deldisk to fail.
    * Ensures that the rename code gets the flush flag on objects it needs to modify. This change introduces an option in the lock tab entry for file object to optionally acquire flush flag.
    * Fix for tsgetacl assert `aceP->aceLength > 0'.
    * Fix one mmdelsnapshot time out issue.
    * Fix an assert when clearing out the whole file while there is an allocated but not committed DA (disk address) for data block 0.
    * Fixed inconsistency between GPFS 3.5 read-only cache fileset and home after write operation is attempted at read-only cache.
    * mmbackup interprets backslash in include/exclude statement as literal when it is not within a character class.
    * Fix failure when adding external attributes to a directory with certain combinations of a nearly full inode, large inode size, and small block size.
    * On AIX, add another hold count on the root vnode to protect a potential system crash when there is a long time sync callback for pre-unmount event.
    * Fix an assertion that happens during mmap operation on 4K inode size file system on Windows platform.
    * ACL-level flags wrongly used in the initial ACLs for files/subdirs.
    * This update addresses the following APARs: IV50405 IV51515 IV52099 IV52824 IV52858 IV52859 IV52860 IV52862 IV52863 IV52865 IV52866 IV53334.

  • Ziking
    Ziking
    1 Post
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-03-05T02:27:30Z  in response to gpfs@us.ibm.com

    Thanks a lot.

    Very useful.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-03-11T11:54:44Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.17 is now available from IBM Fix Central:                                           

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.17

    March 6, 2014

    * Fixed a rare problem in mhSendDirUpdates which caused an assert due to an unexpected E_NODEV error from forceInodeFlags.
    * Fixed inconsistent use of number of inodes in online FSCK.                                                        
    * Avoid long waiters for node recovery when fs panicked.                                                            
    * Fix offline fsck incorrectly patching inode fileset id.                                                           
    * Fix token code to prevent a potential GPFS daemon assert. The assert could happen when removing a subdirectory while files are being created/deleted in the same directory by other nodes.
    * Fix the allocation code which may lead to FSSTRUCT errors. The problem only occurs on pre-allocated files (files on an FPO file system whose block group factor is bigger than one, or Windows sanHyperAlloc-ed files).
    * Fixed a problem in reclaiming token storage.
    * Fixed an old problem in the remount code path where cached inodes were invalidated after remount finished.
    * Fix a defect in gpfs_next_inode where it may run into an infinite loop when an independent fileset inode space crosses several inode number ranges.
    * Fixed a problem which may result in a stream of "Disk lease period expired" messages in the mmfs.log file in a mixed-platform (Intel and Power) cluster.
    * Fixed a file system recovery failure caused by an update that was missed in the log.
    * Fix a defect where the gpfs_ireadx API always returns a delta for a file that has data in the inode, even without any data change after the snapshot.
    * Avoid mmsnmpagentd termination due to a broken pipe with the snmpd daemon.                                        
    * Fixed fsck abnormal shutdown due to buffer overflow issues when there are a large number of inode and duplicate fragment corruptions in the file system.                                                                              
    * Fixed an assert that caused a daemon crash when cleaning and reusing a data buffer.
    * Suppressed an assert due to a bad directory during read-only offline fsck.
    * Fixed the waiters by not requeuing messages if the node is no longer the gateway (GW).
    * Fixed a crash by ignoring handlers in the being-disabled state when the ping thread enters connected mode.
    * On a cluster which uses a callback for event tiebreakerCheck, fix the 6027-2742 error message produced in mmfs.log when the callback program exits with a nonzero value to indicate the actual exit value.
    * Correct a problem upgrading a filesystem from v2.3 or earlier.
    * Assign and verify fcntl sequence numbers in Linux NFS callback path.
    * Fix a split-brain problem during cnfs recovery.
    * Fixed an assert that happened when data was saved in the inode before being expanded to a real data block.
    * Fix a SIG11 problem that happens on nsd server when configuration nsdMultiQueueType is 1 or 2.
    * If the GNR server fails to recover an RG with error 214 code 112, after the other server ran for a very short period of time under very light workload, this may indicate that the problem described above has occurred. In that case, existing versions of the GNR server code will not be able to recover, but no data has been lost, and on-disk persistent data is not actually damaged. Upgrading the GNR server to a version with this fix will cause recovery to succeed.
    * Rectify a situation for Linux where the io size exceeds the max hw sectors of the block device.
    * Fixed a rare race problem between the fetch thread and the sync handler, where the sync handler reduced the indirection level while the fetch handler was waiting for openFSLock after checking the indirection level.
    * Fixed a crash by releasing the inode after a pcache lookup failure on the fileset root with error E_AGAIN.
    * Fixed a potential stack overflow problem in the dm_handle_to_path() call.
    * Fix a corrupted date command in gpfs.snap.
    * Fixed a problem where the bitmap is corrupted when moving data from the inode to an allocated data block.
    * Fix the problem that HSM cannot migrate/recall files because an incorrect file system ID was passed to the cluster manager node after the file system was unmounted on a few nodes.
    * Use GPFS_MAXPATHLEN to size symlink buffer size and assertion on link target length.
    * This update addresses the following APARs: IV53489 IV53769 IV54448 IV54463 IV54481 IV55201 IV55208 IV55209 IV55210 IV55607 IV55611.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-04-09T14:29:26Z  in response to gpfs@us.ibm.com

    Abstract: GPFS directory corruption with possible undetected data corruption

    Problem Summary: When multiple nodes are updating a shared directory concurrently, the problem could cause incorrect results from directory operations issued on one node, leading to orphaned inodes (files inaccessible from any directory entry), or directory entries pointing to deleted or incorrect files.   This problem could also cause silent data corruption, if any disk contains both GPFS metadata and data, and a stale buffer is written to a disk address that has been freed and reallocated for some other purpose.

    Users affected (both of the following conditions must apply for customer to be affected):
    1. GPFS service levels 3.4.0.24, 3.4.0.25, 3.4.0.26, 3.4.0.27, 3.5.0.13, 3.5.0.14, 3.5.0.15, or 3.5.0.16.
    2. Workload consists of concurrent directory updates from multiple nodes.

    Problem Description: See Problem Summary.

    Recommendation:  Customers who have run the affected service levels should upgrade to GPFS 3.5.0.17 or 3.4.0.28 (when available) service level updates (go to Fix Central http://www.ibm.com/eserver/support/fixes/), or should apply efixes for the affected service levels. Customers who have seen FSSTRUCT 1124 or 1122 messages, or EIO errors during directory operations, should also run off-line fsck to identify and repair possible directory damage.

  • This reply was deleted by puneetc 2014-04-18T17:51:14Z.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-04-14T12:40:07Z  in response to gpfs@us.ibm.com

    Security Bulletin: GPFS V3.5 for Windows is affected by vulnerabilities in OpenSSL (CVE-2014-0160 and CVE-2014-0076)

     

    See the complete bulletin here -  http://www-01.ibm.com/support/docview.wss?uid=isg3T1020683

     

    Affected Products and Versions

    GPFS for Windows V3.5

    Remediation/Fixes

    GPFS for Windows V3.4

    This version of GPFS uses OpenSSL 0.9.8y, which is not affected by this vulnerability.

     

    GPFS for Windows V3.5

     

    If using GPFS on Windows V3.5.0.1, please upgrade to the latest 3.5 PTF level immediately.

    For GPFS on Windows V3.5.0.11 or later, the OpenSSL libraries which are linked in OpenSSH for GPFS on Windows, have been upgraded to 1.0.1g. This updated OpenSSH for GPFS on Windows package is available for download from FixCentral http://www-933.ibm.com/support/fixcentral/ :

    1. Download the GPFS 3.5.0.17 update package into any directory on your system.
    2. Extract the contents of the ZIP archive so that the .msi file it includes is directly accessible to your system.
    3. Follow the instructions in the README included in the update package in order to install the OpenSSH msi package.

     

    If GPFS multiclustering is configured on Windows nodes, upgrade any OpenSSL package that may have been installed. If using OpenSSL from Shining Light Productions (https://slproweb.com/products/Win32OpenSSL.html), get the latest version of OpenSSL 1.0.1g which is available on their website.


    Actions that should be taken:

    1. The following can be done on a small group of nodes at a time (ensuring that quorum is maintained) to maintain file system availability:

    a. Stop GPFS on the node
    b. Install the version of OpenSSL which contains the fix
    c. Restart GPFS on the node

    2. Additional instructions are needed for CVE-2014-0160. The following should be done only when all nodes, across the multiple clusters, are running at the OpenSSL 1.0.1g level (i.e., when the above steps are completed):

    a. Change the security keys used for secure communications. Refer to the Advanced Administration Guide, Chapter 1: "Accessing GPFS file systems from other GPFS clusters", "Changing security keys" section. The steps should be taken up to, and including, the procedure to ensure that the old key is no longer accepted

      b. If SSH is used to execute remote GPFS commands, then the SSH host keys must also be changed

      c. If SSH is used to execute remote GPFS commands, then SSH user keys/passwords must also be changed.

    Warning: Your environment may require additional fixes for other products, including non-IBM products. Please replace the SSL certificates and reset the user credentials after applying the necessary fixes to your environment.

     

    • gpfs@us.ibm.com
      gpfs@us.ibm.com
      216 Posts
      ACCEPTED ANSWER

      Re: GPFS V3.5 announcements

      ‏2014-04-22T17:30:49Z  in response to gpfs@us.ibm.com

      GPFS V3.4 and V3.5 for AIX, Linux on Power and Linux on x86 do not ship OpenSSL but action may be required due to the OpenSSL Heartbleed vulnerability (CVE-2014-0160)

      Flash (Alert)

      http://www-01.ibm.com/support/docview.wss?uid=isg3T1020713

      Abstract

      GPFS V3.4 and V3.5 for AIX, Linux on Power and Linux on x86 do not ship OpenSSL but action may be required due to the OpenSSL Heartbleed vulnerability (CVE-2014-0160)

      Content

      GPFS V3.4 and V3.5 for AIX, Linux on Power and Linux on x86 do not ship OpenSSL but action may be required due to the OpenSSL Heartbleed vulnerability (CVE-2014-0160)


      Remediation:

      If you configure your GPFS clusters to use OpenSSL, consult the licensor of the OpenSSL installed on your system for instructions.

      If you obtained OpenSSL from the Operating System, information can be found at these links:

      AIX: http://www14.software.ibm.com/webapp/set2/subscriptions/onvdq?mode=18&ID=3488&myns=pwraix61&mync=E

      Red Hat: https://access.redhat.com/site/solutions/781793

      SUSE/Novell: http://support.novell.com/security/cve/CVE-2014-0160.html

      After you deploy an unaffected level of OpenSSL on all nodes in your clusters, you should take the following actions:

      1. The following can be done on a small group of nodes at a time (ensuring that quorum is maintained) to maintain file system availability:

      a. Stop GPFS on the node

      b. Install the version of OpenSSL which contains the fix

      c. Restart GPFS on the node

      2. The following should be done only when all nodes, across the multiple clusters, are running an unaffected level of OpenSSL (i.e., when the above steps are completed):

      a. Change the security keys used for secure communications. Refer to the Advanced Administration Guide, Chapter 1: "Accessing GPFS file systems from other GPFS clusters", "Changing security keys" section. The steps should be taken up to, and including, the procedure to ensure that the old key is no longer accepted

      b. If SSH is used to execute remote GPFS commands, then the SSH host keys must also be changed

      c. If SSH is used to execute remote GPFS commands, then SSH user keys/passwords must also be changed.
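
      As an illustration of step 2a, the key-change sequence on the owning cluster looks roughly like the sketch below. This is a minimal outline assuming the mmauth genkey subcommands documented for GPFS 3.5; the Advanced Administration Guide remains the authoritative procedure.

      # Minimal sketch of the security key change (step 2a above).
      mmauth genkey new        # generate a new key; the old key is still accepted
      # Distribute the new public key to the administrators of all remote
      # clusters so they can run mmremotecluster update with it.
      mmauth genkey commit     # stop accepting the old key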

  • This reply was deleted by puneetc 2014-04-22T17:07:29Z.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-06-27T18:08:24Z  in response to gpfs@us.ibm.com

    Abstract

    IBM has identified an issue with GPFS version 3.5.0.16 and later releases that may affect installations that use a non-English language locale setting and also have Persistent Reserve enabled for the cluster.

    Problem Summary


    On such systems, it is possible that the disk usage information recorded in the main configuration file (/var/mmfs/gen/mmsdrfs) is not correct.  This may result in improper handling of the PR settings and an inability to mount the affected file systems.  The root cause for this problem is corrected in GPFS 3.5.0.19 and GPFS 4.1.0.1 (APAR IV61323).

    Fix


    To see if your system is susceptible to this problem, run the following command

    grep SG_DISKS /var/mmfs/gen/mmsdrfs | awk -F : '{ print $3 " " $5 " " $8 }'

    and examine the reported disk usage information.  If you see 'descOnly' shown for disks that are supposed to contain data or metadata, then your system is affected and you need to correct the problem using the following procedure:

    - install GPFS 3.5.0.19 or GPFS 4.1.0.1

    - for each of the affected file systems run
       mmcommon recoverfs <deviceName>

    If the mmcommon recoverfs command fails because it cannot read the file system descriptor, then you will need to temporarily disable the Persistent Reserve feature:

    1. mmshutdown -a
    2. mmchconfig usePersistentReserve=no
    3. mmstartup -a
    4. mmcommon recoverfs <deviceName>  # repeat for all affected file systems
    5. mmshutdown -a
    6. mmchconfig usePersistentReserve=yes
    7. mmstartup -a
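
    The susceptibility check can also be scripted. Below is a minimal sketch built on the same grep/awk pipeline shown earlier; it only reports whether any descOnly entries exist, and the result should still be compared against the intended disk usage (for example, mmlsnsd output) before taking action.

    # Minimal sketch: flag descOnly entries recorded in the mmsdrfs file.
    # Field positions follow the grep/awk command shown above.
    if grep SG_DISKS /var/mmfs/gen/mmsdrfs |
         awk -F : '{ print $3 " " $5 " " $8 }' | grep -q descOnly; then
        echo "descOnly entries found - compare against the intended disk usage"
    else
        echo "no descOnly entries recorded in mmsdrfs"
    fi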
     

    • oester
      oester
      2 Posts
      ACCEPTED ANSWER

      Re: GPFS V3.5 announcements

      ‏2014-06-27T18:27:37Z  in response to gpfs@us.ibm.com

      I don't see GPFS 3.5.0-19 posted to fix central, as of this writing.

       

      Bob

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-07-03T12:58:00Z  in response to gpfs@us.ibm.com

    Abstract
     A recent change in the UEFI driver update for the SAS HBA can, on upgrade, result in damage to any disks used for GPFS which previously contained a GPT partition table (due to non-GPFS use) but are now assigned to GPFS.


    Content
    The UEFI firmware includes a function for Disk GPT Table Recovery.  This function will restore the GPT table from the backup GPT table which was stored at the end of the disk, and it is the default function. When a disk contains a backup GPT table but is later used as a GPFS NSD, the GPT Table Recovery action could rewrite GPFS NSD and Disk descriptor headers with the backup GPT table. Thus such NSDs will be lost after the GPT Table Recovery action.


    Users affected:
    This problem may affect Linux customers who have GPFS NSDs that were created with GPFS version 3.5 or earlier, and these disks were partitioned before they were used for a GPFS NSD.


    Remediation:

    An nsdcheck script is available to run against NSD devices to determine if there is a valid backup GPT table on the device. An NSD disk is at risk if the remarks display hasPrimaryGpt=no,hasSecondaryGpt=yes. If the backup table is not valid, the script can then be used to clear the backup GPT table on the NSD device, prior to any firmware updates, as soon as possible.

    Running this script is recommended for all Linux customers, as a precaution. When used to remove the secondary GPT, the script will do so only if there is a GPT signature on the last sector of the NSD device but not at the beginning.
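
    For illustration only, the core of such a check can be expressed with standard Linux tools. The sketch below assumes a 512-byte logical sector size and a hypothetical device name; the actual nsdcheck script should be used for any real check or remediation.

    # Minimal sketch (assumes 512-byte sectors; /dev/sdX is hypothetical).
    # A GPT header starts with the 8-byte signature "EFI PART"; the primary
    # copy lives in sector 1 (LBA 1) and the backup copy in the last sector.
    dev=/dev/sdX
    sectors=$(blockdev --getsz $dev)    # device size in 512-byte sectors
    primary=$(dd if=$dev bs=512 skip=1 count=1 2>/dev/null | head -c 8)
    backup=$(dd if=$dev bs=512 skip=$((sectors - 1)) count=1 2>/dev/null | head -c 8)
    if [ "$backup" = "EFI PART" ] && [ "$primary" != "EFI PART" ]; then
        echo "$dev: hasPrimaryGpt=no,hasSecondaryGpt=yes (at risk)"
    fi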

    The script is available:

    1. In the samples directory with GPFS V3.4.0.29 and V3.5.0.19 from FixCentral:

       http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.4.0&platform=All&function=all

       http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.5.0&platform=All&function=all

    2. As an attachment to this post.


    Note:  If you need to restore the secondary GPT signature from GptBackupFile to disk, contact IBM Service.  This should not be done without guidance.


  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-07-14T13:25:01Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.19 is now available from IBM Fix Central:                    

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.19

    July 10, 2014

    * Fix thread-safe problem in dumping GPFS daemon threads backtrace.
    * Fixed a problem in clusters configured for secure communications (cipherList configuration variable containing a cipher other than AUTHONLY) which may cause communications between nodes to become blocked.
    * Fix incorrect "allow-permission-change" value for a file system that is migrated from older format.                 
    * Fix various problems with RDMA reconnection.                                                                        
    * After a file system is panicked, new lock range requests will not be accepted.
    * Make block allocation striped across disks in a cyclical fashion for writes from an FPO client node.
    * Remove mmchconfig -N restrictions for aioWorkerThreads and enableLinuxReplicatedAio.                                
    * Fix truncated file name in the mmfileid output.                                                                     
    * Fixed problem when reading a cloned child from a snapshot.                                                          
    * Fixed a deadlock in a complicated scenario involving restripe, token revoke and exceeding file cache limit.         
    * Fix two integer overflow problems in the GPFS block map allocation module, caused by adding a larger disk into an existing file system. The problem can lead to lost blocks and data corruption.
    * Fixed race between log recovery and mnodeResign thread.                                                             
    * Deal with stress condition where mmfsd was running out of threads.                                                  
    * Fix a rare assert which happens under low disk space situation.                                                     
    * Prevent GPFS file system program mmfsd crash on a GNR/GSS system while deleting a log vdisk.                        
    * Fixed deadlock during mmap pagein.                                                                                  
    * Fixed the problem of excessive RPCs to get indirect blocks, and the problem of metanode lock starvation, involving a huge sparse file.
    * Fix a potential deadlock when selinux is enabled and FS is dmapi managed.                                           
    * Fixed a kernel oops caused by a race between multiple NFS reads on the same large file.
    * Use O_DIRECT on a page-aligned buffer for all device read/write in tspreparedisk.                                   
    * mmchfirmware command will avoid accessing non-existent disk path.                                                   
    * Fix a directory generation mismatch problem in an encrypted secvm file system.                                      
    * Fix shutdown hangs in the kernel while trying to acquire revokeLock.
    * Fixed multiple critical issues related to inode allocation.                                                         
    * The serial number of physical disks is now recorded in the GNR event log, and displayed in the mmlspdisk command.   
    * GNR on AIX allows only 32K segments.
    * Apply if you see tsapolicy failing immediately after a helper node goes down.
    * Eliminate FSSTRUCT errors from occurring during the image restore process. Prevent the gpfsLookup() common function from proceeding if the stripe group is being image restored.
    * mmbackup tricked by false TSM success messages. mmbackup can be fooled by TSM output when dsmc decides to roll back a transaction of multiple files being backed up. When the TSM server runs out of data storage space, the current transaction, which may hold many files, will be rolled back and retried with each file separately. The failure of a file to be backed up in this case was not detected, because the earlier message from dsmc contained "Normal File --> <path> [Sent]" even though the file was later rolled back. Fixes in tsbuhelper now detect the failure signature "** Unsuccessfull **" and, instead of simply ignoring these records, revert the changes in the shadow DB for the matching record(s). The hash table already keeps track of the last change in each record, so reverting is now a legal state transition for hashed records. Reorganized some debug messages and streamlined some common code. The 'failed' string is now also found and used to issue reversion updates. Fixed pattern matching in tsbackup33.pl to properly display all "ANS####" messages.
    * Fix RO cache i/o error if mounting fs in ro mode.
    * Don't release the mutex on daemon death.
    * Fix the path buffer length calculation to return the correct length for dm_handle_to_path() functions.
    * Fix a bug in mmauth that may cause duplicate configuration entries and a node number mismatch in the configuration file.
    * Ignore Grace msg on nodes that do not support Ganesha.
    * mmbackup fails to read hex env values. mmbackup debug values, progress reporting, and possibly other user settings may be presented in decimal or hex, especially the bit-mapped progress and debugging settings. Perl doesn't always interpret the hex values correctly unless they are converted with the oct() function.
    * Fixed a hang due to lock overflow.
    * Fix a problem where user registered callback is unexpectedly invoked when using mount instead of mmmount.
    * Correct an NLS-related problem with mmchdisk and similar commands.
    * Fix a generation number mismatch defect when creating a fileset in a GPFS secvm file system.
    * Fixed a race condition that could lead to an assertion failure when mmpmon is used.
    * When there is a GPFS failure return EUNATCH to Ganesha.
    * Fix a defect where the fileset snapshot restore tool may fail to restore a file which has been changed after the snapshot and has more than 16KB of extended attributes.
    * Without this fix, a setup with 4 or more drawers in an enclosure may not be able to survive the loss of the enclosure even though mmlsrecoverygroup -L states that disk group fault tolerance can survive an enclosure loss.
    * Fixed online fsck assert after block allocation map export.
    * Make sure that all the interfaces are enabled.
    * Fixed a race condition in clusters configured with a non-empty cipherList configuration variable, and especially multi-cluster environments, which may result in daemon failure or in GPFS RPCs being stuck.
    * Fixed Ganesha not using the right interface in RHEL 6.5.
    * Clear sector 0 and the last 4K bytes of the disk before it is created as an NSD, to prevent accidental GPT table recovery by the uEFI driver.
    * Prevent multiple eviction processes from running simultaneously.
    * Assert in Deque after gracePeriodThread runs.
    * In GPFS systems employing GPFS Native RAID, there was a situation in which failover and failback, and disk replacement operations could hang, requiring GPFS to be restarted on the node to clear the hang. Fix is extremely low risk and highly recommended.
    * This update addresses the following APARs: IV59632 IV59771 IV60187 IV60470 IV60474 IV60543 IV60806 IV60817 IV60821 IV60828 IV60829 IV60831 IV61323 IV61524 IV61625 IV61627 IV61632 IV61654 IV61990 IV61994 IV62085 IV62090.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-07-16T15:22:43Z  in response to gpfs@us.ibm.com

    Abstract:  
    GPFS may incorrectly permit a disk that is too large for the file system to be added to an existing FPO storage pool, resulting in undetected data corruption of the file system

    Problem Summary:
    GPFS is designed to impose a maximum allowable disk size for disks added after the file system has been created. An integer overflow problem has been discovered in the GPFS block map allocation module that allows GPFS to incorrectly add disks that are too large to the file system.  A disk that should have failed this maximum size check could be incorrectly added to an FPO-enabled storage pool.  If a disk is added with a size equal to or larger than 8 times the size of the disks used at the time the file system was originally created, and the file system is utilizing FPO-enabled storage pools, data blocks will be corrupted when data is written to the new disk.

    Problem Description:
    See Problem Summary

    Users affected:   
    File systems that have added a disk or disks that are equal to or larger than 8 times the size of the disk used at the time the file system was originally created and GPFS has FPO enabled storage pools.  

    Please check the following conditions to assess if your file system is at risk:

    1. Is the file system FPO enabled?  Use the following command to determine.
    mmlspool <fsname> <poolname> -L | grep allowWriteAffinity
     

     If the value is yes, then this is an FPO pool.

    2. Is the file system metadata block size larger than 256KB?  Use the following command to determine.
    mmlspool <fsname> system -L |grep blockSize

    3. Have you added disks (or do you plan to add disks) to the file system that are equal to or larger than 8 times the size of the original disks via the mmadddisk command?
    The new disks must be equal to or larger than 8 times the capacity of the largest disk existing at file system creation time. Use the following mmdf command to review the sizes of disks belonging to the file system. Review the "disk size" column to find disks that are equal to or larger than 8 times the size of the original disks in the file system (a scripted check is sketched after this list).

    mmdf <filesystem>
     
    Below is an example from an affected system. The disk data01node04 was added after the file system was created and its size (3.5 TB) is more than 8 times the original largest disk size (296 GB).
     
    # mmdf gpfs1
    disk                 disk size  failure holds    holds            free KB             free KB
    name                     in KB    group metadata data      in full blocks        in fragments
    ---------------  ------------- -------- -------- -----  -----------------   -----------------
    Disks in storage pool: system (Maximum disk size allowed is 9.0 TB)
    data01node01         296874976     1001 yes      yes     293232640 ( 99%)          2144 ( 0%)
    data02node01         296874976     1001 yes      yes     293307392 ( 99%)          2784 ( 0%)
    data02node02         296874976     1002 yes      yes     296857600 (100%)          1984 ( 0%)
    data01node02         296874976     1002 yes      yes     296856576 (100%)          1984 ( 0%)
    data02node03         296874976     1003 yes      yes     293314560 ( 99%)          2784 ( 0%)
    data01node03         296874976     1003 yes      yes     293234688 ( 99%)          2144 ( 0%)
    data01node04        3512693760     1004 yes      yes    1306125312 ( 37%)     404892224 (12%)
                     -------------                          -----------------   -----------------
    (pool total)        5293943616                          3072928768 ( 58%)    404906048 ( 8%)

    4. If you are still unable to determine, you will need to unmount the file system and run mmfsck. Please contact IBM support for assistance and further details on running mmfsck.
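
    As a rough aid for step 3, the sketch below parses mmdf output and flags disks at least 8 times the size of the smallest disk in the listing. It assumes the default mmdf column layout (disk name in column 1, size in KB in column 2) and that the smallest current disk approximates the original disk size; it is not a substitute for the checks above.

    # Minimal sketch: flag disks >= 8x the smallest disk reported by mmdf.
    mmdf gpfs1 | awk '
        $2 ~ /^[0-9]+$/ { size[$1] = $2; if (min == "" || $2 + 0 < min + 0) min = $2 }
        END { for (d in size) if (size[d] + 0 >= 8 * min) print d, size[d] " KB" }'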


    Required Actions:

    IBM recommends that GPFS FPO enabled users apply a fix as soon as it is available and before adding new disks to the file system.  


    For GPFS 3.5, the fix is available in GPFS 3.5.0.19 (IV60817) on the Fix Central site (http://www.ibm.com/eserver/support/fixes/).  


    For GPFS 4.1, IBM plans to make the fix available in GPFS 4.1.0-2 (IV62418) on the Fix Central site. Until the fix for 4.1 is available, IBM has an efix. Please contact IBM support if you need the efix.  

    If you have determined that you are affected, please call IBM support as soon as possible for assistance with data recovery.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-07-17T14:29:28Z  in response to gpfs@us.ibm.com

    Security Bulletin: Vulnerability in Open Secure Shell for GPFS V3.5 on Windows (CVE-2014-1692)

    Security Bulletin

    See the complete Security Bulletin at http://www-01.ibm.com/support/docview.wss?uid=isg3T1020637

    Summary

    A security vulnerability has been identified in the level of OpenSSH that is currently shipped with GPFS V3.5.0.11, or later, on Windows.
    The current level of OpenSSH could allow a remote attacker to execute arbitrary code on the system to corrupt memory or cause a denial of service.


    Vulnerability Details

    CVE-2014-1692
    OpenSSH could allow a remote attacker to execute arbitrary code on the system, caused by the failure to initialize certain data structures by the hash_buffer function in schnorr.c. An attacker could exploit this vulnerability using unknown attack vectors to corrupt memory and execute arbitrary code on the system or cause a denial of service.


    CVSS Base Score: 7.5
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/90819 for the current score
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:P/I:P/A:P)


    Affected Products and Versions

    GPFS V3.5.0.11 or later levels of V3.5 on Windows.


    Remediation/Fixes

    In GPFS V3.5.0.19, IBM upgraded to OpenSSH-6.6p1 to address this vulnerability. System administrators should update their systems to GPFS V3.5.0.19, APAR IV60817 by following the steps below.

    1. Download the GPFS 3.5.0.19 update package dated July 2014 into any directory on your system from IBM
    at http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.5.0&platform=Windows&function=all

    2. Extract the contents of the ZIP archive so that the .msi file it includes is directly accessible to your system.

    3. Follow the instructions in the README included in the update package in order to install the OpenSSH msi package.



    Workarounds and Mitigations

    None

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-08-20T21:13:36Z  in response to gpfs@us.ibm.com

    Flash (Alert)

    Abstract:
    GPFS on clusters enabled for RDMA use may experience server crashes, RDMA failures, hangs, or undetected data corruption.

    Problem Summary:
    IBM has identified a problem with GPFS versions 3.5.0.17 efix18 and efix19, 3.5.0.19 and 4.1.0.2, for clusters enabled for GPFS RDMA
    when the value configured for verbsRdmasPerNode is less than the value configured for nsdMaxWorkerThreads for any NSD server.  Under
    certain conditions, the NSD server thread may get an indication that the RDMA completed successfully before the RDMA actually completes.
    This problem may result in NSD server crashes, RDMA failures, hung NSD server threads, or undetected data corruption.

    Problem Description:
    See Problem Summary.

    Users Affected:
    Only customers running the affected levels, configured to use RDMA, with a value of verbsRdmasPerNode that is less than the value
    configured for nsdMaxWorkerThreads for the NSD servers, are vulnerable to the problem.

    To verify if the NSD servers are vulnerable to the problem, run the following command on each NSD server:

        mmfsadm test verbs config | grep -e "Max RDMAs per node"

    In the examples below:

    The value for "Max RDMAs per node max" corresponds to nsdMaxWorkerThreads.
    The value for "Max RDMAs per node curr" corresponds to verbsRdmasPerNode (which may be adjusted by GPFS).

    An example of an NSD server that is not vulnerable to the problem:

        In this example, "Max RDMAs per node max" reports the same value (512) as "Max RDMAs per node curr" (512):

        mmfsadm test verbs config | grep -e "Max RDMAs per node"
          Max RDMAs per node max              : 512
          Max RDMAs per node curr             : 512

    An example of an NSD server that is vulnerable to the problem:

        In this example, "Max RDMAs per node max" reports a value (512) that is greater than "Max RDMAs per node curr" (128):

        mmfsadm test verbs config | grep -e "Max RDMAs per node"
          Max RDMAs per node max              : 512
          Max RDMAs per node curr             : 128
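
    To check many NSD servers at once, the same comparison can be scripted. A minimal sketch, assuming mmdsh can reach the servers and the output format shown above (the node names are placeholders):

        # Report NSD servers where "curr" is less than "max", i.e.
        # verbsRdmasPerNode < nsdMaxWorkerThreads.
        for node in nsd01 nsd02; do
            mmdsh -N $node 'mmfsadm test verbs config' |
            awk -v n=$node '/Max RDMAs per node max/  { max = $NF }
                            /Max RDMAs per node curr/ { curr = $NF }
                            END { if (curr + 0 < max + 0)
                                      print n ": vulnerable (curr=" curr ", max=" max ")" }'
        done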


    Recommendations:

    IBM recommends that customers vulnerable to the problem immediately install an efix including IV63698; for the affected levels, the relevant efixes are:

        If 3.5.0.17 efix18 or efix19 is installed, then install 3.5.0.17 efix20
        If 3.5.0.19 is installed, then install 3.5.0.19 efix5
        If 4.1.0.2 is installed, then install 4.1.0.2 efix2

    Customers vulnerable to the problem but unable to immediately apply the above fix levels should run the following command to change the
    value of verbsRdmasPerNode to equal the value of nsdMaxWorkerThreads for each NSD Server, as a workaround.  The customer may experience
    performance impacts while this workaround is in effect.

        In this example, N = the value configured for nsdMaxWorkerThreads.

        mmchconfig verbsRdmasPerNode=N

    Customers vulnerable to the problem, after applying the service or workaround above, should contact GPFS support for instructions to run
    mmfsck to detect and repair any metadata damage.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-08-29T17:56:58Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.20 is now available from IBM Fix Central:                                              

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.20

    August 29, 2014

    * Add tsm server selection option and improve messages.
    * write(fd,NULL,bs) gives rc -1 and inconsistent behavior. Added a check in code to validate whether the user-provided buffer is NULL; if the buffer provided for a read/write system call is NULL, an error is now returned much earlier in the code.
    * Update restripe code to better handle both replicas in the same failure group with one replica on a suspended disk.
    * Fix deadlock between the recovery thread and GW node change.
    * Fix failback cmd failure with E_BUSY due to earlier failures.                                                       
    * Fix a rare live lock which can happen when an FPO file system is in a low-space situation.
    * Fix a bug in mmdeldisk/mmadddisk that may cause a file system to become unknown to GPFS if the name of that file system contains a special character.
    * Fixed a possible cause of deadlock when mmpmon or the GPFS SNMP subagent does node registration.                    
    * Improve handling and reporting of certain types of corrupted directory blocks.                                      
    * Fix GPFS_FCNTL_GET_DATABLKDISKIDX fcntl API to return location info of pre allocated block correctly.               
    * Fix a memory leak in the GPFS daemon associated with Events Exporter, mmpmon,and SNMP support.
    * Fix code that could cause the GPFS daemon to assert when running mkdir on an AFM/SECVM-enabled filesystem with metadata replication. The problem will only occur when there is no AIX node and GPFS can't allocate disk space for both replicas.
    * Ensure SQL migration is done on GSS nodes only.
    * Limit PORTMAP inactive failure due to DNS busy.
    * Ensure that vdisks are automatically scrubbed periodically.
    * Initialize the fromFsP to NULL in openArch() to guard against ever calling gpfs_free_fssnaphandle() with a bad argument. Add an informative message to look for an error log in /tmp when the file writer pipeline is broken.
    * Ensured that a file is not created if it already exists, for NFS when Ganesha is running.
    * Apply if you are troubled by mmapplypolicy not terminating promptly and cleanly when ENOSPC is encountered.
    * Fixed a rare problem in background deletion code due to uninitialized list pointers.
    * Callback/user exit support is added for new event "afmFilesetUnmounted" which gets called when fileset is unmounted.
    * Prevent the GPFS daemon from running into the assertion when a GSS vdisk client IO fails.
    * Ganesha: Fix race condition when renaming files.
    * Reduce the kernel stack usage in the write code path to avoid potential kernel stack overflow.
    * Avoid possible core dump by protecting pclose() call if broken pipe is known. Also, add LCL_ESTALE to list of fatal conditions.
    * This fix will detect the failure and report that the mmchfs command has failed. The FS will retain its old name and still be usable.
    * Fixed a memory overwrite problem caused by uninitialized string copying in mmfs_dm_query_session().
    * Update restripe code to better handle both replica in same failure group after disk usage and failure group were changed via mmchdisk.
    * Fix is recommended on all systems in which the disk enclosure supports slot power control. Currently, this includes only the P7IH.
    * Prevent mmimgbackup from accepting an --image option with a value that begins with an absolute path name such as /gpfs/, as this option is meant to permit only a file base name or prefix, not an output directory.
    * Improve handling of unexpectedly corrupted directory blocks.
    * This fix is recommended for GPFS Native RAID systems running on the AIX operating system. It has no effect on Linux.
    * Fix inode dealloc issue in AFM LU mode.
    * Fix mmfsck assert "More than 22 minutes searching for a free buffer in the pagepool".
    * Fix an RDMA problem that will cause NSD server hangs/failures or possibly data corruption.
    * This update addresses the following APARs: IV62214 IV62923 IV62925 IV62926 IV62928 IV62933 IV62934 IV62935 IV62936 IV63463 IV63465 IV63468 IV63470 IV63712.

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-10-07T16:16:43Z  in response to gpfs@us.ibm.com

    Security Bulletin

    See the complete bulletin at http://www-01.ibm.com/support/docview.wss?uid=isg3T1021317

    Summary

    Security vulnerabilities have been identified in the level of OpenSSL that is currently shipped with GPFS V3.5.0.11, or later, on Windows. The current level of OpenSSL could allow a remote attacker to :
    - Cause a denial of service (CVE-2014-3512, CVE-2014-3509, CVE-2014-3506, CVE-2014-3507, CVE-2014-3505, CVE-2014-3510, CVE-2014-5139)
    - Bypass security restrictions (CVE-2014-3511)
    - Obtain sensitive information (CVE-2014-3508)


    Vulnerability Details

    CVE-ID: CVE-2014-3512
    Description: OpenSSL is vulnerable to a denial of service, caused by an internal buffer overrun. A remote attacker could exploit this vulnerability using invalid SRP parameters sent from a malicious server or client to cause a denial of service.
    CVSS Base Score: 5
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95158 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:N/I:N/A:P)

    CVE-ID: CVE-2014-3509
    Description: OpenSSL is vulnerable to a denial of service, caused by a race condition in the ssl_parse_serverhello_tlsext() code. If a multithreaded client connects to a malicious server using a resumed session, a remote attacker could exploit this vulnerability to cause a denial of service.
    CVSS Base Score: 4.3
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95159 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:M/Au:N/C:N/I:N/A:P)

    CVE-ID: CVE-2014-3506
    Description: OpenSSL is vulnerable to a denial of service, caused by an error when processing DTLS handshake messages. A remote attacker could exploit this vulnerability to consume an overly large amount of memory.
    CVSS Base Score: 5
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95160 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:N/I:N/A:P)

    CVE-ID: CVE-2014-3507
    Description: OpenSSL is vulnerable to a denial of service. By sending specially-crafted DTLS packets, a remote attacker could exploit this vulnerability to leak memory and cause a denial of service.
    CVSS Base Score: 5
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95161 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:N/I:N/A:P)

    CVE-ID: CVE-2014-3511
    Description: OpenSSL could allow a remote attacker to bypass security restrictions, caused by the negotiation of TLS 1.0 instead of higher protocol versions by the OpenSSL SSL/TLS server code when handling a badly fragmented ClientHello message. An attacker could exploit this vulnerability using man-in-the-middle techniques to force a downgrade to TLS 1.0.
    CVSS Base Score: 4.3
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95162 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:M/Au:N/C:N/I:P/A:N)

    CVE-ID: CVE-2014-3505
    Description: OpenSSL is vulnerable to a denial of service, caused by a double-free error when handling DTLS packets. A remote attacker could exploit this vulnerability to cause the system to crash.
    CVSS Base Score: 5
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95163 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:N/I:N/A:P)

    CVE-ID: CVE-2014-3510
    Description: OpenSSL is vulnerable to a denial of service, caused by a NULL pointer dereference in anonymous ECDH ciphersuites. A remote attacker could exploit this vulnerability using a malicious handshake to cause the client to crash.
    CVSS Base Score: 4.3
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95164 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:M/Au:N/C:N/I:N/A:P)

    CVE-ID: CVE-2014-3508
    Description: OpenSSL could allow a remote attacker to obtain sensitive information, caused by an error in OBJ_obj2txt. If applications echo pretty printing output, an attacker could exploit this vulnerability to read information from the stack.
    CVSS Base Score: 4.3
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95165 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:M/Au:N/C:P/I:N/A:N)


    CVE-ID: CVE-2014-5139
    DESCRIPTION: OpenSSL is vulnerable to a denial of service, caused by a NULL pointer dereference when an SRP ciphersuite is specified without being properly negotiated with the client. A remote attacker could exploit this vulnerability to cause the client to crash.
    CVSS Base Score: 5
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/95166 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:N/I:N/A:P)

     

    Affected Products and Versions

    GPFS V3.5.0.11 or later levels of V3.5 on Windows


    Remediation/Fixes

    In GPFS V3.5.0.20 dated October 2014, IBM upgraded to OpenSSL 1.0.1i to address these vulnerabilities. System administrators should update their systems to GPFS V3.5.0.20 by following the steps below.

    1. Download the GPFS 3.5.0.20 update package dated October 2014 into any directory on your system from IBM at http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.5.0&platform=Windows&function=all

    2. Extract the contents of the ZIP archive so that the .msi file it includes is directly accessible to your system.

    3. Follow the instructions in the README included in the update package in order to install the OpenSSH msi package. This updated OpenSSH msi package is built using OpenSSL 1.0.1i.

    If GPFS multiclustering is configured on Windows nodes, upgrade all OpenSSL packages that may have been installed. The following can be done on a small group of nodes at a time (ensuring that quorum is maintained) to maintain file system availability:

    a. Stop GPFS on the node
    b. Install the version of OpenSSL
    c. Restart GPFS on the node


    Workarounds and Mitigations

    None

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-10-07T16:25:07Z  in response to gpfs@us.ibm.com

    Security Bulletin

    See the complete bulletin at http://www-01.ibm.com/support/docview.wss?uid=isg3T1021316

    Summary

    Security vulnerabilities have been identified in the level of OpenSSH that is currently shipped with GPFS V3.5.0.11, or later, on Windows. The current level of OpenSSH could allow a remote attacker to bypass security restrictions caused by:
    - (CVE-2014-2653) an error in the SSH client when handling a HostCertificate.
    - (CVE-2014-2532) the inclusion of wildcard characters in the AcceptEnv lines of the sshd_config configuration file within the sshd program.

    Vulnerability Details

    CVE-2014-2653
    DESCRIPTION: OpenSSH could allow a remote attacker to bypass security restrictions, caused by an error in the SSH client when handling a HostCertificate. By persuading a victim to visit a specially-crafted Web site containing a malicious certificate, an attacker could exploit this vulnerability using a malicious server to disable SSHFP-checking.
    CVSS Base Score: 4.3
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/92116 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:M/Au:N/C:N/I:P/A:N)

    CVE-2014-2532
    DESCRIPTION: OpenSSH could allow a remote attacker to bypass security restrictions, caused by the inclusion of wildcard characters in the AcceptEnv lines of the sshd_config configuration file within the sshd program. By using a substring before a wildcard character, an attacker could exploit this vulnerability to bypass intended environment restrictions.
    CVSS Base Score: 5
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/91986 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:N/I:P/A:N)

    Affected Products and Versions

    GPFS V3.5.0.11 or later levels of V3.5 on Windows.

    Remediation/Fixes

    In GPFS V3.5.0.20 dated October 2014, IBM patched the shipped OpenSSH-6.6p1 to address these vulnerabilities. System administrators should update their systems to GPFS V3.5.0.20 by following the steps below.

    1. Download the GPFS 3.5.0.20 update package dated October 2014 into any directory on your system from IBM at http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.5.0&platform=Windows&function=all

    2. Extract the contents of the ZIP archive so that the .msi file it includes is directly accessible to your system.

    3. Follow the instructions in the README included in the update package in order to install the OpenSSH msi package.


    Workarounds and Mitigations

    None

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-10-16T16:59:24Z  in response to gpfs@us.ibm.com

    Flash (Alert)

    In Linux environments, GPFS may incorrectly fail writev() with EINVAL resulting in the user application failing during write

    http://www-01.ibm.com/support/docview.wss?uid=isg3T1021392

    Abstract

    IBM has identified a problem with GPFS 3.5.0.20 and GPFS 4.1.0.2 where GPFS may fail to correctly handle multiple vectors passed via the writev() system call. When a {NULL, 0} is passed as the first vector, an EINVAL error may be incorrectly returned. This would cause the user application to fail unexpectedly when writev() is called to write to a GPFS file. User data are not affected. The writev() call is most likely to have been automatically generated by the library or compiler.


    Content

    Users affected: Only customers running the affected levels on Linux who have applications that use the writev() system call for writes to a GPFS file.

    Note: The writev() call is most likely to have been automatically generated by the library or compiler. For example, using the C++ stream class to write more than 1023 bytes to a file will generate a writev() call that could fail with an EINVAL error.

    The following sample program shows an example for which a write using stream class may fail unexpectedly:

    #include <cassert>
    #include <cstdio>
    #include <fstream>

    int main (int argc, char** argv) {
        assert(argc == 2);

        char* data = new char[1000000];
        std::ofstream f(argv[1], std::ios_base::binary);
        f.write(data, 1023); // this would succeed
        perror("write call");
        f.flush();

        f.write(data, 1024); // this would fail
        perror("write call");
        f.flush();

        f.write(data, 1025); // this would fail
        perror("write call");
        f.flush();

        f.write(data, 1023); // this would succeed
        perror("write call");
        f.write(data, 1024); // this would succeed
        perror("write call");
        f.flush();

        f.write(data, 1024); // this would fail
        perror("write call");
        f.write(data, 1023); // this would succeed
        perror("write call");
        f.flush();

        f.write(data, 512); // this would succeed
        perror("write call");
        f.write(data, 512); // this would succeed
        perror("write call");
        f.write(data, 1024); // this would succeed
        perror("write call");
        f.flush();

        f.close();
        delete[] data;
        return 0;
    }
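
    To try the sample, compile it and point it at a file on a GPFS mount; the file and path names below are placeholders.

    # Compile and run the sample against a GPFS path (names are examples).
    g++ -o writev_test writev_test.cc
    ./writev_test /gpfs/fs1/writev_testfile
    # On an affected level, the cases marked "fail" print
    # "write call: Invalid argument" (EINVAL); user data are not affected.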

    Recommendation

    Affected V3.5 customers should contact IBM Service for an efix containing APAR IV64863; IBM plans to make this fix available in GPFS 3.5.0.21 (APAR IV64863). V4.1 customers should upgrade to GPFS 4.1.0.3 (APAR IV64862) at Fix Central http://www.ibm.com/eserver/support/fixes/

    Updated on 2014-10-17T12:01:49Z at 2014-10-17T12:01:49Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-10-29T15:30:56Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.21 is now available from IBM Fix Central:                                      

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.21

    October 28, 2014

    * Fixed asserts and hung clusters.
    * Fix for rare condition that would cause file /var/mmfs/mmpmon/response to uncontrollably grow to a large size.
    * Fix stale mount handling in case of home mount issues.                                                        
    * This fix improves GPFS tool usability, allowing the mmaddcallback command to be used to constantly monitor a desired file system status.
    * Fix code to prevent a GPFS daemon assert that could occur after automatic remount of filesystems. The problem only occurs on Linux nodes when a user issues GPFS commands to access the filesystem before the automatic remount has completed.
    * Fix AIO code path to properly manage vfs user count. Incorrect vfs user count could prevent file system quiesce and cause some GPFS commands (ex. snapshot commands) to fail.                                                             
    * Fix node failure recovery code to prevent a possible GPFS daemon assert. This problem could occur if file system manager node fails in the middle of restripe.                                                                            
    * Fixed the startup code to better handle locks left over from a previous unclean mmshutdown.
    * Fix rare race condition that could result in a deadlock when moving data from the inode into a regular data block.  
    * Call LOGSHUTDOWN when the token manager cannot allocate a new BRTreeNode, to avoid granting conflicting BR tokens.
    * Do not log fsstruct errors when offline fsck is running.                                                            
    * When performing automatic disk recovery, take into account the value of defaultHelperNodes when initiating file system restripe operations.                                                                                               
    * Add additional checks to ensure the uniqueness of certain node attributes.
    * Fixed an assert in setDataValid because blockOffset was larger than the start offset.
    * Fix a defect in the gpfs_ireaddirx API where it may fail to get a directory entry within a file system snapshot.
    * Prevented I/O from being started after a node has lost contact with the cluster.
    * Fix a defect where the gpfs_prealloc API cannot work under the GPFS + SECVM solution.
    * Fix "subscript out of range" error in recoverFailedDisk.
    * Fixed the problem that, in a file system with a different metadata block size, a disk in the system pool was wrongly allowed to hold data.
    * Fix a bug introduced in GPFS 3.4 PTF30 and GPFS 3.5 PTF20 where mmlsnsd -X doesn't display the persistent reserve information of the disk.
    * Correct --single-instance option for mmapplypolicy runs against directory.
    * The fix avoids a bug where pdisks become "missing" due to descriptors being overwritten. The fix is recommended if system firmware upgrades are applied.
    * Fix code used to check a user-provided buffer for NULL, which could cause writev() to incorrectly fail with EINVAL. Both readv() and writev() could be affected.
    * Fixed the problem that O_DSYNC was not honored on data write to small files.
    * Fixed a Ganesha thread deadlock caused by a Ganesha up-call thread acquiring a byte range lock that was just released.
    * Fix ibv_modify_qp error 22 when RDMA connecting client mlx4_0 port 2 HCA to server single port qib0 HCA.
    * Fixed the UID/GID parser in tsputacl.C.
    * Fix an i_count leak caused by NFS.
    * Fix lookup after rename moving file to .ptrash.
    * Fix a problem where too many RDMA connections were created between nodes.
    * Fix signal 11 on Connect-IB.
    * Make sure that FHs still point to directories after the lock for rename is acquired.
    * Make sure Ganesha is stopped as soon as GPFS daemon cleanup starts.
    * Fix a problem where AFM does not correctly replay Asynchronous I/O (AIO) writes, such as with aio_write(), when the file is opened with the O_DIRECT flag. The problem may cause files to exhibit inconsistency between cache and home.
    * Fixed a timing issue that could cause AFM to miss replaying updates to home if the update happens after a long period of inactivity. Disable a three-phase protocol during which requests are dropped while the gateway nodes are temporarily made inactive so they can be marked clean.
    * This update addresses the following APARs: IV63517 IV63910 IV63912 IV63927 IV64599 IV64863 IV65117 IV65143 IV65144 IV65145 IV65829 IV66171 IV66270.
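    As a sketch of the mmaddcallback-based monitoring mentioned above (the callback name, script path, and event choices below are illustrative assumptions, not part of this PTF), a registration could look like:

        mmaddcallback fsStatusMonitor --command /usr/local/bin/fs-status.sh \
            --event unmount,lowDiskSpace --parms "%eventName %fsName"

    The registered script receives the event name and file system name as arguments each time one of the listed events fires; the callback can later be removed with mmdelcallback fsStatusMonitor.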

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-11-04T12:41:12Z  in response to gpfs@us.ibm.com

    IBM has identified a problem with the mmrestripefs -c command

    Flash (Alert) http://www-01.ibm.com/support/docview.wss?uid=isg3T1021472

    Abstract

    IBM has identified a problem with the online replica compare/repair function invoked via the mmrestripefs -c command.


    Content

    Problem Summary:

    IBM has identified a problem with the online replica compare/repair function invoked via the mmrestripefs -c command, when invoked with any disks having status other than ready/replacement, for example with any disks in suspended state. This function may cause file system corruption, with potential loss or corruption of user files, if any replica is to be copied or moved for reasons other than replica mismatch. This problem affects customers running any PTF level of GPFS 3.5 from GPFS 3.5.0.11 through 3.5.0.20, or any level of GPFS 4.1 from GPFS 4.1.0.0 through GPFS 4.1.0.3. The function provided by the mmrestripefs -c command is disabled in PTFs 3.5.0.21 and 4.1.0.4.

    Users affected:

    This problem affects customers running any PTF level of GPFS 3.5 from GPFS 3.5.0.11 through 3.5.0.20, or any level of GPFS 4.1 from GPFS 4.1.0.0 through GPFS 4.1.0.3.

    This problem can only occur when the mmrestripefs -c command is run while there are disks with status other than ready/replacement. For a file system with data replication, the problem may only occur if none of the replicas of a data block are on a disk with status of ready/replacement.
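    As a quick way to see whether any disks are in a state other than ready before using the replica compare/repair function on a fixed level, mmlsdisk can be run against the file system ('gpfs1' below is a hypothetical file system name):

        mmlsdisk gpfs1
        # every disk should show 'ready' in the status column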

    Users can check if they may have been affected by running the following command to determine if mmrestripefs -c was ever issued in the cluster:

    mmdsh -N all grep "mmrestripefs.*-c" /var/adm/ras/mmfs.log\*

    If the result of the grep indicates that the command has been run, contact IBM Service.

    Recommendation:

    • Avoid running mmrestripefs with the -c option until a fix is made available by IBM. A fix will be made available for GPFS V3.5 with APAR IV66437 and GPFS V4.1 with APAR IV66123.
    • Contact IBM for an efix to disable the code; APAR IV66270 for GPFS V3.5 and APAR IV66271 for GPFS V4.1.
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-11-13T15:06:58Z  in response to gpfs@us.ibm.com

    Security Bulletin

    Vulnerabilities in OpenSSL affect GPFS V3.5 for Windows (CVE-2014-3513, CVE-2014-3567, CVE-2014-3568)

    See the complete bulletin at http://www-01.ibm.com/support/docview.wss?uid=isg3T1021548

    Summary

    OpenSSL vulnerabilities along with SSL 3 Fallback protection (TLS_FALLBACK_SCSV) were disclosed on October 15, 2014 by the OpenSSL Project. OpenSSL is used by GPFS V3.5 for Windows. GPFS V3.5 for Windows has addressed the applicable CVEs and included the SSL 3.0 Fallback protection (TLS_FALLBACK_SCSV) provided by OpenSSL.


    Vulnerability Details

    CVE-ID: CVE-2014-3513
    DESCRIPTION: OpenSSL is vulnerable to a denial of service, caused by a memory leak in the DTLS Secure Real-time Transport Protocol (SRTP) extension parsing code. By sending multiple specially-crafted handshake messages, an attacker could exploit this vulnerability to exhaust all available memory of an SSL/TLS or DTLS server.

    CVSS Base Score: 5.0
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/97035 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:N/I:N/A:P)

    CVE-ID: CVE-2014-3567

    DESCRIPTION: OpenSSL is vulnerable to a denial of service, caused by a memory leak when handling failed session ticket integrity checks. By sending an overly large number of invalid session tickets, an attacker could exploit this vulnerability to exhaust all available memory of an SSL/TLS or DTLS server.

    CVSS Base Score: 5.0
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/97036 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:L/Au:N/C:N/I:N/A:P)

    CVE-ID: CVE-2014-3568

    DESCRIPTION: OpenSSL could allow a remote attacker to bypass security restrictions. When configured with "no-ssl3" as a build option, servers could accept and complete an SSL 3.0 handshake. An attacker could exploit this vulnerability to perform unauthorized actions.

    CVSS Base Score: 2.6
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/97037 for more information
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:H/Au:N/C:N/I:P/A:N)


    Affected Products and Versions

    OpenSSH for GPFS V3.5 for Windows


    Remediation/Fixes

    In GPFS V3.5.0.21 dated November 2014, IBM upgraded OpenSSH for GPFS on Windows to use OpenSSL 1.0.1j to address this vulnerability. System administrators should update their systems to GPFS V3.5.0.21 by following the steps below.

    1. Download the GPFS 3.5.0.21 update package dated November 2014 into any directory on your system, from IBM at http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.5.0&platform=Windows&function=all

    2. Extract the contents of the ZIP archive so that the .msi file it includes is directly accessible to your system.

    3. Follow the instructions in the README included in the update package in order to install the OpenSSH msi package. This updated OpenSSH msi package is built using OpenSSL 1.0.1j.

    If GPFS multiclustering is configured on Windows nodes, upgrade all OpenSSL packages that may have been installed. The following can be done on a small group of nodes at a time (ensuring that quorum is maintained) to preserve file system availability:

    a. Stop GPFS on the node
    b. Install the updated version of OpenSSL
    c. Restart GPFS on the node
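    As a sketch of one such per-node cycle (the node name 'winnode1' and the .msi file name are illustrative; the actual package name comes from the extracted update archive):

        mmshutdown -N winnode1          # a. stop GPFS on the node
        msiexec /i openssh-gpfs.msi     # b. install the updated OpenSSH package (run on winnode1)
        mmstartup -N winnode1           # c. restart GPFS on the node

    Repeat for each small group of nodes, verifying that quorum is intact (for example, with mmgetstate -a) before moving to the next group.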

     

    Workarounds and Mitigations

    None known

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-11-13T15:18:16Z  in response to gpfs@us.ibm.com

    Security Bulletin

    Vulnerability in SSLv3 affects GPFS V3.5 for Windows (CVE-2014-3566)

    See the complete bulletin at http://www-01.ibm.com/support/docview.wss?uid=isg3T1021546


    Summary

    SSLv3 contains a vulnerability that has been referred to as the Padding Oracle On Downgraded Legacy Encryption (POODLE) attack. SSLv3 is enabled in GPFS V3.5 for Windows.


    Vulnerability Details

    CVE-ID: CVE-2014-3566
    DESCRIPTION: Product could allow a remote attacker to obtain sensitive information, caused by a design error when using the SSLv3 protocol. A remote user with the ability to conduct a man-in-the-middle attack could exploit this vulnerability via a POODLE (Padding Oracle On Downgraded Legacy Encryption) attack to decrypt SSL sessions and access the plaintext of encrypted connections.

    CVSS Base Score: 4.3
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/97013 for the current score
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:M/Au:N/C:P/I:N/A:N)


    Affected Products and Versions

    OpenSSH for GPFS V3.5 for Windows


    Remediation/Fixes

    In GPFS V3.5.0.21 dated November 2014, IBM upgraded OpenSSH for GPFS on Windows to use OpenSSL 1.0.1j to address this vulnerability. System administrators should update their systems to GPFS V3.5.0.21 by following the steps below.

    1. Download the GPFS 3.5.0.21 update package dated November 2014 into any directory on your system, from IBM at http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%2Bsoftware&product=ibm/power/IBM+General+Parallel+File+System&release=3.5.0&platform=Windows&function=all

    2. Extract the contents of the ZIP archive so that the .msi file it includes is directly accessible to your system.

    3. Follow the instructions in the README included in the update package in order to install the OpenSSH msi package. This updated OpenSSH msi package is built using OpenSSL 1.0.1j.

    If GPFS multiclustering is configured on Windows nodes, upgrade all OpenSSL packages that may have been installed. The following can be done on a small group of nodes at a time (ensuring that quorum is maintained) to preserve file system availability; the per-node cycle sketched in the previous bulletin applies here as well:

    a. Stop GPFS on the node
    b. Install the updated version of OpenSSL
    c. Restart GPFS on the node

    IBM recommends that you review your entire environment to identify areas that enable the SSLv3 protocol and take appropriate mitigation and remediation actions. The most immediate mitigation action that can be taken is disabling SSLv3.


    Workarounds and Mitigations

    None

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2014-12-23T16:43:39Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.22 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.22

    December 18, 2014

    * Fix an alloc cursor issue in block allocation code that may lead to a spurious no-space error in an FPO file system.
    * Fixed a problem where offline fsck fails the assert (!hardMaxSize or newSize <= hardMaxSize) if 'tsdbfs -f' is run while fsck is running.
    * Fixes a problem that may cause mmchdisk start to hang if there are too many pools or diskless nodes in an FPO cluster.
    * Reduce number of nsdMsgRdmaPrepare messages sent.
    * Fix GSS bug related to concurrent overlapping read operations during failover/error scenarios.
    * Redirect automatic recovery's tsrestripefs output to /var/adm/ras/restripefsOnDiskFailure.log
    * Fix problem with verbsRdmasPer[Node Connection] set to a value of 1.
    * Reduce CNFS failover time on systems with large list of exports.
    * Allow disk addresses in inode 5 (Extended Attribute File) to be found by the mmfileid command.
    * Exclude COMMON_TEMPSHIP_DEFINES from Windows Release builds.
    * If the user of a GSS system had previously set the slow disk detection parameters manually to nsdRAIDDiskPerformanceMinLimitPct=50 and nsdRAIDDiskPerformanceShortTimeConstant=25000, the manual settings can now be removed, though removing them is optional (see the sketch after this list).
    * Fixes a problem where fsck hits signal 8 during inode validation.
    * Fix a stack corruption issue.
    * This update addresses the following APARs: IV66618 IV66865 IV66871 IV67558 IV67561.
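    For the slow disk detection item above, removing a manual setting amounts to restoring the defaults with mmchconfig; a minimal sketch, assuming the two values were previously set by hand (the DEFAULT keyword restores an attribute's default value):

        mmchconfig nsdRAIDDiskPerformanceMinLimitPct=DEFAULT
        mmchconfig nsdRAIDDiskPerformanceShortTimeConstant=DEFAULT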

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2015-02-16T18:42:37Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.23 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.23

    February 12, 2015

    * Update code to prevent a deadlock that could occur when multiple threads try to create the same file in a directory at the same time.
    * Reduce memory utilization for GPFS RDMA QPs and fix a problem with Connect-IB when verbsRdmaSend is enabled.
    * Correct the tsbuhelper updateshadow command to recognize several circumstances in which gencount changes even for files not yet backed up or changed.
    * FPO: Use disks evenly when doing mmrestripefs -r or mmdeldisk.
    * Fix a defect which may cause a data consistency problem if one runs mmrestripefile after reducing the replica level.
    * When creating an FPO file system, always use the 'cluster' layoutMap if allowWriteAffinity is yes and layoutMap is unspecified (see the sketch after this list).
    * Fix "Waiting for nn user(s) of shared segment" messages on shutdown.
    * Fix a GNR bug related to a race condition that causes recovery failure during startup or failover.
    * Fix a mmdeldisk problem caused by disabled quota files placed in the system pool.
    * Fix potential loss of an IO error in the Linux io_getevents() call when enableLinuxReplicatedAio is enabled (3.5.0.14+). Fix a problem that improperly returns 'error 217' when doing Direct IO on a replicated file which has partial replicas on unavailable disks (4.1.0+).
    * Fix a Linux lookup crash issue.
    * Apply if secrecy of file metadata (pathnames, attributes, and extended attributes) is a concern.
    * Fixed a problem when deleting files from an independent fileset that caused unnecessary recalls when there are no snapshots.
    * Ensured a null pointer check when memory pool allocation fails while Ganesha is active.
    * Fix is recommended in all GNR configurations.
    * This update addresses the following APARs: IV67811 IV68061 IV68492 IV68660 IV68680 IV68681 IV68991.
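    As an illustration of the FPO layoutMap behavior noted above, a pool stanza like the following hypothetical one (pool name, block size, and write affinity depth are illustrative) now defaults to layoutMap=cluster because allowWriteAffinity is yes and layoutMap is left unspecified:

        %pool: pool=fpodata blockSize=1M allowWriteAffinity=yes writeAffinityDepth=1

    The stanza would be passed to mmcrfs via its stanza file in the usual way; specifying layoutMap=scatter explicitly still overrides the default.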

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2015-03-13T22:03:47Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.24 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.24

    March 12, 2015

    * Fix a problem with block allocation code, where E_NOSPC error could be incorrectly returned after running out of disk space in one failure group. This problem only affects file systems with data replication.
    * Fix fsck incorrect handling of horizontal block allocation map file.
    * When a GSS logTipBackup pdisk fails, mmlsrecoverygroup output will now display offline (as opposed to error) for the affected logTipBackup vdisk.
    * Enforce the same declustered array (DA) name for the old pdisk and the corresponding new one when replacing a pdisk with mmaddpdisk --replace.
    * Fix a problem that may cause the assertion '!ofP->destroyOnLastClose' when the file system is mounted in RO mode on some nodes and in RW mode on others.
    * Fix a bug in change license in a CCR cluster.
    * Fix a problem that might cause auto recovery failure in an FPO cluster.
    * Ensure the mmmigratefs tool does not assert when trying to process deleted inodes that have stray xattr bits turned on.
    * Close a very small timing window where a FileBlockRandomWriteFetchHandlerThread waiting for a lock may not be awakened, causing a deadlock.
    * Fix a problem that might cause no space error even if there is free disk space.
    * Protect fcntl kernel calls against non-privileged callers.
    * GPFS command hardening.
    * Fix a problem with directory lookup code that can cause FSErrInodeCorrupted error to be incorrectly issued. This could occur when lookup on '..' entry occurs at the same time as the directory and its parent are being deleted.
    * Enable dynamically switching from cipherList=EMPTY to cipherList=AUTHONLY without bringing down the entire cluster.
    * This update addresses the following APARs: IV68840 IV69426 IV69797 IV69834 IV69835 IV69836 IV69837 IV70611.

    Updated on 2015-03-13T22:13:14Z at 2015-03-13T22:13:14Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2015-03-16T12:52:08Z  in response to gpfs@us.ibm.com

    Security Bulletin: IBM General Parallel File System is affected by security vulnerabilities (CVE-2015-0197, CVE-2015-0198, CVE-2015-0199)

    View the complete Security Bulletin published on 2015-03-13 at http://www-01.ibm.com/support/docview.wss?uid=isg3T1022062

    Summary

    Security vulnerabilities have been identified in current levels of GPFS V4.1, V3.5, and V3.4:
    - could allow a local attacker which only has a non-privileged account to execute programs with root privileges (CVE-2015-0197)
    - may not properly authenticate network requests and could allow an attacker to execute programs remotely with root privileges (CVE-2015-0198)
    - allows attackers to cause kernel memory corruption by issuing specific ioctl calls to a character device provided by the mmfslinux kernel module and cause a denial of service (CVE-2015-0199)

    Vulnerability Details


    CVEID: CVE-2015-0197
    DESCRIPTION: IBM General Parallel File System could allow a local attacker which only has a non-privileged account to execute programs with root privileges.
    CVSS Base Score: 6.9
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/101224 for the current score
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:L/AC:M/Au:N/C:C/I:C/A:C)

    CVEID: CVE-2015-0198
    DESCRIPTION: IBM General Parallel File System may not properly authenticate network requests and could allow an attacker to execute programs remotely with root privileges.
    CVSS Base Score: 9.3
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/101225 for the current score
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:N/AC:M/Au:N/C:C/I:C/A:C)

    CVEID: CVE-2015-0199
    DESCRIPTION: IBM General Parallel File System allows attackers to cause kernel memory corruption by issuing specific ioctl calls to a character device provided by the mmfslinux kernel module and cause a denial of service.
    CVSS Base Score: 4.7
    CVSS Temporal Score: See http://xforce.iss.net/xforce/xfdb/101226 for the current score
    CVSS Environmental Score*: Undefined
    CVSS Vector: (AV:L/AC:M/Au:N/C:N/I:N/A:C)


    Affected Products and Versions

    GPFS V4.1.0.0 through GPFS V4.1.0.6

    GPFS V3.5.0.0 through GPFS V3.5.0.23

    GPFS V3.4.0.0 through GPFS V3.4.0.31

    For CVE-2015-0198, you are not affected if either of the following is true:

        the cipherList configuration variable is set to AUTHONLY or to a cipher

    or

        only trusted nodes/processes/users can initiate connections to GPFS nodes
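    To check which case applies, the current cipherList setting can be displayed before deciding on remediation (a minimal check, on levels where mmlsconfig accepts an attribute name):

        mmlsconfig cipherList

    A value of AUTHONLY or a named cipher means CVE-2015-0198 does not apply; an empty or unset value means the fix and the cipherList change below are needed.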

    Remediation/Fixes

    Apply GPFS 4.1.0.7, GPFS V3.5.0.24, or GPFS V3.4.0.32, as appropriate for your level of GPFS, available from Fix Central at http://www-933.ibm.com/support/fixcentral/

    For CVE-2015-0198, after applying the appropriate PTF, set cipherList to AUTHONLY.

    To enable AUTHONLY without shutting down the daemon on all nodes:

        Install the PTF containing the fix on all nodes in the cluster one node at a time
        Generate SSL keys by running the mmauth genkey new command. This step is not needed if CCR is in effect (GPFS 4.1 only)
        Enable AUTHONLY by running the mmauth update . -l AUTHONLY command

    If the mmauth update command fails, examine the messages, correct the problems (or shut down the daemon on the problem node) and repeat the mmauth update command above.

    Note: Applying the PTF for your level of GPFS (GPFS 4.1.0.7, GPFS V3.5.0.24, or GPFS V3.4.0.32) on all nodes in the cluster will allow you to switch cipherList dynamically without shutting down the GPFS daemons across the cluster. The mitigation step below will require all nodes in the cluster to be shut down.

    If there are any nodes running GPFS 3.4 on Windows then switching the cipherList dynamically is only possible in one of the following two scenarios:

        The mmauth update command is initiated from one of the GPFS 3.4 Windows nodes

    or

        If the command is issued from another node in the cluster then GPFS must be down on all the GPFS 3.4 Windows nodes


    Workarounds and Mitigations


    For CVE-2015-0197 and CVE-2015-0199, there are no workarounds or mitigations.

    For CVE-2015-0198, set cipherList to AUTHONLY, or to a real cipher. Follow the instructions above if the PTF was installed on all the nodes in the cluster. Otherwise:

        Generate SSL keys by running the mmauth genkey new command
        Shut down the GPFS daemon on all nodes in the cluster
        Enable AUTHONLY by running mmauth update . -l AUTHONLY
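    Collected into a single hypothetical sequence (this is the cluster-wide-outage path, as opposed to the rolling path described above; the final restart is implied rather than stated in the steps):

        mmauth genkey new             # generate new SSL keys
        mmshutdown -a                 # stop the GPFS daemon on all nodes
        mmauth update . -l AUTHONLY   # enable AUTHONLY
        mmstartup -a                  # bring the daemons back up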

     

    Get Notified about Future Security Bulletins

    Subscribe to My Notifications to be notified of important product support alerts like this.

    Acknowledgement

    The vulnerabilities were reported to IBM by Florian Grunow and Felix Wilhelm of ERNW.

    Updated on 2015-03-19T12:12:24Z at 2015-03-19T12:12:24Z by gpfs@us.ibm.com
  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2015-04-17T14:52:46Z  in response to gpfs@us.ibm.com

    Security Bulletin: Vulnerability in RC4 stream cipher affects GPFS V3.5 for Windows (CVE-2015-2808) / Enabling weak cipher suites for IBM General Parallel File System is NOT recommended

    Summary
    The RC4 "Bar Mitzvah" Attack for SSL/TLS affects OpenSSH for GPFS V3.5 for Windows. Additionally, with the recent attention to RC4 "Bar Mitzvah" Attack for SSL/TLS, this is a reminder to NOT enable weak or export-level cipher suites for IBM General Parallel File System (GPFS).

    See the complete bulletin at  http://www-01.ibm.com/support/docview.wss?uid=isg3T1022137

  • gpfs@us.ibm.com
    gpfs@us.ibm.com
    216 Posts
    ACCEPTED ANSWER

    Re: GPFS V3.5 announcements

    ‏2015-05-18T18:47:20Z  in response to gpfs@us.ibm.com

    GPFS 3.5.0.25 is now available from IBM Fix Central:

    http://www-933.ibm.com/support/fixcentral

    Problems fixed in GPFS 3.5.0.25

    May 17, 2015

    * Fix a problem where the last data block was wrongly set, causing an assert.
    * Fix the online replica compare tool to not report false replica mismatches when the file system has suspended disks.
    * Try to reduce Linux memcg/oom deadlock risk on the GPFS side.
    * Fix a problem in determining whether copy-on-write is needed in the presence of snapshots, which sometimes could result in spurious write operation failures (especially, but not limited to, file/directory creation).
    * Fix a problem where the mmcrsnapshot and mmdelsnapshot commands time out while a migrated file being deleted is waiting for recall.
    * Ensure that EA migration to enable FastEA support for a file system does not assert, under certain conditions, for the 'data-in-inode' case.
    * Update allocation code to close a small timing window that can lead to file system corruption. The problem could occur when a client node panics while a new file system manager is in the middle of takeover.
    * Fix a signal 11 problem in a multi-cluster environment when the GPFS daemon relays an fsync request through the metanode but the OpenFile gets stolen on the metanode in the middle.
    * The privateSubnetOverride configuration parameter may be used to allow multiple clusters on the same private subnet to communicate even when cluster names are not specified in the 'subnets' configuration parameter (see the sketch after this list).
    * Fix a workload counter used for NVRAM log tip I/O processing queues. Recommended if the NVRAM log tip is in use.
    * Potentially avoid a crash on normal OS shutdown of a CNFS node.
    * Fix a problem where the number of nodes allowed in a cluster is reset from 16384 to 8192.
    * Prevent the user exit for the nodeLeave event from being called twice for the same failing node.
    * Fix a problem that affects GSS/ESS customers who are using chdrawer to prepare to replace a failed storage enclosure drawer on an active system.
    * Fix a problem which may cause auto recovery to fail to suspend some down disks.
    * Fix poor command performance on clusters that have no security key.
    * Fix a problem with DIRECT_IO writes which can cause data loss when the file system panics or a node fails after a write passes the end of file using DIRECT_IO and causes an increase in file size. The file size increase could be lost.
    * Fix a problem where the file cache filled up with deleted objects (Linux NFS).
    * Fix problems with mmapplypolicy termination that could hang, loop, or segmentation fault.
    * Fix handling of policy rules like ... MIGRATE ... TO some-group-pool THRESHOLD (hi,lo) ...
    * Fix a problem where mmauth inadvertently changed cipherList to an invalid string.
    * Fix a problem in the mmremotecluster command that fails to refresh authorized keys.
    * This update addresses the following APARs: IV71015 IV71608 IV71614 IV71615 IV71819 IV71989 IV71993 IV72013 IV72028 IV72031 IV72035 IV72038 IV72040 IV72046 IV72679 IV72685 IV72686 IV72693 IV72697 IV72699.
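    For the privateSubnetOverride item above, a hypothetical way to enable it (the value '1' and the immediate activation flag -i are assumptions to verify against the documentation for your level):

        mmchconfig privateSubnetOverride=1 -i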