Cumulative readme for fixed issues
Refer to this file to find out which issues were resolved in each IAS release since version 1.0.2.0.
Issues fixed in 1.0.23.2
- Memsize issues
Resolved an issue where IPMI commands took longer to complete than their typical execution time. As a result, the session timed out and an incomplete picture of the installed memory was logged. The timeout is now extended to 5 minutes.
- GPFS
mmhealth node show used to show a non-optimal GPFS configuration. It is now set to 0. On a running machine, a GPFS restart is needed to enforce the GPFS change. Also, the value persists after GPFS reboot/restart.
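The Memsize fix above extends a command timeout to 5 minutes. As a rough illustration only, the following Python sketch shows one way to bound a slow command with a timeout and treat a timeout as "no data" rather than a partial result. The function name and constant are hypothetical and are not taken from the IAS code.

```python
import subprocess

# Assumption: 5 minutes mirrors the extended timeout described above.
COMMAND_TIMEOUT_SECONDS = 5 * 60


def run_with_timeout(command, timeout=COMMAND_TIMEOUT_SECONDS):
    """Run a command and return its stdout, or None if it timed out."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # Treat a timeout as "no data" instead of logging a partial result.
        return None
    return completed.stdout
```

A caller can then distinguish a timed-out command (None) from a command that completed with empty output (an empty string).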
Issues fixed in 1.0.22.1
- Replaced device mapper storage driver with overlay2 storage driver. This fixes any future thinpool issues during upgrades. When upgrading to version 1.0.21.3, DSX reinstall might still be required.
- Fixed an upgrade issue with Docker not starting.
- Fixed an issue with upgrade removing EMC NetWorker configuration. EMC NetWorker package and configuration are now retained after the upgrade.
- Fixed the following issues with static routing:
- Fixed issue with static routes and nodes losing their fbond/mbond network interfaces.
- Static routes can now be deleted on every node using the apsetup utility. For more information, see Deleting static routes.
- sys_hw_diags network no longer crashes on 4 HA or larger systems.
- db_restore no longer transfers the ownership for system-generated sequences. Ownership is transferred only for user-defined sequences.
- Resolved the issue with Platform Manager not being able to establish a master.
- Resolved issues with node not coming up on the mgt interface after reboot.
When apupgrade resumed after a node reboot, the node failed to come up on the mgt interface. Now apupgrade attempts to confirm it can communicate over the management network with all of the nodes. If any nodes are unreachable, it attempts to restart the network.
- Resolved issues related to the netbackup-server/-nb option for schema-level backup.
- DSX is not enabled during the final stages of the upgrade procedure if you disabled DSX on your machine. Note: If you want to disable DSX, you have to uninstall it. To uninstall DSX, run ./InstallPackages/utils/uninstall.sh from the directory where install was run. Usually, the install directory is located in /opt/ibm/appliance/storage/platform. If you can't find the directory, contact IBM Support.
- FOS 8.2.1 support
- In upgrade only mode, rev firmware was generating an error and preventing update. This mode was added to the criteria for allowing update.
- The quiet mode in the _fcsw_restart method was not being propagated to _wait_reboot, and this was causing two countdown timers to appear in the same location.
- Resolved issues with _syshw_int_hndlr() taking exactly 1 argument (2 given).
The required frame argument passed to the interrupt handler assigned to a signal is put back. Also, the cleanup_and_exit API is now consistent across all calls. To avoid putting pylint ignore rules in the code, the unused arguments are used in a NOP clause.
- Resolved issues with sys_hw_check fsn. When ethernet was removed from both FSN canisters, a FAIL was printed and no details about the FSN were provided. It now falls back to serial connections to gather component information and prints a WARN message about the ethernet connection not being present.
- Fixed the security vulnerability issue with the remote NPS server responding to mode 6 queries. Previously, devices that responded to the queries could have been used in NTP amplification attacks.
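The _syshw_int_hndlr() fix in this release concerns the signature that Python expects of a signal handler: the interpreter always calls the handler with two arguments, the signal number and the current stack frame. The following is a minimal sketch of that pattern, not the actual IAS code. The handler body, the exit code convention, and the cleanup_and_exit signature shown here are assumptions for illustration.

```python
import signal
import sys


def cleanup_and_exit(exit_code):
    """Hypothetical stand-in for the cleanup routine named in the notes."""
    sys.exit(exit_code)


def interrupt_handler(signum, frame):
    # Python invokes signal handlers as handler(signum, frame). A handler
    # declared with only one parameter fails with
    # "takes exactly 1 argument (2 given)".
    if frame is None:
        pass  # NOP clause: touches the unused argument without a pylint ignore
    cleanup_and_exit(128 + signum)


def install_handlers():
    # Register the same two-argument handler for the common stop signals.
    signal.signal(signal.SIGINT, interrupt_handler)
    signal.signal(signal.SIGTERM, interrupt_handler)
```

Calling install_handlers() once at startup is enough. After that, Ctrl+C or a TERM signal runs the cleanup path instead of raising a TypeError inside the handler.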
Issues fixed in 1.0.19.7
- Added a pre-check option apdbrollback --restore_check to verify that there is enough space in the system to restore a snapshot when performing the rollback of database upgrade.
- Added the following new alert in Platform Manager:
902: Stored snapshots reduce available storage. The alert is opened when, after a database upgrade to 11.5, the snapshots are not deleted as described in appl_upgrade_engine.html#task_wdf_gks_pjb__delete_snapshot.
Issues fixed in 1.0.19.2
- Fixed the issue with GPFS that was not upgraded during IAS upgrade.
- Defects addressed in Db2 Warehouse:
- LDAP should have TLS 1.2 as minimum allowed version
- After ap upgrade to 1.0.16.0, DB2 services are not coming up, causing HA management and Db2 to go down.
- Fixed the issue with hidden columns in IBM Data Replication for Availability: when creating a target table a hidden identity column on source was not created as hidden on target.
- Web console component includes fixes for the following issues:
- Backup and restore could not be launched from console
- Console configuration files might be corrupted sometimes
- User provided certificate is lost after console is upgraded
- Large sort memory and tablespace consumption in console utility event monitor query
- Granular backup and restore object ownership fixes.
- Fixed the issue with incremental database backup: updated db_backup to estimate the increase in database size since the last full online backup and then compare if the given backup path has enough space to accommodate the backup, instead of comparing it with the entire database size.
- Fixed a bug in db_restore, to allow database restore from multiple paths specified with a space ( ), or a comma (,), or both, a comma and a space (, ) as separators. Examples:
    -path p1,p2
    -path p1 p2
    -path p1, p2
- Fixed a bug in db_backup, to avoid creating multiple database backups under the same directory. Now, every database backup has its own unique directory backup_<type>_<number>.
- Updated db_backup script to create backup directories as db2inst1 or database admin user, to allow users to delete schema backups without root access.
During schema backup, if the db2inst1 or the database admin user doesn't have the right permissions to create backup directories in the given path, then the directories are created with a root user and a warning is displayed. Note that root access is required to remove them.
- Fixed db_backup -history command not to throw a Traceback error when run during an offline backup or any restore operation. However, this command cannot return history results when the database is down during any offline operation.
- The list of Red Hat CVEs patched in 1.0.19.0 release:
CVE-2018-10911 CVE-2019-11477 CVE-2019-11478 CVE-2019-11479 CVE-2019-10161 CVE-2019-10166 CVE-2019-10167 CVE-2019-10168 CVE-2019-10160 CVE-2019-12735
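The db_restore fix in this release accepts a space, a comma, or a comma followed by a space as separators in the -path value. A minimal sketch of such parsing is shown below. The function name is hypothetical and the real script may parse its arguments differently. In particular, this sketch assumes that paths do not themselves contain spaces or commas.

```python
import re


def split_backup_paths(raw):
    """Split a -path value on commas, whitespace, or both.

    Assumption: individual paths contain no spaces or commas.
    """
    return [part for part in re.split(r"[,\s]+", raw.strip()) if part]
```

All three documented forms, "p1,p2", "p1 p2" and "p1, p2", yield the same two-element list.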
Issues fixed in 1.0.18.1
- A number of issues that were related to the upgrade procedure were fixed:
- Fixed the issue with NodeOS upgrade failing due to an rpm package installed by DSX.
- To avoid breaking GPFS during upgrade, run a pre-check mmccr check command on all nodes to verify that GPFS is in a consistent state before running an upgrade. The command should return the following output:
    [root@node0101 ~]# mmccr check
    mmccr check results:
    CCR Client initialization succeed
    Check CCR authorized key file succeed
    Check CCR cached directory and file succeed
    Check both CCR paxos files succeed
If the check fails, run the following commands on the node where the command mmccr check failed to resolve the issue:
    mmchcluster --ccr-disable
    mmsdrrestore
    mmchcluster --ccr-enable
Then, run mmccr check again to confirm that the CCR error is resolved.
- Fixed the issue with ROW systems: When upgrading a row-organized system, apinit was changing dashDB.env file parameters from row to column parameters, and the environment variable was changed.
- Fixed the issue with NodeOS upgrade resulting in the following error:
NodeosUpgrader.install : nodeos:NodeosUpgrader.install:Fatal Problem: The Node OS rpm is not correctly installed/configured. This error requires manual intervention to resolve. Please contact IBM Support.
Issues fixed in 1.0.18.0
- Backup and restore:
- For a given backup path, updated db_backup to only change permissions on the directories created by the script, leaving the permissions of the existing directories unaltered.
- During a schema backup, modules and their aliases will be backed up. And during a restore, these objects will be left unaltered if they do not exist in the backup.
- Updated db_backup and db_restore to limit the number of processes spawned during a schema backup or a restore.
- For multi-table restore, updated -tablefile option to handle empty lines in the file that has the table names to restore.
- db_backup now uses schema size instead of the database size to check if the given path has enough space for the schema backup.
- Updated checks for schema backups: when multiple paths are specified with a comma (,) and no space separator is used, then these paths are now parsed correctly as multiple paths instead of a single path.
- For a schema backup, the trigger and schema objects' ownership as well as the ownership type is now backed up. And during a restore, the ownership of these objects will be restored to its original owner.
- All the backup directories will be owned by db2inst1 user and database admin group to allow non-bluadmin users to access the backup. A warning will be displayed instead of an error if db_backup fails to change the ownership on the backup directories to:
db2inst1 : {admin_group}
- Fixed a bug in db_restore, to allow restores when a table has the same name as the schema.
- The list of Red Hat CVEs patched in 1.0.18.0 release:
CVE-2019-3839 CVE-2019-2602 CVE-2019-2684 CVE-2019-2698 CVE-2018-9568 CVE-2018-17972 CVE-2018-18445 CVE-2019-6974 CVE-2019-7221 CVE-2018-12126 CVE-2018-12127 CVE-2018-12130 CVE-2019-11091 CVE-2018-5743 CVE-2019-10132 CVE-2014-3565 CVE-2019-9636 CVE-2016-10745 CVE-2019-5953
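For the -tablefile change in this release, the idea is that blank lines in the file of table names are skipped instead of being treated as table names. A minimal sketch of that behavior, with a hypothetical function name and no claim about how db_restore actually reads the file:

```python
def read_table_names(lines):
    """Return table names from an iterable of lines, skipping empty lines."""
    names = []
    for line in lines:
        name = line.strip()
        if name:  # ignore empty and whitespace-only lines
            names.append(name)
    return names
```

Because an open file object is an iterable of lines, the same function can be applied directly to the opened table file.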
Issues fixed in 1.0.17.1
- The list of Red Hat CVEs patched in 1.0.17.1 release:
CVE-2019-2422 CVE-2019-6454 CVE-2019-3835 CVE-2019-3838 CVE-2018-9568 CVE-2018-17972 CVE-2018-18445 CVE-2019-3855 CVE-2019-3856 CVE-2019-3857 CVE-2019-3863 CVE-2018-5407
- Fixed an issue with third party rpm packages installed by DSX on IAS causing the upgrade to fail.
- Fixed a known issue with adding SAN, where in some cases the following error was seen when attempting to add a SAN:
Checking multipath devices and populating GPFS stanza file... ERROR: No multipaths found...
- Updated db_backup to back up all the Grant statements and store the ownership information for schema objects like tables, views, stored procedures and references. Also, db_restore can now be used to preserve the permissions using those Grant statements and also restore the ownership for the above schema objects.
- Fixed an issue with db_restore, where it used to skip restoring the very first view in a schema. Now all the views in a schema can be restored successfully.
- Updated external table options to allow db_backup and db_restore to support backup and restore of the tables that have data with ASCII characters stored in binary format.
- Updated db_backup schema size check to get the size of tables in a schema using ADMIN_GET_TAB_INFO procedure, in place of ADMINTABINFO table, to avoid lock timeout issues during CTAS operations.
- Fixed a bug in multi-path size check for full database backup. Database size is now compared with the sum of space left in each given path instead of comparing the database size with the space left in individual paths.
- Fixed bogus data integrity check failure for a specific case of NUMERIC() in db_migrate.
- Fixed issue with empty tables in db_migrate_iias.
- Fixed issue with single quote value for -escapeChar parameter in dbload.
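The multi-path size check fix for full database backup in this release compares the database size with the sum of the free space across all given paths, rather than with each path individually. The sketch below illustrates that comparison. The function name and the injectable free_space callable are assumptions for illustration, and the sketch assumes each path is on a different file system (otherwise the same free space would be counted more than once).

```python
import shutil


def has_enough_space(database_size_bytes, paths, free_space=None):
    """Return True if the combined free space of all paths fits the database.

    free_space is a callable mapping a path to its free bytes. It defaults
    to the free space reported by shutil.disk_usage.
    """
    if free_space is None:
        free_space = lambda path: shutil.disk_usage(path).free
    total_free = sum(free_space(path) for path in paths)
    return total_free >= database_size_bytes
```

With two paths that have 60 and 50 units free, a database of size 100 passes this check even though neither path could hold it alone, which is the case the earlier per-path comparison rejected.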
Issues fixed in 1.0.16.0
- The list of Red Hat CVEs patched in 1.0.16.0 release:
CVE-2018-5742 CVE-2018-16540 CVE-2018-19475 CVE-2018-19476 CVE-2018-19477 CVE-2019-6116 CVE-2018-18397 CVE-2018-18559 CVE-2018-15688 CVE-2018-16864 CVE-2018-16865 CVE-2019-3815 CVE-2018-18311 CVE-2019-6133
Issues fixed in 1.0.15.0
- Fixed the issue with storage setup -e san not working on multi-domain systems.
- db_backup and db_restore now support multiple paths that are comma-separated.
- db_restore now supports schema or table restores with identity columns.
- db_backup now supports taking a schema backup even when the spatial column is empty with no data in a table.
- db_backup and db_restore are updated to backup and restore foreign key constraints as part of schema backups and schema restores. For table restores, foreign key constraints are disabled and need to be enabled manually.
- The list of Red Hat CVEs patched in 1.0.15.0 release:
CVE-2018-15908 CVE-2018-15909 CVE-2018-16511 CVE-2018-16539 CVE-2018-16863 CVE-2018-15911 CVE-2018-16541 CVE-2018-16802 CVE-2018-17183 CVE-2018-17961 CVE-2018-18073 CVE-2018-18284 CVE-2018-19134 CVE-2018-19409 CVE-2018-14633 CVE-2018-14646
Issues fixed in 1.0.14.1
- Added a Red Hat hotfix rpm for kernel crash issue "Server crash with kernel memory exposure attempt detected".
Issues fixed in 1.0.14.0
- Fixed the problem with permissions for deleting backups and backup directories:
- Added group write permissions to db2iadm1 user group on the backup directories.
- Added group read and write permissions to db2iadm1 user group on backup files.
- FSN firmware is upgraded to 1.5.2.1 to fix the incorrect multibit ECC error during code load.
The FSN900 DRAM technology consists of two dies in a single package, where each die is a different memory rank. Micron calls this technology TwinDie. The issue is that the Xilinx memory controller loads the wrong values into the top rank. The wrong values affect the DRAM write path and thus cause the incorrect multibit ECC during code load. The code is loaded correctly, but due to this issue the ECC is incorrect.
Issues fixed in 1.0.13.0
- A security hole was closed which allowed unprivileged user login from a remote system. The hole was unintentionally opened due to the way Brocade implemented log collection for hardware issues in its Fibre Channel switches. The IAS log collection has been updated, but if an installed system has ever had an issue on a Fibre Channel switch which required log collection, the hole will remain until logs are collected. In such a situation, collect logs with the following command, targeting every node in the system to ensure the hole is closed:
    apdiag collect --components hw/switch/fcsw --fcsw hadomain1.fcswa --node node0101-fab
This command may be run with the database online, and it will take approximately 5 minutes per node.
- Fixed reporting of FSP events. In previous releases, if more than one FSP unrecoverable event was generated in a short period of time, Platform Manager sent events for all of them, but only one of these events would contain proper details of the FSP event. In others, details were reported as unknown.
- Fixed an issue with apstop reporting failures on stopping containers even if it was successful.
- Fixed an issue with ap hw reporting node status as OK instead of UNREACHABLE when the node is disabled and then turned off.
- db_logprune now cleans up all the older log chains (C directories) under the archive log path, keeping only up to 50 latest logs in the latest log chain (C directory).
- Fixed a problem with copying or moving a file system backup due to lacking permissions. Backup directory and images ownership is now changed from db2inst1 to bluadmin user to avoid such problems.
- An issue with untagged images taking up Docker storage on nodes is fixed. The untagged images are now pruned after upgrades.
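The db_logprune change in this release removes all older log chains (C directories) and keeps up to 50 of the latest logs in the latest chain. The following sketch only plans which items would be removed. It does not touch the file system, and it is not the db_logprune implementation. It assumes zero-padded chain and log names (for example C0000001 and S0000012.LOG) so that lexicographic order matches chronological order.

```python
def plan_prune(chains, keep=50):
    """Decide what to prune from an archive log path.

    chains maps a chain directory name to the list of log file names in it.
    Returns (old_chains, old_logs): chain directories to remove entirely,
    and logs to remove from the latest chain so that at most `keep` remain.
    """
    if not chains:
        return [], []
    latest = max(chains)  # assumes zero-padded names sort chronologically
    old_chains = sorted(name for name in chains if name != latest)
    logs = sorted(chains[latest])
    surplus = max(len(logs) - keep, 0)
    return old_chains, logs[:surplus]
```

Separating the planning step from the deletion step makes it possible to review or log what would be removed before anything is deleted.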
Issues fixed in 1.0.12.0
- Fixed an issue where, in case of a power outage impacting all the appliance nodes at the same time, Platform Management software could not automatically start Db2wh containers when the power was restored.
- Fixed an issue with the ap node enable command failing due to GPFS problems.
Platform Manager did not allow enabling that node again, displaying Could not complete command in the output of the ap node enable command. To make it ready for enabling a node again, Platform Manager had to be restarted (apstop -p; apstart -p).
- If the system is configured to run TSM LAN-Free backup and restore, the DSMSTA service will be left running all the time except for when running non-TSM backup and restore.
Issues fixed in 1.0.11.1
- Fixed a defect with false positive states reported by HA when the system was stressed: When Db2® was running out of application HEAP, no new connections were possible to bludb. Since HA depends on connecting to Db2 for status, it was reporting that DB2® was down and in recovery phase.
- Fixed an issue where during a node failover, Db2 ACTIVATE DATABASE BLUDB command was hung.
- Fixed a docker mount count issue in systems with Tiered storage or external mount.
- Fixed a defect in emc_client settings --backup command, to only unlink the NMDA library instead of removing it entirely.
- Fixed an issue in the DR environment, where after a snapshot job failure Db2 was put into WRITE-SUSPEND mode and was not resumed.
- Fixed an issue with ap info showing Growth on Demand data with significant delay (~2 hours) after it’s updated or set with the apgodmgr command.
- Fixed an issue with Platform Manager unable to close alert 112 (FSP unrecoverable events detected) on one or more nodes if FSP event with the same identifier was generated by nodes.
- Fixed a bug in db_backup and db_restore commands to clean up the lock acquired once the backup or restore is complete.
- Fixed a bug in backup progress view on web console to report a more accurate percentage of completion.
Issues fixed in 1.0.10.1
- Fixed the network configuration issue with the MTU of the fbond:Fl interface incorrectly specified.
Issues fixed in 1.0.10.0
- Fixed an issue with ap fs command failing to display information about file systems mounts on appliances with tiered storage attached. The failure was related to timeout in handling requests in REST of Platform Manager.
- Fixed an issue with docker monitoring in Platform Manager failing when communication with docker service is interrupted by ReadTimeout even for one node. After the fix, monitoring fails for a given node only.
- Fixed an issue where, due to incorrect handling of some SMTP errors, Platform Manager would unnecessarily queue and repeat failed requests for sending alert email notifications. After the SMTP errors were fixed (on the SMTP service or in the network), the old alerts were delivered, even after a number of days.
- Fixed an issue with Web console where due to a bug in generating access tokens in Platform Manager, Web Console would lose access to monitoring data. As a result, some of the panels in the Web Console were not populated with platform management data.
- Fixed an issue with Platform Management starting a container before GPFS file system is mounted correctly, which led to Db2 start failure.
- Fixed db_backup and db_restore to block all remote and local non-admin connections before running offline backups and all restores. Enhanced error handling mechanism to restart the connections back up in case of failures.
Issues fixed in 1.0.9.0
- Fixed an issue where Platform Manager requested an unnecessary node power cycle due to internal problems related to REST communication with an agent that is running on that node.
- Fixed an issue with a node not responding to shut down request sent by apstop command. The command failed with Problem occurred: Timeout exceeded when waiting for deactivation message.
- Fixed an issue where the Db2 Warehouse container on a node was restarted by Platform Manager when GPFS NSDs related to cool file systems of tiered storage were not available.
- Fixed an issue with alert 410 generated during software update of Call Home service, about not being able to stop Call Home.
- Fixed an issue where a multi-HA domain system got stuck in RECOVERING state if all Db2 Warehouse containers in the first HA domain were restarted as part of a GPFS file system recovery.
Issues fixed in 1.0.8.0
- Platform Manager no longer reports that the database state is OK when some database services are stopped and as a result, the user cannot connect to the database. Database connections are now monitored and recovery is enabled.
- Added a check to see whether a backup or restore process is already running when a user tries to run a new backup or restore.
- A TRACKMOD check is now included for incremental backup. The Track modified pages setting must be set to YES for incremental backup.
- Platform level backups:
- Added prechecks for GPFS Cluster Status and GPFS Mounts before Platform Level Backups
- Renamed apcomms option as apnetworking, and rlockdown option as apsecurity.
- The db_backup -history and db_restore -history command output no longer shows an error when no records are found in the history. Instead, a relevant message is displayed.
- Fixed an issue with the external LDAP server configuration. The external LDAP user can now log in to any node having an external IP, and can switch to any of the platform nodes with the valid password.
- Added a meaningful error message if host, search-base-dn, searcher-dn and ca-cert arguments that are passed to ap_external_ldap.pl are empty.
- The port number that is passed to ap_external_ldap.pl is now validated, and a meaningful error message is provided if it is not valid.
- Fixed an issue where an alert about high storage usage (reason code 901) was raised incorrectly (without legitimate reason) on problems with mounting the GPFS file system on one of the nodes.
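The ap_external_ldap.pl changes in this release reject empty host, search-base-dn, searcher-dn and ca-cert arguments and validate the port number. The script itself is written in Perl and its code is not shown in these notes, so the following Python sketch is only an illustration of that kind of validation. The function name, the dictionary keys, the message wording, and the accepted port range of 1 to 65535 are assumptions.

```python
REQUIRED_ARGS = ("host", "search-base-dn", "searcher-dn", "ca-cert")


def check_ldap_arguments(args):
    """Return a list of human-readable errors for the given argument dict."""
    errors = []
    for name in REQUIRED_ARGS:
        if not str(args.get(name) or "").strip():
            errors.append("%s must not be empty" % name)
    try:
        port = int(str(args.get("port") or "").strip())
    except ValueError:
        port = None  # not a number at all
    if port is None or not 1 <= port <= 65535:
        errors.append("port must be an integer between 1 and 65535")
    return errors
```

Collecting all problems into one list lets the caller report every invalid argument in a single run instead of stopping at the first one.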
Issues fixed in 1.0.7.0
- The system no longer fails to start when external storage that is attached to GPFS is not mounted.
- Changes to the default PAM authentication settings led to external LDAP user authentication failures. These issues are now resolved.
- The database no longer becomes disabled for a significant amount of time after the network is reconfigured by using the apsetup command.
- Nodes no longer become disabled due to failed attempts to restart the docker service after the service was power-cycled.
- Alert 412 (node time is not synchronized) is now automatically closed after the node time is synchronized.
- When an external IP is assigned to the 10Gbps connection to an appliance, the gateway setting is no longer lost after a server is power-cycled.
- A Fibre Channel switch no longer causes a kernel race/deadlock among the PCI devices.
- A race condition between multiple network services in some deployment situations no longer results in a loss of connectivity.
- You no longer need to manually run db_startsrc -all after a restore operation. If a restore attempt fails, db_restore now runs automatically to enable HA and DSM connections.
- Before you run an upgrade, the /etc/ntp.conf file no longer needs to be manually backed up and DSX no longer needs to be started and stopped. The upgrade process takes care of these steps automatically.
- In earlier releases, an error occurred when you attempted to enable external LDAP authentication (by using ap_external_ldap) if the platform was already enabled with external LDAP authentication. A message is now displayed that instructs you to disable the previous LDAP authentication.
- You no longer need to manually copy the external LDAP certificate from the head node to peer nodes when you are enabling external LDAP authentication. You also no longer need to manually delete the external LDAP certificate from the head node and peer nodes when you are disabling external LDAP authentication.
- An auto compaction feature is provided to allow etcd to manage keyspace. With this feature enabled, etcd detects and prunes the database automatically to keep it within its keyspace limit.
Issues fixed in 1.0.6.0
- When all host names are removed from a host's network configuration, the operation to remove them from the /etc/hosts file failed. This issue has been fixed.
- During network configuration, extraneous output would sometimes appear. This distracting output has been removed.
- In some instances, the upgrade of the ap-comms RPM was failing during installation, leaving the RPM in a half-installed state. This issue has been fixed.
- IAS now only successfully authenticates users that have been explicitly provided access to the system by the platform administrator using the usermod option of the /opt/ibm/appliance/platform/ldap/bin/ap_external_ldap.pl command. In previous versions of IAS, a platform user with a group id of 2001 or 2002 was able to authenticate to IAS from the external LDAP server even without access explicitly granted by the administrator.
Issues fixed in 1.0.5.0
- Fixed an issue where, after a node reboot, Platform Manager might not have started the node's application container due to an issue in the startup policy.
- Fixed an issue where an unreachable node could be disabled prematurely when three docker service starts failed in quick succession.
Issues fixed in 1.0.4.0
- dbsql with -schema option no longer returns segmentation error and core dump.
Issues fixed in 1.0.2.0
- Pre-install steps are now skipped for components already up to date.
- Adding DNS search domains no longer incorrectly adds the word domains in the resolv.conf file.
- The nodes are now included in the VLAN. Therefore, by using VLAN to access the IAS, you can also directly access the nodes.