A fix is available
APAR status
Closed as program error.
Error description
The changes in this APAR change the default behavior and correct the documentation about how the parallel engine (PX) makes changes to datasets.
Operating systems try to avoid slow operations to external storage devices by caching file buffers in memory until the application program closes the file or performs another operation that requires writing permanent data to the file system. If the system crashes while an application is running, the file buffers might not have been written to storage, and open files can be corrupted.
Operating systems have calls that force buffers to be written to storage. The fsync() call writes out the buffers for one specific open file. The sync() call writes out the buffers for all active files.
PX has environment variables that control how often the files used to implement PX datasets are written to storage.
APT_DATASET_FLUSH_NOFSYNC: The parallel engine default is to call fsync() after a dataset file descriptor is changed, including closing, truncating, or deleting the dataset. The fsync() call affects only one dataset, so it does not have broad system performance implications. Users can gain some performance benefit by setting APT_DATASET_FLUSH_NOFSYNC, but datasets will then be vulnerable to job failures or to file or system crashes.
APT_DATASET_FLUSH_SYNC: Setting this environment variable forces a call to sync() after a dataset file descriptor is changed, including closing, truncating, or deleting the dataset. The sync() call is expensive because it affects all open files on the engine tier system. Normally, this environment variable should not be set.
Prior to the fixes for this APAR, the PX default behavior was to call both fsync() and sync() when a dataset file descriptor changed. With this APAR, the default is to call only fsync().
There was an environment variable called APT_DATASET_FLUSH_NOSYNC, which could be set to turn off the sync() call. APT_DATASET_FLUSH_NOSYNC is no longer needed, and it has no effect if it is set.
A Knowledge Center description of APT_DATASET_FLUSH_SYNC has been added and the description for APT_DATASET_FLUSH_NOSYNC has been dropped.
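The difference between the two operating system calls described above can be illustrated with a minimal Python sketch. This is not the parallel engine's implementation. The function names and the file name segment.bin are hypothetical, and os.sync() is assumed to be available (Unix systems).

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data, then force this one file's buffers to storage with fsync()."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's user-space buffer into the OS cache
        os.fsync(f.fileno())  # ask the OS to write this file's cached buffers out


def flush_everything() -> None:
    """Ask the OS to write out cached buffers for all files, system-wide."""
    os.sync()  # affects every file with pending writes, not just ours


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "segment.bin")  # hypothetical file name
        write_durably(target, b"example records")  # cost limited to one file
        flush_everything()                         # cost shared by the whole system
```

The fsync() call is scoped to a single file descriptor, which is why it is the cheaper default. The sync() call has no file argument at all, which is why it can slow down unrelated work on the same system.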
Local fix
Set APT_DATASET_FLUSH_NOSYNC=1
Problem summary
REVISE PARALLEL ENGINE FILE SYSTEM SYNC BEHAVIOR FOR BETTER PERFORMANCE
REPORTED COMPONENT ID: 5724Q36DS
ERROR DESCRIPTION: The changes in this APAR change the default behavior and correct the documentation about how the parallel engine (PX) writes dataset changes to storage. Prior to the fixes for this APAR, the PX default behavior was to call both fsync() and sync() when a dataset file descriptor changed, including closing, truncating, or deleting the dataset. With this APAR, the default is to call only fsync(), which affects a single dataset file. Users can gain some performance benefit by setting APT_DATASET_FLUSH_NOFSYNC, but datasets will then be vulnerable to job failures or to file or system crashes. Setting APT_DATASET_FLUSH_SYNC forces a call to sync(), which is expensive because it affects all open files on the engine tier system; normally, it should not be set.
The environment variable APT_DATASET_FLUSH_NOSYNC, which could previously be set to turn off the sync() call, is no longer needed and has no effect if it is set. A Knowledge Center description of APT_DATASET_FLUSH_SYNC has been added, and the description of APT_DATASET_FLUSH_NOSYNC has been dropped.
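The before and after behavior can be summarized as a small decision function. The following Python sketch is only an illustration of the rules stated in this APAR, not the engine's code. The names is_set and flush_calls are hypothetical. It assumes that a variable counts as set when it has any non-empty value, that APT_DATASET_FLUSH_NOFSYNC suppresses fsync() both before and after the fix, and that the variables act independently of each other.

```python
from typing import List, Mapping


def is_set(env: Mapping[str, str], name: str) -> bool:
    # Assumption: "set" means present with a non-empty value.
    return bool(env.get(name))


def flush_calls(env: Mapping[str, str], with_fix: bool) -> List[str]:
    """Return the flush calls made after a dataset file descriptor changes."""
    calls = []
    # fsync() is the default unless APT_DATASET_FLUSH_NOFSYNC is set
    # (assumed here to apply both before and after the fix).
    if not is_set(env, "APT_DATASET_FLUSH_NOFSYNC"):
        calls.append("fsync")
    if with_fix:
        # After the fix, sync() is opt-in via APT_DATASET_FLUSH_SYNC;
        # APT_DATASET_FLUSH_NOSYNC is ignored.
        if is_set(env, "APT_DATASET_FLUSH_SYNC"):
            calls.append("sync")
    else:
        # Before the fix, sync() was the default unless
        # APT_DATASET_FLUSH_NOSYNC was set.
        if not is_set(env, "APT_DATASET_FLUSH_NOSYNC"):
            calls.append("sync")
    return calls


if __name__ == "__main__":
    print(flush_calls({}, with_fix=False))  # ['fsync', 'sync']
    print(flush_calls({"APT_DATASET_FLUSH_NOSYNC": "1"}, with_fix=False))  # ['fsync']
    print(flush_calls({}, with_fix=True))   # ['fsync']
```

Under these assumptions, the local fix of setting APT_DATASET_FLUSH_NOSYNC=1 on an unpatched system yields the same calls as the new default on a patched system.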
Problem conclusion
Patches are available which implement the changes.
Temporary fix
Comments
APAR Information
APAR number
JR59463
Reported component name
WIS DATASTAGE
Reported component ID
5724Q36DS
Reported release
B50
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-04-23
Closed date
2018-07-27
Last modified date
2018-12-13
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
None SERER
Fix information
Fixed component name
WIS DATASTAGE
Fixed component ID
5724Q36DS
Applicable component levels
R912 PSY
UP
Line of Business
Data and AI
Product
InfoSphere DataStage
Platform
Platform Independent
Version
11.5
Document Information
Modified date:
02 September 2021