IBM Support

JR59463: REVISE PARALLEL ENGINE FILE SYSTEM SYNC BEHAVIOR FOR BETTER PERFORMANCE

APAR status

  • Closed as program error.

Error description

  • This APAR changes the default behavior of the parallel engine
    (PX) when it makes changes to datasets, and corrects the
    related documentation.
    Operating systems try to avoid slow operations to external
    storage devices by caching file buffers in memory until the
    application program closes the file or performs other
    operations which require writing permanent data to the file
    system.
    If there is a system crash while an application is running, the
    file buffers might not be written to storage and open files can
    be corrupted.
    Operating systems have calls which force writing buffers to
    storage:
    The fsync() call writes out the buffers for a specific open
    file.
    The sync() call writes out the buffers for all active files.
    PX has environment variables which control how often the files
    used to implement PX datasets are written to storage.
    APT_DATASET_FLUSH_NOFSYNC
    By default, the parallel engine calls fsync() after a dataset
    file descriptor is changed, including closing, truncating, or
    deleting the dataset.
    The fsync() call only affects one dataset, so it does not have
    broad system performance implications. Users can gain some
    performance benefit by setting APT_DATASET_FLUSH_NOFSYNC, but
    datasets will be vulnerable to job failures or file or system
    crashes.
    APT_DATASET_FLUSH_SYNC
    Setting this environment variable forces a call to sync()
    after a dataset file descriptor is changed, including closing,
    truncating, or deleting the dataset. The sync() call is
    expensive because it affects all open files on the engine tier
    system.
    Normally, this environment variable should not be set.
    Prior to the fixes for this APAR, the PX default behavior was
    to call both fsync() and sync() when a dataset file descriptor
    changed.
    With this APAR, the default is to call only fsync().
    There was an environment variable called
    APT_DATASET_FLUSH_NOSYNC, which could be set to turn off the
    sync() call.
    APT_DATASET_FLUSH_NOSYNC is no longer needed, and it has no
    effect if it is set.
    A Knowledge Center description of APT_DATASET_FLUSH_SYNC has
    been added and the description for APT_DATASET_FLUSH_NOSYNC
    has been dropped.
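
The difference between the two calls described above can be illustrated with a short Python sketch. It is not the parallel engine's code; the function names and the file name are hypothetical, and it assumes a Unix-like system where Python's os.fsync() and os.sync() map onto the operating system calls of the same name.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Write data to path, then flush only that file to storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        total = 0
        while total < len(data):
            # os.write may write fewer bytes than requested, so loop.
            total += os.write(fd, view[total:])
        # fsync() targets this one descriptor; other open files are untouched.
        os.fsync(fd)
    finally:
        os.close(fd)
    return total


def flush_everything() -> bool:
    """Ask the OS to write out buffers for all files (system-wide)."""
    if hasattr(os, "sync"):  # os.sync() is only available on Unix
        os.sync()
        return True
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "segment.bin" is a made-up name, not a real PX dataset file.
        target = os.path.join(tmp, "segment.bin")
        print(write_durably(target, b"example record\n"))
```

The per-file call scales with the amount of buffered data for one file, while the system-wide call has to consider every file with pending writes, which is why the text above treats sync() as the expensive option.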
    

Local fix

  • Set APT_DATASET_FLUSH_NOSYNC=1
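
The behavior before and after the fix, including the effect of this local fix, can be summarized as a small decision model. This is a sketch only: the function flush_calls and its apar_applied parameter are hypothetical, the model treats a variable as "set" whenever it is present in the environment regardless of its value, and it assumes APT_DATASET_FLUSH_NOFSYNC suppresses the fsync() call both before and after the fix. None of these details are confirmed by the APAR text.

```python
from typing import List, Mapping


def flush_calls(env: Mapping[str, str], apar_applied: bool = True) -> List[str]:
    """Return the flush calls made when a dataset file descriptor changes.

    Illustrative model only, not the parallel engine's implementation.
    """
    calls: List[str] = []

    # Assumption: NOFSYNC turns off the per-file fsync() in both versions.
    if "APT_DATASET_FLUSH_NOFSYNC" not in env:
        calls.append("fsync")

    if apar_applied:
        # After the fix, sync() is opt-in through APT_DATASET_FLUSH_SYNC,
        # and APT_DATASET_FLUSH_NOSYNC is ignored.
        if "APT_DATASET_FLUSH_SYNC" in env:
            calls.append("sync")
    else:
        # Before the fix, sync() was on by default and
        # APT_DATASET_FLUSH_NOSYNC turned it off (the local fix).
        if "APT_DATASET_FLUSH_NOSYNC" not in env:
            calls.append("sync")

    return calls


if __name__ == "__main__":
    local_fix = {"APT_DATASET_FLUSH_NOSYNC": "1"}
    print(flush_calls({}, apar_applied=False))
    print(flush_calls(local_fix, apar_applied=False))
    print(flush_calls(local_fix, apar_applied=True))
```

Under this model, an unpatched system with the local fix applied and a patched system with no variables set make the same calls, which matches the statement that APT_DATASET_FLUSH_NOSYNC is no longer needed once the patch is installed.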
    

Problem summary

  • REVISE PARALLEL ENGINE FILE SYSTEM SYNC BEHAVIOR FOR BETTER
    PERFORMANCE
     REPORTED COMPONENT ID
     5724Q36DS
    

Problem conclusion

  • Patches are available which implement the changes.
    

Temporary fix

Comments

APAR Information

  • APAR number

    JR59463

  • Reported component name

    WIS DATASTAGE

  • Reported component ID

    5724Q36DS

  • Reported release

    B50

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-04-23

  • Closed date

    2018-07-27

  • Last modified date

    2018-12-13

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Modules/Macros

  • None
    SERER
    

Fix information

  • Fixed component name

    WIS DATASTAGE

  • Fixed component ID

    5724Q36DS

Applicable component levels

  • R912 PSY

       UP

Product information

  • Line of business

    Data and AI (LOB10)

  • Business unit

    IBM Software w/o TPS (BU059)

  • Product

    InfoSphere DataStage (SSVSEF)

  • Platform

    Platform Independent (PF025)

  • Version

    11.5

Document Information

Modified date:
02 September 2021