IBM Support

IT27526: ONDBSPACEDOWN ALLOWING A CHECKPOINT WITH I/O ERRORS TO COMPLETE

APAR status

  • Closed as program error.

Error description

  • With ONDBSPACEDOWN set to 2 (wait), the message log can show the
    following sequence:
    
    01:54:22  Assert Warning: I/O write chunk 28, pagenum 20,
    pagecnt 4 --> errno = 2
    01:54:22  IBM Informix Dynamic Server Version 12.10.FC5W1XZ
    01:54:22   Who: Thread(13, flush_sub(1), 6207b1a8, 11)
                    File: rsbuff.c Line: 5725
    01:54:22   Action: Please notify IBM Informix Techical Support.
    01:54:22  stack trace for pid 113092 written to
    /opt/informix/tmp/af.3f5d91d
    01:54:22   See Also: /opt/informix/tmp/af.3f5d91d
    01:54:22  I/O write chunk 28, pagenum 20, pagecnt 4 --> errno =
    2
    01:54:33  Checkpoint Completed:  duration was 70 seconds.
    01:54:33  Sat Nov 18 - loguniq 84375, logpos 0x3926b018,
    timestamp: 0xd2a3ebff Interval: 324187
    
    01:54:33  Maximum server connections 806
    01:54:33  Checkpoint Statistics - Avg. Txn Block Time 0.122, #
    Txns blocked 27, Plog used 130899, Llog used 266693
    
    01:54:33  WARNING: Checkpoint blocked by down space, waiting for
    override or shutdown
    02:00:07  Logical Log 84376 Complete, timestamp: 0xd2a401a4.
    02:00:07  Logical Log 84376 - Backup Started
    
    
    That is, a checkpoint that hits I/O errors while flushing dirty
    buffers to disk is still allowed to complete; only the next
    checkpoint is blocked.
    
    Consequently, after 'onmode -ky' the chunks are not marked down,
    and fast recovery starts from that completed but inconsistent
    checkpoint:
    
    ...
    02:02:59  Logical Recovery Started.
    02:02:59  56 recovery worker threads will be started.
    02:03:04  Fast Recovery Switching to Log 84376
    02:03:05  Fast Recovery Switching to Log 84377
    02:03:06  Logical Recovery has reached the transaction cleanup
    phase.
    02:03:12  Checkpoint Completed:  duration was 6 seconds.
    02:03:12  Sat Nov 18 - loguniq 84376, logpos 0x35e5018,
    timestamp: 0xd2aa771e Interval: 324188
    
    02:03:12  Maximum server connections 0
    02:03:12  Checkpoint Statistics - Avg. Txn Block Time 0.000, #
    Txns blocked 0, Plog used 16455, Llog used 0
    
    02:03:13  Checkpoint Completed:  duration was 0 seconds.
    02:03:13  Sat Nov 18 - loguniq 84377, logpos 0x1018, timestamp:
    0xd2aa774d Interval: 324189
    
    02:03:13  Maximum server connections 0
    02:03:13  Checkpoint Statistics - Avg. Txn Block Time 0.000, #
    Txns blocked 0, Plog used 1900, Llog used 2
    
    02:03:14  Logical Recovery Complete.
              21224 Committed, 21 Rolled Back, 0 Open, 0 Bad Locks
    
    02:03:15  Onconfig parameter RAS_PLOG_SPEED modified from 148485
    to 89735.
    02:03:15  Onconfig parameter RAS_LLOG_SPEED modified from 690 to
    1565.
    02:03:15  Dataskip is now OFF for all dbspaces
    02:03:15  listener-thread: err = -27002: oserr = 0: errstr = :
    No connections are allowed in quiescent mode.
    
    02:03:16  Checkpoint Completed:  duration was 0 seconds.
    02:03:16  Sat Nov 18 - loguniq 84377, logpos 0x290c0, timestamp:
    0xd2aa7fae Interval: 324190
    
    02:03:16  Maximum server connections 0
    02:03:16  Checkpoint Statistics - Avg. Txn Block Time 0.000, #
    Txns blocked 0, Plog used 1562, Llog used 41
    
    02:03:16  On-Line Mode
    
    ... which can be told from the interval number of the first
    checkpoint after fast recovery (324188, directly following the
    flawed checkpoint's 324187).
    
    What seemingly got through fast recovery fine is in fact missing
    at least one page that was never flushed to disk. The instance
    has to be considered inconsistent, and oncheck is likely to find
    corruption.
    In a clustered environment this can also be assumed to be the
    cause of (differing) corruption on an HDR secondary or RSS.
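
The symptom described above (an I/O write error followed by a
"Checkpoint Completed" message) can be searched for in a message log.
The following is a minimal sketch of such a scan, not an IBM tool. It
assumes each log entry sits on one line (the excerpts above are wrapped
for display), and the function name suspect_checkpoints and the regular
expressions are illustrative choices based only on the messages quoted
in this APAR.

```python
import re

# Patterns modeled on the message-log lines quoted in this APAR (assumed format).
IO_WRITE_ERROR = re.compile(r"I/O write chunk (\d+), pagenum \d+")
CHECKPOINT_DONE = re.compile(r"^(\d\d:\d\d:\d\d)\s+Checkpoint Completed")


def suspect_checkpoints(lines):
    """Return (time, [chunk numbers]) for each checkpoint that completed
    after one or more I/O write errors were logged since the previous
    completed checkpoint."""
    suspects = []
    failed_chunks = set()
    for line in lines:
        error = IO_WRITE_ERROR.search(line)
        if error:
            failed_chunks.add(int(error.group(1)))
            continue
        done = CHECKPOINT_DONE.match(line.strip())
        if done:
            if failed_chunks:
                suspects.append((done.group(1), sorted(failed_chunks)))
            # Start a fresh window for the next checkpoint.
            failed_chunks = set()
    return suspects


# Unwrapped lines taken from the log excerpt in this APAR.
sample = [
    "01:54:22  Assert Warning: I/O write chunk 28, pagenum 20, pagecnt 4 --> errno = 2",
    "01:54:22  I/O write chunk 28, pagenum 20, pagecnt 4 --> errno = 2",
    "01:54:33  Checkpoint Completed:  duration was 70 seconds.",
    "01:54:33  WARNING: Checkpoint blocked by down space, waiting for override or shutdown",
    "02:03:12  Checkpoint Completed:  duration was 6 seconds.",
]

print(suspect_checkpoints(sample))  # [('01:54:33', [28])]
```

A hit is only a hint that the instance may have restarted from a flawed
checkpoint; confirming or ruling out corruption still requires oncheck.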
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * Users of IDS 12.10.xC10 and earlier versions.                *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * ONDBSPACEDOWN allowing a checkpoint with I/O errors to       *
    * complete.                                                    *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    

Problem conclusion

  • Fixed in IDS 12.10.xC11.
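
To check whether a given server version falls in the affected range
stated above (12.10.xC10 and earlier), the version string from the
message log can be compared against the fix level. This is a rough
sketch: it assumes version strings shaped like the "12.10.FC5W1XZ"
shown in the log excerpt, and the helper name is_affected is
hypothetical. Releases later than 12.10 are reported as not affected
only because this APAR makes no statement about them.

```python
import re

# Assumed layout: <major>.<minor>.<platform letter>C<fix pack>[suffix]
VERSION = re.compile(r"(\d+)\.(\d+)\.[A-Z]C(\d+)")

# First level that contains the fix, per "Problem conclusion".
FIXED_IN = (12, 10, 11)


def is_affected(version):
    """Return True if the version is below 12.10.xC11."""
    match = VERSION.search(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, fix_pack = (int(part) for part in match.groups())
    return (major, minor, fix_pack) < FIXED_IN


print(is_affected("IBM Informix Dynamic Server Version 12.10.FC5W1XZ"))  # True
print(is_affected("12.10.FC11"))  # False
```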
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT27526

  • Reported component name

    INFORMIX SERVER

  • Reported component ID

    5725A3900

  • Reported release

    C10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-12-24

  • Closed date

    2019-10-08

  • Last modified date

    2019-10-08

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    INFORMIX SERVER

  • Fixed component ID

    5725A3900

Document Information

Modified date:
08 October 2019