8 replies Latest Post - ‏2013-01-23T21:35:34Z by outlaw
lzrycki
29 Posts

Pinned topic Cobol for AIX - Return-Code

‏2012-10-04T19:04:46Z |
Hi people,

Running a program, I get a SIGSEGV cancellation.

About cancellations and return codes, the COBOL for AIX Programming Guide says:

Unrecoverable exception: When a program encounters an unrecoverable exception,
the user return code is set to 128 plus the signal number.

For instance, SIGABRT (data exceptions, etc.) generates the return code 134 = 128 + 6 (the SIGABRT number).

I was surprised when my SIGSEGV also gave me 134, because the SIGSEGV number is 11 and, following the Programming
Guide, the return code should be 128 + 11 = 139.
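
The arithmetic quoted from the Programming Guide can be checked with a small sketch. The helper name `expected_rc` is made up for illustration, and the signal numbers are taken from Python's `signal` module on whatever platform runs it (6 for SIGABRT and 11 for SIGSEGV on AIX and Linux).

```python
import signal

def expected_rc(sig):
    """Return code described by the guide: 128 plus the signal number."""
    return 128 + int(sig)

if __name__ == "__main__":
    # Print the signal name, its number, and the documented return code.
    for sig in (signal.SIGABRT, signal.SIGSEGV):
        print(sig.name, int(sig), expected_rc(sig))
```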

Could someone tell me why the SIGSEGV cancellation gives return code 134 and not 139?

Thanks

Regards

Leonardo

prudebug-19:44:50-PASO 2IWZ995C SIGSEGV signal received while executing code at location 0xf838.
prudebug-19:44:50-PASO 2 Message routine called from offset 0x14c of routine EVS29336.
prudebug-19:44:50-PASO 2IWZ901S Program exits due to severe or critical error.
prudebug-19:44:50-PASO 2-======================================
prudebug-19:44:50-PASO 2-DEBUG IKJRP_RC --->>> 134
prudebug-19:44:50-PASO 2-======================================
  • SystemAdmin
    403 Posts

    Re: Cobol for AIX - Return-Code

    ‏2012-10-12T20:29:20Z  in response to lzrycki
    This sounds like a possible product defect; you should call IBM service and report this problem.

    COBOL is the Language of the Future!
    Tom
    • lzrycki
      29 Posts

      Re: Cobol for AIX - Return-Code

      ‏2012-10-15T20:15:25Z  in response to SystemAdmin
      Thanks Tom.

      We have another problem with return codes. First of all, all my experience has been on the mainframe: z/OS, Assembler, COBOL, CICS, DB2, MQ, etc.

      At the moment I'm working on upgrading from AIX - COBOL 2 - Oracle 10g - TX Series 6 to COBOL 4.1.1.6 - Oracle 11g - TX Series 7, testing our application system in a lab environment (COBOL 4.1.1.6) and comparing it with production (COBOL 2).

      The same program with the same data is cancelled with an overpunch error in COBOL 2 and a data exception in COBOL 4. Overpunch is a synonym for data exception, the classic mainframe 0C7. The issue: in our production environment the return code is 134, but in the lab (COBOL 4) the RC is 1, and this is a very big problem because RC 1 is a common RC handled by all our programs.

      RC 134 is the correct RC.

      Could someone tell me why we receive RC 1 in COBOL 4.1.1.6?

      Regards

      Leonardo

      Production
      IWZ039S An invalid overpunched sign was detected.
      Message routine called from offset 0x34 of routine

      iwzWriteERRmsg called from offset 0xa0 of routine
      _iwzcBCD_CONV_Pckd_To_ZndUS.
      _iwzcBCD_CONV_Pckd_To_ZndUS called from offset 0x298fc of

      CCR00270.
      CCR00270 called from offset 0x2d9e8 of routine CCB00010.
      IWZ901S Program exits due to severe or critical error.
      IKJEFT01 RUNPROGRAM CCB00010 RC:134

      LAB
      <Thread 1> Traceback:
      <Thread 1> Offset 0x00000564 in procedure writeERRmsg
      <Thread 1> Offset 0x00000044 in procedure iwzWriteERRmsg
      <Thread 1> Offset 0x000000c8 in procedure _iwzcBCD_CONV_Pckd_To_ZndUS
      <Thread 1> Offset 0x00033d88 in procedure CCR00270
      <Thread 1> Offset 0x000343fc in procedure CCB00010
      <Thread 1> --- End of call chain ---
      IWZ903S The system detected a data exception.
      IWZ901S Program exits due to severe or critical error.
      IKJEFT01 RUNPROGRAM CCB00010 RC:1
      • lzrycki
        29 Posts

        Re: Cobol for AIX - Return-Code

        ‏2012-10-23T15:34:41Z  in response to lzrycki
        Hi people, sorry for being so persistent with this topic, but I'm very confused about cancellations in our environment (AIX + COBOL 4.1.1.6 + Oracle 11g).

        Digging deeper in our analysis, we found a general problem that can be shown with a very simple COBOL program.

        IDENTIFICATION DIVISION.
        PROGRAM-ID. EVS29336.
        ENVIRONMENT DIVISION.
        DATA DIVISION.
        WORKING-STORAGE SECTION.
        01 USERNAME PIC X VALUE '/'.
        01 CUENTA PIC 9(9).
        *
        * Trick to generate a data exception: the last nibble of
        * X'001000' is 0, which is not a valid packed-decimal sign.
        *
        01 HORA PIC X(3) VALUE X'001000'.
        01 HORAC REDEFINES HORA PIC S9(5) COMP-3.
        ****************************************************
        EXEC SQL
            INCLUDE SQLCA
        END-EXEC.

        PROCEDURE DIVISION.

            COMPUTE CUENTA = HORAC * 10

            EXEC SQL
                CONNECT :USERNAME
            END-EXEC

            STOP RUN.

        Running this version of the program, the cancellation occurs in the COMPUTE statement; the SQL CONNECT comes after the COMPUTE and is never executed. In this case the return code is 134, consistent with the documentation:
        cancellation RC = 128 + 6 (SIGABRT number) = 134
        If we change the order of the statements

        EXEC SQL
        CONNECT :USERNAME
        END-EXEC

        COMPUTE CUENTA = HORAC * 10

        the abend still occurs in the COMPUTE statement, but the return code is 1. In our real programs, once any SQL statement has been executed, no matter which one, an abend at any later point in the program always gives RC 1.

        How is it possible that SQL statements generated by ProCobol, the Oracle precompiler, can influence the RC of an abend that happens later?

        For us this is a big problem, because a lot of our programs handle RC=1 internally; for our scheduler, TWS, RC=1 is a warning.

        Any ideas about this issue?

        Thanks

        Regards

        Leonardo
        • outlaw
          35 Posts

          Re: Cobol for AIX - Return-Code

          ‏2012-10-23T21:31:56Z  in response to lzrycki
          Ah, now we're getting somewhere! I was all set to blame IKJEFT01, but it appears to be an Oracle issue, which you can likely find in your Oracle product documentation.

          Oracle, not surprisingly, appears to be using atexit() handlers to facilitate commit/rollback handling, and setting the final step rc (likely 0==ok/commit, 1==bad/rollback).

          The atexit() hook will be driven after COBOL issues the abort() (or workalike), and COBOL has no control over other products' use of atexit().

          The fact that you get rc=134 if you abort before the SQL call, and rc=1 afterwards supports this hypothesis, in that Oracle doesn't get control, and therefore can't set the atexit() hook until the first call.
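
To make the hypothesis concrete, here is a minimal sketch of an exit hook replacing a program's exit status. It is not Oracle's code: the hook `library_exit_hook`, the `connect` argument that stands in for the first SQL call, and the value 1 are all assumptions made for illustration. It models the "workalike" case where the runtime ends through an exit()-style path, since the standard C abort() does not itself run atexit() handlers.

```python
import subprocess
import sys

# Child program: a hypothetical library registers an exit hook on first use.
# The hook ends the process with its own status, replacing the program's.
CHILD = r"""
import atexit, os, sys

def library_exit_hook():
    os._exit(1)            # hypothetical: rollback done, report status 1

if "connect" in sys.argv:  # stands in for the first SQL call
    atexit.register(library_exit_hook)

sys.exit(134)              # the program itself tries to end with 134
"""

def run_child(*args):
    """Run the child program and return its exit status."""
    return subprocess.run([sys.executable, "-c", CHILD, *args]).returncode

if __name__ == "__main__":
    print("no SQL call:   ", run_child())
    print("after SQL call:", run_child("connect"))
```

The hook can only be registered once the library has been called, which matches the observation that the status changes only when the failure happens after the first SQL statement.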

          These days, there are vastly superior methods of handling most things done in an atexit() hook - mostly thanks to C++ requirements for constructor/destructor, and these techniques alleviate most of the atexit() issues that have long been known.

          However, use of atexit() for commit/rollback still seems to be a common technique for a variety of database products... These products want you to know if the transaction was committed or rolled back - hence the overwriting of the rc value.

          The long and short of it all is that you need to peruse the Oracle documentation to see what its return code conventions are - and likely avoid that set of values from your own programs.

          Richard A Nelson (Rick)
          COBOL Development IBM Silicon Valley Laboratory
          http://www.ibm.com/software/awdtools/cobol/
          • lzrycki
            29 Posts

            Re: Cobol for AIX - Return-Code

            ‏2012-10-24T19:51:01Z  in response to outlaw
            Hi Rick,

            thanks for your explanation. IKJEFT01 comes from the mainframe, because our application system was downsized from mainframe z/OS to ISeries AIX. All the JCL was translated to AIX scripts 7 years ago.

            The Oracle ProCobol documentation says:

            "The contents of the RETURN-CODE special register (for those systems that support it) are unpredictable after any SQL statement or SQLLIB function."

            Because of this, our programs save the return code before executing a SQL statement and restore it afterwards.

            This is the only thing I found in the Oracle documentation about the RC; nowhere does it say that Oracle's routines will intercept a cancellation and set the final RC.

            We had some support discussions with IBM; yesterday they were settled and we opened a PMR covering all our issues.

            Dan K. and/or David G. will work with us; I'll send all your comments.

            Thanks a lot for your time.

            Regards

            Leonardo
            • outlaw
              35 Posts

              Re: Cobol for AIX - Return-Code

              ‏2012-10-24T21:30:31Z  in response to lzrycki
              Oracle libraries have no access to the RETURN-CODE special register, unless their pre-processor adds references to it in their translated COBOL.

              Your saving and restoring of RETURN-CODE will not help with the 134 vs 1 issue :(

              Oracle does document that a program failure will cause an automatic roll-back - and the only way to do that is with one or more of library init/term routines, atexit() hooks, and (percolating) signal handlers.

              It is very likely that this auto-rollback process is changing the 134 to 1.
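
As an illustration of how a signal handler alone could produce this pattern, here is a minimal sketch. It is a guess at the mechanism, not Oracle's implementation: the handler `rollback_and_exit`, the `connect` argument that stands in for the first SQL call, and the exit value 1 are all hypothetical. The abort is simulated by sending SIGABRT to the process itself.

```python
import signal
import subprocess
import sys

# Child program: once "connected", a hypothetical library handler intercepts
# SIGABRT and exits with its own status instead of letting the signal kill it.
CHILD = r"""
import os, signal, sys, time

def rollback_and_exit(signum, frame):
    os._exit(1)                       # hypothetical: roll back, then exit with 1

if "connect" in sys.argv:             # stands in for the first SQL call
    signal.signal(signal.SIGABRT, rollback_and_exit)

os.kill(os.getpid(), signal.SIGABRT)  # stands in for the runtime's abort
time.sleep(1)                         # give the handler a chance to run
"""

def run_child(*args):
    """Return the raw status; a negative value means killed by that signal."""
    return subprocess.run([sys.executable, "-c", CHILD, *args]).returncode

def shell_rc(returncode):
    """Convert a negative signal status to the shell's 128 + signal form."""
    return 128 - returncode if returncode < 0 else returncode

if __name__ == "__main__":
    print("abort before SQL:", shell_rc(run_child()))
    print("abort after SQL: ", shell_rc(run_child("connect")))
```

Without the handler the process dies from SIGABRT and a shell would report 128 + 6; with the handler installed, the same abort ends as an ordinary exit with the handler's status.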

              If that is, indeed, what is happening, and I'll wager heavily that it is, there is nothing IBM can do, and it is something that very likely is documented somewhere in the Oracle documentation, but finding it may not be trivial.

              Interestingly enough, I wonder if IBM's DB2 has this same problem?

              Richard A Nelson (Rick)
              COBOL Development IBM Silicon Valley Laboratory
              http://www.ibm.com/software/awdtools/cobol/
  • outlaw
    35 Posts

    Re: Cobol for AIX - Return-Code

    ‏2012-10-23T21:10:27Z  in response to lzrycki
    You get rc=134 because we captured the SEGV, issued some (hopefully) useful data, then we abort the program - so you wind up with SIGABRT as the final signal.
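
The sequence described above can be sketched as follows. This is only an illustration, not the COBOL runtime's code: the handler `on_segv` and its message are made up, and the fault is simulated by sending SIGSEGV to the process itself rather than by a real invalid memory access.

```python
import signal
import subprocess
import sys

# Child program: catch SIGSEGV, write some diagnostics, then abort, so that
# SIGABRT (not SIGSEGV) is the signal that finally ends the process.
CHILD = r"""
import os, signal, sys, time

def on_segv(signum, frame):
    sys.stderr.write("SIGSEGV signal received\n")  # diagnostics first
    sys.stderr.flush()
    os.abort()                                     # then abort the program

signal.signal(signal.SIGSEGV, on_segv)
os.kill(os.getpid(), signal.SIGSEGV)  # simulate the fault
time.sleep(1)                         # give the handler a chance to run
"""

def run_child():
    """Return the raw status; a negative value means killed by that signal."""
    return subprocess.run([sys.executable, "-c", CHILD],
                          stderr=subprocess.DEVNULL).returncode

def shell_rc(returncode):
    """Convert a negative signal status to the shell's 128 + signal form."""
    return 128 - returncode if returncode < 0 else returncode

if __name__ == "__main__":
    rc = run_child()
    print("raw status:", rc, "shell-style rc:", shell_rc(rc))
```

Because the original signal is handled and only the abort is left to its default action, the status reflects SIGABRT even though the fault began as a SIGSEGV.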

    There are some cases where we might be able to provide more specific values, assuming no cascading of signals, but doing so across the board (thread, non-thread, etc.) quickly becomes a quagmire :(

    The salient information is that an rc > 127 means something went 'orribly wrong.

    If you require a higher level of detail than is currently provided, you'll need to go through channels to get a feature request started.

    Richard A Nelson (Rick)
    COBOL Development IBM Silicon Valley Laboratory
    http://www.ibm.com/software/awdtools/cobol/
    • outlaw
      35 Posts

      Re: Cobol for AIX - Return-Code

      ‏2013-01-23T21:35:34Z  in response to outlaw
      What, if anything, have you found out from the Oracle folks on this issue?

      Richard A Nelson (Rick)
      COBOL Development IBM Silicon Valley Laboratory
      http://www.ibm.com/software/awdtools/cobol/