SRYER

Pinned topic Converting strings to decimal

‏2012-09-12T12:43:56Z |
Hello

I receive external data in string format and have to convert some decimal numbers represented as strings into real decimal types in COBOL.
In the following example I assume the decimal point to be "." (a dot) but this is not important.

I would like to know how to generically parse the following strings to decimals:

'1'
'0'
'0.1'
'-1'
'-123456789'
'123456789'
'-123456789.987654321'
'123456789.987654321'
'0.1234567891'
'-0.1234567891'

As far as I can tell, NUMVAL does not let me detect whether rounding happens when the number is converted into a decimal, nor whether digits have been truncated on the left-hand side of the decimal point in the case of very large numbers.

Does anyone have a good solution for a general purpose string => decimal parsing which at least gives an error when the number is manipulated during the parse?

Thank you in advance
Updated on 2012-10-13T14:21:00Z at 2012-10-13T14:21:00Z by BillWoodger
  • nico1964

    Re: Converting strings to decimal

    ‏2012-09-12T12:51:34Z  
    Hi,

    I use the following for receiving data from a web service:

    01  WSAA-DIGIT              PIC 9.
    * PIC 9(6) so that 10 ** 5 still fits with five decimal digits
    01  WSAA-DIVISOR            PIC 9(6).
    01  WSAA-INDEX              PIC S9(4) COMP-3.
    01  WRAA-PD                 PIC 9(3)V9(5) EXTERNAL.

    01  LSAA-VALUE.
        03  LSAA-VALUE-LEN      PIC S9(4) BINARY.
        03  LSAA-VALUE-BODY     PIC X(32767).
    01  LSAA-ATTRS              POINTER.
    /

    * We need to convert the character data 1234.56789 to 9(4)v9(5)
        MOVE 0 TO WRAA-PD
        MOVE 1 TO WSAA-DIVISOR
    * Integer portion: accumulate digits up to the decimal point
        PERFORM VARYING WSAA-INDEX FROM 1 BY 1
                UNTIL WSAA-INDEX > LSAA-VALUE-LEN
                   OR LSAA-VALUE-BODY(WSAA-INDEX:1) = '.'
            MOVE LSAA-VALUE-BODY(WSAA-INDEX:1) TO WSAA-DIGIT
            COMPUTE WRAA-PD = WRAA-PD * 10 + WSAA-DIGIT
        END-PERFORM
    * Now the decimal portion, if any
    * (<= so that a single trailing decimal digit is not skipped)
        ADD 1 TO WSAA-INDEX
        IF WSAA-INDEX <= LSAA-VALUE-LEN
            PERFORM VARYING WSAA-INDEX
                    FROM WSAA-INDEX BY 1
                    UNTIL WSAA-INDEX > LSAA-VALUE-LEN
                       OR LSAA-VALUE-BODY(WSAA-INDEX:1) = ' '
                MOVE LSAA-VALUE-BODY(WSAA-INDEX:1) TO WSAA-DIGIT
                COMPUTE WSAA-DIVISOR = WSAA-DIVISOR * 10
                COMPUTE WRAA-PD = WRAA-PD +
                        (WSAA-DIGIT / WSAA-DIVISOR)
            END-PERFORM
        END-IF
        END-IF

    I hope this is helpful for you.
  • nico1964

    Re: Converting strings to decimal

    ‏2012-09-12T13:01:07Z  
    Sorry, but there is one END-IF too many in the code (the last one), because this is a snippet from a running program.
  • SRYER

    Re: Converting strings to decimal

    ‏2012-09-13T06:27:54Z  
    I haven't run your code yet, but does it handle negative numbers? And overflows with too many characters?
  • BillWoodger

    Re: Converting strings to decimal

    ‏2012-09-25T10:56:29Z  
    If your data are as simple as that, this should be OK.

    I used an output field of 8V8, i.e. PIC 9(8)V9(8), so that your sample data exercises the "truncation" checks, and then repeated the tests with the truncated values shortened by one or two digits as necessary so that they fit.

    Gets a warning on the compile, but since it is checking for truncation anyway, shouldn't be a problem. Can be avoided, of course, but since you've not specified the lengths you want to deal with, I can leave that.

    
    01  W-INPUT-STRING.
        05  W-INPUT-SIGN-IF-PRESENT          PIC X.
            88  W-INPUT-HAS-SIGN             VALUE "-".
        05  W-INPUT-STRING-NO-SIGN           PIC X(19).
    01  W-STRING-TO-CONVERT                  PIC X(19).
    01  W-INTEGER-PART                       PIC 9(18).
    01  W-DECIMAL-PART                       PIC X(18).
    01  W-DECIMAL-PART-AS-DECIMAL REDEFINES W-DECIMAL-PART
                                             PIC V9(18).
    01  W-DECIMAL-POINT                      PIC X VALUE ".".
    01  W-VER-OUTPUT-NUMBER                  PIC 9(8)V9(8).
    01  FILLER REDEFINES W-VER-OUTPUT-NUMBER.
        05  W-VER-OUTPUT-NUMBER-INTEGER      PIC 9(8).
        05  W-VER-OUTPUT-NUMBER-DECIMAL      PIC V9(8).
    01  W-FINAL-OUTPUT-NUMBER                PIC S9(8)V9(8).
    01  FILLER                               PIC X.
        88  W-TRUNCATION-OCCURED             VALUE "Y".
        88  W-TRUNCATION-NOT-FOUND           VALUE "N".
    01  W-FORMATTED-OUTPUT-NUMBER            PIC -(7)9.9(8).
    


    
        IF W-INPUT-HAS-SIGN
            MOVE W-INPUT-STRING-NO-SIGN TO W-STRING-TO-CONVERT
        ELSE
            MOVE W-INPUT-STRING      TO W-STRING-TO-CONVERT
        END-IF
        MOVE SPACE                   TO W-DECIMAL-PART
        UNSTRING                        W-STRING-TO-CONVERT
          DELIMITED                  BY W-DECIMAL-POINT
            OR                          SPACE
          INTO                          W-INTEGER-PART
                                        W-DECIMAL-PART
        INSPECT W-DECIMAL-PART
          REPLACING                  ALL SPACE
            BY                          ZERO
        DISPLAY "INPUT*" W-INPUT-STRING "*"
        DISPLAY "*" W-INTEGER-PART "*" "*" W-DECIMAL-PART "*"
        IF W-INPUT-HAS-SIGN
            DISPLAY "NEGATIVE"
        END-IF
        COMPUTE W-VER-OUTPUT-NUMBER  = W-INTEGER-PART
                                     + W-DECIMAL-PART-AS-DECIMAL
        SET W-TRUNCATION-NOT-FOUND   TO TRUE
        IF W-VER-OUTPUT-NUMBER-INTEGER NOT EQUAL TO W-INTEGER-PART
            DISPLAY "INTEGER TRUNCATION"
            SET W-TRUNCATION-OCCURED TO TRUE
        END-IF
        IF W-VER-OUTPUT-NUMBER-DECIMAL
             NOT EQUAL TO W-DECIMAL-PART-AS-DECIMAL
            DISPLAY "DECIMAL TRUNCATION"
            SET W-TRUNCATION-OCCURED TO TRUE
        END-IF
        IF W-TRUNCATION-OCCURED
            MOVE ZERO                TO W-FINAL-OUTPUT-NUMBER
        ELSE
            MOVE W-VER-OUTPUT-NUMBER TO W-FINAL-OUTPUT-NUMBER
        END-IF
        IF W-INPUT-HAS-SIGN
            SUBTRACT W-FINAL-OUTPUT-NUMBER FROM ZERO
              GIVING W-FINAL-OUTPUT-NUMBER
        END-IF
        MOVE W-FINAL-OUTPUT-NUMBER   TO W-FORMATTED-OUTPUT-NUMBER
        DISPLAY "*" W-FORMATTED-OUTPUT-NUMBER "*"
        DISPLAY " "
        .
    


    Output is:

    
    INPUT*-1.1                *
    *000000000000000001**100000000000000000*
    NEGATIVE
    *      -1.10000000*

    INPUT*1                   *
    *000000000000000001**000000000000000000*
    *       1.00000000*

    INPUT*0                   *
    *000000000000000000**000000000000000000*
    *       0.00000000*

    INPUT*0.1                 *
    *000000000000000000**100000000000000000*
    *       0.10000000*

    INPUT*-1                  *
    *000000000000000001**000000000000000000*
    NEGATIVE
    *      -1.00000000*

    INPUT*-123456789          *
    *000000000123456789**000000000000000000*
    NEGATIVE
    INTEGER TRUNCATION
    *       0.00000000*

    INPUT*123456789           *
    *000000000123456789**000000000000000000*
    INTEGER TRUNCATION
    *       0.00000000*

    INPUT*-123456789.987654321*
    *000000000123456789**987654321000000000*
    NEGATIVE
    INTEGER TRUNCATION
    DECIMAL TRUNCATION
    *       0.00000000*

    INPUT*123456789.987654321 *
    *000000000123456789**987654321000000000*
    INTEGER TRUNCATION
    DECIMAL TRUNCATION
    *       0.00000000*

    INPUT*0.1234567891        *
    *000000000000000000**123456789100000000*
    DECIMAL TRUNCATION
    *       0.00000000*

    INPUT*-0.1234567891       *
    *000000000000000000**123456789100000000*
    NEGATIVE
    DECIMAL TRUNCATION
    *       0.00000000*

    INPUT*-12345678           *
    *000000000012345678**000000000000000000*
    NEGATIVE
    *-2345678.00000000*

    INPUT*12345678            *
    *000000000012345678**000000000000000000*
    * 2345678.00000000*

    INPUT*-12345678.87654321  *
    *000000000012345678**876543210000000000*
    NEGATIVE
    *-2345678.87654321*

    INPUT*12345678.87654321   *
    *000000000012345678**876543210000000000*
    * 2345678.87654321*

    INPUT*0.12345678          *
    *000000000000000000**123456780000000000*
    *       0.12345678*

    INPUT*-0.12345678         *
    *000000000000000000**123456780000000000*
    NEGATIVE
    *      -0.12345678*
    
  • SystemAdmin

    Re: Converting strings to decimal

    ‏2012-10-12T20:42:19Z  
    FYI, NUMVAL does not do rounding or truncation, but the MOVE or IF statement that it is embedded in might. If you store the result of NUMVAL in a sufficiently large field, you should be safe.
    I.e., something like this:

    77 BIG-DEC PIC S9(25)V9(6) PACKED-DECIMAL.

    . . .

    COMPUTE BIG-DEC = FUNCTION NUMVAL(input)

    This seems so much simpler!

    COBOL is the Language of the Future!
    Tom
  • BillWoodger

    Re: Converting strings to decimal

    ‏2012-10-12T23:47:25Z  
    NUMVAL and NUMVAL-C don't do rounding or truncation. I assumed there had been observation of "something" which was being interpreted, erroneously, as such.

    The "something" would be this, I assumed, from the Language Reference:

    "The returned value is a floating-point approximation of the numeric value represented by argument-1. The precision of the returned value depends on the setting of the ARITH compiler option. For details, see Converting to number (NUMVAL, NUMVAL-C) in the Enterprise COBOL Programming Guide."

    The Programming Guide:

    "The arguments must not exceed 18 digits when you compile with the default option ARITH(COMPAT) (compatibility mode) nor 31 digits when you compile with ARITH(EXTEND) (extended mode), not including the editing symbols.

    NUMVAL and NUMVAL-C return long (64-bit) floating-point values in compatibility mode, and return extended-precision (128-bit) floating-point values in extended mode. A reference to either of these functions represents a reference to a numeric data item.

    At most 15 decimal digits can be converted accurately to long-precision floating point (as described in the related reference below about conversions and precision). If the argument to NUMVAL or NUMVAL-C has more than 15 digits, it is recommended that you specify the ARITH(EXTEND) compiler option so that an extended-precision function result that can accurately represent the value of the argument is returned."

    Going to "S9(25)V9(6)" necessitates ARITH(EXTEND) anyway, so sufficient (in it being exact, as I read it) precision is gained.

    The second possible element of "truncation" is of course the size of the "receiving" field, as you point out.

    However, extending the size of the receiving field from NUMVAL then requires some further code to identify whether truncation will/has occurred when placed in its final destination field.
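    As a rough illustration of what that "further code" might look like, here is a minimal sketch. All names and field sizes (WS-INPUT, WS-WIDE, WS-TARGET, WS-STATUS) are hypothetical, chosen only for the example; the wide field is kept at 18 digits so that ARITH(COMPAT) is enough. It assumes the NUMVAL result is exact for the few digits involved, which, per the Programming Guide text quoted above, is only to be relied on up to 15 digits in compatibility mode.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NUMCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names and sizes, for illustration only.
       01  WS-INPUT             PIC X(20) VALUE "123456789.5".
       01  WS-WIDE              PIC S9(12)V9(6) PACKED-DECIMAL.
       01  WS-TARGET            PIC S9(8)V9(2)  PACKED-DECIMAL.
       01  WS-STATUS            PIC X(20) VALUE "OK".
       PROCEDURE DIVISION.
      * Step 1: convert into a field wider than the final target.
           COMPUTE WS-WIDE = FUNCTION NUMVAL(WS-INPUT)
      * Step 2: ON SIZE ERROR traps the loss of high-order digits.
           COMPUTE WS-TARGET = WS-WIDE
               ON SIZE ERROR
                   MOVE "INTEGER TRUNCATION" TO WS-STATUS
               NOT ON SIZE ERROR
      * Step 3: low-order digits are dropped silently, so compare.
                   IF WS-TARGET NOT = WS-WIDE
                       MOVE "DECIMAL TRUNCATION" TO WS-STATUS
                   END-IF
           END-COMPUTE
           DISPLAY WS-STATUS
           GOBACK.
```

    With the sample value, step 2 would be expected to report integer truncation, since nine integer digits do not fit in PIC S9(8)V9(2). Note that this still leans on NUMVAL and its floating-point result, so it does not remove the precision and ARITH considerations discussed here.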

    If the NUMVAL is in an IF and the value would be truncated in its final field, you have to re-work that as well.

    On top of that, changing compile options which affect code-generation is non-trivial. For a start, ARITH may be "locked" to COMPAT. Even if not, I'd not care for a "mix" of modules, some with COMPAT and some with EXTEND, so to change one to EXTEND implies, to me, changing all within that particular system. Not necessarily something a manager will "buy" just for "simplicity".

    So, I chose a 16-digit final destination, but with a way to identify truncation either before or after the decimal point, without losing any of the value from the input "string" and which wouldn't have any chance of losing precision. With no need to even consider which sub-option is used for ARITH.

    The data shown was very simple. I felt it deserved a simple solution with no other "impacts". OK, yes, if the existing NUMVAL is in an IF, that would again need to be changed. Personally I feel an IF is not "simple" if it contains a FUNCTION anyway.

    Whilst we are here, there is the not-mentioned-this-time problem with NUMVAL. Feed it something bad and the world collapses. With "string" numerics often coming from external sources, I always feel the need to "verify" (we used to call it "edit") them anyway. By the time you know it is OK for NUMVAL, you have the means to "do" the "numval" yourself :-)
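    To make the "verify" step concrete, here is a minimal sketch of such an edit check, under assumptions of my own: the accepted format is an optional leading "-", digits with at most one "." as the decimal point, and trailing spaces only. The names (WS-INPUT, WS-IX, WS-DIGITS, WS-POINTS and the flags) are hypothetical and not taken from any code in this thread.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NUMEDIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names; the accepted format is an assumption.
       01  WS-INPUT             PIC X(20) VALUE "-12.5X".
       01  WS-IX                PIC S9(4) BINARY.
       01  WS-DIGITS            PIC S9(4) BINARY VALUE 0.
       01  WS-POINTS            PIC S9(4) BINARY VALUE 0.
       01  WS-VALID-FLAG        PIC X VALUE "Y".
           88  WS-VALID               VALUE "Y".
           88  WS-INVALID             VALUE "N".
       01  WS-SPACE-FLAG        PIC X VALUE "N".
           88  WS-SEEN-SPACE          VALUE "Y".
       PROCEDURE DIVISION.
           PERFORM VARYING WS-IX FROM 1 BY 1
                   UNTIL WS-IX > FUNCTION LENGTH(WS-INPUT)
               EVALUATE TRUE
      * After the first space, only spaces may follow.
                   WHEN WS-SEEN-SPACE
                       IF WS-INPUT(WS-IX:1) NOT = SPACE
                           SET WS-INVALID TO TRUE
                       END-IF
                   WHEN WS-INPUT(WS-IX:1) IS NUMERIC
                       ADD 1 TO WS-DIGITS
                   WHEN WS-INPUT(WS-IX:1) = "."
                       ADD 1 TO WS-POINTS
      * A minus sign is accepted in the first position only.
                   WHEN WS-INPUT(WS-IX:1) = "-" AND WS-IX = 1
                       CONTINUE
                   WHEN WS-INPUT(WS-IX:1) = SPACE
                       SET WS-SEEN-SPACE TO TRUE
                   WHEN OTHER
                       SET WS-INVALID TO TRUE
               END-EVALUATE
           END-PERFORM
      * At least one digit and at most one decimal point.
           IF WS-DIGITS = 0 OR WS-POINTS > 1
               SET WS-INVALID TO TRUE
           END-IF
           IF WS-VALID
               DISPLAY "VALID"
           ELSE
               DISPLAY "REJECTED"
           END-IF
           GOBACK.
```

    The sample value contains an "X", so it falls into WHEN OTHER and is rejected. Having counted the digits and located the decimal point, the routine already holds most of what it needs to build the number itself, which is the point being made above.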

    Maybe you have something in the near future for us on that? If you do, I guess you'll let us know at the time that you want to. I hope it isn't "use another FUNCTION to let us know it is OK to use NUMVAL which will do everything again that the first FUNCTION already did plus actually giving the result".

    A simple solution to the simple data presented, which needs pay no heed to floating-point precision nor the setting of ARITH, is available. If to be needed in more than one place, stick it in a CALLed module. Like we've always done. Gosh. It's a bit like a "user function". Few parameters, and it can do all the possible truncation checking itself, and whatever else we may reasonably feel needed. It becomes a "simple" CALL without the NUMVAL "baggage" of precision and ARITH and with whatever results the designer of the module feels are needed, including identifying an invalid format without causing a disturbance.
  • lbjerges

    Re: Converting strings to decimal

    ‏2012-10-13T14:05:32Z  
    I would just like to comment on "Feed it something bad and the world collapses". You could actually write a condition handler to trap that type of event and do a softer termination.
  • BillWoodger

    Re: Converting strings to decimal

    ‏2012-10-13T14:21:00Z  
    Yes, if I wanted to do that I've "Got COBOL?...".

    But I don't want to do that for this. I want to process input numbers from an external source, reject but log those whose format is "bad", and process (or perhaps reject, depending) the rest. I don't want to give NUMVAL bad data if it is going to fail. The simplest way to do that for now seems to be not to use NUMVAL unless you can rely on the quality of the data.

    Catching an abend in LE each time I get bad data into NUMVAL is not what I want to do.