IBM Support

Error: bad input row from scientific notation

Troubleshooting


Problem

Error when using nzload to load scientific notation data into numeric columns.

Symptom

When using nzload to load scientific notation data into numeric columns, the following error appears:

bad #: input row #(byte offset to last char examined) [field #, declaration] diagnostic, "text consumed"[last char examined]
----------------------------------------------------------------------------------------------------------------------------
1: 1007(136) [187, NUMERIC(17,9)] expected field delimiter or end of record, "3.
62"[E]

Resolving The Problem

Depending on your NPS release level, you can load scientific notation data into a numeric column in one of the following ways (described below):

  • Using nzload with floating point data types (applies to NPS 4.0 and later)
  • Using staging tables to load into character columns
  • Using nzload to load the majority of the data and capture the error rows in a separate file

Using nzload with Floating Point Data Types

On systems running NPS 4.0 or later, nzload can load raw scientific notation data into columns of the floating point data types: REAL and DOUBLE PRECISION as shown in the following example:
    Anonymous(ADMIN)=> create table TABLE_S (col1 int, col2 int, col3 character(10), col4 DOUBLE) distribute on (col1);

    Anonymous(ADMIN)=> \d TABLE_S

    Table "TABLE_S"

    Attribute | Type | Modifier | Default Value

    -----------+------------------+----------+---------------

    COL1 | INTEGER | |

    COL2 | INTEGER | |

    COL3 | CHARACTER(10) | |

    COL4 | DOUBLE PRECISION | |

    Distributed on hash: "COL1"

    Anonymous(ADMIN)=>

    [nz@cs-spubox6 ~]$ more load_scientific.dat

    8|8|test 8|3.62E-7

    8|8|test 8|3.62E-7

    [nz@cs-spubox6 ~]$ nzload -db Anonymous -t TABLE_S -df load_scientific.dat -delim '|'

    Load session of table 'TABLE_S' completed successfully

    [nz@cs-spubox6 ~]$

    Anonymous(ADMIN)=> select * from TABLE_S;

    COL1 | COL2 | COL3 | COL4

    ------+------+------------+----------

    8 | 8 | test 8 | 3.62e-07

    8 | 8 | test 8 | 3.62e-07

    (2 rows)

You can then either put it into another table or select from the DOUBLE column by doing the following:
    Anonymous(ADMIN)=> select col4::numeric(10,10) from TABLE_S;

    ?COLUMN?

    --------------

    0.0000003620

    0.0000003620

    0.0000003620

    0.0000003620

    (4 rows)

    Anonymous(ADMIN)=>

    Anonymous(ADMIN)=> create table TABLE_N (col1 int, col2 numeric(10,10)) distribute on (col1);

    CREATE TABLE

    Anonymous(ADMIN)=> insert into TABLE_N select col1, col4::numeric(10,10) from TABLE_S;

    INSERT 0 4

    Anonymous(ADMIN)=> select * from TABLE_N;

    COL1 | COL2

    ------+--------------

    8 | 0.0000003620

    8 | 0.0000003620

    8 | 0.0000003620

    8 | 0.0000003620

    (4 rows)



Using Staging Tables to Load Scientific Data into Character Columns

On any NPS system, you can use staging tables and multiple INSERT commands to load scientific data into character columns as shown in the following example:
    create table staging_table (col1 varchar(20));
    create table production_table (col1 numeric(17,9));

Instead of using nzload, you can use the following INSERT commands because the staging table column is defined as varchar data which means that all types of data can be loaded into it.
    insert into staging_table values ('0');
    insert into staging_table values ('87654321.987654321');
    insert into staging_table values ('99999999.999999999');
    insert into staging_table values ('11111111.111111111');
    insert into staging_table values ('10000000.000000001');
    insert into staging_table values ('3.62E-7');

Promote the data from the staging table to the production table column-by-column.
    insert into production_table select * from staging_table;

NPS detects if any of the columns are not similarly defined and automatically does the conversion which results in the following statement:
    select * from production_table;

       COL1

       --------------------

       11111111.111111111

       10000000.000000001

       0.000000362

       0.000000000

       99999999.999999999

       87654321.987654321

       (6 rows)
Keep in mind that the column in the staging table is defined as VARCHAR(20) which will allow you to load any data. If you load invalid data into it, then the INSERT statement above will fail and no rows will be moved into the production table because that data cannot be translated to a numeric value.

Use nzload to Load the Majority of the Data and Capture the Error Rows for Loading Separately

The following load method can be performed on any NPS system. It involves moving data from a DOUBLE column into the NUMERIC column and is summarized below:

Perform the nzload directly into the production table with the -maxErrors option set to a decently high number to catch all of the problem rows. All of the rows that use scientific notation will be written to the *.nzbad file. Everything else will be directly loaded into the table.
Move, rename, and save the *.nzbad file.
Load the problem rows into a different staging table where that particular column is defined as DOUBLE rather than a NUMERIC(n,n).
Do the INSERT INTO production_table SELECT * FROM staging_table to promote the small subset of rows that used scientific notation.
When you use nzload to load the data into the staging table, it will throw away any rows that do not have valid data in the DOUBLE column.
In this method, less data has to be loaded and then moved into the final table because the initial nzload operation loaded most of the records without any problems.


[nz@cs-anonymous tmp]$ more scientific.data

3.62E-7

19.50

20.25

21.00

3.62E-7



Loading the rows into a column defined as numeric which will reject some rows..


anonymous(ADMIN)=> create table production_table (col1 numeric(17,9));

CREATE TABLE



nzload -db anonymous -t production_table -maxErrors 1000000000 -df /tmp/scientific.data -bf /tmp/scientific.bad



[nz@cs-anonymous tmp]$ more PRODUCTION_TABLE.anonymous.nzlog


Load started at:18-Aug-09 13:25:57 EDT


Database: anonymous

Tablename: PRODUCTION_TABLE

Datafile: /tmp/scientific.data

Host: cs-anonymous


Load Options


Field delimiter: '\t' File Buffer Size (MB): 8

NULL value: NULL Quoted data: No

Checkpoint: 0 Max errors: 1000000000

Skip records: 0 Max rows: 0

FillRecord: No Truncate String: No

Escape Char: None Accept Control Chars: No

Distribution stats: No Allow CR in string: No


BoolStyle: ONE_ZERO


Date Style: YMD Date Delim: '-'

Time Style: 24 Hour Time Delim: ':'

Number of Restarts: 2 Encoding: Latin9

Ignore Zero: No Require Quotes: No

Time extra zeros: No Load Replay Region (MB): 0


Found bad records


bad #: input row #(byte offset to last char examined) [field #, declaration] diagnostic, "text consumed"[last char examined]

----------------------------------------------------------------------------------------------------------------------------

1: 1(4) [1, NUMERIC(17,9)] expected field delimiter or end of record, "3.62"[E]

2: 5(4) [1, NUMERIC(17,9)] expected field delimiter or end of record, "3.62"[E]


Statistics


number of records read: 5

number of bad records: 2

number of discarded records: 0

-------------------------------------------------

number of records loaded: 3


Elapsed Time (sec): 0.0


-----------------------------------------------------------------------------

Load completed at: 18-Aug-09 13:25:57 EDT

=============================================================================

[nz@cs-anonymous tmp]$



anonymous(ADMIN)=> select * from production_table;

COL1

--------------

20.250000000

21.000000000

19.500000000

(3 rows)




Loading the rejected rows into a column defined as Double from the bad file generated from the previous step.


anonymous(ADMIN)=> create table production_table2 (col1 double);

CREATE TABLE


nzload -db anonymous -t production_table2 -maxErrors 1000000000 -df /tmp/scientific.bad -bf /tmp/scientific.bad2




Now we can move these rows from production_table2 to production_table.




anonymous(ADMIN)=> insert into production_table select * from production_table2;

INSERT 0 2

anonymous(ADMIN)=> select * from production_table;

COL1

--------------

19.500000000

0.000000000

0.000000000

20.250000000

21.000000000

(5 rows)

[{"Product":{"code":"SSULQD","label":"IBM PureData System"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Component":null,"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"1.0.0","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Historical Number

NZ321781

Document Information

Modified date:
17 October 2019

UID

swg21570795