Mapping source table data to Hive tables

A Hive table is created automatically each time you run an activity that moves data from a relational database into the Hadoop Distributed File System (HDFS) in InfoSphere® BigInsights®. The data type of each column in the Hive table is assigned automatically, based on the metadata that is detected for the corresponding column in the source database that you specify when you run the activity.

InfoSphere Data Click uses Big SQL to create the Hive table. As shown in the following table, the data types of the columns in the source table are mapped to Big SQL data types for the columns in the Hive table.
Table 1. Mapping the data type of columns in the source table to Big SQL data types for the columns in the Hive table
| ODBC type in the metadata repository | Big SQL data type used for creating the column in Hive |
| --- | --- |
| BIGINT | BIGINT |
| BINARY | BINARY(length). The length parameter is the length of the source column. If a length is not provided in the source, then the data type is BINARY. |
| BIT | BOOLEAN |
| CHAR | CHAR(length). The length parameter is the length of the source column. If a length is not provided in the source, then the data type is CHAR. |
| DATE | VARCHAR(10) |
| DECIMAL | VARCHAR(precision + 2). The precision value specifies the precision for the source column. If the precision parameter is not provided in the source, then the data type is STRING. |
| DOUBLE | DOUBLE |
| FLOAT | FLOAT |
| INTEGER | INT |
| LONGVARBINARY | VARBINARY(32768). Note: LONGVARBINARY is mapped to VARBINARY(32768). Big SQL places a restriction on the length of the VARBINARY column. Because of this restriction, data truncation might occur when you move data from a source table containing a LONGVARBINARY column. This can lead to the activity finishing with warnings. |
| LONGVARCHAR | STRING |
| NUMERIC | NUMERIC(precision, scale). The precision and scale values specify the precision and scale for the source column. If the precision or scale parameters are not provided in the source, then the data type is NUMERIC. |
| REAL | FLOAT |
| SMALLINT | SMALLINT |
| TINYINT | INT |
| TIME | VARCHAR(8) |
| TIMESTAMP | VARCHAR(26) |
| VARBINARY | BINARY(length). The length parameter is the length of the source column. If a length is not provided in the source, then the data type is BINARY. |
| VARCHAR | VARCHAR(length). The length parameter is the length of the source column. If a length is not provided in the source, then the data type is VARCHAR. |
| WCHAR | CHAR(length). The length parameter is the length of the source column. If a length is not provided in the source, then the data type is CHAR. |
| WLONGVARCHAR | STRING |
| WVARCHAR | VARCHAR(length). The length parameter is the length of the source column. If a length is not provided in the source, then the data type is VARCHAR. |
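
The mapping rules in Table 1 can be sketched as a small lookup function, which might be useful for predicting the column types of a Hive table before you run an activity. The following Python sketch is illustrative only: it is not part of InfoSphere Data Click, and the names `big_sql_type`, `FIXED_TYPES`, and `SIZED_TYPES` are hypothetical. It assumes that the length, precision, and scale of the source column are passed as optional integers.

```python
# Hypothetical helper that mirrors Table 1; not an InfoSphere Data Click API.

# ODBC types that always map to the same Big SQL type.
FIXED_TYPES = {
    "BIGINT": "BIGINT",
    "BIT": "BOOLEAN",
    "DATE": "VARCHAR(10)",
    "DOUBLE": "DOUBLE",
    "FLOAT": "FLOAT",
    "INTEGER": "INT",
    "LONGVARBINARY": "VARBINARY(32768)",  # data truncation might occur
    "LONGVARCHAR": "STRING",
    "REAL": "FLOAT",
    "SMALLINT": "SMALLINT",
    "TINYINT": "INT",
    "TIME": "VARCHAR(8)",
    "TIMESTAMP": "VARCHAR(26)",
    "WLONGVARCHAR": "STRING",
}

# ODBC types whose Big SQL type carries the length of the source column.
SIZED_TYPES = {
    "BINARY": "BINARY",
    "VARBINARY": "BINARY",
    "CHAR": "CHAR",
    "WCHAR": "CHAR",
    "VARCHAR": "VARCHAR",
    "WVARCHAR": "VARCHAR",
}


def big_sql_type(odbc_type, length=None, precision=None, scale=None):
    """Return the Big SQL column type that Table 1 lists for an ODBC type."""
    key = odbc_type.upper()
    if key in FIXED_TYPES:
        return FIXED_TYPES[key]
    if key in SIZED_TYPES:
        base = SIZED_TYPES[key]
        # Without a source length, the bare type name is used.
        return f"{base}({length})" if length is not None else base
    if key == "DECIMAL":
        # DECIMAL is stored as text: VARCHAR(precision + 2), or STRING.
        return f"VARCHAR({precision + 2})" if precision is not None else "STRING"
    if key == "NUMERIC":
        if precision is not None and scale is not None:
            return f"NUMERIC({precision}, {scale})"
        return "NUMERIC"
    raise ValueError(f"ODBC type is not listed in Table 1: {odbc_type}")


# Example source columns: (name, ODBC type, column attributes).
source_columns = [
    ("ID", "INTEGER", {}),
    ("NAME", "WVARCHAR", {"length": 40}),
    ("PRICE", "DECIMAL", {"precision": 10}),
]
for name, odbc_type, attrs in source_columns:
    print(name, big_sql_type(odbc_type, **attrs))
# ID INT
# NAME VARCHAR(40)
# PRICE VARCHAR(12)
```

The fixed and length-dependent types are kept in separate dictionaries so that each row of Table 1 appears exactly once. A type that is not in the table raises an error rather than falling back to a guessed type.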