Data types in Db2 Big SQL

The smallest unit of data that can be manipulated in SQL is called a value. Values are interpreted according to the data type of their source.

Db2 Big SQL uses HCatalog (and thus the Hive Metastore) as its underlying data representation and access method. Therefore, there are several important distinctions about Db2 Big SQL data types that must be understood.

  • Some data types that are provided by Db2 Big SQL are data types that are not available in Hive. However, when physically stored or retrieved from the underlying data source, these types are always represented as a data type that Hive does understand. These data types are referred to as extended data types.
  • The underlying storage format for the data dictates the final encoding for a value. For example, an integer that is stored in a delimited text file can require a varying number of bytes to represent. The size of a type (in bytes) in this document is intended to help to describe the range of values that the type can represent. The size of a type does not necessarily represent the amount of storage that is required to store the value.
  • In some cases, data types that Hive represents are not yet fully supported by Db2 Big SQL. Some of these types result in an error when you try to access a column of that type. In other cases, the Hive type is mapped to the nearest Db2 Big SQL data type. For example, the Hive TINYINT data type is supported. However, it is promoted to a SMALLINT upon retrieval from Hive and treated as a SMALLINT throughout the SQL statement that accesses the column.

For data type and file format performance considerations see Recommended file formats for data types.