IBM PureData System for Analytics, Version 7.1

Format background

All data is stored as a sequence of bytes and has an associated data type, which is a conceptual or abstract attribute of the data. Without an associated data type, the same byte sequence can be interpreted in many different ways.
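As an illustration of this point, the following minimal Python sketch reads the same four bytes under several assumed data types. The byte values are arbitrary and chosen only for the example; nothing here is specific to the product.

```python
import struct

# Four arbitrary bytes, chosen only for illustration.
raw = bytes([0x41, 0x42, 0x43, 0x44])

# Interpreted as ASCII text.
as_text = raw.decode("ascii")

# Interpreted as an unsigned 32-bit integer, in both byte orders.
as_int_big = int.from_bytes(raw, "big")
as_int_little = int.from_bytes(raw, "little")

# Interpreted as a big-endian IEEE 754 single-precision float.
as_float = struct.unpack(">f", raw)[0]

print(as_text)        # ABCD
print(as_int_big)     # 1094861636
print(as_int_little)  # 1145258561
print(as_float)       # roughly 12.14
```

The bytes never change; only the data type applied to them does, which is why a byte sequence on its own is ambiguous.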

A single data type can be represented in different forms. For example, an integer can be stored in any of several binary formats, or in human-readable text (character) format, typically ASCII. Similarly, dates, times, and other data types have multiple representations that are used by different programs, languages, and environments. At some point, though, the data must be represented in readable form so that users can work with it. Data for loading into the data warehouse is typically presented in either delimited format or fixed-length format, encoded as either ASCII or UTF-8.
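The following Python sketch shows one integer in two binary forms and in text form, and then lays out one record in delimited and fixed-length text. The sample record, the `|` delimiter, and the field widths (6, 10, and 10 characters) are assumptions made for this example, not a layout prescribed by the product.

```python
import struct

value = 1234

# The same integer in three representations.
binary_big = struct.pack(">i", value)     # 4-byte big-endian binary
binary_little = struct.pack("<i", value)  # 4-byte little-endian binary
text_form = str(value).encode("ascii")    # human-readable ASCII digits

print(binary_big)     # b'\x00\x00\x04\xd2'
print(binary_little)  # b'\xd2\x04\x00\x00'
print(text_form)      # b'1234'

# A hypothetical record: an ID, a name, and a date.
record = (42, "Smith", "2014-02-28")

# Delimited format: fields are separated by a delimiter character.
delimited = "|".join(str(field) for field in record)

# Fixed-length format: each field occupies a fixed number of characters.
# The widths used here are assumed for illustration.
fixed = f"{record[0]:>6}{record[1]:<10}{record[2]:<10}"

print(delimited)     # 42|Smith|2014-02-28
print(repr(fixed))   # '    42Smith     2014-02-28'

# Both text forms can be encoded as UTF-8 (identical to ASCII for these characters).
encoded = delimited.encode("utf-8")
```

In delimited format, field boundaries are found by scanning for the delimiter. In fixed-length format, they are found by character position, so every record has the same length.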



Copyright IBM Corporation 2014 | Last updated: 2014-02-28