Data types
The following sections describe the various data types and their valid values:
Value formats
This section describes how data values are parsed and formatted, depending on their type.
Parsing values from JSON
The following table gives examples of valid values for the supported data types. Messages with invalid values are skipped and reported in the Notifications pane of the Metrics page. This data type handling applies to all Source operators that output data in JSON format, for example, HTTP, Kafka, MQTT, and Watson IoT.
For valid output values in Code operators, see section Expected Code operator output values.
| Flow column data types | Valid value examples | The default when value is missing |
|---|---|---|
| Text | “Hello World” | ”” (empty string) |
| Number | 12 -7 3.14 “12” “-7” “3.14” | 0 (zero) |
| Date | “2018-03-26T11:38:47” “2018 03 26 11:38:47” “2018-03-26T11:38:47.123456” For more information, see section Dates. | No default value because a missing Date value is not valid. Messages with missing or empty Date values are skipped and reported in the Notifications pane of the Metrics page. |
| Boolean | true false “true” “false” Note: These values are the only valid Boolean values, and they are case-sensitive. | false |
| Binary | Not supported in JSON parsing. You can ingest binary data as raw data by selecting “None” as the parsing option, where available. | N/A |
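The parsing rules in the table above can be approximated in plain Python. The following is a minimal sketch, not the runtime's actual parser: the schema, the names `SCHEMA`, `DEFAULTS`, `coerce`, and `parse_message` are hypothetical, a value counts as "missing" only when its key is absent, and only strings are assumed to be valid Text values. Date handling is left out because it is described in its own section.

```python
import json

# Hypothetical schema for illustration: column name -> flow column data type.
SCHEMA = {"name": "Text", "reading": "Number", "active": "Boolean"}

# Defaults used when a value is missing, taken from the table above.
DEFAULTS = {"Text": "", "Number": 0, "Boolean": False}


def coerce(value, column_type):
    """Coerce one JSON value to its column type; raise ValueError if invalid."""
    if column_type == "Text":
        # Assumption: only JSON strings are valid Text values.
        if isinstance(value, str):
            return value
    elif column_type == "Number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return value
        if isinstance(value, str):
            return float(value)  # raises ValueError for non-numeric text
    elif column_type == "Boolean":
        if isinstance(value, bool):
            return value
        if value in ("true", "false"):  # case-sensitive, as the table notes
            return value == "true"
    raise ValueError(f"invalid {column_type} value: {value!r}")


def parse_message(raw):
    """Return a dict of column values, or None if the message would be skipped."""
    data = json.loads(raw)
    row = {}
    for column, column_type in SCHEMA.items():
        if column not in data:
            row[column] = DEFAULTS[column_type]
            continue
        try:
            row[column] = coerce(data[column], column_type)
        except ValueError:
            return None  # stands in for "skipped and reported"
    return row
```

In this sketch, a message such as `{"active": "True"}` is rejected because Boolean strings are case-sensitive, while an empty message `{}` yields the default for every column.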
Serializing values to JSON and CSV
Serialized values are enclosed in quotation marks (“) only for the Text and Date data types. The following table shows examples:
| Flow column data types | Output value examples |
|---|---|
| Text | “Hello World” |
| Number | 12 -7 3.14 |
| Date (*) | “2018-03-26T15:38:47” (with no split seconds) “2018-03-26T15:38:47.123456” |
| Boolean | true false |
| Binary | Not supported in serializing. You can write binary data by selecting “None” as the format option, where available. When you select “None”, exactly one event attribute must enter the target operator. |
(*) For more information about inserting Date values into schema-less or schema-based Target operators, see Formatting Date output for Target operators.
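The quoting rule above matches what Python's standard `json` module produces once Date values are converted to text. The following is a rough sketch under that assumption; `to_json_value` and `serialize` are hypothetical helper names, and the built-in serializer of a Target operator may differ in detail.

```python
import json
from datetime import datetime


def to_json_value(value):
    """Convert a Date to its text form; leave every other value unchanged."""
    if isinstance(value, datetime):
        # isoformat() omits the fractional part when microsecond == 0.
        return value.isoformat()
    return value


def serialize(event):
    """Serialize an event dictionary to a JSON string."""
    return json.dumps({key: to_json_value(val) for key, val in event.items()})
```

With this sketch, Text and Date values come out quoted, while Number and Boolean values are written bare (`3.14`, `true`).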
Input and output values in Code operators
Data values at Code operator input
The following table describes the data type mapping between flow columns and Python dictionary values at Code operator inputs. It applies to the Code operator in Sources and Targets, and to the Python Model operator in Processing and Analytics.
| Flow column data types | Mapped to Python dictionary value data type | Python value examples |
|---|---|---|
| Text | str | “Hello World” |
| Number | float | -2.5 3.14 |
| Number | int | -5 451 |
| Date | datetime.datetime | datetime.datetime object, for example datetime.datetime(2018, 3, 27, 10, 42, 0, 892960) |
| Boolean | bool | True False |
| Binary | memoryview | <memory at 0x103efc708> |
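When debugging a Code operator, it can help to check that an input value has the Python type that the mapping above predicts. The following is a small sketch; `EXPECTED_TYPES` and `matches_column_type` are hypothetical names, and how the check is wired into an operator is not shown.

```python
from datetime import datetime

# Python types expected at Code operator input, per the table above.
EXPECTED_TYPES = {
    "Text": (str,),
    "Number": (int, float),
    "Date": (datetime,),
    "Boolean": (bool,),
    "Binary": (memoryview,),
}


def matches_column_type(value, column_type):
    """Return True if value has a Python type listed for the column type."""
    # bool is a subclass of int, so reject it for Number explicitly.
    if column_type == "Number" and isinstance(value, bool):
        return False
    return isinstance(value, EXPECTED_TYPES[column_type])
```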
Expected Code operator output values
This section applies to the Code operator in Sources, and to the Code operator and the Python Model operator in Processing and Analytics.
The following table gives examples of valid output values, by data type. Outputs with invalid values are skipped and reported in the Notifications pane of the Metrics page.
| Flow column data types | Output value data types | Valid output value examples | Default value |
|---|---|---|---|
| Text | str. Any non-string values are auto-converted to strings. | “Hello World” “78” Also valid: 78 (converts to “78”), True (converts to “True”) | ”” (empty string) |
| Number | int | 0 -4 451 | 0 |
| Number | float | -3.14 0.0 17.9 | 0.0 |
| Date | datetime.datetime | datetime.datetime object, for example: datetime.datetime.now(), datetime.datetime(2018, 3, 27, 10, 42, 0, 892960), datetime.datetime.strptime("2018-03-27T07:58:35", '%Y-%m-%dT%H:%M:%S') | datetime.datetime(1970, 1, 1, 0, 0, 0) |
| Boolean | bool. Any non-bool value except 0, None, empty strings, and empty objects is auto-converted to True. | True False The following values are also accepted: “true” (converts to True), “False” (converts to True) (*), “go” (converts to True), 451 (converts to True), ”” (converts to False), “0” (converts to False) | False |
| Binary | bytes memoryview | bytes.fromhex('48 49') memoryview(b'HI') | Empty binary with no elements |
(*) False is a Boolean literal, but "False" is a string literal. Any non-empty string is interpreted as True, whether it’s “True”, “False”, or “yes sir”.
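Because a string such as “False” is auto-converted to True, it is usually safer to convert text to a real bool before writing it to a Boolean output column. The following is one possible sketch; `to_bool` and the word lists are hypothetical and should be adapted to the values that your data actually contains.

```python
# Hypothetical word lists; extend them to match your own data.
TRUE_WORDS = {"true", "yes", "1"}
FALSE_WORDS = {"false", "no", "0", ""}


def to_bool(value):
    """Convert a value to bool explicitly instead of relying on auto-conversion."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        word = value.strip().lower()
        if word in TRUE_WORDS:
            return True
        if word in FALSE_WORDS:
            return False
        # Refuse to guess for unknown text such as "go".
        raise ValueError(f"cannot interpret {value!r} as a Boolean")
    # Numbers, None, and containers follow Python truthiness.
    return bool(value)
```

With this helper, `to_bool("False")` returns False, which is what most readers of the data would expect.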
When your output dictionary has no entry for an output column, or when its value is None, then the streaming runtime acts in the following way:
- If a same-name input column with the same column data type exists, its value is used (the value is passed through).
- Otherwise, the default value for the output column’s data type is used.
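The fallback rule for missing or None output values can be sketched as follows. This only illustrates the described behavior and is not the runtime's code; `resolve_output`, `DEFAULTS`, and the schema dictionaries are hypothetical, and the Number default is shown as the int 0 for simplicity.

```python
from datetime import datetime

# Default values per column data type, taken from the table above.
DEFAULTS = {
    "Text": "",
    "Number": 0,
    "Date": datetime(1970, 1, 1, 0, 0, 0),
    "Boolean": False,
    "Binary": b"",
}


def resolve_output(column, column_type, output, input_event, input_schema):
    """Pick the value that ends up in an output column.

    input_schema maps input column names to their column data types.
    """
    value = output.get(column)
    if value is not None:
        return value
    # Pass through a same-name input column if its data type matches.
    if input_schema.get(column) == column_type and column in input_event:
        return input_event[column]
    return DEFAULTS[column_type]
```

For example, if the output dictionary omits a Text column `city` and the input has a Text column `city`, the input value is passed through; if the input column were of type Number instead, the empty string would be used.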
Numbers
- When Number column values are parsed from JSON messages of Source operators such as Event Streams or Watson IoT, the values are always mapped to float values in input dictionaries.
- When Number column values come from other operators, they can be mapped to float or int values, depending on the operator logic. For example, the count function in the Aggregation operator produces values of type int.
- In the Code operator, you can output either int or float values for output columns of type Number.
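Because a Number can arrive as either float or int, code that needs an integer (for example, to use as an index or a counter) should not assume one of the two. The following is a minimal sketch; `as_int` is a hypothetical helper name.

```python
def as_int(value):
    """Return a Number value as int, accepting int or whole-valued float input."""
    if isinstance(value, bool):
        raise TypeError("Boolean values are not Numbers")
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer():
        # JSON-parsed Numbers arrive as float, for example 12.0.
        return int(value)
    raise ValueError(f"{value!r} is not a whole number")
```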
Dates
The following traits apply to Date values:
- Date values express a point in time, and as such, include date and time information similar to a UNIX timestamp.
- Split-second precision of Date values is up to microseconds (up to 6 digits after the decimal point).
- Date values do not carry time zone information.
Parsing dates from Source operator messages
- When parsing data that comes from Source operators such as IBM Event Streams or IBM Watson IoT, a streams flow assumes that all Date values are given in ISO 8601 format. The streams flow normalizes the date values internally to Coordinated Universal Time (UTC).
- Separator characters other than -, T, and : are also accepted. For example, the date might use a space as a separator: 2018-03-26 15:38:47.123456
- This parsing is applicable to all Source operators except Code (in Sources).
The following table gives examples of supported formats:
| Format example | Comments | Internal time representation |
|---|---|---|
| 2018-03-26T11:38:47 | No time zone. Up to full seconds. | 11:38:47 (*) |
| 2018-03-26T11:38:47.123456 | No time zone. With microseconds. | 11:38:47.123456 (*) |
| 2018-03-26T11:38:47.123 | No time zone. With split seconds other than microseconds. | 11:38:47.123 (*) |
| 2018-03-26T11:38:47+01:00 | With time zone. The time zone is UTC+1, so it is normalized to UTC. | 10:38:47 The time in UTC when it’s 11:38:47 in a time zone that’s UTC+01:00 |
| 2018-03-26T11:38:47-01:00 | With time zone. The time zone is UTC-1, so it is normalized to UTC. | 12:38:47 The time in UTC when it’s 11:38:47 in a time zone that’s UTC-01:00 |
| 2018-03-26T11:38:47.123456-05:00 | With time zone and microseconds. The time zone is UTC-5, so it is normalized to UTC. | 16:38:47.123456 The time in UTC when it’s 11:38:47 in a time zone that’s UTC-05:00 |
(*) When no time zone is indicated, the time is assumed to be given as UTC.
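The normalization shown in the table can be reproduced with Python's standard library. The following is a sketch of the same idea, not the parser that Source operators use; `normalize_to_utc` is a hypothetical name, and `datetime.fromisoformat` accepts fewer variants than the built-in parsing described above.

```python
from datetime import datetime, timezone


def normalize_to_utc(text):
    """Parse an ISO 8601 string and return a datetime without tzinfo, in UTC."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # No offset given: the time is assumed to be UTC already.
        return parsed
    # Shift to UTC, then drop tzinfo because Date values carry no time zone.
    return parsed.astimezone(timezone.utc).replace(tzinfo=None)
```

For instance, `normalize_to_utc("2018-03-26T11:38:47+01:00")` yields 10:38:47 on the same day, which matches the corresponding table row.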
Tip
When you process dates from multiple time zones, the textual representation of each date in the source messages should ideally include its time zone offset. If it does not, you can use a Code operator to add the time zone information that is not explicit in the source. You can also use a Code operator to parse formats other than ISO 8601.
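As an illustration of this tip, the following sketch parses a non-ISO date string, attaches a known source offset, and converts the result to UTC. The format string, the UTC-05:00 offset, and the names `SOURCE_OFFSET` and `local_text_to_utc` are assumptions made for the example. A fixed offset does not follow daylight saving time; for that, a full time zone database would be needed.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical: the sender is known to use local time at UTC-05:00.
SOURCE_OFFSET = timezone(timedelta(hours=-5))


def local_text_to_utc(text, fmt="%d/%m/%Y %H:%M:%S", offset=SOURCE_OFFSET):
    """Parse a local, non-ISO date string and return it as a UTC datetime."""
    local = datetime.strptime(text, fmt).replace(tzinfo=offset)
    # Convert to UTC and drop tzinfo, since Date values carry no time zone.
    return local.astimezone(timezone.utc).replace(tzinfo=None)
```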
Formatting Date output for Target operators
- Outputting Date values to schema-less Target operators, such as Kafka or Redis

  The values are formatted according to ISO 8601:

  - 2018-03-26T11:38:47Z (value has no split seconds)
  - 2018-03-26T11:38:47.123400Z (value contains split seconds)
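The two output shapes for schema-less targets can be reproduced with `strftime`. The following is a sketch that mirrors the examples above; `format_date` is a hypothetical name, and the value is assumed to be a datetime without tzinfo that already represents UTC.

```python
from datetime import datetime


def format_date(value):
    """Format a UTC datetime as ISO 8601 with a trailing Z."""
    if value.microsecond:
        # %f always writes six digits, for example 123400.
        return value.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    return value.strftime("%Y-%m-%dT%H:%M:%SZ")
```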
Binary data
- For ingesting Binary data from source operators that output a single data attribute, select the parsing option “None”. This will stream the data as raw data, without the built-in parsing. The following table gives examples for such operators:

  | Source operator | Data attribute name |
  |---|---|
  | Kafka | event_message |
  | HTTP | http_body |

- For writing Binary data in target operators that usually need a format setting for serializing the incoming data, set the format option to “None” and make sure this operator’s output schema has exactly one attribute. This will pass the data as raw data, without the built-in serialization.
- In operators that receive or output structured data as typed attributes, the attributes can be of type Binary. Examples of such operators are Code and Streams.
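In a Code operator, Binary input arrives as a memoryview, which has no `decode()` method, and Binary output can be bytes or memoryview. The following sketch shows one way to move between raw bytes and text; the helper names and the UTF-8 encoding are assumptions for the example.

```python
def binary_to_text(payload, encoding="utf-8"):
    """Decode a Binary value (memoryview or bytes) to a string."""
    # bytes() copies the buffer; memoryview itself cannot be decoded.
    return bytes(payload).decode(encoding)


def text_to_binary(text, encoding="utf-8"):
    """Encode a string to bytes, suitable for a Binary output column."""
    return text.encode(encoding)
```

For example, `binary_to_text(memoryview(b"HI"))` returns the string `HI`, and `text_to_binary("HI")` returns the same bytes as `bytes.fromhex('48 49')`.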