Primitive types
A primitive type, such as int32 or rstring, is one that is not composed of other types.
By supporting many primitive types, SPL gives you fine control over data representation, which is crucial for performance in high-volume streams. Tight representation is important both to keep the data on the wire small, and to reduce serialization and deserialization time. SPL supports the following primitive types:
Type | Representation |
---|---|
boolean | true or false |
enum | User-defined enumeration of identifiers |
intb | Signed b-bit integer. The signed integer types can be: |
int8 | -128 to 127 |
int16 | -32,768 to 32,767 |
int32 | -2,147,483,648 to 2,147,483,647 |
int64 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
uintb | Unsigned b-bit integer. The unsigned integer types can be: |
uint8 | 0 to 255 |
uint16 | 0 to 65,535 |
uint32 | 0 to 4,294,967,295 |
uint64 | 0 to 18,446,744,073,709,551,615 |
floatb | IEEE 754 binary b-bit floating point number. The binary floating types can be: |
float32 | Single-precision (equivalent to float in Java™): ±, significand 24 binary digits, exponent 2^-126 to 2^127 |
float64 | Double-precision (equivalent to double in Java): ±, significand 53 binary digits, exponent 2^-1022 to 2^1023 |
decimalb | IEEE 754 decimal b-bit floating point number. The decimal floating types can be: |
decimal32 | ±, significand 7 decimal digits (0 to 9.999999), exponent 10^-95 to 10^96 |
decimal64 | ±, significand 16 decimal digits, exponent 10^-383 to 10^384 |
decimal128 | ±, significand 34 decimal digits, exponent 10^-6,143 to 10^6,144 |
complexb | 2b-bit complex number. The complex types can be: |
complex32 | Both real and imaginary parts are float32 |
complex64 | Both real and imaginary parts are float64 |
timestamp | Point in time, with nanosecond precision |
rstring | Sequence of raw bytes that supports string processing when the character encoding is known. Note: SPL functions that manipulate an rstring need to process the character encoding of the string. In general, an SPL function can handle single-byte character encoding. However, the string might be encoded as multi-byte, such as UTF-8, so you need to process it accordingly in your application. |
ustring | String of UTF-16 Unicode characters, based on the ICU library |
blob | Sequence of raw bytes |
rstring[n] | Bounded-length sequence of, at most, n raw bytes that supports string processing when the character encoding is known |
xml | Holds XML values |
xml<"schemaURI"> | Holds XML values that match the schemaURI |

The names of numeric types include their bit-width to make the naming consistent and to avoid unwieldy names such as “long long unsigned int”. Users can also define their own type names. See Type definitions.
For example, the following definition introduces an enumeration type named tracing:
type tracing = enum { off, error, warn, info, debug, trace };
Any of the identifiers off, ..., trace can be used where a value of enumeration tracing is expected. The scope of the identifiers off, ..., trace is the same as the scope that contains the type definition. Enumerations are ordered (they permit comparison with <, >, <=, and >=) but not numeric (they do not permit arithmetic with +, -, *, and so on).
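As a minimal sketch of how such an enumeration might be used, the composite below compares two values of type tracing. The composite name EnumDemo, the Trigger stream, and the variable names are hypothetical. The comparison assumes that enumerators are ordered in the order in which they are declared.

```spl
composite EnumDemo {
  type
    tracing = enum { off, error, warn, info, debug, trace };
  graph
    // Emit a single tuple so that the Custom logic below runs once.
    stream<int32 dummy> Trigger = Beacon() {
      param iterations : 1u;
    }
    () as Demo = Custom(Trigger) {
      logic onTuple Trigger : {
        tracing threshold = warn;
        tracing level = debug;
        // Ordered comparison is permitted on enumerations
        // (assumed here to follow declaration order).
        if (level > threshold) {
          println("debug ranks above warn");
        }
        // Arithmetic such as level + 1 is not permitted on enumerations.
      }
    }
}
```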
As in C and Java, literals for int, uint, float, and decimal types can have optional type suffixes. For example, 123 is signed (int32), whereas 123u is unsigned (uint32). One suffix indicates the kind of number.
Suffix | Meaning |
---|---|
s | Signed integer (default for integer literals) |
u | Unsigned integer |
f | Binary floating-point (default for floating point literals) |
d | Decimal floating-point |
Another suffix indicates the number of bits.
Suffix | Meaning |
---|---|
b (byte) | 8-bit |
h (halfword) | 16-bit |
w (word) | 32-bit (default for integer literals) |
l (long) | 64-bit (default for floating point literals) |
q (quad-word) | 128-bit |
Some more examples for literals with type suffix: 0.0005 (float64), 0.5e-3 (float64), 3.5d (decimal64), 3.5w (float32), 123d (decimal64), 123dq (decimal128).
SPL supports hexadecimal literals, which are written with a 0x prefix. Valid suffixes for hexadecimal literals are s (signed integer) and u (unsigned integer); without a suffix, a hexadecimal literal is a signed integer. The data length is determined by the number of hexadecimal digits specified, including any leading zeros. A maximum of 16 hexadecimal digits is supported (int64, uint64); specifying more than 16 hexadecimal digits results in an error.
Some examples of hexadecimal literals: 0xf (int8), 0x00fu (uint16), 0x12345 (int32), -0x12345s (int32), 0x0123456789ABCDEF (int64)
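The suffix rules for decimal and hexadecimal literals can be illustrated with a short sketch that assigns each literal to a variable of the type the rules above imply. The composite name LiteralDemo and all variable names are hypothetical. The combinations 123ub and 123l do not appear in the text; they are assumed to follow from combining the kind and width tables.

```spl
composite LiteralDemo {
  graph
    // Emit a single tuple so that the Custom logic below runs once.
    stream<int32 dummy> Trigger = Beacon() {
      param iterations : 1u;
    }
    () as Demo = Custom(Trigger) {
      logic onTuple Trigger : {
        int32      a = 123;      // no suffix: signed, 32-bit
        uint32     b = 123u;     // u: unsigned, default width
        uint8      c = 123ub;    // u + b: unsigned, 8-bit (assumed combination)
        int64      d = 123l;     // l: 64-bit, default kind (assumed combination)
        float64    e = 0.5e-3;   // no suffix: binary float, 64-bit
        float32    f = 3.5w;     // w: 32-bit binary float
        decimal64  g = 3.5d;     // d: decimal float, 64-bit
        decimal128 h = 123dq;    // d + q: decimal float, 128-bit
        int8       i = 0xf;      // 1 hex digit: 8-bit, signed by default
        uint16     j = 0x00fu;   // 3 hex digits (leading zeros count): 16-bit
        int64      k = 0x0123456789ABCDEF;  // 16 hex digits: 64-bit
        println(a);
      }
    }
}
```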
String literals are written in single quotation marks or double quotation marks. SPL supports two string types, "Unicode" and "raw". ustring contains Unicode characters that are encoded as UTF-16, and rstring contains raw bytes. This behavior allows the developer to pick Unicode when international character sets are important, and to pick raw strings when constant-time random access and a tight representation are important. A type suffix in the literals indicates the string kind: r indicates rstring (the default without suffix) and u indicates ustring.
String escape character | Meaning |
---|---|
\a | Alert |
\b | Backspace |
\f | Form feed |
\n | Newline |
\r | Carriage return |
\t | Horizontal tab |
\v | Vertical tab |
\' | Single quotation mark |
\" | Double quotation mark |
\? | Question mark |
\0 | Null character |
\\ | Literal backslash |
Recall from topic Lexical syntax that SPL files are written in UTF-8, so letters such as ñ can also appear directly in a string literal, without the escape sequence. Both ustring and rstring can contain internal null characters, which, unlike in C, are not considered terminating. In other words, characters whose encoding is zero carry no special meaning, and the length of a string is independent from whether or not it contains such characters.
rstring myString=
"A long
string with a newline in it.";
This example is equivalent to:
rstring myString="A long\n string with a newline in it.";
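The difference between the two string kinds can be sketched as follows. The composite name StringDemo and the variable names are hypothetical, and length is the standard library function. The sketch assumes that length counts bytes for an rstring and UTF-16 code units for a ustring, so the two calls may report different values for the same text.

```spl
composite StringDemo {
  graph
    // Emit a single tuple so that the Custom logic below runs once.
    stream<int32 dummy> Trigger = Beacon() {
      param iterations : 1u;
    }
    () as Demo = Custom(Trigger) {
      logic onTuple Trigger : {
        rstring raw = "señor"r;   // raw bytes; the source file is UTF-8
        ustring uni = "señor"u;   // UTF-16 Unicode characters
        // ñ occupies more than one byte in UTF-8, so the raw length
        // is expected to exceed the Unicode length.
        println(length(raw));
        println(length(uni));
        // An internal null character does not terminate the string;
        // it counts toward the length like any other character.
        rstring withNull = "a\0b";
        println(length(withNull));
      }
    }
}
```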
A literal for a complex number is written as a list literal with a cast, as in (complex32)[1.0, 2.0]. The real and imaginary components of a complex number can be extracted by using the SPL built-in functions real() and imag().
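A brief sketch of building a complex value and reading its parts back. The composite name ComplexDemo and the variable names are hypothetical. The sketch assumes that real() and imag() return float32 for a complex32 argument.

```spl
composite ComplexDemo {
  graph
    // Emit a single tuple so that the Custom logic below runs once.
    stream<int32 dummy> Trigger = Beacon() {
      param iterations : 1u;
    }
    () as Demo = Custom(Trigger) {
      logic onTuple Trigger : {
        // List literal with a cast, as described above.
        complex32 z = (complex32)[1.0, 2.0];
        float32 re = real(z);   // real component
        float32 im = imag(z);   // imaginary component
        println(re);
        println(im);
      }
    }
}
```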
The timestamp type is designed to allow a high degree of precision as well as avoid overflow conditions (it can represent values that range over billions of years) by following widely accepted standards. It uses a 128-bit representation, where 64 bits store the seconds since the epoch as a signed integer, 32 bits store the nanoseconds, and 32 bits store an optional identifier of the computer where the measurement was taken, which can be useful for after-the-fact drift compensation. The epoch, time zone, and so on, depend on the library functions used to manipulate time stamps; for more information about these functions, see the API documentation. A timestamp can be initialized by using one of the SPL functions or from a float64. There is no literal for a timestamp.
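The three fields of a timestamp can be sketched with the standard library time functions. The function names createTimestamp, getSeconds, getNanoseconds, and getMachineId, and their parameter and return types, are assumptions that should be checked against the API documentation. The composite name TimestampDemo and the sample values are hypothetical.

```spl
composite TimestampDemo {
  graph
    // Emit a single tuple so that the Custom logic below runs once.
    stream<int32 dummy> Trigger = Beacon() {
      param iterations : 1u;
    }
    () as Demo = Custom(Trigger) {
      logic onTuple Trigger : {
        // Seconds (int64), nanoseconds (uint32), machine identifier (int32).
        timestamp t = createTimestamp(1700000000l, 250000000u, 7);
        int64  secs  = getSeconds(t);      // 64-bit signed seconds since the epoch
        uint32 nanos = getNanoseconds(t);  // 32-bit nanoseconds
        int32  host  = getMachineId(t);    // 32-bit machine identifier
        println(secs);
        println(nanos);
        println(host);
      }
    }
}
```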
Many operators and functions are overloaded to work with different types. For example, the operator + can add various types of numbers, but it can also concatenate strings. Likewise, the function length(x) is overloaded to accept x of type rstring or ustring.
To permit efficient marshalling and unmarshalling of network packets, SPL offers a bounded-size variant of rstring, list, set, and map types. For example, rstring[5] can store any rstring of at most 5 characters, and each character in an rstring takes 1 byte. If all parts of a data value have a fixed size, then parts can be found at a fixed offset without decoding the whole. The compiler prohibits implicit conversions from unbounded to bounded types, but the user can override that by explicit casts. Type bounds, whether in variable declarations or in casts, must be compile-time constants. A cast from any string to a bounded string truncates the value if it is too long. SPL does not offer bounded ustring values because bounding the number of Unicode characters would not achieve the goal of fixing the size of the network byte representation after conversion. SPL limits all strings, bounded or unbounded, raw or Unicode, to at most 2^31-1 characters.
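The casting rule for bounded strings can be sketched as follows. The composite name BoundedDemo and the variable names are hypothetical. The bound 5 is written as a literal because type bounds must be compile-time constants.

```spl
composite BoundedDemo {
  graph
    // Emit a single tuple so that the Custom logic below runs once.
    stream<int32 dummy> Trigger = Beacon() {
      param iterations : 1u;
    }
    () as Demo = Custom(Trigger) {
      logic onTuple Trigger : {
        rstring unbounded = "hello world";
        // An implicit conversion from rstring to rstring[5] is prohibited,
        // so an explicit cast is used. The cast truncates values longer
        // than the bound.
        rstring[5] code = (rstring[5])unbounded;
        // Cast back to an unbounded rstring for printing.
        println((rstring)code);
      }
    }
}
```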
Blobs are sequences of at most 2^63-1 raw bytes. A blob can be initialized from a list<uint8>. There is no literal for a blob.
- xml: Holds well-formed XML documents. A runtime check is done on assignment or conversion to ensure that the values are well-formed XML. A C++ exception is thrown and the operator is terminated if the value that is being assigned is not well-formed. The convertToXML standard library function can be used to check whether a conversion causes an exception.
- xml<"schemaURI">: Holds XML values that match the schemaURI. It is checked at run time, unless the value is known to be valid already. A C++ exception is thrown and the operator is terminated if the value that is being assigned is not valid for the schema. The IBM® Streams instance can optimize the checking if the right side is known to be well-formed (that is, it is from an xml type with the same schema). The schemaURI is fetched on demand. Using a networked schemaURI might cause SPL program failures if the schema on the network is changed without changing the SPL program, a networked connection is not available, or the site that is referenced by the URI is not available. Maintain the schema locally as part of the application. Application directory relocation needs to be addressed as well. The schema should be relative to the data directory, or available at the same absolute path from all computers.
An XML literal is written as a string literal with an x suffix, for example:
"<?xml version=\"1.0\"?><x a=\"b\">55</x>"x
'<x a="b">55</x>'x
XML literals are checked by the SPL compiler to see whether they are well-formed. The compiler checks for validity if the left side of an assignment or formal parameter has an xml<schemaURI> form. A warning is generated if the XML literal is not well-formed, or if it is not valid for assignment or passing to an xml<schemaURI> type.
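A sketch of checking a run-time string before it is used as an xml value, so that a malformed value does not terminate the operator. The convertToXML function is named in the text above, but its signature here (a mutable xml target, an rstring source, and a boolean result) is an assumption to verify against the API documentation. The composite name XmlDemo and the variable names are hypothetical.

```spl
composite XmlDemo {
  graph
    // Emit a single tuple so that the Custom logic below runs once.
    stream<int32 dummy> Trigger = Beacon() {
      param iterations : 1u;
    }
    () as Demo = Custom(Trigger) {
      logic onTuple Trigger : {
        // XML literal: checked for well-formedness by the compiler.
        mutable xml parsed = '<x a="b">55</x>'x;
        // Run-time string that is deliberately not well-formed.
        rstring candidate = "<x><y></x>";
        // Assumed signature: returns true if the conversion succeeded.
        if (convertToXML(parsed, candidate)) {
          println("candidate is well-formed");
        } else {
          println("candidate is not well-formed");
        }
      }
    }
}
```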