sscanf() — Read and Format Data

Format

#include <stdio.h>

int sscanf(const char *__restrict__ buffer, const char *__restrict__ format-string, ...);

General Description

The sscanf() function reads data from buffer into the locations given by argument-list. If the strings pointed to by buffer and format-string overlap, behavior is undefined.

Each entry in the argument list must be a pointer to a variable of a type that matches the corresponding conversion specification in format-string. If the types do not match, the results are undefined.

The format-string controls the interpretation of the argument list. The format-string can contain multibyte characters beginning and ending in the initial shift state.

The format string pointed to by format-string can contain one or more of the following:
  • White space characters, as specified by isspace(), such as blanks and newline characters. A white space character causes sscanf() to read, but not to store, all consecutive white space characters in the input up to the next character that is not white space. One white space character in format-string matches any combination of white space characters in the input.
  • Characters that are not white space, except for the percent sign character (%). A non-white space character causes sscanf() to read, but not to store, a matching non-white space character. If the next character in the input stream does not match, the function ends.
  • Conversion specifications, which are introduced by the percent sign (%) or the sequence (%n$), where n is a decimal integer in the range [1,NL_ARGMAX]. A conversion specification causes sscanf() to read characters from the input and convert them into a value of the type given by the conversion specifier. The value is assigned to an argument in the argument list.

sscanf() reads format-string from left to right. Characters outside of conversion specifications are expected to match the sequence of characters in the input stream; the matched characters in the input stream are scanned but not stored. If a character in the input stream conflicts with format-string, the function ends, terminating with a “matching” failure. The conflicting character is left in the input stream as if it had not been read.

When the first conversion specification is found, the value of the first input field is converted according to the conversion specification and stored in the location specified by the first entry in the argument list. The second conversion specification converts the second input field and stores it in the second entry in the argument list, and so on through the end of format-string.

When the %n$ conversion specification is found, the value of the input field is converted according to the conversion specification and stored in the location specified by the nth argument in the argument list. Numbered arguments in the argument list can only be referenced once from format-string.

The format-string can contain either form of the conversion specification, that is, % or %n$ but the two forms cannot be mixed within a single format-string except that %% or %* can be mixed with the %n$ form.

An input field is defined as any of the following:
  • All characters until a white space character (space, tab, or newline) is encountered.
  • All characters until a character is encountered that cannot be converted according to the conversion specification.
  • All characters until the field width is reached.

If there are too many arguments for the conversion specifications, the extra arguments are evaluated but otherwise ignored. The results are undefined if there are not enough arguments for the conversion specifications.

Syntax of Conversion Specification for sscanf()

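The general form of a conversion specification is shown below. Items in square brackets are optional, and size-prefix is one of h, hh, l, ll, L, j, t, or z.

  %[*][width][size-prefix]conversion-specifier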

Each field of the conversion specification is a single character or a number signifying a particular format option. The conversion specifier, which appears after the last optional format field, determines whether the input field is interpreted as a character, a string, or a number. The simplest conversion specification contains only the percent sign and a conversion specifier (for example, %s).

Each field of the format specification is discussed in detail below.

Other than conversion specifiers, avoid using the percent sign (%), except to specify the percent sign: %%. Currently, the percent sign is treated as the start of a conversion specifier. Any unrecognized specifier is treated as an ordinary sequence of characters. If, in the future, z/OS® XL C/C++ permits a new conversion specifier, it could match a section of your format string, be interpreted incorrectly, and result in undefined behavior. See Table 1 for a list of conversion specifiers.

An asterisk (*) following the percent sign suppresses assignment of the next input field, which is interpreted as a field of the specified conversion specifier. The field is scanned but not stored.

width is a positive decimal integer controlling the maximum number of characters to be read. No more than width characters are converted and stored at the corresponding argument.

Fewer than width characters are read if a white space character (space, tab, or newline), or a character that cannot be converted according to the given format occurs before width is reached.

The optional prefix l indicates that the long version of the following conversion specifier is used, and the prefix h indicates that the short version is used. The corresponding argument should point to a long or double object (for the l prefix), a long double object (for the L prefix), or a short object (for the h prefix). The l and h modifiers can be used with the d, i, o, x, and u conversion specifiers. The l and h modifiers are ignored if specified for any other conversion specifier, except that l can also precede the floating-point conversion specifiers to indicate a pointer to double.

Optional prefix: Used to indicate the size of the argument expected.
h
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to short or unsigned short.
hh
Specifies that a following d, i, o, u, x, X or n conversion specifier applies to an argument with type pointer to signed char or unsigned char.
j
Specifies that a following d, i, o, u, x, X or n conversion specifier applies to an argument with type pointer to intmax_t or uintmax_t.
l
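Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to long or unsigned long, and that a following e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to double.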
ll
Specifies that a following d, i, o, u, x, X or n conversion specifier applies to an argument with type pointer to long long or unsigned long long.
L
Specifies that a following e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to long double.
t
Specifies that a following d, i, o, u, x, X or n conversion specifier applies to an argument with type pointer to ptrdiff_t or the corresponding unsigned type.
z
Specifies that a following d, i, o, u, x, X or n conversion specifier applies to an argument with type pointer to size_t or the corresponding signed integer type.

The type characters and their meanings are in Table 1.

Table 1. Conversion Specifiers in sscanf()
Each entry gives the conversion specifier, the type of input expected, and the type of argument.

d
Type of input expected: Decimal integer.
Type of argument: Pointer to int.

o
Type of input expected: Octal integer.
Type of argument: Pointer to unsigned int.

x, X
Type of input expected: Hexadecimal integer.
Type of argument: Pointer to unsigned int.

i
Type of input expected: Decimal, hexadecimal, or octal integer.
Type of argument: Pointer to int.

u
Type of input expected: Unsigned decimal integer.
Type of argument: Pointer to unsigned int.

c
Type of input expected: Sequence of one or more characters as specified by field width; white space characters that are ordinarily skipped are read when %c is specified. No terminating null is added.
Type of argument: Pointer to char large enough for input field.

s
Type of input expected: Like c, a sequence of bytes of type char (signed or unsigned), except that white space characters are not allowed, and a terminating null is always added.
Type of argument: Pointer to character array large enough for input field, plus a terminating NULL character (\0) that is automatically appended.

n
Type of input expected: No input read from stream or buffer.
Type of argument: Pointer to int, into which is stored the number of characters successfully read from the buffer up to that point in the call to sscanf().

p
Type of input expected: Pointer to void converted to series of characters. For the specific format of the input, see the individual system reference guides.
Type of argument: Pointer to void.

[
Type of input expected: A non-empty sequence of bytes to be matched against a set of expected bytes (the scanset), which form the conversion specification. White space characters that are ordinarily skipped are read when %[ is specified.

Consider the following situations:
  • [^bytes]: The scanset contains all bytes that do not appear between the circumflex and the right square bracket.
  • []abc] or [^]abc]: In both cases the first right square bracket is treated as a member of the bracketed set rather than as its terminator (in the first case the scanset is ]abc; in the second case it is every byte except ]abc).
  • [a-z]: In EBCDIC, the - is in the scanset and the characters b through y are not in the scanset; in ASCII, the - is not in the scanset and the characters b through y are.

The code points for the square brackets ([ and ]) and the caret (^) vary among the EBCDIC encoded character sets. The default C locale expects these characters to use the code points for encoded character set Latin-1 / Open Systems 1047. Conversion proceeds one byte at a time: there is no conversion to wide characters.
Type of argument: Pointer to the initial byte of an array of char, signed char, or unsigned char large enough to accept the sequence and a terminating byte, which will be added automatically.

e, E, f, F, g, G
Type of input expected: Floating-point value consisting of an optional sign (+ or -), a series of one or more decimal digits possibly containing a decimal point, and an optional exponent (e or E) followed by a possibly signed integer value.
Type of argument: Pointer to float.

The format string passed to sscanf() must be encoded as IBM-1047.

To read strings not delimited by space characters, substitute a set of characters in square brackets ([ ]) for the s (string) conversion specifier. The corresponding input field is read up to the first character that does not appear in the bracketed character set. If the first character in the set is a caret (^), the effect is reversed: the input field is read up to the first character that does appear in the rest of the character set.

To store a string without storing an ending NULL character (\0), use the specification %ac, where a is a decimal integer. In this instance, the c conversion specifier means that the argument is a pointer to a character array. The next a characters are read from the input stream into the specified location, and no NULL character is added.

The input for a %x conversion specifier is interpreted as a hexadecimal number.

The sscanf() function scans each input field character by character. It might stop reading a particular input field before it reaches a space character, either when the specified width is reached or when the next character cannot be converted as specified. When a conflict occurs between the specification and the input character, the next input field begins at the first unread character. The conflicting character, if there is one, is considered unread and is the first character of the next input field or the first character in subsequent read operations on the input stream.

The sscanf family of functions match e, E, f, F, g, or G conversion specifiers to floating-point number substrings in the input stream. They convert each input substring matched by an e, E, f, F, g, or G conversion specifier to a float, double, or long double value, depending on the size modifier before the conversion specifier.

Many z/OS Metal C formatted input functions, including the sscanf family of functions, use the IEEE binary floating-point format and recognize special infinity and NaN floating-point number input sequences.
  • The special sequence for infinity input is [+/-]inf or [+/-]INF, where + or - is optional.
  • The special sequence for NaN input is either [+/-]nan(n) for a signaling NaN or [+/-]nanq(n) for a quiet NaN, where n is an integer and 1 <= n <= INT_MAX-1. If (n) is omitted, n is assumed to be 1. The value of n determines what IEEE binary floating-point NaN fraction bits are produced by the formatted input functions. For a signaling NaN, these functions produce NaN fraction bits (left to right) by reversing the bits (right to left) of the even integer value 2*n. For a quiet NaN, they produce NaN fraction bits (left to right) by reversing the bits (right to left) of the odd integer value 2*n-1.

Returned Value

The sscanf() function returns the number of input items that were successfully matched and assigned. The returned value does not include conversions that were performed but not assigned (for example, suppressed assignments). The function returns EOF if there is an input failure before any conversion, or if the end of the buffer is reached before any conversion. Thus a returned value of 0 means that no fields were assigned: there was a matching failure before any conversion.

Related Information