mbrtowc() — Convert a multibyte character to a wide character
Standards
Standards / Extensions | C or C++ | Dependencies |
---|---|---|
ISO C Amendment; C99; Single UNIX Specification, Version 3 | both | |
Format
#include <wchar.h>
size_t mbrtowc(wchar_t * __restrict__ pwc, const char * __restrict__ s,
               size_t n, mbstate_t * __restrict__ ps);
#define _XOPEN_SOURCE
#define _MSE_PROTOS
#include <wchar.h>
size_t mbrtowc(wchar_t *pwc, const char *s, size_t n, mbstate_t *ps);
General description
If s is a NULL pointer, the mbrtowc() function is equivalent to mbrtowc(NULL,"",1,ps). In this case, mbrtowc() ignores n and pwc, and resets the shift state pointed to by ps to the initial shift state.
If s is not a NULL pointer, mbrtowc() inspects at most n bytes, beginning with the byte pointed to by s, together with the shift state pointed to by ps, and determines the number of bytes needed to complete the next valid multibyte character.
When the multibyte character is completed, mbrtowc() determines the value of the corresponding wide character and stores it in the object pointed to by pwc, provided that pwc is not a NULL pointer. Finally, mbrtowc() stores the resulting shift state in the object pointed to by ps. If ps is a NULL pointer, mbrtowc() uses its own internal object to track the shift state.
mbrtowc() is a restartable version of mbtowc(). That is, shift-state information is passed as one of the arguments and is updated on exit. With mbrtowc(), you can switch from one multibyte string to another, provided that you have kept the shift-state information.
The behavior of this wide-character function is affected by the LC_CTYPE category of the current locale. If you change the category, undefined results may occur.
Special behavior for XPG4
If you define any feature test macro specifying XPG4 behavior before the statement in your program source file to include the wchar header, then you must also define the _MSE_PROTOS feature test macro to make the declaration of the mbrtowc() function in the wchar header available when you compile your program. Please see Table 1 for a list of XPG4 and other feature test macros.
Returned value
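If s is a NULL pointer, mbrtowc() resets the shift state to the initial shift state and returns 0. Otherwise, mbrtowc() returns one of the following values. Because the return type is size_t, the values shown as -1 and -2 are returned as (size_t)-1 and (size_t)-2.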
- 0
- If the next n or fewer bytes complete a valid multibyte character that corresponds to the null wide character.
- positive integer
- If the next n or fewer bytes complete a valid multibyte character; the value returned is the number of bytes that complete the multibyte character.
- -2
- If the next n bytes form an incomplete (but potentially valid) multibyte character, and all n bytes have been processed. It is unspecified whether this can occur when the value of n is less than that of the MB_CUR_MAX macro.
  Note: When -2 is returned and n is at least MB_CUR_MAX, the string contains redundant shift-out and shift-in characters. To continue processing the multibyte string, increment the pointer by the value n, and call the mbrtowc() function again.
- -1
- If an encoding error occurs (when the next n or fewer bytes do not contribute to a complete and valid multibyte character). The value of the macro EILSEQ is stored in errno, but the conversion state is unchanged.
Example
/* CELEBM04 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

int main(void)
{
   wchar_t wc = L'\0';
   char mbs[5] = "a";   /* string containing the multibyte char */
   mbstate_t ss;
   size_t length;       /* mbrtowc() returns size_t, not int */

   memset(&ss, 0, sizeof ss);   /* set shift state to the initial state */

   /* Determine the length of the multibyte character pointed to by */
   /* mbs. Store the corresponding wide character in the wchar_t    */
   /* object called wc.                                             */
   length = mbrtowc(&wc, mbs, MB_CUR_MAX, &ss);

   printf("    length: %d\n", (int)length);
   printf("        wc: '%lc'\n", (wint_t)wc);
   printf("       mbs: \"%s\"\n", mbs);
   printf("MB_CUR_MAX: %d\n", (int)MB_CUR_MAX);
   /* mbstate_t is opaque in portable code, so report it via mbsinit() */
   printf("ss initial: %d\n", mbsinit(&ss) != 0);
   return 0;
}
Related information
- “Internationalization: Locales and Character Sets” in z/OS XL C/C++ Programming Guide
- locale.h — Locale settings
- wchar.h — ISO/C Multibyte Support extensions
- mblen() — Calculate length of multibyte character
- mbrlen() — Calculate length of multibyte character
- mbsrtowcs() — Convert a multibyte string to a wide-character string
- setlocale() — Set locale
- wcrtomb() — Convert a wide character to a multibyte character
- wcsrtombs() — Convert wide-character string to multibyte string