Command: mbrtoc16 | Section: 3 | Source: NetBSD | File: mbrtoc16.3
MBRTOC16(3) FreeBSD Library Functions Manual MBRTOC16(3)
NAME
mbrtoc16 - Restartable multibyte to UTF-16 conversion
LIBRARY
Standard C Library (libc, -lc)
SYNOPSIS
#include <uchar.h>
size_t
mbrtoc16(char16_t * restrict pc16, const char * restrict s, size_t n,
mbstate_t * restrict ps);
DESCRIPTION
The mbrtoc16 function decodes multibyte characters in the current locale
and converts them to UTF-16, keeping state so it can restart after
incremental progress.
Each call to mbrtoc16:
1. examines up to n bytes starting at s,
2. yields a UTF-16 code unit if available by storing it at *pc16,
3. saves state at ps, and
4. returns either the number of bytes consumed, if any, or a special
return value.
Specifically:
o If the multibyte sequence at s is invalid after any previous input
saved at ps, or if an error occurs in decoding, mbrtoc16 returns
(size_t)-1 and sets errno(2) to indicate the error.
o If the multibyte sequence at s is still incomplete after n bytes,
including any previous input saved in ps, mbrtoc16 saves its state in
ps after all the input so far and returns (size_t)-2. All n bytes of
input are consumed in this case.
o If mbrtoc16 had previously decoded a multibyte character but has not
yet yielded all the code units of its UTF-16 encoding, it stores the
next UTF-16 code unit at *pc16 and returns (size_t)-3. No bytes of
input are consumed in this case.
o If mbrtoc16 decodes the null multibyte character, then it stores zero
at *pc16 and returns zero.
o Otherwise, mbrtoc16 decodes a single multibyte character, stores the
first (and possibly only) code unit in its UTF-16 encoding at *pc16,
and returns the number of bytes consumed to decode the first
multibyte character.
If pc16 is a null pointer, nothing is stored, but the effects on ps and
the return value are unchanged.
If s is a null pointer, the mbrtoc16 call is equivalent to:
mbrtoc16(NULL, "", 1, ps)
This always returns zero, and has the effect of resetting ps to the
initial conversion state, without writing to pc16, even if it is nonnull.
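As a small illustration of the null-s call, the check below passes a non-null pc16 holding a sentinel and confirms it is left alone. The function name is hypothetical. To stay independent of the locale, the sketch starts from a state that is already initial, so it only demonstrates the return value, the untouched output, and the resulting state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical check (not a libc function): call mbrtoc16 with
 * s == NULL and report whether it returned zero, left *pc16 alone,
 * and left the state object in the initial conversion state.
 */
bool
null_s_resets_state(void)
{
	mbstate_t mbs;
	char16_t c16 = 0xFFFF;		/* sentinel that must survive */
	size_t r;

	memset(&mbs, 0, sizeof(mbs));	/* start from the initial state */
	r = mbrtoc16(&c16, NULL, 0, &mbs);
	return r == 0 && c16 == 0xFFFF && mbsinit(&mbs) != 0;
}
```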
If ps is a null pointer, mbrtoc16 uses an internal mbstate_t object with
static storage duration, distinct from all other mbstate_t objects
(including those used by mbrtoc8(3), mbrtoc32(3), c8rtomb(3),
c16rtomb(3), and c32rtomb(3)), which is initialized at program startup to
the initial conversion state.
IMPLEMENTATION NOTES
On well-formed input, the mbrtoc16 function yields either a Unicode
scalar value in the Basic Multilingual Plane (BMP), i.e., a 16-bit
Unicode code point that is not a surrogate code point, or, over two
successive calls, yields the high and low surrogate code points (in that
order) of a Unicode scalar value outside the BMP.
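A caller that wants scalar values rather than code units can recombine the two surrogates with the arithmetic given in RFC 2781. The helpers below are a sketch with hypothetical names; they perform no validation beyond the range checks shown.

```c
#include <stdbool.h>
#include <uchar.h>

/* High (leading) surrogates occupy U+D800..U+DBFF. */
bool
is_high_surrogate(char16_t u)
{
	return u >= 0xD800 && u <= 0xDBFF;
}

/* Low (trailing) surrogates occupy U+DC00..U+DFFF. */
bool
is_low_surrogate(char16_t u)
{
	return u >= 0xDC00 && u <= 0xDFFF;
}

/*
 * Combine a high and low surrogate into the scalar value they encode.
 * The caller must have checked both with the predicates above.
 */
char32_t
combine_surrogates(char16_t hi, char16_t lo)
{
	return 0x10000 +
	    (((char32_t)(hi - 0xD800) << 10) | (char32_t)(lo - 0xDC00));
}
```

For example, the pair 0xD83D, 0xDE00 combines to U+1F600.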
RETURN VALUES
The mbrtoc16 function returns:
0 [null] if mbrtoc16 decoded a null multibyte character.
i [code unit] where 1 <= i <= n, if mbrtoc16 consumed i
bytes of input to decode the next multibyte character,
yielding a UTF-16 code unit.
(size_t)-3 [continuation] if mbrtoc16 consumed no new bytes of
input but yielded a UTF-16 code unit that was pending
from previous input.
(size_t)-2 [incomplete] if mbrtoc16 found only an incomplete
multibyte sequence after all n bytes of input and any
previous input, and saved its state to restart in the
next call with ps.
(size_t)-1 [error] if any encoding error was detected; errno(2) is
set to reflect the error.
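Because the three special values are the largest values of size_t, they have to be tested before the result is treated as a byte count. A minimal sketch of that classification follows; the enum and function names are hypothetical and mirror the bracketed labels above.

```c
#include <stddef.h>

enum c16_result {
	C16_NULL,		/* 0: null character decoded */
	C16_CODE_UNIT,		/* 1..n: bytes consumed, unit yielded */
	C16_CONTINUATION,	/* (size_t)-3: pending unit yielded */
	C16_INCOMPLETE,		/* (size_t)-2: need more input */
	C16_ERROR		/* (size_t)-1: errno is set */
};

/* Map a return value of mbrtoc16 to one of the five documented cases. */
enum c16_result
classify_result(size_t r)
{
	if (r == (size_t)-1)
		return C16_ERROR;
	if (r == (size_t)-2)
		return C16_INCOMPLETE;
	if (r == (size_t)-3)
		return C16_CONTINUATION;
	if (r == 0)
		return C16_NULL;
	return C16_CODE_UNIT;
}
```

Only in the C16_CODE_UNIT case should the caller advance its input pointer by the returned value.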
EXAMPLES
Print the UTF-16 code units of a multibyte string in hexadecimal text:
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdio.h>
#include <uchar.h>

/* The labels out and readmore are left to the surrounding code. */
char *s = ...;
size_t n = ...;
mbstate_t mbs = {0};	/* initial conversion state */

while (n) {
	char16_t c16;
	size_t len;

	len = mbrtoc16(&c16, s, n, &mbs);
	switch (len) {
	case 0:		/* NUL terminator */
		assert(c16 == 0);
		goto out;
	default:	/* scalar value or high surrogate */
		printf("U+%04"PRIx16"\n", (uint16_t)c16);
		break;
	case (size_t)-3: /* low surrogate; no input consumed */
		printf("continue U+%04"PRIx16"\n", (uint16_t)c16);
		continue;	/* skip the pointer update below */
	case (size_t)-2: /* incomplete */
		printf("incomplete\n");
		goto readmore;
	case (size_t)-1: /* error */
		printf("error: %d\n", errno);
		goto out;
	}
	s += len;	/* only reached with a real byte count */
	n -= len;
}
ERRORS
[EILSEQ] The multibyte sequence cannot be decoded in the current
locale as a Unicode scalar value.
[EIO] An error occurred in loading the locale's character
conversions.
SEE ALSO
c16rtomb(3), c32rtomb(3), c8rtomb(3), mbrtoc32(3), mbrtoc8(3), uchar(3)
The Unicode Standard,
https://www.unicode.org/versions/Unicode15.0.0/UnicodeStandard-15.0.pdf,
The Unicode Consortium, September 2022, Version 15.0 -- Core
Specification.
P. Hoffman and F. Yergeau, UTF-16, an encoding of ISO 10646, Internet
Engineering Task Force, RFC 2781,
https://datatracker.ietf.org/doc/html/rfc2781, February 2000.
STANDARDS
The mbrtoc16 function conforms to ISO/IEC 9899:2011 ("ISO C11").
HISTORY
The mbrtoc16 function first appeared in NetBSD 11.0.
FreeBSD 14.1-RELEASE-p8 August 14, 2024 FreeBSD 14.1-RELEASE-p8