Command: c16rtomb | Section: 3 | Source: OpenBSD | File: c16rtomb.3
C16RTOMB(3) FreeBSD Library Functions Manual C16RTOMB(3)
NAME
c16rtomb - convert one UTF-16 encoded character to UTF-8
SYNOPSIS
#include <uchar.h>
size_t
c16rtomb(char * restrict s, char16_t c16, mbstate_t * restrict mbs);
DESCRIPTION
This function converts one UTF-16 encoded character to UTF-8. In some
cases, it is necessary to call the function twice to convert a single
character.
First, call c16rtomb() passing the first 16-bit code unit of the UTF-16
encoded character in c16. If the return value is greater than 0, the
character is part of the UCS-2 range, the complete UTF-8 encoding
consisting of at most MB_CUR_MAX bytes has been written to the storage
starting at s, and the function does not need to be called again.
If the return value is 0, the first 16-bit code unit is a UTF-16 high
surrogate and the function needs to be called a second time, this time
passing the second 16-bit code unit of the UTF-16 encoded character in
c16 and the same mbs that was passed to the first call. If the second
16-bit code unit is a UTF-16 low surrogate, the second call returns a
value greater than 0, the surrogate pair represents a Unicode code point
beyond the basic multilingual plane, and the complete UTF-8 encoding
consisting of at most MB_CUR_MAX bytes is written to the storage
starting at s.
The output encoding that c16rtomb() uses in s is determined by the
LC_CTYPE category of the current locale. OpenBSD only supports UTF-8 and
ASCII output, and this function is only useful for UTF-8.
The following arguments cause special processing:
c16 == 0     A NUL byte is stored to *s and the state object pointed to
             by mbs is reset to the initial state. On operating systems
             other than OpenBSD that support state-dependent multibyte
             encodings, a special byte sequence ("shift sequence") is
             written before the NUL byte to return to the initial state
             if that is required by the output encoding and by the
             current output encoding state.
mbs == NULL  An internal mbstate_t object specific to the c16rtomb()
             function is used instead of the mbs argument. This
             internal object is automatically initialized at program
             startup and never changed by any libc function except
             c16rtomb().
s == NULL    The object pointed to by mbs, or the internal object if mbs
             is a NULL pointer, is reset to its initial state, c16 is
             ignored, and 1 is returned.
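The special cases above can be wrapped in two small helpers, sketched
here under stated assumptions. The names c16_reset() and
c16_terminate() are hypothetical. The reset helper ignores the return
value on purpose: this page documents 1, but the sketch assumes that
other C libraries may report an abandoned high surrogate differently.

```c
#include <stddef.h>
#include <uchar.h>

/*
 * Hypothetical helper: abandon a partially converted surrogate pair
 * by passing s == NULL, which resets *mbs to the initial state.
 * The c16 argument is ignored in this case.
 */
void
c16_reset(mbstate_t *mbs)
{
	(void)c16rtomb(NULL, 0, mbs);
}

/*
 * Hypothetical helper: store the terminating NUL byte to *s by
 * passing c16 == 0.  Meant to be called while *mbs is in the
 * initial state.  Returns what c16rtomb() returns, which is 1 for
 * UTF-8 and ASCII output.
 */
size_t
c16_terminate(char *s, mbstate_t *mbs)
{
	return c16rtomb(s, 0, mbs);
}
```

Passing an explicit mbs rather than NULL keeps the conversion state
local to the caller instead of sharing the internal object.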
RETURN VALUES
c16rtomb() returns the number of bytes written to s on success or
(size_t)-1 on failure, specifically:
0           The first 16-bit code unit was successfully decoded as a
            UTF-16 high surrogate. Nothing was written to s yet.
1           The first 16-bit code unit was successfully decoded as a
            character in the range U+0000 to U+007F, or s is NULL.
2           The first 16-bit code unit was successfully decoded as a
            character in the range U+0080 to U+07FF.
3           The first 16-bit code unit was successfully decoded as a
            character in the range U+0800 to U+D7FF or U+E000 to U+FFFF.
4           The second 16-bit code unit was successfully decoded as a
            UTF-16 low surrogate, resulting in a character in the range
            U+10000 to U+10FFFF.
greater     Return values greater than 4 may occur on operating systems
            other than OpenBSD for output encodings other than UTF-8, in
            particular when a shift sequence was written.
(size_t)-1  UTF-16 input decoding or LC_CTYPE output encoding failed, or
            mbs is invalid. Nothing was written to s, and errno has been
            set.
ERRORS
c16rtomb() causes an error in the following cases:
[EILSEQ]  UTF-16 input decoding failed because the first 16-bit
          code unit is neither a UCS-2 character nor a UTF-16
          high surrogate, or because the second 16-bit code unit
          is not a UTF-16 low surrogate; or output encoding
          failed because the resulting character cannot be
          represented in the output encoding selected with
          LC_CTYPE.
[EINVAL]  mbs points to an invalid or uninitialized mbstate_t
          object.
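One possible way to handle these errors is sketched below. The helper
name c16_convert_checked() and its recovery policy (zeroing the state
so that conversion can continue with the next code unit) are
assumptions made for illustration, not behavior prescribed by this
page.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper: convert one code unit.  Returns 0 on success
 * or the errno value reported by c16rtomb() on failure.  On success,
 * *lenp is the number of bytes written to s, or 0 after a high
 * surrogate.  On failure, *mbs is zeroed, which is one valid way to
 * describe the initial conversion state.
 */
int
c16_convert_checked(char *s, char16_t c16, mbstate_t *mbs, size_t *lenp)
{
	size_t n;
	int saved;

	errno = 0;
	n = c16rtomb(s, c16, mbs);
	if (n == (size_t)-1) {
		saved = errno;	/* keep the cause before other calls */
		memset(mbs, 0, sizeof(*mbs));
		return saved;
	}
	*lenp = n;
	return 0;
}
```

A lone low surrogate, or a high surrogate followed by anything other
than a low surrogate, is expected to yield EILSEQ from this helper.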
SEE ALSO
mbrtoc16(3), setlocale(3), wcrtomb(3)
STANDARDS
c16rtomb() conforms to ISO/IEC 9899:2011 ("ISO C11").
HISTORY
c16rtomb() has been available since OpenBSD 7.4.
CAVEATS
The C11 standard only requires the c16 argument to be interpreted
according to UTF-16 if the predefined environment macro __STDC_UTF_16__
is defined with a value of 1. On OpenBSD, <uchar.h> provides this
definition. Other operating systems which do not define __STDC_UTF_16__
could theoretically use a different, implementation-defined input
encoding for c16 instead of UTF-16. Using UTF-16 becomes mandatory in
C23.
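Portable code can test the macro before assuming UTF-16 input. The
following is a minimal sketch; the function name c16_is_utf16() is
hypothetical. The test is placed after the inclusion of <uchar.h>
because, as noted above, some systems provide the definition there.

```c
#include <uchar.h>

/*
 * Hypothetical helper: returns 1 if the implementation promises that
 * char16_t values are UTF-16 encoded, and 0 otherwise.
 */
int
c16_is_utf16(void)
{
#if defined(__STDC_UTF_16__) && __STDC_UTF_16__ == 1
	return 1;
#else
	return 0;
#endif
}
```

A program that cannot work with any other input encoding could use the
same preprocessor condition with an #error directive to fail at
compile time instead.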
FreeBSD 14.1-RELEASE-p8 August 20, 2023 FreeBSD 14.1-RELEASE-p8