Command: LC_NUMERIC | Section: 5 | Source: Digital UNIX | File: LC_NUMERIC.5.gz
i18n_intro(5) File Formats Manual i18n_intro(5)
NAME
i18n_intro, i18n, LANG, LC_ALL, LC_COLLATE, LC_CTYPE, LC_MESSAGES,
LC_MONETARY, LC_NUMERIC, LC_TIME - Introduction to internationalization
(I18N)
DESCRIPTION
Internationalization refers to the process of developing programs without
prior knowledge of the language, cultural data, or character-encoding
schemes that the programs are expected to handle. In other words,
internationalization refers to the availability and use of interfaces
that let programs modify their behavior at run time for operation in a
specific language environment. The abbreviation I18N is often used to
stand for internationalization as there are 18 characters between the
beginning "I" and the ending "N" of that word.
The I18N interfaces and utilities provided in DIGITAL UNIX conform to
Issue 4 of X/Open CAE specifications.
A concept related to internationalization is localization (L10N), which
refers to the process of establishing information within a computer
system for each combination of native language, cultural data, and
coded character set (codeset). A locale is a database that provides
information for a unique combination of these three components. However,
locales do not solve all of the problems that localization must address.
Many native languages require additional support in the form of
language-specific print filters, fonts, codeset converters, character
input methods, and other kinds of specialized software.
For additional introductory information on topics related to
internationalization, refer to the following reference pages:
l10n_intro(5)     For more information on localization and locales
iconv_intro(5)    For an introduction to codeset conversion
i18n_printing(5)  For a summary of printer support for native languages
Characters, Character Sets, and Codesets
A character is a member of a set of elements used for the organization,
control, or representation of data.
A character set is a set of alphabetic or other characters used to
construct the words and other elementary units of a native language or
computer language. A character set only specifies the characters that
are included in the set. ASCII, CNS 11643 and DTSCS are examples of
character sets.
A coded character set (codeset) is a set of unambiguous rules that
supports one or more character sets and establishes the one-to-one
relationship between each character and its bit representation. In other
words, a codeset consists of the code points for characters in one or
more character sets. For example, DEC Hanyu (dechanyu) is a codeset for
Chinese and contains code points for characters in the ASCII, CNS
11643-1986 (plane 1 and plane 2), and DTSCS character sets.
Language Announcement (Setting Locale)
Language announcement is the mechanism by which language, cultural
data, and codeset requirements are set either for the system as a whole
or by individual users. An application can also set these requirements,
although it is more common for an internationalized application to use
the setting in effect for the user who runs the program. Refer to the
System Administration manual for information about setting systemwide
defaults for shells. Refer to setlocale(3) and Writing Software for the
International Market for information on how applications query or set
locale requirements at run time.
Language announcement is performed by setting one or more reserved
environment variables to the name of an installed locale. Each locale has
associated with it collating sequences, character conversion tables,
character classification tables, formats for different kinds of data,
and message catalogs. If the same locale meets user requirements in all
these categories, set only the LANG environment variable to the locale
name. A locale name usually has the following format:
language_territory.codeset[@modifier]
The following Korn shell example sets LANG to a locale supporting the
English language, United States cultural data, and the ISO8859-1 codeset:
$ LANG=en_US.ISO8859-1
The following C shell example sets LANG to a locale supporting the
Traditional Chinese language, Hong Kong cultural data, and the DEC Hanyu
codeset:
% setenv LANG zh_HK.dechanyu
Note that locale name formats can vary from vendor to vendor. Use the
locale -a command to display the names of locales installed on your
system. Refer to the l10n_intro(5) reference page for a list of the
locales provided with the DIGITAL UNIX product.
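As a rough illustration of the name format, the following POSIX shell sketch splits a locale name into its fields by using parameter expansion. The function name split_locale is hypothetical and is not part of DIGITAL UNIX. The sketch assumes the usual language_territory.codeset[@modifier] layout, which can vary from vendor to vendor, and it treats everything after the first @ as the modifier:

```shell
# split_locale NAME: print the language, territory, codeset, and modifier
# fields of a locale name, one per line. Absent fields are printed empty.
# The body runs in a subshell so the helper variables do not leak.
split_locale() (
    name=$1
    modifier=
    codeset=
    territory=
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;
    esac
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;
    esac
    case $name in
        *_*) territory=${name#*_}; name=${name%%_*} ;;
    esac
    printf '%s\n' "language=$name" "territory=$territory" \
        "codeset=$codeset" "modifier=$modifier"
)

split_locale zh_TW.eucTW@stroke
# prints:
#   language=zh
#   territory=TW
#   codeset=eucTW
#   modifier=stroke
```

The sketch only takes the string apart; it does not check whether the named locale is installed. Use locale -a for that.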
An alternative way to set locale requirements for all locale categories
is to set the LC_ALL environment variable. The difference between the
LANG and LC_ALL variables is that LC_ALL is a high-precedence variable
that overrides all other locale variables, including LANG. The LANG
variable, on the other hand, is a low-precedence variable. When used
by itself, the LANG variable implicitly sets all locale categories to
the specified locale just as LC_ALL does. However, the LANG variable
can be used together with variables for specific locale categories to
create a multilocale environment. The category-specific locale variables
and what they control follow:
LC_COLLATE   String collation
LC_CTYPE     Character classification
LC_MESSAGES  Translations for messages and valid strings for "yes" and "no" responses
LC_MONETARY  The currency symbol and the format of monetary values
LC_NUMERIC   The format of numeric values
LC_TIME      The format of date and time values
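The precedence among LC_ALL, the category-specific variables, and LANG can be sketched as a small POSIX shell function. The name effective_locale is hypothetical. The fallback to the C locale when none of the variables is set is an assumption, because the default in that case is implementation-defined. The function only inspects variables and does not consult the locale database:

```shell
# effective_locale CATEGORY: print the locale name that applies to CATEGORY
# (for example LC_COLLATE), checking LC_ALL first, then the category
# variable itself, then LANG. Empty values are treated as unset.
effective_locale() (
    case $1 in
        LC_COLLATE|LC_CTYPE|LC_MESSAGES|LC_MONETARY|LC_NUMERIC|LC_TIME) ;;
        *) echo "unknown category: $1" >&2; exit 2 ;;
    esac
    # Indirect lookup of the variable named by $1 (validated above).
    eval "category_value=\${$1:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' C    # assumed default; implementation-defined
    fi
)

unset LC_ALL LC_NUMERIC
LANG=en_US.ISO8859-1
LC_COLLATE=C
effective_locale LC_COLLATE    # prints C
effective_locale LC_NUMERIC    # prints en_US.ISO8859-1
LC_ALL=C
effective_locale LC_NUMERIC    # prints C
```

In this multilocale setup, only collation uses the C locale until LC_ALL is set, at which point LC_ALL overrides every category.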
Some locale names have one or more @modifier suffixes. A locale with
the suffix @ucs4 is for use by applications that require internal
process code to be in UCS-4 format. See Unicode(5) for more information
about UCS-4. Other @modifier suffixes indicate locale variants that
support alternative rules for collation in Asian languages. Use locales
with these suffixes only when setting LC_COLLATE. For example, there
are three different sets of collation rules (chuyin, radical, and
stroke) that can be used with the locale supporting the Chinese language,
Taiwanese cultural data, and the Taiwanese EUC codeset. If Korn
shell users want to use this locale, they might make the following settings:
$ LANG=zh_TW.eucTW
$ LC_COLLATE=zh_TW.eucTW@stroke
The preceding example implicitly sets all locale category variables to
zh_TW.eucTW, except for the LC_COLLATE variable, which is set to
zh_TW.eucTW@stroke. The following locale command displays the variable
settings after these assignments:
$ locale
LANG=zh_TW.eucTW
LC_COLLATE=zh_TW.eucTW@stroke
LC_CTYPE="zh_TW.eucTW"
LC_MONETARY="zh_TW.eucTW"
LC_NUMERIC="zh_TW.eucTW"
LC_TIME="zh_TW.eucTW"
LC_MESSAGES="zh_TW.eucTW"
LC_ALL=
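In the locale output above, values in double quotation marks are implied by LANG rather than set explicitly, and LC_ALL is empty because it is not set. Assuming that the locale command on your system follows the same quoting convention, a small POSIX shell filter can list only the explicitly set variables. The name explicit_settings is hypothetical:

```shell
# explicit_settings: read locale(1) output on standard input and print only
# the lines whose value is non-empty and not enclosed in double quotes,
# that is, the variables that were set explicitly.
explicit_settings() {
    grep -v -e '="' -e '=$'
}

# Sample input copied from the example output; on a live system,
# run "locale | explicit_settings" instead.
printf '%s\n' \
    'LANG=zh_TW.eucTW' \
    'LC_COLLATE=zh_TW.eucTW@stroke' \
    'LC_CTYPE="zh_TW.eucTW"' \
    'LC_ALL=' | explicit_settings
# prints:
#   LANG=zh_TW.eucTW
#   LC_COLLATE=zh_TW.eucTW@stroke
```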
SEE ALSO
Commands: locale(1)
Functions: setlocale(3)
Others: i18n_printing(5), iconv_intro(5), l10n_intro(5), Unicode(5)
Writing Software for the International Market
System Administration
i18n_intro(5)