Command: tr | Section: 1 | Source: Digital UNIX | File: tr.1.gz
tr(1) General Commands Manual tr(1)
NAME
tr - Translates characters
SYNOPSIS
tr [-Acs] string1 string2
tr -ds [-Ac] string1 string2
tr -d [-Ac] string1
tr -s [-Ac] string1
The tr command copies characters from the standard input to the stan-
dard output with substitution or deletion of selected characters.
STANDARDS
Interfaces documented on this reference page conform to industry stan-
dards as follows:
tr: XPG4, XPG4-UNIX
Refer to the standards(5) reference page for more information about in-
dustry standards and associated tags.
OPTIONS
-A
    [DIGITAL] Translates on a byte-by-byte basis. When you specify this
    option, tr does not support extended characters.
-c
    Complements (inverts) the set of characters in string1. The
    complement is the set of all characters in the current character
    set, as defined by the current setting of LC_CTYPE, except for those
    actually specified in the string1 argument. These characters are
    placed in the array in ascending collation sequence, as defined by
    the current setting of LC_COLLATE.
-d
    Deletes all occurrences of input characters or collating elements
    found in the array specified in string1.
-s
    Replaces any character specified in string1 that occurs as a string
    of two or more repeating characters with a single instance of the
    character in string2.
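
The -d, -s, and -c options described above can be tried on a short sample string. This is a minimal sketch rather than part of the original reference page; the variable names and the sample text are illustrative assumptions.

```shell
# Illustrative sample text (an assumption, not from the reference page).
sample='aabbcc  112233'

# -d: delete every occurrence of the characters in string1.
deleted=$(printf '%s\n' "$sample" | tr -d 'a')

# -s: squeeze each run of a character listed in string1 to one instance.
squeezed=$(printf '%s\n' "$sample" | tr -s 'b ')

# -c: translate everything that is NOT a digit or a newline to x.
complemented=$(printf '%s\n' "$sample" | tr -c '0-9\n' '[x*]')

printf '%s\n' "$deleted" "$squeezed" "$complemented"
```

With the sample above, the three lines printed should be the string without any a, the string with the b run and the space run collapsed, and the digits preceded by one x for each letter or space.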
OPERANDS
Translation control strings as explained in the DESCRIPTION section.
DESCRIPTION
Input characters from string1 are replaced with the corresponding char-
acters in string2. The tr command cannot handle an ASCII NUL (\000) in
string1 or string2; it always deletes NUL from the input.
[DIGITAL] The trbsd command is a BSD compatible version of tr.
The following constructs can be used to specify characters or
single-character collating elements. If any of these constructs result
in multicharacter collating elements, tr excludes those elements from
the resulting array without issuing a diagnostic.
character
    Represents itself when not described by one of the other conventions
    in this list.
\octal
    Represents a character by using its octal value. An octal sequence
    consists of a backslash followed by the longest sequence of one,
    two, or three octal digits (01234567). The sequence causes the
    character whose encoding is represented by the one-, two-, or
    three-digit octal value to be placed in the string.
\character
    Represents standard backslash-escape sequences. No results are
    defined by the Single UNIX Specification for specifying characters
    after a backslash other than the ones listed here. In portable
    applications, a backslash should be followed only by an octal
    sequence, another backslash, or the lowercase letter a, b, f, n, r,
    t, or v.
    [DIGITAL] On UNIX systems, you can enclose string operands in
    quotation marks or specify a backslash before some characters, such
    as * (an asterisk), to remove the special meaning of those
    characters to the shell.
c1-c2
    Represents a range of collating elements between the specified range
    endpoints, inclusive, as defined by the current locale setting of
    the LC_COLLATE category. The starting element, c1, must precede the
    ending element, c2, in the current collation order. The characters
    or collating elements in the range are placed in the associated
    string in ascending collation sequence. Note that the collation
    sequence for ASCII characters, such as letters in the English
    alphabet, may vary among locales. In the POSIX locale, for example,
    a-z produces a string with all English lowercase letters in English
    alphabetical order. However, when LC_COLLATE is set to a different
    locale, English lowercase letters may be subject to a different
    collation order. Therefore, a-z may produce a different result for
    locales other than the POSIX locale.
[c*number]
    Stands for number repetitions of the character c. The number is
    considered to be in decimal unless the first digit of number is 0;
    then it is considered to be in octal. This format is valid only in
    string2.
[=equiv=]
    Represents all characters or collating elements belonging to the
    equivalence class specified by equiv, as defined by the LC_COLLATE
    locale category. An equivalence class expression can be used for
    string1 or string2 only when used in combination with the -d and -s
    options. (For more information, see the locale(4) reference page.)
[:class:]
    Represents all characters belonging to the defined character class,
    as defined by the current setting of the LC_CTYPE locale category.
    The following character class names are accepted when specified in
    string1:
        alnum   cntrl   lower   space
        alpha   digit   print   upper
        blank   graph   punct   xdigit
    If the current locale defines additional keywords (by including
    additional charclass definitions in the LC_CTYPE category), the tr
    command also recognizes those keywords as class values.
    When the -d and -s options are specified together, any of the
    character class names are accepted in string2; otherwise, only the
    character class names lower or upper are accepted in string2, and
    then only if the class complement (upper or lower, respectively) is
    specified in the same relative position in string1. Such a
    specification is interpreted as a request for case conversion.
    When [:lower:] appears in string1 and [:upper:] appears in string2,
    the arrays contain the characters from the toupper mapping in the
    LC_CTYPE category of the current locale. When [:upper:] appears in
    string1 and [:lower:] appears in string2, the arrays contain the
    characters from the tolower mapping in the LC_CTYPE category of the
    current locale.
    The first character from each mapping pair is in the array for
    string1 and the second character from each mapping pair is in the
    array for string2 in the same relative position.
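
The constructs listed above can be exercised one at a time. The following is a hedged sketch, not text from the original reference page; the variable names and input strings are assumptions chosen for illustration, and the range example pins the POSIX locale because ranges depend on LC_COLLATE.

```shell
# Octal escapes: translate a tab (\011) to a space (\040).
tabs_to_spaces=$(printf 'a\tb\n' | tr '\011' '\040')

# Range: depends on collation order, so request the POSIX locale.
upper_range=$(printf 'hello\n' | LC_ALL=C tr 'a-z' 'A-Z')

# Character classes: case conversion through the toupper mapping.
upper_class=$(printf 'hello\n' | tr '[:lower:]' '[:upper:]')

# Repetition: '[x*2]y' stands for the three characters x, x, y.
repeated=$(printf 'abc\n' | tr 'abc' '[x*2]y')

printf '%s\n' "$tabs_to_spaces" "$upper_range" "$upper_class" "$repeated"
```

The strings are quoted so that the shell does not interpret the backslashes, brackets, or asterisks before tr sees them.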
[DIGITAL] When string2 is shorter than string1, a difference results
between historical System V and BSD systems. A BSD system pads string2
with the last character found in string2. Thus, it is possible to do
the following:
    tr 0123456789 d
[DIGITAL] The preceding command translates all digits to the letter d.
A portable application cannot rely on the BSD behavior; it would have
to code the example in the following way:
    tr 0123456789 '[d*]'
[DIGITAL] If a given character appears more than once in string1, the
character in string2 corresponding to its last appearance in string1
will be used in the translation.
If the -c and -d options are both specified, all characters except
those specified by string1 are deleted. The contents of string2 are ig-
nored, unless -s is also specified. Note, however, that the same
string cannot be used for both the -d and the -s options; when both op-
tions are specified, both string1 (used for deletion) and string2 (used
for squeezing) are required.
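
As a rough illustration of the two combinations described above, the sketch below uses -c with -d to keep only selected characters, and -d with -s to delete one set while squeezing another. The input strings and variable names are illustrative assumptions.

```shell
# -c with -d: delete everything except digits and the newline.
digits_only=$(printf 'Tel: 555-0100\n' | tr -cd '[:digit:]\n')

# -d with -s: string1 ('_') is deleted, string2 (' ') is squeezed.
cleaned=$(printf 'a__b   c\n' | tr -ds '_' ' ')

printf '%s\n' "$digits_only" "$cleaned"
```

Including \n in string1 of the first command keeps the line terminator, which would otherwise be deleted along with every other nondigit.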
If the -d option is not specified, each input character or collating
element found in the array specified by string1 is replaced by the
character or collating element in the same relative position in the ar-
ray specified by string2.
When the -s option is specified, if the string2 contains a character
class, the argument's array contains all of the characters in that
character class. For example:
    tr -s '[:space:]'
In a case conversion, however, the string2 array contains only those
characters defined as the second characters in each of the toupper or
tolower character pairs, as appropriate. For example:
    tr -s '[:upper:]' '[:lower:]'
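
The two forms above can be compared on small inputs. This is a sketch under the assumption of a POSIX-conforming tr; the variable names and sample text are illustrative.

```shell
# Squeeze only: runs of the same white-space character collapse to one.
squeezed_ws=$(printf 'a   b\n\n\nc\n' | tr -s '[:space:]')

# Case conversion plus squeeze: input is first mapped to lowercase, then
# runs of characters in the string2 array (lowercase letters) collapse.
# Spaces are not in that array, so the double space is left alone.
lower_squeezed=$(printf 'AAB  CC\n' | tr -s '[:upper:]' '[:lower:]')

printf '%s\n' "$squeezed_ws" "$lower_squeezed"
```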
System V Compatibility
[DIGITAL] The root of the directory tree that contains the commands
modified for SVID 2 compliance is specified in the file
/etc/svid2_path. You can use /etc/svid2_profile as the basis for, or to
include in, your .profile. The file /etc/svid2_profile reads
/etc/svid2_path and sets the first entries in the PATH environment
variable so that the modified SVID 2 commands are found first.
[DIGITAL] In the SVID 2 compliant version of the tr command, only
characters in the octal range of 1 to 377 are complemented when you
specify the -c option. This behavior occurs because the -A option is
implicitly forced on when you specify the -c option.
NOTES
[DIGITAL] Specifying the -A option improves ASCII performance.
Despite similarities in appearance, the string arguments used by tr are
not regular expressions.
The tr command correctly processes NUL characters in its input stream.
NUL characters can be stripped using the following command:
    tr -d '\000'
If string1 or string2 is the empty string, results are undefined and
unpredictable.
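
The NUL-stripping command above can be checked with input generated by printf. This is a small sketch; the variable name and sample bytes are assumptions.

```shell
# printf emits a NUL byte for \000; tr -d '\000' removes each one.
stripped=$(printf 'a\000b\000c\n' | tr -d '\000')

printf '%s\n' "$stripped"
```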
EXIT STATUS
The following exit values are returned:
0
    Successful completion.
>0
    An error occurred.
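
A script can branch on these exit values. The sketch below assumes that invoking tr without operands is reported as an error, which is typical but is not stated on this page; the variable names are illustrative.

```shell
# A valid invocation: expect exit status 0.
if printf 'abc\n' | tr 'abc' 'ABC' >/dev/null; then
    ok_status=0
else
    ok_status=$?
fi

# Assumption: omitting the required operands makes tr fail (status > 0).
if tr </dev/null >/dev/null 2>&1; then
    bad_status=0
else
    bad_status=$?
fi

printf 'valid: %s, missing operands: %s\n' "$ok_status" "$bad_status"
```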
EXAMPLES
To translate braces into parentheses, enter:
        tr '{}' '()' <textfile >newfile
    This translates each { (left brace) to ( (left parenthesis) and each
    } (right brace) to ) (right parenthesis). All other characters
    remain unchanged.
In the POSIX locale, to translate lowercase ASCII characters to
uppercase, you can enter:
        tr 'a-z' 'A-Z' <textfile >newfile
    This command assumes that English letters are collated in English
    alphabetical order, which may not be true for locales other than the
    POSIX locale. The following command is recommended for case
    conversion for all locales:
        tr '[:lower:]' '[:upper:]' <textfile >newfile
The two strings can be of different lengths:
        tr '0-9' '#' <textfile >newfile
    This translates each 0 into a # (number sign) but does not translate
    the digits 1 to 9; if the two character strings are not the same
    length, the extra characters in the longer one are ignored.
To translate each digit to a # (number sign), enter:
        tr '0-9' '[#*]' <textfile >newfile
    The * (asterisk) tells tr to repeat the # (number sign) enough times
    to make the second string as long as the first one.
To translate each string of digits to a single # (number sign), enter:
        tr -s '0-9' '[#*]' <textfile >newfile
In the POSIX locale, to translate all ASCII characters that are not
specified, enter:
        tr -c '[ -~]' '[A-_]' <textfile >newfile
    This translates each nonprinting ASCII character to the next
    following corresponding control key letter (\001 translates to B,
    \002 to C, and so on). ASCII DEL (\177), the character that follows
    ~ (tilde), translates to a ] (right bracket). This command assumes
    that ASCII characters are collated in a certain order, which may not
    be true for locales other than the POSIX locale.
To create a list of all words in file1 one per line in file2, where a
word is taken to be a maximal string of letters, enter:
        tr -cs '[:alpha:]' '[\n*]' < file1 > file2
To use an equivalence class to identify accented variants of the base
character e in file1, which are stripped of diacritical marks and
written to file2, enter:
        tr '[=e=]' '[e*]' < file1 > file2
    Equivalence classes are locale dependent. Some locales may not
    include equivalence classes to associate base letters and their
    accented variants.
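
The word-list example above combines naturally with other filters. The following sketch, which is not part of the original page, lowercases a sample sentence, splits it into one word per line, and reports the most frequent word; the sample text, the variable name, and the use of sort, uniq, and awk are illustrative assumptions.

```shell
# Lowercase the text, put each word on its own line, then count words.
top_word=$(printf 'The cat and the hat.\n' |
    tr '[:upper:]' '[:lower:]' |
    tr -cs '[:alpha:]' '[\n*]' |
    sort | uniq -c | sort -rn |
    awk 'NR == 1 { print $2 }')

printf 'most frequent word: %s\n' "$top_word"
```

Because -c and -s are used together, every run of nonletters (spaces, punctuation, newlines) becomes a single newline, so no empty lines appear between words.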
ENVIRONMENT VARIABLES
The following environment variables affect the execution of tr:
LANG
    Provides a default value for the internationalization variables that
    are unset or null. If LANG is unset or null, the corresponding value
    from the default locale is used. If any of the internationalization
    variables contain an invalid setting, the utility behaves as if none
    of the variables had been defined.
LC_ALL
    If set to a non-empty string value, overrides the values of all the
    other internationalization variables.
LC_COLLATE
    Determines the locale for the behavior of range expressions and
    equivalence classes.
LC_CTYPE
    Determines the locale for the interpretation of sequences of bytes
    of text data as characters (for example, single-byte as opposed to
    multibyte characters in arguments) and the behavior of character
    classes.
LC_MESSAGES
    Determines the locale for the format and contents of diagnostic
    messages written to standard error.
NLSPATH
    Determines the location of message catalogues for the processing of
    LC_MESSAGES.
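
Because a non-empty LC_ALL overrides the other internationalization variables, it can be set for a single invocation to get POSIX-locale range behavior. This is a sketch only; the locale name en_US.UTF-8 is an assumption and need not be installed for the command to run.

```shell
# LC_ALL=C takes precedence over LANG for this one command, so the
# a-z and A-Z ranges are expanded in POSIX collation order.
posix_upper=$(printf 'hello\n' | LANG=en_US.UTF-8 LC_ALL=C tr 'a-z' 'A-Z')

printf '%s\n' "$posix_upper"
```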
SEE ALSO
Commands: ed(1), ksh(1), sed(1), Bourne shell sh(1b), POSIX shell
sh(1p), trbsd(1)
Files: ascii(5)
Standards: standards(5)
tr(1)