SORT(1) FreeBSD General Commands Manual SORT(1)
NAME
sort - sort, merge, or sequence check text and binary files
SYNOPSIS
sort [-bCcdfgHhiMmnRrsuVz] [-k field1[,field2]] [-o output] [-S size]
[-T dir] [-t char] [file ...]
DESCRIPTION
The sort utility sorts the lines of text or binary files. A line is a
record separated from the subsequent record by a newline (default) or NUL
`\0' character (-z option). A record can contain any printable or
unprintable characters. Comparisons are based on one or more sort keys
extracted from each line according to the specified command line options.
By default, sort uses entire lines for comparison and sorts in ascii(7)
order.
If no file is specified, or if file is `-', the standard input is used.
The options are as follows:
-C, --check=silent|quiet
Check that the single input file is sorted. If it is, exit 0; if
it's not, exit 1. In either case, produce no output.
-c, --check
Like -C, but additionally write a message to stderr if the input
file is not sorted.
-m, --merge
Merge only; the input files are assumed to be pre-sorted. If
they are not sorted, the output order is undefined.
-o output, --output=output
Write the output to the output file instead of the standard
output. This file can be the same as one of the input files.
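Because output may name one of the input files, a file can be sorted in place. The following is a minimal sketch; the scratch file created with mktemp(1) and the variable names are illustrative assumptions and are not part of sort:

```shell
# Create a scratch file (hypothetical name pattern) holding unsorted numbers.
scratch=$(mktemp "${TMPDIR:-/tmp}/sortdemo.XXXXXX")
printf '3\n1\n2\n' > "$scratch"

# Sort numerically and write the result back over the input file.
sort -n -o "$scratch" "$scratch"

in_place=$(cat "$scratch")
printf '%s\n' "$in_place"
rm -f "$scratch"
```

After the command completes, the scratch file holds 1, 2, and 3 on separate lines.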
-S size, --buffer-size=size
Use a memory buffer no larger than size. The modifiers %, b, K,
M, G, T, P, E, Z, and Y can be used. If no memory limit is
specified, sort may use up to about 90% of available memory. If
the input is too big to fit into the memory buffer, temporary
files are used.
-s Stable sort; maintains the original record order of records that
have an equal key. This is a non-standard feature, but it is
widely accepted and used.
-T dir, --temporary-directory=dir
Store temporary files in the directory dir. The default path is
the value of the environment variable TMPDIR or /tmp if TMPDIR is
not defined.
-u, --unique
Unique: suppress all but one in each set of lines having equal
keys. This option implies -s. If used with -C or -c, sort also
checks that there are no lines with duplicate keys.
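The checking options report only through the exit status, so scripts usually capture it. The sketch below assumes small made-up inputs on standard input; the variable names are hypothetical:

```shell
# Already sorted input: the status stays 0.
sorted_status=0
printf '1\n2\n3\n' | sort -n -C || sorted_status=$?

# Disordered input: -C exits non-zero without printing anything.
unsorted_status=0
printf '2\n1\n3\n' | sort -n -C || unsorted_status=$?

# With -u, a repeated key also counts as a failure.
duplicate_status=0
printf '1\n1\n2\n' | sort -n -C -u || duplicate_status=$?

echo "$sorted_status $unsorted_status $duplicate_status"
```

The three statuses are 0, 1, and 1. Replacing -C with -c would additionally write a diagnostic to stderr for the two failing cases.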
The following options override the default ordering rules. If ordering
options appear before the first -k option, they apply globally to all
sort keys. When attached to a specific key (see -k), the ordering
options override all global ordering options for that key. Note that the
ordering options intended to apply globally should not appear after -k or
results may be unexpected.
-d, --dictionary-order
Consider only blank spaces and alphanumeric characters in
comparisons.
-f, --ignore-case
Consider all lowercase characters that have uppercase equivalents
to be the same for purposes of comparison.
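As a small illustration of case folding, the sketch below sorts three made-up words with and without -f. LC_ALL=C is set only so that other sort implementations behave the same way; the variable names are hypothetical:

```shell
# Byte order: uppercase letters sort before lowercase ones.
case_sensitive=$(printf 'cherry\nBanana\napple\n' | LC_ALL=C sort)

# With -f, "Banana" is compared as if it had the same case as the others.
case_folded=$(printf 'cherry\nBanana\napple\n' | LC_ALL=C sort -f)

printf '%s\n' "$case_sensitive" "--" "$case_folded"
```

Without -f the order is Banana, apple, cherry; with -f it is apple, Banana, cherry.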
-g, --general-numeric-sort, --sort=general-numeric
Use an initial numeric string as the key and sort numerically.
As opposed to -n, this option handles general floating point
numbers. It has a more permissive format than that allowed by -n
but it has a significant performance drawback.
-h, --human-numeric-sort, --sort=human-numeric
Use an initial numeric string with an optional SI suffix as the
key. Sorts first by numeric sign (negative, zero, or positive);
then by SI suffix (either empty, or `k' or `K', or one of
`MGTPEZY', in that order); and finally by numeric value. The SI
suffix must immediately follow the number. For example, '12345K'
sorts before '1M', because M is "larger" than K. This sort
option is useful for sorting the output of a single invocation of
a df(1) command with -h or -H options (human-readable).
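The suffix-first ordering can be seen with a few made-up sizes. This is only a sketch; the input values and the variable name are assumptions chosen for illustration:

```shell
# The suffix decides first (none, K, M, G), then the numeric value.
human_sorted=$(printf '1M\n12345K\n512\n2G\n' | sort -h)
printf '%s\n' "$human_sorted"
```

The result is 512, 12345K, 1M, 2G, even though 12345K denotes a larger quantity than 1M.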
-i, --ignore-nonprinting
Ignore all non-printable characters.
-M, --month-sort, --sort=month
Sort by month abbreviations. Unknown strings are considered
smaller than valid month names.
-n, --numeric-sort, --sort=numeric
Use an initial numeric string as the key, consisting of optional
blank space, an optional minus sign, and zero or more digits
including an optional decimal point, and sort numerically.
Leading blank characters are ignored.
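To contrast numeric order with the default byte order, the sketch below sorts the same made-up input both ways. LC_ALL=C is set only for portability to other implementations, and the variable names are hypothetical:

```shell
# Default comparison: lines are compared byte by byte as strings.
lexical=$(printf '10\n9\n-2\n1.5\n100\n' | LC_ALL=C sort)

# Numeric comparison: the minus sign and decimal point are interpreted.
numeric=$(printf '10\n9\n-2\n1.5\n100\n' | LC_ALL=C sort -n)

printf '%s\n' "$lexical" "--" "$numeric"
```

The default order is -2, 1.5, 10, 100, 9, while -n yields -2, 1.5, 9, 10, 100.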
-R, --random-sort, --sort=random
Sort lines in random order. This is a random permutation of the
inputs with the exception that equal keys sort together. It is
implemented by hashing the input keys and sorting the hash
values. The hash function is randomized with data from
arc4random_buf(3), or by file content if one is specified via
--random-source. If multiple sort fields are specified, the same
random hash function is used for all of them.
-r, --reverse
Sort in reverse order.
-V, --version-sort
This option is intended to sort strings that contain version
numbers but it can be used for other purposes as well, for
example to sort IPv4 addresses in dotted quad notation.
When comparing two strings, both strings are split into
substrings such that the first and every other odd-numbered
substring consists of non-digit characters only, while every
even-numbered substring consists of digits only. These
substrings are compared in turn from left to right until a
difference is found. The first substring can be empty; all
others cannot.
Non-digit substrings are compared alphabetically, with upper case
letters sorting before lower case letters, letters sorting before
non-letters, and non-letters sorting in ascii(7) order.
Substrings consisting of digits are compared as integer numbers.
At the end of each string, zero or more suffixes that start with
a dot, consist only of letters, digits, and tilde characters, and
do not start with a digit are ignored, equivalent to the regular
expression "(\.([A-Za-z~][A-Za-z0-9~]*)?)*". This is intended
for ignoring filename suffixes such as ".tar.bz2".
In the following example, the first substring is "sort-" and the
other odd-numbered substrings are all ".":
$ ls sort* | sort -V
sort-1.022.tgz
sort-1.23.tgz
sort-1.23.1.tgz
sort-1.024.tgz
sort-1.024.003.
sort-1.024.003.tgz
sort-1.024.07.tgz
sort-1.024.009.tgz
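The same digit-aware comparison orders IPv4 addresses in dotted quad notation, as mentioned above. A minimal sketch with made-up addresses (the variable name is hypothetical):

```shell
# Each run of digits is compared as an integer, so 9 < 10 < 100.
addresses_sorted=$(printf '10.0.0.10\n192.168.1.1\n10.0.0.9\n10.0.0.100\n' | sort -V)
printf '%s\n' "$addresses_sorted"
```

This prints 10.0.0.9, 10.0.0.10, 10.0.0.100, and 192.168.1.1, in that order.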
The treatment of field separators can be altered using these options:
-b, --ignore-leading-blanks
Ignore leading blank space when determining the start and end of
a restricted sort key (see -k). If -b is specified before the
first -k option, it applies globally to all key specifications.
Otherwise, -b can be attached independently to each field
argument of the key specifications. Note that -b should not
appear after -k, and that it has no effect unless key fields are
specified.
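The effect of -b on a restricted key can be sketched with two made-up lines, one of which has two blanks before its second field. LC_ALL=C is set only for portability to other implementations; the variable names are hypothetical:

```shell
# Without -b, field 2 of the first line is "  b" and the blanks compare
# lower than any letter, so that line sorts first.
with_blanks=$(printf 'x  b\ny a\n' | LC_ALL=C sort -k 2,2)

# A global -b, given before -k, starts each key at its first non-blank.
without_blanks=$(printf 'x  b\ny a\n' | LC_ALL=C sort -b -k 2,2)

printf '%s\n' "$with_blanks" "--" "$without_blanks"
```

The first command keeps "x  b" ahead of "y a"; with -b the keys are "b" and "a", so "y a" comes first.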
-k field1[,field2], --key=field1[,field2]
Define a restricted sort key that has the starting position
field1, and optional ending position field2 of a key field. The
-k option may be specified multiple times, in which case
subsequent keys are compared after earlier keys compare equal.
The -k option replaces the obsolete options +pos1 and -pos2, but
the old notation is also supported.
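As a sketch of multiple keys, the following sorts made-up "name score" lines by name and then, within equal names, by score in descending numeric order. LC_ALL=C is set only for portability; the variable name is hypothetical:

```shell
# Key 1: the first field, compared as text.
# Key 2: the second field, numeric (n) and reversed (r).
by_name_then_score=$(printf 'b 2\na 10\na 9\nb 1\n' |
    LC_ALL=C sort -k 1,1 -k 2,2nr)
printf '%s\n' "$by_name_then_score"
```

The output is "a 10", "a 9", "b 2", "b 1". Writing -k 1,1 rather than -k 1 matters: without an ending position the first key would extend to the end of the line.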
-t char, --field-separator=char
Use char as the field separator character. The initial char is
not considered to be part of a field when determining key
offsets. Each occurrence of char is significant (for example,
"charchar" delimits an empty field). If -t is not specified, the
default field separator is a sequence of blank-space characters,
and consecutive blank spaces do not delimit an empty field;
further, the initial blank space is considered part of a field
when determining key offsets. To use NUL as the field separator, use
-t '\0'.
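The following sketch shows that two adjacent separators delimit an empty field. The colon-separated records are made up, LC_ALL=C is set only for portability, and the variable name is hypothetical:

```shell
# The second field of "x::3" is empty and sorts before "a" and "b".
by_second_field=$(printf 'y:b:1\nx::3\nz:a:2\n' | LC_ALL=C sort -t : -k 2,2)
printf '%s\n' "$by_second_field"
```

The lines come out as x::3, z:a:2, y:b:1.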
-z, --zero-terminated
Use NUL as the record separator. By default, records in the
files are expected to be separated by newline characters.
With this option, NUL (`\0') is used as the record separator
character.
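A short sketch of NUL-separated records follows. The use of tr(1) to make the result readable and the variable name are assumptions for illustration; LC_ALL=C is set only for portability:

```shell
# Three NUL-terminated records are sorted, then NULs are turned into
# newlines for display.
records=$(printf 'pear\0apple\0fig\0' | LC_ALL=C sort -z | tr '\0' '\n')
printf '%s\n' "$records"
```

The records appear in the order apple, fig, pear. This form is useful for data that may itself contain newlines, such as file names.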
Other options:
--batch-size=num
Specify the maximum number of files that sort can open at once.
This option affects behavior when there are many input files or
when temporary files are used. The minimum value is 2. The
default value is 16.
--compress-program=program
Use program to compress temporary files. When invoked with no
arguments, program must compress standard input to standard
output. When called with the -d option, it must decompress
standard input to standard output. If program fails, sort will
exit with an error. The compress(1) and gzip(1) utilities meet
these requirements.
--debug
Print some extra information about the sorting process to the
standard output.
--files0-from=filename
Take the input file list from the file filename. The file names
must be separated by NUL (like the output produced by the command
"find ... -print0").
--heapsort
Try to use heap sort, if the sort specifications allow. This
sort algorithm cannot be used with -u or -s.
--help Print the help text and exit.
-H, --mergesort
Use mergesort. This is a universal algorithm that can always be
used, but it is not always the fastest.
--mmap Try to use the file memory mapping system call. This may
increase speed in some cases.
--qsort
Try to use quick sort, if the sort specifications allow. This
sort algorithm cannot be used with -u or -s.
--radixsort
Try to use radix sort, if the sort specifications allow. The
radix sort can only be used for trivial locales (C and POSIX),
and it cannot be used for numeric or month sort. Radix sort is
very fast and stable.
--random-source=filename
For random sort, the contents of filename are used as the source
of the `seed' data for the hash function. Two invocations of
random sort with the same seed data produce the same result if
the input is also identical. By default, the arc4random_buf(3)
function is used instead.
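A repeatable shuffle can be sketched by supplying fixed seed data. The seed file of 4096 zero bytes is an arbitrary assumption (any fixed content should serve), and the file and variable names are hypothetical:

```shell
# Create a seed file with fixed content.
seed=$(mktemp "${TMPDIR:-/tmp}/sortseed.XXXXXX")
dd if=/dev/zero of="$seed" bs=4096 count=1 2>/dev/null

# Two runs over identical input with identical seed data.
first=$(printf 'a\nb\nc\nd\ne\n' | sort -R --random-source="$seed")
second=$(printf 'a\nb\nc\nd\ne\n' | sort -R --random-source="$seed")
rm -f "$seed"

[ "$first" = "$second" ] && echo "same order both times"
```

The particular order depends on the implementation and on the seed data, so it is not shown here; only its repeatability is expected.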
--version
Print the version and exit.
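The --files0-from option described above pairs naturally with find(1). The sketch below assumes a scratch directory created with mktemp(1); all file and variable names are hypothetical:

```shell
# Set up two small input files in a scratch directory.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/sortdemo.XXXXXX")
printf 'b\nd\n' > "$workdir/one.txt"
printf 'a\nc\n' > "$workdir/two.txt"

# Build a NUL-separated list of input files, then sort them together.
find "$workdir" -name '*.txt' -print0 > "$workdir/names.list"
combined=$(LC_ALL=C sort --files0-from="$workdir/names.list")

printf '%s\n' "$combined"
rm -r "$workdir"
```

The lines of both files are sorted as one input, giving a, b, c, d. Passing the names through a NUL-separated list avoids problems with file names that contain blanks or newlines.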
A field is defined as a maximal sequence of characters other than the
field separator and record separator (newline by default). Initial blank
spaces are included in the field unless -b has been specified; the first
blank space of a sequence of blank spaces acts as the field separator and
is included in the field (unless -t is specified). For example, by
default all blank spaces at the beginning of a line are considered to be
part of the first field.
Fields are specified by the -k field1[,field2] option. If field2 is
missing, the end of the key defaults to the end of the line.
The arguments field1 and field2 have the form m.n (m,n > 0) and can be
followed by one or more of the modifiers b, d, f, i, n, g, M and r, which
correspond to the options discussed above. When b is specified, it
applies only to field1 or field2 where it is specified while the rest of
the modifiers apply to the whole key field regardless of whether they
are specified with field1, field2, or both. A field1 position
specified by m.n is interpreted as the nth character from the beginning
of the mth field. A missing .n in field1 means `.1', indicating the
first character of the mth field; if the -b option is in effect, n is
counted from the first non-blank character in the mth field; m.1b refers
to the first non-blank character in the mth field. 1.n refers to the nth
character from the beginning of the line; if n is greater than the length
of the line, the field is taken to be empty.
The nth position is always counted from the beginning of the field, even
if the field is shorter than n characters. Thus, a key can actually
start at a position in a subsequent field.
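Character positions within a field can be sketched as follows. The made-up lines consist of a two-letter tag followed by a two-digit number; the variable name is hypothetical:

```shell
# The key runs from character 3 to character 4 of field 1 and is
# compared numerically.
by_chars=$(printf 'ab20\ncd10\nef30\n' | sort -k 1.3,1.4n)
printf '%s\n' "$by_chars"
```

The lines are ordered by the embedded number: cd10, ab20, ef30.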
A field2 position specified by m.n is interpreted as the nth character
(including separators) from the beginning of the mth field. A missing .n
indicates the last character of the mth field; m = 0 designates the end
of a line. Thus the option -k v.x,w.y is synonymous with the obsolete
option +v-1.x-1 -w-1.y; when y is omitted, -k v.x,w is synonymous with
+v-1.x-1 -w.0. The obsolete +pos1 -pos2 option is still supported,
except for -w.0b, which has no -k equivalent.
ENVIRONMENT
TMPDIR Path to the directory in which temporary files will be stored.
Note that TMPDIR may be overridden by the -T option.
FILES
/tmp/.bsdsort.PID.* Temporary files.
EXIT STATUS
The sort utility exits with one of the following values:
0 The input files were sorted successfully or, if used with -C or
-c, the input file already met the sorting criteria.
1 On disorder (or non-uniqueness) with the -C or -c options.
2 An error occurred.
SEE ALSO
comm(1), join(1), uniq(1)
STANDARDS
The sort utility is compliant with the IEEE Std 1003.1-2008 ("POSIX.1")
specification, except that it ignores the user's locale(1) and always
assumes LC_ALL=C.
The flags [-gHhiMRSsTVz] are extensions to that specification.
All long options are extensions to the specification. Some are provided
for compatibility with GNU sort, others are specific to this
implementation.
Some implementations of sort honor the -b option even when no key fields
are specified. This implementation follows historic practice and IEEE
Std 1003.1-2008 ("POSIX.1") in only honoring -b when it precedes a key
field.
The historic practice of allowing the -o option to appear after the file
is supported for compatibility with older versions of sort.
The historic key notations +pos1 and -pos2 are supported for
compatibility with older versions of sort but their use is highly
discouraged.
HISTORY
A sort command appeared in Version 1 AT&T UNIX.
AUTHORS
Gabor Kovesdan
Oleg Moskalenko
CAVEATS
This implementation of sort has no limits on input line length (other
than those imposed by available memory) or any restrictions on bytes
allowed within lines.
The performance depends highly on an efficient choice of sort keys and on
key complexity. The fastest sort is on whole lines, with option -s. With
key specifications, the simpler the lines are to process, the faster the
sort will be.
When sorting by arithmetic value, using -n results in much better
performance than -g, so its use is encouraged whenever possible.
FreeBSD 14.1-RELEASE-p8 April 1, 2025 FreeBSD 14.1-RELEASE-p8