diff, diffh, bdiff

compare two text files and show differences 

Command


SYNOPSIS

diff [-BbefHhimnrsw] [-C n] [-c[n]] [-Difname] [-Mmark] [-U[[[c][lb8oa]][p[lb8oa]]]] path1 path2

diff [-BbefHhimnrsw] [-C n] [-c[n]] [-Difname] [-Mmark] [-U[[[c][lb8oa]][p[lb8oa]]]] file... dir

diffh [-Bbefimnrsw] [-C n] [-c[n]] [-Difname] [-Mmark] [-U[[[c][lb8oa]][p[lb8oa]]]] path1 path2

diffh [-Bbefimnrsw] [-C n] [-c[n]] [-Difname] [-Mmark] [-U[[[c][lb8oa]][p[lb8oa]]]] file... dir

bdiff [-Bbefimnrsw] [-C n] [-c[n]] [-Difname] [-Mmark] [-U[[[c][lb8oa]][p[lb8oa]]]] path1 path2 [n]

bdiff [-Bbefimnrsw] [-C n] [-c[n]] [-Difname] [-Mmark] [-U[[[c][lb8oa]][p[lb8oa]]]] file... dir [n]


DESCRIPTION

The diff command attempts to determine the minimal set of changes needed to convert either a file named path1 into path2, or the group of files indicated by file... into the files of the same name found under the directory dir.

Besides normal ASCII text files, diff and its related utilities also work on UTF-8 files and 16-bit wide Unicode files. Such files normally begin with a multiple-byte marker indicating whether the file's contents are Unicode big-endian, Unicode little-endian, or UTF-8. Such files are detected automatically by diff; however, when the multiple-byte marker is missing you can use the -U option or the TK_STDIO_DEFAULT_INPUT_FORMAT/TK_STDIO_DEFAULT_OUTPUT_FORMAT environment variables to force any file to be treated as a Unicode or UTF-8 file.

Normally, the output format of these utilities defaults to the format of the first file it displays unless the -U option or the TK_STDIO_DEFAULT_OUTPUT_FORMAT environment variable is used to override the output format. For more details on this and other Unicode-related file handling issues see the unicode reference page.

If either (but only one) file name is -, diff reads that file from the standard input. If only two path names appear on the command line and one of path1 or path2 is a directory, diff uses a file in that directory with the same name as the other file name. If both are directories, diff compares files with the same file names under the two directories. However, diff does not compare files in subdirectories unless you specify the -r option. When comparing two directories, diff does not compare block special files, character special files, or FIFO special files to any other files and does not compare regular files to directories.

If more than two path names appear on the command line, the last path name is assumed to be a directory and the preceding path names to be files. diff then compares each file in the list to the file with the same name under the specified directory.
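The operand rules above can be sketched in Python as a function that lists which pairs of files would be compared. This is only an illustration of the rules as described on this page, not the code diff uses; the name pair_operands is hypothetical, and the handling of - (standard input) is left out.

```python
import os

def pair_operands(paths):
    """Return the (left, right) file pairs implied by the operands (hypothetical helper)."""
    if len(paths) < 2:
        raise ValueError("need at least two path names")
    if len(paths) > 2:
        # file... dir: each file is compared to dir/<same name>.
        *files, directory = paths
        return [(f, os.path.join(directory, os.path.basename(f))) for f in files]
    path1, path2 = paths
    dir1, dir2 = os.path.isdir(path1), os.path.isdir(path2)
    if dir1 and not dir2:
        # Use the file in path1 that has the same name as path2.
        return [(os.path.join(path1, os.path.basename(path2)), path2)]
    if dir2 and not dir1:
        return [(path1, os.path.join(path2, os.path.basename(path1)))]
    if dir1 and dir2:
        # Same-named regular files only; no recursion, as without -r.
        names = sorted(set(os.listdir(path1)) & set(os.listdir(path2)))
        return [
            (os.path.join(path1, n), os.path.join(path2, n))
            for n in names
            if os.path.isfile(os.path.join(path1, n))
            and os.path.isfile(os.path.join(path2, n))
        ]
    return [(path1, path2)]
```

For example, pair_operands(["a.txt", "b.txt", "backup"]) pairs a.txt with backup/a.txt and b.txt with backup/b.txt.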

By default, output consists of descriptions of the changes in a style reminiscent of the ed text editor. A line indicating the type of change is given. The three types are a (append), d (delete), and c (change). The output is symmetric in the sense that a delete in path1 is the counterpart of an append in path2. diff prefixes each operation with a line number (or range) in path1 and suffixes each with a line number (or range) in path2. After the line giving the type of change, diff displays the deleted or added lines, prefixing lines from path1 with < and lines from path2 with >.
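As a reading aid for the default output, the following Python sketch splits a change line such as 4c4 or 6a7 into its operation and the two line ranges. The name parse_change is hypothetical, and the pattern assumes the line-number forms described above (a single number, or a range written as two numbers separated by a comma).

```python
import re

# <range in path1><a|c|d><range in path2>, where a range is N or N,M
_CHANGE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

def parse_change(line):
    """Return (op, (first1, last1), (first2, last2)), or None if line is not a change line."""
    match = _CHANGE.match(line)
    if match is None:
        return None  # for example, a "< ..." or "> ..." content line
    start1, end1, op, start2, end2 = match.groups()
    range1 = (int(start1), int(end1 or start1))
    range2 = (int(start2), int(end2 or start2))
    return op, range1, range2
```

Lines beginning with <, >, or --- return None, so a caller can tell change lines apart from the displayed file contents.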

When you call the command as diffh, it automatically uses the -h option.

When you call it as bdiff, diff computes the differences in chunks of n lines (default 3999). This lets you process arbitrarily large files and generally produces less output than the -h option.

Options

-B 

uses diffb to compare the files when binary files are detected.

-b 

ignores white space at the end of each line (except the newline) and treats all consecutive strings of white space elsewhere in a line as equivalent (effectively, reducing all strings of white space to a single space for the purpose of comparing lines). For example, if one file contained a string of three spaces and a tab at a given location while the other file contained a string of two spaces at the same location, diff would not report this as a difference.
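The comparison rule for -b can be approximated in Python as follows. This is a sketch of the rule as worded above, assuming that white space means spaces and tabs; normalize_b and equal_under_b are hypothetical names.

```python
import re

def normalize_b(line):
    """Reduce a line to the form compared under -b (assumes white space = space or tab)."""
    body = line.rstrip("\n")
    body = body.rstrip(" \t")            # trailing white space is ignored
    return re.sub(r"[ \t]+", " ", body)  # every other run counts as one space

def equal_under_b(line1, line2):
    return normalize_b(line1) == normalize_b(line2)
```

Under this rule, three spaces and a tab match two spaces, but "ab" and "a b" still differ because white space is reduced, not removed.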

-C n 

shows n lines of context before and after each change. diff marks lines removed from path1 with -, lines added to path2 with + and lines changed in both files with !. This option conflicts with the -e and -f options.

-c[n]

is equivalent to -C n, but n is optional. The default value for n is 3. This option conflicts with the -e and -f options.

-Difname 

displays output that is the appropriate input to the C preprocessor to generate the contents of path2 when ifname is defined, and the contents of path1 when ifname is not defined.
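To illustrate the idea behind -D, the following Python sketch merges two versions of a file into one text guarded by preprocessor conditionals. It uses Python's difflib to find the changes and the common #ifndef/#else/#endif layout; the exact directives this diff emits are not specified on this page, so treat the layout as an assumption. The name merge_ifdef is hypothetical.

```python
import difflib

def merge_ifdef(old_lines, new_lines, name):
    """Merge two line lists so that defining `name` selects new_lines (illustrative layout)."""
    merged = []
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(old_lines[i1:i2])
        elif tag == "delete":
            # Lines present only in the old version.
            merged.append(f"#ifndef {name}")
            merged.extend(old_lines[i1:i2])
            merged.append("#endif")
        elif tag == "insert":
            # Lines present only in the new version.
            merged.append(f"#ifdef {name}")
            merged.extend(new_lines[j1:j2])
            merged.append("#endif")
        else:  # "replace"
            merged.append(f"#ifndef {name}")
            merged.extend(old_lines[i1:i2])
            merged.append("#else")
            merged.extend(new_lines[j1:j2])
            merged.append("#endif")
    return merged
```

Running the merged text through the C preprocessor with the symbol defined yields the new version; without it, the old version.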

-e 

writes out a script of commands for the ed text editor that converts path1 to path2. diff sends the output to the standard output. This option conflicts with the -m, -c, and -C options.

-f 

writes a script similar to the one produced under -e to the standard output, but with the command sequence reversed, the command form reversed, and the line-range separator a space, rather than a comma. This option conflicts with the -m, -c, and -C options.

-H 

uses the half-hearted (-h) algorithm only if the normal algorithm runs out of system resources.

-h 

uses a fast, half-hearted algorithm instead of the normal diff algorithm. This algorithm can handle arbitrarily large files; however, it is not particularly good at finding a minimal set of differences in files with many differences.

-i 

ignores the case of letters when doing the comparison.

-Mmark 

describes the diff mark to be used in place of the vertical line character (|) when the -m option is also specified. mark can be any nroff/troff character or string. It can be as simple or complex as you like. For example, specifying mark as 1 causes the diff mark to be the character 1. Specifying \(sq causes it to be an open square and specifying \s+6\fB\d~\u\fP\s0 causes the diff mark to be a bold-faced tilde (~) six points larger in size half a line below the current line.

This option does not change the character used to indicate deleted lines. It remains the asterisk (*).

-m 

produces the contents of path2 with extra formatter request lines interspersed to show which lines were added or changed and which were deleted.

If you do not also specify the -M option, added or changed lines are indicated by a vertical line character (|) in the right margin. If you do specify the -M option, diff uses the nroff/troff string given as mark to indicate additions or changes. Deleted lines are always indicated with an asterisk (*) in the right margin.

The formatter request lines are nroff/troff requests. This option conflicts with the -e and -f options.

-n 

displays the differences in a form that is usable by PTC Windchill Requirements and Validation.

-r 

compares corresponding files under the directories, and recursively compares corresponding files under corresponding subdirectories. You can use this option when you specify two directory names on the command line.

-s 

compares two directories, file by file, and prints messages for identical files between the two directories.

-U[[[c][lb8oa]][p[lb8oa]]] 

specifies the input format of any file missing the initial multiple-byte marker, the output format produced, or both.

When c is specified, the specifiers that follow it apply to the input consumed.

When p is specified, the specifiers that follow it apply to the output produced.

When neither c nor p is specified, the remaining -U specifiers apply to the input consumed.

When both c and p are specified, the remaining -U arguments apply to both input and output.

The remaining specifiers indicate the format of the characters read from input or written to output (as determined by c and p):

l     little-endian 16-bit wide characters
b     big-endian 16-bit wide characters
8     UTF-8 characters
a     ASCII characters from the ANSI code page
o     ASCII characters from the OEM code page

When multiple format specifiers can be associated with either c or p, the last appropriate one given on the command for each of c and p is used. For example:

-Ucoapl8

is the same as:

-Ucap8

When a p specifier is given without a c specifier and format specifiers are given before the p specifier, those format specifiers apply to the input. For example:

-Uopl

is the same as:

-Ucopl

When c or p is specified with no format specifiers, little-endian 16-bit wide characters are used by default for either input or output, as appropriate.

As an alternative to specifying formats for both input and output with the same -U option, you can specify the -U option multiple times. For example, the following are identical:

-Uca -Upb
-Ucapb

Note:

The -U specifiers are actually case-insensitive. For example, the following are all identical in their behavior:

-Ucl
-UcL
-UCl
-UCL
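The -U rules above can be checked against a small Python model. This sketch reads the specifiers strictly left to right and reproduces the equivalences given in the examples; it is an interpretation of this page, not the utility's own parser, and it does not try to model any form in which one set of specifiers applies to both input and output. The name parse_u_options is hypothetical.

```python
FORMAT_CHARS = "lb8oa"

def parse_u_options(specs):
    """specs: the text following each -U, e.g. ["ca", "pb"]. Returns (input, output)."""
    chosen = {"input": None, "output": None}
    named = {"input": False, "output": False}
    for spec in specs:
        side = "input"             # specifiers before any c or p apply to input
        for ch in spec.lower():    # specifiers are case-insensitive
            if ch == "c":
                side = "input"
                named[side] = True
            elif ch == "p":
                side = "output"
                named[side] = True
            elif ch in FORMAT_CHARS:
                chosen[side] = ch  # the last format given for a side wins
            else:
                raise ValueError(f"unknown -U specifier: {ch!r}")
    for side in chosen:
        # c or p with no format defaults to little-endian 16-bit characters.
        if named[side] and chosen[side] is None:
            chosen[side] = "l"
    return chosen["input"], chosen["output"]
```

A result of None for a side means that no -U specifier set it, so the automatic detection and environment variables described elsewhere on this page would apply.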

-w 

ignores white space (not including newlines) when making the comparison. For example, the following two lines are equivalent:

a b  c d
abcd
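In Python terms, the -w comparison amounts to removing white space before comparing. This is a sketch assuming white space means spaces and tabs; strip_w is a hypothetical name.

```python
def strip_w(line):
    """Return the line as compared under -w: no spaces or tabs, newline not counted."""
    body = line.rstrip("\n")
    return "".join(ch for ch in body if ch not in " \t")
```

With this rule, strip_w("a b  c d\n") and strip_w("abcd\n") both give "abcd", so the two lines compare as equal.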


EXAMPLES

The following example illustrates the output of the diff command. Two files, price1 and price2, are compared with and without the -c option.

The contents of price1 are as follows:

Company X Price List:

$  0.39  -- Package of Groat Clusters
$  5.00  -- Candy Apple Sampler Pack
$ 12.00  -- Box of Crunchy Frog Chocolates
$ 15.99  -- Instant Rain (Just Add Water)
$ 20.00  -- Asparagus Firmness Meter
$ 25.00  -- Package of Seeds for 35 Herbs
$ 30.00  -- Child's Riding Hood (Red)
$ 35.00  -- Genuine Placebos
$ 45.00  -- Case of Simulated Soy Bean Oil
$ 75.88  -- No-Name Contact Lenses
$ 99.99  -- Kiddie Destructo-Bot
$125.00  -- Emperor's New Clothes

The contents of price2 are as follows:

Company X Price List:

$  0.39  -- Package of Groat Clusters
$  5.49  -- Candy Apple Sampler Pack
$ 12.00  -- Box of Crunchy Frog Chocolates
$ 15.99  -- Instant Rain (Just Add Water)
$ 17.00  -- Simulated Naugahyde Cleaner
$ 20.00  -- Asparagus Firmness Meter
$ 25.00  -- Package of Seeds for 35 Herbs
$ 30.00  -- Child's Riding Hood (Red)
$ 35.00  -- Genuine Placebos
$ 45.00  -- Case of Simulated Soy Bean Oil
$ 75.88  -- No-Name Contact Lenses
$ 99.99  -- Kiddie Destructo-Bot

The command

diff price1 price2

results in the following output:

4c4
< $  5.00  -- Candy Apple Sampler Pack
---
> $  5.49  -- Candy Apple Sampler Pack
6a7
> $ 17.00  -- Simulated Naugahyde Cleaner
14d14
< $125.00  -- Emperor's New Clothes

The addition of the -c option, as in

diff -c price1 price2

results in the following output:

*** price1 Wed Mar 04 10:08:40 1992
--- price2 Wed Mar 04 10:09:10 1992
***************
*** 1,9 ****
  Company X Price List:
  
  $  0.39  -- Package of Groat Clusters
! $  5.00  -- Candy Apple Sampler Pack
  $ 12.00  -- Box of Crunchy Frog Chocolates
  $ 15.99  -- Instant Rain (Just Add Water)
  $ 20.00  -- Asparagus Firmness Meter
  $ 25.00  -- Package of Seeds for 35 Herbs
  $ 30.00  -- Child's Riding Hood (Red)
--- 1,10 ----
  Company X Price List:
  
  $  0.39  -- Package of Groat Clusters
! $  5.49  -- Candy Apple Sampler Pack
  $ 12.00  -- Box of Crunchy Frog Chocolates
  $ 15.99  -- Instant Rain (Just Add Water)
+ $ 17.00  -- Simulated Naugahyde Cleaner
  $ 20.00  -- Asparagus Firmness Meter
  $ 25.00  -- Package of Seeds for 35 Herbs
  $ 30.00  -- Child's Riding Hood (Red)
***************
*** 11,14 ****
  $ 45.00  -- Case of Simulated Soy Bean Oil
  $ 75.88  -- No-Name Contact Lenses
  $ 99.99  -- Kiddie Destructo-Bot
- $125.00  -- Emperor's New Clothes
--- 12,14 ----

diff -c marks lines removed from price1 with -, lines added to price2 with +, and lines changed in both files with !. In the example, diff shows the default 3 lines of context around each changed line. One line was changed in both files (marked with !), one line was added to price2 (marked with +), and one line was removed from price1 (marked with -).

Note:

If there are no marks to be shown in the corresponding lines of the file being compared, the lines are not displayed. Lines 12 to 14 of price2 are suppressed for this reason.


ENVIRONMENT VARIABLES

TK_STDIO_DEFAULT_INPUT_FORMAT 

Sets the default input format for files that don't have the initial multibyte marker. The value must be one of those listed in the File Character Formats section of the unicode reference page.

TK_STDIO_DEFAULT_OUTPUT_FORMAT 

Sets the default output format. Normally the format of the first file read is used as the default output format. The value must be one of those listed in the File Character Formats section of the unicode reference page.


DIAGNOSTICS

Possible exit status values are:

0 

No differences between the files compared.

1 

diff compared the files and found them to be different.

2 

Failure due to any of the following:

— invalid command line argument
— cannot open one of the input files
— out of memory
— read error on one of the input files
— more than LINE_MAX characters between newlines

4 

At least one of the files is a binary file containing embedded NUL (\0) bytes.
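A script that calls diff can branch on these exit status values. The following Python sketch maps each value to the meaning listed above and runs diff through the standard subprocess module. The names describe_status and run_diff are hypothetical; the command name diff is assumed to be on the search path, and status 4 is taken from this page, so other diff implementations may not return it.

```python
import subprocess

STATUS_MEANING = {
    0: "no differences",
    1: "files differ",
    2: "failure (bad argument, unreadable file, out of memory, read error, or overlong line)",
    4: "binary file with embedded NUL bytes",
}

def describe_status(code):
    """Translate a diff exit status into the meaning given on this page."""
    return STATUS_MEANING.get(code, f"unexpected exit status {code}")

def run_diff(path1, path2, diff_cmd="diff"):
    """Run diff on two paths; return (status, meaning, raw output bytes)."""
    result = subprocess.run([diff_cmd, path1, path2], capture_output=True)
    return result.returncode, describe_status(result.returncode), result.stdout
```

The output is returned as bytes rather than decoded text because, as described above, the output format can follow the format of the first file displayed.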

Messages

file "filename": no such file or directory 

The specified filename does not exist. filename was either typed explicitly, or generated by diff from the directory of one file argument and the base name of the other.

Files file1 and file2 are identical 

The -s option was specified and the two named files are identical.

Common subdirectories: name and name 

This message appears when diff is comparing the contents of directories, but you have not specified -r. When diff discovers two subdirectories with the same name, it reports that the directories exist, but it does not try to compare the contents of the two directories.

Insufficient memory (try diff -h) 

diff ran out of memory for generating the data structures used in the file differencing algorithm (see LIMITS). The -h option of diff can handle any size of file without running out of memory.

Internal error--cannot create temporary file 

diff was unable to create a working file that it needed. Ensure that you either have a /tmp directory or that the environment contains a variable TMPDIR that names a directory where diff may store temporary files. Also, ensure that there is sufficient file space in this directory.

Missing #ifdef symbol after -D 

You did not specify a conditional label on the command line after the -D option.

Only one file may be "-" 

Of the two input files normally found on the command line of diff, only one can be the standard input.

Too many lines in "filename"

A file of more than the maximum number of lines (see LIMITS) was given to diff.


LIMITS

The longest input line is 8192 bytes on Windows and most UNIX systems. Except under -h, files are limited to INT_MAX lines.


PORTABILITY

POSIX.2. x/OPEN Portability Guide 4.0. All UNIX systems. Windows 8.1. Windows Server 2012 R2. Windows 10. Windows Server 2016. Windows Server 2019. Windows 11. Windows Server 2022.

The -D, -H, -h, -i, -m, -n, -s, and -w options; the n argument to the -c option; and the diffh and bdiff versions of the command are extensions to the POSIX and x/OPEN standard. The -f option is an x/OPEN extension to the POSIX standard.


AVAILABILITY

PTC MKS Toolkit for Power Users
PTC MKS Toolkit for System Administrators
PTC MKS Toolkit for Developers
PTC MKS Toolkit for Interoperability
PTC MKS Toolkit for Professional Developers
PTC MKS Toolkit for Professional Developers 64-Bit Edition
PTC MKS Toolkit for Enterprise Developers
PTC MKS Toolkit for Enterprise Developers 64-Bit Edition
PTC Windchill Requirements and Validation


SEE ALSO

Commands:
cmp, comm, diff3, diffb, dircmp, patch, vdiff32


PTC MKS Toolkit 10.4 Documentation Build 39.