DiffUtil(n) 0.4.2 diffutil "Comparison Utilities"

Name

DiffUtil - Compare Stuff

Synopsis

Description

This package provides utilities for comparing strings, lists and files. The base comparison is a Longest Common Subsequence algorithm based on J. W. Hunt and M. D. McIlroy, "An algorithm for differential file comparison," Comp. Sci. Tech. Rep. #41, Bell Telephone Laboratories (1976). Available on the Web at the second author's personal site: http://www.cs.dartmouth.edu/~doug/

Commands

::DiffUtil::diffFiles ?options? file1 file2

Compare two files line by line. The return value depends on the -result option; see below.

-nocase

Ignore case.

-i

Ignore case.

-b

Ignore space changes. Any sequence of whitespace is treated as a single space, except at the beginning of a line, where it is ignored completely.
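The rule stated for -b can be pictured with a small pure-Tcl sketch. The helper normalizeSpaceChanges is hypothetical and not part of DiffUtil; it only mirrors the sentence above and says nothing about how the package implements the option. Whitespace at the end of a line is not covered by the description, so the sketch leaves it as a single space.

```tcl
# Hypothetical helper; not part of DiffUtil.
# Mirrors the documented -b rule: leading whitespace is dropped,
# every other run of whitespace becomes a single space.
proc normalizeSpaceChanges {line} {
    # Ignore whitespace at the beginning of the line completely.
    set line [string trimleft $line]
    # Collapse each remaining run of whitespace into one space.
    return [regsub -all {\s+} $line " "]
}

# Two lines that differ only in spacing normalize to the same string.
puts [normalizeSpaceChanges "   set  x\t 1"]
puts [normalizeSpaceChanges "set x 1"]
```

Under this reading, two lines compare as equal with -b when their normalized forms are identical.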

-w

Ignore all spaces.

-noempty

Ignore empty lines in the initial compare step, i.e. empty lines are at first considered not equal. Many equal lines give longer runtimes, and excluding the very common empty line improves runtime. With -b or -w, all-space lines are considered empty. This is similar to having a -pivot value of 1 just for empty lines.

-pivot value

Ignore common lines in the initial compare step. Many equal lines give longer runtimes, and excluding them improves runtime. The pivot value is the maximum number of equal lines that may occur in file2 for those lines still to be considered. The default is 10.

-nodigit

Consider any sequence of digits equal.

-align list

Align lines. The argument is a list with an even number of elements. Each pair gives a line number in the first file and a line number in the second file. Those lines are considered equal and matching, regardless of their contents.

-range list

Diff only a range of the files. The list is {first1 last1 first2 last2}.

-regsub list

Apply a search/replace regular expression before comparing. The list consists of an even number of elements. Each pair is a regular expression and a substitution, as used in regsub -all. Multiple pairs are allowed, and the -regsub option may be given multiple times. All patterns are applied in order to each line.

-regsubleft list

Like -regsub but only applied to the first file.

-regsubright list

Like -regsub but only applied to the second file.
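How a -regsub list acts on a single line can be sketched in pure Tcl. The helper applyRegsubPairs and the sample patterns are hypothetical; the sketch only follows the description above (each pair used as in regsub -all, applied in order) and is not taken from the package source.

```tcl
# Hypothetical helper; not part of DiffUtil.
# Applies each {pattern substitution} pair in order, as regsub -all would.
proc applyRegsubPairs {pairs line} {
    foreach {pattern substitution} $pairs {
        set line [regsub -all -- $pattern $line $substitution]
    }
    return $line
}

# Two pairs: mask hexadecimal addresses, then mask hh:mm:ss time stamps.
set pairs {{0x[0-9a-fA-F]+} ADDR {\d\d:\d\d:\d\d} TIME}

puts [applyRegsubPairs $pairs "12:00:01 alloc at 0x7ffe10"]   ;# TIME alloc at ADDR
puts [applyRegsubPairs $pairs "12:00:07 alloc at 0x55aa20"]   ;# TIME alloc at ADDR
```

A list of this shape would be passed as the argument of -regsub, -regsubleft or -regsubright, so that lines differing only in the masked parts compare as equal.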

-result style

Select result style. The default is diff.

diff

Returns a list of differences, each a four-element list: {LineNumber1 NumberOfLines1 LineNumber2 NumberOfLines2}. The first line in a file is number 1.

match

The return value is a list of two lists of equal length. The first sublist holds line numbers in file1, and the second sublist holds line numbers in file2. Each pair of line numbers at the same position identifies equal lines in the two files.
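The two result styles can be unpacked with a few lines of pure Tcl. The helpers describeDiff and matchPairs are hypothetical and not part of DiffUtil, and the sample values are made up for illustration; the sketch only relies on the list shapes described above.

```tcl
# Hypothetical helpers; not part of DiffUtil.

# Turn a diff-style result into one readable string per difference.
proc describeDiff {diffResult} {
    set out {}
    foreach chunk $diffResult {
        lassign $chunk line1 n1 line2 n2
        lappend out "file1: $n1 line(s) from line $line1, file2: $n2 line(s) from line $line2"
    }
    return $out
}

# Turn a match-style result into a list of {line1 line2} pairs.
proc matchPairs {matchResult} {
    lassign $matchResult lines1 lines2
    set pairs {}
    foreach l1 $lines1 l2 $lines2 {
        lappend pairs [list $l1 $l2]
    }
    return $pairs
}

puts [join [describeDiff {{3 2 3 4}}] \n]
# file1: 2 line(s) from line 3, file2: 4 line(s) from line 3
puts [matchPairs {{1 2 5} {1 2 7}}]
# {1 1} {2 2} {5 7}
```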

-encoding encoding

Apply encoding when reading files. This works as for fconfigure.

-translation value

Apply translation when reading files. This works as for fconfigure.

-gz

Apply gunzip decompression when reading files. Requires zlib.

-lines varname

Keep the data read from the files. A two-element list with the lines from each file is stored in the given variable.

::DiffUtil::diffLists ?options? list1 list2

Compare two lists element by element. The return value depends on the -result option; see below.

-nocase

Ignore case.

-i

Ignore case.

-b

Ignore space changes.

-w

Ignore all spaces.

-noempty

Ignore empty elements in the initial compare step, i.e. empty elements are at first considered not equal. Many equal elements give longer runtimes, and excluding the very common empty element improves runtime. With -b or -w, all-space elements are considered empty. Empty elements that obviously match are noted as equal in a post-processing step, but empty elements within change blocks are reported as changes.

-nodigit

Consider any sequence of digits equal.

-result style

Select result style. The default is diff.

diff

Returns a list of differences, each a four-element list: {ElementIndex1 NumberOfElements1 ElementIndex2 NumberOfElements2}.

match

The return value is a list of two lists of equal length. The first sublist holds indices in list1, and the second sublist holds indices in list2. Each pair of indices at the same position identifies equal elements in the two lists.
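A minimal usage sketch for diffLists follows. The helper name listsDiffer is hypothetical, and the package name used in package require is an assumption based on the title of this page. The sketch also assumes that an empty diff-style result means no differences were found.

```tcl
# Hypothetical helper; not part of DiffUtil.
# Returns 1 when the two lists differ (ignoring case), otherwise 0.
proc listsDiffer {listA listB} {
    # Assumed package name; loaded on first use.
    package require DiffUtil
    # Default result style is diff: one four-element list per difference.
    set diffs [::DiffUtil::diffLists -nocase $listA $listB]
    return [expr {[llength $diffs] > 0}]
}
```

It would be called as listsDiffer {alpha beta gamma} {Alpha beta delta}.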

::DiffUtil::compareFiles ?options? file1 file2

Compare two files. The return value is a boolean which is true when equal.

-nocase

Ignore case.

-ignorekey

Ignore keyword substitutions. This is limited to the first 60k of the file.

-encoding enc

Read files with this encoding. (As in fconfigure -encoding.)

-translation trans

Read files with this translation. (As in fconfigure -translation.)

::DiffUtil::compareStreams ?options? ch1 ch2

Compare two channel streams. The return value is a boolean which is true when equal.

-nocase

Ignore case.

-ignorekey

Ignore keyword substitutions. This is limited to the first 60k read.

-binary

Treat stream as binary data. Normally this means it is configured with -translation binary.
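A minimal usage sketch for compareStreams with -binary follows. The helper name sameBytes is hypothetical, and the package name used in package require is an assumption based on the title of this page. The channels are opened in mode rb, which configures them with -translation binary as the -binary description expects; try/finally requires Tcl 8.6.

```tcl
# Hypothetical helper; not part of DiffUtil.
# Returns the boolean result of comparing two files as binary streams.
proc sameBytes {fileA fileB} {
    # Assumed package name; loaded on first use.
    package require DiffUtil
    set chA [open $fileA rb]
    try {
        set chB [open $fileB rb]
        try {
            return [::DiffUtil::compareStreams -binary $chA $chB]
        } finally {
            close $chB
        }
    } finally {
        # Runs whether or not the comparison or the second open failed.
        close $chA
    }
}
```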

Examples

% DiffUtil::diffFiles $file1 $file2
{{3 2 3 4}}

Keywords

diff, lcs, longest common subsequence