bug-gnu-utils

Re: Diff doesn't properly ignore whitespace for this input


From: Tyler Bletsch
Subject: Re: Diff doesn't properly ignore whitespace for this input
Date: Fri, 17 Jul 2015 13:22:36 -0400
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

Thanks for the reply. It is so neat to see one of the original authors update diff with a fix that actually affects me. A nice object lesson in production software development for my class, too.

- Tyler

On 7/16/2015 4:45 PM, Jim Meyering wrote:
On Tue, Jul 14, 2015 at 12:01 PM, Tyler Bletsch <address@hidden> wrote:
I believe I've found a bug in diff's handling of "ignore whitespace" mode. I
have two test files that differ only in whitespace and newlines; I've
verified this using a separate tool (WinMerge) plus doing a diff on the
files after doing s/\s*/ / on the whole file. When I ask for the diff using
"-wB", it reports a spurious whitespace-only difference if I give the
arguments in one order, but correctly reports no differences if I give it
the reverse order.  Further, I get consistently correct behavior if I add
the "-d" option.
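
The independent check described above (normalize the whitespace, then compare) can be sketched in shell. This is only an illustration, not the exact procedure used in the report: it deletes whitespace outright instead of substituting it, and the function name same_ignoring_whitespace is made up for the example.

```shell
# Hypothetical helper (not from the original report): succeeds when two
# files are byte-identical once every whitespace character, including
# newlines, has been removed.
same_ignoring_whitespace() {
    siw_tmp=$(mktemp) || return 2
    # Strip all whitespace from the first file into a scratch file.
    tr -d '[:space:]' < "$1" > "$siw_tmp"
    # Strip the second file on the fly and compare it with the scratch file.
    tr -d '[:space:]' < "$2" | cmp -s - "$siw_tmp"
    siw_status=$?
    rm -f "$siw_tmp"
    return "$siw_status"
}
```

If same_ignoring_whitespace in1.txt in2.txt succeeds, any hunk that "diff -wB" prints for that pair is suspect.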

Example:

$ diff -wB in1.txt in2.txt
3946c4201,4203
< Exits:
---
>
> Exits:
$ diff -wB in2.txt in1.txt
$ diff -dwB in1.txt in2.txt
$ diff -dwB in2.txt in1.txt

This came up while using diff to automatically grade a text adventure I'm
having students do in my class -- this is the ONLY file pair out of over
3000 that appears to exhibit the problem. This leads me to believe that it
must be a fairly rare issue. I'm fixing it on my end by always using -d, but
I think this should be classified as a bug, because it reports a
non-whitespace difference in files where none exists.
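
For a grading script, the order dependence and the -d workaround described above could be wrapped roughly as follows. This is a sketch under assumptions: the helper names order_consistent and grade_diff are invented here, and only the -q, -w, -B and -d options of GNU diff are used.

```shell
# Hypothetical helper: succeeds when "diff -w -B" reaches the same verdict
# (same or different) regardless of the order of its two arguments.
order_consistent() {
    oc_fwd=0
    oc_rev=0
    diff -q -w -B "$1" "$2" >/dev/null 2>&1 || oc_fwd=$?
    diff -q -w -B "$2" "$1" >/dev/null 2>&1 || oc_rev=$?
    [ "$oc_fwd" -eq "$oc_rev" ]
}

# The workaround from the report: always add -d, which asks diff to try
# harder to find a smaller set of changes.
grade_diff() {
    diff -d -w -B "$1" "$2"
}
```

A pair for which order_consistent fails would be a candidate for the behavior reported here; grade_diff is what the grading run would call instead of plain "diff -wB".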

I'm not sure if this mailing list allows attachments, so I've put the files
in question here:

https://dl.dropboxusercontent.com/u/68643317/diff-bug-test-files.zip

I tried paring the files down to just demonstrate the bug and nothing else,
but the behavior would seemingly go away at random as I removed content from
the files. Therefore, I'm including the files in their original form. The
files represent test output of the text adventure, specifically navigation
of the default world from the ROM 2.4b6 MUD (after having been converted to
a format for my class's assignment). This content is safe to share.

I've confirmed that this behavior is present in the following builds of
diff:
- diff (GNU diffutils) 2.8.1 on Red Hat Enterprise Linux Server release 6.5
  (Santiago)
- diff (GNU diffutils) 3.2 on Ubuntu 12.04.4 LTS
- diff (GNU diffutils) 2.9 on Cygwin 32-bit (Windows 7 x64)

Thank you for the report.
I confirm that it also affects diff-3.3, but found that with the very
latest from diff.git (v3.3-30-g29e8de4), the problem does not arise.
I.e., comparing your two files like this produces no output:

   $ src/diff -wBu /t/in{1,2}.txt | wc -c
   0
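
As an alternative to counting output bytes, the same check can rely on diff's exit status (0 when no differences are found, 1 when there are differences, 2 on trouble). A minimal sketch, with compare_quietly being a name made up for this example:

```shell
# Hypothetical helper: print "same", "different" or "trouble" for a pair
# of files, based only on the exit status of "diff -q -w -B".
compare_quietly() {
    cq_status=0
    diff -q -w -B "$1" "$2" >/dev/null 2>&1 || cq_status=$?
    case $cq_status in
        0) echo same ;;
        1) echo different ;;
        *) echo trouble ;;
    esac
}
```

The explicit "trouble" branch keeps a missing or unreadable file from being mistaken for a clean comparison.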

I suspect that it was fixed via this change by Paul Eggert:

   http://git.savannah.gnu.org/cgit/diffutils.git/commit/?id=9b48bf3d3ed002e32fad
   http://bugs.gnu.org/16848



