bug-coreutils

bug#9780: sort -u data loss deserves new release ASAP [Re: bug#9780: sor


From: Jim Meyering
Subject: bug#9780: sort -u data loss deserves new release ASAP [Re: bug#9780: sort -u...
Date: Fri, 17 Aug 2012 12:00:24 +0200

Jim Meyering wrote:
> Jim Meyering wrote:
> ...
>> In case anyone is chomping at the bit, here's a preliminary patch:
>>
>> Here's a smaller test case that appears to be host/nproc-independent:
>> It should print two lines: 1, then 7.
>> Without this patch, it prints only "7".
>>
>>     (yes 7|head -11; echo 1)|sort --parallel=1 -S32b -u
...
> Here's a complete patch:
>
> From 431102766cbf7c360ee6fa1f157ebcd7d8b9ca0e Mon Sep 17 00:00:00 2001
> From: Jim Meyering <address@hidden>
> Date: Wed, 15 Aug 2012 12:30:44 +0200
> Subject: [PATCH] sort: sort --unique (-u) could cause data loss
>
> sort -u could omit one or more lines of expected output.
> This bug arose because sort recorded the most recently printed line via
> reference, and if you were unlucky, the storage for that line would be
> reused (overwritten) as additional input was read into memory.  If you
> were doubly unlucky, the new value of the "saved" line would not only
> match the very next line, but if that next line were also the first in
> a series of identical, not-yet-printed lines, then the corrupted "saved"
> line value would result in the omission of all matching lines.
>
> * src/sort.c (saved_line): New static/global, renamed and moved from...
> (write_unique): ...here.  Old name was "saved", which was too generic
> for its new role as file-scoped global.
> (fillbuf): With --unique, when we're about to read into a buffer that
> overlaps the saved "preceding" line (saved_line), copy the line's .text
> member to a realloc'd-as-needed temporary buffer and adjust the line's
> key-defining members if they're set.
> (overlap): New function.
> * tests/misc/sort: New tests.
> * NEWS (Bug fixes): Mention it.
> * THANKS.in: Update.
> Bug introduced via commit v8.5-89-g9face83.
> Reported by Rasmus Borup Hansen in
> http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/23173/focus=24647
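
For readers who want the idea behind the fillbuf/overlap change without
reading the diff, here is a rough, self-contained sketch. It is not the
code from the patch: struct saved_ref, ranges_overlap and
detach_if_overlapping are hypothetical names invented for illustration,
and the real struct line in src/sort.c also carries key-defining members
that would need the same adjustment. The sketch only shows the principle
that, before a buffer region is overwritten, a saved line pointing into
that region is copied to private, realloc'd-as-needed storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the saved "preceding" line.
   The real struct line in sort.c has more members than this.  */
struct saved_ref
{
  char *text;      /* points into the I/O buffer or at own_copy */
  size_t length;   /* bytes in the line, including the trailing NUL */
  char *own_copy;  /* private storage, grown only when needed */
  size_t own_cap;  /* allocated size of own_copy */
};

/* Return true if [a, a+a_len) and [b, b+b_len) share at least one byte.
   Addresses are compared as integers so that unrelated objects can be
   compared without relying on pointer ordering.  */
static bool
ranges_overlap (void const *a, size_t a_len, void const *b, size_t b_len)
{
  uintptr_t a0 = (uintptr_t) a;
  uintptr_t b0 = (uintptr_t) b;
  if (a_len == 0 || b_len == 0)
    return false;
  return a0 < b0 + b_len && b0 < a0 + a_len;
}

/* Call this just before reading new input into buf[0..buf_len).
   If the saved line lives in that region, move it to private storage
   so the upcoming read cannot corrupt it.
   Return false only if memory allocation fails.  */
static bool
detach_if_overlapping (struct saved_ref *s, char const *buf, size_t buf_len)
{
  if (!s->text || !ranges_overlap (s->text, s->length, buf, buf_len))
    return true;

  /* The private copy is heap storage and never part of an I/O buffer.  */
  assert (s->text != s->own_copy);

  if (s->own_cap < s->length)
    {
      char *p = realloc (s->own_copy, s->length);
      if (!p)
        return false;
      s->own_copy = p;
      s->own_cap = s->length;
    }
  memcpy (s->own_copy, s->text, s->length);
  s->text = s->own_copy;
  return true;
}
```

With something like this in place, the comparison against the previously
printed line always sees the bytes that were actually printed, not
whatever later input happened to land in the same storage.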

That sort -u can cause data loss is a big deal.
I want to make a release with this fix as soon as possible.
Since I'm making this a mostly-bug-fix release, the du and md5 --tag
changes will have to wait for 8.20.
However, I'll be happy to apply documentation-correcting changes
if someone would post a complete, updated patch or two.

If Bruce and Paul find that changing gnulib's parse-datetime test
will avoid a failure on LFS, I'll pull in a gnulib update for that.

Any other bug-fix-like changes that people can suggest?
