bug-coreutils

bug#18291: Unix Sort Bug Report


From: Eric Blake
Subject: bug#18291: Unix Sort Bug Report
Date: Mon, 18 Aug 2014 10:32:07 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0

On 08/18/2014 09:57 AM, Pádraig Brady wrote:
> On 08/18/2014 09:55 AM, NTENTOS STAVROS wrote:
>>
>> Hello developers,
>>
>> Recently, using the sort utility I ran into an omission. While I cannot 
>> disclose the file in question, I will try to explain the issue:
>> On a Windows-created file (line ending: \r\n) I tried to perform a sorting, 
>> which happened to sort the last entry somewhere above. The last line did not 
>> have a line ending of any kind, and sort created a Unix-like ending (\r), 
>> which afterwards creates a parsing problem with the file.
> 
> Well a \n is inserted actually, not \r, but yes that is a problem on windows.
> This demonstrates the behavior:
> 
>   $ printf '2\r\n1' | sort | od -Ax -tx1z -v
>   000000 31 0a 32 0d 0a                                   >1.2..<
> 
> The \n is inserted so as to delimit the reordered item appropriately,
> which is set here:
> 
> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/sort.c;h=c2493192;hb=HEAD#l178
> 
> It seems that this should be set to '\r\n' on cygwin builds,
> (with other adjustments to handle multiple chars).

If the file was opened in text mode, then sort only sees \n line endings
on input (cygwin already shortened \r\n to \n before handing the line to
sort), and on output all \n are automatically converted back to \r\n.
If the file was opened in binary mode, then cygwin CANNOT second-guess
what line endings you wanted.  It sounds like your file lives on a
binary mount point, when you want it to live on a text mount point
instead; at that point cygwin should do the right thing (although I
admit I did not actually try this on cygwin, because I seldom use cygwin
text mounts).  But that is probably more a question for cygwin
downstream, not for upstream coreutils.  POSIX requires that text and
binary file modes be identical, so any system like cygwin where they
are not is already non-POSIX, and it becomes a question of whether
pushing upstream fixes for a downstream-only problem is maintainable.
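
For a file that has to stay on a binary mount, the text-mode translation
described above can be emulated by hand around sort.  This is only a
minimal sketch: sort_crlf is a hypothetical helper name, not anything
shipped with coreutils or cygwin, and it assumes that every \r in the
input belongs to a line ending (a bare \r in the middle of a line would
be deleted as well).

```sh
# Hypothetical helper (not part of coreutils): sort CRLF text from stdin.
# Step 1: tr drops every \r, so sort sees plain \n-terminated lines.
# Step 2: sort reorders the lines and adds the missing final \n.
# Step 3: awk rewrites each line ending as \r\n on the way out.
sort_crlf() {
    tr -d '\r' | sort "$@" | awk '{ printf "%s\r\n", $0 }'
}

# Same input as the demonstration quoted earlier in this message.
printf '2\r\n1' | sort_crlf | od -An -tx1
```

With that input, every line of the output, including the reordered last
entry, ends in 0d 0a.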

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


