bug-coreutils

bug#14555: Facing Some problem in uniq command


From: Shahid Hussain
Subject: bug#14555: Facing Some problem in uniq command
Date: Wed, 5 Jun 2013 10:47:43 +0530

Hi,
I appreciate your quick reply. What I am doing is this: many files in my
product contain data in "name = value" format. Using a pattern, I extract
only the "value" field from all of the files and redirect the output to one
temporary file, because I do not want any value to be repeated in any file.
I then apply the uniq command to this temporary file (piping it through
sort [sort |uniq -c tempFile]), but I am unable to get the expected
result.

But as you pointed out, the whitespace must also be identical on every
line, so this might be the problem in my case: when I displayed the content
of the file using the cat command, manually copied the same data to another
file, and then tried uniq with sort, it worked fine.
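
One way to check for such a hidden difference is to make the invisible
bytes visible. The following is only a sketch, assuming GNU coreutils; the
file name sample.txt and its contents are made up for illustration:

```shell
# Hypothetical sample: the first "18000" line ends with a carriage return.
printf '18000\r\n18000\n' > sample.txt

# GNU cat -A marks each line end with $ and shows a carriage return as ^M.
cat -A sample.txt

# od -c prints every byte, so \r and stray spaces become visible.
od -c sample.txt

# uniq counts the two lines separately because their bytes differ.
uniq -c sample.txt
```

If cat -A shows the first line as 18000^M$ and the second as 18000$, the
two lines differ by a carriage return, and uniq will not treat them as
duplicates.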


So it is fine for me, but it would be even better if the uniq command had
an option to work even when the whitespace is not identical :).
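
Until such an option exists, one possible workaround is to normalize the
whitespace before sorting. A minimal sketch, assuming standard tr, sed,
sort and uniq; the file names values.txt and counts.txt and the sample data
are hypothetical:

```shell
# Hypothetical sample: three spellings of 18000 that differ only in whitespace.
printf '18000\r\n18000  \n 18000\n9041\n' > values.txt

# Remove carriage returns, strip leading and trailing whitespace,
# then sort so that identical values become adjacent before counting.
tr -d '\r' < values.txt |
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
  sort |
  uniq -c > counts.txt

cat counts.txt
```

Here uniq reads the sorted data from the pipe. When uniq is given a file
name it reads that file instead of its standard input, so the sorted output
would be ignored.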

Many thanks,
shahid hussain




On Tue, Jun 4, 2013 at 10:00 PM, Eric Blake <address@hidden> wrote:

> tag 14555 moreinfo
> thanks
>
> On 06/04/2013 06:07 AM, Shahid Hussain wrote:
> > I have a file (named 'a') which contains the following data.
>
> > 9041
> > 9042
> > 8336
> ...
>
> > 9041
>
> Ouch.  Your file is not sorted.  Therefore, the repeated 9041 is NOT
> detected as a duplicate by 'uniq', which only compares adjacent lines.
>
> > Below are the commands I am executing, along with their output and
> > comments.
> > address@hidden config]$ uniq -d a
> > 8336
> > 8338
>
> I get different results when copying and pasting from your email:
> $ uniq -d a
> 8336
> 8338
> 9040
> 18000
> $ uniq --version | head -n1
> uniq (GNU coreutils) 8.17
>
> Could it be you are using an older version of coreutils, and we have
> fixed a bug in the meantime for how uniq behaves when presented an
> unsorted file?
>
> >       1 18000
> >       1 18000
> > // Observe the last line, which repeats the previous line (some other
> > entries are also there), but the uniq command is not able to find it.
>
> One other possibility: Are you sure the whitespace is identical on every
> line?  Or could you have trailing whitespace on one line but not the
> other (such as a carriage return), so that the lines really are not
> identical even though they appear identical?  If so, that would explain
> why _my_ uniq run counted 18000 as a duplicate, if the act of sending the
> email and then me copying and pasting into a file munged the whitespace
> differences away.
>
> While I suspect that there is no bug in coreutils, I need more
> information from you to confirm that claim, so I'm leaving the bug open
> for now.
>
> --
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
>

