bug-coreutils

bug#14555: Facing Some problem in uniq command


From: Bob Proulx
Subject: bug#14555: Facing Some problem in uniq command
Date: Wed, 5 Jun 2013 09:06:38 -0600
User-agent: Mutt/1.5.21 (2010-09-15)

Shahid Hussain wrote:
> Appreciate your quick reply. What exactly i m doing is there are so many
> files in my product which contains some data in "name =  value" format. By
> using some pattern i m extracting only "value" field from all files and
> redirecting the output to one temporarily file as i do not want any value
> to be repeated in any file. And here i m applying uniq command to this
> temporary file (by pipe lining sort [sort |uniq -c tempFile]) But i am
> unable to get expected result.

It might be better if in your script you set:

  #!/bin/sh
  LC_ALL=C
  export LC_ALL
  ...
  sort | uniq
  ...

That forces the standard C locale, so sort and uniq compare lines in
plain byte order everywhere in your script.
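
As a minimal sketch of the effect, here is a self-contained script.
The sample values are made up for illustration and are not from your
data.  In the C locale uppercase letters sort before lowercase ones,
and the duplicate line collapses to one.

```shell
#!/bin/sh
# Force the C locale so sort and uniq compare lines byte by byte.
LC_ALL=C
export LC_ALL

# Hypothetical sample values; "apple" appears twice.
result=$(printf '%s\n' banana Apple apple Banana apple | sort | uniq)

# Prints Apple, Banana, apple, banana, one per line.
printf '%s\n' "$result"
```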

> But as you have told whitespace also should be identical at every line so
> this might be the problem in my case. Because when i displayed content of
> file using cat command and manually copied the same data to another file
> and then tried uniq with sort command it works fine.

Without knowing more about your data, a quick and dirty hack to clean
up whitespace might be to pass it through awk.

  awk '{print $1}' somefile1 | sort | uniq ...

Since awk splits on whitespace, this prints only the first field.
Leading and trailing whitespace and any additional fields are
discarded, so it only suits values that contain no whitespace
themselves.
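
To illustrate, here is a small sketch with made-up input in which the
same value appears three times with different surrounding whitespace.
The variable names raw and cleaned are only for this example.

```shell
#!/bin/sh
LC_ALL=C
export LC_ALL

# Hypothetical input: "foo" plain, with leading spaces, and with
# trailing space and tab, plus one "bar" line.
input() { printf 'foo\n  foo\nfoo \t\nbar\n'; }

# Without cleanup uniq sees four different lines.
raw=$(input | sort | uniq)

# With awk printing only the first field the three "foo" lines match.
cleaned=$(input | awk '{print $1}' | sort | uniq)

printf '%s\n' "$cleaned"
```

The cleaned result has two lines, bar and foo, while the raw result
still has four.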

> So it is fine for me but it would be too better if there could be an option
> in uniq command to work fine even if  whitespace is not identical :).

No.  A new uniq option is not the right fix.  The right fix is to
prepare the data so that it has no whitespace differences before it
reaches uniq.  Tools like awk, which split on whitespace, let you do
that while preparing the data.
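
As one possible sketch of preparing the data, assuming lines of the
form "name = value" as you described, the following strips the name
and trims blanks around the value before sorting.  The sample lines
are invented, and I have not seen your real files, so treat it as a
starting point only.  Unlike printing $1, it keeps values that
contain spaces.

```shell
#!/bin/sh
LC_ALL=C
export LC_ALL

# Hypothetical "name = value" lines with inconsistent spacing.
values=$(printf 'color =  red\nsize=large \ncolor\t=\tred\nshape = round\n' |
  awk '{
    sub(/^[^=]*=/, "")              # drop the name and the first "="
    gsub(/^[ \t]+|[ \t]+$/, "")     # trim leading and trailing blanks
    print
  }' | sort | uniq)

# Prints large, red, round, one per line.
printf '%s\n' "$values"
```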

Bob




