bug-coreutils

Re: sort


From: Bob Proulx
Subject: Re: sort
Date: Mon, 29 Aug 2005 11:46:31 -0600
User-agent: Mutt/1.5.9i

Nathan Moore wrote:
> I guess that the best way to put it is, what is the correct behavior 
> when none of the LC_ environmental variables
> are set?

What is the output of 'locale'?

  locale

That will display the effective settings derived from the environment
variables.  If none are set then you will get a C/POSIX locale by
default.  Either way, that command displays the setting for each
category individually.

> This really isn't mentioned in the documentation (or I wasn't 
> able to find it).  My version of coreutils
> is 5.2.1, which is the most recent.

Please suggest changes so that the documentation can be improved.
The info docs currently say this:

       (1) If you use a non-POSIX locale (e.g., by setting `LC_ALL' to
    `en_US'), then `sort' may produce output that is sorted differently
    than you're accustomed to.  In that case, set the `LC_ALL' environment
    variable to `C'.  Note that setting only `LC_COLLATE' has two problems.
    First, it is ineffective if `LC_ALL' is also set.  Second, it has
    undefined behavior if `LC_CTYPE' (or `LANG', if `LC_CTYPE' is unset) is
    set to an incompatible value.  For example, you get undefined behavior
    if `LC_CTYPE' is `ja_JP.PCK' but `LC_COLLATE' is `en_US.UTF-8'.

How might that be improved?

Looking at this now I think suggesting LC_ALL=C is too strong.
I know why it was done: so that it would override LANG.  But now I
think it should probably just suggest LANG, with the warning that
LC_COLLATE overrides LANG and LC_ALL overrides LC_COLLATE.
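
That precedence can be checked with a small sketch like the following.
It assumes a POSIX shell; en_US.UTF-8 is only an example locale name
and does not need to be installed for these two cases, since the C
setting is the one that takes effect in both.

```shell
# Four fixed input lines, used only for illustration.
sample='b\nA\na\nB\n'

# LC_COLLATE overrides LANG.  LC_ALL is unset inside the subshell so
# that it cannot mask the LC_COLLATE setting.
lang_vs_collate=$(unset LC_ALL
    printf "$sample" | LANG=en_US.UTF-8 LC_COLLATE=C sort | paste -s -d ' ' -)

# LC_ALL overrides LC_COLLATE, so the collation is again C.
all_vs_collate=$(printf "$sample" |
    LC_ALL=C LC_COLLATE=en_US.UTF-8 sort | paste -s -d ' ' -)

echo "$lang_vs_collate"   # A B a b
echo "$all_vs_collate"    # A B a b
```

Both lines show plain byte order, uppercase before lowercase, which
is what the C locale gives.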

> I've never really messed w/ the LC_ environmental variables before
> and some of mine were not set (on SuSE 9.2).

You don't need to set all of them.  Just the ones you want.  Don't try
to set them all.  Personally I use this:

  export LANG=en_US.UTF-8
  export LC_COLLATE=C

> I've figured it out (export `locale`), but it seems like that is one
> of those things that just isn't written down anywhere.

You should not need to do that.  I recommend against it.

> Since sending the initial report, I had figured out that 
> "LC_COLLATE=ascii sort" did what I wanted.

Hmm...  I think "ascii" is actually unrecognized and that is causing a
fallback to C/POSIX.  I think other programs will complain when they
can't find that locale data.  So this will actually create other
errors.  Better to set this to C or POSIX instead.
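
One way to check whether a name is a known locale is to look for it
in the output of 'locale -a'.  This is only a sketch:
locale_available is a hypothetical helper name, and it assumes the
'locale' command is installed.

```shell
# locale_available (hypothetical helper) succeeds when the given name
# appears verbatim as a full line in the output of `locale -a`.
locale_available() {
    locale -a 2>/dev/null | grep -qxF -- "$1"
}

for name in C POSIX ascii; do
    if locale_available "$name"; then
        echo "$name: listed"
    else
        echo "$name: not listed by locale -a"
    fi
done
```

Note that 'locale -a' may spell names differently from what you
would put in the variable (for example en_US.utf8 rather than
en_US.UTF-8), so an exact match like this can miss a locale that
does exist.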

> Thanks for your replies, and please tell me what the behavior is
> without any LC_ settings.  I'm just curious.

You get C/POSIX sort ordering by default if neither LANG (don't
forget LANG) nor any of the LC_ variables is set.
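
A quick way to see this for yourself is to run sort with an emptied
environment and compare the result with an explicit C locale.  This
is a sketch that assumes an 'env' supporting -i; PATH is passed
through only so that sort can be found.

```shell
# Run sort with an emptied environment, so neither LANG nor any LC_
# variable is set.  Only PATH is kept.
default_order=$(printf 'b\nA\na\nB\n' |
    env -i PATH="$PATH" sort | paste -s -d ' ' -)

# The same input with the C locale requested explicitly.
c_order=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -s -d ' ' -)

echo "default: $default_order"   # default: A B a b
echo "C:       $c_order"         # C:       A B a b
```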

Note that GNU coreutils does not set any of the locale settings in
your environment.  This was very likely done by your distro.  I
believe that doing this without notifying the user is a distro problem
and not a coreutils problem.  You might need to raise it with your
distro.

Bob
