bug-coreutils

Re: sort


From: Bob Proulx
Subject: Re: sort
Date: Mon, 29 Aug 2005 22:40:08 -0600
User-agent: Mutt/1.5.9i

Nathan Moore wrote:
> address@hidden:~> locale
> LANG=en_US.UTF-8
> LC_CTYPE="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"

So your collating sequence will be dictionary sort ordering.  Case is
folded and punctuation is ignored.
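
For contrast with the dictionary ordering described above, here is a small sketch of forcing byte-value ordering by overriding the locale for a single sort invocation.  The helper name bytewise_sort is made up for illustration; it only relies on LC_ALL=C, which is always available.

```shell
# bytewise_sort is a hypothetical helper name used only for this sketch.
# It sorts stdin by raw byte value, regardless of the caller's LANG/LC_* settings,
# because LC_ALL=C overrides them for this one command.
bytewise_sort() {
    LC_ALL=C sort
}

printf '%s\n' b A a B | bytewise_sort
# In the C locale all uppercase ASCII letters sort before lowercase ones,
# so the lines come out as: A B a b
```

Under a locale such as en_US.UTF-8 the same input would typically come out with upper and lower case interleaved, though the exact order depends on the locale data installed on the system.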

> LC_MONETARY="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_PAPER="en_US.UTF-8"
> LC_NAME="en_US.UTF-8"
> LC_ADDRESS="en_US.UTF-8"
> LC_TELEPHONE="en_US.UTF-8"
> LC_MEASUREMENT="en_US.UTF-8"
> LC_IDENTIFICATION="en_US.UTF-8"
> LC_ALL=
> address@hidden:~>
> address@hidden:~> set | grep LC_
> address@hidden:~> set | grep LANG
> LANG=en_US.UTF-8

You have LANG set which is driving all of the others.  LC_COLLATE is
not set and so inherits the value from LANG.  This is also true for
the others being reported by the locale command.  This is a reasonable
configuration and many prefer it.  (Not me though.)

> > Hmm...  I think "ascii" is actually unrecognized and that is causing a
> > fallback to C/POSIX.  I think other programs will complain when they
> > can't find that locale data.  So this will actually create other
> > errors.  Better to set this to C or POSIX instead.

> Well, that is odd.  I would have thought that LC_COLLATE being
> undefined, being set to empty, or being set to something invalid
> would all have the same effect.

Here is an illustration of the problem:

  LC_ALL=C perl -e 0

That works because C is always okay to use.  But setting it to ascii
does not work because no such locale really exists.  Perl will complain
about this.

  LC_ALL=ascii perl -e 0
  perl: warning: Setting locale failed.
  perl: warning: Please check that your locale settings:
          LANGUAGE = "en_US:en_GB:en",
          LC_ALL = "ascii",
          LC_COLLATE = "C",
          LANG = "en_US.UTF-8"
      are supported and installed on your system.
  perl: warning: Falling back to the standard locale ("C").
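
A rough way to check a locale name before setting it, sketched under the assumption that the system provides the `locale` utility and that `locale -a` lists the installed locales.  The function name locale_available is hypothetical.  Note that it only does an exact match, and `locale -a` may spell the codeset differently from what you set (for example en_US.utf8 versus en_US.UTF-8), so a miss is a hint rather than proof.

```shell
# locale_available is a hypothetical helper name used only for this sketch.
# Succeeds if NAME is C or POSIX, or appears verbatim in `locale -a` output.
locale_available() {
    case "$1" in
        C|POSIX) return 0 ;;   # always okay to use
    esac
    # Exact, whole-line, fixed-string match against the installed locale list.
    locale -a 2>/dev/null | grep -qxF -- "$1"
}

locale_available C     && echo "C is available"
locale_available ascii || echo "ascii is not listed as an installed locale"
```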

> So, If LC_'s are not set, but LANG is, what method of comparing used?

LANG is the normal variable to set to control your locale setting.
LC_COLLATE overrides the LANG setting for character collating sequence.
LC_ALL overrides all of the LC_* variables.

So as you can see, the precedence is LC_ALL as the highest, followed by
the individual LC_* item, followed by LANG as the lowest priority.

> (funny aside -- I had a Red Hat distro once that didn't come w/
> stat.  That should have been illegal)

In the grand scheme of things stat is new.  It has only recently been
added to coreutils.  So that just says you were using an older version
of the utils.  Time passes and things like this change.  The stat
command is not covered by POSIX and is not standardized.  But it is
quite useful.

Bob



