bug-gnu-utils

Re: Unicode support


From: Bruno Haible
Subject: Re: Unicode support
Date: Tue, 25 Jul 2006 20:42:11 +0200
User-agent: KMail/1.9.1

Jarl Friis wrote:
> I didn't know that the default
> "to-encoding" on iconv is UTF-8, but a small test reveals this fact.

iconv's default "to-encoding" (as well as its default "from-encoding")
is the locale encoding. It can be specified at system installation
time (for most Linux distributions) or later, ad-hoc, through the
environment variables LANG or LC_ALL.

If you found out that for you, the default "to-encoding" is UTF-8, it
means you are already in a UTF-8 locale.
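
To see which encoding your current locale actually uses, you can ask the
'locale' utility for its charmap. This is only a minimal sketch: the
variable name 'charset' is made up for illustration, and the fallback to
"unknown" is an assumption for systems where 'locale' is not installed.

```shell
#!/bin/bash
# Ask for the character encoding of the current locale, as selected by
# LANG / LC_ALL. Fall back to "unknown" if the 'locale' utility is missing.
charset=$(locale charmap 2>/dev/null || echo unknown)
echo "locale encoding: $charset"
```

If this reports UTF-8, then an iconv call without an explicit "to-encoding"
converts to UTF-8.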

> So I assume with these very good arguments that the diff utils support
> UTF-8, right?

The input and output encoding of 'diff' is also the locale encoding.
So, for you, it's UTF-8. Other people, who don't usually work in a UTF-8
locale, can convert the UTF-8 files to their locale encoding before
running 'diff':

   #!/bin/bash
   inputfile1=$1
   inputfile2=$2
   diff <(iconv -f UTF-8 < "$inputfile1") <(iconv -f UTF-8 < "$inputfile2")

Or they can run 'diff' on the UTF-8 files directly and then convert the
output to their locale encoding:

   #!/bin/bash
   LC_ALL=en_US.UTF-8 diff "$@" | iconv -f UTF-8

The result should be essentially the same.
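
One way to check this on a small example is to run both variants and
compare their output byte for byte. The following is a sketch under stated
assumptions: the two sample files and the variable names are invented for
illustration, and the target encoding is fixed to ISO-8859-1 with '-t' so
that the comparison does not depend on the locale of whoever runs it.

```shell
#!/bin/bash
# Create two small UTF-8 sample files (hypothetical test data).
tmp=$(mktemp -d)
printf 'gr\303\274\303\237e\nwelt\n' > "$tmp/a.txt"   # first line: "grüße"
printf 'gr\303\274sse\nwelt\n'       > "$tmp/b.txt"   # first line: "grüsse"

# Variant 1: convert both inputs first, then run diff on the converted text.
diff <(iconv -f UTF-8 -t ISO-8859-1 < "$tmp/a.txt") \
     <(iconv -f UTF-8 -t ISO-8859-1 < "$tmp/b.txt") > "$tmp/out1"

# Variant 2: run diff on the UTF-8 files, then convert diff's output.
LC_ALL=en_US.UTF-8 diff "$tmp/a.txt" "$tmp/b.txt" \
    | iconv -f UTF-8 -t ISO-8859-1 > "$tmp/out2"

# Compare the two results byte for byte.
if cmp -s "$tmp/out1" "$tmp/out2"; then same=yes; else same=no; fi
echo "outputs identical: $same"
rm -rf "$tmp"
```

This uses diff's default output format, which contains no file names. With
'diff -u' the headers would differ between the two variants, because the
first one sees names like /dev/fd/63 instead of the real file names.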

Bruno



