bug-gnu-utils

Re: Unicode support


From: Bruno Haible
Subject: Re: Unicode support
Date: Tue, 25 Jul 2006 19:16:08 +0200
User-agent: KMail/1.9.1

Hi,

Jarl Friis wrote:
> I would like to see support for UNICODE files, i.e. text files encoded
> as ucs2.
> 
> i.e. support for this in diff and diff3.

The basic principle of the Unix command line is that you can put
together complex commands from simple ones. The command you are asking
for goes roughly like this:

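   #!/bin/bash
   # Convert both UCS-2 files to the locale's encoding on the fly
   # (bash process substitution), then let diff compare the results.
   inputfile1=$1
   inputfile2=$2
   diff <(iconv -f UCS-2 < "$inputfile1") <(iconv -f UCS-2 < "$inputfile2")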

There is no need to add this support directly to 'diff' itself, because
  - UCS-2 encoded files are quite rare on Unix,
  - the script above already does the job.

By the way, the standard encoding on many Linux systems nowadays is
UTF-8. It is also a Unicode encoding, and unlike UCS-2,
  - it supports all traditional Chinese characters, not just the most
    frequently used 50%,
  - it does not require unreliable heuristics to determine the "endianness"
    of the encoding.
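
To illustrate the endianness point: the same character has two possible
byte sequences in UCS-2, and a file without a byte order mark does not say
which one was used. A small sketch, assuming an iconv that accepts the
names UCS-2LE and UCS-2BE; the helper name hex_bytes is made up for
illustration:

```shell
#!/bin/bash
# Hypothetical helper: print TEXT (given as UTF-8) in ENCODING as hex bytes.
hex_bytes() {   # usage: hex_bytes ENCODING TEXT
    printf '%s' "$2" | iconv -f UTF-8 -t "$1" | od -An -tx1 | tr -d ' \n'
}

hex_bytes UCS-2LE A; echo    # 4100  (little endian)
hex_bytes UCS-2BE A; echo    # 0041  (big endian)
hex_bytes UTF-8   A; echo    # 41    (only one possible byte sequence)
```

If the byte order of the input files is known, naming it explicitly in the
script (iconv -f UCS-2LE or iconv -f UCS-2BE) avoids the guessing.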

Bruno



