Re: Finding and mapping all UTF-8 characters
From: Peter Dyballa
Subject: Re: Finding and mapping all UTF-8 characters
Date: Sat, 5 Dec 2009 19:40:56 +0100
On 05.12.2009 at 17:03, deech wrote:
> Is there a way to (1) search for the UTF-8 encoded characters in a
> document
Yes. In GNU Emacs 23 I have seen the *Warnings* buffer show hyperlinks
to the characters that do not fit the specified encoding.
You could also search for the usual prefixes of UTF-{7,8,16} encoded
characters.
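
As a rough illustration of the prefix search for UTF-8 only (a sketch outside Emacs, not part of the original reply): every multi-byte UTF-8 sequence starts with a lead byte in the range 0xC2-0xF4, followed by one to three continuation bytes in 0x80-0xBF. The function name `find_multibyte` and the sample string are made up for this example.

```python
import re

# Lead byte decides the sequence length; continuation bytes are 0x80-0xBF.
UTF8_MULTIBYTE = re.compile(
    rb"[\xC2-\xDF][\x80-\xBF]"        # 2-byte sequences
    rb"|[\xE0-\xEF][\x80-\xBF]{2}"    # 3-byte sequences
    rb"|[\xF0-\xF4][\x80-\xBF]{3}"    # 4-byte sequences
)

def find_multibyte(data: bytes):
    """Return (byte offset, decoded character) for each multi-byte sequence."""
    hits = []
    for m in UTF8_MULTIBYTE.finditer(data):
        try:
            hits.append((m.start(), m.group().decode("utf-8")))
        except UnicodeDecodeError:
            # Bytes matched the prefix pattern but are not valid UTF-8
            # (for example an overlong form); skip them.
            pass
    return hits

# Hypothetical sample: "naïve café €5", written with escapes for clarity.
sample = "na\u00efve caf\u00e9 \u20ac5".encode("utf-8")
for offset, char in find_multibyte(sample):
    print(offset, repr(char))
```

The pattern only recognises the byte shapes; the `decode` call is what actually confirms that a match is valid UTF-8.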
> and (2) map them to a sensible ASCII character?
How can you map 100,000 or 200,000 characters to a very limited set of
about 100? Such a mapping would be a candidate for the most successful
compression algorithm...
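
To make the loss concrete, here is a minimal sketch that assumes compatibility decomposition (NFKD) is an acceptable notion of "sensible". It is one possible approach, not a method proposed in this thread, and the function name `to_ascii` and the `?` placeholder are invented for the example. Accented Latin letters keep their base letter, while everything without an ASCII decomposition collapses to the same placeholder.

```python
import unicodedata

def to_ascii(text: str, placeholder: str = "?") -> str:
    """Map each character to its ASCII remnant, or to a placeholder."""
    out = []
    for ch in text:
        # NFKD splits e.g. U+00E9 into 'e' plus a combining accent.
        decomposed = unicodedata.normalize("NFKD", ch)
        ascii_part = "".join(c for c in decomposed if ord(c) < 128)
        # Characters with no ASCII component are lost entirely.
        out.append(ascii_part if ascii_part else placeholder)
    return "".join(out)

print(to_ascii("caf\u00e9"))       # accented Latin letter keeps its base letter
print(to_ascii("\u20ac"))          # euro sign has no ASCII decomposition
print(to_ascii("\u65e5\u672c"))    # CJK characters all become the placeholder
```

Distinct inputs end up as identical output, so the mapping cannot be reversed.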
Besides, it is not sane to save a file in encoding A when the file's
header declares its contents to be in encoding B.
--
Greetings
Pete
If you don't find it in the index, look very carefully through the
entire catalogue.
– Sears, Roebuck, and Co., Consumer's Guide, 1897