Re: Coding system prefer
From: Sergio
Subject: Re: Coding system prefer
Date: Thu, 5 Mar 2009 00:06:14 -0800 (PST)
User-agent: G2/1.0
On Mar 5, 10:19 am, Miles Bader <mi...@gnu.org> wrote:
> Sergio <sergio.pokrovs...@gmail.com> writes:
>> The FAR file manager, http://en.wikipedia.org/wiki/FAR_Manager does it
>> quite reliably using statistics about the character frequency
>> distribution.
> Does that work for anything except text files containing prose?
Yes, it does.
Of course it does not work for a binary file, but it works fine for a
text file in a formal language, such as a C program with Russian strings
or a text with HTML markup.
I never explored the internals, but I guess that one can normally
ignore the ASCII part; only codes greater than 127 really matter. Among
these, UTF-8 or another Unicode encoding is easy to detect (at least
for the alphabetic planes; I have never needed the CJK part). Then there
are the 8-bit encodings, in which the distribution of the upper half
(codes 128-255) is characteristic.
And usually the noise part (like markup or formal language statements)
is in ASCII.
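
To make the guess above concrete, here is a minimal sketch in Python. It
is only an illustration of the idea, not FAR's actual algorithm, which I
have not read. The function name guess_encoding, the candidate list, and
the rank-based letter weights are all my own assumptions, and the letter
ordering is only an approximate frequency ranking for Russian. UTF-16 and
other Unicode encodings are not handled.

```python
# Hypothetical sketch: guess the encoding of Russian text from bytes > 127.
# Not FAR's real implementation; names and weights are illustrative.

# Russian lowercase letters, roughly from most to least frequent
# (approximate ordering, assumed for this example).
BY_FREQUENCY = "оеаинтсрвлкмдпуяыьгзбчйхжшюцщэфъё"

# Rank-based weight: the most frequent letter gets the highest score.
WEIGHT = {ch: len(BY_FREQUENCY) - i for i, ch in enumerate(BY_FREQUENCY)}

# 8-bit Cyrillic encodings to compare (all are ASCII supersets).
CANDIDATES = ("cp1251", "koi8_r", "cp866", "iso8859_5")


def guess_encoding(data: bytes) -> str:
    # Ignore the ASCII part: only bytes greater than 127 matter.
    high = bytes(b for b in data if b > 127)
    if not high:
        return "ascii"

    # UTF-8 is easy to detect: non-UTF-8 8-bit text almost never
    # happens to form valid multi-byte sequences.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    def score(encoding: str) -> int:
        # Decode the high bytes under this candidate and reward
        # characters that are common lowercase Russian letters.
        text = high.decode(encoding, errors="replace")
        return sum(WEIGHT.get(ch, 0) for ch in text)

    # Pick the candidate whose decoding looks most like Russian.
    return max(CANDIDATES, key=score)


if __name__ == "__main__":
    sample = 'puts("привет, мир");'  # C code with a Russian string
    for enc in ("cp1251", "koi8_r", "cp866", "utf-8"):
        print(enc, "->", guess_encoding(sample.encode(enc)))
```

The ASCII markup or program text contributes nothing to the score, so
only the Russian strings decide the result. A very short sample can
still be guessed wrongly; more text gives the statistics more to work
with.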
I never needed EBCDIC or any other encoding which is not a superset of
ASCII.
--
Sergei