lynx-dev

Re: LYNX-DEV Character set support


From: Klaus Weide
Subject: Re: LYNX-DEV Character set support
Date: Mon, 12 May 1997 18:41:08 -0500 (CDT)

On Mon, 12 May 1997, Hynek Med wrote:

> Few general hints for using the development code with cyrlilic/east
> european encodings (substitute Windows-1250 with Windows-1252 and koi-8-r

Note that it has to be koi8-r, not koi-8-r.

But what Michael wanted, it seems, was not support for koi8-r, but for what
he calls "IBM PC character set".  He seems to mean some character set
for PCs for Russian, and that is *not* what it means for Lynx.
The "IBM PC character set" on Lynx's Options screen refers to one specific
code page, cp437 a.k.a. ibm437 - the original US American thing.
There is also another "PC character set", codepage 850, in standard Lynx.
Lynx has to know which "PC character set" it is, so that it can map into
it those characters from other charsets which are also available in that
"PC character set".
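
As a rough illustration of why the exact code page matters, here is a
minimal Python sketch. It is not Lynx code: map_to_display is a
hypothetical helper, and it relies on Python's own codec tables rather
than the tables in Lynx. It re-encodes iso-8859-1 bytes for a given
display character set and substitutes '?' for anything that display
character set lacks.

```python
def map_to_display(data: bytes, source: str, display: str) -> bytes:
    """Re-encode `data` from `source` into `display`, using '?' where
    the display character set has no matching character."""
    out = bytearray()
    for ch in data.decode(source):
        try:
            out += ch.encode(display)
        except UnicodeEncodeError:
            out += b"?"  # not available in this display character set
    return bytes(out)

# iso-8859-1 bytes for e-acute (0xE9) and A-tilde (0xC3)
page = b"\xe9\xc3"
print(map_to_display(page, "iso-8859-1", "cp437"))
print(map_to_display(page, "iso-8859-1", "cp850"))
```

Both code pages have e-acute, but only codepage 850 has A-tilde, so the
same input yields different bytes depending on which "PC character set"
is assumed.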

Michael is probably right that "IBM PC character set" and codepage 850
should have the same transparency characteristics as KOI8-R when Raw
Mode is on, since they all use characters in the 0x80-0x9F range
(including 0x9B).  But then, there aren't many webpages using them; they
are usually used to display things which are transmitted as iso-8859-1
and not in Raw mode.  Still, the chartrans stuff in the development code
should treat "IBM PC character set" and KOI8-R the same way in this
respect.
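
The point about the 0x80-0x9F range can be checked with a short Python
sketch. This again uses Python's codec tables as a stand-in (an
assumption; it says nothing about how Lynx implements Raw Mode), and
printable_high_bytes is a hypothetical helper name. It lists which bytes
in that range decode to something other than a control character.

```python
import unicodedata

def printable_high_bytes(charset: str) -> list:
    """Return the byte values in 0x80-0x9F that `charset` decodes to a
    non-control character."""
    found = []
    for value in range(0x80, 0xA0):
        try:
            ch = bytes([value]).decode(charset)
        except UnicodeDecodeError:
            continue  # byte is undefined in this charset
        if unicodedata.category(ch) != "Cc":
            found.append(value)
    return found

for name in ("koi8_r", "cp437", "cp850", "iso-8859-1"):
    values = printable_high_bytes(name)
    print(name, len(values), 0x9B in values)
```

In these tables KOI8-R, cp437 and cp850 all assign printable characters
to the whole range, while iso-8859-1 leaves it to control codes, which
is why the three should be handled alike when bytes are passed through
untranslated.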

Michael is not right if he assumes that he should use the display
character set meant for cp437 for his "Russian PC character set".
The best thing would be to use the mechanism in the development code and
add a table for this.  Then Cyrillic characters could also be translated
from KOI8-R and iso-8859-5 and even UTF-8 (the devel code already knows
about those.)
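
To show what such a table buys, here is a small Python sketch. It
assumes, purely for illustration, that the "Russian PC character set" is
cp866; the message does not say which code page Michael actually uses.
The helper to_display is hypothetical and uses Python's codecs, not the
table format of the development code.

```python
def to_display(data: bytes, source: str, display: str = "cp866") -> bytes:
    """Translate `data` from `source` to `display` by way of Unicode,
    replacing characters the display charset lacks with '?'."""
    return data.decode(source).encode(display, errors="replace")

word = "Привет"
for source in ("koi8_r", "iso8859_5", "utf-8"):
    # Every source encoding ends up as the same display bytes.
    print(source, to_display(word.encode(source), source))
```

Once the display character set has its own table, the same Cyrillic text
arrives as identical display bytes no matter which of the three
encodings the page was sent in.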

With the devel code, if he doesn't want to add a table, he should be able
to use the "Transparent" pseudo display character set.  I am not sure
whether that currently works correctly, though.

With standard Lynx (without changing the code in LYCharSets.c and other
places), he can only "cheat" by pretending to have some other character
set, but there's no way this can work correctly for all cases (say, for
unlabelled as well as labelled pages, and some of them also using some
Latin-1 characters) even if he restricts himself to only look at pages
which are transmitted in the "character set" which his display uses.

   Klaus

