lynx-dev

Re: lynx-dev Lynx character entity references fix


From: Leonid Pauzner
Subject: Re: lynx-dev Lynx character entity references fix
Date: Thu, 11 Mar 1999 18:38:52 +0300 (MSK)

10-Mar-99 09:39 Klaus Weide wrote:
> On Tue, 9 Mar 1999, Leonid Pauzner wrote:
>> 9-Mar-99 12:45 Klaus Weide wrote:
>> > Among the previous changes (that are in dev.18/dev.19), the following
>> > looks wrong.  In UC_con_set_trans():
...
>> > Here ptrans points to one of the four tables (slots) in translations[].
>> > Your change leaves the table unchanged when it should be re-initialized.
>> > So (to-Unicode translation for) one charset could effectively inherit
>> > the translations for a completely different charset that used the same slot
>> > before.
>>
>> Yes, I was not able to understand why we have four tables
>> (IMO only one is really used) and what UC_MapGN is for.

> Efficiency - saves table initialization time when moving between documents
> with different charsets.  Only applies to the "forward" (to Unicode)
> tables.  But this should be the most likely kind of switching that occurs
> (under the simplifying assumptions that "forward" translation occurs only from
> the document charset and from-Unicode translation occurs only to the Display
> character set, and that changing the Display character set is not a frequent
> event).

OK, changing the "assume charset" for an unlabelled document gives the following
(grep for UC_MapGN in the trace log):

UC_MapGN: Using 1 <- 26 (windows-1251)
UC_MapGN: Using 1 <- 1 (iso-8859-15)
UC_MapGN: Using 2 <- 2 (cp850)
UC_MapGN: Using 1 <- 3 (windows-1252)
UC_MapGN: Using 2 <- 4 (cp437)
UC_MapGN: Using 1 <- 5 (dec-mcs)
UC_MapGN: Using 2 <- 6 (macintosh)
UC_MapGN: Using 1 <- 7 (next)
UC_MapGN: Using 2 <- 8 (hp-roman8)

It is for "forward" translation and apparently slots #3 and #4 are not used.
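To make the slot idea concrete, here is a minimal sketch in C of a small cache of to-Unicode tables keyed by charset handle. It is not the UCdomap.c code: `slot_for_charset`, `slot_owner`, `slot_table`, `fill_table`, `init_count` and the round-robin eviction are all hypothetical names and choices, picked only to show how a cache hit saves the table initialization when moving back to a charset seen before.

```c
#include <assert.h>

#define NUM_SLOTS  4        /* four tables, as in translations[] */
#define TABLE_SIZE 256
#define UNDEFINED  0xfffd   /* "no mapping" marker */

static int slot_owner[NUM_SLOTS] = { -1, -1, -1, -1 };  /* -1 = free slot */
static unsigned short slot_table[NUM_SLOTS][TABLE_SIZE];
static int next_victim = 0; /* hypothetical round-robin eviction pointer */
static int init_count = 0;  /* counts how often a table had to be built */

/* Hypothetical loader: a real one would copy the charset's mapping data.
 * Here ASCII maps to itself and the high half is left undefined. */
static void fill_table(unsigned short *table, int charset)
{
    int i;

    (void) charset;
    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = (unsigned short) (i < 128 ? i : UNDEFINED);
    init_count++;
}

/* Return the slot holding the table for `charset`, building it on a miss. */
static int slot_for_charset(int charset)
{
    int i, slot;

    assert(charset >= 0);   /* -1 is reserved for "free slot" */

    for (i = 0; i < NUM_SLOTS; i++)
        if (slot_owner[i] == charset)
            return i;       /* hit: table is still valid, no re-init */

    slot = next_victim;     /* miss: take over a slot ... */
    next_victim = (next_victim + 1) % NUM_SLOTS;
    slot_owner[slot] = charset;
    fill_table(slot_table[slot], charset);  /* ... and rebuild it fully */
    return slot;
}
```

In this sketch, asking for charset 26, then 1, then 26 again builds only two tables; the third request is a hit on the first slot.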

> Not invented by me, taken from the original linux code.

>> So I just "add" num_n256 so things work without index overrun
>> (and hopefully with a proper result) and postpone more UCDomap.c changes
>> for dev.Next - patch from your side really welcome :-)

> Are changes necessary, and for what purpose?
Removing the num_n256 stuff gives a core dump at startup.
Another way may be to set UChndl = -1 in LYRegister_with_LYCharSets()
to simulate the "old" style behaviour (but not for utf-8).
All UCTrans* functions are protected by a UChndl >= 0 check.
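As an illustration of that kind of guard, a minimal sketch in C: a negative handle means no chartrans table is registered, so the routine returns early instead of indexing a table. Only the name UChndl comes from the discussion above; `translate_to_unicode`, `toy_lookup`, `NUM_REGISTERED` and `NO_TRANSLATION` are hypothetical and do not reproduce the actual UCTrans* functions.

```c
#define NUM_REGISTERED  1         /* toy value: one registered charset */
#define NO_TRANSLATION  (-1L)     /* hypothetical "not handled" result */
#define UCS_REPLACEMENT 0xfffdL

/* Toy stand-in for a real to-Unicode table: ASCII maps to itself and a
 * single high-half code is defined (0xA4 is EURO SIGN in iso-8859-15). */
static long toy_lookup(unsigned char ch)
{
    if (ch < 0x80)
        return (long) ch;
    if (ch == 0xA4)
        return 0x20ACL;
    return UCS_REPLACEMENT;
}

/* Translate one byte to a Unicode value, or report that no table applies. */
long translate_to_unicode(int UChndl, unsigned char ch)
{
    /* The guard: UChndl < 0 means "no table", so never index with it. */
    if (UChndl < 0 || UChndl >= NUM_REGISTERED)
        return NO_TRANSLATION;
    return toy_lookup(ch);
}
```

With this shape, a caller that gets NO_TRANSLATION back can fall through to whatever it did before the chartrans tables existed.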

> Most of UCdomap.c is the inner engine of the chartrans code, while it may
> not be very clear, it has done its job faithfully with minimal changes over
> several versions.

> I am inclined to say, mess with it at your own risk...
I understand it:)

>> > The closer equivalent to previous behavior would be to initialize all 256
>> > elements to 0xfffd.

> And that could be easily restored.
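A minimal sketch in C of what initializing all 256 elements to 0xfffd buys when a slot changes hands. The names `load_slot`, `struct map_entry` and `UCS_REPLACEMENT` are hypothetical, and the layout is an assumption made for illustration, not the UCdomap.c data structures.

```c
#include <stddef.h>

#define TABLE_SIZE      256
#define UCS_REPLACEMENT 0xfffd

/* Hypothetical description of one defined mapping of a charset. */
struct map_entry {
    unsigned char  code;   /* byte value in the charset */
    unsigned short ucs;    /* Unicode value it maps to */
};

/* Load a charset into a slot that may have belonged to another charset.
 * Resetting every element first means codes this charset leaves undefined
 * end up as 0xfffd instead of inheriting the previous owner's values. */
static void load_slot(unsigned short *slot,
                      const struct map_entry *defined, size_t n)
{
    size_t i;

    for (i = 0; i < TABLE_SIZE; i++)
        slot[i] = UCS_REPLACEMENT;

    for (i = 0; i < n; i++)
        slot[defined[i].code] = defined[i].ucs;
}
```

Without the first loop, loading a charset that defines fewer than 256 codes would leave stale entries from whichever charset used the slot before.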

>    Klaus



