Re: LYNX-DEV fotemods.zip update


From: Foteos Macrides
Subject: Re: LYNX-DEV fotemods.zip update
Date: Thu, 28 Aug 1997 19:12:24 -0500 (EST)

Klaus Weide <address@hidden> wrote:
>On Thu, 28 Aug 1997, Foteos Macrides wrote:
>
>>      An update of fotemods.zip is available in:
>> 
>>      http://www.slcc.edu/lynx/fote/patches/
>> 
>> 1997-08-28
>> * Tweaks of LYExpandString() in LYCharUtils.c to do 8th bit stripping
>>   rather than Uhhh substitutions for untranslated KOI8-R characters in
>>   attribute values. - FM
>
>Just in case you were wondering why any of this didn't show up yet in the
>development code - I _am_ still working on it.  (Yeah I know, I've been
>saying that for many months.  But it's still true :) )
>
>With "any of this" I mean the changes to LYExpandString() and
>LYUnEscapeEntities() etc. to properly translate characters in attributes.
>I have noted your code changes and also your caveat in FOTEMODS.
>I am aiming for something more general -- without the restriction w.r.t.
>UTF-8, and also with back-translation for form submission -- and found
>it necessary to make lots of changes, to pass a charset parameter in
>many places.  Have bits and pieces of it lying around, but didn't want
>to put it in 2.7.1ac-0.* before it is at least half working.
>
>If I totally mess it up, we still have your code to fall back to.  :)
>
>As for your detailed message on this topic, I didn't reply (yet) because I
>wanted to try to have some "running code" first.  But basically, I think
>we agree that the chartrans code is difficult to follow, especially in
>SGML.c (to put it very mildly; "a mess" might better cover it).
>We may disagree on how to proceed from this basic fact, maybe...  
>A proper state mechanism in SGML.c instead of looping tricks (your words
>from FOTEMODS) would be nice; currently I am just trying not to break
>the functionality there (such as it is; I am certainly not convinced that
>it is doing "the right thing" in all cases now, but it could be worse).

        The two problems in SGML.c are that you can lose UTF-8 stuff when
not in the S_text state, and you can end up with 7-bit approximations
when your display character set is "UNICODE UTF 8" and should be able
to handle multibyte renditions of the characters.  I could extend the
looping tricks to deal with the first problem, but figured I'd wait and
see what you do with it.  My LYExpandString() is just a "make do for
now" function.  It works, most of the time, for the other charsets and
display character sets, but is not necessarily the ideal way to do what
it's doing, and it could use the me structure more effectively.  My
LYUnEscapeEntities() reproduces the "7-bit approximations when it could
be the real thing" glitch, and should be fixed in the same way you
deal with it in SGML.c.
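
        To make the second problem concrete, here is a minimal sketch of
the choice involved.  It is not the Lynx code: encode_utf8() and
render_char() are hypothetical names, the display_is_utf8 flag stands in
for whatever test the real code makes against the display character set,
and only code points up to U+FFFF are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a Lynx function): encode a Unicode code
 * point up to U+FFFF as a NUL-terminated UTF-8 sequence.
 * "out" must have room for at least 4 bytes.
 * Returns the number of bytes in the sequence, excluding the NUL.
 */
static int encode_utf8(unsigned long code, char *out)
{
    assert(out != NULL && code <= 0xFFFFUL);
    if (code < 0x80) {                    /* plain 7-bit ASCII */
        out[0] = (char) code;
        out[1] = '\0';
        return 1;
    }
    if (code < 0x800) {                   /* two-byte sequence */
        out[0] = (char) (0xC0 | (code >> 6));
        out[1] = (char) (0x80 | (code & 0x3F));
        out[2] = '\0';
        return 2;
    }
    out[0] = (char) (0xE0 | (code >> 12)); /* three-byte sequence */
    out[1] = (char) (0x80 | ((code >> 6) & 0x3F));
    out[2] = (char) (0x80 | (code & 0x3F));
    out[3] = '\0';
    return 3;
}

/*
 * Hypothetical: choose the rendition of one character.  With a UTF-8
 * display, emit the real multibyte sequence; otherwise fall back to
 * the 7-bit approximation supplied by the caller.
 */
static void render_char(unsigned long code, int display_is_utf8,
                        const char *approx, char *out, size_t outsize)
{
    assert(out != NULL && approx != NULL && outsize >= 4);
    if (display_is_utf8) {
        encode_utf8(code, out);
    } else {
        strncpy(out, approx, outsize - 1);
        out[outsize - 1] = '\0';
    }
}
```

        The glitch described above corresponds to always taking the
fallback branch, even when the display could show the multibyte form.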

        It's orders of magnitude better than what's in the vanilla
v2.7.1, nonetheless. :)

        The only thing that *really* matters for form attribute values
is that you not end up using Lynx's internal markers for high-value
space glyphs and soft hyphen in the submitted content.  The rest is
"judgement calls" which can be tweaked when the chartrans handling,
itself, is fully squared away.
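
        A sketch of that one hard requirement, under stated assumptions:
the marker values below are placeholders (the real definitions live in
the Lynx headers and may differ), clean_form_value() is a hypothetical
name, and mapping the space markers to an ordinary space while dropping
the soft hyphen is just one possible policy.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for the internal markers; assumptions only. */
#define MARK_NBSP '\001'   /* non-breaking space marker */
#define MARK_EMSP '\002'   /* wide space marker         */
#define MARK_SHY  '\003'   /* soft hyphen marker        */

/*
 * Hypothetical: rewrite a form attribute value in place so that no
 * internal marker survives into the submitted content.  Space markers
 * become an ordinary space and soft hyphen markers are removed.
 */
static void clean_form_value(char *value)
{
    char *src;
    char *dst;

    assert(value != NULL);
    for (src = dst = value; *src != '\0'; src++) {
        if (*src == MARK_NBSP || *src == MARK_EMSP) {
            *dst++ = ' ';          /* plain space in the submission */
        } else if (*src == MARK_SHY) {
            continue;              /* drop the soft hyphen entirely */
        } else {
            *dst++ = *src;         /* everything else is untouched  */
        }
    }
    *dst = '\0';
}
```

        The output is never longer than the input, so the rewrite can be
done in the caller's own buffer.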

                                Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 address@hidden         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================
