lynx-dev Re: msg00798.html (was: 0x2276 handling)


From: Leonid Pauzner
Subject: lynx-dev Re: msg00798.html (was: 0x2276 handling)
Date: Sun, 10 May 1998 15:12:06 +0400 (MSD)

> >> >>         For any raw x80-x9F characters when the document charset is
> >> >
> >> >Actually, Lynx ignores those characters only for iso-8859-1.
> >>
>         The filter is applied in SGML_character() of SGML.c, and
> HTPlain_write() of HTPlain.c.  It has a substantial "- FM" comment, so
> you shouldn't have any trouble finding it within those functions.
>
>         It's based on the LYlowest_eightbit value for the charset, so you
> need to figure out why it wasn't set to 160 for iso-8859-x charsets other
> than iso-8859-1 (used to be, so something got changed or broken in the
> actual release).

Fixed, just before the "Convert the octet to Unicode. - FM" comment.
The test was simply >= 127 in both 2.7.2 and 2.8,
but iso-latin-1 survives thanks to the "old translation mechanism".
Another possible solution is to enable the mapping of U+FFFD to a space in
def7_uni.tbl: various Windows charsets have undefined characters in the
middle of their range; see the xxxx-uni.h files.

It may be useful for the future of Lynx to update 2.7.2 with several features
recently implemented in 2.8, primarily verbose_images and the chartrans fixes.
I can prepare a patch.

    /*
    **  If we want the raw input converted
    **  to Unicode, try that now. - FM
    */
    if (context->T.trans_to_uni &&
        ((unsign_c >= LYlowest_eightbit[context->inUCLYhndl]) ||
                     /* it ^ was >= 127 */
         (unsign_c < 32 && unsign_c != 0 &&
          context->T.trans_C0_to_uni))) {
        /*
        **  Convert the octet to Unicode. - FM
        */
        clong = UCTransToUni(c, context->inUCLYhndl);
        if (clong > 0) {
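
To illustrate what the changed comparison does, here is a minimal standalone
sketch, not the actual SGML.c code. The charset handles and the table
contents are assumptions made for the example (160 is the value mentioned
in the quoted message for iso-8859-x), and the trans_to_uni flag is left out
for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical charset handles, for illustration only. */
enum { CS_ISO_8859_1 = 0, CS_ISO_8859_5 = 1, CS_COUNT };

/*
 * Stand-in for LYlowest_eightbit[]: the lowest octet that counts as a
 * valid eight-bit character in each charset. The values are assumed.
 */
static const int lowest_eightbit[CS_COUNT] = { 160, 160 };

/* Old test: every octet >= 127 is passed on to Unicode conversion. */
static bool wants_unicode_old(unsigned char c, int charset, bool c0_to_uni)
{
    (void)charset; /* the old test did not depend on the charset */
    return c >= 127 || (c < 32 && c != 0 && c0_to_uni);
}

/* New test: the threshold comes from the per-charset table. */
static bool wants_unicode_new(unsigned char c, int charset, bool c0_to_uni)
{
    return c >= lowest_eightbit[charset] ||
           (c < 32 && c != 0 && c0_to_uni);
}
```

With these assumed values, a raw 0x85 octet in an iso-8859-5 document is
passed to conversion by the old test but not by the new one, while 0xA0 and
above are passed by both.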



>
>         Note that the filter is blocked when the Display Character Set is
> "transparent", but not if you specify the Display Character Set you
> actually have and toggle on RAW mode.  I explained in an earlier message

Changing the raw mode setting forces the document to be reloaded
with the new setting (the assumed "in-" charset).

"transparent" does not go through Unicode; it passes symbols
which may be restricted in the current display charset.
(Presently it fills the screen with U81 U82... in text/html mode,
and in text/plain it ignores those symbols if they are
< LYlowest_eightbit[out-charset] but passes everything >= that value;
see above about Windows.)

> why "transparent" is dangerous for general users, and that I was
> apprehensive about including it.  I agree with David Wooley that some
> warning about it, that will get noticed by general users, should be added
> (I doubt it would ever be used for a denial of service attack and that
> a CERT bulletin is needed.  That was just David going overboard again.
> But do think about it. :)
>


