lynx-dev Patch for "stopping when viewing a site" hang


From: Klaus Weide
Subject: lynx-dev Patch for "stopping when viewing a site" hang
Date: Wed, 18 Aug 1999 02:59:52 -0500 (CDT)

On Wed, 18 Aug 1999, Henry Nelson wrote:

> > Could you please summarize the two problems you are talking about?
> 
> They are both evident in the trace I sent:
>    http://www.flora.org/lynx-dev/html/month0799/msg00564.html.

Hmm, it was never clear to me that that thread was about a chartrans
problem.  It was never mentioned that the problem only occurs with
a CJK display character set, or with more than the given site.
So I thought it was just a weird networking problem.  (Yes, you did
say the same happened with a local copy...  I ignored that part....)

I saw the "Unknown entity" line in the trace, but didn't think it was
significant for the hanging.  I was wrong.

>       HTML:begin_element[8]: adding style to stack - HeadingLeft
>       SGML: Unknown entity 'reg' 174 -3
> This is the one I (hope I) fixed.  I've been aware of it for a while,
> but in general don't much care for those "extra" characters, so I didn't
> pursue it until now.  In the past, EUC-JP (so as to not generalize) always
> defaulted to the 7 bit approximations, and suddenly stopped doing so.
> Not being a programmer, plus there seemingly being a *bunch* of dead code
> lying around, it's hard for me to say, but it seems that someone didn't
> think about what all LYCharSets[] in LYCharSets.c was doing.

I haven't looked at your patch in detail (or tried it); I hope Leonid
will.

> > I tried to find out what happens to 'entities in the decimal 160-255 range'
> > by setting display character set to a CJK one (I picked Korean, also
> > tried EUC-JP), then
> > loading <http://sol.slcc.edu/lynx/current/lynx2-8-3/test/ALT88592.html>.
> > I got a lynx hanging(!) (looping?) in LYUCFullyTranslateString_1.
> 
> Yes, Lynx is not in as good health as some would like to think (what I
> was grumbling about the other day).  I assume you did not apply my patch,
> and so you are seeing the second problem I refer to.

Yes.

> The last two lines
> in the trace I sent, which is the last output before Lynx hangs, give a
> hint to what is happening:
>       SGML: Start <IMG>
>       stop_curses: done.
> If you have an entity which is unknown within an ALT string, Lynx will
> hang.  

Well, only for some display character sets, which means most people don't
see the problem even when they try to reproduce it, unless they know the
necessary condition.

> Since my patch makes entities become "known", it ends up hiding
> the real problem.  

Appended is a patch that solves the other (more severe) half of the
problem.  The "hang" problem was caused by a combination of removing
too much under 'case S_check_name' in 'LYUCFullyTranslateString_1',
_and_ having some Latin 1 character codes that are untranslatable
to the display character set.
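
To illustrate the kind of loop I mean, here is a toy sketch.  It is not
the Lynx code: the state names echo 'LYUCFullyTranslateString_1', but
'has_translation', 'translate_steps', the 'max_steps' guard, and the
assumption that a failed S_check_name lookup simply hands control back
to S_check_uni are all made up for illustration.

```c
#include <assert.h>

/* Simplified, hypothetical states; the names echo the patch only. */
enum state { S_check_uni, S_check_name, S_recover, S_done };

/* Hypothetical stand-in for a charset lookup: pretend that no code in
 * the 160-255 range has a translation in the display character set. */
static int has_translation(int code)
{
    return code < 160;
}

/* Run the toy state machine for one character code.
 * Returns the number of loop iterations taken, or -1 if max_steps was
 * exceeded (which stands in for "hangs forever").
 * fall_through_to_recover == 0 models the old behaviour,
 * fall_through_to_recover != 0 models the patched behaviour. */
int translate_steps(int code, int fall_through_to_recover, int max_steps)
{
    enum state st = S_check_uni;
    int steps = 0;

    assert(max_steps > 0);
    while (st != S_done) {
        if (++steps > max_steps)
            return -1;          /* would never terminate */
        switch (st) {
        case S_check_uni:
            if (has_translation(code)) {
                st = S_done;
                break;
            }
            if (!fall_through_to_recover) {
                /* old behaviour: retry as a named entity */
                st = S_check_name;
                break;
            }
            /* patched behaviour: FALL THROUGH to S_recover */
        case S_recover:
            st = S_done;        /* give up on translating, emit as is */
            break;
        case S_check_name:
            /* assumed: the name lookup finds nothing new and hands the
             * code back to the Unicode check, closing the loop */
            st = S_check_uni;
            break;
        case S_done:
            break;
        }
    }
    return steps;
}
```

With these assumptions, translate_steps(174, 0, 1000) hits the step
limit and returns -1, while translate_steps(174, 1, 1000) finishes in a
single iteration by falling through to S_recover.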

> Another complicating factor is that if you had
> "0x5c U+00a5" (gives me a true yen sign on a Japanese Windows machine)
> instead of "U+00a5:YEN" in def7_uni.tbl you wouldn't be aware of the
> problem either (alt="&yen;").  

0x5c is '\' (backslash).  Does that mean Japanese Windows machines
cannot display a backslash but show a yen sign instead???

> "ALT88592.html" is sort of overkill :).
> What tipped me off was:
>         <img src="/design/pentium/qit/pix/pent1.gif" align="left" hspace="0"
>         width="175" height="184" ALT="Pentium&#174; processor package">
> That person trying to read the Chinese page was hanging on:
>       <a href="http://www.educities.edu.tw/"><img src="/images/brand.gif"
>         border="0" alt="&uml;&Egrave;&uml;&ocirc;&yen;&laquo;" WIDTH="61"
>         HEIGHT="21"></a>
> 
> Don't you just love that ascii art?

Mother of bogosities.

> __Henry
> 
> BTW, on another topic, Lynx doesn't know about "hspace".  Is that okay?

Just another unrecognized attribute, why should it matter?  Look at traces
for other sites, they are often full of "SGML: Unknown attribute" lines.

   Klaus


Index: lynx2-8-3/src/LYCharUtils.c
--- lynx2-8-3.old/src/LYCharUtils.c Sat, 26 Jun 1999 03:47:04 -0500 lynxdev
+++ lynx2-8-3/src/LYCharUtils.c Wed, 18 Aug 1999 00:32:19 -0500 lynxdev
@@ -2290,6 +2290,22 @@
                    */
                    state = S_got_outchar;
                    break;
+
+                   /* The following disabled section doesn't make sense
+                   ** any more.  It used to make sense in the past, when
+                   ** S_check_named would look in "old style" tables
+                   ** in addition to what it does now.
+                   ** Disabling of going to S_check_name here prevents
+                   ** endless looping between S_check_uni and S_check_names
+                   ** states, which could occur here for Latin 1 codes
+                   ** for some cs_to if they had no translation in that
+                   ** cs_to.  Normally all cs_to *should* now have valid
+                   ** translations via UCTransUniChar or UCTransUniCharStr
+                   ** for all Latin 1 codes, so that we would not get here
+                   ** anyway, and no loop could occur.  Still, if we *do*
+                   ** get here, FALL THROUGH to case S_recover now.  - kw
+                   */
+#if 0
                    /*
                    **  If we get to here, convert and handle
                    **  the character as a named entity. - FM
@@ -2298,6 +2314,7 @@
                    name = HTMLGetEntityName(code - 160);
                    state = S_check_name;
                    break;
+#endif
                }
 
        case S_recover:


