Re: lynx-dev A Missing >...
From: Klaus Weide
Subject: Re: lynx-dev A Missing >...
Date: Wed, 19 Jul 2000 14:45:10 -0500 (CDT)
> > > > A missing closing > in a </STYLE> tag is making the document at
> > > > http://advice.networkice.com/Advice/Support/KB/default.htm and all
> > > > "q000xxx/default.htm" documents unreadable to lynx. However, w3m
> > > > correctly displays the documents. I've sent the webmaster a message
> > > > asking them to correct the error.
On Tue, 18 Jul 2000, Thomas E. Dickey wrote:
> I downloaded a copy and looked at it (I haven't spent the time to figure
> out how to make w3m work with our firewall), and can see where to tweak
> the html as well. There are actually two details that I can see where
> w3m differs - it also tolerates trailing whitespace inside the tag, e.g.,
>
> </style >
>
> is also not recognized by lynx. Both conditions look fairly
> straightforward to check for (though I've not done much with that part of
> lynx).
There's already some logic in SGML.c that would recover in these
cases. It can be enabled by a 1-bit (yes, bit, not byte) change
in HTMLDTD.c:
change
#define T_STYLE 0x40000,0x00000,0x00000,0x7638F,0x76FAF,0x8001F,0x00000
to
#define T_STYLE 0x40000,0x00000,0x00000,0x7638F,0x76FAF,0x8001F,0x00008
(and similar for T_SCRIPT if you wish). This will be effective only
in Sorta SGML mode though.
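To make the "1-bit" claim concrete, here is a small sketch that compares the last field of the two T_STYLE lines above. The names T_STYLE_LAST_OLD, T_STYLE_LAST_NEW, changed_bits and count_bits are made up for this illustration and do not appear in HTMLDTD.c; only the two hex values come from the #define lines.

```c
/* Seventh (last) field of T_STYLE, copied from the two #define lines above.
 * The macro names are hypothetical, chosen only for this sketch. */
#define T_STYLE_LAST_OLD 0x00000UL
#define T_STYLE_LAST_NEW 0x00008UL

/* Bits that differ between two flag words. */
unsigned long changed_bits(unsigned long a, unsigned long b)
{
    return a ^ b;
}

/* Number of set bits in v. */
int count_bits(unsigned long v)
{
    int n = 0;

    while (v != 0) {
        v &= v - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

Calling changed_bits() on the old and new values gives 0x8, and count_bits() of that result is 1, so the two definitions differ in exactly one bit of the last field.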
If the '>' is completely missing from the '</SCRIPT>' tag, some stuff
after the defective tag will just be junked - until the next '>';
this isn't different from missing '>' in other situations.
I don't think this change is the best way, though. It may also prematurely end the
SCRIPT contents on '</SOMETHING' that isn't '</SCRIPT', and while that's
not valid input (any '</SOMETHING' in the script content should have been
written with some form of escaping), it is probably more useful to continue
looking for a '</SCRIPT' in that case.
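A rough sketch of the behaviour I would prefer, for illustration only: this is not code from SGML.c, and the function name find_end_tag is hypothetical. It assumes the element content is available as one NUL-terminated string. It skips any '</SOMETHING' that is not the wanted end tag, and it accepts the end tag when the name is followed by '>', by whitespace, or by anything else that is not a letter or digit, which covers both '</style >' and a '</STYLE' whose '>' is missing.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from lynx: return a pointer to the first
 * "</name" in text (compared without regard to case) whose name is not
 * followed by another letter or digit.  Any other "</something" is skipped.
 * Returns NULL if no such end tag is found.
 */
const char *find_end_tag(const char *text, const char *name)
{
    size_t n = strlen(name);
    const char *p = text;

    while ((p = strstr(p, "</")) != NULL) {
        const char *q = p + 2;
        size_t i = 0;

        /* Compare the tag name case-insensitively; stops at a mismatch
         * or at the end of the input. */
        while (i < n &&
               tolower((unsigned char) q[i]) == tolower((unsigned char) name[i]))
            i++;

        /* Full name matched and it is not a prefix of a longer name. */
        if (i == n && !isalnum((unsigned char) q[n]))
            return p;

        p += 2;   /* not our end tag; keep looking */
    }
    return NULL;
}
```

With the input "a { x: '</b>' } </STYLE <p>" and the name "style", this returns a pointer to the "</STYLE" and ignores the earlier "</b>". Whether the caller then discards everything up to the next '>' is a separate decision that this sketch does not make.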
Klaus