lynx-dev
Re: lynx-dev strict SortaSGML rules for <PRE>...</PRE> content


From: Vlad Harchev
Subject: Re: lynx-dev strict SortaSGML rules for <PRE>...</PRE> content
Date: Thu, 5 Aug 1999 12:15:55 +0500 (SAMST)

On Fri, 6 Aug 1999, Klaus Weide wrote:

> On Fri, 6 Aug 1999, Leonid Pauzner wrote:
> 
> > Many documents on the Web use <pre>...</pre> tags to include preformatted
> > text with end-of-lines. Unfortunately, they also use some HTML decorations
> > like <em>..</em> or <h2>..</h2>; the latter breaks PRE mode (by forcing
> > </PRE>), which does not happen in TagSoup mode or in the Big Two browsers.
> > 
> > The question is: what does the list of restricted tags under PRE look like?
> > 
> > HTML 4.0 says the following (and <Hx> is not listed in the exclusions):
> > ========
> > 
> > 
> >   9.3.4 Preformatted text: The PRE element
> > 
> > <!ENTITY % pre.exclusion 
> > "IMG|OBJECT|APPLET|BIG|SMALL|SUB|SUP|FONT|BASEFONT">
> > 
> > <!ELEMENT PRE - - (%inline;)* -(%pre.exclusion;) -- preformatted text -->
>                     \---------/ \----------------/         
>                          |              |
>                          |            These are _additionally_ forbidden,
>  Everything that matches %inline;     and the exclusion applies at any
>  is allowed as a direct child.        level (not just for direct children).
>  Look for
>  <!ENTITY % inline .....>
>  somewhere.  Since there is no
>  other allowed content specified,
>  everything that does not match 
>  %inline; is not allowed.
> 
> H2 is forbidden because it is not covered by %inline;.
> 
> 
>     Klaus
> 

 As for me, I'd like to have a switch that modifies the contents of
tags[HTML_PRE]. The modification should take place in HTSwitchDTD and depend
on a new variable added to LYMain.c; when set, it would allow H[1-6] inside
<PRE> in SortaSGML mode. Some docs (on several Russian sites) are converted to
HTML from plain text with a sed script: they enclose all the text in <PRE>, but
use H[1-6] to mark sections and paragraphs. When such a document is viewed in
SortaSGML mode, the <PRE> </PRE> pair that surrounds everything is ignored, the
original formatting is lost, and the document becomes completely unreadable.

 Here is a tiny patch to do this (it is tested and works fine; and don't
think that I'm hiding my patches - I wrote this one after reading this
message). I'm very confused by the need to add another command-line option
and lynx.cfg setting (but this should be done) - Leonid, can
you extend the patch to "production level" (add the lynx.cfg setting, the
command-line option, a few commented lines in lynx.cfg, and document it in
lynx.man)?

 This functionality depends on the newly added variable
'allow_headers_in_pre'. This code should be added to HTMLDTD.c:HTSwitchDTD
after the array is copied.

    if (allow_headers_in_pre) {
        tags[HTML_PRE].contains |= Tgc_Plike;
        tags[HTML_PRE].icontains |= Tgc_Plike;
    }
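
 To show why two OR operations are enough, here is a minimal standalone
sketch of a bitmask content check. Only 'tags', 'HTML_PRE', 'contains',
'icontains', 'Tgc_Plike' and 'allow_headers_in_pre' come from the patch above.
The struct layout, the flag values, the three-element table, and the helpers
switch_dtd() and may_contain() are hypothetical simplifications, not Lynx's
real definitions, and the sketch assumes that H[1-6] fall into the Tgc_Plike
class.

```c
/* Hypothetical tag classes; the values are illustrative only. */
enum {
    Tgc_EMlike = 0x1,   /* inline emphasis such as <em> */
    Tgc_Plike  = 0x2    /* paragraph-like blocks; assumed to cover H1-H6 */
};

/* Simplified stand-in for a DTD table entry. */
typedef struct {
    const char *name;
    unsigned tagclass;   /* class this element belongs to */
    unsigned contains;   /* classes allowed as direct children */
    unsigned icontains;  /* classes allowed at any nesting depth */
} Tag;

enum { HTML_EM, HTML_H2, HTML_PRE, HTML_ELEMENTS };

/* Strict rules: every element here accepts inline content only. */
static const Tag strict_tags[HTML_ELEMENTS] = {
    { "EM",  Tgc_EMlike, Tgc_EMlike, Tgc_EMlike },
    { "H2",  Tgc_Plike,  Tgc_EMlike, Tgc_EMlike },
    { "PRE", Tgc_Plike,  Tgc_EMlike, Tgc_EMlike }
};

static Tag tags[HTML_ELEMENTS];      /* working copy used by the parser */
static int allow_headers_in_pre = 0; /* the proposed switch */

/* Hypothetical analogue of HTSwitchDTD: copy the table, then relax PRE. */
static void switch_dtd(void)
{
    int i;

    for (i = 0; i < HTML_ELEMENTS; i++)
        tags[i] = strict_tags[i];

    if (allow_headers_in_pre) {
        tags[HTML_PRE].contains |= Tgc_Plike;
        tags[HTML_PRE].icontains |= Tgc_Plike;
    }
}

/* Nonzero if 'child' is acceptable as a direct child of 'parent'. */
static int may_contain(int parent, int child)
{
    return (tags[parent].contains & tags[child].tagclass) != 0;
}
```

 With the switch off, may_contain(HTML_PRE, HTML_H2) is 0 after
switch_dtd(); with it on, the same call returns nonzero. Under these
assumptions the switch admits the whole Tgc_Plike class into <PRE>, not only
the headings.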

 
 Best regards,
  -Vlad

