
Re: [Lynx-dev] lynx misrenders many *IN*valid xhtml5 pages on my site


From: Thorsten Glaser
Subject: Re: [Lynx-dev] lynx misrenders many *IN*valid xhtml5 pages on my site
Date: Mon, 12 Jun 2023 23:22:24 +0000 (UTC)

Hi again!

Lennart, you nerdsniped me.

Dixi quod…
>Lennart Jablonka dixit:
>>> I’m not sure whether it may then also self-close all tags but would
>>> assume so (except I know tech is… tricky).
>>
>> As in an XML document, <asdf/> and <asdf></asdf> are entirely
>> equivalent, yes, the server may then “self-close” all empty elements.
>
>That’s what made me say I’d assume so, but I know tech, which is
>why I hesitate.

I found hints that empty elements must still be written with
separate opening and closing tags (not self-closed) even in XML,
but I forgot where during the subsequent hacking, which took
m̲u̲c̲h̲ longer than expected.
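
To illustrate the distinction being discussed, here is a minimal sketch of the output rule for empty elements that HTML-compatible XHTML usually follows: void elements get a space before the slash, everything else gets an explicit closing tag. The function names are hypothetical, the list is the set of HTML void elements, and nothing here is taken from the attached forceXHTML.c, which may decide differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* HTML void elements; only these are written in the <name /> form. */
static const char *const void_elements[] = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr", NULL
};

/* Hypothetical helper: 1 if name is a void element, else 0.
 * The comparison is case-sensitive, as element names are in XHTML. */
int is_void_element(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; void_elements[i] != NULL; ++i)
        if (strcmp(name, void_elements[i]) == 0)
            return 1;
    return 0;
}

/* Hypothetical helper: writes an empty element without attributes
 * into buf and returns the snprintf() result. */
int format_empty_element(char *buf, size_t len, const char *name)
{
    assert(buf != NULL && name != NULL);
    if (is_void_element(name))
        return snprintf(buf, len, "<%s />", name);
    return snprintf(buf, len, "<%s></%s>", name, name);
}
```

With a sufficiently large buffer, format_empty_element(buf, sizeof(buf), "br") produces <br />, while "script" produces <script></script>.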

But here is that hacking’s result. Find attached an LD_PRELOAD library
that makes “xmlstarlet fo”, without -o (because it then uses yet other
libxml2 function calls), output XHTML ☻

Prepare:

$ sudo apt-get install libxml2-dev

Compile and link:

$ gcc -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -fstack-protector-strong \
      -Wformat -Werror=format-security -Wall -Wextra \
      $(xml2-config --cflags) -DPIC -fPIC -shared -o libforceXHTML.so \
      forceXHTML.c

Use:

$ LD_PRELOAD=$PWD/libforceXHTML.so xmlstarlet fo [-n] [-e encoding] filename|-

C̲a̲v̲e̲a̲t̲:̲ without -n (no indentation) it breaks up “old browser-safe”
framing for CSS and JS:

 <style type="text/css"><!--/*--><![CDATA[/*><!--*/
  …
 /*]]>*/--></style>
 <script type="text/javascript"><!--//--><![CDATA[//><!--
  …
 //--><!]]></script>

This is because in XML, the <!--/*--> or <!--//--> is a
comment node inside the style/script node (as is correct)
and libxml2’s “XHTML” output code writes a newline after
each node if indenting. xhtmlNodeListDumpOutput() is a
static function, so LD_PRELOAD cannot interpose it. But
the OP was not formatting/indenting their XML anyway, so
passing -n strikes me as acceptable for this postprocessing
step. I did verify, for one static XHTML file, that it
properly adds the space before the slash and does not
self-close elements.
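
To make the node structure concrete, here is a minimal sketch of a scanner that splits the content of such a style or script element the way an XML parser sees it: a comment node, a CDATA section, and possibly trailing text. It is deliberately simplified (no nested elements, no entities, no error reporting), it does not use libxml2, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum node_kind { NODE_TEXT, NODE_COMMENT, NODE_CDATA };

/* Classifies the node starting at *pos and advances *pos past it. */
enum node_kind next_node(const char **pos)
{
    assert(pos != NULL && *pos != NULL);
    const char *s = *pos;
    const char *end;

    /* <!-- ... --> is a comment node. */
    if (strncmp(s, "<!--", 4) == 0 &&
        (end = strstr(s + 4, "-->")) != NULL) {
        *pos = end + 3;
        return NODE_COMMENT;
    }
    /* <![CDATA[ ... ]]> is a CDATA section node. */
    if (strncmp(s, "<![CDATA[", 9) == 0 &&
        (end = strstr(s + 9, "]]>")) != NULL) {
        *pos = end + 3;
        return NODE_CDATA;
    }
    /* Anything else is text, running up to the next '<' or the end. */
    end = (*s != '\0') ? strchr(s + 1, '<') : NULL;
    *pos = (end != NULL) ? end : s + strlen(s);
    return NODE_TEXT;
}

/* Stores up to max node kinds in kinds[] and returns how many were found. */
size_t split_nodes(const char *content, enum node_kind *kinds, size_t max)
{
    size_t n = 0;
    while (*content != '\0' && n < max)
        kinds[n++] = next_node(&content);
    return n;
}
```

Fed the style content shown above, this yields three nodes (comment, CDATA section, text "*/-->"); the script content yields two (comment, CDATA section). An indenting serializer that emits a newline after each node therefore inserts line breaks inside the framing.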

This was initially very mildly based on libxml2 itself,
whose public API sucks badly enough that I had to redraft it
from the beginning. (This is the reason it took so long.)
I publish this under Ⓕ CC0.

Enjoy,
//mirabilos
PS: Shlomi Fish, when replying to me, please send to the list
    as your provider fails badly enough at SMTP it cannot send
    eMails directly to me :/
-- 
FWIW, I'm quite impressed with mksh interactively. I thought it was much
*much* more bare bones. But it turns out it beats the living hell out of
ksh93 in that respect. I'd even consider it for my daily use if I hadn't
wasted half my life on my zsh setup. :-) -- Frank Terbeck in #!/bin/mksh

Attachment: forceXHTML.c
Description: Text document

