Re: LYNX-DEV ac0.97: CGI &-separated parameters are broken now.


From: Klaus Weide
Subject: Re: LYNX-DEV ac0.97: CGI &-separated parameters are broken now.
Date: Mon, 1 Dec 1997 22:12:07 -0600 (CST)

On Tue, 2 Dec 1997, A. Chernov wrote:

> On Mon, 1 Dec 1997, Duncan Hill wrote:
> 
> > On Tue, 2 Dec 1997, A. Chernov wrote:
> > 
> > > <a href="http://www.Lpage.com/wgb/wgbview.dbm?owner=AcheBook&lang=ru-en">
> > > My guestbook</a>
> > 
> > If you're entering it directly, try:
> > ...wgbview.dbm?owner=AcheBook&amp;lang=ru-en
> 
> Of course not directly, I use it already long time on my webpage and it
> works with previous Lynx versions.
> 
> First obvious error is that Lynx violates HTML specs here trying to parse
> &-entity without final ";"

If you want to claim that that is an "obvious error", you have to provide
some references to support that claim.  It certainly isn't obvious to me.

> Second thing is not so obvious, i.e. are &xxx; entities even allowed in
> URLs? I don't know, not have specs nearby....

In SGML terms, it's just CDATA.  Therefore there is no reason to treat
an HREF attribute value any differently from, for example, an ALT attribute
value (with respect to recognizing entities): character entities get
expanded (if recognized).
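
As an illustration only (this is not Lynx's code), a minimal Python
sketch using the standard library's html.parser shows an HTML parser
expanding entities in HREF and ALT values alike.  The class name
AttrCollector, the shortened relative URL, and the ALT text are made up
for the example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record the attribute values the parser reports for each start tag."""

    def __init__(self):
        super().__init__()
        self.seen = {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives with entity references already expanded.
        for name, value in attrs:
            self.seen[(tag, name)] = value


collector = AttrCollector()
collector.feed(
    '<a href="wgbview.dbm?owner=AcheBook&amp;lang=ru-en">My guestbook</a>'
    '<img alt="Tom &amp; Jerry">'
)
collector.close()

href_value = collector.seen[("a", "href")]
alt_value = collector.seen[("img", "alt")]
print(href_value)
print(alt_value)
```

Both values come back with "&amp;" turned into a plain "&", which is the
form that ends up in the request.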

Really, your HTML fragment should look like this:
 <a
 href="http://www.Lpage.com/wgb/wgbview.dbm?owner=AcheBook&amp;lang=ru-en">
 My guestbook</a>
to be correct.  But that would make the link fail with other browsers,
including older versions of Lynx.

This is a clash that was bound to happen, sooner or later.  It has been a
known problem for a long time that CGI uses '&' as a separator which
conflicts with the use of '&' in HTML.  Note that this problem doesn't
occur when the URL is not within an HTML document (for example, Lynx
'g'oto), or when the client constructs the URL from the names and values
of form fields on submission of a FORM with METHOD=GET (the "normal"
use of such URLs).
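
The two layers can be sketched in Python, assuming the standard
urllib.parse and html modules (the variable names are made up for the
example): the URL itself is built with a bare '&', as a client would do
on form submission, and the '&' only becomes "&amp;" when that URL is
written into an HTML attribute.

```python
from html import escape
from urllib.parse import urlencode

base = "http://www.Lpage.com/wgb/wgbview.dbm"
fields = [("owner", "AcheBook"), ("lang", "ru-en")]

# What a client sends after submitting a METHOD=GET form: plain '&'.
url = base + "?" + urlencode(fields)

# What belongs inside an HTML document: the same URL, HTML-escaped.
href = escape(url, quote=True)
link = '<a href="%s">My guestbook</a>' % href

print(url)
print(link)
```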

I expect that (new) Lynx is not the only client that tries to expand
entities in HREF (and similar) attributes (as required by SGML).
Even if other browsers currently don't do it, they may do so in the
future.  So don't just blame Lynx.

There are several solutions to the practical problem:

IF you are the provider of a CGI resource with
'&' in the URL, AND want that URL to be usable in links, THEN change
whatever parses the query part of the URL to additionally recognize
a different character like ';' as separator (if it doesn't do so already),
AND use this alternative URL in links.  For your example:
 <a
 href="http://www.Lpage.com/wgb/wgbview.dbm?owner=AcheBook;lang=ru-en">
 My guestbook</a>
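
On the script side, accepting ';' as well as '&' could look roughly like
the following Python sketch.  The function name parse_query is
hypothetical, and a real script would plug this into whatever already
parses the query part.

```python
import re
from urllib.parse import unquote_plus


def parse_query(query):
    """Split a query string on '&' or ';' and return (name, value) pairs."""
    pairs = []
    for part in re.split(r"[&;]", query):
        if not part:
            continue  # skip empty pieces such as a trailing separator
        name, _, value = part.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


print(parse_query("owner=AcheBook;lang=ru-en"))
print(parse_query("owner=AcheBook&lang=ru-en"))
```

Both calls yield the same pairs, so old links using '&' keep working
while new links can use ';'.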

OR use a FORM instead of A HREF=..., as in
 <FORM ACTION="http://www.Lpage.com/wgb/wgbview.dbm"><INPUT
 TYPE=HIDDEN NAME="owner" VALUE="AcheBook"><INPUT
 TYPE=HIDDEN NAME="lang" VALUE="ru-en"><INPUT 
 TYPE=SUBMIT VALUE="My guestbook"
 ></FORM>

OR just "Do It The Right Way" in links, i.e. use
 <a
 href="http://www.Lpage.com/wgb/wgbview.dbm?owner=AcheBook&amp;lang=ru-en">
 My guestbook</a>
and let your server / CGI script deal with requests for
  GET /wgb/wgbview.dbm?owner=AcheBook&amp;lang=ru-en HTTP/1.x
(instead of
  GET /wgb/wgbview.dbm?owner=AcheBook&lang=ru-en HTTP/1.x)
that will be sent by clients that don't parse "The Right Way",
by treating both forms the same.  That probably means you just have to
strip an initial "amp;" from the apparent field name in the CGI script.
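
A minimal Python sketch of that last workaround, assuming the script
splits the query on '&' itself (the function name parse_lenient is
hypothetical):

```python
from urllib.parse import unquote_plus


def parse_lenient(query):
    """Parse an '&'-separated query, dropping a leading 'amp;' from names."""
    fields = []
    for part in query.split("&"):
        if not part:
            continue
        name, _, value = part.partition("=")
        # A client that sent "&amp;" unexpanded leaves "amp;" glued to
        # the front of the next field name; remove it.
        if name.startswith("amp;"):
            name = name[len("amp;"):]
        fields.append((unquote_plus(name), unquote_plus(value)))
    return fields


print(parse_lenient("owner=AcheBook&amp;lang=ru-en"))
print(parse_lenient("owner=AcheBook&lang=ru-en"))
```

The obvious limitation is that a field whose real name begins with
"amp;" would be renamed too, so this only suits scripts that control
their own field names.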


   Klaus


