lynx-dev

Re: lynx-dev Conversion of special character codes within anchor tags


From: David Woolley
Subject: Re: lynx-dev Conversion of special character codes within anchor tags
Date: Fri, 25 Sep 1998 08:30:57 +0100 (BST)

> 
> Fair enough. Let's take the case of these characters. Suppose I substitute
> the field name "gt" instead of "curren" in my example. Now we have a tag
> which reads <A HREF="http://www.some.site/sample.cgi?para=1&gt=GBP">. By
> your logic, this would actually be interpreted as <A
> HREF="http://www.some.site/sample.cgi?para=1>=GBP">. Do you want to have a

No it wouldn't.  It is a deprecated entity encoding of an anchor for the
URL:  http://www.some.site/sample.cgi?para=1>=GBP

The preferred coding would have ; after the gt.
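As a rough illustration of the point above, the sketch below uses Python's html.unescape as a stand-in for a browser's entity decoder (an assumption; real browsers, and Lynx in particular, may resolve a semicolon-less entity inside an attribute differently). The URL is the hypothetical one from the quoted example.

```python
import html

# Hypothetical HREF values taken from the quoted example.
# The first omits the ";" after "gt" (deprecated form); the second is the preferred form.
href_deprecated = "http://www.some.site/sample.cgi?para=1&gt=GBP"
href_preferred = "http://www.some.site/sample.cgi?para=1&gt;=GBP"

# html.unescape resolves both spellings of the entity to a literal ">".
url_deprecated = html.unescape(href_deprecated)
url_preferred = html.unescape(href_preferred)

print(url_deprecated)
print(url_preferred)
```

Under this decoder both spellings name the same URL, ending in para=1>=GBP; the tag itself never contains a bare > inside the quotes.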

> guess at how many browsers would get confused by the premature close to the
> tag? Why is this useful? Why is this sensible?

If it had been <A HREF="http://www.some.site/sample.cgi?para=1>=GBP"> some
browsers would have been confused, which is why Lynx has options for
handling broken quoting that allow > to terminate a quote.

> which require entitizing. In other words, there is not a single character
> for which it is useful to translate the character entity into the actual
> character in a URL, because every single one of them (as well as several

& is a counterexample.  It is sometimes useful to code a form-type URL
in a normal anchor.  A URL-encoded & (%26) would NOT be treated as a field
delimiter but as part of the field value.  It must be entity encoded in
an HREF attribute though.  The browser should resolve that encoding and
send an un-encoded &.
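
A minimal sketch of the difference, assuming Python's html.unescape and urllib.parse approximate what a browser and a CGI library do (the helper name query_fields and the URLs are hypothetical, built from the quoted example):

```python
import html
from urllib.parse import urlsplit, parse_qs

def query_fields(href_attr):
    """Resolve entities in an HREF value, then split its query into fields."""
    url = html.unescape(href_attr)   # what the browser should send
    return parse_qs(urlsplit(url).query)

# Entity-encoded "&": acts as a field delimiter once the entity is resolved.
entity_encoded = query_fields(
    "http://www.some.site/sample.cgi?para=1&amp;curren=GBP")

# URL-encoded "&" (%26): stays inside the value of the first field.
url_encoded = query_fields(
    "http://www.some.site/sample.cgi?para=1%26curren=GBP")

print(entity_encoded)
print(url_encoded)
```

The first call yields two fields (para and curren); the second yields a single field para whose value contains a literal &.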

> others) needs URL encoding. The only characters which don't need URL
> encoding are the alphanumeric ones (A-Z, a-z, 0-9) plus the characters
> $-_.+!*'(),. Which one of these do you intend to send to the browser in the

The last time I looked, all top bit set characters didn't require encoding,
although this may have changed because of internationalisation issues and
I would be tempted to encode them to be on the safe side, with the caveat
that there are some lazy server and CGI implementations which compare strings
in their encoded form.
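
To make the caveat concrete, here is a small sketch using Python's urllib.parse.quote. The choice of the pound sign and of Latin-1 as the page's character set are assumptions made for illustration only.

```python
from urllib.parse import quote, unquote

pound = "\u00a3"  # pound sign, a top-bit-set character in Latin-1

# The same character has different encoded forms depending on the charset assumed.
latin1_form = quote(pound, encoding="latin-1")
utf8_form = quote(pound)  # quote() defaults to UTF-8

# A server that compares strings in their encoded form treats these as different,
# even though both decode to the same character.
naive_match = "%a3" == latin1_form
decoded_match = unquote("%a3", encoding="latin-1") == unquote(latin1_form, encoding="latin-1")

print(latin1_form, utf8_form, naive_match, decoded_match)
```

A comparison done after decoding is robust to hex-digit case and to whether the sender encoded the character at all; a comparison on the encoded form is not.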

> Would that be the semicolon as in "other possible delimiters (such as + or
> ;)"? This would be a useful point if it were not for the fact that every
> browser I have ever come across uses ampersand as its delimiter. Actually,

However, it is becoming quite common for hard-coded form URLs to use
;, partly because it doesn't require entity encoding in the HTML.
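
A sketch of both halves of that claim, again assuming html.unescape as the entity decoder; parse_semicolon_query is a hypothetical helper written for this example, not part of any server or library:

```python
import html
from urllib.parse import unquote_plus

def parse_semicolon_query(query):
    """Split a query string on ';' and URL-decode each name and value."""
    fields = {}
    for field in query.split(";"):
        if not field:
            continue
        name, _, value = field.partition("=")
        fields[unquote_plus(name)] = unquote_plus(value)
    return fields

# Hypothetical query using ";" as the delimiter.
query = "para=1;curren=GBP"

# ";" on its own is not an entity, so the HTML needs no extra encoding.
unchanged = html.unescape(query) == query

fields = parse_semicolon_query(query)
print(unchanged, fields)
```

Whether a given server or CGI library actually accepts ; as a field delimiter is a separate question; this only shows what such a parser would do.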

> we may both be wrong on this point. RFC 1738 (which is the one I think you
> want) states in section 3.3:
> 
>    "Within the <path> and <searchpart> components, "/", ";", "?" are
>    reserved."
> 
> So ";" is not available as a delimiter.

It is definitely being used as a delimiter these days, although I don't
know the underpinning standards.

