Re: LYNX-DEV Relative URLs, BASE implementation


From: Foteos Macrides
Subject: Re: LYNX-DEV Relative URLs, BASE implementation
Date: Tue, 22 Oct 1996 13:36:43 -0500 (EST)

Klaus Weide <address@hidden> wrote:
>[...]
>Hey, improving HTParse.c looks like a reasonable project for someone
>with some knowledge of C and URL syntax (or ability to read an RFC), but
>without the time to figure out all of the Lynx code.  Just HTParse.h
>and HTParse.c should be enough.  Anybody up for it?

        I suspect your assessment that one needn't understand the Lynx
code to modify the HTParse.c functions is misguided (and your worsening
of the situation when you tried should perhaps have been a tip-off,
though a period of "trial and error" is important for anyone to get good
at supporting Lynx 8-).

        The parsing functions there are used not only for http URLs, but
also for the built-in gateways, a variety of internal URL schemes, and
external but lynx-specific schemes (e.g., lynxexec, lynxprog, lynxcfg).
They also reflect FM's judgements about what will succeed in the majority
of cases with what the real world throws at Lynx.  Beware of creating
FAQs like the one about why some POST redirections "don't work anymore"
with v2.6.  The FAQ about why HREF="../foo" resolutions don't always
work in v2.6, even though Netscape gets it "right", reflects my incorrect
judgement that it was about time for Lynx to do that right; it wasn't,
yet (does someone want to add a prompt for choosing the wrong way, when
needed? 8-).  If you follow what RFC 1808 recommends for error handling
(though it's not described as that), beware of the many more FAQs that
will be created for Lynx.

        Note that Lynx v2.6 does use functions in HTAAServ, HTUU.c and
HTRules.c, and would be using more of them if the HTTP/1.1 draft
and associated IDs (Internet-Drafts) had been further along at the time
of its release.

        Lynx v2.6 does not handle ;parameters or ?searchparts in the
HTParse.c functions (it treats them, there, as part of the path field).
The only thing it handles in that code is fragments (following any
;parameters and/or ?searchparts, if present).  It handles ;type=[A, D, I]
in the ftp gateway, and ?searchparts in GridText.c.  The treatment of an
empty HREF as the
base *less* any fragment it had is intentional, and I still can't think
of a "real world" situation in which that wouldn't be more appropriate
than retaining the fragment.  Note also that RFC 1808, without comment
or explanation, changed what's dictated in RFC 1738 for escaping the
hash ('#') and for parsing of fragments.  RFC 1808 directs parsing for the
hash from left to right, and indicates that unescaped hashes could be
present to the right of that punctuation.  RFC 1738 states that only
one unescaped hash can be present, as punctuation for a fragment, and
that all other hashes should be escaped, even in URLs that do not
support fragments (e.g., mailto URLs).  If only one unescaped hash
can be present, the direction of parsing is irrelevant, and right to
left is more efficient, as is done in all libwwws, including the
most current Reference Library, and by most deployed browsers (not
MSIE, whose developers, being new, didn't know what grains of salt
to apply to RFC 1808 8-).
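
        To make the two points above concrete, here is a minimal sketch
in C.  It is not the HTParse.c code; the function names and the example
URLs are made up for illustration.  It only shows that the two scan
directions agree when a single unescaped hash is present and disagree
when there are two, and how an empty HREF can be resolved to the base
less its fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fragment located by scanning from the right,
 * i.e. everything after the LAST '#'.  Returns NULL if there is none. */
static const char *fragment_from_right(const char *url)
{
    const char *hash = strrchr(url, '#');
    return hash ? hash + 1 : NULL;
}

/* Hypothetical helper: fragment located by scanning from the left,
 * i.e. everything after the FIRST '#'.  Returns NULL if there is none. */
static const char *fragment_from_left(const char *url)
{
    const char *hash = strchr(url, '#');
    return hash ? hash + 1 : NULL;
}

/* Hypothetical helper: resolve an empty HREF as the base URL with any
 * fragment removed.  The result is copied into out (truncated if it
 * does not fit) and is always NUL-terminated. */
static void resolve_empty_href(const char *base, char *out, size_t outlen)
{
    const char *hash;
    size_t n;

    assert(base != NULL && out != NULL && outlen > 0);
    hash = strrchr(base, '#');
    n = hash ? (size_t)(hash - base) : strlen(base);
    if (n >= outlen)
        n = outlen - 1;         /* truncate rather than overflow */
    memcpy(out, base, n);
    out[n] = '\0';
}
```

        For "http://host/doc#a" both helpers yield "a".  For
"http://host/doc#a#b" the right-to-left scan yields "b" and the
left-to-right scan yields "a#b", which is the case the two RFCs
disagree on.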

        Much of what is in RFC 1808 was written from an armchair,
rather than from hands-on implementation experience (i.e., it's like
what's in the RFCs and "official" drafts for FORM markup 8-).

        Note also that what RFC 1808 says about handling of ".."
embedded within paths, though logical, doesn't necessarily take
into account how that might be used for spoofing on Unix, where
it has meaning for the platform's actual file system.  We worry
about the parsing of those in a variety of LYfoo functions, not
just in the libwww modules.  We also treat '~' as a meaningful
symbol in file and ftp URLs, under some circumstances, though
that's not in the specs.  If you simply follow the specs for
that, you'll create yet more FAQs. :) :)
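
        As an illustration of the kind of check meant here, the sketch
below tests whether a path contains a ".." segment before it is handed
to the file system.  It is a hypothetical helper, not one of the LYfoo
functions, and it assumes the path has already been unescaped (it does
not look for %2e%2e) and uses '/' as the only separator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if any '/'-separated segment of path
 * is exactly "..", and 0 otherwise.  Names that merely contain two
 * dots, such as "a..b" or "...", are not flagged. */
static int has_dotdot_segment(const char *path)
{
    const char *seg = path;

    while (*seg != '\0') {
        const char *end = strchr(seg, '/');
        size_t len = end ? (size_t)(end - seg) : strlen(seg);

        if (len == 2 && seg[0] == '.' && seg[1] == '.')
            return 1;
        if (end == NULL)
            break;              /* that was the last segment */
        seg = end + 1;          /* step past the separator */
    }
    return 0;
}
```

        What a caller does with a positive result (refuse the URL,
collapse the segment, or prompt) is a policy decision that depends on
the scheme, which is part of why such checks end up outside the
generic parser.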

                                Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 address@hidden         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================