Re: lynx-dev Revised patch for HTFTP.c


From: Klaus Weide
Subject: Re: lynx-dev Revised patch for HTFTP.c
Date: Sat, 19 Aug 2000 01:13:20 -0500 (CDT)

On Fri, 18 Aug 2000, Doug Kaufman wrote:

> The discussion raised by the patch I submitted for HTFTP.c seems to
> have died down, so perhaps this is an opportunity to summarize at
> least what I drew from the thread.
> 
> 1. ASCII mode for the ftp protocol is designed to transfer text files
> between machines with different native formats, converting the formats
> as transfer is done.
> 
> 2. The FTP server can not determine the format of the files.

Of course an FTP server *could* analyze the contents of a file that
is local (to it) and perhaps vary its actions accordingly.  I don't
know of any that do.

> 3. It is the responsibility of the client, not of the server, to
> specify Binary or Ascii transfer mode.
> 
> 4. Lynx can render text files that have EOLs for Macintosh (CR),
> DOS (CR LF), or unix (LF).

I regard this more as an accident than anything else.  An accidental
consequence of using the same handling mechanisms for FTP data that
are also used for HTTP data.  HTTP does not require that textual
data content is transmitted in canonical (CRLF) form, see 3.7.1 in
RFC 2616, so lynx has to be liberal in what it recognizes as a line
break for the sake of HTTP.
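
To make "liberal in what it recognizes as a line break" concrete, here
is a minimal sketch in Python.  It is not lynx's actual code; the
function name split_lines_liberally is hypothetical, and the only
assumption is the rule described above: CR LF, a lone CR, and a lone LF
each count as exactly one line break.

```python
def split_lines_liberally(data: bytes) -> list:
    """Split on CR LF, lone CR, or lone LF; CR LF counts as one break."""
    lines = []
    current = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0x0D:  # CR ends the current line
            lines.append(bytes(current))
            current.clear()
            # Swallow an LF right after the CR so CR LF is a single break.
            if i + 1 < len(data) and data[i + 1] == 0x0A:
                i += 1
        elif byte == 0x0A:  # lone LF ends the current line
            lines.append(bytes(current))
            current.clear()
        else:
            current.append(byte)
        i += 1
    if current:  # keep a final line that has no terminator
        lines.append(bytes(current))
    return lines


mixed = b"mac\rdos\r\nunix\nlast"
print(split_lines_liberally(mixed))
```

A parser like this accepts all three conventions, which is why text
files with Macintosh, DOS, or unix line ends all happen to render.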

> 5. ASCII transfer mode would be appropriate when transferring text
> files from servers whose native mode is not Macintosh, DOS, or unix
> (e.g., VM/CMS).

ASCII transfer mode is appropriate when transferring text files from
ANY servers, independent of their native mode (or even whether they
have one).  That's what ASCII mode is meant for.

Now it happens that for some combinations of server and client systems,
binary (image) mode can *also* be used for transferring text files, such
that they are received in a form that is either (a) identical to the
client system's normal text format or (b) "similar enough" - for example,
the same except for CRLF vs. CR vs. LF differences.

But these are _logically_ the exceptions - no matter that numerically,
they may cover the vast majority of cases encountered in practice.  
Relying on this for text transfer means relying on an accidental
similarity between sender and recipient native text representations.
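
The difference between the two modes can be sketched as follows.  This
is an illustration under simplifying assumptions, not an FTP
implementation: only line ends are converted (character set conversion,
e.g. from EBCDIC, is left out), and the helper names to_canonical,
from_canonical, ascii_transfer and binary_transfer are hypothetical.

```python
CANONICAL_EOL = b"\r\n"  # line end used on the wire in ASCII mode


def to_canonical(stored: bytes, native_eol: bytes) -> bytes:
    """Sender side: native line ends -> canonical CR LF."""
    return CANONICAL_EOL.join(stored.split(native_eol))


def from_canonical(wire: bytes, native_eol: bytes) -> bytes:
    """Receiver side: canonical CR LF -> native line ends."""
    return native_eol.join(wire.split(CANONICAL_EOL))


def ascii_transfer(stored: bytes, server_eol: bytes, client_eol: bytes) -> bytes:
    """Each side only needs to know its own native format."""
    return from_canonical(to_canonical(stored, server_eol), client_eol)


def binary_transfer(stored: bytes) -> bytes:
    """Image mode: the bytes arrive exactly as stored."""
    return stored


mac_file = b"one\rtwo\r"  # stored on a server whose native EOL is CR
print(ascii_transfer(mac_file, b"\r", b"\n"))  # client native EOL is LF
print(binary_transfer(mac_file))
```

In ASCII mode the unix client receives its own native format without
knowing anything about the server.  In binary mode it receives the
server's format and has to hope that it is "similar enough".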

You realize that this isn't always the case.  But, if I understand you
right, you want to specially recognize the cases where the similarity does
not occur ("e.g., VM/CMS") and treat everything else, by default, under
a "similar enough" assumption.  I could find a more conservative approach
acceptable - make the "similar enough" assumption only if we have concrete
indication that the server system is indeed similar to the client's; maybe
by virtue of the server explicitly telling us that it's Macintosh, DOS, or
unix.  But I don't think it would be very useful.

The FTP protocol ought to be general enough such that it will continue
to work in the future, with currently not yet existing systems, as long
as those future systems themselves follow the protocol.  And not just the
protocol should be general enough, but its current implementations
(including lynx).  Consider a system LFCR/OS that becomes popular in 10
years, and which is based on ASCII but for which the native line end
representation is LF CR...  Or just any system for which the native
character encoding is not ASCII based and which isn't recognized as
VM/CMS or similar either.

If I understand your patch right, if it got applied now, lynx would
not work correctly (for simple text rendering) with those systems,
until it gets specifically patched to work with them.  (I am assuming
that those hypothetical systems have a correct FTP server - one that
converts transmitted text to canonical form.)  Without your patch, the
normal assumption that text is to be transferred in ASCII mode will be
correct, so current lynx should work with those systems.
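
The LFCR/OS case can be played through in a few lines.  This is a sketch
under stated assumptions: the server is the hypothetical LFCR/OS with a
correct FTP server, and Python's bytes.splitlines() stands in for a
client that accepts CR, LF and CR LF as line breaks.

```python
LFCR = b"\n\r"  # native line end of the hypothetical LFCR/OS
CRLF = b"\r\n"  # canonical line end sent in ASCII mode

stored_on_server = b"one" + LFCR + b"two" + LFCR

# Binary (image) mode: the bytes arrive exactly as stored.
binary_wire = stored_on_server

# ASCII mode, assuming a correct server that converts to canonical form.
ascii_wire = CRLF.join(stored_on_server.split(LFCR))

# A liberal client sees LF and then CR as two separate breaks, so the
# binary transfer shows a spurious empty line after every real line.
binary_lines = binary_wire.splitlines()
ascii_lines = ascii_wire.splitlines()

print(binary_lines)  # [b'one', b'', b'two', b'']
print(ascii_lines)   # [b'one', b'two']
```

With ASCII mode the client needs no knowledge of LFCR/OS at all; with
binary mode it would need a special case for it.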

> 6. We can not count on the fact that files that appear to be text
> will always be in the native format of the server, although it is
> apparently common practice for this to be true.

From which perspective is this, i.e. who is "we"?  If "we" is just the FTP
client side, then it shouldn't matter to us at all how the server *stores*
files.  All that matters is how it sends them to us, since that is all we
are going to see.

If we are the FTP maintainer, or someone uploading and/or downloading 
files to/from the same server storage by different means, _then_ the
storage format matters.

> It appears to me that the patch fixes the problem with extra lines
> appearing in rendering of DOS style files on unix servers, without
> breaking anything. If no one sees anything that gets broken by the
> patch, I would still recommend its incorporation into the lynx code.

I see it breaking the generality of the protocol assumptions.  It
anticipates only that which is currently observed in practice, as opposed
to: the full range of possible situations for which the protocol
allows.
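
For reference, the symptom the patch is aimed at can be reproduced in a
small sketch.  The mechanism shown is an assumption on my part, not
something taken from a server's source: a unix server in ASCII mode is
assumed to turn every LF into CR LF without looking at the preceding
byte, and bytes.splitlines() again stands in for a liberal client.  The
function name unix_server_ascii_send is hypothetical.

```python
def unix_server_ascii_send(stored: bytes) -> bytes:
    """Assumed server behaviour: treat the file as native unix text and
    turn each LF into CR LF, whatever precedes the LF."""
    return stored.replace(b"\n", b"\r\n")


# A DOS-style text file that happens to be stored on a unix server.
dos_file_on_unix = b"one\r\ntwo\r\n"

wire = unix_server_ascii_send(dos_file_on_unix)

# Each original CR LF has become CR CR LF: a liberal client counts the
# first CR as one break and the following CR LF as another.
lines = wire.splitlines()

print(wire)   # b'one\r\r\ntwo\r\r\n'
print(lines)  # [b'one', b'', b'two', b'']
```

In this sketch the extra lines come from the file not being in the
server's native text format, which is the storage-side issue from
point 6, not from the use of ASCII mode as such.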


Compare the following two statements:

  All systems that matter these days are ASCII based, of either the
  Macintosh or DOS or unix variety.  Actually there are some exceptions
  (like VM/CMS), but we can recognize them specially so we don't have to
  deal with the general case of "an FTP server".

  All browsers that matter these days are configured to show inline
  images / frames / interpret Javascript / ...  Actually there are some
  exceptions (like Lynx), but they can be recognized specially (if someone
  really cares) so as a Web author we don't have to deal with the general
  case of "a[ny] Web browser".

   Klaus




