
Re: [Bug-wget] Shouldn't wget strip leading spaces from a URL?


From: L A Walsh
Subject: Re: [Bug-wget] Shouldn't wget strip leading spaces from a URL?
Date: Wed, 14 Jun 2017 11:49:59 -0700
User-agent: Thunderbird

Dale R. Worley wrote:
 But of course, no [RFC3986-conforming] URL
 contains an embedded space because that's what it
 says in RFC 3986, which is "what *defines* what a
 URL *is*" [sic; should read "is one definition of
 a URL"].
---
   Right, just like speed limit signs define
what the maximum speed is.

There is the "model" and there is reality.  To believe that
the model replaces and/or dictates reality is not
realistic and bordering on some mental pathology.

I understand what you are saying, Dale.  My dad was a lawyer,
and life would be so much easier if specs, RFCs or other
models of reality were the only thing we had to pay attention
to.  But... to do so generally creates various levels of
discomfort and/or headaches.


 Now, someone can provide a string that contains spaces and claim
 it's a URL, but it isn't. The question is, What to do with it?  My
 preference is to barf and tell the user that what they provided
 wasn't a proper URL.
---
   I.e.: not doing what you can to give them some output
that is your _best_ _attempt_ to give them what they wanted
(excluding dangerous interpretations).
   A friendly user-interface attempts to help the user get
what they want despite their not asking for it according to
regulation or with poor syntax or spelling.



 Beyond that, one might do some simple tidying up, such as removing
 leading and trailing spaces.  That fix, by the way, is known to be
 safe, *because a URL can't contain a space*, and so any trailing
 space can't actually be part of the URL.
----
   One might argue that leading and trailing space, since they
are not "internal" to the URL, aren't really a part of the URL.
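
As a minimal sketch of that tidying step (this is not wget's code;
the name tidy_url is hypothetical and used only for illustration),
stripping the surrounding whitespace while leaving the interior of
the string alone could look like this in Python:

```python
def tidy_url(raw: str) -> str:
    """Return the user-supplied string without leading/trailing whitespace.

    Hypothetical helper for illustration only; not part of wget.
    """
    # str.strip() with no argument removes whitespace (spaces, tabs,
    # newlines) from both ends and never touches interior characters,
    # so an embedded space is left for a later check to deal with.
    return raw.strip()
```

Only the ends are touched, so this cannot change a string that was
already a valid URL.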

 It gets uglier when there are invalid characters in the middle of
 the URL, because simply deleting them is unlikely to produce the
 results the user expected.
---
   Yup.  Thus my original post: since they can't really be
part of a URL, they are "characters non grata" and should be
removed before the URL is sent to a remote website.
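
For the embedded case, here is a rough sketch of the "tidy the ends,
complain about the middle" behaviour discussed above.  It is only an
illustration under my own assumptions: check_url is a hypothetical
name, and nothing here reflects what wget actually does.

```python
def check_url(raw: str) -> str:
    """Strip surrounding whitespace, then reject embedded whitespace.

    Hypothetical helper for illustration only; not part of wget.
    """
    url = raw.strip()  # safe: whitespace at the ends is not part of the URL
    for pos, ch in enumerate(url):
        if ch.isspace():
            # Silently deleting an interior character is unlikely to give
            # the user what they meant, so report it instead.
            raise ValueError(f"whitespace at position {pos} in URL: {url!r}")
    return url
```

Whether to raise an error, percent-encode, or delete the offending
character is exactly the policy question in this thread; the sketch
just picks the most conservative option.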

-linda





