Re: lynx-dev Escaping URLs on the command line?


From: Nelson H. F. Beebe
Subject: Re: lynx-dev Escaping URLs on the command line?
Date: Wed, 24 Apr 2002 06:10:21 -0600 (MDT)

Walter Ian Kaye <address@hidden> writes on 24 Apr
2002 00:50:01 -0700 with a request about how to escape special
characters in URLs.

The easy thing to do is simply to represent problem characters as a
percent sign followed by two uppercase hexadecimal digits, e.g., %C4
for character 196 decimal.  This is permitted anywhere in a URL.
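
As a rough illustration of that escaping, here is a minimal Python
sketch.  It assumes the characters are to be encoded by their ISO
Latin 1 code, as in the RFC text quoted below; the helper name
escape_latin1 and the choice to leave "/" unescaped are my own
assumptions for the example, not anything lynx itself does.

```python
from urllib.parse import quote, unquote


def escape_latin1(text, safe="/"):
    """Percent-encode `text` using ISO Latin 1 byte values.

    Characters listed in `safe` (plus ASCII letters, digits and a few
    unreserved marks) are left alone; everything else becomes
    "%" followed by two uppercase hexadecimal digits.
    """
    return quote(text, safe=safe, encoding="latin-1")


# Character 196 decimal is 0xC4 in ISO Latin 1.
print(escape_latin1(chr(196)))        # %C4
# A space is character 32 decimal, 0x20.
print(escape_latin1("my file.html"))  # my%20file.html
# The escaping is reversible.
print(unquote("%C4", encoding="latin-1") == chr(196))  # True
```

On a Unix command line the escaped form also avoids most shell
quoting problems, since the escaped text above contains no spaces.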

From RFC 1630, available at

        ftp://ftp.internic.net/rfc/rfc1630.txt
        ftp://ftp.math.utah.edu/pub/rfc/rfc1630.txt
        
entitled ``Universal Resource Identifiers in WWW'',

>> ...
>> ...
>>       There is a conflict between the need to be able to represent many
>>       characters including spaces within a URI directly, and the need to
>>       be able to use a URI in environments which have limited character
>>       sets or in which certain characters are prone to corruption.  This
>>       conflict has been resolved by use of an hexadecimal escaping
>>       method which may be applied to any characters forbidden in a given
>>       context.  When URLs are moved between contexts, the set of
>>       characters escaped may be enlarged or reduced unambiguously.
>> ...
>>    CONVENTIONAL URI ENCODING SCHEME
>> 
>>       Where the local naming scheme uses ASCII characters which are not
>>       allowed in the URI, these may be represented in the URL by a
>>       percent sign "%" immediately followed by two hexadecimal digits
>>       (0-9, A-F) giving the ISO Latin 1 code for that character.
>>       Character codes other than those allowed by the syntax shall not
>>       be used unencoded in a URI.
>>
>>    REDUCED OR INCREASED SAFE CHARACTER SETS
>> 
>>       The same encoding method may be used for encoding characters whose
>>       use, although technically allowed in a URI, would be unwise due to
>>       problems of corruption by imperfect gateways or misrepresentation
>>       due to the use of variant character sets, or which would simply be
>>       awkward in a given environment.  Because a % sign always indicates
>>       an encoded character, a URI may be made "safer" simply by encoding
>>       any characters considered unsafe, while leaving already encoded
>>       characters still encoded.  Similarly, in cases where a larger set
>>       of characters is acceptable, % signs can be selectively and
>>       reversibly expanded.
>> ...
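
The last quoted paragraph (making a URI "safer" while leaving already
encoded characters still encoded) might be sketched as follows.  This
is only an illustration under my own assumptions: the function name
make_safer, the set of characters treated as safe, and the
example.com URL are all hypothetical.

```python
import re
from urllib.parse import quote

# Matches an existing escape: "%" followed by two hexadecimal digits.
_ESCAPE = re.compile(r"(%[0-9A-Fa-f]{2})")


def make_safer(uri, safe="/:?&=#"):
    """Encode unsafe characters, leaving existing %XX escapes untouched."""
    parts = _ESCAPE.split(uri)
    # Because the pattern has one capturing group, odd-indexed items
    # are the escapes that were already present; keep them verbatim.
    return "".join(
        part if i % 2 else quote(part, safe=safe, encoding="latin-1")
        for i, part in enumerate(parts)
    )


once = make_safer("http://example.com/a b%20c")
print(once)                      # http://example.com/a%20b%20c
print(make_safer(once) == once)  # True
```

Applying the function a second time changes nothing, which is the
property the RFC describes: a % sign always introduces an encoded
character, so already escaped text is never escaped again.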

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- Center for Scientific Computing       FAX: +1 801 585 1640, +1 801 581 4148 -
- University of Utah                    Internet e-mail: address@hidden  -
- Department of Mathematics, 110 LCB        address@hidden  address@hidden -
- 155 S 1400 E RM 233                       address@hidden                    -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe  -
-------------------------------------------------------------------------------
