lynx-dev

Re: lynx-dev URLs with raw 8-bit chars (was: lynx: have bug)


From: Leonid Pauzner
Subject: Re: lynx-dev URLs with raw 8-bit chars (was: lynx: have bug)
Date: Mon, 22 Mar 1999 12:42:51 +0300 (MSK)

21-Mar-99 20:37 Klaus Weide wrote:
> On Sun, 21 Mar 1999, Leonid Pauzner wrote:
>> 21-Mar-99 12:38 Klaus Weide wrote:
>> > On Sun, 21 Mar 1999, Leonid Pauzner wrote:
>> >>
>> >> UTF-8 URL-encoding was proposed in several recent drafts
>> >> (not at hand, but I remember a note that certain protocols
>> >> or servers may expect blind %xx encoding, not utf-8),
>> >> so we may need a configurable option between (1) and (2) for
>> >> compatibility.
>> >> Also I doubt lynx does (2) in all cases, saw it only for HTML's -
>>
>> I mean the translation to utf-8 exist and document charset is not iso-8859-1.


> The behavior seems to be consistently this, for normal 8-bit charsets
> ('translation to utf-8 exist'  applies, didn't test UTF-8, CJK,
> Transparent):

>   If Display character set == the document's effective charset,
>   then raw 8-bit bytes get hex-encoded directly as byte values.
>   If Display character set != the document's effective charset,
>   then UTF-8 representation gets hex-encoded.  'effective charset'
>   as derived from explicit label and -assume_charset etc. as usual,
>   i.e. what '=' shows.

This is not logical within our model: the server is not aware of our
display charset.
In my 'real life' example the document was windows-1251 (assumed)
and the display was cp866, so I failed ('right' according to your logic),
but someone else with a windows-1251 display will succeed.
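
To make the difference concrete, here is a minimal sketch of the rule as quoted
above. It is Python written for illustration, not lynx's C code; the function
name encode_url_chars and its parameters are hypothetical, and it only models
the two cases described (equal vs. different charsets).

```python
from urllib.parse import quote


def encode_url_chars(text, doc_charset, display_charset):
    """Hypothetical model of the quoted rule; not taken from lynx source.

    Same charsets      -> hex-encode the raw 8-bit bytes of the document charset.
    Different charsets -> hex-encode the UTF-8 representation instead.
    """
    if doc_charset.lower() == display_charset.lower():
        raw = text.encode(doc_charset)   # raw 8-bit byte values
    else:
        raw = text.encode("utf-8")       # UTF-8 byte sequence
    return quote(raw)                    # %XX-escape every non-ASCII byte


# Same windows-1251 document, two different users:
same = encode_url_chars("я", "windows-1251", "windows-1251")
diff = encode_url_chars("я", "windows-1251", "cp866")
print(same)  # %FF
print(diff)  # %D1%8F
```

The server receives two different URLs for the same link, depending only on a
setting it cannot know about.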


> That's also what I had intended to happen in LYUCFullyTranslateString,
> IIRC...

> This means that the user can usually toggle between the two interpretations
> with -raw / '@'.   It's not completely logical that the interpretation
> of URLs should depend on this.  OTOH there's the ease of switching, and
> it's more likely that encoding the raw value is the right thing (or even
> possible) when the user's environment is consistent with the server's.

It is completely wrong to overload -raw mode here (asking the user
to make the document unreadable in order to follow a link);
it could be a separate switch, like "dsoft-quotes", instead.

>     Klaus


