Re: LYNX-DEV Chartrans - forms


From: Hynek Med
Subject: Re: LYNX-DEV Chartrans - forms
Date: Sun, 2 Mar 1997 23:45:55 +0100 (MET)

On Sun, 2 Mar 1997, Klaus Weide wrote:

> On Sun, 2 Mar 1997, Hynek Med wrote:
> 
> > Klaus, I have one more idea for you - how are the forms data translated? 
> > I recall using your old patches against a traceroute gateway, and the
> > added charset parameter to content-type has puzzled the script so it
> > didn't produce any results.. 
> 
> If you can reproduce that, please give the URL and what settings for
> character set and raw you were using.  And make sure that the thing
> works without the chartrans code, if you have another version of
> Lynx around.

Well, I couldn't use another lynx, because the server sent documents
marked as Windows-1250. After my complaint the webmaster changed it, so
now it doesn't mark the documents as Windows-1250. I know that it perhaps
wasn't the best solution, but he decided to do this, not me. Anyway, at
least it's now readable in older (read: non-chartrans) versions of lynx.

To make myself clearer - if a document is marked as Windows-1250 by the
server (for example
http://www.ics.muni.cz/cgi-bin/toCP1250/~dolecek/rozcesti.html),
non-chartrans versions of lynx cannot cope with it. They do this: 

text/html;charset=windows-1250  D)ownload, or C)ancel

How nice it would be to have an I)gnore option there too..

> I didn't change anything in the code that appends "; charset=".. for
> POSTed form submissions.  (But I probably should.)  Some of the assumptions
> made in that code (see comment in GridText.c) are wrong now, but
> "charset=" will still only be sent back if the document containing the
> form had an explicitly given charset. 

Which was the case, and the script didn't react to this correctly. Instead
of changing the script, the webmaster dropped charset marking on documents..
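
A minimal sketch of how a script could cope with the appended parameter:
compare only the media type and ignore anything after the ";". The
function name media_type_is is hypothetical, not taken from Lynx or from
the gateway script, whose source is not shown here.

```c
#include <ctype.h>

/* Hypothetical helper, not from Lynx or any real CGI library.
 * Returns 1 if the media type at the start of `header` equals `type`,
 * ignoring case and any trailing parameters such as "; charset=...". */
int media_type_is(const char *header, const char *type)
{
    while (*header == ' ' || *header == '\t')
        header++;                       /* skip leading whitespace */

    while (*type != '\0') {
        if (tolower((unsigned char)*header) != tolower((unsigned char)*type))
            return 0;                   /* mismatch, or header ended early */
        header++;
        type++;
    }

    while (*header == ' ' || *header == '\t')
        header++;                       /* allow whitespace before ';' */

    /* The type must end here: end of string or start of parameters. */
    return *header == '\0' || *header == ';';
}
```

With this kind of check, "application/x-www-form-urlencoded; charset=windows-1250"
is still recognised as an ordinary form submission.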

> (In that case though, the charset
> sent back may not always be correct now...)  So I don't understand how
> the current chartrans code could lead to the failure you describe -
> a server (or CGI form) which deals out charset= parameters should be able 
> to also understand them.  

Normal lynx code would lead to this too, but, as I wrote before, normal
lynx wouldn't even let me see the form.. Sorry for blaming your patches
for that. :-)

> > Maybe it's a better idea to translate the
> > data for the forms to the character set of the document with the form? 
> 
> That's a nice idea.  Finally it should probably be done.
> The nice idea becomes less nice when you consider that
> - we may not know the document charset (but have only "assumed" it)

Well, we do this for normal documents anyway.. Or we could do this only
if we know for sure what the document charset is..

> - we may not be able to translate back, or translate not all characters

Do you mean - if your display is US-ASCII you can't produce č from it?
Well.. Still, a partial translation is better than nothing..
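
A rough sketch of what such a partial translation could look like,
assuming the form data is available as Unicode code points and the
document charset is Windows-1250. The names cp_map, cp1250_subset and
to_cp1250_partial are made up for illustration and the table holds only
two entries; this is not how the chartrans code does it.

```c
#include <stddef.h>

/* Hypothetical mapping entry: Unicode code point -> Windows-1250 byte. */
struct cp_map { unsigned int unicode; unsigned char byte; };

/* Tiny illustrative subset of Windows-1250; a real table has many more rows. */
static const struct cp_map cp1250_subset[] = {
    { 0x010C, 0xC8 },   /* C with caron */
    { 0x010D, 0xE8 },   /* c with caron */
};

/* Encode `n` code points into `out` (which must hold at least n bytes).
 * ASCII passes through, known characters are mapped, and everything else
 * becomes '?'. Returns the number of characters that had to be replaced. */
size_t to_cp1250_partial(const unsigned int *in, size_t n, unsigned char *out)
{
    const size_t rows = sizeof cp1250_subset / sizeof cp1250_subset[0];
    size_t replaced = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned char b = '?';
        int found = 0;

        if (in[i] < 0x80) {             /* ASCII is identical in both */
            b = (unsigned char)in[i];
            found = 1;
        } else {
            for (size_t k = 0; k < rows; k++) {
                if (cp1250_subset[k].unicode == in[i]) {
                    b = cp1250_subset[k].byte;
                    found = 1;
                    break;
                }
            }
        }
        if (!found)
            replaced++;                 /* no equivalent in the target charset */
        out[i] = b;
    }
    return replaced;
}
```

The return value lets the caller decide whether to warn the user that
some characters could not be translated.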

> And of course documents should contain the ACCEPT-CHARSET attribute on
> input form fields, as in RFC2070...

Sorry to say, I haven't seen this in practice. People just look at their
pages with Netscape, in Windows encoding, and if it's fine for them, they
think it's fine for everybody. :-(

> > PS It's nice to see extra HTML entities like č being added..
> 
> Yes, it might be nice.  Go ahead, Hynek, add more of them!
> The place to add them is in HTMLDTD.c, you don't need to understand C,
> just follow the examples (search for "ccaron"...) and add entities
> in alphabetic (ASCII) order.  Of course you need to know the Unicode
> values those entities stand for...

If it's only a problem of adding them there, I can do that. (I have even
looked at the code already - because I was wondering why the ccaron I tried
worked but Ccaron didn't - it looks like we both like ccaron for testing.. :-) 
I know I should stop complaining / asking for features and start doing
something..
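
A guess at why the ordering matters: if the table is searched with a
binary search (an assumption, not checked against HTMLDTD.c), entries
must be in strcmp order, where uppercase sorts before lowercase. A small
sketch with a hypothetical struct and function name; 268 and 269 are the
Unicode values U+010C and U+010D.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the real table in HTMLDTD.c may differ. */
struct entity { const char *name; unsigned int code; };

/* Must stay sorted in ASCII (strcmp) order: "Ccaron" < "ccaron" < "lrm". */
static const struct entity entities[] = {
    { "Ccaron", 268  },   /* U+010C, C with caron */
    { "ccaron", 269  },   /* U+010D, c with caron */
    { "lrm",    8206 },   /* U+200E, left-to-right mark */
};

/* bsearch comparator: the key is a name string, the element a table row. */
static int cmp_entity(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct entity *)elem)->name);
}

/* Returns the Unicode value for `name`, or 0 if the entity is unknown. */
unsigned int entity_lookup(const char *name)
{
    const struct entity *e = bsearch(name, entities,
                                     sizeof entities / sizeof entities[0],
                                     sizeof entities[0], cmp_entity);
    return e ? e->code : 0;
}
```

Under that assumption, an entry inserted out of order could be missed by
the search even though it is present in the table.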

BTW, why isn't &lrm; working? It reads:

 {"lrm",       8206},  /* left-to-right mark */

And is rendered as &lrm; in lynx.. I don't know what lrm should be,
anyway. Does it mean that the entity entries can be in hex only?
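
On the notation itself: to the C compiler a decimal and a hex literal
are the same integer, so 8206 and 0x200E should be interchangeable in
such a table. A small check, using a hypothetical struct and helper
name rather than anything from the Lynx source:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout, modeled on the {"lrm", 8206} entry quoted above. */
struct entity { const char *name; unsigned int code; };

/* The same entry written with a decimal and with a hex literal. */
static const struct entity lrm_decimal = { "lrm", 8206 };
static const struct entity lrm_hex     = { "lrm", 0x200E };

/* Writes the conventional "U+XXXX" form of `code` into `buf`.
 * Returns what snprintf returns (the length that was needed). */
int format_codepoint(unsigned int code, char *buf, size_t size)
{
    return snprintf(buf, size, "U+%04X", code);
}
```

U+200E is the left-to-right mark, a zero-width formatting character, so
a correct rendering would show nothing visible at all.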


Hynek

--
Hynek Med, address@hidden
