lynx-dev

Re: lynx-dev reading sjis docs [was Re: lynxcgi problem]


From: Hataguchi Takeshi
Subject: Re: lynx-dev reading sjis docs [was Re: lynxcgi problem]
Date: Thu, 30 Dec 1999 11:14:56 +0900 (JST)

On Tue, 28 Dec 1999, Henry Nelson wrote:

> > By the way, I'm wondering whether ASSUME_CHARSET doesn't work for Japanese
> > as expected now, as you once wrote.
> > Do you know the relationship between ASSUME_CHARSET and
> > "kanji code", which can be changed by ^L with SH_EX?
> 
> ASSUME_CHARSET is turned off for CJK, as far as I know.  Our LAN service
> is very unstable right now, so I cannot try to search the archives for you,
> but look in the "http://www.flora.org/lynx-dev/html/month1097" archives,
> and grep for "did something happen to."  Or obtain the whole month's archive
> from me: "http://www.irm.nara.kindai.ac.jp/lynxdev/archives/9710.arc.gz".
> Don't bother reading my posts; only read those by Klaus Weide.  You may find
> a few hints.  Klaus and Leonid Pauzner are probably the only two people
> besides Hiroyuki Senshu who could help you in this area.

Thank you very much. Now I see that ASSUME_CHARSET is off for CJK.
But I still don't understand why it's off. I'll keep checking the archives.

> My *hunch* is that ASSUME_CHARSET would not offer much to help Lynx render
> Japanese documents.  How can you assume?

My idea is almost the same as Hiroyuki's manual override switch.
We usually set it to "Japanese (Auto Detect)" and sometimes
set it to "Japanese (Shift_JIS)" or "Japanese (EUC)"
when Lynx fails to detect the document character set.

I think ASSUME_CHARSET is something that should play this role.
Anyway, I'll try to find out why ASSUME_CHARSET is off for CJK.

> But rather than waste your time because of my ignorance, I have put four
> examples of a "problem page" on my server.  They contain a mixture of
> three encodings, one with no meta tag, and three with meta tags as indicated.
> They have the same content except for the meta.

Thanks. There seems to be no difference in the output among them.
It seems <META ... CONTENT="text/html;charset=hogehoge"> has no effect
on Japanese documents.

# There are some Japanese documents which declare the WRONG character set.
# If Lynx processes the META tag strictly, we can't get proper output
# from such wrong pages. I wonder whether this is one reason Hiroyuki
# added the manual override function.
# NN and IE don't seem to process the META tag strictly.
# I think that's why such wrong documents exist
# in Japan. :-<
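
A minimal sketch of the non-strict handling described above, written in
Python rather than taken from Lynx, NN, or IE: the declared charset is
tried first, and if the bytes do not decode under it, a few common Japanese
encodings are tried in turn. The function name, the candidate list, and its
order are assumptions for illustration only. Trial decoding is a heuristic
and can misjudge short or ambiguous input.

```python
# Candidate encodings to try when the declared charset is wrong or unknown.
# The order is an assumption made for this sketch.
JAPANESE_CANDIDATES = ("iso2022_jp", "euc_jp", "shift_jis")


def decode_with_fallback(data: bytes, declared: str):
    """Return (charset_used, text), preferring the declared charset."""
    for charset in (declared, *JAPANESE_CANDIDATES):
        try:
            return charset, data.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are not valid in this charset.
            # LookupError: the charset name itself is unknown ("hogehoge").
            continue
    raise ValueError("no candidate charset could decode the document")


if __name__ == "__main__":
    # A Shift_JIS page that wrongly declares itself as EUC-JP.
    page = "日本語".encode("shift_jis")
    print(decode_with_fallback(page, "euc_jp"))
```

With strict handling the first decode error would be final; here it only
moves on to the next candidate.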

> > If this is right, I think ASSUME_CHARSET should work properly.
> > # "Japanese (Auto Detect)" should be added in the list, if needed.
> > Don't you agree with me, Henry?
> 
> Sorry, but I just don't know.  My "gut feeling" is that "Japanese (Auto
> Detect)" should be the default unless Lynx can determine from the server
> header or a meta definition what the character encoding is, and in that
> case set the document character encoding to what it has determined.

Right.
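
The order of preference discussed here can be sketched as follows. This is
only an illustration in Python, not Lynx code: the function and parameter
names are hypothetical, and letting a manual setting win over the declared
charset is an assumption based on the override switch mentioned earlier.

```python
from typing import Optional

AUTO_DETECT = "Japanese (Auto Detect)"


def pick_document_charset(
    http_charset: Optional[str],
    meta_charset: Optional[str],
    manual_override: Optional[str] = None,
) -> str:
    """Choose the charset used to read a document."""
    # Assumption: an explicit manual choice beats anything the page declares.
    if manual_override and manual_override != AUTO_DETECT:
        return manual_override
    # Otherwise trust the server header first, then the META definition.
    for declared in (http_charset, meta_charset):
        if declared:
            return declared
    # Nothing was declared, so fall back to automatic detection.
    return AUTO_DETECT


if __name__ == "__main__":
    print(pick_document_charset(None, None))
    print(pick_document_charset(None, "Shift_JIS"))
    print(pick_document_charset("EUC-JP", None, "Japanese (Shift_JIS)"))
```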

> Another point I don't understand is how changing the document character
> set from the form-based O)ption Menu is/should be different from the ^L
> switch.

I think there are two differences. One is the way the character set is
changed; the other is whether ASSUME_CHARSET is used or not.
(Sorry, I know of no other differences beyond what you may already know.)

I simply thought ASSUME_CHARSET should be used.
But I'm not sure that's right.
--
Takeshi Hataguchi
E-mail: address@hidden
