lynx-dev assume_charset for Japanese


From: Hataguchi Takeshi
Subject: lynx-dev assume_charset for Japanese
Date: Sun, 23 Jan 2000 00:48:49 +0900 (JST)

I wrote a patch to improve assume_charset handling when both the
display charset and the assumed charset are Japanese.

The main changes are:
  o enable assume_charset when both the display charset and the
    assumed charset are Japanese. assume_charset then acts like a
    charset given by a META tag, but only when no charset is
    specified by META or by the HTTP response.

  o change the behavior of the option menu.
    For example, when the display charset is Japanese (EUC-JP) and
    CJK mode is on, and assume_charset is changed from iso-8859-1 to
    shift_jis using the option menu:
        old behavior:
            assume_charset won't be changed.
            CJK mode will turn off.
        new behavior:
            assume_charset will be changed to shift_jis
            CJK mode won't be changed.

  o fix a minor bug in my patch of 16 Jan.
    # this only changes "#ifdef 0" to "#if 0" in SGML.c
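
To illustrate the first two items, here is a minimal standalone sketch,
not the patch itself. The names kcode_t, resolve_kcode and
raw_mode_after_change are hypothetical and do not exist in Lynx; the
real code works on text->kcode, UCAssume_MIMEcharset, HTCJK and
LYRawMode as shown in the diff below. The sketch also simplifies one
detail: when nothing matches it returns "no kanji code", whereas the
patch leaves the existing values untouched in that case.

```c
#include <string.h>

/* Hypothetical stand-in for Lynx's kanji-code values (not the real HTkcode). */
typedef enum { K_NOKANJI, K_EUC, K_SJIS } kcode_t;

/*
 * Choose the kanji code for a document.
 *   explicit_kcode: code taken from META or the HTTP response,
 *                   or K_NOKANJI if neither specified a charset.
 *   assume_charset: the assume_charset setting, or NULL if unset.
 */
kcode_t resolve_kcode(kcode_t explicit_kcode, const char *assume_charset)
{
    if (explicit_kcode != K_NOKANJI)
        return explicit_kcode;      /* an explicit charset always wins */

    if (assume_charset != NULL) {
        /* Same exact-match string comparison as in the patch. */
        if (strcmp(assume_charset, "euc-jp") == 0)
            return K_EUC;
        if (strcmp(assume_charset, "shift_jis") == 0)
            return K_SJIS;
    }
    return K_NOKANJI;               /* simplification, see note above */
}

/*
 * Raw mode after assume_charset is changed in the option menu.
 * With Japanese CJK mode on, raw mode is left alone; otherwise it is
 * recomputed as "assumed charset equals display charset".
 * The two handle arguments are hypothetical charset indexes.
 */
int raw_mode_after_change(int cjk_is_japanese, int old_raw_mode,
                          int assume_handle, int display_handle)
{
    if (cjk_is_japanese)
        return old_raw_mode;
    return assume_handle == display_handle;
}
```

For example, resolve_kcode(K_NOKANJI, "shift_jis") yields K_SJIS, while
resolve_kcode(K_EUC, "shift_jis") stays K_EUC because the explicit
charset takes priority.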

I attach four sample files, the same ones I sent before:
    metaEUC.html, nometaEUC.html, metaSJIS.html, nometaSJIS.html

metaEUC.html and nometaEUC.html are written in euc-jp and are almost
identical except for the HEAD element: metaEUC.html specifies the
charset with META.

metaSJIS.html and nometaSJIS.html are written in shift_jis and are almost
identical except for the HEAD element: metaSJIS.html specifies the
charset with META.

I expect the same result from
    % lynx -dump metaEUC.html
    % lynx -dump metaSJIS.html
    % lynx -dump -assume_charset=euc-jp nometaEUC.html
    % lynx -dump -assume_charset=shift_jis nometaSJIS.html

With this patch applied, Lynx gives the expected result.
Without this patch, it does not.

Please apply this patch after applying my earlier patch of 16 Jan.
All changes can be tested without CJK_EX.
--
Takeshi Hataguchi
E-mail: address@hidden

%%% Created Sat Jan 22 22:08:49 JST 2000 by target lynx.patch. %%%
diff -bru orig/lynx2-8-3/WWW/Library/Implementation/SGML.c lynx2-8-3/WWW/Library/Implementation/SGML.c
--- orig/lynx2-8-3/WWW/Library/Implementation/SGML.c    Sun Jan 16 17:14:56 2000
+++ lynx2-8-3/WWW/Library/Implementation/SGML.c Sat Jan 22 09:07:18 2000
@@ -39,7 +39,7 @@
 # include <LYPrettySrc.h>
 #endif
 
-#ifdef 0
+#if 0
 #ifdef CJK_EX  /* 1997/12/12 (Fri) 16:54:58 */
 extern HTkcode last_kcode;
 #endif
diff -bru orig/lynx2-8-3/src/GridText.c lynx2-8-3/src/GridText.c
--- orig/lynx2-8-3/src/GridText.c       Sun Jan 16 18:57:58 2000
+++ lynx2-8-3/src/GridText.c    Sat Jan 22 21:44:34 2000
@@ -855,7 +855,6 @@
     /*
      *  Check the kcode setting if the anchor has a charset element. - FM
      */
-    if (anchor->charset)
        HText_setKcode(self, anchor->charset,
                       HTAnchor_getUCInfoStage(anchor, UCT_STAGE_HTEXT));
 
@@ -11446,7 +11445,16 @@
                HTCJK = NOCJK;
        }
     }
-    text->specified_kcode = explicit ? text->kcode : NOKANJI;
+    if (explicit)
+       text->specified_kcode = text->kcode;
+    else {
+       if (UCAssume_MIMEcharset) {
+           if (!strcmp(UCAssume_MIMEcharset, "euc-jp"))
+               text->kcode = text->specified_kcode = EUC;
+           else if (!strcmp(UCAssume_MIMEcharset, "shift_jis"))
+               text->kcode = text->specified_kcode = SJIS;
+       }
+    }
 
     return;
 }
diff -bru orig/lynx2-8-3/src/LYOptions.c lynx2-8-3/src/LYOptions.c
--- orig/lynx2-8-3/src/LYOptions.c      Fri Jan  7 12:02:22 2000
+++ lynx2-8-3/src/LYOptions.c   Sat Jan 22 21:48:40 2000
@@ -24,6 +24,8 @@
 
 #include <LYLeaks.h>
 
+extern HTCJKlang HTCJK;
+
 BOOLEAN term_options;
 
 PRIVATE void terminate_options PARAMS((int sig));
@@ -922,6 +924,7 @@
                            StrAllocCopy(UCAssume_MIMEcharset,
                                         LYCharSet_UC[UCLYhndl_for_unspec].MIMEname);
                        }
+                       if (HTCJK != JAPANESE)
                        LYRawMode = (BOOL) (UCLYhndl_for_unspec == current_char_set);
                        HTMLSetUseDefaultRawMode(current_char_set, LYRawMode);
                        HTMLSetCharacterHandling(current_char_set);
@@ -4048,7 +4051,7 @@
                LYUseDefaultRawMode = TRUE;
                HTMLUseCharacterSet(current_char_set);
            }
-       if (assume_char_set_changed) {
+       if (assume_char_set_changed && HTCJK != JAPANESE) {
                LYRawMode = (BOOL) (UCLYhndl_for_unspec == current_char_set);
            }
        if (raw_mode_old != LYRawMode || assume_char_set_changed) {

Attachment: samples.tar.gz
Description: Binary data