Re: lynx-dev current_codepage in WIN_EX&&CJK

From:	Hataguchi Takeshi
Subject:	Re: lynx-dev current_codepage in WIN_EX&&CJK_EX (was: Lynx .IDE file for Borland C ++)
Date:	Wed, 12 Jan 2000 00:22:28 +0900 (JST)

Thank you for your comment.

On Mon, 10 Jan 2000, Klaus Weide wrote:

> On Mon, 10 Jan 2000, Hataguchi Takeshi wrote:
> 
> > I wrote a patch for dev18. The main changes are:
> >   o wrap long text that consists only of CJK characters in source mode.
> >   o avoid writing CJK characters at the 80th column.
> >
> > Text that consists only of CJK characters was never wrapped in
> > source mode before. So I changed "goto check_IgnoreExcess;" to
> > "goto check_Tab;". But I'm not sure this is the appropriate way.
[snip]
> You are now jumping to
> > +check_Tab:
> >      if (ch == '\t') {
> but is that really what you want?  I.e., do you really want the stuff
> between there and
>     } /* if tab */
>     else if ( (text->source || dont_wrap_pre) && text == HTMainText) {
> ?

No.

> If no, then note that the 'else' in
>     else if ( (text->source || dont_wrap_pre) && text == HTMainText) {
> is completely unnecessary.  It can be removed without changing the flow
> of control.  If that were more obvious, maybe that is where you would have
> put your new goto label?

That's true. Thank you.
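
To illustrate why the 'else' can go, here is a minimal sketch. It is
not the real HText_appendCharacter: classify(), its parameters and the
enum are hypothetical stand-ins. Because every path through the tab
branch returns, the code after it is reached only when the branch was
not taken, so a goto label can sit between the two ifs.

```c
#include <assert.h>

/* Hypothetical result codes, only for this illustration. */
enum path { PATH_TAB, PATH_WRAP_CHECK, PATH_PLAIN };

/* Simplified stand-in for the tail of a character-append routine. */
enum path classify(int ch, int is_cjk_pair, int source_mode)
{
    assert(ch >= 0);

    if (is_cjk_pair)
        goto check_wrap_source;   /* skip the tab handling entirely */

    if (ch == '\t') {
        return PATH_TAB;          /* every path in this branch returns */
    }
    /* No 'else' is needed: we only get here if the tab branch was
     * not taken, so the label can be placed between the two ifs. */
check_wrap_source:
    if (source_mode) {
        return PATH_WRAP_CHECK;
    }
    return PATH_PLAIN;
}
```

With the label placed there, a CJK character pair reaches the source
wrapping check without passing through the tab code.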

> > I changed only HText_appendCharacter, but didn't change display_line.
> > Though display_line should be changed also, I don't know how to change it.
> 
> Before LY_SOFT_NEWLINE was introduced, in SOURCE mode lines would just
> grow (without splitting at LYcols-1) up to MAX_LINE.  It was completely
> up to display_line to suppress characters beyond LYcols-1.  Now it isn't
> that important.  It will be if we return to the older handling (which would
> be a useful option IMO), and it is possible that some of Vlad's changes
> also effectively do something like that (dont_wrap_pre??).
> 
> In the display_line loop, i is the current display position for the
> character, although with some weird offset applied (it already points
> to the next position, or something like that.  It's even possible that
> it's broken).  So to make the right change to display_line, you'd probably
> test i against LYcols(+something) under
>                 } else if (HTCJK != NOCJK && !isascii((unsigned char)buffer[0])) {
> and break or continue if there isn't enough space.

Thank you very much for your advice.
I added a "goto" to break out of the while loop, because the check is
inside the switch statement, where a plain "break" would only leave
the switch.
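
A minimal sketch of that control-flow point, not the real display_line:
columns_drawn() is a hypothetical function, and treating every byte
with the high bit set as the first byte of a two-byte, two-column
character is a simplifying assumption. The column test is also stricter
than the "i >= LYcols" check in the patch below.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many display columns of data fit into cols columns. */
int columns_drawn(const unsigned char *data, size_t len, int cols)
{
    size_t pos = 0;
    int i = 0;                      /* current display column */

    assert(cols > 0);
    while (pos < len) {
        switch (data[pos] & 0x80) {
        case 0x80:                  /* two-byte, two-column character */
            if (pos + 1 >= len || i + 2 > cols)
                goto after_while;   /* "break" would only leave the switch */
            pos += 2;
            i += 2;
            break;
        default:                    /* single-byte, one-column character */
            if (i + 1 > cols)
                goto after_while;
            pos++;
            i++;
            break;
        }
    }
after_while:
    return i;
}
```

With three two-byte characters and cols set to 5, the loop stops after
the second character instead of drawing half of the third.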

> > --- GridText.c.org    Fri Jan  7 12:02:22 2000
> > +++ GridText.c        Mon Jan 10 09:09:36 2000
> > @@ -3677,7 +3677,7 @@
[snip]
> This whole section where you applied most of your changes - between
>     } /* if tab */
> and
>     if (ch == ' ') {
> - wasn't there at all before the LY_SOFT_NEWLINE introduction.
> Now apparently - it seemed so to you, at least? - it does the main branch
> of line splitting logic.  It didn't use to (since it wasn't there at all;
> compare some 2.7.1 source, for example).
> 
> The unplanned (it seems it "just happened") shifting of line splitting
> logic from after check_IgnoreExcess to before it may be responsible for
> some of the stuff we (you) now have to fix up.
> 
> Anyway, I suggest you try to imagine this section were not there at all
> (or actually remove it).  The logic should still be right, with the exception
> of LY_SOFT_NEWLINE insertion.

I see. I tried what you suggested here and changed display_line.

> > @@ -3998,6 +4007,7 @@
> >       */
> >      if (((indent + (int)line->offset + (int)line->size) +
> >        (int)style->rightIndent - ctrl_chars_on_this_line +
> > +      (((HTCJK != NOCJK) && text->kanji_buf) ? 1 : 0) +
> >        ((line->size > 0) &&
> >         (int)(line->data[line->size-1] ==
> >                               LY_SOFT_HYPHEN ?
> 
> Shouldn't you do kanji_state-preservation in the new_line calls that
> follow after this, too?

No. That isn't needed here because text->kanji_buf and text->state
are currently neither referenced nor changed in new_line.
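
For reference, the extra term in the hunk quoted above reserves one
column while the first byte of a two-byte character is still pending.
A minimal sketch of the source-mode form of that test follows;
must_wrap() and its parameter names are hypothetical, and the UTF-8
part of the real condition is left out.

```c
#include <assert.h>

/* Returns nonzero if the line must be wrapped before the next byte.
 * target: columns already used on the line
 * pending_first_byte: nonzero while a first byte is being held back */
int must_wrap(int target, int cols, int right_indent,
              int cjk_enabled, int pending_first_byte)
{
    /* Keep one column free so both halves land on the same line. */
    int reserve = (cjk_enabled && pending_first_byte) ? 1 : 0;

    assert(cols > 1);
    return target >= (cols - 1) - right_indent - reserve;
}
```

With 80 columns and 78 already used, a single-byte character still
fits, but a pending two-byte character triggers the wrap.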

> It seems you could do this by putting the equivalent of
> > +         int save_kanji_buf = text->kanji_buf;
> > +         int save_state = text->state;
> > +
> > +         text->kanji_buf = '\0';
> > +         text->state = S_text;
>   .....
> > +         text->kanji_buf = save_kanji_buf;
> > +         text->state = save_state;
> 
> *into* split_line, so you don't have to surround all new_line
> occurrences with it.

We have to surround the HText_appendCharacter call instead, because
text->kanji_buf and text->state play a very important role in it.
I changed the order of these assignments.
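
A minimal sketch of why the save and restore has to surround the
recursive call. This is not the real HText code: struct text,
append_char(), the state names and SOFT_NEWLINE (a stand-in for
LY_SOFT_NEWLINE) are hypothetical, and only two-byte characters are
wrapped here.

```c
#include <assert.h>

enum { S_TEXT, S_KANJI };           /* hypothetical parser states */
#define SOFT_NEWLINE '\n'           /* stand-in for LY_SOFT_NEWLINE */

struct text {
    int kanji_buf;                  /* pending first byte, 0 if none */
    int state;
    int col, cols;                  /* current column, line width */
    unsigned char out[32];
    int len;
};

void append_char(struct text *t, int ch)
{
    assert(t->len + 3 <= (int)sizeof t->out);

    if (t->kanji_buf != 0) {        /* ch is the second byte */
        if (t->col + 2 > t->cols) {
            /* Hide the pending byte, or the recursive call would
             * treat the newline marker as the second byte. */
            int saved_kanji_buf = t->kanji_buf;
            int saved_state = t->state;

            t->kanji_buf = 0;
            t->state = S_TEXT;
            append_char(t, SOFT_NEWLINE);
            t->kanji_buf = saved_kanji_buf;
            t->state = saved_state;
        }
        t->out[t->len++] = (unsigned char)t->kanji_buf;
        t->out[t->len++] = (unsigned char)ch;
        t->col += 2;
        t->kanji_buf = 0;
        t->state = S_TEXT;
        return;
    }
    if (ch & 0x80) {                /* first byte: hold it back */
        t->kanji_buf = ch;
        t->state = S_KANJI;
        return;
    }
    t->out[t->len++] = (unsigned char)ch;
    t->col = (ch == SOFT_NEWLINE) ? 0 : t->col + 1;
}
```

Without the save and restore, the marker would be consumed as the
second half of the character and the real second byte would be
misread.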

This is my new patch.

--- GridText.c.org      Fri Jan  7 12:02:22 2000
+++ GridText.c  Tue Jan 11 23:40:02 2000
@@ -1346,7 +1346,15 @@
                    /*
                     *  For CJK strings, by Masanobu Kimura.
                     */
+                   /* We don't handle Japanese half-width characters here.
+                    * I think they shouldn't be included in the HText structure.
+                    * Please compile with CJK_EX; then all half-width
+                    * characters will be converted to full width. -- TH
+                    */
+                   if (i >= LYcols) goto after_while;
+
                    buffer[1] = *data;
+                   buffer[2] = '\0';
                    data++;
                    i++;
                    addstr(buffer);
@@ -1379,6 +1387,7 @@
        } /* end of switch */
     } /* end of while */
 
+after_while:
 #if !defined(NCURSES_VERSION)
     if (text->has_utf8) {
        LYtouchline(scrline);
@@ -3677,7 +3686,7 @@
                }
            }
        } else {
-           goto check_IgnoreExcess;
+           goto check_WrapSource;
        }
     } else if (ch == CH_ESC) {  /* S/390 -- gil -- 1587 */
        return;
@@ -3952,20 +3961,33 @@
        }
        return;
     } /* if tab */
-    else if ( (text->source || dont_wrap_pre) && text == HTMainText) {
+
+check_WrapSource:
+    if ( (text->source || dont_wrap_pre) && text == HTMainText) {
        /*
         * If we're displaying document source, wrap long lines to keep all of
         * the source visible.
         */
        int target = (int)(line->offset + line->size) - ctrl_chars_on_this_line;
        int target_cu = target + utfxtra_on_this_line;
-       if (target >= (LYcols-1) - style->rightIndent ||
+       if (target >= (LYcols-1) - style->rightIndent - 
+           (((HTCJK != NOCJK) && text->kanji_buf) ? 1 : 0) ||
            (text->T.output_utf8 &&
             target_cu + UTF_XLEN(ch) >= (LYcols_cu-1))
            ) {
+           int saved_kanji_buf;
+           int saved_state;
+
            new_line(text);
            line = text->last_line;
+
+           saved_kanji_buf = text->kanji_buf;
+           saved_state = text->state;
+           text->kanji_buf = '\0';
+           text->state = S_text;
            HText_appendCharacter (text, LY_SOFT_NEWLINE);
+           text->kanji_buf = saved_kanji_buf;
+           text->state = saved_state;
        }
     }
 
@@ -3998,6 +4020,7 @@
      */
     if (((indent + (int)line->offset + (int)line->size) +
         (int)style->rightIndent - ctrl_chars_on_this_line +
+        (((HTCJK != NOCJK) && text->kanji_buf) ? 1 : 0) +
         ((line->size > 0) &&
          (int)(line->data[line->size-1] ==
                                LY_SOFT_HYPHEN ?

--
Takeshi Hataguchi
E-mail: address@hidden
