lynx-dev

Re: lynx-dev outgoing_mail_charset (was: megapatch)


From: Klaus Weide
Subject: Re: lynx-dev outgoing_mail_charset (was: megapatch)
Date: Sun, 17 Oct 1999 18:37:29 -0500 (CDT)

On Sun, 17 Oct 1999, Leonid Pauzner wrote:

> Yes, I was unhappy with all these special cases LY_BOLD_*, LY_SOFT_HYPHEN,
> backspaces (no idea about it at all) and such.

Just get rid of LY_BOLD_START_CHAR, LY_BOLD_END_CHAR, LY_UNDERLINE_START_CHAR,
and LY_UNDERLINE_END_CHAR from the text, the rest may sort itself out
automatically...  (should be tested though)

I think a new incarnation of print_wwwfile_to_fd, maybe
print_translated_wwwfile_to_fd (which takes an additional charset
handle parameter) would be the way to go, rather than putting
more different cases within the existing function.  Better now,
before it's too late and print_wwwfile_to_fd has become another
too-complex mega-multi-purpose function where no one understands all
the cases...

The new function can forget about the DUMP_WITH_BACKSPACES stuff
and the LY_SOFT_HYPHEN stuff (it should be handled in the LYUCTrans*
function if there really are LY_SOFT_HYPHENs (which is rare)).
The LY_BOLD_* etc. could be removed by calling remove_special_attr_chars
(or doing the equivalent thing by hand) after a full line has been
collected.  LYK_SOFT_NEWLINEs should probably be consumed before that,
to splice lines together.
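
A minimal sketch of that order of operations (splice first, then strip),
not taken from the Lynx source.  The sk_* names and the marker byte
values are placeholders assumed for illustration; the real definitions
and remove_special_attr_chars itself may differ.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder marker bytes (assumed for this sketch only);
 * Lynx has its own definitions for the LY_* markers. */
enum {
    SK_UNDERLINE_START = 0x03,
    SK_UNDERLINE_END   = 0x04,
    SK_BOLD_START      = 0x05,
    SK_BOLD_END        = 0x06,
    SK_SOFT_NEWLINE    = 0x08
};

/* Return nonzero if c is one of the bold/underline marker bytes. */
static int sk_is_attr_char(unsigned char c)
{
    return c == SK_UNDERLINE_START || c == SK_UNDERLINE_END ||
           c == SK_BOLD_START || c == SK_BOLD_END;
}

/* If `next` starts with the soft-newline marker and fits, append the
 * rest of it to the collected line `acc` (capacity `cap`) and return 1.
 * Otherwise leave `acc` untouched and return 0. */
int sk_splice(char *acc, size_t cap, const char *next)
{
    size_t have, add;

    if ((unsigned char) next[0] != SK_SOFT_NEWLINE)
        return 0;
    have = strlen(acc);
    add = strlen(next + 1);
    if (have + add + 1 > cap)
        return 0;
    memcpy(acc + have, next + 1, add + 1);   /* copies the NUL too */
    return 1;
}

/* Remove bold/underline markers from a complete line, in place.
 * Returns the new length. */
size_t sk_strip_attr_chars(char *line)
{
    char *src = line;
    char *dst = line;

    for (; *src != '\0'; src++) {
        if (!sk_is_attr_char((unsigned char) *src))
            *dst++ = *src;
    }
    *dst = '\0';
    return (size_t) (dst - line);
}
```

A caller would keep calling sk_splice() while it returns 1, then call
sk_strip_attr_chars() once on the full line before handing it to the
charset translation.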

(Btw. I would find it nicer for display if LYK_SOFT_NEWLINEs were at
the end of the preceding line rather than at the beginning of the next.
Showing '\' at end-of-line rather than '+' at beginning-of-line is
more obvious for showing that a line has been split, IMO.  Anyway,
all that should currently only occur for various forms of SOURCE.)

> Perhaps a better idea is using of second temp file: read the result of
> print_wwwfile_to_fd() and convert it via LYUCTranslateBackFormData()
> line-by-line, but you decide this as a complication...

Using another tempfile would just make the code more messy, IMO.
It would make sense if lynx were calling external programs for
charset transcoding, but not otherwise.

Something equivalent seems to be done for *news postings* in the
case of CJK charsets, see use of CJKfile in LYNewsPost().
There doesn't seem to be an equivalent for sending mail, although
I may have overlooked it.  (Maybe the mailer is supposed to take
care of translating to ISO-2022-* on Japanese etc. systems - I know
very little about that.)

> The explanation given in lynx.cfg: there are many d.c.s. possible and (remote)
> mail agent may not recognize iso-8859-4 or cp866 or other uncommon charsets.
> Some kind of approximation would not be too bad - lynx have already done the
> conversion to d.c.s. ("internal" charset) - why not to convert it more if
> necessary? Anyway, the approximation is more readable than garbage.
> 
> In my particular case - d.c.s=cp866 which is generally not recognized
> by mail agents, but windows-1251 and koi8-r are.
> (also, 128-159 characters got stripped from the subject line
> by my mail provider; windows-1251 and koi8 keep letters out of this range).

Yes, it should work well in that case.  In other cases it would produce
unrecognizable 'garbage'.  Converting non-Latin text to US-ASCII (as
7 bit approximations) would not be useful in most situations, just a bit
better than nothing-at-all.  (For KOI8-R it might actually be useful
sometimes, because of the way the Cyrillic letters are assigned to
Latin-ASCII-value + 0x80.)
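
To illustrate the KOI8-R remark: clearing the high bit of each byte
gives a rough Latin transliteration, with letter case inverted.  This is
a standalone sketch, not Lynx code, and the function name is made up for
the example.

```c
#include <stddef.h>

/* Clear bit 7 of each input byte.  For KOI8-R letters (0xC0-0xFF) this
 * yields a phonetically similar ASCII letter with the case swapped.
 * Other high bytes (0x80-0xBF, mostly box-drawing characters) do not
 * map to anything useful this way.
 * `out` must have room for len + 1 bytes. */
void koi8r_strip_high_bit(const unsigned char *in, size_t len, char *out)
{
    size_t i;

    for (i = 0; i < len; i++)
        out[i] = (char) (in[i] & 0x7F);
    out[len] = '\0';
}
```

For example, the KOI8-R bytes D0 D2 C9 D7 C5 D4 (a lowercase Russian
word for "hello") come out as "PRIWET".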

Maybe there should be some sort of warning in lynx.cfg, like
"You're on your own".


Another limitation is that OUTGOING_MAIL_CHARSET is only applied
for one form of sending mail (from 'P'rint).  The others, specifically
'c'omment sending, are not covered (should they be?).

   Klaus

