
Re: Way to define character set when doing newspost? LYNX-DEV


From: David Woolley
Subject: Re: Way to define character set when doing newspost? LYNX-DEV
Date: Fri, 20 Jun 1997 09:04:05 +0100 (BST)

> 
> > Headers of posts from Lynx Version 2.7.1ac-0.28 read:
> >     'Content-Type: text/plain; charset=unknown-8bit'.
> 
> Hm, you have to talk with Klaus about it.. I think it's because Lynx
> doesn't insert the charset name for Japanese charsets, and some
> sendmail/whatever on the way adds there unknown-8bit..

If Lynx is going to permit mailing in 8 bit codes it really should declare
both character set and encoding.  If, as I think it does, it uses sendmail,
and it doesn't encode itself, it should use the new sendmail option to 
indicate the presence of raw 8 bit data.

There is a catch though.  It is also bad practice to declare a non-ASCII
character set when there are no non-ASCII characters.  This is becoming
annoyingly common on mailing lists, and can cause mail clients to do
special processing** - in particular, any mail declared as EUC, but actually
containing only ASCII, might cause the client to consider the mail
undisplayable.  Declaring Content-Transfer-Encoding: 8bit may also force
conversion to BASE64.

As such, Lynx really ought to pre-scan for 8 bit codes, if it is going
to permit 8 bit mailing.
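
As a rough illustration of the pre-scan suggested above, here is a minimal
Python sketch (Lynx itself is written in C, so this is not Lynx code).  The
function names and the `charset` parameter are hypothetical.  The
`-B8BITMIME` / `-B7BIT` flags are my reading of "the new sendmail option";
the text above does not name it.  The scan only detects octets with the
high bit set, so it covers 8 bit codes such as EUC, not 7 bit encodings of
non-ASCII character sets.

```python
def has_8bit(body: bytes) -> bool:
    """Return True if any octet in the body has the high bit set."""
    return any(b > 0x7F for b in body)


def mime_headers(body: bytes, charset: str) -> dict:
    """Choose MIME headers after pre-scanning the body.

    `charset` is the user's configured character set (hypothetical
    parameter).  It is declared only when 8 bit octets are present.
    """
    if has_8bit(body):
        return {
            "Content-Type": f"text/plain; charset={charset}",
            "Content-Transfer-Encoding": "8bit",
        }
    # Pure ASCII: do not advertise a non-ASCII charset or 8bit encoding.
    return {
        "Content-Type": "text/plain; charset=us-ascii",
        "Content-Transfer-Encoding": "7bit",
    }


def sendmail_args(body: bytes) -> list:
    """Build a sendmail argument list that declares the body type.

    Assumption: -B8BITMIME / -B7BIT is the option meant in the text.
    """
    body_type = "8BITMIME" if has_8bit(body) else "7BIT"
    return ["sendmail", "-t", "-B" + body_type]
```

With this, a message whose body turns out to be pure ASCII goes out as
us-ascii/7bit even when the user has EUC configured, which avoids the
special processing described above.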

> 
> > I was also puzzled by this header:
> >     `Content-Transfer-Encoding: base64'.
> > I assumed that Lynx had added this by default.  If I were using EUC,
> > would I want `Content-Transfer-Encoding: 8bit'?

That's not legal over SMTP.

> 
> It's probably because some sendmail/whatever on the way converts 8bit
> MIME to base64.. 


I think in this case it is converting undeclared 8 bit (a protocol violation)
into Base 64.  This is one of the legal options (the main other one is to
reject the mail) for an ESMTP compliant mailer when it receives a mail
without having 8 bit transmission negotiated (this behaviour is in an 
auxiliary RFC, entitled something like "Just Send 8", not the main ESMTP one,
which doesn't permit undeclared 8 bit transmission).  The behaviour of
SMTP mailers is explicitly undefined in this case.
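
The choices open to an ESMTP relay, as I have described them, can be
summarised in a small Python sketch.  It is a simplification for
illustration, not sendmail's actual logic; the function name, parameter
names and returned strings are all hypothetical.  It also includes the
downgrade of properly declared 8 bit mail when the next hop lacks 8BITMIME.

```python
def relay_action(contains_8bit: bool,
                 declared_8bitmime: bool,
                 next_hop_8bitmime: bool,
                 reject_undeclared: bool = False) -> str:
    """Decide what a relay does with a message (illustrative only)."""
    if not contains_8bit:
        # 7 bit content can always be forwarded unchanged.
        return "forward"
    if not declared_8bitmime:
        # Undeclared 8 bit data is a protocol violation: the two main
        # legal options are to convert it or to reject the mail.
        return "reject" if reject_undeclared else "convert-to-base64"
    if next_hop_8bitmime:
        # Declared and negotiated on both sides: pass through as 8 bit.
        return "forward"
    # Declared 8 bit, but the next hop cannot take it: downgrade.
    return "convert-to-base64"
```

The headers quoted at the top of this thread would correspond to the second
branch: 8 bit data arrived undeclared and was converted.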

(An ESMTP mailer receiving properly declared and negotiated 8 bit data will
still perform this conversion when forwarding via SMTP or ESMTP without 
8BITMIME support.

There are parts of the world where "just send 8" is endemic, and character
sets are assumed.  Because of this, sendmail can be configured to ignore the
protocol requirements in this area, and accept undeclared 8 bit mail quietly.
It can also be configured to assume that particular outbound relays will
accept 8 bit data even though they don't support ESMTP.  This can be quite
a controversial subject, with claims of American chauvinism - however, when
one looks at it, in most cases the character set is assumed, and this only
works within regions with similar alphabets.

The Lynx mailing list is not limited to a single character set area, so
just send 8 is not a reasonable approach.)
> 

When non-ASCII is involved, email composing gets quite complicated and ought
really to be left to a dedicated mail program.  From the point of view of Lynx
that makes it difficult to set up the initial message, even if the incoming
page is pure ASCII.  (Quoting in mixed character sets is possible
within the standards, but is essentially unsupported by the current
email clients.)

** Elm shells out to metamail, and then returns to the index rather than
stepping to the next item at the end of an item.

Windows clients are likely to display in their own code page, without
any warning.