Re: Way to define character set when doing newspost? LYNX-DEV


From: Foteos Macrides
Subject: Re: Way to define character set when doing newspost? LYNX-DEV
Date: Sat, 21 Jun 1997 12:24:19 -0500 (EST)

David Woolley <address@hidden> wrote:
>> 
>> > Headers of posts from Lynx Version 2.7.1ac-0.28 read:
>> >    'Content-Type: text/plain; charset=unknown-8bit'.
>> 
>> Hm, you have to talk with Klaus about it.. I think it's because Lynx
>> doesn't insert the charset name for Japanese charsets, and some
>> sendmail/whatever on the way adds there unknown-8bit..
>
>If Lynx is going to permit mailing in 8 bit codes it really should declare
>both character set and encoding.  If, as I think it does, it uses sendmail,
>and it doesn't encode itself, it should use the new sendmail option to 
>indicate the presence of raw 8 bit data.
>[...]

        Lynx, since its version 1.0, has had an option in its 'p'rint
menu to dump the HText structure for the (rendered) current document to
a temporary file and email it, presumably to yourself, with your
"Personal mail address" (if defined via the 'o'ptions menu) offered as
the default.  That never has, and still doesn't, in any code set, include
a Content-Type or Content-Transfer-Encoding header, even though the
rendering may have 8-bit characters (almost surely will, for some of the
display character sets).

        The can of worms we're discussing in this thread started when
I added hacks in LYPrint.c for emulating the Netscape hacks to allow
mailing of HTML "source" with a BASE tag artificially prepended, and
a Content-Type indicating text/html included, so that people could
invoke Lynx instead of just Netscape for such email.  That so-called
"source" is not, in fact, the HTML source.  It's a rendering, for
screen display (via the '\' command) of the text/html as if it were
text/plain, and thus the charset of the actual HTML source does not
validly apply to this "text/plain rendered, so-called source".  If
you want to email yourself a faithful copy of HTML source, you'd have
to set up an option for that in the 'd'ownload menu, or add a new
command which does it in conjunction with other things needed to
generate truly valid mail headers, and suffer a new retrieval of the
HTML document.

        In conjunction with adding and refining the EXP_CHARTRANS
stuff for the development code set, by virtue of its ability to
do cross-translations of different document charsets to display
character sets, Klaus could, and did, add a check for whether any
8-bit characters ended up in the rendition, and could specify the
display character set as the body charset for email headers, and
be right most of the time, as well as include a header indicating
presence of 8-bit encoding only when that's true.  That code is
broken for the CJK display character sets, but as far as I can tell,
works properly for all the others.  Homologous EXP_CHARTRANS stuff
was then added for all mailto HREFs (and for the 'c'omment command),
but that's broken for all display character sets as I've explained
in previous messages.  What it would take to make it work validly
with the current Lynx API, even with the EXP_CHARTRANS stuff added,
entails a reasonable amount of overhead.  Also, since you're involving
other (i.e., external mailer) software, and sending streams externally,
not just to the user's screen, even if you did it "right" according
to what is being said in this thread, you have no guarantees that the
external software has the requisite capabilities, or is configured
properly to use them.  Note that the IETF's MHTML-WG finally put out
an RFC, and almost immediately has started talking about rescinding
it or quickly replacing it, because the "implementation experience"
which followed indicates that specs in the RFC are unworkable in the
real world.  This is an area in which "what should be done" and "what
will actually work in the real world" can be very far apart.
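
        The 8-bit check described above can be sketched roughly as
follows.  This is only an illustration under stated assumptions, not
the actual EXP_CHARTRANS code: has_8bit() and format_mime_headers()
are hypothetical names, the charset argument is assumed to be the MIME
name of the display character set, and emitting no headers at all for
a pure 7-bit rendition is a choice made here for simplicity.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if any byte in the rendered text
 * has its high bit set, 0 otherwise. */
static int has_8bit(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if ((unsigned char) buf[i] & 0x80)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: fills `out` with MIME headers for a rendered
 * text/plain body.  The display character set is declared, and 8bit
 * encoding flagged, only when 8-bit bytes are actually present.
 * For a pure 7-bit body `out` is left empty (an assumption of this
 * sketch).  Returns the snprintf() result, or 0 if nothing is written. */
static int format_mime_headers(char *out, size_t outlen,
                               const char *body, size_t bodylen,
                               const char *display_charset)
{
    if (display_charset != NULL && has_8bit(body, bodylen)) {
        return snprintf(out, outlen,
                        "Content-Type: text/plain; charset=%s\n"
                        "Content-Transfer-Encoding: 8bit\n",
                        display_charset);
    }
    if (outlen > 0)
        out[0] = '\0';
    return 0;
}
```

Note that a check like this only makes the headers accurate for the
rendered text.  It does nothing to guarantee that the external mailer
will pass the 8-bit data through intact.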

                                Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 address@hidden         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================