lynx-dev UTF-8 encoding questions (was: Superscripts)


From: Klaus Weide
Subject: lynx-dev UTF-8 encoding questions (was: Superscripts)
Date: Wed, 7 Jun 2000 12:50:39 -0500 (CDT)

On 7 Jun 2000, Sergei Pokrovsky wrote:

> (this time I'll try to send it in an attachment).

That arrived here corrupt...  in MIME quoted-printable encoding
it's easy to see:

<i>Ruse:</i> =D0=BF=D1=AE=80=D0=C1=8F =D0=BE=D0=B1=D1=AF=D0, =D0=BF=D1=B0=
=80=D0=D1=D0=80

Looking only at the most significant bits of each byte, which are
all that is needed to check UTF-8 structure,

 =D0=BF  (binary 110xxxxx 10xxxxxx) is a valid character
 =D1=AE  (binary 110xxxxx 10xxxxxx) is a valid character
 =80     (binary 10xxxxxx) isn't the beginning of a character

 and so on.
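
The same top-bit check can be sketched in a few lines of Python. This is
only an illustration of the structural test described above, not anything
Lynx itself does; the function name first_utf8_error is made up for the
example, and the sample bytes are the first eight from the quoted line.
It looks only at the lead/continuation bit patterns, so it does not
catch overlong or otherwise out-of-range sequences.

```python
def first_utf8_error(data: bytes):
    """Return the offset of the first byte that breaks UTF-8 structure.

    Returns None if every byte fits the lead/continuation pattern.
    If the data ends in the middle of a sequence, returns len(data).
    Hypothetical helper for illustration; checks top bits only.
    """
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                # 0xxxxxxx: single byte (ASCII)
            need = 0
        elif b >> 5 == 0b110:       # 110xxxxx: lead byte, 1 continuation follows
            need = 1
        elif b >> 4 == 0b1110:      # 1110xxxx: lead byte, 2 continuations follow
            need = 2
        elif b >> 3 == 0b11110:     # 11110xxx: lead byte, 3 continuations follow
            need = 3
        else:                       # 10xxxxxx here, or 11111xxx: not a valid start
            return i
        for j in range(i + 1, i + 1 + need):
            # Every continuation byte must look like 10xxxxxx.
            if j >= len(data) or data[j] >> 6 != 0b10:
                return j
        i += 1 + need
    return None


# First eight bytes of the quoted-printable sample: =D0=BF=D1=AE=80=D0=C1=8F
sample = bytes.fromhex("D0BFD1AE80D0C18F")
print(first_utf8_error(sample))
```

For the sample, the first two pairs pass and the check stops at the lone
=80 byte, matching the analysis above.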

Although labelled as UTF-8 in both the MIME body part's Content-type
header field and in the included META tag, this certainly isn't.

OTOH, the KOI8-R in the first body part came through correctly.

Note that the putative UTF-8 encoding of the words is *shorter* than
the KOI8-R encoding, while actually it should be *longer*: KOI8-R uses
one byte per Cyrillic letter, UTF-8 uses two.  Did you apply some
conversion in the wrong direction?
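
The expected size relationship is easy to confirm with Python's built-in
codecs. The word used here is just an arbitrary Russian example, not text
from the original message.

```python
# Arbitrary six-letter Cyrillic word, chosen only for illustration.
word = "привет"

koi8 = word.encode("koi8_r")   # one byte per Cyrillic letter
utf8 = word.encode("utf-8")    # two bytes per Cyrillic letter

print(len(word), len(koi8), len(utf8))
```

A correctly converted UTF-8 body part should therefore be roughly twice
as long as its KOI8-R counterpart wherever the text is Cyrillic.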

   Klaus

