LYNX-DEV Re: lynx2.6 chartrans
From: Klaus Weide
Subject: LYNX-DEV Re: lynx2.6 chartrans
Date: Fri, 29 Nov 1996 17:02:22 -0600 (CST)
On Fri, 29 Nov 1996, Drazen Kacar wrote [to me]:
> Klaus Weide wrote:
> > On Fri, 29 Nov 1996, Drazen Kacar wrote:
> >
> > > Klaus Weide wrote:
> > > >
> > > > The code is available from
> > > >
> > > > URL: http://www.tezcat.com/~kweide/lynx-chartrans/
> [...] On IANA site, there is URL for tables with Microsoft code pages,
> but FTP server returns file not found. I suppose I'll manage, somehow :)
When I ran into that, I noticed that the directory structure on the
site pointed to had changed. Just browse through the FTP directories on
unicode.org, and you will probably find what you want.
> > For example you
> > mentioned that there are standards with a well-defined one-to-one
> > mapping between iso-8859-2 and cyrillic characters (for some languages).
>
> There is for Serbian and Macedonian. Could be there is for some others
> too, but I don't know. There might be some problems with ISO 8859-5.
> Recently there was a post on comp.unix.solaris from some Russian guy..
> He said that Sun maybe thinks ISO 8859-5 is enough for Russian, but they
> have different opinion on that. KOI-8 is de facto standard there.
> He posted URL for explanation, but I didn't save the article.
> It should be available via dejanews, article has iso-8859-2 in subject
> (or iso8859-2).
Without knowing much of the background - it seems obvious that KOI-8
may now be "standard" for Russian but is insufficient for many other
languages that use cyrillic script. ISO-8859-5 has at least some
characters used in languages other than Russian, while KOI-8 uses a lot
of space for all those DOS linedrawing characters.
> > Using such a mapping for displaying cyrillic text in (e.g.) iso-8859-2
> > would just require the right table. (The other direction would be
> > slightly more difficult, since currently for ASCII chars the translation
> > is skipped [cheating for speed].)
>
> Hehe... Did I tell you that some bright guys use ISO 646 code pages?
> Those are 7-bit standards, national characters are put instead of
> []{}|address@hidden characters. Hehehe... Don't be afraid, you don't have to
> support that... :)
I think that's also still popular in some Scandinavian countries.
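To make the ASCII shortcut mentioned above concrete, here is a minimal sketch of a table-driven translation with and without it. This is not the actual chartrans code: `xlat_table`, `init_swedish_iso646`, `translate_fast` and `translate_full` are names made up for illustration, and the six replacements are the Swedish ISO 646 letter positions as I recall them. With a 7-bit national variant, the bytes that need translating sit below 0x80, so the fast path passes them through unchanged.

```c
#include <stddef.h>

/* Hypothetical 256-entry table mapping each input byte to a Unicode
   code point. NOT the real chartrans data structure. */
typedef struct {
    unsigned short to_unicode[256];
} xlat_table;

/* Start from the identity mapping, then apply the six letter
   replacements of the Swedish ISO 646 variant (assumed values). */
static void init_swedish_iso646(xlat_table *t)
{
    size_t i;
    for (i = 0; i < 256; i++)
        t->to_unicode[i] = (unsigned short)i;
    t->to_unicode['[']  = 0x00C4; /* A with diaeresis */
    t->to_unicode['\\'] = 0x00D6; /* O with diaeresis */
    t->to_unicode[']']  = 0x00C5; /* A with ring */
    t->to_unicode['{']  = 0x00E4; /* a with diaeresis */
    t->to_unicode['|']  = 0x00F6; /* o with diaeresis */
    t->to_unicode['}']  = 0x00E5; /* a with ring */
}

/* With the shortcut: bytes below 0x80 skip the table lookup. */
static unsigned short translate_fast(const xlat_table *t, unsigned char c)
{
    if (c < 0x80)
        return c;
    return t->to_unicode[c];
}

/* Without the shortcut: every byte goes through the table. */
static unsigned short translate_full(const xlat_table *t, unsigned char c)
{
    return t->to_unicode[c];
}
```

For the byte `[`, `translate_fast` returns 0x5B while `translate_full` returns 0x00C4, which is why any mapping that touches the 7-bit range needs the slower path.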
> > Don't be misled if, when you first try my code, your name comes up
> > correctly, Dražen Kačar :). I added those two latin-2
> > entities just for you :) (actually, for testing), but not a full set of
> > latin-2 entities or latin-1 replacements for latin-2 chars.
>
> There's one wrong thing with Lynx entities table (before your changes).
> &ETH; and &Dstroke; are synonyms. ETH is Latin 1 (Icelandic, I think) and
> Dstroke is Latin 2 (4 or 5 languages). Upper case glyphs look the same,
> but lower case glyphs don't. And ASCII approximation is different, "dh"
> for &eth; and "dj" for &dstroke;. The full ISO entities tables come
> with XEmacs, IIRC.
So how about sending a patch?
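As a starting point for such a patch, a rough sketch of what separate entries could look like. The struct layout and the name `find_entity` are hypothetical and do not reflect how the Lynx entities table is actually organized. The code points are the Unicode values for the four characters; the lower-case approximations "dh" and "dj" are the ones given above, and the upper-case ones are my assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entity record, for illustration only. */
typedef struct {
    const char *name;     /* entity name without '&' and ';' */
    unsigned short ucs;   /* Unicode code point */
    const char *ascii;    /* 7-bit approximation */
} entity;

static const entity entities[] = {
    { "ETH",     0x00D0, "Dh" },  /* Latin 1; "Dh" is an assumption */
    { "eth",     0x00F0, "dh" },
    { "Dstroke", 0x0110, "Dj" },  /* Latin 2; "Dj" is an assumption */
    { "dstroke", 0x0111, "dj" },
};

/* Case-sensitive linear lookup; returns NULL for unknown names. */
static const entity *find_entity(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof entities / sizeof entities[0]; i++)
        if (strcmp(entities[i].name, name) == 0)
            return &entities[i];
    return NULL;
}
```

The lookup has to stay case-sensitive, since the upper- and lower-case entity names differ only in case.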
> > As long as longs are at least 32-bit, and shorts are 16-bit, the chartrans
> > code _should_ not create any additional problems...
>
> Did you mean shorts are at least 16 bits, or shorts are 16 bits? I have
> a bad feeling about this. Do you have access to OSF?
No access to OSF. I _think_ shorts longer than 16 bits wouldn't be a
problem (except for wasting some memory), but I guess we will find out.
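One way to check these assumptions on a given platform is a small sketch like the following. The function names are invented for this example and nothing like it is in the chartrans code. ISO C already guarantees the two "at least" conditions, so the interesting case is the third one: on a machine with wider shorts, code that relies on values wrapping at 16 bits would need an explicit mask, as in `to_ucs2`.

```c
#include <limits.h>

/* "At least" requirements; a conforming C compiler should satisfy both. */
static int short_is_at_least_16_bits(void)
{
    return USHRT_MAX >= 0xFFFFu;
}

static int long_is_at_least_32_bits(void)
{
    return ULONG_MAX >= 0xFFFFFFFFul;
}

/* The stricter condition, which is NOT guaranteed everywhere. */
static int short_is_exactly_16_bits(void)
{
    return USHRT_MAX == 0xFFFFu;
}

/* Reduce a value to 16 bits explicitly instead of relying on the
   width of unsigned short to truncate it. */
static unsigned short to_ucs2(unsigned long v)
{
    return (unsigned short)(v & 0xFFFFul);
}
```

If `short_is_exactly_16_bits()` returns 0, the only expected cost is the wasted memory, provided no code depends on implicit truncation.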
[ some details about Solaris 2.6 pointer problems snipped ]
> There are probably some more little devils, but nothing else comes to my
> mind right now.
I don't think any of those should be a problem for my patches, or even
for the Lynx code as a whole. But you'll find out..
> > > > Output of raw UTF8 (needs of course a terminal which understands it)
> > > > seems to work better, but not perfectly, with Slang. This is a problem
> > > > beyond Lynx, a curses replacement which understands multibyte characters
> > > > properly would be needed to avoid putting characters in the wrong screen
> > > > position. (Does anyone know of such a beast?)
> > >
> > > I think native curses on Solaris can. I'll check.
> >
> > I would also be interested to hear what other terminal environments can
> > understand UTF8. So far I only know about the Linux console.
>
> Perhaps Digital UNIX. I'll check... if you can tell me what to check.
> Which man page, or whatever.
Guessing where which vendor will hide such information goes beyond my
abilities. But `man curses' should be a good starting point...
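To illustrate the screen position problem with raw UTF8 quoted above: a library that advances the cursor by one column per byte will be off by one for every extra byte in a multibyte sequence. A minimal sketch, with `utf8_chars` being a name made up here, that counts characters by skipping continuation bytes (those of the form 10xxxxxx). It assumes well-formed input and ignores combining and double-width characters.

```c
#include <stddef.h>

/* Count UTF-8 encoded characters in a NUL-terminated string.
   Every byte that is not a continuation byte (10xxxxxx) starts a
   new character. Assumes well-formed UTF-8. */
static size_t utf8_chars(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For the string `"Dra\xC5\xBE" "en"` (the literal is split so that the hex escape does not swallow the following `e`), `strlen` gives 7 bytes while `utf8_chars` gives 6 characters, and that one-byte difference is what ends up as a misplaced column on the screen.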
Klaus