gzz-dev

[Gzz] Literal names in the structure


From: Rauli Ruohonen
Subject: [Gzz] Literal names in the structure
Date: Thu, 27 Feb 2003 11:28:11 +0200 (EET)

I looked at the recent RDF stuff, and while I think that it's the right
direction to go in (it even feels almost like high time to do that ;-),
there's one thing that bugs me about it: literal strings in places
where they don't belong. (Or maybe I'm mistaken, I did look at it
rather superficially.. yet, most systems still get this wrong.)

Anything that doesn't come with a content-type, that can't be replaced
e.g. with a silly animated png of a bear or something, should be a
randomly generated bit string, or otherwise complete gibberish to a human.
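
As a rough illustration of the "randomly generated bit string" idea, here
is a minimal Python sketch. The function name new_opaque_id, the 160-bit
default and the urn:x-example: prefix are all made up for this example;
nothing here is taken from the Gzz code.

```python
import secrets


def new_opaque_id(nbits: int = 160) -> str:
    """Return a random identifier that carries no human-readable meaning."""
    # "urn:x-example:" is a hypothetical prefix used only for illustration.
    # secrets.token_hex(n) returns 2*n hex digits drawn from n random bytes.
    return "urn:x-example:" + secrets.token_hex(nbits // 8)


if __name__ == "__main__":
    # Nobody can be tempted to read a name out of this, so every
    # human-visible name has to live elsewhere, with a content-type.
    print(new_opaque_id())
```

The point of the design is only that the identifier is useless as a label,
so there is no shortcut available to abuse.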

When using Unicode, people somehow feel like it's much "safer" than ASCII,
that anything can be plausibly written in it. But anything can be
plausibly written in ASCII, or encoded to 0 and 1.. Unicode is not the
be-all and end-all of character encodings; there are many characters in the
Asian languages that cannot be written in it, and in some fields it
causes a bit of a problem when people can't write the special vocabulary
of the field as they wish to. New symbols may also appear in math etc. One
should always be allowed to resort to using PNG etc., or better yet,
a scalable font using a non-Unicode character set.

BTW, there are even efforts to produce an encoding of the Chinese
characters in a more intelligent fashion, as a decomposition into parts
(which is what they are), but I don't know how far along these are. In any
case, there are people who aren't satisfied with a straightforward encoding
of characters in the normal fashion, and when thinking in label context
it's better to regard character strings as arbitrary functions from
(string, size) to bitmaps than as anything else.

In short, if you can write something somewhere, you should be able to put
anything there. The reason that everything else should be gibberish is
that there's no reason for it not to be so, except that people are lazy
and would abuse places where no intelligible text should be. That would be
unfortunate, because it would annoy the heck out of people who would like
to write something non-Unicode in there.

There's also the fact that everything (with a URI, of course, every concept
has a URI ;-) has many names, and more than one per language. Which one is
shown should be determined by the user, not hard-coded anywhere, especially
in a place that's hard to change. There are still literate people in this
world who can't even read the Roman alphabet.. (except perhaps for the
numbers, not too sure about that either) Making things inconvenient for
non-English speakers is not a good idea, IMHO, and that will happen if
English-speaking people take shortcuts others can't use.
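
To make the "many names, user decides" point a bit more concrete, here is a
minimal Python sketch. The Label class and pick_label function are
hypothetical names invented for this example, not anything in the Gzz or
RDF code; the only assumption is that each name carries its own
content-type and language tag, so a PNG is as valid a name as a UTF-8
string.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Label:
    content_type: str  # e.g. "text/plain; charset=utf-8" or "image/png"
    lang: str          # language tag such as "en", "fi", "ja"
    data: bytes        # raw content, meaningless without content_type


def pick_label(labels: Sequence[Label],
               preferred_langs: Sequence[str]) -> Optional[Label]:
    """Return the first label matching the user's language order."""
    for lang in preferred_langs:
        for label in labels:
            if label.lang == lang:
                return label
    # No preferred language available: fall back to any label at all.
    return labels[0] if labels else None


if __name__ == "__main__":
    names = [
        Label("text/plain; charset=utf-8", "en", b"bear"),
        Label("image/png", "fi", b"\x89PNG..."),  # placeholder bytes only
    ]
    # The user's preference list, not the structure, decides what is shown.
    print(pick_label(names, ["fi", "en"]).content_type)
```

The preference list lives with the user, so changing which name is shown
never requires touching the structure itself.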



