Re: character sets as they relate to “Raw” string literals for elisp

emacs-devel

From:	Daniel Brooks
Subject:	Re: character sets as they relate to “Raw” string literals for elisp
Date:	Tue, 05 Oct 2021 15:13:20 -0700
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Richard Stallman <rms@gnu.org> writes:

> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>   > Suppose this hypothetical contribution were a language mode for a
>   > Japanese programming language, and thus had the same support profile?
>
> I have to guess what a "Japanese programming language" would mean, but
> I think you're talking about a mode for editing programs written in a
> language whose symbols are meaningful in Japanese and perhaps written
> in kana and kanji.

Correct. The idea is that this hypothetical Emacs feature would be
useful primarily to people who could already read and write Japanese,
and who thus would not be inconvenienced because the software was also
written in Japanese.

> We could conceivably add such a program to Emacs, but should we?  I
> think it is not worth the trouble; I'd say, let's not.
>
> You can write and distribute the program, and people could run it.
> But we should not distribute programs we can't read.

Fair enough; thanks for answering the question!

>   > I think that if I read between the lines, you are saying that the Emacs
>   > project _could_ grow to become multi–lingual at all levels, with a
>   > sufficient number of invested contributors who could each review and
>   > maintain different parts of the code.
>
> It would be an enormous effort -- just consider translating the
> manuals.  And updating the translations for each Emacs version. It
> would be a big burden.

Yes, that’s certainly true; the cost of getting complete parity between
English and a second language would be significant. However, I don’t
think that the ongoing costs would be insurmountable, assuming the
project attracted additional trusted and proven maintainers along with
each additional language. A few docstrings and manual pages get changed
in most versions, but not enough to make it impossible to keep up.

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Daniel Brooks <db48x@db48x.net>
>> Cc: emacs-devel@gnu.org
>> Date: Mon, 04 Oct 2021 13:49:53 -0700
>>
>> Would there be any reason to turn away that contribution, or to make the
>> contributor rewrite it?
>
> I'm sorry, this is too abstract and theoretical issue, with many
> important details missing.  So I don't think it will be useful to
> seriously consider such a theoretical example.

That, however, is not a useful answer. :)

What assumptions would you need to make before you could answer yes?

Note that this is a purely hypothetical situation; aside from a
smattering of Latin and Greek that are useful for English etymology, I
cannot read or write any other languages. I don’t have a pile of code
written in Japanese that I’m going to spring on you if you find a way to
say yes. Instead I am looking ahead and wondering what the conditions
would have to be like 20 years from now for non–English code to start
showing up.

> It turns out there are more exceptions than we imagine.  We just now
> had another bug report, this time about Kitty terminal emulator, which
> has yet another set of issues with displaying non-ASCII characters
> from Emacs.  So much so that I was prompted to add an entry in
> etc/PROBLEMS with some workarounds for users of Kitty.  Granted, their
> problems are not that they don't support recently added Unicode
> characters, it's that they support them "too well".  But still, it
> doesn't help when the result is a messed-up display.
>
> Unicode is not a static target, it's a moving one.  They issue a new
> version of the standard twice a year, and each new version adds new
> codepoints with new attributes.  If a new version of Unicode adds
> double-width characters, and some terminal emulator doesn't keep up,
> you will have problems displaying those new codepoints.  (AFAIK,
> that's in essence the problem with the Linux console: they last
> updated when Unicode 5.0 was released.)

That’s an interesting point. On the one hand, the fact that the Linux
console is still using Unicode 5.0 shows just how unmaintained it is
(Unicode 5.0 was released in July 2006; the next Emacs release was 22.1,
in 2007). On the other hand, perhaps if problems like this keep
cropping up we will have to add encodings for older Unicode versions.
People using the Linux
console could set their terminal encoding to
'utf-8-unicode5.0. Characters added after that would show up escaped,
and Emacs would know what width the terminal was going to use for each
character.
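
To make the idea concrete, here is a minimal sketch in Python (not Emacs
Lisp, and not anything Emacs actually implements). It escapes every
character that is newer than a chosen Unicode version. The names
SAMPLE_AGES, age_of and escape_newer_than, and the \u{...} escape
syntax, are assumptions made up for illustration. The age table is a
tiny hand-written sample; a real coding system would need the full
DerivedAge data from the Unicode Character Database, and would also
have to track character widths, which this sketch does not model.

```python
# Tiny illustrative sample of "age" data: the Unicode version in which
# each range of code points was first assigned. A real implementation
# would load the complete DerivedAge.txt file instead.
SAMPLE_AGES = [
    # (first code point, last code point, version that added the range)
    (0x0000, 0x007F, (1, 1)),    # ASCII
    (0x3041, 0x3094, (1, 1)),    # most of the hiragana block
    (0x20B9, 0x20B9, (6, 0)),    # INDIAN RUPEE SIGN
    (0x1F600, 0x1F600, (6, 1)),  # GRINNING FACE
]


def age_of(ch):
    """Return the (major, minor) version that added CH, or None if unknown."""
    cp = ord(ch)
    for first, last, version in SAMPLE_AGES:
        if first <= cp <= last:
            return version
    return None


def escape_newer_than(text, max_version=(5, 0)):
    """Escape every character of TEXT that is newer than MAX_VERSION.

    Characters missing from the sample table are escaped as well, which
    is the conservative choice for a terminal that may not know them.
    """
    out = []
    for ch in text:
        version = age_of(ch)
        if version is not None and version <= max_version:
            out.append(ch)
        else:
            # Hypothetical escape syntax, chosen only for this sketch.
            out.append("\\u{%X}" % ord(ch))
    return "".join(out)
```

With the default limit of 5.0, ASCII and hiragana pass through unchanged,
while the rupee sign (added in 6.0) and the grinning face (added in 6.1)
come out as \u{20B9} and \u{1F600}.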

> So it might be possible to say that many terminals support substantial
> portions of Unicode, but it definitely is NOT right to say that we can
> freely use any character we want and think they will work everywhere.

So one assumption that you might make is that new source code being
added to Emacs must use characters from a version of Unicode which is
known to have wide compatibility, rather than immediately jumping to the
bleeding–edge version? That would be perfectly reasonable.

db48x
