help-gnu-emacs

Re: Unicode fonts - Re: Why do I find ^L in elisp code?


From: Jean Louis
Subject: Re: Unicode fonts - Re: Why do I find ^L in elisp code?
Date: Mon, 24 May 2021 23:19:27 +0300
User-agent: Mutt/2.0.6 (2021-03-06)

* Eli Zaretskii <eliz@gnu.org> [2021-05-24 21:13]:
> > Date: Mon, 24 May 2021 21:05:28 +0300
> > From: Jean Louis <bugs@gnu.support>
> > Cc: help-gnu-emacs@gnu.org
> > 
> > > Would you also like "реасе" to be supported by English screen
> > > readers, for example?
> > 
> > Definitely, just that I don't understand the meaning of your
> > question. Do you mean that piece and peace would be spoken same?
> 
> Look closer: that word I wrote is not "peace".

(•◡•) Good trick to demonstrate your point. That style is already used
extensively on social media, where letters that do not belong where
they appear are used for expression. That is true, and IMHO it is up
to artificial intelligence to try to decipher that. And it is
possible.
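
A quick way to see the trick is to ask Unicode for the name of each
character. What follows is only a sketch in Python with the standard
unicodedata module. The lookalike word is spelled with escapes on the
assumption that it consists entirely of Cyrillic letters (U+0440,
U+0435, U+0430, U+0441, U+0435); the names `lookalike`, `plain` and
`char_names` are made up for the illustration.

```python
import unicodedata

# Assumption: the lookalike word is made entirely of Cyrillic letters.
lookalike = "\u0440\u0435\u0430\u0441\u0435"
plain = "peace"


def char_names(text):
    """Return the Unicode name of every character in text."""
    return [unicodedata.name(ch) for ch in text]


if __name__ == "__main__":
    # The two strings look alike on screen but do not compare equal.
    print(lookalike == plain)
    for name in char_names(lookalike):
        print(name)
```

The comparison is false and every name starts with CYRILLIC, which is
all a screen reader gets to work with unless it is taught otherwise.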

There is Mozilla's Common Voice project, where people donate their
voices for voice recognition: https://voice.mozilla.org. People speak
and listen, and tell the computer what the voice means.

By using that same principle people may provide submissions, and even
реасе may be interpreted as "peace" in English if it is in an English
context. Google does a similar thing on https://translate.google.com
where it asks users to correct translations. 𝗛𝗲𝗹𝗹𝗼 𝘁𝗵𝗲𝗿𝗲

𝗚𝗼𝗼𝗱 𝗲𝘅𝗮𝗺𝗽𝗹𝗲:
https://translate.google.com/?sl=auto&tl=it&text=%F0%9D%97%9B%F0%9D%97%B2%F0%9D%97%B9%F0%9D%97%B9%F0%9D%97%BC%20%F0%9D%98%81%F0%9D%97%B5%F0%9D%97%B2%F0%9D%97%BF%F0%9D%97%B2&op=translate

In that example one can see that Google's artificial intelligence
recognizes 𝗛𝗲𝗹𝗹𝗼 𝘁𝗵𝗲𝗿𝗲 as English. Click on the speech icon and it will
speak the English perfectly; on the Italian side, 𝗵𝗲𝗹𝗹𝗼 𝘁𝗵𝗲𝗿𝗲 will be
spoken as English with an Italian accent. The fact is, Google's
artificial intelligence does recognize Mathematical Sans-Serif Bold.
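
I do not know how Google does it internally, but for this particular
block of characters no machine learning is strictly needed. The
Mathematical Sans-Serif Bold letters carry a compatibility mapping to
the plain Latin letters, so Unicode NFKC normalization folds them
back. A minimal sketch in Python; the function name `fold_styled` is
made up for the illustration.

```python
import unicodedata

# "Hello there" in Mathematical Sans-Serif Bold, written with escapes
# so the example does not depend on the font or editor in use.
styled = ("\U0001D5DB\U0001D5F2\U0001D5F9\U0001D5F9\U0001D5FC"
          " \U0001D601\U0001D5F5\U0001D5F2\U0001D5FF\U0001D5F2")


def fold_styled(text):
    """Apply NFKC, which replaces compatibility characters such as the
    mathematical alphabets with their plain equivalents."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    print(fold_styled(styled))
```

Note that NFKC does nothing for Cyrillic lookalikes, as those are
ordinary letters of another alphabet and not styled variants.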

There could be a more global, free software licensed artificial
intelligence that collects the meanings from people, whatever they
may be.

> > > You are judging characters by their appearance, which is incorrect.
> > 
> > Yes, surely I understand it may be technically incorrect, though
> > humanely it gives a style even in those cases where text style cannot
> > be otherwise assigned.
> 
> No, that's a slippery slope towards the so-called "confusables", see
> 
>   https://websec.github.io/unicode-security-guide/visual-spoofing/

I understand your rejection as a programmer of Emacs, and that is fine
in that context. On the other side, though, Emacs is used by thousands
of artists who express themselves beyond technicalities. Programmers
of new software have to be aware of new developments and thus take
Unicode symbols into account.

One may call some of those "confusables", but the real problem is in
Unicode's fundamental design of those characters, or in their lack of
attributes. If 𝗔 is not A technically, to a human it is still "A"; it
may be displayed slightly differently, but it remains the letter A.

Now if Unicode assigned some attributes or additional type meanings to
it, programs would more easily get information on how to treat it.
Today that is possible only on a higher level, by telling the program
how to treat a character, but Unicode could inject the type into the
character itself on a fundamental level.

The type could tell that a character is readable, or not readable, or
similar to other characters, and so on, and programs could interpret
it correctly.
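
To see how much of that is already available to a program, one can
query the Unicode Character Database. The sketch below uses Python's
unicodedata module; the helper name `describe` is made up for the
illustration. For 𝗔 the database does record a link to plain A, while
for a Cyrillic lookalike it records none, so the "similar to other
characters" part would have to come from somewhere else.

```python
import unicodedata


def describe(ch):
    """Collect a few per-character attributes from the Unicode database."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        # Empty string when the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(ch),
    }


if __name__ == "__main__":
    # U+1D5D4 is the Mathematical Sans-Serif Bold capital A,
    # U+0430 is the Cyrillic small letter a.
    for ch in ("\U0001D5D4", "\u0430"):
        print(describe(ch))
```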

That would be a fundamental solution to the problem, including the
"confusables", as the type would be fundamental and downloaded from a
central place like Unicode. Programs would just need to read the type
and tell the user that pаураl.com is not equal to paypal.com. A
mechanism for that already exists: the user's preferred language when
browsing. It does not apply to internationalized domain names, but it
is up to browser authors to harmonize that.

If the user wishes to read English, then it is up to the browser to
say "No, this domain `pаураl.com' has some Cyrillic characters, do you
really wish to proceed?"
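
Such a check can be sketched in a few lines. Python's standard library
has no script property, so the sketch below takes the first word of
each character's Unicode name as a rough stand-in for its script; that
is a heuristic of mine, not how any real browser decides. The names
`scripts_used` and `looks_suspicious` are made up, and the spoofed
domain is written with escapes on the assumption that its middle
letters are Cyrillic а, у, р, а.

```python
import unicodedata


def scripts_used(text):
    """Return the set of scripts seen in text, guessed from the first
    word of each letter's Unicode name (a heuristic, not a real
    script lookup)."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_suspicious(domain, expected="LATIN"):
    """True when the domain has letters outside the expected script."""
    return bool(scripts_used(domain) - {expected})


if __name__ == "__main__":
    # Assumption: Latin p and l around Cyrillic lookalikes.
    spoof = "p\u0430\u0443\u0440\u0430l.com"
    for domain in (spoof, "paypal.com"):
        if looks_suspicious(domain):
            print(domain, "has letters from", sorted(scripts_used(domain)))
        else:
            print(domain, "looks fine")
```

The `expected` parameter is where the user's preferred language would
come in: a reader of Russian would pass "CYRILLIC" instead.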

Computers need teaching. 

A browser could be instructed to watch out for the alphabet that the
user uses. A screen reader could be instructed by an attribute of 𝗔
that it also represents the letter A, and so read it properly, instead
of reading only the Latin alphabet. Emacs has text properties that can
keep such data, and as such could be an exemplary way of presenting
such text or set of characters.



Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

Sign an open letter in support of Richard M. Stallman
https://stallmansupport.org/


