help-gnu-emacs

Re: How to get the script name symbols of a specific character?


From: Eli Zaretskii
Subject: Re: How to get the script name symbols of a specific character?
Date: Mon, 11 Feb 2013 22:08:56 +0200

> From: Jambunathan K <kjambunathan@gmail.com>
> Date: Tue, 12 Feb 2013 01:27:28 +0530
> Cc: help-gnu-emacs@gnu.org
> 
> YE Qianchuan <stool.ye@gmail.com> writes:
> 
> > On 02/11/2013 07:34 PM, Jambunathan K wrote:
> >> Put your cursor on the box and type
> >>          C-u C-x =
> > In fact, it's the same as `describe-char'. This command invokes
> > `what-cursor-position', which invokes `describe-char' eventually.
> >>
> >> It will give more useful pointers.  The codepoint of a particular
> >> character.  The name of the character, in the example below is prefixed
> >> by the script it comes from etc.
> > Cool, I didn't notice its name may be prefixed by its script. It does
> > make a lot of sense.
> >
> > However sadly, not all characters do so. For example, a CJK character
> > has prefix CJK.
> > But cjk is not a script name (though there's a script called cjk-misc)
> > and it should belong to `han'.
> >
> > What's worse, some characters don't show their names at all, even
> > if I assign a font to them.
> >
> > For example:
> >              position: 806 of 1031 (78%), column: 1
> >             character: 😀 (displayed as 😀) (codepoint 128512, #o373000, #x1f600)
> >     preferred charset: unicode (Unicode (ISO10646))
> > code point in charset: 0x1F600
> >                syntax: w     which means: word
> >              category: L:Left-to-right (strong)
> >           buffer code: #xF0 #x9F #x98 #x80
> >             file code: #xF0 #x9F #x98 #x80 (encoded by coding system utf-8-unix)
> >               display: no font available
> >
> > Character code properties: customize what to show
> >   general-category: Cn (Other, Not Assigned)
> >   decomposition: (128512) ('😀')
> 
> This is what I get.  Emacs reports that it is a GRINNING FACE.  
> 
> I run Emacs from trunk though.  I am not sure this makes any actual
> difference.

The names come from the Unicode character database (UCD) that is
processed into a bunch of Emacs Lisp files and then preloaded into
Emacs.  The version of the Unicode database built into Emacs
determines which codepoints have names and which don't.
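
For illustration, here is a minimal sketch of querying those preloaded
tables from Lisp.  `get-char-code-property' and `char-script-table'
are existing Emacs facilities; the helper name `my-char-ucd-summary'
is made up for this example.  What it returns for a given codepoint
depends on the Unicode version your Emacs was built with, so on an
older build the name of U+1F600 may simply be nil and its general
category `Cn', as in the output quoted above.

```elisp
;; Hypothetical helper, not part of Emacs: collect what the built-in
;; Unicode tables say about the character CH.
(defun my-char-ucd-summary (ch)
  "Return a list (NAME GENERAL-CATEGORY SCRIPT) for character CH.
NAME is nil when the UCD built into this Emacs has no name for CH."
  (list
   ;; Character name from the UCD, or nil if unknown to this Emacs.
   (get-char-code-property ch 'name)
   ;; General category symbol; `Cn' means "not assigned".
   (get-char-code-property ch 'general-category)
   ;; Script symbol Emacs associates with CH, or nil if none.
   (aref char-script-table ch)))

;; Evaluate with M-: or in *scratch*; the result for U+1F600 varies
;; with the Unicode version built into the running Emacs.
(my-char-ucd-summary #x1F600)
```

Evaluating the same form in an old and a new Emacs is a quick way to
see whether a missing name is a matter of the built-in UCD version.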

> I think it would be useful to let one browse different Unicode Blocks
> or have C-u C-x = report the block name of a character.

If that data is not in the UCD, Emacs cannot know it, unless someone
adds it to Emacs.



