Re: Cocoa emacs renders Unicode combining diacritics improperly


From: Peter Dyballa
Subject: Re: Cocoa emacs renders Unicode combining diacritics improperly
Date: Tue, 17 Jul 2012 23:15:20 +0200

On 17.07.2012 at 14:52, Dan Maftei wrote:

> 
> Here's how to make ñ compositionally:
> 
> n C-x 8 <RET> 0303 <RET>

I do this much more simply: ~n. On my German keyboard, ~ is a combining (dead) 
key. The same is true for ´, `, ^, ¨.
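To illustrate the difference between the two input routes, here is a minimal Python sketch (Python and its standard unicodedata module are used here only for illustration; the variable names are made up). It assumes that the dead-key route yields the single precomposed code point U+00F1, while C-x 8 <RET> 0303 <RET> after n leaves two code points in the buffer:

```python
import unicodedata

precomposed = "\u00f1"   # LATIN SMALL LETTER N WITH TILDE, one code point
decomposed = "n\u0303"   # 'n' followed by COMBINING TILDE, two code points

# Compared code point by code point, the two strings are different ...
same_raw = precomposed == decomposed

# ... but Unicode normalisation maps each form onto the other.
nfc = unicodedata.normalize("NFC", decomposed)   # compose
nfd = unicodedata.normalize("NFD", precomposed)  # decompose

print(same_raw, len(precomposed), len(decomposed))
print(nfc == precomposed, nfd == decomposed)
```

Both forms are canonically equivalent, so how well the second one is displayed depends on how the renderer and the font handle the combining mark.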

> 
> Could you run describe-char on a compositional character and post the
> results? I want to see how it differs from my output. (Presuming, of
> course, that your emacs renders them correctly :-)

This is from the NS variant of GNU Emacs 23.4:

                character: ñ (241, #o361, #xf1)
        preferred charset: iso-8859-1 (Latin-1 (ISO/IEC 8859-1))
               code point: 0xF1
                   syntax: w    which means: word
                 category: .:Base, j:Japanese, l:Latin
              buffer code: #xC3 #xB1
                file code: #xC3 #xB1 (encoded by coding system utf-8-unix)
                  display: by this font (glyph code)
            nil:-apple-Lucida_Sans_Typewriter-medium-normal-normal-*-9-*-*-*-m-0-iso10646-1 (#x78)
        
        Character code properties: customize what to show
          name: LATIN SMALL LETTER N WITH TILDE
          general-category: Ll (Letter, Lowercase)
          canonical-combining-class: 0 (Spacing, split, enclosing, reordrant, 
and Tibetan subjoined)
          decomposition: (110 771) ('n' '̃')
        
        There are text properties here:
          fontified            t

and this is from the NS variant of GNU Emacs 24.1:

                    character: ñ (displayed as ñ) (codepoint 241, #o361, #xf1)
            preferred charset: iso-8859-1 (Latin-1 (ISO/IEC 8859-1))
        code point in charset: 0xF1
                       syntax: w        which means: word
                     category: .:Base, L:Left-to-right (strong), j:Japanese, 
l:Latin
                  buffer code: #xC3 #xB1
                    file code: #xC3 #xB1 (encoded by coding system utf-8-unix)
                      display: by this font (glyph code)
            nil:-apple-Menlo-medium-normal-normal-*-9-*-*-*-m-0-iso10646-1 (#xB3)
        
        Character code properties: customize what to show
          name: LATIN SMALL LETTER N WITH TILDE
          general-category: Ll (Letter, Lowercase)
          canonical-combining-class: 0 (Spacing, split, enclosing, reordrant, 
and Tibetan subjoined)
          decomposition: (110 771) ('n' '̃')
        
        There are text properties here:
          fontified            t

Note how the "character:" lines and the font descriptions differ between the 
two versions.
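The character properties that describe-char reports come from the Unicode character database, so they can be cross-checked outside Emacs. A small sketch, assuming Python's unicodedata module as the reference (the dictionary keys simply mirror the labels in the output above):

```python
import unicodedata

ch = "\u00f1"
cp = ord(ch)

info = {
    # decimal, octal and hexadecimal forms, as on the "character:" line
    "codepoint": (cp, oct(cp), hex(cp)),
    "name": unicodedata.name(ch),
    "general-category": unicodedata.category(ch),
    "canonical-combining-class": unicodedata.combining(ch),
    # the database stores the decomposition as hex strings; convert to integers
    "decomposition": [int(h, 16) for h in unicodedata.decomposition(ch).split()],
}

for key, value in info.items():
    print(f"{key}: {value}")
```

The decomposition list corresponds to the (110 771) pair shown by both Emacs versions.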


This comes from the "AppKit Emacs":

                    character: ñ (displayed as ñ) (codepoint 241, #o361, #xf1)
            preferred charset: iso-8859-1 (Latin-1 (ISO/IEC 8859-1))
        code point in charset: 0xF1
                       syntax: w        which means: word
                     category: .:Base, L:Left-to-right (strong), j:Japanese, 
l:Latin
                  buffer code: #xC3 #xB1
                    file code: #x6E #xCC #x83 (encoded by coding system utf-8-hfs-unix)
                      display: by this font (glyph code)
            mac-ct:-*-Monaco-normal-normal-normal-*-10-*-*-*-m-0-iso10646-1 (#x78)
        
        Character code properties: customize what to show
          name: LATIN SMALL LETTER N WITH TILDE
          general-category: Ll (Letter, Lowercase)
          canonical-combining-class: 0 (Spacing, split, enclosing, reordrant, 
and Tibetan subjoined)
          decomposition: (110 771) ('n' '̃')
        
        There are text properties here:
          fontified            t

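You can see that the two 24.1 versions use different coding systems: 
utf-8-unix in the NS build and utf-8-hfs-unix in the "AppKit Emacs", whose 
file code is the decomposed byte sequence #x6E #xCC #x83.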
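The two "file code:" values can be reproduced with a short Python sketch. It rests on an assumption: that utf-8-hfs behaves like canonical decomposition (NFD) followed by UTF-8 encoding. The real HFS+ rules differ from plain NFD for some characters, so treat this as an approximation; file_bytes and hex_codes are hypothetical helper names.

```python
import unicodedata

def file_bytes(text, decompose):
    """Encode text as UTF-8, optionally decomposing it (NFD) first."""
    if decompose:
        text = unicodedata.normalize("NFD", text)
    return text.encode("utf-8")

def hex_codes(data):
    """Format bytes in the #xNN style used by describe-char."""
    return " ".join(f"#x{b:02X}" for b in data)

composed_bytes = file_bytes("\u00f1", decompose=False)
decomposed_bytes = file_bytes("\u00f1", decompose=True)

print("plain utf-8:  ", hex_codes(composed_bytes))
print("decomposed:   ", hex_codes(decomposed_bytes))
```

The same buffer character therefore ends up as two bytes in one case and three in the other.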


> 
> Thanks for the patches. I've applied them to the 24.1.1 source but make
> segfaults when compiling profile.c. I don't have the time to fix this
> unfortunately.

I wrote "GNU Emacs 24.1", and YAMAMOTO Mitsuharu states at the top of NEWS-mac:

        * emacs-24.1-mac-3.0 (2012-06-10)
        Based on Emacs 24.1.

So using the sources for GNU Emacs 24.1.1 is not correct. Use the sources from 
the official GNU Emacs 24.1 release!

> 
> I presume you use emacs on OS X? Did you build it using this patch? Do
> compositional characters work?

Three times: yes.

> Further, if you have the time, could you build the regular source --with-ns 
> and see if they work there? Perhaps the issue is with my OS.

It works. Your mistake is that you are trying to use an Emacs input method, 
which is not necessary. Just use your keyboard and its own dead (combining) 
accents! If I try to use your input method I get:

                    character: n (displayed as n) (codepoint 110, #o156, #x6e)
            preferred charset: ascii (ASCII (ISO646 IRV))
        code point in charset: 0x6E
                       syntax: w        which means: word
                     category: .:Base, L:Left-to-right (strong), a:ASCII, 
l:Latin, r:Roman
                  buffer code: #x6E
                    file code: #x6E (encoded by coding system utf-8-unix)
                      display: composed to form "ñ" (see below)
        
        Composed with the following character(s) "̃" using this font:
          nil:-apple-Menlo-medium-normal-normal-*-9-*-*-*-m-0-iso10646-1
        by these glyphs:
          [0 1 110 81 5 0 4 5 0 nil]
          [0 1 771 648 5 0 3 1 0 [-4 0 0]]
        
        Character code properties: customize what to show
          name: LATIN SMALL LETTER N
          general-category: Ll (Letter, Lowercase)
          canonical-combining-class: 0 (Spacing, split, enclosing, reordrant, 
and Tibetan subjoined)
          decomposition: (110) ('n')
        
        There are text properties here:
          fontified            t

The combined character looks quite good with Menlo on Snow Leopard, but as 
awful as in your screenshot with Monaco (and differently awful with Lucida 
Sans Typewriter). In the "AppKit Emacs" with Monaco, the accented character 
looks exactly like the character composed with ~n and is described as:

                    character: n (displayed as n) (codepoint 110, #o156, #x6e)
            preferred charset: ascii (ASCII (ISO646 IRV))
        code point in charset: 0x6E
                       syntax: w        which means: word
                     category: .:Base, L:Left-to-right (strong), a:ASCII, 
l:Latin, r:Roman
                  buffer code: #x6E
                    file code: #x6E (encoded by coding system utf-8-hfs-unix)
                      display: composed to form "ñ" (see below)
        
        Composed with the following character(s) "̃" using this font:
          mac-ct:-*-Monaco-normal-normal-normal-*-10-*-*-*-m-0-iso10646-1
        by these glyphs:
          [0 1 110 120 6 0 6 8 0 nil]
        
        Character code properties: customize what to show
          name: LATIN SMALL LETTER N
          general-category: Ll (Letter, Lowercase)
          canonical-combining-class: 0 (Spacing, split, enclosing, reordrant, 
and Tibetan subjoined)
          decomposition: (110) ('n')
        
        There are text properties here:
          fontified            t
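To find out whether a piece of text contains such base-plus-combining sequences (as opposed to precomposed characters), one can scan for code points whose canonical combining class is non-zero. The following is only a sketch in Python; combining_sequences is a hypothetical helper and the sample string is made up:

```python
import unicodedata

def combining_sequences(text):
    """Return (index, base, marks) for each base character followed by combining marks."""
    found = []
    i = 0
    while i < len(text):
        j = i + 1
        # Collect all following code points with a non-zero combining class.
        while j < len(text) and unicodedata.combining(text[j]) != 0:
            j += 1
        if j > i + 1:
            found.append((i, text[i], text[i + 1:j]))
        i = j
    return found

# First word uses n + U+0303, second word uses the precomposed U+00F1.
sample = "man\u0303ana and ma\u00f1ana"
hits = combining_sequences(sample)

for index, base, marks in hits:
    names = ", ".join(unicodedata.name(m) for m in marks)
    print(f"index {index}: base {base!r} + {names}")
```

Only the first word is reported; the precomposed ñ in the second word is a single code point and needs no composition at display time.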


--
Greetings

  Pete

The best way to accelerate a PC is 9.8 m/s²

