help-gnu-emacs

Strange whitespaces.


From: Hongyi Zhao
Subject: Strange whitespaces.
Date: Thu, 30 Sep 2021 17:37:20 +0800

I've seen two strange whitespace characters which are shown as
underscores in the scratch buffer, and `M-x describe-char RET' gives
the following results:

The first one:

===============
           position: 146 of 148 (98%), column: 0
            character:   (displayed as  ) (codepoint 160, #o240, #xa0)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0xA0
               script: latin
               syntax:       which means: whitespace
             category: .:Base, b:Arabic, j:Japanese, l:Latin
             to input: type "C-x 8 RET a0" or "C-x 8 RET NO-BREAK SPACE"
          buffer code: #xC2 #xA0
            file code: #xC2 #xA0 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    ftcrhb:-PfEd-DejaVuSansMono Nerd Font Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#x62)
       hardcoded face: nobreak-space

Character code properties: customize what to show
  name: NO-BREAK SPACE
  old-name: NON-BREAKING SPACE
  general-category: Zs (Separator, Space)
  decomposition: (noBreak 32) (noBreak ' ')

There are text properties here:
  fontified            t
  wrap-prefix          " "
  ws-butler-chg        delete


The second:

 ===============
           position: 148 of 148 (99%), column: 2
            character:   (displayed as  ) (codepoint 8194, #o20002, #x2002)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x2002
               script: symbol
               syntax:       which means: whitespace
             category: .:Base
             to input: type "C-x 8 RET 2002" or "C-x 8 RET EN SPACE"
          buffer code: #xE2 #x80 #x82
            file code: #xE2 #x80 #x82 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    ftcrhb:-PfEd-DejaVuSansMono Nerd Font Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#x712)
       hardcoded face: nobreak-space

Character code properties: customize what to show
  name: EN SPACE
  general-category: Zs (Separator, Space)
  decomposition: (compat 32) (compat ' ')

There are text properties here:
  fontified            t
  rear-nonsticky       t
  wrap-prefix          " "
  ws-butler-chg        chg
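
The code points, UTF-8 bytes, and character properties reported in both
outputs can be cross-checked outside Emacs. The following is only a
minimal sketch in Python using the standard `unicodedata` module; the
helper name `describe` is made up for illustration and is not related
to any Emacs function.

```python
import unicodedata

def describe(ch):
    """Return a few of the fields that describe-char reports for one character."""
    return {
        "codepoint": ord(ch),
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "decomposition": unicodedata.decomposition(ch),
        # UTF-8 bytes as upper-case hex, comparable to the "file code" field.
        "utf8": ch.encode("utf-8").hex(" ").upper(),
    }

# The two characters from the outputs above: U+00A0 and U+2002.
for ch in ("\u00a0", "\u2002"):
    print(describe(ch))
```

Both characters should come out with general category Zs and a
decomposition that maps to U+0020, the ordinary space, matching the
`decomposition' lines shown by Emacs.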


If I copy and paste these two characters into other editors, say, the
Gmail web client or gedit, they are not visibly marked there. OTOH, if
I copy them back into Emacs again, in the Gmail web client case the
first character is lost.

I am puzzled by this phenomenon: Why do people design so many
whitespace representations, and how can they be safely manipulated
between different editors?
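
One possible way to keep such characters from being silently lost in
transit is to make them visible, or to replace them with plain spaces,
before pasting the text elsewhere. The sketch below is an assumption
about how that could be done in Python, not something any of the
editors mentioned here does; the function names `visible_spaces` and
`to_ascii_spaces` are hypothetical, and the second function is lossy by
design.

```python
import unicodedata

def visible_spaces(text):
    """Show every non-ASCII space separator (category Zs) as a <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(c):04X}>" if c != " " and unicodedata.category(c) == "Zs" else c
        for c in text
    )

def to_ascii_spaces(text):
    """Replace every Zs character with a plain ASCII space (lossy)."""
    return "".join(" " if unicodedata.category(c) == "Zs" else c for c in text)

# "a", NO-BREAK SPACE, "b", EN SPACE, "c"
sample = "a\u00a0b\u2002c"
print(visible_spaces(sample))
print(to_ascii_spaces(sample))
```

The first function is useful for inspecting what a paste actually
contains; the second trades the typographic distinction for text that
survives any editor.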

Regards, HZ

Attachment: whitespaces.png
Description: PNG image

