Re: desktop and encodings

help-gnu-emacs

From:	Peter Dyballa
Subject:	Re: desktop and encodings
Date:	Mon, 23 May 2005 20:01:54 +0200


On 23.05.2005 at 16:13, Mads Jensen wrote:

æøå gets turned into something like Â¥...

What you see is the 'translation' of some ISO Latin encoding into UTF-8, and then the display of these double-byte values as unibyte characters!


This could explain a bit:

;   oct   dec   hex    UCS2    UTF-8
;=====================================
  = 240 = 160 = A0 = U+00A0 = C2 A0 : NO-BREAK SPACE
Ą = 241 = 161 = A1 = U+0104 = C4 84 : LATIN CAPITAL LETTER A WITH OGONEK
ĸ = 242 = 162 = A2 = U+0138 = C4 B8 : LATIN SMALL LETTER KRA
Ŗ = 243 = 163 = A3 = U+0156 = C5 96 : LATIN CAPITAL LETTER R WITH CEDILLA
¤ = 244 = 164 = A4 = U+00A4 = C2 A4 : CURRENCY SIGN
Ĩ = 245 = 165 = A5 = U+0128 = C4 A8 : LATIN CAPITAL LETTER I WITH TILDE
Ļ = 246 = 166 = A6 = U+013B = C4 BB : LATIN CAPITAL LETTER L WITH CEDILLA
§ = 247 = 167 = A7 = U+00A7 = C2 A7 : SECTION SIGN
¨ = 250 = 168 = A8 = U+00A8 = C2 A8 : DIAERESIS
Š = 251 = 169 = A9 = U+0160 = C5 A0 : LATIN CAPITAL LETTER S WITH CARON
Ē = 252 = 170 = AA = U+0112 = C4 92 : LATIN CAPITAL LETTER E WITH MACRON
Ģ = 253 = 171 = AB = U+0122 = C4 A2 : LATIN CAPITAL LETTER G WITH CEDILLA
Ŧ = 254 = 172 = AC = U+0166 = C5 A6 : LATIN CAPITAL LETTER T WITH STROKE
­ = 255 = 173 = AD = U+00AD = C2 AD : SOFT HYPHEN
Ž = 256 = 174 = AE = U+017D = C5 BD : LATIN CAPITAL LETTER Z WITH CARON
Á = 301 = 193 = C1 = U+00C1 = C3 81 : LATIN CAPITAL LETTER A WITH ACUTE
Â = 302 = 194 = C2 = U+00C2 = C3 82 : LATIN CAPITAL LETTER A WITH CIRCUMFLEX
Ã = 303 = 195 = C3 = U+00C3 = C3 83 : LATIN CAPITAL LETTER A WITH TILDE
Ä = 304 = 196 = C4 = U+00C4 = C3 84 : LATIN CAPITAL LETTER A WITH DIAERESIS
Å = 305 = 197 = C5 = U+00C5 = C3 85 : LATIN CAPITAL LETTER A WITH RING ABOVE
Æ = 306 = 198 = C6 = U+00C6 = C3 86 : LATIN CAPITAL LETTER AE
æ = 346 = 230 = E6 = U+00E6 = C3 A6 : LATIN SMALL LETTER AE
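
If you want to check or extend such a table yourself, a row can be computed mechanically. The following is only a sketch in Python (not something Emacs does for you); the function name table_row is made up for illustration, and it assumes the byte values are meant as ISO Latin-4 (Python codec "iso8859_4"):

```python
import unicodedata

def table_row(byte_value: int, charset: str = "iso8859_4") -> str:
    """Build one row: glyph, octal, decimal, hex, code point, UTF-8 bytes, name.

    table_row is a hypothetical helper; the default charset is an assumption.
    """
    # Interpret the single byte in the given 8-bit charset.
    glyph = bytes([byte_value]).decode(charset)
    # Encode the resulting character as UTF-8 and show the bytes in hex.
    utf8 = " ".join(f"{b:02X}" for b in glyph.encode("utf-8"))
    name = unicodedata.name(glyph, "<unnamed>")
    return (f"{glyph} = {byte_value:o} = {byte_value:d} = {byte_value:02X}"
            f" = U+{ord(glyph):04X} = {utf8} : {name}")

for b in (0xA1, 0xC6, 0xE6):
    print(table_row(b))
```

Passing a different charset, for example "iso8859_1", gives the corresponding rows for ISO Latin-1.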

The first column contains the glyphs as they are; the next columns give the glyph's byte value (here in ISO Latin-4) as octal, decimal, and hexadecimal numerals. The next column, UCS2, shows the code point of that glyph in Unicode (which, I think, is also the internal representation in GNU Emacs). The last column shows the bytes into which the glyphs from column 1 are translated in UTF-8. As you can see, the UTF-8 bytes can themselves be 'seen' as 'normal' characters: a UTF-8 encoded æ (C3 A6) is just 'ÃĻ' if displayed in ISO Latin-4, 'Ã¦' in ISO Latin-1 ...
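
The effect is easy to reproduce outside Emacs. A small sketch in Python, assuming its standard codecs "iso8859_1" and "iso8859_4" stand in for ISO Latin-1 and ISO Latin-4 (the variable names are only for illustration):

```python
# Encode one character as UTF-8, then misread the two bytes one at a time.
utf8_bytes = "æ".encode("utf-8")

as_latin1 = utf8_bytes.decode("iso8859_1")  # each byte read as a Latin-1 character
as_latin4 = utf8_bytes.decode("iso8859_4")  # each byte read as a Latin-4 character

print(utf8_bytes.hex(" ").upper())  # the raw bytes in hex
print(as_latin1)
print(as_latin4)
```

One character goes in and two come out, because the two UTF-8 bytes are each treated as a character of their own.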

So, to conclude: your Emacs evidently saves your input as UTF-8, and you have to make the buffer display as UTF-8 too! The correct header line would look like


        ;;; -*- mode: Text; coding: utf-8; -*-

Once you have the file opened in the wrong encoding, you can change that with revert-buffer-with-coding-system, C-x RET r utf-8 RET.
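
What that command does, conceptually, is read the same bytes again with the right decoder. As a rough illustration only (this is Python, not what Emacs runs internally; redecode is a made-up name, and it assumes the text was wrongly read as ISO Latin-1):

```python
def redecode(garbled: str, wrong: str = "iso8859_1", right: str = "utf-8") -> str:
    """Undo a wrong decoding: recover the original bytes, then decode them properly.

    Hypothetical helper; only works if the wrong decoding lost no bytes.
    """
    original_bytes = garbled.encode(wrong)  # back to the bytes that are in the file
    return original_bytes.decode(right)     # read them as what they really are

print(redecode("Ã¦Ã¸Ã¥"))
```

This only works while the original bytes are still intact. If the garbled text has already been saved again in yet another encoding, the repair may no longer be possible this way.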


Have you thought of

(prefer-coding-system 'utf-8-unix)

It might cure a lot. There is also (set-language-environment 'Danish) ...



--
With peaceful regards

  Pete

In a world without walls and fences, who needs gates and windows?
