help-gnu-emacs

Re: garbage chars when pasting French chars into emacs


From: ken
Subject: Re: garbage chars when pasting French chars into emacs
Date: Wed, 01 Feb 2012 21:39:22 -0500
User-agent: Thunderbird 2.0.0.24 (X11/20111109)


On 02/01/2012 04:23 PM Eli Zaretskii wrote:
Date: Wed, 01 Feb 2012 15:41:42 -0500
From: ken <gebser@mousecar.com>

For completeness, I'll state at the outset that I'm running emacs on Linux (CentOS 5.7). From a shell I get this:

$ set|grep -i lang
LANG=en_US.UTF-8

Now I pull up a webpage with some French on it: <http://www.wikilivres.info/wiki/Maurice_Merleau-Ponty>. Examining the source code of this page, I see at the top:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

So this page is presented in UTF-8.

Firefox is also set to present pages in UTF-8: View -> Character Encoding -> UTF-8

But when I copy and paste the text from "Francais" to "invisible, 1964)" inclusive, many of the characters aren't rendered correctly; I get "garbage" characters in their stead, e.g., the second-to-last line appears something like this:

     * L^[$(B!G^[$(C)+^[(Bil et l^[$(B!G^[(Besprit, Gallimard, 1960

Other lines are improperly rendered also.

I'd like to fix this. And if possible understand why this doesn't work, so I might be able to diagnose these problems for myself.
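
A note on the "why", offered as a guess rather than a diagnosis: the garbage in the sample line looks like ISO-2022 charset designations (the kind used by X compound text) that were pasted in without being decoded. The sketch below assumes each "^[" is a literal ESC character and feeds the sample through `iso-2022-7bit` to see whether it turns back into readable text. The string is retyped from the line above, so it may not match the bytes of the original paste exactly.

```elisp
;; Sketch only.  Assumption: "^[" in the pasted line is ESC (char 27),
;; and "$(B", "$(C" and "(B" are ISO-2022 charset designations.
(let ((raw "L\e$(B!G\e$(C)+\e(Bil et l\e$(B!G\e(Besprit, Gallimard, 1960"))
  ;; `iso-2022-7bit' is Emacs's general 7-bit ISO-2022 coding system.
  ;; If the guess is right, the escape sequences disappear and the
  ;; accented characters and quotes show up in their place.
  (message "%s" (decode-coding-string raw 'iso-2022-7bit)))
```

Evaluate it in the *scratch* buffer with C-x C-e after the closing parenthesis. If the result reads cleanly, the selection most likely arrived as compound text while emacs was told to decode it as something else.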

> What is your value of selection-coding-system?  Try setting it to
> something like ctext-with-extensions.

Thanks, Eli,

Immediately prior to doing the copy-and-paste I ran all of these:

(set-language-environment               'UTF-8)
(set-default-coding-systems             'utf-8)
(setq file-name-coding-system           'utf-8)
(setq default-buffer-file-coding-system 'utf-8)
(setq coding-system-for-write           'utf-8)
(set-keyboard-coding-system             'utf-8)
(set-terminal-coding-system             'utf-8)
(set-clipboard-coding-system            'utf-8)
(set-selection-coding-system            'utf-8)
(prefer-coding-system                   'utf-8)
(modify-coding-system-alist 'process "\\*shell\\*\\'" 'utf-8-unix)

Following your advice, I ran

(set-selection-coding-system 'ctext-with-extensions)

and then did the same copy-and-paste again. This got more of the characters correct, but not all of them. So we're a lot closer.... Got another suggestion?
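
For anyone following along, here is a small sketch for checking what emacs is actually using for selections, plus one thing that might be worth trying. It assumes an X build of emacs recent enough to have `x-select-request-type`; the `boundp` guards are there because that may not hold on every version. Asking for UTF8_STRING first is an assumption about what could help here, not a confirmed fix for this case.

```elisp
;; Sketch only: inspect the selection-related settings.
(message "selection-coding-system: %S"
         (and (boundp 'selection-coding-system) selection-coding-system))
(message "x-select-request-type:   %S"
         (and (boundp 'x-select-request-type) x-select-request-type))

;; Assumption: requesting UTF8_STRING before COMPOUND_TEXT lets the
;; browser hand over UTF-8 directly, so no compound-text decoding is
;; needed.  The targets in the list are tried in order.
(when (boundp 'x-select-request-type)
  (setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING)))
```

After evaluating this, copy the text from Firefox again and paste it into a fresh buffer, then compare the result with the earlier attempts.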




