help-gnu-emacs

Re: Encoding/decoding problems


From: Deniz Dogan
Subject: Re: Encoding/decoding problems
Date: Thu, 28 Jul 2011 11:26:53 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20110624 Thunderbird/5.0

On 2011-07-28 11:01, Eli Zaretskii wrote:
>> Date: Thu, 28 Jul 2011 10:18:04 +0200
>> From: Deniz Dogan <deniz@dogan.se>
>>
>> I'm fetching an XML document that uses iso-8859-1 coding with
>> `url-retrieve' and then I parse it using `xml-parse-region'.
>>
>> After that, I get the parts of the document that I want and insert them
>> into a buffer.  However, the Swedish characters å, ä and ö are displayed
>> as \345, \344 and \326 respectively.
>>
>> I've tried messing around with `encode-coding-region' and
>> `decode-coding-region' but I'm really not sure what to do here.
>
> I suggest to start with describing a reproducible recipe for this
> problem.  Not sure if this forum is appropriate, perhaps emacs-devel
> is a better place (as it sounds like you are describing a bug).


Here is the code to reproduce it:

(defun fetch-and-show ()
  (interactive)
  (let* ((old-buffer (current-buffer))
         (url "http://dogan.se/sites/default/files/example.xml")
         (buffer (url-retrieve-synchronously url)))
    (with-current-buffer buffer
      (let ((doc (car (xml-parse-region (point-min) (point-max)))))
        (with-current-buffer old-buffer
          (insert
           (nth 2 (nth 2 (nth 3 doc)))))))))

The XML file is encoded in iso-8859-1 with a bunch of Swedish characters here and there. The buffer I'm testing this with is *scratch* with utf-8-unix. It should insert "hallå" but inserts "hall\345".

I have no idea whether I should use `encode-coding-string' or `decode-coding-string' or something else. I doubt it's a bug, to be honest; it's probably my lack of understanding that's causing this.

Deniz
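
The question above is whether the extracted text needs encoding or decoding. As a minimal sketch, and assuming the retrieval buffer holds the raw, undecoded bytes of the response, the string pulled out of the parse tree would need to be decoded from iso-8859-1 before insertion. The helper name `my-decode-latin-1' is hypothetical and not part of the original recipe, and the byte string is simulated rather than fetched from the server, so this only illustrates the direction of the conversion:

```elisp
;; Hypothetical helper (not from the original post): turn a string of
;; raw iso-8859-1 bytes into a string of Emacs characters.
(defun my-decode-latin-1 (raw)
  "Decode RAW, a string of iso-8859-1 bytes, into Emacs characters."
  (decode-coding-string raw 'iso-8859-1))

;; Simulate what the parser might hand back from an undecoded buffer:
;; the bytes of "hall" followed by #xE5, which is å in iso-8859-1.
(let* ((raw (unibyte-string ?h ?a ?l ?l #xE5))
       (text (my-decode-latin-1 raw)))
  ;; RAW is a unibyte string; TEXT is a multibyte string that can be
  ;; inserted into a utf-8 buffer without showing an octal escape.
  (message "raw multibyte: %s, decoded multibyte: %s, decoded: %s"
           (multibyte-string-p raw)
           (multibyte-string-p text)
           text))
```

In `fetch-and-show', this would amount to wrapping the `(nth 2 (nth 2 (nth 3 doc)))' expression in the helper before calling `insert'. The coding system is hard-coded here as an assumption; a more careful version would take it from the XML declaration or the Content-Type header.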


