From: | Deniz Dogan |
Subject: | Re: Encoding/decoding problems |
Date: | Thu, 28 Jul 2011 11:26:53 +0200 |
User-agent: | Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20110624 Thunderbird/5.0 |
On 2011-07-28 11:01, Eli Zaretskii wrote:
> > Date: Thu, 28 Jul 2011 10:18:04 +0200
> > From: Deniz Dogan <deniz@dogan.se>
> >
> > I'm fetching an XML document that uses iso-8859-1 coding with
> > `url-retrieve' and then I parse it using `xml-parse-region'. After
> > that, I get the parts of the document that I want and insert them
> > into a buffer. However, the Swedish characters å, ä and ö are
> > displayed as \345, \344 and \326 respectively. I've tried messing
> > around with `encode-coding-region' and `decode-coding-region' but
> > I'm really not sure what to do here.
>
> I suggest to start with describing a reproducible recipe for this
> problem. Not sure if this forum is appropriate, perhaps emacs-devel
> is a better place (as it sounds like you are describing a bug).
Here is the code to reproduce it:

    (defun fetch-and-show ()
      (interactive)
      (let* ((old-buffer (current-buffer))
             (url "http://dogan.se/sites/default/files/example.xml")
             (buffer (url-retrieve-synchronously url)))
        (with-current-buffer buffer
          (let ((doc (car (xml-parse-region (point-min) (point-max)))))
            (with-current-buffer old-buffer
              (insert (nth 2 (nth 2 (nth 3 doc)))))))))

The XML file is encoded in iso-8859-1 with a bunch of Swedish characters here and there. The buffer I'm testing this with is *scratch* with utf-8-unix. It should insert "hallå" but inserts "hall\345".
I have no idea whether I should use `encode-coding-string' or `decode-coding-string' or what. I doubt it's a bug, to be honest; it's probably my lack of understanding that's causing this.
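
To make the encode/decode question concrete, here is a minimal sketch. It rests on an assumption not confirmed in this thread: that the string coming out of the retrieved buffer holds raw iso-8859-1 bytes rather than decoded characters. If so, decoding (bytes to characters) is presumably the direction needed, not encoding. The helper name `my-decode-latin-1' is hypothetical and only for illustration.

```elisp
;; Hypothetical helper, not part of the recipe above.
;; Assumes RAW is a unibyte string holding iso-8859-1 bytes.
(defun my-decode-latin-1 (raw)
  "Decode RAW, a string of iso-8859-1 bytes, into Emacs characters."
  (decode-coding-string raw 'iso-8859-1))

;; "hall\345" is a unibyte literal: four ASCII bytes plus the raw
;; byte #xE5, which is the iso-8859-1 code for the letter a-ring.
(my-decode-latin-1 "hall\345")
```

In the recipe this would mean wrapping the extracted string before the call to `insert'. Decoding the body of the retrieved buffer with `decode-coding-region' before parsing it might work just as well, but that is untested here.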
Deniz