bug#50391: 28.0.50; json-read non-ascii data results in malformed string
From: Lars Ingebrigtsen
Subject: bug#50391: 28.0.50; json-read non-ascii data results in malformed string
Date: Sun, 05 Sep 2021 10:08:35 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Zhiwei Chen <condy0919@gmail.com> writes:
> When fetching JSON from youdao (a dict service in China):
>
> #+begin_src elisp
> (url-retrieve
>  "https://dict.youdao.com/suggest?q=accumulate&le=eng&num=80&doctype=json"
>  (lambda (_status)
>    (goto-char (1+ url-http-end-of-headers))
>    (write-region (point) (point-max) "/tmp/acc1.json")))
> #+end_src
>
> Then C-x C-f "/tmp/acc1.json", the file is correctly encoded without
>
> But if I `json-read' and then `json-insert' the data, the file is
> malformed, even though uchardet shows the encoding of the file is utf-8.
When you do the `write-region', Emacs writes the octets you received
from the web server to a file. When Emacs loads that file in again, it
guesses that it's utf-8 and decodes it that way, so that's why that
works correctly.
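
To make the difference between octets and decoded text concrete, here is a minimal sketch. The helper name `my/octets-vs-text' and the sample string are made up for illustration; only `encode-coding-string', `decode-coding-string' and `multibyte-string-p' are real Emacs functions.

```elisp
;; Hypothetical helper, not part of Emacs: compare raw octets with decoded text.
(defun my/octets-vs-text (text)
  "Return (MULTIBYTE-P BYTE-COUNT ROUND-TRIP-OK) for TEXT encoded as UTF-8."
  (let ((octets (encode-coding-string text 'utf-8))) ; unibyte, like data off the wire
    (list (multibyte-string-p octets)
          (length octets)
          (string= (decode-coding-string octets 'utf-8) text))))

;; "\u7d2f\u79ef" is two CJK characters, each three bytes in UTF-8.
(my/octets-vs-text "\u7d2f\u79ef") ; => (nil 6 t)
```

The octets only turn back into the two original characters once they have been decoded, which is what visiting the file does implicitly.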
> #+begin_src elisp
> (url-retrieve
>  "https://dict.youdao.com/suggest?q=accumulate&le=eng&num=80&doctype=json"
>  (lambda (_status)
>    (goto-char (1+ url-http-end-of-headers))
>    (let ((j (json-read)))
>      (with-temp-buffer
>        (json-insert j)
>        (write-region (point-min) (point-max) "/tmp/acc2.json")))))
> #+end_src
But here you're asking Emacs to use json-read on a buffer that's not
been decoded. The http buffer at this point looks like this:

You have to say (decode-coding-region (point) (point-max) 'utf-8) first
for that to work. I.e.,

(url-retrieve
 "https://dict.youdao.com/suggest?q=accumulate&le=eng&num=80&doctype=json"
 (lambda (_status)
   (goto-char (1+ url-http-end-of-headers))
   (let ((buf (current-buffer))
         (end (1+ url-http-end-of-headers)))
     (with-temp-buffer
       ;; Copy the response body, then decode the raw octets as utf-8.
       (insert-buffer-substring buf end)
       (decode-coding-region (point-min) (point-max) 'utf-8)
       (goto-char (point-min))
       (let ((j (json-read)))
         (erase-buffer)
         (json-insert j)
         (write-region (point-min) (point-max) "/tmp/acc2.json"))))))

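
The same decode-then-parse step can be tried without a network request. This is only a sketch under assumptions: the function name `my/parse-utf8-json-octets' and the sample JSON are invented for illustration, and the octets are produced locally with `encode-coding-string' rather than fetched from youdao.

```elisp
(require 'json)

;; Hypothetical helper, not part of Emacs or of the code above.
(defun my/parse-utf8-json-octets (octets)
  "Decode OCTETS (a unibyte UTF-8 string) in a temp buffer and parse it as JSON."
  (with-temp-buffer
    (insert octets)                                        ; raw bytes, still undecoded
    (decode-coding-region (point-min) (point-max) 'utf-8)  ; bytes -> characters
    (goto-char (point-min))
    (json-read)))

;; Stand-in for a response body; the key and value are made up.
(my/parse-utf8-json-octets
 (encode-coding-string "{\"word\": \"\u7d2f\u79ef\"}" 'utf-8))
```

With the default `json-read' settings the result is an alist keyed by symbols, so the decoded string should be retrievable with (cdr (assq 'word RESULT)).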
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no