Re: default charset for text/html selection in X11
From: Robert Pluim
Subject: Re: default charset for text/html selection in X11
Date: Thu, 22 Jun 2023 11:07:45 +0200
>>>>> On Thu, 22 Jun 2023 15:57:59 +0800, Po Lu <luangruo@yahoo.com> said:
Po Lu> Robert Pluim <rpluim@gmail.com> writes:
>>>>>>> On Thu, 22 Jun 2023 11:37:14 +0800, Po Lu <luangruo@yahoo.com> said:
>>
>> Po Lu> Po Lu <luangruo@yahoo.com> writes:
>> >> What is the type of the string? IOW, what's
>> >>
>> >> (get-text-property html 'foreign-selection)
>>
>> Po Lu> (get-text-property 0 html 'foreign-selection), of course. Sorry about
>> Po Lu> the confusion.
>>
>> (get-text-property 0 'foreign-selection html) => STRING
>>
>> but itʼs definitely a utf-8 string, not iso-latin-1.
Po Lu> Would you please report this as a bug, to the Chromium developers?
Po Lu> That is, if:
Po Lu> (x-get-selection-internal 'CLIPBOARD 'text/html)
Po Lu> returns a string of the same type.
It does.
Po Lu> The ICCCM clearly states that:
Po Lu> STRING as a type or a target specifies the ISO Latin-1 character set
Po Lu> plus the control characters TAB (octal 11) and NEWLINE (octal 12).
Po Lu> The spacing interpretation of TAB is context dependent. Other ASCII
Po Lu> control characters are explicitly not included in STRING at the
Po Lu> present time.
Iʼm not about to contradict the ICCCM, but `gui-get-selection' does
the following:
      ;; Guess at the charset for types like text/html
      ;; -- it can be anything, and different
      ;; applications use different encodings.
      ((string-match-p "\\`text/" (symbol-name data-type))
       (decode-coding-string
        data (car (detect-coding-string data))))
      ;; Do nothing.
I took a closer look, and `yank-media' does the wrong thing, but
`(yank-media-types t)' and selecting "text/html" does the right
thing. The difference is that the former uses
`gui-backend-get-selection', and the latter uses `gui-get-selection',
and thus does the auto-detection.
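To illustrate why the decoding step matters, here is a minimal sketch
that does not touch the real selection machinery. It builds the UTF-8
bytes for "itʼs" by hand and compares decoding them as ISO Latin-1
(what a STRING-typed selection implies) with decoding them as UTF-8.
The function name is hypothetical, and the sample text is only an
assumption about what a browser might put on the clipboard.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs.
(defun my-selection-decode-demo ()
  "Return the lengths of a UTF-8 payload: raw, as Latin-1, as UTF-8."
  (let* (;; Raw unibyte data, as a selection owner might send it.
         ;; U+02BC takes two bytes in UTF-8, so this is 5 bytes.
         (raw (encode-coding-string "it\u02bcs" 'utf-8))
         ;; Treating the bytes as Latin-1 yields one character per byte.
         (as-latin-1 (decode-coding-string raw 'iso-latin-1))
         ;; Treating them as UTF-8 recovers the original 4 characters.
         (as-utf-8 (decode-coding-string raw 'utf-8)))
    (list (length raw) (length as-latin-1) (length as-utf-8))))

(my-selection-decode-demo) ; => (5 5 4)
```

The Latin-1 result has an extra character because the two bytes of the
apostrophe are decoded separately, which is the kind of mojibake you
get when the STRING type is taken at face value.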
Robert
--