From: Ikumi Keita
Subject: Re: [AUCTeX-devel] [AUCTeX-diffs] GNU AUCTeX branch, master, updated. 097084443771d6716c6870f2f8d329e9c0949d97
Date: Wed, 31 Oct 2018 01:01:06 +0900
Hi David,
>>>>> David Kastrup <address@hidden> writes:
>> I changed it to "[\x00-\xFF]+" to process all the raw 8-bit bytes
>> together at decoding with the relevant coding system.
> That does not cover raw bytes since they are not in the range 00-ff in
> Emacs multibyte characters. So that expression would only work for
> bytes in buffers decoded from files considered to be in Latin-1
> encoding.
If my memory serves, that is the behavior of non-Unicode Emacs
(mule-version < 6). Current Emacs (mule-version = 6) actually has
multibyte handling smart (or confusing) enough to match a raw 8-bit byte
with the regexp "[\x00-\xFF]". Both forms
(string-match "[\x00-\xFF]" (string-to-unibyte (byte-to-string #xab)))
(string-match "[\x00-\xFF]" (string-to-multibyte (byte-to-string #xab)))
return a non-nil value (0), at least on my Emacs 26.1.
Although it is true that raw 8-bit characters in a multibyte string are
not in the range 00-ff, current Emacs automatically (and implicitly)
converts them into 00-ff when matching against such regexps. Whereas
the form
(aref (string-to-multibyte (byte-to-string #xab)) 0)
returns #x3fffab, the string still matches "[\x00-\xFF]" in
`string-match'. (I admit that this behavior is confusing.)
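The checks above can be gathered into one form for evaluation in
*scratch* or with M-:. This is only a sketch: the function name
`my-raw-byte-demo' and the local variable names are made up for
illustration and are not part of AUCTeX, and the `string-match' results
noted in the comments are the ones reported above for Emacs 26.1, so
other versions may behave differently.

```elisp
;; Hypothetical helper, not part of AUCTeX: collects the checks
;; discussed above into a single list.
(defun my-raw-byte-demo ()
  "Return a list of results for the raw byte 0xAB in both representations."
  (let* ((unibyte (byte-to-string #xab))            ; unibyte string holding byte 0xAB
         (multibyte (string-to-multibyte unibyte))) ; same byte as a multibyte character
    (list (multibyte-string-p unibyte)              ; nil
          (multibyte-string-p multibyte)            ; t
          (aref multibyte 0)                        ; #x3fffab, i.e. outside 00-ff
          ;; Both matches were reported to return 0 on Emacs 26.1.
          (string-match "[\x00-\xFF]" unibyte)
          (string-match "[\x00-\xFF]" multibyte))))

(my-raw-byte-demo)
```

The third element shows that the character code itself is outside
00-ff, while the last two show whether the regexp nevertheless matches.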
Regards,
Ikumi Keita