help-gnu-emacs
Re: [Solved] RE: Differences between identical strings in Emacs lisp


From: Stefan Monnier
Subject: Re: [Solved] RE: Differences between identical strings in Emacs lisp
Date: Thu, 09 Apr 2015 08:32:09 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux)

> I could imagine that the step from the equivalence char=byte to
> char=unicode code point (long(er) integer) is not so difficult. But we have
> in addition the UTF-8 representation. To what of the two latter--unicode code
> point (integer, several bytes long) or its UTF-8 representation (sequence of
> several bytes) does the term "multibyte" refer?

"Multibyte" refers to a "string of characters".  These were represented
internally using an iso-2022 encoding until Emacs-22, and since Emacs-23
they're represented internally with a utf-8 encoding.  The name comes
from the fact that each element (i.e. each character) can take up more
than one byte.  But that's just an internal detail that is mostly hidden
from Elisp.
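
As a rough illustration of "each element can use up more than one byte",
here is a minimal sketch.  The helper name my/string-sizes is made up for
this example; multibyte-string-p, length and string-bytes are standard
Elisp functions.

```elisp
;; Hypothetical helper, for illustration only.
;; Returns (MULTIBYTE-P NUMBER-OF-CHARACTERS NUMBER-OF-INTERNAL-BYTES).
(defun my/string-sizes (s)
  (list (multibyte-string-p s)   ; non-nil for a string of characters
        (length s)               ; counts characters, not bytes
        (string-bytes s)))       ; size of the internal representation

;; "\u00e9" is e-acute; the escape forces a multibyte string.
(my/string-sizes "caf\u00e9")    ; => (t 4 5)
```

Elisp code normally only sees the 4 characters; the 5 bytes show up only
if you ask for them explicitly with string-bytes.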

To turn such a string of characters into a string of bytes you need to
use things like encode-coding-string or encode-coding-region, at which
point you have to specify which encoding you want to use (e.g. utf-8).
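
A small sketch of that conversion, assuming utf-8 as the target encoding.
The helper name my/encode-roundtrip is hypothetical; encode-coding-string
and decode-coding-string are the standard functions.

```elisp
;; Hypothetical helper, for illustration only.
;; Encodes the character string CHARS to utf-8 bytes and decodes it back.
(defun my/encode-roundtrip (chars)
  (let* ((bytes (encode-coding-string chars 'utf-8))   ; string of bytes
         (back  (decode-coding-string bytes 'utf-8)))  ; string of characters
    (list (multibyte-string-p bytes)  ; nil: the result is a unibyte string
          (length bytes)              ; now counts bytes
          (string= chars bytes)       ; characters vs bytes do not compare equal
          (string= chars back))))     ; decoding restores the original

(my/encode-roundtrip "caf\u00e9")     ; => (nil 5 nil t)
```

The third element is the kind of surprise this thread started from: the
two strings may look the same when printed, but one is a string of
characters and the other a string of bytes.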


        Stefan



