Re: [Gcl-devel] utf8 and emacs text/string multibyte representation
From: Camm Maguire
Subject: Re: [Gcl-devel] utf8 and emacs text/string multibyte representation
Date: Sat, 01 Nov 2014 11:03:12 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux)
Greetings! One other thing: any opinions on using locales and wchar_t
for conversions?
Practically speaking, is there really any external encoding other than
UTF-8 that needs support in a Common Lisp?
Take care,
Carl Shapiro <address@hidden> writes:
> On Fri, Oct 31, 2014 at 11:20 AM, Camm Maguire <address@hidden> wrote:
>
> > It really appears that unicode refers more to a glyph than anything
> > else. If we follow your suggestions, and leave characters 8-bit, aref
> > random O(1) access, is there any utility to providing unicode functions
> > #'glyph-length or some such in a common lisp implementation?
>
> Yes, a Common Lisp character is a UTF-8 code unit. As such, (length "א")
> would return 2 in GCL whereas it returns 1 in CMUCL.
>
> For iterating across strings in ways other than by UTF-8 code unit, you will
> want to provide iterators for iterating by code point, by glyph,
> and so forth.
>
> In theory, something like CL-UNICODE would provide that, but I think it's
> really lacking in a number of important ways. GCL being what it is, you
> could link against ICU and use their functions to start with.
>
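
To make the code unit versus code point distinction above concrete, here is a
minimal sketch in C. It is not GCL or ICU code: the names utf8_codepoint_count
and utf8_next are hypothetical, and both functions assume the input is valid,
NUL-terminated UTF-8 (no validation, no handling of malformed sequences).
strlen gives the length in code units; the first helper gives the length in
code points; the second is the core of a by-code-point iterator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-8 string
   by counting every byte that is not a continuation byte (10xxxxxx).
   Assumes the input is valid UTF-8. */
size_t utf8_codepoint_count(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper: decode the code point starting at *p and advance *p
   past it. Returns 0 at the end of the string. Assumes valid UTF-8. */
uint32_t utf8_next(const char **p)
{
    const unsigned char *s;
    uint32_t cp;
    int extra;

    assert(p != NULL && *p != NULL);
    s = (const unsigned char *)*p;
    if (*s == 0)
        return 0;

    /* The lead byte determines how many continuation bytes follow. */
    if (*s < 0x80)                { cp = *s;        extra = 0; }
    else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
    else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
    else                          { cp = *s & 0x07; extra = 3; }
    s++;

    /* Each continuation byte contributes its low six bits. */
    while (extra-- > 0 && (*s & 0xC0) == 0x80)
        cp = (cp << 6) | (uint32_t)(*s++ & 0x3F);

    *p = (const char *)s;
    return cp;
}
```

For the string "א" (bytes D7 90), strlen reports 2 code units while
utf8_codepoint_count reports 1 code point, and utf8_next decodes U+05D0.
Iterating by glyph (grapheme cluster) needs the Unicode segmentation tables
and is not attempted here.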
--
Camm Maguire address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah