Re: Dynamic loading progress
From: Eli Zaretskii
Subject: Re: Dynamic loading progress
Date: Sun, 22 Nov 2015 21:43:25 +0200
> From: Philipp Stephani <address@hidden>
> Date: Sun, 22 Nov 2015 19:10:44 +0000
> Cc: address@hidden, address@hidden, address@hidden
>
> It is only used in one place: the internal representation of
> characters in buffers and strings. Emacs _never_ lets this internal
> representation leak outside.
>
> If I run in scratch:
>
> (with-temp-buffer
>   (insert #x3fff40)
>   (describe-char (point-min)))
Emacs will never find such a "byte" in any text. So this feature is not
really relevant to the issue at hand.
> Then the resulting help buffer says "buffer code: #xF8 #x8F #xBF #xBD #x80",
> is that not considered a leak?
No. You created this yourself, and got what you asked for.
More generally, can you imagine a real-life situation where a string
with such "bytes" could be received from a module, as part of a C
'char *' string?
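For readers checking the byte sequence quoted above: it matches a UTF-8-style five-byte form for a code point beyond the Unicode range. The following is a minimal sketch in C, not Emacs source. The function name `encode_5byte` is hypothetical, and the bit layout is an assumption inferred from the bytes that `describe-char` reported.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Emacs: encode a code point in the
   assumed range 0x200000..0x3FFF7F as five bytes.  The lead byte is
   0xF8; the four continuation bytes carry 4 + 6 + 6 + 6 payload bits.  */
static size_t
encode_5byte (unsigned int c, unsigned char out[5])
{
  assert (c >= 0x200000 && c <= 0x3FFF7F);
  out[0] = 0xF8;
  out[1] = 0x80 | ((c >> 18) & 0x0F);
  out[2] = 0x80 | ((c >> 12) & 0x3F);
  out[3] = 0x80 | ((c >> 6) & 0x3F);
  out[4] = 0x80 | (c & 0x3F);
  return 5;
}
```

Applied to #x3fff40, this yields F8 8F BF BD 80, the same five bytes shown in the help buffer.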
> You are suggesting to expose the internal representation to outside
> application code, which predictably will cause that representation to
> leak into Lisp. That'd be a disaster. We had something like that
> back in the Emacs 20 era, and it took many years to plug those leaks.
> We would be making a grave mistake to go back there.
>
> I don't suggest leaking anything that isn't already leaked. The extension of
> the codespace to 22 bits is well documented.
I don't think it's reasonable to request that module authors read all
that stuff and understand it, before they can write a simple module
that manipulates non-ASCII text. Writing such modules shouldn't be that
hard.
> Returning raw bytes means that encoding and decoding isn't a perfect
> roundtrip:
>
> (decode-coding-string (encode-coding-string (string #x3fffc2 #x3fffbb)
> 'utf-8-unix) 'utf-8-unix)
> "»"
If you start with raw bytes, not large integers, then the roundtrip
will be perfect.
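To illustrate the asymmetry, here is a minimal sketch in C, assuming that a raw byte 0x80..0xFF is represented by the character 0x3FFF00 plus the byte value (which is what the values #x3fffc2 and #x3fffbb in the quoted example suggest). The names `raw_byte_to_char` and `char_to_raw_byte` are hypothetical, not Emacs functions.

```c
#include <assert.h>

/* Assumed layout: raw bytes 0x80..0xFF map to the characters
   0x3FFF80..0x3FFFFF.  */
enum { RAW_BYTE_OFFSET = 0x3FFF00 };

/* Decoding direction: an undecodable byte becomes a raw-byte character.  */
static unsigned int
raw_byte_to_char (unsigned char b)
{
  assert (b >= 0x80);
  return RAW_BYTE_OFFSET + b;
}

/* Encoding direction: a raw-byte character becomes the original byte.  */
static unsigned char
char_to_raw_byte (unsigned int c)
{
  assert (c >= 0x3FFF80 && c <= 0x3FFFFF);
  return (unsigned char) (c - RAW_BYTE_OFFSET);
}
```

Going byte, then character, then byte returns the starting byte for every value in 0x80..0xFF. Going the other way, the characters #x3fffc2 and #x3fffbb encode to the bytes C2 BB, which happen to form the valid UTF-8 sequence for U+00BB, so decoding gives "»" and not the two raw-byte characters.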
> What are the exact differences between the approaches? As far as I can see
> differences exist only for the following points:
> - Accepting invalid sequences. I consider that a bug in general-purpose APIs,
> including decode-coding-string. However, given that Emacs already extends the
> Unicode codespace and therefore has to accept some invalid sequences anyway,
> it might be OK if it's clearly documented.
> - Emitting raw bytes instead of extended sequences. Though I'm not a fan of
> this it might be unavoidable to be able to treat strings transparently (which
> is desirable).
Then I think we agree after all.
Thanks.