
Re: [Chicken-hackers] CHICKEN in production


From: Aleksej Saushev
Subject: Re: [Chicken-hackers] CHICKEN in production
Date: Tue, 14 Oct 2014 15:14:39 +0400
User-agent: Gnus/5.1299999999999999 (Gnus v5.13) Emacs/24.3 (berkeley-unix)

John Cowan <address@hidden> writes:

> Aleksej Saushev scripsit:
>
>> > Valid but useless.  It has no significance whatever.
>> 
>> It has no significance for what exactly?
>
> For the Unicode Standard.  Characters like #\; and #\A and #\newline
> have definite meaning to Unicode, but NUL does not.  It corresponds to
> unpunched paper tape, which was traditionally ignored.

I think this is irrelevant to the question. For instance, the original
meaning of CR and LF differs from their current use almost everywhere,
with the exception of very old networking protocols.

>> My experience with Forth implementations that append NUL terminator
>> is that this doesn't bring enough gain while adding more obstacles
>> for string processing. You end up using conventional Forth strings
>> (represented by pointer-length pairs) and NUL-terminated strings
>> (e.g. for FFI purpose) as different concepts coexisting side by side.
>
> Nobody is proposing the abandonment of counted strings as they currently
> exist, or the creation of two kinds of strings.  The original idea was
> simply to ensure that the internal representation of a string always
> contains a zero byte *outside* the count and *not* visible to Scheme,
> so that it can safely be shared with C without copying.
>
> It is already the case that we don't allow a string that contains a zero
> byte to be passed to C by copying.  So de facto there are already two
> kinds of strings, those that can be passed to C and those that can't.
> I proposed eliminating that distinction by disallowing strings with zero
> bytes at all.

I don't think that is a really good idea. My practical experience is
that disallowing this use case forces me to work around the deficiency
in certain tasks that come up rather frequently.

Since Scheme gives you better control over access to strings, you can
introduce an "FFI-ready" string subtype and maintain it. If you forbid
substring "views" and never modify a string to replace a character with
NUL, you stay within this C-compatible type.
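
A minimal C sketch of such a subtype, for illustration only. The
counted_str layout and the cs_* names are hypothetical and do not come
from CHICKEN; the sketch assumes a counted string that always carries
one hidden NUL byte outside its count, and treats a string as
"FFI-ready" only when no NUL occurs inside the counted payload.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: `len` payload bytes followed by one
   hidden NUL byte that is NOT included in `len`. */
typedef struct {
    size_t len;
    char  *data;   /* len + 1 bytes allocated; data[len] == '\0' */
} counted_str;

/* Copy `len` bytes from `src` and append the hidden terminator.
   On allocation failure, data is NULL and len is 0. */
static counted_str cs_make(const char *src, size_t len) {
    counted_str s;
    s.len = 0;
    s.data = malloc(len + 1);
    if (s.data != NULL) {
        memcpy(s.data, src, len);
        s.data[len] = '\0';
        s.len = len;
    }
    return s;
}

/* "FFI-ready" means the payload itself contains no NUL byte, so a C
   callee sees exactly the same contents as the counted view. */
static int cs_ffi_ready(const counted_str *s) {
    return s->data != NULL && memchr(s->data, '\0', s->len) == NULL;
}

/* Share the buffer with C without copying, or refuse with NULL. */
static const char *cs_as_c_string(const counted_str *s) {
    return cs_ffi_ready(s) ? s->data : NULL;
}
```

Strings that contain NUL stay perfectly valid on the counted side; they
simply fall outside the subtype that may be handed to C uncopied.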

>> Allowing NUL within strings allows better handling of some protocols that
>> are text-oriented yet use NUL as field separator.
>
> That's a good point, and may be enough to warrant allowing NUL in
> strings, though most of the time C manages fine without being able to
> handle such things in its (effectively) native string type.  A procedure
> read-null-terminated-string would probably be sufficient; note
> that I am not proposing the elimination of the character #\u0000.

Well... "Manages fine" is rather strange to hear from a person
who advocates separating octet-array and string types. :)
Yes, C "manages fine" because its strings are actually octet arrays
with a special syntax for literal values.
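
To illustrate the point about NUL-separated fields, here is a small C
sketch. The next_field helper and the sample record used with it are
made up for this example; the sketch assumes the record arrives as a
pointer plus an explicit length, which is exactly the octet-array view
rather than the NUL-terminated string view.

```c
#include <stddef.h>
#include <string.h>

/* Walk a buffer of `len` bytes in which NUL separates fields.
   *pos is the offset of the next field and is advanced past the
   separator. Returns a pointer to the field and stores its length in
   *field_len, or returns NULL when no fields remain. */
static const char *next_field(const char *buf, size_t len,
                              size_t *pos, size_t *field_len) {
    if (*pos > len)
        return NULL;
    const char *start = buf + *pos;
    const char *sep = memchr(start, '\0', len - *pos);
    *field_len = sep ? (size_t)(sep - start) : len - *pos;
    *pos += *field_len + 1;
    return start;
}
```

With a record such as "user\0host\0cmd" (13 bytes), strlen reports 4
and sees only the first field, while next_field yields all three
because it never relies on the terminator.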


-- 
HE CE3OH...



