srfi-130 2.0.1: An improved CHICKEN string library
From: Wolfgang Corcoran-Mathe
Subject: srfi-130 2.0.1: An improved CHICKEN string library
Date: Fri, 9 Sep 2022 13:33:44 -0400
Hi all,
I’ve just released version 2.0.1 of the srfi-130 egg[0], which is my
quixotic attempt at a better string library for CHICKEN. It’s a new,
fully Unicode-aware, opaque-cursor implementation of John Cowan’s
SRFI 130[1] built on top of the utf8[2] egg. Some benefits:
* String cursors, which encapsulate byte offsets, provide faster
indexing and substring operations on Unicode strings than codepoint
indices. For example, srfi-130’s ‘string-ref/cursor’ runs in
(notional) constant time when given a cursor, while utf8’s
‘string-ref’ requires O(n) time.
* All srfi-130 procedures that take cursors can also take
(codepoint) indices, so porting between srfi-13/srfi-152/utf8
and srfi-130 should be relatively easy.
* Cursors are type-safe, and you can only create valid cursors
(but see “Caveats” below). Low-level functional programmers
may consider this decadent, but I believe it encourages better
programming. Passing hand-computed offsets to CHICKEN’s
byte-oriented string operations is asking for trouble, and cursors
are a more disciplined way to achieve the same goals with similar
efficiency.
* Better error reporting. The srfi-130 egg tries to provide
useful exceptions with correct locations that follow CHICKEN’s
internal condition protocol (e.g. type errors raise (exn type)
conditions). This is in contrast to the utf8 egg’s errors,
which are often hard to trace (“where exactly did string-ref get
that invalid index?”).
* More rigorous, randomized testing using the test-generative egg.
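
To make the cursor workflow above concrete, here is a minimal sketch
using only procedures defined by SRFI 130. It assumes the egg is
installed and imported as ‘srfi-130’; the sample string and the
variable names are just for illustration.

```scheme
(import scheme (chicken base) srfi-130)

(define s "naïve café")

;; In SRFI 130, string-index returns a cursor rather than an index.
(define sp (string-index s char-whitespace?))

;; Slice from the start of the string up to the cursor we found.
(define first-word
  (substring/cursors s (string-cursor-start s) sp))

;; Step one character past the space and read the character there.
(define after (string-cursor-next s sp))
(define c (string-ref/cursor s after))

;; Codepoint indices are accepted wherever cursors are.
(define same-word (substring/cursors s 0 5))

;; A cursor can be converted back to a codepoint index when needed.
(print first-word " " c " " (string-cursor->index s sp))
```

The point of the sketch is that ‘sp’ and ‘after’ are reused directly,
so nothing has to be rescanned from the start of the string as it
would be with codepoint indices.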
# Caveats
Cursors are very useful, but they don’t play well with string
mutation. Mutating a string invalidates all cursors into it, and
catching these situations efficiently is a hard problem. It’s
also possible to use a cursor on a string other than the one
it refers to, which is likewise an (uncaught) error. This could be
averted with an ‘eqv?’ check, if it annoys enough people.
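
For anyone who wants that check today, one possible user-level sketch
follows. It is not how the egg works internally: ‘owned-cursor’,
‘owned-start’ and ‘owned-ref’ are hypothetical names invented for this
example. The wrapper pairs a cursor with the string it was created
for and compares with ‘eqv?’ before use.

```scheme
(import scheme (chicken base) srfi-130)

;; Hypothetical wrapper (not part of the egg): a cursor plus its owner.
(define-record-type owned-cursor
  (make-owned-cursor owner cursor)
  owned-cursor?
  (owner owned-cursor-owner)
  (cursor owned-cursor-cursor))

;; Create a wrapped cursor pointing at the start of S.
(define (owned-start s)
  (make-owned-cursor s (string-cursor-start s)))

;; Dereference only if the cursor was created for this very string.
(define (owned-ref s oc)
  (if (eqv? s (owned-cursor-owner oc))
      (string-ref/cursor s (owned-cursor-cursor oc))
      (error 'owned-ref "cursor belongs to a different string" s)))
```

This catches use on the wrong string at the cost of an extra record
per cursor. It does nothing about cursors invalidated by mutation.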
In sum, I think that the new srfi-130 egg has some important benefits
while mostly maintaining backwards compatibility with srfi-13 and the
other CHICKEN string libraries. I hope that some CHICKEN programmers
will consider it. Suggestions and patches are welcome.
Best regards, Wolf
[0] https://wiki.call-cc.org/eggref/5/srfi-130
[1] https://srfi.schemers.org/srfi-130/
[2] https://wiki.call-cc.org/eggref/5/utf8
Thanks to John and to Will Clinger for creating SRFI 130.
--
Wolfgang Corcoran-Mathe <wcm@sigwinch.xyz>