
From: Peter Bex
Subject: [Chicken-hackers] [PATCH] Get rid of special encoding for qualified symbols (fixes #1077)
Date: Sun, 6 Jan 2019 20:10:28 +0100
User-agent: NeoMutt/20170113 (1.7.2)

Hi all,

After a few failed attempts, I finally figured out a way to fix #1077
without breaking the world.  The issue was that "qualified" symbols
(i.e., things like ##core#blabla) are encoded with a length prefix byte
like so: "\004coreblabla".  That means any symbol whose name starts with
a byte lower than 41 is treated as a qualified symbol, whether or not
it was meant to be one.
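
To make the old encoding concrete, here is a minimal sketch in C.  It is
not taken from the CHICKEN runtime: the names encode_old_style and
looks_old_style_qualified are hypothetical, and the check simply mirrors
the "lower than 41" rule described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not CHICKEN's actual code: build the old-style
 * name for ##<ns>#<name>, i.e. one length byte, then the namespace,
 * then the rest of the name.  Returns the number of bytes written
 * (excluding the terminating NUL), or 0 if the buffer is too small. */
static size_t encode_old_style(const char *ns, const char *name,
                               char *out, size_t outsize)
{
    size_t nslen = strlen(ns);
    size_t namelen = strlen(name);

    /* Assumption: the prefix byte must stay below the threshold of 41
     * mentioned in the email, otherwise it would not be recognized. */
    assert(nslen > 0 && nslen < 41);

    if (1 + nslen + namelen + 1 > outsize)
        return 0;

    out[0] = (char)nslen;                        /* length prefix byte */
    memcpy(out + 1, ns, nslen);                  /* namespace, e.g. "core" */
    memcpy(out + 1 + nslen, name, namelen + 1);  /* name plus NUL */
    return 1 + nslen + namelen;
}

/* The ambiguity: any name whose first byte is below 41 looks like it
 * carries a length prefix, even if it is an ordinary symbol. */
static int looks_old_style_qualified(const char *s)
{
    return (unsigned char)s[0] < 41;
}
```

With these definitions, encoding the namespace "core" and the name
"blabla" yields the 11 bytes "\004coreblabla", and an ordinary symbol
that happens to start with a space (byte 32) is also reported as
qualified, which is the kind of collision #1077 is about.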

The only reasons for it to be like this are historical, as far as I can
tell.  So, I wanted to make these symbols represented like all others, as
simply "##core#foo".  Changing the reader to avoid encoding them in the
special way is not enough, because the "internal" core language uses
qualified symbols all over the place (think ##core#inline and such).

The compiler would still be comparing the symbols it read against those
it was compiled with, which means the reader would read "##core#inline"
but the compiler's C code still had "\004coreinline" in its symbol table
as a different symbol.

The workaround I came up with is to simply malloc a new string whenever
we try to intern a qualified symbol like "\004coreinline".  This new
string is then changed to read "##core#inline".  That way, we'll end up
with _only_ new-style qualified symbols, even if the runtime is still
filled with old-style qualified symbols.  Everything is normalized at
the point of interning.  The first attached patch takes care of this.
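
The idea can be sketched roughly as follows.  This is an illustration
only, not the code from the attached patch: normalize_qualified is a
hypothetical name, and both the guard for names that already start with
"##" and the decision to leave a zero prefix byte alone (keywords) are
assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: if `name` (of `len` bytes) is an old-style
 * qualified symbol such as "\004coreinline", return a freshly
 * malloc'ed, NUL-terminated "##core#inline" and store its length in
 * *newlen.  Returns NULL if the name is not old-style (the caller
 * keeps using the original) or if allocation fails. */
static char *normalize_qualified(const char *name, size_t len,
                                 size_t *newlen)
{
    size_t nslen;
    char *out;

    assert(name != NULL && newlen != NULL);

    if (len == 0)
        return NULL;

    /* Assumption: a name already spelled "##..." is new-style.  Note
     * that '#' is byte 35, so without this guard it would look like a
     * length prefix. */
    if (len >= 2 && name[0] == '#' && name[1] == '#')
        return NULL;

    nslen = (unsigned char)name[0];

    /* A zero prefix is left alone (keywords); 41 and up is an
     * ordinary symbol; the namespace must fit inside the name. */
    if (nslen == 0 || nslen >= 41 || 1 + nslen > len)
        return NULL;

    /* Drop the prefix byte (-1), add "##" (+2) and "#" (+1). */
    *newlen = len + 2;
    out = malloc(*newlen + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, "##", 2);
    memcpy(out + 2, name + 1, nslen);             /* namespace */
    out[2 + nslen] = '#';
    memcpy(out + 3 + nslen, name + 1 + nslen,     /* rest of the name */
           len - 1 - nslen);
    out[*newlen] = '\0';
    return out;
}
```

An intern routine would call something like this first and then look up
or insert whichever string it ends up with, so the symbol table only
ever sees the "##core#inline" spelling.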

Once you have compiled a compiler with the first patch and
bootstrapped it with itself, it will no longer contain any old-style
qualified symbols in the runtime.  At that point, all the special-cased
code for old-style qualified symbols can be dropped.  The second patch
takes care of this.

I think the first patch should be applied, then a new 5.0.1 snapshot
should be tagged.  This snapshot should then be used in bootstrap.sh
so we have a version that can build new CHICKENs correctly.  Then we can
apply the second patch to drop all the deprecated compatibility code.

The second patch also contains a regression test for #1077.

Finally, keywords like foo: are still encoded as "\000foo".  We might
want to fix that as well, but that should be an easier fix and not as
invasive.  On the other hand, it will also require a binary version
bump, so we could decide to tackle it right now too.  I'm just not
sure how yet.

Cheers,
Peter

Attachment: 0001-When-interning-qualified-symbols-convert-them-to-reg.patch
Description: Text Data

Attachment: 0002-Drop-support-for-old-style-qualified-symbols-fixes-1.patch
Description: Text Data

Attachment: signature.asc
Description: PGP signature

