chicken-hackers

Re: [Chicken-hackers] Made a start with CHICKEN 5 proposal


From: Peter Bex
Subject: Re: [Chicken-hackers] Made a start with CHICKEN 5 proposal
Date: Mon, 8 Sep 2014 22:57:02 +0200
User-agent: Mutt/1.4.2.3i

On Thu, Sep 04, 2014 at 12:44:54AM +0200, Felix Winkelmann wrote:
> Hello, Peter!
> 
> I generally agree with most proposed changes on this list (with the
> exception of the idea to drop "fluid-let", of course.) But it must be
> clear to you that you already created a "pony" page. It is impossible
> to do all of that, so before we go crazy with ideas, we should perhaps
> get back to what we want to achieve with CHICKEN 5.

Yeah, I kind of had that "pony" impression while writing it up :)

> As I understand it, the idea is to decruftify (that is drop or eggify
> library code), give proper names and modularize.

That's the main reason we're breaking back-compat, I think.

> This is all related
> in one way or the other and looks like it can be done with the little
> resources we have, especially considering that it will take ages until
> only a reasonable subset of the existing eggs compiles and runs
> properly in the new system.
> 
> * Designing a decent POSIX API is a hard task. I have not seen any
>   reasonably good API wrapper for that yet - they are either too
>   lowlevel (Basis, Ocaml, etc.), or too highlevel.

For now a modest refactoring would be enough.

[begin of short brain dump about the POSIX situation]

Putting things like, for example, "directory" in some other unit would
make more sense to me, because there's nothing inherently POSIXy in
reading the contents of a directory (though the _implementation_
happens to rely on the C POSIX API, of course).  I think it belongs
with make-pathname and friends (i.e., in a "paths" or "files" module).

Ideally, there wouldn't be much left of the "posix" unit except some
deeply POSIXy things like fork, signal, fcntl, environment vars etc.
Probably this means the really high-level things move elsewhere.
In time, we might even move the POSIX unit out of core into an egg
and keep only truly "portable" (or essential) things in core.  I'm
not sure what will happen to POSIX in the future, but I think its
hegemony will end sooner rather than later.  The landscape is shifting
quickly with mobile devices (think Windows Phone and Firefox OS, but
also the crippled POSIX support on iOS and Android), OS research is
slowly picking up again, and the Linux crowd seems to be taking an
increasingly aggressive stance against "backwards compatibility" (think
Wayland, systemd etc).

So, I'm not against POSIX support as such, but relying too much on it
in core itself is probably a mistake in the (very) long run.

[end of braindump]

> * Changing the string representation is much harder than you think
>   (quoting John: "If Chibi can do it, so can we" completely ignores
>   the fact that writing a string-representation implementation from
>   scratch is something vastly different than modifying an existing
>   one, one that is much older and much more widely used from
>   foreign/native code.)

Agreed.  Recall that my suggestion was simply to "bless" UTF-8 as the
canonical internal representation (which is the case, de facto, anyway)
and *maybe* to add some detection code that rejects invalid sequences
rather than just continuing with bogus data.  Possibly we could also
make the default string ops the ones from the UTF-8 egg.  Anything
beyond that is overkill and I would definitely not support changing the
encoding in this effort.
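
To give an idea of what "detection code to reject invalid sequences"
could amount to, here is a minimal sketch in plain C.  It is not taken
from CHICKEN or from the UTF-8 egg; the function name utf8_valid is
made up for illustration, and the checks follow the usual UTF-8
well-formedness rules (no overlong forms, no surrogates, nothing above
U+10FFFF).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of CHICKEN: returns 1 if the len bytes
   at s are well-formed UTF-8, 0 otherwise. */
int utf8_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = s[i];
        size_t need;            /* continuation bytes expected */
        unsigned long cp, min;  /* decoded code point, smallest legal value */

        if (c < 0x80) { i++; continue; }  /* plain ASCII */
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;  /* stray continuation byte or invalid lead byte */

        if (len - i - 1 < need) return 0;  /* sequence is cut off */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80) return 0;  /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min) return 0;                      /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* UTF-16 surrogate */
        if (cp > 0x10FFFF) return 0;                 /* beyond Unicode */

        i += need + 1;
    }
    return 1;
}
```

A check like this would presumably sit at the boundaries where bytes
enter the system as strings, so that the rest of the string code can
keep assuming valid data.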

Of course if someone sent in a patch, that might change my mind...
but that's just wishful "pony" thinking ;)

> * Numeric tower support: this is also hard, and will have a
>   considerable performance impact, needs changes in the compiler, in
>   all the icky C glue code and particularly in foreign code - which
>   means things will break all over the place in user code.

There is strong support from the community to do this, and I'm willing
to put in the required effort.  I feel very strongly about adding at
least bignum support to core.  I don't care as much about ratnums and
I don't care at all about compnums, but it may be simpler to add them;
the code to support them too is relatively straightforward.

Not having bignums in core causes too many headaches:
- Foreign procedures may return full-width 64-bit integers, which
   simply cannot be fully represented by flonums.
- Having bignums be external to the core causes a lot of headaches when
   one generates them and passes them to some library.  For instance,
   storing very large numbers in a database is perfectly sane and
   generally possible with the DECIMAL type, but this requires all the
   database eggs to pull in the numbers egg, which they currently don't.
   In short, the numbers egg is "contagious".
- There are several hard to fix bugs that become trivial once bignums
   are supported: #1096, #1000, #1139, #823.  There have been other
   such problems.
- Also, it confuses the newbies :)
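
The first point can be seen in plain C, assuming a flonum is an IEEE 754
double with a 53-bit significand.  The sketch below is only an
illustration: the helper name survives_flonum_roundtrip is invented
here and nothing in it is CHICKEN code.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if n can be converted to a double and
   back without changing its value, 0 otherwise. */
int survives_flonum_roundtrip(uint64_t n)
{
    double d = (double)n;

    /* Values close to 2^64 round up to exactly 2^64, which no longer
       fits in a uint64_t; converting it back would be undefined. */
    if (d >= 18446744073709551616.0)
        return 0;

    return (uint64_t)d == n;
}
```

Under that assumption every integer up to 2^53 survives the trip, but
2^53 + 1 already does not, so a foreign procedure returning an
arbitrary 64-bit value can silently hand back a different number.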

If I don't make it before all the other things have been taken care of,
feel free to release CHICKEN 5 without it.

> * Port-refactoring: again - basically a good idea, but tricky to
>   design, and may have a large performance impact, and the refactoring
>   will be work-intensive (all the direct peeking and poking in port
>   records needs to be localized and changed). This change should also
>   ideally be considered to be done in tandem with changing the string
>   representation.

Here too, a modest change would be enough.  Just using a proper
struct/record type would make later refactorings easier.  The best
part is that the performance impact of adding an offset to the write
buffer is a positive one.  But if we won't be able to make this work,
I won't be too sad, I promise ;)

We don't have to make a perfect design, just one that scales better
with future changes.  I was thinking of making the constructors accept
keyword arguments, so that we can later add things (like position
setting etc) without breaking existing programs.

> * chicken-install/setup-files: a major and very important project on
>   its own. I started thinking about this some time ago, but didn't get
>   anywhere. Something very simple needs to be found that covers most
>   use cases, but this is something that needs input by many people
>   that have experience with the egg system and applications written
>   in CHICKEN. Perhaps we should plan to think about this the next time
>   some CHICKEN-hackers meet?

Sounds like a good plan.  I also think this one may be too difficult and
too much work to do it for CHICKEN 5.0 unless lots of people chip in.

> I _do_ think all the proposed changes make sense more or less, but
> it's unrealistic to think that we achieve anything more than one or
> two of the big parts.

Agreed.  I'll put in some extra effort this week to get the numbers
egg in good shape for importing it into core, and maybe try to get
started on a core patch.

> A few more notes:
> 
> * I think John's idea of putting all the little SRFIs in a few (or a
>   single) module is better than splitting everything up into
>   modules. Having modules for each and everything looks nice on paper
>   but quickly gets old when you have to modify your module imports
>   every time you use a common but nonstandard language construct.  I
>   understand that some people like this kind of bureaucracy, but
>   what's wrong with making things easier for the user?

Yeah, I said much the same at the start of the section about SRFIs.
However, I think it *does* make it easier for the user to _also_ offer
the SRFI libraries separately.  There's already a hacky workaround for
require-extension's builtin-features in eval.scm so that you can say,
for example, (require-extension (srfi 2)), so I think it makes sense to
also provide "full" library declarations, to make it simpler to use and
write portable R7RS programs.

Note that this does not mean this needs to be the only library to export
said SRFI procedures!

> * Please use long, explicit library names, it's easier to remember
>   ("there are many ways to abbreviate something, but only one way not
>   to" - I forgot who said this, John will tell me, I'm sure.) And I
>   would also suggest to avoid using "srfi-XXX" as a module name, and
>   to use something meaningful (yes, I know that in the past I was
>   largely responsible for that mistake in numerous situations.) That
>   would also allow adding our own extensions.

For portability, I prefer at least also allowing the srfi numbers.
But yes, long names are good.  However, there will be so few SRFIs
that will still be left as part of core that it makes very little
sense to rename the existing SRFIs, except when grouping several
constructs together.

> * I can't resist to add a pony on my own: I fear that integrating the
>   R7RS syntax-rules cleanly and transparently inside an egg will be
>   tricky. What about changing syntax-rules to have R7RS semantics in
>   general? I'm not sure if I understand the differences well enough,
>   perhaps someone (Peter?) can comment on this.

I think we already did the important bits (ellipsis identifiers and
tail patterns - ie, SRFI-46).  There are two more changes, AFAIK:

- The "new" syntax-rules foolishly changed the underscore to act as
   a wildcard symbol, making it - strictly speaking - incompatible with
   R5RS.  I don't think it's a good idea to support this in core.
- For no good reason, R7RS syntax-rules allows not only renaming
   ellipsis identifiers, but also quoting them (which I think is
   a bit ugly).  I *think* this is entirely backwards compatible,
   so we could add that to core.

This is easily put in the R7RS egg, though.  Remember, any use of
syntax-rules simply expands into one big ER macro transformer, and its
implementation is a completely self-contained file which may be taken
and copied into the R7RS egg, and tweaked there to support these two
cases.  But it could be simpler to do this as a small preprocessor
which generates a "core" syntax-rules expansion.

> So, in short: forget about unicode, the full numeric tower,
> chicken-install, port-refactoring and everything but modularization,
> the internal structure (and size!) and the necessary issues of doing a
> major release (e.g. the question of how to integrate that with
> henrietta.)

I think we can do minor things now that make it possible to change
things later in a backwards-compatible way.  These are important to
postpone the need to "break the world" a third time for as long as
possible.

I'd really like to hear other people's ideas about what would be the
best way to integrate the changes with Henrietta.  Personally, I think
the easiest way is to simply deploy a second copy of henrietta which
reads from a different cache, populated by a second henrietta-cache cron
job which reads from a different master list.

> The major problem is that re-modularization will be the biggest
> barrier in migrating user code. Once that is done we have a groundwork
> for the really tricky things, and for smaller API changes that are
> easier to detect via the module system.

Agreed.  How will we attack the problem of bootstrapping?  We will make
some breaking changes which might mean CHICKEN 4 may be unable to
bootstrap the CHICKEN-5-in-progress at some point.  Now that we're on
a separate branch we can't really release snapshots in the 4.9.x series.
Maybe fall back to a simple date, or git hash versioning scheme for the
time being?  We don't need to make them public "official" releases of
course.  I just don't know how well our infrastructure will cope with
a different naming strategy.  Should we do this by hand?

Cheers,
Peter
-- 
http://www.more-magic.net


