[Chicken-hackers] simplifying loading/linking/import (long)
From: Felix Winkelmann
Subject: [Chicken-hackers] simplifying loading/linking/import (long)
Date: Wed, 02 Jul 2014 00:50:02 +0200 (CEST)

Ok, let's start from scratch...
* We can't change the existing machinery without breaking an awful lot
of code, so any solution must be an addition to what we currently
have.
* The basic entities we have to deal with are "compilation units",
bodies of code, either statically linked into an executable (or
library) or dynamically loaded.
* These compilation units may or may not contain one or more
"modules", which are separate namespaces (or "bindings") over those
bodies of code.
* "import" incorporates bindings into the current environment, either
globally or inside another namespace (module). We want to
_automatically_ make the code associated with that namespace
available, regardless of the nature of the compilation unit that
contains that code. Is this interpretation correct?
* Making the code inside a compilation unit available happens either
by loading, resulting at some point in a call to "load" (this
includes interpreted code in source form, which is just another
flavor of a compilation unit), or it happens by declaring an
externally available entry-point, currently via "(declare (uses
...))".
(This needs a more obvious or natural syntax at some point, but
that isn't relevant right now)
* Declaring an entry-point into the current compilation unit
(basically the current source file) takes place by "(declare (unit
...))".
* The last 2 points are important if we want to support static
linking. Loading is in this case the simpler operation, as the
entry-point always has the same name. For static linking the
entry-points need to be named differently (there might be ways
around this limitation, but to keep things simple, let's not
consider that right now.)
* So, if we create a "registry" of linked/loaded compilation units,
"import" can consult this registry and check whether a compilation
unit of the same name is already registered and, if not, default to
loading a ".so" or ".scm" with the same name. If the latter is not
found, we have an error. If it is found, add it to the registry.
* "import" incorporates bindings from a set of available modules, also
registered somewhere, specifically in ##sys#module-table. Should it
also handle compilation-units for which no bindings exist (i.e. all
bindings are unqualified)? This is only useful at toplevel, or, in
other words, not inside a module. This will also bring up the
question whether such a behaviour might lead to head-scratching in
case a module should exist, but the binding-information is
unavailable for some erroneous reason.
* Declaring an externally available entry-point must add the
compilation unit associated with it to the registry.
(Sorry, now it gets complicated...)
* libchicken contains a number of entry-points, one for each library
unit that comes with the core system. The registry must already have
entries for these. Users might want to use a similar physical
structure for their code, so we will have to provide means
to add "default" registry entries, I think (I'm not completely
sure right now - the resolution of the entry-points happens
automatically by the linker, but we have to make later "import"s
aware of this.)
* Currently "(declare (unit ...))" calls the entry-point,
_initializing_ the compilation unit. Later "import"s will just
incorporate the bindings. Do we want to initialize the compilation
unit on the first "import"? If yes, we need to separate the notions
of declaring an externally available entry-point and calling it, the
latter being done (we hope) transparently by "import".
* The same situation arises with loaded compilation units. Consider a
dynamically loaded ".so" that holds several compilation units: When
is the entry-point of each contained compilation unit called? On
first "import"? In this case it makes sense to generalize this, I
think.
* The different actions or declarations will need different constructs
to implement the low-level behaviour. Not all of them need to be
user-visible. "import" naturally will. Declaring the current
compilation unit to have a separately named entry point will do so
as well. Declaring an externally available entry-point will. And
finally something for registering a "default" (admittedly for those
special occasions...)
* The registry needs to be something more extensible than a simple
"feature" list. We have to keep track of what is initialized, and so
on. Using any existing mechanism will only make it harder to later
remove the old code and make the existing code even more complicated
than it already is.
* Changing the semantics of "import" for "late" initializing of
compilation units breaks backwards compatibility, but we don't want
to create yet another special form, right? The conservative solution
is to do initialization at the point where an externally available
entry point is declared or code is explicitly loaded, like it
currently is implemented.
(Side note: loading invokes the default entry point "C_toplevel",
declaring an externally available entry-point invokes the
entry-point derived from the name of the compilation unit. In the
case of an ".so" holding several compilation units, we have a
mixture of default entry-point + separately named entry-points. Oh,
this is fun...)
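
To illustrate the separation of "declaring an entry-point" from
"calling it", here is a minimal sketch of a registry that also tracks
initialization. All names are hypothetical, and the entry-point is
modelled as a plain thunk rather than a real C-level entry-point like
"C_toplevel".

```scheme
;; Hypothetical sketch; none of these names are part of CHICKEN.

;; Each entry is a vector: #(name entry-point-thunk initialized?)
(define *units* '())

;; Declare an entry-point WITHOUT calling it.
(define (declare-entry-point! name thunk)
  (set! *units* (cons (vector name thunk #f) *units*)))

(define (find-unit name)
  (let loop ((us *units*))
    (cond ((null? us) #f)
          ((eq? (vector-ref (car us) 0) name) (car us))
          (else (loop (cdr us))))))

;; Call the entry-point at most once. Returns #t if the unit was
;; initialized by this call and #f if it already had been.
(define (initialize-unit! name)
  (let ((u (find-unit name)))
    (cond ((not u) (error "unknown compilation unit" name))
          ((vector-ref u 2) #f)
          (else
           ;; Mark before running, so a re-entrant request made
           ;; while the thunk runs does not run it a second time.
           (vector-set! u 2 #t)
           ((vector-ref u 1))
           #t))))
```

With "late" initialization, "import" would call initialize-unit! on
first use. The conservative variant would call it right after
declare-entry-point!, which matches the current behaviour.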
* Thinking of this now, I realize that the compilation unit itself
might already contain the binding-information - this is the case
when we compile a module without emitting an import library. So late
initialization actually doesn't work, unless we want to require
import libraries in any case. A valid approach, but this may have
again other implications.
* It would be nice to have some terminology for those "bodies of code"
that we can use to invent new special forms to cleanly perform the
above mentioned "actions". This will of course increase the
confusion in the beginning, but we can deprecate the old forms at
some point.
I'm sure I have forgotten something, but it is important that we think
of all possible use cases before anything is changed, or we really
start going into details.
Note that our current CHICKEN does even more than this:
"require-extension" handling feature-IDs, for example. Or
automatically loading syntax-extensions. It's not a coincidence that
handling extensions/using/importing is in part done by a procedure
called "##sys#do-the-right-thing". And then there is figuring out
where the extensions are located, or telling the compiler what units
are loaded, or handling the "(srfi N ...)" extension-specifier even in
the presence of module-binding modifiers like "rename". Wheels within
wheels - it's terrible...
All that nasty lowlevel stuff does not necessarily have to be touched,
but care must be taken before we lock down what will be allowed in
the future and what will not. This is kind of obvious, but I just
wanted to mention it once more.
I hope I haven't raised the confusion to unbearable levels. My
intention was to clear things up, but I have my doubts whether this
was successful.
felix