Re: [Pika-dev] SCM_LSET question
From: Tom Lord
Subject: Re: [Pika-dev] SCM_LSET question
Date: Fri, 6 Feb 2004 08:50:52 -0800 (PST)
> From: Andreas Rottmann <address@hidden>
> I'm not totally clear about this macro. It's to be used for
> setting locals, which are protected by
> SCM_PROTECT_FRAME. However, those can be set by function calls,
> like scm_make_false (&l.bar, arena). Since the code in
> scm_make_false doesn't use SCM_LSET for setting l.bar, this
> leads me to the question if there shouldn't be a macro like
> SCM_RSET (for result-set) or if SCM_LSET can't be dropped...
Ok, here's the deal.
The FFI is designed so that:
~ GC can be precise
~ GC can be concurrent
~ GC can be asynchronous
~ GC can be copying
~ GC can be incremental
~ The representation of values, t_scm_word, can be any C type
To achieve those goals, the FFI has to have some properties:
* No t_scm_word rvalues
The FFI interface must never _require_ that Scheme values
(t_scm_word values) wind up in anonymous storage known only to the
compiler.
For example, the interface to CAR can _not_ be:
l.answer = scm_car (arena, pair)
because after `scm_car' returns and before the result is stored
in `answer', the `t_scm_word' value would exist only in a register
or similar anonymous location managed by the compiler. GC would
not be able to find it -- so it would not be GC protected and would
not be updated if the value were relocated by a copying GC.
Instead, the interface is:
scm_car (&l.answer, arena, pair)
We regard `scm_car' as primitive: internally, it has to coordinate
with GC. But to a caller, that interface means that we never have
the return value as an "intermediate value" -- by the time
`scm_car' returns, the result is stored in `l.answer' where the GC
can reliably find it.
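To make the calling convention concrete, here is a minimal sketch of
an output-parameter `scm_car'. The representation (`struct toy_pair',
a pointer-valued `t_scm_word') and the arena type are assumptions made
up for illustration; they are not Pika's real definitions. The only
point is the shape of the interface: the result is written through
`result' and nothing of type `t_scm_word' is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Pika's real types. */
typedef struct toy_pair *t_scm_word;      /* what a Location holds */
struct toy_pair { t_scm_word car; t_scm_word cdr; };
typedef struct toy_arena *t_scm_arena;    /* opaque, unused in this toy */

/* Store the CAR of the value in location `pair' into location `result'.
 * The function returns void, so the caller never holds the value in an
 * anonymous temporary between the call and the store.
 */
void
scm_car (t_scm_word *result, t_scm_arena arena, t_scm_word *pair)
{
  (void) arena;
  assert (result != NULL && pair != NULL && *pair != NULL);
  *result = (*pair)->car;
}
```

A caller writes `scm_car (&l.answer, arena, &l.pair)' and only then
looks at `l.answer'. Inside this toy primitive the value does pass
through a C expression, which is acceptable only because a primitive
is allowed to coordinate with GC internally.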
This rule affects some other idioms, too. You can't copy local
variables or parameters in a way that creates an rvalue.
So you can _not_ say:
l.tail = l.fast
or
*answer = l.tail
or
l.list = *parameter
but have to instead say:
SCM_LSET (&l.tail, &l.fast)
and
SCM_LSET (answer, &l.tail)
and
SCM_LSET (&l.list, parameter)
To better understand that, you can think of Scheme memory as being
abstractly described as a set of Locations, each of which holds a
single Scheme value. For example, a cons pair contains two
Locations. A local Scheme variable in your C code is a Location.
In the FFI, the type `t_scm_word' is, in essence, the type of
a Scheme Location.
So when you declare some local variables:
struct length_locals
{
SCM_FRAME;
t_scm_word slow;
t_scm_word fast;
t_scm_word tail;
} l;
SCM_PROTECT_FRAME (l);
[...]
what you're doing is creating some new Locations (slow, fast, tail,
in this case) which happen to be GC roots.
Everywhere in the FFI, Locations are referred to by _address_, not
name. So, for example, one of the locations created by that
declaration is called:
&l.slow
All of the primitives in the FFI operate on locations, not scheme
values. So:
SCM_LSET (to, from)
means, "copy the value in location `from' to location `to'"
and
scm_car (&l.answer, arena, pair)
means "store the CAR of the value in location `pair' in the
location `&l.answer'"
When you write a function and it takes `t_scm_word *' input
parameters, really, that means that it's taking as parameters
some locations to operate on. Its `t_scm_word *' output
parameters are the (possibly overlapping) locations where results
should be stored.
You could think of the FFI primitives as the instruction set of
a virtual machine that has no registers -- only locations. When
you write new functions in libscm, you're creating "macro
instructions" out of those primitives.
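As a sketch of such a "macro instruction", here is a list-tail
function built only from location-to-location operations. Everything
in it is a toy under stated assumptions: the macro bodies, the
pointer-valued `t_scm_word', the name `toy_list_tail', and
`SCM_UNPROTECT_FRAME' (assumed here as the counterpart of
`SCM_PROTECT_FRAME') are invented for illustration and are not Pika's
real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Pika's real definitions. */
typedef struct toy_pair *t_scm_word;
struct toy_pair { t_scm_word car; t_scm_word cdr; };
typedef struct toy_arena *t_scm_arena;

#define SCM_FRAME               int toy_frame_registered
#define SCM_PROTECT_FRAME(f)    ((f).toy_frame_registered = 1)
#define SCM_UNPROTECT_FRAME(f)  ((f).toy_frame_registered = 0)  /* assumed name */
#define SCM_LSET(to, from)      (*(to) = *(from))

/* Toy primitive: store the CDR of the value in `pair' into `result'.
 * Works when `result' and `pair' are the same location.
 */
void
scm_cdr (t_scm_word *result, t_scm_arena arena, t_scm_word *pair)
{
  (void) arena;
  assert (result != NULL && pair != NULL && *pair != NULL);
  *result = (*pair)->cdr;
}

/* "Macro instruction": store the N-th tail of the list in location
 * `list' into location `result'.  Only addresses of locations are
 * passed around; no t_scm_word is copied with plain `='.
 */
void
toy_list_tail (t_scm_word *result, t_scm_arena arena, t_scm_word *list, int n)
{
  struct tail_locals
  {
    SCM_FRAME;
    t_scm_word tail;
  } l;
  SCM_PROTECT_FRAME (l);

  SCM_LSET (&l.tail, list);              /* not: l.tail = *list */
  while (n-- > 0)
    scm_cdr (&l.tail, arena, &l.tail);   /* input and output overlap */
  SCM_LSET (result, &l.tail);            /* not: *result = l.tail */

  SCM_UNPROTECT_FRAME (l);
}
```

Note that `toy_list_tail' itself has the same shape as a primitive:
locations in, locations out, nothing returned by value.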
* Treat t_scm_word as an opaque type
This is really implied by the "no rvalues" rule, but it's worth
mentioning separately.
You mustn't write code like:
if (l.fast == l.slow) /* circular list? */
For one thing, you're using `t_scm_word' as an rvalue, violating the
rules above. But for another thing, you're treating `t_scm_word'
as a non-opaque type and assuming that it can be compared for
equality using `=='.
Instead, that has to be written:
if (scm_values_eq (arena, &l.fast, &l.slow)) /* circular list? */
which asks the FFI implementation to tell you if the values stored
in locations `&l.fast' and `&l.slow' are EQ?. Among other things,
that allows the equality test to be more complicated than just
`==' (as it might have to be in, say, an incrementally copying GC).
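Here is one way the test could be "more complicated", sketched for an
imaginary copying collector that leaves a forwarding pointer in the old
copy of an object. The object layout, `toy_resolve', and the forwarding
scheme are all assumptions for illustration; nothing here describes how
Pika actually represents values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical representation for a toy copying collector. */
struct toy_object
{
  struct toy_object *forward;   /* non-NULL once the object has been copied */
  int payload;
};
typedef struct toy_object *t_scm_word;
typedef struct toy_arena *t_scm_arena;

/* Follow forwarding pointers to the current copy of an object. */
static struct toy_object *
toy_resolve (struct toy_object *obj)
{
  while (obj != NULL && obj->forward != NULL)
    obj = obj->forward;
  return obj;
}

/* Nonzero if the values in locations `a' and `b' are the same object,
 * even when one location still refers to the old copy and the other
 * already refers to the new one.
 */
int
scm_values_eq (t_scm_arena arena, t_scm_word *a, t_scm_word *b)
{
  (void) arena;
  assert (a != NULL && b != NULL);
  return toy_resolve (*a) == toy_resolve (*b);
}
```

With this layout, two locations can hold different bit patterns for the
same object in the middle of a collection, so a caller's `l.fast ==
l.slow' would answer wrongly while `scm_values_eq' would not.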
Those rules are consistent with the goals:
~ GC can be precise
Because the GC always knows where all of the Locations are and
Values are stored only in Locations.
~ GC can be concurrent
~ GC can be asynchronous
Because Values are only ever stored in Locations -- never
in registers or "elsewhere".
~ GC can be copying
~ GC can be incremental
Because Values are only in Locations and Locations are never read
or written to directly. The GC is free to impose a read or write
barrier on Locations and to update them at any time.
~ The representation of values, t_scm_word, can be any C type
FFI-using code never does anything but take the address of a
Location. It doesn't care what C-type is used to represent that
Location.
Now the tricky part:
> Since the code in scm_make_false doesn't use SCM_LSET for setting
> l.bar, this leads me to the question if there shouldn't be a
> macro like SCM_RSET (for result-set) or if SCM_LSET can't be
> dropped...
No. You have to understand that the code of Pika is being split into
two layers:
everything else
---------------
reps
and their function is:
everything else:
use the "core FFI" and extend it
-----------------------------------
reps:
implement the "core FFI"
For example, you earlier fixed a bug by moving the declaration for
`t_scm_word' from the "everything else" part of the code to the "reps"
part where it belongs. That declaration is part of the
implementation of the core FFI.
SCM_LSET is another part of the core FFI -- defined in reps.
`scm_make_false' is a part of the core FFI.
The reps part, the core FFI, is designed to be "swappable". For
example, if you want to use a different GC, you can do that just by
modifying or replacing the reps layer -- all the other code remains
unchanged.
The current "reps" layer is being designed with the primary goals:
~ get it working quickly and easily
~ use it for bootstrapping the project
It _doesn't_ try to implement the core FFI in a way that is thread
safe or that has incremental GC. Internally, it doesn't have to
follow the strict rules about manipulating `t_scm_word' values.
So, for example, `scm_make_false' consists of just the code:
*result = scm_false;
because that's the simplest thing given the primary goals.
Later on, another version of `scm_make_false' in a fancier
implementation of REPS might look more like:
block_gc_tracer ();
*result = scm_false;
unblock_gc_tracer ();
or might look like:
mutator_yield ();
*result = scm_false;
or
*result = scm_false;
mutator_yield ();
or
suspicious_location_begin (result);
*result = scm_false;
suspicious_location_end (result);
or
SCM__REPS_STORE_IMMEDIATE (result, scm_false);
or
(who knows ...)
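To illustrate the "swappable" idea, here is a small sketch with two
toy reps layers selected at compile time. The switch
`TOY_REPS_WITH_BARRIER', the integer `t_scm_word', the encoding of
`scm_false', the predicate `scm_is_false', and the bodies of
`block_gc_tracer' / `unblock_gc_tracer' are all assumptions made up for
this example. The "everything else" function at the bottom is the same
source text under either layer.

```c
#include <assert.h>
#include <stddef.h>

/* ---- reps: toy implementation of a tiny "core FFI" (hypothetical) ---- */
typedef long t_scm_word;                 /* could be any C type */
typedef struct toy_arena *t_scm_arena;
static const t_scm_word scm_false = 0;   /* made-up encoding */

#ifdef TOY_REPS_WITH_BARRIER
static int toy_tracer_blocked = 0;
static void block_gc_tracer (void)   { toy_tracer_blocked = 1; }
static void unblock_gc_tracer (void) { toy_tracer_blocked = 0; }

void
scm_make_false (t_scm_word *result, t_scm_arena arena)
{
  (void) arena;
  block_gc_tracer ();
  *result = scm_false;
  unblock_gc_tracer ();
}
#else
/* Bootstrap layer: the simplest thing that works. */
void
scm_make_false (t_scm_word *result, t_scm_arena arena)
{
  (void) arena;
  *result = scm_false;
}
#endif

/* Hypothetical predicate, so callers never compare t_scm_word themselves. */
int
scm_is_false (t_scm_arena arena, t_scm_word *value)
{
  (void) arena;
  return *value == scm_false;
}

/* ---- everything else: unchanged whichever reps layer is compiled ---- */
int
toy_bar_is_false (t_scm_arena arena)
{
  struct { t_scm_word bar; } l;
  scm_make_false (&l.bar, arena);
  return scm_is_false (arena, &l.bar);
}
```

Only code above the "everything else" line touches `t_scm_word'
directly, which is why `scm_make_false' needs no SCM_RSET-style macro
of its own: it is part of the layer that defines what a store means.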
-t