pika-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Pika-dev] Re: a little hashq problem


From: Andreas Rottmann
Subject: [Pika-dev] Re: a little hashq problem
Date: Thu, 11 Mar 2004 11:33:49 +0100
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/21.3 (gnu/linux)

Tom Lord <address@hidden> writes:

> So, I'm writing up next-stage implementation plans and I noticed a 
> nasty bug.
>
> Currently, we use a copying GC -- so heap allocated values can change
> location in memory.
>
> Currently, we implement HASHQ as a function that returns a (mangled
> form of) an object's address in memory.
>
> Oops.  Bad combination.   (As I recall, the hashq implementation
> predates the copy collector and was written with the expectation that
> the first collector would be mark/sweep.)
>
This is probably also true for the current (preliminary) object lock
implementation, as in my --object-locks branch: To obtain a lock, I
take the objects heap address and hash that. 

It seems there should be general hooks in the GC, so subsystems such
as the object locks and hashq can deal with this in a general way and
we don't have to invade those subsystems with GC implementation
details.

As for how such hooks might look like, I don't really know. Tom?

> I see only two practical solutions:
>
> ~ make objects bigger
>
[...]

>
> ~ memoize hashes
>
>   Another idea is to keep a table of assigned hashq values, rebuilding
>   the table as objects are copied.
>
>   In other words, hashq would _internally_ record the hashq value of
>   an object in its own private hash table, keyed on object addresses.
>   For objects not previously hashq'ed, it could just make up a new
>   random value.
>
>   Copying an object during collection would involve:
>
>       ~ Using the old address to see if it has a hash value 
>           assigned already
>
>       ~ If so, recording that hash value keyed on the new address
>           in the new table.
>
>         ~ At the end of copying, discard the old table and switch 
>           to the new.
>
>From that description, there should be maybe three hooks:

* start of (copying) GC hook
* a per-object copy hook
* end of copying GC hook

> Of these two, I somewhat prefer the second, though I may be crazy for
> doing so.   My reasoning is that the hash check during copying can be
> made awefully fast --- approximately no worse than unconditionally
> copying the extra data if all objects grow in size.
>
This somewhat contradicts my hook idea, which imposes some overhead,
which we might want to avoid.

> At the same time, the memoized hashes approach can use considerably
> less memory in nearly all applications.
>
I agree that this a *really* worthwhile goal.

> Anyone care to work on that?
>
Given that my current object-lock implementation presumably has the
same problems as hashq, I might give it a try. We should first draft
out how to implement this, however.

Thinking about the hook overhead, there might be another possibility:
a facility to attach arbitrary data to a heap object, with moving this
data "along" as the object is copied. Of course, the attached data
doesn't have to be moved, but just updated to be keyed to the new
address. This might be then used for both hashq and object locks.

Something like:

~ int scm_create_object_data_area (t_scm_arena arena, int size);

Create a new data area of `size' bytes for objects. The return value
is the key to this data area.

~ void *scm_obtain_object_data_area (t_scm_arena arena, 
                                     t_scm_word * obj, int key);

Obtain the data area `key' for an object. The memory is allocated, if
not already present because of an earlier call.

~ void scm_release_object_data_area (t_scm_arena arena, t_scm_word * obj, 
                                     int key);

Release the data area `key' for an object. The memory will be freed.


For implementing this interface efficiently, someting like GLib's
memory chunks[0] would be very helpful to have in hackerlab. The above
interface is modeled somewhat after the GLib datasets[1].

I like the above interface quite a bit, now that I think about it,
since it can be used to manage all kinds of "object
properties". Thoughts?

[0] http://developer.gnome.org/doc/API/2.0/glib/glib-Memory-Chunks.html
[1] http://developer.gnome.org/doc/API/2.0/glib/glib-Datasets.html

Cheers, Andy
-- 
Andreas Rottmann         | address@hidden      | address@hidden | address@hidden
http://yi.org/rotty      | GnuPG Key: http://yi.org/rotty/gpg.asc
Fingerprint              | DFB4 4EB4 78A4 5EEE 6219  F228 F92F CFC5 01FD 5B62

Make free software, not war!




reply via email to

[Prev in Thread] Current Thread [Next in Thread]