
[Pika-dev] a little hashq problem


From: Tom Lord
Subject: [Pika-dev] a little hashq problem
Date: Wed, 3 Mar 2004 11:31:32 -0800 (PST)


So, I'm writing up next-stage implementation plans and I noticed a 
nasty bug.

Currently, we use a copying GC -- so heap allocated values can change
location in memory.

Currently, we implement HASHQ as a function that returns a (mangled
form of) an object's address in memory.

Oops.  Bad combination.   (As I recall, the hashq implementation
predates the copy collector and was written with the expectation that
the first collector would be mark/sweep.)

I see only two practical solutions:

~ make objects bigger

  One idea is to double the size of each object, storing a hashq value
  when the object is first allocated.  Pairs would be 4 words, for
  example.

  I notice that (in the version I have on hand) mzscheme's small
  objects (e.g., cons pairs) are already three-word structures and do
  indeed include a field in which to store a hash value.  So, this is
  not an unprecedented solution.


~ memoize hashes

  Another idea is to keep a table of assigned hashq values, rebuilding
  the table as objects are copied.

  In other words, hashq would _internally_ record the hashq value of
  an object in its own private hash table, keyed on object addresses.
  For objects not previously hashq'ed, it could just make up a new
  random value.

  Copying an object during collection would involve:

        ~ Using the old address to see if it has a hash value 
          assigned already

        ~ If so, recording that hash value keyed on the new address
          in the new table.

        ~ At the end of copying, discard the old table and switch 
          to the new.


Of these two, I somewhat prefer the second, though I may be crazy for
doing so.  My reasoning is that the hash check during copying can be
made awfully fast -- approximately no worse than unconditionally
copying the extra data if all objects grew in size.  At the same
time, the memoized-hashes approach can use considerably less memory in
nearly all applications.

Anyone care to work on that?

-t




