gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gcl-devel] Two word cons


From: Paul F. Dietz
Subject: Re: [Gcl-devel] Two word cons
Date: Wed, 18 May 2005 20:48:24 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050511

Camm,

  Judging by the space allocation numbers from TIME, objects
are also constrained to be an even number of words in Allegro CL
and in SBCL.  I think this is a common idea.

  If you have an extra field in some objects, consider uses for
it.  For example, we might want to precompute a hash key
for symbols (based on the package and symbol-name) if we're not
doing this already.  This could speed up SXHASH and EQ/EQL hash tables,
as well as CASE forms on symbols (switch on the hash key).

        Paul Dietz

Camm Maguire wrote:
[ background for gcl-devel readers -- I';ve recently returned from a
visit to UT/Austin where there are quite a few serious GCL users,
mostly as a base lisp behind ACL2.  I'll try to make a full report on
this visit soon.  One of the suggestions raised were the excessive
memory requirements of GCL, esp. its current 3 word cons. ]

Greetings!

Found a little time, and managed to get a two word cons working on
version 2.7.0 (CVS head).  Appears to be passing all tests with all GC
options, e.g. SGC, optimize-maximum-pages, etc.   There are doubtless
a few issues remaining somewhere.  But it looks quite doable.

I'm not sure the approach I took is best, so would like to consult.
I've taken advantage of the fact that even on 32bit machines, all our
structures are at least 2 words long, and can therefore be 8 byte
aligned, giving three mark bits.  The least significant indicates that
the whole word is a traditional type word, the next is the GC mark
bit, and the the next is a bit indicating the object is free.  This
allows basically all pointer indirection to proceed without masking as
pointers to be indirected in a cons only have bits set in the GC --
not that this is important from a performance point of view, but it
would take considerable work to rewrite the compiler to put the masks
in everywhere.  The only explicit masking-prior-to-indirection that
needs doing is in the GC (e.g. mark_cons), which is thankfully
well-localized.  This requires, however, that all odd word structures
on 32bit machines be padded by one word.  This is most wasteful for
the other three word structs (complex, ratio, ...) which are now 4
words long (on 32bit only), but my (completely unsubstantiated) hunch
is that this wastage is dwarfed by the cons savings.

Here is what (room) looks like (p means struct has been padded by one
word, m means trimmed by one word):

(2 words)                    m
   800/1352   30.2%         CONS FIXNUM SHORT-FLOAT CHARACTER RANDOM-STATE 
READTABLE SPICE
(10 words)                   p
   102/301    87.3%         SYMBOL
(14 words)                   p
     1/2      21.9%         PACKAGE
(8 words) p p p p p p 106/306 49.3% ARRAY HASH-TABLE VECTOR BIT-VECTOR STREAM PATHNAME CCLOSURE CLOSURE
(4 words)                    p                 p       p        p
   267/321     9.0%         STRUCTURE BIGNUM RATIO LONG-FLOAT COMPLEX CFUN
(6 words)                    p     p     p     p   p     p
    56/276    88.2%         SFUN STRING GFUN VFUN AFUN CFDATA

   612/768                1 contiguous (150 blocks)
       13107                hole
       5242    0.0%         relocatable

      1332 pages for cells
     20293 total pages
     99672 pages available
     11107 pages in heap but not gc'd + pages needed for gc marking
    131072 maximum pages


While some of these might be compressed further, there is definitely
no compression room for the 4 word structs, the most wasteful of which
is likely the structure structs.  In some 4 word cases, the compile of
course can inline the variables and pass them around on the stack.
Either I figure out how to live without a free object bit, or we
conclude that the cons load greatly dominates in all real world
situations, and that 64bit is the medium term future, where there is
only savings and no waste.

There is also a minor consequence for SGC.  SGC basically selects a
subset of pages to work with and marks the rest read-only.  All old
objects on the working set at the time sgc is turned on can never be
freed, as the mark might have to proceed via a read-only page, which
the algorithm skips for efficiency.  These SGC_NORMAL vs. SGC_RECENT
objects were designated by yet another bit in the type word.  16 byte
alignment is definitely too wasteful IMHO, so there is no room for
this on cons (only), in which case we only claim totally free pages for
sgc, and use the sgc page flag to effectively determine SGC_RECENT
cons from SGC_NORMAL.
Secondly, and perhaps more importantly, we discussed how ld.so puts
the shared libraries at 0x40000000 on Linux for example, limiting or
corrupting a big heap depending on the robustness of GCL's
algorithms.  The way around this appears to be via a linker script,
using a PT_LOAD entry in the program header to make a section taking
no ram or disk space but occupying and effectively reserving the
desired area.  I should have more information on this soon.

Take care,

"Warren A. Hunt Jr." <address@hidden> writes:


Hi Camm,

Here are some of the things we discussed.

Cheers,

Warren
++++++

                           Items Discussed

1.   Hash CONS (HONS).
  a. Weak Hash
  b. Randomize FIXNUM hashing
2.   Clear understanding of (ROOM T)
3.   Unbox FIXNUM or CONS for GC
4.   Threads
5.   Complier Emit Function Signatures and Boxing
6.   Upon function redefintion, flush all function properties
7.   Mutual Recursion
8.   Bigger FIXNUM, si::allocate-bigger-fixnum ?
9.   Bigger PageSize
10.   Fold xgcl into the standard build
11.   Replace #n# tables with dynamic tables
12.   Eliminate the compile-time maximum pages
13.   Place shared libraries elsewhere in memory
14.   Fix gethash and sethash code


enum httest {                   /*  hash table key test function  */
        htt_eq,                 /*  eq  */
        htt_eql,                /*  eql  */
        htt_equal               /*  equal  */
};

struct htent {                  /*  hash table entry  */
        object  hte_key;        /*  key  */
        object  hte_value;      /*  value  */
};

struct hashtable {              /*  hash table header  */
                FIRSTWORD;
        struct htent
                *ht_self;       /*  pointer to the hash table  */
        object  ht_rhsize;      /*  rehash size  */
        object  ht_rhthresh;    /*  rehash threshold  */
        // WAH,Jr. -- At creation and extension, recomput this next number.
        int     ht_int_thres    /*  Interger number of maxium entries */
        int     ht_nent;        /*  number of entries  */
        int     ht_size;        /*  hash table size  */
        short   ht_test;        /*  key test function  */
                                /*  of enum httest  */
};


struct htent *
gethash(key, hashtable)
object key;
object hashtable;
{
        enum httest htest;
        int hsize;
        struct htent *e;
        object hkey;
        int i=0, j = -1, k; /* k added by chou */
        bool b=FALSE;

        htest = (enum httest)hashtable->ht.ht_test;
        hsize = hashtable->ht.ht_size;

        // WAH,Jr. --  Make "/ 4" into ">> WORDSIZE_IN_BYTES"
        if (htest == htt_eq)
                i = (long)key / 4;
        // WAH,Jr. --  Pull out FIXNUM and CHARACTER tests
        else if (htest == htt_eql)
                i = hash_eql(key);
        else if (htest == htt_equal)
                i = ihash_equal(key,0);
        // WAH,Jr. --  Fix constant below.
        i &= 0x7fffffff;
        // WAH,Jr. --  Restructure with two simple loops, don't use MOD, don't 
need k
        for (i %= hsize, k = 0; k < hsize;  i = (i + 1) % hsize, k++) { /* k 
added by chou */
                e = &hashtable->ht.ht_self[i];
                hkey = e->hte_key;
                if (hkey == OBJNULL) {
                        if (e->hte_value == OBJNULL)
                                if (j < 0)
                                        return(e);
                                else
                                        return(&hashtable->ht.ht_self[j]);
                        else
                                if (j < 0)
                                        j = i;
                                else if (j==i)
                                  /* this was never returning --wfs
                                     but looping around with j=0 */
return(e) ;
                        continue;
                }
        // WAH,Jr. --  Eliminate these tests each time around the loop.
                if (htest == htt_eq)
                        b = key == hkey;
                else if (htest == htt_eql)
                        b = eql(key, hkey);
                else if (htest == htt_equal)
                        b = equal(key, hkey);
                if (b)
                        return(&hashtable->ht.ht_self[i]);
        }
        return(&hashtable->ht.ht_self[j]);       /* added by chou */
}


static void
extend_hashtable(object);

void
sethash(key, hashtable, value)
object key, hashtable, value;
{
        int i;
        bool over=FALSE;
        struct htent *e;
        
        i = hashtable->ht.ht_nent + 1;
        // WAH,Jr.  Test for excess size by simple integer comparison.
        if (type_of(hashtable->ht.ht_rhthresh) == t_fixnum)
                over = i >= fix(hashtable->ht.ht_rhthresh);
        else if (type_of(hashtable->ht.ht_rhthresh) == t_shortfloat)
                over =
                i >= hashtable->ht.ht_size * sf(hashtable->ht.ht_rhthresh);
        else if (type_of(hashtable->ht.ht_rhthresh) == t_longfloat)
                over =
                i >= hashtable->ht.ht_size * lf(hashtable->ht.ht_rhthresh);
        if (over)
                extend_hashtable(hashtable);
        e = gethash(key, hashtable);
        if (e->hte_key == OBJNULL)
                hashtable->ht.ht_nent++;
        e->hte_key = key;
        e->hte_value = value;
}
        
static void
extend_hashtable(hashtable)
object hashtable;
{
        object old;
        int new_size=0, i;

        if (type_of(hashtable->ht.ht_rhsize) == t_fixnum)
new_size = hashtable->ht.ht_size + fix(hashtable->ht.ht_rhsize);
        else if (type_of(hashtable->ht.ht_rhsize) == t_shortfloat)
new_size = hashtable->ht.ht_size * sf(hashtable->ht.ht_rhsize);
        else if (type_of(hashtable->ht.ht_rhsize) == t_longfloat)
new_size = hashtable->ht.ht_size * lf(hashtable->ht.ht_rhsize);
        {BEGIN_NO_INTERRUPT;    
        old = alloc_object(t_hashtable);
        old->ht = hashtable->ht;
        vs_push(old);
        hashtable->ht.ht_self = NULL;
        hashtable->ht.ht_size = new_size;
        if (type_of(hashtable->ht.ht_rhthresh) == t_fixnum)
                hashtable->ht.ht_rhthresh =
                make_fixnum(fix(hashtable->ht.ht_rhthresh) +
                            (new_size - old->ht.ht_size));
        hashtable->ht.ht_self =
        (struct htent *)alloc_relblock(new_size * sizeof(struct htent));
        for (i = 0;  i < new_size;  i++) {
                hashtable->ht.ht_self[i].hte_key = OBJNULL;
                hashtable->ht.ht_self[i].hte_value = OBJNULL;
        }
        for (i = 0;  i < old->ht.ht_size;  i++) {
                if (old->ht.ht_self[i].hte_key != OBJNULL)
                        sethash(old->ht.ht_self[i].hte_key,
                                hashtable,
                                old->ht.ht_self[i].hte_value);
        }
        hashtable->ht.ht_nent = old->ht.ht_nent;
        vs_popp;
        END_NO_INTERRUPT;}
}










reply via email to

[Prev in Thread] Current Thread [Next in Thread]