[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gcl-devel] hashing

From: Henry Baker
Subject: Re: [Gcl-devel] hashing
Date: Thu, 14 Nov 2013 09:40:45 -0800

The best hashing performance requires deep understanding of the
virtual memory address translation mechanism (aka translation
lookaside buffer "TLB").  This is because the translation of
virtual memory addresses to real memory addresses _already
uses a builtin hardware hash table_.  You might as well take
advantage of this hardware help when accessing large hash



Hashing performance is also dependent upon the cache line size.

It is sometimes possible to get better performance by using
several levels of tables, if you can be assured that the top
levels are essentially always in the cache and in the TLB.

Also, overflows can be very fast _so long as the overflow
scan occurs within the same cache line_.

I presume that there are academic papers that address (!) this
particular subject, but if not, a couple of days worth of
experiments might be very educational.

Common Lisp's SETF mechanism for installing items into a hash
table has always been at odds with the best possible performance
for hash tables in which a large % of the items will _never_ be
accessed -- i.e., a _write-mostly_ hash table.  The reason is
that what you really want is an insert-if-not-already-there
function which performs only one hashing operation.

Also, Common Lisp severely crippled their built-in hash tables
by not allowing full generality with a user-specified equality
operator.  As a result, I typically use Common Lisp hash tables
for prototyping, but almost invariably have to write my own routines
for the highest possible performance.

It would also behoove Common Lisp implementors to allow the
_inlining_ of hash table lookup & store operations.  This would
remove one more performance penalty when using the CL standard
hash table operations.

At 08:25 AM 11/14/2013, Camm Maguire wrote:
>Greetings!  This is just a note for posterity on some of the recent
>learnings regarding hashing in GCL.  Except for symbol lookups in
>packages, gcl uses 'open addressing' hash tables, as is customary for a
>one word stored 'value', i.e. a lisp object pointer.  With this
>strategy, collisions result in a linear probe of the table until either
>an empty slot or the desired key is found.  The negative performance
>effects of the collision therefore occur not only for keys hashing to
>the same index, but for those hashing to *adjacent* positions.  Using a
>cons address directly as an index will thus suffer, as conses are
>frequently allocated in bulk in a contiguous region of memory.
>The current solution relies on our 'universal' hash function, i.e. a 256
>wide table of random 64bit numbers indirected by bytes of the key and
>xored together, to not only distribute the hashed items uniformly
>throughout the table, but also to minimize the possibility of
>consecutive indices.  I think these criteria should be fulfilled at
>present, though there are explicit algorithmic solutions to the
>consecutive index problem which do not impinge on the hash function.
>These require more memory and complexity, and seem unwarranted at
>What we were seeing with the recent performance degradation was a strong
>dependence on the data layout as represented by the addresses in use.
>As an historical aside we should note that GCL did randomize these
>addresses in the past, but it was found to be 'too good' in the sense
>that there were apparently ideal memory layouts which could be indexed
>faster if the addresses were used directly and the randomization
>skipped.  I believe this discussion was in an ACL2 context some years
>ago.  This is just to note that we are now 'reverting that reversion',
>and opting for a good average performance which (should be) independent
>of data layout.
>Take care,
>Camm Maguire                                        address@hidden
>"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

reply via email to

[Prev in Thread] Current Thread [Next in Thread]