gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gcl-devel] hash redux


From: Jared C. Davis
Subject: Re: [Gcl-devel] hash redux
Date: Mon, 11 Nov 2013 16:18:24 -0600

Hi,

I believe Matt is correct that the only use of EQUAL hash tables in
the (h) part of ACL2(h) is for string hashing.  In fact, for the most
part, in a single-threaded context, I think there should typically be
just a single string hash table.

At the relevant part of your benchmark, you might run (hons-summary)
to see the size and count of this table, in case that's helpful.  Or
if you want to get your hands on the hash table to really take a deep
look at it, you can try, e.g.,:

ACL2 !>(hons "foo" "bar")
("foo" . "bar")
ACL2 !>:q
:q

Exiting the ACL2 read-eval-print loop.  To re-enter, execute (LP).
? (hl-hspace-str-ht *default-hs*)
#<HASH-TABLE :TEST EQUAL size 2/1000 #x30200EA5441D>

Cheers,
Jared

On Mon, Nov 11, 2013 at 3:21 PM, Matt Kaufmann <address@hidden> wrote:
> Hi, Camm --
>
> That's interesting, but I'm confused, and I'm definitely not an expert
> on hash tables.  I looked at the files that implement the "(h)" part
> of ACL2(h), which is almost certainly what is involving hash tables,
> and it looks to me like maybe the only 'equal hash tables are for
> strings.
>
> I'm forwarding this to Jared, since he is the most recent author of
> that code (plus, you mention him as helping with potentially related
> reader issues), in case he has time to shed light on this.
>
> -- Matt
>    From: Camm Maguire <address@hidden>
>    Date: Mon, 11 Nov 2013 12:49:41 -0500
>
>    Greetings!
>
>    Just a followup -- the remaining time appears to be in sethash for an
>    'equal hash-table:
>
>    
> =============================================================================
>    index % time    self  children    called     name
>                                 103979625             sethash [1]
>    [1]     84.2    2.11   49.03       0+103979625 sethash [1]
>                   22.58    6.16 167566772/167566772     fShash_equal [2]
>                    0.00   20.28  119656/131885      alloc_relblock [6]
>                    0.01    0.00  119656/205048      alloc_object [47]
>                                 103979625             sethash [1]
>    -----------------------------------------------
>                   22.58    6.16 167566772/167566772     sethash [1]
>    [2]     47.3   22.58    6.16 167566772         fShash_equal [2]
>                    5.25    0.00 363849475/363849475     hash_eql [12]
>                    0.91    0.00 1174935219/1174940911     eql1 [18]
>                    0.00    0.00      12/2577623     Fand <cycle 2> [151]
>    -----------------------------------------------
>                    0.29    5.78       3/14          make_cons [9]
>                    1.06   21.19      11/14          alloc_relblock [6]
>    [3]     46.6    1.35   26.97      14         GBC [3]
>                   26.93    0.00 25304834/25331171     sgc_mark_object1 <cycle 
> 1> [5]
>    
> =============================================================================
>
>    This is somewhat remarkable, as the 'eql gethash calls which greatly
>    dominate in number are no longer on the radar.  Presumably the algorithm
>    makes some complex cons, (definitely not your grandmother's '(1 2 3)
>    list), uses an 'equal hashtable to make it equal-unique, and then uses
>    that as a key in an 'eql hashtable for the real heavy work.
>
>    This just reminded me of the work we did earlier regarding the loading
>    of complex conses in compiled files, which overloaded the #= reader
>    until we memoized the routine calculating the hash-equal index.  This is
>    barely necessary to the gcl compiler -- the point is to catch errors
>    where the constant list to be compiled in changes during compilation.
>    And as I indicated earlier, we flush the memoizing hash tables on each
>    compile-file.  This, together with the implementation of the 'hybrid' #=
>    algorithm suggested by Jared, made the loading of these conses very
>    fast.
>
>    My question is if we've learned anything which might make the above
>    results yet faster.  By default, the hash-equal index descends no more
>    than three levels, car and cdr, into a cons to xor up the index.  It
>    does not attempt to descend the entire structure memoizing as one goes
>    like the compiler version.  There the depth limit is much greater (1000)
>    due to its purpose and the absence of any table.  My intuition tells me
>    that there is no way a memoized version of the generic hash-equal would
>    pay off.  It seems we would have to flush on each call, or never.  It
>    would only speed up index calculations of great depth, which is only
>    useful in hash tables if your index is insufficiently random at the
>    default depth of 3.  This does not appear the case, as #'equal itself is
>    absent from the profiling report, implying that the hit rate to the
>    index is good.
>
>    I suppose an 'equal hashtable could keep an 'eq hashtable internally for
>    the life of the table.  That might be interesting.
>
>    In any case, I don't want to waste a lot of time reinventing some
>    wheel.  If you or any of the other hashtable experts have some wisdom
>    here, I'd be most appreciative.
>
>    Take care,
>    --
>    Camm Maguire                                        address@hidden
>    ==========================================================================
>    "The earth is but one country, and mankind its citizens."  --  Baha'u'llah
>
>
> _______________________________________________
> Gcl-devel mailing list
> address@hidden
> https://lists.gnu.org/mailman/listinfo/gcl-devel



-- 
Jared C. Davis <address@hidden>
11410 Windermere Meadows
Austin, TX 78759
http://www.cs.utexas.edu/users/jared/



reply via email to

[Prev in Thread] Current Thread [Next in Thread]