gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gcl-devel] hash redux


From: Matt Kaufmann
Subject: Re: [Gcl-devel] hash redux
Date: Mon, 11 Nov 2013 16:49:12 -0600

P.S. I wonder if def-gl-thm clears hash tables.  But even if it does,
I wouldn't expect it that particular form to hash a lot of strings.

-- Matt
   Date: Mon, 11 Nov 2013 16:44:54 -0600
   From: Matt Kaufmann <address@hidden>
   Cc: address@hidden

   Hi, Jared and Camm --

   I ran the experiment you suggested, Jared (thanks for the suggestion).
   In books/centaur/gl/:

   (ubt! 1)
   (include-book "portcullis")
   (rebuild "solutions.lisp" t)
   (u)
   (time$ (def-gl-thm 1f
     :hyp (and (unsigned-byte-p 3000 x)
               (unsigned-byte-p 3000 y))
     :concl (equal (+ x y) (+ y x))
     :g-bindings (gl::auto-bindings (:mix (:nat x 3000)
                                          (:nat y 3000)))))

   That took 78 seconds (a very nice improvement!).  Then:

   ACL2 !>:q

   Exiting the ACL2 read-eval-print loop.  To re-enter, execute (LP).
   ACL2>(hons-summary)

   Normed Objects Summary

    - NIL-HT:                     4 count,           5,000 size ( 0.08% full)
    - CDR-HT:             9,071,974 count,      12,974,622 size (69.92% full)
    - CDR-HT-EQL:                 0 count,           1,000 size ( 0.00% full)
    - STR-HT:                     1 count,           1,000 size ( 0.10% full)
    - PERSIST-HT:                 0 count,             100 size ( 0.00% full)
    - FAL-HT:                     0 count,           1,000 size ( 0.00% full)

   NIL

   ACL2>(hl-hspace-str-ht *default-hs*)

   #<hash-table 0000000004e06af0>

   ACL2>

   (I did some searching and did find another 'equal hash table besides
   that str-ht, namely; *hcomp-book-ht*, but it's quite small and not
   relevant here.)

   So I'm again stumped, since the cdr-ht is, I think, an 'eq hash
   table.

   Camm, is there a way to identify the callers that are setting a hash
   table with test 'equal?  The profile you sent seems to be at the level
   of C, so I don't know what to trace.

   -- Matt
      From: "Jared C. Davis" <address@hidden>
      Date: Mon, 11 Nov 2013 16:18:24 -0600
      Cc: Camm Maguire <address@hidden>,
              "address@hidden" <address@hidden>

      Hi,

      I believe Matt is correct that the only use of EQUAL hash tables in
      the (h) part of ACL2(h) is for string hashing.  In fact, for the most
      part, in a single-threaded context, I think there should typically be
      just a single string hash table.

      At the relevant part of your benchmark, you might run (hons-summary)
      to see the size and count of this table, in case that's helpful.  Or
      if you want to get your hands on the hash table to really take a deep
      look at it, you can try, e.g.,:

      ACL2 !>(hons "foo" "bar")
      ("foo" . "bar")
      ACL2 !>:q
      :q

      Exiting the ACL2 read-eval-print loop.  To re-enter, execute (LP).
      ? (hl-hspace-str-ht *default-hs*)
      #<HASH-TABLE :TEST EQUAL size 2/1000 #x30200EA5441D>

      Cheers,
      Jared

      On Mon, Nov 11, 2013 at 3:21 PM, Matt Kaufmann <address@hidden> wrote:
      > Hi, Camm --
      >
      > That's interesting, but I'm confused, and I'm definitely not an expert
      > on hash tables.  I looked at the files that implement the "(h)" part
      > of ACL2(h), which is almost certainly what is involving hash tables,
      > and it looks to me like maybe the only 'equal hash tables are for
      > strings.
      >
      > I'm forwarding this to Jared, since he is the most recent author of
      > that code (plus, you mention him as helping with potentially related
      > reader issues), in case he has time to shed light on this.
      >
      > -- Matt
      >    From: Camm Maguire <address@hidden>
      >    Date: Mon, 11 Nov 2013 12:49:41 -0500
      >
      >    Greetings!
      >
      >    Just a followup -- the remaining time appears to be in sethash for an
      >    'equal hash-table:
      >
      >    
=============================================================================
      >    index % time    self  children    called     name
      >                                 103979625             sethash [1]
      >    [1]     84.2    2.11   49.03       0+103979625 sethash [1]
      >                   22.58    6.16 167566772/167566772     fShash_equal [2]
      >                    0.00   20.28  119656/131885      alloc_relblock [6]
      >                    0.01    0.00  119656/205048      alloc_object [47]
      >                                 103979625             sethash [1]
      >    -----------------------------------------------
      >                   22.58    6.16 167566772/167566772     sethash [1]
      >    [2]     47.3   22.58    6.16 167566772         fShash_equal [2]
      >                    5.25    0.00 363849475/363849475     hash_eql [12]
      >                    0.91    0.00 1174935219/1174940911     eql1 [18]
      >                    0.00    0.00      12/2577623     Fand <cycle 2> [151]
      >    -----------------------------------------------
      >                    0.29    5.78       3/14          make_cons [9]
      >                    1.06   21.19      11/14          alloc_relblock [6]
      >    [3]     46.6    1.35   26.97      14         GBC [3]
      >                   26.93    0.00 25304834/25331171     sgc_mark_object1 
<cycle 1> [5]
      >    
=============================================================================
      >
      >    This is somewhat remarkable, as the 'eql gethash calls which greatly
      >    dominate in number are no longer on the radar.  Presumably the 
algorithm
      >    makes some complex cons, (definitely not your grandmother's '(1 2 3)
      >    list), uses an 'equal hashtable to make it equal-unique, and then 
uses
      >    that as a key in an 'eql hashtable for the real heavy work.
      >
      >    This just reminded me of the work we did earlier regarding the 
loading
      >    of complex conses in compiled files, which overloaded the #= reader
      >    until we memoized the routine calculating the hash-equal index.  
This is
      >    barely necessary to the gcl compiler -- the point is to catch errors
      >    where the constant list to be compiled in changes during compilation.
      >    And as I indicated earlier, we flush the memoizing hash tables on 
each
      >    compile-file.  This, together with the implementation of the 
'hybrid' #=
      >    algorithm suggested by Jared, made the loading of these conses very
      >    fast.
      >
      >    My question is if we've learned anything which might make the above
      >    results yet faster.  By default, the hash-equal index descends no 
more
      >    than three levels, car and cdr, into a cons to xor up the index.  It
      >    does not attempt to descend the entire structure memoizing as one 
goes
      >    like the compiler version.  There the depth limit is much greater 
(1000)
      >    due to its purpose and the absence of any table.  My intuition tells 
me
      >    that there is no way a memoized version of the generic hash-equal 
would
      >    pay off.  It seems we would have to flush on each call, or never.  It
      >    would only speed up index calculations of great depth, which is only
      >    useful in hash tables if your index is insufficiently random at the
      >    default depth of 3.  This does not appear the case, as #'equal 
itself is
      >    absent from the profiling report, implying that the hit rate to the
      >    index is good.
      >
      >    I suppose an 'equal hashtable could keep an 'eq hashtable internally 
for
      >    the life of the table.  That might be interesting.
      >
      >    In any case, I don't want to waste a lot of time reinventing some
      >    wheel.  If you or any of the other hashtable experts have some wisdom
      >    here, I'd be most appreciative.
      >
      >    Take care,
      >    --
      >    Camm Maguire                                        address@hidden
      >    
==========================================================================
      >    "The earth is but one country, and mankind its citizens."  --  
Baha'u'llah
      >
      >
      > _______________________________________________
      > Gcl-devel mailing list
      > address@hidden
      > https://lists.gnu.org/mailman/listinfo/gcl-devel



      -- 
      Jared C. Davis <address@hidden>
      11410 Windermere Meadows
      Austin, TX 78759
      http://www.cs.utexas.edu/users/jared/


   _______________________________________________
   Gcl-devel mailing list
   address@hidden
   https://lists.gnu.org/mailman/listinfo/gcl-devel




reply via email to

[Prev in Thread] Current Thread [Next in Thread]