chicken-hackers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Chicken-hackers] [PATCH] Improve performance by skipping mutations on n


From: Peter Bex
Subject: [Chicken-hackers] [PATCH] Improve performance by skipping mutations on non-stack pointers
Date: Sat, 5 Apr 2014 21:19:21 +0200
User-agent: Mutt/1.4.2.3i

Hi all,

I happened to be working on some benchmarks for the Postgres egg and due
to random reasons started digging into why (time) was showing mutations in
the GC even if I wasn't mutating anything.  I still haven't figured that
one out, but I stumbled across a nice performance boost that I wanted to
share with you before continuing.

The idea of the mutation stack (as you all know by now after reading my
blog post about the GC :P) is that it is needed to determine the live
values to copy from the stack (nursery) when performing a minor GC.
This means that only mutations to objects living in the heap (or
elsewhere like forwarding table, literal frames, symbol table,
"collectibles" or GC roots) which get nursery objects written to
their slots will need to be tracked (and eventually, copied).  It
turns out that the current implementation tracks *every* single
mutation, even if it's several mutations to the same slot of an object!

If mutations occur on objects in the stack, this means the target
objects will get copied even if the mutated object isn't copied because
it is not "live".  This causes more copying and also drives up the total
memory use, which increases GC pressure, resulting in possibly more
major GCs.

I experimented with deduplicating mutations but the check itself caused
too much overhead compared to the wins of not needing to traverse the
object more than once during GC, so that wasn't much of a win.  However,
a simple check whether the object being written to lives outside of the
nursery, and the object being written to the slot lives in the nursery
resulted in rather nice performance wins.

For example, a quick uri-generic microbenchmark:

Before:
#;1> (use uri-generic)
#;2> ,t (let loop ((i 0)) (unless (> i 10000) (uri-reference 
"http://call-with-current-continuation.org/foo/bar?mooh#qux";) (loop (add1 i))))
0.296s CPU time, 0.02s GC time (major), 1320215 mutations, 16/2170 GCs 
(major/minor)

After:
#;1> (use uri-generic)
#;2> ,t (let loop ((i 0)) (unless (> i 10000) (uri-reference 
"http://call-with-current-continuation.org/foo/bar?mooh#qux";) (loop (add1 i))))
0.268s CPU time, 0.004s GC time (major), 1320215/5934 mutations 
(total/tracked), 10/2176 GCs (major/minor)

The total/tracked mutations show the number of mutations and the
number of mutations which need to be tracked because they point
from the heap into the nursery.  You can also see that the GC time
is reduced to a fifth of its original time (which is due to the
big difference in the number of tracked mutations and the
fact less major GCs are needed).

The difference in the "deriv" and "dderiv" benchmarks is even more
dramatic:

Before:

$ csi -s dderiv.scm
0.436s CPU time, 0.12s GC time (major), 1200076 mutations, 263/2105 GCs 
(major/minor)
$ csi -s deriv.scm
0.408s CPU time, 0.112s GC time (major), 1200076 mutations, 242/2187 GCs 
(major/minor)

After:
$ csi -s dderiv.scm
0.244s CPU time, 0.004s GC time (major), 1200076/11833 mutations 
(total/tracked), 24/2344 GCs (major/minor)
$ csi -s deriv.scm
0.244s CPU time, 0.004s GC time (major), 1200076/12164 mutations 
(total/tracked), 23/2406 GCs (major/minor)

See the attached log files for the full chicken-benchmark differences.
It would be cool if people could try this on performance-critical
production code, to see how much of a difference it makes in the Real
World(TM).

I'm not 100% sure, but this patch might be simple enough that we may
want to roll this into the 4.9.0 release cycle, after it has been
double-checked by Salmonella, of course!  On the other hand, we could
also try to reduce the time between 4.9.0 and 4.10.0, so it doesn't take
as insanely long as the time between 4.8.0 and 4.9.0

I am unsure what to do with the GC mutation hook, though.  My patch keeps
the hook invoked for mutations on all non-immediate objects, even if they
will eventually not get tracked.  Perhaps this should be moved down
below the new check?

Cheers,
Peter
-- 
http://www.more-magic.net

Attachment: 0001-Improve-GC-performance-by-avoiding-tracking-of-nurse.patch
Description: Text document

Attachment: original.log
Description: Text document

Attachment: with-nursery-check.log
Description: Text document


reply via email to

[Prev in Thread] Current Thread [Next in Thread]