chicken-hackers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Chicken-hackers] [PATCH] Fix performance bottleneck in compiling large


From: Peter Bex
Subject: [Chicken-hackers] [PATCH] Fix performance bottleneck in compiling large files, add profiling option
Date: Sat, 21 Jan 2012 20:38:39 +0100
User-agent: Mutt/1.4.2.3i

Hi all,

I'm currently still in the process of getting the "numbers" tests to
compile in reasonable time.  To do this, I managed to find one part in
Chicken's compilation process where it's slow thanks to Felix's hint
to use -debug p.

To debug this, I created a small script that builds a file with an
increasing number of toplevel forms (see measure-compilation.scm).
It appears that simply putting (print #t) in a file a lot of times
simulates a large toplevel pretty well and triggers the exponential
behaviour reliably.  This is visualized by the green line in the
file "comparison.png".

Afterwards, I added the option to selectively turn on profiling for
parts in Chicken itself to the Makefiles (see patch 0001 or changeset
57fff73344771739e7723e28df0c0019058a0268 in the sjamaan-pending branch).
This helped me analyze the performance by simply profiling Chicken while
it compiled a test file containing 1000 toplevel expressions.

Turns out that the "k1000 in k999 in k998 in ..." comments that Chicken
generates in the C file are nested rather deeply when there's a long
procedure or a lot of toplevel forms, which causes the bottleneck to
appear in ##compiler#real-name.  The notes.txt file shows this quite
obviously, it has a short heading of chicken-profile output.

Instead of using sprintf and allocating a new string and copying over
every time, I've changed real-name to use build up a list and use
string-intersperse to join them together at the end (see patch 0002 or
changeset 27dbaf0440617fa92a33c592aba9d52b72a95117 in my branch).
This causes compilation time to grow a lot less hard (see the red line
in comparison.png) and compilation time to be more than halved when
compiling the 1000-form file, but real-name is still dominating the
procedure-calls which makes sense since there are places where the
comment is several standard 80x25 screenfulls.  Perhaps these comments
should be cut off at a certain length?

Unfortunately, the compilation time is still exponential on the input,
but with a much smaller growth factor (you can see this more clearly in
string-intersperse.png).

Also, the compilation time for the numbers egg's tests is still close to
7 minutes (this change shaved off about half a minute).  Turns out that the
numbers test is sufficiently more complex than the simple tests you see
here, so the main bottleneck is probably elsewhere (I can tell because the
time spent in the other phases is now higher than the time spent in
the code generation phase).

I'll dig in deeper and try to work out where it's being slow, but I
didn't want to withhold these patches and insights from y'all :)

Cheers,
Peter
-- 
http://sjamaan.ath.cx
--
"The process of preparing programs for a digital computer
 is especially attractive, not only because it can be economically
 and scientifically rewarding, but also because it can be an aesthetic
 experience much like composing poetry or music."
                                                        -- Donald Knuth

Attachment: 0001-Add-option-to-enable-profiling-more-easily-for-speci.patch
Description: Text document

Attachment: 0002-Improve-performance-by-not-using-sprintf-to-build-co.patch
Description: Text document

Attachment: comparison.png
Description: PNG image

Attachment: measure-compilation.scm
Description: Text document

Attachment: notes.txt
Description: Text document

Attachment: string-intersperse.png
Description: PNG image


reply via email to

[Prev in Thread] Current Thread [Next in Thread]