[Tinycc-devel] Huge swings in cache performance

Hello everyone,

Reminder/Background: C::Blocks is my Perl wrapper around my fork of tcc with extended symbol table support.

I've begun writing benchmarks to seriously test how C::Blocks compares with other JIT and JIT-ish options for Perl. I've noticed a couple of situations in which slight modifications to the code cause a huge drop in performance. One benchmark went from 370ms to 5,000ms (i.e. 5 sec).

The change to the code was so slight that I immediately suspected cache misses as the culprit. Running with Linux's "perf" command gave proof of that (hopefully this formats properly with fixed-width characters):

              Fast    Slow    Significant
time (ms)     370     5022    **
instructions  3.5B    3.5B
branches      640M    650M
branch-miss   687k    671k
dcache-miss   974k    71M     **
icache-miss   3.2M    83M     **

By dcache-miss I refer to what perf calls "L1 dcache load miss", and by icache-miss I refer to what perf calls "L1 icache load miss".

I'm a bit confused about what would cause this sort of persistent cache-miss behavior. I've tried several quite different strategies for managing executable memory. I ensured page alignment (which was the wrong target: it should be cache-line alignment, i.e. 64 bytes). That fixed a similar issue I observed previously, but didn't seem to improve the situation here. I used malloc instead of Perl's built-in memory allocator. I created a pool for executable memory so that multiple chunks of executable code would all be written to the same page in memory. EVEN THIS did not fix the issue, which really surprised me, since I would have thought adjacent memory would map to different cache lines.
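
To make the pool idea concrete, here is a minimal sketch in C of a bump allocator that hands out cache-line-aligned chunks from a single mapping. It is not the C::Blocks or tcc code: the names exec_pool, exec_pool_init, exec_pool_alloc and exec_pool_make_executable are made up for illustration, the 64-byte line size is assumed rather than queried, and it relies on POSIX mmap/mprotect.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define CACHE_LINE 64            /* assumed L1 line size, not queried from the CPU */

/* Hypothetical pool: one mapping, chunks handed out front to back. */
typedef struct {
    unsigned char *base;         /* start of the mapping (page-aligned by mmap) */
    size_t size;                 /* total bytes mapped */
    size_t used;                 /* bytes handed out so far, including padding */
} exec_pool;

/* Map `size` bytes of writable anonymous memory. Returns 0 on success, -1 on failure. */
static int exec_pool_init(exec_pool *p, size_t size)
{
    assert(p != NULL && size > 0);
    void *m = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;
    p->base = m;
    p->size = size;
    p->used = 0;
    return 0;
}

/* Return a chunk of `len` bytes starting on a cache-line boundary,
 * or NULL if the pool cannot hold it. */
static void *exec_pool_alloc(exec_pool *p, size_t len)
{
    /* Round the current offset up to the next multiple of CACHE_LINE. */
    size_t start = (p->used + CACHE_LINE - 1) & ~(size_t)(CACHE_LINE - 1);
    if (start > p->size || len > p->size - start)
        return NULL;
    p->used = start + len;
    return p->base + start;      /* base is page-aligned, so this is line-aligned */
}

/* Switch the whole pool to read+execute once code has been written.
 * Returns 0 on success, -1 on failure. */
static int exec_pool_make_executable(exec_pool *p)
{
    return mprotect(p->base, p->size, PROT_READ | PROT_EXEC);
}
```

A caller would run exec_pool_init once, copy each compiled chunk into the pointer returned by exec_pool_alloc, and call exec_pool_make_executable before jumping into the code. Because every chunk starts on its own line, two chunks never share a cache line, though they can still share a page.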

I believe that what I've found is an issue with tcc, but I haven't golfed it down to a simple libtcc-consuming example. I can do that, but wanted to see if anybody could think of an obvious cause, and fix, without going to such lengths. If not, I will see if I can write a small reproducible example.

Thanks!

David

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." -- Brian Kernighan

From:	David Mertens
Subject:	[Tinycc-devel] Huge swings in cache performance
Date:	Tue, 20 Dec 2016 08:16:08 -0500