[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Tinycc-devel] Re : Implementing a Global RegisterAllocator for TCC

From: david . koch
Subject: [Tinycc-devel] Re : Implementing a Global RegisterAllocator for TCC
Date: Sat, 6 Aug 2022 12:34:37 +0200 (CEST)

Nice, have a look at this repository :



----- Mail d'origine -----
De: Michael Richter <gruss.mir@gmx.net>
À: tinycc-devel@nongnu.org
Envoyé: Sat, 06 Aug 2022 08:35:18 +0200 (CEST)
Objet: [Tinycc-devel] Implementing a Global RegisterAllocator for TCC

Did anyone take note of this diplome thesis? It in german, but I guess
there are some germanspekaing here in the list. Author claims 30% speed
up generated code...

Thesis is from 2014, "Implementing a Global RegisterAllocator for TCC",
Sebastian Falbesoner TU Wien (Vienna)


Michael Richter

Complete abstract:

Register allocation is a long-standing research topic of computer
science that has been studied extensively over the last decades. Its
goal is to map a theoretically infinite number of program variables onto
a finite, small set of CPU registers during compilation. Eventhough
caches try to bridge the gap between register and memory access time,
keeping asmany values in registers as long as possible is crucial for
good performance. Hence registerallocation is still considered to be one
of the most important compiler optimizations.

The present diploma thesis describes the process of implementing a
register allocatorfor TCC, a small single-pass C compiler written in C.
While TCC is very fast (up to amagnitude faster thangcc -O0for the x86
architecture), it produces code that is quite inefficient – some naive
kind of register allocation is done only on the basis of statements.Our
goal of implementing register allocation done on the basis of whole
functions, that is,global register allocation, should hence result in a
notable performance increase.

As prerequesite for register allocation in TCC, a proper IR
(Intermediate Representation) needs to be generated in a first pass. On
top of this internal representation, livevariable analysis, register
allocation and finally the code generation is then performed. We
determine the variable live intervals with an extraordinarily simple
approach that avoidsthe usual costly data-flow analysis techniques – it
trades off the accuracy of the calculated intervals for higher execution
speed. For the register allocation we use Linear Scan, a strategy that
doesn’t abstract the problem to graph coloring, but rather takes a
simple approachinvolving the linear traversal of variable live
intervals. While linear scan is very fast, it is stated that the
generated code is almost as efficient as with graph coloring, making it
apopular choice for JIT compilers.As target machine, ARM, a classical
RISC architecture, now widespread especially inmobile phones and in many
other embedded systems, is chosen. Since data processing operands must
not reside in memory in thisLoad/Store architecture, register allocation
iseven more important.We determine the performance gain with various
test applications, predominantly fromthe benchmark suiteMiBench.
Additionally, we compare compile-time as well as run-timeperformance
with the widespread compilergcc.The execution time speedup of code
generated by our implementation is about 32% onaverage (compared to the
original TCC), while the compile-time is increased only marginallyand is
still an order of magnitude lower than forgccwithout any optimization.
Conse-quently, our register allocator implementation is an attractive
trade-off for dynamic codegeneration with TCC.

Aktuelle Lichtmaschinenkunst


reply via email to

[Prev in Thread] Current Thread [Next in Thread]