From: Elijah Stone
Subject: Re: [Tinycc-devel] manually inlining functions
Date: Fri, 30 Apr 2021 22:49:48 -0700 (PDT)
On Sat, 1 May 2021, Yakov wrote:
> On this sample using macros speeds the program up 400%
Be that as it may, it's not representative of most applications. For instance, CPython's performance increases by only 10-15% with the inliner turned on.
(And actually that's misleading, because inlining enables many other optimizations. The impact for tcc if it _only_ added inlining would probably be much less. Unfortunately gcc doesn't seem to be willing to inline at -O0.)
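For readers who have not seen the sample under discussion, here is a minimal sketch of what "manually inlining with macros" means. The helper names (clamp_fn, CLAMP, sum_clamped_fn, sum_clamped_macro) are hypothetical and are not taken from the original sample; the point is only that a compiler without an inliner emits a real call for the function version, while the macro version expands in place.

```c
/* Function version: a compiler with no inliner emits a call per use. */
static int clamp_fn(int x, int lo, int hi) {
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Macro version: expanded textually at each use, so no call is made.
 * Caveat: arguments may be evaluated more than once, so they must not
 * have side effects (CLAMP(i++, 0, 10) would be a bug). */
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Sum of the elements of v, each clamped to [0, 10], using the function. */
static long sum_clamped_fn(const int *v, int n) {
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += clamp_fn(v[i], 0, 10);
    return sum;
}

/* Same computation with the call replaced by the macro. */
static long sum_clamped_macro(const int *v, int n) {
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += CLAMP(v[i], 0, 10);
    return sum;
}
```

Both loops compute the same result; any speed difference comes from the call overhead in a tight loop, which is why a microbenchmark like this can overstate the benefit for a typical application.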
> I have recently read a paper about a Linear Scan Register Allocator[1]. They claim it gives you 95% of the performance of a Graph Coloring Register Allocator in basically no time, and requires no SSA.
Yes, linear scan is quite nice. It's not really compatible with tcc's compilation model--nor are most other optimizations, including inlining--but I mentioned it because it's probably the most worthwhile optimization a compiler can perform and it's not too difficult.
In the context of a compiler like gcc or llvm, linear scan takes almost no time at all. However, it depends on a certain model of the code that tcc does not currently provide. Gcc already produces such a model, even without optimizations, and linear scan takes advantage of the information which is already there; a big part of the reason why tcc is so fast is that it produces no such model. For gcc this is a sunk cost; for tcc, it is not.
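To make the dependency concrete, here is a minimal sketch of a linear scan allocator in C. It assumes the live intervals have already been computed, which is exactly the kind of model a single-pass compiler does not have on hand. The Interval struct, NUM_REGS, SPILLED and linear_scan are names invented for this illustration; this is not tcc code, nor necessarily the exact algorithm from the paper cited above.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_REGS 2      /* illustrative register count */
#define SPILLED  (-1)   /* marks an interval that lives on the stack */

typedef struct {
    int start, end;     /* live range, inclusive, in instruction indices */
    int reg;            /* assigned register, or SPILLED */
} Interval;

static int by_start(const void *a, const void *b) {
    const Interval *x = a, *y = b;
    return (x->start > y->start) - (x->start < y->start);
}

/* Assigns a register (or SPILLED) to each of the n intervals.
 * Sorts iv by start point in place. */
static void linear_scan(Interval *iv, int n) {
    Interval *active[NUM_REGS];         /* intervals currently in registers */
    int nactive = 0;
    int free_regs[NUM_REGS], nfree = 0; /* stack of free register numbers */

    for (int r = NUM_REGS - 1; r >= 0; r--)
        free_regs[nfree++] = r;

    qsort(iv, (size_t)n, sizeof *iv, by_start);

    for (int i = 0; i < n; i++) {
        Interval *cur = &iv[i];

        /* Expire intervals that ended before cur starts; free their registers. */
        int kept = 0;
        for (int j = 0; j < nactive; j++) {
            if (active[j]->end < cur->start)
                free_regs[nfree++] = active[j]->reg;
            else
                active[kept++] = active[j];
        }
        nactive = kept;

        if (nactive == NUM_REGS) {
            /* No register free: spill whichever interval ends last. */
            int far = 0;
            for (int j = 1; j < nactive; j++)
                if (active[j]->end > active[far]->end)
                    far = j;
            if (active[far]->end > cur->end) {
                cur->reg = active[far]->reg;   /* steal its register */
                active[far]->reg = SPILLED;
                active[far] = cur;
            } else {
                cur->reg = SPILLED;
            }
        } else {
            assert(nfree > 0);
            cur->reg = free_regs[--nfree];
            active[nactive++] = cur;
        }
    }
}
```

With two registers and the intervals [0,10], [1,3], [2,8], [4,6], the long-lived [0,10] interval is the one that gets spilled and the other three share the two registers. The allocation itself is a sort plus one pass; the expensive part is producing the intervals in the first place.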
-E