Re: [Tinycc-devel] RE :Re: inline functions


From:	Jared Maddox
Subject:	Re: [Tinycc-devel] RE :Re: inline functions
Date:	Mon, 16 Dec 2013 05:02:33 -0600
> Date: Sun, 15 Dec 2013 10:56:06 +0000 (GMT)
> From: Rob <address@hidden>
> To: address@hidden
> Subject: Re: [Tinycc-devel] RE :Re: inline functions
> Message-ID: <address@hidden>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
> On Wed, 11 Dec 2013, Jared Maddox wrote:
>
>>> Date: Wed, 11 Dec 2013 23:27:37 +0000 (GMT)
>>> From: Rob <address@hidden>
>>>

>> A flag won't help, I'm pretty sure this is an inherent limit of TCC.
>> The lack of stored parse information about functions means that the
>> information that TCC would need to perform "proper" inlining just
>> isn't available to the compiler when the inlining must be done,
>> because it isn't kept anywhere.
>
> Ah okay, I was thinking of testing for VT_INLINE and if extern and
> static aren't present, we skip that definition, and just don't do any
> inlining for now.
>

Yeah, as you might be guessing right now, "undefined reference" unless
it's also unused.
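
For context, here is a minimal sketch of the C99 rule behind that "undefined reference". A plain `inline` definition does not by itself provide an external symbol, so either one translation unit has to supply it (an `extern` declaration does that) or the function has to be `static`. The function names are made up for illustration.

```c
/* Plain inline definition: under C99 rules this alone provides no
 * external symbol, so a call that is not inlined could fail to link. */
inline int twice(int x) { return 2 * x; }

/* Declaring it extern in exactly one translation unit makes that unit
 * emit the external definition, which resolves such calls. */
extern inline int twice(int x);

/* static inline sidesteps the issue: each unit gets a private copy. */
static inline int thrice(int x) { return 3 * x; }
```

If a compiler simply skipped the first definition and nothing like the `extern` line existed anywhere, every remaining call to `twice` would be left unresolved at link time.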

>> There are a few ways to change this:
>>
>> 1) When I finish my current pet project (which is taking longer than I
>> planned, due to sluggishness on my part) I'm planning to write a
>> reference-loop-safe smart-pointer system in C, and use it to add a
>> "parse tree generator" backend to TCC. At that point a parse tree will
>> be trivially available, and the problem can be dealt with by a
>> function that receives said parse tree, before passing it on to a
>> parse tree -to- code stage. If you think that this feature is half-way
>> time critical, then don't wait for me to do this.
>>
>> 2) We could always cheat by coming up with some custom function format
>> based upon working "on top" of the caller's stack, and claim that
>> we've inlined in that way (we might even be able to do a copy-paste
>> into the correct location on some platforms).
>>
>> 3) We could come up with a "raw function" object format that describes
>> the calling convention, alignment requirements, and "patch
>> requirements" required to copy the enclosed inlineable function into
>> another function. Nesting of inline functions wouldn't work if it
>> hadn't already been done to the inlineable function, but no big
>> surprise there. Note that this could technically be done with a copy
>> of the source code & information about what globals it can see,
>> instead of with a proper object file.
>
> I'd be interested in your smart-pointer system, do you have a blog or
> anything we can follow?

I could say yes, but I'll say no, because it's been at least six
months since I used it, at which point it was wholly dedicated to the
kind of minutiae that technical schools have you post when you're in
the very early stages of their programs. As it is, the biggest blocker
is a software renderer pet project which I had intended to finish a
month or two ago, but let slip instead. The only thing that I recall
actually needing for the smart-pointers is a good search tree, and I
was figuring that I could get one up, and then macro-ed, within a week
once I started working on it. Everything else was just working out the
details. Anything of this sort will, of course, start with subtle
bugs, but given that I've had a desire for this sort of system for
years, it's basically a given that I'll be making it work, since the
parse tree will just be the first use of it, not the only one. I also
want to use it for a bencode parser, an MSCOM-knockoff (but with
better reference tracking), a custom language, and doubtless other
things as well.

> But yeah, I imagine getting tcc to inline
> functions will be quite the overhaul. You're brave!
>

Well, to be fair, I'm not going after inlining, or even any use of the
parse tree itself at all. An older list subject was on compiling C to
C to produce optimizations, which is, as far as I know, pretty
nonsensical. But I DID figure that the TCC parse system is simple
enough that I could understand it in a sensible amount of time, and
hence could use it to build a parse tree for anyone else to use, which
in turn would allow future projects of that sort to have a better
starting point: a parse tree.

> I wonder if it would be possible to combine your patching idea and just
> run the vstack manipulation that the inline function does, but on the
> current function's vstack. The difficulty would be getting symbol
> references correct, we'd have to figure out a nice way of mapping
> argument symbols onto the rvalues we pass into the inline function.
>

If I were doing it, here's how I'd approach it:
1) I'd figure out how the code needed to be generated so that it
didn't care what I did with it: we're talking about full-sized
pointers instead of some optimized forms, always assembly long-jumps
and never assembly short-jumps, and some way to correctly designate
stack positions DESPITE borrowing part of the stack space of another
function. The pointers shouldn't be hard (in fact, I assume they're
already like that), the jumps might require a bit of work but should
be solvable by temporarily throwing the tokens through a specialized
"inlines parser" that just changes the type of jump opcode issued
(short jumps are size-efficient, but on x86 they only reach about
-128 to +127 bytes), and the stack positions could be expressed via
either the stack pointer, or via a TWO-offsets-from-stack-base system.

2) Once you have the inlinable-assembly generator ready, you need your
meta-data. This consists of:

a) the base-address (within the "object file", which I probably would
implement as a file, albeit a temporary one), length, and alignment
requirements of the assembly (the only case I KNOW of where you'd need
the alignment is if we want to target Google NaCl in the future, but I
wouldn't be surprised at actual processors having requirements);

b) the length (in entries or bytes, whichever) of the patch table, and
a list of assembly-address/modifier/type triplets. Each triplet
specifies the address to be modified within the inlinable assembly
(assumed to be the same size as native pointers), a necessary piece of
data to be used in the modification (for an argument or an invocation
variable you'd need an offset), and the type of modification (i.e. is
the location an argument/variable, a thread-local, a function-static,
an offset to some location within the inlinable assembly itself,
something else that doesn't come to mind, etc.);

c) the string-table (string tables apparently have pretty standard
formats: they're seemingly always a set of C strings followed by an
extra null);

d) a table of patch-tables, indexed via the string table (i.e., the
first entry corresponds to the first string, etc.), where each
patch-table consists of the length, a type to apply to the whole
patch-table, and a list of assembly-address/modifier pairs.

Note that the first piece of meta-data that I described allows the
actual inlinable-assembly to be placed anywhere: I would probably
initially write it into a "scratch" file while the patch tables & such
were being written, and then just copy it in, and modify the base
address initially written to the "storage" file.

3) That having been done, all calls to the inline would then consist
of copying the assembly to the current object-file output, modifying
the appropriate locations according to the information in the
patch-tables as you go along. Note that there are certain to be plenty
of details: I haven't said whether you modify the argument accesses to
point to the locations where their originating values come from (note
that I actually oppose this: it strikes me as a bad idea), or instead
push copies of those values onto the stack as if you were actually
calling the function (note that this can actually be done via the
redirect-to-origin option that I'm suggesting against).
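
The meta-data from step 2 and the copy-and-patch pass from step 3 could be sketched roughly as follows. Everything here is an assumption made for illustration and not existing TCC code: the struct and function names, the two patch kinds, the pointer-sized patch slots, the x86 NOP used as padding, and the string-table layout (consecutive C strings ended by an empty one, as described above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical patch kinds; a real set would be larger. */
enum patch_type {
    PATCH_FRAME_SLOT,  /* argument/variable: modifier is a frame offset */
    PATCH_SELF_OFFSET  /* modifier is an offset inside the copied code */
};

/* One assembly-address/modifier/type triplet. */
struct patch {
    size_t code_off;    /* where in the blob the slot lives */
    intptr_t modifier;  /* data used by the modification */
    enum patch_type type;
};

/* Code bytes, length, alignment, and patch table of one inlinable. */
struct inline_blob {
    const uint8_t *code;
    size_t code_len;
    size_t align;       /* must be nonzero */
    const struct patch *patches;
    size_t n_patches;
};

/* Index of name in a string table laid out as consecutive C strings
 * terminated by an empty string, or -1 if it is not present. */
long strtab_find(const char *tab, const char *name)
{
    long idx = 0;
    while (*tab != '\0') {
        if (strcmp(tab, name) == 0)
            return idx;
        tab += strlen(tab) + 1;
        idx++;
    }
    return -1;
}

/* Copy the blob into out at the next aligned position at or after
 * *out_pos, then rewrite every patch slot. Returns 0 on success and
 * -1 on failure; on success *start_out is where the copy begins and
 * *out_pos is advanced past it. */
int emit_inline(uint8_t *out, size_t out_cap, size_t *out_pos,
                const struct inline_blob *b, intptr_t frame_base,
                size_t *start_out)
{
    size_t start, i;

    assert(b->align != 0);
    start = (*out_pos + b->align - 1) / b->align * b->align;
    if (start > out_cap || b->code_len > out_cap - start)
        return -1;
    memset(out + *out_pos, 0x90, start - *out_pos); /* x86 NOP padding */
    memcpy(out + start, b->code, b->code_len);

    for (i = 0; i < b->n_patches; i++) {
        const struct patch *p = &b->patches[i];
        intptr_t value;

        if (p->code_off > b->code_len ||
            sizeof value > b->code_len - p->code_off)
            return -1;
        switch (p->type) {
        case PATCH_FRAME_SLOT:
            value = frame_base + p->modifier;
            break;
        case PATCH_SELF_OFFSET:
            value = (intptr_t)start + p->modifier;
            break;
        default:
            return -1;
        }
        memcpy(out + start + p->code_off, &value, sizeof value);
    }
    *start_out = start;
    *out_pos = start + b->code_len;
    return 0;
}
```

The sketch takes the push-copies view of arguments: a `PATCH_FRAME_SLOT` entry just rebases an offset onto the caller's frame and never redirects an access to wherever the value originally came from.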

You also need some way to pop out a conventional form if its address
gets taken, of course, but I expect that you can do that with one bit
per inline function, and a final check to see if the bit has been set
(if it has, then spit out a relevant prologue, inline the code into
that function as if it had been called, pop out an epilogue, and
you're done).
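
A rough sketch of that one-bit bookkeeping, with hypothetical names throughout. The second `emitted` bit is my own addition so that the final check can be run more than once without producing duplicates; the prologue/body/epilogue emission itself is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Per-inline-function record; the names are illustrative only. */
struct inline_fn {
    const char *name;
    unsigned addr_taken : 1;  /* set when &fn is seen anywhere */
    unsigned emitted : 1;     /* conventional copy already produced */
};

/* Called wherever the compiler sees the function's address taken. */
void note_address_taken(struct inline_fn *f)
{
    assert(f != NULL);
    f->addr_taken = 1;
}

/* Final check: store in todo[] the indices of functions that still
 * need a conventional out-of-line copy and return how many there are.
 * todo must have room for n entries. */
size_t collect_out_of_line(struct inline_fn *fns, size_t n, size_t *todo)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (fns[i].addr_taken && !fns[i].emitted) {
            fns[i].emitted = 1;
            todo[count++] = i;
        }
    }
    return count;
}
```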

Note that this whole scheme does result in a new calling convention,
an "inline convention", which I would suggest deriving from the platform's
C calling convention. Also, I just remembered that return information
needs to be present in the inlinable-assembly as well. This should be
achievable with an appropriate entry in the symbol table (I'd suggest
against plain "return" as the symbol, since we might someday see
arbitrary-identifier features added to C: some sort of escape-code
would be better instead). The return itself is best achieved with a
jump. You will, naturally, have to go back through the final output
code after the inlining to provide the correct jump target: just keep
the inlinable-assembly temp file open, and track the address where you
most recently started writing the inlinable into the output.
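
As an illustration of that fix-up pass, here is a sketch that assumes x86 and assumes each "return" in the inlinable was emitted as a near `jmp rel32` (opcode 0xE9 followed by a 32-bit little-endian displacement measured from the end of the instruction). The function name and parameters are hypothetical; `start` is the tracked address where the inlinable was written into the output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Retarget the jmp at jmp_off (relative to the start of the inlined
 * copy) so that it lands just past the end of the copy. Returns 0 on
 * success, -1 if the offset is out of range or no jmp is found. */
int patch_return_jump(uint8_t *out, size_t start, size_t code_len,
                      size_t jmp_off)
{
    size_t next, dist;
    uint32_t rel;

    assert(out != NULL);
    if (jmp_off > code_len || code_len - jmp_off < 5)
        return -1;               /* no room for a 5-byte jmp here */
    if (out[start + jmp_off] != 0xE9)
        return -1;               /* not a jmp rel32 opcode */

    next = jmp_off + 5;          /* displacement counts from here */
    dist = code_len - next;
    if (dist > INT32_MAX)
        return -1;
    rel = (uint32_t)dist;

    /* Store the displacement byte by byte, least significant first. */
    out[start + jmp_off + 1] = (uint8_t)(rel & 0xFF);
    out[start + jmp_off + 2] = (uint8_t)((rel >> 8) & 0xFF);
    out[start + jmp_off + 3] = (uint8_t)((rel >> 16) & 0xFF);
    out[start + jmp_off + 4] = (uint8_t)((rel >> 24) & 0xFF);
    return 0;
}
```

The list of `jmp_off` values would come from the return entry in the symbol table described above.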

>> If such a thing were done, then I would ALSO suggest putting in a
>> compiler flag to force inlines to be implemented as statics, like they
>> currently are (or at least should presumably be).
>
> Yeah, perhaps using attributes, like force_inline and noinline from gcc?
>

I was thinking a command-line argument to apply it to the entire file,
but I think that implementing your version and then having a
command-line argument to set it as the default behavior would be the
better solution (it would also help to play nicer on platforms where
inlining support is being worked on, but isn't yet available).
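
For reference, a small sketch of how the per-function version looks with gcc's attributes. Note that gcc spells the forcing attribute `always_inline`, alongside `noinline`. The `FORCE_INLINE` and `NO_INLINE` macro names and the two functions are made up for the example, and the fallback branch simply degrades to a plain static inline on compilers without these attributes.

```c
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#define NO_INLINE __attribute__((noinline))
#else
/* Fallback: behave like an ordinary static inline function. */
#define FORCE_INLINE static inline
#define NO_INLINE
#endif

/* Always expanded at the call site on compilers that honor it. */
FORCE_INLINE int add1(int x) { return x + 1; }

/* Kept as a real out-of-line function with its own symbol. */
NO_INLINE int add2(int x) { return add1(add1(x)); }
```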