
Re: [Chicken-hackers] ABI woes


From: Peter Bex
Subject: Re: [Chicken-hackers] ABI woes
Date: Fri, 10 Jul 2015 12:47:46 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Jul 10, 2015 at 12:23:36PM +0200, address@hidden wrote:
> Hello!
> 
> After thinking about this, it seems to me that the "compound literal"
> approach is the best one. What this means is that CPS-calls are changed in
> such a manner that arguments are passed through a pointer to a C_word-array
> (so it doesn't have to do anything with compound literals, that was just the
> first idea.) This approach has several advantages compared to other methods:
> 
> - It's completely ABI independent.
> 
> - It simplifies code-generation a bit: there is no need for separate
>   "trampoline" procedures for normal CPS calls, as the vector can directly be
>   used for saving arguments on the temp-stack, with restoration being
>   unnecessary (the args are picked out of the argvector anyway).

This sounds very interesting and may cut down on compilation time and
binary size.  I'm assuming this refers to the "trf123" functions, right?
Performance-wise this shouldn't make much of a difference because those
are only called after GC.

> - There is no need for "rest-arg wrappers", functions that extract the rest
>   argument and then call the actual compiled C function. This can be done
>   directly from the argvector.

This is cool, but if I understand correctly it also means that all the
C_fast_retrieve_proc(lf[123])(a, b, c) calls now need to be preceded by
the filling of a stack-allocated vector, where before the arguments would
be passed in registers on most platforms.  That means (even) more stack
usage in CPS calls, which slows stuff quite a bit as we've seen with the
numbers integration.
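
To make the call-site difference concrete, here is a rough sketch of the two
conventions as I understand them. It is not actual compiler output: `word`,
`direct_proc`, `argvec_proc` and the `call_*` helpers are made-up names, the
layout (continuation in slot 0, arguments after it) is an assumption, and the
functions return a value only so the sketch can be checked. Real CPS
procedures never return.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t word;  /* stand-in for a machine word, not the real C_word */

/* Current style: every argument is its own C parameter, so on most
   platforms the first few travel in registers. */
typedef word (*direct_proc)(word k, word a, word b);

/* Argvector style: a count plus a pointer to an array of words. */
typedef word (*argvec_proc)(int argc, word *av);

static word add_direct(word k, word a, word b) {
    (void)k;                    /* continuation unused in this toy */
    return a + b;
}

static word add_argvec(int argc, word *av) {
    assert(argc == 3);          /* av[0] = continuation, av[1..2] = args */
    return av[1] + av[2];
}

/* Call site today: just pass the values. */
static word call_direct(direct_proc p, word k, word a, word b) {
    return p(k, a, b);
}

/* Call site with an argvector: fill a stack-allocated array first. */
static word call_argvec(argvec_proc p, word k, word a, word b) {
    word av[3] = { k, a, b };   /* extra stack space and stores per call */
    return p(3, av);
}
```

The second call site is where the extra stack usage comes from: three stores
into `av` before the call, and three loads in the callee to get them back.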

Also, having to read arguments from a pointer will be slower than passing
them in a register, and potentially slower than reading them from the top
of the stack (though that may not be the case).
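
On the callee side, the reads from the pointer would look roughly like the
sketch below, which also shows why no separate rest-arg wrapper is needed.
Again this is only an illustration under assumptions: `word` and `scale_sum`
are hypothetical names, slot 0 is assumed to hold the continuation, and a real
implementation would presumably build a list from the rest arguments instead
of summing them in place.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t word;  /* stand-in for a machine word, not the real C_word */

/* One required argument followed by any number of rest arguments.
   Assumed layout: av[0] = continuation, av[1] = factor, av[2..] = rest. */
static word scale_sum(int argc, word *av) {
    assert(argc >= 2);              /* continuation + required argument */
    word factor = av[1];            /* each argument is a memory load */
    word sum = 0;
    for (int i = 2; i < argc; i++)  /* rest args read directly, no wrapper */
        sum += av[i];
    return factor * sum;
}
```

Every argument access here goes through `av`, which is the cost I mean above;
whether that is measurably slower than a stack read is something only a
benchmark can tell.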

> How this will influence performance, I can't say. This will reduce code size
> (many trampolines go, as do rest-arg wrappers). Allocation of arg-vectors
> will use more stack-space, but removal of trampolines will remove
> activation-frames.

> Calls to known targets can still be done as normal C calls.

But only if all the callers are known, correct?

> Whatever approach is used, this will be a substantial change, taking a lot of
> work: the backend needs to be changed, and all hand-written CPS primitives
> need to be adapted, also changing the way rest-args are handled. This needs to
> be implemented for CHICKEN 4 and later ported to CHICKEN 5, adapting all new
> CPS procedures that were introduced with the bignum-related changes. Oh, what
> fun...

Yeah, this will be a massive undertaking, especially as CHICKEN 5 is
diverging further and further from CHICKEN 4.

Do you think there's a possibility to make a limited prototype, to see
what the performance impact will be, and whether the approach is viable
at all (i.e., whether it works on iOS)?

Finally, have you considered my idea of using macros for the calls and
function definitions to switch between approaches?  If the performance
is too bad, we can keep using the current approach for platforms where
it does not break.  Unfortunately, C macrology is even worse than Scheme
macrology.
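
Roughly what I have in mind with the macros, as a sketch for a fixed arity of
two arguments plus continuation. Everything here is hypothetical: the
`USE_ARGVECTOR` flag, the `PROC2_*` and `CALL_PROC2` macro names and the
`word` type are invented for illustration, and the function returns a value
only so the sketch can be checked.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t word;  /* stand-in for a machine word, not the real C_word */

#ifdef USE_ARGVECTOR    /* hypothetical build flag selecting the new ABI */
/* Arguments arrive in an array and are unpacked into named locals. */
# define PROC2_PARAMS(k, a, b)  int argc, word *av
# define PROC2_UNPACK(k, a, b) \
    word k = av[0], a = av[1], b = av[2]; assert(argc == 3)
/* The call site builds the array with a C99 compound literal. */
# define CALL_PROC2(p, k, a, b) ((p)(3, (word[]){ (k), (a), (b) }))
#else
/* Current ABI: plain C parameters, nothing to unpack. */
# define PROC2_PARAMS(k, a, b)  word k, word a, word b
# define PROC2_UNPACK(k, a, b)  (void)0
# define CALL_PROC2(p, k, a, b) ((p)((k), (a), (b)))
#endif

/* The same source text compiles under either convention. */
static word sub2(PROC2_PARAMS(k, a, b)) {
    PROC2_UNPACK(k, a, b);
    (void)k;            /* continuation unused in this toy */
    return a - b;
}
```

A call is then written as `CALL_PROC2(sub2, k, 7, 2)` regardless of the
platform. A real version would need one set of macros per arity, or
generated code, which is where the macrology gets ugly.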

Cheers,
Peter
