Re: [avr-libc-dev] Quick test of Björn Haase's relax patch
From: Björn Haase
Subject: Re: [avr-libc-dev] Quick test of Björn Haase's relax patch
Date: Sun, 5 Mar 2006 05:41:00 +0100
User-agent: KMail/1.7.1
Dmitry K. wrote on Sunday, 5 March 2006 06:22:
> On Saturday 04 March 2006 20:23, you wrote:
> > Björn and Dmitry,
> > have you tried "program-at-once" compilation (avr-gcc -combine
> > -fwhole-program *.c ..)? This may save some more bytes with recent gccs.
>
> The results:
>
> Options:          (none)     rlx     cmb  cmb+rlx
> -------------------------------------------------
> flash, bytes       11412   11056   11412    11104
> call                 198      24     198       46
> rcall                143     316     143      294
> jmp                   34      10      34       14
> rjmp                 442     467     442      463
>
The issue is probably that "cmb" reorders the functions in memory in a less
favorable way. The functions themselves seem to be the same.
I'd suggest using -ffunction-sections and -Wl,--gc-sections. Maybe
this helps improve the ordering of the functions.
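To illustrate why function order matters for the relaxation: rcall and rjmp
encode a signed 12-bit word offset, so a call can only be shortened when its
target lies within -2048..+2047 words. The following is a toy Python model,
not part of the patch. The function names, the sizes, and the simplification
that each call sits at the first word of its caller are assumptions made up
for this example.

```python
# Toy model: count how many calls could be relaxed to rcall for a given
# function order. All names and sizes below are hypothetical.

RCALL_MIN, RCALL_MAX = -2048, 2047   # signed 12-bit offset, in words


def layout(order, sizes):
    """Assign each function a start address (in words) in the given order."""
    start, addr = {}, 0
    for name in order:
        start[name] = addr
        addr += sizes[name]
    return start


def relaxable_calls(order, sizes, calls):
    """Count (caller, callee) pairs whose offset fits into an rcall.

    Simplification: the call instruction is assumed to sit at the first
    word of the caller, and shrinking instructions does not move code.
    """
    start = layout(order, sizes)
    count = 0
    for caller, callee in calls:
        k = start[callee] - (start[caller] + 1)   # rcall: PC <- PC + k + 1
        if RCALL_MIN <= k <= RCALL_MAX:
            count += 1
    return count


sizes = {"main": 100, "helper": 50, "big": 3000}   # sizes in words
calls = [("main", "helper"), ("big", "helper")]

near = relaxable_calls(["main", "helper", "big"], sizes, calls)
far = relaxable_calls(["main", "big", "helper"], sizes, calls)
print(near, far)   # prints: 2 0
```

With "helper" placed next to its callers both calls fit into an rcall. With
the large function in between, neither does, although no function changed.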
> And another question, about replacing '[r]call/ret' with
> '[r]jmp/ret': is this optimization safe?
> Take avr-libc's fplib, for example. The split function pops
> two bytes from the stack and returns to a higher level
> in the error case. Tricks like this make the optimization
> erroneous.
I agree that in these cases one could run into trouble. However, as long as
gcc-generated assembly is used, I don't see where problems could arise.
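The problematic case can be sketched with a toy model of the return-address
stack. This is Python, not AVR code, and it does not reproduce the actual
fplib routines. The labels and the function run() are hypothetical, and stack
entries are symbolic labels instead of two-byte addresses.

```python
# Toy model of the return-address stack. Entries are symbolic labels,
# not real two-byte addresses; all names are hypothetical.


def run(tail_call_relaxed, error_case):
    """Return the label that control finally returns to after A reaches B.

    A ends with "call B; ret". With the relaxation, that becomes "jmp B".
    B models the trick: on error it discards one return address so that
    its own ret skips a level.
    """
    stack = ["caller_of_main"]
    stack.append("back_in_main")       # main: call A

    if not tail_call_relaxed:
        stack.append("back_in_A")      # A: call B pushes a return address
    # else: A: jmp B pushes nothing

    if error_case:
        stack.pop()                    # B: drop what it assumes is back_in_A
    target = stack.pop()               # B: ret

    if target == "back_in_A":
        target = stack.pop()           # A: ret
    return target


for relaxed in (False, True):
    for error in (False, True):
        print(relaxed, error, run(relaxed, error))
```

In this model only the combination of relaxed tail call and error case ends
up somewhere other than "back_in_main": B discards main's return address
instead of A's. Code that never touches the return address is unaffected.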
> On the other hand, the GCC x86 port
> has long been able to optimize such tail calls. It also
> does so more effectively, since it deletes the 'ret'
> that is no longer needed.
I know that there are ways to implement this in gcc itself. IIRC the method is
rather complicated. Since including this optimization in the linker
was not very difficult, I simply added it. In the long run, of course, it
would be better to make gcc do it itself.
> Could an additional option be added to enable this
> optimization explicitly?
Yes, of course. The question is whether to activate or deactivate the
optimization by default. However, in order to add such a switch, I would first
have to learn how to add switches to ld. Probably one would have to
write an AVR emulation template for ld? I am also planning to add
"use-16k-wrap-around" and "use-32k-wrap-around" flags. This
optimization is already prepared in the patch but currently not activated.
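A sketch of what such wrap-around flags could exploit, assuming the program
counter wraps modulo the flash size (for example 16 KB = 8192 words): a target
near the start of flash is then reachable from near the end with a short
positive offset. This is a Python toy model under that assumption. The function
names and parameters are made up, and it says nothing about how the patch
performs the check.

```python
# Toy model: does an rcall/rjmp reach its target, with or without assuming
# that the program counter wraps at the end of flash? Addresses in words.

RCALL_MIN, RCALL_MAX = -2048, 2047   # signed 12-bit offset, in words


def rel_offset(call_word, target_word, flash_words=None):
    """Offset k such that call_word + 1 + k lands on target_word.

    If flash_words is given, addresses are taken modulo flash_words and the
    offset is folded into the range -flash_words/2 .. flash_words/2 - 1.
    """
    k = target_word - (call_word + 1)
    if flash_words is not None:
        half = flash_words // 2
        k = (k + half) % flash_words - half
    return k


def fits_rcall(call_word, target_word, flash_words=None):
    """True if the (possibly wrapped) offset fits the 12-bit field."""
    return RCALL_MIN <= rel_offset(call_word, target_word, flash_words) <= RCALL_MAX


# Call near the end of a 16 KB flash (8192 words), target near the start.
print(fits_rcall(8000, 100))                     # prints: False
print(fits_rcall(8000, 100, flash_words=8192))   # prints: True
```

With a flash of 4096 words (8 KB) or less, the folded offset always fits in
this model, so every target is reachable with the short instruction.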
Bjoern.