[avr-gcc-list] A couple of optimizer suggestions
From: Brian Dean
Subject: [avr-gcc-list] A couple of optimizer suggestions
Date: Fri, 14 Jan 2005 00:21:27 -0500
User-agent: Mutt/1.4.2.1i
Let me preface this by saying that I don't know the internals of gcc
very well, so I can't judge how hard the following optimizations would
be to implement. What follows are a couple of suggestions that came to
my attention on another list. Given a simple bit of code such as:
SIGNAL(SIG_OUTPUT_COMPARE0)
{
  uint8_t i = PORTB;
  PORTB = 0xF0;
  PORTB = 0x0F;
  PORTB = i;
}
Compiling with -O2, gcc produces:
26:hw.c **** SIGNAL(SIG_OUTPUT_COMPARE0)
27:hw.c **** {
89 .LM5:
90 /* prologue: frame size=0 */
91 001c 1F92 push __zero_reg__
92 001e 0F92 push __tmp_reg__
93 0020 0FB6 in __tmp_reg__,__SREG__
94 0022 0F92 push __tmp_reg__
95 0024 1124 clr __zero_reg__
96 0026 8F93 push r24
97 0028 9F93 push r25
98 /* prologue end (size=7) */
28:hw.c **** uint8_t i = PORTB;
100 .LM6:
101 002a 88B3 in r24,56-0x20
29:hw.c **** PORTB = 0xF0;
103 .LM7:
104 002c 90EF ldi r25,lo8(-16)
105 002e 98BB out 56-0x20,r25
30:hw.c **** PORTB = 0x0F;
107 .LM8:
108 0030 9FE0 ldi r25,lo8(15)
109 0032 98BB out 56-0x20,r25
31:hw.c **** PORTB = i;
111 .LM9:
112 0034 88BB out 56-0x20,r24
113 /* epilogue: frame size=0 */
114 0036 9F91 pop r25
115 0038 8F91 pop r24
116 003a 0F90 pop __tmp_reg__
117 003c 0FBE out __SREG__,__tmp_reg__
118 003e 0F90 pop __tmp_reg__
119 0040 1F90 pop __zero_reg__
120 0042 1895 reti
121 /* epilogue end (size=7) */
122 /* function __vector_15 size 20 (6) */
With that same snippet of code, the IAR compiler produces:
51 __interrupt void handler()
\ handler:
52 {
\ 00000000 931A ST -Y,R17
\ 00000002 930A ST -Y,R16
53 BYTE i = PORTB;
\ 00000004 B318 IN R17,0x18
54 PORTB = 0xF0;
\ 00000006 EF00 LDI R16,240
\ 00000008 BB08 OUT 0x18,R16
55 PORTB = 0x0F;
\ 0000000A E00F LDI R16,15
\ 0000000C BB08 OUT 0x18,R16
56 PORTB = i;
\ 0000000E BB18 OUT 0x18,R17
57 }
\ 00000010 9109 LD R16,Y+
\ 00000012 9119 LD R17,Y+
\ 00000014 9518 RETI
58
The body of the handler looks very similar; there's not much to
improve on there. But check the prologue and the epilogue. IAR takes
advantage of a couple of additional optimizations. First, registers
that are not used are not pushed. GCC pushes __zero_reg__ and
__tmp_reg__, neither of which needs to be used (OK, __tmp_reg__ is
used to save SREG, but it doesn't need to be; see the next point).
Second, none of the instructions in the body of the handler (in, ldi,
out) changes the status register, so IAR doesn't save it either. The
result is a smaller and tighter interrupt handler: 11 words from IAR
versus the 20 that GCC reports.
It'd be really cool if the GCC optimizer could do that too :-)
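To make the idea a bit more concrete, here is a rough host-side sketch
in C of the kind of check I have in mind. It is not how gcc works
internally (I don't know that), and every name in it (struct insn,
struct save_plan, plan_saves, the flag_neutral table) is made up for
illustration. It walks the instructions of a handler body, collects the
registers that get overwritten, and asks for SREG to be saved only if
some instruction might touch the flags:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG(n) (UINT32_C(1) << (n))   /* bit n stands for register rn */

/* Hypothetical model of one instruction in the handler body. */
struct insn {
    const char *mnemonic;   /* lower-case AVR mnemonic, e.g. "out" */
    uint32_t    writes;     /* mask of general registers it overwrites */
};

/* What the prologue/epilogue would have to save and restore. */
struct save_plan {
    uint32_t regs;          /* registers to push/pop */
    bool     sreg;          /* true if SREG must be saved as well */
};

/* Mnemonics known to leave SREG alone. Deliberately incomplete: anything
 * not listed here is treated as flag-clobbering. Assumes "out" never
 * targets SREG itself. */
static const char *const flag_neutral[] = {
    "in", "out", "ldi", "mov", "ld", "st", "push", "pop", "nop"
};

static bool leaves_sreg_alone(const char *mnemonic)
{
    for (size_t i = 0; i < sizeof flag_neutral / sizeof flag_neutral[0]; i++)
        if (strcmp(mnemonic, flag_neutral[i]) == 0)
            return true;
    return false;
}

/* Scan the body once and work out the minimal set of things to save. */
struct save_plan plan_saves(const struct insn *body, size_t n)
{
    struct save_plan plan = { 0, false };

    assert(body != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        plan.regs |= body[i].writes;
        if (!leaves_sreg_alone(body[i].mnemonic))
            plan.sreg = true;
    }
    return plan;
}
```

Fed the six body instructions from the GCC listing (in r24, ldi r25,
out, ldi r25, out, out), this asks for r24 and r25 only and leaves SREG
alone, which is what IAR does with R16 and R17. The table errs on the
safe side: an instruction it doesn't recognize forces SREG to be saved.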
-Brian
--
Brian Dean
BDMICRO - ATmega128 Based MAVRIC Controllers
http://www.bdmicro.com/