[avr-libc-dev] Smaller boot.h macros for AVRs with >64K Flash
From: Mike Perks
Subject: [avr-libc-dev] Smaller boot.h macros for AVRs with >64K Flash
Date: Thu, 30 Apr 2009 08:38:17 -0500
User-agent: Thunderbird 2.0.0.21 (Windows/20090302)
Elvind suggested I post this idea on this mailing list for inclusion in
a future release of avr-libc.
I have a 512 word bootloader that I use for my om128
<http://www.avrfreaks.net/address@hidden&func=viewItem&item_id=822>
and om644p
<http://www.avrfreaks.net/address@hidden&func=viewItem&item_id=906>
microcontroller devices. These bootloaders were compiled with an older
version of the GCC compiler (3.4.5) and avr-libc (1.4.4) using WinAVR
20060125. This resulted in very compact code. A further constraint is
that the bootloader is actually in two parts: a 128-word "stub" and a
384-word main loader. The stub can be used to provide a self-updating
feature.
Later versions of GCC (i.e. v4) produced much larger code that would not
fit. This hasn't been a problem up to now because I simply used
the older version of GCC. However, I am now working on some new devices
that require a later compiler to support the underlying AVRs (e.g.
mega328p and mega1284p).
On further examination, the stub was too big to fit in 128 words for the
128K byte flash devices (mega128, mega1284) and the main culprits were
the boot.h macros that required 32-bit addresses:
* boot_page_erase(address)
* boot_page_fill(address, data)
* boot_page_write(address)
The problem is that the 32-bit address needs 4 registers, plus another 4
for pointer addition plus all the pushes and pops needed for the
additional registers - a net add of around 50 words.
I resisted the temptation to completely rewrite the stub in assembly and
looked at the boot.h macro definitions. For the 128K devices, the RAMPZ
register may need to be set. It seemed like a good idea to separate this
out from the address calculation and set the RAMPZ register separately.
This idea should also work for devices with >128K flash.
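To make the split concrete, here is a minimal host-side sketch of how a
flash page number could be divided into a 16-bit Z value and a separate
RAMPZ byte, assuming 256-byte pages. The names PAGE_SHIFT,
flash_target_t, page_to_target, target_to_byte_address and check_split
are hypothetical and exist only for this illustration; they are not part
of boot.h.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 8u /* assumption: 256-byte (128-word) flash pages */

/* Hypothetical pair: low 16 address bits for Z, upper bits for RAMPZ. */
typedef struct {
    uint16_t z;
    uint8_t  rampz;
} flash_target_t;

/* Split a page number without ever forming a 32-bit address. */
flash_target_t page_to_target(uint16_t page)
{
    flash_target_t t;
    t.z     = (uint16_t)(page << PAGE_SHIFT);          /* low 16 bits   */
    t.rampz = (uint8_t)(page >> (16u - PAGE_SHIFT));   /* bits 16 and up */
    return t;
}

/* Rebuild the full byte address, only to check the split is lossless. */
uint32_t target_to_byte_address(flash_target_t t)
{
    return ((uint32_t)t.rampz << 16) | t.z;
}

/* The split must agree with the plain 32-bit calculation. */
void check_split(uint16_t page)
{
    flash_target_t t = page_to_target(page);
    assert(target_to_byte_address(t) == ((uint32_t)page << PAGE_SHIFT));
}
```

For example, page 0x01A5 would give z = 0xA500 and rampz = 0x01. Only
8-bit and 16-bit values are needed on the AVR side, which is the point
of keeping RAMPZ out of the address calculation.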
Here is a snippet of the bootloader that shows the calculation of the
address and RAMPZ register from a flash page number:
/* 256 byte page size */
uint16_t address = Page << 8;
/* erase page and wait for completion */
#if defined(RAMPZ)
#if defined(__AVR_ATmega128__) || defined(__AVR_ATmega1284P__)
/* 256 byte page size */
boot_page_erase_extended(address, (Page >> 8));
#endif
#elif defined(__AVR_ATmega644__) || defined(__AVR_ATmega644P__)
boot_page_erase(address);
#endif
boot_spm_busy_wait();
The generated code is compact for 128 word (256 byte) pages because
shifting left or right by 8 bits is just a byte move. The new macro
boot_page_erase_extended is defined as follows:
/* Erase the flash page at the 16-bit address, with RAMPZ set to ramp. */
#define boot_page_erase_extended(address, ramp) \
(__extension__({ \
    __asm__ __volatile__ \
    ( \
        "sts %3, %4\n\t" \
        "sts %0, %1\n\t" \
        "spm\n\t" \
        : \
        : "i" (_SFR_MEM_ADDR(__SPM_REG)), \
          "r" ((uint8_t)__BOOT_PAGE_ERASE), \
          "z" ((uint16_t)(address)), \
          "i" (_SFR_MEM_ADDR(RAMPZ)), \
          "r" ((uint8_t)(ramp)) \
    ); \
}))
The "extended" boot_page_erase macro is very similar to the normal macro
for 16-bit addresses except that it also sets the RAMPZ register, so the
generated code is only a few words larger. The corresponding macros for
boot_page_fill and boot_page_write are:
/* Load one data word into the page buffer; r1 is cleared again after spm. */
#define boot_page_fill_extended(address, data, ramp) \
(__extension__({ \
    __asm__ __volatile__ \
    ( \
        "movw r0, %3\n\t" \
        "sts %4, %5\n\t" \
        "sts %0, %1\n\t" \
        "spm \n\t" \
        "clr r1\n\t" \
        : \
        : "i" (_SFR_MEM_ADDR(__SPM_REG)), \
          "r" ((uint8_t)__BOOT_PAGE_FILL), \
          "z" ((uint16_t)(address)), \
          "r" ((uint16_t)(data)), \
          "i" (_SFR_MEM_ADDR(RAMPZ)), \
          "r" ((uint8_t)(ramp)) \
        : "r0" \
    ); \
}))
/* Write the page buffer to the flash page at address, with RAMPZ = ramp. */
#define boot_page_write_extended(address, ramp) \
(__extension__({ \
    __asm__ __volatile__ \
    ( \
        "sts %0, %1\n\t" \
        "sts %3, %4\n\t" \
        "spm\n\t" \
        : \
        : "i" (_SFR_MEM_ADDR(__SPM_REG)), \
          "r" ((uint8_t)__BOOT_PAGE_WRITE), \
          "z" ((uint16_t)(address)), \
          "i" (_SFR_MEM_ADDR(RAMPZ)), \
          "r" ((uint8_t)(ramp)) \
    ); \
}))
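As a rough illustration of how the three macros might be combined, here
is a minimal sketch of programming one 256-byte page: erase, fill the
page buffer word by word, then write. This is not the actual stub code.
The names program_page, PAGE_BYTES and check_program_page are
hypothetical, and the non-AVR branch contains stand-in macros that only
count calls so the control flow can be checked on a host. A real
bootloader would likely also need to handle interrupts and re-enable
the RWW section afterwards.

```c
#include <stdint.h>

#define PAGE_BYTES 256u /* hypothetical name; 256-byte pages as above */

#ifdef __AVR__
#include <avr/boot.h>   /* boot_spm_busy_wait() */
/* Assumption: the three *_extended macros from this post are defined. */
#else
#include <assert.h>
/* Hypothetical host-side stand-ins: they only record what was called. */
unsigned erase_calls, fill_calls, write_calls;
uint8_t  last_ramp;
#define boot_page_erase_extended(address, ramp) \
    do { (void)(address); last_ramp = (uint8_t)(ramp); erase_calls++; } while (0)
#define boot_page_fill_extended(address, data, ramp) \
    do { (void)(address); (void)(data); (void)(ramp); fill_calls++; } while (0)
#define boot_page_write_extended(address, ramp) \
    do { (void)(address); last_ramp = (uint8_t)(ramp); write_calls++; } while (0)
#define boot_spm_busy_wait() ((void)0)
#endif

/* Program one flash page from buf (PAGE_BYTES bytes, low byte first). */
void program_page(uint16_t page, const uint8_t *buf)
{
    uint16_t address = (uint16_t)(page << 8); /* low 16 address bits */
    uint8_t  ramp    = (uint8_t)(page >> 8);  /* value for RAMPZ     */

    boot_page_erase_extended(address, ramp);
    boot_spm_busy_wait();

    /* The page buffer is filled one 16-bit word at a time. */
    for (uint16_t i = 0; i < PAGE_BYTES; i += 2) {
        uint16_t word_addr = (uint16_t)(address + i);
        uint16_t word = (uint16_t)(buf[i] | ((uint16_t)buf[i + 1] << 8));
        boot_page_fill_extended(word_addr, word, ramp);
    }

    boot_page_write_extended(address, ramp);
    boot_spm_busy_wait();
}

#ifndef __AVR__
/* Host-only check of the call sequence for one page. */
void check_program_page(void)
{
    static const uint8_t buf[PAGE_BYTES] = { 0 };
    erase_calls = fill_calls = write_calls = 0;
    program_page(0x01A5u, buf);
    assert(erase_calls == 1);
    assert(fill_calls == PAGE_BYTES / 2);
    assert(write_calls == 1);
    assert(last_ramp == 0x01);
}
#endif
```

Because the page size is a multiple of two and a page never crosses a
64K boundary, the same ramp value can be reused for every call within a
page, and only 16-bit arithmetic is needed inside the loop.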
Here are the resultant code sizes for just the stub using the latest GCC
compiler:
* mega1284p - 108 words
* mega128 - 96 words
* mega644p - 99 words
The mega644p is 9 words smaller (3 per RAMPZ use) than the mega1284p.
The mega128 is 12 words smaller than the mega1284p because it can use
in/out instructions rather than the longer lds/sts instructions for I/O
registers. This shows one of the benefits of keeping to C code as much
as possible - the compiler can do a much better optimization job
although as shown in this post, sometimes it needs some help.
Regards,
Mike