tinycc-devel

Re: [Tinycc-devel] NetBSD/aarch64 Unknown relocation type for got: 299


From: Herman ten Brugge
Subject: Re: [Tinycc-devel] NetBSD/aarch64 Unknown relocation type for got: 299
Date: Mon, 11 Jan 2021 08:44:20 +0100

On 1/7/21 10:53 PM, grischka wrote:
> Herman ten Brugge wrote:
>> I just committed an update.
>
> Thanks ;)
>
> Also a bit more explanation would be nice.  Does it introduce text
> relocations or does it try to avoid them?  What is the benefit, under
> what scenario, and is it desirable always or should it maybe depend
> on a -fPIE switch or like that?

As far as I know, all targets generate -fPIC code by default.

For BSD support I recently applied patches to use the GOT correctly.
NetBSD also requires that DT_TEXTREL is not set, so I applied patches
for that as well.

> I just wondered whether it adds some overhead that for most cases is
> not really necessary when executables are loaded at fixed addresses
> without need for relocations to its own symbols at runtime really.
>
> Some systems may require position independent executables but even
> GCC I think needs a configure option to make them by default.
>
> Also there is still the ARM-PE target, aka wince.  I don't know if
> it's still functional (or ever was) though.

I benchmarked the code before committing. The slowdown was
between 2% and 5%, depending on how much global data is used.
I also saw that incrementing a variable in global memory
loads the pointer twice.
For example:

int a;
int main(void) { a++; return 0; }

results in this ARM code:

00000000 <main>:
   0:   e1a0c00d        mov     ip, sp
   4:   e92d5800        push    {fp, ip, lr}
   8:   e1a0b00d        mov     fp, sp
   c:   e1a00000        nop                     ; (mov r0, r0)
  10:   e59fe000        ldr     lr, [pc]        ; 18 <main+0x18>
  14:   ea000000        b       1c <main+0x1c>
  18:   fffffff4                        ; <UNDEFINED> instruction: 0xfffffff4
  1c:   e08ee00f        add     lr, lr, pc
  20:   e59ee000        ldr     lr, [lr]
  24:   e59e0000        ldr     r0, [lr]
  28:   e1a01000        mov     r1, r0
  2c:   e2800001        add     r0, r0, #1
  30:   e59fe000        ldr     lr, [pc]        ; 38 <main+0x38>
  34:   ea000000        b       3c <main+0x3c>
  38:   fffffff4                        ; <UNDEFINED> instruction: 0xfffffff4
  3c:   e08ee00f        add     lr, lr, pc
  40:   e59ee000        ldr     lr, [lr]
  44:   e58e0000        str     r0, [lr]
  48:   e3a00000        mov     r0, #0
  4c:   e89ba800        ldm     fp, {fp, sp, pc}
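
The literal word at offset 0x18 (0xfffffff4, i.e. -12) and the following
"add lr, lr, pc" can be checked with a little arithmetic. This is only a
sketch: it assumes the literal is patched by a GOT-relative relocation of
the form GOT(S) + A - P (as R_ARM_GOT_PREL is defined), with addend A = -12
and place P = 0x18. The function names and any GOT address passed in are
hypothetical.

```c
#include <stdint.h>

/* Value the linker would store in the literal word, ASSUMING a
 * relocation computed as GOT(S) + A - P (modulo 2^32). */
uint32_t literal_value(uint32_t got_entry, int32_t addend, uint32_t place)
{
    return got_entry + (uint32_t)addend - place;
}

/* What "add lr, lr, pc" produces: in ARM state, reading pc yields
 * the address of the current instruction plus 8. */
uint32_t add_pc(uint32_t literal, uint32_t add_insn_addr)
{
    return literal + add_insn_addr + 8u;
}
```

With the offsets from the listing, add_pc(literal_value(got, -12, 0x18), 0x1c)
gives back got for any 32-bit value of got, because -12 - 0x18 + 0x1c + 8 = 0.
The "ldr lr, [lr]" that follows then reads the variable's address from
that GOT entry.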

The address loaded into lr at offset 0x20 (and used at 0x24) is not reused;
it is fetched again at offsets 0x30 to 0x40.
If this were fixed, for example, the slowdown would be much smaller.
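
The difference can be mimicked in plain C. This is a rough model, not
tcc's code generator: got_slot_a is a hypothetical stand-in for the GOT
entry of a, and load_got() and got_loads exist only to count how often
that entry is read.

```c
int a;

/* Hypothetical stand-in for the GOT entry holding the address of a. */
int *volatile got_slot_a = &a;

/* Counts reads of the stand-in GOT entry. */
unsigned got_loads;

int *load_got(void)
{
    got_loads++;
    return got_slot_a;
}

/* Mirrors the listing: the entry is read once for the load of a
 * and a second time for the store. */
void inc_reload(void)
{
    int v = *load_got();
    v = v + 1;
    *load_got() = v;
}

/* Keeps the address in a local, so the entry is read only once. */
void inc_reuse(void)
{
    int *p = load_got();
    *p = *p + 1;
}
```

Each call to inc_reload() reads the entry twice, while inc_reuse()
reads it once; both increment a by one.
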
The only target that still uses DT_TEXTREL is i386. Removing it there
would require a complete rewrite.
I made some simple patches to the i386 code but quickly came to the
conclusion that converting this target is not feasible,
so I will not update this target.

    Herman
