[Tinycc-devel] Non PIC-code and TCC's GOT strategy for TCC_OUTPUT_MEMORY


From: Macoy Madson
Subject: [Tinycc-devel] Non PIC-code and TCC's GOT strategy for TCC_OUTPUT_MEMORY
Date: Mon, 5 Dec 2022 16:10:17 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

Hi,

I'm working on using TCC as the core of run-time modifiable applications. I use TCC as a library and compile/link straight to memory, i.e. tcc_set_output_type(state, TCC_OUTPUT_MEMORY). I am on the mob branch at commit c03d59e.

I was having issues statically linking SDL2 with TCC, where SDL2 was built with GCC -O3. SDL2 would segfault as soon as I tried to initialize it. I tracked it down to the log function's use of stderr, which resolved to a bogus address.

I found the issue:

- GCC generates R_X86_64_PC32 relocations to unknown symbols like stderr in SDL2.

- TCC, while relocating, sees that the PC32 reference is undefined, so it creates an "AUTO_GOTPLT_ENTRY" and updates the relocation to go to "stderr@plt" instead of "stderr".

* Note that stderr ultimately resolves to a dynamic symbol, because the TCC "environment" itself is linked against GNU libc.

This ends in disaster, because TCC appears to assume that the relocation can safely become a GOT/PLT relocation in build_got_entries(): gotplt_entry_type() returns AUTO_GOTPLT_ENTRY for R_X86_64_PC32 relocations.

Now, I'm not well-practiced at reading assembly so I may be wrong here, but here's an example of the code that is breaking (objdump --disassemble --reloc build/SDL.o):

0000000000000020 <SDL_InitSubSystem_REAL>:
    20:    f3 0f 1e fa              endbr64
    24:    48 8b 05 00 00 00 00     mov    0x0(%rip),%rax # 2b <SDL_InitSubSystem_REAL+0xb>
              27: R_X86_64_PC32    stderr-0x4
    2b:    48 89 05 00 00 00 00     mov    %rax,0x0(%rip) # 32 <SDL_InitSubSystem_REAL+0x12>
              2e: R_X86_64_PC32    MyStderr-0x4
    32:    48 8d 05 00 00 00 00     lea    0x0(%rip),%rax # 39 <SDL_InitSubSystem_REAL+0x19>
              35: R_X86_64_PC32    stderr-0x4
    39:    48 89 05 00 00 00 00     mov    %rax,0x0(%rip) # 40 <SDL_InitSubSystem_REAL+0x20>
              3c: R_X86_64_PC32    MyStderrAddr-0x4
    40:    b8 ff ff ff ff           mov    $0xffffffff,%eax
    45:    c3                       retq
    46:    66 2e 0f 1f 84 00 00     nopw   %cs:0x0(%rax,%rax,1)
    4d:    00 00 00

And here's the working GOT version:

0000000000000020 <SDL_InitSubSystem_REAL>:
    20:    f3 0f 1e fa              endbr64
    24:    48 8b 05 00 00 00 00     mov    0x0(%rip),%rax # 2b <SDL_InitSubSystem_REAL+0xb>
              27: R_X86_64_REX_GOTPCRELX    stderr-0x4
    2b:    48 8b 15 00 00 00 00     mov    0x0(%rip),%rdx # 32 <SDL_InitSubSystem_REAL+0x12>
              2e: R_X86_64_REX_GOTPCRELX    MyStderr-0x4
    32:    48 8b 08                 mov    (%rax),%rcx
    35:    48 89 0a                 mov    %rcx,(%rdx)
    38:    48 8b 15 00 00 00 00     mov    0x0(%rip),%rdx # 3f <SDL_InitSubSystem_REAL+0x1f>
              3b: R_X86_64_REX_GOTPCRELX    MyStderrAddr-0x4
    3f:    48 89 02                 mov    %rax,(%rdx)
    42:    b8 ff ff ff ff           mov    $0xffffffff,%eax
    47:    c3                       retq
    48:    0f 1f 84 00 00 00 00     nopl   0x0(%rax,%rax,1)

The (modified) source is the following:

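/* Debug globals, read from another compilation unit. */
void* MyStderr = NULL;
void* MyStderrAddr = NULL;

int
SDL_InitSubSystem(Uint32 flags)
{
    Uint32 flags_initialized = 0;

    MyStderr = stderr;               /* value of stderr as seen by this object */
    MyStderrAddr = (void*)&stderr;   /* address this object resolved stderr to */
    return -1;
}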

I use those My* variables outside SDL2 to print the address of stderr, for debugging. I am able to print in the other compilation unit because it uses the GOTPCRELX relocations to find stderr instead of the faulty PC32 ones.

It appears to me that the PC32 version could not safely use the GOT, because it doesn't do the extra dereference necessary due to the indirection.
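To make the missing dereference concrete, here is a minimal simulation in plain C. None of it is TCC or SDL2 code: real_variable, fake_got_slot, read_direct and read_via_got are hypothetical names, and the "GOT slot" is just an ordinary pointer variable standing in for whatever indirection the linker sets up.

```c
#include <string.h>

/* Stand-in for a variable such as stderr that lives in another module. */
static void *real_variable = (void *)0x1234;

/* Stand-in for the slot that holds the ADDRESS of that variable. */
static void *fake_got_slot = &real_variable;

/* Non-PIC style access: treat the relocated location as the variable
 * itself and load its contents in a single step. */
static void *read_direct(const void *relocated_location)
{
    void *value;
    memcpy(&value, relocated_location, sizeof value);
    return value;
}

/* PIC style access: load the address from the slot first, then
 * dereference that address to reach the variable's contents. */
static void *read_via_got(const void *got_slot)
{
    void *address;
    memcpy(&address, got_slot, sizeof address);
    return read_direct(address);
}
```

If read_direct() is pointed at the slot instead of the variable, it returns the variable's address rather than its value. That is the kind of wrong result I would expect when a single-load PC32 access gets redirected to an indirection entry.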

I can fix this by compiling SDL2 files with -fPIC, which generates R_X86_64_REX_GOTPCRELX relocations, and TCC handles those perfectly.

I can also go the 100% statically linked route by compiling in a static musl libc or similar, so that all symbols are defined at relocation time. This affects how many hoops the "user" needs to jump through to get their program working; switching libc is harder than adding -fPIC and recompiling. I am of the opinion that far more things should simply be 100% statically linked, but I know there is a huge body of code that isn't, and converting it takes a decent amount of tedious effort.

I'm fine with modifying my SDL2 build to work, but I really need a way to detect when any PC32 relocation gets turned into a GOT/PLT relocation. That way, I can at least raise an error and instruct the user to recompile the code as position-independent.
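The check I have in mind could look roughly like the sketch below. It is not TCC's internal code: pc32_hits_undefined_symbol is a hypothetical helper, and it only uses the standard ELF structures and macros from <elf.h>. I am assuming the caller has already looked up the symbol that the relocation refers to.

```c
#include <elf.h>
#include <stdbool.h>

/* Hypothetical helper (not part of TCC): returns true when a relocation
 * is R_X86_64_PC32 and its symbol is still undefined, i.e. the case where
 * the linker would have to invent a GOT/PLT entry to satisfy it.
 * This is deliberately conservative: a PC32 call to an undefined function
 * can work through a PLT stub, so a real check may want to let those pass. */
static bool pc32_hits_undefined_symbol(const Elf64_Rela *rel,
                                       const Elf64_Sym *sym)
{
    bool is_pc32 = ELF64_R_TYPE(rel->r_info) == R_X86_64_PC32;
    bool is_undefined = sym->st_shndx == SHN_UNDEF;
    return is_pc32 && is_undefined;
}
```

When this returns true for a data symbol, the error message would tell the user to rebuild the object with -fPIC.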

What I would like to get confirmation on is

A) R_X86_64_PC32 relocations are unsafe to convert to GOT entries through "AUTO_GOTPLT_ENTRY", because at the very least that conversion appears to be unimplemented in TCC's -run/memory mode

and

B) Using COPY relocations wouldn't work in TCC's memory mode either, because the existing dynamic symbols provided by the TCC application itself have already been placed, so the copy operation cannot occur without "moving" the TCC application's symbols. They would have to be moved because the TCC application may have them in a far away place in memory, too far for a PC32 relocation to reference.

If I am misunderstanding, an explanation of how AUTO_GOTPLT_ENTRY works with output type = TCC_OUTPUT_MEMORY would be greatly appreciated.

With my current understanding, it appears that I can:

- Simply error if, at the relocate stage, I detect an undefined symbol referenced by any PC-relative relocation that cannot span the full address space (e.g. PC64 should be fine, but PC32 would not work)

- Or do a slightly more sophisticated search of all the loaded DLLs (RTLD_DEFAULT and others) for the symbol, then check whether the DLL's already-placed symbol is within the PC-relative relocation distance. If it is, target it directly rather than @plt and apply the PC32 relocation normally.
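The distance check in the second option can be sketched as follows. The function name fits_pc32 and its parameters are my own, not anything from TCC; it just tests whether the usual S + A - P value of a PC32 relocation fits in a signed 32-bit field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: sym_addr is the symbol's final address (S),
 * addend is the relocation addend (A), and place is the address of the
 * 4-byte field being patched (P). Returns true when S + A - P can be
 * stored as a signed 32-bit displacement. */
static bool fits_pc32(uint64_t sym_addr, int64_t addend, uint64_t place)
{
    /* Unsigned arithmetic wraps modulo 2^64, so this never overflows. */
    uint64_t diff = sym_addr + (uint64_t)addend - place;

    /* In range means [0, INT32_MAX] or the wrapped form of [INT32_MIN, -1]. */
    return diff <= (uint64_t)INT32_MAX ||
           diff >= (uint64_t)(int64_t)INT32_MIN;
}
```

The symbol address would come from something like dlsym(RTLD_DEFAULT, name). If fits_pc32() fails, the fallback is the error from the first option.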

Am I on track here? I greatly appreciate anyone who read through this. I've been banging my head against this for several days now.

Thanks,

Macoy Madson
