[Tinycc-devel] Non PIC-code and TCC's GOT strategy for TCC_OUTPUT_MEMORY
From: Macoy Madson
Subject: [Tinycc-devel] Non PIC-code and TCC's GOT strategy for TCC_OUTPUT_MEMORY
Date: Mon, 5 Dec 2022 16:10:17 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2
Hi,
I'm working on using TCC as the core of run-time modifiable
applications. I am using TCC as a library, and compiling/linking
straight to memory, i.e. tcc_set_output_type(state, TCC_OUTPUT_MEMORY).
I am on mob commit c03d59e.
I was having issues statically linking SDL2 with TCC, where SDL2 was
built with GCC -O3. SDL2 would segfault as soon as I tried to
initialize it. I tracked it down to the log function's use of stderr,
which held a bogus address.
I found the issue:
- GCC generates R_X86_64_PC32 relocations against undefined symbols like
stderr in SDL2.
- TCC, while relocating, sees that the PC32 reference is undefined, so it
creates an "AUTO_GOTPLT_ENTRY" and updates the relocation to go to
"stderr@plt" instead of "stderr".
* Note that stderr ends up being resolved as a dynamic symbol, because the
TCC "environment" itself is linked to GNU libc.
This ends in disaster, because TCC appears to assume in
build_got_entries() that the relocation can safely become a GOT/PLT
relocation: gotplt_entry_type() returns AUTO_GOTPLT_ENTRY for
R_X86_64_PC32 relocations.
Now, I'm not well-practiced at reading assembly so I may be wrong here,
but here's an example of the code that is breaking (objdump
--disassemble --reloc build/SDL.o):
0000000000000020 <SDL_InitSubSystem_REAL>:
  20: f3 0f 1e fa             endbr64
  24: 48 8b 05 00 00 00 00    mov 0x0(%rip),%rax        # 2b <SDL_InitSubSystem_REAL+0xb>
        27: R_X86_64_PC32 stderr-0x4
  2b: 48 89 05 00 00 00 00    mov %rax,0x0(%rip)        # 32 <SDL_InitSubSystem_REAL+0x12>
        2e: R_X86_64_PC32 MyStderr-0x4
  32: 48 8d 05 00 00 00 00    lea 0x0(%rip),%rax        # 39 <SDL_InitSubSystem_REAL+0x19>
        35: R_X86_64_PC32 stderr-0x4
  39: 48 89 05 00 00 00 00    mov %rax,0x0(%rip)        # 40 <SDL_InitSubSystem_REAL+0x20>
        3c: R_X86_64_PC32 MyStderrAddr-0x4
  40: b8 ff ff ff ff          mov $0xffffffff,%eax
  45: c3                      retq
  46: 66 2e 0f 1f 84 00 00    nopw %cs:0x0(%rax,%rax,1)
  4d: 00 00 00
And here's the working GOT version:
0000000000000020 <SDL_InitSubSystem_REAL>:
  20: f3 0f 1e fa             endbr64
  24: 48 8b 05 00 00 00 00    mov 0x0(%rip),%rax        # 2b <SDL_InitSubSystem_REAL+0xb>
        27: R_X86_64_REX_GOTPCRELX stderr-0x4
  2b: 48 8b 15 00 00 00 00    mov 0x0(%rip),%rdx        # 32 <SDL_InitSubSystem_REAL+0x12>
        2e: R_X86_64_REX_GOTPCRELX MyStderr-0x4
  32: 48 8b 08                mov (%rax),%rcx
  35: 48 89 0a                mov %rcx,(%rdx)
  38: 48 8b 15 00 00 00 00    mov 0x0(%rip),%rdx        # 3f <SDL_InitSubSystem_REAL+0x1f>
        3b: R_X86_64_REX_GOTPCRELX MyStderrAddr-0x4
  3f: 48 89 02                mov %rax,(%rdx)
  42: b8 ff ff ff ff          mov $0xffffffff,%eax
  47: c3                      retq
  48: 0f 1f 84 00 00 00 00    nopl 0x0(%rax,%rax,1)
The (modified) source is the following:
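/* Debug globals, read from another compilation unit. */
void* MyStderr = NULL;
void* MyStderrAddr = NULL;

int
SDL_InitSubSystem(Uint32 flags)
{
    Uint32 flags_initialized = 0;
    MyStderr = stderr;               /* value stored in stderr */
    MyStderrAddr = (void*)&stderr;   /* address of stderr itself */
    return -1;
}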
I use those My* variables outside SDL2 to print the address of stderr,
for debugging. I am able to print in the other compilation unit because
it uses the GOTPCRELX relocations to find stderr instead of the faulty
PC32 ones.
It appears to me that the PC32 version cannot safely use the GOT,
because the non-PIC code does not perform the extra dereference that
the GOT indirection requires.
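To illustrate the missing dereference, here is a minimal sketch in plain C. All names are hypothetical stand-ins: `real_stderr` plays the role of libc's stderr variable and `got_slot` plays the role of a GOT-like entry. It assumes the redirected reference lands on a slot holding the symbol's address; if it lands on a PLT stub instead, the single load would read instruction bytes, which is no better.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins, not TCC or libc internals. */
static int fake_file;                       /* the object stderr points at   */
static void *real_stderr = &fake_file;      /* plays the role of "stderr"    */
static void **got_slot = &real_stderr;      /* slot holds the ADDRESS of it  */

/* What -fPIC code does: load the slot, then dereference the result. */
void *load_via_got(void)
{
    void **addr = got_slot;   /* like: mov stderr@GOTPCREL(%rip),%rax */
    return *addr;             /* like: mov (%rax),%rcx                */
}

/* What non-PIC code does if its PC32 reference is pointed at the slot:
   one load that treats the slot itself as if it were the variable. */
void *load_direct_from_slot(void)
{
    void *value;
    memcpy(&value, &got_slot, sizeof value); /* like: mov slot(%rip),%rax */
    return value;             /* this is &real_stderr, not its contents */
}

/* Sanity check of the two behaviours. */
void check_indirection(void)
{
    assert(load_via_got() == (void *)&fake_file);
    assert(load_direct_from_slot() == (void *)&real_stderr);
}
```

The single-load version returns the address of the variable where the caller expected its value, which matches the bogus stderr pointer described above.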
I can fix this by compiling SDL2 files with -fPIC, which generates
R_X86_64_REX_GOTPCRELX relocations, and TCC handles those perfectly.
I can also go the 100% statically linked route by compiling in a static
musl libc or similar, so that all the symbols are defined at relocation
time. This affects how many hoops the "user" needs to jump through to
get their program working; switching libc is harder than adding -fPIC
and recompiling. I am of the opinion that far more things should simply
be 100% statically linked, but I know there is a huge body of code that
is not, and converting it takes a decent amount of tedious effort.
I'm fine with modifying my SDL2 build to work, but I really need a way
to detect when any PC32 relocation gets turned into a GOT/PLT
relocation. That way, I can at least report an error and instruct the
user to recompile the code as position-independent.
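A rough sketch of what such a detection pass might look like, written against the generic ELF types from glibc's <elf.h> rather than TCC's internals. The function name is hypothetical, and where a check like this would hook into TCC is an assumption I have not verified. It is also conservative: it flags function calls as well as data references, even though a PC32 call routed through a PLT stub may be fine.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper, not part of libtcc: counts R_X86_64_PC32
   relocations whose symbol is undefined in the object being linked.
   `syms` is the symbol table that the relocation entries index into. */
size_t count_undefined_pc32(const Elf64_Rela *rels, size_t count,
                            const Elf64_Sym *syms)
{
    assert(count == 0 || (rels != NULL && syms != NULL));
    size_t hits = 0;
    for (size_t i = 0; i < count; i++) {
        unsigned long type = ELF64_R_TYPE(rels[i].r_info);
        unsigned long index = ELF64_R_SYM(rels[i].r_info);
        if (type != R_X86_64_PC32 || index == STN_UNDEF)
            continue;                       /* other type, or no symbol */
        if (syms[index].st_shndx == SHN_UNDEF)
            hits++;                         /* would need a GOT/PLT entry */
    }
    return hits;
}
```

A non-zero result would be the point at which to stop and tell the user to rebuild the offending object with -fPIC.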
What I would like to get confirmation on is
A) R_X86_64_PC32 entries are unsafe to convert to GOT entries through
"AUTO_GOTPLT_ENTRY" because at the very least they are seemingly
unimplemented in TCC's -run/memory mode
and
B) Using COPY relocations wouldn't work in TCC's memory mode either,
because the existing dynamic symbols provided by the TCC application
itself have already been placed, so the copy operation cannot occur
without "moving" the TCC application's symbols. They would have to be
moved because the TCC application may have them in a far away place in
memory, too far for a PC32 relocation to reference.
If I am misunderstanding, an explanation of how AUTO_GOTPLT_ENTRY works
with output type = TCC_OUTPUT_MEMORY would be greatly appreciated.
With my current understanding, it appears that I can:
- Simply error if I detect an undefined symbol at the relocate stage
with any PC-relative relocation whose range does not cover the full
address space (e.g. PC64 should be fine, but PC32 would not work)
- Or do the slightly more sophisticated search of all the loaded shared
libraries (RTLD_DEFAULT and others) for the symbol, then check whether
the library's already placed symbol is within the PC-relative relocation
distance. If it is, then target it rather than @plt and use the PC32
relocation normally.
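The distance check in the second option might be sketched as below. Both function names are hypothetical. The symbol address is assumed to come from something like dlsym(RTLD_DEFAULT, name), and the displacement follows the usual S + A - P rule for R_X86_64_PC32 (symbol address, addend, address of the field being patched).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if S + A - P fits in a signed 32-bit field. */
bool pc32_reaches(uint64_t sym_addr, int64_t addend, uint64_t place)
{
    /* Unsigned arithmetic wraps around instead of overflowing. */
    uint64_t udisp = sym_addr + (uint64_t)addend - place;
    int64_t disp = (int64_t)udisp;
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

/* Hypothetical helper: the value to store in the 32-bit field.
   The caller must have checked reachability first. */
int32_t pc32_encode(uint64_t sym_addr, int64_t addend, uint64_t place)
{
    assert(pc32_reaches(sym_addr, addend, place));
    return (int32_t)(int64_t)(sym_addr + (uint64_t)addend - place);
}
```

On a typical x86-64 Linux process, shared libraries are mapped far from memory allocated for generated code, so I would expect this check to fail for most libc symbols; that is why the first option (a plain error) may be the more realistic one.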
Am I on track here? I greatly appreciate anyone who read through this.
I've been banging my head against this for several days now.
Thanks,
Macoy Madson