Hi,
I'm working on using TCC as the core of run-time modifiable applications. I am
using TCC as a library, and compiling/linking straight to memory, i.e.
tcc_set_output_type(state, TCC_OUTPUT_MEMORY). I am on mob at commit c03d59e.
I was having issues statically linking SDL2 with TCC, where SDL2 was built
with GCC -O3. SDL2 would segfault as soon as I tried to initialize it. I
tracked it down to the log function reading stderr, which held a bogus
address.
I found the issue:
- GCC generates R_X86_64_PC32 relocations to unknown symbols like stderr in
SDL2.
- TCC, while relocating, sees that the PC32 reference is undefined, so creates an
"AUTO_GOTPLT_ENTRY" and updates the relocation to go to "stderr@plt" instead of
"stderr".
* Note that stderr ultimately resolves to a dynamic symbol, because the TCC
"environment" itself is linked against GNU libc.
This ends in disaster, because it appears TCC assumes that the relocation can
safely become a GOT/PLT relocation in build_got_entries()--gotplt_entry_type()
returns AUTO_GOTPLT_ENTRY for the R_X86_64_PC32 relocations.
Now, I'm not well-practiced at reading assembly so I may be wrong here, but
here's an example of the code that is breaking (objdump --disassemble --reloc
build/SDL.o):
0000000000000020 <SDL_InitSubSystem_REAL>:
  20: f3 0f 1e fa             endbr64
  24: 48 8b 05 00 00 00 00    mov 0x0(%rip),%rax   # 2b <SDL_InitSubSystem_REAL+0xb>
        27: R_X86_64_PC32 stderr-0x4
  2b: 48 89 05 00 00 00 00    mov %rax,0x0(%rip)   # 32 <SDL_InitSubSystem_REAL+0x12>
        2e: R_X86_64_PC32 MyStderr-0x4
  32: 48 8d 05 00 00 00 00    lea 0x0(%rip),%rax   # 39 <SDL_InitSubSystem_REAL+0x19>
        35: R_X86_64_PC32 stderr-0x4
  39: 48 89 05 00 00 00 00    mov %rax,0x0(%rip)   # 40 <SDL_InitSubSystem_REAL+0x20>
        3c: R_X86_64_PC32 MyStderrAddr-0x4
  40: b8 ff ff ff ff          mov $0xffffffff,%eax
  45: c3                      retq
  46: 66 2e 0f 1f 84 00 00    nopw %cs:0x0(%rax,%rax,1)
  4d: 00 00 00
And here's the working GOT version:
0000000000000020 <SDL_InitSubSystem_REAL>:
  20: f3 0f 1e fa             endbr64
  24: 48 8b 05 00 00 00 00    mov 0x0(%rip),%rax   # 2b <SDL_InitSubSystem_REAL+0xb>
        27: R_X86_64_REX_GOTPCRELX stderr-0x4
  2b: 48 8b 15 00 00 00 00    mov 0x0(%rip),%rdx   # 32 <SDL_InitSubSystem_REAL+0x12>
        2e: R_X86_64_REX_GOTPCRELX MyStderr-0x4
  32: 48 8b 08                mov (%rax),%rcx
  35: 48 89 0a                mov %rcx,(%rdx)
  38: 48 8b 15 00 00 00 00    mov 0x0(%rip),%rdx   # 3f <SDL_InitSubSystem_REAL+0x1f>
        3b: R_X86_64_REX_GOTPCRELX MyStderrAddr-0x4
  3f: 48 89 02                mov %rax,(%rdx)
  42: b8 ff ff ff ff          mov $0xffffffff,%eax
  47: c3                      retq
  48: 0f 1f 84 00 00 00 00    nopl 0x0(%rax,%rax,1)
The (modified) source is the following:
void* MyStderr = NULL;
void* MyStderrAddr = NULL;

int
SDL_InitSubSystem(Uint32 flags)
{
    Uint32 flags_initialized = 0;
    MyStderr = stderr;               /* value of stderr (a FILE*) */
    MyStderrAddr = (void*)&stderr;   /* address of the stderr variable */
    return -1;
}
I use those My* variables outside SDL2 to print the address of stderr, for
debugging. I am able to print in the other compilation unit because it uses the
GOTPCRELX relocations to find stderr instead of the faulty PC32 ones.
It appears to me that the PC32 version could not safely use the GOT, because it
doesn't do the extra dereference necessary due to the indirection.
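
To illustrate the missing dereference, here is a minimal sketch in plain C. It
does not use TCC at all: demo_variable and demo_got_slot are made-up names that
stand in for a libc variable such as stderr and for the indirection slot the
linker creates. I am assuming the redirected PC32 load ends up reading the slot
itself (reading a PLT stub instead would be wrong in the same way).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a variable owned by a shared library. */
long demo_variable = 42;

/* Hypothetical stand-in for a GOT slot: it stores the ADDRESS of the
   variable, not the variable's value. */
long *demo_got_slot = &demo_variable;

/* What GOTPCRELX-style code does: two loads. */
long read_via_got(void)
{
    long *addr = demo_got_slot; /* load the address out of the slot */
    return *addr;               /* then dereference it */
}

/* What PC32-style code does if its relocation is pointed at the slot:
   one load that treats the slot's bytes as if they were the variable. */
uintptr_t read_slot_as_value(void)
{
    uintptr_t raw;
    memcpy(&raw, &demo_got_slot, sizeof raw);
    return raw; /* an address, not 42 */
}
```

read_via_got() returns the variable's value, while read_slot_as_value()
returns the variable's address reinterpreted as data, which matches the bogus
stderr I was seeing.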
I can fix this by compiling SDL2 files with -fPIC, which generates
R_X86_64_REX_GOTPCRELX relocations, and TCC handles those perfectly.
I can also go the 100% statically linked route by compiling in a static libc
such as musl, so that all symbols are defined at relocation time. That affects
how many hoops the "user" has to jump through to get their program working;
switching libc is harder than adding -fPIC and recompiling. I am of the opinion
that far more things should be 100% statically linked, but I know there is a
huge body of code that isn't, and converting it takes a decent amount of
tedious effort.
I'm fine with modifying my SDL2 build to work, but I really need something that
can detect when any PC32 relocation is caused to become a GOT/PLT relocation.
That way, I can at least error and instruct the user to re-compile the code as
position-independent.
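
A rough sketch of the kind of check I have in mind, written against the plain
ELF structures from <elf.h> rather than TCC's internals. count_unsafe_pc32 is a
hypothetical helper, not an existing libtcc function, and where exactly it
would hook into TCC's relocation pass is left open.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper (not part of libtcc): count R_X86_64_PC32 relocations
   whose symbol is undefined in the given symbol table. Those are the ones
   that would need a GOT/PLT entry to be synthesized for them. */
size_t count_unsafe_pc32(const Elf64_Rela *rels, size_t nrels,
                         const Elf64_Sym *syms)
{
    size_t count = 0;
    for (size_t i = 0; i < nrels; i++) {
        if (ELF64_R_TYPE(rels[i].r_info) != R_X86_64_PC32)
            continue; /* other relocation types are not examined here */
        const Elf64_Sym *sym = &syms[ELF64_R_SYM(rels[i].r_info)];
        if (sym->st_shndx == SHN_UNDEF)
            count++; /* undefined target reached through a 32-bit PC-relative */
    }
    return count;
}
```

One caveat: this also counts function calls that an older assembler emitted as
PC32 instead of PLT32, and those are fine to route through a PLT stub. Undefined
symbols usually carry no type information, so the check cannot tell data from
code by itself.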
What I would like to get confirmation on is
A) R_X86_64_PC32 entries are unsafe to convert to GOT entries through
"AUTO_GOTPLT_ENTRY" because, at the very least, that conversion is seemingly
unimplemented in TCC's -run/memory mode
and
B) Using COPY relocations wouldn't work in TCC's memory mode either, because the existing
dynamic symbols provided by the TCC application itself have already been placed, so the
copy operation cannot occur without "moving" the TCC application's symbols.
They would have to be moved because the TCC application may have them in a far away place
in memory, too far for a PC32 relocation to reference.
If I am misunderstanding, an explanation of how AUTO_GOTPLT_ENTRY works with
output type = TCC_OUTPUT_MEMORY would be greatly appreciated.
With my current understanding, it appears that I can:
- Simply error if I detect an undefined symbol at the relocate stage with any
PC-relative relocation that cannot reach the full address space (e.g. PC64
should be fine, but PC32 would not work)
- Or do the slightly more sophisticated search of all the loaded dlls
(RTLD_DEFAULT and others) for the symbol, then check if the DLL's already
placed symbol is within the PC-relative relocation distance. If it is, then
target it rather than @plt and use the PC32 relocation normally.
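
The distance check in the second option boils down to whether S + A - P fits in
a signed 32-bit field. A small sketch, where fits_pc32 is a hypothetical helper
and sym_addr is assumed to come from something like dlsym(RTLD_DEFAULT, name):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a PC32 relocation at address `place` can
   directly reach `sym_addr` with the given addend, i.e. the displacement
   S + A - P fits in a signed 32-bit field. */
bool fits_pc32(uint64_t sym_addr, int64_t addend, uint64_t place)
{
    /* Compute in unsigned arithmetic so wraparound is well defined, then
       reinterpret the result as a signed displacement. */
    int64_t disp = (int64_t)(sym_addr + (uint64_t)addend - place);
    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

With typical x86-64 layouts, libc is mapped around 0x7f... while memory handed
out for the compiled code may be far away from it, so I would expect this check
to fail often in practice and the error path to still be needed.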
Am I on track here? I greatly appreciate anyone who read through this. I've
been banging my head against this for several days now.
Thanks,
Macoy Madson