From: Paul Cercueil
Subject: Re: [Lightning] jit_qdivr_u trashes JIT_R0 on x86_64
Date: Thu, 29 Aug 2019 02:41:05 +0200

Hi Paulo,

On Wed, 28 Aug 2019 at 20:01, Paulo César Pereira de Andrade <address@hidden> wrote:
> On Wed, 28 Aug 2019 at 15:36, Paul Cercueil <address@hidden> wrote:
>>> Hi Paul,
>>> [...]
>>>> I call jit_qdiv(reg1, reg2, reg2, reg1)
>>>> (and also jit_qdiv_u with the same arguments). reg1 and reg2 cannot
>>>> represent the same register.
>>>
>>> Ok. I see it can fail on some ports, actually, probably only on x86.
>>> I will add extra tests for these conditions, and fix any register
>>> clobbers that should be being generated.
>>> Note that by doing it the way you described, it might cause lightning
>>> to generate code to get at least one extra temporary depending on
>>> the input registers. It might look clean in the lightning input, but
>>> may generate significantly larger code due to moves to/from %rax and
>>> %rdx, as well as spill reloads.
>>
>> Since I have my own register allocator on top of Lightning, sometimes
>> in a function all the registers offered by Lightning have been written
>> to, and I guess there's no way for Lightning to know which registers
>> are safe to use without spilling. A good improvement would be to have
>> an API function to mark a register as freely usable.
>
> I just tested extending check/qalu.inc to test all possible
> permutations, and also check for any register clobber, and it still
> passes all tests on x86_64.
>
> Can you please describe how your API works? I suspect it might be
> doing something that is triggering a bug somewhere. Maybe point to
> some git repository with an example of how to reproduce the problem.
>
> It would be far better if you could create an example input to
> check/lightning. For example:
>
>   $ cat test.tst
>   .data 32
>   fmt:
>   .c "7 / 3 = (%ld, %ld)\n"
>   .code
>       prolog
>       movi %r1 7
>       movi %r0 3
>       // force live
>       movi %r2 10
>       movi %v0 11
>       movi %v1 12
>       movi %v2 13
>       // end force live
>       qdivr %r0 %r1 %r1 %r0
>       // check registers
>       bnei L0 %r2 10
>       bnei L0 %v0 11
>       bnei L0 %v1 12
>       beqi L1 %v2 13
>   L0:
>       calli @abort
>   L1:
>       prepare
>           pushargi fmt
>           ellipsis
>           pushargr %r0
>           pushargr %r1
>       finishi @printf
>       ret
>       epilog
>   $ ./lightning test.tst
>   7 / 3 = (2, 1)

I tried, but I could not come up with a test case that shows the issue... Everything I tried worked just fine.
The actual code that triggers the issue is this one: https://gist.github.com/pcercuei/8db789b415f6ced73abb01a6504b64c7
As it's generated code, it's not the easiest to understand, sorry... The thing to note is that on line 26 I write register 0 (rax), which is then left untouched until line 58 or 71.
This is what Lightning generates: https://gist.github.com/pcercuei/6b4afef692682b139a46449665454681
As you can see, %rax is written to on line 30, and %eax is written back to memory on line 69 or 84.
However, %rax is also unconditionally overwritten on line 49.

Cheers,
-Paul

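To make the clobber described above concrete, here is a toy model in C. It is only a sketch under stated assumptions, not Lightning's code generator: the register numbering and the names regfile, qdivr_u_naive and qdivr_u_saving are made up for illustration. The one hardware fact it relies on is that the x86-64 DIV instruction takes its dividend in RDX:RAX and leaves the quotient in RAX and the remainder in RDX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register numbering, chosen only for this sketch. */
enum { RAX = 0, RCX = 1, RDX = 2, RBX = 3, NREGS = 4 };

typedef struct { uint64_t r[NREGS]; } regfile;

/* Models a lowering of "qdivr_u q, r, a, b" that routes the division
 * through RAX/RDX but never saves their previous contents. */
static void qdivr_u_naive(regfile *rf, int q, int r, int a, int b)
{
    uint64_t n = rf->r[a], d = rf->r[b];
    assert(d != 0);
    rf->r[RAX] = n;            /* mov a -> rax                     */
    rf->r[RDX] = 0;            /* zero the high half of dividend   */
    uint64_t quo = n / d, rem = n % d;
    rf->r[RAX] = quo;          /* div: quotient lands in rax       */
    rf->r[RDX] = rem;          /*      remainder lands in rdx      */
    rf->r[q] = quo;            /* move results to the destinations */
    rf->r[r] = rem;
}

/* Same operation, but RAX and RDX are restored afterwards unless
 * they are themselves one of the two destinations. */
static void qdivr_u_saving(regfile *rf, int q, int r, int a, int b)
{
    uint64_t saved_rax = rf->r[RAX], saved_rdx = rf->r[RDX];
    qdivr_u_naive(rf, q, r, a, b);
    if (q != RAX && r != RAX) rf->r[RAX] = saved_rax;
    if (q != RDX && r != RDX) rf->r[RDX] = saved_rdx;
}
```

With a live value in RAX and the operands in RCX and RBX (the same qdiv(reg1, reg2, reg2, reg1) shape as in the thread), the naive variant returns the right quotient and remainder but loses the value in RAX, while the saving variant keeps it at the cost of extra moves.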
> You might also use some ideas from the allocator in
> https://github.com/pcpa/owl/blob/master/lib/oemit.c
> but the allocator really is for virtual registers. It hardcodes actual
> registers for the virtual registers, thread pointer, this pointer,
> (virtual) stack pointer, global pointer, and some temporaries. But as
> long as the code generator knows it is working only with integer
> operations that cannot overflow or float/double values it does not
> generate code to spill/reload the hardware register assigned to the
> (four) virtual registers.
>
> Note that with lightning you should think of registers as resources,
> somewhat like a file descriptor.
>
>> Thanks,
>> -Paul
>
> Thanks!
> Paulo
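
The "registers as resources, somewhat like a file descriptor" idea, together with the wish for a way to mark a register as freely usable, could be sketched as a small acquire/release pool. This is a hypothetical illustration only: reg_pool, reg_acquire and reg_release are not part of Lightning's API, and the pool size of six is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_REGS 6   /* arbitrary size for this sketch */

/* Bit i set means register i is currently owned by someone. */
typedef struct { uint32_t in_use; } reg_pool;

/* Hands out the lowest free register, much like open() returns the
 * lowest free descriptor. Returns -1 when every register is taken,
 * which is the point where a caller would have to spill. */
static int reg_acquire(reg_pool *p)
{
    for (int i = 0; i < POOL_REGS; i++) {
        if (!(p->in_use & (1u << i))) {
            p->in_use |= 1u << i;
            return i;
        }
    }
    return -1;
}

/* Marks a register as freely usable again. Releasing a register that
 * is not held is treated as a bug, like closing a descriptor twice. */
static void reg_release(reg_pool *p, int reg)
{
    assert(reg >= 0 && reg < POOL_REGS);
    assert(p->in_use & (1u << reg));
    p->in_use &= ~(1u << reg);
}
```

A frontend allocator that tracks ownership this way knows at any point which registers hold nothing live, which is the information the thread says Lightning cannot infer on its own.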