
Re: [Lightning] jit_qdivr_u trashes JIT_R0 on x86_64


From: Paulo César Pereira de Andrade
Subject: Re: [Lightning] jit_qdivr_u trashes JIT_R0 on x86_64
Date: Wed, 28 Aug 2019 14:01:06 -0400

On Wed, Aug 28, 2019 at 15:36, Paul Cercueil <address@hidden> wrote:

  Hi Paul,
[...]
> >>  I call jit_qdiv(reg1, reg2, reg2, reg1)
> >>  (and also jit_qdiv_u with the same arguments). reg1 and reg2 cannot
> >>  represent the same register.
> >
> >   Ok. I see it can fail on some ports, actually, probably only on x86.
> > I will add extra tests for these conditions, and fix any register
> > clobbers that are being generated.
> >   Note that doing it the way you described might cause lightning to
> > generate code to get at least one extra temporary, depending on the
> > input registers. It might look clean in the lightning input, but may
> > generate significantly larger code due to moves to/from %rax and %rdx,
> > as well as spill reloads.
>
> Since I have my own register allocator on top of Lightning, sometimes
> in a function all the registers offered by Lightning have been written
> to, and I guess there's no way for Lightning to know which registers
> are safe to use without spilling. A good improvement would be to have
> an API function to mark a register as freely usable.

  I just tested extending check/qalu.inc to test all possible
permutations, and also check for any register clobber, and it still
passes all tests on x86_64.

  Can you please describe how your API works? I suspect it might be
doing something that is triggering a bug somewhere. Maybe point to
some git repository with an example of how to reproduce the problem.
It would be far better if you could create an example input to
check/lightning. For example:
$ cat test.tst
.data   32
fmt:
.c      "7 / 3 = (%ld, %ld)\n"
.code
        prolog
        movi %r1 7
        movi %r0 3
        // force live
        movi %r2 10
        movi %v0 11
        movi %v1 12
        movi %v2 13
        // end force live
        qdivr %r0 %r1 %r1 %r0
        // check registers
        bnei L0 %r2 10
        bnei L0 %v0 11
        bnei L0 %v1 12
        beqi L1 %v2 13
L0:
        calli @abort
L1:
        prepare
                pushargi fmt
                ellipsis
                pushargr %r0
                pushargr %r1
        finishi @printf
        ret
        epilog
$ ./lightning test.tst
7 / 3 = (2, 1)

  You might also use some ideas from the allocator in
https://github.com/pcpa/owl/blob/master/lib/oemit.c
but that allocator really is for virtual registers. It hardcodes actual
registers for the virtual registers, thread pointer, this pointer,
(virtual) stack pointer, global pointer, and some temporaries. But as
long as the code generator knows it is working only with integer
operations that cannot overflow, or with float/double values, it does
not generate code to spill and reload the hardware registers assigned
to the (four) virtual registers.
  Note that with lightning you should think of registers as resources,
somewhat like a file descriptor.

> Thanks,
> -Paul

Thanks!
Paulo


