lightning

Re: [PATCH] ppc: Fix 'calli' when floating-point arguments are passed


From: Paulo César Pereira de Andrade
Subject: Re: [PATCH] ppc: Fix 'calli' when floating-point arguments are passed
Date: Thu, 8 Sep 2022 17:33:22 -0300

On Thu, Sep 8, 2022 at 16:00, Paul Cercueil
<paul@crapouillou.net> wrote:

  Hi Paul,

> On Thu, Sep 8 2022 at 14:52:28 -0300, Paulo César Pereira de
> Andrade <paulo.cesar.pereira.de.andrade@gmail.com> wrote:
> > On Thu, Sep 8, 2022 at 14:18, Paulo César Pereira de Andrade
> > <paulo.cesar.pereira.de.andrade@gmail.com> wrote:
> >
> > [snip]
> >
> >>  > The problem now is that r26 should be live at L2 but is not
> >>  > detected as such.
> >>  > This causes Lightning to use r26 as a temporary for the andi line 34
> >>  > (lines 153-155 in the generated code).
> >>
> >>    What code do you use to access 'r10' and 'r3'? It is possible to use
> >>  JIT_R(5) and JIT_R(12) as I did in the C code, but it is an ugly hack,
> >>  taking advantage of the fact that it does not check bounds. The
> >>  check/lightning.c code needs to be patched to accept it...
> >>
> >>    Can you create a reproducer in C, starting with the above sample?
> >
> >   Well, the most likely cause is a lightning build without the last
> > 3 commits, in which case it is a side effect of the jumps to a raw
> > address:
> >
> >     jmpi 0x20000d0c
>
> I can assure you I triple-checked that. I removed the library in my
> toolchain, my program would complain that liblightning.so.0 is missing,
> then I just "make clean install" with the proper path, my program picks
> up the library just fine. And I'm at the latest origin/master, clean,
> no local changes. Also, others can reproduce my issue with my program +
> Lightning master.
>
> Now what's really triggering me, is that I have a C program that
> produces the *exact* same Lightning output (see attachment). It is
> exactly the same, the *only* differences are the jmpi addresses, and my
> test program does not have the r26 missing in L2.
>
> So even with the exact same code in a test example, I cannot reproduce
> it. Yet my regular program shows the bug with the exact same Lightning
> calls. I have absolutely no idea what's going on.

  Please try this:

(gdb) break _split_branches
(gdb) run

Once it stops and the _jit is the jit_context_t where the problem happens,
type:

(gdb) finish

Then get the proper block (if it is the sample, it should be
_jitc->blocks.ptr + 2) and add a watchpoint, for example:

(gdb) p &_jitc->blocks.ptr[2].reglive
$1 = (jit_regset_t *) 0x10114238
(gdb) watch *$1

Then, first check when it sets r26 as live, for example:

(gdb) c
Continuing.
Hardware watchpoint 2: *$1

Old value = 0
New value = 32
_jit_setup (_jit=0x10111270, block=0x10114230) at lightning.c:2259
2259            if (value & jit_cc_a0_reg) {
(gdb) p/t *$1
$2 = 100000
(gdb) p/t *$1 & (1<<17)
$3 = 0
(gdb) c
Continuing.
Hardware watchpoint 2: *$1

Old value = 32
New value = 131104
_jit_setup (_jit=0x10111270, block=0x10114230) at lightning.c:2251
2251            if ((value & jit_cc_a1_reg) &&
(gdb) p/t *$1 & (1<<17)
$4 = 100000000000000000

Once the value is set, keep checking bit 17, that is r26:

Old value = 131104
New value = 16908320
_jit_setup (_jit=0x10111270, block=0x10114230) at lightning.c:2259
2259            if (value & jit_cc_a0_reg) {
(gdb) p/t *$1 & (1<<17)
$5 = 100000000000000000

If at any moment:

(gdb) p/t *$1 & (1<<17)

prints zero, the problem has been found. In that case, please let me
know the output of:

(gdb) bt full
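
For reference, the bit-17 check done by hand above can be sketched in C
as follows. This is only an illustration, not lightning code: the
regset_t typedef, R26_BIT, regset_test() and first_lost() are
hypothetical names, and it assumes the register set fits in 64 bits
(the real jit_regset_t may be defined differently per target). It takes
the successive values reported by the watchpoint and returns the first
one where r26 is no longer live after having been live.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 64-bit mask stands in for lightning's jit_regset_t. */
typedef uint64_t regset_t;

/* Bit index that corresponds to r26 in the gdb session above. */
#define R26_BIT 17u

/* Return 1 when the given bit is set in the register set, else 0. */
int regset_test(regset_t set, unsigned bit)
{
    assert(bit < 64);
    return (int)((set >> bit) & 1u);
}

/*
 * Scan successive watchpoint values and return the index of the first
 * one where the bit is clear after having been set earlier, or -1 if
 * the bit is never lost.
 */
long first_lost(const regset_t *snap, size_t count, unsigned bit)
{
    int seen = 0;

    for (size_t i = 0; i < count; i++) {
        if (regset_test(snap[i], bit))
            seen = 1;           /* bit became (or stayed) live */
        else if (seen)
            return (long)i;     /* bit was live and is now gone */
    }
    return -1;
}
```

With the values from the session (0, 32, 131104, 16908320) the bit is
never lost; the index returned for a failing run is the point where
"bt full" is interesting.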

It is very strange that only a single bit is modified, and it does not
look like corruption or use after free.

You might need to give me ssh access to a debug environment, but for
the moment let's get a better idea of the problem based on the
backtrace.

> -Paul

Thanks,
Paulo


