tinycc-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Tinycc-devel] trying to fix the bug of unclean FPU st(0)


From: grischka
Subject: Re: [Tinycc-devel] trying to fix the bug of unclean FPU st(0)
Date: Tue, 09 Jun 2009 19:00:40 +0200
User-agent: Thunderbird 1.5.0.10 (Windows/20070221)

Soloist Deng wrote:
Hi all:

   I  am using  tcc-0.9.25, and the FPU bug brought a big trouble to
me. I read the source and tried to fix it.
Below is my solution.

Thanks for the detailed analysis and the fix.

The thing as I see it is that TCC basically implements register
based arithmetics and assumes at least two registers available.
As this doesn't really fit with the FPU stack model, some trickery
is necessary.

In any case it seems that to pop the FP stack, `fstp %st(0)' is
the right thing to do.  The reason why it wasn't applied yet was
the other bug in 'gv_dup()' which we weren't able to find.

Of course we cannot be sure that this was the only implication
however I'd suggest to apply your patch now and worry later.

As always, you can push patches on our "mob" branch:
$ git push ssh://address@hidden/srv/git/tinycc.git mob:mob

--- grischka

 There are two places where program(`o(0xd9dd)') will generates `fstp
%st(1)': vpop() in tccgen.c:689 and save_reg() in tccgen.c:210.
We should first change both of them to `o(0xd8dd) // fstp %st(0)'.
But these changes are not enough.  Let's check the following code.

void foo()
{
 double var = 2.7;
 var++;
}

Using  the changed tcc will generate following machine code:

.text:08000000                 public foo
.text:08000000 foo             proc near
.text:08000000
.text:08000000 var_18          = qword ptr -18h
.text:08000000 var_10          = qword ptr -10h
.text:08000000 var_8           = qword ptr -8
.text:08000000
.text:08000000                 push    ebp
.text:08000001                 mov     ebp, esp
.text:08000003                 sub     esp, 18h
.text:08000009                 nop
.text:0800000A                 fld     L_0
.text:08000010                 fst     [ebp+var_8]
.text:08000013                 fstp    st(0)
.text:08000015                 fld     [ebp+var_8]
.text:08000018                 fst     [ebp+var_10]
.text:0800001B                 fstp    st(0)
.text:0800001D                 fst     [ebp+var_18]
.text:08000020                 fstp    st(0)
.text:08000022                 fld     L_1
.text:08000028                 fadd    [ebp+var_10]
.text:0800002B                 fst     [ebp+var_8]
.text:0800002E                 fstp    st(0)
.text:08000030                 leave
.text:08000031                 retn
.text:08000031 foo             endp
.text:08000031
.text:08000031 _text           ends
--------------------------------------------------
.data:08000040 ; Segment type: Pure data
.data:08000040 ; Segment permissions: Read/Write
.data:08000040 ; Segment alignment '32byte' can not be represented in assembly
.data:08000040 _data           segment page public 'DATA' use32
.data:08000040                 assume cs:_data
.data:08000040                 ;org 8000040h
.data:08000040 L_0             dq 400599999999999Ah
.data:08000048 L_1             dq 3FF0000000000000h
.data:08000048 _data           ends

Please notice the code snippet from 0800000A  to 08000020
// double var = 2.7; load constant to st(0)
.text:0800000A                 fld     L_0
// double var = 2.7; store st(0) to `var'
.text:08000010                 fst     [ebp+var_8]
// double var = 2.7; poping st(0)  will empty the floating registers stack
.text:08000013                 fstp    st(0)

  After that ,tcc will call `void inc(int post, int c)" in
tccgen.c:2150, and produce 08000015 to 0800001B through the calling
chain (inc ->gv_dup)
// load from `var' to st(0)
.text:08000015                 fld     [ebp+var_8]
// store st(0) to a temporary location
.text:08000018                 fst     [ebp+var_10]
// poping st(0)  will empty the floating registers stack
.text:0800001B                 fstp    st(0)

  And the calling chain
(gen_op('+')->gen_opif('+')->gen_opf('+')->gv(rc=2)->get_reg(rc=2)->save_reg(r=3))
will produce 0800001D to 08000020 .
// store st(0) to a temporary location, but floating stack is empty!
.text:0800001D                 fst     [ebp+var_18]
// poping st(0)  will empty the floating registers stack
.text:08000020                 fstp    st(0)

   The `0800001D   fst     [ebp+var_18]' will store st(0) to a memory
location, but st(0) is empty. That will cause  FPU invalid operation
exception(#IE).
Why does tcc do that? Please read `gv_dup' called by `inc' carefully.
Notice these lines:

(1):        r = gv(rc);
(2):       r1 = get_reg(rc);
(3):        sv.r = r;
       sv.c.ul = 0;
(4)        load(r1, &sv); /* move r to r1 */
(5)        vdup();
       /* duplicates value */
(6)        vtop->r = r1;
  (1)  let the vtop occupy TREG_ST0, and `r' will be TREG_ST0.  (2)
try to get a free floating register,but tcc assume
there is only one, so it wil force vtop goto memory and assign `r1'
with TREG_ST0. When executing (3), it will do nothing
because `r' equals `r1'. (5) duplicates vtop.  Then (6) let the new
vtop occupy TREG_ST0, but this will cause problem
because the old vtop has been moved to memory, so the new duplicated
vtop does not reside in TREG_ST0 but also
in memory after that. TREG_ST0 is not occupied but freely availabe
now.   `gen_op('+')'  need at least one oprand in register,
so it will incorrectly think TREG_ST0 is occupied by vtop and produce
instructions(0800001D and 08000020) to store it to
a temporary memory location.

  According program above, if `r' == `r1' it is impossible for the old
vtop to still occupy the `r' register .  And `load' will do nothing
too at this condition.
So the `gv_dup' can not promise the semantics that old vtop in one
register and the new duplicated vtop in another register at the same
time.

  I changed (6) to
if (r != r1)
{
 vtop->r = r1;
}

  Then the new generated machine code will be :

.text:08000000                 push    ebp
.text:08000001                 mov     ebp, esp
.text:08000003                 sub     esp, 10h
.text:08000009                 nop
.text:0800000A                 fld     L_0
.text:08000010                 fst     [ebp+var_8]
.text:08000013                 fstp    st(0)
.text:08000015                 fld     [ebp+var_8]
.text:08000018                 fst     [ebp+var_10]
.text:0800001B                 fstp    st(0)
.text:0800001D                 fld     L_1
.text:08000023                 fadd    [ebp+var_10]
.text:08000026                 fst     [ebp+var_8]
.text:08000029                 fstp    st(0)
.text:0800002B                 leave
.text:0800002C                 retn

   It works well, and will clean the floating registers stack when return.

  Finally, I want to know there is any potential problem of this fixing ?

-----------------------------------------------
soloist


_______________________________________________
Tinycc-devel mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/tinycc-devel









reply via email to

[Prev in Thread] Current Thread [Next in Thread]