bug-classpath

[Bug classpath/34823] New: Floating point rounding mode error in inline


From: swadams at ca dot ibm dot com
Subject: [Bug classpath/34823] New: Floating point rounding mode error in inline asm code, linux/cell be cross compiler
Date: 17 Jan 2008 01:36:45 -0000

Hello, I am an undergraduate student assisting in the testing of the accelerated
math library functions for the Cell BE SDK 3.0 at IBM, Markham Lab.  This is my
first bug submission so please forgive the lack of proper format.  If I can
provide further information, please feel free to contact me.

Using the spu-gcc Linux-to-Cell BE cross compiler, v4.1.1, under Fedora
release 7, kernel 2.6.21-1.3194.fc7.  I am uncertain of the hardware specs.

Short version of the bug: At any of the -O levels, gcc returns incorrect
single precision values when the program includes inline assembly functions,
unless -ffloat-store is specified.  The bug is that -ffloat-store shouldn't
affect this situation, as floating point math on the SPU should use round to 0
rounding.
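
To make "round to 0" concrete, here is a minimal host-side sketch in portable
C.  It is not SPU code and is not taken from the test program; the helper names
to_float_nearest and to_float_toward_zero are made up for illustration.  It
only shows that the two rounding rules can give single precision results that
differ by one unit in the last place, and it assumes an IEEE 754 host whose
default double-to-float conversion rounds to nearest.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert a double to float with the host's default rounding
 * (round to nearest on typical IEEE 754 hosts; an assumption here). */
static float to_float_nearest(double x)
{
    return (float)x;
}

/* Convert a double to float, rounding toward zero (truncation).
 * Hypothetical helper: emulates in software what "round to 0" means. */
static float to_float_toward_zero(double x)
{
    assert(x == x);                 /* NaN is not handled by this sketch */

    float f = (float)x;             /* nearest float to x */
    double fd = (double)f;          /* exact: every float is a double */

    /* If the nearest float lies further from zero than x, step back
     * by one ULP.  For IEEE 754 floats, decrementing the raw bit
     * pattern reduces the magnitude by exactly one ULP, for either
     * sign. */
    if ((x > 0.0 && fd > x) || (x < 0.0 && fd < x)) {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);
        bits -= 1;
        memcpy(&f, &bits, sizeof f);
    }
    return f;
}
```

For example, for x = 1.0/3.0 the first helper gives the bit pattern 0x3eaaaaab
and the second gives 0x3eaaaaaa, a difference of one ULP.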

Long version of the bug (methodology):
While testing accelerated math functions (MASS library) I discovered the bug
when attempting to assert that our inline assembly versions (.h, e.g. static
__inline vector float _acosf4(vector float var1)) and library versions (.a)
produce the same results.  The program is very simple: it uses the
spu_decrementer to time how long it takes to call the function with N results,
and compares the output of (e.g.) the _acosf4() function with acosf4().  If no
optimization level is used, spu-gcc reports the difference between the
functions as 0 (as does XL C).  If any of the -O levels (except perhaps 0, I
don't recall) is used, a discrepancy appears.  The output of my program is
given below as examples.
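
The comparison step described above can be sketched roughly as follows.  This
is a host-side illustration in portable C, not the actual test program: the
function names are hypothetical, and measuring the error as a distance in ULPs
is an assumption, since this report does not say how maxrelerr is computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a float's bit pattern onto an ordered integer line so that the
 * ULP distance between two finite floats is a plain subtraction. */
static int64_t ordered_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    /* Floats are sign-magnitude: negate the magnitude for negatives. */
    return (u & 0x80000000u) ? -(int64_t)(u & 0x7fffffffu) : (int64_t)u;
}

/* Distance between two finite floats, in units in the last place. */
static uint64_t ulp_distance(float a, float b)
{
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return (uint64_t)(d < 0 ? -d : d);
}

/* Compare n results from the version under test (got) against a
 * reference (ref).  Returns the largest ULP distance and stores the
 * input value at which it occurred in *worst_x.  If all results
 * match, *worst_x is x[0]. */
static uint64_t max_ulp_error(const float *got, const float *ref,
                              const float *x, size_t n, float *worst_x)
{
    assert(n > 0);

    uint64_t worst = 0;
    *worst_x = x[0];
    for (size_t i = 0; i < n; i++) {
        uint64_t d = ulp_distance(got[i], ref[i]);
        if (d > worst) {
            worst = d;
            *worst_x = x[i];
        }
    }
    return worst;
}
```

In a harness like the one described, got would hold the outputs of the inline
version and ref the outputs of the library version for the same inputs x; a
return value of 0 means the two versions agree bit for bit, apart from the sign
of zero.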

I tested this with many settings (using the list elsewhere on this site,
building up the individual flags that compose the -O levels) and the bug did
not manifest itself.  -ffloat-store causes the correct results to appear;
however, the description of -ffloat-store is inconsistent with the SPU rounding
modes.  Also, since -ffloat-store is very expensive to use, it can produce a
significant performance hit.

The program is a pure SPU program, not embedded in a PPU program, if that
matters.

EXAMPLES:

With -O3 and -ffloat-store (SIMD inline maxrelerr is 0):
/usr/bin/ccache /usr/bin/spu-gcc        -W -Wall -Winline -Wno-main  -I. -I
/usr/spu/include/mass/ -I /usr/spu/include -I /opt/cell/sdk/usr/spu/include 
-O3 -ffloat-store -D NARGS=1 -D MINEXPO=-149 -D MAXEXPO=127 -D MINEXPO2=-149 -D
MAXEXPO2=149 -D SIFUNC=_acosf4 -D SLFUNC=acosf4 -D LVFUNC=vsacos -D NFLT=4096
-D INLINE='"/usr/spu/include/mass/acosf4.h"' -D
INC='"/usr/spu/include/mass/acos.h"' -D SPEC_FN=0 -O3 -c spu_prog.c

Standalone SPU comparison program
NFLT=4096, ntests=1134592
MINEXPO= -149 MAXEXPO= 127
x  range = 0.000000e+00 3.402734e+38

Long vector results
total ticks            = 9.600000e+01
ticks per vector float = 9.375000e-02
ticks per float        = 2.343750e-02

SIMD inline results
maxrelerr = 0.000000e+00 at 9.999900e+04
total ticks            = 1.219000e+03
ticks per vector float = 1.190430e+00
ticks per float        = 2.976074e-01

SIMD lib results
maxrelerr = 0.000000e+00 at 9.999900e+04
total ticks            = 6.970000e+02
ticks per vector float = 6.806641e-01
ticks per float        = 1.701660e-01

With -O3 only, no -ffloat-store (SIMD inline maxrelerr is nonzero):

/usr/bin/ccache /usr/bin/spu-gcc        -W -Wall -Winline -Wno-main  -I. -I
/usr/spu/include/mass/ -I /usr/spu/include -I /opt/cell/sdk/usr/spu/include 
-O3 -D NARGS=1 -D MINEXPO=-149 -D MAXEXPO=127 -D MINEXPO2=-149 -D MAXEXPO2=149
-D SIFUNC=_acosf4 -D SLFUNC=acosf4 -D LVFUNC=vsacos -D NFLT=4096 -D
INLINE='"/usr/spu/include/mass/acosf4.h"' -D
INC='"/usr/spu/include/mass/acos.h"' -D SPEC_FN=0 -O3 -c spu_prog.c

Standalone SPU comparison program
NFLT=4096, ntests=1134592
MINEXPO= -149 MAXEXPO= 127
x  range = 0.000000e+00 3.402734e+38

Long vector results
total ticks            = 9.500000e+01
ticks per vector float = 9.277344e-02
ticks per float        = 2.319336e-02

SIMD inline results
maxrelerr = 3.200000e+01 at 9.985811e-01
total ticks            = 5.130000e+02
ticks per vector float = 5.009766e-01
ticks per float        = 1.252441e-01

SIMD lib results
maxrelerr = 0.000000e+00 at 9.999900e+04
total ticks            = 6.690000e+02
ticks per vector float = 6.533203e-01
ticks per float        = 1.633301e-01


-- 
           Summary: Floating point rounding mode error in inline asm code,
                    linux/cell be cross compiler
           Product: classpath
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: classpath
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: swadams at ca dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34823




