* [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard @ 2007-01-12 5:19 Zhu Ebony-r57400 2007-01-12 5:29 ` Paul Mackerras 2007-01-12 6:41 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Kumar Gala 0 siblings, 2 replies; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-01-12 5:19 UTC (permalink / raw) To: paulus; +Cc: linuxppc-dev Hi Paul, This series of patch add support to fully comply with IEEE-754 standard for E500/E500v2 core when hardware floating point compiling is used. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 5:19 [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Zhu Ebony-r57400 @ 2007-01-12 5:29 ` Paul Mackerras 2007-01-12 5:46 ` Kumar Gala 2007-01-12 6:38 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400 2007-01-12 6:41 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Kumar Gala 1 sibling, 2 replies; 45+ messages in thread From: Paul Mackerras @ 2007-01-12 5:29 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev Zhu Ebony-r57400 writes: > This series of patch add support to fully comply with IEEE-754 standard > for E500/E500v2 core when hardware floating point compiling is used. Your patch descriptions need to explain in detail in what way the current code doesn't comply with the IEEE-754 standard, and what approach you have taken to make it comply. If there are alternative approaches, explain why the approach you have taken is the best. Thanks, Paul. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 5:29 ` Paul Mackerras @ 2007-01-12 5:46 ` Kumar Gala 2007-01-12 8:27 ` Zhu Ebony-r57400 2007-01-12 6:38 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400 1 sibling, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-01-12 5:46 UTC (permalink / raw) To: Paul Mackerras; +Cc: linuxppc-dev On Jan 11, 2007, at 11:29 PM, Paul Mackerras wrote: > Zhu Ebony-r57400 writes: > >> This series of patch add support to fully comply with IEEE-754 >> standard >> for E500/E500v2 core when hardware floating point compiling is used. > > Your patch descriptions need to explain in detail in what way the > current code doesn't comply with the IEEE-754 standard, and what > approach you have taken to make it comply. If there are alternative > approaches, explain why the approach you have taken is the best. > > Thanks, > Paul. In addition, this is something that we need to know how it was tested for IEEE-754 compliance. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 5:46 ` Kumar Gala @ 2007-01-12 8:27 ` Zhu Ebony-r57400 2007-01-12 12:06 ` Segher Boessenkool 0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-12 8:27 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, Paul Mackerras

Hi Kumar,

I wrote some test cases. For SPFP and DPFP exception testing, the cases
cover addition, subtraction, multiplication, division, comparisons,
conversions, divide-by-zero (DBZ), etc. with NaN/denorm/Inf operands.
I also tested cases where the result of the operation is a NaN, denorm
or Inf, or overflows or underflows. For Vector SPFP exception testing,
I wrote an inline-asm-based test program that exercises the
instructions directly.

Ebony

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: January 12, 2007 13:46
> To: Paul Mackerras
> Cc: Zhu Ebony-r57400; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Jan 11, 2007, at 11:29 PM, Paul Mackerras wrote:
>
> > Zhu Ebony-r57400 writes:
> >
> >> This series of patch add support to fully comply with IEEE-754
> >> standard for E500/E500v2 core when hardware floating point compiling
> >> is used.
> >
> > Your patch descriptions need to explain in detail in what way the
> > current code doesn't comply with the IEEE-754 standard, and what
> > approach you have taken to make it comply. If there are alternative
> > approaches, explain why the approach you have taken is the best.
> >
> > Thanks,
> > Paul.
>
> In addition, this is something that we need to know how it
> was tested for IEEE-754 compliance.
>
> - k
>

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 8:27 ` Zhu Ebony-r57400 @ 2007-01-12 12:06 ` Segher Boessenkool 2007-01-15 8:41 ` Zhu Ebony-r57400 0 siblings, 1 reply; 45+ messages in thread From: Segher Boessenkool @ 2007-01-12 12:06 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, Paul Mackerras > I wrote some cases for testing. For SPFP and DPFP exception testing, > the test cases included plus, minus, multiply, divide, comparisons, > conversions, > DBZ... for Nan/Denorm/Inf numbers. I also tested the cases that > the operation result would generate Nan/Denorm/Inf/overflow/underflow > numbers. For Vector SPFP exception testing, I wrote inline asm based > testing > program to test the instructions directly. Any chance you could submit that testing code too? Would be useful for others :-) Segher ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 12:06 ` Segher Boessenkool @ 2007-01-15 8:41 ` Zhu Ebony-r57400 0 siblings, 0 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-15 8:41 UTC (permalink / raw) To: Segher Boessenkool; +Cc: linuxppc-dev, Paul Mackerras

> -----Original Message-----
> From: Segher Boessenkool [mailto:segher@kernel.crashing.org]
> Sent: January 12, 2007 20:06
> To: Zhu Ebony-r57400
> Cc: Kumar Gala; Paul Mackerras; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
> > I wrote some cases for testing. For SPFP and DPFP exception testing,
> > the test cases included plus, minus, multiply, divide, comparisons,
> > conversions, DBZ... for Nan/Denorm/Inf numbers. I also tested the
> > cases that the operation result would generate
> > Nan/Denorm/Inf/overflow/underflow numbers. For Vector SPFP exception
> > testing, I wrote inline asm based testing program to test the
> > instructions directly.
>
> Any chance you could submit that testing code too? Would be
> useful for others :-)
>
>
> Segher
>
>

The snippet below tests only a limited set of instructions. FYI, all
instructions were tested during actual development.
--------------------------Snip-----------------------------------------------------
/* compile with freescale gcc for MPC8548 powerpc platform
/opt/mtwk/usr/local/gcc-3_4-e500-glibc-2.3.4-dp/powerpc-linux-gnuspe/bin/powerpc-linux-gnuspe-gcc -mcpu=8548 -mhard-float -ffloat-store
 -fno-strict-aliasing -o Mult Mult.c -lm
*/

#include <stdio.h>
#include <math.h>

int main()
{
    float j = 0.0;
    float k = 0.0;
    float result, result0, result1, result2, result3;

    printf ("Invalid operation (denorm) 1:\n");
    k = 2.1E-44;        /* denormal single-precision operand */
    j = 1.5666666;
    result0 = k * j;
    result1 = k + j;
    result2 = k - j;
    result3 = k / j;
    printf("after %g * %g result is %g \n", k, j, result0);
    printf("after %g + %g result is %g \n", k, j, result1);
    printf("after %g - %g result is %g \n", k, j, result2);
    printf("after %g / %g result is %g \n", k, j, result3);
    if (k > j) {
        printf("The bigger one is %g\n", k);
    }
    if (k < j) {
        printf("The smaller one is %g\n", k);
    }
    if (k == j) {
        printf("equal\n");
    }
    return 0;
}
--------------------------Snip-----------------------------------------------------

To test the Vector SPFP instructions, inline-asm-based C code like the
following is used:

--------------------------Snip-----------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <asm/reg.h>

static void write_reg(volatile unsigned *addr, float val);
static float read_reg(volatile unsigned *addr);

/* Store a single-precision value (held in a GPR on SPE) to memory. */
static void write_reg(volatile unsigned *addr, float val)
{
    __asm__ __volatile__("stwx %1,0,%2; eieio"
                 : "=m" (*addr)
                 : "r" (val), "r" (addr));
}

/* Store a double-precision value (64-bit GPR on SPE) to memory. */
static void write_reg_dbl(volatile unsigned *addr, double val)
{
    __asm__ __volatile__("evstddx %1,0,%2; eieio"
                 : "=m" (*addr)
                 : "r" (val), "r" (addr));
}

static float read_reg(volatile unsigned *addr)
{
    float ret;
    __asm__ __volatile__("lwzx %0,0,%1; eieio"
                 : "=r" (ret)
                 : "r" (addr), "m" (*addr));
    return ret;
}

static int read_reg_int(volatile unsigned *addr)
{
    unsigned int ret;
    __asm__ __volatile__("lwzx %0,0,%1; eieio"
                 : "=r" (ret)
                 : "r" (addr), "m" (*addr));
    return ret;
}

/*
 * Binary vector operations: operand A is loaded from addr[0..1],
 * operand B from addr[2..3], and the result is stored to addr[4..5].
 */
static inline void evfsadd(volatile unsigned *addr)
{
    unsigned int rA;
    unsigned int rB;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evlddx %4, 0, %3\n"
                  "evfsadd %1, %1, %4\n"
                  "evstdwx %1, 0, %5\n"
                  : "=m" (*addr)
                  : "r" (rA), "r" (addr), "r" (addr+2), "r" (rB), "r" (addr+4) );
}

static inline void evfssub(volatile unsigned *addr)
{
    unsigned int rA;
    unsigned int rB;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evlddx %4, 0, %3\n"
                  "evfssub %1, %1, %4\n"
                  "evstdwx %1, 0, %5\n"
                  : "=m" (*addr)
                  : "r" (rA), "r" (addr), "r" (addr+2), "r" (rB), "r" (addr+4) );
}

static inline void evfsmul(volatile unsigned *addr)
{
    unsigned int rA;
    unsigned int rB;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evlddx %4, 0, %3\n"
                  "evfsmul %1, %1, %4\n"
                  "evstdwx %1, 0, %5\n"
                  : "=m" (*addr)
                  : "r" (rA), "r" (addr), "r" (addr+2), "r" (rB), "r" (addr+4) );
}

static inline void evfsdiv(volatile unsigned *addr)
{
    unsigned int rA;
    unsigned int rB;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evlddx %4, 0, %3\n"
                  "evfsdiv %1, %1, %4\n"
                  "evstdwx %1, 0, %5\n"
                  : "=m" (*addr)
                  : "r" (rA), "r" (addr), "r" (addr+2), "r" (rB), "r" (addr+4) );
}

/* Vector compares: the condition register is returned via mfcr. */
static inline int evfscmpeq(volatile unsigned *addr)
{
    unsigned int rA;
    unsigned int rB;
    unsigned int val;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evlddx %4, 0, %3\n"
                  "evfscmpeq %0, %1, %4\n"
                  "mfcr %0\n"
                  : "=r" (val)
                  : "r" (rA), "r" (addr), "r" (addr+2), "r" (rB));
    return (val);
}

static inline int evfscmpgt(volatile unsigned *addr)
{
    unsigned int rA;
    unsigned int rB;
    unsigned int val;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evlddx %4, 0, %3\n"
                  "evfscmpgt %0, %1, %4\n"
                  "mfcr %0\n"
                  : "=r" (val)
                  : "r" (rA), "r" (addr), "r" (addr+2), "r" (rB));
    return (val);
}

static inline int evfscmplt(volatile unsigned *addr)
{
    unsigned int rA;
    unsigned int rB;
    unsigned int val;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evlddx %4, 0, %3\n"
                  "evfscmplt %0, %1, %4\n"
                  "mfcr %0\n"
                  : "=r" (val)
                  : "r" (rA), "r" (addr), "r" (addr+2), "r" (rB));
    return (val);
}

/* Unary vector operations: operand from addr[0..1], result to addr[4..5]. */
static inline void evfsabs(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsabs %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void evfsnabs(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsnabs %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void evfsneg(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsneg %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void evfsctui(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsctui %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void evfsctsi(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsctsi %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void evfsctsiz(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsctsiz %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void evfsctuiz(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsctuiz %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void evfsctuf(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsctuf %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void evfsctsf(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "evfsctsf %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

/* Scalar single-precision conversions use 32-bit loads and stores. */
static inline void efsctsf(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("lwzx %1, 0, %2\n"
                  "efsctsf %1, %1\n"
                  "stwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void efsctuf(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("lwzx %1, 0, %2\n"
                  "efsctuf %1, %1\n"
                  "stwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

/* Scalar double-precision conversions use 64-bit loads and stores. */
static inline void efdctuf(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "efdctuf %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

static inline void efdctsf(volatile unsigned *addr)
{
    unsigned int rD;
    __asm__ __volatile__ ("evlddx %1, 0, %2\n"
                  "efdctsf %1, %1\n"
                  "evstdwx %1, 0, %3\n"
                  : "=m" (*addr)
                  : "r" (rD), "r" (addr), "r" (addr+4) );
}

/* Lay out operand A in addr[0..1] and operand B in addr[2..3]. */
static inline void write_reg_vec (unsigned int *addr, float rA0, float rA1,
                  float rB0, float rB1)
{
    write_reg (addr, rA0);
    write_reg (addr+1, rA1);
    write_reg (addr+2, rB0);
    write_reg (addr+3, rB1);
}

int main()
{
    unsigned *store_addr;
    float a0 = 2.1e-44;
    float a1 = -5.738e-42;
    float b0 = 1.5666666;
    float b1 = 1.0001221;
    float d0, d1;
    double b = 0.9999999996507541e+320;
    unsigned int d0_uint, d1_uint;
    unsigned int crD;
    double result;

    printf ("a0, a1 = %g, %g\n", a0, a1);
    printf ("b0, b1 = %g, %g\n", b0, b1);
    printf ("b = %g\n", b);

    store_addr = malloc (sizeof(float)*6);
    write_reg_vec (store_addr, a0, a1, b0, b1);

    evfsadd(store_addr);
    d0 = read_reg (store_addr+4);
    d1 = read_reg (store_addr+5);
    printf ("evfsadd: d0 = %g, d1 = %g\n", d0, d1);

    evfssub(store_addr);
    d0 = read_reg (store_addr+4);
    d1 = read_reg (store_addr+5);
    printf ("evfssub: d0 = %g, d1 = %g\n", d0, d1);

    evfsmul(store_addr);
    d0 = read_reg (store_addr+4);
    d1 = read_reg (store_addr+5);
    printf ("evfsmul: d0 = %g, d1 = %g\n", d0, d1);

    evfsdiv(store_addr);
    d0 = read_reg (store_addr+4);
    d1 = read_reg (store_addr+5);
    printf ("evfsdiv: d0 = %g, d1 = %g\n", d0, d1);

    evfsabs(store_addr);
    d0 = read_reg (store_addr+4);
    d1 = read_reg (store_addr+5);
    printf ("evfsabs: d0 = %g, d1 = %g\n", d0, d1);

    evfsnabs(store_addr);
    d0 = read_reg (store_addr+4);
    d1 = read_reg (store_addr+5);
    printf ("evfsnabs: d0 = %g, d1 = %g\n", d0, d1);

    evfsneg(store_addr);
    d0 = read_reg (store_addr+4);
    d1 = read_reg (store_addr+5);
    printf ("evfsneg: d0 = %g, d1 = %g\n", d0, d1);

    crD = evfscmpeq(store_addr);
    printf ("evfscmpeq: crD = %08x\n", crD);

    crD = evfscmpgt(store_addr);
    printf ("evfscmpgt: crD = %08x\n", crD);

    crD = evfscmplt(store_addr);
    printf ("evfscmplt: crD = %08x\n", crD);

    evfsctui(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    d1_uint = read_reg_int (store_addr+5);
    printf ("evfsctui: d0 = %u, d1 = %u\n", d0_uint, d1_uint);

    evfsctuiz(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    d1_uint = read_reg_int (store_addr+5);
    printf ("evfsctuiz: d0 = %u, d1 = %u\n", d0_uint, d1_uint);

    evfsctsi(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    d1_uint = read_reg_int (store_addr+5);
    printf ("evfsctsi: d0 = %d, d1 = %d\n", d0_uint, d1_uint);

    evfsctsiz(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    d1_uint = read_reg_int (store_addr+5);
    printf ("evfsctsiz: d0 = %d, d1 = %d\n", d0_uint, d1_uint);

    evfsctuf(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    d1_uint = read_reg_int (store_addr+5);
    printf ("evfsctuf: d0 = %08x, d1 = %08x\n", d0_uint, d1_uint);

    evfsctsf(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    d1_uint = read_reg_int (store_addr+5);
    printf ("evfsctsf: d0 = %08x, d1 = %08x\n", d0_uint, d1_uint);

    efsctsf(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    printf ("efsctsf: d0 = %08x\n", d0_uint);

    efsctuf(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    printf ("efsctuf: d0 = %08x\n", d0_uint);

    write_reg_dbl (store_addr, b);

    efdctuf(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    d1_uint = read_reg_int (store_addr+5);
    printf ("efdctuf: d0 = %08x, d1 = %08x\n", d0_uint, d1_uint);

    efdctsf(store_addr);
    d0_uint = read_reg_int (store_addr+4);
    d1_uint = read_reg_int (store_addr+5);
    printf ("efdctsf: d0 = %08x, d1 = %08x\n", d0_uint, d1_uint);

    return 0;
}
--------------------------Snip-----------------------------------------------------

Ebony

^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard 2007-01-12 5:29 ` Paul Mackerras 2007-01-12 5:46 ` Kumar Gala @ 2007-01-12 6:38 ` Zhu Ebony-r57400 2007-01-12 6:49 ` Kumar Gala 2007-01-12 12:03 ` Segher Boessenkool 1 sibling, 2 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-12 6:38 UTC (permalink / raw) To: Paul Mackerras; +Cc: linuxppc-dev

Hi Paul,

On E500/E500v2 cores that implement SPE, the embedded floating-point
APU implements a floating-point system as defined in ANSI/IEEE
Std 754-1985, but relies on software support in order to conform fully
to the standard. Thus, whenever an input operand of a floating-point
instruction is +infinity, -infinity, a denorm or a NaN, or when the
result of an operation overflows or underflows, an interrupt may be
taken, and the interrupt handler is responsible for delivering
IEEE 754-compliant behavior if desired.

When floating-point invalid input exceptions are disabled
(SPEFSCR[FINVE] is cleared), the hardware provides default results when
it receives an infinity, denorm or NaN input, or for the operation 0/0.
When floating-point underflow exceptions are disabled (SPEFSCR[FUNFE]
is cleared) and the result of a floating-point operation underflows, a
signed zero result is produced. When floating-point overflow exceptions
are disabled (SPEFSCR[FOVFE] is cleared) and the result of a
floating-point operation overflows, a pmax or nmax result is produced.
A divide-by-zero exception enable flag (SPEFSCR[FDBZE]) is provided for
generating an interrupt when a divide by zero is attempted, so that a
software handler can conform to the IEEE 754 standard.

In the current code all of these exceptions are disabled, so the
IEEE-754 standard is not fully complied with. Let's see an example:

2.1E-44 * 1.5666666 = ?

On a fully IEEE-754 compliant system (x86, 7450, etc.), the result
should be 3.22299e-44. But on the E500/E500v2 core the result is 0,
because the denormal product is flushed to zero. There are many more
cases showing that the E500 SPE core is not fully IEEE-754 compliant.

The approach I've taken to solve this issue is:
1. Enable SPEFSCR[FINVE|FDBZE|FUNFE|FOVFE] so that the exceptions can
   be taken.
2. Use exception handlers to handle the exceptions.
3. Restore registers and exit from the exception.

In arch/powerpc/math-emu there are some files that emulate
floating-point instructions on non-FPU systems, which may come from
glibc. Some macros are provided to emulate addition, subtraction,
multiplication, division, etc. Therefore, I reused some of the code
there and added some new routines to emulate the SPE instructions that
may cause an exception, including SPFP instructions, DPFP instructions
and Vector SPFP instructions.

Writing independent code to handle the exceptions may be an
alternative, but I think reusing the existing interfaces in the kernel
is the best approach.

Ebony

> -----Original Message-----
> From: Paul Mackerras [mailto:paulus@samba.org]
> Sent: January 12, 2007 13:30
> To: Zhu Ebony-r57400
> Cc: linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754standard
>
> Zhu Ebony-r57400 writes:
>
> > This series of patch add support to fully comply with IEEE-754
> > standard for E500/E500v2 core when hardware floating point
> compiling is used.
>
> Your patch descriptions need to explain in detail in what way
> the current code doesn't comply with the IEEE-754 standard,
> and what approach you have taken to make it comply. If there
> are alternative approaches, explain why the approach you have
> taken is the best.
>
> Thanks,
> Paul.
>

^ permalink raw reply [flat|nested] 45+ messages in thread
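The 2.1E-44 * 1.5666666 example above can be checked on any host with a small C sketch. The helper names (`denorm_product`, `float_from_bits`, `is_denormal`) are made up here for illustration and are not part of the patch series. The sketch assumes a host whose single-precision arithmetic follows IEEE-754 with gradual underflow and is not running in a flush-to-zero mode: 2.1E-44 rounds to 15 units of the smallest denormal (2^-149), and the product rounds to 23 units, which prints as 3.22299e-44.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: multiply the two operands from the example.
 * volatile keeps the compiler from folding the product at build time,
 * so the multiplication is really done by the FP hardware (or by the
 * exception handler that fixes up its result). */
static float denorm_product(void)
{
    volatile float k = 2.1e-44f;    /* rounds to 15 * 2^-149, a denormal */
    volatile float j = 1.5666666f;
    return k * j;
}

/* Build a float from its raw IEEE-754 single-precision bit pattern. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* 1 if r is a nonzero value below the smallest normal float, else 0. */
static int is_denormal(float r)
{
    return r != 0.0f && r > -FLT_MIN && r < FLT_MIN;
}
```

On a compliant host `denorm_product()` equals `float_from_bits(23)`, i.e. 23 * 2^-149; on a core that flushes underflow to zero it would compare equal to 0 instead, which is the discrepancy described in the message.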
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard 2007-01-12 6:38 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400 @ 2007-01-12 6:49 ` Kumar Gala 2007-01-12 12:03 ` Segher Boessenkool 1 sibling, 0 replies; 45+ messages in thread
From: Kumar Gala @ 2007-01-12 6:49 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, Paul Mackerras

On Jan 12, 2007, at 12:38 AM, Zhu Ebony-r57400 wrote:

> Hi Paul,
>
> On SPE implemented E500/E500v2 core, the embedded floating-point
> APU implements a floating-point system as defined in ANSI/IEEE
> standard754-1985 but rely on software support in order to conform fully
> with the standard. Thus, whenever an input operand of a floating-point
> instruction has data values that are +infinity, -infinity, denorm, or NaN,
> or when the result of an operation produces an overflow or an underflow,
> an interrupt may be taken and the interrupt handler is responsible for delivering
> IEEE 754-compliant behavior if desired.

In addition to some other corner cases the HW punts on.

[snip]

> The approach I've taken to solve this issue is:
> 1. Enable SPEFSCR[FINVE|FDBZE|FUNFE|FOVFE] to make sure exceptions
> can take place
> 2. Use exceptions handlers to handle the exceptions.
> 3. Restore registers and exit from exception.
>
> In arch/powerpc/math, there are some files to emulate floating point instructions
> on non-FPU systems, which may come from glibc. Some macros are provided to
> emulate plus, minus, multiply, divide, etc. Therefore, I re-used some of the codes there
> and add some new routines to emulated SPE instruction that may cause exception,
> including SPFP instructions, DPFP instructions and Vector SPFP instructions.
>
> Writing some independent codes to handle the exceptions my be an alternative way,
> but I think re-use the existing interfaces in kernel is the best approach.

I don't believe there is any other way to solve this problem. On
these particular exceptions, the HW doesn't provide any real assist
and we have to recompute the result from scratch.

Once we agree the approach is reasonable, I'll make comments on the
actual handlers.

- k

> Ebony
>
>
>
>> -----Original Message-----
>> From: Paul Mackerras [mailto:paulus@samba.org]
>> Sent: January 12, 2007 13:30
>> To: Zhu Ebony-r57400
>> Cc: linuxppc-dev@ozlabs.org
>> Subject: Re: [patch][0/5] powerpc: Add support to fully
>> comply with IEEE-754standard
>>
>> Zhu Ebony-r57400 writes:
>>
>>> This series of patch add support to fully comply with IEEE-754
>>> standard for E500/E500v2 core when hardware floating point
>> compiling is used.
>>
>> Your patch descriptions need to explain in detail in what way
>> the current code doesn't comply with the IEEE-754 standard,
>> and what approach you have taken to make it comply. If there
>> are alternative approaches, explain why the approach you have
>> taken is the best.
>>
>> Thanks,
>> Paul.
>>

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard 2007-01-12 6:38 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400 2007-01-12 6:49 ` Kumar Gala @ 2007-01-12 12:03 ` Segher Boessenkool 2007-01-15 8:16 ` Zhu Ebony-r57400 1 sibling, 1 reply; 45+ messages in thread From: Segher Boessenkool @ 2007-01-12 12:03 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, Paul Mackerras > When floating-point underflow exceptions are disabled > (SPEFSCR[FUNFE] is cleared) and the result of a floating-point > operation > underflows, a signed zero result is produced. You probably want to make at least this one tweakable per-process; on some important algorithms (some FFTs etc.) the results are perfectly acceptable with underflow- to-zero and the performance difference is huge. AltiVec can do this per-process too, for example -- it has a user register for setting this, you need some other interface though. Segher ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard 2007-01-12 12:03 ` Segher Boessenkool @ 2007-01-15 8:16 ` Zhu Ebony-r57400 2007-01-15 16:08 ` Segher Boessenkool 0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-15 8:16 UTC (permalink / raw) To: Segher Boessenkool; +Cc: linuxppc-dev, Paul Mackerras

> -----Original Message-----
> From: Segher Boessenkool [mailto:segher@kernel.crashing.org]
> Sent: January 12, 2007 20:04
> To: Zhu Ebony-r57400
> Cc: Paul Mackerras; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754standard
>
> > When floating-point underflow exceptions are disabled (SPEFSCR[FUNFE]
> > is cleared) and the result of a floating-point operation underflows, a
> > signed zero result is produced.
>
> You probably want to make at least this one tweakable
> per-process; on some important algorithms (some FFTs
> etc.) the results are perfectly acceptable with underflow-
> to-zero and the performance difference is huge. AltiVec can
> do this per-process too, for example -- it has a user
> register for setting this, you need some other interface though.
>
>
> Segher

Do you mean we can add a switch that lets the user choose whether to
enable exception handling or just use the hardware default values? If
so, a separate CONFIG_SPE_MATH_EMU option in Kconfig is reasonable...

Ebony

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard 2007-01-15 8:16 ` Zhu Ebony-r57400 @ 2007-01-15 16:08 ` Segher Boessenkool 0 siblings, 0 replies; 45+ messages in thread From: Segher Boessenkool @ 2007-01-15 16:08 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, Paul Mackerras > Do you mean we can make a switch in order to let user choose > whether to enable exception handling or just use default value? Yes exactly. I'm not sure what kind of interface you should use for this though, maybe a sysctl? Someone else can tell you I hope :-) Segher ^ permalink raw reply [flat|nested] 45+ messages in thread
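The per-process switch discussed above could be sketched as a function that chooses which SPEFSCR exception enables to set for a task. This is only an illustration of the idea, not code from the patch series: the `spe_fp_policy` enum and both function names are hypothetical, and the bit values of the enable flags are assumptions that should be checked against the core reference manual. Only the underflow enable is dropped in the relaxed mode; whether denormal inputs (which raise the invalid-input exception) should also be left to the hardware defaults is a separate decision that the thread does not settle.

```c
#include <stdint.h>

/* Assumed SPEFSCR exception-enable bits (verify against the manual). */
#define SPEFSCR_FINVE 0x00000020u  /* invalid operation/input enable */
#define SPEFSCR_FDBZE 0x00000010u  /* divide-by-zero enable */
#define SPEFSCR_FUNFE 0x00000008u  /* underflow enable */
#define SPEFSCR_FOVFE 0x00000004u  /* overflow enable */

#define SPEFSCR_ALL_ENABLES \
    (SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | SPEFSCR_FOVFE)

/* Hypothetical per-process policy selected through some user interface. */
enum spe_fp_policy {
    SPE_FP_IEEE_STRICT,      /* trap and fix up every case in software */
    SPE_FP_FLUSH_UNDERFLOW   /* accept the hardware's signed-zero result */
};

/* Exception enables that should be set for the given policy. */
static uint32_t spefscr_enable_mask(enum spe_fp_policy policy)
{
    uint32_t mask = SPEFSCR_ALL_ENABLES;

    if (policy == SPE_FP_FLUSH_UNDERFLOW)
        mask &= ~SPEFSCR_FUNFE;
    return mask;
}

/* Replace only the enable bits of a saved SPEFSCR value; the rounding
 * mode and status bits are left untouched. */
static uint32_t spefscr_apply_policy(uint32_t spefscr, enum spe_fp_policy policy)
{
    return (spefscr & ~SPEFSCR_ALL_ENABLES) | spefscr_enable_mask(policy);
}
```

Keeping the policy per task rather than behind a single Kconfig option would let an FFT-heavy process skip the underflow trap while other processes still get the software fixup.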
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 5:19 [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Zhu Ebony-r57400 2007-01-12 5:29 ` Paul Mackerras @ 2007-01-12 6:41 ` Kumar Gala 2007-01-12 8:09 ` Zhu Ebony-r57400 1 sibling, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-01-12 6:41 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote: > Hi Paul, > > This series of patch add support to fully comply with IEEE-754 > standard > for E500/E500v2 core when hardware floating point compiling is used. > > Ebony Here are some general comments: * We should be able to support math-emu (as it stands) and the fixup handling [you break math-emu] * Copyrights / header comments should give credit to the orig math- emu code * Why isn't there any handling of SPEFloatingPointRound exceptions? - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 6:41 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Kumar Gala @ 2007-01-12 8:09 ` Zhu Ebony-r57400 2007-01-12 12:04 ` Segher Boessenkool 2007-01-12 18:36 ` Kumar Gala 0 siblings, 2 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-12 8:09 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus

Hi Kumar,

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: January 12, 2007 14:42
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote:
>
> > Hi Paul,
> >
> > This series of patch add support to fully comply with IEEE-754
> > standard for E500/E500v2 core when hardware floating point compiling
> > is used.
> >
> > Ebony
>
> Here are some general comments:
> * We should be able to support math-emu (as it stands) and
> the fixup handling [you break math-emu]

I don't think I break math-emu. I believe the code I added has
no impact on the existing math-emu.

> * Copyrights / header comments should give credit to the orig
> math- emu code

I'd like to do this, but in most of the handler code I can't find
copyright information for the original authors. I think the math-emu
code comes from glibc. In sigfpe_handler.c, I gave credit to the
original author.

> * Why isn't there any handling of SPEFloatingPointRound exceptions?

I think the SPEFloatingPointRound exception does not need to be handled
if we handle floating-point exceptions this way.

>
> - k
>

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 8:09 ` Zhu Ebony-r57400 @ 2007-01-12 12:04 ` Segher Boessenkool 2007-01-15 6:45 ` Zhu Ebony-r57400 2007-01-12 18:36 ` Kumar Gala 1 sibling, 1 reply; 45+ messages in thread From: Segher Boessenkool @ 2007-01-12 12:04 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus >> * Why isn't there any handling of SPEFloatingPointRound exceptions? > > I think the SPEFloatingPointRound exception is not necessary to handle > if we > handle floating point exception this way. Some more explanation than "I don't think so" would be nice ;-) Segher ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 12:04 ` Segher Boessenkool @ 2007-01-15 6:45 ` Zhu Ebony-r57400 2007-01-15 15:54 ` Segher Boessenkool 0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-15 6:45 UTC (permalink / raw) To: Segher Boessenkool; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Segher Boessenkool [mailto:segher@kernel.crashing.org]
> Sent: January 12, 2007 20:05
> To: Zhu Ebony-r57400
> Cc: Kumar Gala; paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
> >> * Why isn't there any handling of SPEFloatingPointRound exceptions?
> >
> > I think the SPEFloatingPointRound exception is not necessary to handle
> > if we handle floating point exception this way.
>
> Some more explanation than "I don't think so" would be nice ;-)
>

Thanks for the reminder; I would like to correct what I said before.
The FP round interrupt can be taken in circumstances where the FP
data interrupt is not, so we may still have to handle the round
interrupt to comply fully with IEEE-754. Do you think handling the
FP round interrupt the same way as the FP data interrupt is a
reasonable approach?

Ebony

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-15 6:45 ` Zhu Ebony-r57400 @ 2007-01-15 15:54 ` Segher Boessenkool 0 siblings, 0 replies; 45+ messages in thread From: Segher Boessenkool @ 2007-01-15 15:54 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus > Thanks for your reminder, I would like to correct what I said before. > FP round interrupt may be taken on some circumstance that FP > data interrupt doesn't take place, so we may still have to handle round > interrupt to fully comply IEEE-754. Do you think using the same way to > handle FP round interrupt as FP data interrupt is a reasonable > approach? Well you would take and handle the exception with similar code, based on the same config option too. The actual way to handle the math is very different I suppose, like Kumar said. Segher ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 8:09 ` Zhu Ebony-r57400 2007-01-12 12:04 ` Segher Boessenkool @ 2007-01-12 18:36 ` Kumar Gala 2007-01-15 6:37 ` Zhu Ebony-r57400 1 sibling, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-01-12 18:36 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Jan 12, 2007, at 2:09 AM, Zhu Ebony-r57400 wrote: > Hi Kumar, > >> -----Original Message----- >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> Sent: January 12, 2007 14:42 >> To: Zhu Ebony-r57400 >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org >> Subject: Re: [patch][0/5] powerpc: Add support to fully >> comply with IEEE-754 standard >> >> >> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote: >> >>> Hi Paul, >>> >>> This series of patch add support to fully comply with IEEE-754 >>> standard for E500/E500v2 core when hardware floating point >> compiling >>> is used. >>> >>> Ebony >> >> Here are some general comments: >> * We should be able to support math-emu (as it stands) and >> the fixup handling [you break math-emu] > > I don't think I break the math-emu. I think the codes I added have > no impact to the existing math-emu. This snippet of code breaks it from math-emu/sfp-machine.h >> +#ifdef CONFIG_SPE >> +#define __FPU_FPSCR (current->thread.spefscr) >> +#else >> #define __FPU_FPSCR (current->thread.fpscr.val) >> +#endif By doing this if I want 'classic FP' emulation as well as the IEEE fixup my fpscr for classic emu will not be updated properly. > >> * Copyrights / header comments should give credit to the orig >> math- emu code > I'd like to do this, but in most handler codes, I can't find > copyright information > of the orig authors. I think the math-emu code comes from glibc. In > the > sigfpe_handler.c, I gave credit to the orig author. I think a comment is sufficient stating this is take from the math-emu code. 
>> * Why isn't there any handling of SPEFloatingPointRound exceptions? > > I think the SPEFloatingPointRound exception is not necessary to > handle if we > handle floating point exception this way. I dont believe this, you'll have to explain if this is really true. But, I'm almost sure that if the RND mode is set to +/-inf and we do an operation that is within the normal bounds that should round we will NOT get one of the other exceptions. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-12 18:36 ` Kumar Gala @ 2007-01-15 6:37 ` Zhu Ebony-r57400 2007-01-15 14:37 ` Kumar Gala 0 siblings, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-01-15 6:37 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: January 13, 2007 02:36 > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc: Add support to fully > comply with IEEE-754 standard > > > On Jan 12, 2007, at 2:09 AM, Zhu Ebony-r57400 wrote: > > > Hi Kumar, > > > >> -----Original Message----- > >> From: Kumar Gala [mailto:galak@kernel.crashing.org] > >> Sent: January 12, 2007 14:42 > >> To: Zhu Ebony-r57400 > >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > >> Subject: Re: [patch][0/5] powerpc: Add support to fully > comply with > >> IEEE-754 standard > >> > >> > >> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote: > >> > >>> Hi Paul, > >>> > >>> This series of patch add support to fully comply with IEEE-754 > >>> standard for E500/E500v2 core when hardware floating point > >> compiling > >>> is used. > >>> > >>> Ebony > >> > >> Here are some general comments: > >> * We should be able to support math-emu (as it stands) and > the fixup > >> handling [you break math-emu] > > > > I don't think I break the math-emu. I think the codes I > added have no > > impact to the existing math-emu. > > This snippet of code breaks it from math-emu/sfp-machine.h > > >> +#ifdef CONFIG_SPE > >> +#define __FPU_FPSCR (current->thread.spefscr) > >> +#else > >> #define __FPU_FPSCR (current->thread.fpscr.val) > >> +#endif > > By doing this if I want 'classic FP' emulation as well as the > IEEE fixup my fpscr for classic emu will not be updated properly. 
Logically, user can choose "SPE Support" and "Math emulation" at the same time on menuconfig. But from my understanding, it is not necessary to select math-emu on a SPE available system, since SPE can do math operation. > > > > >> * Copyrights / header comments should give credit to the orig > >> math- emu code > > I'd like to do this, but in most handler codes, I can't find > > copyright information > > of the orig authors. I think the math-emu code comes from > glibc. In > > the > > sigfpe_handler.c, I gave credit to the orig author. > > I think a comment is sufficient stating this is take from the math- > emu code. > > >> * Why isn't there any handling of SPEFloatingPointRound exceptions? > > > > I think the SPEFloatingPointRound exception is not necessary to > > handle if we > > handle floating point exception this way. > > I dont believe this, you'll have to explain if this is really true. > But, I'm almost sure that if the RND mode is set to +/-inf and we do > an operation that is within the normal bounds that should round we > will NOT get one of the other exceptions. > > - k > > I looked into the manual again, and found what you are saying is correct. The reason for developing IEEE-754 fixup came from customer's complain, which is about denormalized computation can't generate the correct result as the same as on x86. So what I was concentrating on is floating-point data interrupt. The truth is, FP round interrupt may be taken on some circumstance that FP data interrupt doesn't take place. As you said, if RND mode is set to +/- inf, FP round interrupt will generate if we do an operation within the normal bounds. Do you think we use the same way to handle FP round interrupt as FP data interrupt is reasonable? How would you suggest? Thanks. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-15 6:37 ` Zhu Ebony-r57400 @ 2007-01-15 14:37 ` Kumar Gala 2007-01-16 9:54 ` Zhu Ebony-r57400 ` (2 more replies) 0 siblings, 3 replies; 45+ messages in thread From: Kumar Gala @ 2007-01-15 14:37 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Jan 15, 2007, at 12:37 AM, Zhu Ebony-r57400 wrote: > > >> -----Original Message----- >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> Sent: January 13, 2007 02:36 >> To: Zhu Ebony-r57400 >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org >> Subject: Re: [patch][0/5] powerpc: Add support to fully >> comply with IEEE-754 standard >> >> >> On Jan 12, 2007, at 2:09 AM, Zhu Ebony-r57400 wrote: >> >>> Hi Kumar, >>> >>>> -----Original Message----- >>>> From: Kumar Gala [mailto:galak@kernel.crashing.org] >>>> Sent: January 12, 2007 14:42 >>>> To: Zhu Ebony-r57400 >>>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org >>>> Subject: Re: [patch][0/5] powerpc: Add support to fully >> comply with >>>> IEEE-754 standard >>>> >>>> >>>> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote: >>>> >>>>> Hi Paul, >>>>> >>>>> This series of patch add support to fully comply with IEEE-754 >>>>> standard for E500/E500v2 core when hardware floating point >>>> compiling >>>>> is used. >>>>> >>>>> Ebony >>>> >>>> Here are some general comments: >>>> * We should be able to support math-emu (as it stands) and >> the fixup >>>> handling [you break math-emu] >>> >>> I don't think I break the math-emu. I think the codes I >> added have no >>> impact to the existing math-emu. >> >> This snippet of code breaks it from math-emu/sfp-machine.h >> >>>> +#ifdef CONFIG_SPE >>>> +#define __FPU_FPSCR (current->thread.spefscr) >>>> +#else >>>> #define __FPU_FPSCR (current->thread.fpscr.val) >>>> +#endif >> >> By doing this if I want 'classic FP' emulation as well as the >> IEEE fixup my fpscr for classic emu will not be updated properly. 
> > Logically, user can choose "SPE Support" and "Math emulation" at the > same time on menuconfig. But from my understanding, it is not > necessary > to select math-emu on a SPE available system, since SPE can do math > operation. This is not true. If I want to run a "classic" PPC binary with FP I need "Math emulation" and if I want to run an SPE one I enable "SPE Support". I could want to run both of these types of binaries on the same system at the same time. >>>> * Copyrights / header comments should give credit to the orig >>>> math- emu code >>> I'd like to do this, but in most handler codes, I can't find >>> copyright information >>> of the orig authors. I think the math-emu code comes from >> glibc. In >>> the >>> sigfpe_handler.c, I gave credit to the orig author. >> >> I think a comment is sufficient stating this is take from the math- >> emu code. >> >>>> * Why isn't there any handling of SPEFloatingPointRound exceptions? >>> >>> I think the SPEFloatingPointRound exception is not necessary to >>> handle if we >>> handle floating point exception this way. >> >> I dont believe this, you'll have to explain if this is really true. >> But, I'm almost sure that if the RND mode is set to +/-inf and we do >> an operation that is within the normal bounds that should round we >> will NOT get one of the other exceptions. >> >> - k >> >> > > I looked into the manual again, and found what you are saying is > correct. The reason > for developing IEEE-754 fixup came from customer's complain, which > is about denormalized > computation can't generate the correct result as the same as on > x86. So what I was > concentrating on is floating-point data interrupt. The truth is, FP > round interrupt may > be taken on some circumstance that FP data interrupt doesn't take > place. > > As you said, if RND mode is set to +/- inf, FP round interrupt will > generate if we > do an operation within the normal bounds. 
Do you think we use the > same way to > handle FP round interrupt as FP data interrupt is reasonable? How > would you suggest? No, I think the round handler should try to do the rounding by hand. Since you have the non rounded information provided by HW, its much simpler to just do the rounding step. - k > > Thanks. > > Ebony > > > > ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-15 14:37 ` Kumar Gala @ 2007-01-16 9:54 ` Zhu Ebony-r57400 2007-01-25 8:25 ` Zhu Ebony-r57400 2007-02-07 5:52 ` Zhu Ebony-r57400 2 siblings, 0 replies; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-01-16 9:54 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: January 15, 2007 22:37 > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc: Add support to fully > comply with IEEE-754 standard > > > On Jan 15, 2007, at 12:37 AM, Zhu Ebony-r57400 wrote: > > > > > > > >> -----Original Message----- > >> From: Kumar Gala [mailto:galak@kernel.crashing.org] > >> Sent: January 13, 2007 02:36 > >> To: Zhu Ebony-r57400 > >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > >> Subject: Re: [patch][0/5] powerpc: Add support to fully > comply with > >> IEEE-754 standard > >> > >> > >> On Jan 12, 2007, at 2:09 AM, Zhu Ebony-r57400 wrote: > >> > >>> Hi Kumar, > >>> > >>>> -----Original Message----- > >>>> From: Kumar Gala [mailto:galak@kernel.crashing.org] > >>>> Sent: January 12, 2007 14:42 > >>>> To: Zhu Ebony-r57400 > >>>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > >>>> Subject: Re: [patch][0/5] powerpc: Add support to fully > >> comply with > >>>> IEEE-754 standard > >>>> > >>>> > >>>> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote: > >>>> > >>>>> Hi Paul, > >>>>> > >>>>> This series of patch add support to fully comply with IEEE-754 > >>>>> standard for E500/E500v2 core when hardware floating point > >>>> compiling > >>>>> is used. > >>>>> > >>>>> Ebony > >>>> > >>>> Here are some general comments: > >>>> * We should be able to support math-emu (as it stands) and > >> the fixup > >>>> handling [you break math-emu] > >>> > >>> I don't think I break the math-emu. 
I think the codes I > >> added have no > >>> impact to the existing math-emu. > >> > >> This snippet of code breaks it from math-emu/sfp-machine.h > >> > >>>> +#ifdef CONFIG_SPE > >>>> +#define __FPU_FPSCR (current->thread.spefscr) > >>>> +#else > >>>> #define __FPU_FPSCR (current->thread.fpscr.val) > >>>> +#endif > >> > >> By doing this if I want 'classic FP' emulation as well as the IEEE > >> fixup my fpscr for classic emu will not be updated properly. > > > > Logically, user can choose "SPE Support" and "Math > emulation" at the > > same time on menuconfig. But from my understanding, it is not > > necessary to select math-emu on a SPE available system, > since SPE can > > do math operation. > > This is not true. If I want to run a "classic" PPC binary > with FP I need "Math emulation" and if I want to run an SPE > one I enable "SPE Support". I could want to run both of > these types of binaries on the same system at the same time. > So how about defining a separate macro for spefscr? #define __FPU_SPEFSCR (current->thread.spefscr) > >>>> * Copyrights / header comments should give credit to the orig > >>>> math- emu code > >>> I'd like to do this, but in most handler codes, I can't find > >>> copyright information of the orig authors. I think the > math-emu code > >>> comes from > >> glibc. In > >>> the > >>> sigfpe_handler.c, I gave credit to the orig author. > >> > >> I think a comment is sufficient stating this is take from > the math- > >> emu code. > >> > >>>> * Why isn't there any handling of SPEFloatingPointRound > exceptions? > >>> > >>> I think the SPEFloatingPointRound exception is not necessary to > >>> handle if we handle floating point exception this way. > >> > >> I dont believe this, you'll have to explain if this is really true. 
> >> But, I'm almost sure that if the RND mode is set to +/-inf > and we do > >> an operation that is within the normal bounds that should round we > >> will NOT get one of the other exceptions. > >> > >> - k > >> > >> > > > > I looked into the manual again, and found what you are saying is > > correct. The reason for developing IEEE-754 fixup came from > customer's > > complain, which is about denormalized computation can't > generate the > > correct result as the same as on x86. So what I was > concentrating on > > is floating-point data interrupt. The truth is, FP round > interrupt may > > be taken on some circumstance that FP data interrupt doesn't take > > place. > > > > As you said, if RND mode is set to +/- inf, FP round interrupt will > > generate if we do an operation within the normal bounds. Do > you think > > we use the same way to handle FP round interrupt as FP data > interrupt > > is reasonable? How would you suggest? > > No, I think the round handler should try to do the rounding > by hand. > Since you have the non rounded information provided by HW, > its much simpler to just do the rounding step. > OK, I will study it. Thanks, Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
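One way to read the separate-macro suggestion in the message above, as a minimal sketch only: `__FPU_FPSCR` keeps pointing at the classic FPSCR image unconditionally, and the SPE fixup code uses its own accessor. Only the two macro bodies come from the thread. The struct layouts, `mock_task`, and the two helper functions are simplified stand-ins invented here so the fragment compiles outside the kernel.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the kernel's per-thread
 * state; the real structures contain many more fields. */
struct fpscr_image { uint32_t val; };
struct thread_struct {
    struct fpscr_image fpscr;   /* classic FP status and control image */
    uint32_t spefscr;           /* SPE FP status and control image */
};
struct task_struct { struct thread_struct thread; };

/* Stand-in for the kernel's "current" task pointer (assumption). */
struct task_struct mock_task;
#define current (&mock_task)

/* Classic math-emu keeps using the FPSCR image, with no CONFIG_SPE switch. */
#define __FPU_FPSCR   (current->thread.fpscr.val)
/* The SPE fixup path gets its own accessor instead of redefining the above. */
#define __FPU_SPEFSCR (current->thread.spefscr)

/* Hypothetical helpers showing that the two paths touch different state. */
void classic_emu_set_flags(uint32_t bits) { __FPU_FPSCR |= bits; }
void spe_fixup_set_flags(uint32_t bits)   { __FPU_SPEFSCR |= bits; }
```

With both macros defined, a kernel built with "SPE Support" and "Math emulation" together would update each status image independently, which is the case Kumar describes.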
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-15 14:37 ` Kumar Gala 2007-01-16 9:54 ` Zhu Ebony-r57400 @ 2007-01-25 8:25 ` Zhu Ebony-r57400 2007-01-25 8:28 ` Kumar Gala 2007-02-07 5:52 ` Zhu Ebony-r57400 2 siblings, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-01-25 8:25 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > No, I think the round handler should try to do the rounding > by hand. > Since you have the non rounded information provided by HW, > its much simpler to just do the rounding step. Hi Kumar, I have some new thoughts about rounding handler. Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) and a normal "efsmul" may generate rounding interrupt. At this time, according to manual, unrounded (truncated) result is placed in the target register. Please note the target register contains a hexadecimal representation of a floating point number. Since it represents a floating point number exactly so we can not round it anymore. Maybe we still need to emulate the whole "efsmul" instruction by software. What do you think? Any idea is appreciated! B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-25 8:25 ` Zhu Ebony-r57400 @ 2007-01-25 8:28 ` Kumar Gala 2007-01-25 8:53 ` Zhu Ebony-r57400 0 siblings, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-01-25 8:28 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Jan 25, 2007, at 2:25 AM, Zhu Ebony-r57400 wrote: > >> No, I think the round handler should try to do the rounding >> by hand. >> Since you have the non rounded information provided by HW, >> its much simpler to just do the rounding step. > > Hi Kumar, > > I have some new thoughts about rounding handler. > Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) and > a normal "efsmul" may generate rounding interrupt. At this time, > according > to manual, unrounded (truncated) result is placed in the target > register. Please > note the target register contains a hexadecimal representation of a > floating point number. Since it represents a floating point number > exactly > so we can not round it anymore. I don't follow what you mean by not being able to round it anymore. > Maybe we still need to emulate the whole "efsmul" instruction by > software. You can't always do that. Think about the following instruction: efsmul r3, r3, r3 You'll have lost the original value of r3 when the exception occurs. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-25 8:28 ` Kumar Gala @ 2007-01-25 8:53 ` Zhu Ebony-r57400 2007-01-25 15:10 ` Kumar Gala 0 siblings, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-01-25 8:53 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: January 25, 2007 16:29 > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc: Add support to fully > comply with IEEE-754 standard > > > On Jan 25, 2007, at 2:25 AM, Zhu Ebony-r57400 wrote: > > > > >> No, I think the round handler should try to do the > rounding by hand. > >> Since you have the non rounded information provided by HW, > its much > >> simpler to just do the rounding step. > > > > Hi Kumar, > > > > I have some new thoughts about rounding handler. > > Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) > and a normal > > "efsmul" may generate rounding interrupt. At this time, > according to > > manual, unrounded (truncated) result is placed in the > target register. > > Please note the target register contains a hexadecimal > representation > > of a floating point number. Since it represents a floating point > > number exactly so we can not round it anymore. > > I don't follow what you mean by not being able to round it anymore. I try to make myself clear: From my understanding, rounding is from a floating point number to another which can be represented by IEEE-754 complied hexadecimal, but not from a hexadecimal to another. For example: Assume the result we got from efsmul is 3.29305125103e-44 It will be stored in target register as 0x00000017. However, 0x00000017 Represents 3.2229864679470793e-44 accurately. Can we round 3.2229864679470793e-44? I'm afraid not. 
I mean, we must round the result before it being stored in target register as hexadecimal, not after. > > Maybe we still need to emulate the whole "efsmul" instruction by > > software. > > You can't always do that. Think about the following instruction: > > efsmul r3, r3, r3 > > You'll have lost the original value of r3 when the exception occurs. If this operation causes FP data interrupt, just let data interrupt handler to do the simulation. I think there's no chance that we get data and round interrupts simultaneously. > > - k > > ^ permalink raw reply [flat|nested] 45+ messages in thread
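The numbers in the example above can be checked with a few lines of C. This is a sketch that assumes the host uses IEEE-754 single precision for `float`. The helper names are invented for illustration. It decodes the pattern 0x00000017 (a denormal, 23 × 2^-149) and tests whether the quoted exact product lies strictly between that value and the next representable one, 0x00000018.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE-754 single-precision value. */
float bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);   /* assumes a 32-bit float */
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Return 1 if `exact` lies strictly between the values encoded by
 * `lo_bits` and `lo_bits + 1`, i.e. truncating `exact` to `lo_bits`
 * discarded something. Intended for positive finite patterns only. */
int lies_between_neighbours(double exact, uint32_t lo_bits)
{
    double lo = bits_to_float(lo_bits);
    double hi = bits_to_float(lo_bits + 1);
    return lo < exact && exact < hi;
}
```

For the values in the message, `lies_between_neighbours(3.29305125103e-44, 0x00000017u)` is true: the truncated pattern and its upper neighbour bracket the exact product, so the two candidate results for any rounding mode are 0x00000017 and 0x00000018.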
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-25 8:53 ` Zhu Ebony-r57400 @ 2007-01-25 15:10 ` Kumar Gala 2007-01-26 6:16 ` Zhu Ebony-r57400 2007-01-29 10:00 ` Zhu Ebony-r57400 0 siblings, 2 replies; 45+ messages in thread From: Kumar Gala @ 2007-01-25 15:10 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Jan 25, 2007, at 2:53 AM, Zhu Ebony-r57400 wrote: > > >> -----Original Message----- >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> Sent: January 25, 2007 16:29 >> To: Zhu Ebony-r57400 >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org >> Subject: Re: [patch][0/5] powerpc: Add support to fully >> comply with IEEE-754 standard >> >> >> On Jan 25, 2007, at 2:25 AM, Zhu Ebony-r57400 wrote: >> >>> >>>> No, I think the round handler should try to do the >> rounding by hand. >>>> Since you have the non rounded information provided by HW, >> its much >>>> simpler to just do the rounding step. >>> >>> Hi Kumar, >>> >>> I have some new thoughts about rounding handler. >>> Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) >> and a normal >>> "efsmul" may generate rounding interrupt. At this time, >> according to >>> manual, unrounded (truncated) result is placed in the >> target register. >>> Please note the target register contains a hexadecimal >> representation >>> of a floating point number. Since it represents a floating point >>> number exactly so we can not round it anymore. >> >> I don't follow what you mean by not being able to round it anymore. > > I try to make myself clear: >> From my understanding, rounding is from a floating point number to >> another > which can be represented by IEEE-754 complied hexadecimal, but not > from a hexadecimal to another. For example: > > Assume the result we got from efsmul is 3.29305125103e-44 > It will be stored in target register as 0x00000017. However, > 0x00000017 > Represents 3.2229864679470793e-44 accurately. 
Can we round > 3.2229864679470793e-44? > I'm afraid not. I mean, we must round the result before it being > stored in > target register as hexadecimal, not after. I still don't follow what you are getting at. The HW stores the non-rounded result. You seem to imply there is some format change or something that is going on between the computed result and what's stored in the register. If the result is such that it doesn't need rounding than you don't round (I forget if you can an exception or not if G|X are not set). >>> Maybe we still need to emulate the whole "efsmul" instruction by >>> software. >> >> You can't always do that. Think about the following instruction: >> >> efsmul r3, r3, r3 >> >> You'll have lost the original value of r3 when the exception occurs. > > If this operation causes FP data interrupt, just let data interrupt > handler to > do the simulation. I think there's no chance that we get data and > round interrupts > simultaneously. Agreed, and I think FP data will take precedence, however the example I use can still cause a round exception and no data exception given the right input values. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
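A rough sketch of what "doing the rounding step by hand" could look like for the two directed modes, under stated assumptions: the target register holds the result truncated toward zero (as the quoted manual text says), and the handler can tell from the guard and sticky information (the "G|X" mentioned above) whether anything was discarded. How that information is read from hardware is not shown; it is passed in as a plain flag. The function and enum names are hypothetical, not taken from the patch series.

```c
#include <stdint.h>

/* Only the two directed modes are covered in this sketch. */
enum directed_mode { TOWARD_PLUS_INF, TOWARD_MINUS_INF };

/*
 * truncated: single-precision bit pattern left in the target register,
 *            assumed to be the exact result truncated toward zero.
 * inexact:   nonzero if guard or sticky indicates discarded bits (G|X).
 *
 * For a finite IEEE-754 single, adding 1 to the bit pattern yields the
 * next representable value away from zero (denormal to normal included),
 * so a directed round only needs a conditional increment.
 */
uint32_t fixup_directed_round(uint32_t truncated, int inexact,
                              enum directed_mode mode)
{
    int negative = (int)(truncated >> 31);

    if (!inexact)
        return truncated;               /* result was already exact */

    if ((mode == TOWARD_PLUS_INF && !negative) ||
        (mode == TOWARD_MINUS_INF && negative))
        return truncated + 1;           /* step away from zero */

    return truncated;                   /* truncation already rounds the right way */
}
```

With the thread's example, a truncated result of 0x00000017 flagged inexact becomes 0x00000018 when rounding toward +Inf and stays 0x00000017 when rounding toward -Inf. The source operands are not needed, which matters for a case like `efsmul r3, r3, r3`. Overflow and saturation behaviour at the top of the range is deliberately left out of this sketch.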
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-25 15:10 ` Kumar Gala @ 2007-01-26 6:16 ` Zhu Ebony-r57400 2007-01-29 10:00 ` Zhu Ebony-r57400 1 sibling, 0 replies; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-01-26 6:16 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: January 25, 2007 23:11 > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc: Add support to fully > comply with IEEE-754 standard > > > On Jan 25, 2007, at 2:53 AM, Zhu Ebony-r57400 wrote: > > > > > > > >> -----Original Message----- > >> From: Kumar Gala [mailto:galak@kernel.crashing.org] > >> Sent: January 25, 2007 16:29 > >> To: Zhu Ebony-r57400 > >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > >> Subject: Re: [patch][0/5] powerpc: Add support to fully > comply with > >> IEEE-754 standard > >> > >> > >> On Jan 25, 2007, at 2:25 AM, Zhu Ebony-r57400 wrote: > >> > >>> > >>>> No, I think the round handler should try to do the > >> rounding by hand. > >>>> Since you have the non rounded information provided by HW, > >> its much > >>>> simpler to just do the rounding step. > >>> > >>> Hi Kumar, > >>> > >>> I have some new thoughts about rounding handler. > >>> Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) > >> and a normal > >>> "efsmul" may generate rounding interrupt. At this time, > >> according to > >>> manual, unrounded (truncated) result is placed in the > >> target register. > >>> Please note the target register contains a hexadecimal > >> representation > >>> of a floating point number. Since it represents a floating point > >>> number exactly so we can not round it anymore. > >> > >> I don't follow what you mean by not being able to round it anymore. 
> > > > I try to make myself clear: > >> From my understanding, rounding is from a floating point number to > >> another > > which can be represented by IEEE-754 complied hexadecimal, but not > > from a hexadecimal to another. For example: > > > > Assume the result we got from efsmul is 3.29305125103e-44 > It will be > > stored in target register as 0x00000017. However, > > 0x00000017 > > Represents 3.2229864679470793e-44 accurately. Can we round > > 3.2229864679470793e-44? > > I'm afraid not. I mean, we must round the result before it being > > stored in target register as hexadecimal, not after. > > I still don't follow what you are getting at. The HW stores > the non- rounded result. You seem to imply there is some > format change or something that is going on between the > computed result and what's stored in the register. If the > result is such that it doesn't need rounding than you don't > round (I forget if you can an exception or not if G|X are not set). I'm now confused on this point. If I get round interrupt and the target register is 0x00000017, what the round result should be for 0x00000017? What is it if round to zero? If round to nearest? If round to +Inf? If round to -Inf? From the only info that 0x00000017 is a non-rounded result, we still don't know how to round it. 0x00000017 is an inexact value itself. Actually, in existing code of math-emu, rounding takes place when packing the bits back into native fp result. > >>> Maybe we still need to emulate the whole "efsmul" instruction by > >>> software. > >> > >> You can't always do that. Think about the following instruction: > >> > >> efsmul r3, r3, r3 > >> > >> You'll have lost the original value of r3 when the > exception occurs. > > > > If this operation causes FP data interrupt, just let data interrupt > > handler to do the simulation. I think there's no chance that we get > > data and round interrupts simultaneously. 
> > Agreed, and I think FP data will take precedence, however the > example I use can still cause a round exception and no data > exception given the right input values. OK, and I think I need to do some more tests to prove this. Thanks for your feedback! B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-25 15:10 ` Kumar Gala 2007-01-26 6:16 ` Zhu Ebony-r57400 @ 2007-01-29 10:00 ` Zhu Ebony-r57400 2007-01-29 14:30 ` Kumar Gala 1 sibling, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-01-29 10:00 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus Hi Kumar, I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11 to enable FP round interrupt will cause the exception occurring very often, which will dramatically decrease the performance of SPE instructions. Do you think putting an option in menuconfig to let user choose whether to enable FP round simulation is a reasonable idea? Thanks. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-29 10:00 ` Zhu Ebony-r57400 @ 2007-01-29 14:30 ` Kumar Gala 2007-01-31 9:45 ` Zhu Ebony-r57400 0 siblings, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-01-29 14:30 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Jan 29, 2007, at 4:00 AM, Zhu Ebony-r57400 wrote: > Hi Kumar, > > I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11 to > enable FP > round interrupt will cause the exception occurring very often, which > will dramatically decrease the performance of SPE instructions. Do you > think putting an option in menuconfig to let user choose whether to > enable FP round simulation is a reasonable idea? I don't see any issue with it, but I have to believe if you want full IEEE results, you want full IEEE results for everything. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-29 14:30 ` Kumar Gala @ 2007-01-31 9:45 ` Zhu Ebony-r57400 2007-01-31 14:48 ` Kumar Gala 0 siblings, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-01-31 9:45 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: Monday, January 29, 2007 10:31 PM > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc: Add support to fully > comply with IEEE-754 standard > > > On Jan 29, 2007, at 4:00 AM, Zhu Ebony-r57400 wrote: > > > Hi Kumar, > > > > I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11 > to enable > > FP round interrupt will cause the exception occurring very often, > > which will dramatically decrease the performance of SPE > instructions. > > Do you think putting an option in menuconfig to let user choose > > whether to enable FP round simulation is a reasonable idea? > > I don't see any issue with it, but I have to believe if you > want full IEEE results, you want full IEEE results for everything. > > - k > Agreed, we need to fully comply with IEEE754. So let's talk something about the handler. The round exceptions can be put into 2 categories: 1. SPEFSCR[FRMC] = 0b10 or 0b11 (rounding toward +Inf and -Inf) We need to handle this exception to comply with IEEE 2. SPEFSCR[FINXE] = 1 If we enable this, round exception will occurs when inaccurate results are generated. However, I think we don't need to do so. With FINXE=0, if SPE data exception occurs, we can handle the exception by existing handler, which is fully IEEE complied, including rounding. If no data exception occurs, HW can implement "round to nearest" and "round toward zero" with IEEE complied, and "round toward +Inf/-Inf" can be handled by the handler of point 1. 
So all the situations are covered, and we do not have to enable FINXE. Could you comment on this? Thanks! B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
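The two categories above reduce to a small decision on the rounding-mode field. What follows is a minimal sketch in C, not code from the patches. It assumes that FRMC occupies the low two bits of SPEFSCR with the encodings 0b00 (nearest), 0b01 (toward zero), 0b10 (+Inf) and 0b11 (-Inf); the helper name spe_needs_round_fixup and the SPE_R* enumerators are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: the rounding mode lives in the low two bits. */
#define SPEFSCR_FRMC_MASK 0x3u

enum spe_round_mode {
    SPE_RN = 0x0, /* round to nearest  (done correctly by hardware) */
    SPE_RZ = 0x1, /* round toward zero (done correctly by hardware) */
    SPE_RP = 0x2, /* round toward +Inf (needs the software handler) */
    SPE_RM = 0x3  /* round toward -Inf (needs the software handler) */
};

/*
 * Hypothetical helper: returns true when the round handler has to correct
 * the hardware result, i.e. only for the two directed modes of category 1.
 */
bool spe_needs_round_fixup(uint32_t spefscr)
{
    uint32_t mode = spefscr & SPEFSCR_FRMC_MASK;

    return mode == SPE_RP || mode == SPE_RM;
}
```

With this split, FINXE can stay clear in the common case, which is the performance argument made in the message.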
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-31 9:45 ` Zhu Ebony-r57400 @ 2007-01-31 14:48 ` Kumar Gala 2007-02-01 9:35 ` Zhu Ebony-r57400 0 siblings, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-01-31 14:48 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus >> On Jan 29, 2007, at 4:00 AM, Zhu Ebony-r57400 wrote: >> >>> Hi Kumar, >>> >>> I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11 >> to enable >>> FP round interrupt will cause the exception occurring very often, >>> which will dramatically decrease the performance of SPE >> instructions. >>> Do you think putting an option in menuconfig to let user choose >>> whether to enable FP round simulation is a reasonable idea? >> >> I don't see any issue with it, but I have to believe if you >> want full IEEE results, you want full IEEE results for everything. >> >> - k >> > > Agreed, we need to fully comply with IEEE754. So let's talk something > about the handler. > > The round exceptions can be put into 2 categories: > > 1. SPEFSCR[FRMC] = 0b10 or 0b11 (rounding toward +Inf and -Inf) > We need to handle this exception to comply with IEEE > > 2. SPEFSCR[FINXE] = 1 > If we enable this, round exception will occurs when inaccurate results > are > generated. However, I think we don't need to do so. With FINXE=0, > if SPE > data > exception occurs, we can handle the exception by existing handler, > which > is fully IEEE complied, including rounding. If no data exception > occurs, > HW > can implement "round to nearest" and "round toward zero" with IEEE > complied, > and "round toward +Inf/-Inf" can be handled by the handler of point 1. > So all > the situations are covered, we do have to enable FINXE. > > Could you make some comments on this? Thanks! While I agree with most of what you're saying there is one issue. 
The user may want an exception reported on inexact results when the rounding mode is set to "round to nearest" or "round towards zero". Of course, we know when the user requests this and can enable/disable this exception at that point if we want to. On a side note, I'm wondering if you've come across this test suite: http://www.jhauser.us/arithmetic/TestFloat.html - k ^ permalink raw reply [flat|nested] 45+ messages in thread
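The idea in the reply above, enabling the inexact exception only while the user has asked for it, can be sketched as below. This is an illustration under stated assumptions and not the patch's code: the FINXE bit value 0x40 is assumed here, and the helper name spe_set_inexact_trap is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of the inexact-exception enable, for illustration. */
#define SPEFSCR_FINXE_BIT 0x00000040u

/*
 * Hypothetical helper: returns the SPEFSCR value to install when the user
 * turns inexact-exception reporting on or off. All other bits, including
 * the rounding mode, are left untouched.
 */
uint32_t spe_set_inexact_trap(uint32_t spefscr, bool user_wants_inexact)
{
    if (user_wants_inexact)
        return spefscr | SPEFSCR_FINXE_BIT;
    return spefscr & ~SPEFSCR_FINXE_BIT;
}
```

The cost of the frequent round interrupts is then paid only by programs that explicitly request inexact reporting.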
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-31 14:48 ` Kumar Gala @ 2007-02-01 9:35 ` Zhu Ebony-r57400 0 siblings, 0 replies; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-01 9:35 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: Wednesday, January 31, 2007 10:49 PM > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard > > >> On Jan 29, 2007, at 4:00 AM, Zhu Ebony-r57400 wrote: > >> > >>> Hi Kumar, > >>> > >>> I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11 to enable > >>> FP round interrupt will cause the exception occurring very often, > >>> which will dramatically decrease the performance of SPE instructions. > >>> Do you think putting an option in menuconfig to let user choose > >>> whether to enable FP round simulation is a reasonable idea? > >> > >> I don't see any issue with it, but I have to believe if you want full > >> IEEE results, you want full IEEE results for everything. > >> > >> - k > >> > > > > Agreed, we need to fully comply with IEEE754. So let's talk something > > about the handler. > > > > The round exceptions can be put into 2 categories: > > > > 1. SPEFSCR[FRMC] = 0b10 or 0b11 (rounding toward +Inf and -Inf) We > > need to handle this exception to comply with IEEE > > > > 2. SPEFSCR[FINXE] = 1 > > If we enable this, round exception will occurs when inaccurate results > > are generated.
However, I think we don't need to do so. With FINXE=0, > > if SPE data exception occurs, we can handle the exception by existing > > handler, which is fully IEEE complied, including rounding. If no data > > exception occurs, HW can implement "round to nearest" and "round > > toward zero" with IEEE complied, and "round toward +Inf/-Inf" can be > > handled by the handler of point 1. > > So all > > the situations are covered, we do have to enable FINXE. > > > > Could you make some comments on this? Thanks! > > While I agree with most of what you're saying there is one issue. If the user want's an exception reported on inexact results when the > rounding mode is set to "round to nearest" or "round towards zero". > Of course we know when the user requests this and can enable/disable this exception at that point if we want to. > > On a side node, wondering if you've come across this test suite: > http://www.jhauser.us/arithmetic/TestFloat.html > > - k Thank you for your comments and the useful link. It looks well suited to testing the handler. B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-01-15 14:37 ` Kumar Gala 2007-01-16 9:54 ` Zhu Ebony-r57400 2007-01-25 8:25 ` Zhu Ebony-r57400 @ 2007-02-07 5:52 ` Zhu Ebony-r57400 2007-02-07 7:11 ` Kumar Gala 2 siblings, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-07 5:52 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > >> This snippet of code breaks it from math-emu/sfp-machine.h > >> > >>>> +#ifdef CONFIG_SPE > >>>> +#define __FPU_FPSCR (current->thread.spefscr) > >>>> +#else > >>>> #define __FPU_FPSCR (current->thread.fpscr.val) > >>>> +#endif > >> > >> By doing this if I want 'classic FP' emulation as well as the IEEE > >> fixup my fpscr for classic emu will not be updated properly. > > > > Logically, user can choose "SPE Support" and "Math emulation" at the > > same time on menuconfig. But from my understanding, it is not > > necessary to select math-emu on a SPE available system, since SPE can > > do math operation. > > This is not true. If I want to run a "classic" PPC binary with FP I need "Math emulation" and if I want to run an SPE one I enable "SPE Support". I could want to run both of these types of binaries on the same system at the same time. If this is the case, maybe we need a separate macro like #define __SPE_SPEFSCR (current->thread.spefscr) But if we do this, how does the kernel know whether the emulation is for a "classic" PPC binary with FP or for an SPE one, and therefore which register (fpscr or spefscr) to update? B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-02-07 5:52 ` Zhu Ebony-r57400 @ 2007-02-07 7:11 ` Kumar Gala 2007-02-07 7:21 ` Zhu Ebony-r57400 0 siblings, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-02-07 7:11 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Feb 6, 2007, at 11:52 PM, Zhu Ebony-r57400 wrote: >>>> This snippet of code breaks it from math-emu/sfp-machine.h >>>> >>>>>> +#ifdef CONFIG_SPE >>>>>> +#define __FPU_FPSCR (current->thread.spefscr) >>>>>> +#else >>>>>> #define __FPU_FPSCR (current->thread.fpscr.val) >>>>>> +#endif >>>> >>>> By doing this if I want 'classic FP' emulation as well as the IEEE >>>> fixup my fpscr for classic emu will not be updated properly. >>> >>> Logically, user can choose "SPE Support" and "Math >> emulation" at the >>> same time on menuconfig. But from my understanding, it is not >>> necessary to select math-emu on a SPE available system, >> since SPE can >>> do math operation. >> >> This is not true. If I want to run a "classic" PPC binary >> with FP I need "Math emulation" and if I want to run an SPE >> one I enable "SPE Support". I could want to run both of >> these types of binaries on the same system at the same time. > > If this is the case, maybe we need a separate macro like > #define __SPE_SPEFSCR (current->thread.spefscr) > But if we do this, how does the kernel know if the emulation is for > "classic" PPC binary with FP or an SPE one, thus corresponding > registers(fpscr or spefscr) being updated? It's based on what instruction you are trying to emulate. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
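"Based on what instruction you are trying to emulate" can be made concrete by looking at the primary opcode, the top six bits of a PowerPC instruction word. The sketch below is an illustration only and is not taken from the patches. It assumes that SPE instructions on e500 use primary opcode 4 and that classic floating-point arithmetic uses primary opcodes 59 and 63; classic FP loads and stores are deliberately left out, and the names fp_kind and classify_fp_insn are hypothetical.

```c
#include <stdint.h>

/* Which status register an emulated instruction should read and update. */
enum fp_kind {
    FP_KIND_NONE,    /* not a floating-point arithmetic instruction */
    FP_KIND_CLASSIC, /* classic PowerPC FP: use fpscr               */
    FP_KIND_SPE      /* e500 SPE / embedded FP: use spefscr         */
};

/*
 * Hypothetical classifier: inspects only the primary opcode.
 * Assumed encodings: 4 = SPE, 59 and 63 = classic FP arithmetic.
 */
enum fp_kind classify_fp_insn(uint32_t insn)
{
    switch (insn >> 26) {
    case 4:
        return FP_KIND_SPE;
    case 59:
    case 63:
        return FP_KIND_CLASSIC;
    default:
        return FP_KIND_NONE;
    }
}
```

A dispatcher built this way knows which register to consult before any rounding-mode macro is evaluated, which is the crux of the question raised in the quoted message.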
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-02-07 7:11 ` Kumar Gala @ 2007-02-07 7:21 ` Zhu Ebony-r57400 2007-02-07 7:57 ` Kumar Gala 0 siblings, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-07 7:21 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: Wednesday, February 07, 2007 3:12 PM > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard > > > On Feb 6, 2007, at 11:52 PM, Zhu Ebony-r57400 wrote: > > >>>> This snippet of code breaks it from math-emu/sfp-machine.h > >>>> > >>>>>> +#ifdef CONFIG_SPE > >>>>>> +#define __FPU_FPSCR (current->thread.spefscr) > >>>>>> +#else > >>>>>> #define __FPU_FPSCR (current->thread.fpscr.val) > >>>>>> +#endif > >>>> > >>>> By doing this if I want 'classic FP' emulation as well as the IEEE > >>>> fixup my fpscr for classic emu will not be updated properly. > >>> > >>> Logically, user can choose "SPE Support" and "Math emulation" at the > >>> same time on menuconfig. But from my understanding, it is not > >>> necessary to select math-emu on a SPE available system, since SPE can > >>> do math operation. > >> > >> This is not true. If I want to run a "classic" PPC binary with FP I > >> need "Math emulation" and if I want to run an SPE one I enable "SPE > >> Support". I could want to run both of these types of binaries on the > >> same system at the same time. > > > > If this is the case, maybe we need a separate macro like > > #define __SPE_SPEFSCR (current->thread.spefscr) > > But if we do this, how does the kernel know if the emulation is for > > "classic" PPC binary with FP or an SPE one, thus corresponding > > registers(fpscr or spefscr) being updated?
> > It's based on what instruction you are trying to emulate. > For example, FP_ROUNDMODE, which is currently defined in terms of __FPU_FPSCR, is widely used in existing code. If the kernel doesn't know whether the emulation is for classic PPC or for the SPE fixup, it doesn't know where to get the correct rounding mode: from fpscr or from spefscr? This is what confuses me. B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-02-07 7:21 ` Zhu Ebony-r57400 @ 2007-02-07 7:57 ` Kumar Gala 2007-02-07 8:04 ` Zhu Ebony-r57400 2007-02-08 3:50 ` [patch][0/5] powerpc V2 : " Zhu Ebony-r57400 0 siblings, 2 replies; 45+ messages in thread From: Kumar Gala @ 2007-02-07 7:57 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Feb 7, 2007, at 1:21 AM, Zhu Ebony-r57400 wrote: > > >> -----Original Message----- >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> Sent: Wednesday, February 07, 2007 3:12 PM >> To: Zhu Ebony-r57400 >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org >> Subject: Re: [patch][0/5] powerpc: Add support to fully >> comply with IEEE-754 standard >> >> >> On Feb 6, 2007, at 11:52 PM, Zhu Ebony-r57400 wrote: >> >>>>>> This snippet of code breaks it from math-emu/sfp-machine.h >>>>>> >>>>>>>> +#ifdef CONFIG_SPE >>>>>>>> +#define __FPU_FPSCR (current->thread.spefscr) >>>>>>>> +#else >>>>>>>> #define __FPU_FPSCR (current->thread.fpscr.val) >>>>>>>> +#endif >>>>>> >>>>>> By doing this if I want 'classic FP' emulation as well >> as the IEEE >>>>>> fixup my fpscr for classic emu will not be updated properly. >>>>> >>>>> Logically, user can choose "SPE Support" and "Math >>>> emulation" at the >>>>> same time on menuconfig. But from my understanding, it is not >>>>> necessary to select math-emu on a SPE available system, >>>> since SPE can >>>>> do math operation. >>>> >>>> This is not true. If I want to run a "classic" PPC binary >> with FP I >>>> need "Math emulation" and if I want to run an SPE one I >> enable "SPE >>>> Support". I could want to run both of these types of >> binaries on the >>>> same system at the same time. 
>>> >>> If this is the case, maybe we need a separate macro like >>> #define __SPE_SPEFSCR (current->thread.spefscr) >>> But if we do this, how does the kernel know if the emulation is for >>> "classic" PPC binary with FP or an SPE one, thus corresponding >>> registers(fpscr or spefscr) being updated? >> >> It's based on what instruction you are trying to emulate. >> > For example, the FP_ROUNDMODE now defined by __FPU_FPSCR is > widely used in existing code. If the kernel doesn't know the emulation > is for > classic PPC or SPE fixup, then it doesn't know where to get the > correct > rounding mode, from fpscr or spefscr? This has confused me. Yes, this is a good point, I guess in truth the two modes are mutually exclusive. Sorry for not figuring that out sooner. (uugh, all the stuff to make IEEE emulation work properly on SPE is a pain :) - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard 2007-02-07 7:57 ` Kumar Gala @ 2007-02-07 8:04 ` Zhu Ebony-r57400 2007-02-08 3:50 ` [patch][0/5] powerpc V2 : " Zhu Ebony-r57400 1 sibling, 0 replies; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-07 8:04 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: Wednesday, February 07, 2007 3:57 PM > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard > > > On Feb 7, 2007, at 1:21 AM, Zhu Ebony-r57400 wrote: > > > > > > > >> -----Original Message----- > >> From: Kumar Gala [mailto:galak@kernel.crashing.org] > >> Sent: Wednesday, February 07, 2007 3:12 PM > >> To: Zhu Ebony-r57400 > >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > >> Subject: Re: [patch][0/5] powerpc: Add support to fully comply with > >> IEEE-754 standard > >> > >> > >> On Feb 6, 2007, at 11:52 PM, Zhu Ebony-r57400 wrote: > >> > >>>>>> This snippet of code breaks it from math-emu/sfp-machine.h > >>>>>> > >>>>>>>> +#ifdef CONFIG_SPE > >>>>>>>> +#define __FPU_FPSCR (current->thread.spefscr) > >>>>>>>> +#else > >>>>>>>> #define __FPU_FPSCR (current->thread.fpscr.val) > >>>>>>>> +#endif > >>>>>> > >>>>>> By doing this if I want 'classic FP' emulation as well > >> as the IEEE > >>>>>> fixup my fpscr for classic emu will not be updated properly. > >>>>> > >>>>> Logically, user can choose "SPE Support" and "Math > >>>> emulation" at the > >>>>> same time on menuconfig. But from my understanding, it is not > >>>>> necessary to select math-emu on a SPE available system, > >>>> since SPE can > >>>>> do math operation. > >>>> > >>>> This is not true.
If I want to run a "classic" PPC binary > >> with FP I > >>>> need "Math emulation" and if I want to run an SPE one I > >> enable "SPE > >>>> Support". I could want to run both of these types of > >> binaries on the > >>>> same system at the same time. > >>> > >>> If this is the case, maybe we need a separate macro like > >>> #define __SPE_SPEFSCR (current->thread.spefscr) > >>> But if we do this, how does the kernel know if the emulation is for > >>> "classic" PPC binary with FP or an SPE one, thus corresponding > >>> registers(fpscr or spefscr) being updated? > >> > >> It's based on what instruction you are trying to emulate. > >> > > For example, the FP_ROUNDMODE now defined by __FPU_FPSCR is widely > > used in existing code. If the kernel doesn't know the emulation is for > > classic PPC or SPE fixup, then it doesn't know where to get the > > correct rounding mode, from fpscr or spefscr? This has confused me. > > Yes, this is a good point, I guess in truth the two modes are mutually exclusive. > > Sorry for not figuring that out sooner. (uugh, all the stuff to make IEEE emulation work properly on SPE is a pain :) Defining FP_ROUNDMODE as (current->thread.spefscr & 0x3) in the sigfpe handler may be a feasible way to get the correct rounding mode without breaking the existing FPU emulation. At least it works here :) I will submit revised patches soon. B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
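The FP_ROUNDMODE redefinition proposed above can be tried outside the kernel with stand-in structures. The following is a minimal user-space sketch, not the submitted patch: mock_thread, mock_task, mock_current_task and spe_current_roundmode are hypothetical names, and `current` is mocked only so that the expression from the message compiles unchanged.

```c
#include <stdint.h>

/* User-space stand-ins for the kernel's per-task state (hypothetical). */
struct mock_thread {
    uint32_t spefscr;
};

struct mock_task {
    struct mock_thread thread;
};

struct mock_task mock_current_task;

/* Mocked so that the kernel-style expression below compiles as written. */
#define current (&mock_current_task)

/*
 * Local override for the SPE handler only: take the rounding mode from
 * SPEFSCR, leaving the classic math-emu definition untouched elsewhere.
 */
#undef FP_ROUNDMODE
#define FP_ROUNDMODE (current->thread.spefscr & 0x3)

/* Hypothetical accessor so the macro can be exercised from ordinary code. */
unsigned int spe_current_roundmode(void)
{
    return FP_ROUNDMODE;
}
```

Keeping the override local to the file that handles SPE exceptions is what lets classic emulation keep reading fpscr.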
* [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-07 7:57 ` Kumar Gala 2007-02-07 8:04 ` Zhu Ebony-r57400 @ 2007-02-08 3:50 ` Zhu Ebony-r57400 2007-02-08 5:18 ` Kumar Gala 1 sibling, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-08 3:50 UTC (permalink / raw) To: paulus; +Cc: linuxppc-dev Hi Paul, These are the re-sent patches that add support for full compliance with the IEEE-754 standard on the E500/E500v2 core when hardware floating point compiling is used. Compared with the last patches I submitted, the following points have changed: 1. Added a rounding exception handler to handle the exceptions that occur when rounding towards +Inf/-Inf. 2. Used the existing exception entry/return routine, and now get the faulting instruction from regs->nip instead of reading it from SRR0. Thank you all for the comments you gave me! B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-08 3:50 ` [patch][0/5] powerpc V2 : " Zhu Ebony-r57400 @ 2007-02-08 5:18 ` Kumar Gala 2007-02-08 5:40 ` Zhu Ebony-r57400 2007-02-08 7:06 ` Zhu Ebony-r57400 0 siblings, 2 replies; 45+ messages in thread From: Kumar Gala @ 2007-02-08 5:18 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Feb 7, 2007, at 9:50 PM, Zhu Ebony-r57400 wrote: > Hi Paul, > > These are the re-sent patches to add support to fully comply with > IEEE-754 standard for E500/E500v2 core when hardware floating > point compiling is used. Comparison with last patches I've submitted, > the following points was changed: > > 1. Add a rounding exception handler, to handle the exceptions that > would occur when rounding towards +Inf/-Inf > > 2. Using the existing exception entering/returning routine, and get > the > exception > instructions from regs->nip instead of reading from SRR0 > > Thank you all for the comments you gave to me! Did you end up getting testfloat running? I'd like to see some testing results before accepting these patches. I think testfloat is our best bet at this point. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-08 5:18 ` Kumar Gala @ 2007-02-08 5:40 ` Zhu Ebony-r57400 2007-02-08 7:06 ` Zhu Ebony-r57400 1 sibling, 0 replies; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-08 5:40 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: Thursday, February 08, 2007 1:19 PM > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard > > > On Feb 7, 2007, at 9:50 PM, Zhu Ebony-r57400 wrote: > > > Hi Paul, > > > > These are the re-sent patches to add support to fully comply with > > IEEE-754 standard for E500/E500v2 core when hardware floating point > > compiling is used. Comparison with last patches I've submitted, the > > following points was changed: > > > > 1. Add a rounding exception handler, to handle the exceptions that > > would occur when rounding towards +Inf/-Inf > > > > 2. Using the existing exception entering/returning routine, and get > > the exception instructions from regs->nip instead of reading from SRR0 > > > > Thank you all for the comments you gave to me! > > Did you end up getting testfloat running? I'd like to see some testing results before accepting these patches. I think testfloat is our best bet at this point. > > - k Not yet, since it needs to be ported to the powerpc platform. I will do it as soon as possible. B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-08 5:18 ` Kumar Gala 2007-02-08 5:40 ` Zhu Ebony-r57400 @ 2007-02-08 7:06 ` Zhu Ebony-r57400 2007-02-08 7:15 ` Kumar Gala 1 sibling, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-08 7:06 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: Thursday, February 08, 2007 1:19 PM > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard > > > On Feb 7, 2007, at 9:50 PM, Zhu Ebony-r57400 wrote: > > > Hi Paul, > > > > These are the re-sent patches to add support to fully comply with > > IEEE-754 standard for E500/E500v2 core when hardware floating point > > compiling is used. Comparison with last patches I've submitted, the > > following points was changed: > > > > 1. Add a rounding exception handler, to handle the exceptions that > > would occur when rounding towards +Inf/-Inf > > > > 2. Using the existing exception entering/returning routine, and get > > the exception instructions from regs->nip instead of reading from SRR0 > > > > Thank you all for the comments you gave to me! > > Did you end up getting testfloat running? I'd like to see some testing results before accepting these patches. I think testfloat is our best bet at this point. > Hi Kumar, I looked into the TestFloat suite and found that all the instructions it tests (more than 50) would have to be implemented in assembly. SoftFloat, which TestFloat compares against, would also have to be ported to the powerpc platform. I think this work needs some time to finish. So could you review my patches and give some comments first? Thank you. B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-08 7:06 ` Zhu Ebony-r57400 @ 2007-02-08 7:15 ` Kumar Gala 2007-02-08 8:08 ` Zhu Ebony-r57400 0 siblings, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-02-08 7:15 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Feb 8, 2007, at 1:06 AM, Zhu Ebony-r57400 wrote: > > >> -----Original Message----- >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> Sent: Thursday, February 08, 2007 1:19 PM >> To: Zhu Ebony-r57400 >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org >> Subject: Re: [patch][0/5] powerpc V2 : Add support to fully >> comply with IEEE-754 standard >> >> >> On Feb 7, 2007, at 9:50 PM, Zhu Ebony-r57400 wrote: >> >>> Hi Paul, >>> >>> These are the re-sent patches to add support to fully comply with >>> IEEE-754 standard for E500/E500v2 core when hardware floating point >>> compiling is used. Comparison with last patches I've submitted, the >>> following points was changed: >>> >>> 1. Add a rounding exception handler, to handle the exceptions that >>> would occur when rounding towards +Inf/-Inf >>> >>> 2. Using the existing exception entering/returning routine, and get >>> the exception instructions from regs->nip instead of >> reading from SRR0 >>> >>> Thank you all for the comments you gave to me! >> >> Did you end up getting testfloat running? I'd like to see >> some testing results before accepting these patches. I think >> testfloat is our best bet at this point. >> > Hi Kumar, > > I looked into the testfloat suit, and found all the instructions it > tests (more than 50)should be > implemented based on ASM. Don't follow? Can't you build it with the e500 compiler? > And also the SoftFloat test suite, which the > Testfloat is comparing against, should be ported to powerpc > platform. I > think these work needs some time do finish. So could you review my > patches and > give some comments first? Thank you. Will do. 
Just be aware we need to get testfloat running before this will make it in mainline. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-08 7:15 ` Kumar Gala @ 2007-02-08 8:08 ` Zhu Ebony-r57400 2007-02-08 17:18 ` Kumar Gala 0 siblings, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-08 8:08 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > >> > >> Did you end up getting testfloat running? I'd like to see some > >> testing results before accepting these patches. I think testfloat is > >> our best bet at this point. > >> > > Hi Kumar, > > > > I looked into the testfloat suit, and found all the instructions it > > tests (more than 50)should be implemented based on ASM. > > Don't follow? Can't you build it with the e500 compiler? > The TestFloat suite provides only the 386-Win32-gcc and SPARC-Solaris-gcc targets, plus a template for porting it to another processor. Some general operations are implemented in C, but CPU-specific instructions such as evfsmul need to be implemented in assembly. To build it with the e500 compiler we still have some porting work to do. B.R. Ebony ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-08 8:08 ` Zhu Ebony-r57400 @ 2007-02-08 17:18 ` Kumar Gala 2007-02-09 5:15 ` Zhu Ebony-r57400 0 siblings, 1 reply; 45+ messages in thread From: Kumar Gala @ 2007-02-08 17:18 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus On Feb 8, 2007, at 2:08 AM, Zhu Ebony-r57400 wrote: >>>> >>>> Did you end up getting testfloat running? I'd like to see some >>>> testing results before accepting these patches. I think >> testfloat is >>>> our best bet at this point. >>>> >>> Hi Kumar, >>> >>> I looked into the testfloat suit, and found all the instructions it >>> tests (more than 50)should be implemented based on ASM. >> >> Don't follow? Can't you build it with the e500 compiler? >> > The TestFloat suite provided the target of 386-Win32-gcc and > SPARC-Solaris-gcc only, and a template for user to porting his > own processor. Some general instructions are implemented in C, > but some CPU specific instructions like evfsmul need to be > implemented assemblely. To build it with e500 compiler we still > Have some porting work to do. I wouldn't worry too much about the vector forms. If the scalar single fp and double fp test out ok the vectors are pretty much similar enough. Lets just get the scalar versions tested and work out any issues there. - k ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-08 17:18 ` Kumar Gala @ 2007-02-09 5:15 ` Zhu Ebony-r57400 2007-07-30 14:56 ` Sergei Shtylyov 0 siblings, 1 reply; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-02-09 5:15 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev, paulus > -----Original Message----- > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: Friday, February 09, 2007 1:19 AM > To: Zhu Ebony-r57400 > Cc: paulus@samba.org; linuxppc-dev@ozlabs.org > Subject: Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard > > > On Feb 8, 2007, at 2:08 AM, Zhu Ebony-r57400 wrote: > > >>>> > >>>> Did you end up getting testfloat running? I'd like to see some > >>>> testing results before accepting these patches. I think > >> testfloat is > >>>> our best bet at this point. > >>>> > >>> Hi Kumar, > >>> > >>> I looked into the testfloat suit, and found all the instructions it > >>> tests (more than 50)should be implemented based on ASM. > >> > >> Don't follow? Can't you build it with the e500 compiler? > >> > > The TestFloat suite provided the target of 386-Win32-gcc and > > SPARC-Solaris-gcc only, and a template for user to porting his own > > processor. Some general instructions are implemented in C, but some > > CPU specific instructions like evfsmul need to be implemented > > assemblely. To build it with e500 compiler we still Have some porting > > work to do. > > I wouldn't worry too much about the vector forms. If the scalar single fp and double fp test out ok the vectors are pretty much similar enough. > > Lets just get the scalar versions tested and work out any issues there. > > - k OK, I will focus on the scalar SPFP and DPFP versions first. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-02-09 5:15 ` Zhu Ebony-r57400 @ 2007-07-30 14:56 ` Sergei Shtylyov 2007-07-31 3:36 ` Zhu Ebony-r57400 0 siblings, 1 reply; 45+ messages in thread From: Sergei Shtylyov @ 2007-07-30 14:56 UTC (permalink / raw) To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus Hello. Zhu Ebony-r57400 wrote: >>>>>>Did you end up getting testfloat running? I'd like to see some >>>>>>testing results before accepting these patches. I think estfloat is >>>>>>our best bet at this point. >>>>>Hi Kumar, >>>>>I looked into the testfloat suit, and found all the instructions it >>>>>tests (more than 50)should be implemented based on ASM. >>>>Don't follow? Can't you build it with the e500 compiler? >>>The TestFloat suite provided the target of 386-Win32-gcc and >>>SPARC-Solaris-gcc only, and a template for user to porting his own >>>processor. Some general instructions are implemented in C, but some >>>CPU specific instructions like evfsmul need to be implemented >>>assemblely. To build it with e500 compiler we still Have some porting >>>work to do. >>I wouldn't worry too much about the vector forms. If the >>scalar single fp and double fp test out ok the vectors are >>pretty much similar enough. >>Lets just get the scalar versions tested and work out any >>issues there. > OK, I will focus on scalar SFPF and DPFP versions first. Any progress with this patchset? WBR, Sergei ^ permalink raw reply [flat|nested] 45+ messages in thread
* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard 2007-07-30 14:56 ` Sergei Shtylyov @ 2007-07-31 3:36 ` Zhu Ebony-r57400 0 siblings, 0 replies; 45+ messages in thread From: Zhu Ebony-r57400 @ 2007-07-31 3:36 UTC (permalink / raw) To: Sergei Shtylyov; +Cc: linuxppc-dev, paulus Hi Sergei, I did some further testing and development work over the past several months, but due to the project schedule and limited bandwidth, no patches can be submitted to the list yet. Anyway, I will keep you and the community updated once there is some progress. Thanks. B.R. Ebony > -----Original Message----- > From: Sergei Shtylyov [mailto:sshtylyov@ru.mvista.com] > Sent: Monday, July 30, 2007 10:57 PM > To: Zhu Ebony-r57400 > Cc: Kumar Gala; linuxppc-dev@ozlabs.org; paulus@samba.org > Subject: Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard > > Hello. > > Zhu Ebony-r57400 wrote: > > >>>>>>Did you end up getting testfloat running? I'd like to see some > >>>>>>testing results before accepting these patches. I think estfloat > >>>>>>is our best bet at this point. > > >>>>>Hi Kumar, > > >>>>>I looked into the testfloat suit, and found all the instructions it > >>>>>tests (more than 50)should be implemented based on ASM. > > >>>>Don't follow? Can't you build it with the e500 compiler? > > >>>The TestFloat suite provided the target of 386-Win32-gcc and > >>>SPARC-Solaris-gcc only, and a template for user to porting his own > >>>processor. Some general instructions are implemented in C, but some > >>>CPU specific instructions like evfsmul need to be implemented > >>>assemblely. To build it with e500 compiler we still Have some porting > >>>work to do. > > >>I wouldn't worry too much about the vector forms. If the scalar > >>single fp and double fp test out ok the vectors are pretty much > >>similar enough.
> > >>Lets just get the scalar versions tested and work out any issues > >>there. > > > OK, I will focus on scalar SFPF and DPFP versions first. > > Any progress with this patchset? > > WBR, Sergei > ^ permalink raw reply [flat|nested] 45+ messages in thread