[patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard


* [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
@ 2007-01-12  5:19 Zhu Ebony-r57400
  2007-01-12  5:29 ` Paul Mackerras
  2007-01-12  6:41 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Kumar Gala
  0 siblings, 2 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-12  5:19 UTC (permalink / raw)
  To: paulus; +Cc: linuxppc-dev

Hi Paul,

This patch series adds support for full compliance with the IEEE-754
standard on E500/E500v2 cores when hardware floating-point compilation
is used.

Ebony

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12  5:19 [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Zhu Ebony-r57400
@ 2007-01-12  5:29 ` Paul Mackerras
  2007-01-12  5:46   ` Kumar Gala
  2007-01-12  6:38   ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400
  2007-01-12  6:41 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Kumar Gala
  1 sibling, 2 replies; 45+ messages in thread
From: Paul Mackerras @ 2007-01-12  5:29 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev

Zhu Ebony-r57400 writes:

> This series of patch add support to fully comply with IEEE-754 standard
> for E500/E500v2 core when hardware floating point compiling is used.

Your patch descriptions need to explain in detail in what way the
current code doesn't comply with the IEEE-754 standard, and what
approach you have taken to make it comply.  If there are alternative
approaches, explain why the approach you have taken is the best.

Thanks,
Paul.


* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12  5:29 ` Paul Mackerras
@ 2007-01-12  5:46   ` Kumar Gala
  2007-01-12  8:27     ` Zhu Ebony-r57400
  2007-01-12  6:38   ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400
  1 sibling, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-01-12  5:46 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev


On Jan 11, 2007, at 11:29 PM, Paul Mackerras wrote:

> Zhu Ebony-r57400 writes:
>
>> This series of patch add support to fully comply with IEEE-754  
>> standard
>> for E500/E500v2 core when hardware floating point compiling is used.
>
> Your patch descriptions need to explain in detail in what way the
> current code doesn't comply with the IEEE-754 standard, and what
> approach you have taken to make it comply.  If there are alternative
> approaches, explain why the approach you have taken is the best.
>
> Thanks,
> Paul.

In addition, we need to know how this was tested for IEEE-754
compliance.

- k


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12  5:46   ` Kumar Gala
@ 2007-01-12  8:27     ` Zhu Ebony-r57400
  2007-01-12 12:06       ` Segher Boessenkool
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-12  8:27 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, Paul Mackerras

Hi Kumar,

I wrote some test cases. For SPFP and DPFP exception testing, the cases
cover addition, subtraction, multiplication, division, comparisons,
conversions, divide-by-zero (DBZ), and so on with NaN/denorm/Inf
operands. I also tested cases where the result of the operation is a
NaN, denorm, or Inf, or where it overflows or underflows. For vector
SPFP exception testing, I wrote an inline-asm-based test program that
exercises the instructions directly.

Ebony

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12  8:27     ` Zhu Ebony-r57400
@ 2007-01-12 12:06       ` Segher Boessenkool
  2007-01-15  8:41         ` Zhu Ebony-r57400
  0 siblings, 1 reply; 45+ messages in thread
From: Segher Boessenkool @ 2007-01-12 12:06 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, Paul Mackerras

> I wrote some cases for testing. For SPFP and DPFP exception testing,
> the test cases included plus, minus, multiply, divide, comparisons, 
> conversions,
> DBZ... for Nan/Denorm/Inf numbers. I also tested the cases that
> the operation result would generate Nan/Denorm/Inf/overflow/underflow
> numbers. For Vector SPFP exception testing, I wrote inline asm based 
> testing
> program to test the instructions directly.

Any chance you could submit that testing code too?  Would
be useful for others :-)


Segher


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12 12:06       ` Segher Boessenkool
@ 2007-01-15  8:41         ` Zhu Ebony-r57400
  0 siblings, 0 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-15  8:41 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linuxppc-dev, Paul Mackerras

> -----Original Message-----
> From: Segher Boessenkool [mailto:segher@kernel.crashing.org]
> Sent: January 12, 2007 20:06
> To: Zhu Ebony-r57400
> Cc: Kumar Gala; Paul Mackerras; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
> > I wrote some cases for testing. For SPFP and DPFP exception
> > testing, the test cases included plus, minus, multiply, divide,
> > comparisons, conversions, DBZ... for Nan/Denorm/Inf numbers. I also
> > tested the cases that the operation result would generate
> > Nan/Denorm/Inf/overflow/underflow numbers. For Vector SPFP
> > exception testing, I wrote inline asm based testing program to
> > test the instructions directly.
>
> Any chance you could submit that testing code too?  Would be
> useful for others :-)
>
> Segher

The snippet below tests only a limited set of instructions. During
actual development, all instructions were tested, FYI.

--------------------------Snip-----------------------------------------------------
/* compile with freescale gcc for MPC8548 powerpc platform
/opt/mtwk/usr/local/gcc-3_4-e500-glibc-2.3.4-dp/powerpc-linux-gnuspe/bin/powerpc-linux-gnuspe-gcc
-mcpu=8548 -mhard-float -ffloat-store -fno-strict-aliasing  -o Mult Mult.c -lm
*/

#include <stdio.h>
#include <math.h>

int main()
{
	float j = 0.0;
	float k = 0.0;
	float result0, result1, result2, result3;

	printf("Invalid operation (denorm) 1:\n");
	k = 2.1E-44;		/* denormalized single-precision operand */
	j = 1.5666666;

	result0 = k * j;
	result1 = k + j;
	result2 = k - j;
	result3 = k / j;

	printf("after %g * %g  result is %g  \n", k, j, result0);
	printf("after %g + %g  result is %g  \n", k, j, result1);
	printf("after %g - %g  result is %g  \n", k, j, result2);
	printf("after %g / %g  result is %g  \n", k, j, result3);

	/* Comparisons with a denormalized operand */
	if (k > j) {
		printf("The bigger one is %g\n", k);
	}
	if (k < j) {
		printf("The smaller one is %g\n", k);
	}
	if (k == j) {
		printf("equal\n");
	}
	return 0;
}
--------------------------Snip-----------------------------------------------------


To test the vector SPFP instructions, inline-asm-based C code like the
following is used:


--------------------------Snip-----------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <asm/reg.h>


static void write_reg(volatile unsigned *addr, float val);
static float read_reg(volatile unsigned *addr);


static void
write_reg(volatile unsigned *addr, float val)
{
	__asm__ __volatile__("stwx %1,0,%2; eieio" : "=m" (*addr) :
			     "r" (val), "r" (addr));
}

static void
write_reg_dbl(volatile unsigned *addr, double val)
{
	__asm__ __volatile__("evstddx %1,0,%2; eieio" : "=m" (*addr) :
			     "r" (val), "r" (addr));
}

static float
read_reg(volatile unsigned *addr)
{
	float ret;
	__asm__ __volatile__("lwzx %0,0,%1; eieio" : "=r" (ret) :
			     "r" (addr), "m" (*addr));
	return ret;
}

static int
read_reg_int(volatile unsigned *addr)
{
	unsigned int ret;
	__asm__ __volatile__("lwzx %0,0,%1; eieio" : "=r" (ret) :
			     "r" (addr), "m" (*addr));
	return ret;
}

inline void
evfsadd(volatile unsigned *addr)
{
	unsigned int rA;
	unsigned int rB;
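	/* rA and rB are written by the asm, so they are early-clobber outputs;
	 * the result pair is stored at addr+4. */
	__asm__ __volatile__ ("evlddx %0, 0, %2\n\t"
			      "evlddx %1, 0, %3\n\t"
			      "evfsadd %0, %0, %1\n\t"
			      "evstdwx %0, 0, %4\n"
			      : "=&r" (rA), "=&r" (rB)
			      : "r" (addr), "r" (addr+2), "r" (addr+4)
			      : "memory");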
}

inline void
evfssub(volatile unsigned *addr)
{
	unsigned int rA;
	unsigned int rB;
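	/* rA and rB are written by the asm, so they are early-clobber outputs;
	 * the result pair is stored at addr+4. */
	__asm__ __volatile__ ("evlddx %0, 0, %2\n\t"
			      "evlddx %1, 0, %3\n\t"
			      "evfssub %0, %0, %1\n\t"
			      "evstdwx %0, 0, %4\n"
			      : "=&r" (rA), "=&r" (rB)
			      : "r" (addr), "r" (addr+2), "r" (addr+4)
			      : "memory");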
}

inline void
evfsmul(volatile unsigned *addr)
{
	unsigned int rA;
	unsigned int rB;
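	/* rA and rB are written by the asm, so they are early-clobber outputs;
	 * the result pair is stored at addr+4. */
	__asm__ __volatile__ ("evlddx %0, 0, %2\n\t"
			      "evlddx %1, 0, %3\n\t"
			      "evfsmul %0, %0, %1\n\t"
			      "evstdwx %0, 0, %4\n"
			      : "=&r" (rA), "=&r" (rB)
			      : "r" (addr), "r" (addr+2), "r" (addr+4)
			      : "memory");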
}

inline void
evfsdiv(volatile unsigned *addr)
{
	unsigned int rA;
	unsigned int rB;
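	/* rA and rB are written by the asm, so they are early-clobber outputs;
	 * the result pair is stored at addr+4. */
	__asm__ __volatile__ ("evlddx %0, 0, %2\n\t"
			      "evlddx %1, 0, %3\n\t"
			      "evfsdiv %0, %0, %1\n\t"
			      "evstdwx %0, 0, %4\n"
			      : "=&r" (rA), "=&r" (rB)
			      : "r" (addr), "r" (addr+2), "r" (addr+4)
			      : "memory");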
}

inline int
evfscmpeq(volatile unsigned *addr)
{
	unsigned int rA;
	unsigned int rB;
	unsigned int val;
	/* The compare writes a CR field, not a GPR: use cr0 and read CR back. */
	__asm__ __volatile__ ("evlddx %1, 0, %3\n\t"
			      "evlddx %2, 0, %4\n\t"
			      "evfscmpeq 0, %1, %2\n\t"
			      "mfcr %0\n"
			      : "=r" (val), "=&r" (rA), "=&r" (rB)
			      : "r" (addr), "r" (addr+2)
			      : "cr0", "memory");
	return (val);
}

inline int
evfscmpgt(volatile unsigned *addr)
{
	unsigned int rA;
	unsigned int rB;
	unsigned int val;
	/* The compare writes a CR field, not a GPR: use cr0 and read CR back. */
	__asm__ __volatile__ ("evlddx %1, 0, %3\n\t"
			      "evlddx %2, 0, %4\n\t"
			      "evfscmpgt 0, %1, %2\n\t"
			      "mfcr %0\n"
			      : "=r" (val), "=&r" (rA), "=&r" (rB)
			      : "r" (addr), "r" (addr+2)
			      : "cr0", "memory");
	return (val);
}

inline int
evfscmplt(volatile unsigned *addr)
{
	unsigned int rA;
	unsigned int rB;
	unsigned int val;
	/* The compare writes a CR field, not a GPR: use cr0 and read CR back. */
	__asm__ __volatile__ ("evlddx %1, 0, %3\n\t"
			      "evlddx %2, 0, %4\n\t"
			      "evfscmplt 0, %1, %2\n\t"
			      "mfcr %0\n"
			      : "=r" (val), "=&r" (rA), "=&r" (rB)
			      : "r" (addr), "r" (addr+2)
			      : "cr0", "memory");
	return (val);
}

inline void
evfsabs(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsabs %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
evfsnabs(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsnabs %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
evfsneg(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsneg %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
evfsctui(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsctui %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
evfsctsi(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsctsi %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
evfsctsiz(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsctsiz %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
evfsctuiz(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsctuiz %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
evfsctuf(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsctuf %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
evfsctsf(volatile unsigned *addr)
{
	unsigned int rD;
	/* rD is written by the asm, so it is an early-clobber output. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "evfsctsf %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
efsctsf(volatile unsigned *addr)
{
	unsigned int rD;
	/* Scalar form: load one word, convert, store the result at addr+4. */
	__asm__ __volatile__ ("lwzx %0, 0, %1\n\t"
			      "efsctsf %0, %0\n\t"
			      "stwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
efsctuf(volatile unsigned *addr)
{
	unsigned int rD;
	/* Scalar form: load one word, convert, store the result at addr+4. */
	__asm__ __volatile__ ("lwzx %0, 0, %1\n\t"
			      "efsctuf %0, %0\n\t"
			      "stwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
efdctuf(volatile unsigned *addr)
{
	unsigned int rD;
	/* Double-precision operand: load 64 bits, convert, store at addr+4. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "efdctuf %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}

inline void
efdctsf(volatile unsigned *addr)
{
	unsigned int rD;
	/* Double-precision operand: load 64 bits, convert, store at addr+4. */
	__asm__ __volatile__ ("evlddx %0, 0, %1\n\t"
			      "efdctsf %0, %0\n\t"
			      "evstdwx %0, 0, %2\n"
			      : "=&r" (rD)
			      : "r" (addr), "r" (addr+4)
			      : "memory");
}
inline void
write_reg_vec (unsigned int *addr, float rA0, float rA1, float rB0, float rB1)
{
	write_reg (addr, rA0);
	write_reg (addr+1, rA1);
	write_reg (addr+2, rB0);
	write_reg (addr+3, rB1);
}

int main()
{
	unsigned *store_addr;
	float a0 = 2.1e-44;
	float a1 = -5.738e-42;
	float b0 = 1.5666666;
	float b1 = 1.0001221;
	float d0, d1;
	double b = 0.9999999996507541e+320;
	unsigned int d0_uint, d1_uint;
	unsigned int crD;

	printf ("a0, a1 = %g, %g\n", a0, a1);
	printf ("b0, b1 = %g, %g\n", b0, b1);
	printf ("b = %g\n", b);

	/* Words 0-1 hold rA, words 2-3 hold rB, words 4-5 receive the result. */
	store_addr = malloc (sizeof(float)*6);
	if (!store_addr)
		return 1;
	write_reg_vec (store_addr, a0, a1, b0, b1);

	evfsadd(store_addr);
	d0 = read_reg (store_addr+4);
	d1 = read_reg (store_addr+5);
	printf ("evfsadd: d0 = %g, d1 = %g\n", d0, d1);

	evfssub(store_addr);
	d0 = read_reg (store_addr+4);
	d1 = read_reg (store_addr+5);
	printf ("evfssub: d0 = %g, d1 = %g\n", d0, d1);

	evfsmul(store_addr);
	d0 = read_reg (store_addr+4);
	d1 = read_reg (store_addr+5);
	printf ("evfsmul: d0 = %g, d1 = %g\n", d0, d1);

	evfsdiv(store_addr);
	d0 = read_reg (store_addr+4);
	d1 = read_reg (store_addr+5);
	printf ("evfsdiv: d0 = %g, d1 = %g\n", d0, d1);

	evfsabs(store_addr);
	d0 = read_reg (store_addr+4);
	d1 = read_reg (store_addr+5);
	printf ("evfsabs: d0 = %g, d1 = %g\n", d0, d1);

	evfsnabs(store_addr);
	d0 = read_reg (store_addr+4);
	d1 = read_reg (store_addr+5);
	printf ("evfsnabs: d0 = %g, d1 = %g\n", d0, d1);

	evfsneg(store_addr);
	d0 = read_reg (store_addr+4);
	d1 = read_reg (store_addr+5);
	printf ("evfsneg: d0 = %g, d1 = %g\n", d0, d1);

	crD = evfscmpeq(store_addr);
	printf ("evfscmpeq: crD = %08x\n", crD);

	crD = evfscmpgt(store_addr);
	printf ("evfscmpgt: crD = %08x\n", crD);

	crD = evfscmplt(store_addr);
	printf ("evfscmplt: crD = %08x\n", crD);

	evfsctui(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	d1_uint = read_reg_int (store_addr+5);
	printf ("evfsctui: d0 = %u, d1 = %u\n", d0_uint, d1_uint);

	evfsctuiz(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	d1_uint = read_reg_int (store_addr+5);
	printf ("evfsctuiz: d0 = %u, d1 = %u\n", d0_uint, d1_uint);

	evfsctsi(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	d1_uint = read_reg_int (store_addr+5);
	printf ("evfsctsi: d0 = %d, d1 = %d\n", (int)d0_uint, (int)d1_uint);

	evfsctsiz(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	d1_uint = read_reg_int (store_addr+5);
	printf ("evfsctsiz: d0 = %d, d1 = %d\n", (int)d0_uint, (int)d1_uint);

	evfsctuf(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	d1_uint = read_reg_int (store_addr+5);
	printf ("evfsctuf: d0 = %08x, d1 = %08x\n", d0_uint, d1_uint);

	evfsctsf(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	d1_uint = read_reg_int (store_addr+5);
	printf ("evfsctsf: d0 = %08x, d1 = %08x\n", d0_uint, d1_uint);

	efsctsf(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	printf ("efsctsf: d0 = %08x\n", d0_uint);

	efsctuf(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	printf ("efsctuf: d0 = %08x\n", d0_uint);

	/* Replace the operand with a denormalized double for the DPFP tests. */
	write_reg_dbl (store_addr, b);

	efdctuf(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	d1_uint = read_reg_int (store_addr+5);
	printf ("efdctuf: d0 = %08x, d1 = %08x\n", d0_uint, d1_uint);

	efdctsf(store_addr);
	d0_uint = read_reg_int (store_addr+4);
	d1_uint = read_reg_int (store_addr+5);
	printf ("efdctsf: d0 = %08x, d1 = %08x\n", d0_uint, d1_uint);

	free (store_addr);
	return 0;
}
--------------------------Snip-----------------------------------------------------


Ebony


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard
  2007-01-12  5:29 ` Paul Mackerras
  2007-01-12  5:46   ` Kumar Gala
@ 2007-01-12  6:38   ` Zhu Ebony-r57400
  2007-01-12  6:49     ` Kumar Gala
  2007-01-12 12:03     ` Segher Boessenkool
  1 sibling, 2 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-12  6:38 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev

Hi Paul,

On E500/E500v2 cores, which implement SPE, the embedded floating-point
APU implements a floating-point system as defined in the ANSI/IEEE
754-1985 standard, but relies on software support to conform fully to
the standard. Thus, whenever an input operand of a floating-point
instruction is +infinity, -infinity, denorm, or NaN, or when the result
of an operation overflows or underflows, an interrupt may be taken, and
the interrupt handler is responsible for delivering IEEE 754-compliant
behavior if desired.
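
To make the operand classes concrete, here is a minimal sketch that
classifies a raw single-precision bit pattern by its exponent and
fraction fields, which is the standard IEEE 754 encoding. The names
sp_class, sp_classify and sp_needs_software are made up for this
illustration and are not taken from the patches.

```c
#include <stdint.h>

/* Operand classes of an IEEE 754 single-precision bit pattern. */
enum sp_class { SP_ZERO, SP_DENORM, SP_NORMAL, SP_INF, SP_NAN };

/* Classify a raw 32-bit pattern by its exponent and fraction fields. */
static enum sp_class sp_classify(uint32_t bits)
{
	uint32_t exp  = (bits >> 23) & 0xff;	/* 8-bit biased exponent */
	uint32_t frac = bits & 0x007fffff;	/* 23-bit fraction */

	if (exp == 0)
		return frac ? SP_DENORM : SP_ZERO;
	if (exp == 0xff)
		return frac ? SP_NAN : SP_INF;
	return SP_NORMAL;
}

/*
 * Nonzero when the operand is one of the inputs listed above as needing
 * software help: +/-infinity, denorm, or NaN.
 */
static int sp_needs_software(uint32_t bits)
{
	enum sp_class c = sp_classify(bits);

	return c == SP_DENORM || c == SP_INF || c == SP_NAN;
}
```

For example, the pattern 0x0000000f (15 * 2^-149, the single-precision
value nearest to 2.1E-44) is classified as a denorm.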

When floating-point invalid input exceptions are disabled
(SPEFSCR[FINVE] is cleared), default results are provided by the
hardware when an infinity, denorm, or NaN input is received, or for the
operation 0/0. When floating-point underflow exceptions are disabled
(SPEFSCR[FUNFE] is cleared) and the result of a floating-point operation
underflows, a signed zero result is produced. When floating-point
overflow exceptions are disabled (SPEFSCR[FOVFE] is cleared) and the
result of a floating-point operation overflows, a pmax or nmax result is
produced. A divide-by-zero exception enable flag (SPEFSCR[FDBZE]) is
provided for generating an interrupt when a divide-by-zero operation is
attempted, to allow a software handler to conform to the IEEE 754
standard.
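
The underflow and overflow defaults can be modelled in portable C. This
is only a sketch under two assumptions of mine: "underflows" is taken to
mean a nonzero result smaller in magnitude than the smallest normal
number, and pmax/nmax are taken to be +/-FLT_MAX. The function name
spe_default_result is hypothetical.

```c
#include <float.h>

/*
 * Model of the default single-precision result for a finite result
 * `exact`, held in a double. Assumptions: underflow means a nonzero
 * magnitude below FLT_MIN, and pmax/nmax are +/-FLT_MAX.
 */
static float spe_default_result(double exact)
{
	int negative = exact < 0.0;
	double mag = negative ? -exact : exact;

	if (mag != 0.0 && mag < FLT_MIN)	/* underflow: signed zero */
		return negative ? -0.0f : 0.0f;
	if (mag > FLT_MAX)			/* overflow: pmax or nmax */
		return negative ? -FLT_MAX : FLT_MAX;
	return (float)exact;			/* representable result */
}
```

Under this model a product of about 3.2e-44 comes back as +0, where an
IEEE 754 system would return a denormalized number.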

In the current code, all of these exceptions are disabled, so the
kernel does not fully comply with the IEEE-754 standard.

Let's see an example:

2.1E-44 * 1.5666666 = ?

On a fully IEEE-754-compliant system (x86, 7450, etc.), the result
should be 3.22299e-44. But on an E500/E500v2 core, the result is 0.

There are many more cases showing that the E500 SPE core is not fully
IEEE-754 compliant.
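
The arithmetic behind this example can be checked on any IEEE 754 host
with a small sketch. In single precision, 2.1E-44 is stored as 15 units
of 2^-149 (the smallest denorm), and the product with 1.5666666 is about
23.5 units, which rounds to 23 units, i.e. roughly 3.22299e-44. The
helper names float_bits and mul_at_runtime are made up for this
illustration, and the sketch assumes the host does not run with
flush-to-zero enabled.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 bit pattern of a float. */
static uint32_t float_bits(float f)
{
	uint32_t u;

	memcpy(&u, &f, sizeof u);
	return u;
}

/*
 * Multiply through volatile objects so that the product is computed at
 * run time by the host's floating-point unit rather than folded by the
 * compiler.
 */
static float mul_at_runtime(float a, float b)
{
	volatile float x = a, y = b;
	volatile float r = x * y;

	return r;
}
```

Calling float_bits(mul_at_runtime(2.1e-44f, 1.5666666f)) on such a host
gives 23, a denormalized result, where the E500 default result is zero.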

The approach I've taken to solve this issue is:
1. Enable SPEFSCR[FINVE|FDBZE|FUNFE|FOVFE] so that the exceptions
can take place.
2. Use exception handlers to handle the exceptions.
3. Restore registers and exit from the exception.
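
Step 1 amounts to OR-ing four enable bits into the SPEFSCR value. The
sketch below only computes that value; actually writing the register
needs an mtspr on the target and is not shown. The bit values follow my
reading of the kernel's asm/reg_booke.h and should be checked against
that header, and the helper name spefscr_enable_traps is hypothetical.

```c
#include <stdint.h>

/*
 * SPEFSCR exception enable bits. Assumed values; verify them against
 * asm/reg_booke.h before relying on them.
 */
#define SPEFSCR_FINVE	0x00000020u	/* invalid operation/input enable */
#define SPEFSCR_FDBZE	0x00000010u	/* divide-by-zero enable */
#define SPEFSCR_FUNFE	0x00000008u	/* underflow enable */
#define SPEFSCR_FOVFE	0x00000004u	/* overflow enable */

#define SPEFSCR_FP_ENABLES \
	(SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | SPEFSCR_FOVFE)

/* Return the register value with the four enable bits set and all
 * other bits left unchanged. */
static uint32_t spefscr_enable_traps(uint32_t spefscr)
{
	return spefscr | SPEFSCR_FP_ENABLES;
}
```

Keeping the mask in one macro makes it easy to clear the same bits again
if a task should fall back to the hardware default results.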

In arch/powerpc/math-emu, there are files that emulate floating-point
instructions on non-FPU systems, which may come from glibc. They provide
macros to emulate addition, subtraction, multiplication, division, and
so on. I therefore reused some of that code and added new routines to
emulate the SPE instructions that may cause exceptions, including SPFP,
DPFP, and vector SPFP instructions.

Writing independent code to handle the exceptions may be an
alternative, but I think reusing the existing interfaces in the kernel
is the best approach.

Ebony

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard
  2007-01-12  6:38   ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400
@ 2007-01-12  6:49     ` Kumar Gala
  2007-01-12 12:03     ` Segher Boessenkool
  1 sibling, 0 replies; 45+ messages in thread
From: Kumar Gala @ 2007-01-12  6:49 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, Paul Mackerras


On Jan 12, 2007, at 12:38 AM, Zhu Ebony-r57400 wrote:

> Hi Paul,
>
> On SPE implemented E500/E500v2 core, the embedded floating-point
> APU implements a floating-point system as defined in ANSI/IEEE
> standard754-1985 but rely on software support in order to conform
> fully with the standard. Thus, whenever an input operand of a
> floating-point instruction has data values that are +infinity,
> -infinity, denorm, or NaN, or when the result of an operation
> produces an overflow or an underflow, an interrupt may be taken and
> the interrupt handler is responsible for delivering IEEE
> 754-compliant behavior if desired.

In addition to some other corner cases the HW punts on.

[snip]

> The approach I've taken to solve this issue is:
> 1. Enable SPEFSCR[FINVE|FDBZE|FUNFE|FOVFE] to make sure exceptions
> can take place
> 2. Use exceptions handlers to handle the exceptions.
> 3. Restore registers and exit from exception.
>
> In arch/powerpc/math, there are some files to emulate floating
> point instructions on non-FPU systems, which may come from glibc.
> Some macros are provided to emulate plus, minus, multiply, divide,
> etc. Therefore, I re-used some of the codes there and add some new
> routines to emulated SPE instruction that may cause exception,
> including SPFP instructions, DPFP instructions and Vector SPFP
> instructions.
>
> Writing some independent codes to handle the exceptions my be an
> alternative way, but I think re-use the existing interfaces in
> kernel is the best approach.

I don't believe there is any other way to solve this problem.  On
these particular exceptions, the HW doesn't provide any real assist
and we have to recompute the result from scratch.
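
As a rough user-space illustration of "recompute from scratch": the
handler takes the raw operand bit patterns from the registers, redoes
the operation with full IEEE semantics, and writes the result back. This
sketch uses the host's own IEEE arithmetic in place of the math-emu
macros, and the operation codes and function names are hypothetical; a
real handler would decode the operation from the faulting instruction.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical operation codes for the four basic SPFP operations. */
enum sp_op { SP_ADD, SP_SUB, SP_MUL, SP_DIV };

static float bits_to_float(uint32_t u)
{
	float f;

	memcpy(&f, &u, sizeof f);
	return f;
}

static uint32_t float_to_bits(float f)
{
	uint32_t u;

	memcpy(&u, &f, sizeof u);
	return u;
}

/*
 * Recompute rD = rA op rB from the raw register contents. The volatile
 * operands force the arithmetic to happen at run time.
 */
static uint32_t sp_recompute(enum sp_op op, uint32_t ra, uint32_t rb)
{
	volatile float a = bits_to_float(ra);
	volatile float b = bits_to_float(rb);
	float r;

	switch (op) {
	case SP_ADD: r = a + b; break;
	case SP_SUB: r = a - b; break;
	case SP_MUL: r = a * b; break;
	default:     r = a / b; break;	/* SP_DIV */
	}
	return float_to_bits(r);
}
```

Working on bit patterns rather than float values mirrors the fact that
SPE keeps floating-point operands in the general-purpose registers.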

Once we agree the approach is reasonable, I'll comment on the actual
handlers.

- k

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard
  2007-01-12  6:38   ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400
  2007-01-12  6:49     ` Kumar Gala
@ 2007-01-12 12:03     ` Segher Boessenkool
  2007-01-15  8:16       ` Zhu Ebony-r57400
  1 sibling, 1 reply; 45+ messages in thread
From: Segher Boessenkool @ 2007-01-12 12:03 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, Paul Mackerras

> When floating-point underflow exceptions are disabled
> (SPEFSCR[FUNFE] is cleared) and the result of a floating-point 
> operation
> underflows, a signed zero result is produced.

You probably want to make at least this one tweakable
per-process; on some important algorithms (some FFTs
etc.) the results are perfectly acceptable with underflow-
to-zero and the performance difference is huge.  AltiVec
can do this per-process too, for example -- it has a
user register for setting this, you need some other
interface though.

Segher


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard
  2007-01-12 12:03     ` Segher Boessenkool
@ 2007-01-15  8:16       ` Zhu Ebony-r57400
  2007-01-15 16:08         ` Segher Boessenkool
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-15  8:16 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linuxppc-dev, Paul Mackerras

> -----Original Message-----
> From: Segher Boessenkool [mailto:segher@kernel.crashing.org]
> Sent: January 12, 2007 20:04
> To: Zhu Ebony-r57400
> Cc: Paul Mackerras; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754standard
>
> > When floating-point underflow exceptions are disabled
> > (SPEFSCR[FUNFE] is cleared) and the result of a floating-point
> > operation underflows, a signed zero result is produced.
>
> You probably want to make at least this one tweakable
> per-process; on some important algorithms (some FFTs
> etc.) the results are perfectly acceptable with
> underflow-to-zero and the performance difference is huge.
> AltiVec can do this per-process too, for example -- it has a user
> register for setting this, you need some other interface though.
>
> Segher

Do you mean adding a switch that lets the user choose whether to enable
exception handling or just use the default values? If so, a separate
CONFIG_SPE_MATH_EMU option in Kconfig seems reasonable...

Ebony


* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard
  2007-01-15  8:16       ` Zhu Ebony-r57400
@ 2007-01-15 16:08         ` Segher Boessenkool
  0 siblings, 0 replies; 45+ messages in thread
From: Segher Boessenkool @ 2007-01-15 16:08 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, Paul Mackerras

> Do you mean we can make a switch in order to let user choose
> whether to enable exception handling or just use default value?

Yes exactly.  I'm not sure what kind of interface you should
use for this though, maybe a sysctl?  Someone else can tell
you I hope :-)


Segher


* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12  5:19 [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Zhu Ebony-r57400
  2007-01-12  5:29 ` Paul Mackerras
@ 2007-01-12  6:41 ` Kumar Gala
  2007-01-12  8:09   ` Zhu Ebony-r57400
  1 sibling, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-01-12  6:41 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote:

> Hi Paul,
>
> This series of patch add support to fully comply with IEEE-754  
> standard
> for E500/E500v2 core when hardware floating point compiling is used.
>
> Ebony

Here are some general comments:
* We should be able to support math-emu (as it stands) and the fixup
  handling [you break math-emu]
* Copyrights / header comments should give credit to the orig math-emu
  code
* Why isn't there any handling of SPEFloatingPointRound exceptions?

- k


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12  6:41 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Kumar Gala
@ 2007-01-12  8:09   ` Zhu Ebony-r57400
  2007-01-12 12:04     ` Segher Boessenkool
  2007-01-12 18:36     ` Kumar Gala
  0 siblings, 2 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-12  8:09 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

Hi Kumar,

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: January 12, 2007 14:42
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote:
>
> > Hi Paul,
> >
> > This series of patch add support to fully comply with IEEE-754
> > standard for E500/E500v2 core when hardware floating point
> > compiling is used.
> >
> > Ebony
>
> Here are some general comments:
> * We should be able to support math-emu (as it stands) and
> the fixup handling [you break math-emu]

I don't think I break math-emu. I think the code I added has no impact
on the existing math-emu.

> * Copyrights / header comments should give credit to the orig
> math-emu code

I'd like to do this, but in most of the handler code I can't find
copyright information for the original authors. I think the math-emu
code comes from glibc. In sigfpe_handler.c, I gave credit to the
original author.

> * Why isn't there any handling of SPEFloatingPointRound exceptions?

I think the SPEFloatingPointRound exception does not need to be handled
if we handle floating-point exceptions this way.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12  8:09   ` Zhu Ebony-r57400
@ 2007-01-12 12:04     ` Segher Boessenkool
  2007-01-15  6:45       ` Zhu Ebony-r57400
  2007-01-12 18:36     ` Kumar Gala
  1 sibling, 1 reply; 45+ messages in thread
From: Segher Boessenkool @ 2007-01-12 12:04 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus

>> * Why isn't there any handling of SPEFloatingPointRound exceptions?
>
> I think the SPEFloatingPointRound exception is not necessary to handle
> if we handle floating point exception this way.

Some more explanation than "I don't think so" would
be nice ;-)


Segher

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12 12:04     ` Segher Boessenkool
@ 2007-01-15  6:45       ` Zhu Ebony-r57400
  2007-01-15 15:54         ` Segher Boessenkool
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-15  6:45 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Segher Boessenkool [mailto:segher@kernel.crashing.org]
> Sent: 2007-01-12 20:05
> To: Zhu Ebony-r57400
> Cc: Kumar Gala; paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
> >> * Why isn't there any handling of SPEFloatingPointRound exceptions?
> >
> > I think the SPEFloatingPointRound exception is not necessary to handle
> > if we handle floating point exception this way.
>
> Some more explanation than "I don't think so" would be nice ;-)
>

Thanks for the reminder. I would like to correct what I said before.
An FP round interrupt may be taken in some circumstances where an FP
data interrupt does not occur, so we may still have to handle the round
interrupt to fully comply with IEEE-754. Do you think handling the FP
round interrupt the same way as the FP data interrupt is a reasonable
approach?

Ebony

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-15  6:45       ` Zhu Ebony-r57400
@ 2007-01-15 15:54         ` Segher Boessenkool
  0 siblings, 0 replies; 45+ messages in thread
From: Segher Boessenkool @ 2007-01-15 15:54 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus

> Thanks for your reminder, I would like to correct what I said before.
> FP round interrupt may be taken on some circumstance that FP
> data interrupt doesn't take place, so we may still have to handle round
> interrupt to fully comply IEEE-754. Do you think using the same way to
> handle FP round interrupt as FP data interrupt is a reasonable 
> approach?

Well you would take and handle the exception with similar
code, based on the same config option too.  The actual way
to handle the math is very different I suppose, like Kumar
said.


Segher

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12  8:09   ` Zhu Ebony-r57400
  2007-01-12 12:04     ` Segher Boessenkool
@ 2007-01-12 18:36     ` Kumar Gala
  2007-01-15  6:37       ` Zhu Ebony-r57400
  1 sibling, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-01-12 18:36 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Jan 12, 2007, at 2:09 AM, Zhu Ebony-r57400 wrote:

>  Hi Kumar,
>
>> -----Original Message-----
>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>> Sent: 2007-01-12 14:42
>> To: Zhu Ebony-r57400
>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
>> Subject: Re: [patch][0/5] powerpc: Add support to fully
>> comply with IEEE-754 standard
>>
>>
>> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote:
>>
>>> Hi Paul,
>>>
>>> This series of patch add support to fully comply with IEEE-754
>>> standard for E500/E500v2 core when hardware floating point
>> compiling
>>> is used.
>>>
>>> Ebony
>>
>> Here are some general comments:
>> * We should be able to support math-emu (as it stands) and
>> the fixup handling [you break math-emu]
>
> I don't think I break the math-emu. I think the codes I added have
> no impact to the existing math-emu.

This snippet of code breaks it from math-emu/sfp-machine.h

>> +#ifdef CONFIG_SPE
>> +#define __FPU_FPSCR	(current->thread.spefscr)
>> +#else
>>  #define __FPU_FPSCR	(current->thread.fpscr.val)
>> +#endif

By doing this, if I want 'classic FP' emulation as well as the IEEE
fixup, my fpscr for classic emu will not be updated properly.

>
>> * Copyrights / header comments should give credit to the orig
>> math-emu code
> I'd like to do this, but in most handler codes, I can't find
> copyright information of the orig authors. I think the math-emu code
> comes from glibc. In the sigfpe_handler.c, I gave credit to the orig
> author.

I think a comment stating that this is taken from the math-emu code is
sufficient.

>> * Why isn't there any handling of SPEFloatingPointRound exceptions?
>
> I think the SPEFloatingPointRound exception is not necessary to
> handle if we handle floating point exception this way.

I don't believe this; you'll have to explain if this is really true.
But I'm almost sure that if the RND mode is set to +/-inf and we do
an operation within the normal bounds that should round, we
will NOT get one of the other exceptions.

- k

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-12 18:36     ` Kumar Gala
@ 2007-01-15  6:37       ` Zhu Ebony-r57400
  2007-01-15 14:37         ` Kumar Gala
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-15  6:37 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: 2007-01-13 02:36
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Jan 12, 2007, at 2:09 AM, Zhu Ebony-r57400 wrote:
>
> >  Hi Kumar,
> >
> >> -----Original Message-----
> >> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> >> Sent: 2007-01-12 14:42
> >> To: Zhu Ebony-r57400
> >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> >> Subject: Re: [patch][0/5] powerpc: Add support to fully comply with
> >> IEEE-754 standard
> >>
> >>
> >> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote:
> >>
> >>> Hi Paul,
> >>>
> >>> This series of patch add support to fully comply with IEEE-754
> >>> standard for E500/E500v2 core when hardware floating point
> >>> compiling is used.
> >>>
> >>> Ebony
> >>
> >> Here are some general comments:
> >> * We should be able to support math-emu (as it stands) and the fixup
> >> handling [you break math-emu]
> >
> > I don't think I break the math-emu. I think the codes I added have no
> > impact to the existing math-emu.
>
> This snippet of code breaks it from math-emu/sfp-machine.h
>
> >> +#ifdef CONFIG_SPE
> >> +#define __FPU_FPSCR	(current->thread.spefscr)
> >> +#else
> >>  #define __FPU_FPSCR	(current->thread.fpscr.val)
> >> +#endif
>
> By doing this if I want 'classic FP' emulation as well as the
> IEEE fixup my fpscr for classic emu will not be updated properly.

Logically, a user can choose "SPE Support" and "Math emulation" at the
same time in menuconfig. But from my understanding, it is not necessary
to select math-emu on a system with SPE, since SPE can do the math
operations.


>
> >
> >> * Copyrights / header comments should give credit to the orig
> >> math-emu code
> > I'd like to do this, but in most handler codes, I can't find
> > copyright information of the orig authors. I think the math-emu code
> > comes from glibc. In the sigfpe_handler.c, I gave credit to the orig
> > author.
>
> I think a comment is sufficient stating this is take from the math-emu
> code.
>
> >> * Why isn't there any handling of SPEFloatingPointRound exceptions?
> >
> > I think the SPEFloatingPointRound exception is not necessary to
> > handle if we handle floating point exception this way.
>
> I dont believe this, you'll have to explain if this is really true.
> But, I'm almost sure that if the RND mode is set to +/-inf and we do
> an operation that is within the normal bounds that should round we
> will NOT get one of the other exceptions.
>
> - k
>
>

I looked into the manual again and found that what you are saying is
correct. The IEEE-754 fixup was developed in response to a customer
complaint that denormalized computations do not produce the same results
as on x86, so I was concentrating on the floating-point data interrupt.
The truth is that an FP round interrupt may be taken in some circumstances
where an FP data interrupt does not occur.

As you said, if RND mode is set to +/- inf, an FP round interrupt will be
generated if we do an operation within the normal bounds. Do you think it
is reasonable to handle the FP round interrupt the same way as the FP data
interrupt? What would you suggest?

Thanks.

Ebony

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-15  6:37       ` Zhu Ebony-r57400
@ 2007-01-15 14:37         ` Kumar Gala
  2007-01-16  9:54           ` Zhu Ebony-r57400
                             ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Kumar Gala @ 2007-01-15 14:37 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Jan 15, 2007, at 12:37 AM, Zhu Ebony-r57400 wrote:

>
>
>> -----Original Message-----
>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>> Sent: 2007-01-13 02:36
>> To: Zhu Ebony-r57400
>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
>> Subject: Re: [patch][0/5] powerpc: Add support to fully
>> comply with IEEE-754 standard
>>
>>
>> On Jan 12, 2007, at 2:09 AM, Zhu Ebony-r57400 wrote:
>>
>>>  Hi Kumar,
>>>
>>>> -----Original Message-----
>>>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>>>> Sent: 2007-01-12 14:42
>>>> To: Zhu Ebony-r57400
>>>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
>>>> Subject: Re: [patch][0/5] powerpc: Add support to fully
>> comply with
>>>> IEEE-754 standard
>>>>
>>>>
>>>> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote:
>>>>
>>>>> Hi Paul,
>>>>>
>>>>> This series of patch add support to fully comply with IEEE-754
>>>>> standard for E500/E500v2 core when hardware floating point
>>>> compiling
>>>>> is used.
>>>>>
>>>>> Ebony
>>>>
>>>> Here are some general comments:
>>>> * We should be able to support math-emu (as it stands) and
>> the fixup
>>>> handling [you break math-emu]
>>>
>>> I don't think I break the math-emu. I think the codes I
>> added have no
>>> impact to the existing math-emu.
>>
>> This snippet of code breaks it from math-emu/sfp-machine.h
>>
>>>> +#ifdef CONFIG_SPE
>>>> +#define __FPU_FPSCR	(current->thread.spefscr)
>>>> +#else
>>>>  #define __FPU_FPSCR	(current->thread.fpscr.val)
>>>> +#endif
>>
>> By doing this if I want 'classic FP' emulation as well as the
>> IEEE fixup my fpscr for classic emu will not be updated properly.
>
> Logically, user can choose "SPE Support" and "Math emulation" at the
> same time on menuconfig. But from my understanding, it is not
> necessary to select math-emu on a SPE available system, since SPE can
> do math operation.

This is not true.  If I want to run a "classic" PPC binary with FP I
need "Math emulation" and if I want to run an SPE one I enable "SPE
Support".  I could want to run both of these types of binaries on the
same system at the same time.

>>>> * Copyrights / header comments should give credit to the orig
>>>> math- emu code
>>> I'd like to do this, but in most handler codes, I can't find
>>> copyright information
>>> of the orig authors. I think the math-emu code comes from
>> glibc. In
>>> the
>>> sigfpe_handler.c, I gave credit to the orig author.
>>
>> I think a comment is sufficient stating this is take from the math-
>> emu code.
>>
>>>> * Why isn't there any handling of SPEFloatingPointRound exceptions?
>>>
>>> I think the SPEFloatingPointRound exception is not necessary to
>>> handle if we
>>> handle floating point exception this way.
>>
>> I dont believe this, you'll have to explain if this is really true.
>> But, I'm almost sure that if the RND mode is set to +/-inf and we do
>> an operation that is within the normal bounds that should round we
>> will NOT get one of the other exceptions.
>>
>> - k
>>
>>
>
> I looked into the manual again, and found what you are saying is
> correct. The reason for developing IEEE-754 fixup came from customer's
> complain, which is about denormalized computation can't generate the
> correct result as the same as on x86. So what I was concentrating on
> is floating-point data interrupt. The truth is, FP round interrupt may
> be taken on some circumstance that FP data interrupt doesn't take
> place.
>
> As you said, if RND mode is set to +/- inf, FP round interrupt will
> generate if we do an operation within the normal bounds. Do you think
> we use the same way to handle FP round interrupt as FP data interrupt
> is reasonable? How would you suggest?

No, I think the round handler should try to do the rounding by hand.
Since you have the non-rounded information provided by HW, it's much
simpler to just do the rounding step.
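
A minimal sketch of what "rounding by hand" could look like, assuming the
handler can obtain the truncated single-precision bit pattern together
with a guard bit (the first discarded bit) and a sticky bit (the OR of all
later discarded bits). The names round_by_hand, guard, sticky and the enum
below are hypothetical; they are not taken from the patch or from the
e500 manual, and the enum values are not the FRMC encodings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rounding-mode names for illustration only. */
enum round_mode { RND_NEAREST, RND_ZERO, RND_PLUS_INF, RND_MINUS_INF };

/*
 * Round a truncated IEEE-754 single-precision bit pattern.
 * Adding 1 to the bit pattern moves to the next value of larger
 * magnitude, carrying from the fraction into the exponent if needed.
 */
uint32_t round_by_hand(uint32_t truncated, bool guard, bool sticky,
                       enum round_mode mode)
{
    bool negative = (truncated >> 31) != 0;
    bool inexact = guard || sticky;
    bool bump = false;

    switch (mode) {
    case RND_NEAREST:
        /* Round half to even: ties go to the pattern with LSB clear. */
        bump = guard && (sticky || (truncated & 1u));
        break;
    case RND_ZERO:
        /* Truncation is already the round-toward-zero result. */
        bump = false;
        break;
    case RND_PLUS_INF:
        bump = inexact && !negative;
        break;
    case RND_MINUS_INF:
        bump = inexact && negative;
        break;
    }
    return bump ? truncated + 1u : truncated;
}
```

Under these assumptions the handler never needs the original operands; it
only needs the truncated result and the two discarded-bit summaries.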

- k

>
> Thanks.
>
> Ebony
>
>
>
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-15 14:37         ` Kumar Gala
@ 2007-01-16  9:54           ` Zhu Ebony-r57400
  2007-01-25  8:25           ` Zhu Ebony-r57400
  2007-02-07  5:52           ` Zhu Ebony-r57400
  2 siblings, 0 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-16  9:54 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: 2007-01-15 22:37
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Jan 15, 2007, at 12:37 AM, Zhu Ebony-r57400 wrote:
>
> >
> >
> >> -----Original Message-----
> >> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> >> Sent: 2007-01-13 02:36
> >> To: Zhu Ebony-r57400
> >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> >> Subject: Re: [patch][0/5] powerpc: Add support to fully comply with
> >> IEEE-754 standard
> >>
> >>
> >> On Jan 12, 2007, at 2:09 AM, Zhu Ebony-r57400 wrote:
> >>
> >>>  Hi Kumar,
> >>>
> >>>> -----Original Message-----
> >>>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> >>>> Sent: 2007-01-12 14:42
> >>>> To: Zhu Ebony-r57400
> >>>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> >>>> Subject: Re: [patch][0/5] powerpc: Add support to fully comply with
> >>>> IEEE-754 standard
> >>>>
> >>>>
> >>>> On Jan 11, 2007, at 11:19 PM, Zhu Ebony-r57400 wrote:
> >>>>
> >>>>> Hi Paul,
> >>>>>
> >>>>> This series of patch add support to fully comply with IEEE-754
> >>>>> standard for E500/E500v2 core when hardware floating point
> >>>>> compiling is used.
> >>>>>
> >>>>> Ebony
> >>>>
> >>>> Here are some general comments:
> >>>> * We should be able to support math-emu (as it stands) and the fixup
> >>>> handling [you break math-emu]
> >>>
> >>> I don't think I break the math-emu. I think the codes I added have no
> >>> impact to the existing math-emu.
> >>
> >> This snippet of code breaks it from math-emu/sfp-machine.h
> >>
> >>>> +#ifdef CONFIG_SPE
> >>>> +#define __FPU_FPSCR	(current->thread.spefscr)
> >>>> +#else
> >>>>  #define __FPU_FPSCR	(current->thread.fpscr.val)
> >>>> +#endif
> >>
> >> By doing this if I want 'classic FP' emulation as well as the IEEE
> >> fixup my fpscr for classic emu will not be updated properly.
> >
> > Logically, user can choose "SPE Support" and "Math emulation" at the
> > same time on menuconfig. But from my understanding, it is not
> > necessary to select math-emu on a SPE available system, since SPE can
> > do math operation.
>
> This is not true.  If I want to run a "classic" PPC binary
> with FP I need "Math emulation" and if I want to run an SPE
> one I enable "SPE Support".  I could want to run both of
> these types of binaries on the same system at the same time.
>
So how about defining a separate macro for spefscr?

 #define __FPU_SPEFSCR	(current->thread.spefscr)
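
The idea can be sketched in user space as follows. This is only an
illustration under assumptions: the struct layout is a stand-in for the
kernel's thread state, and classic_emu_raise() and spe_fixup_raise() are
hypothetical helpers, not functions from math-emu or from the patch. The
point is that each emulation path updates its own status register, so
enabling both options does not redirect one path to the other's register.

```c
/* Stand-in types; the real kernel structures differ. */
struct fpscr_sketch { unsigned int val; };
struct thread_sketch {
    struct fpscr_sketch fpscr;  /* classic FP status/control */
    unsigned int spefscr;       /* SPE status/control */
};
struct task_sketch { struct thread_sketch thread; };

struct task_sketch task0;
struct task_sketch *current = &task0;

/* Two separate macros instead of one macro switched by CONFIG_SPE. */
#define __FPU_FPSCR    (current->thread.fpscr.val)
#define __FPU_SPEFSCR  (current->thread.spefscr)

/* Hypothetical classic math-emu path: touches only fpscr. */
void classic_emu_raise(unsigned int flags)
{
    __FPU_FPSCR |= flags;
}

/* Hypothetical SPE IEEE fixup path: touches only spefscr. */
void spe_fixup_raise(unsigned int flags)
{
    __FPU_SPEFSCR |= flags;
}
```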


> >>>> * Copyrights / header comments should give credit to the orig
> >>>> math-emu code
> >>> I'd like to do this, but in most handler codes, I can't find
> >>> copyright information of the orig authors. I think the math-emu
> >>> code comes from glibc. In the sigfpe_handler.c, I gave credit to
> >>> the orig author.
> >>
> >> I think a comment is sufficient stating this is take from the
> >> math-emu code.
> >>
> >>>> * Why isn't there any handling of SPEFloatingPointRound exceptions?
> >>>
> >>> I think the SPEFloatingPointRound exception is not necessary to
> >>> handle if we handle floating point exception this way.
> >>
> >> I dont believe this, you'll have to explain if this is really true.
> >> But, I'm almost sure that if the RND mode is set to +/-inf and we do
> >> an operation that is within the normal bounds that should round we
> >> will NOT get one of the other exceptions.
> >>
> >> - k
> >>
> >>
> >
> > I looked into the manual again, and found what you are saying is
> > correct. The reason for developing IEEE-754 fixup came from customer's
> > complain, which is about denormalized computation can't generate the
> > correct result as the same as on x86. So what I was concentrating on
> > is floating-point data interrupt. The truth is, FP round interrupt may
> > be taken on some circumstance that FP data interrupt doesn't take
> > place.
> >
> > As you said, if RND mode is set to +/- inf, FP round interrupt will
> > generate if we do an operation within the normal bounds. Do you think
> > we use the same way to handle FP round interrupt as FP data interrupt
> > is reasonable? How would you suggest?
>
> No, I think the round handler should try to do the rounding by hand.
> Since you have the non rounded information provided by HW,
> its much simpler to just do the rounding step.
>

OK, I will study it.

Thanks,
Ebony

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-15 14:37         ` Kumar Gala
  2007-01-16  9:54           ` Zhu Ebony-r57400
@ 2007-01-25  8:25           ` Zhu Ebony-r57400
  2007-01-25  8:28             ` Kumar Gala
  2007-02-07  5:52           ` Zhu Ebony-r57400
  2 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-25  8:25 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> No, I think the round handler should try to do the rounding by hand.
> Since you have the non rounded information provided by HW,
> its much simpler to just do the rounding step.

Hi Kumar,

I have some new thoughts about the rounding handler.
Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) and a normal
"efsmul" generates a rounding interrupt. At that point, according to the
manual, the unrounded (truncated) result is placed in the target register.
Please note that the target register contains a hexadecimal representation
of a floating point number. Since it represents a floating point number
exactly, we cannot round it anymore.

Maybe we still need to emulate the whole "efsmul" instruction in
software.

What do you think? Any idea is appreciated!

B.R.
Ebony

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-25  8:25           ` Zhu Ebony-r57400
@ 2007-01-25  8:28             ` Kumar Gala
  2007-01-25  8:53               ` Zhu Ebony-r57400
  0 siblings, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-01-25  8:28 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Jan 25, 2007, at 2:25 AM, Zhu Ebony-r57400 wrote:

>
>> No, I think the round handler should try to do the rounding
>> by hand.
>> Since you have the non rounded information provided by HW,
>> its much simpler to just do the rounding step.
>
> Hi Kumar,
>
> I have some new thoughts about rounding handler.
> Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) and
> a normal "efsmul" may generate rounding interrupt. At this time,
> according
> to manual, unrounded (truncated) result is placed in the target
> register. Please
> note the target register contains a hexadecimal representation of a
> floating point number. Since it represents a floating point number
> exactly
> so we can not round it anymore.

I don't follow what you mean by not being able to round it anymore.

> Maybe we still need to emulate the whole "efsmul" instruction by
> software.

You can't always do that.  Think about the following instruction:

	efsmul	r3, r3, r3

You'll have lost the original value of r3 when the exception occurs.
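
As a toy illustration of the overwrite, assuming only that the
destination register is written before the handler runs: gpr_model and
efsmul_model below are hypothetical names, and the model uses an ordinary
C float multiply, so it shows the lost operand but not the truncation
behaviour of the real hardware.

```c
/* Toy register file; index 3 stands in for r3. */
float gpr_model[32];

/* Models only the destination write of "efsmul rD, rA, rB". */
void efsmul_model(int rd, int ra, int rb)
{
    gpr_model[rd] = gpr_model[ra] * gpr_model[rb];
}
```

With rd == ra == rb the input is gone once the result has been written,
so a handler that runs afterwards cannot redo the multiply from the
register contents alone.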

- k

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-25  8:28             ` Kumar Gala
@ 2007-01-25  8:53               ` Zhu Ebony-r57400
  2007-01-25 15:10                 ` Kumar Gala
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-25  8:53 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: 2007-01-25 16:29
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Jan 25, 2007, at 2:25 AM, Zhu Ebony-r57400 wrote:
>
> >
> >> No, I think the round handler should try to do the rounding by hand.
> >> Since you have the non rounded information provided by HW, its much
> >> simpler to just do the rounding step.
> >
> > Hi Kumar,
> >
> > I have some new thoughts about rounding handler.
> > Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) and a normal
> > "efsmul" may generate rounding interrupt. At this time, according to
> > manual, unrounded (truncated) result is placed in the target register.
> > Please note the target register contains a hexadecimal representation
> > of a floating point number. Since it represents a floating point
> > number exactly so we can not round it anymore.
>
> I don't follow what you mean by not being able to round it anymore.

Let me try to make myself clear.
From my understanding, rounding maps a floating point number to another
one that can be represented by an IEEE-754 compliant hexadecimal encoding;
it does not map one hexadecimal encoding to another. For example:

Assume the result we got from efsmul is 3.29305125103e-44.
It will be stored in the target register as 0x00000017. However, 0x00000017
represents 3.2229864679470793e-44 exactly. Can we round
3.2229864679470793e-44?
I'm afraid not. I mean, we must round the result before it is stored in
the target register as hexadecimal, not after.
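
The numbers in this example can be checked with a small user-space
sketch, assuming the host's float is IEEE-754 single precision. The
helper names bits_to_float and float_to_bits are made up for the
illustration. The pattern 0x00000017 is a denormal equal to 23 * 2^-149,
and its upper neighbour 0x00000018 is 24 * 2^-149; the value
3.29305125103e-44 lies between the two.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without changing any bits. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret a float as its 32-bit pattern. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Printing bits_to_float(0x00000017u) with a format such as "%.17g" is one
way to compare the decoded value against the figure quoted above.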



> > Maybe we still need to emulate the whole "efsmul" instruction by
> > software.
>
> You can't always do that.  Think about the following instruction:
>
> 	efsmul	r3, r3, r3
>
> You'll have lost the original value of r3 when the exception occurs.

If this operation causes an FP data interrupt, just let the data interrupt
handler do the simulation. I think there's no chance that we get data and
round interrupts simultaneously.


>
> - k
>
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-25  8:53               ` Zhu Ebony-r57400
@ 2007-01-25 15:10                 ` Kumar Gala
  2007-01-26  6:16                   ` Zhu Ebony-r57400
  2007-01-29 10:00                   ` Zhu Ebony-r57400
  0 siblings, 2 replies; 45+ messages in thread
From: Kumar Gala @ 2007-01-25 15:10 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Jan 25, 2007, at 2:53 AM, Zhu Ebony-r57400 wrote:

>
>
>> -----Original Message-----
>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>> Sent: 2007-01-25 16:29
>> To: Zhu Ebony-r57400
>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
>> Subject: Re: [patch][0/5] powerpc: Add support to fully
>> comply with IEEE-754 standard
>>
>>
>> On Jan 25, 2007, at 2:25 AM, Zhu Ebony-r57400 wrote:
>>
>>>
>>>> No, I think the round handler should try to do the
>> rounding by hand.
>>>> Since you have the non rounded information provided by HW,
>> its much
>>>> simpler to just do the rounding step.
>>>
>>> Hi Kumar,
>>>
>>> I have some new thoughts about rounding handler.
>>> Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf)
>> and a normal
>>> "efsmul" may generate rounding interrupt. At this time,
>> according to
>>> manual, unrounded (truncated) result is placed in the
>> target register.
>>> Please note the target register contains a hexadecimal
>> representation
>>> of a floating point number. Since it represents a floating point
>>> number exactly so we can not round it anymore.
>>
>> I don't follow what you mean by not being able to round it anymore.
>
> I try to make myself clear:
> From my understanding, rounding is from a floating point number to
> another which can be represented by IEEE-754 complied hexadecimal, but
> not from a hexadecimal to another. For example:
>
> Assume the result we got from efsmul is 3.29305125103e-44
> It will be stored in target register as 0x00000017. However, 0x00000017
> Represents 3.2229864679470793e-44 accurately. Can we round
> 3.2229864679470793e-44?
> I'm afraid not. I mean, we must round the result before it being stored
> in target register as hexadecimal, not after.

I still don't follow what you are getting at.  The HW stores the
non-rounded result.  You seem to imply there is some format change or
something that is going on between the computed result and what's
stored in the register.  If the result is such that it doesn't need
rounding then you don't round (I forget if you get an exception or
not if G|X are not set).

>>> Maybe we still need to emulate the whole "efsmul" instruction by
>>> software.
>>
>> You can't always do that.  Think about the following instruction:
>>
>> 	efsmul	r3, r3, r3
>>
>> You'll have lost the original value of r3 when the exception occurs.
>
> If this operation causes FP data interrupt, just let data interrupt
> handler to do the simulation. I think there's no chance that we get
> data and round interrupts simultaneously.

Agreed, and I think FP data will take precedence, however the example
I use can still cause a round exception and no data exception given
the right input values.

- k

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-25 15:10                 ` Kumar Gala
@ 2007-01-26  6:16                   ` Zhu Ebony-r57400
  2007-01-29 10:00                   ` Zhu Ebony-r57400
  1 sibling, 0 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-26  6:16 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: 2007-01-25 23:11
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Jan 25, 2007, at 2:53 AM, Zhu Ebony-r57400 wrote:
>
> >
> >
> >> -----Original Message-----
> >> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> >> Sent: 2007-01-25 16:29
> >> To: Zhu Ebony-r57400
> >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> >> Subject: Re: [patch][0/5] powerpc: Add support to fully comply with
> >> IEEE-754 standard
> >>
> >>
> >> On Jan 25, 2007, at 2:25 AM, Zhu Ebony-r57400 wrote:
> >>
> >>>
> >>>> No, I think the round handler should try to do the rounding by hand.
> >>>> Since you have the non rounded information provided by HW, its much
> >>>> simpler to just do the rounding step.
> >>>
> >>> Hi Kumar,
> >>>
> >>> I have some new thoughts about rounding handler.
> >>> Suppose we set SPEFSCR[FRMC]=0b10 (rounding towards +Inf) and a normal
> >>> "efsmul" may generate rounding interrupt. At this time, according to
> >>> manual, unrounded (truncated) result is placed in the target register.
> >>> Please note the target register contains a hexadecimal representation
> >>> of a floating point number. Since it represents a floating point
> >>> number exactly so we can not round it anymore.
> >>
> >> I don't follow what you mean by not being able to round it anymore.
> >
> > I try to make myself clear:
> > From my understanding, rounding is from a floating point number to
> > another which can be represented by IEEE-754 complied hexadecimal, but
> > not from a hexadecimal to another. For example:
> >
> > Assume the result we got from efsmul is 3.29305125103e-44
> > It will be stored in target register as 0x00000017. However, 0x00000017
> > Represents 3.2229864679470793e-44 accurately. Can we round
> > 3.2229864679470793e-44?
> > I'm afraid not. I mean, we must round the result before it being
> > stored in target register as hexadecimal, not after.
>
> I still don't follow what you are getting at.  The HW stores
> the non-rounded result.  You seem to imply there is some
> format change or something that is going on between the
> computed result and what's stored in the register.  If the
> result is such that it doesn't need rounding than you don't
> round (I forget if you can an exception or not if G|X are not set).

I'm now confused on this point. If I get a round interrupt and the target
register is 0x00000017, what should the rounded result be for 0x00000017?
What is it when rounding to zero? To nearest? To +Inf? To -Inf?
From the information that 0x00000017 is a non-rounded result alone, we
still don't know how to round it. 0x00000017 is itself an inexact value.

Actually, in the existing math-emu code, rounding takes place when packing
the bits back into the native fp result.

> >>> Maybe we still need to emulate the whole "efsmul" instruction by=20
> >>> software.
> >>
> >> You can't always do that.  Think about the following instruction:
> >>
> >> 	efsmul	r3, r3, r3
> >>
> >> You'll have lost the original value of r3 when the=20
> exception occurs.
> >
> > If this operation causes FP data interrupt, just let data interrupt=20
> > handler to do the simulation. I think there's no chance that we get=20
> > data and round interrupts simultaneously.
>=20
> Agreed, and I think FP data will take precedence, however the=20
> example I use can still cause a round exception and no data=20
> exception given the right input values.
OK, I think I need to run some more tests to confirm this.


Thanks for your feedback!

B.R.
Ebony


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-25 15:10                 ` Kumar Gala
  2007-01-26  6:16                   ` Zhu Ebony-r57400
@ 2007-01-29 10:00                   ` Zhu Ebony-r57400
  2007-01-29 14:30                     ` Kumar Gala
  1 sibling, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-29 10:00 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

Hi Kumar,

I think setting SPEFSCR[FINXE], or SPEFSCR[FRMC]=0b10/0b11, to enable the
FP round interrupt will make the exception occur very often, which
will dramatically reduce the performance of SPE instructions. Do you
think a menuconfig option that lets the user choose whether to
enable FP round simulation is a reasonable idea?

Thanks.

Ebony


* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-29 10:00                   ` Zhu Ebony-r57400
@ 2007-01-29 14:30                     ` Kumar Gala
  2007-01-31  9:45                       ` Zhu Ebony-r57400
  0 siblings, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-01-29 14:30 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Jan 29, 2007, at 4:00 AM, Zhu Ebony-r57400 wrote:

> Hi Kumar,
>
> I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11 to  
> enable FP
> round interrupt will cause the exception occurring very often, which
> will dramatically decrease the performance of SPE instructions. Do you
> think putting an option in menuconfig to let user choose whether to
> enable FP round simulation is a reasonable idea?

I don't see any issue with it, but I have to believe if you want full  
IEEE results, you want full IEEE results for everything.

- k


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-29 14:30                     ` Kumar Gala
@ 2007-01-31  9:45                       ` Zhu Ebony-r57400
  2007-01-31 14:48                         ` Kumar Gala
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-01-31  9:45 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Monday, January 29, 2007 10:31 PM
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Jan 29, 2007, at 4:00 AM, Zhu Ebony-r57400 wrote:
>
> > Hi Kumar,
> >
> > I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11 to enable
> > FP round interrupt will cause the exception occurring very often,
> > which will dramatically decrease the performance of SPE instructions.
> > Do you think putting an option in menuconfig to let user choose
> > whether to enable FP round simulation is a reasonable idea?
>
> I don't see any issue with it, but I have to believe if you
> want full IEEE results, you want full IEEE results for everything.
>
> - k
>

Agreed, we need to fully comply with IEEE 754. So let's talk about
the handler.

The round exceptions fall into two categories:

1. SPEFSCR[FRMC] = 0b10 or 0b11 (rounding toward +Inf or -Inf)
We need to handle this exception to comply with IEEE 754.

2. SPEFSCR[FINXE] = 1
If we enable this, a round exception will occur whenever an inexact
result is generated. However, I don't think we need to do so. With
FINXE=0, if an SPE data exception occurs, the existing handler deals
with it, and that handler is fully IEEE compliant, including rounding.
If no data exception occurs, the HW implements "round to nearest" and
"round toward zero" in an IEEE-compliant way, and "round toward
+Inf/-Inf" is handled by the handler from point 1. So all the
situations are covered, and we don't have to enable FINXE.
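
The case analysis above can be written down as a small decision function. This is only a sketch under stated assumptions: the 0b10/0b11 encodings come from this thread, the rounding-mode field is assumed to sit in the low two bits of SPEFSCR, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRMC_MASK	0x3u	/* assumed: rounding-mode field, low two bits */
#define FRMC_PINF	0x2u	/* 0b10: round toward +Inf */
#define FRMC_MINF	0x3u	/* 0b11: round toward -Inf */

/* Hypothetical labels for who produces the final IEEE result. */
enum result_source {
	FROM_HARDWARE,		/* nearest or toward zero, no data exception */
	FROM_DATA_HANDLER,	/* existing SPE data exception handler */
	FROM_ROUND_HANDLER	/* round exception handler from point 1 */
};

enum result_source spe_result_source(uint32_t spefscr, bool data_exception)
{
	uint32_t mode = spefscr & FRMC_MASK;

	if (data_exception)
		return FROM_DATA_HANDLER;
	if (mode == FRMC_PINF || mode == FRMC_MINF)
		return FROM_ROUND_HANDLER;
	return FROM_HARDWARE;
}

/* With FINXE left clear, every combination still has an owner. */
void spe_result_source_example(void)
{
	assert(spe_result_source(0x0u, false) == FROM_HARDWARE);
	assert(spe_result_source(0x2u, false) == FROM_ROUND_HANDLER);
	assert(spe_result_source(0x3u, true) == FROM_DATA_HANDLER);
}
```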

Could you make some comments on this? Thanks!

B.R.
Ebony


* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-31  9:45                       ` Zhu Ebony-r57400
@ 2007-01-31 14:48                         ` Kumar Gala
  2007-02-01  9:35                           ` Zhu Ebony-r57400
  0 siblings, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-01-31 14:48 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus

>> On Jan 29, 2007, at 4:00 AM, Zhu Ebony-r57400 wrote:
>>
>>> Hi Kumar,
>>>
>>> I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11
>> to enable
>>> FP round interrupt will cause the exception occurring very often,
>>> which will dramatically decrease the performance of SPE
>> instructions.
>>> Do you think putting an option in menuconfig to let user choose
>>> whether to enable FP round simulation is a reasonable idea?
>>
>> I don't see any issue with it, but I have to believe if you
>> want full IEEE results, you want full IEEE results for everything.
>>
>> - k
>>
>
> Agreed, we need to fully comply with IEEE754. So let's talk something
> about the handler.
>
> The round exceptions can be put into 2 categories:
>
> 1. SPEFSCR[FRMC] = 0b10 or 0b11 (rounding toward +Inf and -Inf)
> We need to handle this exception to comply with IEEE
>
> 2. SPEFSCR[FINXE] = 1
> If we enable this, round exception will occurs when inaccurate results
> are
> generated. However, I think we don't need to do so. With FINXE=0,  
> if SPE
> data
> exception occurs, we can handle the exception by existing handler,  
> which
> is fully IEEE complied, including rounding. If no data exception  
> occurs,
> HW
> can implement "round to nearest" and "round toward zero" with IEEE
> complied,
> and "round toward +Inf/-Inf" can be handled by the handler of point 1.
> So all
> the situations are covered, we don't have to enable FINXE.
>
> Could you make some comments on this? Thanks!

While I agree with most of what you're saying, there is one issue: the
user may want an exception reported on inexact results when the
rounding mode is set to "round to nearest" or "round towards zero".
Of course, we know when the user requests this and can enable/disable
this exception at that point if we want to.
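
A minimal sketch of that last point, toggling the inexact enable only when the user asks for it. The bit value used for FINXE is a placeholder that should be checked against the core reference manual, the function name is hypothetical, and how the user's request reaches the kernel is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder position for the inexact-enable bit; verify before use. */
#define SPEFSCR_FINXE_ASSUMED	0x00000040u

/* Return the SPEFSCR image with the inexact trap enabled or disabled. */
uint32_t spefscr_set_inexact_trap(uint32_t spefscr, bool user_wants_trap)
{
	if (user_wants_trap)
		return spefscr | SPEFSCR_FINXE_ASSUMED;
	return spefscr & ~SPEFSCR_FINXE_ASSUMED;
}

void spefscr_set_inexact_trap_example(void)
{
	/* The rounding-mode bits are left untouched either way. */
	assert(spefscr_set_inexact_trap(0x1u, true) == 0x41u);
	assert(spefscr_set_inexact_trap(0x41u, false) == 0x1u);
}
```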

On a side note, I'm wondering if you've come across this test suite:
http://www.jhauser.us/arithmetic/TestFloat.html

- k


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-31 14:48                         ` Kumar Gala
@ 2007-02-01  9:35                           ` Zhu Ebony-r57400
  0 siblings, 0 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-01  9:35 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Wednesday, January 31, 2007 10:49 PM
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
> >> On Jan 29, 2007, at 4:00 AM, Zhu Ebony-r57400 wrote:
> >>
> >>> Hi Kumar,
> >>>
> >>> I think enabling SPEFSCR[FINXE] or SPEFSCR[FRMC]=0b10/0b11 to enable
> >>> FP round interrupt will cause the exception occurring very often,
> >>> which will dramatically decrease the performance of SPE instructions.
> >>> Do you think putting an option in menuconfig to let user choose
> >>> whether to enable FP round simulation is a reasonable idea?
> >>
> >> I don't see any issue with it, but I have to believe if you want full
> >> IEEE results, you want full IEEE results for everything.
> >>
> >> - k
> >>
> >
> > Agreed, we need to fully comply with IEEE754. So let's talk something
> > about the handler.
> >
> > The round exceptions can be put into 2 categories:
> >
> > 1. SPEFSCR[FRMC] = 0b10 or 0b11 (rounding toward +Inf and -Inf)
> > We need to handle this exception to comply with IEEE
> >
> > 2. SPEFSCR[FINXE] = 1
> > If we enable this, round exception will occurs when inaccurate results
> > are generated. However, I think we don't need to do so. With FINXE=0,
> > if SPE data exception occurs, we can handle the exception by existing
> > handler, which is fully IEEE complied, including rounding. If no data
> > exception occurs, HW can implement "round to nearest" and "round
> > toward zero" with IEEE complied, and "round toward +Inf/-Inf" can be
> > handled by the handler of point 1.
> > So all the situations are covered, we don't have to enable FINXE.
> >
> > Could you make some comments on this? Thanks!
>
> While I agree with most of what you're saying there is one
> issue.  If the user want's an exception reported on inexact
> results when the rounding mode is set to "round to nearest" or
> "round towards zero".  Of course we know when the user requests
> this and can enable/disable this exception at that point if we want to.
>
> On a side node, wondering if you've come across this test suite:
> http://www.jhauser.us/arithmetic/TestFloat.html
>
> - k

Thank you for your comments and the useful link. It looks quite good for
testing the handler.

B.R.
Ebony

* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-01-15 14:37         ` Kumar Gala
  2007-01-16  9:54           ` Zhu Ebony-r57400
  2007-01-25  8:25           ` Zhu Ebony-r57400
@ 2007-02-07  5:52           ` Zhu Ebony-r57400
  2007-02-07  7:11             ` Kumar Gala
  2 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-07  5:52 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> >> This snippet of code breaks it from math-emu/sfp-machine.h
> >>
> >>>> +#ifdef CONFIG_SPE
> >>>> +#define __FPU_FPSCR	(current->thread.spefscr)
> >>>> +#else
> >>>>  #define __FPU_FPSCR	(current->thread.fpscr.val)
> >>>> +#endif
> >>
> >> By doing this if I want 'classic FP' emulation as well as the IEEE
> >> fixup my fpscr for classic emu will not be updated properly.
> >
> > Logically, user can choose "SPE Support" and "Math emulation" at the
> > same time on menuconfig. But from my understanding, it is not
> > necessary to select math-emu on a SPE available system, since SPE can
> > do math operation.
>
> This is not true.  If I want to run a "classic" PPC binary
> with FP I need "Math emulation" and if I want to run an SPE
> one I enable "SPE Support".  I could want to run both of
> these types of binaries on the same system at the same time.

If this is the case, maybe we need a separate macro like
#define __SPE_SPEFSCR	(current->thread.spefscr)
But if we do this, how does the kernel know whether the emulation is for
a "classic" PPC binary with FP or an SPE one, and therefore which
register (fpscr or spefscr) to update?

B.R.
Ebony


* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-02-07  5:52           ` Zhu Ebony-r57400
@ 2007-02-07  7:11             ` Kumar Gala
  2007-02-07  7:21               ` Zhu Ebony-r57400
  0 siblings, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-02-07  7:11 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Feb 6, 2007, at 11:52 PM, Zhu Ebony-r57400 wrote:

>>>> This snippet of code breaks it from math-emu/sfp-machine.h
>>>>
>>>>>> +#ifdef CONFIG_SPE
>>>>>> +#define __FPU_FPSCR	(current->thread.spefscr)
>>>>>> +#else
>>>>>>  #define __FPU_FPSCR	(current->thread.fpscr.val)
>>>>>> +#endif
>>>>
>>>> By doing this if I want 'classic FP' emulation as well as the IEEE
>>>> fixup my fpscr for classic emu will not be updated properly.
>>>
>>> Logically, user can choose "SPE Support" and "Math
>> emulation" at the
>>> same time on menuconfig. But from my understanding, it is not
>>> necessary to select math-emu on a SPE available system,
>> since SPE can
>>> do math operation.
>>
>> This is not true.  If I want to run a "classic" PPC binary
>> with FP I need "Math emulation" and if I want to run an SPE
>> one I enable "SPE Support".  I could want to run both of
>> these types of binaries on the same system at the same time.
>
> If this is the case, maybe we need a separate macro like
> #define __SPE_SPEFSCR	(current->thread.spefscr)
> But if we do this, how does the kernel know if the emulation is for
> "classic" PPC binary with FP or an SPE one, thus corresponding
> registers(fpscr or spefscr) being updated?

It's based on what instruction you are trying to emulate.
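
One way to read this is that the primary opcode of the faulting instruction identifies its family, and the family selects the status register. A minimal sketch, assuming the usual Power ISA layout in which the primary opcode is the top six bits of the instruction word, classic FP arithmetic uses primary opcodes 59 and 63, and SPE instructions use primary opcode 4. FP loads and stores are ignored here, and the struct and function names are hypothetical stand-ins for the kernel's thread state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the per-thread FP state; not the kernel's thread_struct. */
struct mock_thread {
	uint32_t fpscr;
	uint32_t spefscr;
};

enum insn_family { INSN_OTHER, INSN_CLASSIC_FP, INSN_SPE };

enum insn_family classify_insn(uint32_t insn)
{
	switch (insn >> 26) {	/* primary opcode: top six bits */
	case 4:
		return INSN_SPE;
	case 59:
	case 63:
		return INSN_CLASSIC_FP;
	default:
		return INSN_OTHER;
	}
}

/* Pick the status register the emulation should read and update. */
uint32_t *status_reg_for(struct mock_thread *t, uint32_t insn)
{
	switch (classify_insn(insn)) {
	case INSN_SPE:
		return &t->spefscr;
	case INSN_CLASSIC_FP:
		return &t->fpscr;
	default:
		return NULL;
	}
}

void status_reg_for_example(void)
{
	struct mock_thread t = { 0, 0 };

	assert(status_reg_for(&t, 4u << 26) == &t.spefscr);
	assert(status_reg_for(&t, 63u << 26) == &t.fpscr);
}
```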

- k


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-02-07  7:11             ` Kumar Gala
@ 2007-02-07  7:21               ` Zhu Ebony-r57400
  2007-02-07  7:57                 ` Kumar Gala
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-07  7:21 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Wednesday, February 07, 2007 3:12 PM
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Feb 6, 2007, at 11:52 PM, Zhu Ebony-r57400 wrote:
>
> >>>> This snippet of code breaks it from math-emu/sfp-machine.h
> >>>>
> >>>>>> +#ifdef CONFIG_SPE
> >>>>>> +#define __FPU_FPSCR	(current->thread.spefscr)
> >>>>>> +#else
> >>>>>>  #define __FPU_FPSCR	(current->thread.fpscr.val)
> >>>>>> +#endif
> >>>>
> >>>> By doing this if I want 'classic FP' emulation as well as the IEEE
> >>>> fixup my fpscr for classic emu will not be updated properly.
> >>>
> >>> Logically, user can choose "SPE Support" and "Math emulation" at the
> >>> same time on menuconfig. But from my understanding, it is not
> >>> necessary to select math-emu on a SPE available system, since SPE can
> >>> do math operation.
> >>
> >> This is not true.  If I want to run a "classic" PPC binary with FP I
> >> need "Math emulation" and if I want to run an SPE one I enable "SPE
> >> Support".  I could want to run both of these types of binaries on the
> >> same system at the same time.
> >
> > If this is the case, maybe we need a separate macro like
> > #define __SPE_SPEFSCR	(current->thread.spefscr)
> > But if we do this, how does the kernel know if the emulation is for
> > "classic" PPC binary with FP or an SPE one, thus corresponding
> > registers(fpscr or spefscr) being updated?
>
> It's based on what instruction you are trying to emulate.
>
For example, FP_ROUNDMODE, which is currently defined in terms of
__FPU_FPSCR, is widely used in the existing code. If the kernel doesn't
know whether the emulation is for classic PPC or for the SPE fixup, it
doesn't know whether to take the rounding mode from fpscr or from
spefscr. This is what confuses me.

B.R.
Ebony


* Re: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-02-07  7:21               ` Zhu Ebony-r57400
@ 2007-02-07  7:57                 ` Kumar Gala
  2007-02-07  8:04                   ` Zhu Ebony-r57400
  2007-02-08  3:50                   ` [patch][0/5] powerpc V2 : " Zhu Ebony-r57400
  0 siblings, 2 replies; 45+ messages in thread
From: Kumar Gala @ 2007-02-07  7:57 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Feb 7, 2007, at 1:21 AM, Zhu Ebony-r57400 wrote:

>
>
>> -----Original Message-----
>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>> Sent: Wednesday, February 07, 2007 3:12 PM
>> To: Zhu Ebony-r57400
>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
>> Subject: Re: [patch][0/5] powerpc: Add support to fully
>> comply with IEEE-754 standard
>>
>>
>> On Feb 6, 2007, at 11:52 PM, Zhu Ebony-r57400 wrote:
>>
>>>>>> This snippet of code breaks it from math-emu/sfp-machine.h
>>>>>>
>>>>>>>> +#ifdef CONFIG_SPE
>>>>>>>> +#define __FPU_FPSCR	(current->thread.spefscr)
>>>>>>>> +#else
>>>>>>>>  #define __FPU_FPSCR	(current->thread.fpscr.val)
>>>>>>>> +#endif
>>>>>>
>>>>>> By doing this if I want 'classic FP' emulation as well
>> as the IEEE
>>>>>> fixup my fpscr for classic emu will not be updated properly.
>>>>>
>>>>> Logically, user can choose "SPE Support" and "Math
>>>> emulation" at the
>>>>> same time on menuconfig. But from my understanding, it is not
>>>>> necessary to select math-emu on a SPE available system,
>>>> since SPE can
>>>>> do math operation.
>>>>
>>>> This is not true.  If I want to run a "classic" PPC binary
>> with FP I
>>>> need "Math emulation" and if I want to run an SPE one I
>> enable "SPE
>>>> Support".  I could want to run both of these types of
>> binaries on the
>>>> same system at the same time.
>>>
>>> If this is the case, maybe we need a separate macro like
>>> #define __SPE_SPEFSCR	(current->thread.spefscr)
>>> But if we do this, how does the kernel know if the emulation is for
>>> "classic" PPC binary with FP or an SPE one, thus corresponding
>>> registers(fpscr or spefscr) being updated?
>>
>> It's based on what instruction you are trying to emulate.
>>
> For example, the FP_ROUNDMODE now defined by __FPU_FPSCR is
> widely used in existing code. If the kernel doesn't know the emulation
> is for
> classic PPC or SPE fixup, then it doesn't know where to get the  
> correct
> rounding mode, from fpscr or spefscr? This has confused me.

Yes, this is a good point, I guess in truth the two modes are  
mutually exclusive.

Sorry for not figuring that out sooner.  (uugh, all the stuff to make  
IEEE emulation work properly on SPE is a pain :)

- k


* RE: [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard
  2007-02-07  7:57                 ` Kumar Gala
@ 2007-02-07  8:04                   ` Zhu Ebony-r57400
  2007-02-08  3:50                   ` [patch][0/5] powerpc V2 : " Zhu Ebony-r57400
  1 sibling, 0 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-07  8:04 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Wednesday, February 07, 2007 3:57 PM
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc: Add support to fully
> comply with IEEE-754 standard
>
>
> On Feb 7, 2007, at 1:21 AM, Zhu Ebony-r57400 wrote:
>
> >
> >
> >> -----Original Message-----
> >> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> >> Sent: Wednesday, February 07, 2007 3:12 PM
> >> To: Zhu Ebony-r57400
> >> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> >> Subject: Re: [patch][0/5] powerpc: Add support to fully comply with
> >> IEEE-754 standard
> >>
> >>
> >> On Feb 6, 2007, at 11:52 PM, Zhu Ebony-r57400 wrote:
> >>
> >>>>>> This snippet of code breaks it from math-emu/sfp-machine.h
> >>>>>>
> >>>>>>>> +#ifdef CONFIG_SPE
> >>>>>>>> +#define __FPU_FPSCR	(current->thread.spefscr)
> >>>>>>>> +#else
> >>>>>>>>  #define __FPU_FPSCR	(current->thread.fpscr.val)
> >>>>>>>> +#endif
> >>>>>>
> >>>>>> By doing this if I want 'classic FP' emulation as well
> >> as the IEEE
> >>>>>> fixup my fpscr for classic emu will not be updated properly.
> >>>>>
> >>>>> Logically, user can choose "SPE Support" and "Math
> >>>> emulation" at the
> >>>>> same time on menuconfig. But from my understanding, it is not
> >>>>> necessary to select math-emu on a SPE available system,
> >>>> since SPE can
> >>>>> do math operation.
> >>>>
> >>>> This is not true.  If I want to run a "classic" PPC binary
> >> with FP I
> >>>> need "Math emulation" and if I want to run an SPE one I
> >> enable "SPE
> >>>> Support".  I could want to run both of these types of
> >> binaries on the
> >>>> same system at the same time.
> >>>
> >>> If this is the case, maybe we need a separate macro like
> >>> #define __SPE_SPEFSCR	(current->thread.spefscr)
> >>> But if we do this, how does the kernel know if the emulation is for
> >>> "classic" PPC binary with FP or an SPE one, thus corresponding
> >>> registers(fpscr or spefscr) being updated?
> >>
> >> It's based on what instruction you are trying to emulate.
> >>
> > For example, the FP_ROUNDMODE now defined by __FPU_FPSCR is widely
> > used in existing code. If the kernel doesn't know the emulation is for
> > classic PPC or SPE fixup, then it doesn't know where to get the
> > correct rounding mode, from fpscr or spefscr? This has confused me.
>
> Yes, this is a good point, I guess in truth the two modes are
> mutually exclusive.
>
> Sorry for not figuring that out sooner.  (uugh, all the stuff
> to make IEEE emulation work properly on SPE is a pain :)

Defining FP_ROUNDMODE as (current->thread.spefscr & 0x3) in the sigfpe
handler may be a feasible way to get the correct rounding mode without
breaking the existing FPU emulation.
At least it works here :) I will submit revised patches soon.
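
The idea can be sketched outside the kernel as follows. This is an illustration only: `mock_current` stands in for `current->thread`, `MOCK_FPU_FPSCR` stands in for `__FPU_FPSCR`, and the shared `& 0x3` definition of FP_ROUNDMODE is an assumption. The point is that a macro redefined in one place affects only the code that follows the redefinition, so the classic path keeps reading fpscr.

```c
#include <assert.h>

/* Mock of the two status registers kept per thread. */
struct mock_thread {
	unsigned long fpscr_val;
	unsigned long spefscr;
};
struct mock_thread mock_current;

/* Shared definition, as the classic emulation code would see it. */
#define MOCK_FPU_FPSCR	(mock_current.fpscr_val)
#define FP_ROUNDMODE	(MOCK_FPU_FPSCR & 0x3)

unsigned long classic_roundmode(void)
{
	return FP_ROUNDMODE;	/* expands to the fpscr-based definition */
}

/* Local override, as it might appear only in the SPE handler's file. */
#undef FP_ROUNDMODE
#define FP_ROUNDMODE	(mock_current.spefscr & 0x3)

unsigned long spe_roundmode(void)
{
	return FP_ROUNDMODE;	/* expands to the spefscr-based definition */
}

void roundmode_example(void)
{
	mock_current.fpscr_val = 0x1;	/* classic side: mode bits 0b01 */
	mock_current.spefscr = 0x2;	/* SPE side: mode bits 0b10 */
	assert(classic_roundmode() == 0x1);
	assert(spe_roundmode() == 0x2);
}
```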

B.R.
Ebony


* [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-07  7:57                 ` Kumar Gala
  2007-02-07  8:04                   ` Zhu Ebony-r57400
@ 2007-02-08  3:50                   ` Zhu Ebony-r57400
  2007-02-08  5:18                     ` Kumar Gala
  1 sibling, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-08  3:50 UTC (permalink / raw)
  To: paulus; +Cc: linuxppc-dev

Hi Paul,

These are the resent patches that add full IEEE-754 compliance for the
E500/E500v2 cores when hardware floating-point compilation is used.
Compared with the last patches I submitted, the following points have
changed:

1. Added a rounding exception handler for the exceptions that occur
when rounding toward +Inf/-Inf.

2. The existing exception entry/return routines are now used, and the
excepting instruction is fetched via regs->nip instead of reading SRR0.

Thank you all for your comments!

B.R.
Ebony


* Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-08  3:50                   ` [patch][0/5] powerpc V2 : " Zhu Ebony-r57400
@ 2007-02-08  5:18                     ` Kumar Gala
  2007-02-08  5:40                       ` Zhu Ebony-r57400
  2007-02-08  7:06                       ` Zhu Ebony-r57400
  0 siblings, 2 replies; 45+ messages in thread
From: Kumar Gala @ 2007-02-08  5:18 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Feb 7, 2007, at 9:50 PM, Zhu Ebony-r57400 wrote:

> Hi Paul,
>
> These are the re-sent patches to add support to fully comply with
> IEEE-754 standard for E500/E500v2 core when hardware floating
> point compiling is used. Comparison with last patches I've submitted,
> the following points was changed:
>
> 1. Add a rounding exception handler, to handle the exceptions that
> would occur when rounding towards +Inf/-Inf
>
> 2. Using the existing exception entering/returning routine, and get  
> the
> exception
> instructions from regs->nip instead of reading from SRR0
>
> Thank you all for the comments you gave to me!

Did you end up getting testfloat running?  I'd like to see some  
testing results before accepting these patches.  I think testfloat is  
our best bet at this point.

- k


* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-08  5:18                     ` Kumar Gala
@ 2007-02-08  5:40                       ` Zhu Ebony-r57400
  2007-02-08  7:06                       ` Zhu Ebony-r57400
  1 sibling, 0 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-08  5:40 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Thursday, February 08, 2007 1:19 PM
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc V2 : Add support to fully
> comply with IEEE-754 standard
>
>
> On Feb 7, 2007, at 9:50 PM, Zhu Ebony-r57400 wrote:
>
> > Hi Paul,
> >
> > These are the re-sent patches to add support to fully comply with
> > IEEE-754 standard for E500/E500v2 core when hardware floating point
> > compiling is used. Comparison with last patches I've submitted, the
> > following points was changed:
> >
> > 1. Add a rounding exception handler, to handle the exceptions that
> > would occur when rounding towards +Inf/-Inf
> >
> > 2. Using the existing exception entering/returning routine, and get
> > the exception instructions from regs->nip instead of reading from SRR0
> >
> > Thank you all for the comments you gave to me!
>
> Did you end up getting testfloat running?  I'd like to see
> some testing results before accepting these patches.  I think
> testfloat is our best bet at this point.
>
> - k

Not yet, since it needs to be ported to the powerpc platform. I will do
it as soon as possible.

B.R.
Ebony


* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-08  5:18                     ` Kumar Gala
  2007-02-08  5:40                       ` Zhu Ebony-r57400
@ 2007-02-08  7:06                       ` Zhu Ebony-r57400
  2007-02-08  7:15                         ` Kumar Gala
  1 sibling, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-08  7:06 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Thursday, February 08, 2007 1:19 PM
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc V2 : Add support to fully
> comply with IEEE-754 standard
>
>
> On Feb 7, 2007, at 9:50 PM, Zhu Ebony-r57400 wrote:
>
> > Hi Paul,
> >
> > These are the re-sent patches to add support to fully comply with
> > IEEE-754 standard for E500/E500v2 core when hardware floating point
> > compiling is used. Comparison with last patches I've submitted, the
> > following points was changed:
> >
> > 1. Add a rounding exception handler, to handle the exceptions that
> > would occur when rounding towards +Inf/-Inf
> >
> > 2. Using the existing exception entering/returning routine, and get
> > the exception instructions from regs->nip instead of reading from SRR0
> >
> > Thank you all for the comments you gave to me!
>
> Did you end up getting testfloat running?  I'd like to see
> some testing results before accepting these patches.  I think
> testfloat is our best bet at this point.
>
Hi Kumar,

I looked into the TestFloat suite and found that all the instructions it
tests (more than 50) would have to be implemented in assembly. SoftFloat,
which TestFloat compares against, would also have to be ported to the
powerpc platform. I think this work needs some time to finish. So could
you review my patches and give some comments first? Thank you.

B.R.
Ebony


* Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-08  7:06                       ` Zhu Ebony-r57400
@ 2007-02-08  7:15                         ` Kumar Gala
  2007-02-08  8:08                           ` Zhu Ebony-r57400
  0 siblings, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-02-08  7:15 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Feb 8, 2007, at 1:06 AM, Zhu Ebony-r57400 wrote:

>
>
>> -----Original Message-----
>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>> Sent: Thursday, February 08, 2007 1:19 PM
>> To: Zhu Ebony-r57400
>> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
>> Subject: Re: [patch][0/5] powerpc V2 : Add support to fully
>> comply with IEEE-754 standard
>>
>>
>> On Feb 7, 2007, at 9:50 PM, Zhu Ebony-r57400 wrote:
>>
>>> Hi Paul,
>>>
>>> These are the re-sent patches to add support to fully comply with
>>> IEEE-754 standard for E500/E500v2 core when hardware floating point
>>> compiling is used. Comparison with last patches I've submitted, the
>>> following points was changed:
>>>
>>> 1. Add a rounding exception handler, to handle the exceptions that
>>> would occur when rounding towards +Inf/-Inf
>>>
>>> 2. Using the existing exception entering/returning routine, and get
>>> the exception instructions from regs->nip instead of
>> reading from SRR0
>>>
>>> Thank you all for the comments you gave to me!
>>
>> Did you end up getting testfloat running?  I'd like to see
>> some testing results before accepting these patches.  I think
>> testfloat is our best bet at this point.
>>
> Hi Kumar,
>
> I looked into the testfloat suit, and found all the instructions it
> tests (more than 50)should be
> implemented based on ASM.

Don't follow?  Can't you build it with the e500 compiler?

> And also the SoftFloat test suite, which the
> Testfloat is comparing against, should be ported to powerpc  
> platform. I
> think these work needs some time do finish. So could you review my
> patches and
> give some comments first? Thank you.

Will do.  Just be aware we need to get testfloat running before this  
will make it in mainline.

- k


* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-08  7:15                         ` Kumar Gala
@ 2007-02-08  8:08                           ` Zhu Ebony-r57400
  2007-02-08 17:18                             ` Kumar Gala
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-08  8:08 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> >>
> >> Did you end up getting testfloat running?  I'd like to see some
> >> testing results before accepting these patches.  I think testfloat is
> >> our best bet at this point.
> >>
> > Hi Kumar,
> >
> > I looked into the testfloat suit, and found all the instructions it
> > tests (more than 50)should be implemented based on ASM.
>
> Don't follow?  Can't you build it with the e500 compiler?
>
The TestFloat suite only provides targets for 386-Win32-gcc and
SPARC-Solaris-gcc, plus a template for porting it to other
processors. Some general operations are implemented in C,
but CPU-specific instructions such as evfsmul need to be
implemented in assembly. To build it with the e500 compiler we still
have some porting work to do.

B.R.
Ebony


* Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-08  8:08                           ` Zhu Ebony-r57400
@ 2007-02-08 17:18                             ` Kumar Gala
  2007-02-09  5:15                               ` Zhu Ebony-r57400
  0 siblings, 1 reply; 45+ messages in thread
From: Kumar Gala @ 2007-02-08 17:18 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus


On Feb 8, 2007, at 2:08 AM, Zhu Ebony-r57400 wrote:

>>>>
>>>> Did you end up getting testfloat running?  I'd like to see some
>>>> testing results before accepting these patches.  I think testfloat is
>>>> our best bet at this point.
>>>>
>>> Hi Kumar,
>>>
>>> I looked into the testfloat suite, and found all the instructions it
>>> tests (more than 50) should be implemented in ASM.
>>
>> Don't follow?  Can't you build it with the e500 compiler?
>>
> The TestFloat suite provides targets only for 386-Win32-gcc and
> SPARC-Solaris-gcc, plus a template for users to port it to their
> own processor. Some general instructions are implemented in C,
> but some CPU-specific instructions like evfsmul need to be
> implemented in assembly. To build it with the e500 compiler we still
> have some porting work to do.

I wouldn't worry too much about the vector forms.  If the scalar
single fp and double fp test out ok, the vectors are similar enough.

Let's just get the scalar versions tested and work out any issues there.

- k

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-08 17:18                             ` Kumar Gala
@ 2007-02-09  5:15                               ` Zhu Ebony-r57400
  2007-07-30 14:56                                 ` Sergei Shtylyov
  0 siblings, 1 reply; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-02-09  5:15 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev, paulus

> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Friday, February 09, 2007 1:19 AM
> To: Zhu Ebony-r57400
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org
> Subject: Re: [patch][0/5] powerpc V2 : Add support to fully
> comply with IEEE-754 standard
>
>
> On Feb 8, 2007, at 2:08 AM, Zhu Ebony-r57400 wrote:
>
> >>>>
> >>>> Did you end up getting testfloat running?  I'd like to see some
> >>>> testing results before accepting these patches.  I think
> >>>> testfloat is our best bet at this point.
> >>>>
> >>> Hi Kumar,
> >>>
> >>> I looked into the testfloat suite, and found all the instructions
> >>> it tests (more than 50) should be implemented in ASM.
> >>
> >> Don't follow?  Can't you build it with the e500 compiler?
> >>
> > The TestFloat suite provides targets only for 386-Win32-gcc and
> > SPARC-Solaris-gcc, plus a template for users to port it to their own
> > processor. Some general instructions are implemented in C, but some
> > CPU-specific instructions like evfsmul need to be implemented in
> > assembly. To build it with the e500 compiler we still have some
> > porting work to do.
>
> I wouldn't worry too much about the vector forms.  If the
> scalar single fp and double fp test out ok, the vectors are
> similar enough.
>
> Let's just get the scalar versions tested and work out any
> issues there.
>
> - k

OK, I will focus on the scalar SPFP and DPFP versions first.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-02-09  5:15                               ` Zhu Ebony-r57400
@ 2007-07-30 14:56                                 ` Sergei Shtylyov
  2007-07-31  3:36                                   ` Zhu Ebony-r57400
  0 siblings, 1 reply; 45+ messages in thread
From: Sergei Shtylyov @ 2007-07-30 14:56 UTC (permalink / raw)
  To: Zhu Ebony-r57400; +Cc: linuxppc-dev, paulus

Hello.

Zhu Ebony-r57400 wrote:

>>>>>>Did you end up getting testfloat running?  I'd like to see some 
>>>>>>testing results before accepting these patches.  I think testfloat is
>>>>>>our best bet at this point.

>>>>>Hi Kumar,

>>>>>I looked into the testfloat suite, and found all the instructions it
>>>>>tests (more than 50) should be implemented in ASM.

>>>>Don't follow?  Can't you build it with the e500 compiler?

>>>The TestFloat suite provides targets only for 386-Win32-gcc and
>>>SPARC-Solaris-gcc, plus a template for users to port it to their own
>>>processor. Some general instructions are implemented in C, but some
>>>CPU-specific instructions like evfsmul need to be implemented in
>>>assembly. To build it with the e500 compiler we still have some
>>>porting work to do.

>>I wouldn't worry too much about the vector forms.  If the
>>scalar single fp and double fp test out ok, the vectors are
>>similar enough.

>>Let's just get the scalar versions tested and work out any
>>issues there.

> OK, I will focus on the scalar SPFP and DPFP versions first.

    Any progress with this patchset?

WBR, Sergei

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [patch][0/5] powerpc V2 : Add support to fully comply with IEEE-754 standard
  2007-07-30 14:56                                 ` Sergei Shtylyov
@ 2007-07-31  3:36                                   ` Zhu Ebony-r57400
  0 siblings, 0 replies; 45+ messages in thread
From: Zhu Ebony-r57400 @ 2007-07-31  3:36 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: linuxppc-dev, paulus

Hi Sergei,

I did some further tests and some development work over the past several
months, but due to the project schedule and limited bandwidth, no patches
can be submitted to the list yet. Anyway, I will keep you and the
community updated once there is some progress.

Thanks.

B.R.
Ebony

> -----Original Message-----
> From: Sergei Shtylyov [mailto:sshtylyov@ru.mvista.com]
> Sent: Monday, July 30, 2007 10:57 PM
> To: Zhu Ebony-r57400
> Cc: Kumar Gala; linuxppc-dev@ozlabs.org; paulus@samba.org
> Subject: Re: [patch][0/5] powerpc V2 : Add support to fully
> comply with IEEE-754 standard
>
> Hello.
>
> Zhu Ebony-r57400 wrote:
>
> >>>>>>Did you end up getting testfloat running?  I'd like to see some
> >>>>>>testing results before accepting these patches.  I think
> >>>>>>testfloat is our best bet at this point.
>
> >>>>>Hi Kumar,
>
> >>>>>I looked into the testfloat suite, and found all the instructions
> >>>>>it tests (more than 50) should be implemented in ASM.
>
> >>>>Don't follow?  Can't you build it with the e500 compiler?
>
> >>>The TestFloat suite provides targets only for 386-Win32-gcc and
> >>>SPARC-Solaris-gcc, plus a template for users to port it to their own
> >>>processor. Some general instructions are implemented in C, but some
> >>>CPU-specific instructions like evfsmul need to be implemented in
> >>>assembly. To build it with the e500 compiler we still have some
> >>>porting work to do.
>
> >>I wouldn't worry too much about the vector forms.  If the scalar
> >>single fp and double fp test out ok, the vectors are similar
> >>enough.
>
> >>Let's just get the scalar versions tested and work out any issues
> >>there.
>
> > OK, I will focus on the scalar SPFP and DPFP versions first.
>
>     Any progress with this patchset?
>
> WBR, Sergei
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2007-07-31  3:36 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-01-12  5:19 [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Zhu Ebony-r57400
2007-01-12  5:29 ` Paul Mackerras
2007-01-12  5:46   ` Kumar Gala
2007-01-12  8:27     ` Zhu Ebony-r57400
2007-01-12 12:06       ` Segher Boessenkool
2007-01-15  8:41         ` Zhu Ebony-r57400
2007-01-12  6:38   ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754standard Zhu Ebony-r57400
2007-01-12  6:49     ` Kumar Gala
2007-01-12 12:03     ` Segher Boessenkool
2007-01-15  8:16       ` Zhu Ebony-r57400
2007-01-15 16:08         ` Segher Boessenkool
2007-01-12  6:41 ` [patch][0/5] powerpc: Add support to fully comply with IEEE-754 standard Kumar Gala
2007-01-12  8:09   ` Zhu Ebony-r57400
2007-01-12 12:04     ` Segher Boessenkool
2007-01-15  6:45       ` Zhu Ebony-r57400
2007-01-15 15:54         ` Segher Boessenkool
2007-01-12 18:36     ` Kumar Gala
2007-01-15  6:37       ` Zhu Ebony-r57400
2007-01-15 14:37         ` Kumar Gala
2007-01-16  9:54           ` Zhu Ebony-r57400
2007-01-25  8:25           ` Zhu Ebony-r57400
2007-01-25  8:28             ` Kumar Gala
2007-01-25  8:53               ` Zhu Ebony-r57400
2007-01-25 15:10                 ` Kumar Gala
2007-01-26  6:16                   ` Zhu Ebony-r57400
2007-01-29 10:00                   ` Zhu Ebony-r57400
2007-01-29 14:30                     ` Kumar Gala
2007-01-31  9:45                       ` Zhu Ebony-r57400
2007-01-31 14:48                         ` Kumar Gala
2007-02-01  9:35                           ` Zhu Ebony-r57400
2007-02-07  5:52           ` Zhu Ebony-r57400
2007-02-07  7:11             ` Kumar Gala
2007-02-07  7:21               ` Zhu Ebony-r57400
2007-02-07  7:57                 ` Kumar Gala
2007-02-07  8:04                   ` Zhu Ebony-r57400
2007-02-08  3:50                   ` [patch][0/5] powerpc V2 : " Zhu Ebony-r57400
2007-02-08  5:18                     ` Kumar Gala
2007-02-08  5:40                       ` Zhu Ebony-r57400
2007-02-08  7:06                       ` Zhu Ebony-r57400
2007-02-08  7:15                         ` Kumar Gala
2007-02-08  8:08                           ` Zhu Ebony-r57400
2007-02-08 17:18                             ` Kumar Gala
2007-02-09  5:15                               ` Zhu Ebony-r57400
2007-07-30 14:56                                 ` Sergei Shtylyov
2007-07-31  3:36                                   ` Zhu Ebony-r57400
