qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] ppc: Three floating point fixes
@ 2019-08-16 19:27 Paul A. Clarke
  2019-08-16 22:59 ` Aleksandar Markovic
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Paul A. Clarke @ 2019-08-16 19:27 UTC (permalink / raw)
  To: david; +Cc: qemu-ppc, qemu-devel

From: "Paul A. Clarke" <pc@us.ibm.com>

- target/ppc/fpu_helper.c:
  - helper_todouble() was not properly converting INFINITY from 32 bit
  float to 64 bit double.
  - helper_todouble() was not properly converting any denormalized
  32 bit float to 64 bit double.

- GCC, as of version 8 or so, takes advantage of the hardware's
  implementation of the xscvdpspn instruction to optimize the following
  sequence:
    xscvdpspn vs0,vs1
    mffprwz   r8,f0
  ISA 3.0B has xscvdpspn leaving its result in word 1 of the target register,
  and mffprwz expecting its input to come from word 0 of the source register.
  This sequence fails with QEMU, as a shift is required between those two
  instructions.  However, the hardware splats the result to both word 0 and
  word 1 of its output register, so the shift is not necessary.
  Expect a future revision of the ISA to specify this behavior.

Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
---
 target/ppc/fpu_helper.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 5611cf0..82b5425 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -62,13 +62,14 @@ uint64_t helper_todouble(uint32_t arg)
         ret  = (uint64_t)extract32(arg, 30, 2) << 62;
         ret |= ((extract32(arg, 30, 1) ^ 1) * (uint64_t)7) << 59;
         ret |= (uint64_t)extract32(arg, 0, 30) << 29;
+        ret |= (0x7ffULL * (extract32(arg, 23, 8) == 0xff)) << 52;
     } else {
         /* Zero or Denormalized operand.  */
         ret = (uint64_t)extract32(arg, 31, 1) << 63;
         if (unlikely(abs_arg != 0)) {
             /* Denormalized operand.  */
             int shift = clz32(abs_arg) - 9;
-            int exp = -126 - shift + 1023;
+            int exp = -127 - shift + 1023;
             ret |= (uint64_t)exp << 52;
             ret |= abs_arg << (shift + 29);
         }
@@ -2871,10 +2872,14 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode,
 
 uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
 {
+    uint64_t result;
+
     float_status tstat = env->fp_status;
     set_float_exception_flags(0, &tstat);
 
-    return (uint64_t)float64_to_float32(xb, &tstat) << 32;
+    result = (uint64_t)float64_to_float32(xb, &tstat);
+    /* hardware replicates result to both words of the doubleword result.  */
+    return (result << 32) | result;
 }
 
 uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc: Three floating point fixes
  2019-08-16 19:27 [Qemu-devel] [PATCH] ppc: Three floating point fixes Paul A. Clarke
@ 2019-08-16 22:59 ` Aleksandar Markovic
  2019-08-17  7:53   ` Aleksandar Markovic
  2019-08-18  8:10   ` Richard Henderson
  2019-08-18  8:00 ` [Qemu-devel] " Richard Henderson
  2019-08-19  6:30 ` David Gibson
  2 siblings, 2 replies; 13+ messages in thread
From: Aleksandar Markovic @ 2019-08-16 22:59 UTC (permalink / raw)
  To: Paul A. Clarke; +Cc: qemu-ppc, qemu-devel, david

16.08.2019. 21.28, "Paul A. Clarke" <pc@us.ibm.com> је написао/ла:
>
> From: "Paul A. Clarke" <pc@us.ibm.com>
>
> - target/ppc/fpu_helper.c:
>   - helper_todouble() was not properly converting INFINITY from 32 bit
>   float to 64 bit double.
>   - helper_todouble() was not properly converting any denormalized
>   32 bit float to 64 bit double.
>
> - GCC, as of version 8 or so, takes advantage of the hardware's
>   implementation of the xscvdpspn instruction to optimize the following
>   sequence:
>     xscvdpspn vs0,vs1
>     mffprwz   r8,f0
>   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target
register,
>   and mffprwz expecting its input to come from word 0 of the source
register.
>   This sequence fails with QEMU, as a shift is required between those two
>   instructions.  However, the hardware splats the result to both word 0
and
>   word 1 of its output register, so the shift is not necessary.
>   Expect a future revision of the ISA to specify this behavior.
>

Hmmm... Isn't this a gcc bug (using undocumented hardware feature), given
everything you said here?

Sincerely,
Aleksandar

> Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
> ---
>  target/ppc/fpu_helper.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 5611cf0..82b5425 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -62,13 +62,14 @@ uint64_t helper_todouble(uint32_t arg)
>          ret  = (uint64_t)extract32(arg, 30, 2) << 62;
>          ret |= ((extract32(arg, 30, 1) ^ 1) * (uint64_t)7) << 59;
>          ret |= (uint64_t)extract32(arg, 0, 30) << 29;
> +        ret |= (0x7ffULL * (extract32(arg, 23, 8) == 0xff)) << 52;
>      } else {
>          /* Zero or Denormalized operand.  */
>          ret = (uint64_t)extract32(arg, 31, 1) << 63;
>          if (unlikely(abs_arg != 0)) {
>              /* Denormalized operand.  */
>              int shift = clz32(abs_arg) - 9;
> -            int exp = -126 - shift + 1023;
> +            int exp = -127 - shift + 1023;
>              ret |= (uint64_t)exp << 52;
>              ret |= abs_arg << (shift + 29);
>          }
> @@ -2871,10 +2872,14 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t
opcode,
>
>  uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
>  {
> +    uint64_t result;
> +
>      float_status tstat = env->fp_status;
>      set_float_exception_flags(0, &tstat);
>
> -    return (uint64_t)float64_to_float32(xb, &tstat) << 32;
> +    result = (uint64_t)float64_to_float32(xb, &tstat);
> +    /* hardware replicates result to both words of the doubleword
result.  */
> +    return (result << 32) | result;
>  }
>
>  uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)
> --
> 1.8.3.1
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc: Three floating point fixes
  2019-08-16 22:59 ` Aleksandar Markovic
@ 2019-08-17  7:53   ` Aleksandar Markovic
  2019-08-18  8:10   ` Richard Henderson
  1 sibling, 0 replies; 13+ messages in thread
From: Aleksandar Markovic @ 2019-08-17  7:53 UTC (permalink / raw)
  To: Paul A. Clarke; +Cc: qemu-ppc, qemu-devel, david

17.08.2019. 00.59, "Aleksandar Markovic" <aleksandar.m.mail@gmail.com> је
написао/ла:
>
>
> 16.08.2019. 21.28, "Paul A. Clarke" <pc@us.ibm.com> је написао/ла:
> >
> > From: "Paul A. Clarke" <pc@us.ibm.com>
> >
> > - target/ppc/fpu_helper.c:
> >   - helper_todouble() was not properly converting INFINITY from 32 bit
> >   float to 64 bit double.
> >   - helper_todouble() was not properly converting any denormalized
> >   32 bit float to 64 bit double.
> >
> > - GCC, as of version 8 or so, takes advantage of the hardware's
> >   implementation of the xscvdpspn instruction to optimize the following
> >   sequence:
> >     xscvdpspn vs0,vs1
> >     mffprwz   r8,f0
> >   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target
register,
> >   and mffprwz expecting its input to come from word 0 of the source
register.
> >   This sequence fails with QEMU, as a shift is required between those
two
> >   instructions.  However, the hardware splats the result to both word 0
and
> >   word 1 of its output register, so the shift is not necessary.
> >   Expect a future revision of the ISA to specify this behavior.
> >
>
> Hmmm... Isn't this a gcc bug (using undocumented hardware feature), given
everything you said here?
>

Paul, you are touching here a very sensitive area. If I were you, I would
most likely propose a similar patch, given thr info you presented. However,
in general, QEMU is a compiler-agnostic tool and should not depend on
compiler features, let alone fix its bugs, and therefore I think you should
more clearly spell out in your commit message that the code segment in
question is a workaround for an outright GCC bug, and just a kind of
"necessary evil" in given situation.

Let us also wait for David possibly coming up with the final verdict wrt
this issue.

Also, this patch should be split into two or three (infinity conversions
have nothing to do with xscvdpspn) - a patch deals with only one logical
unit.

Yours,
Aleksandar

> Sincerely,
> Aleksandar
>
> > Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
> > ---
> >  target/ppc/fpu_helper.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> > index 5611cf0..82b5425 100644
> > --- a/target/ppc/fpu_helper.c
> > +++ b/target/ppc/fpu_helper.c
> > @@ -62,13 +62,14 @@ uint64_t helper_todouble(uint32_t arg)
> >          ret  = (uint64_t)extract32(arg, 30, 2) << 62;
> >          ret |= ((extract32(arg, 30, 1) ^ 1) * (uint64_t)7) << 59;
> >          ret |= (uint64_t)extract32(arg, 0, 30) << 29;
> > +        ret |= (0x7ffULL * (extract32(arg, 23, 8) == 0xff)) << 52;
> >      } else {
> >          /* Zero or Denormalized operand.  */
> >          ret = (uint64_t)extract32(arg, 31, 1) << 63;
> >          if (unlikely(abs_arg != 0)) {
> >              /* Denormalized operand.  */
> >              int shift = clz32(abs_arg) - 9;
> > -            int exp = -126 - shift + 1023;
> > +            int exp = -127 - shift + 1023;
> >              ret |= (uint64_t)exp << 52;
> >              ret |= abs_arg << (shift + 29);
> >          }
> > @@ -2871,10 +2872,14 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t
opcode,
> >
> >  uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
> >  {
> > +    uint64_t result;
> > +
> >      float_status tstat = env->fp_status;
> >      set_float_exception_flags(0, &tstat);
> >
> > -    return (uint64_t)float64_to_float32(xb, &tstat) << 32;
> > +    result = (uint64_t)float64_to_float32(xb, &tstat);
> > +    /* hardware replicates result to both words of the doubleword
result.  */
> > +    return (result << 32) | result;
> >  }
> >
> >  uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)
> > --
> > 1.8.3.1
> >
> >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc: Three floating point fixes
  2019-08-16 19:27 [Qemu-devel] [PATCH] ppc: Three floating point fixes Paul A. Clarke
  2019-08-16 22:59 ` Aleksandar Markovic
@ 2019-08-18  8:00 ` Richard Henderson
  2019-08-19  6:30 ` David Gibson
  2 siblings, 0 replies; 13+ messages in thread
From: Richard Henderson @ 2019-08-18  8:00 UTC (permalink / raw)
  To: Paul A. Clarke, david; +Cc: qemu-ppc, qemu-devel

On 8/16/19 8:27 PM, Paul A. Clarke wrote:
> From: "Paul A. Clarke" <pc@us.ibm.com>
> 
> - target/ppc/fpu_helper.c:
>   - helper_todouble() was not properly converting INFINITY from 32 bit
>   float to 64 bit double.
>   - helper_todouble() was not properly converting any denormalized
>   32 bit float to 64 bit double.

These two would seem to be my fault:
Fixes: 86c0cab11aa (target/ppc: Use non-arithmetic conversions for fp load/store)

> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 5611cf0..82b5425 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -62,13 +62,14 @@ uint64_t helper_todouble(uint32_t arg)
>          ret  = (uint64_t)extract32(arg, 30, 2) << 62;
>          ret |= ((extract32(arg, 30, 1) ^ 1) * (uint64_t)7) << 59;
>          ret |= (uint64_t)extract32(arg, 0, 30) << 29;
> +        ret |= (0x7ffULL * (extract32(arg, 23, 8) == 0xff)) << 52;

Since the overwrites bits set two lines above,
I think this would be better as

    if (likely(abs_arg >= 0x00800000)) {
        /* Normalized operand, or Inf or NaN.  */
        if (unlikely(extract32(arg, 23, 8) == 0xff)) {
            /* Inf or NaN.  */
            ...
        } else {
            /* Normalized operand.  */


>      } else {
>          /* Zero or Denormalized operand.  */
>          ret = (uint64_t)extract32(arg, 31, 1) << 63;
>          if (unlikely(abs_arg != 0)) {
>              /* Denormalized operand.  */
>              int shift = clz32(abs_arg) - 9;
> -            int exp = -126 - shift + 1023;
> +            int exp = -127 - shift + 1023;
>              ret |= (uint64_t)exp << 52;
>              ret |= abs_arg << (shift + 29);

Hmm.  The manual says -126, but it also shifts the fraction by a different
amount, such that the implicit bit is 1.

What I also don't see here is where the most significant bit is removed from
the fraction, a-la "FRT[5:63] = frac[1:52]" in the manual.

The soft-float code from which this was probably copied did this by shifting
the fraction such that the msb overlaps the exponent, biasing the exponent by
-1, and then using an add instead of an or to simultaneously remove the bias,
swallow the msb, and include the lower bits unchanged.

So it looks like this should be

    /*
     * Shift fraction so that the msb is in the implicit bit position.
     * Thus shift is in the range [1:23].
     */
    int shift = clz32(abs_arg) - 8;
    /*
     * The first 3 terms compute the float64 exponent.  We then bias
     * this result by -1 so that we can swallow the implicit bit below.
     */
    int exp = -126 - shift + 1023 - 1;
    ret |= (uint64_t)exp << 52;
    ret += (uint64_t)abs_arg << (52 - 23 + shift);

Hmm.  I see another bug in the existing code whereby abs_arg is not cast before
shifting.  This truncation is probably how you're seeing correct results with
your patch, for denormals containing only one bit set, e.g. FLT_TRUE_MIN.

It would be good to test other denormals, like 0x00055555.


> @@ -2871,10 +2872,14 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode,
>  
>  uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
>  {
> +    uint64_t result;
> +
>      float_status tstat = env->fp_status;
>      set_float_exception_flags(0, &tstat);
>  
> -    return (uint64_t)float64_to_float32(xb, &tstat) << 32;
> +    result = (uint64_t)float64_to_float32(xb, &tstat);
> +    /* hardware replicates result to both words of the doubleword result.  */
> +    return (result << 32) | result;
>  }

This definitely should be a separate patch.  The comment should include some
language about this behaviour being required by a future ISA revision.


r~


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc: Three floating point fixes
  2019-08-16 22:59 ` Aleksandar Markovic
  2019-08-17  7:53   ` Aleksandar Markovic
@ 2019-08-18  8:10   ` Richard Henderson
  2019-08-18 20:59     ` Aleksandar Markovic
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Henderson @ 2019-08-18  8:10 UTC (permalink / raw)
  To: Aleksandar Markovic, Paul A. Clarke; +Cc: qemu-ppc, qemu-devel, david

On 8/16/19 11:59 PM, Aleksandar Markovic wrote:
>> From: "Paul A. Clarke" <pc@us.ibm.com>
...
>>   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target
> register,
>>   and mffprwz expecting its input to come from word 0 of the source
> register.
>>   This sequence fails with QEMU, as a shift is required between those two
>>   instructions.  However, the hardware splats the result to both word 0
> and
>>   word 1 of its output register, so the shift is not necessary.
>>   Expect a future revision of the ISA to specify this behavior.
>>
> 
> Hmmm... Isn't this a gcc bug (using undocumented hardware feature), given
> everything you said here?

The key here is "expect a future revision of the ISA to specify this behavior".

It's clearly within IBM's purview to adjust the specification to document a
previously undocumented hardware feature.


r~


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc: Three floating point fixes
  2019-08-18  8:10   ` Richard Henderson
@ 2019-08-18 20:59     ` Aleksandar Markovic
  2019-08-19  6:28       ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: Aleksandar Markovic @ 2019-08-18 20:59 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paul A. Clarke, qemu-ppc, qemu-devel, david

18.08.2019. 10.10, "Richard Henderson" <richard.henderson@linaro.org> је
написао/ла:
>
> On 8/16/19 11:59 PM, Aleksandar Markovic wrote:
> >> From: "Paul A. Clarke" <pc@us.ibm.com>
> ...
> >>   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target
> > register,
> >>   and mffprwz expecting its input to come from word 0 of the source
> > register.
> >>   This sequence fails with QEMU, as a shift is required between those
two
> >>   instructions.  However, the hardware splats the result to both word 0
> > and
> >>   word 1 of its output register, so the shift is not necessary.
> >>   Expect a future revision of the ISA to specify this behavior.
> >>
> >
> > Hmmm... Isn't this a gcc bug (using undocumented hardware feature),
given
> > everything you said here?
>
> The key here is "expect a future revision of the ISA to specify this
behavior".
>
> It's clearly within IBM's purview to adjust the specification to document
a
> previously undocumented hardware feature.
>

By no means, yes, the key is in ISA documentation. But, the impression that
full original commit message conveys is that the main reason for change is
gcc behavior. If we accepted in general that gcc behavior determines QEMU
behavior, I am afraid we would be on a very slippery slope - therefore I
think the commit message (and possible code comment) should, in my opinion,
mention ISA docs as the central reason for change. Paul, is there any
tentative release date of the new ISA specification?

Aleksandar

>
> r~

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc: Three floating point fixes
  2019-08-18 20:59     ` Aleksandar Markovic
@ 2019-08-19  6:28       ` David Gibson
  2019-08-19  6:44         ` Aleksandar Markovic
  0 siblings, 1 reply; 13+ messages in thread
From: David Gibson @ 2019-08-19  6:28 UTC (permalink / raw)
  To: Aleksandar Markovic
  Cc: qemu-ppc, Richard Henderson, qemu-devel, Paul A. Clarke

[-- Attachment #1: Type: text/plain, Size: 2171 bytes --]

On Sun, Aug 18, 2019 at 10:59:01PM +0200, Aleksandar Markovic wrote:
> 18.08.2019. 10.10, "Richard Henderson" <richard.henderson@linaro.org> је
> написао/ла:
> >
> > On 8/16/19 11:59 PM, Aleksandar Markovic wrote:
> > >> From: "Paul A. Clarke" <pc@us.ibm.com>
> > ...
> > >>   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target
> > > register,
> > >>   and mffprwz expecting its input to come from word 0 of the source
> > > register.
> > >>   This sequence fails with QEMU, as a shift is required between those
> two
> > >>   instructions.  However, the hardware splats the result to both word 0
> > > and
> > >>   word 1 of its output register, so the shift is not necessary.
> > >>   Expect a future revision of the ISA to specify this behavior.
> > >>
> > >
> > > Hmmm... Isn't this a gcc bug (using undocumented hardware feature),
> given
> > > everything you said here?
> >
> > The key here is "expect a future revision of the ISA to specify this
> behavior".
> >
> > It's clearly within IBM's purview to adjust the specification to document
> a
> > previously undocumented hardware feature.
> >
> 
> By no means, yes, the key is in ISA documentation. But, the impression that
> full original commit message conveys is that the main reason for change is
> gcc behavior. If we accepted in general that gcc behavior determines QEMU
> behavior, I am afraid we would be on a very slippery slope - therefore I
> think the commit message (and possible code comment) should, in my opinion,
> mention ISA docs as the central reason for change. Paul, is there any
> tentative release date of the new ISA specification?

It's not really a question of gcc behaviour, it's a question of actual
cpu behaviour versus ISA document.  Which one qemu should follow is
somewhat debatable, but it sounds here like the ISA will be updated to
match the cpu, which weights it heavily in favour of mimicing the
actual cpu.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc: Three floating point fixes
  2019-08-16 19:27 [Qemu-devel] [PATCH] ppc: Three floating point fixes Paul A. Clarke
  2019-08-16 22:59 ` Aleksandar Markovic
  2019-08-18  8:00 ` [Qemu-devel] " Richard Henderson
@ 2019-08-19  6:30 ` David Gibson
  2 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2019-08-19  6:30 UTC (permalink / raw)
  To: Paul A. Clarke; +Cc: qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3178 bytes --]

On Fri, Aug 16, 2019 at 02:27:49PM -0500, Paul A. Clarke wrote:
> From: "Paul A. Clarke" <pc@us.ibm.com>
> 
> - target/ppc/fpu_helper.c:
>   - helper_todouble() was not properly converting INFINITY from 32 bit
>   float to 64 bit double.
>   - helper_todouble() was not properly converting any denormalized
>   32 bit float to 64 bit double.
> 
> - GCC, as of version 8 or so, takes advantage of the hardware's
>   implementation of the xscvdpspn instruction to optimize the following
>   sequence:
>     xscvdpspn vs0,vs1
>     mffprwz   r8,f0
>   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target register,
>   and mffprwz expecting its input to come from word 0 of the source register.
>   This sequence fails with QEMU, as a shift is required between those two
>   instructions.  However, the hardware splats the result to both word 0 and
>   word 1 of its output register, so the shift is not necessary.
>   Expect a future revision of the ISA to specify this behavior.

As well as addressing Richard's comments, I'd prefer to see each of
the 3 fixes here in separate patches.  The code changes may be tiny,
but I find the fpu emulation code cryptic at the best of times, and
making it clear which change is addressing which bug would help.

> Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
> ---
>  target/ppc/fpu_helper.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 5611cf0..82b5425 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -62,13 +62,14 @@ uint64_t helper_todouble(uint32_t arg)
>          ret  = (uint64_t)extract32(arg, 30, 2) << 62;
>          ret |= ((extract32(arg, 30, 1) ^ 1) * (uint64_t)7) << 59;
>          ret |= (uint64_t)extract32(arg, 0, 30) << 29;
> +        ret |= (0x7ffULL * (extract32(arg, 23, 8) == 0xff)) << 52;
>      } else {
>          /* Zero or Denormalized operand.  */
>          ret = (uint64_t)extract32(arg, 31, 1) << 63;
>          if (unlikely(abs_arg != 0)) {
>              /* Denormalized operand.  */
>              int shift = clz32(abs_arg) - 9;
> -            int exp = -126 - shift + 1023;
> +            int exp = -127 - shift + 1023;
>              ret |= (uint64_t)exp << 52;
>              ret |= abs_arg << (shift + 29);
>          }
> @@ -2871,10 +2872,14 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode,
>  
>  uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
>  {
> +    uint64_t result;
> +
>      float_status tstat = env->fp_status;
>      set_float_exception_flags(0, &tstat);
>  
> -    return (uint64_t)float64_to_float32(xb, &tstat) << 32;
> +    result = (uint64_t)float64_to_float32(xb, &tstat);
> +    /* hardware replicates result to both words of the doubleword result.  */
> +    return (result << 32) | result;
>  }
>  
>  uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH] ppc: Three floating point fixes
  2019-08-19  6:28       ` David Gibson
@ 2019-08-19  6:44         ` Aleksandar Markovic
  2019-08-19 17:13           ` [Qemu-devel] [Qemu-ppc] " Paul Clarke
  0 siblings, 1 reply; 13+ messages in thread
From: Aleksandar Markovic @ 2019-08-19  6:44 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, Richard Henderson, Paul A. Clarke, qemu-devel

19.08.2019. 08.30, "David Gibson" <david@gibson.dropbear.id.au> је
написао/ла:
>
> On Sun, Aug 18, 2019 at 10:59:01PM +0200, Aleksandar Markovic wrote:
> > 18.08.2019. 10.10, "Richard Henderson" <richard.henderson@linaro.org> је
> > написао/ла:
> > >
> > > On 8/16/19 11:59 PM, Aleksandar Markovic wrote:
> > > >> From: "Paul A. Clarke" <pc@us.ibm.com>
> > > ...
> > > >>   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target
> > > > register,
> > > >>   and mffprwz expecting its input to come from word 0 of the source
> > > > register.
> > > >>   This sequence fails with QEMU, as a shift is required between
those
> > two
> > > >>   instructions.  However, the hardware splats the result to both
word 0
> > > > and
> > > >>   word 1 of its output register, so the shift is not necessary.
> > > >>   Expect a future revision of the ISA to specify this behavior.
> > > >>
> > > >
> > > > Hmmm... Isn't this a gcc bug (using undocumented hardware feature),
> > given
> > > > everything you said here?
> > >
> > > The key here is "expect a future revision of the ISA to specify this
> > behavior".
> > >
> > > It's clearly within IBM's purview to adjust the specification to
document
> > a
> > > previously undocumented hardware feature.
> > >
> >
> > By no means, yes, the key is in ISA documentation. But, the impression
that
> > full original commit message conveys is that the main reason for change
is
> > gcc behavior. If we accepted in general that gcc behavior determines
QEMU
> > behavior, I am afraid we would be on a very slippery slope - therefore I
> > think the commit message (and possible code comment) should, in my
opinion,
> > mention ISA docs as the central reason for change. Paul, is there any
> > tentative release date of the new ISA specification?
>
> It's not really a question of gcc behaviour, it's a question of actual
> cpu behaviour versus ISA document.  Which one qemu should follow is
> somewhat debatable, but it sounds here like the ISA will be updated to
> match the cpu, which weights it heavily in favour of mimicing the
> actual cpu.
>

This sounds right to me.

Aleksandar

> --
> David Gibson                    | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_
_other_
>                                 | _way_ _around_!
> http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc: Three floating point fixes
  2019-08-19  6:44         ` Aleksandar Markovic
@ 2019-08-19 17:13           ` Paul Clarke
  2019-08-20  7:31             ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Clarke @ 2019-08-19 17:13 UTC (permalink / raw)
  To: Aleksandar Markovic, David Gibson; +Cc: Richard Henderson, qemu-ppc, qemu-devel

On 8/19/19 1:44 AM, Aleksandar Markovic wrote:
> 19.08.2019. 08.30, "David Gibson" <david@gibson.dropbear.id.au> је
> написао/ла:
>>
>> On Sun, Aug 18, 2019 at 10:59:01PM +0200, Aleksandar Markovic wrote:
>>> 18.08.2019. 10.10, "Richard Henderson" <richard.henderson@linaro.org> је
>>> написао/ла:
>>>>
>>>> On 8/16/19 11:59 PM, Aleksandar Markovic wrote:
>>>>>> From: "Paul A. Clarke" <pc@us.ibm.com>
>>>> ...
>>>>>>   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target
>>>>> register,
>>>>>>   and mffprwz expecting its input to come from word 0 of the source
>>>>> register.
>>>>>>   This sequence fails with QEMU, as a shift is required between
> those
>>> two
>>>>>>   instructions.  However, the hardware splats the result to both
> word 0
>>>>> and
>>>>>>   word 1 of its output register, so the shift is not necessary.
>>>>>>   Expect a future revision of the ISA to specify this behavior.
>>>>>>
>>>>>
>>>>> Hmmm... Isn't this a gcc bug (using undocumented hardware feature),
>>> given
>>>>> everything you said here?
>>>>
>>>> The key here is "expect a future revision of the ISA to specify this
>>> behavior".
>>>>
>>>> It's clearly within IBM's purview to adjust the specification to
> document
>>> a
>>>> previously undocumented hardware feature.
>>>>
>>>
>>> By no means, yes, the key is in ISA documentation. But, the impression
> that
>>> full original commit message conveys is that the main reason for change
> is
>>> gcc behavior. If we accepted in general that gcc behavior determines
> QEMU
>>> behavior, I am afraid we would be on a very slippery slope - therefore I
>>> think the commit message (and possible code comment) should, in my
> opinion,
>>> mention ISA docs as the central reason for change. Paul, is there any
>>> tentative release date of the new ISA specification?
>>
>> It's not really a question of gcc behaviour, it's a question of actual
>> cpu behaviour versus ISA document.  Which one qemu should follow is
>> somewhat debatable, but it sounds here like the ISA will be updated to
>> match the cpu, which weights it heavily in favour of mimicing the
>> actual cpu.
>>
> 
> This sounds right to me.

Thanks for the reviews and great discussion.

While not yet part of a published version of the ISA, I did find the behavior documented in the User's Manuals for the POWER8 and POWER9 processors:

https://www-355.ibm.com/systems/power/openpower/
"Public Documents"
- "POWER9 Processor User's Manual"
- "POWER8 Processor User's Manual for the SCM"

POWER9 Processor User's Manual 
4. Power Architecture Compliance
4.3 Floating-Point Processor (FP, VMX, and VSX)
4.3.7 Floating-Point Invalid Forms and Undefined Conditions

POWER8 Processor User's Manual for the Single-Chip Module
3. Power Architecture Compliance
3.2 Floating-Point Processor (FP, VMX, and VSX)
3.2.9 Floating-Point Invalid Forms and Undefined Conditions

In a bullet:
- VSX scalar convert from double-precision to single-precision (xscvdpsp, xscvdpspn).
VSR[32:63] is set to VSR[0:31].

I have not confirmed when the new revision of the ISA will be published, but it's on somebody's "to do" list.

I will respin the patch as 3 independent patches, and include a reference to the above documents for the change under discussion here.  (The other two changes may take a bit more time, because like David, I find the FPU emulation code cryptic.  :-/  )

These issues were found while running Glibc's test suite for "math", and there are still a *LOT* of QEMU-only FAILs, so I may be back again with suggested fixes or questions.  :-)

PC


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc: Three floating point fixes
  2019-08-19 17:13           ` [Qemu-devel] [Qemu-ppc] " Paul Clarke
@ 2019-08-20  7:31             ` David Gibson
  2019-08-20 10:35               ` Peter Maydell
  0 siblings, 1 reply; 13+ messages in thread
From: David Gibson @ 2019-08-20  7:31 UTC (permalink / raw)
  To: Paul Clarke; +Cc: Richard Henderson, qemu-ppc, qemu-devel, Aleksandar Markovic

[-- Attachment #1: Type: text/plain, Size: 4189 bytes --]

On Mon, Aug 19, 2019 at 12:13:34PM -0500, Paul Clarke wrote:
> On 8/19/19 1:44 AM, Aleksandar Markovic wrote:
> > 19.08.2019. 08.30, "David Gibson" <david@gibson.dropbear.id.au> је
> > написао/ла:
> >>
> >> On Sun, Aug 18, 2019 at 10:59:01PM +0200, Aleksandar Markovic wrote:
> >>> 18.08.2019. 10.10, "Richard Henderson" <richard.henderson@linaro.org> је
> >>> написао/ла:
> >>>>
> >>>> On 8/16/19 11:59 PM, Aleksandar Markovic wrote:
> >>>>>> From: "Paul A. Clarke" <pc@us.ibm.com>
> >>>> ...
> >>>>>>   ISA 3.0B has xscvdpspn leaving its result in word 1 of the target
> >>>>> register,
> >>>>>>   and mffprwz expecting its input to come from word 0 of the source
> >>>>> register.
> >>>>>>   This sequence fails with QEMU, as a shift is required between
> > those
> >>> two
> >>>>>>   instructions.  However, the hardware splats the result to both
> > word 0
> >>>>> and
> >>>>>>   word 1 of its output register, so the shift is not necessary.
> >>>>>>   Expect a future revision of the ISA to specify this behavior.
> >>>>>>
> >>>>>
> >>>>> Hmmm... Isn't this a gcc bug (using undocumented hardware feature),
> >>> given
> >>>>> everything you said here?
> >>>>
> >>>> The key here is "expect a future revision of the ISA to specify this
> >>> behavior".
> >>>>
> >>>> It's clearly within IBM's purview to adjust the specification to
> > document
> >>> a
> >>>> previously undocumented hardware feature.
> >>>>
> >>>
> >>> By no means, yes, the key is in ISA documentation. But, the impression
> > that
> >>> full original commit message conveys is that the main reason for change
> > is
> >>> gcc behavior. If we accepted in general that gcc behavior determines
> > QEMU
> >>> behavior, I am afraid we would be on a very slippery slope - therefore I
> >>> think the commit message (and possible code comment) should, in my
> > opinion,
> >>> mention ISA docs as the central reason for change. Paul, is there any
> >>> tentative release date of the new ISA specification?
> >>
> >> It's not really a question of gcc behaviour, it's a question of actual
> >> cpu behaviour versus ISA document.  Which one qemu should follow is
> >> somewhat debatable, but it sounds here like the ISA will be updated to
> >> match the cpu, which weights it heavily in favour of mimicing the
> >> actual cpu.
> >>
> > 
> > This sounds right to me.
> 
> Thanks for the reviews and great discussion.
> 
> While not yet part of a published version of the ISA, I did find the behavior documented in the User's Manuals for the POWER8 and POWER9 processors:
> 
> https://www-355.ibm.com/systems/power/openpower/
> "Public Documents"
> - "POWER9 Processor User's Manual"
> - "POWER8 Processor User's Manual for the SCM"
> 
> POWER9 Processor User's Manual 
> 4. Power Architecture Compliance
> 4.3 Floating-Point Processor (FP, VMX, and VSX)
> 4.3.7 Floating-Point Invalid Forms and Undefined Conditions
> 
> POWER8 Processor User's Manual for the Single-Chip Module
> 3. Power Architecture Compliance
> 3.2 Floating-Point Processor (FP, VMX, and VSX)
> 3.2.9 Floating-Point Invalid Forms and Undefined Conditions
> 
> In a bullet:
> - VSX scalar convert from double-precision to single-precision (xscvdpsp, xscvdpspn).
> VSR[32:63] is set to VSR[0:31].
> 
> I have not confirmed when the new revision of the ISA will be published, but it's on somebody's "to do" list.
> 
> I will respin the patch as 3 independent patches, and include a reference to the above documents for the change under discussion here.  (The other two changes may take a bit more time, because like David, I find the FPU emulation code cryptic.  :-/  )
> 
> These issues were found while running Glibc's test suite for "math",
> and there are still a *LOT* of QEMU-only FAILs, so I may be back
> again with suggested fixes or questions.  :-)

That doesn't greatly surprise me, TCG's ppc target stuff is only so-so
tested, TBH.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc: Three floating point fixes
  2019-08-20  7:31             ` David Gibson
@ 2019-08-20 10:35               ` Peter Maydell
  2019-08-22  3:02                 ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Maydell @ 2019-08-20 10:35 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, Richard Henderson, Paul Clarke, Aleksandar Markovic,
	QEMU Developers

On Tue, 20 Aug 2019 at 08:36, David Gibson <david@gibson.dropbear.id.au> wrote:
> On Mon, Aug 19, 2019 at 12:13:34PM -0500, Paul Clarke wrote:
> > These issues were found while running Glibc's test suite for "math",
> > and there are still a *LOT* of QEMU-only FAILs, so I may be back
> > again with suggested fixes or questions.  :-)
>
> That doesn't greatly surprise me, TCG's ppc target stuff is only so-so
> tested, TBH.

You might also consider using/extending the risu test cases for
ppc64 -- individual checks of each insn against a known-good
implementation can be easier to track down bugs than trying
to figure out why a higher-level test suite like the glibc one
has reported a failure, IME. (There are already risu patterns
for XSCVDPSP and XSCVDPSPN, so I think that bug at least ought
to be found by risu if you run it against the right h/w as
known-good reference...)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc: Three floating point fixes
  2019-08-20 10:35               ` Peter Maydell
@ 2019-08-22  3:02                 ` David Gibson
  0 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2019-08-22  3:02 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-ppc, Richard Henderson, Paul Clarke, Aleksandar Markovic,
	QEMU Developers

[-- Attachment #1: Type: text/plain, Size: 1311 bytes --]

On Tue, Aug 20, 2019 at 11:35:29AM +0100, Peter Maydell wrote:
> On Tue, 20 Aug 2019 at 08:36, David Gibson <david@gibson.dropbear.id.au> wrote:
> > On Mon, Aug 19, 2019 at 12:13:34PM -0500, Paul Clarke wrote:
> > > These issues were found while running Glibc's test suite for "math",
> > > and there are still a *LOT* of QEMU-only FAILs, so I may be back
> > > again with suggested fixes or questions.  :-)
> >
> > That doesn't greatly surprise me, TCG's ppc target stuff is only so-so
> > tested, TBH.
> 
> You might also consider using/extending the risu test cases for
> ppc64 -- individual checks of each insn against a known-good
> implementation can be easier to track down bugs than trying
> to figure out why a higher-level test suite like the glibc one
> has reported a failure, IME. (There are already risu patterns
> for XSCVDPSP and XSCVDPSPN, so I think that bug at least ought
> to be found by risu if you run it against the right h/w as
> known-good reference...)

Oh, I'd love to.  I tried running risu once, it failed cryptically and
I haven't had time to investigate deeper.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-08-22  4:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-16 19:27 [Qemu-devel] [PATCH] ppc: Three floating point fixes Paul A. Clarke
2019-08-16 22:59 ` Aleksandar Markovic
2019-08-17  7:53   ` Aleksandar Markovic
2019-08-18  8:10   ` Richard Henderson
2019-08-18 20:59     ` Aleksandar Markovic
2019-08-19  6:28       ` David Gibson
2019-08-19  6:44         ` Aleksandar Markovic
2019-08-19 17:13           ` [Qemu-devel] [Qemu-ppc] " Paul Clarke
2019-08-20  7:31             ` David Gibson
2019-08-20 10:35               ` Peter Maydell
2019-08-22  3:02                 ` David Gibson
2019-08-18  8:00 ` [Qemu-devel] " Richard Henderson
2019-08-19  6:30 ` David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).