All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/2] PPC handles mcrfs incorrectly
@ 2016-01-24 15:41 James Clarke
  2016-01-24 15:41 ` [Qemu-devel] [PATCH 1/2] target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro James Clarke
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: James Clarke @ 2016-01-24 15:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: James Clarke, qemu-ppc, Alexander Graf

Here is the description of the mcrfs instruction from the PowerPC Architecture
Book, Version 2.02, Book I: PowerPC User Instruction Set Architecture
(http://www.ibm.com/developerworks/systems/library/es-archguide-v2.html), found
on page 120:

    The contents of FPSCR field BFA are copied to Condition Register field BF.
    All exception bits copied are set to 0 in the FPSCR. If the FX bit is
    copied, it is set to 0 in the FPSCR.

    Special Registers Altered:
        CR field BF
        FX OX                        (if BFA=0)
        UX ZX XX VXSNAN              (if BFA=1)
        VXISI VXIDI VXZDZ VXIMZ      (if BFA=2)
        VXVC                         (if BFA=3)
        VXSOFT VXSQRT VXCVI          (if BFA=5)

However, currently every bit in FPSCR field BFA is set to 0, including ones not
on that list.

I noticed this with the following simple C program:

    #include <fenv.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int ret;
        ret = fegetround();
        printf("Current rounding: %d\n", ret);
        ret = fesetround(FE_UPWARD);
        printf("Setting to FE_UPWARD (%d): %d\n", FE_UPWARD, ret);
        ret = fegetround();
        printf("Current rounding: %d\n", ret);
        ret = fegetround();
        printf("Current rounding: %d\n", ret);
        return 0;
    }

which gave the output:

    Current rounding: 0
    Setting to FE_UPWARD (2): 0
    Current rounding: 2
    Current rounding: 0

instead of (with these patches applied):

    Current rounding: 0
    Setting to FE_UPWARD (2): 0
    Current rounding: 2
    Current rounding: 2

The relevant disassembly is in fegetround(), which, on my system, is:

    __GI___fegetround:
    <+0>:   mcrfs  cr7, cr7
    <+4>:   mfcr   r3
    <+8>:   clrldi r3, r3, 62
    <+12>:  blr

What happens is that, the first time fegetround() is called, FPSCR field 7 is
retrieved. However, because of the bug in mcrfs, the entirety of field 7 is set
to 0, which includes the rounding mode.

There are other issues this will fix, such as condition flags not persisting
when they should if read, and if you were to read a specific field with some
exception bits set, but no others were set in the entire register, then the
bits would be cleared correctly, but FEX/VX would not be updated to 0 as they
should be.

The first commit is because some FP_ macros needed to calculate
FP_EX_CLEAR_BITS did not exist, and I reordered all the FP_ macros so that they
are defined in the same order as the FPSCR_ macros.

James Clarke (2):
  target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro
  target-ppc: mcrfs should always update FEX/VX and only clear exception
    bits

 target-ppc/cpu.h       | 37 ++++++++++++++++++++++++++++---------
 target-ppc/translate.c | 15 +++++++++++----
 2 files changed, 39 insertions(+), 13 deletions(-)

--
2.7.0

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PATCH 1/2] target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro
  2016-01-24 15:41 [Qemu-devel] [PATCH 0/2] PPC handles mcrfs incorrectly James Clarke
@ 2016-01-24 15:41 ` James Clarke
  2016-01-24 15:41 ` [Qemu-devel] [PATCH 2/2] target-ppc: mcrfs should always update FEX/VX and only clear exception bits James Clarke
  2016-01-29  3:04 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC handles mcrfs incorrectly David Gibson
  2 siblings, 0 replies; 5+ messages in thread
From: James Clarke @ 2016-01-24 15:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: James Clarke, qemu-ppc, Alexander Graf

Signed-off-by: James Clarke <jrtc27@jrtc27.com>
---
 target-ppc/cpu.h | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 9706000..3a967b7 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -686,24 +686,37 @@ enum {
 
 #define FP_FX		(1ull << FPSCR_FX)
 #define FP_FEX		(1ull << FPSCR_FEX)
+#define FP_VX		(1ull << FPSCR_VX)
 #define FP_OX		(1ull << FPSCR_OX)
-#define FP_OE		(1ull << FPSCR_OE)
 #define FP_UX		(1ull << FPSCR_UX)
-#define FP_UE		(1ull << FPSCR_UE)
-#define FP_XX		(1ull << FPSCR_XX)
-#define FP_XE		(1ull << FPSCR_XE)
 #define FP_ZX		(1ull << FPSCR_ZX)
-#define FP_ZE		(1ull << FPSCR_ZE)
-#define FP_VX		(1ull << FPSCR_VX)
+#define FP_XX		(1ull << FPSCR_XX)
 #define FP_VXSNAN	(1ull << FPSCR_VXSNAN)
 #define FP_VXISI	(1ull << FPSCR_VXISI)
-#define FP_VXIMZ	(1ull << FPSCR_VXIMZ)
-#define FP_VXZDZ	(1ull << FPSCR_VXZDZ)
 #define FP_VXIDI	(1ull << FPSCR_VXIDI)
+#define FP_VXZDZ	(1ull << FPSCR_VXZDZ)
+#define FP_VXIMZ	(1ull << FPSCR_VXIMZ)
 #define FP_VXVC		(1ull << FPSCR_VXVC)
+#define FP_FR		(1ull << FSPCR_FR)
+#define FP_FI		(1ull << FPSCR_FI)
+#define FP_C		(1ull << FPSCR_C)
+#define FP_FL		(1ull << FPSCR_FL)
+#define FP_FG		(1ull << FPSCR_FG)
+#define FP_FE		(1ull << FPSCR_FE)
+#define FP_FU		(1ull << FPSCR_FU)
+#define FP_FPCC		(FP_FL | FP_FG | FP_FE | FP_FU)
+#define FP_FPRF		(FP_C  | FP_FL | FP_FG | FP_FE | FP_FU)
+#define FP_VXSOFT	(1ull << FPSCR_VXSOFT)
+#define FP_VXSQRT	(1ull << FPSCR_VXSQRT)
 #define FP_VXCVI	(1ull << FPSCR_VXCVI)
 #define FP_VE		(1ull << FPSCR_VE)
-#define FP_FI		(1ull << FPSCR_FI)
+#define FP_OE		(1ull << FPSCR_OE)
+#define FP_UE		(1ull << FPSCR_UE)
+#define FP_ZE		(1ull << FPSCR_ZE)
+#define FP_XE		(1ull << FPSCR_XE)
+#define FP_NI		(1ull << FPSCR_NI)
+#define FP_RN1		(1ull << FPSCR_RN1)
+#define FP_RN		(1ull << FPSCR_RN)
 
 /*****************************************************************************/
 /* Vector status and control register */
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PATCH 2/2] target-ppc: mcrfs should always update FEX/VX and only clear exception bits
  2016-01-24 15:41 [Qemu-devel] [PATCH 0/2] PPC handles mcrfs incorrectly James Clarke
  2016-01-24 15:41 ` [Qemu-devel] [PATCH 1/2] target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro James Clarke
@ 2016-01-24 15:41 ` James Clarke
  2016-01-29  3:17   ` [Qemu-devel] [Qemu-ppc] " David Gibson
  2016-01-29  3:04 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC handles mcrfs incorrectly David Gibson
  2 siblings, 1 reply; 5+ messages in thread
From: James Clarke @ 2016-01-24 15:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: James Clarke, qemu-ppc, Alexander Graf

Signed-off-by: James Clarke <jrtc27@jrtc27.com>
---
 target-ppc/cpu.h       |  6 ++++++
 target-ppc/translate.c | 15 +++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 3a967b7..d811bc9 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -718,6 +718,12 @@ enum {
 #define FP_RN1		(1ull << FPSCR_RN1)
 #define FP_RN		(1ull << FPSCR_RN)
 
+/* the exception bits which can be cleared by mcrfs - includes FX */
+#define FP_EX_CLEAR_BITS (FP_FX     | FP_OX     | FP_UX     | FP_ZX     | \
+                          FP_XX     | FP_VXSNAN | FP_VXISI  | FP_VXIDI  | \
+                          FP_VXZDZ  | FP_VXIMZ  | FP_VXVC   | FP_VXSOFT | \
+                          FP_VXSQRT | FP_VXCVI)
+
 /*****************************************************************************/
 /* Vector status and control register */
 #define VSCR_NJ		16 /* Vector non-java */
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 4be7eaa..989f683 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2501,17 +2501,24 @@ static void gen_mcrfs(DisasContext *ctx)
 {
     TCGv tmp = tcg_temp_new();
     int bfa;
+    int nibble;
+    int shift;
 
     if (unlikely(!ctx->fpu_enabled)) {
         gen_exception(ctx, POWERPC_EXCP_FPU);
         return;
     }
-    bfa = 4 * (7 - crfS(ctx->opcode));
-    tcg_gen_shri_tl(tmp, cpu_fpscr, bfa);
+    bfa = crfS(ctx->opcode);
+    nibble = 7 - bfa;
+    shift = 4 * nibble;
+    tcg_gen_shri_tl(tmp, cpu_fpscr, shift);
     tcg_gen_trunc_tl_i32(cpu_crf[crfD(ctx->opcode)], tmp);
-    tcg_temp_free(tmp);
     tcg_gen_andi_i32(cpu_crf[crfD(ctx->opcode)], cpu_crf[crfD(ctx->opcode)], 0xf);
-    tcg_gen_andi_tl(cpu_fpscr, cpu_fpscr, ~(0xF << bfa));
+    /* Only the exception bits (including FX) should be cleared if read */
+    tcg_gen_andi_tl(tmp, cpu_fpscr, ~((0xF << shift) & FP_EX_CLEAR_BITS));
+    /* FEX and VX need to be updated, so don't set fpscr directly */
+    gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);
+    tcg_temp_free(tmp);
 }
 
 /* mffs */
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC handles mcrfs incorrectly
  2016-01-24 15:41 [Qemu-devel] [PATCH 0/2] PPC handles mcrfs incorrectly James Clarke
  2016-01-24 15:41 ` [Qemu-devel] [PATCH 1/2] target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro James Clarke
  2016-01-24 15:41 ` [Qemu-devel] [PATCH 2/2] target-ppc: mcrfs should always update FEX/VX and only clear exception bits James Clarke
@ 2016-01-29  3:04 ` David Gibson
  2 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2016-01-29  3:04 UTC (permalink / raw)
  To: James Clarke; +Cc: qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3455 bytes --]

On Sun, Jan 24, 2016 at 03:41:24PM +0000, James Clarke wrote:
> Here is the description of the mcrfs instruction from the PowerPC Architecture
> Book, Version 2.02, Book I: PowerPC User Instruction Set Architecture
> (http://www.ibm.com/developerworks/systems/library/es-archguide-v2.html), found
> on page 120:

Thanks I've merged these fixes to ppc-for-2.6 which I'll send a pull
request for shortly.

> 
>     The contents of FPSCR field BFA are copied to Condition Register field BF.
>     All exception bits copied are set to 0 in the FPSCR. If the FX bit is
>     copied, it is set to 0 in the FPSCR.
> 
>     Special Registers Altered:
>         CR field BF
>         FX OX                        (if BFA=0)
>         UX ZX XX VXSNAN              (if BFA=1)
>         VXISI VXIDI VXZDZ VXIMZ      (if BFA=2)
>         VXVC                         (if BFA=3)
>         VXSOFT VXSQRT VXCVI          (if BFA=5)
> 
> However, currently every bit in FPSCR field BFA is set to 0, including ones not
> on that list.
> 
> I noticed this with the following simple C program:
> 
>     #include <fenv.h>
>     #include <stdio.h>
> 
>     int main(int argc, char **argv) {
>         int ret;
>         ret = fegetround();
>         printf("Current rounding: %d\n", ret);
>         ret = fesetround(FE_UPWARD);
>         printf("Setting to FE_UPWARD (%d): %d\n", FE_UPWARD, ret);
>         ret = fegetround();
>         printf("Current rounding: %d\n", ret);
>         ret = fegetround();
>         printf("Current rounding: %d\n", ret);
>         return 0;
>     }
> 
> which gave the output:
> 
>     Current rounding: 0
>     Setting to FE_UPWARD (2): 0
>     Current rounding: 2
>     Current rounding: 0
> 
> instead of (with these patches applied):
> 
>     Current rounding: 0
>     Setting to FE_UPWARD (2): 0
>     Current rounding: 2
>     Current rounding: 2
> 
> The relevant disassembly is in fegetround(), which, on my system, is:
> 
>     __GI___fegetround:
>     <+0>:   mcrfs  cr7, cr7
>     <+4>:   mfcr   r3
>     <+8>:   clrldi r3, r3, 62
>     <+12>:  blr
> 
> What happens is that, the first time fegetround() is called, FPSCR field 7 is
> retrieved. However, because of the bug in mcrfs, the entirety of field 7 is set
> to 0, which includes the rounding mode.
> 
> There are other issues this will fix, such as condition flags not persisting
> when they should if read, and if you were to read a specific field with some
> exception bits set, but no others were set in the entire register, then the
> bits would be cleared correctly, but FEX/VX would not be updated to 0 as they
> should be.
> 
> The first commit is because some FP_ macros needed to calculate
> FP_EX_CLEAR_BITS did not exist, and I reordered all the FP_ macros so that they
> are defined in the same order as the FPSCR_ macros.
> 
> James Clarke (2):
>   target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro
>   target-ppc: mcrfs should always update FEX/VX and only clear exception
>     bits
> 
>  target-ppc/cpu.h       | 37 ++++++++++++++++++++++++++++---------
>  target-ppc/translate.c | 15 +++++++++++----
>  2 files changed, 39 insertions(+), 13 deletions(-)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/2] target-ppc: mcrfs should always update FEX/VX and only clear exception bits
  2016-01-24 15:41 ` [Qemu-devel] [PATCH 2/2] target-ppc: mcrfs should always update FEX/VX and only clear exception bits James Clarke
@ 2016-01-29  3:17   ` David Gibson
  0 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2016-01-29  3:17 UTC (permalink / raw)
  To: James Clarke; +Cc: qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 4000 bytes --]

On Sun, Jan 24, 2016 at 03:41:26PM +0000, James Clarke wrote:
> Signed-off-by: James Clarke <jrtc27@jrtc27.com>

So, first, for a patch making a subtle behavioural change like this a
detailed commit message is absolutely essential.  In this case I can
take the description from 0/2, but in future please include
rationale's like that in the individual patches, so they'll appear in
the git history without extra work on my part.

But.. there's a more serious bug here, so I've backed this out of
ppc-for-2.6...

[snip]
> @@ -2501,17 +2501,24 @@ static void gen_mcrfs(DisasContext *ctx)
>  {
>      TCGv tmp = tcg_temp_new();
>      int bfa;
> +    int nibble;
> +    int shift;
>  
>      if (unlikely(!ctx->fpu_enabled)) {
>          gen_exception(ctx, POWERPC_EXCP_FPU);
>          return;
>      }
> -    bfa = 4 * (7 - crfS(ctx->opcode));
> -    tcg_gen_shri_tl(tmp, cpu_fpscr, bfa);
> +    bfa = crfS(ctx->opcode);
> +    nibble = 7 - bfa;
> +    shift = 4 * nibble;
> +    tcg_gen_shri_tl(tmp, cpu_fpscr, shift);
>      tcg_gen_trunc_tl_i32(cpu_crf[crfD(ctx->opcode)], tmp);
> -    tcg_temp_free(tmp);
>      tcg_gen_andi_i32(cpu_crf[crfD(ctx->opcode)], cpu_crf[crfD(ctx->opcode)], 0xf);
> -    tcg_gen_andi_tl(cpu_fpscr, cpu_fpscr, ~(0xF << bfa));
> +    /* Only the exception bits (including FX) should be cleared if read */
> +    tcg_gen_andi_tl(tmp, cpu_fpscr, ~((0xF << shift) & FP_EX_CLEAR_BITS));
> +    /* FEX and VX need to be updated, so don't set fpscr directly */
> +    gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);

This doesn't compile.  For 64-bit targets we get:

  CC    ppc64-softmmu/target-ppc/translate.o
/home/dwg/src/qemu/target-ppc/translate.c: In function ‘gen_mcrfs’:
/home/dwg/src/qemu/target-ppc/translate.c:2520:42: error: passing argument 3 of ‘gen_helper_store_fpscr’ makes pointe
r from integer without a cast [-Werror=int-conversion]
     gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);
                                          ^
In file included from /home/dwg/src/qemu/include/exec/helper-gen.h:59:0,
                 from /home/dwg/src/qemu/tcg/tcg-op.h:27,
                 from /home/dwg/src/qemu/target-ppc/translate.c:23:
/home/dwg/src/qemu/target-ppc/helper.h:56:58: note: expected ‘TCGv_i32 {aka struct TCGv_i32_d *}’ but argument is of 
type ‘int’

For 32-bit targets it's worse:

  CC    ppcemb-softmmu/target-ppc/translate.o
/home/dwg/src/qemu/target-ppc/translate.c: In function ‘gen_mcrfs’:
/home/dwg/src/qemu/target-ppc/translate.c:2520:37: error: passing argument 2 of ‘gen_helper_store_fpscr’ from incompa
tible pointer type [-Werror=incompatible-pointer-types]
     gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);
                                     ^
In file included from /home/dwg/src/qemu/include/exec/helper-gen.h:59:0,
                 from /home/dwg/src/qemu/tcg/tcg-op.h:27,
                 from /home/dwg/src/qemu/target-ppc/translate.c:23:
/home/dwg/src/qemu/target-ppc/helper.h:56:58: note: expected ‘TCGv_i64 {aka struct TCGv_i64_d *}’ but argument is of 
type ‘TCGv_i32 {aka struct TCGv_i32_d *}’
/home/dwg/src/qemu/target-ppc/translate.c:2520:42: error: passing argument 3 of ‘gen_helper_store_fpscr’ makes pointe
r from integer without a cast [-Werror=int-conversion]
     gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);
                                          ^
In file included from /home/dwg/src/qemu/include/exec/helper-gen.h:59:0,
                 from /home/dwg/src/qemu/tcg/tcg-op.h:27,
                 from /home/dwg/src/qemu/target-ppc/translate.c:23:
/home/dwg/src/qemu/target-ppc/helper.h:56:58: note: expected ‘TCGv_i32 {aka struct TCGv_i32_d *}’ but argument is of 
type ‘int’


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-01-29  4:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-24 15:41 [Qemu-devel] [PATCH 0/2] PPC handles mcrfs incorrectly James Clarke
2016-01-24 15:41 ` [Qemu-devel] [PATCH 1/2] target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro James Clarke
2016-01-24 15:41 ` [Qemu-devel] [PATCH 2/2] target-ppc: mcrfs should always update FEX/VX and only clear exception bits James Clarke
2016-01-29  3:17   ` [Qemu-devel] [Qemu-ppc] " David Gibson
2016-01-29  3:04 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC handles mcrfs incorrectly David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.