All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4
@ 2014-01-10 19:07 Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 01/18] target-ppc: VSX Stage 4: Add VSX 2.07 Flag Tom Musta
                   ` (17 more replies)
  0 siblings, 18 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This is the fourth and final series of patches that add emulation support
to QEMU for the PowerPC Vector Scalar Extension (VSX).

This series adds the instructions that were newly introduced with Power ISA
V2.07.  This includes 3 scalar load instructions, 2 scalar store instructions,
7 standard single precision scalar arithmetic instructions, 8 scalar single
precision fused multiply/add instructions, two integer-to-single-precision
conversion instructions and 3 vector logical instructions.

The single-precision scalar arithmetic instructions all interpret the most
significant 64 bits of a VSR as a single precision floating point number
stored in double precision format (similar to the standard PowerPC floating
point single precision instructions).  Thus a common theme in the supporting
code is rounding of an intermediate double-precision number to single 
precision.

V2: (a) Changed the rounding to single precision to reuse the existing
helper_frsp() routine.  (b) Re-implemented the fused multiply/add instructions
to use float32_muladd instead of float64_muladd, which avoids subtle rounding
errors.

V3: Re-implemented fused multiply/add per clarification from Richard Henderson.

V4: Changed fused multiply/add to use helper_frsp (inadvertently re-injected
when I used an earlier patch).  

V5: Fixed tcg compilation problems.

V6: Added instructions that were previously missed.

Tom Musta (18):
  target-ppc: VSX Stage 4: Add VSX 2.07 Flag
  target-ppc: VSX Stage 4: Refactor lxsdx
  target-ppc: VSX Stage 4: Add lxsiwax, lxsiwzx and lxsspx
  target-ppc: VSX Stage 4: Refactor stxsdx
  target-ppc: VSX Stage 4: Add stxsiwx and stxsspx
  target-ppc: VSX Stage 4: Add xsaddsp and xssubsp
  target-ppc: VSX Stage 4: Add xsmulsp
  target-ppc: VSX Stage 4: Add xsdivsp
  target-ppc: VSX Stage 4: Add xsresp
  target-ppc: VSX Stage 4: Add xssqrtsp
  target-ppc: VSX Stage 4: add xsrsqrtesp
  target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds
  target-ppc: VSX Stage 4: Add xscvsxdsp and xscvuxdsp
  target-ppc: VSX Stage 4: Add xxleqv, xxlnand and xxlorc
  target-ppc: Move To/From VSR Instructions
  target-ppc: Floating Merge Word Instructions
  target-ppc: Scalar Round to Single Precision
  target-ppc: Scalar Non-Signalling Conversions

 target-ppc/cpu.h            |    4 +-
 target-ppc/fpu_helper.c     |  231 ++++++++++++++++++++++++++++++-------------
 target-ppc/helper.h         |   21 ++++
 target-ppc/translate.c      |  195 +++++++++++++++++++++++++++++++-----
 target-ppc/translate_init.c |    2 +-
 5 files changed, 359 insertions(+), 94 deletions(-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 01/18] target-ppc: VSX Stage 4: Add VSX 2.07 Flag
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 02/18] target-ppc: VSX Stage 4: Refactor lxsdx Tom Musta
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds a flag to identify those VSX instructions that are
new to Power ISA V2.07.  The flag is added to the Power 8 processor
initialization so that the P8 models understand how to decode and
emulate instructions in this category.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-ppc/cpu.h            |    4 +++-
 target-ppc/translate_init.c |    2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index bb84767..0abc848 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1875,9 +1875,11 @@ enum {
     PPC2_DBRX          = 0x0000000000000010ULL,
     /* Book I 2.05 PowerPC specification                                     */
     PPC2_ISA205        = 0x0000000000000020ULL,
+    /* VSX additions in ISA 2.07                                             */
+    PPC2_VSX207        = 0x0000000000000040ULL,
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
-  PPC2_ISA205)
+                        PPC2_ISA205 | PPC2_VSX207)
 };
 
 /*****************************************************************************/
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index c030a20..dd57df3 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7312,7 +7312,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
                        PPC_64B | PPC_ALTIVEC |
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
-    pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX;
+    pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX;
     pcc->msr_mask = 0x800000000284FF36ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 02/18] target-ppc: VSX Stage 4: Refactor lxsdx
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 01/18] target-ppc: VSX Stage 4: Add VSX 2.07 Flag Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 03/18] target-ppc: VSX Stage 4: Add lxsiwax, lxsiwzx and lxsspx Tom Musta
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch refactors the lxsdx generator. Resuable code is isolated
into a macro.  The macro will be used in subsequent patches in this
series to implement other scalar load instructions.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-ppc/translate.c |   31 +++++++++++++++++--------------
 1 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 79be8ed..ca26dcf 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7022,20 +7022,23 @@ static inline TCGv_i64 cpu_vsrl(int n)
     }
 }
 
-static void gen_lxsdx(DisasContext *ctx)
-{
-    TCGv EA;
-    if (unlikely(!ctx->vsx_enabled)) {
-        gen_exception(ctx, POWERPC_EXCP_VSXU);
-        return;
-    }
-    gen_set_access_type(ctx, ACCESS_INT);
-    EA = tcg_temp_new();
-    gen_addr_reg_index(ctx, EA);
-    gen_qemu_ld64(ctx, cpu_vsrh(xT(ctx->opcode)), EA);
-    /* NOTE: cpu_vsrl is undefined */
-    tcg_temp_free(EA);
-}
+#define VSX_LOAD_SCALAR(name, operation)                      \
+static void gen_##name(DisasContext *ctx)                     \
+{                                                             \
+    TCGv EA;                                                  \
+    if (unlikely(!ctx->vsx_enabled)) {                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                \
+        return;                                               \
+    }                                                         \
+    gen_set_access_type(ctx, ACCESS_INT);                     \
+    EA = tcg_temp_new();                                      \
+    gen_addr_reg_index(ctx, EA);                              \
+    gen_qemu_##operation(ctx, cpu_vsrh(xT(ctx->opcode)), EA); \
+    /* NOTE: cpu_vsrl is undefined */                         \
+    tcg_temp_free(EA);                                        \
+}
+
+VSX_LOAD_SCALAR(lxsdx, ld64)
 
 static void gen_lxvd2x(DisasContext *ctx)
 {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 03/18] target-ppc: VSX Stage 4: Add lxsiwax, lxsiwzx and lxsspx
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 01/18] target-ppc: VSX Stage 4: Add VSX 2.07 Flag Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 02/18] target-ppc: VSX Stage 4: Refactor lxsdx Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 04/18] target-ppc: VSX Stage 4: Refactor stxsdx Tom Musta
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the scalar load instructions introduced in ISA
V2.07:

  - Load VSX Scalar as Integer Word Algebraic Indexd (lxsiwax)
  - Load VSX Scalar as Integer Word and Zero Indexed (lxsiwzx)
  - Load VSX Scalar Single-Precision Indexed (lxsspx)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V5: Updated to fix tcg-debug compilation failures.

 target-ppc/translate.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index ca26dcf..958ea94 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2585,6 +2585,14 @@ static inline void gen_qemu_ld32s(DisasContext *ctx, TCGv arg1, TCGv arg2)
         tcg_gen_qemu_ld32s(arg1, arg2, ctx->mem_idx);
 }
 
+static void gen_qemu_ld32s_i64(DisasContext *ctx, TCGv_i64 val, TCGv addr)
+{
+    TCGv tmp = tcg_temp_new();
+    gen_qemu_ld32s(ctx, tmp, addr);
+    tcg_gen_ext_tl_i64(val, tmp);
+    tcg_temp_free(tmp);
+}
+
 static inline void gen_qemu_ld64(DisasContext *ctx, TCGv_i64 arg1, TCGv arg2)
 {
     tcg_gen_qemu_ld64(arg1, arg2, ctx->mem_idx);
@@ -7039,6 +7047,9 @@ static void gen_##name(DisasContext *ctx)                     \
 }
 
 VSX_LOAD_SCALAR(lxsdx, ld64)
+VSX_LOAD_SCALAR(lxsiwax, ld32s_i64)
+VSX_LOAD_SCALAR(lxsiwzx, ld32u_i64)
+VSX_LOAD_SCALAR(lxsspx, ld32fs)
 
 static void gen_lxvd2x(DisasContext *ctx)
 {
@@ -10044,6 +10055,9 @@ GEN_VAFORM_PAIRED(vsel, vperm, 21),
 GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23),
 
 GEN_HANDLER_E(lxsdx, 0x1F, 0x0C, 0x12, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(lxsiwax, 0x1F, 0x0C, 0x02, 0, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(lxsiwzx, 0x1F, 0x0C, 0x00, 0, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(lxsspx, 0x1F, 0x0C, 0x10, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 04/18] target-ppc: VSX Stage 4: Refactor stxsdx
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (2 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 03/18] target-ppc: VSX Stage 4: Add lxsiwax, lxsiwzx and lxsspx Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 05/18] target-ppc: VSX Stage 4: Add stxsiwx and stxsspx Tom Musta
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch refactors the stxsdx instruction.  Reusable code is
extracted into a macro which will be used in subsequent patches
in this series.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-ppc/translate.c |   27 +++++++++++++++------------
 1 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 958ea94..9f3dda7 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7112,20 +7112,23 @@ static void gen_lxvw4x(DisasContext *ctx)
     tcg_temp_free_i64(tmp);
 }
 
-static void gen_stxsdx(DisasContext *ctx)
-{
-    TCGv EA;
-    if (unlikely(!ctx->vsx_enabled)) {
-        gen_exception(ctx, POWERPC_EXCP_VSXU);
-        return;
-    }
-    gen_set_access_type(ctx, ACCESS_INT);
-    EA = tcg_temp_new();
-    gen_addr_reg_index(ctx, EA);
-    gen_qemu_st64(ctx, cpu_vsrh(xS(ctx->opcode)), EA);
-    tcg_temp_free(EA);
+#define VSX_STORE_SCALAR(name, operation)                     \
+static void gen_##name(DisasContext *ctx)                     \
+{                                                             \
+    TCGv EA;                                                  \
+    if (unlikely(!ctx->vsx_enabled)) {                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                \
+        return;                                               \
+    }                                                         \
+    gen_set_access_type(ctx, ACCESS_INT);                     \
+    EA = tcg_temp_new();                                      \
+    gen_addr_reg_index(ctx, EA);                              \
+    gen_qemu_##operation(ctx, cpu_vsrh(xS(ctx->opcode)), EA); \
+    tcg_temp_free(EA);                                        \
 }
 
+VSX_STORE_SCALAR(stxsdx, st64)
+
 static void gen_stxvd2x(DisasContext *ctx)
 {
     TCGv EA;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 05/18] target-ppc: VSX Stage 4: Add stxsiwx and stxsspx
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (3 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 04/18] target-ppc: VSX Stage 4: Refactor stxsdx Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 06/18] target-ppc: VSX Stage 4: Add xsaddsp and xssubsp Tom Musta
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds two store scalar instructions:

  - Store VSX Scalar as Integer Word Indexed (stxsiwx)
  - Store VSX Scalar Single-Precision Indexed (stxsspx)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V5: Updated to address tcg-debug compliation errors.

 target-ppc/translate.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 9f3dda7..28794d1 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7128,6 +7128,8 @@ static void gen_##name(DisasContext *ctx)                     \
 }
 
 VSX_STORE_SCALAR(stxsdx, st64)
+VSX_STORE_SCALAR(stxsiwx, st32_i64)
+VSX_STORE_SCALAR(stxsspx, st32fs)
 
 static void gen_stxvd2x(DisasContext *ctx)
 {
@@ -10066,6 +10068,8 @@ GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
 
 GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(stxsiwx, 0x1F, 0xC, 0x04, 0, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(stxsspx, 0x1F, 0xC, 0x14, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX),
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 06/18] target-ppc: VSX Stage 4: Add xsaddsp and xssubsp
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (4 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 05/18] target-ppc: VSX Stage 4: Add stxsiwx and stxsspx Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 07/18] target-ppc: VSX Stage 4: Add xsmulsp Tom Musta
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Scalar Add Single-Precision (xsaddsp) and
VSX Scalar Subtract Single-Precision (xssubsp) instructions.

The existing VSX_ADD_SUB macro is modified to support the rounding
of the (intermediate) result to single-precision.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V2: updated conversion of result to single precision.

 target-ppc/fpu_helper.c |   20 +++++++++++++-------
 target-ppc/helper.h     |    3 +++
 target-ppc/translate.c  |    6 ++++++
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 3165ef0..f047640 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1768,7 +1768,7 @@ static void putVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf)                          \
+#define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf, r2sp)                    \
 void helper_##name(CPUPPCState *env, uint32_t opcode)                        \
 {                                                                            \
     ppc_vsr_t xt, xa, xb;                                                    \
@@ -1794,6 +1794,10 @@ void helper_##name(CPUPPCState *env, uint32_t opcode)                        \
             }                                                                \
         }                                                                    \
                                                                              \
+        if (r2sp) {                                                          \
+            xt.fld[i] = helper_frsp(env, xt.fld[i]);                         \
+        }                                                                    \
+                                                                             \
         if (sfprf) {                                                         \
             helper_compute_fprf(env, xt.fld[i], sfprf);                      \
         }                                                                    \
@@ -1802,12 +1806,14 @@ void helper_##name(CPUPPCState *env, uint32_t opcode)                        \
     helper_float_check_status(env);                                          \
 }
 
-VSX_ADD_SUB(xsadddp, add, 1, float64, f64, 1)
-VSX_ADD_SUB(xvadddp, add, 2, float64, f64, 0)
-VSX_ADD_SUB(xvaddsp, add, 4, float32, f32, 0)
-VSX_ADD_SUB(xssubdp, sub, 1, float64, f64, 1)
-VSX_ADD_SUB(xvsubdp, sub, 2, float64, f64, 0)
-VSX_ADD_SUB(xvsubsp, sub, 4, float32, f32, 0)
+VSX_ADD_SUB(xsadddp, add, 1, float64, f64, 1, 0)
+VSX_ADD_SUB(xsaddsp, add, 1, float64, f64, 1, 1)
+VSX_ADD_SUB(xvadddp, add, 2, float64, f64, 0, 0)
+VSX_ADD_SUB(xvaddsp, add, 4, float32, f32, 0, 0)
+VSX_ADD_SUB(xssubdp, sub, 1, float64, f64, 1, 0)
+VSX_ADD_SUB(xssubsp, sub, 1, float64, f64, 1, 1)
+VSX_ADD_SUB(xvsubdp, sub, 2, float64, f64, 0, 0)
+VSX_ADD_SUB(xvsubsp, sub, 4, float32, f32, 0, 0)
 
 /* VSX_MUL - VSX floating point multiply
  *   op    - instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 0276b02..696b9d3 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -286,6 +286,9 @@ DEF_HELPER_2(xsrdpim, void, env, i32)
 DEF_HELPER_2(xsrdpip, void, env, i32)
 DEF_HELPER_2(xsrdpiz, void, env, i32)
 
+DEF_HELPER_2(xsaddsp, void, env, i32)
+DEF_HELPER_2(xssubsp, void, env, i32)
+
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
 DEF_HELPER_2(xvmuldp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 28794d1..c50d800 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7358,6 +7358,9 @@ GEN_VSX_HELPER_2(xsrdpim, 0x12, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsrdpip, 0x12, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 
+GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
+
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
@@ -10164,6 +10167,9 @@ GEN_XX2FORM(xsrdpim, 0x12, 0x07, PPC2_VSX),
 GEN_XX2FORM(xsrdpip, 0x12, 0x06, PPC2_VSX),
 GEN_XX2FORM(xsrdpiz, 0x12, 0x05, PPC2_VSX),
 
+GEN_XX3FORM(xsaddsp, 0x00, 0x00, PPC2_VSX207),
+GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
+
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
 GEN_XX3FORM(xvmuldp, 0x00, 0x0E, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 07/18] target-ppc: VSX Stage 4: Add xsmulsp
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (5 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 06/18] target-ppc: VSX Stage 4: Add xsaddsp and xssubsp Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 08/18] target-ppc: VSX Stage 4: Add xsdivsp Tom Musta
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Scalar Multiply Single-Precision (xsmulsp)
instruction.

The existing VSX_MUL macro is modified to support rounding of the
intermediate result to single precision.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V2: Updated conversion to single precision.

 target-ppc/fpu_helper.c |   13 +++++++++----
 target-ppc/helper.h     |    1 +
 target-ppc/translate.c  |    2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index f047640..dc9849f 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1822,7 +1822,7 @@ VSX_ADD_SUB(xvsubsp, sub, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_MUL(op, nels, tp, fld, sfprf)                                    \
+#define VSX_MUL(op, nels, tp, fld, sfprf, r2sp)                              \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
 {                                                                            \
     ppc_vsr_t xt, xa, xb;                                                    \
@@ -1849,6 +1849,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
             }                                                                \
         }                                                                    \
                                                                              \
+        if (r2sp) {                                                          \
+            xt.fld[i] = helper_frsp(env, xt.fld[i]);                         \
+        }                                                                    \
+                                                                             \
         if (sfprf) {                                                         \
             helper_compute_fprf(env, xt.fld[i], sfprf);                      \
         }                                                                    \
@@ -1858,9 +1862,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
     helper_float_check_status(env);                                          \
 }
 
-VSX_MUL(xsmuldp, 1, float64, f64, 1)
-VSX_MUL(xvmuldp, 2, float64, f64, 0)
-VSX_MUL(xvmulsp, 4, float32, f32, 0)
+VSX_MUL(xsmuldp, 1, float64, f64, 1, 0)
+VSX_MUL(xsmulsp, 1, float64, f64, 1, 1)
+VSX_MUL(xvmuldp, 2, float64, f64, 0, 0)
+VSX_MUL(xvmulsp, 4, float32, f32, 0, 0)
 
 /* VSX_DIV - VSX floating point divide
  *   op    - instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 696b9d3..0ccdc96 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -288,6 +288,7 @@ DEF_HELPER_2(xsrdpiz, void, env, i32)
 
 DEF_HELPER_2(xsaddsp, void, env, i32)
 DEF_HELPER_2(xssubsp, void, env, i32)
+DEF_HELPER_2(xsmulsp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index c50d800..3a6a94b 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7360,6 +7360,7 @@ GEN_VSX_HELPER_2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 
 GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10169,6 +10170,7 @@ GEN_XX2FORM(xsrdpiz, 0x12, 0x05, PPC2_VSX),
 
 GEN_XX3FORM(xsaddsp, 0x00, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
+GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 08/18] target-ppc: VSX Stage 4: Add xsdivsp
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (6 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 07/18] target-ppc: VSX Stage 4: Add xsmulsp Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 09/18] target-ppc: VSX Stage 4: Add xsresp Tom Musta
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Scalar Divide Single Precision (xsdivsp)
instruction.

The existing VSX_DIV macro is modified to support rounding of the
intermediate double precision result to single precision.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V2: Updated conversion to single precision.

 target-ppc/fpu_helper.c |   13 +++++++++----
 target-ppc/helper.h     |    1 +
 target-ppc/translate.c  |    2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index dc9849f..49cf09a 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1874,7 +1874,7 @@ VSX_MUL(xvmulsp, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_DIV(op, nels, tp, fld, sfprf)                                     \
+#define VSX_DIV(op, nels, tp, fld, sfprf, r2sp)                               \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
 {                                                                             \
     ppc_vsr_t xt, xa, xb;                                                     \
@@ -1903,6 +1903,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
             }                                                                 \
         }                                                                     \
                                                                               \
+        if (r2sp) {                                                           \
+            xt.fld[i] = helper_frsp(env, xt.fld[i]);                          \
+        }                                                                     \
+                                                                              \
         if (sfprf) {                                                          \
             helper_compute_fprf(env, xt.fld[i], sfprf);                       \
         }                                                                     \
@@ -1912,9 +1916,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
     helper_float_check_status(env);                                           \
 }
 
-VSX_DIV(xsdivdp, 1, float64, f64, 1)
-VSX_DIV(xvdivdp, 2, float64, f64, 0)
-VSX_DIV(xvdivsp, 4, float32, f32, 0)
+VSX_DIV(xsdivdp, 1, float64, f64, 1, 0)
+VSX_DIV(xsdivsp, 1, float64, f64, 1, 1)
+VSX_DIV(xvdivdp, 2, float64, f64, 0, 0)
+VSX_DIV(xvdivsp, 4, float32, f32, 0, 0)
 
 /* VSX_RE  - VSX floating point reciprocal estimate
  *   op    - instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 0ccdc96..308f97c 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -289,6 +289,7 @@ DEF_HELPER_2(xsrdpiz, void, env, i32)
 DEF_HELPER_2(xsaddsp, void, env, i32)
 DEF_HELPER_2(xssubsp, void, env, i32)
 DEF_HELPER_2(xsmulsp, void, env, i32)
+DEF_HELPER_2(xsdivsp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 3a6a94b..eddb9d2 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7361,6 +7361,7 @@ GEN_VSX_HELPER_2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10171,6 +10172,7 @@ GEN_XX2FORM(xsrdpiz, 0x12, 0x05, PPC2_VSX),
 GEN_XX3FORM(xsaddsp, 0x00, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
 GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
+GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 09/18] target-ppc: VSX Stage 4: Add xsresp
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (7 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 08/18] target-ppc: VSX Stage 4: Add xsdivsp Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 10/18] target-ppc: VSX Stage 4: Add xssqrtsp Tom Musta
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Scalar Reciprocal Estimate Single Precision
(xsresp) instruction.

The existing VSX_RE macro is modified to support rounding of the
intermediate double precision result to single precision.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V2: Updated conversion to single precision range.

 target-ppc/fpu_helper.c |   14 ++++++++++----
 target-ppc/helper.h     |    1 +
 target-ppc/translate.c  |    2 ++
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 49cf09a..ac52c23 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1928,7 +1928,7 @@ VSX_DIV(xvdivsp, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_RE(op, nels, tp, fld, sfprf)                                      \
+#define VSX_RE(op, nels, tp, fld, sfprf, r2sp)                                \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
 {                                                                             \
     ppc_vsr_t xt, xb;                                                         \
@@ -1943,6 +1943,11 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
                 fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);    \
         }                                                                     \
         xt.fld[i] = tp##_div(tp##_one, xb.fld[i], &env->fp_status);           \
+                                                                              \
+        if (r2sp) {                                                           \
+            xt.fld[i] = helper_frsp(env, xt.fld[i]);                          \
+        }                                                                     \
+                                                                              \
         if (sfprf) {                                                          \
             helper_compute_fprf(env, xt.fld[0], sfprf);                       \
         }                                                                     \
@@ -1952,9 +1957,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
     helper_float_check_status(env);                                           \
 }
 
-VSX_RE(xsredp, 1, float64, f64, 1)
-VSX_RE(xvredp, 2, float64, f64, 0)
-VSX_RE(xvresp, 4, float32, f32, 0)
+VSX_RE(xsredp, 1, float64, f64, 1, 0)
+VSX_RE(xsresp, 1, float64, f64, 1, 1)
+VSX_RE(xvredp, 2, float64, f64, 0, 0)
+VSX_RE(xvresp, 4, float32, f32, 0, 0)
 
 /* VSX_SQRT - VSX floating point square root
  *   op    - instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 308f97c..b1cf3c0 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -290,6 +290,7 @@ DEF_HELPER_2(xsaddsp, void, env, i32)
 DEF_HELPER_2(xssubsp, void, env, i32)
 DEF_HELPER_2(xsmulsp, void, env, i32)
 DEF_HELPER_2(xsdivsp, void, env, i32)
+DEF_HELPER_2(xsresp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index eddb9d2..3108a29 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7362,6 +7362,7 @@ GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10173,6 +10174,7 @@ GEN_XX3FORM(xsaddsp, 0x00, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
 GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
+GEN_XX2FORM(xsresp,  0x14, 0x01, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 10/18] target-ppc: VSX Stage 4: Add xssqrtsp
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (8 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 09/18] target-ppc: VSX Stage 4: Add xsresp Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 11/18] target-ppc: VSX Stage 4: add xsrsqrtesp Tom Musta
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Scalar Square Root Single Precision (xssqrtsp)
instruction.

The existing VSX_SQRT() macro is modified to support rounding of the
intermediate double-precision result to single-precision.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V2: Updated conversion to single precision range.

 target-ppc/fpu_helper.c |   13 +++++++++----
 target-ppc/helper.h     |    1 +
 target-ppc/translate.c  |    2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index ac52c23..fec9d1b 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1969,7 +1969,7 @@ VSX_RE(xvresp, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_SQRT(op, nels, tp, fld, sfprf)                                   \
+#define VSX_SQRT(op, nels, tp, fld, sfprf, r2sp)                             \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
 {                                                                            \
     ppc_vsr_t xt, xb;                                                        \
@@ -1993,6 +1993,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
             }                                                                \
         }                                                                    \
                                                                              \
+        if (r2sp) {                                                          \
+            xt.fld[i] = helper_frsp(env, xt.fld[i]);                         \
+        }                                                                    \
+                                                                             \
         if (sfprf) {                                                         \
             helper_compute_fprf(env, xt.fld[i], sfprf);                      \
         }                                                                    \
@@ -2002,9 +2006,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
     helper_float_check_status(env);                                          \
 }
 
-VSX_SQRT(xssqrtdp, 1, float64, f64, 1)
-VSX_SQRT(xvsqrtdp, 2, float64, f64, 0)
-VSX_SQRT(xvsqrtsp, 4, float32, f32, 0)
+VSX_SQRT(xssqrtdp, 1, float64, f64, 1, 0)
+VSX_SQRT(xssqrtsp, 1, float64, f64, 1, 1)
+VSX_SQRT(xvsqrtdp, 2, float64, f64, 0, 0)
+VSX_SQRT(xvsqrtsp, 4, float32, f32, 0, 0)
 
 /* VSX_RSQRTE - VSX floating point reciprocal square root estimate
  *   op    - instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index b1cf3c0..0192043 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -291,6 +291,7 @@ DEF_HELPER_2(xssubsp, void, env, i32)
 DEF_HELPER_2(xsmulsp, void, env, i32)
 DEF_HELPER_2(xsdivsp, void, env, i32)
 DEF_HELPER_2(xsresp, void, env, i32)
+DEF_HELPER_2(xssqrtsp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 3108a29..f4c1f42 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7363,6 +7363,7 @@ GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10175,6 +10176,7 @@ GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
 GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
 GEN_XX2FORM(xsresp,  0x14, 0x01, PPC2_VSX207),
+GEN_XX2FORM(xssqrtsp,  0x16, 0x00, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 11/18] target-ppc: VSX Stage 4: add xsrsqrtesp
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (9 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 10/18] target-ppc: VSX Stage 4: Add xssqrtsp Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 12/18] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds Tom Musta
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Scalar Reciprocal Square Root Estimate
Single Precision (xsrsqrtesp) instruction.

The existing VSX_RSQRTE() macro is modified to support rounding
of the intermediate double-precision result to single precision.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V2: Updated conversion to single precision range.

 target-ppc/fpu_helper.c |   13 +++++++++----
 target-ppc/helper.h     |    1 +
 target-ppc/translate.c  |    2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index fec9d1b..33da462 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2018,7 +2018,7 @@ VSX_SQRT(xvsqrtsp, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_RSQRTE(op, nels, tp, fld, sfprf)                                 \
+#define VSX_RSQRTE(op, nels, tp, fld, sfprf, r2sp)                           \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
 {                                                                            \
     ppc_vsr_t xt, xb;                                                        \
@@ -2043,6 +2043,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
             }                                                                \
         }                                                                    \
                                                                              \
+        if (r2sp) {                                                          \
+            xt.fld[i] = helper_frsp(env, xt.fld[i]);                         \
+        }                                                                    \
+                                                                             \
         if (sfprf) {                                                         \
             helper_compute_fprf(env, xt.fld[i], sfprf);                      \
         }                                                                    \
@@ -2052,9 +2056,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
     helper_float_check_status(env);                                          \
 }
 
-VSX_RSQRTE(xsrsqrtedp, 1, float64, f64, 1)
-VSX_RSQRTE(xvrsqrtedp, 2, float64, f64, 0)
-VSX_RSQRTE(xvrsqrtesp, 4, float32, f32, 0)
+VSX_RSQRTE(xsrsqrtedp, 1, float64, f64, 1, 0)
+VSX_RSQRTE(xsrsqrtesp, 1, float64, f64, 1, 1)
+VSX_RSQRTE(xvrsqrtedp, 2, float64, f64, 0, 0)
+VSX_RSQRTE(xvrsqrtesp, 4, float32, f32, 0, 0)
 
 static inline int ppc_float32_get_unbiased_exp(float32 f)
 {
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 0192043..84c6ee1 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -292,6 +292,7 @@ DEF_HELPER_2(xsmulsp, void, env, i32)
 DEF_HELPER_2(xsdivsp, void, env, i32)
 DEF_HELPER_2(xsresp, void, env, i32)
 DEF_HELPER_2(xssqrtsp, void, env, i32)
+DEF_HELPER_2(xsrsqrtesp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index f4c1f42..950c02e 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7364,6 +7364,7 @@ GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10177,6 +10178,7 @@ GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
 GEN_XX2FORM(xsresp,  0x14, 0x01, PPC2_VSX207),
 GEN_XX2FORM(xssqrtsp,  0x16, 0x00, PPC2_VSX207),
+GEN_XX2FORM(xsrsqrtesp,  0x14, 0x00, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 12/18] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (10 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 11/18] target-ppc: VSX Stage 4: add xsrsqrtesp Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 13/18] target-ppc: VSX Stage 4: Add xscvsxdsp and xscvuxdsp Tom Musta
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Single Precision VSX Scalar Fused Multiply-Add
instructions: xsmaddasp, xsmaddmsp, xssubasp, xssubmsp, xsnmaddasp,
xsnmaddmsp, xsnmsubasp, xsnmsubmsp.

The existing VSX_MADD() macro is modified to support rounding of the
intermediate double precision result to single precision.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
V2: Re-implemented per feedback from Richard Henderson.  In order to
avoid double rounding and incorrect results, the operands must be
converted to true single precision values and use the single precision
fused multiply/add routine.

V3: Re-implemented per feedback from Richard Henderson (I did not
fully understand his comment when I implemented V2).

V4: Changed to use helper_frsp (inadvertently re-injected when I used
an earlier patch).  Thanks to Richard Henderson for catching this.

 target-ppc/fpu_helper.c |   82 ++++++++++++++++++++++++++++++----------------
 target-ppc/helper.h     |    8 ++++
 target-ppc/translate.c  |   16 +++++++++
 3 files changed, 77 insertions(+), 29 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 33da462..7e5003a 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2192,7 +2192,7 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, f32, -126, 23)
  *   afrm  - A form (1=A, 0=M)
  *   sfprf - set FPRF
  */
-#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf)                    \
+#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
 {                                                                             \
     ppc_vsr_t xt_in, xa, xb, xt_out;                                          \
@@ -2218,8 +2218,18 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
     for (i = 0; i < nels; i++) {                                              \
         float_status tstat = env->fp_status;                                  \
         set_float_exception_flags(0, &tstat);                                 \
-        xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],          \
-                                     maddflgs, &tstat);                       \
+        if (r2sp && (tstat.float_rounding_mode == float_round_nearest_even)) {\
+            /* Avoid double rounding errors by rounding the intermediate */   \
+            /* result to odd.                                            */   \
+            set_float_rounding_mode(float_round_to_zero, &tstat);             \
+            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],      \
+                                       maddflgs, &tstat);                     \
+            xt_out.fld[i] |= (get_float_exception_flags(&tstat) &             \
+                              float_flag_inexact) != 0;                       \
+        } else {                                                              \
+            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],      \
+                                        maddflgs, &tstat);                    \
+        }                                                                     \
         env->fp_status.float_exception_flags |= tstat.float_exception_flags;  \
                                                                               \
         if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {     \
@@ -2242,6 +2252,11 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
                 fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, sfprf);     \
             }                                                                 \
         }                                                                     \
+                                                                              \
+        if (r2sp) {                                                           \
+            xt_out.fld[i] = helper_frsp(env, xt_out.fld[i]);                  \
+        }                                                                     \
+                                                                              \
         if (sfprf) {                                                          \
             helper_compute_fprf(env, xt_out.fld[i], sfprf);                   \
         }                                                                     \
@@ -2255,32 +2270,41 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
 #define NMADD_FLGS float_muladd_negate_result
 #define NMSUB_FLGS (float_muladd_negate_c | float_muladd_negate_result)
 
-VSX_MADD(xsmaddadp, 1, float64, f64, MADD_FLGS, 1, 1)
-VSX_MADD(xsmaddmdp, 1, float64, f64, MADD_FLGS, 0, 1)
-VSX_MADD(xsmsubadp, 1, float64, f64, MSUB_FLGS, 1, 1)
-VSX_MADD(xsmsubmdp, 1, float64, f64, MSUB_FLGS, 0, 1)
-VSX_MADD(xsnmaddadp, 1, float64, f64, NMADD_FLGS, 1, 1)
-VSX_MADD(xsnmaddmdp, 1, float64, f64, NMADD_FLGS, 0, 1)
-VSX_MADD(xsnmsubadp, 1, float64, f64, NMSUB_FLGS, 1, 1)
-VSX_MADD(xsnmsubmdp, 1, float64, f64, NMSUB_FLGS, 0, 1)
-
-VSX_MADD(xvmaddadp, 2, float64, f64, MADD_FLGS, 1, 0)
-VSX_MADD(xvmaddmdp, 2, float64, f64, MADD_FLGS, 0, 0)
-VSX_MADD(xvmsubadp, 2, float64, f64, MSUB_FLGS, 1, 0)
-VSX_MADD(xvmsubmdp, 2, float64, f64, MSUB_FLGS, 0, 0)
-VSX_MADD(xvnmaddadp, 2, float64, f64, NMADD_FLGS, 1, 0)
-VSX_MADD(xvnmaddmdp, 2, float64, f64, NMADD_FLGS, 0, 0)
-VSX_MADD(xvnmsubadp, 2, float64, f64, NMSUB_FLGS, 1, 0)
-VSX_MADD(xvnmsubmdp, 2, float64, f64, NMSUB_FLGS, 0, 0)
-
-VSX_MADD(xvmaddasp, 4, float32, f32, MADD_FLGS, 1, 0)
-VSX_MADD(xvmaddmsp, 4, float32, f32, MADD_FLGS, 0, 0)
-VSX_MADD(xvmsubasp, 4, float32, f32, MSUB_FLGS, 1, 0)
-VSX_MADD(xvmsubmsp, 4, float32, f32, MSUB_FLGS, 0, 0)
-VSX_MADD(xvnmaddasp, 4, float32, f32, NMADD_FLGS, 1, 0)
-VSX_MADD(xvnmaddmsp, 4, float32, f32, NMADD_FLGS, 0, 0)
-VSX_MADD(xvnmsubasp, 4, float32, f32, NMSUB_FLGS, 1, 0)
-VSX_MADD(xvnmsubmsp, 4, float32, f32, NMSUB_FLGS, 0, 0)
+VSX_MADD(xsmaddadp, 1, float64, f64, MADD_FLGS, 1, 1, 0)
+VSX_MADD(xsmaddmdp, 1, float64, f64, MADD_FLGS, 0, 1, 0)
+VSX_MADD(xsmsubadp, 1, float64, f64, MSUB_FLGS, 1, 1, 0)
+VSX_MADD(xsmsubmdp, 1, float64, f64, MSUB_FLGS, 0, 1, 0)
+VSX_MADD(xsnmaddadp, 1, float64, f64, NMADD_FLGS, 1, 1, 0)
+VSX_MADD(xsnmaddmdp, 1, float64, f64, NMADD_FLGS, 0, 1, 0)
+VSX_MADD(xsnmsubadp, 1, float64, f64, NMSUB_FLGS, 1, 1, 0)
+VSX_MADD(xsnmsubmdp, 1, float64, f64, NMSUB_FLGS, 0, 1, 0)
+
+VSX_MADD(xsmaddasp, 1, float64, f64, MADD_FLGS, 1, 1, 1)
+VSX_MADD(xsmaddmsp, 1, float64, f64, MADD_FLGS, 0, 1, 1)
+VSX_MADD(xsmsubasp, 1, float64, f64, MSUB_FLGS, 1, 1, 1)
+VSX_MADD(xsmsubmsp, 1, float64, f64, MSUB_FLGS, 0, 1, 1)
+VSX_MADD(xsnmaddasp, 1, float64, f64, NMADD_FLGS, 1, 1, 1)
+VSX_MADD(xsnmaddmsp, 1, float64, f64, NMADD_FLGS, 0, 1, 1)
+VSX_MADD(xsnmsubasp, 1, float64, f64, NMSUB_FLGS, 1, 1, 1)
+VSX_MADD(xsnmsubmsp, 1, float64, f64, NMSUB_FLGS, 0, 1, 1)
+
+VSX_MADD(xvmaddadp, 2, float64, f64, MADD_FLGS, 1, 0, 0)
+VSX_MADD(xvmaddmdp, 2, float64, f64, MADD_FLGS, 0, 0, 0)
+VSX_MADD(xvmsubadp, 2, float64, f64, MSUB_FLGS, 1, 0, 0)
+VSX_MADD(xvmsubmdp, 2, float64, f64, MSUB_FLGS, 0, 0, 0)
+VSX_MADD(xvnmaddadp, 2, float64, f64, NMADD_FLGS, 1, 0, 0)
+VSX_MADD(xvnmaddmdp, 2, float64, f64, NMADD_FLGS, 0, 0, 0)
+VSX_MADD(xvnmsubadp, 2, float64, f64, NMSUB_FLGS, 1, 0, 0)
+VSX_MADD(xvnmsubmdp, 2, float64, f64, NMSUB_FLGS, 0, 0, 0)
+
+VSX_MADD(xvmaddasp, 4, float32, f32, MADD_FLGS, 1, 0, 0)
+VSX_MADD(xvmaddmsp, 4, float32, f32, MADD_FLGS, 0, 0, 0)
+VSX_MADD(xvmsubasp, 4, float32, f32, MSUB_FLGS, 1, 0, 0)
+VSX_MADD(xvmsubmsp, 4, float32, f32, MSUB_FLGS, 0, 0, 0)
+VSX_MADD(xvnmaddasp, 4, float32, f32, NMADD_FLGS, 1, 0, 0)
+VSX_MADD(xvnmaddmsp, 4, float32, f32, NMADD_FLGS, 0, 0, 0)
+VSX_MADD(xvnmsubasp, 4, float32, f32, NMSUB_FLGS, 1, 0, 0)
+VSX_MADD(xvnmsubmsp, 4, float32, f32, NMSUB_FLGS, 0, 0, 0)
 
 #define VSX_SCALAR_CMP(op, ordered)                                      \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 84c6ee1..655b670 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -293,6 +293,14 @@ DEF_HELPER_2(xsdivsp, void, env, i32)
 DEF_HELPER_2(xsresp, void, env, i32)
 DEF_HELPER_2(xssqrtsp, void, env, i32)
 DEF_HELPER_2(xsrsqrtesp, void, env, i32)
+DEF_HELPER_2(xsmaddasp, void, env, i32)
+DEF_HELPER_2(xsmaddmsp, void, env, i32)
+DEF_HELPER_2(xsmsubasp, void, env, i32)
+DEF_HELPER_2(xsmsubmsp, void, env, i32)
+DEF_HELPER_2(xsnmaddasp, void, env, i32)
+DEF_HELPER_2(xsnmaddmsp, void, env, i32)
+DEF_HELPER_2(xsnmsubasp, void, env, i32)
+DEF_HELPER_2(xsnmsubmsp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 950c02e..5b68c7e 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7365,6 +7365,14 @@ GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmaddmsp, 0x04, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmsubasp, 0x04, 0x02, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmsubmsp, 0x04, 0x03, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10179,6 +10187,14 @@ GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
 GEN_XX2FORM(xsresp,  0x14, 0x01, PPC2_VSX207),
 GEN_XX2FORM(xssqrtsp,  0x16, 0x00, PPC2_VSX207),
 GEN_XX2FORM(xsrsqrtesp,  0x14, 0x00, PPC2_VSX207),
+GEN_XX3FORM(xsmaddasp, 0x04, 0x00, PPC2_VSX207),
+GEN_XX3FORM(xsmaddmsp, 0x04, 0x01, PPC2_VSX207),
+GEN_XX3FORM(xsmsubasp, 0x04, 0x02, PPC2_VSX207),
+GEN_XX3FORM(xsmsubmsp, 0x04, 0x03, PPC2_VSX207),
+GEN_XX3FORM(xsnmaddasp, 0x04, 0x10, PPC2_VSX207),
+GEN_XX3FORM(xsnmaddmsp, 0x04, 0x11, PPC2_VSX207),
+GEN_XX3FORM(xsnmsubasp, 0x04, 0x12, PPC2_VSX207),
+GEN_XX3FORM(xsnmsubmsp, 0x04, 0x13, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 13/18] target-ppc: VSX Stage 4: Add xscvsxdsp and xscvuxdsp
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (11 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 12/18] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 14/18] target-ppc: VSX Stage 4: Add xxleqv, xxlnand and xxlorc Tom Musta
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Scalar Convert Unsigned Integer Doubleword
to Floating Point Format and Round to Single Precision (xscvuxdsp)
and VSX Scalar Convert Signed Integer Douglbeword to Floating Point
Format and Round to Single Precision (xscvsxdsp) instructions.

The existing integer to floating point conversion macro (VSX_CVT_INT_TO_FP)
is modified to support the rounding of the intermediate floating point
result to single precision.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V2: updated conversion to single precision range.

 target-ppc/fpu_helper.c |   27 ++++++++++++++++-----------
 target-ppc/helper.h     |    2 ++
 target-ppc/translate.c  |    4 ++++
 3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 7e5003a..1dfb3c0 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2558,7 +2558,7 @@ VSX_CVT_FP_TO_INT(xvcvspuxws, 4, float32, uint32, f32[j], u32[i], i, 0)
  *   jdef  - definition of the j index (i or 2*i)
  *   sfprf - set FPRF
  */
-#define VSX_CVT_INT_TO_FP(op, nels, stp, ttp, sfld, tfld, jdef, sfprf)  \
+#define VSX_CVT_INT_TO_FP(op, nels, stp, ttp, sfld, tfld, jdef, sfprf, r2sp) \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
 {                                                                       \
     ppc_vsr_t xt, xb;                                                   \
@@ -2570,6 +2570,9 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
     for (i = 0; i < nels; i++) {                                        \
         int j = jdef;                                                   \
         xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);             \
+        if (r2sp) {                                                     \
+            xt.tfld = helper_frsp(env, xt.tfld);                        \
+        }                                                               \
         if (sfprf) {                                                    \
             helper_compute_fprf(env, xt.tfld, sfprf);                   \
         }                                                               \
@@ -2579,20 +2582,22 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
     helper_float_check_status(env);                                     \
 }
 
-VSX_CVT_INT_TO_FP(xscvsxddp, 1, int64, float64, u64[j], f64[i], i, 1)
-VSX_CVT_INT_TO_FP(xscvuxddp, 1, uint64, float64, u64[j], f64[i], i, 1)
-VSX_CVT_INT_TO_FP(xvcvsxddp, 2, int64, float64, u64[j], f64[i], i, 0)
-VSX_CVT_INT_TO_FP(xvcvuxddp, 2, uint64, float64, u64[j], f64[i], i, 0)
+VSX_CVT_INT_TO_FP(xscvsxddp, 1, int64, float64, u64[j], f64[i], i, 1, 0)
+VSX_CVT_INT_TO_FP(xscvuxddp, 1, uint64, float64, u64[j], f64[i], i, 1, 0)
+VSX_CVT_INT_TO_FP(xscvsxdsp, 1, int64, float64, u64[j], f64[i], i, 1, 1)
+VSX_CVT_INT_TO_FP(xscvuxdsp, 1, uint64, float64, u64[j], f64[i], i, 1, 1)
+VSX_CVT_INT_TO_FP(xvcvsxddp, 2, int64, float64, u64[j], f64[i], i, 0, 0)
+VSX_CVT_INT_TO_FP(xvcvuxddp, 2, uint64, float64, u64[j], f64[i], i, 0, 0)
 VSX_CVT_INT_TO_FP(xvcvsxwdp, 2, int32, float64, u32[j], f64[i], \
-                  2*i + JOFFSET, 0)
+                  2*i + JOFFSET, 0, 0)
 VSX_CVT_INT_TO_FP(xvcvuxwdp, 2, uint64, float64, u32[j], f64[i], \
-                  2*i + JOFFSET, 0)
+                  2*i + JOFFSET, 0, 0)
 VSX_CVT_INT_TO_FP(xvcvsxdsp, 2, int64, float32, u64[i], f32[j], \
-                  2*i + JOFFSET, 0)
+                  2*i + JOFFSET, 0, 0)
 VSX_CVT_INT_TO_FP(xvcvuxdsp, 2, uint64, float32, u64[i], f32[j], \
-                  2*i + JOFFSET, 0)
-VSX_CVT_INT_TO_FP(xvcvsxwsp, 4, int32, float32, u32[j], f32[i], i, 0)
-VSX_CVT_INT_TO_FP(xvcvuxwsp, 4, uint32, float32, u32[j], f32[i], i, 0)
+                  2*i + JOFFSET, 0, 0)
+VSX_CVT_INT_TO_FP(xvcvsxwsp, 4, int32, float32, u32[j], f32[i], i, 0, 0)
+VSX_CVT_INT_TO_FP(xvcvuxwsp, 4, uint32, float32, u32[j], f32[i], i, 0, 0)
 
 /* For "use current rounding mode", define a value that will not be one of
  * the existing rounding model enums.
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 655b670..6250eba 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -279,6 +279,8 @@ DEF_HELPER_2(xscvdpsxws, void, env, i32)
 DEF_HELPER_2(xscvdpuxds, void, env, i32)
 DEF_HELPER_2(xscvdpuxws, void, env, i32)
 DEF_HELPER_2(xscvsxddp, void, env, i32)
+DEF_HELPER_2(xscvuxdsp, void, env, i32)
+DEF_HELPER_2(xscvsxdsp, void, env, i32)
 DEF_HELPER_2(xscvuxddp, void, env, i32)
 DEF_HELPER_2(xsrdpi, void, env, i32)
 DEF_HELPER_2(xsrdpic, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 5b68c7e..7659085 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7373,6 +7373,8 @@ GEN_VSX_HELPER_2(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10195,6 +10197,8 @@ GEN_XX3FORM(xsnmaddasp, 0x04, 0x10, PPC2_VSX207),
 GEN_XX3FORM(xsnmaddmsp, 0x04, 0x11, PPC2_VSX207),
 GEN_XX3FORM(xsnmsubasp, 0x04, 0x12, PPC2_VSX207),
 GEN_XX3FORM(xsnmsubmsp, 0x04, 0x13, PPC2_VSX207),
+GEN_XX2FORM(xscvsxdsp, 0x10, 0x13, PPC2_VSX207),
+GEN_XX2FORM(xscvuxdsp, 0x10, 0x12, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 14/18] target-ppc: VSX Stage 4: Add xxleqv, xxlnand and xxlorc
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (12 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 13/18] target-ppc: VSX Stage 4: Add xscvsxdsp and xscvuxdsp Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 15/18] target-ppc: Move To/From VSR Instructions Tom Musta
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patchs adds the VSX Logical instructions that are new with
ISA V2.07:

  - VSX Logical Equivalence (xxleqv)
  - VSX Logical NAND (xxlnand)
  - VSX Logical ORC (xxlorc)

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
V5: Changes to address tcg-debug compilation errors.

 target-ppc/translate.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 7659085..e2dd272 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7468,6 +7468,9 @@ VSX_LOGICAL(xxlandc, tcg_gen_andc_i64)
 VSX_LOGICAL(xxlor, tcg_gen_or_i64)
 VSX_LOGICAL(xxlxor, tcg_gen_xor_i64)
 VSX_LOGICAL(xxlnor, tcg_gen_nor_i64)
+VSX_LOGICAL(xxleqv, tcg_gen_eqv_i64)
+VSX_LOGICAL(xxlnand, tcg_gen_nand_i64)
+VSX_LOGICAL(xxlorc, tcg_gen_orc_i64)
 
 #define VSX_XXMRG(name, high)                               \
 static void glue(gen_, name)(DisasContext * ctx)            \
@@ -10283,6 +10286,9 @@ VSX_LOGICAL(xxlandc, 0x8, 0x11, PPC2_VSX),
 VSX_LOGICAL(xxlor, 0x8, 0x12, PPC2_VSX),
 VSX_LOGICAL(xxlxor, 0x8, 0x13, PPC2_VSX),
 VSX_LOGICAL(xxlnor, 0x8, 0x14, PPC2_VSX),
+VSX_LOGICAL(xxleqv, 0x8, 0x17, PPC2_VSX207),
+VSX_LOGICAL(xxlnand, 0x8, 0x16, PPC2_VSX207),
+VSX_LOGICAL(xxlorc, 0x8, 0x15, PPC2_VSX207),
 GEN_XX3FORM(xxmrghw, 0x08, 0x02, PPC2_VSX),
 GEN_XX3FORM(xxmrglw, 0x08, 0x06, PPC2_VSX),
 GEN_XX2FORM(xxspltw, 0x08, 0x0A, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 15/18] target-ppc: Move To/From VSR Instructions
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (13 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 14/18] target-ppc: VSX Stage 4: Add xxleqv, xxlnand and xxlorc Tom Musta
@ 2014-01-10 19:07 ` Tom Musta
  2014-01-10 21:29   ` Richard Henderson
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 16/18] target-ppc: Floating Merge Word Instructions Tom Musta
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Move To VSR instructions (mfvsrd, mfvsrwz)
and Move From VSR instructions (mtvsrd, mtvsrwa, mtvsrwz).  These
instructions are unusual in that they are considered a floating
point instruction if the indexed VSR is in the first half of the
array (0-31) but they are considered vector instructions if the
indexed VSR is in the second half of the array (32-63).

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V6: New.

 target-ppc/translate.c |   42 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index e2dd272..ec945a2 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7175,6 +7175,40 @@ static void gen_stxvw4x(DisasContext *ctx)
     tcg_temp_free_i64(tmp);
 }
 
+#define MV_VSR(name, tcgop1, tcgop2, target, source)            \
+static void gen_##name(DisasContext *ctx)                       \
+{                                                               \
+    if (xS(ctx->opcode) < 32) {                                 \
+        if (unlikely(!ctx->fpu_enabled)) {                      \
+            gen_exception(ctx, POWERPC_EXCP_FPU);               \
+            return;                                             \
+        }                                                       \
+    } else {                                                    \
+        if (unlikely(!ctx->altivec_enabled)) {                  \
+            gen_exception(ctx, POWERPC_EXCP_VPU);               \
+            return;                                             \
+        }                                                       \
+    }                                                           \
+    TCGv_i64 tmp = tcg_temp_new_i64();                          \
+    tcg_gen_##tcgop1(tmp, source);                              \
+    tcg_gen_##tcgop2(target, tmp);                              \
+    tcg_temp_free_i64(tmp);                                     \
+}
+
+
+MV_VSR(mfvsrwz, ext32u_i64, trunc_i64_tl, cpu_gpr[rA(ctx->opcode)], \
+       cpu_vsrh(xS(ctx->opcode)))
+MV_VSR(mtvsrwa, extu_tl_i64, ext32s_i64, cpu_vsrh(xT(ctx->opcode)), \
+       cpu_gpr[rA(ctx->opcode)])
+MV_VSR(mtvsrwz, extu_tl_i64, ext32u_i64, cpu_vsrh(xT(ctx->opcode)), \
+       cpu_gpr[rA(ctx->opcode)])
+#if defined(TARGET_PPC64)
+MV_VSR(mfvsrd, mov_i64, mov_i64, cpu_gpr[rA(ctx->opcode)], \
+       cpu_vsrh(xS(ctx->opcode)))
+MV_VSR(mtvsrd, mov_i64, mov_i64, cpu_vsrh(xT(ctx->opcode)), \
+       cpu_gpr[rA(ctx->opcode)])
+#endif
+
 static void gen_xxpermdi(DisasContext *ctx)
 {
     if (unlikely(!ctx->vsx_enabled)) {
@@ -10094,6 +10128,14 @@ GEN_HANDLER_E(stxsspx, 0x1F, 0xC, 0x14, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX),
 
+GEN_HANDLER_E(mfvsrwz, 0x1F, 0x13, 0x03, 0x0000F800, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(mtvsrwa, 0x1F, 0x13, 0x06, 0x0000F800, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(mtvsrwz, 0x1F, 0x13, 0x07, 0x0000F800, PPC_NONE, PPC2_VSX207),
+#if defined(TARGET_PPC64)
+GEN_HANDLER_E(mfvsrd, 0x1F, 0x13, 0x01, 0x0000F800, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(mtvsrd, 0x1F, 0x13, 0x05, 0x0000F800, PPC_NONE, PPC2_VSX207),
+#endif
+
 #undef GEN_XX2FORM
 #define GEN_XX2FORM(name, opc2, opc3, fl2)                           \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0, PPC_NONE, fl2), \
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 16/18] target-ppc: Floating Merge Word Instructions
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (14 preceding siblings ...)
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 15/18] target-ppc: Move To/From VSR Instructions Tom Musta
@ 2014-01-10 19:08 ` Tom Musta
  2014-01-10 21:34   ` Richard Henderson
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 17/18] target-ppc: Scalar Round to Single Precision Tom Musta
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 18/18] target-ppc: Scalar Non-Signalling Conversions Tom Musta
  17 siblings, 1 reply; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Floating Merge Even Word (fmrgew) and Floating
Merge Odd Word (fmrgow) instructions.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V6: New.

 target-ppc/translate.c |   31 +++++++++++++++++++++++++++++++
 1 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index ec945a2..7f2a66f 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2294,6 +2294,35 @@ static void gen_fcpsgn(DisasContext *ctx)
     gen_compute_fprf(cpu_fpr[rD(ctx->opcode)], 0, Rc(ctx->opcode) != 0);
 }
 
+static void gen_fmrgew(DisasContext *ctx)
+{
+    TCGv_i64 b0;
+    if (unlikely(!ctx->fpu_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_FPU);
+        return;
+    }
+    b0 = tcg_temp_new_i64();
+    tcg_gen_shri_i64(b0, cpu_fpr[rB(ctx->opcode)], 32);
+    tcg_gen_deposit_i64(cpu_fpr[rD(ctx->opcode)], cpu_fpr[rA(ctx->opcode)],
+                        b0, 0, 32);
+    tcg_temp_free_i64(b0);
+}
+
+static void gen_fmrgow(DisasContext *ctx)
+{
+    TCGv_i64 a1;
+    if (unlikely(!ctx->fpu_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_FPU);
+        return;
+    }
+    a1 = tcg_temp_new_i64();
+    tcg_gen_shli_i64(a1, cpu_fpr[rA(ctx->opcode)], 32);
+    tcg_gen_deposit_i64(cpu_fpr[rD(ctx->opcode)],
+                        a1, cpu_fpr[rB(ctx->opcode)],
+                        0, 32);
+    tcg_temp_free_i64(a1);
+}
+
 /***                  Floating-Point status & ctrl register                ***/
 
 /* mcrfs */
@@ -9397,6 +9426,8 @@ GEN_HANDLER(fmr, 0x3F, 0x08, 0x02, 0x001F0000, PPC_FLOAT),
 GEN_HANDLER(fnabs, 0x3F, 0x08, 0x04, 0x001F0000, PPC_FLOAT),
 GEN_HANDLER(fneg, 0x3F, 0x08, 0x01, 0x001F0000, PPC_FLOAT),
 GEN_HANDLER_E(fcpsgn, 0x3F, 0x08, 0x00, 0x00000000, PPC_NONE, PPC2_ISA205),
+GEN_HANDLER_E(fmrgew, 0x3F, 0x06, 0x1E, 0x00000001, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(fmrgow, 0x3F, 0x06, 0x1A, 0x00000001, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER(mcrfs, 0x3F, 0x00, 0x02, 0x0063F801, PPC_FLOAT),
 GEN_HANDLER(mffs, 0x3F, 0x07, 0x12, 0x001FF800, PPC_FLOAT),
 GEN_HANDLER(mtfsb0, 0x3F, 0x06, 0x02, 0x001FF800, PPC_FLOAT),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 17/18] target-ppc: Scalar Round to Single Precision
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (15 preceding siblings ...)
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 16/18] target-ppc: Floating Merge Word Instructions Tom Musta
@ 2014-01-10 19:08 ` Tom Musta
  2014-01-10 21:40   ` Richard Henderson
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 18/18] target-ppc: Scalar Non-Signalling Conversions Tom Musta
  17 siblings, 1 reply; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Scalar Round to Single Precision (xsrsp)
instruction.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V6: New.

 target-ppc/fpu_helper.c |   17 +++++++++++++++++
 target-ppc/helper.h     |    1 +
 target-ppc/translate.c  |    2 ++
 3 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 1dfb3c0..c1524e3 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2666,3 +2666,20 @@ VSX_ROUND(xvrspic, 4, float32, f32, FLOAT_ROUND_CURRENT, 0)
 VSX_ROUND(xvrspim, 4, float32, f32, float_round_down, 0)
 VSX_ROUND(xvrspip, 4, float32, f32, float_round_up, 0)
 VSX_ROUND(xvrspiz, 4, float32, f32, float_round_to_zero, 0)
+
+void helper_xsrsp(CPUPPCState *env, uint32_t opcode)
+{
+    ppc_vsr_t xt, xb;
+
+    getVSR(xB(opcode), &xb, env);
+    getVSR(xT(opcode), &xt, env);
+
+    helper_reset_fpstatus(env);
+
+    xt.f64[0] = helper_frsp(env, xb.f64[0]);
+
+    helper_compute_fprf(env, xt.f64[0], 1);
+
+    putVSR(xT(opcode), &xt, env);
+    helper_float_check_status(env);
+}
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 6250eba..300e194 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -293,6 +293,7 @@ DEF_HELPER_2(xssubsp, void, env, i32)
 DEF_HELPER_2(xsmulsp, void, env, i32)
 DEF_HELPER_2(xsdivsp, void, env, i32)
 DEF_HELPER_2(xsresp, void, env, i32)
+DEF_HELPER_2(xsrsp, void, env, i32)
 DEF_HELPER_2(xssqrtsp, void, env, i32)
 DEF_HELPER_2(xsrsqrtesp, void, env, i32)
 DEF_HELPER_2(xsmaddasp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 7f2a66f..48b93c8 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7426,6 +7426,7 @@ GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsrsp, 0x12, 0x11, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
@@ -10263,6 +10264,7 @@ GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
 GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
 GEN_XX2FORM(xsresp,  0x14, 0x01, PPC2_VSX207),
+GEN_XX2FORM(xsrsp, 0x12, 0x11, PPC2_VSX207),
 GEN_XX2FORM(xssqrtsp,  0x16, 0x00, PPC2_VSX207),
 GEN_XX2FORM(xsrsqrtesp,  0x14, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xsmaddasp, 0x04, 0x00, PPC2_VSX207),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [V6 PATCH 18/18] target-ppc: Scalar Non-Signalling Conversions
  2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
                   ` (16 preceding siblings ...)
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 17/18] target-ppc: Scalar Round to Single Precision Tom Musta
@ 2014-01-10 19:08 ` Tom Musta
  2014-01-10 21:42   ` Richard Henderson
  17 siblings, 1 reply; 25+ messages in thread
From: Tom Musta @ 2014-01-10 19:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the non-signalling scalar conversion instructions:

  - VSX Scalar Convert Single Precision to Double Precision
    Non-Signalling (xscvspdpn)
  - VSX Scalar Convert Double Precision to Single Precision
    Non-Signalling (xscvdpspn)

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V6: New.

 target-ppc/fpu_helper.c |   19 +++++++++++++++++++
 target-ppc/helper.h     |    2 ++
 target-ppc/translate.c  |    4 ++++
 3 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index c1524e3..8bb647c 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2487,6 +2487,25 @@ VSX_CVT_FP_TO_FP(xscvspdp, 1, float32, float64, f32[j], f64[i], 1)
 VSX_CVT_FP_TO_FP(xvcvdpsp, 2, float64, float32, f64[i], f32[j], 0)
 VSX_CVT_FP_TO_FP(xvcvspdp, 2, float32, float64, f32[j], f64[i], 0)
 
+#define VSX_CVT_FP_TO_FP_NONSIG(op, stp, ttp, sfld, tfld) \
+void helper_##op(CPUPPCState *env, uint32_t opcode)       \
+{                                                         \
+    ppc_vsr_t xt, xb;                                     \
+                                                          \
+    getVSR(xB(opcode), &xb, env);                         \
+    getVSR(xT(opcode), &xt, env);                         \
+                                                          \
+    float_status tstat = env->fp_status;                  \
+    set_float_exception_flags(0, &tstat);                 \
+                                                          \
+    xt.tfld[0] = stp##_to_##ttp(xb.sfld[0], &tstat);      \
+                                                          \
+    putVSR(xT(opcode), &xt, env);                         \
+}
+
+VSX_CVT_FP_TO_FP_NONSIG(xscvdpspn, float64, float32, f64, f32)
+VSX_CVT_FP_TO_FP_NONSIG(xscvspdpn, float32, float64, f32, f64)
+
 /* VSX_CVT_FP_TO_INT - VSX floating point to integer conversion
  *   op    - instruction mnemonic
  *   nels  - number of elements (1, 2 or 4)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 300e194..753ab01 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -273,7 +273,9 @@ DEF_HELPER_2(xscmpudp, void, env, i32)
 DEF_HELPER_2(xsmaxdp, void, env, i32)
 DEF_HELPER_2(xsmindp, void, env, i32)
 DEF_HELPER_2(xscvdpsp, void, env, i32)
+DEF_HELPER_2(xscvdpspn, void, env, i32)
 DEF_HELPER_2(xscvspdp, void, env, i32)
+DEF_HELPER_2(xscvspdpn, void, env, i32)
 DEF_HELPER_2(xscvdpsxds, void, env, i32)
 DEF_HELPER_2(xscvdpsxws, void, env, i32)
 DEF_HELPER_2(xscvdpuxds, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 48b93c8..dde5a06 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7408,7 +7408,9 @@ GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvdpspn, 0x16, 0x10, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xscvspdp, 0x12, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvspdpn, 0x16, 0x14, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xscvdpsxds, 0x10, 0x15, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscvdpsxws, 0x10, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscvdpuxds, 0x10, 0x14, 0, PPC2_VSX)
@@ -10246,7 +10248,9 @@ GEN_XX2FORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
 GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
 GEN_XX3FORM(xsmindp, 0x00, 0x15, PPC2_VSX),
 GEN_XX2FORM(xscvdpsp, 0x12, 0x10, PPC2_VSX),
+GEN_XX2FORM(xscvdpspn, 0x16, 0x10, PPC2_VSX207),
 GEN_XX2FORM(xscvspdp, 0x12, 0x14, PPC2_VSX),
+GEN_XX2FORM(xscvspdpn, 0x16, 0x14, PPC2_VSX207),
 GEN_XX2FORM(xscvdpsxds, 0x10, 0x15, PPC2_VSX),
 GEN_XX2FORM(xscvdpsxws, 0x10, 0x05, PPC2_VSX),
 GEN_XX2FORM(xscvdpuxds, 0x10, 0x14, PPC2_VSX),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [V6 PATCH 15/18] target-ppc: Move To/From VSR Instructions
  2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 15/18] target-ppc: Move To/From VSR Instructions Tom Musta
@ 2014-01-10 21:29   ` Richard Henderson
  2014-01-14 14:14     ` Tom Musta
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2014-01-10 21:29 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/10/2014 11:07 AM, Tom Musta wrote:
> +#define MV_VSR(name, tcgop1, tcgop2, target, source)            \
> +static void gen_##name(DisasContext *ctx)                       \
> +{                                                               \
> +    if (xS(ctx->opcode) < 32) {                                 \
> +        if (unlikely(!ctx->fpu_enabled)) {                      \
> +            gen_exception(ctx, POWERPC_EXCP_FPU);               \
> +            return;                                             \
> +        }                                                       \
> +    } else {                                                    \
> +        if (unlikely(!ctx->altivec_enabled)) {                  \
> +            gen_exception(ctx, POWERPC_EXCP_VPU);               \
> +            return;                                             \
> +        }                                                       \
> +    }                                                           \
> +    TCGv_i64 tmp = tcg_temp_new_i64();                          \
> +    tcg_gen_##tcgop1(tmp, source);                              \
> +    tcg_gen_##tcgop2(target, tmp);                              \
> +    tcg_temp_free_i64(tmp);                                     \
> +}
> +
> +
> +MV_VSR(mfvsrwz, ext32u_i64, trunc_i64_tl, cpu_gpr[rA(ctx->opcode)], \
> +       cpu_vsrh(xS(ctx->opcode)))
> +MV_VSR(mtvsrwa, extu_tl_i64, ext32s_i64, cpu_vsrh(xT(ctx->opcode)), \
> +       cpu_gpr[rA(ctx->opcode)])
> +MV_VSR(mtvsrwz, extu_tl_i64, ext32u_i64, cpu_vsrh(xT(ctx->opcode)), \
> +       cpu_gpr[rA(ctx->opcode)])
> +#if defined(TARGET_PPC64)
> +MV_VSR(mfvsrd, mov_i64, mov_i64, cpu_gpr[rA(ctx->opcode)], \
> +       cpu_vsrh(xS(ctx->opcode)))
> +MV_VSR(mtvsrd, mov_i64, mov_i64, cpu_vsrh(xT(ctx->opcode)), \
> +       cpu_gpr[rA(ctx->opcode)])
> +#endif

Better to do this in one step:

mfcsrwz:	tcg_gen_ext32u_tl
mtvsrwa:	tcg_gen_ext_tl_i64
mtvsrwz:	tcg_gen_extu_tl_i64
m[tf]vsrd:	tcg_gen_mov_i64


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [V6 PATCH 16/18] target-ppc: Floating Merge Word Instructions
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 16/18] target-ppc: Floating Merge Word Instructions Tom Musta
@ 2014-01-10 21:34   ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2014-01-10 21:34 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/10/2014 11:08 AM, Tom Musta wrote:
> +static void gen_fmrgow(DisasContext *ctx)
> +{
> +    TCGv_i64 a1;
> +    if (unlikely(!ctx->fpu_enabled)) {
> +        gen_exception(ctx, POWERPC_EXCP_FPU);
> +        return;
> +    }
> +    a1 = tcg_temp_new_i64();
> +    tcg_gen_shli_i64(a1, cpu_fpr[rA(ctx->opcode)], 32);
> +    tcg_gen_deposit_i64(cpu_fpr[rD(ctx->opcode)],
> +                        a1, cpu_fpr[rB(ctx->opcode)],
> +                        0, 32);
> +    tcg_temp_free_i64(a1);
> +}

Better use of the deposit when you use it for the shift also:

	tcg_gen_deposit_i64(cpu_fpr[rD],
			    cpu_fpr[rB],
                            cpu_fpr[rA], 32, 32);


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [V6 PATCH 17/18] target-ppc: Scalar Round to Single Precision
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 17/18] target-ppc: Scalar Round to Single Precision Tom Musta
@ 2014-01-10 21:40   ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2014-01-10 21:40 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/10/2014 11:08 AM, Tom Musta wrote:
> This patch adds the VSX Scalar Round to Single Precision (xsrsp)
> instruction.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V6: New.
> 
>  target-ppc/fpu_helper.c |   17 +++++++++++++++++
>  target-ppc/helper.h     |    1 +
>  target-ppc/translate.c  |    2 ++
>  3 files changed, 20 insertions(+), 0 deletions(-)

Ok, I guess, although why aren't we passing and returning by value, rather than
by reference?

This is scalar, so we only need to pass and return uint64_t...


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [V6 PATCH 18/18] target-ppc: Scalar Non-Signalling Conversions
  2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 18/18] target-ppc: Scalar Non-Signalling Conversions Tom Musta
@ 2014-01-10 21:42   ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2014-01-10 21:42 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/10/2014 11:08 AM, Tom Musta wrote:
> This patch adds the non-signalling scalar conversion instructions:
> 
>   - VSX Scalar Convert Single Precision to Double Precision
>     Non-Signalling (xscvspdpn)
>   - VSX Scalar Convert Double Precision to Single Precision
>     Non-Signalling (xscvdpspn)
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V6: New.
> 
>  target-ppc/fpu_helper.c |   19 +++++++++++++++++++
>  target-ppc/helper.h     |    2 ++
>  target-ppc/translate.c  |    4 ++++
>  3 files changed, 25 insertions(+), 0 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>

Of course, this also deserves the same cleanup that all of the other VSX
scalars ought to get -- reorg to be more like regular FP, with data passed by
value.


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [V6 PATCH 15/18] target-ppc: Move To/From VSR Instructions
  2014-01-10 21:29   ` Richard Henderson
@ 2014-01-14 14:14     ` Tom Musta
  2014-01-15 21:07       ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Tom Musta @ 2014-01-14 14:14 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-ppc

On 1/10/2014 3:29 PM, Richard Henderson wrote:
> On 01/10/2014 11:07 AM, Tom Musta wrote:
>> +#define MV_VSR(name, tcgop1, tcgop2, target, source)            \
>> +static void gen_##name(DisasContext *ctx)                       \
>> +{                                                               \
>> +    if (xS(ctx->opcode) < 32) {                                 \
>> +        if (unlikely(!ctx->fpu_enabled)) {                      \
>> +            gen_exception(ctx, POWERPC_EXCP_FPU);               \
>> +            return;                                             \
>> +        }                                                       \
>> +    } else {                                                    \
>> +        if (unlikely(!ctx->altivec_enabled)) {                  \
>> +            gen_exception(ctx, POWERPC_EXCP_VPU);               \
>> +            return;                                             \
>> +        }                                                       \
>> +    }                                                           \
>> +    TCGv_i64 tmp = tcg_temp_new_i64();                          \
>> +    tcg_gen_##tcgop1(tmp, source);                              \
>> +    tcg_gen_##tcgop2(target, tmp);                              \
>> +    tcg_temp_free_i64(tmp);                                     \
>> +}
>> +
>> +
>> +MV_VSR(mfvsrwz, ext32u_i64, trunc_i64_tl, cpu_gpr[rA(ctx->opcode)], \
>> +       cpu_vsrh(xS(ctx->opcode)))
>> +MV_VSR(mtvsrwa, extu_tl_i64, ext32s_i64, cpu_vsrh(xT(ctx->opcode)), \
>> +       cpu_gpr[rA(ctx->opcode)])
>> +MV_VSR(mtvsrwz, extu_tl_i64, ext32u_i64, cpu_vsrh(xT(ctx->opcode)), \
>> +       cpu_gpr[rA(ctx->opcode)])
>> +#if defined(TARGET_PPC64)
>> +MV_VSR(mfvsrd, mov_i64, mov_i64, cpu_gpr[rA(ctx->opcode)], \
>> +       cpu_vsrh(xS(ctx->opcode)))
>> +MV_VSR(mtvsrd, mov_i64, mov_i64, cpu_vsrh(xT(ctx->opcode)), \
>> +       cpu_gpr[rA(ctx->opcode)])
>> +#endif
> 
> Better to do this in one step:
> 
> mfcsrwz:	tcg_gen_ext32u_tl
> mtvsrwa:	tcg_gen_ext_tl_i64
> mtvsrwz:	tcg_gen_extu_tl_i64
> m[tf]vsrd:	tcg_gen_mov_i64
> 
> 
> r~
> 

Richard:

As always, thanks for reviewing my patches.

I agree on m[tf]vsrd because these are 64-bit instructions and therefore both
source and target are always i64s.

However, the word versions are a bit more tricky.  The VSR operand is always
an i64.  However, the GPR operand is either an i32 (on 32-bit implementations)
or a part of an i64.  I could not find single TCG operations to handle
these cases.  Specifically, here is what I think doesn't work with your
suggestions:

(1) Using tcg_gen_ext32u_tl for mfvsrwz does not compile on 32-bit PPC -- the
ext32u_tl operation is equivalent to mov_i32 but the source operand (VSRH) is
an i64.

(2) Using tcg_gen_ext_tl_i64 for mtvsrwa compiles but is incorrect on 64-bit PPC
-- the ext_tl_i64 translates to mov_i64.  The instruction semantic is to sign
extend the lower 32 bits of the 64-bit source GPR.

(3) Similarly, using tcg_gen_extu_tl_i64 for mtvsrwz is incorrect for 64-bit
PPC -- this is a mov_i64 and hence does not zero out the upper 32 bits of the
target VSRH.

I can recode the "d" versions to eliminate the extraneous tcg op.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [V6 PATCH 15/18] target-ppc: Move To/From VSR Instructions
  2014-01-14 14:14     ` Tom Musta
@ 2014-01-15 21:07       ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2014-01-15 21:07 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/14/2014 06:14 AM, Tom Musta wrote:
> However, the word versions are a bit more tricky.  The VSR operand is always
> an i64.  However, the GPR operand is either an i32 (on 32-bit implementations)
> or a part of an i64.  I could not find single TCG operations to handle
> these cases.  Specifically, here is what I think doesn't work with your
> suggestions:
> 
> (1) Using tcg_gen_ext32u_tl for mfvsrwz does not compile on 32-bit PPC -- the
> ext32u_tl operation is equivalent to mov_i32 but the source operand (VSRH) is
> an i64.
> 
> (2) Using tcg_gen_ext_tl_i64 for mtvsrwa compiles but is incorrect on 64-bit PPC
> -- the ext_tl_i64 translates to mov_i64.  The instruction semantic is to sign
> extend the lower 32 bits of the 64-bit source GPR.
> 
> (3) Similarly, using tcg_gen_extu_tl_i64 for mtvsrwz is incorrect for 64-bit
> PPC -- this is a mov_i64 and hence does not zero out the upper 32 bits of the
> target VSRH.

Hmm, you're right that there's not one symbol that does the job for each case.
However, there's an opcode that does the job for each case.  So perhaps we need

#ifdef TARGET_PPC64
#define ppc_gen_mfcsrwz    tcg_gen_ext32u_i64
#define ppc_gen_mtvsrwa    tcg_gen_ext32s_i64
#define ppc_gen_mtvsrwz    tcg_gen_ext32u_i64
#else
#define ppc_gen_mfcsrwz    tcg_gen_trunc_i64_i32
#define ppc_gen_mtvsrwa    tcg_gen_ext_i32_i64
#define ppc_gen_mtvsrwz    tcg_gen_extu_i32_i64
#endif

and then use those within your other macros.


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2014-01-15 21:08 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-10 19:07 [Qemu-devel] [V6 PATCH 00/18] target-ppc: VSX Stage 4 Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 01/18] target-ppc: VSX Stage 4: Add VSX 2.07 Flag Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 02/18] target-ppc: VSX Stage 4: Refactor lxsdx Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 03/18] target-ppc: VSX Stage 4: Add lxsiwax, lxsiwzx and lxsspx Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 04/18] target-ppc: VSX Stage 4: Refactor stxsdx Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 05/18] target-ppc: VSX Stage 4: Add stxsiwx and stxsspx Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 06/18] target-ppc: VSX Stage 4: Add xsaddsp and xssubsp Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 07/18] target-ppc: VSX Stage 4: Add xsmulsp Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 08/18] target-ppc: VSX Stage 4: Add xsdivsp Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 09/18] target-ppc: VSX Stage 4: Add xsresp Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 10/18] target-ppc: VSX Stage 4: Add xssqrtsp Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 11/18] target-ppc: VSX Stage 4: add xsrsqrtesp Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 12/18] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 13/18] target-ppc: VSX Stage 4: Add xscvsxdsp and xscvuxdsp Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 14/18] target-ppc: VSX Stage 4: Add xxleqv, xxlnand and xxlorc Tom Musta
2014-01-10 19:07 ` [Qemu-devel] [V6 PATCH 15/18] target-ppc: Move To/From VSR Instructions Tom Musta
2014-01-10 21:29   ` Richard Henderson
2014-01-14 14:14     ` Tom Musta
2014-01-15 21:07       ` Richard Henderson
2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 16/18] target-ppc: Floating Merge Word Instructions Tom Musta
2014-01-10 21:34   ` Richard Henderson
2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 17/18] target-ppc: Scalar Round to Single Precision Tom Musta
2014-01-10 21:40   ` Richard Henderson
2014-01-10 19:08 ` [Qemu-devel] [V6 PATCH 18/18] target-ppc: Scalar Non-Signalling Conversions Tom Musta
2014-01-10 21:42   ` Richard Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.