All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4
@ 2016-08-12 18:34 Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 01/17] target-ppc: consolidate load operations Nikunj A Dadhania
                   ` (17 more replies)
  0 siblings, 18 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

1) Consolidate Load/Store operations using tcg_gen_qemu_ld/st functions
2) This series contains 10 new instructions for POWER9 ISA3.0
   Use newer qemu load/store tcg helpers and optimize stxvw4x and lxvw4x.

Patches:
 01-09:  Cleanup load/store operations in ppc translator 
    10:  xxspltib: VSX Vector Splat Immediate Byte
    11:  darn: Deliver a random number
    12:  lxsibzx - Load VSX Scalar as Integer Byte & Zero Indexed
         lxsihzx - Load VSX Scalar as Integer Halfword & Zero Indexed
    13:  stxsibx - Store VSX Scalar as Integer Byte Indexed
         stxsihx - Store VSX Scalar as Integer Halfword Indexed
    14:  lxvw4x - improve implementation
    15:  lxvb16x: Load VSX Vector Byte*16
         lxvh8x:  Load VSX Vector Halfword*8
    16:  stxv4x - improve implementation
    17:  stxvb16x: Store VSX Vector Byte*16
         stxvh8x:  Store VSX Vector Halfword*8

Series also available here: https://github.com/nikunjad/qemu/tree/p9-tcg

Changelog:
v1: 
* More load/store cleanups in byte reverse routines
* ld64/st64 converted to newer macro and updated call sites
* Cleanup load with reservation and store conditional
* Return invalid random for darn instruction

v0:
* darn - read /dev/random to get the random number
* xxspltib - make is PPC64 only
* Consolidate load/store operations and use macros to generate qemu_st/ld
* Simplify load/store vsx endian manipulation

Nikunj A Dadhania (16):
  target-ppc: consolidate load operations
  target-ppc: convert ld64 to use new macro
  target-ppc: convert ld[16,32,64]ur to use new macro
  target-ppc: consolidate store operations
  target-ppc: convert st64 to use new macro
  target-ppc: convert st[16,32,64]r to use new macro
  target-ppc: consolidate load with reservation
  target-ppc: move out stqcx impementation
  target-ppc: consolidate store conditional
  target-ppc: add xxspltib instruction
  target-ppc: add lxsi[bw]zx instruction
  target-ppc: add stxsi[bh]x instruction
  target-ppc: improve lxvw4x implementation
  target-ppc: add lxvb16x and lxvh8x
  target-ppc: improve stxvw4x implementation
  target-ppc: add stxvb16x and stxvh8x

Ravi Bangoria (1):
  target-ppc: implement darn instruction

 target-ppc/helper.h                 |   4 +
 target-ppc/int_helper.c             |  16 ++
 target-ppc/mem_helper.c             |  11 ++
 target-ppc/translate.c              | 379 +++++++++++++++++-------------------
 target-ppc/translate/fp-impl.inc.c  |  84 ++++----
 target-ppc/translate/fp-ops.inc.c   |   2 +-
 target-ppc/translate/spe-impl.inc.c |   4 +-
 target-ppc/translate/vmx-impl.inc.c |  24 +--
 target-ppc/translate/vsx-impl.inc.c | 208 ++++++++++++++++----
 target-ppc/translate/vsx-ops.inc.c  |  13 ++
 10 files changed, 460 insertions(+), 285 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 01/17] target-ppc: consolidate load operations
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 02/17] target-ppc: convert ld64 to use new macro Nikunj A Dadhania
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Implement macro to consolidate store operations using newer
tcg_gen_qemu_ld functions.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c | 58 +++++++++++++++++---------------------------------
 1 file changed, 20 insertions(+), 38 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 618334a..30d548a 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2460,50 +2460,32 @@ static inline void gen_align_no_le(DisasContext *ctx)
 }
 
 /***                             Integer load                              ***/
-static inline void gen_qemu_ld8u(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    tcg_gen_qemu_ld8u(arg1, arg2, ctx->mem_idx);
-}
-
-static inline void gen_qemu_ld16u(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_UW | ctx->default_tcg_memop_mask;
-    tcg_gen_qemu_ld_tl(arg1, arg2, ctx->mem_idx, op);
-}
-
-static inline void gen_qemu_ld16s(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_SW | ctx->default_tcg_memop_mask;
-    tcg_gen_qemu_ld_tl(arg1, arg2, ctx->mem_idx, op);
-}
+#define DEF_MEMOP(op) ((op) | ctx->default_tcg_memop_mask)
 
-static inline void gen_qemu_ld32u(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_UL | ctx->default_tcg_memop_mask;
-    tcg_gen_qemu_ld_tl(arg1, arg2, ctx->mem_idx, op);
+#define GEN_QEMU_LOAD_TL(ldop, op)                                      \
+static void glue(gen_qemu_, ldop)(DisasContext *ctx,                    \
+                                  TCGv val,                             \
+                                  TCGv addr)                            \
+{                                                                       \
+    tcg_gen_qemu_ld_tl(val, addr, ctx->mem_idx, op);                    \
 }
 
-static void gen_qemu_ld32u_i64(DisasContext *ctx, TCGv_i64 val, TCGv addr)
-{
-    TCGv tmp = tcg_temp_new();
-    gen_qemu_ld32u(ctx, tmp, addr);
-    tcg_gen_extu_tl_i64(val, tmp);
-    tcg_temp_free(tmp);
-}
+GEN_QEMU_LOAD_TL(ld8u,  DEF_MEMOP(MO_UB))
+GEN_QEMU_LOAD_TL(ld16u, DEF_MEMOP(MO_UW))
+GEN_QEMU_LOAD_TL(ld16s, DEF_MEMOP(MO_SW))
+GEN_QEMU_LOAD_TL(ld32u, DEF_MEMOP(MO_UL))
+GEN_QEMU_LOAD_TL(ld32s, DEF_MEMOP(MO_SL))
 
-static inline void gen_qemu_ld32s(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_SL | ctx->default_tcg_memop_mask;
-    tcg_gen_qemu_ld_tl(arg1, arg2, ctx->mem_idx, op);
+#define GEN_QEMU_LOAD_64(ldop, op)                                  \
+static void glue(gen_qemu_, glue(ldop, _i64))(DisasContext *ctx,    \
+                                             TCGv_i64 val,          \
+                                             TCGv addr)             \
+{                                                                   \
+    tcg_gen_qemu_ld_i64(val, addr, ctx->mem_idx, op);               \
 }
 
-static void gen_qemu_ld32s_i64(DisasContext *ctx, TCGv_i64 val, TCGv addr)
-{
-    TCGv tmp = tcg_temp_new();
-    gen_qemu_ld32s(ctx, tmp, addr);
-    tcg_gen_ext_tl_i64(val, tmp);
-    tcg_temp_free(tmp);
-}
+GEN_QEMU_LOAD_64(ld32u, DEF_MEMOP(MO_UL))
+GEN_QEMU_LOAD_64(ld32s, DEF_MEMOP(MO_SL))
 
 static inline void gen_qemu_ld64(DisasContext *ctx, TCGv_i64 arg1, TCGv arg2)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 02/17] target-ppc: convert ld64 to use new macro
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 01/17] target-ppc: consolidate load operations Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 03/17] target-ppc: convert ld[16, 32, 64]ur " Nikunj A Dadhania
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Use macro for ld64 as well, this changes the function signature from
gen_qemu_ld64 => gen_qemu_ld64_i64. Replace this at all the call sites.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              | 39 +++++++++++++++-------------------
 target-ppc/translate/fp-impl.inc.c  | 42 ++++++++++++++++++-------------------
 target-ppc/translate/spe-impl.inc.c |  2 +-
 target-ppc/translate/vmx-impl.inc.c | 12 +++++------
 target-ppc/translate/vsx-impl.inc.c |  8 +++----
 5 files changed, 49 insertions(+), 54 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 30d548a..42f403a 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2486,12 +2486,7 @@ static void glue(gen_qemu_, glue(ldop, _i64))(DisasContext *ctx,    \
 
 GEN_QEMU_LOAD_64(ld32u, DEF_MEMOP(MO_UL))
 GEN_QEMU_LOAD_64(ld32s, DEF_MEMOP(MO_SL))
-
-static inline void gen_qemu_ld64(DisasContext *ctx, TCGv_i64 arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_Q | ctx->default_tcg_memop_mask;
-    tcg_gen_qemu_ld_i64(arg1, arg2, ctx->mem_idx, op);
-}
+GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q))
 
 static inline void gen_qemu_st8(DisasContext *ctx, TCGv arg1, TCGv arg2)
 {
@@ -2610,12 +2605,12 @@ GEN_LDUX(lwa, ld32s, 0x15, 0x0B, PPC_64B);
 /* lwax */
 GEN_LDX(lwa, ld32s, 0x15, 0x0A, PPC_64B);
 /* ldux */
-GEN_LDUX(ld, ld64, 0x15, 0x01, PPC_64B);
+GEN_LDUX(ld, ld64_i64, 0x15, 0x01, PPC_64B);
 /* ldx */
-GEN_LDX(ld, ld64, 0x15, 0x00, PPC_64B);
+GEN_LDX(ld, ld64_i64, 0x15, 0x00, PPC_64B);
 
 /* CI load/store variants */
-GEN_LDX_HVRM(ldcix, ld64, 0x15, 0x1b, PPC_CILDST)
+GEN_LDX_HVRM(ldcix, ld64_i64, 0x15, 0x1b, PPC_CILDST)
 GEN_LDX_HVRM(lwzcix, ld32u, 0x15, 0x15, PPC_CILDST)
 GEN_LDX_HVRM(lhzcix, ld16u, 0x15, 0x19, PPC_CILDST)
 GEN_LDX_HVRM(lbzcix, ld8u, 0x15, 0x1a, PPC_CILDST)
@@ -2638,7 +2633,7 @@ static void gen_ld(DisasContext *ctx)
         gen_qemu_ld32s(ctx, cpu_gpr[rD(ctx->opcode)], EA);
     } else {
         /* ld - ldu */
-        gen_qemu_ld64(ctx, cpu_gpr[rD(ctx->opcode)], EA);
+        gen_qemu_ld64_i64(ctx, cpu_gpr[rD(ctx->opcode)], EA);
     }
     if (Rc(ctx->opcode))
         tcg_gen_mov_tl(cpu_gpr[rA(ctx->opcode)], EA);
@@ -2675,16 +2670,16 @@ static void gen_lq(DisasContext *ctx)
     EA = tcg_temp_new();
     gen_addr_imm_index(ctx, EA, 0x0F);
 
-    /* We only need to swap high and low halves. gen_qemu_ld64 does necessary
-       64-bit byteswap already. */
+    /* We only need to swap high and low halves. gen_qemu_ld64_i64 does
+       necessary 64-bit byteswap already. */
     if (unlikely(ctx->le_mode)) {
-        gen_qemu_ld64(ctx, cpu_gpr[rd+1], EA);
+        gen_qemu_ld64_i64(ctx, cpu_gpr[rd + 1], EA);
         gen_addr_add(ctx, EA, EA, 8);
-        gen_qemu_ld64(ctx, cpu_gpr[rd], EA);
+        gen_qemu_ld64_i64(ctx, cpu_gpr[rd], EA);
     } else {
-        gen_qemu_ld64(ctx, cpu_gpr[rd], EA);
+        gen_qemu_ld64_i64(ctx, cpu_gpr[rd], EA);
         gen_addr_add(ctx, EA, EA, 8);
-        gen_qemu_ld64(ctx, cpu_gpr[rd+1], EA);
+        gen_qemu_ld64_i64(ctx, cpu_gpr[rd + 1], EA);
     }
     tcg_temp_free(EA);
 }
@@ -3182,7 +3177,7 @@ STCX(stwcx_, 4);
 
 #if defined(TARGET_PPC64)
 /* ldarx */
-LARX(ldarx, 8, ld64);
+LARX(ldarx, 8, ld64_i64);
 
 /* lqarx */
 static void gen_lqarx(DisasContext *ctx)
@@ -3208,11 +3203,11 @@ static void gen_lqarx(DisasContext *ctx)
         gpr1 = cpu_gpr[rd];
         gpr2 = cpu_gpr[rd+1];
     }
-    gen_qemu_ld64(ctx, gpr1, EA);
+    gen_qemu_ld64_i64(ctx, gpr1, EA);
     tcg_gen_mov_tl(cpu_reserve, EA);
 
     gen_addr_add(ctx, EA, EA, 8);
-    gen_qemu_ld64(ctx, gpr2, EA);
+    gen_qemu_ld64_i64(ctx, gpr2, EA);
 
     tcg_gen_st_tl(gpr1, cpu_env, offsetof(CPUPPCState, reserve_val));
     tcg_gen_st_tl(gpr2, cpu_env, offsetof(CPUPPCState, reserve_val2));
@@ -6596,12 +6591,12 @@ GEN_LDS(lwz, ld32u, 0x00, PPC_INTEGER)
 #if defined(TARGET_PPC64)
 GEN_LDUX(lwa, ld32s, 0x15, 0x0B, PPC_64B)
 GEN_LDX(lwa, ld32s, 0x15, 0x0A, PPC_64B)
-GEN_LDUX(ld, ld64, 0x15, 0x01, PPC_64B)
-GEN_LDX(ld, ld64, 0x15, 0x00, PPC_64B)
+GEN_LDUX(ld, ld64_i64, 0x15, 0x01, PPC_64B)
+GEN_LDX(ld, ld64_i64, 0x15, 0x00, PPC_64B)
 GEN_LDX_E(ldbr, ld64ur, 0x14, 0x10, PPC_NONE, PPC2_DBRX, CHK_NONE)
 
 /* HV/P7 and later only */
-GEN_LDX_HVRM(ldcix, ld64, 0x15, 0x1b, PPC_CILDST)
+GEN_LDX_HVRM(ldcix, ld64_i64, 0x15, 0x1b, PPC_CILDST)
 GEN_LDX_HVRM(lwzcix, ld32u, 0x15, 0x18, PPC_CILDST)
 GEN_LDX_HVRM(lhzcix, ld16u, 0x15, 0x19, PPC_CILDST)
 GEN_LDX_HVRM(lbzcix, ld8u, 0x15, 0x1a, PPC_CILDST)
diff --git a/target-ppc/translate/fp-impl.inc.c b/target-ppc/translate/fp-impl.inc.c
index 9ba9289..53b7fc7 100644
--- a/target-ppc/translate/fp-impl.inc.c
+++ b/target-ppc/translate/fp-impl.inc.c
@@ -672,7 +672,7 @@ static inline void gen_qemu_ld32fs(DisasContext *ctx, TCGv_i64 arg1, TCGv arg2)
 }
 
  /* lfd lfdu lfdux lfdx */
-GEN_LDFS(lfd, ld64, 0x12, PPC_FLOAT);
+GEN_LDFS(lfd, ld64_i64, 0x12, PPC_FLOAT);
  /* lfs lfsu lfsux lfsx */
 GEN_LDFS(lfs, ld32fs, 0x10, PPC_FLOAT);
 
@@ -687,16 +687,16 @@ static void gen_lfdp(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     EA = tcg_temp_new();
     gen_addr_imm_index(ctx, EA, 0);
-    /* We only need to swap high and low halves. gen_qemu_ld64 does necessary
-       64-bit byteswap already. */
+    /* We only need to swap high and low halves. gen_qemu_ld64_i64 does
+       necessary 64-bit byteswap already. */
     if (unlikely(ctx->le_mode)) {
-        gen_qemu_ld64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
+        gen_qemu_ld64_i64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
         tcg_gen_addi_tl(EA, EA, 8);
-        gen_qemu_ld64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+        gen_qemu_ld64_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
     } else {
-        gen_qemu_ld64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+        gen_qemu_ld64_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
         tcg_gen_addi_tl(EA, EA, 8);
-        gen_qemu_ld64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
+        gen_qemu_ld64_i64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
     }
     tcg_temp_free(EA);
 }
@@ -712,16 +712,16 @@ static void gen_lfdpx(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     EA = tcg_temp_new();
     gen_addr_reg_index(ctx, EA);
-    /* We only need to swap high and low halves. gen_qemu_ld64 does necessary
-       64-bit byteswap already. */
+    /* We only need to swap high and low halves. gen_qemu_ld64_i64 does
+       necessary 64-bit byteswap already. */
     if (unlikely(ctx->le_mode)) {
-        gen_qemu_ld64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
+        gen_qemu_ld64_i64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
         tcg_gen_addi_tl(EA, EA, 8);
-        gen_qemu_ld64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+        gen_qemu_ld64_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
     } else {
-        gen_qemu_ld64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+        gen_qemu_ld64_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
         tcg_gen_addi_tl(EA, EA, 8);
-        gen_qemu_ld64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
+        gen_qemu_ld64_i64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
     }
     tcg_temp_free(EA);
 }
@@ -924,9 +924,9 @@ static void gen_lfq(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     t0 = tcg_temp_new();
     gen_addr_imm_index(ctx, t0, 0);
-    gen_qemu_ld64(ctx, cpu_fpr[rd], t0);
+    gen_qemu_ld64_i64(ctx, cpu_fpr[rd], t0);
     gen_addr_add(ctx, t0, t0, 8);
-    gen_qemu_ld64(ctx, cpu_fpr[(rd + 1) % 32], t0);
+    gen_qemu_ld64_i64(ctx, cpu_fpr[(rd + 1) % 32], t0);
     tcg_temp_free(t0);
 }
 
@@ -940,9 +940,9 @@ static void gen_lfqu(DisasContext *ctx)
     t0 = tcg_temp_new();
     t1 = tcg_temp_new();
     gen_addr_imm_index(ctx, t0, 0);
-    gen_qemu_ld64(ctx, cpu_fpr[rd], t0);
+    gen_qemu_ld64_i64(ctx, cpu_fpr[rd], t0);
     gen_addr_add(ctx, t1, t0, 8);
-    gen_qemu_ld64(ctx, cpu_fpr[(rd + 1) % 32], t1);
+    gen_qemu_ld64_i64(ctx, cpu_fpr[(rd + 1) % 32], t1);
     if (ra != 0)
         tcg_gen_mov_tl(cpu_gpr[ra], t0);
     tcg_temp_free(t0);
@@ -958,10 +958,10 @@ static void gen_lfqux(DisasContext *ctx)
     TCGv t0, t1;
     t0 = tcg_temp_new();
     gen_addr_reg_index(ctx, t0);
-    gen_qemu_ld64(ctx, cpu_fpr[rd], t0);
+    gen_qemu_ld64_i64(ctx, cpu_fpr[rd], t0);
     t1 = tcg_temp_new();
     gen_addr_add(ctx, t1, t0, 8);
-    gen_qemu_ld64(ctx, cpu_fpr[(rd + 1) % 32], t1);
+    gen_qemu_ld64_i64(ctx, cpu_fpr[(rd + 1) % 32], t1);
     tcg_temp_free(t1);
     if (ra != 0)
         tcg_gen_mov_tl(cpu_gpr[ra], t0);
@@ -976,9 +976,9 @@ static void gen_lfqx(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     t0 = tcg_temp_new();
     gen_addr_reg_index(ctx, t0);
-    gen_qemu_ld64(ctx, cpu_fpr[rd], t0);
+    gen_qemu_ld64_i64(ctx, cpu_fpr[rd], t0);
     gen_addr_add(ctx, t0, t0, 8);
-    gen_qemu_ld64(ctx, cpu_fpr[(rd + 1) % 32], t0);
+    gen_qemu_ld64_i64(ctx, cpu_fpr[(rd + 1) % 32], t0);
     tcg_temp_free(t0);
 }
 
diff --git a/target-ppc/translate/spe-impl.inc.c b/target-ppc/translate/spe-impl.inc.c
index 0ce403a..a969927 100644
--- a/target-ppc/translate/spe-impl.inc.c
+++ b/target-ppc/translate/spe-impl.inc.c
@@ -617,7 +617,7 @@ static inline void gen_addr_spe_imm_index(DisasContext *ctx, TCGv EA, int sh)
 static inline void gen_op_evldd(DisasContext *ctx, TCGv addr)
 {
     TCGv_i64 t0 = tcg_temp_new_i64();
-    gen_qemu_ld64(ctx, t0, addr);
+    gen_qemu_ld64_i64(ctx, t0, addr);
     gen_store_gpr64(rD(ctx->opcode), t0);
     tcg_temp_free_i64(t0);
 }
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index b984122..e386ed0 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -26,16 +26,16 @@ static void glue(gen_, name)(DisasContext *ctx)
     EA = tcg_temp_new();                                                      \
     gen_addr_reg_index(ctx, EA);                                              \
     tcg_gen_andi_tl(EA, EA, ~0xf);                                            \
-    /* We only need to swap high and low halves. gen_qemu_ld64 does necessary \
-       64-bit byteswap already. */                                            \
+    /* We only need to swap high and low halves. gen_qemu_ld64_i64 does       \
+       necessary 64-bit byteswap already. */                                  \
     if (ctx->le_mode) {                                                       \
-        gen_qemu_ld64(ctx, cpu_avrl[rD(ctx->opcode)], EA);                    \
+        gen_qemu_ld64_i64(ctx, cpu_avrl[rD(ctx->opcode)], EA);                \
         tcg_gen_addi_tl(EA, EA, 8);                                           \
-        gen_qemu_ld64(ctx, cpu_avrh[rD(ctx->opcode)], EA);                    \
+        gen_qemu_ld64_i64(ctx, cpu_avrh[rD(ctx->opcode)], EA);                \
     } else {                                                                  \
-        gen_qemu_ld64(ctx, cpu_avrh[rD(ctx->opcode)], EA);                    \
+        gen_qemu_ld64_i64(ctx, cpu_avrh[rD(ctx->opcode)], EA);                \
         tcg_gen_addi_tl(EA, EA, 8);                                           \
-        gen_qemu_ld64(ctx, cpu_avrl[rD(ctx->opcode)], EA);                    \
+        gen_qemu_ld64_i64(ctx, cpu_avrl[rD(ctx->opcode)], EA);                \
     }                                                                         \
     tcg_temp_free(EA);                                                        \
 }
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 9f77b06..5725794 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -34,7 +34,7 @@ static void gen_##name(DisasContext *ctx)                     \
     tcg_temp_free(EA);                                        \
 }
 
-VSX_LOAD_SCALAR(lxsdx, ld64)
+VSX_LOAD_SCALAR(lxsdx, ld64_i64)
 VSX_LOAD_SCALAR(lxsiwax, ld32s_i64)
 VSX_LOAD_SCALAR(lxsiwzx, ld32u_i64)
 VSX_LOAD_SCALAR(lxsspx, ld32fs)
@@ -49,9 +49,9 @@ static void gen_lxvd2x(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_INT);
     EA = tcg_temp_new();
     gen_addr_reg_index(ctx, EA);
-    gen_qemu_ld64(ctx, cpu_vsrh(xT(ctx->opcode)), EA);
+    gen_qemu_ld64_i64(ctx, cpu_vsrh(xT(ctx->opcode)), EA);
     tcg_gen_addi_tl(EA, EA, 8);
-    gen_qemu_ld64(ctx, cpu_vsrl(xT(ctx->opcode)), EA);
+    gen_qemu_ld64_i64(ctx, cpu_vsrl(xT(ctx->opcode)), EA);
     tcg_temp_free(EA);
 }
 
@@ -65,7 +65,7 @@ static void gen_lxvdsx(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_INT);
     EA = tcg_temp_new();
     gen_addr_reg_index(ctx, EA);
-    gen_qemu_ld64(ctx, cpu_vsrh(xT(ctx->opcode)), EA);
+    gen_qemu_ld64_i64(ctx, cpu_vsrh(xT(ctx->opcode)), EA);
     tcg_gen_mov_i64(cpu_vsrl(xT(ctx->opcode)), cpu_vsrh(xT(ctx->opcode)));
     tcg_temp_free(EA);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 03/17] target-ppc: convert ld[16, 32, 64]ur to use new macro
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 01/17] target-ppc: consolidate load operations Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 02/17] target-ppc: convert ld64 to use new macro Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 04/17] target-ppc: consolidate store operations Nikunj A Dadhania
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Make byte-swap routines use the common GEN_QEMU_LOAD macro

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 42f403a..a33e0ca 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2461,6 +2461,7 @@ static inline void gen_align_no_le(DisasContext *ctx)
 
 /***                             Integer load                              ***/
 #define DEF_MEMOP(op) ((op) | ctx->default_tcg_memop_mask)
+#define BSWAP_MEMOP(op) ((op) | (ctx->default_tcg_memop_mask ^ MO_BSWAP))
 
 #define GEN_QEMU_LOAD_TL(ldop, op)                                      \
 static void glue(gen_qemu_, ldop)(DisasContext *ctx,                    \
@@ -2476,6 +2477,9 @@ GEN_QEMU_LOAD_TL(ld16s, DEF_MEMOP(MO_SW))
 GEN_QEMU_LOAD_TL(ld32u, DEF_MEMOP(MO_UL))
 GEN_QEMU_LOAD_TL(ld32s, DEF_MEMOP(MO_SL))
 
+GEN_QEMU_LOAD_TL(ld16ur, BSWAP_MEMOP(MO_UW))
+GEN_QEMU_LOAD_TL(ld32ur, BSWAP_MEMOP(MO_UL))
+
 #define GEN_QEMU_LOAD_64(ldop, op)                                  \
 static void glue(gen_qemu_, glue(ldop, _i64))(DisasContext *ctx,    \
                                              TCGv_i64 val,          \
@@ -2488,6 +2492,10 @@ GEN_QEMU_LOAD_64(ld32u, DEF_MEMOP(MO_UL))
 GEN_QEMU_LOAD_64(ld32s, DEF_MEMOP(MO_SL))
 GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q))
 
+#if defined(TARGET_PPC64)
+GEN_QEMU_LOAD_64(ld64ur, BSWAP_MEMOP(MO_Q))
+#endif
+
 static inline void gen_qemu_st8(DisasContext *ctx, TCGv arg1, TCGv arg2)
 {
     tcg_gen_qemu_st8(arg1, arg2, ctx->mem_idx);
@@ -2834,29 +2842,14 @@ static void gen_std(DisasContext *ctx)
 /***                Integer load and store with byte reverse               ***/
 
 /* lhbrx */
-static inline void gen_qemu_ld16ur(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_UW | (ctx->default_tcg_memop_mask ^ MO_BSWAP);
-    tcg_gen_qemu_ld_tl(arg1, arg2, ctx->mem_idx, op);
-}
 GEN_LDX(lhbr, ld16ur, 0x16, 0x18, PPC_INTEGER);
 
 /* lwbrx */
-static inline void gen_qemu_ld32ur(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_UL | (ctx->default_tcg_memop_mask ^ MO_BSWAP);
-    tcg_gen_qemu_ld_tl(arg1, arg2, ctx->mem_idx, op);
-}
 GEN_LDX(lwbr, ld32ur, 0x16, 0x10, PPC_INTEGER);
 
 #if defined(TARGET_PPC64)
 /* ldbrx */
-static inline void gen_qemu_ld64ur(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_Q | (ctx->default_tcg_memop_mask ^ MO_BSWAP);
-    tcg_gen_qemu_ld_i64(arg1, arg2, ctx->mem_idx, op);
-}
-GEN_LDX_E(ldbr, ld64ur, 0x14, 0x10, PPC_NONE, PPC2_DBRX, CHK_NONE);
+GEN_LDX_E(ldbr, ld64ur_i64, 0x14, 0x10, PPC_NONE, PPC2_DBRX, CHK_NONE);
 #endif  /* TARGET_PPC64 */
 
 /* sthbrx */
@@ -6593,7 +6586,7 @@ GEN_LDUX(lwa, ld32s, 0x15, 0x0B, PPC_64B)
 GEN_LDX(lwa, ld32s, 0x15, 0x0A, PPC_64B)
 GEN_LDUX(ld, ld64_i64, 0x15, 0x01, PPC_64B)
 GEN_LDX(ld, ld64_i64, 0x15, 0x00, PPC_64B)
-GEN_LDX_E(ldbr, ld64ur, 0x14, 0x10, PPC_NONE, PPC2_DBRX, CHK_NONE)
+GEN_LDX_E(ldbr, ld64ur_i64, 0x14, 0x10, PPC_NONE, PPC2_DBRX, CHK_NONE)
 
 /* HV/P7 and later only */
 GEN_LDX_HVRM(ldcix, ld64_i64, 0x15, 0x1b, PPC_CILDST)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 04/17] target-ppc: consolidate store operations
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (2 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 03/17] target-ppc: convert ld[16, 32, 64]ur " Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 05/17] target-ppc: convert st64 to use new macro Nikunj A Dadhania
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Implement macro to consolidate store operations using newer
tcg_gen_qemu_st function.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c | 35 ++++++++++++++++-------------------
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index a33e0ca..a893c91 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2496,30 +2496,27 @@ GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q))
 GEN_QEMU_LOAD_64(ld64ur, BSWAP_MEMOP(MO_Q))
 #endif
 
-static inline void gen_qemu_st8(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    tcg_gen_qemu_st8(arg1, arg2, ctx->mem_idx);
+#define GEN_QEMU_STORE_TL(stop, op)                                     \
+static void glue(gen_qemu_, stop)(DisasContext *ctx,                    \
+                                  TCGv val,                             \
+                                  TCGv addr)                            \
+{                                                                       \
+    tcg_gen_qemu_st_tl(val, addr, ctx->mem_idx, op);                    \
 }
 
-static inline void gen_qemu_st16(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_UW | ctx->default_tcg_memop_mask;
-    tcg_gen_qemu_st_tl(arg1, arg2, ctx->mem_idx, op);
-}
+GEN_QEMU_STORE_TL(st8,  DEF_MEMOP(MO_UB))
+GEN_QEMU_STORE_TL(st16, DEF_MEMOP(MO_UW))
+GEN_QEMU_STORE_TL(st32, DEF_MEMOP(MO_UL))
 
-static inline void gen_qemu_st32(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_UL | ctx->default_tcg_memop_mask;
-    tcg_gen_qemu_st_tl(arg1, arg2, ctx->mem_idx, op);
+#define GEN_QEMU_STORE_64(stop, op)                               \
+static void glue(gen_qemu_, glue(stop, _i64))(DisasContext *ctx,  \
+                                              TCGv_i64 val,       \
+                                              TCGv addr)          \
+{                                                                 \
+    tcg_gen_qemu_st_i64(val, addr, ctx->mem_idx, op);             \
 }
 
-static void gen_qemu_st32_i64(DisasContext *ctx, TCGv_i64 val, TCGv addr)
-{
-    TCGv tmp = tcg_temp_new();
-    tcg_gen_trunc_i64_tl(tmp, val);
-    gen_qemu_st32(ctx, tmp, addr);
-    tcg_temp_free(tmp);
-}
+GEN_QEMU_STORE_64(st32, DEF_MEMOP(MO_UL))
 
 static inline void gen_qemu_st64(DisasContext *ctx, TCGv_i64 arg1, TCGv arg2)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 05/17] target-ppc: convert st64 to use new macro
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (3 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 04/17] target-ppc: consolidate store operations Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 06/17] target-ppc: convert st[16, 32, 64]r " Nikunj A Dadhania
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Use macro for ld64 as well, this changes the function signature from
gen_qemu_st64 => gen_qemu_st64_i64. Replace this at all the call sites.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              | 37 ++++++++++++++------------------
 target-ppc/translate/fp-impl.inc.c  | 42 ++++++++++++++++++-------------------
 target-ppc/translate/fp-ops.inc.c   |  2 +-
 target-ppc/translate/spe-impl.inc.c |  2 +-
 target-ppc/translate/vmx-impl.inc.c | 12 +++++------
 target-ppc/translate/vsx-impl.inc.c |  6 +++---
 6 files changed, 48 insertions(+), 53 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index a893c91..bd16681 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2517,12 +2517,7 @@ static void glue(gen_qemu_, glue(stop, _i64))(DisasContext *ctx,  \
 }
 
 GEN_QEMU_STORE_64(st32, DEF_MEMOP(MO_UL))
-
-static inline void gen_qemu_st64(DisasContext *ctx, TCGv_i64 arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_Q | ctx->default_tcg_memop_mask;
-    tcg_gen_qemu_st_i64(arg1, arg2, ctx->mem_idx, op);
-}
+GEN_QEMU_STORE_64(st64, DEF_MEMOP(MO_Q))
 
 #define GEN_LD(name, ldop, opc, type)                                         \
 static void glue(gen_, name)(DisasContext *ctx)                                       \
@@ -2767,9 +2762,9 @@ GEN_STS(sth, st16, 0x0C, PPC_INTEGER);
 /* stw stwu stwux stwx */
 GEN_STS(stw, st32, 0x04, PPC_INTEGER);
 #if defined(TARGET_PPC64)
-GEN_STUX(std, st64, 0x15, 0x05, PPC_64B);
-GEN_STX(std, st64, 0x15, 0x04, PPC_64B);
-GEN_STX_HVRM(stdcix, st64, 0x15, 0x1f, PPC_CILDST)
+GEN_STUX(std, st64_i64, 0x15, 0x05, PPC_64B);
+GEN_STX(std, st64_i64, 0x15, 0x04, PPC_64B);
+GEN_STX_HVRM(stdcix, st64_i64, 0x15, 0x1f, PPC_CILDST)
 GEN_STX_HVRM(stwcix, st32, 0x15, 0x1c, PPC_CILDST)
 GEN_STX_HVRM(sthcix, st16, 0x15, 0x1d, PPC_CILDST)
 GEN_STX_HVRM(stbcix, st8, 0x15, 0x1e, PPC_CILDST)
@@ -2806,16 +2801,16 @@ static void gen_std(DisasContext *ctx)
         EA = tcg_temp_new();
         gen_addr_imm_index(ctx, EA, 0x03);
 
-        /* We only need to swap high and low halves. gen_qemu_st64 does
+        /* We only need to swap high and low halves. gen_qemu_st64_i64 does
            necessary 64-bit byteswap already. */
         if (unlikely(ctx->le_mode)) {
-            gen_qemu_st64(ctx, cpu_gpr[rs+1], EA);
+            gen_qemu_st64_i64(ctx, cpu_gpr[rs + 1], EA);
             gen_addr_add(ctx, EA, EA, 8);
-            gen_qemu_st64(ctx, cpu_gpr[rs], EA);
+            gen_qemu_st64_i64(ctx, cpu_gpr[rs], EA);
         } else {
-            gen_qemu_st64(ctx, cpu_gpr[rs], EA);
+            gen_qemu_st64_i64(ctx, cpu_gpr[rs], EA);
             gen_addr_add(ctx, EA, EA, 8);
-            gen_qemu_st64(ctx, cpu_gpr[rs+1], EA);
+            gen_qemu_st64_i64(ctx, cpu_gpr[rs + 1], EA);
         }
         tcg_temp_free(EA);
     } else {
@@ -2829,7 +2824,7 @@ static void gen_std(DisasContext *ctx)
         gen_set_access_type(ctx, ACCESS_INT);
         EA = tcg_temp_new();
         gen_addr_imm_index(ctx, EA, 0x03);
-        gen_qemu_st64(ctx, cpu_gpr[rs], EA);
+        gen_qemu_st64_i64(ctx, cpu_gpr[rs], EA);
         if (Rc(ctx->opcode))
             tcg_gen_mov_tl(cpu_gpr[rA(ctx->opcode)], EA);
         tcg_temp_free(EA);
@@ -3111,7 +3106,7 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
     tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
 #if defined(TARGET_PPC64)
     if (size == 8) {
-        gen_qemu_st64(ctx, cpu_gpr[reg], EA);
+        gen_qemu_st64_i64(ctx, cpu_gpr[reg], EA);
     } else
 #endif
     if (size == 4) {
@@ -3128,10 +3123,10 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
             gpr1 = cpu_gpr[reg];
             gpr2 = cpu_gpr[reg+1];
         }
-        gen_qemu_st64(ctx, gpr1, EA);
+        gen_qemu_st64_i64(ctx, gpr1, EA);
         EA8 = tcg_temp_local_new();
         gen_addr_add(ctx, EA8, EA, 8);
-        gen_qemu_st64(ctx, gpr2, EA8);
+        gen_qemu_st64_i64(ctx, gpr2, EA8);
         tcg_temp_free(EA8);
 #endif
     } else {
@@ -6617,10 +6612,10 @@ GEN_STS(stb, st8, 0x06, PPC_INTEGER)
 GEN_STS(sth, st16, 0x0C, PPC_INTEGER)
 GEN_STS(stw, st32, 0x04, PPC_INTEGER)
 #if defined(TARGET_PPC64)
-GEN_STUX(std, st64, 0x15, 0x05, PPC_64B)
-GEN_STX(std, st64, 0x15, 0x04, PPC_64B)
+GEN_STUX(std, st64_i64, 0x15, 0x05, PPC_64B)
+GEN_STX(std, st64_i64, 0x15, 0x04, PPC_64B)
 GEN_STX_E(stdbr, st64r, 0x14, 0x14, PPC_NONE, PPC2_DBRX, CHK_NONE)
-GEN_STX_HVRM(stdcix, st64, 0x15, 0x1f, PPC_CILDST)
+GEN_STX_HVRM(stdcix, st64_i64, 0x15, 0x1f, PPC_CILDST)
 GEN_STX_HVRM(stwcix, st32, 0x15, 0x1c, PPC_CILDST)
 GEN_STX_HVRM(sthcix, st16, 0x15, 0x1d, PPC_CILDST)
 GEN_STX_HVRM(stbcix, st8, 0x15, 0x1e, PPC_CILDST)
diff --git a/target-ppc/translate/fp-impl.inc.c b/target-ppc/translate/fp-impl.inc.c
index 53b7fc7..872af7b 100644
--- a/target-ppc/translate/fp-impl.inc.c
+++ b/target-ppc/translate/fp-impl.inc.c
@@ -848,7 +848,7 @@ static inline void gen_qemu_st32fs(DisasContext *ctx, TCGv_i64 arg1, TCGv arg2)
 }
 
 /* stfd stfdu stfdux stfdx */
-GEN_STFS(stfd, st64, 0x16, PPC_FLOAT);
+GEN_STFS(stfd, st64_i64, 0x16, PPC_FLOAT);
 /* stfs stfsu stfsux stfsx */
 GEN_STFS(stfs, st32fs, 0x14, PPC_FLOAT);
 
@@ -863,16 +863,16 @@ static void gen_stfdp(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     EA = tcg_temp_new();
     gen_addr_imm_index(ctx, EA, 0);
-    /* We only need to swap high and low halves. gen_qemu_st64 does necessary
-       64-bit byteswap already. */
+    /* We only need to swap high and low halves. gen_qemu_st64_i64 does
+       necessary 64-bit byteswap already. */
     if (unlikely(ctx->le_mode)) {
-        gen_qemu_st64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
+        gen_qemu_st64_i64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
         tcg_gen_addi_tl(EA, EA, 8);
-        gen_qemu_st64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+        gen_qemu_st64_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
     } else {
-        gen_qemu_st64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+        gen_qemu_st64_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
         tcg_gen_addi_tl(EA, EA, 8);
-        gen_qemu_st64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
+        gen_qemu_st64_i64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
     }
     tcg_temp_free(EA);
 }
@@ -888,16 +888,16 @@ static void gen_stfdpx(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     EA = tcg_temp_new();
     gen_addr_reg_index(ctx, EA);
-    /* We only need to swap high and low halves. gen_qemu_st64 does necessary
-       64-bit byteswap already. */
+    /* We only need to swap high and low halves. gen_qemu_st64_i64 does
+       necessary 64-bit byteswap already. */
     if (unlikely(ctx->le_mode)) {
-        gen_qemu_st64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
+        gen_qemu_st64_i64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
         tcg_gen_addi_tl(EA, EA, 8);
-        gen_qemu_st64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+        gen_qemu_st64_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
     } else {
-        gen_qemu_st64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+        gen_qemu_st64_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
         tcg_gen_addi_tl(EA, EA, 8);
-        gen_qemu_st64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
+        gen_qemu_st64_i64(ctx, cpu_fpr[rD(ctx->opcode) + 1], EA);
     }
     tcg_temp_free(EA);
 }
@@ -990,9 +990,9 @@ static void gen_stfq(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     t0 = tcg_temp_new();
     gen_addr_imm_index(ctx, t0, 0);
-    gen_qemu_st64(ctx, cpu_fpr[rd], t0);
+    gen_qemu_st64_i64(ctx, cpu_fpr[rd], t0);
     gen_addr_add(ctx, t0, t0, 8);
-    gen_qemu_st64(ctx, cpu_fpr[(rd + 1) % 32], t0);
+    gen_qemu_st64_i64(ctx, cpu_fpr[(rd + 1) % 32], t0);
     tcg_temp_free(t0);
 }
 
@@ -1005,10 +1005,10 @@ static void gen_stfqu(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     t0 = tcg_temp_new();
     gen_addr_imm_index(ctx, t0, 0);
-    gen_qemu_st64(ctx, cpu_fpr[rd], t0);
+    gen_qemu_st64_i64(ctx, cpu_fpr[rd], t0);
     t1 = tcg_temp_new();
     gen_addr_add(ctx, t1, t0, 8);
-    gen_qemu_st64(ctx, cpu_fpr[(rd + 1) % 32], t1);
+    gen_qemu_st64_i64(ctx, cpu_fpr[(rd + 1) % 32], t1);
     tcg_temp_free(t1);
     if (ra != 0)
         tcg_gen_mov_tl(cpu_gpr[ra], t0);
@@ -1024,10 +1024,10 @@ static void gen_stfqux(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     t0 = tcg_temp_new();
     gen_addr_reg_index(ctx, t0);
-    gen_qemu_st64(ctx, cpu_fpr[rd], t0);
+    gen_qemu_st64_i64(ctx, cpu_fpr[rd], t0);
     t1 = tcg_temp_new();
     gen_addr_add(ctx, t1, t0, 8);
-    gen_qemu_st64(ctx, cpu_fpr[(rd + 1) % 32], t1);
+    gen_qemu_st64_i64(ctx, cpu_fpr[(rd + 1) % 32], t1);
     tcg_temp_free(t1);
     if (ra != 0)
         tcg_gen_mov_tl(cpu_gpr[ra], t0);
@@ -1042,9 +1042,9 @@ static void gen_stfqx(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_FLOAT);
     t0 = tcg_temp_new();
     gen_addr_reg_index(ctx, t0);
-    gen_qemu_st64(ctx, cpu_fpr[rd], t0);
+    gen_qemu_st64_i64(ctx, cpu_fpr[rd], t0);
     gen_addr_add(ctx, t0, t0, 8);
-    gen_qemu_st64(ctx, cpu_fpr[(rd + 1) % 32], t0);
+    gen_qemu_st64_i64(ctx, cpu_fpr[(rd + 1) % 32], t0);
     tcg_temp_free(t0);
 }
 
diff --git a/target-ppc/translate/fp-ops.inc.c b/target-ppc/translate/fp-ops.inc.c
index 291a1e6..d36ab4e 100644
--- a/target-ppc/translate/fp-ops.inc.c
+++ b/target-ppc/translate/fp-ops.inc.c
@@ -85,7 +85,7 @@ GEN_STUF(name, stop, op | 0x21, type)                                         \
 GEN_STUXF(name, stop, op | 0x01, type)                                        \
 GEN_STXF(name, stop, 0x17, op | 0x00, type)
 
-GEN_STFS(stfd, st64, 0x16, PPC_FLOAT)
+GEN_STFS(stfd, st64_i64, 0x16, PPC_FLOAT)
 GEN_STFS(stfs, st32fs, 0x14, PPC_FLOAT)
 GEN_STXF(stfiw, st32fiw, 0x17, 0x1E, PPC_FLOAT_STFIWX)
 GEN_HANDLER_E(stfdp, 0x3D, 0xFF, 0xFF, 0x00200003, PPC_NONE, PPC2_ISA205),
diff --git a/target-ppc/translate/spe-impl.inc.c b/target-ppc/translate/spe-impl.inc.c
index a969927..8c1c16c 100644
--- a/target-ppc/translate/spe-impl.inc.c
+++ b/target-ppc/translate/spe-impl.inc.c
@@ -725,7 +725,7 @@ static inline void gen_op_evstdd(DisasContext *ctx, TCGv addr)
 {
     TCGv_i64 t0 = tcg_temp_new_i64();
     gen_load_gpr64(t0, rS(ctx->opcode));
-    gen_qemu_st64(ctx, t0, addr);
+    gen_qemu_st64_i64(ctx, t0, addr);
     tcg_temp_free_i64(t0);
 }
 
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index e386ed0..37fd5ae 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -52,16 +52,16 @@ static void gen_st##name(DisasContext *ctx)                                   \
     EA = tcg_temp_new();                                                      \
     gen_addr_reg_index(ctx, EA);                                              \
     tcg_gen_andi_tl(EA, EA, ~0xf);                                            \
-    /* We only need to swap high and low halves. gen_qemu_st64 does necessary \
-       64-bit byteswap already. */                                            \
+    /* We only need to swap high and low halves. gen_qemu_st64_i64 does       \
+       necessary 64-bit byteswap already. */                                  \
     if (ctx->le_mode) {                                                       \
-        gen_qemu_st64(ctx, cpu_avrl[rD(ctx->opcode)], EA);                    \
+        gen_qemu_st64_i64(ctx, cpu_avrl[rD(ctx->opcode)], EA);                \
         tcg_gen_addi_tl(EA, EA, 8);                                           \
-        gen_qemu_st64(ctx, cpu_avrh[rD(ctx->opcode)], EA);                    \
+        gen_qemu_st64_i64(ctx, cpu_avrh[rD(ctx->opcode)], EA);                \
     } else {                                                                  \
-        gen_qemu_st64(ctx, cpu_avrh[rD(ctx->opcode)], EA);                    \
+        gen_qemu_st64_i64(ctx, cpu_avrh[rD(ctx->opcode)], EA);                \
         tcg_gen_addi_tl(EA, EA, 8);                                           \
-        gen_qemu_st64(ctx, cpu_avrl[rD(ctx->opcode)], EA);                    \
+        gen_qemu_st64_i64(ctx, cpu_avrl[rD(ctx->opcode)], EA);                \
     }                                                                         \
     tcg_temp_free(EA);                                                        \
 }
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 5725794..99cabb2 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -115,7 +115,7 @@ static void gen_##name(DisasContext *ctx)                     \
     tcg_temp_free(EA);                                        \
 }
 
-VSX_STORE_SCALAR(stxsdx, st64)
+VSX_STORE_SCALAR(stxsdx, st64_i64)
 VSX_STORE_SCALAR(stxsiwx, st32_i64)
 VSX_STORE_SCALAR(stxsspx, st32fs)
 
@@ -129,9 +129,9 @@ static void gen_stxvd2x(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_INT);
     EA = tcg_temp_new();
     gen_addr_reg_index(ctx, EA);
-    gen_qemu_st64(ctx, cpu_vsrh(xS(ctx->opcode)), EA);
+    gen_qemu_st64_i64(ctx, cpu_vsrh(xS(ctx->opcode)), EA);
     tcg_gen_addi_tl(EA, EA, 8);
-    gen_qemu_st64(ctx, cpu_vsrl(xS(ctx->opcode)), EA);
+    gen_qemu_st64_i64(ctx, cpu_vsrl(xS(ctx->opcode)), EA);
     tcg_temp_free(EA);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 06/17] target-ppc: convert st[16, 32, 64]r to use new macro
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (4 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 05/17] target-ppc: convert st64 to use new macro Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 07/17] target-ppc: consolidate load with reservation Nikunj A Dadhania
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Make byte-swap routines use the common GEN_QEMU_LOAD macro

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c | 32 ++++++++++----------------------
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index bd16681..21092d0 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2508,6 +2508,9 @@ GEN_QEMU_STORE_TL(st8,  DEF_MEMOP(MO_UB))
 GEN_QEMU_STORE_TL(st16, DEF_MEMOP(MO_UW))
 GEN_QEMU_STORE_TL(st32, DEF_MEMOP(MO_UL))
 
+GEN_QEMU_STORE_TL(st16r, BSWAP_MEMOP(MO_UW))
+GEN_QEMU_STORE_TL(st32r, BSWAP_MEMOP(MO_UL))
+
 #define GEN_QEMU_STORE_64(stop, op)                               \
 static void glue(gen_qemu_, glue(stop, _i64))(DisasContext *ctx,  \
                                               TCGv_i64 val,       \
@@ -2519,6 +2522,10 @@ static void glue(gen_qemu_, glue(stop, _i64))(DisasContext *ctx,  \
 GEN_QEMU_STORE_64(st32, DEF_MEMOP(MO_UL))
 GEN_QEMU_STORE_64(st64, DEF_MEMOP(MO_Q))
 
+#if defined(TARGET_PPC64)
+GEN_QEMU_STORE_64(st64r, BSWAP_MEMOP(MO_Q))
+#endif
+
 #define GEN_LD(name, ldop, opc, type)                                         \
 static void glue(gen_, name)(DisasContext *ctx)                                       \
 {                                                                             \
@@ -2842,34 +2849,15 @@ GEN_LDX(lwbr, ld32ur, 0x16, 0x10, PPC_INTEGER);
 #if defined(TARGET_PPC64)
 /* ldbrx */
 GEN_LDX_E(ldbr, ld64ur_i64, 0x14, 0x10, PPC_NONE, PPC2_DBRX, CHK_NONE);
+/* stdbrx */
+GEN_STX_E(stdbr, st64r_i64, 0x14, 0x14, PPC_NONE, PPC2_DBRX, CHK_NONE);
 #endif  /* TARGET_PPC64 */
 
 /* sthbrx */
-static inline void gen_qemu_st16r(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_UW | (ctx->default_tcg_memop_mask ^ MO_BSWAP);
-    tcg_gen_qemu_st_tl(arg1, arg2, ctx->mem_idx, op);
-}
 GEN_STX(sthbr, st16r, 0x16, 0x1C, PPC_INTEGER);
-
 /* stwbrx */
-static inline void gen_qemu_st32r(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_UL | (ctx->default_tcg_memop_mask ^ MO_BSWAP);
-    tcg_gen_qemu_st_tl(arg1, arg2, ctx->mem_idx, op);
-}
 GEN_STX(stwbr, st32r, 0x16, 0x14, PPC_INTEGER);
 
-#if defined(TARGET_PPC64)
-/* stdbrx */
-static inline void gen_qemu_st64r(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    TCGMemOp op = MO_Q | (ctx->default_tcg_memop_mask ^ MO_BSWAP);
-    tcg_gen_qemu_st_i64(arg1, arg2, ctx->mem_idx, op);
-}
-GEN_STX_E(stdbr, st64r, 0x14, 0x14, PPC_NONE, PPC2_DBRX, CHK_NONE);
-#endif  /* TARGET_PPC64 */
-
 /***                    Integer load and store multiple                    ***/
 
 /* lmw */
@@ -6614,7 +6602,7 @@ GEN_STS(stw, st32, 0x04, PPC_INTEGER)
 #if defined(TARGET_PPC64)
 GEN_STUX(std, st64_i64, 0x15, 0x05, PPC_64B)
 GEN_STX(std, st64_i64, 0x15, 0x04, PPC_64B)
-GEN_STX_E(stdbr, st64r, 0x14, 0x14, PPC_NONE, PPC2_DBRX, CHK_NONE)
+GEN_STX_E(stdbr, st64r_i64, 0x14, 0x14, PPC_NONE, PPC2_DBRX, CHK_NONE)
 GEN_STX_HVRM(stdcix, st64_i64, 0x15, 0x1f, PPC_CILDST)
 GEN_STX_HVRM(stwcix, st32, 0x15, 0x1c, PPC_CILDST)
 GEN_STX_HVRM(sthcix, st16, 0x15, 0x1d, PPC_CILDST)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 07/17] target-ppc: consolidate load with reservation
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (5 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 06/17] target-ppc: convert st[16, 32, 64]r " Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 08/17] target-ppc: move out stqcx impementation Nikunj A Dadhania
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Use tcg_gen_qemu_ld in the load with reservation instructions.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 21092d0..87857f7 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -3047,28 +3047,30 @@ static void gen_isync(DisasContext *ctx)
     gen_stop_exception(ctx);
 }
 
-#define LARX(name, len, loadop)                                      \
+#define MEMOP_GET_SIZE(x)  (1 << ((x) & MO_SIZE))
+
+#define LARX(name, memop)                                            \
 static void gen_##name(DisasContext *ctx)                            \
 {                                                                    \
     TCGv t0;                                                         \
     TCGv gpr = cpu_gpr[rD(ctx->opcode)];                             \
+    int len = MEMOP_GET_SIZE(memop);                                 \
     gen_set_access_type(ctx, ACCESS_RES);                            \
     t0 = tcg_temp_local_new();                                       \
     gen_addr_reg_index(ctx, t0);                                     \
     if ((len) > 1) {                                                 \
         gen_check_align(ctx, t0, (len)-1);                           \
     }                                                                \
-    gen_qemu_##loadop(ctx, gpr, t0);                                 \
+    tcg_gen_qemu_ld_tl(gpr, t0, ctx->mem_idx, memop);                \
     tcg_gen_mov_tl(cpu_reserve, t0);                                 \
     tcg_gen_st_tl(gpr, cpu_env, offsetof(CPUPPCState, reserve_val)); \
     tcg_temp_free(t0);                                               \
 }
 
 /* lwarx */
-LARX(lbarx, 1, ld8u);
-LARX(lharx, 2, ld16u);
-LARX(lwarx, 4, ld32u);
-
+LARX(lbarx, DEF_MEMOP(MO_UB))
+LARX(lharx, DEF_MEMOP(MO_UW))
+LARX(lwarx, DEF_MEMOP(MO_UL))
 
 #if defined(CONFIG_USER_ONLY)
 static void gen_conditional_store(DisasContext *ctx, TCGv EA,
@@ -3150,7 +3152,7 @@ STCX(stwcx_, 4);
 
 #if defined(TARGET_PPC64)
 /* ldarx */
-LARX(ldarx, 8, ld64_i64);
+LARX(ldarx, DEF_MEMOP(MO_Q))
 
 /* lqarx */
 static void gen_lqarx(DisasContext *ctx)
@@ -3176,15 +3178,13 @@ static void gen_lqarx(DisasContext *ctx)
         gpr1 = cpu_gpr[rd];
         gpr2 = cpu_gpr[rd+1];
     }
-    gen_qemu_ld64_i64(ctx, gpr1, EA);
+    tcg_gen_qemu_ld_i64(gpr1, EA, ctx->mem_idx, DEF_MEMOP(MO_Q));
     tcg_gen_mov_tl(cpu_reserve, EA);
-
     gen_addr_add(ctx, EA, EA, 8);
-    gen_qemu_ld64_i64(ctx, gpr2, EA);
+    tcg_gen_qemu_ld_i64(gpr2, EA, ctx->mem_idx, DEF_MEMOP(MO_Q));
 
     tcg_gen_st_tl(gpr1, cpu_env, offsetof(CPUPPCState, reserve_val));
     tcg_gen_st_tl(gpr2, cpu_env, offsetof(CPUPPCState, reserve_val2));
-
     tcg_temp_free(EA);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 08/17] target-ppc: move out stqcx impementation
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (6 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 07/17] target-ppc: consolidate load with reservation Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 09/17] target-ppc: consolidate store conditional Nikunj A Dadhania
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Being a 16byte operation, qemu_ld/st still does not support this. Move
this out so other store operation can use qemu_ld/st in the following
patch. Also, convert it to two MO_Q operations for stqcx.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c | 69 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 47 insertions(+), 22 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 87857f7..f3d2c4e 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -3103,22 +3103,6 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
         gen_qemu_st32(ctx, cpu_gpr[reg], EA);
     } else if (size == 2) {
         gen_qemu_st16(ctx, cpu_gpr[reg], EA);
-#if defined(TARGET_PPC64)
-    } else if (size == 16) {
-        TCGv gpr1, gpr2 , EA8;
-        if (unlikely(ctx->le_mode)) {
-            gpr1 = cpu_gpr[reg+1];
-            gpr2 = cpu_gpr[reg];
-        } else {
-            gpr1 = cpu_gpr[reg];
-            gpr2 = cpu_gpr[reg+1];
-        }
-        gen_qemu_st64_i64(ctx, gpr1, EA);
-        EA8 = tcg_temp_local_new();
-        gen_addr_add(ctx, EA8, EA, 8);
-        gen_qemu_st64_i64(ctx, gpr2, EA8);
-        tcg_temp_free(EA8);
-#endif
     } else {
         gen_qemu_st8(ctx, cpu_gpr[reg], EA);
     }
@@ -3131,11 +3115,6 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
 static void gen_##name(DisasContext *ctx)                 \
 {                                                         \
     TCGv t0;                                              \
-    if (unlikely((len == 16) && (rD(ctx->opcode) & 1))) { \
-        gen_inval_exception(ctx,                          \
-                            POWERPC_EXCP_INVAL_INVAL);    \
-        return;                                           \
-    }                                                     \
     gen_set_access_type(ctx, ACCESS_RES);                 \
     t0 = tcg_temp_local_new();                            \
     gen_addr_reg_index(ctx, t0);                          \
@@ -3188,9 +3167,55 @@ static void gen_lqarx(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+/* stqcx. */
+static void gen_stqcx_(DisasContext *ctx)
+{
+    TCGv EA;
+    int reg = rS(ctx->opcode);
+    int len = 16;
+#if !defined(CONFIG_USER_ONLY)
+    TCGLabel *l1;
+    TCGv gpr1, gpr2;
+#endif
+
+    if (unlikely((rD(ctx->opcode) & 1))) {
+        gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
+        return;
+    }
+    gen_set_access_type(ctx, ACCESS_RES);
+    EA = tcg_temp_local_new();
+    gen_addr_reg_index(ctx, EA);
+    if (len > 1) {
+        gen_check_align(ctx, EA, (len) - 1);
+    }
+
+#if defined(CONFIG_USER_ONLY)
+    gen_conditional_store(ctx, EA, reg, 16);
+#else
+    tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
+    l1 = gen_new_label();
+    tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
+    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
+
+    if (unlikely(ctx->le_mode)) {
+        gpr1 = cpu_gpr[reg + 1];
+        gpr2 = cpu_gpr[reg];
+    } else {
+        gpr1 = cpu_gpr[reg];
+        gpr2 = cpu_gpr[reg + 1];
+    }
+    tcg_gen_qemu_st_tl(gpr1, EA, ctx->mem_idx, DEF_MEMOP(MO_Q));
+    gen_addr_add(ctx, EA, EA, 8);
+    tcg_gen_qemu_st_tl(gpr2, EA, ctx->mem_idx, DEF_MEMOP(MO_Q));
+
+    gen_set_label(l1);
+    tcg_gen_movi_tl(cpu_reserve, -1);
+#endif
+    tcg_temp_free(EA);
+}
+
 /* stdcx. */
 STCX(stdcx_, 8);
-STCX(stqcx_, 16);
 #endif /* defined(TARGET_PPC64) */
 
 /* sync */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 09/17] target-ppc: consolidate store conditional
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (7 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 08/17] target-ppc: move out stqcx impementation Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 10/17] target-ppc: add xxspltib instruction Nikunj A Dadhania
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Use tcg_gen_qemu_st store conditional instructions.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c | 58 +++++++++++++++++++++-----------------------------
 1 file changed, 24 insertions(+), 34 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index f3d2c4e..bba196c 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -3074,19 +3074,19 @@ LARX(lwarx, DEF_MEMOP(MO_UL))
 
 #if defined(CONFIG_USER_ONLY)
 static void gen_conditional_store(DisasContext *ctx, TCGv EA,
-                                  int reg, int size)
+                                  int reg, int memop)
 {
     TCGv t0 = tcg_temp_new();
 
     tcg_gen_st_tl(EA, cpu_env, offsetof(CPUPPCState, reserve_ea));
-    tcg_gen_movi_tl(t0, (size << 5) | reg);
+    tcg_gen_movi_tl(t0, (MEMOP_GET_SIZE(memop) << 5) | reg);
     tcg_gen_st_tl(t0, cpu_env, offsetof(CPUPPCState, reserve_info));
     tcg_temp_free(t0);
     gen_exception_err(ctx, POWERPC_EXCP_STCX, 0);
 }
 #else
 static void gen_conditional_store(DisasContext *ctx, TCGv EA,
-                                  int reg, int size)
+                                  int reg, int memop)
 {
     TCGLabel *l1;
 
@@ -3094,44 +3094,36 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
     l1 = gen_new_label();
     tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
     tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
-#if defined(TARGET_PPC64)
-    if (size == 8) {
-        gen_qemu_st64_i64(ctx, cpu_gpr[reg], EA);
-    } else
-#endif
-    if (size == 4) {
-        gen_qemu_st32(ctx, cpu_gpr[reg], EA);
-    } else if (size == 2) {
-        gen_qemu_st16(ctx, cpu_gpr[reg], EA);
-    } else {
-        gen_qemu_st8(ctx, cpu_gpr[reg], EA);
-    }
+    tcg_gen_qemu_st_tl(cpu_gpr[reg], EA, ctx->mem_idx, memop);
     gen_set_label(l1);
     tcg_gen_movi_tl(cpu_reserve, -1);
 }
 #endif
 
-#define STCX(name, len)                                   \
-static void gen_##name(DisasContext *ctx)                 \
-{                                                         \
-    TCGv t0;                                              \
-    gen_set_access_type(ctx, ACCESS_RES);                 \
-    t0 = tcg_temp_local_new();                            \
-    gen_addr_reg_index(ctx, t0);                          \
-    if (len > 1) {                                        \
-        gen_check_align(ctx, t0, (len)-1);                \
-    }                                                     \
-    gen_conditional_store(ctx, t0, rS(ctx->opcode), len); \
-    tcg_temp_free(t0);                                    \
-}
-
-STCX(stbcx_, 1);
-STCX(sthcx_, 2);
-STCX(stwcx_, 4);
+#define STCX(name, memop)                                   \
+static void gen_##name(DisasContext *ctx)                   \
+{                                                           \
+    TCGv t0;                                                \
+    int len = MEMOP_GET_SIZE(memop);                        \
+    gen_set_access_type(ctx, ACCESS_RES);                   \
+    t0 = tcg_temp_local_new();                              \
+    gen_addr_reg_index(ctx, t0);                            \
+    if (len > 1) {                                          \
+        gen_check_align(ctx, t0, (len) - 1);                \
+    }                                                       \
+    gen_conditional_store(ctx, t0, rS(ctx->opcode), memop); \
+    tcg_temp_free(t0);                                      \
+}
+
+STCX(stbcx_, DEF_MEMOP(MO_UB))
+STCX(sthcx_, DEF_MEMOP(MO_UW))
+STCX(stwcx_, DEF_MEMOP(MO_UL))
 
 #if defined(TARGET_PPC64)
 /* ldarx */
 LARX(ldarx, DEF_MEMOP(MO_Q))
+/* stdcx. */
+STCX(stdcx_, DEF_MEMOP(MO_Q))
 
 /* lqarx */
 static void gen_lqarx(DisasContext *ctx)
@@ -3214,8 +3206,6 @@ static void gen_stqcx_(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
-/* stdcx. */
-STCX(stdcx_, 8);
 #endif /* defined(TARGET_PPC64) */
 
 /* sync */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 10/17] target-ppc: add xxspltib instruction
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (8 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 09/17] target-ppc: consolidate store conditional Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 11/17] target-ppc: implement darn instruction Nikunj A Dadhania
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

xxspltib: VSX Vector Splat Immediate Byte

Copy the immediate byte in each byte of target VSR

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              |  2 ++
 target-ppc/translate/vsx-impl.inc.c | 20 ++++++++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  5 +++++
 3 files changed, 27 insertions(+)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index bba196c..89a4b37 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -589,6 +589,8 @@ EXTRACT_HELPER(DM, 8, 2);
 EXTRACT_HELPER(UIM, 16, 2);
 EXTRACT_HELPER(SHW, 8, 2);
 EXTRACT_HELPER(SP, 19, 2);
+EXTRACT_HELPER(IMM8, 11, 8);
+
 /*****************************************************************************/
 /* PowerPC instructions table                                                */
 
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 99cabb2..67f5621 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -647,6 +647,26 @@ static void gen_xxspltw(DisasContext *ctx)
     tcg_temp_free_i64(b2);
 }
 
+#define pattern(x) (((x) & 0xff) * (~(uint64_t)0 / 0xff))
+
+static void gen_xxspltib(DisasContext *ctx)
+{
+    unsigned char uim8 = IMM8(ctx->opcode);
+    if (xS(ctx->opcode) < 32) {
+        if (unlikely(!ctx->altivec_enabled)) {
+            gen_exception(ctx, POWERPC_EXCP_VPU);
+            return;
+        }
+    } else {
+        if (unlikely(!ctx->vsx_enabled)) {
+            gen_exception(ctx, POWERPC_EXCP_VSXU);
+            return;
+        }
+    }
+    tcg_gen_movi_i64(cpu_vsrh(xT(ctx->opcode)), pattern(uim8));
+    tcg_gen_movi_i64(cpu_vsrl(xT(ctx->opcode)), pattern(uim8));
+}
+
 static void gen_xxsldwi(DisasContext *ctx)
 {
     TCGv_i64 xth, xtl;
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 8b9da65..62a6251 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -20,6 +20,10 @@ GEN_HANDLER_E(mfvsrd, 0x1F, 0x13, 0x01, 0x0000F800, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(mtvsrd, 0x1F, 0x13, 0x05, 0x0000F800, PPC_NONE, PPC2_VSX207),
 #endif
 
+#define GEN_XX1FORM(name, opc2, opc3, fl2)                              \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2)
+
 #define GEN_XX2FORM(name, opc2, opc3, fl2)                           \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0, PPC_NONE, fl2), \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2)
@@ -222,6 +226,7 @@ VSX_LOGICAL(xxlorc, 0x8, 0x15, PPC2_VSX207),
 GEN_XX3FORM(xxmrghw, 0x08, 0x02, PPC2_VSX),
 GEN_XX3FORM(xxmrglw, 0x08, 0x06, PPC2_VSX),
 GEN_XX2FORM(xxspltw, 0x08, 0x0A, PPC2_VSX),
+GEN_XX1FORM(xxspltib, 0x08, 0x0B, PPC2_ISA300),
 GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00),
 
 #define GEN_XXSEL_ROW(opc3) \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 11/17] target-ppc: implement darn instruction
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (9 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 10/17] target-ppc: add xxspltib instruction Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 12/17] target-ppc: add lxsi[bw]zx instruction Nikunj A Dadhania
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh, Ravi Bangoria

From: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>

darn: Deliver A Random Number

Currently return invalid random number for all the case. This needs
proper algorithm to provide cryptographically suitable random data.
Reading from /dev/random can block and that is not an expected behaviour
while the cpu instruction is getting executed. Moreover, /dev/random
would only work for linux-user

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h     |  2 ++
 target-ppc/int_helper.c | 16 ++++++++++++++++
 target-ppc/translate.c  | 18 ++++++++++++++++++
 3 files changed, 36 insertions(+)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index dcf3f95..695a2db 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -50,6 +50,8 @@ DEF_HELPER_FLAGS_1(cnttzd, TCG_CALL_NO_RWG_SE, tl, tl)
 DEF_HELPER_FLAGS_1(popcntd, TCG_CALL_NO_RWG_SE, tl, tl)
 DEF_HELPER_FLAGS_2(bpermd, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_3(srad, tl, env, tl, tl)
+DEF_HELPER_0(darn32, tl)
+DEF_HELPER_0(darn64, tl)
 #endif
 
 DEF_HELPER_FLAGS_1(cntlsw32, TCG_CALL_NO_RWG_SE, i32, i32)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 552b2e0..ca33add 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -182,6 +182,22 @@ target_ulong helper_cnttzd(target_ulong t)
 {
     return ctz64(t);
 }
+
+/* Return invalid random number.
+ *
+ * FIXME: Add rng backend or other mechanism to get cryptographically suitable
+ * random number
+ */
+target_ulong helper_darn32(void)
+{
+    return -1;
+}
+
+target_ulong helper_darn64(void)
+{
+    return -1;
+}
+
 #endif
 
 #if defined(TARGET_PPC64)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 89a4b37..2e5116e 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -526,6 +526,8 @@ EXTRACT_HELPER(FPW, 16, 1);
 
 /* addpcis */
 EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
+/* darn */
+EXTRACT_HELPER(L, 16, 2);
 
 /***                            Jump target decoding                       ***/
 /* Immediate address */
@@ -1893,6 +1895,21 @@ static void gen_cnttzd(DisasContext *ctx)
         gen_set_Rc0(ctx, cpu_gpr[rA(ctx->opcode)]);
     }
 }
+
+/* darn */
+static void gen_darn(DisasContext *ctx)
+{
+    int l = L(ctx->opcode);
+
+    if (l == 0) {
+        gen_helper_darn32(cpu_gpr[rD(ctx->opcode)]);
+    } else if (l <= 2) {
+        /* Return 64-bit random for both CRN and RRN */
+        gen_helper_darn64(cpu_gpr[rD(ctx->opcode)]);
+    } else {
+        tcg_gen_movi_i64(cpu_gpr[rD(ctx->opcode)], -1);
+    }
+}
 #endif
 
 /***                             Integer rotate                            ***/
@@ -6207,6 +6224,7 @@ GEN_HANDLER_E(prtyw, 0x1F, 0x1A, 0x04, 0x0000F801, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER(popcntd, 0x1F, 0x1A, 0x0F, 0x0000F801, PPC_POPCNTWD),
 GEN_HANDLER(cntlzd, 0x1F, 0x1A, 0x01, 0x00000000, PPC_64B),
 GEN_HANDLER_E(cnttzd, 0x1F, 0x1A, 0x11, 0x00000000, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(darn, 0x1F, 0x13, 0x17, 0x001CF801, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER_E(prtyd, 0x1F, 0x1A, 0x05, 0x0000F801, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(bpermd, 0x1F, 0x1C, 0x07, 0x00000001, PPC_NONE, PPC2_PERM_ISA206),
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 12/17] target-ppc: add lxsi[bw]zx instruction
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (10 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 11/17] target-ppc: implement darn instruction Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 13/17] target-ppc: add stxsi[bh]x instruction Nikunj A Dadhania
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

lxsibzx - Load VSX Scalar as Integer Byte & Zero Indexed
lxsihzx - Load VSX Scalar as Integer Halfword & Zero Indexed

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              | 2 ++
 target-ppc/translate/vsx-impl.inc.c | 2 ++
 target-ppc/translate/vsx-ops.inc.c  | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 2e5116e..e72b8d7 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2507,6 +2507,8 @@ static void glue(gen_qemu_, glue(ldop, _i64))(DisasContext *ctx,    \
     tcg_gen_qemu_ld_i64(val, addr, ctx->mem_idx, op);               \
 }
 
+GEN_QEMU_LOAD_64(ld8u,  DEF_MEMOP(MO_UB))
+GEN_QEMU_LOAD_64(ld16u, DEF_MEMOP(MO_UW))
 GEN_QEMU_LOAD_64(ld32u, DEF_MEMOP(MO_UL))
 GEN_QEMU_LOAD_64(ld32s, DEF_MEMOP(MO_SL))
 GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q))
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 67f5621..888f2e4 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -36,6 +36,8 @@ static void gen_##name(DisasContext *ctx)                     \
 
 VSX_LOAD_SCALAR(lxsdx, ld64_i64)
 VSX_LOAD_SCALAR(lxsiwax, ld32s_i64)
+VSX_LOAD_SCALAR(lxsibzx, ld8u_i64)
+VSX_LOAD_SCALAR(lxsihzx, ld16u_i64)
 VSX_LOAD_SCALAR(lxsiwzx, ld32u_i64)
 VSX_LOAD_SCALAR(lxsspx, ld32fs)
 
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 62a6251..4cd742c 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -1,6 +1,8 @@
 GEN_HANDLER_E(lxsdx, 0x1F, 0x0C, 0x12, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxsiwax, 0x1F, 0x0C, 0x02, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(lxsiwzx, 0x1F, 0x0C, 0x00, 0, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(lxsibzx, 0x1F, 0x0D, 0x18, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(lxsihzx, 0x1F, 0x0D, 0x19, 0, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER_E(lxsspx, 0x1F, 0x0C, 0x10, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 13/17] target-ppc: add stxsi[bh]x instruction
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (11 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 12/17] target-ppc: add lxsi[bw]zx instruction Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 14/17] target-ppc: improve lxvw4x implementation Nikunj A Dadhania
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

stxsibx - Store VSX Scalar as Integer Byte Indexed
stxsihx - Store VSX Scalar as Integer Halfword Indexed

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              | 2 ++
 target-ppc/translate/vsx-impl.inc.c | 3 +++
 target-ppc/translate/vsx-ops.inc.c  | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index e72b8d7..4a882b3 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2540,6 +2540,8 @@ static void glue(gen_qemu_, glue(stop, _i64))(DisasContext *ctx,  \
     tcg_gen_qemu_st_i64(val, addr, ctx->mem_idx, op);             \
 }
 
+GEN_QEMU_STORE_64(st8,  DEF_MEMOP(MO_UB))
+GEN_QEMU_STORE_64(st16, DEF_MEMOP(MO_UW))
 GEN_QEMU_STORE_64(st32, DEF_MEMOP(MO_UL))
 GEN_QEMU_STORE_64(st64, DEF_MEMOP(MO_Q))
 
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 888f2e4..eee6052 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -118,6 +118,9 @@ static void gen_##name(DisasContext *ctx)                     \
 }
 
 VSX_STORE_SCALAR(stxsdx, st64_i64)
+
+VSX_STORE_SCALAR(stxsibx, st8_i64)
+VSX_STORE_SCALAR(stxsihx, st16_i64)
 VSX_STORE_SCALAR(stxsiwx, st32_i64)
 VSX_STORE_SCALAR(stxsspx, st32fs)
 
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 4cd742c..414b73b 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -9,6 +9,8 @@ GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
 
 GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(stxsibx, 0x1F, 0xD, 0x1C, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(stxsihx, 0x1F, 0xD, 0x1D, 0, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER_E(stxsiwx, 0x1F, 0xC, 0x04, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(stxsspx, 0x1F, 0xC, 0x14, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 14/17] target-ppc: improve lxvw4x implementation
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (12 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 13/17] target-ppc: add stxsi[bh]x instruction Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 15/17] target-ppc: add lxvb16x and lxvh8x Nikunj A Dadhania
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Load 8byte at a time and manipulate.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  1 +
 target-ppc/mem_helper.c             |  5 +++++
 target-ppc/translate/vsx-impl.inc.c | 34 ++++++++++++++++++++--------------
 3 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 695a2db..ce214f7 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -288,6 +288,7 @@ DEF_HELPER_2(mtvscr, void, env, avr)
 DEF_HELPER_3(lvebx, void, env, avr, tl)
 DEF_HELPER_3(lvehx, void, env, avr, tl)
 DEF_HELPER_3(lvewx, void, env, avr, tl)
+DEF_HELPER_1(bswap32x2, i64, i64)
 DEF_HELPER_3(stvebx, void, env, avr, tl)
 DEF_HELPER_3(stvehx, void, env, avr, tl)
 DEF_HELPER_3(stvewx, void, env, avr, tl)
diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c
index bf6c44a..070dff6 100644
--- a/target-ppc/mem_helper.c
+++ b/target-ppc/mem_helper.c
@@ -354,6 +354,11 @@ STVE(stvewx, cpu_stl_data_ra, bswap32, u32)
 #undef I
 #undef LVE
 
+uint64_t helper_bswap32x2(uint64_t x)
+{
+    return deposit64((x >> 32), 32, 32, (x));
+}
+
 #undef HI_IDX
 #undef LO_IDX
 
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index eee6052..e3374df 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -75,7 +75,7 @@ static void gen_lxvdsx(DisasContext *ctx)
 static void gen_lxvw4x(DisasContext *ctx)
 {
     TCGv EA;
-    TCGv_i64 tmp;
+    TCGv_i64 t0, t1;
     TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
     TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
     if (unlikely(!ctx->vsx_enabled)) {
@@ -84,22 +84,28 @@ static void gen_lxvw4x(DisasContext *ctx)
     }
     gen_set_access_type(ctx, ACCESS_INT);
     EA = tcg_temp_new();
-    tmp = tcg_temp_new_i64();
 
     gen_addr_reg_index(ctx, EA);
-    gen_qemu_ld32u_i64(ctx, tmp, EA);
-    tcg_gen_addi_tl(EA, EA, 4);
-    gen_qemu_ld32u_i64(ctx, xth, EA);
-    tcg_gen_deposit_i64(xth, xth, tmp, 32, 32);
-
-    tcg_gen_addi_tl(EA, EA, 4);
-    gen_qemu_ld32u_i64(ctx, tmp, EA);
-    tcg_gen_addi_tl(EA, EA, 4);
-    gen_qemu_ld32u_i64(ctx, xtl, EA);
-    tcg_gen_deposit_i64(xtl, xtl, tmp, 32, 32);
-
+    if (ctx->le_mode) {
+        t0 = tcg_temp_new_i64();
+        t1 = tcg_temp_new_i64();
+        tcg_gen_qemu_ld_i64(t0, EA, ctx->mem_idx, MO_LEQ);
+        tcg_gen_shri_i64(t1, t0, 32);
+        tcg_gen_deposit_i64(xth, t1, t0, 32, 32);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_ld_i64(t0, EA, ctx->mem_idx, MO_LEQ);
+        tcg_gen_shri_i64(t1, t0, 32);
+        tcg_gen_deposit_i64(xtl, t1, t0, 32, 32);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    } else {
+        tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_LEQ);
+        gen_helper_bswap32x2(xth, xth);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_LEQ);
+        gen_helper_bswap32x2(xtl, xtl);
+    }
     tcg_temp_free(EA);
-    tcg_temp_free_i64(tmp);
 }
 
 #define VSX_STORE_SCALAR(name, operation)                     \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 15/17] target-ppc: add lxvb16x and lxvh8x
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (13 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 14/17] target-ppc: improve lxvw4x implementation Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 16/17] target-ppc: improve stxvw4x implementation Nikunj A Dadhania
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

lxvb16x: Load VSX Vector Byte*16
lxvh8x:  Load VSX Vector Halfword*8

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  1 +
 target-ppc/mem_helper.c             |  6 ++++
 target-ppc/translate/vsx-impl.inc.c | 57 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  2 ++
 4 files changed, 66 insertions(+)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index ce214f7..ab80c34 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -289,6 +289,7 @@ DEF_HELPER_3(lvebx, void, env, avr, tl)
 DEF_HELPER_3(lvehx, void, env, avr, tl)
 DEF_HELPER_3(lvewx, void, env, avr, tl)
 DEF_HELPER_1(bswap32x2, i64, i64)
+DEF_HELPER_1(bswap16x4, i64, i64)
 DEF_HELPER_3(stvebx, void, env, avr, tl)
 DEF_HELPER_3(stvehx, void, env, avr, tl)
 DEF_HELPER_3(stvewx, void, env, avr, tl)
diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c
index 070dff6..09d552f 100644
--- a/target-ppc/mem_helper.c
+++ b/target-ppc/mem_helper.c
@@ -359,6 +359,12 @@ uint64_t helper_bswap32x2(uint64_t x)
     return deposit64((x >> 32), 32, 32, (x));
 }
 
+uint64_t helper_bswap16x4(uint64_t x)
+{
+    uint64_t m = 0x00ff00ff00ff00ffull;
+    return ((x & m) << 8) | ((x >> 8) & m);
+}
+
 #undef HI_IDX
 #undef LO_IDX
 
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index e3374df..caa6660 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -108,6 +108,63 @@ static void gen_lxvw4x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+static void gen_lxvb16x(DisasContext *ctx)
+{
+    TCGv EA;
+    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
+    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    gen_set_access_type(ctx, ACCESS_INT);
+    EA = tcg_temp_new();
+    gen_addr_reg_index(ctx, EA);
+    if (ctx->le_mode) {
+        tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ);
+    } else {
+        tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_LEQ);
+        gen_helper_bswap32x2(xth, xth);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_LEQ);
+        gen_helper_bswap32x2(xtl, xtl);
+    }
+    tcg_temp_free(EA);
+}
+
+static void gen_lxvh8x(DisasContext *ctx)
+{
+    TCGv EA;
+    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
+    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    gen_set_access_type(ctx, ACCESS_INT);
+    EA = tcg_temp_new();
+    gen_addr_reg_index(ctx, EA);
+
+    if (ctx->le_mode) {
+        tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ);
+        gen_helper_bswap16x4(xth, xth);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ);
+        gen_helper_bswap16x4(xtl, xtl);
+    } else {
+        tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_LEQ);
+        gen_helper_bswap32x2(xth, xth);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_LEQ);
+        gen_helper_bswap32x2(xtl, xtl);
+    }
+    tcg_temp_free(EA);
+}
+
 #define VSX_STORE_SCALAR(name, operation)                     \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 414b73b..598b349 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -7,6 +7,8 @@ GEN_HANDLER_E(lxsspx, 0x1F, 0x0C, 0x10, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(lxvh8x, 0x1F, 0x0C, 0x19, 0, PPC_NONE,  PPC2_ISA300),
+GEN_HANDLER_E(lxvb16x, 0x1F, 0x0C, 0x1B, 0, PPC_NONE, PPC2_ISA300),
 
 GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxsibx, 0x1F, 0xD, 0x1C, 0, PPC_NONE, PPC2_ISA300),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 16/17] target-ppc: improve stxvw4x implementation
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (14 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 15/17] target-ppc: add lxvb16x and lxvh8x Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 17/17] target-ppc: add stxvb16x and stxvh8x Nikunj A Dadhania
  2016-08-17  3:03 ` [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 David Gibson
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Manipulate data and store 8bytes instead of 4bytes.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate/vsx-impl.inc.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index caa6660..f2fc5f9 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -205,7 +205,8 @@ static void gen_stxvd2x(DisasContext *ctx)
 
 static void gen_stxvw4x(DisasContext *ctx)
 {
-    TCGv_i64 tmp;
+    TCGv_i64 xsh = cpu_vsrh(xS(ctx->opcode));
+    TCGv_i64 xsl = cpu_vsrl(xS(ctx->opcode));
     TCGv EA;
     if (unlikely(!ctx->vsx_enabled)) {
         gen_exception(ctx, POWERPC_EXCP_VSXU);
@@ -214,21 +215,19 @@ static void gen_stxvw4x(DisasContext *ctx)
     gen_set_access_type(ctx, ACCESS_INT);
     EA = tcg_temp_new();
     gen_addr_reg_index(ctx, EA);
-    tmp = tcg_temp_new_i64();
-
-    tcg_gen_shri_i64(tmp, cpu_vsrh(xS(ctx->opcode)), 32);
-    gen_qemu_st32_i64(ctx, tmp, EA);
-    tcg_gen_addi_tl(EA, EA, 4);
-    gen_qemu_st32_i64(ctx, cpu_vsrh(xS(ctx->opcode)), EA);
-
-    tcg_gen_shri_i64(tmp, cpu_vsrl(xS(ctx->opcode)), 32);
-    tcg_gen_addi_tl(EA, EA, 4);
-    gen_qemu_st32_i64(ctx, tmp, EA);
-    tcg_gen_addi_tl(EA, EA, 4);
-    gen_qemu_st32_i64(ctx, cpu_vsrl(xS(ctx->opcode)), EA);
 
+    if (ctx->le_mode) {
+        tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_BEQ);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_BEQ);
+    } else {
+        gen_helper_bswap32x2(xsh, xsh);
+        tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_LEQ);
+        tcg_gen_addi_tl(EA, EA, 8);
+        gen_helper_bswap32x2(xsl, xsl);
+        tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_LEQ);
+    }
     tcg_temp_free(EA);
-    tcg_temp_free_i64(tmp);
 }
 
 #define MV_VSRW(name, tcgop1, tcgop2, target, source)           \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 17/17] target-ppc: add stxvb16x and stxvh8x
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (15 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 16/17] target-ppc: improve stxvw4x implementation Nikunj A Dadhania
@ 2016-08-12 18:34 ` Nikunj A Dadhania
  2016-08-17  3:03 ` [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 David Gibson
  17 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-12 18:34 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

stxvb16x: Store VSX Vector Byte*16
stxvh8x:  Store VSX Vector Halfword*8

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate/vsx-impl.inc.c | 55 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  2 ++
 2 files changed, 57 insertions(+)

diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index f2fc5f9..20afe3b 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -165,6 +165,61 @@ static void gen_lxvh8x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+static void gen_stxvb16x(DisasContext *ctx)
+{
+    TCGv_i64 xsh = cpu_vsrh(xS(ctx->opcode));
+    TCGv_i64 xsl = cpu_vsrl(xS(ctx->opcode));
+    TCGv EA;
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    gen_set_access_type(ctx, ACCESS_INT);
+    EA = tcg_temp_new();
+    gen_addr_reg_index(ctx, EA);
+
+    if (ctx->le_mode) {
+        tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_BEQ);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_BEQ);
+    } else {
+        gen_helper_bswap32x2(xsh, xsh);
+        tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_LEQ);
+        tcg_gen_addi_tl(EA, EA, 8);
+        gen_helper_bswap32x2(xsl, xsl);
+        tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_LEQ);
+    }
+    tcg_temp_free(EA);
+}
+
+static void gen_stxvh8x(DisasContext *ctx)
+{
+    TCGv_i64 xsh = cpu_vsrh(xS(ctx->opcode));
+    TCGv_i64 xsl = cpu_vsrl(xS(ctx->opcode));
+    TCGv EA;
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    gen_set_access_type(ctx, ACCESS_INT);
+    EA = tcg_temp_new();
+    gen_addr_reg_index(ctx, EA);
+    if (ctx->le_mode) {
+        tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_BEQ);
+        tcg_gen_addi_tl(EA, EA, 8);
+        tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_BEQ);
+    } else {
+        gen_helper_bswap32x2(xsh, xsh);
+        tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_LEQ);
+        tcg_gen_addi_tl(EA, EA, 8);
+        gen_helper_bswap32x2(xsl, xsl);
+        tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_LEQ);
+    }
+    tcg_temp_free(EA);
+}
+
 #define VSX_STORE_SCALAR(name, operation)                     \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 598b349..f5afa0f 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -17,6 +17,8 @@ GEN_HANDLER_E(stxsiwx, 0x1F, 0xC, 0x04, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(stxsspx, 0x1F, 0xC, 0x14, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(stxvh8x, 0x1F, 0x0C, 0x1D, 0, PPC_NONE,  PPC2_ISA300),
+GEN_HANDLER_E(stxvb16x, 0x1F, 0x0C, 0x1F, 0, PPC_NONE, PPC2_ISA300),
 
 GEN_HANDLER_E(mfvsrwz, 0x1F, 0x13, 0x03, 0x0000F800, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(mtvsrwa, 0x1F, 0x13, 0x06, 0x0000F800, PPC_NONE, PPC2_VSX207),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4
  2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
                   ` (16 preceding siblings ...)
  2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 17/17] target-ppc: add stxvb16x and stxvh8x Nikunj A Dadhania
@ 2016-08-17  3:03 ` David Gibson
  2016-08-17  4:32   ` Nikunj A Dadhania
  17 siblings, 1 reply; 20+ messages in thread
From: David Gibson @ 2016-08-17  3:03 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, benh

[-- Attachment #1: Type: text/plain, Size: 3380 bytes --]

On Sat, Aug 13, 2016 at 12:04:26AM +0530, Nikunj A Dadhania wrote:
> 1) Consolidate Load/Store operations using tcg_gen_qemu_ld/st functions
> 2) This series contains 10 new instructions for POWER9 ISA3.0
>    Use newer qemu load/store tcg helpers and optimize stxvw4x and lxvw4x.
> 
> Patches:
>  01-09:  Cleanup load/store operations in ppc translator 
>     10:  xxspltib: VSX Vector Splat Immediate Byte
>     11:  darn: Deliver a random number
>     12:  lxsibzx - Load VSX Scalar as Integer Byte & Zero Indexed
>          lxsihzx - Load VSX Scalar as Integer Halfword & Zero Indexed
>     13:  stxsibx - Store VSX Scalar as Integer Byte Indexed
>          stxsihx - Store VSX Scalar as Integer Halfword Indexed
>     14:  lxvw4x - improve implementation
>     15:  lxvb16x: Load VSX Vector Byte*16
>          lxvh8x:  Load VSX Vector Halfword*8
>     16:  stxv4x - improve implementation
>     17:  stxvb16x: Store VSX Vector Byte*16
>          stxvh8x:  Store VSX Vector Halfword*8
> 
> Series also available here:
> https://github.com/nikunjad/qemu/tree/p9-tcg

Sorry, I'm not going to get a chance to review these in the near
future.  Could you rebase and resend in a couple of weeks once KVM
Forum is over?

> 
> Changelog:
> v1: 
> * More load/store cleanups in byte reverse routines
> * ld64/st64 converted to newer macro and updated call sites
> * Cleanup load with reservation and store conditional
> * Return invalid random for darn instruction
> 
> v0:
> * darn - read /dev/random to get the random number
> * xxspltib - make is PPC64 only
> * Consolidate load/store operations and use macros to generate qemu_st/ld
> * Simplify load/store vsx endian manipulation
> 
> Nikunj A Dadhania (16):
>   target-ppc: consolidate load operations
>   target-ppc: convert ld64 to use new macro
>   target-ppc: convert ld[16,32,64]ur to use new macro
>   target-ppc: consolidate store operations
>   target-ppc: convert st64 to use new macro
>   target-ppc: convert st[16,32,64]r to use new macro
>   target-ppc: consolidate load with reservation
>   target-ppc: move out stqcx impementation
>   target-ppc: consolidate store conditional
>   target-ppc: add xxspltib instruction
>   target-ppc: add lxsi[bw]zx instruction
>   target-ppc: add stxsi[bh]x instruction
>   target-ppc: improve lxvw4x implementation
>   target-ppc: add lxvb16x and lxvh8x
>   target-ppc: improve stxvw4x implementation
>   target-ppc: add stxvb16x and stxvh8x
> 
> Ravi Bangoria (1):
>   target-ppc: implement darn instruction
> 
>  target-ppc/helper.h                 |   4 +
>  target-ppc/int_helper.c             |  16 ++
>  target-ppc/mem_helper.c             |  11 ++
>  target-ppc/translate.c              | 379 +++++++++++++++++-------------------
>  target-ppc/translate/fp-impl.inc.c  |  84 ++++----
>  target-ppc/translate/fp-ops.inc.c   |   2 +-
>  target-ppc/translate/spe-impl.inc.c |   4 +-
>  target-ppc/translate/vmx-impl.inc.c |  24 +--
>  target-ppc/translate/vsx-impl.inc.c | 208 ++++++++++++++++----
>  target-ppc/translate/vsx-ops.inc.c  |  13 ++
>  10 files changed, 460 insertions(+), 285 deletions(-)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4
  2016-08-17  3:03 ` [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 David Gibson
@ 2016-08-17  4:32   ` Nikunj A Dadhania
  0 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-08-17  4:32 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, rth, qemu-devel, benh

David Gibson <david@gibson.dropbear.id.au> writes:

> [ Unknown signature status ]
> On Sat, Aug 13, 2016 at 12:04:26AM +0530, Nikunj A Dadhania wrote:
>> 1) Consolidate Load/Store operations using tcg_gen_qemu_ld/st functions
>> 2) This series contains 10 new instructions for POWER9 ISA3.0
>>    Use newer qemu load/store tcg helpers and optimize stxvw4x and lxvw4x.
>> 
>> Patches:
>>  01-09:  Cleanup load/store operations in ppc translator 
>>     10:  xxspltib: VSX Vector Splat Immediate Byte
>>     11:  darn: Deliver a random number
>>     12:  lxsibzx - Load VSX Scalar as Integer Byte & Zero Indexed
>>          lxsihzx - Load VSX Scalar as Integer Halfword & Zero Indexed
>>     13:  stxsibx - Store VSX Scalar as Integer Byte Indexed
>>          stxsihx - Store VSX Scalar as Integer Halfword Indexed
>>     14:  lxvw4x - improve implementation
>>     15:  lxvb16x: Load VSX Vector Byte*16
>>          lxvh8x:  Load VSX Vector Halfword*8
>>     16:  stxv4x - improve implementation
>>     17:  stxvb16x: Store VSX Vector Byte*16
>>          stxvh8x:  Store VSX Vector Halfword*8
>> 
>> Series also available here:
>> https://github.com/nikunjad/qemu/tree/p9-tcg
>
> Sorry, I'm not going to get a chance to review these in the near
> future.  Could you rebase and resend in a couple of weeks once KVM
> Forum is over?

Sure David, I guessed so.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-08-17  4:32 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-12 18:34 [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 01/17] target-ppc: consolidate load operations Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 02/17] target-ppc: convert ld64 to use new macro Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 03/17] target-ppc: convert ld[16, 32, 64]ur " Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 04/17] target-ppc: consolidate store operations Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 05/17] target-ppc: convert st64 to use new macro Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 06/17] target-ppc: convert st[16, 32, 64]r " Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 07/17] target-ppc: consolidate load with reservation Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 08/17] target-ppc: move out stqcx impementation Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 09/17] target-ppc: consolidate store conditional Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 10/17] target-ppc: add xxspltib instruction Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 11/17] target-ppc: implement darn instruction Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 12/17] target-ppc: add lxsi[bw]zx instruction Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 13/17] target-ppc: add stxsi[bh]x instruction Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 14/17] target-ppc: improve lxvw4x implementation Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 15/17] target-ppc: add lxvb16x and lxvh8x Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 16/17] target-ppc: improve stxvw4x implementation Nikunj A Dadhania
2016-08-12 18:34 ` [Qemu-devel] [PATCH v2 17/17] target-ppc: add stxvb16x and stxvh8x Nikunj A Dadhania
2016-08-17  3:03 ` [Qemu-devel] [PATCH v2 00/17] POWER9 TCG enablements - part4 David Gibson
2016-08-17  4:32   ` Nikunj A Dadhania

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.