* [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6
@ 2016-10-12  5:08 Nikunj A Dadhania
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 1/3] target-ppc: implement vexts[bh]2w and vexts[bhw]2d Nikunj A Dadhania
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Nikunj A Dadhania @ 2016-10-12  5:08 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

This series contains 11 new instructions for POWER9 ISA3.0
   Vector Extend Sign
   Vector Integer Negate 
   Vector Byte-Reverse

Patches:
01:
    vextsb2w: Vector Extend Sign Byte To Word
    vextsh2w: Vector Extend Sign Halfword To Word
    vextsb2d: Vector Extend Sign Byte To Doubleword
    vextsh2d: Vector Extend Sign Halfword To Doubleword
    vextsw2d: Vector Extend Sign Word To Doubleword
02:
    vnegw: Vector Negate Word
    vnegd: Vector Negate Doubleword
03:
    xxbrh: VSX Vector Byte-Reverse Halfword
    xxbrw: VSX Vector Byte-Reverse Word
    xxbrd: VSX Vector Byte-Reverse Doubleword
    xxbrq: VSX Vector Byte-Reverse Quadword

Changelog:
* Added temporary in xxbrq
* Use negation directly to compute the 2's complement
* Use int8_t instead of char
* Dropped "VSX Scalar Compare" as fpu_helper needs changes
  with regard to exception flag handling

Nikunj A Dadhania (3):
  target-ppc: implement vexts[bh]2w and vexts[bhw]2d
  target-ppc: implement vnegw/d instructions
  target-ppc: implement xxbr[qdwh] instruction

 target-ppc/helper.h                 |  7 ++++
 target-ppc/int_helper.c             | 27 +++++++++++++
 target-ppc/translate.c              | 32 +++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  7 ++++
 target-ppc/translate/vmx-ops.inc.c  |  7 ++++
 target-ppc/translate/vsx-impl.inc.c | 77 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  8 ++++
 7 files changed, 165 insertions(+)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v1 1/3] target-ppc: implement vexts[bh]2w and vexts[bhw]2d
  2016-10-12  5:08 [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6 Nikunj A Dadhania
@ 2016-10-12  5:08 ` Nikunj A Dadhania
  2016-10-13  0:23   ` David Gibson
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 2/3] target-ppc: implement vnegw/d instructions Nikunj A Dadhania
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Nikunj A Dadhania @ 2016-10-12  5:08 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Vector Extend Sign Instructions:

vextsb2w: Vector Extend Sign Byte To Word
vextsh2w: Vector Extend Sign Halfword To Word
vextsb2d: Vector Extend Sign Byte To Doubleword
vextsh2d: Vector Extend Sign Halfword To Doubleword
vextsw2d: Vector Extend Sign Word To Doubleword

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  5 +++++
 target-ppc/int_helper.c             | 15 +++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  5 +++++
 target-ppc/translate/vmx-ops.inc.c  |  5 +++++
 4 files changed, 30 insertions(+)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 796ad45..04c6421 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -267,6 +267,11 @@ DEF_HELPER_3(vinsertb, void, avr, avr, i32)
 DEF_HELPER_3(vinserth, void, avr, avr, i32)
 DEF_HELPER_3(vinsertw, void, avr, avr, i32)
 DEF_HELPER_3(vinsertd, void, avr, avr, i32)
+DEF_HELPER_2(vextsb2w, void, avr, avr)
+DEF_HELPER_2(vextsh2w, void, avr, avr)
+DEF_HELPER_2(vextsb2d, void, avr, avr)
+DEF_HELPER_2(vextsh2d, void, avr, avr)
+DEF_HELPER_2(vextsw2d, void, avr, avr)
 DEF_HELPER_2(vupkhpx, void, avr, avr)
 DEF_HELPER_2(vupklpx, void, avr, avr)
 DEF_HELPER_2(vupkhsb, void, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 202854f..5aee0a8 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1934,6 +1934,21 @@ VEXTRACT(uw, u32)
 VEXTRACT(d, u64)
 #undef VEXTRACT
 
+#define VEXT_SIGNED(name, element, mask, cast, recast)              \
+void helper_##name(ppc_avr_t *r, ppc_avr_t *b)                      \
+{                                                                   \
+    int i;                                                          \
+    VECTOR_FOR_INORDER_I(i, element) {                              \
+        r->element[i] = (recast)((cast)(b->element[i] & mask));     \
+    }                                                               \
+}
+VEXT_SIGNED(vextsb2w, s32, UINT8_MAX, int8_t, int32_t)
+VEXT_SIGNED(vextsb2d, s64, UINT8_MAX, int8_t, int64_t)
+VEXT_SIGNED(vextsh2w, s32, UINT16_MAX, int16_t, int32_t)
+VEXT_SIGNED(vextsh2d, s64, UINT16_MAX, int16_t, int64_t)
+VEXT_SIGNED(vextsw2d, s64, UINT32_MAX, int32_t, int64_t)
+#undef VEXT_SIGNED
+
 #define VSPLTI(suffix, element, splat_type)                     \
     void helper_vspltis##suffix(ppc_avr_t *r, uint32_t splat)   \
     {                                                           \
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 25cd073..c8998f3 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -815,6 +815,11 @@ GEN_VXFORM_NOA(vclzb, 1, 28)
 GEN_VXFORM_NOA(vclzh, 1, 29)
 GEN_VXFORM_NOA(vclzw, 1, 30)
 GEN_VXFORM_NOA(vclzd, 1, 31)
+GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
+GEN_VXFORM_NOA_2(vextsh2w, 1, 24, 17)
+GEN_VXFORM_NOA_2(vextsb2d, 1, 24, 24)
+GEN_VXFORM_NOA_2(vextsh2d, 1, 24, 25)
+GEN_VXFORM_NOA_2(vextsw2d, 1, 24, 26)
 GEN_VXFORM_NOA_2(vctzb, 1, 24, 28)
 GEN_VXFORM_NOA_2(vctzh, 1, 24, 29)
 GEN_VXFORM_NOA_2(vctzw, 1, 24, 30)
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index ac1dc9b..68cba3e 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -215,6 +215,11 @@ GEN_VXFORM_DUAL_INV(vspltish, vinserth, 6, 13, 0x00000000, 0x100000,
 GEN_VXFORM_DUAL_INV(vspltisw, vinsertw, 6, 14, 0x00000000, 0x100000,
                                                PPC_ALTIVEC),
 GEN_VXFORM_300_EXT(vinsertd, 6, 15, 0x100000),
+GEN_VXFORM_300_EO(vextsb2w, 0x01, 0x18, 0x10),
+GEN_VXFORM_300_EO(vextsh2w, 0x01, 0x18, 0x11),
+GEN_VXFORM_300_EO(vextsb2d, 0x01, 0x18, 0x18),
+GEN_VXFORM_300_EO(vextsh2d, 0x01, 0x18, 0x19),
+GEN_VXFORM_300_EO(vextsw2d, 0x01, 0x18, 0x1A),
 GEN_VXFORM_300_EO(vctzb, 0x01, 0x18, 0x1C),
 GEN_VXFORM_300_EO(vctzh, 0x01, 0x18, 0x1D),
 GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread
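
The VEXT_SIGNED helper in the patch above masks the low part of each
element and casts it twice. A minimal standalone sketch of that per-element
step, outside QEMU, might look like the following. The function names are
hypothetical and not part of the patch, and the narrowing cast relies on
the usual wrap-around behaviour of GCC and Clang.

```c
#include <stdint.h>

/*
 * Sign-extend the low byte of a 32-bit element, the per-word step of
 * vextsb2w. The cast to int8_t reinterprets values above 127 as negative
 * (implementation-defined in ISO C, wrap-around on GCC and Clang).
 */
static int32_t ext_sign_byte_to_word(uint32_t element)
{
    return (int32_t)(int8_t)(element & UINT8_MAX);
}

/* Sign-extend the low halfword of a 64-bit element, as in vextsh2d. */
static int64_t ext_sign_half_to_dword(uint64_t element)
{
    return (int64_t)(int16_t)(element & UINT16_MAX);
}
```

Only the low byte or halfword of the source element matters here; the
upper bits are discarded by the mask before the sign is extended.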

* [Qemu-devel] [PATCH v1 2/3] target-ppc: implement vnegw/d instructions
  2016-10-12  5:08 [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6 Nikunj A Dadhania
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 1/3] target-ppc: implement vexts[bh]2w and vexts[bhw]2d Nikunj A Dadhania
@ 2016-10-12  5:08 ` Nikunj A Dadhania
  2016-10-13  0:12   ` David Gibson
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction Nikunj A Dadhania
  2016-10-12  8:50 ` [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6 no-reply
  3 siblings, 1 reply; 12+ messages in thread
From: Nikunj A Dadhania @ 2016-10-12  5:08 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Vector Integer Negate Instructions:

vnegw: Vector Negate Word
vnegd: Vector Negate Doubleword

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  2 ++
 target-ppc/int_helper.c             | 12 ++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  2 ++
 target-ppc/translate/vmx-ops.inc.c  |  2 ++
 4 files changed, 18 insertions(+)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 04c6421..5fcc546 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -272,6 +272,8 @@ DEF_HELPER_2(vextsh2w, void, avr, avr)
 DEF_HELPER_2(vextsb2d, void, avr, avr)
 DEF_HELPER_2(vextsh2d, void, avr, avr)
 DEF_HELPER_2(vextsw2d, void, avr, avr)
+DEF_HELPER_2(vnegw, void, avr, avr)
+DEF_HELPER_2(vnegd, void, avr, avr)
 DEF_HELPER_2(vupkhpx, void, avr, avr)
 DEF_HELPER_2(vupklpx, void, avr, avr)
 DEF_HELPER_2(vupkhsb, void, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 5aee0a8..7446e4e 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1949,6 +1949,18 @@ VEXT_SIGNED(vextsh2d, s64, UINT16_MAX, int16_t, int64_t)
 VEXT_SIGNED(vextsw2d, s64, UINT32_MAX, int32_t, int64_t)
 #undef VEXT_SIGNED
 
+#define VNEG(name, element, mask)                                   \
+void helper_##name(ppc_avr_t *r, ppc_avr_t *b)                      \
+{                                                                   \
+    int i;                                                          \
+    VECTOR_FOR_INORDER_I(i, element) {                              \
+        r->element[i] = -b->element[i];                             \
+    }                                                               \
+}
+VNEG(vnegw, s32, UINT32_MAX)
+VNEG(vnegd, s64, UINT64_MAX)
+#undef VNEG
+
 #define VSPLTI(suffix, element, splat_type)                     \
     void helper_vspltis##suffix(ppc_avr_t *r, uint32_t splat)   \
     {                                                           \
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index c8998f3..563f101 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -815,6 +815,8 @@ GEN_VXFORM_NOA(vclzb, 1, 28)
 GEN_VXFORM_NOA(vclzh, 1, 29)
 GEN_VXFORM_NOA(vclzw, 1, 30)
 GEN_VXFORM_NOA(vclzd, 1, 31)
+GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
+GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
 GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
 GEN_VXFORM_NOA_2(vextsh2w, 1, 24, 17)
 GEN_VXFORM_NOA_2(vextsb2d, 1, 24, 24)
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index 68cba3e..ab64ab2 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -215,6 +215,8 @@ GEN_VXFORM_DUAL_INV(vspltish, vinserth, 6, 13, 0x00000000, 0x100000,
 GEN_VXFORM_DUAL_INV(vspltisw, vinsertw, 6, 14, 0x00000000, 0x100000,
                                                PPC_ALTIVEC),
 GEN_VXFORM_300_EXT(vinsertd, 6, 15, 0x100000),
+GEN_VXFORM_300_EO(vnegw, 0x01, 0x18, 0x06),
+GEN_VXFORM_300_EO(vnegd, 0x01, 0x18, 0x07),
 GEN_VXFORM_300_EO(vextsb2w, 0x01, 0x18, 0x10),
 GEN_VXFORM_300_EO(vextsh2w, 0x01, 0x18, 0x11),
 GEN_VXFORM_300_EO(vextsb2d, 0x01, 0x18, 0x18),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread
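
The VNEG helper above negates each signed element directly. In strictly
portable C, negating the most negative value is a signed overflow, so a
standalone version could negate in unsigned arithmetic instead. This is
only a sketch: negate_word is a hypothetical name, and it assumes the
instruction is meant to wrap modulo 2^32, as the "2's complement" wording
in the changelog suggests.

```c
#include <stdint.h>

/*
 * Negate one 32-bit element with wrap-around semantics.
 * The subtraction is done on uint32_t, where overflow is well defined,
 * so INT32_MIN maps back to INT32_MIN instead of overflowing.
 */
static int32_t negate_word(int32_t x)
{
    return (int32_t)(0u - (uint32_t)x);
}
```

The final conversion back to int32_t is implementation-defined for values
above INT32_MAX, but GCC and Clang both reduce it modulo 2^32.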

* [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction
  2016-10-12  5:08 [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6 Nikunj A Dadhania
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 1/3] target-ppc: implement vexts[bh]2w and vexts[bhw]2d Nikunj A Dadhania
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 2/3] target-ppc: implement vnegw/d instructions Nikunj A Dadhania
@ 2016-10-12  5:08 ` Nikunj A Dadhania
  2016-10-13  0:21   ` David Gibson
  2016-10-12  8:50 ` [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6 no-reply
  3 siblings, 1 reply; 12+ messages in thread
From: Nikunj A Dadhania @ 2016-10-12  5:08 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh

Add required helpers (GEN_XX2FORM_EO) for supporting this instruction.

xxbrh: VSX Vector Byte-Reverse Halfword
xxbrw: VSX Vector Byte-Reverse Word
xxbrd: VSX Vector Byte-Reverse Doubleword
xxbrq: VSX Vector Byte-Reverse Quadword

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              | 32 +++++++++++++++
 target-ppc/translate/vsx-impl.inc.c | 77 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  8 ++++
 3 files changed, 117 insertions(+)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index dab8f19..94989b2 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -376,6 +376,9 @@ GEN_OPCODE2(name, onam, opc1, opc2, opc3, inval, type, type2)
 #define GEN_HANDLER_E_2(name, opc1, opc2, opc3, opc4, inval, type, type2)     \
 GEN_OPCODE3(name, opc1, opc2, opc3, opc4, inval, type, type2)
 
+#define GEN_HANDLER2_E_2(name, onam, opc1, opc2, opc3, opc4, inval, typ, typ2) \
+GEN_OPCODE4(name, onam, opc1, opc2, opc3, opc4, inval, typ, typ2)
+
 typedef struct opcode_t {
     unsigned char opc1, opc2, opc3, opc4;
 #if HOST_LONG_BITS == 64 /* Explicitly align to 64 bits */
@@ -662,6 +665,21 @@ EXTRACT_HELPER(IMM8, 11, 8);
     },                                                                        \
     .oname = stringify(name),                                                 \
 }
+#define GEN_OPCODE4(name, onam, op1, op2, op3, op4, invl, _typ, _typ2)        \
+{                                                                             \
+    .opc1 = op1,                                                              \
+    .opc2 = op2,                                                              \
+    .opc3 = op3,                                                              \
+    .opc4 = op4,                                                              \
+    .handler = {                                                              \
+        .inval1  = invl,                                                      \
+        .type = _typ,                                                         \
+        .type2 = _typ2,                                                       \
+        .handler = &gen_##name,                                               \
+        .oname = onam,                                                        \
+    },                                                                        \
+    .oname = onam,                                                            \
+}
 #else
 #define GEN_OPCODE(name, op1, op2, op3, invl, _typ, _typ2)                    \
 {                                                                             \
@@ -720,6 +738,20 @@ EXTRACT_HELPER(IMM8, 11, 8);
     },                                                                        \
     .oname = stringify(name),                                                 \
 }
+#define GEN_OPCODE4(name, onam, op1, op2, op3, op4, invl, _typ, _typ2)        \
+{                                                                             \
+    .opc1 = op1,                                                              \
+    .opc2 = op2,                                                              \
+    .opc3 = op3,                                                              \
+    .opc4 = op4,                                                              \
+    .handler = {                                                              \
+        .inval1  = invl,                                                      \
+        .type = _typ,                                                         \
+        .type2 = _typ2,                                                       \
+        .handler = &gen_##name,                                               \
+    },                                                                        \
+    .oname = onam,                                                            \
+}
 #endif
 
 /* SPR load/store helpers */
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 23ec1e1..52af5c1 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -132,6 +132,22 @@ static void gen_bswap16x8(TCGv_i64 outh, TCGv_i64 outl,
     tcg_temp_free_i64(mask);
 }
 
+static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
+                          TCGv_i64 inh, TCGv_i64 inl)
+{
+    TCGv_i64 hi = tcg_temp_new_i64();
+    TCGv_i64 lo = tcg_temp_new_i64();
+
+    tcg_gen_bswap64_i64(hi, inh);
+    tcg_gen_bswap64_i64(lo, inl);
+    tcg_gen_shri_i64(outh, hi, 32);
+    tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
+    tcg_gen_shri_i64(outl, lo, 32);
+    tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
+
+    tcg_temp_free_i64(hi);
+    tcg_temp_free_i64(lo);
+}
 static void gen_lxvh8x(DisasContext *ctx)
 {
     TCGv EA;
@@ -717,6 +733,67 @@ GEN_VSX_HELPER_2(xvrspim, 0x12, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
 
+static void gen_xxbrd(DisasContext *ctx)
+{
+    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
+    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
+    TCGv_i64 xbh = cpu_vsrh(xB(ctx->opcode));
+    TCGv_i64 xbl = cpu_vsrl(xB(ctx->opcode));
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    tcg_gen_bswap64_i64(xth, xbh);
+    tcg_gen_bswap64_i64(xtl, xbl);
+}
+
+static void gen_xxbrh(DisasContext *ctx)
+{
+    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
+    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
+    TCGv_i64 xbh = cpu_vsrh(xB(ctx->opcode));
+    TCGv_i64 xbl = cpu_vsrl(xB(ctx->opcode));
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    gen_bswap16x8(xth, xtl, xbh, xbl);
+}
+
+static void gen_xxbrq(DisasContext *ctx)
+{
+    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
+    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
+    TCGv_i64 xbh = cpu_vsrh(xB(ctx->opcode));
+    TCGv_i64 xbl = cpu_vsrl(xB(ctx->opcode));
+    TCGv_i64 t0 = tcg_temp_new_i64();
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    tcg_gen_bswap64_i64(t0, xbl);
+    tcg_gen_bswap64_i64(xtl, xbh);
+    tcg_gen_bswap64_i64(xth, t0);
+    tcg_temp_free_i64(t0);
+}
+
+static void gen_xxbrw(DisasContext *ctx)
+{
+    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
+    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
+    TCGv_i64 xbh = cpu_vsrh(xB(ctx->opcode));
+    TCGv_i64 xbl = cpu_vsrl(xB(ctx->opcode));
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    gen_bswap32x4(xth, xtl, xbh, xbl);
+}
+
 #define VSX_LOGICAL(name, tcg_op)                                    \
 static void glue(gen_, name)(DisasContext * ctx)                     \
     {                                                                \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 10eb4b9..af0d27e 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -39,6 +39,10 @@ GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2)
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0, PPC_NONE, fl2), \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2)
 
+#define GEN_XX2FORM_EO(name, opc2, opc3, opc4, fl2)                          \
+GEN_HANDLER2_E_2(name, #name, 0x3C, opc2 | 0, opc3, opc4, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E_2(name, #name, 0x3C, opc2 | 1, opc3, opc4, 0, PPC_NONE, fl2)
+
 #define GEN_XX3FORM(name, opc2, opc3, fl2)                           \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0, PPC_NONE, fl2), \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2), \
@@ -222,6 +226,10 @@ GEN_XX2FORM(xvrspic, 0x16, 0x0A, PPC2_VSX),
 GEN_XX2FORM(xvrspim, 0x12, 0x0B, PPC2_VSX),
 GEN_XX2FORM(xvrspip, 0x12, 0x0A, PPC2_VSX),
 GEN_XX2FORM(xvrspiz, 0x12, 0x09, PPC2_VSX),
+GEN_XX2FORM_EO(xxbrh, 0x16, 0x1D, 0x07, PPC2_ISA300),
+GEN_XX2FORM_EO(xxbrw, 0x16, 0x1D, 0x0F, PPC2_ISA300),
+GEN_XX2FORM_EO(xxbrd, 0x16, 0x1D, 0x17, PPC2_ISA300),
+GEN_XX2FORM_EO(xxbrq, 0x16, 0x1D, 0x1F, PPC2_ISA300),
 
 #define VSX_LOGICAL(name, opc2, opc3, fl2) \
 GEN_XX3FORM(name, opc2, opc3, fl2)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread
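
The changelog mentions a temporary added in xxbrq. A plain-C sketch of why
it is needed when the target and source are the same register follows.
The struct vsr type and both function names are hypothetical stand-ins
for the two 64-bit halves of a VSX register, not QEMU code.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit register as two 64-bit halves. */
struct vsr {
    uint64_t hi;
    uint64_t lo;
};

/* Portable 64-bit byte swap: swap bytes, then halfwords, then words. */
static uint64_t swap_bytes64(uint64_t v)
{
    v = ((v & 0x00ff00ff00ff00ffULL) << 8) |
        ((v >> 8) & 0x00ff00ff00ff00ffULL);
    v = ((v & 0x0000ffff0000ffffULL) << 16) |
        ((v >> 16) & 0x0000ffff0000ffffULL);
    return (v << 32) | (v >> 32);
}

/*
 * Byte-reverse a quadword: each half is byte-swapped and the halves
 * trade places. The low source half is saved in tmp first, because
 * writing t->lo would clobber b->lo when t and b point to the same
 * register.
 */
static void byte_reverse_quad(struct vsr *t, const struct vsr *b)
{
    uint64_t tmp = swap_bytes64(b->lo);

    t->lo = swap_bytes64(b->hi);
    t->hi = tmp;
}
```

Calling byte_reverse_quad(&r, &r) therefore gives the same result as
reversing into a separate destination.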

* Re: [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6
  2016-10-12  5:08 [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6 Nikunj A Dadhania
                   ` (2 preceding siblings ...)
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction Nikunj A Dadhania
@ 2016-10-12  8:50 ` no-reply
  3 siblings, 0 replies; 12+ messages in thread
From: no-reply @ 2016-10-12  8:50 UTC (permalink / raw)
  To: nikunj; +Cc: famz, qemu-ppc, david, rth, qemu-devel

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6
Message-id: 1476248933-25562-1-git-send-email-nikunj@linux.vnet.ibm.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/1476107947-31430-1-git-send-email-pbonzini@redhat.com -> patchew/1476107947-31430-1-git-send-email-pbonzini@redhat.com
Switched to a new branch 'test'
167305d target-ppc: implement xxbr[qdwh] instruction
f648df6 target-ppc: implement vnegw/d instructions
fc481f0 target-ppc: implement vexts[bh]2w and vexts[bhw]2d

=== OUTPUT BEGIN ===
Checking PATCH 1/3: target-ppc: implement vexts[bh]2w and vexts[bhw]2d...
Checking PATCH 2/3: target-ppc: implement vnegw/d instructions...
Checking PATCH 3/3: target-ppc: implement xxbr[qdwh] instruction...
ERROR: Macros with complex values should be enclosed in parenthesis
#176: FILE: target-ppc/translate/vsx-ops.inc.c:42:
+#define GEN_XX2FORM_EO(name, opc2, opc3, opc4, fl2)                          \
+GEN_HANDLER2_E_2(name, #name, 0x3C, opc2 | 0, opc3, opc4, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E_2(name, #name, 0x3C, opc2 | 1, opc3, opc4, 0, PPC_NONE, fl2)

total: 1 errors, 0 warnings, 159 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org

^ permalink raw reply	[flat|nested] 12+ messages in thread
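
The checkpatch error above concerns a macro that expands to two
comma-separated table entries. A reduced sketch of that pattern is shown
below; struct op, TWO_ENTRIES and table_len are hypothetical names used
only for illustration. It suggests why such a macro cannot simply be
wrapped in parentheses: the expansion is a pair of brace initializers,
not an expression.

```c
#include <stddef.h>

/* Hypothetical, much reduced stand-in for an opcode table entry. */
struct op {
    int opc2;
    const char *name;
};

/*
 * Expands to two initializers, one for each value of the low opc2 bit.
 * Enclosing the expansion in parentheses would turn it into an invalid
 * expression, since brace initializers are not operands.
 */
#define TWO_ENTRIES(name, opc2) \
    { (opc2) | 0, #name },      \
    { (opc2) | 1, #name }

static const struct op table[] = {
    TWO_ENTRIES(xxbrh, 0x16),
    TWO_ENTRIES(xxbrw, 0x16),
};

/* Number of entries actually produced by the macro invocations. */
static size_t table_len(void)
{
    return sizeof(table) / sizeof(table[0]);
}
```

Under that assumption, the report looks like one of the false positives
the checkpatch output itself asks to be reported.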

* Re: [Qemu-devel] [PATCH v1 2/3] target-ppc: implement vnegw/d instructions
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 2/3] target-ppc: implement vnegw/d instructions Nikunj A Dadhania
@ 2016-10-13  0:12   ` David Gibson
  2016-10-19  5:12     ` Nikunj A Dadhania
  0 siblings, 1 reply; 12+ messages in thread
From: David Gibson @ 2016-10-13  0:12 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, benh

[-- Attachment #1: Type: text/plain, Size: 3745 bytes --]

On Wed, Oct 12, 2016 at 10:38:52AM +0530, Nikunj A Dadhania wrote:
> Vector Integer Negate Instructions:
> 
> vnegw: Vector Negate Word
> vnegd: Vector Negate Doubleword
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target-ppc/helper.h                 |  2 ++
>  target-ppc/int_helper.c             | 12 ++++++++++++
>  target-ppc/translate/vmx-impl.inc.c |  2 ++
>  target-ppc/translate/vmx-ops.inc.c  |  2 ++
>  4 files changed, 18 insertions(+)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 04c6421..5fcc546 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -272,6 +272,8 @@ DEF_HELPER_2(vextsh2w, void, avr, avr)
>  DEF_HELPER_2(vextsb2d, void, avr, avr)
>  DEF_HELPER_2(vextsh2d, void, avr, avr)
>  DEF_HELPER_2(vextsw2d, void, avr, avr)
> +DEF_HELPER_2(vnegw, void, avr, avr)
> +DEF_HELPER_2(vnegd, void, avr, avr)
>  DEF_HELPER_2(vupkhpx, void, avr, avr)
>  DEF_HELPER_2(vupklpx, void, avr, avr)
>  DEF_HELPER_2(vupkhsb, void, avr, avr)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 5aee0a8..7446e4e 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -1949,6 +1949,18 @@ VEXT_SIGNED(vextsh2d, s64, UINT16_MAX, int16_t, int64_t)
>  VEXT_SIGNED(vextsw2d, s64, UINT32_MAX, int32_t, int64_t)
>  #undef VEXT_SIGNED
>  
> +#define VNEG(name, element, mask)                                   \

The mask parameter appears to be unused.

> +void helper_##name(ppc_avr_t *r, ppc_avr_t *b)                      \
> +{                                                                   \
> +    int i;                                                          \
> +    VECTOR_FOR_INORDER_I(i, element) {                              \
> +        r->element[i] = -b->element[i];                             \
> +    }                                                               \
> +}
> +VNEG(vnegw, s32, UINT32_MAX)
> +VNEG(vnegd, s64, UINT64_MAX)
> +#undef VNEG
> +
>  #define VSPLTI(suffix, element, splat_type)                     \
>      void helper_vspltis##suffix(ppc_avr_t *r, uint32_t splat)   \
>      {                                                           \
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index c8998f3..563f101 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -815,6 +815,8 @@ GEN_VXFORM_NOA(vclzb, 1, 28)
>  GEN_VXFORM_NOA(vclzh, 1, 29)
>  GEN_VXFORM_NOA(vclzw, 1, 30)
>  GEN_VXFORM_NOA(vclzd, 1, 31)
> +GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
> +GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
>  GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
>  GEN_VXFORM_NOA_2(vextsh2w, 1, 24, 17)
>  GEN_VXFORM_NOA_2(vextsb2d, 1, 24, 24)
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index 68cba3e..ab64ab2 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -215,6 +215,8 @@ GEN_VXFORM_DUAL_INV(vspltish, vinserth, 6, 13, 0x00000000, 0x100000,
>  GEN_VXFORM_DUAL_INV(vspltisw, vinsertw, 6, 14, 0x00000000, 0x100000,
>                                                 PPC_ALTIVEC),
>  GEN_VXFORM_300_EXT(vinsertd, 6, 15, 0x100000),
> +GEN_VXFORM_300_EO(vnegw, 0x01, 0x18, 0x06),
> +GEN_VXFORM_300_EO(vnegd, 0x01, 0x18, 0x07),
>  GEN_VXFORM_300_EO(vextsb2w, 0x01, 0x18, 0x10),
>  GEN_VXFORM_300_EO(vextsh2w, 0x01, 0x18, 0x11),
>  GEN_VXFORM_300_EO(vextsb2d, 0x01, 0x18, 0x18),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction Nikunj A Dadhania
@ 2016-10-13  0:21   ` David Gibson
  2016-10-13 18:14     ` Richard Henderson
  0 siblings, 1 reply; 12+ messages in thread
From: David Gibson @ 2016-10-13  0:21 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, benh

[-- Attachment #1: Type: text/plain, Size: 9842 bytes --]

On Wed, Oct 12, 2016 at 10:38:53AM +0530, Nikunj A Dadhania wrote:
> Add required helpers (GEN_XX2FORM_EO) for supporting this instruction.
> 
> xxbrh: VSX Vector Byte-Reverse Halfword
> xxbrw: VSX Vector Byte-Reverse Word
> xxbrd: VSX Vector Byte-Reverse Doubleword
> xxbrq: VSX Vector Byte-Reverse Quadword
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target-ppc/translate.c              | 32 +++++++++++++++
>  target-ppc/translate/vsx-impl.inc.c | 77 +++++++++++++++++++++++++++++++++++++
>  target-ppc/translate/vsx-ops.inc.c  |  8 ++++
>  3 files changed, 117 insertions(+)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index dab8f19..94989b2 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -376,6 +376,9 @@ GEN_OPCODE2(name, onam, opc1, opc2, opc3, inval, type, type2)
>  #define GEN_HANDLER_E_2(name, opc1, opc2, opc3, opc4, inval, type, type2)     \
>  GEN_OPCODE3(name, opc1, opc2, opc3, opc4, inval, type, type2)
>  
> +#define GEN_HANDLER2_E_2(name, onam, opc1, opc2, opc3, opc4, inval, typ, typ2) \
> +GEN_OPCODE4(name, onam, opc1, opc2, opc3, opc4, inval, typ, typ2)
> +
>  typedef struct opcode_t {
>      unsigned char opc1, opc2, opc3, opc4;
>  #if HOST_LONG_BITS == 64 /* Explicitly align to 64 bits */
> @@ -662,6 +665,21 @@ EXTRACT_HELPER(IMM8, 11, 8);
>      },                                                                        \
>      .oname = stringify(name),                                                 \
>  }
> +#define GEN_OPCODE4(name, onam, op1, op2, op3, op4, invl, _typ, _typ2)        \
> +{                                                                             \
> +    .opc1 = op1,                                                              \
> +    .opc2 = op2,                                                              \
> +    .opc3 = op3,                                                              \
> +    .opc4 = op4,                                                              \
> +    .handler = {                                                              \
> +        .inval1  = invl,                                                      \
> +        .type = _typ,                                                         \
> +        .type2 = _typ2,                                                       \
> +        .handler = &gen_##name,                                               \
> +        .oname = onam,                                                        \
> +    },                                                                        \
> +    .oname = onam,                                                            \
> +}
>  #else
>  #define GEN_OPCODE(name, op1, op2, op3, invl, _typ, _typ2)                    \
>  {                                                                             \
> @@ -720,6 +738,20 @@ EXTRACT_HELPER(IMM8, 11, 8);
>      },                                                                        \
>      .oname = stringify(name),                                                 \
>  }
> +#define GEN_OPCODE4(name, onam, op1, op2, op3, op4, invl, _typ, _typ2)        \
> +{                                                                             \
> +    .opc1 = op1,                                                              \
> +    .opc2 = op2,                                                              \
> +    .opc3 = op3,                                                              \
> +    .opc4 = op4,                                                              \
> +    .handler = {                                                              \
> +        .inval1  = invl,                                                      \
> +        .type = _typ,                                                         \
> +        .type2 = _typ2,                                                       \
> +        .handler = &gen_##name,                                               \
> +    },                                                                        \
> +    .oname = onam,                                                            \
> +}
>  #endif
>  
>  /* SPR load/store helpers */
> diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
> index 23ec1e1..52af5c1 100644
> --- a/target-ppc/translate/vsx-impl.inc.c
> +++ b/target-ppc/translate/vsx-impl.inc.c
> @@ -132,6 +132,22 @@ static void gen_bswap16x8(TCGv_i64 outh, TCGv_i64 outl,
>      tcg_temp_free_i64(mask);
>  }
>  
> +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
> +                          TCGv_i64 inh, TCGv_i64 inl)
> +{
> +    TCGv_i64 hi = tcg_temp_new_i64();
> +    TCGv_i64 lo = tcg_temp_new_i64();
> +
> +    tcg_gen_bswap64_i64(hi, inh);
> +    tcg_gen_bswap64_i64(lo, inl);
> +    tcg_gen_shri_i64(outh, hi, 32);
> +    tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
> +    tcg_gen_shri_i64(outl, lo, 32);
> +    tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
> +
> +    tcg_temp_free_i64(hi);
> +    tcg_temp_free_i64(lo);
> +}

Is there actually any advantage to having this 128-bit operation
working on two 64-bit "register"s, as opposed to having a bswap32x2
that operates on a single 64-bit register and calling it twice?
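
To make the question concrete, here is a rough plain-C model (not TCG code) of what a single-register variant could look like. The names byte_reverse64 and bswap32x2 are hypothetical and only used for illustration; the rotate by 32 mirrors the shri/deposit pair in the patch.

```c
#include <stdint.h>

/* Portable full byte reverse of a 64-bit value: ABCDEFGH -> HGFEDCBA. */
uint64_t byte_reverse64(uint64_t x)
{
    /* Swap adjacent bytes, then adjacent 16-bit pairs, then the halves. */
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) | ((x >> 8) & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* Hypothetical single-register helper: byte-reverse each 32-bit word
 * of one 64-bit value, i.e. ABCDEFGH -> DCBAHGFE. */
uint64_t bswap32x2(uint64_t x)
{
    uint64_t r = byte_reverse64(x);   /* HGFEDCBA */
    return (r >> 32) | (r << 32);     /* swap halves: DCBAHGFE */
}
```

A 128-bit xxbrw would then simply call bswap32x2 once for the high doubleword and once for the low one.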

>  static void gen_lxvh8x(DisasContext *ctx)
>  {
>      TCGv EA;
> @@ -717,6 +733,67 @@ GEN_VSX_HELPER_2(xvrspim, 0x12, 0x0B, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
>  
> +static void gen_xxbrd(DisasContext *ctx)
> +{
> +    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
> +    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
> +    TCGv_i64 xbh = cpu_vsrh(xB(ctx->opcode));
> +    TCGv_i64 xbl = cpu_vsrl(xB(ctx->opcode));
> +
> +    if (unlikely(!ctx->vsx_enabled)) {
> +        gen_exception(ctx, POWERPC_EXCP_VSXU);
> +        return;
> +    }
> +    tcg_gen_bswap64_i64(xth, xbh);
> +    tcg_gen_bswap64_i64(xtl, xbl);
> +}
> +
> +static void gen_xxbrh(DisasContext *ctx)
> +{
> +    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
> +    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
> +    TCGv_i64 xbh = cpu_vsrh(xB(ctx->opcode));
> +    TCGv_i64 xbl = cpu_vsrl(xB(ctx->opcode));
> +
> +    if (unlikely(!ctx->vsx_enabled)) {
> +        gen_exception(ctx, POWERPC_EXCP_VSXU);
> +        return;
> +    }
> +    gen_bswap16x8(xth, xtl, xbh, xbl);

Likewise for the 16x8 version, I guess, although that would mean
changing the existing users.

> +}
> +
> +static void gen_xxbrq(DisasContext *ctx)
> +{
> +    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
> +    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
> +    TCGv_i64 xbh = cpu_vsrh(xB(ctx->opcode));
> +    TCGv_i64 xbl = cpu_vsrl(xB(ctx->opcode));
> +    TCGv_i64 t0 = tcg_temp_new_i64();
> +
> +    if (unlikely(!ctx->vsx_enabled)) {
> +        gen_exception(ctx, POWERPC_EXCP_VSXU);
> +        return;
> +    }
> +    tcg_gen_bswap64_i64(t0, xbl);
> +    tcg_gen_bswap64_i64(xtl, xbh);
> +    tcg_gen_bswap64_i64(xth, t0);

This looks wrong.  You swap xbl as you move it to t0, then swap it
again as you put it back into xth.  So it looks like you'll translate
           0011223344556677 8899AABBCCDDEEFF
to
           8899AABBCCDDEEFF 7766554433221100
whereas it should become
           FFEEDDCCBBAA9988 7766554433221100
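
The double swap can be checked with a small plain-C model of the three operations. This is only a sketch: byte_reverse64, xxbrq_as_posted and xxbrq_fixed are hypothetical names standing in for the TCG ops, and the fixed variant assumes the last step becomes a plain move. Note that the byte reverse of 8899AABBCCDDEEFF is FFEEDDCCBBAA9988.

```c
#include <stdint.h>

/* Portable full byte reverse of a 64-bit value. */
uint64_t byte_reverse64(uint64_t x)
{
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) | ((x >> 8) & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* Models the sequence as posted: the low input half is swapped twice,
 * so the high result half ends up as an unmodified copy of bl. */
void xxbrq_as_posted(uint64_t *th, uint64_t *tl, uint64_t bh, uint64_t bl)
{
    uint64_t t0 = byte_reverse64(bl);
    *tl = byte_reverse64(bh);
    *th = byte_reverse64(t0);   /* undoes the first swap */
}

/* Models the sequence with the final step as a move instead of a swap. */
void xxbrq_fixed(uint64_t *th, uint64_t *tl, uint64_t bh, uint64_t bl)
{
    uint64_t t0 = byte_reverse64(bl);
    *tl = byte_reverse64(bh);
    *th = t0;                   /* plain move keeps the reversed bytes */
}
```

The temporary is still needed in the fixed form because the target and source registers may be the same VSR.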

> +    tcg_temp_free_i64(t0);
> +}
> +
> +static void gen_xxbrw(DisasContext *ctx)
> +{
> +    TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
> +    TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
> +    TCGv_i64 xbh = cpu_vsrh(xB(ctx->opcode));
> +    TCGv_i64 xbl = cpu_vsrl(xB(ctx->opcode));
> +
> +    if (unlikely(!ctx->vsx_enabled)) {
> +        gen_exception(ctx, POWERPC_EXCP_VSXU);
> +        return;
> +    }
> +    gen_bswap32x4(xth, xtl, xbh, xbl);
> +}
> +
>  #define VSX_LOGICAL(name, tcg_op)                                    \
>  static void glue(gen_, name)(DisasContext * ctx)                     \
>      {                                                                \
> diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
> index 10eb4b9..af0d27e 100644
> --- a/target-ppc/translate/vsx-ops.inc.c
> +++ b/target-ppc/translate/vsx-ops.inc.c
> @@ -39,6 +39,10 @@ GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2)
>  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0, PPC_NONE, fl2), \
>  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2)
>  
> +#define GEN_XX2FORM_EO(name, opc2, opc3, opc4, fl2)                          \
> +GEN_HANDLER2_E_2(name, #name, 0x3C, opc2 | 0, opc3, opc4, 0, PPC_NONE, fl2), \
> +GEN_HANDLER2_E_2(name, #name, 0x3C, opc2 | 1, opc3, opc4, 0, PPC_NONE, fl2)
> +
>  #define GEN_XX3FORM(name, opc2, opc3, fl2)                           \
>  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 0, PPC_NONE, fl2), \
>  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2), \
> @@ -222,6 +226,10 @@ GEN_XX2FORM(xvrspic, 0x16, 0x0A, PPC2_VSX),
>  GEN_XX2FORM(xvrspim, 0x12, 0x0B, PPC2_VSX),
>  GEN_XX2FORM(xvrspip, 0x12, 0x0A, PPC2_VSX),
>  GEN_XX2FORM(xvrspiz, 0x12, 0x09, PPC2_VSX),
> +GEN_XX2FORM_EO(xxbrh, 0x16, 0x1D, 0x07, PPC2_ISA300),
> +GEN_XX2FORM_EO(xxbrw, 0x16, 0x1D, 0x0F, PPC2_ISA300),
> +GEN_XX2FORM_EO(xxbrd, 0x16, 0x1D, 0x17, PPC2_ISA300),
> +GEN_XX2FORM_EO(xxbrq, 0x16, 0x1D, 0x1F, PPC2_ISA300),
>  
>  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
>  GEN_XX3FORM(name, opc2, opc3, fl2)

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v1 1/3] target-ppc: implement vexts[bh]2w and vexts[bhw]2d
  2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 1/3] target-ppc: implement vexts[bh]2w and vexts[bhw]2d Nikunj A Dadhania
@ 2016-10-13  0:23   ` David Gibson
  0 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2016-10-13  0:23 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, benh

On Wed, Oct 12, 2016 at 10:38:51AM +0530, Nikunj A Dadhania wrote:
> Vector Extend Sign Instructions:
> 
> vextsb2w: Vector Extend Sign Byte To Word
> vextsh2w: Vector Extend Sign Halfword To Word
> vextsb2d: Vector Extend Sign Byte To Doubleword
> vextsh2d: Vector Extend Sign Halfword To Doubleword
> vextsw2d: Vector Extend Sign Word To Doubleword
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Applied to ppc-for-2.8, thanks.
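
For reference, the per-element behaviour of the VEXT_SIGNED helper can be modelled in plain C roughly as below. This is a sketch for one element only; extsb2w and extsw2d are hypothetical names, and like the helper it assumes the usual two's-complement wrap when narrowing to a signed type (implementation-defined in ISO C, but what the common compilers do).

```c
#include <stdint.h>

/* One 32-bit element of vextsb2w: keep the low byte, then sign-extend it. */
int32_t extsb2w(int32_t elem)
{
    return (int32_t)(int8_t)(elem & UINT8_MAX);
}

/* One 64-bit element of vextsw2d: keep the low word, then sign-extend it. */
int64_t extsw2d(int64_t elem)
{
    return (int64_t)(int32_t)(elem & UINT32_MAX);
}
```

The upper bits of each source element are discarded, so only the sign bit of the narrow field decides whether the result is filled with ones or zeros.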

> ---
>  target-ppc/helper.h                 |  5 +++++
>  target-ppc/int_helper.c             | 15 +++++++++++++++
>  target-ppc/translate/vmx-impl.inc.c |  5 +++++
>  target-ppc/translate/vmx-ops.inc.c  |  5 +++++
>  4 files changed, 30 insertions(+)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 796ad45..04c6421 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -267,6 +267,11 @@ DEF_HELPER_3(vinsertb, void, avr, avr, i32)
>  DEF_HELPER_3(vinserth, void, avr, avr, i32)
>  DEF_HELPER_3(vinsertw, void, avr, avr, i32)
>  DEF_HELPER_3(vinsertd, void, avr, avr, i32)
> +DEF_HELPER_2(vextsb2w, void, avr, avr)
> +DEF_HELPER_2(vextsh2w, void, avr, avr)
> +DEF_HELPER_2(vextsb2d, void, avr, avr)
> +DEF_HELPER_2(vextsh2d, void, avr, avr)
> +DEF_HELPER_2(vextsw2d, void, avr, avr)
>  DEF_HELPER_2(vupkhpx, void, avr, avr)
>  DEF_HELPER_2(vupklpx, void, avr, avr)
>  DEF_HELPER_2(vupkhsb, void, avr, avr)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 202854f..5aee0a8 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -1934,6 +1934,21 @@ VEXTRACT(uw, u32)
>  VEXTRACT(d, u64)
>  #undef VEXTRACT
>  
> +#define VEXT_SIGNED(name, element, mask, cast, recast)              \
> +void helper_##name(ppc_avr_t *r, ppc_avr_t *b)                      \
> +{                                                                   \
> +    int i;                                                          \
> +    VECTOR_FOR_INORDER_I(i, element) {                              \
> +        r->element[i] = (recast)((cast)(b->element[i] & mask));     \
> +    }                                                               \
> +}
> +VEXT_SIGNED(vextsb2w, s32, UINT8_MAX, int8_t, int32_t)
> +VEXT_SIGNED(vextsb2d, s64, UINT8_MAX, int8_t, int64_t)
> +VEXT_SIGNED(vextsh2w, s32, UINT16_MAX, int16_t, int32_t)
> +VEXT_SIGNED(vextsh2d, s64, UINT16_MAX, int16_t, int64_t)
> +VEXT_SIGNED(vextsw2d, s64, UINT32_MAX, int32_t, int64_t)
> +#undef VEXT_SIGNED
> +
>  #define VSPLTI(suffix, element, splat_type)                     \
>      void helper_vspltis##suffix(ppc_avr_t *r, uint32_t splat)   \
>      {                                                           \
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index 25cd073..c8998f3 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -815,6 +815,11 @@ GEN_VXFORM_NOA(vclzb, 1, 28)
>  GEN_VXFORM_NOA(vclzh, 1, 29)
>  GEN_VXFORM_NOA(vclzw, 1, 30)
>  GEN_VXFORM_NOA(vclzd, 1, 31)
> +GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
> +GEN_VXFORM_NOA_2(vextsh2w, 1, 24, 17)
> +GEN_VXFORM_NOA_2(vextsb2d, 1, 24, 24)
> +GEN_VXFORM_NOA_2(vextsh2d, 1, 24, 25)
> +GEN_VXFORM_NOA_2(vextsw2d, 1, 24, 26)
>  GEN_VXFORM_NOA_2(vctzb, 1, 24, 28)
>  GEN_VXFORM_NOA_2(vctzh, 1, 24, 29)
>  GEN_VXFORM_NOA_2(vctzw, 1, 24, 30)
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index ac1dc9b..68cba3e 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -215,6 +215,11 @@ GEN_VXFORM_DUAL_INV(vspltish, vinserth, 6, 13, 0x00000000, 0x100000,
>  GEN_VXFORM_DUAL_INV(vspltisw, vinsertw, 6, 14, 0x00000000, 0x100000,
>                                                 PPC_ALTIVEC),
>  GEN_VXFORM_300_EXT(vinsertd, 6, 15, 0x100000),
> +GEN_VXFORM_300_EO(vextsb2w, 0x01, 0x18, 0x10),
> +GEN_VXFORM_300_EO(vextsh2w, 0x01, 0x18, 0x11),
> +GEN_VXFORM_300_EO(vextsb2d, 0x01, 0x18, 0x18),
> +GEN_VXFORM_300_EO(vextsh2d, 0x01, 0x18, 0x19),
> +GEN_VXFORM_300_EO(vextsw2d, 0x01, 0x18, 0x1A),
>  GEN_VXFORM_300_EO(vctzb, 0x01, 0x18, 0x1C),
>  GEN_VXFORM_300_EO(vctzh, 0x01, 0x18, 0x1D),
>  GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),


* Re: [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction
  2016-10-13  0:21   ` David Gibson
@ 2016-10-13 18:14     ` Richard Henderson
  2016-10-13 22:31       ` David Gibson
  2016-10-19  5:13       ` Nikunj A Dadhania
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Henderson @ 2016-10-13 18:14 UTC (permalink / raw)
  To: David Gibson, Nikunj A Dadhania; +Cc: qemu-ppc, qemu-devel, benh

On 10/12/2016 07:21 PM, David Gibson wrote:
>> +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
>> +                          TCGv_i64 inh, TCGv_i64 inl)
>> +{
>> +    TCGv_i64 hi = tcg_temp_new_i64();
>> +    TCGv_i64 lo = tcg_temp_new_i64();
>> +
>> +    tcg_gen_bswap64_i64(hi, inh);
>> +    tcg_gen_bswap64_i64(lo, inl);
>> +    tcg_gen_shri_i64(outh, hi, 32);
>> +    tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
>> +    tcg_gen_shri_i64(outl, lo, 32);
>> +    tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
>> +
>> +    tcg_temp_free_i64(hi);
>> +    tcg_temp_free_i64(lo);
>> +}
>
> Is there actually any advantage to having this 128-bit operation
> working on two 64-bit "register"s, as opposed to having a bswap32x2
> that operates on a single 64-bit register and calling it twice?

For this one, no particular advantage.

>> +    gen_bswap16x8(xth, xtl, xbh, xbl);
>
> Likewise for the 16x8 version, I guess, although that would mean
> changing the existing users.

For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff.  On some 
hosts that's up to 6 insns.  Being able to reuse that for both swaps is useful.
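
As a rough illustration of that point, a plain-C model (not the actual gen_bswap16x8 body; the function names here are hypothetical) where the mask is set up once and shared by both 64-bit halves:

```c
#include <stdint.h>

/* Byte-swap each 16-bit lane of one 64-bit value, using a mask that
 * the caller has already materialised. */
uint64_t bswap16x4_masked(uint64_t x, uint64_t mask)
{
    return ((x & mask) << 8) | ((x >> 8) & mask);
}

/* 128-bit version: the constant is built once and reused for both halves,
 * which is the saving a combined helper gets over two independent calls. */
void bswap16x8_model(uint64_t *outh, uint64_t *outl,
                     uint64_t inh, uint64_t inl)
{
    const uint64_t mask = 0x00ff00ff00ff00ffULL;

    *outh = bswap16x4_masked(inh, mask);
    *outl = bswap16x4_masked(inl, mask);
}
```

A single-register helper called twice would have to rebuild the constant on each call unless the caller passed it in.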

>> +    tcg_gen_bswap64_i64(t0, xbl);
>> +    tcg_gen_bswap64_i64(xtl, xbh);
>> +    tcg_gen_bswap64_i64(xth, t0);
>
> This looks wrong.  You swap xbl as you move it to t0, then swap it
> again as you put it back into xth.  So it looks like you'll translate
>            0011223344556677 8899AABBCCDDEEFF
> to
>            8899AABBCCDDEEFF 7766554433221100
> whereas it should become
>            FFEEDDCCBBAA9988 7766554433221100

Indeed, the third line should be a move, not a swap.


r~


* Re: [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction
  2016-10-13 18:14     ` Richard Henderson
@ 2016-10-13 22:31       ` David Gibson
  2016-10-19  5:13       ` Nikunj A Dadhania
  1 sibling, 0 replies; 12+ messages in thread
From: David Gibson @ 2016-10-13 22:31 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Nikunj A Dadhania, qemu-ppc, qemu-devel, benh

On Thu, Oct 13, 2016 at 01:14:35PM -0500, Richard Henderson wrote:
> On 10/12/2016 07:21 PM, David Gibson wrote:
> > > +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
> > > +                          TCGv_i64 inh, TCGv_i64 inl)
> > > +{
> > > +    TCGv_i64 hi = tcg_temp_new_i64();
> > > +    TCGv_i64 lo = tcg_temp_new_i64();
> > > +
> > > +    tcg_gen_bswap64_i64(hi, inh);
> > > +    tcg_gen_bswap64_i64(lo, inl);
> > > +    tcg_gen_shri_i64(outh, hi, 32);
> > > +    tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
> > > +    tcg_gen_shri_i64(outl, lo, 32);
> > > +    tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
> > > +
> > > +    tcg_temp_free_i64(hi);
> > > +    tcg_temp_free_i64(lo);
> > > +}
> > 
> > Is there actually any advantage to having this 128-bit operation
> > working on two 64-bit "register"s, as opposed to having a bswap32x2
> > that operates on a single 64-bit register and calling it twice?
> 
> For this one, no particular advantage.
> 
> > > +    gen_bswap16x8(xth, xtl, xbh, xbl);
> > 
> > Likewise for the 16x8 version, I guess, although that would mean
> > changing the existing users.
> 
> For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff.  On
> some hosts that's up to 6 insns.  Being able to reuse that for both swaps
> is useful.

Ah, good point.

> > > +    tcg_gen_bswap64_i64(t0, xbl);
> > > +    tcg_gen_bswap64_i64(xtl, xbh);
> > > +    tcg_gen_bswap64_i64(xth, t0);
> > 
> > This looks wrong.  You swap xbl as you move it to t0, then swap it
> > again as you put it back into xth.  So it looks like you'll translate
> >            0011223344556677 8899AABBCCDDEEFF
> > to
> >            8899AABBCCDDEEFF 7766554433221100
> > whereas it should become
> >            FFEEDDCCBBAA9988 7766554433221100
> 
> Indeed, the third line should be a move, not a swap.
> 
> 
> r~
> 


* Re: [Qemu-devel] [PATCH v1 2/3] target-ppc: implement vnegw/d instructions
  2016-10-13  0:12   ` David Gibson
@ 2016-10-19  5:12     ` Nikunj A Dadhania
  0 siblings, 0 replies; 12+ messages in thread
From: Nikunj A Dadhania @ 2016-10-19  5:12 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, rth, qemu-devel, benh

David Gibson <david@gibson.dropbear.id.au> writes:

> On Wed, Oct 12, 2016 at 10:38:52AM +0530, Nikunj A Dadhania wrote:
>> Vector Integer Negate Instructions:
>> 
>> vnegw: Vector Negate Word
>> vnegd: Vector Negate Doubleword
>> 
>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>> ---
>>  target-ppc/helper.h                 |  2 ++
>>  target-ppc/int_helper.c             | 12 ++++++++++++
>>  target-ppc/translate/vmx-impl.inc.c |  2 ++
>>  target-ppc/translate/vmx-ops.inc.c  |  2 ++
>>  4 files changed, 18 insertions(+)
>> 
>> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
>> index 04c6421..5fcc546 100644
>> --- a/target-ppc/helper.h
>> +++ b/target-ppc/helper.h
>> @@ -272,6 +272,8 @@ DEF_HELPER_2(vextsh2w, void, avr, avr)
>>  DEF_HELPER_2(vextsb2d, void, avr, avr)
>>  DEF_HELPER_2(vextsh2d, void, avr, avr)
>>  DEF_HELPER_2(vextsw2d, void, avr, avr)
>> +DEF_HELPER_2(vnegw, void, avr, avr)
>> +DEF_HELPER_2(vnegd, void, avr, avr)
>>  DEF_HELPER_2(vupkhpx, void, avr, avr)
>>  DEF_HELPER_2(vupklpx, void, avr, avr)
>>  DEF_HELPER_2(vupkhsb, void, avr, avr)
>> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
>> index 5aee0a8..7446e4e 100644
>> --- a/target-ppc/int_helper.c
>> +++ b/target-ppc/int_helper.c
>> @@ -1949,6 +1949,18 @@ VEXT_SIGNED(vextsh2d, s64, UINT16_MAX, int16_t, int64_t)
>>  VEXT_SIGNED(vextsw2d, s64, UINT32_MAX, int32_t, int64_t)
>>  #undef VEXT_SIGNED
>>  
>> +#define VNEG(name, element, mask)                                   \
>
> The mask parameter appears to be unused.

Yes, will remove it and send the updated patch.

Regards
Nikunj


* Re: [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction
  2016-10-13 18:14     ` Richard Henderson
  2016-10-13 22:31       ` David Gibson
@ 2016-10-19  5:13       ` Nikunj A Dadhania
  1 sibling, 0 replies; 12+ messages in thread
From: Nikunj A Dadhania @ 2016-10-19  5:13 UTC (permalink / raw)
  To: Richard Henderson, David Gibson; +Cc: qemu-ppc, qemu-devel, benh

Richard Henderson <rth@twiddle.net> writes:

> On 10/12/2016 07:21 PM, David Gibson wrote:
>>> +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
>>> +                          TCGv_i64 inh, TCGv_i64 inl)
>>> +{
>>> +    TCGv_i64 hi = tcg_temp_new_i64();
>>> +    TCGv_i64 lo = tcg_temp_new_i64();
>>> +
>>> +    tcg_gen_bswap64_i64(hi, inh);
>>> +    tcg_gen_bswap64_i64(lo, inl);
>>> +    tcg_gen_shri_i64(outh, hi, 32);
>>> +    tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
>>> +    tcg_gen_shri_i64(outl, lo, 32);
>>> +    tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
>>> +
>>> +    tcg_temp_free_i64(hi);
>>> +    tcg_temp_free_i64(lo);
>>> +}
>>
>> Is there actually any advantage to having this 128-bit operation
>> working on two 64-bit "register"s, as opposed to having a bswap32x2
>> that operates on a single 64-bit register and calling it twice?
>
> For this one, no particular advantage.
>
>>> +    gen_bswap16x8(xth, xtl, xbh, xbl);
>>
>> Likewise for the 16x8 version, I guess, although that would mean
>> changing the existing users.
>
> For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff.  On some 
> hosts that's up to 6 insns.  Being able to reuse that for both swaps is useful.
>
>>> +    tcg_gen_bswap64_i64(t0, xbl);
>>> +    tcg_gen_bswap64_i64(xtl, xbh);
>>> +    tcg_gen_bswap64_i64(xth, t0);
>>
>> This looks wrong.  You swap xbl as you move it to t0, then swap it
>> again as you put it back into xth.  So it looks like you'll translate
>>            0011223344556677 8899AABBCCDDEEFF
>> to
>>            8899AABBCCDDEEFF 7766554433221100
>> whereas it should become
>>            FFEEDDCCBBAA9988 7766554433221100
>
> Indeed, the third line should be a move, not a swap.

Correct, will send updated patch.

Regards
Nikunj


end of thread, other threads:[~2016-10-19  5:13 UTC | newest]

Thread overview: 12+ messages
2016-10-12  5:08 [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6 Nikunj A Dadhania
2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 1/3] target-ppc: implement vexts[bh]2w and vexts[bhw]2d Nikunj A Dadhania
2016-10-13  0:23   ` David Gibson
2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 2/3] target-ppc: implement vnegw/d instructions Nikunj A Dadhania
2016-10-13  0:12   ` David Gibson
2016-10-19  5:12     ` Nikunj A Dadhania
2016-10-12  5:08 ` [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction Nikunj A Dadhania
2016-10-13  0:21   ` David Gibson
2016-10-13 18:14     ` Richard Henderson
2016-10-13 22:31       ` David Gibson
2016-10-19  5:13       ` Nikunj A Dadhania
2016-10-12  8:50 ` [Qemu-devel] [PATCH v1 0/3] POWER9 TCG enablements - part6 no-reply
