[PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions
@ 2020-06-25 17:00 Lijun Pan
  2020-06-25 17:00 ` [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag Lijun Pan
                   ` (7 more replies)
  0 siblings, 8 replies; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

This patch series add several newly introduced 32/64-bit vector
instructions in Power ISA 3.1. Power ISA 3.1 flag is introduced in
this version. In v3 version, coding style issues are fixed, community
reviews/suggestions are taken into consideration.

Lijun Pan (8):
  target/ppc: Introduce Power ISA 3.1 flag
  target/ppc: add byte-reverse br[dwh] instructions
  target/ppc: convert vmuluwm to tcg_gen_gvec_mul
  target/ppc: add vmulld instruction
  target/ppc: add vmulh{su}w instructions
  fix the prototype of muls64/mulu64
  target/ppc: add vmulh{su}d instructions
  target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions

 include/qemu/host-utils.h           |  4 +-
 target/ppc/cpu.h                    |  4 +-
 target/ppc/helper.h                 | 13 ++++-
 target/ppc/int_helper.c             | 74 ++++++++++++++++++++++++-----
 target/ppc/translate.c              | 41 ++++++++++++++++
 target/ppc/translate/vmx-impl.inc.c | 26 +++++++++-
 target/ppc/translate/vmx-ops.inc.c  | 27 +++++++++--
 target/ppc/translate_init.inc.c     |  2 +-
 tcg/ppc/tcg-target.h                |  2 +
 tcg/ppc/tcg-target.inc.c            |  7 ++-
 10 files changed, 175 insertions(+), 25 deletions(-)

-- 
2.23.0



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag
  2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
@ 2020-06-25 17:00 ` Lijun Pan
  2020-06-25 17:40   ` Richard Henderson
  2020-06-25 17:00 ` [PATCH v3 2/8] target/ppc: add byte-reverse br[dwh] instructions Lijun Pan
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

This flag will be used for Power10 instructions.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 target/ppc/cpu.h                | 4 +++-
 target/ppc/translate_init.inc.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 1988b436cb..ebb5a0811a 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2191,6 +2191,8 @@ enum {
     PPC2_PM_ISA206     = 0x0000000000040000ULL,
     /* POWER ISA 3.0                                                         */
     PPC2_ISA300        = 0x0000000000080000ULL,
+    /* POWER ISA 3.1                                                         */
+    PPC2_ISA310        = 0x0000000000100000ULL,
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
                         PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
@@ -2199,7 +2201,7 @@ enum {
                         PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \
                         PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \
                         PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \
-                        PPC2_ISA300)
+                        PPC2_ISA300 | PPC2_ISA310)
 };
 
 /*****************************************************************************/
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 38cb773ab4..3f72310e60 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -9206,7 +9206,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
                         PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
                         PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
                         PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
-                        PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
+                        PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310;
     pcc->msr_mask = (1ull << MSR_SF) |
                     (1ull << MSR_HV) |
                     (1ull << MSR_TM) |
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 2/8] target/ppc: add byte-reverse br[dwh] instructions
  2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
  2020-06-25 17:00 ` [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag Lijun Pan
@ 2020-06-25 17:00 ` Lijun Pan
  2020-06-25 17:42   ` Richard Henderson
  2020-06-25 17:00 ` [PATCH v3 3/8] target/ppc: convert vmuluwm to tcg_gen_gvec_mul Lijun Pan
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

POWER ISA 3.1 introduces following byte-reverse instructions:
brd: Byte-Reverse Doubleword X-form
brw: Byte-Reverse Word X-form
brh: Byte-Reverse Halfword X-form

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
v3: fix the store issue in br[dwh]
    simplify brw implementation
    add "if defined(TARGET_PPC64)"

 target/ppc/translate.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 4ce3d664b5..6634b38f3a 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6971,7 +6971,45 @@ static void gen_dform3D(DisasContext *ctx)
     return gen_invalid(ctx);
 }
 
+/* brd */
+static void gen_brd(DisasContext *ctx)
+{
+    tcg_gen_bswap64_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
+}
+
+/* brw */
+static void gen_brw(DisasContext *ctx)
+{
+    tcg_gen_bswap64_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
+    tcg_gen_rotli_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rA(ctx->opcode)], 32);
+
+}
+
+/* brh */
+static void gen_brh(DisasContext *ctx)
+{
+    TCGv_i64 t0 = tcg_temp_new_i64();
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+
+    tcg_gen_movi_i64(t0, 0x00ff00ff00ff00ffull);
+    tcg_gen_shri_i64(t1, cpu_gpr[rS(ctx->opcode)], 8);
+    tcg_gen_and_i64(t2, t1, t0);
+    tcg_gen_and_i64(t1, cpu_gpr[rS(ctx->opcode)], t0);
+    tcg_gen_shli_i64(t1, t1, 8);
+    tcg_gen_or_i64(cpu_gpr[rA(ctx->opcode)], t1, t2);
+
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
 static opcode_t opcodes[] = {
+#if defined(TARGET_PPC64)
+GEN_HANDLER_E(brd, 0x1F, 0x1B, 0x05, 0x0000F801, PPC_NONE, PPC2_ISA310),
+GEN_HANDLER_E(brw, 0x1F, 0x1B, 0x04, 0x0000F801, PPC_NONE, PPC2_ISA310),
+GEN_HANDLER_E(brh, 0x1F, 0x1B, 0x06, 0x0000F801, PPC_NONE, PPC2_ISA310),
+#endif
 GEN_HANDLER(invalid, 0x00, 0x00, 0x00, 0xFFFFFFFF, PPC_NONE),
 GEN_HANDLER(cmp, 0x1F, 0x00, 0x00, 0x00400000, PPC_INTEGER),
 GEN_HANDLER(cmpi, 0x0B, 0xFF, 0xFF, 0x00400000, PPC_INTEGER),
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 3/8] target/ppc: convert vmuluwm to tcg_gen_gvec_mul
  2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
  2020-06-25 17:00 ` [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag Lijun Pan
  2020-06-25 17:00 ` [PATCH v3 2/8] target/ppc: add byte-reverse br[dwh] instructions Lijun Pan
@ 2020-06-25 17:00 ` Lijun Pan
  2020-06-25 17:52   ` Richard Henderson
  2020-06-25 17:00 ` [PATCH v3 4/8] target/ppc: add vmulld instruction Lijun Pan
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

Convert the original implementation of vmuluwm to the more generic
tcg_gen_gvec_mul.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
v3: newly introduced.

 target/ppc/helper.h                 |  1 -
 target/ppc/int_helper.c             | 13 -------------
 target/ppc/translate/vmx-impl.inc.c |  2 +-
 3 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 2dfa1c6942..69416b6d7c 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -184,7 +184,6 @@ DEF_HELPER_3(vmulosw, void, avr, avr, avr)
 DEF_HELPER_3(vmuloub, void, avr, avr, avr)
 DEF_HELPER_3(vmulouh, void, avr, avr, avr)
 DEF_HELPER_3(vmulouw, void, avr, avr, avr)
-DEF_HELPER_3(vmuluwm, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index be53cd6f68..bd3e6d7cc7 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -523,19 +523,6 @@ void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
     r->VsrD(0) = 0;
 }
 
-#define VARITH_DO(name, op, element)                                    \
-    void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)       \
-    {                                                                   \
-        int i;                                                          \
-                                                                        \
-        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                  \
-            r->element[i] = a->element[i] op b->element[i];             \
-        }                                                               \
-    }
-VARITH_DO(muluwm, *, u32)
-#undef VARITH_DO
-#undef VARITH
-
 #define VARITHFP(suffix, func)                                          \
     void helper_v##suffix(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, \
                           ppc_avr_t *b)                                 \
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 403ed3a01c..6e79ffa650 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -801,7 +801,7 @@ static void trans_vclzd(DisasContext *ctx)
 GEN_VXFORM(vmuloub, 4, 0);
 GEN_VXFORM(vmulouh, 4, 1);
 GEN_VXFORM(vmulouw, 4, 2);
-GEN_VXFORM(vmuluwm, 4, 2);
+GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
 GEN_VXFORM_DUAL(vmulouw, PPC_ALTIVEC, PPC_NONE,
                 vmuluwm, PPC_NONE, PPC2_ALTIVEC_207)
 GEN_VXFORM(vmulosb, 4, 4);
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 4/8] target/ppc: add vmulld instruction
  2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
                   ` (2 preceding siblings ...)
  2020-06-25 17:00 ` [PATCH v3 3/8] target/ppc: convert vmuluwm to tcg_gen_gvec_mul Lijun Pan
@ 2020-06-25 17:00 ` Lijun Pan
  2020-06-25 18:25   ` Richard Henderson
  2020-06-25 17:00 ` [PATCH v3 5/8] target/ppc: add vmulh{su}w instructions Lijun Pan
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

vmulld: Vector Multiply Low Doubleword.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
v3: use tcg_gen_gvec_mul()

 target/ppc/translate/vmx-impl.inc.c | 1 +
 target/ppc/translate/vmx-ops.inc.c  | 4 ++++
 tcg/ppc/tcg-target.h                | 2 ++
 tcg/ppc/tcg-target.inc.c            | 7 +++++--
 4 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 6e79ffa650..8c89738552 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -807,6 +807,7 @@ GEN_VXFORM_DUAL(vmulouw, PPC_ALTIVEC, PPC_NONE,
 GEN_VXFORM(vmulosb, 4, 4);
 GEN_VXFORM(vmulosh, 4, 5);
 GEN_VXFORM(vmulosw, 4, 6);
+GEN_VXFORM_V(vmulld, MO_64, tcg_gen_gvec_mul, 4, 7);
 GEN_VXFORM(vmuleub, 4, 8);
 GEN_VXFORM(vmuleuh, 4, 9);
 GEN_VXFORM(vmuleuw, 4, 10);
diff --git a/target/ppc/translate/vmx-ops.inc.c b/target/ppc/translate/vmx-ops.inc.c
index 84e05fb827..b49787ac97 100644
--- a/target/ppc/translate/vmx-ops.inc.c
+++ b/target/ppc/translate/vmx-ops.inc.c
@@ -48,6 +48,9 @@ GEN_HANDLER_E(name, 0x04, opc2, opc3, inval, PPC_NONE, PPC2_ISA300)
 GEN_HANDLER_E_2(name, 0x04, opc2, opc3, opc4, 0x00000000, PPC_NONE,     \
                                                        PPC2_ISA300)
 
+#define GEN_VXFORM_310(name, opc2, opc3)                                \
+GEN_HANDLER_E(name, 0x04, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ISA310)
+
 #define GEN_VXFORM_DUAL(name0, name1, opc2, opc3, type0, type1) \
 GEN_HANDLER_E(name0##_##name1, 0x4, opc2, opc3, 0x00000000, type0, type1)
 
@@ -104,6 +107,7 @@ GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vmulosb, 4, 4),
 GEN_VXFORM(vmulosh, 4, 5),
 GEN_VXFORM_207(vmulosw, 4, 6),
+GEN_VXFORM_310(vmulld, 4, 7),
 GEN_VXFORM(vmuleub, 4, 8),
 GEN_VXFORM(vmuleuh, 4, 9),
 GEN_VXFORM_207(vmuleuw, 4, 10),
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 4fa21f0e71..ff1249ef8e 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -63,6 +63,7 @@ typedef enum {
     tcg_isa_2_06,
     tcg_isa_2_07,
     tcg_isa_3_00,
+    tcg_isa_3_10,
 } TCGPowerISA;
 
 extern TCGPowerISA have_isa;
@@ -72,6 +73,7 @@ extern bool have_vsx;
 #define have_isa_2_06  (have_isa >= tcg_isa_2_06)
 #define have_isa_2_07  (have_isa >= tcg_isa_2_07)
 #define have_isa_3_00  (have_isa >= tcg_isa_3_00)
+#define have_isa_3_10  (have_isa >= tcg_isa_3_10)
 
 /* optional instructions automatically implemented */
 #define TCG_TARGET_HAS_ext8u_i32        0 /* andi */
diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c
index ee1f9227c1..de0feaf7a8 100644
--- a/tcg/ppc/tcg-target.inc.c
+++ b/tcg/ppc/tcg-target.inc.c
@@ -564,6 +564,7 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
 #define VMULOUH    VX4(72)
 #define VMULOUW    VX4(136)       /* v2.07 */
 #define VMULUWM    VX4(137)       /* v2.07 */
+#define VMULLD     VX4(457)       /* v3.10 */
 #define VMSUMUHM   VX4(38)
 
 #define VMRGHB     VX4(12)
@@ -3149,6 +3150,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     static const uint32_t
         add_op[4] = { VADDUBM, VADDUHM, VADDUWM, VADDUDM },
         sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM },
+        mul_op[4] = { 0, 0, VMULUWM, VMULLD },
         neg_op[4] = { 0, 0, VNEGW, VNEGD },
         eq_op[4]  = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD },
         ne_op[4]  = { VCMPNEB, VCMPNEH, VCMPNEW, 0 },
@@ -3199,8 +3201,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
         a1 = 0;
         break;
     case INDEX_op_mul_vec:
-        tcg_debug_assert(vece == MO_32 && have_isa_2_07);
-        insn = VMULUWM;
+        tcg_debug_assert((vece == MO_32 && have_isa_2_07) ||
+                         (vece == MO_64 && have_isa_3_10));
+        insn = mul_op[vece];
         break;
     case INDEX_op_ssadd_vec:
         insn = ssadd_op[vece];
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 5/8] target/ppc: add vmulh{su}w instructions
  2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
                   ` (3 preceding siblings ...)
  2020-06-25 17:00 ` [PATCH v3 4/8] target/ppc: add vmulld instruction Lijun Pan
@ 2020-06-25 17:00 ` Lijun Pan
  2020-06-25 18:26   ` Richard Henderson
  2020-06-25 17:00 ` [PATCH v3 6/8] fix the prototype of muls64/mulu64 Lijun Pan
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

vmulhsw: Vector Multiply High Signed Word
vmulhuw: Vector Multiply High Unsigned Word

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
v3: inline the helper_vmulh{su}w multiply directly instead of using macro

 target/ppc/helper.h                 |  2 ++
 target/ppc/int_helper.c             | 19 +++++++++++++++++++
 target/ppc/translate/vmx-impl.inc.c |  6 ++++++
 target/ppc/translate/vmx-ops.inc.c  |  4 ++--
 4 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 69416b6d7c..3b3013866a 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -184,6 +184,8 @@ DEF_HELPER_3(vmulosw, void, avr, avr, avr)
 DEF_HELPER_3(vmuloub, void, avr, avr, avr)
 DEF_HELPER_3(vmulouh, void, avr, avr, avr)
 DEF_HELPER_3(vmulouw, void, avr, avr, avr)
+DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
+DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index bd3e6d7cc7..a3a20821fc 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1086,6 +1086,25 @@ VMUL(uw, u32, VsrW, VsrD, uint64_t)
 #undef VMUL_DO_ODD
 #undef VMUL
 
+void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+    int i;
+
+    for (i = 0; i < 4; i++) {
+        r->s32[i] = (int32_t)(((int64_t)a->s32[i] * (int64_t)b->s32[i]) >> 32);
+    }
+}
+
+void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+    int i;
+
+    for (i = 0; i < 4; i++) {
+        r->u32[i] = (uint32_t)(((uint64_t)a->u32[i] *
+                               (uint64_t)b->u32[i]) >> 32);
+    }
+}
+
 void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
                   ppc_avr_t *c)
 {
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 8c89738552..50bac375fc 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -811,9 +811,15 @@ GEN_VXFORM_V(vmulld, MO_64, tcg_gen_gvec_mul, 4, 7);
 GEN_VXFORM(vmuleub, 4, 8);
 GEN_VXFORM(vmuleuh, 4, 9);
 GEN_VXFORM(vmuleuw, 4, 10);
+GEN_VXFORM(vmulhuw, 4, 10);
+GEN_VXFORM_DUAL(vmuleuw, PPC_ALTIVEC, PPC_NONE,
+                vmulhuw, PPC_NONE, PPC2_ISA310);
 GEN_VXFORM(vmulesb, 4, 12);
 GEN_VXFORM(vmulesh, 4, 13);
 GEN_VXFORM(vmulesw, 4, 14);
+GEN_VXFORM(vmulhsw, 4, 14);
+GEN_VXFORM_DUAL(vmulesw, PPC_ALTIVEC, PPC_NONE,
+                vmulhsw, PPC_NONE, PPC2_ISA310);
 GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
 GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
 GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
diff --git a/target/ppc/translate/vmx-ops.inc.c b/target/ppc/translate/vmx-ops.inc.c
index b49787ac97..29701ad778 100644
--- a/target/ppc/translate/vmx-ops.inc.c
+++ b/target/ppc/translate/vmx-ops.inc.c
@@ -110,10 +110,10 @@ GEN_VXFORM_207(vmulosw, 4, 6),
 GEN_VXFORM_310(vmulld, 4, 7),
 GEN_VXFORM(vmuleub, 4, 8),
 GEN_VXFORM(vmuleuh, 4, 9),
-GEN_VXFORM_207(vmuleuw, 4, 10),
+GEN_VXFORM_DUAL(vmuleuw, vmulhuw, 4, 10, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vmulesb, 4, 12),
 GEN_VXFORM(vmulesh, 4, 13),
-GEN_VXFORM_207(vmulesw, 4, 14),
+GEN_VXFORM_DUAL(vmulesw, vmulhsw, 4, 14, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vslb, 2, 4),
 GEN_VXFORM(vslh, 2, 5),
 GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 6/8] fix the prototype of muls64/mulu64
  2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
                   ` (4 preceding siblings ...)
  2020-06-25 17:00 ` [PATCH v3 5/8] target/ppc: add vmulh{su}w instructions Lijun Pan
@ 2020-06-25 17:00 ` Lijun Pan
  2020-06-25 18:28   ` Richard Henderson
  2020-06-25 17:00 ` [PATCH v3 7/8] target/ppc: add vmulh{su}d instructions Lijun Pan
  2020-06-25 17:00 ` [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions Lijun Pan
  7 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

The prototypes of muls64/mulu64 in host-utils.h should match the
definitions in host-utils.c

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 include/qemu/host-utils.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index 4cd170e6cd..cdca2991d8 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -77,8 +77,8 @@ static inline int divs128(int64_t *plow, int64_t *phigh, int64_t divisor)
     }
 }
 #else
-void muls64(uint64_t *phigh, uint64_t *plow, int64_t a, int64_t b);
-void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b);
+void muls64(uint64_t *plow, uint64_t *phigh, int64_t a, int64_t b);
+void mulu64(uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b);
 int divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor);
 int divs128(int64_t *plow, int64_t *phigh, int64_t divisor);
 
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 7/8] target/ppc: add vmulh{su}d instructions
  2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
                   ` (5 preceding siblings ...)
  2020-06-25 17:00 ` [PATCH v3 6/8] fix the prototype of muls64/mulu64 Lijun Pan
@ 2020-06-25 17:00 ` Lijun Pan
  2020-06-25 18:32   ` Richard Henderson
  2020-06-25 17:00 ` [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions Lijun Pan
  7 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

vmulhsd: Vector Multiply High Signed Doubleword
vmulhud: Vector Multiply High Unsigned Doubleword

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
v3: simplify helper_vmulh{su}d 

 target/ppc/helper.h                 |  2 ++
 target/ppc/int_helper.c             | 16 ++++++++++++++++
 target/ppc/translate/vmx-impl.inc.c |  2 ++
 target/ppc/translate/vmx-ops.inc.c  |  2 ++
 4 files changed, 22 insertions(+)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 3b3013866a..0036788919 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -186,6 +186,8 @@ DEF_HELPER_3(vmulouh, void, avr, avr, avr)
 DEF_HELPER_3(vmulouw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
+DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
+DEF_HELPER_3(vmulhud, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index a3a20821fc..57d6767f60 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1105,6 +1105,22 @@ void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
     }
 }
 
+void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+    uint64_t discard;
+
+    muls64(&discard, &r->u64[0], a->s64[0], b->s64[0]);
+    muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);
+}
+
+void helper_vmulhud(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+    uint64_t discard;
+
+    mulu64(&discard, &r->u64[0], a->u64[0], b->u64[0]);
+    mulu64(&discard, &r->u64[1], a->u64[1], b->u64[1]);
+}
+
 void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
                   ppc_avr_t *c)
 {
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 50bac375fc..0910807232 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -812,6 +812,7 @@ GEN_VXFORM(vmuleub, 4, 8);
 GEN_VXFORM(vmuleuh, 4, 9);
 GEN_VXFORM(vmuleuw, 4, 10);
 GEN_VXFORM(vmulhuw, 4, 10);
+GEN_VXFORM(vmulhud, 4, 11);
 GEN_VXFORM_DUAL(vmuleuw, PPC_ALTIVEC, PPC_NONE,
                 vmulhuw, PPC_NONE, PPC2_ISA310);
 GEN_VXFORM(vmulesb, 4, 12);
@@ -820,6 +821,7 @@ GEN_VXFORM(vmulesw, 4, 14);
 GEN_VXFORM(vmulhsw, 4, 14);
 GEN_VXFORM_DUAL(vmulesw, PPC_ALTIVEC, PPC_NONE,
                 vmulhsw, PPC_NONE, PPC2_ISA310);
+GEN_VXFORM(vmulhsd, 4, 15);
 GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
 GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
 GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
diff --git a/target/ppc/translate/vmx-ops.inc.c b/target/ppc/translate/vmx-ops.inc.c
index 29701ad778..f3f4855111 100644
--- a/target/ppc/translate/vmx-ops.inc.c
+++ b/target/ppc/translate/vmx-ops.inc.c
@@ -111,9 +111,11 @@ GEN_VXFORM_310(vmulld, 4, 7),
 GEN_VXFORM(vmuleub, 4, 8),
 GEN_VXFORM(vmuleuh, 4, 9),
 GEN_VXFORM_DUAL(vmuleuw, vmulhuw, 4, 10, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_310(vmulhud, 4, 11),
 GEN_VXFORM(vmulesb, 4, 12),
 GEN_VXFORM(vmulesh, 4, 13),
 GEN_VXFORM_DUAL(vmulesw, vmulhsw, 4, 14, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_310(vmulhsd, 4, 15),
 GEN_VXFORM(vslb, 2, 4),
 GEN_VXFORM(vslh, 2, 5),
 GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions
  2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
                   ` (6 preceding siblings ...)
  2020-06-25 17:00 ` [PATCH v3 7/8] target/ppc: add vmulh{su}d instructions Lijun Pan
@ 2020-06-25 17:00 ` Lijun Pan
  2020-06-25 18:37   ` Richard Henderson
  7 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 17:00 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel; +Cc: Lijun Pan, richard.henderson, david

vdivsw: Vector Divide Signed Word
vdivuw: Vector Divide Unsigned Word
vdivsd: Vector Divide Signed Doubleword
vdivud: Vector Divide Unsigned Doubleword
vmodsw: Vector Modulo Signed Word
vmoduw: Vector Modulo Unsigned Word
vmodsd: Vector Modulo Signed Doubleword
vmodud: Vector Modulo Unsigned Doubleword

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
v3: add missing divided-by-zero, divided-by-(-1) handling

 target/ppc/helper.h                 |  8 ++++++++
 target/ppc/int_helper.c             | 26 ++++++++++++++++++++++++++
 target/ppc/translate.c              |  3 +++
 target/ppc/translate/vmx-impl.inc.c | 15 +++++++++++++++
 target/ppc/translate/vmx-ops.inc.c  | 17 +++++++++++++++--
 5 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 0036788919..70a14029ca 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -188,6 +188,14 @@ DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
 DEF_HELPER_3(vmulhud, void, avr, avr, avr)
+DEF_HELPER_3(vdivsw, void, avr, avr, avr)
+DEF_HELPER_3(vdivuw, void, avr, avr, avr)
+DEF_HELPER_3(vdivsd, void, avr, avr, avr)
+DEF_HELPER_3(vdivud, void, avr, avr, avr)
+DEF_HELPER_3(vmodsw, void, avr, avr, avr)
+DEF_HELPER_3(vmoduw, void, avr, avr, avr)
+DEF_HELPER_3(vmodsd, void, avr, avr, avr)
+DEF_HELPER_3(vmodud, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 57d6767f60..283f5a00af 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1121,6 +1121,32 @@ void helper_vmulhud(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
     mulu64(&discard, &r->u64[1], a->u64[1], b->u64[1]);
 }
 
+#define VDIV_MOD_DO(name, op, element, sign, bit)                       \
+    void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)       \
+    {                                                                   \
+        int i;                                                          \
+                                                                        \
+                                                                        \
+        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                  \
+            if (unlikely((b->element[i] == 0) ||                        \
+                (sign &&                                                \
+                (b->element[i] == UINT##bit##_MAX) &&                   \
+                (a->element[i] == INT##bit##_MIN))))                    \
+                continue;                                               \
+            r->element[i] = a->element[i] op b->element[i];             \
+        }                                                               \
+    }
+VDIV_MOD_DO(divsw, /, s32, 1, 32)
+VDIV_MOD_DO(divuw, /, u32, 0, 32)
+VDIV_MOD_DO(divsd, /, s64, 1, 64)
+VDIV_MOD_DO(divud, /, u64, 0, 64)
+VDIV_MOD_DO(modsw, %, s32, 1, 32)
+VDIV_MOD_DO(moduw, %, u32, 0, 32)
+VDIV_MOD_DO(modsd, %, s64, 1, 64)
+VDIV_MOD_DO(modud, %, u64, 0, 64)
+#undef VDIV_MOD_DO
+
+
 void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
                   ppc_avr_t *c)
 {
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 6634b38f3a..d4ea7eccfd 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -388,6 +388,9 @@ GEN_OPCODE3(name, opc1, opc2, opc3, opc4, inval, type, type2)
 #define GEN_HANDLER2_E_2(name, onam, opc1, opc2, opc3, opc4, inval, typ, typ2) \
 GEN_OPCODE4(name, onam, opc1, opc2, opc3, opc4, inval, typ, typ2)
 
+#define GEN_HANDLER_BOTH(name, opc1, opc2, opc3, inval0, inval1, type0, type1) \
+GEN_OPCODE_DUAL(name, opc1, opc2, opc3, inval0, inval1, type0, type1)
+
 typedef struct opcode_t {
     unsigned char opc1, opc2, opc3, opc4;
 #if HOST_LONG_BITS == 64 /* Explicitly align to 64 bits */
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 0910807232..ac5e820541 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -798,6 +798,9 @@ static void trans_vclzd(DisasContext *ctx)
     tcg_temp_free_i64(avr);
 }
 
+static void gen_vexptefp(DisasContext *ctx);
+static void gen_vlogefp(DisasContext *ctx);
+
 GEN_VXFORM(vmuloub, 4, 0);
 GEN_VXFORM(vmulouh, 4, 1);
 GEN_VXFORM(vmulouw, 4, 2);
@@ -822,6 +825,18 @@ GEN_VXFORM(vmulhsw, 4, 14);
 GEN_VXFORM_DUAL(vmulesw, PPC_ALTIVEC, PPC_NONE,
                 vmulhsw, PPC_NONE, PPC2_ISA310);
 GEN_VXFORM(vmulhsd, 4, 15);
+GEN_VXFORM(vdivuw, 5, 2);
+GEN_VXFORM(vdivud, 5, 3);
+GEN_VXFORM(vdivsw, 5, 6);
+GEN_VXFORM_DUAL_EXT(vexptefp, PPC_ALTIVEC, PPC_NONE, 0x001f0000,
+                    vdivsw, PPC_NONE, PPC2_ISA310, 0x00000000);
+GEN_VXFORM(vdivsd, 5, 7);
+GEN_VXFORM_DUAL_EXT(vlogefp, PPC_ALTIVEC, PPC_NONE, 0x001f0000,
+                    vdivsd, PPC_NONE, PPC2_ISA310, 0x00000000);
+GEN_VXFORM(vmoduw, 5, 26);
+GEN_VXFORM(vmodud, 5, 27);
+GEN_VXFORM(vmodsw, 5, 30);
+GEN_VXFORM(vmodsd, 5, 31);
 GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
 GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
 GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
diff --git a/target/ppc/translate/vmx-ops.inc.c b/target/ppc/translate/vmx-ops.inc.c
index f3f4855111..528458cb25 100644
--- a/target/ppc/translate/vmx-ops.inc.c
+++ b/target/ppc/translate/vmx-ops.inc.c
@@ -54,6 +54,11 @@ GEN_HANDLER_E(name, 0x04, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ISA310)
 #define GEN_VXFORM_DUAL(name0, name1, opc2, opc3, type0, type1) \
 GEN_HANDLER_E(name0##_##name1, 0x4, opc2, opc3, 0x00000000, type0, type1)
 
+#define GEN_VXFORM_DUAL_BOTH(name0, name1, opc2, opc3, inval0, \
+                             inval1, type0, type1)             \
+GEN_HANDLER_BOTH(name0##_##name1, 0x4, opc2, opc3, inval0,     \
+                 inval1, type0, type1)
+
 #define GEN_VXRFORM_DUAL(name0, name1, opc2, opc3, tp0, tp1) \
 GEN_HANDLER_E(name0##_##name1, 0x4, opc2, opc3, 0x00000000, tp0, tp1), \
 GEN_HANDLER_E(name0##_##name1, 0x4, opc2, (opc3 | 0x10), 0x00000000, tp0, tp1),
@@ -116,6 +121,16 @@ GEN_VXFORM(vmulesb, 4, 12),
 GEN_VXFORM(vmulesh, 4, 13),
 GEN_VXFORM_DUAL(vmulesw, vmulhsw, 4, 14, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_310(vmulhsd, 4, 15),
+GEN_VXFORM_310(vdivuw, 5, 2),
+GEN_VXFORM_310(vdivud, 5, 3),
+GEN_VXFORM_DUAL_BOTH(vexptefp, vdivsw, 5, 6, 0x001f0000, 0x00000000,
+                     PPC_ALTIVEC, PPC2_ISA310),
+GEN_VXFORM_DUAL_BOTH(vlogefp, vdivsd, 5, 7, 0x001f0000, 0x00000000,
+                     PPC_ALTIVEC, PPC2_ISA310),
+GEN_VXFORM_310(vmoduw, 5, 26),
+GEN_VXFORM_310(vmodud, 5, 27),
+GEN_VXFORM_310(vmodsw, 5, 30),
+GEN_VXFORM_310(vmodsd, 5, 31),
 GEN_VXFORM(vslb, 2, 4),
 GEN_VXFORM(vslh, 2, 5),
 GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
@@ -259,8 +274,6 @@ GEN_VXFORM_NOA(vupkhpx, 7, 13),
 GEN_VXFORM_NOA(vupklpx, 7, 15),
 GEN_VXFORM_NOA(vrefp, 5, 4),
 GEN_VXFORM_NOA(vrsqrtefp, 5, 5),
-GEN_VXFORM_NOA(vexptefp, 5, 6),
-GEN_VXFORM_NOA(vlogefp, 5, 7),
 GEN_VXFORM_NOA(vrfim, 5, 11),
 GEN_VXFORM_NOA(vrfin, 5, 8),
 GEN_VXFORM_NOA(vrfip, 5, 10),
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag
  2020-06-25 17:00 ` [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag Lijun Pan
@ 2020-06-25 17:40   ` Richard Henderson
  2020-06-25 21:12     ` Lijun Pan
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2020-06-25 17:40 UTC (permalink / raw)
  To: Lijun Pan, qemu-ppc, qemu-devel; +Cc: david

On 6/25/20 10:00 AM, Lijun Pan wrote:
> +    /* POWER ISA 3.1                                                         */
> +    PPC2_ISA310        = 0x0000000000100000ULL,

This goes in the first patch, but...

>  #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
>                          PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
> @@ -2199,7 +2201,7 @@ enum {
>                          PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \
>                          PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \
>                          PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \
> -                        PPC2_ISA300)
> +                        PPC2_ISA300 | PPC2_ISA310)

... all of the rest belongs in a separate patch, which will be sorted to the
end of the patch set.

It's ok to keep the second patch at the beginning during development, so that
you can test each instruction as you add it.  But for final commit you do not
want to enable the feature until it is complete.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/8] target/ppc: add byte-reverse br[dwh] instructions
  2020-06-25 17:00 ` [PATCH v3 2/8] target/ppc: add byte-reverse br[dwh] instructions Lijun Pan
@ 2020-06-25 17:42   ` Richard Henderson
  2020-06-25 21:13     ` Lijun Pan
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2020-06-25 17:42 UTC (permalink / raw)
  To: Lijun Pan, qemu-ppc, qemu-devel; +Cc: david

On 6/25/20 10:00 AM, Lijun Pan wrote:
> +static void gen_brh(DisasContext *ctx)
> +{
> +    TCGv_i64 t0 = tcg_temp_new_i64();
> +    TCGv_i64 t1 = tcg_temp_new_i64();
> +    TCGv_i64 t2 = tcg_temp_new_i64();
> +
> +    tcg_gen_movi_i64(t0, 0x00ff00ff00ff00ffull);
> +    tcg_gen_shri_i64(t1, cpu_gpr[rS(ctx->opcode)], 8);
> +    tcg_gen_and_i64(t2, t1, t0);
> +    tcg_gen_and_i64(t1, cpu_gpr[rS(ctx->opcode)], t0);
> +    tcg_gen_shli_i64(t1, t1, 8);
> +    tcg_gen_or_i64(cpu_gpr[rA(ctx->opcode)], t1, t2);
> +
> +    tcg_temp_free_i64(t0);
> +    tcg_temp_free_i64(t1);
> +    tcg_temp_free_i64(t2);
> +}
> +
>  static opcode_t opcodes[] = {
> +#if defined(TARGET_PPC64)
> +GEN_HANDLER_E(brd, 0x1F, 0x1B, 0x05, 0x0000F801, PPC_NONE, PPC2_ISA310),
> +GEN_HANDLER_E(brw, 0x1F, 0x1B, 0x04, 0x0000F801, PPC_NONE, PPC2_ISA310),
> +GEN_HANDLER_E(brh, 0x1F, 0x1B, 0x06, 0x0000F801, PPC_NONE, PPC2_ISA310),
> +#endif

No.  You haven't even tested this -- it doesn't compile.

>   CC      ppc-softmmu/target/ppc/translate.o
> /home/rth/qemu/qemu/target/ppc/translate.c: In function ‘gen_brd’:
> /home/rth/qemu/qemu/target/ppc/translate.c:6980:32: error: passing argument 1 of ‘tcg_gen_bswap64_i64’ from incompatible pointer type [-Werror=incompatible-pointer-types]
>  6980 |     tcg_gen_bswap64_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
>       |                         ~~~~~~~^~~~~~~~~~~~~~~~~
>       |                                |
>       |                                TCGv_i32 {aka struct TCGv_i32_d *}
> In file included from /home/rth/qemu/qemu/target/ppc/translate.c:26:
> /home/rth/qemu/qemu/include/tcg/tcg-op.h:533:35: note: expected ‘TCGv_i64’ {aka ‘struct TCGv_i64_d *’} but argument is of type ‘TCGv_i32’ {aka ‘struct TCGv_i32_d *’}
>   533 | void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg);
>       |                          ~~~~~~~~~^~~
> /home/rth/qemu/qemu/target/ppc/translate.c:6980:58: error: passing argument 2 of ‘tcg_gen_bswap64_i64’ from incompatible pointer type [-Werror=incompatible-pointer-types]
>  6980 |     tcg_gen_bswap64_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
>       |                                                   ~~~~~~~^~~~~~~~~~~~~~~~~
>       |                                                          |
>       |                                                          TCGv_i32 {aka struct TCGv_i32_d *}

and so forth.

I warned you before about this.  Why are you still building only a restricted
set of targets?


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/8] target/ppc: convert vmuluwm to tcg_gen_gvec_mul
  2020-06-25 17:00 ` [PATCH v3 3/8] target/ppc: convert vmuluwm to tcg_gen_gvec_mul Lijun Pan
@ 2020-06-25 17:52   ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2020-06-25 17:52 UTC (permalink / raw)
  To: Lijun Pan, qemu-ppc, qemu-devel; +Cc: david

On 6/25/20 10:00 AM, Lijun Pan wrote:
> Convert the original implementation of vmuluwm to the more generic
> tcg_gen_gvec_mul.
> 
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> ---
> v3: newly introduced.
> 
>  target/ppc/helper.h                 |  1 -
>  target/ppc/int_helper.c             | 13 -------------
>  target/ppc/translate/vmx-impl.inc.c |  2 +-
>  3 files changed, 1 insertion(+), 15 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/8] target/ppc: add vmulld instruction
  2020-06-25 17:00 ` [PATCH v3 4/8] target/ppc: add vmulld instruction Lijun Pan
@ 2020-06-25 18:25   ` Richard Henderson
  2020-06-25 21:13     ` Lijun Pan
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2020-06-25 18:25 UTC (permalink / raw)
  To: Lijun Pan, qemu-ppc, qemu-devel; +Cc: david

On 6/25/20 10:00 AM, Lijun Pan wrote:
> vmulld: Vector Multiply Low Doubleword.
> 
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> ---
> v3: use tcg_gen_gvec_mul()
> 
>  target/ppc/translate/vmx-impl.inc.c | 1 +
>  target/ppc/translate/vmx-ops.inc.c  | 4 ++++

This part looks fine.

>  tcg/ppc/tcg-target.h                | 2 ++
>  tcg/ppc/tcg-target.inc.c            | 7 +++++--

This part must be a separate patch.


> @@ -3149,6 +3150,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
>      static const uint32_t
>          add_op[4] = { VADDUBM, VADDUHM, VADDUWM, VADDUDM },
>          sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM },
> +        mul_op[4] = { 0, 0, VMULUWM, VMULLD },
>          neg_op[4] = { 0, 0, VNEGW, VNEGD },
>          eq_op[4]  = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD },
>          ne_op[4]  = { VCMPNEB, VCMPNEH, VCMPNEW, 0 },
> @@ -3199,8 +3201,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
>          a1 = 0;
>          break;
>      case INDEX_op_mul_vec:
> -        tcg_debug_assert(vece == MO_32 && have_isa_2_07);
> -        insn = VMULUWM;
> +        tcg_debug_assert((vece == MO_32 && have_isa_2_07) ||
> +                         (vece == MO_64 && have_isa_3_10));
> +        insn = mul_op[vece];

I think it would be ok to just index mul_op here, since the real isa check is
to be done elsewhere.

Missing a change to tcg_can_emit_vec_op to do that isa check, and allow
INDEX_op_mul_vec to be used for MO_64.

Missing a change to tcg_target_init to test for PPC_FEATURE2_ARCH_3_1.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 5/8] target/ppc: add vmulh{su}w instructions
  2020-06-25 17:00 ` [PATCH v3 5/8] target/ppc: add vmulh{su}w instructions Lijun Pan
@ 2020-06-25 18:26   ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2020-06-25 18:26 UTC (permalink / raw)
  To: Lijun Pan, qemu-ppc, qemu-devel; +Cc: david

On 6/25/20 10:00 AM, Lijun Pan wrote:
> vmulhsw: Vector Multiply High Signed Word
> vmulhuw: Vector Multiply High Unsigned Word
> 
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> ---
> v3: inline the helper_vmulh{su}w multiply directly instead of using macro
> 
>  target/ppc/helper.h                 |  2 ++
>  target/ppc/int_helper.c             | 19 +++++++++++++++++++
>  target/ppc/translate/vmx-impl.inc.c |  6 ++++++
>  target/ppc/translate/vmx-ops.inc.c  |  4 ++--
>  4 files changed, 29 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 6/8] fix the prototype of muls64/mulu64
  2020-06-25 17:00 ` [PATCH v3 6/8] fix the prototype of muls64/mulu64 Lijun Pan
@ 2020-06-25 18:28   ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2020-06-25 18:28 UTC (permalink / raw)
  To: Lijun Pan, qemu-ppc, qemu-devel; +Cc: david

On 6/25/20 10:00 AM, Lijun Pan wrote:
> The prototypes of muls64/mulu64 in host-utils.h should match the
> definitions in host-utils.c
> 
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> ---
>  include/qemu/host-utils.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

I already gave you a Reviewed-by tag for this patch.
You should include all of those in future versions so that other developers
know what has been done.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 7/8] target/ppc: add vmulh{su}d instructions
  2020-06-25 17:00 ` [PATCH v3 7/8] target/ppc: add vmulh{su}d instructions Lijun Pan
@ 2020-06-25 18:32   ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2020-06-25 18:32 UTC (permalink / raw)
  To: Lijun Pan, qemu-ppc, qemu-devel; +Cc: david

On 6/25/20 10:00 AM, Lijun Pan wrote:
> vmulhsd: Vector Multiply High Signed Doubleword
> vmulhud: Vector Multiply High Unsigned Doubleword
> 
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> ---
> v3: simplify helper_vmulh{su}d 
> 
>  target/ppc/helper.h                 |  2 ++
>  target/ppc/int_helper.c             | 16 ++++++++++++++++
>  target/ppc/translate/vmx-impl.inc.c |  2 ++
>  target/ppc/translate/vmx-ops.inc.c  |  2 ++
>  4 files changed, 22 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions
  2020-06-25 17:00 ` [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions Lijun Pan
@ 2020-06-25 18:37   ` Richard Henderson
  2020-06-25 21:15     ` Lijun Pan
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2020-06-25 18:37 UTC (permalink / raw)
  To: Lijun Pan, qemu-ppc, qemu-devel; +Cc: david

On 6/25/20 10:00 AM, Lijun Pan wrote:
> +#define VDIV_MOD_DO(name, op, element, sign, bit)                       \
> +    void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)       \
> +    {                                                                   \
> +        int i;                                                          \
> +                                                                        \
> +                                                                        \
> +        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                  \
> +            if (unlikely((b->element[i] == 0) ||                        \
> +                (sign &&                                                \
> +                (b->element[i] == UINT##bit##_MAX) &&                   \
> +                (a->element[i] == INT##bit##_MIN))))                    \
> +                continue;                                               \
> +            r->element[i] = a->element[i] op b->element[i];             \
> +        }                                                               \
> +    }

Missing braces for the if.  Extra blank line before the for.

I see that the ISA document says divide-by-zero produces an undefined result,
so leaving the previous contents does seem to be within the letter of the law.

However... Are you able to test what real hardware produces?  It would be nice
(but not required) to match if it is simple to do so.

Whichever way we go with the undefined result, this deserves a comment.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag
  2020-06-25 17:40   ` Richard Henderson
@ 2020-06-25 21:12     ` Lijun Pan
  2020-06-26  3:40       ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 21:12 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-ppc, Lijun Pan, qemu-devel, david



> On Jun 25, 2020, at 12:40 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On 6/25/20 10:00 AM, Lijun Pan wrote:
>> +    /* POWER ISA 3.1                                                         */
>> +    PPC2_ISA310        = 0x0000000000100000ULL,
> 
> This goes in the first patch, but...
> 
>> #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
>>                         PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
>> @@ -2199,7 +2201,7 @@ enum {
>>                         PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \
>>                         PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \
>>                         PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \
>> -                        PPC2_ISA300)
>> +                        PPC2_ISA300 | PPC2_ISA310)
> 
> ... all of the rest belongs in a separate patch, which will be sorted to the
> end of the patch set.
> 

Do you mean the first patch has
“
>> +    /* POWER ISA 3.1                                                         */
>> +    PPC2_ISA310        = 0x0000000000100000ULL,

“
the second patch has

“
>> #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
>>                         PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
>> @@ -2199,7 +2201,7 @@ enum {
>>                         PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \
>>                         PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \
>>                         PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \
>> -                        PPC2_ISA300)
>> +                        PPC2_ISA300 | PPC2_ISA310)


+++ b/target/ppc/translate_init.inc.c
@@ -9206,7 +9206,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
                        PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
                        PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
                        PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
-                        PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
+                        PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310;

"


> It's ok to keep the second patch at the beginning during development, so that
> you can test each instruction as you add it.  But for final commit you do not
> want to enable the feature until it is complete.
> 

Do you mean not submiting the second patch until all the instructions are enabled in the future?

Lijun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/8] target/ppc: add byte-reverse br[dwh] instructions
  2020-06-25 17:42   ` Richard Henderson
@ 2020-06-25 21:13     ` Lijun Pan
  0 siblings, 0 replies; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 21:13 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-ppc, Lijun Pan, qemu-devel, david



> On Jun 25, 2020, at 12:42 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On 6/25/20 10:00 AM, Lijun Pan wrote:
>> +static void gen_brh(DisasContext *ctx)
>> +{
>> +    TCGv_i64 t0 = tcg_temp_new_i64();
>> +    TCGv_i64 t1 = tcg_temp_new_i64();
>> +    TCGv_i64 t2 = tcg_temp_new_i64();
>> +
>> +    tcg_gen_movi_i64(t0, 0x00ff00ff00ff00ffull);
>> +    tcg_gen_shri_i64(t1, cpu_gpr[rS(ctx->opcode)], 8);
>> +    tcg_gen_and_i64(t2, t1, t0);
>> +    tcg_gen_and_i64(t1, cpu_gpr[rS(ctx->opcode)], t0);
>> +    tcg_gen_shli_i64(t1, t1, 8);
>> +    tcg_gen_or_i64(cpu_gpr[rA(ctx->opcode)], t1, t2);
>> +
>> +    tcg_temp_free_i64(t0);
>> +    tcg_temp_free_i64(t1);
>> +    tcg_temp_free_i64(t2);
>> +}
>> +
>> static opcode_t opcodes[] = {
>> +#if defined(TARGET_PPC64)
>> +GEN_HANDLER_E(brd, 0x1F, 0x1B, 0x05, 0x0000F801, PPC_NONE, PPC2_ISA310),
>> +GEN_HANDLER_E(brw, 0x1F, 0x1B, 0x04, 0x0000F801, PPC_NONE, PPC2_ISA310),
>> +GEN_HANDLER_E(brh, 0x1F, 0x1B, 0x06, 0x0000F801, PPC_NONE, PPC2_ISA310),
>> +#endif
> 
> No.  You haven't even tested this -- it doesn't compile.
> 
>>  CC      ppc-softmmu/target/ppc/translate.o
>> /home/rth/qemu/qemu/target/ppc/translate.c: In function ‘gen_brd’:
>> /home/rth/qemu/qemu/target/ppc/translate.c:6980:32: error: passing argument 1 of ‘tcg_gen_bswap64_i64’ from incompatible pointer type [-Werror=incompatible-pointer-types]
>> 6980 |     tcg_gen_bswap64_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
>>      |                         ~~~~~~~^~~~~~~~~~~~~~~~~
>>      |                                |
>>      |                                TCGv_i32 {aka struct TCGv_i32_d *}
>> In file included from /home/rth/qemu/qemu/target/ppc/translate.c:26:
>> /home/rth/qemu/qemu/include/tcg/tcg-op.h:533:35: note: expected ‘TCGv_i64’ {aka ‘struct TCGv_i64_d *’} but argument is of type ‘TCGv_i32’ {aka ‘struct TCGv_i32_d *’}
>>  533 | void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg);
>>      |                          ~~~~~~~~~^~~
>> /home/rth/qemu/qemu/target/ppc/translate.c:6980:58: error: passing argument 2 of ‘tcg_gen_bswap64_i64’ from incompatible pointer type [-Werror=incompatible-pointer-types]
>> 6980 |     tcg_gen_bswap64_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
>>      |                                                   ~~~~~~~^~~~~~~~~~~~~~~~~
>>      |                                                          |
>>      |                                                          TCGv_i32 {aka struct TCGv_i32_d *}
> 
> and so forth.
> 
> I warned you before about this.  Why are you still building only a restricted
> set of targets?
> 

I will configure all the target-list, and add "#if defined(TARGET_PPC64)" for all gen_brd/w/h().

Lijun



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/8] target/ppc: add vmulld instruction
  2020-06-25 18:25   ` Richard Henderson
@ 2020-06-25 21:13     ` Lijun Pan
  2020-06-26  3:52       ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 21:13 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-ppc, Lijun Pan, qemu-devel, david

[-- Attachment #1: Type: text/plain, Size: 2778 bytes --]



> On Jun 25, 2020, at 1:25 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On 6/25/20 10:00 AM, Lijun Pan wrote:
>> vmulld: Vector Multiply Low Doubleword.
>> 
>> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
>> ---
>> v3: use tcg_gen_gvec_mul()
>> 
>> target/ppc/translate/vmx-impl.inc.c | 1 +
>> target/ppc/translate/vmx-ops.inc.c  | 4 ++++
> 
> This part looks fine.
> 
>> tcg/ppc/tcg-target.h                | 2 ++
>> tcg/ppc/tcg-target.inc.c            | 7 +++++--
> 
> This part must be a separate patch.
> 
> 
>> @@ -3149,6 +3150,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
>>     static const uint32_t
>>         add_op[4] = { VADDUBM, VADDUHM, VADDUWM, VADDUDM },
>>         sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM },
>> +        mul_op[4] = { 0, 0, VMULUWM, VMULLD },
>>         neg_op[4] = { 0, 0, VNEGW, VNEGD },
>>         eq_op[4]  = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD },
>>         ne_op[4]  = { VCMPNEB, VCMPNEH, VCMPNEW, 0 },
>> @@ -3199,8 +3201,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
>>         a1 = 0;
>>         break;
>>     case INDEX_op_mul_vec:
>> -        tcg_debug_assert(vece == MO_32 && have_isa_2_07);
>> -        insn = VMULUWM;
>> +        tcg_debug_assert((vece == MO_32 && have_isa_2_07) ||
>> +                         (vece == MO_64 && have_isa_3_10));
>> +        insn = mul_op[vece];
> 
> I think it would be ok to just index mul_op here, since the real isa check is
> to be done elsewhere.

Just keep "insn = mul_op[vece];"
and remove"        tcg_debug_assert((vece == MO_32 && have_isa_2_07) ||
                         (vece == MO_64 && have_isa_3_10));“?

> 
> Missing a change to tcg_can_emit_vec_op to do that isa check, and allow
> INDEX_op_mul_vec to be used for MO_64.

something like below?
"
@@ -3016,6 +3016,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
             return -1;
         case MO_32:
             return have_isa_2_07 ? 1 : -1;
+        case MO_64:
+            return have_isa_3_10 ? 1 : -1;
         }
"

> 
> Missing a change to tcg_target_init to test for PPC_FEATURE2_ARCH_3_1.

something like below?
@@ -3712,6 +3712,11 @@ static void tcg_target_init(TCGContext *s)
         have_isa = tcg_isa_3_00;
     }
 #endif
+#ifdef PPC_FEATURE2_ARCH_3_10
+    if (hwcap2 & PPC_FEATURE2_ARCH_3_10) {
+        have_isa = tcg_isa_3_10;
+    }
+#endif

+++ b/include/elf.h
@@ -554,6 +554,7 @@ typedef struct {
 #define PPC_FEATURE2_HTM_NOSC           0x01000000
 #define PPC_FEATURE2_ARCH_3_00          0x00800000
 #define PPC_FEATURE2_HAS_IEEE128        0x00400000
+#define PPC_FEATURE2_ARCH_3_10          0x00200000


Thanks,
Lijun

[-- Attachment #2: Type: text/html, Size: 11690 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions
  2020-06-25 18:37   ` Richard Henderson
@ 2020-06-25 21:15     ` Lijun Pan
  2020-06-26  3:53       ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Lijun Pan @ 2020-06-25 21:15 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-ppc, Lijun Pan, qemu-devel, David Gibson



> On Jun 25, 2020, at 1:37 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On 6/25/20 10:00 AM, Lijun Pan wrote:
>> +#define VDIV_MOD_DO(name, op, element, sign, bit)                       \
>> +    void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)       \
>> +    {                                                                   \
>> +        int i;                                                          \
>> +                                                                        \
>> +                                                                        \
>> +        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                  \
>> +            if (unlikely((b->element[i] == 0) ||                        \
>> +                (sign &&                                                \
>> +                (b->element[i] == UINT##bit##_MAX) &&                   \
>> +                (a->element[i] == INT##bit##_MIN))))                    \
>> +                continue;                                               \
>> +            r->element[i] = a->element[i] op b->element[i];             \
>> +        }                                                               \
>> +    }
> 
> Missing braces for the if.  Extra blank line before the for.

No, the braces are enough. "unlikely" is to describe the whole logic,
eg. if (unlikely( (divisor == 0) || (sign && (divisor == 0xFFFFFFFF) && (dividend = 0x80000000) ) ) )

I will remove that blank line.

> 
> I see that the ISA document says divide-by-zero produces an undefined result,
> so leaving the previous contents does seem to be within the letter of the law.
> 
> However... Are you able to test what real hardware produces?  It would be nice
> (but not required) to match if it is simple to do so.
> 
> Whichever way we go with the undefined result, this deserves a comment.

I will add “continue; / * Undefined, No Special Registers Altered */ "

Lijun


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag
  2020-06-25 21:12     ` Lijun Pan
@ 2020-06-26  3:40       ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2020-06-26  3:40 UTC (permalink / raw)
  To: Lijun Pan; +Cc: qemu-ppc, Lijun Pan, qemu-devel, david

On 6/25/20 2:12 PM, Lijun Pan wrote:
> Do you mean not submiting the second patch until all the instructions are enabled in the future?

Well, I mean not *merging* the second patch until all of the instructions are
enabled.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 4/8] target/ppc: add vmulld instruction
  2020-06-25 21:13     ` Lijun Pan
@ 2020-06-26  3:52       ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2020-06-26  3:52 UTC (permalink / raw)
  To: Lijun Pan; +Cc: qemu-ppc, Laurent Vivier, Lijun Pan, qemu-devel, david

On 6/25/20 2:13 PM, Lijun Pan wrote:
>>>     case INDEX_op_mul_vec:
>>> -        tcg_debug_assert(vece == MO_32 && have_isa_2_07);
>>> -        insn = VMULUWM;
>>> +        tcg_debug_assert((vece == MO_32 && have_isa_2_07) ||
>>> +                         (vece == MO_64 && have_isa_3_10));
>>> +        insn = mul_op[vece];
>>
>> I think it would be ok to just index mul_op here, since the real isa check is
>> to be done elsewhere.
> 
> Just keep "insn = mul_op[vece];"
> and remove"        tcg_debug_assert((vece == MO_32 && have_isa_2_07) ||
>                          (vece == MO_64 && have_isa_3_10));“?

Yes.

> @@ -3016,6 +3016,8 @@int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type,
> unsigned vece)
>              return -1;
>          case MO_32:
>              return have_isa_2_07 ? 1 : -1;
> +        case MO_64:
> +            return have_isa_3_10 ? 1 : -1;
>          }

Actually, just "return have_isa_3_10".

Returning 1 means that the opcode is supported directly.  Returning -1 means
that the opcode can be expanded by tcg_expand_vec_op.  Returning 0 means that
the tcg backend does not support the opcode at all.

> something like below?
> @@ -3712,6 +3712,11 @@static void tcg_target_init(TCGContext *s)
>          have_isa = tcg_isa_3_00;
>      }
>  #endif
> +#ifdef PPC_FEATURE2_ARCH_3_10
> +    if (hwcap2 & PPC_FEATURE2_ARCH_3_10) {
> +        have_isa = tcg_isa_3_10;
> +    }
> +#endif

Certainly this.

> @@ -554,6 +554,7 @@typedef struct {
>  #define PPC_FEATURE2_HTM_NOSC           0x01000000
>  #define PPC_FEATURE2_ARCH_3_00          0x00800000
>  #define PPC_FEATURE2_HAS_IEEE128        0x00400000
> +#define PPC_FEATURE2_ARCH_3_10          0x00200000

Of this I'm not sure.  I didn't even realize these defines were here in
include/elf.h.  For other tcg backends we get the defines from <sys/auxv.h>.

If we do want to update include/elf.h, it should be a separate patch.  CC'ing
Laurent for this.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions
  2020-06-25 21:15     ` Lijun Pan
@ 2020-06-26  3:53       ` Richard Henderson
  2020-06-26  4:31         ` David Gibson
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2020-06-26  3:53 UTC (permalink / raw)
  To: Lijun Pan; +Cc: qemu-ppc, Lijun Pan, qemu-devel, David Gibson

On 6/25/20 2:15 PM, Lijun Pan wrote:
> 
> 
>> On Jun 25, 2020, at 1:37 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
>>
>> On 6/25/20 10:00 AM, Lijun Pan wrote:
>>> +#define VDIV_MOD_DO(name, op, element, sign, bit)                       \
>>> +    void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)       \
>>> +    {                                                                   \
>>> +        int i;                                                          \
>>> +                                                                        \
>>> +                                                                        \
>>> +        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                  \
>>> +            if (unlikely((b->element[i] == 0) ||                        \
>>> +                (sign &&                                                \
>>> +                (b->element[i] == UINT##bit##_MAX) &&                   \
>>> +                (a->element[i] == INT##bit##_MIN))))                    \
>>> +                continue;                                               \
>>> +            r->element[i] = a->element[i] op b->element[i];             \
>>> +        }                                                               \
>>> +    }
>>
>> Missing braces for the if.  Extra blank line before the for.
> 
> No, the braces are enough.

No they are not.  See CODING_STYLE.rst.


r~



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions
  2020-06-26  3:53       ` Richard Henderson
@ 2020-06-26  4:31         ` David Gibson
  0 siblings, 0 replies; 25+ messages in thread
From: David Gibson @ 2020-06-26  4:31 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-ppc, Lijun Pan, Lijun Pan, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2040 bytes --]

On Thu, Jun 25, 2020 at 08:53:54PM -0700, Richard Henderson wrote:
> On 6/25/20 2:15 PM, Lijun Pan wrote:
> > 
> > 
> >> On Jun 25, 2020, at 1:37 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> >>
> >> On 6/25/20 10:00 AM, Lijun Pan wrote:
> >>> +#define VDIV_MOD_DO(name, op, element, sign, bit)                       \
> >>> +    void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)       \
> >>> +    {                                                                   \
> >>> +        int i;                                                          \
> >>> +                                                                        \
> >>> +                                                                        \
> >>> +        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                  \
> >>> +            if (unlikely((b->element[i] == 0) ||                        \
> >>> +                (sign &&                                                \
> >>> +                (b->element[i] == UINT##bit##_MAX) &&                   \
> >>> +                (a->element[i] == INT##bit##_MIN))))                    \
> >>> +                continue;                                               \
> >>> +            r->element[i] = a->element[i] op b->element[i];             \
> >>> +        }                                                               \
> >>> +    }
> >>
> >> Missing braces for the if.  Extra blank line before the for.
> > 
> > No, the braces are enough.
> 
> No they are not.  See CODING_STYLE.rst.

I suspect there is some confusion in terms here.

Lijun, what Richard is saying is that the qemu coding style requires
braces { } around if blocks, even if they're a single statement.  Your
response seemed to be discussing the brackets ( ) which are fine.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-06-26  4:32 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-25 17:00 [PATCH v3 0/8] Add several Power ISA 3.1 32/64-bit vector instructions Lijun Pan
2020-06-25 17:00 ` [PATCH v3 1/8] target/ppc: Introduce Power ISA 3.1 flag Lijun Pan
2020-06-25 17:40   ` Richard Henderson
2020-06-25 21:12     ` Lijun Pan
2020-06-26  3:40       ` Richard Henderson
2020-06-25 17:00 ` [PATCH v3 2/8] target/ppc: add byte-reverse br[dwh] instructions Lijun Pan
2020-06-25 17:42   ` Richard Henderson
2020-06-25 21:13     ` Lijun Pan
2020-06-25 17:00 ` [PATCH v3 3/8] target/ppc: convert vmuluwm to tcg_gen_gvec_mul Lijun Pan
2020-06-25 17:52   ` Richard Henderson
2020-06-25 17:00 ` [PATCH v3 4/8] target/ppc: add vmulld instruction Lijun Pan
2020-06-25 18:25   ` Richard Henderson
2020-06-25 21:13     ` Lijun Pan
2020-06-26  3:52       ` Richard Henderson
2020-06-25 17:00 ` [PATCH v3 5/8] target/ppc: add vmulh{su}w instructions Lijun Pan
2020-06-25 18:26   ` Richard Henderson
2020-06-25 17:00 ` [PATCH v3 6/8] fix the prototype of muls64/mulu64 Lijun Pan
2020-06-25 18:28   ` Richard Henderson
2020-06-25 17:00 ` [PATCH v3 7/8] target/ppc: add vmulh{su}d instructions Lijun Pan
2020-06-25 18:32   ` Richard Henderson
2020-06-25 17:00 ` [PATCH v3 8/8] target/ppc: add vdiv{su}{wd} vmod{su}{wd} instructions Lijun Pan
2020-06-25 18:37   ` Richard Henderson
2020-06-25 21:15     ` Lijun Pan
2020-06-26  3:53       ` Richard Henderson
2020-06-26  4:31         ` David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.