* [Qemu-devel] [PATCH 0/5] POWER9 TCG enablement - part3
@ 2016-08-09 10:12 Rajalakshmi Srinivasaraghavan
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 1/5] target-ppc: add vector insert instructions Rajalakshmi Srinivasaraghavan
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Rajalakshmi Srinivasaraghavan @ 2016-08-09 10:12 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, benh, Rajalakshmi Srinivasaraghavan

This series adds 14 new POWER9 instructions described in ISA 3.0.

Patches:
        01: Adds vector insert instructions.
            vinsertb - Vector Insert Byte
            vinserth - Vector Insert Halfword
            vinsertw - Vector Insert Word
            vinsertd - Vector Insert Doubleword
        02: Adds vector extract instructions.
            vextractub - Vector Extract Unsigned Byte
            vextractuh - Vector Extract Unsigned Halfword
            vextractuw - Vector Extract Unsigned Word
            vextractd - Vector Extract Doubleword
        03: Adds vector count trailing zeros instructions.
            vctzb - Vector Count Trailing Zeros Byte
            vctzh - Vector Count Trailing Zeros Halfword
            vctzw - Vector Count Trailing Zeros Word
            vctzd - Vector Count Trailing Zeros Doubleword
        04: Adds vbpermd - Vector Bit Permute Doubleword instruction.
        05: Adds vpermr - Vector Permute Right-indexed instruction.

Changelog:
v0:
* Rename GEN_VXFORM_300_EXT1 to GEN_VXFORM_300_EO.
* Rename GEN_VXFORM_DUAL1 to GEN_VXFORM_DUAL_INV.
* Remove undef GEN_VXFORM_DUAL1.

v1:
* Correct SPLAT and handle src = dest for vinsert and vextract.
* Correct typecast for vctz.
* Computation of index rearranged for vpermr.
* Assignment of perm moved out of inner loop in vbpermd.

Rajalakshmi Srinivasaraghavan (5):
  target-ppc: add vector insert instructions
  target-ppc: add vector extract instructions
  target-ppc: add vector count trailing zeros instructions
  target-ppc: add vector bit permute doubleword instruction
  target-ppc: add vector permute right indexed instruction

 target-ppc/helper.h             |   14 ++++
 target-ppc/int_helper.c         |  131 +++++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.c |   58 +++++++++++++++++
 target-ppc/translate/vmx-ops.c  |   38 +++++++++---
 4 files changed, 233 insertions(+), 8 deletions(-)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH v2 1/5] target-ppc: add vector insert instructions
  2016-08-09 10:12 [Qemu-devel] [PATCH 0/5] POWER9 TCG enablement - part3 Rajalakshmi Srinivasaraghavan
@ 2016-08-09 10:12 ` Rajalakshmi Srinivasaraghavan
  2016-08-09 18:20   ` Richard Henderson
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 2/5] target-ppc: add vector extract instructions Rajalakshmi Srinivasaraghavan
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Rajalakshmi Srinivasaraghavan @ 2016-08-09 10:12 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, benh, Rajalakshmi Srinivasaraghavan

The following vector insert instructions are added from ISA 3.0.

vinsertb - Vector Insert Byte
vinserth - Vector Insert Halfword
vinsertw - Vector Insert Word
vinsertd - Vector Insert Doubleword

Signed-off-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
---
 target-ppc/helper.h             |    4 +++
 target-ppc/int_helper.c         |   41 +++++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.c |   10 +++++++++
 target-ppc/translate/vmx-ops.c  |   18 ++++++++++++----
 4 files changed, 68 insertions(+), 5 deletions(-)
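
Reviewer aid, not part of the patch: a minimal sketch of what the insert
helpers are expected to do, written against a hypothetical 16-byte model
type (vec128) in big-endian element order. The source element indices
(7, 3, 1, 0) and the clamp are taken from the VINSERT() and SPLAT() macros
in this patch; the function name and the use of memmove are assumptions
made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model type: 16 bytes in big-endian (ISA) element order. */
typedef struct { uint8_t byte[16]; } vec128;

/*
 * Copy `size` bytes (1, 2, 4 or 8) of `src` into `dst` at byte offset `uim`.
 * The source bytes are those of element `src_index`, using the same
 * indices as the VINSERT() invocations in the patch (7, 3, 1, 0).
 */
static void model_vinsert(vec128 *dst, const vec128 *src,
                          unsigned size, unsigned src_index, unsigned uim)
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    assert(size * (src_index + 1) <= 16);

    /* Mirror the SPLAT() clamp so the write stays inside the vector. */
    if (uim > 16 - size) {
        uim = 16 - size;
    }

    /* memmove rather than memcpy: dst and src may be the same register. */
    memmove(&dst->byte[uim], &src->byte[size * src_index], size);
}
```

For example, model_vinsert(&d, &s, 2, 3, 5) copies bytes 6 and 7 of s into
bytes 5 and 6 of d and leaves the other 14 bytes of d unchanged.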

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 93ac9e1..0923779 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -250,6 +250,10 @@ DEF_HELPER_2(vspltisw, void, avr, i32)
 DEF_HELPER_3(vspltb, void, avr, avr, i32)
 DEF_HELPER_3(vsplth, void, avr, avr, i32)
 DEF_HELPER_3(vspltw, void, avr, avr, i32)
+DEF_HELPER_3(vinsertb, void, avr, avr, i32)
+DEF_HELPER_3(vinserth, void, avr, avr, i32)
+DEF_HELPER_3(vinsertw, void, avr, avr, i32)
+DEF_HELPER_3(vinsertd, void, avr, avr, i32)
 DEF_HELPER_2(vupkhpx, void, avr, avr)
 DEF_HELPER_2(vupklpx, void, avr, avr)
 DEF_HELPER_2(vupkhsb, void, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 552b2e0..ece5543 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1792,6 +1792,47 @@ VSPLT(w, u32)
 #undef VSPLT
 #undef SPLAT_ELEMENT
 #undef _SPLAT_MASKED
+#define SPLAT(element)                                                      \
+((splat > (16 - sizeof(r->element[0]))) ? 16 - sizeof(r->element[0]) : splat)
+#if defined(HOST_WORDS_BIGENDIAN)
+#define VINSERT(suffix, element, index)                                     \
+    void helper_vinsert##suffix(ppc_avr_t *r, ppc_avr_t *b, uint32_t splat) \
+    {                                                                       \
+        ppc_avr_t result;                                                   \
+        uint32_t s = sizeof(b->element[0]) * index;                         \
+        int i;                                                              \
+        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                      \
+            result.element[i] = r->element[i];                              \
+        }                                                                   \
+        for (i = 0; i < sizeof(r->element[0]); i++) {                       \
+            result.u8[SPLAT(element) + i] = b->u8[s + i];                   \
+        }                                                                   \
+        *r = result;                                                        \
+    }
+#else
+#define VINSERT(suffix, element, index)                                     \
+    void helper_vinsert##suffix(ppc_avr_t *r, ppc_avr_t *b, uint32_t splat) \
+    {                                                                       \
+        ppc_avr_t result;                                                   \
+        uint32_t s = sizeof(b->element[0]) *                                \
+                           ((ARRAY_SIZE(r->element) - index) - 1);          \
+        int i;                                                              \
+        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                      \
+            result.element[i] = r->element[i];                              \
+        }                                                                   \
+        for (i = 0; i < sizeof(r->element[0]); i++) {                       \
+            result.u8[(16 - SPLAT(element)) - sizeof(r->element[0]) + i] =  \
+                                                              b->u8[s + i]; \
+        }                                                                   \
+        *r = result;                                                        \
+    }
+#endif
+VINSERT(b, u8, 7)
+VINSERT(h, u16, 3)
+VINSERT(w, u32, 1)
+VINSERT(d, u64, 0)
+#undef VINSERT
+#undef SPLAT
 
 #define VSPLTI(suffix, element, splat_type)                     \
     void helper_vspltis##suffix(ppc_avr_t *r, uint32_t splat)   \
diff --git a/target-ppc/translate/vmx-impl.c b/target-ppc/translate/vmx-impl.c
index ac78caf..4940ae3 100644
--- a/target-ppc/translate/vmx-impl.c
+++ b/target-ppc/translate/vmx-impl.c
@@ -626,10 +626,20 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
 GEN_VXFORM_UIMM(vspltb, 6, 8);
 GEN_VXFORM_UIMM(vsplth, 6, 9);
 GEN_VXFORM_UIMM(vspltw, 6, 10);
+GEN_VXFORM_UIMM(vinsertb, 6, 12);
+GEN_VXFORM_UIMM(vinserth, 6, 13);
+GEN_VXFORM_UIMM(vinsertw, 6, 14);
+GEN_VXFORM_UIMM(vinsertd, 6, 15);
 GEN_VXFORM_UIMM_ENV(vcfux, 5, 12);
 GEN_VXFORM_UIMM_ENV(vcfsx, 5, 13);
 GEN_VXFORM_UIMM_ENV(vctuxs, 5, 14);
 GEN_VXFORM_UIMM_ENV(vctsxs, 5, 15);
+GEN_VXFORM_DUAL(vspltisb, PPC_NONE, PPC2_ALTIVEC_207,
+                      vinsertb, PPC_NONE, PPC2_ISA300);
+GEN_VXFORM_DUAL(vspltish, PPC_NONE, PPC2_ALTIVEC_207,
+                      vinserth, PPC_NONE, PPC2_ISA300);
+GEN_VXFORM_DUAL(vspltisw, PPC_NONE, PPC2_ALTIVEC_207,
+                      vinsertw, PPC_NONE, PPC2_ISA300);
 
 static void gen_vsldoi(DisasContext *ctx)
 {
diff --git a/target-ppc/translate/vmx-ops.c b/target-ppc/translate/vmx-ops.c
index 7449396..ca69e56 100644
--- a/target-ppc/translate/vmx-ops.c
+++ b/target-ppc/translate/vmx-ops.c
@@ -41,6 +41,9 @@ GEN_HANDLER_E(name, 0x04, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ALTIVEC_207)
 #define GEN_VXFORM_300(name, opc2, opc3)                                \
 GEN_HANDLER_E(name, 0x04, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ISA300)
 
+#define GEN_VXFORM_300_EXT(name, opc2, opc3, inval)                     \
+GEN_HANDLER_E(name, 0x04, opc2, opc3, inval, PPC_NONE, PPC2_ISA300)
+
 #define GEN_VXFORM_DUAL(name0, name1, opc2, opc3, type0, type1) \
 GEN_HANDLER_E(name0##_##name1, 0x4, opc2, opc3, 0x00000000, type0, type1)
 
@@ -191,11 +194,16 @@ GEN_VXRFORM(vcmpgefp, 3, 7)
 GEN_VXRFORM_DUAL(vcmpgtfp, vcmpgtud, 3, 11, PPC_ALTIVEC, PPC_NONE)
 GEN_VXRFORM_DUAL(vcmpbfp, vcmpgtsd, 3, 15, PPC_ALTIVEC, PPC_NONE)
 
-#define GEN_VXFORM_SIMM(name, opc2, opc3)                               \
-    GEN_HANDLER(name, 0x04, opc2, opc3, 0x00000000, PPC_ALTIVEC)
-GEN_VXFORM_SIMM(vspltisb, 6, 12),
-GEN_VXFORM_SIMM(vspltish, 6, 13),
-GEN_VXFORM_SIMM(vspltisw, 6, 14),
+#define GEN_VXFORM_DUAL_INV(name0, name1, opc2, opc3, inval0, inval1, type) \
+GEN_OPCODE_DUAL(name0##_##name1, 0x04, opc2, opc3, inval0, inval1, type, \
+                                                               PPC_NONE)
+GEN_VXFORM_DUAL_INV(vspltisb, vinsertb, 6, 12, 0x00000000, 0x100000,
+                                               PPC2_ALTIVEC_207),
+GEN_VXFORM_DUAL_INV(vspltish, vinserth, 6, 13, 0x00000000, 0x100000,
+                                               PPC2_ALTIVEC_207),
+GEN_VXFORM_DUAL_INV(vspltisw, vinsertw, 6, 14, 0x00000000, 0x100000,
+                                               PPC2_ALTIVEC_207),
+GEN_VXFORM_300_EXT(vinsertd, 6, 15, 0x100000),
 
 #define GEN_VXFORM_NOA(name, opc2, opc3)                                \
     GEN_HANDLER(name, 0x04, opc2, opc3, 0x001f0000, PPC_ALTIVEC)
-- 
1.7.1

* [Qemu-devel] [PATCH v2 2/5] target-ppc: add vector extract instructions
  2016-08-09 10:12 [Qemu-devel] [PATCH 0/5] POWER9 TCG enablement - part3 Rajalakshmi Srinivasaraghavan
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 1/5] target-ppc: add vector insert instructions Rajalakshmi Srinivasaraghavan
@ 2016-08-09 10:12 ` Rajalakshmi Srinivasaraghavan
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 3/5] target-ppc: add vector count trailing zeros instructions Rajalakshmi Srinivasaraghavan
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Rajalakshmi Srinivasaraghavan @ 2016-08-09 10:12 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, benh, Rajalakshmi Srinivasaraghavan

The following vector extract instructions are added from ISA 3.0.

vextractub - Vector Extract Unsigned Byte
vextractuh - Vector Extract Unsigned Halfword
vextractuw - Vector Extract Unsigned Word
vextractd - Vector Extract Doubleword

Signed-off-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
---
 target-ppc/helper.h             |    4 ++++
 target-ppc/int_helper.c         |   32 ++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.c |   10 ++++++++++
 target-ppc/translate/vmx-ops.c  |   10 +++++++---
 4 files changed, 53 insertions(+), 3 deletions(-)
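
Reviewer aid, not part of the patch: a minimal sketch of the extract
operation as the big-endian branch of VEXTRACT() expresses it. The result
starts out as all zeros, then `size` bytes starting at the (clamped) byte
offset are copied into the destination element given in the VEXTRACT()
invocations (7, 3, 1, 0). The vec128 type and the function name are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model type: 16 bytes in big-endian (ISA) element order. */
typedef struct { uint8_t byte[16]; } vec128;

/*
 * Extract `size` bytes (1, 2, 4 or 8) from `src` at byte offset `uim` and
 * place them in element `dst_index` of an otherwise zeroed result.
 */
static vec128 model_vextract(const vec128 *src, unsigned size,
                             unsigned dst_index, unsigned uim)
{
    vec128 result;

    assert(size == 1 || size == 2 || size == 4 || size == 8);
    assert(size * (dst_index + 1) <= 16);

    /* Same clamp as SPLAT(): keep the read inside the source vector. */
    if (uim > 16 - size) {
        uim = 16 - size;
    }

    memset(&result, 0, sizeof(result));
    memcpy(&result.byte[size * dst_index], &src->byte[uim], size);
    return result;
}
```

Because the result is built in a separate object, the caller can assign it
back to the source register without any aliasing concerns.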

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 0923779..59e7b88 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -250,6 +250,10 @@ DEF_HELPER_2(vspltisw, void, avr, i32)
 DEF_HELPER_3(vspltb, void, avr, avr, i32)
 DEF_HELPER_3(vsplth, void, avr, avr, i32)
 DEF_HELPER_3(vspltw, void, avr, avr, i32)
+DEF_HELPER_3(vextractub, void, avr, avr, i32)
+DEF_HELPER_3(vextractuh, void, avr, avr, i32)
+DEF_HELPER_3(vextractuw, void, avr, avr, i32)
+DEF_HELPER_3(vextractd, void, avr, avr, i32)
 DEF_HELPER_3(vinsertb, void, avr, avr, i32)
 DEF_HELPER_3(vinserth, void, avr, avr, i32)
 DEF_HELPER_3(vinsertw, void, avr, avr, i32)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index ece5543..6477401 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1832,6 +1832,38 @@ VINSERT(h, u16, 3)
 VINSERT(w, u32, 1)
 VINSERT(d, u64, 0)
 #undef VINSERT
+#if defined(HOST_WORDS_BIGENDIAN)
+#define VEXTRACT(suffix, element, index)                                     \
+    void helper_vextract##suffix(ppc_avr_t *r, ppc_avr_t *b, uint32_t splat) \
+    {                                                                        \
+        uint32_t s = sizeof(r->element[0]) * index;                          \
+        int i;                                                               \
+        ppc_avr_t result = { .u64 = { 0, 0 } };                              \
+        for (i = 0; i < sizeof(r->element[0]); i++) {                        \
+            result.u8[s + i] = b->u8[(SPLAT(element)) + i];                  \
+        }                                                                    \
+        *r = result;                                                         \
+    }
+#else
+#define VEXTRACT(suffix, element, index)                                     \
+    void helper_vextract##suffix(ppc_avr_t *r, ppc_avr_t *b, uint32_t splat) \
+    {                                                                        \
+        uint32_t s = sizeof(b->element[0]) *                                 \
+                           ((ARRAY_SIZE(r->element) - index) - 1);           \
+        int i;                                                               \
+        ppc_avr_t result = { .u64 = { 0, 0 } };                              \
+        for (i = 0; i < sizeof(r->element[0]); i++) {                        \
+            result.u8[s + i] =                                               \
+                  b->u8[(16 - SPLAT(element)) - sizeof(r->element[0]) + i];  \
+        }                                                                    \
+        *r = result;                                                         \
+    }
+#endif
+VEXTRACT(ub, u8, 7)
+VEXTRACT(uh, u16, 3)
+VEXTRACT(uw, u32, 1)
+VEXTRACT(d, u64, 0)
+#undef VEXTRACT
 #undef SPLAT
 
 #define VSPLTI(suffix, element, splat_type)                     \
diff --git a/target-ppc/translate/vmx-impl.c b/target-ppc/translate/vmx-impl.c
index 4940ae3..8bd48f3 100644
--- a/target-ppc/translate/vmx-impl.c
+++ b/target-ppc/translate/vmx-impl.c
@@ -626,6 +626,10 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
 GEN_VXFORM_UIMM(vspltb, 6, 8);
 GEN_VXFORM_UIMM(vsplth, 6, 9);
 GEN_VXFORM_UIMM(vspltw, 6, 10);
+GEN_VXFORM_UIMM(vextractub, 6, 8);
+GEN_VXFORM_UIMM(vextractuh, 6, 9);
+GEN_VXFORM_UIMM(vextractuw, 6, 10);
+GEN_VXFORM_UIMM(vextractd, 6, 11);
 GEN_VXFORM_UIMM(vinsertb, 6, 12);
 GEN_VXFORM_UIMM(vinserth, 6, 13);
 GEN_VXFORM_UIMM(vinsertw, 6, 14);
@@ -634,6 +638,12 @@ GEN_VXFORM_UIMM_ENV(vcfux, 5, 12);
 GEN_VXFORM_UIMM_ENV(vcfsx, 5, 13);
 GEN_VXFORM_UIMM_ENV(vctuxs, 5, 14);
 GEN_VXFORM_UIMM_ENV(vctsxs, 5, 15);
+GEN_VXFORM_DUAL(vspltb, PPC_NONE, PPC2_ALTIVEC_207,
+                      vextractub, PPC_NONE, PPC2_ISA300);
+GEN_VXFORM_DUAL(vsplth, PPC_NONE, PPC2_ALTIVEC_207,
+                      vextractuh, PPC_NONE, PPC2_ISA300);
+GEN_VXFORM_DUAL(vspltw, PPC_NONE, PPC2_ALTIVEC_207,
+                      vextractuw, PPC_NONE, PPC2_ISA300);
 GEN_VXFORM_DUAL(vspltisb, PPC_NONE, PPC2_ALTIVEC_207,
                       vinsertb, PPC_NONE, PPC2_ISA300);
 GEN_VXFORM_DUAL(vspltish, PPC_NONE, PPC2_ALTIVEC_207,
diff --git a/target-ppc/translate/vmx-ops.c b/target-ppc/translate/vmx-ops.c
index ca69e56..aafe70b 100644
--- a/target-ppc/translate/vmx-ops.c
+++ b/target-ppc/translate/vmx-ops.c
@@ -197,6 +197,13 @@ GEN_VXRFORM_DUAL(vcmpbfp, vcmpgtsd, 3, 15, PPC_ALTIVEC, PPC_NONE)
 #define GEN_VXFORM_DUAL_INV(name0, name1, opc2, opc3, inval0, inval1, type) \
 GEN_OPCODE_DUAL(name0##_##name1, 0x04, opc2, opc3, inval0, inval1, type, \
                                                                PPC_NONE)
+GEN_VXFORM_DUAL_INV(vspltb, vextractub, 6, 8, 0x00000000, 0x100000,
+                                               PPC2_ALTIVEC_207),
+GEN_VXFORM_DUAL_INV(vsplth, vextractuh, 6, 9, 0x00000000, 0x100000,
+                                               PPC2_ALTIVEC_207),
+GEN_VXFORM_DUAL_INV(vspltw, vextractuw, 6, 10, 0x00000000, 0x100000,
+                                               PPC2_ALTIVEC_207),
+GEN_VXFORM_300_EXT(vextractd, 6, 11, 0x100000),
 GEN_VXFORM_DUAL_INV(vspltisb, vinsertb, 6, 12, 0x00000000, 0x100000,
                                                PPC2_ALTIVEC_207),
 GEN_VXFORM_DUAL_INV(vspltish, vinserth, 6, 13, 0x00000000, 0x100000,
@@ -226,9 +233,6 @@ GEN_VXFORM_NOA(vrfiz, 5, 9),
 
 #define GEN_VXFORM_UIMM(name, opc2, opc3)                               \
     GEN_HANDLER(name, 0x04, opc2, opc3, 0x00000000, PPC_ALTIVEC)
-GEN_VXFORM_UIMM(vspltb, 6, 8),
-GEN_VXFORM_UIMM(vsplth, 6, 9),
-GEN_VXFORM_UIMM(vspltw, 6, 10),
 GEN_VXFORM_UIMM(vcfux, 5, 12),
 GEN_VXFORM_UIMM(vcfsx, 5, 13),
 GEN_VXFORM_UIMM(vctuxs, 5, 14),
-- 
1.7.1

* [Qemu-devel] [PATCH v2 3/5] target-ppc: add vector count trailing zeros instructions
  2016-08-09 10:12 [Qemu-devel] [PATCH 0/5] POWER9 TCG enablement - part3 Rajalakshmi Srinivasaraghavan
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 1/5] target-ppc: add vector insert instructions Rajalakshmi Srinivasaraghavan
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 2/5] target-ppc: add vector extract instructions Rajalakshmi Srinivasaraghavan
@ 2016-08-09 10:12 ` Rajalakshmi Srinivasaraghavan
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 4/5] target-ppc: add vector bit permute doubleword instruction Rajalakshmi Srinivasaraghavan
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 5/5] target-ppc: add vector permute right indexed instruction Rajalakshmi Srinivasaraghavan
  4 siblings, 0 replies; 9+ messages in thread
From: Rajalakshmi Srinivasaraghavan @ 2016-08-09 10:12 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, benh, Rajalakshmi Srinivasaraghavan

The following vector count trailing zeros instructions are
added from ISA 3.0.

vctzb - Vector Count Trailing Zeros Byte
vctzh - Vector Count Trailing Zeros Halfword
vctzw - Vector Count Trailing Zeros Word
vctzd - Vector Count Trailing Zeros Doubleword

Signed-off-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
---
 target-ppc/helper.h             |    4 ++++
 target-ppc/int_helper.c         |   15 +++++++++++++++
 target-ppc/translate/vmx-impl.c |   19 +++++++++++++++++++
 target-ppc/translate/vmx-ops.c  |    8 ++++++++
 4 files changed, 46 insertions(+), 0 deletions(-)
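
Reviewer aid, not part of the patch: a small sketch of why ctzb and ctzh
need the explicit zero check while ctzw and ctzd do not. It assumes that
the 32-bit primitive returns 32 for a zero input (which is how I understand
QEMU's ctz32 to behave); model_ctz32 below is a portable stand-in written
for this note, not the QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for a 32-bit count-trailing-zeros; returns 32 for 0. */
static unsigned model_ctz32(uint32_t v)
{
    unsigned n = 0;

    if (v == 0) {
        return 32;
    }
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/*
 * A zero byte or halfword must report its own width (8 or 16), not the
 * width of the 32-bit primitive, hence the conditional.
 */
static unsigned model_ctzb(uint8_t v)  { return v ? model_ctz32(v) : 8; }
static unsigned model_ctzh(uint16_t v) { return v ? model_ctz32(v) : 16; }
```

Without the conditional, a zero byte element would report 32 trailing
zeros, which is not a valid count for an 8-bit element.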

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 59e7b88..6e6e7b3 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -327,6 +327,10 @@ DEF_HELPER_2(vclzb, void, avr, avr)
 DEF_HELPER_2(vclzh, void, avr, avr)
 DEF_HELPER_2(vclzw, void, avr, avr)
 DEF_HELPER_2(vclzd, void, avr, avr)
+DEF_HELPER_2(vctzb, void, avr, avr)
+DEF_HELPER_2(vctzh, void, avr, avr)
+DEF_HELPER_2(vctzw, void, avr, avr)
+DEF_HELPER_2(vctzd, void, avr, avr)
 DEF_HELPER_2(vpopcntb, void, avr, avr)
 DEF_HELPER_2(vpopcnth, void, avr, avr)
 DEF_HELPER_2(vpopcntw, void, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 6477401..188ac6f 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -2111,6 +2111,21 @@ VGENERIC_DO(clzd, u64)
 #undef clzw
 #undef clzd
 
+#define ctzb(v) ((v) ? ctz32(v) : 8)
+#define ctzh(v) ((v) ? ctz32(v) : 16)
+#define ctzw(v) ctz32((v))
+#define ctzd(v) ctz64((v))
+
+VGENERIC_DO(ctzb, u8)
+VGENERIC_DO(ctzh, u16)
+VGENERIC_DO(ctzw, u32)
+VGENERIC_DO(ctzd, u64)
+
+#undef ctzb
+#undef ctzh
+#undef ctzw
+#undef ctzd
+
 #define popcntb(v) ctpop8(v)
 #define popcnth(v) ctpop16(v)
 #define popcntw(v) ctpop32(v)
diff --git a/target-ppc/translate/vmx-impl.c b/target-ppc/translate/vmx-impl.c
index 8bd48f3..2cf8c8f 100644
--- a/target-ppc/translate/vmx-impl.c
+++ b/target-ppc/translate/vmx-impl.c
@@ -553,6 +553,21 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
         tcg_temp_free_ptr(rd);                                          \
     }
 
+#define GEN_VXFORM_NOA_2(name, opc2, opc3, opc4)                        \
+static void glue(gen_, name)(DisasContext *ctx)                         \
+    {                                                                   \
+        TCGv_ptr rb, rd;                                                \
+        if (unlikely(!ctx->altivec_enabled)) {                          \
+            gen_exception(ctx, POWERPC_EXCP_VPU);                       \
+            return;                                                     \
+        }                                                               \
+        rb = gen_avr_ptr(rB(ctx->opcode));                              \
+        rd = gen_avr_ptr(rD(ctx->opcode));                              \
+        gen_helper_##name(rd, rb);                                      \
+        tcg_temp_free_ptr(rb);                                          \
+        tcg_temp_free_ptr(rd);                                          \
+    }
+
 GEN_VXFORM_NOA(vupkhsb, 7, 8);
 GEN_VXFORM_NOA(vupkhsh, 7, 9);
 GEN_VXFORM_NOA(vupkhsw, 7, 25);
@@ -723,6 +738,10 @@ GEN_VXFORM_NOA(vclzb, 1, 28)
 GEN_VXFORM_NOA(vclzh, 1, 29)
 GEN_VXFORM_NOA(vclzw, 1, 30)
 GEN_VXFORM_NOA(vclzd, 1, 31)
+GEN_VXFORM_NOA_2(vctzb, 1, 24, 28)
+GEN_VXFORM_NOA_2(vctzh, 1, 24, 29)
+GEN_VXFORM_NOA_2(vctzw, 1, 24, 30)
+GEN_VXFORM_NOA_2(vctzd, 1, 24, 31)
 GEN_VXFORM_NOA(vpopcntb, 1, 28)
 GEN_VXFORM_NOA(vpopcnth, 1, 29)
 GEN_VXFORM_NOA(vpopcntw, 1, 30)
diff --git a/target-ppc/translate/vmx-ops.c b/target-ppc/translate/vmx-ops.c
index aafe70b..5b2826e 100644
--- a/target-ppc/translate/vmx-ops.c
+++ b/target-ppc/translate/vmx-ops.c
@@ -44,6 +44,10 @@ GEN_HANDLER_E(name, 0x04, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ISA300)
 #define GEN_VXFORM_300_EXT(name, opc2, opc3, inval)                     \
 GEN_HANDLER_E(name, 0x04, opc2, opc3, inval, PPC_NONE, PPC2_ISA300)
 
+#define GEN_VXFORM_300_EO(name, opc2, opc3, opc4)                     \
+GEN_HANDLER_E_2(name, 0x04, opc2, opc3, opc4, 0x00000000, PPC_NONE,     \
+                                                       PPC2_ISA300)
+
 #define GEN_VXFORM_DUAL(name0, name1, opc2, opc3, type0, type1) \
 GEN_HANDLER_E(name0##_##name1, 0x4, opc2, opc3, 0x00000000, type0, type1)
 
@@ -211,6 +215,10 @@ GEN_VXFORM_DUAL_INV(vspltish, vinserth, 6, 13, 0x00000000, 0x100000,
 GEN_VXFORM_DUAL_INV(vspltisw, vinsertw, 6, 14, 0x00000000, 0x100000,
                                                PPC2_ALTIVEC_207),
 GEN_VXFORM_300_EXT(vinsertd, 6, 15, 0x100000),
+GEN_VXFORM_300_EO(vctzb, 0x01, 0x18, 0x1C),
+GEN_VXFORM_300_EO(vctzh, 0x01, 0x18, 0x1D),
+GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),
+GEN_VXFORM_300_EO(vctzd, 0x01, 0x18, 0x1F),
 
 #define GEN_VXFORM_NOA(name, opc2, opc3)                                \
     GEN_HANDLER(name, 0x04, opc2, opc3, 0x001f0000, PPC_ALTIVEC)
-- 
1.7.1

* [Qemu-devel] [PATCH v2 4/5] target-ppc: add vector bit permute doubleword instruction
  2016-08-09 10:12 [Qemu-devel] [PATCH 0/5] POWER9 TCG enablement - part3 Rajalakshmi Srinivasaraghavan
                   ` (2 preceding siblings ...)
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 3/5] target-ppc: add vector count trailing zeros instructions Rajalakshmi Srinivasaraghavan
@ 2016-08-09 10:12 ` Rajalakshmi Srinivasaraghavan
  2016-08-10  5:04   ` Richard Henderson
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 5/5] target-ppc: add vector permute right indexed instruction Rajalakshmi Srinivasaraghavan
  4 siblings, 1 reply; 9+ messages in thread
From: Rajalakshmi Srinivasaraghavan @ 2016-08-09 10:12 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, benh, Rajalakshmi Srinivasaraghavan

Add vbpermd instruction from ISA 3.0.

Signed-off-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
---
 target-ppc/helper.h             |    1 +
 target-ppc/int_helper.c         |   20 ++++++++++++++++++++
 target-ppc/translate/vmx-impl.c |    1 +
 target-ppc/translate/vmx-ops.c  |    1 +
 4 files changed, 23 insertions(+), 0 deletions(-)
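
Reviewer aid, not part of the patch: a reference model of vbpermd as I read
the ISA 3.0 description, using a hypothetical vec128 type in big-endian
byte and bit numbering (bit 0 is the most significant bit). Note one
assumption in particular: this sketch takes the selected bit from
doubleword i of the first source, i.e. the same doubleword that supplies
the index bytes. That is my reading of the ISA, not something taken from
the helper below, so it is worth checking the two against each other.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model type: 16 bytes in big-endian (ISA) element order. */
typedef struct { uint8_t byte[16]; } vec128;

/* Return bit `index` (0 = most significant) of doubleword `dw` of `v`. */
static unsigned get_bit(const vec128 *v, unsigned dw, unsigned index)
{
    assert(dw < 2 && index < 64);
    return (v->byte[dw * 8 + index / 8] >> (7 - index % 8)) & 1;
}

static vec128 model_vbpermd(const vec128 *a, const vec128 *b)
{
    vec128 r;

    memset(&r, 0, sizeof(r));
    for (unsigned i = 0; i < 2; i++) {
        uint8_t perm = 0;

        for (unsigned j = 0; j < 8; j++) {
            unsigned index = b->byte[i * 8 + j];

            /* Out-of-range indices contribute a zero bit. */
            if (index < 64 && get_bit(a, i, index)) {
                perm |= (uint8_t)(0x80 >> j);
            }
        }
        /* The 8-bit result is zero-extended into doubleword i. */
        r.byte[i * 8 + 7] = perm;
    }
    return r;
}
```

A model like this can be compared against the helper output for a few
hand-picked inputs on both big-endian and little-endian hosts.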

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 6e6e7b3..d1d9418 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -335,6 +335,7 @@ DEF_HELPER_2(vpopcntb, void, avr, avr)
 DEF_HELPER_2(vpopcnth, void, avr, avr)
 DEF_HELPER_2(vpopcntw, void, avr, avr)
 DEF_HELPER_2(vpopcntd, void, avr, avr)
+DEF_HELPER_3(vbpermd, void, avr, avr, avr)
 DEF_HELPER_3(vbpermq, void, avr, avr, avr)
 DEF_HELPER_2(vgbbd, void, avr, avr)
 DEF_HELPER_3(vpmsumb, void, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 188ac6f..d6f26bb 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1134,6 +1134,26 @@ void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
 #define VBPERMQ_DW(index) (((index) & 0x40) == 0)
 #endif
 
+void helper_vbpermd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+    int i, j;
+    uint64_t perm = 0;
+
+    VECTOR_FOR_INORDER_I(i, u64) {
+        perm = 0;
+        for (j = 0; j < 8; j++) {
+            int index = VBPERMQ_INDEX(b, (i * 8) + j);
+            if (index < 64) {
+                uint64_t mask = (1ull << (63 - (index & 0x3F)));
+                if (a->u64[VBPERMQ_DW(index)] & mask) {
+                    perm |= (0x80 >> j);
+                }
+            }
+        }
+        r->u64[i] = perm;
+    }
+}
+
 void helper_vbpermq(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
     int i;
diff --git a/target-ppc/translate/vmx-impl.c b/target-ppc/translate/vmx-impl.c
index 2cf8c8f..5ddff58 100644
--- a/target-ppc/translate/vmx-impl.c
+++ b/target-ppc/translate/vmx-impl.c
@@ -754,6 +754,7 @@ GEN_VXFORM_DUAL(vclzw, PPC_NONE, PPC2_ALTIVEC_207, \
                 vpopcntw, PPC_NONE, PPC2_ALTIVEC_207)
 GEN_VXFORM_DUAL(vclzd, PPC_NONE, PPC2_ALTIVEC_207, \
                 vpopcntd, PPC_NONE, PPC2_ALTIVEC_207)
+GEN_VXFORM(vbpermd, 6, 23);
 GEN_VXFORM(vbpermq, 6, 21);
 GEN_VXFORM_NOA(vgbbd, 6, 20);
 GEN_VXFORM(vpmsumb, 4, 16)
diff --git a/target-ppc/translate/vmx-ops.c b/target-ppc/translate/vmx-ops.c
index 5b2826e..32bd533 100644
--- a/target-ppc/translate/vmx-ops.c
+++ b/target-ppc/translate/vmx-ops.c
@@ -261,6 +261,7 @@ GEN_VXFORM_DUAL(vclzh, vpopcnth, 1, 29, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM_DUAL(vclzw, vpopcntw, 1, 30, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM_DUAL(vclzd, vpopcntd, 1, 31, PPC_NONE, PPC2_ALTIVEC_207),
 
+GEN_VXFORM_300(vbpermd, 6, 23),
 GEN_VXFORM_207(vbpermq, 6, 21),
 GEN_VXFORM_207(vgbbd, 6, 20),
 GEN_VXFORM_207(vpmsumb, 4, 16),
-- 
1.7.1

* [Qemu-devel] [PATCH v2 5/5] target-ppc: add vector permute right indexed instruction
  2016-08-09 10:12 [Qemu-devel] [PATCH 0/5] POWER9 TCG enablement - part3 Rajalakshmi Srinivasaraghavan
                   ` (3 preceding siblings ...)
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 4/5] target-ppc: add vector bit permute doubleword instruction Rajalakshmi Srinivasaraghavan
@ 2016-08-09 10:12 ` Rajalakshmi Srinivasaraghavan
  4 siblings, 0 replies; 9+ messages in thread
From: Rajalakshmi Srinivasaraghavan @ 2016-08-09 10:12 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, benh, Rajalakshmi Srinivasaraghavan

Add vpermr instruction from ISA 3.0.

Signed-off-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
---
 target-ppc/helper.h             |    1 +
 target-ppc/int_helper.c         |   23 +++++++++++++++++++++++
 target-ppc/translate/vmx-impl.c |   18 ++++++++++++++++++
 target-ppc/translate/vmx-ops.c  |    1 +
 4 files changed, 43 insertions(+), 0 deletions(-)
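
Reviewer aid, not part of the patch: a reference model of vpermr in
big-endian byte order, as I read the ISA 3.0 description. The two sources
are concatenated into a 32-byte array and each control byte selects byte
(31 - index) of it. This agrees with the helper below, where a set 0x10 bit
selects from a and a clear one selects from b. The vec128 type and the
function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model type: 16 bytes in big-endian (ISA) element order. */
typedef struct { uint8_t byte[16]; } vec128;

static vec128 model_vpermr(const vec128 *a, const vec128 *b, const vec128 *c)
{
    uint8_t src[32];
    vec128 r;

    /* src is the concatenation a || b, with a in the low-numbered bytes. */
    memcpy(src, a->byte, 16);
    memcpy(src + 16, b->byte, 16);

    for (unsigned i = 0; i < 16; i++) {
        /* Only the low five bits of each control byte are used. */
        unsigned index = c->byte[i] & 0x1f;

        r.byte[i] = src[31 - index];
    }
    return r;
}
```

Compared with vperm, the only difference in this model is the
"31 - index" term: the control bytes count from the right-hand end of the
concatenated sources.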

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index d1d9418..3c476c9 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -270,6 +270,7 @@ DEF_HELPER_5(vmsumubm, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vmsummbm, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vsel, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vperm, void, env, avr, avr, avr, avr)
+DEF_HELPER_5(vpermr, void, env, avr, avr, avr, avr)
 DEF_HELPER_4(vpkshss, void, env, avr, avr, avr)
 DEF_HELPER_4(vpkshus, void, env, avr, avr, avr)
 DEF_HELPER_4(vpkswss, void, env, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index d6f26bb..6869544 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1126,6 +1126,29 @@ void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
     *r = result;
 }
 
+void helper_vpermr(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
+                  ppc_avr_t *c)
+{
+    ppc_avr_t result;
+    int i;
+
+    VECTOR_FOR_INORDER_I(i, u8) {
+        int s = c->u8[i] & 0x1f;
+#if defined(HOST_WORDS_BIGENDIAN)
+        int index = 15 - (s & 0xf);
+#else
+        int index = s & 0xf;
+#endif
+
+        if (s & 0x10) {
+            result.u8[i] = a->u8[index];
+        } else {
+            result.u8[i] = b->u8[index];
+        }
+    }
+    *r = result;
+}
+
 #if defined(HOST_WORDS_BIGENDIAN)
 #define VBPERMQ_INDEX(avr, i) ((avr)->u8[(i)])
 #define VBPERMQ_DW(index) (((index) & 0x40) != 0)
diff --git a/target-ppc/translate/vmx-impl.c b/target-ppc/translate/vmx-impl.c
index 5ddff58..d13640f 100644
--- a/target-ppc/translate/vmx-impl.c
+++ b/target-ppc/translate/vmx-impl.c
@@ -728,6 +728,24 @@ static void gen_vmladduhm(DisasContext *ctx)
     tcg_temp_free_ptr(rd);
 }
 
+static void gen_vpermr(DisasContext *ctx)
+{
+    TCGv_ptr ra, rb, rc, rd;
+    if (unlikely(!ctx->altivec_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VPU);
+        return;
+    }
+    ra = gen_avr_ptr(rA(ctx->opcode));
+    rb = gen_avr_ptr(rB(ctx->opcode));
+    rc = gen_avr_ptr(rC(ctx->opcode));
+    rd = gen_avr_ptr(rD(ctx->opcode));
+    gen_helper_vpermr(cpu_env, rd, ra, rb, rc);
+    tcg_temp_free_ptr(ra);
+    tcg_temp_free_ptr(rb);
+    tcg_temp_free_ptr(rc);
+    tcg_temp_free_ptr(rd);
+}
+
 GEN_VAFORM_PAIRED(vmsumubm, vmsummbm, 18)
 GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19)
 GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20)
diff --git a/target-ppc/translate/vmx-ops.c b/target-ppc/translate/vmx-ops.c
index 32bd533..ad72db5 100644
--- a/target-ppc/translate/vmx-ops.c
+++ b/target-ppc/translate/vmx-ops.c
@@ -219,6 +219,7 @@ GEN_VXFORM_300_EO(vctzb, 0x01, 0x18, 0x1C),
 GEN_VXFORM_300_EO(vctzh, 0x01, 0x18, 0x1D),
 GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),
 GEN_VXFORM_300_EO(vctzd, 0x01, 0x18, 0x1F),
+GEN_VXFORM_300(vpermr, 0x1D, 0xFF),
 
 #define GEN_VXFORM_NOA(name, opc2, opc3)                                \
     GEN_HANDLER(name, 0x04, opc2, opc3, 0x001f0000, PPC_ALTIVEC)
-- 
1.7.1

* Re: [Qemu-devel] [PATCH v2 1/5] target-ppc: add vector insert instructions
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 1/5] target-ppc: add vector insert instructions Rajalakshmi Srinivasaraghavan
@ 2016-08-09 18:20   ` Richard Henderson
  2016-08-10  4:05     ` Rajalakshmi Srinivasaraghavan
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Henderson @ 2016-08-09 18:20 UTC (permalink / raw)
  To: Rajalakshmi Srinivasaraghavan, qemu-ppc, david; +Cc: qemu-devel, nikunj, benh

On 08/09/2016 03:42 PM, Rajalakshmi Srinivasaraghavan wrote:
> +        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                      \
> +            result.element[i] = r->element[i];                              \
> +        }                                                                   \

memcpy, or assignment.

> +        for (i = 0; i < sizeof(r->element[0]); i++) {                       \
> +            result.u8[SPLAT(element) + i] = b->u8[s + i];                   \
> +        }                                                                   \

Also memcpy.
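
A minimal sketch of the two suggestions above, assuming a simplified stand-in for ppc_avr_t. The union `avr_sketch_t` and the function `insert_halfword` are hypothetical names used only for illustration; they are not QEMU's definitions, and the byte offsets are taken as already validated by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ppc_avr_t: only the views needed here. */
typedef union {
    uint8_t  u8[16];
    uint16_t u16[8];
} avr_sketch_t;

/*
 * Copy one halfword from byte offset src_off of b into byte offset
 * dst_off of r, leaving every other byte of r unchanged.
 */
static void insert_halfword(avr_sketch_t *r, const avr_sketch_t *b,
                            unsigned src_off, unsigned dst_off)
{
    const size_t width = sizeof(r->u16[0]);

    /* Both ranges must lie inside the 16-byte vector. */
    assert(src_off + width <= sizeof(b->u8));
    assert(dst_off + width <= sizeof(r->u8));

    /* Whole-vector copy by plain assignment instead of an element loop. */
    avr_sketch_t result = *r;

    /* Element copy with memcpy instead of a byte loop. */
    memcpy(&result.u8[dst_off], &b->u8[src_off], width);

    *r = result;
}
```

Because the result is built in a local copy and stored once at the end, this form also stays correct when r and b point to the same vector.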

I think your mistake is in your definition of SPLAT, as pointed out by David(?) 
elsewhere.  Any conditional should take place at translate time.

If an exception isn't legal for halfword splat=15, then forcing splat=14 (or 0, 
or...) would be legal, since splat=15 is undefined.


r~


* Re: [Qemu-devel] [PATCH v2 1/5] target-ppc: add vector insert instructions
  2016-08-09 18:20   ` Richard Henderson
@ 2016-08-10  4:05     ` Rajalakshmi Srinivasaraghavan
  0 siblings, 0 replies; 9+ messages in thread
From: Rajalakshmi Srinivasaraghavan @ 2016-08-10  4:05 UTC (permalink / raw)
  To: Richard Henderson, qemu-ppc, david; +Cc: qemu-devel, nikunj, benh



On 08/09/2016 11:50 PM, Richard Henderson wrote:
> On 08/09/2016 03:42 PM, Rajalakshmi Srinivasaraghavan wrote:
>> +        for (i = 0; i < ARRAY_SIZE(r->element); i++) {                      \
>> +            result.element[i] = r->element[i];                              \
>> +        }                                                                   \
>
> memcpy, or assignment.
>
>> +        for (i = 0; i < sizeof(r->element[0]); i++) {                       \
>> +            result.u8[SPLAT(element) + i] = b->u8[s + i];                   \
>> +        }                                                                   \
>
> Also memcpy.
Do you mean memcpy is preferred here?
>
> I think your mistake is in your definition of SPLAT, as pointed out by 
> David(?) elsewhere.  Any conditional should take place at translate time.
David pointed out that SPLAT_ELEMENT should not be used, which I have
corrected in the last patch (v2).
>
> If an exception isn't legal for halfword splat=15, then forcing 
> splat=14 (or 0, or...) would be legal, since splat=15 is undefined.
Yes. I have done the same here.
#define SPLAT(element) \
    ((splat > (16 - sizeof(r->element[0]))) ? 16 - sizeof(r->element[0]) : splat)
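
For reference, the clamping done by the macro can be sketched as a plain function. The name `clamp_splat` and its parameters are hypothetical and not part of the patch. A function of this shape depends only on the immediate and the element size, so it could in principle be evaluated once while decoding the instruction rather than inside the helper; where exactly that would be done in the translator is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: clamp a byte index so that an element of
 * elem_size bytes starting at that index still fits in a 16-byte vector.
 */
static unsigned clamp_splat(unsigned splat, size_t elem_size)
{
    assert(elem_size >= 1 && elem_size <= 16);

    /* Largest starting byte that leaves room for the whole element. */
    unsigned max_index = 16u - (unsigned)elem_size;

    return splat > max_index ? max_index : splat;
}
```

With a halfword element, an index of 15 is forced down to 14, which matches the case discussed above.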

>
>
> r~
>
>

-- 
Thanks
Rajalakshmi S


* Re: [Qemu-devel] [PATCH v2 4/5] target-ppc: add vector bit permute doubleword instruction
  2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 4/5] target-ppc: add vector bit permute doubleword instruction Rajalakshmi Srinivasaraghavan
@ 2016-08-10  5:04   ` Richard Henderson
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2016-08-10  5:04 UTC (permalink / raw)
  To: Rajalakshmi Srinivasaraghavan, qemu-ppc, david; +Cc: qemu-devel, nikunj, benh

On 08/09/2016 03:42 PM, Rajalakshmi Srinivasaraghavan wrote:
> +void helper_vbpermd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> +    int i, j;
> +    uint64_t perm = 0;
> +
> +    VECTOR_FOR_INORDER_I(i, u64) {
> +        perm = 0;
> +        for (j = 0; j < 8; j++) {
> +            int index = VBPERMQ_INDEX(b, (i * 8) + j);
> +            if (index < 64) {
> +                uint64_t mask = (1ull << (63 - (index & 0x3F)));
> +                if (a->u64[VBPERMQ_DW(index)] & mask) {
> +                    perm |= (0x80 >> j);
> +                }
> +            }
> +        }
> +        r->u64[i] = perm;
> +    }

You need to take care of overlap between R and {A,B}.
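
One common way to handle this, sketched under simplifying assumptions: build the result in a local and store it to the destination only after every input has been read. The type `dvec_t` and the function `bit_permute_dw` are hypothetical. The bit and byte numbering (bit 0 and byte 0 are the most significant within a doubleword) is a simplification chosen for the sketch and does not use QEMU's VBPERMQ_INDEX, VBPERMQ_DW or VECTOR_FOR_INORDER_I macros.

```c
#include <stdint.h>

/* Hypothetical 128-bit vector: two doublewords, u64[0] taken as the first. */
typedef struct {
    uint64_t u64[2];
} dvec_t;

/*
 * For each doubleword i, use the eight bytes of b->u64[i] as bit indices
 * into a->u64[i] and gather the selected bits into the low byte of the
 * result doubleword. Indices of 64 or more contribute a zero bit.
 */
static void bit_permute_dw(dvec_t *r, const dvec_t *a, const dvec_t *b)
{
    dvec_t result;  /* scratch, so r may be the same object as a or b */

    for (int i = 0; i < 2; i++) {
        uint64_t perm = 0;

        for (int j = 0; j < 8; j++) {
            /* Byte j of b's doubleword, counting from the most significant. */
            unsigned index = (unsigned)(b->u64[i] >> (56 - 8 * j)) & 0xFF;

            /* Bit `index` of a's doubleword, counting from the most significant. */
            if (index < 64 && ((a->u64[i] >> (63 - index)) & 1)) {
                perm |= 0x80u >> j;
            }
        }
        result.u64[i] = perm;
    }

    *r = result;  /* single write-back after all inputs have been read */
}
```

The single store at the end keeps the function correct whichever input elements a given iteration reads, at the cost of one 16-byte copy.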


r~


end of thread, other threads:[~2016-08-10  5:04 UTC | newest]

Thread overview: 9+ messages
2016-08-09 10:12 [Qemu-devel] [PATCH 0/5] POWER9 TCG enablement - part3 Rajalakshmi Srinivasaraghavan
2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 1/5] target-ppc: add vector insert instructions Rajalakshmi Srinivasaraghavan
2016-08-09 18:20   ` Richard Henderson
2016-08-10  4:05     ` Rajalakshmi Srinivasaraghavan
2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 2/5] target-ppc: add vector extract instructions Rajalakshmi Srinivasaraghavan
2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 3/5] target-ppc: add vector count trailing zeros instructions Rajalakshmi Srinivasaraghavan
2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 4/5] target-ppc: add vector bit permute doubleword instruction Rajalakshmi Srinivasaraghavan
2016-08-10  5:04   ` Richard Henderson
2016-08-09 10:12 ` [Qemu-devel] [PATCH v2 5/5] target-ppc: add vector permute right indexed instruction Rajalakshmi Srinivasaraghavan
