* [Qemu-devel] [PATCH v2 0/6] POWER9 TCG enablements - part7
@ 2016-10-26  6:26 Nikunj A Dadhania
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 1/6] target-ppc: add xscmp[eq, gt, ge, ne]dp instructions Nikunj A Dadhania
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-26  6:26 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata, sandipandas1990, ego

This series adds 14 new instructions for POWER9 (ISA 3.0), covering:
    VSX Scalar compare
    Vector Rotate Left Dword
    Vector Rotate Left Word
    VSX Vector compare not equal
    Vector Parity Byte

Changelog:
v1: 
* Simplify extract routines (Richard)
* Added ror/rol fix (Richard)
* Added vector parity and vector compare instructions

v0:
* Use extract32 and extract64 helper (Richard)
* Use rol32 and rol64 helper (Richard)

Patches:
01: 
    xscmpeqdp: VSX Scalar Compare Equal Double-Precision
    xscmpgedp: VSX Scalar Compare Greater Than or Equal Double-Precision
    xscmpgtdp: VSX Scalar Compare Greater Than Double-Precision
    xscmpnedp: VSX Scalar Compare Not Equal Double-Precision
02: 
    Fix ror[8,16,32,64] and rol[8,16,32,64] 
03:
    vrldmi: Vector Rotate Left Dword then Mask Insert
    vrlwmi: Vector Rotate Left Word then Mask Insert
04: 
    vrldnm: Vector Rotate Left Doubleword then AND with Mask
    vrlwnm: Vector Rotate Left Word then AND with Mask
05:
    vprtybw: Vector Parity Byte Word
    vprtybd: Vector Parity Byte Double Word
    vprtybq: Vector Parity Byte Quad Word
06:
    xvcmpnedp[.]: VSX Vector Compare Not Equal Double-Precision
    xvcmpnesp[.]: VSX Vector Compare Not Equal Single-Precision

Ankit Kumar (1):
  target-ppc: add vprtyb[w/d/q] instructions

Bharata B Rao (1):
  target-ppc: add vrldnm and vrlwnm instructions

Gautham R. Shenoy (1):
  target-ppc: add vrldmi and vrlwmi instructions

Nikunj A Dadhania (1):
  bitops: fix rol/ror when shift is zero

Sandipan Das (1):
  target-ppc: add xscmp[eq,gt,ge,ne]dp instructions

Swapnil Bokade (1):
  target-ppc: Add xvcmpnesp, xvcmpnedp instructions

 disas/ppc.c                         |  4 ++
 include/qemu/bitops.h               | 16 +++----
 target-ppc/fpu_helper.c             | 71 +++++++++++++++++++++++++++----
 target-ppc/helper.h                 | 13 ++++++
 target-ppc/int_helper.c             | 83 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c | 15 +++++++
 target-ppc/translate/vmx-ops.inc.c  | 12 ++++--
 target-ppc/translate/vsx-impl.inc.c |  6 +++
 target-ppc/translate/vsx-ops.inc.c  |  6 +++
 9 files changed, 206 insertions(+), 20 deletions(-)

-- 
2.7.4


* [Qemu-devel] [PATCH v2 1/6] target-ppc: add xscmp[eq, gt, ge, ne]dp instructions
  2016-10-26  6:26 [Qemu-devel] [PATCH v2 0/6] POWER9 TCG enablements - part7 Nikunj A Dadhania
@ 2016-10-26  6:26 ` Nikunj A Dadhania
  2016-10-27  3:34   ` David Gibson
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 2/6] bitops: fix rol/ror when shift is zero Nikunj A Dadhania
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-26  6:26 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata, sandipandas1990, ego

From: Sandipan Das <sandipandas1990@gmail.com>

xscmpeqdp: VSX Scalar Compare Equal Double-Precision
xscmpgedp: VSX Scalar Compare Greater Than or Equal Double-Precision
xscmpgtdp: VSX Scalar Compare Greater Than Double-Precision
xscmpnedp: VSX Scalar Compare Not Equal Double-Precision

Signed-off-by: Sandipan Das <sandipandas1990@gmail.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/fpu_helper.c             | 52 +++++++++++++++++++++++++++++++++++++
 target-ppc/helper.h                 |  4 +++
 target-ppc/translate/vsx-impl.inc.c |  4 +++
 target-ppc/translate/vsx-ops.inc.c  |  4 +++
 4 files changed, 64 insertions(+)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index b0760f0..4906372 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2362,6 +2362,58 @@ VSX_MADD(xvnmaddmsp, 4, float32, VsrW(i), NMADD_FLGS, 0, 0, 0)
 VSX_MADD(xvnmsubasp, 4, float32, VsrW(i), NMSUB_FLGS, 1, 0, 0)
 VSX_MADD(xvnmsubmsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0, 0)
 
+/* VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
+ *   op    - instruction mnemonic
+ *   cmp   - comparison operation
+ *   exp   - expected result of comparison
+ *   svxvc - set VXVC bit
+ */
+#define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc)                                \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xa, xb;                                                     \
+    bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false;            \
+                                                                              \
+    getVSR(xA(opcode), &xa, env);                                             \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+                                                                              \
+    if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||              \
+        float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {              \
+        vxsnan_flag = true;                                                   \
+        if (fpscr_ve == 0 && svxvc) {                                         \
+            vxvc_flag = true;                                                 \
+        }                                                                     \
+    } else if (svxvc) {                                                       \
+        vxvc_flag = float64_is_quiet_nan(xa.VsrD(0), &env->fp_status) ||      \
+            float64_is_quiet_nan(xb.VsrD(0), &env->fp_status);                \
+    }                                                                         \
+    if (vxsnan_flag) {                                                        \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);                \
+    }                                                                         \
+    if (vxvc_flag) {                                                          \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);                  \
+    }                                                                         \
+    vex_flag = fpscr_ve && (vxvc_flag || vxsnan_flag);                        \
+                                                                              \
+    if (!vex_flag) {                                                          \
+        if (float64_##cmp(xb.VsrD(0), xa.VsrD(0), &env->fp_status) == exp) {  \
+            xt.VsrD(0) = -1;                                                  \
+            xt.VsrD(1) = 0;                                                   \
+        } else {                                                              \
+            xt.VsrD(0) = 0;                                                   \
+            xt.VsrD(1) = 0;                                                   \
+        }                                                                     \
+    }                                                                         \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_SCALAR_CMP_DP(xscmpeqdp, eq, 1, 0)
+VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
+VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
+VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
+
 #define VSX_SCALAR_CMP(op, ordered)                                      \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
 {                                                                        \
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 5fcc546..0337292 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -389,6 +389,10 @@ DEF_HELPER_2(xsnmaddadp, void, env, i32)
 DEF_HELPER_2(xsnmaddmdp, void, env, i32)
 DEF_HELPER_2(xsnmsubadp, void, env, i32)
 DEF_HELPER_2(xsnmsubmdp, void, env, i32)
+DEF_HELPER_2(xscmpeqdp, void, env, i32)
+DEF_HELPER_2(xscmpgtdp, void, env, i32)
+DEF_HELPER_2(xscmpgedp, void, env, i32)
+DEF_HELPER_2(xscmpnedp, void, env, i32)
 DEF_HELPER_2(xscmpodp, void, env, i32)
 DEF_HELPER_2(xscmpudp, void, env, i32)
 DEF_HELPER_2(xsmaxdp, void, env, i32)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 1508bd1..bf167d0 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -620,6 +620,10 @@ GEN_VSX_HELPER_2(xsnmaddadp, 0x04, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsnmaddmdp, 0x04, 0x15, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
+GEN_VSX_HELPER_2(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
+GEN_VSX_HELPER_2(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
+GEN_VSX_HELPER_2(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index af0d27e..202c557 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -114,6 +114,10 @@ GEN_XX3FORM(xsnmaddadp, 0x04, 0x14, PPC2_VSX),
 GEN_XX3FORM(xsnmaddmdp, 0x04, 0x15, PPC2_VSX),
 GEN_XX3FORM(xsnmsubadp, 0x04, 0x16, PPC2_VSX),
 GEN_XX3FORM(xsnmsubmdp, 0x04, 0x17, PPC2_VSX),
+GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
+GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
+GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
+GEN_XX3FORM(xscmpnedp, 0x0C, 0x03, PPC2_ISA300),
 GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
 GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
 GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
-- 
2.7.4
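
Editorial sketch of the comparison logic in patch 1/6, written in plain C
with no QEMU or softfloat dependencies. It only illustrates the result
encoding (all ones in doubleword 0 on a true comparison, zero otherwise)
and the operand swap that lets "greater than or equal" be expressed as
"less than or equal". The type vsr_sketch and the sketch_* functions are
hypothetical names, and the VXSNAN/VXVC exception handling of the real
helper is deliberately not modelled.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit VSR: hi is doubleword 0, lo is doubleword 1. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} vsr_sketch;

/* A true comparison yields all ones in doubleword 0; doubleword 1 is always zero. */
static vsr_sketch sketch_result(bool cond)
{
    vsr_sketch r = { cond ? UINT64_MAX : 0, 0 };
    return r;
}

/* a == b; false when either operand is a NaN. */
static vsr_sketch sketch_cmpeq(double a, double b)
{
    return sketch_result(a == b);
}

/* a >= b, written as b <= a to mirror the swapped operands in the patch. */
static vsr_sketch sketch_cmpge(double a, double b)
{
    return sketch_result(b <= a);
}

/* a > b, written as b < a for the same reason. */
static vsr_sketch sketch_cmpgt(double a, double b)
{
    return sketch_result(b < a);
}

/* "Not equal" is the negation of "equal", so it is true for NaN operands. */
static vsr_sketch sketch_cmpne(double a, double b)
{
    return sketch_result(!(a == b));
}
```

For example, sketch_cmpge(2.0, 1.0).hi is UINT64_MAX, while
sketch_cmpeq(NAN, NAN).hi is 0 because a NaN never compares equal.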


* [Qemu-devel] [PATCH v2 2/6] bitops: fix rol/ror when shift is zero
  2016-10-26  6:26 [Qemu-devel] [PATCH v2 0/6] POWER9 TCG enablements - part7 Nikunj A Dadhania
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 1/6] target-ppc: add xscmp[eq, gt, ge, ne]dp instructions Nikunj A Dadhania
@ 2016-10-26  6:26 ` Nikunj A Dadhania
  2016-10-26 15:20   ` Richard Henderson
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldmi and vrlwmi instructions Nikunj A Dadhania
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-26  6:26 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata, sandipandas1990, ego

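The rol/ror helpers mishandle the case where shift == 0.
For example, rol32 would evaluate:

    return (word << 0) | (word >> 32);

Shifting a 32-bit value by 32 is undefined behaviour in C. It happens to
work in practice, but clang's sanitizer flags it as a runtime error.
Mask the complementary shift count so that it stays below the operand
width.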

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 include/qemu/bitops.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h
index 98fb005..1881284 100644
--- a/include/qemu/bitops.h
+++ b/include/qemu/bitops.h
@@ -218,7 +218,7 @@ static inline unsigned long hweight_long(unsigned long w)
  */
 static inline uint8_t rol8(uint8_t word, unsigned int shift)
 {
-    return (word << shift) | (word >> (8 - shift));
+    return (word << shift) | (word >> ((8 - shift) & 7));
 }
 
 /**
@@ -228,7 +228,7 @@ static inline uint8_t rol8(uint8_t word, unsigned int shift)
  */
 static inline uint8_t ror8(uint8_t word, unsigned int shift)
 {
-    return (word >> shift) | (word << (8 - shift));
+    return (word >> shift) | (word << ((8 - shift) & 7));
 }
 
 /**
@@ -238,7 +238,7 @@ static inline uint8_t ror8(uint8_t word, unsigned int shift)
  */
 static inline uint16_t rol16(uint16_t word, unsigned int shift)
 {
-    return (word << shift) | (word >> (16 - shift));
+    return (word << shift) | (word >> ((16 - shift) & 15));
 }
 
 /**
@@ -248,7 +248,7 @@ static inline uint16_t rol16(uint16_t word, unsigned int shift)
  */
 static inline uint16_t ror16(uint16_t word, unsigned int shift)
 {
-    return (word >> shift) | (word << (16 - shift));
+    return (word >> shift) | (word << ((16 - shift) & 15));
 }
 
 /**
@@ -258,7 +258,7 @@ static inline uint16_t ror16(uint16_t word, unsigned int shift)
  */
 static inline uint32_t rol32(uint32_t word, unsigned int shift)
 {
-    return (word << shift) | (word >> (32 - shift));
+    return (word << shift) | (word >> ((32 - shift) & 31));
 }
 
 /**
@@ -268,7 +268,7 @@ static inline uint32_t rol32(uint32_t word, unsigned int shift)
  */
 static inline uint32_t ror32(uint32_t word, unsigned int shift)
 {
-    return (word >> shift) | (word << (32 - shift));
+    return (word >> shift) | (word << ((32 - shift) & 31));
 }
 
 /**
@@ -278,7 +278,7 @@ static inline uint32_t ror32(uint32_t word, unsigned int shift)
  */
 static inline uint64_t rol64(uint64_t word, unsigned int shift)
 {
-    return (word << shift) | (word >> (64 - shift));
+    return (word << shift) | (word >> ((64 - shift) & 63));
 }
 
 /**
@@ -288,7 +288,7 @@ static inline uint64_t rol64(uint64_t word, unsigned int shift)
  */
 static inline uint64_t ror64(uint64_t word, unsigned int shift)
 {
-    return (word >> shift) | (word << (64 - shift));
+    return (word >> shift) | (word << ((64 - shift) & 63));
 }
 
 /**
-- 
2.7.4
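
Editorial sketch of the masked rotate from patch 2/6, reduced to the
32-bit case. The function names rol32_masked and ror32_masked are
hypothetical; the expressions follow the patched lines in
include/qemu/bitops.h. The sketch assumes shift is in the range 0..31.

```c
#include <stdint.h>

/*
 * Rotate left. (32 - shift) & 31 maps the problematic count 32 back to 0,
 * so shift == 0 yields (word << 0) | (word >> 0), which is just word.
 */
static inline uint32_t rol32_masked(uint32_t word, unsigned int shift)
{
    return (word << shift) | (word >> ((32 - shift) & 31));
}

/* Rotate right, using the same masking on the complementary count. */
static inline uint32_t ror32_masked(uint32_t word, unsigned int shift)
{
    return (word >> shift) | (word << ((32 - shift) & 31));
}
```

With shift == 0 both functions return their input unchanged, and for
shift in 1..31 the mask is a no-op, so existing results are preserved.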


* [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldmi and vrlwmi instructions
  2016-10-26  6:26 [Qemu-devel] [PATCH v2 0/6] POWER9 TCG enablements - part7 Nikunj A Dadhania
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 1/6] target-ppc: add xscmp[eq, gt, ge, ne]dp instructions Nikunj A Dadhania
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 2/6] bitops: fix rol/ror when shift is zero Nikunj A Dadhania
@ 2016-10-26  6:26 ` Nikunj A Dadhania
  2016-10-27  3:38   ` David Gibson
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 4/6] target-ppc: add vrldnm and vrlwnm instructions Nikunj A Dadhania
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-26  6:26 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata, sandipandas1990, ego

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

vrldmi: Vector Rotate Left Dword then Mask Insert
vrlwmi: Vector Rotate Left Word then Mask Insert

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
( use extract[32,64] and rol[32,64] )
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 disas/ppc.c                         |  2 ++
 target-ppc/helper.h                 |  2 ++
 target-ppc/int_helper.c             | 46 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  6 +++++
 target-ppc/translate/vmx-ops.inc.c  |  4 ++--
 5 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/disas/ppc.c b/disas/ppc.c
index 052cebe..32f0d8d 100644
--- a/disas/ppc.c
+++ b/disas/ppc.c
@@ -2286,6 +2286,8 @@ const struct powerpc_opcode powerpc_opcodes[] = {
 { "vrlh",      VX(4,   68), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
 { "vrlw",      VX(4,  132), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
 { "vrsqrtefp", VX(4,  330), VX_MASK,	PPCVEC,		{ VD, VB } },
+{ "vrldmi",    VX(4,  197), VX_MASK,    PPCVEC,         { VD, VA, VB } },
+{ "vrlwmi",    VX(4,  133), VX_MASK,    PPCVEC,         { VD, VA, VB} },
 { "vsel",      VXA(4,  42), VXA_MASK,	PPCVEC,		{ VD, VA, VB, VC } },
 { "vsl",       VX(4,  452), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
 { "vslb",      VX(4,  260), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 0337292..9fb8f0d 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -325,6 +325,8 @@ DEF_HELPER_4(vmaxfp, void, env, avr, avr, avr)
 DEF_HELPER_4(vminfp, void, env, avr, avr, avr)
 DEF_HELPER_3(vrefp, void, env, avr, avr)
 DEF_HELPER_3(vrsqrtefp, void, env, avr, avr)
+DEF_HELPER_3(vrlwmi, void, avr, avr, avr)
+DEF_HELPER_3(vrldmi, void, avr, avr, avr)
 DEF_HELPER_5(vmaddfp, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vnmsubfp, void, env, avr, avr, avr, avr)
 DEF_HELPER_3(vexptefp, void, env, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index dca4798..b54cd7c 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1717,6 +1717,52 @@ void helper_vrsqrtefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
     }
 }
 
+#define MASK(size, max_val)                                     \
+static inline uint##size##_t mask_u##size(uint##size##_t start, \
+                                uint##size##_t end)             \
+{                                                               \
+    uint##size##_t ret, max_bit = size - 1;                     \
+                                                                \
+    if (likely(start == 0)) {                                   \
+        ret = max_val << (max_bit - end);                       \
+    } else if (likely(end == max_bit)) {                        \
+        ret = max_val >> start;                                 \
+    } else {                                                    \
+        ret = (((uint##size##_t)(-1ULL)) >> (start)) ^          \
+            (((uint##size##_t)(-1ULL) >> (end)) >> 1);          \
+        if (unlikely(start > end)) {                            \
+            return ~ret;                                        \
+        }                                                       \
+    }                                                           \
+                                                                \
+    return ret;                                                 \
+}
+
+MASK(32, UINT32_MAX);
+MASK(64, UINT64_MAX);
+
+#define VRLMI(name, size, element)                                    \
+void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)          \
+{                                                                     \
+    int i;                                                            \
+    for (i = 0; i < ARRAY_SIZE(r->element); i++) {                    \
+        uint##size##_t src1 = a->element[i];                          \
+        uint##size##_t src2 = b->element[i];                          \
+        uint##size##_t src3 = r->element[i];                          \
+        uint##size##_t begin, end, shift, mask, rot_val;              \
+                                                                      \
+        shift = extract##size(src2, 0, 6);                            \
+        end   = extract##size(src2, 8, 6);                            \
+        begin = extract##size(src2, 16, 6);                           \
+        rot_val = rol##size(src1, shift);                             \
+        mask = mask_u##size(begin, end);                              \
+        r->element[i] = (rot_val & mask) | (src3 & ~mask);            \
+    }                                                                 \
+}
+
+VRLMI(vrldmi, 64, u64);
+VRLMI(vrlwmi, 32, u32);
+
 void helper_vsel(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
                  ppc_avr_t *c)
 {
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index fc612d9..fdfbd6a 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -488,7 +488,13 @@ GEN_VXFORM_DUAL(vsubeuqm, PPC_NONE, PPC2_ALTIVEC_207, \
 GEN_VXFORM(vrlb, 2, 0);
 GEN_VXFORM(vrlh, 2, 1);
 GEN_VXFORM(vrlw, 2, 2);
+GEN_VXFORM(vrlwmi, 2, 2);
+GEN_VXFORM_DUAL(vrlw, PPC_ALTIVEC, PPC_NONE, \
+                vrlwmi, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM(vrld, 2, 3);
+GEN_VXFORM(vrldmi, 2, 3);
+GEN_VXFORM_DUAL(vrld, PPC_NONE, PPC2_ALTIVEC_207, \
+                vrldmi, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM(vsl, 2, 7);
 GEN_VXFORM(vsr, 2, 11);
 GEN_VXFORM_ENV(vpkuhum, 7, 0);
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index cc7ed7e..76b3593 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -143,8 +143,8 @@ GEN_VXFORM_207(vsubcuq, 0, 21),
 GEN_VXFORM_DUAL(vsubeuqm, vsubecuq, 31, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM(vrlb, 2, 0),
 GEN_VXFORM(vrlh, 2, 1),
-GEN_VXFORM(vrlw, 2, 2),
-GEN_VXFORM_207(vrld, 2, 3),
+GEN_VXFORM_DUAL(vrlw, vrlwmi, 2, 2, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_DUAL(vrld, vrldmi, 2, 3, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM(vsl, 2, 7),
 GEN_VXFORM(vsr, 2, 11),
 GEN_VXFORM(vpkuhum, 7, 0),
-- 
2.7.4
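
Editorial sketch of "rotate left then mask insert" on a single 32-bit
element, as a reading aid for patch 3/6. All sketch_* names are
hypothetical. The field offsets (0, 8 and 16) are taken from the patch;
the 5-bit field width is an assumption made here so that every value
stays in 0..31 for a 32-bit element, whereas the patch extracts 6 bits
for both element sizes. Mask bits use PowerPC numbering, where bit 0 is
the most significant bit.

```c
#include <stdint.h>

/* Rotate left, valid for n in 0..31. */
static uint32_t sketch_rol32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> ((32 - n) & 31));
}

/*
 * Ones from bit mb through bit me inclusive (bit 0 = MSB).
 * When mb > me the mask wraps around through bit 31 and bit 0.
 */
static uint32_t sketch_mask32(unsigned mb, unsigned me)
{
    uint32_t from_mb = UINT32_MAX >> mb;          /* bits mb..31 */
    uint32_t after_me = (UINT32_MAX >> me) >> 1;  /* bits me+1..31 */
    uint32_t m = from_mb ^ after_me;

    return mb > me ? ~m : m;
}

/* Rotate src, then insert the rotated bits into dst under the mask. */
static uint32_t sketch_rlwmi(uint32_t dst, uint32_t src, uint32_t ctl)
{
    unsigned shift = ctl & 0x1f;          /* assumed 5-bit rotate count */
    unsigned end = (ctl >> 8) & 0x1f;     /* assumed 5-bit mask end */
    unsigned begin = (ctl >> 16) & 0x1f;  /* assumed 5-bit mask begin */
    uint32_t mask = sketch_mask32(begin, end);

    return (sketch_rol32(src, shift) & mask) | (dst & ~mask);
}
```

As a worked case, ctl = 0x00080F08 encodes begin = 8, end = 15 and
shift = 8. Rotating 0x12345678 left by 8 gives 0x34567812, the mask is
0x00FF0000, and inserting into 0xAAAAAAAA produces 0xAA56AAAA.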


* [Qemu-devel] [PATCH v2 4/6] target-ppc: add vrldnm and vrlwnm instructions
  2016-10-26  6:26 [Qemu-devel] [PATCH v2 0/6] POWER9 TCG enablements - part7 Nikunj A Dadhania
                   ` (2 preceding siblings ...)
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldmi and vrlwmi instructions Nikunj A Dadhania
@ 2016-10-26  6:26 ` Nikunj A Dadhania
  2016-10-27  3:39   ` David Gibson
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions Nikunj A Dadhania
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 6/6] target-ppc: Add xvcmpnesp, xvcmpnedp instructions Nikunj A Dadhania
  5 siblings, 1 reply; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-26  6:26 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata, sandipandas1990, ego

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

vrldnm: Vector Rotate Left Doubleword then AND with Mask
vrlwnm: Vector Rotate Left Word then AND with Mask

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 disas/ppc.c                         |  2 ++
 target-ppc/helper.h                 |  2 ++
 target-ppc/int_helper.c             | 14 ++++++++++----
 target-ppc/translate/vmx-impl.inc.c |  6 ++++++
 target-ppc/translate/vmx-ops.inc.c  |  4 ++--
 5 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/disas/ppc.c b/disas/ppc.c
index 32f0d8d..bd05623 100644
--- a/disas/ppc.c
+++ b/disas/ppc.c
@@ -2287,7 +2287,9 @@ const struct powerpc_opcode powerpc_opcodes[] = {
 { "vrlw",      VX(4,  132), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
 { "vrsqrtefp", VX(4,  330), VX_MASK,	PPCVEC,		{ VD, VB } },
 { "vrldmi",    VX(4,  197), VX_MASK,    PPCVEC,         { VD, VA, VB } },
+{ "vrldnm",    VX(4,  453), VX_MASK,    PPCVEC,         { VD, VA, VB } },
 { "vrlwmi",    VX(4,  133), VX_MASK,    PPCVEC,         { VD, VA, VB} },
+{ "vrlwnm",    VX(4,  389), VX_MASK,    PPCVEC,         { VD, VA, VB } },
 { "vsel",      VXA(4,  42), VXA_MASK,	PPCVEC,		{ VD, VA, VB, VC } },
 { "vsl",       VX(4,  452), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
 { "vslb",      VX(4,  260), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 9fb8f0d..d6ee26e 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -327,6 +327,8 @@ DEF_HELPER_3(vrefp, void, env, avr, avr)
 DEF_HELPER_3(vrsqrtefp, void, env, avr, avr)
 DEF_HELPER_3(vrlwmi, void, avr, avr, avr)
 DEF_HELPER_3(vrldmi, void, avr, avr, avr)
+DEF_HELPER_3(vrldnm, void, avr, avr, avr)
+DEF_HELPER_3(vrlwnm, void, avr, avr, avr)
 DEF_HELPER_5(vmaddfp, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vnmsubfp, void, env, avr, avr, avr, avr)
 DEF_HELPER_3(vexptefp, void, env, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index b54cd7c..0fd92ed 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1741,7 +1741,7 @@ static inline uint##size##_t mask_u##size(uint##size##_t start, \
 MASK(32, UINT32_MAX);
 MASK(64, UINT64_MAX);
 
-#define VRLMI(name, size, element)                                    \
+#define VRLMI(name, size, element, insert)                            \
 void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)          \
 {                                                                     \
     int i;                                                            \
@@ -1756,12 +1756,18 @@ void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)          \
         begin = extract##size(src2, 16, 6);                           \
         rot_val = rol##size(src1, shift);                             \
         mask = mask_u##size(begin, end);                              \
-        r->element[i] = (rot_val & mask) | (src3 & ~mask);            \
+        if (insert) {                                                 \
+            r->element[i] = (rot_val & mask) | (src3 & ~mask);        \
+        } else {                                                      \
+            r->element[i] = (rot_val & mask);                         \
+        }                                                             \
     }                                                                 \
 }
 
-VRLMI(vrldmi, 64, u64);
-VRLMI(vrlwmi, 32, u32);
+VRLMI(vrldmi, 64, u64, 1);
+VRLMI(vrlwmi, 32, u32, 1);
+VRLMI(vrldnm, 64, u64, 0);
+VRLMI(vrlwnm, 32, u32, 0);
 
 void helper_vsel(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
                  ppc_avr_t *c)
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index fdfbd6a..500c43f 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -442,6 +442,9 @@ GEN_VXFORM(vmulesw, 4, 14);
 GEN_VXFORM(vslb, 2, 4);
 GEN_VXFORM(vslh, 2, 5);
 GEN_VXFORM(vslw, 2, 6);
+GEN_VXFORM(vrlwnm, 2, 6);
+GEN_VXFORM_DUAL(vslw, PPC_ALTIVEC, PPC_NONE, \
+                vrlwnm, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM(vsld, 2, 23);
 GEN_VXFORM(vsrb, 2, 8);
 GEN_VXFORM(vsrh, 2, 9);
@@ -496,6 +499,9 @@ GEN_VXFORM(vrldmi, 2, 3);
 GEN_VXFORM_DUAL(vrld, PPC_NONE, PPC2_ALTIVEC_207, \
                 vrldmi, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM(vsl, 2, 7);
+GEN_VXFORM(vrldnm, 2, 7);
+GEN_VXFORM_DUAL(vsl, PPC_ALTIVEC, PPC_NONE, \
+                vrldnm, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM(vsr, 2, 11);
 GEN_VXFORM_ENV(vpkuhum, 7, 0);
 GEN_VXFORM_ENV(vpkuwum, 7, 1);
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index 76b3593..a5ad4d4 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -107,7 +107,7 @@ GEN_VXFORM(vmulesh, 4, 13),
 GEN_VXFORM_207(vmulesw, 4, 14),
 GEN_VXFORM(vslb, 2, 4),
 GEN_VXFORM(vslh, 2, 5),
-GEN_VXFORM(vslw, 2, 6),
+GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_207(vsld, 2, 23),
 GEN_VXFORM(vsrb, 2, 8),
 GEN_VXFORM(vsrh, 2, 9),
@@ -145,7 +145,7 @@ GEN_VXFORM(vrlb, 2, 0),
 GEN_VXFORM(vrlh, 2, 1),
 GEN_VXFORM_DUAL(vrlw, vrlwmi, 2, 2, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_DUAL(vrld, vrldmi, 2, 3, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM(vsl, 2, 7),
+GEN_VXFORM_DUAL(vsl, vrldnm, 2, 7, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vsr, 2, 11),
 GEN_VXFORM(vpkuhum, 7, 0),
 GEN_VXFORM(vpkuwum, 7, 1),
-- 
2.7.4
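
Editorial sketch of the difference the new "insert" parameter makes in
patch 4/6, shown on a single 64-bit element. The sketch_* names are
hypothetical. The field offsets and 6-bit widths follow the extract
calls in the patch, and mask bits use PowerPC numbering (bit 0 = MSB).

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate left, valid for n in 0..63. */
static uint64_t sketch_rol64(uint64_t x, unsigned n)
{
    return (x << n) | (x >> ((64 - n) & 63));
}

/* Ones from bit mb through bit me inclusive, wrapping when mb > me. */
static uint64_t sketch_mask64(unsigned mb, unsigned me)
{
    uint64_t m = (UINT64_MAX >> mb) ^ ((UINT64_MAX >> me) >> 1);

    return mb > me ? ~m : m;
}

/*
 * insert == true:  keep the bits of dst outside the mask (the "mi" forms).
 * insert == false: clear everything outside the mask (the "nm" forms).
 */
static uint64_t sketch_rld(uint64_t dst, uint64_t src, uint64_t ctl,
                           bool insert)
{
    unsigned shift = ctl & 0x3f;
    unsigned end = (ctl >> 8) & 0x3f;
    unsigned begin = (ctl >> 16) & 0x3f;
    uint64_t mask = sketch_mask64(begin, end);
    uint64_t rot = sketch_rol64(src, shift) & mask;

    return insert ? (rot | (dst & ~mask)) : rot;
}
```

With ctl = 0x0F10 (begin = 0, end = 15, shift = 16) and
src = 0x0123456789ABCDEF, the "nm" form returns 0x4567000000000000,
while the "mi" form with an all-ones dst returns 0x4567FFFFFFFFFFFF.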


* [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions
  2016-10-26  6:26 [Qemu-devel] [PATCH v2 0/6] POWER9 TCG enablements - part7 Nikunj A Dadhania
                   ` (3 preceding siblings ...)
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 4/6] target-ppc: add vrldnm and vrlwnm instructions Nikunj A Dadhania
@ 2016-10-26  6:26 ` Nikunj A Dadhania
  2016-10-27  3:47   ` David Gibson
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 6/6] target-ppc: Add xvcmpnesp, xvcmpnedp instructions Nikunj A Dadhania
  5 siblings, 1 reply; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-26  6:26 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, bharata, sandipandas1990, ego, Ankit Kumar

From: Ankit Kumar <ankit@linux.vnet.ibm.com>

Add the following POWER ISA 3.0 instructions:
vprtybw: Vector Parity Byte Word
vprtybd: Vector Parity Byte Double Word
vprtybq: Vector Parity Byte Quad Word

Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  3 +++
 target-ppc/int_helper.c             | 31 +++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  3 +++
 target-ppc/translate/vmx-ops.inc.c  |  4 ++++
 4 files changed, 41 insertions(+)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index d6ee26e..7d42f99 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -223,6 +223,9 @@ DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
 DEF_HELPER_3(vslv, void, avr, avr, avr)
 DEF_HELPER_3(vaddcuw, void, avr, avr, avr)
+DEF_HELPER_2(vprtybw, void, avr, avr)
+DEF_HELPER_2(vprtybd, void, avr, avr)
+DEF_HELPER_2(vprtybq, void, avr, avr)
 DEF_HELPER_3(vsubcuw, void, avr, avr, avr)
 DEF_HELPER_2(lvsl, void, avr, tl)
 DEF_HELPER_2(lvsr, void, avr, tl)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 0fd92ed..358ffff 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -527,6 +527,37 @@ void helper_vaddcuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
     }
 }
 
+/* vprtyb[w/d] */
+#define VPRTYB(name, element)                               \
+void glue(helper_, name)(ppc_avr_t *r, ppc_avr_t *b)        \
+{                                                           \
+    int i, j;                                               \
+    uint8_t s;                                              \
+    int nr_b = sizeof(b->element[0]) / sizeof(b->u8[0]);    \
+    for (i = 0; i < ARRAY_SIZE(r->element); i++) {          \
+        s = 0;                                              \
+        for (j = i * nr_b; j < (i + 1) * nr_b; j++) {       \
+            s ^= (b->u8[j] & 1);                            \
+        }                                                   \
+        r->element[i] = (!s) ? 0 : 1;                       \
+    }                                                       \
+}
+VPRTYB(vprtybw, u32)
+VPRTYB(vprtybd, u64)
+#undef VPRTYB
+
+/* vprtybq */
+void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
+{
+    int i;
+    uint8_t s = 0;
+    for (i = 0; i < 16; i++) {
+        s ^= (b->u8[i] & 1);
+    }
+    r->u64[LO_IDX] = (!s) ? 0 : 1;
+    r->u64[HI_IDX] = 0;
+}
+
 #define VARITH_DO(name, op, element)                                    \
     void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)       \
     {                                                                   \
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 500c43f..e1d0897 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -705,6 +705,9 @@ GEN_VXFORM_NOA_ENV(vrfim, 5, 11);
 GEN_VXFORM_NOA_ENV(vrfin, 5, 8);
 GEN_VXFORM_NOA_ENV(vrfip, 5, 10);
 GEN_VXFORM_NOA_ENV(vrfiz, 5, 9);
+GEN_VXFORM_NOA(vprtybw, 1, 24);
+GEN_VXFORM_NOA(vprtybd, 1, 24);
+GEN_VXFORM_NOA(vprtybq, 1, 24);
 
 #define GEN_VXFORM_SIMM(name, opc2, opc3)                               \
 static void glue(gen_, name)(DisasContext *ctx)                                 \
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index a5ad4d4..c631780 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -122,6 +122,10 @@ GEN_VXFORM_300(vslv, 2, 29),
 GEN_VXFORM(vslo, 6, 16),
 GEN_VXFORM(vsro, 6, 17),
 GEN_VXFORM(vaddcuw, 0, 6),
+GEN_HANDLER_E_2(vprtybw, 0x4, 0x1, 0x18, 8, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E_2(vprtybd, 0x4, 0x1, 0x18, 9, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E_2(vprtybq, 0x4, 0x1, 0x18, 10, 0, PPC_NONE, PPC2_ISA300),
+
 GEN_VXFORM(vsubcuw, 0, 22),
 GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE),
-- 
2.7.4
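
As a reading aid for the vprtyb[w/d] helper above: it XORs the least significant bit of every byte in an element and stores the resulting 0 or 1 in that element. The same reduction on a plain integer can be sketched as follows. This is only an illustration; the function name and the nbytes parameter are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: XOR together the least
 * significant bit of each of the low 'nbytes' bytes of 'v'. */
static inline uint64_t parity_lsb_bytes(uint64_t v, unsigned int nbytes)
{
    uint64_t s = 0;
    unsigned int i;

    assert(nbytes <= 8);
    for (i = 0; i < nbytes; i++) {
        s ^= (v >> (8 * i)) & 1;   /* bit 0 of byte i */
    }
    return s;                      /* 0 or 1, as stored per element */
}
```

With nbytes == 4 this corresponds to one word element, with nbytes == 8 to one doubleword element.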

* [Qemu-devel] [PATCH v2 6/6] target-ppc: Add xvcmpnesp, xvcmpnedp instructions
  2016-10-26  6:26 [Qemu-devel] [PATCH v2 0/6] POWER9 TCG enablements - part7 Nikunj A Dadhania
                   ` (4 preceding siblings ...)
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions Nikunj A Dadhania
@ 2016-10-26  6:26 ` Nikunj A Dadhania
  2016-10-27  3:50   ` David Gibson
  5 siblings, 1 reply; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-26  6:26 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, bharata, sandipandas1990, ego, Swapnil Bokade

From: Swapnil Bokade <bokadeswapnil@gmail.com>

xvcmpnedp[.]: VSX Vector Compare Not Equal Double-Precision
xvcmpnesp[.]: VSX Vector Compare Not Equal Single-Precision

Signed-off-by: Swapnil Bokade <bokadeswapnil@gmail.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/fpu_helper.c             | 19 +++++++++++--------
 target-ppc/helper.h                 |  2 ++
 target-ppc/translate/vsx-impl.inc.c |  2 ++
 target-ppc/translate/vsx-ops.inc.c  |  2 ++
 4 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 4906372..8a389e1 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2497,8 +2497,9 @@ VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
  *   fld   - vsr_t field (VsrD(*) or VsrW(*))
  *   cmp   - comparison operation
  *   svxvc - set VXVC bit
+ *   exp   - expected result of comparison
  */
-#define VSX_CMP(op, nels, tp, fld, cmp, svxvc)                            \
+#define VSX_CMP(op, nels, tp, fld, cmp, svxvc, exp)                       \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
 {                                                                         \
     ppc_vsr_t xt, xa, xb;                                                 \
@@ -2523,7 +2524,7 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
             xt.fld = 0;                                                   \
             all_true = 0;                                                 \
         } else {                                                          \
-            if (tp##_##cmp(xb.fld, xa.fld, &env->fp_status) == 1) {       \
+            if (tp##_##cmp(xb.fld, xa.fld, &env->fp_status) == exp) {     \
                 xt.fld = -1;                                              \
                 all_false = 0;                                            \
             } else {                                                      \
@@ -2540,12 +2541,14 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
     float_check_status(env);                                              \
  }
 
-VSX_CMP(xvcmpeqdp, 2, float64, VsrD(i), eq, 0)
-VSX_CMP(xvcmpgedp, 2, float64, VsrD(i), le, 1)
-VSX_CMP(xvcmpgtdp, 2, float64, VsrD(i), lt, 1)
-VSX_CMP(xvcmpeqsp, 4, float32, VsrW(i), eq, 0)
-VSX_CMP(xvcmpgesp, 4, float32, VsrW(i), le, 1)
-VSX_CMP(xvcmpgtsp, 4, float32, VsrW(i), lt, 1)
+VSX_CMP(xvcmpeqdp, 2, float64, VsrD(i), eq, 0, 1)
+VSX_CMP(xvcmpgedp, 2, float64, VsrD(i), le, 1, 1)
+VSX_CMP(xvcmpgtdp, 2, float64, VsrD(i), lt, 1, 1)
+VSX_CMP(xvcmpnedp, 2, float64, VsrD(i), eq, 0, 0)
+VSX_CMP(xvcmpeqsp, 4, float32, VsrW(i), eq, 0, 1)
+VSX_CMP(xvcmpgesp, 4, float32, VsrW(i), le, 1, 1)
+VSX_CMP(xvcmpgtsp, 4, float32, VsrW(i), lt, 1, 1)
+VSX_CMP(xvcmpnesp, 4, float32, VsrW(i), eq, 0, 0)
 
 /* VSX_CVT_FP_TO_FP - VSX floating point/floating point conversion
  *   op    - instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 7d42f99..201a8cf 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -461,6 +461,7 @@ DEF_HELPER_2(xvmindp, void, env, i32)
 DEF_HELPER_2(xvcmpeqdp, void, env, i32)
 DEF_HELPER_2(xvcmpgedp, void, env, i32)
 DEF_HELPER_2(xvcmpgtdp, void, env, i32)
+DEF_HELPER_2(xvcmpnedp, void, env, i32)
 DEF_HELPER_2(xvcvdpsp, void, env, i32)
 DEF_HELPER_2(xvcvdpsxds, void, env, i32)
 DEF_HELPER_2(xvcvdpsxws, void, env, i32)
@@ -498,6 +499,7 @@ DEF_HELPER_2(xvminsp, void, env, i32)
 DEF_HELPER_2(xvcmpeqsp, void, env, i32)
 DEF_HELPER_2(xvcmpgesp, void, env, i32)
 DEF_HELPER_2(xvcmpgtsp, void, env, i32)
+DEF_HELPER_2(xvcmpnesp, void, env, i32)
 DEF_HELPER_2(xvcvspdp, void, env, i32)
 DEF_HELPER_2(xvcvspsxds, void, env, i32)
 DEF_HELPER_2(xvcvspsxws, void, env, i32)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index bf167d0..5a27be4 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -685,6 +685,7 @@ GEN_VSX_HELPER_2(xvmindp, 0x00, 0x1D, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcmpgtdp, 0x0C, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcmpgedp, 0x0C, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcmpnedp, 0x0C, 0x0F, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xvcvdpsp, 0x12, 0x18, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcvdpsxds, 0x10, 0x1D, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcvdpsxws, 0x10, 0x0D, 0, PPC2_VSX)
@@ -722,6 +723,7 @@ GEN_VSX_HELPER_2(xvminsp, 0x00, 0x19, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcmpgtsp, 0x0C, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcmpgesp, 0x0C, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcmpnesp, 0x0C, 0x0B, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xvcvspdp, 0x12, 0x1C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcvspsxds, 0x10, 0x19, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcvspsxws, 0x10, 0x09, 0, PPC2_VSX)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 202c557..3d91041 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -179,6 +179,7 @@ GEN_XX3FORM(xvmindp, 0x00, 0x1D, PPC2_VSX),
 GEN_XX3_RC_FORM(xvcmpeqdp, 0x0C, 0x0C, PPC2_VSX),
 GEN_XX3_RC_FORM(xvcmpgtdp, 0x0C, 0x0D, PPC2_VSX),
 GEN_XX3_RC_FORM(xvcmpgedp, 0x0C, 0x0E, PPC2_VSX),
+GEN_XX3_RC_FORM(xvcmpnedp, 0x0C, 0x0F, PPC2_ISA300),
 GEN_XX2FORM(xvcvdpsp, 0x12, 0x18, PPC2_VSX),
 GEN_XX2FORM(xvcvdpsxds, 0x10, 0x1D, PPC2_VSX),
 GEN_XX2FORM(xvcvdpsxws, 0x10, 0x0D, PPC2_VSX),
@@ -216,6 +217,7 @@ GEN_XX3FORM(xvminsp, 0x00, 0x19, PPC2_VSX),
 GEN_XX3_RC_FORM(xvcmpeqsp, 0x0C, 0x08, PPC2_VSX),
 GEN_XX3_RC_FORM(xvcmpgtsp, 0x0C, 0x09, PPC2_VSX),
 GEN_XX3_RC_FORM(xvcmpgesp, 0x0C, 0x0A, PPC2_VSX),
+GEN_XX3_RC_FORM(xvcmpnesp, 0x0C, 0x0B, PPC2_ISA300),
 GEN_XX2FORM(xvcvspdp, 0x12, 0x1C, PPC2_VSX),
 GEN_XX2FORM(xvcvspsxds, 0x10, 0x19, PPC2_VSX),
 GEN_XX2FORM(xvcvspsxws, 0x10, 0x09, PPC2_VSX),
-- 
2.7.4
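
The new "exp" argument lets the existing "eq" comparison serve both the equal and the not-equal forms: the element is set to all ones when the comparison result matches exp. A minimal standalone sketch of that idea follows. It uses host doubles instead of softfloat, eq_f64 and cmp_elem are hypothetical names, and whatever the part of the macro not shown in the hunks does before reaching this comparison is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a softfloat-style comparison: returns 1 or 0. */
static int eq_f64(double a, double b)
{
    return a == b;
}

/* Produce an all-ones element when cmp(b, a) matches the expected result,
 * mirroring the shape of "tp##_##cmp(xb.fld, xa.fld, ...) == exp". */
static uint64_t cmp_elem(double a, double b,
                         int (*cmp)(double, double), int exp)
{
    assert(exp == 0 || exp == 1);
    return cmp(b, a) == exp ? UINT64_MAX : 0;
}
```

Passing exp = 1 gives the compare-equal behaviour, and exp = 0 with the same eq_f64 gives compare-not-equal.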

* Re: [Qemu-devel] [PATCH v2 2/6] bitops: fix rol/ror when shift is zero
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 2/6] bitops: fix rol/ror when shift is zero Nikunj A Dadhania
@ 2016-10-26 15:20   ` Richard Henderson
  2016-10-27  3:51     ` David Gibson
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Henderson @ 2016-10-26 15:20 UTC (permalink / raw)
  To: Nikunj A Dadhania, qemu-ppc, david
  Cc: qemu-devel, bharata, sandipandas1990, ego

On 10/25/2016 11:26 PM, Nikunj A Dadhania wrote:
> All the variants of rol/ror have a bug when shift == 0.
> For example, rol32 would generate:
> 
>     return (word << 0) | (word >> 32);
> 
> Although this works in practice, shifting a 32-bit value by 32 is
> undefined behaviour and is flagged as a runtime error by clang's
> sanitizer.
> 
> Suggested-by: Richard Henderson <rth@twiddle.net>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  include/qemu/bitops.h | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~
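
For readers without the patch body at hand (it is not quoted in this reply), one common way to avoid the out-of-range shift is to mask the complementary shift count. The sketch below illustrates that technique only; it is not necessarily the exact change in the patch, and rol32_safe is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name; same shape as a rol32-style helper. */
static inline uint32_t rol32_safe(uint32_t word, unsigned int shift)
{
    assert(shift < 32);
    /* (32 - shift) & 31 turns the right shift into 0 when shift == 0,
     * so the expression never shifts a 32-bit value by 32. */
    return (word << shift) | (word >> ((32 - shift) & 31));
}
```

For shift == 0 both halves evaluate to the unchanged word, so the OR still yields the correct result.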

* Re: [Qemu-devel] [PATCH v2 1/6] target-ppc: add xscmp[eq, gt, ge, ne]dp instructions
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 1/6] target-ppc: add xscmp[eq, gt, ge, ne]dp instructions Nikunj A Dadhania
@ 2016-10-27  3:34   ` David Gibson
  0 siblings, 0 replies; 23+ messages in thread
From: David Gibson @ 2016-10-27  3:34 UTC (permalink / raw)
  To: Nikunj A Dadhania
  Cc: qemu-ppc, rth, qemu-devel, bharata, sandipandas1990, ego

On Wed, Oct 26, 2016 at 11:56:24AM +0530, Nikunj A Dadhania wrote:
> From: Sandipan Das <sandipandas1990@gmail.com>
> 
> xscmpeqdp: VSX Scalar Compare Equal Double-Precision
> xscmpgedp: VSX Scalar Compare Greater Than or Equal Double-Precision
> xscmpgtdp: VSX Scalar Compare Greater Than Double-Precision
> xscmpnedp: VSX Scalar Compare Not Equal Double-Precision
> 
> Signed-off-by: Sandipan Das <sandipandas1990@gmail.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Applied to ppc-for-2.8.

> ---
>  target-ppc/fpu_helper.c             | 52 +++++++++++++++++++++++++++++++++++++
>  target-ppc/helper.h                 |  4 +++
>  target-ppc/translate/vsx-impl.inc.c |  4 +++
>  target-ppc/translate/vsx-ops.inc.c  |  4 +++
>  4 files changed, 64 insertions(+)
> 
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index b0760f0..4906372 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2362,6 +2362,58 @@ VSX_MADD(xvnmaddmsp, 4, float32, VsrW(i), NMADD_FLGS, 0, 0, 0)
>  VSX_MADD(xvnmsubasp, 4, float32, VsrW(i), NMSUB_FLGS, 1, 0, 0)
>  VSX_MADD(xvnmsubmsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0, 0)
>  
> +/* VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
> + *   op    - instruction mnemonic
> + *   cmp   - comparison operation
> + *   exp   - expected result of comparison
> + *   svxvc - set VXVC bit
> + */
> +#define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc)                                \
> +void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
> +{                                                                             \
> +    ppc_vsr_t xt, xa, xb;                                                     \
> +    bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false;            \
> +                                                                              \
> +    getVSR(xA(opcode), &xa, env);                                             \
> +    getVSR(xB(opcode), &xb, env);                                             \
> +    getVSR(xT(opcode), &xt, env);                                             \
> +                                                                              \
> +    if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||              \
> +        float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {              \
> +        vxsnan_flag = true;                                                   \
> +        if (fpscr_ve == 0 && svxvc) {                                         \
> +            vxvc_flag = true;                                                 \
> +        }                                                                     \
> +    } else if (svxvc) {                                                       \
> +        vxvc_flag = float64_is_quiet_nan(xa.VsrD(0), &env->fp_status) ||      \
> +            float64_is_quiet_nan(xb.VsrD(0), &env->fp_status);                \
> +    }                                                                         \
> +    if (vxsnan_flag) {                                                        \
> +        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);                \
> +    }                                                                         \
> +    if (vxvc_flag) {                                                          \
> +        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);                  \
> +    }                                                                         \
> +    vex_flag = fpscr_ve && (vxvc_flag || vxsnan_flag);                        \
> +                                                                              \
> +    if (!vex_flag) {                                                          \
> +        if (float64_##cmp(xb.VsrD(0), xa.VsrD(0), &env->fp_status) == exp) {  \
> +            xt.VsrD(0) = -1;                                                  \
> +            xt.VsrD(1) = 0;                                                   \
> +        } else {                                                              \
> +            xt.VsrD(0) = 0;                                                   \
> +            xt.VsrD(1) = 0;                                                   \
> +        }                                                                     \
> +    }                                                                         \
> +    putVSR(xT(opcode), &xt, env);                                             \
> +    helper_float_check_status(env);                                           \
> +}
> +
> +VSX_SCALAR_CMP_DP(xscmpeqdp, eq, 1, 0)
> +VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
> +VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
> +VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
> +
>  #define VSX_SCALAR_CMP(op, ordered)                                      \
>  void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
>  {                                                                        \
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 5fcc546..0337292 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -389,6 +389,10 @@ DEF_HELPER_2(xsnmaddadp, void, env, i32)
>  DEF_HELPER_2(xsnmaddmdp, void, env, i32)
>  DEF_HELPER_2(xsnmsubadp, void, env, i32)
>  DEF_HELPER_2(xsnmsubmdp, void, env, i32)
> +DEF_HELPER_2(xscmpeqdp, void, env, i32)
> +DEF_HELPER_2(xscmpgtdp, void, env, i32)
> +DEF_HELPER_2(xscmpgedp, void, env, i32)
> +DEF_HELPER_2(xscmpnedp, void, env, i32)
>  DEF_HELPER_2(xscmpodp, void, env, i32)
>  DEF_HELPER_2(xscmpudp, void, env, i32)
>  DEF_HELPER_2(xsmaxdp, void, env, i32)
> diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
> index 1508bd1..bf167d0 100644
> --- a/target-ppc/translate/vsx-impl.inc.c
> +++ b/target-ppc/translate/vsx-impl.inc.c
> @@ -620,6 +620,10 @@ GEN_VSX_HELPER_2(xsnmaddadp, 0x04, 0x14, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xsnmaddmdp, 0x04, 0x15, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)
> +GEN_VSX_HELPER_2(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
> +GEN_VSX_HELPER_2(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
> +GEN_VSX_HELPER_2(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
> +GEN_VSX_HELPER_2(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
>  GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
> diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
> index af0d27e..202c557 100644
> --- a/target-ppc/translate/vsx-ops.inc.c
> +++ b/target-ppc/translate/vsx-ops.inc.c
> @@ -114,6 +114,10 @@ GEN_XX3FORM(xsnmaddadp, 0x04, 0x14, PPC2_VSX),
>  GEN_XX3FORM(xsnmaddmdp, 0x04, 0x15, PPC2_VSX),
>  GEN_XX3FORM(xsnmsubadp, 0x04, 0x16, PPC2_VSX),
>  GEN_XX3FORM(xsnmsubmdp, 0x04, 0x17, PPC2_VSX),
> +GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
> +GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
> +GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
> +GEN_XX3FORM(xscmpnedp, 0x0C, 0x03, PPC2_ISA300),
>  GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
>  GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
>  GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldnmi and vrlwmi instructions
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldnmi and vrlwmi instructions Nikunj A Dadhania
@ 2016-10-27  3:38   ` David Gibson
  2016-10-27  8:33     ` Nikunj A Dadhania
  0 siblings, 1 reply; 23+ messages in thread
From: David Gibson @ 2016-10-27  3:38 UTC (permalink / raw)
  To: Nikunj A Dadhania
  Cc: qemu-ppc, rth, qemu-devel, bharata, sandipandas1990, ego

On Wed, Oct 26, 2016 at 11:56:26AM +0530, Nikunj A Dadhania wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> vrldmi: Vector Rotate Left Dword then Mask Insert
> vrlwmi: Vector Rotate Left Word then Mask Insert
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ( use extract[32,64] and rol[32,64] )
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  disas/ppc.c                         |  2 ++
>  target-ppc/helper.h                 |  2 ++
>  target-ppc/int_helper.c             | 46 +++++++++++++++++++++++++++++++++++++
>  target-ppc/translate/vmx-impl.inc.c |  6 +++++
>  target-ppc/translate/vmx-ops.inc.c  |  4 ++--
>  5 files changed, 58 insertions(+), 2 deletions(-)
> 
> diff --git a/disas/ppc.c b/disas/ppc.c
> index 052cebe..32f0d8d 100644
> --- a/disas/ppc.c
> +++ b/disas/ppc.c
> @@ -2286,6 +2286,8 @@ const struct powerpc_opcode powerpc_opcodes[] = {
>  { "vrlh",      VX(4,   68), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
>  { "vrlw",      VX(4,  132), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
>  { "vrsqrtefp", VX(4,  330), VX_MASK,	PPCVEC,		{ VD, VB } },
> +{ "vrldmi",    VX(4,  197), VX_MASK,    PPCVEC,         { VD, VA, VB } },
> +{ "vrlwmi",    VX(4,  133), VX_MASK,    PPCVEC,         { VD, VA, VB} },
>  { "vsel",      VXA(4,  42), VXA_MASK,	PPCVEC,		{ VD, VA, VB, VC } },
>  { "vsl",       VX(4,  452), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
>  { "vslb",      VX(4,  260), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 0337292..9fb8f0d 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -325,6 +325,8 @@ DEF_HELPER_4(vmaxfp, void, env, avr, avr, avr)
>  DEF_HELPER_4(vminfp, void, env, avr, avr, avr)
>  DEF_HELPER_3(vrefp, void, env, avr, avr)
>  DEF_HELPER_3(vrsqrtefp, void, env, avr, avr)
> +DEF_HELPER_3(vrlwmi, void, avr, avr, avr)
> +DEF_HELPER_3(vrldmi, void, avr, avr, avr)
>  DEF_HELPER_5(vmaddfp, void, env, avr, avr, avr, avr)
>  DEF_HELPER_5(vnmsubfp, void, env, avr, avr, avr, avr)
>  DEF_HELPER_3(vexptefp, void, env, avr, avr)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index dca4798..b54cd7c 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -1717,6 +1717,52 @@ void helper_vrsqrtefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
>      }
>  }
>  
> +#define MASK(size, max_val)                                     \
> +static inline uint##size##_t mask_u##size(uint##size##_t start, \
> +                                uint##size##_t end)             \
> +{                                                               \
> +    uint##size##_t ret, max_bit = size - 1;                     \
> +                                                                \
> +    if (likely(start == 0)) {                                   \
> +        ret = max_val << (max_bit - end);                       \
> +    } else if (likely(end == max_bit)) {                        \
> +        ret = max_val >> start;                                 \
> +    } else {                                                    \
> +        ret = (((uint##size##_t)(-1ULL)) >> (start)) ^          \
> +            (((uint##size##_t)(-1ULL) >> (end)) >> 1);          \
> +        if (unlikely(start > end)) {                            \
> +            return ~ret;                                        \
> +        }                                                       \
> +    }                                                           \
> +                                                                \
> +    return ret;                                                 \
> +}
> +
> +MASK(32, UINT32_MAX);
> +MASK(64, UINT64_MAX);

It would be nicer to merge this mask generation with the
implementation in target-ppc/translate.c (called MASK()).
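
For reference, the general branch of the quoted mask_u##size helper can be exercised on its own. The sketch below is a 32-bit restatement of that branch, assuming Power ISA bit numbering (bit 0 is the most significant bit); mask_u32_sketch is a hypothetical name, and the start == 0 and end == 31 fast paths are omitted because the general expression covers them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical restatement of the general branch of the quoted helper.
 * Returns ones in bits start..end (MSB-0 numbering); when start > end
 * the mask wraps around and the expression is inverted. */
static inline uint32_t mask_u32_sketch(uint32_t start, uint32_t end)
{
    uint32_t ret;

    assert(start < 32 && end < 32);
    /* Shifting right by 'end' and then by 1 avoids a shift by 32. */
    ret = (UINT32_MAX >> start) ^ ((UINT32_MAX >> end) >> 1);
    return start > end ? ~ret : ret;
}
```

For example, start = 0 and end = 7 selects the top byte, and start = 28 with end = 3 gives the wrapped mask 0xF000000F.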

> +
> +#define VRLMI(name, size, element)                                    \
> +void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)          \
> +{                                                                     \
> +    int i;                                                            \
> +    for (i = 0; i < ARRAY_SIZE(r->element); i++) {                    \
> +        uint##size##_t src1 = a->element[i];                          \
> +        uint##size##_t src2 = b->element[i];                          \
> +        uint##size##_t src3 = r->element[i];                          \
> +        uint##size##_t begin, end, shift, mask, rot_val;              \
> +                                                                      \
> +        shift = extract##size(src2, 0, 6);                            \
> +        end   = extract##size(src2, 8, 6);                            \
> +        begin = extract##size(src2, 16, 6);                           \
> +        rot_val = rol##size(src1, shift);                             \
> +        mask = mask_u##size(begin, end);                              \
> +        r->element[i] = (rot_val & mask) | (src3 & ~mask);            \
> +    }                                                                 \
> +}
> +
> +VRLMI(vrldmi, 64, u64);
> +VRLMI(vrlwmi, 32, u32);
> +
>  void helper_vsel(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
>                   ppc_avr_t *c)
>  {
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index fc612d9..fdfbd6a 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -488,7 +488,13 @@ GEN_VXFORM_DUAL(vsubeuqm, PPC_NONE, PPC2_ALTIVEC_207, \
>  GEN_VXFORM(vrlb, 2, 0);
>  GEN_VXFORM(vrlh, 2, 1);
>  GEN_VXFORM(vrlw, 2, 2);
> +GEN_VXFORM(vrlwmi, 2, 2);
> +GEN_VXFORM_DUAL(vrlw, PPC_ALTIVEC, PPC_NONE, \
> +                vrlwmi, PPC_NONE, PPC2_ISA300)
>  GEN_VXFORM(vrld, 2, 3);
> +GEN_VXFORM(vrldmi, 2, 3);
> +GEN_VXFORM_DUAL(vrld, PPC_NONE, PPC2_ALTIVEC_207, \
> +                vrldmi, PPC_NONE, PPC2_ISA300)
>  GEN_VXFORM(vsl, 2, 7);
>  GEN_VXFORM(vsr, 2, 11);
>  GEN_VXFORM_ENV(vpkuhum, 7, 0);
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index cc7ed7e..76b3593 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -143,8 +143,8 @@ GEN_VXFORM_207(vsubcuq, 0, 21),
>  GEN_VXFORM_DUAL(vsubeuqm, vsubecuq, 31, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
>  GEN_VXFORM(vrlb, 2, 0),
>  GEN_VXFORM(vrlh, 2, 1),
> -GEN_VXFORM(vrlw, 2, 2),
> -GEN_VXFORM_207(vrld, 2, 3),
> +GEN_VXFORM_DUAL(vrlw, vrlwmi, 2, 2, PPC_ALTIVEC, PPC_NONE),
> +GEN_VXFORM_DUAL(vrld, vrldmi, 2, 3, PPC_NONE, PPC2_ALTIVEC_207),
>  GEN_VXFORM(vsl, 2, 7),
>  GEN_VXFORM(vsr, 2, 11),
>  GEN_VXFORM(vpkuhum, 7, 0),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH v2 4/6] target-ppc: add vrldnm and vrlwnm instructions
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 4/6] target-ppc: add vrldnm and vrlwnm instructions Nikunj A Dadhania
@ 2016-10-27  3:39   ` David Gibson
  0 siblings, 0 replies; 23+ messages in thread
From: David Gibson @ 2016-10-27  3:39 UTC (permalink / raw)
  To: Nikunj A Dadhania
  Cc: qemu-ppc, rth, qemu-devel, bharata, sandipandas1990, ego

On Wed, Oct 26, 2016 at 11:56:27AM +0530, Nikunj A Dadhania wrote:
> From: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> vrldnm: Vector Rotate Left Doubleword then AND with Mask
> vrlwnm: Vector Rotate Left Word then AND with Mask
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

With the caveat that I'd like to see the previous patch, which this one
builds on, share the mask generation with translate.c rather than
duplicate it.
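
To make the difference between the two instruction families concrete: with insert set, bits outside the mask keep the old destination value; without it, they are cleared. A per-element sketch follows, with the shift and mask already decoded. rotl32 and rot_mask_elem are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate left without ever shifting by the full width. */
static inline uint32_t rotl32(uint32_t x, unsigned int n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Hypothetical per-element model of the VRLMI macro body:
 * insert != 0 -> rotate, then merge into 'old' under 'mask' (vrl*mi)
 * insert == 0 -> rotate, then AND with 'mask'               (vrl*nm) */
static inline uint32_t rot_mask_elem(uint32_t src, uint32_t old,
                                     unsigned int shift, uint32_t mask,
                                     bool insert)
{
    uint32_t rot = rotl32(src, shift);

    return insert ? (rot & mask) | (old & ~mask) : (rot & mask);
}
```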

> ---
>  disas/ppc.c                         |  2 ++
>  target-ppc/helper.h                 |  2 ++
>  target-ppc/int_helper.c             | 14 ++++++++++----
>  target-ppc/translate/vmx-impl.inc.c |  6 ++++++
>  target-ppc/translate/vmx-ops.inc.c  |  4 ++--
>  5 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/disas/ppc.c b/disas/ppc.c
> index 32f0d8d..bd05623 100644
> --- a/disas/ppc.c
> +++ b/disas/ppc.c
> @@ -2287,7 +2287,9 @@ const struct powerpc_opcode powerpc_opcodes[] = {
>  { "vrlw",      VX(4,  132), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
>  { "vrsqrtefp", VX(4,  330), VX_MASK,	PPCVEC,		{ VD, VB } },
>  { "vrldmi",    VX(4,  197), VX_MASK,    PPCVEC,         { VD, VA, VB } },
> +{ "vrldnm",    VX(4,  453), VX_MASK,    PPCVEC,         { VD, VA, VB } },
>  { "vrlwmi",    VX(4,  133), VX_MASK,    PPCVEC,         { VD, VA, VB} },
> +{ "vrlwnm",    VX(4,  389), VX_MASK,    PPCVEC,         { VD, VA, VB } },
>  { "vsel",      VXA(4,  42), VXA_MASK,	PPCVEC,		{ VD, VA, VB, VC } },
>  { "vsl",       VX(4,  452), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
>  { "vslb",      VX(4,  260), VX_MASK,	PPCVEC,		{ VD, VA, VB } },
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 9fb8f0d..d6ee26e 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -327,6 +327,8 @@ DEF_HELPER_3(vrefp, void, env, avr, avr)
>  DEF_HELPER_3(vrsqrtefp, void, env, avr, avr)
>  DEF_HELPER_3(vrlwmi, void, avr, avr, avr)
>  DEF_HELPER_3(vrldmi, void, avr, avr, avr)
> +DEF_HELPER_3(vrldnm, void, avr, avr, avr)
> +DEF_HELPER_3(vrlwnm, void, avr, avr, avr)
>  DEF_HELPER_5(vmaddfp, void, env, avr, avr, avr, avr)
>  DEF_HELPER_5(vnmsubfp, void, env, avr, avr, avr, avr)
>  DEF_HELPER_3(vexptefp, void, env, avr, avr)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index b54cd7c..0fd92ed 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -1741,7 +1741,7 @@ static inline uint##size##_t mask_u##size(uint##size##_t start, \
>  MASK(32, UINT32_MAX);
>  MASK(64, UINT64_MAX);
>  
> -#define VRLMI(name, size, element)                                    \
> +#define VRLMI(name, size, element, insert)                            \
>  void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)          \
>  {                                                                     \
>      int i;                                                            \
> @@ -1756,12 +1756,18 @@ void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)          \
>          begin = extract##size(src2, 16, 6);                           \
>          rot_val = rol##size(src1, shift);                             \
>          mask = mask_u##size(begin, end);                              \
> -        r->element[i] = (rot_val & mask) | (src3 & ~mask);            \
> +        if (insert) {                                                 \
> +            r->element[i] = (rot_val & mask) | (src3 & ~mask);        \
> +        } else {                                                      \
> +            r->element[i] = (rot_val & mask);                         \
> +        }                                                             \
>      }                                                                 \
>  }
>  
> -VRLMI(vrldmi, 64, u64);
> -VRLMI(vrlwmi, 32, u32);
> +VRLMI(vrldmi, 64, u64, 1);
> +VRLMI(vrlwmi, 32, u32, 1);
> +VRLMI(vrldnm, 64, u64, 0);
> +VRLMI(vrlwnm, 32, u32, 0);
>  
>  void helper_vsel(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
>                   ppc_avr_t *c)
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index fdfbd6a..500c43f 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -442,6 +442,9 @@ GEN_VXFORM(vmulesw, 4, 14);
>  GEN_VXFORM(vslb, 2, 4);
>  GEN_VXFORM(vslh, 2, 5);
>  GEN_VXFORM(vslw, 2, 6);
> +GEN_VXFORM(vrlwnm, 2, 6);
> +GEN_VXFORM_DUAL(vslw, PPC_ALTIVEC, PPC_NONE, \
> +                vrlwnm, PPC_NONE, PPC2_ISA300)
>  GEN_VXFORM(vsld, 2, 23);
>  GEN_VXFORM(vsrb, 2, 8);
>  GEN_VXFORM(vsrh, 2, 9);
> @@ -496,6 +499,9 @@ GEN_VXFORM(vrldmi, 2, 3);
>  GEN_VXFORM_DUAL(vrld, PPC_NONE, PPC2_ALTIVEC_207, \
>                  vrldmi, PPC_NONE, PPC2_ISA300)
>  GEN_VXFORM(vsl, 2, 7);
> +GEN_VXFORM(vrldnm, 2, 7);
> +GEN_VXFORM_DUAL(vsl, PPC_ALTIVEC, PPC_NONE, \
> +                vrldnm, PPC_NONE, PPC2_ISA300)
>  GEN_VXFORM(vsr, 2, 11);
>  GEN_VXFORM_ENV(vpkuhum, 7, 0);
>  GEN_VXFORM_ENV(vpkuwum, 7, 1);
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index 76b3593..a5ad4d4 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -107,7 +107,7 @@ GEN_VXFORM(vmulesh, 4, 13),
>  GEN_VXFORM_207(vmulesw, 4, 14),
>  GEN_VXFORM(vslb, 2, 4),
>  GEN_VXFORM(vslh, 2, 5),
> -GEN_VXFORM(vslw, 2, 6),
> +GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_207(vsld, 2, 23),
>  GEN_VXFORM(vsrb, 2, 8),
>  GEN_VXFORM(vsrh, 2, 9),
> @@ -145,7 +145,7 @@ GEN_VXFORM(vrlb, 2, 0),
>  GEN_VXFORM(vrlh, 2, 1),
>  GEN_VXFORM_DUAL(vrlw, vrlwmi, 2, 2, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_DUAL(vrld, vrldmi, 2, 3, PPC_NONE, PPC2_ALTIVEC_207),
> -GEN_VXFORM(vsl, 2, 7),
> +GEN_VXFORM_DUAL(vsl, vrldnm, 2, 7, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM(vsr, 2, 11),
>  GEN_VXFORM(vpkuhum, 7, 0),
>  GEN_VXFORM(vpkuwum, 7, 1),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions Nikunj A Dadhania
@ 2016-10-27  3:47   ` David Gibson
  2016-10-27  5:22     ` Richard Henderson
  0 siblings, 1 reply; 23+ messages in thread
From: David Gibson @ 2016-10-27  3:47 UTC (permalink / raw)
  To: Nikunj A Dadhania
  Cc: qemu-ppc, rth, qemu-devel, bharata, sandipandas1990, ego, Ankit Kumar

On Wed, Oct 26, 2016 at 11:56:28AM +0530, Nikunj A Dadhania wrote:
> From: Ankit Kumar <ankit@linux.vnet.ibm.com>
> 
> Add the following POWER ISA 3.0 instructions:
> vprtybw: Vector Parity Byte Word
> vprtybd: Vector Parity Byte Double Word
> vprtybq: Vector Parity Byte Quad Word
> 
> Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target-ppc/helper.h                 |  3 +++
>  target-ppc/int_helper.c             | 31 +++++++++++++++++++++++++++++++
>  target-ppc/translate/vmx-impl.inc.c |  3 +++
>  target-ppc/translate/vmx-ops.inc.c  |  4 ++++
>  4 files changed, 41 insertions(+)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index d6ee26e..7d42f99 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -223,6 +223,9 @@ DEF_HELPER_3(vsro, void, avr, avr, avr)
>  DEF_HELPER_3(vsrv, void, avr, avr, avr)
>  DEF_HELPER_3(vslv, void, avr, avr, avr)
>  DEF_HELPER_3(vaddcuw, void, avr, avr, avr)
> +DEF_HELPER_2(vprtybw, void, avr, avr)
> +DEF_HELPER_2(vprtybd, void, avr, avr)
> +DEF_HELPER_2(vprtybq, void, avr, avr)
>  DEF_HELPER_3(vsubcuw, void, avr, avr, avr)
>  DEF_HELPER_2(lvsl, void, avr, tl)
>  DEF_HELPER_2(lvsr, void, avr, tl)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 0fd92ed..358ffff 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -527,6 +527,37 @@ void helper_vaddcuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
>      }
>  }
>  
> +/* vprtyb[w/d] */
> +#define VPRTYB(name, element)                               \
> +void glue(helper_, name)(ppc_avr_t *r, ppc_avr_t *b)        \
> +{                                                           \
> +    int i, j;                                               \
> +    uint8_t s;                                              \
> +    int nr_b = sizeof(b->element[0]) / sizeof(b->u8[0]);    \
> +    for (i = 0; i < ARRAY_SIZE(r->element); i++) {          \
> +        s = 0;                                              \
> +        for (j = i * nr_b; j < (i + 1) * nr_b; j++) {       \
> +            s ^= (b->u8[j] & 1);                            \
> +        }                                                   \
> +        r->element[i] = (!s) ? 0 : 1;                       \
> +    }                                                       \
> +}
> +VPRTYB(vprtybw, u32)
> +VPRTYB(vprtybd, u64)
> +#undef VPRTYB
> +
> +/* vprtybq */
> +void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
> +{
> +    int i;
> +    uint8_t s = 0;
> +    for (i = 0; i < 16; i++) {
> +        s ^= (b->u8[i] & 1);
> +    }
> +    r->u64[LO_IDX] = (!s) ? 0 : 1;
> +    r->u64[HI_IDX] = 0;
> +}
> +

I think you can implement these better.  First mask with 0x01010101
(of the appropriate length) to extract the LSB of each byte.
Then XOR the two halves together, then the quarters and so forth,
log2(size) times, to arrive at the parity.  This is similar to the
usual Hamming weight implementation.

>  #define VARITH_DO(name, op, element)                                    \
>      void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)       \
>      {                                                                   \
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index 500c43f..e1d0897 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -705,6 +705,9 @@ GEN_VXFORM_NOA_ENV(vrfim, 5, 11);
>  GEN_VXFORM_NOA_ENV(vrfin, 5, 8);
>  GEN_VXFORM_NOA_ENV(vrfip, 5, 10);
>  GEN_VXFORM_NOA_ENV(vrfiz, 5, 9);
> +GEN_VXFORM_NOA(vprtybw, 1, 24);
> +GEN_VXFORM_NOA(vprtybd, 1, 24);
> +GEN_VXFORM_NOA(vprtybq, 1, 24);
>  
>  #define GEN_VXFORM_SIMM(name, opc2, opc3)                               \
>  static void glue(gen_, name)(DisasContext *ctx)                                 \
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index a5ad4d4..c631780 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -122,6 +122,10 @@ GEN_VXFORM_300(vslv, 2, 29),
>  GEN_VXFORM(vslo, 6, 16),
>  GEN_VXFORM(vsro, 6, 17),
>  GEN_VXFORM(vaddcuw, 0, 6),
> +GEN_HANDLER_E_2(vprtybw, 0x4, 0x1, 0x18, 8, 0, PPC_NONE, PPC2_ISA300),
> +GEN_HANDLER_E_2(vprtybd, 0x4, 0x1, 0x18, 9, 0, PPC_NONE, PPC2_ISA300),
> +GEN_HANDLER_E_2(vprtybq, 0x4, 0x1, 0x18, 10, 0, PPC_NONE, PPC2_ISA300),
> +
>  GEN_VXFORM(vsubcuw, 0, 22),
>  GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE),


* Re: [Qemu-devel] [PATCH v2 6/6] target-ppc: Add xvcmpnesp, xvcmpnedp instructions
  2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 6/6] target-ppc: Add xvcmpnesp, xvcmpnedp instructions Nikunj A Dadhania
@ 2016-10-27  3:50   ` David Gibson
  0 siblings, 0 replies; 23+ messages in thread
From: David Gibson @ 2016-10-27  3:50 UTC (permalink / raw)
  To: Nikunj A Dadhania
  Cc: qemu-ppc, rth, qemu-devel, bharata, sandipandas1990, ego, Swapnil Bokade

On Wed, Oct 26, 2016 at 11:56:29AM +0530, Nikunj A Dadhania wrote:
> From: Swapnil Bokade <bokadeswapnil@gmail.com>
> 
> xvcmpnedp[.]: VSX Vector Compare Not Equal Double-Precision
> xvcmpnesp[.]: VSX Vector Compare Not Equal Single-Precision
> 
> Signed-off-by: Swapnil Bokade <bokadeswapnil@gmail.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Applied to ppc-for-2.8

> ---
>  target-ppc/fpu_helper.c             | 19 +++++++++++--------
>  target-ppc/helper.h                 |  2 ++
>  target-ppc/translate/vsx-impl.inc.c |  2 ++
>  target-ppc/translate/vsx-ops.inc.c  |  2 ++
>  4 files changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index 4906372..8a389e1 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2497,8 +2497,9 @@ VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
>   *   fld   - vsr_t field (VsrD(*) or VsrW(*))
>   *   cmp   - comparison operation
>   *   svxvc - set VXVC bit
> + *   exp   - expected result of comparison
>   */
> -#define VSX_CMP(op, nels, tp, fld, cmp, svxvc)                            \
> +#define VSX_CMP(op, nels, tp, fld, cmp, svxvc, exp)                       \
>  void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
>  {                                                                         \
>      ppc_vsr_t xt, xa, xb;                                                 \
> @@ -2523,7 +2524,7 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
>              xt.fld = 0;                                                   \
>              all_true = 0;                                                 \
>          } else {                                                          \
> -            if (tp##_##cmp(xb.fld, xa.fld, &env->fp_status) == 1) {       \
> +            if (tp##_##cmp(xb.fld, xa.fld, &env->fp_status) == exp) {     \
>                  xt.fld = -1;                                              \
>                  all_false = 0;                                            \
>              } else {                                                      \
> @@ -2540,12 +2541,14 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
>      float_check_status(env);                                              \
>   }
>  
> -VSX_CMP(xvcmpeqdp, 2, float64, VsrD(i), eq, 0)
> -VSX_CMP(xvcmpgedp, 2, float64, VsrD(i), le, 1)
> -VSX_CMP(xvcmpgtdp, 2, float64, VsrD(i), lt, 1)
> -VSX_CMP(xvcmpeqsp, 4, float32, VsrW(i), eq, 0)
> -VSX_CMP(xvcmpgesp, 4, float32, VsrW(i), le, 1)
> -VSX_CMP(xvcmpgtsp, 4, float32, VsrW(i), lt, 1)
> +VSX_CMP(xvcmpeqdp, 2, float64, VsrD(i), eq, 0, 1)
> +VSX_CMP(xvcmpgedp, 2, float64, VsrD(i), le, 1, 1)
> +VSX_CMP(xvcmpgtdp, 2, float64, VsrD(i), lt, 1, 1)
> +VSX_CMP(xvcmpnedp, 2, float64, VsrD(i), eq, 0, 0)
> +VSX_CMP(xvcmpeqsp, 4, float32, VsrW(i), eq, 0, 1)
> +VSX_CMP(xvcmpgesp, 4, float32, VsrW(i), le, 1, 1)
> +VSX_CMP(xvcmpgtsp, 4, float32, VsrW(i), lt, 1, 1)
> +VSX_CMP(xvcmpnesp, 4, float32, VsrW(i), eq, 0, 0)
>  
>  /* VSX_CVT_FP_TO_FP - VSX floating point/floating point conversion
>   *   op    - instruction mnemonic
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 7d42f99..201a8cf 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -461,6 +461,7 @@ DEF_HELPER_2(xvmindp, void, env, i32)
>  DEF_HELPER_2(xvcmpeqdp, void, env, i32)
>  DEF_HELPER_2(xvcmpgedp, void, env, i32)
>  DEF_HELPER_2(xvcmpgtdp, void, env, i32)
> +DEF_HELPER_2(xvcmpnedp, void, env, i32)
>  DEF_HELPER_2(xvcvdpsp, void, env, i32)
>  DEF_HELPER_2(xvcvdpsxds, void, env, i32)
>  DEF_HELPER_2(xvcvdpsxws, void, env, i32)
> @@ -498,6 +499,7 @@ DEF_HELPER_2(xvminsp, void, env, i32)
>  DEF_HELPER_2(xvcmpeqsp, void, env, i32)
>  DEF_HELPER_2(xvcmpgesp, void, env, i32)
>  DEF_HELPER_2(xvcmpgtsp, void, env, i32)
> +DEF_HELPER_2(xvcmpnesp, void, env, i32)
>  DEF_HELPER_2(xvcvspdp, void, env, i32)
>  DEF_HELPER_2(xvcvspsxds, void, env, i32)
>  DEF_HELPER_2(xvcvspsxws, void, env, i32)
> diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
> index bf167d0..5a27be4 100644
> --- a/target-ppc/translate/vsx-impl.inc.c
> +++ b/target-ppc/translate/vsx-impl.inc.c
> @@ -685,6 +685,7 @@ GEN_VSX_HELPER_2(xvmindp, 0x00, 0x1D, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcmpgtdp, 0x0C, 0x0D, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcmpgedp, 0x0C, 0x0E, 0, PPC2_VSX)
> +GEN_VSX_HELPER_2(xvcmpnedp, 0x0C, 0x0F, 0, PPC2_ISA300)
>  GEN_VSX_HELPER_2(xvcvdpsp, 0x12, 0x18, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcvdpsxds, 0x10, 0x1D, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcvdpsxws, 0x10, 0x0D, 0, PPC2_VSX)
> @@ -722,6 +723,7 @@ GEN_VSX_HELPER_2(xvminsp, 0x00, 0x19, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcmpgtsp, 0x0C, 0x09, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcmpgesp, 0x0C, 0x0A, 0, PPC2_VSX)
> +GEN_VSX_HELPER_2(xvcmpnesp, 0x0C, 0x0B, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcvspdp, 0x12, 0x1C, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcvspsxds, 0x10, 0x19, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xvcvspsxws, 0x10, 0x09, 0, PPC2_VSX)
> diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
> index 202c557..3d91041 100644
> --- a/target-ppc/translate/vsx-ops.inc.c
> +++ b/target-ppc/translate/vsx-ops.inc.c
> @@ -179,6 +179,7 @@ GEN_XX3FORM(xvmindp, 0x00, 0x1D, PPC2_VSX),
>  GEN_XX3_RC_FORM(xvcmpeqdp, 0x0C, 0x0C, PPC2_VSX),
>  GEN_XX3_RC_FORM(xvcmpgtdp, 0x0C, 0x0D, PPC2_VSX),
>  GEN_XX3_RC_FORM(xvcmpgedp, 0x0C, 0x0E, PPC2_VSX),
> +GEN_XX3_RC_FORM(xvcmpnedp, 0x0C, 0x0F, PPC2_ISA300),
>  GEN_XX2FORM(xvcvdpsp, 0x12, 0x18, PPC2_VSX),
>  GEN_XX2FORM(xvcvdpsxds, 0x10, 0x1D, PPC2_VSX),
>  GEN_XX2FORM(xvcvdpsxws, 0x10, 0x0D, PPC2_VSX),
> @@ -216,6 +217,7 @@ GEN_XX3FORM(xvminsp, 0x00, 0x19, PPC2_VSX),
>  GEN_XX3_RC_FORM(xvcmpeqsp, 0x0C, 0x08, PPC2_VSX),
>  GEN_XX3_RC_FORM(xvcmpgtsp, 0x0C, 0x09, PPC2_VSX),
>  GEN_XX3_RC_FORM(xvcmpgesp, 0x0C, 0x0A, PPC2_VSX),
> +GEN_XX3_RC_FORM(xvcmpnesp, 0x0C, 0x0B, PPC2_ISA300),
>  GEN_XX2FORM(xvcvspdp, 0x12, 0x1C, PPC2_VSX),
>  GEN_XX2FORM(xvcvspsxds, 0x10, 0x19, PPC2_VSX),
>  GEN_XX2FORM(xvcvspsxws, 0x10, 0x09, PPC2_VSX),


* Re: [Qemu-devel] [PATCH v2 2/6] bitops: fix rol/ror when shift is zero
  2016-10-26 15:20   ` Richard Henderson
@ 2016-10-27  3:51     ` David Gibson
  2016-10-30  2:57       ` Nikunj A Dadhania
  0 siblings, 1 reply; 23+ messages in thread
From: David Gibson @ 2016-10-27  3:51 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Nikunj A Dadhania, qemu-ppc, qemu-devel, bharata, sandipandas1990, ego

On Wed, Oct 26, 2016 at 08:20:10AM -0700, Richard Henderson wrote:
> On 10/25/2016 11:26 PM, Nikunj A Dadhania wrote:
> > All the rol/ror variants have a bug when shift == 0.
> > For example, rol32 would evaluate:
> > 
> >     return (word << 0) | (word >> 32);
> > 
> > Although this happens to work, a 32-bit shift by 32 is undefined
> > in C and clang's sanitizer flags it as a runtime error.
> > 
> > Suggested-by: Richard Henderson <rth@twiddle.net>
> > Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> > ---
> >  include/qemu/bitops.h | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> Reviewed-by: Richard Henderson <rth@twiddle.net>

This looks fine to me too, but I'm not sure if it should be going via
my tree or not.
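
For readers without the patch body at hand, the zero-shift problem and
one common way around it can be sketched as below.  This is an
illustration only, not the diff under review: rol32_safe is a made-up
name, and the masking idiom is an assumption about how such a fix is
typically written.

```c
#include <stdint.h>

/* Hypothetical rotate-left that never shifts by the full width.
 * Both shift counts are reduced modulo 32, so shift == 0 yields
 * (word << 0) | (word >> 0) instead of a shift by 32. */
static inline uint32_t rol32_safe(uint32_t word, unsigned int shift)
{
    return (word << (shift & 31)) | (word >> ((32 - shift) & 31));
}
```

With the count masked, a shift of 0 simply returns word unchanged.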


* Re: [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions
  2016-10-27  3:47   ` David Gibson
@ 2016-10-27  5:22     ` Richard Henderson
  2016-10-27  8:36       ` Nikunj A Dadhania
  2016-10-27 13:28       ` David Gibson
  0 siblings, 2 replies; 23+ messages in thread
From: Richard Henderson @ 2016-10-27  5:22 UTC (permalink / raw)
  To: David Gibson, Nikunj A Dadhania
  Cc: qemu-ppc, qemu-devel, bharata, sandipandas1990, ego, Ankit Kumar

On 10/26/2016 08:47 PM, David Gibson wrote:
>> > +void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
>> > +{
>> > +    int i;
>> > +    uint8_t s = 0;
>> > +    for (i = 0; i < 16; i++) {
>> > +        s ^= (b->u8[i] & 1);
>> > +    }
>> > +    r->u64[LO_IDX] = (!s) ? 0 : 1;
>> > +    r->u64[HI_IDX] = 0;
>> > +}
>> > +
> I think you can implement these better.  First mask with 0x01010101
> (of the appropriate length) to extract the LSB of each byte.
> Then XOR the two halves together, then the quarters and so forth,
> log2(size) times, to arrive at the parity.  This is similar to the
> usual Hamming weight implementation.
>

You don't even have to mask with 0x01010101 to start.  Just fold halves til you 
get to the byte level and then mask with 1.
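
A minimal sketch of the two variants for a single 32-bit element, to
make the difference concrete.  The function names are invented for
illustration and neither function is taken from the patch.

```c
#include <stdint.h>

/* Mask first, then fold: isolate bit 0 of each byte up front. */
static inline uint32_t parity_mask_first(uint32_t v)
{
    v &= 0x01010101u;   /* keep only the LSB of each byte */
    v ^= v >> 16;       /* fold the upper half onto the lower half */
    v ^= v >> 8;        /* fold the remaining two bytes */
    return v & 1;
}

/* Fold first, then mask once: the other bits carry garbage during
 * the folds, but bit 0 only ever receives bits 8, 16 and 24. */
static inline uint32_t parity_fold_first(uint32_t v)
{
    v ^= v >> 16;
    v ^= v >> 8;
    return v & 1;
}
```

Both return the XOR of the least significant bits of the four bytes;
the second simply saves the initial AND.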


r~

* Re: [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldnmi and vrlwmi instructions
  2016-10-27  3:38   ` David Gibson
@ 2016-10-27  8:33     ` Nikunj A Dadhania
  2016-10-28  1:30       ` David Gibson
  0 siblings, 1 reply; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-27  8:33 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, rth, qemu-devel, bharata, sandipandas1990, ego

David Gibson <david@gibson.dropbear.id.au> writes:
>> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
>> index dca4798..b54cd7c 100644
>> --- a/target-ppc/int_helper.c
>> +++ b/target-ppc/int_helper.c
>> @@ -1717,6 +1717,52 @@ void helper_vrsqrtefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
>>      }
>>  }
>>  
>> +#define MASK(size, max_val)                                     \
>> +static inline uint##size##_t mask_u##size(uint##size##_t start, \
>> +                                uint##size##_t end)             \
>> +{                                                               \
>> +    uint##size##_t ret, max_bit = size - 1;                     \
>> +                                                                \
>> +    if (likely(start == 0)) {                                   \
>> +        ret = max_val << (max_bit - end);                       \
>> +    } else if (likely(end == max_bit)) {                        \
>> +        ret = max_val >> start;                                 \
>> +    } else {                                                    \
>> +        ret = (((uint##size##_t)(-1ULL)) >> (start)) ^          \
>> +            (((uint##size##_t)(-1ULL) >> (end)) >> 1);          \
>> +        if (unlikely(start > end)) {                            \
>> +            return ~ret;                                        \
>> +        }                                                       \
>> +    }                                                           \
>> +                                                                \
>> +    return ret;                                                 \
>> +}
>> +
>> +MASK(32, UINT32_MAX);
>> +MASK(64, UINT64_MAX);
>
> It would be nicer to merge this mask generation with the
> implementation in target-ppc/translate.c (called MASK()).

How about something like this in target-ppc/cpu.h

#define FUNC_MASK(name, ret_type, size, max_val)                  \
static inline ret_type name (uint##size##_t start,                \
                             uint##size##_t end)                  \
{                                                                 \
    ret_type ret, max_bit = size - 1;                             \
                                                                  \
    if (likely(start == 0)) {                                     \
        ret = max_val << (max_bit - end);                         \
    } else if (likely(end == max_bit)) {                          \
        ret = max_val >> start;                                   \
    } else {                                                      \
        ret = (((uint##size##_t)(-1ULL)) >> (start)) ^            \
            (((uint##size##_t)(-1ULL) >> (end)) >> 1);            \
        if (unlikely(start > end)) {                              \
            return ~ret;                                          \
        }                                                         \
    }                                                             \
                                                                  \
    return ret;                                                   \
}

#if defined(TARGET_PPC64)
FUNC_MASK(MASK, target_ulong, 64, UINT64_MAX);
#else
FUNC_MASK(MASK, target_ulong, 32, UINT32_MAX);
#endif
FUNC_MASK(mask_u32, uint32_t, 32, UINT32_MAX);
FUNC_MASK(mask_u64, uint64_t, 64, UINT64_MAX);
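
To make the start/end convention concrete, here is the 32-bit instance
written out as a plain function, with likely()/unlikely() dropped so
that it compiles on its own.  It assumes the usual PowerPC numbering in
which bit 0 is the most significant bit; the sample values in the
comment are worked by hand from the code, not taken from the patch.

```c
#include <stdint.h>

/* Roughly what FUNC_MASK(mask_u32, uint32_t, 32, UINT32_MAX) would
 * expand to.  Bits start..end (inclusive, bit 0 = MSB) are set; when
 * start > end the mask wraps around.
 *
 *   mask_u32(0, 31)  -> 0xFFFFFFFF
 *   mask_u32(0, 0)   -> 0x80000000
 *   mask_u32(4, 7)   -> 0x0F000000
 *   mask_u32(28, 3)  -> 0xF000000F   (wrapped)
 */
static inline uint32_t mask_u32(uint32_t start, uint32_t end)
{
    uint32_t ret, max_bit = 32 - 1;

    if (start == 0) {
        ret = UINT32_MAX << (max_bit - end);
    } else if (end == max_bit) {
        ret = UINT32_MAX >> start;
    } else {
        ret = (UINT32_MAX >> start) ^ ((UINT32_MAX >> end) >> 1);
        if (start > end) {
            return ~ret;
        }
    }

    return ret;
}
```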

Regards
Nikunj

* Re: [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions
  2016-10-27  5:22     ` Richard Henderson
@ 2016-10-27  8:36       ` Nikunj A Dadhania
  2016-10-27 14:16         ` Richard Henderson
  2016-10-27 13:28       ` David Gibson
  1 sibling, 1 reply; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-27  8:36 UTC (permalink / raw)
  To: Richard Henderson, David Gibson
  Cc: qemu-ppc, qemu-devel, bharata, sandipandas1990, ego, Ankit Kumar

Richard Henderson <rth@twiddle.net> writes:

> On 10/26/2016 08:47 PM, David Gibson wrote:
>>> > +void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
>>> > +{
>>> > +    int i;
>>> > +    uint8_t s = 0;
>>> > +    for (i = 0; i < 16; i++) {
>>> > +        s ^= (b->u8[i] & 1);
>>> > +    }
>>> > +    r->u64[LO_IDX] = (!s) ? 0 : 1;
>>> > +    r->u64[HI_IDX] = 0;
>>> > +}
>>> > +
>> I think you can implement these better.  First mask with 0x01010101
>> (of the appropriate length) to extract the LSB of each byte.
>> Then XOR the two halves together, then the quarters and so forth,
>> log2(size) times, to arrive at the parity.  This is similar to the
>> usual Hamming weight implementation.
>>
>
> You don't even have to mask with 0x01010101 to start.  Just fold halves til you 
> get to the byte level and then mask with 1.

Right, it does reduce the number of operations:

+#define SIZE_MASK(x) ((1ULL << (x)) - 1)
+static uint64_t vparity(uint64_t f1, uint64_t f2, int size)
+{
+    uint64_t res = f1 ^ f2;
+    if (size == 8) return res;
+    return vparity(res & SIZE_MASK(size/2), res >> (size/2), size/2);
+}
+

Regards
Nikunj

* Re: [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions
  2016-10-27  5:22     ` Richard Henderson
  2016-10-27  8:36       ` Nikunj A Dadhania
@ 2016-10-27 13:28       ` David Gibson
  1 sibling, 0 replies; 23+ messages in thread
From: David Gibson @ 2016-10-27 13:28 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Nikunj A Dadhania, qemu-ppc, qemu-devel, bharata,
	sandipandas1990, ego, Ankit Kumar

On Wed, Oct 26, 2016 at 10:22:10PM -0700, Richard Henderson wrote:
> On 10/26/2016 08:47 PM, David Gibson wrote:
> > > > +void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
> > > > +{
> > > > +    int i;
> > > > +    uint8_t s = 0;
> > > > +    for (i = 0; i < 16; i++) {
> > > > +        s ^= (b->u8[i] & 1);
> > > > +    }
> > > > +    r->u64[LO_IDX] = (!s) ? 0 : 1;
> > > > +    r->u64[HI_IDX] = 0;
> > > > +}
> > > > +
> > I think you can implement these better.  First mask with 0x01010101
> > (of the appropriate length) to extract the LSB of each byte.
> > Then XOR the two halves together, then the quarters and so forth,
> > log2(size) times, to arrive at the parity.  This is similar to the
> > usual Hamming weight implementation.
> > 
> 
> You don't even have to mask with 0x01010101 to start.  Just fold halves til
> you get to the byte level and then mask with 1.

Good point.


* Re: [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions
  2016-10-27  8:36       ` Nikunj A Dadhania
@ 2016-10-27 14:16         ` Richard Henderson
  2016-10-28  1:34           ` David Gibson
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Henderson @ 2016-10-27 14:16 UTC (permalink / raw)
  To: Nikunj A Dadhania, David Gibson
  Cc: qemu-ppc, qemu-devel, bharata, sandipandas1990, ego, Ankit Kumar

On 10/27/2016 01:36 AM, Nikunj A Dadhania wrote:
> Right, it does reduce the number of operations:
>
> +#define SIZE_MASK(x) ((1ULL << (x)) - 1)
> +static uint64_t vparity(uint64_t f1, uint64_t f2, int size)
> +{
> +    uint64_t res = f1 ^ f2;
> +    if (size == 8) return res;
> +    return vparity(res & SIZE_MASK(size/2), res >> (size/2), size/2);
> +}

Why are you using recursion for something that should be 5 operations?  You're 
making this more complicated than it needs to be.

   uint64_t res = b->u64[0] ^ b->u64[1];
   res ^= res >> 32;
   res ^= res >> 16;
   res ^= res >> 8;
   r->u64[LO_IDX] = res & 1;
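
As a sanity check, the fold can be compared against a byte-by-byte
loop.  The sketch below is standalone and takes the two 64-bit halves
as plain arguments instead of a ppc_avr_t; the function names are
hypothetical.  Because only bit 0 of every byte matters, the result
does not depend on which half is passed first.

```c
#include <stdint.h>

/* Reference: XOR the LSB of each of the 16 bytes, one at a time. */
static uint64_t parity_loop_u128(uint64_t hi, uint64_t lo)
{
    uint64_t s = 0;
    for (int i = 0; i < 8; i++) {
        s ^= (hi >> (8 * i)) & 1;
        s ^= (lo >> (8 * i)) & 1;
    }
    return s;
}

/* Folded form: one XOR of the halves, three folds, one mask. */
static uint64_t parity_fold_u128(uint64_t hi, uint64_t lo)
{
    uint64_t res = hi ^ lo;
    res ^= res >> 32;
    res ^= res >> 16;
    res ^= res >> 8;
    return res & 1;
}
```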


r~

* Re: [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldnmi and vrlwmi instructions
  2016-10-27  8:33     ` Nikunj A Dadhania
@ 2016-10-28  1:30       ` David Gibson
  2016-10-28 16:28         ` Richard Henderson
  0 siblings, 1 reply; 23+ messages in thread
From: David Gibson @ 2016-10-28  1:30 UTC (permalink / raw)
  To: Nikunj A Dadhania
  Cc: qemu-ppc, rth, qemu-devel, bharata, sandipandas1990, ego

On Thu, Oct 27, 2016 at 02:03:01PM +0530, Nikunj A Dadhania wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
> >> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> >> index dca4798..b54cd7c 100644
> >> --- a/target-ppc/int_helper.c
> >> +++ b/target-ppc/int_helper.c
> >> @@ -1717,6 +1717,52 @@ void helper_vrsqrtefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
> >>      }
> >>  }
> >>  
> >> +#define MASK(size, max_val)                                     \
> >> +static inline uint##size##_t mask_u##size(uint##size##_t start, \
> >> +                                uint##size##_t end)             \
> >> +{                                                               \
> >> +    uint##size##_t ret, max_bit = size - 1;                     \
> >> +                                                                \
> >> +    if (likely(start == 0)) {                                   \
> >> +        ret = max_val << (max_bit - end);                       \
> >> +    } else if (likely(end == max_bit)) {                        \
> >> +        ret = max_val >> start;                                 \
> >> +    } else {                                                    \
> >> +        ret = (((uint##size##_t)(-1ULL)) >> (start)) ^          \
> >> +            (((uint##size##_t)(-1ULL) >> (end)) >> 1);          \
> >> +        if (unlikely(start > end)) {                            \
> >> +            return ~ret;                                        \
> >> +        }                                                       \
> >> +    }                                                           \
> >> +                                                                \
> >> +    return ret;                                                 \
> >> +}
> >> +
> >> +MASK(32, UINT32_MAX);
> >> +MASK(64, UINT64_MAX);
> >
> > It would be nicer to merge this mask generation with the
> > implementation in target-ppc/translate.c (called MASK()).
> 
> How about something like this in target-ppc/cpu.h
> 
> #define FUNC_MASK(name, ret_type, size, max_val)                  \
> static inline ret_type name (uint##size##_t start,                \
>                              uint##size##_t end)                  \
> {                                                                 \
>     ret_type ret, max_bit = size - 1;                             \
>                                                                   \
>     if (likely(start == 0)) {                                     \
>         ret = max_val << (max_bit - end);                         \
>     } else if (likely(end == max_bit)) {                          \
>         ret = max_val >> start;                                   \
>     } else {                                                      \
>         ret = (((uint##size##_t)(-1ULL)) >> (start)) ^            \
>             (((uint##size##_t)(-1ULL) >> (end)) >> 1);            \
>         if (unlikely(start > end)) {                              \
>             return ~ret;                                          \
>         }                                                         \
>     }                                                             \
>                                                                   \
>     return ret;                                                   \
> }
> 
> #if defined(TARGET_PPC64)
> FUNC_MASK(MASK, target_ulong, 64, UINT64_MAX);
> #else
> FUNC_MASK(MASK, target_ulong, 32, UINT32_MAX);
> #endif
> FUNC_MASK(mask_u32, uint32_t, 32, UINT32_MAX);
> FUNC_MASK(mask_u64, uint64_t, 64, UINT64_MAX);

That seems reasonable.


* Re: [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions
  2016-10-27 14:16         ` Richard Henderson
@ 2016-10-28  1:34           ` David Gibson
  0 siblings, 0 replies; 23+ messages in thread
From: David Gibson @ 2016-10-28  1:34 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Nikunj A Dadhania, qemu-ppc, qemu-devel, bharata,
	sandipandas1990, ego, Ankit Kumar

On Thu, Oct 27, 2016 at 07:16:24AM -0700, Richard Henderson wrote:
> On 10/27/2016 01:36 AM, Nikunj A Dadhania wrote:
> > Right, it does reduce the number of operations:
> > 
> > +#define SIZE_MASK(x) ((1ULL << (x)) - 1)
> > +static uint64_t vparity(uint64_t f1, uint64_t f2, int size)
> > +{
> > +    uint64_t res = f1 ^ f2;
> > +    if (size == 8) return res;
> > +    return vparity(res & SIZE_MASK(size/2), res >> (size/2), size/2);
> > +}
> 
> Why are you using recursion for something that should be 5 operations?
> You're making this more complicated than it needs to be.
> 
>   uint64_t res = b->u64[0] ^ b->u64[1];
>   res ^= res >> 32;
>   res ^= res >> 16;
>   res ^= res >> 8;
>   r->u64[LO_IDX] = res & 1;

We do need to implement it at multiple sizes, which makes it a bit
more complex.  But I wonder if it makes sense to do this without a
helper, something like

gen_vprty()
{
    ...
    gen(shift right 8)
    gen(xor)
    if (size > 2)
       gen(shift right 16)
       gen(xor)
    if (size > 4)
       gen(shift right 32)
       gen(xor)
    if (size > 8)
       gen(xor hi and low words)
    gen(mask result bits)
}
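
The same shape can be modelled in plain C on one 64-bit half of the
vector, which may help when checking the shift sequence.  This is not
TCG code: vprty_fold64 and vprty_fold128 are hypothetical names, size
is the element width in bytes, and only the word (4) and doubleword (8)
cases are handled by the 64-bit routine.

```c
#include <stdint.h>

/* Fold one 64-bit half.  For size == 4 both words are processed at
 * once and bit 0 of each word ends up holding that word's parity;
 * for size == 8 only bit 0 of the doubleword is meaningful. */
static uint64_t vprty_fold64(uint64_t v, int size)
{
    v ^= v >> 8;
    if (size > 2) {
        v ^= v >> 16;
    }
    if (size > 4) {
        v ^= v >> 32;
    }
    /* Mask result bits: one bit per element. */
    return v & (size == 8 ? 0x1ull : 0x0000000100000001ull);
}

/* Quadword case: XOR the high and low halves, then fold as a
 * doubleword. */
static uint64_t vprty_fold128(uint64_t hi, uint64_t lo)
{
    return vprty_fold64(hi ^ lo, 8);
}
```

The small shifts can come first because each fold only pulls bits
downwards, so a word's bit 0 is never contaminated by the neighbouring
element.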


* Re: [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldnmi and vrlwmi instructions
  2016-10-28  1:30       ` David Gibson
@ 2016-10-28 16:28         ` Richard Henderson
  0 siblings, 0 replies; 23+ messages in thread
From: Richard Henderson @ 2016-10-28 16:28 UTC (permalink / raw)
  To: David Gibson, Nikunj A Dadhania
  Cc: qemu-ppc, qemu-devel, bharata, sandipandas1990, ego

On 10/27/2016 06:30 PM, David Gibson wrote:
>> How about something like this in target-ppc/cpu.h
>>
>> #define FUNC_MASK(name, ret_type, size, max_val)                  \
>> static inline ret_type name (uint##size##_t start,                \
>>                              uint##size##_t end)                  \

Consider introducing an internals.h, for stuff that needs to be shared within 
target-ppc/, but is not required by any other user of cpu.h.


r~

* Re: [Qemu-devel] [PATCH v2 2/6] bitops: fix rol/ror when shift is zero
  2016-10-27  3:51     ` David Gibson
@ 2016-10-30  2:57       ` Nikunj A Dadhania
  0 siblings, 0 replies; 23+ messages in thread
From: Nikunj A Dadhania @ 2016-10-30  2:57 UTC (permalink / raw)
  To: David Gibson, Richard Henderson
  Cc: qemu-ppc, qemu-devel, bharata, sandipandas1990, ego

David Gibson <david@gibson.dropbear.id.au> writes:

> On Wed, Oct 26, 2016 at 08:20:10AM -0700, Richard Henderson wrote:
>> On 10/25/2016 11:26 PM, Nikunj A Dadhania wrote:
>> > All the rol/ror variants have a bug when shift == 0.
>> > For example, rol32 would evaluate:
>> > 
>> >     return (word << 0) | (word >> 32);
>> > 
>> > Although this happens to work, a 32-bit shift by 32 is undefined
>> > in C and clang's sanitizer flags it as a runtime error.
>> > 
>> > Suggested-by: Richard Henderson <rth@twiddle.net>
>> > Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>> > ---
>> >  include/qemu/bitops.h | 16 ++++++++--------
>> >  1 file changed, 8 insertions(+), 8 deletions(-)
>> 
>> Reviewed-by: Richard Henderson <rth@twiddle.net>
>
> This looks fine to me too, but I'm not sure if it should be going via
> my tree or not.

get_maintainer.pl does not help either.

Regards
Nikunj

Thread overview: 23+ messages
2016-10-26  6:26 [Qemu-devel] [PATCH v2 0/6] POWER9 TCG enablements - part7 Nikunj A Dadhania
2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 1/6] target-ppc: add xscmp[eq, gt, ge, ne]dp instructions Nikunj A Dadhania
2016-10-27  3:34   ` David Gibson
2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 2/6] bitops: fix rol/ror when shift is zero Nikunj A Dadhania
2016-10-26 15:20   ` Richard Henderson
2016-10-27  3:51     ` David Gibson
2016-10-30  2:57       ` Nikunj A Dadhania
2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 3/6] target-ppc: add vrldnmi and vrlwmi instructions Nikunj A Dadhania
2016-10-27  3:38   ` David Gibson
2016-10-27  8:33     ` Nikunj A Dadhania
2016-10-28  1:30       ` David Gibson
2016-10-28 16:28         ` Richard Henderson
2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 4/6] target-ppc: add vrldnm and vrlwnm instructions Nikunj A Dadhania
2016-10-27  3:39   ` David Gibson
2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 5/6] target-ppc: add vprtyb[w/d/q] instructions Nikunj A Dadhania
2016-10-27  3:47   ` David Gibson
2016-10-27  5:22     ` Richard Henderson
2016-10-27  8:36       ` Nikunj A Dadhania
2016-10-27 14:16         ` Richard Henderson
2016-10-28  1:34           ` David Gibson
2016-10-27 13:28       ` David Gibson
2016-10-26  6:26 ` [Qemu-devel] [PATCH v2 6/6] target-ppc: Add xvcmpnesp, xvcmpnedp instructions Nikunj A Dadhania
2016-10-27  3:50   ` David Gibson
