[Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements

* [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8
@ 2016-11-22 11:45 Nikunj A Dadhania
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 1/9] target-ppc: Consolidate instruction decode helpers Nikunj A Dadhania
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:45 UTC
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

This series adds 18 new instructions for POWER9 (ISA 3.0):
    Vector Extract Left/Right Indexed
    VSX Scalar Compare Exponents
    VSX Scalar Compare Quad-Precision
    Load/Store VSX Vector
    Load/Store VSX Scalar

Patches
=======
01-02: Consolidation/Fixes
   03: 
      xscmpexpdp: VSX Scalar Compare Exponents Double-Precision
      xscmpexpqp: VSX Scalar Compare Exponents Quad-Precision
   04:
      xscmpoqp: VSX Scalar Compare Ordered Quad-Precision
      xscmpuqp: VSX Scalar Compare Unordered Quad-Precision
   05:
      lxsd:  Load VSX Scalar Dword
      lxssp: Load VSX Scalar Single Precision
   06:
      stxsd:  Store VSX Scalar Dword
      stxssp: Store VSX Scalar Single Precision
   07:
      lxv:   Load VSX Vector
      lxvx:  Load VSX Vector Indexed
      stxv:  Store VSX Vector
      stxvx: Store VSX Vector Indexed
   08: 
      vextublx: Vector Extract Unsigned Byte Left-Indexed
      vextuhlx: Vector Extract Unsigned Halfword Left-Indexed
      vextuwlx: Vector Extract Unsigned Word Left-Indexed
   09:
      vextubrx: Vector Extract Unsigned Byte Right-Indexed
      vextuhrx: Vector Extract Unsigned Halfword Right-Indexed
      vextuwrx: Vector Extract Unsigned Word Right-Indexed

Avinesh Kumar (1):
  target-ppc: add vextu[bhw]lx instructions

Bharata B Rao (4):
  target-ppc: Consolidate instruction decode helpers
  target-ppc: Fix xscmpodp and xscmpudp instructions
  target-ppc: Add xscmpexp[dp,qp] instructions
  target-ppc: Add xscmpoqp and xscmpuqp instructions

Hariharan T.S (1):
  target-ppc: add vextu[bhw]rx instructions

Nikunj A Dadhania (3):
  target-ppc: implement lxsd and lxssp instructions
  target-ppc: implement stxsd and stxssp
  target-ppc: implement lxv/lxvx and stxv/stxvx

 target-ppc/fpu_helper.c             | 168 +++++++++++++++++++++++-----
 target-ppc/helper.h                 |  10 ++
 target-ppc/int_helper.c             | 123 +++++++++++++++++++++
 target-ppc/internal.h               | 152 ++++++++++++++++++++++++++
 target-ppc/translate.c              | 211 ++++++++++--------------------------
 target-ppc/translate/fp-ops.inc.c   |   2 -
 target-ppc/translate/vmx-impl.inc.c |  23 ++++
 target-ppc/translate/vmx-ops.inc.c  |   8 +-
 target-ppc/translate/vsx-impl.inc.c |  96 ++++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  10 ++
 10 files changed, 622 insertions(+), 181 deletions(-)

-- 
2.7.4

* [Qemu-devel] [PATCH 1/9] target-ppc: Consolidate instruction decode helpers
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
@ 2016-11-22 11:45 ` Nikunj A Dadhania
  2016-11-23  3:56   ` David Gibson
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 2/9] target-ppc: Fix xscmpodp and xscmpudp instructions Nikunj A Dadhania
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:45 UTC
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

Move the instruction decode helpers to target-ppc/internal.h so that
some of them can be used outside of translate.c. The move also removes
some duplicate helpers from target-ppc/fpu_helper.c.

Suggested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
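
For readers unfamiliar with the decode helpers, here is a minimal
standalone sketch of what EXTRACT_HELPER and EXTRACT_HELPER_SPLIT
compute. The function names (extract_field, extract_split, decode_*) are
hypothetical and exist only for illustration; the field positions are
copied from the xT, xA, xB and BF definitions in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for EXTRACT_HELPER: return the nb-bit field
 * whose least significant bit sits at position `shift`. */
static inline uint32_t extract_field(uint32_t opcode, unsigned shift,
                                     unsigned nb)
{
    assert(nb > 0 && nb < 32 && shift + nb <= 32);
    return (opcode >> shift) & ((1u << nb) - 1);
}

/* Hypothetical stand-in for EXTRACT_HELPER_SPLIT: concatenate two
 * fields, with the first one supplying the high bits of the result. */
static inline uint32_t extract_split(uint32_t opcode,
                                     unsigned shift1, unsigned nb1,
                                     unsigned shift2, unsigned nb2)
{
    return (extract_field(opcode, shift1, nb1) << nb2) |
           extract_field(opcode, shift2, nb2);
}

/* Same field positions as xT, xA, xB and BF in the patch. */
static inline uint32_t decode_xT(uint32_t opcode)
{
    return extract_split(opcode, 0, 1, 21, 5);
}

static inline uint32_t decode_xA(uint32_t opcode)
{
    return extract_split(opcode, 2, 1, 16, 5);
}

static inline uint32_t decode_xB(uint32_t opcode)
{
    return extract_split(opcode, 1, 1, 11, 5);
}

static inline uint32_t decode_BF(uint32_t opcode)
{
    return extract_field(opcode, 23, 3);
}
```

For example, an opcode with T = 3 and TX = 1 decodes to VSR 35 through
decode_xT(), because the single TX bit supplies the high bit of the
6-bit VSX register number.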
 target-ppc/fpu_helper.c |  11 +---
 target-ppc/internal.h   | 151 ++++++++++++++++++++++++++++++++++++++++++++++++
 target-ppc/translate.c  | 151 ------------------------------------------------
 3 files changed, 152 insertions(+), 161 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 8a389e1..d3741b4 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -20,6 +20,7 @@
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "exec/exec-all.h"
+#include "internal.h"
 
 #define float64_snan_to_qnan(x) ((x) | 0x0008000000000000ULL)
 #define float32_snan_to_qnan(x) ((x) | 0x00400000)
@@ -1776,16 +1777,6 @@ uint32_t helper_efdcmpeq(CPUPPCState *env, uint64_t op1, uint64_t op2)
     return helper_efdtsteq(env, op1, op2);
 }
 
-#define DECODE_SPLIT(opcode, shift1, nb1, shift2, nb2) \
-    (((((opcode) >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |    \
-     (((opcode) >> (shift2)) & ((1 << (nb2)) - 1)))
-
-#define xT(opcode) DECODE_SPLIT(opcode, 0, 1, 21, 5)
-#define xA(opcode) DECODE_SPLIT(opcode, 2, 1, 16, 5)
-#define xB(opcode) DECODE_SPLIT(opcode, 1, 1, 11, 5)
-#define xC(opcode) DECODE_SPLIT(opcode, 3, 1,  6, 5)
-#define BF(opcode) (((opcode) >> (31-8)) & 7)
-
 typedef union _ppc_vsr_t {
     uint64_t u64[2];
     uint32_t u32[4];
diff --git a/target-ppc/internal.h b/target-ppc/internal.h
index 1ff4896..9a4a74a 100644
--- a/target-ppc/internal.h
+++ b/target-ppc/internal.h
@@ -47,4 +47,155 @@ FUNC_MASK(MASK, target_ulong, 32, UINT32_MAX);
 FUNC_MASK(mask_u32, uint32_t, 32, UINT32_MAX);
 FUNC_MASK(mask_u64, uint64_t, 64, UINT64_MAX);
 
+/*****************************************************************************/
+/***                           Instruction decoding                        ***/
+#define EXTRACT_HELPER(name, shift, nb)                                       \
+static inline uint32_t name(uint32_t opcode)                                  \
+{                                                                             \
+    return (opcode >> (shift)) & ((1 << (nb)) - 1);                           \
+}
+
+#define EXTRACT_SHELPER(name, shift, nb)                                      \
+static inline int32_t name(uint32_t opcode)                                   \
+{                                                                             \
+    return (int16_t)((opcode >> (shift)) & ((1 << (nb)) - 1));                \
+}
+
+#define EXTRACT_HELPER_SPLIT(name, shift1, nb1, shift2, nb2)                  \
+static inline uint32_t name(uint32_t opcode)                                  \
+{                                                                             \
+    return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |             \
+            ((opcode >> (shift2)) & ((1 << (nb2)) - 1));                      \
+}
+
+#define EXTRACT_HELPER_DXFORM(name,                                           \
+                              d0_bits, shift_op_d0, shift_d0,                 \
+                              d1_bits, shift_op_d1, shift_d1,                 \
+                              d2_bits, shift_op_d2, shift_d2)                 \
+static inline int16_t name(uint32_t opcode)                                   \
+{                                                                             \
+    return                                                                    \
+        (((opcode >> (shift_op_d0)) & ((1 << (d0_bits)) - 1)) << (shift_d0)) | \
+        (((opcode >> (shift_op_d1)) & ((1 << (d1_bits)) - 1)) << (shift_d1)) | \
+        (((opcode >> (shift_op_d2)) & ((1 << (d2_bits)) - 1)) << (shift_d2));  \
+}
+
+
+/* Opcode part 1 */
+EXTRACT_HELPER(opc1, 26, 6);
+/* Opcode part 2 */
+EXTRACT_HELPER(opc2, 1, 5);
+/* Opcode part 3 */
+EXTRACT_HELPER(opc3, 6, 5);
+/* Opcode part 4 */
+EXTRACT_HELPER(opc4, 16, 5);
+/* Update Cr0 flags */
+EXTRACT_HELPER(Rc, 0, 1);
+/* Update Cr6 flags (Altivec) */
+EXTRACT_HELPER(Rc21, 10, 1);
+/* Destination */
+EXTRACT_HELPER(rD, 21, 5);
+/* Source */
+EXTRACT_HELPER(rS, 21, 5);
+/* First operand */
+EXTRACT_HELPER(rA, 16, 5);
+/* Second operand */
+EXTRACT_HELPER(rB, 11, 5);
+/* Third operand */
+EXTRACT_HELPER(rC, 6, 5);
+/***                               Get CRn                                 ***/
+EXTRACT_HELPER(crfD, 23, 3);
+EXTRACT_HELPER(BF, 23, 3);
+EXTRACT_HELPER(crfS, 18, 3);
+EXTRACT_HELPER(crbD, 21, 5);
+EXTRACT_HELPER(crbA, 16, 5);
+EXTRACT_HELPER(crbB, 11, 5);
+/* SPR / TBL */
+EXTRACT_HELPER(_SPR, 11, 10);
+static inline uint32_t SPR(uint32_t opcode)
+{
+    uint32_t sprn = _SPR(opcode);
+
+    return ((sprn >> 5) & 0x1F) | ((sprn & 0x1F) << 5);
+}
+/***                              Get constants                            ***/
+/* 16 bits signed immediate value */
+EXTRACT_SHELPER(SIMM, 0, 16);
+/* 16 bits unsigned immediate value */
+EXTRACT_HELPER(UIMM, 0, 16);
+/* 5 bits signed immediate value */
+EXTRACT_HELPER(SIMM5, 16, 5);
+/* 5 bits unsigned immediate value */
+EXTRACT_HELPER(UIMM5, 16, 5);
+/* 4 bits unsigned immediate value */
+EXTRACT_HELPER(UIMM4, 16, 4);
+/* Bit count */
+EXTRACT_HELPER(NB, 11, 5);
+/* Shift count */
+EXTRACT_HELPER(SH, 11, 5);
+/* Vector shift count */
+EXTRACT_HELPER(VSH, 6, 4);
+/* Mask start */
+EXTRACT_HELPER(MB, 6, 5);
+/* Mask end */
+EXTRACT_HELPER(ME, 1, 5);
+/* Trap operand */
+EXTRACT_HELPER(TO, 21, 5);
+
+EXTRACT_HELPER(CRM, 12, 8);
+
+#ifndef CONFIG_USER_ONLY
+EXTRACT_HELPER(SR, 16, 4);
+#endif
+
+/* mtfsf/mtfsfi */
+EXTRACT_HELPER(FPBF, 23, 3);
+EXTRACT_HELPER(FPIMM, 12, 4);
+EXTRACT_HELPER(FPL, 25, 1);
+EXTRACT_HELPER(FPFLM, 17, 8);
+EXTRACT_HELPER(FPW, 16, 1);
+
+/* addpcis */
+EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
+#if defined(TARGET_PPC64)
+/* darn */
+EXTRACT_HELPER(L, 16, 2);
+#endif
+
+/***                            Jump target decoding                       ***/
+/* Immediate address */
+static inline target_ulong LI(uint32_t opcode)
+{
+    return (opcode >> 0) & 0x03FFFFFC;
+}
+
+static inline uint32_t BD(uint32_t opcode)
+{
+    return (opcode >> 0) & 0xFFFC;
+}
+
+EXTRACT_HELPER(BO, 21, 5);
+EXTRACT_HELPER(BI, 16, 5);
+/* Absolute/relative address */
+EXTRACT_HELPER(AA, 1, 1);
+/* Link */
+EXTRACT_HELPER(LK, 0, 1);
+
+/* DFP Z22-form */
+EXTRACT_HELPER(DCM, 10, 6)
+
+/* DFP Z23-form */
+EXTRACT_HELPER(RMC, 9, 2)
+
+EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5);
+EXTRACT_HELPER_SPLIT(xS, 0, 1, 21, 5);
+EXTRACT_HELPER_SPLIT(xA, 2, 1, 16, 5);
+EXTRACT_HELPER_SPLIT(xB, 1, 1, 11, 5);
+EXTRACT_HELPER_SPLIT(xC, 3, 1,  6, 5);
+EXTRACT_HELPER(DM, 8, 2);
+EXTRACT_HELPER(UIM, 16, 2);
+EXTRACT_HELPER(SHW, 8, 2);
+EXTRACT_HELPER(SP, 19, 2);
+EXTRACT_HELPER(IMM8, 11, 8);
+
 #endif /* PPC_INTERNAL_H */
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 59e9552..6bdc433 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -422,157 +422,6 @@ typedef struct opcode_t {
 
 #define CHK_NONE
 
-
-/*****************************************************************************/
-/***                           Instruction decoding                        ***/
-#define EXTRACT_HELPER(name, shift, nb)                                       \
-static inline uint32_t name(uint32_t opcode)                                  \
-{                                                                             \
-    return (opcode >> (shift)) & ((1 << (nb)) - 1);                           \
-}
-
-#define EXTRACT_SHELPER(name, shift, nb)                                      \
-static inline int32_t name(uint32_t opcode)                                   \
-{                                                                             \
-    return (int16_t)((opcode >> (shift)) & ((1 << (nb)) - 1));                \
-}
-
-#define EXTRACT_HELPER_SPLIT(name, shift1, nb1, shift2, nb2)                  \
-static inline uint32_t name(uint32_t opcode)                                  \
-{                                                                             \
-    return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |             \
-            ((opcode >> (shift2)) & ((1 << (nb2)) - 1));                      \
-}
-
-#define EXTRACT_HELPER_DXFORM(name,                                           \
-                              d0_bits, shift_op_d0, shift_d0,                 \
-                              d1_bits, shift_op_d1, shift_d1,                 \
-                              d2_bits, shift_op_d2, shift_d2)                 \
-static inline int16_t name(uint32_t opcode)                                   \
-{                                                                             \
-    return                                                                    \
-        (((opcode >> (shift_op_d0)) & ((1 << (d0_bits)) - 1)) << (shift_d0)) | \
-        (((opcode >> (shift_op_d1)) & ((1 << (d1_bits)) - 1)) << (shift_d1)) | \
-        (((opcode >> (shift_op_d2)) & ((1 << (d2_bits)) - 1)) << (shift_d2));  \
-}
-
-
-/* Opcode part 1 */
-EXTRACT_HELPER(opc1, 26, 6);
-/* Opcode part 2 */
-EXTRACT_HELPER(opc2, 1, 5);
-/* Opcode part 3 */
-EXTRACT_HELPER(opc3, 6, 5);
-/* Opcode part 4 */
-EXTRACT_HELPER(opc4, 16, 5);
-/* Update Cr0 flags */
-EXTRACT_HELPER(Rc, 0, 1);
-/* Update Cr6 flags (Altivec) */
-EXTRACT_HELPER(Rc21, 10, 1);
-/* Destination */
-EXTRACT_HELPER(rD, 21, 5);
-/* Source */
-EXTRACT_HELPER(rS, 21, 5);
-/* First operand */
-EXTRACT_HELPER(rA, 16, 5);
-/* Second operand */
-EXTRACT_HELPER(rB, 11, 5);
-/* Third operand */
-EXTRACT_HELPER(rC, 6, 5);
-/***                               Get CRn                                 ***/
-EXTRACT_HELPER(crfD, 23, 3);
-EXTRACT_HELPER(crfS, 18, 3);
-EXTRACT_HELPER(crbD, 21, 5);
-EXTRACT_HELPER(crbA, 16, 5);
-EXTRACT_HELPER(crbB, 11, 5);
-/* SPR / TBL */
-EXTRACT_HELPER(_SPR, 11, 10);
-static inline uint32_t SPR(uint32_t opcode)
-{
-    uint32_t sprn = _SPR(opcode);
-
-    return ((sprn >> 5) & 0x1F) | ((sprn & 0x1F) << 5);
-}
-/***                              Get constants                            ***/
-/* 16 bits signed immediate value */
-EXTRACT_SHELPER(SIMM, 0, 16);
-/* 16 bits unsigned immediate value */
-EXTRACT_HELPER(UIMM, 0, 16);
-/* 5 bits signed immediate value */
-EXTRACT_HELPER(SIMM5, 16, 5);
-/* 5 bits signed immediate value */
-EXTRACT_HELPER(UIMM5, 16, 5);
-/* 4 bits unsigned immediate value */
-EXTRACT_HELPER(UIMM4, 16, 4);
-/* Bit count */
-EXTRACT_HELPER(NB, 11, 5);
-/* Shift count */
-EXTRACT_HELPER(SH, 11, 5);
-/* Vector shift count */
-EXTRACT_HELPER(VSH, 6, 4);
-/* Mask start */
-EXTRACT_HELPER(MB, 6, 5);
-/* Mask end */
-EXTRACT_HELPER(ME, 1, 5);
-/* Trap operand */
-EXTRACT_HELPER(TO, 21, 5);
-
-EXTRACT_HELPER(CRM, 12, 8);
-
-#ifndef CONFIG_USER_ONLY
-EXTRACT_HELPER(SR, 16, 4);
-#endif
-
-/* mtfsf/mtfsfi */
-EXTRACT_HELPER(FPBF, 23, 3);
-EXTRACT_HELPER(FPIMM, 12, 4);
-EXTRACT_HELPER(FPL, 25, 1);
-EXTRACT_HELPER(FPFLM, 17, 8);
-EXTRACT_HELPER(FPW, 16, 1);
-
-/* addpcis */
-EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
-#if defined(TARGET_PPC64)
-/* darn */
-EXTRACT_HELPER(L, 16, 2);
-#endif
-
-/***                            Jump target decoding                       ***/
-/* Immediate address */
-static inline target_ulong LI(uint32_t opcode)
-{
-    return (opcode >> 0) & 0x03FFFFFC;
-}
-
-static inline uint32_t BD(uint32_t opcode)
-{
-    return (opcode >> 0) & 0xFFFC;
-}
-
-EXTRACT_HELPER(BO, 21, 5);
-EXTRACT_HELPER(BI, 16, 5);
-/* Absolute/relative address */
-EXTRACT_HELPER(AA, 1, 1);
-/* Link */
-EXTRACT_HELPER(LK, 0, 1);
-
-/* DFP Z22-form */
-EXTRACT_HELPER(DCM, 10, 6)
-
-/* DFP Z23-form */
-EXTRACT_HELPER(RMC, 9, 2)
-
-EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5);
-EXTRACT_HELPER_SPLIT(xS, 0, 1, 21, 5);
-EXTRACT_HELPER_SPLIT(xA, 2, 1, 16, 5);
-EXTRACT_HELPER_SPLIT(xB, 1, 1, 11, 5);
-EXTRACT_HELPER_SPLIT(xC, 3, 1,  6, 5);
-EXTRACT_HELPER(DM, 8, 2);
-EXTRACT_HELPER(UIM, 16, 2);
-EXTRACT_HELPER(SHW, 8, 2);
-EXTRACT_HELPER(SP, 19, 2);
-EXTRACT_HELPER(IMM8, 11, 8);
-
 /*****************************************************************************/
 /* PowerPC instructions table                                                */
 
-- 
2.7.4

* [Qemu-devel] [PATCH 2/9] target-ppc: Fix xscmpodp and xscmpudp instructions
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 1/9] target-ppc: Consolidate instruction decode helpers Nikunj A Dadhania
@ 2016-11-22 11:45 ` Nikunj A Dadhania
  2016-11-23  4:01   ` David Gibson
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 3/9] target-ppc: Add xscmpexp[dp, qp] instructions Nikunj A Dadhania
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:45 UTC
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

- xscmpodp and xscmpudp do not reset the floating-point status flags.
- In xscmpodp, VXVC should be set for the signalling NaN case only if VE
  is 0, and for the quiet NaN case by explicitly checking for a quiet NaN.
- The comparison is done only if the operands are not NaNs. However, as
  per the ISA, it should be done even when the operands are NaNs.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
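
The exception-flag rules described above can be modelled outside QEMU
roughly as follows. This is only a sketch of the flag decisions: it does
not compute the condition code. The struct and function names are
hypothetical, operands are passed as raw binary64 bit patterns, and the
quiet bit is assumed to be the most significant fraction bit, matching
the float64_snan_to_qnan() mask in fpu_helper.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; not a QEMU structure. */
struct cmp_flags {
    bool vxsnan;   /* invalid operation: signalling NaN operand */
    bool vxvc;     /* invalid operation: invalid compare */
};

/* A binary64 NaN has an all-ones exponent and a nonzero fraction. */
static bool is_nan_bits(uint64_t v)
{
    return ((v >> 52) & 0x7FF) == 0x7FF &&
           (v & 0x000FFFFFFFFFFFFFULL) != 0;
}

/* Assumption: the top fraction bit set means the NaN is quiet. */
static bool is_quiet_nan_bits(uint64_t v)
{
    return is_nan_bits(v) && (v & 0x0008000000000000ULL) != 0;
}

static bool is_signaling_nan_bits(uint64_t v)
{
    return is_nan_bits(v) && !is_quiet_nan_bits(v);
}

/* ve models FPSCR[VE]; ordered is true for xscmpodp, false for xscmpudp. */
static struct cmp_flags compare_flags(uint64_t a, uint64_t b,
                                      bool ve, bool ordered)
{
    struct cmp_flags f = { false, false };

    if (is_signaling_nan_bits(a) || is_signaling_nan_bits(b)) {
        f.vxsnan = true;
        /* Ordered compare raises VXVC as well, but only when VE is 0. */
        f.vxvc = ordered && !ve;
    } else if (ordered &&
               (is_quiet_nan_bits(a) || is_quiet_nan_bits(b))) {
        f.vxvc = true;
    }
    return f;
}
```

With a signalling NaN operand and VE = 1, an ordered compare in this
model reports only VXSNAN; with VE = 0 it reports both VXSNAN and VXVC.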
 target-ppc/fpu_helper.c | 41 +++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index d3741b4..3027003 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2410,29 +2410,38 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
 {                                                                        \
     ppc_vsr_t xa, xb;                                                    \
     uint32_t cc = 0;                                                     \
+    bool vxsnan_flag = false, vxvc_flag = false;                         \
                                                                          \
+    helper_reset_fpstatus(env);                                          \
     getVSR(xA(opcode), &xa, env);                                        \
     getVSR(xB(opcode), &xb, env);                                        \
                                                                          \
-    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||                       \
-                 float64_is_any_nan(xb.VsrD(0)))) {                      \
-        if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||     \
-            float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {     \
-            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);       \
-        }                                                                \
-        if (ordered) {                                                   \
-            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);         \
+    if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||         \
+        float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {         \
+        vxsnan_flag = true;                                              \
+        cc = 1;                                                          \
+        if (fpscr_ve == 0 && ordered) {                                  \
+            vxvc_flag = true;                                            \
         }                                                                \
+    } else if ((float64_is_quiet_nan(xa.VsrD(0), &env->fp_status) ||     \
+                float64_is_quiet_nan(xb.VsrD(0), &env->fp_status))       \
+               && ordered) {                                             \
         cc = 1;                                                          \
+        vxvc_flag = true;                                                \
+    }                                                                    \
+    if (vxsnan_flag) {                                                   \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);           \
+    }                                                                    \
+    if (vxvc_flag) {                                                     \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);             \
+    }                                                                    \
+                                                                         \
+    if (float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {           \
+        cc |= 8;                                                         \
+    } else if (!float64_le(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {   \
+        cc |= 4;                                                         \
     } else {                                                             \
-        if (float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {       \
-            cc = 8;                                                      \
-        } else if (!float64_le(xa.VsrD(0), xb.VsrD(0),                   \
-                               &env->fp_status)) { \
-            cc = 4;                                                      \
-        } else {                                                         \
-            cc = 2;                                                      \
-        }                                                                \
+        cc |= 2;                                                         \
     }                                                                    \
                                                                          \
     env->fpscr &= ~(0x0F << FPSCR_FPRF);                                 \
-- 
2.7.4

* [Qemu-devel] [PATCH 3/9] target-ppc: Add xscmpexp[dp, qp] instructions
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 1/9] target-ppc: Consolidate instruction decode helpers Nikunj A Dadhania
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 2/9] target-ppc: Fix xscmpodp and xscmpudp instructions Nikunj A Dadhania
@ 2016-11-22 11:45 ` Nikunj A Dadhania
  2016-11-23  4:06   ` David Gibson
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 4/9] target-ppc: Add xscmpoqp and xscmpuqp instructions Nikunj A Dadhania
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:45 UTC
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

xscmpexpdp: VSX Scalar Compare Exponents Double-Precision
xscmpexpqp: VSX Scalar Compare Exponents Quad-Precision

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
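
As an illustration of the exponent comparison, the sketch below
reproduces the field extraction and condition-code selection on raw bit
patterns. All function names are hypothetical. The bit positions mirror
the extract64() calls in the helpers (bits 52..62 for double precision,
bits 48..62 of the high doubleword for quad precision), and the
condition-code values 8, 4, 2 and 1 are the ones used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Biased exponent of a binary64 bit pattern. */
static uint32_t dp_exponent(uint64_t v)
{
    return (v >> 52) & 0x7FF;
}

/* Biased exponent of a binary128 value, read from its high doubleword. */
static uint32_t qp_exponent(uint64_t hi)
{
    return (hi >> 48) & 0x7FFF;
}

static bool dp_is_nan(uint64_t v)
{
    return dp_exponent(v) == 0x7FF && (v & 0x000FFFFFFFFFFFFFULL) != 0;
}

static bool qp_is_nan(uint64_t hi, uint64_t lo)
{
    return qp_exponent(hi) == 0x7FFF &&
           ((hi & 0x0000FFFFFFFFFFFFULL) | lo) != 0;
}

/* 8 = less than, 4 = greater than, 2 = equal, 1 = unordered (NaN). */
static uint32_t exponent_cc(uint32_t exp_a, uint32_t exp_b, bool any_nan)
{
    if (any_nan) {
        return 1;
    }
    if (exp_a < exp_b) {
        return 8;
    }
    if (exp_a > exp_b) {
        return 4;
    }
    return 2;
}

static uint32_t cmpexp_dp(uint64_t a, uint64_t b)
{
    return exponent_cc(dp_exponent(a), dp_exponent(b),
                       dp_is_nan(a) || dp_is_nan(b));
}

static uint32_t cmpexp_qp(uint64_t hi_a, uint64_t lo_a,
                          uint64_t hi_b, uint64_t lo_b)
{
    return exponent_cc(qp_exponent(hi_a), qp_exponent(hi_b),
                       qp_is_nan(hi_a, lo_a) || qp_is_nan(hi_b, lo_b));
}
```

Because only the exponent field is compared, 1.0 and -1.5 compare as
equal in this model: the sign and fraction bits are ignored unless an
operand is a NaN.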
 target-ppc/fpu_helper.c             | 64 +++++++++++++++++++++++++++++++++++++
 target-ppc/helper.h                 |  2 ++
 target-ppc/translate/vsx-impl.inc.c |  2 ++
 target-ppc/translate/vsx-ops.inc.c  |  6 ++++
 4 files changed, 74 insertions(+)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 3027003..b1c5a07 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2405,6 +2405,70 @@ VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
 VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
 VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
 
+void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode)
+{
+    ppc_vsr_t xa, xb;
+    int64_t exp_a, exp_b;
+    uint32_t cc;
+
+    getVSR(xA(opcode), &xa, env);
+    getVSR(xB(opcode), &xb, env);
+
+    exp_a = extract64(xa.VsrD(0), 52, 11);
+    exp_b = extract64(xb.VsrD(0), 52, 11);
+
+    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||
+                 float64_is_any_nan(xb.VsrD(0)))) {
+        cc = 1;
+    } else {
+        if (exp_a < exp_b) {
+            cc = 8;
+        } else if (exp_a > exp_b) {
+            cc = 4;
+        } else {
+            cc = 2;
+        }
+    }
+
+    env->fpscr &= ~(0x0F << FPSCR_FPRF);
+    env->fpscr |= cc << FPSCR_FPRF;
+    env->crf[BF(opcode)] = cc;
+
+    helper_float_check_status(env);
+}
+
+void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode)
+{
+    ppc_vsr_t xa, xb;
+    int64_t exp_a, exp_b;
+    uint32_t cc;
+
+    getVSR(rA(opcode) + 32, &xa, env);
+    getVSR(rB(opcode) + 32, &xb, env);
+
+    exp_a = extract64(xa.VsrD(0), 48, 15);
+    exp_b = extract64(xb.VsrD(0), 48, 15);
+
+    if (unlikely(float128_is_any_nan(make_float128(xa.VsrD(0), xa.VsrD(1))) ||
+                 float128_is_any_nan(make_float128(xb.VsrD(0), xb.VsrD(1))))) {
+        cc = 1;
+    } else {
+        if (exp_a < exp_b) {
+            cc = 8;
+        } else if (exp_a > exp_b) {
+            cc = 4;
+        } else {
+            cc = 2;
+        }
+    }
+
+    env->fpscr &= ~(0x0F << FPSCR_FPRF);
+    env->fpscr |= cc << FPSCR_FPRF;
+    env->crf[BF(opcode)] = cc;
+
+    helper_float_check_status(env);
+}
+
 #define VSX_SCALAR_CMP(op, ordered)                                      \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
 {                                                                        \
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index da00f0a..ba42015 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -404,6 +404,8 @@ DEF_HELPER_2(xscmpeqdp, void, env, i32)
 DEF_HELPER_2(xscmpgtdp, void, env, i32)
 DEF_HELPER_2(xscmpgedp, void, env, i32)
 DEF_HELPER_2(xscmpnedp, void, env, i32)
+DEF_HELPER_2(xscmpexpdp, void, env, i32)
+DEF_HELPER_2(xscmpexpqp, void, env, i32)
 DEF_HELPER_2(xscmpodp, void, env, i32)
 DEF_HELPER_2(xscmpudp, void, env, i32)
 DEF_HELPER_2(xsmaxdp, void, env, i32)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 5a27be4..5206258 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -624,6 +624,8 @@ GEN_VSX_HELPER_2(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
+GEN_VSX_HELPER_2(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
+GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 3d91041..2468ee9 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -83,6 +83,10 @@ GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x01, opc3|0x0C, 0, PPC_NONE, PPC2_VSX),\
 GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x02, opc3|0x0C, 0, PPC_NONE, PPC2_VSX),\
 GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x03, opc3|0x0C, 0, PPC_NONE, PPC2_VSX)
 
+#define GEN_VSX_XFORM_300(name, opc2, opc3, inval) \
+GEN_HANDLER_E(name, 0x3F, opc2, opc3, inval, PPC_NONE, PPC2_ISA300)
+
+
 GEN_XX2FORM(xsabsdp, 0x12, 0x15, PPC2_VSX),
 GEN_XX2FORM(xsnabsdp, 0x12, 0x16, PPC2_VSX),
 GEN_XX2FORM(xsnegdp, 0x12, 0x17, PPC2_VSX),
@@ -118,6 +122,8 @@ GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
 GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
 GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
 GEN_XX3FORM(xscmpnedp, 0x0C, 0x03, PPC2_ISA300),
+GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
+GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
 GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
 GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
 GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
-- 
2.7.4

* [Qemu-devel] [PATCH 4/9] target-ppc: Add xscmpoqp and xscmpuqp instructions
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (2 preceding siblings ...)
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 3/9] target-ppc: Add xscmpexp[dp, qp] instructions Nikunj A Dadhania
@ 2016-11-22 11:46 ` Nikunj A Dadhania
  2016-11-23  4:06   ` David Gibson
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 5/9] target-ppc: implement lxsd and lxssp instructions Nikunj A Dadhania
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:46 UTC
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

xscmpoqp - VSX Scalar Compare Ordered Quad-Precision
xscmpuqp - VSX Scalar Compare Unordered Quad-Precision

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/fpu_helper.c             | 52 +++++++++++++++++++++++++++++++++++++
 target-ppc/helper.h                 |  2 ++
 target-ppc/translate/vsx-impl.inc.c |  2 ++
 target-ppc/translate/vsx-ops.inc.c  |  2 ++
 4 files changed, 58 insertions(+)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index b1c5a07..28c1fea 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2518,6 +2518,58 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
 VSX_SCALAR_CMP(xscmpodp, 1)
 VSX_SCALAR_CMP(xscmpudp, 0)
 
+#define VSX_SCALAR_CMPQ(op, ordered)                                    \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+{                                                                       \
+    ppc_vsr_t xa, xb;                                                   \
+    uint32_t cc = 0;                                                    \
+    bool vxsnan_flag = false, vxvc_flag = false;                        \
+    float128 a, b;                                                      \
+                                                                        \
+    helper_reset_fpstatus(env);                                         \
+    getVSR(rA(opcode) + 32, &xa, env);                                  \
+    getVSR(rB(opcode) + 32, &xb, env);                                  \
+                                                                        \
+    a = make_float128(xa.VsrD(0), xa.VsrD(1));                          \
+    b = make_float128(xb.VsrD(0), xb.VsrD(1));                          \
+                                                                        \
+    if (float128_is_signaling_nan(a, &env->fp_status) ||                \
+        float128_is_signaling_nan(b, &env->fp_status)) {                \
+        vxsnan_flag = true;                                             \
+        cc = 1;                                                         \
+        if (fpscr_ve == 0 && ordered) {                                 \
+            vxvc_flag = true;                                           \
+        }                                                               \
+    } else if (ordered && (float128_is_quiet_nan(a, &env->fp_status)    \
+                           || float128_is_quiet_nan(b, &env->fp_status))) { \
+        cc = 1;                                                         \
+        vxvc_flag = true;                                               \
+    }                                                                   \
+    if (vxsnan_flag) {                                                  \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);          \
+    }                                                                   \
+    if (vxvc_flag) {                                                    \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);            \
+    }                                                                   \
+                                                                        \
+    if (float128_lt(a, b, &env->fp_status)) {                           \
+        cc |= 8;                                                        \
+    } else if (!float128_le(a, b, &env->fp_status)) {                   \
+        cc |= 4;                                                        \
+    } else {                                                            \
+        cc |= 2;                                                        \
+    }                                                                   \
+                                                                        \
+    env->fpscr &= ~(0x0F << FPSCR_FPRF);                                \
+    env->fpscr |= cc << FPSCR_FPRF;                                     \
+    env->crf[BF(opcode)] = cc;                                          \
+                                                                        \
+    float_check_status(env);                                            \
+}
+
+VSX_SCALAR_CMPQ(xscmpoqp, 1)
+VSX_SCALAR_CMPQ(xscmpuqp, 0)
+
 /* VSX_MAX_MIN - VSX floating point maximum/minimum
  *   name  - instruction mnemonic
  *   op    - operation (max or min)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index ba42015..3b26678 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -408,6 +408,8 @@ DEF_HELPER_2(xscmpexpdp, void, env, i32)
 DEF_HELPER_2(xscmpexpqp, void, env, i32)
 DEF_HELPER_2(xscmpodp, void, env, i32)
 DEF_HELPER_2(xscmpudp, void, env, i32)
+DEF_HELPER_2(xscmpoqp, void, env, i32)
+DEF_HELPER_2(xscmpuqp, void, env, i32)
 DEF_HELPER_2(xsmaxdp, void, env, i32)
 DEF_HELPER_2(xsmindp, void, env, i32)
 DEF_HELPER_2(xscvdpsp, void, env, i32)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 5206258..ed9588e 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -628,6 +628,8 @@ GEN_VSX_HELPER_2(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscmpoqp, 0x04, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscmpuqp, 0x04, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 2468ee9..7f09527 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -126,6 +126,8 @@ GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
 GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
 GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
 GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
+GEN_VSX_XFORM_300(xscmpoqp, 0x04, 0x04, 0x00600001),
+GEN_VSX_XFORM_300(xscmpuqp, 0x04, 0x14, 0x00600001),
 GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
 GEN_XX3FORM(xsmindp, 0x00, 0x15, PPC2_VSX),
 GEN_XX2FORM(xscvdpsp, 0x12, 0x10, PPC2_VSX),
-- 
2.7.4


* [Qemu-devel] [PATCH 5/9] target-ppc: implement lxsd and lxssp instructions
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (3 preceding siblings ...)
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 4/9] target-ppc: Add xscmpoqp and xscmpuqp instructions Nikunj A Dadhania
@ 2016-11-22 11:46 ` Nikunj A Dadhania
  2016-11-23  4:06   ` David Gibson
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 6/9] target-ppc: implement stxsd and stxssp Nikunj A Dadhania
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:46 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

lxsd: Load VSX Scalar Dword
lxssp: Load VSX Scalar Single-Precision

DS-Form instructions share the same primary opcode, so bits 30:31 are
used to tell them apart. Use a common routine to decode the primary
opcode (0x39) DS-form instructions and branch out depending on
bits 30:31.
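
As a rough illustration of the decode described above (not the QEMU code
itself), the dispatch on bits 30:31 can be sketched in standalone C. The
enum and function names below are hypothetical and exist only for this
example; ISA bits 30:31 are taken to be the two least significant bits
of the 32-bit instruction word.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; these are not QEMU identifiers. */
enum ds39_insn { DS39_LFDP, DS39_LXSD, DS39_LXSSP, DS39_INVALID };

/* The primary opcode occupies ISA bits 0:5, i.e. the top six bits. */
static unsigned primary_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* Select the instruction from ISA bits 30:31 (the two lowest bits). */
static enum ds39_insn decode_ds39(uint32_t insn)
{
    if (primary_opcode(insn) != 0x39) {
        return DS39_INVALID;
    }
    switch (insn & 0x3) {
    case 0:
        return DS39_LFDP;
    case 2:
        return DS39_LXSD;
    case 3:
        return DS39_LXSSP;
    default:
        return DS39_INVALID;
    }
}
```

The value 1 in bits 30:31 has no handler in this sketch, which matches
the fall-through to the invalid-instruction path in the patch.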

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              | 25 +++++++++++++++++++++++++
 target-ppc/translate/fp-ops.inc.c   |  1 -
 target-ppc/translate/vsx-impl.inc.c | 21 +++++++++++++++++++++
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 6bdc433..f280851 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -6053,6 +6053,29 @@ GEN_TM_PRIV_NOOP(trechkpt);
 
 #include "translate/spe-impl.inc.c"
 
+/* Handles lfdp, lxsd, lxssp */
+static void gen_dform39(DisasContext *ctx)
+{
+    switch (ctx->opcode & 0x3) {
+    case 0: /* lfdp */
+        if (ctx->insns_flags2 & PPC2_ISA205) {
+            return gen_lfdp(ctx);
+        }
+        break;
+    case 2: /* lxsd */
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            return gen_lxsd(ctx);
+        }
+        break;
+    case 3: /* lxssp */
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            return gen_lxssp(ctx);
+        }
+        break;
+    }
+    return gen_invalid(ctx);
+}
+
 static opcode_t opcodes[] = {
 GEN_HANDLER(invalid, 0x00, 0x00, 0x00, 0xFFFFFFFF, PPC_NONE),
 GEN_HANDLER(cmp, 0x1F, 0x00, 0x00, 0x00400000, PPC_INTEGER),
@@ -6125,6 +6148,8 @@ GEN_HANDLER(ld, 0x3A, 0xFF, 0xFF, 0x00000000, PPC_64B),
 GEN_HANDLER(lq, 0x38, 0xFF, 0xFF, 0x00000000, PPC_64BX),
 GEN_HANDLER(std, 0x3E, 0xFF, 0xFF, 0x00000000, PPC_64B),
 #endif
+/* handles lfdp, lxsd, lxssp */
+GEN_HANDLER_E(dform39, 0x39, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER(lmw, 0x2E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(stmw, 0x2F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(lswi, 0x1F, 0x15, 0x12, 0x00000001, PPC_STRING),
diff --git a/target-ppc/translate/fp-ops.inc.c b/target-ppc/translate/fp-ops.inc.c
index d36ab4e..3127fa0 100644
--- a/target-ppc/translate/fp-ops.inc.c
+++ b/target-ppc/translate/fp-ops.inc.c
@@ -68,7 +68,6 @@ GEN_LDFS(lfd, ld64, 0x12, PPC_FLOAT)
 GEN_LDFS(lfs, ld32fs, 0x10, PPC_FLOAT)
 GEN_HANDLER_E(lfiwax, 0x1f, 0x17, 0x1a, 0x00000001, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(lfiwzx, 0x1f, 0x17, 0x1b, 0x1, PPC_NONE, PPC2_FP_CVT_ISA206),
-GEN_HANDLER_E(lfdp, 0x39, 0xFF, 0xFF, 0x00200003, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(lfdpx, 0x1F, 0x17, 0x18, 0x00200001, PPC_NONE, PPC2_ISA205),
 
 #define GEN_STF(name, stop, opc, type)                                        \
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index ed9588e..1d7cd23 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -190,6 +190,27 @@ static void gen_lxvb16x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+#define VSX_LOAD_SCALAR_DS(name, operation)                       \
+static void gen_##name(DisasContext *ctx)                         \
+{                                                                 \
+    TCGv EA;                                                      \
+    TCGv_i64 xth = cpu_vsrh(rD(ctx->opcode) + 32);                \
+                                                                  \
+    if (unlikely(!ctx->altivec_enabled)) {                        \
+        gen_exception(ctx, POWERPC_EXCP_VPU);                     \
+        return;                                                   \
+    }                                                             \
+    gen_set_access_type(ctx, ACCESS_INT);                         \
+    EA = tcg_temp_new();                                          \
+    gen_addr_imm_index(ctx, EA, 0x03);                            \
+    gen_qemu_##operation(ctx, xth, EA);                           \
+    /* NOTE: cpu_vsrl is undefined */                             \
+    tcg_temp_free(EA);                                            \
+}
+
+VSX_LOAD_SCALAR_DS(lxsd, ld64_i64)
+VSX_LOAD_SCALAR_DS(lxssp, ld32fs)
+
 #define VSX_STORE_SCALAR(name, operation)                     \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
-- 
2.7.4


* [Qemu-devel] [PATCH 6/9] target-ppc: implement stxsd and stxssp
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (4 preceding siblings ...)
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 5/9] target-ppc: implement lxsd and lxssp instructions Nikunj A Dadhania
@ 2016-11-22 11:46 ` Nikunj A Dadhania
  2016-11-22 15:19   ` Nikunj A Dadhania
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 7/9] target-ppc: implement lxv/lxvx and stxv/stxvx Nikunj A Dadhania
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:46 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

stxsd:  Store VSX Scalar Dword
stxssp: Store VSX Scalar Single-Precision

DQ-Form and DS-Form instructions share the same primary opcode (0x3D);
bits 29:31 are used to decode the instruction. Use a common routine to
decode the primary opcode (0x3D) DS-form/DQ-form instructions.
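
A hedged sketch of how such a mixed DS-form/DQ-form decode might look in
standalone C follows. It rests on an assumption from my reading of the
encodings, not on anything stated in this patch: DQ-form instructions
(lxv, stxv) are selected by all of bits 29:31, while for DS-form
instructions (stfdp, stxsd, stxssp) bit 29 belongs to the displacement
field and only bits 30:31 select the instruction. All names are
hypothetical.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; these are not QEMU identifiers. */
enum op3d_insn {
    OP3D_STFDP, OP3D_LXV, OP3D_STXSD, OP3D_STXSSP, OP3D_STXV, OP3D_INVALID
};

static enum op3d_insn decode_op3d(uint32_t insn)
{
    if ((insn >> 26) != 0x3D) {
        return OP3D_INVALID;
    }
    /* Assumed DQ-form: ISA bits 29:31 (lowest three bits) select. */
    switch (insn & 0x7) {
    case 1:
        return OP3D_LXV;
    case 5:
        return OP3D_STXV;
    }
    /* Assumed DS-form: bit 29 is displacement, so mask only bits 30:31. */
    switch (insn & 0x3) {
    case 0:
        return OP3D_STFDP;
    case 2:
        return OP3D_STXSD;
    case 3:
        return OP3D_STXSSP;
    }
    return OP3D_INVALID;
}
```

Under that assumption, a DS-form word with bit 29 set (for example low
bits 0b110) still decodes as stxsd, which a plain three-bit switch with
only cases 0, 2 and 3 would not cover.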

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              | 25 +++++++++++++++++++++++++
 target-ppc/translate/fp-ops.inc.c   |  1 -
 target-ppc/translate/vsx-impl.inc.c | 21 +++++++++++++++++++++
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index f280851..bce607b 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -6076,6 +6076,29 @@ static void gen_dform39(DisasContext *ctx)
     return gen_invalid(ctx);
 }
 
+/* handles stfdp, stxsd, stxssp */
+static void gen_dform3D(DisasContext *ctx)
+{
+    switch (ctx->opcode & 0x7) {
+    case 0: /* stfdp */
+        if (ctx->insns_flags2 & PPC2_ISA205) {
+            return gen_stfdp(ctx);
+        }
+        break;
+    case 2: /* stxsd */
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            return gen_stxsd(ctx);
+        }
+        break;
+    case 3: /* stxssp */
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            return gen_stxssp(ctx);
+        }
+        break;
+    }
+    return gen_invalid(ctx);
+}
+
 static opcode_t opcodes[] = {
 GEN_HANDLER(invalid, 0x00, 0x00, 0x00, 0xFFFFFFFF, PPC_NONE),
 GEN_HANDLER(cmp, 0x1F, 0x00, 0x00, 0x00400000, PPC_INTEGER),
@@ -6150,6 +6173,8 @@ GEN_HANDLER(std, 0x3E, 0xFF, 0xFF, 0x00000000, PPC_64B),
 #endif
 /* handles lfdp, lxsd, lxssp */
 GEN_HANDLER_E(dform39, 0x39, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
+/* handles stfdp, stxsd, stxssp */
+GEN_HANDLER_E(dform3D, 0x3D, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER(lmw, 0x2E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(stmw, 0x2F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(lswi, 0x1F, 0x15, 0x12, 0x00000001, PPC_STRING),
diff --git a/target-ppc/translate/fp-ops.inc.c b/target-ppc/translate/fp-ops.inc.c
index 3127fa0..3c6d05a 100644
--- a/target-ppc/translate/fp-ops.inc.c
+++ b/target-ppc/translate/fp-ops.inc.c
@@ -87,7 +87,6 @@ GEN_STXF(name, stop, 0x17, op | 0x00, type)
 GEN_STFS(stfd, st64_i64, 0x16, PPC_FLOAT)
 GEN_STFS(stfs, st32fs, 0x14, PPC_FLOAT)
 GEN_STXF(stfiw, st32fiw, 0x17, 0x1E, PPC_FLOAT_STFIWX)
-GEN_HANDLER_E(stfdp, 0x3D, 0xFF, 0xFF, 0x00200003, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(stfdpx, 0x1F, 0x17, 0x1C, 0x00200001, PPC_NONE, PPC2_ISA205),
 
 GEN_HANDLER(frsqrtes, 0x3B, 0x1A, 0xFF, 0x001F07C0, PPC_FLOAT_FRSQRTES),
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 1d7cd23..8ee44cf 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -332,6 +332,27 @@ static void gen_stxvb16x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+#define VSX_STORE_SCALAR_DS(name, operation)                      \
+static void gen_##name(DisasContext *ctx)                         \
+{                                                                 \
+    TCGv EA;                                                      \
+    TCGv_i64 xth = cpu_vsrh(rD(ctx->opcode) + 32);                \
+                                                                  \
+    if (unlikely(!ctx->altivec_enabled)) {                        \
+        gen_exception(ctx, POWERPC_EXCP_VPU);                     \
+        return;                                                   \
+    }                                                             \
+    gen_set_access_type(ctx, ACCESS_INT);                         \
+    EA = tcg_temp_new();                                          \
+    gen_addr_imm_index(ctx, EA, 0x03);                            \
+    gen_qemu_##operation(ctx, xth, EA);                           \
+    /* NOTE: cpu_vsrl is undefined */                             \
+    tcg_temp_free(EA);                                            \
+}
+
+VSX_STORE_SCALAR_DS(stxsd, st64_i64)
+VSX_STORE_SCALAR_DS(stxssp, st32fs)
+
 #define MV_VSRW(name, tcgop1, tcgop2, target, source)           \
 static void gen_##name(DisasContext *ctx)                       \
 {                                                               \
-- 
2.7.4


* [Qemu-devel] [PATCH 7/9] target-ppc: implement lxv/lxvx and stxv/stxvx
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (5 preceding siblings ...)
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 6/9] target-ppc: implement stxsd and stxssp Nikunj A Dadhania
@ 2016-11-22 11:46 ` Nikunj A Dadhania
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 8/9] target-ppc: add vextu[bhw]lx instructions Nikunj A Dadhania
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 9/9] target-ppc: add vextu[bhw]rx instructions Nikunj A Dadhania
  8 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:46 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

lxv:  Load VSX Vector
lxvx: Load VSX Vector Indexed

    Little/Big-endian Storage
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

    Vector load results:
    BE:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

    LE:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |E7|E6|E5|E4|E3|E2|E1|E0|F7|F6|F5|F4|F3|F2|F1|F0|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

stxv: Store VSX Vector
stxvx: Store VSX Vector Indexed

    Vector (8-bit elements) in BE:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

    Vector (8-bit elements) in LE:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |E7|E6|E5|E4|E3|E2|E1|E0|F7|F6|F5|F4|F3|F2|F1|F0|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

    Store results in the following:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

--

Note: In LE mode, lxvx and stxvx are being replaced by the compiler with
lxvd2x and stxvd2x (ISA 3.0: page 489 and page 507). AFAIU, this seems to
be wrong, as lxvx does a 16-byte read and byte-swaps across all 16 bytes,
while lxvd2x does two 8-byte reads and byte-swaps each 8 bytes.
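
To make the difference in the note concrete, here is a small standalone
C model (a sketch, not QEMU code). It assumes the register is viewed as
two doublewords, hi being the leftmost, and models lxvd2x in LE mode
exactly as the note describes it: two independent 8-byte little-endian
loads. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit VSR: hi is doubleword 0 (leftmost). */
struct vsr_model {
    uint64_t hi;
    uint64_t lo;
};

/* Read 8 bytes as a little-endian 64-bit value. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Read 8 bytes as a big-endian 64-bit value. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* lxv/lxvx as in the diagrams: one 16-byte access, swapped as a whole in LE. */
static struct vsr_model model_lxvx(const uint8_t *mem, int le_mode)
{
    struct vsr_model r;
    if (le_mode) {
        r.lo = load_le64(mem);
        r.hi = load_le64(mem + 8);
    } else {
        r.hi = load_be64(mem);
        r.lo = load_be64(mem + 8);
    }
    return r;
}

/* lxvd2x in LE mode as the note describes it: each doubleword swapped alone. */
static struct vsr_model model_lxvd2x_le(const uint8_t *mem)
{
    struct vsr_model r;
    r.hi = load_le64(mem);
    r.lo = load_le64(mem + 8);
    return r;
}
```

With the storage pattern F0..F7 E0..E7 from the diagrams, the two models
produce the same doubleword values in LE mode but in swapped halves,
which is the mismatch the note is pointing at.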
---
 target-ppc/internal.h               |  1 +
 target-ppc/translate.c              | 14 +++++++++--
 target-ppc/translate/vsx-impl.inc.c | 50 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  2 ++
 4 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/target-ppc/internal.h b/target-ppc/internal.h
index 9a4a74a..e83ea45 100644
--- a/target-ppc/internal.h
+++ b/target-ppc/internal.h
@@ -187,6 +187,7 @@ EXTRACT_HELPER(DCM, 10, 6)
 /* DFP Z23-form */
 EXTRACT_HELPER(RMC, 9, 2)
 
+EXTRACT_HELPER_SPLIT(DQxT, 3, 1, 21, 5);
 EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5);
 EXTRACT_HELPER_SPLIT(xS, 0, 1, 21, 5);
 EXTRACT_HELPER_SPLIT(xA, 2, 1, 16, 5);
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index bce607b..9a4575d 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -6076,7 +6076,7 @@ static void gen_dform39(DisasContext *ctx)
     return gen_invalid(ctx);
 }
 
-/* handles stfdp, stxsd, stxssp */
+/* handles stfdp, lxv, stxsd, stxssp, stxv */
 static void gen_dform3D(DisasContext *ctx)
 {
     switch (ctx->opcode & 0x7) {
@@ -6085,6 +6085,11 @@ static void gen_dform3D(DisasContext *ctx)
             return gen_stfdp(ctx);
         }
         break;
+    case 1: /* lxv */
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            return gen_lxv(ctx);
+        }
+        break;
     case 2: /* stxsd */
         if (ctx->insns_flags2 & PPC2_ISA300) {
             return gen_stxsd(ctx);
@@ -6095,6 +6100,11 @@ static void gen_dform3D(DisasContext *ctx)
             return gen_stxssp(ctx);
         }
         break;
+    case 5: /* stxv */
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            return gen_stxv(ctx);
+        }
+        break;
     }
     return gen_invalid(ctx);
 }
@@ -6173,7 +6183,7 @@ GEN_HANDLER(std, 0x3E, 0xFF, 0xFF, 0x00000000, PPC_64B),
 #endif
 /* handles lfdp, lxsd, lxssp */
 GEN_HANDLER_E(dform39, 0x39, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
-/* handles stfdp, stxsd, stxssp */
+/* handles stfdp, lxv, stxsd, stxssp, stxv */
 GEN_HANDLER_E(dform3D, 0x3D, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER(lmw, 0x2E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(stmw, 0x2F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 8ee44cf..2fbdbd2 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -190,6 +190,56 @@ static void gen_lxvb16x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+#define VSX_VECTOR_LOAD_STORE(name, op, indexed)            \
+static void gen_##name(DisasContext *ctx)                   \
+{                                                           \
+    int xt;                                                 \
+    TCGv EA;                                                \
+    TCGv_i64 xth, xtl;                                      \
+                                                            \
+    if (indexed) {                                          \
+        xt = xT(ctx->opcode);                               \
+    } else {                                                \
+        xt = DQxT(ctx->opcode);                             \
+    }                                                       \
+    xth = cpu_vsrh(xt);                                     \
+    xtl = cpu_vsrl(xt);                                     \
+                                                            \
+    if (xt < 32) {                                          \
+        if (unlikely(!ctx->vsx_enabled)) {                  \
+            gen_exception(ctx, POWERPC_EXCP_VSXU);          \
+            return;                                         \
+        }                                                   \
+    } else {                                                \
+        if (unlikely(!ctx->altivec_enabled)) {              \
+            gen_exception(ctx, POWERPC_EXCP_VPU);           \
+            return;                                         \
+        }                                                   \
+    }                                                       \
+    gen_set_access_type(ctx, ACCESS_INT);                   \
+    EA = tcg_temp_new();                                    \
+    if (indexed) {                                          \
+        gen_addr_reg_index(ctx, EA);                        \
+    } else {                                                \
+        gen_addr_imm_index(ctx, EA, 0x0F);                  \
+    }                                                       \
+    if (ctx->le_mode) {                                     \
+        tcg_gen_qemu_##op(xtl, EA, ctx->mem_idx, MO_LEQ);   \
+        tcg_gen_addi_tl(EA, EA, 8);                         \
+        tcg_gen_qemu_##op(xth, EA, ctx->mem_idx, MO_LEQ);   \
+    } else {                                                \
+        tcg_gen_qemu_##op(xth, EA, ctx->mem_idx, MO_BEQ);   \
+        tcg_gen_addi_tl(EA, EA, 8);                         \
+        tcg_gen_qemu_##op(xtl, EA, ctx->mem_idx, MO_BEQ);   \
+    }                                                       \
+    tcg_temp_free(EA);                                      \
+}
+
+VSX_VECTOR_LOAD_STORE(lxv, ld_i64, 0)
+VSX_VECTOR_LOAD_STORE(stxv, st_i64, 0)
+VSX_VECTOR_LOAD_STORE(lxvx, ld_i64, 1)
+VSX_VECTOR_LOAD_STORE(stxvx, st_i64, 1)
+
 #define VSX_LOAD_SCALAR_DS(name, operation)                       \
 static void gen_##name(DisasContext *ctx)                         \
 {                                                                 \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 7f09527..8a1cbe0 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -9,6 +9,7 @@ GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvh8x, 0x1F, 0x0C, 0x19, 0, PPC_NONE,  PPC2_ISA300),
 GEN_HANDLER_E(lxvb16x, 0x1F, 0x0C, 0x1B, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(lxvx, 0x1F, 0x0C, 0x08, 0x00000040, PPC_NONE, PPC2_ISA300),
 
 GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxsibx, 0x1F, 0xD, 0x1C, 0, PPC_NONE, PPC2_ISA300),
@@ -19,6 +20,7 @@ GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxvh8x, 0x1F, 0x0C, 0x1D, 0, PPC_NONE,  PPC2_ISA300),
 GEN_HANDLER_E(stxvb16x, 0x1F, 0x0C, 0x1F, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(stxvx, 0x1F, 0x0C, 0x0C, 0, PPC_NONE, PPC2_ISA300),
 
 GEN_HANDLER_E(mfvsrwz, 0x1F, 0x13, 0x03, 0x0000F800, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(mtvsrwa, 0x1F, 0x13, 0x06, 0x0000F800, PPC_NONE, PPC2_VSX207),
-- 
2.7.4


* [Qemu-devel] [PATCH 8/9] target-ppc: add vextu[bhw]lx instructions
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (6 preceding siblings ...)
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 7/9] target-ppc: implement lxv/lxvx and stxv/stxvx Nikunj A Dadhania
@ 2016-11-22 11:46 ` Nikunj A Dadhania
  2016-11-23  4:11   ` David Gibson
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 9/9] target-ppc: add vextu[bhw]rx instructions Nikunj A Dadhania
  8 siblings, 1 reply; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:46 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata, Avinesh Kumar

From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>

vextublx:  Vector Extract Unsigned Byte Left-Indexed
vextuhlx:  Vector Extract Unsigned Halfword Left-Indexed
vextuwlx:  Vector Extract Unsigned Word Left-Indexed
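
For readers unfamiliar with these instructions, the left-indexed extract
can be modelled in a few lines of standalone C. This is a sketch with
hypothetical names, assuming the vector is given as a byte array whose
element 0 is the leftmost byte; it mirrors the byte-wise fallback loop
in this patch, including zero-filling any byte that would fall past
element 15.

```c
#include <stdint.h>

/*
 * Hypothetical model of vextu[bhw]lx. vrb[0] is the leftmost byte of the
 * vector, ra supplies the index in its low four bits, and elem is the
 * element size in bytes (1, 2 or 4).
 */
static uint64_t model_vextulx(uint64_t ra, const uint8_t vrb[16], int elem)
{
    uint64_t r = 0;
    int index = ra & 0xf;

    for (int i = 0; i < elem; i++) {
        r <<= 8;
        /* Bytes beyond the end of the vector contribute zero. */
        if (index + i <= 15) {
            r |= vrb[index + i];
        }
    }
    return r;
}
```

The result is zero-extended, so a byte extract only ever populates the
lowest eight bits of the returned value.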

Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  3 ++
 target-ppc/int_helper.c             | 63 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c | 18 +++++++++++
 target-ppc/translate/vmx-ops.inc.c  |  4 ++-
 4 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 3b26678..d0a8fb2 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -366,6 +366,9 @@ DEF_HELPER_3(vpmsumb, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumh, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumw, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
+DEF_HELPER_2(vextublx, tl, tl, avr)
+DEF_HELPER_2(vextuhlx, tl, tl, avr)
+DEF_HELPER_2(vextuwlx, tl, tl, avr)
 
 DEF_HELPER_2(vsbox, void, avr, avr)
 DEF_HELPER_3(vcipher, void, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 8886a72..fb9f178 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1805,6 +1805,69 @@ void helper_vlogefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
     }
 }
 
+#define EXTRACT128(value, start, length)        \
+    ((value >> start) & (~(__uint128_t)0 >> (128 - length)))
+
+#if defined(HOST_WORDS_BIGENDIAN)
+#  if defined(CONFIG_INT128)
+#  define VEXTULX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int index = (a & 0xf) * 8;                                  \
+    r = EXTRACT128(b->u128, index, elem * 8);                   \
+    return r;                                                   \
+}
+#  else
+#  define VEXTULX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int i;                                                      \
+    int index = a & 0xf;                                        \
+    for (i = 0; i < elem; i++) {                                \
+        r = r << 8;                                             \
+        if (index + i <= 15) {                                  \
+            r = r | b->u8[index + i];                           \
+        }                                                       \
+    }                                                           \
+    return r;                                                   \
+}
+#  endif
+#else
+#  if defined(CONFIG_INT128)
+#  define VEXTULX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int size =  elem * 8;                                       \
+    int index = (15 - (a & 0xf) + 1) * 8;                       \
+    r = EXTRACT128(b->u128, (index - size), size);              \
+    return r;                                                   \
+}
+#  else
+#  define VEXTULX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int i;                                                      \
+    int index = 15 - (a & 0xf);                                 \
+    for (i = 0; i < elem; i++) {                                \
+        r = r << 8;                                             \
+        if (index - i >= 0) {                                   \
+            r = r | b->u8[index - i];                           \
+        }                                                       \
+    }                                                           \
+    return r;                                                   \
+}
+#  endif
+#endif
+
+VEXTULX_DO(vextublx, 1)
+VEXTULX_DO(vextuhlx, 2)
+VEXTULX_DO(vextuwlx, 4)
+#undef VEXTULX_DO
+
 /* The specification says that the results are undefined if all of the
  * shift counts are not identical.  We check to make sure that they are
  * to conform to what real hardware appears to do.  */
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 7143eb3..e91d10b 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -340,6 +340,19 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx)              \
     }                                                                   \
 }
 
+#define GEN_VXFORM_HETRO(name, opc2, opc3)                              \
+static void glue(gen_, name)(DisasContext *ctx)                         \
+{                                                                       \
+    TCGv_ptr rb;                                                        \
+    if (unlikely(!ctx->altivec_enabled)) {                              \
+        gen_exception(ctx, POWERPC_EXCP_VPU);                           \
+        return;                                                         \
+    }                                                                   \
+    rb = gen_avr_ptr(rB(ctx->opcode));                                  \
+    gen_helper_##name(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)], rb); \
+    tcg_temp_free_ptr(rb);                                              \
+}
+
 GEN_VXFORM(vaddubm, 0, 0);
 GEN_VXFORM_DUAL_EXT(vaddubm, PPC_ALTIVEC, PPC_NONE, 0,       \
                     vmul10cuq, PPC_NONE, PPC2_ISA300, 0x0000F800)
@@ -525,6 +538,11 @@ GEN_VXFORM_ENV(vaddfp, 5, 0);
 GEN_VXFORM_ENV(vsubfp, 5, 1);
 GEN_VXFORM_ENV(vmaxfp, 5, 16);
 GEN_VXFORM_ENV(vminfp, 5, 17);
+GEN_VXFORM_HETRO(vextublx, 6, 24)
+GEN_VXFORM_HETRO(vextuhlx, 6, 25)
+GEN_VXFORM_HETRO(vextuwlx, 6, 26)
+GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207,
+                vextuwlx, PPC_NONE, PPC2_ISA300)
 
 #define GEN_VXRFORM1(opname, name, str, opc2, opc3)                     \
 static void glue(gen_, name)(DisasContext *ctx)                         \
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index f02b3be..e62e564 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -91,8 +91,10 @@ GEN_VXFORM(vmrghw, 6, 2),
 GEN_VXFORM(vmrglb, 6, 4),
 GEN_VXFORM(vmrglh, 6, 5),
 GEN_VXFORM(vmrglw, 6, 6),
+GEN_VXFORM_300(vextublx, 6, 24),
+GEN_VXFORM_300(vextuhlx, 6, 25),
+GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM_207(vmrgew, 6, 30),
-GEN_VXFORM_207(vmrgow, 6, 26),
 GEN_VXFORM(vmuloub, 4, 0),
 GEN_VXFORM(vmulouh, 4, 1),
 GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),
-- 
2.7.4


* [Qemu-devel] [PATCH 9/9] target-ppc: add vextu[bhw]rx instructions
  2016-11-22 11:45 [Qemu-devel] [PATCH ppc-for-2.9 0/9] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (7 preceding siblings ...)
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 8/9] target-ppc: add vextu[bhw]lx instructions Nikunj A Dadhania
@ 2016-11-22 11:46 ` Nikunj A Dadhania
  8 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 11:46 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, bharata, Hariharan T.S, Avinesh Kumar

From: "Hariharan T.S" <hari@linux.vnet.ibm.com>

vextubrx: Vector Extract Unsigned Byte Right-Indexed VX-form
vextuhrx: Vector Extract Unsigned Halfword Right-Indexed VX-form
vextuwrx: Vector Extract Unsigned Word Right-Indexed VX-form

Signed-off-by: Hariharan T.S. <hari@linux.vnet.ibm.com>
Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  3 ++
 target-ppc/int_helper.c             | 60 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  5 ++++
 target-ppc/translate/vmx-ops.inc.c  |  4 ++-
 4 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index d0a8fb2..a6e04cb 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -369,6 +369,9 @@ DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
 DEF_HELPER_2(vextublx, tl, tl, avr)
 DEF_HELPER_2(vextuhlx, tl, tl, avr)
 DEF_HELPER_2(vextuwlx, tl, tl, avr)
+DEF_HELPER_2(vextubrx, tl, tl, avr)
+DEF_HELPER_2(vextuhrx, tl, tl, avr)
+DEF_HELPER_2(vextuwrx, tl, tl, avr)
 
 DEF_HELPER_2(vsbox, void, avr, avr)
 DEF_HELPER_3(vcipher, void, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index fb9f178..d719996 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1868,6 +1868,66 @@ VEXTULX_DO(vextuhlx, 2)
 VEXTULX_DO(vextuwlx, 4)
 #undef VEXTULX_DO
 
+#if defined(HOST_WORDS_BIGENDIAN)
+#  if defined(CONFIG_INT128)
+#  define VEXTURX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int size =  elem * 8;                                       \
+    int index = (15 - (a & 0xf) + 1) * 8;                       \
+    r = EXTRACT128(b->u128, (index - size), size);              \
+    return r;                                                   \
+}
+#  else
+#  define VEXTURX_DO(name, elem)                                        \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)          \
+{                                                                       \
+    target_ulong r = 0;                                                 \
+    int i;                                                              \
+    int index = a & 0xf;                                                \
+    for (i = elem - 1; i >= 0; i--) {                                   \
+        r = r << 8;                                                     \
+        if ((15 - i - index) >= 0) {                                    \
+            r = r | b->u8[15 - i - index];                              \
+        }                                                               \
+    }                                                                   \
+    return r;                                                           \
+}
+#  endif
+#else
+#  if defined(CONFIG_INT128)
+#  define VEXTURX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int index = (a & 0xf) * 8;                                  \
+    r = EXTRACT128(b->u128, index, elem * 8);                   \
+    return r;                                                   \
+}
+#  else
+#  define VEXTURX_DO(name, elem)                                        \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)          \
+{                                                                       \
+    target_ulong r = 0;                                                 \
+    int i;                                                              \
+    int index = 15 - (a & 0xf);                                         \
+    for (i = elem - 1; i >= 0; i--) {                                   \
+        r = r << 8;                                                     \
+        if ((15 + i - index) <= 15) {                                   \
+            r = r | b->u8[15 + i - index];                              \
+        }                                                               \
+    }                                                                   \
+    return r;                                                           \
+}
+#  endif
+#endif
+
+VEXTURX_DO(vextubrx, 1)
+VEXTURX_DO(vextuhrx, 2)
+VEXTURX_DO(vextuwrx, 4)
+#undef VEXTURX_DO
+
 /* The specification says that the results are undefined if all of the
  * shift counts are not identical.  We check to make sure that they are
  * to conform to what real hardware appears to do.  */
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index e91d10b..3dea465 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -543,6 +543,11 @@ GEN_VXFORM_HETRO(vextuhlx, 6, 25)
 GEN_VXFORM_HETRO(vextuwlx, 6, 26)
 GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207,
                 vextuwlx, PPC_NONE, PPC2_ISA300)
+GEN_VXFORM_HETRO(vextubrx, 6, 28)
+GEN_VXFORM_HETRO(vextuhrx, 6, 29)
+GEN_VXFORM_HETRO(vextuwrx, 6, 30)
+GEN_VXFORM_DUAL(vmrgew, PPC_NONE, PPC2_ALTIVEC_207, \
+                vextuwrx, PPC_NONE, PPC2_ISA300)
 
 #define GEN_VXRFORM1(opname, name, str, opc2, opc3)                     \
 static void glue(gen_, name)(DisasContext *ctx)                         \
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index e62e564..a3c9d05 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -94,7 +94,9 @@ GEN_VXFORM(vmrglw, 6, 6),
 GEN_VXFORM_300(vextublx, 6, 24),
 GEN_VXFORM_300(vextuhlx, 6, 25),
 GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM_207(vmrgew, 6, 30),
+GEN_VXFORM_300(vextubrx, 6, 28),
+GEN_VXFORM_300(vextuhrx, 6, 29),
+GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM(vmuloub, 4, 0),
 GEN_VXFORM(vmulouh, 4, 1),
 GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),
-- 
2.7.4


* Re: [Qemu-devel] [PATCH 6/9] target-ppc: implement stxsd and stxssp
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 6/9] target-ppc: implement stxsd and stxssp Nikunj A Dadhania
@ 2016-11-22 15:19   ` Nikunj A Dadhania
  0 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-22 15:19 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata

Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> writes:

> stxsd:  Store VSX Scalar Dword
> stxssp: Store VSX Scalar SP
>
> Moreover, DQ-Form/DS-Form instructions share the same primary
> opcode (0x3D); bits 29:31 are used to decode the instruction. Use a
> common routine to decode primary opcode (0x3D) - DS-Form/DQ-Form
> instructions.


Realised that the logic below wasn't correct; it should be something like this:

static void gen_dform3D(DisasContext *ctx)
{
    if ((ctx->opcode & 3) == 1) { /* DQ-FORM */
        switch (ctx->opcode & 0x7) {
        case 1: /* lxv */
            if (ctx->insns_flags2 & PPC2_ISA300) {
                return gen_lxv(ctx);
            }
            break;
        case 5: /* stxv */
            if (ctx->insns_flags2 & PPC2_ISA300) {
                return gen_stxv(ctx);
            }
            break;
        }
    } else { /* DS-FORM */
        switch (ctx->opcode & 0x3) {
        case 0: /* stfdp */
            if (ctx->insns_flags2 & PPC2_ISA205) {
                return gen_stfdp(ctx);
            }
            break;
        case 2: /* stxsd */
            if (ctx->insns_flags2 & PPC2_ISA300) {
                return gen_stxsd(ctx);
            }
            break;
        case 3: /* stxssp */
            if (ctx->insns_flags2 & PPC2_ISA300) {
                return gen_stxssp(ctx);
            }
            break;
        }
    }
    return gen_invalid(ctx);
}

Will correct it in next revision.
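
To spell out why the original switch on (opcode & 0x7) is a problem: for
the DS-form instructions only the low two bits are opcode bits, and the
third bit belongs to the displacement, so a store with that bit set would
fall through to gen_invalid(). A rough standalone sketch of the intended
classification follows; the enum and function names are made up for
illustration and the ISA flag checks are left out.

```c
#include <stdint.h>

/* Hypothetical labels for the instructions sharing primary opcode 0x3D. */
enum op3d {
    OP3D_INVALID,
    OP3D_STFDP,
    OP3D_STXSD,
    OP3D_STXSSP,
    OP3D_LXV,
    OP3D_STXV,
};

/* Classify a 32-bit instruction word whose primary opcode is 0x3D. */
static enum op3d classify_op3d(uint32_t opcode)
{
    if ((opcode & 0x3) == 1) {
        /* DQ-form: the low three bits (ISA bits 29:31) select the insn. */
        switch (opcode & 0x7) {
        case 1:
            return OP3D_LXV;
        case 5:
            return OP3D_STXV;
        }
    } else {
        /* DS-form: only the low two bits (ISA bits 30:31) are opcode. */
        switch (opcode & 0x3) {
        case 0:
            return OP3D_STFDP;
        case 2:
            return OP3D_STXSD;
        case 3:
            return OP3D_STXSSP;
        }
    }
    /* Not reached in this sketch; the real code lands here when the
     * required ISA flag is missing. */
    return OP3D_INVALID;
}
```

With this split, a word ending in 0b110 is decoded as stxsd with a
non-zero displacement bit rather than being rejected.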

>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target-ppc/translate.c              | 25 +++++++++++++++++++++++++
>  target-ppc/translate/fp-ops.inc.c   |  1 -
>  target-ppc/translate/vsx-impl.inc.c | 21 +++++++++++++++++++++
>  3 files changed, 46 insertions(+), 1 deletion(-)
>
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index f280851..bce607b 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -6076,6 +6076,29 @@ static void gen_dform39(DisasContext *ctx)
>      return gen_invalid(ctx);
>  }
>
> +/* handles stfdp, stxsd, stxssp */
> +static void gen_dform3D(DisasContext *ctx)
> +{
> +    switch (ctx->opcode & 0x7) {
> +    case 0: /* lfdp */
> +        if (ctx->insns_flags2 & PPC2_ISA205) {
> +            return gen_stfdp(ctx);
> +        }
> +        break;
> +    case 2: /* lxsd */
> +        if (ctx->insns_flags2 & PPC2_ISA300) {
> +            return gen_stxsd(ctx);
> +        }
> +        break;
> +    case 3: /* lxssp */
> +        if (ctx->insns_flags2 & PPC2_ISA300) {
> +            return gen_stxssp(ctx);
> +        }
> +        break;
> +    }
> +    return gen_invalid(ctx);
> +}
> +


* Re: [Qemu-devel] [PATCH 1/9] target-ppc: Consolidate instruction decode helpers
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 1/9] target-ppc: Consolidate instruction decode helpers Nikunj A Dadhania
@ 2016-11-23  3:56   ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2016-11-23  3:56 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata

On Tue, Nov 22, 2016 at 05:15:57PM +0530, Nikunj A Dadhania wrote:
> From: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> Move instruction decode helpers to target-ppc/internal.h so that some
> of these can be used from outside of translate.c. This movement also
> helps to get rid of some duplicate helpers from target-ppc/fpu_helper.c.
> 
> Suggested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
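
As a quick illustration of what the consolidated helpers do once they are
visible outside translate.c, here is a small standalone sketch. The two
macros are copied in shape from the patch; make_insn() is a hypothetical
helper that exists only to build a test word.

```c
#include <stdint.h>

/* Same shape as the helpers moved into internal.h by this patch. */
#define EXTRACT_HELPER(name, shift, nb)                                   \
static inline uint32_t name(uint32_t opcode)                              \
{                                                                         \
    return (opcode >> (shift)) & ((1 << (nb)) - 1);                       \
}

#define EXTRACT_HELPER_SPLIT(name, shift1, nb1, shift2, nb2)              \
static inline uint32_t name(uint32_t opcode)                              \
{                                                                         \
    return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |         \
            ((opcode >> (shift2)) & ((1 << (nb2)) - 1));                  \
}

EXTRACT_HELPER(opc1, 26, 6)            /* primary opcode */
EXTRACT_HELPER(rA, 16, 5)              /* first operand register */
EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5)  /* 6-bit VSX target: TX || T */

/* Hypothetical helper for this sketch only: build an instruction word. */
static inline uint32_t make_insn(uint32_t op, uint32_t t, uint32_t a,
                                 uint32_t tx)
{
    return (op << 26) | (t << 21) | (a << 16) | tx;
}
```

For a word built with T = 5 and TX = 1, xT() returns 37: the single TX
bit becomes the high bit of the 6-bit VSX register number.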

> ---
>  target-ppc/fpu_helper.c |  11 +---
>  target-ppc/internal.h   | 151 ++++++++++++++++++++++++++++++++++++++++++++++++
>  target-ppc/translate.c  | 151 ------------------------------------------------
>  3 files changed, 152 insertions(+), 161 deletions(-)
> 
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index 8a389e1..d3741b4 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -20,6 +20,7 @@
>  #include "cpu.h"
>  #include "exec/helper-proto.h"
>  #include "exec/exec-all.h"
> +#include "internal.h"
>  
>  #define float64_snan_to_qnan(x) ((x) | 0x0008000000000000ULL)
>  #define float32_snan_to_qnan(x) ((x) | 0x00400000)
> @@ -1776,16 +1777,6 @@ uint32_t helper_efdcmpeq(CPUPPCState *env, uint64_t op1, uint64_t op2)
>      return helper_efdtsteq(env, op1, op2);
>  }
>  
> -#define DECODE_SPLIT(opcode, shift1, nb1, shift2, nb2) \
> -    (((((opcode) >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |    \
> -     (((opcode) >> (shift2)) & ((1 << (nb2)) - 1)))
> -
> -#define xT(opcode) DECODE_SPLIT(opcode, 0, 1, 21, 5)
> -#define xA(opcode) DECODE_SPLIT(opcode, 2, 1, 16, 5)
> -#define xB(opcode) DECODE_SPLIT(opcode, 1, 1, 11, 5)
> -#define xC(opcode) DECODE_SPLIT(opcode, 3, 1,  6, 5)
> -#define BF(opcode) (((opcode) >> (31-8)) & 7)
> -
>  typedef union _ppc_vsr_t {
>      uint64_t u64[2];
>      uint32_t u32[4];
> diff --git a/target-ppc/internal.h b/target-ppc/internal.h
> index 1ff4896..9a4a74a 100644
> --- a/target-ppc/internal.h
> +++ b/target-ppc/internal.h
> @@ -47,4 +47,155 @@ FUNC_MASK(MASK, target_ulong, 32, UINT32_MAX);
>  FUNC_MASK(mask_u32, uint32_t, 32, UINT32_MAX);
>  FUNC_MASK(mask_u64, uint64_t, 64, UINT64_MAX);
>  
> +/*****************************************************************************/
> +/***                           Instruction decoding                        ***/
> +#define EXTRACT_HELPER(name, shift, nb)                                       \
> +static inline uint32_t name(uint32_t opcode)                                  \
> +{                                                                             \
> +    return (opcode >> (shift)) & ((1 << (nb)) - 1);                           \
> +}
> +
> +#define EXTRACT_SHELPER(name, shift, nb)                                      \
> +static inline int32_t name(uint32_t opcode)                                   \
> +{                                                                             \
> +    return (int16_t)((opcode >> (shift)) & ((1 << (nb)) - 1));                \
> +}
> +
> +#define EXTRACT_HELPER_SPLIT(name, shift1, nb1, shift2, nb2)                  \
> +static inline uint32_t name(uint32_t opcode)                                  \
> +{                                                                             \
> +    return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |             \
> +            ((opcode >> (shift2)) & ((1 << (nb2)) - 1));                      \
> +}
> +
> +#define EXTRACT_HELPER_DXFORM(name,                                           \
> +                              d0_bits, shift_op_d0, shift_d0,                 \
> +                              d1_bits, shift_op_d1, shift_d1,                 \
> +                              d2_bits, shift_op_d2, shift_d2)                 \
> +static inline int16_t name(uint32_t opcode)                                   \
> +{                                                                             \
> +    return                                                                    \
> +        (((opcode >> (shift_op_d0)) & ((1 << (d0_bits)) - 1)) << (shift_d0)) | \
> +        (((opcode >> (shift_op_d1)) & ((1 << (d1_bits)) - 1)) << (shift_d1)) | \
> +        (((opcode >> (shift_op_d2)) & ((1 << (d2_bits)) - 1)) << (shift_d2));  \
> +}
> +
> +
> +/* Opcode part 1 */
> +EXTRACT_HELPER(opc1, 26, 6);
> +/* Opcode part 2 */
> +EXTRACT_HELPER(opc2, 1, 5);
> +/* Opcode part 3 */
> +EXTRACT_HELPER(opc3, 6, 5);
> +/* Opcode part 4 */
> +EXTRACT_HELPER(opc4, 16, 5);
> +/* Update Cr0 flags */
> +EXTRACT_HELPER(Rc, 0, 1);
> +/* Update Cr6 flags (Altivec) */
> +EXTRACT_HELPER(Rc21, 10, 1);
> +/* Destination */
> +EXTRACT_HELPER(rD, 21, 5);
> +/* Source */
> +EXTRACT_HELPER(rS, 21, 5);
> +/* First operand */
> +EXTRACT_HELPER(rA, 16, 5);
> +/* Second operand */
> +EXTRACT_HELPER(rB, 11, 5);
> +/* Third operand */
> +EXTRACT_HELPER(rC, 6, 5);
> +/***                               Get CRn                                 ***/
> +EXTRACT_HELPER(crfD, 23, 3);
> +EXTRACT_HELPER(BF, 23, 3);
> +EXTRACT_HELPER(crfS, 18, 3);
> +EXTRACT_HELPER(crbD, 21, 5);
> +EXTRACT_HELPER(crbA, 16, 5);
> +EXTRACT_HELPER(crbB, 11, 5);
> +/* SPR / TBL */
> +EXTRACT_HELPER(_SPR, 11, 10);
> +static inline uint32_t SPR(uint32_t opcode)
> +{
> +    uint32_t sprn = _SPR(opcode);
> +
> +    return ((sprn >> 5) & 0x1F) | ((sprn & 0x1F) << 5);
> +}
> +/***                              Get constants                            ***/
> +/* 16 bits signed immediate value */
> +EXTRACT_SHELPER(SIMM, 0, 16);
> +/* 16 bits unsigned immediate value */
> +EXTRACT_HELPER(UIMM, 0, 16);
> +/* 5 bits signed immediate value */
> +EXTRACT_HELPER(SIMM5, 16, 5);
> +/* 5 bits signed immediate value */
> +EXTRACT_HELPER(UIMM5, 16, 5);
> +/* 4 bits unsigned immediate value */
> +EXTRACT_HELPER(UIMM4, 16, 4);
> +/* Bit count */
> +EXTRACT_HELPER(NB, 11, 5);
> +/* Shift count */
> +EXTRACT_HELPER(SH, 11, 5);
> +/* Vector shift count */
> +EXTRACT_HELPER(VSH, 6, 4);
> +/* Mask start */
> +EXTRACT_HELPER(MB, 6, 5);
> +/* Mask end */
> +EXTRACT_HELPER(ME, 1, 5);
> +/* Trap operand */
> +EXTRACT_HELPER(TO, 21, 5);
> +
> +EXTRACT_HELPER(CRM, 12, 8);
> +
> +#ifndef CONFIG_USER_ONLY
> +EXTRACT_HELPER(SR, 16, 4);
> +#endif
> +
> +/* mtfsf/mtfsfi */
> +EXTRACT_HELPER(FPBF, 23, 3);
> +EXTRACT_HELPER(FPIMM, 12, 4);
> +EXTRACT_HELPER(FPL, 25, 1);
> +EXTRACT_HELPER(FPFLM, 17, 8);
> +EXTRACT_HELPER(FPW, 16, 1);
> +
> +/* addpcis */
> +EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
> +#if defined(TARGET_PPC64)
> +/* darn */
> +EXTRACT_HELPER(L, 16, 2);
> +#endif
> +
> +/***                            Jump target decoding                       ***/
> +/* Immediate address */
> +static inline target_ulong LI(uint32_t opcode)
> +{
> +    return (opcode >> 0) & 0x03FFFFFC;
> +}
> +
> +static inline uint32_t BD(uint32_t opcode)
> +{
> +    return (opcode >> 0) & 0xFFFC;
> +}
> +
> +EXTRACT_HELPER(BO, 21, 5);
> +EXTRACT_HELPER(BI, 16, 5);
> +/* Absolute/relative address */
> +EXTRACT_HELPER(AA, 1, 1);
> +/* Link */
> +EXTRACT_HELPER(LK, 0, 1);
> +
> +/* DFP Z22-form */
> +EXTRACT_HELPER(DCM, 10, 6)
> +
> +/* DFP Z23-form */
> +EXTRACT_HELPER(RMC, 9, 2)
> +
> +EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5);
> +EXTRACT_HELPER_SPLIT(xS, 0, 1, 21, 5);
> +EXTRACT_HELPER_SPLIT(xA, 2, 1, 16, 5);
> +EXTRACT_HELPER_SPLIT(xB, 1, 1, 11, 5);
> +EXTRACT_HELPER_SPLIT(xC, 3, 1,  6, 5);
> +EXTRACT_HELPER(DM, 8, 2);
> +EXTRACT_HELPER(UIM, 16, 2);
> +EXTRACT_HELPER(SHW, 8, 2);
> +EXTRACT_HELPER(SP, 19, 2);
> +EXTRACT_HELPER(IMM8, 11, 8);
> +
>  #endif /* PPC_INTERNAL_H */
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 59e9552..6bdc433 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -422,157 +422,6 @@ typedef struct opcode_t {
>  
>  #define CHK_NONE
>  
> -
> -/*****************************************************************************/
> -/***                           Instruction decoding                        ***/
> -#define EXTRACT_HELPER(name, shift, nb)                                       \
> -static inline uint32_t name(uint32_t opcode)                                  \
> -{                                                                             \
> -    return (opcode >> (shift)) & ((1 << (nb)) - 1);                           \
> -}
> -
> -#define EXTRACT_SHELPER(name, shift, nb)                                      \
> -static inline int32_t name(uint32_t opcode)                                   \
> -{                                                                             \
> -    return (int16_t)((opcode >> (shift)) & ((1 << (nb)) - 1));                \
> -}
> -
> -#define EXTRACT_HELPER_SPLIT(name, shift1, nb1, shift2, nb2)                  \
> -static inline uint32_t name(uint32_t opcode)                                  \
> -{                                                                             \
> -    return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |             \
> -            ((opcode >> (shift2)) & ((1 << (nb2)) - 1));                      \
> -}
> -
> -#define EXTRACT_HELPER_DXFORM(name,                                           \
> -                              d0_bits, shift_op_d0, shift_d0,                 \
> -                              d1_bits, shift_op_d1, shift_d1,                 \
> -                              d2_bits, shift_op_d2, shift_d2)                 \
> -static inline int16_t name(uint32_t opcode)                                   \
> -{                                                                             \
> -    return                                                                    \
> -        (((opcode >> (shift_op_d0)) & ((1 << (d0_bits)) - 1)) << (shift_d0)) | \
> -        (((opcode >> (shift_op_d1)) & ((1 << (d1_bits)) - 1)) << (shift_d1)) | \
> -        (((opcode >> (shift_op_d2)) & ((1 << (d2_bits)) - 1)) << (shift_d2));  \
> -}
> -
> -
> -/* Opcode part 1 */
> -EXTRACT_HELPER(opc1, 26, 6);
> -/* Opcode part 2 */
> -EXTRACT_HELPER(opc2, 1, 5);
> -/* Opcode part 3 */
> -EXTRACT_HELPER(opc3, 6, 5);
> -/* Opcode part 4 */
> -EXTRACT_HELPER(opc4, 16, 5);
> -/* Update Cr0 flags */
> -EXTRACT_HELPER(Rc, 0, 1);
> -/* Update Cr6 flags (Altivec) */
> -EXTRACT_HELPER(Rc21, 10, 1);
> -/* Destination */
> -EXTRACT_HELPER(rD, 21, 5);
> -/* Source */
> -EXTRACT_HELPER(rS, 21, 5);
> -/* First operand */
> -EXTRACT_HELPER(rA, 16, 5);
> -/* Second operand */
> -EXTRACT_HELPER(rB, 11, 5);
> -/* Third operand */
> -EXTRACT_HELPER(rC, 6, 5);
> -/***                               Get CRn                                 ***/
> -EXTRACT_HELPER(crfD, 23, 3);
> -EXTRACT_HELPER(crfS, 18, 3);
> -EXTRACT_HELPER(crbD, 21, 5);
> -EXTRACT_HELPER(crbA, 16, 5);
> -EXTRACT_HELPER(crbB, 11, 5);
> -/* SPR / TBL */
> -EXTRACT_HELPER(_SPR, 11, 10);
> -static inline uint32_t SPR(uint32_t opcode)
> -{
> -    uint32_t sprn = _SPR(opcode);
> -
> -    return ((sprn >> 5) & 0x1F) | ((sprn & 0x1F) << 5);
> -}
> -/***                              Get constants                            ***/
> -/* 16 bits signed immediate value */
> -EXTRACT_SHELPER(SIMM, 0, 16);
> -/* 16 bits unsigned immediate value */
> -EXTRACT_HELPER(UIMM, 0, 16);
> -/* 5 bits signed immediate value */
> -EXTRACT_HELPER(SIMM5, 16, 5);
> -/* 5 bits signed immediate value */
> -EXTRACT_HELPER(UIMM5, 16, 5);
> -/* 4 bits unsigned immediate value */
> -EXTRACT_HELPER(UIMM4, 16, 4);
> -/* Bit count */
> -EXTRACT_HELPER(NB, 11, 5);
> -/* Shift count */
> -EXTRACT_HELPER(SH, 11, 5);
> -/* Vector shift count */
> -EXTRACT_HELPER(VSH, 6, 4);
> -/* Mask start */
> -EXTRACT_HELPER(MB, 6, 5);
> -/* Mask end */
> -EXTRACT_HELPER(ME, 1, 5);
> -/* Trap operand */
> -EXTRACT_HELPER(TO, 21, 5);
> -
> -EXTRACT_HELPER(CRM, 12, 8);
> -
> -#ifndef CONFIG_USER_ONLY
> -EXTRACT_HELPER(SR, 16, 4);
> -#endif
> -
> -/* mtfsf/mtfsfi */
> -EXTRACT_HELPER(FPBF, 23, 3);
> -EXTRACT_HELPER(FPIMM, 12, 4);
> -EXTRACT_HELPER(FPL, 25, 1);
> -EXTRACT_HELPER(FPFLM, 17, 8);
> -EXTRACT_HELPER(FPW, 16, 1);
> -
> -/* addpcis */
> -EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
> -#if defined(TARGET_PPC64)
> -/* darn */
> -EXTRACT_HELPER(L, 16, 2);
> -#endif
> -
> -/***                            Jump target decoding                       ***/
> -/* Immediate address */
> -static inline target_ulong LI(uint32_t opcode)
> -{
> -    return (opcode >> 0) & 0x03FFFFFC;
> -}
> -
> -static inline uint32_t BD(uint32_t opcode)
> -{
> -    return (opcode >> 0) & 0xFFFC;
> -}
> -
> -EXTRACT_HELPER(BO, 21, 5);
> -EXTRACT_HELPER(BI, 16, 5);
> -/* Absolute/relative address */
> -EXTRACT_HELPER(AA, 1, 1);
> -/* Link */
> -EXTRACT_HELPER(LK, 0, 1);
> -
> -/* DFP Z22-form */
> -EXTRACT_HELPER(DCM, 10, 6)
> -
> -/* DFP Z23-form */
> -EXTRACT_HELPER(RMC, 9, 2)
> -
> -EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5);
> -EXTRACT_HELPER_SPLIT(xS, 0, 1, 21, 5);
> -EXTRACT_HELPER_SPLIT(xA, 2, 1, 16, 5);
> -EXTRACT_HELPER_SPLIT(xB, 1, 1, 11, 5);
> -EXTRACT_HELPER_SPLIT(xC, 3, 1,  6, 5);
> -EXTRACT_HELPER(DM, 8, 2);
> -EXTRACT_HELPER(UIM, 16, 2);
> -EXTRACT_HELPER(SHW, 8, 2);
> -EXTRACT_HELPER(SP, 19, 2);
> -EXTRACT_HELPER(IMM8, 11, 8);
> -
>  /*****************************************************************************/
>  /* PowerPC instructions table                                                */
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH 2/9] target-ppc: Fix xscmpodp and xscmpudp instructions
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 2/9] target-ppc: Fix xscmpodp and xscmpudp instructions Nikunj A Dadhania
@ 2016-11-23  4:01   ` David Gibson
  2016-11-23  5:40     ` Bharata B Rao
  0 siblings, 1 reply; 20+ messages in thread
From: David Gibson @ 2016-11-23  4:01 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata

On Tue, Nov 22, 2016 at 05:15:58PM +0530, Nikunj A Dadhania wrote:
> From: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> - xscmpodp & xscmpudp are missing flags reset.
> - In xscmpodp, VXCC should be set only if VE is 0 for signalling NaN case
>   and VXCC should be set by explicitly checking for quiet NaN case.
> - Comparison is being done only if the operands are not NaNs. However as
>   per ISA, it should be done even when operands are NaNs.

For my interest, can you explain the difference between ordered and
unordered comparisons?  I looked at the ISA and mostly just became
confused.

> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target-ppc/fpu_helper.c | 41 +++++++++++++++++++++++++----------------
>  1 file changed, 25 insertions(+), 16 deletions(-)
> 
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index d3741b4..3027003 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2410,29 +2410,38 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
>  {                                                                        \
>      ppc_vsr_t xa, xb;                                                    \
>      uint32_t cc = 0;                                                     \
> +    bool vxsnan_flag = false, vxvc_flag = false;                         \
>                                                                           \
> +    helper_reset_fpstatus(env);                                          \
>      getVSR(xA(opcode), &xa, env);                                        \
>      getVSR(xB(opcode), &xb, env);                                        \
>                                                                           \
> -    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||                       \
> -                 float64_is_any_nan(xb.VsrD(0)))) {                      \
> -        if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||     \
> -            float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {     \
> -            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);       \
> -        }                                                                \
> -        if (ordered) {                                                   \
> -            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);         \
> +    if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||         \
> +        float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {         \
> +        vxsnan_flag = true;                                              \
> +        cc = 1;                                                          \
> +        if (fpscr_ve == 0 && ordered) {                                  \
> +            vxvc_flag = true;                                            \
>          }                                                                \
> +    } else if ((float64_is_quiet_nan(xa.VsrD(0), &env->fp_status) ||     \
> +                float64_is_quiet_nan(xb.VsrD(0), &env->fp_status))       \
> +               && ordered) {                                             \
>          cc = 1;                                                          \

Since you're basically rewriting this, could you please change it to
use symbolic constants for the CC bits, which will make it easier to
follow.
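
Something along these lines is what I have in mind. The names are
hypothetical (existing definitions should be reused if there are suitable
ones), and the sketch uses host doubles instead of softfloat purely to
stay self-contained.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical names; the values are the literals used in the helper. */
enum {
    CC_LT = 8,  /* xa <  xb */
    CC_GT = 4,  /* xa >  xb */
    CC_EQ = 2,  /* xa == xb */
    CC_UN = 1,  /* unordered: at least one operand is a NaN */
};

/* Plain C model of the condition code selection, not the QEMU helper. */
static uint32_t compare_cc(double xa, double xb)
{
    if (isnan(xa) || isnan(xb)) {
        return CC_UN;
    }
    if (xa < xb) {
        return CC_LT;
    }
    if (xa > xb) {
        return CC_GT;
    }
    return CC_EQ;
}
```

With named bits, "cc = 1" reads as "unordered" without having to look up
the CR field layout.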

> +        vxvc_flag = true;                                                \
> +    }                                                                    \
> +    if (vxsnan_flag) {                                                   \
> +        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);           \
> +    }                                                                    \
> +    if (vxvc_flag) {                                                     \
> +        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);             \
> +    }                                                                    \
> +                                                                         \
> +    if (float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {           \
> +        cc |= 8;                                                         \
> +    } else if (!float64_le(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {   \
> +        cc |= 4;                                                         \
>      } else {                                                             \
> -        if (float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {       \
> -            cc = 8;                                                      \
> -        } else if (!float64_le(xa.VsrD(0), xb.VsrD(0),                   \
> -                               &env->fp_status)) { \
> -            cc = 4;                                                      \
> -        } else {                                                         \
> -            cc = 2;                                                      \
> -        }                                                                \
> +        cc |= 2;                                                         \
>      }                                                                    \
>                                                                           \
>      env->fpscr &= ~(0x0F << FPSCR_FPRF);                                 \

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH 5/9] target-ppc: implement lxsd and lxssp instructions
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 5/9] target-ppc: implement lxsd and lxssp instructions Nikunj A Dadhania
@ 2016-11-23  4:06   ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2016-11-23  4:06 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata

On Tue, Nov 22, 2016 at 05:16:01PM +0530, Nikunj A Dadhania wrote:
> lxsd: Load VSX Scalar Dword
> lxssp: Load VSX Scalar Single
> 
> Moreover, DS-Form instructions share the same primary opcode; bits
> 30:31 are used to decode the instruction. Use a common routine to decode
> primary opcode (0x39) - DS-Form instructions and branch out depending on
> bits 30:31.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
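
For anyone following along, the reason bits 30:31 are free to act as an
opcode extension is that the DS-form displacement is always a multiple of
four. A small sketch of how the effective displacement falls out,
assuming the 0x03 mask passed to gen_addr_imm_index() clears those two
bits from the signed 16-bit immediate (the function name here is
hypothetical):

```c
#include <stdint.h>

/* Displacement of a DS-form instruction word, as a signed byte offset. */
static int32_t ds_form_displacement(uint32_t opcode)
{
    /* Take the low 16 bits as a signed immediate ... */
    int16_t simm = (int16_t)(opcode & 0xFFFF);

    /* ... then clear the two bits that select the instruction. */
    return simm & ~0x03;
}
```

So a word whose low half is 0x0012 is lxsd (low bits 0b10) with a
displacement of 16 bytes.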

> ---
>  target-ppc/translate.c              | 25 +++++++++++++++++++++++++
>  target-ppc/translate/fp-ops.inc.c   |  1 -
>  target-ppc/translate/vsx-impl.inc.c | 21 +++++++++++++++++++++
>  3 files changed, 46 insertions(+), 1 deletion(-)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 6bdc433..f280851 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -6053,6 +6053,29 @@ GEN_TM_PRIV_NOOP(trechkpt);
>  
>  #include "translate/spe-impl.inc.c"
>  
> +/* Handles lfdp, lxsd, lxssp */
> +static void gen_dform39(DisasContext *ctx)
> +{
> +    switch (ctx->opcode & 0x3) {
> +    case 0: /* lfdp */
> +        if (ctx->insns_flags2 & PPC2_ISA205) {
> +            return gen_lfdp(ctx);
> +        }
> +        break;
> +    case 2: /* lxsd */
> +        if (ctx->insns_flags2 & PPC2_ISA300) {
> +            return gen_lxsd(ctx);
> +        }
> +        break;
> +    case 3: /* lxssp */
> +        if (ctx->insns_flags2 & PPC2_ISA300) {
> +            return gen_lxssp(ctx);
> +        }
> +        break;
> +    }
> +    return gen_invalid(ctx);
> +}
> +
>  static opcode_t opcodes[] = {
>  GEN_HANDLER(invalid, 0x00, 0x00, 0x00, 0xFFFFFFFF, PPC_NONE),
>  GEN_HANDLER(cmp, 0x1F, 0x00, 0x00, 0x00400000, PPC_INTEGER),
> @@ -6125,6 +6148,8 @@ GEN_HANDLER(ld, 0x3A, 0xFF, 0xFF, 0x00000000, PPC_64B),
>  GEN_HANDLER(lq, 0x38, 0xFF, 0xFF, 0x00000000, PPC_64BX),
>  GEN_HANDLER(std, 0x3E, 0xFF, 0xFF, 0x00000000, PPC_64B),
>  #endif
> +/* handles lfdp, lxsd, lxssp */
> +GEN_HANDLER_E(dform39, 0x39, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
>  GEN_HANDLER(lmw, 0x2E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
>  GEN_HANDLER(stmw, 0x2F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
>  GEN_HANDLER(lswi, 0x1F, 0x15, 0x12, 0x00000001, PPC_STRING),
> diff --git a/target-ppc/translate/fp-ops.inc.c b/target-ppc/translate/fp-ops.inc.c
> index d36ab4e..3127fa0 100644
> --- a/target-ppc/translate/fp-ops.inc.c
> +++ b/target-ppc/translate/fp-ops.inc.c
> @@ -68,7 +68,6 @@ GEN_LDFS(lfd, ld64, 0x12, PPC_FLOAT)
>  GEN_LDFS(lfs, ld32fs, 0x10, PPC_FLOAT)
>  GEN_HANDLER_E(lfiwax, 0x1f, 0x17, 0x1a, 0x00000001, PPC_NONE, PPC2_ISA205),
>  GEN_HANDLER_E(lfiwzx, 0x1f, 0x17, 0x1b, 0x1, PPC_NONE, PPC2_FP_CVT_ISA206),
> -GEN_HANDLER_E(lfdp, 0x39, 0xFF, 0xFF, 0x00200003, PPC_NONE, PPC2_ISA205),
>  GEN_HANDLER_E(lfdpx, 0x1F, 0x17, 0x18, 0x00200001, PPC_NONE, PPC2_ISA205),
>  
>  #define GEN_STF(name, stop, opc, type)                                        \
> diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
> index ed9588e..1d7cd23 100644
> --- a/target-ppc/translate/vsx-impl.inc.c
> +++ b/target-ppc/translate/vsx-impl.inc.c
> @@ -190,6 +190,27 @@ static void gen_lxvb16x(DisasContext *ctx)
>      tcg_temp_free(EA);
>  }
>  
> +#define VSX_LOAD_SCALAR_DS(name, operation)                       \
> +static void gen_##name(DisasContext *ctx)                         \
> +{                                                                 \
> +    TCGv EA;                                                      \
> +    TCGv_i64 xth = cpu_vsrh(rD(ctx->opcode) + 32);                \
> +                                                                  \
> +    if (unlikely(!ctx->altivec_enabled)) {                        \
> +        gen_exception(ctx, POWERPC_EXCP_VPU);                     \
> +        return;                                                   \
> +    }                                                             \
> +    gen_set_access_type(ctx, ACCESS_INT);                         \
> +    EA = tcg_temp_new();                                          \
> +    gen_addr_imm_index(ctx, EA, 0x03);                            \
> +    gen_qemu_##operation(ctx, xth, EA);                           \
> +    /* NOTE: cpu_vsrl is undefined */                             \
> +    tcg_temp_free(EA);                                            \
> +}
> +
> +VSX_LOAD_SCALAR_DS(lxsd, ld64_i64)
> +VSX_LOAD_SCALAR_DS(lxssp, ld32fs)
> +
>  #define VSX_STORE_SCALAR(name, operation)                     \
>  static void gen_##name(DisasContext *ctx)                     \
>  {                                                             \

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 3/9] target-ppc: Add xscmpexp[dp, qp] instructions
  2016-11-22 11:45 ` [Qemu-devel] [PATCH 3/9] target-ppc: Add xscmpexp[dp, qp] instructions Nikunj A Dadhania
@ 2016-11-23  4:06   ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2016-11-23  4:06 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata

[-- Attachment #1: Type: text/plain, Size: 5796 bytes --]

On Tue, Nov 22, 2016 at 05:15:59PM +0530, Nikunj A Dadhania wrote:
> From: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> xscmpexpdp: VSX Scalar Compare Exponents Double-Precision
> xscmpexpqp: VSX Scalar Compare Exponents Quad-Precision
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target-ppc/fpu_helper.c             | 64 +++++++++++++++++++++++++++++++++++++
>  target-ppc/helper.h                 |  2 ++
>  target-ppc/translate/vsx-impl.inc.c |  2 ++
>  target-ppc/translate/vsx-ops.inc.c  |  6 ++++
>  4 files changed, 74 insertions(+)
> 
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index 3027003..b1c5a07 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2405,6 +2405,70 @@ VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
>  VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
>  VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
>  
> +void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode)
> +{
> +    ppc_vsr_t xa, xb;
> +    int64_t exp_a, exp_b;
> +    uint32_t cc;
> +
> +    getVSR(xA(opcode), &xa, env);
> +    getVSR(xB(opcode), &xb, env);
> +
> +    exp_a = extract64(xa.VsrD(0), 52, 11);
> +    exp_b = extract64(xb.VsrD(0), 52, 11);
> +
> +    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||
> +                 float64_is_any_nan(xb.VsrD(0)))) {
> +        cc = 1;

Please use symbolic constants here.
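
A minimal sketch of what that could look like, assuming hypothetical
names (CC_LT, CC_GT, CC_EQ, CC_UN and cmp_exp_cc are illustrative only,
not existing QEMU definitions). The values mirror the 8/4/2/1 literals
used in the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the 4-bit condition code written to CR[BF]
 * and FPCC; the values match the literals 8, 4, 2, 1 in the patch. */
enum {
    CC_LT = 8,  /* first operand's exponent is less than the second's */
    CC_GT = 4,  /* first operand's exponent is greater */
    CC_EQ = 2,  /* exponents are equal */
    CC_UN = 1   /* unordered: at least one operand is a NaN */
};

/* Hypothetical shared routine: pick the condition code from the two
 * biased exponents and a flag saying whether either operand is a NaN. */
static uint32_t cmp_exp_cc(int64_t exp_a, int64_t exp_b, bool any_nan)
{
    if (any_nan) {
        return CC_UN;
    }
    if (exp_a < exp_b) {
        return CC_LT;
    }
    if (exp_a > exp_b) {
        return CC_GT;
    }
    return CC_EQ;
}
```

With something along these lines, the DP and QP helpers could share one
routine for choosing the condition code.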

> +    } else {
> +        if (exp_a < exp_b) {
> +            cc = 8;
> +        } else if (exp_a > exp_b) {
> +            cc = 4;
> +        } else {
> +            cc = 2;
> +        }
> +    }
> +
> +    env->fpscr &= ~(0x0F << FPSCR_FPRF);
> +    env->fpscr |= cc << FPSCR_FPRF;
> +    env->crf[BF(opcode)] = cc;
> +
> +    helper_float_check_status(env);
> +}
> +
> +void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode)
> +{
> +    ppc_vsr_t xa, xb;
> +    int64_t exp_a, exp_b;
> +    uint32_t cc;
> +
> +    getVSR(rA(opcode) + 32, &xa, env);
> +    getVSR(rB(opcode) + 32, &xb, env);
> +
> +    exp_a = extract64(xa.VsrD(0), 48, 15);
> +    exp_b = extract64(xb.VsrD(0), 48, 15);
> +
> +    if (unlikely(float128_is_any_nan(make_float128(xa.VsrD(0), xa.VsrD(1))) ||
> +                 float128_is_any_nan(make_float128(xb.VsrD(0), xb.VsrD(1))))) {
> +        cc = 1;
> +    } else {
> +        if (exp_a < exp_b) {
> +            cc = 8;
> +        } else if (exp_a > exp_b) {
> +            cc = 4;
> +        } else {
> +            cc = 2;
> +        }
> +    }
> +
> +    env->fpscr &= ~(0x0F << FPSCR_FPRF);
> +    env->fpscr |= cc << FPSCR_FPRF;
> +    env->crf[BF(opcode)] = cc;
> +
> +    helper_float_check_status(env);
> +}
> +
>  #define VSX_SCALAR_CMP(op, ordered)                                      \
>  void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
>  {                                                                        \
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index da00f0a..ba42015 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -404,6 +404,8 @@ DEF_HELPER_2(xscmpeqdp, void, env, i32)
>  DEF_HELPER_2(xscmpgtdp, void, env, i32)
>  DEF_HELPER_2(xscmpgedp, void, env, i32)
>  DEF_HELPER_2(xscmpnedp, void, env, i32)
> +DEF_HELPER_2(xscmpexpdp, void, env, i32)
> +DEF_HELPER_2(xscmpexpqp, void, env, i32)
>  DEF_HELPER_2(xscmpodp, void, env, i32)
>  DEF_HELPER_2(xscmpudp, void, env, i32)
>  DEF_HELPER_2(xsmaxdp, void, env, i32)
> diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
> index 5a27be4..5206258 100644
> --- a/target-ppc/translate/vsx-impl.inc.c
> +++ b/target-ppc/translate/vsx-impl.inc.c
> @@ -624,6 +624,8 @@ GEN_VSX_HELPER_2(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
>  GEN_VSX_HELPER_2(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
>  GEN_VSX_HELPER_2(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
>  GEN_VSX_HELPER_2(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
> +GEN_VSX_HELPER_2(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
> +GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
>  GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
> diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
> index 3d91041..2468ee9 100644
> --- a/target-ppc/translate/vsx-ops.inc.c
> +++ b/target-ppc/translate/vsx-ops.inc.c
> @@ -83,6 +83,10 @@ GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x01, opc3|0x0C, 0, PPC_NONE, PPC2_VSX),\
>  GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x02, opc3|0x0C, 0, PPC_NONE, PPC2_VSX),\
>  GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x03, opc3|0x0C, 0, PPC_NONE, PPC2_VSX)
>  
> +#define GEN_VSX_XFORM_300(name, opc2, opc3, inval) \
> +GEN_HANDLER_E(name, 0x3F, opc2, opc3, inval, PPC_NONE, PPC2_ISA300)
> +
> +
>  GEN_XX2FORM(xsabsdp, 0x12, 0x15, PPC2_VSX),
>  GEN_XX2FORM(xsnabsdp, 0x12, 0x16, PPC2_VSX),
>  GEN_XX2FORM(xsnegdp, 0x12, 0x17, PPC2_VSX),
> @@ -118,6 +122,8 @@ GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
>  GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
>  GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
>  GEN_XX3FORM(xscmpnedp, 0x0C, 0x03, PPC2_ISA300),
> +GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
> +GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
>  GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
>  GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
>  GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 4/9] target-ppc: Add xscmpoqp and xscmpuqp instructions
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 4/9] target-ppc: Add xscmpoqp and xscmpuqp instructions Nikunj A Dadhania
@ 2016-11-23  4:06   ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2016-11-23  4:06 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata

[-- Attachment #1: Type: text/plain, Size: 7115 bytes --]

On Tue, Nov 22, 2016 at 05:16:00PM +0530, Nikunj A Dadhania wrote:
> From: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> xscmpoqp - VSX Scalar Compare Ordered Quad-Precision
> xscmpuqp - VSX Scalar Compare Unordered Quad-Precision
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target-ppc/fpu_helper.c             | 52 +++++++++++++++++++++++++++++++++++++
>  target-ppc/helper.h                 |  2 ++
>  target-ppc/translate/vsx-impl.inc.c |  2 ++
>  target-ppc/translate/vsx-ops.inc.c  |  2 ++
>  4 files changed, 58 insertions(+)
> 
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index b1c5a07..28c1fea 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2518,6 +2518,58 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
>  VSX_SCALAR_CMP(xscmpodp, 1)
>  VSX_SCALAR_CMP(xscmpudp, 0)
>  
> +#define VSX_SCALAR_CMPQ(op, ordered)                                    \
> +void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
> +{                                                                       \
> +    ppc_vsr_t xa, xb;                                                   \
> +    uint32_t cc = 0;                                                    \
> +    bool vxsnan_flag = false, vxvc_flag = false;                        \
> +    float128 a, b;                                                      \
> +                                                                        \
> +    helper_reset_fpstatus(env);                                         \
> +    getVSR(rA(opcode) + 32, &xa, env);                                  \
> +    getVSR(rB(opcode) + 32, &xb, env);                                  \
> +                                                                        \
> +    a = make_float128(xa.VsrD(0), xa.VsrD(1));                          \
> +    b = make_float128(xb.VsrD(0), xb.VsrD(1));                          \
> +                                                                        \
> +    if (float128_is_signaling_nan(a, &env->fp_status) ||                \
> +        float128_is_signaling_nan(b, &env->fp_status)) {                \
> +        vxsnan_flag = true;                                             \
> +        cc = 1;                                                         \
> +        if (fpscr_ve == 0 && ordered) {                                 \
> +            vxvc_flag = true;                                           \
> +        }                                                               \
> +    } else if (ordered && (float128_is_quiet_nan(a, &env->fp_status)    \
> +                           || float128_is_quiet_nan(b, &env->fp_status))) { \
> +        cc = 1;                                                         \

Please use symbolic constants for the CC bits.

> +        vxvc_flag = true;                                               \
> +    }                                                                   \
> +    if (vxsnan_flag) {                                                  \
> +        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);          \
> +    }                                                                   \
> +    if (vxvc_flag) {                                                    \
> +        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);            \
> +    }                                                                   \
> +                                                                        \
> +    if (float128_lt(a, b, &env->fp_status)) {                           \
> +        cc |= 8;                                                        \
> +    } else if (!float128_le(a, b, &env->fp_status)) {                   \
> +        cc |= 4;                                                        \
> +    } else {                                                            \
> +        cc |= 2;                                                        \
> +    }                                                                   \
> +                                                                        \
> +    env->fpscr &= ~(0x0F << FPSCR_FPRF);                                \
> +    env->fpscr |= cc << FPSCR_FPRF;                                     \
> +    env->crf[BF(opcode)] = cc;                                          \
> +                                                                        \
> +    float_check_status(env);                                            \
> +}
> +
> +VSX_SCALAR_CMPQ(xscmpoqp, 1)
> +VSX_SCALAR_CMPQ(xscmpuqp, 0)
> +
>  /* VSX_MAX_MIN - VSX floating point maximum/minimum
>   *   name  - instruction mnemonic
>   *   op    - operation (max or min)
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index ba42015..3b26678 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -408,6 +408,8 @@ DEF_HELPER_2(xscmpexpdp, void, env, i32)
>  DEF_HELPER_2(xscmpexpqp, void, env, i32)
>  DEF_HELPER_2(xscmpodp, void, env, i32)
>  DEF_HELPER_2(xscmpudp, void, env, i32)
> +DEF_HELPER_2(xscmpoqp, void, env, i32)
> +DEF_HELPER_2(xscmpuqp, void, env, i32)
>  DEF_HELPER_2(xsmaxdp, void, env, i32)
>  DEF_HELPER_2(xsmindp, void, env, i32)
>  DEF_HELPER_2(xscvdpsp, void, env, i32)
> diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
> index 5206258..ed9588e 100644
> --- a/target-ppc/translate/vsx-impl.inc.c
> +++ b/target-ppc/translate/vsx-impl.inc.c
> @@ -628,6 +628,8 @@ GEN_VSX_HELPER_2(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
>  GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
>  GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
> +GEN_VSX_HELPER_2(xscmpoqp, 0x04, 0x04, 0, PPC2_VSX)
> +GEN_VSX_HELPER_2(xscmpuqp, 0x04, 0x14, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
>  GEN_VSX_HELPER_2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
> diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
> index 2468ee9..7f09527 100644
> --- a/target-ppc/translate/vsx-ops.inc.c
> +++ b/target-ppc/translate/vsx-ops.inc.c
> @@ -126,6 +126,8 @@ GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
>  GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
>  GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
>  GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
> +GEN_VSX_XFORM_300(xscmpoqp, 0x04, 0x04, 0x00600001),
> +GEN_VSX_XFORM_300(xscmpuqp, 0x04, 0x14, 0x00600001),
>  GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
>  GEN_XX3FORM(xsmindp, 0x00, 0x15, PPC2_VSX),
>  GEN_XX2FORM(xscvdpsp, 0x12, 0x10, PPC2_VSX),


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 8/9] target-ppc: add vextu[bhw]lx instructions
  2016-11-22 11:46 ` [Qemu-devel] [PATCH 8/9] target-ppc: add vextu[bhw]lx instructions Nikunj A Dadhania
@ 2016-11-23  4:11   ` David Gibson
  2016-11-23  4:48     ` Nikunj A Dadhania
  0 siblings, 1 reply; 20+ messages in thread
From: David Gibson @ 2016-11-23  4:11 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata, Avinesh Kumar

[-- Attachment #1: Type: text/plain, Size: 8304 bytes --]

On Tue, Nov 22, 2016 at 05:16:04PM +0530, Nikunj A Dadhania wrote:
> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
> 
> vextublx:  Vector Extract Unsigned Byte Left
> vextuhlx:  Vector Extract Unsigned Halfword Left
> vextuwlx:  Vector Extract Unsigned Word Left
> 
> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target-ppc/helper.h                 |  3 ++
>  target-ppc/int_helper.c             | 63 +++++++++++++++++++++++++++++++++++++
>  target-ppc/translate/vmx-impl.inc.c | 18 +++++++++++
>  target-ppc/translate/vmx-ops.inc.c  |  4 ++-
>  4 files changed, 87 insertions(+), 1 deletion(-)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 3b26678..d0a8fb2 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -366,6 +366,9 @@ DEF_HELPER_3(vpmsumb, void, avr, avr, avr)
>  DEF_HELPER_3(vpmsumh, void, avr, avr, avr)
>  DEF_HELPER_3(vpmsumw, void, avr, avr, avr)
>  DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
> +DEF_HELPER_2(vextublx, tl, tl, avr)
> +DEF_HELPER_2(vextuhlx, tl, tl, avr)
> +DEF_HELPER_2(vextuwlx, tl, tl, avr)
>  
>  DEF_HELPER_2(vsbox, void, avr, avr)
>  DEF_HELPER_3(vcipher, void, avr, avr, avr)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 8886a72..fb9f178 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -1805,6 +1805,69 @@ void helper_vlogefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
>      }
>  }
>  
> +#define EXTRACT128(value, start, length)        \
> +    ((value >> start) & (~(__uint128_t)0 >> (128 - length)))

Although we do use 128-bit arithmetic in some places in qemu, I don't
think we assume the presence of a working __uint128_t type.  Better to
actually write a helper function which does this in terms of 64-bit
arithmetic, I think.
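
As a rough sketch of the idea, assuming the 128-bit value is available
as two 64-bit halves (the name extract128_64 and its signature are
hypothetical, not an existing QEMU function):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract `length` bits (1..64) starting at bit
 * `start` (0 = least significant bit) from a 128-bit value passed as
 * two 64-bit halves.  Bits requested beyond bit 127 read as zero. */
static uint64_t extract128_64(uint64_t hi, uint64_t lo,
                              unsigned start, unsigned length)
{
    uint64_t r;
    uint64_t mask;

    assert(start < 128);
    assert(length >= 1 && length <= 64);

    /* Avoid shifting a 64-bit value by 64, which is undefined in C. */
    mask = (length == 64) ? ~(uint64_t)0 : (((uint64_t)1 << length) - 1);

    if (start >= 64) {
        r = hi >> (start - 64);
    } else if (start == 0) {
        r = lo;
    } else {
        /* The field may straddle the boundary between the halves. */
        r = (lo >> start) | (hi << (64 - start));
    }
    return r & mask;
}
```

Since the result only has to fit in a target_ulong, limiting the field
to 64 bits should be enough for the byte, halfword and word cases.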

> +
> +#if defined(HOST_WORDS_BIGENDIAN)
> +#  if defined(CONFIG_INT128)
> +#  define VEXTULX_DO(name, elem)                                \
> +target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \

It seems a bit odd to need helpers for what's essentially just copying
a byte/halfword/whatever out of the vector.

> +{                                                               \
> +    target_ulong r = 0;                                         \
> +    int index = (a & 0xf) * 8;                                  \
> +    r = EXTRACT128(b->u128, index, elem * 8);                   \
> +    return r;                                                   \
> +}
> +#  else
> +#  define VEXTULX_DO(name, elem)                                \
> +target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
> +{                                                               \
> +    target_ulong r = 0;                                         \
> +    int i;                                                      \
> +    int index = a & 0xf;                                        \
> +    for (i = 0; i < elem; i++) {                                \
> +        r = r << 8;                                             \
> +        if (index + i <= 15) {                                  \
> +            r = r | b->u8[index + i];                           \
> +        }                                                       \
> +    }                                                           \
> +    return r;                                                   \
> +}
> +#  endif
> +#else
> +#  if defined(CONFIG_INT128)
> +#  define VEXTULX_DO(name, elem)                                \
> +target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
> +{                                                               \
> +    target_ulong r = 0;                                         \
> +    int size =  elem * 8;                                       \
> +    int index = (15 - (a & 0xf) + 1) * 8;                       \
> +    r = EXTRACT128(b->u128, (index - size), size);              \
> +    return r;                                                   \
> +}
> +#  else
> +#  define VEXTULX_DO(name, elem)                                \
> +target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
> +{                                                               \
> +    target_ulong r = 0;                                         \
> +    int i;                                                      \
> +    int index = 15 - (a & 0xf);                                 \
> +    for (i = 0; i < elem; i++) {                                \
> +        r = r << 8;                                             \
> +        if (index - i >= 0) {                                   \
> +            r = r | b->u8[index - i];                           \
> +        }                                                       \
> +    }                                                           \
> +    return r;                                                   \
> +}
> +#  endif
> +#endif
> +
> +VEXTULX_DO(vextublx, 1)
> +VEXTULX_DO(vextuhlx, 2)
> +VEXTULX_DO(vextuwlx, 4)
> +#undef VEXTULX_DO
> +
>  /* The specification says that the results are undefined if all of the
>   * shift counts are not identical.  We check to make sure that they are
>   * to conform to what real hardware appears to do.  */
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index 7143eb3..e91d10b 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -340,6 +340,19 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx)              \
>      }                                                                   \
>  }
>  
> +#define GEN_VXFORM_HETRO(name, opc2, opc3)                              \
> +static void glue(gen_, name)(DisasContext *ctx)                         \
> +{                                                                       \
> +    TCGv_ptr rb;                                                        \
> +    if (unlikely(!ctx->altivec_enabled)) {                              \
> +        gen_exception(ctx, POWERPC_EXCP_VPU);                           \
> +        return;                                                         \
> +    }                                                                   \
> +    rb = gen_avr_ptr(rB(ctx->opcode));                                  \
> +    gen_helper_##name(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)], rb); \
> +    tcg_temp_free_ptr(rb);                                              \
> +}
> +
>  GEN_VXFORM(vaddubm, 0, 0);
>  GEN_VXFORM_DUAL_EXT(vaddubm, PPC_ALTIVEC, PPC_NONE, 0,       \
>                      vmul10cuq, PPC_NONE, PPC2_ISA300, 0x0000F800)
> @@ -525,6 +538,11 @@ GEN_VXFORM_ENV(vaddfp, 5, 0);
>  GEN_VXFORM_ENV(vsubfp, 5, 1);
>  GEN_VXFORM_ENV(vmaxfp, 5, 16);
>  GEN_VXFORM_ENV(vminfp, 5, 17);
> +GEN_VXFORM_HETRO(vextublx, 6, 24)
> +GEN_VXFORM_HETRO(vextuhlx, 6, 25)
> +GEN_VXFORM_HETRO(vextuwlx, 6, 26)
> +GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207,
> +                vextuwlx, PPC_NONE, PPC2_ISA300)
>  
>  #define GEN_VXRFORM1(opname, name, str, opc2, opc3)                     \
>  static void glue(gen_, name)(DisasContext *ctx)                         \
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index f02b3be..e62e564 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -91,8 +91,10 @@ GEN_VXFORM(vmrghw, 6, 2),
>  GEN_VXFORM(vmrglb, 6, 4),
>  GEN_VXFORM(vmrglh, 6, 5),
>  GEN_VXFORM(vmrglw, 6, 6),
> +GEN_VXFORM_300(vextublx, 6, 24),
> +GEN_VXFORM_300(vextuhlx, 6, 25),
> +GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
>  GEN_VXFORM_207(vmrgew, 6, 30),
> -GEN_VXFORM_207(vmrgow, 6, 26),
>  GEN_VXFORM(vmuloub, 4, 0),
>  GEN_VXFORM(vmulouh, 4, 1),
>  GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 8/9] target-ppc: add vextu[bhw]lx instructions
  2016-11-23  4:11   ` David Gibson
@ 2016-11-23  4:48     ` Nikunj A Dadhania
  0 siblings, 0 replies; 20+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23  4:48 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, rth, qemu-devel, bharata, Avinesh Kumar

David Gibson <david@gibson.dropbear.id.au> writes:

> On Tue, Nov 22, 2016 at 05:16:04PM +0530, Nikunj A Dadhania wrote:
>> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>> 
>> vextublx:  Vector Extract Unsigned Byte Left
>> vextuhlx:  Vector Extract Unsigned Halfword Left
>> vextuwlx:  Vector Extract Unsigned Word Left
>> 
>> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>> ---
>>  target-ppc/helper.h                 |  3 ++
>>  target-ppc/int_helper.c             | 63 +++++++++++++++++++++++++++++++++++++
>>  target-ppc/translate/vmx-impl.inc.c | 18 +++++++++++
>>  target-ppc/translate/vmx-ops.inc.c  |  4 ++-
>>  4 files changed, 87 insertions(+), 1 deletion(-)
>> 
>> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
>> index 3b26678..d0a8fb2 100644
>> --- a/target-ppc/helper.h
>> +++ b/target-ppc/helper.h
>> @@ -366,6 +366,9 @@ DEF_HELPER_3(vpmsumb, void, avr, avr, avr)
>>  DEF_HELPER_3(vpmsumh, void, avr, avr, avr)
>>  DEF_HELPER_3(vpmsumw, void, avr, avr, avr)
>>  DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
>> +DEF_HELPER_2(vextublx, tl, tl, avr)
>> +DEF_HELPER_2(vextuhlx, tl, tl, avr)
>> +DEF_HELPER_2(vextuwlx, tl, tl, avr)
>>  
>>  DEF_HELPER_2(vsbox, void, avr, avr)
>>  DEF_HELPER_3(vcipher, void, avr, avr, avr)
>> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
>> index 8886a72..fb9f178 100644
>> --- a/target-ppc/int_helper.c
>> +++ b/target-ppc/int_helper.c
>> @@ -1805,6 +1805,69 @@ void helper_vlogefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
>>      }
>>  }
>>  
>> +#define EXTRACT128(value, start, length)        \
>> +    ((value >> start) & (~(__uint128_t)0 >> (128 - length)))
>
> Although we do use 128-bit arithmetic in some places in qemu, I don't
> think we assume the presence of a working uint1238_t type.  Better to
> actually write a helper function which does this in terms of 64 bit
> arithmetic, I think.

I think we should wrap it in #if defined(CONFIG_INT128), since its
callers are already inside that conditional.

>> +
>> +#if defined(HOST_WORDS_BIGENDIAN)
>> +#  if defined(CONFIG_INT128)
>> +#  define VEXTULX_DO(name, elem)                                \
>> +target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
>
> It seems a bit odd to need helpers for what's essentially just copying
> a byte/halfword/whatever out of the vector.

Are you suggesting doing this with TCG ops?
In TCG, a vector is represented as high/low halves, so the extraction
in boundary cases will get ugly.

>> +{                                                               \
>> +    target_ulong r = 0;                                         \
>> +    int index = (a & 0xf) * 8;                                  \
>> +    r = EXTRACT128(b->u128, index, elem * 8);                   \
>> +    return r;                                                   \
>> +}

Regards
Nikunj

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 2/9] target-ppc: Fix xscmpodp and xscmpudp instructions
  2016-11-23  4:01   ` David Gibson
@ 2016-11-23  5:40     ` Bharata B Rao
  2016-11-24  1:29       ` David Gibson
  0 siblings, 1 reply; 20+ messages in thread
From: Bharata B Rao @ 2016-11-23  5:40 UTC (permalink / raw)
  To: David Gibson; +Cc: Nikunj A Dadhania, qemu-ppc, rth, qemu-devel

On Wed, Nov 23, 2016 at 03:01:18PM +1100, David Gibson wrote:
> On Tue, Nov 22, 2016 at 05:15:58PM +0530, Nikunj A Dadhania wrote:
> > From: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > 
> > - xscmpodp & xscmpudp are missing flags reset.
> > - In xscmpodp, VXVC should be set only if VE is 0 for signalling NaN case
> >   and VXVC should be set by explicitly checking for quiet NaN case.
> > - Comparison is being done only if the operands are not NaNs. However as
> >   per ISA, it should be done even when operands are NaNs.
> 
> For my interest, can you explain the difference between ordered and
> unordered comparisons?  I looked at the ISA and mostly just became
> confused.

From another section of the same ISA doc, I see these descriptions, which
make the distinction between ordered and unordered comparisons a bit
clearer.

Unordered:

"If either of the operands is a NaN, either quiet or signaling,
then CR field BF and the FPCC are set to reflect unordered. If
either of the operands is a Signaling NaN, then VXSNAN is set."

Ordered:

"If either of the operands is a NaN, either quiet or signaling,
then CR field BF and the FPCC are set to reflect unordered. If
either of the operands is a Signaling NaN, then VXSNAN is set and,
if Invalid Operation is disabled (VE=0), VXVC is set. If neither
operand is a Signaling NaN but at least one operand is a Quiet NaN,
then VXVC is set."
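
The two rules quoted above can be sketched as a small decision
function. This is only an illustration under stated assumptions: the
types nan_kind and cmp_flags and the function classify_compare are
hypothetical and do not exist in QEMU.

```c
#include <stdbool.h>

/* Hypothetical classification of one operand. */
typedef enum { NAN_NONE, NAN_QUIET, NAN_SIGNALING } nan_kind;

/* Hypothetical result: which exception bits to raise, and whether the
 * CR field and FPCC should reflect "unordered". */
typedef struct {
    bool vxsnan;
    bool vxvc;
    bool unordered;
} cmp_flags;

/* `ordered` selects the ordered variant of the compare; `ve` is the
 * Invalid Operation enable bit from the FPSCR. */
static cmp_flags classify_compare(nan_kind a, nan_kind b,
                                  bool ordered, bool ve)
{
    cmp_flags f = { false, false, false };
    bool any_snan = (a == NAN_SIGNALING) || (b == NAN_SIGNALING);
    bool any_qnan = (a == NAN_QUIET) || (b == NAN_QUIET);

    /* Both variants report unordered for any NaN operand. */
    f.unordered = any_snan || any_qnan;

    if (any_snan) {
        /* Both variants set VXSNAN for a signaling NaN. */
        f.vxsnan = true;
        /* Only the ordered variant adds VXVC, and only when VE=0. */
        if (ordered && !ve) {
            f.vxvc = true;
        }
    } else if (any_qnan && ordered) {
        /* Quiet NaN without any signaling NaN: ordered sets VXVC. */
        f.vxvc = true;
    }
    return f;
}
```

So the only difference between the two variants is whether VXVC can be
raised; the unordered result and VXSNAN behave the same in both.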
 
> 
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> > ---
> >  target-ppc/fpu_helper.c | 41 +++++++++++++++++++++++++----------------
> >  1 file changed, 25 insertions(+), 16 deletions(-)
> > 
> > diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> > index d3741b4..3027003 100644
> > --- a/target-ppc/fpu_helper.c
> > +++ b/target-ppc/fpu_helper.c
> > @@ -2410,29 +2410,38 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
> >  {                                                                        \
> >      ppc_vsr_t xa, xb;                                                    \
> >      uint32_t cc = 0;                                                     \
> > +    bool vxsnan_flag = false, vxvc_flag = false;                         \
> >                                                                           \
> > +    helper_reset_fpstatus(env);                                          \
> >      getVSR(xA(opcode), &xa, env);                                        \
> >      getVSR(xB(opcode), &xb, env);                                        \
> >                                                                           \
> > -    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||                       \
> > -                 float64_is_any_nan(xb.VsrD(0)))) {                      \
> > -        if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||     \
> > -            float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {     \
> > -            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);       \
> > -        }                                                                \
> > -        if (ordered) {                                                   \
> > -            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);         \
> > +    if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||         \
> > +        float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {         \
> > +        vxsnan_flag = true;                                              \
> > +        cc = 1;                                                          \
> > +        if (fpscr_ve == 0 && ordered) {                                  \
> > +            vxvc_flag = true;                                            \
> >          }                                                                \
> > +    } else if ((float64_is_quiet_nan(xa.VsrD(0), &env->fp_status) ||     \
> > +                float64_is_quiet_nan(xb.VsrD(0), &env->fp_status))       \
> > +               && ordered) {                                             \
> >          cc = 1;                                                          \
> 
> Since you're basically rewriting this, could you please change it to
> use symbolic constants for the CC bits, which will make it easier to
> follow.

Sure will do.
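
As a rough idea of what named CC bits could look like, here is a small
standalone sketch. The CC_* names are hypothetical and chosen only for
this example (the identifiers used in QEMU may differ); the values
follow the 4-bit CR field layout, where the patch's "cc = 1" is the
unordered bit.

```c
#include <math.h>

/* Hypothetical names for the 4-bit CR field values (assumption). */
enum {
    CC_LT = 8, /* less than */
    CC_GT = 4, /* greater than */
    CC_EQ = 2, /* equal */
    CC_UN = 1  /* unordered: at least one operand is a NaN */
};

/* Map a double-precision compare onto a single CC value. */
static unsigned compare_cc(double a, double b)
{
    if (isnan(a) || isnan(b)) {
        return CC_UN;
    }
    if (a < b) {
        return CC_LT;
    }
    if (a > b) {
        return CC_GT;
    }
    return CC_EQ;
}
```

Writing "cc = CC_UN" instead of "cc = 1" makes it clear at the call site
which condition is being reported.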

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH 2/9] target-ppc: Fix xscmpodp and xscmpudp instructions
  2016-11-23  5:40     ` Bharata B Rao
@ 2016-11-24  1:29       ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2016-11-24  1:29 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: Nikunj A Dadhania, qemu-ppc, rth, qemu-devel


On Wed, Nov 23, 2016 at 11:10:08AM +0530, Bharata B Rao wrote:
> On Wed, Nov 23, 2016 at 03:01:18PM +1100, David Gibson wrote:
> > On Tue, Nov 22, 2016 at 05:15:58PM +0530, Nikunj A Dadhania wrote:
> > > From: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > 
> > > - xscmpodp & xscmpudp are missing flags reset.
> > > - In xscmpodp, VXVC should be set only if VE is 0 for signalling NaN case
> > >   and VXVC should be set by explicitly checking for quiet NaN case.
> > > - Comparison is being done only if the operands are not NaNs. However as
> > >   per ISA, it should be done even when operands are NaNs.
> > 
> > For my interest, can you explain the difference between ordered and
> > unordered comparisons?  I looked at the ISA and mostly just became
> > confused.
> 
> From another section of the same ISA doc, I see this description, which
> makes the distinction between ordered and unordered comparisons a bit
> more clear.
> 
> Unordered:
> 
> "If either of the operands is a NaN, either quiet or signaling,
> then CR field BF and the FPCC are set to reflect unordered. If
> either of the operands is a Signaling NaN, then VXSNAN is set."
> 
> Ordered:
> 
> "If either of the operands is a NaN, either quiet or signaling,
> then CR field BF and the FPCC are set to reflect unordered. If
> either of the operands is a Signaling NaN, then VXSNAN is set
> and, if Invalid Operation is disabled (VE=0), VXVC is set. If
> neither operand is a Signaling NaN but at least one operand is
> a Quiet NaN, then VXVC is set."

Ah, thanks.  So it's basically just the setting of VXVC which differs.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

