All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8
@ 2016-11-23 11:37 Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 01/10] target-ppc: Consolidate instruction decode helpers Nikunj A Dadhania
                   ` (10 more replies)
  0 siblings, 11 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

This series contains 18 new instructions for POWER9 ISA3.0
    Vector Extract Left/Right Indexed
    VSX Scalar Compare Exponents
    VSX Scalar Compare Quad-Precision
    Load/Store VSX Vector 
    Load/Store VSX Scalar

Changelog:
v0:
* Change dq/ds-form decoding for primary opcode 0x3D
* Rename CR Field defines, as at every place it was
  using bit shifts.
* Use symbolic constants in xscmp*
* Fix bug in exception handling for QNaN
* Define EXTRACT128 within CONFIG_INT128

Patches
=======
01-03: Consolidation/Fixes
   04: 
      xscmpexpdp: VSX Scalar Compare Exponents Double-Precision
      xscmpexpqp: VSX Scalar Compare Exponents Quad-Precision
   05:
      xscmpoqp: VSX Scalar Compare Ordered Quad-Precision
      xscmpuqp: VSX Scalar Compare Unordered Quad-Precision
   06:
      lxsd:  Load VSX Scalar Dword
      lxssp: Load VSX Scalar Single Precision
   07:
      stxsd:  Store VSX Scalar Dword
      stxssp: Store VSX Scalar Single Precision
   08:
      lxv:   Load VSX Vector
      lxvx:  Load VSX Vector Indexed
      stxv:  Store VSX Vector
      stxvx: Store VSX Vector Indexed
   09: 
      vextublx:  Vector Extract Unsigned Byte Left
      vextuhlx:  Vector Extract Unsigned Halfword Left
      vextuwlx:  Vector Extract Unsigned Word Left
   10: 
      vextubrx: Vector Extract Unsigned Byte Right-Indexed
      vextuhrx: Vector Extract Unsigned  Halfword Right-Indexed
      vextuwrx: Vector Extract Unsigned Word Right-Indexed

Avinesh Kumar (1):
  target-ppc: add vextu[bhw]lx instructions

Bharata B Rao (4):
  target-ppc: Consolidate instruction decode helpers
  target-ppc: Fix xscmpodp and xscmpudp instructions
  target-ppc: Add xscmpexp[dp,qp] instructions
  target-ppc: Add xscmpoqp and xscmpuqp instructions

Hariharan T.S (1):
  target-ppc: add vextu[bhw]rx instructions

Nikunj A Dadhania (4):
  target-ppc: rename CRF_* defines as CRF_*_BIT
  target-ppc: implement lxsd and lxssp instructions
  target-ppc: implement stxsd and stxssp
  target-ppc: implement lxv/lxvx and stxv/stxvx

 target-ppc/cpu.h                    |  21 ++--
 target-ppc/fpu_helper.c             | 169 ++++++++++++++++++++++----
 target-ppc/helper.h                 |  10 ++
 target-ppc/int_helper.c             | 155 +++++++++++++++++++++---
 target-ppc/internal.h               | 152 ++++++++++++++++++++++++
 target-ppc/translate.c              | 230 +++++++++++-------------------------
 target-ppc/translate/fp-ops.inc.c   |   2 -
 target-ppc/translate/vmx-impl.inc.c |  23 ++++
 target-ppc/translate/vmx-ops.inc.c  |   8 +-
 target-ppc/translate/vsx-impl.inc.c |  96 +++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  10 ++
 11 files changed, 666 insertions(+), 210 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 01/10] target-ppc: Consolidate instruction decode helpers
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 02/10] target-ppc: rename CRF_* defines as CRF_*_BIT Nikunj A Dadhania
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

Move instruction decode helpers to target-ppc/internal.h so that some
of these can be used from outside of translate.c. This movement also
helps to get rid of some duplicate helpers from target-ppc/fpu_helper.c.

Suggested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 target-ppc/fpu_helper.c |  11 +---
 target-ppc/internal.h   | 151 ++++++++++++++++++++++++++++++++++++++++++++++++
 target-ppc/translate.c  | 151 ------------------------------------------------
 3 files changed, 152 insertions(+), 161 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 8a389e1..d3741b4 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -20,6 +20,7 @@
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "exec/exec-all.h"
+#include "internal.h"
 
 #define float64_snan_to_qnan(x) ((x) | 0x0008000000000000ULL)
 #define float32_snan_to_qnan(x) ((x) | 0x00400000)
@@ -1776,16 +1777,6 @@ uint32_t helper_efdcmpeq(CPUPPCState *env, uint64_t op1, uint64_t op2)
     return helper_efdtsteq(env, op1, op2);
 }
 
-#define DECODE_SPLIT(opcode, shift1, nb1, shift2, nb2) \
-    (((((opcode) >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |    \
-     (((opcode) >> (shift2)) & ((1 << (nb2)) - 1)))
-
-#define xT(opcode) DECODE_SPLIT(opcode, 0, 1, 21, 5)
-#define xA(opcode) DECODE_SPLIT(opcode, 2, 1, 16, 5)
-#define xB(opcode) DECODE_SPLIT(opcode, 1, 1, 11, 5)
-#define xC(opcode) DECODE_SPLIT(opcode, 3, 1,  6, 5)
-#define BF(opcode) (((opcode) >> (31-8)) & 7)
-
 typedef union _ppc_vsr_t {
     uint64_t u64[2];
     uint32_t u32[4];
diff --git a/target-ppc/internal.h b/target-ppc/internal.h
index 1ff4896..9a4a74a 100644
--- a/target-ppc/internal.h
+++ b/target-ppc/internal.h
@@ -47,4 +47,155 @@ FUNC_MASK(MASK, target_ulong, 32, UINT32_MAX);
 FUNC_MASK(mask_u32, uint32_t, 32, UINT32_MAX);
 FUNC_MASK(mask_u64, uint64_t, 64, UINT64_MAX);
 
+/*****************************************************************************/
+/***                           Instruction decoding                        ***/
+#define EXTRACT_HELPER(name, shift, nb)                                       \
+static inline uint32_t name(uint32_t opcode)                                  \
+{                                                                             \
+    return (opcode >> (shift)) & ((1 << (nb)) - 1);                           \
+}
+
+#define EXTRACT_SHELPER(name, shift, nb)                                      \
+static inline int32_t name(uint32_t opcode)                                   \
+{                                                                             \
+    return (int16_t)((opcode >> (shift)) & ((1 << (nb)) - 1));                \
+}
+
+#define EXTRACT_HELPER_SPLIT(name, shift1, nb1, shift2, nb2)                  \
+static inline uint32_t name(uint32_t opcode)                                  \
+{                                                                             \
+    return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |             \
+            ((opcode >> (shift2)) & ((1 << (nb2)) - 1));                      \
+}
+
+#define EXTRACT_HELPER_DXFORM(name,                                           \
+                              d0_bits, shift_op_d0, shift_d0,                 \
+                              d1_bits, shift_op_d1, shift_d1,                 \
+                              d2_bits, shift_op_d2, shift_d2)                 \
+static inline int16_t name(uint32_t opcode)                                   \
+{                                                                             \
+    return                                                                    \
+        (((opcode >> (shift_op_d0)) & ((1 << (d0_bits)) - 1)) << (shift_d0)) | \
+        (((opcode >> (shift_op_d1)) & ((1 << (d1_bits)) - 1)) << (shift_d1)) | \
+        (((opcode >> (shift_op_d2)) & ((1 << (d2_bits)) - 1)) << (shift_d2));  \
+}
+
+
+/* Opcode part 1 */
+EXTRACT_HELPER(opc1, 26, 6);
+/* Opcode part 2 */
+EXTRACT_HELPER(opc2, 1, 5);
+/* Opcode part 3 */
+EXTRACT_HELPER(opc3, 6, 5);
+/* Opcode part 4 */
+EXTRACT_HELPER(opc4, 16, 5);
+/* Update Cr0 flags */
+EXTRACT_HELPER(Rc, 0, 1);
+/* Update Cr6 flags (Altivec) */
+EXTRACT_HELPER(Rc21, 10, 1);
+/* Destination */
+EXTRACT_HELPER(rD, 21, 5);
+/* Source */
+EXTRACT_HELPER(rS, 21, 5);
+/* First operand */
+EXTRACT_HELPER(rA, 16, 5);
+/* Second operand */
+EXTRACT_HELPER(rB, 11, 5);
+/* Third operand */
+EXTRACT_HELPER(rC, 6, 5);
+/***                               Get CRn                                 ***/
+EXTRACT_HELPER(crfD, 23, 3);
+EXTRACT_HELPER(BF, 23, 3);
+EXTRACT_HELPER(crfS, 18, 3);
+EXTRACT_HELPER(crbD, 21, 5);
+EXTRACT_HELPER(crbA, 16, 5);
+EXTRACT_HELPER(crbB, 11, 5);
+/* SPR / TBL */
+EXTRACT_HELPER(_SPR, 11, 10);
+static inline uint32_t SPR(uint32_t opcode)
+{
+    uint32_t sprn = _SPR(opcode);
+
+    return ((sprn >> 5) & 0x1F) | ((sprn & 0x1F) << 5);
+}
+/***                              Get constants                            ***/
+/* 16 bits signed immediate value */
+EXTRACT_SHELPER(SIMM, 0, 16);
+/* 16 bits unsigned immediate value */
+EXTRACT_HELPER(UIMM, 0, 16);
+/* 5 bits signed immediate value */
+EXTRACT_HELPER(SIMM5, 16, 5);
+/* 5 bits signed immediate value */
+EXTRACT_HELPER(UIMM5, 16, 5);
+/* 4 bits unsigned immediate value */
+EXTRACT_HELPER(UIMM4, 16, 4);
+/* Bit count */
+EXTRACT_HELPER(NB, 11, 5);
+/* Shift count */
+EXTRACT_HELPER(SH, 11, 5);
+/* Vector shift count */
+EXTRACT_HELPER(VSH, 6, 4);
+/* Mask start */
+EXTRACT_HELPER(MB, 6, 5);
+/* Mask end */
+EXTRACT_HELPER(ME, 1, 5);
+/* Trap operand */
+EXTRACT_HELPER(TO, 21, 5);
+
+EXTRACT_HELPER(CRM, 12, 8);
+
+#ifndef CONFIG_USER_ONLY
+EXTRACT_HELPER(SR, 16, 4);
+#endif
+
+/* mtfsf/mtfsfi */
+EXTRACT_HELPER(FPBF, 23, 3);
+EXTRACT_HELPER(FPIMM, 12, 4);
+EXTRACT_HELPER(FPL, 25, 1);
+EXTRACT_HELPER(FPFLM, 17, 8);
+EXTRACT_HELPER(FPW, 16, 1);
+
+/* addpcis */
+EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
+#if defined(TARGET_PPC64)
+/* darn */
+EXTRACT_HELPER(L, 16, 2);
+#endif
+
+/***                            Jump target decoding                       ***/
+/* Immediate address */
+static inline target_ulong LI(uint32_t opcode)
+{
+    return (opcode >> 0) & 0x03FFFFFC;
+}
+
+static inline uint32_t BD(uint32_t opcode)
+{
+    return (opcode >> 0) & 0xFFFC;
+}
+
+EXTRACT_HELPER(BO, 21, 5);
+EXTRACT_HELPER(BI, 16, 5);
+/* Absolute/relative address */
+EXTRACT_HELPER(AA, 1, 1);
+/* Link */
+EXTRACT_HELPER(LK, 0, 1);
+
+/* DFP Z22-form */
+EXTRACT_HELPER(DCM, 10, 6)
+
+/* DFP Z23-form */
+EXTRACT_HELPER(RMC, 9, 2)
+
+EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5);
+EXTRACT_HELPER_SPLIT(xS, 0, 1, 21, 5);
+EXTRACT_HELPER_SPLIT(xA, 2, 1, 16, 5);
+EXTRACT_HELPER_SPLIT(xB, 1, 1, 11, 5);
+EXTRACT_HELPER_SPLIT(xC, 3, 1,  6, 5);
+EXTRACT_HELPER(DM, 8, 2);
+EXTRACT_HELPER(UIM, 16, 2);
+EXTRACT_HELPER(SHW, 8, 2);
+EXTRACT_HELPER(SP, 19, 2);
+EXTRACT_HELPER(IMM8, 11, 8);
+
 #endif /* PPC_INTERNAL_H */
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 59e9552..6bdc433 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -422,157 +422,6 @@ typedef struct opcode_t {
 
 #define CHK_NONE
 
-
-/*****************************************************************************/
-/***                           Instruction decoding                        ***/
-#define EXTRACT_HELPER(name, shift, nb)                                       \
-static inline uint32_t name(uint32_t opcode)                                  \
-{                                                                             \
-    return (opcode >> (shift)) & ((1 << (nb)) - 1);                           \
-}
-
-#define EXTRACT_SHELPER(name, shift, nb)                                      \
-static inline int32_t name(uint32_t opcode)                                   \
-{                                                                             \
-    return (int16_t)((opcode >> (shift)) & ((1 << (nb)) - 1));                \
-}
-
-#define EXTRACT_HELPER_SPLIT(name, shift1, nb1, shift2, nb2)                  \
-static inline uint32_t name(uint32_t opcode)                                  \
-{                                                                             \
-    return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |             \
-            ((opcode >> (shift2)) & ((1 << (nb2)) - 1));                      \
-}
-
-#define EXTRACT_HELPER_DXFORM(name,                                           \
-                              d0_bits, shift_op_d0, shift_d0,                 \
-                              d1_bits, shift_op_d1, shift_d1,                 \
-                              d2_bits, shift_op_d2, shift_d2)                 \
-static inline int16_t name(uint32_t opcode)                                   \
-{                                                                             \
-    return                                                                    \
-        (((opcode >> (shift_op_d0)) & ((1 << (d0_bits)) - 1)) << (shift_d0)) | \
-        (((opcode >> (shift_op_d1)) & ((1 << (d1_bits)) - 1)) << (shift_d1)) | \
-        (((opcode >> (shift_op_d2)) & ((1 << (d2_bits)) - 1)) << (shift_d2));  \
-}
-
-
-/* Opcode part 1 */
-EXTRACT_HELPER(opc1, 26, 6);
-/* Opcode part 2 */
-EXTRACT_HELPER(opc2, 1, 5);
-/* Opcode part 3 */
-EXTRACT_HELPER(opc3, 6, 5);
-/* Opcode part 4 */
-EXTRACT_HELPER(opc4, 16, 5);
-/* Update Cr0 flags */
-EXTRACT_HELPER(Rc, 0, 1);
-/* Update Cr6 flags (Altivec) */
-EXTRACT_HELPER(Rc21, 10, 1);
-/* Destination */
-EXTRACT_HELPER(rD, 21, 5);
-/* Source */
-EXTRACT_HELPER(rS, 21, 5);
-/* First operand */
-EXTRACT_HELPER(rA, 16, 5);
-/* Second operand */
-EXTRACT_HELPER(rB, 11, 5);
-/* Third operand */
-EXTRACT_HELPER(rC, 6, 5);
-/***                               Get CRn                                 ***/
-EXTRACT_HELPER(crfD, 23, 3);
-EXTRACT_HELPER(crfS, 18, 3);
-EXTRACT_HELPER(crbD, 21, 5);
-EXTRACT_HELPER(crbA, 16, 5);
-EXTRACT_HELPER(crbB, 11, 5);
-/* SPR / TBL */
-EXTRACT_HELPER(_SPR, 11, 10);
-static inline uint32_t SPR(uint32_t opcode)
-{
-    uint32_t sprn = _SPR(opcode);
-
-    return ((sprn >> 5) & 0x1F) | ((sprn & 0x1F) << 5);
-}
-/***                              Get constants                            ***/
-/* 16 bits signed immediate value */
-EXTRACT_SHELPER(SIMM, 0, 16);
-/* 16 bits unsigned immediate value */
-EXTRACT_HELPER(UIMM, 0, 16);
-/* 5 bits signed immediate value */
-EXTRACT_HELPER(SIMM5, 16, 5);
-/* 5 bits signed immediate value */
-EXTRACT_HELPER(UIMM5, 16, 5);
-/* 4 bits unsigned immediate value */
-EXTRACT_HELPER(UIMM4, 16, 4);
-/* Bit count */
-EXTRACT_HELPER(NB, 11, 5);
-/* Shift count */
-EXTRACT_HELPER(SH, 11, 5);
-/* Vector shift count */
-EXTRACT_HELPER(VSH, 6, 4);
-/* Mask start */
-EXTRACT_HELPER(MB, 6, 5);
-/* Mask end */
-EXTRACT_HELPER(ME, 1, 5);
-/* Trap operand */
-EXTRACT_HELPER(TO, 21, 5);
-
-EXTRACT_HELPER(CRM, 12, 8);
-
-#ifndef CONFIG_USER_ONLY
-EXTRACT_HELPER(SR, 16, 4);
-#endif
-
-/* mtfsf/mtfsfi */
-EXTRACT_HELPER(FPBF, 23, 3);
-EXTRACT_HELPER(FPIMM, 12, 4);
-EXTRACT_HELPER(FPL, 25, 1);
-EXTRACT_HELPER(FPFLM, 17, 8);
-EXTRACT_HELPER(FPW, 16, 1);
-
-/* addpcis */
-EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
-#if defined(TARGET_PPC64)
-/* darn */
-EXTRACT_HELPER(L, 16, 2);
-#endif
-
-/***                            Jump target decoding                       ***/
-/* Immediate address */
-static inline target_ulong LI(uint32_t opcode)
-{
-    return (opcode >> 0) & 0x03FFFFFC;
-}
-
-static inline uint32_t BD(uint32_t opcode)
-{
-    return (opcode >> 0) & 0xFFFC;
-}
-
-EXTRACT_HELPER(BO, 21, 5);
-EXTRACT_HELPER(BI, 16, 5);
-/* Absolute/relative address */
-EXTRACT_HELPER(AA, 1, 1);
-/* Link */
-EXTRACT_HELPER(LK, 0, 1);
-
-/* DFP Z22-form */
-EXTRACT_HELPER(DCM, 10, 6)
-
-/* DFP Z23-form */
-EXTRACT_HELPER(RMC, 9, 2)
-
-EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5);
-EXTRACT_HELPER_SPLIT(xS, 0, 1, 21, 5);
-EXTRACT_HELPER_SPLIT(xA, 2, 1, 16, 5);
-EXTRACT_HELPER_SPLIT(xB, 1, 1, 11, 5);
-EXTRACT_HELPER_SPLIT(xC, 3, 1,  6, 5);
-EXTRACT_HELPER(DM, 8, 2);
-EXTRACT_HELPER(UIM, 16, 2);
-EXTRACT_HELPER(SHW, 8, 2);
-EXTRACT_HELPER(SP, 19, 2);
-EXTRACT_HELPER(IMM8, 11, 8);
-
 /*****************************************************************************/
 /* PowerPC instructions table                                                */
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 02/10] target-ppc: rename CRF_* defines as CRF_*_BIT
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 01/10] target-ppc: Consolidate instruction decode helpers Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-24  0:23   ` David Gibson
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 03/10] target-ppc: Fix xscmpodp and xscmpudp instructions Nikunj A Dadhania
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

Add _BIT to CRF_[GT,LT,EQ_SO] and introduce CRF_[GT,LT,EQ,SO] for usage
without shifts in the code. This would simplify the code.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

--

Note: Have not changed CRF defines for SPE extension.
---
 target-ppc/cpu.h        | 21 +++++++++++++--------
 target-ppc/int_helper.c | 30 +++++++++++++++---------------
 target-ppc/translate.c  | 14 +++++++-------
 3 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 76f3a69..5a5355c 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1301,14 +1301,19 @@ static inline int cpu_mmu_index (CPUPPCState *env, bool ifetch)
 
 /*****************************************************************************/
 /* CRF definitions */
-#define CRF_LT        3
-#define CRF_GT        2
-#define CRF_EQ        1
-#define CRF_SO        0
-#define CRF_CH        (1 << CRF_LT)
-#define CRF_CL        (1 << CRF_GT)
-#define CRF_CH_OR_CL  (1 << CRF_EQ)
-#define CRF_CH_AND_CL (1 << CRF_SO)
+#define CRF_LT_BIT    3
+#define CRF_GT_BIT    2
+#define CRF_EQ_BIT    1
+#define CRF_SO_BIT    0
+#define CRF_LT        (1 << CRF_LT_BIT)
+#define CRF_GT        (1 << CRF_GT_BIT)
+#define CRF_EQ        (1 << CRF_EQ_BIT)
+#define CRF_SO        (1 << CRF_SO_BIT)
+/* For SPE extensions */
+#define CRF_CH        (1 << CRF_LT_BIT)
+#define CRF_CL        (1 << CRF_GT_BIT)
+#define CRF_CH_OR_CL  (1 << CRF_EQ_BIT)
+#define CRF_CH_AND_CL (1 << CRF_SO_BIT)
 
 /* XER definitions */
 #define XER_SO  31
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 8886a72..fbf477f 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -167,7 +167,7 @@ target_ulong helper_cnttzw(target_ulong t)
 
 uint32_t helper_cmpeqb(target_ulong ra, target_ulong rb)
 {
-    return hasvalue(rb, ra) ? 1 << CRF_GT : 0;
+    return hasvalue(rb, ra) ? CRF_GT : 0;
 }
 
 #undef pattern
@@ -2563,9 +2563,9 @@ static void bcd_put_digit(ppc_avr_t *bcd, uint8_t digit, int n)
 static int bcd_cmp_zero(ppc_avr_t *bcd)
 {
     if (bcd->u64[HI_IDX] == 0 && (bcd->u64[LO_IDX] >> 4) == 0) {
-        return 1 << CRF_EQ;
+        return CRF_EQ;
     } else {
-        return (bcd_get_sgn(bcd) == 1) ? 1 << CRF_GT : 1 << CRF_LT;
+        return (bcd_get_sgn(bcd) == 1) ? CRF_GT : CRF_LT;
     }
 }
 
@@ -2677,25 +2677,25 @@ uint32_t helper_bcdadd(ppc_avr_t *r,  ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
         if (sgna == sgnb) {
             result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgna, ps);
             zero = bcd_add_mag(&result, a, b, &invalid, &overflow);
-            cr = (sgna > 0) ? 1 << CRF_GT : 1 << CRF_LT;
+            cr = (sgna > 0) ? CRF_GT : CRF_LT;
         } else if (bcd_cmp_mag(a, b) > 0) {
             result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgna, ps);
             zero = bcd_sub_mag(&result, a, b, &invalid, &overflow);
-            cr = (sgna > 0) ? 1 << CRF_GT : 1 << CRF_LT;
+            cr = (sgna > 0) ? CRF_GT : CRF_LT;
         } else {
             result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgnb, ps);
             zero = bcd_sub_mag(&result, b, a, &invalid, &overflow);
-            cr = (sgnb > 0) ? 1 << CRF_GT : 1 << CRF_LT;
+            cr = (sgnb > 0) ? CRF_GT : CRF_LT;
         }
     }
 
     if (unlikely(invalid)) {
         result.u64[HI_IDX] = result.u64[LO_IDX] = -1;
-        cr = 1 << CRF_SO;
+        cr = CRF_SO;
     } else if (overflow) {
-        cr |= 1 << CRF_SO;
+        cr |= CRF_SO;
     } else if (zero) {
-        cr = 1 << CRF_EQ;
+        cr = CRF_EQ;
     }
 
     *r = result;
@@ -2745,7 +2745,7 @@ uint32_t helper_bcdcfn(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
     cr = bcd_cmp_zero(&ret);
 
     if (unlikely(invalid)) {
-        cr = 1 << CRF_SO;
+        cr = CRF_SO;
     }
 
     *r = ret;
@@ -2775,11 +2775,11 @@ uint32_t helper_bcdctn(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
     cr = bcd_cmp_zero(b);
 
     if (ox_flag) {
-        cr |= 1 << CRF_SO;
+        cr |= CRF_SO;
     }
 
     if (unlikely(invalid)) {
-        cr = 1 << CRF_SO;
+        cr = CRF_SO;
     }
 
     *r = ret;
@@ -2823,7 +2823,7 @@ uint32_t helper_bcdcfz(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
     cr = bcd_cmp_zero(&ret);
 
     if (unlikely(invalid)) {
-        cr = 1 << CRF_SO;
+        cr = CRF_SO;
     }
 
     *r = ret;
@@ -2862,11 +2862,11 @@ uint32_t helper_bcdctz(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
     cr = bcd_cmp_zero(b);
 
     if (ox_flag) {
-        cr |= 1 << CRF_SO;
+        cr |= CRF_SO;
     }
 
     if (unlikely(invalid)) {
-        cr = 1 << CRF_SO;
+        cr = CRF_SO;
     }
 
     *r = ret;
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 6bdc433..d3bda1b 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -612,17 +612,17 @@ static inline void gen_op_cmp(TCGv arg0, TCGv arg1, int s, int crf)
 
     tcg_gen_setcond_tl((s ? TCG_COND_LT: TCG_COND_LTU), t0, arg0, arg1);
     tcg_gen_trunc_tl_i32(t1, t0);
-    tcg_gen_shli_i32(t1, t1, CRF_LT);
+    tcg_gen_shli_i32(t1, t1, CRF_LT_BIT);
     tcg_gen_or_i32(cpu_crf[crf], cpu_crf[crf], t1);
 
     tcg_gen_setcond_tl((s ? TCG_COND_GT: TCG_COND_GTU), t0, arg0, arg1);
     tcg_gen_trunc_tl_i32(t1, t0);
-    tcg_gen_shli_i32(t1, t1, CRF_GT);
+    tcg_gen_shli_i32(t1, t1, CRF_GT_BIT);
     tcg_gen_or_i32(cpu_crf[crf], cpu_crf[crf], t1);
 
     tcg_gen_setcond_tl(TCG_COND_EQ, t0, arg0, arg1);
     tcg_gen_trunc_tl_i32(t1, t0);
-    tcg_gen_shli_i32(t1, t1, CRF_EQ);
+    tcg_gen_shli_i32(t1, t1, CRF_EQ_BIT);
     tcg_gen_or_i32(cpu_crf[crf], cpu_crf[crf], t1);
 
     tcg_temp_free(t0);
@@ -748,7 +748,7 @@ static void gen_cmprb(DisasContext *ctx)
         tcg_gen_and_i32(src2lo, src2lo, src2hi);
         tcg_gen_or_i32(crf, crf, src2lo);
     }
-    tcg_gen_shli_i32(crf, crf, CRF_GT);
+    tcg_gen_shli_i32(crf, crf, CRF_GT_BIT);
     tcg_temp_free_i32(src1);
     tcg_temp_free_i32(src2);
     tcg_temp_free_i32(src2lo);
@@ -2978,7 +2978,7 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
     tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
     l1 = gen_new_label();
     tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
-    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
+    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], CRF_EQ);
     tcg_gen_qemu_st_tl(cpu_gpr[reg], EA, ctx->mem_idx, memop);
     gen_set_label(l1);
     tcg_gen_movi_tl(cpu_reserve, -1);
@@ -3072,7 +3072,7 @@ static void gen_stqcx_(DisasContext *ctx)
     tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
     l1 = gen_new_label();
     tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
-    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
+    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], CRF_EQ);
 
     if (unlikely(ctx->le_mode)) {
         gpr1 = cpu_gpr[reg + 1];
@@ -4253,7 +4253,7 @@ static void gen_slbfee_(DisasContext *ctx)
     l2 = gen_new_label();
     tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
     tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_gpr[rS(ctx->opcode)], -1, l1);
-    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
+    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], CRF_EQ);
     tcg_gen_br(l2);
     gen_set_label(l1);
     tcg_gen_movi_tl(cpu_gpr[rS(ctx->opcode)], 0);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 03/10] target-ppc: Fix xscmpodp and xscmpudp instructions
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 01/10] target-ppc: Consolidate instruction decode helpers Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 02/10] target-ppc: rename CRF_* defines as CRF_*_BIT Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 04/10] target-ppc: Add xscmpexp[dp, qp] instructions Nikunj A Dadhania
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

- xscmpodp & xscmpudp are missing flags reset.
- In xscmpodp, VXCC should be set only if VE is 0 for signalling NaN case
  and VXCC should be set by explicitly checking for quiet NaN case.
- Comparison is being done only if the operands are not NaNs. However as
  per ISA, it should be done even when operands are NaNs.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/fpu_helper.c | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index d3741b4..fdd3216 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2410,29 +2410,39 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
 {                                                                        \
     ppc_vsr_t xa, xb;                                                    \
     uint32_t cc = 0;                                                     \
+    bool vxsnan_flag = false, vxvc_flag = false;                         \
                                                                          \
+    helper_reset_fpstatus(env);                                          \
     getVSR(xA(opcode), &xa, env);                                        \
     getVSR(xB(opcode), &xb, env);                                        \
                                                                          \
-    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||                       \
-                 float64_is_any_nan(xb.VsrD(0)))) {                      \
-        if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||     \
-            float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {     \
-            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);       \
+    if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||         \
+        float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {         \
+        vxsnan_flag = true;                                              \
+        cc = CRF_SO;                                                     \
+        if (fpscr_ve == 0 && ordered) {                                  \
+            vxvc_flag = true;                                            \
         }                                                                \
+    } else if (float64_is_quiet_nan(xa.VsrD(0), &env->fp_status) ||      \
+               float64_is_quiet_nan(xb.VsrD(0), &env->fp_status)) {      \
+        cc = CRF_SO;                                                     \
         if (ordered) {                                                   \
-            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);         \
+            vxvc_flag = true;                                            \
         }                                                                \
-        cc = 1;                                                          \
+    }                                                                    \
+    if (vxsnan_flag) {                                                   \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);           \
+    }                                                                    \
+    if (vxvc_flag) {                                                     \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);             \
+    }                                                                    \
+                                                                         \
+    if (float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {           \
+        cc |= CRF_LT;                                                    \
+    } else if (!float64_le(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {   \
+        cc |= CRF_GT;                                                    \
     } else {                                                             \
-        if (float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {       \
-            cc = 8;                                                      \
-        } else if (!float64_le(xa.VsrD(0), xb.VsrD(0),                   \
-                               &env->fp_status)) { \
-            cc = 4;                                                      \
-        } else {                                                         \
-            cc = 2;                                                      \
-        }                                                                \
+        cc |= CRF_EQ;                                                    \
     }                                                                    \
                                                                          \
     env->fpscr &= ~(0x0F << FPSCR_FPRF);                                 \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 04/10] target-ppc: Add xscmpexp[dp, qp] instructions
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (2 preceding siblings ...)
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 03/10] target-ppc: Fix xscmpodp and xscmpudp instructions Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 05/10] target-ppc: Add xscmpoqp and xscmpuqp instructions Nikunj A Dadhania
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

xscmpexpdp: VSX Scalar Compare Exponents Double-Precision
xscmpexpqp: VSX Scalar Compare Exponents Quad-Precision

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/fpu_helper.c             | 64 +++++++++++++++++++++++++++++++++++++
 target-ppc/helper.h                 |  2 ++
 target-ppc/translate/vsx-impl.inc.c |  2 ++
 target-ppc/translate/vsx-ops.inc.c  |  6 ++++
 4 files changed, 74 insertions(+)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index fdd3216..8bffafb 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2405,6 +2405,70 @@ VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
 VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
 VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
 
+void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode)
+{
+    ppc_vsr_t xa, xb;
+    int64_t exp_a, exp_b;
+    uint32_t cc;
+
+    getVSR(xA(opcode), &xa, env);
+    getVSR(xB(opcode), &xb, env);
+
+    exp_a = extract64(xa.VsrD(0), 52, 11);
+    exp_b = extract64(xb.VsrD(0), 52, 11);
+
+    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||
+                 float64_is_any_nan(xb.VsrD(0)))) {
+        cc = CRF_SO;
+    } else {
+        if (exp_a < exp_b) {
+            cc = CRF_LT;
+        } else if (exp_a > exp_b) {
+            cc = CRF_GT;
+        } else {
+            cc = CRF_EQ;
+        }
+    }
+
+    env->fpscr &= ~(0x0F << FPSCR_FPRF);
+    env->fpscr |= cc << FPSCR_FPRF;
+    env->crf[BF(opcode)] = cc;
+
+    helper_float_check_status(env);
+}
+
+void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode)
+{
+    ppc_vsr_t xa, xb;
+    int64_t exp_a, exp_b;
+    uint32_t cc;
+
+    getVSR(rA(opcode) + 32, &xa, env);
+    getVSR(rB(opcode) + 32, &xb, env);
+
+    exp_a = extract64(xa.VsrD(0), 48, 15);
+    exp_b = extract64(xb.VsrD(0), 48, 15);
+
+    if (unlikely(float128_is_any_nan(make_float128(xa.VsrD(0), xa.VsrD(1))) ||
+                 float128_is_any_nan(make_float128(xb.VsrD(0), xb.VsrD(1))))) {
+        cc = CRF_SO;
+    } else {
+        if (exp_a < exp_b) {
+            cc = CRF_LT;
+        } else if (exp_a > exp_b) {
+            cc = CRF_GT;
+        } else {
+            cc = CRF_EQ;
+        }
+    }
+
+    env->fpscr &= ~(0x0F << FPSCR_FPRF);
+    env->fpscr |= cc << FPSCR_FPRF;
+    env->crf[BF(opcode)] = cc;
+
+    helper_float_check_status(env);
+}
+
 #define VSX_SCALAR_CMP(op, ordered)                                      \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
 {                                                                        \
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index da00f0a..ba42015 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -404,6 +404,8 @@ DEF_HELPER_2(xscmpeqdp, void, env, i32)
 DEF_HELPER_2(xscmpgtdp, void, env, i32)
 DEF_HELPER_2(xscmpgedp, void, env, i32)
 DEF_HELPER_2(xscmpnedp, void, env, i32)
+DEF_HELPER_2(xscmpexpdp, void, env, i32)
+DEF_HELPER_2(xscmpexpqp, void, env, i32)
 DEF_HELPER_2(xscmpodp, void, env, i32)
 DEF_HELPER_2(xscmpudp, void, env, i32)
 DEF_HELPER_2(xsmaxdp, void, env, i32)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 5a27be4..5206258 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -624,6 +624,8 @@ GEN_VSX_HELPER_2(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
+GEN_VSX_HELPER_2(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
+GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 3d91041..2468ee9 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -83,6 +83,10 @@ GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x01, opc3|0x0C, 0, PPC_NONE, PPC2_VSX),\
 GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x02, opc3|0x0C, 0, PPC_NONE, PPC2_VSX),\
 GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x03, opc3|0x0C, 0, PPC_NONE, PPC2_VSX)
 
+#define GEN_VSX_XFORM_300(name, opc2, opc3, inval) \
+GEN_HANDLER_E(name, 0x3F, opc2, opc3, inval, PPC_NONE, PPC2_ISA300)
+
+
 GEN_XX2FORM(xsabsdp, 0x12, 0x15, PPC2_VSX),
 GEN_XX2FORM(xsnabsdp, 0x12, 0x16, PPC2_VSX),
 GEN_XX2FORM(xsnegdp, 0x12, 0x17, PPC2_VSX),
@@ -118,6 +122,8 @@ GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
 GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
 GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
 GEN_XX3FORM(xscmpnedp, 0x0C, 0x03, PPC2_ISA300),
+GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
+GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
 GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
 GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
 GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 05/10] target-ppc: Add xscmpoqp and xscmpuqp instructions
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (3 preceding siblings ...)
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 04/10] target-ppc: Add xscmpexp[dp, qp] instructions Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 06/10] target-ppc: implement lxsd and lxssp instructions Nikunj A Dadhania
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

From: Bharata B Rao <bharata@linux.vnet.ibm.com>

xscmpoqp - VSX Scalar Compare Ordered Quad-Precision
xscmpuqp - VSX Scalar Compare Unordered Quad-Precision

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/fpu_helper.c             | 54 +++++++++++++++++++++++++++++++++++++
 target-ppc/helper.h                 |  2 ++
 target-ppc/translate/vsx-impl.inc.c |  2 ++
 target-ppc/translate/vsx-ops.inc.c  |  2 ++
 4 files changed, 60 insertions(+)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 8bffafb..696f537 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2519,6 +2519,60 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
 VSX_SCALAR_CMP(xscmpodp, 1)
 VSX_SCALAR_CMP(xscmpudp, 0)
 
+#define VSX_SCALAR_CMPQ(op, ordered)                                    \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+{                                                                       \
+    ppc_vsr_t xa, xb;                                                   \
+    uint32_t cc = 0;                                                    \
+    bool vxsnan_flag = false, vxvc_flag = false;                        \
+    float128 a, b;                                                      \
+                                                                        \
+    helper_reset_fpstatus(env);                                         \
+    getVSR(rA(opcode) + 32, &xa, env);                                  \
+    getVSR(rB(opcode) + 32, &xb, env);                                  \
+                                                                        \
+    a = make_float128(xa.VsrD(0), xa.VsrD(1));                          \
+    b = make_float128(xb.VsrD(0), xb.VsrD(1));                          \
+                                                                        \
+    if (float128_is_signaling_nan(a, &env->fp_status) ||                \
+        float128_is_signaling_nan(b, &env->fp_status)) {                \
+        vxsnan_flag = true;                                             \
+        cc = CRF_SO;                                                    \
+        if (fpscr_ve == 0 && ordered) {                                 \
+            vxvc_flag = true;                                           \
+        }                                                               \
+    } else if (float128_is_quiet_nan(a, &env->fp_status) ||             \
+               float128_is_quiet_nan(b, &env->fp_status)) {             \
+        cc = CRF_SO;                                                    \
+        if (ordered) {                                                  \
+            vxvc_flag = true;                                           \
+        }                                                               \
+    }                                                                   \
+    if (vxsnan_flag) {                                                  \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);          \
+    }                                                                   \
+    if (vxvc_flag) {                                                    \
+        float_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);            \
+    }                                                                   \
+                                                                        \
+    if (float128_lt(a, b, &env->fp_status)) {                           \
+        cc |= CRF_LT;                                                   \
+    } else if (!float128_le(a, b, &env->fp_status)) {                   \
+        cc |= CRF_GT;                                                   \
+    } else {                                                            \
+        cc |= CRF_EQ;                                                   \
+    }                                                                   \
+                                                                        \
+    env->fpscr &= ~(0x0F << FPSCR_FPRF);                                \
+    env->fpscr |= cc << FPSCR_FPRF;                                     \
+    env->crf[BF(opcode)] = cc;                                          \
+                                                                        \
+    float_check_status(env);                                            \
+}
+
+VSX_SCALAR_CMPQ(xscmpoqp, 1)
+VSX_SCALAR_CMPQ(xscmpuqp, 0)
+
 /* VSX_MAX_MIN - VSX floating point maximum/minimum
  *   name  - instruction mnemonic
  *   op    - operation (max or min)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index ba42015..3b26678 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -408,6 +408,8 @@ DEF_HELPER_2(xscmpexpdp, void, env, i32)
 DEF_HELPER_2(xscmpexpqp, void, env, i32)
 DEF_HELPER_2(xscmpodp, void, env, i32)
 DEF_HELPER_2(xscmpudp, void, env, i32)
+DEF_HELPER_2(xscmpoqp, void, env, i32)
+DEF_HELPER_2(xscmpuqp, void, env, i32)
 DEF_HELPER_2(xsmaxdp, void, env, i32)
 DEF_HELPER_2(xsmindp, void, env, i32)
 DEF_HELPER_2(xscvdpsp, void, env, i32)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 5206258..ed9588e 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -628,6 +628,8 @@ GEN_VSX_HELPER_2(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscmpoqp, 0x04, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscmpuqp, 0x04, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 2468ee9..7f09527 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -126,6 +126,8 @@ GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
 GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
 GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
 GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
+GEN_VSX_XFORM_300(xscmpoqp, 0x04, 0x04, 0x00600001),
+GEN_VSX_XFORM_300(xscmpuqp, 0x04, 0x14, 0x00600001),
 GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
 GEN_XX3FORM(xsmindp, 0x00, 0x15, PPC2_VSX),
 GEN_XX2FORM(xscvdpsp, 0x12, 0x10, PPC2_VSX),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 06/10] target-ppc: implement lxsd and lxssp instructions
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (4 preceding siblings ...)
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 05/10] target-ppc: Add xscmpoqp and xscmpuqp instructions Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 07/10] target-ppc: implement stxsd and stxssp Nikunj A Dadhania
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

lxsd: Load VSX Scalar Dword
lxssp: Load VSX Scalar Single

Moreover, DS-Form instructions shares the same primary opcode, bits
30:31 are used to decode the instruction. Use a common routine to decode
primary opcode(0x39) - ds-form instructions and branch-out depending on
bits 30:31.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 target-ppc/translate.c              | 25 +++++++++++++++++++++++++
 target-ppc/translate/fp-ops.inc.c   |  1 -
 target-ppc/translate/vsx-impl.inc.c | 21 +++++++++++++++++++++
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index d3bda1b..9114f85 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -6053,6 +6053,29 @@ GEN_TM_PRIV_NOOP(trechkpt);
 
 #include "translate/spe-impl.inc.c"
 
+/* Handles lfdp, lxsd, lxssp */
+static void gen_dform39(DisasContext *ctx)
+{
+    switch (ctx->opcode & 0x3) {
+    case 0: /* lfdp */
+        if (ctx->insns_flags2 & PPC2_ISA205) {
+            return gen_lfdp(ctx);
+        }
+        break;
+    case 2: /* lxsd */
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            return gen_lxsd(ctx);
+        }
+        break;
+    case 3: /* lxssp */
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            return gen_lxssp(ctx);
+        }
+        break;
+    }
+    return gen_invalid(ctx);
+}
+
 static opcode_t opcodes[] = {
 GEN_HANDLER(invalid, 0x00, 0x00, 0x00, 0xFFFFFFFF, PPC_NONE),
 GEN_HANDLER(cmp, 0x1F, 0x00, 0x00, 0x00400000, PPC_INTEGER),
@@ -6125,6 +6148,8 @@ GEN_HANDLER(ld, 0x3A, 0xFF, 0xFF, 0x00000000, PPC_64B),
 GEN_HANDLER(lq, 0x38, 0xFF, 0xFF, 0x00000000, PPC_64BX),
 GEN_HANDLER(std, 0x3E, 0xFF, 0xFF, 0x00000000, PPC_64B),
 #endif
+/* handles lfdp, lxsd, lxssp */
+GEN_HANDLER_E(dform39, 0x39, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER(lmw, 0x2E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(stmw, 0x2F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(lswi, 0x1F, 0x15, 0x12, 0x00000001, PPC_STRING),
diff --git a/target-ppc/translate/fp-ops.inc.c b/target-ppc/translate/fp-ops.inc.c
index d36ab4e..3127fa0 100644
--- a/target-ppc/translate/fp-ops.inc.c
+++ b/target-ppc/translate/fp-ops.inc.c
@@ -68,7 +68,6 @@ GEN_LDFS(lfd, ld64, 0x12, PPC_FLOAT)
 GEN_LDFS(lfs, ld32fs, 0x10, PPC_FLOAT)
 GEN_HANDLER_E(lfiwax, 0x1f, 0x17, 0x1a, 0x00000001, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(lfiwzx, 0x1f, 0x17, 0x1b, 0x1, PPC_NONE, PPC2_FP_CVT_ISA206),
-GEN_HANDLER_E(lfdp, 0x39, 0xFF, 0xFF, 0x00200003, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(lfdpx, 0x1F, 0x17, 0x18, 0x00200001, PPC_NONE, PPC2_ISA205),
 
 #define GEN_STF(name, stop, opc, type)                                        \
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index ed9588e..1d7cd23 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -190,6 +190,27 @@ static void gen_lxvb16x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+#define VSX_LOAD_SCALAR_DS(name, operation)                       \
+static void gen_##name(DisasContext *ctx)                         \
+{                                                                 \
+    TCGv EA;                                                      \
+    TCGv_i64 xth = cpu_vsrh(rD(ctx->opcode) + 32);                \
+                                                                  \
+    if (unlikely(!ctx->altivec_enabled)) {                        \
+        gen_exception(ctx, POWERPC_EXCP_VPU);                     \
+        return;                                                   \
+    }                                                             \
+    gen_set_access_type(ctx, ACCESS_INT);                         \
+    EA = tcg_temp_new();                                          \
+    gen_addr_imm_index(ctx, EA, 0x03);                            \
+    gen_qemu_##operation(ctx, xth, EA);                           \
+    /* NOTE: cpu_vsrl is undefined */                             \
+    tcg_temp_free(EA);                                            \
+}
+
+VSX_LOAD_SCALAR_DS(lxsd, ld64_i64)
+VSX_LOAD_SCALAR_DS(lxssp, ld32fs)
+
 #define VSX_STORE_SCALAR(name, operation)                     \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 07/10] target-ppc: implement stxsd and stxssp
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (5 preceding siblings ...)
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 06/10] target-ppc: implement lxsd and lxssp instructions Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 08/10] target-ppc: implement lxv/lxvx and stxv/stxvx Nikunj A Dadhania
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

stxsd:  Store VSX Scalar Dword
stxssp: Store VSX Scalar SP

Moreover, DQ-Form/DS-FORM instructions shares the same primary
opcode(0x3D). For DQ-FORM bits 29:31 are used, for DS-FORM bits 30:31
are used. Common routine to decode primary opcode(0x3D) -
ds-form/dq-form instructions is required.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c              | 34 ++++++++++++++++++++++++++++++++++
 target-ppc/translate/fp-ops.inc.c   |  1 -
 target-ppc/translate/vsx-impl.inc.c | 21 +++++++++++++++++++++
 3 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 9114f85..a1bf74c 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -6076,6 +6076,38 @@ static void gen_dform39(DisasContext *ctx)
     return gen_invalid(ctx);
 }
 
+/* handles stfdp, stxsd, stxssp */
+static void gen_dform3D(DisasContext *ctx)
+{
+    if ((ctx->opcode & 3) == 1) { /* DQ-FORM */
+        switch (ctx->opcode & 0x7) {
+        case 1: /* lxv */
+            break;
+        case 5: /* stxv */
+            break;
+        }
+    } else { /* DS-FORM */
+        switch (ctx->opcode & 0x3) {
+        case 0: /* stfdp */
+            if (ctx->insns_flags2 & PPC2_ISA205) {
+                return gen_stfdp(ctx);
+            }
+            break;
+        case 2: /* stxsd */
+            if (ctx->insns_flags2 & PPC2_ISA300) {
+                return gen_stxsd(ctx);
+            }
+            break;
+        case 3: /* stxssp */
+            if (ctx->insns_flags2 & PPC2_ISA300) {
+                return gen_stxssp(ctx);
+            }
+            break;
+        }
+    }
+    return gen_invalid(ctx);
+}
+
 static opcode_t opcodes[] = {
 GEN_HANDLER(invalid, 0x00, 0x00, 0x00, 0xFFFFFFFF, PPC_NONE),
 GEN_HANDLER(cmp, 0x1F, 0x00, 0x00, 0x00400000, PPC_INTEGER),
@@ -6150,6 +6182,8 @@ GEN_HANDLER(std, 0x3E, 0xFF, 0xFF, 0x00000000, PPC_64B),
 #endif
 /* handles lfdp, lxsd, lxssp */
 GEN_HANDLER_E(dform39, 0x39, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
+/* handles stfdp, stxsd, stxssp */
+GEN_HANDLER_E(dform3D, 0x3D, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER(lmw, 0x2E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(stmw, 0x2F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(lswi, 0x1F, 0x15, 0x12, 0x00000001, PPC_STRING),
diff --git a/target-ppc/translate/fp-ops.inc.c b/target-ppc/translate/fp-ops.inc.c
index 3127fa0..3c6d05a 100644
--- a/target-ppc/translate/fp-ops.inc.c
+++ b/target-ppc/translate/fp-ops.inc.c
@@ -87,7 +87,6 @@ GEN_STXF(name, stop, 0x17, op | 0x00, type)
 GEN_STFS(stfd, st64_i64, 0x16, PPC_FLOAT)
 GEN_STFS(stfs, st32fs, 0x14, PPC_FLOAT)
 GEN_STXF(stfiw, st32fiw, 0x17, 0x1E, PPC_FLOAT_STFIWX)
-GEN_HANDLER_E(stfdp, 0x3D, 0xFF, 0xFF, 0x00200003, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(stfdpx, 0x1F, 0x17, 0x1C, 0x00200001, PPC_NONE, PPC2_ISA205),
 
 GEN_HANDLER(frsqrtes, 0x3B, 0x1A, 0xFF, 0x001F07C0, PPC_FLOAT_FRSQRTES),
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 1d7cd23..8ee44cf 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -332,6 +332,27 @@ static void gen_stxvb16x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+#define VSX_STORE_SCALAR_DS(name, operation)                      \
+static void gen_##name(DisasContext *ctx)                         \
+{                                                                 \
+    TCGv EA;                                                      \
+    TCGv_i64 xth = cpu_vsrh(rD(ctx->opcode) + 32);                \
+                                                                  \
+    if (unlikely(!ctx->altivec_enabled)) {                        \
+        gen_exception(ctx, POWERPC_EXCP_VPU);                     \
+        return;                                                   \
+    }                                                             \
+    gen_set_access_type(ctx, ACCESS_INT);                         \
+    EA = tcg_temp_new();                                          \
+    gen_addr_imm_index(ctx, EA, 0x03);                            \
+    gen_qemu_##operation(ctx, xth, EA);                           \
+    /* NOTE: cpu_vsrl is undefined */                             \
+    tcg_temp_free(EA);                                            \
+}
+
+VSX_LOAD_SCALAR_DS(stxsd, st64_i64)
+VSX_LOAD_SCALAR_DS(stxssp, st32fs)
+
 #define MV_VSRW(name, tcgop1, tcgop2, target, source)           \
 static void gen_##name(DisasContext *ctx)                       \
 {                                                               \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 08/10] target-ppc: implement lxv/lxvx and stxv/stxvx
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (6 preceding siblings ...)
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 07/10] target-ppc: implement stxsd and stxssp Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions Nikunj A Dadhania
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata

lxv:  Load VSX Vector
lxvx: Load VSX Vector Indexed

    Little/Big-endian Storage
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

    Vector load results:
    BE:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

    LE:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |E7|E6|E5|E4|E3|E2|E1|E0|F7|F6|F5|F4|F3|F2|F1|F0|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

stxv: Store VSX Vector
stxvx: Store VSX Vector Indexed

    Vector (8-bit elements) in BE:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

    Vector (8-bit elements) in LE:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |E7|E6|E5|E4|E3|E2|E1|E0|F7|F6|F5|F4|F3|F2|F1|F0|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

    Store results in following:
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

--

Note: In LE mode, lxvx and stxvx is being replaced by the compiler with
lxvd2x and stxvd2x (ISA 3.0: page 489 and page 507). AFAIU, seems to be
wrong as lxvx does a 16byte read and does a byte-swap across 16bytes,
while lxvd2x, does two 8bytes read and byte-swaps 8bytes.
---
 target-ppc/internal.h               |  1 +
 target-ppc/translate.c              | 10 ++++++--
 target-ppc/translate/vsx-impl.inc.c | 50 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vsx-ops.inc.c  |  2 ++
 4 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/target-ppc/internal.h b/target-ppc/internal.h
index 9a4a74a..e83ea45 100644
--- a/target-ppc/internal.h
+++ b/target-ppc/internal.h
@@ -187,6 +187,7 @@ EXTRACT_HELPER(DCM, 10, 6)
 /* DFP Z23-form */
 EXTRACT_HELPER(RMC, 9, 2)
 
+EXTRACT_HELPER_SPLIT(DQxT, 3, 1, 21, 5);
 EXTRACT_HELPER_SPLIT(xT, 0, 1, 21, 5);
 EXTRACT_HELPER_SPLIT(xS, 0, 1, 21, 5);
 EXTRACT_HELPER_SPLIT(xA, 2, 1, 16, 5);
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index a1bf74c..f68f427 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -6076,14 +6076,20 @@ static void gen_dform39(DisasContext *ctx)
     return gen_invalid(ctx);
 }
 
-/* handles stfdp, stxsd, stxssp */
+/* handles stfdp, lxv, stxsd, stxssp lxvx */
 static void gen_dform3D(DisasContext *ctx)
 {
     if ((ctx->opcode & 3) == 1) { /* DQ-FORM */
         switch (ctx->opcode & 0x7) {
         case 1: /* lxv */
+            if (ctx->insns_flags2 & PPC2_ISA300) {
+                return gen_lxv(ctx);
+            }
             break;
         case 5: /* stxv */
+            if (ctx->insns_flags2 & PPC2_ISA300) {
+                return gen_stxv(ctx);
+            }
             break;
         }
     } else { /* DS-FORM */
@@ -6182,7 +6188,7 @@ GEN_HANDLER(std, 0x3E, 0xFF, 0xFF, 0x00000000, PPC_64B),
 #endif
 /* handles lfdp, lxsd, lxssp */
 GEN_HANDLER_E(dform39, 0x39, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
-/* handles stfdp, stxsd, stxssp */
+/* handles stfdp, lxv, stxsd, stxssp, stxv */
 GEN_HANDLER_E(dform3D, 0x3D, 0xFF, 0xFF, 0x00000000, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER(lmw, 0x2E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(stmw, 0x2F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 8ee44cf..2fbdbd2 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -190,6 +190,56 @@ static void gen_lxvb16x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
 
+#define VSX_VECTOR_LOAD_STORE(name, op, indexed)            \
+static void gen_##name(DisasContext *ctx)                   \
+{                                                           \
+    int xt;                                                 \
+    TCGv EA;                                                \
+    TCGv_i64 xth, xtl;                                      \
+                                                            \
+    if (indexed) {                                          \
+        xt = xT(ctx->opcode);                               \
+    } else {                                                \
+        xt = DQxT(ctx->opcode);                             \
+    }                                                       \
+    xth = cpu_vsrh(xt);                                     \
+    xtl = cpu_vsrl(xt);                                     \
+                                                            \
+    if (xt < 32) {                                          \
+        if (unlikely(!ctx->vsx_enabled)) {                  \
+            gen_exception(ctx, POWERPC_EXCP_VSXU);          \
+            return;                                         \
+        }                                                   \
+    } else {                                                \
+        if (unlikely(!ctx->altivec_enabled)) {              \
+            gen_exception(ctx, POWERPC_EXCP_VPU);           \
+            return;                                         \
+        }                                                   \
+    }                                                       \
+    gen_set_access_type(ctx, ACCESS_INT);                   \
+    EA = tcg_temp_new();                                    \
+    if (indexed) {                                          \
+        gen_addr_reg_index(ctx, EA);                        \
+    } else {                                                \
+        gen_addr_imm_index(ctx, EA, 0x0F);                  \
+    }                                                       \
+    if (ctx->le_mode) {                                     \
+        tcg_gen_qemu_##op(xtl, EA, ctx->mem_idx, MO_LEQ);   \
+        tcg_gen_addi_tl(EA, EA, 8);                         \
+        tcg_gen_qemu_##op(xth, EA, ctx->mem_idx, MO_LEQ);   \
+    } else {                                                \
+        tcg_gen_qemu_##op(xth, EA, ctx->mem_idx, MO_BEQ);   \
+        tcg_gen_addi_tl(EA, EA, 8);                         \
+        tcg_gen_qemu_##op(xtl, EA, ctx->mem_idx, MO_BEQ);   \
+    }                                                       \
+    tcg_temp_free(EA);                                      \
+}
+
+VSX_VECTOR_LOAD_STORE(lxv, ld_i64, 0)
+VSX_VECTOR_LOAD_STORE(stxv, st_i64, 0)
+VSX_VECTOR_LOAD_STORE(lxvx, ld_i64, 1)
+VSX_VECTOR_LOAD_STORE(stxvx, st_i64, 1)
+
 #define VSX_LOAD_SCALAR_DS(name, operation)                       \
 static void gen_##name(DisasContext *ctx)                         \
 {                                                                 \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 7f09527..8a1cbe0 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -9,6 +9,7 @@ GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvh8x, 0x1F, 0x0C, 0x19, 0, PPC_NONE,  PPC2_ISA300),
 GEN_HANDLER_E(lxvb16x, 0x1F, 0x0C, 0x1B, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(lxvx, 0x1F, 0x0C, 0x08, 0x00000040, PPC_NONE, PPC2_ISA300),
 
 GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxsibx, 0x1F, 0xD, 0x1C, 0, PPC_NONE, PPC2_ISA300),
@@ -19,6 +20,7 @@ GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxvh8x, 0x1F, 0x0C, 0x1D, 0, PPC_NONE,  PPC2_ISA300),
 GEN_HANDLER_E(stxvb16x, 0x1F, 0x0C, 0x1F, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(stxvx, 0x1F, 0x0C, 0x0C, 0, PPC_NONE, PPC2_ISA300),
 
 GEN_HANDLER_E(mfvsrwz, 0x1F, 0x13, 0x03, 0x0000F800, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(mtvsrwa, 0x1F, 0x13, 0x06, 0x0000F800, PPC_NONE, PPC2_VSX207),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (7 preceding siblings ...)
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 08/10] target-ppc: implement lxv/lxvx and stxv/stxvx Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-24  1:02   ` David Gibson
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 10/10] target-ppc: add vextu[bhw]rx instructions Nikunj A Dadhania
  2016-11-24  0:56 ` [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 David Gibson
  10 siblings, 1 reply; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, bharata, Avinesh Kumar

From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>

vextublx:  Vector Extract Unsigned Byte Left
vextuhlx:  Vector Extract Unsigned Halfword Left
vextuwlx:  Vector Extract Unsigned Word Left

Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  3 ++
 target-ppc/int_helper.c             | 65 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c | 18 ++++++++++
 target-ppc/translate/vmx-ops.inc.c  |  4 ++-
 4 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 3b26678..d0a8fb2 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -366,6 +366,9 @@ DEF_HELPER_3(vpmsumb, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumh, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumw, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
+DEF_HELPER_2(vextublx, tl, tl, avr)
+DEF_HELPER_2(vextuhlx, tl, tl, avr)
+DEF_HELPER_2(vextuwlx, tl, tl, avr)
 
 DEF_HELPER_2(vsbox, void, avr, avr)
 DEF_HELPER_3(vcipher, void, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index fbf477f..ce6cff1 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1805,6 +1805,71 @@ void helper_vlogefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
     }
 }
 
+#ifdef CONFIG_INT128
+#define EXTRACT128(value, start, length)                      \
+    ((value >> start) & (~(__uint128_t)0 >> (128 - length)))
+#endif
+
+#if defined(HOST_WORDS_BIGENDIAN)
+#  if defined(CONFIG_INT128)
+#  define VEXTULX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int index = (a & 0xf) * 8;                                  \
+    r = EXTRACT128(b->u128, index, elem * 8);                   \
+    return r;                                                   \
+}
+#  else
+#  define VEXTULX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int i;                                                      \
+    int index = a & 0xf;                                        \
+    for (i = 0; i < elem; i++) {                                \
+        r = r << 8;                                             \
+        if (index + i <= 15) {                                  \
+            r = r | b->u8[index + i];                           \
+        }                                                       \
+    }                                                           \
+    return r;                                                   \
+}
+#  endif
+#else
+#  if defined(CONFIG_INT128)
+#  define VEXTULX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int size =  elem * 8;                                       \
+    int index = (15 - (a & 0xf) + 1) * 8;                       \
+    r = EXTRACT128(b->u128, (index - size), size);              \
+    return r;                                                   \
+}
+#  else
+#  define VEXTULX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int i;                                                      \
+    int index = 15 - (a & 0xf);                                 \
+    for (i = 0; i < elem; i++) {                                \
+        r = r << 8;                                             \
+        if (index - i >= 0) {                                   \
+            r = r | b->u8[index - i];                           \
+        }                                                       \
+    }                                                           \
+    return r;                                                   \
+}
+#  endif
+#endif
+
+VEXTULX_DO(vextublx, 1)
+VEXTULX_DO(vextuhlx, 2)
+VEXTULX_DO(vextuwlx, 4)
+#undef VEXTULX_DO
+
 /* The specification says that the results are undefined if all of the
  * shift counts are not identical.  We check to make sure that they are
  * to conform to what real hardware appears to do.  */
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 7143eb3..e91d10b 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -340,6 +340,19 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx)              \
     }                                                                   \
 }
 
+#define GEN_VXFORM_HETRO(name, opc2, opc3)                              \
+static void glue(gen_, name)(DisasContext *ctx)                         \
+{                                                                       \
+    TCGv_ptr rb;                                                        \
+    if (unlikely(!ctx->altivec_enabled)) {                              \
+        gen_exception(ctx, POWERPC_EXCP_VPU);                           \
+        return;                                                         \
+    }                                                                   \
+    rb = gen_avr_ptr(rB(ctx->opcode));                                  \
+    gen_helper_##name(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)], rb); \
+    tcg_temp_free_ptr(rb);                                              \
+}
+
 GEN_VXFORM(vaddubm, 0, 0);
 GEN_VXFORM_DUAL_EXT(vaddubm, PPC_ALTIVEC, PPC_NONE, 0,       \
                     vmul10cuq, PPC_NONE, PPC2_ISA300, 0x0000F800)
@@ -525,6 +538,11 @@ GEN_VXFORM_ENV(vaddfp, 5, 0);
 GEN_VXFORM_ENV(vsubfp, 5, 1);
 GEN_VXFORM_ENV(vmaxfp, 5, 16);
 GEN_VXFORM_ENV(vminfp, 5, 17);
+GEN_VXFORM_HETRO(vextublx, 6, 24)
+GEN_VXFORM_HETRO(vextuhlx, 6, 25)
+GEN_VXFORM_HETRO(vextuwlx, 6, 26)
+GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207,
+                vextuwlx, PPC_NONE, PPC2_ISA300)
 
 #define GEN_VXRFORM1(opname, name, str, opc2, opc3)                     \
 static void glue(gen_, name)(DisasContext *ctx)                         \
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index f02b3be..e62e564 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -91,8 +91,10 @@ GEN_VXFORM(vmrghw, 6, 2),
 GEN_VXFORM(vmrglb, 6, 4),
 GEN_VXFORM(vmrglh, 6, 5),
 GEN_VXFORM(vmrglw, 6, 6),
+GEN_VXFORM_300(vextublx, 6, 24),
+GEN_VXFORM_300(vextuhlx, 6, 25),
+GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM_207(vmrgew, 6, 30),
-GEN_VXFORM_207(vmrgow, 6, 26),
 GEN_VXFORM(vmuloub, 4, 0),
 GEN_VXFORM(vmulouh, 4, 1),
 GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH v1 10/10] target-ppc: add vextu[bhw]rx instructions
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (8 preceding siblings ...)
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions Nikunj A Dadhania
@ 2016-11-23 11:37 ` Nikunj A Dadhania
  2016-11-24  0:56 ` [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 David Gibson
  10 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-23 11:37 UTC (permalink / raw)
  To: qemu-ppc, david, rth
  Cc: qemu-devel, nikunj, bharata, Hariharan T.S, Avinesh Kumar

From: "Hariharan T.S" <hari@linux.vnet.ibm.com>

vextubrx: Vector Extract Unsigned Byte Right-Indexed VX-form
vextuhrx: Vector Extract Unsigned  Halfword Right-Indexed VX-form
vextuwrx: Vector Extract Unsigned Word Right-Indexed VX-form

Signed-off-by: Hariharan T.S. <hari@linux.vnet.ibm.com>
Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  3 ++
 target-ppc/int_helper.c             | 60 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  5 ++++
 target-ppc/translate/vmx-ops.inc.c  |  4 ++-
 4 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index d0a8fb2..a6e04cb 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -369,6 +369,9 @@ DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
 DEF_HELPER_2(vextublx, tl, tl, avr)
 DEF_HELPER_2(vextuhlx, tl, tl, avr)
 DEF_HELPER_2(vextuwlx, tl, tl, avr)
+DEF_HELPER_2(vextubrx, tl, tl, avr)
+DEF_HELPER_2(vextuhrx, tl, tl, avr)
+DEF_HELPER_2(vextuwrx, tl, tl, avr)
 
 DEF_HELPER_2(vsbox, void, avr, avr)
 DEF_HELPER_3(vcipher, void, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index ce6cff1..7f8b45d 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1870,6 +1870,66 @@ VEXTULX_DO(vextuhlx, 2)
 VEXTULX_DO(vextuwlx, 4)
 #undef VEXTULX_DO
 
+#if defined(HOST_WORDS_BIGENDIAN)
+#  if defined(CONFIG_INT128)
+#  define VEXTURX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int size =  elem * 8;                                       \
+    int index = (15 - (a & 0xf) + 1) * 8;                       \
+    r = EXTRACT128(b->u128, (index - size), size);              \
+    return r;                                                   \
+}
+#  else
+#  define VEXTURX_DO(name, elem)                                        \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)          \
+{                                                                       \
+    target_ulong r = 0;                                                 \
+    int i;                                                              \
+    int index = a & 0xf;                                                \
+    for (i = elem - 1; i >= 0; i--) {                                   \
+        r = r << 8;                                                     \
+        if ((15 - i - index) >= 0) {                                    \
+            r = r | b->u8[15 - i - index];                              \
+        }                                                               \
+    }                                                                   \
+    return r;                                                           \
+}
+#  endif
+#else
+#  if defined(CONFIG_INT128)
+#  define VEXTURX_DO(name, elem)                                \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{                                                               \
+    target_ulong r = 0;                                         \
+    int index = (a & 0xf) * 8;                                  \
+    r = EXTRACT128(b->u128, index, elem * 8);                   \
+    return r;                                                   \
+}
+#  else
+#  define VEXTURX_DO(name, elem)                                        \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)          \
+{                                                                       \
+    target_ulong r = 0;                                                 \
+    int i;                                                              \
+    int index = 15 - (a & 0xf);                                         \
+    for (i = elem - 1; i >= 0; i--) {                                   \
+        r = r << 8;                                                     \
+        if ((15 + i - index) <= 15) {                                   \
+            r = r | b->u8[15 + i - index];                              \
+        }                                                               \
+    }                                                                   \
+    return r;                                                           \
+}
+#  endif
+#endif
+
+VEXTURX_DO(vextubrx, 1)
+VEXTURX_DO(vextuhrx, 2)
+VEXTURX_DO(vextuwrx, 4)
+#undef VEXTURX_DO
+
 /* The specification says that the results are undefined if all of the
  * shift counts are not identical.  We check to make sure that they are
  * to conform to what real hardware appears to do.  */
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index e91d10b..3dea465 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -543,6 +543,11 @@ GEN_VXFORM_HETRO(vextuhlx, 6, 25)
 GEN_VXFORM_HETRO(vextuwlx, 6, 26)
 GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207,
                 vextuwlx, PPC_NONE, PPC2_ISA300)
+GEN_VXFORM_HETRO(vextubrx, 6, 28)
+GEN_VXFORM_HETRO(vextuhrx, 6, 29)
+GEN_VXFORM_HETRO(vextuwrx, 6, 30)
+GEN_VXFORM_DUAL(vmrgew, PPC_NONE, PPC2_ALTIVEC_207, \
+                vextuwrx, PPC_NONE, PPC2_ISA300)
 
 #define GEN_VXRFORM1(opname, name, str, opc2, opc3)                     \
 static void glue(gen_, name)(DisasContext *ctx)                         \
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index e62e564..a3c9d05 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -94,7 +94,9 @@ GEN_VXFORM(vmrglw, 6, 6),
 GEN_VXFORM_300(vextublx, 6, 24),
 GEN_VXFORM_300(vextuhlx, 6, 25),
 GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM_207(vmrgew, 6, 30),
+GEN_VXFORM_300(vextubrx, 6, 28),
+GEN_VXFORM_300(vextuhrx, 6, 29),
+GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM(vmuloub, 4, 0),
 GEN_VXFORM(vmulouh, 4, 1),
 GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v1 02/10] target-ppc: rename CRF_* defines as CRF_*_BIT
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 02/10] target-ppc: rename CRF_* defines as CRF_*_BIT Nikunj A Dadhania
@ 2016-11-24  0:23   ` David Gibson
  0 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2016-11-24  0:23 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata

[-- Attachment #1: Type: text/plain, Size: 477 bytes --]

On Wed, Nov 23, 2016 at 05:07:11PM +0530, Nikunj A Dadhania wrote:
> Add _BIT to CRF_[GT,LT,EQ_SO] and introduce CRF_[GT,LT,EQ,SO] for usage
> without shifts in the code. This would simplify the code.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

Nice!

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8
  2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
                   ` (9 preceding siblings ...)
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 10/10] target-ppc: add vextu[bhw]rx instructions Nikunj A Dadhania
@ 2016-11-24  0:56 ` David Gibson
  10 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2016-11-24  0:56 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata

[-- Attachment #1: Type: text/plain, Size: 3307 bytes --]

On Wed, Nov 23, 2016 at 05:07:09PM +0530, Nikunj A Dadhania wrote:
> This series contains 18 new instructions for POWER9 ISA3.0
>     Vector Extract Left/Right Indexed
>     VSX Scalar Compare Exponents
>     VSX Scalar Compare Quad-Precision
>     Load/Store VSX Vector 
>     Load/Store VSX Scalar
> 
> Changelog:
> v0:
> * Change dq/ds-form decoding for primary opcode 0x3D
> * Rename CR Field defines, as at every place it was
>   using bit shifts.
> * Use symbolic constants in xscmp*
> * Fix bug in exception handling for QNaN
> * Define EXTRACT128 within CONFIG_INT128

I've applied patches 1..8 to ppc-for-2.9.  Patches 9 & 10 can still do
with some improvement, I think.

> 
> Patches
> =======
> 01-03: Consolidation/Fixes
>    04: 
>       xscmpexpdp: VSX Scalar Compare Exponents Double-Precision
>       xscmpexpqp: VSX Scalar Compare Exponents Quad-Precision
>    05:
>       xscmpoqp: VSX Scalar Compare Ordered Quad-Precision
>       xscmpuqp: VSX Scalar Compare Unordered Quad-Precision
>    06:
>       lxsd:  Load VSX Scalar Dword
>       lxssp: Load VSX Scalar Single Precision
>    07:
>       stxsd:  Store VSX Scalar Dword
>       stxssp: Store VSX Scalar Single Precision
>    08:
>       lxv:   Load VSX Vector
>       lxvx:  Load VSX Vector Indexed
>       stxv:  Store VSX Vector
>       stxvx: Store VSX Vector Indexed
>    09: 
>       vextublx:  Vector Extract Unsigned Byte Left
>       vextuhlx:  Vector Extract Unsigned Halfword Left
>       vextuwlx:  Vector Extract Unsigned Word Left
>    10: 
>       vextubrx: Vector Extract Unsigned Byte Right-Indexed
>       vextuhrx: Vector Extract Unsigned  Halfword Right-Indexed
>       vextuwrx: Vector Extract Unsigned Word Right-Indexed
> 
> Avinesh Kumar (1):
>   target-ppc: add vextu[bhw]lx instructions
> 
> Bharata B Rao (4):
>   target-ppc: Consolidate instruction decode helpers
>   target-ppc: Fix xscmpodp and xscmpudp instructions
>   target-ppc: Add xscmpexp[dp,qp] instructions
>   target-ppc: Add xscmpoqp and xscmpuqp instructions
> 
> Hariharan T.S (1):
>   target-ppc: add vextu[bhw]rx instructions
> 
> Nikunj A Dadhania (4):
>   target-ppc: rename CRF_* defines as CRF_*_BIT
>   target-ppc: implement lxsd and lxssp instructions
>   target-ppc: implement stxsd and stxssp
>   target-ppc: implement lxv/lxvx and stxv/stxvx
> 
>  target-ppc/cpu.h                    |  21 ++--
>  target-ppc/fpu_helper.c             | 169 ++++++++++++++++++++++----
>  target-ppc/helper.h                 |  10 ++
>  target-ppc/int_helper.c             | 155 +++++++++++++++++++++---
>  target-ppc/internal.h               | 152 ++++++++++++++++++++++++
>  target-ppc/translate.c              | 230 +++++++++++-------------------------
>  target-ppc/translate/fp-ops.inc.c   |   2 -
>  target-ppc/translate/vmx-impl.inc.c |  23 ++++
>  target-ppc/translate/vmx-ops.inc.c  |   8 +-
>  target-ppc/translate/vsx-impl.inc.c |  96 +++++++++++++++
>  target-ppc/translate/vsx-ops.inc.c  |  10 ++
>  11 files changed, 666 insertions(+), 210 deletions(-)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions
  2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions Nikunj A Dadhania
@ 2016-11-24  1:02   ` David Gibson
  2016-11-24  5:53     ` Nikunj A Dadhania
  0 siblings, 1 reply; 17+ messages in thread
From: David Gibson @ 2016-11-24  1:02 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata, Avinesh Kumar

[-- Attachment #1: Type: text/plain, Size: 1233 bytes --]

On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
> 
> vextublx:  Vector Extract Unsigned Byte Left
> vextuhlx:  Vector Extract Unsigned Halfword Left
> vextuwlx:  Vector Extract Unsigned Word Left
> 
> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

So, when I suggested doing these without helpers before, I had
forgotten that the non-byte versions can straddle the word boundary.
Given that the offset is in a register, not the instruction that does
make it complicated.

But, this version also relies on working 128-bit arithmetic, AFAICT
this will just fail to build if CONFIG_INT128 isn't defined.  It
really shouldn't be that hard to make a helper that works just in
terms of 64-bit arithmetic - there are only 3 cases (all in the upper
word, all in the lower, and straddling).  I'd prefer to see it done
that way, rather than increasing reliance on CONFIG_INT128.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions
  2016-11-24  1:02   ` David Gibson
@ 2016-11-24  5:53     ` Nikunj A Dadhania
  2016-11-24  8:14       ` Richard Henderson
  0 siblings, 1 reply; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-24  5:53 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, rth, qemu-devel, bharata, Avinesh Kumar

David Gibson <david@gibson.dropbear.id.au> writes:

> [ Unknown signature status ]
> On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
>> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>> 
>> vextublx:  Vector Extract Unsigned Byte Left
>> vextuhlx:  Vector Extract Unsigned Halfword Left
>> vextuwlx:  Vector Extract Unsigned Word Left
>> 
>> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>
> So, when I suggested doing these without helpers before, I had
> forgotten that the non-byte versions can straddle the word boundary.
> Given that the offset is in a register, not the instruction that does
> make it complicated.
>
> But, this version also relies on working 128-bit arithmetic, AFAICT
> this will just fail to build if CONFIG_INT128 isn't defined.

It has both the implementation, just that the defines might have
confused you:

#if defined(HOST_WORDS_BIGENDIAN)

#  if defined(CONFIG_INT128)
#  else
#  endif

#else /* !defined (HOST_WORDS_BIGENDIAN) */

#  if defined(CONFIG_INT128)
#  else
#  endif

#endif 

> It really shouldn't be that hard to make a helper that works just in
> terms of 64-bit arithmetic - there are only 3 cases (all in the upper
> word, all in the lower, and straddling).

Currently, its being done using byte array.

 +{                                                               \
 +    target_ulong r = 0;                                         \
 +    int i;                                                      \
 +    int index = a & 0xf;                                        \
 +    for (i = 0; i < elem; i++) {                                \
 +        r = r << 8;                                             \
 +        if (index + i <= 15) {                                  \
 +            r = r | b->u8[index + i];                           \
 +        }                                                       \
 +    }                                                           \
 +    return r;                                                   \
 +}

> I'd prefer to see it done that way, rather than increasing reliance on
> CONFIG_INT128.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions
  2016-11-24  5:53     ` Nikunj A Dadhania
@ 2016-11-24  8:14       ` Richard Henderson
  2016-11-24  8:22         ` Nikunj A Dadhania
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Henderson @ 2016-11-24  8:14 UTC (permalink / raw)
  To: Nikunj A Dadhania, David Gibson
  Cc: qemu-ppc, qemu-devel, bharata, Avinesh Kumar

On 11/24/2016 06:53 AM, Nikunj A Dadhania wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
>
>> [ Unknown signature status ]
>> On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
>>> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>>
>>> vextublx:  Vector Extract Unsigned Byte Left
>>> vextuhlx:  Vector Extract Unsigned Halfword Left
>>> vextuwlx:  Vector Extract Unsigned Word Left
>>>
>>> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>>
>> So, when I suggested doing these without helpers before, I had
>> forgotten that the non-byte versions can straddle the word boundary.
>> Given that the offset is in a register, not the instruction that does
>> make it complicated.
>>
>> But, this version also relies on working 128-bit arithmetic, AFAICT
>> this will just fail to build if CONFIG_INT128 isn't defined.
>
> It has both the implementation, just that the defines might have
> confused you:
>
> #if defined(HOST_WORDS_BIGENDIAN)
>
> #  if defined(CONFIG_INT128)
> #  else
> #  endif
>
> #else /* !defined (HOST_WORDS_BIGENDIAN) */
>
> #  if defined(CONFIG_INT128)
> #  else
> #  endif
>
> #endif

In include/qemu/int128.h, we do have int128_rshift.  So you don't *really* have 
to do this by hand, exactly.


r~

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions
  2016-11-24  8:14       ` Richard Henderson
@ 2016-11-24  8:22         ` Nikunj A Dadhania
  0 siblings, 0 replies; 17+ messages in thread
From: Nikunj A Dadhania @ 2016-11-24  8:22 UTC (permalink / raw)
  To: Richard Henderson, David Gibson
  Cc: qemu-ppc, qemu-devel, bharata, Avinesh Kumar

Richard Henderson <rth@twiddle.net> writes:

> On 11/24/2016 06:53 AM, Nikunj A Dadhania wrote:
>> David Gibson <david@gibson.dropbear.id.au> writes:
>>
>>> [ Unknown signature status ]
>>> On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
>>>> From: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>>>
>>>> vextublx:  Vector Extract Unsigned Byte Left
>>>> vextuhlx:  Vector Extract Unsigned Halfword Left
>>>> vextuwlx:  Vector Extract Unsigned Word Left
>>>>
>>>> Signed-off-by: Avinesh Kumar <avinesku@linux.vnet.ibm.com>
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>>>
>>> So, when I suggested doing these without helpers before, I had
>>> forgotten that the non-byte versions can straddle the word boundary.
>>> Given that the offset is in a register, not the instruction that does
>>> make it complicated.
>>>
>>> But, this version also relies on working 128-bit arithmetic, AFAICT
>>> this will just fail to build if CONFIG_INT128 isn't defined.
>>
>> It has both the implementation, just that the defines might have
>> confused you:
>>
>> #if defined(HOST_WORDS_BIGENDIAN)
>>
>> #  if defined(CONFIG_INT128)
>> #  else
>> #  endif
>>
>> #else /* !defined (HOST_WORDS_BIGENDIAN) */
>>
>> #  if defined(CONFIG_INT128)
>> #  else
>> #  endif
>>
>> #endif
>
> In include/qemu/int128.h, we do have int128_rshift.  So you don't *really* have 
> to do this by hand, exactly.

Sure, let me add int128_extract as well. Will be helpful.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-11-24  8:23 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-23 11:37 [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 01/10] target-ppc: Consolidate instruction decode helpers Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 02/10] target-ppc: rename CRF_* defines as CRF_*_BIT Nikunj A Dadhania
2016-11-24  0:23   ` David Gibson
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 03/10] target-ppc: Fix xscmpodp and xscmpudp instructions Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 04/10] target-ppc: Add xscmpexp[dp, qp] instructions Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 05/10] target-ppc: Add xscmpoqp and xscmpuqp instructions Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 06/10] target-ppc: implement lxsd and lxssp instructions Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 07/10] target-ppc: implement stxsd and stxssp Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 08/10] target-ppc: implement lxv/lxvx and stxv/stxvx Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions Nikunj A Dadhania
2016-11-24  1:02   ` David Gibson
2016-11-24  5:53     ` Nikunj A Dadhania
2016-11-24  8:14       ` Richard Henderson
2016-11-24  8:22         ` Nikunj A Dadhania
2016-11-23 11:37 ` [Qemu-devel] [PATCH v1 10/10] target-ppc: add vextu[bhw]rx instructions Nikunj A Dadhania
2016-11-24  0:56 ` [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8 David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.