* [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3
@ 2013-10-24 16:16 Tom Musta
  2013-10-24 16:17 ` [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX Tom Musta
                   ` (18 more replies)
  0 siblings, 19 replies; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:16 UTC
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This is the third series of patches to add PowerPC VSX emulation support
to QEMU.

This series adds the floating point arithmetic, compare, conversion and
rounding instructions.  Instructions are implemented using helpers and
wherever practical, existing floating point code such as the softfloat
library and the existing PowerPC floating point helper code.

As with the previous series, the Power ISA V2.06 instructions are added
but the V2.07 instructions are not.  The latter will be implemented in a
future patch series.

Tom Musta (19):
   Add New softfloat Routines for VSX
   Add set_fprf Argument to fload_invalid_op_excp()
   General Support for VSX Helpers
   Add VSX ISA2.06 xadd Instructions
   Add VSX ISA2.06 xsub Instructions
   Add VSX ISA2.06 xmul Instructions
   Add VSX ISA2.06 xdiv Instructions
   Add VSX ISA2.06 xre Instructions
   Add VSX ISA2.06 xsqrt Instructions
   Add VSX ISA2.06 xrsqrte Instructions
   Add VSX ISA2.06 xtdiv Instructions
   Add VSX ISA2.06 xtsqrt Instructions
   Add VSX ISA2.06 Multiply Add Instructions
   Add VSX xscmp*dp Instructions
   Add VSX xmax/xmin Instructions
   Add VSX Vector Compare Instructions
   Add VSX Floating Point to Floating Point Conversion Instructions
   Add VSX ISA2.06 Integer Conversion Instructions
   Add VSX Rounding Instructions

  fpu/softfloat.c         |   45 ++
  include/fpu/softfloat.h |   22 +
  target-ppc/fpu_helper.c | 1151 +++++++++++++++++++++++++++++++++++++++++++++--
  target-ppc/helper.h     |  109 +++++
  target-ppc/translate.c  |  243 ++++++++++
  5 files changed, 1522 insertions(+), 48 deletions(-)


* [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
@ 2013-10-24 16:17 ` Tom Musta
  2013-10-24 18:34   ` Richard Henderson
                     ` (2 more replies)
  2013-10-24 16:18 ` [Qemu-devel] [PATCH 02/19] Add set_fprf Argument to fload_invalid_op_excp() Tom Musta
                   ` (17 subsequent siblings)
  18 siblings, 3 replies; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:17 UTC
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds routines to the softfloat library that are useful for
the PowerPC VSX implementation.  The routines are, however, not specific
to PowerPC and are appropriate for softfloat.

The following routines are added:

   - float32_is_denormal() returns true if the 32-bit floating point number
     is denormalized.
   - float64_is_denormal() returns true if the 64-bit floating point number
     is denormalized.
   - float32_get_unbiased_exp() returns the unbiased exponent of a 32-bit
     floating point number.
   - float64_get_unbiased_exp() returns the unbiased exponent of a 64-bit
     floating point number.
   - float32_to_uint64() converts a 32-bit floating point number to an
     unsigned 64-bit number.
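
As a rough illustration of the first four routines, the sketch below
applies the same bit tests to a raw 32-bit pattern.  It is not the patch's
code: the sketch_* names are hypothetical, and the functions take a plain
uint32_t holding the IEEE 754 binary32 bits rather than softfloat's float32
type.

```c
#include <stdint.h>

/* Hypothetical stand-alone helpers for illustration only.  They operate on
 * the raw IEEE 754 binary32 bit pattern, not on softfloat's float32 type. */

/* A denormal has an all-zero exponent field and a non-zero fraction. */
static int sketch_f32_is_denormal(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0 && (bits & 0x007fffffu) != 0;
}

/* Extract the 8-bit biased exponent and subtract the binary32 bias (127). */
static int sketch_f32_unbiased_exp(uint32_t bits)
{
    return (int)((bits >> 23) & 0xFF) - 127;
}
```

For example, the pattern 0x3f800000 (1.0) has an unbiased exponent of 0 and
is not denormal, while 0x00000001 (the smallest positive denormal) reports
an unbiased exponent of -127 because its exponent field is zero.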

Note that this patch depends on a previously submitted patch that fixes
the float64_to_uint64 conversion routine; see
http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg02622.html
for details.

This contribution can be licensed under either the softfloat-2a or -2b
license.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  fpu/softfloat.c         |   45 +++++++++++++++++++++++++++++++++++++++++++++
  include/fpu/softfloat.h |   22 ++++++++++++++++++++++
  2 files changed, 67 insertions(+), 0 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 3070eaa..cb03dca 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1550,6 +1550,51 @@ int64 float32_to_int64( float32 a STATUS_PARAM )

  /*----------------------------------------------------------------------------
  | Returns the result of converting the single-precision floating-point value
+| `a' to the 64-bit unsigned integer format.  The conversion is
+| performed according to the IEC/IEEE Standard for Binary Floating-Point
+| Arithmetic---which means in particular that the conversion is rounded
+| according to the current rounding mode.  If `a' is a NaN, the largest
+| unsigned integer is returned.  Otherwise, if the conversion overflows, the
+| largest unsigned integer is returned.  If `a' is negative, zero is
+| returned.
+*----------------------------------------------------------------------------*/
+
+uint64 float32_to_uint64(float32 a STATUS_PARAM)
+{
+    flag aSign;
+    int_fast16_t aExp, shiftCount;
+    uint32_t aSig;
+    uint64_t aSig64, aSigExtra;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+
+    aSig = extractFloat32Frac(a);
+    aExp = extractFloat32Exp(a);
+    aSign = extractFloat32Sign(a);
+    if (aSign) {
+        if (aExp) {
+            float_raise(float_flag_invalid STATUS_VAR);
+        } else if (aSig) { /* negative denormalized */
+            float_raise(float_flag_inexact STATUS_VAR);
+        }
+        return 0;
+    }
+    shiftCount = 0xBE - aExp;
+    if (aExp) {
+        aSig |= 0x00800000;
+    }
+    if (shiftCount < 0) {
+        float_raise(float_flag_invalid STATUS_VAR);
+        return (int64_t)LIT64(0xFFFFFFFFFFFFFFFF);
+    }
+
+    aSig64 = aSig;
+    aSig64 <<= 40;
+    shift64ExtraRightJamming(aSig64, 0, shiftCount, &aSig64, &aSigExtra);
+    return roundAndPackUint64(aSig64, aSigExtra STATUS_VAR);
+}
+
+/*----------------------------------------------------------------------------
+| Returns the result of converting the single-precision floating-point value
  | `a' to the 64-bit two's complement integer format.  The conversion is
  | performed according to the IEC/IEEE Standard for Binary Floating-Point
  | Arithmetic, except that the conversion is always rounded toward zero.  If
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index f3927e2..678e527 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -272,6 +272,7 @@ int32 float32_to_int32_round_to_zero( float32 STATUS_PARAM );
  uint32 float32_to_uint32( float32 STATUS_PARAM );
  uint32 float32_to_uint32_round_to_zero( float32 STATUS_PARAM );
  int64 float32_to_int64( float32 STATUS_PARAM );
+uint64 float32_to_uint64(float32 STATUS_PARAM);
  int64 float32_to_int64_round_to_zero( float32 STATUS_PARAM );
  float64 float32_to_float64( float32 STATUS_PARAM );
  floatx80 float32_to_floatx80( float32 STATUS_PARAM );
@@ -348,6 +349,12 @@ INLINE int float32_is_zero_or_denormal(float32 a)
      return (float32_val(a) & 0x7f800000) == 0;
  }

+INLINE int float32_is_denormal(float32 a)
+{
+    return ((float32_val(a) & 0x7f800000) == 0) &&
+           ((float32_val(a) & 0x007fffff) != 0);
+}
+
  INLINE float32 float32_set_sign(float32 a, int sign)
  {
      return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31));
@@ -360,6 +367,10 @@ INLINE float32 float32_set_sign(float32 a, int sign)
  #define float32_half make_float32(0x3f000000)
  #define float32_infinity make_float32(0x7f800000)

+INLINE int float32_get_unbiased_exp(float32 f)
+{
+    return ((f >> 23) & 0xFF) - 127;
+}

  /*----------------------------------------------------------------------------
  | The pattern for a default generated single-precision NaN.
@@ -454,6 +465,12 @@ INLINE int float64_is_zero_or_denormal(float64 a)
      return (float64_val(a) & 0x7ff0000000000000LL) == 0;
  }

+INLINE int float64_is_denormal(float64 a)
+{
+    return ((float64_val(a) & 0x7ff0000000000000LL) == 0) &&
+           ((float64_val(a) & 0x000fffffffffffffLL) != 0);
+}
+
  INLINE float64 float64_set_sign(float64 a, int sign)
  {
      return make_float64((float64_val(a) & 0x7fffffffffffffffULL)
@@ -472,6 +489,11 @@ INLINE float64 float64_set_sign(float64 a, int sign)
  *----------------------------------------------------------------------------*/
  extern const float64 float64_default_nan;

+INLINE int float64_get_unbiased_exp(float64 f)
+{
+    return ((f >> 52) & 0x7FF) - 1023;
+}
+
  /*----------------------------------------------------------------------------
  | Software IEC/IEEE extended double-precision conversion routines.
  *----------------------------------------------------------------------------*/
-- 
1.7.1


* [Qemu-devel] [PATCH 02/19] Add set_fprf Argument to fload_invalid_op_excp()
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
  2013-10-24 16:17 ` [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX Tom Musta
@ 2013-10-24 16:18 ` Tom Musta
  2013-10-24 16:19 ` [Qemu-devel] [PATCH 03/19] General Support for VSX Helpers Tom Musta
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:18 UTC
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

The fload_invalid_op_excp() function sets assorted invalid
operation status bits.  However, it also implicitly modifies
the FPRF field of the PowerPC FPSCR.  Many VSX instructions
set invalid operation bits but do not alter FPRF.  Thus the
function is more generally useful if the setting of the FPRF
field is made conditional via a parameter.

All invocations of this routine in existing instructions are
modified to pass 1 and thus retain their current behavior.
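
The pattern can be sketched in isolation as follows.  This is only an
illustration under stated assumptions: the function name, the bit positions
and the condition-code value written are placeholders chosen for the
example, not the constants used in fpu_helper.c.  In the diff itself the
new parameter is named set_fpcc and guards the FPCC update.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
enum {
    SKETCH_VXVC_BIT   = 19,  /* sticky "invalid compare" status bit */
    SKETCH_FPCC_SHIFT = 12   /* low bit of the 4-bit condition code field */
};

/* Always record the invalid-operation status bit; rewrite the condition
 * code field only when the caller asks for it via set_fpcc. */
static uint32_t sketch_flag_invalid_compare(uint32_t fpscr, int set_fpcc)
{
    fpscr |= 1u << SKETCH_VXVC_BIT;
    if (set_fpcc) {
        fpscr &= ~(0xFu << SKETCH_FPCC_SHIFT);  /* clear the old code */
        fpscr |= 0x1u << SKETCH_FPCC_SHIFT;     /* placeholder value */
    }
    return fpscr;
}
```

A caller that passes 1 gets the old behavior (status bit set and condition
code rewritten), while a caller that passes 0 leaves the condition code
field untouched.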

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |  103 +++++++++++++++++++++++++----------------------
  1 files changed, 55 insertions(+), 48 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 4f60218..f0b0a49 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -106,7 +106,8 @@ uint32_t helper_compute_fprf(CPUPPCState *env, uint64_t arg, uint32_t set_fprf)
  }

  /* Floating-point invalid operations exception */
-static inline uint64_t fload_invalid_op_excp(CPUPPCState *env, int op)
+static inline uint64_t fload_invalid_op_excp(CPUPPCState *env, int op,
+                                             int set_fpcc)
  {
      uint64_t ret = 0;
      int ve;
@@ -138,8 +139,10 @@ static inline uint64_t fload_invalid_op_excp(CPUPPCState *env, int op)
      case POWERPC_EXCP_FP_VXVC:
          /* Ordered comparison of NaN */
          env->fpscr |= 1 << FPSCR_VXVC;
-        env->fpscr &= ~(0xF << FPSCR_FPCC);
-        env->fpscr |= 0x11 << FPSCR_FPCC;
+        if (set_fpcc) {
+            env->fpscr &= ~(0xF << FPSCR_FPCC);
+            env->fpscr |= 0x11 << FPSCR_FPCC;
+        }
          /* We must update the target FPR before raising the exception */
          if (ve != 0) {
              env->exception_index = POWERPC_EXCP_PROGRAM;
@@ -158,8 +161,10 @@ static inline uint64_t fload_invalid_op_excp(CPUPPCState *env, int op)
          if (ve == 0) {
              /* Set the result to quiet NaN */
              ret = 0x7FF8000000000000ULL;
-            env->fpscr &= ~(0xF << FPSCR_FPCC);
-            env->fpscr |= 0x11 << FPSCR_FPCC;
+            if (set_fpcc) {
+                env->fpscr &= ~(0xF << FPSCR_FPCC);
+                env->fpscr |= 0x11 << FPSCR_FPCC;
+            }
          }
          break;
      case POWERPC_EXCP_FP_VXCVI:
@@ -169,8 +174,10 @@ static inline uint64_t fload_invalid_op_excp(CPUPPCState *env, int op)
          if (ve == 0) {
              /* Set the result to quiet NaN */
              ret = 0x7FF8000000000000ULL;
-            env->fpscr &= ~(0xF << FPSCR_FPCC);
-            env->fpscr |= 0x11 << FPSCR_FPCC;
+            if (set_fpcc) {
+                env->fpscr &= ~(0xF << FPSCR_FPCC);
+                env->fpscr |= 0x11 << FPSCR_FPCC;
+            }
          }
          break;
      }
@@ -505,12 +512,12 @@ uint64_t helper_fadd(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
      if (unlikely(float64_is_infinity(farg1.d) && float64_is_infinity(farg2.d) &&
                   float64_is_neg(farg1.d) != float64_is_neg(farg2.d))) {
          /* Magnitude subtraction of infinities */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d))) {
              /* sNaN addition */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          farg1.d = float64_add(farg1.d, farg2.d, &env->fp_status);
      }
@@ -529,12 +536,12 @@ uint64_t helper_fsub(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
      if (unlikely(float64_is_infinity(farg1.d) && float64_is_infinity(farg2.d) &&
                   float64_is_neg(farg1.d) == float64_is_neg(farg2.d))) {
          /* Magnitude subtraction of infinities */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d))) {
              /* sNaN subtraction */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          farg1.d = float64_sub(farg1.d, farg2.d, &env->fp_status);
      }
@@ -553,12 +560,12 @@ uint64_t helper_fmul(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
      if (unlikely((float64_is_infinity(farg1.d) && float64_is_zero(farg2.d)) ||
                   (float64_is_zero(farg1.d) && float64_is_infinity(farg2.d)))) {
          /* Multiplication of zero by infinity */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d))) {
              /* sNaN multiplication */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          farg1.d = float64_mul(farg1.d, farg2.d, &env->fp_status);
      }
@@ -577,15 +584,15 @@ uint64_t helper_fdiv(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
      if (unlikely(float64_is_infinity(farg1.d) &&
                   float64_is_infinity(farg2.d))) {
          /* Division of infinity by infinity */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIDI);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIDI, 1);
      } else if (unlikely(float64_is_zero(farg1.d) && float64_is_zero(farg2.d))) {
          /* Division of zero by zero */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXZDZ);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXZDZ, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d))) {
              /* sNaN division */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          farg1.d = float64_div(farg1.d, farg2.d, &env->fp_status);
      }
@@ -603,11 +610,11 @@ uint64_t helper_fctiw(CPUPPCState *env, uint64_t arg)
      if (unlikely(float64_is_signaling_nan(farg.d))) {
          /* sNaN conversion */
          farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI);
+                                        POWERPC_EXCP_FP_VXCVI, 1);
      } else if (unlikely(float64_is_quiet_nan(farg.d) ||
                          float64_is_infinity(farg.d))) {
          /* qNan / infinity conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI);
+        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
      } else {
          farg.ll = float64_to_int32(farg.d, &env->fp_status);
          /* XXX: higher bits are not supposed to be significant.
@@ -628,11 +635,11 @@ uint64_t helper_fctiwz(CPUPPCState *env, uint64_t arg)
      if (unlikely(float64_is_signaling_nan(farg.d))) {
          /* sNaN conversion */
          farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI);
+                                        POWERPC_EXCP_FP_VXCVI, 1);
      } else if (unlikely(float64_is_quiet_nan(farg.d) ||
                          float64_is_infinity(farg.d))) {
          /* qNan / infinity conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI);
+        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
      } else {
          farg.ll = float64_to_int32_round_to_zero(farg.d, &env->fp_status);
          /* XXX: higher bits are not supposed to be significant.
@@ -663,11 +670,11 @@ uint64_t helper_fctid(CPUPPCState *env, uint64_t arg)
      if (unlikely(float64_is_signaling_nan(farg.d))) {
          /* sNaN conversion */
          farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI);
+                                        POWERPC_EXCP_FP_VXCVI, 1);
      } else if (unlikely(float64_is_quiet_nan(farg.d) ||
                          float64_is_infinity(farg.d))) {
          /* qNan / infinity conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI);
+        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
      } else {
          farg.ll = float64_to_int64(farg.d, &env->fp_status);
      }
@@ -684,11 +691,11 @@ uint64_t helper_fctidz(CPUPPCState *env, uint64_t arg)
      if (unlikely(float64_is_signaling_nan(farg.d))) {
          /* sNaN conversion */
          farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI);
+                                        POWERPC_EXCP_FP_VXCVI, 1);
      } else if (unlikely(float64_is_quiet_nan(farg.d) ||
                          float64_is_infinity(farg.d))) {
          /* qNan / infinity conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI);
+        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
      } else {
          farg.ll = float64_to_int64_round_to_zero(farg.d, &env->fp_status);
      }
@@ -707,11 +714,11 @@ static inline uint64_t do_fri(CPUPPCState *env, uint64_t arg,
      if (unlikely(float64_is_signaling_nan(farg.d))) {
          /* sNaN round */
          farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI);
+                                        POWERPC_EXCP_FP_VXCVI, 1);
      } else if (unlikely(float64_is_quiet_nan(farg.d) ||
                          float64_is_infinity(farg.d))) {
          /* qNan / infinity round */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI);
+        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
      } else {
          set_float_rounding_mode(rounding_mode, &env->fp_status);
          farg.ll = float64_round_to_int(farg.d, &env->fp_status);
@@ -754,13 +761,13 @@ uint64_t helper_fmadd(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
      if (unlikely((float64_is_infinity(farg1.d) && float64_is_zero(farg2.d)) ||
                   (float64_is_zero(farg1.d) && float64_is_infinity(farg2.d)))) {
          /* Multiplication of zero by infinity */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d) ||
                       float64_is_signaling_nan(farg3.d))) {
              /* sNaN operation */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          /* This is the way the PowerPC specification defines it */
          float128 ft0_128, ft1_128;
@@ -772,7 +779,7 @@ uint64_t helper_fmadd(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                       float64_is_infinity(farg3.d) &&
                       float128_is_neg(ft0_128) != float64_is_neg(farg3.d))) {
              /* Magnitude subtraction of infinities */
-            farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI);
+            farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, 1);
          } else {
              ft1_128 = float64_to_float128(farg3.d, &env->fp_status);
              ft0_128 = float128_add(ft0_128, ft1_128, &env->fp_status);
@@ -797,13 +804,13 @@ uint64_t helper_fmsub(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                   (float64_is_zero(farg1.d) &&
                    float64_is_infinity(farg2.d)))) {
          /* Multiplication of zero by infinity */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d) ||
                       float64_is_signaling_nan(farg3.d))) {
              /* sNaN operation */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          /* This is the way the PowerPC specification defines it */
          float128 ft0_128, ft1_128;
@@ -815,7 +822,7 @@ uint64_t helper_fmsub(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                       float64_is_infinity(farg3.d) &&
                       float128_is_neg(ft0_128) == float64_is_neg(farg3.d))) {
              /* Magnitude subtraction of infinities */
-            farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI);
+            farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, 1);
          } else {
              ft1_128 = float64_to_float128(farg3.d, &env->fp_status);
              ft0_128 = float128_sub(ft0_128, ft1_128, &env->fp_status);
@@ -838,13 +845,13 @@ uint64_t helper_fnmadd(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
      if (unlikely((float64_is_infinity(farg1.d) && float64_is_zero(farg2.d)) ||
                   (float64_is_zero(farg1.d) && float64_is_infinity(farg2.d)))) {
          /* Multiplication of zero by infinity */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d) ||
                       float64_is_signaling_nan(farg3.d))) {
              /* sNaN operation */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          /* This is the way the PowerPC specification defines it */
          float128 ft0_128, ft1_128;
@@ -856,7 +863,7 @@ uint64_t helper_fnmadd(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                       float64_is_infinity(farg3.d) &&
                       float128_is_neg(ft0_128) != float64_is_neg(farg3.d))) {
              /* Magnitude subtraction of infinities */
-            farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI);
+            farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, 1);
          } else {
              ft1_128 = float64_to_float128(farg3.d, &env->fp_status);
              ft0_128 = float128_add(ft0_128, ft1_128, &env->fp_status);
@@ -883,13 +890,13 @@ uint64_t helper_fnmsub(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                   (float64_is_zero(farg1.d) &&
                    float64_is_infinity(farg2.d)))) {
          /* Multiplication of zero by infinity */
-        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ);
+        farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d) ||
                       float64_is_signaling_nan(farg3.d))) {
              /* sNaN operation */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          /* This is the way the PowerPC specification defines it */
          float128 ft0_128, ft1_128;
@@ -901,7 +908,7 @@ uint64_t helper_fnmsub(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                       float64_is_infinity(farg3.d) &&
                       float128_is_neg(ft0_128) == float64_is_neg(farg3.d))) {
              /* Magnitude subtraction of infinities */
-            farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI);
+            farg1.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, 1);
          } else {
              ft1_128 = float64_to_float128(farg3.d, &env->fp_status);
              ft0_128 = float128_sub(ft0_128, ft1_128, &env->fp_status);
@@ -924,7 +931,7 @@ uint64_t helper_frsp(CPUPPCState *env, uint64_t arg)

      if (unlikely(float64_is_signaling_nan(farg.d))) {
          /* sNaN square root */
-        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
      }
      f32 = float64_to_float32(farg.d, &env->fp_status);
      farg.d = float32_to_float64(f32, &env->fp_status);
@@ -941,11 +948,11 @@ uint64_t helper_fsqrt(CPUPPCState *env, uint64_t arg)

      if (unlikely(float64_is_neg(farg.d) && !float64_is_zero(farg.d))) {
          /* Square root of a negative nonzero number */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSQRT);
+        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSQRT, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg.d))) {
              /* sNaN square root */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          farg.d = float64_sqrt(farg.d, &env->fp_status);
      }
@@ -961,7 +968,7 @@ uint64_t helper_fre(CPUPPCState *env, uint64_t arg)

      if (unlikely(float64_is_signaling_nan(farg.d))) {
          /* sNaN reciprocal */
-        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
      }
      farg.d = float64_div(float64_one, farg.d, &env->fp_status);
      return farg.d;
@@ -977,7 +984,7 @@ uint64_t helper_fres(CPUPPCState *env, uint64_t arg)

      if (unlikely(float64_is_signaling_nan(farg.d))) {
          /* sNaN reciprocal */
-        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
      }
      farg.d = float64_div(float64_one, farg.d, &env->fp_status);
      f32 = float64_to_float32(farg.d, &env->fp_status);
@@ -996,11 +1003,11 @@ uint64_t helper_frsqrte(CPUPPCState *env, uint64_t arg)

      if (unlikely(float64_is_neg(farg.d) && !float64_is_zero(farg.d))) {
          /* Reciprocal square root of a negative nonzero number */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSQRT);
+        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSQRT, 1);
      } else {
          if (unlikely(float64_is_signaling_nan(farg.d))) {
              /* sNaN reciprocal square root */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
          }
          farg.d = float64_sqrt(farg.d, &env->fp_status);
          farg.d = float64_div(float64_one, farg.d, &env->fp_status);
@@ -1053,7 +1060,7 @@ void helper_fcmpu(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                   && (float64_is_signaling_nan(farg1.d) ||
                       float64_is_signaling_nan(farg2.d)))) {
          /* sNaN comparison */
-        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN);
+        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
      }
  }

@@ -1085,10 +1092,10 @@ void helper_fcmpo(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
              float64_is_signaling_nan(farg2.d)) {
              /* sNaN comparison */
              fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                  POWERPC_EXCP_FP_VXVC);
+                                  POWERPC_EXCP_FP_VXVC, 1);
          } else {
              /* qNaN comparison */
-            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC);
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 1);
          }
      }
  }
-- 
1.7.1


* [Qemu-devel] [PATCH 03/19] General Support for VSX Helpers
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
  2013-10-24 16:17 ` [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX Tom Musta
  2013-10-24 16:18 ` [Qemu-devel] [PATCH 02/19] Add set_fprf Argument to fload_invalid_op_excp() Tom Musta
@ 2013-10-24 16:19 ` Tom Musta
  2013-10-24 18:51   ` Richard Henderson
  2013-10-24 16:20 ` [Qemu-devel] [PATCH 04/19] Add VSX ISA2.06 xadd Instructions Tom Musta
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:19 UTC
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds general support that will be used by the VSX helper
routines:

   - a union describing the various VSR subfields.
   - access routines to get and set VSRs
   - VSX decoders
   - a general routine to generate a handler that invokes a VSX
     helper.
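
To make the decoders concrete, here is a small sketch of how a split
register field is reassembled.  A 6-bit VSR number is stored as one
extension bit plus a 5-bit field elsewhere in the opcode; the decoder
shifts the extension bit above the 5-bit field.  The sketch_* functions are
hypothetical function equivalents of the DECODE_SPLIT and xT/xA macros in
the diff below, written out only for illustration.

```c
#include <stdint.h>

/* Join two bit fields of an opcode: the nb1-bit field at shift1 becomes
 * the high part, the nb2-bit field at shift2 becomes the low part. */
static unsigned sketch_decode_split(uint32_t opcode, int shift1, int nb1,
                                    int shift2, int nb2)
{
    unsigned hi = (opcode >> shift1) & ((1u << nb1) - 1);
    unsigned lo = (opcode >> shift2) & ((1u << nb2) - 1);
    return (hi << nb2) | lo;
}

/* Target register: extension bit at bit 0, 5-bit field at bit 21. */
static unsigned sketch_xT(uint32_t opcode)
{
    return sketch_decode_split(opcode, 0, 1, 21, 5);
}

/* First source register: extension bit at bit 2, 5-bit field at bit 16. */
static unsigned sketch_xA(uint32_t opcode)
{
    return sketch_decode_split(opcode, 2, 1, 16, 5);
}
```

With the extension bit clear the result is in the range 0..31, which
getVSR()/putVSR() map onto the FPR-backed registers; with it set the result
is in 32..63, which they map onto the AVR-backed registers.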

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   41 +++++++++++++++++++++++++++++++++++++++++
  target-ppc/translate.c  |   14 ++++++++++++++
  2 files changed, 55 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index f0b0a49..cea94ac 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1717,3 +1717,44 @@ uint32_t helper_efdcmpeq(CPUPPCState *env, uint64_t op1, uint64_t op2)
      /* XXX: TODO: test special values (NaN, infinites, ...) */
      return helper_efdtsteq(env, op1, op2);
  }
+
+#define DECODE_SPLIT(opcode, shift1, nb1, shift2, nb2) \
+    (((((opcode) >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) |    \
+     (((opcode) >> (shift2)) & ((1 << (nb2)) - 1)))
+
+#define xT(opcode) DECODE_SPLIT(opcode, 0, 1, 21, 5)
+#define xA(opcode) DECODE_SPLIT(opcode, 2, 1, 16, 5)
+#define xB(opcode) DECODE_SPLIT(opcode, 1, 1, 11, 5)
+#define xC(opcode) DECODE_SPLIT(opcode, 3, 1,  6, 5)
+#define BF(opcode) (((opcode) >> (31-8)) & 7)
+
+typedef union _ppc_vsr_t {
+    uint64_t u64[2];
+    uint32_t u32[4];
+    float32 f32[4];
+    float64 f64[2];
+} ppc_vsr_t;
+
+static void getVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
+{
+    if (n < 32) {
+        vsr->f64[0] = env->fpr[n];
+        vsr->u64[1] = env->vsr[n];
+    } else {
+        vsr->u64[0] = env->avr[n-32].u64[0];
+        vsr->u64[1] = env->avr[n-32].u64[1];
+    }
+}
+
+static void putVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
+{
+    if (n < 32) {
+        env->fpr[n] = vsr->f64[0];
+        env->vsr[n] = vsr->u64[1];
+    } else {
+        env->avr[n-32].u64[0] = vsr->u64[0];
+        env->avr[n-32].u64[1] = vsr->u64[1];
+    }
+}
+
+#define float64_to_float64(x, env) x
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index ef57bae..5b51c0c 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7278,6 +7278,20 @@ VSX_VECTOR_MOVE(xvnabssp, OP_NABS, SGN_MASK_SP)
  VSX_VECTOR_MOVE(xvnegsp, OP_NEG, SGN_MASK_SP)
  VSX_VECTOR_MOVE(xvcpsgnsp, OP_CPSGN, SGN_MASK_SP)

+#define GEN_VSX_HELPER_2(name, op1, op2, inval, type)                         \
+static void gen_##name(DisasContext * ctx)                                    \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    /* NIP cannot be restored if the memory exception comes from a helper */  \
+    gen_update_nip(ctx, ctx->nip - 4);                                        \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    gen_helper_##name(cpu_env, opc);                                          \
+    tcg_temp_free_i32(opc);                                                   \
+}

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
-- 
1.7.1

* [Qemu-devel] [PATCH 04/19] Add VSX ISA2.06 xadd Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (2 preceding siblings ...)
  2013-10-24 16:19 ` [Qemu-devel] [PATCH 03/19] General Support for VSX Helpers Tom Musta
@ 2013-10-24 16:20 ` Tom Musta
  2013-10-24 19:44   ` Richard Henderson
  2013-10-24 16:20 ` [Qemu-devel] [PATCH 05/19] Add VSX ISA2.06 xsub Instructions Tom Musta
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:20 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

This patch adds the VSX floating point add instructions that are
defined by V2.06 of the PowerPC ISA: xsadddp, xvadddp and xvaddsp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    6 ++++++
  target-ppc/translate.c  |   12 ++++++++++++
  3 files changed, 64 insertions(+), 0 deletions(-)
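
Reader's note (not part of the commit): the first branch of VSX_ADD flags
VXISI when both operands are infinities of opposite sign. A minimal sketch
of that predicate follows. It uses host doubles and <math.h> as a stand-in
for softfloat's float64 helpers, which is an assumption made for brevity;
add_raises_vxisi() and add_examples() are illustrative names, not QEMU
functions.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* True when a + b is the invalid "infinity minus infinity" case:
 * both operands infinite and their signs differ. */
static bool add_raises_vxisi(double a, double b)
{
    return isinf(a) && isinf(b) &&
           (signbit(a) != 0) != (signbit(b) != 0);
}

/* Usage example covering the interesting corners. */
static void add_examples(void)
{
    assert(add_raises_vxisi(INFINITY, -INFINITY));
    assert(!add_raises_vxisi(INFINITY, INFINITY));   /* inf + inf is fine */
    assert(!add_raises_vxisi(1.0, -INFINITY));       /* finite + inf is fine */
}
```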

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index cea94ac..8cbc905 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1758,3 +1758,49 @@ static void putVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
  }

  #define float64_to_float64(x, env) x
+
+/* VSX_ADD - VSX floating point add
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   sfprf - set FPRF
+ */
+#define VSX_ADD(op, nels, tp, fld, sfprf)                                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xa, xb;                                                     \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xA(opcode), &xa, env);                                             \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+    helper_reset_fpstatus(env);                                               \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely(tp##_is_infinity(xa.fld[i]) &&                           \
+                     tp##_is_infinity(xb.fld[i]) &&                           \
+                     tp##_is_neg(xa.fld[i]) != tp##_is_neg(xb.fld[i]))) {     \
+            xt.fld[i] = float64_to_##tp(                                      \
+                            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, \
+                                                  sfprf),                     \
+                            &env->fp_status);                                 \
+        } else {                                                              \
+            if (unlikely(tp##_is_signaling_nan(xa.fld[i]) ||                  \
+                         tp##_is_signaling_nan(xb.fld[i]))) {                 \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);    \
+            }                                                                 \
+            xt.fld[i] = tp##_add(xa.fld[i], xb.fld[i], &env->fp_status);      \
+            if (sfprf) {                                                      \
+                helper_compute_fprf(env, xt.fld[i], sfprf);                   \
+            }                                                                 \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_ADD(xsadddp, 1, float64, f64, 1)
+VSX_ADD(xvadddp, 2, float64, f64, 0)
+VSX_ADD(xvaddsp, 4, float32, f32, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 56814b5..30e6aa4 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -251,6 +251,12 @@ DEF_HELPER_4(vcfsx, void, env, avr, avr, i32)
  DEF_HELPER_4(vctuxs, void, env, avr, avr, i32)
  DEF_HELPER_4(vctsxs, void, env, avr, avr, i32)

+DEF_HELPER_2(xsadddp, void, env, i32)
+
+DEF_HELPER_2(xvadddp, void, env, i32)
+
+DEF_HELPER_2(xvaddsp, void, env, i32)
+
  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
  DEF_HELPER_2(efscfuf, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 5b51c0c..7aa17e1 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7293,6 +7293,12 @@ static void gen_##name(DisasContext * ctx)                                    \
      tcg_temp_free_i32(opc);                                                   \
  }

+GEN_VSX_HELPER_2(xsadddp, 0x00, 0x04, 0, PPC2_VSX)
+
+GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
+
+GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
+
  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
      {                                                                \
@@ -9975,6 +9981,12 @@ GEN_XX2FORM(xvnabssp, 0x12, 0x1A, PPC2_VSX),
  GEN_XX2FORM(xvnegsp, 0x12, 0x1B, PPC2_VSX),
  GEN_XX3FORM(xvcpsgnsp, 0x00, 0x1A, PPC2_VSX),

+GEN_XX3FORM(xsadddp, 0x00, 0x04, PPC2_VSX),
+
+GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
+
+GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
+
  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
  GEN_XX3FORM(name, opc2, opc3, fl2)
-- 
1.7.1

* [Qemu-devel] [PATCH 05/19] Add VSX ISA2.06 xsub Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (3 preceding siblings ...)
  2013-10-24 16:20 ` [Qemu-devel] [PATCH 04/19] Add VSX ISA2.06 xadd Instructions Tom Musta
@ 2013-10-24 16:20 ` Tom Musta
  2013-10-24 19:48   ` Richard Henderson
  2013-10-24 16:21 ` [Qemu-devel] [PATCH 06/19] Add VSX ISA2.06 xmul Instructions Tom Musta
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:20 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point subtract instructions defined
by V2.06 of the PowerPC ISA: xssubdp, xvsubdp and xvsubsp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    3 +++
  target-ppc/translate.c  |    6 ++++++
  3 files changed, 55 insertions(+), 0 deletions(-)
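
Reader's note (not part of the commit): VSX_SUB differs from VSX_ADD only
in the sign test, because a - b is invalid when the two infinities have
the same sign. The sketch below states that predicate with host doubles
standing in for softfloat types (an assumption for brevity);
sub_raises_vxisi() and sub_examples() are illustrative names.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* True when a - b is the invalid "infinity minus infinity" case:
 * both operands infinite and their signs equal. */
static bool sub_raises_vxisi(double a, double b)
{
    return isinf(a) && isinf(b) &&
           (signbit(a) != 0) == (signbit(b) != 0);
}

/* Usage example: same-sign infinities are invalid, mixed signs are not. */
static void sub_examples(void)
{
    assert(sub_raises_vxisi(INFINITY, INFINITY));
    assert(sub_raises_vxisi(-INFINITY, -INFINITY));
    assert(!sub_raises_vxisi(INFINITY, -INFINITY));  /* inf - (-inf) = inf */
}
```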

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 8cbc905..c9997a3 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1804,3 +1804,49 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
  VSX_ADD(xsadddp, 1, float64, f64, 1)
  VSX_ADD(xvadddp, 2, float64, f64, 0)
  VSX_ADD(xvaddsp, 4, float32, f32, 0)
+
+/* VSX_SUB - VSX floating point subtract
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   sfprf - set FPRF
+ */
+#define VSX_SUB(op, nels, tp, fld, sfprf)                                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xa, xb;                                                     \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xA(opcode), &xa, env);                                             \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+    helper_reset_fpstatus(env);                                               \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely(tp##_is_infinity(xa.fld[i]) &&                           \
+                     tp##_is_infinity(xb.fld[i]) &&                           \
+                     tp##_is_neg(xa.fld[i]) == tp##_is_neg(xb.fld[i]))) {     \
+            xt.fld[i] = float64_to_##tp(                                      \
+                            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, \
+                                                  sfprf),                     \
+                            &env->fp_status);                                 \
+        } else {                                                              \
+            if (unlikely(tp##_is_signaling_nan(xa.fld[i]) ||                  \
+                         tp##_is_signaling_nan(xb.fld[i]))) {                 \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);    \
+            }                                                                 \
+            xt.fld[i] = tp##_sub(xa.fld[i], xb.fld[i], &env->fp_status);      \
+            if (sfprf) {                                                      \
+                helper_compute_fprf(env, xt.fld[i], sfprf);                   \
+            }                                                                 \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_SUB(xssubdp, 1, float64, f64, 1)
+VSX_SUB(xvsubdp, 2, float64, f64, 0)
+VSX_SUB(xvsubsp, 4, float32, f32, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 30e6aa4..98b0bc5 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -252,10 +252,13 @@ DEF_HELPER_4(vctuxs, void, env, avr, avr, i32)
  DEF_HELPER_4(vctsxs, void, env, avr, avr, i32)

  DEF_HELPER_2(xsadddp, void, env, i32)
+DEF_HELPER_2(xssubdp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
+DEF_HELPER_2(xvsubdp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
+DEF_HELPER_2(xvsubsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 7aa17e1..d93bbf4 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7294,10 +7294,13 @@ static void gen_##name(DisasContext * ctx)                                    \
  }

  GEN_VSX_HELPER_2(xsadddp, 0x00, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xssubdp, 0x00, 0x05, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -9982,10 +9985,13 @@ GEN_XX2FORM(xvnegsp, 0x12, 0x1B, PPC2_VSX),
  GEN_XX3FORM(xvcpsgnsp, 0x00, 0x1A, PPC2_VSX),

  GEN_XX3FORM(xsadddp, 0x00, 0x04, PPC2_VSX),
+GEN_XX3FORM(xssubdp, 0x00, 0x05, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
+GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
+GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 06/19] Add VSX ISA2.06 xmul Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (4 preceding siblings ...)
  2013-10-24 16:20 ` [Qemu-devel] [PATCH 05/19] Add VSX ISA2.06 xsub Instructions Tom Musta
@ 2013-10-24 16:21 ` Tom Musta
  2013-10-24 20:07   ` Richard Henderson
  2013-10-24 16:21 ` [Qemu-devel] [PATCH 07/19] Add VSX ISA2.06 xdiv Instructions Tom Musta
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:21 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point multiply instructions defined
by V2.06 of the PowerPC ISA: xsmuldp, xvmuldp, xvmulsp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   47 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    3 +++
  target-ppc/translate.c  |    6 ++++++
  3 files changed, 56 insertions(+), 0 deletions(-)
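
Reader's note (not part of the commit): the invalid case for multiply is
infinity times zero in either operand order, which VSX_MUL reports as
VXIMZ. A minimal sketch of the predicate, using host doubles in place of
softfloat types (an assumption for brevity); mul_raises_vximz() and
mul_examples() are illustrative names.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* True when a * b is infinity times zero. The comparison with 0.0
 * matches both +0.0 and -0.0, so the sign of zero does not matter. */
static bool mul_raises_vximz(double a, double b)
{
    return (isinf(a) && b == 0.0) || (isinf(b) && a == 0.0);
}

/* Usage example: both operand orders and a negative zero. */
static void mul_examples(void)
{
    assert(mul_raises_vximz(INFINITY, 0.0));
    assert(mul_raises_vximz(-0.0, -INFINITY));
    assert(!mul_raises_vximz(INFINITY, 2.0));
}
```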

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index c9997a3..8135325 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1850,3 +1850,50 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
  VSX_SUB(xssubdp, 1, float64, f64, 1)
  VSX_SUB(xvsubdp, 2, float64, f64, 0)
  VSX_SUB(xvsubsp, 4, float32, f32, 0)
+
+/* VSX_MUL - VSX floating point multiply
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   sfprf - set FPRF
+ */
+#define VSX_MUL(op, nels, tp, fld, sfprf)                                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xa, xb;                                                     \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xA(opcode), &xa, env);                                             \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+    helper_reset_fpstatus(env);                                               \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely((tp##_is_infinity(xa.fld[i]) &&                          \
+                      tp##_is_zero(xb.fld[i])) ||                             \
+                     (tp##_is_infinity(xb.fld[i]) &&                          \
+                      tp##_is_zero(xa.fld[i])))) {                            \
+            xt.fld[i] = float64_to_##tp(                                      \
+                            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ, \
+                                                  sfprf),                     \
+                            &env->fp_status);                                 \
+        } else {                                                              \
+            if (unlikely(tp##_is_signaling_nan(xa.fld[i]) ||                  \
+                         tp##_is_signaling_nan(xb.fld[i]))) {                 \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);    \
+            }                                                                 \
+            xt.fld[i] = tp##_mul(xa.fld[i], xb.fld[i], &env->fp_status);      \
+            if (sfprf) {                                                      \
+                helper_compute_fprf(env, xt.fld[i], sfprf);                   \
+            }                                                                 \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_MUL(xsmuldp, 1, float64, f64, 1)
+VSX_MUL(xvmuldp, 2, float64, f64, 0)
+VSX_MUL(xvmulsp, 4, float32, f32, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 98b0bc5..a76b159 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -253,12 +253,15 @@ DEF_HELPER_4(vctsxs, void, env, avr, avr, i32)

  DEF_HELPER_2(xsadddp, void, env, i32)
  DEF_HELPER_2(xssubdp, void, env, i32)
+DEF_HELPER_2(xsmuldp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
+DEF_HELPER_2(xvmuldp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
+DEF_HELPER_2(xvmulsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index d93bbf4..c743bf2 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7295,12 +7295,15 @@ static void gen_##name(DisasContext * ctx)                                    \

  GEN_VSX_HELPER_2(xsadddp, 0x00, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xssubdp, 0x00, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmulsp, 0x00, 0x0A, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -9986,12 +9989,15 @@ GEN_XX3FORM(xvcpsgnsp, 0x00, 0x1A, PPC2_VSX),

  GEN_XX3FORM(xsadddp, 0x00, 0x04, PPC2_VSX),
  GEN_XX3FORM(xssubdp, 0x00, 0x05, PPC2_VSX),
+GEN_XX3FORM(xsmuldp, 0x00, 0x06, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
+GEN_XX3FORM(xvmuldp, 0x00, 0x0E, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
+GEN_XX3FORM(xvmulsp, 0x00, 0x0A, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 07/19] Add VSX ISA2.06 xdiv Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (5 preceding siblings ...)
  2013-10-24 16:21 ` [Qemu-devel] [PATCH 06/19] Add VSX ISA2.06 xmul Instructions Tom Musta
@ 2013-10-24 16:21 ` Tom Musta
  2013-10-24 20:08   ` Richard Henderson
  2013-10-24 16:22 ` [Qemu-devel] [PATCH 08/19] Add VSX ISA2.06 xre Instructions Tom Musta
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:21 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point divide instructions defined
by V2.06 of the PowerPC ISA: xsdivdp, xvdivdp, xvdivsp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    3 ++
  target-ppc/translate.c  |    6 +++++
  3 files changed, 61 insertions(+), 0 deletions(-)
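
Reader's note (not part of the commit): VSX_DIV checks two invalid cases
in order, infinity divided by infinity (VXIDI) and zero divided by zero
(VXZDZ), before falling through to the ordinary divide. The sketch below
classifies an operand pair the same way. Host doubles stand in for
softfloat types, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical labels for the two invalid-operation branches. */
enum div_invalid { DIV_OK, DIV_VXIDI, DIV_VXZDZ };

/* Classify a / b in the same order as the macro's if / else-if chain. */
static enum div_invalid classify_div(double a, double b)
{
    if (isinf(a) && isinf(b)) {
        return DIV_VXIDI;            /* inf / inf */
    }
    if (a == 0.0 && b == 0.0) {
        return DIV_VXZDZ;            /* 0 / 0, either sign of zero */
    }
    return DIV_OK;                   /* everything else, NaNs included */
}

/* Usage example for the three outcomes. */
static void div_examples(void)
{
    assert(classify_div(INFINITY, -INFINITY) == DIV_VXIDI);
    assert(classify_div(-0.0, 0.0) == DIV_VXZDZ);
    assert(classify_div(1.0, 0.0) == DIV_OK);
}
```

Note that a nonzero finite value divided by zero is not one of these two
branches, so it lands in DIV_OK here.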

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 8135325..85661cf 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1897,3 +1897,55 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
  VSX_MUL(xsmuldp, 1, float64, f64, 1)
  VSX_MUL(xvmuldp, 2, float64, f64, 0)
  VSX_MUL(xvmulsp, 4, float32, f32, 0)
+
+/* VSX_DIV - VSX floating point divide
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   sfprf - set FPRF
+ */
+#define VSX_DIV(op, nels, tp, fld, sfprf)                                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xa, xb;                                                     \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xA(opcode), &xa, env);                                             \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+    helper_reset_fpstatus(env);                                               \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely(tp##_is_infinity(xa.fld[i]) &&                           \
+                     tp##_is_infinity(xb.fld[i]))) {                          \
+            xt.fld[i] = float64_to_##tp(                                      \
+                            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIDI, \
+                                                  sfprf),                     \
+                            &env->fp_status);                                 \
+        } else if (unlikely(tp##_is_zero(xa.fld[i]) &&                        \
+                            tp##_is_zero(xb.fld[i]))) {                       \
+            xt.fld[i] = float64_to_##tp(                                      \
+                            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXZDZ, \
+                                                  sfprf),                     \
+                            &env->fp_status);                                 \
+        } else {                                                              \
+            if (unlikely(tp##_is_signaling_nan(xa.fld[i]) ||                  \
+                         tp##_is_signaling_nan(xb.fld[i]))) {                 \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);    \
+            }                                                                 \
+            xt.fld[i] = tp##_div(xa.fld[i], xb.fld[i],                        \
+                                    &env->fp_status);                         \
+            if (sfprf) {                                                      \
+                helper_compute_fprf(env, xt.fld[i], sfprf);                   \
+            }                                                                 \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_DIV(xsdivdp, 1, float64, f64, 1)
+VSX_DIV(xvdivdp, 2, float64, f64, 0)
+VSX_DIV(xvdivsp, 4, float32, f32, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index a76b159..c2d3a16 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -254,14 +254,17 @@ DEF_HELPER_4(vctsxs, void, env, avr, avr, i32)
  DEF_HELPER_2(xsadddp, void, env, i32)
  DEF_HELPER_2(xssubdp, void, env, i32)
  DEF_HELPER_2(xsmuldp, void, env, i32)
+DEF_HELPER_2(xsdivdp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
  DEF_HELPER_2(xvmuldp, void, env, i32)
+DEF_HELPER_2(xvdivdp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
  DEF_HELPER_2(xvmulsp, void, env, i32)
+DEF_HELPER_2(xvdivsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index c743bf2..d23f645 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7296,14 +7296,17 @@ static void gen_##name(DisasContext * ctx)                                    \
  GEN_VSX_HELPER_2(xsadddp, 0x00, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xssubdp, 0x00, 0x05, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmulsp, 0x00, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -9990,14 +9993,17 @@ GEN_XX3FORM(xvcpsgnsp, 0x00, 0x1A, PPC2_VSX),
  GEN_XX3FORM(xsadddp, 0x00, 0x04, PPC2_VSX),
  GEN_XX3FORM(xssubdp, 0x00, 0x05, PPC2_VSX),
  GEN_XX3FORM(xsmuldp, 0x00, 0x06, PPC2_VSX),
+GEN_XX3FORM(xsdivdp, 0x00, 0x07, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
  GEN_XX3FORM(xvmuldp, 0x00, 0x0E, PPC2_VSX),
+GEN_XX3FORM(xvdivdp, 0x00, 0x0F, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
  GEN_XX3FORM(xvmulsp, 0x00, 0x0A, PPC2_VSX),
+GEN_XX3FORM(xvdivsp, 0x00, 0x0B, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 08/19] Add VSX ISA2.06 xre Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (6 preceding siblings ...)
  2013-10-24 16:21 ` [Qemu-devel] [PATCH 07/19] Add VSX ISA2.06 xdiv Instructions Tom Musta
@ 2013-10-24 16:22 ` Tom Musta
  2013-10-24 20:11   ` Richard Henderson
  2013-10-24 16:22 ` [Qemu-devel] [PATCH 09/19] Add VSX ISA2.06 xsqrt Instructions Tom Musta
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:22 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point reciprocal estimate instructions
defined by V2.06 of the PowerPC ISA: xsredp, xvredp, xvresp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   35 +++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    3 +++
  target-ppc/translate.c  |    6 ++++++
  3 files changed, 44 insertions(+), 0 deletions(-)
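
Reader's note (not part of the commit): VSX_RE computes the "estimate" as
a full-precision divide of one by the operand rather than a
reduced-precision approximation. A minimal sketch of that choice, with
host doubles standing in for softfloat types (an assumption for brevity);
recip_estimate() and recip_examples() are illustrative names.

```c
#include <assert.h>
#include <math.h>

/* Reciprocal "estimate" computed as an ordinary IEEE divide, 1 / x. */
static double recip_estimate(double x)
{
    return 1.0 / x;
}

/* Usage example: exact results for powers of two and for infinity. */
static void recip_examples(void)
{
    assert(recip_estimate(4.0) == 0.25);
    assert(recip_estimate(-0.5) == -2.0);
    assert(recip_estimate(INFINITY) == 0.0);
}
```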

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 85661cf..11922a5 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1949,3 +1949,38 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
  VSX_DIV(xsdivdp, 1, float64, f64, 1)
  VSX_DIV(xvdivdp, 2, float64, f64, 0)
  VSX_DIV(xvdivsp, 4, float32, f32, 0)
+
+/* VSX_RE  - VSX floating point reciprocal estimate
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   sfprf - set FPRF
+ */
+#define VSX_RE(op, nels, tp, fld, sfprf)                                      \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xb;                                                         \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+    helper_reset_fpstatus(env);                                               \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely(tp##_is_signaling_nan(xb.fld[i]))) {                     \
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);        \
+        }                                                                     \
+        xt.fld[i] = tp##_div(tp##_one, xb.fld[i], &env->fp_status);           \
+        if (sfprf) {                                                          \
+            helper_compute_fprf(env, xt.fld[i], sfprf);                       \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_RE(xsredp, 1, float64, f64, 1)
+VSX_RE(xvredp, 2, float64, f64, 0)
+VSX_RE(xvresp, 4, float32, f32, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index c2d3a16..8af1e89 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -255,16 +255,19 @@ DEF_HELPER_2(xsadddp, void, env, i32)
  DEF_HELPER_2(xssubdp, void, env, i32)
  DEF_HELPER_2(xsmuldp, void, env, i32)
  DEF_HELPER_2(xsdivdp, void, env, i32)
+DEF_HELPER_2(xsredp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
  DEF_HELPER_2(xvmuldp, void, env, i32)
  DEF_HELPER_2(xvdivdp, void, env, i32)
+DEF_HELPER_2(xvredp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
  DEF_HELPER_2(xvmulsp, void, env, i32)
  DEF_HELPER_2(xvdivsp, void, env, i32)
+DEF_HELPER_2(xvresp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index d23f645..06df531 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7297,16 +7297,19 @@ GEN_VSX_HELPER_2(xsadddp, 0x00, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xssubdp, 0x00, 0x05, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsredp, 0x14, 0x05, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmulsp, 0x00, 0x0A, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvresp, 0x14, 0x09, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -9994,16 +9997,19 @@ GEN_XX3FORM(xsadddp, 0x00, 0x04, PPC2_VSX),
  GEN_XX3FORM(xssubdp, 0x00, 0x05, PPC2_VSX),
  GEN_XX3FORM(xsmuldp, 0x00, 0x06, PPC2_VSX),
  GEN_XX3FORM(xsdivdp, 0x00, 0x07, PPC2_VSX),
+GEN_XX2FORM(xsredp,  0x14, 0x05, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
  GEN_XX3FORM(xvmuldp, 0x00, 0x0E, PPC2_VSX),
  GEN_XX3FORM(xvdivdp, 0x00, 0x0F, PPC2_VSX),
+GEN_XX2FORM(xvredp,  0x14, 0x0D, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
  GEN_XX3FORM(xvmulsp, 0x00, 0x0A, PPC2_VSX),
  GEN_XX3FORM(xvdivsp, 0x00, 0x0B, PPC2_VSX),
+GEN_XX2FORM(xvresp, 0x14, 0x09, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 09/19] Add VSX ISA2.06 xsqrt Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (7 preceding siblings ...)
  2013-10-24 16:22 ` [Qemu-devel] [PATCH 08/19] Add VSX ISA2.06 xre Instructions Tom Musta
@ 2013-10-24 16:22 ` Tom Musta
  2013-10-24 20:23   ` Richard Henderson
  2013-10-24 16:23 ` [Qemu-devel] [PATCH 10/19] Add VSX ISA2.06 xrsqrte Instructions Tom Musta
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:22 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point square root instructions
defined by V2.06 of the PowerPC ISA: xssqrtdp, xvsqrtdp, xvsqrtsp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
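Note: the invalid-operation test in VSX_SQRT below combines a sign check
with a zero check, so that -0.0 is not treated as a negative operand.  The
following is a minimal sketch of that two-part test on raw IEEE double bit
patterns.  The names f64_is_neg, f64_is_zero and sqrt_operand_is_invalid are
hypothetical stand-ins for the softfloat predicates and are not part of this
patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for float64_is_neg(): the sign bit is bit 63. */
static bool f64_is_neg(uint64_t bits)
{
    return (bits >> 63) != 0;
}

/* Hypothetical stand-in for float64_is_zero(): all bits except sign are 0. */
static bool f64_is_zero(uint64_t bits)
{
    return (bits << 1) == 0;
}

/* Mirrors the condition tp##_is_neg(b) && !tp##_is_zero(b) in the macro. */
static bool sqrt_operand_is_invalid(uint64_t bits)
{
    return f64_is_neg(bits) && !f64_is_zero(bits);
}
```

With this test, -1.0 (0xBFF0000000000000) takes the VXSQRT path, while -0.0
(0x8000000000000000) falls through to the ordinary square root.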
  target-ppc/fpu_helper.c |   43 +++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    3 +++
  target-ppc/translate.c  |    6 ++++++
  3 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 11922a5..d03e8f9 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1984,3 +1984,46 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
  VSX_RE(xsredp, 1, float64, f64, 1)
  VSX_RE(xvredp, 2, float64, f64, 0)
  VSX_RE(xvresp, 4, float32, f32, 0)
+
+/* VSX_SQRT - VSX floating point square root
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   sfprf - set FPRF
+ */
+#define VSX_SQRT(op, nels, tp, fld, sfprf)                                    \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xb;                                                         \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+    helper_reset_fpstatus(env);                                               \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely(tp##_is_neg(xb.fld[i]) &&                                \
+                     !tp##_is_zero(xb.fld[i]))) {                             \
+            xt.fld[i] = float64_to_##tp(                                      \
+                            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSQRT,\
+                                                  sfprf),                     \
+                            &env->fp_status);                                 \
+        } else {                                                              \
+            if (unlikely(tp##_is_signaling_nan(xb.fld[i]))) {                 \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);    \
+            }                                                                 \
+            xt.fld[i] = tp##_sqrt(xb.fld[i], &env->fp_status);                \
+            if (sfprf) {                                                      \
+                helper_compute_fprf(env, xt.fld[i], sfprf);                   \
+            }                                                                 \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_SQRT(xssqrtdp, 1, float64, f64, 1)
+VSX_SQRT(xvsqrtdp, 2, float64, f64, 0)
+VSX_SQRT(xvsqrtsp, 4, float32, f32, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 8af1e89..27ca4e5 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -256,18 +256,21 @@ DEF_HELPER_2(xssubdp, void, env, i32)
  DEF_HELPER_2(xsmuldp, void, env, i32)
  DEF_HELPER_2(xsdivdp, void, env, i32)
  DEF_HELPER_2(xsredp, void, env, i32)
+DEF_HELPER_2(xssqrtdp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
  DEF_HELPER_2(xvmuldp, void, env, i32)
  DEF_HELPER_2(xvdivdp, void, env, i32)
  DEF_HELPER_2(xvredp, void, env, i32)
+DEF_HELPER_2(xvsqrtdp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
  DEF_HELPER_2(xvmulsp, void, env, i32)
  DEF_HELPER_2(xvdivsp, void, env, i32)
  DEF_HELPER_2(xvresp, void, env, i32)
+DEF_HELPER_2(xvsqrtsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 06df531..66cbad1 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7298,18 +7298,21 @@ GEN_VSX_HELPER_2(xssubdp, 0x00, 0x05, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmulsp, 0x00, 0x0A, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -9998,18 +10001,21 @@ GEN_XX3FORM(xssubdp, 0x00, 0x05, PPC2_VSX),
  GEN_XX3FORM(xsmuldp, 0x00, 0x06, PPC2_VSX),
  GEN_XX3FORM(xsdivdp, 0x00, 0x07, PPC2_VSX),
  GEN_XX2FORM(xsredp,  0x14, 0x05, PPC2_VSX),
+GEN_XX2FORM(xssqrtdp,  0x16, 0x04, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
  GEN_XX3FORM(xvmuldp, 0x00, 0x0E, PPC2_VSX),
  GEN_XX3FORM(xvdivdp, 0x00, 0x0F, PPC2_VSX),
  GEN_XX2FORM(xvredp,  0x14, 0x0D, PPC2_VSX),
+GEN_XX2FORM(xvsqrtdp,  0x16, 0x0C, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
  GEN_XX3FORM(xvmulsp, 0x00, 0x0A, PPC2_VSX),
  GEN_XX3FORM(xvdivsp, 0x00, 0x0B, PPC2_VSX),
  GEN_XX2FORM(xvresp, 0x14, 0x09, PPC2_VSX),
+GEN_XX2FORM(xvsqrtsp, 0x16, 0x08, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 10/19] Add VSX ISA2.06 xrsqrte Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (8 preceding siblings ...)
  2013-10-24 16:22 ` [Qemu-devel] [PATCH 09/19] Add VSX ISA2.06 xsqrt Instructions Tom Musta
@ 2013-10-24 16:23 ` Tom Musta
  2013-10-24 20:25   ` Richard Henderson
  2013-10-24 16:23 ` [Qemu-devel] [PATCH 11/19] Add VSX ISA2.06 xtdiv Instructions Tom Musta
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:23 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point reciprocal square root
estimate instructions defined by V2.06 of the PowerPC ISA: xsrsqrtedp,
xvrsqrtedp, xvrsqrtesp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
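Note: like the other helpers in this series, the three instructions are
generated from a single macro that pastes the mnemonic, element type and
register field into a function body.  The following is a minimal sketch of
that pattern, assuming native division in place of the softfloat calls.
demo_vsr_t and DEMO_RECIP are hypothetical names used only for illustration
and do not exist in QEMU.

```c
/* Hypothetical stand-in for ppc_vsr_t: one 128-bit register viewed as
 * two doubles or four floats. */
typedef union {
    double f64[2];
    float  f32[4];
} demo_vsr_t;

/* DEMO_RECIP - element-wise reciprocal, stamped out per instruction
 *   op   - instruction mnemonic (pasted into the function name)
 *   nels - number of elements processed (1, 2 or 4)
 *   tp   - native element type
 *   fld  - union field to operate on (f64 or f32)
 */
#define DEMO_RECIP(op, nels, tp, fld)                                  \
static void demo_##op(demo_vsr_t *xt, const demo_vsr_t *xb)            \
{                                                                      \
    int i;                                                             \
                                                                       \
    for (i = 0; i < nels; i++) {                                       \
        xt->fld[i] = (tp)1 / xb->fld[i];                               \
    }                                                                  \
}

DEMO_RECIP(xsredp, 1, double, f64)
DEMO_RECIP(xvredp, 2, double, f64)
DEMO_RECIP(xvresp, 4, float, f32)
```

The scalar variant touches only element 0 and leaves the rest of the target
as it was, which is why the real helpers read the target register before
the loop.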
  target-ppc/fpu_helper.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    3 +++
  target-ppc/translate.c  |    6 ++++++
  3 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index d03e8f9..902cb76 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2027,3 +2027,47 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
  VSX_SQRT(xssqrtdp, 1, float64, f64, 1)
  VSX_SQRT(xvsqrtdp, 2, float64, f64, 0)
  VSX_SQRT(xvsqrtsp, 4, float32, f32, 0)
+
+/* VSX_RSQRTE - VSX floating point reciprocal square root estimate
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   sfprf - set FPRF
+ */
+#define VSX_RSQRTE(op, nels, tp, fld, sfprf)                                  \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xb;                                                         \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+    helper_reset_fpstatus(env);                                               \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely(tp##_is_neg(xb.fld[i]) &&                                \
+                     !tp##_is_zero(xb.fld[i]))) {                             \
+            xt.fld[i] = float64_to_##tp(                                      \
+                            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSQRT,\
+                                                  sfprf),                     \
+                            &env->fp_status);                                 \
+        } else {                                                              \
+            if (unlikely(tp##_is_signaling_nan(xb.fld[i]))) {                 \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);    \
+            }                                                                 \
+            xt.fld[i] = tp##_sqrt(xb.fld[i], &env->fp_status);                \
+            xt.fld[i] = tp##_div(tp##_one, xt.fld[i], &env->fp_status);       \
+            if (sfprf) {                                                      \
+                helper_compute_fprf(env, xt.fld[i], sfprf);                   \
+            }                                                                 \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_RSQRTE(xsrsqrtedp, 1, float64, f64, 1)
+VSX_RSQRTE(xvrsqrtedp, 2, float64, f64, 0)
+VSX_RSQRTE(xvrsqrtesp, 4, float32, f32, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 27ca4e5..02ea86c 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -257,6 +257,7 @@ DEF_HELPER_2(xsmuldp, void, env, i32)
  DEF_HELPER_2(xsdivdp, void, env, i32)
  DEF_HELPER_2(xsredp, void, env, i32)
  DEF_HELPER_2(xssqrtdp, void, env, i32)
+DEF_HELPER_2(xsrsqrtedp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
@@ -264,6 +265,7 @@ DEF_HELPER_2(xvmuldp, void, env, i32)
  DEF_HELPER_2(xvdivdp, void, env, i32)
  DEF_HELPER_2(xvredp, void, env, i32)
  DEF_HELPER_2(xvsqrtdp, void, env, i32)
+DEF_HELPER_2(xvrsqrtedp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -271,6 +273,7 @@ DEF_HELPER_2(xvmulsp, void, env, i32)
  DEF_HELPER_2(xvdivsp, void, env, i32)
  DEF_HELPER_2(xvresp, void, env, i32)
  DEF_HELPER_2(xvsqrtsp, void, env, i32)
+DEF_HELPER_2(xvrsqrtesp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 66cbad1..b5253fc 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7299,6 +7299,7 @@ GEN_VSX_HELPER_2(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -7306,6 +7307,7 @@ GEN_VSX_HELPER_2(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7313,6 +7315,7 @@ GEN_VSX_HELPER_2(xvmulsp, 0x00, 0x0A, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10002,6 +10005,7 @@ GEN_XX3FORM(xsmuldp, 0x00, 0x06, PPC2_VSX),
  GEN_XX3FORM(xsdivdp, 0x00, 0x07, PPC2_VSX),
  GEN_XX2FORM(xsredp,  0x14, 0x05, PPC2_VSX),
  GEN_XX2FORM(xssqrtdp,  0x16, 0x04, PPC2_VSX),
+GEN_XX2FORM(xsrsqrtedp,  0x14, 0x04, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
@@ -10009,6 +10013,7 @@ GEN_XX3FORM(xvmuldp, 0x00, 0x0E, PPC2_VSX),
  GEN_XX3FORM(xvdivdp, 0x00, 0x0F, PPC2_VSX),
  GEN_XX2FORM(xvredp,  0x14, 0x0D, PPC2_VSX),
  GEN_XX2FORM(xvsqrtdp,  0x16, 0x0C, PPC2_VSX),
+GEN_XX2FORM(xvrsqrtedp,  0x14, 0x0C, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10016,6 +10021,7 @@ GEN_XX3FORM(xvmulsp, 0x00, 0x0A, PPC2_VSX),
  GEN_XX3FORM(xvdivsp, 0x00, 0x0B, PPC2_VSX),
  GEN_XX2FORM(xvresp, 0x14, 0x09, PPC2_VSX),
  GEN_XX2FORM(xvsqrtsp, 0x16, 0x08, PPC2_VSX),
+GEN_XX2FORM(xvrsqrtesp, 0x14, 0x08, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 11/19] Add VSX ISA2.06 xtdiv Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (9 preceding siblings ...)
  2013-10-24 16:23 ` [Qemu-devel] [PATCH 10/19] Add VSX ISA2.06 xrsqrte Instructions Tom Musta
@ 2013-10-24 16:23 ` Tom Musta
  2013-10-24 20:30   ` Richard Henderson
  2013-10-24 16:24 ` [Qemu-devel] [PATCH 12/19] Add VSX ISA2.06 xtsqrt Instructions Tom Musta
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:23 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point test for software divide
instructions defined by V2.06 of the PowerPC ISA: xstdivdp, xvtdivdp,
and xvtdivsp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
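Note: the FE test in VSX_TDIV below is mostly a set of range checks on the
unbiased exponents of the two operands, and the result is packed into a
4-bit CR field.  The following is a minimal sketch of those two pieces.  It
assumes the unbiased exponent is simply the biased exponent field minus the
bias; the real tp##_get_unbiased_exp routine comes from an earlier patch in
this series and may differ.  All function names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed definition: biased 11-bit exponent field minus the bias (1023). */
static int f64_unbiased_exp(uint64_t bits)
{
    return (int)((bits >> 52) & 0x7FF) - 1023;
}

/* Exponent-range part of the FE test, parameterised like VSX_TDIV. */
static bool tdiv_exp_out_of_range(int e_a, int e_b, bool a_is_zero,
                                  int emin, int emax, int nbits)
{
    if (e_b <= emin || e_b >= emax - 2) {
        return true;
    }
    return !a_is_zero && (e_a - e_b >= emax ||
                          e_a - e_b <= emin + 1 ||
                          e_a <= emin + nbits);
}

/* Pack the flags as the helper does: bit 3 always set, FG bit 2, FE bit 1. */
static uint32_t tdiv_cr_field(bool fg_flag, bool fe_flag)
{
    return 0x8 | (fg_flag ? 4 : 0) | (fe_flag ? 2 : 0);
}
```

For the double-precision instantiations (emin = -1022, emax = 1023,
nbits = 52) the thresholds work out to e_b <= -1022, e_b >= 1021,
e_a - e_b >= 1023, e_a - e_b <= -1021 and e_a <= -970.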
  target-ppc/fpu_helper.c |   55 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    3 ++
  target-ppc/translate.c  |    6 +++++
  3 files changed, 64 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 902cb76..0dc498c 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2071,3 +2071,58 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
  VSX_RSQRTE(xsrsqrtedp, 1, float64, f64, 1)
  VSX_RSQRTE(xvrsqrtedp, 2, float64, f64, 0)
  VSX_RSQRTE(xvrsqrtesp, 4, float32, f32, 0)
+
+/* VSX_TDIV - VSX floating point test for divide
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   emin  - minimum unbiased exponent
+ *   emax  - maximum unbiased exponent
+ *   nbits - number of fraction bits
+ */
+#define VSX_TDIV(op, nels, tp, fld, emin, emax, nbits)                  \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+{                                                                       \
+    ppc_vsr_t xa, xb;                                                   \
+    int i;                                                              \
+    int fe_flag = 0;                                                    \
+    int fg_flag = 0;                                                    \
+                                                                        \
+    getVSR(xA(opcode), &xa, env);                                       \
+    getVSR(xB(opcode), &xb, env);                                       \
+                                                                        \
+    for (i = 0; i < nels; i++) {                                        \
+        if (unlikely(tp##_is_infinity(xa.fld[i]) ||                     \
+                     tp##_is_infinity(xb.fld[i]) ||                     \
+                     tp##_is_zero(xb.fld[i]))) {                        \
+            fe_flag = 1;                                                \
+            fg_flag = 1;                                                \
+        } else {                                                        \
+            int e_a = tp##_get_unbiased_exp(xa.fld[i]);                 \
+            int e_b = tp##_get_unbiased_exp(xb.fld[i]);                 \
+                                                                        \
+            if (unlikely(tp##_is_any_nan(xa.fld[i]) ||                  \
+                         tp##_is_any_nan(xb.fld[i]))) {                 \
+                fe_flag = 1;                                            \
+            } else if ((e_b <= emin) || (e_b >= (emax-2))) {            \
+                fe_flag = 1;                                            \
+            } else if (!tp##_is_zero(xa.fld[i]) &&                      \
+                       (((e_a - e_b) >= emax) ||                        \
+                        ((e_a - e_b) <= (emin+1)) ||                    \
+                         (e_a <= (emin+nbits)))) {                      \
+                fe_flag = 1;                                            \
+            }                                                           \
+                                                                        \
+            if (unlikely(tp##_is_zero_or_denormal(xb.fld[i]))) {        \
+                fg_flag = 1;                                            \
+            }                                                           \
+        }                                                               \
+    }                                                                   \
+                                                                        \
+    env->crf[BF(opcode)] = 0x8 | (fg_flag ? 4 : 0) | (fe_flag ? 2 : 0); \
+}
+
+VSX_TDIV(xstdivdp, 1, float64, f64, -1022, 1023, 52)
+VSX_TDIV(xvtdivdp, 2, float64, f64, -1022, 1023, 52)
+VSX_TDIV(xvtdivsp, 4, float32, f32, -126, 127, 23)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 02ea86c..316b16f 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -258,6 +258,7 @@ DEF_HELPER_2(xsdivdp, void, env, i32)
  DEF_HELPER_2(xsredp, void, env, i32)
  DEF_HELPER_2(xssqrtdp, void, env, i32)
  DEF_HELPER_2(xsrsqrtedp, void, env, i32)
+DEF_HELPER_2(xstdivdp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
@@ -266,6 +267,7 @@ DEF_HELPER_2(xvdivdp, void, env, i32)
  DEF_HELPER_2(xvredp, void, env, i32)
  DEF_HELPER_2(xvsqrtdp, void, env, i32)
  DEF_HELPER_2(xvrsqrtedp, void, env, i32)
+DEF_HELPER_2(xvtdivdp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -274,6 +276,7 @@ DEF_HELPER_2(xvdivsp, void, env, i32)
  DEF_HELPER_2(xvresp, void, env, i32)
  DEF_HELPER_2(xvsqrtsp, void, env, i32)
  DEF_HELPER_2(xvrsqrtesp, void, env, i32)
+DEF_HELPER_2(xvtdivsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index b5253fc..fe071f0 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7300,6 +7300,7 @@ GEN_VSX_HELPER_2(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -7308,6 +7309,7 @@ GEN_VSX_HELPER_2(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7316,6 +7318,7 @@ GEN_VSX_HELPER_2(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10006,6 +10009,7 @@ GEN_XX3FORM(xsdivdp, 0x00, 0x07, PPC2_VSX),
  GEN_XX2FORM(xsredp,  0x14, 0x05, PPC2_VSX),
  GEN_XX2FORM(xssqrtdp,  0x16, 0x04, PPC2_VSX),
  GEN_XX2FORM(xsrsqrtedp,  0x14, 0x04, PPC2_VSX),
+GEN_XX3FORM(xstdivdp,  0x14, 0x07, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
@@ -10014,6 +10018,7 @@ GEN_XX3FORM(xvdivdp, 0x00, 0x0F, PPC2_VSX),
  GEN_XX2FORM(xvredp,  0x14, 0x0D, PPC2_VSX),
  GEN_XX2FORM(xvsqrtdp,  0x16, 0x0C, PPC2_VSX),
  GEN_XX2FORM(xvrsqrtedp,  0x14, 0x0C, PPC2_VSX),
+GEN_XX3FORM(xvtdivdp, 0x14, 0x0F, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10022,6 +10027,7 @@ GEN_XX3FORM(xvdivsp, 0x00, 0x0B, PPC2_VSX),
  GEN_XX2FORM(xvresp, 0x14, 0x09, PPC2_VSX),
  GEN_XX2FORM(xvsqrtsp, 0x16, 0x08, PPC2_VSX),
  GEN_XX2FORM(xvrsqrtesp, 0x14, 0x08, PPC2_VSX),
+GEN_XX3FORM(xvtdivsp, 0x14, 0x0B, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 12/19] Add VSX ISA2.06 xtsqrt Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (10 preceding siblings ...)
  2013-10-24 16:23 ` [Qemu-devel] [PATCH 11/19] Add VSX ISA2.06 xtdiv Instructions Tom Musta
@ 2013-10-24 16:24 ` Tom Musta
  2013-10-24 20:34   ` Richard Henderson
  2013-10-24 16:25 ` [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions Tom Musta
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:24 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point test for software square
root instructions defined by V2.06 of the PowerPC ISA: xstsqrtdp,
xvtsqrtdp, xvtsqrtsp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
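Note: the per-element decision made by VSX_TSQRT below can be followed on a
single double-precision operand.  The following is a minimal sketch of that
decision on a raw bit pattern, with emin = -1022 and nbits = 52 hard-coded.
It assumes the unbiased exponent is the biased exponent field minus 1023,
which may not match tp##_get_unbiased_exp exactly.  tsqrt_cr_f64 is a
hypothetical name and is not part of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the CR field value for one double operand given as raw bits. */
static uint32_t tsqrt_cr_f64(uint64_t bits)
{
    uint64_t mag = bits & UINT64_C(0x7FFFFFFFFFFFFFFF); /* drop the sign */
    int biased = (int)(mag >> 52);
    bool is_zero = (mag == 0);
    bool is_inf = (mag == UINT64_C(0x7FF0000000000000));
    bool is_nan = (mag > UINT64_C(0x7FF0000000000000));
    bool is_neg = (bits >> 63) != 0;
    bool is_denormal = (biased == 0) && !is_zero;
    bool fe_flag = false;
    bool fg_flag = false;

    if (is_inf || is_zero) {
        fe_flag = true;
        fg_flag = true;
    } else {
        int e_b = biased - 1023;          /* assumed unbiased exponent */

        if (is_nan || is_neg || e_b <= -1022 + 52) {
            fe_flag = true;
        }
        if (is_denormal) {
            fg_flag = true;
        }
    }

    return 0x8 | (fg_flag ? 4 : 0) | (fe_flag ? 2 : 0);
}
```

Under these assumptions an ordinary operand such as 4.0 yields 0x8, a
negative operand yields 0xA, and zero, infinity and denormal operands
yield 0xE.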
  target-ppc/fpu_helper.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    3 ++
  target-ppc/translate.c  |    6 +++++
  3 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 0dc498c..4e484a3 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2126,3 +2126,55 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
  VSX_TDIV(xstdivdp, 1, float64, f64, -1022, 1023, 52)
  VSX_TDIV(xvtdivdp, 2, float64, f64, -1022, 1023, 52)
  VSX_TDIV(xvtdivsp, 4, float32, f32, -126, 127, 23)
+
+/* VSX_TSQRT - VSX floating point test for square root
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   emin  - minimum unbiased exponent
+ *   nbits - number of fraction bits; FE is set when the
+ *           unbiased exponent is <= emin + nbits
+ */
+#define VSX_TSQRT(op, nels, tp, fld, emin, nbits)                       \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+{                                                                       \
+    ppc_vsr_t xa, xb;                                                   \
+    int i;                                                              \
+    int fe_flag = 0;                                                    \
+    int fg_flag = 0;                                                    \
+                                                                        \
+    getVSR(xA(opcode), &xa, env);                                       \
+    getVSR(xB(opcode), &xb, env);                                       \
+                                                                        \
+    for (i = 0; i < nels; i++) {                                        \
+        if (unlikely(tp##_is_infinity(xb.fld[i]) ||                     \
+                     tp##_is_zero(xb.fld[i]))) {                        \
+            fe_flag = 1;                                                \
+            fg_flag = 1;                                                \
+        } else {                                                        \
+            int e_b = tp##_get_unbiased_exp(xb.fld[i]);                 \
+                                                                        \
+            if (unlikely(tp##_is_any_nan(xb.fld[i]))) {                 \
+                fe_flag = 1;                                            \
+            } else if (unlikely(tp##_is_zero(xb.fld[i]))) {             \
+                fe_flag = 1;                                            \
+            } else if (unlikely(tp##_is_neg(xb.fld[i]))) {              \
+                fe_flag = 1;                                            \
+            } else if (!tp##_is_zero(xb.fld[i]) &&                      \
+                      (e_b <= (emin+nbits))) {                          \
+                fe_flag = 1;                                            \
+            }                                                           \
+                                                                        \
+            if (unlikely(tp##_is_denormal(xb.fld[i]))) {                \
+                fg_flag = 1;                                            \
+            }                                                           \
+        }                                                               \
+    }                                                                   \
+                                                                        \
+    env->crf[BF(opcode)] = 0x8 | (fg_flag ? 4 : 0) | (fe_flag ? 2 : 0); \
+}
+
+VSX_TSQRT(xstsqrtdp, 1, float64, f64, -1022, 52)
+VSX_TSQRT(xvtsqrtdp, 2, float64, f64, -1022, 52)
+VSX_TSQRT(xvtsqrtsp, 4, float32, f32, -126, 23)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 316b16f..e1abada 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -259,6 +259,7 @@ DEF_HELPER_2(xsredp, void, env, i32)
  DEF_HELPER_2(xssqrtdp, void, env, i32)
  DEF_HELPER_2(xsrsqrtedp, void, env, i32)
  DEF_HELPER_2(xstdivdp, void, env, i32)
+DEF_HELPER_2(xstsqrtdp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
@@ -268,6 +269,7 @@ DEF_HELPER_2(xvredp, void, env, i32)
  DEF_HELPER_2(xvsqrtdp, void, env, i32)
  DEF_HELPER_2(xvrsqrtedp, void, env, i32)
  DEF_HELPER_2(xvtdivdp, void, env, i32)
+DEF_HELPER_2(xvtsqrtdp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -277,6 +279,7 @@ DEF_HELPER_2(xvresp, void, env, i32)
  DEF_HELPER_2(xvsqrtsp, void, env, i32)
  DEF_HELPER_2(xvrsqrtesp, void, env, i32)
  DEF_HELPER_2(xvtdivsp, void, env, i32)
+DEF_HELPER_2(xvtsqrtsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index fe071f0..6978fe0 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7301,6 +7301,7 @@ GEN_VSX_HELPER_2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -7310,6 +7311,7 @@ GEN_VSX_HELPER_2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvtsqrtdp, 0x14, 0x0E, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7319,6 +7321,7 @@ GEN_VSX_HELPER_2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvtsqrtsp, 0x14, 0x0A, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10010,6 +10013,7 @@ GEN_XX2FORM(xsredp,  0x14, 0x05, PPC2_VSX),
  GEN_XX2FORM(xssqrtdp,  0x16, 0x04, PPC2_VSX),
  GEN_XX2FORM(xsrsqrtedp,  0x14, 0x04, PPC2_VSX),
  GEN_XX3FORM(xstdivdp,  0x14, 0x07, PPC2_VSX),
+GEN_XX2FORM(xstsqrtdp,  0x14, 0x06, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
@@ -10019,6 +10023,7 @@ GEN_XX2FORM(xvredp,  0x14, 0x0D, PPC2_VSX),
  GEN_XX2FORM(xvsqrtdp,  0x16, 0x0C, PPC2_VSX),
  GEN_XX2FORM(xvrsqrtedp,  0x14, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvtdivdp, 0x14, 0x0F, PPC2_VSX),
+GEN_XX2FORM(xvtsqrtdp, 0x14, 0x0E, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10028,6 +10033,7 @@ GEN_XX2FORM(xvresp, 0x14, 0x09, PPC2_VSX),
  GEN_XX2FORM(xvsqrtsp, 0x16, 0x08, PPC2_VSX),
  GEN_XX2FORM(xvrsqrtesp, 0x14, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvtdivsp, 0x14, 0x0B, PPC2_VSX),
+GEN_XX2FORM(xvtsqrtsp, 0x14, 0x0A, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1


* [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (11 preceding siblings ...)
  2013-10-24 16:24 ` [Qemu-devel] [PATCH 12/19] Add VSX ISA2.06 xtsqrt Instructions Tom Musta
@ 2013-10-24 16:25 ` Tom Musta
  2013-10-24 20:38   ` Richard Henderson
  2013-10-24 16:25 ` [Qemu-devel] [PATCH 14/19] Add VSX xscmp*dp Instructions Tom Musta
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:25 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point multiply/add instructions
defined by V2.06 of the PowerPC ISA:

   - xsmaddadp,  xvmaddadp,  xvmaddasp
   - xsmaddmdp,  xvmaddmdp,  xvmaddmsp
   - xsmsubadp,  xvmsubadp,  xvmsubasp
   - xsmsubmdp,  xvmsubmdp,  xvmsubmsp
   - xsnmaddadp, xvnmaddadp, xvnmaddasp
   - xsnmaddmdp, xvnmaddmdp, xvnmaddmsp
   - xsnmsubadp, xvnmsubadp, xvnmsubasp
   - xsnmsubmdp, xvnmsubmdp, xvnmsubmsp
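
For readers unfamiliar with the two operand forms: judging from the
helper macro in this patch, the A forms compute XT = XA*XB + XT and the
M forms compute XT = XA*XT + XB.  Below is a minimal standalone sketch
of that operand selection for the single precision case.  It assumes
host float/double arithmetic in place of softfloat, vsx_madd_sp is a
hypothetical name, and none of the exception or FPSCR handling is
modelled.

```c
#include <stdbool.h>

/* Hypothetical illustration only, not QEMU code.
 *   a_form   - true for the "a" mnemonics, false for the "m" mnemonics
 *   subtract - true for msub/nmsub, false for madd/nmadd
 *   negate   - true for the nmadd/nmsub variants
 */
static float vsx_madd_sp(float xt, float xa, float xb,
                         bool a_form, bool subtract, bool negate)
{
    /* Pick the second multiplicand (m) and the addend (s). */
    float m = a_form ? xb : xt;
    float s = a_form ? xt : xb;

    /* Widen before multiplying; the product of two floats is exact
     * in double because 24 + 24 significand bits fit in 53. */
    double prod = (double)xa * (double)m;
    double sum = subtract ? prod - (double)s : prod + (double)s;

    /* Round back to single precision. */
    float r = (float)sum;

    /* Negate the result unless it is a NaN (r != r only for NaN). */
    if (negate && r == r) {
        r = -r;
    }
    return r;
}
```

With xt = 1, xa = 2 and xb = 3, the A form of madd gives 2*3 + 1 and
the M form gives 2*1 + 3.  Note that the sum is rounded once to double
and again to float, so this sketch is not guaranteed to match a true
fused multiply-add in every case.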

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |  106 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |   24 +++++++++++
  target-ppc/translate.c  |   48 +++++++++++++++++++++
  3 files changed, 178 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 4e484a3..12e7abc 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2178,3 +2178,109 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
  VSX_TSQRT(xstsqrtdp, 1, float64, f64, -1022, 52)
  VSX_TSQRT(xvtsqrtdp, 2, float64, f64, -1022, 52)
  VSX_TSQRT(xvtsqrtsp, 4, float32, f32, -126, 23)
+
+/* VSX_MADD - VSX floating point multiply/add variations
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   btp   - big (intermediate) type (float64 or float128)
+ *   fld   - vsr_t field (f32 or f64)
+ *   cmp   - comparison operation for testing INF - INF
+ *   sum   - sum operation (add or sub)
+ *   neg   - negate result (0 or 1)
+ *   afrm  - A form (1=A, 0=M)
+ *   sfprf - set FPRF
+ */
+#define VSX_MADD(op, nels, tp, btp, fld, cmp, sum, neg, afrm, sfprf)          \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xa, xb;                                                     \
+    ppc_vsr_t *m, *s;                                                         \
+    int i;                                                                    \
+                                                                              \
+    if (afrm) {                                                               \
+        m = &xb;                                                              \
+        s = &xt;                                                              \
+    }                                                                         \
+    else {                                                                    \
+        m = &xt;                                                              \
+        s = &xb;                                                              \
+    }                                                                         \
+    getVSR(xA(opcode), &xa, env);                                             \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+                                                                              \
+    helper_reset_fpstatus(env);                                               \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely((tp##_is_infinity(xa.fld[i]) &&                          \
+                      tp##_is_zero(m->fld[i])) ||                             \
+                     (tp##_is_zero(xa.fld[i]) &&                              \
+                      tp##_is_infinity(m->fld[i])))) {                        \
+            xt.fld[i] = float64_to_##tp(                                      \
+                          fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ,   \
+                                              sfprf),                         \
+                          &env->fp_status);                                   \
+        } else {                                                              \
+            if (unlikely(tp##_is_signaling_nan(xa.fld[i]) ||                  \
+                         tp##_is_signaling_nan(m->fld[i]) ||                  \
+                         tp##_is_signaling_nan(s->fld[i]))) {                 \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);    \
+            }                                                                 \
+            btp ft0, ft1;                                                     \
+                                                                              \
+            ft0 = tp##_to_##btp(xa.fld[i], &env->fp_status);                  \
+            ft1 = tp##_to_##btp(m->fld[i], &env->fp_status);                  \
+            ft0 = btp##_mul(ft0, ft1, &env->fp_status);                       \
+            if (unlikely(btp##_is_infinity(ft0) &&                            \
+                         tp##_is_infinity(s->fld[i]) &&                       \
+                         btp##_is_neg(ft0) cmp tp##_is_neg(s->fld[i]))) {     \
+                xt.fld[i] = float64_to_##tp(                                  \
+                              fload_invalid_op_excp(env,                      \
+                                                     POWERPC_EXCP_FP_VXISI,   \
+                                                     sfprf),                  \
+                              &env->fp_status);                               \
+            } else {                                                          \
+                ft1 = tp##_to_##btp(s->fld[i], &env->fp_status);              \
+                ft0 = btp##_##sum(ft0, ft1, &env->fp_status);                 \
+                xt.fld[i] = btp##_to_##tp(ft0, &env->fp_status);              \
+            }                                                                 \
+            if (neg && likely(!tp##_is_any_nan(xt.fld[i]))) {                 \
+                xt.fld[i] = tp##_chs(xt.fld[i]);                              \
+            }                                                                 \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    if (sfprf) {                                                              \
+        helper_compute_fprf(env, xt.fld[0], sfprf);                           \
+    }                                                                         \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_MADD(xsmaddadp, 1, float64, float128, f64, !=, add, 0, 1, 1)
+VSX_MADD(xsmaddmdp, 1, float64, float128, f64, !=, add, 0, 0, 1)
+VSX_MADD(xsmsubadp, 1, float64, float128, f64, ==, sub, 0, 1, 1)
+VSX_MADD(xsmsubmdp, 1, float64, float128, f64, ==, sub, 0, 0, 1)
+VSX_MADD(xsnmaddadp, 1, float64, float128, f64, !=, add, 1, 1, 1)
+VSX_MADD(xsnmaddmdp, 1, float64, float128, f64, !=, add, 1, 0, 1)
+VSX_MADD(xsnmsubadp, 1, float64, float128, f64, ==, sub, 1, 1, 1)
+VSX_MADD(xsnmsubmdp, 1, float64, float128, f64, ==, sub, 1, 0, 1)
+
+VSX_MADD(xvmaddadp, 2, float64, float128, f64, !=, add, 0, 1, 0)
+VSX_MADD(xvmaddmdp, 2, float64, float128, f64, !=, add, 0, 0, 0)
+VSX_MADD(xvmsubadp, 2, float64, float128, f64, ==, sub, 0, 1, 0)
+VSX_MADD(xvmsubmdp, 2, float64, float128, f64, ==, sub, 0, 0, 0)
+VSX_MADD(xvnmaddadp, 2, float64, float128, f64, !=, add, 1, 1, 0)
+VSX_MADD(xvnmaddmdp, 2, float64, float128, f64, !=, add, 1, 0, 0)
+VSX_MADD(xvnmsubadp, 2, float64, float128, f64, ==, sub, 1, 1, 0)
+VSX_MADD(xvnmsubmdp, 2, float64, float128, f64, ==, sub, 1, 0, 0)
+
+VSX_MADD(xvmaddasp, 4, float32, float64, f32, !=, add, 0, 1, 0)
+VSX_MADD(xvmaddmsp, 4, float32, float64, f32, !=, add, 0, 0, 0)
+VSX_MADD(xvmsubasp, 4, float32, float64, f32, ==, sub, 0, 1, 0)
+VSX_MADD(xvmsubmsp, 4, float32, float64, f32, ==, sub, 0, 0, 0)
+VSX_MADD(xvnmaddasp, 4, float32, float64, f32, !=, add, 1, 1, 0)
+VSX_MADD(xvnmaddmsp, 4, float32, float64, f32, !=, add, 1, 0, 0)
+VSX_MADD(xvnmsubasp, 4, float32, float64, f32, ==, sub, 1, 1, 0)
+VSX_MADD(xvnmsubmsp, 4, float32, float64, f32, ==, sub, 1, 0, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index e1abada..15f1b95 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -260,6 +260,14 @@ DEF_HELPER_2(xssqrtdp, void, env, i32)
  DEF_HELPER_2(xsrsqrtedp, void, env, i32)
  DEF_HELPER_2(xstdivdp, void, env, i32)
  DEF_HELPER_2(xstsqrtdp, void, env, i32)
+DEF_HELPER_2(xsmaddadp, void, env, i32)
+DEF_HELPER_2(xsmaddmdp, void, env, i32)
+DEF_HELPER_2(xsmsubadp, void, env, i32)
+DEF_HELPER_2(xsmsubmdp, void, env, i32)
+DEF_HELPER_2(xsnmaddadp, void, env, i32)
+DEF_HELPER_2(xsnmaddmdp, void, env, i32)
+DEF_HELPER_2(xsnmsubadp, void, env, i32)
+DEF_HELPER_2(xsnmsubmdp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
@@ -270,6 +278,14 @@ DEF_HELPER_2(xvsqrtdp, void, env, i32)
  DEF_HELPER_2(xvrsqrtedp, void, env, i32)
  DEF_HELPER_2(xvtdivdp, void, env, i32)
  DEF_HELPER_2(xvtsqrtdp, void, env, i32)
+DEF_HELPER_2(xvmaddadp, void, env, i32)
+DEF_HELPER_2(xvmaddmdp, void, env, i32)
+DEF_HELPER_2(xvmsubadp, void, env, i32)
+DEF_HELPER_2(xvmsubmdp, void, env, i32)
+DEF_HELPER_2(xvnmaddadp, void, env, i32)
+DEF_HELPER_2(xvnmaddmdp, void, env, i32)
+DEF_HELPER_2(xvnmsubadp, void, env, i32)
+DEF_HELPER_2(xvnmsubmdp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -280,6 +296,14 @@ DEF_HELPER_2(xvsqrtsp, void, env, i32)
  DEF_HELPER_2(xvrsqrtesp, void, env, i32)
  DEF_HELPER_2(xvtdivsp, void, env, i32)
  DEF_HELPER_2(xvtsqrtsp, void, env, i32)
+DEF_HELPER_2(xvmaddasp, void, env, i32)
+DEF_HELPER_2(xvmaddmsp, void, env, i32)
+DEF_HELPER_2(xvmsubasp, void, env, i32)
+DEF_HELPER_2(xvmsubmsp, void, env, i32)
+DEF_HELPER_2(xvnmaddasp, void, env, i32)
+DEF_HELPER_2(xvnmaddmsp, void, env, i32)
+DEF_HELPER_2(xvnmsubasp, void, env, i32)
+DEF_HELPER_2(xvnmsubmsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 6978fe0..3783e94 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7302,6 +7302,14 @@ GEN_VSX_HELPER_2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsmaddadp, 0x04, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsmaddmdp, 0x04, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsmsubadp, 0x04, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsmsubmdp, 0x04, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsnmaddadp, 0x04, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsnmaddmdp, 0x04, 0x15, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -7312,6 +7320,14 @@ GEN_VSX_HELPER_2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvtsqrtdp, 0x14, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmaddadp, 0x04, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmaddmdp, 0x04, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmsubadp, 0x04, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmsubmdp, 0x04, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvnmaddadp, 0x04, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvnmaddmdp, 0x04, 0x1D, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvnmsubadp, 0x04, 0x1E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvnmsubmdp, 0x04, 0x1F, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7322,6 +7338,14 @@ GEN_VSX_HELPER_2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvtsqrtsp, 0x14, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmaddasp, 0x04, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmaddmsp, 0x04, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmsubasp, 0x04, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmsubmsp, 0x04, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvnmaddasp, 0x04, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvnmaddmsp, 0x04, 0x19, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvnmsubasp, 0x04, 0x1A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvnmsubmsp, 0x04, 0x1B, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10014,6 +10038,14 @@ GEN_XX2FORM(xssqrtdp,  0x16, 0x04, PPC2_VSX),
  GEN_XX2FORM(xsrsqrtedp,  0x14, 0x04, PPC2_VSX),
  GEN_XX3FORM(xstdivdp,  0x14, 0x07, PPC2_VSX),
  GEN_XX2FORM(xstsqrtdp,  0x14, 0x06, PPC2_VSX),
+GEN_XX3FORM(xsmaddadp, 0x04, 0x04, PPC2_VSX),
+GEN_XX3FORM(xsmaddmdp, 0x04, 0x05, PPC2_VSX),
+GEN_XX3FORM(xsmsubadp, 0x04, 0x06, PPC2_VSX),
+GEN_XX3FORM(xsmsubmdp, 0x04, 0x07, PPC2_VSX),
+GEN_XX3FORM(xsnmaddadp, 0x04, 0x14, PPC2_VSX),
+GEN_XX3FORM(xsnmaddmdp, 0x04, 0x15, PPC2_VSX),
+GEN_XX3FORM(xsnmsubadp, 0x04, 0x16, PPC2_VSX),
+GEN_XX3FORM(xsnmsubmdp, 0x04, 0x17, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
@@ -10024,6 +10056,14 @@ GEN_XX2FORM(xvsqrtdp,  0x16, 0x0C, PPC2_VSX),
  GEN_XX2FORM(xvrsqrtedp,  0x14, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvtdivdp, 0x14, 0x0F, PPC2_VSX),
  GEN_XX2FORM(xvtsqrtdp, 0x14, 0x0E, PPC2_VSX),
+GEN_XX3FORM(xvmaddadp, 0x04, 0x0C, PPC2_VSX),
+GEN_XX3FORM(xvmaddmdp, 0x04, 0x0D, PPC2_VSX),
+GEN_XX3FORM(xvmsubadp, 0x04, 0x0E, PPC2_VSX),
+GEN_XX3FORM(xvmsubmdp, 0x04, 0x0F, PPC2_VSX),
+GEN_XX3FORM(xvnmaddadp, 0x04, 0x1C, PPC2_VSX),
+GEN_XX3FORM(xvnmaddmdp, 0x04, 0x1D, PPC2_VSX),
+GEN_XX3FORM(xvnmsubadp, 0x04, 0x1E, PPC2_VSX),
+GEN_XX3FORM(xvnmsubmdp, 0x04, 0x1F, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10034,6 +10074,14 @@ GEN_XX2FORM(xvsqrtsp, 0x16, 0x08, PPC2_VSX),
  GEN_XX2FORM(xvrsqrtesp, 0x14, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvtdivsp, 0x14, 0x0B, PPC2_VSX),
  GEN_XX2FORM(xvtsqrtsp, 0x14, 0x0A, PPC2_VSX),
+GEN_XX3FORM(xvmaddasp, 0x04, 0x08, PPC2_VSX),
+GEN_XX3FORM(xvmaddmsp, 0x04, 0x09, PPC2_VSX),
+GEN_XX3FORM(xvmsubasp, 0x04, 0x0A, PPC2_VSX),
+GEN_XX3FORM(xvmsubmsp, 0x04, 0x0B, PPC2_VSX),
+GEN_XX3FORM(xvnmaddasp, 0x04, 0x18, PPC2_VSX),
+GEN_XX3FORM(xvnmaddmsp, 0x04, 0x19, PPC2_VSX),
+GEN_XX3FORM(xvnmsubasp, 0x04, 0x1A, PPC2_VSX),
+GEN_XX3FORM(xvnmsubmsp, 0x04, 0x1B, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1


* [Qemu-devel] [PATCH 14/19] Add VSX xscmp*dp Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (12 preceding siblings ...)
  2013-10-24 16:25 ` [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions Tom Musta
@ 2013-10-24 16:25 ` Tom Musta
  2013-10-24 20:39   ` Richard Henderson
  2013-10-24 16:26 ` [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions Tom Musta
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:25 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX scalar floating point compare ordered
(xscmpodp) and compare unordered (xscmpudp) instructions.
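
As a rough illustration of the condition code that the helper in this
patch writes to the CR field and to FPSCR[FPRF], here is a standalone
sketch using host doubles.  The name scalar_cmp_cc is hypothetical and
the exception side is left out: in the helper, a signalling NaN raises
VXSNAN for both instructions and the ordered form additionally raises
VXVC when either operand is any NaN.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical illustration only, not QEMU code.
 * Returns the 4-bit condition code: LT=8, GT=4, EQ=2, unordered=1. */
static uint32_t scalar_cmp_cc(double a, double b)
{
    if (isnan(a) || isnan(b)) {
        return 0x1;             /* unordered */
    }
    if (a < b) {
        return 0x8;             /* less than */
    }
    if (a > b) {
        return 0x4;             /* greater than */
    }
    return 0x2;                 /* equal, including +0 vs -0 */
}
```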

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   39 +++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    2 ++
  target-ppc/translate.c  |    4 ++++
  3 files changed, 45 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 12e7abc..0373913 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2284,3 +2284,42 @@ VSX_MADD(xvnmaddasp, 4, float32, float64, f32, !=, add, 1, 1, 0)
  VSX_MADD(xvnmaddmsp, 4, float32, float64, f32, !=, add, 1, 0, 0)
  VSX_MADD(xvnmsubasp, 4, float32, float64, f32, ==, sub, 1, 1, 0)
  VSX_MADD(xvnmsubmsp, 4, float32, float64, f32, ==, sub, 1, 0, 0)
+
+#define VSX_SCALAR_CMP(op, ordered)                                      \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
+{                                                                        \
+    ppc_vsr_t xa, xb;                                                    \
+    uint32_t cc = 0;                                                     \
+                                                                         \
+    getVSR(xA(opcode), &xa, env);                                        \
+    getVSR(xB(opcode), &xb, env);                                        \
+                                                                         \
+    if (unlikely(float64_is_any_nan(xa.f64[0]) ||                        \
+                 float64_is_any_nan(xb.f64[0]))) {                       \
+        if (float64_is_signaling_nan(xa.f64[0]) ||                       \
+            float64_is_signaling_nan(xb.f64[0])) {                       \
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);       \
+        }                                                                \
+        if (ordered) {                                                   \
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);         \
+        }                                                                \
+        cc = 1;                                                          \
+    } else {                                                             \
+        if (float64_lt(xa.f64[0], xb.f64[0], &env->fp_status)) {         \
+            cc = 8;                                                      \
+        } else if (!float64_le(xa.f64[0], xb.f64[0], &env->fp_status)) { \
+            cc = 4;                                                      \
+        } else {                                                         \
+            cc = 2;                                                      \
+        }                                                                \
+    }                                                                    \
+                                                                         \
+    env->fpscr &= ~(0x0F << FPSCR_FPRF);                                 \
+    env->fpscr |= cc << FPSCR_FPRF;                                      \
+    env->crf[BF(opcode)] = cc;                                           \
+                                                                         \
+    helper_float_check_status(env);                                      \
+}
+
+VSX_SCALAR_CMP(xscmpodp, 1)
+VSX_SCALAR_CMP(xscmpudp, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 15f1b95..bfb1964 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -268,6 +268,8 @@ DEF_HELPER_2(xsnmaddadp, void, env, i32)
  DEF_HELPER_2(xsnmaddmdp, void, env, i32)
  DEF_HELPER_2(xsnmsubadp, void, env, i32)
  DEF_HELPER_2(xsnmsubmdp, void, env, i32)
+DEF_HELPER_2(xscmpodp, void, env, i32)
+DEF_HELPER_2(xscmpudp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 3783e94..053df68 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7310,6 +7310,8 @@ GEN_VSX_HELPER_2(xsnmaddadp, 0x04, 0x14, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsnmaddmdp, 0x04, 0x15, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10046,6 +10048,8 @@ GEN_XX3FORM(xsnmaddadp, 0x04, 0x14, PPC2_VSX),
  GEN_XX3FORM(xsnmaddmdp, 0x04, 0x15, PPC2_VSX),
  GEN_XX3FORM(xsnmsubadp, 0x04, 0x16, PPC2_VSX),
  GEN_XX3FORM(xsnmsubmdp, 0x04, 0x17, PPC2_VSX),
+GEN_XX2FORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
+GEN_XX2FORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1


* [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (13 preceding siblings ...)
  2013-10-24 16:25 ` [Qemu-devel] [PATCH 14/19] Add VSX xscmp*dp Instructions Tom Musta
@ 2013-10-24 16:26 ` Tom Musta
  2013-10-24 20:45   ` Richard Henderson
  2013-10-24 22:10   ` Peter Maydell
  2013-10-24 16:26 ` [Qemu-devel] [PATCH 16/19] Add VSX Vector Compare Instructions Tom Musta
                   ` (3 subsequent siblings)
  18 siblings, 2 replies; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:26 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX floating point maximum and minimum
instructions:

   - xsmaxdp, xvmaxdp, xvmaxsp
   - xsmindp, xvmindp, xvminsp

The Power ISA defines maximum and minimum for several boundary
cases (such as NaN operands and signed zeros) in ways that the
standard softfloat comparison routines (e.g. float64_lt) do not
reproduce directly.  Therefore specific routines for comparing
64-bit and 32-bit floating point numbers are implemented in the
PowerPC helper code.
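
To make the signed zero case concrete: an IEEE style less-than treats
+0 and -0 as equal, whereas the comparison in this patch looks at the
sign bit first, so -0 orders below +0.  The following is a minimal
sketch of that sign-then-magnitude ordering, assuming both inputs are
non-NaN binary64 bit patterns.  compare_bits64 and max_by_bits are
hypothetical names, and comparing the 63-bit magnitude as one integer
is used here as a shorthand for the patch's separate exponent and
fraction comparisons.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only, not QEMU code.
 * Returns -1, 0 or 1 for a < b, a == b, a > b; inputs must not be NaN. */
static int compare_bits64(uint64_t a, uint64_t b)
{
    uint64_t asgn = a >> 63;
    uint64_t bsgn = b >> 63;

    /* Different signs: the negative operand is the smaller one,
     * which also orders -0 below +0. */
    if (asgn != bsgn) {
        return asgn ? -1 : 1;
    }

    uint64_t amag = a & 0x7FFFFFFFFFFFFFFFULL;
    uint64_t bmag = b & 0x7FFFFFFFFFFFFFFFULL;

    if (amag == bmag) {
        return 0;
    }
    /* Same sign: a smaller magnitude is smaller for positive values
     * and larger for negative values. */
    if (amag < bmag) {
        return asgn ? 1 : -1;
    }
    return asgn ? -1 : 1;
}

/* Maximum of two non-NaN doubles using the ordering above. */
static double max_by_bits(double x, double y)
{
    uint64_t a, b;

    memcpy(&a, &x, sizeof(a));
    memcpy(&b, &y, sizeof(b));
    return compare_bits64(a, b) < 0 ? y : x;
}
```

NaN operands are deliberately out of scope here; the helper macros in
the patch handle them before any comparison is made.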

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |  162 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    6 ++
  target-ppc/translate.c  |   12 ++++
  3 files changed, 180 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 0373913..29b27ce 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2323,3 +2323,165 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \

  VSX_SCALAR_CMP(xscmpodp, 1)
  VSX_SCALAR_CMP(xscmpudp, 0)
+
+#define float64_snan_to_qnan(x) ((x) | 0x0008000000000000ul)
+
+static int compare_float64(float64 a, float64 b)
+{
+    uint64_t asgn = a & 0x8000000000000000ul;
+    uint64_t bsgn = b & 0x8000000000000000ul;
+
+    if (asgn != bsgn) {
+        return asgn ? -1 : 1;
+    }
+
+    uint64_t aexp = (a & 0x7FF0000000000000ul) >> (64-12);
+    uint64_t bexp = (b & 0x7FF0000000000000ul) >> (64-12);
+
+    if (aexp < bexp) {
+        return asgn ? 1 : -1;
+    } else if (aexp > bexp) {
+        return asgn ? -1 : 1;
+    } else {
+        uint64_t afrac = a & 0x000FFFFFFFFFFFFFul;
+        uint64_t bfrac = b & 0x000FFFFFFFFFFFFFul;
+
+        if (afrac < bfrac) {
+            return asgn ? 1 : -1;
+        } else if (afrac > bfrac) {
+            return asgn ? -1 : 1;
+        } else {
+            return 0;
+        }
+    }
+}
+
+#define float32_snan_to_qnan(x) ((x) | 0x00400000)
+
+static int compare_float32(float32 a, float32 b)
+{
+    uint64_t asgn = a & 0x80000000;
+    uint64_t bsgn = b & 0x80000000;
+
+    if (asgn != bsgn) {
+        return asgn ? -1 : 1;
+    }
+
+    uint64_t aexp = (a & 0x7F800000) >> (32-9);
+    uint64_t bexp = (b & 0x7F800000) >> (32-9);
+
+    if (aexp < bexp) {
+        return asgn ? 1 : -1;
+    } else if (aexp > bexp) {
+        return asgn ? -1 : 1;
+    } else {
+        uint64_t afrac = a & 0x007FFFFF;
+        uint64_t bfrac = b & 0x007FFFFF;
+
+        if (afrac < bfrac) {
+            return asgn ? 1 : -1;
+        } else if (afrac > bfrac) {
+            return asgn ? -1 : 1;
+        } else {
+            return 0;
+        }
+    }
+}
+
+/* VSX_MAX - VSX floating point maximum
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ */
+#define VSX_MAX(op, nels, tp, fld)                                            \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xa, xb;                                                     \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xA(opcode), &xa, env);                                             \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely(tp##_is_any_nan(xa.fld[i]) ||                            \
+                     tp##_is_any_nan(xb.fld[i]))) {                           \
+            if (tp##_is_signaling_nan(xa.fld[i])) {                           \
+                xt.fld[i] = tp##_snan_to_qnan(xa.fld[i]);                     \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);        \
+            } else if (tp##_is_signaling_nan(xb.fld[i])) {                    \
+                xt.fld[i] = tp##_snan_to_qnan(xb.fld[i]);                     \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);        \
+            } else if (tp##_is_quiet_nan(xb.fld[i])) {                        \
+                xt.fld[i] = xa.fld[i];                                        \
+            } else { /* XA is QNaN */                                         \
+                xt.fld[i] = xb.fld[i];                                        \
+            }                                                                 \
+        } else if (unlikely(tp##_is_infinity(xa.fld[i]))) {                   \
+            xt.fld[i] = tp##_is_neg(xa.fld[i]) ? xb.fld[i] : xa.fld[i];       \
+        } else if (unlikely(tp##_is_infinity(xb.fld[i]))) {                   \
+            xt.fld[i] = tp##_is_neg(xb.fld[i]) ? xa.fld[i] : xb.fld[i];       \
+        }                                                                     \
+        else {                                                                \
+            xt.fld[i] = (compare_##tp(xa.fld[i], xb.fld[i]) < 0) ?            \
+                        xb.fld[i] : xa.fld[i];                                \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_MAX(xsmaxdp, 1, float64, f64)
+VSX_MAX(xvmaxdp, 2, float64, f64)
+VSX_MAX(xvmaxsp, 4, float32, f32)
+
+/* VSX_MIN - VSX floating point minimum
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ */
+#define VSX_MIN(op, nels, tp, fld)                                            \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt, xa, xb;                                                     \
+    int i;                                                                    \
+                                                                              \
+    getVSR(xA(opcode), &xa, env);                                             \
+    getVSR(xB(opcode), &xb, env);                                             \
+    getVSR(xT(opcode), &xt, env);                                             \
+                                                                              \
+    for (i = 0; i < nels; i++) {                                              \
+        if (unlikely(tp##_is_any_nan(xa.fld[i]) ||                            \
+                     tp##_is_any_nan(xb.fld[i]))) {                           \
+            if (tp##_is_signaling_nan(xa.fld[i])) {                           \
+                xt.fld[i] = tp##_snan_to_qnan(xa.fld[i]);                     \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);        \
+            } else if (tp##_is_signaling_nan(xb.fld[i])) {                    \
+                xt.fld[i] = tp##_snan_to_qnan(xb.fld[i]);                     \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);        \
+            } else if (tp##_is_quiet_nan(xb.fld[i])) {                        \
+                xt.fld[i] = xa.fld[i];                                        \
+            } else { /* XA is QNaN */                                         \
+                xt.fld[i] = xb.fld[i];                                        \
+            }                                                                 \
+        } else if (unlikely(tp##_is_infinity(xa.fld[i]))) {                   \
+            xt.fld[i] = tp##_is_neg(xa.fld[i]) ? xa.fld[i] : xb.fld[i];       \
+        } else if (unlikely(tp##_is_infinity(xb.fld[i]))) {                   \
+            xt.fld[i] = tp##_is_neg(xb.fld[i]) ? xb.fld[i] : xa.fld[i];       \
+        }                                                                     \
+        else {                                                                \
+            xt.fld[i] = (compare_##tp(xa.fld[i], xb.fld[i]) < 0) ?            \
+                        xa.fld[i] : xb.fld[i];                                \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    putVSR(xT(opcode), &xt, env);                                             \
+    helper_float_check_status(env);                                           \
+}
+
+VSX_MIN(xsmindp, 1, float64, f64)
+VSX_MIN(xvmindp, 2, float64, f64)
+VSX_MIN(xvminsp, 4, float32, f32)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index bfb1964..40c523a 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -270,6 +270,8 @@ DEF_HELPER_2(xsnmsubadp, void, env, i32)
  DEF_HELPER_2(xsnmsubmdp, void, env, i32)
  DEF_HELPER_2(xscmpodp, void, env, i32)
  DEF_HELPER_2(xscmpudp, void, env, i32)
+DEF_HELPER_2(xsmaxdp, void, env, i32)
+DEF_HELPER_2(xsmindp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
@@ -288,6 +290,8 @@ DEF_HELPER_2(xvnmaddadp, void, env, i32)
  DEF_HELPER_2(xvnmaddmdp, void, env, i32)
  DEF_HELPER_2(xvnmsubadp, void, env, i32)
  DEF_HELPER_2(xvnmsubmdp, void, env, i32)
+DEF_HELPER_2(xvmaxdp, void, env, i32)
+DEF_HELPER_2(xvmindp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -306,6 +310,8 @@ DEF_HELPER_2(xvnmaddasp, void, env, i32)
  DEF_HELPER_2(xvnmaddmsp, void, env, i32)
  DEF_HELPER_2(xvnmsubasp, void, env, i32)
  DEF_HELPER_2(xvnmsubmsp, void, env, i32)
+DEF_HELPER_2(xvmaxsp, void, env, i32)
+DEF_HELPER_2(xvminsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 053df68..67d5267 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7312,6 +7312,8 @@ GEN_VSX_HELPER_2(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsmindp, 0x00, 0x15, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -7330,6 +7332,8 @@ GEN_VSX_HELPER_2(xvnmaddadp, 0x04, 0x1C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvnmaddmdp, 0x04, 0x1D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvnmsubadp, 0x04, 0x1E, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvnmsubmdp, 0x04, 0x1F, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmaxdp, 0x00, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmindp, 0x00, 0x1D, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7348,6 +7352,8 @@ GEN_VSX_HELPER_2(xvnmaddasp, 0x04, 0x18, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvnmaddmsp, 0x04, 0x19, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvnmsubasp, 0x04, 0x1A, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvnmsubmsp, 0x04, 0x1B, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvmaxsp, 0x00, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvminsp, 0x00, 0x19, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10050,6 +10056,8 @@ GEN_XX3FORM(xsnmsubadp, 0x04, 0x16, PPC2_VSX),
  GEN_XX3FORM(xsnmsubmdp, 0x04, 0x17, PPC2_VSX),
  GEN_XX2FORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
  GEN_XX2FORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
+GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
+GEN_XX3FORM(xsmindp, 0x00, 0x15, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
@@ -10068,6 +10076,8 @@ GEN_XX3FORM(xvnmaddadp, 0x04, 0x1C, PPC2_VSX),
  GEN_XX3FORM(xvnmaddmdp, 0x04, 0x1D, PPC2_VSX),
  GEN_XX3FORM(xvnmsubadp, 0x04, 0x1E, PPC2_VSX),
  GEN_XX3FORM(xvnmsubmdp, 0x04, 0x1F, PPC2_VSX),
+GEN_XX3FORM(xvmaxdp, 0x00, 0x1C, PPC2_VSX),
+GEN_XX3FORM(xvmindp, 0x00, 0x1D, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10086,6 +10096,8 @@ GEN_XX3FORM(xvnmaddasp, 0x04, 0x18, PPC2_VSX),
  GEN_XX3FORM(xvnmaddmsp, 0x04, 0x19, PPC2_VSX),
  GEN_XX3FORM(xvnmsubasp, 0x04, 0x1A, PPC2_VSX),
  GEN_XX3FORM(xvnmsubmsp, 0x04, 0x1B, PPC2_VSX),
+GEN_XX3FORM(xvmaxsp, 0x00, 0x18, PPC2_VSX),
+GEN_XX3FORM(xvminsp, 0x00, 0x19, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1
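
Note: the NaN ordering in the VSX_MIN loop body can be sketched in
plain C as below. This is a minimal illustration and not the helper
itself. It works on raw binary64 bit patterns (as softfloat does), and
the names f64bits, QUIET_BIT and vsx_min_elem are hypothetical ones
introduced only for this sketch. The non-NaN case uses the host "<"
operator as a stand-in for the helper's comparison, which is an
assumption.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Raw IEEE 754 binary64 pattern (hypothetical typedef for this sketch). */
typedef uint64_t f64bits;

#define QUIET_BIT 0x0008000000000000ULL

static bool is_nan(f64bits v)
{
    return (v & 0x7fffffffffffffffULL) > 0x7ff0000000000000ULL;
}

/* A signaling NaN is a NaN whose most significant fraction bit is clear. */
static bool is_snan(f64bits v)
{
    return is_nan(v) && !(v & QUIET_BIT);
}

static double to_host(f64bits v)
{
    double d;
    memcpy(&d, &v, sizeof(d));
    return d;
}

/* Pick one result element in the same order as the VSX_MIN loop body. */
static f64bits vsx_min_elem(f64bits a, f64bits b, bool *vxsnan)
{
    if (is_nan(a) || is_nan(b)) {
        if (is_snan(a)) {
            *vxsnan = true;          /* helper raises VXSNAN here */
            return a | QUIET_BIT;
        }
        if (is_snan(b)) {
            *vxsnan = true;
            return b | QUIET_BIT;
        }
        return is_nan(b) ? a : b;    /* a QNaN loses to the other operand */
    }
    /* Assumption: host comparison stands in for the helper's compare. */
    return (to_host(a) < to_host(b)) ? a : b;
}
```

A quiet NaN paired with a number therefore yields the number, while a
signaling NaN is quieted and returned.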

* [Qemu-devel] [PATCH 16/19] Add VSX Vector Compare Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (14 preceding siblings ...)
  2013-10-24 16:26 ` [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions Tom Musta
@ 2013-10-24 16:26 ` Tom Musta
  2013-10-24 16:27 ` [Qemu-devel] [PATCH 17/19] Add VSX Floating Point to Floating Point Conversion Instructions Tom Musta
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:26 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

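This patch adds the VSX vector floating point compare instructions. The
record ([.]) forms also update CR field 6 to report whether all, or none,
of the element comparisons were true: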

   - xvcmpeqdp[.], xvcmpgedp[.], xvcmpgtdp[.]
   - xvcmpeqsp[.], xvcmpgesp[.], xvcmpgtsp[.]

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   58 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    6 +++++
  target-ppc/translate.c  |   23 ++++++++++++++++++
  3 files changed, 87 insertions(+), 0 deletions(-)
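
Note: as a rough sketch of what the helper computes, the two-element
"greater than or equal" case might look like this in plain C. The
function name vsx_cmpge_dp is hypothetical, host doubles stand in for
softfloat values, and the exception flagging done by the real helper is
left out.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical stand-in for the two-element xvcmpgedp. case. */
static unsigned vsx_cmpge_dp(const double xa[2], const double xb[2],
                             uint64_t xt[2])
{
    int all_true = 1;
    int all_false = 1;

    for (int i = 0; i < 2; i++) {
        /* A NaN operand makes the element compare false. */
        int t = !isnan(xa[i]) && !isnan(xb[i]) && xb[i] <= xa[i];

        xt[i] = t ? UINT64_MAX : 0;  /* all-ones or all-zeros mask */
        if (t) {
            all_false = 0;
        } else {
            all_true = 0;
        }
    }
    /* CR6 summary as written by the record form. */
    return (all_true ? 0x8 : 0) | (all_false ? 0x2 : 0);
}
```

The return value mirrors the CR6 encoding in the helper: 0x8 when every
element compared true, 0x2 when none did, and 0 for a mixed result.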

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 29b27ce..7bcd213 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2485,3 +2485,61 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
  VSX_MIN(xsmindp, 1, float64, f64)
  VSX_MIN(xvmindp, 2, float64, f64)
  VSX_MIN(xvminsp, 4, float32, f32)
+
+/* VSX_CMP - VSX floating point compare
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   cmp   - comparison operation
+ *   svxvc - set VXVC bit
+ */
+#define VSX_CMP(op, nels, tp, fld, cmp, svxvc)                            \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
+{                                                                         \
+    ppc_vsr_t xt, xa, xb;                                                 \
+    int i;                                                                \
+    int all_true = 1;                                                     \
+    int all_false = 1;                                                    \
+                                                                          \
+    getVSR(xA(opcode), &xa, env);                                         \
+    getVSR(xB(opcode), &xb, env);                                         \
+    getVSR(xT(opcode), &xt, env);                                         \
+                                                                          \
+    for (i = 0; i < nels; i++) {                                          \
+        if (unlikely(tp##_is_any_nan(xa.fld[i]) ||                        \
+                     tp##_is_any_nan(xb.fld[i]))) {                       \
+            if (tp##_is_signaling_nan(xa.fld[i]) ||                       \
+                tp##_is_signaling_nan(xb.fld[i])) {                       \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);    \
+            }                                                             \
+            if (svxvc) {                                                  \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXVC, 0);      \
+            }                                                             \
+            xt.fld[i] = 0;                                                \
+            all_true = 0;                                                 \
+        }                                                                 \
+        else {                                                            \
+            if (tp##_##cmp(xb.fld[i], xa.fld[i], &env->fp_status) == 1) { \
+                xt.fld[i] = -1;                                           \
+                all_false = 0;                                            \
+            } else {                                                      \
+                xt.fld[i] = 0;                                            \
+                all_true = 0;                                             \
+            }                                                             \
+        }                                                                 \
+    }                                                                     \
+                                                                          \
+    putVSR(xT(opcode), &xt, env);                                         \
+    if ((opcode >> (31-21)) & 1) {                                        \
+        env->crf[6] = (all_true ? 0x8 : 0) | (all_false ? 0x2 : 0);       \
+    }                                                                     \
+    helper_float_check_status(env);                                       \
+}
+
+VSX_CMP(xvcmpeqdp, 2, float64, f64, eq, 0)
+VSX_CMP(xvcmpgedp, 2, float64, f64, le, 1)
+VSX_CMP(xvcmpgtdp, 2, float64, f64, lt, 1)
+VSX_CMP(xvcmpeqsp, 4, float32, f32, eq, 0)
+VSX_CMP(xvcmpgesp, 4, float32, f32, le, 1)
+VSX_CMP(xvcmpgtsp, 4, float32, f32, lt, 1)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 40c523a..333bce4 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -292,6 +292,9 @@ DEF_HELPER_2(xvnmsubadp, void, env, i32)
  DEF_HELPER_2(xvnmsubmdp, void, env, i32)
  DEF_HELPER_2(xvmaxdp, void, env, i32)
  DEF_HELPER_2(xvmindp, void, env, i32)
+DEF_HELPER_2(xvcmpeqdp, void, env, i32)
+DEF_HELPER_2(xvcmpgedp, void, env, i32)
+DEF_HELPER_2(xvcmpgtdp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -312,6 +315,9 @@ DEF_HELPER_2(xvnmsubasp, void, env, i32)
  DEF_HELPER_2(xvnmsubmsp, void, env, i32)
  DEF_HELPER_2(xvmaxsp, void, env, i32)
  DEF_HELPER_2(xvminsp, void, env, i32)
+DEF_HELPER_2(xvcmpeqsp, void, env, i32)
+DEF_HELPER_2(xvcmpgesp, void, env, i32)
+DEF_HELPER_2(xvcmpgtsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 67d5267..46813eb 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7334,6 +7334,9 @@ GEN_VSX_HELPER_2(xvnmsubadp, 0x04, 0x1E, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvnmsubmdp, 0x04, 0x1F, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmaxdp, 0x00, 0x1C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmindp, 0x00, 0x1D, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcmpgtdp, 0x0C, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcmpgedp, 0x0C, 0x0E, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7354,6 +7357,9 @@ GEN_VSX_HELPER_2(xvnmsubasp, 0x04, 0x1A, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvnmsubmsp, 0x04, 0x1B, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvmaxsp, 0x00, 0x18, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvminsp, 0x00, 0x19, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcmpgtsp, 0x0C, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcmpgesp, 0x0C, 0x0A, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10004,6 +10010,17 @@ GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2), \
  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 2, opc3, 0, PPC_NONE, fl2), \
  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 3, opc3, 0, PPC_NONE, fl2)

+#undef GEN_XX3_RC_FORM
+#define GEN_XX3_RC_FORM(name, opc2, opc3, fl2)                          \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x00, opc3 | 0x00, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x01, opc3 | 0x00, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x02, opc3 | 0x00, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x03, opc3 | 0x00, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x00, opc3 | 0x10, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x01, opc3 | 0x10, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x02, opc3 | 0x10, 0, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x03, opc3 | 0x10, 0, PPC_NONE, fl2)
+
  #undef GEN_XX3FORM_DM
  #define GEN_XX3FORM_DM(name, opc2, opc3) \
  GEN_HANDLER2_E(name, #name, 0x3C, opc2|0x00, opc3|0x00, 0, PPC_NONE, PPC2_VSX),\
@@ -10078,6 +10095,9 @@ GEN_XX3FORM(xvnmsubadp, 0x04, 0x1E, PPC2_VSX),
  GEN_XX3FORM(xvnmsubmdp, 0x04, 0x1F, PPC2_VSX),
  GEN_XX3FORM(xvmaxdp, 0x00, 0x1C, PPC2_VSX),
  GEN_XX3FORM(xvmindp, 0x00, 0x1D, PPC2_VSX),
+GEN_XX3_RC_FORM(xvcmpeqdp, 0x0C, 0x0C, PPC2_VSX),
+GEN_XX3_RC_FORM(xvcmpgtdp, 0x0C, 0x0D, PPC2_VSX),
+GEN_XX3_RC_FORM(xvcmpgedp, 0x0C, 0x0E, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10098,6 +10118,9 @@ GEN_XX3FORM(xvnmsubasp, 0x04, 0x1A, PPC2_VSX),
  GEN_XX3FORM(xvnmsubmsp, 0x04, 0x1B, PPC2_VSX),
  GEN_XX3FORM(xvmaxsp, 0x00, 0x18, PPC2_VSX),
  GEN_XX3FORM(xvminsp, 0x00, 0x19, PPC2_VSX),
+GEN_XX3_RC_FORM(xvcmpeqsp, 0x0C, 0x08, PPC2_VSX),
+GEN_XX3_RC_FORM(xvcmpgtsp, 0x0C, 0x09, PPC2_VSX),
+GEN_XX3_RC_FORM(xvcmpgesp, 0x0C, 0x0A, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 17/19] Add VSX Floating Point to Floating Point Conversion Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (15 preceding siblings ...)
  2013-10-24 16:26 ` [Qemu-devel] [PATCH 16/19] Add VSX Vector Compare Instructions Tom Musta
@ 2013-10-24 16:27 ` Tom Musta
  2013-10-24 20:49   ` Richard Henderson
  2013-10-24 16:27 ` [Qemu-devel] [PATCH 18/19] Add VSX ISA2.06 Integer " Tom Musta
  2013-10-24 16:28 ` [Qemu-devel] [PATCH 19/19] Add VSX Rounding Instructions Tom Musta
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:27 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX instructions that convert between floating
point formats: xscvdpsp, xscvspdp, xvcvdpsp, xvcvspdp.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |    4 ++++
  target-ppc/translate.c  |    8 ++++++++
  3 files changed, 58 insertions(+), 0 deletions(-)
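
Note: the JOFFSET indexing is easiest to see in isolation. The sketch
below uses a simplified stand-in for ppc_vsr_t (vsr_sketch_t, a
hypothetical name) and a runtime probe in place of the
HOST_WORDS_BIGENDIAN test. It assumes a plain big-endian or
little-endian host. Word index 2*i + JOFFSET then selects the most
significant word of doubleword i on either kind of host.

```c
#include <stdint.h>

/* Simplified stand-in for ppc_vsr_t: two doublewords overlaid on four words. */
typedef union {
    uint64_t u64[2];
    uint32_t u32[4];
} vsr_sketch_t;

/* Runtime equivalent of JOFFSET: 0 on big-endian hosts, 1 on little-endian. */
static int joffset(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe;
}

/* Most significant 32 bits of doubleword i, independent of host byte order. */
static uint32_t high_word(const vsr_sketch_t *v, int i)
{
    return v->u32[2 * i + joffset()];
}
```

With u64[0] = 0x1122334455667788, high_word(&v, 0) reads 0x11223344 on
both host byte orders, which is the word the single-precision
conversions operate on.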

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 7bcd213..44d9309 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2543,3 +2543,49 @@ VSX_CMP(xvcmpgtdp, 2, float64, f64, lt, 1)
  VSX_CMP(xvcmpeqsp, 4, float32, f32, eq, 0)
  VSX_CMP(xvcmpgesp, 4, float32, f32, le, 1)
  VSX_CMP(xvcmpgtsp, 4, float32, f32, lt, 1)
+
+#if defined(HOST_WORDS_BIGENDIAN)
+#define JOFFSET 0
+#else
+#define JOFFSET 1
+#endif
+
+/* VSX_CVT_FP_TO_FP - VSX floating point/floating point conversion
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   stp   - source type (float32 or float64)
+ *   ttp   - target type (float32 or float64)
+ *   sfld  - source vsr_t field
+ *   tfld  - target vsr_t field (f32 or f64)
+ *   sfprf - set FPRF
+ */
+#define VSX_CVT_FP_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf)    \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                \
+{                                                                  \
+    ppc_vsr_t xt, xb;                                              \
+    int i;                                                         \
+                                                                   \
+    getVSR(xB(opcode), &xb, env);                                  \
+    getVSR(xT(opcode), &xt, env);                                  \
+                                                                   \
+    for (i = 0; i < nels; i++) {                                   \
+        int j = 2*i + JOFFSET;                                     \
+        xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);        \
+        if (unlikely(stp##_is_signaling_nan(xb.sfld))) {           \
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0); \
+            xt.tfld = ttp##_snan_to_qnan(xt.tfld);                 \
+        }                                                          \
+        if (sfprf) {                                               \
+            helper_compute_fprf(env, ttp##_to_float64(xt.tfld,     \
+                                &env->fp_status), sfprf);          \
+        }                                                          \
+    }                                                              \
+                                                                   \
+    putVSR(xT(opcode), &xt, env);                                  \
+    helper_float_check_status(env);                                \
+}
+
+VSX_CVT_FP_TO_FP(xscvdpsp, 1, float64, float32, f64[i], f32[j], 1)
+VSX_CVT_FP_TO_FP(xscvspdp, 1, float32, float64, f32[j], f64[i], 1)
+VSX_CVT_FP_TO_FP(xvcvdpsp, 2, float64, float32, f64[i], f32[j], 0)
+VSX_CVT_FP_TO_FP(xvcvspdp, 2, float32, float64, f32[j], f64[i], 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 333bce4..64b2b2b 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -272,6 +272,8 @@ DEF_HELPER_2(xscmpodp, void, env, i32)
  DEF_HELPER_2(xscmpudp, void, env, i32)
  DEF_HELPER_2(xsmaxdp, void, env, i32)
  DEF_HELPER_2(xsmindp, void, env, i32)
+DEF_HELPER_2(xscvdpsp, void, env, i32)
+DEF_HELPER_2(xscvspdp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
@@ -295,6 +297,7 @@ DEF_HELPER_2(xvmindp, void, env, i32)
  DEF_HELPER_2(xvcmpeqdp, void, env, i32)
  DEF_HELPER_2(xvcmpgedp, void, env, i32)
  DEF_HELPER_2(xvcmpgtdp, void, env, i32)
+DEF_HELPER_2(xvcvdpsp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -318,6 +321,7 @@ DEF_HELPER_2(xvminsp, void, env, i32)
  DEF_HELPER_2(xvcmpeqsp, void, env, i32)
  DEF_HELPER_2(xvcmpgesp, void, env, i32)
  DEF_HELPER_2(xvcmpgtsp, void, env, i32)
+DEF_HELPER_2(xvcvspdp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 46813eb..ddbdeef 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7314,6 +7314,8 @@ GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvspdp, 0x12, 0x14, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -7337,6 +7339,7 @@ GEN_VSX_HELPER_2(xvmindp, 0x00, 0x1D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpgtdp, 0x0C, 0x0D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpgedp, 0x0C, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvdpsp, 0x12, 0x18, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7360,6 +7363,7 @@ GEN_VSX_HELPER_2(xvminsp, 0x00, 0x19, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpgtsp, 0x0C, 0x09, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpgesp, 0x0C, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvspdp, 0x12, 0x1C, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10075,6 +10079,8 @@ GEN_XX2FORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
  GEN_XX2FORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
  GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
  GEN_XX3FORM(xsmindp, 0x00, 0x15, PPC2_VSX),
+GEN_XX2FORM(xscvdpsp, 0x12, 0x10, PPC2_VSX),
+GEN_XX2FORM(xscvspdp, 0x12, 0x14, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
@@ -10098,6 +10104,7 @@ GEN_XX3FORM(xvmindp, 0x00, 0x1D, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpeqdp, 0x0C, 0x0C, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpgtdp, 0x0C, 0x0D, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpgedp, 0x0C, 0x0E, PPC2_VSX),
+GEN_XX2FORM(xvcvdpsp, 0x12, 0x18, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10121,6 +10128,7 @@ GEN_XX3FORM(xvminsp, 0x00, 0x19, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpeqsp, 0x0C, 0x08, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpgtsp, 0x0C, 0x09, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpgesp, 0x0C, 0x0A, PPC2_VSX),
+GEN_XX2FORM(xvcvspdp, 0x12, 0x1C, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 18/19] Add VSX ISA2.06 Integer Conversion Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (16 preceding siblings ...)
  2013-10-24 16:27 ` [Qemu-devel] [PATCH 17/19] Add VSX Floating Point to Floating Point Conversion Instructions Tom Musta
@ 2013-10-24 16:27 ` Tom Musta
  2013-10-24 20:51   ` Richard Henderson
  2013-10-24 16:28 ` [Qemu-devel] [PATCH 19/19] Add VSX Rounding Instructions Tom Musta
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:27 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Integer Conversion instructions defined by
V2.06 of the PowerPC ISA:

   - xscvdpsxds, xscvdpsxws, xscvdpuxds, xscvdpuxws
   - xvcvdpsxds, xvcvdpsxws, xvcvdpuxds, xvcvdpuxws
   - xvcvspsxds, xvcvspsxws, xvcvspuxds, xvcvspuxws
   - xscvsxddp, xscvuxddp
   - xvcvsxddp, xvcvsxwdp, xvcvuxddp, xvcvuxwdp
   - xvcvsxdsp, xvcvsxwsp, xvcvuxdsp, xvcvuxwsp

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
  target-ppc/fpu_helper.c |  108 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |   22 ++++++++++
  target-ppc/translate.c  |   44 +++++++++++++++++++
  3 files changed, 174 insertions(+), 0 deletions(-)
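
Note: the role of the rnan argument can be sketched for the signed
word case as follows. This is an illustration under stated
assumptions, not the helper: dp_to_sxw_sketch is a hypothetical name,
the out-of-range saturation values are assumed to match what softfloat
returns, and the C cast truncates toward zero whereas the helper
leaves rounding to softfloat and env->fp_status.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a double to signed 32-bit word conversion. */
static int32_t dp_to_sxw_sketch(double x, bool *vxcvi)
{
    if (isnan(x)) {
        *vxcvi = true;       /* helper raises VXCVI (and VXSNAN for SNaN) */
        return INT32_MIN;    /* rnan = 0x80000000 for signed words */
    }
    if (x >= 2147483648.0) {
        *vxcvi = true;       /* assumed saturation on positive overflow */
        return INT32_MAX;
    }
    if (x <= -2147483649.0) {
        *vxcvi = true;       /* assumed saturation on negative overflow */
        return INT32_MIN;
    }
    return (int32_t)x;       /* in range: the C cast truncates toward zero */
}
```

The unsigned variants follow the same shape with 0 as the NaN result,
matching the rnan values passed in the macro invocations.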

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 44d9309..d00bcb8 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2589,3 +2589,111 @@ VSX_CVT_FP_TO_FP(xscvdpsp, 1, float64, float32, f64[i], f32[j], 1)
  VSX_CVT_FP_TO_FP(xscvspdp, 1, float32, float64, f32[j], f64[i], 1)
  VSX_CVT_FP_TO_FP(xvcvdpsp, 2, float64, float32, f64[i], f32[j], 0)
  VSX_CVT_FP_TO_FP(xvcvspdp, 2, float32, float64, f32[j], f64[i], 0)
+
+/* VSX_CVT_FP_TO_INT - VSX floating point to integer conversion
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   stp   - source type (float32 or float64)
+ *   ttp   - target type (int32, uint32, int64 or uint64)
+ *   sfld  - source vsr_t field
+ *   tfld  - target vsr_t field
+ *   jdef  - definition of the j index (i or 2*i + JOFFSET)
+ *   rnan  - resulting NaN
+ */
+#define VSX_CVT_FP_TO_INT(op, nels, stp, ttp, sfld, tfld, jdef, rnan)        \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
+{                                                                            \
+    ppc_vsr_t xt, xb;                                                        \
+    int i;                                                                   \
+                                                                             \
+    getVSR(xB(opcode), &xb, env);                                            \
+    getVSR(xT(opcode), &xt, env);                                            \
+                                                                             \
+    for (i = 0; i < nels; i++) {                                             \
+        int j = jdef;                                                        \
+        if (unlikely(stp##_is_any_nan(xb.sfld))) {                           \
+            if (stp##_is_signaling_nan(xb.sfld)) {                           \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);       \
+            }                                                                \
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 0);            \
+            xt.tfld = rnan;                                                  \
+        }                                                                    \
+        else {                                                               \
+            xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);              \
+            if (env->fp_status.float_exception_flags & float_flag_invalid) { \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 0);        \
+            }                                                                \
+        }                                                                    \
+    }                                                                        \
+                                                                             \
+    putVSR(xT(opcode), &xt, env);                                            \
+    helper_float_check_status(env);                                          \
+}
+
+VSX_CVT_FP_TO_INT(xscvdpsxds, 1, float64, int64, f64[j], u64[i], i, \
+                  0x8000000000000000ul)
+VSX_CVT_FP_TO_INT(xscvdpsxws, 1, float64, int32, f64[i], u32[j], \
+                  2*i + JOFFSET, 0x80000000l)
+VSX_CVT_FP_TO_INT(xscvdpuxds, 1, float64, uint64, f64[j], u64[i], i, 0ul)
+VSX_CVT_FP_TO_INT(xscvdpuxws, 1, float64, uint32, f64[i], u32[j], \
+                  2*i + JOFFSET, 0)
+VSX_CVT_FP_TO_INT(xvcvdpsxds, 2, float64, int64, f64[j], u64[i], i, \
+                  0x8000000000000000ul)
+VSX_CVT_FP_TO_INT(xvcvdpsxws, 2, float64, int32, f64[i], u32[j], \
+                  2*i + JOFFSET, 0x80000000l)
+VSX_CVT_FP_TO_INT(xvcvdpuxds, 2, float64, uint64, f64[j], u64[i], i, 0ul)
+VSX_CVT_FP_TO_INT(xvcvdpuxws, 2, float64, uint32, f64[i], u32[j], \
+                  2*i + JOFFSET, 0)
+VSX_CVT_FP_TO_INT(xvcvspsxds, 2, float32, int64, f32[j], u64[i], \
+                  2*i + JOFFSET, 0x8000000000000000ul)
+VSX_CVT_FP_TO_INT(xvcvspsxws, 4, float32, int32, f32[j], u32[j], i, \
+                  0x80000000l)
+VSX_CVT_FP_TO_INT(xvcvspuxds, 2, float32, uint64, f32[j], u64[i], \
+                  2*i + JOFFSET, 0ul)
+VSX_CVT_FP_TO_INT(xvcvspuxws, 4, float32, uint32, f32[j], u32[i], i, 0)
+
+/* VSX_CVT_INT_TO_FP - VSX integer to floating point conversion
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   stp   - source type (int32, uint32, int64 or uint64)
+ *   ttp   - target type (float32 or float64)
+ *   sfld  - source vsr_t field
+ *   tfld  - target vsr_t field
+ *   jdef  - definition of the j index (i or 2*i + JOFFSET)
+ *   sfprf - set FPRF
+ */
+#define VSX_CVT_INT_TO_FP(op, nels, stp, ttp, sfld, tfld, jdef, sfprf)  \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+{                                                                       \
+    ppc_vsr_t xt, xb;                                                   \
+    int i;                                                              \
+                                                                        \
+    getVSR(xB(opcode), &xb, env);                                       \
+    getVSR(xT(opcode), &xt, env);                                       \
+                                                                        \
+    for (i = 0; i < nels; i++) {                                        \
+        int j = jdef;                                                   \
+        xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);             \
+        if (sfprf) {                                                    \
+            helper_compute_fprf(env, xt.tfld, sfprf);                   \
+        }                                                               \
+    }                                                                   \
+                                                                        \
+    putVSR(xT(opcode), &xt, env);                                       \
+    helper_float_check_status(env);                                     \
+}
+
+VSX_CVT_INT_TO_FP(xscvsxddp, 1, int64, float64, u64[j], f64[i], i, 1)
+VSX_CVT_INT_TO_FP(xscvuxddp, 1, uint64, float64, u64[j], f64[i], i, 1)
+VSX_CVT_INT_TO_FP(xvcvsxddp, 2, int64, float64, u64[j], f64[i], i, 0)
+VSX_CVT_INT_TO_FP(xvcvuxddp, 2, uint64, float64, u64[j], f64[i], i, 0)
+VSX_CVT_INT_TO_FP(xvcvsxwdp, 2, int32, float64, u32[j], f64[i], \
+                  2*i + JOFFSET, 0)
+VSX_CVT_INT_TO_FP(xvcvuxwdp, 2, uint64, float64, u32[j], f64[i], \
+                  2*i + JOFFSET, 0)
+VSX_CVT_INT_TO_FP(xvcvsxdsp, 2, int64, float32, u64[i], f32[j], \
+                  2*i + JOFFSET, 0)
+VSX_CVT_INT_TO_FP(xvcvuxdsp, 2, uint64, float32, u64[i], f32[j], \
+                  2*i + JOFFSET, 0)
+VSX_CVT_INT_TO_FP(xvcvsxwsp, 4, int32, float32, u32[j], f32[i], i, 0)
+VSX_CVT_INT_TO_FP(xvcvuxwsp, 4, uint32, float32, u32[j], f32[i], i, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 64b2b2b..20d7bb4 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -274,6 +274,12 @@ DEF_HELPER_2(xsmaxdp, void, env, i32)
  DEF_HELPER_2(xsmindp, void, env, i32)
  DEF_HELPER_2(xscvdpsp, void, env, i32)
  DEF_HELPER_2(xscvspdp, void, env, i32)
+DEF_HELPER_2(xscvdpsxds, void, env, i32)
+DEF_HELPER_2(xscvdpsxws, void, env, i32)
+DEF_HELPER_2(xscvdpuxds, void, env, i32)
+DEF_HELPER_2(xscvdpuxws, void, env, i32)
+DEF_HELPER_2(xscvsxddp, void, env, i32)
+DEF_HELPER_2(xscvuxddp, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
@@ -298,6 +304,14 @@ DEF_HELPER_2(xvcmpeqdp, void, env, i32)
  DEF_HELPER_2(xvcmpgedp, void, env, i32)
  DEF_HELPER_2(xvcmpgtdp, void, env, i32)
  DEF_HELPER_2(xvcvdpsp, void, env, i32)
+DEF_HELPER_2(xvcvdpsxds, void, env, i32)
+DEF_HELPER_2(xvcvdpsxws, void, env, i32)
+DEF_HELPER_2(xvcvdpuxds, void, env, i32)
+DEF_HELPER_2(xvcvdpuxws, void, env, i32)
+DEF_HELPER_2(xvcvsxddp, void, env, i32)
+DEF_HELPER_2(xvcvuxddp, void, env, i32)
+DEF_HELPER_2(xvcvsxwdp, void, env, i32)
+DEF_HELPER_2(xvcvuxwdp, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -322,6 +336,14 @@ DEF_HELPER_2(xvcmpeqsp, void, env, i32)
  DEF_HELPER_2(xvcmpgesp, void, env, i32)
  DEF_HELPER_2(xvcmpgtsp, void, env, i32)
  DEF_HELPER_2(xvcvspdp, void, env, i32)
+DEF_HELPER_2(xvcvspsxds, void, env, i32)
+DEF_HELPER_2(xvcvspsxws, void, env, i32)
+DEF_HELPER_2(xvcvspuxds, void, env, i32)
+DEF_HELPER_2(xvcvspuxws, void, env, i32)
+DEF_HELPER_2(xvcvsxdsp, void, env, i32)
+DEF_HELPER_2(xvcvuxdsp, void, env, i32)
+DEF_HELPER_2(xvcvsxwsp, void, env, i32)
+DEF_HELPER_2(xvcvuxwsp, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index ddbdeef..dbe3953 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7316,6 +7316,12 @@ GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xscvspdp, 0x12, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvdpsxds, 0x10, 0x15, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvdpsxws, 0x10, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvdpuxds, 0x10, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvdpuxws, 0x10, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvsxddp, 0x10, 0x17, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xscvuxddp, 0x10, 0x16, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -7340,6 +7346,14 @@ GEN_VSX_HELPER_2(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpgtdp, 0x0C, 0x0D, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpgedp, 0x0C, 0x0E, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcvdpsp, 0x12, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvdpsxds, 0x10, 0x1D, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvdpsxws, 0x10, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvdpuxds, 0x10, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvdpuxws, 0x10, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvsxddp, 0x10, 0x1F, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvuxddp, 0x10, 0x1E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvsxwdp, 0x10, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvuxwdp, 0x10, 0x0E, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7364,6 +7378,14 @@ GEN_VSX_HELPER_2(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpgtsp, 0x0C, 0x09, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcmpgesp, 0x0C, 0x0A, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcvspdp, 0x12, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvspsxds, 0x10, 0x19, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvspsxws, 0x10, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvspuxds, 0x10, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvspuxws, 0x10, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvsxdsp, 0x10, 0x1B, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvuxdsp, 0x10, 0x1A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvsxwsp, 0x10, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvcvuxwsp, 0x10, 0x0A, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10081,6 +10103,12 @@ GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
  GEN_XX3FORM(xsmindp, 0x00, 0x15, PPC2_VSX),
  GEN_XX2FORM(xscvdpsp, 0x12, 0x10, PPC2_VSX),
  GEN_XX2FORM(xscvspdp, 0x12, 0x14, PPC2_VSX),
+GEN_XX2FORM(xscvdpsxds, 0x10, 0x15, PPC2_VSX),
+GEN_XX2FORM(xscvdpsxws, 0x10, 0x05, PPC2_VSX),
+GEN_XX2FORM(xscvdpuxds, 0x10, 0x14, PPC2_VSX),
+GEN_XX2FORM(xscvdpuxws, 0x10, 0x04, PPC2_VSX),
+GEN_XX2FORM(xscvsxddp, 0x10, 0x17, PPC2_VSX),
+GEN_XX2FORM(xscvuxddp, 0x10, 0x16, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
@@ -10105,6 +10133,14 @@ GEN_XX3_RC_FORM(xvcmpeqdp, 0x0C, 0x0C, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpgtdp, 0x0C, 0x0D, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpgedp, 0x0C, 0x0E, PPC2_VSX),
  GEN_XX2FORM(xvcvdpsp, 0x12, 0x18, PPC2_VSX),
+GEN_XX2FORM(xvcvdpsxds, 0x10, 0x1D, PPC2_VSX),
+GEN_XX2FORM(xvcvdpsxws, 0x10, 0x0D, PPC2_VSX),
+GEN_XX2FORM(xvcvdpuxds, 0x10, 0x1C, PPC2_VSX),
+GEN_XX2FORM(xvcvdpuxws, 0x10, 0x0C, PPC2_VSX),
+GEN_XX2FORM(xvcvsxddp, 0x10, 0x1F, PPC2_VSX),
+GEN_XX2FORM(xvcvuxddp, 0x10, 0x1E, PPC2_VSX),
+GEN_XX2FORM(xvcvsxwdp, 0x10, 0x0F, PPC2_VSX),
+GEN_XX2FORM(xvcvuxwdp, 0x10, 0x0E, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10129,6 +10165,14 @@ GEN_XX3_RC_FORM(xvcmpeqsp, 0x0C, 0x08, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpgtsp, 0x0C, 0x09, PPC2_VSX),
  GEN_XX3_RC_FORM(xvcmpgesp, 0x0C, 0x0A, PPC2_VSX),
  GEN_XX2FORM(xvcvspdp, 0x12, 0x1C, PPC2_VSX),
+GEN_XX2FORM(xvcvspsxds, 0x10, 0x19, PPC2_VSX),
+GEN_XX2FORM(xvcvspsxws, 0x10, 0x09, PPC2_VSX),
+GEN_XX2FORM(xvcvspuxds, 0x10, 0x18, PPC2_VSX),
+GEN_XX2FORM(xvcvspuxws, 0x10, 0x08, PPC2_VSX),
+GEN_XX2FORM(xvcvsxdsp, 0x10, 0x1B, PPC2_VSX),
+GEN_XX2FORM(xvcvuxdsp, 0x10, 0x1A, PPC2_VSX),
+GEN_XX2FORM(xvcvsxwsp, 0x10, 0x0B, PPC2_VSX),
+GEN_XX2FORM(xvcvuxwsp, 0x10, 0x0A, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* [Qemu-devel] [PATCH 19/19] Add VSX Rounding Instructions
  2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
                   ` (17 preceding siblings ...)
  2013-10-24 16:27 ` [Qemu-devel] [PATCH 18/19] Add VSX ISA2.06 Integer " Tom Musta
@ 2013-10-24 16:28 ` Tom Musta
  2013-10-24 20:54   ` Richard Henderson
  18 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 16:28 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Tom Musta, qemu-ppc

This patch adds the VSX Round to Floating Point Integer instructions:

   - xsrdpi, xsrdpic, xsrdpim, xsrdpip, xsrdpiz
   - xvrdpi, xvrdpic, xvrdpim, xvrdpip, xvrdpiz
   - xvrspi, xvrspic, xvrspim, xvrspip, xvrspiz

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
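The control flow of the rounding helper in this patch (override the rounding mode, round, then restore the mode and suppress the inexact flag) can be sketched in standalone C. This is an illustration only: the sketch_* names, the SK_* enum and the small status struct are hypothetical stand-ins, not the softfloat API, and the rounding routine handles only finite inputs with |x| < 2^52 and does not preserve the sign of a zero result.

```c
#include <assert.h>

/* Hypothetical stand-ins for the softfloat rounding-mode enum. */
enum sketch_round {
    SK_ROUND_NEAREST_EVEN = 0,
    SK_ROUND_DOWN         = 1,
    SK_ROUND_UP           = 2,
    SK_ROUND_TO_ZERO      = 3,
    SK_ROUND_CURRENT      = 6   /* sentinel: sum of the four values above */
};

/* Hypothetical stand-in for the FP status word. */
struct sketch_status {
    int rounding_mode;   /* mode used by "current rounding mode" variants */
    int inexact;         /* inexact flag raised by the last operations */
};

/* Round x to an integral value using the mode stored in st. */
static double sketch_round_to_int(double x, struct sketch_status *st)
{
    double t, diff;

    /* Assumption of this sketch: finite input that fits in long long. */
    assert(x > -4503599627370496.0 && x < 4503599627370496.0);

    t = (double)(long long)x;   /* truncate toward zero */
    diff = x - t;               /* same sign as x, magnitude below 1 */
    if (diff == 0.0) {
        return x;               /* already integral, nothing is lost */
    }
    st->inexact = 1;
    switch (st->rounding_mode) {
    case SK_ROUND_DOWN:
        return diff < 0.0 ? t - 1.0 : t;
    case SK_ROUND_UP:
        return diff > 0.0 ? t + 1.0 : t;
    case SK_ROUND_TO_ZERO:
        return t;
    default: {                  /* nearest, ties to even */
        double mag = diff < 0.0 ? -diff : diff;
        double step = diff < 0.0 ? -1.0 : 1.0;
        int t_is_odd = ((long long)t % 2) != 0;
        if (mag > 0.5 || (mag == 0.5 && t_is_odd)) {
            return t + step;
        }
        return t;
    }
    }
}

/* Mirrors the shape of the helper: explicit-mode variants override the
 * mode, then restore it and clear the inexact flag; the "current mode"
 * variant leaves both the mode and the flag alone. */
static double sketch_vsx_round(double x, int rmode, struct sketch_status *st)
{
    int saved_mode = st->rounding_mode;
    double r;

    if (rmode != SK_ROUND_CURRENT) {
        st->rounding_mode = rmode;
    }
    r = sketch_round_to_int(x, st);
    if (rmode != SK_ROUND_CURRENT) {
        st->rounding_mode = saved_mode;
        st->inexact = 0;
    }
    return r;
}
```

With a status whose mode is SK_ROUND_UP, calling sketch_vsx_round(2.5, SK_ROUND_NEAREST_EVEN, &st) returns 2.0 and leaves the mode and the inexact flag untouched, while sketch_vsx_round(2.1, SK_ROUND_CURRENT, &st) returns 3.0 and raises inexact.
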
  target-ppc/fpu_helper.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++
  target-ppc/helper.h     |   15 ++++++++++
  target-ppc/translate.c  |   30 ++++++++++++++++++++
  3 files changed, 113 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index d00bcb8..697b949 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2697,3 +2697,71 @@ VSX_CVT_INT_TO_FP(xvcvuxdsp, 2, uint64, float32, u64[i], f32[j], \
                    2*i + JOFFSET, 0)
  VSX_CVT_INT_TO_FP(xvcvsxwsp, 4, int32, float32, u32[j], f32[i], i, 0)
  VSX_CVT_INT_TO_FP(xvcvuxwsp, 4, uint32, float32, u32[j], f32[i], i, 0)
+
+/* For "use current rounding mode", define a value that will not be one of
+ * the existing rounding mode enums.
+ */
+#define FLOAT_ROUND_CURRENT (float_round_nearest_even + float_round_down + \
+  float_round_up + float_round_to_zero)
+
+/* VSX_ROUND - VSX floating point round
+ *   op    - instruction mnemonic
+ *   nels  - number of elements (1, 2 or 4)
+ *   tp    - type (float32 or float64)
+ *   fld   - vsr_t field (f32 or f64)
+ *   rmode - rounding mode
+ *   sfprf - set FPRF
+ */
+#define VSX_ROUND(op, nels, tp, fld, rmode, sfprf)                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                    \
+{                                                                      \
+    ppc_vsr_t xt, xb;                                                  \
+    int i;                                                             \
+    getVSR(xB(opcode), &xb, env);                                      \
+    getVSR(xT(opcode), &xt, env);                                      \
+                                                                       \
+    if (rmode != FLOAT_ROUND_CURRENT) {                                \
+        set_float_rounding_mode(rmode, &env->fp_status);               \
+    }                                                                  \
+                                                                       \
+    for (i = 0; i < nels; i++) {                                       \
+        if (unlikely(tp##_is_signaling_nan(xb.fld[i]))) {              \
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 0);     \
+            xt.fld[i] = tp##_snan_to_qnan(xb.fld[i]);                  \
+        } else {                                                       \
+            xt.fld[i] = tp##_round_to_int(xb.fld[i], &env->fp_status); \
+        }                                                              \
+        if (sfprf) {                                                   \
+            helper_compute_fprf(env, xt.fld[i], sfprf);                \
+        }                                                              \
+    }                                                                  \
+                                                                       \
+    /* If this is not a "use current rounding mode" instruction,       \
+     * then inhibit setting of the XX bit and restore rounding         \
+     * mode from FPSCR */                                              \
+    if (rmode != FLOAT_ROUND_CURRENT) {                                \
+        fpscr_set_rounding_mode(env);                                  \
+        env->fp_status.float_exception_flags &= ~float_flag_inexact;   \
+    }                                                                  \
+                                                                       \
+    putVSR(xT(opcode), &xt, env);                                      \
+    helper_float_check_status(env);                                    \
+}
+
+VSX_ROUND(xsrdpi, 1, float64, f64, float_round_nearest_even, 1)
+VSX_ROUND(xsrdpic, 1, float64, f64, FLOAT_ROUND_CURRENT, 1)
+VSX_ROUND(xsrdpim, 1, float64, f64, float_round_down, 1)
+VSX_ROUND(xsrdpip, 1, float64, f64, float_round_up, 1)
+VSX_ROUND(xsrdpiz, 1, float64, f64, float_round_to_zero, 1)
+
+VSX_ROUND(xvrdpi, 2, float64, f64, float_round_nearest_even, 0)
+VSX_ROUND(xvrdpic, 2, float64, f64, FLOAT_ROUND_CURRENT, 0)
+VSX_ROUND(xvrdpim, 2, float64, f64, float_round_down, 0)
+VSX_ROUND(xvrdpip, 2, float64, f64, float_round_up, 0)
+VSX_ROUND(xvrdpiz, 2, float64, f64, float_round_to_zero, 0)
+
+VSX_ROUND(xvrspi, 4, float32, f32, float_round_nearest_even, 0)
+VSX_ROUND(xvrspic, 4, float32, f32, FLOAT_ROUND_CURRENT, 0)
+VSX_ROUND(xvrspim, 4, float32, f32, float_round_down, 0)
+VSX_ROUND(xvrspip, 4, float32, f32, float_round_up, 0)
+VSX_ROUND(xvrspiz, 4, float32, f32, float_round_to_zero, 0)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 20d7bb4..ea5d9bc 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -280,6 +280,11 @@ DEF_HELPER_2(xscvdpuxds, void, env, i32)
  DEF_HELPER_2(xscvdpuxws, void, env, i32)
  DEF_HELPER_2(xscvsxddp, void, env, i32)
  DEF_HELPER_2(xscvuxddp, void, env, i32)
+DEF_HELPER_2(xsrdpi, void, env, i32)
+DEF_HELPER_2(xsrdpic, void, env, i32)
+DEF_HELPER_2(xsrdpim, void, env, i32)
+DEF_HELPER_2(xsrdpip, void, env, i32)
+DEF_HELPER_2(xsrdpiz, void, env, i32)

  DEF_HELPER_2(xvadddp, void, env, i32)
  DEF_HELPER_2(xvsubdp, void, env, i32)
@@ -312,6 +317,11 @@ DEF_HELPER_2(xvcvsxddp, void, env, i32)
  DEF_HELPER_2(xvcvuxddp, void, env, i32)
  DEF_HELPER_2(xvcvsxwdp, void, env, i32)
  DEF_HELPER_2(xvcvuxwdp, void, env, i32)
+DEF_HELPER_2(xvrdpi, void, env, i32)
+DEF_HELPER_2(xvrdpic, void, env, i32)
+DEF_HELPER_2(xvrdpim, void, env, i32)
+DEF_HELPER_2(xvrdpip, void, env, i32)
+DEF_HELPER_2(xvrdpiz, void, env, i32)

  DEF_HELPER_2(xvaddsp, void, env, i32)
  DEF_HELPER_2(xvsubsp, void, env, i32)
@@ -344,6 +354,11 @@ DEF_HELPER_2(xvcvsxdsp, void, env, i32)
  DEF_HELPER_2(xvcvuxdsp, void, env, i32)
  DEF_HELPER_2(xvcvsxwsp, void, env, i32)
  DEF_HELPER_2(xvcvuxwsp, void, env, i32)
+DEF_HELPER_2(xvrspi, void, env, i32)
+DEF_HELPER_2(xvrspic, void, env, i32)
+DEF_HELPER_2(xvrspim, void, env, i32)
+DEF_HELPER_2(xvrspip, void, env, i32)
+DEF_HELPER_2(xvrspiz, void, env, i32)

  DEF_HELPER_2(efscfsi, i32, env, i32)
  DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index dbe3953..58a9242 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7322,6 +7322,11 @@ GEN_VSX_HELPER_2(xscvdpuxds, 0x10, 0x14, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xscvdpuxws, 0x10, 0x04, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xscvsxddp, 0x10, 0x17, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xscvuxddp, 0x10, 0x16, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsrdpi, 0x12, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsrdpic, 0x16, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsrdpim, 0x12, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsrdpip, 0x12, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -7354,6 +7359,11 @@ GEN_VSX_HELPER_2(xvcvsxddp, 0x10, 0x1F, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcvuxddp, 0x10, 0x1E, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcvsxwdp, 0x10, 0x0F, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcvuxwdp, 0x10, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrdpi, 0x12, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrdpic, 0x16, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrdpim, 0x12, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrdpip, 0x12, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrdpiz, 0x12, 0x0D, 0, PPC2_VSX)

  GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
@@ -7386,6 +7396,11 @@ GEN_VSX_HELPER_2(xvcvsxdsp, 0x10, 0x1B, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcvuxdsp, 0x10, 0x1A, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcvsxwsp, 0x10, 0x0B, 0, PPC2_VSX)
  GEN_VSX_HELPER_2(xvcvuxwsp, 0x10, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrspi, 0x12, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrspic, 0x16, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrspim, 0x12, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)

  #define VSX_LOGICAL(name, tcg_op)                                    \
  static void glue(gen_, name)(DisasContext * ctx)                     \
@@ -10109,6 +10124,11 @@ GEN_XX2FORM(xscvdpuxds, 0x10, 0x14, PPC2_VSX),
  GEN_XX2FORM(xscvdpuxws, 0x10, 0x04, PPC2_VSX),
  GEN_XX2FORM(xscvsxddp, 0x10, 0x17, PPC2_VSX),
  GEN_XX2FORM(xscvuxddp, 0x10, 0x16, PPC2_VSX),
+GEN_XX2FORM(xsrdpi, 0x12, 0x04, PPC2_VSX),
+GEN_XX2FORM(xsrdpic, 0x16, 0x06, PPC2_VSX),
+GEN_XX2FORM(xsrdpim, 0x12, 0x07, PPC2_VSX),
+GEN_XX2FORM(xsrdpip, 0x12, 0x06, PPC2_VSX),
+GEN_XX2FORM(xsrdpiz, 0x12, 0x05, PPC2_VSX),

  GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
  GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
@@ -10141,6 +10161,11 @@ GEN_XX2FORM(xvcvsxddp, 0x10, 0x1F, PPC2_VSX),
  GEN_XX2FORM(xvcvuxddp, 0x10, 0x1E, PPC2_VSX),
  GEN_XX2FORM(xvcvsxwdp, 0x10, 0x0F, PPC2_VSX),
  GEN_XX2FORM(xvcvuxwdp, 0x10, 0x0E, PPC2_VSX),
+GEN_XX2FORM(xvrdpi, 0x12, 0x0C, PPC2_VSX),
+GEN_XX2FORM(xvrdpic, 0x16, 0x0E, PPC2_VSX),
+GEN_XX2FORM(xvrdpim, 0x12, 0x0F, PPC2_VSX),
+GEN_XX2FORM(xvrdpip, 0x12, 0x0E, PPC2_VSX),
+GEN_XX2FORM(xvrdpiz, 0x12, 0x0D, PPC2_VSX),

  GEN_XX3FORM(xvaddsp, 0x00, 0x08, PPC2_VSX),
  GEN_XX3FORM(xvsubsp, 0x00, 0x09, PPC2_VSX),
@@ -10173,6 +10198,11 @@ GEN_XX2FORM(xvcvsxdsp, 0x10, 0x1B, PPC2_VSX),
  GEN_XX2FORM(xvcvuxdsp, 0x10, 0x1A, PPC2_VSX),
  GEN_XX2FORM(xvcvsxwsp, 0x10, 0x0B, PPC2_VSX),
  GEN_XX2FORM(xvcvuxwsp, 0x10, 0x0A, PPC2_VSX),
+GEN_XX2FORM(xvrspi, 0x12, 0x08, PPC2_VSX),
+GEN_XX2FORM(xvrspic, 0x16, 0x0A, PPC2_VSX),
+GEN_XX2FORM(xvrspim, 0x12, 0x0B, PPC2_VSX),
+GEN_XX2FORM(xvrspip, 0x12, 0x0A, PPC2_VSX),
+GEN_XX2FORM(xvrspiz, 0x12, 0x09, PPC2_VSX),

  #undef VSX_LOGICAL
  #define VSX_LOGICAL(name, opc2, opc3, fl2) \
-- 
1.7.1

* Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-24 16:17 ` [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX Tom Musta
@ 2013-10-24 18:34   ` Richard Henderson
  2013-10-25 11:34   ` Alex Bennée
  2013-10-25 11:55   ` Peter Maydell
  2 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 18:34 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:17 AM, Tom Musta wrote:
> This patch adds routines to the softfloat library that are useful for
> the PowerPC VSX implementation.  The routines are, however, not specific
> to PowerPC and are appropriate for softfloat.
> 
> The following routines are added:
> 
>   - float32_is_denormal() returns true if the 32-bit floating point number
>     is denormalized.
>   - float64_is_denormal() returns true if the 64-bit floating point number
>     is denormalized.
>   - float32_get_unbiased_exp() returns the unbiased exponent of a 32-bit
>     floating point number.
>   - float64_get_unbiased_exp() returns the unbiased exponent of a 64-bit
>     floating point number.
>   - float32_to_uint64() converts a 32-bit floating point number to an
>     unsigned 64 bit number.

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~
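
For reference, what the denormal and unbiased-exponent queries compute on raw IEEE 754 bit patterns can be sketched as follows. This is a minimal sketch, not the routines from the patch: the sketch_* names are hypothetical, and it assumes the standard binary32 and binary64 layouts (8-bit exponent with bias 127, 11-bit exponent with bias 1023).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern (hypothetical helper). */
static uint32_t sketch_f32_bits(float f)
{
    uint32_t u;
    assert(sizeof(f) == sizeof(u));   /* assumes a 32-bit float type */
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Denormal: biased exponent field is zero and the fraction is non-zero. */
static int sketch_f32_is_denormal(uint32_t bits)
{
    return ((bits >> 23) & 0xFF) == 0 && (bits & 0x007FFFFF) != 0;
}

static int sketch_f64_is_denormal(uint64_t bits)
{
    return ((bits >> 52) & 0x7FF) == 0 &&
           (bits & 0x000FFFFFFFFFFFFFULL) != 0;
}

/* Unbiased exponent: biased exponent field minus the format's bias. */
static int sketch_f32_unbiased_exp(uint32_t bits)
{
    return (int)((bits >> 23) & 0xFF) - 127;
}

static int sketch_f64_unbiased_exp(uint64_t bits)
{
    return (int)((bits >> 52) & 0x7FF) - 1023;
}
```

For example, sketch_f32_unbiased_exp(sketch_f32_bits(8.0f)) evaluates to 3, and the smallest positive binary32 pattern 0x00000001 is reported as denormal while zero is not.
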

* Re: [Qemu-devel] [PATCH 03/19] General Support for VSX Helpers
  2013-10-24 16:19 ` [Qemu-devel] [PATCH 03/19] General Support for VSX Helpers Tom Musta
@ 2013-10-24 18:51   ` Richard Henderson
  2013-10-24 20:42     ` Tom Musta
  0 siblings, 1 reply; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 18:51 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:19 AM, Tom Musta wrote:
> 
> +#define GEN_VSX_HELPER_2(name, op1, op2, inval, type)                         \
> +static void gen_##name(DisasContext * ctx)                                    \
> +{                                                                             \
> +    TCGv_i32 opc;                                                             \
> +    if (unlikely(!ctx->vsx_enabled)) {                                        \
> +        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
> +        return;                                                               \
> +    }                                                                         \
> +    /* NIP cannot be restored if the memory exception comes from an helper */ \
> +    gen_update_nip(ctx, ctx->nip - 4);                                        \
> +    opc = tcg_const_i32(ctx->opcode);                                         \
> +    gen_helper_##name(cpu_env, opc);                                          \
> +    tcg_temp_free_i32(opc);                                                   \
> +}

I'm not a fan of delaying decode to the helpers...

You're mostly doing this to avoid passing 3-4 arguments
for the register numbers?


r~

* Re: [Qemu-devel] [PATCH 04/19] Add VSX ISA2.06 xadd Instructions
  2013-10-24 16:20 ` [Qemu-devel] [PATCH 04/19] Add VSX ISA2.06 xadd Instructions Tom Musta
@ 2013-10-24 19:44   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 19:44 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:20 AM, Tom Musta wrote:
> This patch adds the VSX floating point add instructions that are
> defined by V2.06 of the PowerPC ISA:  xsadddp, xvadddp and xvaddsp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 05/19] Add VSX ISA2.06 xsub Instructions
  2013-10-24 16:20 ` [Qemu-devel] [PATCH 05/19] Add VSX ISA2.06 xsub Instructions Tom Musta
@ 2013-10-24 19:48   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 19:48 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:20 AM, Tom Musta wrote:
> This patch adds the floating point subtraction instructions defined
> by V2.06 of the PowerPC ISA: xssubdp, xvsubdp and xvsubsp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 06/19] Add VSX ISA2.06 xmul Instructions
  2013-10-24 16:21 ` [Qemu-devel] [PATCH 06/19] Add VSX ISA2.06 xmul Instructions Tom Musta
@ 2013-10-24 20:07   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:07 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:21 AM, Tom Musta wrote:
> This patch adds the VSX floating point multiply instructions defined
> by V2.06 of the PowerPC ISA: xsmuldp, xvmuldp, xvmulsp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>



r~

* Re: [Qemu-devel] [PATCH 07/19] Add VSX ISA2.06 xdiv Instructions
  2013-10-24 16:21 ` [Qemu-devel] [PATCH 07/19] Add VSX ISA2.06 xdiv Instructions Tom Musta
@ 2013-10-24 20:08   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:08 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:21 AM, Tom Musta wrote:
> This patch adds the VSX floating point divide instructions defined
> by V2.06 of the PowerPC ISA: xsdivdp, xvdivdp, xvdivsp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 08/19] Add VSX ISA2.06 xre Instructions
  2013-10-24 16:22 ` [Qemu-devel] [PATCH 08/19] Add VSX ISA2.06 xre Instructions Tom Musta
@ 2013-10-24 20:11   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:11 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:22 AM, Tom Musta wrote:
> This patch adds the VSX floating point reciprocal estimate instructions
> defined by V2.06 of the PowerPC ISA: xsredp, xvredp, xvresp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 09/19] Add VSX ISA2.06 xsqrt Instructions
  2013-10-24 16:22 ` [Qemu-devel] [PATCH 09/19] Add VSX ISA2.06 xsqrt Instructions Tom Musta
@ 2013-10-24 20:23   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:23 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:22 AM, Tom Musta wrote:
> This patch adds the VSX floating point square root instructions
> defined by V2.06 of the PowerPC ISA: xssqrtdp, xvsqrtdp, xvsqrtsp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 10/19] Add VSX ISA2.06 xrsqrte Instructions
  2013-10-24 16:23 ` [Qemu-devel] [PATCH 10/19] Add VSX ISA2.06 xrsqrte Instructions Tom Musta
@ 2013-10-24 20:25   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:25 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:23 AM, Tom Musta wrote:
> This patch adds the VSX floating point reciprocal square root
> estimate instructions defined by V2.06 of the PowerPC ISA: xsrsqrtedp,
> xvrsqrtedp, xvrsqrtesp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 11/19] Add VSX ISA2.06 xtdiv Instructions
  2013-10-24 16:23 ` [Qemu-devel] [PATCH 11/19] Add VSX ISA2.06 xtdiv Instructions Tom Musta
@ 2013-10-24 20:30   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:30 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:23 AM, Tom Musta wrote:
> This patch adds the VSX floating point test for software divide
> instructions defined by V2.06 of the PowerPC ISA: xstdivdp, xvtdivdp,
> and xvtdivsp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 12/19] Add VSX ISA2.06 xtsqrt Instructions
  2013-10-24 16:24 ` [Qemu-devel] [PATCH 12/19] Add VSX ISA2.06 xtsqrt Instructions Tom Musta
@ 2013-10-24 20:34   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:34 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:24 AM, Tom Musta wrote:
> This patch adds the VSX floating point test for software square
> root instructions defined by V2.06 of the PowerPC ISA: xstsqrtdp,
> xvtsqrtdp, xvtsqrtsp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-24 16:25 ` [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions Tom Musta
@ 2013-10-24 20:38   ` Richard Henderson
  2013-10-25 13:49     ` Tom Musta
  2013-10-25 16:25     ` Tom Musta
  0 siblings, 2 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:38 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:25 AM, Tom Musta wrote:
>                                                \
> +            ft0 = tp##_to_##btp(xa.fld[i], &env->fp_status);                  \
> +            ft1 = tp##_to_##btp(m->fld[i], &env->fp_status);                  \
> +            ft0 = btp##_mul(ft0, ft1, &env->fp_status);                       \
> +            if (unlikely(btp##_is_infinity(ft0) &&                            \
> +                         tp##_is_infinity(s->fld[i]) &&                       \
> +                         btp##_is_neg(ft0) cmp tp##_is_neg(s->fld[i]))) {     \
> +                xt.fld[i] = float64_to_##tp(                                  \
> +                              fload_invalid_op_excp(env,                      \
> +                                                     POWERPC_EXCP_FP_VXISI,   \
> +                                                     sfprf),                  \
> +                              &env->fp_status);                               \
> +            } else {                                                          \
> +                ft1 = tp##_to_##btp(s->fld[i], &env->fp_status);              \
> +                ft0 = btp##_##sum(ft0, ft1, &env->fp_status);                 \
> +                xt.fld[i] = btp##_to_##tp(ft0, &env->fp_status);              \
> +            }                                                                 \
> +            if (neg && likely(!tp##_is_any_nan(xt.fld[i]))) {                 \
> +                xt.fld[i] = tp##_chs(xt.fld[i]);                              \
> +            }                  

You want to be using tp##muladd instead of widening to 128 bits.

> +        s = &xt;                                                              \
> +    }                                                                         \
> +    else {                                                                    \
> +        m = &xt;                                                              \ 

Also be careful of the codingstyle.


r~

* Re: [Qemu-devel] [PATCH 14/19] Add VSX xscmp*dp Instructions
  2013-10-24 16:25 ` [Qemu-devel] [PATCH 14/19] Add VSX xscmp*dp Instructions Tom Musta
@ 2013-10-24 20:39   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:39 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:25 AM, Tom Musta wrote:
> This patch adds the VSX scalar floating point compare ordered
> and unordered instructions.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 03/19] General Support for VSX Helpers
  2013-10-24 18:51   ` Richard Henderson
@ 2013-10-24 20:42     ` Tom Musta
  2013-10-24 21:00       ` Richard Henderson
  0 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 20:42 UTC (permalink / raw)
  To: Richard Henderson, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 1:51 PM, Richard Henderson wrote:
> On 10/24/2013 09:19 AM, Tom Musta wrote:
>>
>> +#define GEN_VSX_HELPER_2(name, op1, op2, inval, type)                         \
>> +static void gen_##name(DisasContext * ctx)                                    \
>> +{                                                                             \
>> +    TCGv_i32 opc;                                                             \
>> +    if (unlikely(!ctx->vsx_enabled)) {                                        \
>> +        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
>> +        return;                                                               \
>> +    }                                                                         \
>> +    /* NIP cannot be restored if the memory exception comes from an helper */ \
>> +    gen_update_nip(ctx, ctx->nip - 4);                                        \
>> +    opc = tcg_const_i32(ctx->opcode);                                         \
>> +    gen_helper_##name(cpu_env, opc);                                          \
>> +    tcg_temp_free_i32(opc);                                                   \
>> +}
>
> I'm not a fan of delaying decode to the helpers...
>
> You're mostly doing this to avoid passing 3-4 arguments
> for the register numbers?

Because the VSRs are 128 bits wide and because there is an interesting relationship
with the FPRs and AVRs, passing 6 or more arguments would typically be required
(2 per VSR).  And, they would need to be stitched back together into a single
structure in order to use the loops in the vector routines.  I did prototype something
like this and didn't like it.

Unless you are suggesting that the decoded VSR index (0..63) be passed to the helper?
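
The alternative raised in that last question, decoding the 6-bit VSR indices at translation time and passing them instead of the raw opcode, can be sketched as below. This is a sketch under assumptions: the sketch_* names are hypothetical, and the field positions assume the XX3-form layout in which the T, A and B fields sit at bits 21, 16 and 11 and the TX, AX and BX extension bits are the low three bits of the instruction word.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed XX3-form layout: each 5-bit register field is extended to a
 * 6-bit VSR index (0..63) by one extra bit from the low end of the word. */
static unsigned sketch_xT(uint32_t opcode)
{
    return ((opcode >> 21) & 0x1f) | ((opcode & 1) << 5);         /* TX */
}

static unsigned sketch_xA(uint32_t opcode)
{
    return ((opcode >> 16) & 0x1f) | (((opcode >> 2) & 1) << 5);  /* AX */
}

static unsigned sketch_xB(uint32_t opcode)
{
    return ((opcode >> 11) & 0x1f) | (((opcode >> 1) & 1) << 5);  /* BX */
}

/* Hypothetical bundle of decoded indices that a translator could hand
 * to a helper in place of the undecoded instruction word. */
struct sketch_vsx_regs {
    unsigned xt, xa, xb;
};

static struct sketch_vsx_regs sketch_decode(uint32_t opcode)
{
    struct sketch_vsx_regs r;

    r.xt = sketch_xT(opcode);
    r.xa = sketch_xA(opcode);
    r.xb = sketch_xB(opcode);
    assert(r.xt < 64 && r.xa < 64 && r.xb < 64);
    return r;
}
```

Passing three small indices keeps the helper signature short while still letting the helper fetch each 128-bit register as a single structure, which is the property the vector loops rely on.
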

* Re: [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions
  2013-10-24 16:26 ` [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions Tom Musta
@ 2013-10-24 20:45   ` Richard Henderson
  2013-10-24 21:07     ` Tom Musta
  2013-10-24 22:10   ` Peter Maydell
  1 sibling, 1 reply; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:45 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:26 AM, Tom Musta wrote:
> Because of the Power ISA definitions of maximum and minimum
> on various boundary cases, the standard softfloat comparison
> routines (e.g. float64_lt) do not work as well as one might
> think.  Therefore specific routines for comparing 64 and 32
> bit floating point numbers are implemented in the PowerPC
> helper code.

Really?  All I see in the document is ">fp", used both here
in the minmax insn and in the cmp insn.

If the softfloat compare isn't good enough for minmax, how
can it be good enough for cmp?


r~

* Re: [Qemu-devel] [PATCH 17/19] Add VSX Floating Point to Floating Point Conversion Instructions
  2013-10-24 16:27 ` [Qemu-devel] [PATCH 17/19] Add VSX Floating Point to Floating Point Conversion Instructions Tom Musta
@ 2013-10-24 20:49   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:49 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:27 AM, Tom Musta wrote:
> This patch adds the VSX instructions that convert between floating
> point formats: xscvdpsp, xscvspdp, xvcvdpsp, xvcvspdp.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 18/19] Add VSX ISA2.06 Integer Conversion Instructions
  2013-10-24 16:27 ` [Qemu-devel] [PATCH 18/19] Add VSX ISA2.06 Integer " Tom Musta
@ 2013-10-24 20:51   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:51 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:27 AM, Tom Musta wrote:
> This patch adds the VSX Integer Conversion instructions defined by
> V2.06 of the PowerPC ISA:
> 
>   - xscvdpsxds, xscvdpsxws, xscvdpuxds, xscvdpuxws
>   - xvcvdpsxds, xvcvdpsxws, xvcvdpuxds, xvcvdpuxws

>   - xvcvspsxds, xvcvspsxws, xvcvspuxds, xvcvspuxws
>   - xscvsxddp, xscvuxddp
>   - xvcvsxddp, xscvsxwdp, xvcvuxddp, xvcvuxwdp
>   - xvcvsxdsp, xscvsxwsp, xvcvuxdsp, xvcvuxwsp
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

With a codingstyle fix,

Reviewed-by: Richard Henderson <rth@twiddle.net>

> +            xt.tfld = rnan;                                                  \
> +        }                                                                    \
> +        else {                                                               \
> +            xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);              \

Here.



r~

* Re: [Qemu-devel] [PATCH 19/19] Add VSX Rounding Instructions
  2013-10-24 16:28 ` [Qemu-devel] [PATCH 19/19] Add VSX Rounding Instructions Tom Musta
@ 2013-10-24 20:54   ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 20:54 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 09:28 AM, Tom Musta wrote:
> This patch adds the VSX Round to Floating Point Integer instructions:
> 
>   - xsrdpi, xsrdpic, xsrdpim, xsrdpip, xsrdpiz
>   - xvrdpi, xvrdpic, xvrdpim, xvrdpip, xvrdpiz
>   - xvrspi, xvrspic, xvrspim, xvrspip, xvrspiz
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 03/19] General Support for VSX Helpers
  2013-10-24 20:42     ` Tom Musta
@ 2013-10-24 21:00       ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 21:00 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 01:42 PM, Tom Musta wrote:
> 
> Unless you are suggesting that the decoded VSR index (0..63) be passed to the
> helper?

It was a thought.  I'll leave the ultimate decision to ppc maintainers.

The insns setting cr6 and crf[BF] get more complicated, but in general those
could be handled by returning the cc value that should be stored, and have tcg
code actually assign the field.


r~

* Re: [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions
  2013-10-24 20:45   ` Richard Henderson
@ 2013-10-24 21:07     ` Tom Musta
  2013-10-24 21:18       ` Richard Henderson
  0 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-24 21:07 UTC (permalink / raw)
  To: Richard Henderson, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 3:45 PM, Richard Henderson wrote:
> On 10/24/2013 09:26 AM, Tom Musta wrote:
>> Because of the Power ISA definitions of maximum and minimum
>> on various boundary cases, the standard softfloat comparison
>> routines (e.g. float64_lt) do not work as well as one might
>> think.  Therefore specific routines for comparing 64 and 32
>> bit floating point numbers are implemented in the PowerPC
>> helper code.
>
> Really?  All I see in the document is ">fp", used both here
> in the minmax insn and in the cmp insn.
>
> If the softfloat compare isn't good enough for minmax, how
> can it be good enough for cmp?

Example:

The ISA is very explicit that max(-0.0, +0.0) = +0.0.

But the comparison operations (and instructions) both consider
-0.0 == +0.0.  Because of this, I do not see how it is possible
to implement max using float*_eq, float*_lt and float*_le.

See, for example, table 58 (Actions for xsmaxdp) on p. 369 of the
V2.06 ISA.
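
To make the signed-zero point concrete, here is a minimal standalone sketch in
plain C on host doubles. It is not softfloat and not the code from the patch;
the function names are made up for illustration, and NaN handling is left out
on purpose.

```c
#include <stdint.h>
#include <string.h>

/* Sign bit of an IEEE double, read from the raw encoding. */
int dbl_is_neg(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof(bits));
    return (int)(bits >> 63);
}

/*
 * max built from "<" alone.  -0.0 and +0.0 compare equal, so for a pair
 * of zeros the result is simply whichever operand came first.
 */
double max_by_compare(double a, double b)
{
    return (a < b) ? b : a;
}

/*
 * Same comparison, but a pair of zeros is resolved by sign, so that
 * max(-0.0, +0.0) is +0.0 regardless of operand order.
 */
double max_signed_zero(double a, double b)
{
    if (a == 0.0 && b == 0.0) {
        return dbl_is_neg(a) ? b : a;
    }
    return (a < b) ? b : a;
}
```

With these, max_by_compare(-0.0, +0.0) hands back -0.0 while
max_signed_zero(-0.0, +0.0) hands back +0.0; the extra sign test is exactly the
part that the comparison routines cannot supply.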

* Re: [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions
  2013-10-24 21:07     ` Tom Musta
@ 2013-10-24 21:18       ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-24 21:18 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 02:07 PM, Tom Musta wrote:
> 
> See, for example, table 58 (Actions for xsmaxdp) on p. 369 of the
> V2.06 ISA.

Bah, I typoed my search in the document and looked at the Altivec insn, which
is only one letter different, and doesn't have the same guarantees.

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

* Re: [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions
  2013-10-24 16:26 ` [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions Tom Musta
  2013-10-24 20:45   ` Richard Henderson
@ 2013-10-24 22:10   ` Peter Maydell
  2013-10-25 13:52     ` Tom Musta
  1 sibling, 1 reply; 58+ messages in thread
From: Peter Maydell @ 2013-10-24 22:10 UTC (permalink / raw)
  To: Tom Musta; +Cc: qemu-ppc, QEMU Developers, Richard Henderson

On 24 October 2013 17:26, Tom Musta <tommusta@gmail.com> wrote:
> This patch adds the VSX floating point maximum and minimum
> instructions:
>
>   - xsmaxdp, xvmaxdp, xvmaxsp
>   - xsmindp, xvmindp, xvminsp
>
> Because of the Power ISA definitions of maximum and minimum
> on various boundary cases, the standard softfloat comparison
> routines (e.g. float64_lt) do not work as well as one might
> think.  Therefore specific routines for comparing 64 and 32
> bit floating point numbers are implemented in the PowerPC
> helper code.

Can't you use the min and max softfloat functions? Those are
there specifically because the corner cases mean you can't
implement them using the comparisons. (For instance for
the example you quote of max(-0.0, +0.0) they return +0.0
as you require.)

thanks
-- PMM

* Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-24 16:17 ` [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX Tom Musta
  2013-10-24 18:34   ` Richard Henderson
@ 2013-10-25 11:34   ` Alex Bennée
  2013-10-25 11:44     ` Peter Maydell
  2013-10-25 11:55   ` Peter Maydell
  2 siblings, 1 reply; 58+ messages in thread
From: Alex Bennée @ 2013-10-25 11:34 UTC (permalink / raw)
  To: Tom Musta; +Cc: qemu-ppc, QEMU Developers


tommusta@gmail.com writes:

> This patch adds routines to the softfloat library that are useful for
> the PowerPC VSX implementation.  The routines are, however, not specific
> to PowerPC and are appropriate for softfloat.
<snip>

Is it worth adding some sort of test into make check to defend these
softfloat functions against unintentional breakage? It would certainly
be worthwhile as soon as multiple arches use these functions as float
errors are often subtle and hard to track down.

-- 
Alex Bennée

* Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-25 11:34   ` Alex Bennée
@ 2013-10-25 11:44     ` Peter Maydell
  2013-10-25 13:09       ` Alex Bennée
  2013-10-25 13:24       ` Tom Musta
  0 siblings, 2 replies; 58+ messages in thread
From: Peter Maydell @ 2013-10-25 11:44 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Tom Musta, qemu-ppc, QEMU Developers

On 25 October 2013 12:34, Alex Bennée <alex.bennee@linaro.org> wrote:
> Is it worth adding some sort of test into make check to defend these
> softfloat functions against unintentional breakage? It would certainly
> be worthwhile as soon as multiple arches use these functions as float
> errors are often subtle and hard to track down.

Ideally, but there's zero infrastructure for doing the kind
of serious including-edge-cases testing at the moment, so I'm
not really in favour of making it a gating condition for
accepting patches.

If somebody wanted to set up such infrastructure, there are
a couple of approaches that spring to mind:
 (a) get risu (https://wiki.linaro.org/PeterMaydell/Risu) working
  on more target architectures, add the "record-and-replay" feature
  so it can be run without having target hardware, and then just
  test softfloat by testing the actual target fp instructions
 (b) something involving wiring up IBM's IEEE test suite
  vectors directly to our softfloat code:
 https://www.research.ibm.com/cgi-bin/haifa/test_suite_download.pl?first=elenag&second=webmaster
  (it's not clear to me what license the test vectors are
  under)

-- PMM

* Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-24 16:17 ` [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX Tom Musta
  2013-10-24 18:34   ` Richard Henderson
  2013-10-25 11:34   ` Alex Bennée
@ 2013-10-25 11:55   ` Peter Maydell
  2013-10-25 13:01     ` Tom Musta
  2 siblings, 1 reply; 58+ messages in thread
From: Peter Maydell @ 2013-10-25 11:55 UTC (permalink / raw)
  To: Tom Musta; +Cc: qemu-ppc, QEMU Developers

On 24 October 2013 17:17, Tom Musta <tommusta@gmail.com> wrote:
> This patch adds routines to the softfloat library that are useful for
> the PowerPC VSX implementation.  The routines are, however, not specific
> to PowerPC and are appropriate for softfloat.
>
> The following routines are added:
>
>   - float32_is_denormal() returns true if the 32-bit floating point number
>     is denormalized.
>   - float64_is_denormal() returns true if the 64-bit floating point number
>     is denormalized.

Can you point me at the patches which use these, please?
I couldn't find them with a quick search in my email client.

>   - float32_get_unbiased_exp() returns the unbiased exponent of a 32-bit
>     floating point number.
>   - float64_get_unbiased_exp() returns the unbiased exponent of a 64-bit
>     floating point number.

These look rather odd to me, and again I can't find the uses in
your patchset. Returning just the exponent is a bit odd and
suggests that maybe the split between target code and softfloat
is in the wrong place.

>   - float32_to_uint64() converts a 32-bit floating point number to an
>     unsigned 64 bit number.

I would put this in its own patch, personally.

>
> +INLINE int float32_is_denormal(float32 a)
> +{
> +    return ((float32_val(a) & 0x7f800000) == 0) &&
> +           ((float32_val(a) & 0x007fffff) != 0);
> +}

return float32_is_zero_or_denormal(a) && !float32_is_zero(a);

is easier to review and less duplicative of code.
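
As a rough standalone check that the two forms agree, here is a sketch that
works on raw 32-bit encodings instead of the softfloat float32 type. The f32_*
names are stand-ins invented for this example, not the softfloat functions
themselves.

```c
#include <stdint.h>

/* Zero: everything except the sign bit is clear. */
int f32_is_zero(uint32_t a)
{
    return (a & 0x7fffffff) == 0;
}

/* Zero or denormal: the 8-bit exponent field is clear. */
int f32_is_zero_or_denormal(uint32_t a)
{
    return (a & 0x7f800000) == 0;
}

/* Form from the patch: exponent field clear and fraction non-zero. */
int f32_is_denormal_masks(uint32_t a)
{
    return ((a & 0x7f800000) == 0) && ((a & 0x007fffff) != 0);
}

/* Composed form: built from the two existing predicates. */
int f32_is_denormal_composed(uint32_t a)
{
    return f32_is_zero_or_denormal(a) && !f32_is_zero(a);
}
```

Both versions give the same answer for every encoding, since "exponent field
clear and not zero" leaves only the non-zero fraction case.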

thanks
-- PMM

* Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-25 11:55   ` Peter Maydell
@ 2013-10-25 13:01     ` Tom Musta
  2013-10-25 13:37       ` Peter Maydell
  0 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-25 13:01 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-ppc, QEMU Developers

Peter:  Thanks for your feedback.  Responses below.


On 10/25/2013 6:55 AM, Peter Maydell wrote:
> On 24 October 2013 17:17, Tom Musta <tommusta@gmail.com> wrote:
>> This patch adds routines to the softfloat library that are useful for
>> the PowerPC VSX implementation.  The routines are, however, not specific
>> to PowerPC and are appropriate for softfloat.
>>
>> The following routines are added:
>>
>>    - float32_is_denormal() returns true if the 32-bit floating point number
>>      is denormalized.
>>    - float64_is_denormal() returns true if the 64-bit floating point number
>>      is denormalized.
>
> Can you point me at the patches which use these, please?
> I couldn't find them with a quick search in my email client.

Please see http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03108.html

>
>>    - float32_get_unbiased_exp() returns the unbiased exponent of a 32-bit
>>      floating point number.
>>    - float64_get_unbiased_exp() returns the unbiased exponent of a 64-bit
>>      floating point number.
>
> These look rather odd to me, and again I can't find the uses in
> your patchset. Returning just the exponent is a bit odd and
> suggests that maybe the split between target code and softfloat
> is in the wrong place.

Please see http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03108.html
and http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03107.html
and also the corresponding definitions of those instructions in the Power ISA.

What is odd here is the PowerPC instruction(s) :)

But given that softfloat code extracts exponents in numerous places, I do not find
it odd at all that a floating point instruction model for a non-standard
operation might have to do the same.

These functions can easily be kept within the PowerPC code proper if there are
objections to them being added to softfloat.  I would rename them, of course, so
that they do not look like softfloat routines.

>>    - float32_to_uint64() converts a 32-bit floating point number to an
>>      unsigned 64 bit number.
>
> I would put this in its own patch, personally.

Fair enough.  Just so that I am clear ... do you mean submit this as a patch
just by itself (not as part of a series of VSX additions)?

>>
>> +INLINE int float32_is_denormal(float32 a)
>> +{
>> +    return ((float32_val(a) & 0x7f800000) == 0) &&
>> +           ((float32_val(a) & 0x007fffff) != 0);
>> +}
>
> return float32_is_zero_or_denormal(a) && !float32_is_zero(a);
>
> is easier to review and less duplicative of code.
>
> thanks

It surprised me that there were is_zero and is_zero_or_denormal functions but
no is_denormal functions.  I would find it more normal to implement the two
primitive functions and then construct is_zero_or_denormal as the OR of
those two, at least until you look at the efficiency of the implementation.

* Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-25 11:44     ` Peter Maydell
@ 2013-10-25 13:09       ` Alex Bennée
  2013-10-25 13:24       ` Tom Musta
  1 sibling, 0 replies; 58+ messages in thread
From: Alex Bennée @ 2013-10-25 13:09 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Tom Musta, qemu-ppc, QEMU Developers


peter.maydell@linaro.org writes:

> On 25 October 2013 12:34, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Is it worth adding some sort of test into make check to defend these
>> softfloat functions against unintentional breakage? It would certainly
>> be worthwhile as soon as multiple arches use these functions as float
>> errors are often subtle and hard to track down.
>
> Ideally, but there's zero infrastructure for doing the kind
> of serious including-edge-cases testing at the moment, so I'm
> not really in favour of making it a gating condition for
> accepting patches.

I'm not proposing to halt inclusion for that; I was just wondering aloud
how it could be defended. The soft-float routines themselves could be
tested within the existing tests/ infrastructure, along the lines of
tests/check-qfloat.c, without having to worry about hooking into
target-arch-specific test cases.

> If somebody wanted to set up such infrastructure, there are
> a couple of approaches that spring to mind:
>  (a) get risu (https://wiki.linaro.org/PeterMaydell/Risu) working
>   on more target architectures, add the "record-and-replay" feature
>   so it can be run without having target hardware, and then just
>   test softfloat by testing the actual target fp instructions

Interesting. Funnily we spent a lot of time at Transitive fixing up
translation failures that our random code generator threw up. It's also
equally interesting how far you can get with fairly broken translation
that no actual applications care about.

I'll have a look once I've fixed up build machinery around the existing
TCG tests.

>  (b) something involving wiring up IBM's IEEE test suite
>   vectors directly to our softfloat code:
>  https://www.research.ibm.com/cgi-bin/haifa/test_suite_download.pl?first=elenag&second=webmaster
>   (it's not clear to me what license the test vectors are
>   under)
>
> -- PMM

-- 
Alex Bennée

* Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-25 11:44     ` Peter Maydell
  2013-10-25 13:09       ` Alex Bennée
@ 2013-10-25 13:24       ` Tom Musta
  1 sibling, 0 replies; 58+ messages in thread
From: Tom Musta @ 2013-10-25 13:24 UTC (permalink / raw)
  To: Peter Maydell, Alex Bennée; +Cc: qemu-ppc, QEMU Developers

On 10/25/2013 6:44 AM, Peter Maydell wrote:
> On 25 October 2013 12:34, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Is it worth adding some sort of test into make check to defend these
>> softfloat functions against unintentional breakage? It would certainly
>> be worthwhile as soon as multiple arches use these functions as float
>> errors are often subtle and hard to track down.
>
> Ideally, but there's zero infrastructure for doing the kind
> of serious including-edge-cases testing at the moment, so I'm
> not really in favour of making it a gating condition for
> accepting patches.
>
> If somebody wanted to set up such infrastructure, there are
> a couple of approaches that spring to mind:
>   (a) get risu (https://wiki.linaro.org/PeterMaydell/Risu) working
>    on more target architectures, add the "record-and-replay" feature
>    so it can be run without having target hardware, and then just
>    test softfloat by testing the actual target fp instructions
>   (b) something involving wiring up IBM's IEEE test suite
>    vectors directly to our softfloat code:
>   https://www.research.ibm.com/cgi-bin/haifa/test_suite_download.pl?first=elenag&second=webmaster
>    (it's not clear to me what license the test vectors are
>    under)

Softfloat would seem to lend itself very well to unit testing which makes (b)
attractive.  Let me see if I can get an answer to the licensing question.

* Re: [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX
  2013-10-25 13:01     ` Tom Musta
@ 2013-10-25 13:37       ` Peter Maydell
  0 siblings, 0 replies; 58+ messages in thread
From: Peter Maydell @ 2013-10-25 13:37 UTC (permalink / raw)
  To: Tom Musta; +Cc: qemu-ppc, QEMU Developers

On 25 October 2013 14:01, Tom Musta <tommusta@gmail.com> wrote:
> On 10/25/2013 6:55 AM, Peter Maydell wrote:
>> On 24 October 2013 17:17, Tom Musta <tommusta@gmail.com> wrote:
>>>    - float32_is_denormal() returns true if the 32-bit floating point
>>> number
>>>      is denormalized.
>>>    - float64_is_denormal() returns true if the 64-bit floating point
>>> number
>>>      is denormalized.
>>
>>
>> Can you point me at the patches which use these, please?
>> I couldn't find them with a quick search in my email client.
>
>
> Please see
> http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03108.html

Thanks. For that code you can just use the existing
is_zero_or_denormal function if you like, since you've
already ruled out "is this zero?" by the time you're
checking for "is this denormal?". (In fact that logic
seems to do a number of pointless checks for "is this
zero?" when it's already ruled that case out very early;
it should probably be rephrased.)

However I don't think there's any harm in our providing some
*_is_denormal() functions in our softfloat API if the code
seems clearer if it's written to use them. It does fill
out an odd gap in the API shape, as you note below.

>>>    - float32_get_unbiased_exp() returns the unbiased exponent of a 32-bit
>>>      floating point number.
>>>    - float64_get_unbiased_exp() returns the unbiased exponent of a 64-bit
>>>      floating point number.
>>
>>
>> These look rather odd to me, and again I can't find the uses in
>> your patchset. Returning just the exponent is a bit odd and
>> suggests that maybe the split between target code and softfloat
>> is in the wrong place.
>
>
> Please see
> http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03108.html
> and http://lists.nongnu.org/archive/html/qemu-devel/2013-10/msg03107.html
> and also the corresponding definitions of those instructions in the Power
> ISA.
>
> What is odd here is the PowerPC instruction(s) :)
>
> But given that softfloat code extracts exponents in numerous places, I do
> not find
> it odd at all that a floating point instruction model for a non-standard
> operation might have to do the same.
>
> These functions can easily be kept within the PowerPC code proper if there
> are
> objections to them being added to softfloat.  I would rename them, of
> course, so
> that they do not look like softfloat routines.

Mmm. You'll notice that your calling code has to know rather
a lot about the format of the IEEE floats (in that it has
to know the min/max exponent and mantissa width). So I think
I'd just opencode these in the PPC routines. (This is what we
do in target-arm, see recpe_f32 and rsqrte_f32 for examples.)
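
For reference, open-coding the exponent extraction amounts to very little. A
rough sketch on raw encodings follows; the function names are invented here and
this is not the target-arm code.

```c
#include <stdint.h>

/* Single precision: 8-bit exponent field at bits 30..23, bias 127. */
int f32_unbiased_exp(uint32_t a)
{
    return (int)((a >> 23) & 0xff) - 127;
}

/* Double precision: 11-bit exponent field at bits 62..52, bias 1023. */
int f64_unbiased_exp(uint64_t a)
{
    return (int)((a >> 52) & 0x7ff) - 1023;
}
```

Note that zeros and denormals come back as -127 (or -1023) and infinities and
NaNs as 128 (or 1024), so the caller still has to know which values of the
field are special. That is the format knowledge mentioned above.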

>>>    - float32_to_uint64() converts a 32-bit floating point number to an
>>>      unsigned 64 bit number.
>>
>>
>> I would put this in its own patch, personally.
>
>
> Fair enough.  Just so that I am clear ... do you mean submit this as a patch
> just by itself (not as part of a series of VSX additions)?

I mean "in its own patch email so it is a separate commit and
clearly separated from other things for code review purposes".
You probably still keep it as part of this patch series. (In
fact it would also be a good idea to include the previous
patch this one depends on, if that has not yet been committed.)

>
>>>
>>> +INLINE int float32_is_denormal(float32 a)
>>> +{
>>> +    return ((float32_val(a) & 0x7f800000) == 0) &&
>>> +           ((float32_val(a) & 0x007fffff) != 0);
>>> +}
>>
>>
>> return float32_is_zero_or_denormal(a) && !float32_is_zero(a);
>>
>> is easier to review and less duplicative of code.
>>
>> thanks
>
>
> It surprised me that there were is_zero and is_zero_or_denormal functions
> but
> not is_denormal functions.  I would find it more normal to implement the two
> primitive functions and then construct is_zero_or_denormal to be the OR of
> those two.  Until you look at efficiency of the implementation.

I think also the original uses of these functions didn't
need to distinguish zero from denormal, so it was a more
natural API for those uses.

-- PMM

* Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-24 20:38   ` Richard Henderson
@ 2013-10-25 13:49     ` Tom Musta
  2013-10-25 16:25     ` Tom Musta
  1 sibling, 0 replies; 58+ messages in thread
From: Tom Musta @ 2013-10-25 13:49 UTC (permalink / raw)
  To: Richard Henderson, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 3:38 PM, Richard Henderson wrote:
> On 10/24/2013 09:25 AM, Tom Musta wrote:
>>                                                 \
<snip>

>> +                ft1 = tp##_to_##btp(s->fld[i], &env->fp_status);              \
>> +                ft0 = btp##_##sum(ft0, ft1, &env->fp_status);                 \
>> +                xt.fld[i] = btp##_to_##tp(ft0, &env->fp_status);              \
<snip>
> You want to be using tp##muladd instead of widening to 128 bits.

Thanks for the suggestion, Richard.  I will try it.

>
>> +        s = &xt;                                                              \
>> +    }                                                                         \
>> +    else {                                                                    \
>> +        m = &xt;                                                              \
>
> Also be careful of the codingstyle.

To be fixed in V2 (checkpatch.pl missed this one).

* Re: [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions
  2013-10-24 22:10   ` Peter Maydell
@ 2013-10-25 13:52     ` Tom Musta
  2013-10-25 13:55       ` Peter Maydell
  0 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-25 13:52 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-ppc, QEMU Developers, Richard Henderson

On 10/24/2013 5:10 PM, Peter Maydell wrote:
> Can't you use the min and max softfloat functions? Those are
> there specifically because the corner cases mean you can't
> implement them using the comparisons. (For instance for
> the example you quote of max(-0.0, +0.0) they return +0.0
> as you require.)

I tried this but didn't have much luck getting results to match
the P7 hardware.  Unfortunately, I don't recall the details.
Let me try this approach again.

* Re: [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions
  2013-10-25 13:52     ` Tom Musta
@ 2013-10-25 13:55       ` Peter Maydell
  0 siblings, 0 replies; 58+ messages in thread
From: Peter Maydell @ 2013-10-25 13:55 UTC (permalink / raw)
  To: Tom Musta; +Cc: qemu-ppc, QEMU Developers, Richard Henderson

On 25 October 2013 14:52, Tom Musta <tommusta@gmail.com> wrote:
> On 10/24/2013 5:10 PM, Peter Maydell wrote:
>>
>> Can't you use the min and max softfloat functions? Those are
>> there specifically because the corner cases mean you can't
>> implement them using the comparisons. (For instance for
>> the example you quote of max(-0.0, +0.0) they return +0.0
>> as you require.)

> I tried this but didn't have much luck getting results to match
> the P7 hardware.  Unfortunately, I don't recall the details.
> Let me try this approach again.

The functions are supposed to match the IEEE mandated min/max
behaviour, and I tested the ARM instructions that use them,
so unless the PPC chip designers have gone rather off-piste
they ought to work :-) (It can happen, though, IIRC x86 has
some rather weird non-IEEE min/max insns.)

-- PMM

* Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-24 20:38   ` Richard Henderson
  2013-10-25 13:49     ` Tom Musta
@ 2013-10-25 16:25     ` Tom Musta
  2013-10-25 16:42       ` Richard Henderson
  2013-10-25 17:20       ` Peter Maydell
  1 sibling, 2 replies; 58+ messages in thread
From: Tom Musta @ 2013-10-25 16:25 UTC (permalink / raw)
  To: Richard Henderson, QEMU Developers; +Cc: qemu-ppc

On 10/24/2013 3:38 PM, Richard Henderson wrote:
> On 10/24/2013 09:25 AM, Tom Musta wrote:
>>                                                 \
>> +            ft0 = tp##_to_##btp(xa.fld[i], &env->fp_status);                  \
>> +            ft1 = tp##_to_##btp(m->fld[i], &env->fp_status);                  \
>> +            ft0 = btp##_mul(ft0, ft1, &env->fp_status);                       \
>> +            if (unlikely(btp##_is_infinity(ft0) &&                            \
>> +                         tp##_is_infinity(s->fld[i]) &&                       \
>> +                         btp##_is_neg(ft0) cmp tp##_is_neg(s->fld[i]))) {     \
>> +                xt.fld[i] = float64_to_##tp(                                  \
>> +                              fload_invalid_op_excp(env,                      \
>> +                                                     POWERPC_EXCP_FP_VXISI,   \
>> +                                                     sfprf),                  \
>> +                              &env->fp_status);                               \
>> +            } else {                                                          \
>> +                ft1 = tp##_to_##btp(s->fld[i], &env->fp_status);              \
>> +                ft0 = btp##_##sum(ft0, ft1, &env->fp_status);                 \
>> +                xt.fld[i] = btp##_to_##tp(ft0, &env->fp_status);              \
>> +            }                                                                 \
>> +            if (neg && likely(!tp##_is_any_nan(xt.fld[i]))) {                 \
>> +                xt.fld[i] = tp##_chs(xt.fld[i]);                              \
>> +            }
>
> You want to be using tp##muladd instead of widening to 128 bits.

I tried recoding xsmaddadp using float64_muladd.  The problem that I hit is the
boundary case where the intermediate product and the summand are infinities of
the opposite sign.  This is the case handled by the first "if" in the code
snippet above.  PowerPC has a dedicated FPSCR bit for this type of condition
(VXISI) as well as a general invalid operation bit (VX).  As far as I can tell,
the softfloat code only has the equivalent of the VX bit.   Thus the implementation
that I proposed is a more accurate representation of the Power ISA.

The VSX code was modeled after the existing fmadd FPU instruction.  I suspect
the author of that code wrote it this way for similar reasons.

I am inclined to keep my proposed implementation, which is consistent with
the existing PowerPC code.

Thoughts?

* Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-25 16:25     ` Tom Musta
@ 2013-10-25 16:42       ` Richard Henderson
  2013-10-25 17:13         ` Tom Musta
  2013-10-25 17:20       ` Peter Maydell
  1 sibling, 1 reply; 58+ messages in thread
From: Richard Henderson @ 2013-10-25 16:42 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/25/2013 09:25 AM, Tom Musta wrote:
> 
> I tried recoding xsmaddadp using float64_muladd.  The problem that I hit is the
> boundary case where the intermediate product and the summand are infinities of
> the opposite sign.  This is the case handled by the first "if" in the code
> snippet above.  PowerPC has a dedicated FPSCR bit for this type of condition
> (VXISI) as well as a general invalid operation bit (VX).  As far as I can tell,
> the softfloat code only has the equivalent of the VX bit.   Thus the
> implementation
> that I proposed is a more accurate representation of the Power ISA.
> 
> The VSX code was modeled after the existing fmadd FPU instruction.  I suspect
> the author of that code wrote it this way for similar reasons.
> 
> I am inclined to keep my proposed implementation, which is consistent with
> the existing PowerPC code.
> 
> Thoughts?

Hmm.  I won't object to your current implementation, since it does produce
correct results.

I believe that a better implementation could use float*_muladd, and check the
result for float_flag_invalid.  If set, compute the intermediate product so you
can figure out the VXISI setting.  But we'd expect that to be an unlikely path.
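
Roughly, the unlikely path could look like the sketch below. It is only meant
to run after the fused operation has already reported an invalid operation. It
works on raw float64 encodings and covers the add form (a * b + c) only; the
enum and function names are invented for illustration, and mapping the result
onto the FPSCR bits is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define F64_ABS_MASK 0x7fffffffffffffffULL
#define F64_INF_BITS 0x7ff0000000000000ULL

bool f64_is_inf(uint64_t a)  { return (a & F64_ABS_MASK) == F64_INF_BITS; }
bool f64_is_nan(uint64_t a)  { return (a & F64_ABS_MASK) > F64_INF_BITS; }
bool f64_is_zero(uint64_t a) { return (a & F64_ABS_MASK) == 0; }
bool f64_is_neg(uint64_t a)  { return (a >> 63) != 0; }

/* Why a * b + c was invalid (hypothetical classification). */
enum muladd_invalid {
    MULADD_INVALID_OTHER,          /* e.g. a signalling NaN operand */
    MULADD_INVALID_INF_TIMES_ZERO, /* the product itself is invalid */
    MULADD_INVALID_INF_MINUS_INF,  /* the VXISI case */
};

enum muladd_invalid classify_muladd_invalid(uint64_t a, uint64_t b, uint64_t c)
{
    /* With a NaN operand the product is not an infinity at all. */
    if (f64_is_nan(a) || f64_is_nan(b) || f64_is_nan(c)) {
        return MULADD_INVALID_OTHER;
    }
    if ((f64_is_inf(a) && f64_is_zero(b)) || (f64_is_zero(a) && f64_is_inf(b))) {
        return MULADD_INVALID_INF_TIMES_ZERO;
    }
    /* Product is infinite; its sign is the XOR of the operand signs. */
    if ((f64_is_inf(a) || f64_is_inf(b)) && f64_is_inf(c)) {
        bool product_neg = (f64_is_neg(a) != f64_is_neg(b));
        if (product_neg != f64_is_neg(c)) {
            return MULADD_INVALID_INF_MINUS_INF;
        }
    }
    return MULADD_INVALID_OTHER;
}
```

The subtract forms would flip the sign of c before the last test.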


r~

* Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-25 16:42       ` Richard Henderson
@ 2013-10-25 17:13         ` Tom Musta
  2013-10-25 17:29           ` Richard Henderson
  0 siblings, 1 reply; 58+ messages in thread
From: Tom Musta @ 2013-10-25 17:13 UTC (permalink / raw)
  To: Richard Henderson, QEMU Developers; +Cc: qemu-ppc

On 10/25/2013 11:42 AM, Richard Henderson wrote:
> I believe that a better implementation could use float*_muladd, and check the
> result for float_flag_invalid.  If set, compute the intermediate product so you
> can figure out the VXISI setting.  But we'd expect that to be an unlikely path.

Interesting thought.  I think I see a way to re-arrange the code.  Thanks, Richard.

* Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-25 16:25     ` Tom Musta
  2013-10-25 16:42       ` Richard Henderson
@ 2013-10-25 17:20       ` Peter Maydell
  2013-10-25 17:34         ` Richard Henderson
  1 sibling, 1 reply; 58+ messages in thread
From: Peter Maydell @ 2013-10-25 17:20 UTC (permalink / raw)
  To: Tom Musta; +Cc: qemu-ppc, QEMU Developers, Richard Henderson

On 25 October 2013 17:25, Tom Musta <tommusta@gmail.com> wrote:
> On 10/24/2013 3:38 PM, Richard Henderson wrote:
>> You want to be using tp##muladd instead of widening to 128 bits.
>
>
> I tried recoding xsmaddadp using float64_muladd.  The problem that I hit is
> the boundary case where the intermediate product and the summand are
> infinities of the opposite sign.  This is the case handled by the first "if"
> in the code snippet above.  PowerPC has a dedicated FPSCR bit for this type
> of condition (VXISI) as well as a general invalid operation bit (VX).  As far
> as I can tell, the softfloat code only has the equivalent of the VX bit.
> Thus the implementation that I proposed is a more accurate representation of
> the Power ISA.

You could add the flag to the softfloat code -- this is what I did
for the somewhat ARM specific float_flag_output_denormal.

> The VSX code was modeled after the existing fmadd FPU instruction.  I suspect
> the author of that code wrote it this way for similar reasons.

I suspect it just predates the provision of fused multiply-add at
the softfloat level. It should ideally be rewritten to use the
softfloat functions.

Are you sure that doing the arithmetic with the softfloat 128 bit
float operations doesn't set the inexact flag anywhere it
shouldn't? (ie where the intermediate product is not exact in
128 bit format but the final result is exact in 64 or 32 bits).

-- PMM


* Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-25 17:13         ` Tom Musta
@ 2013-10-25 17:29           ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-25 17:29 UTC (permalink / raw)
  To: Tom Musta, QEMU Developers; +Cc: qemu-ppc

On 10/25/2013 10:13 AM, Tom Musta wrote:
> On 10/25/2013 11:42 AM, Richard Henderson wrote:
>> I believe that a better implementation could use float*_muladd, and check the
>> result for float_flag_invalid.  If set, compute the intermediate product so you
>> can figure out the VXISI setting.  But we'd expect that to be an unlikely path.
> 
> Interesting thought.  I think I see a way to re-arrange the code.  Thanks,
> Richard.

Actually, you don't even have to compute the intermediate product.

The only way you can have VXISI for a*b+c is for

  isinf(c) && (isinf(a) || isinf(b))

since the intermediate product a*b is infinite precision, and thus cannot
overflow to inf unless one of the multiplicands is already inf.
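
As a rough illustration of the condition above, here is a minimal sketch in
plain C on doubles.  It is not QEMU or softfloat code, and the helper name
madd_is_inf_sub_inf is hypothetical.  It adds three refinements that are my
assumptions rather than part of the condition as stated: NaN operands are
excluded, inf * 0 is excluded (the Power ISA reports that case through a
separate bit, VXIMZ), and the signs of the product and the addend are
compared so that inf + inf of the same sign is not flagged.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not a QEMU or softfloat function: returns true when
 * a*b+c is an "infinity minus infinity" case, i.e. the infinitely precise
 * product is an infinity and c is an infinity of the opposite sign.
 */
static bool madd_is_inf_sub_inf(double a, double b, double c)
{
    if (isnan(a) || isnan(b) || isnan(c)) {
        return false;   /* NaN operands are a separate case */
    }
    if (!isinf(c) || !(isinf(a) || isinf(b))) {
        return false;   /* isinf(c) && (isinf(a) || isinf(b)) must hold */
    }
    if (a == 0.0 || b == 0.0) {
        return false;   /* inf * 0 is invalid for a different reason */
    }
    /* The sign of the infinite product is the XOR of the operand signs. */
    bool product_negative = (signbit(a) != 0) != (signbit(b) != 0);
    bool addend_negative = signbit(c) != 0;
    return product_negative != addend_negative;
}

/* A few cases that exercise the predicate. */
static void madd_is_inf_sub_inf_examples(void)
{
    assert(madd_is_inf_sub_inf(INFINITY, 2.0, -INFINITY));
    assert(!madd_is_inf_sub_inf(INFINITY, 2.0, INFINITY));
    /* Finite multiplicands: the exact product is finite, so no inf - inf. */
    assert(!madd_is_inf_sub_inf(1e308, 1e308, -INFINITY));
}
```

In a helper built on a fused multiply-add, a check like this would only need
to run after the invalid flag has been observed, so it stays off the common
path.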


r~


* Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
  2013-10-25 17:20       ` Peter Maydell
@ 2013-10-25 17:34         ` Richard Henderson
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Henderson @ 2013-10-25 17:34 UTC (permalink / raw)
  To: Peter Maydell, Tom Musta; +Cc: qemu-ppc, QEMU Developers

On 10/25/2013 10:20 AM, Peter Maydell wrote:
> Are you sure that doing the arithmetic with the softfloat 128 bit
> float operations doesn't set the inexact flag anywhere it
> shouldn't? (ie where the intermediate product is not exact in
> 128 bit format but the final result is exact in 64 or 32 bits).

The 128 bit multiply cannot give an inexact, and I believe that if the 128 bit
addition gives inexact then the 64-bit fma result would also have inexact.
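
The first half of that claim can be checked with a small sketch in plain C.
The product of two p-bit significands needs at most 2p bits, so it is exact
whenever the wider format has at least 2p significand bits.  The helper name
product_always_exact and the QUAD_MANT_BITS constant are mine (113 is the
IEEE 754 binary128 precision, which I am assuming is what the 128-bit
softfloat type provides).  The sketch looks at significand width only and
assumes the wider exponent range also covers the product, which holds for
binary64 into binary128.  It says nothing about the addition step.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Assumed significand precision of the wide format (IEEE 754 binary128). */
#define QUAD_MANT_BITS 113

/*
 * Hypothetical helper: true when the product of any two values with
 * src_mant_bits of significand fits in dst_mant_bits without rounding.
 */
static bool product_always_exact(int src_mant_bits, int dst_mant_bits)
{
    return 2 * src_mant_bits <= dst_mant_bits;
}

static void product_always_exact_examples(void)
{
    /* double x double into quad: 2 * 53 = 106 <= 113, so always exact. */
    assert(product_always_exact(DBL_MANT_DIG, QUAD_MANT_BITS));
    /* double x double into double can round: 106 > 53. */
    assert(!product_always_exact(DBL_MANT_DIG, DBL_MANT_DIG));
}
```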


r~


end of thread, other threads:[~2013-10-25 17:35 UTC | newest]

Thread overview: 58+ messages
2013-10-24 16:16 [Qemu-devel] [PATCH 00/19] PowerPC VSX Stage 3 Tom Musta
2013-10-24 16:17 ` [Qemu-devel] [PATCH 01/19] Add New softfloat Routines for VSX Tom Musta
2013-10-24 18:34   ` Richard Henderson
2013-10-25 11:34   ` Alex Bennée
2013-10-25 11:44     ` Peter Maydell
2013-10-25 13:09       ` Alex Bennée
2013-10-25 13:24       ` Tom Musta
2013-10-25 11:55   ` Peter Maydell
2013-10-25 13:01     ` Tom Musta
2013-10-25 13:37       ` Peter Maydell
2013-10-24 16:18 ` [Qemu-devel] [PATCH 02/19] Add set_fprf Argument to fload_invalid_op_excp() Tom Musta
2013-10-24 16:19 ` [Qemu-devel] [PATCH 03/19] General Support for VSX Helpers Tom Musta
2013-10-24 18:51   ` Richard Henderson
2013-10-24 20:42     ` Tom Musta
2013-10-24 21:00       ` Richard Henderson
2013-10-24 16:20 ` [Qemu-devel] [PATCH 04/19] Add VSX ISA2.06 xadd Instructions Tom Musta
2013-10-24 19:44   ` Richard Henderson
2013-10-24 16:20 ` [Qemu-devel] [PATCH 05/19] Add VSX ISA2.06 xsub Instructions Tom Musta
2013-10-24 19:48   ` Richard Henderson
2013-10-24 16:21 ` [Qemu-devel] [PATCH 06/19] Add VSX ISA2.06 xmul Instructions Tom Musta
2013-10-24 20:07   ` Richard Henderson
2013-10-24 16:21 ` [Qemu-devel] [PATCH 07/19] Add VSX ISA2.06 xdiv Instructions Tom Musta
2013-10-24 20:08   ` Richard Henderson
2013-10-24 16:22 ` [Qemu-devel] [PATCH 08/19] Add VSX ISA2.06 xre Instructions Tom Musta
2013-10-24 20:11   ` Richard Henderson
2013-10-24 16:22 ` [Qemu-devel] [PATCH 09/19] Add VSX ISA2.06 xsqrt Instructions Tom Musta
2013-10-24 20:23   ` Richard Henderson
2013-10-24 16:23 ` [Qemu-devel] [PATCH 10/19] Add VSX ISA2.06 xrsqrte Instructions Tom Musta
2013-10-24 20:25   ` Richard Henderson
2013-10-24 16:23 ` [Qemu-devel] [PATCH 11/19] Add VSX ISA2.06 xtdiv Instructions Tom Musta
2013-10-24 20:30   ` Richard Henderson
2013-10-24 16:24 ` [Qemu-devel] [PATCH 12/19] Add VSX ISA2.06 xtsqrt Instructions Tom Musta
2013-10-24 20:34   ` Richard Henderson
2013-10-24 16:25 ` [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions Tom Musta
2013-10-24 20:38   ` Richard Henderson
2013-10-25 13:49     ` Tom Musta
2013-10-25 16:25     ` Tom Musta
2013-10-25 16:42       ` Richard Henderson
2013-10-25 17:13         ` Tom Musta
2013-10-25 17:29           ` Richard Henderson
2013-10-25 17:20       ` Peter Maydell
2013-10-25 17:34         ` Richard Henderson
2013-10-24 16:25 ` [Qemu-devel] [PATCH 14/19] Add VSX xscmp*dp Instructions Tom Musta
2013-10-24 20:39   ` Richard Henderson
2013-10-24 16:26 ` [Qemu-devel] [PATCH 15/19] Add VSX xmax/xmin Instructions Tom Musta
2013-10-24 20:45   ` Richard Henderson
2013-10-24 21:07     ` Tom Musta
2013-10-24 21:18       ` Richard Henderson
2013-10-24 22:10   ` Peter Maydell
2013-10-25 13:52     ` Tom Musta
2013-10-25 13:55       ` Peter Maydell
2013-10-24 16:26 ` [Qemu-devel] [PATCH 16/19] Add VSX Vector Compare Instructions Tom Musta
2013-10-24 16:27 ` [Qemu-devel] [PATCH 17/19] Add VSX Floating Point to Floating Point Conversion Instructions Tom Musta
2013-10-24 20:49   ` Richard Henderson
2013-10-24 16:27 ` [Qemu-devel] [PATCH 18/19] Add VSX ISA2.06 Integer " Tom Musta
2013-10-24 20:51   ` Richard Henderson
2013-10-24 16:28 ` [Qemu-devel] [PATCH 19/19] Add VSX Rounding Instructions Tom Musta
2013-10-24 20:54   ` Richard Henderson
