[Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP

* [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP
@ 2018-05-10  9:42 Alex Bennée
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised Alex Bennée
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Alex Bennée @ 2018-05-10  9:42 UTC
  To: peter.maydell; +Cc: richard.henderson, qemu-arm, qemu-devel, Alex Bennée

Hi,

I've not included the test case in the series but you can find it in
my TCG fixup branch:

  https://github.com/stsquad/qemu/blob/testing/tcg-tests-revival-v4/tests/tcg/arm/fcvt.c

Some of the ARMv7 versions are commented out as they were not
supported until later revs. I do have a build that includes them, but
unfortunately the Debian compiler is too old to build it.

: patch 0001/fpu softfloat int_to_float ensure r fully initial.patch needs review
: patch 0004/target arm convert conversion helpers to fpst ahp.patch needs review
: patch 0005/target arm squash FZ16 behaviour for conversions.patch needs review

Alex Bennée (5):
  fpu/softfloat: int_to_float ensure r fully initialised
  fpu/softfloat: re-factor float to float conversions
  fpu/softfloat: support ARM Alternative half-precision
  target/arm: convert conversion helpers to fpst/ahp_flag
  target/arm: squash FZ16 behaviour for conversions

 fpu/softfloat-specialize.h |  40 ---
 fpu/softfloat.c            | 546 +++++++++++--------------------------
 include/fpu/softfloat.h    |   8 +-
 target/arm/helper.c        |  66 ++---
 target/arm/helper.h        |  12 +-
 target/arm/translate-a64.c |  38 ++-
 target/arm/translate.c     |  70 +++--
 target/arm/translate.h     |  15 +
 8 files changed, 306 insertions(+), 489 deletions(-)

-- 
2.17.0


* [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised
  2018-05-10  9:42 [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Alex Bennée
@ 2018-05-10  9:42 ` Alex Bennée
  2018-05-10 12:40   ` Peter Maydell
  2018-05-10 14:50   ` Richard Henderson
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 2/5] fpu/softfloat: re-factor float to float conversions Alex Bennée
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 12+ messages in thread
From: Alex Bennée @ 2018-05-10  9:42 UTC
  To: peter.maydell
  Cc: richard.henderson, qemu-arm, qemu-devel, Alex Bennée,
	Aurelien Jarno

Reported by Coverity (CID1390635). We ensure this for uint_to_float
later on so we might as well mirror that.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
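
For readers outside the softfloat code, here is a minimal standalone
sketch of the class of problem Coverity flags. DemoParts, DemoClass and
demo_int_to_parts are hypothetical stand-ins, not the real FloatParts or
int_to_float. The zero branch assigns only two members, so the rest of
the returned struct depends on the initialiser. The sketch uses "{0}",
which is standard C; the patch itself uses the empty "{}" form, which
GCC and Clang accept as an extension before C23.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for FloatParts; names are illustrative only. */
typedef enum { DEMO_CLASS_ZERO, DEMO_CLASS_NORMAL } DemoClass;

typedef struct {
    uint64_t frac;
    int32_t exp;
    DemoClass cls;
    bool sign;
} DemoParts;

DemoParts demo_int_to_parts(int64_t a)
{
    /* Zero every member up front so no path returns indeterminate data. */
    DemoParts r = {0};

    if (a == 0) {
        /* Only cls and sign are assigned; frac and exp rely on the
         * initialiser above. */
        r.cls = DEMO_CLASS_ZERO;
        r.sign = false;
    } else {
        r.cls = DEMO_CLASS_NORMAL;
        r.sign = a < 0;
        /* Unsigned negation stays well defined for INT64_MIN. */
        r.frac = r.sign ? 0 - (uint64_t)a : (uint64_t)a;
        r.exp = 0; /* this sketch does not normalise the fraction */
    }
    return r;
}
```

Without the initialiser, the zero path would return frac and exp with
indeterminate values, which is what the one-line change below avoids.
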
 fpu/softfloat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 70e0c40a1c..3adf6a06e4 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1517,7 +1517,7 @@ FLOAT_TO_UINT(64, 64)
 
 static FloatParts int_to_float(int64_t a, float_status *status)
 {
-    FloatParts r;
+    FloatParts r = {};
     if (a == 0) {
         r.cls = float_class_zero;
         r.sign = false;
-- 
2.17.0


* [Qemu-devel] [PATCH v3 2/5] fpu/softfloat: re-factor float to float conversions
  2018-05-10  9:42 [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Alex Bennée
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised Alex Bennée
@ 2018-05-10  9:42 ` Alex Bennée
  2018-05-10 12:48   ` Peter Maydell
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 3/5] fpu/softfloat: support ARM Alternative half-precision Alex Bennée
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Alex Bennée @ 2018-05-10  9:42 UTC
  To: peter.maydell
  Cc: richard.henderson, qemu-arm, qemu-devel, Alex Bennée,
	Aurelien Jarno

This allows us to delete a lot of additional boilerplate code which is
no longer needed. Currently the ieee flag is ignored (everything is
assumed to be ieee). Handling for ARM AHP will be in the next patch.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

---
v2
  - pass FloatFmt to float_to_float instead of sizes
  - split AHP handling to another patch
  - use rth's suggested re-packing (+ setting .exp)
v3
  - also rm extractFloat16Sign
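
The NaN re-packing in float_to_float comes down to a pair of shifts on
the fraction field. A minimal standalone sketch of that arithmetic
follows, assuming the fraction of a NaN is held unshifted in the low
frac_size bits. demo_repack_nan_frac is a hypothetical helper written
for this note and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Move a NaN payload from a source fraction width to a destination
 * fraction width, keeping the most significant payload bits.
 * Hypothetical helper: it mirrors only the shift expression.
 */
uint64_t demo_repack_nan_frac(uint64_t frac, int src_frac_size,
                              int dst_frac_size)
{
    /* Keep both shift counts in the range 1..63. */
    assert(src_frac_size > 0 && src_frac_size < 64);
    assert(dst_frac_size > 0 && dst_frac_size < 64);

    /* Left-align the source fraction in 64 bits, then right-align it
     * to the destination width. */
    return frac << (64 - src_frac_size) >> (64 - dst_frac_size);
}
```

For example, narrowing the 23-bit fraction 0x400001 to 10 bits gives
0x200, and widening the 10-bit fraction 0x201 to 23 bits gives
0x402000. In this sketch a payload that lives only in the low bits,
such as 0x000001, narrows to 0; how the surrounding code treats that
case is outside what the sketch shows.
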
---
 fpu/softfloat-specialize.h |  40 ----
 fpu/softfloat.c            | 452 +++++++------------------------------
 include/fpu/softfloat.h    |   8 +-
 3 files changed, 88 insertions(+), 412 deletions(-)

diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index 27834af0de..a20b440159 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -293,46 +293,6 @@ float16 float16_maybe_silence_nan(float16 a_, float_status *status)
     return a_;
 }
 
-/*----------------------------------------------------------------------------
-| Returns the result of converting the half-precision floating-point NaN
-| `a' to the canonical NaN format.  If `a' is a signaling NaN, the invalid
-| exception is raised.
-*----------------------------------------------------------------------------*/
-
-static commonNaNT float16ToCommonNaN(float16 a, float_status *status)
-{
-    commonNaNT z;
-
-    if (float16_is_signaling_nan(a, status)) {
-        float_raise(float_flag_invalid, status);
-    }
-    z.sign = float16_val(a) >> 15;
-    z.low = 0;
-    z.high = ((uint64_t) float16_val(a)) << 54;
-    return z;
-}
-
-/*----------------------------------------------------------------------------
-| Returns the result of converting the canonical NaN `a' to the half-
-| precision floating-point format.
-*----------------------------------------------------------------------------*/
-
-static float16 commonNaNToFloat16(commonNaNT a, float_status *status)
-{
-    uint16_t mantissa = a.high >> 54;
-
-    if (status->default_nan_mode) {
-        return float16_default_nan(status);
-    }
-
-    if (mantissa) {
-        return make_float16(((((uint16_t) a.sign) << 15)
-                             | (0x1F << 10) | mantissa));
-    } else {
-        return float16_default_nan(status);
-    }
-}
-
 #ifdef NO_SIGNALING_NANS
 int float32_is_quiet_nan(float32 a_, float_status *status)
 {
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 3adf6a06e4..042e5c901d 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -123,15 +123,6 @@ static inline int extractFloat16Exp(float16 a)
     return (float16_val(a) >> 10) & 0x1f;
 }
 
-/*----------------------------------------------------------------------------
-| Returns the sign bit of the single-precision floating-point value `a'.
-*----------------------------------------------------------------------------*/
-
-static inline flag extractFloat16Sign(float16 a)
-{
-    return float16_val(a)>>15;
-}
-
 /*----------------------------------------------------------------------------
 | Returns the fraction bits of the single-precision floating-point value `a'.
 *----------------------------------------------------------------------------*/
@@ -1194,6 +1185,90 @@ float64 float64_div(float64 a, float64 b, float_status *status)
     return float64_round_pack_canonical(pr, status);
 }
 
+/*
+ * Float to Float conversions
+ *
+ * Returns the result of converting one float format to another. The
+ * conversion is performed according to the IEC/IEEE Standard for
+ * Binary Floating-Point Arithmetic.
+ *
+ * The float_to_float helper only needs to take care of raising
+ * invalid exceptions and handling the conversion on NaNs.
+ */
+
+static FloatParts float_to_float(FloatParts a,
+                                 const FloatFmt *srcf, const FloatFmt *dstf,
+                                 float_status *s)
+{
+    if (is_nan(a.cls)) {
+
+        if (is_snan(a.cls)) {
+            s->float_exception_flags |= float_flag_invalid;
+        }
+
+        if (s->default_nan_mode) {
+            a.cls = float_class_dnan;
+            return a;
+        }
+
+        /*
+         * Our only option now is to "re-pack" the NaN. As the
+         * canonicalization process doesn't mess with fraction bits for
+         * NaNs we do it all here. We also reset a.exp to the
+         * destination format exp_max as the maybe_silence_nan code
+         * assumes it is correct (which it would be for non-conversions).
+         */
+        a.frac = a.frac << (64 - srcf->frac_size) >> (64 - dstf->frac_size);
+        a.exp = dstf->exp_max;
+        a.cls = float_class_msnan;
+    }
+
+    return a;
+}
+
+float32 float16_to_float32(float16 a, bool ieee, float_status *s)
+{
+    FloatParts p = float16_unpack_canonical(a, s);
+    FloatParts pr = float_to_float(p, &float16_params, &float32_params, s);
+    return float32_round_pack_canonical(pr, s);
+}
+
+float64 float16_to_float64(float16 a, bool ieee, float_status *s)
+{
+    FloatParts p = float16_unpack_canonical(a, s);
+    FloatParts pr = float_to_float(p, &float16_params, &float64_params, s);
+    return float64_round_pack_canonical(pr, s);
+}
+
+float16 float32_to_float16(float32 a, bool ieee, float_status *s)
+{
+    FloatParts p = float32_unpack_canonical(a, s);
+    FloatParts pr = float_to_float(p, &float32_params, &float16_params, s);
+    return float16_round_pack_canonical(pr, s);
+}
+
+float64 float32_to_float64(float32 a, float_status *s)
+{
+    FloatParts p = float32_unpack_canonical(a, s);
+    FloatParts pr = float_to_float(p, &float32_params, &float64_params, s);
+    return float64_round_pack_canonical(pr, s);
+}
+
+float16 float64_to_float16(float64 a, bool ieee, float_status *s)
+{
+    FloatParts p = float64_unpack_canonical(a, s);
+    FloatParts pr = float_to_float(p, &float64_params, &float16_params, s);
+    return float16_round_pack_canonical(pr, s);
+}
+
+float32 float64_to_float32(float64 a, float_status *s)
+{
+    FloatParts p = float64_unpack_canonical(a, s);
+    FloatParts pr = float_to_float(p, &float64_params, &float32_params, s);
+    return float32_round_pack_canonical(pr, s);
+}
+
+
 /*
  * Rounds the floating-point value `a' to an integer, and returns the
  * result as a floating-point value. The operation is performed
@@ -3142,41 +3217,6 @@ float128 uint64_to_float128(uint64_t a, float_status *status)
     return normalizeRoundAndPackFloat128(0, 0x406E, a, 0, status);
 }
 
-
-
-
-/*----------------------------------------------------------------------------
-| Returns the result of converting the single-precision floating-point value
-| `a' to the double-precision floating-point format.  The conversion is
-| performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic.
-*----------------------------------------------------------------------------*/
-
-float64 float32_to_float64(float32 a, float_status *status)
-{
-    flag aSign;
-    int aExp;
-    uint32_t aSig;
-    a = float32_squash_input_denormal(a, status);
-
-    aSig = extractFloat32Frac( a );
-    aExp = extractFloat32Exp( a );
-    aSign = extractFloat32Sign( a );
-    if ( aExp == 0xFF ) {
-        if (aSig) {
-            return commonNaNToFloat64(float32ToCommonNaN(a, status), status);
-        }
-        return packFloat64( aSign, 0x7FF, 0 );
-    }
-    if ( aExp == 0 ) {
-        if ( aSig == 0 ) return packFloat64( aSign, 0, 0 );
-        normalizeFloat32Subnormal( aSig, &aExp, &aSig );
-        --aExp;
-    }
-    return packFloat64( aSign, aExp + 0x380, ( (uint64_t) aSig )<<29 );
-
-}
-
 /*----------------------------------------------------------------------------
 | Returns the result of converting the single-precision floating-point value
 | `a' to the extended double-precision floating-point format.  The conversion
@@ -3695,173 +3735,6 @@ int float32_unordered_quiet(float32 a, float32 b, float_status *status)
     return 0;
 }
 
-
-/*----------------------------------------------------------------------------
-| Returns the result of converting the double-precision floating-point value
-| `a' to the single-precision floating-point format.  The conversion is
-| performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic.
-*----------------------------------------------------------------------------*/
-
-float32 float64_to_float32(float64 a, float_status *status)
-{
-    flag aSign;
-    int aExp;
-    uint64_t aSig;
-    uint32_t zSig;
-    a = float64_squash_input_denormal(a, status);
-
-    aSig = extractFloat64Frac( a );
-    aExp = extractFloat64Exp( a );
-    aSign = extractFloat64Sign( a );
-    if ( aExp == 0x7FF ) {
-        if (aSig) {
-            return commonNaNToFloat32(float64ToCommonNaN(a, status), status);
-        }
-        return packFloat32( aSign, 0xFF, 0 );
-    }
-    shift64RightJamming( aSig, 22, &aSig );
-    zSig = aSig;
-    if ( aExp || zSig ) {
-        zSig |= 0x40000000;
-        aExp -= 0x381;
-    }
-    return roundAndPackFloat32(aSign, aExp, zSig, status);
-
-}
-
-
-/*----------------------------------------------------------------------------
-| Packs the sign `zSign', exponent `zExp', and significand `zSig' into a
-| half-precision floating-point value, returning the result.  After being
-| shifted into the proper positions, the three fields are simply added
-| together to form the result.  This means that any integer portion of `zSig'
-| will be added into the exponent.  Since a properly normalized significand
-| will have an integer portion equal to 1, the `zExp' input should be 1 less
-| than the desired result exponent whenever `zSig' is a complete, normalized
-| significand.
-*----------------------------------------------------------------------------*/
-static float16 packFloat16(flag zSign, int zExp, uint16_t zSig)
-{
-    return make_float16(
-        (((uint32_t)zSign) << 15) + (((uint32_t)zExp) << 10) + zSig);
-}
-
-/*----------------------------------------------------------------------------
-| Takes an abstract floating-point value having sign `zSign', exponent `zExp',
-| and significand `zSig', and returns the proper half-precision floating-
-| point value corresponding to the abstract input.  Ordinarily, the abstract
-| value is simply rounded and packed into the half-precision format, with
-| the inexact exception raised if the abstract input cannot be represented
-| exactly.  However, if the abstract value is too large, the overflow and
-| inexact exceptions are raised and an infinity or maximal finite value is
-| returned.  If the abstract value is too small, the input value is rounded to
-| a subnormal number, and the underflow and inexact exceptions are raised if
-| the abstract input cannot be represented exactly as a subnormal half-
-| precision floating-point number.
-| The `ieee' flag indicates whether to use IEEE standard half precision, or
-| ARM-style "alternative representation", which omits the NaN and Inf
-| encodings in order to raise the maximum representable exponent by one.
-|     The input significand `zSig' has its binary point between bits 22
-| and 23, which is 13 bits to the left of the usual location.  This shifted
-| significand must be normalized or smaller.  If `zSig' is not normalized,
-| `zExp' must be 0; in that case, the result returned is a subnormal number,
-| and it must not require rounding.  In the usual case that `zSig' is
-| normalized, `zExp' must be 1 less than the ``true'' floating-point exponent.
-| Note the slightly odd position of the binary point in zSig compared with the
-| other roundAndPackFloat functions. This should probably be fixed if we
-| need to implement more float16 routines than just conversion.
-| The handling of underflow and overflow follows the IEC/IEEE Standard for
-| Binary Floating-Point Arithmetic.
-*----------------------------------------------------------------------------*/
-
-static float16 roundAndPackFloat16(flag zSign, int zExp,
-                                   uint32_t zSig, flag ieee,
-                                   float_status *status)
-{
-    int maxexp = ieee ? 29 : 30;
-    uint32_t mask;
-    uint32_t increment;
-    bool rounding_bumps_exp;
-    bool is_tiny = false;
-
-    /* Calculate the mask of bits of the mantissa which are not
-     * representable in half-precision and will be lost.
-     */
-    if (zExp < 1) {
-        /* Will be denormal in halfprec */
-        mask = 0x00ffffff;
-        if (zExp >= -11) {
-            mask >>= 11 + zExp;
-        }
-    } else {
-        /* Normal number in halfprec */
-        mask = 0x00001fff;
-    }
-
-    switch (status->float_rounding_mode) {
-    case float_round_nearest_even:
-        increment = (mask + 1) >> 1;
-        if ((zSig & mask) == increment) {
-            increment = zSig & (increment << 1);
-        }
-        break;
-    case float_round_ties_away:
-        increment = (mask + 1) >> 1;
-        break;
-    case float_round_up:
-        increment = zSign ? 0 : mask;
-        break;
-    case float_round_down:
-        increment = zSign ? mask : 0;
-        break;
-    default: /* round_to_zero */
-        increment = 0;
-        break;
-    }
-
-    rounding_bumps_exp = (zSig + increment >= 0x01000000);
-
-    if (zExp > maxexp || (zExp == maxexp && rounding_bumps_exp)) {
-        if (ieee) {
-            float_raise(float_flag_overflow | float_flag_inexact, status);
-            return packFloat16(zSign, 0x1f, 0);
-        } else {
-            float_raise(float_flag_invalid, status);
-            return packFloat16(zSign, 0x1f, 0x3ff);
-        }
-    }
-
-    if (zExp < 0) {
-        /* Note that flush-to-zero does not affect half-precision results */
-        is_tiny =
-            (status->float_detect_tininess == float_tininess_before_rounding)
-            || (zExp < -1)
-            || (!rounding_bumps_exp);
-    }
-    if (zSig & mask) {
-        float_raise(float_flag_inexact, status);
-        if (is_tiny) {
-            float_raise(float_flag_underflow, status);
-        }
-    }
-
-    zSig += increment;
-    if (rounding_bumps_exp) {
-        zSig >>= 1;
-        zExp++;
-    }
-
-    if (zExp < -10) {
-        return packFloat16(zSign, 0, 0);
-    }
-    if (zExp < 0) {
-        zSig >>= -zExp;
-        zExp = 0;
-    }
-    return packFloat16(zSign, zExp, zSig >> 13);
-}
-
 /*----------------------------------------------------------------------------
 | If `a' is denormal and we are in flush-to-zero mode then set the
 | input-denormal exception and return zero. Otherwise just return the value.
@@ -3877,163 +3750,6 @@ float16 float16_squash_input_denormal(float16 a, float_status *status)
     return a;
 }
 
-static void normalizeFloat16Subnormal(uint32_t aSig, int *zExpPtr,
-                                      uint32_t *zSigPtr)
-{
-    int8_t shiftCount = countLeadingZeros32(aSig) - 21;
-    *zSigPtr = aSig << shiftCount;
-    *zExpPtr = 1 - shiftCount;
-}
-
-/* Half precision floats come in two formats: standard IEEE and "ARM" format.
-   The latter gains extra exponent range by omitting the NaN/Inf encodings.  */
-
-float32 float16_to_float32(float16 a, flag ieee, float_status *status)
-{
-    flag aSign;
-    int aExp;
-    uint32_t aSig;
-
-    aSign = extractFloat16Sign(a);
-    aExp = extractFloat16Exp(a);
-    aSig = extractFloat16Frac(a);
-
-    if (aExp == 0x1f && ieee) {
-        if (aSig) {
-            return commonNaNToFloat32(float16ToCommonNaN(a, status), status);
-        }
-        return packFloat32(aSign, 0xff, 0);
-    }
-    if (aExp == 0) {
-        if (aSig == 0) {
-            return packFloat32(aSign, 0, 0);
-        }
-
-        normalizeFloat16Subnormal(aSig, &aExp, &aSig);
-        aExp--;
-    }
-    return packFloat32( aSign, aExp + 0x70, aSig << 13);
-}
-
-float16 float32_to_float16(float32 a, flag ieee, float_status *status)
-{
-    flag aSign;
-    int aExp;
-    uint32_t aSig;
-
-    a = float32_squash_input_denormal(a, status);
-
-    aSig = extractFloat32Frac( a );
-    aExp = extractFloat32Exp( a );
-    aSign = extractFloat32Sign( a );
-    if ( aExp == 0xFF ) {
-        if (aSig) {
-            /* Input is a NaN */
-            if (!ieee) {
-                float_raise(float_flag_invalid, status);
-                return packFloat16(aSign, 0, 0);
-            }
-            return commonNaNToFloat16(
-                float32ToCommonNaN(a, status), status);
-        }
-        /* Infinity */
-        if (!ieee) {
-            float_raise(float_flag_invalid, status);
-            return packFloat16(aSign, 0x1f, 0x3ff);
-        }
-        return packFloat16(aSign, 0x1f, 0);
-    }
-    if (aExp == 0 && aSig == 0) {
-        return packFloat16(aSign, 0, 0);
-    }
-    /* Decimal point between bits 22 and 23. Note that we add the 1 bit
-     * even if the input is denormal; however this is harmless because
-     * the largest possible single-precision denormal is still smaller
-     * than the smallest representable half-precision denormal, and so we
-     * will end up ignoring aSig and returning via the "always return zero"
-     * codepath.
-     */
-    aSig |= 0x00800000;
-    aExp -= 0x71;
-
-    return roundAndPackFloat16(aSign, aExp, aSig, ieee, status);
-}
-
-float64 float16_to_float64(float16 a, flag ieee, float_status *status)
-{
-    flag aSign;
-    int aExp;
-    uint32_t aSig;
-
-    aSign = extractFloat16Sign(a);
-    aExp = extractFloat16Exp(a);
-    aSig = extractFloat16Frac(a);
-
-    if (aExp == 0x1f && ieee) {
-        if (aSig) {
-            return commonNaNToFloat64(
-                float16ToCommonNaN(a, status), status);
-        }
-        return packFloat64(aSign, 0x7ff, 0);
-    }
-    if (aExp == 0) {
-        if (aSig == 0) {
-            return packFloat64(aSign, 0, 0);
-        }
-
-        normalizeFloat16Subnormal(aSig, &aExp, &aSig);
-        aExp--;
-    }
-    return packFloat64(aSign, aExp + 0x3f0, ((uint64_t)aSig) << 42);
-}
-
-float16 float64_to_float16(float64 a, flag ieee, float_status *status)
-{
-    flag aSign;
-    int aExp;
-    uint64_t aSig;
-    uint32_t zSig;
-
-    a = float64_squash_input_denormal(a, status);
-
-    aSig = extractFloat64Frac(a);
-    aExp = extractFloat64Exp(a);
-    aSign = extractFloat64Sign(a);
-    if (aExp == 0x7FF) {
-        if (aSig) {
-            /* Input is a NaN */
-            if (!ieee) {
-                float_raise(float_flag_invalid, status);
-                return packFloat16(aSign, 0, 0);
-            }
-            return commonNaNToFloat16(
-                float64ToCommonNaN(a, status), status);
-        }
-        /* Infinity */
-        if (!ieee) {
-            float_raise(float_flag_invalid, status);
-            return packFloat16(aSign, 0x1f, 0x3ff);
-        }
-        return packFloat16(aSign, 0x1f, 0);
-    }
-    shift64RightJamming(aSig, 29, &aSig);
-    zSig = aSig;
-    if (aExp == 0 && zSig == 0) {
-        return packFloat16(aSign, 0, 0);
-    }
-    /* Decimal point between bits 22 and 23. Note that we add the 1 bit
-     * even if the input is denormal; however this is harmless because
-     * the largest possible single-precision denormal is still smaller
-     * than the smallest representable half-precision denormal, and so we
-     * will end up ignoring aSig and returning via the "always return zero"
-     * codepath.
-     */
-    zSig |= 0x00800000;
-    aExp -= 0x3F1;
-
-    return roundAndPackFloat16(aSign, aExp, zSig, ieee, status);
-}
-
 /*----------------------------------------------------------------------------
 | Returns the result of converting the double-precision floating-point value
 | `a' to the extended double-precision floating-point format.  The conversion
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 36626a501b..01ef1c6b81 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -211,10 +211,10 @@ float128 uint64_to_float128(uint64_t, float_status *status);
 /*----------------------------------------------------------------------------
 | Software half-precision conversion routines.
 *----------------------------------------------------------------------------*/
-float16 float32_to_float16(float32, flag, float_status *status);
-float32 float16_to_float32(float16, flag, float_status *status);
-float16 float64_to_float16(float64 a, flag ieee, float_status *status);
-float64 float16_to_float64(float16 a, flag ieee, float_status *status);
+float16 float32_to_float16(float32, bool ieee, float_status *status);
+float32 float16_to_float32(float16, bool ieee, float_status *status);
+float16 float64_to_float16(float64 a, bool ieee, float_status *status);
+float64 float16_to_float64(float16 a, bool ieee, float_status *status);
 int16_t float16_to_int16(float16, float_status *status);
 uint16_t float16_to_uint16(float16 a, float_status *status);
 int16_t float16_to_int16_round_to_zero(float16, float_status *status);
-- 
2.17.0


* [Qemu-devel] [PATCH v3 3/5] fpu/softfloat: support ARM Alternative half-precision
  2018-05-10  9:42 [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Alex Bennée
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised Alex Bennée
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 2/5] fpu/softfloat: re-factor float to float conversions Alex Bennée
@ 2018-05-10  9:42 ` Alex Bennée
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 4/5] target/arm: convert conversion helpers to fpst/ahp_flag Alex Bennée
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Alex Bennée @ 2018-05-10  9:42 UTC
  To: peter.maydell
  Cc: richard.henderson, qemu-arm, qemu-devel, Alex Bennée,
	Aurelien Jarno

For float16 ARM supports an alternative half-precision format which
sacrifices the ability to represent NaN/Inf in return for a higher
dynamic range. To support this I've added an additional
FloatFmt (float16_params_ahp).

The new FloatFmt flag (arm_althp) is then used to modify the behaviour
of canonicalize and round_canonical with respect to representation and
exception raising.

Finally the float16_to_floatN and floatN_to_float16 conversion
routines select the new alternative FloatFmt when !ieee.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

---
v3
  - squash NaN to 0 if destination is AHP F16
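
To make the format difference concrete, here is a minimal standalone
sketch that decodes raw half-precision bits under either
interpretation and shows the two special-case results when narrowing
to AHP. All demo_* names are hypothetical and written for this note.
The sketch follows the behaviour described in this series (Inf becomes
sign:Ones, NaN becomes a signed zero, both raising Invalid) and does
not model rounding or the float_status flags.

```c
#include <math.h>      /* only for the INFINITY and NAN macros */
#include <stdbool.h>
#include <stdint.h>

/* Multiply x by 2^e without calling into libm. */
static double demo_scale(double x, int e)
{
    while (e > 0) { x *= 2.0; e--; }
    while (e < 0) { x *= 0.5; e++; }
    return x;
}

/* Decode raw float16 bits; ieee=false selects the ARM alternative
 * format, where exponent field 0x1f is an ordinary normal exponent. */
double demo_half_to_double(uint16_t bits, bool ieee)
{
    bool sign = bits >> 15;
    int exp = (bits >> 10) & 0x1f;
    int frac = bits & 0x3ff;
    double mag;

    if (exp == 0x1f && ieee) {
        mag = frac ? NAN : INFINITY;          /* IEEE-only encodings */
    } else if (exp == 0) {
        mag = demo_scale(frac, -24);          /* zero or subnormal */
    } else {
        /* (1 + frac/1024) * 2^(exp - 15) */
        mag = demo_scale(1024 + frac, exp - 15 - 10);
    }
    return sign ? -mag : mag;
}

/* Result bits when an Inf or NaN is converted to AHP: both set the
 * invalid indication; NaN gives a signed zero, Inf gives sign:Ones. */
uint16_t demo_to_ahp_special(bool sign, bool is_nan, bool *invalid)
{
    uint16_t s = (uint16_t)(sign << 15);

    *invalid = true;
    return is_nan ? s : (uint16_t)(s | 0x7fff);
}
```

With this sketch the largest IEEE half value, 0x7bff, decodes to 65504,
while 0x7fff read as AHP decodes to 131008, which is the extra dynamic
range the commit message refers to.
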
---
 fpu/softfloat.c | 108 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 85 insertions(+), 23 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 042e5c901d..79ebc998d3 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -225,6 +225,8 @@ typedef struct {
  *   frac_lsb: least significant bit of fraction
  *   fram_lsbm1: the bit bellow the least significant bit (for rounding)
  *   round_mask/roundeven_mask: masks used for rounding
+ * The following optional modifiers are available:
+ *   arm_althp: handle ARM Alternative Half Precision
  */
 typedef struct {
     int exp_size;
@@ -236,6 +238,7 @@ typedef struct {
     uint64_t frac_lsbm1;
     uint64_t round_mask;
     uint64_t roundeven_mask;
+    bool arm_althp;
 } FloatFmt;
 
 /* Expand fields based on the size of exponent and fraction */
@@ -248,12 +251,17 @@ typedef struct {
     .frac_lsb       = 1ull << (DECOMPOSED_BINARY_POINT - F),         \
     .frac_lsbm1     = 1ull << ((DECOMPOSED_BINARY_POINT - F) - 1),   \
     .round_mask     = (1ull << (DECOMPOSED_BINARY_POINT - F)) - 1,   \
-    .roundeven_mask = (2ull << (DECOMPOSED_BINARY_POINT - F)) - 1
+    .roundeven_mask = (2ull << (DECOMPOSED_BINARY_POINT - F)) - 1,
 
 static const FloatFmt float16_params = {
     FLOAT_PARAMS(5, 10)
 };
 
+static const FloatFmt float16_params_ahp = {
+    FLOAT_PARAMS(5, 10)
+    .arm_althp = true
+};
+
 static const FloatFmt float32_params = {
     FLOAT_PARAMS(8, 23)
 };
@@ -317,7 +325,7 @@ static inline float64 float64_pack_raw(FloatParts p)
 static FloatParts canonicalize(FloatParts part, const FloatFmt *parm,
                                float_status *status)
 {
-    if (part.exp == parm->exp_max) {
+    if (part.exp == parm->exp_max && !parm->arm_althp) {
         if (part.frac == 0) {
             part.cls = float_class_inf;
         } else {
@@ -403,8 +411,9 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
 
         exp += parm->exp_bias;
         if (likely(exp > 0)) {
+            bool maybe_inexact = false;
             if (frac & round_mask) {
-                flags |= float_flag_inexact;
+                maybe_inexact = true;
                 frac += inc;
                 if (frac & DECOMPOSED_OVERFLOW_BIT) {
                     frac >>= 1;
@@ -413,14 +422,26 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
             }
             frac >>= frac_shift;
 
-            if (unlikely(exp >= exp_max)) {
-                flags |= float_flag_overflow | float_flag_inexact;
-                if (overflow_norm) {
-                    exp = exp_max - 1;
-                    frac = -1;
-                } else {
-                    p.cls = float_class_inf;
-                    goto do_inf;
+            if (parm->arm_althp) {
+                if (unlikely(exp >= exp_max + 1)) {
+                        flags |= float_flag_invalid;
+                        frac = -1;
+                        exp = exp_max;
+                } else if (maybe_inexact) {
+                    flags |= float_flag_inexact;
+                }
+            } else {
+                if (unlikely(exp >= exp_max)) {
+                    flags |= float_flag_overflow | float_flag_inexact;
+                    if (overflow_norm) {
+                        exp = exp_max - 1;
+                        frac = -1;
+                    } else {
+                        p.cls = float_class_inf;
+                        goto do_inf;
+                    }
+                } else if (maybe_inexact) {
+                    flags |= float_flag_inexact;
                 }
             }
         } else if (s->flush_to_zero) {
@@ -465,7 +486,13 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
     case float_class_inf:
     do_inf:
         exp = exp_max;
-        frac = 0;
+        if (parm->arm_althp) {
+            flags |= float_flag_invalid;
+            /* Alt HP returns result = sign:Ones(M-1) */
+            frac = -1;
+        } else {
+            frac = 0;
+        }
         break;
 
     case float_class_qnan:
@@ -483,12 +510,21 @@ static FloatParts round_canonical(FloatParts p, float_status *s,
     return p;
 }
 
+/* Explicit FloatFmt version */
+static FloatParts float16a_unpack_canonical(const FloatFmt *params,
+                                            float16 f, float_status *s)
+{
+    return canonicalize(float16_unpack_raw(f), params, s);
+}
+
 static FloatParts float16_unpack_canonical(float16 f, float_status *s)
 {
-    return canonicalize(float16_unpack_raw(f), &float16_params, s);
+    return float16a_unpack_canonical(&float16_params, f, s);
 }
 
-static float16 float16_round_pack_canonical(FloatParts p, float_status *s)
+
+static float16 float16a_round_pack_canonical(const FloatFmt *params,
+                                             FloatParts p, float_status *s)
 {
     switch (p.cls) {
     case float_class_dnan:
@@ -496,11 +532,16 @@ static float16 float16_round_pack_canonical(FloatParts p, float_status *s)
     case float_class_msnan:
         return float16_maybe_silence_nan(float16_pack_raw(p), s);
     default:
-        p = round_canonical(p, s, &float16_params);
+        p = round_canonical(p, s, params);
         return float16_pack_raw(p);
     }
 }
 
+static float16 float16_round_pack_canonical(FloatParts p, float_status *s)
+{
+    return float16a_round_pack_canonical(&float16_params, p, s);
+}
+
 static FloatParts float32_unpack_canonical(float32 f, float_status *s)
 {
     return canonicalize(float32_unpack_raw(f), &float32_params, s);
@@ -1206,6 +1247,17 @@ static FloatParts float_to_float(FloatParts a,
             s->float_exception_flags |= float_flag_invalid;
         }
 
+        if (dstf->arm_althp) {
+            /* There is no NaN in the destination format: raise Invalid
+             * and return a zero with the sign of the input NaN.
+             */
+            s->float_exception_flags |= float_flag_invalid;
+            a.cls = float_class_zero;
+            a.frac = 0;
+            a.exp = 0;
+            return a;
+        }
+
         if (s->default_nan_mode) {
             a.cls = float_class_dnan;
             return a;
@@ -1226,25 +1278,34 @@ static FloatParts float_to_float(FloatParts a,
     return a;
 }
 
+/*
+ * Currently non-ieee implies ARM Alternative Half Precision handling
+ * for float16 values. If more are needed we'll need to expand the API
+ * into softfloat.
+ */
+
 float32 float16_to_float32(float16 a, bool ieee, float_status *s)
 {
-    FloatParts p = float16_unpack_canonical(a, s);
-    FloatParts pr = float_to_float(p, &float16_params, &float32_params, s);
+    const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
+    FloatParts p = float16a_unpack_canonical(fmt16, a, s);
+    FloatParts pr = float_to_float(p, fmt16, &float32_params, s);
     return float32_round_pack_canonical(pr, s);
 }
 
 float64 float16_to_float64(float16 a, bool ieee, float_status *s)
 {
-    FloatParts p = float16_unpack_canonical(a, s);
-    FloatParts pr = float_to_float(p, &float16_params, &float64_params, s);
+    const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
+    FloatParts p = float16a_unpack_canonical(fmt16, a, s);
+    FloatParts pr = float_to_float(p, fmt16, &float64_params, s);
     return float64_round_pack_canonical(pr, s);
 }
 
 float16 float32_to_float16(float32 a, bool ieee, float_status *s)
 {
+    const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
     FloatParts p = float32_unpack_canonical(a, s);
-    FloatParts pr = float_to_float(p, &float32_params, &float16_params, s);
-    return float16_round_pack_canonical(pr, s);
+    FloatParts pr = float_to_float(p, &float32_params, fmt16, s);
+    return float16a_round_pack_canonical(fmt16, pr, s);
 }
 
 float64 float32_to_float64(float32 a, float_status *s)
@@ -1256,9 +1317,10 @@ float64 float32_to_float64(float32 a, float_status *s)
 
 float16 float64_to_float16(float64 a, bool ieee, float_status *s)
 {
+    const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
     FloatParts p = float64_unpack_canonical(a, s);
-    FloatParts pr = float_to_float(p, &float64_params, &float16_params, s);
-    return float16_round_pack_canonical(pr, s);
+    FloatParts pr = float_to_float(p, &float64_params, fmt16, s);
+    return float16a_round_pack_canonical(fmt16, pr, s);
 }
 
 float32 float64_to_float32(float64 a, float_status *s)
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread
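
For readers unfamiliar with the format handled in patch 3/5, here is a
minimal standalone sketch (not part of the series) of how an ARM
Alternative half-precision value decodes. The only difference from IEEE
half precision is that exponent field 31 is an ordinary normal exponent,
so no bit pattern is an infinity or a NaN. The helper name
ahp_to_double() is hypothetical and is not a softfloat function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode a 16-bit ARM Alternative half-precision
 * value. Exponent field 31 is treated as a normal exponent, so there
 * are no infinities or NaNs in this format.
 */
static double ahp_to_double(uint16_t h)
{
    unsigned sign = (h >> 15) & 1;
    unsigned exp = (h >> 10) & 0x1f;
    unsigned frac = h & 0x3ff;
    double mag;

    if (exp == 0) {
        /* zero or denormal: frac * 2^-24 */
        mag = (double)frac / 16777216.0;
    } else {
        /* normal, including exp == 31: (1024 + frac) * 2^(exp - 25) */
        mag = (double)((uint64_t)(1024u + frac) << exp) / 33554432.0;
    }
    return sign ? -mag : mag;
}

void example_ahp_decode(void)
{
    assert(ahp_to_double(0x3c00) == 1.0);       /* same as IEEE for normals */
    assert(ahp_to_double(0x7c00) == 65536.0);   /* IEEE reads this as +Inf */
    assert(ahp_to_double(0x7fff) == 131008.0);  /* largest AHP magnitude */
}
```

This is why the patch has nothing sensible to return for an overflow or
a NaN input other than sign:Ones or a signed zero, with Invalid raised
in both cases.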

* [Qemu-devel] [PATCH v3 4/5] target/arm: convert conversion helpers to fpst/ahp_flag
  2018-05-10  9:42 [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Alex Bennée
                   ` (2 preceding siblings ...)
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 3/5] fpu/softfloat: support ARM Alternative half-precision Alex Bennée
@ 2018-05-10  9:42 ` Alex Bennée
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 5/5] target/arm: squash FZ16 behaviour for conversions Alex Bennée
  2018-05-10 12:55 ` [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Peter Maydell
  5 siblings, 0 replies; 12+ messages in thread
From: Alex Bennée @ 2018-05-10  9:42 UTC (permalink / raw)
  To: peter.maydell; +Cc: richard.henderson, qemu-arm, qemu-devel, Alex Bennée

Instead of passing env and leaving it up to the helper to get the
right fpstatus, we pass it explicitly. There was already a
get_fpstatus_ptr() helper for neon for the 32-bit code. We also add a
get_ahp_flag() for passing the state of the alternative FP16 format
flag. This leaves scope for later tracking the AHP state in
translation flags.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 target/arm/helper.c        | 58 ++++++++++++-------------------
 target/arm/helper.h        | 12 +++----
 target/arm/translate-a64.c | 38 +++++++++++++++++----
 target/arm/translate.c     | 70 +++++++++++++++++++++++++++++---------
 target/arm/translate.h     | 15 ++++++++
 5 files changed, 128 insertions(+), 65 deletions(-)
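
As a plain C illustration of what get_ahp_flag() computes at translate
time (a sketch only, not code from this patch): the flag is bit 26 of
the FPSCR, and the softfloat entry points take the inverse of it as
their "ieee" argument. The macro and function names below are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FPSCR_AHP_SHIFT 26   /* hypothetical name for the bit position */

/* Equivalent of extracting a 1-bit field at bit 26 of the FPSCR. */
static bool fpscr_ahp_enabled(uint32_t fpscr)
{
    return (fpscr >> FPSCR_AHP_SHIFT) & 1;
}

/* The conversion routines take "ieee", which is simply !ahp. */
static bool fpscr_wants_ieee_half(uint32_t fpscr)
{
    return !fpscr_ahp_enabled(fpscr);
}

void example_ahp_flag(void)
{
    assert(fpscr_ahp_enabled(UINT32_C(1) << 26));
    assert(!fpscr_ahp_enabled(0));
    assert(fpscr_wants_ieee_half(0));
}
```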

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0fef5d4d06..4dd28bb70c 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11457,64 +11457,50 @@ uint32_t HELPER(set_neon_rmode)(uint32_t rmode, CPUARMState *env)
 }
 
 /* Half precision conversions.  */
-static float32 do_fcvt_f16_to_f32(uint32_t a, CPUARMState *env, float_status *s)
+static float32 do_fcvt_f16_to_f32(float16 a, float_status *s, bool ahp)
 {
-    int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
-    float32 r = float16_to_float32(make_float16(a), ieee, s);
-    if (ieee) {
-        return float32_maybe_silence_nan(r, s);
-    }
-    return r;
+    return float16_to_float32(a, !ahp, s);
 }
 
-static uint32_t do_fcvt_f32_to_f16(float32 a, CPUARMState *env, float_status *s)
+static float16 do_fcvt_f32_to_f16(float32 a, float_status *s, bool ahp)
 {
-    int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
-    float16 r = float32_to_float16(a, ieee, s);
-    if (ieee) {
-        r = float16_maybe_silence_nan(r, s);
-    }
-    return float16_val(r);
+    return float32_to_float16(a, !ahp, s);
 }
 
-float32 HELPER(neon_fcvt_f16_to_f32)(uint32_t a, CPUARMState *env)
+float32 HELPER(neon_fcvt_f16_to_f32)(float16 a, void *fpstp, uint32_t ahp_mode)
 {
-    return do_fcvt_f16_to_f32(a, env, &env->vfp.standard_fp_status);
+    float_status *fpst = fpstp;
+    return do_fcvt_f16_to_f32(a, fpst, ahp_mode);
 }
 
-uint32_t HELPER(neon_fcvt_f32_to_f16)(float32 a, CPUARMState *env)
+float16 HELPER(neon_fcvt_f32_to_f16)(float32 a, void *fpstp, uint32_t ahp_mode)
 {
-    return do_fcvt_f32_to_f16(a, env, &env->vfp.standard_fp_status);
+    float_status *fpst = fpstp;
+    return do_fcvt_f32_to_f16(a, fpst, ahp_mode);
 }
 
-float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, CPUARMState *env)
+float32 HELPER(vfp_fcvt_f16_to_f32)(float16 a, void *fpstp, uint32_t ahp_mode)
 {
-    return do_fcvt_f16_to_f32(a, env, &env->vfp.fp_status);
+    float_status *fpst = fpstp;
+    return do_fcvt_f16_to_f32(a, fpst, ahp_mode);
 }
 
-uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, CPUARMState *env)
+float16 HELPER(vfp_fcvt_f32_to_f16)(float32 a, void *fpstp, uint32_t ahp_mode)
 {
-    return do_fcvt_f32_to_f16(a, env, &env->vfp.fp_status);
+    float_status *fpst = fpstp;
+    return do_fcvt_f32_to_f16(a, fpst, ahp_mode);
 }
 
-float64 HELPER(vfp_fcvt_f16_to_f64)(uint32_t a, CPUARMState *env)
+float64 HELPER(vfp_fcvt_f16_to_f64)(float16 a, void *fpstp, uint32_t ahp_mode)
 {
-    int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
-    float64 r = float16_to_float64(make_float16(a), ieee, &env->vfp.fp_status);
-    if (ieee) {
-        return float64_maybe_silence_nan(r, &env->vfp.fp_status);
-    }
-    return r;
+    float_status *fpst = fpstp;
+    return float16_to_float64(a, !ahp_mode, fpst);
 }
 
-uint32_t HELPER(vfp_fcvt_f64_to_f16)(float64 a, CPUARMState *env)
+float16 HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
 {
-    int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
-    float16 r = float64_to_float16(a, ieee, &env->vfp.fp_status);
-    if (ieee) {
-        r = float16_maybe_silence_nan(r, &env->vfp.fp_status);
-    }
-    return float16_val(r);
+    float_status *fpst = fpstp;
+    return float64_to_float16(a, !ahp_mode, fpst);
 }
 
 #define float32_two make_float32(0x40000000)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 34e8cc8904..288480a0e7 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -181,12 +181,12 @@ DEF_HELPER_3(vfp_ultoh, f16, i32, i32, ptr)
 DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, ptr)
 DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
 
-DEF_HELPER_2(vfp_fcvt_f16_to_f32, f32, i32, env)
-DEF_HELPER_2(vfp_fcvt_f32_to_f16, i32, f32, env)
-DEF_HELPER_2(neon_fcvt_f16_to_f32, f32, i32, env)
-DEF_HELPER_2(neon_fcvt_f32_to_f16, i32, f32, env)
-DEF_HELPER_FLAGS_2(vfp_fcvt_f16_to_f64, TCG_CALL_NO_RWG, f64, i32, env)
-DEF_HELPER_FLAGS_2(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, i32, f64, env)
+DEF_HELPER_3(vfp_fcvt_f16_to_f32, f32, f16, ptr, i32)
+DEF_HELPER_3(vfp_fcvt_f32_to_f16, f16, f32, ptr, i32)
+DEF_HELPER_3(neon_fcvt_f16_to_f32, f32, f16, ptr, i32)
+DEF_HELPER_3(neon_fcvt_f32_to_f16, f16, f32, ptr, i32)
+DEF_HELPER_FLAGS_3(vfp_fcvt_f16_to_f64, TCG_CALL_NO_RWG, f64, f16, ptr, i32)
+DEF_HELPER_FLAGS_3(vfp_fcvt_f64_to_f16, TCG_CALL_NO_RWG, f16, f64, ptr, i32)
 
 DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
 DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 6d49f30b4a..00a7c63240 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4830,10 +4830,15 @@ static void handle_fp_fcvt(DisasContext *s, int opcode,
         } else {
             /* Single to half */
             TCGv_i32 tcg_rd = tcg_temp_new_i32();
-            gen_helper_vfp_fcvt_f32_to_f16(tcg_rd, tcg_rn, cpu_env);
+            TCGv_i32 ahp = get_ahp_flag();
+            TCGv_ptr fpst = get_fpstatus_ptr(true);
+
+            gen_helper_vfp_fcvt_f32_to_f16(tcg_rd, tcg_rn, fpst, ahp);
             /* write_fp_sreg is OK here because top half of tcg_rd is zero */
             write_fp_sreg(s, rd, tcg_rd);
             tcg_temp_free_i32(tcg_rd);
+            tcg_temp_free_i32(ahp);
+            tcg_temp_free_ptr(fpst);
         }
         tcg_temp_free_i32(tcg_rn);
         break;
@@ -4846,9 +4851,13 @@ static void handle_fp_fcvt(DisasContext *s, int opcode,
             /* Double to single */
             gen_helper_vfp_fcvtsd(tcg_rd, tcg_rn, cpu_env);
         } else {
+            TCGv_ptr fpst = get_fpstatus_ptr(true);
+            TCGv_i32 ahp = get_ahp_flag();
             /* Double to half */
-            gen_helper_vfp_fcvt_f64_to_f16(tcg_rd, tcg_rn, cpu_env);
+            gen_helper_vfp_fcvt_f64_to_f16(tcg_rd, tcg_rn, fpst, ahp);
             /* write_fp_sreg is OK here because top half of tcg_rd is zero */
+            tcg_temp_free_ptr(fpst);
+            tcg_temp_free_i32(ahp);
         }
         write_fp_sreg(s, rd, tcg_rd);
         tcg_temp_free_i32(tcg_rd);
@@ -4858,17 +4867,21 @@ static void handle_fp_fcvt(DisasContext *s, int opcode,
     case 0x3:
     {
         TCGv_i32 tcg_rn = read_fp_sreg(s, rn);
+        TCGv_ptr tcg_fpst = get_fpstatus_ptr(true);
+        TCGv_i32 tcg_ahp = get_ahp_flag();
         tcg_gen_ext16u_i32(tcg_rn, tcg_rn);
         if (dtype == 0) {
             /* Half to single */
             TCGv_i32 tcg_rd = tcg_temp_new_i32();
-            gen_helper_vfp_fcvt_f16_to_f32(tcg_rd, tcg_rn, cpu_env);
+            gen_helper_vfp_fcvt_f16_to_f32(tcg_rd, tcg_rn, tcg_fpst, tcg_ahp);
             write_fp_sreg(s, rd, tcg_rd);
+            tcg_temp_free_ptr(tcg_fpst);
+            tcg_temp_free_i32(tcg_ahp);
             tcg_temp_free_i32(tcg_rd);
         } else {
             /* Half to double */
             TCGv_i64 tcg_rd = tcg_temp_new_i64();
-            gen_helper_vfp_fcvt_f16_to_f64(tcg_rd, tcg_rn, cpu_env);
+            gen_helper_vfp_fcvt_f16_to_f64(tcg_rd, tcg_rn, tcg_fpst, tcg_ahp);
             write_fp_dreg(s, rd, tcg_rd);
             tcg_temp_free_i64(tcg_rd);
         }
@@ -8487,12 +8500,17 @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
             } else {
                 TCGv_i32 tcg_lo = tcg_temp_new_i32();
                 TCGv_i32 tcg_hi = tcg_temp_new_i32();
+                TCGv_ptr fpst = get_fpstatus_ptr(true);
+                TCGv_i32 ahp = get_ahp_flag();
+
                 tcg_gen_extr_i64_i32(tcg_lo, tcg_hi, tcg_op);
-                gen_helper_vfp_fcvt_f32_to_f16(tcg_lo, tcg_lo, cpu_env);
-                gen_helper_vfp_fcvt_f32_to_f16(tcg_hi, tcg_hi, cpu_env);
+                gen_helper_vfp_fcvt_f32_to_f16(tcg_lo, tcg_lo, fpst, ahp);
+                gen_helper_vfp_fcvt_f32_to_f16(tcg_hi, tcg_hi, fpst, ahp);
                 tcg_gen_deposit_i32(tcg_res[pass], tcg_lo, tcg_hi, 16, 16);
                 tcg_temp_free_i32(tcg_lo);
                 tcg_temp_free_i32(tcg_hi);
+                tcg_temp_free_ptr(fpst);
+                tcg_temp_free_i32(ahp);
             }
             break;
         case 0x56:  /* FCVTXN, FCVTXN2 */
@@ -10987,18 +11005,24 @@ static void handle_2misc_widening(DisasContext *s, int opcode, bool is_q,
         /* 16 -> 32 bit fp conversion */
         int srcelt = is_q ? 4 : 0;
         TCGv_i32 tcg_res[4];
+        TCGv_ptr fpst = get_fpstatus_ptr(true);
+        TCGv_i32 ahp = get_ahp_flag();
+
 
         for (pass = 0; pass < 4; pass++) {
             tcg_res[pass] = tcg_temp_new_i32();
 
             read_vec_element_i32(s, tcg_res[pass], rn, srcelt + pass, MO_16);
             gen_helper_vfp_fcvt_f16_to_f32(tcg_res[pass], tcg_res[pass],
-                                           cpu_env);
+                                           fpst, ahp);
         }
         for (pass = 0; pass < 4; pass++) {
             write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_32);
             tcg_temp_free_i32(tcg_res[pass]);
         }
+
+        tcg_temp_free_ptr(fpst);
+        tcg_temp_free_i32(ahp);
     }
 }
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index ad208867a7..5eab9d585a 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3824,53 +3824,75 @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                         gen_vfp_sqrt(dp);
                         break;
                     case 4: /* vcvtb.f32.f16, vcvtb.f64.f16 */
+                    {
+                        TCGv_ptr fpst = get_fpstatus_ptr(false);
+                        TCGv_i32 ahp_mode = get_ahp_flag();
                         tmp = gen_vfp_mrs();
                         tcg_gen_ext16u_i32(tmp, tmp);
                         if (dp) {
                             gen_helper_vfp_fcvt_f16_to_f64(cpu_F0d, tmp,
-                                                           cpu_env);
+                                                           fpst, ahp_mode);
                         } else {
                             gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp,
-                                                           cpu_env);
+                                                           fpst, ahp_mode);
                         }
+                        tcg_temp_free_i32(ahp_mode);
+                        tcg_temp_free_ptr(fpst);
                         tcg_temp_free_i32(tmp);
                         break;
+                    }
                     case 5: /* vcvtt.f32.f16, vcvtt.f64.f16 */
+                    {
+                        TCGv_ptr fpst = get_fpstatus_ptr(false);
+                        TCGv_i32 ahp = get_ahp_flag();
                         tmp = gen_vfp_mrs();
                         tcg_gen_shri_i32(tmp, tmp, 16);
                         if (dp) {
                             gen_helper_vfp_fcvt_f16_to_f64(cpu_F0d, tmp,
-                                                           cpu_env);
+                                                           fpst, ahp);
                         } else {
                             gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp,
-                                                           cpu_env);
+                                                           fpst, ahp);
                         }
                         tcg_temp_free_i32(tmp);
+                        tcg_temp_free_i32(ahp);
+                        tcg_temp_free_ptr(fpst);
                         break;
+                    }
                     case 6: /* vcvtb.f16.f32, vcvtb.f16.f64 */
+                    {
+                        TCGv_ptr fpst = get_fpstatus_ptr(false);
+                        TCGv_i32 ahp = get_ahp_flag();
                         tmp = tcg_temp_new_i32();
+
                         if (dp) {
                             gen_helper_vfp_fcvt_f64_to_f16(tmp, cpu_F0d,
-                                                           cpu_env);
+                                                           fpst, ahp);
                         } else {
                             gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s,
-                                                           cpu_env);
+                                                           fpst, ahp);
                         }
                         gen_mov_F0_vreg(0, rd);
                         tmp2 = gen_vfp_mrs();
                         tcg_gen_andi_i32(tmp2, tmp2, 0xffff0000);
                         tcg_gen_or_i32(tmp, tmp, tmp2);
                         tcg_temp_free_i32(tmp2);
+                        tcg_temp_free_i32(ahp);
+                        tcg_temp_free_ptr(fpst);
                         gen_vfp_msr(tmp);
                         break;
+                    }
                     case 7: /* vcvtt.f16.f32, vcvtt.f16.f64 */
+                    {
+                        TCGv_ptr fpst = get_fpstatus_ptr(false);
+                        TCGv_i32 ahp = get_ahp_flag();
                         tmp = tcg_temp_new_i32();
                         if (dp) {
                             gen_helper_vfp_fcvt_f64_to_f16(tmp, cpu_F0d,
-                                                           cpu_env);
+                                                           fpst, ahp);
                         } else {
                             gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s,
-                                                           cpu_env);
+                                                           fpst, ahp);
                         }
                         tcg_gen_shli_i32(tmp, tmp, 16);
                         gen_mov_F0_vreg(0, rd);
@@ -3880,6 +3902,7 @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                         tcg_temp_free_i32(tmp2);
                         gen_vfp_msr(tmp);
                         break;
+                    }
                     case 8: /* cmp */
                         gen_vfp_cmp(dp);
                         break;
@@ -7222,53 +7245,68 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     }
                     break;
                 case NEON_2RM_VCVT_F16_F32:
+                {
+                    TCGv_ptr fpst;
+                    TCGv_i32 ahp;
+
                     if (!arm_dc_feature(s, ARM_FEATURE_VFP_FP16) ||
                         q || (rm & 1)) {
                         return 1;
                     }
                     tmp = tcg_temp_new_i32();
                     tmp2 = tcg_temp_new_i32();
+                    fpst = get_fpstatus_ptr(true);
+                    ahp = get_ahp_flag();
                     tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 0));
-                    gen_helper_neon_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
+                    gen_helper_neon_fcvt_f32_to_f16(tmp, cpu_F0s, fpst, ahp);
                     tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 1));
-                    gen_helper_neon_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
+                    gen_helper_neon_fcvt_f32_to_f16(tmp2, cpu_F0s, fpst, ahp);
                     tcg_gen_shli_i32(tmp2, tmp2, 16);
                     tcg_gen_or_i32(tmp2, tmp2, tmp);
                     tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 2));
-                    gen_helper_neon_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
+                    gen_helper_neon_fcvt_f32_to_f16(tmp, cpu_F0s, fpst, ahp);
                     tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 3));
                     neon_store_reg(rd, 0, tmp2);
                     tmp2 = tcg_temp_new_i32();
-                    gen_helper_neon_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
+                    gen_helper_neon_fcvt_f32_to_f16(tmp2, cpu_F0s, fpst, ahp);
                     tcg_gen_shli_i32(tmp2, tmp2, 16);
                     tcg_gen_or_i32(tmp2, tmp2, tmp);
                     neon_store_reg(rd, 1, tmp2);
                     tcg_temp_free_i32(tmp);
+                    tcg_temp_free_i32(ahp);
+                    tcg_temp_free_ptr(fpst);
                     break;
+                }
                 case NEON_2RM_VCVT_F32_F16:
+                {
+                    TCGv_ptr fpst;
+                    TCGv_i32 ahp;
                     if (!arm_dc_feature(s, ARM_FEATURE_VFP_FP16) ||
                         q || (rd & 1)) {
                         return 1;
                     }
+                    fpst = get_fpstatus_ptr(true);
+                    ahp = get_ahp_flag();
                     tmp3 = tcg_temp_new_i32();
                     tmp = neon_load_reg(rm, 0);
                     tmp2 = neon_load_reg(rm, 1);
                     tcg_gen_ext16u_i32(tmp3, tmp);
-                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, cpu_env);
+                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, fpst, ahp);
                     tcg_gen_st_f32(cpu_F0s, cpu_env, neon_reg_offset(rd, 0));
                     tcg_gen_shri_i32(tmp3, tmp, 16);
-                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, cpu_env);
+                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, fpst, ahp);
                     tcg_gen_st_f32(cpu_F0s, cpu_env, neon_reg_offset(rd, 1));
                     tcg_temp_free_i32(tmp);
                     tcg_gen_ext16u_i32(tmp3, tmp2);
-                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, cpu_env);
+                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, fpst, ahp);
                     tcg_gen_st_f32(cpu_F0s, cpu_env, neon_reg_offset(rd, 2));
                     tcg_gen_shri_i32(tmp3, tmp2, 16);
-                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, cpu_env);
+                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, fpst, ahp);
                     tcg_gen_st_f32(cpu_F0s, cpu_env, neon_reg_offset(rd, 3));
                     tcg_temp_free_i32(tmp2);
                     tcg_temp_free_i32(tmp3);
                     break;
+                }
                 case NEON_2RM_AESE: case NEON_2RM_AESMC:
                     if (!arm_dc_feature(s, ARM_FEATURE_V8_AES)
                         || ((rm | rd) & 1)) {
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 4428c98e2e..41d0b8cd9a 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -177,4 +177,19 @@ void arm_free_cc(DisasCompare *cmp);
 void arm_jump_cc(DisasCompare *cmp, TCGLabel *label);
 void arm_gen_test_cc(int cc, TCGLabel *label);
 
+/* Return state of Alternative Half-precision flag, caller frees result */
+static inline TCGv_i32 get_ahp_flag(void)
+{
+    TCGv_i32 fpscr = tcg_temp_new_i32();
+    TCGv_i32 ahp_mode = tcg_temp_new_i32();
+
+    tcg_gen_ld_i32(fpscr, cpu_env, offsetof(CPUARMState,
+                                            vfp.xregs[ARM_VFP_FPSCR]));
+    tcg_gen_extract_i32(ahp_mode, fpscr, 26, 1);
+
+    tcg_temp_free_i32(fpscr);
+
+    return ahp_mode;
+}
+
 #endif /* TARGET_ARM_TRANSLATE_H */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v3 5/5] target/arm: squash FZ16 behaviour for conversions
  2018-05-10  9:42 [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Alex Bennée
                   ` (3 preceding siblings ...)
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 4/5] target/arm: convert conversion helpers to fpst/ahp_flag Alex Bennée
@ 2018-05-10  9:42 ` Alex Bennée
  2018-05-10 12:55 ` [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Peter Maydell
  5 siblings, 0 replies; 12+ messages in thread
From: Alex Bennée @ 2018-05-10  9:42 UTC (permalink / raw)
  To: peter.maydell; +Cc: richard.henderson, qemu-arm, qemu-devel, Alex Bennée

The ARM ARM specifies that FZ16 is suppressed for conversions. Rather
than pushing this logic into the softfloat code, we can simply save the
FZ state and temporarily disable it for the softfloat call.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 target/arm/helper.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)
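
The save/disable/restore pattern described above can be sketched in
isolation like this. It is an illustration only: mock_float_status and
mock_convert() are hypothetical stand-ins for QEMU's float_status and a
softfloat conversion call, not the real types.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock stand-in for the one float_status field used here. */
typedef struct {
    bool flush_to_zero;
} mock_float_status;

/* Stand-in for a softfloat conversion: reports the mode it ran under. */
static bool mock_convert(const mock_float_status *s)
{
    return s->flush_to_zero;
}

/* Run the conversion with flush-to-zero forced off, then restore it. */
static bool convert_without_fz(mock_float_status *s)
{
    bool saved = s->flush_to_zero;
    bool seen;

    s->flush_to_zero = false;
    seen = mock_convert(s);
    s->flush_to_zero = saved;
    return seen;
}

void example_fz_squash(void)
{
    mock_float_status s = { .flush_to_zero = true };

    assert(convert_without_fz(&s) == false);  /* conversion saw FZ off */
    assert(s.flush_to_zero == true);          /* caller's setting restored */
}
```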

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 4dd28bb70c..17147be58b 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11459,12 +11459,20 @@ uint32_t HELPER(set_neon_rmode)(uint32_t rmode, CPUARMState *env)
 /* Half precision conversions.  */
 static float32 do_fcvt_f16_to_f32(float16 a, float_status *s, bool ahp)
 {
-    return float16_to_float32(a, !ahp, s);
+    flag save_flush_to_zero = s->flush_to_zero;
+    set_flush_to_zero(false, s);
+    float32 r = float16_to_float32(a, !ahp, s);
+    set_flush_to_zero(save_flush_to_zero, s);
+    return r;
 }
 
 static float16 do_fcvt_f32_to_f16(float32 a, float_status *s, bool ahp)
 {
-    return float32_to_float16(a, !ahp, s);
+    flag save_flush_to_zero = s->flush_to_zero;
+    set_flush_to_zero(false, s);
+    float16 r = float32_to_float16(a, !ahp, s);
+    set_flush_to_zero(save_flush_to_zero, s);
+    return float16_val(r);
 }
 
 float32 HELPER(neon_fcvt_f16_to_f32)(float16 a, void *fpstp, uint32_t ahp_mode)
@@ -11494,13 +11502,21 @@ float16 HELPER(vfp_fcvt_f32_to_f16)(float32 a, void *fpstp, uint32_t ahp_mode)
 float64 HELPER(vfp_fcvt_f16_to_f64)(float16 a, void *fpstp, uint32_t ahp_mode)
 {
     float_status *fpst = fpstp;
-    return float16_to_float64(a, !ahp_mode, fpst);
+    flag save_flush_to_zero = fpst->flush_to_zero;
+    set_flush_to_zero(false, fpst);
+    float64 r = float16_to_float64(a, !ahp_mode, fpst);
+    set_flush_to_zero(save_flush_to_zero, fpst);
+    return r;
 }
 
 float16 HELPER(vfp_fcvt_f64_to_f16)(float64 a, void *fpstp, uint32_t ahp_mode)
 {
     float_status *fpst = fpstp;
-    return float64_to_float16(a, !ahp_mode, fpst);
+    flag save_flush_to_zero = fpst->flush_to_zero;
+    set_flush_to_zero(false, fpst);
+    float16 r = float64_to_float16(a, !ahp_mode, fpst);
+    set_flush_to_zero(save_flush_to_zero, fpst);
+    return float16_val(r);
 }
 
 #define float32_two make_float32(0x40000000)
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised Alex Bennée
@ 2018-05-10 12:40   ` Peter Maydell
  2018-05-10 14:50   ` Richard Henderson
  1 sibling, 0 replies; 12+ messages in thread
From: Peter Maydell @ 2018-05-10 12:40 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Richard Henderson, qemu-arm, QEMU Developers, Aurelien Jarno

On 10 May 2018 at 10:42, Alex Bennée <alex.bennee@linaro.org> wrote:
> Reported by Coverity (CID1390635). We ensure this for uint_to_float
> later on so we might as well mirror that.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  fpu/softfloat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> index 70e0c40a1c..3adf6a06e4 100644
> --- a/fpu/softfloat.c
> +++ b/fpu/softfloat.c
> @@ -1517,7 +1517,7 @@ FLOAT_TO_UINT(64, 64)
>
>  static FloatParts int_to_float(int64_t a, float_status *status)
>  {
> -    FloatParts r;
> +    FloatParts r = {};
>      if (a == 0) {
>          r.cls = float_class_zero;
>          r.sign = false;
> --

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/5] fpu/softfloat: re-factor float to float conversions
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 2/5] fpu/softfloat: re-factor float to float conversions Alex Bennée
@ 2018-05-10 12:48   ` Peter Maydell
  2018-05-10 13:03     ` Alex Bennée
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Maydell @ 2018-05-10 12:48 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Richard Henderson, qemu-arm, QEMU Developers, Aurelien Jarno

On 10 May 2018 at 10:42, Alex Bennée <alex.bennee@linaro.org> wrote:
> This allows us to delete a lot of additional boilerplate code which is
> no longer needed. Currently the ieee flag is ignored (everything is
> assumed to be ieee). Handling for ARM AHP will be in the next patch.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>
> ---
> v2
>   - pass FloatFmt to float_to_float instead of sizes
>   - split AHP handling to another patch
>   - use rth's suggested re-packing (+ setting .exp)
> v3
>   - also rm extractFloat16Sign
> ---
>  fpu/softfloat-specialize.h |  40 ----
>  fpu/softfloat.c            | 452 +++++++------------------------------
>  include/fpu/softfloat.h    |   8 +-
>  3 files changed, 88 insertions(+), 412 deletions(-)

This introduces a regression where we don't get tininess-before-rounding
for double/single to halfprec conversions. This is because we're
now using the fp_status_f16 status field, and it has not had
the detect_tininess setting initialized. This fixes it:

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index d175c5e94f..7939c6b8ae 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -324,6 +324,8 @@ static void arm_cpu_reset(CPUState *s)
                               &env->vfp.fp_status);
     set_float_detect_tininess(float_tininess_before_rounding,
                               &env->vfp.standard_fp_status);
+    set_float_detect_tininess(float_tininess_before_rounding,
+                              &env->vfp.fp_status_f16);
 #ifndef CONFIG_USER_ONLY
     if (kvm_enabled()) {
         kvm_arm_reset_vcpu(cpu);

(You can see this if you try something like fcvt h1, d0 where
d0 == 0x3f0f_ffff_ffff_ffff -- we get the right answer of 0x0400
but fail to set Underflow as well as Inexact.)
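
To spell out why that input distinguishes the two tininess modes, here
is a small standalone sketch. It is not softfloat code: the helper names
are hypothetical, and round_to_11_bits() assumes a normal, finite double
and round-to-nearest-even. The exact value is 2^-14 - 2^-67, just below
the smallest normal half-precision number, but it rounds up to exactly
2^-14.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Smallest normal IEEE half-precision magnitude, 2^-14. */
#define F16_MIN_NORMAL 0x1p-14

static double double_from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof(d));
    return d;
}

/* Round a normal double to an 11-bit significand (10 fraction bits),
 * nearest-even, with unbounded exponent range. */
static double round_to_11_bits(double d)
{
    const uint64_t low_mask = (UINT64_C(1) << 42) - 1;
    const uint64_t half = UINT64_C(1) << 41;
    uint64_t bits, rem;

    memcpy(&bits, &d, sizeof(bits));
    rem = bits & low_mask;
    bits &= ~low_mask;
    if (rem > half || (rem == half && (bits & (UINT64_C(1) << 42)))) {
        bits += UINT64_C(1) << 42;  /* a carry may bump the exponent */
    }
    return double_from_bits(bits);
}

static double magnitude(double d)
{
    return d < 0 ? -d : d;
}

/* Tininess before rounding: judge the exact value. */
static int tiny_before_rounding(double exact)
{
    return exact != 0.0 && magnitude(exact) < F16_MIN_NORMAL;
}

/* Tininess after rounding: judge the rounded value instead. */
static int tiny_after_rounding(double exact)
{
    return exact != 0.0 && magnitude(round_to_11_bits(exact)) < F16_MIN_NORMAL;
}

void example_tininess(void)
{
    double d = double_from_bits(UINT64_C(0x3f0fffffffffffff));

    assert(round_to_11_bits(d) == F16_MIN_NORMAL); /* result is 0x0400 */
    assert(tiny_before_rounding(d));    /* Underflow expected on ARM */
    assert(!tiny_after_rounding(d));    /* would be missed in this mode */
}
```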

thanks
-- PMM

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP
  2018-05-10  9:42 [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Alex Bennée
                   ` (4 preceding siblings ...)
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 5/5] target/arm: squash FZ16 behaviour for conversions Alex Bennée
@ 2018-05-10 12:55 ` Peter Maydell
  2018-05-10 13:34   ` Alex Bennée
  5 siblings, 1 reply; 12+ messages in thread
From: Peter Maydell @ 2018-05-10 12:55 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Richard Henderson, qemu-arm, QEMU Developers

On 10 May 2018 at 10:42, Alex Bennée <alex.bennee@linaro.org> wrote:
> Hi,
>
> I've not included the test case in the series but you can find it in
> my TCG fixup branch:
>
>   https://github.com/stsquad/qemu/blob/testing/tcg-tests-revival-v4/tests/tcg/arm/fcvt.c
>
> Some of the ARMv7 versions are commented out as they were not
> supported until later revs. I do have a build that includes that but
> unfortunately the Debian compiler is too old to build it.
>
> : patch 0001/fpu softfloat int_to_float ensure r fully initial.patch needs review
> : patch 0004/target arm convert conversion helpers to fpst ahp.patch needs review
> : patch 0005/target arm squash FZ16 behaviour for conversions.patch needs review

This still seems to regress the NaN conversion case I mentioned
in review of the previous series:

(3) Here's a NaN case we get wrong now: 64 to IEEE-16 conversion,
input is 0x7ff0000000000001 (an SNaN), we produce
0x7c00 (infinity) but should produce 0x7e00 (a QNaN).
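
A standalone sketch of how this kind of result can arise (an
illustration of the failure mode only, not a claim about where the
series goes wrong; both function names are hypothetical): if a NaN's
fraction is narrowed by truncation and the quiet bit is not set
afterwards, an SNaN whose payload sits only in the low bits collapses
to the infinity encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Naive narrowing of a float64 NaN: keep the sign, force the half
 * exponent to all ones, keep only the top 10 fraction bits. */
static uint16_t f64_nan_to_f16_truncate(uint64_t a)
{
    uint16_t sign = (uint16_t)((a >> 63) << 15);
    uint16_t frac = (uint16_t)((a >> 42) & 0x3ff);

    return (uint16_t)(sign | 0x7c00 | frac);
}

/* Same, but set the quiet bit (top fraction bit) so the result is
 * always a QNaN, whatever happened to the payload. */
static uint16_t f64_nan_to_f16_quiet(uint64_t a)
{
    return (uint16_t)(f64_nan_to_f16_truncate(a) | 0x0200);
}

void example_snan_narrowing(void)
{
    uint64_t snan = UINT64_C(0x7ff0000000000001);

    assert(f64_nan_to_f16_truncate(snan) == 0x7c00); /* reads as +Inf */
    assert(f64_nan_to_f16_quiet(snan) == 0x7e00);    /* expected QNaN */
}
```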

thanks
-- PMM

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/5] fpu/softfloat: re-factor float to float conversions
  2018-05-10 12:48   ` Peter Maydell
@ 2018-05-10 13:03     ` Alex Bennée
  0 siblings, 0 replies; 12+ messages in thread
From: Alex Bennée @ 2018-05-10 13:03 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Richard Henderson, qemu-arm, QEMU Developers, Aurelien Jarno


Peter Maydell <peter.maydell@linaro.org> writes:

> On 10 May 2018 at 10:42, Alex Bennée <alex.bennee@linaro.org> wrote:
>> This allows us to delete a lot of additional boilerplate code which is
>> no longer needed. Currently the ieee flag is ignored (everything is
>> assumed to be ieee). Handling for ARM AHP will be in the next patch.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>>
>> ---
>> v2
>>   - pass FloatFmt to float_to_float instead of sizes
>>   - split AHP handling to another patch
>>   - use rth's suggested re-packing (+ setting .exp)
>> v3
>>   - also rm extractFloat16Sign
>> ---
>>  fpu/softfloat-specialize.h |  40 ----
>>  fpu/softfloat.c            | 452 +++++++------------------------------
>>  include/fpu/softfloat.h    |   8 +-
>>  3 files changed, 88 insertions(+), 412 deletions(-)
>
> This introduces a regression where we don't get tininess-before-rounding
> for double/single to halfprec conversions. This is because we're
> now using the fp_status_f16 status field, and it has not had
> the detect_tininess setting initialized. This fixes it:
>
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index d175c5e94f..7939c6b8ae 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -324,6 +324,8 @@ static void arm_cpu_reset(CPUState *s)
>                                &env->vfp.fp_status);
>      set_float_detect_tininess(float_tininess_before_rounding,
>                                &env->vfp.standard_fp_status);
> +    set_float_detect_tininess(float_tininess_before_rounding,
> +                              &env->vfp.fp_status_f16);

I'm now wondering if I should have tried harder to rationalise the
various float_status structures we've ended up with.

>  #ifndef CONFIG_USER_ONLY
>      if (kvm_enabled()) {
>          kvm_arm_reset_vcpu(cpu);
>
> (You can see this if you try something like fcvt h1, d0 where
> d0 == 0x3f0f_ffff_ffff_ffff -- we get the right answer of 0x0400
> but fail to set Underflow as well as Inexact.)
>
> thanks
> -- PMM


--
Alex Bennée


* Re: [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP
  2018-05-10 12:55 ` [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Peter Maydell
@ 2018-05-10 13:34   ` Alex Bennée
  0 siblings, 0 replies; 12+ messages in thread
From: Alex Bennée @ 2018-05-10 13:34 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Richard Henderson, qemu-arm, QEMU Developers


Peter Maydell <peter.maydell@linaro.org> writes:

> On 10 May 2018 at 10:42, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Hi,
>>
>> I've not included the test case in the series but you can find it in
>> my TCG fixup branch:
>>
>>   https://github.com/stsquad/qemu/blob/testing/tcg-tests-revival-v4/tests/tcg/arm/fcvt.c
>>
>> Some of the ARMv7 versions are commented out as they were not
>> supported until later revs. I do have a build that includes that but
>> unfortunately the Debian compiler is too old to build it.
>>
>> : patch 0001/fpu softfloat int_to_float ensure r fully initial.patch needs review
>> : patch 0004/target arm convert conversion helpers to fpst ahp.patch needs review
>> : patch 0005/target arm squash FZ16 behaviour for conversions.patch needs review
>
> This still seems to regress the NaN conversion case I mentioned
> in review of the previous series:
>
> (3) Here's a NaN case we get wrong now: 64 to IEEE-16 conversion,
> input is 0x7ff0000000000001 (an SNaN), we produce
> 0x7c00 (infinity) but should produce 0x7e00 (a QNaN).

Hmm, I had added the test case, but due to another bug it never actually
ran :-/

>
> thanks
> -- PMM


--
Alex Bennée


* Re: [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised
  2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised Alex Bennée
  2018-05-10 12:40   ` Peter Maydell
@ 2018-05-10 14:50   ` Richard Henderson
  1 sibling, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2018-05-10 14:50 UTC (permalink / raw)
  To: Alex Bennée, peter.maydell; +Cc: qemu-arm, qemu-devel, Aurelien Jarno

On 05/10/2018 02:42 AM, Alex Bennée wrote:
> Reported by Coverity (CID1390635). We ensure this for uint_to_float
> later on so we might as well mirror that.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  fpu/softfloat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


Thread overview: 12+ messages
2018-05-10  9:42 [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Alex Bennée
2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 1/5] fpu/softfloat: int_to_float ensure r fully initialised Alex Bennée
2018-05-10 12:40   ` Peter Maydell
2018-05-10 14:50   ` Richard Henderson
2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 2/5] fpu/softfloat: re-factor float to float conversions Alex Bennée
2018-05-10 12:48   ` Peter Maydell
2018-05-10 13:03     ` Alex Bennée
2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 3/5] fpu/softfloat: support ARM Alternative half-precision Alex Bennée
2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 4/5] target/arm: convert conversion helpers to fpst/ahp_flag Alex Bennée
2018-05-10  9:42 ` [Qemu-devel] [PATCH v3 5/5] target/arm: squash FZ16 behaviour for conversions Alex Bennée
2018-05-10 12:55 ` [Qemu-devel] [PATCH v3 0/5] refactor float-to-float and fix AHP Peter Maydell
2018-05-10 13:34   ` Alex Bennée
