[Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input
@ 2011-01-06 19:37 Peter Maydell
  2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 1/3] softfloat: Implement flushing input denormals to zero Peter Maydell
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Peter Maydell @ 2011-01-06 19:37 UTC (permalink / raw)
  To: qemu-devel

On ARM, the FPSCR FZ bit (which controls whether denormals should be
flushed to zero) is supposed to cause this flushing to occur both
when the output of a calculation is a denormal (already implemented in
softfloat) and also when the input to a calculation is a denormal
(not implemented, as noted by a FIXME comment).

This patchset adds the support to softfloat for flushing denormals on
input. This is controlled using a new status flag to enable it (so that
CPUs which only flush on output continue to work). There is a new
exception status bit to indicate when input flushing has happened
(because on ARM it is reported via a different FPSCR bit to that used
when an output denormal is flushed to zero).

I have deliberately only implemented this for input float32 and
float64 values because that is what ARM requires (on ARM float16
inputs must not be flushed to zero and floatx80 and float128 are
not used) so other changes would be totally untested code.

Existing CPUs should be unaffected as there is no behaviour change
unless the mode is enabled.

(I suspect that MIPS should be able to use this to implement the
FCSR FO bit if desired.)

Tested using random instruction generation for vadd/vsub/vmul/vdiv
with the FPSCR FZ bit set.

The only change from v1 is that I have split the old patch 2/2
into two parts, since it accidentally included a one-liner bugfix
where we were setting the softfloat cumulative exception flags
wrongly if the FPSCR was written to explicitly.

Peter Maydell (3):
  softfloat: Implement flushing input denormals to zero
  ARM: Set softfloat cumulative exc flags from correct FPSCR bits
  ARM: wire up the softfloat flush_input_to_zero flag

 fpu/softfloat.c     |  104 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 fpu/softfloat.h     |   22 ++++++++++-
 target-arm/helper.c |   10 ++++-
 3 files changed, 131 insertions(+), 5 deletions(-)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PATCH v2 1/3] softfloat: Implement flushing input denormals to zero
  2011-01-06 19:37 [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input Peter Maydell
@ 2011-01-06 19:37 ` Peter Maydell
  2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 2/3] ARM: Set softfloat cumulative exc flags from correct FPSCR bits Peter Maydell
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Maydell @ 2011-01-06 19:37 UTC (permalink / raw)
  To: qemu-devel

Add support to softfloat for flushing input denormal float32 and float64
to zero. softfloat's existing 'flush_to_zero' flag only flushes denormals
to zero on output. Some CPUs need input denormals to be flushed before
processing as well. Implement this, using a new status flag to enable it
and a new exception status bit to indicate when it has happened. Existing
CPUs should be unaffected as there is no behaviour change unless the
mode is enabled.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
---
 fpu/softfloat.c |  104 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 fpu/softfloat.h |   22 +++++++++++-
 2 files changed, 123 insertions(+), 3 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 6f5b05d..17842f4 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -30,8 +30,6 @@ these four paragraphs for those parts of this code that are retained.
 
 =============================================================================*/
 
-/* FIXME: Flush-To-Zero only effects results.  Denormal inputs should also
-   be flushed to zero.  */
 #include "softfloat.h"
 
 /*----------------------------------------------------------------------------
@@ -204,6 +202,21 @@ INLINE flag extractFloat32Sign( float32 a )
 }
 
 /*----------------------------------------------------------------------------
+| If `a' is denormal and we are in flush-to-zero mode then set the
+| input-denormal exception and return zero. Otherwise just return the value.
+*----------------------------------------------------------------------------*/
+static float32 float32_squash_input_denormal(float32 a STATUS_PARAM)
+{
+    if (STATUS(flush_inputs_to_zero)) {
+        if (extractFloat32Exp(a) == 0 && extractFloat32Frac(a) != 0) {
+            float_raise(float_flag_input_denormal STATUS_VAR);
+            return make_float32(float32_val(a) & 0x80000000);
+        }
+    }
+    return a;
+}
+
+/*----------------------------------------------------------------------------
 | Normalizes the subnormal single-precision floating-point value represented
 | by the denormalized significand `aSig'.  The normalized exponent and
 | significand are stored at the locations pointed to by `zExpPtr' and
@@ -368,6 +381,21 @@ INLINE flag extractFloat64Sign( float64 a )
 }
 
 /*----------------------------------------------------------------------------
+| If `a' is denormal and we are in flush-to-zero mode then set the
+| input-denormal exception and return zero. Otherwise just return the value.
+*----------------------------------------------------------------------------*/
+static float64 float64_squash_input_denormal(float64 a STATUS_PARAM)
+{
+    if (STATUS(flush_inputs_to_zero)) {
+        if (extractFloat64Exp(a) == 0 && extractFloat64Frac(a) != 0) {
+            float_raise(float_flag_input_denormal STATUS_VAR);
+            return make_float64(float64_val(a) & (1ULL << 63));
+        }
+    }
+    return a;
+}
+
+/*----------------------------------------------------------------------------
 | Normalizes the subnormal double-precision floating-point value represented
 | by the denormalized significand `aSig'.  The normalized exponent and
 | significand are stored at the locations pointed to by `zExpPtr' and
@@ -1298,6 +1326,7 @@ int32 float32_to_int32( float32 a STATUS_PARAM )
     bits32 aSig;
     bits64 aSig64;
 
+    a = float32_squash_input_denormal(a STATUS_VAR);
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
     aSign = extractFloat32Sign( a );
@@ -1327,6 +1356,7 @@ int32 float32_to_int32_round_to_zero( float32 a STATUS_PARAM )
     int16 aExp, shiftCount;
     bits32 aSig;
     int32 z;
+    a = float32_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -1418,6 +1448,7 @@ int64 float32_to_int64( float32 a STATUS_PARAM )
     int16 aExp, shiftCount;
     bits32 aSig;
     bits64 aSig64, aSigExtra;
+    a = float32_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -1455,6 +1486,7 @@ int64 float32_to_int64_round_to_zero( float32 a STATUS_PARAM )
     bits32 aSig;
     bits64 aSig64;
     int64 z;
+    a = float32_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -1496,6 +1528,7 @@ float64 float32_to_float64( float32 a STATUS_PARAM )
     flag aSign;
     int16 aExp;
     bits32 aSig;
+    a = float32_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -1528,6 +1561,7 @@ floatx80 float32_to_floatx80( float32 a STATUS_PARAM )
     int16 aExp;
     bits32 aSig;
 
+    a = float32_squash_input_denormal(a STATUS_VAR);
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
     aSign = extractFloat32Sign( a );
@@ -1561,6 +1595,7 @@ float128 float32_to_float128( float32 a STATUS_PARAM )
     int16 aExp;
     bits32 aSig;
 
+    a = float32_squash_input_denormal(a STATUS_VAR);
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
     aSign = extractFloat32Sign( a );
@@ -1593,6 +1628,7 @@ float32 float32_round_to_int( float32 a STATUS_PARAM)
     bits32 lastBitMask, roundBitsMask;
     int8 roundingMode;
     bits32 z;
+    a = float32_squash_input_denormal(a STATUS_VAR);
 
     aExp = extractFloat32Exp( a );
     if ( 0x96 <= aExp ) {
@@ -1796,6 +1832,8 @@ static float32 subFloat32Sigs( float32 a, float32 b, flag zSign STATUS_PARAM)
 float32 float32_add( float32 a, float32 b STATUS_PARAM )
 {
     flag aSign, bSign;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     aSign = extractFloat32Sign( a );
     bSign = extractFloat32Sign( b );
@@ -1817,6 +1855,8 @@ float32 float32_add( float32 a, float32 b STATUS_PARAM )
 float32 float32_sub( float32 a, float32 b STATUS_PARAM )
 {
     flag aSign, bSign;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     aSign = extractFloat32Sign( a );
     bSign = extractFloat32Sign( b );
@@ -1843,6 +1883,9 @@ float32 float32_mul( float32 a, float32 b STATUS_PARAM )
     bits64 zSig64;
     bits32 zSig;
 
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
+
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
     aSign = extractFloat32Sign( a );
@@ -1900,6 +1943,8 @@ float32 float32_div( float32 a, float32 b STATUS_PARAM )
     flag aSign, bSign, zSign;
     int16 aExp, bExp, zExp;
     bits32 aSig, bSig, zSig;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -1966,6 +2011,8 @@ float32 float32_rem( float32 a, float32 b STATUS_PARAM )
     bits64 aSig64, bSig64, q64;
     bits32 alternateASig;
     sbits32 sigMean;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -2062,6 +2109,7 @@ float32 float32_sqrt( float32 a STATUS_PARAM )
     int16 aExp, zExp;
     bits32 aSig, zSig;
     bits64 rem, term;
+    a = float32_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -2148,6 +2196,7 @@ float32 float32_exp2( float32 a STATUS_PARAM )
     bits32 aSig;
     float64 r, x, xn;
     int i;
+    a = float32_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -2194,6 +2243,7 @@ float32 float32_log2( float32 a STATUS_PARAM )
     int16 aExp;
     bits32 aSig, zSig, i;
 
+    a = float32_squash_input_denormal(a STATUS_VAR);
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
     aSign = extractFloat32Sign( a );
@@ -2238,6 +2288,8 @@ float32 float32_log2( float32 a STATUS_PARAM )
 
 int float32_eq( float32 a, float32 b STATUS_PARAM )
 {
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat32Exp( a ) == 0xFF ) && extractFloat32Frac( a ) )
          || ( ( extractFloat32Exp( b ) == 0xFF ) && extractFloat32Frac( b ) )
@@ -2263,6 +2315,8 @@ int float32_le( float32 a, float32 b STATUS_PARAM )
 {
     flag aSign, bSign;
     bits32 av, bv;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat32Exp( a ) == 0xFF ) && extractFloat32Frac( a ) )
          || ( ( extractFloat32Exp( b ) == 0xFF ) && extractFloat32Frac( b ) )
@@ -2289,6 +2343,8 @@ int float32_lt( float32 a, float32 b STATUS_PARAM )
 {
     flag aSign, bSign;
     bits32 av, bv;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat32Exp( a ) == 0xFF ) && extractFloat32Frac( a ) )
          || ( ( extractFloat32Exp( b ) == 0xFF ) && extractFloat32Frac( b ) )
@@ -2315,6 +2371,8 @@ int float32_lt( float32 a, float32 b STATUS_PARAM )
 int float32_eq_signaling( float32 a, float32 b STATUS_PARAM )
 {
     bits32 av, bv;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat32Exp( a ) == 0xFF ) && extractFloat32Frac( a ) )
          || ( ( extractFloat32Exp( b ) == 0xFF ) && extractFloat32Frac( b ) )
@@ -2339,6 +2397,8 @@ int float32_le_quiet( float32 a, float32 b STATUS_PARAM )
 {
     flag aSign, bSign;
     bits32 av, bv;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat32Exp( a ) == 0xFF ) && extractFloat32Frac( a ) )
          || ( ( extractFloat32Exp( b ) == 0xFF ) && extractFloat32Frac( b ) )
@@ -2368,6 +2428,8 @@ int float32_lt_quiet( float32 a, float32 b STATUS_PARAM )
 {
     flag aSign, bSign;
     bits32 av, bv;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+    b = float32_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat32Exp( a ) == 0xFF ) && extractFloat32Frac( a ) )
          || ( ( extractFloat32Exp( b ) == 0xFF ) && extractFloat32Frac( b ) )
@@ -2401,6 +2463,7 @@ int32 float64_to_int32( float64 a STATUS_PARAM )
     flag aSign;
     int16 aExp, shiftCount;
     bits64 aSig;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
@@ -2429,6 +2492,7 @@ int32 float64_to_int32_round_to_zero( float64 a STATUS_PARAM )
     int16 aExp, shiftCount;
     bits64 aSig, savedASig;
     int32 z;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
@@ -2525,6 +2589,7 @@ int64 float64_to_int64( float64 a STATUS_PARAM )
     flag aSign;
     int16 aExp, shiftCount;
     bits64 aSig, aSigExtra;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
@@ -2568,6 +2633,7 @@ int64 float64_to_int64_round_to_zero( float64 a STATUS_PARAM )
     int16 aExp, shiftCount;
     bits64 aSig;
     int64 z;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
@@ -2617,6 +2683,7 @@ float32 float64_to_float32( float64 a STATUS_PARAM )
     int16 aExp;
     bits64 aSig;
     bits32 zSig;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
@@ -2694,6 +2761,7 @@ bits16 float32_to_float16( float32 a, flag ieee STATUS_PARAM)
     bits32 mask;
     bits32 increment;
     int8 roundingMode;
+    a = float32_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
@@ -2788,6 +2856,7 @@ floatx80 float64_to_floatx80( float64 a STATUS_PARAM )
     int16 aExp;
     bits64 aSig;
 
+    a = float64_squash_input_denormal(a STATUS_VAR);
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
     aSign = extractFloat64Sign( a );
@@ -2822,6 +2891,7 @@ float128 float64_to_float128( float64 a STATUS_PARAM )
     int16 aExp;
     bits64 aSig, zSig0, zSig1;
 
+    a = float64_squash_input_denormal(a STATUS_VAR);
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
     aSign = extractFloat64Sign( a );
@@ -2855,6 +2925,7 @@ float64 float64_round_to_int( float64 a STATUS_PARAM )
     bits64 lastBitMask, roundBitsMask;
     int8 roundingMode;
     bits64 z;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
     aExp = extractFloat64Exp( a );
     if ( 0x433 <= aExp ) {
@@ -3071,6 +3142,8 @@ static float64 subFloat64Sigs( float64 a, float64 b, flag zSign STATUS_PARAM )
 float64 float64_add( float64 a, float64 b STATUS_PARAM )
 {
     flag aSign, bSign;
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
 
     aSign = extractFloat64Sign( a );
     bSign = extractFloat64Sign( b );
@@ -3092,6 +3165,8 @@ float64 float64_add( float64 a, float64 b STATUS_PARAM )
 float64 float64_sub( float64 a, float64 b STATUS_PARAM )
 {
     flag aSign, bSign;
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
 
     aSign = extractFloat64Sign( a );
     bSign = extractFloat64Sign( b );
@@ -3116,6 +3191,9 @@ float64 float64_mul( float64 a, float64 b STATUS_PARAM )
     int16 aExp, bExp, zExp;
     bits64 aSig, bSig, zSig0, zSig1;
 
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
+
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
     aSign = extractFloat64Sign( a );
@@ -3175,6 +3253,8 @@ float64 float64_div( float64 a, float64 b STATUS_PARAM )
     bits64 aSig, bSig, zSig;
     bits64 rem0, rem1;
     bits64 term0, term1;
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
 
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
@@ -3246,6 +3326,8 @@ float64 float64_rem( float64 a, float64 b STATUS_PARAM )
     bits64 q, alternateASig;
     sbits64 sigMean;
 
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
     aSign = extractFloat64Sign( a );
@@ -3328,6 +3410,7 @@ float64 float64_sqrt( float64 a STATUS_PARAM )
     int16 aExp, zExp;
     bits64 aSig, zSig, doubleZSig;
     bits64 rem0, rem1, term0, term1;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
@@ -3377,6 +3460,7 @@ float64 float64_log2( float64 a STATUS_PARAM )
     flag aSign, zSign;
     int16 aExp;
     bits64 aSig, aSig0, aSig1, zSig, i;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
@@ -3422,6 +3506,8 @@ float64 float64_log2( float64 a STATUS_PARAM )
 int float64_eq( float64 a, float64 b STATUS_PARAM )
 {
     bits64 av, bv;
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat64Exp( a ) == 0x7FF ) && extractFloat64Frac( a ) )
          || ( ( extractFloat64Exp( b ) == 0x7FF ) && extractFloat64Frac( b ) )
@@ -3448,6 +3534,8 @@ int float64_le( float64 a, float64 b STATUS_PARAM )
 {
     flag aSign, bSign;
     bits64 av, bv;
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat64Exp( a ) == 0x7FF ) && extractFloat64Frac( a ) )
          || ( ( extractFloat64Exp( b ) == 0x7FF ) && extractFloat64Frac( b ) )
@@ -3475,6 +3563,8 @@ int float64_lt( float64 a, float64 b STATUS_PARAM )
     flag aSign, bSign;
     bits64 av, bv;
 
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
     if (    ( ( extractFloat64Exp( a ) == 0x7FF ) && extractFloat64Frac( a ) )
          || ( ( extractFloat64Exp( b ) == 0x7FF ) && extractFloat64Frac( b ) )
        ) {
@@ -3500,6 +3590,8 @@ int float64_lt( float64 a, float64 b STATUS_PARAM )
 int float64_eq_signaling( float64 a, float64 b STATUS_PARAM )
 {
     bits64 av, bv;
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat64Exp( a ) == 0x7FF ) && extractFloat64Frac( a ) )
          || ( ( extractFloat64Exp( b ) == 0x7FF ) && extractFloat64Frac( b ) )
@@ -3524,6 +3616,8 @@ int float64_le_quiet( float64 a, float64 b STATUS_PARAM )
 {
     flag aSign, bSign;
     bits64 av, bv;
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat64Exp( a ) == 0x7FF ) && extractFloat64Frac( a ) )
          || ( ( extractFloat64Exp( b ) == 0x7FF ) && extractFloat64Frac( b ) )
@@ -3553,6 +3647,8 @@ int float64_lt_quiet( float64 a, float64 b STATUS_PARAM )
 {
     flag aSign, bSign;
     bits64 av, bv;
+    a = float64_squash_input_denormal(a STATUS_VAR);
+    b = float64_squash_input_denormal(b STATUS_VAR);
 
     if (    ( ( extractFloat64Exp( a ) == 0x7FF ) && extractFloat64Frac( a ) )
          || ( ( extractFloat64Exp( b ) == 0x7FF ) && extractFloat64Frac( b ) )
@@ -5833,6 +5929,8 @@ INLINE int float ## s ## _compare_internal( float ## s a, float ## s b,      \
 {                                                                            \
     flag aSign, bSign;                                                       \
     bits ## s av, bv;                                                        \
+    a = float ## s ## _squash_input_denormal(a STATUS_VAR);                  \
+    b = float ## s ## _squash_input_denormal(b STATUS_VAR);                  \
                                                                              \
     if (( ( extractFloat ## s ## Exp( a ) == nan_exp ) &&                    \
          extractFloat ## s ## Frac( a ) ) ||                                 \
@@ -5929,6 +6027,7 @@ float32 float32_scalbn( float32 a, int n STATUS_PARAM )
     int16 aExp;
     bits32 aSig;
 
+    a = float32_squash_input_denormal(a STATUS_VAR);
     aSig = extractFloat32Frac( a );
     aExp = extractFloat32Exp( a );
     aSign = extractFloat32Sign( a );
@@ -5952,6 +6051,7 @@ float64 float64_scalbn( float64 a, int n STATUS_PARAM )
     int16 aExp;
     bits64 aSig;
 
+    a = float64_squash_input_denormal(a STATUS_VAR);
     aSig = extractFloat64Frac( a );
     aExp = extractFloat64Exp( a );
     aSign = extractFloat64Sign( a );
diff --git a/fpu/softfloat.h b/fpu/softfloat.h
index f2104c6..15052cc 100644
--- a/fpu/softfloat.h
+++ b/fpu/softfloat.h
@@ -180,7 +180,8 @@ enum {
     float_flag_divbyzero =  4,
     float_flag_overflow  =  8,
     float_flag_underflow = 16,
-    float_flag_inexact   = 32
+    float_flag_inexact   = 32,
+    float_flag_input_denormal = 64
 };
 
 typedef struct float_status {
@@ -190,7 +191,10 @@ typedef struct float_status {
 #ifdef FLOATX80
     signed char floatx80_rounding_precision;
 #endif
+    /* should denormalised results go to zero and set the inexact flag? */
     flag flush_to_zero;
+    /* should denormalised inputs go to zero and set the input_denormal flag? */
+    flag flush_inputs_to_zero;
     flag default_nan_mode;
 } float_status;
 
@@ -200,6 +204,10 @@ INLINE void set_flush_to_zero(flag val STATUS_PARAM)
 {
     STATUS(flush_to_zero) = val;
 }
+INLINE void set_flush_inputs_to_zero(flag val STATUS_PARAM)
+{
+    STATUS(flush_inputs_to_zero) = val;
+}
 INLINE void set_default_nan_mode(flag val STATUS_PARAM)
 {
     STATUS(default_nan_mode) = val;
@@ -294,11 +302,17 @@ float32 float32_scalbn( float32, int STATUS_PARAM );
 
 INLINE float32 float32_abs(float32 a)
 {
+    /* Note that abs does *not* handle NaN specially, nor does
+     * it flush denormal inputs to zero.
+     */
     return make_float32(float32_val(a) & 0x7fffffff);
 }
 
 INLINE float32 float32_chs(float32 a)
 {
+    /* Note that chs does *not* handle NaN specially, nor does
+     * it flush denormal inputs to zero.
+     */
     return make_float32(float32_val(a) ^ 0x80000000);
 }
 
@@ -374,11 +388,17 @@ float64 float64_scalbn( float64, int STATUS_PARAM );
 
 INLINE float64 float64_abs(float64 a)
 {
+    /* Note that abs does *not* handle NaN specially, nor does
+     * it flush denormal inputs to zero.
+     */
     return make_float64(float64_val(a) & 0x7fffffffffffffffLL);
 }
 
 INLINE float64 float64_chs(float64 a)
 {
+    /* Note that chs does *not* handle NaN specially, nor does
+     * it flush denormal inputs to zero.
+     */
     return make_float64(float64_val(a) ^ 0x8000000000000000LL);
 }
 
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PATCH v2 2/3] ARM: Set softfloat cumulative exc flags from correct FPSCR bits
  2011-01-06 19:37 [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input Peter Maydell
  2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 1/3] softfloat: Implement flushing input denormals to zero Peter Maydell
@ 2011-01-06 19:37 ` Peter Maydell
  2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 3/3] ARM: wire up the softfloat flush_input_to_zero flag Peter Maydell
  2011-01-06 21:26 ` [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input Aurelien Jarno
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Maydell @ 2011-01-06 19:37 UTC (permalink / raw)
  To: qemu-devel

When handling a write to the ARM FPSCR, set the softfloat cumulative
exception flags from the cumulative flags in the FPSCR, not the
exception-enable bits. Also don't apply a mask: vfp_exceptbits_to_host
will only look at the correct bits anyway.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 50c1017..05684a2 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2315,7 +2315,7 @@ void HELPER(vfp_set_fpscr)(CPUState *env, uint32_t val)
     if (changed & (1 << 25))
         set_default_nan_mode((val & (1 << 25)) != 0, &env->vfp.fp_status);
 
-    i = vfp_exceptbits_to_host((val >> 8) & 0x1f);
+    i = vfp_exceptbits_to_host(val);
     set_float_exception_flags(i, &env->vfp.fp_status);
 }
 
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PATCH v2 3/3] ARM: wire up the softfloat flush_input_to_zero flag
  2011-01-06 19:37 [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input Peter Maydell
  2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 1/3] softfloat: Implement flushing input denormals to zero Peter Maydell
  2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 2/3] ARM: Set softfloat cumulative exc flags from correct FPSCR bits Peter Maydell
@ 2011-01-06 19:37 ` Peter Maydell
  2011-01-06 21:26 ` [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input Aurelien Jarno
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Maydell @ 2011-01-06 19:37 UTC (permalink / raw)
  To: qemu-devel

Wire up the new softfloat support for flushing input denormals
to zero on ARM. The FPSCR FZ bit enables flush-to-zero for
both inputs and outputs, but the reporting of when inputs are
flushed to zero is via a separate IDC bit rather than the UFC
(underflow) bit used when output denormals are flushed to zero.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 05684a2..705b99f 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2242,6 +2242,8 @@ static inline int vfp_exceptbits_from_host(int host_bits)
         target_bits |= 8;
     if (host_bits & float_flag_inexact)
         target_bits |= 0x10;
+    if (host_bits & float_flag_input_denormal)
+        target_bits |= 0x80;
     return target_bits;
 }
 
@@ -2278,6 +2280,8 @@ static inline int vfp_exceptbits_to_host(int target_bits)
         host_bits |= float_flag_underflow;
     if (target_bits & 0x10)
         host_bits |= float_flag_inexact;
+    if (target_bits & 0x80)
+        host_bits |= float_flag_input_denormal;
     return host_bits;
 }
 
@@ -2310,8 +2314,10 @@ void HELPER(vfp_set_fpscr)(CPUState *env, uint32_t val)
         }
         set_float_rounding_mode(i, &env->vfp.fp_status);
     }
-    if (changed & (1 << 24))
+    if (changed & (1 << 24)) {
         set_flush_to_zero((val & (1 << 24)) != 0, &env->vfp.fp_status);
+        set_flush_inputs_to_zero((val & (1 << 24)) != 0, &env->vfp.fp_status);
+    }
     if (changed & (1 << 25))
         set_default_nan_mode((val & (1 << 25)) != 0, &env->vfp.fp_status);
 
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input
  2011-01-06 19:37 [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input Peter Maydell
                   ` (2 preceding siblings ...)
  2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 3/3] ARM: wire up the softfloat flush_input_to_zero flag Peter Maydell
@ 2011-01-06 21:26 ` Aurelien Jarno
  3 siblings, 0 replies; 5+ messages in thread
From: Aurelien Jarno @ 2011-01-06 21:26 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

On Thu, Jan 06, 2011 at 07:37:52PM +0000, Peter Maydell wrote:
> On ARM, the FPSCR FZ bit (which controls whether denormals should be
> flushed to zero) is supposed to cause this flushing to occur both
> when the output of a calculation is a denormal (already implemented in
> softfloat) and also when the input to a calculation is a denormal
> (not implemented, as noted by a FIXME comment).
> 
> This patchset adds the support to softfloat for flushing denormals on
> input. This is controlled using a new status flag to enable it (so that
> CPUs which only flush on output continue to work). There is a new
> exception status bit to indicate when input flushing has happened
> (because on ARM it is reported via a different FPSCR bit to that used
> when an output denormal is flushed to zero).
> 
> I have deliberately only implemented this for input float32 and
> float64 values because that is what ARM requires (on ARM float16
> inputs must not be flushed to zero and floatx80 and float128 are
> not used) so other changes would be totally untested code.
> 
> Existing CPUs should be unaffected as there is no behaviour change
> unless the mode is enabled.
> 
> (I suspect that MIPS should be able to use this to implement the
> FCSR FO bit if desired.)
> 
> Tested using random instruction generation for vadd/vsub/vmul/vdiv
> with the FPSCR FZ bit set.
> 
> The only change from v1 is that I have split the old patch 2/2
> into two parts, since it accidentally included a one-liner bugfix
> where we were setting the softfloat cumulative exception flags
> wrongly if the FPSCR was written to explicitly.
> 
> Peter Maydell (3):
>   softfloat: Implement flushing input denormals to zero
>   ARM: Set softfloat cumulative exc flags from correct FPSCR bits
>   ARM: wire up the softfloat flush_input_to_zero flag
> 
>  fpu/softfloat.c     |  104 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  fpu/softfloat.h     |   22 ++++++++++-
>  target-arm/helper.c |   10 ++++-
>  3 files changed, 131 insertions(+), 5 deletions(-)
> 
> 

Thanks, all 3 applied.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-01-06 21:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-06 19:37 [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input Peter Maydell
2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 1/3] softfloat: Implement flushing input denormals to zero Peter Maydell
2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 2/3] ARM: Set softfloat cumulative exc flags from correct FPSCR bits Peter Maydell
2011-01-06 19:37 ` [Qemu-devel] [PATCH v2 3/3] ARM: wire up the softfloat flush_input_to_zero flag Peter Maydell
2011-01-06 21:26 ` [Qemu-devel] [PATCH v2 0/3] ARM/softfloat: support flushing denormals on input Aurelien Jarno

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.