All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes
@ 2020-06-05 18:58 Joseph Myers
  2020-06-05 18:59 ` [PATCH 1/7] softfloat: merge floatx80_mod and floatx80_rem Joseph Myers
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Joseph Myers @ 2020-06-05 18:58 UTC (permalink / raw)
  To: qemu-devel, aurelien, peter.maydell, alex.bennee, laurent,
	pbonzini, rth, ehabkost

The x87 floating-point emulation of the fprem and fprem1 instructions
works via conversion to and from double.  This is inherently
unsuitable for a good emulation of any floatx80 operation.  This patch
series adapts the softfloat floatx80_rem implementation to be suitable
for these instructions and uses it to reimplement them.

There is an existing test for these instructions, test-i386-fprem.c,
based on comparison of output.  It produces 1679695 lines of output,
and before this patch series 415422 of those lines are different on
hardware from the output produced by QEMU.  Some of those differences
are because QEMU's x87 emulation does not yet produce the "denormal
operand" exception; ignoring such differences (modifying the output
from a native run not to report that exception), there are still
398833 different lines.  This patch series reduces that latter number
to 1 (that one difference being because of missing checks for
floating-point stack underflow, another global issue with the x87
emulation), or 35517 different lines without the correction for lack
of denormal operand exception support.

Several fixes to and new features in the softfloat support for this
operation are needed; floatx80_mod, previously present in the m68k
code only, is made generic and unified with floatx80_rem in a new
floatx80_modrem of which floatx80_mod and floatx80_rem are thin
wrappers.  The only architectures using float*_rem for other formats
are arm (FPA emulation) and openrisc (instructions that have been
removed in the latest architecture version); they do not appear to
need any of the new features, and all the bugs fixed are specific to
floatx80, so no changes are made to the remainder implementation for
those formats.

A new feature added is returning the low bits of the quotient from
floatx80_modrem, as needed for both x87 and m68k.  The logic used to
determine the low 7 bits of the quotient for m68k
(target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it
looks at the result of converting the remainder to integer, the
quotient having been discarded by that point); this patch series does
not change that to use the new interface, but the m68k maintainers may
wish to do so.

The Intel instruction set documentation leaves unspecified the exact
number of bits by which the remainder instructions reduce the operand
each time.  The AMD documentation gives a specific formula, which
empirically Intel processors follow as well, and that formula is
implemented in the code.  The AMD documentation also specifies that
flags other than C2 are cleared in the partial remainder case, whereas
the Intel manual is silent on that (but the processors do appear to
clear those flags); this patch implements that flag clearing, and
keeps the existing flag clearing in cases where the instructions raise
"invalid" (although it seems hardware in fact only clears some but not
all flags in that case, leaving other flags unchanged).

The Intel manuals include an inaccurate table asserting that (finite
REM 0) should raise "divide by zero"; actually, in accordance with
IEEE semantics, it raises "invalid".  The AMD manuals inaccurately say
for both fprem and fprem1 that if the exponent difference is negative,
the numerator is returned unchanged, which is correct (apart from
normalizing pseudo-denormals) for fprem but not for fprem1 (and the
old QEMU code had an incorrect optimization following the AMD manuals
for fprem1).

Joseph Myers (7):
  softfloat: merge floatx80_mod and floatx80_rem
  softfloat: fix floatx80 remainder pseudo-denormal check for zero
  softfloat: do not return pseudo-denormal from floatx80 remainder
  softfloat: do not set denominator high bit for floatx80 remainder
  softfloat: return low bits of quotient from floatx80_modrem
  target/i386: reimplement fprem1 using floatx80 operations
  target/i386: reimplement fprem using floatx80 operations

 fpu/softfloat.c          |  83 +++++++++++++++++----
 include/fpu/softfloat.h  |   3 +
 target/i386/fpu_helper.c | 154 ++++++++++++---------------------------
 target/m68k/softfloat.c  |  83 ---------------------
 target/m68k/softfloat.h  |   1 -
 5 files changed, 116 insertions(+), 208 deletions(-)

-- 
2.17.1


-- 
Joseph S. Myers
joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/7] softfloat: merge floatx80_mod and floatx80_rem
  2020-06-05 18:58 [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes Joseph Myers
@ 2020-06-05 18:59 ` Joseph Myers
  2020-06-06 19:06   ` Richard Henderson
  2020-06-05 19:00 ` [PATCH 2/7] softfloat: fix floatx80 remainder pseudo-denormal check for zero Joseph Myers
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Joseph Myers @ 2020-06-05 18:59 UTC (permalink / raw)
  To: qemu-devel, aurelien, peter.maydell, alex.bennee, laurent,
	pbonzini, rth, ehabkost

The m68k-specific softfloat code includes a function floatx80_mod that
is extremely similar to floatx80_rem, but computing the remainder
based on truncating the quotient toward zero rather than rounding it
to nearest integer.  This is also useful for emulating the x87 fprem
and fprem1 instructions.  Change the floatx80_rem implementation into
floatx80_modrem that can perform either operation, with both
floatx80_rem and floatx80_mod as thin wrappers available for all
targets.

There does not appear to be any use for the _mod operation for other
floating-point formats in QEMU (the only other architectures using
_rem at all are linux-user/arm/nwfpe, for FPA emulation, and openrisc,
for instructions that have been removed in the latest version of the
architecture), so no change is made to the code for other formats.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
---
 fpu/softfloat.c         | 49 ++++++++++++++++++------
 include/fpu/softfloat.h |  2 +
 target/m68k/softfloat.c | 83 -----------------------------------------
 target/m68k/softfloat.h |  1 -
 4 files changed, 40 insertions(+), 95 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 6c8f2d597a..7b1ce7664f 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -5682,10 +5682,13 @@ floatx80 floatx80_div(floatx80 a, floatx80 b, float_status *status)
 /*----------------------------------------------------------------------------
 | Returns the remainder of the extended double-precision floating-point value
 | `a' with respect to the corresponding value `b'.  The operation is performed
-| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic,
+| if 'mod' is false; if 'mod' is true, return the remainder based on truncating
+| the quotient toward zero instead.
 *----------------------------------------------------------------------------*/
 
-floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status)
+floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
+                         float_status *status)
 {
     bool aSign, zSign;
     int32_t aExp, bExp, expDiff;
@@ -5731,7 +5734,7 @@ floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status)
     expDiff = aExp - bExp;
     aSig1 = 0;
     if ( expDiff < 0 ) {
-        if ( expDiff < -1 ) return a;
+        if ( mod || expDiff < -1 ) return a;
         shift128Right( aSig0, 0, 1, &aSig0, &aSig1 );
         expDiff = 0;
     }
@@ -5763,14 +5766,16 @@ floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status)
         term1 = 0;
         term0 = bSig;
     }
-    sub128( term0, term1, aSig0, aSig1, &alternateASig0, &alternateASig1 );
-    if (    lt128( alternateASig0, alternateASig1, aSig0, aSig1 )
-         || (    eq128( alternateASig0, alternateASig1, aSig0, aSig1 )
-              && ( q & 1 ) )
-       ) {
-        aSig0 = alternateASig0;
-        aSig1 = alternateASig1;
-        zSign = ! zSign;
+    if (!mod) {
+        sub128( term0, term1, aSig0, aSig1, &alternateASig0, &alternateASig1 );
+        if (    lt128( alternateASig0, alternateASig1, aSig0, aSig1 )
+                || (    eq128( alternateASig0, alternateASig1, aSig0, aSig1 )
+                        && ( q & 1 ) )
+            ) {
+            aSig0 = alternateASig0;
+            aSig1 = alternateASig1;
+            zSign = ! zSign;
+        }
     }
     return
         normalizeRoundAndPackFloatx80(
@@ -5778,6 +5783,28 @@ floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status)
 
 }
 
+/*----------------------------------------------------------------------------
+| Returns the remainder of the extended double-precision floating-point value
+| `a' with respect to the corresponding value `b'.  The operation is performed
+| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status)
+{
+    return floatx80_modrem(a, b, false, status);
+}
+
+/*----------------------------------------------------------------------------
+| Returns the remainder of the extended double-precision floating-point value
+| `a' with respect to the corresponding value `b', with the quotient truncated
+| toward zero.
+*----------------------------------------------------------------------------*/
+
+floatx80 floatx80_mod(floatx80 a, floatx80 b, float_status *status)
+{
+    return floatx80_modrem(a, b, true, status);
+}
+
 /*----------------------------------------------------------------------------
 | Returns the square root of the extended double-precision floating-point
 | value `a'.  The operation is performed according to the IEC/IEEE Standard
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 16ca697a73..bff6934d09 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -687,6 +687,8 @@ floatx80 floatx80_add(floatx80, floatx80, float_status *status);
 floatx80 floatx80_sub(floatx80, floatx80, float_status *status);
 floatx80 floatx80_mul(floatx80, floatx80, float_status *status);
 floatx80 floatx80_div(floatx80, floatx80, float_status *status);
+floatx80 floatx80_modrem(floatx80, floatx80, bool, float_status *status);
+floatx80 floatx80_mod(floatx80, floatx80, float_status *status);
 floatx80 floatx80_rem(floatx80, floatx80, float_status *status);
 floatx80 floatx80_sqrt(floatx80, float_status *status);
 FloatRelation floatx80_compare(floatx80, floatx80, float_status *status);
diff --git a/target/m68k/softfloat.c b/target/m68k/softfloat.c
index 9f120cf15e..b6d0ed7acf 100644
--- a/target/m68k/softfloat.c
+++ b/target/m68k/softfloat.c
@@ -42,89 +42,6 @@ static floatx80 propagateFloatx80NaNOneArg(floatx80 a, float_status *status)
     return a;
 }
 
-/*
- * Returns the modulo remainder of the extended double-precision floating-point
- * value `a' with respect to the corresponding value `b'.
- */
-
-floatx80 floatx80_mod(floatx80 a, floatx80 b, float_status *status)
-{
-    bool aSign, zSign;
-    int32_t aExp, bExp, expDiff;
-    uint64_t aSig0, aSig1, bSig;
-    uint64_t qTemp, term0, term1;
-
-    aSig0 = extractFloatx80Frac(a);
-    aExp = extractFloatx80Exp(a);
-    aSign = extractFloatx80Sign(a);
-    bSig = extractFloatx80Frac(b);
-    bExp = extractFloatx80Exp(b);
-
-    if (aExp == 0x7FFF) {
-        if ((uint64_t) (aSig0 << 1)
-            || ((bExp == 0x7FFF) && (uint64_t) (bSig << 1))) {
-            return propagateFloatx80NaN(a, b, status);
-        }
-        goto invalid;
-    }
-    if (bExp == 0x7FFF) {
-        if ((uint64_t) (bSig << 1)) {
-            return propagateFloatx80NaN(a, b, status);
-        }
-        return a;
-    }
-    if (bExp == 0) {
-        if (bSig == 0) {
-        invalid:
-            float_raise(float_flag_invalid, status);
-            return floatx80_default_nan(status);
-        }
-        normalizeFloatx80Subnormal(bSig, &bExp, &bSig);
-    }
-    if (aExp == 0) {
-        if ((uint64_t) (aSig0 << 1) == 0) {
-            return a;
-        }
-        normalizeFloatx80Subnormal(aSig0, &aExp, &aSig0);
-    }
-    bSig |= UINT64_C(0x8000000000000000);
-    zSign = aSign;
-    expDiff = aExp - bExp;
-    aSig1 = 0;
-    if (expDiff < 0) {
-        return a;
-    }
-    qTemp = (bSig <= aSig0);
-    if (qTemp) {
-        aSig0 -= bSig;
-    }
-    expDiff -= 64;
-    while (0 < expDiff) {
-        qTemp = estimateDiv128To64(aSig0, aSig1, bSig);
-        qTemp = (2 < qTemp) ? qTemp - 2 : 0;
-        mul64To128(bSig, qTemp, &term0, &term1);
-        sub128(aSig0, aSig1, term0, term1, &aSig0, &aSig1);
-        shortShift128Left(aSig0, aSig1, 62, &aSig0, &aSig1);
-        expDiff -= 62;
-    }
-    expDiff += 64;
-    if (0 < expDiff) {
-        qTemp = estimateDiv128To64(aSig0, aSig1, bSig);
-        qTemp = (2 < qTemp) ? qTemp - 2 : 0;
-        qTemp >>= 64 - expDiff;
-        mul64To128(bSig, qTemp << (64 - expDiff), &term0, &term1);
-        sub128(aSig0, aSig1, term0, term1, &aSig0, &aSig1);
-        shortShift128Left(0, bSig, 64 - expDiff, &term0, &term1);
-        while (le128(term0, term1, aSig0, aSig1)) {
-            ++qTemp;
-            sub128(aSig0, aSig1, term0, term1, &aSig0, &aSig1);
-        }
-    }
-    return
-        normalizeRoundAndPackFloatx80(
-            80, zSign, bExp + expDiff, aSig0, aSig1, status);
-}
-
 /*
  * Returns the mantissa of the extended double-precision floating-point
  * value `a'.
diff --git a/target/m68k/softfloat.h b/target/m68k/softfloat.h
index 365ef6ac7a..4bb9567134 100644
--- a/target/m68k/softfloat.h
+++ b/target/m68k/softfloat.h
@@ -23,7 +23,6 @@
 #define TARGET_M68K_SOFTFLOAT_H
 #include "fpu/softfloat.h"
 
-floatx80 floatx80_mod(floatx80 a, floatx80 b, float_status *status);
 floatx80 floatx80_getman(floatx80 a, float_status *status);
 floatx80 floatx80_getexp(floatx80 a, float_status *status);
 floatx80 floatx80_scale(floatx80 a, floatx80 b, float_status *status);
-- 
2.17.1


-- 
Joseph S. Myers
joseph@codesourcery.com


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/7] softfloat: fix floatx80 remainder pseudo-denormal check for zero
  2020-06-05 18:58 [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes Joseph Myers
  2020-06-05 18:59 ` [PATCH 1/7] softfloat: merge floatx80_mod and floatx80_rem Joseph Myers
@ 2020-06-05 19:00 ` Joseph Myers
  2020-06-06 19:06   ` Richard Henderson
  2020-06-05 19:00 ` [PATCH 3/7] softfloat: do not return pseudo-denormal from floatx80 remainder Joseph Myers
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Joseph Myers @ 2020-06-05 19:00 UTC (permalink / raw)
  To: qemu-devel, aurelien, peter.maydell, alex.bennee, laurent,
	pbonzini, rth, ehabkost

The floatx80 remainder implementation ignores the high bit of the
significand when checking whether an operand (numerator) with zero
exponent is zero.  This means it mishandles a pseudo-denormal
representation of 0x1p-16382L by treating it as zero.  Fix this by
checking the whole significand instead.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
---
 fpu/softfloat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 7b1ce7664f..091847beb9 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -5726,7 +5726,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
         normalizeFloatx80Subnormal( bSig, &bExp, &bSig );
     }
     if ( aExp == 0 ) {
-        if ( (uint64_t) ( aSig0<<1 ) == 0 ) return a;
+        if ( aSig0 == 0 ) return a;
         normalizeFloatx80Subnormal( aSig0, &aExp, &aSig0 );
     }
     bSig |= UINT64_C(0x8000000000000000);
-- 
2.17.1


-- 
Joseph S. Myers
joseph@codesourcery.com


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/7] softfloat: do not return pseudo-denormal from floatx80 remainder
  2020-06-05 18:58 [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes Joseph Myers
  2020-06-05 18:59 ` [PATCH 1/7] softfloat: merge floatx80_mod and floatx80_rem Joseph Myers
  2020-06-05 19:00 ` [PATCH 2/7] softfloat: fix floatx80 remainder pseudo-denormal check for zero Joseph Myers
@ 2020-06-05 19:00 ` Joseph Myers
  2020-06-06 19:08   ` Richard Henderson
  2020-06-05 19:01 ` [PATCH 4/7] softfloat: do not set denominator high bit for " Joseph Myers
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Joseph Myers @ 2020-06-05 19:00 UTC (permalink / raw)
  To: qemu-devel, aurelien, peter.maydell, alex.bennee, laurent,
	pbonzini, rth, ehabkost

The floatx80 remainder implementation sometimes returns the numerator
unchanged when the denominator is sufficiently larger than the
numerator.  But if the value to be returned unchanged is a
pseudo-denormal, that is incorrect.  Fix it to normalize the numerator
in that case.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
---
 fpu/softfloat.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 091847beb9..00f362af23 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -5691,7 +5691,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
                          float_status *status)
 {
     bool aSign, zSign;
-    int32_t aExp, bExp, expDiff;
+    int32_t aExp, bExp, expDiff, aExpOrig;
     uint64_t aSig0, aSig1, bSig;
     uint64_t q, term0, term1, alternateASig0, alternateASig1;
 
@@ -5700,7 +5700,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
         return floatx80_default_nan(status);
     }
     aSig0 = extractFloatx80Frac( a );
-    aExp = extractFloatx80Exp( a );
+    aExpOrig = aExp = extractFloatx80Exp( a );
     aSign = extractFloatx80Sign( a );
     bSig = extractFloatx80Frac( b );
     bExp = extractFloatx80Exp( b );
@@ -5715,6 +5715,11 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
         if ((uint64_t)(bSig << 1)) {
             return propagateFloatx80NaN(a, b, status);
         }
+        if (aExp == 0 && aSig0 >> 63) {
+            /* Pseudo-denormal argument must be returned in normalized
+             * form.  */
+            return packFloatx80(aSign, 1, aSig0);
+        }
         return a;
     }
     if ( bExp == 0 ) {
@@ -5734,7 +5739,14 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
     expDiff = aExp - bExp;
     aSig1 = 0;
     if ( expDiff < 0 ) {
-        if ( mod || expDiff < -1 ) return a;
+        if ( mod || expDiff < -1 ) {
+            if (aExp == 1 && aExpOrig == 0) {
+                /* Pseudo-denormal argument must be returned in
+                 * normalized form.  */
+                return packFloatx80(aSign, aExp, aSig0);
+            }
+            return a;
+        }
         shift128Right( aSig0, 0, 1, &aSig0, &aSig1 );
         expDiff = 0;
     }
-- 
2.17.1


-- 
Joseph S. Myers
joseph@codesourcery.com


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/7] softfloat: do not set denominator high bit for floatx80 remainder
  2020-06-05 18:58 [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes Joseph Myers
                   ` (2 preceding siblings ...)
  2020-06-05 19:00 ` [PATCH 3/7] softfloat: do not return pseudo-denormal from floatx80 remainder Joseph Myers
@ 2020-06-05 19:01 ` Joseph Myers
  2020-06-06 19:08   ` Richard Henderson
  2020-06-05 19:01 ` [PATCH 5/7] softfloat: return low bits of quotient from floatx80_modrem Joseph Myers
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Joseph Myers @ 2020-06-05 19:01 UTC (permalink / raw)
  To: qemu-devel, aurelien, peter.maydell, alex.bennee, laurent,
	pbonzini, rth, ehabkost

The floatx80 remainder implementation unnecessarily sets the high bit
of bSig explicitly.  By that point in the function, arguments that are
invalid, zero, infinity or NaN have already been handled and
subnormals have been through normalizeFloatx80Subnormal, so the high
bit will already be set.  Remove the unnecessary code.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
---
 fpu/softfloat.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 00f362af23..423a815196 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -5734,7 +5734,6 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
         if ( aSig0 == 0 ) return a;
         normalizeFloatx80Subnormal( aSig0, &aExp, &aSig0 );
     }
-    bSig |= UINT64_C(0x8000000000000000);
     zSign = aSign;
     expDiff = aExp - bExp;
     aSig1 = 0;
-- 
2.17.1


-- 
Joseph S. Myers
joseph@codesourcery.com


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/7] softfloat: return low bits of quotient from floatx80_modrem
  2020-06-05 18:58 [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes Joseph Myers
                   ` (3 preceding siblings ...)
  2020-06-05 19:01 ` [PATCH 4/7] softfloat: do not set denominator high bit for " Joseph Myers
@ 2020-06-05 19:01 ` Joseph Myers
  2020-06-06 19:15   ` Richard Henderson
  2020-06-05 19:02 ` [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations Joseph Myers
  2020-06-05 19:02 ` [PATCH 7/7] target/i386: reimplement fprem " Joseph Myers
  6 siblings, 1 reply; 17+ messages in thread
From: Joseph Myers @ 2020-06-05 19:01 UTC (permalink / raw)
  To: qemu-devel, aurelien, peter.maydell, alex.bennee, laurent,
	pbonzini, rth, ehabkost

Both x87 and m68k need the low parts of the quotient for their
remainder operations.  Arrange for floatx80_modrem to track those bits
and return them via a pointer.

The architectures using float32_rem and float64_rem do not appear to
need this information, so the *_rem interface is left unchanged and
the information returned only from floatx80_modrem.  The logic used to
determine the low 7 bits of the quotient for m68k
(target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it
looks at the result of converting the remainder to integer, the
quotient having been discarded by that point); this patch does not
change that, but the m68k maintainers may wish to do so.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
---
 fpu/softfloat.c         | 23 ++++++++++++++++++-----
 include/fpu/softfloat.h |  3 ++-
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 423a815196..c3c3f382af 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -5684,10 +5684,11 @@ floatx80 floatx80_div(floatx80 a, floatx80 b, float_status *status)
 | `a' with respect to the corresponding value `b'.  The operation is performed
 | according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic,
 | if 'mod' is false; if 'mod' is true, return the remainder based on truncating
-| the quotient toward zero instead.
+| the quotient toward zero instead.  '*quotient' is set to the low 64 bits of
+| the absolute value of the integer quotient.
 *----------------------------------------------------------------------------*/
 
-floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
+floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, uint64_t *quotient,
                          float_status *status)
 {
     bool aSign, zSign;
@@ -5695,6 +5696,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
     uint64_t aSig0, aSig1, bSig;
     uint64_t q, term0, term1, alternateASig0, alternateASig1;
 
+    *quotient = 0;
     if (floatx80_invalid_encoding(a) || floatx80_invalid_encoding(b)) {
         float_raise(float_flag_invalid, status);
         return floatx80_default_nan(status);
@@ -5749,7 +5751,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
         shift128Right( aSig0, 0, 1, &aSig0, &aSig1 );
         expDiff = 0;
     }
-    q = ( bSig <= aSig0 );
+    *quotient = q = ( bSig <= aSig0 );
     if ( q ) aSig0 -= bSig;
     expDiff -= 64;
     while ( 0 < expDiff ) {
@@ -5759,6 +5761,8 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
         sub128( aSig0, aSig1, term0, term1, &aSig0, &aSig1 );
         shortShift128Left( aSig0, aSig1, 62, &aSig0, &aSig1 );
         expDiff -= 62;
+        *quotient <<= 62;
+        *quotient += q;
     }
     expDiff += 64;
     if ( 0 < expDiff ) {
@@ -5772,6 +5776,12 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
             ++q;
             sub128( aSig0, aSig1, term0, term1, &aSig0, &aSig1 );
         }
+        if (expDiff < 64) {
+            *quotient <<= expDiff;
+        } else {
+            *quotient = 0;
+        }
+        *quotient += q;
     }
     else {
         term1 = 0;
@@ -5786,6 +5796,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
             aSig0 = alternateASig0;
             aSig1 = alternateASig1;
             zSign = ! zSign;
+            ++*quotient;
         }
     }
     return
@@ -5802,7 +5813,8 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod,
 
 floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status)
 {
-    return floatx80_modrem(a, b, false, status);
+    uint64_t quotient;
+    return floatx80_modrem(a, b, false, &quotient, status);
 }
 
 /*----------------------------------------------------------------------------
@@ -5813,7 +5825,8 @@ floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status)
 
 floatx80 floatx80_mod(floatx80 a, floatx80 b, float_status *status)
 {
-    return floatx80_modrem(a, b, true, status);
+    uint64_t quotient;
+    return floatx80_modrem(a, b, true, &quotient, status);
 }
 
 /*----------------------------------------------------------------------------
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index bff6934d09..ff4e2605b1 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -687,7 +687,8 @@ floatx80 floatx80_add(floatx80, floatx80, float_status *status);
 floatx80 floatx80_sub(floatx80, floatx80, float_status *status);
 floatx80 floatx80_mul(floatx80, floatx80, float_status *status);
 floatx80 floatx80_div(floatx80, floatx80, float_status *status);
-floatx80 floatx80_modrem(floatx80, floatx80, bool, float_status *status);
+floatx80 floatx80_modrem(floatx80, floatx80, bool, uint64_t *,
+                         float_status *status);
 floatx80 floatx80_mod(floatx80, floatx80, float_status *status);
 floatx80 floatx80_rem(floatx80, floatx80, float_status *status);
 floatx80 floatx80_sqrt(floatx80, float_status *status);
-- 
2.17.1


-- 
Joseph S. Myers
joseph@codesourcery.com


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations
  2020-06-05 18:58 [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes Joseph Myers
                   ` (4 preceding siblings ...)
  2020-06-05 19:01 ` [PATCH 5/7] softfloat: return low bits of quotient from floatx80_modrem Joseph Myers
@ 2020-06-05 19:02 ` Joseph Myers
  2020-06-06 19:16   ` Richard Henderson
  2020-06-08 16:46   ` Alex Bennée
  2020-06-05 19:02 ` [PATCH 7/7] target/i386: reimplement fprem " Joseph Myers
  6 siblings, 2 replies; 17+ messages in thread
From: Joseph Myers @ 2020-06-05 19:02 UTC (permalink / raw)
  To: qemu-devel, aurelien, peter.maydell, alex.bennee, laurent,
	pbonzini, rth, ehabkost

The x87 fprem1 emulation is currently based around conversion to
double, which is inherently unsuitable for a good emulation of any
floatx80 operation.  Reimplement using the soft-float floatx80
remainder operations.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
---
 target/i386/fpu_helper.c | 96 +++++++++++++++++++---------------------
 1 file changed, 45 insertions(+), 51 deletions(-)

diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c
index 8ef5b463ea..bab35e00a0 100644
--- a/target/i386/fpu_helper.c
+++ b/target/i386/fpu_helper.c
@@ -934,63 +934,57 @@ void helper_fxtract(CPUX86State *env)
     merge_exception_flags(env, old_flags);
 }
 
-void helper_fprem1(CPUX86State *env)
+static void helper_fprem_common(CPUX86State *env, bool mod)
 {
-    double st0, st1, dblq, fpsrcop, fptemp;
-    CPU_LDoubleU fpsrcop1, fptemp1;
-    int expdif;
-    signed long long int q;
-
-    st0 = floatx80_to_double(env, ST0);
-    st1 = floatx80_to_double(env, ST1);
-
-    if (isinf(st0) || isnan(st0) || isnan(st1) || (st1 == 0.0)) {
-        ST0 = double_to_floatx80(env, 0.0 / 0.0); /* NaN */
-        env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
-        return;
-    }
-
-    fpsrcop = st0;
-    fptemp = st1;
-    fpsrcop1.d = ST0;
-    fptemp1.d = ST1;
-    expdif = EXPD(fpsrcop1) - EXPD(fptemp1);
-
-    if (expdif < 0) {
-        /* optimisation? taken from the AMD docs */
-        env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
-        /* ST0 is unchanged */
-        return;
-    }
+    uint8_t old_flags = save_exception_flags(env);
+    uint64_t quotient;
+    CPU_LDoubleU temp0, temp1;
+    int exp0, exp1, expdiff;
 
-    if (expdif < 53) {
-        dblq = fpsrcop / fptemp;
-        /* round dblq towards nearest integer */
-        dblq = rint(dblq);
-        st0 = fpsrcop - fptemp * dblq;
+    temp0.d = ST0;
+    temp1.d = ST1;
+    exp0 = EXPD(temp0);
+    exp1 = EXPD(temp1);
 
-        /* convert dblq to q by truncating towards zero */
-        if (dblq < 0.0) {
-            q = (signed long long int)(-dblq);
+    env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
+    if (floatx80_is_zero(ST0) || floatx80_is_zero(ST1) ||
+        exp0 == 0x7fff || exp1 == 0x7fff ||
+        floatx80_invalid_encoding(ST0) || floatx80_invalid_encoding(ST1)) {
+        ST0 = floatx80_modrem(ST0, ST1, mod, &quotient, &env->fp_status);
+    } else {
+        if (exp0 == 0) {
+            exp0 = 1 - clz64(temp0.l.lower);
+        }
+        if (exp1 == 0) {
+            exp1 = 1 - clz64(temp1.l.lower);
+        }
+        expdiff = exp0 - exp1;
+        if (expdiff < 64) {
+            ST0 = floatx80_modrem(ST0, ST1, mod, &quotient, &env->fp_status);
+            env->fpus |= (quotient & 0x4) << (8 - 2);  /* (C0) <-- q2 */
+            env->fpus |= (quotient & 0x2) << (14 - 1); /* (C3) <-- q1 */
+            env->fpus |= (quotient & 0x1) << (9 - 0);  /* (C1) <-- q0 */
         } else {
-            q = (signed long long int)dblq;
+            /* Partial remainder.  This choice of how many bits to
+             * process at once is specified in AMD instruction set
+             * manuals, and empirically is followed by Intel
+             * processors as well; it ensures that the final remainder
+             * operation in a loop does produce the correct low three
+             * bits of the quotient.  AMD manuals specify that the
+             * flags other than C2 are cleared, and empirically Intel
+             * processors clear them as well.  */
+            int n = 32 + (expdiff % 32);
+            temp1.d = floatx80_scalbn(temp1.d, expdiff - n, &env->fp_status);
+            ST0 = floatx80_mod(ST0, temp1.d, &env->fp_status);
+            env->fpus |= 0x400;  /* C2 <-- 1 */
         }
-
-        env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
-        /* (C0,C3,C1) <-- (q2,q1,q0) */
-        env->fpus |= (q & 0x4) << (8 - 2);  /* (C0) <-- q2 */
-        env->fpus |= (q & 0x2) << (14 - 1); /* (C3) <-- q1 */
-        env->fpus |= (q & 0x1) << (9 - 0);  /* (C1) <-- q0 */
-    } else {
-        env->fpus |= 0x400;  /* C2 <-- 1 */
-        fptemp = pow(2.0, expdif - 50);
-        fpsrcop = (st0 / st1) / fptemp;
-        /* fpsrcop = integer obtained by chopping */
-        fpsrcop = (fpsrcop < 0.0) ?
-                  -(floor(fabs(fpsrcop))) : floor(fpsrcop);
-        st0 -= (st1 * fpsrcop * fptemp);
     }
-    ST0 = double_to_floatx80(env, st0);
+    merge_exception_flags(env, old_flags);
+}
+
+void helper_fprem1(CPUX86State *env)
+{
+    helper_fprem_common(env, false);
 }
 
 void helper_fprem(CPUX86State *env)
-- 
2.17.1


-- 
Joseph S. Myers
joseph@codesourcery.com


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 7/7] target/i386: reimplement fprem using floatx80 operations
  2020-06-05 18:58 [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes Joseph Myers
                   ` (5 preceding siblings ...)
  2020-06-05 19:02 ` [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations Joseph Myers
@ 2020-06-05 19:02 ` Joseph Myers
  6 siblings, 0 replies; 17+ messages in thread
From: Joseph Myers @ 2020-06-05 19:02 UTC (permalink / raw)
  To: qemu-devel, aurelien, peter.maydell, alex.bennee, laurent,
	pbonzini, rth, ehabkost

The x87 fprem emulation is currently based around conversion to
double, which is inherently unsuitable for a good emulation of any
floatx80 operation.  Reimplement using the soft-float floatx80
remainder operations.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
---
 target/i386/fpu_helper.c | 58 +---------------------------------------
 1 file changed, 1 insertion(+), 57 deletions(-)

diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c
index bab35e00a0..d2fc2c1dde 100644
--- a/target/i386/fpu_helper.c
+++ b/target/i386/fpu_helper.c
@@ -989,63 +989,7 @@ void helper_fprem1(CPUX86State *env)
 
 void helper_fprem(CPUX86State *env)
 {
-    double st0, st1, dblq, fpsrcop, fptemp;
-    CPU_LDoubleU fpsrcop1, fptemp1;
-    int expdif;
-    signed long long int q;
-
-    st0 = floatx80_to_double(env, ST0);
-    st1 = floatx80_to_double(env, ST1);
-
-    if (isinf(st0) || isnan(st0) || isnan(st1) || (st1 == 0.0)) {
-        ST0 = double_to_floatx80(env, 0.0 / 0.0); /* NaN */
-        env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
-        return;
-    }
-
-    fpsrcop = st0;
-    fptemp = st1;
-    fpsrcop1.d = ST0;
-    fptemp1.d = ST1;
-    expdif = EXPD(fpsrcop1) - EXPD(fptemp1);
-
-    if (expdif < 0) {
-        /* optimisation? taken from the AMD docs */
-        env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
-        /* ST0 is unchanged */
-        return;
-    }
-
-    if (expdif < 53) {
-        dblq = fpsrcop / fptemp; /* ST0 / ST1 */
-        /* round dblq towards zero */
-        dblq = (dblq < 0.0) ? ceil(dblq) : floor(dblq);
-        st0 = fpsrcop - fptemp * dblq; /* fpsrcop is ST0 */
-
-        /* convert dblq to q by truncating towards zero */
-        if (dblq < 0.0) {
-            q = (signed long long int)(-dblq);
-        } else {
-            q = (signed long long int)dblq;
-        }
-
-        env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
-        /* (C0,C3,C1) <-- (q2,q1,q0) */
-        env->fpus |= (q & 0x4) << (8 - 2);  /* (C0) <-- q2 */
-        env->fpus |= (q & 0x2) << (14 - 1); /* (C3) <-- q1 */
-        env->fpus |= (q & 0x1) << (9 - 0);  /* (C1) <-- q0 */
-    } else {
-        int N = 32 + (expdif % 32); /* as per AMD docs */
-
-        env->fpus |= 0x400;  /* C2 <-- 1 */
-        fptemp = pow(2.0, (double)(expdif - N));
-        fpsrcop = (st0 / st1) / fptemp;
-        /* fpsrcop = integer obtained by chopping */
-        fpsrcop = (fpsrcop < 0.0) ?
-                  -(floor(fabs(fpsrcop))) : floor(fpsrcop);
-        st0 -= (st1 * fpsrcop * fptemp);
-    }
-    ST0 = double_to_floatx80(env, st0);
+    helper_fprem_common(env, true);
 }
 
 void helper_fyl2xp1(CPUX86State *env)
-- 
2.17.1


-- 
Joseph S. Myers
joseph@codesourcery.com


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/7] softfloat: merge floatx80_mod and floatx80_rem
  2020-06-05 18:59 ` [PATCH 1/7] softfloat: merge floatx80_mod and floatx80_rem Joseph Myers
@ 2020-06-06 19:06   ` Richard Henderson
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-06 19:06 UTC (permalink / raw)
  To: Joseph Myers, qemu-devel, aurelien, peter.maydell, alex.bennee,
	laurent, pbonzini, rth, ehabkost

On 6/5/20 11:59 AM, Joseph Myers wrote:
> The m68k-specific softfloat code includes a function floatx80_mod that
> is extremely similar to floatx80_rem, but computing the remainder
> based on truncating the quotient toward zero rather than rounding it
> to nearest integer.  This is also useful for emulating the x87 fprem
> and fprem1 instructions.  Change the floatx80_rem implementation into
> floatx80_modrem that can perform either operation, with both
> floatx80_rem and floatx80_mod as thin wrappers available for all
> targets.
> 
> There does not appear to be any use for the _mod operation for other
> floating-point formats in QEMU (the only other architectures using
> _rem at all are linux-user/arm/nwfpe, for FPA emulation, and openrisc,
> for instructions that have been removed in the latest version of the
> architecture), so no change is made to the code for other formats.
> 
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
> ---
>  fpu/softfloat.c         | 49 ++++++++++++++++++------
>  include/fpu/softfloat.h |  2 +
>  target/m68k/softfloat.c | 83 -----------------------------------------
>  target/m68k/softfloat.h |  1 -
>  4 files changed, 40 insertions(+), 95 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/7] softfloat: fix floatx80 remainder pseudo-denormal check for zero
  2020-06-05 19:00 ` [PATCH 2/7] softfloat: fix floatx80 remainder pseudo-denormal check for zero Joseph Myers
@ 2020-06-06 19:06   ` Richard Henderson
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-06 19:06 UTC (permalink / raw)
  To: Joseph Myers, qemu-devel, aurelien, peter.maydell, alex.bennee,
	laurent, pbonzini, rth, ehabkost

On 6/5/20 12:00 PM, Joseph Myers wrote:
> The floatx80 remainder implementation ignores the high bit of the
> significand when checking whether an operand (numerator) with zero
> exponent is zero.  This means it mishandles a pseudo-denormal
> representation of 0x1p-16382L by treating it as zero.  Fix this by
> checking the whole significand instead.
> 
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
> ---
>  fpu/softfloat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/7] softfloat: do not return pseudo-denormal from floatx80 remainder
  2020-06-05 19:00 ` [PATCH 3/7] softfloat: do not return pseudo-denormal from floatx80 remainder Joseph Myers
@ 2020-06-06 19:08   ` Richard Henderson
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-06 19:08 UTC (permalink / raw)
  To: Joseph Myers, qemu-devel, aurelien, peter.maydell, alex.bennee,
	laurent, pbonzini, rth, ehabkost

On 6/5/20 12:00 PM, Joseph Myers wrote:
> The floatx80 remainder implementation sometimes returns the numerator
> unchanged when the denominator is sufficiently larger than the
> numerator.  But if the value to be returned unchanged is a
> pseudo-denormal, that is incorrect.  Fix it to normalize the numerator
> in that case.
> 
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>


> +        if (aExp == 0 && aSig0 >> 63) {
> +            /* Pseudo-denormal argument must be returned in normalized
> +             * form.  */
> +            return packFloatx80(aSign, 1, aSig0);

We're now standardizing on

    /*
     * muli-line
     * comment
     */

Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/7] softfloat: do not set denominator high bit for floatx80 remainder
  2020-06-05 19:01 ` [PATCH 4/7] softfloat: do not set denominator high bit for " Joseph Myers
@ 2020-06-06 19:08   ` Richard Henderson
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-06 19:08 UTC (permalink / raw)
  To: Joseph Myers, qemu-devel, aurelien, peter.maydell, alex.bennee,
	laurent, pbonzini, rth, ehabkost

On 6/5/20 12:01 PM, Joseph Myers wrote:
> The floatx80 remainder implementation unnecessarily sets the high bit
> of bSig explicitly.  By that point in the function, arguments that are
> invalid, zero, infinity or NaN have already been handled and
> subnormals have been through normalizeFloatx80Subnormal, so the high
> bit will already be set.  Remove the unnecessary code.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/7] softfloat: return low bits of quotient from floatx80_modrem
  2020-06-05 19:01 ` [PATCH 5/7] softfloat: return low bits of quotient from floatx80_modrem Joseph Myers
@ 2020-06-06 19:15   ` Richard Henderson
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-06 19:15 UTC (permalink / raw)
  To: Joseph Myers, qemu-devel, aurelien, peter.maydell, alex.bennee,
	laurent, pbonzini, rth, ehabkost

On 6/5/20 12:01 PM, Joseph Myers wrote:
> Both x87 and m68k need the low parts of the quotient for their
> remainder operations.  Arrange for floatx80_modrem to track those bits
> and return them via a pointer.
> 
> The architectures using float32_rem and float64_rem do not appear to
> need this information, so the *_rem interface is left unchanged and
> the information returned only from floatx80_modrem.  The logic used to
> determine the low 7 bits of the quotient for m68k
> (target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it
> looks at the result of converting the remainder to integer, the
> quotient having been discarded by that point); this patch does not
> change that, but the m68k maintainers may wish to do so.
> 
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
> ---
>  fpu/softfloat.c         | 23 ++++++++++++++++++-----
>  include/fpu/softfloat.h |  3 ++-
>  2 files changed, 20 insertions(+), 6 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations
  2020-06-05 19:02 ` [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations Joseph Myers
@ 2020-06-06 19:16   ` Richard Henderson
  2020-06-08 16:46   ` Alex Bennée
  1 sibling, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-06 19:16 UTC (permalink / raw)
  To: Joseph Myers, qemu-devel, aurelien, peter.maydell, alex.bennee,
	laurent, pbonzini, rth, ehabkost

On 6/5/20 12:02 PM, Joseph Myers wrote:
> The x87 fprem1 emulation is currently based around conversion to
> double, which is inherently unsuitable for a good emulation of any
> floatx80 operation.  Reimplement using the soft-float floatx80
> remainder operations.
> 
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
> ---
>  target/i386/fpu_helper.c | 96 +++++++++++++++++++---------------------
>  1 file changed, 45 insertions(+), 51 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

I would merge patch 7 into this one; I don't see value in keeping them separate.


r~


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations
  2020-06-05 19:02 ` [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations Joseph Myers
  2020-06-06 19:16   ` Richard Henderson
@ 2020-06-08 16:46   ` Alex Bennée
  2020-06-08 16:51     ` Joseph Myers
  1 sibling, 1 reply; 17+ messages in thread
From: Alex Bennée @ 2020-06-08 16:46 UTC (permalink / raw)
  To: Joseph Myers
  Cc: peter.maydell, ehabkost, qemu-devel, laurent, pbonzini, aurelien, rth


Joseph Myers <joseph@codesourcery.com> writes:

> The x87 fprem1 emulation is currently based around conversion to
> double, which is inherently unsuitable for a good emulation of any
> floatx80 operation.  Reimplement using the soft-float floatx80
> remainder operations.
>
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
> ---
>  target/i386/fpu_helper.c | 96 +++++++++++++++++++---------------------
>  1 file changed, 45 insertions(+), 51 deletions(-)
>
> diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c
> index 8ef5b463ea..bab35e00a0 100644
> --- a/target/i386/fpu_helper.c
> +++ b/target/i386/fpu_helper.c
> @@ -934,63 +934,57 @@ void helper_fxtract(CPUX86State *env)
>      merge_exception_flags(env, old_flags);
>  }
>  
> -void helper_fprem1(CPUX86State *env)
> +static void helper_fprem_common(CPUX86State *env, bool mod)
>  {
> -    double st0, st1, dblq, fpsrcop, fptemp;
> -    CPU_LDoubleU fpsrcop1, fptemp1;
> -    int expdif;
> -    signed long long int q;
> -
> -    st0 = floatx80_to_double(env, ST0);
> -    st1 = floatx80_to_double(env, ST1);
> -
> -    if (isinf(st0) || isnan(st0) || isnan(st1) || (st1 == 0.0)) {
> -        ST0 = double_to_floatx80(env, 0.0 / 0.0); /* NaN */
> -        env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
> -        return;
> -    }
> -
> -    fpsrcop = st0;
> -    fptemp = st1;
> -    fpsrcop1.d = ST0;
> -    fptemp1.d = ST1;
> -    expdif = EXPD(fpsrcop1) - EXPD(fptemp1);
> -
> -    if (expdif < 0) {
> -        /* optimisation? taken from the AMD docs */
> -        env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */
> -        /* ST0 is unchanged */
> -        return;
> -    }
> +    uint8_t old_flags = save_exception_flags(env);

Hmm where did this come from:

/home/alex/lsrc/qemu.git/target/i386/fpu_helper.c: In function ‘helper_fprem_common’:
/home/alex/lsrc/qemu.git/target/i386/fpu_helper.c:784:25: error: implicit declaration of function ‘save_exception_flags’; did you mean ‘raise_exception_ra’? [-Werror=implicit-function-declaration]
     uint8_t old_flags = save_exception_flags(env);
                         ^~~~~~~~~~~~~~~~~~~~
                         raise_exception_ra
/home/alex/lsrc/qemu.git/target/i386/fpu_helper.c:784:25: error: nested extern declaration of ‘save_exception_flags’ [-Werror=nested-externs]
/home/alex/lsrc/qemu.git/target/i386/fpu_helper.c:827:5: error: implicit declaration of function ‘merge_exception_flags’; did you mean ‘get_float_exception_flags’? [-Werror=implicit-function-declaration]
     merge_exception_flags(env, old_flags);
     ^~~~~~~~~~~~~~~~~~~~~
     get_float_exception_flags
/home/alex/lsrc/qemu.git/target/i386/fpu_helper.c:827:5: error: nested extern declaration of ‘merge_exception_flags’ [-Werror=nested-externs]
cc1: all warnings being treated as errors

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations
  2020-06-08 16:46   ` Alex Bennée
@ 2020-06-08 16:51     ` Joseph Myers
  2020-06-08 21:03       ` Alex Bennée
  0 siblings, 1 reply; 17+ messages in thread
From: Joseph Myers @ 2020-06-08 16:51 UTC (permalink / raw)
  To: Alex Bennée
  Cc: peter.maydell, ehabkost, qemu-devel, laurent, pbonzini, aurelien, rth

[-- Attachment #1: Type: text/plain, Size: 383 bytes --]

On Mon, 8 Jun 2020, Alex Bennée wrote:

> > +    uint8_t old_flags = save_exception_flags(env);
> 
> Hmm where did this come from:

This series assumes all my other recent x87 fixes (11 such patches in 
three series that aren't yet on master, there's also a single patch for 
pcmpxstrx which is independent of those) are already present.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations
  2020-06-08 16:51     ` Joseph Myers
@ 2020-06-08 21:03       ` Alex Bennée
  0 siblings, 0 replies; 17+ messages in thread
From: Alex Bennée @ 2020-06-08 21:03 UTC (permalink / raw)
  To: Joseph Myers
  Cc: peter.maydell, ehabkost, qemu-devel, laurent, pbonzini, aurelien, rth


Joseph Myers <joseph@codesourcery.com> writes:

> On Mon, 8 Jun 2020, Alex Bennée wrote:
>
>> > +    uint8_t old_flags = save_exception_flags(env);
>> 
>> Hmm where did this come from:
>
> This series assumes all my other recent x87 fixes (11 such patches in 
> three series that aren't yet on master, there's also a single patch for 
> pcmpxstrx which is independent of those) are already present.

Ahh - it's ok to keep a patch that is currently queued in your series
until such time as it's merged. You can also just reference the original
series in your cover letter or supply a branch reference to aid with
application.

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-06-08 21:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-05 18:58 [PATCH 0/7] softfloat, target/i386: fprem, fprem1 fixes Joseph Myers
2020-06-05 18:59 ` [PATCH 1/7] softfloat: merge floatx80_mod and floatx80_rem Joseph Myers
2020-06-06 19:06   ` Richard Henderson
2020-06-05 19:00 ` [PATCH 2/7] softfloat: fix floatx80 remainder pseudo-denormal check for zero Joseph Myers
2020-06-06 19:06   ` Richard Henderson
2020-06-05 19:00 ` [PATCH 3/7] softfloat: do not return pseudo-denormal from floatx80 remainder Joseph Myers
2020-06-06 19:08   ` Richard Henderson
2020-06-05 19:01 ` [PATCH 4/7] softfloat: do not set denominator high bit for " Joseph Myers
2020-06-06 19:08   ` Richard Henderson
2020-06-05 19:01 ` [PATCH 5/7] softfloat: return low bits of quotient from floatx80_modrem Joseph Myers
2020-06-06 19:15   ` Richard Henderson
2020-06-05 19:02 ` [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations Joseph Myers
2020-06-06 19:16   ` Richard Henderson
2020-06-08 16:46   ` Alex Bennée
2020-06-08 16:51     ` Joseph Myers
2020-06-08 21:03       ` Alex Bennée
2020-06-05 19:02 ` [PATCH 7/7] target/i386: reimplement fprem " Joseph Myers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.