All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Emilio G. Cota" <cota@braap.org>
To: qemu-devel@nongnu.org
Cc: "Alex Bennée" <alex.bennee@linaro.org>,
	"Richard Henderson" <richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v6 12/13] hardfloat: implement float32/64 square root
Date: Sat, 24 Nov 2018 18:55:52 -0500	[thread overview]
Message-ID: <20181124235553.17371-13-cota@braap.org> (raw)
In-Reply-To: <20181124235553.17371-1-cota@braap.org>

Performance results for fp-bench:

Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
sqrt-single: 42.30 MFlops
sqrt-double: 22.97 MFlops
- after:
sqrt-single: 311.42 MFlops
sqrt-double: 311.08 MFlops

Here USE_FP makes a huge difference for f64's, with throughput
going from ~200 MFlops to ~300 MFlops.

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 fpu/softfloat.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 58 insertions(+), 2 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index e03feafb6f..4c6ecd1883 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3040,20 +3040,76 @@ float16 QEMU_FLATTEN float16_sqrt(float16 a, float_status *status)
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 QEMU_FLATTEN float32_sqrt(float32 a, float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_f32_sqrt(float32 a, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pr = sqrt_float(pa, status, &float32_params);
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 QEMU_FLATTEN float64_sqrt(float64 a, float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_f64_sqrt(float64 a, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pr = sqrt_float(pa, status, &float64_params);
     return float64_round_pack_canonical(pr, status);
 }
 
+float32 QEMU_FLATTEN float32_sqrt(float32 xa, float_status *s)
+{
+    union_float32 ua, ur;
+
+    ua.s = xa;
+    if (unlikely(!can_use_fpu(s))) {
+        goto soft;
+    }
+
+    float32_input_flush1(&ua.s, s);
+    if (QEMU_HARDFLOAT_1F32_USE_FP) {
+        if (unlikely(!(fpclassify(ua.h) == FP_NORMAL ||
+                       fpclassify(ua.h) == FP_ZERO) ||
+                     signbit(ua.h))) {
+            goto soft;
+        }
+    } else if (unlikely(!float32_is_zero_or_normal(ua.s) ||
+                        float32_is_neg(ua.s))) {
+        goto soft;
+    }
+    ur.h = sqrtf(ua.h);
+    return ur.s;
+
+ soft:
+    return soft_f32_sqrt(ua.s, s);
+}
+
+float64 QEMU_FLATTEN float64_sqrt(float64 xa, float_status *s)
+{
+    union_float64 ua, ur;
+
+    ua.s = xa;
+    if (unlikely(!can_use_fpu(s))) {
+        goto soft;
+    }
+
+    float64_input_flush1(&ua.s, s);
+    if (QEMU_HARDFLOAT_1F64_USE_FP) {
+        if (unlikely(!(fpclassify(ua.h) == FP_NORMAL ||
+                       fpclassify(ua.h) == FP_ZERO) ||
+                     signbit(ua.h))) {
+            goto soft;
+        }
+    } else if (unlikely(!float64_is_zero_or_normal(ua.s) ||
+                        float64_is_neg(ua.s))) {
+        goto soft;
+    }
+    ur.h = sqrt(ua.h);
+    return ur.s;
+
+ soft:
+    return soft_f64_sqrt(ua.s, s);
+}
+
 /*----------------------------------------------------------------------------
 | The pattern for a default generated NaN.
 *----------------------------------------------------------------------------*/
-- 
2.17.1

  parent reply	other threads:[~2018-11-24 23:56 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-24 23:55 [Qemu-devel] [PATCH v6 00/13] hardfloat Emilio G. Cota
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 01/13] fp-test: pick TARGET_ARM to get its specialization Emilio G. Cota
2018-12-03 12:13   ` Alex Bennée
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 02/13] softfloat: add float{32, 64}_is_{de, }normal Emilio G. Cota
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 03/13] target/tricore: use float32_is_denormal Emilio G. Cota
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 04/13] softfloat: rename canonicalize to sf_canonicalize Emilio G. Cota
2018-12-03 14:16   ` Alex Bennée
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 05/13] softfloat: add float{32, 64}_is_zero_or_normal Emilio G. Cota
2018-12-03 14:16   ` Alex Bennée
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 06/13] tests/fp: add fp-bench Emilio G. Cota
2018-12-03 14:29   ` Alex Bennée
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 07/13] fpu: introduce hardfloat Emilio G. Cota
2018-11-25  0:25   ` Aleksandar Markovic
2018-11-25  1:25     ` Emilio G. Cota
2018-12-04 12:28   ` Alex Bennée
2018-12-04 13:33     ` Richard Henderson
2018-12-04 13:52       ` Alex Bennée
2018-12-04 17:31         ` Emilio G. Cota
2018-12-04 19:08           ` Alex Bennée
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 08/13] hardfloat: implement float32/64 addition and subtraction Emilio G. Cota
2018-12-04 18:34   ` Alex Bennée
2018-12-04 20:07     ` Emilio G. Cota
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 09/13] hardfloat: implement float32/64 multiplication Emilio G. Cota
2018-12-05 10:10   ` Alex Bennée
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 10/13] hardfloat: implement float32/64 division Emilio G. Cota
2018-12-05 10:11   ` Alex Bennée
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 11/13] hardfloat: implement float32/64 fused multiply-add Emilio G. Cota
2018-12-05 12:25   ` Alex Bennée
2018-11-24 23:55 ` Emilio G. Cota [this message]
2018-12-05 12:26   ` [Qemu-devel] [PATCH v6 12/13] hardfloat: implement float32/64 square root Alex Bennée
2018-11-24 23:55 ` [Qemu-devel] [PATCH v6 13/13] hardfloat: implement float32/64 comparison Emilio G. Cota
2018-12-05 12:36   ` Alex Bennée
2018-11-27 17:24 ` [Qemu-devel] [PATCH v6 00/13] hardfloat no-reply
2018-11-27 17:52   ` Emilio G. Cota
2018-11-27 17:32 ` no-reply
2018-12-05 12:41 ` Alex Bennée
2018-12-05 16:47   ` Emilio G. Cota

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181124235553.17371-13-cota@braap.org \
    --to=cota@braap.org \
    --cc=alex.bennee@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.