All of lore.kernel.org
 help / color / mirror / Atom feed
* An first try to improve PPC float simulation, not even compiled. Just ask question.
@ 2020-05-01 20:04 罗勇刚(Yonggang Luo)
  2020-05-01 20:49 ` Richard Henderson
  0 siblings, 1 reply; 11+ messages in thread
From: 罗勇刚(Yonggang Luo) @ 2020-05-01 20:04 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, Alex Bennée

[-- Attachment #1: Type: text/plain, Size: 16979 bytes --]

/*
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| Bits  | Name   | Description
                                                             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 32    | FE     | Floating-point exception summary. Every floating-point
instruction, except mtfsfi and mtfsf, implicitly sets FX      |
|       |        |  if that instruction causes any of the floating-point
exception bits in the FPSCR to change from 0 to 1. mcrfs,      |
|       |        |  mtfsfi, mtfsf, mtfsb0, and mtfsb1 can alter FPSCR[FX]
explicitly.                                                   |
|       |        |  Note: (Programming) FPSCR[FX] is defined not to be
altered implicitly by mtfsfi and mtfsf because                   |
|       |        |  permitting these instructions to alter FPSCR[FX]
implicitly could cause a paradox. An example is an                 |
|       |        |  mtfsfi or mtfsf that supplies 0 for FPSCR[FX] and 1 for
FPSCR[OX] and executes when FPSCR[OX] = 0.                  |
|       |        |  See also the programming notes with the definition of
these two instructions.                                       |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 33    | FEX    | Floating-point enabled exception summary. FEX is the OR
of all the floating-point exception bits masked by           |
|       |        |  their respective enable bits. mcrfs, mtfsfi, mtfsf,
mtfsb0, and mtfsb1 cannot alter FPSCR[FEX] explicitly.          |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 34    | VX     | Floating-point invalid operation exception summary. VX
is the OR of all the invalid operation exception bits.        |
|       |        |  mcrfs, mtfsfi, mtfsf, mtfsb0, and mtfsb1 cannot alter
FPSCR[VX] explicitly.                                         |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 35    | OX     | Floating-point overflow exception. See Section
5.6.1.7.3, “Overflow Exception.”                                      |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 36    | UX     | Floating-point underflow exception. See Section
5.6.1.7.4, “Underflow Exception.”                                    |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 37    | ZX     | Floating-Point zero divide exception. See Section
5.6.1.7.2, “Zero Divide Exception.”                                |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 38    | XX     | Floating-point inexact exception. See Section 5.6.1.7.5,
“Inexact Exception.”                                        |
|       |        |  FPSCR[XX] is a sticky version of FPSCR[FI] (see below).
Thus the following rules completely describe how            |
|       |        |  FPSCR[XX] is set by a given instruction:
                                                             |
|       |        |  • If the instruction affects FPSCR[FI], the new
FPSCR[XX] value is obtained by ORing the old value of               |
|       |        |  FPSCR[XX] with the new value of FPSCR[FI].
                                                             |
|       |        |  • If the instruction does not affect FPSCR[FI], the
value of FPSCR[XX] is unchanged                                 |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 39    | VXSNAN | Floating-point invalid operation exception (SNAN). See
Section 5.6.1.7.1, “Invalid Operation Exception.”             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 40    | VXISI  | floating-point invalid operation exception (∞ − ∞). See
Section 5.6.1.7.1, “Invalid Operation Exception.”.           |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 41    | VXIDI  | Floating-point invalid operation exception ( ∞ ÷ ∞).See
Section 5.6.1.7.1, “Invalid Operation Exception.”.           |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 42    | VXZDZ  | Floating-point invalid operation exception (0 ÷ 0) See
Section 5.6.1.7.1, “Invalid Operation Exception.”.            |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 43    | VXIMZ  | Floating-point invalid operation exception (∞ ×0). See
Section 5.6.1.7.1, “Invalid Operation Exception.”.            |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 44    | VXVC   | Floating-point invalid operation exception (invalid
compare). See Section 5.6.1.7.1, “Invalid Operation Exception.”. |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 45    | FR     | Floating-point fraction rounded. The last arithmetic or
rounding and conversion instruction incremented the          |
|       |        |  fraction during rounding. See Section 4.4.3.6,
“Rounding.” This bit is not sticky.                                  |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 46    | FI     | Floating-point fraction inexact. The last arithmetic or
rounding and conversion instruction either produced an       |
|       |        |  inexact result during rounding or caused a disabled
overflow exception. See Section 4.4.3.6, “Rounding.” FI is      |
|       |        |  not sticky. See the definition of FPSCR[XX], above,
regarding the relationship between FI and XX.                   |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 47-51 | FPRF   | Floating-point result flags. Arithmetic, rounding, and
convert from integer instructions set FPRF based on the       |
|       |        |  result placed into the target register and on the
target precision, except that if any portion of the result is     |
|       |        |  undefined, the value placed into FPRF is undefined.
Floating-point compare instructions set FPRF based on           |
|       |        |  the relative values of the operands compared. For
convert to integer instructions, the value placed into FPRF       |
|       |        |  is undefined. See Table 4-9.
                                                             |
|       |        |  Note: (Programming) A single-precision operation that
produces a denormalized result sets FPRF to indicate          |
|       |        |  a denormalized number. When possible, single-precision
denormalized numbers are represented in                      |
|       |        |  normalized double format in the target register.
                                                             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 47    | C      | Floating-point result class descriptor. Arithmetic,
rounding, and conversion instructions may set this bit with      |
|       |        |  the FPCC bits, to indicate the class of the result as
shown in Figure 4-9.                                          |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 48-51 | FPCC   | Floating-point condition code. Floating-point Compare
instructions set one of the FPCC bits and clear the            |
|       |        |  other three FPCC bits. Arithmetic, rounding, and
conversion instructions may set the FPCC bits with the C bit       |
|       |        |  to indicate the class of the result. In this case, the
three high-order FPCC bits retain their relational           |
|       |        |  significance indicating that the value is less than,
greater than, or equal to zero.                                |
|       |        |  48 Floating-point less than or negative (FL or <)
                                                            |
|       |        |  49 Floating-point greater than or positive (FG or >)
                                                             |
|       |        |  50 Floating-point equal or zero (FE or =)
                                                            |
|       |        |  51 Floating-point unordered or NaN (FU or ?)
                                                             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 52    | —      | Reserved, should be cleared.
                                                            |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 53    | VXSOFT | Floating-point invalid operation exception (software
request). Can be altered only by mcrfs, mtfsfi, mtfsf,          |
|       |        |  mtfsb0, or mtfsb1
                                                            |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 54    | VXSQRT | Floating-point invalid operation exception (invalid
square root).                                                    |
|       |        |  Note that VXSQRT is defined even for implementations
that do not support either of the two optional                 |
|       |        |  instructions that set it, fsqrt[.] and frsqrte[.].
Defining it for all implementations gives software a standard    |
|       |        |  interface for handling square root exceptions. If an
implementation does not support fsqrt[.] or frsqrte[.],        |
|       |        |  software can simulate the instruction and set VXSQRT to
reflect the exception.                                      |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 55    | VXCVI  | Floating-point invalid operation exception (invalid
integer convert)                                                 |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 56    | VE     | Floating-point invalid operation exception enable
                                                             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 57    | OE     | Floating-point overflow exception enable
                                                            |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 58    | UE     | Floating-point underflow exception enable
                                                             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 59    | ZE     | Floating-point zero divide exception enable
                                                             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 60    | XE     | Floating-point inexact exception enable
                                                             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 61    | NI     | Floating-point non-IEEE mode. If NI = 1, the remaining
FPSCR bits may have meanings other than those given           |
|       |        |  in this document and results of floating-point
operations need not conform to IEEE 754. If the                      |
|       |        |  IEEE-754-conforming result of a floating-point
operation would be a denormalized number, the result of that         |
|       |        |  operation is 0 (with the same sign as the denormalized
number) if FPSCR[NI] = 1 and other requirements              |
|       |        |  specified in the user’s manual for the implementation
are met. The other effects of setting NI may differ among     |
|       |        |  implementations.
                                                             |
|       |        | Setting NI is intended to permit results to be
approximate and to cause performance to be more predictable           |
|       |        |  and less data-dependent than when NI = 0. For example,
in non-IEEE mode, an implementation returns 0                |
|       |        |  instead of a denormalized number and may return a large
number instead of an infinity. In non-IEEE mode an          |
|       |        |  implementation should provide a means for ensuring that
all results are produced without software assistance        |
|       |        |  (that is, without causing an enabled exception type
program interrupt or a floating-point unimplemented             |
|       |        |  instruction exception type program interrupt and
without invoking an emulation assist). The means may be            |
|       |        |  controlled by one or more other FPSCR bits (recall that
the other FPSCR bits have implementation-dependent          |
|       |        |  meanings if NI = 1).
                                                             |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
| 62-63 | RN     | Floating-point rounding control (RN).
                                                             |
|       |        |  00 Round to nearest
                                                            |
|       |        |  01 Round toward zero
                                                             |
|       |        |  10 Round toward +infinity
                                                            |
|       |        |  11 Round toward –infinity
                                                            |
+-------+--------+----------------------------------------------------------------------------------------------------------------------+
*/

static int ieee_ex_to_ppc(int fexcp)
{
    int ret = 0;

    /* Question? TODO: How to handling Invalid Operation Exception. */
    if (fexcp & float_flag_invalid) {
        ret |= VX;
    }

    if (fexcp & float_flag_overflow) {
        ret |= FP_OX;
    }
    if (fexcp & float_flag_underflow) {
        ret |= FP_UX;
    }
    if (fexcp & float_flag_divbyzero) {
        ret |= FP_ZX;
    }
    if (fexcp & float_flag_inexact) {
        ret |= FP_XX;
        ret |= FP_FI;
    }
    return ret;
}

void helper_update_fpscr(CPUPPCState *env, int op, uintptr_t retaddr)
{
    int tmp = get_float_exception_flags(&env->fp_status);
    if (tmp) {
        tmp = ieee_ex_to_ppc(tmp);
        set_float_exception_flags(0, &env->fp_status);
        if (tmp) {
            env->fpscr |= tmp;
            if (fp_exceptions_enabled(env) && (env->fpscr & FP_FEX)) {
                if (env->fpscr & FP_VE) {
                    raise_exception_err_ra(env, POWERPC_EXCP_PROGRAM,
                                        POWERPC_EXCP_FP | op, retaddr);
                }
            }
        }
    }
}
I found the fpscr  are really complicated, especially abount  Invalid
Operation Exception.
And  fp_status can not represent all the  Invalid Operation Exception flags.
What I need to do to represent all the  Invalid Operation Exception

-- 
         此致
礼
罗勇刚
Yours
    sincerely,
Yonggang Luo

[-- Attachment #2: Type: text/html, Size: 20094 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-01 20:04 An first try to improve PPC float simulation, not even compiled. Just ask question 罗勇刚(Yonggang Luo)
@ 2020-05-01 20:49 ` Richard Henderson
  2020-05-01 22:02   ` 罗勇刚(Yonggang Luo)
  2020-05-03 19:46   ` 罗勇刚(Yonggang Luo)
  0 siblings, 2 replies; 11+ messages in thread
From: Richard Henderson @ 2020-05-01 20:49 UTC (permalink / raw)
  To: luoyonggang, qemu-devel, qemu-ppc, Alex Bennée

On 5/1/20 1:04 PM, 罗勇刚(Yonggang Luo) wrote:
> And  fp_status can not represent all the  Invalid Operation Exception flags.
> What I need to do to represent all the  Invalid Operation Exception

Ideally, we add them to include/fpu/softfloat-types.h, expand the
float_exception_flags field in float_status to match, and generate the new
flags in fpu/softfloat.c.

This would actually help target/tricore as well.


r~


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-01 20:49 ` Richard Henderson
@ 2020-05-01 22:02   ` 罗勇刚(Yonggang Luo)
  2020-05-03 19:46   ` 罗勇刚(Yonggang Luo)
  1 sibling, 0 replies; 11+ messages in thread
From: 罗勇刚(Yonggang Luo) @ 2020-05-01 22:02 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 14722 bytes --]

From b4d6ca1d6376fab1f1be06eb472e10b908887c2b Mon Sep 17 00:00:00 2001
From: Yonggang Luo <luoyonggang@gmail.com>
Date: Sat, 2 May 2020 05:59:25 +0800
Subject: [PATCH] [ppc fp] Step 1. Rearrange the fp helpers to eliminate
 helper_reset_fpstatus(). I've mentioned this before, that it's possible to
 leave the steady-state of env->fp_status.exception_flags == 0, so there's
no
 need for a separate function call.  I suspect this is worth a decent
speedup
 by itself.

---
 target/ppc/fpu_helper.c            | 53 ++----------------------------
 target/ppc/helper.h                |  1 -
 target/ppc/translate/fp-impl.inc.c | 23 -------------
 3 files changed, 3 insertions(+), 74 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index d9a8773ee1..4fc5a7ff1c 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -821,6 +821,9 @@ static void do_float_check_status(CPUPPCState *env,
uintptr_t raddr)
                                    env->error_code, raddr);
         }
     }
+    if (status) {
+        set_float_exception_flags(0, &env->fp_status);
+    }
 }

 void helper_float_check_status(CPUPPCState *env)
@@ -828,11 +831,6 @@ void helper_float_check_status(CPUPPCState *env)
     do_float_check_status(env, GETPC());
 }

-void helper_reset_fpstatus(CPUPPCState *env)
-{
-    set_float_exception_flags(0, &env->fp_status);
-}
-
 static void float_invalid_op_addsub(CPUPPCState *env, bool set_fpcc,
                                     uintptr_t retaddr, int classes)
 {
@@ -2110,9 +2108,6 @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt,
                       \
 {
   \
     ppc_vsr_t t = *xt;
  \
     int i;
  \
-
  \
-    helper_reset_fpstatus(env);
   \
-
  \
     for (i = 0; i < nels; i++) {
  \
         float_status tstat = env->fp_status;
  \
         set_float_exception_flags(0, &tstat);
   \
@@ -2152,8 +2147,6 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode,
     ppc_vsr_t t = *xt;
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
@@ -2189,9 +2182,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
                       \
 {
   \
     ppc_vsr_t t = *xt;
  \
     int i;
  \
-
  \
-    helper_reset_fpstatus(env);
   \
-
  \
     for (i = 0; i < nels; i++) {
  \
         float_status tstat = env->fp_status;
  \
         set_float_exception_flags(0, &tstat);
   \
@@ -2228,13 +2218,11 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t
opcode,
     ppc_vsr_t t = *xt;
     float_status tstat;

-    helper_reset_fpstatus(env);
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
     }

-    set_float_exception_flags(0, &tstat);
     t.f128 = float128_mul(xa->f128, xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;

@@ -2263,9 +2251,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
                        \
 {
    \
     ppc_vsr_t t = *xt;
   \
     int i;
   \
-
   \
-    helper_reset_fpstatus(env);
    \
-
   \
     for (i = 0; i < nels; i++) {
   \
         float_status tstat = env->fp_status;
   \
         set_float_exception_flags(0, &tstat);
    \
@@ -2305,7 +2290,6 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode,
     ppc_vsr_t t = *xt;
     float_status tstat;

-    helper_reset_fpstatus(env);
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
@@ -2342,9 +2326,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
ppc_vsr_t *xb)              \
 {
    \
     ppc_vsr_t t = *xt;
   \
     int i;
   \
-
   \
-    helper_reset_fpstatus(env);
    \
-
   \
     for (i = 0; i < nels; i++) {
   \
         if (unlikely(tp##_is_signaling_nan(xb->fld, &env->fp_status))) {
   \
             float_invalid_op_vxsnan(env, GETPC());
   \
@@ -2382,9 +2363,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
ppc_vsr_t *xb)             \
 {
   \
     ppc_vsr_t t = *xt;
  \
     int i;
  \
-
  \
-    helper_reset_fpstatus(env);
   \
-
  \
     for (i = 0; i < nels; i++) {
  \
         float_status tstat = env->fp_status;
  \
         set_float_exception_flags(0, &tstat);
   \
@@ -2430,9 +2408,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
ppc_vsr_t *xb)             \
 {
   \
     ppc_vsr_t t = *xt;
  \
     int i;
  \
-
  \
-    helper_reset_fpstatus(env);
   \
-
  \
     for (i = 0; i < nels; i++) {
  \
         float_status tstat = env->fp_status;
  \
         set_float_exception_flags(0, &tstat);
   \
@@ -2592,9 +2567,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
                        \
 {
    \
     ppc_vsr_t t = *xt;
   \
     int i;
   \
-
   \
-    helper_reset_fpstatus(env);
    \
-
   \
     for (i = 0; i < nels; i++) {
   \
         float_status tstat = env->fp_status;
   \
         set_float_exception_flags(0, &tstat);
    \
@@ -2765,9 +2737,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
                   \
 {                                                                        \
     uint32_t cc = 0;                                                     \
     bool vxsnan_flag = false, vxvc_flag = false;                         \
-                                                                         \
-    helper_reset_fpstatus(env);                                          \
-                                                                         \
     if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||        \
         float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {        \
         vxsnan_flag = true;                                              \
@@ -2813,9 +2782,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
                  \
 {                                                                       \
     uint32_t cc = 0;                                                    \
     bool vxsnan_flag = false, vxvc_flag = false;                        \
-                                                                        \
-    helper_reset_fpstatus(env);                                         \
-                                                                        \
     if (float128_is_signaling_nan(xa->f128, &env->fp_status) ||         \
         float128_is_signaling_nan(xb->f128, &env->fp_status)) {         \
         vxsnan_flag = true;                                             \
@@ -3177,9 +3143,6 @@ uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t
xb)
 {
     uint64_t result, sign, exp, frac;

-    float_status tstat = env->fp_status;
-    set_float_exception_flags(0, &tstat);
-
     sign = extract64(xb, 63,  1);
     exp  = extract64(xb, 52, 11);
     frac = extract64(xb,  0, 52) | 0x10000000000000ULL;
@@ -3446,8 +3409,6 @@ VSX_ROUND(xvrspiz, 4, float32, VsrW(i),
float_round_to_zero, 0)

 uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
 {
-    helper_reset_fpstatus(env);
-
     uint64_t xt = helper_frsp(env, xb);

     helper_compute_fprf_float64(env, xt);
@@ -3593,8 +3554,6 @@ void helper_xsrqpi(CPUPPCState *env, uint32_t opcode,
     uint8_t rmode = 0;
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     if (r == 0 && rmc == 0) {
         rmode = float_round_ties_away;
     } else if (r == 0 && rmc == 0x3) {
@@ -3650,8 +3609,6 @@ void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode,
     floatx80 round_res;
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     if (r == 0 && rmc == 0) {
         rmode = float_round_ties_away;
     } else if (r == 0 && rmc == 0x3) {
@@ -3700,8 +3657,6 @@ void helper_xssqrtqp(CPUPPCState *env, uint32_t
opcode,
     ppc_vsr_t t = { };
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
@@ -3734,8 +3689,6 @@ void helper_xssubqp(CPUPPCState *env, uint32_t opcode,
     ppc_vsr_t t = *xt;
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 4e192de97b..b486c9991f 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -58,7 +58,6 @@ DEF_HELPER_FLAGS_1(cntlzw32, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_FLAGS_2(brinc, TCG_CALL_NO_RWG_SE, tl, tl, tl)

 DEF_HELPER_1(float_check_status, void, env)
-DEF_HELPER_1(reset_fpstatus, void, env)
 DEF_HELPER_2(compute_fprf_float64, void, env, i64)
 DEF_HELPER_3(store_fpscr, void, env, i64, i32)
 DEF_HELPER_2(fpscr_clrbit, void, env, i32)
diff --git a/target/ppc/translate/fp-impl.inc.c
b/target/ppc/translate/fp-impl.inc.c
index e18e268fe5..5e8cd9970e 100644
--- a/target/ppc/translate/fp-impl.inc.c
+++ b/target/ppc/translate/fp-impl.inc.c
@@ -4,11 +4,6 @@
  * Standard FPU translation
  */

-static inline void gen_reset_fpstatus(void)
-{
-    gen_helper_reset_fpstatus(cpu_env);
-}
-
 static inline void gen_compute_fprf_float64(TCGv_i64 arg)
 {
     gen_helper_compute_fprf_float64(cpu_env, arg);
@@ -48,7 +43,6 @@ static void gen_f##name(DisasContext *ctx)
                     \
     t3 = tcg_temp_new_i64();
   \
     /* NIP cannot be restored if the memory exception comes from an helper
*/ \
     gen_update_nip(ctx, ctx->base.pc_next - 4);
    \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rA(ctx->opcode));
    \
     get_fpr(t1, rC(ctx->opcode));
    \
     get_fpr(t2, rB(ctx->opcode));
    \
@@ -88,7 +82,6 @@ static void gen_f##name(DisasContext *ctx)
                     \
     t2 = tcg_temp_new_i64();
   \
     /* NIP cannot be restored if the memory exception comes from an helper
*/ \
     gen_update_nip(ctx, ctx->base.pc_next - 4);
    \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rA(ctx->opcode));
    \
     get_fpr(t1, rB(ctx->opcode));
    \
     gen_helper_f##op(t2, cpu_env, t0, t1);
   \
@@ -123,7 +116,6 @@ static void gen_f##name(DisasContext *ctx)
                       \
     t0 = tcg_temp_new_i64();
   \
     t1 = tcg_temp_new_i64();
   \
     t2 = tcg_temp_new_i64();
   \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rA(ctx->opcode));
    \
     get_fpr(t1, rC(ctx->opcode));
    \
     gen_helper_f##op(t2, cpu_env, t0, t1);
   \
@@ -156,7 +148,6 @@ static void gen_f##name(DisasContext *ctx)
                       \
     }
    \
     t0 = tcg_temp_new_i64();
   \
     t1 = tcg_temp_new_i64();
   \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rB(ctx->opcode));
    \
     gen_helper_f##name(t1, cpu_env, t0);
   \
     set_fpr(rD(ctx->opcode), t1);
    \
@@ -181,7 +172,6 @@ static void gen_f##name(DisasContext *ctx)
                       \
     }
    \
     t0 = tcg_temp_new_i64();
   \
     t1 = tcg_temp_new_i64();
   \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rB(ctx->opcode));
    \
     gen_helper_f##name(t1, cpu_env, t0);
   \
     set_fpr(rD(ctx->opcode), t1);
    \
@@ -222,7 +212,6 @@ static void gen_frsqrtes(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     get_fpr(t0, rB(ctx->opcode));
     gen_helper_frsqrte(t1, cpu_env, t0);
     gen_helper_frsp(t1, cpu_env, t1);
@@ -252,7 +241,6 @@ static void gen_fsqrt(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     get_fpr(t0, rB(ctx->opcode));
     gen_helper_fsqrt(t1, cpu_env, t0);
     set_fpr(rD(ctx->opcode), t1);
@@ -274,7 +262,6 @@ static void gen_fsqrts(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     get_fpr(t0, rB(ctx->opcode));
     gen_helper_fsqrt(t1, cpu_env, t0);
     gen_helper_frsp(t1, cpu_env, t1);
@@ -380,7 +367,6 @@ static void gen_fcmpo(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     crf = tcg_const_i32(crfD(ctx->opcode));
     get_fpr(t0, rA(ctx->opcode));
     get_fpr(t1, rB(ctx->opcode));
@@ -403,7 +389,6 @@ static void gen_fcmpu(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     crf = tcg_const_i32(crfD(ctx->opcode));
     get_fpr(t0, rA(ctx->opcode));
     get_fpr(t1, rB(ctx->opcode));
@@ -612,7 +597,6 @@ static void gen_mffs(DisasContext *ctx)
         return;
     }
     t0 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
     set_fpr(rD(ctx->opcode), t0);
     if (unlikely(Rc(ctx->opcode))) {
@@ -635,7 +619,6 @@ static void gen_mffsl(DisasContext *ctx)
         return;
     }
     t0 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
     /* Mask everything except mode, status, and enables.  */
     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_STATUS | FP_ENABLES | FP_RN);
@@ -660,7 +643,6 @@ static void gen_mffsce(DisasContext *ctx)

     t0 = tcg_temp_new_i64();

-    gen_reset_fpstatus();
     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
     set_fpr(rD(ctx->opcode), t0);

@@ -678,7 +660,6 @@ static void gen_helper_mffscrn(DisasContext *ctx,
TCGv_i64 t1)
     TCGv_i64 t0 = tcg_temp_new_i64();
     TCGv_i32 mask = tcg_const_i32(0x0001);

-    gen_reset_fpstatus();
     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_ENABLES | FP_RN);
     set_fpr(rD(ctx->opcode), t0);
@@ -750,7 +731,6 @@ static void gen_mtfsb0(DisasContext *ctx)
         return;
     }
     crb = 31 - crbD(ctx->opcode);
-    gen_reset_fpstatus();
     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX)) {
         TCGv_i32 t0;
         t0 = tcg_const_i32(crb);
@@ -773,7 +753,6 @@ static void gen_mtfsb1(DisasContext *ctx)
         return;
     }
     crb = 31 - crbD(ctx->opcode);
-    gen_reset_fpstatus();
     /* XXX: we pretend we can only do IEEE floating-point computations */
     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX && crb != FPSCR_NI)) {
         TCGv_i32 t0;
@@ -807,7 +786,6 @@ static void gen_mtfsf(DisasContext *ctx)
         gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
         return;
     }
-    gen_reset_fpstatus();
     if (l) {
         t0 = tcg_const_i32((ctx->insns_flags2 & PPC2_ISA205) ? 0xffff :
0xff);
     } else {
@@ -844,7 +822,6 @@ static void gen_mtfsfi(DisasContext *ctx)
         return;
     }
     sh = (8 * w) + 7 - bf;
-    gen_reset_fpstatus();
     t0 = tcg_const_i64(((uint64_t)FPIMM(ctx->opcode)) << (4 * sh));
     t1 = tcg_const_i32(1 << sh);
     gen_helper_store_fpscr(cpu_env, t0, t1);
-- 
2.23.0.windows.1

[-- Attachment #2: Type: text/html, Size: 25147 bytes --]

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-01 20:49 ` Richard Henderson
  2020-05-01 22:02   ` 罗勇刚(Yonggang Luo)
@ 2020-05-03 19:46   ` 罗勇刚(Yonggang Luo)
  2020-05-03 23:40     ` BALATON Zoltan
  1 sibling, 1 reply; 11+ messages in thread
From: 罗勇刚(Yonggang Luo) @ 2020-05-03 19:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 15969 bytes --]

Hello Richard, Can you have a look at the following patch, and was that are
the right direction?
From b4d6ca1d6376fab1f1be06eb472e10b908887c2b Mon Sep 17 00:00:00 2001
From: Yonggang Luo <luoyonggang@gmail.com>
Date: Sat, 2 May 2020 05:59:25 +0800
Subject: [PATCH] [ppc fp] Step 1. Rearrange the fp helpers to eliminate
 helper_reset_fpstatus(). I've mentioned this before, that it's possible to
 leave the steady-state of env->fp_status.exception_flags == 0, so there's
no
 need for a separate function call.  I suspect this is worth a decent
speedup
 by itself.

---
 target/ppc/fpu_helper.c            | 53 ++----------------------------
 target/ppc/helper.h                |  1 -
 target/ppc/translate/fp-impl.inc.c | 23 -------------
 3 files changed, 3 insertions(+), 74 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index d9a8773ee1..4fc5a7ff1c 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -821,6 +821,9 @@ static void do_float_check_status(CPUPPCState *env,
uintptr_t raddr)
                                    env->error_code, raddr);
         }
     }
+    if (status) {
+        set_float_exception_flags(0, &env->fp_status);
+    }
 }

 void helper_float_check_status(CPUPPCState *env)
@@ -828,11 +831,6 @@ void helper_float_check_status(CPUPPCState *env)
     do_float_check_status(env, GETPC());
 }

-void helper_reset_fpstatus(CPUPPCState *env)
-{
-    set_float_exception_flags(0, &env->fp_status);
-}
-
 static void float_invalid_op_addsub(CPUPPCState *env, bool set_fpcc,
                                     uintptr_t retaddr, int classes)
 {
@@ -2110,9 +2108,6 @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt,
                       \
 {
   \
     ppc_vsr_t t = *xt;
  \
     int i;
  \
-
  \
-    helper_reset_fpstatus(env);
   \
-
  \
     for (i = 0; i < nels; i++) {
  \
         float_status tstat = env->fp_status;
  \
         set_float_exception_flags(0, &tstat);
   \
@@ -2152,8 +2147,6 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode,
     ppc_vsr_t t = *xt;
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
@@ -2189,9 +2182,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
                       \
 {
   \
     ppc_vsr_t t = *xt;
  \
     int i;
  \
-
  \
-    helper_reset_fpstatus(env);
   \
-
  \
     for (i = 0; i < nels; i++) {
  \
         float_status tstat = env->fp_status;
  \
         set_float_exception_flags(0, &tstat);
   \
@@ -2228,13 +2218,11 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t
opcode,
     ppc_vsr_t t = *xt;
     float_status tstat;

-    helper_reset_fpstatus(env);
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
     }

-    set_float_exception_flags(0, &tstat);
     t.f128 = float128_mul(xa->f128, xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;

@@ -2263,9 +2251,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
                        \
 {
    \
     ppc_vsr_t t = *xt;
   \
     int i;
   \
-
   \
-    helper_reset_fpstatus(env);
    \
-
   \
     for (i = 0; i < nels; i++) {
   \
         float_status tstat = env->fp_status;
   \
         set_float_exception_flags(0, &tstat);
    \
@@ -2305,7 +2290,6 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode,
     ppc_vsr_t t = *xt;
     float_status tstat;

-    helper_reset_fpstatus(env);
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
@@ -2342,9 +2326,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
ppc_vsr_t *xb)              \
 {
    \
     ppc_vsr_t t = *xt;
   \
     int i;
   \
-
   \
-    helper_reset_fpstatus(env);
    \
-
   \
     for (i = 0; i < nels; i++) {
   \
         if (unlikely(tp##_is_signaling_nan(xb->fld, &env->fp_status))) {
   \
             float_invalid_op_vxsnan(env, GETPC());
   \
@@ -2382,9 +2363,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
ppc_vsr_t *xb)             \
 {
   \
     ppc_vsr_t t = *xt;
  \
     int i;
  \
-
  \
-    helper_reset_fpstatus(env);
   \
-
  \
     for (i = 0; i < nels; i++) {
  \
         float_status tstat = env->fp_status;
  \
         set_float_exception_flags(0, &tstat);
   \
@@ -2430,9 +2408,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
ppc_vsr_t *xb)             \
 {
   \
     ppc_vsr_t t = *xt;
  \
     int i;
  \
-
  \
-    helper_reset_fpstatus(env);
   \
-
  \
     for (i = 0; i < nels; i++) {
  \
         float_status tstat = env->fp_status;
  \
         set_float_exception_flags(0, &tstat);
   \
@@ -2592,9 +2567,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
                        \
 {
    \
     ppc_vsr_t t = *xt;
   \
     int i;
   \
-
   \
-    helper_reset_fpstatus(env);
    \
-
   \
     for (i = 0; i < nels; i++) {
   \
         float_status tstat = env->fp_status;
   \
         set_float_exception_flags(0, &tstat);
    \
@@ -2765,9 +2737,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
                   \
 {                                                                        \
     uint32_t cc = 0;                                                     \
     bool vxsnan_flag = false, vxvc_flag = false;                         \
-                                                                         \
-    helper_reset_fpstatus(env);                                          \
-                                                                         \
     if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||        \
         float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {        \
         vxsnan_flag = true;                                              \
@@ -2813,9 +2782,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
                  \
 {                                                                       \
     uint32_t cc = 0;                                                    \
     bool vxsnan_flag = false, vxvc_flag = false;                        \
-                                                                        \
-    helper_reset_fpstatus(env);                                         \
-                                                                        \
     if (float128_is_signaling_nan(xa->f128, &env->fp_status) ||         \
         float128_is_signaling_nan(xb->f128, &env->fp_status)) {         \
         vxsnan_flag = true;                                             \
@@ -3177,9 +3143,6 @@ uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t
xb)
 {
     uint64_t result, sign, exp, frac;

-    float_status tstat = env->fp_status;
-    set_float_exception_flags(0, &tstat);
-
     sign = extract64(xb, 63,  1);
     exp  = extract64(xb, 52, 11);
     frac = extract64(xb,  0, 52) | 0x10000000000000ULL;
@@ -3446,8 +3409,6 @@ VSX_ROUND(xvrspiz, 4, float32, VsrW(i),
float_round_to_zero, 0)

 uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
 {
-    helper_reset_fpstatus(env);
-
     uint64_t xt = helper_frsp(env, xb);

     helper_compute_fprf_float64(env, xt);
@@ -3593,8 +3554,6 @@ void helper_xsrqpi(CPUPPCState *env, uint32_t opcode,
     uint8_t rmode = 0;
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     if (r == 0 && rmc == 0) {
         rmode = float_round_ties_away;
     } else if (r == 0 && rmc == 0x3) {
@@ -3650,8 +3609,6 @@ void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode,
     floatx80 round_res;
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     if (r == 0 && rmc == 0) {
         rmode = float_round_ties_away;
     } else if (r == 0 && rmc == 0x3) {
@@ -3700,8 +3657,6 @@ void helper_xssqrtqp(CPUPPCState *env, uint32_t
opcode,
     ppc_vsr_t t = { };
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
@@ -3734,8 +3689,6 @@ void helper_xssubqp(CPUPPCState *env, uint32_t opcode,
     ppc_vsr_t t = *xt;
     float_status tstat;

-    helper_reset_fpstatus(env);
-
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 4e192de97b..b486c9991f 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -58,7 +58,6 @@ DEF_HELPER_FLAGS_1(cntlzw32, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_FLAGS_2(brinc, TCG_CALL_NO_RWG_SE, tl, tl, tl)

 DEF_HELPER_1(float_check_status, void, env)
-DEF_HELPER_1(reset_fpstatus, void, env)
 DEF_HELPER_2(compute_fprf_float64, void, env, i64)
 DEF_HELPER_3(store_fpscr, void, env, i64, i32)
 DEF_HELPER_2(fpscr_clrbit, void, env, i32)
diff --git a/target/ppc/translate/fp-impl.inc.c
b/target/ppc/translate/fp-impl.inc.c
index e18e268fe5..5e8cd9970e 100644
--- a/target/ppc/translate/fp-impl.inc.c
+++ b/target/ppc/translate/fp-impl.inc.c
@@ -4,11 +4,6 @@
  * Standard FPU translation
  */

-static inline void gen_reset_fpstatus(void)
-{
-    gen_helper_reset_fpstatus(cpu_env);
-}
-
 static inline void gen_compute_fprf_float64(TCGv_i64 arg)
 {
     gen_helper_compute_fprf_float64(cpu_env, arg);
@@ -48,7 +43,6 @@ static void gen_f##name(DisasContext *ctx)
                     \
     t3 = tcg_temp_new_i64();
   \
     /* NIP cannot be restored if the memory exception comes from an helper
*/ \
     gen_update_nip(ctx, ctx->base.pc_next - 4);
    \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rA(ctx->opcode));
    \
     get_fpr(t1, rC(ctx->opcode));
    \
     get_fpr(t2, rB(ctx->opcode));
    \
@@ -88,7 +82,6 @@ static void gen_f##name(DisasContext *ctx)
                     \
     t2 = tcg_temp_new_i64();
   \
     /* NIP cannot be restored if the memory exception comes from an helper
*/ \
     gen_update_nip(ctx, ctx->base.pc_next - 4);
    \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rA(ctx->opcode));
    \
     get_fpr(t1, rB(ctx->opcode));
    \
     gen_helper_f##op(t2, cpu_env, t0, t1);
   \
@@ -123,7 +116,6 @@ static void gen_f##name(DisasContext *ctx)
                       \
     t0 = tcg_temp_new_i64();
   \
     t1 = tcg_temp_new_i64();
   \
     t2 = tcg_temp_new_i64();
   \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rA(ctx->opcode));
    \
     get_fpr(t1, rC(ctx->opcode));
    \
     gen_helper_f##op(t2, cpu_env, t0, t1);
   \
@@ -156,7 +148,6 @@ static void gen_f##name(DisasContext *ctx)
                       \
     }
    \
     t0 = tcg_temp_new_i64();
   \
     t1 = tcg_temp_new_i64();
   \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rB(ctx->opcode));
    \
     gen_helper_f##name(t1, cpu_env, t0);
   \
     set_fpr(rD(ctx->opcode), t1);
    \
@@ -181,7 +172,6 @@ static void gen_f##name(DisasContext *ctx)
                       \
     }
    \
     t0 = tcg_temp_new_i64();
   \
     t1 = tcg_temp_new_i64();
   \
-    gen_reset_fpstatus();
    \
     get_fpr(t0, rB(ctx->opcode));
    \
     gen_helper_f##name(t1, cpu_env, t0);
   \
     set_fpr(rD(ctx->opcode), t1);
    \
@@ -222,7 +212,6 @@ static void gen_frsqrtes(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     get_fpr(t0, rB(ctx->opcode));
     gen_helper_frsqrte(t1, cpu_env, t0);
     gen_helper_frsp(t1, cpu_env, t1);
@@ -252,7 +241,6 @@ static void gen_fsqrt(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     get_fpr(t0, rB(ctx->opcode));
     gen_helper_fsqrt(t1, cpu_env, t0);
     set_fpr(rD(ctx->opcode), t1);
@@ -274,7 +262,6 @@ static void gen_fsqrts(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     get_fpr(t0, rB(ctx->opcode));
     gen_helper_fsqrt(t1, cpu_env, t0);
     gen_helper_frsp(t1, cpu_env, t1);
@@ -380,7 +367,6 @@ static void gen_fcmpo(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     crf = tcg_const_i32(crfD(ctx->opcode));
     get_fpr(t0, rA(ctx->opcode));
     get_fpr(t1, rB(ctx->opcode));
@@ -403,7 +389,6 @@ static void gen_fcmpu(DisasContext *ctx)
     }
     t0 = tcg_temp_new_i64();
     t1 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     crf = tcg_const_i32(crfD(ctx->opcode));
     get_fpr(t0, rA(ctx->opcode));
     get_fpr(t1, rB(ctx->opcode));
@@ -612,7 +597,6 @@ static void gen_mffs(DisasContext *ctx)
         return;
     }
     t0 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
     set_fpr(rD(ctx->opcode), t0);
     if (unlikely(Rc(ctx->opcode))) {
@@ -635,7 +619,6 @@ static void gen_mffsl(DisasContext *ctx)
         return;
     }
     t0 = tcg_temp_new_i64();
-    gen_reset_fpstatus();
     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
     /* Mask everything except mode, status, and enables.  */
     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_STATUS | FP_ENABLES | FP_RN);
@@ -660,7 +643,6 @@ static void gen_mffsce(DisasContext *ctx)

     t0 = tcg_temp_new_i64();

-    gen_reset_fpstatus();
     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
     set_fpr(rD(ctx->opcode), t0);

@@ -678,7 +660,6 @@ static void gen_helper_mffscrn(DisasContext *ctx,
TCGv_i64 t1)
     TCGv_i64 t0 = tcg_temp_new_i64();
     TCGv_i32 mask = tcg_const_i32(0x0001);

-    gen_reset_fpstatus();
     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_ENABLES | FP_RN);
     set_fpr(rD(ctx->opcode), t0);
@@ -750,7 +731,6 @@ static void gen_mtfsb0(DisasContext *ctx)
         return;
     }
     crb = 31 - crbD(ctx->opcode);
-    gen_reset_fpstatus();
     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX)) {
         TCGv_i32 t0;
         t0 = tcg_const_i32(crb);
@@ -773,7 +753,6 @@ static void gen_mtfsb1(DisasContext *ctx)
         return;
     }
     crb = 31 - crbD(ctx->opcode);
-    gen_reset_fpstatus();
     /* XXX: we pretend we can only do IEEE floating-point computations */
     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX && crb != FPSCR_NI)) {
         TCGv_i32 t0;
@@ -807,7 +786,6 @@ static void gen_mtfsf(DisasContext *ctx)
         gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
         return;
     }
-    gen_reset_fpstatus();
     if (l) {
         t0 = tcg_const_i32((ctx->insns_flags2 & PPC2_ISA205) ? 0xffff :
0xff);
     } else {
@@ -844,7 +822,6 @@ static void gen_mtfsfi(DisasContext *ctx)
         return;
     }
     sh = (8 * w) + 7 - bf;
-    gen_reset_fpstatus();
     t0 = tcg_const_i64(((uint64_t)FPIMM(ctx->opcode)) << (4 * sh));
     t1 = tcg_const_i32(1 << sh);
     gen_helper_store_fpscr(cpu_env, t0, t1);
-- 
2.23.0.windows.1


On Sat, May 2, 2020 at 4:49 AM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 5/1/20 1:04 PM, 罗勇刚(Yonggang Luo) wrote:
> > And  fp_status can not represent all the  Invalid Operation Exception
> flags.
> > What I need to do to represent all the  Invalid Operation Exception
>
> Ideally, we add them to include/fpu/softfloat-types.h, expand the
> float_exception_flags field in float_status to match, and generate the new
> flags in fpu/softfloat.c.
>
> This would actually help target/tricore as well.
>
>
> r~
>


-- 
         此致
礼
罗勇刚
Yours
    sincerely,
Yonggang Luo

[-- Attachment #2: Type: text/html, Size: 26251 bytes --]

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-03 19:46   ` 罗勇刚(Yonggang Luo)
@ 2020-05-03 23:40     ` BALATON Zoltan
  2020-05-04  0:41       ` 罗勇刚(Yonggang Luo)
  0 siblings, 1 reply; 11+ messages in thread
From: BALATON Zoltan @ 2020-05-03 23:40 UTC (permalink / raw)
  To: 罗勇刚(Yonggang Luo)
  Cc: qemu-ppc, Alex Bennée, Richard Henderson, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 16180 bytes --]

Hello,

On Mon, 4 May 2020, 罗勇刚(Yonggang Luo) wrote:
> Hello Richard, Can you have a look at the following patch, and was that are
> the right direction?

Formatting of the patch is broken by your mailer, try sending it with 
something that does not change it otherwise it's a bit hard to read.

Richard suggested to add an assert to check the fp_status is correctly 
cleared in place of helper_reset_fpstatus first for debugging so you could 
change the helper accordingly before deleting it and run a few tests to 
verify it still works. You'll need get some tests and benchmarks working 
to be able to verify your changes that's why I've said that would be step 
0. If you checked that it still produces the same results and the assert 
does not trigger then you can remove the helper.

Regards,
BALATON Zoltan

> From b4d6ca1d6376fab1f1be06eb472e10b908887c2b Mon Sep 17 00:00:00 2001
> From: Yonggang Luo <luoyonggang@gmail.com>
> Date: Sat, 2 May 2020 05:59:25 +0800
> Subject: [PATCH] [ppc fp] Step 1. Rearrange the fp helpers to eliminate
> helper_reset_fpstatus(). I've mentioned this before, that it's possible to
> leave the steady-state of env->fp_status.exception_flags == 0, so there's
> no
> need for a separate function call.  I suspect this is worth a decent
> speedup
> by itself.
>
> ---
> target/ppc/fpu_helper.c            | 53 ++----------------------------
> target/ppc/helper.h                |  1 -
> target/ppc/translate/fp-impl.inc.c | 23 -------------
> 3 files changed, 3 insertions(+), 74 deletions(-)
>
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index d9a8773ee1..4fc5a7ff1c 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -821,6 +821,9 @@ static void do_float_check_status(CPUPPCState *env,
> uintptr_t raddr)
>                                    env->error_code, raddr);
>         }
>     }
> +    if (status) {
> +        set_float_exception_flags(0, &env->fp_status);
> +    }
> }
>
> void helper_float_check_status(CPUPPCState *env)
> @@ -828,11 +831,6 @@ void helper_float_check_status(CPUPPCState *env)
>     do_float_check_status(env, GETPC());
> }
>
> -void helper_reset_fpstatus(CPUPPCState *env)
> -{
> -    set_float_exception_flags(0, &env->fp_status);
> -}
> -
> static void float_invalid_op_addsub(CPUPPCState *env, bool set_fpcc,
>                                     uintptr_t retaddr, int classes)
> {
> @@ -2110,9 +2108,6 @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt,
>                       \
> {
>   \
>     ppc_vsr_t t = *xt;
>  \
>     int i;
>  \
> -
>  \
> -    helper_reset_fpstatus(env);
>   \
> -
>  \
>     for (i = 0; i < nels; i++) {
>  \
>         float_status tstat = env->fp_status;
>  \
>         set_float_exception_flags(0, &tstat);
>   \
> @@ -2152,8 +2147,6 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode,
>     ppc_vsr_t t = *xt;
>     float_status tstat;
>
> -    helper_reset_fpstatus(env);
> -
>     tstat = env->fp_status;
>     if (unlikely(Rc(opcode) != 0)) {
>         tstat.float_rounding_mode = float_round_to_odd;
> @@ -2189,9 +2182,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>                       \
> {
>   \
>     ppc_vsr_t t = *xt;
>  \
>     int i;
>  \
> -
>  \
> -    helper_reset_fpstatus(env);
>   \
> -
>  \
>     for (i = 0; i < nels; i++) {
>  \
>         float_status tstat = env->fp_status;
>  \
>         set_float_exception_flags(0, &tstat);
>   \
> @@ -2228,13 +2218,11 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t
> opcode,
>     ppc_vsr_t t = *xt;
>     float_status tstat;
>
> -    helper_reset_fpstatus(env);
>     tstat = env->fp_status;
>     if (unlikely(Rc(opcode) != 0)) {
>         tstat.float_rounding_mode = float_round_to_odd;
>     }
>
> -    set_float_exception_flags(0, &tstat);
>     t.f128 = float128_mul(xa->f128, xb->f128, &tstat);
>     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
>
> @@ -2263,9 +2251,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>                        \
> {
>    \
>     ppc_vsr_t t = *xt;
>   \
>     int i;
>   \
> -
>   \
> -    helper_reset_fpstatus(env);
>    \
> -
>   \
>     for (i = 0; i < nels; i++) {
>   \
>         float_status tstat = env->fp_status;
>   \
>         set_float_exception_flags(0, &tstat);
>    \
> @@ -2305,7 +2290,6 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode,
>     ppc_vsr_t t = *xt;
>     float_status tstat;
>
> -    helper_reset_fpstatus(env);
>     tstat = env->fp_status;
>     if (unlikely(Rc(opcode) != 0)) {
>         tstat.float_rounding_mode = float_round_to_odd;
> @@ -2342,9 +2326,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> ppc_vsr_t *xb)              \
> {
>    \
>     ppc_vsr_t t = *xt;
>   \
>     int i;
>   \
> -
>   \
> -    helper_reset_fpstatus(env);
>    \
> -
>   \
>     for (i = 0; i < nels; i++) {
>   \
>         if (unlikely(tp##_is_signaling_nan(xb->fld, &env->fp_status))) {
>   \
>             float_invalid_op_vxsnan(env, GETPC());
>   \
> @@ -2382,9 +2363,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> ppc_vsr_t *xb)             \
> {
>   \
>     ppc_vsr_t t = *xt;
>  \
>     int i;
>  \
> -
>  \
> -    helper_reset_fpstatus(env);
>   \
> -
>  \
>     for (i = 0; i < nels; i++) {
>  \
>         float_status tstat = env->fp_status;
>  \
>         set_float_exception_flags(0, &tstat);
>   \
> @@ -2430,9 +2408,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> ppc_vsr_t *xb)             \
> {
>   \
>     ppc_vsr_t t = *xt;
>  \
>     int i;
>  \
> -
>  \
> -    helper_reset_fpstatus(env);
>   \
> -
>  \
>     for (i = 0; i < nels; i++) {
>  \
>         float_status tstat = env->fp_status;
>  \
>         set_float_exception_flags(0, &tstat);
>   \
> @@ -2592,9 +2567,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>                        \
> {
>    \
>     ppc_vsr_t t = *xt;
>   \
>     int i;
>   \
> -
>   \
> -    helper_reset_fpstatus(env);
>    \
> -
>   \
>     for (i = 0; i < nels; i++) {
>   \
>         float_status tstat = env->fp_status;
>   \
>         set_float_exception_flags(0, &tstat);
>    \
> @@ -2765,9 +2737,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
>                   \
> {                                                                        \
>     uint32_t cc = 0;                                                     \
>     bool vxsnan_flag = false, vxvc_flag = false;                         \
> -                                                                         \
> -    helper_reset_fpstatus(env);                                          \
> -                                                                         \
>     if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||        \
>         float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {        \
>         vxsnan_flag = true;                                              \
> @@ -2813,9 +2782,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
>                  \
> {                                                                       \
>     uint32_t cc = 0;                                                    \
>     bool vxsnan_flag = false, vxvc_flag = false;                        \
> -                                                                        \
> -    helper_reset_fpstatus(env);                                         \
> -                                                                        \
>     if (float128_is_signaling_nan(xa->f128, &env->fp_status) ||         \
>         float128_is_signaling_nan(xb->f128, &env->fp_status)) {         \
>         vxsnan_flag = true;                                             \
> @@ -3177,9 +3143,6 @@ uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t
> xb)
> {
>     uint64_t result, sign, exp, frac;
>
> -    float_status tstat = env->fp_status;
> -    set_float_exception_flags(0, &tstat);
> -
>     sign = extract64(xb, 63,  1);
>     exp  = extract64(xb, 52, 11);
>     frac = extract64(xb,  0, 52) | 0x10000000000000ULL;
> @@ -3446,8 +3409,6 @@ VSX_ROUND(xvrspiz, 4, float32, VsrW(i),
> float_round_to_zero, 0)
>
> uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
> {
> -    helper_reset_fpstatus(env);
> -
>     uint64_t xt = helper_frsp(env, xb);
>
>     helper_compute_fprf_float64(env, xt);
> @@ -3593,8 +3554,6 @@ void helper_xsrqpi(CPUPPCState *env, uint32_t opcode,
>     uint8_t rmode = 0;
>     float_status tstat;
>
> -    helper_reset_fpstatus(env);
> -
>     if (r == 0 && rmc == 0) {
>         rmode = float_round_ties_away;
>     } else if (r == 0 && rmc == 0x3) {
> @@ -3650,8 +3609,6 @@ void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode,
>     floatx80 round_res;
>     float_status tstat;
>
> -    helper_reset_fpstatus(env);
> -
>     if (r == 0 && rmc == 0) {
>         rmode = float_round_ties_away;
>     } else if (r == 0 && rmc == 0x3) {
> @@ -3700,8 +3657,6 @@ void helper_xssqrtqp(CPUPPCState *env, uint32_t
> opcode,
>     ppc_vsr_t t = { };
>     float_status tstat;
>
> -    helper_reset_fpstatus(env);
> -
>     tstat = env->fp_status;
>     if (unlikely(Rc(opcode) != 0)) {
>         tstat.float_rounding_mode = float_round_to_odd;
> @@ -3734,8 +3689,6 @@ void helper_xssubqp(CPUPPCState *env, uint32_t opcode,
>     ppc_vsr_t t = *xt;
>     float_status tstat;
>
> -    helper_reset_fpstatus(env);
> -
>     tstat = env->fp_status;
>     if (unlikely(Rc(opcode) != 0)) {
>         tstat.float_rounding_mode = float_round_to_odd;
> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> index 4e192de97b..b486c9991f 100644
> --- a/target/ppc/helper.h
> +++ b/target/ppc/helper.h
> @@ -58,7 +58,6 @@ DEF_HELPER_FLAGS_1(cntlzw32, TCG_CALL_NO_RWG_SE, i32, i32)
> DEF_HELPER_FLAGS_2(brinc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>
> DEF_HELPER_1(float_check_status, void, env)
> -DEF_HELPER_1(reset_fpstatus, void, env)
> DEF_HELPER_2(compute_fprf_float64, void, env, i64)
> DEF_HELPER_3(store_fpscr, void, env, i64, i32)
> DEF_HELPER_2(fpscr_clrbit, void, env, i32)
> diff --git a/target/ppc/translate/fp-impl.inc.c
> b/target/ppc/translate/fp-impl.inc.c
> index e18e268fe5..5e8cd9970e 100644
> --- a/target/ppc/translate/fp-impl.inc.c
> +++ b/target/ppc/translate/fp-impl.inc.c
> @@ -4,11 +4,6 @@
>  * Standard FPU translation
>  */
>
> -static inline void gen_reset_fpstatus(void)
> -{
> -    gen_helper_reset_fpstatus(cpu_env);
> -}
> -
> static inline void gen_compute_fprf_float64(TCGv_i64 arg)
> {
>     gen_helper_compute_fprf_float64(cpu_env, arg);
> @@ -48,7 +43,6 @@ static void gen_f##name(DisasContext *ctx)
>                     \
>     t3 = tcg_temp_new_i64();
>   \
>     /* NIP cannot be restored if the memory exception comes from an helper
> */ \
>     gen_update_nip(ctx, ctx->base.pc_next - 4);
>    \
> -    gen_reset_fpstatus();
>    \
>     get_fpr(t0, rA(ctx->opcode));
>    \
>     get_fpr(t1, rC(ctx->opcode));
>    \
>     get_fpr(t2, rB(ctx->opcode));
>    \
> @@ -88,7 +82,6 @@ static void gen_f##name(DisasContext *ctx)
>                     \
>     t2 = tcg_temp_new_i64();
>   \
>     /* NIP cannot be restored if the memory exception comes from an helper
> */ \
>     gen_update_nip(ctx, ctx->base.pc_next - 4);
>    \
> -    gen_reset_fpstatus();
>    \
>     get_fpr(t0, rA(ctx->opcode));
>    \
>     get_fpr(t1, rB(ctx->opcode));
>    \
>     gen_helper_f##op(t2, cpu_env, t0, t1);
>   \
> @@ -123,7 +116,6 @@ static void gen_f##name(DisasContext *ctx)
>                       \
>     t0 = tcg_temp_new_i64();
>   \
>     t1 = tcg_temp_new_i64();
>   \
>     t2 = tcg_temp_new_i64();
>   \
> -    gen_reset_fpstatus();
>    \
>     get_fpr(t0, rA(ctx->opcode));
>    \
>     get_fpr(t1, rC(ctx->opcode));
>    \
>     gen_helper_f##op(t2, cpu_env, t0, t1);
>   \
> @@ -156,7 +148,6 @@ static void gen_f##name(DisasContext *ctx)
>                       \
>     }
>    \
>     t0 = tcg_temp_new_i64();
>   \
>     t1 = tcg_temp_new_i64();
>   \
> -    gen_reset_fpstatus();
>    \
>     get_fpr(t0, rB(ctx->opcode));
>    \
>     gen_helper_f##name(t1, cpu_env, t0);
>   \
>     set_fpr(rD(ctx->opcode), t1);
>    \
> @@ -181,7 +172,6 @@ static void gen_f##name(DisasContext *ctx)
>                       \
>     }
>    \
>     t0 = tcg_temp_new_i64();
>   \
>     t1 = tcg_temp_new_i64();
>   \
> -    gen_reset_fpstatus();
>    \
>     get_fpr(t0, rB(ctx->opcode));
>    \
>     gen_helper_f##name(t1, cpu_env, t0);
>   \
>     set_fpr(rD(ctx->opcode), t1);
>    \
> @@ -222,7 +212,6 @@ static void gen_frsqrtes(DisasContext *ctx)
>     }
>     t0 = tcg_temp_new_i64();
>     t1 = tcg_temp_new_i64();
> -    gen_reset_fpstatus();
>     get_fpr(t0, rB(ctx->opcode));
>     gen_helper_frsqrte(t1, cpu_env, t0);
>     gen_helper_frsp(t1, cpu_env, t1);
> @@ -252,7 +241,6 @@ static void gen_fsqrt(DisasContext *ctx)
>     }
>     t0 = tcg_temp_new_i64();
>     t1 = tcg_temp_new_i64();
> -    gen_reset_fpstatus();
>     get_fpr(t0, rB(ctx->opcode));
>     gen_helper_fsqrt(t1, cpu_env, t0);
>     set_fpr(rD(ctx->opcode), t1);
> @@ -274,7 +262,6 @@ static void gen_fsqrts(DisasContext *ctx)
>     }
>     t0 = tcg_temp_new_i64();
>     t1 = tcg_temp_new_i64();
> -    gen_reset_fpstatus();
>     get_fpr(t0, rB(ctx->opcode));
>     gen_helper_fsqrt(t1, cpu_env, t0);
>     gen_helper_frsp(t1, cpu_env, t1);
> @@ -380,7 +367,6 @@ static void gen_fcmpo(DisasContext *ctx)
>     }
>     t0 = tcg_temp_new_i64();
>     t1 = tcg_temp_new_i64();
> -    gen_reset_fpstatus();
>     crf = tcg_const_i32(crfD(ctx->opcode));
>     get_fpr(t0, rA(ctx->opcode));
>     get_fpr(t1, rB(ctx->opcode));
> @@ -403,7 +389,6 @@ static void gen_fcmpu(DisasContext *ctx)
>     }
>     t0 = tcg_temp_new_i64();
>     t1 = tcg_temp_new_i64();
> -    gen_reset_fpstatus();
>     crf = tcg_const_i32(crfD(ctx->opcode));
>     get_fpr(t0, rA(ctx->opcode));
>     get_fpr(t1, rB(ctx->opcode));
> @@ -612,7 +597,6 @@ static void gen_mffs(DisasContext *ctx)
>         return;
>     }
>     t0 = tcg_temp_new_i64();
> -    gen_reset_fpstatus();
>     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
>     set_fpr(rD(ctx->opcode), t0);
>     if (unlikely(Rc(ctx->opcode))) {
> @@ -635,7 +619,6 @@ static void gen_mffsl(DisasContext *ctx)
>         return;
>     }
>     t0 = tcg_temp_new_i64();
> -    gen_reset_fpstatus();
>     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
>     /* Mask everything except mode, status, and enables.  */
>     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_STATUS | FP_ENABLES | FP_RN);
> @@ -660,7 +643,6 @@ static void gen_mffsce(DisasContext *ctx)
>
>     t0 = tcg_temp_new_i64();
>
> -    gen_reset_fpstatus();
>     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
>     set_fpr(rD(ctx->opcode), t0);
>
> @@ -678,7 +660,6 @@ static void gen_helper_mffscrn(DisasContext *ctx,
> TCGv_i64 t1)
>     TCGv_i64 t0 = tcg_temp_new_i64();
>     TCGv_i32 mask = tcg_const_i32(0x0001);
>
> -    gen_reset_fpstatus();
>     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
>     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_ENABLES | FP_RN);
>     set_fpr(rD(ctx->opcode), t0);
> @@ -750,7 +731,6 @@ static void gen_mtfsb0(DisasContext *ctx)
>         return;
>     }
>     crb = 31 - crbD(ctx->opcode);
> -    gen_reset_fpstatus();
>     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX)) {
>         TCGv_i32 t0;
>         t0 = tcg_const_i32(crb);
> @@ -773,7 +753,6 @@ static void gen_mtfsb1(DisasContext *ctx)
>         return;
>     }
>     crb = 31 - crbD(ctx->opcode);
> -    gen_reset_fpstatus();
>     /* XXX: we pretend we can only do IEEE floating-point computations */
>     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX && crb != FPSCR_NI)) {
>         TCGv_i32 t0;
> @@ -807,7 +786,6 @@ static void gen_mtfsf(DisasContext *ctx)
>         gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
>         return;
>     }
> -    gen_reset_fpstatus();
>     if (l) {
>         t0 = tcg_const_i32((ctx->insns_flags2 & PPC2_ISA205) ? 0xffff :
> 0xff);
>     } else {
> @@ -844,7 +822,6 @@ static void gen_mtfsfi(DisasContext *ctx)
>         return;
>     }
>     sh = (8 * w) + 7 - bf;
> -    gen_reset_fpstatus();
>     t0 = tcg_const_i64(((uint64_t)FPIMM(ctx->opcode)) << (4 * sh));
>     t1 = tcg_const_i32(1 << sh);
>     gen_helper_store_fpscr(cpu_env, t0, t1);
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-03 23:40     ` BALATON Zoltan
@ 2020-05-04  0:41       ` 罗勇刚(Yonggang Luo)
  2020-05-04 10:04         ` Alex Bennée
  2020-05-04 16:49         ` Richard Henderson
  0 siblings, 2 replies; 11+ messages in thread
From: 罗勇刚(Yonggang Luo) @ 2020-05-04  0:41 UTC (permalink / raw)
  To: BALATON Zoltan; +Cc: qemu-ppc, Alex Bennée, Richard Henderson, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 17716 bytes --]

On Mon, May 4, 2020 at 7:40 AM BALATON Zoltan <balaton@eik.bme.hu> wrote:

> Hello,
>
> On Mon, 4 May 2020, 罗勇刚(Yonggang Luo) wrote:
> > Hello Richard, Can you have a look at the following patch, and was that
> are
> > the right direction?
>
> Formatting of the patch is broken by your mailer, try sending it with
> something that does not change it otherwise it's a bit hard to read.
>
> Richard suggested to add an assert to check the fp_status is correctly
> cleared in place of helper_reset_fpstatus first for debugging so you could
> change the helper accordingly before deleting it and run a few tests to
> verify it still works. You'll need get some tests and benchmarks working
> to be able to verify your changes that's why I've said that would be step
> 0. If you checked that it still produces the same results and the assert
> does not trigger then you can remove the helper.
>
That's what I need help,
1. How to write a assert to replace helper_reset_fpstatus .
  just directly assert? or something else
2.  a few tests to run
 How to running these tests, and where are these tests.
Do I need to add new tests? Where to start
3.  Benchmarks
Same as 2

>
> Regards,
> BALATON Zoltan
>
> > From b4d6ca1d6376fab1f1be06eb472e10b908887c2b Mon Sep 17 00:00:00 2001
> > From: Yonggang Luo <luoyonggang@gmail.com>
> > Date: Sat, 2 May 2020 05:59:25 +0800
> > Subject: [PATCH] [ppc fp] Step 1. Rearrange the fp helpers to eliminate
> > helper_reset_fpstatus(). I've mentioned this before, that it's possible
> to
> > leave the steady-state of env->fp_status.exception_flags == 0, so there's
> > no
> > need for a separate function call.  I suspect this is worth a decent
> > speedup
> > by itself.
> >
> > ---
> > target/ppc/fpu_helper.c            | 53 ++----------------------------
> > target/ppc/helper.h                |  1 -
> > target/ppc/translate/fp-impl.inc.c | 23 -------------
> > 3 files changed, 3 insertions(+), 74 deletions(-)
> >
> > diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> > index d9a8773ee1..4fc5a7ff1c 100644
> > --- a/target/ppc/fpu_helper.c
> > +++ b/target/ppc/fpu_helper.c
> > @@ -821,6 +821,9 @@ static void do_float_check_status(CPUPPCState *env,
> > uintptr_t raddr)
> >                                    env->error_code, raddr);
> >         }
> >     }
> > +    if (status) {
> > +        set_float_exception_flags(0, &env->fp_status);
> > +    }
> > }
> >
> > void helper_float_check_status(CPUPPCState *env)
> > @@ -828,11 +831,6 @@ void helper_float_check_status(CPUPPCState *env)
> >     do_float_check_status(env, GETPC());
> > }
> >
> > -void helper_reset_fpstatus(CPUPPCState *env)
> > -{
> > -    set_float_exception_flags(0, &env->fp_status);
> > -}
> > -
> > static void float_invalid_op_addsub(CPUPPCState *env, bool set_fpcc,
> >                                     uintptr_t retaddr, int classes)
> > {
> > @@ -2110,9 +2108,6 @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt,
> >                       \
> > {
> >   \
> >     ppc_vsr_t t = *xt;
> >  \
> >     int i;
> >  \
> > -
> >  \
> > -    helper_reset_fpstatus(env);
> >   \
> > -
> >  \
> >     for (i = 0; i < nels; i++) {
> >  \
> >         float_status tstat = env->fp_status;
> >  \
> >         set_float_exception_flags(0, &tstat);
> >   \
> > @@ -2152,8 +2147,6 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t
> opcode,
> >     ppc_vsr_t t = *xt;
> >     float_status tstat;
> >
> > -    helper_reset_fpstatus(env);
> > -
> >     tstat = env->fp_status;
> >     if (unlikely(Rc(opcode) != 0)) {
> >         tstat.float_rounding_mode = float_round_to_odd;
> > @@ -2189,9 +2182,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> >                       \
> > {
> >   \
> >     ppc_vsr_t t = *xt;
> >  \
> >     int i;
> >  \
> > -
> >  \
> > -    helper_reset_fpstatus(env);
> >   \
> > -
> >  \
> >     for (i = 0; i < nels; i++) {
> >  \
> >         float_status tstat = env->fp_status;
> >  \
> >         set_float_exception_flags(0, &tstat);
> >   \
> > @@ -2228,13 +2218,11 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t
> > opcode,
> >     ppc_vsr_t t = *xt;
> >     float_status tstat;
> >
> > -    helper_reset_fpstatus(env);
> >     tstat = env->fp_status;
> >     if (unlikely(Rc(opcode) != 0)) {
> >         tstat.float_rounding_mode = float_round_to_odd;
> >     }
> >
> > -    set_float_exception_flags(0, &tstat);
> >     t.f128 = float128_mul(xa->f128, xb->f128, &tstat);
> >     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
> >
> > @@ -2263,9 +2251,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> >                        \
> > {
> >    \
> >     ppc_vsr_t t = *xt;
> >   \
> >     int i;
> >   \
> > -
> >   \
> > -    helper_reset_fpstatus(env);
> >    \
> > -
> >   \
> >     for (i = 0; i < nels; i++) {
> >   \
> >         float_status tstat = env->fp_status;
> >   \
> >         set_float_exception_flags(0, &tstat);
> >    \
> > @@ -2305,7 +2290,6 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t
> opcode,
> >     ppc_vsr_t t = *xt;
> >     float_status tstat;
> >
> > -    helper_reset_fpstatus(env);
> >     tstat = env->fp_status;
> >     if (unlikely(Rc(opcode) != 0)) {
> >         tstat.float_rounding_mode = float_round_to_odd;
> > @@ -2342,9 +2326,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> > ppc_vsr_t *xb)              \
> > {
> >    \
> >     ppc_vsr_t t = *xt;
> >   \
> >     int i;
> >   \
> > -
> >   \
> > -    helper_reset_fpstatus(env);
> >    \
> > -
> >   \
> >     for (i = 0; i < nels; i++) {
> >   \
> >         if (unlikely(tp##_is_signaling_nan(xb->fld, &env->fp_status))) {
> >   \
> >             float_invalid_op_vxsnan(env, GETPC());
> >   \
> > @@ -2382,9 +2363,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> > ppc_vsr_t *xb)             \
> > {
> >   \
> >     ppc_vsr_t t = *xt;
> >  \
> >     int i;
> >  \
> > -
> >  \
> > -    helper_reset_fpstatus(env);
> >   \
> > -
> >  \
> >     for (i = 0; i < nels; i++) {
> >  \
> >         float_status tstat = env->fp_status;
> >  \
> >         set_float_exception_flags(0, &tstat);
> >   \
> > @@ -2430,9 +2408,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> > ppc_vsr_t *xb)             \
> > {
> >   \
> >     ppc_vsr_t t = *xt;
> >  \
> >     int i;
> >  \
> > -
> >  \
> > -    helper_reset_fpstatus(env);
> >   \
> > -
> >  \
> >     for (i = 0; i < nels; i++) {
> >  \
> >         float_status tstat = env->fp_status;
> >  \
> >         set_float_exception_flags(0, &tstat);
> >   \
> > @@ -2592,9 +2567,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
> >                        \
> > {
> >    \
> >     ppc_vsr_t t = *xt;
> >   \
> >     int i;
> >   \
> > -
> >   \
> > -    helper_reset_fpstatus(env);
> >    \
> > -
> >   \
> >     for (i = 0; i < nels; i++) {
> >   \
> >         float_status tstat = env->fp_status;
> >   \
> >         set_float_exception_flags(0, &tstat);
> >    \
> > @@ -2765,9 +2737,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
> >                   \
> > {
> \
> >     uint32_t cc = 0;
>  \
> >     bool vxsnan_flag = false, vxvc_flag = false;
>  \
> > -
>  \
> > -    helper_reset_fpstatus(env);
>   \
> > -
>  \
> >     if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||
> \
> >         float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {
> \
> >         vxsnan_flag = true;
> \
> > @@ -2813,9 +2782,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
> >                  \
> > {                                                                       \
> >     uint32_t cc = 0;                                                    \
> >     bool vxsnan_flag = false, vxvc_flag = false;                        \
> > -
> \
> > -    helper_reset_fpstatus(env);
>  \
> > -
> \
> >     if (float128_is_signaling_nan(xa->f128, &env->fp_status) ||         \
> >         float128_is_signaling_nan(xb->f128, &env->fp_status)) {         \
> >         vxsnan_flag = true;                                             \
> > @@ -3177,9 +3143,6 @@ uint64_t helper_xscvdpspn(CPUPPCState *env,
> uint64_t
> > xb)
> > {
> >     uint64_t result, sign, exp, frac;
> >
> > -    float_status tstat = env->fp_status;
> > -    set_float_exception_flags(0, &tstat);
> > -
> >     sign = extract64(xb, 63,  1);
> >     exp  = extract64(xb, 52, 11);
> >     frac = extract64(xb,  0, 52) | 0x10000000000000ULL;
> > @@ -3446,8 +3409,6 @@ VSX_ROUND(xvrspiz, 4, float32, VsrW(i),
> > float_round_to_zero, 0)
> >
> > uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
> > {
> > -    helper_reset_fpstatus(env);
> > -
> >     uint64_t xt = helper_frsp(env, xb);
> >
> >     helper_compute_fprf_float64(env, xt);
> > @@ -3593,8 +3554,6 @@ void helper_xsrqpi(CPUPPCState *env, uint32_t
> opcode,
> >     uint8_t rmode = 0;
> >     float_status tstat;
> >
> > -    helper_reset_fpstatus(env);
> > -
> >     if (r == 0 && rmc == 0) {
> >         rmode = float_round_ties_away;
> >     } else if (r == 0 && rmc == 0x3) {
> > @@ -3650,8 +3609,6 @@ void helper_xsrqpxp(CPUPPCState *env, uint32_t
> opcode,
> >     floatx80 round_res;
> >     float_status tstat;
> >
> > -    helper_reset_fpstatus(env);
> > -
> >     if (r == 0 && rmc == 0) {
> >         rmode = float_round_ties_away;
> >     } else if (r == 0 && rmc == 0x3) {
> > @@ -3700,8 +3657,6 @@ void helper_xssqrtqp(CPUPPCState *env, uint32_t
> > opcode,
> >     ppc_vsr_t t = { };
> >     float_status tstat;
> >
> > -    helper_reset_fpstatus(env);
> > -
> >     tstat = env->fp_status;
> >     if (unlikely(Rc(opcode) != 0)) {
> >         tstat.float_rounding_mode = float_round_to_odd;
> > @@ -3734,8 +3689,6 @@ void helper_xssubqp(CPUPPCState *env, uint32_t
> opcode,
> >     ppc_vsr_t t = *xt;
> >     float_status tstat;
> >
> > -    helper_reset_fpstatus(env);
> > -
> >     tstat = env->fp_status;
> >     if (unlikely(Rc(opcode) != 0)) {
> >         tstat.float_rounding_mode = float_round_to_odd;
> > diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> > index 4e192de97b..b486c9991f 100644
> > --- a/target/ppc/helper.h
> > +++ b/target/ppc/helper.h
> > @@ -58,7 +58,6 @@ DEF_HELPER_FLAGS_1(cntlzw32, TCG_CALL_NO_RWG_SE, i32,
> i32)
> > DEF_HELPER_FLAGS_2(brinc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> >
> > DEF_HELPER_1(float_check_status, void, env)
> > -DEF_HELPER_1(reset_fpstatus, void, env)
> > DEF_HELPER_2(compute_fprf_float64, void, env, i64)
> > DEF_HELPER_3(store_fpscr, void, env, i64, i32)
> > DEF_HELPER_2(fpscr_clrbit, void, env, i32)
> > diff --git a/target/ppc/translate/fp-impl.inc.c
> > b/target/ppc/translate/fp-impl.inc.c
> > index e18e268fe5..5e8cd9970e 100644
> > --- a/target/ppc/translate/fp-impl.inc.c
> > +++ b/target/ppc/translate/fp-impl.inc.c
> > @@ -4,11 +4,6 @@
> >  * Standard FPU translation
> >  */
> >
> > -static inline void gen_reset_fpstatus(void)
> > -{
> > -    gen_helper_reset_fpstatus(cpu_env);
> > -}
> > -
> > static inline void gen_compute_fprf_float64(TCGv_i64 arg)
> > {
> >     gen_helper_compute_fprf_float64(cpu_env, arg);
> > @@ -48,7 +43,6 @@ static void gen_f##name(DisasContext *ctx)
> >                     \
> >     t3 = tcg_temp_new_i64();
> >   \
> >     /* NIP cannot be restored if the memory exception comes from an
> helper
> > */ \
> >     gen_update_nip(ctx, ctx->base.pc_next - 4);
> >    \
> > -    gen_reset_fpstatus();
> >    \
> >     get_fpr(t0, rA(ctx->opcode));
> >    \
> >     get_fpr(t1, rC(ctx->opcode));
> >    \
> >     get_fpr(t2, rB(ctx->opcode));
> >    \
> > @@ -88,7 +82,6 @@ static void gen_f##name(DisasContext *ctx)
> >                     \
> >     t2 = tcg_temp_new_i64();
> >   \
> >     /* NIP cannot be restored if the memory exception comes from an
> helper
> > */ \
> >     gen_update_nip(ctx, ctx->base.pc_next - 4);
> >    \
> > -    gen_reset_fpstatus();
> >    \
> >     get_fpr(t0, rA(ctx->opcode));
> >    \
> >     get_fpr(t1, rB(ctx->opcode));
> >    \
> >     gen_helper_f##op(t2, cpu_env, t0, t1);
> >   \
> > @@ -123,7 +116,6 @@ static void gen_f##name(DisasContext *ctx)
> >                       \
> >     t0 = tcg_temp_new_i64();
> >   \
> >     t1 = tcg_temp_new_i64();
> >   \
> >     t2 = tcg_temp_new_i64();
> >   \
> > -    gen_reset_fpstatus();
> >    \
> >     get_fpr(t0, rA(ctx->opcode));
> >    \
> >     get_fpr(t1, rC(ctx->opcode));
> >    \
> >     gen_helper_f##op(t2, cpu_env, t0, t1);
> >   \
> > @@ -156,7 +148,6 @@ static void gen_f##name(DisasContext *ctx)
> >                       \
> >     }
> >    \
> >     t0 = tcg_temp_new_i64();
> >   \
> >     t1 = tcg_temp_new_i64();
> >   \
> > -    gen_reset_fpstatus();
> >    \
> >     get_fpr(t0, rB(ctx->opcode));
> >    \
> >     gen_helper_f##name(t1, cpu_env, t0);
> >   \
> >     set_fpr(rD(ctx->opcode), t1);
> >    \
> > @@ -181,7 +172,6 @@ static void gen_f##name(DisasContext *ctx)
> >                       \
> >     }
> >    \
> >     t0 = tcg_temp_new_i64();
> >   \
> >     t1 = tcg_temp_new_i64();
> >   \
> > -    gen_reset_fpstatus();
> >    \
> >     get_fpr(t0, rB(ctx->opcode));
> >    \
> >     gen_helper_f##name(t1, cpu_env, t0);
> >   \
> >     set_fpr(rD(ctx->opcode), t1);
> >    \
> > @@ -222,7 +212,6 @@ static void gen_frsqrtes(DisasContext *ctx)
> >     }
> >     t0 = tcg_temp_new_i64();
> >     t1 = tcg_temp_new_i64();
> > -    gen_reset_fpstatus();
> >     get_fpr(t0, rB(ctx->opcode));
> >     gen_helper_frsqrte(t1, cpu_env, t0);
> >     gen_helper_frsp(t1, cpu_env, t1);
> > @@ -252,7 +241,6 @@ static void gen_fsqrt(DisasContext *ctx)
> >     }
> >     t0 = tcg_temp_new_i64();
> >     t1 = tcg_temp_new_i64();
> > -    gen_reset_fpstatus();
> >     get_fpr(t0, rB(ctx->opcode));
> >     gen_helper_fsqrt(t1, cpu_env, t0);
> >     set_fpr(rD(ctx->opcode), t1);
> > @@ -274,7 +262,6 @@ static void gen_fsqrts(DisasContext *ctx)
> >     }
> >     t0 = tcg_temp_new_i64();
> >     t1 = tcg_temp_new_i64();
> > -    gen_reset_fpstatus();
> >     get_fpr(t0, rB(ctx->opcode));
> >     gen_helper_fsqrt(t1, cpu_env, t0);
> >     gen_helper_frsp(t1, cpu_env, t1);
> > @@ -380,7 +367,6 @@ static void gen_fcmpo(DisasContext *ctx)
> >     }
> >     t0 = tcg_temp_new_i64();
> >     t1 = tcg_temp_new_i64();
> > -    gen_reset_fpstatus();
> >     crf = tcg_const_i32(crfD(ctx->opcode));
> >     get_fpr(t0, rA(ctx->opcode));
> >     get_fpr(t1, rB(ctx->opcode));
> > @@ -403,7 +389,6 @@ static void gen_fcmpu(DisasContext *ctx)
> >     }
> >     t0 = tcg_temp_new_i64();
> >     t1 = tcg_temp_new_i64();
> > -    gen_reset_fpstatus();
> >     crf = tcg_const_i32(crfD(ctx->opcode));
> >     get_fpr(t0, rA(ctx->opcode));
> >     get_fpr(t1, rB(ctx->opcode));
> > @@ -612,7 +597,6 @@ static void gen_mffs(DisasContext *ctx)
> >         return;
> >     }
> >     t0 = tcg_temp_new_i64();
> > -    gen_reset_fpstatus();
> >     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
> >     set_fpr(rD(ctx->opcode), t0);
> >     if (unlikely(Rc(ctx->opcode))) {
> > @@ -635,7 +619,6 @@ static void gen_mffsl(DisasContext *ctx)
> >         return;
> >     }
> >     t0 = tcg_temp_new_i64();
> > -    gen_reset_fpstatus();
> >     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
> >     /* Mask everything except mode, status, and enables.  */
> >     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_STATUS | FP_ENABLES | FP_RN);
> > @@ -660,7 +643,6 @@ static void gen_mffsce(DisasContext *ctx)
> >
> >     t0 = tcg_temp_new_i64();
> >
> > -    gen_reset_fpstatus();
> >     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
> >     set_fpr(rD(ctx->opcode), t0);
> >
> > @@ -678,7 +660,6 @@ static void gen_helper_mffscrn(DisasContext *ctx,
> > TCGv_i64 t1)
> >     TCGv_i64 t0 = tcg_temp_new_i64();
> >     TCGv_i32 mask = tcg_const_i32(0x0001);
> >
> > -    gen_reset_fpstatus();
> >     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
> >     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_ENABLES | FP_RN);
> >     set_fpr(rD(ctx->opcode), t0);
> > @@ -750,7 +731,6 @@ static void gen_mtfsb0(DisasContext *ctx)
> >         return;
> >     }
> >     crb = 31 - crbD(ctx->opcode);
> > -    gen_reset_fpstatus();
> >     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX)) {
> >         TCGv_i32 t0;
> >         t0 = tcg_const_i32(crb);
> > @@ -773,7 +753,6 @@ static void gen_mtfsb1(DisasContext *ctx)
> >         return;
> >     }
> >     crb = 31 - crbD(ctx->opcode);
> > -    gen_reset_fpstatus();
> >     /* XXX: we pretend we can only do IEEE floating-point computations */
> >     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX && crb != FPSCR_NI)) {
> >         TCGv_i32 t0;
> > @@ -807,7 +786,6 @@ static void gen_mtfsf(DisasContext *ctx)
> >         gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
> >         return;
> >     }
> > -    gen_reset_fpstatus();
> >     if (l) {
> >         t0 = tcg_const_i32((ctx->insns_flags2 & PPC2_ISA205) ? 0xffff :
> > 0xff);
> >     } else {
> > @@ -844,7 +822,6 @@ static void gen_mtfsfi(DisasContext *ctx)
> >         return;
> >     }
> >     sh = (8 * w) + 7 - bf;
> > -    gen_reset_fpstatus();
> >     t0 = tcg_const_i64(((uint64_t)FPIMM(ctx->opcode)) << (4 * sh));
> >     t1 = tcg_const_i32(1 << sh);
> >     gen_helper_store_fpscr(cpu_env, t0, t1);
> >



-- 
         此致
礼
罗勇刚
Yours
    sincerely,
Yonggang Luo

[-- Attachment #2: Type: text/html, Size: 23407 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-04  0:41       ` 罗勇刚(Yonggang Luo)
@ 2020-05-04 10:04         ` Alex Bennée
  2020-05-04 16:49         ` Richard Henderson
  1 sibling, 0 replies; 11+ messages in thread
From: Alex Bennée @ 2020-05-04 10:04 UTC (permalink / raw)
  To: luoyonggang; +Cc: qemu-ppc, Richard Henderson, qemu-devel


罗勇刚(Yonggang Luo) <luoyonggang@gmail.com> writes:

> On Mon, May 4, 2020 at 7:40 AM BALATON Zoltan <balaton@eik.bme.hu> wrote:
>
>> Hello,
>>
>> On Mon, 4 May 2020, 罗勇刚(Yonggang Luo) wrote:
>> > Hello Richard, Can you have a look at the following patch, and was that
>> are
>> > the right direction?
>>
>> Formatting of the patch is broken by your mailer, try sending it with
>> something that does not change it otherwise it's a bit hard to read.
>>
>> Richard suggested to add an assert to check the fp_status is correctly
>> cleared in place of helper_reset_fpstatus first for debugging so you could
>> change the helper accordingly before deleting it and run a few tests to
>> verify it still works. You'll need get some tests and benchmarks working
>> to be able to verify your changes that's why I've said that would be step
>> 0. If you checked that it still produces the same results and the assert
>> does not trigger then you can remove the helper.
>>
> That's what I need help,
> 1. How to write a assert to replace helper_reset_fpstatus .
>   just directly assert? or something else
> 2.  a few tests to run
>  How to running these tests, and where are these tests.

All the softfloat testing is currently done in tests/fp. I think you
need to make a new version of fp-test.c (fp-test-native.c?) that instead
of comparing the qemu softfloat functions with TestFloats runs the
native functions (as emitted by the compiler). That should pass when run
on real hardware and we can compare when run under emulation. 

> Do I need to add new tests? Where to start
> 3.  Benchmarks

fp-bench is the raw benchmarking app which again can be built for a
guest architecture to measure throughput under emulation.

> Same as 2
>
>>
>> Regards,
>> BALATON Zoltan
>>
>> > From b4d6ca1d6376fab1f1be06eb472e10b908887c2b Mon Sep 17 00:00:00 2001
>> > From: Yonggang Luo <luoyonggang@gmail.com>
>> > Date: Sat, 2 May 2020 05:59:25 +0800
>> > Subject: [PATCH] [ppc fp] Step 1. Rearrange the fp helpers to eliminate
>> > helper_reset_fpstatus(). I've mentioned this before, that it's possible
>> to
>> > leave the steady-state of env->fp_status.exception_flags == 0, so there's
>> > no
>> > need for a separate function call.  I suspect this is worth a decent
>> > speedup
>> > by itself.
>> >
>> > ---
>> > target/ppc/fpu_helper.c            | 53 ++----------------------------
>> > target/ppc/helper.h                |  1 -
>> > target/ppc/translate/fp-impl.inc.c | 23 -------------
>> > 3 files changed, 3 insertions(+), 74 deletions(-)
>> >
>> > diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
>> > index d9a8773ee1..4fc5a7ff1c 100644
>> > --- a/target/ppc/fpu_helper.c
>> > +++ b/target/ppc/fpu_helper.c
>> > @@ -821,6 +821,9 @@ static void do_float_check_status(CPUPPCState *env,
>> > uintptr_t raddr)
>> >                                    env->error_code, raddr);
>> >         }
>> >     }
>> > +    if (status) {
>> > +        set_float_exception_flags(0, &env->fp_status);
>> > +    }
>> > }
>> >
>> > void helper_float_check_status(CPUPPCState *env)
>> > @@ -828,11 +831,6 @@ void helper_float_check_status(CPUPPCState *env)
>> >     do_float_check_status(env, GETPC());
>> > }
>> >
>> > -void helper_reset_fpstatus(CPUPPCState *env)
>> > -{
>> > -    set_float_exception_flags(0, &env->fp_status);
>> > -}
>> > -
>> > static void float_invalid_op_addsub(CPUPPCState *env, bool set_fpcc,
>> >                                     uintptr_t retaddr, int classes)
>> > {
>> > @@ -2110,9 +2108,6 @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt,
>> >                       \
>> > {
>> >   \
>> >     ppc_vsr_t t = *xt;
>> >  \
>> >     int i;
>> >  \
>> > -
>> >  \
>> > -    helper_reset_fpstatus(env);
>> >   \
>> > -
>> >  \
>> >     for (i = 0; i < nels; i++) {
>> >  \
>> >         float_status tstat = env->fp_status;
>> >  \
>> >         set_float_exception_flags(0, &tstat);
>> >   \
>> > @@ -2152,8 +2147,6 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t
>> opcode,
>> >     ppc_vsr_t t = *xt;
>> >     float_status tstat;
>> >
>> > -    helper_reset_fpstatus(env);
>> > -
>> >     tstat = env->fp_status;
>> >     if (unlikely(Rc(opcode) != 0)) {
>> >         tstat.float_rounding_mode = float_round_to_odd;
>> > @@ -2189,9 +2182,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>> >                       \
>> > {
>> >   \
>> >     ppc_vsr_t t = *xt;
>> >  \
>> >     int i;
>> >  \
>> > -
>> >  \
>> > -    helper_reset_fpstatus(env);
>> >   \
>> > -
>> >  \
>> >     for (i = 0; i < nels; i++) {
>> >  \
>> >         float_status tstat = env->fp_status;
>> >  \
>> >         set_float_exception_flags(0, &tstat);
>> >   \
>> > @@ -2228,13 +2218,11 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t
>> > opcode,
>> >     ppc_vsr_t t = *xt;
>> >     float_status tstat;
>> >
>> > -    helper_reset_fpstatus(env);
>> >     tstat = env->fp_status;
>> >     if (unlikely(Rc(opcode) != 0)) {
>> >         tstat.float_rounding_mode = float_round_to_odd;
>> >     }
>> >
>> > -    set_float_exception_flags(0, &tstat);
>> >     t.f128 = float128_mul(xa->f128, xb->f128, &tstat);
>> >     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
>> >
>> > @@ -2263,9 +2251,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>> >                        \
>> > {
>> >    \
>> >     ppc_vsr_t t = *xt;
>> >   \
>> >     int i;
>> >   \
>> > -
>> >   \
>> > -    helper_reset_fpstatus(env);
>> >    \
>> > -
>> >   \
>> >     for (i = 0; i < nels; i++) {
>> >   \
>> >         float_status tstat = env->fp_status;
>> >   \
>> >         set_float_exception_flags(0, &tstat);
>> >    \
>> > @@ -2305,7 +2290,6 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t
>> opcode,
>> >     ppc_vsr_t t = *xt;
>> >     float_status tstat;
>> >
>> > -    helper_reset_fpstatus(env);
>> >     tstat = env->fp_status;
>> >     if (unlikely(Rc(opcode) != 0)) {
>> >         tstat.float_rounding_mode = float_round_to_odd;
>> > @@ -2342,9 +2326,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>> > ppc_vsr_t *xb)              \
>> > {
>> >    \
>> >     ppc_vsr_t t = *xt;
>> >   \
>> >     int i;
>> >   \
>> > -
>> >   \
>> > -    helper_reset_fpstatus(env);
>> >    \
>> > -
>> >   \
>> >     for (i = 0; i < nels; i++) {
>> >   \
>> >         if (unlikely(tp##_is_signaling_nan(xb->fld, &env->fp_status))) {
>> >   \
>> >             float_invalid_op_vxsnan(env, GETPC());
>> >   \
>> > @@ -2382,9 +2363,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>> > ppc_vsr_t *xb)             \
>> > {
>> >   \
>> >     ppc_vsr_t t = *xt;
>> >  \
>> >     int i;
>> >  \
>> > -
>> >  \
>> > -    helper_reset_fpstatus(env);
>> >   \
>> > -
>> >  \
>> >     for (i = 0; i < nels; i++) {
>> >  \
>> >         float_status tstat = env->fp_status;
>> >  \
>> >         set_float_exception_flags(0, &tstat);
>> >   \
>> > @@ -2430,9 +2408,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>> > ppc_vsr_t *xb)             \
>> > {
>> >   \
>> >     ppc_vsr_t t = *xt;
>> >  \
>> >     int i;
>> >  \
>> > -
>> >  \
>> > -    helper_reset_fpstatus(env);
>> >   \
>> > -
>> >  \
>> >     for (i = 0; i < nels; i++) {
>> >  \
>> >         float_status tstat = env->fp_status;
>> >  \
>> >         set_float_exception_flags(0, &tstat);
>> >   \
>> > @@ -2592,9 +2567,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,
>> >                        \
>> > {
>> >    \
>> >     ppc_vsr_t t = *xt;
>> >   \
>> >     int i;
>> >   \
>> > -
>> >   \
>> > -    helper_reset_fpstatus(env);
>> >    \
>> > -
>> >   \
>> >     for (i = 0; i < nels; i++) {
>> >   \
>> >         float_status tstat = env->fp_status;
>> >   \
>> >         set_float_exception_flags(0, &tstat);
>> >    \
>> > @@ -2765,9 +2737,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
>> >                   \
>> > {
>> \
>> >     uint32_t cc = 0;
>>  \
>> >     bool vxsnan_flag = false, vxvc_flag = false;
>>  \
>> > -
>>  \
>> > -    helper_reset_fpstatus(env);
>>   \
>> > -
>>  \
>> >     if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||
>> \
>> >         float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {
>> \
>> >         vxsnan_flag = true;
>> \
>> > @@ -2813,9 +2782,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,
>> >                  \
>> > {                                                                       \
>> >     uint32_t cc = 0;                                                    \
>> >     bool vxsnan_flag = false, vxvc_flag = false;                        \
>> > -
>> \
>> > -    helper_reset_fpstatus(env);
>>  \
>> > -
>> \
>> >     if (float128_is_signaling_nan(xa->f128, &env->fp_status) ||         \
>> >         float128_is_signaling_nan(xb->f128, &env->fp_status)) {         \
>> >         vxsnan_flag = true;                                             \
>> > @@ -3177,9 +3143,6 @@ uint64_t helper_xscvdpspn(CPUPPCState *env,
>> uint64_t
>> > xb)
>> > {
>> >     uint64_t result, sign, exp, frac;
>> >
>> > -    float_status tstat = env->fp_status;
>> > -    set_float_exception_flags(0, &tstat);
>> > -
>> >     sign = extract64(xb, 63,  1);
>> >     exp  = extract64(xb, 52, 11);
>> >     frac = extract64(xb,  0, 52) | 0x10000000000000ULL;
>> > @@ -3446,8 +3409,6 @@ VSX_ROUND(xvrspiz, 4, float32, VsrW(i),
>> > float_round_to_zero, 0)
>> >
>> > uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
>> > {
>> > -    helper_reset_fpstatus(env);
>> > -
>> >     uint64_t xt = helper_frsp(env, xb);
>> >
>> >     helper_compute_fprf_float64(env, xt);
>> > @@ -3593,8 +3554,6 @@ void helper_xsrqpi(CPUPPCState *env, uint32_t
>> opcode,
>> >     uint8_t rmode = 0;
>> >     float_status tstat;
>> >
>> > -    helper_reset_fpstatus(env);
>> > -
>> >     if (r == 0 && rmc == 0) {
>> >         rmode = float_round_ties_away;
>> >     } else if (r == 0 && rmc == 0x3) {
>> > @@ -3650,8 +3609,6 @@ void helper_xsrqpxp(CPUPPCState *env, uint32_t
>> opcode,
>> >     floatx80 round_res;
>> >     float_status tstat;
>> >
>> > -    helper_reset_fpstatus(env);
>> > -
>> >     if (r == 0 && rmc == 0) {
>> >         rmode = float_round_ties_away;
>> >     } else if (r == 0 && rmc == 0x3) {
>> > @@ -3700,8 +3657,6 @@ void helper_xssqrtqp(CPUPPCState *env, uint32_t
>> > opcode,
>> >     ppc_vsr_t t = { };
>> >     float_status tstat;
>> >
>> > -    helper_reset_fpstatus(env);
>> > -
>> >     tstat = env->fp_status;
>> >     if (unlikely(Rc(opcode) != 0)) {
>> >         tstat.float_rounding_mode = float_round_to_odd;
>> > @@ -3734,8 +3689,6 @@ void helper_xssubqp(CPUPPCState *env, uint32_t
>> opcode,
>> >     ppc_vsr_t t = *xt;
>> >     float_status tstat;
>> >
>> > -    helper_reset_fpstatus(env);
>> > -
>> >     tstat = env->fp_status;
>> >     if (unlikely(Rc(opcode) != 0)) {
>> >         tstat.float_rounding_mode = float_round_to_odd;
>> > diff --git a/target/ppc/helper.h b/target/ppc/helper.h
>> > index 4e192de97b..b486c9991f 100644
>> > --- a/target/ppc/helper.h
>> > +++ b/target/ppc/helper.h
>> > @@ -58,7 +58,6 @@ DEF_HELPER_FLAGS_1(cntlzw32, TCG_CALL_NO_RWG_SE, i32,
>> i32)
>> > DEF_HELPER_FLAGS_2(brinc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>> >
>> > DEF_HELPER_1(float_check_status, void, env)
>> > -DEF_HELPER_1(reset_fpstatus, void, env)
>> > DEF_HELPER_2(compute_fprf_float64, void, env, i64)
>> > DEF_HELPER_3(store_fpscr, void, env, i64, i32)
>> > DEF_HELPER_2(fpscr_clrbit, void, env, i32)
>> > diff --git a/target/ppc/translate/fp-impl.inc.c
>> > b/target/ppc/translate/fp-impl.inc.c
>> > index e18e268fe5..5e8cd9970e 100644
>> > --- a/target/ppc/translate/fp-impl.inc.c
>> > +++ b/target/ppc/translate/fp-impl.inc.c
>> > @@ -4,11 +4,6 @@
>> >  * Standard FPU translation
>> >  */
>> >
>> > -static inline void gen_reset_fpstatus(void)
>> > -{
>> > -    gen_helper_reset_fpstatus(cpu_env);
>> > -}
>> > -
>> > static inline void gen_compute_fprf_float64(TCGv_i64 arg)
>> > {
>> >     gen_helper_compute_fprf_float64(cpu_env, arg);
>> > @@ -48,7 +43,6 @@ static void gen_f##name(DisasContext *ctx)
>> >                     \
>> >     t3 = tcg_temp_new_i64();
>> >   \
>> >     /* NIP cannot be restored if the memory exception comes from an
>> helper
>> > */ \
>> >     gen_update_nip(ctx, ctx->base.pc_next - 4);
>> >    \
>> > -    gen_reset_fpstatus();
>> >    \
>> >     get_fpr(t0, rA(ctx->opcode));
>> >    \
>> >     get_fpr(t1, rC(ctx->opcode));
>> >    \
>> >     get_fpr(t2, rB(ctx->opcode));
>> >    \
>> > @@ -88,7 +82,6 @@ static void gen_f##name(DisasContext *ctx)
>> >                     \
>> >     t2 = tcg_temp_new_i64();
>> >   \
>> >     /* NIP cannot be restored if the memory exception comes from an
>> helper
>> > */ \
>> >     gen_update_nip(ctx, ctx->base.pc_next - 4);
>> >    \
>> > -    gen_reset_fpstatus();
>> >    \
>> >     get_fpr(t0, rA(ctx->opcode));
>> >    \
>> >     get_fpr(t1, rB(ctx->opcode));
>> >    \
>> >     gen_helper_f##op(t2, cpu_env, t0, t1);
>> >   \
>> > @@ -123,7 +116,6 @@ static void gen_f##name(DisasContext *ctx)
>> >                       \
>> >     t0 = tcg_temp_new_i64();
>> >   \
>> >     t1 = tcg_temp_new_i64();
>> >   \
>> >     t2 = tcg_temp_new_i64();
>> >   \
>> > -    gen_reset_fpstatus();
>> >    \
>> >     get_fpr(t0, rA(ctx->opcode));
>> >    \
>> >     get_fpr(t1, rC(ctx->opcode));
>> >    \
>> >     gen_helper_f##op(t2, cpu_env, t0, t1);
>> >   \
>> > @@ -156,7 +148,6 @@ static void gen_f##name(DisasContext *ctx)
>> >                       \
>> >     }
>> >    \
>> >     t0 = tcg_temp_new_i64();
>> >   \
>> >     t1 = tcg_temp_new_i64();
>> >   \
>> > -    gen_reset_fpstatus();
>> >    \
>> >     get_fpr(t0, rB(ctx->opcode));
>> >    \
>> >     gen_helper_f##name(t1, cpu_env, t0);
>> >   \
>> >     set_fpr(rD(ctx->opcode), t1);
>> >    \
>> > @@ -181,7 +172,6 @@ static void gen_f##name(DisasContext *ctx)
>> >                       \
>> >     }
>> >    \
>> >     t0 = tcg_temp_new_i64();
>> >   \
>> >     t1 = tcg_temp_new_i64();
>> >   \
>> > -    gen_reset_fpstatus();
>> >    \
>> >     get_fpr(t0, rB(ctx->opcode));
>> >    \
>> >     gen_helper_f##name(t1, cpu_env, t0);
>> >   \
>> >     set_fpr(rD(ctx->opcode), t1);
>> >    \
>> > @@ -222,7 +212,6 @@ static void gen_frsqrtes(DisasContext *ctx)
>> >     }
>> >     t0 = tcg_temp_new_i64();
>> >     t1 = tcg_temp_new_i64();
>> > -    gen_reset_fpstatus();
>> >     get_fpr(t0, rB(ctx->opcode));
>> >     gen_helper_frsqrte(t1, cpu_env, t0);
>> >     gen_helper_frsp(t1, cpu_env, t1);
>> > @@ -252,7 +241,6 @@ static void gen_fsqrt(DisasContext *ctx)
>> >     }
>> >     t0 = tcg_temp_new_i64();
>> >     t1 = tcg_temp_new_i64();
>> > -    gen_reset_fpstatus();
>> >     get_fpr(t0, rB(ctx->opcode));
>> >     gen_helper_fsqrt(t1, cpu_env, t0);
>> >     set_fpr(rD(ctx->opcode), t1);
>> > @@ -274,7 +262,6 @@ static void gen_fsqrts(DisasContext *ctx)
>> >     }
>> >     t0 = tcg_temp_new_i64();
>> >     t1 = tcg_temp_new_i64();
>> > -    gen_reset_fpstatus();
>> >     get_fpr(t0, rB(ctx->opcode));
>> >     gen_helper_fsqrt(t1, cpu_env, t0);
>> >     gen_helper_frsp(t1, cpu_env, t1);
>> > @@ -380,7 +367,6 @@ static void gen_fcmpo(DisasContext *ctx)
>> >     }
>> >     t0 = tcg_temp_new_i64();
>> >     t1 = tcg_temp_new_i64();
>> > -    gen_reset_fpstatus();
>> >     crf = tcg_const_i32(crfD(ctx->opcode));
>> >     get_fpr(t0, rA(ctx->opcode));
>> >     get_fpr(t1, rB(ctx->opcode));
>> > @@ -403,7 +389,6 @@ static void gen_fcmpu(DisasContext *ctx)
>> >     }
>> >     t0 = tcg_temp_new_i64();
>> >     t1 = tcg_temp_new_i64();
>> > -    gen_reset_fpstatus();
>> >     crf = tcg_const_i32(crfD(ctx->opcode));
>> >     get_fpr(t0, rA(ctx->opcode));
>> >     get_fpr(t1, rB(ctx->opcode));
>> > @@ -612,7 +597,6 @@ static void gen_mffs(DisasContext *ctx)
>> >         return;
>> >     }
>> >     t0 = tcg_temp_new_i64();
>> > -    gen_reset_fpstatus();
>> >     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
>> >     set_fpr(rD(ctx->opcode), t0);
>> >     if (unlikely(Rc(ctx->opcode))) {
>> > @@ -635,7 +619,6 @@ static void gen_mffsl(DisasContext *ctx)
>> >         return;
>> >     }
>> >     t0 = tcg_temp_new_i64();
>> > -    gen_reset_fpstatus();
>> >     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
>> >     /* Mask everything except mode, status, and enables.  */
>> >     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_STATUS | FP_ENABLES | FP_RN);
>> > @@ -660,7 +643,6 @@ static void gen_mffsce(DisasContext *ctx)
>> >
>> >     t0 = tcg_temp_new_i64();
>> >
>> > -    gen_reset_fpstatus();
>> >     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
>> >     set_fpr(rD(ctx->opcode), t0);
>> >
>> > @@ -678,7 +660,6 @@ static void gen_helper_mffscrn(DisasContext *ctx,
>> > TCGv_i64 t1)
>> >     TCGv_i64 t0 = tcg_temp_new_i64();
>> >     TCGv_i32 mask = tcg_const_i32(0x0001);
>> >
>> > -    gen_reset_fpstatus();
>> >     tcg_gen_extu_tl_i64(t0, cpu_fpscr);
>> >     tcg_gen_andi_i64(t0, t0, FP_DRN | FP_ENABLES | FP_RN);
>> >     set_fpr(rD(ctx->opcode), t0);
>> > @@ -750,7 +731,6 @@ static void gen_mtfsb0(DisasContext *ctx)
>> >         return;
>> >     }
>> >     crb = 31 - crbD(ctx->opcode);
>> > -    gen_reset_fpstatus();
>> >     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX)) {
>> >         TCGv_i32 t0;
>> >         t0 = tcg_const_i32(crb);
>> > @@ -773,7 +753,6 @@ static void gen_mtfsb1(DisasContext *ctx)
>> >         return;
>> >     }
>> >     crb = 31 - crbD(ctx->opcode);
>> > -    gen_reset_fpstatus();
>> >     /* XXX: we pretend we can only do IEEE floating-point computations */
>> >     if (likely(crb != FPSCR_FEX && crb != FPSCR_VX && crb != FPSCR_NI)) {
>> >         TCGv_i32 t0;
>> > @@ -807,7 +786,6 @@ static void gen_mtfsf(DisasContext *ctx)
>> >         gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
>> >         return;
>> >     }
>> > -    gen_reset_fpstatus();
>> >     if (l) {
>> >         t0 = tcg_const_i32((ctx->insns_flags2 & PPC2_ISA205) ? 0xffff :
>> > 0xff);
>> >     } else {
>> > @@ -844,7 +822,6 @@ static void gen_mtfsfi(DisasContext *ctx)
>> >         return;
>> >     }
>> >     sh = (8 * w) + 7 - bf;
>> > -    gen_reset_fpstatus();
>> >     t0 = tcg_const_i64(((uint64_t)FPIMM(ctx->opcode)) << (4 * sh));
>> >     t1 = tcg_const_i32(1 << sh);
>> >     gen_helper_store_fpscr(cpu_env, t0, t1);
>> >


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-04  0:41       ` 罗勇刚(Yonggang Luo)
  2020-05-04 10:04         ` Alex Bennée
@ 2020-05-04 16:49         ` Richard Henderson
  2020-05-04 18:30           ` BALATON Zoltan
  1 sibling, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2020-05-04 16:49 UTC (permalink / raw)
  To: luoyonggang, BALATON Zoltan; +Cc: qemu-ppc, Alex Bennée, qemu-devel

On 5/3/20 5:41 PM, 罗勇刚(Yonggang Luo) wrote:
> 
> 
> On Mon, May 4, 2020 at 7:40 AM BALATON Zoltan <balaton@eik.bme.hu
> <mailto:balaton@eik.bme.hu>> wrote:
> 
>     Hello,
> 
>     On Mon, 4 May 2020, 罗勇刚(Yonggang Luo) wrote:
>     > Hello Richard, Can you have a look at the following patch, and was that are
>     > the right direction?
> 
>     Formatting of the patch is broken by your mailer, try sending it with
>     something that does not change it otherwise it's a bit hard to read.
> 
>     Richard suggested to add an assert to check the fp_status is correctly
>     cleared in place of helper_reset_fpstatus first for debugging so you could
>     change the helper accordingly before deleting it and run a few tests to
>     verify it still works. You'll need get some tests and benchmarks working
>     to be able to verify your changes that's why I've said that would be step
>     0. If you checked that it still produces the same results and the assert
>     does not trigger then you can remove the helper.
> 
> That's what I need help,
> 1. How to write a assert to replace helper_reset_fpstatus .
>   just directly assert? or something else

You can't place the assert where helper_reset_fpstatus was.  You need to place
it in each of the helpers, like helper_fadd, that previously has a call to
helper_reset_fpstatus preceeding it.

The assert should be placed before the first floatN_op call that uses
env->fp_status.  E.g.

float64 helper_fadd(CPUPPCState *env, float64 arg1, float64 arg2)
{
    float64 ret;
    int status;

    status = get_float_exception_flags(&env->fp_status);
    assert(status == 0);
    ret = float64_add(arg1, arg2, &env->fp_status);
    status = get_float_exception_flags(&env->fp_status);

    if (unlikely(status & float_flag_invalid)) {
        float_invalid_op_addsub(env, 1, GETPC(),
                                float64_classify(arg1) |
                                float64_classify(arg2));
    }

    return ret;
}


r~


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-04 16:49         ` Richard Henderson
@ 2020-05-04 18:30           ` BALATON Zoltan
  2020-05-04 18:46             ` Richard Henderson
  0 siblings, 1 reply; 11+ messages in thread
From: BALATON Zoltan @ 2020-05-04 18:30 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, luoyonggang, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2785 bytes --]

On Mon, 4 May 2020, Richard Henderson wrote:
> On 5/3/20 5:41 PM, 罗勇刚(Yonggang Luo) wrote:
>> On Mon, May 4, 2020 at 7:40 AM BALATON Zoltan <balaton@eik.bme.hu
>> <mailto:balaton@eik.bme.hu>> wrote:
>>
>>     Hello,
>>
>>     On Mon, 4 May 2020, 罗勇刚(Yonggang Luo) wrote:
>>    > Hello Richard, Can you have a look at the following patch, and was that are
>>    > the right direction?
>>
>>     Formatting of the patch is broken by your mailer, try sending it with
>>     something that does not change it otherwise it's a bit hard to read.
>>
>>     Richard suggested to add an assert to check the fp_status is correctly
>>     cleared in place of helper_reset_fpstatus first for debugging so you could
>>     change the helper accordingly before deleting it and run a few tests to
>>     verify it still works. You'll need get some tests and benchmarks working
>>     to be able to verify your changes that's why I've said that would be step
>>     0. If you checked that it still produces the same results and the assert
>>     does not trigger then you can remove the helper.
>>
>> That's what I need help,
>> 1. How to write a assert to replace helper_reset_fpstatus .
>>   just directly assert? or something else
>
> You can't place the assert where helper_reset_fpstatus was.  You need to place
> it in each of the helpers, like helper_fadd, that previously has a call to
> helper_reset_fpstatus preceeding it.

Why? If we want to verify that clearing fp_status after flags are 
processed is equivalent to clearing flags before fp ops then verifying 
that the fp_status is already cleared when the current 
helper_reset_fpstatus is called should be enough to check that nothing has 
set the flags inbetween so the current reset helper would be no op. 
Therefore I thought you could put the assert there for checking this. This 
assert is for debugging and checking the change only and not meant to be 
left there otherwise we lose all the performance gain so it's easier to 
put in the current helper before removing it for this than in every fp op 
helper. What am I missing?

Regards,
BALATON Zoltan

> The assert should be placed before the first floatN_op call that uses
> env->fp_status.  E.g.
>
> float64 helper_fadd(CPUPPCState *env, float64 arg1, float64 arg2)
> {
>    float64 ret;
>    int status;
>
>    status = get_float_exception_flags(&env->fp_status);
>    assert(status == 0);
>    ret = float64_add(arg1, arg2, &env->fp_status);
>    status = get_float_exception_flags(&env->fp_status);
>
>    if (unlikely(status & float_flag_invalid)) {
>        float_invalid_op_addsub(env, 1, GETPC(),
>                                float64_classify(arg1) |
>                                float64_classify(arg2));
>    }
>
>    return ret;
> }
>
>
> r~
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-04 18:30           ` BALATON Zoltan
@ 2020-05-04 18:46             ` Richard Henderson
  2020-05-04 18:50               ` BALATON Zoltan
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2020-05-04 18:46 UTC (permalink / raw)
  To: BALATON Zoltan; +Cc: Alex Bennée, luoyonggang, qemu-ppc, qemu-devel

On 5/4/20 11:30 AM, BALATON Zoltan wrote:
> On Mon, 4 May 2020, Richard Henderson wrote:
>> On 5/3/20 5:41 PM, 罗勇刚(Yonggang Luo) wrote:
>>> On Mon, May 4, 2020 at 7:40 AM BALATON Zoltan <balaton@eik.bme.hu
>>> <mailto:balaton@eik.bme.hu>> wrote:
>>>
>>>     Hello,
>>>
>>>     On Mon, 4 May 2020, 罗勇刚(Yonggang Luo) wrote:
>>>    > Hello Richard, Can you have a look at the following patch, and was that
>>> are
>>>    > the right direction?
>>>
>>>     Formatting of the patch is broken by your mailer, try sending it with
>>>     something that does not change it otherwise it's a bit hard to read.
>>>
>>>     Richard suggested to add an assert to check the fp_status is correctly
>>>     cleared in place of helper_reset_fpstatus first for debugging so you could
>>>     change the helper accordingly before deleting it and run a few tests to
>>>     verify it still works. You'll need get some tests and benchmarks working
>>>     to be able to verify your changes that's why I've said that would be step
>>>     0. If you checked that it still produces the same results and the assert
>>>     does not trigger then you can remove the helper.
>>>
>>> That's what I need help,
>>> 1. How to write a assert to replace helper_reset_fpstatus .
>>>   just directly assert? or something else
>>
>> You can't place the assert where helper_reset_fpstatus was.  You need to place
>> it in each of the helpers, like helper_fadd, that previously has a call to
>> helper_reset_fpstatus preceeding it.
> 
> Why? If we want to verify that clearing fp_status after flags are processed is
> equivalent to clearing flags before fp ops then verifying that the fp_status is
> already cleared when the current helper_reset_fpstatus is called should be
> enough to check that nothing has set the flags in between so the current reset
> helper would be no op. Therefore I thought you could put the assert there for
> checking this. This assert is for debugging and checking the change only and
> not meant to be left there otherwise we lose all the performance gain so it's
> easier to put in the current helper before removing it for this than in every
> fp op helper. What am I missing?

I'm not sure what you are suggesting.

If you are suggesting

 void helper_reset_fpstatus(CPUPPCState *env)
 {
-    set_float_exception_flags(0, &env->fp_status);
+    assert(get_float_exception_flags(&env->fp_status) == 0);
 }

then, sure, that works.  But we also want to remove that call, so in order to
retain the check for debugging, we need to move the assert into the other helpers.


r~


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: An first try to improve PPC float simulation, not even compiled. Just ask question.
  2020-05-04 18:46             ` Richard Henderson
@ 2020-05-04 18:50               ` BALATON Zoltan
  0 siblings, 0 replies; 11+ messages in thread
From: BALATON Zoltan @ 2020-05-04 18:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-ppc, luoyonggang, Alex Bennée, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2997 bytes --]

On Mon, 4 May 2020, Richard Henderson wrote:
> On 5/4/20 11:30 AM, BALATON Zoltan wrote:
>> On Mon, 4 May 2020, Richard Henderson wrote:
>>> On 5/3/20 5:41 PM, 罗勇刚(Yonggang Luo) wrote:
>>>> On Mon, May 4, 2020 at 7:40 AM BALATON Zoltan <balaton@eik.bme.hu
>>>> <mailto:balaton@eik.bme.hu>> wrote:
>>>>
>>>>     Hello,
>>>>
>>>>     On Mon, 4 May 2020, 罗勇刚(Yonggang Luo) wrote:
>>>>    > Hello Richard, Can you have a look at the following patch, and was that
>>>> are
>>>>    > the right direction?
>>>>
>>>>     Formatting of the patch is broken by your mailer, try sending it with
>>>>     something that does not change it otherwise it's a bit hard to read.
>>>>
>>>>     Richard suggested to add an assert to check the fp_status is correctly
>>>>     cleared in place of helper_reset_fpstatus first for debugging so you could
>>>>     change the helper accordingly before deleting it and run a few tests to
>>>>     verify it still works. You'll need get some tests and benchmarks working
>>>>     to be able to verify your changes that's why I've said that would be step
>>>>     0. If you checked that it still produces the same results and the assert
>>>>     does not trigger then you can remove the helper.
>>>>
>>>> That's what I need help,
>>>> 1. How to write a assert to replace helper_reset_fpstatus .
>>>>   just directly assert? or something else
>>>
>>> You can't place the assert where helper_reset_fpstatus was.  You need to place
>>> it in each of the helpers, like helper_fadd, that previously has a call to
>>> helper_reset_fpstatus preceeding it.
>>
>> Why? If we want to verify that clearing fp_status after flags are processed is
>> equivalent to clearing flags before fp ops then verifying that the fp_status is
>> already cleared when the current helper_reset_fpstatus is called should be
>> enough to check that nothing has set the flags in between so the current reset
>> helper would be no op. Therefore I thought you could put the assert there for
>> checking this. This assert is for debugging and checking the change only and
>> not meant to be left there otherwise we lose all the performance gain so it's
>> easier to put in the current helper before removing it for this than in every
>> fp op helper. What am I missing?
>
> I'm not sure what you are suggesting.
>
> If you are suggesting
>
> void helper_reset_fpstatus(CPUPPCState *env)
> {
> -    set_float_exception_flags(0, &env->fp_status);
> +    assert(get_float_exception_flags(&env->fp_status) == 0);
> }
>
> then, sure, that works.  But we also want to remove that call, so in order to
> retain the check for debugging, we need to move the assert into the other helpers.

Yes, I meant to change helper_reset_fpstatus as above and add clearing 
fp_status after processing flags then run some tests to verify we can 
remove this call and then remove it together with the assert which should 
not be needed after this checking.

Regards,
BALATON Zoltan

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-05-04 18:51 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-01 20:04 An first try to improve PPC float simulation, not even compiled. Just ask question 罗勇刚(Yonggang Luo)
2020-05-01 20:49 ` Richard Henderson
2020-05-01 22:02   ` 罗勇刚(Yonggang Luo)
2020-05-03 19:46   ` 罗勇刚(Yonggang Luo)
2020-05-03 23:40     ` BALATON Zoltan
2020-05-04  0:41       ` 罗勇刚(Yonggang Luo)
2020-05-04 10:04         ` Alex Bennée
2020-05-04 16:49         ` Richard Henderson
2020-05-04 18:30           ` BALATON Zoltan
2020-05-04 18:46             ` Richard Henderson
2020-05-04 18:50               ` BALATON Zoltan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.