On Tue, Apr 28, 2020 at 4:36 PM Alex Bennée <alex.bennee@linaro.org> wrote:

>
> 罗勇刚(Yonggang Luo) <luoyonggang@gmail.com> writes:
>
> > I am confused about why we can use hard-float only when inexact is set.
>
> The inexact behaviour of the host hardware may be different from the
> guest architecture we are trying to emulate and the host hardware may
> not be configurable to emulate the guest mode.
>
> Have a look in softfloat.c and see all the places where
> float_flag_inexact is set. Can you convince yourself that the host
> hardware will do the same?
>
> > And PPC always clears the inexact flag before calling the soft-float
> > functions, so we cannot optimize it with hard-float.
> > I need some resources about the inexact flag and why it is always
> > cleared in the PPC FP simulation.
>
> Because that is the behaviour of the PPC floating point unit. The
> inexact flag will represent the last operation done.
>
> > I am looking at two possible solutions:
> > 1. do not clear the inexact flag in the PPC simulation
> > 2. even when inexact is cleared, still use an alternative hard-float path.
> >
> > But I am a beginner right now and have no clue about all of this.
>
> Well you'll need to learn about floating point because these are rather
> fundamental aspects of its behaviour. In the old days QEMU used to use
> the host floating point processor with its template based translation.
> However this led to lots of weird bugs because the floating point
> answers under qemu were different from the target it was trying to
> emulate. It was for this reason softfloat was introduced. The hardfloat
> optimisation can only be done when we are confident that we will get the
> exact same answer as the target we are trying to emulate - a "faster but
> incorrect" mode is just going to cause confusion as discussed in the
> previous thread. Have you read that yet?
>
Yep, I've already read that carefully, and I know that for PPC there is
currently no fast and correct way to do hard-float emulation. My intention
is to find a fast and correct way to do hard-float emulation for the PPC
target, at least on an x86 host.
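
To check my understanding of the constraint, here is a minimal sketch of
the gating logic as I read it from this thread. The type and names
(fp_status, FLAG_INEXACT, can_use_host_fpu, ppc_like_op_uses_host_fpu) are
hypothetical stand-ins for illustration, not QEMU's actual definitions, and
the real condition may check more than the inexact flag:

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not QEMU's real types or flag values. */
enum { FLAG_INEXACT = 1 << 0 };

typedef struct {
    unsigned flags;   /* accumulated (sticky) guest exception flags */
} fp_status;

/*
 * The host FPU result can be used directly only when the guest's inexact
 * flag is already set: setting it again changes nothing, so we do not need
 * to know whether this particular operation was inexact.
 */
static bool can_use_host_fpu(const fp_status *s)
{
    return (s->flags & FLAG_INEXACT) != 0;
}

/* Models a guest helper that clears the flags before every operation. */
static bool ppc_like_op_uses_host_fpu(fp_status *s)
{
    s->flags = 0;                 /* flag is cleared first ...          */
    return can_use_host_fpu(s);   /* ... so the fast path never fires   */
}
```

With a sticky flag the first function returns true once any earlier
operation was inexact, which is the common case Alex describes. With the
PPC-style clearing it always returns false, so every operation falls back
to softfloat.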

>
> >
> > On Mon, Apr 27, 2020 at 7:10 PM Alex Bennée <alex.bennee@linaro.org> wrote:
> >
> >>
> >> BALATON Zoltan <balaton@eik.bme.hu> writes:
> >>
> >> > On Mon, 27 Apr 2020, Alex Bennée wrote:
> >> >> 罗勇刚(Yonggang Luo) <luoyonggang@gmail.com> writes:
> >> >>> Because the ppc fpu-helpers always clear float_flag_inexact,
> >> >>> is it possible to optimize the performance when float_flag_inexact
> >> >>> is cleared?
> >> >>
> >> >> There was some discussion about this in the last thread about enabling
> >> >> hardfloat for PPC. See the thread:
> >> >>
> >> >>  Subject: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
> >> >>  Date: Tue, 18 Feb 2020 18:10:16 +0100
> >> >>  Message-Id: <20200218171702.979F074637D@zero.eik.bme.hu>
> >> >
> >> > I've answered this already with link to that thread here:
> >> >
> >> > On Fri, 10 Apr 2020, BALATON Zoltan wrote:
> >> > : Date: Fri, 10 Apr 2020 20:04:53 +0200 (CEST)
> >> > : From: BALATON Zoltan <balaton@eik.bme.hu>
> >> > : To: "罗勇刚(Yonggang Luo)" <luoyonggang@gmail.com>
> >> > : Cc: qemu-devel@nongnu.org, Mark Cave-Ayland, John Arbuckle,
> >> > :     qemu-ppc@nongnu.org, Paul Clarke, Howard Spoelstra, David Gibson
> >> > : Subject: Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
> >> > :
> >> > : On Fri, 10 Apr 2020, 罗勇刚(Yonggang Luo) wrote:
> >> > :> Is this stable now? I'd like to see hard float landed :)
> >> > :
> >> > : If you want to see hardfloat for PPC then you should read the
> >> > : replies to this patch which can be found here:
> >> > :
> >> > : http://patchwork.ozlabs.org/patch/1240235/
> >> > :
> >> > : to understand what's needed then try to implement the solution with
> >> > : FP exceptions cached in a global that maybe could work. I won't be
> >> > : able to do that as said here:
> >> > :
> >> > : https://lists.nongnu.org/archive/html/qemu-ppc/2020-03/msg00006.html
> >> > :
> >> > : because I don't have time to learn all the details needed. I think
> >> > : others are in the same situation so unless somebody puts in the
> >> > : necessary effort this won't change.
> >> >
> >> > Which also had a proposed solution to the problem that you could try
> >> > to implement, in particular see this message:
> >> >
> >> > http://patchwork.ozlabs.org/project/qemu-devel/patch/20200218171702.979F074637D@zero.eik.bme.hu/#2375124
> >> >
> >> > and Richard's reply immediately below that. In short, to optimise FPU
> >> > emulation we would either need to find a way to compute the inexact flag
> >> > quickly without reading the FPU status (this may not be possible) or
> >> > somehow get the status from the FPU, but the obvious way of clearing the
> >> > flags and reading them after each operation is too slow. So maybe using
> >> > exceptions and only clearing when there's actually a change could be
> >> > faster.
> >> >
> >> > As to how to use exceptions see this message in above thread:
> >> >
> >> > https://lists.nongnu.org/archive/html/qemu-ppc/2020-03/msg00005.html
> >> >
> >> > But that only shows how to hook in an exception handler; what it
> >> > does still needs to be implemented, then tested and benchmarked.
> >> >
> >> > I still don't know where the extensive PPC floating point tests to
> >> > use for checking results are, though, as that was never answered.
> >>
> >> Specifically for PPC we don't have them. We use the softfloat test cases
> >> to exercise our softfloat/hardfloat code as part of "make
> >> check-softfloat". You can also re-build fp-bench for each guest target
> >> to measure raw throughput.
> >>
> >> >> However in short the problem is if the float_flag_inexact is clear you
> >> >> must use softfloat so you can properly calculate the inexact status. We
> >> >> can't take advantage of the inexact stickiness without losing the
> >> >> fidelity of the calculation.
> >> >
> >> > I still don't get why we can't use hardware via an exception handler to
> >> > detect flags for us and why we only use hardfloat in some corner
> >> > cases. If reading the status is too costly then we could mirror it in
> >> > a global which is set by an FP exception handler. Shouldn't that be
> >> > faster? Is there a reason that can't work?
> >>
> >> It would work but it would be slow. Almost every FP operation sets
> >> the inexact flag so it would generate an exception and exceptions take
> >> time to process.
> >>
> >> For the guests where we use hardfloat, operations with inexact already
> >> latched are not a corner case - they are the common case, which is why
> >> it helps.
> >>
> >> >
> >> > Regards,
> >> > BALATON Zoltan
> >>
> >>
> >> --
> >> Alex Bennée
> >>
>
>
> --
> Alex Bennée
>

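On the idea quoted above of computing the inexact flag without reading the
FPU status: for addition only, one possible sketch is the classic TwoSum
error-free transformation, which recovers the rounding error of a sum with
ordinary arithmetic. The helper name add_is_inexact is hypothetical, and I
am assuming finite inputs, no overflow, round-to-nearest, and strict IEEE
double arithmetic on the host (no -ffast-math, no x87 excess precision):

```c
#include <stdbool.h>

/*
 * Hypothetical helper: reports whether a + b had to be rounded, without
 * touching the host FPU status register. Assumes finite inputs, no
 * overflow, round-to-nearest and strict IEEE double evaluation.
 */
static bool add_is_inexact(double a, double b, double *sum)
{
    /* volatile discourages the compiler from simplifying the sequence */
    volatile double s   = a + b;               /* rounded sum from the FPU  */
    volatile double bb  = s - a;               /* b as it was actually added */
    volatile double aa  = s - bb;              /* a as it was actually added */
    volatile double err = (a - aa) + (b - bb); /* rounding error of s        */

    *sum = s;
    return err != 0.0;   /* a nonzero error means the sum was inexact */
}
```

This costs several extra operations per add and covers only one operation
and one rounding mode, so whether anything like it beats softfloat would
have to be measured, for example with fp-bench.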

-- 
Yours sincerely,
Yonggang Luo (罗勇刚)