On Tue, Apr 28, 2020 at 4:36 PM Alex Bennée wrote:
>
> 罗勇刚(Yonggang Luo) writes:
>
> > I am confused about why we can use hard-float only when inexact is set.
>
> The inexact behaviour of the host hardware may be different from the
> guest architecture we are trying to emulate and the host hardware may
> not be configurable to emulate the guest mode.
>
> Have a look in softfloat.c and see all the places where
> float_flag_inexact is set. Can you convince yourself that the host
> hardware will do the same?
>
> > And PPC always clears the inexact flag before calling the soft-float
> > functions, so we cannot optimize it with hard-float.
> > I need some resources about the inexact flag and why it is always
> > cleared in the PPC FP simulation.
>
> Because that is the behaviour of the PPC floating point unit. The
> inexact flag will represent the last operation done.
>
> > I am looking at two possible solutions:
> > 1. do not clear the inexact flag in the PPC simulation
> > 2. even when inexact is cleared, still use an alternative hard-float path.
> >
> > But I am a beginner right now and have no clue about all of these things.
>
> Well you'll need to learn about floating point because these are rather
> fundamental aspects of its behaviour. In the old days QEMU used to use
> the host floating point processor with its template based translation.
> However this led to lots of weird bugs because the floating point
> answers under qemu were different from the target it was trying to
> emulate. It was for this reason softfloat was introduced. The hardfloat
> optimisation can only be done when we are confident that we will get the
> exact same answer as the target we are trying to emulate - a "faster but
> incorrect" mode is just going to cause confusion as discussed in the
> previous thread. Have you read that yet?
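A minimal sketch of the rule discussed above, written as a toy Python model rather than QEMU code. It assumes that the fast path may be taken only while the inexact flag is already latched, and that a PPC-like guest clears the flag before every operation. FloatStatus, FLAG_INEXACT, soft_add, hard_add, guest_add and run are hypothetical names made up for this illustration; the real code in softfloat.c has further conditions that this model leaves out.

```python
from fractions import Fraction

FLAG_INEXACT = 0x1  # hypothetical bit value, not QEMU's real encoding


class FloatStatus:
    """Toy stand-in for a guest FP status word (hypothetical, not QEMU's)."""

    def __init__(self):
        self.flags = 0
        self.soft_calls = 0  # how many operations took the slow path


def soft_add(a, b, status):
    """Slow path: add and work out in software whether the result was rounded.

    Only valid for finite inputs whose sum does not overflow.
    """
    status.soft_calls += 1
    result = a + b
    # Fraction(float) is exact, so this compares the true sum with the
    # rounded double result.
    if Fraction(a) + Fraction(b) != Fraction(result):
        status.flags |= FLAG_INEXACT
    return result


def hard_add(a, b):
    """Fast path: plain host addition, guest flags are left untouched."""
    return a + b


def guest_add(a, b, status):
    # If inexact is already latched, setting it again changes nothing,
    # so the host result can be used without computing the flag.
    if status.flags & FLAG_INEXACT:
        return hard_add(a, b)
    return soft_add(a, b, status)


def run(values, clear_before_each_op):
    """Sum values; optionally clear inexact before every operation."""
    status = FloatStatus()
    acc = 0.0
    for v in values:
        if clear_before_each_op:
            status.flags &= ~FLAG_INEXACT
        acc = guest_add(acc, v, status)
    return acc, status.soft_calls


if __name__ == "__main__":
    values = [0.1] * 10
    print("sticky flag:      ", run(values, clear_before_each_op=False))
    print("cleared each time:", run(values, clear_before_each_op=True))
```

In this model both runs produce the same sum, but the run that clears the flag before each operation never reaches hard_add, which is the situation described for PPC above.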
>

Yes, I've already read that carefully, and I know that for PPC there is
currently no fast and correct way to do hard-float emulation. My intention
is to find a way to do fast and correct hard-float emulation for the PPC
target, at least on an x86 host.

> > On Mon, Apr 27, 2020 at 7:10 PM Alex Bennée wrote:
> >
> >>
> >> BALATON Zoltan writes:
> >>
> >> > On Mon, 27 Apr 2020, Alex Bennée wrote:
> >> >> 罗勇刚(Yonggang Luo) writes:
> >> >>> Because the ppc fpu-helpers always clear float_flag_inexact,
> >> >>> is it possible to optimize the performance when float_flag_inexact
> >> >>> is cleared?
> >> >>
> >> >> There was some discussion about this in the last thread about enabling
> >> >> hardfloat for PPC. See the thread:
> >> >>
> >> >>  Subject: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
> >> >>  Date: Tue, 18 Feb 2020 18:10:16 +0100
> >> >>  Message-Id: <20200218171702.979F074637D@zero.eik.bme.hu>
> >> >
> >> > I've answered this already with a link to that thread here:
> >> >
> >> > On Fri, 10 Apr 2020, BALATON Zoltan wrote:
> >> > : Date: Fri, 10 Apr 2020 20:04:53 +0200 (CEST)
> >> > : From: BALATON Zoltan
> >> > : To: "罗勇刚(Yonggang Luo)"
> >> > : Cc: qemu-devel@nongnu.org, Mark Cave-Ayland, John Arbuckle,
> >> > :     qemu-ppc@nongnu.org, Paul Clarke, Howard Spoelstra, David Gibson
> >> > : Subject: Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
> >> > :
> >> > : On Fri, 10 Apr 2020, 罗勇刚(Yonggang Luo) wrote:
> >> > :> Is this stable now? I'd like to see hard float landed :)
> >> > :
> >> > : If you want to see hardfloat for PPC then you should read the replies to
> >> > : this patch which can be found here:
> >> > :
> >> > : http://patchwork.ozlabs.org/patch/1240235/
> >> > :
> >> > : to understand what's needed then try to implement the solution with FP
> >> > : exceptions cached in a global that maybe could work.
> >> > : I won't be able to do that as said here:
> >> > :
> >> > : https://lists.nongnu.org/archive/html/qemu-ppc/2020-03/msg00006.html
> >> > :
> >> > : because I don't have time to learn all the details needed. I think
> >> > : others are in the same situation so unless somebody puts in the
> >> > : necessary effort this won't change.
> >> >
> >> > Which also had a proposed solution to the problem that you could try
> >> > to implement, in particular see this message:
> >> >
> >> > http://patchwork.ozlabs.org/project/qemu-devel/patch/20200218171702.979F074637D@zero.eik.bme.hu/#2375124
> >> >
> >> > and Richard's reply immediately below that. In short, to optimise FPU
> >> > emulation we would either find a way to compute the inexact flag quickly
> >> > without reading the FPU status (this may not be possible) or somehow
> >> > get the status from the FPU, but the obvious way of clearing the flags
> >> > and reading them after each operation is too slow. So maybe using
> >> > exceptions and only clearing when there's actually a change could be
> >> > faster.
> >> >
> >> > As to how to use exceptions see this message in the above thread:
> >> >
> >> > https://lists.nongnu.org/archive/html/qemu-ppc/2020-03/msg00005.html
> >> >
> >> > But that only shows how to hook in an exception handler; what it
> >> > does still needs to be implemented, then tested and benchmarked.
> >> >
> >> > I still don't know where the extensive PPC floating point tests to
> >> > use for checking results are, though, as that was never answered.
> >>
> >> Specifically for PPC we don't have them. We use the softfloat test cases
> >> to exercise our softfloat/hardfloat code as part of "make
> >> check-softfloat". You can also re-build fp-bench for each guest target
> >> to measure raw throughput.
> >>
> >> >> However in short the problem is if the float_flag_inexact is clear you
> >> >> must use softfloat so you can properly calculate the inexact status.
> >> >> We can't take advantage of the inexact stickiness without losing the
> >> >> fidelity of the calculation.
> >> >
> >> > I still don't get why we can't use hardware via an exception handler to
> >> > detect flags for us and why we only use hardfloat in some corner
> >> > cases. If reading the status is too costly then we could mirror it in
> >> > a global which is set by an FP exception handler. Shouldn't that be
> >> > faster? Is there a reason that can't work?
> >>
> >> It would work but it would be slow. Almost every FP operation sets
> >> the inexact flag so it would generate an exception and exceptions take
> >> time to process.
> >>
> >> For the guests where we use hardfloat, operations with inexact already
> >> latched are not a corner case - they are the common case, which is why
> >> it helps.
> >>
> >> >
> >> > Regards,
> >> > BALATON Zoltan
> >>
> >>
> >> --
> >> Alex Bennée
> >
>
> --
> Alex Bennée
>

--
Yours sincerely,
罗勇刚 (Yonggang Luo)
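
The point quoted above, that most FP operations raise inexact, can be checked with a small experiment. The sketch below is a rough illustration in Python, not a measurement of any guest workload: it draws operands with random.uniform (an arbitrary choice for this example) and uses exact rational arithmetic to decide whether each double-precision result had to be rounded. is_inexact and inexact_rate are hypothetical helper names.

```python
import operator
import random
from fractions import Fraction


def is_inexact(op, a, b):
    """Return True if the double result of op(a, b) had to be rounded.

    Only valid for finite operands and finite results (and b != 0 for
    division), because Fraction cannot represent inf or NaN.
    """
    exact = op(Fraction(a), Fraction(b))  # exact rational result
    return exact != Fraction(op(a, b))    # compare with the rounded double


def inexact_rate(op, samples=2000, seed=0):
    """Fraction of random operand pairs for which op gives a rounded result."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        a = rng.uniform(1.0, 1000.0)
        b = rng.uniform(1.0, 1000.0)
        hits += is_inexact(op, a, b)
    return hits / samples


if __name__ == "__main__":
    for name, op in [("add", operator.add),
                     ("mul", operator.mul),
                     ("div", operator.truediv)]:
        print(name, inexact_rate(op))
```

With an exception handler mirroring the flag, each of those inexact results would trap once the flag had been cleared, which is the cost being described. The exact-arithmetic check is only a reference for the experiment; it is far too slow to serve as a way of computing the flag in an emulator.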