From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29D74C4BA2D for ; Wed, 26 Feb 2020 22:52:35 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EF1052072D for ; Wed, 26 Feb 2020 22:52:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EF1052072D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=eik.bme.hu Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:51188 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j75Xe-00033a-6l for qemu-devel@archiver.kernel.org; Wed, 26 Feb 2020 17:52:34 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:38569) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j75WJ-0001cM-PU for qemu-devel@nongnu.org; Wed, 26 Feb 2020 17:51:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j75WI-0003Lm-0J for qemu-devel@nongnu.org; Wed, 26 Feb 2020 17:51:11 -0500 Received: from zero.eik.bme.hu ([2001:738:2001:2001::2001]:42890) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1j75WH-00038g-Pn; Wed, 26 Feb 2020 17:51:09 -0500 Received: from zero.eik.bme.hu (blah.eik.bme.hu [152.66.115.182]) by localhost (Postfix) with SMTP id BF4AC74637D; Wed, 26 Feb 2020 23:51:05 +0100 (CET) Received: by zero.eik.bme.hu (Postfix, from userid 432) id 98BA37461AE; Wed, 26 Feb 2020 23:51:05 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by zero.eik.bme.hu (Postfix) with ESMTP id 96A3E74569F; Wed, 26 Feb 2020 23:51:05 +0100 (CET) Date: Wed, 26 Feb 2020 23:51:05 +0100 (CET) From: BALATON Zoltan To: =?ISO-8859-15?Q?Alex_Benn=E9e?= Subject: Re: R: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC In-Reply-To: <87eeuhxw0y.fsf@linaro.org> Message-ID: References: <20200218171702.979F074637D@zero.eik.bme.hu> <1BC2E9E9-A694-4ED3-BD3D-D731F23B7245@gmail.com> <3539F747-145F-49CC-B494-C9794A8ABABA@gmail.com> <87eeuhxw0y.fsf@linaro.org> User-Agent: Alpine 2.22 (BSF 395 2020-01-19) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="3866299591-1146421166-1582757465=:39102" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2001:738:2001:2001::2001 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-devel@nongnu.org, Programmingkid , "qemu-ppc@nongnu.org" , Howard Spoelstra , luigi burdo , Dino Papararo , David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --3866299591-1146421166-1582757465=:39102 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable On Wed, 26 Feb 2020, Alex Benn=C3=A9e wrote: > That's the wrong way around. We have regression tests for a reason. I'l= l > happily accept patches to turn on hardfloat for PPC if: > > a) they don't cause regressions in our fairly extensive floating point > tests Where are these tests and how to run them? I'm not aware of such tests so= =20 I've only tried running simple guest code to test changes but if there ar= e=20 more extensive FP tests I'd better use those. > b) the PPC maintainers are happy with the new performance profile > > The way forward would be to: > > 1. patch to drop #if defined(TARGET_PPC) || defined(__FAST_MATH__) This is simple but I've found that while it seems to make some vector=20 instructions faster it also makes most simple FP ops slower because it=20 will go thorugh checking if it can use hardfloat but never can because th= e=20 fp_status is cleared before every FP op. That's why I've set inexact bit=20 to let it use hardfloat and be able to test if it would work at all.=20 That's all my RFC patch did, I've had a 2nd version trying to avoid slow=20 down with above #if defined() dropped but hardfloat=3Dfalse so it only us= es=20 softfloat as before but it did not worked out too well, some tests said v= 2=20 was even slower. Maybe to avoid overhead we should add a variable instead= =20 of the QEMU_NO_HARDFLOAT define that can be set during runtime but=20 probably that won't be faster either. Thus it seems there's no way to=20 enable hardfloat for PPC and not have slower performance for most FP ops=20 without also doing some of the other points below (even if it's beneficia= l=20 for vector ops). > 2. audit target/ppc/fpu_helper.c w.r.t chip manual and fix any unneeded > splatting of flags (if any) This would either need someone who knows PPC FPU or someone who can take=20 the time to learn and go through the code. Not sure I want to volunteer=20 for that. But I think the clearing of the flags is mainly to emulate FI=20 bit which is an non-sticky inexact bit that should show the inexact statu= s=20 of last FP op. (There's another simliar bit for fraction rounded as well=20 but that does not disable hardfloat.) Question is if we really want to=20 accurately emulate these bits? Are there any software we care about=20 relying on these? If we can live with not having correct FI bit emulation= =20 (which was the case for a long time until these were fixed) then we could= =20 have an easy way to enable hardfloat without more extensive changes. If w= e=20 want to accurately emulate also these bits then we probably will need=20 changes to softfloat to allow registering FP exception handlers so we=20 don't have to clear and check bits but can get an exception from FPU and=20 then can set those bits but I have no idea how to do that. Regards, BALATON Zoltan --3866299591-1146421166-1582757465=:39102--