From: Greg Bellows
Date: Fri, 30 Jan 2015 08:21:23 -0600
Subject: Re: [Qemu-devel] [PATCH] target-arm: Squash input denormals in FRECPS and FRSQRTS
To: Laurent Desnogues
Cc: Peter Maydell, Patch Tracking, Alex Bennée, "qemu-devel@nongnu.org", Xiangyu Hu
References: <1422559909-19377-1-git-send-email-peter.maydell@linaro.org>


On Fri, Jan 30, 2015 at 3:41 AM, Laurent Desnogues <laurent.desnogues@gmail.com> wrote:
> On Thu, Jan 29, 2015 at 8:31 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> > The helper functions for FRECPS and FRSQRTS have special case
> > handling that includes checks for zero inputs, so squash input
> > denormals if necessary before those checks. This fixes incorrect
> > output when the FPCR DZ bit is set to enable squashing of input
> > denormals.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>
> Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
>
> Thanks,
>
> Laurent
>
> > ---
> > A quick eyeball of helper-a64.c suggests that these are the only
> > other insns we needed to fix, and a risu test of these insns
> > confirms that (a) they're buggy and (b) this patch fixes them.
> > I haven't done an exhaustive coverage test of the whole instruction
> > set with the DZ bit set, though...
> >
> >  target-arm/helper-a64.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
> > index ebd9247..8aa40e9 100644
> > --- a/target-arm/helper-a64.c
> > +++ b/target-arm/helper-a64.c
> > @@ -229,6 +229,9 @@ float32 HELPER(recpsf_f32)(float32 a, float32 b, void *fpstp)
> >  {
> >      float_status *fpst = fpstp;
> >
> > +    a = float32_squash_input_denormal(a, fpst);
> > +    b = float32_squash_input_denormal(b, fpst);
> > +
> >      a = float32_chs(a);
> >      if ((float32_is_infinity(a) && float32_is_zero(b)) ||
> >          (float32_is_infinity(b) && float32_is_zero(a))) {
> > @@ -241,6 +244,9 @@ float64 HELPER(recpsf_f64)(float64 a, float64 b, void *fpstp)
> >  {
> >      float_status *fpst = fpstp;
> >
> > +    a = float64_squash_input_denormal(a, fpst);
> > +    b = float64_squash_input_denormal(b, fpst);
> > +
> >      a = float64_chs(a);
> >      if ((float64_is_infinity(a) && float64_is_zero(b)) ||
> >          (float64_is_infinity(b) && float64_is_zero(a))) {
> > @@ -253,6 +259,9 @@ float32 HELPER(rsqrtsf_f32)(float32 a, float32 b, void *fpstp)
> >  {
> >      float_status *fpst = fpstp;
> >
> > +    a = float32_squash_input_denormal(a, fpst);
> > +    b = float32_squash_input_denormal(b, fpst);
> > +
> >      a = float32_chs(a);
> >      if ((float32_is_infinity(a) && float32_is_zero(b)) ||
> >          (float32_is_infinity(b) && float32_is_zero(a))) {
> > @@ -265,6 +274,9 @@ float64 HELPER(rsqrtsf_f64)(float64 a, float64 b, void *fpstp)
> >  {
> >      float_status *fpst = fpstp;
> >
> > +    a = float64_squash_input_denormal(a, fpst);
> > +    b = float64_squash_input_denormal(b, fpst);
> > +
> >      a = float64_chs(a);
> >      if ((float64_is_infinity(a) && float64_is_zero(b)) ||
> >          (float64_is_infinity(b) && float64_is_zero(a))) {
> > --
> > 1.9.1
> >
> >


Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
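
For anyone reading the archive later, the ordering issue the patch addresses can be sketched outside of QEMU. The following is a minimal standalone model, not the softfloat code: the fp_status struct, its flush_inputs_to_zero flag, and the f32_* helpers are hypothetical stand-ins that work on raw binary32 bit patterns, and the sign flip (float32_chs) from the real helpers is left out because it does not affect the zero/infinity test. It only shows why the squash has to come before the infinity-times-zero check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's float_status; only the one flag we need. */
typedef struct {
    bool flush_inputs_to_zero;
} fp_status;

/* IEEE 754 binary32 classification on the raw bit pattern. */
static bool f32_is_zero(uint32_t v)
{
    return (v & 0x7fffffffu) == 0;
}

static bool f32_is_infinity(uint32_t v)
{
    return (v & 0x7fffffffu) == 0x7f800000u;
}

static bool f32_is_denormal(uint32_t v)
{
    /* Exponent field all zeros, fraction field non-zero. */
    return (v & 0x7f800000u) == 0 && (v & 0x007fffffu) != 0;
}

/* With flushing enabled, replace a denormal input by a zero of the same sign. */
static uint32_t f32_squash_input_denormal(uint32_t v, const fp_status *st)
{
    if (st->flush_inputs_to_zero && f32_is_denormal(v)) {
        return v & 0x80000000u;
    }
    return v;
}

/* The special-case test used by the helpers: one operand infinite, one zero. */
static bool inf_times_zero(uint32_t a, uint32_t b)
{
    return (f32_is_infinity(a) && f32_is_zero(b)) ||
           (f32_is_infinity(b) && f32_is_zero(a));
}

/* Pre-patch ordering: the check sees the raw, unsquashed inputs. */
static bool special_case_unsquashed(uint32_t a, uint32_t b, const fp_status *st)
{
    (void)st;
    return inf_times_zero(a, b);
}

/* Post-patch ordering: squash both inputs first, then check. */
static bool special_case_squashed(uint32_t a, uint32_t b, const fp_status *st)
{
    a = f32_squash_input_denormal(a, st);
    b = f32_squash_input_denormal(b, st);
    return inf_times_zero(a, b);
}

/* Infinity paired with the smallest positive denormal, flushing enabled. */
static void check_ordering_example(void)
{
    fp_status st = { .flush_inputs_to_zero = true };
    uint32_t inf = 0x7f800000u;
    uint32_t denormal = 0x00000001u;

    assert(!special_case_unsquashed(inf, denormal, &st)); /* case is missed */
    assert(special_case_squashed(inf, denormal, &st));    /* case is caught */
}
```

With flushing enabled, the unsquashed check treats the denormal as non-zero and falls through to the general path, while the squashed check takes the special case. With the flag cleared, both orderings agree.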
