All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix some Neon insns on big-endian hosts
@ 2020-10-28 19:17 Peter Maydell
  2020-10-28 19:17 ` [PATCH 1/2] target/arm: Fix float16 pairwise Neon ops " Peter Maydell
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Peter Maydell @ 2020-10-28 19:17 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

This patchseries fixes a couple of places where the vector
helpers we use for Neon insns were missing uses of the
address-swizzling macros H1(), H2() or H4(). This is harmless
on little-endian hosts but causes the wrong results to be
generated on big-endian hosts. The affected insns are
VUDOT(scalar), VSDOT(scalar), VPADD, VPMAX and VPMIN.

This series is independent of Richard's recent "target/arm:
Fix neon reg offsets"; it fixes the handful of remaining
risu test failures I see.

thanks
-- PMM

Peter Maydell (2):
  target/arm: Fix float16 pairwise Neon ops on big-endian hosts
  target/arm: Fix VUDOT/VSDOT (scalar) on big-endian hosts

 target/arm/vec_helper.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

-- 
2.20.1



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] target/arm: Fix float16 pairwise Neon ops on big-endian hosts
  2020-10-28 19:17 [PATCH 0/2] Fix some Neon insns on big-endian hosts Peter Maydell
@ 2020-10-28 19:17 ` Peter Maydell
  2020-10-28 19:45   ` Philippe Mathieu-Daudé
  2020-10-28 19:17 ` [PATCH 2/2] target/arm: Fix VUDOT/VSDOT (scalar) " Peter Maydell
  2020-10-28 19:34 ` [PATCH 0/2] Fix some Neon insns " Richard Henderson
  2 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2020-10-28 19:17 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

In the neon_padd/pmax/pmin helpers for float16, a cut-and-paste error
meant we were using the H4() address swizzler macro rather than the
H2() which is required for 2-byte data.  This had no effect on
little-endian hosts but meant we put the result data into the
destination Dreg in the wrong order on big-endian hosts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/vec_helper.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index a973454e4f4..30d76d05beb 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1858,10 +1858,10 @@ DO_ABA(gvec_uaba_d, uint64_t)
         r2 = float16_##OP(m[H2(0)], m[H2(1)], fpst);                    \
         r3 = float16_##OP(m[H2(2)], m[H2(3)], fpst);                    \
                                                                         \
-        d[H4(0)] = r0;                                                  \
-        d[H4(1)] = r1;                                                  \
-        d[H4(2)] = r2;                                                  \
-        d[H4(3)] = r3;                                                  \
+        d[H2(0)] = r0;                                                  \
+        d[H2(1)] = r1;                                                  \
+        d[H2(2)] = r2;                                                  \
+        d[H2(3)] = r3;                                                  \
     }
 
 DO_NEON_PAIRWISE(neon_padd, add)
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] target/arm: Fix VUDOT/VSDOT (scalar) on big-endian hosts
  2020-10-28 19:17 [PATCH 0/2] Fix some Neon insns on big-endian hosts Peter Maydell
  2020-10-28 19:17 ` [PATCH 1/2] target/arm: Fix float16 pairwise Neon ops " Peter Maydell
@ 2020-10-28 19:17 ` Peter Maydell
  2020-10-28 19:51   ` Philippe Mathieu-Daudé
  2020-10-28 19:34 ` [PATCH 0/2] Fix some Neon insns " Richard Henderson
  2 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2020-10-28 19:17 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

The helper functions for performing the udot/sdot operations against
a scalar were not using an address-swizzling macro when converting
the index of the scalar element into a pointer into the vm array.
This had no effect on little-endian hosts but meant we generated
incorrect results on big-endian hosts.

For these insns, the index is indexing over group of 4 8-bit values,
so 32 bits per indexed entity, and H4() is therefore what we want.
(For Neon the only possible input indexes are 0 and 1.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
I believe that gvec_udot_idx_h and gvec_sdot_idx_h are OK
because the index there is over groups of 4*16-bit values,
which are 64 bits each.
---
 target/arm/vec_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 30d76d05beb..0f33127c4c4 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -293,7 +293,7 @@ void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc)
     intptr_t index = simd_data(desc);
     uint32_t *d = vd;
     int8_t *n = vn;
-    int8_t *m_indexed = (int8_t *)vm + index * 4;
+    int8_t *m_indexed = (int8_t *)vm + H4(index) * 4;
 
     /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
      * Otherwise opr_sz is a multiple of 16.
@@ -324,7 +324,7 @@ void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc)
     intptr_t index = simd_data(desc);
     uint32_t *d = vd;
     uint8_t *n = vn;
-    uint8_t *m_indexed = (uint8_t *)vm + index * 4;
+    uint8_t *m_indexed = (uint8_t *)vm + H4(index) * 4;
 
     /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
      * Otherwise opr_sz is a multiple of 16.
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix some Neon insns on big-endian hosts
  2020-10-28 19:17 [PATCH 0/2] Fix some Neon insns on big-endian hosts Peter Maydell
  2020-10-28 19:17 ` [PATCH 1/2] target/arm: Fix float16 pairwise Neon ops " Peter Maydell
  2020-10-28 19:17 ` [PATCH 2/2] target/arm: Fix VUDOT/VSDOT (scalar) " Peter Maydell
@ 2020-10-28 19:34 ` Richard Henderson
  2 siblings, 0 replies; 7+ messages in thread
From: Richard Henderson @ 2020-10-28 19:34 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 10/28/20 12:17 PM, Peter Maydell wrote:
> Peter Maydell (2):
>   target/arm: Fix float16 pairwise Neon ops on big-endian hosts
>   target/arm: Fix VUDOT/VSDOT (scalar) on big-endian hosts

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] target/arm: Fix float16 pairwise Neon ops on big-endian hosts
  2020-10-28 19:17 ` [PATCH 1/2] target/arm: Fix float16 pairwise Neon ops " Peter Maydell
@ 2020-10-28 19:45   ` Philippe Mathieu-Daudé
  2020-11-02  9:21     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 7+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-10-28 19:45 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel; +Cc: Richard Henderson

On 10/28/20 8:17 PM, Peter Maydell wrote:
> In the neon_padd/pmax/pmin helpers for float16, a cut-and-paste error
> meant we were using the H4() address swizzler macro rather than the
> H2() which is required for 2-byte data.  This had no effect on
> little-endian hosts but meant we put the result data into the
> destination Dreg in the wrong order on big-endian hosts.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/vec_helper.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] target/arm: Fix VUDOT/VSDOT (scalar) on big-endian hosts
  2020-10-28 19:17 ` [PATCH 2/2] target/arm: Fix VUDOT/VSDOT (scalar) " Peter Maydell
@ 2020-10-28 19:51   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 7+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-10-28 19:51 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel; +Cc: Richard Henderson

On 10/28/20 8:17 PM, Peter Maydell wrote:
> The helper functions for performing the udot/sdot operations against
> a scalar were not using an address-swizzling macro when converting
> the index of the scalar element into a pointer into the vm array.
> This had no effect on little-endian hosts but meant we generated
> incorrect results on big-endian hosts.
> 
> For these insns, the index is indexing over group of 4 8-bit values,
> so 32 bits per indexed entity, and H4() is therefore what we want.
> (For Neon the only possible input indexes are 0 and 1.)
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> I believe that gvec_udot_idx_h and gvec_sdot_idx_h are OK
> because the index there is over groups of 4*16-bit values,
> which are 64 bits each.
> ---
>  target/arm/vec_helper.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] target/arm: Fix float16 pairwise Neon ops on big-endian hosts
  2020-10-28 19:45   ` Philippe Mathieu-Daudé
@ 2020-11-02  9:21     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 7+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-11-02  9:21 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel; +Cc: Richard Henderson

On 10/28/20 8:45 PM, Philippe Mathieu-Daudé wrote:
> On 10/28/20 8:17 PM, Peter Maydell wrote:
>> In the neon_padd/pmax/pmin helpers for float16, a cut-and-paste error
>> meant we were using the H4() address swizzler macro rather than the
>> H2() which is required for 2-byte data.  This had no effect on
>> little-endian hosts but meant we put the result data into the
>> destination Dreg in the wrong order on big-endian hosts.
>>
>> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>> ---
>>  target/arm/vec_helper.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-11-02  9:22 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-28 19:17 [PATCH 0/2] Fix some Neon insns on big-endian hosts Peter Maydell
2020-10-28 19:17 ` [PATCH 1/2] target/arm: Fix float16 pairwise Neon ops " Peter Maydell
2020-10-28 19:45   ` Philippe Mathieu-Daudé
2020-11-02  9:21     ` Philippe Mathieu-Daudé
2020-10-28 19:17 ` [PATCH 2/2] target/arm: Fix VUDOT/VSDOT (scalar) " Peter Maydell
2020-10-28 19:51   ` Philippe Mathieu-Daudé
2020-10-28 19:34 ` [PATCH 0/2] Fix some Neon insns " Richard Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.