All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix `lxvdsx` (issue #212)
@ 2021-05-17 21:40 Paul A. Clarke
  2021-05-18  1:34 ` David Gibson
  0 siblings, 1 reply; 6+ messages in thread
From: Paul A. Clarke @ 2021-05-17 21:40 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Giuseppe Musacchio, Richard Henderson, David Gibson

`lxvdsx` is byte-swapping the data it loads, which it should not
do.  Fix it.

Fixes #212.

Fixes: bcb0b7b1a1c05707304f80ca6f523d557816f85c
Signed-off-by:  Paul A. Clarke <pc@us.ibm.com
---
 target/ppc/translate/vsx-impl.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index b817d31260bb..46f97c029ca8 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -162,7 +162,7 @@ static void gen_lxvdsx(DisasContext *ctx)
     gen_addr_reg_index(ctx, EA);
 
     data = tcg_temp_new_i64();
-    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_TEQ);
+    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
     tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, data);
 
     tcg_temp_free(EA);
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix `lxvdsx` (issue #212)
  2021-05-17 21:40 [PATCH] Fix `lxvdsx` (issue #212) Paul A. Clarke
@ 2021-05-18  1:34 ` David Gibson
  2021-05-18  6:40   ` Giuseppe Musacchio
  2021-05-18  6:53   ` Mark Cave-Ayland
  0 siblings, 2 replies; 6+ messages in thread
From: David Gibson @ 2021-05-18  1:34 UTC (permalink / raw)
  To: Paul A. Clarke
  Cc: Giuseppe Musacchio, Richard Henderson, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1993 bytes --]


On Mon, May 17, 2021 at 04:40:32PM -0500, Paul A. Clarke wrote:
> `lxvdsx` is byte-swapping the data it loads, which it should not
> do.  Fix it.
> 
> Fixes #212.
> 
> Fixes: bcb0b7b1a1c05707304f80ca6f523d557816f85c
> Signed-off-by:  Paul A. Clarke <pc@us.ibm.com
                           nit, missing '>' ...^

I'm having a hard time convincing myself this is correct in all cases.
Have you tested it with all combinations of BE/LE host and BE/LE guest
code?

The description in the ISA is pretty inscrutable, since it's in terms
of the confusing numbering if different element types in BE vs LE
mode.

It looks to me like before bcb0b7b1a1c0 this originally resolved to
MO_Q modified by ctx->default_tcg_memop_mask, which appears to depend
on the current guest endian mode.  That's pretty hard to trace through
the various layers of macros, but for reference, before bcb0b7b1a1c0
this used gen_qemu_ld64_i64(), which appears to be constructed by the
line GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q)) in translate.c.

Richard or Giuseppe, care to weigh in?

> ---
>  target/ppc/translate/vsx-impl.c.inc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
> index b817d31260bb..46f97c029ca8 100644
> --- a/target/ppc/translate/vsx-impl.c.inc
> +++ b/target/ppc/translate/vsx-impl.c.inc
> @@ -162,7 +162,7 @@ static void gen_lxvdsx(DisasContext *ctx)
>      gen_addr_reg_index(ctx, EA);
>  
>      data = tcg_temp_new_i64();
> -    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_TEQ);
> +    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
>      tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, data);
>  
>      tcg_temp_free(EA);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix `lxvdsx` (issue #212)
  2021-05-18  1:34 ` David Gibson
@ 2021-05-18  6:40   ` Giuseppe Musacchio
  2021-05-18  7:30     ` Greg Kurz
  2021-05-18  6:53   ` Mark Cave-Ayland
  1 sibling, 1 reply; 6+ messages in thread
From: Giuseppe Musacchio @ 2021-05-18  6:40 UTC (permalink / raw)
  To: David Gibson, Paul A. Clarke; +Cc: Richard Henderson, qemu-ppc, qemu-devel

The ISA [1] specifies the load order to be the target one, hence
the use of MO_TEQ in my patch (in both lxvwsx and lxvdsx).

I believe the error is hidden in some of the .mak files: I could not
reproduce this problem with Qemu's user-mode emulation in either
BE nor LE mode, this lead me to discover that ppc64-softmmu.mak is
always defining TARGET_WORDS_BIGENDIAN=y. The user-mode targets are
correctly split into ppc64 and ppc64le, where only the former is
declared as BE.

The presence of that define is unconditionally making MO_TE an alias
for MO_BE, that's why Paul's patch seems to fix the problem.

I didn't catch this problem earlier as pretty much of our testing is
done using the Linux user-mode emulation.

Cheers,
G.M.

[1] https://ibm.ent.box.com/s/1hzcwkwf8rbju5h9iyf44wm94amnlcrv

On 18/05/21 03:34, David Gibson wrote:
> 
> On Mon, May 17, 2021 at 04:40:32PM -0500, Paul A. Clarke wrote:
>> `lxvdsx` is byte-swapping the data it loads, which it should not
>> do.  Fix it.
>>
>> Fixes #212.
>>
>> Fixes: bcb0b7b1a1c05707304f80ca6f523d557816f85c
>> Signed-off-by:  Paul A. Clarke <pc@us.ibm.com
>                            nit, missing '>' ...^
> 
> I'm having a hard time convincing myself this is correct in all cases.
> Have you tested it with all combinations of BE/LE host and BE/LE guest
> code?
> 
> The description in the ISA is pretty inscrutable, since it's in terms
> of the confusing numbering if different element types in BE vs LE
> mode.
> 
> It looks to me like before bcb0b7b1a1c0 this originally resolved to
> MO_Q modified by ctx->default_tcg_memop_mask, which appears to depend
> on the current guest endian mode.  That's pretty hard to trace through
> the various layers of macros, but for reference, before bcb0b7b1a1c0
> this used gen_qemu_ld64_i64(), which appears to be constructed by the
> line GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q)) in translate.c.
> 
> Richard or Giuseppe, care to weigh in?
> 
>> ---
>>  target/ppc/translate/vsx-impl.c.inc | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
>> index b817d31260bb..46f97c029ca8 100644
>> --- a/target/ppc/translate/vsx-impl.c.inc
>> +++ b/target/ppc/translate/vsx-impl.c.inc
>> @@ -162,7 +162,7 @@ static void gen_lxvdsx(DisasContext *ctx)
>>      gen_addr_reg_index(ctx, EA);
>>  
>>      data = tcg_temp_new_i64();
>> -    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_TEQ);
>> +    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
>>      tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, data);
>>  
>>      tcg_temp_free(EA);
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix `lxvdsx` (issue #212)
  2021-05-18  1:34 ` David Gibson
  2021-05-18  6:40   ` Giuseppe Musacchio
@ 2021-05-18  6:53   ` Mark Cave-Ayland
  1 sibling, 0 replies; 6+ messages in thread
From: Mark Cave-Ayland @ 2021-05-18  6:53 UTC (permalink / raw)
  To: David Gibson, Paul A. Clarke
  Cc: Giuseppe Musacchio, qemu-ppc, Richard Henderson, qemu-devel

On 18/05/2021 02:34, David Gibson wrote:

> I'm having a hard time convincing myself this is correct in all cases.
> Have you tested it with all combinations of BE/LE host and BE/LE guest
> code?
> 
> The description in the ISA is pretty inscrutable, since it's in terms
> of the confusing numbering if different element types in BE vs LE
> mode.
> 
> It looks to me like before bcb0b7b1a1c0 this originally resolved to
> MO_Q modified by ctx->default_tcg_memop_mask, which appears to depend
> on the current guest endian mode.  That's pretty hard to trace through
> the various layers of macros, but for reference, before bcb0b7b1a1c0
> this used gen_qemu_ld64_i64(), which appears to be constructed by the
> line GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q)) in translate.c.
> 
> Richard or Giuseppe, care to weigh in?
> 
>> ---
>>   target/ppc/translate/vsx-impl.c.inc | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
>> index b817d31260bb..46f97c029ca8 100644
>> --- a/target/ppc/translate/vsx-impl.c.inc
>> +++ b/target/ppc/translate/vsx-impl.c.inc
>> @@ -162,7 +162,7 @@ static void gen_lxvdsx(DisasContext *ctx)
>>       gen_addr_reg_index(ctx, EA);
>>   
>>       data = tcg_temp_new_i64();
>> -    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_TEQ);
>> +    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
>>       tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, data);
>>   
>>       tcg_temp_free(EA);

Right. I think what is happening here is currently the load uses MO_TE (i.e. target 
endian) which defaults to big endian for PPC and that's why you're seeing the byte 
swap. The reason this is required for the vector instructions is because the vector 
registers need to be stored in host byte-order to allow them to make use of the 
host's vector instructions.

A quick look around the same file at gen_lxvw4x() suggests that you need a solution 
like this to work correctly with both big and little endian:

     if (ctx->le_mode) {
         tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
     } else {
         tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_BEQ);
     }

The original commit message for bcb0b7b1a1c0 mentions that the implementation is 
based upon that of the lxvwsx instruction, so I'm fairly sure that gen_lxvwsx() is 
also broken for little endian and will need a similar fix too.


ATB,

Mark.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix `lxvdsx` (issue #212)
  2021-05-18  6:40   ` Giuseppe Musacchio
@ 2021-05-18  7:30     ` Greg Kurz
  2021-05-19  0:46       ` David Gibson
  0 siblings, 1 reply; 6+ messages in thread
From: Greg Kurz @ 2021-05-18  7:30 UTC (permalink / raw)
  To: Giuseppe Musacchio
  Cc: Richard Henderson, Mark Cave-Ayland, qemu-devel, qemu-ppc,
	Paul A. Clarke, David Gibson

On Tue, 18 May 2021 08:40:36 +0200
Giuseppe Musacchio <thatlemon@gmail.com> wrote:

> The ISA [1] specifies the load order to be the target one, hence
> the use of MO_TEQ in my patch (in both lxvwsx and lxvdsx).
> 
> I believe the error is hidden in some of the .mak files: I could not
> reproduce this problem with Qemu's user-mode emulation in either
> BE nor LE mode, this lead me to discover that ppc64-softmmu.mak is
> always defining TARGET_WORDS_BIGENDIAN=y. The user-mode targets are
> correctly split into ppc64 and ppc64le, where only the former is
> declared as BE.
> 

Yes. In system-mode emulation, modern POWER CPUs are expected to
be able to switch from BE to LE and vice-versa at runtime. Older
PowerPC CPUs are BE. The qemu-system-ppc64 binary is thus built
with TARGET_WORDS_BIGENDIAN=y and every place where runtime
endianness matters need to do a check and only byteswap if needed.

Mark's suggestion in another mail of this thread is the way to go.

> The presence of that define is unconditionally making MO_TE an alias
> for MO_BE, that's why Paul's patch seems to fix the problem.
> 
> I didn't catch this problem earlier as pretty much of our testing is
> done using the Linux user-mode emulation.
> 
> Cheers,
> G.M.
> 
> [1] https://ibm.ent.box.com/s/1hzcwkwf8rbju5h9iyf44wm94amnlcrv
> 
> On 18/05/21 03:34, David Gibson wrote:
> > 
> > On Mon, May 17, 2021 at 04:40:32PM -0500, Paul A. Clarke wrote:
> >> `lxvdsx` is byte-swapping the data it loads, which it should not
> >> do.  Fix it.
> >>
> >> Fixes #212.
> >>
> >> Fixes: bcb0b7b1a1c05707304f80ca6f523d557816f85c
> >> Signed-off-by:  Paul A. Clarke <pc@us.ibm.com
> >                            nit, missing '>' ...^
> > 
> > I'm having a hard time convincing myself this is correct in all cases.
> > Have you tested it with all combinations of BE/LE host and BE/LE guest
> > code?
> > 
> > The description in the ISA is pretty inscrutable, since it's in terms
> > of the confusing numbering if different element types in BE vs LE
> > mode.
> > 
> > It looks to me like before bcb0b7b1a1c0 this originally resolved to
> > MO_Q modified by ctx->default_tcg_memop_mask, which appears to depend
> > on the current guest endian mode.  That's pretty hard to trace through
> > the various layers of macros, but for reference, before bcb0b7b1a1c0
> > this used gen_qemu_ld64_i64(), which appears to be constructed by the
> > line GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q)) in translate.c.
> > 
> > Richard or Giuseppe, care to weigh in?
> > 
> >> ---
> >>  target/ppc/translate/vsx-impl.c.inc | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
> >> index b817d31260bb..46f97c029ca8 100644
> >> --- a/target/ppc/translate/vsx-impl.c.inc
> >> +++ b/target/ppc/translate/vsx-impl.c.inc
> >> @@ -162,7 +162,7 @@ static void gen_lxvdsx(DisasContext *ctx)
> >>      gen_addr_reg_index(ctx, EA);
> >>  
> >>      data = tcg_temp_new_i64();
> >> -    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_TEQ);
> >> +    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
> >>      tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, data);
> >>  
> >>      tcg_temp_free(EA);
> > 
> 



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix `lxvdsx` (issue #212)
  2021-05-18  7:30     ` Greg Kurz
@ 2021-05-19  0:46       ` David Gibson
  0 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2021-05-19  0:46 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Giuseppe Musacchio, Richard Henderson, Mark Cave-Ayland,
	qemu-devel, qemu-ppc, Paul A. Clarke

[-- Attachment #1: Type: text/plain, Size: 3930 bytes --]

On Tue, May 18, 2021 at 09:30:38AM +0200, Greg Kurz wrote:
> On Tue, 18 May 2021 08:40:36 +0200
> Giuseppe Musacchio <thatlemon@gmail.com> wrote:
> 
> > The ISA [1] specifies the load order to be the target one, hence
> > the use of MO_TEQ in my patch (in both lxvwsx and lxvdsx).
> > 
> > I believe the error is hidden in some of the .mak files: I could not
> > reproduce this problem with Qemu's user-mode emulation in either
> > BE nor LE mode, this lead me to discover that ppc64-softmmu.mak is
> > always defining TARGET_WORDS_BIGENDIAN=y. The user-mode targets are
> > correctly split into ppc64 and ppc64le, where only the former is
> > declared as BE.
> > 
> 
> Yes. In system-mode emulation, modern POWER CPUs are expected to
> be able to switch from BE to LE and vice-versa at runtime. Older
> PowerPC CPUs are BE. The qemu-system-ppc64 binary is thus built
> with TARGET_WORDS_BIGENDIAN=y and every place where runtime
> endianness matters need to do a check and only byteswap if needed.

Right.  With modern POWER (and ARM and several others), "target
endian" isn't really a well-defined concept, since the processor can
switch at runtime.

> 
> Mark's suggestion in another mail of this thread is the way to go.
> 
> > The presence of that define is unconditionally making MO_TE an alias
> > for MO_BE, that's why Paul's patch seems to fix the problem.
> > 
> > I didn't catch this problem earlier as pretty much of our testing is
> > done using the Linux user-mode emulation.
> > 
> > Cheers,
> > G.M.
> > 
> > [1] https://ibm.ent.box.com/s/1hzcwkwf8rbju5h9iyf44wm94amnlcrv
> > 
> > On 18/05/21 03:34, David Gibson wrote:
> > > 
> > > On Mon, May 17, 2021 at 04:40:32PM -0500, Paul A. Clarke wrote:
> > >> `lxvdsx` is byte-swapping the data it loads, which it should not
> > >> do.  Fix it.
> > >>
> > >> Fixes #212.
> > >>
> > >> Fixes: bcb0b7b1a1c05707304f80ca6f523d557816f85c
> > >> Signed-off-by:  Paul A. Clarke <pc@us.ibm.com
> > >                            nit, missing '>' ...^
> > > 
> > > I'm having a hard time convincing myself this is correct in all cases.
> > > Have you tested it with all combinations of BE/LE host and BE/LE guest
> > > code?
> > > 
> > > The description in the ISA is pretty inscrutable, since it's in terms
> > > of the confusing numbering if different element types in BE vs LE
> > > mode.
> > > 
> > > It looks to me like before bcb0b7b1a1c0 this originally resolved to
> > > MO_Q modified by ctx->default_tcg_memop_mask, which appears to depend
> > > on the current guest endian mode.  That's pretty hard to trace through
> > > the various layers of macros, but for reference, before bcb0b7b1a1c0
> > > this used gen_qemu_ld64_i64(), which appears to be constructed by the
> > > line GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q)) in translate.c.
> > > 
> > > Richard or Giuseppe, care to weigh in?
> > > 
> > >> ---
> > >>  target/ppc/translate/vsx-impl.c.inc | 2 +-
> > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
> > >> index b817d31260bb..46f97c029ca8 100644
> > >> --- a/target/ppc/translate/vsx-impl.c.inc
> > >> +++ b/target/ppc/translate/vsx-impl.c.inc
> > >> @@ -162,7 +162,7 @@ static void gen_lxvdsx(DisasContext *ctx)
> > >>      gen_addr_reg_index(ctx, EA);
> > >>  
> > >>      data = tcg_temp_new_i64();
> > >> -    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_TEQ);
> > >> +    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ);
> > >>      tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, data);
> > >>  
> > >>      tcg_temp_free(EA);
> > > 
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-05-19  3:31 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-17 21:40 [PATCH] Fix `lxvdsx` (issue #212) Paul A. Clarke
2021-05-18  1:34 ` David Gibson
2021-05-18  6:40   ` Giuseppe Musacchio
2021-05-18  7:30     ` Greg Kurz
2021-05-19  0:46       ` David Gibson
2021-05-18  6:53   ` Mark Cave-Ayland

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.