All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order
@ 2019-03-03 17:23 Mark Cave-Ayland
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function Mark Cave-Ayland
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

After some investigation into Andrew's report of corruption in his ppc64le
tests at https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg07234.html, I
discovered the underlying cause was that the first 32 VSX registers are not
stored in host endian order.

This is something that Richard and I had discussed before, but missed that with
VSX if you have source registers from different register sets then even logical
operations will give you the wrong result.

Rather than revert 7b8fe477e1 "target/ppc: convert VSX logical operations to
vector operations" let's keep the use of the accelerated vector instructions,
and instead fix the real problem which is to switch the first 32 VSX registers
to host endian order matching the VMX registers.

Patches 1-5 aim to consolidate the offset calculations for both CPUPPCState
and the associated _ptr() functions into one single place.

With this preliminary work complete, patch 6 switches the first 32 registers
into host endian order without too much difficulty.

Finally now that all VSX registers are stored in the same way, the vsr offset
functions and get_cpu_vsrh()/get_cpu_vsrl() can be simplified accordingly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>


Mark Cave-Ayland (8):
  target/ppc: introduce single fpr_offset() function
  target/ppc: introduce single vsrl_offset() function
  target/ppc: move Vsr* macros from internal.h to cpu.h
  target/ppc: introduce avrh_offset() and avrl_offset() functions
  target/ppc: introduce avr_offset() function
  target/ppc: switch fpr/vsrl registers so all VSX registers are in host
    endian order
  target/ppc: introduce vsrh_offset() function
  target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions

 target/ppc/cpu.h                    | 56 +++++++++++++++++++++++++++++++++++--
 target/ppc/internal.h               | 27 +++---------------
 target/ppc/machine.c                |  8 +++---
 target/ppc/translate.c              | 28 ++++++++-----------
 target/ppc/translate/vmx-impl.inc.c | 27 ++++++++----------
 target/ppc/translate/vsx-impl.inc.c | 39 +++-----------------------
 6 files changed, 88 insertions(+), 97 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
@ 2019-03-03 17:23 ` Mark Cave-Ayland
  2019-03-03 23:19   ` Richard Henderson
  2019-03-04  5:37   ` David Gibson
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 2/8] target/ppc: introduce single vsrl_offset() function Mark Cave-Ayland
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

Instead of having multiple copies of the offset calculation logic, move it to a
single fpr_offset() function.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/cpu.h       | 7 ++++++-
 target/ppc/translate.c | 4 ++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 26604ddf98..4bb4e42670 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2563,9 +2563,14 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
 }
 
 /* Accessors for FP, VMX and VSX registers */
+static inline int fpr_offset(int i)
+{
+    return offsetof(CPUPPCState, vsr[i].u64[0]);
+}
+
 static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
 {
-    return &env->vsr[i].u64[0];
+    return (uint64_t *)((uintptr_t)env + fpr_offset(i));
 }
 
 static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 819221f246..3b1992faf1 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6677,12 +6677,12 @@ GEN_TM_PRIV_NOOP(trechkpt);
 
 static inline void get_fpr(TCGv_i64 dst, int regno)
 {
-    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState, vsr[regno].u64[0]));
+    tcg_gen_ld_i64(dst, cpu_env, fpr_offset(regno));
 }
 
 static inline void set_fpr(int regno, TCGv_i64 src)
 {
-    tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState, vsr[regno].u64[0]));
+    tcg_gen_st_i64(src, cpu_env, fpr_offset(regno));
 }
 
 static inline void get_avr64(TCGv_i64 dst, int regno, bool high)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 2/8] target/ppc: introduce single vsrl_offset() function
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function Mark Cave-Ayland
@ 2019-03-03 17:23 ` Mark Cave-Ayland
  2019-03-03 23:20   ` Richard Henderson
  2019-03-04  5:39   ` David Gibson
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 3/8] target/ppc: move Vsr* macros from internal.h to cpu.h Mark Cave-Ayland
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

Instead of having multiple copies of the offset calculation logic, move it to a
single vsrl_offset() function.

This commit also renames the existing get_vsr()/set_vsr() functions to
get_vsrl()/set_vsrl() which better describes their purpose.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/cpu.h                    |  7 ++++++-
 target/ppc/translate/vsx-impl.inc.c | 12 ++++++------
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 4bb4e42670..4a7df13c2d 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2573,9 +2573,14 @@ static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
     return (uint64_t *)((uintptr_t)env + fpr_offset(i));
 }
 
+static inline int vsrl_offset(int i)
+{
+    return offsetof(CPUPPCState, vsr[i].u64[1]);
+}
+
 static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
 {
-    return &env->vsr[i].u64[1];
+    return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
 }
 
 static inline ppc_avr_t *cpu_avr_ptr(CPUPPCState *env, int i)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index e73197e717..381ae0f2e9 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1,13 +1,13 @@
 /***                           VSX extension                               ***/
 
-static inline void get_vsr(TCGv_i64 dst, int n)
+static inline void get_vsrl(TCGv_i64 dst, int n)
 {
-    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState, vsr[n].u64[1]));
+    tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
 }
 
-static inline void set_vsr(int n, TCGv_i64 src)
+static inline void set_vsrl(int n, TCGv_i64 src)
 {
-    tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState, vsr[n].u64[1]));
+    tcg_gen_st_i64(src, cpu_env, vsrl_offset(n));
 }
 
 static inline int vsr_full_offset(int n)
@@ -27,7 +27,7 @@ static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
 static inline void get_cpu_vsrl(TCGv_i64 dst, int n)
 {
     if (n < 32) {
-        get_vsr(dst, n);
+        get_vsrl(dst, n);
     } else {
         get_avr64(dst, n - 32, false);
     }
@@ -45,7 +45,7 @@ static inline void set_cpu_vsrh(int n, TCGv_i64 src)
 static inline void set_cpu_vsrl(int n, TCGv_i64 src)
 {
     if (n < 32) {
-        set_vsr(n, src);
+        set_vsrl(n, src);
     } else {
         set_avr64(n - 32, src, false);
     }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 3/8] target/ppc: move Vsr* macros from internal.h to cpu.h
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function Mark Cave-Ayland
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 2/8] target/ppc: introduce single vsrl_offset() function Mark Cave-Ayland
@ 2019-03-03 17:23 ` Mark Cave-Ayland
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 4/8] target/ppc: introduce avrh_offset() and avrl_offset() functions Mark Cave-Ayland
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

It isn't possible to include internal.h from cpu.h so move the Vsr* macros
into cpu.h alongside the other VMX/VSX register access functions.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/cpu.h      | 20 ++++++++++++++++++++
 target/ppc/internal.h | 19 -------------------
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 4a7df13c2d..d0580c6b6d 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2563,6 +2563,26 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
 }
 
 /* Accessors for FP, VMX and VSX registers */
+#if defined(HOST_WORDS_BIGENDIAN)
+#define VsrB(i) u8[i]
+#define VsrSB(i) s8[i]
+#define VsrH(i) u16[i]
+#define VsrSH(i) s16[i]
+#define VsrW(i) u32[i]
+#define VsrSW(i) s32[i]
+#define VsrD(i) u64[i]
+#define VsrSD(i) s64[i]
+#else
+#define VsrB(i) u8[15 - (i)]
+#define VsrSB(i) s8[15 - (i)]
+#define VsrH(i) u16[7 - (i)]
+#define VsrSH(i) s16[7 - (i)]
+#define VsrW(i) u32[3 - (i)]
+#define VsrSW(i) s32[3 - (i)]
+#define VsrD(i) u64[1 - (i)]
+#define VsrSD(i) s64[1 - (i)]
+#endif
+
 static inline int fpr_offset(int i)
 {
     return offsetof(CPUPPCState, vsr[i].u64[0]);
diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index f26a71ffcf..3ebbdf4da4 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -204,25 +204,6 @@ EXTRACT_HELPER(IMM8, 11, 8);
 EXTRACT_HELPER(DCMX, 16, 7);
 EXTRACT_HELPER_SPLIT_3(DCMX_XV, 5, 16, 0, 1, 2, 5, 1, 6, 6);
 
-#if defined(HOST_WORDS_BIGENDIAN)
-#define VsrB(i) u8[i]
-#define VsrSB(i) s8[i]
-#define VsrH(i) u16[i]
-#define VsrSH(i) s16[i]
-#define VsrW(i) u32[i]
-#define VsrSW(i) s32[i]
-#define VsrD(i) u64[i]
-#define VsrSD(i) s64[i]
-#else
-#define VsrB(i) u8[15 - (i)]
-#define VsrSB(i) s8[15 - (i)]
-#define VsrH(i) u16[7 - (i)]
-#define VsrSH(i) s16[7 - (i)]
-#define VsrW(i) u32[3 - (i)]
-#define VsrSW(i) s32[3 - (i)]
-#define VsrD(i) u64[1 - (i)]
-#define VsrSD(i) s64[1 - (i)]
-#endif
 static inline void getVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
 {
     vsr->VsrD(0) = env->vsr[n].u64[0];
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 4/8] target/ppc: introduce avrh_offset() and avrl_offset() functions
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
                   ` (2 preceding siblings ...)
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 3/8] target/ppc: move Vsr* macros from internal.h to cpu.h Mark Cave-Ayland
@ 2019-03-03 17:23 ` Mark Cave-Ayland
  2019-03-03 23:31   ` Richard Henderson
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 5/8] target/ppc: introduce avr_offset() function Mark Cave-Ayland
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

These will become more useful later, but initially use this as an aid to
simplify the offset calculation by replacing the HOST_TARGET_BIGENDIAN
sections with the VsrD macro.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/cpu.h       | 10 ++++++++++
 target/ppc/translate.c | 24 ++++++++++--------------
 2 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index d0580c6b6d..326593e0e7 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2603,6 +2603,16 @@ static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
     return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
 }
 
+static inline int avrh_offset(int i)
+{
+    return offsetof(CPUPPCState, vsr[32 + i].VsrD(0));
+}
+
+static inline int avrl_offset(int i)
+{
+    return offsetof(CPUPPCState, vsr[32 + i].VsrD(1));
+}
+
 static inline ppc_avr_t *cpu_avr_ptr(CPUPPCState *env, int i)
 {
     return &env->vsr[32 + i];
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 3b1992faf1..f646f359e7 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6687,24 +6687,20 @@ static inline void set_fpr(int regno, TCGv_i64 src)
 
 static inline void get_avr64(TCGv_i64 dst, int regno, bool high)
 {
-#ifdef HOST_WORDS_BIGENDIAN
-    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState,
-                                          vsr[32 + regno].u64[(high ? 0 : 1)]));
-#else
-    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState,
-                                          vsr[32 + regno].u64[(high ? 1 : 0)]));
-#endif
+    if (high) {
+        tcg_gen_ld_i64(dst, cpu_env, avrh_offset(regno));
+    } else {
+        tcg_gen_ld_i64(dst, cpu_env, avrl_offset(regno));
+    }
 }
 
 static inline void set_avr64(int regno, TCGv_i64 src, bool high)
 {
-#ifdef HOST_WORDS_BIGENDIAN
-    tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState,
-                                          vsr[32 + regno].u64[(high ? 0 : 1)]));
-#else
-    tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState,
-                                          vsr[32 + regno].u64[(high ? 1 : 0)]));
-#endif
+    if (high) {
+        tcg_gen_st_i64(src, cpu_env, avrh_offset(regno));
+    } else {
+        tcg_gen_st_i64(src, cpu_env, avrl_offset(regno));
+    }
 }
 
 #include "translate/fp-impl.inc.c"
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 5/8] target/ppc: introduce avr_offset() function
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
                   ` (3 preceding siblings ...)
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 4/8] target/ppc: introduce avrh_offset() and avrl_offset() functions Mark Cave-Ayland
@ 2019-03-03 17:23 ` Mark Cave-Ayland
  2019-03-03 23:29   ` Richard Henderson
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 6/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

All TCG vector operations require pointers to the base address of the vector
rather than separate access to the top and bottom 64-bits. Convert
the VMX TCG instructions to use a new avr_offset() function instead of
avr64_offset(), which can itself be written as a simple wrapper onto
vsr_full_offset().

After the conversion is complete then avr64_offset() can be removed since its
functionality is now completely within get_avr64()/set_avr64().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/cpu.h                    | 12 +++++++++++-
 target/ppc/translate/vmx-impl.inc.c | 27 +++++++++++----------------
 target/ppc/translate/vsx-impl.inc.c |  5 -----
 3 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 326593e0e7..89651988ab 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2598,6 +2598,11 @@ static inline int vsrl_offset(int i)
     return offsetof(CPUPPCState, vsr[i].u64[1]);
 }
 
+static inline int vsr_full_offset(int i)
+{
+    return offsetof(CPUPPCState, vsr[i].u64[0]);
+}
+
 static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
 {
     return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
@@ -2613,9 +2618,14 @@ static inline int avrl_offset(int i)
     return offsetof(CPUPPCState, vsr[32 + i].VsrD(1));
 }
 
+static inline int avr_offset(int i)
+{
+    return vsr_full_offset(i + 32);
+}
+
 static inline ppc_avr_t *cpu_avr_ptr(CPUPPCState *env, int i)
 {
-    return &env->vsr[32 + i];
+    return (ppc_avr_t *)((uintptr_t)env + avr_offset(i));
 }
 
 void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env);
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index f1b15ae2cb..5f0c96a5e9 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -10,15 +10,10 @@
 static inline TCGv_ptr gen_avr_ptr(int reg)
 {
     TCGv_ptr r = tcg_temp_new_ptr();
-    tcg_gen_addi_ptr(r, cpu_env, offsetof(CPUPPCState, vsr[32 + reg].u64[0]));
+    tcg_gen_addi_ptr(r, cpu_env, avr_offset(reg));
     return r;
 }
 
-static inline long avr64_offset(int reg, bool high)
-{
-    return offsetof(CPUPPCState, vsr[32 + reg].u64[(high ? 0 : 1)]);
-}
-
 #define GEN_VR_LDX(name, opc2, opc3)                                          \
 static void glue(gen_, name)(DisasContext *ctx)                                       \
 {                                                                             \
@@ -205,7 +200,7 @@ static void gen_mtvscr(DisasContext *ctx)
     }
 
     val = tcg_temp_new_i32();
-    bofs = avr64_offset(rB(ctx->opcode), true);
+    bofs = avr_offset(rB(ctx->opcode));
 #ifdef HOST_WORDS_BIGENDIAN
     bofs += 3 * 4;
 #endif
@@ -284,9 +279,9 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
     }                                                                   \
                                                                         \
     tcg_op(vece,                                                        \
-           avr64_offset(rD(ctx->opcode), true),                         \
-           avr64_offset(rA(ctx->opcode), true),                         \
-           avr64_offset(rB(ctx->opcode), true),                         \
+           avr_offset(rD(ctx->opcode)),                                 \
+           avr_offset(rA(ctx->opcode)),                                 \
+           avr_offset(rB(ctx->opcode)),                                 \
            16, 16);                                                     \
 }
 
@@ -578,10 +573,10 @@ static void glue(gen_, NAME)(DisasContext *ctx)                         \
         gen_exception(ctx, POWERPC_EXCP_VPU);                           \
         return;                                                         \
     }                                                                   \
-    tcg_gen_gvec_4(avr64_offset(rD(ctx->opcode), true),                 \
+    tcg_gen_gvec_4(avr_offset(rD(ctx->opcode)),                         \
                    offsetof(CPUPPCState, vscr_sat),                     \
-                   avr64_offset(rA(ctx->opcode), true),                 \
-                   avr64_offset(rB(ctx->opcode), true),                 \
+                   avr_offset(rA(ctx->opcode)),                         \
+                   avr_offset(rB(ctx->opcode)),                         \
                    16, 16, &g);                                         \
 }
 
@@ -755,7 +750,7 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
             return;                                                     \
         }                                                               \
         simm = SIMM5(ctx->opcode);                                      \
-        tcg_op(avr64_offset(rD(ctx->opcode), true), 16, 16, simm);      \
+        tcg_op(avr_offset(rD(ctx->opcode)), 16, 16, simm);              \
     }
 
 GEN_VXFORM_DUPI(vspltisb, tcg_gen_gvec_dup8i, 6, 12);
@@ -850,8 +845,8 @@ static void gen_vsplt(DisasContext *ctx, int vece)
     }
 
     uimm = UIMM5(ctx->opcode);
-    bofs = avr64_offset(rB(ctx->opcode), true);
-    dofs = avr64_offset(rD(ctx->opcode), true);
+    bofs = avr_offset(rB(ctx->opcode));
+    dofs = avr_offset(rD(ctx->opcode));
 
     /* Experimental testing shows that hardware masks the immediate.  */
     bofs += (uimm << vece) & 15;
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 381ae0f2e9..7d02a235e7 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -10,11 +10,6 @@ static inline void set_vsrl(int n, TCGv_i64 src)
     tcg_gen_st_i64(src, cpu_env, vsrl_offset(n));
 }
 
-static inline int vsr_full_offset(int n)
-{
-    return offsetof(CPUPPCState, vsr[n].u64[0]);
-}
-
 static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
 {
     if (n < 32) {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 6/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
                   ` (4 preceding siblings ...)
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 5/8] target/ppc: introduce avr_offset() function Mark Cave-Ayland
@ 2019-03-03 17:23 ` Mark Cave-Ayland
  2019-03-03 23:32   ` Richard Henderson
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 7/8] target/ppc: introduce vsrh_offset() function Mark Cave-Ayland
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

When VSX support was initially added, the fpr registers were added at
offset 0 of the VSR register and the vsrl registers were added at offset
1. This is in contrast to the VMX registers (the last 32 VSX registers) which
are stored in host-endian order.

Switch the fpr/vsrl registers so that the lower 32 VSX registers are now also
stored in host endian order to match the VMX registers. This ensures that TCG
vector operations involving mixed VMX and VSX registers will function
correctly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/cpu.h      | 4 ++--
 target/ppc/internal.h | 8 ++++----
 target/ppc/machine.c  | 8 ++++----
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 89651988ab..faae25a566 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2585,7 +2585,7 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
 
 static inline int fpr_offset(int i)
 {
-    return offsetof(CPUPPCState, vsr[i].u64[0]);
+    return offsetof(CPUPPCState, vsr[i].VsrD(0));
 }
 
 static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
@@ -2595,7 +2595,7 @@ static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
 
 static inline int vsrl_offset(int i)
 {
-    return offsetof(CPUPPCState, vsr[i].u64[1]);
+    return offsetof(CPUPPCState, vsr[i].VsrD(1));
 }
 
 static inline int vsr_full_offset(int i)
diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index 3ebbdf4da4..fb6f64ed1e 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -206,14 +206,14 @@ EXTRACT_HELPER_SPLIT_3(DCMX_XV, 5, 16, 0, 1, 2, 5, 1, 6, 6);
 
 static inline void getVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
 {
-    vsr->VsrD(0) = env->vsr[n].u64[0];
-    vsr->VsrD(1) = env->vsr[n].u64[1];
+    vsr->VsrD(0) = env->vsr[n].VsrD(0);
+    vsr->VsrD(1) = env->vsr[n].VsrD(1);
 }
 
 static inline void putVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
 {
-    env->vsr[n].u64[0] = vsr->VsrD(0);
-    env->vsr[n].u64[1] = vsr->VsrD(1);
+    env->vsr[n].VsrD(0) = vsr->VsrD(0);
+    env->vsr[n].VsrD(1) = vsr->VsrD(1);
 }
 
 void helper_compute_fprf_float16(CPUPPCState *env, float16 arg);
diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index 756b6d2971..a92d0ad3a3 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -150,7 +150,7 @@ static int get_fpr(QEMUFile *f, void *pv, size_t size,
 {
     ppc_vsr_t *v = pv;
 
-    v->u64[0] = qemu_get_be64(f);
+    v->VsrD(0) = qemu_get_be64(f);
 
     return 0;
 }
@@ -160,7 +160,7 @@ static int put_fpr(QEMUFile *f, void *pv, size_t size,
 {
     ppc_vsr_t *v = pv;
 
-    qemu_put_be64(f, v->u64[0]);
+    qemu_put_be64(f, v->VsrD(0));
     return 0;
 }
 
@@ -181,7 +181,7 @@ static int get_vsr(QEMUFile *f, void *pv, size_t size,
 {
     ppc_vsr_t *v = pv;
 
-    v->u64[1] = qemu_get_be64(f);
+    v->VsrD(1) = qemu_get_be64(f);
 
     return 0;
 }
@@ -191,7 +191,7 @@ static int put_vsr(QEMUFile *f, void *pv, size_t size,
 {
     ppc_vsr_t *v = pv;
 
-    qemu_put_be64(f, v->u64[1]);
+    qemu_put_be64(f, v->VsrD(1));
     return 0;
 }
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 7/8] target/ppc: introduce vsrh_offset() function
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
                   ` (5 preceding siblings ...)
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 6/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
@ 2019-03-03 17:23 ` Mark Cave-Ayland
  2019-03-03 23:33   ` Richard Henderson
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions Mark Cave-Ayland
  2019-03-04  5:43 ` [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order David Gibson
  8 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

Now that both VSX and VMX registers are in host-endian order we can introduce
a vsrh_offset() function as a replacement for fpr_offset().

In addition the avrh_offset() and avrl_offset() functions can be simplified in
terms of vsrh_offset() and vsrl_offset().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/cpu.h       | 16 ++++++++--------
 target/ppc/translate.c |  4 ++--
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index faae25a566..9f8eb0bdc0 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2583,16 +2583,11 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
 #define VsrSD(i) s64[1 - (i)]
 #endif
 
-static inline int fpr_offset(int i)
+static inline int vsrh_offset(int i)
 {
     return offsetof(CPUPPCState, vsr[i].VsrD(0));
 }
 
-static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
-{
-    return (uint64_t *)((uintptr_t)env + fpr_offset(i));
-}
-
 static inline int vsrl_offset(int i)
 {
     return offsetof(CPUPPCState, vsr[i].VsrD(1));
@@ -2603,6 +2598,11 @@ static inline int vsr_full_offset(int i)
     return offsetof(CPUPPCState, vsr[i].u64[0]);
 }
 
+static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
+{
+    return (uint64_t *)((uintptr_t)env + vsrh_offset(i));
+}
+
 static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
 {
     return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
@@ -2610,12 +2610,12 @@ static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
 
 static inline int avrh_offset(int i)
 {
-    return offsetof(CPUPPCState, vsr[32 + i].VsrD(0));
+    return vsrh_offset(i + 32);
 }
 
 static inline int avrl_offset(int i)
 {
-    return offsetof(CPUPPCState, vsr[32 + i].VsrD(1));
+    return vsrl_offset(i + 32);
 }
 
 static inline int avr_offset(int i)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index f646f359e7..1c7377d588 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6677,12 +6677,12 @@ GEN_TM_PRIV_NOOP(trechkpt);
 
 static inline void get_fpr(TCGv_i64 dst, int regno)
 {
-    tcg_gen_ld_i64(dst, cpu_env, fpr_offset(regno));
+    tcg_gen_ld_i64(dst, cpu_env, vsrh_offset(regno));
 }
 
 static inline void set_fpr(int regno, TCGv_i64 src)
 {
-    tcg_gen_st_i64(src, cpu_env, fpr_offset(regno));
+    tcg_gen_st_i64(src, cpu_env, vsrh_offset(regno));
 }
 
 static inline void get_avr64(TCGv_i64 dst, int regno, bool high)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
                   ` (6 preceding siblings ...)
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 7/8] target/ppc: introduce vsrh_offset() function Mark Cave-Ayland
@ 2019-03-03 17:23 ` Mark Cave-Ayland
  2019-03-03 23:35   ` Richard Henderson
  2019-03-04  5:43 ` [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order David Gibson
  8 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-03 17:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, richard.henderson

Now that the VSX registers are all in host endian order, there is no need to
go via different accessors depending upon the register number. Instead the
high and low parts can be accessed directly via vsrh_offset() and vsrl_offset()
accordingly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/translate/vsx-impl.inc.c | 34 ++++------------------------------
 1 file changed, 4 insertions(+), 30 deletions(-)

diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7d02a235e7..43e97756a2 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1,49 +1,23 @@
 /***                           VSX extension                               ***/
 
-static inline void get_vsrl(TCGv_i64 dst, int n)
-{
-    tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
-}
-
-static inline void set_vsrl(int n, TCGv_i64 src)
-{
-    tcg_gen_st_i64(src, cpu_env, vsrl_offset(n));
-}
-
 static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
 {
-    if (n < 32) {
-        get_fpr(dst, n);
-    } else {
-        get_avr64(dst, n - 32, true);
-    }
+    tcg_gen_ld_i64(dst, cpu_env, vsrh_offset(n));
 }
 
 static inline void get_cpu_vsrl(TCGv_i64 dst, int n)
 {
-    if (n < 32) {
-        get_vsrl(dst, n);
-    } else {
-        get_avr64(dst, n - 32, false);
-    }
+    tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
 }
 
 static inline void set_cpu_vsrh(int n, TCGv_i64 src)
 {
-    if (n < 32) {
-        set_fpr(n, src);
-    } else {
-        set_avr64(n - 32, src, true);
-    }
+    tcg_gen_st_i64(src, cpu_env, vsrh_offset(n));
 }
 
 static inline void set_cpu_vsrl(int n, TCGv_i64 src)
 {
-    if (n < 32) {
-        set_vsrl(n, src);
-    } else {
-        set_avr64(n - 32, src, false);
-    }
+    tcg_gen_st_i64(src, cpu_env, vsrl_offset(n));
 }
 
 #define VSX_LOAD_SCALAR(name, operation)                      \
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function Mark Cave-Ayland
@ 2019-03-03 23:19   ` Richard Henderson
  2019-03-04  5:37   ` David Gibson
  1 sibling, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2019-03-03 23:19 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
> Instead of having multiple copies of the offset calculation logic, move it to a
> single fpr_offset() function.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/cpu.h       | 7 ++++++-
>  target/ppc/translate.c | 4 ++--
>  2 files changed, 8 insertions(+), 3 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/8] target/ppc: introduce single vsrl_offset() function
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 2/8] target/ppc: introduce single vsrl_offset() function Mark Cave-Ayland
@ 2019-03-03 23:20   ` Richard Henderson
  2019-03-04  5:39   ` David Gibson
  1 sibling, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2019-03-03 23:20 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
> Instead of having multiple copies of the offset calculation logic, move it to a
> single vsrl_offset() function.
> 
> This commit also renames the existing get_vsr()/set_vsr() functions to
> get_vsrl()/set_vsrl() which better describes their purpose.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/cpu.h                    |  7 ++++++-
>  target/ppc/translate/vsx-impl.inc.c | 12 ++++++------
>  2 files changed, 12 insertions(+), 7 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 5/8] target/ppc: introduce avr_offset() function
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 5/8] target/ppc: introduce avr_offset() function Mark Cave-Ayland
@ 2019-03-03 23:29   ` Richard Henderson
  2019-03-05 17:16     ` Mark Cave-Ayland
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2019-03-03 23:29 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
> All TCG vector operations require pointers to the base address of the vector
> rather than separate access to the top and bottom 64-bits. Convert
> the VMX TCG instructions to use a new avr_offset() function instead of
> avr64_offset(), which can itself be written as a simple wrapper onto
> vsr_full_offset().
> 
> After the conversion is complete then avr64_offset() can be removed since its
> functionality is now completely within get_avr64()/set_avr64().
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/cpu.h                    | 12 +++++++++++-
>  target/ppc/translate/vmx-impl.inc.c | 27 +++++++++++----------------
>  target/ppc/translate/vsx-impl.inc.c |  5 -----
>  3 files changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 326593e0e7..89651988ab 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -2598,6 +2598,11 @@ static inline int vsrl_offset(int i)
>      return offsetof(CPUPPCState, vsr[i].u64[1]);
>  }
>  
> +static inline int vsr_full_offset(int i)
> +{
> +    return offsetof(CPUPPCState, vsr[i].u64[0]);
> +}
> +
>  static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
>  {
>      return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
> @@ -2613,9 +2618,14 @@ static inline int avrl_offset(int i)
>      return offsetof(CPUPPCState, vsr[32 + i].VsrD(1));
>  }
>  
> +static inline int avr_offset(int i)
> +{
> +    return vsr_full_offset(i + 32);
> +}

avr_full_offset?


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 4/8] target/ppc: introduce avrh_offset() and avrl_offset() functions
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 4/8] target/ppc: introduce avrh_offset() and avrl_offset() functions Mark Cave-Ayland
@ 2019-03-03 23:31   ` Richard Henderson
  2019-03-05 17:38     ` Mark Cave-Ayland
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2019-03-03 23:31 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
> These will become more useful later, but initially use this as an aid to
> simplify the offset calculation by replacing the HOST_TARGET_BIGENDIAN
> sections with the VsrD macro.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/cpu.h       | 10 ++++++++++
>  target/ppc/translate.c | 24 ++++++++++--------------
>  2 files changed, 20 insertions(+), 14 deletions(-)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index d0580c6b6d..326593e0e7 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -2603,6 +2603,16 @@ static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
>      return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
>  }
>  
> +static inline int avrh_offset(int i)
> +{
> +    return offsetof(CPUPPCState, vsr[32 + i].VsrD(0));
> +}
> +
> +static inline int avrl_offset(int i)
> +{
> +    return offsetof(CPUPPCState, vsr[32 + i].VsrD(1));
> +}

I really don't see the point of these...

>  static inline void get_avr64(TCGv_i64 dst, int regno, bool high)
>  {
> -#ifdef HOST_WORDS_BIGENDIAN
> -    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState,
> -                                          vsr[32 + regno].u64[(high ? 0 : 1)]));
> -#else
> -    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState,
> -                                          vsr[32 + regno].u64[(high ? 1 : 0)]));
> -#endif
> +    if (high) {
> +        tcg_gen_ld_i64(dst, cpu_env, avrh_offset(regno));
> +    } else {
> +        tcg_gen_ld_i64(dst, cpu_env, avrl_offset(regno));
> +    }

When you could just as well write this as

  tcg_gen_ld_i64(dst, cpu_env,
    offsetof(CPUPPCState, vsr[32+regno].VsrD(high)));


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 6/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 6/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
@ 2019-03-03 23:32   ` Richard Henderson
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2019-03-03 23:32 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
> When VSX support was initially added, the fpr registers were added at
> offset 0 of the VSR register and the vsrl registers were added at offset
> 1. This is in contrast to the VMX registers (the last 32 VSX registers) which
> are stored in host-endian order.
> 
> Switch the fpr/vsrl registers so that the lower 32 VSX registers are now also
> stored in host endian order to match the VMX registers. This ensures that TCG
> vector operations involving mixed VMX and VSX registers will function
> correctly.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/cpu.h      | 4 ++--
>  target/ppc/internal.h | 8 ++++----
>  target/ppc/machine.c  | 8 ++++----
>  3 files changed, 10 insertions(+), 10 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 7/8] target/ppc: introduce vsrh_offset() function
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 7/8] target/ppc: introduce vsrh_offset() function Mark Cave-Ayland
@ 2019-03-03 23:33   ` Richard Henderson
  2019-03-05 17:42     ` Mark Cave-Ayland
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2019-03-03 23:33 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
> -static inline int fpr_offset(int i)
> +static inline int vsrh_offset(int i)

I don't agree with this.  The original is clearer for its uses.


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions Mark Cave-Ayland
@ 2019-03-03 23:35   ` Richard Henderson
  2019-03-05 18:16     ` Mark Cave-Ayland
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2019-03-03 23:35 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
>  static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
>  {
> -    if (n < 32) {
> -        get_fpr(dst, n);
> -    } else {
> -        get_avr64(dst, n - 32, true);
> -    }
> +    tcg_gen_ld_i64(dst, cpu_env, vsrh_offset(n));
>  }
>  
>  static inline void get_cpu_vsrl(TCGv_i64 dst, int n)
>  {
> -    if (n < 32) {
> -        get_vsrl(dst, n);
> -    } else {
> -        get_avr64(dst, n - 32, false);
> -    }
> +    tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
>  }
>  
>  static inline void set_cpu_vsrh(int n, TCGv_i64 src)
>  {
> -    if (n < 32) {
> -        set_fpr(n, src);
> -    } else {
> -        set_avr64(n - 32, src, true);
> -    }
> +    tcg_gen_st_i64(src, cpu_env, vsrh_offset(n));
>  }

I think these ought to have a "high" parameter, like set/get_avr64.


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function Mark Cave-Ayland
  2019-03-03 23:19   ` Richard Henderson
@ 2019-03-04  5:37   ` David Gibson
  1 sibling, 0 replies; 27+ messages in thread
From: David Gibson @ 2019-03-04  5:37 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: qemu-devel, qemu-ppc, richard.henderson

[-- Attachment #1: Type: text/plain, Size: 1993 bytes --]

On Sun, Mar 03, 2019 at 05:23:36PM +0000, Mark Cave-Ayland wrote:
> Instead of having multiple copies of the offset calculation logic, move it to a
> single fpr_offset() function.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

Applied to ppc-for-4.0, thanks.

> ---
>  target/ppc/cpu.h       | 7 ++++++-
>  target/ppc/translate.c | 4 ++--
>  2 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 26604ddf98..4bb4e42670 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -2563,9 +2563,14 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
>  }
>  
>  /* Accessors for FP, VMX and VSX registers */
> +static inline int fpr_offset(int i)
> +{
> +    return offsetof(CPUPPCState, vsr[i].u64[0]);
> +}
> +
>  static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
>  {
> -    return &env->vsr[i].u64[0];
> +    return (uint64_t *)((uintptr_t)env + fpr_offset(i));
>  }
>  
>  static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 819221f246..3b1992faf1 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -6677,12 +6677,12 @@ GEN_TM_PRIV_NOOP(trechkpt);
>  
>  static inline void get_fpr(TCGv_i64 dst, int regno)
>  {
> -    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState, vsr[regno].u64[0]));
> +    tcg_gen_ld_i64(dst, cpu_env, fpr_offset(regno));
>  }
>  
>  static inline void set_fpr(int regno, TCGv_i64 src)
>  {
> -    tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState, vsr[regno].u64[0]));
> +    tcg_gen_st_i64(src, cpu_env, fpr_offset(regno));
>  }
>  
>  static inline void get_avr64(TCGv_i64 dst, int regno, bool high)

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/8] target/ppc: introduce single vsrl_offset() function
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 2/8] target/ppc: introduce single vsrl_offset() function Mark Cave-Ayland
  2019-03-03 23:20   ` Richard Henderson
@ 2019-03-04  5:39   ` David Gibson
  1 sibling, 0 replies; 27+ messages in thread
From: David Gibson @ 2019-03-04  5:39 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: qemu-devel, qemu-ppc, richard.henderson

[-- Attachment #1: Type: text/plain, Size: 2919 bytes --]

On Sun, Mar 03, 2019 at 05:23:37PM +0000, Mark Cave-Ayland wrote:
> Instead of having multiple copies of the offset calculation logic, move it to a
> single vsrl_offset() function.
> 
> This commit also renames the existing get_vsr()/set_vsr() functions to
> get_vsrl()/set_vsrl() which better describes their purpose.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

Applied to ppc-for-4.0, thanks.

> ---
>  target/ppc/cpu.h                    |  7 ++++++-
>  target/ppc/translate/vsx-impl.inc.c | 12 ++++++------
>  2 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 4bb4e42670..4a7df13c2d 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -2573,9 +2573,14 @@ static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
>      return (uint64_t *)((uintptr_t)env + fpr_offset(i));
>  }
>  
> +static inline int vsrl_offset(int i)
> +{
> +    return offsetof(CPUPPCState, vsr[i].u64[1]);
> +}
> +
>  static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
>  {
> -    return &env->vsr[i].u64[1];
> +    return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
>  }
>  
>  static inline ppc_avr_t *cpu_avr_ptr(CPUPPCState *env, int i)
> diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
> index e73197e717..381ae0f2e9 100644
> --- a/target/ppc/translate/vsx-impl.inc.c
> +++ b/target/ppc/translate/vsx-impl.inc.c
> @@ -1,13 +1,13 @@
>  /***                           VSX extension                               ***/
>  
> -static inline void get_vsr(TCGv_i64 dst, int n)
> +static inline void get_vsrl(TCGv_i64 dst, int n)
>  {
> -    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState, vsr[n].u64[1]));
> +    tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
>  }
>  
> -static inline void set_vsr(int n, TCGv_i64 src)
> +static inline void set_vsrl(int n, TCGv_i64 src)
>  {
> -    tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState, vsr[n].u64[1]));
> +    tcg_gen_st_i64(src, cpu_env, vsrl_offset(n));
>  }
>  
>  static inline int vsr_full_offset(int n)
> @@ -27,7 +27,7 @@ static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
>  static inline void get_cpu_vsrl(TCGv_i64 dst, int n)
>  {
>      if (n < 32) {
> -        get_vsr(dst, n);
> +        get_vsrl(dst, n);
>      } else {
>          get_avr64(dst, n - 32, false);
>      }
> @@ -45,7 +45,7 @@ static inline void set_cpu_vsrh(int n, TCGv_i64 src)
>  static inline void set_cpu_vsrl(int n, TCGv_i64 src)
>  {
>      if (n < 32) {
> -        set_vsr(n, src);
> +        set_vsrl(n, src);
>      } else {
>          set_avr64(n - 32, src, false);
>      }

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order
  2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
                   ` (7 preceding siblings ...)
  2019-03-03 17:23 ` [Qemu-devel] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions Mark Cave-Ayland
@ 2019-03-04  5:43 ` David Gibson
  8 siblings, 0 replies; 27+ messages in thread
From: David Gibson @ 2019-03-04  5:43 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: qemu-devel, qemu-ppc, richard.henderson

[-- Attachment #1: Type: text/plain, Size: 2675 bytes --]

On Sun, Mar 03, 2019 at 05:23:35PM +0000, Mark Cave-Ayland wrote:
> After some investigation into Andrew's report of corruption in his ppc64le
> tests at https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg07234.html, I
> discovered the underlying cause was that the first 32 VSX registers are not
> stored in host endian order.
> 
> This is something that Richard and I had discussed before, but missed that with
> VSX if you have source registers from different register sets then even logical
> operations will give you the wrong result.
> 
> Rather than revert 7b8fe477e1 "target/ppc: convert VSX logical operations to
> vector operations" let's keep the use of the accelerated vector instructions,
> and instead fix the real problem which is to switch the first 32 VSX registers
> to host endian order matching the VMX registers.
> 
> Patches 1-5 aim to consolidate the offset calculations for both CPUPPCState
> and the associated _ptr() functions into one single place.
> 
> With this preliminary work complete, patch 6 switches the first 32 registers
> into host endian order without too much difficulty.
> 
> Finally now that all VSX registers are stored in the same way, the vsr offset
> functions and get_cpu_vsrh()/get_cpu_vsrl() can be simplified accordingly.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

I've applied the first two patches.  The rest I'll wait on a respin
addressing Richard's comments.

> 
> 
> Mark Cave-Ayland (8):
>   target/ppc: introduce single fpr_offset() function
>   target/ppc: introduce single vsrl_offset() function
>   target/ppc: move Vsr* macros from internal.h to cpu.h
>   target/ppc: introduce avrh_offset() and avrl_offset() functions
>   target/ppc: introduce avr_offset() function
>   target/ppc: switch fpr/vsrl registers so all VSX registers are in host
>     endian order
>   target/ppc: introduce vsrh_offset() function
>   target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions
> 
>  target/ppc/cpu.h                    | 56 +++++++++++++++++++++++++++++++++++--
>  target/ppc/internal.h               | 27 +++---------------
>  target/ppc/machine.c                |  8 +++---
>  target/ppc/translate.c              | 28 ++++++++-----------
>  target/ppc/translate/vmx-impl.inc.c | 27 ++++++++----------
>  target/ppc/translate/vsx-impl.inc.c | 39 +++-----------------------
>  6 files changed, 88 insertions(+), 97 deletions(-)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 5/8] target/ppc: introduce avr_offset() function
  2019-03-03 23:29   ` Richard Henderson
@ 2019-03-05 17:16     ` Mark Cave-Ayland
  2019-03-05 21:16       ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-05 17:16 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david

On 03/03/2019 23:29, Richard Henderson wrote:

> On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
>> All TCG vector operations require pointers to the base address of the vector
>> rather than separate access to the top and bottom 64-bits. Convert
>> the VMX TCG instructions to use a new avr_offset() function instead of
>> avr64_offset(), which can itself be written as a simple wrapper onto
>> vsr_full_offset().
>>
>> After the conversion is complete then avr64_offset() can be removed since its
>> functionality is now completely within get_avr64()/set_avr64().
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>> ---
>>  target/ppc/cpu.h                    | 12 +++++++++++-
>>  target/ppc/translate/vmx-impl.inc.c | 27 +++++++++++----------------
>>  target/ppc/translate/vsx-impl.inc.c |  5 -----
>>  3 files changed, 22 insertions(+), 22 deletions(-)
>>
>> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
>> index 326593e0e7..89651988ab 100644
>> --- a/target/ppc/cpu.h
>> +++ b/target/ppc/cpu.h
>> @@ -2598,6 +2598,11 @@ static inline int vsrl_offset(int i)
>>      return offsetof(CPUPPCState, vsr[i].u64[1]);
>>  }
>>  
>> +static inline int vsr_full_offset(int i)
>> +{
>> +    return offsetof(CPUPPCState, vsr[i].u64[0]);
>> +}
>> +
>>  static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
>>  {
>>      return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
>> @@ -2613,9 +2618,14 @@ static inline int avrl_offset(int i)
>>      return offsetof(CPUPPCState, vsr[32 + i].VsrD(1));
>>  }
>>  
>> +static inline int avr_offset(int i)
>> +{
>> +    return vsr_full_offset(i + 32);
>> +}
> 
> avr_full_offset?

I chose avr_offset() because once you get to the end of the patchset, everything uses
offsets to the first byte of the register regardless of endian except for the avr64
functions (i.e. full becomes the new normal which seems like a fairly standard
expectation for offset).

Really though I don't feel too strongly about this, so would you like me to rename it
avr_full_offset() to match the existing vsr_full_offset()?



ATB,

Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 4/8] target/ppc: introduce avrh_offset() and avrl_offset() functions
  2019-03-03 23:31   ` Richard Henderson
@ 2019-03-05 17:38     ` Mark Cave-Ayland
  2019-03-05 21:24       ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-05 17:38 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david

On 03/03/2019 23:31, Richard Henderson wrote:

> On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
>> These will become more useful later, but initially use this as an aid to
>> simplify the offset calculation by replacing the HOST_TARGET_BIGENDIAN
>> sections with the VsrD macro.
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>> ---
>>  target/ppc/cpu.h       | 10 ++++++++++
>>  target/ppc/translate.c | 24 ++++++++++--------------
>>  2 files changed, 20 insertions(+), 14 deletions(-)
>>
>> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
>> index d0580c6b6d..326593e0e7 100644
>> --- a/target/ppc/cpu.h
>> +++ b/target/ppc/cpu.h
>> @@ -2603,6 +2603,16 @@ static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
>>      return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
>>  }
>>  
>> +static inline int avrh_offset(int i)
>> +{
>> +    return offsetof(CPUPPCState, vsr[32 + i].VsrD(0));
>> +}
>> +
>> +static inline int avrl_offset(int i)
>> +{
>> +    return offsetof(CPUPPCState, vsr[32 + i].VsrD(1));
>> +}
> 
> I really don't see the point of these...
> 
>>  static inline void get_avr64(TCGv_i64 dst, int regno, bool high)
>>  {
>> -#ifdef HOST_WORDS_BIGENDIAN
>> -    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState,
>> -                                          vsr[32 + regno].u64[(high ? 0 : 1)]));
>> -#else
>> -    tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState,
>> -                                          vsr[32 + regno].u64[(high ? 1 : 0)]));
>> -#endif
>> +    if (high) {
>> +        tcg_gen_ld_i64(dst, cpu_env, avrh_offset(regno));
>> +    } else {
>> +        tcg_gen_ld_i64(dst, cpu_env, avrl_offset(regno));
>> +    }
> 
> When you could just as well write this as
> 
>   tcg_gen_ld_i64(dst, cpu_env,
>     offsetof(CPUPPCState, vsr[32+regno].VsrD(high)));

It's really that this is a stepping stone to patch 7 where you end up with this:

static inline int vsrh_offset(int i)
{
    return offsetof(CPUPPCState, vsr[i].VsrD(0));
}

static inline int vsrl_offset(int i)
{
    return offsetof(CPUPPCState, vsr[i].VsrD(1));
}

...

static inline int avrh_offset(int i)
{
    return vsrh_offset(i + 32);
}

static inline int avrl_offset(int i)
{
    return vsrl_offset(i + 32);
}

i.e. the logic that describes the AVR registers as being the last set of 32 VSX
registers is captured more succinctly in the avr[l,h] wrapper functions. How about
rewriting the above like this:

    tcg_gen_ld_i64(dst, cpu_env, high ? avrh_offset(regno) : avrl_offset(regno));

which looks a bit easier on the eye? I'm less keen on pushing the "high" bool all the
way down into offset functions as I find the separate vsrh/vsrl functions much easier
to read in the helpers than the get_avr64() version.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 7/8] target/ppc: introduce vsrh_offset() function
  2019-03-03 23:33   ` Richard Henderson
@ 2019-03-05 17:42     ` Mark Cave-Ayland
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-05 17:42 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david

On 03/03/2019 23:33, Richard Henderson wrote:

> On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
>> -static inline int fpr_offset(int i)
>> +static inline int vsrh_offset(int i)
> 
> I don't agree with this.  The original is clearer for its uses.

Well as the patchset was coming from a VSX perspective, I took the approach that more
people would be familiar with high/low pair rather than understanding that VSX was
evolution of the original FPRs.

But again for me, I'm happy with either way so I don't mind changing it.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions
  2019-03-03 23:35   ` Richard Henderson
@ 2019-03-05 18:16     ` Mark Cave-Ayland
  2019-03-06 21:48       ` [Qemu-devel] [Qemu-ppc] " Mark Cave-Ayland
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-05 18:16 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david

On 03/03/2019 23:35, Richard Henderson wrote:

> On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
>>  static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
>>  {
>> -    if (n < 32) {
>> -        get_fpr(dst, n);
>> -    } else {
>> -        get_avr64(dst, n - 32, true);
>> -    }
>> +    tcg_gen_ld_i64(dst, cpu_env, vsrh_offset(n));
>>  }
>>  
>>  static inline void get_cpu_vsrl(TCGv_i64 dst, int n)
>>  {
>> -    if (n < 32) {
>> -        get_vsrl(dst, n);
>> -    } else {
>> -        get_avr64(dst, n - 32, false);
>> -    }
>> +    tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
>>  }
>>  
>>  static inline void set_cpu_vsrh(int n, TCGv_i64 src)
>>  {
>> -    if (n < 32) {
>> -        set_fpr(n, src);
>> -    } else {
>> -        set_avr64(n - 32, src, true);
>> -    }
>> +    tcg_gen_st_i64(src, cpu_env, vsrh_offset(n));
>>  }
> 
> I think these ought to have a "high" parameter, like set/get_avr64.

Right, this is effectively the same discussion as in my previous email so I suggest
we follow this up there.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 5/8] target/ppc: introduce avr_offset() function
  2019-03-05 17:16     ` Mark Cave-Ayland
@ 2019-03-05 21:16       ` Richard Henderson
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2019-03-05 21:16 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/5/19 9:16 AM, Mark Cave-Ayland wrote:
> Really though I don't feel too strongly about this, so would you like me to
> rename it avr_full_offset() to match the existing vsr_full_offset()?
I do think matching the names makes things clearer.


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 4/8] target/ppc: introduce avrh_offset() and avrl_offset() functions
  2019-03-05 17:38     ` Mark Cave-Ayland
@ 2019-03-05 21:24       ` Richard Henderson
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2019-03-05 21:24 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/5/19 9:38 AM, Mark Cave-Ayland wrote:
> It's really that this is a stepping stone to patch 7 where you end up with this:
> 
> static inline int vsrh_offset(int i)
> {
>     return offsetof(CPUPPCState, vsr[i].VsrD(0));
> }
> 
> static inline int vsrl_offset(int i)
> {
>     return offsetof(CPUPPCState, vsr[i].VsrD(1));
> }
> 
> ...
> 
> static inline int avrh_offset(int i)
> {
>     return vsrh_offset(i + 32);
> }
> 
> static inline int avrl_offset(int i)
> {
>     return vsrl_offset(i + 32);
> }

Yes, but the only users look like...

>     tcg_gen_ld_i64(dst, cpu_env, high ? avrh_offset(regno) : avrl_offset(regno));

... this.  And to me that suggests that one helper instead of two would be
better, so that you can write

  tcg_gen_ld_i64(dst, cpu_env, avr64_offset(regno, high));

and propagate the "high" into the VsrD argument.

> which looks a bit easier on the eye? I'm less keen on pushing the "high" bool all the
> way down into offset functions as I find the separate vsrh/vsrl functions much easier
> to read in the helpers than the get_avr64() version.

Ah.  Hmm.

Will we not in the end require a function A64's

  vec_reg_offset(regno, element, size)

I know DavidH did when writing the s390x vector support last week.  Perhaps
that's more palatable than avr64_offset that only supports 64-bit elements?


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions
  2019-03-05 18:16     ` Mark Cave-Ayland
@ 2019-03-06 21:48       ` Mark Cave-Ayland
  2019-03-06 22:27         ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Cave-Ayland @ 2019-03-06 21:48 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david

On 05/03/2019 18:16, Mark Cave-Ayland wrote:

> On 03/03/2019 23:35, Richard Henderson wrote:
> 
>> On 3/3/19 9:23 AM, Mark Cave-Ayland wrote:
>>>  static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
>>>  {
>>> -    if (n < 32) {
>>> -        get_fpr(dst, n);
>>> -    } else {
>>> -        get_avr64(dst, n - 32, true);
>>> -    }
>>> +    tcg_gen_ld_i64(dst, cpu_env, vsrh_offset(n));
>>>  }
>>>  
>>>  static inline void get_cpu_vsrl(TCGv_i64 dst, int n)
>>>  {
>>> -    if (n < 32) {
>>> -        get_vsrl(dst, n);
>>> -    } else {
>>> -        get_avr64(dst, n - 32, false);
>>> -    }
>>> +    tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
>>>  }
>>>  
>>>  static inline void set_cpu_vsrh(int n, TCGv_i64 src)
>>>  {
>>> -    if (n < 32) {
>>> -        set_fpr(n, src);
>>> -    } else {
>>> -        set_avr64(n - 32, src, true);
>>> -    }
>>> +    tcg_gen_st_i64(src, cpu_env, vsrh_offset(n));
>>>  }
>>
>> I think these ought to have a "high" parameter, like set/get_avr64.
> 
> Right, this is effectively the same discussion as in my previous email so I suggest
> we follow this up there.

I've reworked this patchset over the evening to keep avr64_offset() and looking over
the result it's more readable than I thought, mostly thanks to its use of the VsrD macro.

The only part I'm now not sure about is whether from the above you want to keep
fpr_offset() and vsrh_offset(), or whether in the final patch in the series I can
introduce vsr64_offset() similar to avr64_offset() and switch the callers over to use it?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions
  2019-03-06 21:48       ` [Qemu-devel] [Qemu-ppc] " Mark Cave-Ayland
@ 2019-03-06 22:27         ` Richard Henderson
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2019-03-06 22:27 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david

On 3/6/19 1:48 PM, Mark Cave-Ayland wrote:
> The only part I'm now not sure about is whether from the above you want to keep
> fpr_offset() and vsrh_offset(), or whether in the final patch in the series I can
> introduce vsr64_offset() similar to avr64_offset() and switch the callers over to use it?

I would not keep vsrh_offset and would introduce vsr64_offset.

But I would keep fpr_offset, specifically for fpr stuff.
If you like, you can have it forward to vsr64_offset(r, high=true).


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-03-06 22:27 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-03 17:23 [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
2019-03-03 17:23 ` [Qemu-devel] [PATCH 1/8] target/ppc: introduce single fpr_offset() function Mark Cave-Ayland
2019-03-03 23:19   ` Richard Henderson
2019-03-04  5:37   ` David Gibson
2019-03-03 17:23 ` [Qemu-devel] [PATCH 2/8] target/ppc: introduce single vsrl_offset() function Mark Cave-Ayland
2019-03-03 23:20   ` Richard Henderson
2019-03-04  5:39   ` David Gibson
2019-03-03 17:23 ` [Qemu-devel] [PATCH 3/8] target/ppc: move Vsr* macros from internal.h to cpu.h Mark Cave-Ayland
2019-03-03 17:23 ` [Qemu-devel] [PATCH 4/8] target/ppc: introduce avrh_offset() and avrl_offset() functions Mark Cave-Ayland
2019-03-03 23:31   ` Richard Henderson
2019-03-05 17:38     ` Mark Cave-Ayland
2019-03-05 21:24       ` Richard Henderson
2019-03-03 17:23 ` [Qemu-devel] [PATCH 5/8] target/ppc: introduce avr_offset() function Mark Cave-Ayland
2019-03-03 23:29   ` Richard Henderson
2019-03-05 17:16     ` Mark Cave-Ayland
2019-03-05 21:16       ` Richard Henderson
2019-03-03 17:23 ` [Qemu-devel] [PATCH 6/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order Mark Cave-Ayland
2019-03-03 23:32   ` Richard Henderson
2019-03-03 17:23 ` [Qemu-devel] [PATCH 7/8] target/ppc: introduce vsrh_offset() function Mark Cave-Ayland
2019-03-03 23:33   ` Richard Henderson
2019-03-05 17:42     ` Mark Cave-Ayland
2019-03-03 17:23 ` [Qemu-devel] [PATCH 8/8] target/ppc: simplify get_cpu_vsrh() and get_cpu_vsrl() functions Mark Cave-Ayland
2019-03-03 23:35   ` Richard Henderson
2019-03-05 18:16     ` Mark Cave-Ayland
2019-03-06 21:48       ` [Qemu-devel] [Qemu-ppc] " Mark Cave-Ayland
2019-03-06 22:27         ` Richard Henderson
2019-03-04  5:43 ` [Qemu-devel] [PATCH 0/8] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.