[Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up
@ 2019-04-28 14:38 Mark Cave-Ayland
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c Mark Cave-Ayland
                   ` (13 more replies)
  0 siblings, 14 replies; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

With the conversion of PPC VSX registers to host endian during the 4.0 development
cycle, the VSX helpers getVSR() and putVSR() which were used to convert between big
endian and host endian (and are currently just a no-op) can now be removed. This
eliminates an extra copy for each VSX source and destination register at runtime.

Patches 1-3 do the elimination work on a per-file basis and switch VSX register
accesses to be via pointers rather than on copies managed using getVSR()/putVSR().

After this patches 4-12 change the VSX registers to be passed to helpers via pointers
rather than register number so that the decode of the vector register pointers occurs
at translation time instead of at runtime. This matches how VMX instructions are
currently decoded.

Finally patches 13 and 14 perform some additional tidy-up around decoding registers
at translation time instead of runtime for VSX_REST_DC and VSX_FMADD.

Greg: I've added you as CC since you managed to find a bug in my last series. This
one is much more mechanical, but if you are able to confirm this doesn't introduce
any regressions in your test images then that would be great.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

Mark Cave-Ayland (14):
  target/ppc: remove getVSR()/putVSR() from fpu_helper.c
  target/ppc: remove getVSR()/putVSR() from mem_helper.c
  target/ppc: remove getVSR()/putVSR() from int_helper.c
  target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_R2 macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.c
  target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at
    translation time
  target/ppc: decode target register in VSX_EXTRACT_INSERT at
    translation time
  target/ppc: improve VSX_TEST_DC with new generator macros
  target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro

 target/ppc/fpu_helper.c             | 745 +++++++++++++++---------------------
 target/ppc/helper.h                 | 344 +++++++++--------
 target/ppc/int_helper.c             |  24 +-
 target/ppc/internal.h               |  12 -
 target/ppc/mem_helper.c             |  22 +-
 target/ppc/translate/vsx-impl.inc.c | 523 ++++++++++++++++---------
 6 files changed, 842 insertions(+), 828 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:25   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c Mark Cave-Ayland
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Since commit 8a14d31b00 "target/ppc: switch fpr/vsrl registers so all VSX
registers are in host endian order" functions getVSR() and putVSR() which used
to convert the VSR registers into host endian order are no longer required.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c | 707 ++++++++++++++++++++++--------------------------
 1 file changed, 317 insertions(+), 390 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 0b7308f539..148ee5ac4c 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -1803,35 +1803,33 @@ uint32_t helper_efdcmpeq(CPUPPCState *env, uint64_t op1, uint64_t op2)
 #define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf, r2sp)                    \
 void helper_##name(CPUPPCState *env, uint32_t opcode)                        \
 {                                                                            \
-    ppc_vsr_t xt, xa, xb;                                                    \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                   \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
-    getVSR(xA(opcode), &xa, env);                                            \
-    getVSR(xB(opcode), &xb, env);                                            \
-    getVSR(xT(opcode), &xt, env);                                            \
     helper_reset_fpstatus(env);                                              \
                                                                              \
     for (i = 0; i < nels; i++) {                                             \
         float_status tstat = env->fp_status;                                 \
         set_float_exception_flags(0, &tstat);                                \
-        xt.fld = tp##_##op(xa.fld, xb.fld, &tstat);                          \
+        xt->fld = tp##_##op(xa->fld, xb->fld, &tstat);                       \
         env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
                                                                              \
         if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {    \
             float_invalid_op_addsub(env, sfprf, GETPC(),                     \
-                                    tp##_classify(xa.fld) |                  \
-                                    tp##_classify(xb.fld));                  \
+                                    tp##_classify(xa->fld) |                 \
+                                    tp##_classify(xb->fld));                 \
         }                                                                    \
                                                                              \
         if (r2sp) {                                                          \
-            xt.fld = helper_frsp(env, xt.fld);                               \
+            xt->fld = helper_frsp(env, xt->fld);                             \
         }                                                                    \
                                                                              \
         if (sfprf) {                                                         \
-            helper_compute_fprf_float64(env, xt.fld);                        \
+            helper_compute_fprf_float64(env, xt->fld);                       \
         }                                                                    \
     }                                                                        \
-    putVSR(xT(opcode), &xt, env);                                            \
     do_float_check_status(env, GETPC());                                     \
 }
 
@@ -1846,12 +1844,11 @@ VSX_ADD_SUB(xvsubsp, sub, 4, float32, VsrW(i), 0, 0)
 
 void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xt, xa, xb;
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
+    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
-    getVSR(rA(opcode) + 32, &xa, env);
-    getVSR(rB(opcode) + 32, &xb, env);
-    getVSR(rD(opcode) + 32, &xt, env);
     helper_reset_fpstatus(env);
 
     tstat = env->fp_status;
@@ -1860,18 +1857,17 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
     }
 
     set_float_exception_flags(0, &tstat);
-    xt.f128 = float128_add(xa.f128, xb.f128, &tstat);
+    xt->f128 = float128_add(xa->f128, xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
 
     if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
         float_invalid_op_addsub(env, 1, GETPC(),
-                                float128_classify(xa.f128) |
-                                float128_classify(xb.f128));
+                                float128_classify(xa->f128) |
+                                float128_classify(xb->f128));
     }
 
-    helper_compute_fprf_float128(env, xt.f128);
+    helper_compute_fprf_float128(env, xt->f128);
 
-    putVSR(rD(opcode) + 32, &xt, env);
     do_float_check_status(env, GETPC());
 }
 
@@ -1886,36 +1882,34 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
 #define VSX_MUL(op, nels, tp, fld, sfprf, r2sp)                              \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
 {                                                                            \
-    ppc_vsr_t xt, xa, xb;                                                    \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                   \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
-    getVSR(xA(opcode), &xa, env);                                            \
-    getVSR(xB(opcode), &xb, env);                                            \
-    getVSR(xT(opcode), &xt, env);                                            \
     helper_reset_fpstatus(env);                                              \
                                                                              \
     for (i = 0; i < nels; i++) {                                             \
         float_status tstat = env->fp_status;                                 \
         set_float_exception_flags(0, &tstat);                                \
-        xt.fld = tp##_mul(xa.fld, xb.fld, &tstat);                           \
+        xt->fld = tp##_mul(xa->fld, xb->fld, &tstat);                        \
         env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
                                                                              \
         if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {    \
             float_invalid_op_mul(env, sfprf, GETPC(),                        \
-                                 tp##_classify(xa.fld) |                     \
-                                 tp##_classify(xb.fld));                     \
+                                 tp##_classify(xa->fld) |                    \
+                                 tp##_classify(xb->fld));                    \
         }                                                                    \
                                                                              \
         if (r2sp) {                                                          \
-            xt.fld = helper_frsp(env, xt.fld);                               \
+            xt->fld = helper_frsp(env, xt->fld);                             \
         }                                                                    \
                                                                              \
         if (sfprf) {                                                         \
-            helper_compute_fprf_float64(env, xt.fld);                        \
+            helper_compute_fprf_float64(env, xt->fld);                       \
         }                                                                    \
     }                                                                        \
                                                                              \
-    putVSR(xT(opcode), &xt, env);                                            \
     do_float_check_status(env, GETPC());                                     \
 }
 
@@ -1926,13 +1920,11 @@ VSX_MUL(xvmulsp, 4, float32, VsrW(i), 0, 0)
 
 void helper_xsmulqp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xt, xa, xb;
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
+    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
-    getVSR(rA(opcode) + 32, &xa, env);
-    getVSR(rB(opcode) + 32, &xb, env);
-    getVSR(rD(opcode) + 32, &xt, env);
-
     helper_reset_fpstatus(env);
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
@@ -1940,17 +1932,16 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t opcode)
     }
 
     set_float_exception_flags(0, &tstat);
-    xt.f128 = float128_mul(xa.f128, xb.f128, &tstat);
+    xt->f128 = float128_mul(xa->f128, xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
 
     if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
         float_invalid_op_mul(env, 1, GETPC(),
-                             float128_classify(xa.f128) |
-                             float128_classify(xb.f128));
+                             float128_classify(xa->f128) |
+                             float128_classify(xb->f128));
     }
-    helper_compute_fprf_float128(env, xt.f128);
+    helper_compute_fprf_float128(env, xt->f128);
 
-    putVSR(rD(opcode) + 32, &xt, env);
     do_float_check_status(env, GETPC());
 }
 
@@ -1965,39 +1956,37 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t opcode)
 #define VSX_DIV(op, nels, tp, fld, sfprf, r2sp)                               \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
 {                                                                             \
-    ppc_vsr_t xt, xa, xb;                                                     \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                    \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     int i;                                                                    \
                                                                               \
-    getVSR(xA(opcode), &xa, env);                                             \
-    getVSR(xB(opcode), &xb, env);                                             \
-    getVSR(xT(opcode), &xt, env);                                             \
     helper_reset_fpstatus(env);                                               \
                                                                               \
     for (i = 0; i < nels; i++) {                                              \
         float_status tstat = env->fp_status;                                  \
         set_float_exception_flags(0, &tstat);                                 \
-        xt.fld = tp##_div(xa.fld, xb.fld, &tstat);                            \
+        xt->fld = tp##_div(xa->fld, xb->fld, &tstat);                         \
         env->fp_status.float_exception_flags |= tstat.float_exception_flags;  \
                                                                               \
         if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {     \
             float_invalid_op_div(env, sfprf, GETPC(),                         \
-                                 tp##_classify(xa.fld) |                      \
-                                 tp##_classify(xb.fld));                      \
+                                 tp##_classify(xa->fld) |                     \
+                                 tp##_classify(xb->fld));                     \
         }                                                                     \
         if (unlikely(tstat.float_exception_flags & float_flag_divbyzero)) {   \
             float_zero_divide_excp(env, GETPC());                             \
         }                                                                     \
                                                                               \
         if (r2sp) {                                                           \
-            xt.fld = helper_frsp(env, xt.fld);                                \
+            xt->fld = helper_frsp(env, xt->fld);                              \
         }                                                                     \
                                                                               \
         if (sfprf) {                                                          \
-            helper_compute_fprf_float64(env, xt.fld);                         \
+            helper_compute_fprf_float64(env, xt->fld);                        \
         }                                                                     \
     }                                                                         \
                                                                               \
-    putVSR(xT(opcode), &xt, env);                                             \
     do_float_check_status(env, GETPC());                                      \
 }
 
@@ -2008,13 +1997,11 @@ VSX_DIV(xvdivsp, 4, float32, VsrW(i), 0, 0)
 
 void helper_xsdivqp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xt, xa, xb;
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
+    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
-    getVSR(rA(opcode) + 32, &xa, env);
-    getVSR(rB(opcode) + 32, &xb, env);
-    getVSR(rD(opcode) + 32, &xt, env);
-
     helper_reset_fpstatus(env);
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
@@ -2022,20 +2009,19 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode)
     }
 
     set_float_exception_flags(0, &tstat);
-    xt.f128 = float128_div(xa.f128, xb.f128, &tstat);
+    xt->f128 = float128_div(xa->f128, xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
 
     if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
         float_invalid_op_div(env, 1, GETPC(),
-                             float128_classify(xa.f128) |
-                             float128_classify(xb.f128));
+                             float128_classify(xa->f128) |
+                             float128_classify(xb->f128));
     }
     if (unlikely(tstat.float_exception_flags & float_flag_divbyzero)) {
         float_zero_divide_excp(env, GETPC());
     }
 
-    helper_compute_fprf_float128(env, xt.f128);
-    putVSR(rD(opcode) + 32, &xt, env);
+    helper_compute_fprf_float128(env, xt->f128);
     do_float_check_status(env, GETPC());
 }
 
@@ -2050,29 +2036,27 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode)
 #define VSX_RE(op, nels, tp, fld, sfprf, r2sp)                                \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
 {                                                                             \
-    ppc_vsr_t xt, xb;                                                         \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     int i;                                                                    \
                                                                               \
-    getVSR(xB(opcode), &xb, env);                                             \
-    getVSR(xT(opcode), &xt, env);                                             \
     helper_reset_fpstatus(env);                                               \
                                                                               \
     for (i = 0; i < nels; i++) {                                              \
-        if (unlikely(tp##_is_signaling_nan(xb.fld, &env->fp_status))) {       \
+        if (unlikely(tp##_is_signaling_nan(xb->fld, &env->fp_status))) {      \
             float_invalid_op_vxsnan(env, GETPC());                            \
         }                                                                     \
-        xt.fld = tp##_div(tp##_one, xb.fld, &env->fp_status);                 \
+        xt->fld = tp##_div(tp##_one, xb->fld, &env->fp_status);               \
                                                                               \
         if (r2sp) {                                                           \
-            xt.fld = helper_frsp(env, xt.fld);                                \
+            xt->fld = helper_frsp(env, xt->fld);                              \
         }                                                                     \
                                                                               \
         if (sfprf) {                                                          \
-            helper_compute_fprf_float64(env, xt.fld);                         \
+            helper_compute_fprf_float64(env, xt->fld);                        \
         }                                                                     \
     }                                                                         \
                                                                               \
-    putVSR(xT(opcode), &xt, env);                                             \
     do_float_check_status(env, GETPC());                                      \
 }
 
@@ -2092,37 +2076,35 @@ VSX_RE(xvresp, 4, float32, VsrW(i), 0, 0)
 #define VSX_SQRT(op, nels, tp, fld, sfprf, r2sp)                             \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
 {                                                                            \
-    ppc_vsr_t xt, xb;                                                        \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
-    getVSR(xB(opcode), &xb, env);                                            \
-    getVSR(xT(opcode), &xt, env);                                            \
     helper_reset_fpstatus(env);                                              \
                                                                              \
     for (i = 0; i < nels; i++) {                                             \
         float_status tstat = env->fp_status;                                 \
         set_float_exception_flags(0, &tstat);                                \
-        xt.fld = tp##_sqrt(xb.fld, &tstat);                                  \
+        xt->fld = tp##_sqrt(xb->fld, &tstat);                                \
         env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
                                                                              \
         if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {    \
-            if (tp##_is_neg(xb.fld) && !tp##_is_zero(xb.fld)) {              \
+            if (tp##_is_neg(xb->fld) && !tp##_is_zero(xb->fld)) {            \
                 float_invalid_op_vxsqrt(env, sfprf, GETPC());                \
-            } else if (tp##_is_signaling_nan(xb.fld, &tstat)) {              \
+            } else if (tp##_is_signaling_nan(xb->fld, &tstat)) {             \
                 float_invalid_op_vxsnan(env, GETPC());                       \
             }                                                                \
         }                                                                    \
                                                                              \
         if (r2sp) {                                                          \
-            xt.fld = helper_frsp(env, xt.fld);                               \
+            xt->fld = helper_frsp(env, xt->fld);                             \
         }                                                                    \
                                                                              \
         if (sfprf) {                                                         \
-            helper_compute_fprf_float64(env, xt.fld);                        \
+            helper_compute_fprf_float64(env, xt->fld);                       \
         }                                                                    \
     }                                                                        \
                                                                              \
-    putVSR(xT(opcode), &xt, env);                                            \
     do_float_check_status(env, GETPC());                                     \
 }
 
@@ -2142,38 +2124,36 @@ VSX_SQRT(xvsqrtsp, 4, float32, VsrW(i), 0, 0)
 #define VSX_RSQRTE(op, nels, tp, fld, sfprf, r2sp)                           \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
 {                                                                            \
-    ppc_vsr_t xt, xb;                                                        \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
-    getVSR(xB(opcode), &xb, env);                                            \
-    getVSR(xT(opcode), &xt, env);                                            \
     helper_reset_fpstatus(env);                                              \
                                                                              \
     for (i = 0; i < nels; i++) {                                             \
         float_status tstat = env->fp_status;                                 \
         set_float_exception_flags(0, &tstat);                                \
-        xt.fld = tp##_sqrt(xb.fld, &tstat);                                  \
-        xt.fld = tp##_div(tp##_one, xt.fld, &tstat);                         \
+        xt->fld = tp##_sqrt(xb->fld, &tstat);                                \
+        xt->fld = tp##_div(tp##_one, xt->fld, &tstat);                       \
         env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
                                                                              \
         if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {    \
-            if (tp##_is_neg(xb.fld) && !tp##_is_zero(xb.fld)) {              \
+            if (tp##_is_neg(xb->fld) && !tp##_is_zero(xb->fld)) {            \
                 float_invalid_op_vxsqrt(env, sfprf, GETPC());                \
-            } else if (tp##_is_signaling_nan(xb.fld, &tstat)) {              \
+            } else if (tp##_is_signaling_nan(xb->fld, &tstat)) {             \
                 float_invalid_op_vxsnan(env, GETPC());                       \
             }                                                                \
         }                                                                    \
                                                                              \
         if (r2sp) {                                                          \
-            xt.fld = helper_frsp(env, xt.fld);                               \
+            xt->fld = helper_frsp(env, xt->fld);                             \
         }                                                                    \
                                                                              \
         if (sfprf) {                                                         \
-            helper_compute_fprf_float64(env, xt.fld);                        \
+            helper_compute_fprf_float64(env, xt->fld);                       \
         }                                                                    \
     }                                                                        \
                                                                              \
-    putVSR(xT(opcode), &xt, env);                                            \
     do_float_check_status(env, GETPC());                                     \
 }
 
@@ -2195,37 +2175,35 @@ VSX_RSQRTE(xvrsqrtesp, 4, float32, VsrW(i), 0, 0)
 #define VSX_TDIV(op, nels, tp, fld, emin, emax, nbits)                  \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
 {                                                                       \
-    ppc_vsr_t xa, xb;                                                   \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                              \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                              \
     int i;                                                              \
     int fe_flag = 0;                                                    \
     int fg_flag = 0;                                                    \
                                                                         \
-    getVSR(xA(opcode), &xa, env);                                       \
-    getVSR(xB(opcode), &xb, env);                                       \
-                                                                        \
     for (i = 0; i < nels; i++) {                                        \
-        if (unlikely(tp##_is_infinity(xa.fld) ||                        \
-                     tp##_is_infinity(xb.fld) ||                        \
-                     tp##_is_zero(xb.fld))) {                           \
+        if (unlikely(tp##_is_infinity(xa->fld) ||                       \
+                     tp##_is_infinity(xb->fld) ||                       \
+                     tp##_is_zero(xb->fld))) {                          \
             fe_flag = 1;                                                \
             fg_flag = 1;                                                \
         } else {                                                        \
-            int e_a = ppc_##tp##_get_unbiased_exp(xa.fld);              \
-            int e_b = ppc_##tp##_get_unbiased_exp(xb.fld);              \
+            int e_a = ppc_##tp##_get_unbiased_exp(xa->fld);             \
+            int e_b = ppc_##tp##_get_unbiased_exp(xb->fld);             \
                                                                         \
-            if (unlikely(tp##_is_any_nan(xa.fld) ||                     \
-                         tp##_is_any_nan(xb.fld))) {                    \
+            if (unlikely(tp##_is_any_nan(xa->fld) ||                    \
+                         tp##_is_any_nan(xb->fld))) {                   \
                 fe_flag = 1;                                            \
             } else if ((e_b <= emin) || (e_b >= (emax - 2))) {          \
                 fe_flag = 1;                                            \
-            } else if (!tp##_is_zero(xa.fld) &&                         \
+            } else if (!tp##_is_zero(xa->fld) &&                        \
                        (((e_a - e_b) >= emax) ||                        \
                         ((e_a - e_b) <= (emin + 1)) ||                  \
                         (e_a <= (emin + nbits)))) {                     \
                 fe_flag = 1;                                            \
             }                                                           \
                                                                         \
-            if (unlikely(tp##_is_zero_or_denormal(xb.fld))) {           \
+            if (unlikely(tp##_is_zero_or_denormal(xb->fld))) {          \
                 /*                                                      \
                  * XB is not zero because of the above check and so     \
                  * must be denormalized.                                \
@@ -2255,34 +2233,31 @@ VSX_TDIV(xvtdivsp, 4, float32, VsrW(i), -126, 127, 23)
 #define VSX_TSQRT(op, nels, tp, fld, emin, nbits)                       \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
 {                                                                       \
-    ppc_vsr_t xa, xb;                                                   \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                              \
     int i;                                                              \
     int fe_flag = 0;                                                    \
     int fg_flag = 0;                                                    \
                                                                         \
-    getVSR(xA(opcode), &xa, env);                                       \
-    getVSR(xB(opcode), &xb, env);                                       \
-                                                                        \
     for (i = 0; i < nels; i++) {                                        \
-        if (unlikely(tp##_is_infinity(xb.fld) ||                        \
-                     tp##_is_zero(xb.fld))) {                           \
+        if (unlikely(tp##_is_infinity(xb->fld) ||                       \
+                     tp##_is_zero(xb->fld))) {                          \
             fe_flag = 1;                                                \
             fg_flag = 1;                                                \
         } else {                                                        \
-            int e_b = ppc_##tp##_get_unbiased_exp(xb.fld);              \
+            int e_b = ppc_##tp##_get_unbiased_exp(xb->fld);             \
                                                                         \
-            if (unlikely(tp##_is_any_nan(xb.fld))) {                    \
+            if (unlikely(tp##_is_any_nan(xb->fld))) {                   \
                 fe_flag = 1;                                            \
-            } else if (unlikely(tp##_is_zero(xb.fld))) {                \
+            } else if (unlikely(tp##_is_zero(xb->fld))) {               \
                 fe_flag = 1;                                            \
-            } else if (unlikely(tp##_is_neg(xb.fld))) {                 \
+            } else if (unlikely(tp##_is_neg(xb->fld))) {                \
                 fe_flag = 1;                                            \
-            } else if (!tp##_is_zero(xb.fld) &&                         \
+            } else if (!tp##_is_zero(xb->fld) &&                        \
                        (e_b <= (emin + nbits))) {                       \
                 fe_flag = 1;                                            \
             }                                                           \
                                                                         \
-            if (unlikely(tp##_is_zero_or_denormal(xb.fld))) {           \
+            if (unlikely(tp##_is_zero_or_denormal(xb->fld))) {          \
                 /*                                                      \
                  * XB is not zero because of the above check and        \
                  * therefore must be denormalized.                      \
@@ -2313,24 +2288,20 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23)
 #define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
 {                                                                             \
-    ppc_vsr_t xt_in, xa, xb, xt_out;                                          \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                    \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     ppc_vsr_t *b, *c;                                                         \
     int i;                                                                    \
                                                                               \
     if (afrm) { /* AxB + T */                                                 \
-        b = &xb;                                                              \
-        c = &xt_in;                                                           \
+        b = xb;                                                               \
+        c = xt;                                                               \
     } else { /* AxT + B */                                                    \
-        b = &xt_in;                                                           \
-        c = &xb;                                                              \
+        b = xt;                                                               \
+        c = xb;                                                               \
     }                                                                         \
                                                                               \
-    getVSR(xA(opcode), &xa, env);                                             \
-    getVSR(xB(opcode), &xb, env);                                             \
-    getVSR(xT(opcode), &xt_in, env);                                          \
-                                                                              \
-    xt_out = xt_in;                                                           \
-                                                                              \
     helper_reset_fpstatus(env);                                               \
                                                                               \
     for (i = 0; i < nels; i++) {                                              \
@@ -2342,30 +2313,29 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
              * result to odd.                                                 \
              */                                                               \
             set_float_rounding_mode(float_round_to_zero, &tstat);             \
-            xt_out.fld = tp##_muladd(xa.fld, b->fld, c->fld,                  \
-                                       maddflgs, &tstat);                     \
-            xt_out.fld |= (get_float_exception_flags(&tstat) &                \
+            xt->fld = tp##_muladd(xa->fld, b->fld, c->fld,                    \
+                                  maddflgs, &tstat);                          \
+            xt->fld |= (get_float_exception_flags(&tstat) &                   \
                               float_flag_inexact) != 0;                       \
         } else {                                                              \
-            xt_out.fld = tp##_muladd(xa.fld, b->fld, c->fld,                  \
-                                        maddflgs, &tstat);                    \
+            xt->fld = tp##_muladd(xa->fld, b->fld, c->fld,                    \
+                                  maddflgs, &tstat);                          \
         }                                                                     \
         env->fp_status.float_exception_flags |= tstat.float_exception_flags;  \
                                                                               \
         if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {     \
-            tp##_maddsub_update_excp(env, xa.fld, b->fld,                     \
+            tp##_maddsub_update_excp(env, xa->fld, b->fld,                    \
                                      c->fld, maddflgs, GETPC());              \
         }                                                                     \
                                                                               \
         if (r2sp) {                                                           \
-            xt_out.fld = helper_frsp(env, xt_out.fld);                        \
+            xt->fld = helper_frsp(env, xt->fld);                              \
         }                                                                     \
                                                                               \
         if (sfprf) {                                                          \
-            helper_compute_fprf_float64(env, xt_out.fld);                     \
+            helper_compute_fprf_float64(env, xt->fld);                        \
         }                                                                     \
     }                                                                         \
-    putVSR(xT(opcode), &xt_out, env);                                         \
     do_float_check_status(env, GETPC());                                      \
 }
 
@@ -2415,22 +2385,20 @@ VSX_MADD(xvnmsubmsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0, 0)
 #define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc)                                \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
 {                                                                             \
-    ppc_vsr_t xt, xa, xb;                                                     \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                    \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false;            \
                                                                               \
-    getVSR(xA(opcode), &xa, env);                                             \
-    getVSR(xB(opcode), &xb, env);                                             \
-    getVSR(xT(opcode), &xt, env);                                             \
-                                                                              \
-    if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||              \
-        float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {              \
+    if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||             \
+        float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {             \
         vxsnan_flag = true;                                                   \
         if (fpscr_ve == 0 && svxvc) {                                         \
             vxvc_flag = true;                                                 \
         }                                                                     \
     } else if (svxvc) {                                                       \
-        vxvc_flag = float64_is_quiet_nan(xa.VsrD(0), &env->fp_status) ||      \
-            float64_is_quiet_nan(xb.VsrD(0), &env->fp_status);                \
+        vxvc_flag = float64_is_quiet_nan(xa->VsrD(0), &env->fp_status) ||     \
+            float64_is_quiet_nan(xb->VsrD(0), &env->fp_status);               \
     }                                                                         \
     if (vxsnan_flag) {                                                        \
         float_invalid_op_vxsnan(env, GETPC());                                \
@@ -2441,15 +2409,15 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
     vex_flag = fpscr_ve && (vxvc_flag || vxsnan_flag);                        \
                                                                               \
     if (!vex_flag) {                                                          \
-        if (float64_##cmp(xb.VsrD(0), xa.VsrD(0), &env->fp_status) == exp) {  \
-            xt.VsrD(0) = -1;                                                  \
-            xt.VsrD(1) = 0;                                                   \
+        if (float64_##cmp(xb->VsrD(0), xa->VsrD(0),                           \
+                          &env->fp_status) == exp) {                          \
+            xt->VsrD(0) = -1;                                                 \
+            xt->VsrD(1) = 0;                                                  \
         } else {                                                              \
-            xt.VsrD(0) = 0;                                                   \
-            xt.VsrD(1) = 0;                                                   \
+            xt->VsrD(0) = 0;                                                  \
+            xt->VsrD(1) = 0;                                                  \
         }                                                                     \
     }                                                                         \
-    putVSR(xT(opcode), &xt, env);                                             \
     do_float_check_status(env, GETPC());                                      \
 }
 
@@ -2460,18 +2428,16 @@ VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
 
 void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xa, xb;
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];
     int64_t exp_a, exp_b;
     uint32_t cc;
 
-    getVSR(xA(opcode), &xa, env);
-    getVSR(xB(opcode), &xb, env);
-
-    exp_a = extract64(xa.VsrD(0), 52, 11);
-    exp_b = extract64(xb.VsrD(0), 52, 11);
+    exp_a = extract64(xa->VsrD(0), 52, 11);
+    exp_b = extract64(xb->VsrD(0), 52, 11);
 
-    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||
-                 float64_is_any_nan(xb.VsrD(0)))) {
+    if (unlikely(float64_is_any_nan(xa->VsrD(0)) ||
+                 float64_is_any_nan(xb->VsrD(0)))) {
         cc = CRF_SO;
     } else {
         if (exp_a < exp_b) {
@@ -2492,18 +2458,16 @@ void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode)
 
 void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xa, xb;
+    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     int64_t exp_a, exp_b;
     uint32_t cc;
 
-    getVSR(rA(opcode) + 32, &xa, env);
-    getVSR(rB(opcode) + 32, &xb, env);
+    exp_a = extract64(xa->VsrD(0), 48, 15);
+    exp_b = extract64(xb->VsrD(0), 48, 15);
 
-    exp_a = extract64(xa.VsrD(0), 48, 15);
-    exp_b = extract64(xb.VsrD(0), 48, 15);
-
-    if (unlikely(float128_is_any_nan(xa.f128) ||
-                 float128_is_any_nan(xb.f128))) {
+    if (unlikely(float128_is_any_nan(xa->f128) ||
+                 float128_is_any_nan(xb->f128))) {
         cc = CRF_SO;
     } else {
         if (exp_a < exp_b) {
@@ -2525,23 +2489,22 @@ void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode)
 #define VSX_SCALAR_CMP(op, ordered)                                      \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
 {                                                                        \
-    ppc_vsr_t xa, xb;                                                    \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                               \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                               \
     uint32_t cc = 0;                                                     \
     bool vxsnan_flag = false, vxvc_flag = false;                         \
                                                                          \
     helper_reset_fpstatus(env);                                          \
-    getVSR(xA(opcode), &xa, env);                                        \
-    getVSR(xB(opcode), &xb, env);                                        \
                                                                          \
-    if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||         \
-        float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {         \
+    if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||        \
+        float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {        \
         vxsnan_flag = true;                                              \
         cc = CRF_SO;                                                     \
         if (fpscr_ve == 0 && ordered) {                                  \
             vxvc_flag = true;                                            \
         }                                                                \
-    } else if (float64_is_quiet_nan(xa.VsrD(0), &env->fp_status) ||      \
-               float64_is_quiet_nan(xb.VsrD(0), &env->fp_status)) {      \
+    } else if (float64_is_quiet_nan(xa->VsrD(0), &env->fp_status) ||     \
+               float64_is_quiet_nan(xb->VsrD(0), &env->fp_status)) {     \
         cc = CRF_SO;                                                     \
         if (ordered) {                                                   \
             vxvc_flag = true;                                            \
@@ -2554,9 +2517,9 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
         float_invalid_op_vxvc(env, 0, GETPC());                          \
     }                                                                    \
                                                                          \
-    if (float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {           \
+    if (float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) {         \
         cc |= CRF_LT;                                                    \
-    } else if (!float64_le(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) {   \
+    } else if (!float64_le(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) { \
         cc |= CRF_GT;                                                    \
     } else {                                                             \
         cc |= CRF_EQ;                                                    \
@@ -2575,23 +2538,22 @@ VSX_SCALAR_CMP(xscmpudp, 0)
 #define VSX_SCALAR_CMPQ(op, ordered)                                    \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
 {                                                                       \
-    ppc_vsr_t xa, xb;                                                   \
+    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];                         \
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                         \
     uint32_t cc = 0;                                                    \
     bool vxsnan_flag = false, vxvc_flag = false;                        \
                                                                         \
     helper_reset_fpstatus(env);                                         \
-    getVSR(rA(opcode) + 32, &xa, env);                                  \
-    getVSR(rB(opcode) + 32, &xb, env);                                  \
                                                                         \
-    if (float128_is_signaling_nan(xa.f128, &env->fp_status) ||          \
-        float128_is_signaling_nan(xb.f128, &env->fp_status)) {          \
+    if (float128_is_signaling_nan(xa->f128, &env->fp_status) ||         \
+        float128_is_signaling_nan(xb->f128, &env->fp_status)) {         \
         vxsnan_flag = true;                                             \
         cc = CRF_SO;                                                    \
         if (fpscr_ve == 0 && ordered) {                                 \
             vxvc_flag = true;                                           \
         }                                                               \
-    } else if (float128_is_quiet_nan(xa.f128, &env->fp_status) ||       \
-               float128_is_quiet_nan(xb.f128, &env->fp_status)) {       \
+    } else if (float128_is_quiet_nan(xa->f128, &env->fp_status) ||      \
+               float128_is_quiet_nan(xb->f128, &env->fp_status)) {      \
         cc = CRF_SO;                                                    \
         if (ordered) {                                                  \
             vxvc_flag = true;                                           \
@@ -2604,9 +2566,9 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
         float_invalid_op_vxvc(env, 0, GETPC());                         \
     }                                                                   \
                                                                         \
-    if (float128_lt(xa.f128, xb.f128, &env->fp_status)) {               \
+    if (float128_lt(xa->f128, xb->f128, &env->fp_status)) {             \
         cc |= CRF_LT;                                                   \
-    } else if (!float128_le(xa.f128, xb.f128, &env->fp_status)) {       \
+    } else if (!float128_le(xa->f128, xb->f128, &env->fp_status)) {     \
         cc |= CRF_GT;                                                   \
     } else {                                                            \
         cc |= CRF_EQ;                                                   \
@@ -2633,22 +2595,19 @@ VSX_SCALAR_CMPQ(xscmpuqp, 0)
 #define VSX_MAX_MIN(name, op, nels, tp, fld)                                  \
 void helper_##name(CPUPPCState *env, uint32_t opcode)                         \
 {                                                                             \
-    ppc_vsr_t xt, xa, xb;                                                     \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                    \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     int i;                                                                    \
                                                                               \
-    getVSR(xA(opcode), &xa, env);                                             \
-    getVSR(xB(opcode), &xb, env);                                             \
-    getVSR(xT(opcode), &xt, env);                                             \
-                                                                              \
     for (i = 0; i < nels; i++) {                                              \
-        xt.fld = tp##_##op(xa.fld, xb.fld, &env->fp_status);                  \
-        if (unlikely(tp##_is_signaling_nan(xa.fld, &env->fp_status) ||        \
-                     tp##_is_signaling_nan(xb.fld, &env->fp_status))) {       \
+        xt->fld = tp##_##op(xa->fld, xb->fld, &env->fp_status);               \
+        if (unlikely(tp##_is_signaling_nan(xa->fld, &env->fp_status) ||       \
+                     tp##_is_signaling_nan(xb->fld, &env->fp_status))) {      \
             float_invalid_op_vxsnan(env, GETPC());                            \
         }                                                                     \
     }                                                                         \
                                                                               \
-    putVSR(xT(opcode), &xt, env);                                             \
     do_float_check_status(env, GETPC());                                      \
 }
 
@@ -2662,27 +2621,26 @@ VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
 #define VSX_MAX_MINC(name, max)                                               \
 void helper_##name(CPUPPCState *env, uint32_t opcode)                         \
 {                                                                             \
-    ppc_vsr_t xt, xa, xb;                                                     \
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                               \
+    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];                               \
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                               \
+    ppc_vsr_t r;                                                              \
     bool vxsnan_flag = false, vex_flag = false;                               \
                                                                               \
-    getVSR(rA(opcode) + 32, &xa, env);                                        \
-    getVSR(rB(opcode) + 32, &xb, env);                                        \
-    getVSR(rD(opcode) + 32, &xt, env);                                        \
-                                                                              \
-    if (unlikely(float64_is_any_nan(xa.VsrD(0)) ||                            \
-                 float64_is_any_nan(xb.VsrD(0)))) {                           \
-        if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status) ||          \
-            float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {          \
+    if (unlikely(float64_is_any_nan(xa->VsrD(0)) ||                           \
+                 float64_is_any_nan(xb->VsrD(0)))) {                          \
+        if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||         \
+            float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {         \
             vxsnan_flag = true;                                               \
         }                                                                     \
-        xt.VsrD(0) = xb.VsrD(0);                                              \
+        r.VsrD(0) = xb->VsrD(0);                                              \
     } else if ((max &&                                                        \
-               !float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) ||       \
+               !float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) ||     \
                (!max &&                                                       \
-               float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status))) {        \
-        xt.VsrD(0) = xa.VsrD(0);                                              \
+               float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status))) {      \
+        r.VsrD(0) = xa->VsrD(0);                                              \
     } else {                                                                  \
-        xt.VsrD(0) = xb.VsrD(0);                                              \
+        r.VsrD(0) = xb->VsrD(0);                                              \
     }                                                                         \
                                                                               \
     vex_flag = fpscr_ve & vxsnan_flag;                                        \
@@ -2690,7 +2648,7 @@ void helper_##name(CPUPPCState *env, uint32_t opcode)                         \
         float_invalid_op_vxsnan(env, GETPC());                                \
     }                                                                         \
     if (!vex_flag) {                                                          \
-        putVSR(rD(opcode) + 32, &xt, env);                                    \
+        memcpy(xt, &r, sizeof(ppc_vsr_t));                                    \
     }                                                                         \
 }                                                                             \
 
@@ -2700,44 +2658,46 @@ VSX_MAX_MINC(xsmincdp, 0);
 #define VSX_MAX_MINJ(name, max)                                               \
 void helper_##name(CPUPPCState *env, uint32_t opcode)                         \
 {                                                                             \
-    ppc_vsr_t xt, xa, xb;                                                     \
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                               \
+    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];                               \
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                               \
+    ppc_vsr_t r;                                                              \
     bool vxsnan_flag = false, vex_flag = false;                               \
                                                                               \
-    getVSR(rA(opcode) + 32, &xa, env);                                        \
-    getVSR(rB(opcode) + 32, &xb, env);                                        \
-    getVSR(rD(opcode) + 32, &xt, env);                                        \
-                                                                              \
-    if (unlikely(float64_is_any_nan(xa.VsrD(0)))) {                           \
-        if (float64_is_signaling_nan(xa.VsrD(0), &env->fp_status)) {          \
+    if (unlikely(float64_is_any_nan(xa->VsrD(0)))) {                          \
+        if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status)) {         \
             vxsnan_flag = true;                                               \
         }                                                                     \
-        xt.VsrD(0) = xa.VsrD(0);                                              \
-    } else if (unlikely(float64_is_any_nan(xb.VsrD(0)))) {                    \
-        if (float64_is_signaling_nan(xb.VsrD(0), &env->fp_status)) {          \
+        r.VsrD(0) = xa->VsrD(0);                                              \
+    } else if (unlikely(float64_is_any_nan(xb->VsrD(0)))) {                   \
+        if (float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) {         \
             vxsnan_flag = true;                                               \
         }                                                                     \
-        xt.VsrD(0) = xb.VsrD(0);                                              \
-    } else if (float64_is_zero(xa.VsrD(0)) && float64_is_zero(xb.VsrD(0))) {  \
+        r.VsrD(0) = xb->VsrD(0);                                              \
+    } else if (float64_is_zero(xa->VsrD(0)) &&                                \
+               float64_is_zero(xb->VsrD(0))) {                                \
         if (max) {                                                            \
-            if (!float64_is_neg(xa.VsrD(0)) || !float64_is_neg(xb.VsrD(0))) { \
-                xt.VsrD(0) = 0ULL;                                            \
+            if (!float64_is_neg(xa->VsrD(0)) ||                               \
+                !float64_is_neg(xb->VsrD(0))) {                               \
+                r.VsrD(0) = 0ULL;                                             \
             } else {                                                          \
-                xt.VsrD(0) = 0x8000000000000000ULL;                           \
+                r.VsrD(0) = 0x8000000000000000ULL;                            \
             }                                                                 \
         } else {                                                              \
-            if (float64_is_neg(xa.VsrD(0)) || float64_is_neg(xb.VsrD(0))) {   \
-                xt.VsrD(0) = 0x8000000000000000ULL;                           \
+            if (float64_is_neg(xa->VsrD(0)) ||                                \
+                float64_is_neg(xb->VsrD(0))) {                                \
+                r.VsrD(0) = 0x8000000000000000ULL;                            \
             } else {                                                          \
-                xt.VsrD(0) = 0ULL;                                            \
+                r.VsrD(0) = 0ULL;                                             \
             }                                                                 \
         }                                                                     \
     } else if ((max &&                                                        \
-               !float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status)) ||       \
+               !float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) ||     \
                (!max &&                                                       \
-               float64_lt(xa.VsrD(0), xb.VsrD(0), &env->fp_status))) {        \
-        xt.VsrD(0) = xa.VsrD(0);                                              \
+               float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status))) {      \
+        r.VsrD(0) = xa->VsrD(0);                                              \
     } else {                                                                  \
-        xt.VsrD(0) = xb.VsrD(0);                                              \
+        r.VsrD(0) = xb->VsrD(0);                                              \
     }                                                                         \
                                                                               \
     vex_flag = fpscr_ve & vxsnan_flag;                                        \
@@ -2745,7 +2705,7 @@ void helper_##name(CPUPPCState *env, uint32_t opcode)                         \
         float_invalid_op_vxsnan(env, GETPC());                                \
     }                                                                         \
     if (!vex_flag) {                                                          \
-        putVSR(rD(opcode) + 32, &xt, env);                                    \
+        memcpy(xt, &r, sizeof(ppc_vsr_t));                                    \
     }                                                                         \
 }                                                                             \
 
@@ -2765,39 +2725,36 @@ VSX_MAX_MINJ(xsminjdp, 0);
 #define VSX_CMP(op, nels, tp, fld, cmp, svxvc, exp)                       \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
 {                                                                         \
-    ppc_vsr_t xt, xa, xb;                                                 \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                \
     int i;                                                                \
     int all_true = 1;                                                     \
     int all_false = 1;                                                    \
                                                                           \
-    getVSR(xA(opcode), &xa, env);                                         \
-    getVSR(xB(opcode), &xb, env);                                         \
-    getVSR(xT(opcode), &xt, env);                                         \
-                                                                          \
     for (i = 0; i < nels; i++) {                                          \
-        if (unlikely(tp##_is_any_nan(xa.fld) ||                           \
-                     tp##_is_any_nan(xb.fld))) {                          \
-            if (tp##_is_signaling_nan(xa.fld, &env->fp_status) ||         \
-                tp##_is_signaling_nan(xb.fld, &env->fp_status)) {         \
+        if (unlikely(tp##_is_any_nan(xa->fld) ||                          \
+                     tp##_is_any_nan(xb->fld))) {                         \
+            if (tp##_is_signaling_nan(xa->fld, &env->fp_status) ||        \
+                tp##_is_signaling_nan(xb->fld, &env->fp_status)) {        \
                 float_invalid_op_vxsnan(env, GETPC());                    \
             }                                                             \
             if (svxvc) {                                                  \
                 float_invalid_op_vxvc(env, 0, GETPC());                   \
             }                                                             \
-            xt.fld = 0;                                                   \
+            xt->fld = 0;                                                  \
             all_true = 0;                                                 \
         } else {                                                          \
-            if (tp##_##cmp(xb.fld, xa.fld, &env->fp_status) == exp) {     \
-                xt.fld = -1;                                              \
+            if (tp##_##cmp(xb->fld, xa->fld, &env->fp_status) == exp) {   \
+                xt->fld = -1;                                             \
                 all_false = 0;                                            \
             } else {                                                      \
-                xt.fld = 0;                                               \
+                xt->fld = 0;                                              \
                 all_true = 0;                                             \
             }                                                             \
         }                                                                 \
     }                                                                     \
                                                                           \
-    putVSR(xT(opcode), &xt, env);                                         \
     if ((opcode >> (31 - 21)) & 1) {                                      \
         env->crf[6] = (all_true ? 0x8 : 0) | (all_false ? 0x2 : 0);       \
     }                                                                     \
@@ -2826,25 +2783,22 @@ VSX_CMP(xvcmpnesp, 4, float32, VsrW(i), eq, 0, 0)
 #define VSX_CVT_FP_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf)    \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                \
 {                                                                  \
-    ppc_vsr_t xt, xb;                                              \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                         \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                         \
     int i;                                                         \
                                                                    \
-    getVSR(xB(opcode), &xb, env);                                  \
-    getVSR(xT(opcode), &xt, env);                                  \
-                                                                   \
     for (i = 0; i < nels; i++) {                                   \
-        xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);        \
-        if (unlikely(stp##_is_signaling_nan(xb.sfld,               \
+        xt->tfld = stp##_to_##ttp(xb->sfld, &env->fp_status);      \
+        if (unlikely(stp##_is_signaling_nan(xb->sfld,              \
                                             &env->fp_status))) {   \
             float_invalid_op_vxsnan(env, GETPC());                 \
-            xt.tfld = ttp##_snan_to_qnan(xt.tfld);                 \
+            xt->tfld = ttp##_snan_to_qnan(xt->tfld);               \
         }                                                          \
         if (sfprf) {                                               \
-            helper_compute_fprf_##ttp(env, xt.tfld);               \
+            helper_compute_fprf_##ttp(env, xt->tfld);              \
         }                                                          \
     }                                                              \
                                                                    \
-    putVSR(xT(opcode), &xt, env);                                  \
     do_float_check_status(env, GETPC());                           \
 }
 
@@ -2866,25 +2820,22 @@ VSX_CVT_FP_TO_FP(xvcvspdp, 2, float32, float64, VsrW(2 * i), VsrD(i), 0)
 #define VSX_CVT_FP_TO_FP_VECTOR(op, nels, stp, ttp, sfld, tfld, sfprf)    \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
 {                                                                       \
-    ppc_vsr_t xt, xb;                                                   \
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                         \
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                         \
     int i;                                                              \
                                                                         \
-    getVSR(rB(opcode) + 32, &xb, env);                                  \
-    getVSR(rD(opcode) + 32, &xt, env);                                  \
-                                                                        \
     for (i = 0; i < nels; i++) {                                        \
-        xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);             \
-        if (unlikely(stp##_is_signaling_nan(xb.sfld,                    \
+        xt->tfld = stp##_to_##ttp(xb->sfld, &env->fp_status);           \
+        if (unlikely(stp##_is_signaling_nan(xb->sfld,                   \
                                             &env->fp_status))) {        \
             float_invalid_op_vxsnan(env, GETPC());                      \
-            xt.tfld = ttp##_snan_to_qnan(xt.tfld);                      \
+            xt->tfld = ttp##_snan_to_qnan(xt->tfld);                    \
         }                                                               \
         if (sfprf) {                                                    \
-            helper_compute_fprf_##ttp(env, xt.tfld);                    \
+            helper_compute_fprf_##ttp(env, xt->tfld);                   \
         }                                                               \
     }                                                                   \
                                                                         \
-    putVSR(rD(opcode) + 32, &xt, env);                                  \
     do_float_check_status(env, GETPC());                                \
 }
 
@@ -2904,25 +2855,24 @@ VSX_CVT_FP_TO_FP_VECTOR(xscvdpqp, 1, float64, float128, VsrD(0), f128, 1)
 #define VSX_CVT_FP_TO_FP_HP(op, nels, stp, ttp, sfld, tfld, sfprf) \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                \
 {                                                                  \
-    ppc_vsr_t xt, xb;                                              \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                         \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                         \
     int i;                                                         \
                                                                    \
-    getVSR(xB(opcode), &xb, env);                                  \
-    memset(&xt, 0, sizeof(xt));                                    \
+    memset(xt, 0, sizeof(ppc_vsr_t));                              \
                                                                    \
     for (i = 0; i < nels; i++) {                                   \
-        xt.tfld = stp##_to_##ttp(xb.sfld, 1, &env->fp_status);     \
-        if (unlikely(stp##_is_signaling_nan(xb.sfld,               \
+        xt->tfld = stp##_to_##ttp(xb->sfld, 1, &env->fp_status);   \
+        if (unlikely(stp##_is_signaling_nan(xb->sfld,              \
                                             &env->fp_status))) {   \
             float_invalid_op_vxsnan(env, GETPC());                 \
-            xt.tfld = ttp##_snan_to_qnan(xt.tfld);                 \
+            xt->tfld = ttp##_snan_to_qnan(xt->tfld);               \
         }                                                          \
         if (sfprf) {                                               \
-            helper_compute_fprf_##ttp(env, xt.tfld);               \
+            helper_compute_fprf_##ttp(env, xt->tfld);              \
         }                                                          \
     }                                                              \
                                                                    \
-    putVSR(xT(opcode), &xt, env);                                  \
     do_float_check_status(env, GETPC());                           \
 }
 
@@ -2937,26 +2887,25 @@ VSX_CVT_FP_TO_FP_HP(xvcvhpsp, 4, float16, float32, VsrH(2 * i + 1), VsrW(i), 0)
  */
 void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xt, xb;
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];
     float_status tstat;
 
-    getVSR(rB(opcode) + 32, &xb, env);
-    memset(&xt, 0, sizeof(xt));
+    memset(xt, 0, sizeof(ppc_vsr_t));
 
     tstat = env->fp_status;
     if (unlikely(Rc(opcode) != 0)) {
         tstat.float_rounding_mode = float_round_to_odd;
     }
 
-    xt.VsrD(0) = float128_to_float64(xb.f128, &tstat);
+    xt->VsrD(0) = float128_to_float64(xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
-    if (unlikely(float128_is_signaling_nan(xb.f128, &tstat))) {
+    if (unlikely(float128_is_signaling_nan(xb->f128, &tstat))) {
         float_invalid_op_vxsnan(env, GETPC());
-        xt.VsrD(0) = float64_snan_to_qnan(xt.VsrD(0));
+        xt->VsrD(0) = float64_snan_to_qnan(xt->VsrD(0));
     }
-    helper_compute_fprf_float64(env, xt.VsrD(0));
+    helper_compute_fprf_float64(env, xt->VsrD(0));
 
-    putVSR(rD(opcode) + 32, &xt, env);
     do_float_check_status(env, GETPC());
 }
 
@@ -2990,24 +2939,22 @@ uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)
 void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
 {                                                                            \
     int all_flags = env->fp_status.float_exception_flags, flags;             \
-    ppc_vsr_t xt, xb;                                                        \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
-    getVSR(xB(opcode), &xb, env);                                            \
-    getVSR(xT(opcode), &xt, env);                                            \
-                                                                             \
     for (i = 0; i < nels; i++) {                                             \
         env->fp_status.float_exception_flags = 0;                            \
-        xt.tfld = stp##_to_##ttp##_round_to_zero(xb.sfld, &env->fp_status);  \
+        xt->tfld = stp##_to_##ttp##_round_to_zero(xb->sfld,                  \
+                                                  &env->fp_status);          \
         flags = env->fp_status.float_exception_flags;                        \
         if (unlikely(flags & float_flag_invalid)) {                          \
-            float_invalid_cvt(env, 0, GETPC(), stp##_classify(xb.sfld));     \
-            xt.tfld = rnan;                                                  \
+            float_invalid_cvt(env, 0, GETPC(), stp##_classify(xb->sfld));    \
+            xt->tfld = rnan;                                                  \
         }                                                                    \
         all_flags |= flags;                                                  \
     }                                                                        \
                                                                              \
-    putVSR(xT(opcode), &xt, env);                                            \
     env->fp_status.float_exception_flags = all_flags;                        \
     do_float_check_status(env, GETPC());                                     \
 }
@@ -3042,18 +2989,17 @@ VSX_CVT_FP_TO_INT(xvcvspuxws, 4, float32, uint32, VsrW(i), VsrW(i), 0U)
 #define VSX_CVT_FP_TO_INT_VECTOR(op, stp, ttp, sfld, tfld, rnan)             \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
 {                                                                            \
-    ppc_vsr_t xt, xb;                                                        \
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                              \
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                              \
                                                                              \
-    getVSR(rB(opcode) + 32, &xb, env);                                       \
-    memset(&xt, 0, sizeof(xt));                                              \
+    memset(xt, 0, sizeof(ppc_vsr_t));                                        \
                                                                              \
-    xt.tfld = stp##_to_##ttp##_round_to_zero(xb.sfld, &env->fp_status);      \
+    xt->tfld = stp##_to_##ttp##_round_to_zero(xb->sfld, &env->fp_status);    \
     if (env->fp_status.float_exception_flags & float_flag_invalid) {         \
-        float_invalid_cvt(env, 0, GETPC(), stp##_classify(xb.sfld));         \
-        xt.tfld = rnan;                                                      \
+        float_invalid_cvt(env, 0, GETPC(), stp##_classify(xb->sfld));        \
+        xt->tfld = rnan;                                                     \
     }                                                                        \
                                                                              \
-    putVSR(rD(opcode) + 32, &xt, env);                                       \
     do_float_check_status(env, GETPC());                                     \
 }
 
@@ -3079,23 +3025,20 @@ VSX_CVT_FP_TO_INT_VECTOR(xscvqpuwz, float128, uint32, f128, VsrD(0), 0x0ULL)
 #define VSX_CVT_INT_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf, r2sp)  \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
 {                                                                       \
-    ppc_vsr_t xt, xb;                                                   \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                              \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                              \
     int i;                                                              \
                                                                         \
-    getVSR(xB(opcode), &xb, env);                                       \
-    getVSR(xT(opcode), &xt, env);                                       \
-                                                                        \
     for (i = 0; i < nels; i++) {                                        \
-        xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);             \
+        xt->tfld = stp##_to_##ttp(xb->sfld, &env->fp_status);           \
         if (r2sp) {                                                     \
-            xt.tfld = helper_frsp(env, xt.tfld);                        \
+            xt->tfld = helper_frsp(env, xt->tfld);                      \
         }                                                               \
         if (sfprf) {                                                    \
-            helper_compute_fprf_float64(env, xt.tfld);                  \
+            helper_compute_fprf_float64(env, xt->tfld);                 \
         }                                                               \
     }                                                                   \
                                                                         \
-    putVSR(xT(opcode), &xt, env);                                       \
     do_float_check_status(env, GETPC());                                \
 }
 
@@ -3123,15 +3066,12 @@ VSX_CVT_INT_TO_FP(xvcvuxwsp, 4, uint32, float32, VsrW(i), VsrW(i), 0, 0)
 #define VSX_CVT_INT_TO_FP_VECTOR(op, stp, ttp, sfld, tfld)              \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
 {                                                                       \
-    ppc_vsr_t xt, xb;                                                   \
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                         \
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                         \
                                                                         \
-    getVSR(rB(opcode) + 32, &xb, env);                                  \
-    getVSR(rD(opcode) + 32, &xt, env);                                  \
+    xt->tfld = stp##_to_##ttp(xb->sfld, &env->fp_status);               \
+    helper_compute_fprf_##ttp(env, xt->tfld);                           \
                                                                         \
-    xt.tfld = stp##_to_##ttp(xb.sfld, &env->fp_status);                 \
-    helper_compute_fprf_##ttp(env, xt.tfld);                            \
-                                                                        \
-    putVSR(xT(opcode) + 32, &xt, env);                                  \
     do_float_check_status(env, GETPC());                                \
 }
 
@@ -3157,25 +3097,24 @@ VSX_CVT_INT_TO_FP_VECTOR(xscvudqp, uint64, float128, VsrD(0), f128)
 #define VSX_ROUND(op, nels, tp, fld, rmode, sfprf)                     \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                    \
 {                                                                      \
-    ppc_vsr_t xt, xb;                                                  \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                             \
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                             \
     int i;                                                             \
-    getVSR(xB(opcode), &xb, env);                                      \
-    getVSR(xT(opcode), &xt, env);                                      \
                                                                        \
     if (rmode != FLOAT_ROUND_CURRENT) {                                \
         set_float_rounding_mode(rmode, &env->fp_status);               \
     }                                                                  \
                                                                        \
     for (i = 0; i < nels; i++) {                                       \
-        if (unlikely(tp##_is_signaling_nan(xb.fld,                     \
+        if (unlikely(tp##_is_signaling_nan(xb->fld,                    \
                                            &env->fp_status))) {        \
             float_invalid_op_vxsnan(env, GETPC());                     \
-            xt.fld = tp##_snan_to_qnan(xb.fld);                        \
+            xt->fld = tp##_snan_to_qnan(xb->fld);                      \
         } else {                                                       \
-            xt.fld = tp##_round_to_int(xb.fld, &env->fp_status);       \
+            xt->fld = tp##_round_to_int(xb->fld, &env->fp_status);     \
         }                                                              \
         if (sfprf) {                                                   \
-            helper_compute_fprf_float64(env, xt.fld);                  \
+            helper_compute_fprf_float64(env, xt->fld);                 \
         }                                                              \
     }                                                                  \
                                                                        \
@@ -3189,7 +3128,6 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                    \
         env->fp_status.float_exception_flags &= ~float_flag_inexact;   \
     }                                                                  \
                                                                        \
-    putVSR(xT(opcode), &xt, env);                                      \
     do_float_check_status(env, GETPC());                               \
 }
 
@@ -3225,21 +3163,19 @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
 #define VSX_XXPERM(op, indexed)                                       \
 void helper_##op(CPUPPCState *env, uint32_t opcode)                   \
 {                                                                     \
-    ppc_vsr_t xt, xa, pcv, xto;                                       \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                            \
+    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                            \
+    ppc_vsr_t *pcv = &env->vsr[xB(opcode)];                           \
     int i, idx;                                                       \
                                                                       \
-    getVSR(xA(opcode), &xa, env);                                     \
-    getVSR(xT(opcode), &xt, env);                                     \
-    getVSR(xB(opcode), &pcv, env);                                    \
-                                                                      \
     for (i = 0; i < 16; i++) {                                        \
-        idx = pcv.VsrB(i) & 0x1F;                                     \
+        idx = pcv->VsrB(i) & 0x1F;                                    \
         if (indexed) {                                                \
             idx = 31 - idx;                                           \
         }                                                             \
-        xto.VsrB(i) = (idx <= 15) ? xa.VsrB(idx) : xt.VsrB(idx - 16); \
+        xt->VsrB(i) = (idx <= 15) ? xa->VsrB(idx)                     \
+                                  : xt->VsrB(idx - 16);               \
     }                                                                 \
-    putVSR(xT(opcode), &xto, env);                                    \
 }
 
 VSX_XXPERM(xxperm, 0)
@@ -3247,22 +3183,21 @@ VSX_XXPERM(xxpermr, 1)
 
 void helper_xvxsigsp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xt, xb;
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];
     uint32_t exp, i, fraction;
 
-    getVSR(xB(opcode), &xb, env);
-    memset(&xt, 0, sizeof(xt));
+    memset(xt, 0, sizeof(ppc_vsr_t));
 
     for (i = 0; i < 4; i++) {
-        exp = (xb.VsrW(i) >> 23) & 0xFF;
-        fraction = xb.VsrW(i) & 0x7FFFFF;
+        exp = (xb->VsrW(i) >> 23) & 0xFF;
+        fraction = xb->VsrW(i) & 0x7FFFFF;
         if (exp != 0 && exp != 255) {
-            xt.VsrW(i) = fraction | 0x00800000;
+            xt->VsrW(i) = fraction | 0x00800000;
         } else {
-            xt.VsrW(i) = fraction;
+            xt->VsrW(i) = fraction;
         }
     }
-    putVSR(xT(opcode), &xt, env);
 }
 
 /*
@@ -3279,27 +3214,28 @@ void helper_xvxsigsp(CPUPPCState *env, uint32_t opcode)
 #define VSX_TEST_DC(op, nels, xbn, tp, fld, tfld, fld_max, scrf)  \
 void helper_##op(CPUPPCState *env, uint32_t opcode)         \
 {                                                           \
-    ppc_vsr_t xt, xb;                                       \
+    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                  \
+    ppc_vsr_t *xb = &env->vsr[xbn];                         \
+    ppc_vsr_t r;                                            \
     uint32_t i, sign, dcmx;                                 \
     uint32_t cc, match = 0;                                 \
                                                             \
-    getVSR(xbn, &xb, env);                                  \
     if (!scrf) {                                            \
-        memset(&xt, 0, sizeof(xt));                         \
+        memset(xt, 0, sizeof(ppc_vsr_t));                   \
         dcmx = DCMX_XV(opcode);                             \
     } else {                                                \
         dcmx = DCMX(opcode);                                \
     }                                                       \
                                                             \
     for (i = 0; i < nels; i++) {                            \
-        sign = tp##_is_neg(xb.fld);                         \
-        if (tp##_is_any_nan(xb.fld)) {                      \
+        sign = tp##_is_neg(xb->fld);                        \
+        if (tp##_is_any_nan(xb->fld)) {                     \
             match = extract32(dcmx, 6, 1);                  \
-        } else if (tp##_is_infinity(xb.fld)) {              \
+        } else if (tp##_is_infinity(xb->fld)) {             \
             match = extract32(dcmx, 4 + !sign, 1);          \
-        } else if (tp##_is_zero(xb.fld)) {                  \
+        } else if (tp##_is_zero(xb->fld)) {                 \
             match = extract32(dcmx, 2 + !sign, 1);          \
-        } else if (tp##_is_zero_or_denormal(xb.fld)) {      \
+        } else if (tp##_is_zero_or_denormal(xb->fld)) {     \
             match = extract32(dcmx, 0 + !sign, 1);          \
         }                                                   \
                                                             \
@@ -3309,12 +3245,12 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)         \
             env->fpscr |= cc << FPSCR_FPRF;                 \
             env->crf[BF(opcode)] = cc;                      \
         } else {                                            \
-            xt.tfld = match ? fld_max : 0;                  \
+            r.tfld = match ? fld_max : 0;                   \
         }                                                   \
         match = 0;                                          \
     }                                                       \
     if (!scrf) {                                            \
-        putVSR(xT(opcode), &xt, env);                       \
+        memcpy(xt, &r, sizeof(ppc_vsr_t));                  \
     }                                                       \
 }
 
@@ -3325,29 +3261,28 @@ VSX_TEST_DC(xststdcqp, 1, (rB(opcode) + 32), float128, f128, VsrD(0), 0, 1)
 
 void helper_xststdcsp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xb;
+    ppc_vsr_t *xb = &env->vsr[xB(opcode)];
     uint32_t dcmx, sign, exp;
     uint32_t cc, match = 0, not_sp = 0;
 
-    getVSR(xB(opcode), &xb, env);
     dcmx = DCMX(opcode);
-    exp = (xb.VsrD(0) >> 52) & 0x7FF;
+    exp = (xb->VsrD(0) >> 52) & 0x7FF;
 
-    sign = float64_is_neg(xb.VsrD(0));
-    if (float64_is_any_nan(xb.VsrD(0))) {
+    sign = float64_is_neg(xb->VsrD(0));
+    if (float64_is_any_nan(xb->VsrD(0))) {
         match = extract32(dcmx, 6, 1);
-    } else if (float64_is_infinity(xb.VsrD(0))) {
+    } else if (float64_is_infinity(xb->VsrD(0))) {
         match = extract32(dcmx, 4 + !sign, 1);
-    } else if (float64_is_zero(xb.VsrD(0))) {
+    } else if (float64_is_zero(xb->VsrD(0))) {
         match = extract32(dcmx, 2 + !sign, 1);
-    } else if (float64_is_zero_or_denormal(xb.VsrD(0)) ||
+    } else if (float64_is_zero_or_denormal(xb->VsrD(0)) ||
                (exp > 0 && exp < 0x381)) {
         match = extract32(dcmx, 0 + !sign, 1);
     }
 
-    not_sp = !float64_eq(xb.VsrD(0),
+    not_sp = !float64_eq(xb->VsrD(0),
                          float32_to_float64(
-                             float64_to_float32(xb.VsrD(0), &env->fp_status),
+                             float64_to_float32(xb->VsrD(0), &env->fp_status),
                              &env->fp_status), &env->fp_status);
 
     cc = sign << CRF_LT_BIT | match << CRF_EQ_BIT | not_sp << CRF_SO_BIT;
@@ -3358,16 +3293,15 @@ void helper_xststdcsp(CPUPPCState *env, uint32_t opcode)
 
 void helper_xsrqpi(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xb;
-    ppc_vsr_t xt;
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     uint8_t r = Rrm(opcode);
     uint8_t ex = Rc(opcode);
     uint8_t rmc = RMC(opcode);
     uint8_t rmode = 0;
     float_status tstat;
 
-    getVSR(rB(opcode) + 32, &xb, env);
-    memset(&xt, 0, sizeof(xt));
+    memset(xt, 0, sizeof(ppc_vsr_t));
     helper_reset_fpstatus(env);
 
     if (r == 0 && rmc == 0) {
@@ -3396,13 +3330,13 @@ void helper_xsrqpi(CPUPPCState *env, uint32_t opcode)
     tstat = env->fp_status;
     set_float_exception_flags(0, &tstat);
     set_float_rounding_mode(rmode, &tstat);
-    xt.f128 = float128_round_to_int(xb.f128, &tstat);
+    xt->f128 = float128_round_to_int(xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
 
     if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
-        if (float128_is_signaling_nan(xb.f128, &tstat)) {
+        if (float128_is_signaling_nan(xb->f128, &tstat)) {
             float_invalid_op_vxsnan(env, GETPC());
-            xt.f128 = float128_snan_to_qnan(xt.f128);
+            xt->f128 = float128_snan_to_qnan(xt->f128);
         }
     }
 
@@ -3410,23 +3344,21 @@ void helper_xsrqpi(CPUPPCState *env, uint32_t opcode)
         env->fp_status.float_exception_flags &= ~float_flag_inexact;
     }
 
-    helper_compute_fprf_float128(env, xt.f128);
+    helper_compute_fprf_float128(env, xt->f128);
     do_float_check_status(env, GETPC());
-    putVSR(rD(opcode) + 32, &xt, env);
 }
 
 void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xb;
-    ppc_vsr_t xt;
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     uint8_t r = Rrm(opcode);
     uint8_t rmc = RMC(opcode);
     uint8_t rmode = 0;
     floatx80 round_res;
     float_status tstat;
 
-    getVSR(rB(opcode) + 32, &xb, env);
-    memset(&xt, 0, sizeof(xt));
+    memset(xt, 0, sizeof(ppc_vsr_t));
     helper_reset_fpstatus(env);
 
     if (r == 0 && rmc == 0) {
@@ -3455,30 +3387,28 @@ void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode)
     tstat = env->fp_status;
     set_float_exception_flags(0, &tstat);
     set_float_rounding_mode(rmode, &tstat);
-    round_res = float128_to_floatx80(xb.f128, &tstat);
-    xt.f128 = floatx80_to_float128(round_res, &tstat);
+    round_res = float128_to_floatx80(xb->f128, &tstat);
+    xt->f128 = floatx80_to_float128(round_res, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
 
     if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
-        if (float128_is_signaling_nan(xb.f128, &tstat)) {
+        if (float128_is_signaling_nan(xb->f128, &tstat)) {
             float_invalid_op_vxsnan(env, GETPC());
-            xt.f128 = float128_snan_to_qnan(xt.f128);
+            xt->f128 = float128_snan_to_qnan(xt->f128);
         }
     }
 
-    helper_compute_fprf_float128(env, xt.f128);
-    putVSR(rD(opcode) + 32, &xt, env);
+    helper_compute_fprf_float128(env, xt->f128);
     do_float_check_status(env, GETPC());
 }
 
 void helper_xssqrtqp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xb;
-    ppc_vsr_t xt;
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
-    getVSR(rB(opcode) + 32, &xb, env);
-    memset(&xt, 0, sizeof(xt));
+    memset(xt, 0, sizeof(ppc_vsr_t));
     helper_reset_fpstatus(env);
 
     tstat = env->fp_status;
@@ -3487,34 +3417,32 @@ void helper_xssqrtqp(CPUPPCState *env, uint32_t opcode)
     }
 
     set_float_exception_flags(0, &tstat);
-    xt.f128 = float128_sqrt(xb.f128, &tstat);
+    xt->f128 = float128_sqrt(xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
 
     if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
-        if (float128_is_signaling_nan(xb.f128, &tstat)) {
+        if (float128_is_signaling_nan(xb->f128, &tstat)) {
             float_invalid_op_vxsnan(env, GETPC());
-            xt.f128 = float128_snan_to_qnan(xb.f128);
-        } else if  (float128_is_quiet_nan(xb.f128, &tstat)) {
-            xt.f128 = xb.f128;
-        } else if (float128_is_neg(xb.f128) && !float128_is_zero(xb.f128)) {
+            xt->f128 = float128_snan_to_qnan(xb->f128);
+        } else if (float128_is_quiet_nan(xb->f128, &tstat)) {
+            xt->f128 = xb->f128;
+        } else if (float128_is_neg(xb->f128) && !float128_is_zero(xb->f128)) {
             float_invalid_op_vxsqrt(env, 1, GETPC());
-            xt.f128 = float128_default_nan(&env->fp_status);
+            xt->f128 = float128_default_nan(&env->fp_status);
         }
     }
 
-    helper_compute_fprf_float128(env, xt.f128);
-    putVSR(rD(opcode) + 32, &xt, env);
+    helper_compute_fprf_float128(env, xt->f128);
     do_float_check_status(env, GETPC());
 }
 
 void helper_xssubqp(CPUPPCState *env, uint32_t opcode)
 {
-    ppc_vsr_t xt, xa, xb;
+    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
+    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
+    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
-    getVSR(rA(opcode) + 32, &xa, env);
-    getVSR(rB(opcode) + 32, &xb, env);
-    getVSR(rD(opcode) + 32, &xt, env);
     helper_reset_fpstatus(env);
 
     tstat = env->fp_status;
@@ -3523,16 +3451,15 @@ void helper_xssubqp(CPUPPCState *env, uint32_t opcode)
     }
 
     set_float_exception_flags(0, &tstat);
-    xt.f128 = float128_sub(xa.f128, xb.f128, &tstat);
+    xt->f128 = float128_sub(xa->f128, xb->f128, &tstat);
     env->fp_status.float_exception_flags |= tstat.float_exception_flags;
 
     if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
         float_invalid_op_addsub(env, 1, GETPC(),
-                                float128_classify(xa.f128) |
-                                float128_classify(xb.f128));
+                                float128_classify(xa->f128) |
+                                float128_classify(xb->f128));
     }
 
-    helper_compute_fprf_float128(env, xt.f128);
-    putVSR(rD(opcode) + 32, &xt, env);
+    helper_compute_fprf_float128(env, xt->f128);
     do_float_check_status(env, GETPC());
 }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:29   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 03/14] target/ppc: remove getVSR()/putVSR() from int_helper.c Mark Cave-Ayland
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Since commit 8a14d31b00 "target/ppc: switch fpr/vsrl registers so all VSX
registers are in host endian order" functions getVSR() and putVSR() which used
to convert the VSR registers into host endian order are no longer required.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/mem_helper.c             | 24 +++++++++++-------------
 target/ppc/translate/vsx-impl.inc.c |  8 +++++---
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/target/ppc/mem_helper.c b/target/ppc/mem_helper.c
index 5b0f9ee50d..4dfa7ee23f 100644
--- a/target/ppc/mem_helper.c
+++ b/target/ppc/mem_helper.c
@@ -415,28 +415,27 @@ STVE(stvewx, cpu_stl_data_ra, bswap32, u32)
 
 #define VSX_LXVL(name, lj)                                              \
 void helper_##name(CPUPPCState *env, target_ulong addr,                 \
-                   target_ulong xt_num, target_ulong rb)                \
+                   target_ulong xt, target_ulong rb)                    \
 {                                                                       \
+    ppc_vsr_t *r = &env->vsr[xt];                                       \
+    int nb = GET_NB(env->gpr[rb]);                                      \
     int i;                                                              \
-    ppc_vsr_t xt;                                                       \
-    uint64_t nb = GET_NB(rb);                                           \
                                                                         \
-    xt.s128 = int128_zero();                                            \
+    r->s128 = int128_zero();                                            \
     if (nb) {                                                           \
         nb = (nb >= 16) ? 16 : nb;                                      \
         if (msr_le && !lj) {                                            \
             for (i = 16; i > 16 - nb; i--) {                            \
-                xt.VsrB(i - 1) = cpu_ldub_data_ra(env, addr, GETPC());  \
+                r->VsrB(i - 1) = cpu_ldub_data_ra(env, addr, GETPC());  \
                 addr = addr_add(env, addr, 1);                          \
             }                                                           \
         } else {                                                        \
             for (i = 0; i < nb; i++) {                                  \
-                xt.VsrB(i) = cpu_ldub_data_ra(env, addr, GETPC());      \
+                r->VsrB(i) = cpu_ldub_data_ra(env, addr, GETPC());      \
                 addr = addr_add(env, addr, 1);                          \
             }                                                           \
         }                                                               \
     }                                                                   \
-    putVSR(xt_num, &xt, env);                                           \
 }
 
 VSX_LXVL(lxvl, 0)
@@ -445,25 +444,24 @@ VSX_LXVL(lxvll, 1)
 
 #define VSX_STXVL(name, lj)                                       \
 void helper_##name(CPUPPCState *env, target_ulong addr,           \
-                   target_ulong xt_num, target_ulong rb)          \
+                   target_ulong xt, target_ulong rb)              \
 {                                                                 \
+    ppc_vsr_t *r = &env->vsr[xt];                                 \
+    int nb = GET_NB(env->gpr[rb]);                                \
     int i;                                                        \
-    ppc_vsr_t xt;                                                 \
-    target_ulong nb = GET_NB(rb);                                 \
                                                                   \
     if (!nb) {                                                    \
         return;                                                   \
     }                                                             \
-    getVSR(xt_num, &xt, env);                                     \
     nb = (nb >= 16) ? 16 : nb;                                    \
     if (msr_le && !lj) {                                          \
         for (i = 16; i > 16 - nb; i--) {                          \
-            cpu_stb_data_ra(env, addr, xt.VsrB(i - 1), GETPC());  \
+            cpu_stb_data_ra(env, addr, r->VsrB(i - 1), GETPC());  \
             addr = addr_add(env, addr, 1);                        \
         }                                                         \
     } else {                                                      \
         for (i = 0; i < nb; i++) {                                \
-            cpu_stb_data_ra(env, addr, xt.VsrB(i), GETPC());      \
+            cpu_stb_data_ra(env, addr, r->VsrB(i), GETPC());      \
             addr = addr_add(env, addr, 1);                        \
         }                                                         \
     }                                                             \
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 11d9b75d01..c776d50ddc 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -290,7 +290,7 @@ VSX_VECTOR_LOAD_STORE(stxvx, st_i64, 1)
 #define VSX_VECTOR_LOAD_STORE_LENGTH(name)                      \
 static void gen_##name(DisasContext *ctx)                       \
 {                                                               \
-    TCGv EA, xt;                                                \
+    TCGv EA, xt, rb;                                            \
                                                                 \
     if (xT(ctx->opcode) < 32) {                                 \
         if (unlikely(!ctx->vsx_enabled)) {                      \
@@ -304,12 +304,14 @@ static void gen_##name(DisasContext *ctx)                       \
         }                                                       \
     }                                                           \
     EA = tcg_temp_new();                                        \
-    xt = tcg_const_tl(xT(ctx->opcode));                         \
     gen_set_access_type(ctx, ACCESS_INT);                       \
     gen_addr_register(ctx, EA);                                 \
-    gen_helper_##name(cpu_env, EA, xt, cpu_gpr[rB(ctx->opcode)]); \
+    xt = tcg_const_tl(xT(ctx->opcode));                         \
+    rb = tcg_const_tl(rB(ctx->opcode));                         \
+    gen_helper_##name(cpu_env, EA, xt, rb);                     \
     tcg_temp_free(EA);                                          \
     tcg_temp_free(xt);                                          \
+    tcg_temp_free(rb);                                          \
 }
 
 VSX_VECTOR_LOAD_STORE_LENGTH(lxvl)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 03/14] target/ppc: remove getVSR()/putVSR() from int_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c Mark Cave-Ayland
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:32   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 04/14] target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c Mark Cave-Ayland
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Since commit 8a14d31b00 "target/ppc: switch fpr/vsrl registers so all VSX
registers are in host endian order" functions getVSR() and putVSR() which used
to convert the VSR registers into host endian order are no longer required.

Now that there are now no more users of getVSR()/putVSR() these functions can
be completely removed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/int_helper.c | 20 +++++++-------------
 target/ppc/internal.h   | 12 ------------
 2 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index f6a088ac08..f616fb249d 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1904,38 +1904,32 @@ VEXTRACT(d, u64)
 void helper_xxextractuw(CPUPPCState *env, target_ulong xtn,
                         target_ulong xbn, uint32_t index)
 {
-    ppc_vsr_t xt, xb;
+    ppc_vsr_t *xt = &env->vsr[xtn];
+    ppc_vsr_t *xb = &env->vsr[xbn];
     size_t es = sizeof(uint32_t);
     uint32_t ext_index;
     int i;
 
-    getVSR(xbn, &xb, env);
-    memset(&xt, 0, sizeof(xt));
+    memset(xt, 0, sizeof(ppc_vsr_t));
 
     ext_index = index;
     for (i = 0; i < es; i++, ext_index++) {
-        xt.VsrB(8 - es + i) = xb.VsrB(ext_index % 16);
+        xt->VsrB(8 - es + i) = xb->VsrB(ext_index % 16);
     }
-
-    putVSR(xtn, &xt, env);
 }
 
 void helper_xxinsertw(CPUPPCState *env, target_ulong xtn,
                       target_ulong xbn, uint32_t index)
 {
-    ppc_vsr_t xt, xb;
+    ppc_vsr_t *xt = &env->vsr[xtn];
+    ppc_vsr_t *xb = &env->vsr[xbn];
     size_t es = sizeof(uint32_t);
     int ins_index, i = 0;
 
-    getVSR(xbn, &xb, env);
-    getVSR(xtn, &xt, env);
-
     ins_index = index;
     for (i = 0; i < es && ins_index < 16; i++, ins_index++) {
-        xt.VsrB(ins_index) = xb.VsrB(8 - es + i);
+        xt->VsrB(ins_index) = xb->VsrB(8 - es + i);
     }
-
-    putVSR(xtn, &xt, env);
 }
 
 #define VEXT_SIGNED(name, element, cast)                            \
diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index fb6f64ed1e..d3d327e548 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -204,18 +204,6 @@ EXTRACT_HELPER(IMM8, 11, 8);
 EXTRACT_HELPER(DCMX, 16, 7);
 EXTRACT_HELPER_SPLIT_3(DCMX_XV, 5, 16, 0, 1, 2, 5, 1, 6, 6);
 
-static inline void getVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
-{
-    vsr->VsrD(0) = env->vsr[n].VsrD(0);
-    vsr->VsrD(1) = env->vsr[n].VsrD(1);
-}
-
-static inline void putVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
-{
-    env->vsr[n].VsrD(0) = vsr->VsrD(0);
-    env->vsr[n].VsrD(1) = vsr->VsrD(1);
-}
-
 void helper_compute_fprf_float16(CPUPPCState *env, float16 arg);
 void helper_compute_fprf_float32(CPUPPCState *env, float32 arg);
 void helper_compute_fprf_float128(CPUPPCState *env, float128 arg);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 04/14] target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (2 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 03/14] target/ppc: remove getVSR()/putVSR() from int_helper.c Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:36   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 05/14] target/ppc: introduce GEN_VSX_HELPER_X2 " Mark Cave-Ayland
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X3 macro which performs the decode based
upon xT, xA and xB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             |  48 ++++-------
 target/ppc/helper.h                 | 140 ++++++++++++++++---------------
 target/ppc/translate/vsx-impl.inc.c | 163 +++++++++++++++++++++---------------
 3 files changed, 183 insertions(+), 168 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 148ee5ac4c..39c4f4cfb0 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -1801,11 +1801,9 @@ uint32_t helper_efdcmpeq(CPUPPCState *env, uint64_t op1, uint64_t op2)
  *   sfprf - set FPRF
  */
 #define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf, r2sp)                    \
-void helper_##name(CPUPPCState *env, uint32_t opcode)                        \
+void helper_##name(CPUPPCState *env, uint32_t opcode,                        \
+                   ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)              \
 {                                                                            \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                   \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
     helper_reset_fpstatus(env);                                              \
@@ -1880,11 +1878,9 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
  *   sfprf - set FPRF
  */
 #define VSX_MUL(op, nels, tp, fld, sfprf, r2sp)                              \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                          \
+                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)                \
 {                                                                            \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                   \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
     helper_reset_fpstatus(env);                                              \
@@ -1954,11 +1950,9 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t opcode)
  *   sfprf - set FPRF
  */
 #define VSX_DIV(op, nels, tp, fld, sfprf, r2sp)                               \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                           \
+                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)                 \
 {                                                                             \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                    \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     int i;                                                                    \
                                                                               \
     helper_reset_fpstatus(env);                                               \
@@ -2286,11 +2280,9 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23)
  *   sfprf - set FPRF
  */
 #define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                           \
+                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)                 \
 {                                                                             \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                    \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     ppc_vsr_t *b, *c;                                                         \
     int i;                                                                    \
                                                                               \
@@ -2383,11 +2375,9 @@ VSX_MADD(xvnmsubmsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0, 0)
  *   svxvc - set VXVC bit
  */
 #define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc)                                \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                           \
+                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)                 \
 {                                                                             \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                    \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false;            \
                                                                               \
     if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) ||             \
@@ -2593,11 +2583,9 @@ VSX_SCALAR_CMPQ(xscmpuqp, 0)
  *   fld   - vsr_t field (VsrD(*) or VsrW(*))
  */
 #define VSX_MAX_MIN(name, op, nels, tp, fld)                                  \
-void helper_##name(CPUPPCState *env, uint32_t opcode)                         \
+void helper_##name(CPUPPCState *env, uint32_t opcode,                         \
+                   ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)               \
 {                                                                             \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                    \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     int i;                                                                    \
                                                                               \
     for (i = 0; i < nels; i++) {                                              \
@@ -2723,11 +2711,9 @@ VSX_MAX_MINJ(xsminjdp, 0);
  *   exp   - expected result of comparison
  */
 #define VSX_CMP(op, nels, tp, fld, cmp, svxvc, exp)                       \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                       \
+                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)             \
 {                                                                         \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                \
     int i;                                                                \
     int all_true = 1;                                                     \
     int all_false = 1;                                                    \
@@ -3161,11 +3147,9 @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
 }
 
 #define VSX_XXPERM(op, indexed)                                       \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                   \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                   \
+                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *pcv)        \
 {                                                                     \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                            \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                            \
-    ppc_vsr_t *pcv = &env->vsr[xB(opcode)];                           \
     int i, idx;                                                       \
                                                                       \
     for (i = 0; i < 16; i++) {                                        \
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 638a6e99c4..e2f9010ae0 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -108,6 +108,10 @@ DEF_HELPER_FLAGS_1(ftsqrt, TCG_CALL_NO_RWG_SE, i32, i64)
 #define dh_ctype_avr ppc_avr_t *
 #define dh_is_signed_avr dh_is_signed_ptr
 
+#define dh_alias_vsr ptr
+#define dh_ctype_vsr ppc_vsr_t *
+#define dh_is_signed_vsr dh_is_signed_ptr
+
 DEF_HELPER_3(vavgub, void, avr, avr, avr)
 DEF_HELPER_3(vavguh, void, avr, avr, avr)
 DEF_HELPER_3(vavguw, void, avr, avr, avr)
@@ -373,38 +377,38 @@ DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdtrunc, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdutrunc, i32, avr, avr, avr, i32)
 
-DEF_HELPER_2(xsadddp, void, env, i32)
+DEF_HELPER_5(xsadddp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xsaddqp, void, env, i32)
-DEF_HELPER_2(xssubdp, void, env, i32)
-DEF_HELPER_2(xsmuldp, void, env, i32)
+DEF_HELPER_5(xssubdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmuldp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xsmulqp, void, env, i32)
-DEF_HELPER_2(xsdivdp, void, env, i32)
+DEF_HELPER_5(xsdivdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xsdivqp, void, env, i32)
 DEF_HELPER_2(xsredp, void, env, i32)
 DEF_HELPER_2(xssqrtdp, void, env, i32)
 DEF_HELPER_2(xsrsqrtedp, void, env, i32)
 DEF_HELPER_2(xstdivdp, void, env, i32)
 DEF_HELPER_2(xstsqrtdp, void, env, i32)
-DEF_HELPER_2(xsmaddadp, void, env, i32)
-DEF_HELPER_2(xsmaddmdp, void, env, i32)
-DEF_HELPER_2(xsmsubadp, void, env, i32)
-DEF_HELPER_2(xsmsubmdp, void, env, i32)
-DEF_HELPER_2(xsnmaddadp, void, env, i32)
-DEF_HELPER_2(xsnmaddmdp, void, env, i32)
-DEF_HELPER_2(xsnmsubadp, void, env, i32)
-DEF_HELPER_2(xsnmsubmdp, void, env, i32)
-DEF_HELPER_2(xscmpeqdp, void, env, i32)
-DEF_HELPER_2(xscmpgtdp, void, env, i32)
-DEF_HELPER_2(xscmpgedp, void, env, i32)
-DEF_HELPER_2(xscmpnedp, void, env, i32)
+DEF_HELPER_5(xsmaddadp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmaddmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmsubadp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmsubmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsnmaddadp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsnmaddmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsnmsubadp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsnmsubmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xscmpeqdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xscmpgtdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xscmpgedp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xscmpnedp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xscmpexpdp, void, env, i32)
 DEF_HELPER_2(xscmpexpqp, void, env, i32)
 DEF_HELPER_2(xscmpodp, void, env, i32)
 DEF_HELPER_2(xscmpudp, void, env, i32)
 DEF_HELPER_2(xscmpoqp, void, env, i32)
 DEF_HELPER_2(xscmpuqp, void, env, i32)
-DEF_HELPER_2(xsmaxdp, void, env, i32)
-DEF_HELPER_2(xsmindp, void, env, i32)
+DEF_HELPER_5(xsmaxdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmindp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xsmaxcdp, void, env, i32)
 DEF_HELPER_2(xsmincdp, void, env, i32)
 DEF_HELPER_2(xsmaxjdp, void, env, i32)
@@ -444,46 +448,46 @@ DEF_HELPER_2(xsrqpxp, void, env, i32)
 DEF_HELPER_2(xssqrtqp, void, env, i32)
 DEF_HELPER_2(xssubqp, void, env, i32)
 
-DEF_HELPER_2(xsaddsp, void, env, i32)
-DEF_HELPER_2(xssubsp, void, env, i32)
-DEF_HELPER_2(xsmulsp, void, env, i32)
-DEF_HELPER_2(xsdivsp, void, env, i32)
+DEF_HELPER_5(xsaddsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xssubsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmulsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsdivsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xsresp, void, env, i32)
 DEF_HELPER_2(xsrsp, i64, env, i64)
 DEF_HELPER_2(xssqrtsp, void, env, i32)
 DEF_HELPER_2(xsrsqrtesp, void, env, i32)
-DEF_HELPER_2(xsmaddasp, void, env, i32)
-DEF_HELPER_2(xsmaddmsp, void, env, i32)
-DEF_HELPER_2(xsmsubasp, void, env, i32)
-DEF_HELPER_2(xsmsubmsp, void, env, i32)
-DEF_HELPER_2(xsnmaddasp, void, env, i32)
-DEF_HELPER_2(xsnmaddmsp, void, env, i32)
-DEF_HELPER_2(xsnmsubasp, void, env, i32)
-DEF_HELPER_2(xsnmsubmsp, void, env, i32)
+DEF_HELPER_5(xsmaddasp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmaddmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmsubasp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmsubmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsnmaddasp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsnmaddmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsnmsubasp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsnmsubmsp, void, env, i32, vsr, vsr, vsr)
 
-DEF_HELPER_2(xvadddp, void, env, i32)
-DEF_HELPER_2(xvsubdp, void, env, i32)
-DEF_HELPER_2(xvmuldp, void, env, i32)
-DEF_HELPER_2(xvdivdp, void, env, i32)
+DEF_HELPER_5(xvadddp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvsubdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmuldp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvdivdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xvredp, void, env, i32)
 DEF_HELPER_2(xvsqrtdp, void, env, i32)
 DEF_HELPER_2(xvrsqrtedp, void, env, i32)
 DEF_HELPER_2(xvtdivdp, void, env, i32)
 DEF_HELPER_2(xvtsqrtdp, void, env, i32)
-DEF_HELPER_2(xvmaddadp, void, env, i32)
-DEF_HELPER_2(xvmaddmdp, void, env, i32)
-DEF_HELPER_2(xvmsubadp, void, env, i32)
-DEF_HELPER_2(xvmsubmdp, void, env, i32)
-DEF_HELPER_2(xvnmaddadp, void, env, i32)
-DEF_HELPER_2(xvnmaddmdp, void, env, i32)
-DEF_HELPER_2(xvnmsubadp, void, env, i32)
-DEF_HELPER_2(xvnmsubmdp, void, env, i32)
-DEF_HELPER_2(xvmaxdp, void, env, i32)
-DEF_HELPER_2(xvmindp, void, env, i32)
-DEF_HELPER_2(xvcmpeqdp, void, env, i32)
-DEF_HELPER_2(xvcmpgedp, void, env, i32)
-DEF_HELPER_2(xvcmpgtdp, void, env, i32)
-DEF_HELPER_2(xvcmpnedp, void, env, i32)
+DEF_HELPER_5(xvmaddadp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmaddmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmsubadp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmsubmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvnmaddadp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvnmaddmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvnmsubadp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvnmsubmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmaxdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmindp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvcmpeqdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvcmpgedp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvcmpgtdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvcmpnedp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xvcvdpsp, void, env, i32)
 DEF_HELPER_2(xvcvdpsxds, void, env, i32)
 DEF_HELPER_2(xvcvdpsxws, void, env, i32)
@@ -499,29 +503,29 @@ DEF_HELPER_2(xvrdpim, void, env, i32)
 DEF_HELPER_2(xvrdpip, void, env, i32)
 DEF_HELPER_2(xvrdpiz, void, env, i32)
 
-DEF_HELPER_2(xvaddsp, void, env, i32)
-DEF_HELPER_2(xvsubsp, void, env, i32)
-DEF_HELPER_2(xvmulsp, void, env, i32)
-DEF_HELPER_2(xvdivsp, void, env, i32)
+DEF_HELPER_5(xvaddsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvsubsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmulsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvdivsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xvresp, void, env, i32)
 DEF_HELPER_2(xvsqrtsp, void, env, i32)
 DEF_HELPER_2(xvrsqrtesp, void, env, i32)
 DEF_HELPER_2(xvtdivsp, void, env, i32)
 DEF_HELPER_2(xvtsqrtsp, void, env, i32)
-DEF_HELPER_2(xvmaddasp, void, env, i32)
-DEF_HELPER_2(xvmaddmsp, void, env, i32)
-DEF_HELPER_2(xvmsubasp, void, env, i32)
-DEF_HELPER_2(xvmsubmsp, void, env, i32)
-DEF_HELPER_2(xvnmaddasp, void, env, i32)
-DEF_HELPER_2(xvnmaddmsp, void, env, i32)
-DEF_HELPER_2(xvnmsubasp, void, env, i32)
-DEF_HELPER_2(xvnmsubmsp, void, env, i32)
-DEF_HELPER_2(xvmaxsp, void, env, i32)
-DEF_HELPER_2(xvminsp, void, env, i32)
-DEF_HELPER_2(xvcmpeqsp, void, env, i32)
-DEF_HELPER_2(xvcmpgesp, void, env, i32)
-DEF_HELPER_2(xvcmpgtsp, void, env, i32)
-DEF_HELPER_2(xvcmpnesp, void, env, i32)
+DEF_HELPER_5(xvmaddasp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmaddmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmsubasp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmsubmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvnmaddasp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvnmaddmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvnmsubasp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvnmsubmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvmaxsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvminsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvcmpeqsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvcmpgesp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvcmpgtsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xvcmpnesp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xvcvspdp, void, env, i32)
 DEF_HELPER_2(xvcvsphp, void, env, i32)
 DEF_HELPER_2(xvcvhpsp, void, env, i32)
@@ -540,8 +544,8 @@ DEF_HELPER_2(xvrspic, void, env, i32)
 DEF_HELPER_2(xvrspim, void, env, i32)
 DEF_HELPER_2(xvrspip, void, env, i32)
 DEF_HELPER_2(xvrspiz, void, env, i32)
-DEF_HELPER_2(xxperm, void, env, i32)
-DEF_HELPER_2(xxpermr, void, env, i32)
+DEF_HELPER_5(xxperm, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xxpermr, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_4(xxextractuw, void, env, tl, tl, i32)
 DEF_HELPER_4(xxinsertw, void, env, tl, tl, i32)
 DEF_HELPER_2(xvxsigsp, void, env, i32)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index c776d50ddc..083dfa515c 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -20,6 +20,13 @@ static inline void set_cpu_vsrl(int n, TCGv_i64 src)
     tcg_gen_st_i64(src, cpu_env, vsr64_offset(n, false));
 }
 
+static inline TCGv_ptr gen_vsr_ptr(int reg)
+{
+    TCGv_ptr r = tcg_temp_new_ptr();
+    tcg_gen_addi_ptr(r, cpu_env, vsr_full_offset(reg));
+    return r;
+}
+
 #define VSX_LOAD_SCALAR(name, operation)                      \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
@@ -924,6 +931,26 @@ static void gen_##name(DisasContext *ctx)                                     \
     tcg_temp_free_i32(opc);                                                   \
 }
 
+#define GEN_VSX_HELPER_X3(name, op1, op2, inval, type)                        \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    TCGv_ptr xt, xa, xb;                                                      \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    xt = gen_vsr_ptr(xT(ctx->opcode));                                        \
+    xa = gen_vsr_ptr(xA(ctx->opcode));                                        \
+    xb = gen_vsr_ptr(xB(ctx->opcode));                                        \
+    gen_helper_##name(cpu_env, opc, xt, xa, xb);                              \
+    tcg_temp_free_i32(opc);                                                   \
+    tcg_temp_free_ptr(xt);                                                    \
+    tcg_temp_free_ptr(xa);                                                    \
+    tcg_temp_free_ptr(xb);                                                    \
+}
+
 #define GEN_VSX_HELPER_XT_XB_ENV(name, op1, op2, inval, type) \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
@@ -942,38 +969,38 @@ static void gen_##name(DisasContext *ctx)                     \
     tcg_temp_free_i64(t1);                                    \
 }
 
-GEN_VSX_HELPER_2(xsadddp, 0x00, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsadddp, 0x00, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsaddqp, 0x04, 0x00, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xssubdp, 0x00, 0x05, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xssubdp, 0x00, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmulqp, 0x04, 0x01, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsdivqp, 0x04, 0x11, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmaddadp, 0x04, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmaddmdp, 0x04, 0x05, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmsubadp, 0x04, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmsubmdp, 0x04, 0x07, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsnmaddadp, 0x04, 0x14, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsnmaddmdp, 0x04, 0x15, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X3(xsmaddadp, 0x04, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsmaddmdp, 0x04, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsmsubadp, 0x04, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsmsubmdp, 0x04, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsnmaddadp, 0x04, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsnmaddmdp, 0x04, 0x15, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X3(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpoqp, 0x04, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpuqp, 0x04, 0x14, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmaxcdp, 0x00, 0x10, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsmincdp, 0x00, 0x11, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsmaxjdp, 0x00, 0x12, 0, PPC2_ISA300)
@@ -1010,50 +1037,50 @@ GEN_VSX_HELPER_2(xsrqpxp, 0x05, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xssqrtqp, 0x04, 0x19, 0x1B, PPC2_ISA300)
 GEN_VSX_HELPER_2(xssubqp, 0x04, 0x10, 0, PPC2_ISA300)
 
-GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsmaddmsp, 0x04, 0x01, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsmsubasp, 0x04, 0x02, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsmsubmsp, 0x04, 0x03, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsmaddmsp, 0x04, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsmsubasp, 0x04, 0x02, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsmsubmsp, 0x04, 0x03, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X3(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xststdcsp, 0x14, 0x12, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xststdcdp, 0x14, 0x16, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xststdcqp, 0x04, 0x16, 0, PPC2_ISA300)
 
-GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtsqrtdp, 0x14, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmaddadp, 0x04, 0x0C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmaddmdp, 0x04, 0x0D, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmsubadp, 0x04, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmsubmdp, 0x04, 0x0F, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvnmaddadp, 0x04, 0x1C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvnmaddmdp, 0x04, 0x1D, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvnmsubadp, 0x04, 0x1E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvnmsubmdp, 0x04, 0x1F, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmaxdp, 0x00, 0x1C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmindp, 0x00, 0x1D, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcmpgtdp, 0x0C, 0x0D, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcmpgedp, 0x0C, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcmpnedp, 0x0C, 0x0F, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X3(xvmaddadp, 0x04, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmaddmdp, 0x04, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmsubadp, 0x04, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmsubmdp, 0x04, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvnmaddadp, 0x04, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvnmaddmdp, 0x04, 0x1D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvnmsubadp, 0x04, 0x1E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvnmsubmdp, 0x04, 0x1F, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmaxdp, 0x00, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmindp, 0x00, 0x1D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvcmpgtdp, 0x0C, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvcmpgedp, 0x0C, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvcmpnedp, 0x0C, 0x0F, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xvcvdpsp, 0x12, 0x18, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcvdpsxds, 0x10, 0x1D, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcvdpsxws, 0x10, 0x0D, 0, PPC2_VSX)
@@ -1069,29 +1096,29 @@ GEN_VSX_HELPER_2(xvrdpim, 0x12, 0x0F, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrdpip, 0x12, 0x0E, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrdpiz, 0x12, 0x0D, 0, PPC2_VSX)
 
-GEN_VSX_HELPER_2(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmulsp, 0x00, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmulsp, 0x00, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtsqrtsp, 0x14, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmaddasp, 0x04, 0x08, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmaddmsp, 0x04, 0x09, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmsubasp, 0x04, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmsubmsp, 0x04, 0x0B, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvnmaddasp, 0x04, 0x18, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvnmaddmsp, 0x04, 0x19, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvnmsubasp, 0x04, 0x1A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvnmsubmsp, 0x04, 0x1B, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvmaxsp, 0x00, 0x18, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvminsp, 0x00, 0x19, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcmpgtsp, 0x0C, 0x09, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcmpgesp, 0x0C, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcmpnesp, 0x0C, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmaddasp, 0x04, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmaddmsp, 0x04, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmsubasp, 0x04, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmsubmsp, 0x04, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvnmaddasp, 0x04, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvnmaddmsp, 0x04, 0x19, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvnmsubasp, 0x04, 0x1A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvnmsubmsp, 0x04, 0x1B, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvmaxsp, 0x00, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvminsp, 0x00, 0x19, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvcmpgtsp, 0x0C, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvcmpgesp, 0x0C, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X3(xvcmpnesp, 0x0C, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcvspdp, 0x12, 0x1C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvcvhpsp, 0x16, 0x1D, 0x18, PPC2_ISA300)
 GEN_VSX_HELPER_2(xvcvsphp, 0x16, 0x1D, 0x19, PPC2_ISA300)
@@ -1110,8 +1137,8 @@ GEN_VSX_HELPER_2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtstdcsp, 0x14, 0x1A, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtstdcdp, 0x14, 0x1E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xxperm, 0x08, 0x03, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xxpermr, 0x08, 0x07, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X3(xxperm, 0x08, 0x03, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X3(xxpermr, 0x08, 0x07, 0, PPC2_ISA300)
 
 static void gen_xxbrd(DisasContext *ctx)
 {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 05/14] target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (3 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 04/14] target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:38   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 06/14] target/ppc: introduce GEN_VSX_HELPER_X2_AB " Mark Cave-Ayland
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X2 macro which performs the decode based
upon xT and xB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             |  50 ++++++-------
 target/ppc/helper.h                 | 122 +++++++++++++++----------------
 target/ppc/translate/vsx-impl.inc.c | 140 ++++++++++++++++++++----------------
 3 files changed, 160 insertions(+), 152 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 39c4f4cfb0..05099e1291 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2028,10 +2028,9 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode)
  *   sfprf - set FPRF
  */
 #define VSX_RE(op, nels, tp, fld, sfprf, r2sp)                                \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                           \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                                \
 {                                                                             \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                    \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                    \
     int i;                                                                    \
                                                                               \
     helper_reset_fpstatus(env);                                               \
@@ -2068,10 +2067,9 @@ VSX_RE(xvresp, 4, float32, VsrW(i), 0, 0)
  *   sfprf - set FPRF
  */
 #define VSX_SQRT(op, nels, tp, fld, sfprf, r2sp)                             \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                          \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                               \
 {                                                                            \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
     helper_reset_fpstatus(env);                                              \
@@ -2116,10 +2114,9 @@ VSX_SQRT(xvsqrtsp, 4, float32, VsrW(i), 0, 0)
  *   sfprf - set FPRF
  */
 #define VSX_RSQRTE(op, nels, tp, fld, sfprf, r2sp)                           \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                          \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                               \
 {                                                                            \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
     helper_reset_fpstatus(env);                                              \
@@ -2767,10 +2764,9 @@ VSX_CMP(xvcmpnesp, 4, float32, VsrW(i), eq, 0, 0)
  *   sfprf - set FPRF
  */
 #define VSX_CVT_FP_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf)    \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                     \
 {                                                                  \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                         \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                         \
     int i;                                                         \
                                                                    \
     for (i = 0; i < nels; i++) {                                   \
@@ -2839,10 +2835,9 @@ VSX_CVT_FP_TO_FP_VECTOR(xscvdpqp, 1, float64, float128, VsrD(0), f128, 1)
  *   sfprf - set FPRF
  */
 #define VSX_CVT_FP_TO_FP_HP(op, nels, stp, ttp, sfld, tfld, sfprf) \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                     \
 {                                                                  \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                         \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                         \
     int i;                                                         \
                                                                    \
     memset(xt, 0, sizeof(ppc_vsr_t));                              \
@@ -2871,10 +2866,9 @@ VSX_CVT_FP_TO_FP_HP(xvcvhpsp, 4, float16, float32, VsrH(2 * i + 1), VsrW(i), 0)
  * xscvqpdp isn't using VSX_CVT_FP_TO_FP() because xscvqpdpo will be
  * added to this later.
  */
-void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode)
+void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode,
+                     ppc_vsr_t *xt, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];
     float_status tstat;
 
     memset(xt, 0, sizeof(ppc_vsr_t));
@@ -2922,11 +2916,10 @@ uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)
  *   rnan  - resulting NaN
  */
 #define VSX_CVT_FP_TO_INT(op, nels, stp, ttp, sfld, tfld, rnan)              \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                          \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                               \
 {                                                                            \
     int all_flags = env->fp_status.float_exception_flags, flags;             \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
     int i;                                                                   \
                                                                              \
     for (i = 0; i < nels; i++) {                                             \
@@ -3009,10 +3002,9 @@ VSX_CVT_FP_TO_INT_VECTOR(xscvqpuwz, float128, uint32, f128, VsrD(0), 0x0ULL)
  *   sfprf - set FPRF
  */
 #define VSX_CVT_INT_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf, r2sp)  \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                     \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                          \
 {                                                                       \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                              \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                              \
     int i;                                                              \
                                                                         \
     for (i = 0; i < nels; i++) {                                        \
@@ -3081,10 +3073,9 @@ VSX_CVT_INT_TO_FP_VECTOR(xscvudqp, uint64, float128, VsrD(0), f128)
  *   sfprf - set FPRF
  */
 #define VSX_ROUND(op, nels, tp, fld, rmode, sfprf)                     \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                    \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                    \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                         \
 {                                                                      \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                             \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                             \
     int i;                                                             \
                                                                        \
     if (rmode != FLOAT_ROUND_CURRENT) {                                \
@@ -3165,10 +3156,9 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,                   \
 VSX_XXPERM(xxperm, 0)
 VSX_XXPERM(xxpermr, 1)
 
-void helper_xvxsigsp(CPUPPCState *env, uint32_t opcode)
+void helper_xvxsigsp(CPUPPCState *env, uint32_t opcode,
+                     ppc_vsr_t *xt, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];
     uint32_t exp, i, fraction;
 
     memset(xt, 0, sizeof(ppc_vsr_t));
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index e2f9010ae0..c957c53755 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -384,9 +384,9 @@ DEF_HELPER_5(xsmuldp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xsmulqp, void, env, i32)
 DEF_HELPER_5(xsdivdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_2(xsdivqp, void, env, i32)
-DEF_HELPER_2(xsredp, void, env, i32)
-DEF_HELPER_2(xssqrtdp, void, env, i32)
-DEF_HELPER_2(xsrsqrtedp, void, env, i32)
+DEF_HELPER_4(xsredp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xssqrtdp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xsrsqrtedp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xstdivdp, void, env, i32)
 DEF_HELPER_2(xstsqrtdp, void, env, i32)
 DEF_HELPER_5(xsmaddadp, void, env, i32, vsr, vsr, vsr)
@@ -413,36 +413,36 @@ DEF_HELPER_2(xsmaxcdp, void, env, i32)
 DEF_HELPER_2(xsmincdp, void, env, i32)
 DEF_HELPER_2(xsmaxjdp, void, env, i32)
 DEF_HELPER_2(xsminjdp, void, env, i32)
-DEF_HELPER_2(xscvdphp, void, env, i32)
+DEF_HELPER_4(xscvdphp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvdpqp, void, env, i32)
-DEF_HELPER_2(xscvdpsp, void, env, i32)
+DEF_HELPER_4(xscvdpsp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvdpspn, i64, env, i64)
-DEF_HELPER_2(xscvqpdp, void, env, i32)
+DEF_HELPER_4(xscvqpdp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvqpsdz, void, env, i32)
 DEF_HELPER_2(xscvqpswz, void, env, i32)
 DEF_HELPER_2(xscvqpudz, void, env, i32)
 DEF_HELPER_2(xscvqpuwz, void, env, i32)
-DEF_HELPER_2(xscvhpdp, void, env, i32)
+DEF_HELPER_4(xscvhpdp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvsdqp, void, env, i32)
-DEF_HELPER_2(xscvspdp, void, env, i32)
+DEF_HELPER_4(xscvspdp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvspdpn, i64, env, i64)
-DEF_HELPER_2(xscvdpsxds, void, env, i32)
-DEF_HELPER_2(xscvdpsxws, void, env, i32)
-DEF_HELPER_2(xscvdpuxds, void, env, i32)
-DEF_HELPER_2(xscvdpuxws, void, env, i32)
-DEF_HELPER_2(xscvsxddp, void, env, i32)
-DEF_HELPER_2(xscvuxdsp, void, env, i32)
-DEF_HELPER_2(xscvsxdsp, void, env, i32)
+DEF_HELPER_4(xscvdpsxds, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvdpsxws, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvdpuxds, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvdpuxws, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvsxddp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvuxdsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvsxdsp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvudqp, void, env, i32)
-DEF_HELPER_2(xscvuxddp, void, env, i32)
+DEF_HELPER_4(xscvuxddp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xststdcsp, void, env, i32)
 DEF_HELPER_2(xststdcdp, void, env, i32)
 DEF_HELPER_2(xststdcqp, void, env, i32)
-DEF_HELPER_2(xsrdpi, void, env, i32)
-DEF_HELPER_2(xsrdpic, void, env, i32)
-DEF_HELPER_2(xsrdpim, void, env, i32)
-DEF_HELPER_2(xsrdpip, void, env, i32)
-DEF_HELPER_2(xsrdpiz, void, env, i32)
+DEF_HELPER_4(xsrdpi, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xsrdpic, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xsrdpim, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xsrdpip, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xsrdpiz, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xsrqpi, void, env, i32)
 DEF_HELPER_2(xsrqpxp, void, env, i32)
 DEF_HELPER_2(xssqrtqp, void, env, i32)
@@ -452,10 +452,10 @@ DEF_HELPER_5(xsaddsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xssubsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmulsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsdivsp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xsresp, void, env, i32)
+DEF_HELPER_4(xsresp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xsrsp, i64, env, i64)
-DEF_HELPER_2(xssqrtsp, void, env, i32)
-DEF_HELPER_2(xsrsqrtesp, void, env, i32)
+DEF_HELPER_4(xssqrtsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xsrsqrtesp, void, env, i32, vsr, vsr)
 DEF_HELPER_5(xsmaddasp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmaddmsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmsubasp, void, env, i32, vsr, vsr, vsr)
@@ -469,9 +469,9 @@ DEF_HELPER_5(xvadddp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvsubdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmuldp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvdivdp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xvredp, void, env, i32)
-DEF_HELPER_2(xvsqrtdp, void, env, i32)
-DEF_HELPER_2(xvrsqrtedp, void, env, i32)
+DEF_HELPER_4(xvredp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvsqrtdp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrsqrtedp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xvtdivdp, void, env, i32)
 DEF_HELPER_2(xvtsqrtdp, void, env, i32)
 DEF_HELPER_5(xvmaddadp, void, env, i32, vsr, vsr, vsr)
@@ -488,28 +488,28 @@ DEF_HELPER_5(xvcmpeqdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvcmpgedp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvcmpgtdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvcmpnedp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xvcvdpsp, void, env, i32)
-DEF_HELPER_2(xvcvdpsxds, void, env, i32)
-DEF_HELPER_2(xvcvdpsxws, void, env, i32)
-DEF_HELPER_2(xvcvdpuxds, void, env, i32)
-DEF_HELPER_2(xvcvdpuxws, void, env, i32)
-DEF_HELPER_2(xvcvsxddp, void, env, i32)
-DEF_HELPER_2(xvcvuxddp, void, env, i32)
-DEF_HELPER_2(xvcvsxwdp, void, env, i32)
-DEF_HELPER_2(xvcvuxwdp, void, env, i32)
-DEF_HELPER_2(xvrdpi, void, env, i32)
-DEF_HELPER_2(xvrdpic, void, env, i32)
-DEF_HELPER_2(xvrdpim, void, env, i32)
-DEF_HELPER_2(xvrdpip, void, env, i32)
-DEF_HELPER_2(xvrdpiz, void, env, i32)
+DEF_HELPER_4(xvcvdpsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvdpsxds, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvdpsxws, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvdpuxds, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvdpuxws, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvsxddp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvuxddp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvsxwdp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvuxwdp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrdpi, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrdpic, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrdpim, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrdpip, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrdpiz, void, env, i32, vsr, vsr)
 
 DEF_HELPER_5(xvaddsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvsubsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmulsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvdivsp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xvresp, void, env, i32)
-DEF_HELPER_2(xvsqrtsp, void, env, i32)
-DEF_HELPER_2(xvrsqrtesp, void, env, i32)
+DEF_HELPER_4(xvresp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvsqrtsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrsqrtesp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xvtdivsp, void, env, i32)
 DEF_HELPER_2(xvtsqrtsp, void, env, i32)
 DEF_HELPER_5(xvmaddasp, void, env, i32, vsr, vsr, vsr)
@@ -526,29 +526,29 @@ DEF_HELPER_5(xvcmpeqsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvcmpgesp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvcmpgtsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvcmpnesp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xvcvspdp, void, env, i32)
-DEF_HELPER_2(xvcvsphp, void, env, i32)
-DEF_HELPER_2(xvcvhpsp, void, env, i32)
-DEF_HELPER_2(xvcvspsxds, void, env, i32)
-DEF_HELPER_2(xvcvspsxws, void, env, i32)
-DEF_HELPER_2(xvcvspuxds, void, env, i32)
-DEF_HELPER_2(xvcvspuxws, void, env, i32)
-DEF_HELPER_2(xvcvsxdsp, void, env, i32)
-DEF_HELPER_2(xvcvuxdsp, void, env, i32)
-DEF_HELPER_2(xvcvsxwsp, void, env, i32)
-DEF_HELPER_2(xvcvuxwsp, void, env, i32)
+DEF_HELPER_4(xvcvspdp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvsphp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvhpsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvspsxds, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvspsxws, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvspuxds, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvspuxws, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvsxdsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvuxdsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvsxwsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvcvuxwsp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xvtstdcsp, void, env, i32)
 DEF_HELPER_2(xvtstdcdp, void, env, i32)
-DEF_HELPER_2(xvrspi, void, env, i32)
-DEF_HELPER_2(xvrspic, void, env, i32)
-DEF_HELPER_2(xvrspim, void, env, i32)
-DEF_HELPER_2(xvrspip, void, env, i32)
-DEF_HELPER_2(xvrspiz, void, env, i32)
+DEF_HELPER_4(xvrspi, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrspic, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrspim, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrspip, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvrspiz, void, env, i32, vsr, vsr)
 DEF_HELPER_5(xxperm, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xxpermr, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_4(xxextractuw, void, env, tl, tl, i32)
 DEF_HELPER_4(xxinsertw, void, env, tl, tl, i32)
-DEF_HELPER_2(xvxsigsp, void, env, i32)
+DEF_HELPER_4(xvxsigsp, void, env, i32, vsr, vsr)
 
 DEF_HELPER_2(efscfsi, i32, env, i32)
 DEF_HELPER_2(efscfui, i32, env, i32)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 083dfa515c..7f719e9172 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -951,6 +951,24 @@ static void gen_##name(DisasContext *ctx)                                     \
     tcg_temp_free_ptr(xb);                                                    \
 }
 
+#define GEN_VSX_HELPER_X2(name, op1, op2, inval, type)                        \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    TCGv_ptr xt, xb;                                                          \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    xt = gen_vsr_ptr(xT(ctx->opcode));                                        \
+    xb = gen_vsr_ptr(xB(ctx->opcode));                                        \
+    gen_helper_##name(cpu_env, opc, xt, xb);                                  \
+    tcg_temp_free_i32(opc);                                                   \
+    tcg_temp_free_ptr(xt);                                                    \
+    tcg_temp_free_ptr(xb);                                                    \
+}
+
 #define GEN_VSX_HELPER_XT_XB_ENV(name, op1, op2, inval, type) \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
@@ -976,9 +994,9 @@ GEN_VSX_HELPER_X3(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsmulqp, 0x04, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsdivqp, 0x04, 0x11, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmaddadp, 0x04, 0x04, 0, PPC2_VSX)
@@ -1005,31 +1023,31 @@ GEN_VSX_HELPER_2(xsmaxcdp, 0x00, 0x10, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsmincdp, 0x00, 0x11, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsmaxjdp, 0x00, 0x12, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsminjdp, 0x00, 0x12, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvdphp, 0x16, 0x15, 0x11, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xscvdphp, 0x16, 0x15, 0x11, PPC2_ISA300)
+GEN_VSX_HELPER_X2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscvdpqp, 0x04, 0x1A, 0x16, PPC2_ISA300)
 GEN_VSX_HELPER_XT_XB_ENV(xscvdpspn, 0x16, 0x10, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xscvqpdp, 0x04, 0x1A, 0x14, PPC2_ISA300)
+GEN_VSX_HELPER_X2(xscvqpdp, 0x04, 0x1A, 0x14, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscvqpsdz, 0x04, 0x1A, 0x19, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscvqpswz, 0x04, 0x1A, 0x09, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscvqpudz, 0x04, 0x1A, 0x11, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscvqpuwz, 0x04, 0x1A, 0x01, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvhpdp, 0x16, 0x15, 0x10, PPC2_ISA300)
+GEN_VSX_HELPER_X2(xscvhpdp, 0x16, 0x15, 0x10, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscvsdqp, 0x04, 0x1A, 0x0A, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvspdp, 0x12, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xscvspdp, 0x12, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_XT_XB_ENV(xscvspdpn, 0x16, 0x14, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xscvdpsxds, 0x10, 0x15, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscvdpsxws, 0x10, 0x05, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscvdpuxds, 0x10, 0x14, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscvdpuxws, 0x10, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscvsxddp, 0x10, 0x17, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xscvdpsxds, 0x10, 0x15, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xscvdpsxws, 0x10, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xscvdpuxds, 0x10, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xscvdpuxws, 0x10, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xscvsxddp, 0x10, 0x17, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscvudqp, 0x04, 0x1A, 0x02, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvuxddp, 0x10, 0x16, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsrdpi, 0x12, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsrdpic, 0x16, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsrdpim, 0x12, 0x07, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsrdpip, 0x12, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xscvuxddp, 0x10, 0x16, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xsrdpi, 0x12, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xsrdpic, 0x16, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xsrdpim, 0x12, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xsrdpip, 0x12, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_XT_XB_ENV(xsrsp, 0x12, 0x11, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xsrqpi, 0x05, 0x00, 0, PPC2_ISA300)
@@ -1041,9 +1059,9 @@ GEN_VSX_HELPER_X3(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsmaddmsp, 0x04, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsmsubasp, 0x04, 0x02, 0, PPC2_VSX207)
@@ -1052,8 +1070,8 @@ GEN_VSX_HELPER_X3(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
+GEN_VSX_HELPER_X2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xststdcsp, 0x14, 0x12, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xststdcdp, 0x14, 0x16, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xststdcqp, 0x04, 0x16, 0, PPC2_ISA300)
@@ -1062,9 +1080,9 @@ GEN_VSX_HELPER_X3(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtsqrtdp, 0x14, 0x0E, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddadp, 0x04, 0x0C, 0, PPC2_VSX)
@@ -1081,28 +1099,28 @@ GEN_VSX_HELPER_X3(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvcmpgtdp, 0x0C, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvcmpgedp, 0x0C, 0x0E, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvcmpnedp, 0x0C, 0x0F, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xvcvdpsp, 0x12, 0x18, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvdpsxds, 0x10, 0x1D, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvdpsxws, 0x10, 0x0D, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvdpuxds, 0x10, 0x1C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvdpuxws, 0x10, 0x0C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvsxddp, 0x10, 0x1F, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvuxddp, 0x10, 0x1E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvsxwdp, 0x10, 0x0F, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvuxwdp, 0x10, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrdpi, 0x12, 0x0C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrdpic, 0x16, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrdpim, 0x12, 0x0F, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrdpip, 0x12, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrdpiz, 0x12, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvdpsp, 0x12, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvdpsxds, 0x10, 0x1D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvdpsxws, 0x10, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvdpuxds, 0x10, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvdpuxws, 0x10, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvsxddp, 0x10, 0x1F, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvuxddp, 0x10, 0x1E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvsxwdp, 0x10, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvuxwdp, 0x10, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrdpi, 0x12, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrdpic, 0x16, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrdpim, 0x12, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrdpip, 0x12, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrdpiz, 0x12, 0x0D, 0, PPC2_VSX)
 
 GEN_VSX_HELPER_X3(xvaddsp, 0x00, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvsubsp, 0x00, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmulsp, 0x00, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtsqrtsp, 0x14, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddasp, 0x04, 0x08, 0, PPC2_VSX)
@@ -1119,22 +1137,22 @@ GEN_VSX_HELPER_X3(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvcmpgtsp, 0x0C, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvcmpgesp, 0x0C, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvcmpnesp, 0x0C, 0x0B, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvspdp, 0x12, 0x1C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvhpsp, 0x16, 0x1D, 0x18, PPC2_ISA300)
-GEN_VSX_HELPER_2(xvcvsphp, 0x16, 0x1D, 0x19, PPC2_ISA300)
-GEN_VSX_HELPER_2(xvcvspsxds, 0x10, 0x19, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvspsxws, 0x10, 0x09, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvspuxds, 0x10, 0x18, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvspuxws, 0x10, 0x08, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvsxdsp, 0x10, 0x1B, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvuxdsp, 0x10, 0x1A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvsxwsp, 0x10, 0x0B, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvcvuxwsp, 0x10, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrspi, 0x12, 0x08, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrspic, 0x16, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrspim, 0x12, 0x0B, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvspdp, 0x12, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvhpsp, 0x16, 0x1D, 0x18, PPC2_ISA300)
+GEN_VSX_HELPER_X2(xvcvsphp, 0x16, 0x1D, 0x19, PPC2_ISA300)
+GEN_VSX_HELPER_X2(xvcvspsxds, 0x10, 0x19, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvspsxws, 0x10, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvspuxds, 0x10, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvspuxws, 0x10, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvsxdsp, 0x10, 0x1B, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvuxdsp, 0x10, 0x1A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvsxwsp, 0x10, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvcvuxwsp, 0x10, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrspi, 0x12, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrspic, 0x16, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrspim, 0x12, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtstdcsp, 0x14, 0x1A, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtstdcdp, 0x14, 0x1E, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xxperm, 0x08, 0x03, 0, PPC2_ISA300)
@@ -1813,7 +1831,7 @@ static void gen_xvxexpdp(DisasContext *ctx)
     tcg_temp_free_i64(xbl);
 }
 
-GEN_VSX_HELPER_2(xvxsigsp, 0x00, 0x04, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X2(xvxsigsp, 0x00, 0x04, 0, PPC2_ISA300)
 
 static void gen_xvxsigdp(DisasContext *ctx)
 {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 06/14] target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (4 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 05/14] target/ppc: introduce GEN_VSX_HELPER_X2 " Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:41   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 07/14] target/ppc: introduce GEN_VSX_HELPER_X1 " Mark Cave-Ayland
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X2_AB macro which performs the decode based
upon xA and xB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             | 15 ++++++---------
 target/ppc/helper.h                 | 12 ++++++------
 target/ppc/translate/vsx-impl.inc.c | 30 ++++++++++++++++++++++++------
 3 files changed, 36 insertions(+), 21 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 05099e1291..a2ff71e379 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2164,10 +2164,9 @@ VSX_RSQRTE(xvrsqrtesp, 4, float32, VsrW(i), 0, 0)
  *   nbits - number of fraction bits
  */
 #define VSX_TDIV(op, nels, tp, fld, emin, emax, nbits)                  \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                     \
+                 ppc_vsr_t *xa, ppc_vsr_t *xb)                          \
 {                                                                       \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                              \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                              \
     int i;                                                              \
     int fe_flag = 0;                                                    \
     int fg_flag = 0;                                                    \
@@ -2413,10 +2412,9 @@ VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
 VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
 VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
 
-void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode)
+void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
+                       ppc_vsr_t *xa, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];
     int64_t exp_a, exp_b;
     uint32_t cc;
 
@@ -2474,10 +2472,9 @@ void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode)
 }
 
 #define VSX_SCALAR_CMP(op, ordered)                                      \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                      \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                      \
+                 ppc_vsr_t *xa, ppc_vsr_t *xb)                           \
 {                                                                        \
-    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                               \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                               \
     uint32_t cc = 0;                                                     \
     bool vxsnan_flag = false, vxvc_flag = false;                         \
                                                                          \
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index c957c53755..9c236f016e 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -387,7 +387,7 @@ DEF_HELPER_2(xsdivqp, void, env, i32)
 DEF_HELPER_4(xsredp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xssqrtdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrsqrtedp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xstdivdp, void, env, i32)
+DEF_HELPER_4(xstdivdp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xstsqrtdp, void, env, i32)
 DEF_HELPER_5(xsmaddadp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmaddmdp, void, env, i32, vsr, vsr, vsr)
@@ -401,10 +401,10 @@ DEF_HELPER_5(xscmpeqdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xscmpgtdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xscmpgedp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xscmpnedp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xscmpexpdp, void, env, i32)
+DEF_HELPER_4(xscmpexpdp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscmpexpqp, void, env, i32)
-DEF_HELPER_2(xscmpodp, void, env, i32)
-DEF_HELPER_2(xscmpudp, void, env, i32)
+DEF_HELPER_4(xscmpodp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscmpudp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscmpoqp, void, env, i32)
 DEF_HELPER_2(xscmpuqp, void, env, i32)
 DEF_HELPER_5(xsmaxdp, void, env, i32, vsr, vsr, vsr)
@@ -472,7 +472,7 @@ DEF_HELPER_5(xvdivdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_4(xvredp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvsqrtdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrsqrtedp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xvtdivdp, void, env, i32)
+DEF_HELPER_4(xvtdivdp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xvtsqrtdp, void, env, i32)
 DEF_HELPER_5(xvmaddadp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmaddmdp, void, env, i32, vsr, vsr, vsr)
@@ -510,7 +510,7 @@ DEF_HELPER_5(xvdivsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_4(xvresp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvsqrtsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrsqrtesp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xvtdivsp, void, env, i32)
+DEF_HELPER_4(xvtdivsp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xvtsqrtsp, void, env, i32)
 DEF_HELPER_5(xvmaddasp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmaddmsp, void, env, i32, vsr, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7f719e9172..fed56fce69 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -969,6 +969,24 @@ static void gen_##name(DisasContext *ctx)                                     \
     tcg_temp_free_ptr(xb);                                                    \
 }
 
+#define GEN_VSX_HELPER_X2_AB(name, op1, op2, inval, type)                     \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    TCGv_ptr xa, xb;                                                          \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    xa = gen_vsr_ptr(xA(ctx->opcode));                                        \
+    xb = gen_vsr_ptr(xB(ctx->opcode));                                        \
+    gen_helper_##name(cpu_env, opc, xa, xb);                                  \
+    tcg_temp_free_i32(opc);                                                   \
+    tcg_temp_free_ptr(xa);                                                    \
+    tcg_temp_free_ptr(xb);                                                    \
+}
+
 #define GEN_VSX_HELPER_XT_XB_ENV(name, op1, op2, inval, type) \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
@@ -997,7 +1015,7 @@ GEN_VSX_HELPER_2(xsdivqp, 0x04, 0x11, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2_AB(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmaddadp, 0x04, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmaddmdp, 0x04, 0x05, 0, PPC2_VSX)
@@ -1011,10 +1029,10 @@ GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X2_AB(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2_AB(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2_AB(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpoqp, 0x04, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpuqp, 0x04, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
@@ -1083,7 +1101,7 @@ GEN_VSX_HELPER_X3(xvdivdp, 0x00, 0x0F, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2_AB(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtsqrtdp, 0x14, 0x0E, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddadp, 0x04, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddmdp, 0x04, 0x0D, 0, PPC2_VSX)
@@ -1121,7 +1139,7 @@ GEN_VSX_HELPER_X3(xvdivsp, 0x00, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2_AB(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtsqrtsp, 0x14, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddasp, 0x04, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddmsp, 0x04, 0x09, 0, PPC2_VSX)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 07/14] target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (5 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 06/14] target/ppc: introduce GEN_VSX_HELPER_X2_AB " Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:43   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 08/14] target/ppc: introduce GEN_VSX_HELPER_R3 " Mark Cave-Ayland
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X1 macro which performs the decode based
upon xB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             |  6 ++----
 target/ppc/helper.h                 |  8 ++++----
 target/ppc/translate/vsx-impl.inc.c | 24 ++++++++++++++++++++----
 3 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index a2ff71e379..4b4bc229b5 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2221,9 +2221,8 @@ VSX_TDIV(xvtdivsp, 4, float32, VsrW(i), -126, 127, 23)
  *   nbits - number of fraction bits
  */
 #define VSX_TSQRT(op, nels, tp, fld, emin, nbits)                       \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode, ppc_vsr_t *xb)      \
 {                                                                       \
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                              \
     int i;                                                              \
     int fe_flag = 0;                                                    \
     int fg_flag = 0;                                                    \
@@ -3230,9 +3229,8 @@ VSX_TEST_DC(xvtstdcsp, 4, xB(opcode), float32, VsrW(i), VsrW(i), UINT32_MAX, 0)
 VSX_TEST_DC(xststdcdp, 1, xB(opcode), float64, VsrD(0), VsrD(0), 0, 1)
 VSX_TEST_DC(xststdcqp, 1, (rB(opcode) + 32), float128, f128, VsrD(0), 0, 1)
 
-void helper_xststdcsp(CPUPPCState *env, uint32_t opcode)
+void helper_xststdcsp(CPUPPCState *env, uint32_t opcode, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xb = &env->vsr[xB(opcode)];
     uint32_t dcmx, sign, exp;
     uint32_t cc, match = 0, not_sp = 0;
 
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 9c236f016e..cb6b7a55c1 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -388,7 +388,7 @@ DEF_HELPER_4(xsredp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xssqrtdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrsqrtedp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xstdivdp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xstsqrtdp, void, env, i32)
+DEF_HELPER_3(xstsqrtdp, void, env, i32, vsr)
 DEF_HELPER_5(xsmaddadp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmaddmdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmsubadp, void, env, i32, vsr, vsr, vsr)
@@ -435,7 +435,7 @@ DEF_HELPER_4(xscvuxdsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvsxdsp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvudqp, void, env, i32)
 DEF_HELPER_4(xscvuxddp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xststdcsp, void, env, i32)
+DEF_HELPER_3(xststdcsp, void, env, i32, vsr)
 DEF_HELPER_2(xststdcdp, void, env, i32)
 DEF_HELPER_2(xststdcqp, void, env, i32)
 DEF_HELPER_4(xsrdpi, void, env, i32, vsr, vsr)
@@ -473,7 +473,7 @@ DEF_HELPER_4(xvredp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvsqrtdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrsqrtedp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvtdivdp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xvtsqrtdp, void, env, i32)
+DEF_HELPER_3(xvtsqrtdp, void, env, i32, vsr)
 DEF_HELPER_5(xvmaddadp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmaddmdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmsubadp, void, env, i32, vsr, vsr, vsr)
@@ -511,7 +511,7 @@ DEF_HELPER_4(xvresp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvsqrtsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrsqrtesp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvtdivsp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xvtsqrtsp, void, env, i32)
+DEF_HELPER_3(xvtsqrtsp, void, env, i32, vsr)
 DEF_HELPER_5(xvmaddasp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmaddmsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmsubasp, void, env, i32, vsr, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index fed56fce69..a30e22a852 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -987,6 +987,22 @@ static void gen_##name(DisasContext *ctx)                                     \
     tcg_temp_free_ptr(xb);                                                    \
 }
 
+#define GEN_VSX_HELPER_X1(name, op1, op2, inval, type)                        \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    TCGv_ptr xb;                                                              \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    xb = gen_vsr_ptr(xB(ctx->opcode));                                        \
+    gen_helper_##name(cpu_env, opc, xb);                                      \
+    tcg_temp_free_i32(opc);                                                   \
+    tcg_temp_free_ptr(xb);                                                    \
+}
+
 #define GEN_VSX_HELPER_XT_XB_ENV(name, op1, op2, inval, type) \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
@@ -1016,7 +1032,7 @@ GEN_VSX_HELPER_X2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2_AB(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_X1(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmaddadp, 0x04, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmaddmdp, 0x04, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmsubadp, 0x04, 0x06, 0, PPC2_VSX)
@@ -1090,7 +1106,7 @@ GEN_VSX_HELPER_X3(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
-GEN_VSX_HELPER_2(xststdcsp, 0x14, 0x12, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X1(xststdcsp, 0x14, 0x12, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xststdcdp, 0x14, 0x16, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xststdcqp, 0x04, 0x16, 0, PPC2_ISA300)
 
@@ -1102,7 +1118,7 @@ GEN_VSX_HELPER_X2(xvredp, 0x14, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2_AB(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvtsqrtdp, 0x14, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X1(xvtsqrtdp, 0x14, 0x0E, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddadp, 0x04, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddmdp, 0x04, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmsubadp, 0x04, 0x0E, 0, PPC2_VSX)
@@ -1140,7 +1156,7 @@ GEN_VSX_HELPER_X2(xvresp, 0x14, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2_AB(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvtsqrtsp, 0x14, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X1(xvtsqrtsp, 0x14, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddasp, 0x04, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaddmsp, 0x04, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmsubasp, 0x04, 0x0A, 0, PPC2_VSX)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 08/14] target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (6 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 07/14] target/ppc: introduce GEN_VSX_HELPER_X1 " Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:47   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 09/14] target/ppc: introduce GEN_VSX_HELPER_R2 " Mark Cave-Ayland
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X3 macro which performs the decode based
upon rD, rA and rB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             | 36 ++++++++++++------------------------
 target/ppc/helper.h                 | 16 ++++++++--------
 target/ppc/translate/vsx-impl.inc.c | 36 ++++++++++++++++++++++++++++--------
 3 files changed, 48 insertions(+), 40 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 4b4bc229b5..4e97093186 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -1840,11 +1840,9 @@ VSX_ADD_SUB(xssubsp, sub, 1, float64, VsrD(0), 1, 1)
 VSX_ADD_SUB(xvsubdp, sub, 2, float64, VsrD(i), 0, 0)
 VSX_ADD_SUB(xvsubsp, sub, 4, float32, VsrW(i), 0, 0)
 
-void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
+void helper_xsaddqp(CPUPPCState *env, uint32_t opcode,
+                    ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
-    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
     helper_reset_fpstatus(env);
@@ -1914,11 +1912,9 @@ VSX_MUL(xsmulsp, 1, float64, VsrD(0), 1, 1)
 VSX_MUL(xvmuldp, 2, float64, VsrD(i), 0, 0)
 VSX_MUL(xvmulsp, 4, float32, VsrW(i), 0, 0)
 
-void helper_xsmulqp(CPUPPCState *env, uint32_t opcode)
+void helper_xsmulqp(CPUPPCState *env, uint32_t opcode,
+                    ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
-    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
     helper_reset_fpstatus(env);
@@ -1989,11 +1985,9 @@ VSX_DIV(xsdivsp, 1, float64, VsrD(0), 1, 1)
 VSX_DIV(xvdivdp, 2, float64, VsrD(i), 0, 0)
 VSX_DIV(xvdivsp, 4, float32, VsrW(i), 0, 0)
 
-void helper_xsdivqp(CPUPPCState *env, uint32_t opcode)
+void helper_xsdivqp(CPUPPCState *env, uint32_t opcode,
+                    ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
-    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
     helper_reset_fpstatus(env);
@@ -2600,11 +2594,9 @@ VSX_MAX_MIN(xvmindp, minnum, 2, float64, VsrD(i))
 VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
 
 #define VSX_MAX_MINC(name, max)                                               \
-void helper_##name(CPUPPCState *env, uint32_t opcode)                         \
+void helper_##name(CPUPPCState *env, uint32_t opcode,                         \
+                   ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)               \
 {                                                                             \
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                               \
-    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];                               \
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                               \
     ppc_vsr_t r;                                                              \
     bool vxsnan_flag = false, vex_flag = false;                               \
                                                                               \
@@ -2637,11 +2629,9 @@ VSX_MAX_MINC(xsmaxcdp, 1);
 VSX_MAX_MINC(xsmincdp, 0);
 
 #define VSX_MAX_MINJ(name, max)                                               \
-void helper_##name(CPUPPCState *env, uint32_t opcode)                         \
+void helper_##name(CPUPPCState *env, uint32_t opcode,                         \
+                   ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)               \
 {                                                                             \
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                               \
-    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];                               \
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                               \
     ppc_vsr_t r;                                                              \
     bool vxsnan_flag = false, vex_flag = false;                               \
                                                                               \
@@ -3405,11 +3395,9 @@ void helper_xssqrtqp(CPUPPCState *env, uint32_t opcode)
     do_float_check_status(env, GETPC());
 }
 
-void helper_xssubqp(CPUPPCState *env, uint32_t opcode)
+void helper_xssubqp(CPUPPCState *env, uint32_t opcode,
+                    ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
-    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
     helper_reset_fpstatus(env);
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index cb6b7a55c1..8ed35f91da 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -378,12 +378,12 @@ DEF_HELPER_4(bcdtrunc, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdutrunc, i32, avr, avr, avr, i32)
 
 DEF_HELPER_5(xsadddp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xsaddqp, void, env, i32)
+DEF_HELPER_5(xsaddqp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xssubdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmuldp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xsmulqp, void, env, i32)
+DEF_HELPER_5(xsmulqp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsdivdp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xsdivqp, void, env, i32)
+DEF_HELPER_5(xsdivqp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_4(xsredp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xssqrtdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrsqrtedp, void, env, i32, vsr, vsr)
@@ -409,10 +409,10 @@ DEF_HELPER_2(xscmpoqp, void, env, i32)
 DEF_HELPER_2(xscmpuqp, void, env, i32)
 DEF_HELPER_5(xsmaxdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmindp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_2(xsmaxcdp, void, env, i32)
-DEF_HELPER_2(xsmincdp, void, env, i32)
-DEF_HELPER_2(xsmaxjdp, void, env, i32)
-DEF_HELPER_2(xsminjdp, void, env, i32)
+DEF_HELPER_5(xsmaxcdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmincdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsmaxjdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_5(xsminjdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_4(xscvdphp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvdpqp, void, env, i32)
 DEF_HELPER_4(xscvdpsp, void, env, i32, vsr, vsr)
@@ -446,7 +446,7 @@ DEF_HELPER_4(xsrdpiz, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xsrqpi, void, env, i32)
 DEF_HELPER_2(xsrqpxp, void, env, i32)
 DEF_HELPER_2(xssqrtqp, void, env, i32)
-DEF_HELPER_2(xssubqp, void, env, i32)
+DEF_HELPER_5(xssubqp, void, env, i32, vsr, vsr, vsr)
 
 DEF_HELPER_5(xsaddsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xssubsp, void, env, i32, vsr, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index a30e22a852..d30682cd4f 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1003,6 +1003,26 @@ static void gen_##name(DisasContext *ctx)                                     \
     tcg_temp_free_ptr(xb);                                                    \
 }
 
+#define GEN_VSX_HELPER_R3(name, op1, op2, inval, type)                        \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    TCGv_ptr xt, xa, xb;                                                      \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    xt = gen_vsr_ptr(rD(ctx->opcode) + 32);                                   \
+    xa = gen_vsr_ptr(rA(ctx->opcode) + 32);                                   \
+    xb = gen_vsr_ptr(rB(ctx->opcode) + 32);                                   \
+    gen_helper_##name(cpu_env, opc, xt, xa, xb);                              \
+    tcg_temp_free_i32(opc);                                                   \
+    tcg_temp_free_ptr(xt);                                                    \
+    tcg_temp_free_ptr(xa);                                                    \
+    tcg_temp_free_ptr(xb);                                                    \
+}
+
 #define GEN_VSX_HELPER_XT_XB_ENV(name, op1, op2, inval, type) \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
@@ -1022,12 +1042,12 @@ static void gen_##name(DisasContext *ctx)                     \
 }
 
 GEN_VSX_HELPER_X3(xsadddp, 0x00, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsaddqp, 0x04, 0x00, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R3(xsaddqp, 0x04, 0x00, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xssubdp, 0x00, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmuldp, 0x00, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmulqp, 0x04, 0x01, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R3(xsmulqp, 0x04, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xsdivdp, 0x00, 0x07, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsdivqp, 0x04, 0x11, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R3(xsdivqp, 0x04, 0x11, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xsredp, 0x14, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
@@ -1053,10 +1073,10 @@ GEN_VSX_HELPER_2(xscmpoqp, 0x04, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscmpuqp, 0x04, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xsmaxcdp, 0x00, 0x10, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xsmincdp, 0x00, 0x11, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xsmaxjdp, 0x00, 0x12, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xsminjdp, 0x00, 0x12, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R3(xsmaxcdp, 0x00, 0x10, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R3(xsmincdp, 0x00, 0x11, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R3(xsmaxjdp, 0x00, 0x12, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R3(xsminjdp, 0x00, 0x12, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xscvdphp, 0x16, 0x15, 0x11, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xscvdpqp, 0x04, 0x1A, 0x16, PPC2_ISA300)
@@ -1087,7 +1107,7 @@ GEN_VSX_HELPER_XT_XB_ENV(xsrsp, 0x12, 0x11, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsrqpi, 0x05, 0x00, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsrqpxp, 0x05, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xssqrtqp, 0x04, 0x19, 0x1B, PPC2_ISA300)
-GEN_VSX_HELPER_2(xssubqp, 0x04, 0x10, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R3(xssubqp, 0x04, 0x10, 0, PPC2_ISA300)
 
 GEN_VSX_HELPER_X3(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 09/14] target/ppc: introduce GEN_VSX_HELPER_R2 macro to fpu_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (7 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 08/14] target/ppc: introduce GEN_VSX_HELPER_R3 " Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:51   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 10/14] target/ppc: introduce GEN_VSX_HELPER_R2_AB " Mark Cave-Ayland
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X3 macro which performs the decode based
upon rD and rB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             | 31 ++++++++++++------------------
 target/ppc/helper.h                 | 20 +++++++++----------
 target/ppc/translate/vsx-impl.inc.c | 38 +++++++++++++++++++++++++++----------
 3 files changed, 50 insertions(+), 39 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 4e97093186..b26a1f1494 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2786,10 +2786,9 @@ VSX_CVT_FP_TO_FP(xvcvspdp, 2, float32, float64, VsrW(2 * i), VsrD(i), 0)
  *   sfprf - set FPRF
  */
 #define VSX_CVT_FP_TO_FP_VECTOR(op, nels, stp, ttp, sfld, tfld, sfprf)    \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                       \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                       \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                            \
 {                                                                       \
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                         \
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                         \
     int i;                                                              \
                                                                         \
     for (i = 0; i < nels; i++) {                                        \
@@ -2952,10 +2951,9 @@ VSX_CVT_FP_TO_INT(xvcvspuxws, 4, float32, uint32, VsrW(i), VsrW(i), 0U)
  *   rnan  - resulting NaN
  */
 #define VSX_CVT_FP_TO_INT_VECTOR(op, stp, ttp, sfld, tfld, rnan)             \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                          \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                          \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                               \
 {                                                                            \
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                              \
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                              \
                                                                              \
     memset(xt, 0, sizeof(ppc_vsr_t));                                        \
                                                                              \
@@ -3028,11 +3026,9 @@ VSX_CVT_INT_TO_FP(xvcvuxwsp, 4, uint32, float32, VsrW(i), VsrW(i), 0, 0)
  *   tfld  - target vsr_t field
  */
 #define VSX_CVT_INT_TO_FP_VECTOR(op, stp, ttp, sfld, tfld)              \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                     \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)                          \
 {                                                                       \
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];                         \
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                         \
-                                                                        \
     xt->tfld = stp##_to_##ttp(xb->sfld, &env->fp_status);               \
     helper_compute_fprf_##ttp(env, xt->tfld);                           \
                                                                         \
@@ -3250,10 +3246,9 @@ void helper_xststdcsp(CPUPPCState *env, uint32_t opcode, ppc_vsr_t *xb)
     env->crf[BF(opcode)] = cc;
 }
 
-void helper_xsrqpi(CPUPPCState *env, uint32_t opcode)
+void helper_xsrqpi(CPUPPCState *env, uint32_t opcode,
+                   ppc_vsr_t *xt, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     uint8_t r = Rrm(opcode);
     uint8_t ex = Rc(opcode);
     uint8_t rmc = RMC(opcode);
@@ -3307,10 +3302,9 @@ void helper_xsrqpi(CPUPPCState *env, uint32_t opcode)
     do_float_check_status(env, GETPC());
 }
 
-void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode)
+void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode,
+                    ppc_vsr_t *xt, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     uint8_t r = Rrm(opcode);
     uint8_t rmc = RMC(opcode);
     uint8_t rmode = 0;
@@ -3361,10 +3355,9 @@ void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode)
     do_float_check_status(env, GETPC());
 }
 
-void helper_xssqrtqp(CPUPPCState *env, uint32_t opcode)
+void helper_xssqrtqp(CPUPPCState *env, uint32_t opcode,
+                     ppc_vsr_t *xt, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     float_status tstat;
 
     memset(xt, 0, sizeof(ppc_vsr_t));
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 8ed35f91da..cea56ece30 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -414,16 +414,16 @@ DEF_HELPER_5(xsmincdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmaxjdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsminjdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_4(xscvdphp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xscvdpqp, void, env, i32)
+DEF_HELPER_4(xscvdpqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvdpsp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvdpspn, i64, env, i64)
 DEF_HELPER_4(xscvqpdp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xscvqpsdz, void, env, i32)
-DEF_HELPER_2(xscvqpswz, void, env, i32)
-DEF_HELPER_2(xscvqpudz, void, env, i32)
-DEF_HELPER_2(xscvqpuwz, void, env, i32)
+DEF_HELPER_4(xscvqpsdz, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvqpswz, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvqpudz, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscvqpuwz, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvhpdp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xscvsdqp, void, env, i32)
+DEF_HELPER_4(xscvsdqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvspdp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xscvspdpn, i64, env, i64)
 DEF_HELPER_4(xscvdpsxds, void, env, i32, vsr, vsr)
@@ -433,7 +433,7 @@ DEF_HELPER_4(xscvdpuxws, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvsxddp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvuxdsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvsxdsp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xscvudqp, void, env, i32)
+DEF_HELPER_4(xscvudqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvuxddp, void, env, i32, vsr, vsr)
 DEF_HELPER_3(xststdcsp, void, env, i32, vsr)
 DEF_HELPER_2(xststdcdp, void, env, i32)
@@ -443,9 +443,9 @@ DEF_HELPER_4(xsrdpic, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrdpim, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrdpip, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrdpiz, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xsrqpi, void, env, i32)
-DEF_HELPER_2(xsrqpxp, void, env, i32)
-DEF_HELPER_2(xssqrtqp, void, env, i32)
+DEF_HELPER_4(xsrqpi, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xsrqpxp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xssqrtqp, void, env, i32, vsr, vsr)
 DEF_HELPER_5(xssubqp, void, env, i32, vsr, vsr, vsr)
 
 DEF_HELPER_5(xsaddsp, void, env, i32, vsr, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index d30682cd4f..f304c11538 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1023,6 +1023,24 @@ static void gen_##name(DisasContext *ctx)                                     \
     tcg_temp_free_ptr(xb);                                                    \
 }
 
+#define GEN_VSX_HELPER_R2(name, op1, op2, inval, type)                        \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    TCGv_ptr xt, xb;                                                          \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    xt = gen_vsr_ptr(rD(ctx->opcode) + 32);                                   \
+    xb = gen_vsr_ptr(rB(ctx->opcode) + 32);                                   \
+    gen_helper_##name(cpu_env, opc, xt, xb);                                  \
+    tcg_temp_free_i32(opc);                                                   \
+    tcg_temp_free_ptr(xt);                                                    \
+    tcg_temp_free_ptr(xb);                                                    \
+}
+
 #define GEN_VSX_HELPER_XT_XB_ENV(name, op1, op2, inval, type) \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
@@ -1079,15 +1097,15 @@ GEN_VSX_HELPER_R3(xsmaxjdp, 0x00, 0x12, 0, PPC2_ISA300)
 GEN_VSX_HELPER_R3(xsminjdp, 0x00, 0x12, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xscvdphp, 0x16, 0x15, 0x11, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xscvdpsp, 0x12, 0x10, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscvdpqp, 0x04, 0x1A, 0x16, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xscvdpqp, 0x04, 0x1A, 0x16, PPC2_ISA300)
 GEN_VSX_HELPER_XT_XB_ENV(xscvdpspn, 0x16, 0x10, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xscvqpdp, 0x04, 0x1A, 0x14, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvqpsdz, 0x04, 0x1A, 0x19, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvqpswz, 0x04, 0x1A, 0x09, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvqpudz, 0x04, 0x1A, 0x11, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvqpuwz, 0x04, 0x1A, 0x01, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xscvqpsdz, 0x04, 0x1A, 0x19, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xscvqpswz, 0x04, 0x1A, 0x09, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xscvqpudz, 0x04, 0x1A, 0x11, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xscvqpuwz, 0x04, 0x1A, 0x01, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xscvhpdp, 0x16, 0x15, 0x10, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscvsdqp, 0x04, 0x1A, 0x0A, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xscvsdqp, 0x04, 0x1A, 0x0A, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xscvspdp, 0x12, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_XT_XB_ENV(xscvspdpn, 0x16, 0x14, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xscvdpsxds, 0x10, 0x15, 0, PPC2_VSX)
@@ -1095,7 +1113,7 @@ GEN_VSX_HELPER_X2(xscvdpsxws, 0x10, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xscvdpuxds, 0x10, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xscvdpuxws, 0x10, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xscvsxddp, 0x10, 0x17, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscvudqp, 0x04, 0x1A, 0x02, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xscvudqp, 0x04, 0x1A, 0x02, PPC2_ISA300)
 GEN_VSX_HELPER_X2(xscvuxddp, 0x10, 0x16, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrdpi, 0x12, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrdpic, 0x16, 0x06, 0, PPC2_VSX)
@@ -1104,9 +1122,9 @@ GEN_VSX_HELPER_X2(xsrdpip, 0x12, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_XT_XB_ENV(xsrsp, 0x12, 0x11, 0, PPC2_VSX207)
 
-GEN_VSX_HELPER_2(xsrqpi, 0x05, 0x00, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xsrqpxp, 0x05, 0x01, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xssqrtqp, 0x04, 0x19, 0x1B, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xsrqpi, 0x05, 0x00, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xsrqpxp, 0x05, 0x01, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xssqrtqp, 0x04, 0x19, 0x1B, PPC2_ISA300)
 GEN_VSX_HELPER_R3(xssubqp, 0x04, 0x10, 0, PPC2_ISA300)
 
 GEN_VSX_HELPER_X3(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 10/14] target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.c
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (8 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 09/14] target/ppc: introduce GEN_VSX_HELPER_R2 " Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:52   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 11/14] target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time Mark Cave-Ayland
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_R2_AB macro which performs the decode based
upon rA and rB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             | 10 ++++------
 target/ppc/helper.h                 |  6 +++---
 target/ppc/translate/vsx-impl.inc.c | 24 +++++++++++++++++++++---
 3 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index b26a1f1494..370b1d2c46 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2434,10 +2434,9 @@ void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
     do_float_check_status(env, GETPC());
 }
 
-void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode)
+void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode,
+                       ppc_vsr_t *xa, ppc_vsr_t *xb)
 {
-    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
     int64_t exp_a, exp_b;
     uint32_t cc;
 
@@ -2513,10 +2512,9 @@ VSX_SCALAR_CMP(xscmpodp, 1)
 VSX_SCALAR_CMP(xscmpudp, 0)
 
 #define VSX_SCALAR_CMPQ(op, ordered)                                    \
-void helper_##op(CPUPPCState *env, uint32_t opcode)                     \
+void helper_##op(CPUPPCState *env, uint32_t opcode,                     \
+                 ppc_vsr_t *xa, ppc_vsr_t *xb)                          \
 {                                                                       \
-    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];                         \
-    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];                         \
     uint32_t cc = 0;                                                    \
     bool vxsnan_flag = false, vxvc_flag = false;                        \
                                                                         \
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index cea56ece30..167d6e45fd 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -402,11 +402,11 @@ DEF_HELPER_5(xscmpgtdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xscmpgedp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xscmpnedp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpexpdp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xscmpexpqp, void, env, i32)
+DEF_HELPER_4(xscmpexpqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscmpodp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscmpudp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xscmpoqp, void, env, i32)
-DEF_HELPER_2(xscmpuqp, void, env, i32)
+DEF_HELPER_4(xscmpoqp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xscmpuqp, void, env, i32, vsr, vsr)
 DEF_HELPER_5(xsmaxdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmindp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xsmaxcdp, void, env, i32, vsr, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index f304c11538..51d4e0cdd6 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1041,6 +1041,24 @@ static void gen_##name(DisasContext *ctx)                                     \
     tcg_temp_free_ptr(xb);                                                    \
 }
 
+#define GEN_VSX_HELPER_R2_AB(name, op1, op2, inval, type)                     \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    TCGv_ptr xa, xb;                                                          \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    xa = gen_vsr_ptr(rA(ctx->opcode) + 32);                                   \
+    xb = gen_vsr_ptr(rB(ctx->opcode) + 32);                                   \
+    gen_helper_##name(cpu_env, opc, xa, xb);                                  \
+    tcg_temp_free_i32(opc);                                                   \
+    tcg_temp_free_ptr(xa);                                                    \
+    tcg_temp_free_ptr(xb);                                                    \
+}
+
 #define GEN_VSX_HELPER_XT_XB_ENV(name, op1, op2, inval, type) \
 static void gen_##name(DisasContext *ctx)                     \
 {                                                             \
@@ -1084,11 +1102,11 @@ GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2_AB(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R2_AB(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2_AB(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2_AB(xscmpudp, 0x0C, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscmpoqp, 0x04, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xscmpuqp, 0x04, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_R2_AB(xscmpoqp, 0x04, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_R2_AB(xscmpuqp, 0x04, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmaxdp, 0x00, 0x14, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xsmindp, 0x00, 0x15, 0, PPC2_VSX)
 GEN_VSX_HELPER_R3(xsmaxcdp, 0x00, 0x10, 0, PPC2_ISA300)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 11/14] target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (9 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 10/14] target/ppc: introduce GEN_VSX_HELPER_R2_AB " Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:53   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 12/14] target/ppc: decode target register in VSX_EXTRACT_INSERT " Mark Cave-Ayland
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/helper.h                 | 8 ++++----
 target/ppc/mem_helper.c             | 6 ++----
 target/ppc/translate/vsx-impl.inc.c | 7 ++++---
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 167d6e45fd..5f844cc968 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -291,10 +291,10 @@ DEF_HELPER_3(stvebx, void, env, avr, tl)
 DEF_HELPER_3(stvehx, void, env, avr, tl)
 DEF_HELPER_3(stvewx, void, env, avr, tl)
 #if defined(TARGET_PPC64)
-DEF_HELPER_4(lxvl, void, env, tl, tl, tl)
-DEF_HELPER_4(lxvll, void, env, tl, tl, tl)
-DEF_HELPER_4(stxvl, void, env, tl, tl, tl)
-DEF_HELPER_4(stxvll, void, env, tl, tl, tl)
+DEF_HELPER_4(lxvl, void, env, tl, vsr, tl)
+DEF_HELPER_4(lxvll, void, env, tl, vsr, tl)
+DEF_HELPER_4(stxvl, void, env, tl, vsr, tl)
+DEF_HELPER_4(stxvll, void, env, tl, vsr, tl)
 #endif
 DEF_HELPER_4(vsumsws, void, env, avr, avr, avr)
 DEF_HELPER_4(vsum2sws, void, env, avr, avr, avr)
diff --git a/target/ppc/mem_helper.c b/target/ppc/mem_helper.c
index 4dfa7ee23f..f3426315d2 100644
--- a/target/ppc/mem_helper.c
+++ b/target/ppc/mem_helper.c
@@ -415,9 +415,8 @@ STVE(stvewx, cpu_stl_data_ra, bswap32, u32)
 
 #define VSX_LXVL(name, lj)                                              \
 void helper_##name(CPUPPCState *env, target_ulong addr,                 \
-                   target_ulong xt, target_ulong rb)                    \
+                   ppc_vsr_t *r, target_ulong rb)                       \
 {                                                                       \
-    ppc_vsr_t *r = &env->vsr[xt];                                       \
     int nb = GET_NB(env->gpr[rb]);                                      \
     int i;                                                              \
                                                                         \
@@ -444,9 +443,8 @@ VSX_LXVL(lxvll, 1)
 
 #define VSX_STXVL(name, lj)                                       \
 void helper_##name(CPUPPCState *env, target_ulong addr,           \
-                   target_ulong xt, target_ulong rb)              \
+                   ppc_vsr_t *r, target_ulong rb)                 \
 {                                                                 \
-    ppc_vsr_t *r = &env->vsr[xt];                                 \
     int nb = GET_NB(env->gpr[rb]);                                \
     int i;                                                        \
                                                                   \
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 51d4e0cdd6..7c79ec22dd 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -297,7 +297,8 @@ VSX_VECTOR_LOAD_STORE(stxvx, st_i64, 1)
 #define VSX_VECTOR_LOAD_STORE_LENGTH(name)                      \
 static void gen_##name(DisasContext *ctx)                       \
 {                                                               \
-    TCGv EA, xt, rb;                                            \
+    TCGv EA, rb;                                                \
+    TCGv_ptr xt;                                                \
                                                                 \
     if (xT(ctx->opcode) < 32) {                                 \
         if (unlikely(!ctx->vsx_enabled)) {                      \
@@ -313,12 +314,12 @@ static void gen_##name(DisasContext *ctx)                       \
     EA = tcg_temp_new();                                        \
     gen_set_access_type(ctx, ACCESS_INT);                       \
     gen_addr_register(ctx, EA);                                 \
-    xt = tcg_const_tl(xT(ctx->opcode));                         \
+    xt = gen_vsr_ptr(xT(ctx->opcode));                          \
     rb = tcg_const_tl(rB(ctx->opcode));                         \
     gen_helper_##name(cpu_env, EA, xt, rb);                     \
     tcg_temp_free(EA);                                          \
-    tcg_temp_free(xt);                                          \
     tcg_temp_free(rb);                                          \
+    tcg_temp_free_ptr(xt);                                      \
 }
 
 VSX_VECTOR_LOAD_STORE_LENGTH(lxvl)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 12/14] target/ppc: decode target register in VSX_EXTRACT_INSERT at translation time
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (10 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 11/14] target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:53   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 13/14] target/ppc: improve VSX_TEST_DC with new generator macros Mark Cave-Ayland
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro Mark Cave-Ayland
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/helper.h                 |  4 ++--
 target/ppc/int_helper.c             | 12 ++++--------
 target/ppc/translate/vsx-impl.inc.c | 10 +++++-----
 3 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 5f844cc968..81630a5f23 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -546,8 +546,8 @@ DEF_HELPER_4(xvrspip, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrspiz, void, env, i32, vsr, vsr)
 DEF_HELPER_5(xxperm, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xxpermr, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_4(xxextractuw, void, env, tl, tl, i32)
-DEF_HELPER_4(xxinsertw, void, env, tl, tl, i32)
+DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
+DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
 DEF_HELPER_4(xvxsigsp, void, env, i32, vsr, vsr)
 
 DEF_HELPER_2(efscfsi, i32, env, i32)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index f616fb249d..10e7ee5943 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1901,11 +1901,9 @@ VEXTRACT(uw, u32)
 VEXTRACT(d, u64)
 #undef VEXTRACT
 
-void helper_xxextractuw(CPUPPCState *env, target_ulong xtn,
-                        target_ulong xbn, uint32_t index)
+void helper_xxextractuw(CPUPPCState *env, ppc_vsr_t *xt,
+                        ppc_vsr_t *xb, uint32_t index)
 {
-    ppc_vsr_t *xt = &env->vsr[xtn];
-    ppc_vsr_t *xb = &env->vsr[xbn];
     size_t es = sizeof(uint32_t);
     uint32_t ext_index;
     int i;
@@ -1918,11 +1916,9 @@ void helper_xxextractuw(CPUPPCState *env, target_ulong xtn,
     }
 }
 
-void helper_xxinsertw(CPUPPCState *env, target_ulong xtn,
-                      target_ulong xbn, uint32_t index)
+void helper_xxinsertw(CPUPPCState *env, ppc_vsr_t *xt,
+                      ppc_vsr_t *xb, uint32_t index)
 {
-    ppc_vsr_t *xt = &env->vsr[xtn];
-    ppc_vsr_t *xb = &env->vsr[xbn];
     size_t es = sizeof(uint32_t);
     int ins_index, i = 0;
 
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7c79ec22dd..b8f24b7462 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1569,7 +1569,7 @@ static void gen_xxsldwi(DisasContext *ctx)
 #define VSX_EXTRACT_INSERT(name)                                \
 static void gen_##name(DisasContext *ctx)                       \
 {                                                               \
-    TCGv xt, xb;                                                \
+    TCGv_ptr xt, xb;                                            \
     TCGv_i32 t0;                                                \
     TCGv_i64 t1;                                                \
     uint8_t uimm = UIMM4(ctx->opcode);                          \
@@ -1578,8 +1578,8 @@ static void gen_##name(DisasContext *ctx)                       \
         gen_exception(ctx, POWERPC_EXCP_VSXU);                  \
         return;                                                 \
     }                                                           \
-    xt = tcg_const_tl(xT(ctx->opcode));                         \
-    xb = tcg_const_tl(xB(ctx->opcode));                         \
+    xt = gen_vsr_ptr(xT(ctx->opcode));                          \
+    xb = gen_vsr_ptr(xB(ctx->opcode));                          \
     t0 = tcg_temp_new_i32();                                    \
     t1 = tcg_temp_new_i64();                                    \
     /*                                                          \
@@ -1594,8 +1594,8 @@ static void gen_##name(DisasContext *ctx)                       \
     }                                                           \
     tcg_gen_movi_i32(t0, uimm);                                 \
     gen_helper_##name(cpu_env, xt, xb, t0);                     \
-    tcg_temp_free(xb);                                          \
-    tcg_temp_free(xt);                                          \
+    tcg_temp_free_ptr(xb);                                      \
+    tcg_temp_free_ptr(xt);                                      \
     tcg_temp_free_i32(t0);                                      \
     tcg_temp_free_i64(t1);                                      \
 }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 13/14] target/ppc: improve VSX_TEST_DC with new generator macros
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (11 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 12/14] target/ppc: decode target register in VSX_EXTRACT_INSERT " Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 16:55   ` Richard Henderson
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro Mark Cave-Ayland
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

The source and destination registers can now be decoded in the generator
function using the new GEN_VSX_HELPER_X2 and GEN_VSX_HELPER_R2 macros.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             | 16 +++++++---------
 target/ppc/helper.h                 |  8 ++++----
 target/ppc/translate/vsx-impl.inc.c |  8 ++++----
 3 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 370b1d2c46..357be25867 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -3158,18 +3158,16 @@ void helper_xvxsigsp(CPUPPCState *env, uint32_t opcode,
  * VSX_TEST_DC - VSX floating point test data class
  *   op    - instruction mnemonic
  *   nels  - number of elements (1, 2 or 4)
- *   xbn   - VSR register number
  *   tp    - type (float32 or float64)
  *   fld   - vsr_t field (VsrD(*) or VsrW(*))
  *   tfld   - target vsr_t field (VsrD(*) or VsrW(*))
  *   fld_max - target field max
  *   scrf - set result in CR and FPCC
  */
-#define VSX_TEST_DC(op, nels, xbn, tp, fld, tfld, fld_max, scrf)  \
-void helper_##op(CPUPPCState *env, uint32_t opcode)         \
+#define VSX_TEST_DC(op, nels, tp, fld, tfld, fld_max, scrf)  \
+void helper_##op(CPUPPCState *env, uint32_t opcode,         \
+                 ppc_vsr_t *xt, ppc_vsr_t *xb)              \
 {                                                           \
-    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                  \
-    ppc_vsr_t *xb = &env->vsr[xbn];                         \
     ppc_vsr_t r;                                            \
     uint32_t i, sign, dcmx;                                 \
     uint32_t cc, match = 0;                                 \
@@ -3208,10 +3206,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)         \
     }                                                       \
 }
 
-VSX_TEST_DC(xvtstdcdp, 2, xB(opcode), float64, VsrD(i), VsrD(i), UINT64_MAX, 0)
-VSX_TEST_DC(xvtstdcsp, 4, xB(opcode), float32, VsrW(i), VsrW(i), UINT32_MAX, 0)
-VSX_TEST_DC(xststdcdp, 1, xB(opcode), float64, VsrD(0), VsrD(0), 0, 1)
-VSX_TEST_DC(xststdcqp, 1, (rB(opcode) + 32), float128, f128, VsrD(0), 0, 1)
+VSX_TEST_DC(xvtstdcdp, 2, float64, VsrD(i), VsrD(i), UINT64_MAX, 0)
+VSX_TEST_DC(xvtstdcsp, 4, float32, VsrW(i), VsrW(i), UINT32_MAX, 0)
+VSX_TEST_DC(xststdcdp, 1, float64, VsrD(0), VsrD(0), 0, 1)
+VSX_TEST_DC(xststdcqp, 1, float128, f128, VsrD(0), 0, 1)
 
 void helper_xststdcsp(CPUPPCState *env, uint32_t opcode, ppc_vsr_t *xb)
 {
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 81630a5f23..cd97fae438 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -436,8 +436,8 @@ DEF_HELPER_4(xscvsxdsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvudqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscvuxddp, void, env, i32, vsr, vsr)
 DEF_HELPER_3(xststdcsp, void, env, i32, vsr)
-DEF_HELPER_2(xststdcdp, void, env, i32)
-DEF_HELPER_2(xststdcqp, void, env, i32)
+DEF_HELPER_4(xststdcdp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xststdcqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrdpi, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrdpic, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrdpim, void, env, i32, vsr, vsr)
@@ -537,8 +537,8 @@ DEF_HELPER_4(xvcvsxdsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvcvuxdsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvcvsxwsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvcvuxwsp, void, env, i32, vsr, vsr)
-DEF_HELPER_2(xvtstdcsp, void, env, i32)
-DEF_HELPER_2(xvtstdcdp, void, env, i32)
+DEF_HELPER_4(xvtstdcsp, void, env, i32, vsr, vsr)
+DEF_HELPER_4(xvtstdcdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrspi, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrspic, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrspim, void, env, i32, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index b8f24b7462..03b342a2fb 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1164,8 +1164,8 @@ GEN_VSX_HELPER_X3(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X1(xststdcsp, 0x14, 0x12, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xststdcdp, 0x14, 0x16, 0, PPC2_ISA300)
-GEN_VSX_HELPER_2(xststdcqp, 0x04, 0x16, 0, PPC2_ISA300)
+GEN_VSX_HELPER_X2(xststdcdp, 0x14, 0x16, 0, PPC2_ISA300)
+GEN_VSX_HELPER_R2(xststdcqp, 0x04, 0x16, 0, PPC2_ISA300)
 
 GEN_VSX_HELPER_X3(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -1244,8 +1244,8 @@ GEN_VSX_HELPER_X2(xvrspic, 0x16, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrspim, 0x12, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvtstdcsp, 0x14, 0x1A, 0, PPC2_VSX)
-GEN_VSX_HELPER_2(xvtstdcdp, 0x14, 0x1E, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvtstdcsp, 0x14, 0x1A, 0, PPC2_VSX)
+GEN_VSX_HELPER_X2(xvtstdcdp, 0x14, 0x1E, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xxperm, 0x08, 0x03, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xxpermr, 0x08, 0x07, 0, PPC2_ISA300)
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro
  2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
                   ` (12 preceding siblings ...)
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 13/14] target/ppc: improve VSX_TEST_DC with new generator macros Mark Cave-Ayland
@ 2019-04-28 14:38 ` Mark Cave-Ayland
  2019-04-30 17:00   ` Richard Henderson
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-04-28 14:38 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, rth, gkurz

Introduce a new GEN_VSX_HELPER_VSX_MADD macro for the generator function which
enables the source and destination registers to be decoded at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/fpu_helper.c             |  12 +----
 target/ppc/helper.h                 |  64 +++++++++++------------
 target/ppc/translate/vsx-impl.inc.c | 101 ++++++++++++++++++++++++------------
 3 files changed, 101 insertions(+), 76 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 357be25867..6b5293495f 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2270,19 +2270,11 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23)
  */
 #define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
 void helper_##op(CPUPPCState *env, uint32_t opcode,                           \
-                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)                 \
+                 ppc_vsr_t *xt, ppc_vsr_t *xa,                                \
+                 ppc_vsr_t *b, ppc_vsr_t *c)                                  \
 {                                                                             \
-    ppc_vsr_t *b, *c;                                                         \
     int i;                                                                    \
                                                                               \
-    if (afrm) { /* AxB + T */                                                 \
-        b = xb;                                                               \
-        c = xt;                                                               \
-    } else { /* AxT + B */                                                    \
-        b = xt;                                                               \
-        c = xb;                                                               \
-    }                                                                         \
-                                                                              \
     helper_reset_fpstatus(env);                                               \
                                                                               \
     for (i = 0; i < nels; i++) {                                              \
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index cd97fae438..5db6dc0797 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -389,14 +389,14 @@ DEF_HELPER_4(xssqrtdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrsqrtedp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xstdivdp, void, env, i32, vsr, vsr)
 DEF_HELPER_3(xstsqrtdp, void, env, i32, vsr)
-DEF_HELPER_5(xsmaddadp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsmaddmdp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubadp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubmdp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmaddadp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmaddmdp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubadp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_6(xsmaddadp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsmaddmdp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsmsubadp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsmsubmdp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsnmaddadp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsnmaddmdp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsnmsubadp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsnmsubmdp, void, env, i32, vsr, vsr, vsr, vsr)
 DEF_HELPER_5(xscmpeqdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xscmpgtdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xscmpgedp, void, env, i32, vsr, vsr, vsr)
@@ -456,14 +456,14 @@ DEF_HELPER_4(xsresp, void, env, i32, vsr, vsr)
 DEF_HELPER_2(xsrsp, i64, env, i64)
 DEF_HELPER_4(xssqrtsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsrsqrtesp, void, env, i32, vsr, vsr)
-DEF_HELPER_5(xsmaddasp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsmaddmsp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubasp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubmsp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmaddasp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmaddmsp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubasp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_6(xsmaddasp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsmaddmsp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsmsubasp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsmsubmsp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsnmaddasp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsnmaddmsp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsnmsubasp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xsnmsubmsp, void, env, i32, vsr, vsr, vsr, vsr)
 
 DEF_HELPER_5(xvadddp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvsubdp, void, env, i32, vsr, vsr, vsr)
@@ -474,14 +474,14 @@ DEF_HELPER_4(xvsqrtdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrsqrtedp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvtdivdp, void, env, i32, vsr, vsr)
 DEF_HELPER_3(xvtsqrtdp, void, env, i32, vsr)
-DEF_HELPER_5(xvmaddadp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvmaddmdp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvmsubadp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvmsubmdp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvnmaddadp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvnmaddmdp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvnmsubadp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvnmsubmdp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_6(xvmaddadp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvmaddmdp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvmsubadp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvmsubmdp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvnmaddadp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvnmaddmdp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvnmsubadp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvnmsubmdp, void, env, i32, vsr, vsr, vsr, vsr)
 DEF_HELPER_5(xvmaxdp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvmindp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvcmpeqdp, void, env, i32, vsr, vsr, vsr)
@@ -512,14 +512,14 @@ DEF_HELPER_4(xvsqrtsp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvrsqrtesp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xvtdivsp, void, env, i32, vsr, vsr)
 DEF_HELPER_3(xvtsqrtsp, void, env, i32, vsr)
-DEF_HELPER_5(xvmaddasp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvmaddmsp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvmsubasp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvmsubmsp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvnmaddasp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvnmaddmsp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvnmsubasp, void, env, i32, vsr, vsr, vsr)
-DEF_HELPER_5(xvnmsubmsp, void, env, i32, vsr, vsr, vsr)
+DEF_HELPER_6(xvmaddasp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvmaddmsp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvmsubasp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvmsubmsp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvnmaddasp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvnmaddmsp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvnmsubasp, void, env, i32, vsr, vsr, vsr, vsr)
+DEF_HELPER_6(xvnmsubmsp, void, env, i32, vsr, vsr, vsr, vsr)
 DEF_HELPER_5(xvmaxsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvminsp, void, env, i32, vsr, vsr, vsr)
 DEF_HELPER_5(xvcmpeqsp, void, env, i32, vsr, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 03b342a2fb..18b5f5a917 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1090,14 +1090,6 @@ GEN_VSX_HELPER_X2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2_AB(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_X1(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xsmaddadp, 0x04, 0x04, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xsmaddmdp, 0x04, 0x05, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xsmsubadp, 0x04, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xsmsubmdp, 0x04, 0x07, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xsnmaddadp, 0x04, 0x14, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xsnmaddmdp, 0x04, 0x15, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
@@ -1140,12 +1132,10 @@ GEN_VSX_HELPER_X2(xsrdpim, 0x12, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrdpip, 0x12, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_XT_XB_ENV(xsrsp, 0x12, 0x11, 0, PPC2_VSX207)
-
 GEN_VSX_HELPER_R2(xsrqpi, 0x05, 0x00, 0, PPC2_ISA300)
 GEN_VSX_HELPER_R2(xsrqpxp, 0x05, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_R2(xssqrtqp, 0x04, 0x19, 0x1B, PPC2_ISA300)
 GEN_VSX_HELPER_R3(xssubqp, 0x04, 0x10, 0, PPC2_ISA300)
-
 GEN_VSX_HELPER_X3(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X3(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
@@ -1153,14 +1143,6 @@ GEN_VSX_HELPER_X3(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
-GEN_VSX_HELPER_X3(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
-GEN_VSX_HELPER_X3(xsmaddmsp, 0x04, 0x01, 0, PPC2_VSX207)
-GEN_VSX_HELPER_X3(xsmsubasp, 0x04, 0x02, 0, PPC2_VSX207)
-GEN_VSX_HELPER_X3(xsmsubmsp, 0x04, 0x03, 0, PPC2_VSX207)
-GEN_VSX_HELPER_X3(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
-GEN_VSX_HELPER_X3(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
-GEN_VSX_HELPER_X3(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
-GEN_VSX_HELPER_X3(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_X1(xststdcsp, 0x14, 0x12, 0, PPC2_ISA300)
@@ -1176,14 +1158,6 @@ GEN_VSX_HELPER_X2(xvsqrtdp, 0x16, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrsqrtedp, 0x14, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2_AB(xvtdivdp, 0x14, 0x0F, 0, PPC2_VSX)
 GEN_VSX_HELPER_X1(xvtsqrtdp, 0x14, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvmaddadp, 0x04, 0x0C, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvmaddmdp, 0x04, 0x0D, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvmsubadp, 0x04, 0x0E, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvmsubmdp, 0x04, 0x0F, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvnmaddadp, 0x04, 0x1C, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvnmaddmdp, 0x04, 0x1D, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvnmsubadp, 0x04, 0x1E, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvnmsubmdp, 0x04, 0x1F, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaxdp, 0x00, 0x1C, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmindp, 0x00, 0x1D, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvcmpeqdp, 0x0C, 0x0C, 0, PPC2_VSX)
@@ -1214,14 +1188,6 @@ GEN_VSX_HELPER_X2(xvsqrtsp, 0x16, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrsqrtesp, 0x14, 0x08, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2_AB(xvtdivsp, 0x14, 0x0B, 0, PPC2_VSX)
 GEN_VSX_HELPER_X1(xvtsqrtsp, 0x14, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvmaddasp, 0x04, 0x08, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvmaddmsp, 0x04, 0x09, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvmsubasp, 0x04, 0x0A, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvmsubmsp, 0x04, 0x0B, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvnmaddasp, 0x04, 0x18, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvnmaddmsp, 0x04, 0x19, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvnmsubasp, 0x04, 0x1A, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xvnmsubmsp, 0x04, 0x1B, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvmaxsp, 0x00, 0x18, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvminsp, 0x00, 0x19, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xvcmpeqsp, 0x0C, 0x08, 0, PPC2_VSX)
@@ -1249,6 +1215,73 @@ GEN_VSX_HELPER_X2(xvtstdcdp, 0x14, 0x1E, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xxperm, 0x08, 0x03, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xxpermr, 0x08, 0x07, 0, PPC2_ISA300)
 
+#define GEN_VSX_HELPER_VSX_MADD(name, op1, op2, inval, type)                  \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 opc;                                                             \
+    TCGv_ptr xt, xa, b, c;                                                    \
+    if (unlikely(!ctx->vsx_enabled)) {                                        \
+        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
+        return;                                                               \
+    }                                                                         \
+    opc = tcg_const_i32(ctx->opcode);                                         \
+    xt = gen_vsr_ptr(xT(ctx->opcode));                                        \
+    xa = gen_vsr_ptr(xA(ctx->opcode));                                        \
+    if (ctx->opcode & PPC_BIT(25)) {                                          \
+        /*                                                                    \
+         * AxT + B                                                            \
+         */                                                                   \
+        b = gen_vsr_ptr(xT(ctx->opcode));                                     \
+        c = gen_vsr_ptr(xB(ctx->opcode));                                     \
+    } else {                                                                  \
+        /*                                                                    \
+         * AxB + T                                                            \
+         */                                                                   \
+        b = gen_vsr_ptr(xB(ctx->opcode));                                     \
+        c = gen_vsr_ptr(xT(ctx->opcode));                                     \
+    }                                                                         \
+    gen_helper_##name(cpu_env, opc, xt, xa, b, c);                            \
+    tcg_temp_free_i32(opc);                                                   \
+    tcg_temp_free_ptr(xt);                                                    \
+    tcg_temp_free_ptr(xa);                                                    \
+    tcg_temp_free_ptr(b);                                                     \
+    tcg_temp_free_ptr(c);                                                     \
+}
+
+GEN_VSX_HELPER_VSX_MADD(xsmaddmdp, 0x04, 0x05, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xsmsubmdp, 0x04, 0x07, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xsnmaddmdp, 0x04, 0x15, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xsnmsubmdp, 0x04, 0x17, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xsmaddmsp, 0x04, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_VSX_MADD(xsmsubmsp, 0x04, 0x03, 0, PPC2_VSX207)
+GEN_VSX_HELPER_VSX_MADD(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
+GEN_VSX_HELPER_VSX_MADD(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
+GEN_VSX_HELPER_VSX_MADD(xvmaddmdp, 0x04, 0x0D, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvmsubmdp, 0x04, 0x0F, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvnmaddmdp, 0x04, 0x1D, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvnmsubmdp, 0x04, 0x1F, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvmaddmsp, 0x04, 0x09, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvmsubmsp, 0x04, 0x0B, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvnmaddmsp, 0x04, 0x19, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvnmsubmsp, 0x04, 0x1B, 0, PPC2_VSX)
+
+GEN_VSX_HELPER_VSX_MADD(xsmaddadp, 0x04, 0x04, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xsmsubadp, 0x04, 0x06, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xsnmaddadp, 0x04, 0x14, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xsnmsubadp, 0x04, 0x16, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_VSX_MADD(xsmsubasp, 0x04, 0x02, 0, PPC2_VSX207)
+GEN_VSX_HELPER_VSX_MADD(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
+GEN_VSX_HELPER_VSX_MADD(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
+GEN_VSX_HELPER_VSX_MADD(xvmaddadp, 0x04, 0x0C, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvmsubadp, 0x04, 0x0E, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvnmaddadp, 0x04, 0x1C, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvnmsubadp, 0x04, 0x1E, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvmaddasp, 0x04, 0x08, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvmsubasp, 0x04, 0x0A, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvnmaddasp, 0x04, 0x18, 0, PPC2_VSX)
+GEN_VSX_HELPER_VSX_MADD(xvnmsubasp, 0x04, 0x1A, 0, PPC2_VSX)
+
 static void gen_xxbrd(DisasContext *ctx)
 {
     TCGv_i64 xth;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c Mark Cave-Ayland
@ 2019-04-30 16:25   ` Richard Henderson
  2019-05-05  9:27     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:25 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>  void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
>  {
> -    ppc_vsr_t xt, xa, xb;
> +    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
> +    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
> +    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
>      float_status tstat;
>  
> -    getVSR(rA(opcode) + 32, &xa, env);
> -    getVSR(rB(opcode) + 32, &xb, env);
> -    getVSR(rD(opcode) + 32, &xt, env);
>      helper_reset_fpstatus(env);
>  
>      tstat = env->fp_status;
> @@ -1860,18 +1857,17 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
>      }
>  
>      set_float_exception_flags(0, &tstat);
> -    xt.f128 = float128_add(xa.f128, xb.f128, &tstat);
> +    xt->f128 = float128_add(xa->f128, xb->f128, &tstat);
>      env->fp_status.float_exception_flags |= tstat.float_exception_flags;
>  
>      if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
>          float_invalid_op_addsub(env, 1, GETPC(),
> -                                float128_classify(xa.f128) |
> -                                float128_classify(xb.f128));
> +                                float128_classify(xa->f128) |
> +                                float128_classify(xb->f128));

These values are no longer valid, because you may have written over them with
the store to xt->f128.  You need to keep the result in a local variable until
the location of the putVSR in order to keep the current semantics.

(Although the current semantics probably need to be reviewed with respect to
how the exception is signaled vs the result is stored to the register file.  I
know there are current bugs in this area with respect to regular floating-point
operations, never mind the vector floating-point ones.)

>  #define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf, r2sp)                    \
>  void helper_##name(CPUPPCState *env, uint32_t opcode)                        \
>  {                                                                            \
> -    ppc_vsr_t xt, xa, xb;                                                    \
> +    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
> +    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                   \
> +    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
>      int i;                                                                   \
>                                                                               \
> -    getVSR(xA(opcode), &xa, env);                                            \
> -    getVSR(xB(opcode), &xb, env);                                            \
> -    getVSR(xT(opcode), &xt, env);                                            \
>      helper_reset_fpstatus(env);                                              \
>                                                                               \
>      for (i = 0; i < nels; i++) {                                             \
>          float_status tstat = env->fp_status;                                 \
>          set_float_exception_flags(0, &tstat);                                \
> -        xt.fld = tp##_##op(xa.fld, xb.fld, &tstat);                          \
> +        xt->fld = tp##_##op(xa->fld, xb->fld, &tstat);                       \
>          env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
>                                                                               \
>          if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {    \
>              float_invalid_op_addsub(env, sfprf, GETPC(),                     \
> -                                    tp##_classify(xa.fld) |                  \
> -                                    tp##_classify(xb.fld));                  \
> +                                    tp##_classify(xa->fld) |                 \
> +                                    tp##_classify(xb->fld));                 \
>          }                                                                    \

Similarly.  Only here it's more interesting in that element 0 is modified when
element 3 raises an exception.  To keep current semantics you need to keep xt
as a ppc_vsr_t local variable and write back at the end.

It looks like the same is true for every other function.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c Mark Cave-Ayland
@ 2019-04-30 16:29   ` Richard Henderson
  2019-05-05  9:34     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:29 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>  #define VSX_LXVL(name, lj)                                              \
>  void helper_##name(CPUPPCState *env, target_ulong addr,                 \
> -                   target_ulong xt_num, target_ulong rb)                \
> +                   target_ulong xt, target_ulong rb)                    \
>  {                                                                       \
> +    ppc_vsr_t *r = &env->vsr[xt];                                       \
> +    int nb = GET_NB(env->gpr[rb]);                                      \
>      int i;                                                              \
> -    ppc_vsr_t xt;                                                       \
> -    uint64_t nb = GET_NB(rb);                                           \
>                                                                          \
> -    xt.s128 = int128_zero();                                            \
> +    r->s128 = int128_zero();                                            \
>      if (nb) {                                                           \
>          nb = (nb >= 16) ? 16 : nb;                                      \
>          if (msr_le && !lj) {                                            \
>              for (i = 16; i > 16 - nb; i--) {                            \
> -                xt.VsrB(i - 1) = cpu_ldub_data_ra(env, addr, GETPC());  \
> +                r->VsrB(i - 1) = cpu_ldub_data_ra(env, addr, GETPC());  \
>                  addr = addr_add(env, addr, 1);                          \
>              }                                                           \
>          } else {                                                        \
>              for (i = 0; i < nb; i++) {                                  \
> -                xt.VsrB(i) = cpu_ldub_data_ra(env, addr, GETPC());      \
> +                r->VsrB(i) = cpu_ldub_data_ra(env, addr, GETPC());      \
>                  addr = addr_add(env, addr, 1);                          \
>              }                                                           \
>          }                                                               \
>      }                                                                   \
> -    putVSR(xt_num, &xt, env);                                           \
>  }

Similarly, this modifies env->vsr[xt] before all exceptions are recognized.

> @@ -304,12 +304,14 @@ static void gen_##name(DisasContext *ctx)                       \
>          }                                                       \
>      }                                                           \
>      EA = tcg_temp_new();                                        \
> -    xt = tcg_const_tl(xT(ctx->opcode));                         \
>      gen_set_access_type(ctx, ACCESS_INT);                       \
>      gen_addr_register(ctx, EA);                                 \
> -    gen_helper_##name(cpu_env, EA, xt, cpu_gpr[rB(ctx->opcode)]); \
> +    xt = tcg_const_tl(xT(ctx->opcode));                         \
> +    rb = tcg_const_tl(rB(ctx->opcode));                         \
> +    gen_helper_##name(cpu_env, EA, xt, rb);                     \
>      tcg_temp_free(EA);                                          \
>      tcg_temp_free(xt);                                          \
> +    tcg_temp_free(rb);                                          \
>  }

Why are you adjusting the function to pass the rB register number rather than
the contents of rB?  That seems the wrong way around...


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 03/14] target/ppc: remove getVSR()/putVSR() from int_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 03/14] target/ppc: remove getVSR()/putVSR() from int_helper.c Mark Cave-Ayland
@ 2019-04-30 16:32   ` Richard Henderson
  2019-05-05  9:36     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:32 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>  void helper_xxextractuw(CPUPPCState *env, target_ulong xtn,
>                          target_ulong xbn, uint32_t index)
>  {
> -    ppc_vsr_t xt, xb;
> +    ppc_vsr_t *xt = &env->vsr[xtn];
> +    ppc_vsr_t *xb = &env->vsr[xbn];
>      size_t es = sizeof(uint32_t);
>      uint32_t ext_index;
>      int i;
>  
> -    getVSR(xbn, &xb, env);
> -    memset(&xt, 0, sizeof(xt));
> +    memset(xt, 0, sizeof(ppc_vsr_t));

This fails if xt == xb.

Similarly for xxinsertw.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 04/14] target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 04/14] target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c Mark Cave-Ayland
@ 2019-04-30 16:36   ` Richard Henderson
  2019-05-05  9:52     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:36 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> +#define GEN_VSX_HELPER_X3(name, op1, op2, inval, type)                        \
> +static void gen_##name(DisasContext *ctx)                                     \
> +{                                                                             \
> +    TCGv_i32 opc;                                                             \
> +    TCGv_ptr xt, xa, xb;                                                      \
> +    if (unlikely(!ctx->vsx_enabled)) {                                        \
> +        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
> +        return;                                                               \
> +    }                                                                         \
> +    opc = tcg_const_i32(ctx->opcode);                                         \
> +    xt = gen_vsr_ptr(xT(ctx->opcode));                                        \
> +    xa = gen_vsr_ptr(xA(ctx->opcode));                                        \
> +    xb = gen_vsr_ptr(xB(ctx->opcode));                                        \

Do you still need to pass opc?

Anyway, I guess this is still progress...
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 05/14] target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 05/14] target/ppc: introduce GEN_VSX_HELPER_X2 " Mark Cave-Ayland
@ 2019-04-30 16:38   ` Richard Henderson
  2019-05-05  9:57     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:38 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> +#define GEN_VSX_HELPER_X2(name, op1, op2, inval, type)                        \
> +static void gen_##name(DisasContext *ctx)                                     \
> +{                                                                             \
> +    TCGv_i32 opc;                                                             \
> +    TCGv_ptr xt, xb;                                                          \
> +    if (unlikely(!ctx->vsx_enabled)) {                                        \
> +        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
> +        return;                                                               \
> +    }                                                                         \
> +    opc = tcg_const_i32(ctx->opcode);                                         \
> +    xt = gen_vsr_ptr(xT(ctx->opcode));                                        \
> +    xb = gen_vsr_ptr(xB(ctx->opcode));                                        \
> +    gen_helper_##name(cpu_env, opc, xt, xb);                                  \

Similarly wrt opc.  However,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 06/14] target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 06/14] target/ppc: introduce GEN_VSX_HELPER_X2_AB " Mark Cave-Ayland
@ 2019-04-30 16:41   ` Richard Henderson
  2019-05-05 10:03     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:41 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> Rather than perform the VSR register decoding within the helper itself,
> introduce a new GEN_VSX_HELPER_X2_AB macro which performs the decode based
> upon xA and xB at translation time.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/fpu_helper.c             | 15 ++++++---------
>  target/ppc/helper.h                 | 12 ++++++------
>  target/ppc/translate/vsx-impl.inc.c | 30 ++++++++++++++++++++++++------
>  3 files changed, 36 insertions(+), 21 deletions(-)

This time I see VSX_TDIV right away as a place that still
uses opcode.  It can certainly be cleaned up, but that's
a different patch.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 07/14] target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 07/14] target/ppc: introduce GEN_VSX_HELPER_X1 " Mark Cave-Ayland
@ 2019-04-30 16:43   ` Richard Henderson
  2019-05-05 10:06     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:43 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> Rather than perform the VSR register decoding within the helper itself,
> introduce a new GEN_VSX_HELPER_X1 macro which performs the decode based
> upon xB at translation time.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/fpu_helper.c             |  6 ++----
>  target/ppc/helper.h                 |  8 ++++----
>  target/ppc/translate/vsx-impl.inc.c | 24 ++++++++++++++++++++----
>  3 files changed, 26 insertions(+), 12 deletions(-)

Similarly wrt VSX_TSQRT.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 08/14] target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 08/14] target/ppc: introduce GEN_VSX_HELPER_R3 " Mark Cave-Ayland
@ 2019-04-30 16:47   ` Richard Henderson
  2019-05-05 10:07     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:47 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> +#define GEN_VSX_HELPER_R3(name, op1, op2, inval, type)                        \
> +static void gen_##name(DisasContext *ctx)                                     \
> +{                                                                             \
> +    TCGv_i32 opc;                                                             \
> +    TCGv_ptr xt, xa, xb;                                                      \
> +    if (unlikely(!ctx->vsx_enabled)) {                                        \
> +        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
> +        return;                                                               \
> +    }                                                                         \
> +    opc = tcg_const_i32(ctx->opcode);                                         \
> +    xt = gen_vsr_ptr(rD(ctx->opcode) + 32);                                   \
> +    xa = gen_vsr_ptr(rA(ctx->opcode) + 32);                                   \
> +    xb = gen_vsr_ptr(rB(ctx->opcode) + 32);                                   \
> +    gen_helper_##name(cpu_env, opc, xt, xa, xb);                              \

Is opc still used here?  Otherwise,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 09/14] target/ppc: introduce GEN_VSX_HELPER_R2 macro to fpu_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 09/14] target/ppc: introduce GEN_VSX_HELPER_R2 " Mark Cave-Ayland
@ 2019-04-30 16:51   ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:51 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> -void helper_xsrqpi(CPUPPCState *env, uint32_t opcode)
> +void helper_xsrqpi(CPUPPCState *env, uint32_t opcode,
> +                   ppc_vsr_t *xt, ppc_vsr_t *xb)
>  {
> -    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
> -    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
>      uint8_t r = Rrm(opcode);
>      uint8_t ex = Rc(opcode);
>      uint8_t rmc = RMC(opcode);

For future cleanup, it looks like the rounding mode should be passed in directly.

But anyway,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 10/14] target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.c
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 10/14] target/ppc: introduce GEN_VSX_HELPER_R2_AB " Mark Cave-Ayland
@ 2019-04-30 16:52   ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:52 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> Rather than perform the VSR register decoding within the helper itself,
> introduce a new GEN_VSX_HELPER_R2_AB macro which performs the decode based
> upon rA and rB at translation time.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/fpu_helper.c             | 10 ++++------
>  target/ppc/helper.h                 |  6 +++---
>  target/ppc/translate/vsx-impl.inc.c | 24 +++++++++++++++++++++---
>  3 files changed, 28 insertions(+), 12 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 11/14] target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 11/14] target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time Mark Cave-Ayland
@ 2019-04-30 16:53   ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:53 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/helper.h                 | 8 ++++----
>  target/ppc/mem_helper.c             | 6 ++----
>  target/ppc/translate/vsx-impl.inc.c | 7 ++++---
>  3 files changed, 10 insertions(+), 11 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 12/14] target/ppc: decode target register in VSX_EXTRACT_INSERT at translation time
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 12/14] target/ppc: decode target register in VSX_EXTRACT_INSERT " Mark Cave-Ayland
@ 2019-04-30 16:53   ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:53 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/helper.h                 |  4 ++--
>  target/ppc/int_helper.c             | 12 ++++--------
>  target/ppc/translate/vsx-impl.inc.c | 10 +++++-----
>  3 files changed, 11 insertions(+), 15 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 13/14] target/ppc: improve VSX_TEST_DC with new generator macros
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 13/14] target/ppc: improve VSX_TEST_DC with new generator macros Mark Cave-Ayland
@ 2019-04-30 16:55   ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 16:55 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
> The source and destination registers can now be decoded in the generator
> function using the new GEN_VSX_HELPER_X2 and GEN_VSX_HELPER_R2 macros.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/fpu_helper.c             | 16 +++++++---------
>  target/ppc/helper.h                 |  8 ++++----
>  target/ppc/translate/vsx-impl.inc.c |  8 ++++----
>  3 files changed, 15 insertions(+), 17 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro
  2019-04-28 14:38 ` [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro Mark Cave-Ayland
@ 2019-04-30 17:00   ` Richard Henderson
  2019-05-05 10:20     ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-04-30 17:00 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>  #define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
>  void helper_##op(CPUPPCState *env, uint32_t opcode,                           \
> -                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)                 \
> +                 ppc_vsr_t *xt, ppc_vsr_t *xa,                                \
> +                 ppc_vsr_t *b, ppc_vsr_t *c)                                  \
>  {                                                                             \
> -    ppc_vsr_t *b, *c;                                                         \
>      int i;                                                                    \
>                                                                                \
> -    if (afrm) { /* AxB + T */                                                 \
> -        b = xb;                                                               \
> -        c = xt;                                                               \
> -    } else { /* AxT + B */                                                    \
> -        b = xt;                                                               \
> -        c = xb;                                                               \
> -    }                                                                         \

The afrm argument is no longer used.
This also means that e.g.

VSX_MADD(xsmaddadp, 1, float64, VsrD(0), MADD_FLGS, 1, 1, 0)
VSX_MADD(xsmaddmdp, 1, float64, VsrD(0), MADD_FLGS, 0, 1, 0)

are redundant.  Similarly with all of the other pairs.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c
  2019-04-30 16:25   ` Richard Henderson
@ 2019-05-05  9:27     ` Mark Cave-Ayland
  2019-05-05 14:31       ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05  9:27 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 17:25, Richard Henderson wrote:

> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>>  void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
>>  {
>> -    ppc_vsr_t xt, xa, xb;
>> +    ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
>> +    ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
>> +    ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
>>      float_status tstat;
>>  
>> -    getVSR(rA(opcode) + 32, &xa, env);
>> -    getVSR(rB(opcode) + 32, &xb, env);
>> -    getVSR(rD(opcode) + 32, &xt, env);
>>      helper_reset_fpstatus(env);
>>  
>>      tstat = env->fp_status;
>> @@ -1860,18 +1857,17 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
>>      }
>>  
>>      set_float_exception_flags(0, &tstat);
>> -    xt.f128 = float128_add(xa.f128, xb.f128, &tstat);
>> +    xt->f128 = float128_add(xa->f128, xb->f128, &tstat);
>>      env->fp_status.float_exception_flags |= tstat.float_exception_flags;
>>  
>>      if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
>>          float_invalid_op_addsub(env, 1, GETPC(),
>> -                                float128_classify(xa.f128) |
>> -                                float128_classify(xb.f128));
>> +                                float128_classify(xa->f128) |
>> +                                float128_classify(xb->f128));
> 
> These values are no longer valid, because you may have written over them with
> the store to xt->f128.  You need to keep the result in a local variable until
> the location of the putVSR in order to keep the current semantics.
> 
> (Although the current semantics probably need to be reviewed with respect to
> how the exception is signaled vs the result is stored to the register file.  I
> know there are current bugs in this area with respect to regular floating-point
> operations, never mind the vector floating-point ones.)
> 
>>  #define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf, r2sp)                    \
>>  void helper_##name(CPUPPCState *env, uint32_t opcode)                        \
>>  {                                                                            \
>> -    ppc_vsr_t xt, xa, xb;                                                    \
>> +    ppc_vsr_t *xt = &env->vsr[xT(opcode)];                                   \
>> +    ppc_vsr_t *xa = &env->vsr[xA(opcode)];                                   \
>> +    ppc_vsr_t *xb = &env->vsr[xB(opcode)];                                   \
>>      int i;                                                                   \
>>                                                                               \
>> -    getVSR(xA(opcode), &xa, env);                                            \
>> -    getVSR(xB(opcode), &xb, env);                                            \
>> -    getVSR(xT(opcode), &xt, env);                                            \
>>      helper_reset_fpstatus(env);                                              \
>>                                                                               \
>>      for (i = 0; i < nels; i++) {                                             \
>>          float_status tstat = env->fp_status;                                 \
>>          set_float_exception_flags(0, &tstat);                                \
>> -        xt.fld = tp##_##op(xa.fld, xb.fld, &tstat);                          \
>> +        xt->fld = tp##_##op(xa->fld, xb->fld, &tstat);                       \
>>          env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
>>                                                                               \
>>          if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {    \
>>              float_invalid_op_addsub(env, sfprf, GETPC(),                     \
>> -                                    tp##_classify(xa.fld) |                  \
>> -                                    tp##_classify(xb.fld));                  \
>> +                                    tp##_classify(xa->fld) |                 \
>> +                                    tp##_classify(xb->fld));                 \
>>          }                                                                    \
> 
> Similarly.  Only here it's more interesting in that element 0 is modified when
> element 3 raises an exception.  To keep current semantics you need to keep xt
> as a ppc_vsr_t local variable and write back at the end.
> 
> It looks like the same is true for every other function.

Meh, so I forgot about the case where src == dest and obviously it's not something
that gets tickled by my test images :(

I've spent a bit of time today going through the functions and it seems that all
functions which have an xt parameter, minus a couple of the TEST macros, require the
result to be calculated in a local variable first.

I think the best solution is still to remove getVSR()/putVSR() but replace them with
macros for copying and zeroing like this:

    #define VSRCPY(d, s) (memcpy(d, s, sizeof(ppc_vsr_t)))
    #define VSRZERO(d)   (memset(d, 0, sizeof(ppc_vsr_t)))

Even though we're still doing a copy of the result register I hope that not having to
copy the 2 source registers, plus replacing the copies with a straight memcpy() will
still be an advantage. Does that seem sensible to you?

FWIW I agree that the exception handling seems inconsistent around the functions: for
example it looks strange that the VSX_FMADD code blindly copies the result back even
when an exception occurred.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c
  2019-04-30 16:29   ` Richard Henderson
@ 2019-05-05  9:34     ` Mark Cave-Ayland
  2019-05-05 14:34       ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05  9:34 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 17:29, Richard Henderson wrote:

> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>>  #define VSX_LXVL(name, lj)                                              \
>>  void helper_##name(CPUPPCState *env, target_ulong addr,                 \
>> -                   target_ulong xt_num, target_ulong rb)                \
>> +                   target_ulong xt, target_ulong rb)                    \
>>  {                                                                       \
>> +    ppc_vsr_t *r = &env->vsr[xt];                                       \
>> +    int nb = GET_NB(env->gpr[rb]);                                      \
>>      int i;                                                              \
>> -    ppc_vsr_t xt;                                                       \
>> -    uint64_t nb = GET_NB(rb);                                           \
>>                                                                          \
>> -    xt.s128 = int128_zero();                                            \
>> +    r->s128 = int128_zero();                                            \
>>      if (nb) {                                                           \
>>          nb = (nb >= 16) ? 16 : nb;                                      \
>>          if (msr_le && !lj) {                                            \
>>              for (i = 16; i > 16 - nb; i--) {                            \
>> -                xt.VsrB(i - 1) = cpu_ldub_data_ra(env, addr, GETPC());  \
>> +                r->VsrB(i - 1) = cpu_ldub_data_ra(env, addr, GETPC());  \
>>                  addr = addr_add(env, addr, 1);                          \
>>              }                                                           \
>>          } else {                                                        \
>>              for (i = 0; i < nb; i++) {                                  \
>> -                xt.VsrB(i) = cpu_ldub_data_ra(env, addr, GETPC());      \
>> +                r->VsrB(i) = cpu_ldub_data_ra(env, addr, GETPC());      \
>>                  addr = addr_add(env, addr, 1);                          \
>>              }                                                           \
>>          }                                                               \
>>      }                                                                   \
>> -    putVSR(xt_num, &xt, env);                                           \
>>  }
> 
> Similarly, this modifies env->vsr[xt] before all exceptions are recognized.

Okay - if you're happy with my previous suggestion with the VSR* macros and the local
variable then I can make the same change here too.

>> @@ -304,12 +304,14 @@ static void gen_##name(DisasContext *ctx)                       \
>>          }                                                       \
>>      }                                                           \
>>      EA = tcg_temp_new();                                        \
>> -    xt = tcg_const_tl(xT(ctx->opcode));                         \
>>      gen_set_access_type(ctx, ACCESS_INT);                       \
>>      gen_addr_register(ctx, EA);                                 \
>> -    gen_helper_##name(cpu_env, EA, xt, cpu_gpr[rB(ctx->opcode)]); \
>> +    xt = tcg_const_tl(xT(ctx->opcode));                         \
>> +    rb = tcg_const_tl(rB(ctx->opcode));                         \
>> +    gen_helper_##name(cpu_env, EA, xt, rb);                     \
>>      tcg_temp_free(EA);                                          \
>>      tcg_temp_free(xt);                                          \
>> +    tcg_temp_free(rb);                                          \
>>  }
> 
> Why are you adjusting the function to pass the rB register number rather than
> the contents of rB?  That seems the wrong way around...

I think what I was trying to do here was eliminate the cpu_gpr since it feels to me
that with the vector patchsets and your negative offset patches that this should be
the way to go for accessing CPUState rather than using TCG globals.

Looking at this again I realise the solution is really the same as is currently used
for gen_load_spr() so I can use something like this:

    static inline void gen_load_gpr(TCGv t, int reg)
    {
        tcg_gen_ld_tl(t, cpu_env, offsetof(CPUPPCState, gpr[reg]));
    }

Does this seem reasonable as a solution?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 03/14] target/ppc: remove getVSR()/putVSR() from int_helper.c
  2019-04-30 16:32   ` Richard Henderson
@ 2019-05-05  9:36     ` Mark Cave-Ayland
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05  9:36 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 17:32, Richard Henderson wrote:

> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>>  void helper_xxextractuw(CPUPPCState *env, target_ulong xtn,
>>                          target_ulong xbn, uint32_t index)
>>  {
>> -    ppc_vsr_t xt, xb;
>> +    ppc_vsr_t *xt = &env->vsr[xtn];
>> +    ppc_vsr_t *xb = &env->vsr[xbn];
>>      size_t es = sizeof(uint32_t);
>>      uint32_t ext_index;
>>      int i;
>>  
>> -    getVSR(xbn, &xb, env);
>> -    memset(&xt, 0, sizeof(xt));
>> +    memset(xt, 0, sizeof(ppc_vsr_t));
> 
> This fails if xt == xb.
> 
> Similarly for xxinsertw.

Yeah. Again I can easily fix this up with a local variable.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 04/14] target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c
  2019-04-30 16:36   ` Richard Henderson
@ 2019-05-05  9:52     ` Mark Cave-Ayland
  2019-05-05 14:49       ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05  9:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 17:36, Richard Henderson wrote:

> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>> +#define GEN_VSX_HELPER_X3(name, op1, op2, inval, type)                        \
>> +static void gen_##name(DisasContext *ctx)                                     \
>> +{                                                                             \
>> +    TCGv_i32 opc;                                                             \
>> +    TCGv_ptr xt, xa, xb;                                                      \
>> +    if (unlikely(!ctx->vsx_enabled)) {                                        \
>> +        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
>> +        return;                                                               \
>> +    }                                                                         \
>> +    opc = tcg_const_i32(ctx->opcode);                                         \
>> +    xt = gen_vsr_ptr(xT(ctx->opcode));                                        \
>> +    xa = gen_vsr_ptr(xA(ctx->opcode));                                        \
>> +    xb = gen_vsr_ptr(xB(ctx->opcode));                                        \
> 
> Do you still need to pass opc?
> 
> Anyway, I guess this is still progress...
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Right, it looks like VSX_CMP is the culprit here. Am I right in thinking that it's
best to remove the opc parameter from GEN_VSX_HELPER_X3 above, and then have a
separate gen and helper function for just the VSX_CMP instructions? Presumably this
reduces of the overhead at both translation and execution time for the instructions
that don't require it.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 05/14] target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c
  2019-04-30 16:38   ` Richard Henderson
@ 2019-05-05  9:57     ` Mark Cave-Ayland
  2019-05-05 15:03       ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05  9:57 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 17:38, Richard Henderson wrote:
> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>> +#define GEN_VSX_HELPER_X2(name, op1, op2, inval, type)                        \
>> +static void gen_##name(DisasContext *ctx)                                     \
>> +{                                                                             \
>> +    TCGv_i32 opc;                                                             \
>> +    TCGv_ptr xt, xb;                                                          \
>> +    if (unlikely(!ctx->vsx_enabled)) {                                        \
>> +        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
>> +        return;                                                               \
>> +    }                                                                         \
>> +    opc = tcg_const_i32(ctx->opcode);                                         \
>> +    xt = gen_vsr_ptr(xT(ctx->opcode));                                        \
>> +    xb = gen_vsr_ptr(xB(ctx->opcode));                                        \
>> +    gen_helper_##name(cpu_env, opc, xt, xb);                                  \
> 
> Similarly wrt opc.  However,
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

For reference the culprits here is helper_xscvqpdp(). But again if you agree that it
makes sense to create separate gen/helper functions then I can remove the opcode
later on in the series.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 06/14] target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c
  2019-04-30 16:41   ` Richard Henderson
@ 2019-05-05 10:03     ` Mark Cave-Ayland
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05 10:03 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 17:41, Richard Henderson wrote:

> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>> Rather than perform the VSR register decoding within the helper itself,
>> introduce a new GEN_VSX_HELPER_X2_AB macro which performs the decode based
>> upon xA and xB at translation time.
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>> ---
>>  target/ppc/fpu_helper.c             | 15 ++++++---------
>>  target/ppc/helper.h                 | 12 ++++++------
>>  target/ppc/translate/vsx-impl.inc.c | 30 ++++++++++++++++++++++++------
>>  3 files changed, 36 insertions(+), 21 deletions(-)
> 
> This time I see VSX_TDIV right away as a place that still
> uses opcode.  It can certainly be cleaned up, but that's
> a different patch.

Right, and also helper_xscmpexpdp() and VSX_SCALAR_CMP() which is all of the helpers
touched here. Perhaps one to think about since if this macro isn't used anywhere else
after I've reworked the series to use separate gen/helper functions then probably it
makes sense not to bother with this one.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 07/14] target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c
  2019-04-30 16:43   ` Richard Henderson
@ 2019-05-05 10:06     ` Mark Cave-Ayland
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05 10:06 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 17:43, Richard Henderson wrote:

> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>> Rather than perform the VSR register decoding within the helper itself,
>> introduce a new GEN_VSX_HELPER_X1 macro which performs the decode based
>> upon xB at translation time.
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>> ---
>>  target/ppc/fpu_helper.c             |  6 ++----
>>  target/ppc/helper.h                 |  8 ++++----
>>  target/ppc/translate/vsx-impl.inc.c | 24 ++++++++++++++++++++----
>>  3 files changed, 26 insertions(+), 12 deletions(-)
> 
> Similarly wrt VSX_TSQRT.

Yes, and also the other helper touched here: helper_xststdcsp(). I think the comment
here is basically the same as for GEN_VSX_HELPER_X2_AB.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 08/14] target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c
  2019-04-30 16:47   ` Richard Henderson
@ 2019-05-05 10:07     ` Mark Cave-Ayland
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05 10:07 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 17:47, Richard Henderson wrote:

> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>> +#define GEN_VSX_HELPER_R3(name, op1, op2, inval, type)                        \
>> +static void gen_##name(DisasContext *ctx)                                     \
>> +{                                                                             \
>> +    TCGv_i32 opc;                                                             \
>> +    TCGv_ptr xt, xa, xb;                                                      \
>> +    if (unlikely(!ctx->vsx_enabled)) {                                        \
>> +        gen_exception(ctx, POWERPC_EXCP_VSXU);                                \
>> +        return;                                                               \
>> +    }                                                                         \
>> +    opc = tcg_const_i32(ctx->opcode);                                         \
>> +    xt = gen_vsr_ptr(rD(ctx->opcode) + 32);                                   \
>> +    xa = gen_vsr_ptr(rA(ctx->opcode) + 32);                                   \
>> +    xb = gen_vsr_ptr(rB(ctx->opcode) + 32);                                   \
>> +    gen_helper_##name(cpu_env, opc, xt, xa, xb);                              \
> 
> Is opc still used here?  Otherwise,
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Yeah, looks like it's used for the rounding mode. I'll have a look and see how this
compares with the previous case.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro
  2019-04-30 17:00   ` Richard Henderson
@ 2019-05-05 10:20     ` Mark Cave-Ayland
  2019-05-05 15:17       ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05 10:20 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 30/04/2019 18:00, Richard Henderson wrote:

> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>>  #define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
>>  void helper_##op(CPUPPCState *env, uint32_t opcode,                           \
>> -                 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)                 \
>> +                 ppc_vsr_t *xt, ppc_vsr_t *xa,                                \
>> +                 ppc_vsr_t *b, ppc_vsr_t *c)                                  \
>>  {                                                                             \
>> -    ppc_vsr_t *b, *c;                                                         \
>>      int i;                                                                    \
>>                                                                                \
>> -    if (afrm) { /* AxB + T */                                                 \
>> -        b = xb;                                                               \
>> -        c = xt;                                                               \
>> -    } else { /* AxT + B */                                                    \
>> -        b = xt;                                                               \
>> -        c = xb;                                                               \
>> -    }                                                                         \
> 
> The afrm argument is no longer used.
> This also means that e.g.
> 
> VSX_MADD(xsmaddadp, 1, float64, VsrD(0), MADD_FLGS, 1, 1, 0)
> VSX_MADD(xsmaddmdp, 1, float64, VsrD(0), MADD_FLGS, 0, 1, 0)
> 
> are redundant.  Similarly with all of the other pairs.

Agreed. What do you think is the best solution here - maybe a double macro that looks
something like this?

#define VSX_MADD(op, prec, nels, tp, fld, maddflgs, sfprf, r2sp)
_VSX_MADD(op##aprec, nels, tp, fld, maddflgs, sfprf, r2sp)
_VSX_MADD(op##mprec, nels, tp, fld, maddflgs, sfprf, r2sp)

VSX_MADD(xsmadd, dp, 1, float64, VsrD(0), MADD_FLGS, 1, 0)


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c
  2019-05-05  9:27     ` Mark Cave-Ayland
@ 2019-05-05 14:31       ` Richard Henderson
  2019-05-05 15:46         ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-05-05 14:31 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 5/5/19 2:27 AM, Mark Cave-Ayland wrote:
> I've spent a bit of time today going through the functions and it seems that all
> functions which have an xt parameter, minus a couple of the TEST macros, require the
> result to be calculated in a local variable first.
> 
> I think the best solution is still to remove getVSR()/putVSR() but replace them with
> macros for copying and zeroing like this:
> 
>     #define VSRCPY(d, s) (memcpy(d, s, sizeof(ppc_vsr_t)))
>     #define VSRZERO(d)   (memset(d, 0, sizeof(ppc_vsr_t)))

Local variable, yes.  But I see no reason for macros.

  ppc_vsr_t res = { };
  ...
  *xt = res;


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c
  2019-05-05  9:34     ` Mark Cave-Ayland
@ 2019-05-05 14:34       ` Richard Henderson
  2019-05-05 15:49         ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-05-05 14:34 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 5/5/19 2:34 AM, Mark Cave-Ayland wrote:
>>>      EA = tcg_temp_new();                                        \
>>> -    xt = tcg_const_tl(xT(ctx->opcode));                         \
>>>      gen_set_access_type(ctx, ACCESS_INT);                       \
>>>      gen_addr_register(ctx, EA);                                 \
>>> -    gen_helper_##name(cpu_env, EA, xt, cpu_gpr[rB(ctx->opcode)]); \
>>> +    xt = tcg_const_tl(xT(ctx->opcode));                         \
>>> +    rb = tcg_const_tl(rB(ctx->opcode));                         \
>>> +    gen_helper_##name(cpu_env, EA, xt, rb);                     \
>>>      tcg_temp_free(EA);                                          \
>>>      tcg_temp_free(xt);                                          \
>>> +    tcg_temp_free(rb);                                          \
>>>  }
>>
>> Why are you adjusting the function to pass the rB register number rather than
>> the contents of rB?  That seems the wrong way around...
> 
> I think what I was trying to do here was eliminate the cpu_gpr since it
> feels to me that with the vector patchsets and your negative offset patches
> that this should be the way to go for accessing CPUState rather than using
> TCG globals.

Not for the integer register set.

> Looking at this again I realise the solution is really the same as is
> currently used
> for gen_load_spr() so I can use something like this:>
>     static inline void gen_load_gpr(TCGv t, int reg)
>     {
>         tcg_gen_ld_tl(t, cpu_env, offsetof(CPUPPCState, gpr[reg]));
>     }
> 
> Does this seem reasonable as a solution?

No, this will fail quickly.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 04/14] target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c
  2019-05-05  9:52     ` Mark Cave-Ayland
@ 2019-05-05 14:49       ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2019-05-05 14:49 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 5/5/19 2:52 AM, Mark Cave-Ayland wrote:
> Right, it looks like VSX_CMP is the culprit here. Am I right in thinking that it's
> best to remove the opc parameter from GEN_VSX_HELPER_X3 above, and then have a
> separate gen and helper function for just the VSX_CMP instructions? Presumably this
> reduces of the overhead at both translation and execution time for the instructions
> that don't require it.

Yep.

I think the best fix for VSX_CMP is to return the value that is to be assigned
to cr[6], and let it assign like so:

  gen_helper_foo(cpu_crf[6], other, arguments);

(Or if the opcode bit is unset,

  TCGv_i32 ignored = tcg_temp_new_i32();
  gen_helper_foo(ignored, other arguments);
  tcg_temp_free_i32(ignored);
)

at which point these functions do not modify tcg globals, so the decl
can be improved to

  DEF_HELPER_FLAGS_2(xvcmpeqdp, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr)

To keep the assignment vs exception order, you remove the direct call to
do_float_check_status and use gen_helper_float_check_status.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 05/14] target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c
  2019-05-05  9:57     ` Mark Cave-Ayland
@ 2019-05-05 15:03       ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2019-05-05 15:03 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 5/5/19 2:57 AM, Mark Cave-Ayland wrote:
> For reference the culprits here is helper_xscvqpdp(). But again if you agree that it
> makes sense to create separate gen/helper functions then I can remove the opcode
> later on in the series.

Yes.  Or indeed in a completely separate series.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro
  2019-05-05 10:20     ` Mark Cave-Ayland
@ 2019-05-05 15:17       ` Richard Henderson
  2019-05-05 15:54         ` Mark Cave-Ayland
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-05-05 15:17 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 5/5/19 3:20 AM, Mark Cave-Ayland wrote:
>> The afrm argument is no longer used.
>> This also means that e.g.
>>
>> VSX_MADD(xsmaddadp, 1, float64, VsrD(0), MADD_FLGS, 1, 1, 0)
>> VSX_MADD(xsmaddmdp, 1, float64, VsrD(0), MADD_FLGS, 0, 1, 0)
>>
>> are redundant.  Similarly with all of the other pairs.
> 
> Agreed. What do you think is the best solution here - maybe a double macro that looks
> something like this?
> 
> #define VSX_MADD(op, prec, nels, tp, fld, maddflgs, sfprf, r2sp)
> _VSX_MADD(op##aprec, nels, tp, fld, maddflgs, sfprf, r2sp)
> _VSX_MADD(op##mprec, nels, tp, fld, maddflgs, sfprf, r2sp)
> 
> VSX_MADD(xsmadd, dp, 1, float64, VsrD(0), MADD_FLGS, 1, 0)

I have no idea what you're suggesting.

I am suggesting one function, xsmadddp, replacing xsmaddadp + xsmaddmdp, that
is used by both instructions.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c
  2019-05-05 14:31       ` Richard Henderson
@ 2019-05-05 15:46         ` Mark Cave-Ayland
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05 15:46 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 05/05/2019 15:31, Richard Henderson wrote:

> On 5/5/19 2:27 AM, Mark Cave-Ayland wrote:
>> I've spent a bit of time today going through the functions and it seems that all
>> functions which have an xt parameter, minus a couple of the TEST macros, require the
>> result to be calculated in a local variable first.
>>
>> I think the best solution is still to remove getVSR()/putVSR() but replace them with
>> macros for copying and zeroing like this:
>>
>>     #define VSRCPY(d, s) (memcpy(d, s, sizeof(ppc_vsr_t)))
>>     #define VSRZERO(d)   (memset(d, 0, sizeof(ppc_vsr_t)))
> 
> Local variable, yes.  But I see no reason for macros.
> 
>   ppc_vsr_t res = { };
>   ...
>   *xt = res;

Ah thanks for the hint - I wasn't aware that this could be done in a consistent way
for structs across all compilers. I'll implement this in the next version of the series.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c
  2019-05-05 14:34       ` Richard Henderson
@ 2019-05-05 15:49         ` Mark Cave-Ayland
  2019-05-05 15:58           ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05 15:49 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 05/05/2019 15:34, Richard Henderson wrote:

> On 5/5/19 2:34 AM, Mark Cave-Ayland wrote:
>>>>      EA = tcg_temp_new();                                        \
>>>> -    xt = tcg_const_tl(xT(ctx->opcode));                         \
>>>>      gen_set_access_type(ctx, ACCESS_INT);                       \
>>>>      gen_addr_register(ctx, EA);                                 \
>>>> -    gen_helper_##name(cpu_env, EA, xt, cpu_gpr[rB(ctx->opcode)]); \
>>>> +    xt = tcg_const_tl(xT(ctx->opcode));                         \
>>>> +    rb = tcg_const_tl(rB(ctx->opcode));                         \
>>>> +    gen_helper_##name(cpu_env, EA, xt, rb);                     \
>>>>      tcg_temp_free(EA);                                          \
>>>>      tcg_temp_free(xt);                                          \
>>>> +    tcg_temp_free(rb);                                          \
>>>>  }
>>>
>>> Why are you adjusting the function to pass the rB register number rather than
>>> the contents of rB?  That seems the wrong way around...
>>
>> I think what I was trying to do here was eliminate the cpu_gpr since it
>> feels to me that with the vector patchsets and your negative offset patches
>> that this should be the way to go for accessing CPUState rather than using
>> TCG globals.
> 
> Not for the integer register set.
> 
>> Looking at this again I realise the solution is really the same as is
>> currently used
>> for gen_load_spr() so I can use something like this:>
>>     static inline void gen_load_gpr(TCGv t, int reg)
>>     {
>>         tcg_gen_ld_tl(t, cpu_env, offsetof(CPUPPCState, gpr[reg]));
>>     }
>>
>> Does this seem reasonable as a solution?
> 
> No, this will fail quickly.

Okay in that case I'll leave it as-is. So just to satisfy my curiosity here: is the
problem here the mixing and matching of offsets and TCG globals, rather than the use
of offsets as done for the VMX/VSX registers?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro
  2019-05-05 15:17       ` Richard Henderson
@ 2019-05-05 15:54         ` Mark Cave-Ayland
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Cave-Ayland @ 2019-05-05 15:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc, david, rth, gkurz

On 05/05/2019 16:17, Richard Henderson wrote:

> On 5/5/19 3:20 AM, Mark Cave-Ayland wrote:
>>> The afrm argument is no longer used.
>>> This also means that e.g.
>>>
>>> VSX_MADD(xsmaddadp, 1, float64, VsrD(0), MADD_FLGS, 1, 1, 0)
>>> VSX_MADD(xsmaddmdp, 1, float64, VsrD(0), MADD_FLGS, 0, 1, 0)
>>>
>>> are redundant.  Similarly with all of the other pairs.
>>
>> Agreed. What do you think is the best solution here - maybe a double macro that looks
>> something like this?
>>
>> #define VSX_MADD(op, prec, nels, tp, fld, maddflgs, sfprf, r2sp)
>> _VSX_MADD(op##aprec, nels, tp, fld, maddflgs, sfprf, r2sp)
>> _VSX_MADD(op##mprec, nels, tp, fld, maddflgs, sfprf, r2sp)
>>
>> VSX_MADD(xsmadd, dp, 1, float64, VsrD(0), MADD_FLGS, 1, 0)
> 
> I have no idea what you're suggesting.
> 
> I am suggesting one function, xsmadddp, replacing xsmaddadp + xsmaddmdp, that
> is used by both instructions.

Gotcha. I was thinking that the standard practice was to have a 1:1 correspondence
between the gen function and helper function for instructions, but I'm happy to go
with your suggestion above.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c
  2019-05-05 15:49         ` Mark Cave-Ayland
@ 2019-05-05 15:58           ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2019-05-05 15:58 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, rth, gkurz

On 5/5/19 8:49 AM, Mark Cave-Ayland wrote:
> Okay in that case I'll leave it as-is. So just to satisfy my curiosity here: is the
> problem here the mixing and matching of offsets and TCG globals, rather than the use
> of offsets as done for the VMX/VSX registers?

Correct.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2019-05-05 15:58 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-28 14:38 [Qemu-devel] [PATCH 00/14] target/ppc: remove getVSR()/putVSR() and further tidy-up Mark Cave-Ayland
2019-04-28 14:38 ` [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c Mark Cave-Ayland
2019-04-30 16:25   ` Richard Henderson
2019-05-05  9:27     ` Mark Cave-Ayland
2019-05-05 14:31       ` Richard Henderson
2019-05-05 15:46         ` Mark Cave-Ayland
2019-04-28 14:38 ` [Qemu-devel] [PATCH 02/14] target/ppc: remove getVSR()/putVSR() from mem_helper.c Mark Cave-Ayland
2019-04-30 16:29   ` Richard Henderson
2019-05-05  9:34     ` Mark Cave-Ayland
2019-05-05 14:34       ` Richard Henderson
2019-05-05 15:49         ` Mark Cave-Ayland
2019-05-05 15:58           ` Richard Henderson
2019-04-28 14:38 ` [Qemu-devel] [PATCH 03/14] target/ppc: remove getVSR()/putVSR() from int_helper.c Mark Cave-Ayland
2019-04-30 16:32   ` Richard Henderson
2019-05-05  9:36     ` Mark Cave-Ayland
2019-04-28 14:38 ` [Qemu-devel] [PATCH 04/14] target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c Mark Cave-Ayland
2019-04-30 16:36   ` Richard Henderson
2019-05-05  9:52     ` Mark Cave-Ayland
2019-05-05 14:49       ` Richard Henderson
2019-04-28 14:38 ` [Qemu-devel] [PATCH 05/14] target/ppc: introduce GEN_VSX_HELPER_X2 " Mark Cave-Ayland
2019-04-30 16:38   ` Richard Henderson
2019-05-05  9:57     ` Mark Cave-Ayland
2019-05-05 15:03       ` Richard Henderson
2019-04-28 14:38 ` [Qemu-devel] [PATCH 06/14] target/ppc: introduce GEN_VSX_HELPER_X2_AB " Mark Cave-Ayland
2019-04-30 16:41   ` Richard Henderson
2019-05-05 10:03     ` Mark Cave-Ayland
2019-04-28 14:38 ` [Qemu-devel] [PATCH 07/14] target/ppc: introduce GEN_VSX_HELPER_X1 " Mark Cave-Ayland
2019-04-30 16:43   ` Richard Henderson
2019-05-05 10:06     ` Mark Cave-Ayland
2019-04-28 14:38 ` [Qemu-devel] [PATCH 08/14] target/ppc: introduce GEN_VSX_HELPER_R3 " Mark Cave-Ayland
2019-04-30 16:47   ` Richard Henderson
2019-05-05 10:07     ` Mark Cave-Ayland
2019-04-28 14:38 ` [Qemu-devel] [PATCH 09/14] target/ppc: introduce GEN_VSX_HELPER_R2 " Mark Cave-Ayland
2019-04-30 16:51   ` Richard Henderson
2019-04-28 14:38 ` [Qemu-devel] [PATCH 10/14] target/ppc: introduce GEN_VSX_HELPER_R2_AB " Mark Cave-Ayland
2019-04-30 16:52   ` Richard Henderson
2019-04-28 14:38 ` [Qemu-devel] [PATCH 11/14] target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time Mark Cave-Ayland
2019-04-30 16:53   ` Richard Henderson
2019-04-28 14:38 ` [Qemu-devel] [PATCH 12/14] target/ppc: decode target register in VSX_EXTRACT_INSERT " Mark Cave-Ayland
2019-04-30 16:53   ` Richard Henderson
2019-04-28 14:38 ` [Qemu-devel] [PATCH 13/14] target/ppc: improve VSX_TEST_DC with new generator macros Mark Cave-Ayland
2019-04-30 16:55   ` Richard Henderson
2019-04-28 14:38 ` [Qemu-devel] [PATCH 14/14] target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro Mark Cave-Ayland
2019-04-30 17:00   ` Richard Henderson
2019-05-05 10:20     ` Mark Cave-Ayland
2019-05-05 15:17       ` Richard Henderson
2019-05-05 15:54         ` Mark Cave-Ayland

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.