* [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15
@ 2017-02-23 19:56 Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 01/15] target/ppc: introduce helper_update_ov_legacy Nikunj A Dadhania
                   ` (15 more replies)
  0 siblings, 16 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Patches:
01-06  Clean up the XER split-out variables; the flag bits are now
       stored in XER at their respective positions

07-14  Contain the implementation of the CA32 and OV32 bits added in
       ISA 3.0. Various fixed-point arithmetic instructions are
       updated to set the new flags.
 
15     Finally, the last patch adds the new instruction mcrxrx, which helps
       in reading the carry (CA and CA32) and overflow (OV and OV32) flags
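
As a rough illustration of the four flags named above, the following
standalone C sketch computes CA/CA32 and OV/OV32 for a 64-bit add. It is
not code from this series: the struct and function names are hypothetical,
and it assumes CA32/OV32 are the carry and signed overflow that a 32-bit
add of the low words would produce.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four flags; not a QEMU type. */
struct add_flags {
    bool ca;    /* carry out of the full 64-bit add */
    bool ca32;  /* carry out of the low 32 bits */
    bool ov;    /* signed overflow of the 64-bit add */
    bool ov32;  /* signed overflow of the low 32-bit add */
};

static struct add_flags add_with_flags(uint64_t a, uint64_t b, uint64_t *sum)
{
    struct add_flags f;
    uint64_t r = a + b;
    /* Bit i of (r ^ a ^ b) is the carry into bit i. */
    uint64_t carry_in = r ^ a ^ b;
    /* Sign bit set where the operands agree in sign but the result differs. */
    uint64_t ovbits = (r ^ b) & ~(a ^ b);

    f.ca = r < a;                    /* unsigned wrap-around */
    f.ca32 = (carry_in >> 32) & 1;   /* carry into bit 32 */
    f.ov = ovbits >> 63;
    f.ov32 = (ovbits >> 31) & 1;
    *sum = r;
    return f;
}
```

For example, adding 1 to 0x7FFFFFFF sets only OV32, while adding 1 to
0xFFFFFFFF sets only CA32.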


Booted a POWER8 guest fine; needs more testing as the changes are
intrusive in nature.

Changelog:
v3:
* Get rid of cpu_ca, cpu_ov, cpu_so split out variables
* As most of the patches underwent changes, dropped the
  Reviewed-bys (except for the neg[.] patch)

v2: 
* Add missing condition in narrow mode (add/subf), multiply and divide
* Drop nego patch, subf implementation is sufficient for setting OV and OV32
* Retaining neg[.], as the code is simplified.
* Fix OV resetting in compute_ov()

v1: 
* Use the ISA 3.0 flag to enable CA32 and OV32
* Re-write ca32 compute routine
* Add setting of flags for "neg." and "nego."

Nikunj A Dadhania (15):
  target/ppc: introduce helper_update_ov_legacy
  target/ppc: update ov flag from remaining paths
  target/ppc: introduce helper_update_ca_legacy
  target/ppc: add gen_op_update_ca_legacy() helper
  target/ppc: add gen_op_update_ov_legacy() helper
  target/ppc: remove xer split-out flags(so, ov, ca)
  target/ppc: support for 32-bit carry and overflow
  target/ppc: update ca32 in arithmetic add
  target/ppc: update ca32 in arithmetic substract
  target/ppc: add gen_op_update_ov_isa300()
  target/ppc: update OV/OV32 for mull[d,w] insns
  target/ppc: update OV/OV32 for divide operations
  target/ppc: update OV/OV32 flags for add/sub
  target/ppc: use tcg ops for neg instruction
  target/ppc: add mcrxrx instruction

 target/ppc/cpu.c        |   8 +-
 target/ppc/cpu.h        |  33 ++--
 target/ppc/int_helper.c |  90 ++++++-----
 target/ppc/translate.c  | 396 +++++++++++++++++++++++++++++++++++-------------
 4 files changed, 371 insertions(+), 156 deletions(-)

-- 
2.7.4


* [Qemu-devel] [PATCH v4 01/15] target/ppc: introduce helper_update_ov_legacy
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 02/15] target/ppc: update ov flag from remaining paths Nikunj A Dadhania
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Remove duplicate code; the helper will be useful when consolidating the flags.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/int_helper.c | 34 +++++++++++++---------------------
 1 file changed, 13 insertions(+), 21 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index dd0a892..da4e1a6 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -28,6 +28,15 @@
 /*****************************************************************************/
 /* Fixed point operations helpers */
 
+static inline void helper_update_ov_legacy(CPUPPCState *env, int ov)
+{
+    if (unlikely(ov)) {
+        env->so = env->ov = 1;
+    } else {
+        env->ov = 0;
+    }
+}
+
 target_ulong helper_divweu(CPUPPCState *env, target_ulong ra, target_ulong rb,
                            uint32_t oe)
 {
@@ -49,11 +58,7 @@ target_ulong helper_divweu(CPUPPCState *env, target_ulong ra, target_ulong rb,
     }
 
     if (oe) {
-        if (unlikely(overflow)) {
-            env->so = env->ov = 1;
-        } else {
-            env->ov = 0;
-        }
+        helper_update_ov_legacy(env, overflow);
     }
 
     return (target_ulong)rt;
@@ -81,11 +86,7 @@ target_ulong helper_divwe(CPUPPCState *env, target_ulong ra, target_ulong rb,
     }
 
     if (oe) {
-        if (unlikely(overflow)) {
-            env->so = env->ov = 1;
-        } else {
-            env->ov = 0;
-        }
+        helper_update_ov_legacy(env, overflow);
     }
 
     return (target_ulong)rt;
@@ -105,11 +106,7 @@ uint64_t helper_divdeu(CPUPPCState *env, uint64_t ra, uint64_t rb, uint32_t oe)
     }
 
     if (oe) {
-        if (unlikely(overflow)) {
-            env->so = env->ov = 1;
-        } else {
-            env->ov = 0;
-        }
+        helper_update_ov_legacy(env, overflow);
     }
 
     return rt;
@@ -127,12 +124,7 @@ uint64_t helper_divde(CPUPPCState *env, uint64_t rau, uint64_t rbu, uint32_t oe)
     }
 
     if (oe) {
-
-        if (unlikely(overflow)) {
-            env->so = env->ov = 1;
-        } else {
-            env->ov = 0;
-        }
+        helper_update_ov_legacy(env, overflow);
     }
 
     return rt;
-- 
2.7.4
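
The behaviour factored out by helper_update_ov_legacy can be modelled
outside QEMU as below. This is a minimal sketch with a hypothetical
two-field struct standing in for CPUPPCState; it only shows that OV
follows the latest result while SO is sticky.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the so/ov fields of CPUPPCState. */
struct xer_flags {
    bool so;  /* summary overflow: set on overflow, never cleared here */
    bool ov;  /* overflow of the most recent operation */
};

static void update_ov_legacy(struct xer_flags *x, bool overflow)
{
    if (overflow) {
        x->so = x->ov = true;
    } else {
        x->ov = false;  /* SO is deliberately left untouched */
    }
}
```

After an overflowing operation followed by a non-overflowing one, OV reads
0 but SO still reads 1.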


* [Qemu-devel] [PATCH v4 02/15] target/ppc: update ov flag from remaining paths
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 01/15] target/ppc: introduce helper_update_ov_legacy Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 20:23   ` Richard Henderson
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 03/15] target/ppc: introduce helper_update_ca_legacy Nikunj A Dadhania
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/int_helper.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index da4e1a6..b376860 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -320,22 +320,24 @@ target_ulong helper_divo(CPUPPCState *env, target_ulong arg1,
                          target_ulong arg2)
 {
     uint64_t tmp = (uint64_t)arg1 << 32 | env->spr[SPR_MQ];
+    int ov;
 
     if (((int32_t)tmp == INT32_MIN && (int32_t)arg2 == (int32_t)-1) ||
         (int32_t)arg2 == 0) {
-        env->so = env->ov = 1;
+        ov = 1;
         env->spr[SPR_MQ] = 0;
         return INT32_MIN;
     } else {
         env->spr[SPR_MQ] = tmp % arg2;
         tmp /= (int32_t)arg2;
         if ((int32_t)tmp != tmp) {
-            env->so = env->ov = 1;
+            ov = 1;
         } else {
-            env->ov = 0;
+            ov = 0;
         }
         return tmp;
     }
+    helper_update_ov_legacy(env, ov);
 }
 
 target_ulong helper_divs(CPUPPCState *env, target_ulong arg1,
@@ -356,11 +358,11 @@ target_ulong helper_divso(CPUPPCState *env, target_ulong arg1,
 {
     if (((int32_t)arg1 == INT32_MIN && (int32_t)arg2 == (int32_t)-1) ||
         (int32_t)arg2 == 0) {
-        env->so = env->ov = 1;
+        helper_update_ov_legacy(env, 1);
         env->spr[SPR_MQ] = 0;
         return INT32_MIN;
     } else {
-        env->ov = 0;
+        helper_update_ov_legacy(env, 0);
         env->spr[SPR_MQ] = (int32_t)arg1 % (int32_t)arg2;
         return (int32_t)arg1 / (int32_t)arg2;
     }
-- 
2.7.4
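
In the helper_divo hunk as posted, both branches return before the
trailing helper_update_ov_legacy(env, ov) call, so that call does not
appear to be reachable. One possible shape that keeps a single update
point is sketched below. It is a simplified standalone model, not the QEMU
helper: the struct, the function names and the exact overflow conditions
are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the MQ register and the so/ov flags. */
struct cpu_model {
    uint32_t mq;
    bool so;
    bool ov;
};

static void update_ov_legacy(struct cpu_model *env, bool ov)
{
    if (ov) {
        env->so = env->ov = true;
    } else {
        env->ov = false;
    }
}

/* Divide (arg1 || MQ) by arg2; quotient is returned, remainder goes to MQ. */
static uint32_t divo_model(struct cpu_model *env, uint32_t arg1, uint32_t arg2)
{
    int64_t dividend = (int64_t)(((uint64_t)arg1 << 32) | env->mq);
    int32_t divisor = (int32_t)arg2;
    uint32_t result;
    bool ov;

    if (divisor == 0 || (dividend == INT64_MIN && divisor == -1)) {
        ov = true;
        env->mq = 0;
        result = (uint32_t)INT32_MIN;
    } else {
        int64_t q = dividend / divisor;
        env->mq = (uint32_t)(dividend % divisor);
        ov = q < INT32_MIN || q > INT32_MAX;  /* quotient must fit in 32 bits */
        result = (uint32_t)q;
    }
    update_ov_legacy(env, ov);  /* reached on every path */
    return result;
}
```

The point of the sketch is only the control flow: the result is stored in
a local and returned after the flag update, instead of returning from
inside each branch.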


* [Qemu-devel] [PATCH v4 03/15] target/ppc: introduce helper_update_ca_legacy
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 01/15] target/ppc: introduce helper_update_ov_legacy Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 02/15] target/ppc: update ov flag from remaining paths Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 04/15] target/ppc: add gen_op_update_ca_legacy() helper Nikunj A Dadhania
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Update the environment carry variable in the helper.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/int_helper.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index b376860..d7af671 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -37,6 +37,11 @@ static inline void helper_update_ov_legacy(CPUPPCState *env, int ov)
     }
 }
 
+static inline void helper_update_ca(CPUPPCState *env, int ca)
+{
+    env->ca = ca;
+}
+
 target_ulong helper_divweu(CPUPPCState *env, target_ulong ra, target_ulong rb,
                            uint32_t oe)
 {
@@ -213,24 +218,26 @@ target_ulong helper_sraw(CPUPPCState *env, target_ulong value,
                          target_ulong shift)
 {
     int32_t ret;
+    int ca;
 
     if (likely(!(shift & 0x20))) {
         if (likely((uint32_t)shift != 0)) {
             shift &= 0x1f;
             ret = (int32_t)value >> shift;
             if (likely(ret >= 0 || (value & ((1 << shift) - 1)) == 0)) {
-                env->ca = 0;
+                ca = 0;
             } else {
-                env->ca = 1;
+                ca = 1;
             }
         } else {
             ret = (int32_t)value;
-            env->ca = 0;
+            ca = 0;
         }
     } else {
         ret = (int32_t)value >> 31;
-        env->ca = (ret != 0);
+        ca = (ret != 0);
     }
+    helper_update_ca(env, ca);
     return (target_long)ret;
 }
 
@@ -239,24 +246,26 @@ target_ulong helper_srad(CPUPPCState *env, target_ulong value,
                          target_ulong shift)
 {
     int64_t ret;
+    int ca;
 
     if (likely(!(shift & 0x40))) {
         if (likely((uint64_t)shift != 0)) {
             shift &= 0x3f;
             ret = (int64_t)value >> shift;
             if (likely(ret >= 0 || (value & ((1ULL << shift) - 1)) == 0)) {
-                env->ca = 0;
+                ca = 0;
             } else {
-                env->ca = 1;
+                ca = 1;
             }
         } else {
             ret = (int64_t)value;
-            env->ca = 0;
+            ca = 0;
         }
     } else {
         ret = (int64_t)value >> 63;
-        env->ca = (ret != 0);
+        ca = (ret != 0);
     }
+    helper_update_ca(env, ca);
     return ret;
 }
 #endif
-- 
2.7.4
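
The carry rule that helper_sraw applies before calling the new helper can
be restated as: CA is set only when the result is negative and at least
one 1-bit was shifted out. A standalone sketch of that rule follows; the
function name is hypothetical and it assumes, as the helper itself does,
that >> on a negative signed value is an arithmetic shift.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of a 32-bit shift-right-algebraic that also reports CA. */
static int32_t sraw_model(uint32_t value, uint32_t shift, bool *ca)
{
    int32_t sval = (int32_t)value;
    int32_t ret;

    if (!(shift & 0x20)) {
        shift &= 0x1f;
        ret = sval >> shift;  /* assumed arithmetic shift */
        /* Carry only if negative and some shifted-out bit was 1. */
        *ca = ret < 0 && (value & ((1u << shift) - 1)) != 0;
    } else {
        /* Shift amounts of 32..63 leave only copies of the sign bit. */
        ret = sval >> 31;
        *ca = ret != 0;
    }
    return ret;
}
```

A shift amount of 0 falls out naturally: the mask is 0, so CA is cleared
and the value is returned unchanged.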


* [Qemu-devel] [PATCH v4 04/15] target/ppc: add gen_op_update_ca_legacy() helper
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (2 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 03/15] target/ppc: introduce helper_update_ca_legacy Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 05/15] target/ppc: add gen_op_update_ov_legacy() helper Nikunj A Dadhania
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Update cpu_ca using the helper routine. This will help in consolidating
the XER flags code.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/translate.c | 91 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 63 insertions(+), 28 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index b09e16f..ae7b43d 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -792,6 +792,11 @@ static void gen_cmpb(DisasContext *ctx)
 
 /***                           Integer arithmetic                          ***/
 
+static inline void gen_op_update_ca_legacy(TCGv ca)
+{
+    tcg_gen_mov_tl(cpu_ca, ca);
+}
+
 static inline void gen_op_arith_compute_ov(DisasContext *ctx, TCGv arg0,
                                            TCGv arg1, TCGv arg2, int sub)
 {
@@ -818,11 +823,16 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
                                     bool compute_ov, bool compute_rc0)
 {
     TCGv t0 = ret;
+    TCGv ca = tcg_temp_new();
 
     if (compute_ca || compute_ov) {
         t0 = tcg_temp_new();
     }
 
+    if (add_ca) {
+        tcg_gen_mov_tl(ca, cpu_ca);
+    }
+
     if (compute_ca) {
         if (NARROW_MODE(ctx)) {
             /* Caution: a non-obvious corner case of the spec is that we
@@ -832,32 +842,34 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
             tcg_gen_xor_tl(t1, arg1, arg2);        /* add without carry */
             tcg_gen_add_tl(t0, arg1, arg2);
             if (add_ca) {
-                tcg_gen_add_tl(t0, t0, cpu_ca);
+                tcg_gen_add_tl(t0, t0, ca);
             }
-            tcg_gen_xor_tl(cpu_ca, t0, t1);        /* bits changed w/ carry */
+            tcg_gen_xor_tl(ca, t0, t1);        /* bits changed w/ carry */
             tcg_temp_free(t1);
-            tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);   /* extract bit 32 */
-            tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
+            tcg_gen_extract_tl(ca, ca, 32, 1);
         } else {
             TCGv zero = tcg_const_tl(0);
             if (add_ca) {
-                tcg_gen_add2_tl(t0, cpu_ca, arg1, zero, cpu_ca, zero);
-                tcg_gen_add2_tl(t0, cpu_ca, t0, cpu_ca, arg2, zero);
+                tcg_gen_add2_tl(t0, ca, arg1, zero, ca, zero);
+                tcg_gen_add2_tl(t0, ca, t0, ca, arg2, zero);
             } else {
-                tcg_gen_add2_tl(t0, cpu_ca, arg1, zero, arg2, zero);
+                tcg_gen_add2_tl(t0, ca, arg1, zero, arg2, zero);
             }
             tcg_temp_free(zero);
         }
     } else {
         tcg_gen_add_tl(t0, arg1, arg2);
         if (add_ca) {
-            tcg_gen_add_tl(t0, t0, cpu_ca);
+            tcg_gen_add_tl(t0, t0, ca);
         }
     }
 
     if (compute_ov) {
         gen_op_arith_compute_ov(ctx, t0, arg1, arg2, 0);
     }
+    if (compute_ca) {
+        gen_op_update_ca_legacy(ca);
+    }
     if (unlikely(compute_rc0)) {
         gen_set_Rc0(ctx, t0);
     }
@@ -866,6 +878,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
         tcg_gen_mov_tl(ret, t0);
         tcg_temp_free(t0);
     }
+    tcg_temp_free(ca);
 }
 /* Add functions with two operands */
 #define GEN_INT_ARITH_ADD(name, opc3, add_ca, compute_ca, compute_ov)         \
@@ -1327,10 +1340,14 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
                                      bool compute_ov, bool compute_rc0)
 {
     TCGv t0 = ret;
+    TCGv ca = tcg_temp_new();
 
     if (compute_ca || compute_ov) {
         t0 = tcg_temp_new();
     }
+    if (add_ca) {
+        tcg_gen_extract_tl(ca, cpu_xer, XER_CA_BIT, 1);
+    }
 
     if (compute_ca) {
         /* dest = ~arg1 + arg2 [+ ca].  */
@@ -1342,34 +1359,33 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
             TCGv t1 = tcg_temp_new();
             tcg_gen_not_tl(inv1, arg1);
             if (add_ca) {
-                tcg_gen_add_tl(t0, arg2, cpu_ca);
+                tcg_gen_add_tl(t0, arg2, ca);
             } else {
                 tcg_gen_addi_tl(t0, arg2, 1);
             }
             tcg_gen_xor_tl(t1, arg2, inv1);         /* add without carry */
             tcg_gen_add_tl(t0, t0, inv1);
             tcg_temp_free(inv1);
-            tcg_gen_xor_tl(cpu_ca, t0, t1);         /* bits changes w/ carry */
+            tcg_gen_xor_tl(ca, t0, t1);         /* bits changes w/ carry */
             tcg_temp_free(t1);
-            tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);    /* extract bit 32 */
-            tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
+            tcg_gen_extract_tl(ca, ca, 32, 1);    /* extract bit 32 */
         } else if (add_ca) {
             TCGv zero, inv1 = tcg_temp_new();
             tcg_gen_not_tl(inv1, arg1);
             zero = tcg_const_tl(0);
-            tcg_gen_add2_tl(t0, cpu_ca, arg2, zero, cpu_ca, zero);
-            tcg_gen_add2_tl(t0, cpu_ca, t0, cpu_ca, inv1, zero);
+            tcg_gen_add2_tl(t0, ca, arg2, zero, ca, zero);
+            tcg_gen_add2_tl(t0, ca, t0, ca, inv1, zero);
             tcg_temp_free(zero);
             tcg_temp_free(inv1);
         } else {
-            tcg_gen_setcond_tl(TCG_COND_GEU, cpu_ca, arg2, arg1);
+            tcg_gen_setcond_tl(TCG_COND_GEU, ca, arg2, arg1);
             tcg_gen_sub_tl(t0, arg2, arg1);
         }
     } else if (add_ca) {
         /* Since we're ignoring carry-out, we can simplify the
            standard ~arg1 + arg2 + ca to arg2 - arg1 + ca - 1.  */
         tcg_gen_sub_tl(t0, arg2, arg1);
-        tcg_gen_add_tl(t0, t0, cpu_ca);
+        tcg_gen_add_tl(t0, t0, ca);
         tcg_gen_subi_tl(t0, t0, 1);
     } else {
         tcg_gen_sub_tl(t0, arg2, arg1);
@@ -1378,6 +1394,9 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
     if (compute_ov) {
         gen_op_arith_compute_ov(ctx, t0, arg1, arg2, 1);
     }
+    if (compute_ca) {
+        gen_op_update_ca_legacy(ca);
+    }
     if (unlikely(compute_rc0)) {
         gen_set_Rc0(ctx, t0);
     }
@@ -1386,6 +1405,7 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
         tcg_gen_mov_tl(ret, t0);
         tcg_temp_free(t0);
     }
+    tcg_temp_free(ca);
 }
 /* Sub functions with Two operands functions */
 #define GEN_INT_ARITH_SUBF(name, opc3, add_ca, compute_ca, compute_ov)        \
@@ -2119,23 +2139,27 @@ static void gen_srawi(DisasContext *ctx)
     int sh = SH(ctx->opcode);
     TCGv dst = cpu_gpr[rA(ctx->opcode)];
     TCGv src = cpu_gpr[rS(ctx->opcode)];
+    TCGv ca = tcg_temp_new();
+
     if (sh == 0) {
         tcg_gen_ext32s_tl(dst, src);
-        tcg_gen_movi_tl(cpu_ca, 0);
+        tcg_gen_movi_tl(ca, 0);
     } else {
         TCGv t0;
         tcg_gen_ext32s_tl(dst, src);
-        tcg_gen_andi_tl(cpu_ca, dst, (1ULL << sh) - 1);
+        tcg_gen_andi_tl(ca, dst, (1ULL << sh) - 1);
         t0 = tcg_temp_new();
         tcg_gen_sari_tl(t0, dst, TARGET_LONG_BITS - 1);
-        tcg_gen_and_tl(cpu_ca, cpu_ca, t0);
+        tcg_gen_and_tl(ca, ca, t0);
         tcg_temp_free(t0);
-        tcg_gen_setcondi_tl(TCG_COND_NE, cpu_ca, cpu_ca, 0);
+        tcg_gen_setcondi_tl(TCG_COND_NE, ca, ca, 0);
         tcg_gen_sari_tl(dst, dst, sh);
     }
     if (unlikely(Rc(ctx->opcode) != 0)) {
         gen_set_Rc0(ctx, dst);
     }
+    gen_op_update_ca_legacy(ca);
+    tcg_temp_free(ca);
 }
 
 /* srw & srw. */
@@ -2197,22 +2221,25 @@ static inline void gen_sradi(DisasContext *ctx, int n)
     int sh = SH(ctx->opcode) + (n << 5);
     TCGv dst = cpu_gpr[rA(ctx->opcode)];
     TCGv src = cpu_gpr[rS(ctx->opcode)];
+    TCGv ca = tcg_temp_new();
     if (sh == 0) {
         tcg_gen_mov_tl(dst, src);
-        tcg_gen_movi_tl(cpu_ca, 0);
+        tcg_gen_movi_tl(ca, 0);
     } else {
         TCGv t0;
-        tcg_gen_andi_tl(cpu_ca, src, (1ULL << sh) - 1);
+        tcg_gen_andi_tl(ca, src, (1ULL << sh) - 1);
         t0 = tcg_temp_new();
         tcg_gen_sari_tl(t0, src, TARGET_LONG_BITS - 1);
-        tcg_gen_and_tl(cpu_ca, cpu_ca, t0);
+        tcg_gen_and_tl(ca, ca, t0);
         tcg_temp_free(t0);
-        tcg_gen_setcondi_tl(TCG_COND_NE, cpu_ca, cpu_ca, 0);
+        tcg_gen_setcondi_tl(TCG_COND_NE, ca, ca, 0);
         tcg_gen_sari_tl(dst, src, sh);
     }
     if (unlikely(Rc(ctx->opcode) != 0)) {
         gen_set_Rc0(ctx, dst);
     }
+    gen_op_update_ca_legacy(ca);
+    tcg_temp_free(ca);
 }
 
 static void gen_sradi0(DisasContext *ctx)
@@ -4990,16 +5017,20 @@ static void gen_sraiq(DisasContext *ctx)
     TCGLabel *l1 = gen_new_label();
     TCGv t0 = tcg_temp_new();
     TCGv t1 = tcg_temp_new();
+    TCGv ca = tcg_temp_local_new();
+
     tcg_gen_shri_tl(t0, cpu_gpr[rS(ctx->opcode)], sh);
     tcg_gen_shli_tl(t1, cpu_gpr[rS(ctx->opcode)], 32 - sh);
     tcg_gen_or_tl(t0, t0, t1);
     gen_store_spr(SPR_MQ, t0);
-    tcg_gen_movi_tl(cpu_ca, 0);
+    tcg_gen_movi_tl(ca, 0);
     tcg_gen_brcondi_tl(TCG_COND_EQ, t1, 0, l1);
     tcg_gen_brcondi_tl(TCG_COND_GE, cpu_gpr[rS(ctx->opcode)], 0, l1);
-    tcg_gen_movi_tl(cpu_ca, 1);
+    tcg_gen_movi_tl(ca, 1);
     gen_set_label(l1);
     tcg_gen_sari_tl(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)], sh);
+    gen_op_update_ca_legacy(ca);
+    tcg_temp_free(ca);
     tcg_temp_free(t0);
     tcg_temp_free(t1);
     if (unlikely(Rc(ctx->opcode) != 0))
@@ -5014,6 +5045,8 @@ static void gen_sraq(DisasContext *ctx)
     TCGv t0 = tcg_temp_new();
     TCGv t1 = tcg_temp_local_new();
     TCGv t2 = tcg_temp_local_new();
+    TCGv ca = tcg_temp_local_new();
+
     tcg_gen_andi_tl(t2, cpu_gpr[rB(ctx->opcode)], 0x1F);
     tcg_gen_shr_tl(t0, cpu_gpr[rS(ctx->opcode)], t2);
     tcg_gen_sar_tl(t1, cpu_gpr[rS(ctx->opcode)], t2);
@@ -5028,11 +5061,13 @@ static void gen_sraq(DisasContext *ctx)
     gen_set_label(l1);
     tcg_temp_free(t0);
     tcg_gen_mov_tl(cpu_gpr[rA(ctx->opcode)], t1);
-    tcg_gen_movi_tl(cpu_ca, 0);
+    tcg_gen_movi_tl(ca, 0);
     tcg_gen_brcondi_tl(TCG_COND_GE, t1, 0, l2);
     tcg_gen_brcondi_tl(TCG_COND_EQ, t2, 0, l2);
-    tcg_gen_movi_tl(cpu_ca, 1);
+    tcg_gen_movi_tl(ca, 1);
     gen_set_label(l2);
+    gen_op_update_ca_legacy(ca);
+    tcg_temp_free(ca);
     tcg_temp_free(t1);
     tcg_temp_free(t2);
     if (unlikely(Rc(ctx->opcode) != 0))
-- 
2.7.4
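
The narrow-mode path in gen_op_arith_add derives the carry by comparing
the real sum with the carry-less sum (xor) and extracting bit 32. The same
idea in plain C, as a sketch with a hypothetical function name rather than
TCG ops:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Carry out of the low 32 bits of a + b + carry_in, computed on full
 * 64-bit registers whose upper halves may hold arbitrary data.
 */
static bool narrow_add_carry(uint64_t a, uint64_t b, bool carry_in,
                             uint64_t *sum)
{
    uint64_t no_carry = a ^ b;          /* add without carry */
    uint64_t r = a + b + carry_in;

    *sum = r;
    /* Bits that differ were changed by a carry; bit 32 is the one we want. */
    return ((r ^ no_carry) >> 32) & 1;
}
```

Because only bit 32 of the xor is inspected, whatever sits in the upper 32
bits of the operands does not disturb the result.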


* [Qemu-devel] [PATCH v4 05/15] target/ppc: add gen_op_update_ov_legacy() helper
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (3 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 04/15] target/ppc: add gen_op_update_ca_legacy() helper Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca) Nikunj A Dadhania
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Update cpu_ov/so using the helper

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/translate.c | 84 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 57 insertions(+), 27 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index ae7b43d..4c0e985 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -797,24 +797,43 @@ static inline void gen_op_update_ca_legacy(TCGv ca)
     tcg_gen_mov_tl(cpu_ca, ca);
 }
 
+static inline void gen_op_update_ov_legacy(TCGv ov)
+{
+    tcg_gen_mov_tl(cpu_ov, ov);
+    tcg_gen_or_tl(cpu_so, cpu_so, ov);
+}
+
+/* Sub functions with one operand and one immediate */
+#define GEN_UPDATE_OV(name, const_val)          \
+static void glue(gen_op_, name)(void)           \
+{                                               \
+    TCGv t0 = tcg_const_tl(const_val);          \
+    gen_op_update_ov_legacy(t0);                \
+    tcg_temp_free(t0);                          \
+}
+GEN_UPDATE_OV(set_ov, 1);
+GEN_UPDATE_OV(clear_ov, 0);
+
 static inline void gen_op_arith_compute_ov(DisasContext *ctx, TCGv arg0,
                                            TCGv arg1, TCGv arg2, int sub)
 {
     TCGv t0 = tcg_temp_new();
+    TCGv ov = tcg_temp_new();
 
-    tcg_gen_xor_tl(cpu_ov, arg0, arg2);
+    tcg_gen_xor_tl(ov, arg0, arg2);
     tcg_gen_xor_tl(t0, arg1, arg2);
     if (sub) {
-        tcg_gen_and_tl(cpu_ov, cpu_ov, t0);
+        tcg_gen_and_tl(ov, ov, t0);
     } else {
-        tcg_gen_andc_tl(cpu_ov, cpu_ov, t0);
+        tcg_gen_andc_tl(ov, ov, t0);
     }
     tcg_temp_free(t0);
     if (NARROW_MODE(ctx)) {
-        tcg_gen_ext32s_tl(cpu_ov, cpu_ov);
+        tcg_gen_ext32s_tl(ov, ov);
     }
-    tcg_gen_shri_tl(cpu_ov, cpu_ov, TARGET_LONG_BITS - 1);
-    tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
+    tcg_gen_shri_tl(ov, ov, TARGET_LONG_BITS - 1);
+    gen_op_update_ov_legacy(ov);
+    tcg_temp_free(ov);
 }
 
 /* Common add function */
@@ -997,8 +1016,10 @@ static inline void gen_op_arith_divw(DisasContext *ctx, TCGv ret, TCGv arg1,
         tcg_gen_extu_i32_tl(ret, t3);
     }
     if (compute_ov) {
-        tcg_gen_extu_i32_tl(cpu_ov, t2);
-        tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
+        TCGv ov = tcg_temp_new();
+        tcg_gen_extu_i32_tl(ov, t2);
+        gen_op_update_ov_legacy(ov);
+        tcg_temp_free(ov);
     }
     tcg_temp_free_i32(t0);
     tcg_temp_free_i32(t1);
@@ -1068,8 +1089,7 @@ static inline void gen_op_arith_divd(DisasContext *ctx, TCGv ret, TCGv arg1,
         tcg_gen_divu_i64(ret, t0, t1);
     }
     if (compute_ov) {
-        tcg_gen_mov_tl(cpu_ov, t2);
-        tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
+        gen_op_update_ov_legacy(t2);
     }
     tcg_temp_free_i64(t0);
     tcg_temp_free_i64(t1);
@@ -1249,6 +1269,7 @@ static void gen_mullwo(DisasContext *ctx)
 {
     TCGv_i32 t0 = tcg_temp_new_i32();
     TCGv_i32 t1 = tcg_temp_new_i32();
+    TCGv ov = tcg_temp_new();
 
     tcg_gen_trunc_tl_i32(t0, cpu_gpr[rA(ctx->opcode)]);
     tcg_gen_trunc_tl_i32(t1, cpu_gpr[rB(ctx->opcode)]);
@@ -1261,8 +1282,9 @@ static void gen_mullwo(DisasContext *ctx)
 
     tcg_gen_sari_i32(t0, t0, 31);
     tcg_gen_setcond_i32(TCG_COND_NE, t0, t0, t1);
-    tcg_gen_extu_i32_tl(cpu_ov, t0);
-    tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
+    tcg_gen_extu_i32_tl(ov, t0);
+    gen_op_update_ov_legacy(ov);
+    tcg_temp_free(ov);
 
     tcg_temp_free_i32(t0);
     tcg_temp_free_i32(t1);
@@ -1316,14 +1338,16 @@ static void gen_mulldo(DisasContext *ctx)
 {
     TCGv_i64 t0 = tcg_temp_new_i64();
     TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv ov = tcg_temp_new();
 
     tcg_gen_muls2_i64(t0, t1, cpu_gpr[rA(ctx->opcode)],
                       cpu_gpr[rB(ctx->opcode)]);
     tcg_gen_mov_i64(cpu_gpr[rD(ctx->opcode)], t0);
 
     tcg_gen_sari_i64(t0, t0, 63);
-    tcg_gen_setcond_i64(TCG_COND_NE, cpu_ov, t0, t1);
-    tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
+    tcg_gen_setcond_i64(TCG_COND_NE, ov, t0, t1);
+    gen_op_update_ov_legacy(ov);
+    tcg_temp_free(ov);
 
     tcg_temp_free_i64(t0);
     tcg_temp_free_i64(t1);
@@ -4586,12 +4610,13 @@ static void gen_abso(DisasContext *ctx)
     TCGLabel *l1 = gen_new_label();
     TCGLabel *l2 = gen_new_label();
     TCGLabel *l3 = gen_new_label();
+    TCGv ov = tcg_temp_local_new();
+
     /* Start with XER OV disabled, the most likely case */
-    tcg_gen_movi_tl(cpu_ov, 0);
+    tcg_gen_movi_tl(ov, 0);
     tcg_gen_brcondi_tl(TCG_COND_GE, cpu_gpr[rA(ctx->opcode)], 0, l2);
     tcg_gen_brcondi_tl(TCG_COND_NE, cpu_gpr[rA(ctx->opcode)], 0x80000000, l1);
-    tcg_gen_movi_tl(cpu_ov, 1);
-    tcg_gen_movi_tl(cpu_so, 1);
+    tcg_gen_movi_tl(ov, 1);
     tcg_gen_br(l2);
     gen_set_label(l1);
     tcg_gen_neg_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]);
@@ -4601,6 +4626,8 @@ static void gen_abso(DisasContext *ctx)
     gen_set_label(l3);
     if (unlikely(Rc(ctx->opcode) != 0))
         gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]);
+    gen_op_update_ov_legacy(ov);
+    tcg_temp_free(ov);
 }
 
 /* clcs */
@@ -4671,8 +4698,9 @@ static void gen_dozo(DisasContext *ctx)
     TCGv t0 = tcg_temp_new();
     TCGv t1 = tcg_temp_new();
     TCGv t2 = tcg_temp_new();
+    TCGv ov = tcg_temp_local_new();
     /* Start with XER OV disabled, the most likely case */
-    tcg_gen_movi_tl(cpu_ov, 0);
+    tcg_gen_movi_tl(ov, 0);
     tcg_gen_brcond_tl(TCG_COND_GE, cpu_gpr[rB(ctx->opcode)], cpu_gpr[rA(ctx->opcode)], l1);
     tcg_gen_sub_tl(t0, cpu_gpr[rB(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]);
     tcg_gen_xor_tl(t1, cpu_gpr[rB(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]);
@@ -4680,8 +4708,7 @@ static void gen_dozo(DisasContext *ctx)
     tcg_gen_andc_tl(t1, t1, t2);
     tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], t0);
     tcg_gen_brcondi_tl(TCG_COND_GE, t1, 0, l2);
-    tcg_gen_movi_tl(cpu_ov, 1);
-    tcg_gen_movi_tl(cpu_so, 1);
+    tcg_gen_movi_tl(ov, 1);
     tcg_gen_br(l2);
     gen_set_label(l1);
     tcg_gen_movi_tl(cpu_gpr[rD(ctx->opcode)], 0);
@@ -4691,6 +4718,8 @@ static void gen_dozo(DisasContext *ctx)
     tcg_temp_free(t2);
     if (unlikely(Rc(ctx->opcode) != 0))
         gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]);
+    gen_op_update_ov_legacy(ov);
+    tcg_temp_free(ov);
 }
 
 /* dozi */
@@ -4795,9 +4824,10 @@ static void gen_mulo(DisasContext *ctx)
     TCGLabel *l1 = gen_new_label();
     TCGv_i64 t0 = tcg_temp_new_i64();
     TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv ov = tcg_temp_local_new();
     TCGv t2 = tcg_temp_new();
     /* Start with XER OV disabled, the most likely case */
-    tcg_gen_movi_tl(cpu_ov, 0);
+    tcg_gen_movi_tl(ov, 0);
     tcg_gen_extu_tl_i64(t0, cpu_gpr[rA(ctx->opcode)]);
     tcg_gen_extu_tl_i64(t1, cpu_gpr[rB(ctx->opcode)]);
     tcg_gen_mul_i64(t0, t0, t1);
@@ -4807,14 +4837,15 @@ static void gen_mulo(DisasContext *ctx)
     tcg_gen_trunc_i64_tl(cpu_gpr[rD(ctx->opcode)], t1);
     tcg_gen_ext32s_i64(t1, t0);
     tcg_gen_brcond_i64(TCG_COND_EQ, t0, t1, l1);
-    tcg_gen_movi_tl(cpu_ov, 1);
-    tcg_gen_movi_tl(cpu_so, 1);
+    tcg_gen_movi_tl(ov, 1);
     gen_set_label(l1);
     tcg_temp_free_i64(t0);
     tcg_temp_free_i64(t1);
     tcg_temp_free(t2);
     if (unlikely(Rc(ctx->opcode) != 0))
         gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]);
+    gen_op_update_ov_legacy(ov);
+    tcg_temp_free(ov);
 }
 
 /* nabs - nabs. */
@@ -4844,7 +4875,7 @@ static void gen_nabso(DisasContext *ctx)
     tcg_gen_neg_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]);
     gen_set_label(l2);
     /* nabs never overflows */
-    tcg_gen_movi_tl(cpu_ov, 0);
+    gen_op_clear_ov();
     if (unlikely(Rc(ctx->opcode) != 0))
         gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]);
 }
@@ -5474,7 +5505,7 @@ static inline void gen_405_mulladd_insn(DisasContext *ctx, int opc2, int opc3,
 
             if (opc3 & 0x10) {
                 /* Start with XER OV disabled, the most likely case */
-                tcg_gen_movi_tl(cpu_ov, 0);
+                gen_op_clear_ov();
             }
             if (opc3 & 0x01) {
                 /* Signed */
@@ -5497,8 +5528,7 @@ static inline void gen_405_mulladd_insn(DisasContext *ctx, int opc2, int opc3,
             }
             if (opc3 & 0x10) {
                 /* Check overflow */
-                tcg_gen_movi_tl(cpu_ov, 1);
-                tcg_gen_movi_tl(cpu_so, 1);
+                gen_op_set_ov();
             }
             gen_set_label(l1);
             tcg_gen_mov_tl(cpu_gpr[rt], t0);
-- 
2.7.4
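
The xor/and/andc sequence in gen_op_arith_compute_ov is the usual sign-bit
test for signed overflow. A plain C rendering is sketched below; the
function name is hypothetical, the operand order mirrors the TCG code
(for subf the result is arg2 - arg1), and only the 64-bit, non-narrow case
is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* res is arg1 + arg2 when sub is false, or arg2 - arg1 when sub is true. */
static bool arith_overflow(uint64_t res, uint64_t arg1, uint64_t arg2,
                           bool sub)
{
    uint64_t ov = res ^ arg2;   /* result sign differs from arg2 */
    uint64_t t0 = arg1 ^ arg2;  /* operand signs differ */

    if (sub) {
        ov &= t0;    /* subtract overflows only if operand signs differ */
    } else {
        ov &= ~t0;   /* add overflows only if operand signs agree */
    }
    return ov >> 63; /* the sign bit carries the answer */
}
```

The returned value is what the patch passes on to
gen_op_update_ov_legacy(), which then also folds it into SO.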


* [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca)
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (4 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 05/15] target/ppc: add gen_op_update_ov_legacy() helper Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 20:26   ` Richard Henderson
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 07/15] target/ppc: support for 32-bit carry and overflow Nikunj A Dadhania
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Now get rid of all the split-out variables so, ca and ov. After this
patch, all the bits are stored in CPUPPCState::xer at their respective
bit positions.
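
For readers following along, the packed layout can be sketched in plain C
as below. This is only an illustration, assuming the bit positions from the
cpu.h hunk of this patch; xer_get() and xer_put() are hypothetical names and
are not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in target/ppc/cpu.h by this patch. */
#define XER_SO_BIT 31
#define XER_OV_BIT 30
#define XER_CA_BIT 29

/* Hypothetical helper: read one flag bit out of a packed XER value. */
static inline unsigned xer_get(uint64_t xer, int bit)
{
    return (unsigned)((xer >> bit) & 1);
}

/* Hypothetical helper: write one flag bit, leaving all others untouched. */
static inline uint64_t xer_put(uint64_t xer, int bit, unsigned val)
{
    assert(val <= 1);
    return (xer & ~(UINT64_C(1) << bit)) | ((uint64_t)val << bit);
}
```

The sketch uses an unsigned 64-bit constant for the masks so that building
the bit-31 mask never goes through a signed int.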

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/cpu.c        |   8 +---
 target/ppc/cpu.h        |  26 ++++++------
 target/ppc/int_helper.c |  12 +++---
 target/ppc/translate.c  | 106 +++++++++++++++++++++++++-----------------------
 4 files changed, 78 insertions(+), 74 deletions(-)

diff --git a/target/ppc/cpu.c b/target/ppc/cpu.c
index de3004b..3c08dac 100644
--- a/target/ppc/cpu.c
+++ b/target/ppc/cpu.c
@@ -23,14 +23,10 @@
 
 target_ulong cpu_read_xer(CPUPPCState *env)
 {
-    return env->xer | (env->so << XER_SO) | (env->ov << XER_OV) |
-        (env->ca << XER_CA);
+    return env->xer;
 }
 
 void cpu_write_xer(CPUPPCState *env, target_ulong xer)
 {
-    env->so = (xer >> XER_SO) & 1;
-    env->ov = (xer >> XER_OV) & 1;
-    env->ca = (xer >> XER_CA) & 1;
-    env->xer = xer & ~((1u << XER_SO) | (1u << XER_OV) | (1u << XER_CA));
+    env->xer = xer;
 }
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index b559b67..f1a7ca0 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -962,9 +962,6 @@ struct CPUPPCState {
 #endif
     /* XER (with SO, OV, CA split out) */
     target_ulong xer;
-    target_ulong so;
-    target_ulong ov;
-    target_ulong ca;
     /* Reservation address */
     target_ulong reserve_addr;
     /* Reservation value */
@@ -1369,16 +1366,19 @@ int ppc_compat_max_threads(PowerPCCPU *cpu);
 #define CRF_CH_AND_CL (1 << CRF_SO_BIT)
 
 /* XER definitions */
-#define XER_SO  31
-#define XER_OV  30
-#define XER_CA  29
-#define XER_CMP  8
-#define XER_BC   0
-#define xer_so  (env->so)
-#define xer_ov  (env->ov)
-#define xer_ca  (env->ca)
-#define xer_cmp ((env->xer >> XER_CMP) & 0xFF)
-#define xer_bc  ((env->xer >> XER_BC)  & 0x7F)
+#define XER_SO_BIT  31
+#define XER_OV_BIT  30
+#define XER_CA_BIT  29
+#define XER_CMP_BIT  8
+#define XER_BC_BIT   0
+#define XER_SO  (1 << XER_SO_BIT)
+#define XER_OV  (1 << XER_OV_BIT)
+#define XER_CA  (1 << XER_CA_BIT)
+#define xer_so  ((env->xer & XER_SO) >> XER_SO_BIT)
+#define xer_ov  ((env->xer & XER_OV) >> XER_OV_BIT)
+#define xer_ca  ((env->xer & XER_CA) >> XER_CA_BIT)
+#define xer_cmp ((env->xer >> XER_CMP_BIT) & 0xFF)
+#define xer_bc  ((env->xer >> XER_BC_BIT)  & 0x7F)
 
 /* SPR definitions */
 #define SPR_MQ                (0x000)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index d7af671..b0c3c2b 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -30,16 +30,18 @@
 
 static inline void helper_update_ov_legacy(CPUPPCState *env, int ov)
 {
-    if (unlikely(ov)) {
-        env->so = env->ov = 1;
-    } else {
-        env->ov = 0;
+    env->xer = env->xer & ~(XER_OV);
+    if (ov) {
+        env->xer |= XER_SO | XER_OV;
     }
 }
 
 static inline void helper_update_ca(CPUPPCState *env, int ca)
 {
-    env->ca = ca;
+    env->xer = env->xer & ~(XER_CA);
+    if (ca) {
+        env->xer |= XER_CA;
+    }
 }
 
 target_ulong helper_divweu(CPUPPCState *env, target_ulong ra, target_ulong rb,
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 4c0e985..5be1bb9 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -71,7 +71,7 @@ static TCGv cpu_lr;
 #if defined(TARGET_PPC64)
 static TCGv cpu_cfar;
 #endif
-static TCGv cpu_xer, cpu_so, cpu_ov, cpu_ca;
+static TCGv cpu_xer;
 static TCGv cpu_reserve;
 static TCGv cpu_fpscr;
 static TCGv_i32 cpu_access_type;
@@ -167,12 +167,6 @@ void ppc_translate_init(void)
 
     cpu_xer = tcg_global_mem_new(cpu_env,
                                  offsetof(CPUPPCState, xer), "xer");
-    cpu_so = tcg_global_mem_new(cpu_env,
-                                offsetof(CPUPPCState, so), "SO");
-    cpu_ov = tcg_global_mem_new(cpu_env,
-                                offsetof(CPUPPCState, ov), "OV");
-    cpu_ca = tcg_global_mem_new(cpu_env,
-                                offsetof(CPUPPCState, ca), "CA");
 
     cpu_reserve = tcg_global_mem_new(cpu_env,
                                      offsetof(CPUPPCState, reserve_addr),
@@ -607,8 +601,11 @@ static inline void gen_op_cmp(TCGv arg0, TCGv arg1, int s, int crf)
 {
     TCGv t0 = tcg_temp_new();
     TCGv_i32 t1 = tcg_temp_new_i32();
+    TCGv so = tcg_temp_new();
 
-    tcg_gen_trunc_tl_i32(cpu_crf[crf], cpu_so);
+    tcg_gen_extract_tl(so, cpu_xer, XER_SO_BIT, 1);
+    tcg_gen_trunc_tl_i32(cpu_crf[crf], so);
+    tcg_temp_free(so);
 
     tcg_gen_setcond_tl((s ? TCG_COND_LT: TCG_COND_LTU), t0, arg0, arg1);
     tcg_gen_trunc_tl_i32(t1, t0);
@@ -794,13 +791,24 @@ static void gen_cmpb(DisasContext *ctx)
 
 static inline void gen_op_update_ca_legacy(TCGv ca)
 {
-    tcg_gen_mov_tl(cpu_ca, ca);
+    TCGv t0 = tcg_temp_new();
+    tcg_gen_movi_tl(t0, XER_CA);
+    tcg_gen_andc_tl(cpu_xer, cpu_xer, t0);
+    tcg_gen_shli_tl(t0, ca, XER_CA_BIT);
+    tcg_gen_or_tl(cpu_xer, cpu_xer, t0);
+    tcg_temp_free(t0);
 }
 
 static inline void gen_op_update_ov_legacy(TCGv ov)
 {
-    tcg_gen_mov_tl(cpu_ov, ov);
-    tcg_gen_or_tl(cpu_so, ov);
+    TCGv t1 = tcg_temp_new();
+    TCGv zero = tcg_const_tl(0);
+    tcg_gen_movi_tl(t1, XER_OV);
+    tcg_gen_andc_tl(cpu_xer, cpu_xer, t1);
+    tcg_gen_movi_tl(t1, XER_OV | XER_SO);
+    tcg_gen_movcond_tl(TCG_COND_EQ, cpu_xer, ov, zero, cpu_xer, t1);
+    tcg_temp_free(t1);
+    tcg_temp_free(zero);
 }
 
 /* Sub functions with one operand and one immediate */
@@ -849,7 +857,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
     }
 
     if (add_ca) {
-        tcg_gen_mov_tl(ca, cpu_ca);
+        tcg_gen_extract_tl(ca, cpu_xer, XER_CA_BIT, 1);
     }
 
     if (compute_ca) {
@@ -3151,8 +3159,12 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
                                   int reg, int memop)
 {
     TCGLabel *l1;
+    TCGv so = tcg_temp_new();
+
+    tcg_gen_extract_tl(so, cpu_xer, XER_SO_BIT, 1);
+    tcg_gen_trunc_tl_i32(cpu_crf[0], so);
+    tcg_temp_free(so);
 
-    tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
     l1 = gen_new_label();
     tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
     tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], CRF_EQ);
@@ -3230,6 +3242,7 @@ static void gen_stqcx_(DisasContext *ctx)
 #if !defined(CONFIG_USER_ONLY)
     TCGLabel *l1;
     TCGv gpr1, gpr2;
+    TCGv so = tcg_temp_new();
 #endif
 
     if (unlikely((rD(ctx->opcode) & 1))) {
@@ -3246,7 +3259,10 @@ static void gen_stqcx_(DisasContext *ctx)
 #if defined(CONFIG_USER_ONLY)
     gen_conditional_store(ctx, EA, reg, 16);
 #else
-    tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
+    tcg_gen_extract_tl(so, cpu_xer, XER_SO_BIT, 1);
+    tcg_gen_trunc_tl_i32(cpu_crf[0], so);
+    tcg_temp_free(so);
+
     l1 = gen_new_label();
     tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
     tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], CRF_EQ);
@@ -3756,51 +3772,26 @@ static void gen_tdi(DisasContext *ctx)
 
 static void gen_read_xer(TCGv dst)
 {
-    TCGv t0 = tcg_temp_new();
-    TCGv t1 = tcg_temp_new();
-    TCGv t2 = tcg_temp_new();
     tcg_gen_mov_tl(dst, cpu_xer);
-    tcg_gen_shli_tl(t0, cpu_so, XER_SO);
-    tcg_gen_shli_tl(t1, cpu_ov, XER_OV);
-    tcg_gen_shli_tl(t2, cpu_ca, XER_CA);
-    tcg_gen_or_tl(t0, t0, t1);
-    tcg_gen_or_tl(dst, dst, t2);
-    tcg_gen_or_tl(dst, dst, t0);
-    tcg_temp_free(t0);
-    tcg_temp_free(t1);
-    tcg_temp_free(t2);
 }
 
 static void gen_write_xer(TCGv src)
 {
-    tcg_gen_andi_tl(cpu_xer, src,
-                    ~((1u << XER_SO) | (1u << XER_OV) | (1u << XER_CA)));
-    tcg_gen_extract_tl(cpu_so, src, XER_SO, 1);
-    tcg_gen_extract_tl(cpu_ov, src, XER_OV, 1);
-    tcg_gen_extract_tl(cpu_ca, src, XER_CA, 1);
+    tcg_gen_mov_tl(cpu_xer, src);
 }
 
 /* mcrxr */
 static void gen_mcrxr(DisasContext *ctx)
 {
-    TCGv_i32 t0 = tcg_temp_new_i32();
-    TCGv_i32 t1 = tcg_temp_new_i32();
     TCGv_i32 dst = cpu_crf[crfD(ctx->opcode)];
+    TCGv t0 = tcg_temp_new();
 
-    tcg_gen_trunc_tl_i32(t0, cpu_so);
-    tcg_gen_trunc_tl_i32(t1, cpu_ov);
-    tcg_gen_trunc_tl_i32(dst, cpu_ca);
-    tcg_gen_shli_i32(t0, t0, 3);
-    tcg_gen_shli_i32(t1, t1, 2);
-    tcg_gen_shli_i32(dst, dst, 1);
-    tcg_gen_or_i32(dst, dst, t0);
-    tcg_gen_or_i32(dst, dst, t1);
-    tcg_temp_free_i32(t0);
-    tcg_temp_free_i32(t1);
-
-    tcg_gen_movi_tl(cpu_so, 0);
-    tcg_gen_movi_tl(cpu_ov, 0);
-    tcg_gen_movi_tl(cpu_ca, 0);
+    tcg_gen_trunc_tl_i32(dst, cpu_xer);
+    tcg_gen_shri_i32(dst, dst, XER_CA_BIT - 1);
+    tcg_gen_andi_i32(dst, dst, 0xE);
+    tcg_gen_movi_tl(t0, XER_SO | XER_OV | XER_CA);
+    tcg_gen_andc_tl(cpu_xer, cpu_xer, t0);
+    tcg_temp_free(t0);
 }
 
 /* mfcr mfocrf */
@@ -4421,6 +4412,7 @@ static void gen_slbfee_(DisasContext *ctx)
     gen_inval_exception(ctx, POWERPC_EXCP_PRIV_REG);
 #else
     TCGLabel *l1, *l2;
+    TCGv so;
 
     if (unlikely(ctx->pr)) {
         gen_inval_exception(ctx, POWERPC_EXCP_PRIV_REG);
@@ -4430,7 +4422,11 @@ static void gen_slbfee_(DisasContext *ctx)
                              cpu_gpr[rB(ctx->opcode)]);
     l1 = gen_new_label();
     l2 = gen_new_label();
-    tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
+    so = tcg_temp_new();
+
+    tcg_gen_extract_tl(so, cpu_xer, XER_SO_BIT, 1);
+    tcg_gen_trunc_tl_i32(cpu_crf[0], so);
+    tcg_temp_free(so);
     tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_gpr[rS(ctx->opcode)], -1, l1);
     tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], CRF_EQ);
     tcg_gen_br(l2);
@@ -5854,7 +5850,12 @@ static void gen_tlbsx_40x(DisasContext *ctx)
     tcg_temp_free(t0);
     if (Rc(ctx->opcode)) {
         TCGLabel *l1 = gen_new_label();
-        tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
+        TCGv so = tcg_temp_new();
+
+        tcg_gen_extract_tl(so, cpu_xer, XER_SO_BIT, 1);
+        tcg_gen_trunc_tl_i32(cpu_crf[0], so);
+        tcg_temp_free(so);
+
         tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_gpr[rD(ctx->opcode)], -1, l1);
         tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 0x02);
         gen_set_label(l1);
@@ -5929,7 +5930,12 @@ static void gen_tlbsx_440(DisasContext *ctx)
     tcg_temp_free(t0);
     if (Rc(ctx->opcode)) {
         TCGLabel *l1 = gen_new_label();
-        tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
+        TCGv so = tcg_temp_new();
+
+        tcg_gen_extract_tl(so, cpu_xer, XER_SO_BIT, 1);
+        tcg_gen_trunc_tl_i32(cpu_crf[0], so);
+        tcg_temp_free(so);
+
         tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_gpr[rD(ctx->opcode)], -1, l1);
         tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 0x02);
         gen_set_label(l1);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 07/15] target/ppc: support for 32-bit carry and overflow
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (5 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca) Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 08/15] target/ppc: update ca32 in arithmetic add Nikunj A Dadhania
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

POWER ISA 3.0 adds CA32 and OV32 status in 64-bit mode. Add the flags
and corresponding defines.

Moreover, CA32 is updated when CA is updated and OV32 is updated when OV
is updated.

Arithmetic instructions:
    * Addition and Subtraction:

        addic, addic., subfic, addc, subfc, adde, subfe, addme, subfme,
        addze, and subfze always update CA and CA32.

        => CA reflects the carry out of bit 0 in 64-bit mode and out of
           bit 32 in 32-bit mode (IBM bit numbering, where bit 0 is the
           most significant bit).
        => CA32 reflects the carry out of bit 32 independent of the
           mode.

        => SO and OV reflect overflow of the 64-bit result in 64-bit
           mode and overflow of the low-order 32-bit result in 32-bit
           mode.
        => OV32 reflects overflow of the low-order 32-bit result
           independent of the mode.

    * Multiply Low and Divide:

        For mulld, divd, divde, divdu and divdeu: SO, OV, and OV32 bits
        reflect overflow of the 64-bit result.

        For mullw, divw, divwe, divwu and divweu: SO, OV, and OV32 bits
        reflect overflow of the 32-bit result.

    * Negate with OE=1 (nego):

        For 64-bit mode if the register RA contains
        0x8000_0000_0000_0000, OV and OV32 are set to 1.

        For 32-bit mode if the register RA contains 0x8000_0000, OV and
        OV32 are set to 1.
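
As an illustration of the CA/CA32 rules for additions in 64-bit mode, here
is a minimal plain-C sketch computed straight from the definitions. The
function names add_ca64() and add_ca32() are made up for this example; the
carry-in is assumed to be 0 or 1.

```c
#include <stdint.h>

/* CA in 64-bit mode: carry out of the most significant bit of a + b + cin. */
static inline unsigned add_ca64(uint64_t a, uint64_t b, unsigned cin)
{
    uint64_t s = a + b;
    unsigned c1 = s < a;        /* carry produced by a + b */
    uint64_t r = s + cin;
    unsigned c2 = r < s;        /* carry produced by adding the carry-in */
    return c1 | c2;
}

/* CA32: carry out of the low-order 32 bits, independent of the mode. */
static inline unsigned add_ca32(uint64_t a, uint64_t b, unsigned cin)
{
    uint64_t low = (a & 0xFFFFFFFFu) + (b & 0xFFFFFFFFu) + cin;
    return (unsigned)(low >> 32);
}
```

For example, 0xFFFFFFFF + 1 carries out of the low word but not out of the
full 64-bit value, so CA32 and CA differ for that pair of operands.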

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/cpu.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index f1a7ca0..e789d4b 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1369,14 +1369,20 @@ int ppc_compat_max_threads(PowerPCCPU *cpu);
 #define XER_SO_BIT  31
 #define XER_OV_BIT  30
 #define XER_CA_BIT  29
+#define XER_OV32_BIT  19
+#define XER_CA32_BIT  18
 #define XER_CMP_BIT  8
 #define XER_BC_BIT   0
 #define XER_SO  (1 << XER_SO_BIT)
 #define XER_OV  (1 << XER_OV_BIT)
 #define XER_CA  (1 << XER_CA_BIT)
+#define XER_OV32  (1 << XER_OV32_BIT)
+#define XER_CA32  (1 << XER_CA32_BIT)
 #define xer_so  ((env->xer & XER_SO) >> XER_SO_BIT)
 #define xer_ov  ((env->xer & XER_OV) >> XER_OV_BIT)
 #define xer_ca  ((env->xer & XER_CA) >> XER_CA_BIT)
+#define xer_ov32  ((env->xer & XER_OV32) >> XER_OV32_BIT)
+#define xer_ca32  ((env->xer & XER_CA32) >> XER_CA32_BIT)
 #define xer_cmp ((env->xer >> XER_CMP_BIT) & 0xFF)
 #define xer_bc  ((env->xer >> XER_BC_BIT)  & 0x7F)
 
@@ -2343,6 +2349,7 @@ enum {
 
 /*****************************************************************************/
 
+#define is_isa300(ctx) (!!(ctx->insns_flags2 & PPC2_ISA300))
 target_ulong cpu_read_xer(CPUPPCState *env);
 void cpu_write_xer(CPUPPCState *env, target_ulong xer);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 08/15] target/ppc: update ca32 in arithmetic add
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (6 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 07/15] target/ppc: support for 32-bit carry and overflow Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 09/15] target/ppc: update ca32 in arithmetic substract Nikunj A Dadhania
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Add a routine to compute CA32: gen_op_arith_compute_ca32().

In 64-bit mode, CA32 is computed with this routine. In 32-bit mode, CA
and CA32 have the same value.
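
The idea behind the routine, for the add case, can be sketched in plain C:
bit 32 of (a ^ b ^ res) is the carry that propagated out of the low-order
32 bits. The name ca32_from_result() is hypothetical, and res is assumed to
be a + b (plus an optional carry-in), truncated to 64 bits.

```c
#include <stdint.h>

/*
 * Without a carry coming in from below, bit 32 of the sum would be
 * a[32] ^ b[32]. XOR-ing that with the actual result bit leaves exactly
 * the carry into bit 32, which is the carry out of the low 32 bits.
 */
static inline unsigned ca32_from_result(uint64_t a, uint64_t b, uint64_t res)
{
    return (unsigned)(((a ^ b ^ res) >> 32) & 1);
}
```

The same expression also holds when a carry-in was added, since the
carry-in only affects bit 0 directly.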

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/translate.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 5be1bb9..c98e708 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -799,6 +799,28 @@ static inline void gen_op_update_ca_legacy(TCGv ca)
     tcg_temp_free(t0);
 }
 
+static inline void gen_op_update_ca_isa300(TCGv ca, TCGv ca32)
+{
+    TCGv t0 = tcg_temp_new();
+
+    tcg_gen_movi_tl(t0, XER_CA | XER_CA32);
+    tcg_gen_andc_tl(cpu_xer, cpu_xer, t0);
+    tcg_gen_shli_tl(t0, ca, XER_CA_BIT);
+    tcg_gen_or_tl(cpu_xer, cpu_xer, t0);
+    tcg_gen_shli_tl(t0, ca32, XER_CA32_BIT);
+    tcg_gen_or_tl(cpu_xer, cpu_xer, t0);
+    tcg_temp_free(t0);
+}
+
+static inline void gen_op_update_ca(DisasContext *ctx, TCGv ca, TCGv ca32)
+{
+    if (is_isa300(ctx)) {
+        gen_op_update_ca_isa300(ca, ca32);
+    } else {
+        gen_op_update_ca_legacy(ca);
+    }
+}
+
 static inline void gen_op_update_ov_legacy(TCGv ov)
 {
     TCGv t1 = tcg_temp_new();
@@ -844,6 +866,23 @@ static inline void gen_op_arith_compute_ov(DisasContext *ctx, TCGv arg0,
     tcg_temp_free(ov);
 }
 
+static inline void gen_op_arith_compute_ca32(DisasContext *ctx, TCGv ca32,
+                                             TCGv res, TCGv arg0, TCGv arg1,
+                                             int sub)
+{
+    TCGv t0;
+
+    if (!is_isa300(ctx)) {
+        return;
+    }
+
+    t0 = tcg_temp_new();
+    tcg_gen_xor_tl(t0, arg0, arg1);
+    tcg_gen_xor_tl(t0, t0, res);
+    tcg_gen_extract_tl(ca32, t0, 32, 1);
+    tcg_temp_free(t0);
+}
+
 /* Common add function */
 static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
                                     TCGv arg2, bool add_ca, bool compute_ca,
@@ -851,6 +890,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
 {
     TCGv t0 = ret;
     TCGv ca = tcg_temp_new();
+    TCGv ca32 = tcg_temp_new();
 
     if (compute_ca || compute_ov) {
         t0 = tcg_temp_new();
@@ -874,6 +914,9 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
             tcg_gen_xor_tl(ca, t0, t1);        /* bits changed w/ carry */
             tcg_temp_free(t1);
             tcg_gen_extract_tl(ca, ca, 32, 1);
+            if (is_isa300(ctx)) {
+                tcg_gen_mov_tl(ca32, ca);
+            }
         } else {
             TCGv zero = tcg_const_tl(0);
             if (add_ca) {
@@ -882,6 +925,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
             } else {
                 tcg_gen_add2_tl(t0, ca, arg1, zero, arg2, zero);
             }
+            gen_op_arith_compute_ca32(ctx, ca32, t0, arg1, arg2, 0);
             tcg_temp_free(zero);
         }
     } else {
@@ -895,7 +939,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
         gen_op_arith_compute_ov(ctx, t0, arg1, arg2, 0);
     }
     if (compute_ca) {
-        gen_op_update_ca_legacy(ca);
+        gen_op_update_ca(ctx, ca, ca32);
     }
     if (unlikely(compute_rc0)) {
         gen_set_Rc0(ctx, t0);
@@ -906,6 +950,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
         tcg_temp_free(t0);
     }
     tcg_temp_free(ca);
+    tcg_temp_free(ca32);
 }
 /* Add functions with two operands */
 #define GEN_INT_ARITH_ADD(name, opc3, add_ca, compute_ca, compute_ov)         \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 09/15] target/ppc: update ca32 in arithmetic substract
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (7 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 08/15] target/ppc: update ca32 in arithmetic add Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 10/15] target/ppc: add gen_op_update_ov_isa300() Nikunj A Dadhania
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/translate.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index c98e708..143b595 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -877,7 +877,11 @@ static inline void gen_op_arith_compute_ca32(DisasContext *ctx, TCGv ca32,
     }
 
     t0 = tcg_temp_new();
-    tcg_gen_xor_tl(t0, arg0, arg1);
+    if (sub) {
+        tcg_gen_eqv_tl(t0, arg0, arg1);
+    } else {
+        tcg_gen_xor_tl(t0, arg0, arg1);
+    }
     tcg_gen_xor_tl(t0, t0, res);
     tcg_gen_extract_tl(ca32, t0, 32, 1);
     tcg_temp_free(t0);
@@ -1418,6 +1422,7 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
 {
     TCGv t0 = ret;
     TCGv ca = tcg_temp_new();
+    TCGv ca32 = tcg_temp_new();
 
     if (compute_ca || compute_ov) {
         t0 = tcg_temp_new();
@@ -1446,17 +1451,22 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
             tcg_gen_xor_tl(ca, t0, t1);         /* bits changes w/ carry */
             tcg_temp_free(t1);
             tcg_gen_extract_tl(ca, ca, 32, 1);    /* extract bit 32 */
+            if (is_isa300(ctx)) {
+                tcg_gen_mov_tl(ca32, ca);
+            }
         } else if (add_ca) {
             TCGv zero, inv1 = tcg_temp_new();
             tcg_gen_not_tl(inv1, arg1);
             zero = tcg_const_tl(0);
             tcg_gen_add2_tl(t0, ca, arg2, zero, ca, zero);
             tcg_gen_add2_tl(t0, ca, t0, ca, inv1, zero);
+            gen_op_arith_compute_ca32(ctx, ca32, t0, inv1, arg2, 0);
             tcg_temp_free(zero);
             tcg_temp_free(inv1);
         } else {
             tcg_gen_setcond_tl(TCG_COND_GEU, ca, arg2, arg1);
             tcg_gen_sub_tl(t0, arg2, arg1);
+            gen_op_arith_compute_ca32(ctx, ca32, t0, arg1, arg2, 1);
         }
     } else if (add_ca) {
         /* Since we're ignoring carry-out, we can simplify the
@@ -1472,7 +1482,7 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
         gen_op_arith_compute_ov(ctx, t0, arg1, arg2, 1);
     }
     if (compute_ca) {
-        gen_op_update_ca_legacy(ca);
+        gen_op_update_ca(ctx, ca, ca32);
     }
     if (unlikely(compute_rc0)) {
         gen_set_Rc0(ctx, t0);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 10/15] target/ppc: add gen_op_update_ov_isa300()
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (8 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 09/15] target/ppc: update ca32 in arithmetic substract Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 11/15] target/ppc: update OV/OV32 for mull[d, w] insns Nikunj A Dadhania
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Introduce a routine that updates OV and OV32 on ISA 3.0 (POWER9) and
later CPUs.
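
The effect on the packed XER value can be sketched in plain C as follows.
This is an illustration only: xer_update_ov() is a hypothetical name, the
bit positions are taken from cpu.h in this series, and ov/ov32 are assumed
to be 0 or 1.

```c
#include <assert.h>
#include <stdint.h>

#define XER_SO_BIT   31
#define XER_OV_BIT   30
#define XER_OV32_BIT 19

/* Clear OV and OV32, then set them from the inputs; SO is sticky. */
static inline uint64_t xer_update_ov(uint64_t xer, unsigned ov, unsigned ov32)
{
    assert(ov <= 1 && ov32 <= 1);
    xer &= ~((UINT64_C(1) << XER_OV_BIT) | (UINT64_C(1) << XER_OV32_BIT));
    xer |= (uint64_t)ov << XER_OV_BIT;
    xer |= (uint64_t)ov << XER_SO_BIT;      /* only ever sets SO */
    xer |= (uint64_t)ov32 << XER_OV32_BIT;
    return xer;
}
```

Note that SO is never cleared here: once an overflow has been recorded it
stays set until XER is written explicitly.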

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/translate.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 143b595..ea0a356 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -833,6 +833,31 @@ static inline void gen_op_update_ov_legacy(TCGv ov)
     tcg_temp_free(zero);
 }
 
+static inline void gen_op_update_ov_isa300(TCGv ov, TCGv ov32)
+{
+    TCGv t1 = tcg_temp_new();
+    TCGv t2 = tcg_temp_new();
+    tcg_gen_movi_tl(t1, XER_OV | XER_OV32);
+    tcg_gen_andc_tl(cpu_xer, cpu_xer, t1);
+    tcg_gen_shli_tl(t1, ov, XER_OV_BIT);
+    tcg_gen_shli_tl(t2, ov, XER_SO_BIT);
+    tcg_gen_or_tl(t1, t1, t2);
+    tcg_gen_shli_tl(t2, ov32, XER_OV32_BIT);
+    tcg_gen_or_tl(t1, t1, t2);
+    tcg_gen_or_tl(cpu_xer, cpu_xer, t1);
+    tcg_temp_free(t1);
+    tcg_temp_free(t2);
+}
+
+static inline void gen_op_update_ov(DisasContext *ctx, TCGv ov, TCGv ov32)
+{
+    if (is_isa300(ctx)) {
+        gen_op_update_ov_isa300(ov, ov32);
+    } else {
+        gen_op_update_ov_legacy(ov);
+    }
+}
+
 /* Sub functions with one operand and one immediate */
 #define GEN_UPDATE_OV(name, const_val)          \
 static void glue(gen_op_, name)(void)           \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 11/15] target/ppc: update OV/OV32 for mull[d, w] insns
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (9 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 10/15] target/ppc: add gen_op_update_ov_isa300() Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 12/15] target/ppc: update OV/OV32 for divide operations Nikunj A Dadhania
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

For Multiply Word:
SO, OV, and OV32 bits reflect overflow of the 32-bit result.

For Multiply Doubleword:
SO, OV, and OV32 bits reflect overflow of the 64-bit result.
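
For the word case, the overflow condition can be sketched in plain C by
forming the full 64-bit product and checking whether it fits in 32 signed
bits. mullw_overflows() is a hypothetical name used only for this example.

```c
#include <stdint.h>

/* Returns 1 if the signed 32x32 product does not fit in 32 bits. */
static inline unsigned mullw_overflows(int32_t a, int32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;
    return prod < INT32_MIN || prod > INT32_MAX;
}
```

The single value returned here would feed both OV and OV32, which matches
the gen_op_update_ov(ctx, ov, ov) calls in the diff below.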

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index ea0a356..e1105e8 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -1365,7 +1365,7 @@ static void gen_mullwo(DisasContext *ctx)
     tcg_gen_sari_i32(t0, t0, 31);
     tcg_gen_setcond_i32(TCG_COND_NE, t0, t0, t1);
     tcg_gen_extu_i32_tl(ov, t0);
-    gen_op_update_ov_legacy(ov);
+    gen_op_update_ov(ctx, ov, ov);
     tcg_temp_free(ov);
 
     tcg_temp_free_i32(t0);
@@ -1428,7 +1428,7 @@ static void gen_mulldo(DisasContext *ctx)
 
     tcg_gen_sari_i64(t0, t0, 63);
     tcg_gen_setcond_i64(TCG_COND_NE, ov, t0, t1);
-    gen_op_update_ov_legacy(ov);
+    gen_op_update_ov(ctx, ov, ov);
     tcg_temp_free(ov);
 
     tcg_temp_free_i64(t0);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 12/15] target/ppc: update OV/OV32 for divide operations
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (10 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 11/15] target/ppc: update OV/OV32 for mull[d, w] insns Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 13/15] target/ppc: update OV/OV32 flags for add/sub Nikunj A Dadhania
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Add helper_update_ov_isa300() in the int_helper for updating the
overflow flags.

For Divide Word:
SO, OV, and OV32 bits reflect overflow of the 32-bit result.

For Divide Doubleword:
SO, OV, and OV32 bits reflect overflow of the 64-bit result.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/int_helper.c | 17 +++++++++++++++++
 target/ppc/translate.c  |  8 ++++----
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index b0c3c2b..8cedce6 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -36,6 +36,23 @@ static inline void helper_update_ov_legacy(CPUPPCState *env, int ov)
     }
 }
 
+static inline void helper_update_ov_isa300(CPUPPCState *env, int ov, int ov32)
+{
+    env->xer = env->xer & ~(XER_OV | XER_OV32);
+    if (ov) {
+        env->xer |= XER_SO | XER_OV | XER_OV32;
+    }
+}
+
+static inline void helper_update_ov(CPUPPCState *env, int ov)
+{
+    if (is_isa300(env)) {
+        helper_update_ov_isa300(env, ov, ov);
+    } else {
+        helper_update_ov_legacy(env, ov);
+    }
+}
+
 static inline void helper_update_ca(CPUPPCState *env, int ca)
 {
     env->xer = env->xer & ~(XER_CA);
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index e1105e8..f7d37b0 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -1100,7 +1100,7 @@ static inline void gen_op_arith_divw(DisasContext *ctx, TCGv ret, TCGv arg1,
     if (compute_ov) {
         TCGv ov = tcg_temp_new();
         tcg_gen_extu_i32_tl(ov, t2);
-        gen_op_update_ov_legacy(ov);
+        gen_op_update_ov(ctx, ov, ov);
         tcg_temp_free(ov);
     }
     tcg_temp_free_i32(t0);
@@ -1171,7 +1171,7 @@ static inline void gen_op_arith_divd(DisasContext *ctx, TCGv ret, TCGv arg1,
         tcg_gen_divu_i64(ret, t0, t1);
     }
     if (compute_ov) {
-        gen_op_update_ov_legacy(t2);
+        gen_op_update_ov(ctx, t2, t2);
     }
     tcg_temp_free_i64(t0);
     tcg_temp_free_i64(t1);
@@ -1189,10 +1189,10 @@ static void glue(gen_, name)(DisasContext *ctx)
                       cpu_gpr[rA(ctx->opcode)], cpu_gpr[rB(ctx->opcode)],     \
                       sign, compute_ov);                                      \
 }
-/* divwu  divwu.  divwuo  divwuo.   */
+/* divdu  divdu.  divduo  divduo.   */
 GEN_INT_ARITH_DIVD(divdu, 0x0E, 0, 0);
 GEN_INT_ARITH_DIVD(divduo, 0x1E, 0, 1);
-/* divw  divw.  divwo  divwo.   */
+/* divd  divd.  divdo  divdo.   */
 GEN_INT_ARITH_DIVD(divd, 0x0F, 1, 0);
 GEN_INT_ARITH_DIVD(divdo, 0x1F, 1, 1);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 13/15] target/ppc: update OV/OV32 flags for add/sub
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (11 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 12/15] target/ppc: update OV/OV32 for divide operations Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 14/15] target/ppc: use tcg ops for neg instruction Nikunj A Dadhania
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

* SO and OV reflect overflow of the 64-bit result in 64-bit mode and
  overflow of the low-order 32-bit result in 32-bit mode.

* OV32 reflects overflow of the low-order 32-bit result independent of
  the mode.
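
For the add case in 64-bit mode, the two flags can be sketched in plain C
with the usual sign-bit test: overflow happens when both operands have the
same sign and the result has a different one. add_ov64() and add_ov32() are
hypothetical names; the subtract case is not covered by this sketch.

```c
#include <stdint.h>

/* Bit n is set where the result's sign differs from both operands. */
static inline uint64_t add_ov_bits(uint64_t a, uint64_t b)
{
    uint64_t res = a + b;
    return (res ^ a) & (res ^ b);
}

/* OV in 64-bit mode: signed overflow of the full 64-bit result. */
static inline unsigned add_ov64(uint64_t a, uint64_t b)
{
    return (unsigned)((add_ov_bits(a, b) >> 63) & 1);
}

/* OV32: signed overflow of the low-order 32-bit result. */
static inline unsigned add_ov32(uint64_t a, uint64_t b)
{
    return (unsigned)((add_ov_bits(a, b) >> 31) & 1);
}
```

Both flags come out of the same intermediate value; only the extracted bit
position differs (63 for OV, 31 for OV32).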

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/translate.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index f7d37b0..dc75cca 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -874,6 +874,7 @@ static inline void gen_op_arith_compute_ov(DisasContext *ctx, TCGv arg0,
 {
     TCGv t0 = tcg_temp_new();
     TCGv ov = tcg_temp_new();
+    TCGv ov32 = tcg_temp_new();
 
     tcg_gen_xor_tl(ov, arg0, arg2);
     tcg_gen_xor_tl(t0, arg1, arg2);
@@ -884,11 +885,19 @@ static inline void gen_op_arith_compute_ov(DisasContext *ctx, TCGv arg0,
     }
     tcg_temp_free(t0);
     if (NARROW_MODE(ctx)) {
-        tcg_gen_ext32s_tl(ov, ov);
+        tcg_gen_extract_tl(ov, ov, 31, 1);
+        if (is_isa300(ctx)) {
+            tcg_gen_mov_tl(ov32, ov);
+        }
+    } else {
+        if (is_isa300(ctx)) {
+            tcg_gen_extract_tl(ov32, ov, 31, 1);
+        }
+        tcg_gen_extract_tl(ov, ov, 63, 1);
     }
-    tcg_gen_shri_tl(ov, ov, TARGET_LONG_BITS - 1);
-    gen_op_update_ov_legacy(ov);
+    gen_op_update_ov(ctx, ov, ov32);
     tcg_temp_free(ov);
+    tcg_temp_free(ov32);
 }
 
 static inline void gen_op_arith_compute_ca32(DisasContext *ctx, TCGv ca32,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 14/15] target/ppc: use tcg ops for neg instruction
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (12 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 13/15] target/ppc: update OV/OV32 flags for add/sub Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 15/15] target/ppc: add mcrxrx instruction Nikunj A Dadhania
  2017-02-24  5:02 ` [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 David Gibson
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target/ppc/translate.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index dc75cca..5af9667 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -1583,7 +1583,10 @@ static inline void gen_op_arith_neg(DisasContext *ctx, bool compute_ov)
 
 static void gen_neg(DisasContext *ctx)
 {
-    gen_op_arith_neg(ctx, 0);
+    tcg_gen_neg_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]);
+    if (unlikely(Rc(ctx->opcode))) {
+        gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]);
+    }
 }
 
 static void gen_nego(DisasContext *ctx)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH v4 15/15] target/ppc: add mcrxrx instruction
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (13 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 14/15] target/ppc: use tcg ops for neg instruction Nikunj A Dadhania
@ 2017-02-23 19:56 ` Nikunj A Dadhania
  2017-02-24  5:02 ` [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 David Gibson
  15 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-23 19:56 UTC (permalink / raw)
  To: qemu-ppc, david, rth; +Cc: qemu-devel, bharata, nikunj

mcrxrx: Move to CR from XER Extended

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target/ppc/translate.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 5af9667..f4e41e5 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -3886,6 +3886,32 @@ static void gen_mcrxr(DisasContext *ctx)
     tcg_temp_free(t0);
 }
 
+#ifdef TARGET_PPC64
+/* mcrxrx */
+static void gen_mcrxrx(DisasContext *ctx)
+{
+    TCGv t0 = tcg_temp_new();
+    TCGv t1 = tcg_temp_new();
+    TCGv_i32 dst = cpu_crf[crfD(ctx->opcode)];
+
+    /* copy OV and OV32 */
+    tcg_gen_extract_tl(t0, cpu_xer, XER_OV_BIT, 1);
+    tcg_gen_extract_tl(t1, cpu_xer, XER_OV32_BIT, 1);
+    tcg_gen_shli_tl(t0, t0, 1);
+    tcg_gen_or_tl(t0, t0, t1);
+    tcg_gen_shli_tl(t0, t0, 1);
+    /* copy CA and CA32 */
+    tcg_gen_extract_tl(t1, cpu_xer, XER_CA_BIT, 1);
+    tcg_gen_or_tl(t0, t0, t1);
+    tcg_gen_shli_tl(t0, t0, 1);
+    tcg_gen_extract_tl(t1, cpu_xer, XER_CA32_BIT, 1);
+    tcg_gen_or_tl(t0, t0, t1);
+    tcg_gen_trunc_tl_i32(dst, t0);
+    tcg_temp_free(t0);
+    tcg_temp_free(t1);
+}
+#endif
+
 /* mfcr mfocrf */
 static void gen_mfcr(DisasContext *ctx)
 {
@@ -6584,6 +6610,7 @@ GEN_HANDLER(mtcrf, 0x1F, 0x10, 0x04, 0x00000801, PPC_MISC),
 #if defined(TARGET_PPC64)
 GEN_HANDLER(mtmsrd, 0x1F, 0x12, 0x05, 0x001EF801, PPC_64B),
 GEN_HANDLER_E(setb, 0x1F, 0x00, 0x04, 0x0003F801, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(mcrxrx, 0x1F, 0x00, 0x12, 0x007FF801, PPC_NONE, PPC2_ISA300),
 #endif
 GEN_HANDLER(mtmsr, 0x1F, 0x12, 0x04, 0x001EF801, PPC_MISC),
 GEN_HANDLER(mtspr, 0x1F, 0x13, 0x0E, 0x00000000, PPC_MISC),
-- 
2.7.4
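
The shift-and-OR sequence in gen_mcrxrx() builds a 4-bit CR field in the
order OV, OV32, CA, CA32. A plain C sketch of that packing follows. The
`*_POS` constants and the function names are hypothetical; the positions
(counted from the least-significant bit) are assumed to match the
XER_*_BIT values used by the patch.

```c
#include <stdint.h>

/* Assumed XER bit positions, counted from the least-significant bit. */
enum {
    XER_OV_POS   = 30,
    XER_CA_POS   = 29,
    XER_OV32_POS = 19,
    XER_CA32_POS = 18
};

/* Extract a single bit of XER as 0 or 1. */
static uint32_t xer_bit(uint64_t xer, unsigned pos)
{
    return (uint32_t)((xer >> pos) & 1);
}

/* Pack OV, OV32, CA, CA32 into a CR field, most-significant bit first. */
static uint32_t mcrxrx_field(uint64_t xer)
{
    return (xer_bit(xer, XER_OV_POS)   << 3) |
           (xer_bit(xer, XER_OV32_POS) << 2) |
           (xer_bit(xer, XER_CA_POS)   << 1) |
            xer_bit(xer, XER_CA32_POS);
}
```

Note that SO is not part of the copied field in this sketch, so an XER
value with only SO set yields 0.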

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 02/15] target/ppc: update ov flag from remaining paths
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 02/15] target/ppc: update ov flag from remaining paths Nikunj A Dadhania
@ 2017-02-23 20:23   ` Richard Henderson
  2017-02-24  0:45     ` Nikunj A Dadhania
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2017-02-23 20:23 UTC (permalink / raw)
  To: Nikunj A Dadhania, qemu-ppc, david; +Cc: qemu-devel, bharata

On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
> @@ -320,22 +320,24 @@ target_ulong helper_divo(CPUPPCState *env, target_ulong arg1,
>                           target_ulong arg2)
>  {
>      uint64_t tmp = (uint64_t)arg1 << 32 | env->spr[SPR_MQ];
> +    int ov;
>
>      if (((int32_t)tmp == INT32_MIN && (int32_t)arg2 == (int32_t)-1) ||
>          (int32_t)arg2 == 0) {
> -        env->so = env->ov = 1;
> +        ov = 1;
>          env->spr[SPR_MQ] = 0;
>          return INT32_MIN;
>      } else {
>          env->spr[SPR_MQ] = tmp % arg2;
>          tmp /= (int32_t)arg2;
>          if ((int32_t)tmp != tmp) {
> -            env->so = env->ov = 1;
> +            ov = 1;
>          } else {
> -            env->ov = 0;
> +            ov = 0;
>          }
>          return tmp;
>      }
> +    helper_update_ov_legacy(env, ov);
>  }
>

You're attempting to run the helper after "return".
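
One way to avoid the problem is to give the function a single exit, so the
flag update cannot be skipped. The sketch below is a standalone model, not
the QEMU helper: `cpu_model`, `update_ov_model` and `divo_model` are
hypothetical names, and the arithmetic is a simplified signed divide rather
than the exact expressions in helper_divo().

```c
#include <stdint.h>

/* Hypothetical stand-in for the CPU state touched by the helper. */
struct cpu_model {
    uint32_t mq; /* models env->spr[SPR_MQ] */
    int so;      /* sticky summary overflow */
    int ov;      /* overflow of the last operation */
};

/* Models a legacy OV update: OV is replaced, SO only accumulates. */
static void update_ov_model(struct cpu_model *env, int ov)
{
    env->ov = ov;
    env->so |= ov;
}

/* Divide (hi:MQ) by divisor; every path reaches the flag update. */
static int32_t divo_model(struct cpu_model *env, int32_t hi, int32_t divisor)
{
    int64_t dividend =
        (int64_t)(((uint64_t)(uint32_t)hi << 32) | env->mq);
    int32_t result;
    int ov;

    if (divisor == 0 || (dividend == INT64_MIN && divisor == -1)) {
        ov = 1;
        env->mq = 0;
        result = INT32_MIN;
    } else {
        int64_t quot = dividend / divisor;
        int64_t rem = dividend % divisor;

        /* Overflow if the quotient does not fit in 32 bits. */
        ov = (quot < INT32_MIN || quot > INT32_MAX);
        env->mq = (uint32_t)rem;
        result = (int32_t)(uint32_t)quot;
    }

    /* Single exit: update the flags first, then return. */
    update_ov_model(env, ov);
    return result;
}
```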


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca) Nikunj A Dadhania
@ 2017-02-23 20:26   ` Richard Henderson
  2017-02-24  0:48     ` Nikunj A Dadhania
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2017-02-23 20:26 UTC (permalink / raw)
  To: Nikunj A Dadhania, qemu-ppc, david; +Cc: qemu-devel, bharata

On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
> Now get rid all the split out variables so, ca, ov. After this patch,
> all the bits are stored in CPUPPCState::xer at appropriate places.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
>  target/ppc/cpu.c        |   8 +---
>  target/ppc/cpu.h        |  26 ++++++------
>  target/ppc/int_helper.c |  12 +++---
>  target/ppc/translate.c  | 106 +++++++++++++++++++++++++-----------------------
>  4 files changed, 78 insertions(+), 74 deletions(-)

I do not think this is a good direction to take this.


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 02/15] target/ppc: update ov flag from remaining paths
  2017-02-23 20:23   ` Richard Henderson
@ 2017-02-24  0:45     ` Nikunj A Dadhania
  0 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-24  0:45 UTC (permalink / raw)
  To: Richard Henderson, qemu-ppc, david; +Cc: qemu-devel, bharata

Richard Henderson <rth@twiddle.net> writes:

> On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
>> @@ -320,22 +320,24 @@ target_ulong helper_divo(CPUPPCState *env, target_ulong arg1,
>>                           target_ulong arg2)
>>  {
>>      uint64_t tmp = (uint64_t)arg1 << 32 | env->spr[SPR_MQ];
>> +    int ov;
>>
>>      if (((int32_t)tmp == INT32_MIN && (int32_t)arg2 == (int32_t)-1) ||
>>          (int32_t)arg2 == 0) {
>> -        env->so = env->ov = 1;
>> +        ov = 1;
>>          env->spr[SPR_MQ] = 0;
>>          return INT32_MIN;
>>      } else {
>>          env->spr[SPR_MQ] = tmp % arg2;
>>          tmp /= (int32_t)arg2;
>>          if ((int32_t)tmp != tmp) {
>> -            env->so = env->ov = 1;
>> +            ov = 1;
>>          } else {
>> -            env->ov = 0;
>> +            ov = 0;
>>          }
>>          return tmp;
>>      }
>> +    helper_update_ov_legacy(env, ov);
>>  }
>>
>
> You're attempting to run the helper after "return".

Right, will correct it.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca)
  2017-02-23 20:26   ` Richard Henderson
@ 2017-02-24  0:48     ` Nikunj A Dadhania
  2017-02-24  2:58       ` David Gibson
  0 siblings, 1 reply; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-24  0:48 UTC (permalink / raw)
  To: Richard Henderson, qemu-ppc, david; +Cc: qemu-devel, bharata

Richard Henderson <rth@twiddle.net> writes:

> On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
>> Now get rid all the split out variables so, ca, ov. After this patch,
>> all the bits are stored in CPUPPCState::xer at appropriate places.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>> ---
>>  target/ppc/cpu.c        |   8 +---
>>  target/ppc/cpu.h        |  26 ++++++------
>>  target/ppc/int_helper.c |  12 +++---
>>  target/ppc/translate.c  | 106 +++++++++++++++++++++++++-----------------------
>>  4 files changed, 78 insertions(+), 74 deletions(-)
>
> I do not think this is a good direction to take this.

Hmm, any particular reason?

I can send back the v3 with suggested changes dropping the xer split out
changes.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca)
  2017-02-24  0:48     ` Nikunj A Dadhania
@ 2017-02-24  2:58       ` David Gibson
  2017-02-24  6:41         ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: David Gibson @ 2017-02-24  2:58 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: Richard Henderson, qemu-ppc, qemu-devel, bharata

[-- Attachment #1: Type: text/plain, Size: 1313 bytes --]

On Fri, Feb 24, 2017 at 06:18:22AM +0530, Nikunj A Dadhania wrote:
> Richard Henderson <rth@twiddle.net> writes:
> 
> > On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
> >> Now get rid all the split out variables so, ca, ov. After this patch,
> >> all the bits are stored in CPUPPCState::xer at appropriate places.
> >>
> >> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> >> ---
> >>  target/ppc/cpu.c        |   8 +---
> >>  target/ppc/cpu.h        |  26 ++++++------
> >>  target/ppc/int_helper.c |  12 +++---
> >>  target/ppc/translate.c  | 106 +++++++++++++++++++++++++-----------------------
> >>  4 files changed, 78 insertions(+), 74 deletions(-)
> >
> > I do not think this is a good direction to take this.
> 
> Hmm, any particular reason?

Right, I suggested this, but based only on a suspicion that the split
variables weren't worth the complexity.  I'm happy to be corrected by
someone with better knowledge of TCG, but it'd be nice to know why.

> I can send back the v3 with suggested changes dropping the xer split out
> changes.
> 
> Regards
> Nikunj
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15
  2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
                   ` (14 preceding siblings ...)
  2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 15/15] target/ppc: add mcrxrx instruction Nikunj A Dadhania
@ 2017-02-24  5:02 ` David Gibson
  2017-02-24  5:53   ` Nikunj A Dadhania
  15 siblings, 1 reply; 27+ messages in thread
From: David Gibson @ 2017-02-24  5:02 UTC (permalink / raw)
  To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, bharata

[-- Attachment #1: Type: text/plain, Size: 2603 bytes --]

On Fri, Feb 24, 2017 at 01:26:25AM +0530, Nikunj A Dadhania wrote:
> Patches:
> 01-06  Cleans up the XER split out variables and now the 
>        flag bits are stored in XER at their respective places 
> 
> 07-14  Contains implentation of CA32 and OV32 bits added to the 
>        ISA 3.0. Various fixed-point arithmetic instructions are 
>        updated to take care of the newer flags.
>  
> 15     Finally the last patch adds new instruction mcrxrx, that helps
>        reading the carry (CA and CA32) and the overflow (OV and OV32) flags

I've applied 1/15 and left the rest pending correction of 2/15 and
discussion of the remaining patches.

> 
> 
> Booted the POWER8 guest fine, needs more testing as changes are 
> intrusive in nature.
> 
> Changelog:
> v3:
> * Get rid of cpu_ca, cpu_ov, cpu_so split out variables
> * As most of the patches under went changes, dropped the 
>   reviewed-bys(except neg[.] patch)
> 
> v2: 
> * Add missing condition in narrow mode(add/subf), multiply and divide
> * Drop nego patch, subf implementation is sufficient for setting OV and OV32
> * Retaining neg[.], as the code is simplified.
> * Fix OV resetting in compute_ov()
> 
> v1: 
> * Use these ISA 3.0 flag to enable CA32 and OV32
> * Re-write ca32 compute routine
> * Add setting of flags for "neg." and "nego."
> 
> Nikunj A Dadhania (15):
>   target/ppc: introduce helper_update_ov_legacy
>   target/ppc: update ov flag from remaining paths
>   target/ppc: introduce helper_update_ca_legacy
>   target/ppc: add gen_op_update_ca_legacy() helper
>   target/ppc: add gen_op_update_ov_legacy() helper
>   target/ppc: remove xer split-out flags(so, ov, ca)
>   target/ppc: support for 32-bit carry and overflow
>   target/ppc: update ca32 in arithmetic add
>   target/ppc: update ca32 in arithmetic substract
>   target/ppc: add gen_op_update_ov_isa300()
>   target/ppc: update OV/OV32 for mull[d,w] insns
>   target/ppc: update OV/OV32 for divide operations
>   target/ppc: update OV/OV32 flags for add/sub
>   target/ppc: use tcg ops for neg instruction
>   target/ppc: add mcrxrx instruction
> 
>  target/ppc/cpu.c        |   8 +-
>  target/ppc/cpu.h        |  33 ++--
>  target/ppc/int_helper.c |  90 ++++++-----
>  target/ppc/translate.c  | 396 +++++++++++++++++++++++++++++++++++-------------
>  4 files changed, 371 insertions(+), 156 deletions(-)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15
  2017-02-24  5:02 ` [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 David Gibson
@ 2017-02-24  5:53   ` Nikunj A Dadhania
  0 siblings, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-24  5:53 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, rth, qemu-devel, bharata

David Gibson <david@gibson.dropbear.id.au> writes:

> [ Unknown signature status ]
> On Fri, Feb 24, 2017 at 01:26:25AM +0530, Nikunj A Dadhania wrote:
>> Patches:
>> 01-06  Cleans up the XER split out variables and now the 
>>        flag bits are stored in XER at their respective places 
>> 
>> 07-14  Contains implentation of CA32 and OV32 bits added to the 
>>        ISA 3.0. Various fixed-point arithmetic instructions are 
>>        updated to take care of the newer flags.
>>  
>> 15     Finally the last patch adds new instruction mcrxrx, that helps
>>        reading the carry (CA and CA32) and the overflow (OV and OV32) flags
>
> I've applied 1/15 and left the rest pending correction of 2/15 and
> discussion of the remaining patches.

I decided to go back to the previous implementation and have posted v5 :-)

/me needs to slow down !

Regards,
Nikunj

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca)
  2017-02-24  2:58       ` David Gibson
@ 2017-02-24  6:41         ` Richard Henderson
  2017-02-24  7:05           ` Nikunj A Dadhania
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2017-02-24  6:41 UTC (permalink / raw)
  To: David Gibson, Nikunj A Dadhania; +Cc: qemu-ppc, qemu-devel, bharata

On 02/24/2017 01:58 PM, David Gibson wrote:
> On Fri, Feb 24, 2017 at 06:18:22AM +0530, Nikunj A Dadhania wrote:
>> Richard Henderson <rth@twiddle.net> writes:
>>
>>> On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
>>>> Now get rid all the split out variables so, ca, ov. After this patch,
>>>> all the bits are stored in CPUPPCState::xer at appropriate places.
>>>>
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>>>> ---
>>>>  target/ppc/cpu.c        |   8 +---
>>>>  target/ppc/cpu.h        |  26 ++++++------
>>>>  target/ppc/int_helper.c |  12 +++---
>>>>  target/ppc/translate.c  | 106 +++++++++++++++++++++++++-----------------------
>>>>  4 files changed, 78 insertions(+), 74 deletions(-)
>>>
>>> I do not think this is a good direction to take this.
>>
>> Hmm, any particular reason?
>
> Right, I suggested this, but based only on a suspicion that the split
> variables weren't worth the complexity.  I'm happy to be corrected by
> someone with better knowledge of TCG, but it'd be nice to know why.

Normally we're interested in minimizing the size of the generated code, 
delaying computation until we can show it being used.

Now, ppc is a bit different from other targets (which might compute overflow 
for any addition insn) in that it only computes overflow when someone asks for 
it.  Moreover, it's fairly rare for the addo/subo/nego instructions to be used.

Therefore, I'm not 100% sure what the "best" solution is.  However, I'd be 
surprised if the least amount of code places all of the bits into their 
canonical location within XER.

Do note that when looking at this, the various methods by which the OV/SO bits 
are copied to CR flags ought to be taken into account.


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca)
  2017-02-24  6:41         ` Richard Henderson
@ 2017-02-24  7:05           ` Nikunj A Dadhania
  2017-02-24  7:12             ` [Qemu-devel] [Qemu-ppc] " Nikunj A Dadhania
  2017-02-25  2:03             ` [Qemu-devel] " Richard Henderson
  0 siblings, 2 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-24  7:05 UTC (permalink / raw)
  To: Richard Henderson, David Gibson; +Cc: qemu-ppc, qemu-devel, bharata

Richard Henderson <rth@twiddle.net> writes:

> On 02/24/2017 01:58 PM, David Gibson wrote:
>> On Fri, Feb 24, 2017 at 06:18:22AM +0530, Nikunj A Dadhania wrote:
>>> Richard Henderson <rth@twiddle.net> writes:
>>>
>>>> On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
>>>>> Now get rid all the split out variables so, ca, ov. After this patch,
>>>>> all the bits are stored in CPUPPCState::xer at appropriate places.
>>>>>
>>>>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>>>>> ---
>>>>>  target/ppc/cpu.c        |   8 +---
>>>>>  target/ppc/cpu.h        |  26 ++++++------
>>>>>  target/ppc/int_helper.c |  12 +++---
>>>>>  target/ppc/translate.c  | 106 +++++++++++++++++++++++++-----------------------
>>>>>  4 files changed, 78 insertions(+), 74 deletions(-)
>>>>
>>>> I do not think this is a good direction to take this.
>>>
>>> Hmm, any particular reason?
>>
>> Right, I suggested this, but based only on a suspicion that the split
>> variables weren't worth the complexity.  I'm happy to be corrected by
>> someone with better knowledge of TCG, but it'd be nice to know why.
>
> Normally we're interested in minimizing the size of the generated code, 
> delaying computation until we can show it being used.
>
> Now, ppc is a bit different from other targets (which might compute overflow 
> for any addition insn) in that it only computes overflow when someone asks for 
> it.  Moreover, it's fairly rare for the addo/subo/nego instructions to
> be used.

> Therefore, I'm not 100% sure what the "best" solution is.

Agreed. With that logic, won't it be more efficient to move the OV/CA
updating to the respective callers, so that an xer_read/write is just
one TCG op?


> However, I'd be surprised if the least amount of code places all of
> the bits into their canonical location within XER.
>
> Do note that when looking at this, the various methods by which the OV/SO bits 
> are copied to CR flags ought to be taken into account.

I lost you in the last two paragraphs; can you explain in detail?

Regards
Nikunj

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca)
  2017-02-24  7:05           ` Nikunj A Dadhania
@ 2017-02-24  7:12             ` Nikunj A Dadhania
  2017-02-25  2:03             ` [Qemu-devel] " Richard Henderson
  1 sibling, 0 replies; 27+ messages in thread
From: Nikunj A Dadhania @ 2017-02-24  7:12 UTC (permalink / raw)
  To: Richard Henderson, David Gibson; +Cc: qemu-ppc, qemu-devel, bharata

Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> writes:

> Richard Henderson <rth@twiddle.net> writes:
>
>> On 02/24/2017 01:58 PM, David Gibson wrote:
>>> On Fri, Feb 24, 2017 at 06:18:22AM +0530, Nikunj A Dadhania wrote:
>>>> Richard Henderson <rth@twiddle.net> writes:
>>>>
>>>>> On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
>>>>>> Now get rid all the split out variables so, ca, ov. After this patch,
>>>>>> all the bits are stored in CPUPPCState::xer at appropriate places.
>>>>>>
>>>>>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>  target/ppc/cpu.c        |   8 +---
>>>>>>  target/ppc/cpu.h        |  26 ++++++------
>>>>>>  target/ppc/int_helper.c |  12 +++---
>>>>>>  target/ppc/translate.c  | 106 +++++++++++++++++++++++++-----------------------
>>>>>>  4 files changed, 78 insertions(+), 74 deletions(-)
>>>>>
>>>>> I do not think this is a good direction to take this.
>>>>
>>>> Hmm, any particular reason?
>>>
>>> Right, I suggested this, but based only on a suspicion that the split
>>> variables weren't worth the complexity.  I'm happy to be corrected by
>>> someone with better knowledge of TCG, but it'd be nice to know why.
>>
>> Normally we're interested in minimizing the size of the generated code, 
>> delaying computation until we can show it being used.
>>
>> Now, ppc is a bit different from other targets (which might compute overflow 
>> for any addition insn) in that it only computes overflow when someone asks for 
>> it.  Moreover, it's fairly rare for the addo/subo/nego instructions to
>> be used.
>
>> Therefore, I'm not 100% sure what the "best" solution is.
>
> Agreed. With that logic, won't it be more efficient to move the OV/CA
> updating to the respective callers, so that an xer_read/write is just
> one TCG op?

BTW, I haven't seen any remarkable difference in boot time after this
change.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca)
  2017-02-24  7:05           ` Nikunj A Dadhania
  2017-02-24  7:12             ` [Qemu-devel] [Qemu-ppc] " Nikunj A Dadhania
@ 2017-02-25  2:03             ` Richard Henderson
  1 sibling, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2017-02-25  2:03 UTC (permalink / raw)
  To: Nikunj A Dadhania, David Gibson; +Cc: qemu-ppc, qemu-devel, bharata

On 02/24/2017 06:05 PM, Nikunj A Dadhania wrote:
> Richard Henderson <rth@twiddle.net> writes:
>
>> On 02/24/2017 01:58 PM, David Gibson wrote:
>>> On Fri, Feb 24, 2017 at 06:18:22AM +0530, Nikunj A Dadhania wrote:
>>>> Richard Henderson <rth@twiddle.net> writes:
>>>>
>>>>> On 02/24/2017 06:56 AM, Nikunj A Dadhania wrote:
>>>>>> Now get rid all the split out variables so, ca, ov. After this patch,
>>>>>> all the bits are stored in CPUPPCState::xer at appropriate places.
>>>>>>
>>>>>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>  target/ppc/cpu.c        |   8 +---
>>>>>>  target/ppc/cpu.h        |  26 ++++++------
>>>>>>  target/ppc/int_helper.c |  12 +++---
>>>>>>  target/ppc/translate.c  | 106 +++++++++++++++++++++++++-----------------------
>>>>>>  4 files changed, 78 insertions(+), 74 deletions(-)
>>>>>
>>>>> I do not think this is a good direction to take this.
>>>>
>>>> Hmm, any particular reason?
>>>
>>> Right, I suggested this, but based only on a suspicion that the split
>>> variables weren't worth the complexity.  I'm happy to be corrected by
>>> someone with better knowledge of TCG, but it'd be nice to know why.
>>
>> Normally we're interested in minimizing the size of the generated code,
>> delaying computation until we can show it being used.
>>
>> Now, ppc is a bit different from other targets (which might compute overflow
>> for any addition insn) in that it only computes overflow when someone asks for
>> it.  Moreover, it's fairly rare for the addo/subo/nego instructions to
>> be used.
>
>> Therefore, I'm not 100% sure what the "best" solution is.
>
> Agreed. With that logic, won't it be more efficient to move the OV/CA
> updating to the respective callers, so that an xer_read/write is just
> one TCG op?
>
>
>> However, I'd be surprised if the least amount of code places all of
>> the bits into their canonical location within XER.
>>
>> Do note that when looking at this, the various methods by which the OV/SO bits
>> are copied to CR flags ought to be taken into account.
>
> I lost you in the last two paragraphs; can you explain in detail?

Reading XER via MFSPR is not the only way to access the CA/OV/SO bits.  One may 
use the "dot" form of the instruction to copy SO to CR0[3].  One may use the 
MCRXRX instruction to copy 4 bits from XER to CR[BF].  One may use the add/sub 
extended instructions to access the CA bit.

Therefore it is not a foregone conclusion that a read of XER will *ever* occur, 
and therefore it is not necessarily most efficient to keep the CA/OV/SO bits in 
the canonical XER form.

I think it's especially important to keep CA separate in order to facilitate 
multi-word addition chains.
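
As a rough illustration of such a chain, here is a C model of a two-word
(128-bit) addition in which the carry out of the low word feeds the high
word, in the way an addc/adde pair would use CA. The `u128` type and
`add128` function are hypothetical names made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical two-word integer: lo is the least-significant word. */
struct u128 {
    uint64_t lo;
    uint64_t hi;
};

static struct u128 add128(struct u128 a, struct u128 b)
{
    struct u128 r;
    uint64_t carry;

    /* First word: plain add, carry out is the unsigned wrap-around. */
    r.lo = a.lo + b.lo;
    carry = r.lo < a.lo;

    /* Second word: add with carry in, consuming the value just produced. */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

The carry here lives in its own variable between the two additions, which
is the access pattern that a separately stored CA is meant to keep cheap.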

I suspect that it's most efficient to keep SO in a form that best simplifies 
"dot" instructions, e.g. since there is no un-dotted "andi" instruction. 
Naturally, the form in which you store SO is going to influence how you store OV.

The other thing that is desirable is to allow the TCG optimizer to delete 
computations that are dead.  That cannot be done if you're constantly folding 
results back into a single XER register.

Consider a sequence like

	li	r0, 0
	mtspr	xer, r0
	addo	r3, r4, r5
	addo.	r6, r7, r8

where we clear XER (and thus SO), perform two computations, and then read SO 
via the dot.  Obviously the two computations of OV are not dead, because they 
get ORed into SO.  However, the first computation of OV32 is dead, shadowed by 
the second, because there is no accumulating SO32 bit.
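
A small C model of the split form may make the point concrete. It is only
a sketch under assumptions: `split_flags`, `record_ov` and `fold_xer` are
hypothetical names, and the bit positions (SO 31, OV 30, OV32 19, counted
from the least-significant bit) are assumed rather than taken from this
thread.

```c
#include <stdint.h>

/* Flags kept as separate values instead of inside one XER word. */
struct split_flags {
    unsigned so;
    unsigned ov;
    unsigned ov32;
};

/* What an overflow-recording instruction does in the split form. */
static void record_ov(struct split_flags *f, unsigned ov, unsigned ov32)
{
    f->ov = ov;     /* replaced every time */
    f->ov32 = ov32; /* replaced every time: an earlier value can be dead */
    f->so |= ov;    /* sticky: every OV computation feeds SO */
}

/* Assemble the XER bits only when somebody actually reads XER. */
static uint64_t fold_xer(const struct split_flags *f)
{
    return ((uint64_t)f->so << 31) |
           ((uint64_t)f->ov << 30) |
           ((uint64_t)f->ov32 << 19);
}
```

With two record_ov() calls in a row, the first ov32 store is overwritten
before anything reads it, whereas both ov values reach so. An optimizer can
only drop the first ov32 computation if it is a separate value like this.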


r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-02-25  2:03 UTC | newest]

Thread overview: 27+ messages
2017-02-23 19:56 [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 01/15] target/ppc: introduce helper_update_ov_legacy Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 02/15] target/ppc: update ov flag from remaining paths Nikunj A Dadhania
2017-02-23 20:23   ` Richard Henderson
2017-02-24  0:45     ` Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 03/15] target/ppc: introduce helper_update_ca_legacy Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 04/15] target/ppc: add gen_op_update_ca_legacy() helper Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 05/15] target/ppc: add gen_op_update_ov_legacy() helper Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 06/15] target/ppc: remove xer split-out flags(so, ov, ca) Nikunj A Dadhania
2017-02-23 20:26   ` Richard Henderson
2017-02-24  0:48     ` Nikunj A Dadhania
2017-02-24  2:58       ` David Gibson
2017-02-24  6:41         ` Richard Henderson
2017-02-24  7:05           ` Nikunj A Dadhania
2017-02-24  7:12             ` [Qemu-devel] [Qemu-ppc] " Nikunj A Dadhania
2017-02-25  2:03             ` [Qemu-devel] " Richard Henderson
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 07/15] target/ppc: support for 32-bit carry and overflow Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 08/15] target/ppc: update ca32 in arithmetic add Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 09/15] target/ppc: update ca32 in arithmetic substract Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 10/15] target/ppc: add gen_op_update_ov_isa300() Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 11/15] target/ppc: update OV/OV32 for mull[d, w] insns Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 12/15] target/ppc: update OV/OV32 for divide operations Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 13/15] target/ppc: update OV/OV32 flags for add/sub Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 14/15] target/ppc: use tcg ops for neg instruction Nikunj A Dadhania
2017-02-23 19:56 ` [Qemu-devel] [PATCH v4 15/15] target/ppc: add mcrxrx instruction Nikunj A Dadhania
2017-02-24  5:02 ` [Qemu-devel] [PATCH v4 00/15] POWER9 TCG enablements - part15 David Gibson
2017-02-24  5:53   ` Nikunj A Dadhania
