[Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8
@ 2014-01-07 16:05 Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 01/22] target-ppc: Add ISA2.06 bpermd Instruction Tom Musta
                   ` (22 more replies)
  0 siblings, 23 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

The QEMU emulation models for Power7 and Power8 are still missing some
of the base instructions that were introduced in Power ISA 2.06 and
even a few that were introduced prior to that.

This patch series gets these models caught up with respect to the
base 2.06 ISA.  That is, the Book I and Book II instructions for 
the branch, fixed point and floating point units will be completed
with this series.  Decimal floating point is not addressed in this
series, nor are the base ISA 2.07 changes for the Power8 model.

In some cases, existing instructions are re-implemented using common
macros, thus eliminating some redundant code.  The patch series
eliminates almost half as much code as it adds.

Additionally, some bugs in the common floating point library (softfloat)
are fixed.

V2: Addressing comments from Richard Henderson.  Fixed corner case bug
in divweu.

V3: Addressing comments from Peter Maydell.  softloat changes have been
moved back to the VSX stage 3 patch series.

V4: 
  - Split the single flag (PPC2_ISA206) into multiple flags per request from
    Alex Graf.
  - Recoded divwe[u] instructions as a helper.
  - Correct tcg-debug compilation errors.
  - Corrected fcfid[u]s helpers to avoid double rounding.
  - Corrected frin to use "nearest ties away" rounding mode instead of 
    "nearest ties to even".
  - Updated (new) Power7+ model to include all newly added instruction flags.
  - Assorted comments from Richard Henderson, Peter Maydell and Scott Wood.

Tom Musta (22):
  target-ppc: Add ISA2.06 bpermd Instruction
  target-ppc: Add Flag for ISA2.06 Divide Extended Instructions
  target-ppc: Add ISA2.06 divdeu[o] Instructions
  target-ppc: Add ISA2.06 divde[o] Instructions
  target-ppc: Add ISA 2.06 divweu[o] Instructions
  target-ppc: Add ISA 2.06 divwe[o] Instructions
  target-ppc: Add Flag for ISA2.06 Atomic Instructions
  target-ppc: Add ISA2.06 lbarx, lharx Instructions
  target-ppc: Add ISA 2.06 stbcx. and sthcx. Instructions
  target-ppc: Add Flag for ISA V2.06 Floating Point Conversion
  target-ppc: Add ISA2.06 Float to Integer Instructions
  target-ppc: Add ISA 2.06 fcfid[u][s] Instructions
  softfloat: Fix exception flag handling for float32_to_float16()
  softfloat: Factor out RoundAndPackFloat16 and
    NormalizeFloat16Subnormal
  softfloat: Refactor code handling various rounding modes
  softfloat: Add support for ties-away rounding
  target-ppc: Fix and enable fri[mnpz]
  target-ppc: Add Flag for Power ISA V2.06 Floating Point Test
    Instructions
  target-ppc: Add ISA 2.06 ftdiv Instruction
  target-ppc: Add ISA 2.06 ftsqrt
  target-ppc: Enable frsqrtes on Power7 and Power8
  target-ppc: Add ISA2.06 lfiwzx Instruction

 fpu/softfloat.c             |  657 ++++++++++++++++++++++++++++---------------
 include/fpu/softfloat.h     |    3 +-
 include/qemu/host-utils.h   |   28 ++
 target-ppc/cpu.h            |   15 +-
 target-ppc/fpu_helper.c     |  255 +++++++++--------
 target-ppc/helper.h         |   15 +
 target-ppc/int_helper.c     |  133 +++++++++
 target-ppc/translate.c      |  243 +++++++++++-----
 target-ppc/translate_init.c |   23 ++-
 util/host-utils.c           |   75 +++++
 10 files changed, 1025 insertions(+), 422 deletions(-)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 01/22] target-ppc: Add ISA2.06 bpermd Instruction
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 02/22] target-ppc: Add Flag for ISA2.06 Divide Extended Instructions Tom Musta
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Bit Permute Doubleword (bpermd) instruction,
which was introduced in Power ISA 2.06 as part of the base 64-bit
architecture.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
V2: Addressing stylistic comments from Richard Henderson.

V4: Use newly added PPC2_PERM_ISA206 flag. Eliminated use of extraneous
env parameter in helper (per Richard Henderson's review).  Changed "1ul" to
"1ull" per Scott Wood's review.  Added flag to Power7+ and e5500 models.

 target-ppc/cpu.h            |    4 +++-
 target-ppc/helper.h         |    1 +
 target-ppc/int_helper.c     |   20 ++++++++++++++++++++
 target-ppc/translate.c      |   10 ++++++++++
 target-ppc/translate_init.c |   11 +++++++----
 5 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 0abc848..0b330fd 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1877,9 +1877,11 @@ enum {
     PPC2_ISA205        = 0x0000000000000020ULL,
     /* VSX additions in ISA 2.07                                             */
     PPC2_VSX207        = 0x0000000000000040ULL,
+    /* ISA 2.06B bpermd                                                      */
+    PPC2_PERM_ISA206   = 0x0000000000000080ULL,
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
-                        PPC2_ISA205 | PPC2_VSX207)
+                        PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206)
 };
 
 /*****************************************************************************/
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 6250eba..a42aa61 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -41,6 +41,7 @@ DEF_HELPER_3(sraw, tl, env, tl, tl)
 #if defined(TARGET_PPC64)
 DEF_HELPER_FLAGS_1(cntlzd, TCG_CALL_NO_RWG_SE, tl, tl)
 DEF_HELPER_FLAGS_1(popcntd, TCG_CALL_NO_RWG_SE, tl, tl)
+DEF_HELPER_FLAGS_2(bpermd, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_3(srad, tl, env, tl, tl)
 #endif
 
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index e50bdd2..0e7afb3 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -53,6 +53,26 @@ target_ulong helper_cntlzd(target_ulong t)
 }
 #endif
 
+#if defined(TARGET_PPC64)
+
+uint64_t helper_bpermd(uint64_t rs, uint64_t rb)
+{
+    int i;
+    uint64_t ra = 0;
+
+    for (i = 0; i < 8; i++) {
+        int index = (rs >> (i*8)) & 0xFF;
+        if (index < 64) {
+            if (rb & (1ull << (63-index))) {
+                ra |= 1 << i;
+            }
+        }
+    }
+    return ra;
+}
+
+#endif
+
 target_ulong helper_cmpb(target_ulong rs, target_ulong rb)
 {
     target_ulong mask = 0xff;
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index e2dd272..491c585 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1525,6 +1525,15 @@ static void gen_prtyd(DisasContext *ctx)
 #endif
 
 #if defined(TARGET_PPC64)
+/* bpermd */
+static void gen_bpermd(DisasContext *ctx)
+{
+    gen_helper_bpermd(cpu_gpr[rA(ctx->opcode)],
+                      cpu_gpr[rS(ctx->opcode)], cpu_gpr[rB(ctx->opcode)]);
+}
+#endif
+
+#if defined(TARGET_PPC64)
 /* extsw & extsw. */
 GEN_LOGICAL1(extsw, tcg_gen_ext32s_tl, 0x1E, PPC_64B);
 
@@ -9338,6 +9347,7 @@ GEN_HANDLER_E(prtyw, 0x1F, 0x1A, 0x04, 0x0000F801, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER(popcntd, 0x1F, 0x1A, 0x0F, 0x0000F801, PPC_POPCNTWD),
 GEN_HANDLER(cntlzd, 0x1F, 0x1A, 0x01, 0x00000000, PPC_64B),
 GEN_HANDLER_E(prtyd, 0x1F, 0x1A, 0x05, 0x0000F801, PPC_NONE, PPC2_ISA205),
+GEN_HANDLER_E(bpermd, 0x1F, 0x1C, 0x07, 0x00000001, PPC_NONE, PPC2_PERM_ISA206),
 #endif
 GEN_HANDLER(rlwimi, 0x14, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(rlwinm, 0x15, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index dd57df3..823e40d 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -4720,7 +4720,7 @@ POWERPC_FAMILY(e5500)(ObjectClass *oc, void *data)
                        PPC_FLOAT_STFIWX | PPC_WAIT |
                        PPC_MEM_TLBSYNC | PPC_TLBIVAX | PPC_MEM_SYNC |
                        PPC_64B | PPC_POPCNTB | PPC_POPCNTWD;
-    pcc->insns_flags2 = PPC2_BOOKE206 | PPC2_PRCNTL;
+    pcc->insns_flags2 = PPC2_BOOKE206 | PPC2_PRCNTL | PPC2_PERM_ISA206;
     pcc->msr_mask = 0x000000009402FB36ULL;
     pcc->mmu_model = POWERPC_MMU_BOOKE206;
     pcc->excp_model = POWERPC_EXCP_BOOKE;
@@ -7236,7 +7236,8 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
                        PPC_64B | PPC_ALTIVEC |
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
-    pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205;
+    pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
+                        PPC2_PERM_ISA206;
     pcc->msr_mask = 0x800000000284FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7274,7 +7275,8 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
                        PPC_64B | PPC_ALTIVEC |
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
-    pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205;
+    pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
+                        PPC2_PERM_ISA206;
     pcc->msr_mask = 0x800000000204FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7312,7 +7314,8 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
                        PPC_64B | PPC_ALTIVEC |
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
-    pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX;
+    pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX |
+                        PPC2_PERM_ISA206;
     pcc->msr_mask = 0x800000000284FF36ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 02/22] target-ppc: Add Flag for ISA2.06 Divide Extended Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 01/22] target-ppc: Add ISA2.06 bpermd Instruction Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-08 18:27   ` Richard Henderson
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 03/22] target-ppc: Add ISA2.06 divdeu[o] Instructions Tom Musta
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds a flag for the Divide Extended instructions that
were introduced in Power ISA V2.06B.  The flag is added to the
Power7 and Power8 models.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V4: Split into new and separate patch.  Added flag to Power7+
model.

 target-ppc/cpu.h            |    5 ++++-
 target-ppc/translate_init.c |    6 +++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 0b330fd..8ba0d32 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1879,9 +1879,12 @@ enum {
     PPC2_VSX207        = 0x0000000000000040ULL,
     /* ISA 2.06B bpermd                                                      */
     PPC2_PERM_ISA206   = 0x0000000000000080ULL,
+    /* ISA 2.06B divide extended variants                                    */
+    PPC2_DIVE_ISA206   = 0x0000000000000100ULL,
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
-                        PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206)
+                        PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
+                        PPC2_DIVE_ISA206)
 };
 
 /*****************************************************************************/
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 823e40d..390024e 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7237,7 +7237,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
-                        PPC2_PERM_ISA206;
+                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206;
     pcc->msr_mask = 0x800000000284FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7276,7 +7276,7 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
-                        PPC2_PERM_ISA206;
+                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206;
     pcc->msr_mask = 0x800000000204FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7315,7 +7315,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX |
-                        PPC2_PERM_ISA206;
+                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206;
     pcc->msr_mask = 0x800000000284FF36ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 03/22] target-ppc: Add ISA2.06 divdeu[o] Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 01/22] target-ppc: Add ISA2.06 bpermd Instruction Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 02/22] target-ppc: Add Flag for ISA2.06 Divide Extended Instructions Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 04/22] target-ppc: Add ISA2.06 divde[o] Instructions Tom Musta
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Divide Doubleword Extended Unsigned
instructions.  This instruction requires dividing a 128-bit
value by a 64 bit value.  Since 128 bit integer division is
not supported in TCG, a helper is used.  An architecture
independent 128-bit division routine is added to host-utils.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
V2: Moved the 128-bit divide routine into host-utils per Richard
Henderson's suggestion.

V4: Use the newly added PPC2_DIVE_ISA206 flag. Modified helper
macro which will be common for divde[u] and divwe[u] instruction
variants.

 include/qemu/host-utils.h |   14 ++++++++++++++
 target-ppc/helper.h       |    1 +
 target-ppc/int_helper.c   |   27 +++++++++++++++++++++++++++
 target-ppc/translate.c    |   21 +++++++++++++++++++++
 util/host-utils.c         |   38 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 101 insertions(+), 0 deletions(-)

diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index de85d28..7615122 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -44,9 +44,23 @@ static inline void muls64(uint64_t *plow, uint64_t *phigh,
     *plow = r;
     *phigh = r >> 64;
 }
+
+static inline int divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor)
+{
+    if (divisor == 0) {
+        return 1;
+    } else {
+        __uint128_t dividend = ((__uint128_t)*phigh << 64) | *plow;
+        __uint128_t result = dividend / divisor;
+        *plow = result;
+        *phigh = dividend % divisor;
+        return result > UINT64_MAX;
+    }
+}
 #else
 void muls64(uint64_t *phigh, uint64_t *plow, int64_t a, int64_t b);
 void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b);
+int divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor);
 #endif
 
 /**
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index a42aa61..bd1abeb 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -31,6 +31,7 @@ DEF_HELPER_5(lscbx, tl, env, tl, i32, i32, i32)
 
 #if defined(TARGET_PPC64)
 DEF_HELPER_3(mulldo, i64, env, i64, i64)
+DEF_HELPER_4(divdeu, i64, env, i64, i64, i32)
 #endif
 
 DEF_HELPER_FLAGS_1(cntlzw, TCG_CALL_NO_RWG_SE, tl, tl)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 0e7afb3..6f3d8fd 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -41,6 +41,33 @@ uint64_t helper_mulldo(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
 }
 #endif
 
+#if defined(TARGET_PPC64)
+
+uint64_t helper_divdeu(CPUPPCState *env, uint64_t ra, uint64_t rb, uint32_t oe)
+{
+    uint64_t rt = 0;
+    int overflow = 0;
+
+    overflow = divu128(&rt, &ra, rb);
+
+    if (unlikely(overflow)) {
+        rt = 0; /* Undefined */
+    }
+
+    if (oe) {
+        if (unlikely(overflow)) {
+            env->so = env->ov = 1;
+        } else {
+            env->ov = 0;
+        }
+    }
+
+    return rt;
+}
+
+#endif
+
+
 target_ulong helper_cntlzw(target_ulong t)
 {
     return clz32(t);
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 491c585..ea7c21e 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -984,6 +984,20 @@ GEN_INT_ARITH_DIVW(divwuo, 0x1E, 0, 1);
 /* divw  divw.  divwo  divwo.   */
 GEN_INT_ARITH_DIVW(divw, 0x0F, 1, 0);
 GEN_INT_ARITH_DIVW(divwo, 0x1F, 1, 1);
+
+/* div[wd]eu[o][.] */
+#define GEN_DIVE(name, hlpr, compute_ov)                                      \
+static void gen_##name(DisasContext *ctx)                                     \
+{                                                                             \
+    TCGv_i32 t0 = tcg_const_i32(compute_ov);                                  \
+    gen_helper_##hlpr(cpu_gpr[rD(ctx->opcode)], cpu_env,                      \
+                     cpu_gpr[rA(ctx->opcode)], cpu_gpr[rB(ctx->opcode)], t0); \
+    tcg_temp_free_i32(t0);                                                    \
+    if (unlikely(Rc(ctx->opcode) != 0)) {                                     \
+        gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]);                           \
+    }                                                                         \
+}
+
 #if defined(TARGET_PPC64)
 static inline void gen_op_arith_divd(DisasContext *ctx, TCGv ret, TCGv arg1,
                                      TCGv arg2, int sign, int compute_ov)
@@ -1032,6 +1046,10 @@ GEN_INT_ARITH_DIVD(divduo, 0x1E, 0, 1);
 /* divw  divw.  divwo  divwo.   */
 GEN_INT_ARITH_DIVD(divd, 0x0F, 1, 0);
 GEN_INT_ARITH_DIVD(divdo, 0x1F, 1, 1);
+
+GEN_DIVE(divdeu, divdeu, 0);
+GEN_DIVE(divdeuo, divdeu, 1);
+
 #endif
 
 /* mulhw  mulhw. */
@@ -9610,6 +9628,9 @@ GEN_INT_ARITH_DIVD(divduo, 0x1E, 0, 1),
 GEN_INT_ARITH_DIVD(divd, 0x0F, 1, 0),
 GEN_INT_ARITH_DIVD(divdo, 0x1F, 1, 1),
 
+GEN_HANDLER_E(divdeu, 0x1F, 0x09, 0x0C, 0, PPC_NONE, PPC2_DIVE_ISA206),
+GEN_HANDLER_E(divdeuo, 0x1F, 0x09, 0x1C, 0, PPC_NONE, PPC2_DIVE_ISA206),
+
 #undef GEN_INT_ARITH_MUL_HELPER
 #define GEN_INT_ARITH_MUL_HELPER(name, opc3)                                  \
 GEN_HANDLER(name, 0x1F, 0x09, opc3, 0x00000000, PPC_64B)
diff --git a/util/host-utils.c b/util/host-utils.c
index f0784d6..b6f7a6e 100644
--- a/util/host-utils.c
+++ b/util/host-utils.c
@@ -86,4 +86,42 @@ void muls64 (uint64_t *plow, uint64_t *phigh, int64_t a, int64_t b)
     }
     *phigh = rh;
 }
+
+/* Unsigned 128x64 division.  Returns 1 if overflow (divide by zero or */
+/* quotient exceeds 64 bits).  Otherwise returns quotient via plow and */
+/* remainder via phigh. */
+int divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor)
+{
+    uint64_t dhi = *phigh;
+    uint64_t dlo = *plow;
+    unsigned i;
+    uint64_t carry = 0;
+
+    if (divisor == 0) {
+        return 1;
+    } else if (dhi == 0) {
+        *plow  = dlo / divisor;
+        *phigh = dlo % divisor;
+        return 0;
+    } else if (dhi > divisor) {
+        return 1;
+    } else {
+
+        for (i = 0; i < 64; i++) {
+            carry = dhi >> 63;
+            dhi = (dhi << 1) | (dlo >> 63);
+            if (carry | dhi >= divisor) {
+                dhi -= divisor;
+                carry = 1;
+            } else {
+                carry = 0;
+            }
+            dlo = (dlo << 1) | carry;
+        }
+
+        *plow = dlo;
+        *phigh = dhi;
+        return 0;
+    }
+}
 #endif /* !CONFIG_INT128 */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 04/22] target-ppc: Add ISA2.06 divde[o] Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (2 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 03/22] target-ppc: Add ISA2.06 divdeu[o] Instructions Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 05/22] target-ppc: Add ISA 2.06 divweu[o] Instructions Tom Musta
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Divide Doubleword Extended instructions.
The implementation builds on the unsigned helper provided in
the previous patch.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
V2: Updated to use the host-utils 128 bit divide.

V4: Using the newly addedd PPC2_DIVE_ISA206 flag.  Modified overflow
detection code per Richard Henderson's suggestion.

 include/qemu/host-utils.h |   14 ++++++++++++++
 target-ppc/helper.h       |    1 +
 target-ppc/int_helper.c   |   23 +++++++++++++++++++++++
 target-ppc/translate.c    |    5 ++++-
 util/host-utils.c         |   37 +++++++++++++++++++++++++++++++++++++
 5 files changed, 79 insertions(+), 1 deletions(-)

diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index 7615122..9cb0d67 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -57,10 +57,24 @@ static inline int divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor)
         return result > UINT64_MAX;
     }
 }
+
+static inline int divs128(int64_t *plow, int64_t *phigh, int64_t divisor)
+{
+    if (divisor == 0) {
+        return 1;
+    } else {
+        __int128_t dividend = ((__int128_t)*phigh << 64) | *plow;
+        __int128_t result = dividend / divisor;
+        *plow = result;
+        *phigh = dividend % divisor;
+        return result != *plow;
+    }
+}
 #else
 void muls64(uint64_t *phigh, uint64_t *plow, int64_t a, int64_t b);
 void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b);
 int divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor);
+int divs128(int64_t *plow, int64_t *phigh, int64_t divisor);
 #endif
 
 /**
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index bd1abeb..fea03d9 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -32,6 +32,7 @@ DEF_HELPER_5(lscbx, tl, env, tl, i32, i32, i32)
 #if defined(TARGET_PPC64)
 DEF_HELPER_3(mulldo, i64, env, i64, i64)
 DEF_HELPER_4(divdeu, i64, env, i64, i64, i32)
+DEF_HELPER_4(divde, i64, env, i64, i64, i32)
 #endif
 
 DEF_HELPER_FLAGS_1(cntlzw, TCG_CALL_NO_RWG_SE, tl, tl)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 6f3d8fd..920dba7 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -65,6 +65,29 @@ uint64_t helper_divdeu(CPUPPCState *env, uint64_t ra, uint64_t rb, uint32_t oe)
     return rt;
 }
 
+uint64_t helper_divde(CPUPPCState *env, uint64_t rau, uint64_t rbu, uint32_t oe)
+{
+    int64_t rt = 0;
+    int64_t ra = (int64_t)rau;
+    int64_t rb = (int64_t)rbu;
+    int overflow = divs128(&rt, &ra, rb);
+
+    if (unlikely(overflow)) {
+        rt = 0; /* Undefined */
+    }
+
+    if (oe) {
+
+        if (unlikely(overflow)) {
+            env->so = env->ov = 1;
+        } else {
+            env->ov = 0;
+        }
+    }
+
+    return rt;
+}
+
 #endif
 
 
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index ea7c21e..e8ffdfc 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1049,7 +1049,8 @@ GEN_INT_ARITH_DIVD(divdo, 0x1F, 1, 1);
 
 GEN_DIVE(divdeu, divdeu, 0);
 GEN_DIVE(divdeuo, divdeu, 1);
-
+GEN_DIVE(divde, divde, 0);
+GEN_DIVE(divdeo, divde, 1);
 #endif
 
 /* mulhw  mulhw. */
@@ -9630,6 +9631,8 @@ GEN_INT_ARITH_DIVD(divdo, 0x1F, 1, 1),
 
 GEN_HANDLER_E(divdeu, 0x1F, 0x09, 0x0C, 0, PPC_NONE, PPC2_DIVE_ISA206),
 GEN_HANDLER_E(divdeuo, 0x1F, 0x09, 0x1C, 0, PPC_NONE, PPC2_DIVE_ISA206),
+GEN_HANDLER_E(divde, 0x1F, 0x09, 0x0D, 0, PPC_NONE, PPC2_DIVE_ISA206),
+GEN_HANDLER_E(divdeo, 0x1F, 0x09, 0x1D, 0, PPC_NONE, PPC2_DIVE_ISA206),
 
 #undef GEN_INT_ARITH_MUL_HELPER
 #define GEN_INT_ARITH_MUL_HELPER(name, opc3)                                  \
diff --git a/util/host-utils.c b/util/host-utils.c
index b6f7a6e..140e25c 100644
--- a/util/host-utils.c
+++ b/util/host-utils.c
@@ -124,4 +124,41 @@ int divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor)
         return 0;
     }
 }
+
+int divs128(int64_t *plow, int64_t *phigh, int64_t divisor)
+{
+    int sgn_dvdnd = *phigh < 0;
+    int sgn_divsr = divisor < 0;
+    int overflow = 0;
+
+    if (sgn_dvdnd) {
+        *plow = ~(*plow);
+        *phigh = ~(*phigh);
+        if (*plow == (int64_t)-1) {
+            *plow = 0;
+            (*phigh)++;
+         } else {
+            (*plow)++;
+         }
+    }
+
+    if (sgn_divsr) {
+        divisor = 0 - divisor;
+    }
+
+    overflow = divu128((uint64_t *)plow, (uint64_t *)phigh, (uint64_t)divisor);
+
+    if (sgn_dvdnd  ^ sgn_divsr) {
+        *plow = 0 - *plow;
+    }
+
+    if (!overflow) {
+        if ((*plow < 0) ^ (sgn_dvdnd ^ sgn_divsr)) {
+            overflow = 1;
+        }
+    }
+
+    return overflow;
+}
+
 #endif /* !CONFIG_INT128 */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 05/22] target-ppc: Add ISA 2.06 divweu[o] Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (3 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 04/22] target-ppc: Add ISA2.06 divde[o] Instructions Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-08 18:26   ` Richard Henderson
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 06/22] target-ppc: Add ISA 2.06 divwe[o] Instructions Tom Musta
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch addes the Unsigned Divide Word Extended instructions
which were introduced in Power ISA 2.06B.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V2: Eliminating extraneous code in the overflow case per comments
from Richard Henderson.  Fixed corner case bug in divweu (check
for (RA) >= (RB)).

V4: Using newly added PPC2_DIVE_ISA206 flag.  Converted to helper
per Richard Henderson's review.

 target-ppc/helper.h     |    1 +
 target-ppc/int_helper.c |   31 +++++++++++++++++++++++++++++++
 target-ppc/translate.c  |    5 +++++
 3 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index fea03d9..41025c9 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -34,6 +34,7 @@ DEF_HELPER_3(mulldo, i64, env, i64, i64)
 DEF_HELPER_4(divdeu, i64, env, i64, i64, i32)
 DEF_HELPER_4(divde, i64, env, i64, i64, i32)
 #endif
+DEF_HELPER_4(divweu, tl, env, tl, tl, i32)
 
 DEF_HELPER_FLAGS_1(cntlzw, TCG_CALL_NO_RWG_SE, tl, tl)
 DEF_HELPER_FLAGS_1(popcntb, TCG_CALL_NO_RWG_SE, tl, tl)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 920dba7..45586be 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -41,6 +41,37 @@ uint64_t helper_mulldo(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
 }
 #endif
 
+target_ulong helper_divweu(CPUPPCState *env, target_ulong ra, target_ulong rb,
+                           uint32_t oe)
+{
+    uint64_t rt = 0;
+    int overflow = 0;
+
+    uint64_t dividend = (uint64_t)ra << 32;
+    uint64_t divisor = (uint32_t)rb;
+
+    if (unlikely(divisor == 0)) {
+        overflow = 1;
+    } else {
+        rt = dividend / divisor;
+        overflow = rt > UINT32_MAX;
+    }
+
+    if (unlikely(overflow)) {
+        rt = 0; /* Undefined */
+    }
+
+    if (oe) {
+        if (unlikely(overflow)) {
+            env->so = env->ov = 1;
+        } else {
+            env->ov = 0;
+        }
+    }
+
+    return (target_ulong)rt;
+}
+
 #if defined(TARGET_PPC64)
 
 uint64_t helper_divdeu(CPUPPCState *env, uint64_t ra, uint64_t rb, uint32_t oe)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index e8ffdfc..a7d3295 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -998,6 +998,9 @@ static void gen_##name(DisasContext *ctx)                                     \
     }                                                                         \
 }
 
+GEN_DIVE(divweu, divweu, 0);
+GEN_DIVE(divweuo, divweu, 1);
+
 #if defined(TARGET_PPC64)
 static inline void gen_op_arith_divd(DisasContext *ctx, TCGv ret, TCGv arg1,
                                      TCGv arg2, int sign, int compute_ov)
@@ -9619,6 +9622,8 @@ GEN_INT_ARITH_DIVW(divwu, 0x0E, 0, 0),
 GEN_INT_ARITH_DIVW(divwuo, 0x1E, 0, 1),
 GEN_INT_ARITH_DIVW(divw, 0x0F, 1, 0),
 GEN_INT_ARITH_DIVW(divwo, 0x1F, 1, 1),
+GEN_HANDLER_E(divweu, 0x1F, 0x0B, 0x0C, 0, PPC_NONE, PPC2_DIVE_ISA206),
+GEN_HANDLER_E(divweuo, 0x1F, 0x0B, 0x1C, 0, PPC_NONE, PPC2_DIVE_ISA206),
 
 #if defined(TARGET_PPC64)
 #undef GEN_INT_ARITH_DIVD
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 06/22] target-ppc: Add ISA 2.06 divwe[o] Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (4 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 05/22] target-ppc: Add ISA 2.06 divweu[o] Instructions Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-08 18:28   ` Richard Henderson
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 07/22] target-ppc: Add Flag for ISA2.06 Atomic Instructions Tom Musta
                   ` (16 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch addes the signed Divide Word Extended instructions
which were introduced in Power ISA 2.06B.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V2: Eliminating extraneous code in the overflow case per comments
from Richard Henderson.  Fixed corner case bug in divweu (check
for (RA) >= (RB)).

V4: Using newly added PPC2_DIVE_ISA206 flag.  Converted to helper
per Richard Henderson's review.

 target-ppc/helper.h     |    1 +
 target-ppc/int_helper.c |   32 ++++++++++++++++++++++++++++++++
 target-ppc/translate.c  |    4 ++++
 3 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 41025c9..ae2e990 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -35,6 +35,7 @@ DEF_HELPER_4(divdeu, i64, env, i64, i64, i32)
 DEF_HELPER_4(divde, i64, env, i64, i64, i32)
 #endif
 DEF_HELPER_4(divweu, tl, env, tl, tl, i32)
+DEF_HELPER_4(divwe, tl, env, tl, tl, i32)
 
 DEF_HELPER_FLAGS_1(cntlzw, TCG_CALL_NO_RWG_SE, tl, tl)
 DEF_HELPER_FLAGS_1(popcntb, TCG_CALL_NO_RWG_SE, tl, tl)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 45586be..71db3fb 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -72,6 +72,38 @@ target_ulong helper_divweu(CPUPPCState *env, target_ulong ra, target_ulong rb,
     return (target_ulong)rt;
 }
 
+target_ulong helper_divwe(CPUPPCState *env, target_ulong ra, target_ulong rb,
+                          uint32_t oe)
+{
+    int64_t rt = 0;
+    int overflow = 0;
+
+    int64_t dividend = (int64_t)ra << 32;
+    int64_t divisor = (int64_t)((int32_t)rb);
+
+    if (unlikely((divisor == 0) ||
+                 ((divisor == -1ull) && (dividend == INT64_MIN)))) {
+        overflow = 1;
+    } else {
+        rt = dividend / divisor;
+        overflow = rt != (int32_t)rt;
+    }
+
+    if (unlikely(overflow)) {
+        rt = 0; /* Undefined */
+    }
+
+    if (oe) {
+        if (unlikely(overflow)) {
+            env->so = env->ov = 1;
+        } else {
+            env->ov = 0;
+        }
+    }
+
+    return (target_ulong)rt;
+}
+
 #if defined(TARGET_PPC64)
 
 uint64_t helper_divdeu(CPUPPCState *env, uint64_t ra, uint64_t rb, uint32_t oe)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index a7d3295..d476d92 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1000,6 +1000,8 @@ static void gen_##name(DisasContext *ctx)                                     \
 
 GEN_DIVE(divweu, divweu, 0);
 GEN_DIVE(divweuo, divweu, 1);
+GEN_DIVE(divwe, divwe, 0);
+GEN_DIVE(divweo, divwe, 1);
 
 #if defined(TARGET_PPC64)
 static inline void gen_op_arith_divd(DisasContext *ctx, TCGv ret, TCGv arg1,
@@ -9622,6 +9624,8 @@ GEN_INT_ARITH_DIVW(divwu, 0x0E, 0, 0),
 GEN_INT_ARITH_DIVW(divwuo, 0x1E, 0, 1),
 GEN_INT_ARITH_DIVW(divw, 0x0F, 1, 0),
 GEN_INT_ARITH_DIVW(divwo, 0x1F, 1, 1),
+GEN_HANDLER_E(divwe, 0x1F, 0x0B, 0x0D, 0, PPC_NONE, PPC2_DIVE_ISA206),
+GEN_HANDLER_E(divweo, 0x1F, 0x0B, 0x1D, 0, PPC_NONE, PPC2_DIVE_ISA206),
 GEN_HANDLER_E(divweu, 0x1F, 0x0B, 0x0C, 0, PPC_NONE, PPC2_DIVE_ISA206),
 GEN_HANDLER_E(divweuo, 0x1F, 0x0B, 0x1C, 0, PPC_NONE, PPC2_DIVE_ISA206),
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 07/22] target-ppc: Add Flag for ISA2.06 Atomic Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (5 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 06/22] target-ppc: Add ISA 2.06 divwe[o] Instructions Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-08 18:28   ` Richard Henderson
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 08/22] target-ppc: Add ISA2.06 lbarx, lharx Instructions Tom Musta
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds a flag for the atomic instructions introduced
in Power ISA V2.06B.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V4: Split into new and separate patch.  Added to Power7+ model.

 target-ppc/cpu.h            |    5 ++++-
 target-ppc/translate_init.c |    9 ++++++---
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 8ba0d32..3057ac8 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1881,10 +1881,13 @@ enum {
     PPC2_PERM_ISA206   = 0x0000000000000080ULL,
     /* ISA 2.06B divide extended variants                                    */
     PPC2_DIVE_ISA206   = 0x0000000000000100ULL,
+    /* ISA 2.06B larx/stcx. instructions                                     */
+    PPC2_ATOMIC_ISA206 = 0x0000000000000200ULL,
+
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
                         PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
-                        PPC2_DIVE_ISA206)
+                        PPC2_DIVE_ISA206 | PPC2_ATOMIC_ISA206)
 };
 
 /*****************************************************************************/
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 390024e..5778760 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7237,7 +7237,8 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
-                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206;
+                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
+                        PPC2_ATOMIC_ISA206;
     pcc->msr_mask = 0x800000000284FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7276,7 +7277,8 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
-                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206;
+                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
+                        PPC2_ATOMIC_ISA206;
     pcc->msr_mask = 0x800000000204FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7315,7 +7317,8 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
                        PPC_SEGMENT_64B | PPC_SLBI |
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX |
-                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206;
+                        PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
+                        PPC2_ATOMIC_ISA206;
     pcc->msr_mask = 0x800000000284FF36ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 08/22] target-ppc: Add ISA2.06 lbarx, lharx Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (6 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 07/22] target-ppc: Add Flag for ISA2.06 Atomic Instructions Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 09/22] target-ppc: Add ISA 2.06 stbcx. and sthcx. Instructions Tom Musta
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the byte and halfword variants of the Load and
Reserve instructions.   Since there is much commonality among
all forms of Load and Reserve, a macro is provided and the existing
implementations of lwarx and ldarx are refactoried to use this
macro.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
V2: Fixed bug in aligment check for lharx (caught by Richard).

V4: Using newly added PPC2_ATOMIC_ISA206 flag.

 target-ppc/translate.c |   50 +++++++++++++++++++++++------------------------
 1 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index d476d92..a0b5336 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -3181,21 +3181,29 @@ static void gen_isync(DisasContext *ctx)
     gen_stop_exception(ctx);
 }
 
-/* lwarx */
-static void gen_lwarx(DisasContext *ctx)
-{
-    TCGv t0;
-    TCGv gpr = cpu_gpr[rD(ctx->opcode)];
-    gen_set_access_type(ctx, ACCESS_RES);
-    t0 = tcg_temp_local_new();
-    gen_addr_reg_index(ctx, t0);
-    gen_check_align(ctx, t0, 0x03);
-    gen_qemu_ld32u(ctx, gpr, t0);
-    tcg_gen_mov_tl(cpu_reserve, t0);
-    tcg_gen_st_tl(gpr, cpu_env, offsetof(CPUPPCState, reserve_val));
-    tcg_temp_free(t0);
+#define LARX(name, len, loadop)                                      \
+static void gen_##name(DisasContext *ctx)                            \
+{                                                                    \
+    TCGv t0;                                                         \
+    TCGv gpr = cpu_gpr[rD(ctx->opcode)];                             \
+    gen_set_access_type(ctx, ACCESS_RES);                            \
+    t0 = tcg_temp_local_new();                                       \
+    gen_addr_reg_index(ctx, t0);                                     \
+    if ((len) > 1) {                                                 \
+        gen_check_align(ctx, t0, (len)-1);                           \
+    }                                                                \
+    gen_qemu_##loadop(ctx, gpr, t0);                                 \
+    tcg_gen_mov_tl(cpu_reserve, t0);                                 \
+    tcg_gen_st_tl(gpr, cpu_env, offsetof(CPUPPCState, reserve_val)); \
+    tcg_temp_free(t0);                                               \
 }
 
+/* lwarx */
+LARX(lbarx, 1, ld8u);
+LARX(lharx, 2, ld16u);
+LARX(lwarx, 4, ld32u);
+
+
 #if defined(CONFIG_USER_ONLY)
 static void gen_conditional_store (DisasContext *ctx, TCGv EA,
                                    int reg, int size)
@@ -3242,19 +3250,7 @@ static void gen_stwcx_(DisasContext *ctx)
 
 #if defined(TARGET_PPC64)
 /* ldarx */
-static void gen_ldarx(DisasContext *ctx)
-{
-    TCGv t0;
-    TCGv gpr = cpu_gpr[rD(ctx->opcode)];
-    gen_set_access_type(ctx, ACCESS_RES);
-    t0 = tcg_temp_local_new();
-    gen_addr_reg_index(ctx, t0);
-    gen_check_align(ctx, t0, 0x07);
-    gen_qemu_ld64(ctx, gpr, t0);
-    tcg_gen_mov_tl(cpu_reserve, t0);
-    tcg_gen_st_tl(gpr, cpu_env, offsetof(CPUPPCState, reserve_val));
-    tcg_temp_free(t0);
-}
+LARX(ldarx, 8, ld64);
 
 /* stdcx. */
 static void gen_stdcx_(DisasContext *ctx)
@@ -9416,6 +9412,8 @@ GEN_HANDLER(stswi, 0x1F, 0x15, 0x16, 0x00000001, PPC_STRING),
 GEN_HANDLER(stswx, 0x1F, 0x15, 0x14, 0x00000001, PPC_STRING),
 GEN_HANDLER(eieio, 0x1F, 0x16, 0x1A, 0x03FFF801, PPC_MEM_EIEIO),
 GEN_HANDLER(isync, 0x13, 0x16, 0x04, 0x03FFF801, PPC_MEM),
+GEN_HANDLER_E(lbarx, 0x1F, 0x14, 0x01, 0, PPC_NONE, PPC2_ATOMIC_ISA206),
+GEN_HANDLER_E(lharx, 0x1F, 0x14, 0x03, 0, PPC_NONE, PPC2_ATOMIC_ISA206),
 GEN_HANDLER(lwarx, 0x1F, 0x14, 0x00, 0x00000000, PPC_RES),
 GEN_HANDLER2(stwcx_, "stwcx.", 0x1F, 0x16, 0x04, 0x00000000, PPC_RES),
 #if defined(TARGET_PPC64)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 09/22] target-ppc: Add ISA 2.06 stbcx. and sthcx. Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (7 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 08/22] target-ppc: Add ISA2.06 lbarx, lharx Instructions Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 10/22] target-ppc: Add Flag for ISA V2.06 Floating Point Conversion Tom Musta
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the byte and halfword variants of the Store Conditional
instructions.   A common macro is introduced and the existing implementations
of stwcx. and stdcx. are refactored to use this macro.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
V2: Re-implemented gen_conditional_store() and STCX macro per comments
from Richard.

V4: Using the newly added PPC2_ATOMIC_ISA206 flag. Fixes for tcg-debug
compilation errors.

 target-ppc/translate.c |   91 +++++++++++++++++++++++-------------------------
 1 files changed, 44 insertions(+), 47 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index a0b5336..48ac5ec 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -3205,8 +3205,8 @@ LARX(lwarx, 4, ld32u);
 
 
 #if defined(CONFIG_USER_ONLY)
-static void gen_conditional_store (DisasContext *ctx, TCGv EA,
-                                   int reg, int size)
+static void gen_conditional_store(DisasContext *ctx, TCGv EA,
+                                  int reg, int size)
 {
     TCGv t0 = tcg_temp_new();
     uint32_t save_exception = ctx->exception;
@@ -3220,62 +3220,57 @@ static void gen_conditional_store (DisasContext *ctx, TCGv EA,
     gen_exception(ctx, POWERPC_EXCP_STCX);
     ctx->exception = save_exception;
 }
-#endif
-
-/* stwcx. */
-static void gen_stwcx_(DisasContext *ctx)
-{
-    TCGv t0;
-    gen_set_access_type(ctx, ACCESS_RES);
-    t0 = tcg_temp_local_new();
-    gen_addr_reg_index(ctx, t0);
-    gen_check_align(ctx, t0, 0x03);
-#if defined(CONFIG_USER_ONLY)
-    gen_conditional_store(ctx, t0, rS(ctx->opcode), 4);
 #else
-    {
-        int l1;
+static void gen_conditional_store(DisasContext *ctx, TCGv EA,
+                                  int reg, int size)
+{
+    int l1;
 
-        tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
-        l1 = gen_new_label();
-        tcg_gen_brcond_tl(TCG_COND_NE, t0, cpu_reserve, l1);
-        tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
-        gen_qemu_st32(ctx, cpu_gpr[rS(ctx->opcode)], t0);
-        gen_set_label(l1);
-        tcg_gen_movi_tl(cpu_reserve, -1);
+    tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
+    l1 = gen_new_label();
+    tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
+    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
+#if defined(TARGET_PPC64)
+    if (size == 8) {
+        gen_qemu_st64(ctx, cpu_gpr[reg], EA);
+    } else
+#endif
+    if (size == 4) {
+        gen_qemu_st32(ctx, cpu_gpr[reg], EA);
+    } else if (size == 2) {
+        gen_qemu_st16(ctx, cpu_gpr[reg], EA);
+    } else {
+        gen_qemu_st8(ctx, cpu_gpr[reg], EA);
     }
+    gen_set_label(l1);
+    tcg_gen_movi_tl(cpu_reserve, -1);
+}
 #endif
-    tcg_temp_free(t0);
+
+#define STCX(name, len)                                   \
+static void gen_##name(DisasContext *ctx)                 \
+{                                                         \
+    TCGv t0;                                              \
+    gen_set_access_type(ctx, ACCESS_RES);                 \
+    t0 = tcg_temp_local_new();                            \
+    gen_addr_reg_index(ctx, t0);                          \
+    if (len > 1) {                                        \
+        gen_check_align(ctx, t0, (len)-1);                \
+    }                                                     \
+    gen_conditional_store(ctx, t0, rS(ctx->opcode), len); \
+    tcg_temp_free(t0);                                    \
 }
 
+STCX(stbcx_, 1);
+STCX(sthcx_, 2);
+STCX(stwcx_, 4);
+
 #if defined(TARGET_PPC64)
 /* ldarx */
 LARX(ldarx, 8, ld64);
 
 /* stdcx. */
-static void gen_stdcx_(DisasContext *ctx)
-{
-    TCGv t0;
-    gen_set_access_type(ctx, ACCESS_RES);
-    t0 = tcg_temp_local_new();
-    gen_addr_reg_index(ctx, t0);
-    gen_check_align(ctx, t0, 0x07);
-#if defined(CONFIG_USER_ONLY)
-    gen_conditional_store(ctx, t0, rS(ctx->opcode), 8);
-#else
-    {
-        int l1;
-        tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
-        l1 = gen_new_label();
-        tcg_gen_brcond_tl(TCG_COND_NE, t0, cpu_reserve, l1);
-        tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
-        gen_qemu_st64(ctx, cpu_gpr[rS(ctx->opcode)], t0);
-        gen_set_label(l1);
-        tcg_gen_movi_tl(cpu_reserve, -1);
-    }
-#endif
-    tcg_temp_free(t0);
-}
+STCX(stdcx_, 8);
 #endif /* defined(TARGET_PPC64) */
 
 /* sync */
@@ -9415,6 +9410,8 @@ GEN_HANDLER(isync, 0x13, 0x16, 0x04, 0x03FFF801, PPC_MEM),
 GEN_HANDLER_E(lbarx, 0x1F, 0x14, 0x01, 0, PPC_NONE, PPC2_ATOMIC_ISA206),
 GEN_HANDLER_E(lharx, 0x1F, 0x14, 0x03, 0, PPC_NONE, PPC2_ATOMIC_ISA206),
 GEN_HANDLER(lwarx, 0x1F, 0x14, 0x00, 0x00000000, PPC_RES),
+GEN_HANDLER_E(stbcx_, 0x1F, 0x16, 0x15, 0, PPC_NONE, PPC2_ATOMIC_ISA206),
+GEN_HANDLER_E(sthcx_, 0x1F, 0x16, 0x16, 0, PPC_NONE, PPC2_ATOMIC_ISA206),
 GEN_HANDLER2(stwcx_, "stwcx.", 0x1F, 0x16, 0x04, 0x00000000, PPC_RES),
 #if defined(TARGET_PPC64)
 GEN_HANDLER(ldarx, 0x1F, 0x14, 0x02, 0x00000000, PPC_64B),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 10/22] target-ppc: Add Flag for ISA V2.06 Floating Point Conversion
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (8 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 09/22] target-ppc: Add ISA 2.06 stbcx. and sthcx. Instructions Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-08 18:29   ` Richard Henderson
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 11/22] target-ppc: Add ISA2.06 Float to Integer Instructions Tom Musta
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds a flag for the floating point conversion instructions
introduced in Power ISA 2.06B.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V4: Split single flag into multiple flags per discussion with
Alex Graf and Scott Wood.  Added to Power7+ config.

 target-ppc/cpu.h            |    5 ++++-
 target-ppc/translate_init.c |    6 +++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 3057ac8..4a68ce2 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1883,11 +1883,14 @@ enum {
     PPC2_DIVE_ISA206   = 0x0000000000000100ULL,
     /* ISA 2.06B larx/stcx. instructions                                     */
     PPC2_ATOMIC_ISA206 = 0x0000000000000200ULL,
+    /* ISA 2.06B floating point integer conversion                           */
+    PPC2_FP_CVT_ISA206 = 0x0000000000000400ULL,
 
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
                         PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
-                        PPC2_DIVE_ISA206 | PPC2_ATOMIC_ISA206)
+                        PPC2_DIVE_ISA206 | PPC2_ATOMIC_ISA206 | \
+                        PPC2_FP_CVT_ISA206)
 };
 
 /*****************************************************************************/
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 5778760..3fb849b 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7238,7 +7238,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
                         PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
-                        PPC2_ATOMIC_ISA206;
+                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206;
     pcc->msr_mask = 0x800000000284FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7278,7 +7278,7 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
                         PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
-                        PPC2_ATOMIC_ISA206;
+                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206;
     pcc->msr_mask = 0x800000000204FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7318,7 +7318,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX |
                         PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
-                        PPC2_ATOMIC_ISA206;
+                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206;
     pcc->msr_mask = 0x800000000284FF36ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 11/22] target-ppc: Add ISA2.06 Float to Integer Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (9 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 10/22] target-ppc: Add Flag for ISA V2.06 Floating Point Conversion Tom Musta
@ 2014-01-07 16:05 ` Tom Musta
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 12/22] target-ppc: Add ISA 2.06 fcfid[u][s] Instructions Tom Musta
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the four floating point to integer conversion instructions
introduced by Power ISA V2.06:

  - Floating Convert to Integer Word Unsigned (fctiwu)
  - Floating Convert to Integer Word Unsigned with Round Toward
    Zero (fctiwuz)
  - Floating Convert to Integer Doubleword Unsigned (fctidu)
  - Floating Convert to Integer Doubleword Unsigned with Round
    Toward Zero (fctiduz)

A common macro is developed to eliminate repetitious code.  Existing instructions
are also refactoried to use this macro (fctiw, fctiwz, fctid, fctidz).

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
V4: Using the newly added PPC2_FP_CVT_ISA206 flag.

 target-ppc/fpu_helper.c |  122 +++++++++++++----------------------------------
 target-ppc/helper.h     |    4 ++
 target-ppc/translate.c  |   12 +++++
 3 files changed, 50 insertions(+), 88 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 1dfb3c0..04599f6 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -600,55 +600,41 @@ uint64_t helper_fdiv(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
     return farg1.ll;
 }
 
-/* fctiw - fctiw. */
-uint64_t helper_fctiw(CPUPPCState *env, uint64_t arg)
-{
-    CPU_DoubleU farg;
-
-    farg.ll = arg;
-
-    if (unlikely(float64_is_signaling_nan(farg.d))) {
-        /* sNaN conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI, 1);
-    } else if (unlikely(float64_is_quiet_nan(farg.d) ||
-                        float64_is_infinity(farg.d))) {
-        /* qNan / infinity conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
-    } else {
-        farg.ll = float64_to_int32(farg.d, &env->fp_status);
-        /* XXX: higher bits are not supposed to be significant.
-         *     to make tests easier, return the same as a real PowerPC 750
-         */
-        farg.ll |= 0xFFF80000ULL << 32;
-    }
-    return farg.ll;
-}
-
-/* fctiwz - fctiwz. */
-uint64_t helper_fctiwz(CPUPPCState *env, uint64_t arg)
-{
-    CPU_DoubleU farg;
 
-    farg.ll = arg;
+#define FPU_FCTI(op, cvt, nanval)                                      \
+uint64_t helper_##op(CPUPPCState *env, uint64_t arg)                   \
+{                                                                      \
+    CPU_DoubleU farg;                                                  \
+                                                                       \
+    farg.ll = arg;                                                     \
+    farg.ll = float64_to_##cvt(farg.d, &env->fp_status);               \
+                                                                       \
+    if (unlikely(env->fp_status.float_exception_flags)) {              \
+        if (float64_is_any_nan(arg)) {                                 \
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);      \
+            if (float64_is_signaling_nan(arg)) {                       \
+                fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1); \
+            }                                                          \
+            farg.ll = nanval;                                          \
+        } else if (env->fp_status.float_exception_flags &              \
+                   float_flag_invalid) {                               \
+            fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);      \
+        }                                                              \
+        helper_float_check_status(env);                                \
+    }                                                                  \
+    return farg.ll;                                                    \
+ }
 
-    if (unlikely(float64_is_signaling_nan(farg.d))) {
-        /* sNaN conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI, 1);
-    } else if (unlikely(float64_is_quiet_nan(farg.d) ||
-                        float64_is_infinity(farg.d))) {
-        /* qNan / infinity conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
-    } else {
-        farg.ll = float64_to_int32_round_to_zero(farg.d, &env->fp_status);
-        /* XXX: higher bits are not supposed to be significant.
-         *     to make tests easier, return the same as a real PowerPC 750
-         */
-        farg.ll |= 0xFFF80000ULL << 32;
-    }
-    return farg.ll;
-}
+FPU_FCTI(fctiw, int32, 0x80000000)
+FPU_FCTI(fctiwz, int32_round_to_zero, 0x80000000)
+FPU_FCTI(fctiwu, uint32, 0x00000000)
+FPU_FCTI(fctiwuz, uint32_round_to_zero, 0x00000000)
+#if defined(TARGET_PPC64)
+FPU_FCTI(fctid, int64, 0x8000000000000000)
+FPU_FCTI(fctidz, int64_round_to_zero, 0x8000000000000000)
+FPU_FCTI(fctidu, uint64, 0x0000000000000000)
+FPU_FCTI(fctiduz, uint64_round_to_zero, 0x0000000000000000)
+#endif
 
 #if defined(TARGET_PPC64)
 /* fcfid - fcfid. */
@@ -660,47 +646,7 @@ uint64_t helper_fcfid(CPUPPCState *env, uint64_t arg)
     return farg.ll;
 }
 
-/* fctid - fctid. */
-uint64_t helper_fctid(CPUPPCState *env, uint64_t arg)
-{
-    CPU_DoubleU farg;
 
-    farg.ll = arg;
-
-    if (unlikely(float64_is_signaling_nan(farg.d))) {
-        /* sNaN conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI, 1);
-    } else if (unlikely(float64_is_quiet_nan(farg.d) ||
-                        float64_is_infinity(farg.d))) {
-        /* qNan / infinity conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
-    } else {
-        farg.ll = float64_to_int64(farg.d, &env->fp_status);
-    }
-    return farg.ll;
-}
-
-/* fctidz - fctidz. */
-uint64_t helper_fctidz(CPUPPCState *env, uint64_t arg)
-{
-    CPU_DoubleU farg;
-
-    farg.ll = arg;
-
-    if (unlikely(float64_is_signaling_nan(farg.d))) {
-        /* sNaN conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI, 1);
-    } else if (unlikely(float64_is_quiet_nan(farg.d) ||
-                        float64_is_infinity(farg.d))) {
-        /* qNan / infinity conversion */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
-    } else {
-        farg.ll = float64_to_int64_round_to_zero(farg.d, &env->fp_status);
-    }
-    return farg.ll;
-}
 
 #endif
 
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index ae2e990..7c7dc61 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -66,11 +66,15 @@ DEF_HELPER_4(fcmpo, void, env, i64, i64, i32)
 DEF_HELPER_4(fcmpu, void, env, i64, i64, i32)
 
 DEF_HELPER_2(fctiw, i64, env, i64)
+DEF_HELPER_2(fctiwu, i64, env, i64)
 DEF_HELPER_2(fctiwz, i64, env, i64)
+DEF_HELPER_2(fctiwuz, i64, env, i64)
 #if defined(TARGET_PPC64)
 DEF_HELPER_2(fcfid, i64, env, i64)
 DEF_HELPER_2(fctid, i64, env, i64)
+DEF_HELPER_2(fctidu, i64, env, i64)
 DEF_HELPER_2(fctidz, i64, env, i64)
+DEF_HELPER_2(fctiduz, i64, env, i64)
 #endif
 DEF_HELPER_2(frsp, i64, env, i64)
 DEF_HELPER_2(frin, i64, env, i64)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 48ac5ec..970209b 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2202,8 +2202,12 @@ GEN_FLOAT_ACB(nmsub, 0x1E, 1, PPC_FLOAT);
 /***                     Floating-Point round & convert                    ***/
 /* fctiw */
 GEN_FLOAT_B(ctiw, 0x0E, 0x00, 0, PPC_FLOAT);
+/* fctiwu */
+GEN_FLOAT_B(ctiwu, 0x0E, 0x04, 0, PPC2_FP_CVT_ISA206);
 /* fctiwz */
 GEN_FLOAT_B(ctiwz, 0x0F, 0x00, 0, PPC_FLOAT);
+/* fctiwuz */
+GEN_FLOAT_B(ctiwuz, 0x0F, 0x04, 0, PPC2_FP_CVT_ISA206);
 /* frsp */
 GEN_FLOAT_B(rsp, 0x0C, 0x00, 1, PPC_FLOAT);
 #if defined(TARGET_PPC64)
@@ -2211,8 +2215,12 @@ GEN_FLOAT_B(rsp, 0x0C, 0x00, 1, PPC_FLOAT);
 GEN_FLOAT_B(cfid, 0x0E, 0x1A, 1, PPC_64B);
 /* fctid */
 GEN_FLOAT_B(ctid, 0x0E, 0x19, 0, PPC_64B);
+/* fctidu */
+GEN_FLOAT_B(ctidu, 0x0E, 0x1D, 0, PPC2_FP_CVT_ISA206);
 /* fctidz */
 GEN_FLOAT_B(ctidz, 0x0F, 0x19, 0, PPC_64B);
+/* fctidu */
+GEN_FLOAT_B(ctiduz, 0x0F, 0x1D, 0, PPC2_FP_CVT_ISA206);
 #endif
 
 /* frin */
@@ -9746,12 +9754,16 @@ GEN_FLOAT_ACB(msub, 0x1C, 1, PPC_FLOAT),
 GEN_FLOAT_ACB(nmadd, 0x1F, 1, PPC_FLOAT),
 GEN_FLOAT_ACB(nmsub, 0x1E, 1, PPC_FLOAT),
 GEN_FLOAT_B(ctiw, 0x0E, 0x00, 0, PPC_FLOAT),
+GEN_HANDLER_E(fctiwu, 0x3F, 0x0E, 0x04, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(ctiwz, 0x0F, 0x00, 0, PPC_FLOAT),
+GEN_HANDLER_E(fctiwuz, 0x3F, 0x0F, 0x04, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(rsp, 0x0C, 0x00, 1, PPC_FLOAT),
 #if defined(TARGET_PPC64)
 GEN_FLOAT_B(cfid, 0x0E, 0x1A, 1, PPC_64B),
 GEN_FLOAT_B(ctid, 0x0E, 0x19, 0, PPC_64B),
+GEN_HANDLER_E(fctidu, 0x3F, 0x0E, 0x1D, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(ctidz, 0x0F, 0x19, 0, PPC_64B),
+GEN_HANDLER_E(fctiduz, 0x3F, 0x0F, 0x1D, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 #endif
 GEN_FLOAT_B(rin, 0x08, 0x0C, 1, PPC_FLOAT_EXT),
 GEN_FLOAT_B(riz, 0x08, 0x0D, 1, PPC_FLOAT_EXT),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 12/22] target-ppc: Add ISA 2.06 fcfid[u][s] Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (10 preceding siblings ...)
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 11/22] target-ppc: Add ISA2.06 Float to Integer Instructions Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-08 18:31   ` Richard Henderson
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 13/22] softfloat: Fix exception flag handling for float32_to_float16() Tom Musta
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the fcfids, fcfidu and fcfidus instructions which
were introduced in Power ISA 2.06B.  A common macro is provided to
eliminate repetitious code, and the existing fcfid instruction is
refactored to use this macro.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V4: Using the newly added PPC2_FP_CVT_ISA206 flag.  Performed
direct conversion to single precision per Richard Henderson's
review.

 target-ppc/fpu_helper.c |   24 +++++++++++++++++-------
 target-ppc/helper.h     |    3 +++
 target-ppc/translate.c  |    9 +++++++++
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 04599f6..4985f53 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -637,16 +637,26 @@ FPU_FCTI(fctiduz, uint64_round_to_zero, 0x0000000000000000)
 #endif
 
 #if defined(TARGET_PPC64)
-/* fcfid - fcfid. */
-uint64_t helper_fcfid(CPUPPCState *env, uint64_t arg)
-{
-    CPU_DoubleU farg;
-
-    farg.d = int64_to_float64(arg, &env->fp_status);
-    return farg.ll;
-}
-
 
+#define FPU_FCFI(op, cvtr, is_single)                      \
+uint64_t helper_##op(CPUPPCState *env, uint64_t arg)       \
+{                                                          \
+    CPU_DoubleU farg;                                      \
+                                                           \
+    if (is_single) {                                       \
+        float32 tmp = cvtr(arg, &env->fp_status);          \
+        farg.d = float32_to_float64(tmp, &env->fp_status); \
+    } else {                                               \
+        farg.d = cvtr(arg, &env->fp_status);               \
+    }                                                      \
+    helper_float_check_status(env);                        \
+    return farg.ll;                                        \
+}
+
+FPU_FCFI(fcfid, int64_to_float64, 0)
+FPU_FCFI(fcfids, int64_to_float32, 1)
+FPU_FCFI(fcfidu, uint64_to_float64, 0)
+FPU_FCFI(fcfidus, uint64_to_float32, 1)
 
 #endif
 
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 7c7dc61..037871b 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -71,6 +71,9 @@ DEF_HELPER_2(fctiwz, i64, env, i64)
 DEF_HELPER_2(fctiwuz, i64, env, i64)
 #if defined(TARGET_PPC64)
 DEF_HELPER_2(fcfid, i64, env, i64)
+DEF_HELPER_2(fcfidu, i64, env, i64)
+DEF_HELPER_2(fcfids, i64, env, i64)
+DEF_HELPER_2(fcfidus, i64, env, i64)
 DEF_HELPER_2(fctid, i64, env, i64)
 DEF_HELPER_2(fctidu, i64, env, i64)
 DEF_HELPER_2(fctidz, i64, env, i64)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 970209b..a9b9032 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2213,6 +2213,12 @@ GEN_FLOAT_B(rsp, 0x0C, 0x00, 1, PPC_FLOAT);
 #if defined(TARGET_PPC64)
 /* fcfid */
 GEN_FLOAT_B(cfid, 0x0E, 0x1A, 1, PPC_64B);
+/* fcfids */
+GEN_FLOAT_B(cfids, 0x0E, 0x1A, 0, PPC2_FP_CVT_ISA206);
+/* fcfidu */
+GEN_FLOAT_B(cfidu, 0x0E, 0x1E, 0, PPC2_FP_CVT_ISA206);
+/* fcfidus */
+GEN_FLOAT_B(cfidus, 0x0E, 0x1E, 0, PPC2_FP_CVT_ISA206);
 /* fctid */
 GEN_FLOAT_B(ctid, 0x0E, 0x19, 0, PPC_64B);
 /* fctidu */
@@ -9760,6 +9766,9 @@ GEN_HANDLER_E(fctiwuz, 0x3F, 0x0F, 0x04, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(rsp, 0x0C, 0x00, 1, PPC_FLOAT),
 #if defined(TARGET_PPC64)
 GEN_FLOAT_B(cfid, 0x0E, 0x1A, 1, PPC_64B),
+GEN_HANDLER_E(fcfids, 0x3B, 0x0E, 0x1A, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
+GEN_HANDLER_E(fcfidu, 0x3F, 0x0E, 0x1E, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
+GEN_HANDLER_E(fcfidus, 0x3B, 0x0E, 0x1E, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(ctid, 0x0E, 0x19, 0, PPC_64B),
 GEN_HANDLER_E(fctidu, 0x3F, 0x0E, 0x1D, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(ctidz, 0x0F, 0x19, 0, PPC_64B),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 13/22] softfloat: Fix exception flag handling for float32_to_float16()
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (11 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 12/22] target-ppc: Add ISA 2.06 fcfid[u][s] Instructions Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 14/22] softfloat: Factor out RoundAndPackFloat16 and NormalizeFloat16Subnormal Tom Musta
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc, Peter Maydell

From: Peter Maydell <peter.maydell@linaro.org>

Our float32 to float16 conversion routine was generating the correct
numerical answers, but not always setting the right set of exception
flags. Fix this, mostly by rearranging the code to more closely
resemble RoundAndPackFloat*, and in particular:
 * non-IEEE halfprec always raises Invalid for input NaNs
 * we need to check for the overflow case before underflow
 * we weren't getting the tininess-detected-after-rounding
   case correct (somewhat academic since only ARM uses halfprec
   and it is always tininess-detected-before-rounding)
 * non-IEEE halfprec overflow raises only Invalid, not
   Invalid + Inexact
 * we weren't setting Inexact when we should

Also add some clarifying comments about what the code is doing.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1389013881-15726-2-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Tom Musta <tommusta@gmail.com>
---
 fpu/softfloat.c |  105 ++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 66 insertions(+), 39 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 2c598ca..f95c964 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3141,6 +3141,10 @@ float16 float32_to_float16(float32 a, flag ieee STATUS_PARAM)
     uint32_t mask;
     uint32_t increment;
     int8 roundingMode;
+    int maxexp = ieee ? 15 : 16;
+    bool rounding_bumps_exp;
+    bool is_tiny = false;
+
     a = float32_squash_input_denormal(a STATUS_VAR);
 
     aSig = extractFloat32Frac( a );
@@ -3149,11 +3153,12 @@ float16 float32_to_float16(float32 a, flag ieee STATUS_PARAM)
     if ( aExp == 0xFF ) {
         if (aSig) {
             /* Input is a NaN */
-            float16 r = commonNaNToFloat16( float32ToCommonNaN( a STATUS_VAR ) STATUS_VAR );
             if (!ieee) {
+                float_raise(float_flag_invalid STATUS_VAR);
                 return packFloat16(aSign, 0, 0);
             }
-            return r;
+            return commonNaNToFloat16(
+                float32ToCommonNaN(a STATUS_VAR) STATUS_VAR);
         }
         /* Infinity */
         if (!ieee) {
@@ -3165,58 +3170,80 @@ float16 float32_to_float16(float32 a, flag ieee STATUS_PARAM)
     if (aExp == 0 && aSig == 0) {
         return packFloat16(aSign, 0, 0);
     }
-    /* Decimal point between bits 22 and 23.  */
+    /* Decimal point between bits 22 and 23. Note that we add the 1 bit
+     * even if the input is denormal; however this is harmless because
+     * the largest possible single-precision denormal is still smaller
+     * than the smallest representable half-precision denormal, and so we
+     * will end up ignoring aSig and returning via the "always return zero"
+     * codepath.
+     */
     aSig |= 0x00800000;
     aExp -= 0x7f;
+    /* Calculate the mask of bits of the mantissa which are not
+     * representable in half-precision and will be lost.
+     */
     if (aExp < -14) {
+        /* Will be denormal in halfprec */
         mask = 0x00ffffff;
         if (aExp >= -24) {
             mask >>= 25 + aExp;
         }
     } else {
+        /* Normal number in halfprec */
         mask = 0x00001fff;
     }
-    if (aSig & mask) {
-        float_raise( float_flag_underflow STATUS_VAR );
-        roundingMode = STATUS(float_rounding_mode);
-        switch (roundingMode) {
-        case float_round_nearest_even:
-            increment = (mask + 1) >> 1;
-            if ((aSig & mask) == increment) {
-                increment = aSig & (increment << 1);
-            }
-            break;
-        case float_round_up:
-            increment = aSign ? 0 : mask;
-            break;
-        case float_round_down:
-            increment = aSign ? mask : 0;
-            break;
-        default: /* round_to_zero */
-            increment = 0;
-            break;
-        }
-        aSig += increment;
-        if (aSig >= 0x01000000) {
-            aSig >>= 1;
-            aExp++;
-        }
-    } else if (aExp < -14
-          && STATUS(float_detect_tininess) == float_tininess_before_rounding) {
-        float_raise( float_flag_underflow STATUS_VAR);
-    }
 
-    if (ieee) {
-        if (aExp > 15) {
-            float_raise( float_flag_overflow | float_flag_inexact STATUS_VAR);
+    roundingMode = STATUS(float_rounding_mode);
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        increment = (mask + 1) >> 1;
+        if ((aSig & mask) == increment) {
+            increment = aSig & (increment << 1);
+        }
+        break;
+    case float_round_up:
+        increment = aSign ? 0 : mask;
+        break;
+    case float_round_down:
+        increment = aSign ? mask : 0;
+        break;
+    default: /* round_to_zero */
+        increment = 0;
+        break;
+    }
+
+    rounding_bumps_exp = (aSig + increment >= 0x01000000);
+
+    if (aExp > maxexp || (aExp == maxexp && rounding_bumps_exp)) {
+        if (ieee) {
+            float_raise(float_flag_overflow | float_flag_inexact STATUS_VAR);
             return packFloat16(aSign, 0x1f, 0);
-        }
-    } else {
-        if (aExp > 16) {
-            float_raise(float_flag_invalid | float_flag_inexact STATUS_VAR);
+        } else {
+            float_raise(float_flag_invalid STATUS_VAR);
             return packFloat16(aSign, 0x1f, 0x3ff);
         }
     }
+
+    if (aExp < -14) {
+        /* Note that flush-to-zero does not affect half-precision results */
+        is_tiny =
+            (STATUS(float_detect_tininess) == float_tininess_before_rounding)
+            || (aExp < -15)
+            || (!rounding_bumps_exp);
+    }
+    if (aSig & mask) {
+        float_raise(float_flag_inexact STATUS_VAR);
+        if (is_tiny) {
+            float_raise(float_flag_underflow STATUS_VAR);
+        }
+    }
+
+    aSig += increment;
+    if (rounding_bumps_exp) {
+        aSig >>= 1;
+        aExp++;
+    }
+
     if (aExp < -24) {
         return packFloat16(aSign, 0, 0);
     }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 14/22] softfloat: Factor out RoundAndPackFloat16 and NormalizeFloat16Subnormal
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (12 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 13/22] softfloat: Fix exception flag handling for float32_to_float16() Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 15/22] softfloat: Refactor code handling various rounding modes Tom Musta
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc, Peter Maydell

From: Peter Maydell <peter.maydell@linaro.org>

In preparation for adding conversions between float16 and float64,
factor out code currently done inline in the float16<=>float32
conversion functions into functions RoundAndPackFloat16 and
NormalizeFloat16Subnormal along the lines of the existing versions
for the other float types.

Note that we change the handling of zExp from the inline code
to match the API of the other RoundAndPackFloat functions; however
we leave the positioning of the binary point between bits 22 and 23
rather than shifting it up to the high end of the word.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1389013881-15726-14-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Tom Musta <tommusta@gmail.com
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 fpu/softfloat.c |  209 +++++++++++++++++++++++++++++++++----------------------
 1 files changed, 125 insertions(+), 84 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index f95c964..2cefd81 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3100,6 +3100,127 @@ static float16 packFloat16(flag zSign, int_fast16_t zExp, uint16_t zSig)
         (((uint32_t)zSign) << 15) + (((uint32_t)zExp) << 10) + zSig);
 }
 
+/*----------------------------------------------------------------------------
+| Takes an abstract floating-point value having sign `zSign', exponent `zExp',
+| and significand `zSig', and returns the proper half-precision floating-
+| point value corresponding to the abstract input.  Ordinarily, the abstract
+| value is simply rounded and packed into the half-precision format, with
+| the inexact exception raised if the abstract input cannot be represented
+| exactly.  However, if the abstract value is too large, the overflow and
+| inexact exceptions are raised and an infinity or maximal finite value is
+| returned.  If the abstract value is too small, the input value is rounded to
+| a subnormal number, and the underflow and inexact exceptions are raised if
+| the abstract input cannot be represented exactly as a subnormal half-
+| precision floating-point number.
+| The `ieee' flag indicates whether to use IEEE standard half precision, or
+| ARM-style "alternative representation", which omits the NaN and Inf
+| encodings in order to raise the maximum representable exponent by one.
+|     The input significand `zSig' has its binary point between bits 22
+| and 23, which is 13 bits to the left of the usual location.  This shifted
+| significand must be normalized or smaller.  If `zSig' is not normalized,
+| `zExp' must be 0; in that case, the result returned is a subnormal number,
+| and it must not require rounding.  In the usual case that `zSig' is
+| normalized, `zExp' must be 1 less than the ``true'' floating-point exponent.
+| Note the slightly odd position of the binary point in zSig compared with the
+| other roundAndPackFloat functions. This should probably be fixed if we
+| need to implement more float16 routines than just conversion.
+| The handling of underflow and overflow follows the IEC/IEEE Standard for
+| Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+static float32 roundAndPackFloat16(flag zSign, int_fast16_t zExp,
+                                   uint32_t zSig, flag ieee STATUS_PARAM)
+{
+    int maxexp = ieee ? 29 : 30;
+    uint32_t mask;
+    uint32_t increment;
+    int8 roundingMode;
+    bool rounding_bumps_exp;
+    bool is_tiny = false;
+
+    /* Calculate the mask of bits of the mantissa which are not
+     * representable in half-precision and will be lost.
+     */
+    if (zExp < 1) {
+        /* Will be denormal in halfprec */
+        mask = 0x00ffffff;
+        if (zExp >= -11) {
+            mask >>= 11 + zExp;
+        }
+    } else {
+        /* Normal number in halfprec */
+        mask = 0x00001fff;
+    }
+
+    roundingMode = STATUS(float_rounding_mode);
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        increment = (mask + 1) >> 1;
+        if ((zSig & mask) == increment) {
+            increment = zSig & (increment << 1);
+        }
+        break;
+    case float_round_up:
+        increment = zSign ? 0 : mask;
+        break;
+    case float_round_down:
+        increment = zSign ? mask : 0;
+        break;
+    default: /* round_to_zero */
+        increment = 0;
+        break;
+    }
+
+    rounding_bumps_exp = (zSig + increment >= 0x01000000);
+
+    if (zExp > maxexp || (zExp == maxexp && rounding_bumps_exp)) {
+        if (ieee) {
+            float_raise(float_flag_overflow | float_flag_inexact STATUS_VAR);
+            return packFloat16(zSign, 0x1f, 0);
+        } else {
+            float_raise(float_flag_invalid STATUS_VAR);
+            return packFloat16(zSign, 0x1f, 0x3ff);
+        }
+    }
+
+    if (zExp < 0) {
+        /* Note that flush-to-zero does not affect half-precision results */
+        is_tiny =
+            (STATUS(float_detect_tininess) == float_tininess_before_rounding)
+            || (zExp < -1)
+            || (!rounding_bumps_exp);
+    }
+    if (zSig & mask) {
+        float_raise(float_flag_inexact STATUS_VAR);
+        if (is_tiny) {
+            float_raise(float_flag_underflow STATUS_VAR);
+        }
+    }
+
+    zSig += increment;
+    if (rounding_bumps_exp) {
+        zSig >>= 1;
+        zExp++;
+    }
+
+    if (zExp < -10) {
+        return packFloat16(zSign, 0, 0);
+    }
+    if (zExp < 0) {
+        zSig >>= -zExp;
+        zExp = 0;
+    }
+    return packFloat16(zSign, zExp, zSig >> 13);
+}
+
+static void normalizeFloat16Subnormal(uint32_t aSig, int_fast16_t *zExpPtr,
+                                      uint32_t *zSigPtr)
+{
+    int8_t shiftCount = countLeadingZeros32(aSig) - 21;
+    *zSigPtr = aSig << shiftCount;
+    *zExpPtr = 1 - shiftCount;
+}
+
 /* Half precision floats come in two formats: standard IEEE and "ARM" format.
    The latter gains extra exponent range by omitting the NaN/Inf encodings.  */
 
@@ -3120,15 +3241,12 @@ float32 float16_to_float32(float16 a, flag ieee STATUS_PARAM)
         return packFloat32(aSign, 0xff, 0);
     }
     if (aExp == 0) {
-        int8 shiftCount;
-
         if (aSig == 0) {
             return packFloat32(aSign, 0, 0);
         }
 
-        shiftCount = countLeadingZeros32( aSig ) - 21;
-        aSig = aSig << shiftCount;
-        aExp = -shiftCount;
+        normalizeFloat16Subnormal(aSig, &aExp, &aSig);
+        aExp--;
     }
     return packFloat32( aSign, aExp + 0x70, aSig << 13);
 }
@@ -3138,12 +3256,6 @@ float16 float32_to_float16(float32 a, flag ieee STATUS_PARAM)
     flag aSign;
     int_fast16_t aExp;
     uint32_t aSig;
-    uint32_t mask;
-    uint32_t increment;
-    int8 roundingMode;
-    int maxexp = ieee ? 15 : 16;
-    bool rounding_bumps_exp;
-    bool is_tiny = false;
 
     a = float32_squash_input_denormal(a STATUS_VAR);
 
@@ -3178,80 +3290,9 @@ float16 float32_to_float16(float32 a, flag ieee STATUS_PARAM)
      * codepath.
      */
     aSig |= 0x00800000;
-    aExp -= 0x7f;
-    /* Calculate the mask of bits of the mantissa which are not
-     * representable in half-precision and will be lost.
-     */
-    if (aExp < -14) {
-        /* Will be denormal in halfprec */
-        mask = 0x00ffffff;
-        if (aExp >= -24) {
-            mask >>= 25 + aExp;
-        }
-    } else {
-        /* Normal number in halfprec */
-        mask = 0x00001fff;
-    }
+    aExp -= 0x71;
 
-    roundingMode = STATUS(float_rounding_mode);
-    switch (roundingMode) {
-    case float_round_nearest_even:
-        increment = (mask + 1) >> 1;
-        if ((aSig & mask) == increment) {
-            increment = aSig & (increment << 1);
-        }
-        break;
-    case float_round_up:
-        increment = aSign ? 0 : mask;
-        break;
-    case float_round_down:
-        increment = aSign ? mask : 0;
-        break;
-    default: /* round_to_zero */
-        increment = 0;
-        break;
-    }
-
-    rounding_bumps_exp = (aSig + increment >= 0x01000000);
-
-    if (aExp > maxexp || (aExp == maxexp && rounding_bumps_exp)) {
-        if (ieee) {
-            float_raise(float_flag_overflow | float_flag_inexact STATUS_VAR);
-            return packFloat16(aSign, 0x1f, 0);
-        } else {
-            float_raise(float_flag_invalid STATUS_VAR);
-            return packFloat16(aSign, 0x1f, 0x3ff);
-        }
-    }
-
-    if (aExp < -14) {
-        /* Note that flush-to-zero does not affect half-precision results */
-        is_tiny =
-            (STATUS(float_detect_tininess) == float_tininess_before_rounding)
-            || (aExp < -15)
-            || (!rounding_bumps_exp);
-    }
-    if (aSig & mask) {
-        float_raise(float_flag_inexact STATUS_VAR);
-        if (is_tiny) {
-            float_raise(float_flag_underflow STATUS_VAR);
-        }
-    }
-
-    aSig += increment;
-    if (rounding_bumps_exp) {
-        aSig >>= 1;
-        aExp++;
-    }
-
-    if (aExp < -24) {
-        return packFloat16(aSign, 0, 0);
-    }
-    if (aExp < -14) {
-        aSig >>= -14 - aExp;
-        aExp = -14;
-    }
-    return packFloat16(aSign, aExp + 14, aSig >> 13);
+    return roundAndPackFloat16(aSign, aExp, aSig, ieee STATUS_VAR);
 }
 
 /*----------------------------------------------------------------------------
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 15/22] softfloat: Refactor code handling various rounding modes
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (13 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 14/22] softfloat: Factor out RoundAndPackFloat16 and NormalizeFloat16Subnormal Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 16/22] softfloat: Add support for ties-away rounding Tom Musta
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc, Peter Maydell

From: Peter Maydell <peter.maydell@linaro.org>

Refactor the code in various functions which calculates rounding
increments given the current rounding mode, so that instead of a
set of nested if statements we have a simple switch statement.
This will give us a clean place to add the case for the new
tiesAway rounding mode.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1389013881-15726-16-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 fpu/softfloat.c |  405 +++++++++++++++++++++++++++++++++----------------------
 1 files changed, 241 insertions(+), 164 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 2cefd81..4390e1d 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -42,6 +42,9 @@ these four paragraphs for those parts of this code that are retained.
 
 #include "fpu/softfloat.h"
 
+/* We only need stdlib for abort() */
+#include <stdlib.h>
+
 /*----------------------------------------------------------------------------
 | Primitive arithmetic functions, including multi-word arithmetic, and
 | division and square root approximations.  (Can be specialized to target if
@@ -121,20 +124,21 @@ static int32 roundAndPackInt32( flag zSign, uint64_t absZ STATUS_PARAM)
 
     roundingMode = STATUS(float_rounding_mode);
     roundNearestEven = ( roundingMode == float_round_nearest_even );
-    roundIncrement = 0x40;
-    if ( ! roundNearestEven ) {
-        if ( roundingMode == float_round_to_zero ) {
-            roundIncrement = 0;
-        }
-        else {
-            roundIncrement = 0x7F;
-            if ( zSign ) {
-                if ( roundingMode == float_round_up ) roundIncrement = 0;
-            }
-            else {
-                if ( roundingMode == float_round_down ) roundIncrement = 0;
-            }
-        }
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        roundIncrement = 0x40;
+        break;
+    case float_round_to_zero:
+        roundIncrement = 0;
+        break;
+    case float_round_up:
+        roundIncrement = zSign ? 0 : 0x7f;
+        break;
+    case float_round_down:
+        roundIncrement = zSign ? 0x7f : 0;
+        break;
+    default:
+        abort();
     }
     roundBits = absZ & 0x7F;
     absZ = ( absZ + roundIncrement )>>7;
@@ -169,19 +173,21 @@ static int64 roundAndPackInt64( flag zSign, uint64_t absZ0, uint64_t absZ1 STATU
 
     roundingMode = STATUS(float_rounding_mode);
     roundNearestEven = ( roundingMode == float_round_nearest_even );
-    increment = ( (int64_t) absZ1 < 0 );
-    if ( ! roundNearestEven ) {
-        if ( roundingMode == float_round_to_zero ) {
-            increment = 0;
-        }
-        else {
-            if ( zSign ) {
-                increment = ( roundingMode == float_round_down ) && absZ1;
-            }
-            else {
-                increment = ( roundingMode == float_round_up ) && absZ1;
-            }
-        }
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        increment = ((int64_t) absZ1 < 0);
+        break;
+    case float_round_to_zero:
+        increment = 0;
+        break;
+    case float_round_up:
+        increment = !zSign && absZ1;
+        break;
+    case float_round_down:
+        increment = zSign && absZ1;
+        break;
+    default:
+        abort();
     }
     if ( increment ) {
         ++absZ0;
@@ -220,17 +226,21 @@ static int64 roundAndPackUint64(flag zSign, uint64_t absZ0,
 
     roundingMode = STATUS(float_rounding_mode);
     roundNearestEven = (roundingMode == float_round_nearest_even);
-    increment = ((int64_t)absZ1 < 0);
-    if (!roundNearestEven) {
-        if (roundingMode == float_round_to_zero) {
-            increment = 0;
-        } else if (absZ1) {
-            if (zSign) {
-                increment = (roundingMode == float_round_down) && absZ1;
-            } else {
-                increment = (roundingMode == float_round_up) && absZ1;
-            }
-        }
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        increment = ((int64_t)absZ1 < 0);
+        break;
+    case float_round_to_zero:
+        increment = 0;
+        break;
+    case float_round_up:
+        increment = !zSign && absZ1;
+        break;
+    case float_round_down:
+        increment = zSign && absZ1;
+        break;
+    default:
+        abort();
     }
     if (increment) {
         ++absZ0;
@@ -368,20 +378,22 @@ static float32 roundAndPackFloat32(flag zSign, int_fast16_t zExp, uint32_t zSig
 
     roundingMode = STATUS(float_rounding_mode);
     roundNearestEven = ( roundingMode == float_round_nearest_even );
-    roundIncrement = 0x40;
-    if ( ! roundNearestEven ) {
-        if ( roundingMode == float_round_to_zero ) {
-            roundIncrement = 0;
-        }
-        else {
-            roundIncrement = 0x7F;
-            if ( zSign ) {
-                if ( roundingMode == float_round_up ) roundIncrement = 0;
-            }
-            else {
-                if ( roundingMode == float_round_down ) roundIncrement = 0;
-            }
-        }
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        roundIncrement = 0x40;
+        break;
+    case float_round_to_zero:
+        roundIncrement = 0;
+        break;
+    case float_round_up:
+        roundIncrement = zSign ? 0 : 0x7f;
+        break;
+    case float_round_down:
+        roundIncrement = zSign ? 0x7f : 0;
+        break;
+    default:
+        abort();
+        break;
     }
     roundBits = zSig & 0x7F;
     if ( 0xFD <= (uint16_t) zExp ) {
@@ -550,20 +562,21 @@ static float64 roundAndPackFloat64(flag zSign, int_fast16_t zExp, uint64_t zSig
 
     roundingMode = STATUS(float_rounding_mode);
     roundNearestEven = ( roundingMode == float_round_nearest_even );
-    roundIncrement = 0x200;
-    if ( ! roundNearestEven ) {
-        if ( roundingMode == float_round_to_zero ) {
-            roundIncrement = 0;
-        }
-        else {
-            roundIncrement = 0x3FF;
-            if ( zSign ) {
-                if ( roundingMode == float_round_up ) roundIncrement = 0;
-            }
-            else {
-                if ( roundingMode == float_round_down ) roundIncrement = 0;
-            }
-        }
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        roundIncrement = 0x200;
+        break;
+    case float_round_to_zero:
+        roundIncrement = 0;
+        break;
+    case float_round_up:
+        roundIncrement = zSign ? 0 : 0x3ff;
+        break;
+    case float_round_down:
+        roundIncrement = zSign ? 0x3ff : 0;
+        break;
+    default:
+        abort();
     }
     roundBits = zSig & 0x3FF;
     if ( 0x7FD <= (uint16_t) zExp ) {
@@ -733,19 +746,20 @@ static floatx80
         goto precision80;
     }
     zSig0 |= ( zSig1 != 0 );
-    if ( ! roundNearestEven ) {
-        if ( roundingMode == float_round_to_zero ) {
-            roundIncrement = 0;
-        }
-        else {
-            roundIncrement = roundMask;
-            if ( zSign ) {
-                if ( roundingMode == float_round_up ) roundIncrement = 0;
-            }
-            else {
-                if ( roundingMode == float_round_down ) roundIncrement = 0;
-            }
-        }
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        break;
+    case float_round_to_zero:
+        roundIncrement = 0;
+        break;
+    case float_round_up:
+        roundIncrement = zSign ? 0 : roundMask;
+        break;
+    case float_round_down:
+        roundIncrement = zSign ? roundMask : 0;
+        break;
+    default:
+        abort();
     }
     roundBits = zSig0 & roundMask;
     if ( 0x7FFD <= (uint32_t) ( zExp - 1 ) ) {
@@ -792,19 +806,21 @@ static floatx80
     if ( zSig0 == 0 ) zExp = 0;
     return packFloatx80( zSign, zExp, zSig0 );
  precision80:
-    increment = ( (int64_t) zSig1 < 0 );
-    if ( ! roundNearestEven ) {
-        if ( roundingMode == float_round_to_zero ) {
-            increment = 0;
-        }
-        else {
-            if ( zSign ) {
-                increment = ( roundingMode == float_round_down ) && zSig1;
-            }
-            else {
-                increment = ( roundingMode == float_round_up ) && zSig1;
-            }
-        }
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        increment = ((int64_t)zSig1 < 0);
+        break;
+    case float_round_to_zero:
+        increment = 0;
+        break;
+    case float_round_up:
+        increment = !zSign && zSig1;
+        break;
+    case float_round_down:
+        increment = zSign && zSig1;
+        break;
+    default:
+        abort();
     }
     if ( 0x7FFD <= (uint32_t) ( zExp - 1 ) ) {
         if (    ( 0x7FFE < zExp )
@@ -834,16 +850,21 @@ static floatx80
             zExp = 0;
             if ( isTiny && zSig1 ) float_raise( float_flag_underflow STATUS_VAR);
             if ( zSig1 ) STATUS(float_exception_flags) |= float_flag_inexact;
-            if ( roundNearestEven ) {
-                increment = ( (int64_t) zSig1 < 0 );
-            }
-            else {
-                if ( zSign ) {
-                    increment = ( roundingMode == float_round_down ) && zSig1;
-                }
-                else {
-                    increment = ( roundingMode == float_round_up ) && zSig1;
-                }
+            switch (roundingMode) {
+            case float_round_nearest_even:
+                increment = ((int64_t)zSig1 < 0);
+                break;
+            case float_round_to_zero:
+                increment = 0;
+                break;
+            case float_round_up:
+                increment = !zSign && zSig1;
+                break;
+            case float_round_down:
+                increment = zSign && zSig1;
+                break;
+            default:
+                abort();
             }
             if ( increment ) {
                 ++zSig0;
@@ -1043,19 +1064,21 @@ static float128
 
     roundingMode = STATUS(float_rounding_mode);
     roundNearestEven = ( roundingMode == float_round_nearest_even );
-    increment = ( (int64_t) zSig2 < 0 );
-    if ( ! roundNearestEven ) {
-        if ( roundingMode == float_round_to_zero ) {
-            increment = 0;
-        }
-        else {
-            if ( zSign ) {
-                increment = ( roundingMode == float_round_down ) && zSig2;
-            }
-            else {
-                increment = ( roundingMode == float_round_up ) && zSig2;
-            }
-        }
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        increment = ((int64_t)zSig2 < 0);
+        break;
+    case float_round_to_zero:
+        increment = 0;
+        break;
+    case float_round_up:
+        increment = !zSign && zSig2;
+        break;
+    case float_round_down:
+        increment = zSign && zSig2;
+        break;
+    default:
+        abort();
     }
     if ( 0x7FFD <= (uint32_t) zExp ) {
         if (    ( 0x7FFD < zExp )
@@ -1103,16 +1126,21 @@ static float128
                 zSig0, zSig1, zSig2, - zExp, &zSig0, &zSig1, &zSig2 );
             zExp = 0;
             if ( isTiny && zSig2 ) float_raise( float_flag_underflow STATUS_VAR);
-            if ( roundNearestEven ) {
-                increment = ( (int64_t) zSig2 < 0 );
-            }
-            else {
-                if ( zSign ) {
-                    increment = ( roundingMode == float_round_down ) && zSig2;
-                }
-                else {
-                    increment = ( roundingMode == float_round_up ) && zSig2;
-                }
+            switch (roundingMode) {
+            case float_round_nearest_even:
+                increment = ((int64_t)zSig2 < 0);
+                break;
+            case float_round_to_zero:
+                increment = 0;
+                break;
+            case float_round_up:
+                increment = !zSign && zSig2;
+                break;
+            case float_round_down:
+                increment = zSign && zSig2;
+                break;
+            default:
+                abort();
             }
         }
     }
@@ -1751,7 +1779,6 @@ float32 float32_round_to_int( float32 a STATUS_PARAM)
     flag aSign;
     int_fast16_t aExp;
     uint32_t lastBitMask, roundBitsMask;
-    int8 roundingMode;
     uint32_t z;
     a = float32_squash_input_denormal(a STATUS_VAR);
 
@@ -1783,15 +1810,27 @@ float32 float32_round_to_int( float32 a STATUS_PARAM)
     lastBitMask <<= 0x96 - aExp;
     roundBitsMask = lastBitMask - 1;
     z = float32_val(a);
-    roundingMode = STATUS(float_rounding_mode);
-    if ( roundingMode == float_round_nearest_even ) {
+    switch (STATUS(float_rounding_mode)) {
+    case float_round_nearest_even:
         z += lastBitMask>>1;
-        if ( ( z & roundBitsMask ) == 0 ) z &= ~ lastBitMask;
-    }
-    else if ( roundingMode != float_round_to_zero ) {
-        if ( extractFloat32Sign( make_float32(z) ) ^ ( roundingMode == float_round_up ) ) {
+        if ((z & roundBitsMask) == 0) {
+            z &= ~lastBitMask;
+        }
+        break;
+    case float_round_to_zero:
+        break;
+    case float_round_up:
+        if (!extractFloat32Sign(make_float32(z))) {
+            z += roundBitsMask;
+        }
+        break;
+    case float_round_down:
+        if (extractFloat32Sign(make_float32(z))) {
             z += roundBitsMask;
         }
+        break;
+    default:
+        abort();
     }
     z &= ~ roundBitsMask;
     if ( z != float32_val(a) ) STATUS(float_exception_flags) |= float_flag_inexact;
@@ -3134,7 +3173,6 @@ static float32 roundAndPackFloat16(flag zSign, int_fast16_t zExp,
     int maxexp = ieee ? 29 : 30;
     uint32_t mask;
     uint32_t increment;
-    int8 roundingMode;
     bool rounding_bumps_exp;
     bool is_tiny = false;
 
@@ -3152,8 +3190,7 @@ static float32 roundAndPackFloat16(flag zSign, int_fast16_t zExp,
         mask = 0x00001fff;
     }
 
-    roundingMode = STATUS(float_rounding_mode);
-    switch (roundingMode) {
+    switch (STATUS(float_rounding_mode)) {
     case float_round_nearest_even:
         increment = (mask + 1) >> 1;
         if ((zSig & mask) == increment) {
@@ -3369,7 +3406,6 @@ float64 float64_round_to_int( float64 a STATUS_PARAM )
     flag aSign;
     int_fast16_t aExp;
     uint64_t lastBitMask, roundBitsMask;
-    int8 roundingMode;
     uint64_t z;
     a = float64_squash_input_denormal(a STATUS_VAR);
 
@@ -3402,15 +3438,27 @@ float64 float64_round_to_int( float64 a STATUS_PARAM )
     lastBitMask <<= 0x433 - aExp;
     roundBitsMask = lastBitMask - 1;
     z = float64_val(a);
-    roundingMode = STATUS(float_rounding_mode);
-    if ( roundingMode == float_round_nearest_even ) {
-        z += lastBitMask>>1;
-        if ( ( z & roundBitsMask ) == 0 ) z &= ~ lastBitMask;
-    }
-    else if ( roundingMode != float_round_to_zero ) {
-        if ( extractFloat64Sign( make_float64(z) ) ^ ( roundingMode == float_round_up ) ) {
+    switch (STATUS(float_rounding_mode)) {
+    case float_round_nearest_even:
+        z += lastBitMask >> 1;
+        if ((z & roundBitsMask) == 0) {
+            z &= ~lastBitMask;
+        }
+        break;
+    case float_round_to_zero:
+        break;
+    case float_round_up:
+        if (!extractFloat64Sign(make_float64(z))) {
+            z += roundBitsMask;
+        }
+        break;
+    case float_round_down:
+        if (extractFloat64Sign(make_float64(z))) {
             z += roundBitsMask;
         }
+        break;
+    default:
+        abort();
     }
     z &= ~ roundBitsMask;
     if ( z != float64_val(a) )
@@ -4638,7 +4686,6 @@ floatx80 floatx80_round_to_int( floatx80 a STATUS_PARAM )
     flag aSign;
     int32 aExp;
     uint64_t lastBitMask, roundBitsMask;
-    int8 roundingMode;
     floatx80 z;
 
     aExp = extractFloatx80Exp( a );
@@ -4679,15 +4726,27 @@ floatx80 floatx80_round_to_int( floatx80 a STATUS_PARAM )
     lastBitMask <<= 0x403E - aExp;
     roundBitsMask = lastBitMask - 1;
     z = a;
-    roundingMode = STATUS(float_rounding_mode);
-    if ( roundingMode == float_round_nearest_even ) {
+    switch (STATUS(float_rounding_mode)) {
+    case float_round_nearest_even:
         z.low += lastBitMask>>1;
-        if ( ( z.low & roundBitsMask ) == 0 ) z.low &= ~ lastBitMask;
-    }
-    else if ( roundingMode != float_round_to_zero ) {
-        if ( extractFloatx80Sign( z ) ^ ( roundingMode == float_round_up ) ) {
+        if ((z.low & roundBitsMask) == 0) {
+            z.low &= ~lastBitMask;
+        }
+        break;
+    case float_round_to_zero:
+        break;
+    case float_round_up:
+        if (!extractFloatx80Sign(z)) {
+            z.low += roundBitsMask;
+        }
+        break;
+    case float_round_down:
+        if (extractFloatx80Sign(z)) {
             z.low += roundBitsMask;
         }
+        break;
+    default:
+        abort();
     }
     z.low &= ~ roundBitsMask;
     if ( z.low == 0 ) {
@@ -5713,7 +5772,6 @@ float128 float128_round_to_int( float128 a STATUS_PARAM )
     flag aSign;
     int32 aExp;
     uint64_t lastBitMask, roundBitsMask;
-    int8 roundingMode;
     float128 z;
 
     aExp = extractFloat128Exp( a );
@@ -5730,8 +5788,8 @@ float128 float128_round_to_int( float128 a STATUS_PARAM )
         lastBitMask = ( lastBitMask<<( 0x406E - aExp ) )<<1;
         roundBitsMask = lastBitMask - 1;
         z = a;
-        roundingMode = STATUS(float_rounding_mode);
-        if ( roundingMode == float_round_nearest_even ) {
+        switch (STATUS(float_rounding_mode)) {
+        case float_round_nearest_even:
             if ( lastBitMask ) {
                 add128( z.high, z.low, 0, lastBitMask>>1, &z.high, &z.low );
                 if ( ( z.low & roundBitsMask ) == 0 ) z.low &= ~ lastBitMask;
@@ -5742,12 +5800,21 @@ float128 float128_round_to_int( float128 a STATUS_PARAM )
                     if ( (uint64_t) ( z.low<<1 ) == 0 ) z.high &= ~1;
                 }
             }
-        }
-        else if ( roundingMode != float_round_to_zero ) {
-            if (   extractFloat128Sign( z )
-                 ^ ( roundingMode == float_round_up ) ) {
-                add128( z.high, z.low, 0, roundBitsMask, &z.high, &z.low );
+            break;
+        case float_round_to_zero:
+            break;
+        case float_round_up:
+            if (!extractFloat128Sign(z)) {
+                add128(z.high, z.low, 0, roundBitsMask, &z.high, &z.low);
+            }
+            break;
+        case float_round_down:
+            if (extractFloat128Sign(z)) {
+                add128(z.high, z.low, 0, roundBitsMask, &z.high, &z.low);
             }
+            break;
+        default:
+            abort();
         }
         z.low &= ~ roundBitsMask;
     }
@@ -5781,19 +5848,29 @@ float128 float128_round_to_int( float128 a STATUS_PARAM )
         roundBitsMask = lastBitMask - 1;
         z.low = 0;
         z.high = a.high;
-        roundingMode = STATUS(float_rounding_mode);
-        if ( roundingMode == float_round_nearest_even ) {
+        switch (STATUS(float_rounding_mode)) {
+        case float_round_nearest_even:
             z.high += lastBitMask>>1;
             if ( ( ( z.high & roundBitsMask ) | a.low ) == 0 ) {
                 z.high &= ~ lastBitMask;
             }
-        }
-        else if ( roundingMode != float_round_to_zero ) {
-            if (   extractFloat128Sign( z )
-                 ^ ( roundingMode == float_round_up ) ) {
+            break;
+        case float_round_to_zero:
+            break;
+        case float_round_up:
+            if (!extractFloat128Sign(z)) {
                 z.high |= ( a.low != 0 );
                 z.high += roundBitsMask;
             }
+            break;
+        case float_round_down:
+            if (extractFloat128Sign(z)) {
+                z.high |= (a.low != 0);
+                z.high += roundBitsMask;
+            }
+            break;
+        default:
+            abort();
         }
         z.high &= ~ roundBitsMask;
     }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 16/22] softfloat: Add support for ties-away rounding
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (14 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 15/22] softfloat: Refactor code handling various rounding modes Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 17/22] target-ppc: Fix and enable fri[mnpz] Tom Musta
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc, Peter Maydell

From: Peter Maydell <peter.maydell@linaro.org>

IEEE754-2008 specifies a new rounding mode:

"roundTiesToAway: the floating-point number nearest to the infinitely
precise result shall be delivered; if the two nearest floating-point
numbers bracketing an unrepresentable infinitely precise result are
equally near, the one with larger magnitude shall be delivered."

Implement this new mode (it is needed for ARM). The general principle
is that the required code is exactly like the ties-to-even code,
except that we do not need to do the "in case of exact tie clear LSB
to round-to-even", because the rounding operation naturally causes
the exact tie to round up in magnitude.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1389013881-15726-17-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 fpu/softfloat.c         |   54 +++++++++++++++++++++++++++++++++++++++++++++++
 include/fpu/softfloat.h |    3 +-
 2 files changed, 56 insertions(+), 1 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 4390e1d..cbad9a7 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -126,6 +126,7 @@ static int32 roundAndPackInt32( flag zSign, uint64_t absZ STATUS_PARAM)
     roundNearestEven = ( roundingMode == float_round_nearest_even );
     switch (roundingMode) {
     case float_round_nearest_even:
+    case float_round_ties_away:
         roundIncrement = 0x40;
         break;
     case float_round_to_zero:
@@ -175,6 +176,7 @@ static int64 roundAndPackInt64( flag zSign, uint64_t absZ0, uint64_t absZ1 STATU
     roundNearestEven = ( roundingMode == float_round_nearest_even );
     switch (roundingMode) {
     case float_round_nearest_even:
+    case float_round_ties_away:
         increment = ((int64_t) absZ1 < 0);
         break;
     case float_round_to_zero:
@@ -228,6 +230,7 @@ static int64 roundAndPackUint64(flag zSign, uint64_t absZ0,
     roundNearestEven = (roundingMode == float_round_nearest_even);
     switch (roundingMode) {
     case float_round_nearest_even:
+    case float_round_ties_away:
         increment = ((int64_t)absZ1 < 0);
         break;
     case float_round_to_zero:
@@ -380,6 +383,7 @@ static float32 roundAndPackFloat32(flag zSign, int_fast16_t zExp, uint32_t zSig
     roundNearestEven = ( roundingMode == float_round_nearest_even );
     switch (roundingMode) {
     case float_round_nearest_even:
+    case float_round_ties_away:
         roundIncrement = 0x40;
         break;
     case float_round_to_zero:
@@ -564,6 +568,7 @@ static float64 roundAndPackFloat64(flag zSign, int_fast16_t zExp, uint64_t zSig
     roundNearestEven = ( roundingMode == float_round_nearest_even );
     switch (roundingMode) {
     case float_round_nearest_even:
+    case float_round_ties_away:
         roundIncrement = 0x200;
         break;
     case float_round_to_zero:
@@ -748,6 +753,7 @@ static floatx80
     zSig0 |= ( zSig1 != 0 );
     switch (roundingMode) {
     case float_round_nearest_even:
+    case float_round_ties_away:
         break;
     case float_round_to_zero:
         roundIncrement = 0;
@@ -808,6 +814,7 @@ static floatx80
  precision80:
     switch (roundingMode) {
     case float_round_nearest_even:
+    case float_round_ties_away:
         increment = ((int64_t)zSig1 < 0);
         break;
     case float_round_to_zero:
@@ -852,6 +859,7 @@ static floatx80
             if ( zSig1 ) STATUS(float_exception_flags) |= float_flag_inexact;
             switch (roundingMode) {
             case float_round_nearest_even:
+            case float_round_ties_away:
                 increment = ((int64_t)zSig1 < 0);
                 break;
             case float_round_to_zero:
@@ -1066,6 +1074,7 @@ static float128
     roundNearestEven = ( roundingMode == float_round_nearest_even );
     switch (roundingMode) {
     case float_round_nearest_even:
+    case float_round_ties_away:
         increment = ((int64_t)zSig2 < 0);
         break;
     case float_round_to_zero:
@@ -1128,6 +1137,7 @@ static float128
             if ( isTiny && zSig2 ) float_raise( float_flag_underflow STATUS_VAR);
             switch (roundingMode) {
             case float_round_nearest_even:
+            case float_round_ties_away:
                 increment = ((int64_t)zSig2 < 0);
                 break;
             case float_round_to_zero:
@@ -1799,6 +1809,11 @@ float32 float32_round_to_int( float32 a STATUS_PARAM)
                 return packFloat32( aSign, 0x7F, 0 );
             }
             break;
+        case float_round_ties_away:
+            if (aExp == 0x7E) {
+                return packFloat32(aSign, 0x7F, 0);
+            }
+            break;
          case float_round_down:
             return make_float32(aSign ? 0xBF800000 : 0);
          case float_round_up:
@@ -1817,6 +1832,9 @@ float32 float32_round_to_int( float32 a STATUS_PARAM)
             z &= ~lastBitMask;
         }
         break;
+    case float_round_ties_away:
+        z += lastBitMask >> 1;
+        break;
     case float_round_to_zero:
         break;
     case float_round_up:
@@ -3197,6 +3215,9 @@ static float32 roundAndPackFloat16(flag zSign, int_fast16_t zExp,
             increment = zSig & (increment << 1);
         }
         break;
+    case float_round_ties_away:
+        increment = (mask + 1) >> 1;
+        break;
     case float_round_up:
         increment = zSign ? 0 : mask;
         break;
@@ -3426,6 +3447,11 @@ float64 float64_round_to_int( float64 a STATUS_PARAM )
                 return packFloat64( aSign, 0x3FF, 0 );
             }
             break;
+        case float_round_ties_away:
+            if (aExp == 0x3FE) {
+                return packFloat64(aSign, 0x3ff, 0);
+            }
+            break;
          case float_round_down:
             return make_float64(aSign ? LIT64( 0xBFF0000000000000 ) : 0);
          case float_round_up:
@@ -3445,6 +3471,9 @@ float64 float64_round_to_int( float64 a STATUS_PARAM )
             z &= ~lastBitMask;
         }
         break;
+    case float_round_ties_away:
+        z += lastBitMask >> 1;
+        break;
     case float_round_to_zero:
         break;
     case float_round_up:
@@ -4710,6 +4739,11 @@ floatx80 floatx80_round_to_int( floatx80 a STATUS_PARAM )
                     packFloatx80( aSign, 0x3FFF, LIT64( 0x8000000000000000 ) );
             }
             break;
+        case float_round_ties_away:
+            if (aExp == 0x3FFE) {
+                return packFloatx80(aSign, 0x3FFF, LIT64(0x8000000000000000));
+            }
+            break;
          case float_round_down:
             return
                   aSign ?
@@ -4733,6 +4767,9 @@ floatx80 floatx80_round_to_int( floatx80 a STATUS_PARAM )
             z.low &= ~lastBitMask;
         }
         break;
+    case float_round_ties_away:
+        z.low += lastBitMask >> 1;
+        break;
     case float_round_to_zero:
         break;
     case float_round_up:
@@ -5801,6 +5838,15 @@ float128 float128_round_to_int( float128 a STATUS_PARAM )
                 }
             }
             break;
+        case float_round_ties_away:
+            if (lastBitMask) {
+                add128(z.high, z.low, 0, lastBitMask >> 1, &z.high, &z.low);
+            } else {
+                if ((int64_t) z.low < 0) {
+                    ++z.high;
+                }
+            }
+            break;
         case float_round_to_zero:
             break;
         case float_round_up:
@@ -5832,6 +5878,11 @@ float128 float128_round_to_int( float128 a STATUS_PARAM )
                     return packFloat128( aSign, 0x3FFF, 0, 0 );
                 }
                 break;
+            case float_round_ties_away:
+                if (aExp == 0x3FFE) {
+                    return packFloat128(aSign, 0x3FFF, 0, 0);
+                }
+                break;
              case float_round_down:
                 return
                       aSign ? packFloat128( 1, 0x3FFF, 0, 0 )
@@ -5855,6 +5906,9 @@ float128 float128_round_to_int( float128 a STATUS_PARAM )
                 z.high &= ~ lastBitMask;
             }
             break;
+        case float_round_ties_away:
+            z.high += lastBitMask>>1;
+            break;
         case float_round_to_zero:
             break;
         case float_round_up:
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 080b36d..18ef187 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -152,7 +152,8 @@ enum {
     float_round_nearest_even = 0,
     float_round_down         = 1,
     float_round_up           = 2,
-    float_round_to_zero      = 3
+    float_round_to_zero      = 3,
+    float_round_ties_away    = 4,
 };
 
 /*----------------------------------------------------------------------------
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 17/22] target-ppc: Fix and enable fri[mnpz]
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (15 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 16/22] softfloat: Add support for ties-away rounding Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-08 18:32   ` Richard Henderson
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 18/22] target-ppc: Add Flag for Power ISA V2.06 Floating Point Test Instructions Tom Musta
                   ` (5 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

The fri* series of instructions was introduced prior to ISA 2.06 and
is supported on Power7 and Power8 hardware.  However, the instruction
is still considered illegal in the P7 and P8 QEMU emulation models.
This patch enables these instructions for the P7 and P8 machines.

Also, the existing helper is modified to correctly handle some of
the boundary cases (NaNs and the inexact flag).

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V4: frin changed to use "ties away" rounding mode per Richard Henderson's
review.  Modified NaN handling.  Proper handling of stickiness of
the inexact flag.  Added to P7+ model.

 target-ppc/fpu_helper.c     |   18 +++++++++++-------
 target-ppc/translate_init.c |    3 +++
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 4985f53..bb84852 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -669,24 +669,28 @@ static inline uint64_t do_fri(CPUPPCState *env, uint64_t arg,
 
     if (unlikely(float64_is_signaling_nan(farg.d))) {
         /* sNaN round */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN |
-                                        POWERPC_EXCP_FP_VXCVI, 1);
-    } else if (unlikely(float64_is_quiet_nan(farg.d) ||
-                        float64_is_infinity(farg.d))) {
-        /* qNan / infinity round */
-        farg.ll = fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXCVI, 1);
+        fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
+        farg.ll = arg | 0x0008000000000000ul;
     } else {
+        int inexact = get_float_exception_flags(&env->fp_status) &
+                      float_flag_inexact;
         set_float_rounding_mode(rounding_mode, &env->fp_status);
         farg.ll = float64_round_to_int(farg.d, &env->fp_status);
         /* Restore rounding mode from FPSCR */
         fpscr_set_rounding_mode(env);
+
+        /* fri* does not set FPSCR[XX] */
+        if (!inexact) {
+            env->fp_status.float_exception_flags &= ~float_flag_inexact;
+        }
     }
+    helper_float_check_status(env);
     return farg.ll;
 }
 
 uint64_t helper_frin(CPUPPCState *env, uint64_t arg)
 {
-    return do_fri(env, arg, float_round_nearest_even);
+    return do_fri(env, arg, float_round_ties_away);
 }
 
 uint64_t helper_friz(CPUPPCState *env, uint64_t arg)
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 3fb849b..ad53fa7 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7230,6 +7230,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
                        PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
                        PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
                        PPC_FLOAT_STFIWX |
+                       PPC_FLOAT_EXT |
                        PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
                        PPC_MEM_SYNC | PPC_MEM_EIEIO |
                        PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
@@ -7270,6 +7271,7 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
                        PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
                        PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
                        PPC_FLOAT_STFIWX |
+                       PPC_FLOAT_EXT |
                        PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
                        PPC_MEM_SYNC | PPC_MEM_EIEIO |
                        PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
@@ -7310,6 +7312,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
                        PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
                        PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
                        PPC_FLOAT_STFIWX |
+                       PPC_FLOAT_EXT |
                        PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
                        PPC_MEM_SYNC | PPC_MEM_EIEIO |
                        PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 18/22] target-ppc: Add Flag for Power ISA V2.06 Floating Point Test Instructions
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (16 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 17/22] target-ppc: Fix and enable fri[mnpz] Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-08 18:33   ` Richard Henderson
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 19/22] target-ppc: Add ISA 2.06 ftdiv Instruction Tom Musta
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds a flag for Floating Point Test instructions that were
introduced in Power ISA V2.06B.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V4: Split single flag into multiple flags per discussion with
Alex Graf and Scott Wood.  Added flag to Power7+ model.

 target-ppc/cpu.h            |    4 +++-
 target-ppc/translate_init.c |    9 ++++++---
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 4a68ce2..2b8c205 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1885,12 +1885,14 @@ enum {
     PPC2_ATOMIC_ISA206 = 0x0000000000000200ULL,
     /* ISA 2.06B floating point integer conversion                           */
     PPC2_FP_CVT_ISA206 = 0x0000000000000400ULL,
+    /* ISA 2.06B floating point test instructions                            */
+    PPC2_FP_TST_ISA206 = 0x0000000000000800ULL,
 
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
                         PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
                         PPC2_DIVE_ISA206 | PPC2_ATOMIC_ISA206 | \
-                        PPC2_FP_CVT_ISA206)
+                        PPC2_FP_CVT_ISA206 | PPC2_FP_TST_ISA206)
 };
 
 /*****************************************************************************/
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index ad53fa7..bcaee6c 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7239,7 +7239,8 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
                         PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
-                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206;
+                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206 |
+                        PPC2_FP_TST_ISA206;
     pcc->msr_mask = 0x800000000284FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7280,7 +7281,8 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
                         PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
-                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206;
+                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206 |
+                        PPC2_FP_TST_ISA206;
     pcc->msr_mask = 0x800000000204FF37ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
@@ -7321,7 +7323,8 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
                        PPC_POPCNTB | PPC_POPCNTWD;
     pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX |
                         PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
-                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206;
+                        PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206 |
+                        PPC2_FP_TST_ISA206;
     pcc->msr_mask = 0x800000000284FF36ULL;
     pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 19/22] target-ppc: Add ISA 2.06 ftdiv Instruction
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (17 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 18/22] target-ppc: Add Flag for Power ISA V2.06 Floating Point Test Instructions Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-08 18:34   ` Richard Henderson
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 20/22] target-ppc: Add ISA 2.06 ftsqrt Tom Musta
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Floating Point Test for Divide instruction which
was introduced in Power ISA 2.06B.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V4: Using the newly added PPC2_FP_TST_ISA206 flag.  Modified helper
signature per Richard Henderson's review.

 target-ppc/fpu_helper.c |   56 ++++++++++++++++++++++++++++++++++++++--------
 target-ppc/helper.h     |    2 +
 target-ppc/translate.c  |   13 +++++++++++
 3 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index bb84852..514a5c9 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -50,6 +50,16 @@ static inline int isden(float64 d)
     return ((u.ll >> 52) & 0x7FF) == 0;
 }
 
+static inline int ppc_float32_get_unbiased_exp(float32 f)
+{
+    return ((f >> 23) & 0xFF) - 127;
+}
+
+static inline int ppc_float64_get_unbiased_exp(float64 f)
+{
+    return ((f >> 52) & 0x7FF) - 1023;
+}
+
 uint32_t helper_compute_fprf(CPUPPCState *env, uint64_t arg, uint32_t set_fprf)
 {
     CPU_DoubleU farg;
@@ -993,6 +1003,42 @@ uint64_t helper_fsel(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
     }
 }
 
+uint32_t helper_ftdiv(uint64_t fra, uint64_t frb)
+{
+    int fe_flag = 0;
+    int fg_flag = 0;
+
+    if (unlikely(float64_is_infinity(fra) ||
+                 float64_is_infinity(frb) ||
+                 float64_is_zero(frb))) {
+        fe_flag = 1;
+        fg_flag = 1;
+    } else {
+        int e_a = ppc_float64_get_unbiased_exp(fra);
+        int e_b = ppc_float64_get_unbiased_exp(frb);
+
+        if (unlikely(float64_is_any_nan(fra) ||
+                     float64_is_any_nan(frb))) {
+            fe_flag = 1;
+        } else if ((e_b <= -1022) || (e_b >= 1021)) {
+            fe_flag = 1;
+        } else if (!float64_is_zero(fra) &&
+                   (((e_a - e_b) >= 1023) ||
+                    ((e_a - e_b) <= -1021) ||
+                    (e_a <= -970))) {
+            fe_flag = 1;
+        }
+
+        if (unlikely(float64_is_zero_or_denormal(frb))) {
+            /* XB is not zero because of the above check and */
+            /* so must be denormalized.                      */
+            fg_flag = 1;
+        }
+    }
+
+    return 0x8 | (fg_flag ? 4 : 0) | (fe_flag ? 2 : 0);
+}
+
 void helper_fcmpu(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                   uint32_t crfD)
 {
@@ -2021,16 +2067,6 @@ VSX_RSQRTE(xsrsqrtesp, 1, float64, f64, 1, 1)
 VSX_RSQRTE(xvrsqrtedp, 2, float64, f64, 0, 0)
 VSX_RSQRTE(xvrsqrtesp, 4, float32, f32, 0, 0)
 
-static inline int ppc_float32_get_unbiased_exp(float32 f)
-{
-    return ((f >> 23) & 0xFF) - 127;
-}
-
-static inline int ppc_float64_get_unbiased_exp(float64 f)
-{
-    return ((f >> 52) & 0x7FF) - 1023;
-}
-
 /* VSX_TDIV - VSX floating point test for divide
  *   op    - instruction mnemonic
  *   nels  - number of elements (1, 2 or 4)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 037871b..80ad57f 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -99,6 +99,8 @@ DEF_HELPER_2(fres, i64, env, i64)
 DEF_HELPER_2(frsqrte, i64, env, i64)
 DEF_HELPER_4(fsel, i64, env, i64, i64, i64)
 
+DEF_HELPER_FLAGS_2(ftdiv, TCG_CALL_NO_RWG_SE, i32, i64, i64)
+
 #define dh_alias_avr ptr
 #define dh_ctype_avr ppc_avr_t *
 #define dh_is_signed_avr dh_is_signed_ptr
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index a9b9032..e6cae60 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2238,6 +2238,18 @@ GEN_FLOAT_B(rip, 0x08, 0x0E, 1, PPC_FLOAT_EXT);
 /* frim */
 GEN_FLOAT_B(rim, 0x08, 0x0F, 1, PPC_FLOAT_EXT);
 
+static void gen_ftdiv(DisasContext *ctx)
+{
+    if (unlikely(!ctx->fpu_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_FPU);
+        return;
+    }
+    gen_helper_ftdiv(cpu_crf[crfD(ctx->opcode)], cpu_fpr[rA(ctx->opcode)],
+                     cpu_fpr[rB(ctx->opcode)]);
+}
+
+
+
 /***                         Floating-Point compare                        ***/
 
 /* fcmpo */
@@ -9759,6 +9771,7 @@ GEN_FLOAT_ACB(madd, 0x1D, 1, PPC_FLOAT),
 GEN_FLOAT_ACB(msub, 0x1C, 1, PPC_FLOAT),
 GEN_FLOAT_ACB(nmadd, 0x1F, 1, PPC_FLOAT),
 GEN_FLOAT_ACB(nmsub, 0x1E, 1, PPC_FLOAT),
+GEN_HANDLER_E(ftdiv, 0x3F, 0x00, 0x04, 1, PPC_NONE, PPC2_FP_TST_ISA206),
 GEN_FLOAT_B(ctiw, 0x0E, 0x00, 0, PPC_FLOAT),
 GEN_HANDLER_E(fctiwu, 0x3F, 0x0E, 0x04, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(ctiwz, 0x0F, 0x00, 0, PPC_FLOAT),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 20/22] target-ppc: Add ISA 2.06 ftsqrt
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (18 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 19/22] target-ppc: Add ISA 2.06 ftdiv Instruction Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-08 18:35   ` Richard Henderson
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 21/22] target-ppc: Enable frsqrtes on Power7 and Power8 Tom Musta
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Floating Point Test for Square Root instruction
which was introduced in Power ISA 2.06.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
V4: Using the newly added PPC2_FP_TST_ISA206 flag.  Modified helper
signature per Richard Henderson's review.

 target-ppc/fpu_helper.c |   31 +++++++++++++++++++++++++++++++
 target-ppc/helper.h     |    1 +
 target-ppc/translate.c  |   10 ++++++++++
 3 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 514a5c9..5a330a2 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1039,6 +1039,37 @@ uint32_t helper_ftdiv(uint64_t fra, uint64_t frb)
     return 0x8 | (fg_flag ? 4 : 0) | (fe_flag ? 2 : 0);
 }
 
+uint32_t helper_ftsqrt(uint64_t frb)
+{
+    int fe_flag = 0;
+    int fg_flag = 0;
+
+    if (unlikely(float64_is_infinity(frb) || float64_is_zero(frb))) {
+        fe_flag = 1;
+        fg_flag = 1;
+    } else {
+        int e_b = ppc_float64_get_unbiased_exp(frb);
+
+        if (unlikely(float64_is_any_nan(frb))) {
+            fe_flag = 1;
+        } else if (unlikely(float64_is_zero(frb))) {
+            fe_flag = 1;
+        } else if (unlikely(float64_is_neg(frb))) {
+            fe_flag = 1;
+        } else if (!float64_is_zero(frb) && (e_b <= (-1022+52))) {
+            fe_flag = 1;
+        }
+
+        if (unlikely(float64_is_zero_or_denormal(frb))) {
+            /* XB is not zero because of the above check and */
+            /* therefore must be denormalized.               */
+            fg_flag = 1;
+        }
+    }
+
+    return 0x8 | (fg_flag ? 4 : 0) | (fe_flag ? 2 : 0);
+}
+
 void helper_fcmpu(CPUPPCState *env, uint64_t arg1, uint64_t arg2,
                   uint32_t crfD)
 {
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 80ad57f..8d5a2ce 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -100,6 +100,7 @@ DEF_HELPER_2(frsqrte, i64, env, i64)
 DEF_HELPER_4(fsel, i64, env, i64, i64, i64)
 
 DEF_HELPER_FLAGS_2(ftdiv, TCG_CALL_NO_RWG_SE, i32, i64, i64)
+DEF_HELPER_FLAGS_1(ftsqrt, TCG_CALL_NO_RWG_SE, i32, i64)
 
 #define dh_alias_avr ptr
 #define dh_ctype_avr ppc_avr_t *
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index e6cae60..5b2143a 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2248,6 +2248,15 @@ static void gen_ftdiv(DisasContext *ctx)
                      cpu_fpr[rB(ctx->opcode)]);
 }
 
+static void gen_ftsqrt(DisasContext *ctx)
+{
+    if (unlikely(!ctx->fpu_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_FPU);
+        return;
+    }
+    gen_helper_ftsqrt(cpu_crf[crfD(ctx->opcode)], cpu_fpr[rB(ctx->opcode)]);
+}
+
 
 
 /***                         Floating-Point compare                        ***/
@@ -9772,6 +9781,7 @@ GEN_FLOAT_ACB(msub, 0x1C, 1, PPC_FLOAT),
 GEN_FLOAT_ACB(nmadd, 0x1F, 1, PPC_FLOAT),
 GEN_FLOAT_ACB(nmsub, 0x1E, 1, PPC_FLOAT),
 GEN_HANDLER_E(ftdiv, 0x3F, 0x00, 0x04, 1, PPC_NONE, PPC2_FP_TST_ISA206),
+GEN_HANDLER_E(ftsqrt, 0x3F, 0x00, 0x05, 1, PPC_NONE, PPC2_FP_TST_ISA206),
 GEN_FLOAT_B(ctiw, 0x0E, 0x00, 0, PPC_FLOAT),
 GEN_HANDLER_E(fctiwu, 0x3F, 0x0E, 0x04, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(ctiwz, 0x0F, 0x00, 0, PPC_FLOAT),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 21/22] target-ppc: Enable frsqrtes on Power7 and Power8
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (19 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 20/22] target-ppc: Add ISA 2.06 ftsqrt Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 22/22] target-ppc: Add ISA2.06 lfiwzx Instruction Tom Musta
  2014-01-27 16:01 ` [Qemu-devel] [Qemu-ppc] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Alexander Graf
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

The frsqrtes instruction was introduced prior to ISA 2.06 and is
support on both the Power7 and Power8 processors.  However, this
instruction is handled as illegal in the current QEMU emulation
machines.  This patch enables the existing implemention of frsqrtes
in the P7 and P8 machines.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
 target-ppc/translate_init.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index bcaee6c..a83c964 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7229,6 +7229,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
     pcc->insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
                        PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
                        PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
+                       PPC_FLOAT_FRSQRTES |
                        PPC_FLOAT_STFIWX |
                        PPC_FLOAT_EXT |
                        PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
@@ -7271,6 +7272,7 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
     pcc->insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
                        PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
                        PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
+                       PPC_FLOAT_FRSQRTES |
                        PPC_FLOAT_STFIWX |
                        PPC_FLOAT_EXT |
                        PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
@@ -7313,6 +7315,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
     pcc->insns_flags = PPC_INSNS_BASE | PPC_STRING | PPC_MFTB |
                        PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
                        PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
+                       PPC_FLOAT_FRSQRTES |
                        PPC_FLOAT_STFIWX |
                        PPC_FLOAT_EXT |
                        PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [V4 PATCH 22/22] target-ppc: Add ISA2.06 lfiwzx Instruction
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (20 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 21/22] target-ppc: Enable frsqrtes on Power7 and Power8 Tom Musta
@ 2014-01-07 16:06 ` Tom Musta
  2014-01-27 16:01 ` [Qemu-devel] [Qemu-ppc] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Alexander Graf
  22 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-01-07 16:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: Tom Musta, qemu-ppc

This patch adds the Load Floating Point as Integer Word and
Zero Indexed (lfiwzx) instruction which was introduced in
Power ISA 2.06.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <address@hidden>
---
V4: Using the PPC2_FP_CVT_ISA206 flag.

 target-ppc/translate.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 5b2143a..796a103 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -3482,6 +3482,20 @@ static void gen_lfiwax(DisasContext *ctx)
     tcg_temp_free(t0);
 }
 
+/* lfiwzx */
+static void gen_lfiwzx(DisasContext *ctx)
+{
+    TCGv EA;
+    if (unlikely(!ctx->fpu_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_FPU);
+        return;
+    }
+    gen_set_access_type(ctx, ACCESS_FLOAT);
+    EA = tcg_temp_new();
+    gen_addr_reg_index(ctx, EA);
+    gen_qemu_ld32u_i64(ctx, cpu_fpr[rD(ctx->opcode)], EA);
+    tcg_temp_free(EA);
+}
 /***                         Floating-point store                          ***/
 #define GEN_STF(name, stop, opc, type)                                        \
 static void glue(gen_, name)(DisasContext *ctx)                                       \
@@ -9887,6 +9901,7 @@ GEN_LDXF(name, ldop, 0x17, op | 0x00, type)
 GEN_LDFS(lfd, ld64, 0x12, PPC_FLOAT)
 GEN_LDFS(lfs, ld32fs, 0x10, PPC_FLOAT)
 GEN_HANDLER_E(lfiwax, 0x1f, 0x17, 0x1a, 0x00000001, PPC_NONE, PPC2_ISA205),
+GEN_HANDLER_E(lfiwzx, 0x1f, 0x17, 0x1b, 0x1, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_HANDLER_E(lfdp, 0x39, 0xFF, 0xFF, 0x00200003, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(lfdpx, 0x1F, 0x17, 0x18, 0x00200001, PPC_NONE, PPC2_ISA205),
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 05/22] target-ppc: Add ISA 2.06 divweu[o] Instructions
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 05/22] target-ppc: Add ISA 2.06 divweu[o] Instructions Tom Musta
@ 2014-01-08 18:26   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:26 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:05 AM, Tom Musta wrote:
> This patch addes the Unsigned Divide Word Extended instructions
> which were introduced in Power ISA 2.06B.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V2: Eliminating extraneous code in the overflow case per comments
> from Richard Henderson.  Fixed corner case bug in divweu (check
> for (RA) >= (RB)).
> 
> V4: Using newly added PPC2_DIVE_ISA206 flag.  Converted to helper
> per Richard Henderson's review.
> 
>  target-ppc/helper.h     |    1 +
>  target-ppc/int_helper.c |   31 +++++++++++++++++++++++++++++++
>  target-ppc/translate.c  |    5 +++++
>  3 files changed, 37 insertions(+), 0 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 02/22] target-ppc: Add Flag for ISA2.06 Divide Extended Instructions
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 02/22] target-ppc: Add Flag for ISA2.06 Divide Extended Instructions Tom Musta
@ 2014-01-08 18:27   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:27 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:05 AM, Tom Musta wrote:
> This patch adds a flag for the Divide Extended instructions that
> were introduced in Power ISA V2.06B.  The flag is added to the
> Power7 and Power8 models.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V4: Split into new and separate patch.  Added flag to Power7+
> model.
> 
>  target-ppc/cpu.h            |    5 ++++-
>  target-ppc/translate_init.c |    6 +++---
>  2 files changed, 7 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 06/22] target-ppc: Add ISA 2.06 divwe[o] Instructions
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 06/22] target-ppc: Add ISA 2.06 divwe[o] Instructions Tom Musta
@ 2014-01-08 18:28   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:28 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:05 AM, Tom Musta wrote:
> This patch addes the signed Divide Word Extended instructions
> which were introduced in Power ISA 2.06B.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V2: Eliminating extraneous code in the overflow case per comments
> from Richard Henderson.  Fixed corner case bug in divweu (check
> for (RA) >= (RB)).
> 
> V4: Using newly added PPC2_DIVE_ISA206 flag.  Converted to helper
> per Richard Henderson's review.
> 
>  target-ppc/helper.h     |    1 +
>  target-ppc/int_helper.c |   32 ++++++++++++++++++++++++++++++++
>  target-ppc/translate.c  |    4 ++++
>  3 files changed, 37 insertions(+), 0 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 07/22] target-ppc: Add Flag for ISA2.06 Atomic Instructions
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 07/22] target-ppc: Add Flag for ISA2.06 Atomic Instructions Tom Musta
@ 2014-01-08 18:28   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:28 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:05 AM, Tom Musta wrote:
> This patch adds a flag for the atomic instructions introduced
> in Power ISA V2.06B.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V4: Split into new and separate patch.  Added to Power7+ model.
> 
>  target-ppc/cpu.h            |    5 ++++-
>  target-ppc/translate_init.c |    9 ++++++---
>  2 files changed, 10 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 10/22] target-ppc: Add Flag for ISA V2.06 Floating Point Conversion
  2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 10/22] target-ppc: Add Flag for ISA V2.06 Floating Point Conversion Tom Musta
@ 2014-01-08 18:29   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:29 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:05 AM, Tom Musta wrote:
> This patch adds a flag for the floating point conversion instructions
> introduced in Power ISA 2.06B.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V4: Split single flag into multiple flags per discussion with
> Alex Graf and Scott Wood.  Added to Power7+ config.
> 
>  target-ppc/cpu.h            |    5 ++++-
>  target-ppc/translate_init.c |    6 +++---
>  2 files changed, 7 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 12/22] target-ppc: Add ISA 2.06 fcfid[u][s] Instructions
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 12/22] target-ppc: Add ISA 2.06 fcfid[u][s] Instructions Tom Musta
@ 2014-01-08 18:31   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:31 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:06 AM, Tom Musta wrote:
> This patch adds the fcfids, fcfidu and fcfidus instructions which
> were introduced in Power ISA 2.06B.  A common macro is provided to
> eliminate repetitious code, and the existing fcfid instruction is
> refactored to use this macro.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V4: Using the newly added PPC2_FP_CVT_ISA206 flag.  Performed
> direct conversion to single precision per Richard Henderson's
> review.
> 
>  target-ppc/fpu_helper.c |   24 +++++++++++++++++-------
>  target-ppc/helper.h     |    3 +++
>  target-ppc/translate.c  |    9 +++++++++
>  3 files changed, 29 insertions(+), 7 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 17/22] target-ppc: Fix and enable fri[mnpz]
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 17/22] target-ppc: Fix and enable fri[mnpz] Tom Musta
@ 2014-01-08 18:32   ` Richard Henderson
  2014-02-21 11:58     ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:32 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:06 AM, Tom Musta wrote:
> The fri* series of instructions was introduced prior to ISA 2.06 and
> is supported on Power7 and Power8 hardware.  However, the instruction
> is still considered illegal in the P7 and P8 QEMU emulation models.
> This patch enables these instructions for the P7 and P8 machines.
> 
> Also, the existing helper is modified to correctly handle some of
> the boundary cases (NaNs and the inexact flag).
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V4: frin changed to use "ties away" rounding mode per Richard Henderson's
> review.  Modified NaN handling.  Proper handling of stickiness of
> the inexact flag.  Added to P7+ model.
> 
>  target-ppc/fpu_helper.c     |   18 +++++++++++-------
>  target-ppc/translate_init.c |    3 +++
>  2 files changed, 14 insertions(+), 7 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 18/22] target-ppc: Add Flag for Power ISA V2.06 Floating Point Test Instructions
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 18/22] target-ppc: Add Flag for Power ISA V2.06 Floating Point Test Instructions Tom Musta
@ 2014-01-08 18:33   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:33 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:06 AM, Tom Musta wrote:
> This patch adds a flag for Floating Point Test instructions that were
> introduced in Power ISA V2.06B.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V4: Split single flag into multiple flags per discussion with
> Alex Graf and Scott Wood.  Added flag to Power7+ model.
> 
>  target-ppc/cpu.h            |    4 +++-
>  target-ppc/translate_init.c |    9 ++++++---
>  2 files changed, 9 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 19/22] target-ppc: Add ISA 2.06 ftdiv Instruction
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 19/22] target-ppc: Add ISA 2.06 ftdiv Instruction Tom Musta
@ 2014-01-08 18:34   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:34 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:06 AM, Tom Musta wrote:
> This patch adds the Floating Point Test for Divide instruction which
> was introduced in Power ISA 2.06B.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V4: Using the newly added PPC2_FP_TST_ISA206 flag.  Modified helper
> signature per Richard Henderson's review.

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [V4 PATCH 20/22] target-ppc: Add ISA 2.06 ftsqrt
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 20/22] target-ppc: Add ISA 2.06 ftsqrt Tom Musta
@ 2014-01-08 18:35   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2014-01-08 18:35 UTC (permalink / raw)
  To: Tom Musta, qemu-devel; +Cc: qemu-ppc

On 01/07/2014 08:06 AM, Tom Musta wrote:
> This patch adds the Floating Point Test for Square Root instruction
> which was introduced in Power ISA 2.06.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
> V4: Using the newly added PPC2_FP_TST_ISA206 flag.  Modified helper
> signature per Richard Henderson's review.
> 
>  target-ppc/fpu_helper.c |   31 +++++++++++++++++++++++++++++++
>  target-ppc/helper.h     |    1 +
>  target-ppc/translate.c  |   10 ++++++++++
>  3 files changed, 42 insertions(+), 0 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8
  2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
                   ` (21 preceding siblings ...)
  2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 22/22] target-ppc: Add ISA2.06 lfiwzx Instruction Tom Musta
@ 2014-01-27 16:01 ` Alexander Graf
  22 siblings, 0 replies; 36+ messages in thread
From: Alexander Graf @ 2014-01-27 16:01 UTC (permalink / raw)
  To: Tom Musta; +Cc: open list:PowerPC, QEMU Developers


On 07.01.2014, at 17:05, Tom Musta <tommusta@gmail.com> wrote:

> The QEMU emulation models for Power7 and Power8 are still missing some
> of the base instructions that were introduced in Power ISA 2.06 and
> even a few that were introduced prior to that.
> 
> This patch series gets these models caught up with respect to the
> base 2.06 ISA.  That is, the Book I and Book II instructions for 
> the branch, fixed point and floating point units will be completed
> with this series.  Decimal floating point is not addressed in this
> series, nor are the base ISA 2.07 changes for the Power8 model.
> 
> In some cases, existing instructions are re-implemented using common
> macros, thus eliminating some redundant code.  The patch series
> eliminates almost half as much code as it adds.
> 
> Additionally, some bugs in the common floating point library (softfloat)
> are fixed.

Thanks, applied all to ppc-next.


Alex

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [V4 PATCH 17/22] target-ppc: Fix and enable fri[mnpz]
  2014-01-08 18:32   ` Richard Henderson
@ 2014-02-21 11:58     ` Alexander Graf
  2014-02-21 12:48       ` Tom Musta
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander Graf @ 2014-02-21 11:58 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Tom Musta, list@suse.de:PowerPC, QEMU Developers


On 08.01.2014, at 19:32, Richard Henderson <rth@twiddle.net> wrote:

> On 01/07/2014 08:06 AM, Tom Musta wrote:
>> The fri* series of instructions was introduced prior to ISA 2.06 and
>> is supported on Power7 and Power8 hardware.  However, the instruction
>> is still considered illegal in the P7 and P8 QEMU emulation models.
>> This patch enables these instructions for the P7 and P8 machines.
>> 
>> Also, the existing helper is modified to correctly handle some of
>> the boundary cases (NaNs and the inexact flag).
>> 
>> Signed-off-by: Tom Musta <tommusta@gmail.com>
>> ---
>> V4: frin changed to use "ties away" rounding mode per Richard Henderson's
>> review.  Modified NaN handling.  Proper handling of stickiness of
>> the inexact flag.  Added to P7+ model.
>> 
>> target-ppc/fpu_helper.c     |   18 +++++++++++-------
>> target-ppc/translate_init.c |    3 +++
>> 2 files changed, 14 insertions(+), 7 deletions(-)
> 
> Reviewed-by: Richard Henderson <rth@twiddle.net>

This patch (among others) spawn warnings about constants that are defined "ul" on 32bit hosts:

  http://award.ath.cx/results/89-alex/ibook/virt.qemu.build/debug/virt.qemu.build.DEBUG

Tom, please follow up with a patch fixing all of these in target-ppc.


Alex

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [V4 PATCH 17/22] target-ppc: Fix and enable fri[mnpz]
  2014-02-21 11:58     ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
@ 2014-02-21 12:48       ` Tom Musta
  0 siblings, 0 replies; 36+ messages in thread
From: Tom Musta @ 2014-02-21 12:48 UTC (permalink / raw)
  To: Alexander Graf, Richard Henderson; +Cc: list@suse.de:PowerPC, QEMU Developers

On 2/21/2014 5:58 AM, Alexander Graf wrote:
> This patch (among others) spawn warnings about constants that are defined "ul" on 32bit hosts:
> 
>   http://award.ath.cx/results/89-alex/ibook/virt.qemu.build/debug/virt.qemu.build.DEBUG
> 
> Tom, please follow up with a patch fixing all of these in target-ppc.
> 

Will do.  Thanks, Alex

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2014-02-21 12:48 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-07 16:05 [Qemu-devel] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Tom Musta
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 01/22] target-ppc: Add ISA2.06 bpermd Instruction Tom Musta
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 02/22] target-ppc: Add Flag for ISA2.06 Divide Extended Instructions Tom Musta
2014-01-08 18:27   ` Richard Henderson
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 03/22] target-ppc: Add ISA2.06 divdeu[o] Instructions Tom Musta
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 04/22] target-ppc: Add ISA2.06 divde[o] Instructions Tom Musta
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 05/22] target-ppc: Add ISA 2.06 divweu[o] Instructions Tom Musta
2014-01-08 18:26   ` Richard Henderson
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 06/22] target-ppc: Add ISA 2.06 divwe[o] Instructions Tom Musta
2014-01-08 18:28   ` Richard Henderson
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 07/22] target-ppc: Add Flag for ISA2.06 Atomic Instructions Tom Musta
2014-01-08 18:28   ` Richard Henderson
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 08/22] target-ppc: Add ISA2.06 lbarx, lharx Instructions Tom Musta
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 09/22] target-ppc: Add ISA 2.06 stbcx. and sthcx. Instructions Tom Musta
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 10/22] target-ppc: Add Flag for ISA V2.06 Floating Point Conversion Tom Musta
2014-01-08 18:29   ` Richard Henderson
2014-01-07 16:05 ` [Qemu-devel] [V4 PATCH 11/22] target-ppc: Add ISA2.06 Float to Integer Instructions Tom Musta
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 12/22] target-ppc: Add ISA 2.06 fcfid[u][s] Instructions Tom Musta
2014-01-08 18:31   ` Richard Henderson
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 13/22] softfloat: Fix exception flag handling for float32_to_float16() Tom Musta
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 14/22] softfloat: Factor out RoundAndPackFloat16 and NormalizeFloat16Subnormal Tom Musta
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 15/22] softfloat: Refactor code handling various rounding modes Tom Musta
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 16/22] softfloat: Add support for ties-away rounding Tom Musta
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 17/22] target-ppc: Fix and enable fri[mnpz] Tom Musta
2014-01-08 18:32   ` Richard Henderson
2014-02-21 11:58     ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
2014-02-21 12:48       ` Tom Musta
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 18/22] target-ppc: Add Flag for Power ISA V2.06 Floating Point Test Instructions Tom Musta
2014-01-08 18:33   ` Richard Henderson
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 19/22] target-ppc: Add ISA 2.06 ftdiv Instruction Tom Musta
2014-01-08 18:34   ` Richard Henderson
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 20/22] target-ppc: Add ISA 2.06 ftsqrt Tom Musta
2014-01-08 18:35   ` Richard Henderson
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 21/22] target-ppc: Enable frsqrtes on Power7 and Power8 Tom Musta
2014-01-07 16:06 ` [Qemu-devel] [V4 PATCH 22/22] target-ppc: Add ISA2.06 lfiwzx Instruction Tom Musta
2014-01-27 16:01 ` [Qemu-devel] [Qemu-ppc] [V4 PATCH 00/22] target-ppc: Base ISA V2.06 for Power7/Power8 Alexander Graf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.