All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16
@ 2018-05-02 22:15 Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16 Richard Henderson
                   ` (15 more replies)
  0 siblings, 16 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell

When running the gcc testsuite with current aarch64-linux-user,
the testsuite detects the presence of the fp16 extension and
enables lots of extra tests for builtins.

Quite a few of these new tests fail because we missed implementing
some instructions.  We really should go back and verify that nothing
else is missing from this (rather large) extension.

In addition, it tests some edge conditions on data that show flaws
in the way we were performing integer<->fp conversion; particularly
with respect to scaled conversion.

Changes since v1:
  * Rebased vs master instead of tgt-arm-sve-9.
  * Alex did some additional digging through the ARM xhtml
    and came up with some additional missing instructions.
  * Everything cc'd to qemu-stable.


r~


Alex Bennée (4):
  target/arm: Implement FCMP for fp16
  target/arm: Implement FCSEL for fp16
  target/arm: Implement FMOV (immediate) for fp16
  target/arm: Fix sqrt_f16 exception raising

Richard Henderson (10):
  target/arm: Implement vector shifted SCVF/UCVF for fp16
  target/arm: Implement vector shifted FCVT for fp16
  target/arm: Fix float16 to/from int16
  target/arm: Clear SVE high bits for FMOV
  target/arm: Implement FMOV (general) for fp16
  target/arm: Implement FCVT (scalar,integer) for fp16
  target/arm: Implement FCVT (scalar,fixed-point) for fp16
  target/arm: Introduce and use read_fp_hreg
  target/arm: Implement FP data-processing (2 source) for fp16
  target/arm: Implement FP data-processing (3 source) for fp16

 target/arm/helper-a64.h    |   2 +
 target/arm/helper.h        |   6 +
 target/arm/helper-a64.c    |  10 +
 target/arm/helper.c        |  87 +++++++-
 target/arm/translate-a64.c | 535 ++++++++++++++++++++++++++++++++++++---------
 5 files changed, 532 insertions(+), 108 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:03   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT " Richard Henderson
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

While we have some of the scalar paths for *CVF for fp16,
we failed to decode the fp16 version of these instructions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

---
v2: Use parens with (x << y) >> z.
---
 target/arm/translate-a64.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index bff4e13bf6..68ca445691 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -7165,13 +7165,26 @@ static void handle_simd_shift_intfp_conv(DisasContext *s, bool is_scalar,
                                          int immh, int immb, int opcode,
                                          int rn, int rd)
 {
-    bool is_double = extract32(immh, 3, 1);
-    int size = is_double ? MO_64 : MO_32;
-    int elements;
+    int size, elements, fracbits;
     int immhb = immh << 3 | immb;
-    int fracbits = (is_double ? 128 : 64) - immhb;
 
-    if (!extract32(immh, 2, 2)) {
+    if (immh & 8) {
+        size = MO_64;
+        if (!is_scalar && !is_q) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else if (immh & 4) {
+        size = MO_32;
+    } else if (immh & 2) {
+        size = MO_16;
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else {
+        /* immh == 0 would be a failure of the decode logic */
+        g_assert(immh == 1);
         unallocated_encoding(s);
         return;
     }
@@ -7179,20 +7192,14 @@ static void handle_simd_shift_intfp_conv(DisasContext *s, bool is_scalar,
     if (is_scalar) {
         elements = 1;
     } else {
-        elements = is_double ? 2 : is_q ? 4 : 2;
-        if (is_double && !is_q) {
-            unallocated_encoding(s);
-            return;
-        }
+        elements = (8 << is_q) >> size;
     }
+    fracbits = (16 << size) - immhb;
 
     if (!fp_access_check(s)) {
         return;
     }
 
-    /* immh == 0 would be a failure of the decode logic */
-    g_assert(immh);
-
     handle_simd_intfp_conv(s, rd, rn, elements, !is_u, fracbits, size);
 }
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:04   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 03/14] target/arm: Fix float16 to/from int16 Richard Henderson
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

While we have some of the scalar paths for FCVT for fp16,
we failed to decode the fp16 version of these instructions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

---
v2: Use parens with (x << y) >> z.
---
 target/arm/translate-a64.c | 65 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 46 insertions(+), 19 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 68ca445691..a64673575a 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -7208,19 +7208,28 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
                                          bool is_q, bool is_u,
                                          int immh, int immb, int rn, int rd)
 {
-    bool is_double = extract32(immh, 3, 1);
     int immhb = immh << 3 | immb;
-    int fracbits = (is_double ? 128 : 64) - immhb;
-    int pass;
+    int pass, size, fracbits;
     TCGv_ptr tcg_fpstatus;
     TCGv_i32 tcg_rmode, tcg_shift;
 
-    if (!extract32(immh, 2, 2)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!is_scalar && !is_q && is_double) {
+    if (immh & 0x8) {
+        size = MO_64;
+        if (!is_scalar && !is_q) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else if (immh & 0x4) {
+        size = MO_32;
+    } else if (immh & 0x2) {
+        size = MO_16;
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else {
+        /* Should have split out AdvSIMD modified immediate earlier.  */
+        assert(immh == 1);
         unallocated_encoding(s);
         return;
     }
@@ -7232,11 +7241,12 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
     assert(!(is_scalar && is_q));
 
     tcg_rmode = tcg_const_i32(arm_rmode_to_sf(FPROUNDING_ZERO));
-    tcg_fpstatus = get_fpstatus_ptr(false);
+    tcg_fpstatus = get_fpstatus_ptr(size == MO_16);
     gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
+    fracbits = (16 << size) - immhb;
     tcg_shift = tcg_const_i32(fracbits);
 
-    if (is_double) {
+    if (size == MO_64) {
         int maxpass = is_scalar ? 1 : 2;
 
         for (pass = 0; pass < maxpass; pass++) {
@@ -7253,20 +7263,37 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
         }
         clear_vec_high(s, is_q, rd);
     } else {
-        int maxpass = is_scalar ? 1 : is_q ? 4 : 2;
+        void (*fn)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+        int maxpass = is_scalar ? 1 : ((8 << is_q) >> size);
+
+        switch (size) {
+        case MO_16:
+            if (is_u) {
+                fn = gen_helper_vfp_toulh;
+            } else {
+                fn = gen_helper_vfp_toslh;
+            }
+            break;
+        case MO_32:
+            if (is_u) {
+                fn = gen_helper_vfp_touls;
+            } else {
+                fn = gen_helper_vfp_tosls;
+            }
+            break;
+        default:
+            g_assert_not_reached();
+        }
+
         for (pass = 0; pass < maxpass; pass++) {
             TCGv_i32 tcg_op = tcg_temp_new_i32();
 
-            read_vec_element_i32(s, tcg_op, rn, pass, MO_32);
-            if (is_u) {
-                gen_helper_vfp_touls(tcg_op, tcg_op, tcg_shift, tcg_fpstatus);
-            } else {
-                gen_helper_vfp_tosls(tcg_op, tcg_op, tcg_shift, tcg_fpstatus);
-            }
+            read_vec_element_i32(s, tcg_op, rn, pass, size);
+            fn(tcg_op, tcg_op, tcg_shift, tcg_fpstatus);
             if (is_scalar) {
                 write_fp_sreg(s, rd, tcg_op);
             } else {
-                write_vec_element_i32(s, tcg_op, rd, pass, MO_32);
+                write_vec_element_i32(s, tcg_op, rd, pass, size);
             }
             tcg_temp_free_i32(tcg_op);
         }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 03/14] target/arm: Fix float16 to/from int16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16 Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 04/14] target/arm: Clear SVE high bits for FMOV Richard Henderson
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

The instruction "ucvtf v0.4h, v04h, #2", with input 0x8000u,
overflows the intermediate float16 to infinity before we have a
chance to scale the output.  Use float64 as the intermediate type
so that no input argument (uint32_t in this case) can overflow
or round before scaling.  Given the declared argument, the signed
int32_t function has the same problem.

When converting from float16 to integer, using u/int32_t instead
of u/int16_t means that the bounding is incorrect.

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.h        |  4 ++--
 target/arm/helper.c        | 53 ++++++++++++++++++++++++++++++++++++++++++++--
 target/arm/translate-a64.c |  4 ++--
 3 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 34e8cc8904..1969b37f2d 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -149,8 +149,8 @@ DEF_HELPER_3(vfp_toshd_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_tosld_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_touhd_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_tould_round_to_zero, i64, f64, i32, ptr)
-DEF_HELPER_3(vfp_toulh, i32, f16, i32, ptr)
-DEF_HELPER_3(vfp_toslh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_touhh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_toshh, i32, f16, i32, ptr)
 DEF_HELPER_3(vfp_toshs, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosls, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosqs, i64, f32, i32, ptr)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 52a88e0297..ba04cddda0 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11420,11 +11420,60 @@ VFP_CONV_FIX_A64(sq, s, 32, 64, int64)
 VFP_CONV_FIX(uh, s, 32, 32, uint16)
 VFP_CONV_FIX(ul, s, 32, 32, uint32)
 VFP_CONV_FIX_A64(uq, s, 32, 64, uint64)
-VFP_CONV_FIX_A64(sl, h, 16, 32, int32)
-VFP_CONV_FIX_A64(ul, h, 16, 32, uint32)
+
 #undef VFP_CONV_FIX
 #undef VFP_CONV_FIX_FLOAT
 #undef VFP_CONV_FLOAT_FIX_ROUND
+#undef VFP_CONV_FIX_A64
+
+/* Conversion to/from f16 can overflow to infinity before/after scaling.
+ * Therefore we convert to f64 (which does not round), scale,
+ * and then convert f64 to f16 (which may round).
+ */
+
+static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
+{
+    return float64_to_float16(float64_scalbn(f, -shift, fpst), true, fpst);
+}
+
+float16 HELPER(vfp_sltoh)(uint32_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(int32_to_float64(x, fpst), shift, fpst);
+}
+
+float16 HELPER(vfp_ultoh)(uint32_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(uint32_to_float64(x, fpst), shift, fpst);
+}
+
+static float64 do_prescale_fp16(float16 f, int shift, float_status *fpst)
+{
+    if (unlikely(float16_is_any_nan(f))) {
+        float_raise(float_flag_invalid, fpst);
+        return 0;
+    } else {
+        int old_exc_flags = get_float_exception_flags(fpst);
+        float64 ret;
+
+        ret = float16_to_float64(f, true, fpst);
+        ret = float64_scalbn(ret, shift, fpst);
+        old_exc_flags |= get_float_exception_flags(fpst)
+            & float_flag_input_denormal;
+        set_float_exception_flags(old_exc_flags, fpst);
+
+        return ret;
+    }
+}
+
+uint32_t HELPER(vfp_toshh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_int16(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint32_t HELPER(vfp_touhh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_uint16(do_prescale_fp16(x, shift, fpst), fpst);
+}
 
 /* Set the current fp rounding mode and return the old one.
  * The argument is a softfloat float_round_ value.
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index a64673575a..7021a31b89 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -7269,9 +7269,9 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
         switch (size) {
         case MO_16:
             if (is_u) {
-                fn = gen_helper_vfp_toulh;
+                fn = gen_helper_vfp_touhh;
             } else {
-                fn = gen_helper_vfp_toslh;
+                fn = gen_helper_vfp_toshh;
             }
             break;
         case MO_32:
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 04/14] target/arm: Clear SVE high bits for FMOV
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (2 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 03/14] target/arm: Fix float16 to/from int16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16 Richard Henderson
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

Use write_fp_dreg and clear_vec_high to zero the bits
that need zeroing for these cases.

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 7021a31b89..c64c3ed99d 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5444,31 +5444,24 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
 
     if (itof) {
         TCGv_i64 tcg_rn = cpu_reg(s, rn);
+        TCGv_i64 tmp;
 
         switch (type) {
         case 0:
-        {
             /* 32 bit */
-            TCGv_i64 tmp = tcg_temp_new_i64();
+            tmp = tcg_temp_new_i64();
             tcg_gen_ext32u_i64(tmp, tcg_rn);
-            tcg_gen_st_i64(tmp, cpu_env, fp_reg_offset(s, rd, MO_64));
-            tcg_gen_movi_i64(tmp, 0);
-            tcg_gen_st_i64(tmp, cpu_env, fp_reg_hi_offset(s, rd));
+            write_fp_dreg(s, rd, tmp);
             tcg_temp_free_i64(tmp);
             break;
-        }
         case 1:
-        {
             /* 64 bit */
-            TCGv_i64 tmp = tcg_const_i64(0);
-            tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_offset(s, rd, MO_64));
-            tcg_gen_st_i64(tmp, cpu_env, fp_reg_hi_offset(s, rd));
-            tcg_temp_free_i64(tmp);
+            write_fp_dreg(s, rd, tcg_rn);
             break;
-        }
         case 2:
             /* 64 bit to top half. */
             tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_hi_offset(s, rd));
+            clear_vec_high(s, true, rd);
             break;
         }
     } else {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (3 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 04/14] target/arm: Clear SVE high bits for FMOV Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:21   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 06/14] target/arm: Implement FCVT (scalar, integer) " Richard Henderson
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

Adding the fp16 moves to/from general registers.

Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index c64c3ed99d..247a4f0cce 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5463,6 +5463,15 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
             tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_hi_offset(s, rd));
             clear_vec_high(s, true, rd);
             break;
+        case 3:
+            /* 16 bit */
+            tmp = tcg_temp_new_i64();
+            tcg_gen_ext16u_i64(tmp, tcg_rn);
+            write_fp_dreg(s, rd, tmp);
+            tcg_temp_free_i64(tmp);
+            break;
+        default:
+            g_assert_not_reached();
         }
     } else {
         TCGv_i64 tcg_rd = cpu_reg(s, rd);
@@ -5480,6 +5489,12 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
             /* 64 bits from top half */
             tcg_gen_ld_i64(tcg_rd, cpu_env, fp_reg_hi_offset(s, rn));
             break;
+        case 3:
+            /* 16 bit */
+            tcg_gen_ld16u_i64(tcg_rd, cpu_env, fp_reg_offset(s, rn, MO_16));
+            break;
+        default:
+            g_assert_not_reached();
         }
     }
 }
@@ -5519,10 +5534,15 @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
         case 0xa: /* 64 bit */
         case 0xd: /* 64 bit to top half of quad */
             break;
+        case 0x6: /* 16-bit */
+            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+                break;
+            }
+            /* fallthru */
         default:
             /* all other sf/type/rmode combinations are invalid */
             unallocated_encoding(s);
-            break;
+            return;
         }
 
         if (!fp_access_check(s)) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 06/14] target/arm: Implement FCVT (scalar, integer) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (4 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 07/14] target/arm: Implement FCVT (scalar, fixed-point) " Richard Henderson
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.h        |  6 +++
 target/arm/helper.c        | 38 +++++++++++++++++-
 target/arm/translate-a64.c | 96 ++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 122 insertions(+), 18 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 1969b37f2d..ce89968b2d 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -151,6 +151,10 @@ DEF_HELPER_3(vfp_touhd_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_tould_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_touhh, i32, f16, i32, ptr)
 DEF_HELPER_3(vfp_toshh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_toulh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_toslh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_touqh, i64, f16, i32, ptr)
+DEF_HELPER_3(vfp_tosqh, i64, f16, i32, ptr)
 DEF_HELPER_3(vfp_toshs, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosls, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosqs, i64, f32, i32, ptr)
@@ -177,6 +181,8 @@ DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_sltoh, f16, i32, i32, ptr)
 DEF_HELPER_3(vfp_ultoh, f16, i32, i32, ptr)
+DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, ptr)
+DEF_HELPER_3(vfp_uqtoh, f16, i64, i32, ptr)
 
 DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, ptr)
 DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index ba04cddda0..1fbf5fdec9 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11427,8 +11427,12 @@ VFP_CONV_FIX_A64(uq, s, 32, 64, uint64)
 #undef VFP_CONV_FIX_A64
 
 /* Conversion to/from f16 can overflow to infinity before/after scaling.
- * Therefore we convert to f64 (which does not round), scale,
- * and then convert f64 to f16 (which may round).
+ * Therefore we convert to f64, scale, and then convert f64 to f16; or
+ * vice versa for conversion to integer.
+ *
+ * For 16- and 32-bit integers, the conversion to f64 never rounds.
+ * For 64-bit integers, any integer that would cause rounding will also
+ * overflow to f16 infinity, so there is no double rounding problem.
  */
 
 static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
@@ -11446,6 +11450,16 @@ float16 HELPER(vfp_ultoh)(uint32_t x, uint32_t shift, void *fpst)
     return do_postscale_fp16(uint32_to_float64(x, fpst), shift, fpst);
 }
 
+float16 HELPER(vfp_sqtoh)(uint64_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(int64_to_float64(x, fpst), shift, fpst);
+}
+
+float16 HELPER(vfp_uqtoh)(uint64_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(uint64_to_float64(x, fpst), shift, fpst);
+}
+
 static float64 do_prescale_fp16(float16 f, int shift, float_status *fpst)
 {
     if (unlikely(float16_is_any_nan(f))) {
@@ -11475,6 +11489,26 @@ uint32_t HELPER(vfp_touhh)(float16 x, uint32_t shift, void *fpst)
     return float64_to_uint16(do_prescale_fp16(x, shift, fpst), fpst);
 }
 
+uint32_t HELPER(vfp_toslh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_int32(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint32_t HELPER(vfp_toulh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_uint32(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint64_t HELPER(vfp_tosqh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_int64(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint64_t HELPER(vfp_touqh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_uint64(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
 /* Set the current fp rounding mode and return the old one.
  * The argument is a softfloat float_round_ value.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 247a4f0cce..d794744aec 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5274,11 +5274,11 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                            bool itof, int rmode, int scale, int sf, int type)
 {
     bool is_signed = !(opcode & 1);
-    bool is_double = type;
     TCGv_ptr tcg_fpstatus;
-    TCGv_i32 tcg_shift;
+    TCGv_i32 tcg_shift, tcg_single;
+    TCGv_i64 tcg_double;
 
-    tcg_fpstatus = get_fpstatus_ptr(false);
+    tcg_fpstatus = get_fpstatus_ptr(type == 3);
 
     tcg_shift = tcg_const_i32(64 - scale);
 
@@ -5296,8 +5296,9 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             tcg_int = tcg_extend;
         }
 
-        if (is_double) {
-            TCGv_i64 tcg_double = tcg_temp_new_i64();
+        switch (type) {
+        case 1: /* float64 */
+            tcg_double = tcg_temp_new_i64();
             if (is_signed) {
                 gen_helper_vfp_sqtod(tcg_double, tcg_int,
                                      tcg_shift, tcg_fpstatus);
@@ -5307,8 +5308,10 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             }
             write_fp_dreg(s, rd, tcg_double);
             tcg_temp_free_i64(tcg_double);
-        } else {
-            TCGv_i32 tcg_single = tcg_temp_new_i32();
+            break;
+
+        case 0: /* float32 */
+            tcg_single = tcg_temp_new_i32();
             if (is_signed) {
                 gen_helper_vfp_sqtos(tcg_single, tcg_int,
                                      tcg_shift, tcg_fpstatus);
@@ -5318,6 +5321,23 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             }
             write_fp_sreg(s, rd, tcg_single);
             tcg_temp_free_i32(tcg_single);
+            break;
+
+        case 3: /* float16 */
+            tcg_single = tcg_temp_new_i32();
+            if (is_signed) {
+                gen_helper_vfp_sqtoh(tcg_single, tcg_int,
+                                     tcg_shift, tcg_fpstatus);
+            } else {
+                gen_helper_vfp_uqtoh(tcg_single, tcg_int,
+                                     tcg_shift, tcg_fpstatus);
+            }
+            write_fp_sreg(s, rd, tcg_single);
+            tcg_temp_free_i32(tcg_single);
+            break;
+
+        default:
+            g_assert_not_reached();
         }
     } else {
         TCGv_i64 tcg_int = cpu_reg(s, rd);
@@ -5334,8 +5354,9 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
 
         gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
 
-        if (is_double) {
-            TCGv_i64 tcg_double = read_fp_dreg(s, rn);
+        switch (type) {
+        case 1: /* float64 */
+            tcg_double = read_fp_dreg(s, rn);
             if (is_signed) {
                 if (!sf) {
                     gen_helper_vfp_tosld(tcg_int, tcg_double,
@@ -5353,9 +5374,14 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                                          tcg_shift, tcg_fpstatus);
                 }
             }
+            if (!sf) {
+                tcg_gen_ext32u_i64(tcg_int, tcg_int);
+            }
             tcg_temp_free_i64(tcg_double);
-        } else {
-            TCGv_i32 tcg_single = read_fp_sreg(s, rn);
+            break;
+
+        case 0: /* float32 */
+            tcg_single = read_fp_sreg(s, rn);
             if (sf) {
                 if (is_signed) {
                     gen_helper_vfp_tosqs(tcg_int, tcg_single,
@@ -5377,14 +5403,39 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                 tcg_temp_free_i32(tcg_dest);
             }
             tcg_temp_free_i32(tcg_single);
+            break;
+
+        case 3: /* float16 */
+            tcg_single = read_fp_sreg(s, rn);
+            if (sf) {
+                if (is_signed) {
+                    gen_helper_vfp_tosqh(tcg_int, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                } else {
+                    gen_helper_vfp_touqh(tcg_int, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                }
+            } else {
+                TCGv_i32 tcg_dest = tcg_temp_new_i32();
+                if (is_signed) {
+                    gen_helper_vfp_toslh(tcg_dest, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                } else {
+                    gen_helper_vfp_toulh(tcg_dest, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                }
+                tcg_gen_extu_i32_i64(tcg_int, tcg_dest);
+                tcg_temp_free_i32(tcg_dest);
+            }
+            tcg_temp_free_i32(tcg_single);
+            break;
+
+        default:
+            g_assert_not_reached();
         }
 
         gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
         tcg_temp_free_i32(tcg_rmode);
-
-        if (!sf) {
-            tcg_gen_ext32u_i64(tcg_int, tcg_int);
-        }
     }
 
     tcg_temp_free_ptr(tcg_fpstatus);
@@ -5553,7 +5604,20 @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
         /* actual FP conversions */
         bool itof = extract32(opcode, 1, 1);
 
-        if (type > 1 || (rmode != 0 && opcode > 1)) {
+        if (rmode != 0 && opcode > 1) {
+            unallocated_encoding(s);
+            return;
+        }
+        switch (type) {
+        case 0: /* float32 */
+        case 1: /* float64 */
+            break;
+        case 3: /* float16 */
+            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+                break;
+            }
+            /* fallthru */
+        default:
             unallocated_encoding(s);
             return;
         }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 07/14] target/arm: Implement FCVT (scalar, fixed-point) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (5 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 06/14] target/arm: Implement FCVT (scalar, integer) " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg Richard Henderson
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index d794744aec..e19d97e8f1 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5460,8 +5460,7 @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
     bool sf = extract32(insn, 31, 1);
     bool itof;
 
-    if (sbit || (type > 1)
-        || (!sf && scale < 32)) {
+    if (sbit || (!sf && scale < 32)) {
         unallocated_encoding(s);
         return;
     }
@@ -5480,6 +5479,20 @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
         return;
     }
 
+    switch (type) {
+    case 0: /* float32 */
+    case 1: /* float64 */
+        break;
+    case 3: /* float16 */
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
     if (!fp_access_check(s)) {
         return;
     }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (6 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 07/14] target/arm: Implement FCVT (scalar, fixed-point) " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:30   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 09/14] target/arm: Implement FP data-processing (2 source) for fp16 Richard Henderson
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index e19d97e8f1..8c63d5e743 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -614,6 +614,14 @@ static TCGv_i32 read_fp_sreg(DisasContext *s, int reg)
     return v;
 }
 
+static TCGv_i32 read_fp_hreg(DisasContext *s, int reg)
+{
+    TCGv_i32 v = tcg_temp_new_i32();
+
+    tcg_gen_ld16u_i32(v, cpu_env, fp_reg_offset(s, reg, MO_16));
+    return v;
+}
+
 /* Clear the bits above an N-bit vector, for N = (is_q ? 128 : 64).
  * If SVE is not enabled, then there are only 128 bits in the vector.
  */
@@ -4644,11 +4652,9 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
 static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
 {
     TCGv_ptr fpst = NULL;
-    TCGv_i32 tcg_op = tcg_temp_new_i32();
+    TCGv_i32 tcg_op = read_fp_hreg(s, rn);
     TCGv_i32 tcg_res = tcg_temp_new_i32();
 
-    read_vec_element_i32(s, tcg_op, rn, 0, MO_16);
-
     switch (opcode) {
     case 0x0: /* FMOV */
         tcg_gen_mov_i32(tcg_res, tcg_op);
@@ -7543,13 +7549,10 @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
         tcg_temp_free_i64(tcg_op2);
         tcg_temp_free_i64(tcg_res);
     } else {
-        TCGv_i32 tcg_op1 = tcg_temp_new_i32();
-        TCGv_i32 tcg_op2 = tcg_temp_new_i32();
+        TCGv_i32 tcg_op1 = read_fp_hreg(s, rn);
+        TCGv_i32 tcg_op2 = read_fp_hreg(s, rm);
         TCGv_i64 tcg_res = tcg_temp_new_i64();
 
-        read_vec_element_i32(s, tcg_op1, rn, 0, MO_16);
-        read_vec_element_i32(s, tcg_op2, rm, 0, MO_16);
-
         gen_helper_neon_mull_s16(tcg_res, tcg_op1, tcg_op2);
         gen_helper_neon_addl_saturate_s32(tcg_res, cpu_env, tcg_res, tcg_res);
 
@@ -8090,13 +8093,10 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
 
     fpst = get_fpstatus_ptr(true);
 
-    tcg_op1 = tcg_temp_new_i32();
-    tcg_op2 = tcg_temp_new_i32();
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
     tcg_res = tcg_temp_new_i32();
 
-    read_vec_element_i32(s, tcg_op1, rn, 0, MO_16);
-    read_vec_element_i32(s, tcg_op2, rm, 0, MO_16);
-
     switch (fpopcode) {
     case 0x03: /* FMULX */
         gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
@@ -12015,11 +12015,9 @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
     }
 
     if (is_scalar) {
-        TCGv_i32 tcg_op = tcg_temp_new_i32();
+        TCGv_i32 tcg_op = read_fp_hreg(s, rn);
         TCGv_i32 tcg_res = tcg_temp_new_i32();
 
-        read_vec_element_i32(s, tcg_op, rn, 0, MO_16);
-
         switch (fpop) {
         case 0x1a: /* FCVTNS */
         case 0x1b: /* FCVTMS */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 09/14] target/arm: Implement FP data-processing (2 source) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (7 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 10/14] target/arm: Implement FP data-processing (3 " Richard Henderson
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

We missed all of the scalar fp16 binary operations.

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 8c63d5e743..d7a49ef2e4 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5062,6 +5062,61 @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
     tcg_temp_free_i64(tcg_res);
 }
 
+/* Floating-point data-processing (2 source) - half precision */
+static void handle_fp_2src_half(DisasContext *s, int opcode,
+                                int rd, int rn, int rm)
+{
+    TCGv_i32 tcg_op1;
+    TCGv_i32 tcg_op2;
+    TCGv_i32 tcg_res;
+    TCGv_ptr fpst;
+
+    tcg_res = tcg_temp_new_i32();
+    fpst = get_fpstatus_ptr(true);
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
+
+    switch (opcode) {
+    case 0x0: /* FMUL */
+        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x1: /* FDIV */
+        gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x2: /* FADD */
+        gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x3: /* FSUB */
+        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x4: /* FMAX */
+        gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x5: /* FMIN */
+        gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x6: /* FMAXNM */
+        gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x7: /* FMINNM */
+        gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x8: /* FNMUL */
+        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
+        tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    write_fp_sreg(s, rd, tcg_res);
+
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tcg_op1);
+    tcg_temp_free_i32(tcg_op2);
+    tcg_temp_free_i32(tcg_res);
+}
+
 /* Floating point data-processing (2 source)
  *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
  * +---+---+---+-----------+------+---+------+--------+-----+------+------+
@@ -5094,6 +5149,16 @@ static void disas_fp_2src(DisasContext *s, uint32_t insn)
         }
         handle_fp_2src_double(s, opcode, rd, rn, rm);
         break;
+    case 3:
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+        if (!fp_access_check(s)) {
+            return;
+        }
+        handle_fp_2src_half(s, opcode, rd, rn, rm);
+        break;
     default:
         unallocated_encoding(s);
     }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 10/14] target/arm: Implement FP data-processing (3 source) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (8 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 09/14] target/arm: Implement FP data-processing (2 source) for fp16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP " Richard Henderson
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable

We missed all of the scalar fp16 fma operations.

Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index d7a49ef2e4..9322b0d896 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5240,6 +5240,44 @@ static void handle_fp_3src_double(DisasContext *s, bool o0, bool o1,
     tcg_temp_free_i64(tcg_res);
 }
 
+/* Floating-point data-processing (3 source) - half precision */
+static void handle_fp_3src_half(DisasContext *s, bool o0, bool o1,
+                                int rd, int rn, int rm, int ra)
+{
+    TCGv_i32 tcg_op1, tcg_op2, tcg_op3;
+    TCGv_i32 tcg_res = tcg_temp_new_i32();
+    TCGv_ptr fpst = get_fpstatus_ptr(true);
+
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
+    tcg_op3 = read_fp_hreg(s, ra);
+
+    /* These are fused multiply-add, and must be done as one
+     * floating point operation with no rounding between the
+     * multiplication and addition steps.
+     * NB that doing the negations here as separate steps is
+     * correct : an input NaN should come out with its sign bit
+     * flipped if it is a negated-input.
+     */
+    if (o1 == true) {
+        tcg_gen_xori_i32(tcg_op3, tcg_op3, 0x8000);
+    }
+
+    if (o0 != o1) {
+        tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000);
+    }
+
+    gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst);
+
+    write_fp_sreg(s, rd, tcg_res);
+
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tcg_op1);
+    tcg_temp_free_i32(tcg_op2);
+    tcg_temp_free_i32(tcg_op3);
+    tcg_temp_free_i32(tcg_res);
+}
+
 /* Floating point data-processing (3 source)
  *   31  30  29 28       24 23  22  21  20  16  15  14  10 9    5 4    0
  * +---+---+---+-----------+------+----+------+----+------+------+------+
@@ -5269,6 +5307,16 @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
         }
         handle_fp_3src_double(s, o0, o1, rd, rn, rm, ra);
         break;
+    case 3:
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+        if (!fp_access_check(s)) {
+            return;
+        }
+        handle_fp_3src_half(s, o0, o1, rd, rn, rm, ra);
+        break;
     default:
         unallocated_encoding(s);
     }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (9 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 10/14] target/arm: Implement FP data-processing (3 " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:59   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL " Richard Henderson
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, Alex Bennée, qemu-stable

From: Alex Bennée <alex.bennee@linaro.org>

These where missed out from the rest of the half-precision work.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[rth: Diagnose lack of FP16 before fp_access_check]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper-a64.h    |  2 ++
 target/arm/helper-a64.c    | 10 ++++++
 target/arm/translate-a64.c | 88 +++++++++++++++++++++++++++++++++++++---------
 3 files changed, 83 insertions(+), 17 deletions(-)

diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index ef4ddfe9d8..5c0b9bd799 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -19,6 +19,8 @@
 DEF_HELPER_FLAGS_2(udiv64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(sdiv64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
 DEF_HELPER_FLAGS_1(rbit64, TCG_CALL_NO_RWG_SE, i64, i64)
+DEF_HELPER_3(vfp_cmph_a64, i64, f16, f16, ptr)
+DEF_HELPER_3(vfp_cmpeh_a64, i64, f16, f16, ptr)
 DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index afb25ad20c..35df07adb9 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -85,6 +85,16 @@ static inline uint32_t float_rel_to_flags(int res)
     return flags;
 }
 
+uint64_t HELPER(vfp_cmph_a64)(float16 x, float16 y, void *fp_status)
+{
+    return float_rel_to_flags(float16_compare_quiet(x, y, fp_status));
+}
+
+uint64_t HELPER(vfp_cmpeh_a64)(float16 x, float16 y, void *fp_status)
+{
+    return float_rel_to_flags(float16_compare(x, y, fp_status));
+}
+
 uint64_t HELPER(vfp_cmps_a64)(float32 x, float32 y, void *fp_status)
 {
     return float_rel_to_flags(float32_compare_quiet(x, y, fp_status));
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 9322b0d896..f0aca20771 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4475,14 +4475,14 @@ static void disas_data_proc_reg(DisasContext *s, uint32_t insn)
     }
 }
 
-static void handle_fp_compare(DisasContext *s, bool is_double,
+static void handle_fp_compare(DisasContext *s, int size,
                               unsigned int rn, unsigned int rm,
                               bool cmp_with_zero, bool signal_all_nans)
 {
     TCGv_i64 tcg_flags = tcg_temp_new_i64();
-    TCGv_ptr fpst = get_fpstatus_ptr(false);
+    TCGv_ptr fpst = get_fpstatus_ptr(size == MO_16);
 
-    if (is_double) {
+    if (size == MO_64) {
         TCGv_i64 tcg_vn, tcg_vm;
 
         tcg_vn = read_fp_dreg(s, rn);
@@ -4499,19 +4499,35 @@ static void handle_fp_compare(DisasContext *s, bool is_double,
         tcg_temp_free_i64(tcg_vn);
         tcg_temp_free_i64(tcg_vm);
     } else {
-        TCGv_i32 tcg_vn, tcg_vm;
+        TCGv_i32 tcg_vn = tcg_temp_new_i32();
+        TCGv_i32 tcg_vm = tcg_temp_new_i32();
 
-        tcg_vn = read_fp_sreg(s, rn);
+        read_vec_element_i32(s, tcg_vn, rn, 0, size);
         if (cmp_with_zero) {
-            tcg_vm = tcg_const_i32(0);
+            tcg_gen_movi_i32(tcg_vm, 0);
         } else {
-            tcg_vm = read_fp_sreg(s, rm);
+            read_vec_element_i32(s, tcg_vm, rm, 0, size);
         }
-        if (signal_all_nans) {
-            gen_helper_vfp_cmpes_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
-        } else {
-            gen_helper_vfp_cmps_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+
+        switch (size) {
+        case MO_32:
+            if (signal_all_nans) {
+                gen_helper_vfp_cmpes_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            } else {
+                gen_helper_vfp_cmps_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            }
+            break;
+        case MO_16:
+            if (signal_all_nans) {
+                gen_helper_vfp_cmpeh_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            } else {
+                gen_helper_vfp_cmph_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            }
+            break;
+        default:
+            g_assert_not_reached();
         }
+
         tcg_temp_free_i32(tcg_vn);
         tcg_temp_free_i32(tcg_vm);
     }
@@ -4532,16 +4548,35 @@ static void handle_fp_compare(DisasContext *s, bool is_double,
 static void disas_fp_compare(DisasContext *s, uint32_t insn)
 {
     unsigned int mos, type, rm, op, rn, opc, op2r;
+    int size;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     op = extract32(insn, 14, 2);
     rn = extract32(insn, 5, 5);
     opc = extract32(insn, 3, 2);
     op2r = extract32(insn, 0, 3);
 
-    if (mos || op || op2r || type > 1) {
+    if (mos || op || op2r) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (type) {
+    case 0:
+        size = MO_32;
+        break;
+    case 1:
+        size = MO_64;
+        break;
+    case 3:
+        size = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -4550,7 +4585,7 @@ static void disas_fp_compare(DisasContext *s, uint32_t insn)
         return;
     }
 
-    handle_fp_compare(s, type, rn, rm, opc & 1, opc & 2);
+    handle_fp_compare(s, size, rn, rm, opc & 1, opc & 2);
 }
 
 /* Floating point conditional compare
@@ -4564,16 +4599,35 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
     unsigned int mos, type, rm, cond, rn, op, nzcv;
     TCGv_i64 tcg_flags;
     TCGLabel *label_continue = NULL;
+    int size;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     cond = extract32(insn, 12, 4);
     rn = extract32(insn, 5, 5);
     op = extract32(insn, 4, 1);
     nzcv = extract32(insn, 0, 4);
 
-    if (mos || type > 1) {
+    if (mos) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (type) {
+    case 0:
+        size = MO_32;
+        break;
+    case 1:
+        size = MO_64;
+        break;
+    case 3:
+        size = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -4594,7 +4648,7 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
         gen_set_label(label_match);
     }
 
-    handle_fp_compare(s, type, rn, rm, false, op);
+    handle_fp_compare(s, size, rn, rm, false, op);
 
     if (cond < 0x0e) {
         gen_set_label(label_continue);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (10 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 17:01   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) " Richard Henderson
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, Alex Bennée, qemu-stable

From: Alex Bennée <alex.bennee@linaro.org>

These were missed out from the rest of the half-precision work.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[rth: Fix erroneous check vs type]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index f0aca20771..1ea5185f14 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4666,15 +4666,34 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
     unsigned int mos, type, rm, cond, rn, rd;
     TCGv_i64 t_true, t_false, t_zero;
     DisasCompare64 c;
+    TCGMemOp sz;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     cond = extract32(insn, 12, 4);
     rn = extract32(insn, 5, 5);
     rd = extract32(insn, 0, 5);
 
-    if (mos || type > 1) {
+    if (mos) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch(type) {
+    case 0:
+        sz = MO_32;
+        break;
+    case 1:
+        sz = MO_64;
+        break;
+    case 3:
+        sz = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -4683,11 +4702,11 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
         return;
     }
 
-    /* Zero extend sreg inputs to 64 bits now.  */
+    /* Zero extend sreg & hreg inputs to 64 bits now.  */
     t_true = tcg_temp_new_i64();
     t_false = tcg_temp_new_i64();
-    read_vec_element(s, t_true, rn, 0, type ? MO_64 : MO_32);
-    read_vec_element(s, t_false, rm, 0, type ? MO_64 : MO_32);
+    read_vec_element(s, t_true, rn, 0, sz);
+    read_vec_element(s, t_false, rm, 0, sz);
 
     a64_test_cc(&c, cond);
     t_zero = tcg_const_i64(0);
@@ -4696,7 +4715,7 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
     tcg_temp_free_i64(t_false);
     a64_free_cc(&c);
 
-    /* Note that sregs write back zeros to the high bits,
+    /* Note that sregs & hregs write back zeros to the high bits,
        and we've already done the zero-extension.  */
     write_fp_dreg(s, rd, t_true);
     tcg_temp_free_i64(t_true);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (11 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 17:02   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 14/14] target/arm: Fix sqrt_f16 exception raising Richard Henderson
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, Alex Bennée, qemu-stable

From: Alex Bennée <alex.bennee@linaro.org>

All the hard work is already done by vfp_expand_imm, we just need to
make sure we pick up the correct size.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[rth: Merge unallocated_encoding check with TCGMemOp conversion.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 1ea5185f14..b73d6d96cd 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5437,11 +5437,25 @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
 {
     int rd = extract32(insn, 0, 5);
     int imm8 = extract32(insn, 13, 8);
-    int is_double = extract32(insn, 22, 2);
+    int type = extract32(insn, 22, 2);
     uint64_t imm;
     TCGv_i64 tcg_res;
+    TCGMemOp sz;
 
-    if (is_double > 1) {
+    switch (type) {
+    case 0:
+        sz = MO_32;
+        break;
+    case 1:
+        sz = MO_64;
+        break;
+    case 3:
+        sz = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -5450,7 +5464,7 @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
         return;
     }
 
-    imm = vfp_expand_imm(MO_32 + is_double, imm8);
+    imm = vfp_expand_imm(sz, imm8);
 
     tcg_res = tcg_const_i64(imm);
     write_fp_dreg(s, rd, tcg_res);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v2 14/14] target/arm: Fix sqrt_f16 exception raising
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (12 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:33 ` [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 no-reply
  2018-05-10 17:09 ` Peter Maydell
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, Alex Bennée, qemu-stable

From: Alex Bennée <alex.bennee@linaro.org>

We are meant to explicitly pass fpst, not cpu_env.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index b73d6d96cd..07e196c19c 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4739,7 +4739,8 @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
         tcg_gen_xori_i32(tcg_res, tcg_op, 0x8000);
         break;
     case 0x3: /* FSQRT */
-        gen_helper_sqrt_f16(tcg_res, tcg_op, cpu_env);
+        fpst = get_fpstatus_ptr(true);
+        gen_helper_sqrt_f16(tcg_res, tcg_op, fpst);
         break;
     case 0x8: /* FRINTN */
     case 0x9: /* FRINTP */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (13 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 14/14] target/arm: Fix sqrt_f16 exception raising Richard Henderson
@ 2018-05-02 22:33 ` no-reply
  2018-05-10 17:09 ` Peter Maydell
  15 siblings, 0 replies; 24+ messages in thread
From: no-reply @ 2018-05-02 22:33 UTC (permalink / raw)
  To: richard.henderson; +Cc: famz, qemu-devel, peter.maydell, qemu-arm

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180502221552.3873-1-richard.henderson@linaro.org
Subject: [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/20180502221552.3873-1-richard.henderson@linaro.org -> patchew/20180502221552.3873-1-richard.henderson@linaro.org
Switched to a new branch 'test'
0af6837b48 target/arm: Fix sqrt_f16 exception raising
c740b594ef target/arm: Implement FMOV (immediate) for fp16
9869941c58 target/arm: Implement FCSEL for fp16
2f9af143e5 target/arm: Implement FCMP for fp16
d7fd874757 target/arm: Implement FP data-processing (3 source) for fp16
bd693b19f4 target/arm: Implement FP data-processing (2 source) for fp16
a7c1ff7f31 target/arm: Introduce and use read_fp_hreg
5c2dfce3b3 target/arm: Implement FCVT (scalar, fixed-point) for fp16
ece533faad target/arm: Implement FCVT (scalar, integer) for fp16
1d81531581 target/arm: Implement FMOV (general) for fp16
d3e47fd86b target/arm: Clear SVE high bits for FMOV
b8530d66b3 target/arm: Fix float16 to/from int16
96451f8fef target/arm: Implement vector shifted FCVT for fp16
f3d1b8db5d target/arm: Implement vector shifted SCVF/UCVF for fp16

=== OUTPUT BEGIN ===
Checking PATCH 1/14: target/arm: Implement vector shifted SCVF/UCVF for fp16...
Checking PATCH 2/14: target/arm: Implement vector shifted FCVT for fp16...
Checking PATCH 3/14: target/arm: Fix float16 to/from int16...
ERROR: spaces required around that '*' (ctx:WxV)
#45: FILE: target/arm/helper.c:11434:
+static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
                                                                     ^

total: 1 errors, 0 warnings, 83 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 4/14: target/arm: Clear SVE high bits for FMOV...
Checking PATCH 5/14: target/arm: Implement FMOV (general) for fp16...
Checking PATCH 6/14: target/arm: Implement FCVT (scalar, integer) for fp16...
Checking PATCH 7/14: target/arm: Implement FCVT (scalar, fixed-point) for fp16...
Checking PATCH 8/14: target/arm: Introduce and use read_fp_hreg...
Checking PATCH 9/14: target/arm: Implement FP data-processing (2 source) for fp16...
Checking PATCH 10/14: target/arm: Implement FP data-processing (3 source) for fp16...
Checking PATCH 11/14: target/arm: Implement FCMP for fp16...
Checking PATCH 12/14: target/arm: Implement FCSEL for fp16...
ERROR: space required before the open parenthesis '('
#41: FILE: target/arm/translate-a64.c:4683:
+    switch(type) {

total: 1 errors, 0 warnings, 58 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 13/14: target/arm: Implement FMOV (immediate) for fp16...
Checking PATCH 14/14: target/arm: Fix sqrt_f16 exception raising...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16 Richard Henderson
@ 2018-05-10 16:03   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 16:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers, qemu-arm, qemu-stable

On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> While we have some of the scalar paths for *CVF for fp16,
> we failed to decode the fp16 version of these instructions.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT " Richard Henderson
@ 2018-05-10 16:04   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 16:04 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers, qemu-arm, qemu-stable

On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> While we have some of the scalar paths for FCVT for fp16,
> we failed to decode the fp16 version of these instructions.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>
> ---
> v2: Use parens with (x << y) >> z.
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16 Richard Henderson
@ 2018-05-10 16:21   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 16:21 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers, qemu-arm, qemu-stable

On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> Adding the fp16 moves to/from general registers.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-a64.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index c64c3ed99d..247a4f0cce 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -5463,6 +5463,15 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
>              tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_hi_offset(s, rd));
>              clear_vec_high(s, true, rd);
>              break;
> +        case 3:
> +            /* 16 bit */
> +            tmp = tcg_temp_new_i64();
> +            tcg_gen_ext16u_i64(tmp, tcg_rn);
> +            write_fp_dreg(s, rd, tmp);
> +            tcg_temp_free_i64(tmp);
> +            break;
> +        default:
> +            g_assert_not_reached();
>          }
>      } else {
>          TCGv_i64 tcg_rd = cpu_reg(s, rd);
> @@ -5480,6 +5489,12 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
>              /* 64 bits from top half */
>              tcg_gen_ld_i64(tcg_rd, cpu_env, fp_reg_hi_offset(s, rn));
>              break;
> +        case 3:
> +            /* 16 bit */
> +            tcg_gen_ld16u_i64(tcg_rd, cpu_env, fp_reg_offset(s, rn, MO_16));
> +            break;
> +        default:
> +            g_assert_not_reached();
>          }
>      }
>  }
> @@ -5519,10 +5534,15 @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
>          case 0xa: /* 64 bit */
>          case 0xd: /* 64 bit to top half of quad */
>              break;
> +        case 0x6: /* 16-bit */
> +            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
> +                break;
> +            }
> +            /* fallthru */

This is a switch on (sf << 3 | type << 1 | rmode), so this catches
the halfprec <-> 32 bit cases, but it misses the halfprec <-> 64 bit
ops, which have sf == 1. So I think you need a 'case 0xe' as well.

>          default:
>              /* all other sf/type/rmode combinations are invalid */
>              unallocated_encoding(s);
> -            break;
> +            return;

This is a distinct bugfix, I think (though the only effect would
be generation of unnecessary dead code after the undef).

>          }
>
>          if (!fp_access_check(s)) {
> --
> 2.14.3

thanks
-- PMM

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg Richard Henderson
@ 2018-05-10 16:30   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 16:30 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers, qemu-arm, qemu-stable

On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-a64.c | 30 ++++++++++++++----------------
>  1 file changed, 14 insertions(+), 16 deletions(-)
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP " Richard Henderson
@ 2018-05-10 16:59   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 16:59 UTC (permalink / raw)
  To: Richard Henderson
  Cc: QEMU Developers, qemu-arm, Alex Bennée, qemu-stable

On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> From: Alex Bennée <alex.bennee@linaro.org>
>
> These where missed out from the rest of the half-precision work.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> [rth: Diagnose lack of FP16 before fp_access_check]
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper-a64.h    |  2 ++
>  target/arm/helper-a64.c    | 10 ++++++
>  target/arm/translate-a64.c | 88 +++++++++++++++++++++++++++++++++++++---------
>  3 files changed, 83 insertions(+), 17 deletions(-)
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL " Richard Henderson
@ 2018-05-10 17:01   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 17:01 UTC (permalink / raw)
  To: Richard Henderson
  Cc: QEMU Developers, qemu-arm, Alex Bennée, qemu-stable

On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> From: Alex Bennée <alex.bennee@linaro.org>
>
> These were missed out from the rest of the half-precision work.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> [rth: Fix erroneous check vs type]
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) " Richard Henderson
@ 2018-05-10 17:02   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 17:02 UTC (permalink / raw)
  To: Richard Henderson
  Cc: QEMU Developers, qemu-arm, Alex Bennée, qemu-stable

On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> From: Alex Bennée <alex.bennee@linaro.org>
>
> All the hard work is already done by vfp_expand_imm, we just need to
> make sure we pick up the correct size.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> [rth: Merge unallocated_encoding check with TCGMemOp conversion.]
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (14 preceding siblings ...)
  2018-05-02 22:33 ` [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 no-reply
@ 2018-05-10 17:09 ` Peter Maydell
  15 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 17:09 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers, qemu-arm

On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> When running the gcc testsuite with current aarch64-linux-user,
> the testsuite detects the presence of the fp16 extension and
> enables lots of extra tests for builtins.
>
> Quite a few of these new tests fail because we missed implementing
> some instructions.  We really should go back and verify that nothing
> else is missing from this (rather large) extension.
>
> In addition, it tests some edge conditions on data that show flaws
> in the way we were performing integer<->fp conversion; particularly
> with respect to scaled conversion.
>
> Changes since v1:
>   * Rebased vs master instead of tgt-arm-sve-9.
>   * Alex did some additional digging through the ARM xhtml
>     and came up with some additional missing instructions.
>   * Everything cc'd to qemu-stable.

I'm going to take patches 1..4 into target-arm.next, just to
reduce the size of this a bit. Patch 5 is the FMOV one which
is missing some decode. Otherwise I've reviewed all the ones
which still needed review. Don't forget to fix the patch 12
patchew warning:
ERROR: space required before the open parenthesis '('
#41: FILE: target/arm/translate-a64.c:4683:
+    switch(type) {

thanks
-- PMM

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-05-10 17:09 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16 Richard Henderson
2018-05-10 16:03   ` Peter Maydell
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT " Richard Henderson
2018-05-10 16:04   ` Peter Maydell
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 03/14] target/arm: Fix float16 to/from int16 Richard Henderson
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 04/14] target/arm: Clear SVE high bits for FMOV Richard Henderson
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16 Richard Henderson
2018-05-10 16:21   ` Peter Maydell
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 06/14] target/arm: Implement FCVT (scalar, integer) " Richard Henderson
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 07/14] target/arm: Implement FCVT (scalar, fixed-point) " Richard Henderson
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg Richard Henderson
2018-05-10 16:30   ` Peter Maydell
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 09/14] target/arm: Implement FP data-processing (2 source) for fp16 Richard Henderson
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 10/14] target/arm: Implement FP data-processing (3 " Richard Henderson
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP " Richard Henderson
2018-05-10 16:59   ` Peter Maydell
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL " Richard Henderson
2018-05-10 17:01   ` Peter Maydell
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) " Richard Henderson
2018-05-10 17:02   ` Peter Maydell
2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 14/14] target/arm: Fix sqrt_f16 exception raising Richard Henderson
2018-05-02 22:33 ` [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 no-reply
2018-05-10 17:09 ` Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.