All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PULL 47/57] target/arm: Implement MVE VQDMLADH and VQRDMLADH
Date: Mon, 21 Jun 2021 17:28:23 +0100	[thread overview]
Message-ID: <20210621162833.32535-48-peter.maydell@linaro.org> (raw)
In-Reply-To: <20210621162833.32535-1-peter.maydell@linaro.org>

Implement the MVE VQDMLADH and VQRDMLADH insns.  These multiply
elements, and then add pairs of products, double, possibly round,
saturate and return the high half of the result.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-37-peter.maydell@linaro.org
---
 target/arm/helper-mve.h    | 16 +++++++
 target/arm/mve.decode      |  5 +++
 target/arm/mve_helper.c    | 89 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-mve.c |  4 ++
 4 files changed, 114 insertions(+)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index b7e2243a19a..c3cc6a08476 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -201,6 +201,22 @@ DEF_HELPER_FLAGS_4(mve_vqrshlub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
 DEF_HELPER_FLAGS_4(mve_vqrshluh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
 DEF_HELPER_FLAGS_4(mve_vqrshluw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
 
+DEF_HELPER_FLAGS_4(mve_vqdmladhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vqdmladhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vqdmladhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+
+DEF_HELPER_FLAGS_4(mve_vqdmladhxb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vqdmladhxh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vqdmladhxw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+
+DEF_HELPER_FLAGS_4(mve_vqrdmladhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vqrdmladhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vqrdmladhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+
+DEF_HELPER_FLAGS_4(mve_vqrdmladhxb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vqrdmladhxh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vqrdmladhxw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+
 DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
index c30fb2c1536..d267c8838eb 100644
--- a/target/arm/mve.decode
+++ b/target/arm/mve.decode
@@ -142,6 +142,11 @@ VQSHL_U          111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev
 VQRSHL_S         111 0 1111 0 . .. ... 0 ... 0 0101 . 1 . 1 ... 0 @2op_rev
 VQRSHL_U         111 1 1111 0 . .. ... 0 ... 0 0101 . 1 . 1 ... 0 @2op_rev
 
+VQDMLADH         1110 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 0 @2op
+VQDMLADHX        1110 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op
+VQRDMLADH        1110 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op
+VQRDMLADHX       1110 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op
+
 # Vector miscellaneous
 
 VCLS             1111 1111 1 . 11 .. 00 ... 0 0100 01 . 0 ... 0 @1op
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index 07da231fe64..76796b0f920 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -627,6 +627,95 @@ DO_2OP_SAT_U(vqshlu, DO_UQSHL_OP)
 DO_2OP_SAT_S(vqrshls, DO_SQRSHL_OP)
 DO_2OP_SAT_U(vqrshlu, DO_UQRSHL_OP)
 
+/*
+ * Multiply add dual returning high half
+ * The 'FN' here takes four inputs A, B, C, D, a 0/1 indicator of
+ * whether to add the rounding constant, and the pointer to the
+ * saturation flag, and should do "(A * B + C * D) * 2 + rounding constant",
+ * saturate to twice the input size and return the high half; or
+ * (A * B - C * D) etc for VQDMLSDH.
+ */
+#define DO_VQDMLADH_OP(OP, ESIZE, TYPE, XCHG, ROUND, FN)                \
+    void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn,   \
+                                void *vm)                               \
+    {                                                                   \
+        TYPE *d = vd, *n = vn, *m = vm;                                 \
+        uint16_t mask = mve_element_mask(env);                          \
+        unsigned e;                                                     \
+        bool qc = false;                                                \
+        for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) {              \
+            bool sat = false;                                           \
+            if ((e & 1) == XCHG) {                                      \
+                TYPE r = FN(n[H##ESIZE(e)],                             \
+                            m[H##ESIZE(e - XCHG)],                      \
+                            n[H##ESIZE(e + (1 - 2 * XCHG))],            \
+                            m[H##ESIZE(e + (1 - XCHG))],                \
+                            ROUND, &sat);                               \
+                mergemask(&d[H##ESIZE(e)], r, mask);                    \
+                qc |= sat & mask & 1;                                   \
+            }                                                           \
+        }                                                               \
+        if (qc) {                                                       \
+            env->vfp.qc[0] = qc;                                        \
+        }                                                               \
+        mve_advance_vpt(env);                                           \
+    }
+
+static int8_t do_vqdmladh_b(int8_t a, int8_t b, int8_t c, int8_t d,
+                            int round, bool *sat)
+{
+    int64_t r = ((int64_t)a * b + (int64_t)c * d) * 2 + (round << 7);
+    return do_sat_bhw(r, INT16_MIN, INT16_MAX, sat) >> 8;
+}
+
+static int16_t do_vqdmladh_h(int16_t a, int16_t b, int16_t c, int16_t d,
+                             int round, bool *sat)
+{
+    int64_t r = ((int64_t)a * b + (int64_t)c * d) * 2 + (round << 15);
+    return do_sat_bhw(r, INT32_MIN, INT32_MAX, sat) >> 16;
+}
+
+static int32_t do_vqdmladh_w(int32_t a, int32_t b, int32_t c, int32_t d,
+                             int round, bool *sat)
+{
+    int64_t m1 = (int64_t)a * b;
+    int64_t m2 = (int64_t)c * d;
+    int64_t r;
+    /*
+     * Architecturally we should do the entire add, double, round
+     * and then check for saturation. We do three saturating adds,
+     * but we need to be careful about the order. If the first
+     * m1 + m2 saturates then it's impossible for the *2+rc to
+     * bring it back into the non-saturated range. However, if
+     * m1 + m2 is negative then it's possible that doing the doubling
+     * would take the intermediate result below INT64_MAX and the
+     * addition of the rounding constant then brings it back in range.
+     * So we add half the rounding constant before doubling rather
+     * than adding the rounding constant after the doubling.
+     */
+    if (sadd64_overflow(m1, m2, &r) ||
+        sadd64_overflow(r, (round << 30), &r) ||
+        sadd64_overflow(r, r, &r)) {
+        *sat = true;
+        return r < 0 ? INT32_MAX : INT32_MIN;
+    }
+    return r >> 32;
+}
+
+DO_VQDMLADH_OP(vqdmladhb, 1, int8_t, 0, 0, do_vqdmladh_b)
+DO_VQDMLADH_OP(vqdmladhh, 2, int16_t, 0, 0, do_vqdmladh_h)
+DO_VQDMLADH_OP(vqdmladhw, 4, int32_t, 0, 0, do_vqdmladh_w)
+DO_VQDMLADH_OP(vqdmladhxb, 1, int8_t, 1, 0, do_vqdmladh_b)
+DO_VQDMLADH_OP(vqdmladhxh, 2, int16_t, 1, 0, do_vqdmladh_h)
+DO_VQDMLADH_OP(vqdmladhxw, 4, int32_t, 1, 0, do_vqdmladh_w)
+
+DO_VQDMLADH_OP(vqrdmladhb, 1, int8_t, 0, 1, do_vqdmladh_b)
+DO_VQDMLADH_OP(vqrdmladhh, 2, int16_t, 0, 1, do_vqdmladh_h)
+DO_VQDMLADH_OP(vqrdmladhw, 4, int32_t, 0, 1, do_vqdmladh_w)
+DO_VQDMLADH_OP(vqrdmladhxb, 1, int8_t, 1, 1, do_vqdmladh_b)
+DO_VQDMLADH_OP(vqrdmladhxh, 2, int16_t, 1, 1, do_vqdmladh_h)
+DO_VQDMLADH_OP(vqrdmladhxw, 4, int32_t, 1, 1, do_vqdmladh_w)
+
 #define DO_2OP_SCALAR(OP, ESIZE, TYPE, FN)                              \
     void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn,   \
                                 uint32_t rm)                            \
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index d75cc377fee..d830b42d5ce 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -410,6 +410,10 @@ DO_2OP(VQSHL_S, vqshls)
 DO_2OP(VQSHL_U, vqshlu)
 DO_2OP(VQRSHL_S, vqrshls)
 DO_2OP(VQRSHL_U, vqrshlu)
+DO_2OP(VQDMLADH, vqdmladh)
+DO_2OP(VQDMLADHX, vqdmladhx)
+DO_2OP(VQRDMLADH, vqrdmladh)
+DO_2OP(VQRDMLADHX, vqrdmladhx)
 
 static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
                           MVEGenTwoOpScalarFn fn)
-- 
2.20.1



  parent reply	other threads:[~2021-06-21 16:59 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-21 16:27 [PULL 00/57] target-arm queue Peter Maydell
2021-06-21 16:27 ` [PULL 01/57] hw/acpi: Provide stub version of acpi_ghes_record_errors() Peter Maydell
2021-06-21 16:27 ` [PULL 02/57] hw/acpi: Provide function acpi_ghes_present() Peter Maydell
2021-06-21 16:27 ` [PULL 03/57] target/arm: Use acpi_ghes_present() to see if we report ACPI memory errors Peter Maydell
2021-06-21 16:27 ` [PULL 04/57] docs/system/arm: Document which architecture extensions we emulate Peter Maydell
2021-06-21 16:27 ` [PULL 05/57] target/arm/translate-vfp.c: Whitespace fixes Peter Maydell
2021-06-21 16:27 ` [PULL 06/57] target/arm: Handle FPU being disabled in FPCXT_NS accesses Peter Maydell
2021-06-21 16:27 ` [PULL 07/57] target/arm: Don't NOCP fault for " Peter Maydell
2021-06-21 16:27 ` [PULL 08/57] target/arm: Handle writeback in VLDR/VSTR sysreg with no memory access Peter Maydell
2021-06-21 16:27 ` [PULL 09/57] target/arm: Factor FP context update code out into helper function Peter Maydell
2021-06-21 16:27 ` [PULL 10/57] target/arm: Split vfp_access_check() into A and M versions Peter Maydell
2021-06-21 16:27 ` [PULL 11/57] target/arm: Handle FPU check for FPCXT_NS insns via vfp_access_check_m() Peter Maydell
2021-06-21 16:27 ` [PULL 12/57] target/arm: Implement MVE VLDR/VSTR (non-widening forms) Peter Maydell
2021-06-21 16:27 ` [PULL 13/57] target/arm: Implement widening/narrowing MVE VLDR/VSTR insns Peter Maydell
2021-06-21 16:27 ` [PULL 14/57] target/arm: Implement MVE VCLZ Peter Maydell
2021-06-21 16:27 ` [PULL 15/57] target/arm: Implement MVE VCLS Peter Maydell
2021-06-21 16:27 ` [PULL 16/57] target/arm: Implement MVE VREV16, VREV32, VREV64 Peter Maydell
2021-06-21 16:27 ` [PULL 17/57] target/arm: Implement MVE VMVN (register) Peter Maydell
2021-06-21 16:27 ` [PULL 18/57] target/arm: Implement MVE VABS Peter Maydell
2021-06-21 16:27 ` [PULL 19/57] target/arm: Implement MVE VNEG Peter Maydell
2021-06-21 16:27 ` [PULL 20/57] tcg: Make gen_dup_i32/i64() public as tcg_gen_dup_i32/i64 Peter Maydell
2021-06-21 16:27 ` [PULL 21/57] target/arm: Implement MVE VDUP Peter Maydell
2021-06-21 16:27 ` [PULL 22/57] target/arm: Implement MVE VAND, VBIC, VORR, VORN, VEOR Peter Maydell
2021-06-21 16:27 ` [PULL 23/57] target/arm: Implement MVE VADD, VSUB, VMUL Peter Maydell
2021-06-21 16:28 ` [PULL 24/57] target/arm: Implement MVE VMULH Peter Maydell
2021-06-21 16:28 ` [PULL 25/57] target/arm: Implement MVE VRMULH Peter Maydell
2021-06-21 16:28 ` [PULL 26/57] target/arm: Implement MVE VMAX, VMIN Peter Maydell
2021-06-21 16:28 ` [PULL 27/57] target/arm: Implement MVE VABD Peter Maydell
2021-06-21 16:28 ` [PULL 28/57] target/arm: Implement MVE VHADD, VHSUB Peter Maydell
2021-06-21 16:28 ` [PULL 29/57] target/arm: Implement MVE VMULL Peter Maydell
2021-06-21 16:28 ` [PULL 30/57] target/arm: Implement MVE VMLALDAV Peter Maydell
2021-06-21 16:28 ` [PULL 31/57] target/arm: Implement MVE VMLSLDAV Peter Maydell
2021-06-21 16:28 ` [PULL 32/57] target/arm: Implement MVE VRMLALDAVH, VRMLSLDAVH Peter Maydell
2021-06-21 16:28 ` [PULL 33/57] target/arm: Implement MVE VADD (scalar) Peter Maydell
2021-06-21 16:28 ` [PULL 34/57] target/arm: Implement MVE VSUB, VMUL (scalar) Peter Maydell
2021-06-21 16:28 ` [PULL 35/57] target/arm: Implement MVE VHADD, VHSUB (scalar) Peter Maydell
2021-06-21 16:28 ` [PULL 36/57] target/arm: Implement MVE VBRSR Peter Maydell
2021-06-21 16:28 ` [PULL 37/57] target/arm: Implement MVE VPST Peter Maydell
2021-06-21 16:28 ` [PULL 38/57] target/arm: Implement MVE VQADD and VQSUB Peter Maydell
2021-06-21 16:28 ` [PULL 39/57] target/arm: Implement MVE VQDMULH and VQRDMULH (scalar) Peter Maydell
2021-06-21 16:28 ` [PULL 40/57] target/arm: Implement MVE VQDMULL scalar Peter Maydell
2021-06-21 16:28 ` [PULL 41/57] target/arm: Implement MVE VQDMULH, VQRDMULH (vector) Peter Maydell
2021-06-21 16:28 ` [PULL 42/57] target/arm: Implement MVE VQADD, VQSUB (vector) Peter Maydell
2021-06-21 16:28 ` [PULL 43/57] target/arm: Implement MVE VQSHL (vector) Peter Maydell
2021-06-21 16:28 ` [PULL 44/57] target/arm: Implement MVE VQRSHL Peter Maydell
2021-06-21 16:28 ` [PULL 45/57] target/arm: Implement MVE VSHL insn Peter Maydell
2021-06-21 16:28 ` [PULL 46/57] target/arm: Implement MVE VRSHL Peter Maydell
2021-06-21 16:28 ` Peter Maydell [this message]
2021-06-21 16:28 ` [PULL 48/57] target/arm: Implement MVE VQDMLSDH and VQRDMLSDH Peter Maydell
2021-06-21 16:28 ` [PULL 49/57] target/arm: Implement MVE VQDMULL (vector) Peter Maydell
2021-06-21 16:28 ` [PULL 50/57] target/arm: Implement MVE VRHADD Peter Maydell
2021-06-21 16:28 ` [PULL 51/57] target/arm: Implement MVE VADC, VSBC Peter Maydell
2021-06-21 16:28 ` [PULL 52/57] target/arm: Implement MVE VCADD Peter Maydell
2021-06-21 16:28 ` [PULL 53/57] target/arm: Implement MVE VHCADD Peter Maydell
2021-06-21 16:28 ` [PULL 54/57] target/arm: Implement MVE VADDV Peter Maydell
2021-06-21 16:28 ` [PULL 55/57] target/arm: Make VMOV scalar <-> gpreg beatwise for MVE Peter Maydell
2021-06-21 16:28 ` [PULL 56/57] target/arm: Implement MTE3 Peter Maydell
2021-06-21 16:28 ` [PULL 57/57] docs/system: arm: Add nRF boards description Peter Maydell
2021-06-21 17:25 ` [PULL 00/57] target-arm queue no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210621162833.32535-48-peter.maydell@linaro.org \
    --to=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.