All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Cc: "Richard Henderson" <richard.henderson@linaro.org>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>
Subject: [PATCH v2 10/12] target/arm: Optimize MVE VSHLL and VMOVL
Date: Mon, 13 Sep 2021 10:54:38 +0100	[thread overview]
Message-ID: <20210913095440.13462-11-peter.maydell@linaro.org> (raw)
In-Reply-To: <20210913095440.13462-1-peter.maydell@linaro.org>

Optimize the MVE VSHLL insns by using TCG vector ops when possible.
This includes the VMOVL insn, which we handle in mve.decode as "VSHLL
with zero shift count".

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
The cases here that I've implemented with ANDI then shift
could also be implemented as shift-then-shift. Is one better
than another?
---
 target/arm/translate-mve.c | 67 +++++++++++++++++++++++++++++++++-----
 1 file changed, 59 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index 00fa4379a74..5d66f70657e 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -1735,16 +1735,67 @@ DO_2SHIFT_SCALAR(VQSHL_U_scalar, vqshli_u)
 DO_2SHIFT_SCALAR(VQRSHL_S_scalar, vqrshli_s)
 DO_2SHIFT_SCALAR(VQRSHL_U_scalar, vqrshli_u)
 
-#define DO_VSHLL(INSN, FN)                                      \
-    static bool trans_##INSN(DisasContext *s, arg_2shift *a)    \
-    {                                                           \
-        static MVEGenTwoOpShiftFn * const fns[] = {             \
-            gen_helper_mve_##FN##b,                             \
-            gen_helper_mve_##FN##h,                             \
-        };                                                      \
-        return do_2shift(s, a, fns[a->size], false);            \
+#define DO_VSHLL(INSN, FN)                                              \
+    static bool trans_##INSN(DisasContext *s, arg_2shift *a)            \
+    {                                                                   \
+        static MVEGenTwoOpShiftFn * const fns[] = {                     \
+            gen_helper_mve_##FN##b,                                     \
+            gen_helper_mve_##FN##h,                                     \
+        };                                                              \
+        return do_2shift_vec(s, a, fns[a->size], false, do_gvec_##FN);  \
     }
 
+/*
+ * For the VSHLL vector helpers, the vece is the size of the input
+ * (ie MO_8 or MO_16); the helpers want to work in the output size.
+ * The shift count can be 0..<input size>, inclusive. (0 is VMOVL.)
+ */
+static void do_gvec_vshllbs(unsigned vece, uint32_t dofs, uint32_t aofs,
+                            int64_t shift, uint32_t oprsz, uint32_t maxsz)
+{
+    unsigned ovece = vece + 1;
+    unsigned ibits = vece == MO_8 ? 8 : 16;
+    tcg_gen_gvec_shli(ovece, dofs, aofs, ibits, oprsz, maxsz);
+    tcg_gen_gvec_sari(ovece, dofs, dofs, ibits - shift, oprsz, maxsz);
+}
+
+static void do_gvec_vshllbu(unsigned vece, uint32_t dofs, uint32_t aofs,
+                            int64_t shift, uint32_t oprsz, uint32_t maxsz)
+{
+    unsigned ovece = vece + 1;
+    tcg_gen_gvec_andi(ovece, dofs, aofs,
+                      ovece == MO_16 ? 0xff : 0xffff, oprsz, maxsz);
+    tcg_gen_gvec_shli(ovece, dofs, dofs, shift, oprsz, maxsz);
+}
+
+static void do_gvec_vshllts(unsigned vece, uint32_t dofs, uint32_t aofs,
+                            int64_t shift, uint32_t oprsz, uint32_t maxsz)
+{
+    unsigned ovece = vece + 1;
+    unsigned ibits = vece == MO_8 ? 8 : 16;
+    if (shift == 0) {
+        tcg_gen_gvec_sari(ovece, dofs, aofs, ibits, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_andi(ovece, dofs, aofs,
+                          ovece == MO_16 ? 0xff00 : 0xffff0000, oprsz, maxsz);
+        tcg_gen_gvec_sari(ovece, dofs, dofs, ibits - shift, oprsz, maxsz);
+    }
+}
+
+static void do_gvec_vshlltu(unsigned vece, uint32_t dofs, uint32_t aofs,
+                            int64_t shift, uint32_t oprsz, uint32_t maxsz)
+{
+    unsigned ovece = vece + 1;
+    unsigned ibits = vece == MO_8 ? 8 : 16;
+    if (shift == 0) {
+        tcg_gen_gvec_shri(ovece, dofs, aofs, ibits, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_andi(ovece, dofs, aofs,
+                          ovece == MO_16 ? 0xff00 : 0xffff0000, oprsz, maxsz);
+        tcg_gen_gvec_shri(ovece, dofs, dofs, ibits - shift, oprsz, maxsz);
+    }
+}
+
 DO_VSHLL(VSHLL_BS, vshllbs)
 DO_VSHLL(VSHLL_BU, vshllbu)
 DO_VSHLL(VSHLL_TS, vshllts)
-- 
2.20.1



  parent reply	other threads:[~2021-09-13 10:06 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-13  9:54 [PATCH v2 00/12] target/arm: Use TCG vector ops for MVE Peter Maydell
2021-09-13  9:54 ` [PATCH v2 01/12] target/arm: Avoid goto_tb if we're trying to exit to the main loop Peter Maydell
2021-09-13 13:36   ` Richard Henderson
2021-09-13  9:54 ` [PATCH v2 02/12] target/arm: Enforce that FPDSCR.LTPSIZE is 4 on inbound migration Peter Maydell
2021-09-13 13:39   ` Richard Henderson
2021-09-13  9:54 ` [PATCH v2 03/12] target/arm: Add TB flag for "MVE insns not predicated" Peter Maydell
2021-09-13 13:44   ` Richard Henderson
2021-09-13  9:54 ` [PATCH v2 04/12] target/arm: Optimize MVE logic ops Peter Maydell
2021-09-13  9:54 ` [PATCH v2 05/12] target/arm: Optimize MVE arithmetic ops Peter Maydell
2021-09-13  9:54 ` [PATCH v2 06/12] target/arm: Optimize MVE VNEG, VABS Peter Maydell
2021-09-13  9:54 ` [PATCH v2 07/12] target/arm: Optimize MVE VDUP Peter Maydell
2021-09-13 13:46   ` Richard Henderson
2021-09-13  9:54 ` [PATCH v2 08/12] target/arm: Optimize MVE VMVN Peter Maydell
2021-09-13 13:47   ` Richard Henderson
2021-09-13  9:54 ` [PATCH v2 09/12] target/arm: Optimize MVE VSHL, VSHR immediate forms Peter Maydell
2021-09-13 13:56   ` Richard Henderson
2021-09-13 14:21     ` Peter Maydell
2021-09-13 15:53       ` Richard Henderson
2021-09-16 10:01         ` Peter Maydell
2021-09-16 14:39           ` Richard Henderson
2021-09-13  9:54 ` Peter Maydell [this message]
2021-09-13 14:04   ` [PATCH v2 10/12] target/arm: Optimize MVE VSHLL and VMOVL Richard Henderson
2021-09-13 14:22     ` Peter Maydell
2021-09-13 15:56       ` Richard Henderson
2021-09-13  9:54 ` [PATCH v2 11/12] target/arm: Optimize MVE VSLI and VSRI Peter Maydell
2021-09-13 14:04   ` Richard Henderson
2021-09-13  9:54 ` [PATCH v2 12/12] target/arm: Optimize MVE 1op-immediate insns Peter Maydell
2021-09-13 14:09   ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210913095440.13462-11-peter.maydell@linaro.org \
    --to=peter.maydell@linaro.org \
    --cc=f4bug@amsat.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.