From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 19/20] target/arm: Convert load/store single structure to decodetree
Date: Fri,  2 Jun 2023 16:52:22 +0100
Message-ID: <20230602155223.2040685-20-peter.maydell@linaro.org>
In-Reply-To: <20230602155223.2040685-1-peter.maydell@linaro.org>

Convert the ASIMD load/store single structure insns to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
I note that, compared to the old decoder, this is rather harder
to check against the pseudocode; the old hand-decode can
follow the pseudocode quite closely with its switch on 'scale',
whereas here quite a lot of magic is happening in the calculation
of 'index'.
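
To make the 'index' magic a little easier to check, here is a small
standalone sketch (not part of the patch) that models both calculations
and compares them over every non-replicating encoding. It assumes the
%ldst_single_index_scaled field concatenates q:s:sz:scale with q as the
most significant bit. pack_fields(), old_decode() and count_mismatches()
are hypothetical helper names made up for this illustration, and
uimm_scaled_down() is reproduced here without its DisasContext argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: build q:s:sz:scale as the decode field would. */
static unsigned pack_fields(unsigned q, unsigned s, unsigned sz,
                            unsigned scale)
{
    return (q << 6) | (s << 5) | (sz << 3) | scale;
}

/* Same arithmetic as the patch: immediate in [31:3], scale in [2:0]. */
static unsigned uimm_scaled_down(unsigned x)
{
    unsigned imm = x >> 3;
    unsigned scale = x & 7;
    return imm >> scale;
}

/*
 * Model of the old hand-decode. opc_scale is opc<2:1>; the replicating
 * case (3) is not modelled. Returns false for unallocated encodings,
 * otherwise writes the lane index and the final scale.
 */
static bool old_decode(unsigned q, unsigned s, unsigned size,
                       unsigned opc_scale, unsigned *index, unsigned *scale)
{
    unsigned idx = (q << 3) | (s << 2) | size;

    switch (opc_scale) {
    case 0:
        break;
    case 1:
        if (size & 1) {
            return false;
        }
        idx >>= 1;
        break;
    case 2:
        if (size & 2) {
            return false;
        }
        if (!(size & 1)) {
            idx >>= 2;
        } else {
            if (s) {
                return false;
            }
            idx >>= 3;
            opc_scale = 3;
        }
        break;
    default:
        return false;
    }
    *index = idx;
    *scale = opc_scale;
    return true;
}

/* Count encodings where the two index calculations disagree. */
static int count_mismatches(void)
{
    int bad = 0;

    for (unsigned q = 0; q < 2; q++) {
        for (unsigned s = 0; s < 2; s++) {
            for (unsigned size = 0; size < 4; size++) {
                for (unsigned opc_scale = 0; opc_scale < 3; opc_scale++) {
                    unsigned index, scale;

                    if (!old_decode(q, s, size, opc_scale, &index, &scale)) {
                        continue;
                    }
                    /* The pattern lines pass the same sz and final scale. */
                    if (uimm_scaled_down(pack_fields(q, s, size, scale))
                        != index) {
                        bad++;
                    }
                }
            }
        }
    }
    return bad;
}
```

Under those assumptions the two agree: for example a halfword access
with Q=0, S=1, size=0b10 gives lane 3 either way, and the doubleword
case (scale=3, sz=1, s=0) reduces to index = Q.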
---
 target/arm/tcg/a64.decode      |  37 ++++++
 target/arm/tcg/translate-a64.c | 228 ++++++++++++++++-----------------
 2 files changed, 148 insertions(+), 117 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 12d331b4c2a..48461a0540e 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -494,3 +494,40 @@ LD_mult         0 . 001100 . 1 0 ..... 0110 .. ..... ..... @ldst_mult rpt=3 sele
 LD_mult         0 . 001100 . 1 0 ..... 0111 .. ..... ..... @ldst_mult rpt=1 selem=1
 LD_mult         0 . 001100 . 1 0 ..... 1000 .. ..... ..... @ldst_mult rpt=1 selem=2
 LD_mult         0 . 001100 . 1 0 ..... 1010 .. ..... ..... @ldst_mult rpt=2 selem=1
+
+# Load/store single structure
+
+%ldst_single_selem 13:1 21:1 !function=plus_1
+# The index is made up from bits Q, S and the size; we may then need to scale
+# it down by the size.
+%ldst_single_index q:1 s:1 sz:2
+%ldst_single_index_scaled q:1 s:1 sz:2 scale:3 !function=uimm_scaled_down
+%ldst_single_repl_scale 10:2
+
+# We don't care about S in the trans functions (the decode folds it into
+# the calculation of index), but we have to list it here so that we can
+# handle the S-must-be-0 pattern lines. Similarly we don't care about sz
+# once it has been used to calculate index.
+&ldst_single    rm rn rt sz p q s selem index scale repl
+
+@ldst_single        . q:1 ...... p:1 . . rm:5 ... . .. rn:5 rt:5 \
+                    &ldst_single index=%ldst_single_index_scaled \
+                    selem=%ldst_single_selem repl=0
+@ldst_single_repl   . q:1 ...... p:1 . . rm:5 ... . sz:2 rn:5 rt:5 \
+                    &ldst_single index=%ldst_single_index \
+                    scale=%ldst_single_repl_scale selem=%ldst_single_selem repl=1
+
+
+ST_single       0 . 001101 . 0 . ..... 00 . s:1 sz:2 ..... ..... @ldst_single scale=0
+ST_single       0 . 001101 . 0 . ..... 01 . s:1 00 ..... ..... @ldst_single scale=1 sz=0
+ST_single       0 . 001101 . 0 . ..... 01 . s:1 10 ..... ..... @ldst_single scale=1 sz=2
+ST_single       0 . 001101 . 0 . ..... 10 . s:1 00 ..... ..... @ldst_single scale=2 sz=0
+ST_single       0 . 001101 . 0 . ..... 10 . 0 01 ..... ..... @ldst_single scale=3 sz=1 s=0
+
+LD_single       0 . 001101 . 1 . ..... 00 . s:1 sz:2 ..... ..... @ldst_single scale=0
+LD_single       0 . 001101 . 1 . ..... 01 . s:1 00 ..... ..... @ldst_single scale=1 sz=0
+LD_single       0 . 001101 . 1 . ..... 01 . s:1 10 ..... ..... @ldst_single scale=1 sz=2
+LD_single       0 . 001101 . 1 . ..... 10 . s:1 00 ..... ..... @ldst_single scale=2 sz=0
+LD_single       0 . 001101 . 1 . ..... 10 . 0 01 ..... ..... @ldst_single scale=3 sz=1 s=0
+# Replicating load case
+LD_single_repl  0 . 001101 . 1 . ..... 11 . 0 .. ..... ..... @ldst_single_repl s=0
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index c3b22a74dd5..128c2b8b4b5 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -72,6 +72,17 @@ static int uimm_scaled(DisasContext *s, int x)
     return imm << scale;
 }
 
+/*
+ * For ASIMD load/store single structure: immediate is in bits [31:3],
+ * and should be scaled down by the scale in bits [2:0].
+ */
+static int uimm_scaled_down(DisasContext *s, int x)
+{
+    unsigned imm = x >> 3;
+    unsigned scale = extract32(x, 0, 3);
+    return imm >> scale;
+}
+
 /*
  * Include the generated decoders.
  */
@@ -3405,140 +3416,126 @@ static bool trans_ST_mult(DisasContext *s, arg_ldst_mult *a)
     return true;
 }
 
-/* AdvSIMD load/store single structure
- *
- *  31  30  29           23 22 21 20       16 15 13 12  11  10 9    5 4    0
- * +---+---+---------------+-----+-----------+-----+---+------+------+------+
- * | 0 | Q | 0 0 1 1 0 1 0 | L R | 0 0 0 0 0 | opc | S | size |  Rn  |  Rt  |
- * +---+---+---------------+-----+-----------+-----+---+------+------+------+
- *
- * AdvSIMD load/store single structure (post-indexed)
- *
- *  31  30  29           23 22 21 20       16 15 13 12  11  10 9    5 4    0
- * +---+---+---------------+-----+-----------+-----+---+------+------+------+
- * | 0 | Q | 0 0 1 1 0 1 1 | L R |     Rm    | opc | S | size |  Rn  |  Rt  |
- * +---+---+---------------+-----+-----------+-----+---+------+------+------+
- *
- * Rt: first (or only) SIMD&FP register to be transferred
- * Rn: base address or SP
- * Rm (post-index only): post-index register (when !31) or size dependent #imm
- * index = encoded in Q:S:size dependent on size
- *
- * lane_size = encoded in R, opc
- * transfer width = encoded in opc, S, size
- */
-static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
+static bool trans_ST_single(DisasContext *s, arg_ldst_single *a)
 {
-    int rt = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int rm = extract32(insn, 16, 5);
-    int size = extract32(insn, 10, 2);
-    int S = extract32(insn, 12, 1);
-    int opc = extract32(insn, 13, 3);
-    int R = extract32(insn, 21, 1);
-    int is_load = extract32(insn, 22, 1);
-    int is_postidx = extract32(insn, 23, 1);
-    int is_q = extract32(insn, 30, 1);
-
-    int scale = extract32(opc, 1, 2);
-    int selem = (extract32(opc, 0, 1) << 1 | R) + 1;
-    bool replicate = false;
-    int index = is_q << 3 | S << 2 | size;
-    int xs, total;
+    int xs, total, rt;
     TCGv_i64 clean_addr, tcg_rn, tcg_ebytes;
     MemOp mop;
 
-    if (extract32(insn, 31, 1)) {
-        unallocated_encoding(s);
-        return;
+    if (!a->p && a->rm != 0) {
+        return false;
     }
-    if (!is_postidx && rm != 0) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    switch (scale) {
-    case 3:
-        if (!is_load || S) {
-            unallocated_encoding(s);
-            return;
-        }
-        scale = size;
-        replicate = true;
-        break;
-    case 0:
-        break;
-    case 1:
-        if (extract32(size, 0, 1)) {
-            unallocated_encoding(s);
-            return;
-        }
-        index >>= 1;
-        break;
-    case 2:
-        if (extract32(size, 1, 1)) {
-            unallocated_encoding(s);
-            return;
-        }
-        if (!extract32(size, 0, 1)) {
-            index >>= 2;
-        } else {
-            if (S) {
-                unallocated_encoding(s);
-                return;
-            }
-            index >>= 3;
-            scale = 3;
-        }
-        break;
-    default:
-        g_assert_not_reached();
-    }
-
     if (!fp_access_check(s)) {
-        return;
+        return true;
     }
 
-    if (rn == 31) {
+    if (a->rn == 31) {
         gen_check_sp_alignment(s);
     }
 
-    total = selem << scale;
-    tcg_rn = cpu_reg_sp(s, rn);
+    total = a->selem << a->scale;
+    tcg_rn = cpu_reg_sp(s, a->rn);
 
-    clean_addr = gen_mte_checkN(s, tcg_rn, !is_load, is_postidx || rn != 31,
-                                total);
-    mop = finalize_memop(s, scale);
+    clean_addr = gen_mte_checkN(s, tcg_rn, true, a->p || a->rn != 31, total);
+    mop = finalize_memop(s, a->scale);
 
-    tcg_ebytes = tcg_constant_i64(1 << scale);
-    for (xs = 0; xs < selem; xs++) {
-        if (replicate) {
-            /* Load and replicate to all elements */
-            TCGv_i64 tcg_tmp = tcg_temp_new_i64();
-
-            tcg_gen_qemu_ld_i64(tcg_tmp, clean_addr, get_mem_index(s), mop);
-            tcg_gen_gvec_dup_i64(scale, vec_full_reg_offset(s, rt),
-                                 (is_q + 1) * 8, vec_full_reg_size(s),
-                                 tcg_tmp);
-        } else {
-            /* Load/store one element per register */
-            if (is_load) {
-                do_vec_ld(s, rt, index, clean_addr, mop);
-            } else {
-                do_vec_st(s, rt, index, clean_addr, mop);
-            }
-        }
+    tcg_ebytes = tcg_constant_i64(1 << a->scale);
+    for (xs = 0, rt = a->rt; xs < a->selem; xs++, rt = (rt + 1) % 32) {
+        do_vec_st(s, rt, a->index, clean_addr, mop);
         tcg_gen_add_i64(clean_addr, clean_addr, tcg_ebytes);
-        rt = (rt + 1) % 32;
     }
 
-    if (is_postidx) {
-        if (rm == 31) {
+    if (a->p) {
+        if (a->rm == 31) {
             tcg_gen_addi_i64(tcg_rn, tcg_rn, total);
         } else {
-            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
+            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, a->rm));
         }
     }
+    return true;
+}
+
+static bool trans_LD_single(DisasContext *s, arg_ldst_single *a)
+{
+    int xs, total, rt;
+    TCGv_i64 clean_addr, tcg_rn, tcg_ebytes;
+    MemOp mop;
+
+    if (!a->p && a->rm != 0) {
+        return false;
+    }
+    if (!fp_access_check(s)) {
+        return true;
+    }
+
+    if (a->rn == 31) {
+        gen_check_sp_alignment(s);
+    }
+
+    total = a->selem << a->scale;
+    tcg_rn = cpu_reg_sp(s, a->rn);
+
+    clean_addr = gen_mte_checkN(s, tcg_rn, false, a->p || a->rn != 31, total);
+    mop = finalize_memop(s, a->scale);
+
+    tcg_ebytes = tcg_constant_i64(1 << a->scale);
+    for (xs = 0, rt = a->rt; xs < a->selem; xs++, rt = (rt + 1) % 32) {
+        do_vec_ld(s, rt, a->index, clean_addr, mop);
+        tcg_gen_add_i64(clean_addr, clean_addr, tcg_ebytes);
+    }
+
+    if (a->p) {
+        if (a->rm == 31) {
+            tcg_gen_addi_i64(tcg_rn, tcg_rn, total);
+        } else {
+            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, a->rm));
+        }
+    }
+    return true;
+}
+
+static bool trans_LD_single_repl(DisasContext *s, arg_ldst_single *a)
+{
+    int xs, total, rt;
+    TCGv_i64 clean_addr, tcg_rn, tcg_ebytes;
+    MemOp mop;
+
+    if (!a->p && a->rm != 0) {
+        return false;
+    }
+    if (!fp_access_check(s)) {
+        return true;
+    }
+
+    if (a->rn == 31) {
+        gen_check_sp_alignment(s);
+    }
+
+    total = a->selem << a->scale;
+    tcg_rn = cpu_reg_sp(s, a->rn);
+
+    clean_addr = gen_mte_checkN(s, tcg_rn, false, a->p || a->rn != 31, total);
+    mop = finalize_memop(s, a->scale);
+
+    tcg_ebytes = tcg_constant_i64(1 << a->scale);
+    for (xs = 0, rt = a->rt; xs < a->selem; xs++, rt = (rt + 1) % 32) {
+        /* Load and replicate to all elements */
+        TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+        tcg_gen_qemu_ld_i64(tcg_tmp, clean_addr, get_mem_index(s), mop);
+        tcg_gen_gvec_dup_i64(a->scale, vec_full_reg_offset(s, rt),
+                             (a->q + 1) * 8, vec_full_reg_size(s), tcg_tmp);
+        tcg_gen_add_i64(clean_addr, clean_addr, tcg_ebytes);
+    }
+
+    if (a->p) {
+        if (a->rm == 31) {
+            tcg_gen_addi_i64(tcg_rn, tcg_rn, total);
+        } else {
+            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, a->rm));
+        }
+    }
+    return true;
 }
 
 /*
@@ -3747,9 +3744,6 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
 static void disas_ldst(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 24, 6)) {
-    case 0x0d: /* AdvSIMD load/store single structure */
-        disas_ldst_single_struct(s, insn);
-        break;
     case 0x19:
         if (extract32(insn, 21, 1) != 0) {
             disas_ldst_tag(s, insn);
-- 
2.34.1


