All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Maydell <peter.maydell@linaro.org>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-arm <qemu-arm@nongnu.org>, QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [PATCH v6 70/82] target/arm: Implement SVE2 LD1RO
Date: Thu, 13 May 2021 17:41:47 +0100	[thread overview]
Message-ID: <CAFEAcA8sf3M_2pVZPn2AwLO0vdc8PoOwWTtxeYYHTxpUphhkuA@mail.gmail.com> (raw)
In-Reply-To: <20210430202610.1136687-71-richard.henderson@linaro.org>

On Fri, 30 Apr 2021 at 22:31, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/sve.decode      |  4 ++
>  target/arm/translate-sve.c | 97 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 101 insertions(+)
>
> diff --git a/target/arm/sve.decode b/target/arm/sve.decode
> index 17adb393ff..df870ce23b 100644
> --- a/target/arm/sve.decode
> +++ b/target/arm/sve.decode
> @@ -1077,11 +1077,15 @@ LD_zpri         1010010 .. nreg:2 0.... 111 ... ..... .....     @rpri_load_msz
>  # SVE load and broadcast quadword (scalar plus scalar)
>  LD1RQ_zprr      1010010 .. 00 ..... 000 ... ..... ..... \
>                  @rprr_load_msz nreg=0
> +LD1RO_zprr      1010010 .. 01 ..... 000 ... ..... ..... \
> +                @rprr_load_msz nreg=0
>
>  # SVE load and broadcast quadword (scalar plus immediate)
>  # LD1RQB, LD1RQH, LD1RQS, LD1RQD
>  LD1RQ_zpri      1010010 .. 00 0.... 001 ... ..... ..... \
>                  @rpri_load_msz nreg=0
> +LD1RO_zpri      1010010 .. 01 0.... 001 ... ..... ..... \
> +                @rpri_load_msz nreg=0
>
>  # SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets)
>  PRF             1000010 00 -1 ----- 0-- --- ----- 0 ----
> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
> index ca393164bc..8a4eb8542f 100644
> --- a/target/arm/translate-sve.c
> +++ b/target/arm/translate-sve.c
> @@ -5586,6 +5586,103 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a)
>      return true;
>  }
>
> +static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
> +{
> +    unsigned vsz = vec_full_reg_size(s);
> +    unsigned vsz_r32;
> +    TCGv_ptr t_pg;
> +    TCGv_i32 t_desc;
> +    int desc, poff, doff;
> +
> +    if (vsz < 32) {
> +        /*
> +         * Note that this UNDEFINED check comes after CheckSVEEnabled()
> +         * in the ARM pseudocode, which is the sve_access_check() done
> +         * in our caller.  We should not now return false from the caller.
> +         */
> +        unallocated_encoding(s);
> +        return;
> +    }
> +
> +    /* Load the first octaword using the normal predicated load helpers.  */
> +
> +    poff = pred_full_reg_offset(s, pg);
> +    if (vsz > 32) {
> +        /*
> +         * Zero-extend the first 32 bits of the predicate into a temporary.
> +         * This avoids triggering an assert making sure we don't have bits
> +         * set within a predicate beyond VQ, but we have lowered VQ to 2
> +         * for this load operation.
> +         */
> +        TCGv_i64 tmp = tcg_temp_new_i64();
> +#ifdef HOST_WORDS_BIGENDIAN
> +        poff += 4;
> +#endif
> +        tcg_gen_ld32u_i64(tmp, cpu_env, poff);
> +
> +        poff = offsetof(CPUARMState, vfp.preg_tmp);
> +        tcg_gen_st_i64(tmp, cpu_env, poff);
> +        tcg_temp_free_i64(tmp);
> +    }
> +
> +    t_pg = tcg_temp_new_ptr();
> +    tcg_gen_addi_ptr(t_pg, cpu_env, poff);
> +
> +    desc = simd_desc(32, 32, zt);
> +    t_desc = tcg_const_i32(desc);

Why put these two lines down here? In do_ldrq() they are higher up...
Unless there's a reason for the two functions to be different we
should keep them the same, I think.

> +
> +    gen_helper_gvec_mem *fn
> +        = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
> +    fn(cpu_env, t_pg, addr, t_desc);
> +
> +    tcg_temp_free_ptr(t_pg);
> +    tcg_temp_free_i32(t_desc);
> +
> +    /*
> +     * Replicate that first octaword.
> +     * The replication happens in units of 32; if the full vector size
> +     * is not a multiple of 32, the final bits are zeroed.
> +     */
> +    doff = vec_full_reg_offset(s, zt);

Similarly in do_ldrq() this variable is named "dofs".

> +    vsz_r32 = QEMU_ALIGN_DOWN(vsz, 32);
> +    if (vsz >= 64) {
> +        tcg_gen_gvec_dup_mem(5, doff + 32, doff, vsz_r32 - 32, vsz - 32);
> +    } else if (vsz > vsz_r32) {
> +        /* Nop move, with side effect of clearing the tail. */
> +        tcg_gen_gvec_mov(MO_64, doff, doff, vsz_r32, vsz);
> +    }
> +}
> +

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


  reply	other threads:[~2021-05-13 17:05 UTC|newest]

Thread overview: 184+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-30 20:24 [PATCH v6 00/82] target/arm: Implement SVE2 Richard Henderson
2021-04-30 20:24 ` [PATCH v6 01/82] target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2 Richard Henderson
2021-05-11  7:55   ` Peter Maydell
2021-05-11 17:20     ` Richard Henderson
2021-04-30 20:24 ` [PATCH v6 02/82] target/arm: Implement SVE2 Integer Multiply - Unpredicated Richard Henderson
2021-05-11  8:00   ` Peter Maydell
2021-04-30 20:24 ` [PATCH v6 03/82] target/arm: Implement SVE2 integer pairwise add and accumulate long Richard Henderson
2021-05-11  8:02   ` Peter Maydell
2021-04-30 20:24 ` [PATCH v6 04/82] target/arm: Implement SVE2 integer unary operations (predicated) Richard Henderson
2021-05-11  8:10   ` Peter Maydell
2021-05-11 17:22     ` Richard Henderson
2021-04-30 20:24 ` [PATCH v6 05/82] target/arm: Split out saturating/rounding shifts from neon Richard Henderson
2021-05-11  8:36   ` Peter Maydell
2021-04-30 20:24 ` [PATCH v6 06/82] target/arm: Implement SVE2 saturating/rounding bitwise shift left (predicated) Richard Henderson
2021-05-11  8:43   ` Peter Maydell
2021-05-11 15:40     ` Richard Henderson
2021-05-11 15:56       ` Peter Maydell
2021-04-30 20:24 ` [PATCH v6 07/82] target/arm: Implement SVE2 integer halving add/subtract (predicated) Richard Henderson
2021-05-11  8:45   ` Peter Maydell
2021-04-30 20:24 ` [PATCH v6 08/82] target/arm: Implement SVE2 integer pairwise arithmetic Richard Henderson
2021-05-11  8:58   ` Peter Maydell
2021-04-30 20:24 ` [PATCH v6 09/82] target/arm: Implement SVE2 saturating add/subtract (predicated) Richard Henderson
2021-05-11  9:07   ` Peter Maydell
2021-04-30 20:24 ` [PATCH v6 10/82] target/arm: Implement SVE2 integer add/subtract long Richard Henderson
2021-05-11  9:11   ` Peter Maydell
2021-04-30 20:24 ` [PATCH v6 11/82] target/arm: Implement SVE2 integer add/subtract interleaved long Richard Henderson
2021-05-11  9:12   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 12/82] target/arm: Implement SVE2 integer add/subtract wide Richard Henderson
2021-05-11  9:14   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 13/82] target/arm: Implement SVE2 integer multiply long Richard Henderson
2021-05-11 12:21   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 14/82] target/arm: Implement PMULLB and PMULLT Richard Henderson
2021-05-11 12:29   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 15/82] target/arm: Implement SVE2 bitwise shift left long Richard Henderson
2021-05-11 12:40   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 16/82] target/arm: Implement SVE2 bitwise exclusive-or interleaved Richard Henderson
2021-05-11 12:43   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 17/82] target/arm: Implement SVE2 bitwise permute Richard Henderson
2021-05-11 12:58   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 18/82] target/arm: Implement SVE2 complex integer add Richard Henderson
2021-05-11 13:02   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 19/82] target/arm: Implement SVE2 integer absolute difference and accumulate long Richard Henderson
2021-05-11 15:27   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 20/82] target/arm: Implement SVE2 integer add/subtract long with carry Richard Henderson
2021-05-11 15:48   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 21/82] target/arm: Implement SVE2 bitwise shift right and accumulate Richard Henderson
2021-05-11 15:57   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 22/82] target/arm: Implement SVE2 bitwise shift and insert Richard Henderson
2021-05-11 15:58   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 23/82] target/arm: Implement SVE2 integer absolute difference and accumulate Richard Henderson
2021-05-11 15:59   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 24/82] target/arm: Implement SVE2 saturating extract narrow Richard Henderson
2021-05-11 16:08   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 25/82] target/arm: Implement SVE2 floating-point pairwise Richard Henderson
2021-05-11 16:09   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 26/82] target/arm: Implement SVE2 SHRN, RSHRN Richard Henderson
2021-05-12  8:52   ` Peter Maydell
2021-05-12 16:07     ` Richard Henderson
2021-04-30 20:25 ` [PATCH v6 27/82] target/arm: Implement SVE2 SQSHRUN, SQRSHRUN Richard Henderson
2021-05-12  8:54   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 28/82] target/arm: Implement SVE2 UQSHRN, UQRSHRN Richard Henderson
2021-05-12  8:56   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 29/82] target/arm: Implement SVE2 SQSHRN, SQRSHRN Richard Henderson
2021-05-12  8:59   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 30/82] target/arm: Implement SVE2 WHILEGT, WHILEGE, WHILEHI, WHILEHS Richard Henderson
2021-05-12  9:07   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 31/82] target/arm: Implement SVE2 WHILERW, WHILEWR Richard Henderson
2021-05-12 14:06   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 32/82] target/arm: Implement SVE2 bitwise ternary operations Richard Henderson
2021-05-12 14:12   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 33/82] target/arm: Implement SVE2 MATCH, NMATCH Richard Henderson
2021-04-30 20:25 ` [PATCH v6 34/82] target/arm: Implement SVE2 saturating multiply-add long Richard Henderson
2021-05-12 14:21   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 35/82] target/arm: Implement SVE2 saturating multiply-add high Richard Henderson
2021-05-12 15:12   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 36/82] target/arm: Implement SVE2 integer multiply-add long Richard Henderson
2021-05-12 15:13   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 37/82] target/arm: Implement SVE2 complex integer multiply-add Richard Henderson
2021-05-12 15:20   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 38/82] target/arm: Implement SVE2 ADDHNB, ADDHNT Richard Henderson
2021-05-12 15:23   ` Peter Maydell
2021-05-12 16:17     ` Richard Henderson
2021-04-30 20:25 ` [PATCH v6 39/82] target/arm: Implement SVE2 RADDHNB, RADDHNT Richard Henderson
2021-05-12 15:24   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 40/82] target/arm: Implement SVE2 SUBHNB, SUBHNT Richard Henderson
2021-05-12 15:24   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 41/82] target/arm: Implement SVE2 RSUBHNB, RSUBHNT Richard Henderson
2021-05-12 15:25   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 42/82] target/arm: Implement SVE2 HISTCNT, HISTSEG Richard Henderson
2021-05-13 10:22   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 43/82] target/arm: Implement SVE2 XAR Richard Henderson
2021-05-13 10:27   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 44/82] target/arm: Implement SVE2 scatter store insns Richard Henderson
2021-05-13 10:31   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 45/82] target/arm: Implement SVE2 gather load insns Richard Henderson
2021-05-13 10:33   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 46/82] target/arm: Implement SVE2 FMMLA Richard Henderson
2021-05-13 10:38   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 47/82] target/arm: Implement SVE2 SPLICE, EXT Richard Henderson
2021-05-13 10:41   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 48/82] target/arm: Pass separate addend to {U, S}DOT helpers Richard Henderson
2021-05-13 10:47   ` Peter Maydell
2021-05-14 16:33     ` Richard Henderson
2021-05-14 16:35       ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 49/82] target/arm: Pass separate addend to FCMLA helpers Richard Henderson
2021-04-30 20:25 ` [PATCH v6 50/82] target/arm: Split out formats for 2 vectors + 1 index Richard Henderson
2021-05-13 10:49   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 51/82] target/arm: Split out formats for 3 " Richard Henderson
2021-05-13 10:53   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 52/82] target/arm: Implement SVE2 integer multiply (indexed) Richard Henderson
2021-05-13 12:31   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 53/82] target/arm: Implement SVE2 integer multiply-add (indexed) Richard Henderson
2021-05-13 12:33   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 54/82] target/arm: Implement SVE2 saturating multiply-add high (indexed) Richard Henderson
2021-05-13 12:35   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 55/82] target/arm: Implement SVE2 saturating multiply-add (indexed) Richard Henderson
2021-05-13 12:42   ` Peter Maydell
2021-05-14 18:17     ` Richard Henderson
2021-04-30 20:25 ` [PATCH v6 56/82] target/arm: Implement SVE2 saturating multiply (indexed) Richard Henderson
2021-05-13 12:45   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 57/82] target/arm: Implement SVE2 signed saturating doubling multiply high Richard Henderson
2021-05-13 12:48   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 58/82] target/arm: Implement SVE2 saturating multiply high (indexed) Richard Henderson
2021-05-13 12:51   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 59/82] target/arm: Implement SVE mixed sign dot product (indexed) Richard Henderson
2021-05-13 12:57   ` Peter Maydell
2021-05-14 18:47     ` Richard Henderson
2021-04-30 20:25 ` [PATCH v6 60/82] target/arm: Implement SVE mixed sign dot product Richard Henderson
2021-05-13 13:01   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 61/82] target/arm: Implement SVE2 crypto unary operations Richard Henderson
2021-05-13 13:02   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 62/82] target/arm: Implement SVE2 crypto destructive binary operations Richard Henderson
2021-05-13 13:04   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 63/82] target/arm: Implement SVE2 crypto constructive " Richard Henderson
2021-05-13 13:52   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 64/82] target/arm: Implement SVE2 TBL, TBX Richard Henderson
2021-05-13 13:59   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 65/82] target/arm: Implement SVE2 FCVTNT Richard Henderson
2021-05-13 14:01   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 66/82] target/arm: Implement SVE2 FCVTLT Richard Henderson
2021-05-13 14:03   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 67/82] target/arm: Implement SVE2 FCVTXNT, FCVTX Richard Henderson
2021-05-13 14:06   ` Peter Maydell
2021-04-30 20:25 ` [PATCH v6 68/82] target/arm: Implement SVE2 FLOGB Richard Henderson
2021-05-13 14:18   ` Peter Maydell
2021-05-15 16:14     ` Richard Henderson
2021-04-30 20:25 ` [PATCH v6 69/82] target/arm: Share table of sve load functions Richard Henderson
2021-05-13 14:25   ` Peter Maydell
2021-05-15 16:25     ` Richard Henderson
2021-04-30 20:25 ` [PATCH v6 70/82] target/arm: Implement SVE2 LD1RO Richard Henderson
2021-05-13 16:41   ` Peter Maydell [this message]
2021-04-30 20:25 ` [PATCH v6 71/82] target/arm: Implement 128-bit ZIP, UZP, TRN Richard Henderson
2021-05-13 16:48   ` Peter Maydell
2021-04-30 20:26 ` [PATCH v6 72/82] target/arm: Implement SVE2 bitwise shift immediate Richard Henderson
2021-05-13 16:57   ` Peter Maydell
2021-05-15 16:53     ` Richard Henderson
2021-04-30 20:26 ` [PATCH v6 73/82] target/arm: Implement SVE2 fp multiply-add long Richard Henderson
2021-05-13 17:04   ` Peter Maydell
2021-05-15 17:09     ` Richard Henderson
2021-04-30 20:26 ` [PATCH v6 74/82] target/arm: Implement aarch64 SUDOT, USDOT Richard Henderson
2021-05-13 17:09   ` Peter Maydell
2021-04-30 20:26 ` [PATCH v6 75/82] target/arm: Split out do_neon_ddda_fpst Richard Henderson
2021-05-13 17:13   ` Peter Maydell
2021-04-30 20:26 ` [PATCH v6 76/82] target/arm: Remove unused fpst from VDOT_scalar Richard Henderson
2021-05-13 17:18   ` Peter Maydell
2021-04-30 20:26 ` [PATCH v6 77/82] target/arm: Fix decode for VDOT (indexed) Richard Henderson
2021-05-13 19:25   ` Peter Maydell
2021-05-15 17:13     ` Richard Henderson
2021-05-16 16:09       ` Peter Maydell
2021-05-17 15:48         ` Richard Henderson
2021-05-15 17:20     ` Richard Henderson
2021-04-30 20:26 ` [PATCH v6 78/82] target/arm: Split decode of VSDOT and VUDOT Richard Henderson
2021-05-13 19:27   ` Peter Maydell
2021-04-30 20:26 ` [PATCH v6 79/82] target/arm: Implement aarch32 VSUDOT, VUSDOT Richard Henderson
2021-05-13 19:32   ` Peter Maydell
2021-04-30 20:26 ` [PATCH v6 80/82] target/arm: Implement integer matrix multiply accumulate Richard Henderson
2021-05-13 19:49   ` Peter Maydell
2021-05-14 16:58     ` Richard Henderson
2021-04-30 20:26 ` [PATCH v6 81/82] linux-user/aarch64: Enable hwcap bits for sve2 and related extensions Richard Henderson
2021-05-13 19:33   ` Peter Maydell
2021-04-30 20:26 ` [PATCH v6 82/82] target/arm: Enable SVE2 " Richard Henderson
2021-05-13 19:35   ` Peter Maydell
2021-05-14 17:21     ` Richard Henderson
2021-05-13 19:49 ` [PATCH v6 00/82] target/arm: Implement SVE2 Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFEAcA8sf3M_2pVZPn2AwLO0vdc8PoOwWTtxeYYHTxpUphhkuA@mail.gmail.com \
    --to=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.