From: eop Chen <eop.chen@sifive.com>
To: Weiwei Li <liweiwei@iscas.ac.cn>, qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Subject: Re: [PATCH qemu v9 05/14] target/riscv: rvv: Add tail agnostic for vector load / store instructions
Date: Wed, 27 Apr 2022 23:07:09 +0800
In-Reply-To: <7b28461b-641e-210f-e156-75e02064a61b@iscas.ac.cn>
Message-Id: <6004571B-E27F-4CA7-B5BC-3AAA6271F6D8@sifive.com>

On 2022/4/27 at 7:55 PM, Weiwei Li <liweiwei@iscas.ac.cn> wrote:


On 2022/3/7 at 3:10 PM, ~eopxd wrote:
From: eopXD <eop.chen@sifive.com>

Destination register of unit-stride mask load and store instructions are
always written with a tail-agnostic policy.

Signed-off-by: eop Chen <eop.chen@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
---
 target/riscv/insn_trans/trans_rvv.c.inc | 11 ++++++++++
 target/riscv/vector_helper.c            | 28 +++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index cc80bf00ff..99691f1b9f 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -711,6 +711,7 @@ static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t eew)
     data = FIELD_DP32(data, VDATA, VM, a->vm);
     data = FIELD_DP32(data, VDATA, LMUL, emul);
     data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
     return ldst_us_trans(a->rd, a->rs1, data, fn, s, false);
 }

@@ -748,6 +749,7 @@ static bool st_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t eew)
     data = FIELD_DP32(data, VDATA, VM, a->vm);
     data = FIELD_DP32(data, VDATA, LMUL, emul);
     data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
     return ldst_us_trans(a->rd, a->rs1, data, fn, s, true);
 }

@@ -774,6 +776,8 @@ static bool ld_us_mask_op(DisasContext *s, arg_vlm_v *a, uint8_t eew)
     /* EMUL = 1, NFIELDS = 1 */
     data = FIELD_DP32(data, VDATA, LMUL, 0);
     data = FIELD_DP32(data, VDATA, NF, 1);
+    /* Mask destination register are always tail-agnostic */
+    data = FIELD_DP32(data, VDATA, VTA, s->cfg_vta_all_1s);
     return ldst_us_trans(a->rd, a->rs1, data, fn, s, false);
 }

@@ -791,6 +795,8 @@ static bool st_us_mask_op(DisasContext *s, arg_vsm_v *a, uint8_t eew)
     /* EMUL = 1, NFIELDS = 1 */
     data = FIELD_DP32(data, VDATA, LMUL, 0);
     data = FIELD_DP32(data, VDATA, NF, 1);
+    /* Mask destination register are always tail-agnostic */
+    data = FIELD_DP32(data, VDATA, VTA, s->cfg_vta_all_1s);
     return ldst_us_trans(a->rd, a->rs1, data, fn, s, true);
 }

@@ -862,6 +868,7 @@ static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t eew)
     data = FIELD_DP32(data, VDATA, VM, a->vm);
     data = FIELD_DP32(data, VDATA, LMUL, emul);
     data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
     return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s, false);
 }

@@ -891,6 +898,7 @@ static bool st_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t eew)
     data = FIELD_DP32(data, VDATA, VM, a->vm);
     data = FIELD_DP32(data, VDATA, LMUL, emul);
     data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
     fn = fns[eew];
     if (fn == NULL) {
         return false;
@@ -991,6 +999,7 @@ static bool ld_index_op(DisasContext *s, arg_rnfvm *a, uint8_t eew)
     data = FIELD_DP32(data, VDATA, VM, a->vm);
     data = FIELD_DP32(data, VDATA, LMUL, emul);
     data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
     return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s, false);
 }

@@ -1043,6 +1052,7 @@ static bool st_index_op(DisasContext *s, arg_rnfvm *a, uint8_t eew)
     data = FIELD_DP32(data, VDATA, VM, a->vm);
     data = FIELD_DP32(data, VDATA, LMUL, emul);
     data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
     return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s, true);
 }

@@ -1108,6 +1118,7 @@ static bool ldff_op(DisasContext *s, arg_r2nfvm *a, uint8_t eew)
     data = FIELD_DP32(data, VDATA, VM, a->vm);
     data = FIELD_DP32(data, VDATA, LMUL, emul);
     data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, VTA, s->vta);
     return ldff_trans(a->rd, a->rs1, data, fn, s);
 }

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 396e252179..1541d97b08 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -270,6 +270,8 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
     uint32_t i, k;
     uint32_t nf = vext_nf(desc);
     uint32_t max_elems = vext_max_elems(desc, log2_esz);
+    uint32_t esz = 1 << log2_esz;
+    uint32_t vta = vext_vta(desc);

     for (i = env->vstart; i < env->vl; i++, env->vstart++) {
         if (!vm && !vext_elem_mask(v0, i)) {
@@ -284,6 +286,11 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
         }
     }
     env->vstart = 0;
+    /* set tail elements to 1s */
+    for (k = 0; k < nf; ++k) {
+        vext_set_elems_1s(vd, vta, env->vl * esz + k * max_elems,
+                          max_elems * esz + k * max_elems);
+    }
 }

This seems incorrect. I think the offset term should be k * max_elems * esz. The same applies to the similar cases that follow.
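
To make the offset arithmetic concrete, here is a minimal, self-contained C sketch of the scaling suggested above. It is only an illustration, not the QEMU code: set_elems_1s, tail_start, tail_end and fill_tails are hypothetical stand-ins, and the sketch assumes the helper takes byte offsets into the destination register group and fills the half-open byte range [cnt, tot) with 1s when vta is set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the tail-fill helper (an assumption, not the
 * QEMU function): when vta is non-zero, set bytes [cnt, tot) to all 1s.
 */
void set_elems_1s(uint8_t *base, uint32_t vta, uint32_t cnt, uint32_t tot)
{
    assert(cnt <= tot);
    if (vta == 0 || cnt == tot) {
        return;
    }
    memset(base + cnt, 0xff, tot - cnt);
}

/* Byte offset of the first tail element of field k. */
uint32_t tail_start(uint32_t vl, uint32_t k, uint32_t max_elems, uint32_t esz)
{
    /* Both the element index and the per-field stride are scaled by esz. */
    return (k * max_elems + vl) * esz;
}

/* Byte offset one past the last element of field k. */
uint32_t tail_end(uint32_t k, uint32_t max_elems, uint32_t esz)
{
    return (k * max_elems + max_elems) * esz;
}

/* Fill the tail of each of the nf fields using the scaled offsets. */
void fill_tails(uint8_t *vd, uint32_t vta, uint32_t vl, uint32_t nf,
                uint32_t max_elems, uint32_t esz)
{
    for (uint32_t k = 0; k < nf; ++k) {
        set_elems_1s(vd, vta,
                     tail_start(vl, k, max_elems, esz),
                     tail_end(k, max_elems, esz));
    }
}
```

As an example with illustrative values nf = 2, max_elems = 4, esz = 4 and vl = 3, calling fill_tails on a 32-byte buffer covers bytes 12..15 and 28..31. With the unscaled k * max_elems term, the range for k = 1 would instead be bytes 16..19, which is the first active element of the second field.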

Otherwise, this patchset looks good to me.

Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>

Regards,
Weiwei Li


I have just sent a new version to correct this.
Thank you for your patient review.

Regards,

eop Chen
