From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F713C433EF for ; Tue, 9 Nov 2021 09:07:19 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D775560724 for ; Tue, 9 Nov 2021 09:07:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D775560724 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:48360 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkN65-0003Fe-JM for qemu-devel@archiver.kernel.org; Tue, 09 Nov 2021 04:07:17 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36258) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkN50-0002OK-5h; Tue, 09 Nov 2021 04:06:10 -0500 Received: from out28-74.mail.aliyun.com ([115.124.28.74]:52869) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkN4v-0002Lk-4q; Tue, 09 Nov 2021 04:06:09 -0500 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07608247|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_enroll_verification|0.0119141-5.85618e-05-0.988027; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047206; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.LpzfBz0_1636448757; Received: from 10.0.2.15(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.LpzfBz0_1636448757) by smtp.aliyun-inc.com(10.147.41.178); Tue, 09 Nov 2021 17:05:58 +0800 Subject: Re: [PATCH 09/13] target/riscv: Adjust vector address with ol From: LIU Zhiwei To: Richard Henderson , qemu-devel@nongnu.org, qemu-riscv@nongnu.org, Alexey Baturo References: <20211101100143.44356-1-zhiwei_liu@c-sky.com> <20211101100143.44356-10-zhiwei_liu@c-sky.com> <851481b9-e973-b3e1-1722-73db47edb772@linaro.org> <52357320-6098-c3da-b9de-89b131b85ffb@linaro.org> <1bd3a3ff-8ca8-ed78-1bc7-c4668965f448@linaro.org> <08078788-dd46-7e7a-14eb-7eb95c2c9c89@c-sky.com> Message-ID: <47b0a429-52ac-4d5e-ce0b-cea67b44c550@c-sky.com> Date: Tue, 9 Nov 2021 17:05:57 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: <08078788-dd46-7e7a-14eb-7eb95c2c9c89@c-sky.com> Content-Type: multipart/alternative; boundary="------------E63463B41C14ADCCF83A9203" Content-Language: en-US Received-SPF: none client-ip=115.124.28.74; envelope-from=zhiwei_liu@c-sky.com; helo=out28-74.mail.aliyun.com X-Spam_score_int: -52 X-Spam_score: -5.3 X-Spam_bar: ----- X-Spam_report: (-5.3 / 5.0 requ) BAYES_00=-1.9, HTML_MESSAGE=0.001, NICE_REPLY_A=-3.364, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Alistair.Francis@wdc.com, bin.meng@windriver.com, palmer@dabbelt.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This is a multi-part message in MIME format. --------------E63463B41C14ADCCF83A9203 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit On 2021/11/9 下午4:39, LIU Zhiwei wrote: > > > On 2021/11/9 下午4:18, Richard Henderson wrote: >> On 11/9/21 9:04 AM, LIU Zhiwei wrote: >>> On 2021/11/9 下午2:37, Richard Henderson wrote: >>> >>>> On 11/8/21 10:28 AM, LIU Zhiwei wrote: >>>>> On 2021/11/1 下午7:35, Richard Henderson wrote: >>>>> >>>>>> On 11/1/21 6:01 AM, LIU Zhiwei wrote: >>>>>>> Signed-off-by: LIU Zhiwei >>>>>>> --- >>>>>>>   target/riscv/insn_trans/trans_rvv.c.inc |  8 ++++ >>>>>>>   target/riscv/internals.h                |  1 + >>>>>>>   target/riscv/vector_helper.c            | 54 >>>>>>> +++++++++++++++++-------- >>>>>>>   3 files changed, 46 insertions(+), 17 deletions(-) >>>>>>> >>>>>>> diff --git a/target/riscv/insn_trans/trans_rvv.c.inc >>>>>>> b/target/riscv/insn_trans/trans_rvv.c.inc >>>>>>> index ed042f7bb9..5cd9b802df 100644 >>>>>>> --- a/target/riscv/insn_trans/trans_rvv.c.inc >>>>>>> +++ b/target/riscv/insn_trans/trans_rvv.c.inc >>>>>>> @@ -233,6 +233,7 @@ static bool ld_us_op(DisasContext *s, >>>>>>> arg_r2nfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_us_trans(a->rd, a->rs1, data, fn, s); >>>>>>>   } >>>>>>>   @@ -286,6 +287,7 @@ static bool st_us_op(DisasContext *s, >>>>>>> arg_r2nfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_us_trans(a->rd, a->rs1, data, fn, s); >>>>>>>   } >>>>>>>   @@ -365,6 +367,7 @@ static bool ld_stride_op(DisasContext *s, >>>>>>> arg_rnfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s); >>>>>>>   } >>>>>>>   @@ -404,6 +407,7 @@ static bool st_stride_op(DisasContext *s, >>>>>>> arg_rnfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       fn =  fns[seq][s->sew]; >>>>>>>       if (fn == NULL) { >>>>>>>           return false; >>>>>>> @@ -490,6 +494,7 @@ static bool ld_index_op(DisasContext *s, >>>>>>> arg_rnfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s); >>>>>>>   } >>>>>>>   @@ -542,6 +547,7 @@ static bool st_index_op(DisasContext *s, >>>>>>> arg_rnfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s); >>>>>>>   } >>>>>>>   @@ -617,6 +623,7 @@ static bool ldff_op(DisasContext *s, >>>>>>> arg_r2nfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldff_trans(a->rd, a->rs1, data, fn, s); >>>>>>>   } >>>>>>>   @@ -724,6 +731,7 @@ static bool amo_op(DisasContext *s, >>>>>>> arg_rwdvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, WD, a->wd); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return amo_trans(a->rd, a->rs1, a->rs2, data, fn, s); >>>>>>>   } >>>>>>>   /* >>>>>>> diff --git a/target/riscv/internals.h b/target/riscv/internals.h >>>>>>> index b15ad394bb..f74b8291e4 100644 >>>>>>> --- a/target/riscv/internals.h >>>>>>> +++ b/target/riscv/internals.h >>>>>>> @@ -27,6 +27,7 @@ FIELD(VDATA, VM, 8, 1) >>>>>>>   FIELD(VDATA, LMUL, 9, 2) >>>>>>>   FIELD(VDATA, NF, 11, 4) >>>>>>>   FIELD(VDATA, WD, 11, 1) >>>>>>> +FIELD(VDATA, OL, 15, 2) >>>>>>>     /* float point classify helpers */ >>>>>>>   target_ulong fclass_h(uint64_t frs1); >>>>>>> diff --git a/target/riscv/vector_helper.c >>>>>>> b/target/riscv/vector_helper.c >>>>>>> index 535420ee66..451688c328 100644 >>>>>>> --- a/target/riscv/vector_helper.c >>>>>>> +++ b/target/riscv/vector_helper.c >>>>>>> @@ -112,6 +112,11 @@ static uint32_t vext_wd(uint32_t desc) >>>>>>>       return (simd_data(desc) >> 11) & 0x1; >>>>>>>   } >>>>>>>   +static inline uint32_t vext_ol(uint32_t desc) >>>>>>> +{ >>>>>>> +    return FIELD_EX32(simd_data(desc), VDATA, OL); >>>>>>> +} >>>>>> >>>>>> XLEN not OLEN. >>>>> OK. >>>>>> >>>>>>> @@ -123,6 +128,14 @@ static inline uint32_t vext_maxsz(uint32_t >>>>>>> desc) >>>>>>>       return simd_maxsz(desc) << vext_lmul(desc); >>>>>>>   } >>>>>>>   +static inline target_ulong adjust_addr(target_ulong addr, >>>>>>> uint32_t olen) >>>>>>> +{ >>>>>>> +    if (olen < TARGET_LONG_BITS) { >>>>>>> +        addr &= UINT32_MAX; >>>>>>> +    } >>>>>>> +    return addr; >>>>>>> +} >>>>>> >>>>>> Here's where I'm unsure.  This looks a lot like the changes that >>>>>> are required to support pointer-masking in vectors, which Alexey >>>>>> said he was going to look at. >>>>>> >>>>>> (1) Do we need to pass anything in VEXT at all? >>>>>>     We do have CPURISCVState, so we could just use cpu_get_ml, >>>>> Yes, we should use cpu_get_xl. >>>>>> which we would also need for env->mmte etc for pointer masking. >>>>> >>>>> Do you mean env->mpmmask and env->mpmbase? I think yes, we should >>>>> also adjust these register behaviors with xlen. >>>> >>>> I mean the set of [msu]pmmask and [msu]pmbase, selected as >>>> appropriate for the current execution mode. >>>> >>>>>> (3) Do we try to streamline the computation by passing down >>>>>> composite >>>>>>     mask and base parameters.  This way we don't need to do complex >>>>>>     examination of ENV to determine execution mode, and instead >>>>>> always >>>>>>     compute >>>>>> >>>>>>        addr = (addr & mask) | base; >>>>>> >>>>>>     where mask = -1, base = 0 for "normal" addressing, and when >>>>>>     UXLEN == 32, mask <= UINT32_MAX. >>>>> >>>>> Do you mean add env->pmmask and env->pmbase? >>>>> >>>>> I can initialize them in riscv_tr_init_disas_context, such as by >>>>> env->xpmmask & UINT32_MAX . >>>>> >>>>>> >>>>>> (4) Do we in fact want to pre-compute these into known slots on ENV, >>>>>>     so that we don't have to pass these around as separate >>>>>> parameters? >>>>>>     We would adjust these values during PM CSR changes and when >>>>>>     changing privilege levels. >>>> For option (3), I was suggesting a mask + base pair passed down >>>> from TCG-generated code. >>>> >>>> For option (4), I was suggesting embedding a mask + base pair in >>>> env, which would be re-computed at every privilege level change, >>>> plus reset and vmload. >>>> >>>> In both cases, the mask would be a combination of [msu]pmmask & >>>> (RV32 ? UINT32_MAX : UINT64_MAX), as you say. >>> >>> We will calculate [msu]pmmask by  csrrw , and we have ignored high >>> bits there. >>> >>> Can we just use the [msu]pmmmask? >> >> We could.  However: >> >> In order to select [msu]pmmask, we have to look up the current cpu >> state.  In order to mask the high bits, we have to look up the >> current xl, which requires that we look up the current cpu state then >> extract the xl from misa  and mstatus. >> >> All of which means that we're doing repeated lookups for every memory >> access.  I am suggesting that we either (3) compile those lookups >> into the generated code or (4) cache those lookups when state changes >> (csr writes and priv changes). > > > Do you mean we should add this code to riscv_tr_init_disas_context > >     if (ctx->pm_enabled) { >          switch (priv) { >          case PRV_M: >              env->mask = env->mpmmask; >              env->base = env->mpmbase; >              break; >          case PRV_S: >              env->mask = env->spmmask; >              env->base = env->spmbase; >              break; >          case PRV_U: >              env->mask = env->upmmask; >              env->base = env->upmbase; >              break; >          default: >              g_assert_not_reached(); >          } >          ctx->pm_mask = pm_mask[priv]; >          ctx->pm_base = pm_base[priv]; > ctx->need_mask = true; /* new flag for mask */ >      } else if (get_xlen(ctx)  < TARGET_LONG_BITS) { >          env->mask = UINT32_MAX; >          env->base = 0; >          ctx->pm_mask = tcg_constant_tl(UINT32_MAX); >          ctx->pm_base = tcg_constant_tl(0); >        ctx->need_mask = true; >      } else { > env->mask = UINT64_MAX; >          env->base = 0; > } I think the code is wrong, perhaps we should modify the write_mpmmask env->mask = env->mpmmask = value; Zhiwei > Thanks, > Zhiwei >> >> >> r~ --------------E63463B41C14ADCCF83A9203 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 8bit


On 2021/11/9 下午4:39, LIU Zhiwei wrote:


On 2021/11/9 下午4:18, Richard Henderson wrote:
On 11/9/21 9:04 AM, LIU Zhiwei wrote:
On 2021/11/9 下午2:37, Richard Henderson wrote:

On 11/8/21 10:28 AM, LIU Zhiwei wrote:
On 2021/11/1 下午7:35, Richard Henderson wrote:

On 11/1/21 6:01 AM, LIU Zhiwei wrote:
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
  target/riscv/insn_trans/trans_rvv.c.inc |  8 ++++
  target/riscv/internals.h                |  1 +
  target/riscv/vector_helper.c            | 54 +++++++++++++++++--------
  3 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index ed042f7bb9..5cd9b802df 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -233,6 +233,7 @@ static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_us_trans(a->rd, a->rs1, data, fn, s);
  }
  @@ -286,6 +287,7 @@ static bool st_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_us_trans(a->rd, a->rs1, data, fn, s);
  }
  @@ -365,6 +367,7 @@ static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s);
  }
  @@ -404,6 +407,7 @@ static bool st_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      fn =  fns[seq][s->sew];
      if (fn == NULL) {
          return false;
@@ -490,6 +494,7 @@ static bool ld_index_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s);
  }
  @@ -542,6 +547,7 @@ static bool st_index_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s);
  }
  @@ -617,6 +623,7 @@ static bool ldff_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldff_trans(a->rd, a->rs1, data, fn, s);
  }
  @@ -724,6 +731,7 @@ static bool amo_op(DisasContext *s, arg_rwdvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, WD, a->wd);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return amo_trans(a->rd, a->rs1, a->rs2, data, fn, s);
  }
  /*
diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index b15ad394bb..f74b8291e4 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -27,6 +27,7 @@ FIELD(VDATA, VM, 8, 1)
  FIELD(VDATA, LMUL, 9, 2)
  FIELD(VDATA, NF, 11, 4)
  FIELD(VDATA, WD, 11, 1)
+FIELD(VDATA, OL, 15, 2)
    /* float point classify helpers */
  target_ulong fclass_h(uint64_t frs1);
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 535420ee66..451688c328 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -112,6 +112,11 @@ static uint32_t vext_wd(uint32_t desc)
      return (simd_data(desc) >> 11) & 0x1;
  }
  +static inline uint32_t vext_ol(uint32_t desc)
+{
+    return FIELD_EX32(simd_data(desc), VDATA, OL);
+}

XLEN not OLEN.
OK.

@@ -123,6 +128,14 @@ static inline uint32_t vext_maxsz(uint32_t desc)
      return simd_maxsz(desc) << vext_lmul(desc);
  }
  +static inline target_ulong adjust_addr(target_ulong addr, uint32_t olen)
+{
+    if (olen < TARGET_LONG_BITS) {
+        addr &= UINT32_MAX;
+    }
+    return addr;
+}

Here's where I'm unsure.  This looks a lot like the changes that are required to support pointer-masking in vectors, which Alexey said he was going to look at.

(1) Do we need to pass anything in VEXT at all?
    We do have CPURISCVState, so we could just use cpu_get_ml,
Yes, we should use cpu_get_xl.
which we would also need for env->mmte etc for pointer masking.

Do you mean env->mpmmask and env->mpmbase? I think yes, we should also adjust these register behaviors with xlen.

I mean the set of [msu]pmmask and [msu]pmbase, selected as appropriate for the current execution mode.

(3) Do we try to streamline the computation by passing down composite
    mask and base parameters.  This way we don't need to do complex
    examination of ENV to determine execution mode, and instead always
    compute

       addr = (addr & mask) | base;

    where mask = -1, base = 0 for "normal" addressing, and when
    UXLEN == 32, mask <= UINT32_MAX.

Do you mean add env->pmmask and env->pmbase?

I can initialize them in riscv_tr_init_disas_context, such as by env->xpmmask & UINT32_MAX .


(4) Do we in fact want to pre-compute these into known slots on ENV,
    so that we don't have to pass these around as separate parameters?
    We would adjust these values during PM CSR changes and when
    changing privilege levels.
For option (3), I was suggesting a mask + base pair passed down from TCG-generated code.

For option (4), I was suggesting embedding a mask + base pair in env, which would be re-computed at every privilege level change, plus reset and vmload.

In both cases, the mask would be a combination of [msu]pmmask & (RV32 ? UINT32_MAX : UINT64_MAX), as you say.

We will calculate [msu]pmmask by  csrrw , and we have ignored high bits there.

Can we just use the [msu]pmmmask?

We could.  However:

In order to select [msu]pmmask, we have to look up the current cpu state.  In order to mask the high bits, we have to look up the current xl, which requires that we look up the current cpu state then extract the xl from misa  and mstatus.

All of which means that we're doing repeated lookups for every memory access.  I am suggesting that we either (3) compile those lookups into the generated code or (4) cache those lookups when state changes (csr writes and priv changes).


Do you mean we should add this code to riscv_tr_init_disas_context

    if (ctx->pm_enabled) {
         switch (priv) {
         case PRV_M:
             env->mask = env->mpmmask;
             env->base = env->mpmbase;
             break;
         case PRV_S:
             env->mask = env->spmmask;
             env->base = env->spmbase;
             break;
         case PRV_U:
             env->mask = env->upmmask;
             env->base = env->upmbase;
             break;
         default:
             g_assert_not_reached();
         }
         ctx->pm_mask = pm_mask[priv];
         ctx->pm_base = pm_base[priv];
         ctx->need_mask = true; /* new flag for mask */
     } else if (get_xlen(ctx)  < TARGET_LONG_BITS) {
         env->mask = UINT32_MAX;
         env->base = 0;
         ctx->pm_mask = tcg_constant_tl(UINT32_MAX);
         ctx->pm_base = tcg_constant_tl(0);
        ctx->need_mask = true;
     } else {
	 env->mask = UINT64_MAX;
         env->base = 0;
     }

I think the code is wrong, perhaps we should modify the write_mpmmask
env->mask = env->mpmmask = value;

Zhiwei

Thanks,
Zhiwei


r~
--------------E63463B41C14ADCCF83A9203-- From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1mkN51-0002Q7-Kq for mharc-qemu-riscv@gnu.org; Tue, 09 Nov 2021 04:06:11 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36258) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkN50-0002OK-5h; Tue, 09 Nov 2021 04:06:10 -0500 Received: from out28-74.mail.aliyun.com ([115.124.28.74]:52869) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkN4v-0002Lk-4q; Tue, 09 Nov 2021 04:06:09 -0500 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07608247|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_enroll_verification|0.0119141-5.85618e-05-0.988027; FP=0|0|0|0|0|-1|-1|-1; HT=ay29a033018047206; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=7; RT=7; SR=0; TI=SMTPD_---.LpzfBz0_1636448757; Received: from 10.0.2.15(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.LpzfBz0_1636448757) by smtp.aliyun-inc.com(10.147.41.178); Tue, 09 Nov 2021 17:05:58 +0800 Subject: Re: [PATCH 09/13] target/riscv: Adjust vector address with ol From: LIU Zhiwei To: Richard Henderson , qemu-devel@nongnu.org, qemu-riscv@nongnu.org, Alexey Baturo Cc: palmer@dabbelt.com, bin.meng@windriver.com, Alistair.Francis@wdc.com References: <20211101100143.44356-1-zhiwei_liu@c-sky.com> <20211101100143.44356-10-zhiwei_liu@c-sky.com> <851481b9-e973-b3e1-1722-73db47edb772@linaro.org> <52357320-6098-c3da-b9de-89b131b85ffb@linaro.org> <1bd3a3ff-8ca8-ed78-1bc7-c4668965f448@linaro.org> <08078788-dd46-7e7a-14eb-7eb95c2c9c89@c-sky.com> Message-ID: <47b0a429-52ac-4d5e-ce0b-cea67b44c550@c-sky.com> Date: Tue, 9 Nov 2021 17:05:57 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: <08078788-dd46-7e7a-14eb-7eb95c2c9c89@c-sky.com> Content-Type: multipart/alternative; boundary="------------E63463B41C14ADCCF83A9203" Content-Language: en-US Received-SPF: none client-ip=115.124.28.74; envelope-from=zhiwei_liu@c-sky.com; helo=out28-74.mail.aliyun.com X-Spam_score_int: -52 X-Spam_score: -5.3 X-Spam_bar: ----- X-Spam_report: (-5.3 / 5.0 requ) BAYES_00=-1.9, HTML_MESSAGE=0.001, NICE_REPLY_A=-3.364, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-riscv@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Nov 2021 09:06:10 -0000 This is a multi-part message in MIME format. --------------E63463B41C14ADCCF83A9203 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit On 2021/11/9 下午4:39, LIU Zhiwei wrote: > > > On 2021/11/9 下午4:18, Richard Henderson wrote: >> On 11/9/21 9:04 AM, LIU Zhiwei wrote: >>> On 2021/11/9 下午2:37, Richard Henderson wrote: >>> >>>> On 11/8/21 10:28 AM, LIU Zhiwei wrote: >>>>> On 2021/11/1 下午7:35, Richard Henderson wrote: >>>>> >>>>>> On 11/1/21 6:01 AM, LIU Zhiwei wrote: >>>>>>> Signed-off-by: LIU Zhiwei >>>>>>> --- >>>>>>>   target/riscv/insn_trans/trans_rvv.c.inc |  8 ++++ >>>>>>>   target/riscv/internals.h                |  1 + >>>>>>>   target/riscv/vector_helper.c            | 54 >>>>>>> +++++++++++++++++-------- >>>>>>>   3 files changed, 46 insertions(+), 17 deletions(-) >>>>>>> >>>>>>> diff --git a/target/riscv/insn_trans/trans_rvv.c.inc >>>>>>> b/target/riscv/insn_trans/trans_rvv.c.inc >>>>>>> index ed042f7bb9..5cd9b802df 100644 >>>>>>> --- a/target/riscv/insn_trans/trans_rvv.c.inc >>>>>>> +++ b/target/riscv/insn_trans/trans_rvv.c.inc >>>>>>> @@ -233,6 +233,7 @@ static bool ld_us_op(DisasContext *s, >>>>>>> arg_r2nfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_us_trans(a->rd, a->rs1, data, fn, s); >>>>>>>   } >>>>>>>   @@ -286,6 +287,7 @@ static bool st_us_op(DisasContext *s, >>>>>>> arg_r2nfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_us_trans(a->rd, a->rs1, data, fn, s); >>>>>>>   } >>>>>>>   @@ -365,6 +367,7 @@ static bool ld_stride_op(DisasContext *s, >>>>>>> arg_rnfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s); >>>>>>>   } >>>>>>>   @@ -404,6 +407,7 @@ static bool st_stride_op(DisasContext *s, >>>>>>> arg_rnfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       fn =  fns[seq][s->sew]; >>>>>>>       if (fn == NULL) { >>>>>>>           return false; >>>>>>> @@ -490,6 +494,7 @@ static bool ld_index_op(DisasContext *s, >>>>>>> arg_rnfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s); >>>>>>>   } >>>>>>>   @@ -542,6 +547,7 @@ static bool st_index_op(DisasContext *s, >>>>>>> arg_rnfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s); >>>>>>>   } >>>>>>>   @@ -617,6 +623,7 @@ static bool ldff_op(DisasContext *s, >>>>>>> arg_r2nfvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, NF, a->nf); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return ldff_trans(a->rd, a->rs1, data, fn, s); >>>>>>>   } >>>>>>>   @@ -724,6 +731,7 @@ static bool amo_op(DisasContext *s, >>>>>>> arg_rwdvm *a, uint8_t seq) >>>>>>>       data = FIELD_DP32(data, VDATA, VM, a->vm); >>>>>>>       data = FIELD_DP32(data, VDATA, LMUL, s->lmul); >>>>>>>       data = FIELD_DP32(data, VDATA, WD, a->wd); >>>>>>> +    data = FIELD_DP32(data, VDATA, OL, s->ol); >>>>>>>       return amo_trans(a->rd, a->rs1, a->rs2, data, fn, s); >>>>>>>   } >>>>>>>   /* >>>>>>> diff --git a/target/riscv/internals.h b/target/riscv/internals.h >>>>>>> index b15ad394bb..f74b8291e4 100644 >>>>>>> --- a/target/riscv/internals.h >>>>>>> +++ b/target/riscv/internals.h >>>>>>> @@ -27,6 +27,7 @@ FIELD(VDATA, VM, 8, 1) >>>>>>>   FIELD(VDATA, LMUL, 9, 2) >>>>>>>   FIELD(VDATA, NF, 11, 4) >>>>>>>   FIELD(VDATA, WD, 11, 1) >>>>>>> +FIELD(VDATA, OL, 15, 2) >>>>>>>     /* float point classify helpers */ >>>>>>>   target_ulong fclass_h(uint64_t frs1); >>>>>>> diff --git a/target/riscv/vector_helper.c >>>>>>> b/target/riscv/vector_helper.c >>>>>>> index 535420ee66..451688c328 100644 >>>>>>> --- a/target/riscv/vector_helper.c >>>>>>> +++ b/target/riscv/vector_helper.c >>>>>>> @@ -112,6 +112,11 @@ static uint32_t vext_wd(uint32_t desc) >>>>>>>       return (simd_data(desc) >> 11) & 0x1; >>>>>>>   } >>>>>>>   +static inline uint32_t vext_ol(uint32_t desc) >>>>>>> +{ >>>>>>> +    return FIELD_EX32(simd_data(desc), VDATA, OL); >>>>>>> +} >>>>>> >>>>>> XLEN not OLEN. >>>>> OK. >>>>>> >>>>>>> @@ -123,6 +128,14 @@ static inline uint32_t vext_maxsz(uint32_t >>>>>>> desc) >>>>>>>       return simd_maxsz(desc) << vext_lmul(desc); >>>>>>>   } >>>>>>>   +static inline target_ulong adjust_addr(target_ulong addr, >>>>>>> uint32_t olen) >>>>>>> +{ >>>>>>> +    if (olen < TARGET_LONG_BITS) { >>>>>>> +        addr &= UINT32_MAX; >>>>>>> +    } >>>>>>> +    return addr; >>>>>>> +} >>>>>> >>>>>> Here's where I'm unsure.  This looks a lot like the changes that >>>>>> are required to support pointer-masking in vectors, which Alexey >>>>>> said he was going to look at. >>>>>> >>>>>> (1) Do we need to pass anything in VEXT at all? >>>>>>     We do have CPURISCVState, so we could just use cpu_get_ml, >>>>> Yes, we should use cpu_get_xl. >>>>>> which we would also need for env->mmte etc for pointer masking. >>>>> >>>>> Do you mean env->mpmmask and env->mpmbase? I think yes, we should >>>>> also adjust these register behaviors with xlen. >>>> >>>> I mean the set of [msu]pmmask and [msu]pmbase, selected as >>>> appropriate for the current execution mode. >>>> >>>>>> (3) Do we try to streamline the computation by passing down >>>>>> composite >>>>>>     mask and base parameters.  This way we don't need to do complex >>>>>>     examination of ENV to determine execution mode, and instead >>>>>> always >>>>>>     compute >>>>>> >>>>>>        addr = (addr & mask) | base; >>>>>> >>>>>>     where mask = -1, base = 0 for "normal" addressing, and when >>>>>>     UXLEN == 32, mask <= UINT32_MAX. >>>>> >>>>> Do you mean add env->pmmask and env->pmbase? >>>>> >>>>> I can initialize them in riscv_tr_init_disas_context, such as by >>>>> env->xpmmask & UINT32_MAX . >>>>> >>>>>> >>>>>> (4) Do we in fact want to pre-compute these into known slots on ENV, >>>>>>     so that we don't have to pass these around as separate >>>>>> parameters? >>>>>>     We would adjust these values during PM CSR changes and when >>>>>>     changing privilege levels. >>>> For option (3), I was suggesting a mask + base pair passed down >>>> from TCG-generated code. >>>> >>>> For option (4), I was suggesting embedding a mask + base pair in >>>> env, which would be re-computed at every privilege level change, >>>> plus reset and vmload. >>>> >>>> In both cases, the mask would be a combination of [msu]pmmask & >>>> (RV32 ? UINT32_MAX : UINT64_MAX), as you say. >>> >>> We will calculate [msu]pmmask by  csrrw , and we have ignored high >>> bits there. >>> >>> Can we just use the [msu]pmmmask? >> >> We could.  However: >> >> In order to select [msu]pmmask, we have to look up the current cpu >> state.  In order to mask the high bits, we have to look up the >> current xl, which requires that we look up the current cpu state then >> extract the xl from misa  and mstatus. >> >> All of which means that we're doing repeated lookups for every memory >> access.  I am suggesting that we either (3) compile those lookups >> into the generated code or (4) cache those lookups when state changes >> (csr writes and priv changes). > > > Do you mean we should add this code to riscv_tr_init_disas_context > >     if (ctx->pm_enabled) { >          switch (priv) { >          case PRV_M: >              env->mask = env->mpmmask; >              env->base = env->mpmbase; >              break; >          case PRV_S: >              env->mask = env->spmmask; >              env->base = env->spmbase; >              break; >          case PRV_U: >              env->mask = env->upmmask; >              env->base = env->upmbase; >              break; >          default: >              g_assert_not_reached(); >          } >          ctx->pm_mask = pm_mask[priv]; >          ctx->pm_base = pm_base[priv]; > ctx->need_mask = true; /* new flag for mask */ >      } else if (get_xlen(ctx)  < TARGET_LONG_BITS) { >          env->mask = UINT32_MAX; >          env->base = 0; >          ctx->pm_mask = tcg_constant_tl(UINT32_MAX); >          ctx->pm_base = tcg_constant_tl(0); >        ctx->need_mask = true; >      } else { > env->mask = UINT64_MAX; >          env->base = 0; > } I think the code is wrong, perhaps we should modify the write_mpmmask env->mask = env->mpmmask = value; Zhiwei > Thanks, > Zhiwei >> >> >> r~ --------------E63463B41C14ADCCF83A9203 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 8bit


On 2021/11/9 下午4:39, LIU Zhiwei wrote:


On 2021/11/9 下午4:18, Richard Henderson wrote:
On 11/9/21 9:04 AM, LIU Zhiwei wrote:
On 2021/11/9 下午2:37, Richard Henderson wrote:

On 11/8/21 10:28 AM, LIU Zhiwei wrote:
On 2021/11/1 下午7:35, Richard Henderson wrote:

On 11/1/21 6:01 AM, LIU Zhiwei wrote:
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
  target/riscv/insn_trans/trans_rvv.c.inc |  8 ++++
  target/riscv/internals.h                |  1 +
  target/riscv/vector_helper.c            | 54 +++++++++++++++++--------
  3 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index ed042f7bb9..5cd9b802df 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -233,6 +233,7 @@ static bool ld_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_us_trans(a->rd, a->rs1, data, fn, s);
  }
  @@ -286,6 +287,7 @@ static bool st_us_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_us_trans(a->rd, a->rs1, data, fn, s);
  }
  @@ -365,6 +367,7 @@ static bool ld_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s);
  }
  @@ -404,6 +407,7 @@ static bool st_stride_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      fn =  fns[seq][s->sew];
      if (fn == NULL) {
          return false;
@@ -490,6 +494,7 @@ static bool ld_index_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s);
  }
  @@ -542,6 +547,7 @@ static bool st_index_op(DisasContext *s, arg_rnfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s);
  }
  @@ -617,6 +623,7 @@ static bool ldff_op(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, NF, a->nf);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return ldff_trans(a->rd, a->rs1, data, fn, s);
  }
  @@ -724,6 +731,7 @@ static bool amo_op(DisasContext *s, arg_rwdvm *a, uint8_t seq)
      data = FIELD_DP32(data, VDATA, VM, a->vm);
      data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
      data = FIELD_DP32(data, VDATA, WD, a->wd);
+    data = FIELD_DP32(data, VDATA, OL, s->ol);
      return amo_trans(a->rd, a->rs1, a->rs2, data, fn, s);
  }
  /*
diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index b15ad394bb..f74b8291e4 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -27,6 +27,7 @@ FIELD(VDATA, VM, 8, 1)
  FIELD(VDATA, LMUL, 9, 2)
  FIELD(VDATA, NF, 11, 4)
  FIELD(VDATA, WD, 11, 1)
+FIELD(VDATA, OL, 15, 2)
    /* float point classify helpers */
  target_ulong fclass_h(uint64_t frs1);
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 535420ee66..451688c328 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -112,6 +112,11 @@ static uint32_t vext_wd(uint32_t desc)
      return (simd_data(desc) >> 11) & 0x1;
  }
  +static inline uint32_t vext_ol(uint32_t desc)
+{
+    return FIELD_EX32(simd_data(desc), VDATA, OL);
+}

XLEN not OLEN.
OK.

@@ -123,6 +128,14 @@ static inline uint32_t vext_maxsz(uint32_t desc)
      return simd_maxsz(desc) << vext_lmul(desc);
  }
  +static inline target_ulong adjust_addr(target_ulong addr, uint32_t olen)
+{
+    if (olen < TARGET_LONG_BITS) {
+        addr &= UINT32_MAX;
+    }
+    return addr;
+}

Here's where I'm unsure.  This looks a lot like the changes that are required to support pointer-masking in vectors, which Alexey said he was going to look at.

(1) Do we need to pass anything in VEXT at all?
    We do have CPURISCVState, so we could just use cpu_get_ml,
Yes, we should use cpu_get_xl.
which we would also need for env->mmte etc for pointer masking.

Do you mean env->mpmmask and env->mpmbase? I think yes, we should also adjust these register behaviors with xlen.

I mean the set of [msu]pmmask and [msu]pmbase, selected as appropriate for the current execution mode.

(3) Do we try to streamline the computation by passing down composite
    mask and base parameters.  This way we don't need to do complex
    examination of ENV to determine execution mode, and instead always
    compute

       addr = (addr & mask) | base;

    where mask = -1, base = 0 for "normal" addressing, and when
    UXLEN == 32, mask <= UINT32_MAX.

Do you mean add env->pmmask and env->pmbase?

I can initialize them in riscv_tr_init_disas_context, such as by env->xpmmask & UINT32_MAX .


(4) Do we in fact want to pre-compute these into known slots on ENV,
    so that we don't have to pass these around as separate parameters?
    We would adjust these values during PM CSR changes and when
    changing privilege levels.
For option (3), I was suggesting a mask + base pair passed down from TCG-generated code.

For option (4), I was suggesting embedding a mask + base pair in env, which would be re-computed at every privilege level change, plus reset and vmload.

In both cases, the mask would be a combination of [msu]pmmask & (RV32 ? UINT32_MAX : UINT64_MAX), as you say.

We will calculate [msu]pmmask by  csrrw , and we have ignored high bits there.

Can we just use the [msu]pmmmask?

We could.  However:

In order to select [msu]pmmask, we have to look up the current cpu state.  In order to mask the high bits, we have to look up the current xl, which requires that we look up the current cpu state then extract the xl from misa  and mstatus.

All of which means that we're doing repeated lookups for every memory access.  I am suggesting that we either (3) compile those lookups into the generated code or (4) cache those lookups when state changes (csr writes and priv changes).


Do you mean we should add this code to riscv_tr_init_disas_context

    if (ctx->pm_enabled) {
         switch (priv) {
         case PRV_M:
             env->mask = env->mpmmask;
             env->base = env->mpmbase;
             break;
         case PRV_S:
             env->mask = env->spmmask;
             env->base = env->spmbase;
             break;
         case PRV_U:
             env->mask = env->upmmask;
             env->base = env->upmbase;
             break;
         default:
             g_assert_not_reached();
         }
         ctx->pm_mask = pm_mask[priv];
         ctx->pm_base = pm_base[priv];
         ctx->need_mask = true; /* new flag for mask */
     } else if (get_xlen(ctx)  < TARGET_LONG_BITS) {
         env->mask = UINT32_MAX;
         env->base = 0;
         ctx->pm_mask = tcg_constant_tl(UINT32_MAX);
         ctx->pm_base = tcg_constant_tl(0);
        ctx->need_mask = true;
     } else {
	 env->mask = UINT64_MAX;
         env->base = 0;
     }

I think the code is wrong, perhaps we should modify the write_mpmmask
env->mask = env->mpmmask = value;

Zhiwei

Thanks,
Zhiwei


r~
--------------E63463B41C14ADCCF83A9203--