Re: [PATCH] target/riscv/pmp: fix non-translated page size address checks w/ MPU

From: Alistair Francis <alistair23@gmail.com>
To: Leon Schuermann <leon@is.currently.online>
Cc: Palmer Dabbelt <palmer@dabbelt.com>,
	Alistair Francis <alistair.francis@wdc.com>,
	 Bin Meng <bin.meng@windriver.com>,
	"open list:RISC-V" <qemu-riscv@nongnu.org>,
	 "qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>
Subject: Re: [PATCH] target/riscv/pmp: fix non-translated page size address checks w/ MPU
Date: Tue, 25 Oct 2022 12:19:35 +1000	[thread overview]
Message-ID: <CAKmqyKO7=LS9JbMyCKg3C5U3-vNJiQTEDR59atwMGoCSpF3C2g@mail.gmail.com> (raw)
In-Reply-To: <8735bjqz8f.fsf@silicon.host.schuermann.io>

On Thu, Oct 20, 2022 at 7:29 AM Leon Schuermann
<leon@is.currently.online> wrote:
>
> Alistair Francis <alistair23@gmail.com> writes:
> >> @@ -310,10 +311,17 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
> >>      }
> >>
> >>      if (size == 0) {
> >> -        if (riscv_feature(env, RISCV_FEATURE_MMU)) {
> >> +        if (riscv_cpu_mxl(env) == MXL_RV32) {
> >> +            satp_mode = SATP32_MODE;
> >> +        } else {
> >> +            satp_mode = SATP64_MODE;
> >> +        }
> >> +
> >> +        if (riscv_feature(env, RISCV_FEATURE_MMU)
> >> +            && get_field(env->satp, satp_mode)) {
> >>              /*
> >> -             * If size is unknown (0), assume that all bytes
> >> -             * from addr to the end of the page will be accessed.
> >> +             * If size is unknown (0) and virtual memory is enabled, assume that
> >> +             * all bytes from addr to the end of the page will be accessed.
> >>               */
> >>              pmp_size = -(addr | TARGET_PAGE_MASK);
> >
> > I'm not sure if we need this at all.
> >
> > This function is only called from get_physical_address_pmp() which
> > then calculates the maximum size using pmp_is_range_in_tlb().
>
> I'm by no means an expert on QEMU and the TCG, so I've spun up GDB to
> trace down why exactly this function is called with a `size = 0`
> argument. It seems that there are, generally, two code paths to this
> function for instruction fetching:
>
> 1. From `get_page_addr_code`: this will invoke `tlb_fill` with
>    `size = 0` to check whether an entire page is accessible and can be
>    translated given the current MMU / PMP configuration. In my
>    particular example, it may rightfully fail then. `get_page_addr_code`
>    can handle this and will subsequently cause an MMU protection check
>    to be run for each instruction translated.
>
> 2. From `riscv_tr_translate_insn` through `cpu_lduw_code`, which will
>    execute `tlb_fill` with `size = 2` (to try and decode a compressed
>    instruction), assuming that the above check failed.
>
> So far, so good. In this context, it actually makes sense for
> `pmp_hart_has_privs` to interpret `size = 0` to mean whether the entire
> page is allowed to be accessed.
>
> > I suspect that we could just use sizeof(target_ulong) as the fallback
> > for every time size == 0. Then pmp_is_range_in_tlb() will set the
> > tlb_size to the maximum possible size of the PMP region.
>
> Given the above, I don't think that this is correct either. The PMP
> check would pass even for non-page sized regions, but the entire page
> would be accessible through the TCG's TLB, as a consequence of
> `get_page_addr_code`.

If we pass a size smaller than the page, it won't be cached in the
TLB, so that should be ok.

A few PMP improvements have gone into
https://github.com/alistair23/qemu/tree/riscv-to-apply.next. It might
be worth checking to see if that fixes the issue you are seeing.
Otherwise I think defaulting to the xlen should be ok.

Alistair

>
>
> In the current implementation, `get_page_addr_code_hostp` calls
> `tlb_fill`, which in turn invokes the RISC-V TCG op `tlb_fill` with the
> parameter `probe = false`. This in turn raises a PMP exception in the
> CPU, whereas `get_page_addr_code` would seem to expect this a failing
> `tlb_fill` to be side-effect free, such that the MMU protection checks
> can be re-run per instruction in the TCG code generation phase.
>
> I think that this is sufficient evidence to conclude that my initial
> patch is actually incorrect, however I am unsure as to how this issue
> can be solved proper. One approach which seems to work is to change
> `get_page_addr_code_hostp` to use a non-faulting page-table read
> instead:
>
> @@ -1510,11 +1510,15 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
>      uintptr_t mmu_idx = cpu_mmu_index(env, true);
>      uintptr_t index = tlb_index(env, mmu_idx, addr);
>      CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
> +    CPUState *cs = env_cpu(env);
> +    CPUClass *cc = CPU_GET_CLASS(cs);
>      void *p;
>
>      if (unlikely(!tlb_hit(entry->addr_code, addr))) {
>          if (!VICTIM_TLB_HIT(addr_code, addr)) {
> -            tlb_fill(env_cpu(env), addr, 0, MMU_INST_FETCH, mmu_idx, 0);
> +            // Nonfaulting page-table read:
> +            cc->tcg_ops->tlb_fill(cs, addr, 0, MMU_INST_FETCH, mmu_idx, true,
> +                                  0);
>              index = tlb_index(env, mmu_idx, addr);
>              entry = tlb_entry(env, mmu_idx, addr);
>
>
> However, given this touches the generic TCG implementation, I cannot
> judge whether this is correct or has any unintended side effects for
> other targets. If this is correct, I'd be happy to send a proper patch.
>
> -Leon