From: Anup Patel <anup@brainfault.org>
To: Jiuyang Liu <liu@jiuyang.me>
Cc: Alexandre Ghiti <alex@ghiti.fr>,
Andrew Waterman <waterman@eecs.berkeley.edu>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Atish Patra <atish.patra@wdc.com>,
Anup Patel <anup.patel@wdc.com>,
Andrew Morton <akpm@linux-foundation.org>,
Mike Rapoport <rppt@kernel.org>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
Zong Li <zong.li@sifive.com>,
Greentime Hu <greentime.hu@sifive.com>,
linux-riscv <linux-riscv@lists.infradead.org>,
"linux-kernel@vger.kernel.org List"
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] Insert SFENCE.VMA in function set_pte_at for RISCV
Date: Tue, 16 Mar 2021 13:02:24 +0530
Message-ID: <CAAhSdy1HYJJDig3Mg1eWaO=zok9G6+hQM1LLbDKMzH-=Fi2dKw@mail.gmail.com>
In-Reply-To: <CAPM7DZc+Ysd=VQdzc4_4Np8VAMESBrzD3mhk0ueh92x11bFFNg@mail.gmail.com>
On Tue, Mar 16, 2021 at 12:27 PM Jiuyang Liu <liu@jiuyang.me> wrote:
>
> > As per my understanding, we don't need to explicitly invalidate local TLB
> > in set_pte() or set_pte_at() because generic Linux page table management
> > (<linux>/mm/*) will call the appropriate flush_tlb_xyz() function after page
> > table updates.
>
> I witnessed this bug in our micro-architecture: the store issued by
> set_pte is still in the store buffer, and no function in the call stack
> below inserts an SFENCE.VMA, so the TLB cannot observe this modification.
> Here is my call stack:
> set_pte
> set_pte_at
> map_vm_area
> __vmalloc_area_node
> __vmalloc_node_range
> __vmalloc_node
> __vmalloc_node_flags
> vzalloc
> n_tty_open
>
> I think this is architecture-specific code, so <linux>/mm/* should
> not be modified.
> And the spec requires an SFENCE.VMA after each modification to the
> page tables, so I added the code here.
The generic <linux>/mm/* code already calls the appropriate flush_tlb_xyz()
function defined in arch/riscv/include/asm/tlbflush.h.
It would be better to have a write barrier in set_pte().
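
A minimal user-space sketch of that idea follows. It is only a model, not
kernel code: pte_t and model_wmb() are hypothetical stand-ins, and in the
kernel the barrier would be wmb() from asm/barrier.h. The sketch does not
settle whether a write barrier alone is sufficient for a particular
micro-architecture's page-table walker.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pte_t. */
typedef struct { uint64_t pte; } pte_t;

/*
 * Stand-in for wmb(). A C11 release fence orders all earlier stores
 * before any later store, which approximates a write barrier here.
 */
static inline void model_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Store the new PTE value, then order it ahead of later stores. */
static inline void set_pte(pte_t *ptep, pte_t pteval)
{
	assert(ptep != NULL);
	*ptep = pteval;
	model_wmb();
}
```

With this shape the ordering lives in set_pte() itself, so callers such as
set_pte_at() pick it up without a local TLB flush of their own.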
>
> > Also, just local TLB flush is generally not sufficient because
> > a lot of page tables will be used across multiple HARTs.
>
> Yes, this is the biggest issue, in RISC-V Volume 2, Privileged Spec v.
> 20190608 page 67 gave a solution:
This is not an issue with the RISC-V privileged spec; rather, it is
about placing RISC-V fences at the right locations.
> Consequently, other harts must be notified separately when the
> memory-management data structures have been modified. One approach is
> to use
> 1) a local data fence to ensure local writes are visible globally,
> then 2) an interprocessor interrupt to the other thread,
> then 3) a local SFENCE.VMA in the interrupt handler of the remote thread,
> and finally 4) signal back to originating thread that operation is
> complete. This is, of course, the RISC-V analog to a TLB shootdown.
I would suggest trying step 1) first.
You can include "asm/barrier.h" here and use wmb() or __smp_wmb()
in place of the local TLB flush.
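
For reference, the four quoted steps can be modeled in user space roughly
as below. This is a sketch under stated assumptions, not how the kernel
implements remote flushes: a thread stands in for the remote hart, a flag
stands in for the interprocessor interrupt, re-reading the shared value
stands in for SFENCE.VMA, and every name in it is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* All names are hypothetical; this models the protocol, not the kernel. */
static _Atomic uint64_t page_table_entry;	/* shared "PTE" in memory */
static uint64_t remote_tlb_copy;		/* remote hart's cached translation */
static atomic_int ipi_pending;			/* stands in for the IPI */
static atomic_int shootdown_done;		/* stands in for the completion signal */

/* Remote hart: the wait loop plays the role of taking the interrupt. */
static void *remote_hart(void *unused)
{
	(void)unused;
	while (!atomic_load_explicit(&ipi_pending, memory_order_acquire))
		;	/* wait for the "IPI" */

	/* Step 3: local SFENCE.VMA, modeled as re-reading the PTE. */
	remote_tlb_copy = atomic_load_explicit(&page_table_entry,
					       memory_order_relaxed);

	/* Step 4: signal back to the originating hart. */
	atomic_store_explicit(&shootdown_done, 1, memory_order_release);
	return NULL;
}

/* Originating hart: update the PTE, run the steps, return the remote view. */
static uint64_t shootdown(uint64_t new_pte)
{
	pthread_t remote;
	int rc;

	/* The remote hart starts out holding the old value. */
	remote_tlb_copy = atomic_load(&page_table_entry);
	atomic_store(&ipi_pending, 0);
	atomic_store(&shootdown_done, 0);
	rc = pthread_create(&remote, NULL, remote_hart, NULL);
	assert(rc == 0);
	(void)rc;

	atomic_store_explicit(&page_table_entry, new_pte, memory_order_relaxed);
	/* Step 1: local data fence so the PTE store is visible globally. */
	atomic_thread_fence(memory_order_release);
	/* Step 2: "send" the interprocessor interrupt. */
	atomic_store_explicit(&ipi_pending, 1, memory_order_relaxed);

	while (!atomic_load_explicit(&shootdown_done, memory_order_acquire))
		;	/* wait for step 4 */
	pthread_join(remote, NULL);
	return remote_tlb_copy;
}
```

In this model step 1 is the only part that a write barrier in set_pte()
would supply; steps 2 to 4 correspond to the remote flush handled elsewhere.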
>
> In general, this patch doesn't handle the G bit in the PTE; the kernel traps it
> to sbi_remote_sfence_vma. Do you think I should use flush_tlb_all?
>
> Jiuyang
>
>
>
>
> arch/arm/mm/mmu.c
> void set_pte_at(struct mm_struct *mm, unsigned long addr,
> 		pte_t *ptep, pte_t pteval)
> {
> 	unsigned long ext = 0;
>
> 	if (addr < TASK_SIZE && pte_valid_user(pteval)) {
> 		if (!pte_special(pteval))
> 			__sync_icache_dcache(pteval);
> 		ext |= PTE_EXT_NG;
> 	}
>
> 	set_pte_ext(ptep, pteval, ext);
> }
>
> arch/mips/include/asm/pgtable.h
> static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
> 		pte_t *ptep, pte_t pteval)
> {
>
> 	if (!pte_present(pteval))
> 		goto cache_sync_done;
>
> 	if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
> 		goto cache_sync_done;
>
> 	__update_cache(addr, pteval);
> cache_sync_done:
> 	set_pte(ptep, pteval);
> }
>
>
> > Also, just local TLB flush is generally not sufficient because
> > a lot of page tables will be used across multiple HARTs.
>
>
> On Tue, Mar 16, 2021 at 5:05 AM Anup Patel <anup@brainfault.org> wrote:
> >
> > +Alex
> >
> > On Tue, Mar 16, 2021 at 9:20 AM Jiuyang Liu <liu@jiuyang.me> wrote:
> > >
> > > This patch inserts SFENCE.VMA after modifying PTE based on RISC-V
> > > specification.
> > >
> > > arch/riscv/include/asm/pgtable.h:
> > > > 1. implement pte_user, pte_global and pte_leaf to check the corresponding
> > > > attribute of a pte_t.
> >
> > Adding pte_user(), pte_global(), and pte_leaf() is fine.
> >
> > >
> > > 2. insert SFENCE.VMA in set_pte_at based on RISC-V Volume 2, Privileged
> > > Spec v. 20190608 page 66 and 67:
> > > If software modifies a non-leaf PTE, it should execute SFENCE.VMA with
> > > rs1=x0. If any PTE along the traversal path had its G bit set, rs2 must
> > > be x0; otherwise, rs2 should be set to the ASID for which the
> > > translation is being modified.
> > > If software modifies a leaf PTE, it should execute SFENCE.VMA with rs1
> > > set to a virtual address within the page. If any PTE along the traversal
> > > path had its G bit set, rs2 must be x0; otherwise, rs2 should be set to
> > > the ASID for which the translation is being modified.
> > >
> > > arch/riscv/include/asm/tlbflush.h:
> > > 1. implement get_current_asid to get current program asid.
> > > 2. implement local_flush_tlb_asid to flush tlb with asid.
> >
> > > As per my understanding, we don't need to explicitly invalidate local TLB
> > > in set_pte() or set_pte_at() because generic Linux page table management
> > > (<linux>/mm/*) will call the appropriate flush_tlb_xyz() function after page
> > > table updates. Also, just local TLB flush is generally not sufficient because
> > > a lot of page tables will be used across multiple HARTs.
> >
> > >
> > > Signed-off-by: Jiuyang Liu <liu@jiuyang.me>
> > > ---
> > > arch/riscv/include/asm/pgtable.h | 27 +++++++++++++++++++++++++++
> > > arch/riscv/include/asm/tlbflush.h | 12 ++++++++++++
> > > 2 files changed, 39 insertions(+)
> > >
> > > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > > index ebf817c1bdf4..5a47c60372c1 100644
> > > --- a/arch/riscv/include/asm/pgtable.h
> > > +++ b/arch/riscv/include/asm/pgtable.h
> > > @@ -222,6 +222,16 @@ static inline int pte_write(pte_t pte)
> > > return pte_val(pte) & _PAGE_WRITE;
> > > }
> > >
> > > +static inline int pte_user(pte_t pte)
> > > +{
> > > + return pte_val(pte) & _PAGE_USER;
> > > +}
> > > +
> > > +static inline int pte_global(pte_t pte)
> > > +{
> > > + return pte_val(pte) & _PAGE_GLOBAL;
> > > +}
> > > +
> > > static inline int pte_exec(pte_t pte)
> > > {
> > > return pte_val(pte) & _PAGE_EXEC;
> > > @@ -248,6 +258,11 @@ static inline int pte_special(pte_t pte)
> > > return pte_val(pte) & _PAGE_SPECIAL;
> > > }
> > >
> > > +static inline int pte_leaf(pte_t pte)
> > > +{
> > > + return pte_val(pte) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC);
> > > +}
> > > +
> > > /* static inline pte_t pte_rdprotect(pte_t pte) */
> > >
> > > static inline pte_t pte_wrprotect(pte_t pte)
> > > @@ -358,6 +373,18 @@ static inline void set_pte_at(struct mm_struct *mm,
> > > flush_icache_pte(pteval);
> > >
> > > set_pte(ptep, pteval);
> > > +
> > > + if (pte_present(pteval)) {
> > > + if (pte_leaf(pteval)) {
> > > + local_flush_tlb_page(addr);
> > > + } else {
> > > + if (pte_global(pteval))
> > > + local_flush_tlb_all();
> > > + else
> > > + local_flush_tlb_asid();
> > > +
> > > + }
> > > + }
> > > }
> > >
> > > static inline void pte_clear(struct mm_struct *mm,
> > > diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
> > > index 394cfbccdcd9..1f9b62b3670b 100644
> > > --- a/arch/riscv/include/asm/tlbflush.h
> > > +++ b/arch/riscv/include/asm/tlbflush.h
> > > @@ -21,6 +21,18 @@ static inline void local_flush_tlb_page(unsigned long addr)
> > > {
> > > __asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
> > > }
> > > +
> > > +static inline unsigned long get_current_asid(void)
> > > +{
> > > + return (csr_read(CSR_SATP) >> SATP_ASID_SHIFT) & SATP_ASID_MASK;
> > > +}
> > > +
> > > +static inline void local_flush_tlb_asid(void)
> > > +{
> > > + unsigned long asid = get_current_asid();
> > > + __asm__ __volatile__ ("sfence.vma x0, %0" : : "r" (asid) : "memory");
> > > +}
> > > +
> > > #else /* CONFIG_MMU */
> > > #define local_flush_tlb_all() do { } while (0)
> > > #define local_flush_tlb_page(addr) do { } while (0)
> > > --
> > > 2.30.2
> > >
> > >
> > > _______________________________________________
> > > linux-riscv mailing list
> > > linux-riscv@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-riscv
> >
> > Regards,
> > Anup
Regards,
Anup