linux-riscv.lists.infradead.org archive mirror
* [PATCH v2] riscv: mm: synchronize MMU after pte change
@ 2019-06-17  4:26 ShihPo Hung
  2019-06-17 10:32 ` Paul Walmsley
From: ShihPo Hung @ 2019-06-17  4:26 UTC (permalink / raw)
  To: linux-riscv; +Cc: Albert Ou, Palmer Dabbelt, stable, Paul Walmsley

Because RISC-V-compliant implementations can cache invalid entries
in the TLB, an SFENCE.VMA is necessary after changes to the page table.
This patch adds an SFENCE.VMA for the vmalloc_fault path.

Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-riscv@lists.infradead.org
Cc: stable@vger.kernel.org
---
 arch/riscv/mm/fault.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 88401d5..a1c7b39 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -29,6 +29,7 @@

 #include <asm/pgalloc.h>
 #include <asm/ptrace.h>
+#include <asm/tlbflush.h>

 /*
  * This routine handles page faults.  It determines the address and the
@@ -281,6 +282,16 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
        pte_k = pte_offset_kernel(pmd_k, addr);
        if (!pte_present(*pte_k))
            goto no_context;
+
+       /*
+        * The kernel assumes that TLBs don't cache invalid entries, but
+        * in RISC-V, SFENCE.VMA specifies an ordering constraint, not a
+        * cache flush; it is necessary even after writing invalid entries.
+        * Relying on flush_tlb_fix_spurious_fault would suffice, but
+        * the extra traps reduce performance.  So, eagerly SFENCE.VMA.
+        */
+       local_flush_tlb_page(addr);
+
        return;
    }
 }
--
2.7.4

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [PATCH v2] riscv: mm: synchronize MMU after pte change
  2019-06-17  4:26 [PATCH v2] riscv: mm: synchronize MMU after pte change ShihPo Hung
@ 2019-06-17 10:32 ` Paul Walmsley
  2019-06-19  1:51   ` Gary Guo
From: Paul Walmsley @ 2019-06-17 10:32 UTC (permalink / raw)
  To: ShihPo Hung; +Cc: linux-riscv, Palmer Dabbelt, Albert Ou, stable

Hi ShihPo,

On Mon, 17 Jun 2019, ShihPo Hung wrote:

> Because RISC-V-compliant implementations can cache invalid entries
> in the TLB, an SFENCE.VMA is necessary after changes to the page table.
> This patch adds an SFENCE.VMA for the vmalloc_fault path.
> 
> Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
> Cc: Palmer Dabbelt <palmer@sifive.com>
> Cc: Albert Ou <aou@eecs.berkeley.edu>
> Cc: Paul Walmsley <paul.walmsley@sifive.com>
> Cc: linux-riscv@lists.infradead.org
> Cc: stable@vger.kernel.org

Looks like something in your patch flow converted tabs to spaces, so 
the patch doesn't apply.  Then, with the tabs fixed, the comment text 
exceeds 80 columns.

I've fixed those issues by hand for this patch (revised patch below, as 
queued for v5.2-rc), but could you please fix them in future 
patches?  Running checkpatch.pl --strict should help catch these issues 
before posting.


thanks,

- Paul


From: ShihPo Hung <shihpo.hung@sifive.com>
Date: Mon, 17 Jun 2019 12:26:17 +0800
Subject: [PATCH] [PATCH v2] riscv: mm: synchronize MMU after pte change

Because RISC-V-compliant implementations can cache invalid entries
in the TLB, an SFENCE.VMA is necessary after changes to the page table.
This patch adds an SFENCE.VMA for the vmalloc_fault path.

Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
[paul.walmsley@sifive.com: reversed tab->whitespace conversion,
 wrapped comment lines]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-riscv@lists.infradead.org
Cc: stable@vger.kernel.org
---
 arch/riscv/mm/fault.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index cec8be9e2d6a..5b72e60c5a6b 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -29,6 +29,7 @@
 
 #include <asm/pgalloc.h>
 #include <asm/ptrace.h>
+#include <asm/tlbflush.h>
 
 /*
  * This routine handles page faults.  It determines the address and the
@@ -278,6 +279,18 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 		pte_k = pte_offset_kernel(pmd_k, addr);
 		if (!pte_present(*pte_k))
 			goto no_context;
+
+		/*
+		 * The kernel assumes that TLBs don't cache invalid
+		 * entries, but in RISC-V, SFENCE.VMA specifies an
+		 * ordering constraint, not a cache flush; it is
+		 * necessary even after writing invalid entries.
+		 * Relying on flush_tlb_fix_spurious_fault would
+		 * suffice, but the extra traps reduce
+		 * performance. So, eagerly SFENCE.VMA.
+		 */
+		local_flush_tlb_page(addr);
+
 		return;
 	}
 }
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* RE: [PATCH v2] riscv: mm: synchronize MMU after pte change
  2019-06-17 10:32 ` Paul Walmsley
@ 2019-06-19  1:51   ` Gary Guo
  2019-06-19  7:44     ` Alan Kao
From: Gary Guo @ 2019-06-19  1:51 UTC (permalink / raw)
  To: Paul Walmsley, ShihPo Hung; +Cc: linux-riscv, Palmer Dabbelt, stable, Albert Ou

I don't think it is sensible for any hardware to actually cache invalid entries.
According to my research https://arxiv.org/pdf/1905.06825 (to appear in CARRV 2019), the existing Linux code already emits many SFENCE.VMAs (94% of all flushes) that are completely unnecessary on hardware that does not cache invalid entries. Adding a couple more would definitely not be good, considering that a TLB flush can be expensive on some microarchitectures.

I think the spec should be fixed to recommend against caching invalid entries; then we can do what other architectures do, i.e. use flush_tlb_fix_spurious_fault instead.




* Re: [PATCH v2] riscv: mm: synchronize MMU after pte change
  2019-06-19  1:51   ` Gary Guo
@ 2019-06-19  7:44     ` Alan Kao
From: Alan Kao @ 2019-06-19  7:44 UTC (permalink / raw)
  To: Gary Guo
  Cc: Albert Ou, Palmer Dabbelt, stable, Paul Walmsley, linux-riscv,
	ShihPo Hung

Hi all,

On Wed, Jun 19, 2019 at 01:51:43AM +0000, Gary Guo wrote:
> I don't think it is sensible that any hardware would actually cache invalid entries.

I need some clarification: what does "invalid entries" mean here?
The inconsistency between the TLB and the page table?

> According to my research https://arxiv.org/pdf/1905.06825 (to appear in CARRV 2019), the existing Linux code already emits many SFENCE.VMAs (94% of all flushes) that are completely unnecessary on hardware that does not cache invalid entries. Adding a couple more would definitely not be good, considering that a TLB flush can be expensive on some microarchitectures.
> 
> I think the spec should be fixed to recommend against caching invalid entries; then we can do what other architectures do, i.e. use flush_tlb_fix_spurious_fault instead.
> 
> > -----Original Message-----
> > From: linux-riscv <linux-riscv-bounces@lists.infradead.org> On Behalf Of Paul
> > Walmsley
> > Sent: Monday, June 17, 2019 11:33
> > To: ShihPo Hung <shihpo.hung@sifive.com>
> > Cc: linux-riscv@lists.infradead.org; Palmer Dabbelt <palmer@sifive.com>; Albert
> > Ou <aou@eecs.berkeley.edu>; stable@vger.kernel.org
> > Subject: Re: [PATCH v2] riscv: mm: synchronize MMU after pte change
> > 
> > Hi ShihPo,
> > 
> > On Mon, 17 Jun 2019, ShihPo Hung wrote:
> > 
> > > Because RISC-V-compliant implementations can cache invalid entries
> > > in the TLB, an SFENCE.VMA is necessary after changes to the page table.
> > > This patch adds an SFENCE.VMA for the vmalloc_fault path.
> > >
> > > Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
> > > Cc: Palmer Dabbelt <palmer@sifive.com>
> > > Cc: Albert Ou <aou@eecs.berkeley.edu>
> > > Cc: Paul Walmsley <paul.walmsley@sifive.com>
> > > Cc: linux-riscv@lists.infradead.org
> > > Cc: stable@vger.kernel.org
> > 
> > Looks like something in your patch flow converted tabs to whitespace, so
> > the patch doesn't apply.  Then, with the tabs fixed, the comment text
> > exceeds 80 columns.
> > 
> > I've fixed those issues by hand for this patch (revised patch below, as
> > queued for v5.2-rc), but could you please fix these issues for future
> > patches?  Running checkpatch.pl --strict should help identify these issues
> > before posting.
> > 
> > 
> > thanks,
> > 
> > - Paul
> > 
> > 
> > From: ShihPo Hung <shihpo.hung@sifive.com>
> > Date: Mon, 17 Jun 2019 12:26:17 +0800
> > Subject: [PATCH] [PATCH v2] riscv: mm: synchronize MMU after pte change
> > 
> > Because RISC-V-compliant implementations can cache invalid entries
> > in the TLB, an SFENCE.VMA is necessary after changes to the page table.
> > This patch adds an SFENCE.VMA for the vmalloc_fault path.
> > 
> > Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
> > [paul.walmsley@sifive.com: reversed tab->whitespace conversion,
> >  wrapped comment lines]
> > Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
> > Cc: Palmer Dabbelt <palmer@sifive.com>
> > Cc: Albert Ou <aou@eecs.berkeley.edu>
> > Cc: Paul Walmsley <paul.walmsley@sifive.com>
> > Cc: linux-riscv@lists.infradead.org
> > Cc: stable@vger.kernel.org

I guess this patch was inspired by a discussion on the sw-dev forum:
https://groups.google.com/a/groups.riscv.org/forum/#!msg/sw-dev/-M-eRDmGuEc/m1__-sfqAAAJ

Some processors in our AndeStar V5 family do suffer from the issue this
patch addresses.  Along this vmalloc_fault path in do_page_fault, the
pud and pmd (level-3/2 PTEs) are set in the dcache, but the handler
returns without an SFENCE.VMA.  The hardware page-table walker does not
snoop the dcache, so the PTE update has no effect and the hart loops on
the same fault.

Just as the spec mentions:
"The SFENCE.VMA is used to flush any local hardware caches related to
address translation." 



