* [PATCH] arm64: mm: ensure that the zero page is visible to the page table walker
From: Will Deacon @ 2015-12-10 17:39 UTC
  To: linux-arm-kernel; +Cc: mark.rutland, Will Deacon, stable

In paging_init, we allocate the zero page, memset it to zero and then
point TTBR0 to it in order to avoid speculative fetches through the
identity mapping.

In order to guarantee that the freshly zeroed page is indeed visible to
the page table walker, we need to execute a dsb instruction prior to
writing the TTBR.

Cc: <stable@vger.kernel.org> # v3.14+, for older kernels need to drop the 'ishst'
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/mm/mmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c04def90f3e4..c5bd5bca8e3d 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -464,6 +464,9 @@ void __init paging_init(void)
 
 	empty_zero_page = virt_to_page(zero_page);
 
+	/* Ensure the zero page is visible to the page table walker */
+	dsb(ishst);
+
 	/*
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
 	 * point to zero page to avoid speculatively fetching new entries.
-- 
2.1.4


* Re: [PATCH] arm64: mm: ensure that the zero page is visible to the page table walker
From: Mark Rutland @ 2015-12-10 18:14 UTC
  To: Will Deacon; +Cc: linux-arm-kernel, stable

Hi Will,

On Thu, Dec 10, 2015 at 05:39:59PM +0000, Will Deacon wrote:
> In paging_init, we allocate the zero page, memset it to zero and then
> point TTBR0 to it in order to avoid speculative fetches through the
> identity mapping.
> 
> In order to guarantee that the freshly zeroed page is indeed visible to
> the page table walker, we need to execute a dsb instruction prior to
> writing the TTBR.
> 
> Cc: <stable@vger.kernel.org> # v3.14+, for older kernels need to drop the 'ishst'
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm64/mm/mmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index c04def90f3e4..c5bd5bca8e3d 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -464,6 +464,9 @@ void __init paging_init(void)
>  
>  	empty_zero_page = virt_to_page(zero_page);
>  
> +	/* Ensure the zero page is visible to the page table walker */
> +	dsb(ishst);

I think this should live in early_alloc (likewise in late_alloc).

In the other cases where we call early_alloc or late_alloc, we assume
the zeroing is visible to the page table walker.

For example, in alloc_init_pte we do:
	
	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
		pte = alloc(PTRS_PER_PTE * sizeof(pte_t));
		if (pmd_sect(*pmd))
			split_pmd(pmd, pte);
		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
		flush_tlb_all();
	}

There's a dsb in __pmd_populate, but it's _after_ the write to the pmd
entry, so the walker might start walking the newly-allocated pte table
before the zeroing is visible.

Either we need a barrier after every alloc, or we fold the barrier into
the two allocation functions.
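
For the latter, a rough (untested) sketch of folding the barrier into
early_alloc could look like:

	static void __init *early_alloc(unsigned long sz)
	{
		phys_addr_t phys;
		void *ptr;

		phys = memblock_alloc(sz, sz);
		BUG_ON(!phys);
		ptr = __va(phys);
		memset(ptr, 0, sz);

		/* make the zeroing visible before the page is plumbed in */
		dsb(ishst);
		return ptr;
	}

with an equivalent dsb(ishst) added to late_alloc after its
__get_free_page(PGALLOC_GFP).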

Thanks,
Mark.

* Re: [PATCH] arm64: mm: ensure that the zero page is visible to the page table walker
From: Will Deacon @ 2015-12-11 17:58 UTC
  To: Mark Rutland; +Cc: linux-arm-kernel, stable

On Thu, Dec 10, 2015 at 06:14:12PM +0000, Mark Rutland wrote:
> Hi Will,

Hi Mark,

> On Thu, Dec 10, 2015 at 05:39:59PM +0000, Will Deacon wrote:
> > In paging_init, we allocate the zero page, memset it to zero and then
> > point TTBR0 to it in order to avoid speculative fetches through the
> > identity mapping.
> > 
> > In order to guarantee that the freshly zeroed page is indeed visible to
> > the page table walker, we need to execute a dsb instruction prior to
> > writing the TTBR.
> > 
> > Cc: <stable@vger.kernel.org> # v3.14+, for older kernels need to drop the 'ishst'
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ---
> >  arch/arm64/mm/mmu.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index c04def90f3e4..c5bd5bca8e3d 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -464,6 +464,9 @@ void __init paging_init(void)
> >  
> >  	empty_zero_page = virt_to_page(zero_page);
> >  
> > +	/* Ensure the zero page is visible to the page table walker */
> > +	dsb(ishst);
> 
> I think this should live in early_alloc (likewise in late_alloc).
> 
> In the other cases where we call early_alloc or late_alloc, we assume
> the zeroing is visible to the page table walker.
> 
> For example, in alloc_init_pte we do:
> 	
> 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
> 		pte = alloc(PTRS_PER_PTE * sizeof(pte_t));
> 		if (pmd_sect(*pmd))
> 			split_pmd(pmd, pte);
> 		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
> 		flush_tlb_all();
> 	}
> 
> There's a dsb in __pmd_populate, but it's _after_ the write to the pmd
> entry, so the walker might start walking the newly-allocated pte table
> before the zeroing is visible.

Urgh. The reason this is a problem is because we're modifying the page
tables live (which I know that you're fixing) without using
break-before-make. Consequently, the usual ordering guarantees that we
get from the tlb flush after installing the invalid entry do not apply
and we end up with the issue you point out.
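
For comparison, a break-before-make version of that update would look
roughly like the following sketch (where pmd_clear() simply stands in
for installing an invalid entry first):

	pmd_clear(pmd);					/* break */
	flush_tlb_all();				/* dsb; tlbi; dsb; isb */
	__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);	/* make */

and the dsb at the start of the TLB flush would order the zeroing of
the new table before the write of the table entry.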

> Either we need a barrier after every alloc, or we fold the barrier into
> the two allocation functions.

Could you roll this into your patch that drops the size parameter from
the alloc functions please? Then we can name them {early,late}_alloc_pgtable
and have them do the dsb in there. Maybe we can drop it again when we're
doing proper break-before-make.

Cheers,

Will

* Re: [PATCH] arm64: mm: ensure that the zero page is visible to the page table walker
From: Mark Rutland @ 2015-12-11 18:19 UTC
  To: Will Deacon, catalin.marinas; +Cc: linux-arm-kernel, stable

> > > +	/* Ensure the zero page is visible to the page table walker */
> > > +	dsb(ishst);
> > 
> > I think this should live in early_alloc (likewise in late_alloc).
> > 
> > In the other cases where we call early_alloc or late_alloc, we assume
> > the zeroing is visible to the page table walker.
> > 
> > For example, in alloc_init_pte we do:
> > 	
> > 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
> > 		pte = alloc(PTRS_PER_PTE * sizeof(pte_t));
> > 		if (pmd_sect(*pmd))
> > 			split_pmd(pmd, pte);
> > 		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
> > 		flush_tlb_all();
> > 	}
> > 
> > There's a dsb in __pmd_populate, but it's _after_ the write to the pmd
> > entry, so the walker might start walking the newly-allocated pte table
> > before the zeroing is visible.
> 
> Urgh. The reason this is a problem is because we're modifying the page
> tables live (which I know that you're fixing) without using
> break-before-make. Consequently, the usual ordering guarantees that we
> get from the tlb flush after installing the invalid entry do not apply
> and we end up with the issue you point out.

My feeling was that in these paths we usually assume all prior page
table updates have been made visible to the page table walkers. Given
that, having the allocator guarantee the zeroing was already visible
felt like the natural thing to do.

That said, having looked at mm/memory.c, we seem to follow the exact
same pattern when plumbing tables together dynamically, with only an
smp_wmb() between the zeroed allocation and plumbing a table entry in.

e.g. in __pte_alloc we have the pattern:

	pgtable_t new = pte_alloc_one(mm, address);
	smp_wmb();
	if (pmd_none(*pmd))
		pmd_populate(mm, pmd, new);

> > Either we need a barrier after every alloc, or we fold the barrier into
> > the two allocation functions.
> 
> Could you roll this into your patch that drops the size parameter from
> the alloc functions please? Then we can name them {early,late}_alloc_pgtable
> and have them do the dsb in there. Maybe we can drop it again when we're
> doing proper break-before-make.

Sure, will do.

Thanks,
Mark.

* Re: [PATCH] arm64: mm: ensure that the zero page is visible to the page table walker
From: Will Deacon @ 2015-12-11 19:10 UTC
  To: Mark Rutland; +Cc: catalin.marinas, linux-arm-kernel, stable

On Fri, Dec 11, 2015 at 06:19:52PM +0000, Mark Rutland wrote:
> > > > +	/* Ensure the zero page is visible to the page table walker */
> > > > +	dsb(ishst);
> > > 
> > > I think this should live in early_alloc (likewise in late_alloc).
> > > 
> > > In the other cases where we call early_alloc or late_alloc, we assume
> > > the zeroing is visible to the page table walker.
> > > 
> > > For example, in alloc_init_pte we do:
> > > 	
> > > 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
> > > 		pte = alloc(PTRS_PER_PTE * sizeof(pte_t));
> > > 		if (pmd_sect(*pmd))
> > > 			split_pmd(pmd, pte);
> > > 		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
> > > 		flush_tlb_all();
> > > 	}
> > > 
> > > There's a dsb in __pmd_populate, but it's _after_ the write to the pmd
> > > entry, so the walker might start walking the newly-allocated pte table
> > > before the zeroing is visible.
> > 
> > Urgh. The reason this is a problem is because we're modifying the page
> > tables live (which I know that you're fixing) without using
> > break-before-make. Consequently, the usual ordering guarantees that we
> > get from the tlb flush after installing the invalid entry do not apply
> > and we end up with the issue you point out.
> 
> My feeling was that in these paths we usually assume all prior page
> table updates have been made visible to the page table walkers. Given
> that, having the allocator guarantee the zeroing was already visible
> felt like the natural thing to do.
> 
> That said, having looked at mm/memory.c, we seem to follow the exact
> same pattern when plumbing tables together dynamically, with only an
> smp_wmb() between the zeroed allocation and plumbing a table entry in.
> 
> e.g. in __pte_alloc we have the pattern:
> 
> 	pgtable_t new = pte_alloc_one(mm, address);
> 	smp_wmb();
> 	if (pmd_none(*pmd))
> 		pmd_populate(mm, pmd, new);

I suspect this is potentially broken if somebody builds a CPU with a
"cool feature" in the page table walker that allows it to walk out of
order without respecting address dependencies.

The easiest fix is adding dsb(ishst) to the page table alloc functions.
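
i.e. a minimal sketch for pte_alloc_one_kernel would be:

	static inline pte_t *
	pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr)
	{
		pte_t *pte = (pte_t *)__get_free_page(PGALLOC_GFP);

		/* make the zeroed table visible to the walker */
		dsb(ishst);
		return pte;
	}

and similarly for the pmd, pud and pgd allocators.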

Will

* Re: [PATCH] arm64: mm: ensure that the zero page is visible to the page table walker
From: Mark Rutland @ 2015-12-11 19:16 UTC
  To: Will Deacon; +Cc: catalin.marinas, linux-arm-kernel, stable

On Fri, Dec 11, 2015 at 07:10:31PM +0000, Will Deacon wrote:
> On Fri, Dec 11, 2015 at 06:19:52PM +0000, Mark Rutland wrote:
> > > > > +	/* Ensure the zero page is visible to the page table walker */
> > > > > +	dsb(ishst);
> > > > 
> > > > I think this should live in early_alloc (likewise in late_alloc).
> > > > 
> > > > In the other cases where we call early_alloc or late_alloc, we assume
> > > > the zeroing is visible to the page table walker.
> > > > 
> > > > For example, in alloc_init_pte we do:
> > > > 	
> > > > 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
> > > > 		pte = alloc(PTRS_PER_PTE * sizeof(pte_t));
> > > > 		if (pmd_sect(*pmd))
> > > > 			split_pmd(pmd, pte);
> > > > 		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
> > > > 		flush_tlb_all();
> > > > 	}
> > > > 
> > > > There's a dsb in __pmd_populate, but it's _after_ the write to the pmd
> > > > entry, so the walker might start walking the newly-allocated pte table
> > > > before the zeroing is visible.
> > > 
> > > Urgh. The reason this is a problem is because we're modifying the page
> > > tables live (which I know that you're fixing) without using
> > > break-before-make. Consequently, the usual ordering guarantees that we
> > > get from the tlb flush after installing the invalid entry do not apply
> > > and we end up with the issue you point out.
> > 
> > My feeling was that in these paths we usually assume all prior page
> > table updates have been made visible to the page table walkers. Given
> > that, having the allocator guarantee the zeroing was already visible
> > felt like the natural thing to do.
> > 
> > That said, having looked at mm/memory.c, we seem to follow the exact
> > same pattern when plumbing tables together dynamically, with only an
> > smp_wmb() between the zeroed allocation and plumbing a table entry in.
> > 
> > e.g. in __pte_alloc we have the pattern:
> > 
> > 	pgtable_t new = pte_alloc_one(mm, address);
> > 	smp_wmb();
> > 	if (pmd_none(*pmd))
> > 		pmd_populate(mm, pmd, new);
> 
> I suspect this is potentially broken if somebody builds a CPU with a
> "cool feature" in the page table walker that allows it to walk out of
> order without respecting address dependencies.
> 
> The easiest fix is adding dsb(ishst) to the page table alloc functions.

Sounds good to me.

Mark.

* [PATCH 1/2] arm64: mm: specialise pagetable allocators
From: Mark Rutland @ 2015-12-14 11:40 UTC
  To: linux-arm-kernel

We pass a size parameter to early_alloc and late_alloc, but these are
only ever used to allocate single pages; late_alloc always allocates a
single page regardless of the size it is passed.

Both allocators provide us with zeroed pages (such that all entries are
invalid), but we have no barriers between allocating a page and adding
that page to existing (live) tables. A concurrent page table walk may
see stale data, leading to a number of issues.

This patch specialises the two allocators for page tables. The size
parameter is removed and the necessary dsb(ishst) is folded into each.
To make it clear that the functions are intended for page table
allocation, they are renamed to {early_,late_}pgtable_alloc, with the
related function pointer renamed to pgtable_alloc.

As the dsb(ishst) is now in the allocator, the existing barrier for the
zero page is redundant and thus is removed. The previously missing
include of barrier.h is added.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/mm/mmu.c | 52 +++++++++++++++++++++++++++-------------------------
 1 file changed, 27 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c5bd5bc..3ed128c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -30,6 +30,7 @@
 #include <linux/slab.h>
 #include <linux/stop_machine.h>
 
+#include <asm/barrier.h>
 #include <asm/cputype.h>
 #include <asm/fixmap.h>
 #include <asm/kernel-pgtable.h>
@@ -62,15 +63,18 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 }
 EXPORT_SYMBOL(phys_mem_access_prot);
 
-static void __init *early_alloc(unsigned long sz)
+static void __init *early_pgtable_alloc(void)
 {
 	phys_addr_t phys;
 	void *ptr;
 
-	phys = memblock_alloc(sz, sz);
+	phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
 	BUG_ON(!phys);
 	ptr = __va(phys);
-	memset(ptr, 0, sz);
+	memset(ptr, 0, PAGE_SIZE);
+
+	/* Ensure the zeroed page is visible to the page table walker */
+	dsb(ishst);
 	return ptr;
 }
 
@@ -95,12 +99,12 @@ static void split_pmd(pmd_t *pmd, pte_t *pte)
 static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
 				  unsigned long end, unsigned long pfn,
 				  pgprot_t prot,
-				  void *(*alloc)(unsigned long size))
+				  void *(*pgtable_alloc)(void))
 {
 	pte_t *pte;
 
 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
-		pte = alloc(PTRS_PER_PTE * sizeof(pte_t));
+		pte = pgtable_alloc();
 		if (pmd_sect(*pmd))
 			split_pmd(pmd, pte);
 		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
@@ -130,7 +134,7 @@ static void split_pud(pud_t *old_pud, pmd_t *pmd)
 static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 				  unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
-				  void *(*alloc)(unsigned long size))
+				  void *(*pgtable_alloc)(void))
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -139,7 +143,7 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 	 * Check for initial section mappings in the pgd/pud and remove them.
 	 */
 	if (pud_none(*pud) || pud_sect(*pud)) {
-		pmd = alloc(PTRS_PER_PMD * sizeof(pmd_t));
+		pmd = pgtable_alloc();
 		if (pud_sect(*pud)) {
 			/*
 			 * need to have the 1G of mappings continue to be
@@ -174,7 +178,7 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 			}
 		} else {
 			alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
-				       prot, alloc);
+				       prot, pgtable_alloc);
 		}
 		phys += next - addr;
 	} while (pmd++, addr = next, addr != end);
@@ -195,13 +199,13 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
 static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				  unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
-				  void *(*alloc)(unsigned long size))
+				  void *(*pgtable_alloc)(void))
 {
 	pud_t *pud;
 	unsigned long next;
 
 	if (pgd_none(*pgd)) {
-		pud = alloc(PTRS_PER_PUD * sizeof(pud_t));
+		pud = pgtable_alloc();
 		pgd_populate(mm, pgd, pud);
 	}
 	BUG_ON(pgd_bad(*pgd));
@@ -234,7 +238,8 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				}
 			}
 		} else {
-			alloc_init_pmd(mm, pud, addr, next, phys, prot, alloc);
+			alloc_init_pmd(mm, pud, addr, next, phys, prot,
+				       pgtable_alloc);
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
@@ -247,7 +252,7 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 				    phys_addr_t phys, unsigned long virt,
 				    phys_addr_t size, pgprot_t prot,
-				    void *(*alloc)(unsigned long size))
+				    void *(*pgtable_alloc)(void))
 {
 	unsigned long addr, length, end, next;
 
@@ -265,18 +270,18 @@ static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(mm, pgd, addr, next, phys, prot, alloc);
+		alloc_init_pud(mm, pgd, addr, next, phys, prot, pgtable_alloc);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }
 
-static void *late_alloc(unsigned long size)
+static void *late_pgtable_alloc(void)
 {
-	void *ptr;
-
-	BUG_ON(size > PAGE_SIZE);
-	ptr = (void *)__get_free_page(PGALLOC_GFP);
+	void *ptr = (void *)__get_free_page(PGALLOC_GFP);
 	BUG_ON(!ptr);
+
+	/* Ensure the zeroed page is visible to the page table walker */
+	dsb(ishst);
 	return ptr;
 }
 
@@ -289,7 +294,7 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 		return;
 	}
 	__create_mapping(&init_mm, pgd_offset_k(virt), phys, virt,
-			 size, prot, early_alloc);
+			 size, prot, early_pgtable_alloc);
 }
 
 void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
@@ -297,7 +302,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       pgprot_t prot)
 {
 	__create_mapping(mm, pgd_offset(mm, virt), phys, virt, size, prot,
-				late_alloc);
+				late_pgtable_alloc);
 }
 
 static void create_mapping_late(phys_addr_t phys, unsigned long virt,
@@ -310,7 +315,7 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 	}
 
 	return __create_mapping(&init_mm, pgd_offset_k(virt),
-				phys, virt, size, prot, late_alloc);
+				phys, virt, size, prot, late_pgtable_alloc);
 }
 
 #ifdef CONFIG_DEBUG_RODATA
@@ -458,15 +463,12 @@ void __init paging_init(void)
 	fixup_executable();
 
 	/* allocate the zero page. */
-	zero_page = early_alloc(PAGE_SIZE);
+	zero_page = early_pgtable_alloc();
 
 	bootmem_init();
 
 	empty_zero_page = virt_to_page(zero_page);
 
-	/* Ensure the zero page is visible to the page table walker */
-	dsb(ishst);
-
 	/*
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
 	 * point to zero page to avoid speculatively fetching new entries.
-- 
1.9.1

* [PATCH 2/2] arm64: mm: ensure visibility of page table zeroing
From: Mark Rutland @ 2015-12-14 11:40 UTC
  To: linux-arm-kernel

We allocate pages with PGALLOC_GFP, which implicitly ensures that the
newly allocated tables are zeroed. However, we plumb the newly allocated
tables into (potentially live) tables without an intervening barrier,
and thus a concurrent page table walk may see stale data rather than the
zeroes written by the allocator.

Insert a dsb(ishst) in our run time page table allocators to ensure that
such zeroing is visible.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgalloc.h | 14 +++++++++++---
 arch/arm64/mm/pgd.c              | 10 ++++++++--
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index c150539..b91f6dd 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -21,6 +21,7 @@
 
 #include <asm/pgtable-hwdef.h>
 #include <asm/processor.h>
+#include <asm/barrier.h>
 #include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
 
@@ -33,7 +34,9 @@
 
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-	return (pmd_t *)__get_free_page(PGALLOC_GFP);
+	pmd_t *pmd = (pmd_t *)__get_free_page(PGALLOC_GFP);
+	dsb(ishst);
+	return pmd;
 }
 
 static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
@@ -53,7 +56,9 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-	return (pud_t *)__get_free_page(PGALLOC_GFP);
+	pud_t *pud = (pud_t *)__get_free_page(PGALLOC_GFP);
+	dsb(ishst);
+	return pud;
 }
 
 static inline void pud_free(struct mm_struct *mm, pud_t *pud)
@@ -75,7 +80,9 @@ extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
 static inline pte_t *
 pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr)
 {
-	return (pte_t *)__get_free_page(PGALLOC_GFP);
+	pte_t *pte = (pte_t *)__get_free_page(PGALLOC_GFP);
+	dsb(ishst);
+	return pte;
 }
 
 static inline pgtable_t
@@ -90,6 +97,7 @@ pte_alloc_one(struct mm_struct *mm, unsigned long addr)
 		__free_page(pte);
 		return NULL;
 	}
+	dsb(ishst);
 	return pte;
 }
 
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index cb3ba1b..5420cf3 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -22,6 +22,7 @@
 #include <linux/highmem.h>
 #include <linux/slab.h>
 
+#include <asm/barrier.h>
 #include <asm/pgalloc.h>
 #include <asm/page.h>
 #include <asm/tlbflush.h>
@@ -32,10 +33,15 @@ static struct kmem_cache *pgd_cache;
 
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
+	pgd_t *pgd;
+
 	if (PGD_SIZE == PAGE_SIZE)
-		return (pgd_t *)__get_free_page(PGALLOC_GFP);
+		pgd = (pgd_t *)__get_free_page(PGALLOC_GFP);
 	else
-		return kmem_cache_alloc(pgd_cache, PGALLOC_GFP);
+		pgd = kmem_cache_alloc(pgd_cache, PGALLOC_GFP);
+
+	dsb(ishst);
+	return pgd;
 }
 
 void pgd_free(struct mm_struct *mm, pgd_t *pgd)
-- 
1.9.1
