linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/mm/64: Do not dereference non-present PGD entries
@ 2020-08-07  8:40 Joerg Roedel
  2020-08-07  9:52 ` Jason A. Donenfeld
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Joerg Roedel @ 2020-08-07  8:40 UTC (permalink / raw)
  To: x86
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, hpa, Mike Rapoport,
	Joerg Roedel, Linus Torvalds, Jason, Andrew Morton, linux-kernel,
	joro

From: Joerg Roedel <jroedel@suse.de>

The code for preallocate_vmalloc_pages() was written under the
assumption that the p4d_offset() and pud_offset() functions will perform
present checks before dereferencing the parent entries.

This assumption is wrong an leads to a bug in the code which causes the
physical address found in the PGD be used as a page-table page, even if
the PGD is not present.

So the code flow currently is:

	pgd = pgd_offset_k(addr);
	p4d = p4d_offset(pgd, addr);
	if (p4d_none(*p4d))
		p4d = p4d_alloc(&init_mm, pgd, addr);

This lacks a check for pgd_none() at least, the correct flow would be:

	pgd = pgd_offset_k(addr);
	if (pgd_none(*pgd))
		p4d = p4d_alloc(&init_mm, pgd, addr);
	else
		p4d = p4d_offset(pgd, addr);

But this is the same flow that the p4d_alloc() and the pud_alloc()
functions use internally, so there is no need to duplicate them.

Remove the p?d_none() checks from the function and just call into
p4d_alloc() and pud_alloc() to correctly pre-allocate the PGD entries.

Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Fixes: 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/mm/init_64.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3f4e29a78f2b..449e071240e1 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1253,28 +1253,23 @@ static void __init preallocate_vmalloc_pages(void)
 		p4d_t *p4d;
 		pud_t *pud;
 
-		p4d = p4d_offset(pgd, addr);
-		if (p4d_none(*p4d)) {
-			/* Can only happen with 5-level paging */
-			p4d = p4d_alloc(&init_mm, pgd, addr);
-			if (!p4d) {
-				lvl = "p4d";
-				goto failed;
-			}
-		}
+		lvl = "p4d";
+		p4d = p4d_alloc(&init_mm, pgd, addr);
+		if (!p4d)
+			goto failed;
 
+		/*
+		 * With 5-level paging the P4D level is not folded. So the PGDs
+		 * are now populated and there is no need to walk down to the
+		 * PUD level.
+		 */
 		if (pgtable_l5_enabled())
 			continue;
 
-		pud = pud_offset(p4d, addr);
-		if (pud_none(*pud)) {
-			/* Ends up here only with 4-level paging */
-			pud = pud_alloc(&init_mm, p4d, addr);
-			if (!pud) {
-				lvl = "pud";
-				goto failed;
-			}
-		}
+		lvl = "pud";
+		pud = pud_alloc(&init_mm, p4d, addr);
+		if (!pud)
+			goto failed;
 	}
 
 	return;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
  2020-08-07  8:40 [PATCH] x86/mm/64: Do not dereference non-present PGD entries Joerg Roedel
@ 2020-08-07  9:52 ` Jason A. Donenfeld
  2020-08-07 10:07 ` Mike Rapoport
  2020-08-10 14:27 ` Dave Hansen
  2 siblings, 0 replies; 6+ messages in thread
From: Jason A. Donenfeld @ 2020-08-07  9:52 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: X86 ML, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	H . Peter Anvin, Mike Rapoport, Joerg Roedel, Linus Torvalds,
	Andrew Morton, LKML

On Fri, Aug 7, 2020 at 10:40 AM Joerg Roedel <joro@8bytes.org> wrote:
>
> From: Joerg Roedel <jroedel@suse.de>
>
> The code for preallocate_vmalloc_pages() was written under the
> assumption that the p4d_offset() and pud_offset() functions will perform
> present checks before dereferencing the parent entries.
>
> This assumption is wrong an leads to a bug in the code which causes the
> physical address found in the PGD be used as a page-table page, even if
> the PGD is not present.
>
> So the code flow currently is:
>
>         pgd = pgd_offset_k(addr);
>         p4d = p4d_offset(pgd, addr);
>         if (p4d_none(*p4d))
>                 p4d = p4d_alloc(&init_mm, pgd, addr);
>
> This lacks a check for pgd_none() at least, the correct flow would be:
>
>         pgd = pgd_offset_k(addr);
>         if (pgd_none(*pgd))
>                 p4d = p4d_alloc(&init_mm, pgd, addr);
>         else
>                 p4d = p4d_offset(pgd, addr);
>
> But this is the same flow that the p4d_alloc() and the pud_alloc()
> functions use internally, so there is no need to duplicate them.
>
> Remove the p?d_none() checks from the function and just call into
> p4d_alloc() and pud_alloc() to correctly pre-allocate the PGD entries.
>
> Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
> Fixes: 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area")
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  arch/x86/mm/init_64.c | 31 +++++++++++++------------------
>  1 file changed, 13 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 3f4e29a78f2b..449e071240e1 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1253,28 +1253,23 @@ static void __init preallocate_vmalloc_pages(void)
>                 p4d_t *p4d;
>                 pud_t *pud;
>
> -               p4d = p4d_offset(pgd, addr);
> -               if (p4d_none(*p4d)) {
> -                       /* Can only happen with 5-level paging */
> -                       p4d = p4d_alloc(&init_mm, pgd, addr);
> -                       if (!p4d) {
> -                               lvl = "p4d";
> -                               goto failed;
> -                       }
> -               }
> +               lvl = "p4d";
> +               p4d = p4d_alloc(&init_mm, pgd, addr);
> +               if (!p4d)
> +                       goto failed;
>
> +               /*
> +                * With 5-level paging the P4D level is not folded. So the PGDs
> +                * are now populated and there is no need to walk down to the
> +                * PUD level.
> +                */
>                 if (pgtable_l5_enabled())
>                         continue;
>
> -               pud = pud_offset(p4d, addr);
> -               if (pud_none(*pud)) {
> -                       /* Ends up here only with 4-level paging */
> -                       pud = pud_alloc(&init_mm, p4d, addr);
> -                       if (!pud) {
> -                               lvl = "pud";
> -                               goto failed;
> -                       }
> -               }
> +               lvl = "pud";
> +               pud = pud_alloc(&init_mm, p4d, addr);
> +               if (!pud)
> +                       goto failed;
>         }
>
>         return;
> --
> 2.26.2


This appears to fix the issue, so:

Tested-by: Jason A. Donenfeld <Jason@zx2c4.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
  2020-08-07  8:40 [PATCH] x86/mm/64: Do not dereference non-present PGD entries Joerg Roedel
  2020-08-07  9:52 ` Jason A. Donenfeld
@ 2020-08-07 10:07 ` Mike Rapoport
  2020-08-10 14:27 ` Dave Hansen
  2 siblings, 0 replies; 6+ messages in thread
From: Mike Rapoport @ 2020-08-07 10:07 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: x86, Dave Hansen, Andy Lutomirski, Peter Zijlstra, hpa,
	Joerg Roedel, Linus Torvalds, Jason, Andrew Morton, linux-kernel

On Fri, Aug 07, 2020 at 10:40:13AM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> The code for preallocate_vmalloc_pages() was written under the
> assumption that the p4d_offset() and pud_offset() functions will perform
> present checks before dereferencing the parent entries.
> 
> This assumption is wrong an leads to a bug in the code which causes the
> physical address found in the PGD be used as a page-table page, even if
> the PGD is not present.
> 
> So the code flow currently is:
> 
> 	pgd = pgd_offset_k(addr);
> 	p4d = p4d_offset(pgd, addr);
> 	if (p4d_none(*p4d))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
> 
> This lacks a check for pgd_none() at least, the correct flow would be:
> 
> 	pgd = pgd_offset_k(addr);
> 	if (pgd_none(*pgd))
> 		p4d = p4d_alloc(&init_mm, pgd, addr);
> 	else
> 		p4d = p4d_offset(pgd, addr);
> 
> But this is the same flow that the p4d_alloc() and the pud_alloc()
> functions use internally, so there is no need to duplicate them.
> 
> Remove the p?d_none() checks from the function and just call into
> p4d_alloc() and pud_alloc() to correctly pre-allocate the PGD entries.
> 
> Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
> Fixes: 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area")
> Signed-off-by: Joerg Roedel <jroedel@suse.de>

LGTM,

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  arch/x86/mm/init_64.c | 31 +++++++++++++------------------
>  1 file changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 3f4e29a78f2b..449e071240e1 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1253,28 +1253,23 @@ static void __init preallocate_vmalloc_pages(void)
>  		p4d_t *p4d;
>  		pud_t *pud;
>  
> -		p4d = p4d_offset(pgd, addr);
> -		if (p4d_none(*p4d)) {
> -			/* Can only happen with 5-level paging */
> -			p4d = p4d_alloc(&init_mm, pgd, addr);
> -			if (!p4d) {
> -				lvl = "p4d";
> -				goto failed;
> -			}
> -		}
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
>  
> +		/*
> +		 * With 5-level paging the P4D level is not folded. So the PGDs
> +		 * are now populated and there is no need to walk down to the
> +		 * PUD level.
> +		 */
>  		if (pgtable_l5_enabled())
>  			continue;
>  
> -		pud = pud_offset(p4d, addr);
> -		if (pud_none(*pud)) {
> -			/* Ends up here only with 4-level paging */
> -			pud = pud_alloc(&init_mm, p4d, addr);
> -			if (!pud) {
> -				lvl = "pud";
> -				goto failed;
> -			}
> -		}
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
>  	}
>  
>  	return;
> -- 
> 2.26.2
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
  2020-08-07  8:40 [PATCH] x86/mm/64: Do not dereference non-present PGD entries Joerg Roedel
  2020-08-07  9:52 ` Jason A. Donenfeld
  2020-08-07 10:07 ` Mike Rapoport
@ 2020-08-10 14:27 ` Dave Hansen
  2020-08-10 15:53   ` Mike Rapoport
  2 siblings, 1 reply; 6+ messages in thread
From: Dave Hansen @ 2020-08-10 14:27 UTC (permalink / raw)
  To: Joerg Roedel, x86
  Cc: Andy Lutomirski, Peter Zijlstra, hpa, Mike Rapoport,
	Joerg Roedel, Linus Torvalds, Jason, Andrew Morton, linux-kernel,
	Shutemov, Kirill

... adding Kirill

On 8/7/20 1:40 AM, Joerg Roedel wrote:
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
>  
> +		/*
> +		 * With 5-level paging the P4D level is not folded. So the PGDs
> +		 * are now populated and there is no need to walk down to the
> +		 * PUD level.
> +		 */
>  		if (pgtable_l5_enabled())
>  			continue;

It's early and I'm a coffee or two short of awake, but I had to stare at
the comment for a but to make sense of it.

It feels wrong, I think, because the 5-level code usually ends up doing
*more* allocations and in this case, it is _appearing_ to do fewer.
Would something like this make sense?

		/*
		 * The goal here is to allocate all possibly required
		 * hardware page tables pointed to by the top hardware
		 * level.
		 *
		 * On 4-level systems, the p4d layer is folded away and
		 * the above code does no preallocation.  Below, go down
		 * to the pud _software_ level to ensure the second
		 * hardware level is allocated.
		 */


> -		pud = pud_offset(p4d, addr);
> -		if (pud_none(*pud)) {
> -			/* Ends up here only with 4-level paging */
> -			pud = pud_alloc(&init_mm, p4d, addr);
> -			if (!pud) {
> -				lvl = "pud";
> -				goto failed;
> -			}
> -		}
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
>  	}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
  2020-08-10 14:27 ` Dave Hansen
@ 2020-08-10 15:53   ` Mike Rapoport
  2020-08-13 19:21     ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Rapoport @ 2020-08-10 15:53 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Joerg Roedel, x86, Andy Lutomirski, Peter Zijlstra, hpa,
	Joerg Roedel, Linus Torvalds, Jason, Andrew Morton, linux-kernel,
	Shutemov, Kirill

On Mon, Aug 10, 2020 at 07:27:33AM -0700, Dave Hansen wrote:
> ... adding Kirill
> 
> On 8/7/20 1:40 AM, Joerg Roedel wrote:
> > +		lvl = "p4d";
> > +		p4d = p4d_alloc(&init_mm, pgd, addr);
> > +		if (!p4d)
> > +			goto failed;
> >  
> > +		/*
> > +		 * With 5-level paging the P4D level is not folded. So the PGDs
> > +		 * are now populated and there is no need to walk down to the
> > +		 * PUD level.
> > +		 */
> >  		if (pgtable_l5_enabled())
> >  			continue;
> 
> It's early and I'm a coffee or two short of awake, but I had to stare at
> the comment for a but to make sense of it.
> 
> It feels wrong, I think, because the 5-level code usually ends up doing
> *more* allocations and in this case, it is _appearing_ to do fewer.
> Would something like this make sense?

Unless I miss something, with 5 levels vmalloc mappings are shared at
p4d level, so allocating a p4d page would be enough. With 4 levels,
p4d_alloc() is a nop and pud is the first actually populated level below
pgd.

> 		/*
> 		 * The goal here is to allocate all possibly required
> 		 * hardware page tables pointed to by the top hardware
> 		 * level.
> 		 *
> 		 * On 4-level systems, the p4d layer is folded away and
> 		 * the above code does no preallocation.  Below, go down
> 		 * to the pud _software_ level to ensure the second
> 		 * hardware level is allocated.
> 		 */
> 
> 
> > -		pud = pud_offset(p4d, addr);
> > -		if (pud_none(*pud)) {
> > -			/* Ends up here only with 4-level paging */
> > -			pud = pud_alloc(&init_mm, p4d, addr);
> > -			if (!pud) {
> > -				lvl = "pud";
> > -				goto failed;
> > -			}
> > -		}
> > +		lvl = "pud";
> > +		pud = pud_alloc(&init_mm, p4d, addr);
> > +		if (!pud)
> > +			goto failed;
> >  	}

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/64: Do not dereference non-present PGD entries
  2020-08-10 15:53   ` Mike Rapoport
@ 2020-08-13 19:21     ` Ingo Molnar
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2020-08-13 19:21 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Dave Hansen, Joerg Roedel, x86, Andy Lutomirski, Peter Zijlstra,
	hpa, Joerg Roedel, Linus Torvalds, Jason, Andrew Morton,
	linux-kernel, Shutemov, Kirill


* Mike Rapoport <rppt@linux.ibm.com> wrote:

> On Mon, Aug 10, 2020 at 07:27:33AM -0700, Dave Hansen wrote:
> > ... adding Kirill
> > 
> > On 8/7/20 1:40 AM, Joerg Roedel wrote:
> > > +		lvl = "p4d";
> > > +		p4d = p4d_alloc(&init_mm, pgd, addr);
> > > +		if (!p4d)
> > > +			goto failed;
> > >  
> > > +		/*
> > > +		 * With 5-level paging the P4D level is not folded. So the PGDs
> > > +		 * are now populated and there is no need to walk down to the
> > > +		 * PUD level.
> > > +		 */
> > >  		if (pgtable_l5_enabled())
> > >  			continue;
> > 
> > It's early and I'm a coffee or two short of awake, but I had to stare at
> > the comment for a but to make sense of it.
> > 
> > It feels wrong, I think, because the 5-level code usually ends up doing
> > *more* allocations and in this case, it is _appearing_ to do fewer.
> > Would something like this make sense?
> 
> Unless I miss something, with 5 levels vmalloc mappings are shared at
> p4d level, so allocating a p4d page would be enough. With 4 levels,
> p4d_alloc() is a nop and pud is the first actually populated level below
> pgd.
> 
> > 		/*
> > 		 * The goal here is to allocate all possibly required
> > 		 * hardware page tables pointed to by the top hardware
> > 		 * level.
> > 		 *
> > 		 * On 4-level systems, the p4d layer is folded away and
> > 		 * the above code does no preallocation.  Below, go down
> > 		 * to the pud _software_ level to ensure the second
> > 		 * hardware level is allocated.
> > 		 */

Would be nice to integrate all these explanations into the comment itself?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-08-13 19:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-07  8:40 [PATCH] x86/mm/64: Do not dereference non-present PGD entries Joerg Roedel
2020-08-07  9:52 ` Jason A. Donenfeld
2020-08-07 10:07 ` Mike Rapoport
2020-08-10 14:27 ` Dave Hansen
2020-08-10 15:53   ` Mike Rapoport
2020-08-13 19:21     ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).