linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
@ 2019-04-08 19:11 Singh, Brijesh
  2019-04-09  8:40 ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Singh, Brijesh @ 2019-04-08 19:11 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: Singh, Brijesh, Peter Zijlstra, Dave Hansen, Dan Williams,
	Kirill A . Shutemov, Andy Lutomirski, Borislav Petkov,
	H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas

Commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
PTE population") triggers the following warning in an SEV guest.

WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
Call Trace:
 kernel_physical_mapping_init+0xce/0x259
 early_set_memory_enc_dec+0x10f/0x160
 kvm_smp_prepare_boot_cpu+0x71/0x9d
 start_kernel+0x1c9/0x50b
 secondary_startup_64+0xa4/0xb0

An SEV guest calls kernel_physical_mapping_init() to clear the encryption
mask from an existing mapping. While clearing the encryption mask,
kernel_physical_mapping_init() splits large pages into smaller ones. To
split a page, kernel_physical_mapping_init() allocates a new page table
page and updates the existing entry. The set_{pud,pmd}_safe() helpers
trigger a warning when the entry being updated is already present, so the
plain set_{pud,pmd}() must be used when replacing an existing entry with a
new one.

Updating an entry also requires a TLB flush. Currently the caller
(early_set_memory_enc_dec()) takes care of issuing the TLB flushes.
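
For context, the _safe populate helper added by that commit looks roughly
like this (paraphrased sketch, not the verbatim source):

static inline void pmd_populate_kernel_safe(struct mm_struct *mm,
					    pmd_t *pmd, pte_t *pte)
{
	paravirt_alloc_pte(mm, __pa(pte) >> PAGE_SHIFT);
	/* set_pmd_safe() WARNs when *pmd is already present and differs */
	set_pmd_safe(pmd, __pmd(__pa(pte) | _PAGE_TABLE));
}

When an existing large mapping is being split, the PMD entry is present by
definition, so the WARN_ON_ONCE() inside set_pmd_safe() fires.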

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Fixes: 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init() PTE population")
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
---
 arch/x86/mm/init_64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index bccff68e3267..0a26b64a99b9 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -536,7 +536,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 		paddr_last = phys_pte_init(pte, paddr, paddr_end, new_prot);
 
 		spin_lock(&init_mm.page_table_lock);
-		pmd_populate_kernel_safe(&init_mm, pmd, pte);
+		pmd_populate_kernel(&init_mm, pmd, pte);
 		spin_unlock(&init_mm.page_table_lock);
 	}
 	update_page_count(PG_LEVEL_2M, pages);
@@ -623,7 +623,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 					   page_size_mask, prot);
 
 		spin_lock(&init_mm.page_table_lock);
-		pud_populate_safe(&init_mm, pud, pmd);
+		pud_populate(&init_mm, pud, pmd);
 		spin_unlock(&init_mm.page_table_lock);
 	}
 
-- 
2.17.1



* Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
  2019-04-08 19:11 [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page Singh, Brijesh
@ 2019-04-09  8:40 ` Peter Zijlstra
  2019-04-09  9:39   ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2019-04-09  8:40 UTC (permalink / raw)
  To: Singh, Brijesh
  Cc: linux-kernel, x86, Dave Hansen, Dan Williams,
	Kirill A . Shutemov, Andy Lutomirski, Borislav Petkov,
	H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas

On Mon, Apr 08, 2019 at 07:11:21PM +0000, Singh, Brijesh wrote:
> Commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
> PTE population") triggers the following warning in an SEV guest.
> 
> WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
> Call Trace:
>  kernel_physical_mapping_init+0xce/0x259
>  early_set_memory_enc_dec+0x10f/0x160
>  kvm_smp_prepare_boot_cpu+0x71/0x9d
>  start_kernel+0x1c9/0x50b
>  secondary_startup_64+0xa4/0xb0
> 
> An SEV guest calls kernel_physical_mapping_init() to clear the encryption
> mask from an existing mapping. While clearing the encryption mask,
> kernel_physical_mapping_init() splits large pages into smaller ones. To
> split a page, kernel_physical_mapping_init() allocates a new page table
> page and updates the existing entry. The set_{pud,pmd}_safe() helpers
> trigger a warning when the entry being updated is already present, so the
> plain set_{pud,pmd}() must be used when replacing an existing entry with a
> new one.
> 
> Updating an entry also requires a TLB flush. Currently the caller
> (early_set_memory_enc_dec()) takes care of issuing the TLB flushes.

I'm not entirely sure I like this; it means all users of
kernel_physical_mapping_init() now need to be aware and careful.

That said, the alternative is adding an argument to the function,
propagating it through the call chain and dynamically switching between
_safe and not. Which doesn't sound ideal either.

Anybody else got clever ideas?

> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Fixes: 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init() PTE population")
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
> ---
>  arch/x86/mm/init_64.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index bccff68e3267..0a26b64a99b9 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -536,7 +536,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
>  		paddr_last = phys_pte_init(pte, paddr, paddr_end, new_prot);
>  
>  		spin_lock(&init_mm.page_table_lock);
> -		pmd_populate_kernel_safe(&init_mm, pmd, pte);
> +		pmd_populate_kernel(&init_mm, pmd, pte);
>  		spin_unlock(&init_mm.page_table_lock);
>  	}
>  	update_page_count(PG_LEVEL_2M, pages);
> @@ -623,7 +623,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
>  					   page_size_mask, prot);
>  
>  		spin_lock(&init_mm.page_table_lock);
> -		pud_populate_safe(&init_mm, pud, pmd);
> +		pud_populate(&init_mm, pud, pmd);
>  		spin_unlock(&init_mm.page_table_lock);
>  	}
>  
> -- 
> 2.17.1
> 


* Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
  2019-04-09  8:40 ` Peter Zijlstra
@ 2019-04-09  9:39   ` Peter Zijlstra
  2019-04-09 10:20     ` Borislav Petkov
  2019-04-09 19:09     ` Singh, Brijesh
  0 siblings, 2 replies; 5+ messages in thread
From: Peter Zijlstra @ 2019-04-09  9:39 UTC (permalink / raw)
  To: Singh, Brijesh
  Cc: linux-kernel, x86, Dave Hansen, Dan Williams,
	Kirill A . Shutemov, Andy Lutomirski, Borislav Petkov,
	H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas

On Tue, Apr 09, 2019 at 10:40:31AM +0200, Peter Zijlstra wrote:
> On Mon, Apr 08, 2019 at 07:11:21PM +0000, Singh, Brijesh wrote:
> > Commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
> > PTE population") triggers the following warning in an SEV guest.
> > 
> > WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
> > Call Trace:
> >  kernel_physical_mapping_init+0xce/0x259
> >  early_set_memory_enc_dec+0x10f/0x160
> >  kvm_smp_prepare_boot_cpu+0x71/0x9d
> >  start_kernel+0x1c9/0x50b
> >  secondary_startup_64+0xa4/0xb0
> > 
> > An SEV guest calls kernel_physical_mapping_init() to clear the encryption
> > mask from an existing mapping. While clearing the encryption mask,
> > kernel_physical_mapping_init() splits large pages into smaller ones. To
> > split a page, kernel_physical_mapping_init() allocates a new page table
> > page and updates the existing entry. The set_{pud,pmd}_safe() helpers
> > trigger a warning when the entry being updated is already present, so the
> > plain set_{pud,pmd}() must be used when replacing an existing entry with a
> > new one.
> > 
> > Updating an entry also requires a TLB flush. Currently the caller
> > (early_set_memory_enc_dec()) takes care of issuing the TLB flushes.
> 
> I'm not entirely sure I like this; it means all users of
> kernel_physical_mapping_init() now need to be aware and careful.
>
> That said, the alternative is adding an argument to the function,
> propagating it through the call chain and dynamically switching between
> _safe and not. Which doesn't sound ideal either.
> 
> Anybody else got clever ideas?

The more I think about it, the more I think that is in fact the right
thing to do.

Rename kernel_physical_mapping_init() to __kernel_physical_mapping_init(),
add that flag, and thread it down to all the set_{pud,pmd,pte}() thingies.
Then add:

unsigned long kernel_physical_mapping_init(unsigned long paddr_start, unsigned
		long paddr_end, unsigned long page_size_mask)
{
	return __kernel_physical_mapping_init(paddr_start, paddr_end, page_size_mask, true);
}

unsigned long kernel_physical_mapping_change(unsigned long paddr_start, unsigned
		long paddr_end, unsigned long page_size_mask)
{
	unsigned long last;

	last = __kernel_physical_mapping_init(paddr_start, paddr_end, page_size_mask, false);

	__flush_tlb_all();

	return last;
}

Or something along those lines.
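
As a rough illustration (names approximate, not an actual patch), the SEV
path in early_set_memory_enc_dec() would then call the _change variant
instead, e.g.:

	kernel_physical_mapping_change(__pa(vaddr & pmask),
				       __pa((vaddr_end & pmask) + psize),
				       split_page_size_mask);

and its own __flush_tlb_all() would either stay or go depending on where
the flush ends up living (see below).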




* Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
  2019-04-09  9:39   ` Peter Zijlstra
@ 2019-04-09 10:20     ` Borislav Petkov
  2019-04-09 19:09     ` Singh, Brijesh
  1 sibling, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2019-04-09 10:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Singh, Brijesh, linux-kernel, x86, Dave Hansen, Dan Williams,
	Kirill A . Shutemov, Andy Lutomirski, H . Peter Anvin,
	Thomas Gleixner, Lendacky, Thomas

On Tue, Apr 09, 2019 at 11:39:35AM +0200, Peter Zijlstra wrote:
> unsigned long kernel_physical_mapping_change(unsigned long paddr_start, unsigned
> 		long paddr_end, unsigned long page_size_mask)

... and add a comment above it explaining what the "_change" thing is
supposed to mean...

> 	unsigned long last;
> 
> 	last = __kernel_physical_mapping_init(paddr_start, paddr_end, page_size_mask, false);
> 
> 	__flush_tlb_all();

... and maybe not do the flushing here because if you look at
early_set_memory_enc_dec(), it iterates over a bunch of addresses and
when it is done, does the TLB flush once.

If you did it here, then you'd flush after each change. Which is costly.

So maybe the comment above should also say that _change callers are
responsible for flushing the TLB after they're done.
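
In other words, the caller-side pattern being argued for is roughly this
(sketch, names approximate):

	while (vaddr < vaddr_end) {
		/* clear/set the encryption mask for one large-page chunk */
		kernel_physical_mapping_change(__pa(vaddr & pmask),
					       __pa((vaddr_end & pmask) + psize),
					       split_page_size_mask);
		vaddr = (vaddr & pmask) + psize;
	}
	__flush_tlb_all();	/* flush once, after the whole range is done */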

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
  2019-04-09  9:39   ` Peter Zijlstra
  2019-04-09 10:20     ` Borislav Petkov
@ 2019-04-09 19:09     ` Singh, Brijesh
  1 sibling, 0 replies; 5+ messages in thread
From: Singh, Brijesh @ 2019-04-09 19:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Singh, Brijesh, linux-kernel, x86, Dave Hansen, Dan Williams,
	Kirill A . Shutemov, Andy Lutomirski, Borislav Petkov,
	H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas



On 4/9/19 4:39 AM, Peter Zijlstra wrote:
> On Tue, Apr 09, 2019 at 10:40:31AM +0200, Peter Zijlstra wrote:
>> On Mon, Apr 08, 2019 at 07:11:21PM +0000, Singh, Brijesh wrote:
>>> Commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
>>> PTE population") triggers the following warning in an SEV guest.
>>>
>>> WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
>>> Call Trace:
>>>   kernel_physical_mapping_init+0xce/0x259
>>>   early_set_memory_enc_dec+0x10f/0x160
>>>   kvm_smp_prepare_boot_cpu+0x71/0x9d
>>>   start_kernel+0x1c9/0x50b
>>>   secondary_startup_64+0xa4/0xb0
>>>
>>> An SEV guest calls kernel_physical_mapping_init() to clear the encryption
>>> mask from an existing mapping. While clearing the encryption mask,
>>> kernel_physical_mapping_init() splits large pages into smaller ones. To
>>> split a page, kernel_physical_mapping_init() allocates a new page table
>>> page and updates the existing entry. The set_{pud,pmd}_safe() helpers
>>> trigger a warning when the entry being updated is already present, so the
>>> plain set_{pud,pmd}() must be used when replacing an existing entry with a
>>> new one.
>>>
>>> Updating an entry also requires a TLB flush. Currently the caller
>>> (early_set_memory_enc_dec()) takes care of issuing the TLB flushes.
>>
>> I'm not entirely sure I like this; it means all users of
>> kernel_physical_mapping_init() now need to be aware and careful.
>>
>> That said, the alternative is adding an argument to the function,
>> propagating it through the call chain and dynamically switching between
>> _safe and not. Which doesn't sound ideal either.
>>
>> Anybody else got clever ideas?
> 
> The more I think about it, the more I think that is in fact the right
> thing to do.
>
> Rename kernel_physical_mapping_init() to __kernel_physical_mapping_init(),
> add that flag, and thread it down to all the set_{pud,pmd,pte}() thingies.
> Then add:
> 
> unsigned long kernel_physical_mapping_init(unsigned long paddr_start, unsigned
> 		long paddr_end, unsigned long page_size_mask)
> {
> 	return __kernel_physical_mapping_init(paddr_start, paddr_end, page_size_mask, true);
> }
> 
> unsigned long kernel_physical_mapping_change(unsigned long paddr_start, unsigned
> 		long paddr_end, unsigned long page_size_mask)
> {
> 	unsigned long last;
> 
> 	last = __kernel_physical_mapping_init(paddr_start, paddr_end, page_size_mask, false);
> 
> 	__flush_tlb_all();
> 
> 	return last;
> }
> 
> Or something along those lines.


Thanks for the comment. Regarding changing all the set_{pud,pmd,pte}
thingies, I can go with one of the two options below; please let me know
which is preferred.

1) Update all the functions to check for the flag and use the safe vs
non-safe variants accordingly. Something similar to this:

    if (use_safe)
       set_pmd_safe(...)
    else
       set_pmd(...)

    ....
    if (use_safe)
       pmd_populate_kernel_safe(...)
    else
       pmd_populate_kernel(...)

There are multiple places where we need to add the above check.

2) Define new __-prefixed variants of set_{pud,pmd,pte} and the populate
helpers, and update the callers to use them. Something like this:

#define DEFINE_POPULATE(fname, type1, type2, safe)		\
static inline void __##fname(struct mm_struct *mm,		\
		type1##_t *arg1, type2##_t *arg2, bool safe)	\
{								\
	if (safe)						\
		fname##_safe(mm, arg1, arg2);			\
	else							\
		fname(mm, arg1, arg2);				\
}

DEFINE_POPULATE(p4d_populate, p4d, pud, safe)
DEFINE_POPULATE(pgd_populate, pgd, p4d, safe)
DEFINE_POPULATE(pud_populate, pud, pmd, safe)
DEFINE_POPULATE(pmd_populate_kernel, pmd, pte, safe)
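
With that in place, a call site like the one in phys_pud_init() would
become roughly (sketch):

	spin_lock(&init_mm.page_table_lock);
	__pud_populate(&init_mm, pud, pmd, safe);
	spin_unlock(&init_mm.page_table_lock);

where "safe" is the flag threaded down from
__kernel_physical_mapping_init().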

I prefer #2 because it still keeps the code readable and minimizes the
changes in the patch file.

thanks


