* [RFC 0/2] CPA improvements
@ 2022-06-14  6:39 Hyeonggon Yoo
  2022-06-14  6:39 ` [RFC 1/2] x86/mm/cpa: always fail when user address is passed Hyeonggon Yoo
  2022-06-14  6:39 ` [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits() Hyeonggon Yoo
  0 siblings, 2 replies; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-14  6:39 UTC (permalink / raw)
  To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm
  Cc: Borislav Petkov, x86, Hyeonggon Yoo

This series contains a couple of improvements to CPA.

The main purpose of this series is to keep pages from losing _PAGE_GLOBAL
after _set_pages_np() and _set_pages_p() are called.

Hyeonggon Yoo (2):
  x86/mm/cpa: always fail when user address is passed
  x86/mm/cpa: drop pgprot_clear_protnone_bits()

 arch/x86/mm/pat/set_memory.c | 30 ++++++------------------------
 1 file changed, 6 insertions(+), 24 deletions(-)

-- 
2.32.0




* [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-14  6:39 [RFC 0/2] CPA improvements Hyeonggon Yoo
@ 2022-06-14  6:39 ` Hyeonggon Yoo
  2022-06-14 17:52   ` Edgecombe, Rick P
                     ` (2 more replies)
  2022-06-14  6:39 ` [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits() Hyeonggon Yoo
  1 sibling, 3 replies; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-14  6:39 UTC (permalink / raw)
  To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm
  Cc: Borislav Petkov, x86, Hyeonggon Yoo

Currently CPA is not used for user mappings (only the pgd of init_mm
or efi_mm is used). For simplicity, always fail when a user address
is passed.

Note that efi_mm uses a 1:1 mapping, so its addresses should not be
considered user addresses.

Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 arch/x86/mm/pat/set_memory.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 1abd5438f126..67cf969fed0d 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -20,6 +20,7 @@
 #include <linux/kernel.h>
 #include <linux/cc_platform.h>
 #include <linux/set_memory.h>
+#include <linux/efi.h>
 
 #include <asm/e820/api.h>
 #include <asm/processor.h>
@@ -1514,6 +1515,11 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
 	pte_t *kpte, old_pte;
 
 	address = __cpa_addr(cpa, cpa->curpage);
+
+	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd : true)
+		  && address <= TASK_SIZE_MAX,
+		 KERN_WARNING "CPA: Got a user address"))
+		return -EINVAL;
 repeat:
 	kpte = _lookup_address_cpa(cpa, address, &level);
 	if (!kpte)
-- 
2.32.0




* [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits()
  2022-06-14  6:39 [RFC 0/2] CPA improvements Hyeonggon Yoo
  2022-06-14  6:39 ` [RFC 1/2] x86/mm/cpa: always fail when user address is passed Hyeonggon Yoo
@ 2022-06-14  6:39 ` Hyeonggon Yoo
  2022-06-14  6:53   ` Hyeonggon Yoo
  1 sibling, 1 reply; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-14  6:39 UTC (permalink / raw)
  To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm
  Cc: Borislav Petkov, x86, Hyeonggon Yoo

Commit a8aed3e0752b4 ("x86/mm/pageattr: Prevent PSE and GLOABL leftovers
to confuse pmd/pte_present and pmd_huge") made CPA clear _PAGE_GLOBAL when
_PAGE_PRESENT is not set. This prevented the kernel from crashing when it
reads a page with !_PAGE_PRESENT and _PAGE_PROTNONE (_PAGE_GLOBAL) set,
and CPA then set _PAGE_GLOBAL back when setting _PAGE_PRESENT again.

After commit d1440b23c922d ("x86/mm: Factor out pageattr _PAGE_GLOBAL
setting") made the kernel stop setting _PAGE_GLOBAL unconditionally, pages
lose the global flag after _set_pages_np() and _set_pages_p() are called.

But after commit 3166851142411 ("x86: skip check for spurious faults for
non-present faults"), spurious_kernel_fault() no longer mistakes pte/pmd
entries with _PAGE_PROTNONE set for present entries. So simply drop
pgprot_clear_protnone_bits().

Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 arch/x86/mm/pat/set_memory.c | 24 ------------------------
 1 file changed, 24 deletions(-)

diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 67cf969fed0d..8a8ce8d78694 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -746,23 +746,6 @@ static void __set_pmd_pte(pte_t *kpte, unsigned long address, pte_t pte)
 #endif
 }
 
-static pgprot_t pgprot_clear_protnone_bits(pgprot_t prot)
-{
-	/*
-	 * _PAGE_GLOBAL means "global page" for present PTEs.
-	 * But, it is also used to indicate _PAGE_PROTNONE
-	 * for non-present PTEs.
-	 *
-	 * This ensures that a _PAGE_GLOBAL PTE going from
-	 * present to non-present is not confused as
-	 * _PAGE_PROTNONE.
-	 */
-	if (!(pgprot_val(prot) & _PAGE_PRESENT))
-		pgprot_val(prot) &= ~_PAGE_GLOBAL;
-
-	return prot;
-}
-
 static int __should_split_large_page(pte_t *kpte, unsigned long address,
 				     struct cpa_data *cpa)
 {
@@ -824,7 +807,6 @@ static int __should_split_large_page(pte_t *kpte, unsigned long address,
 	 * different bit positions in the two formats.
 	 */
 	req_prot = pgprot_4k_2_large(req_prot);
-	req_prot = pgprot_clear_protnone_bits(req_prot);
 	if (pgprot_val(req_prot) & _PAGE_PRESENT)
 		pgprot_val(req_prot) |= _PAGE_PSE;
 
@@ -1013,8 +995,6 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 		return 1;
 	}
 
-	ref_prot = pgprot_clear_protnone_bits(ref_prot);
-
 	/*
 	 * Get the target pfn from the original entry:
 	 */
@@ -1246,8 +1226,6 @@ static void populate_pte(struct cpa_data *cpa,
 
 	pte = pte_offset_kernel(pmd, start);
 
-	pgprot = pgprot_clear_protnone_bits(pgprot);
-
 	while (num_pages-- && start < end) {
 		set_pte(pte, pfn_pte(cpa->pfn, pgprot));
 
@@ -1542,8 +1520,6 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
 		new_prot = static_protections(new_prot, address, pfn, 1, 0,
 					      CPA_PROTECT);
 
-		new_prot = pgprot_clear_protnone_bits(new_prot);
-
 		/*
 		 * We need to keep the pfn from the existing PTE,
 		 * after all we're only going to change it's attributes
-- 
2.32.0




* Re: [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits()
  2022-06-14  6:39 ` [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits() Hyeonggon Yoo
@ 2022-06-14  6:53   ` Hyeonggon Yoo
  2022-06-14 18:23     ` Edgecombe, Rick P
  0 siblings, 1 reply; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-14  6:53 UTC (permalink / raw)
  To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm
  Cc: Borislav Petkov, x86, Andrea Arcangeli

On Tue, Jun 14, 2022 at 03:39:33PM +0900, Hyeonggon Yoo wrote:
> commit a8aed3e0752b4 ("x86/mm/pageattr: Prevent PSE and GLOABL leftovers
> to confuse pmd/pte_present and pmd_huge") made CPA clear _PAGE_GLOBAL when
> _PAGE_PRESENT is not set. This prevents kernel crashing when kernel reads
> a page with !_PAGE_PRESENT and _PAGE_PROTNONE (_PAGE_GLOBAL). And then it
> set _PAGE_GLOBAL back when setting _PAGE_PRESENT again.
> 
> After commit d1440b23c922d ("x86/mm: Factor out pageattr _PAGE_GLOBAL
> setting") made kernel not set unconditionally _PAGE_GLOBAL, pages lose
> global flag after _set_pages_np() and _set_pages_p() are called.
> 
> But after commit 3166851142411 ("x86: skip check for spurious faults for
> non-present faults"), spurious_kernel_fault() does not confuse
> pte/pmd entries with _PAGE_PROTNONE as present anymore. So simply
> drop pgprot_clear_protnone_bits().
 
Looks like I forgot to Cc: Andrea Arcangeli <aarcange@redhat.com>

Plus I did check that the kernel does not crash when reading from/writing
to non-present pages with this patch applied.

> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>  arch/x86/mm/pat/set_memory.c | 24 ------------------------
>  1 file changed, 24 deletions(-)
> 
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index 67cf969fed0d..8a8ce8d78694 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -746,23 +746,6 @@ static void __set_pmd_pte(pte_t *kpte, unsigned long address, pte_t pte)
>  #endif
>  }
>  
> -static pgprot_t pgprot_clear_protnone_bits(pgprot_t prot)
> -{
> -	/*
> -	 * _PAGE_GLOBAL means "global page" for present PTEs.
> -	 * But, it is also used to indicate _PAGE_PROTNONE
> -	 * for non-present PTEs.
> -	 *
> -	 * This ensures that a _PAGE_GLOBAL PTE going from
> -	 * present to non-present is not confused as
> -	 * _PAGE_PROTNONE.
> -	 */
> -	if (!(pgprot_val(prot) & _PAGE_PRESENT))
> -		pgprot_val(prot) &= ~_PAGE_GLOBAL;
> -
> -	return prot;
> -}
> -
>  static int __should_split_large_page(pte_t *kpte, unsigned long address,
>  				     struct cpa_data *cpa)
>  {
> @@ -824,7 +807,6 @@ static int __should_split_large_page(pte_t *kpte, unsigned long address,
>  	 * different bit positions in the two formats.
>  	 */
>  	req_prot = pgprot_4k_2_large(req_prot);
> -	req_prot = pgprot_clear_protnone_bits(req_prot);
>  	if (pgprot_val(req_prot) & _PAGE_PRESENT)
>  		pgprot_val(req_prot) |= _PAGE_PSE;
>  
> @@ -1013,8 +995,6 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
>  		return 1;
>  	}
>  
> -	ref_prot = pgprot_clear_protnone_bits(ref_prot);
> -
>  	/*
>  	 * Get the target pfn from the original entry:
>  	 */
> @@ -1246,8 +1226,6 @@ static void populate_pte(struct cpa_data *cpa,
>  
>  	pte = pte_offset_kernel(pmd, start);
>  
> -	pgprot = pgprot_clear_protnone_bits(pgprot);
> -
>  	while (num_pages-- && start < end) {
>  		set_pte(pte, pfn_pte(cpa->pfn, pgprot));
>  
> @@ -1542,8 +1520,6 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
>  		new_prot = static_protections(new_prot, address, pfn, 1, 0,
>  					      CPA_PROTECT);
>  
> -		new_prot = pgprot_clear_protnone_bits(new_prot);
> -
>  		/*
>  		 * We need to keep the pfn from the existing PTE,
>  		 * after all we're only going to change it's attributes
> -- 
> 2.32.0
> 

-- 
Thanks,
Hyeonggon



* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-14  6:39 ` [RFC 1/2] x86/mm/cpa: always fail when user address is passed Hyeonggon Yoo
@ 2022-06-14 17:52   ` Edgecombe, Rick P
  2022-06-15  3:26     ` Hyeonggon Yoo
  2022-06-14 18:31   ` Dave Hansen
  2022-06-15 13:11   ` Christoph Hellwig
  2 siblings, 1 reply; 18+ messages in thread
From: Edgecombe, Rick P @ 2022-06-14 17:52 UTC (permalink / raw)
  To: peterz, rppt, tglx, linux-mm, dave.hansen, 42.hyeyoo, Williams,
	Dan J, hpa, mingo, Christopherson,,
	Sean, Lutomirski, Andy, pbonzini, Tianyu.Lan, aneesh.kumar, chu,
	jane
  Cc: bp, x86

On Tue, 2022-06-14 at 15:39 +0900, Hyeonggon Yoo wrote:
> Currently CPA is not used for user mappings (only pgd of init_mm
> or and efi_mm is used). For simplicity, always fail when user address
> is passed.
> 
> Note that efi_mm uses 1:1 mapping so its address should not be
> considered as user address.
> 
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>  arch/x86/mm/pat/set_memory.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/mm/pat/set_memory.c
> b/arch/x86/mm/pat/set_memory.c
> index 1abd5438f126..67cf969fed0d 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -20,6 +20,7 @@
>  #include <linux/kernel.h>
>  #include <linux/cc_platform.h>
>  #include <linux/set_memory.h>
> +#include <linux/efi.h>
>  
>  #include <asm/e820/api.h>
>  #include <asm/processor.h>
> @@ -1514,6 +1515,11 @@ static int __change_page_attr(struct cpa_data
> *cpa, int primary)

I guess having it here instead of __change_page_attr_set_clr() will
result in the direct map alias addresses getting checked as well. Since
these are determined inside of CPA, I'm not sure if it's needed as
much.

>  	pte_t *kpte, old_pte;
>  
>  	address = __cpa_addr(cpa, cpa->curpage);
> +
> +	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd :
> true)

Could it be?

(!IS_ENABLED(CONFIG_EFI) || cpa->pgd != efi_mm.pgd)


> +		  && address <= TASK_SIZE_MAX,
> +		 KERN_WARNING "CPA: Got a user address"))
> +		return -EINVAL;
>  repeat:
>  	kpte = _lookup_address_cpa(cpa, address, &level);
>  	if (!kpte)


* Re: [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits()
  2022-06-14  6:53   ` Hyeonggon Yoo
@ 2022-06-14 18:23     ` Edgecombe, Rick P
  2022-06-15  3:47       ` Hyeonggon Yoo
  0 siblings, 1 reply; 18+ messages in thread
From: Edgecombe, Rick P @ 2022-06-14 18:23 UTC (permalink / raw)
  To: peterz, rppt, tglx, linux-mm, dave.hansen, 42.hyeyoo, Williams,
	Dan J, hpa, mingo, Christopherson,,
	Sean, Lutomirski, Andy, pbonzini, Tianyu.Lan, aneesh.kumar, chu,
	jane
  Cc: aarcange, bp, x86

On Tue, 2022-06-14 at 15:53 +0900, Hyeonggon Yoo wrote:
> On Tue, Jun 14, 2022 at 03:39:33PM +0900, Hyeonggon Yoo wrote:
> > commit a8aed3e0752b4 ("x86/mm/pageattr: Prevent PSE and GLOABL
> > leftovers
> > to confuse pmd/pte_present and pmd_huge") made CPA clear
> > _PAGE_GLOBAL when
> > _PAGE_PRESENT is not set. This prevents kernel crashing when kernel
> > reads
> > a page with !_PAGE_PRESENT and _PAGE_PROTNONE (_PAGE_GLOBAL). And
> > then it
> > set _PAGE_GLOBAL back when setting _PAGE_PRESENT again.
> > 
> > After commit d1440b23c922d ("x86/mm: Factor out pageattr
> > _PAGE_GLOBAL
> > setting") made kernel not set unconditionally _PAGE_GLOBAL, pages
> > lose
> > global flag after _set_pages_np() and _set_pages_p() are called.
> > 
> > But after commit 3166851142411 ("x86: skip check for spurious
> > faults for
> > non-present faults"), spurious_kernel_fault() does not confuse
> > pte/pmd entries with _PAGE_PROTNONE as present anymore. So simply
> > drop pgprot_clear_protnone_bits().
> 
>  
> Looks like I forgot to Cc: Andrea Arcangeli <aarcange@redhat.com>
> 
> Plus I did check that kernel does not crash when reading from/writing
> to
> non-present pages with this patch applied.

Thanks for the history.

I think we should still fix pte_present() to not check prot_none if the
user bit is clear. The spurious fault handler infinite loop may no
longer be a problem, but pte_present() would still return true for
kernel NP pages, and so would stay fragile. Today I see at least the
oops message and memory hotunplug (see remove_pagetable()) that would
get confused.
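
For reference, x86's pte_present() is roughly this (from
arch/x86/include/asm/pgtable.h), which is why a non-present kernel PTE
that keeps _PAGE_GLOBAL (aliased as _PAGE_PROTNONE) still reads as
present:

static inline int pte_present(pte_t a)
{
	/* _PAGE_PROTNONE shares its bit position with _PAGE_GLOBAL */
	return pte_flags(a) & (_PAGE_PRESENT | _PAGE_PROTNONE);
}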



* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-14  6:39 ` [RFC 1/2] x86/mm/cpa: always fail when user address is passed Hyeonggon Yoo
  2022-06-14 17:52   ` Edgecombe, Rick P
@ 2022-06-14 18:31   ` Dave Hansen
  2022-06-16  8:49     ` Hyeonggon Yoo
  2022-06-15 13:11   ` Christoph Hellwig
  2 siblings, 1 reply; 18+ messages in thread
From: Dave Hansen @ 2022-06-14 18:31 UTC (permalink / raw)
  To: Hyeonggon Yoo, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Dan Williams,
	Paolo Bonzini, Jane Chu, Aneesh Kumar K . V, Sean Christopherson,
	Tianyu Lan, Mike Rapoport, Rick Edgecombe, linux-mm
  Cc: Borislav Petkov, x86

On 6/13/22 23:39, Hyeonggon Yoo wrote:
> @@ -1514,6 +1515,11 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
>  	pte_t *kpte, old_pte;
>  
>  	address = __cpa_addr(cpa, cpa->curpage);
> +
> +	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd : true)
> +		  && address <= TASK_SIZE_MAX,
> +		 KERN_WARNING "CPA: Got a user address"))
> +		return -EINVAL;

I was expecting this to actually go after _PAGE_USER, not necessarily
userspace addresses themselves.

What does and should happen with the VDSO, for instance?  It's a
_PAGE_USER mapping, but it's >TASK_SIZE.  Should set_page_attr() work on it?



* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-14 17:52   ` Edgecombe, Rick P
@ 2022-06-15  3:26     ` Hyeonggon Yoo
  2022-06-15 18:17       ` Edgecombe, Rick P
  0 siblings, 1 reply; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-15  3:26 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: peterz, rppt, tglx, linux-mm, dave.hansen, Williams, Dan J, hpa,
	mingo, Christopherson,,
	Sean, Lutomirski, Andy, pbonzini, Tianyu.Lan, aneesh.kumar, chu,
	jane, bp, x86

On Tue, Jun 14, 2022 at 05:52:31PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2022-06-14 at 15:39 +0900, Hyeonggon Yoo wrote:
> > Currently CPA is not used for user mappings (only pgd of init_mm
> > or and efi_mm is used). For simplicity, always fail when user address
> > is passed.
> > 
> > Note that efi_mm uses 1:1 mapping so its address should not be
> > considered as user address.
> > 
> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > ---
> >  arch/x86/mm/pat/set_memory.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/arch/x86/mm/pat/set_memory.c
> > b/arch/x86/mm/pat/set_memory.c
> > index 1abd5438f126..67cf969fed0d 100644
> > --- a/arch/x86/mm/pat/set_memory.c
> > +++ b/arch/x86/mm/pat/set_memory.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/kernel.h>
> >  #include <linux/cc_platform.h>
> >  #include <linux/set_memory.h>
> > +#include <linux/efi.h>
> >  
> >  #include <asm/e820/api.h>
> >  #include <asm/processor.h>
> > @@ -1514,6 +1515,11 @@ static int __change_page_attr(struct cpa_data
> > *cpa, int primary)
> 
> I guess having it here instead of __change_page_attr_set_clr() will
> result in the direct map alias addresses getting checked as well. Since
> these are determined inside of CPA, I'm not sure if it's needed as
> much.

It does not check the alias addresses when it fails.
I put it in __change_page_attr() with CPA_ARRAY in mind,
because the range may not be a single contiguous area.

> 
> >  	pte_t *kpte, old_pte;
> >  
> >  	address = __cpa_addr(cpa, cpa->curpage);
> > +
> > +	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd :
> > true)
> 
> Could it be?
> 
> (!IS_ENABLED(CONFIG_EFI) || cpa->pgd != efi_mm.pgd)

Looks better, will update in v2.

Thanks!
Hyeonggon

> 
> > +		  && address <= TASK_SIZE_MAX,
> > +		 KERN_WARNING "CPA: Got a user address"))
> > +		return -EINVAL;
> >  repeat:
> >  	kpte = _lookup_address_cpa(cpa, address, &level);
> >  	if (!kpte)

-- 
Thanks,
Hyeonggon



* Re: [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits()
  2022-06-14 18:23     ` Edgecombe, Rick P
@ 2022-06-15  3:47       ` Hyeonggon Yoo
  2022-06-15 18:18         ` Edgecombe, Rick P
  0 siblings, 1 reply; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-15  3:47 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: peterz, rppt, tglx, linux-mm, dave.hansen, Williams, Dan J, hpa,
	mingo, Christopherson,,
	Sean, Lutomirski, Andy, pbonzini, Tianyu.Lan, aneesh.kumar, chu,
	jane, aarcange, bp, x86

On Tue, Jun 14, 2022 at 06:23:43PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2022-06-14 at 15:53 +0900, Hyeonggon Yoo wrote:
> > On Tue, Jun 14, 2022 at 03:39:33PM +0900, Hyeonggon Yoo wrote:
> > > commit a8aed3e0752b4 ("x86/mm/pageattr: Prevent PSE and GLOABL
> > > leftovers
> > > to confuse pmd/pte_present and pmd_huge") made CPA clear
> > > _PAGE_GLOBAL when
> > > _PAGE_PRESENT is not set. This prevents kernel crashing when kernel
> > > reads
> > > a page with !_PAGE_PRESENT and _PAGE_PROTNONE (_PAGE_GLOBAL). And
> > > then it
> > > set _PAGE_GLOBAL back when setting _PAGE_PRESENT again.
> > > 
> > > After commit d1440b23c922d ("x86/mm: Factor out pageattr
> > > _PAGE_GLOBAL
> > > setting") made kernel not set unconditionally _PAGE_GLOBAL, pages
> > > lose
> > > global flag after _set_pages_np() and _set_pages_p() are called.
> > > 
> > > But after commit 3166851142411 ("x86: skip check for spurious
> > > faults for
> > > non-present faults"), spurious_kernel_fault() does not confuse
> > > pte/pmd entries with _PAGE_PROTNONE as present anymore. So simply
> > > drop pgprot_clear_protnone_bits().
> > 
> >  
> > Looks like I forgot to Cc: Andrea Arcangeli <aarcange@redhat.com>
> > 
> > Plus I did check that kernel does not crash when reading from/writing
> > to
> > non-present pages with this patch applied.
> 
> Thanks for the history.
> 
> I think we should still fix pte_present() to not check prot_none if the
> user bit is clear.

I tried, but realized it wouldn't work :(

For example, when a pte entry is used as a swap entry, _PAGE_PRESENT is
cleared and _PAGE_PROTNONE is set.

And other bits are used for the type and offset of the swap entry.
In that case, the _PAGE_BIT_USER bit does not represent _PAGE_USER;
it is just one of the bits that encode the type of the swap entry.

So honoring _PAGE_PROTNONE only when _PAGE_USER is set
would wrongly treat some swap entries as non-present.

> The spurious fault handler infinite loop may no
> longer be a problem, but pte_present() still would return true for
> kernel NP pages, so be fragile. Today I see at least the oops message
> and memory hotunplug (see remove_pagetable()) that would get confused.

As explained above, I don't think it's possible to make pte_present()
accurate for both kernel and user ptes.

Maybe we can implement pte_present_kernel()/pte_present_user()
for the cases where the kernel knows whether it is a user or a kernel pte,
or pte_present_with_address(pte, address) when we don't know
whether it is a user pte or a kernel pte.
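
Roughly what I have in mind, just as a sketch (assuming kernel ptes
never use _PAGE_PROTNONE):

static inline int pte_present_kernel(pte_t a)
{
	/* kernel ptes never use _PAGE_PROTNONE */
	return pte_flags(a) & _PAGE_PRESENT;
}

static inline int pte_present_user(pte_t a)
{
	/* user ptes may use _PAGE_PROTNONE for PROT_NONE mappings */
	return pte_flags(a) & (_PAGE_PRESENT | _PAGE_PROTNONE);
}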

-- 
Thanks,
Hyeonggon



* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-14  6:39 ` [RFC 1/2] x86/mm/cpa: always fail when user address is passed Hyeonggon Yoo
  2022-06-14 17:52   ` Edgecombe, Rick P
  2022-06-14 18:31   ` Dave Hansen
@ 2022-06-15 13:11   ` Christoph Hellwig
  2022-06-16  8:51     ` Hyeonggon Yoo
  2 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2022-06-15 13:11 UTC (permalink / raw)
  To: Hyeonggon Yoo
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm, Borislav Petkov, x86

On Tue, Jun 14, 2022 at 03:39:32PM +0900, Hyeonggon Yoo wrote:
> +
> +	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd : true)
> +		  && address <= TASK_SIZE_MAX,
> +		 KERN_WARNING "CPA: Got a user address"))
> +		return -EINVAL;

This looks rather unreadable.

Why not something like:

static bool cpa_addr_valid(struct cpa_data *cpa, unsigned long address)
{
	if (address > TASK_SIZE_MAX)
		return true;
	if (IS_ENABLED(CONFIG_EFI) && cpa->pgd == efi_mm.pgd)
		return true;
	return false;
}

....

	if (WARN_ON_ONCE(!cpa_addr_valid(cpa, address)))
		return -EINVAL;




* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-15  3:26     ` Hyeonggon Yoo
@ 2022-06-15 18:17       ` Edgecombe, Rick P
  0 siblings, 0 replies; 18+ messages in thread
From: Edgecombe, Rick P @ 2022-06-15 18:17 UTC (permalink / raw)
  To: 42.hyeyoo
  Cc: peterz, rppt, tglx, linux-mm, dave.hansen, Williams, Dan J, x86,
	hpa, mingo, Christopherson,,
	Sean, Lutomirski, Andy, pbonzini, bp, Tianyu.Lan, aneesh.kumar,
	chu, jane

On Wed, 2022-06-15 at 12:26 +0900, Hyeonggon Yoo wrote:
> On Tue, Jun 14, 2022 at 05:52:31PM +0000, Edgecombe, Rick P wrote:
> > On Tue, 2022-06-14 at 15:39 +0900, Hyeonggon Yoo wrote:
> > > Currently CPA is not used for user mappings (only pgd of init_mm
> > > or and efi_mm is used). For simplicity, always fail when user
> > > address
> > > is passed.
> > > 
> > > Note that efi_mm uses 1:1 mapping so its address should not be
> > > considered as user address.
> > > 
> > > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > > ---
> > >   arch/x86/mm/pat/set_memory.c | 6 ++++++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/arch/x86/mm/pat/set_memory.c
> > > b/arch/x86/mm/pat/set_memory.c
> > > index 1abd5438f126..67cf969fed0d 100644
> > > --- a/arch/x86/mm/pat/set_memory.c
> > > +++ b/arch/x86/mm/pat/set_memory.c
> > > @@ -20,6 +20,7 @@
> > >   #include <linux/kernel.h>
> > >   #include <linux/cc_platform.h>
> > >   #include <linux/set_memory.h>
> > > +#include <linux/efi.h>
> > >   
> > >   #include <asm/e820/api.h>
> > >   #include <asm/processor.h>
> > > @@ -1514,6 +1515,11 @@ static int __change_page_attr(struct
> > > cpa_data
> > > *cpa, int primary)
> > 
> > I guess having it here instead of __change_page_attr_set_clr() will
> > result in the direct map alias addresses getting checked as well.
> > Since
> > these are determined inside of CPA, I'm not sure if it's needed as
> > much.
> 
> It does not check alias address when it failed.
> I put it in __change_page_attr() with CPA_ARRAY in mind.
> Because it may not be a single continuous area.

Makes sense.

> 
> > 
> > >      pte_t *kpte, old_pte;
> > >   
> > >      address = __cpa_addr(cpa, cpa->curpage);
> > > +
> > > +   if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd :
> > > true)
> > 
> > Could it be?
> > 
> > (!IS_ENABLED(CONFIG_EFI) || cpa->pgd != efi_mm.pgd)
> 
> Looks better, will update in v2.

Christoph's seems better to me.


* Re: [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits()
  2022-06-15  3:47       ` Hyeonggon Yoo
@ 2022-06-15 18:18         ` Edgecombe, Rick P
  2022-06-19 12:20           ` Hyeonggon Yoo
  0 siblings, 1 reply; 18+ messages in thread
From: Edgecombe, Rick P @ 2022-06-15 18:18 UTC (permalink / raw)
  To: 42.hyeyoo
  Cc: peterz, rppt, tglx, linux-mm, dave.hansen, Williams, Dan J, x86,
	hpa, aarcange, mingo, Christopherson,,
	Sean, Lutomirski, Andy, pbonzini, bp, Tianyu.Lan, aneesh.kumar,
	chu, jane

On Wed, 2022-06-15 at 12:47 +0900, Hyeonggon Yoo wrote:
> On Tue, Jun 14, 2022 at 06:23:43PM +0000, Edgecombe, Rick P wrote:
> > On Tue, 2022-06-14 at 15:53 +0900, Hyeonggon Yoo wrote:
> > > On Tue, Jun 14, 2022 at 03:39:33PM +0900, Hyeonggon Yoo wrote:
> > > > commit a8aed3e0752b4 ("x86/mm/pageattr: Prevent PSE and GLOABL
> > > > leftovers
> > > > to confuse pmd/pte_present and pmd_huge") made CPA clear
> > > > _PAGE_GLOBAL when
> > > > _PAGE_PRESENT is not set. This prevents kernel crashing when
> > > > kernel
> > > > reads
> > > > a page with !_PAGE_PRESENT and _PAGE_PROTNONE (_PAGE_GLOBAL).
> > > > And
> > > > then it
> > > > set _PAGE_GLOBAL back when setting _PAGE_PRESENT again.
> > > > 
> > > > After commit d1440b23c922d ("x86/mm: Factor out pageattr
> > > > _PAGE_GLOBAL
> > > > setting") made kernel not set unconditionally _PAGE_GLOBAL,
> > > > pages
> > > > lose
> > > > global flag after _set_pages_np() and _set_pages_p() are
> > > > called.
> > > > 
> > > > But after commit 3166851142411 ("x86: skip check for spurious
> > > > faults for
> > > > non-present faults"), spurious_kernel_fault() does not confuse
> > > > pte/pmd entries with _PAGE_PROTNONE as present anymore. So
> > > > simply
> > > > drop pgprot_clear_protnone_bits().
> > > 
> > >  
> > > Looks like I forgot to Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > 
> > > Plus I did check that kernel does not crash when reading
> > > from/writing
> > > to
> > > non-present pages with this patch applied.
> > 
> > Thanks for the history.
> > 
> > I think we should still fix pte_present() to not check prot_none if
> > the
> > user bit is clear.
> 
> I tried, but realized it wouldn't work :(
> 
> For example, when a pte entry is used as swap entry, _PAGE_PRESENT is
> cleared and _PAGE_PROTNONE is set.
> 
> And other bits are used as type and offset of swap entry.
> In that case, _PAGE_BIT_USER bit does not represent _PAGE_USER.
> It is just one of bits that represents type of swap entry.
> 
> So checking if _PAGE_PROTNONE set only when _PAGE_USER is set
> will confuse some swap entries as non-present.

Oooh, right. So the user bit records "when a pagetable is
writeprotected by userfaultfd WP support". I'm not sure whether PCD is
available to move that to, leaving the user bit in place, but it
sounds like an errata-sensitive area to be tweaking.

> 
> > The spurious fault handler infinite loop may no
> > longer be a problem, but pte_present() still would return true for
> > kernel NP pages, so be fragile. Today I see at least the oops
> > message
> > and memory hotunplug (see remove_pagetable()) that would get
> > confused.
> 
> As explained above, I don't think it's possible to make
> pte_present() 
> accurate for both kernel and user ptes.
> 
> Maybe we can implement pte_present_kernel()/pte_present_user()
> for when kernel knows it is user or kernel pte.

This seems like a decent option to me. It seems there are only a few
places that are isolated to arch/x86.

> 
> or pte_present_with_address(pte, address) if we don't
> know it is user pte or kernel pte.
> 


* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-14 18:31   ` Dave Hansen
@ 2022-06-16  8:49     ` Hyeonggon Yoo
  2022-06-16 14:20       ` Dave Hansen
  0 siblings, 1 reply; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-16  8:49 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm, Borislav Petkov, x86

On Tue, Jun 14, 2022 at 11:31:48AM -0700, Dave Hansen wrote:
> On 6/13/22 23:39, Hyeonggon Yoo wrote:
> > @@ -1514,6 +1515,11 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
> >  	pte_t *kpte, old_pte;
> >  
> >  	address = __cpa_addr(cpa, cpa->curpage);
> > +
> > +	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd : true)
> > +		  && address <= TASK_SIZE_MAX,
> > +		 KERN_WARNING "CPA: Got a user address"))
> > +		return -EINVAL;
> 
> I was expecting this to actually go after _PAGE_USER, not necessarily
> userspace addresses themselves.

Userspace ptes may not have _PAGE_USER set (e.g. swap entries), so
I think it's more accurate to go after user addresses.

> What does and should happen with the VDSO, for instance?  It's a
> _PAGE_USER mapping, but it's >TASK_SIZE.

You mean vsyscall? AFAIK the address of the mapped vDSO image is < TASK_SIZE
(or please tell me if I'm wrong).

> Should set_page_attr() work on it?

vsyscall does not need CPA functionality,
so I don't think __change_page_attr() should work on vsyscall.

I think renaming fault_in_kernel_space() and using it would be more robust
than simply using the expression (address <= TASK_SIZE_MAX).
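
For reference, that helper in arch/x86/mm/fault.c looks roughly like
this (note that it also special-cases the vsyscall page):

static int fault_in_kernel_space(unsigned long address)
{
	/*
	 * On 64-bit systems, the vsyscall page is at an address above
	 * TASK_SIZE_MAX, but is not considered part of the kernel
	 * address space.
	 */
	if (IS_ENABLED(CONFIG_X86_64) && is_vsyscall_vaddr(address))
		return false;

	return address >= TASK_SIZE_MAX;
}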

-- 
Thanks,
Hyeonggon



* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-15 13:11   ` Christoph Hellwig
@ 2022-06-16  8:51     ` Hyeonggon Yoo
  0 siblings, 0 replies; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-16  8:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm, Borislav Petkov, x86

On Wed, Jun 15, 2022 at 06:11:03AM -0700, Christoph Hellwig wrote:
> On Tue, Jun 14, 2022 at 03:39:32PM +0900, Hyeonggon Yoo wrote:
> > +
> > +	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd : true)
> > +		  && address <= TASK_SIZE_MAX,
> > +		 KERN_WARNING "CPA: Got a user address"))
> > +		return -EINVAL;
> 
> This looks rather unreadable.
> 
> Why not something like:
> 
> static bool cpa_addr_valid(struct cpa_data *cpa, unsigned long address)
> {
> 	if (address > TASK_SIZE_MAX)
> 		return true;
> 	if (IS_ENABLED(CONFIG_EFI) && cpa->pgd == efi_mm.pgd)
> 		return true;
> 	return false;
> }
> 
> ....
> 
> 	if (WARN_ON_ONCE(!cpa_addr_valid(cpa, address)))
> 		return -EINVAL;
>

Thank you for the feedback. I'll pay more attention to readability in v2.

-- 
Thanks,
Hyeonggon



* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-16  8:49     ` Hyeonggon Yoo
@ 2022-06-16 14:20       ` Dave Hansen
  2022-06-20  8:08         ` Hyeonggon Yoo
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Hansen @ 2022-06-16 14:20 UTC (permalink / raw)
  To: Hyeonggon Yoo
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm, Borislav Petkov, x86

On 6/16/22 01:49, Hyeonggon Yoo wrote:
> On Tue, Jun 14, 2022 at 11:31:48AM -0700, Dave Hansen wrote:
>> On 6/13/22 23:39, Hyeonggon Yoo wrote:
>>> @@ -1514,6 +1515,11 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
>>>  	pte_t *kpte, old_pte;
>>>  
>>>  	address = __cpa_addr(cpa, cpa->curpage);
>>> +
>>> +	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd : true)
>>> +		  && address <= TASK_SIZE_MAX,
>>> +		 KERN_WARNING "CPA: Got a user address"))
>>> +		return -EINVAL;
>>
>> I was expecting this to actually go after _PAGE_USER, not necessarily
>> userspace addresses themselves.
> 
> userspace ptes may not have _PAGE_USER set. (e.g. swap entry)
> I think it's more accurate to go after user addresses.

It would, of course, have to be paired with _PAGE_PRESENT checks.  This
works both on the way in and out of the set_memory code.  It shouldn't
clear other bits on a PTE with _PAGE_PRESENT|_PAGE_USER, and it also
shouldn't *result* in _PAGE_USER|_PAGE_PRESENT PTEs, even if those PTEs
are in the kernel address space.

Filtering on the addresses also makes sense.
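
Something like this, just as a rough sketch (helper name made up):

static bool cpa_pte_user_present(pte_t pte)
{
	pteval_t val = pte_val(pte);

	/* true for PTEs that are both present and user-accessible */
	return (val & (_PAGE_PRESENT | _PAGE_USER)) ==
	       (_PAGE_PRESENT | _PAGE_USER);
}

CPA could then warn and bail out when the existing PTE matches this,
and also make sure the new protections never end up matching it.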

>> What does and should happen with the VDSO, for instance?  It's a
>> _PAGE_USER mapping, but it's >TASK_SIZE.
> 
> you mean vsyscall? AFAIK address of mapped vDSO image is < TASK_SIZE.
> (or please tell me I'm wrong)

You're right.  That was a silly thinko.

>> Should set_page_attr() work on it?
> 
> vsyscall does not need CPA functionalities.
> So I don't think it (__change_page_attr()) should work on vsyscall.

Agreed.



* Re: [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits()
  2022-06-15 18:18         ` Edgecombe, Rick P
@ 2022-06-19 12:20           ` Hyeonggon Yoo
  0 siblings, 0 replies; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-19 12:20 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: peterz, rppt, tglx, linux-mm, dave.hansen, Williams, Dan J, x86,
	hpa, aarcange, mingo, Christopherson,,
	Sean, Lutomirski, Andy, pbonzini, bp, Tianyu.Lan, aneesh.kumar,
	chu, jane

On Wed, Jun 15, 2022 at 06:18:15PM +0000, Edgecombe, Rick P wrote:
> On Wed, 2022-06-15 at 12:47 +0900, Hyeonggon Yoo wrote:
> > On Tue, Jun 14, 2022 at 06:23:43PM +0000, Edgecombe, Rick P wrote:
> > > On Tue, 2022-06-14 at 15:53 +0900, Hyeonggon Yoo wrote:
> > > > On Tue, Jun 14, 2022 at 03:39:33PM +0900, Hyeonggon Yoo wrote:
> > > > > commit a8aed3e0752b4 ("x86/mm/pageattr: Prevent PSE and GLOABL
> > > > > leftovers
> > > > > to confuse pmd/pte_present and pmd_huge") made CPA clear
> > > > > _PAGE_GLOBAL when
> > > > > _PAGE_PRESENT is not set. This prevents kernel crashing when
> > > > > kernel
> > > > > reads
> > > > > a page with !_PAGE_PRESENT and _PAGE_PROTNONE (_PAGE_GLOBAL).
> > > > > And
> > > > > then it
> > > > > set _PAGE_GLOBAL back when setting _PAGE_PRESENT again.
> > > > > 
> > > > > After commit d1440b23c922d ("x86/mm: Factor out pageattr
> > > > > _PAGE_GLOBAL
> > > > > setting") made kernel not set unconditionally _PAGE_GLOBAL,
> > > > > pages
> > > > > lose
> > > > > global flag after _set_pages_np() and _set_pages_p() are
> > > > > called.
> > > > > 
> > > > > But after commit 3166851142411 ("x86: skip check for spurious
> > > > > faults for
> > > > > non-present faults"), spurious_kernel_fault() does not confuse
> > > > > pte/pmd entries with _PAGE_PROTNONE as present anymore. So
> > > > > simply
> > > > > drop pgprot_clear_protnone_bits().
> > > > 
> > > >  
> > > > Looks like I forgot to Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > > 
> > > > Plus I did check that kernel does not crash when reading
> > > > from/writing
> > > > to
> > > > non-present pages with this patch applied.
> > > 
> > > Thanks for the history.
> > > 
> > > I think we should still fix pte_present() to not check prot_none if
> > > the
> > > user bit is clear.
> > 
> > I tried, but realized it wouldn't work :(
> > 
> > For example, when a pte entry is used as swap entry, _PAGE_PRESENT is
> > cleared and _PAGE_PROTNONE is set.
> > 
> > And other bits are used as type and offset of swap entry.
> > In that case, _PAGE_BIT_USER bit does not represent _PAGE_USER.
> > It is just one of bits that represents type of swap entry.
> > 
> > So checking if _PAGE_PROTNONE set only when _PAGE_USER is set
> > will confuse some swap entries as non-present.
> 
> Oooh, right. So the user bit records "when a pagetable is
> writeprotected by userfaultfd WP support". I'm not sure if maybe PCD is
> available to move that to and leave the user bit in place, but it
> sounds like an errata sensitive area to be tweaking.
> 
> > 
> > > The spurious fault handler infinite loop may no
> > > longer be a problem, but pte_present() still would return true for
> > > kernel NP pages, so be fragile. Today I see at least the oops
> > > message
> > > and memory hotunplug (see remove_pagetable()) that would get
> > > confused.
> > 
> > As explained above, I don't think it's possible to make
> > pte_present() 
> > accurate for both kernel and user ptes.
> > 
> > Maybe we can implement pte_present_kernel()/pte_present_user()
> > for when kernel knows it is user or kernel pte.
> 
> This seems like a decent option to me. It seems there are only a few
> places that are isolated to arch/x86.

But there are some places where the kernel does not know whether it's
a kernel pte or a user pte.

For example, show_fault_oops() can be called for both kernel and user
addresses. Is something like this acceptable?

static inline bool pte_present_address(pte_t pte, unsigned long address)
{
	if (address > TASK_SIZE_MAX)	/* kernel address */
		return pte_present_kernel(pte);
	return pte_present_user(pte);
}

> > 
> > or pte_present_with_address(pte, address) if we don't
> > know it is user pte or kernel pte.
> > 

-- 
Thanks,
Hyeonggon



* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-16 14:20       ` Dave Hansen
@ 2022-06-20  8:08         ` Hyeonggon Yoo
  2022-07-07 20:24           ` Dave Hansen
  0 siblings, 1 reply; 18+ messages in thread
From: Hyeonggon Yoo @ 2022-06-20  8:08 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm, Borislav Petkov, x86

On Thu, Jun 16, 2022 at 07:20:09AM -0700, Dave Hansen wrote:
> On 6/16/22 01:49, Hyeonggon Yoo wrote:
> > On Tue, Jun 14, 2022 at 11:31:48AM -0700, Dave Hansen wrote:
> >> On 6/13/22 23:39, Hyeonggon Yoo wrote:
> >>> @@ -1514,6 +1515,11 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
> >>>  	pte_t *kpte, old_pte;
> >>>  
> >>>  	address = __cpa_addr(cpa, cpa->curpage);
> >>> +
> >>> +	if (WARN((IS_ENABLED(CONFIG_EFI) ? cpa->pgd != efi_mm.pgd : true)
> >>> +		  && address <= TASK_SIZE_MAX,
> >>> +		 KERN_WARNING "CPA: Got a user address"))
> >>> +		return -EINVAL;
> >>
> >> I was expecting this to actually go after _PAGE_USER, not necessarily
> >> userspace addresses themselves.
> > 
> > userspace ptes may not have _PAGE_USER set. (e.g. swap entry)
> > I think it's more accurate to go after user addresses.
> 
> It would, of course, have to be paired with _PAGE_PRESENT checks.  This
> works both on the way in and out of the set_memory code.  It shouldn't
> clear other bits a PTE with _PAGE_PRESENT|_PAGE_USER and

You mean nothing should use the set_memory code for PTEs with
_PAGE_USER|_PAGE_PRESENT, but set_memory can still be used to clear
_PAGE_USER|_PAGE_PRESENT?

Can't we just simply deny any PTEs/PMDs with _PAGE_PRESENT|_PAGE_USER?

> also shouldn't
> *result* in _PAGE_USER|_PAGE_PRESENT PTEs, even if those PTEs are in the
> kernel address space.

Makes sense.

> Filtering on the addresses also makes sense.
> 
> >> What does and should happen with the VDSO, for instance?  It's a
> >> _PAGE_USER mapping, but it's >TASK_SIZE.
> > 
> > you mean vsyscall? AFAIK address of mapped vDSO image is < TASK_SIZE.
> > (or please tell me I'm wrong)
> 
> You're right.  That was a silly thinko.
> 
> >> Should set_page_attr() work on it?
> > 
> > vsyscall does not need CPA functionalities.
> > So I don't think it (__change_page_attr()) should work on vsyscall.
> 
> Agreed.

-- 
Thanks,
Hyeonggon



* Re: [RFC 1/2] x86/mm/cpa: always fail when user address is passed
  2022-06-20  8:08         ` Hyeonggon Yoo
@ 2022-07-07 20:24           ` Dave Hansen
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Hansen @ 2022-07-07 20:24 UTC (permalink / raw)
  To: Hyeonggon Yoo
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Dan Williams, Paolo Bonzini,
	Jane Chu, Aneesh Kumar K . V, Sean Christopherson, Tianyu Lan,
	Mike Rapoport, Rick Edgecombe, linux-mm, Borislav Petkov, x86

On 6/20/22 01:08, Hyeonggon Yoo wrote:
>> It would, of course, have to be paired with _PAGE_PRESENT checks.  This
>> works both on the way in and out of the set_memory code.  It shouldn't
>> clear other bits a PTE with _PAGE_PRESENT|_PAGE_USER and
> You mean nothing should not use set_memory code for PTEs with
> _PAGE_USER|_PAGE_PRESENT but set_memory can still be used to clear
> _PAGE_USER|_PAGE_PRESENT?

I don't see a reason why it should be able to clear
_PAGE_USER|_PAGE_PRESENT entries either.




