linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations
@ 2021-11-11 11:02 Mike Rapoport
  2021-11-11 11:02 ` [PATCH 1/4] x86/mm: make init_trampoline_kaslr() __init Mike Rapoport
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Mike Rapoport @ 2021-11-11 11:02 UTC (permalink / raw)
  To: x86
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Ingo Molnar, Mike Rapoport, Mike Rapoport, Peter Zijlstra,
	Rick Edgecombe, Thomas Gleixner, linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

The direct map pages on x86 that are allocated using alloc_low_pages() and
spp_getpage() functions. When these functions take 'after_bootmem' branch,
the memory is allocated from buddy with GFP_ATOMIC and I could not find any
reason for this.

Since most of the kernel page tables are anyway allocated really early,
only GART IOMMU initialization and memory hotplug would actually use
get_free_pages() to allocate direct map entries and neither of them happen
in an atomic context, so it would be fine to use GFP_KERNEL. This will give
the page allocator more flexibility when memory hotplug creates the direct
map for hot-added memory and won't use precious atomic memory resources.

The first three patches are trivial cleanups I've encountered while
analysing call paths of alloc_low_pages() and spp_getpage() and the fourth
patch actually replaces GFP_ATOMIC with GFP_KERNEL in those functions.

Mike Rapoport (4):
  x86/mm: make init_trampoline_kaslr() __init
  x86/mm: make kernel_physical_mapping_change() __init
  x86/mm: init_64: make set_pte_vaddr_p4d static
  x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations

 arch/x86/include/asm/pgtable_64.h | 1 -
 arch/x86/mm/init.c                | 2 +-
 arch/x86/mm/init_64.c             | 7 ++++---
 arch/x86/mm/kaslr.c               | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)


base-commit: d2f38a3c6507b2520101f9a3807ed98f1bdc545a
-- 
2.28.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] x86/mm: make init_trampoline_kaslr() __init
  2021-11-11 11:02 [PATCH 0/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
@ 2021-11-11 11:02 ` Mike Rapoport
  2021-11-11 20:46   ` Edgecombe, Rick P
  2021-11-11 11:02 ` [PATCH 2/4] x86/mm: make kernel_physical_mapping_change() __init Mike Rapoport
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Mike Rapoport @ 2021-11-11 11:02 UTC (permalink / raw)
  To: x86
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Ingo Molnar, Mike Rapoport, Mike Rapoport, Peter Zijlstra,
	Rick Edgecombe, Thomas Gleixner, linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

init_trampoline_kaslr() is __meminit although it is only called by __init
function init_trampoline().

Make init_trampoline_kaslr() __init.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/x86/mm/kaslr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 557f0fe25dff..d40bb4feeda3 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -138,7 +138,7 @@ void __init kernel_randomize_memory(void)
 	}
 }
 
-void __meminit init_trampoline_kaslr(void)
+void __init init_trampoline_kaslr(void)
 {
 	pud_t *pud_page_tramp, *pud, *pud_tramp;
 	p4d_t *p4d_page_tramp, *p4d, *p4d_tramp;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/4] x86/mm: make kernel_physical_mapping_change() __init
  2021-11-11 11:02 [PATCH 0/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
  2021-11-11 11:02 ` [PATCH 1/4] x86/mm: make init_trampoline_kaslr() __init Mike Rapoport
@ 2021-11-11 11:02 ` Mike Rapoport
  2021-11-11 20:47   ` Edgecombe, Rick P
  2021-11-11 11:02 ` [PATCH 3/4] x86/mm: init_64: make set_pte_vaddr_p4d static Mike Rapoport
  2021-11-11 11:02 ` [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
  3 siblings, 1 reply; 13+ messages in thread
From: Mike Rapoport @ 2021-11-11 11:02 UTC (permalink / raw)
  To: x86
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Ingo Molnar, Mike Rapoport, Mike Rapoport, Peter Zijlstra,
	Rick Edgecombe, Thomas Gleixner, linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

kernel_physical_mapping_change() is __meminit although it is only called by
__init function early_set_memory_enc_dec().

Make kernel_physical_mapping_change() __init.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/x86/mm/init_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 36098226a957..9a24532d2b85 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -794,7 +794,7 @@ kernel_physical_mapping_init(unsigned long paddr_start,
  * when updating the mapping. The caller is responsible to flush the TLBs after
  * the function returns.
  */
-unsigned long __meminit
+unsigned long __init
 kernel_physical_mapping_change(unsigned long paddr_start,
 			       unsigned long paddr_end,
 			       unsigned long page_size_mask)
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/4] x86/mm: init_64: make set_pte_vaddr_p4d static
  2021-11-11 11:02 [PATCH 0/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
  2021-11-11 11:02 ` [PATCH 1/4] x86/mm: make init_trampoline_kaslr() __init Mike Rapoport
  2021-11-11 11:02 ` [PATCH 2/4] x86/mm: make kernel_physical_mapping_change() __init Mike Rapoport
@ 2021-11-11 11:02 ` Mike Rapoport
  2021-11-11 20:48   ` Edgecombe, Rick P
  2021-11-11 11:02 ` [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
  3 siblings, 1 reply; 13+ messages in thread
From: Mike Rapoport @ 2021-11-11 11:02 UTC (permalink / raw)
  To: x86
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Ingo Molnar, Mike Rapoport, Mike Rapoport, Peter Zijlstra,
	Rick Edgecombe, Thomas Gleixner, linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

The function set_pte_vaddr_p4d() is never called outside the file
arch/x86/mm/init_64.c.

Make it static.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/x86/include/asm/pgtable_64.h | 1 -
 arch/x86/mm/init_64.c             | 3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 56d0399a0cd1..9f4b6d48ce2b 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -59,7 +59,6 @@ static inline bool mm_p4d_folded(struct mm_struct *mm)
 	return !pgtable_l5_enabled();
 }
 
-void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t new_pte);
 void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte);
 
 static inline void native_set_pte(pte_t *ptep, pte_t pte)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 9a24532d2b85..e46d2f18d895 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -302,7 +302,8 @@ static void __set_pte_vaddr(pud_t *pud, unsigned long vaddr, pte_t new_pte)
 	flush_tlb_one_kernel(vaddr);
 }
 
-void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t new_pte)
+static void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr,
+			      pte_t new_pte)
 {
 	p4d_t *p4d = p4d_page + p4d_index(vaddr);
 	pud_t *pud = fill_pud(p4d, vaddr);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations
  2021-11-11 11:02 [PATCH 0/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
                   ` (2 preceding siblings ...)
  2021-11-11 11:02 ` [PATCH 3/4] x86/mm: init_64: make set_pte_vaddr_p4d static Mike Rapoport
@ 2021-11-11 11:02 ` Mike Rapoport
  2021-11-11 15:19   ` Dave Hansen
  2021-11-11 21:35   ` Edgecombe, Rick P
  3 siblings, 2 replies; 13+ messages in thread
From: Mike Rapoport @ 2021-11-11 11:02 UTC (permalink / raw)
  To: x86
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Ingo Molnar, Mike Rapoport, Mike Rapoport, Peter Zijlstra,
	Rick Edgecombe, Thomas Gleixner, linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

The allocations of the direct map pages are mostly happen very early during
the system boot and they use either the page table cache in brk area of bss
or memblock.

The few callers that effectively use page allocator for the direct map
updates are gart_iommu_init() and memory hotplug. Neither of them happen in
an atomic context so there is no reason to use GFP_ATOMIC for these
allocations.

Replace GFP_ATOMIC with GFP_KERNEL to avoid using atomic reserves for
allocations that do not require that.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/x86/mm/init.c    | 2 +-
 arch/x86/mm/init_64.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 1895986842b9..c01f144e0015 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -120,7 +120,7 @@ __ref void *alloc_low_pages(unsigned int num)
 		unsigned int order;
 
 		order = get_order((unsigned long)num << PAGE_SHIFT);
-		return (void *)__get_free_pages(GFP_ATOMIC | __GFP_ZERO, order);
+		return (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
 	}
 
 	if ((pgt_buf_end + num) > pgt_buf_top || !can_use_brk_pgt) {
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index e46d2f18d895..f3924f1a953d 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -227,7 +227,7 @@ static __ref void *spp_getpage(void)
 	void *ptr;
 
 	if (after_bootmem)
-		ptr = (void *) get_zeroed_page(GFP_ATOMIC);
+		ptr = (void *) get_zeroed_page(GFP_KERNEL);
 	else
 		ptr = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
 
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations
  2021-11-11 11:02 ` [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
@ 2021-11-11 15:19   ` Dave Hansen
  2021-11-11 15:57     ` Mike Rapoport
  2021-11-11 21:35   ` Edgecombe, Rick P
  1 sibling, 1 reply; 13+ messages in thread
From: Dave Hansen @ 2021-11-11 15:19 UTC (permalink / raw)
  To: Mike Rapoport, x86
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Ingo Molnar, Mike Rapoport, Peter Zijlstra, Rick Edgecombe,
	Thomas Gleixner, linux-kernel

On 11/11/21 3:02 AM, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The allocations of the direct map pages are mostly happen very early during
> the system boot and they use either the page table cache in brk area of bss
> or memblock.
> 
> The few callers that effectively use page allocator for the direct map
> updates are gart_iommu_init() and memory hotplug. Neither of them happen in
> an atomic context so there is no reason to use GFP_ATOMIC for these
> allocations.
> 
> Replace GFP_ATOMIC with GFP_KERNEL to avoid using atomic reserves for
> allocations that do not require that.

I usually think of the biggest downside of GFP_ATOMIC as being that it
fails more often.  But, since we tend not to be low on memory in early
boot, GFP_ATOMIC and GFP_KERNEL end up being pretty close in actual
behavior.

These allocations also get exposed via init_extra_mapping_*().  But,
those are used via early_initcall()s where GFP_KERNEL is fine too.
Those are a bit worrying because they're in somewhat nice code, like the
Numascale APIC code.  I'm not sure how much use it sees these days.

I guess if this goes wrong somehow, we'll get some nice splats to tell
us what happened.

Was this motivated by anything in particular?  Or is it a pure cleanup?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations
  2021-11-11 15:19   ` Dave Hansen
@ 2021-11-11 15:57     ` Mike Rapoport
  0 siblings, 0 replies; 13+ messages in thread
From: Mike Rapoport @ 2021-11-11 15:57 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Mike Rapoport, Peter Zijlstra,
	Rick Edgecombe, Thomas Gleixner, linux-kernel

On Thu, Nov 11, 2021 at 07:19:40AM -0800, Dave Hansen wrote:
> On 11/11/21 3:02 AM, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > The allocations of the direct map pages are mostly happen very early during
> > the system boot and they use either the page table cache in brk area of bss
> > or memblock.
> > 
> > The few callers that effectively use page allocator for the direct map
> > updates are gart_iommu_init() and memory hotplug. Neither of them happen in
> > an atomic context so there is no reason to use GFP_ATOMIC for these
> > allocations.
> > 
> > Replace GFP_ATOMIC with GFP_KERNEL to avoid using atomic reserves for
> > allocations that do not require that.
> 
> I usually think of the biggest downside of GFP_ATOMIC as being that it
> fails more often.  But, since we tend not to be low on memory in early
> boot, GFP_ATOMIC and GFP_KERNEL end up being pretty close in actual
> behavior.
> 
> These allocations also get exposed via init_extra_mapping_*().  But,
> those are used via early_initcall()s where GFP_KERNEL is fine too.

Right, I forgot to mention them in the changelog...

> Those are a bit worrying because they're in somewhat nice code, like the
> Numascale APIC code.  I'm not sure how much use it sees these days.
>
> I guess if this goes wrong somehow, we'll get some nice splats to tell
> us what happened.
> 
> Was this motivated by anything in particular?  Or is it a pure cleanup?

The trigger was the discussion about PKS protection for the kernel page
tables, but for now I'd say it's pure cleanup.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] x86/mm: make init_trampoline_kaslr() __init
  2021-11-11 11:02 ` [PATCH 1/4] x86/mm: make init_trampoline_kaslr() __init Mike Rapoport
@ 2021-11-11 20:46   ` Edgecombe, Rick P
  0 siblings, 0 replies; 13+ messages in thread
From: Edgecombe, Rick P @ 2021-11-11 20:46 UTC (permalink / raw)
  To: rppt, x86
  Cc: linux-kernel, peterz, tglx, dave.hansen, hpa, mingo, rppt,
	Lutomirski, Andy, bp

On Thu, 2021-11-11 at 13:02 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> init_trampoline_kaslr() is __meminit although it is only called by
> __init
> function init_trampoline().
> 
> Make init_trampoline_kaslr() __init.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>

Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

> ---
>  arch/x86/mm/kaslr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index 557f0fe25dff..d40bb4feeda3 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -138,7 +138,7 @@ void __init kernel_randomize_memory(void)
>  	}
>  }
>  
> -void __meminit init_trampoline_kaslr(void)
> +void __init init_trampoline_kaslr(void)
>  {
>  	pud_t *pud_page_tramp, *pud, *pud_tramp;
>  	p4d_t *p4d_page_tramp, *p4d, *p4d_tramp;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] x86/mm: make kernel_physical_mapping_change() __init
  2021-11-11 11:02 ` [PATCH 2/4] x86/mm: make kernel_physical_mapping_change() __init Mike Rapoport
@ 2021-11-11 20:47   ` Edgecombe, Rick P
  0 siblings, 0 replies; 13+ messages in thread
From: Edgecombe, Rick P @ 2021-11-11 20:47 UTC (permalink / raw)
  To: rppt, x86
  Cc: linux-kernel, peterz, tglx, dave.hansen, hpa, mingo, rppt,
	Lutomirski, Andy, bp

On Thu, 2021-11-11 at 13:02 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> kernel_physical_mapping_change() is __meminit although it is only
> called by
> __init function early_set_memory_enc_dec().
> 
> Make kernel_physical_mapping_change() __init.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>

Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

> ---
>  arch/x86/mm/init_64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 36098226a957..9a24532d2b85 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -794,7 +794,7 @@ kernel_physical_mapping_init(unsigned long
> paddr_start,
>   * when updating the mapping. The caller is responsible to flush the
> TLBs after
>   * the function returns.
>   */
> -unsigned long __meminit
> +unsigned long __init
>  kernel_physical_mapping_change(unsigned long paddr_start,
>  			       unsigned long paddr_end,
>  			       unsigned long page_size_mask)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] x86/mm: init_64: make set_pte_vaddr_p4d static
  2021-11-11 11:02 ` [PATCH 3/4] x86/mm: init_64: make set_pte_vaddr_p4d static Mike Rapoport
@ 2021-11-11 20:48   ` Edgecombe, Rick P
  0 siblings, 0 replies; 13+ messages in thread
From: Edgecombe, Rick P @ 2021-11-11 20:48 UTC (permalink / raw)
  To: rppt, x86
  Cc: linux-kernel, peterz, tglx, dave.hansen, hpa, mingo, rppt,
	Lutomirski, Andy, bp

On Thu, 2021-11-11 at 13:02 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The function set_pte_vaddr_p4d() is never called outside the file
> arch/x86/mm/init_64.c.
> 
> Make it static.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>

Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

> ---
>  arch/x86/include/asm/pgtable_64.h | 1 -
>  arch/x86/mm/init_64.c             | 3 ++-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pgtable_64.h
> b/arch/x86/include/asm/pgtable_64.h
> index 56d0399a0cd1..9f4b6d48ce2b 100644
> --- a/arch/x86/include/asm/pgtable_64.h
> +++ b/arch/x86/include/asm/pgtable_64.h
> @@ -59,7 +59,6 @@ static inline bool mm_p4d_folded(struct mm_struct
> *mm)
>  	return !pgtable_l5_enabled();
>  }
>  
> -void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t
> new_pte);
>  void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t
> new_pte);
>  
>  static inline void native_set_pte(pte_t *ptep, pte_t pte)
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 9a24532d2b85..e46d2f18d895 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -302,7 +302,8 @@ static void __set_pte_vaddr(pud_t *pud, unsigned
> long vaddr, pte_t new_pte)
>  	flush_tlb_one_kernel(vaddr);
>  }
>  
> -void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t
> new_pte)
> +static void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr,
> +			      pte_t new_pte)
>  {
>  	p4d_t *p4d = p4d_page + p4d_index(vaddr);
>  	pud_t *pud = fill_pud(p4d, vaddr);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations
  2021-11-11 11:02 ` [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
  2021-11-11 15:19   ` Dave Hansen
@ 2021-11-11 21:35   ` Edgecombe, Rick P
  2021-11-12 12:30     ` Mike Rapoport
  1 sibling, 1 reply; 13+ messages in thread
From: Edgecombe, Rick P @ 2021-11-11 21:35 UTC (permalink / raw)
  To: rppt, x86
  Cc: linux-kernel, peterz, tglx, dave.hansen, hpa, mingo, rppt,
	Lutomirski, Andy, bp

On Thu, 2021-11-11 at 13:02 +0200, Mike Rapoport wrote:
> The allocations of the direct map pages are mostly happen very early
> during
> the system boot and they use either the page table cache in brk area
> of bss
> or memblock.
> 
> The few callers that effectively use page allocator for the direct
> map
> updates are gart_iommu_init() and memory hotplug. Neither of them
> happen in
> an atomic context so there is no reason to use GFP_ATOMIC for these
> allocations.

There are some other places where these paths could get triggered.
alloc_low_pages() gets called by a bunch of memremap_pages() callers.
spp_getpage() gets called from the set_fixmap() family of functions. I
guess you are saying those should not end up triggering an allocation
post-after_bootmem?

I went ahead and did a search, and found this getting called in a timer
delay:
ghes_poll_func()
  spin_lock_irqsave()
  ghes_proc()
    ghes_read_estatus()
      __ghes_read_estatus()
        ghes_copy_tofrom_phys()
          ghes_map()
            __set_fixmap()
              ...spp_getpage()?

I’m not sure if it’s possible to hit, but potentially it could splat
about not being able to sleep? It would depend on something else not
already mapping the needed fixmap pte, which maybe would never happen.
It seems a little rickety though.

For alloc_low_pages(), I noticed the callers don’t check for allocation
failure. I'm a little surprised that there haven't been reports of the
allocation failing, because these operations could result in a lot more
pages getting allocated way past boot, and failure causes a NULL
pointer dereference.

I checked over the alloc_low_pages() callers and I didn’t see any
problems removing GFP_ATOMIC, but I wonder if it should try harder to
allocate. Or properly check for allocation failure in the callers, to
prevent pre-existing risk of crash. GFP_KERNEL doesn’t look to make it
any worse though, and I guess probably slightly less likely to crash.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations
  2021-11-11 21:35   ` Edgecombe, Rick P
@ 2021-11-12 12:30     ` Mike Rapoport
  2021-11-12 22:39       ` Edgecombe, Rick P
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Rapoport @ 2021-11-12 12:30 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: x86, linux-kernel, peterz, tglx, dave.hansen, hpa, mingo, rppt,
	Lutomirski, Andy, bp

On Thu, Nov 11, 2021 at 09:35:56PM +0000, Edgecombe, Rick P wrote:
> On Thu, 2021-11-11 at 13:02 +0200, Mike Rapoport wrote:
> > The allocations of the direct map pages are mostly happen very early
> > during
> > the system boot and they use either the page table cache in brk area
> > of bss
> > or memblock.
> > 
> > The few callers that effectively use page allocator for the direct
> > map
> > updates are gart_iommu_init() and memory hotplug. Neither of them
> > happen in
> > an atomic context so there is no reason to use GFP_ATOMIC for these
> > allocations.
> 
> There are some other places where these paths could get triggered.
> alloc_low_pages() gets called by a bunch of memremap_pages() callers.
> spp_getpage() gets called from the set_fixmap() family of functions. I
> guess you are saying those should not end up triggering an allocation
> post-after_bootmem?
> 
> I went ahead and did a search, and found this getting called in a timer
> delay:
> ghes_poll_func()
>   spin_lock_irqsave()
>   ghes_proc()
>     ghes_read_estatus()
>       __ghes_read_estatus()
>         ghes_copy_tofrom_phys()
>           ghes_map()
>             __set_fixmap()
>               ...spp_getpage()?
> 
> I’m not sure if it’s possible to hit, but potentially it could splat
> about not being able to sleep? It would depend on something else not
> already mapping the needed fixmap pte, which maybe would never happen.
> It seems a little rickety though.

The fixmap is less 2M so all the page tables will be allocated from gpt
cache/memblock so __set_fixmap() will be essentially a call to set_pte().

I'll see how to ensure that page tables for ghex fixmaps are explicitly
preallocated at init time.
 
> For alloc_low_pages(), I noticed the callers don’t check for allocation
> failure. I'm a little surprised that there haven't been reports of the
> allocation failing, because these operations could result in a lot more
> pages getting allocated way past boot, and failure causes a NULL
> pointer dereference.

The allocations at init time are really unlikely to succeed.
As for the memory hotplug, the hotplug will likely fail if there is no
memory, but the failure may be attributed to an error elsewhere.

I'm all for adding checks for allocation errors, but I don't think this is
strictly related to this patch.
 
> I checked over the alloc_low_pages() callers and I didn’t see any
> problems removing GFP_ATOMIC, but I wonder if it should try harder to
> allocate. Or properly check for allocation failure in the callers, to
> prevent pre-existing risk of crash. GFP_KERNEL doesn’t look to make it
> any worse though, and I guess probably slightly less likely to crash.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations
  2021-11-12 12:30     ` Mike Rapoport
@ 2021-11-12 22:39       ` Edgecombe, Rick P
  0 siblings, 0 replies; 13+ messages in thread
From: Edgecombe, Rick P @ 2021-11-12 22:39 UTC (permalink / raw)
  To: rppt
  Cc: linux-kernel, peterz, tglx, dave.hansen, x86, hpa, mingo, rppt,
	Lutomirski, Andy, bp

On Fri, 2021-11-12 at 14:30 +0200, Mike Rapoport wrote:
> The fixmap is less 2M so all the page tables will be allocated from
> gpt
> cache/memblock so __set_fixmap() will be essentially a call to
> set_pte().

If you configure CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP it can be larger.

> 
> I'll see how to ensure that page tables for ghex fixmaps are
> explicitly
> preallocated at init time.
>  

Investigating a bit further, fixmap pte's are allocated statically with
level1_fixmap_pgt, so set_fixmap() should never call spp_getpage() pre
or post after_bootmem. So I guess GFP_KERNEL shouldn't come into play
for all the fixmap callers.

Thanks!

Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-11-12 22:39 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-11 11:02 [PATCH 0/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
2021-11-11 11:02 ` [PATCH 1/4] x86/mm: make init_trampoline_kaslr() __init Mike Rapoport
2021-11-11 20:46   ` Edgecombe, Rick P
2021-11-11 11:02 ` [PATCH 2/4] x86/mm: make kernel_physical_mapping_change() __init Mike Rapoport
2021-11-11 20:47   ` Edgecombe, Rick P
2021-11-11 11:02 ` [PATCH 3/4] x86/mm: init_64: make set_pte_vaddr_p4d static Mike Rapoport
2021-11-11 20:48   ` Edgecombe, Rick P
2021-11-11 11:02 ` [PATCH 4/4] x86/mm: replace GFP_ATOMIC with GFP_KERNEL for direct map allocations Mike Rapoport
2021-11-11 15:19   ` Dave Hansen
2021-11-11 15:57     ` Mike Rapoport
2021-11-11 21:35   ` Edgecombe, Rick P
2021-11-12 12:30     ` Mike Rapoport
2021-11-12 22:39       ` Edgecombe, Rick P

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).