linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
@ 2018-11-19 23:19 Dan Williams
  2018-11-19 23:43 ` Dave Hansen
  2018-11-20  8:52 ` [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init() Peter Zijlstra
  0 siblings, 2 replies; 8+ messages in thread
From: Dan Williams @ 2018-11-19 23:19 UTC (permalink / raw)
  To: tglx
  Cc: Kirill A. Shutemov, Sebastian Andrzej Siewior, Peter Zijlstra,
	Borislav Petkov, stable, Andy Lutomirski, Dave Hansen, x86,
	mingo, linux-kernel

Commit f77084d96355 "x86/mm/pat: Disable preemption around
__flush_tlb_all()" addressed a case where __flush_tlb_all() is called
without preemption being disabled. It also left a warning to catch other
cases where preemption is not disabled. That warning triggers for the
memory hotplug path which is also used for persistent memory enabling:

 WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460
 RIP: 0010:__flush_tlb_all+0x1b/0x3a
 [..]
 Call Trace:
  phys_pud_init+0x29c/0x2bb
  kernel_physical_mapping_init+0xfc/0x219
  init_memory_mapping+0x1a5/0x3b0
  arch_add_memory+0x2c/0x50
  devm_memremap_pages+0x3aa/0x610
  pmem_attach_disk+0x585/0x700 [nd_pmem]

Andy wondered why a path that can sleep was using __flush_tlb_all() [1]
and Dave confirmed the expectation for TLB flush is for modifying /
invalidating existing pte entries, but not initial population [2]. Drop
the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the
expectation that this path is only ever populating empty entries for the
linear map. Note, at linear map teardown time there is a call to the
all-cpu flush_tlb_all() to invalidate the removed mappings.

[1]: https://lore.kernel.org/patchwork/patch/1009434/#1193941
[2]: https://lore.kernel.org/patchwork/patch/1009434/#1194540

Fixes: f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()")
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: <stable@vger.kernel.org>
Reported-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/mm/init_64.c |    6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5fab264948c2..de95db8ac52f 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -584,7 +584,6 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 							   paddr_end,
 							   page_size_mask,
 							   prot);
-				__flush_tlb_all();
 				continue;
 			}
 			/*
@@ -627,7 +626,6 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 		pud_populate(&init_mm, pud, pmd);
 		spin_unlock(&init_mm.page_table_lock);
 	}
-	__flush_tlb_all();
 
 	update_page_count(PG_LEVEL_1G, pages);
 
@@ -668,7 +666,6 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
 			paddr_last = phys_pud_init(pud, paddr,
 					paddr_end,
 					page_size_mask);
-			__flush_tlb_all();
 			continue;
 		}
 
@@ -680,7 +677,6 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
 		p4d_populate(&init_mm, p4d, pud);
 		spin_unlock(&init_mm.page_table_lock);
 	}
-	__flush_tlb_all();
 
 	return paddr_last;
 }
@@ -733,8 +729,6 @@ kernel_physical_mapping_init(unsigned long paddr_start,
 	if (pgd_changed)
 		sync_global_pgds(vaddr_start, vaddr_end - 1);
 
-	__flush_tlb_all();
-
 	return paddr_last;
 }
 



* Re: [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
  2018-11-19 23:19 [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init() Dan Williams
@ 2018-11-19 23:43 ` Dave Hansen
  2018-11-19 23:48   ` Dan Williams
                     ` (2 more replies)
  2018-11-20  8:52 ` [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init() Peter Zijlstra
  1 sibling, 3 replies; 8+ messages in thread
From: Dave Hansen @ 2018-11-19 23:43 UTC (permalink / raw)
  To: Dan Williams, tglx
  Cc: Kirill A. Shutemov, Sebastian Andrzej Siewior, Peter Zijlstra,
	Borislav Petkov, stable, Andy Lutomirski, Dave Hansen, x86,
	mingo, linux-kernel

On 11/19/18 3:19 PM, Dan Williams wrote:
> Andy wondered why a path that can sleep was using __flush_tlb_all() [1]
> and Dave confirmed the expectation for TLB flush is for modifying /
> invalidating existing pte entries, but not initial population [2].

I _think_ this is OK.

But, could we sprinkle a few WARN_ON_ONCE(p*_present()) calls in there
to help us sleep at night?


* Re: [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
  2018-11-19 23:43 ` Dave Hansen
@ 2018-11-19 23:48   ` Dan Williams
  2018-11-20  2:59   ` Williams, Dan J
  2018-12-05 18:06   ` [tip:x86/mm] generic/pgtable: Introduce set_pte_safe() tip-bot for Dan Williams
  2 siblings, 0 replies; 8+ messages in thread
From: Dan Williams @ 2018-11-19 23:48 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Thomas Gleixner, Kirill A. Shutemov, Sebastian Andrzej Siewior,
	Peter Zijlstra, Borislav Petkov, stable, Andy Lutomirski,
	Dave Hansen, X86 ML, Ingo Molnar, Linux Kernel Mailing List

On Mon, Nov 19, 2018 at 3:43 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 11/19/18 3:19 PM, Dan Williams wrote:
> > Andy wondered why a path that can sleep was using __flush_tlb_all() [1]
> > and Dave confirmed the expectation for TLB flush is for modifying /
> > invalidating existing pte entries, but not initial population [2].
>
> I _think_ this is OK.
>
> But, could we sprinkle a few WARN_ON_ONCE(p*_present()) calls in there
> to help us sleep at night?

Makes sense, I'll add those for v2.


* Re: [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
  2018-11-19 23:43 ` Dave Hansen
  2018-11-19 23:48   ` Dan Williams
@ 2018-11-20  2:59   ` Williams, Dan J
  2018-11-20  9:03     ` Peter Zijlstra
  2018-12-05 18:06   ` [tip:x86/mm] generic/pgtable: Introduce set_pte_safe() tip-bot for Dan Williams
  2 siblings, 1 reply; 8+ messages in thread
From: Williams, Dan J @ 2018-11-20  2:59 UTC (permalink / raw)
  To: tglx, Hansen, Dave
  Cc: bigeasy, kirill.shutemov, peterz, linux-kernel, dave.hansen,
	stable, x86, mingo, luto, bp

On Mon, 2018-11-19 at 15:43 -0800, Dave Hansen wrote:
> On 11/19/18 3:19 PM, Dan Williams wrote:
> > Andy wondered why a path that can sleep was using __flush_tlb_all()
> > [1]
> > and Dave confirmed the expectation for TLB flush is for modifying /
> > invalidating existing pte entries, but not initial population [2].
> 
> I _think_ this is OK.
> 
> But, could we sprinkle a few WARN_ON_ONCE(p*_present()) calls in
> there
> to help us sleep at night?

Well, I'm having nightmares now because my naive patch to sprinkle some
WARN_ON_ONCE() calls is leading to my VM live locking at boot... no
backtrace. If I revert the patch below and just go with the
__flush_tlb_all() removal it seems fine.

I'm going to set this aside for a bit, but if anyone has any thoughts
in the meantime I'd appreciate it.

---

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index de95db8ac52f..ecdf917def4c 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -432,6 +432,7 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
 					     E820_TYPE_RAM) &&
 			    !e820__mapped_any(paddr & PAGE_MASK, paddr_next,
 					     E820_TYPE_RESERVED_KERN))
+				WARN_ON_ONCE(pte_present(*pte));
 				set_pte(pte, __pte(0));
 			continue;
 		}
@@ -452,6 +453,7 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
 			pr_info("   pte=%p addr=%lx pte=%016lx\n", pte, paddr,
 				pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL).pte);
 		pages++;
+		WARN_ON_ONCE(pte_present(*pte));
 		set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
 		paddr_last = (paddr & PAGE_MASK) + PAGE_SIZE;
 	}
@@ -487,6 +489,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 					     E820_TYPE_RAM) &&
 			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
 					     E820_TYPE_RESERVED_KERN))
+				WARN_ON_ONCE(pmd_present(*pmd));
 				set_pmd(pmd, __pmd(0));
 			continue;
 		}
@@ -524,6 +527,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 		if (page_size_mask & (1<<PG_LEVEL_2M)) {
 			pages++;
 			spin_lock(&init_mm.page_table_lock);
+			WARN_ON_ONCE(pmd_present(*pmd));
 			set_pte((pte_t *)pmd,
 				pfn_pte((paddr & PMD_MASK) >> PAGE_SHIFT,
 					__pgprot(pgprot_val(prot) | _PAGE_PSE)));
@@ -536,6 +540,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 		paddr_last = phys_pte_init(pte, paddr, paddr_end, new_prot);
 
 		spin_lock(&init_mm.page_table_lock);
+		WARN_ON_ONCE(pmd_present(*pmd));
 		pmd_populate_kernel(&init_mm, pmd, pte);
 		spin_unlock(&init_mm.page_table_lock);
 	}
@@ -573,6 +578,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 					     E820_TYPE_RAM) &&
 			    !e820__mapped_any(paddr & PUD_MASK, paddr_next,
 					     E820_TYPE_RESERVED_KERN))
+				WARN_ON_ONCE(pud_present(*pud));
 				set_pud(pud, __pud(0));
 			continue;
 		}
@@ -610,6 +616,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 		if (page_size_mask & (1<<PG_LEVEL_1G)) {
 			pages++;
 			spin_lock(&init_mm.page_table_lock);
+			WARN_ON_ONCE(pud_present(*pud));
 			set_pte((pte_t *)pud,
 				pfn_pte((paddr & PUD_MASK) >> PAGE_SHIFT,
 					PAGE_KERNEL_LARGE));
@@ -623,6 +630,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 					   page_size_mask, prot);
 
 		spin_lock(&init_mm.page_table_lock);
+		WARN_ON_ONCE(pud_present(*pud));
 		pud_populate(&init_mm, pud, pmd);
 		spin_unlock(&init_mm.page_table_lock);
 	}
@@ -657,6 +665,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
 					     E820_TYPE_RAM) &&
 			    !e820__mapped_any(paddr & P4D_MASK, paddr_next,
 					     E820_TYPE_RESERVED_KERN))
+				WARN_ON_ONCE(p4d_present(*p4d));
 				set_p4d(p4d, __p4d(0));
 			continue;
 		}
@@ -674,6 +683,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
 					   page_size_mask);
 
 		spin_lock(&init_mm.page_table_lock);
+		WARN_ON_ONCE(p4d_present(*p4d));
 		p4d_populate(&init_mm, p4d, pud);
 		spin_unlock(&init_mm.page_table_lock);
 	}


* Re: [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
  2018-11-19 23:19 [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init() Dan Williams
  2018-11-19 23:43 ` Dave Hansen
@ 2018-11-20  8:52 ` Peter Zijlstra
  1 sibling, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2018-11-20  8:52 UTC (permalink / raw)
  To: Dan Williams
  Cc: tglx, Kirill A. Shutemov, Sebastian Andrzej Siewior,
	Borislav Petkov, stable, Andy Lutomirski, Dave Hansen, x86,
	mingo, linux-kernel

On Mon, Nov 19, 2018 at 03:19:04PM -0800, Dan Williams wrote:

> [1]: https://lore.kernel.org/patchwork/patch/1009434/#1193941
> [2]: https://lore.kernel.org/patchwork/patch/1009434/#1194540

FWIW, that is not the canonical form to refer to emails. Please use:

  https://lkml.kernel.org/r/$msgid

(also, patchwork is even worse crap than lore is for reading emails :/)


* Re: [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
  2018-11-20  2:59   ` Williams, Dan J
@ 2018-11-20  9:03     ` Peter Zijlstra
  2018-11-21 22:36       ` Dan Williams
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2018-11-20  9:03 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: tglx, Hansen, Dave, bigeasy, kirill.shutemov, linux-kernel,
	dave.hansen, stable, x86, mingo, luto, bp

On Tue, Nov 20, 2018 at 02:59:32AM +0000, Williams, Dan J wrote:
> On Mon, 2018-11-19 at 15:43 -0800, Dave Hansen wrote:
> > On 11/19/18 3:19 PM, Dan Williams wrote:
> > > Andy wondered why a path that can sleep was using __flush_tlb_all()
> > > [1]
> > > and Dave confirmed the expectation for TLB flush is for modifying /
> > > invalidating existing pte entries, but not initial population [2].
> > 
> > I _think_ this is OK.
> > 
> > But, could we sprinkle a few WARN_ON_ONCE(p*_present()) calls in
> > there
> > to help us sleep at night?
> 
> Well, I'm having nightmares now because my naive patch to sprinkle some
> WARN_ON_ONCE() calls is leading to my VM live locking at boot... no
> backtrace. If I revert the patch below and just go with the
> __flush_tlb_all() removal it seems fine.
> 
> I'm going to set this aside for a bit, but if anyone has any thoughts
> in the meantime I'd appreciate it.

Have you tried using early_printk ?

So kernel_physical_mapping_init() has a comment that states the virtual
and physical addresses we create mappings for should be PMD aligned,
which implies pud/p4d could have overlap between the mappings.

But in that case, I would expect the new and old values to match.

So maybe you should be checking something like:

	WARN_ON_ONCE(pud_present(*pud) && !pud_same(*pud, new));

> @@ -573,6 +578,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
>  					     E820_TYPE_RAM) &&
>  			    !e820__mapped_any(paddr & PUD_MASK, paddr_next,
>  					     E820_TYPE_RESERVED_KERN))
> +				WARN_ON_ONCE(pud_present(*pud));
>  				set_pud(pud, __pud(0));
>  			continue;
>  		}
> @@ -610,6 +616,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
>  		if (page_size_mask & (1<<PG_LEVEL_1G)) {
>  			pages++;
>  			spin_lock(&init_mm.page_table_lock);
> +			WARN_ON_ONCE(pud_present(*pud));
>  			set_pte((pte_t *)pud,
>  				pfn_pte((paddr & PUD_MASK) >> PAGE_SHIFT,
>  					PAGE_KERNEL_LARGE));
> @@ -623,6 +630,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
>  					   page_size_mask, prot);
>  
>  		spin_lock(&init_mm.page_table_lock);
> +		WARN_ON_ONCE(pud_present(*pud));
>  		pud_populate(&init_mm, pud, pmd);
>  		spin_unlock(&init_mm.page_table_lock);
>  	}
> @@ -657,6 +665,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
>  					     E820_TYPE_RAM) &&
>  			    !e820__mapped_any(paddr & P4D_MASK, paddr_next,
>  					     E820_TYPE_RESERVED_KERN))
> +				WARN_ON_ONCE(p4d_present(*p4d));
>  				set_p4d(p4d, __p4d(0));
>  			continue;
>  		}
> @@ -674,6 +683,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
>  					   page_size_mask);
>  
>  		spin_lock(&init_mm.page_table_lock);
> +		WARN_ON_ONCE(p4d_present(*p4d));
>  		p4d_populate(&init_mm, p4d, pud);
>  		spin_unlock(&init_mm.page_table_lock);
>  	}


* Re: [PATCH] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
  2018-11-20  9:03     ` Peter Zijlstra
@ 2018-11-21 22:36       ` Dan Williams
  0 siblings, 0 replies; 8+ messages in thread
From: Dan Williams @ 2018-11-21 22:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Dave Hansen, Sebastian Andrzej Siewior,
	Kirill A. Shutemov, Linux Kernel Mailing List, Dave Hansen,
	stable, X86 ML, Ingo Molnar, Andy Lutomirski, Borislav Petkov

On Tue, Nov 20, 2018 at 1:03 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Nov 20, 2018 at 02:59:32AM +0000, Williams, Dan J wrote:
> > On Mon, 2018-11-19 at 15:43 -0800, Dave Hansen wrote:
> > > On 11/19/18 3:19 PM, Dan Williams wrote:
> > > > Andy wondered why a path that can sleep was using __flush_tlb_all()
> > > > [1]
> > > > and Dave confirmed the expectation for TLB flush is for modifying /
> > > > invalidating existing pte entries, but not initial population [2].
> > >
> > > I _think_ this is OK.
> > >
> > > But, could we sprinkle a few WARN_ON_ONCE(p*_present()) calls in
> > > there
> > > to help us sleep at night?
> >
> > Well, I'm having nightmares now because my naive patch to sprinkle some
> > WARN_ON_ONCE() calls is leading to my VM live locking at boot... no
> > backtrace. If I revert the patch below and just go with the
> > __flush_tlb_all() removal it seems fine.
> >
> > I'm going to set this aside for a bit, but if anyone has any thoughts
> > in the meantime I'd appreciate it.
>
> Have you tried using early_printk ?

No, it boots well past printk, and even gets past pivot root.
Eventually live locks with all cores spinning. It appears to be
correlated with the arrival of pmem, and independent of the tlb
flushes... I'll dig deeper.

> So kernel_physical_mapping_init() has a comment that states the virtual
> and physical addresses we create mappings for should be PMD aligned,
> which implies pud/p4d could have overlap between the mappings.
>
> But in that case, I would expect the new and old values to match.
>
> So maybe you should be checking something like:
>
>         WARN_ON_ONCE(pud_present(*pud) && !pud_same(*pud, new));

Yes, that looks better.


* [tip:x86/mm] generic/pgtable: Introduce set_pte_safe()
  2018-11-19 23:43 ` Dave Hansen
  2018-11-19 23:48   ` Dan Williams
  2018-11-20  2:59   ` Williams, Dan J
@ 2018-12-05 18:06   ` tip-bot for Dan Williams
  2 siblings, 0 replies; 8+ messages in thread
From: tip-bot for Dan Williams @ 2018-12-05 18:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: riel, dave.hansen, linux-kernel, hpa, kirill.shutemov, luto,
	dave.hansen, bigeasy, torvalds, tglx, dan.j.williams, bp, mingo,
	peterz

Commit-ID:  4369deaa2f022ef92da45a0e7eec8a4a52e8e8a4
Gitweb:     https://git.kernel.org/tip/4369deaa2f022ef92da45a0e7eec8a4a52e8e8a4
Author:     Dan Williams <dan.j.williams@intel.com>
AuthorDate: Tue, 4 Dec 2018 13:37:16 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 5 Dec 2018 09:03:06 +0100

generic/pgtable: Introduce set_pte_safe()

Commit:

  f77084d96355 "x86/mm/pat: Disable preemption around __flush_tlb_all()"

introduced a warning to capture cases where __flush_tlb_all() is called
without preemption disabled. It triggers a false-positive warning in the
memory hotplug path.

On investigation it was found that the __flush_tlb_all() calls are not
necessary. However, they are only "not necessary" in practice provided
the ptes are being initially populated from the !present state.

Introduce set_pte_safe() as a sanity check that the pte is being updated
in a way that does not require a TLB flush.

Forgive the macro; the availability of the various set_pte() levels
is hit and miss across architectures.

[ mingo: Minor readability edits. ]

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/279dadae-9148-465c-7ec6-3f37e026c6c9@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/asm-generic/pgtable.h | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index dae7f98babed..a9cac82e9a7a 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -400,6 +400,44 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
 }
 #endif
 
+/*
+ * Use set_p*_safe(), and elide TLB flushing, when confident that *no*
+ * TLB flush will be required as a result of the "set". For example, use
+ * in scenarios where it is known ahead of time that the routine is
+ * setting non-present entries, or re-setting an existing entry to the
+ * same value. Otherwise, use the typical "set" helpers and flush the
+ * TLB.
+ */
+#define set_pte_safe(ptep, pte) \
+({ \
+	WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \
+	set_pte(ptep, pte); \
+})
+
+#define set_pmd_safe(pmdp, pmd) \
+({ \
+	WARN_ON_ONCE(pmd_present(*pmdp) && !pmd_same(*pmdp, pmd)); \
+	set_pmd(pmdp, pmd); \
+})
+
+#define set_pud_safe(pudp, pud) \
+({ \
+	WARN_ON_ONCE(pud_present(*pudp) && !pud_same(*pudp, pud)); \
+	set_pud(pudp, pud); \
+})
+
+#define set_p4d_safe(p4dp, p4d) \
+({ \
+	WARN_ON_ONCE(p4d_present(*p4dp) && !p4d_same(*p4dp, p4d)); \
+	set_p4d(p4dp, p4d); \
+})
+
+#define set_pgd_safe(pgdp, pgd) \
+({ \
+	WARN_ON_ONCE(pgd_present(*pgdp) && !pgd_same(*pgdp, pgd)); \
+	set_pgd(pgdp, pgd); \
+})
+
 #ifndef __HAVE_ARCH_DO_SWAP_PAGE
 /*
  * Some architectures support metadata associated with a page. When a

