All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@alien8.de>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 36/43] x86/mm: Allow flushing for future ASID switches
Date: Fri, 24 Nov 2017 10:14:41 +0100	[thread overview]
Message-ID: <20171124091448.7649-37-mingo@kernel.org> (raw)
In-Reply-To: <20171124091448.7649-1-mingo@kernel.org>

From: Dave Hansen <dave.hansen@linux.intel.com>

If changing the page tables in such a way that an invalidation of
all contexts (aka. PCIDs / ASIDs) is required, they can be
actively invalidated by:

 1. INVPCID for each PCID (works for single pages too).
 2. Load CR3 with each PCID without the NOFLUSH bit set
 3. Load CR3 with the NOFLUSH bit set for each and do
    INVLPG for each address.

But, none of these are really feasible since there are ~6 ASIDs (12 with
KAISER) at the time that invalidation is required.  Instead of
actively invalidating them, invalidate the *current* context and
also mark the cpu_tlbstate _quickly_ to indicate future invalidation
to be required.

At the next context-switch, look for this indicator
('all_other_ctxs_invalid' being set) invalidate all of the
cpu_tlbstate.ctxs[] entries.

This ensures that any future context switches will do a full flush
of the TLB, picking up the previous changes.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Daniel Gruss <daniel.gruss@iaik.tugraz.at>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at>
Cc: Moritz Lipp <moritz.lipp@iaik.tugraz.at>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Fellner <richard.fellner@student.tugraz.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20171123003507.E8C327F5@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/tlbflush.h | 47 ++++++++++++++++++++++++++++++++---------
 arch/x86/mm/tlb.c               | 35 ++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 24b27eb5904c..bb5ba71038ee 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -184,6 +184,17 @@ struct tlb_state {
 	 */
 	bool is_lazy;
 
+	/*
+	 * If set we changed the page tables in such a way that we
+	 * needed an invalidation of all contexts (aka. PCIDs / ASIDs).
+	 * This tells us to go invalidate all the non-loaded ctxs[]
+	 * on the next context switch.
+	 *
+	 * The current ctx was kept up-to-date as it ran and does not
+	 * need to be invalidated.
+	 */
+	bool all_other_ctxs_invalid;
+
 	/*
 	 * Access to this CR4 shadow and to H/W CR4 is protected by
 	 * disabling interrupts when modifying either one.
@@ -261,6 +272,19 @@ static inline unsigned long cr4_read_shadow(void)
 	return this_cpu_read(cpu_tlbstate.cr4);
 }
 
+static inline void tlb_flush_shared_nonglobals(void)
+{
+	/*
+	 * With global pages, all of the shared kenel page tables
+	 * are set as _PAGE_GLOBAL.  We have no shared nonglobals
+	 * and nothing to do here.
+	 */
+	if (IS_ENABLED(CONFIG_X86_GLOBAL_PAGES))
+		return;
+
+	this_cpu_write(cpu_tlbstate.all_other_ctxs_invalid, true);
+}
+
 /*
  * Save some of cr4 feature set we're using (e.g.  Pentium 4MB
  * enable and PPro Global page enable), so that any CPU's that boot
@@ -290,6 +314,10 @@ static inline void __native_flush_tlb(void)
 	preempt_disable();
 	native_write_cr3(__native_read_cr3());
 	preempt_enable();
+	/*
+	 * Does not need tlb_flush_shared_nonglobals() since the CR3 write
+	 * without PCIDs flushes all non-globals.
+	 */
 }
 
 static inline void __native_flush_tlb_global_irq_disabled(void)
@@ -335,24 +363,23 @@ static inline void __native_flush_tlb_single(unsigned long addr)
 
 static inline void __flush_tlb_all(void)
 {
-	if (boot_cpu_has(X86_FEATURE_PGE))
+	if (boot_cpu_has(X86_FEATURE_PGE)) {
 		__flush_tlb_global();
-	else
+	} else {
 		__flush_tlb();
-
-	/*
-	 * Note: if we somehow had PCID but not PGE, then this wouldn't work --
-	 * we'd end up flushing kernel translations for the current ASID but
-	 * we might fail to flush kernel translations for other cached ASIDs.
-	 *
-	 * To avoid this issue, we force PCID off if PGE is off.
-	 */
+		tlb_flush_shared_nonglobals();
+	}
 }
 
 static inline void __flush_tlb_one(unsigned long addr)
 {
 	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ONE);
 	__flush_tlb_single(addr);
+	/*
+	 * Invalidate other address spaces inaccessible to single-page
+	 * invalidation:
+	 */
+	tlb_flush_shared_nonglobals();
 }
 
 #define TLB_FLUSH_ALL	-1UL
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index e629dbda01a0..81941f1690fa 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -28,6 +28,38 @@
  *	Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
  */
 
+/*
+ * We get here when we do something requiring a TLB invalidation
+ * but could not go invalidate all of the contexts.  We do the
+ * necessary invalidation by clearing out the 'ctx_id' which
+ * forces a TLB flush when the context is loaded.
+ */
+void clear_non_loaded_ctxs(void)
+{
+	u16 asid;
+
+	/*
+	 * This is only expected to be set if we have disabled
+	 * kernel _PAGE_GLOBAL pages.
+	 */
+	if (IS_ENABLED(CONFIG_X86_GLOBAL_PAGES)) {
+		WARN_ON_ONCE(1);
+		return;
+	}
+
+	for (asid = 0; asid < TLB_NR_DYN_ASIDS; asid++) {
+		/* Do not need to flush the current asid */
+		if (asid == this_cpu_read(cpu_tlbstate.loaded_mm_asid))
+			continue;
+		/*
+		 * Make sure the next time we go to switch to
+		 * this asid, we do a flush:
+		 */
+		this_cpu_write(cpu_tlbstate.ctxs[asid].ctx_id, 0);
+	}
+	this_cpu_write(cpu_tlbstate.all_other_ctxs_invalid, false);
+}
+
 atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);
 
 
@@ -42,6 +74,9 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen,
 		return;
 	}
 
+	if (this_cpu_read(cpu_tlbstate.all_other_ctxs_invalid))
+		clear_non_loaded_ctxs();
+
 	for (asid = 0; asid < TLB_NR_DYN_ASIDS; asid++) {
 		if (this_cpu_read(cpu_tlbstate.ctxs[asid].ctx_id) !=
 		    next->context.ctx_id)
-- 
2.14.1

  parent reply	other threads:[~2017-11-24  9:17 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-24  9:14 [PATCH 00/43] x86 entry-stack and Kaiser series, 2017/11/24 version Ingo Molnar
2017-11-24  9:14 ` [PATCH 01/43] x86/decoder: Add new TEST instruction pattern Ingo Molnar
2017-11-24 10:38   ` Borislav Petkov
2017-12-02  7:39   ` Robert Elliott (Persistent Memory)
2017-11-24  9:14 ` [PATCH 02/43] x86/entry/64: Allocate and enable the SYSENTER stack Ingo Molnar
2017-11-24  9:14 ` [PATCH 03/43] x86/dumpstack: Add get_stack_info() support for " Ingo Molnar
2017-11-24  9:14 ` [PATCH 04/43] x86/gdt: Put per-cpu GDT remaps in ascending order Ingo Molnar
2017-11-24  9:14 ` [PATCH 05/43] x86/fixmap: Generalize the GDT fixmap mechanism Ingo Molnar
2017-11-24 11:00   ` Borislav Petkov
2017-11-24  9:14 ` [PATCH 06/43] x86/kasan/64: Teach KASAN about the cpu_entry_area Ingo Molnar
2017-11-24  9:14 ` [PATCH 07/43] x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss Ingo Molnar
2017-11-24  9:14 ` [PATCH 08/43] x86/dumpstack: Handle stack overflow on all stacks Ingo Molnar
2017-11-24  9:14 ` [PATCH 09/43] x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct Ingo Molnar
2017-11-24 11:44   ` Borislav Petkov
2017-11-24  9:14 ` [PATCH 10/43] x86/entry: Remap the TSS into the cpu entry area Ingo Molnar
2017-11-24  9:14 ` [PATCH 11/43] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 Ingo Molnar
2017-11-24 14:19   ` Borislav Petkov
2017-11-24  9:14 ` [PATCH 12/43] x86/espfix/64: Stop assuming that pt_regs is on the entry stack Ingo Molnar
2017-11-24  9:14 ` [PATCH 13/43] x86/entry/64: Use a percpu trampoline stack for IDT entries Ingo Molnar
2017-11-24 11:27   ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 14/43] x86/entry/64: Return to userspace from the trampoline stack Ingo Molnar
2017-11-24 13:46   ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 15/43] x86/entry/64: Create a percpu SYSCALL entry trampoline Ingo Molnar
2017-11-24 13:52   ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 16/43] x86/irq: Remove an old outdated comment about context tracking races Ingo Molnar
2017-11-24 13:53   ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 17/43] x86/irq/64: Print the offending IP in the stack overflow warning Ingo Molnar
2017-11-24 14:22   ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 18/43] x86/entry/64: Move the IST stacks into cpu_entry_area Ingo Molnar
2017-11-24 14:23   ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 19/43] x86/entry/64: Remove the SYSENTER stack canary Ingo Molnar
2017-11-24 14:23   ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 20/43] x86/entry: Clean up SYSENTER_stack code Ingo Molnar
2017-11-24 14:24   ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 21/43] x86/mm/kaiser: Disable global pages by default with KAISER Ingo Molnar
2017-11-24  9:14 ` [PATCH 22/43] x86/mm/kaiser: Prepare assembly for entry/exit CR3 switching Ingo Molnar
2017-11-24 12:05   ` Peter Zijlstra
2017-11-24 12:17     ` Ingo Molnar
2017-11-24 12:45       ` Peter Zijlstra
2017-11-24 13:04         ` Thomas Gleixner
2017-11-24  9:14 ` [PATCH 23/43] x86/mm/kaiser: Introduce user-mapped per-cpu areas Ingo Molnar
2017-11-24  9:14 ` [PATCH 24/43] x86/mm/kaiser: Mark per-cpu data structures required for entry/exit Ingo Molnar
2017-11-24  9:14 ` [PATCH 25/43] x86/mm/kaiser: Unmap kernel from userspace page tables (core patch) Ingo Molnar
2017-11-24 12:13   ` Peter Zijlstra
2017-11-24 13:46     ` Ingo Molnar
2017-11-24 12:16   ` Peter Zijlstra
2017-11-24 16:33     ` Dave Hansen
2017-11-26 15:13       ` Ingo Molnar
2017-11-24 13:30   ` Peter Zijlstra
2017-11-26 15:15     ` Ingo Molnar
2017-11-27  8:59       ` [PATCH] x86/mm/kaiser: Use the other page_table_lock pattern Peter Zijlstra
2017-11-27  8:59       ` [PATCH] mm: Unify page_table_lock allocation pattern Peter Zijlstra
2017-11-24  9:14 ` [PATCH 26/43] x86/mm/kaiser: Allow NX poison to be set in p4d/pgd Ingo Molnar
2017-11-24  9:14 ` [PATCH 27/43] x86/mm/kaiser: Make sure static PGDs are 8k in size Ingo Molnar
2017-11-24  9:14 ` [PATCH 28/43] x86/mm/kaiser: Map CPU entry area Ingo Molnar
2017-11-24 13:43   ` Peter Zijlstra
2017-11-24  9:14 ` [PATCH 29/43] x86/mm/kaiser: Map dynamically-allocated LDTs Ingo Molnar
2017-11-24  9:14 ` [PATCH 30/43] x86/mm/kaiser: Map espfix structures Ingo Molnar
2017-11-24 13:47   ` Peter Zijlstra
2017-11-24 16:17     ` Andy Lutomirski
2017-11-27  9:14       ` Peter Zijlstra
2017-11-27 15:35         ` Peter Zijlstra
2017-11-24  9:14 ` [PATCH 31/43] x86/mm/kaiser: Map entry stack variable Ingo Molnar
2017-11-24  9:14 ` [PATCH 32/43] x86/mm/kaiser: Map virtually-addressed performance monitoring buffers Ingo Molnar
2017-11-24  9:14 ` [PATCH 33/43] x86/mm: Move CR3 construction functions Ingo Molnar
2017-11-24  9:14 ` [PATCH 34/43] x86/mm: Remove hard-coded ASID limit checks Ingo Molnar
2017-11-24  9:14 ` [PATCH 35/43] x86/mm: Put mmu-to-h/w ASID translation in one place Ingo Molnar
2017-11-24  9:14 ` Ingo Molnar [this message]
2017-11-24  9:14 ` [PATCH 37/43] x86/mm/kaiser: Use PCID feature to make user and kernel switches faster Ingo Molnar
2017-11-24  9:14 ` [PATCH 38/43] x86/mm/kaiser: Disable native VSYSCALL Ingo Molnar
2017-11-24  9:14 ` [PATCH 39/43] x86/mm/kaiser: Add debugfs file to turn KAISER on/off at runtime Ingo Molnar
2017-11-24  9:14 ` [PATCH 40/43] x86/mm/kaiser: Add a function to check for KAISER being enabled Ingo Molnar
2017-11-24  9:14 ` [PATCH 41/43] x86/mm/kaiser: Un-poison PGDs at runtime Ingo Molnar
2017-11-24  9:14 ` [PATCH 42/43] x86/mm/kaiser: Allow KAISER to be enabled/disabled " Ingo Molnar
2017-11-24  9:14 ` [PATCH 43/43] x86/mm/kaiser: Add Kconfig Ingo Molnar
2017-11-24 13:55 ` [PATCH 00/43] x86 entry-stack and Kaiser series, 2017/11/24 version Ingo Molnar
2017-11-24 15:23 ` Thomas Gleixner
2017-11-24 17:19   ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171124091448.7649-37-mingo@kernel.org \
    --to=mingo@kernel.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.