From: Andy Lutomirski <luto@kernel.org>
To: X86 ML <x86@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>
Subject: [PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume
Date: Wed,  6 Sep 2017 19:54:53 -0700
Message-ID: <f13c8e8c58ba3b535f1e4cb9e62b50ab37dd69bb.1504752689.git.luto@kernel.org>
In-Reply-To: <cover.1504752689.git.luto@kernel.org>

When Linux brings a CPU down and back up, it switches to init_mm and then
loads swapper_pg_dir into CR3.  With PCID enabled, this has the side effect
of masking off the ASID bits in CR3.

This can confuse the TLB handling code.  If we bring a CPU down and
back up while it was using any ASID other than 0, the ASID active on
the CPU after resume no longer matches the one recorded in
cpu_tlbstate.  This could corrupt our internal state, although major
corruption is unlikely because init_mm doesn't have any user pages.
More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion
in the next context switch.  The result of *that* is a failure to
resume from suspend with probability 1 - 1/6^(cpus-1).

Fix it by reinitializing cpu_tlbstate on resume and CPU bringup.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Jiri Kosina <jikos@kernel.org>
Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/kernel/cpu/common.c    |  2 ++
 arch/x86/mm/tlb.c               | 44 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/power/cpu.c            |  1 +
 4 files changed, 49 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index d23e61dc0640..4893abf7f74f 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -198,6 +198,8 @@ static inline void cr4_set_bits_and_update_boot(unsigned long mask)
 	cr4_set_bits(mask);
 }
 
+extern void initialize_tlbstate_and_flush(void);
+
 static inline void __native_flush_tlb(void)
 {
 	/*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index efba8e3da3e2..40cb4d0a5982 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1583,6 +1583,7 @@ void cpu_init(void)
 	mmgrab(&init_mm);
 	me->active_mm = &init_mm;
 	BUG_ON(me->mm);
+	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, me);
 
 	load_sp0(t, &current->thread);
@@ -1637,6 +1638,7 @@ void cpu_init(void)
 	mmgrab(&init_mm);
 	curr->active_mm = &init_mm;
 	BUG_ON(curr->mm);
+	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, curr);
 
 	load_sp0(t, thread);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ce104b962a17..dbbcfd59726a 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -214,6 +214,50 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 }
 
 /*
+ * Call this when reinitializing a CPU.  It fixes the following potential
+ * problems:
+ *
+ * - The ASID changed from what cpu_tlbstate thinks it is (most likely
+ *   because the CPU was taken down and came back up with CR3's PCID
+ *   bits clear).  CPU hotplug can do this.
+ *
+ * - The TLB contains junk in slots corresponding to inactive ASIDs.
+ *
+ * - The CPU went so far out to lunch that it may have missed a TLB
+ *   flush.
+ */
+void initialize_tlbstate_and_flush(void)
+{
+	int i;
+	struct mm_struct *mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+	u64 tlb_gen = atomic64_read(&init_mm.context.tlb_gen);
+	unsigned long cr3 = __read_cr3();
+
+	/* Assert that CR3 already references the right mm. */
+	WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd));
+
+	/*
+	 * Assert that CR4.PCIDE is set if needed.  (CR4.PCIDE initialization
+	 * doesn't work like other CR4 bits because it can only be set from
+	 * long mode.)
+	 */
+	WARN_ON(boot_cpu_has(X86_FEATURE_PCID) &&
+		!(cr4_read_shadow() & X86_CR4_PCIDE));
+
+	/* Force ASID 0 and force a TLB flush. */
+	write_cr3(cr3 & ~CR3_PCID_MASK);
+
+	/* Reinitialize tlbstate. */
+	this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
+	this_cpu_write(cpu_tlbstate.next_asid, 1);
+	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
+	this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen);
+
+	for (i = 1; i < TLB_NR_DYN_ASIDS; i++)
+		this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0);
+}
+
+/*
  * flush_tlb_func_common()'s memory ordering requirement is that any
  * TLB fills that happen after we flush the TLB are ordered after we
  * read active_mm's tlb_gen.  We don't need any explicit barriers
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 78459a6d455a..4d68d59f457d 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -181,6 +181,7 @@ static void fix_processor_context(void)
 #endif
 	load_TR_desc();				/* This does ltr */
 	load_mm_ldt(current->active_mm);	/* This does lldt */
+	initialize_tlbstate_and_flush();
 
 	fpu__resume_cpu();
 
-- 
2.13.5

Thread overview: 24+ messages
2017-09-07  2:54 [PATCH 0/2] Fix resume failure due to PCID Andy Lutomirski
2017-09-07  2:54 ` Andy Lutomirski [this message]
2017-09-07  7:01   ` [PATCH] mm/debug: Change BUG_ON() crashes to survivable WARN_ON() warnings Ingo Molnar
2017-09-07 20:50     ` Linus Torvalds
2017-09-07  7:31   ` [PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume Jiri Kosina
2017-09-07  7:48     ` Ingo Molnar
2017-09-07 19:55       ` Jiri Kosina
2017-09-08  1:23         ` Andy Lutomirski
2017-09-07  9:54   ` Borislav Petkov
2017-09-07  9:59     ` Ingo Molnar
2017-09-07 10:10       ` Borislav Petkov
2017-09-07  2:54 ` [PATCH 2/2] x86/mm: Document how CR4.PCIDE restore works Andy Lutomirski
2017-09-07  3:25 ` [PATCH 0/2] Fix resume failure due to PCID Linus Torvalds
2017-09-07  4:15   ` Andy Lutomirski
2017-09-15  6:59   ` x60: warnings on boot and resume, arch/x86/mm/tlb.c:257 initialize_ ... was " Pavel Machek
2017-09-15  8:39     ` Ingo Molnar
2017-09-15  9:16       ` Pavel Machek
2017-09-15  9:35         ` Ingo Molnar
2017-09-15 10:22       ` [4.14-rc0 regression] " Pavel Machek
2017-09-15 18:47         ` Linus Torvalds
2017-09-15 19:29           ` Andy Lutomirski
2017-09-15 21:06             ` Andy Lutomirski
2017-09-07  8:59 ` Borislav Petkov
2017-09-15 11:01 ` Pavel Machek