All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
@ 2020-08-24 22:12 Tony Luck
  2020-08-26 16:45 ` [tip: ras/core] " tip-bot2 for Tony Luck
  0 siblings, 1 reply; 2+ messages in thread
From: Tony Luck @ 2020-08-24 22:12 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Tony Luck, Gabriele Paoloni, linux-kernel, x86

A long time ago, Linux cleared IA32_MCG_STATUS at the very end of machine
check processing.

Then we added some fancy recovery and IST manipulation in:

commit d4812e169de4 ("x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks")

and clearing IA32_MCG_STATUS was pulled earlier in the function.

Next change moved the actual recovery out of do_machine_check() and just
used task_work_add() to schedule it later (before returning to the user):

commit 5567d11c21a1 ("x86/mce: Send #MC singal from task work")

Most recently the fancy IST footwork was removed as no longer needed:

commit b052df3da821 ("x86/entry: Get rid of ist_begin/end_non_atomic()")

At this point there is no reason remaining to clear IA32_MCG_STATUS early.
It can move back to the very end of the function.

Also moved sync_core(). The comments for this function say that it should
only be called when instructions have been changed/re-mapped. Recovery for
an instruction fetch may change the physical address. But that doesn't happen
until the scheduled work runs (which could be on another CPU).

Reported-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/mce/core.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f43a78bde670..0ba24dfffdb2 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1190,6 +1190,7 @@ static void kill_me_maybe(struct callback_head *cb)
 
 	if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags)) {
 		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
+		sync_core();
 		return;
 	}
 
@@ -1330,12 +1331,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
 	if (worst > 0)
 		irq_work_queue(&mce_irq_work);
 
-	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
-
-	sync_core();
-
 	if (worst != MCE_AR_SEVERITY && !kill_it)
-		return;
+		goto out;
 
 	/* Fault was in user mode and we need to take some action */
 	if ((m.cs & 3) == 3) {
@@ -1364,6 +1361,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
 				mce_panic("Failed kernel mode recovery", &m, msg);
 		}
 	}
+out:
+	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
 }
 EXPORT_SYMBOL_GPL(do_machine_check);
 
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [tip: ras/core] x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
  2020-08-24 22:12 [PATCH] x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check() Tony Luck
@ 2020-08-26 16:45 ` tip-bot2 for Tony Luck
  0 siblings, 0 replies; 2+ messages in thread
From: tip-bot2 for Tony Luck @ 2020-08-26 16:45 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Gabriele Paoloni, Tony Luck, Borislav Petkov, x86, LKML

The following commit has been merged into the ras/core branch of tip:

Commit-ID:     1e36d9c6886849c6f3d3c836370563e6bc1a6ddd
Gitweb:        https://git.kernel.org/tip/1e36d9c6886849c6f3d3c836370563e6bc1a6ddd
Author:        Tony Luck <tony.luck@intel.com>
AuthorDate:    Mon, 24 Aug 2020 15:12:37 -07:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 26 Aug 2020 18:40:18 +02:00

x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()

A long time ago, Linux cleared IA32_MCG_STATUS at the very end of machine
check processing.

Then, some fancy recovery and IST manipulation was added in:

  d4812e169de4 ("x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks")

and clearing IA32_MCG_STATUS was pulled earlier in the function.

Next change moved the actual recovery out of do_machine_check() and
just used task_work_add() to schedule it later (before returning to the
user):

  5567d11c21a1 ("x86/mce: Send #MC singal from task work")

Most recently the fancy IST footwork was removed as no longer needed:

  b052df3da821 ("x86/entry: Get rid of ist_begin/end_non_atomic()")

At this point there is no reason remaining to clear IA32_MCG_STATUS early.
It can move back to the very end of the function.

Also move sync_core(). The comments for this function say that it should
only be called when instructions have been changed/re-mapped. Recovery
for an instruction fetch may change the physical address. But that
doesn't happen until the scheduled work runs (which could be on another
CPU).

 [ bp: Massage commit message. ]

Reported-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200824221237.5397-1-tony.luck@intel.com
---
 arch/x86/kernel/cpu/mce/core.c |  9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f43a78b..0ba24df 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1190,6 +1190,7 @@ static void kill_me_maybe(struct callback_head *cb)
 
 	if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags)) {
 		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
+		sync_core();
 		return;
 	}
 
@@ -1330,12 +1331,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
 	if (worst > 0)
 		irq_work_queue(&mce_irq_work);
 
-	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
-
-	sync_core();
-
 	if (worst != MCE_AR_SEVERITY && !kill_it)
-		return;
+		goto out;
 
 	/* Fault was in user mode and we need to take some action */
 	if ((m.cs & 3) == 3) {
@@ -1364,6 +1361,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
 				mce_panic("Failed kernel mode recovery", &m, msg);
 		}
 	}
+out:
+	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
 }
 EXPORT_SYMBOL_GPL(do_machine_check);
 

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-08-26 16:48 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-24 22:12 [PATCH] x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check() Tony Luck
2020-08-26 16:45 ` [tip: ras/core] " tip-bot2 for Tony Luck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.