All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v0 00/12] x86/mce: Correct the noinstr annotation
@ 2021-11-04 14:40 Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
                   ` (12 more replies)
  0 siblings, 13 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Hi,

here's a first preliminary (it is based on some random 5.16-rc0 commit
and is tested only in qemu) of the series which correct all the noinstr
annotation of the #MC handler.

Since it calls a bunch of external facilities, the strategy is to mark
mce-specific functions called by the #MC handler as noinstr and when
they "call out" so to speak, to do a begin/end sandwich around that
call.

Please have a look and let me know if it looks ok-ish.

Further testing will happen after -rc1 releases, thus the "v0" version
here.

Thx.

Borislav Petkov (12):
  x86/mce: Do not use memset to clear the banks bitmaps
  x86/mce: Remove function-local cpus variables
  x86/mce: Use mce_rdmsrl() in severity checking code
  x86/mce: Remove noinstr annotation from mce_setup()
  x86/mce: Allow instrumentation during task work queueing
  x86/mce: Prevent severity computation from being instrumented
  x86/mce: Mark mce_panic() noinstr
  x86/mce: Mark mce_end() noinstr
  x86/mce: Mark mce_read_aux() noinstr
  x86/mce: Move the tainting outside of the noinstr region
  x86/mce: Mark mce_timed_out() noinstr
  x86/mce: Mark mce_start() noinstr

 arch/x86/kernel/cpu/mce/core.c     | 108 +++++++++++++++++++----------
 arch/x86/kernel/cpu/mce/internal.h |   2 +
 arch/x86/kernel/cpu/mce/severity.c |  37 ++++++----
 3 files changed, 98 insertions(+), 49 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v0 01/12] x86/mce: Do not use memset to clear the banks bitmaps
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 02/12] x86/mce: Remove function-local cpus variables Borislav Petkov
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

The bitmap is a single unsigned long so no need for the function call.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 6ed365337a3b..52d277c955f9 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1340,8 +1340,8 @@ static noinstr void unexpected_machine_check(struct pt_regs *regs)
 noinstr void do_machine_check(struct pt_regs *regs)
 {
 	int worst = 0, order, no_way_out, kill_current_task, lmce;
-	DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
-	DECLARE_BITMAP(toclear, MAX_NR_BANKS);
+	DECLARE_BITMAP(valid_banks, MAX_NR_BANKS) = { 0 };
+	DECLARE_BITMAP(toclear, MAX_NR_BANKS) = { 0 };
 	struct mca_config *cfg = &mca_cfg;
 	struct mce m, *final;
 	char *msg = NULL;
@@ -1385,7 +1385,6 @@ noinstr void do_machine_check(struct pt_regs *regs)
 	final = this_cpu_ptr(&mces_seen);
 	*final = m;
 
-	memset(valid_banks, 0, sizeof(valid_banks));
 	no_way_out = mce_no_way_out(&m, &msg, valid_banks, regs);
 
 	barrier();
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 02/12] x86/mce: Remove function-local cpus variables
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 03/12] x86/mce: Use mce_rdmsrl() in severity checking code Borislav Petkov
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Use num_online_cpus() directly.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 52d277c955f9..0dcd7ac7444c 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -989,7 +989,6 @@ static atomic_t global_nwo;
 static int mce_start(int *no_way_out)
 {
 	int order;
-	int cpus = num_online_cpus();
 	u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
 
 	if (!timeout)
@@ -1006,7 +1005,7 @@ static int mce_start(int *no_way_out)
 	/*
 	 * Wait for everyone.
 	 */
-	while (atomic_read(&mce_callin) != cpus) {
+	while (atomic_read(&mce_callin) != num_online_cpus()) {
 		if (mce_timed_out(&timeout,
 				  "Timeout: Not all CPUs entered broadcast exception handler")) {
 			atomic_set(&global_nwo, 0);
@@ -1070,14 +1069,11 @@ static int mce_end(int order)
 	atomic_inc(&mce_executing);
 
 	if (order == 1) {
-		/* CHECKME: Can this race with a parallel hotplug? */
-		int cpus = num_online_cpus();
-
 		/*
 		 * Monarch: Wait for everyone to go through their scanning
 		 * loops.
 		 */
-		while (atomic_read(&mce_executing) <= cpus) {
+		while (atomic_read(&mce_executing) <= num_online_cpus()) {
 			if (mce_timed_out(&timeout,
 					  "Timeout: Monarch CPU unable to finish machine check processing"))
 				goto reset;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 03/12] x86/mce: Use mce_rdmsrl() in severity checking code
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 02/12] x86/mce: Remove function-local cpus variables Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 04/12] x86/mce: Remove noinstr annotation from mce_setup() Borislav Petkov
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

MCA has its own special MSR accessors. Use them.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c     | 2 +-
 arch/x86/kernel/cpu/mce/internal.h | 2 ++
 arch/x86/kernel/cpu/mce/severity.c | 8 +++-----
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 0dcd7ac7444c..c660008ccd4a 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -365,7 +365,7 @@ void ex_handler_msr_mce(struct pt_regs *regs, bool wrmsr)
 }
 
 /* MSR access wrappers used for error injection */
-static noinstr u64 mce_rdmsrl(u32 msr)
+noinstr u64 mce_rdmsrl(u32 msr)
 {
 	DECLARE_ARGS(val, low, high);
 
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index acd61c41846c..52c633950b38 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -207,4 +207,6 @@ static inline void pentium_machine_check(struct pt_regs *regs) {}
 static inline void winchip_machine_check(struct pt_regs *regs) {}
 #endif
 
+noinstr u64 mce_rdmsrl(u32 msr);
+
 #endif /* __X86_MCE_INTERNAL_H__ */
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index bb019a594a2c..00e97ebbd94a 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -288,8 +288,7 @@ static int error_context(struct mce *m, struct pt_regs *regs)
 
 static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
 {
-	u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
-	u32 low, high;
+	u64 mcx_cfg;
 
 	/*
 	 * We need to look at the following bits:
@@ -300,11 +299,10 @@ static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
 	if (!mce_flags.succor)
 		return MCE_PANIC_SEVERITY;
 
-	if (rdmsr_safe(addr, &low, &high))
-		return MCE_PANIC_SEVERITY;
+	mcx_cfg = mce_rdmsrl(MSR_AMD64_SMCA_MCx_CONFIG(m->bank));
 
 	/* TCC (Task context corrupt). If set and if IN_KERNEL, panic. */
-	if ((low & MCI_CONFIG_MCAX) &&
+	if ((mcx_cfg & MCI_CONFIG_MCAX) &&
 	    (m->status & MCI_STATUS_TCC) &&
 	    (err_ctx == IN_KERNEL))
 		return MCE_PANIC_SEVERITY;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 04/12] x86/mce: Remove noinstr annotation from mce_setup()
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (2 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 03/12] x86/mce: Use mce_rdmsrl() in severity checking code Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Instead, sandwitch around the call which is done in noinstr context and
mark the caller - mce_gather_info() - as noinstr.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index c660008ccd4a..c7181e8e2845 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -128,7 +128,7 @@ static struct irq_work mce_irq_work;
 BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
 
 /* Do initial initialization of a struct mce */
-noinstr void mce_setup(struct mce *m)
+void mce_setup(struct mce *m)
 {
 	memset(m, 0, sizeof(struct mce));
 	m->cpu = m->extcpu = smp_processor_id();
@@ -433,9 +433,11 @@ static noinstr void mce_wrmsrl(u32 msr, u64 v)
  * check into our "mce" struct so that we can use it later to assess
  * the severity of the problem as we read per-bank specific details.
  */
-static inline void mce_gather_info(struct mce *m, struct pt_regs *regs)
+static noinstr void mce_gather_info(struct mce *m, struct pt_regs *regs)
 {
+	instrumentation_begin();
 	mce_setup(m);
+	instrumentation_end();
 
 	m->mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
 	if (regs) {
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 05/12] x86/mce: Allow instrumentation during task work queueing
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (3 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 04/12] x86/mce: Remove noinstr annotation from mce_setup() Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-09 22:05   ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 06/12] x86/mce: Prevent severity computation from being instrumented Borislav Petkov
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0xdb1: call to queue_task_work() leaves .noinstr.text section

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index c7181e8e2845..65a8be420aff 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1451,6 +1451,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
 	if (worst != MCE_AR_SEVERITY && !kill_current_task)
 		goto out;
 
+	instrumentation_begin();
+
 	/* Fault was in user mode and we need to take some action */
 	if ((m.cs & 3) == 3) {
 		/* If this triggers there is no way to recover. Die hard. */
@@ -1479,6 +1481,9 @@ noinstr void do_machine_check(struct pt_regs *regs)
 		if (m.kflags & MCE_IN_KERNEL_COPYIN)
 			queue_task_work(&m, msg, kill_me_never);
 	}
+
+	instrumentation_end();
+
 out:
 	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
 }
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 06/12] x86/mce: Prevent severity computation from being instrumented
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (4 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 07/12] x86/mce: Mark mce_panic() noinstr Borislav Petkov
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Mark all the MCE severity computation logic noinstr and allow
instrumentation when it "calls out".

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0xc5d: call to mce_severity() leaves .noinstr.text section

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/severity.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index 00e97ebbd94a..03543765e9bf 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -263,24 +263,35 @@ static bool is_copy_from_user(struct pt_regs *regs)
  * distinguish an exception taken in user from from one
  * taken in the kernel.
  */
-static int error_context(struct mce *m, struct pt_regs *regs)
+static noinstr int error_context(struct mce *m, struct pt_regs *regs)
 {
+	int fixup_type;
+	bool copy_user;
+
 	if ((m->cs & 3) == 3)
 		return IN_USER;
+
 	if (!mc_recoverable(m->mcgstatus))
 		return IN_KERNEL;
 
-	switch (ex_get_fixup_type(m->ip)) {
+	instrumentation_begin();
+	fixup_type = ex_get_fixup_type(m->ip);
+	copy_user  = is_copy_from_user(regs);
+	instrumentation_end();
+
+	switch (fixup_type) {
 	case EX_TYPE_UACCESS:
 	case EX_TYPE_COPY:
-		if (!regs || !is_copy_from_user(regs))
+		if (!regs || !copy_user)
 			return IN_KERNEL;
 		m->kflags |= MCE_IN_KERNEL_COPYIN;
 		fallthrough;
+
 	case EX_TYPE_FAULT_MCE_SAFE:
 	case EX_TYPE_DEFAULT_MCE_SAFE:
 		m->kflags |= MCE_IN_KERNEL_RECOV;
 		return IN_KERNEL_RECOV;
+
 	default:
 		return IN_KERNEL;
 	}
@@ -315,8 +326,8 @@ static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
  * See AMD Error Scope Hierarchy table in a newer BKDG. For example
  * 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"
  */
-static int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
-			    char **msg, bool is_excp)
+static noinstr int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
+				    char **msg, bool is_excp)
 {
 	enum context ctx = error_context(m, regs);
 
@@ -368,8 +379,8 @@ static int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
 	return MCE_KEEP_SEVERITY;
 }
 
-static int mce_severity_intel(struct mce *m, struct pt_regs *regs,
-			      int tolerant, char **msg, bool is_excp)
+static noinstr int mce_severity_intel(struct mce *m, struct pt_regs *regs,
+				      int tolerant, char **msg, bool is_excp)
 {
 	enum exception excp = (is_excp ? EXCP_CONTEXT : NO_EXCP);
 	enum context ctx = error_context(m, regs);
@@ -405,8 +416,8 @@ static int mce_severity_intel(struct mce *m, struct pt_regs *regs,
 	}
 }
 
-int mce_severity(struct mce *m, struct pt_regs *regs, int tolerant, char **msg,
-		 bool is_excp)
+int noinstr mce_severity(struct mce *m, struct pt_regs *regs, int tolerant, char **msg,
+			 bool is_excp)
 {
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
 	    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 07/12] x86/mce: Mark mce_panic() noinstr
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (5 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 06/12] x86/mce: Prevent severity computation from being instrumented Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 08/12] x86/mce: Mark mce_end() noinstr Borislav Petkov
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

And allow instrumentation inside it because it does calls to other
facilities which will not be tagged noinstr.

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0xc73: call to mce_panic() leaves .noinstr.text section

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 65a8be420aff..4783b009f5aa 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -267,11 +267,13 @@ static void wait_for_panic(void)
 	panic("Panicing machine check CPU died");
 }
 
-static void mce_panic(const char *msg, struct mce *final, char *exp)
+static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
 {
-	int apei_err = 0;
 	struct llist_node *pending;
 	struct mce_evt_llist *l;
+	int apei_err = 0;
+
+	instrumentation_begin();
 
 	if (!fake_panic) {
 		/*
@@ -286,7 +288,7 @@ static void mce_panic(const char *msg, struct mce *final, char *exp)
 	} else {
 		/* Don't log too much for fake panic */
 		if (atomic_inc_return(&mce_fake_panicked) > 1)
-			return;
+			goto out;
 	}
 	pending = mce_gen_pool_prepare_records();
 	/* First print corrected ones that are still unlogged */
@@ -324,6 +326,9 @@ static void mce_panic(const char *msg, struct mce *final, char *exp)
 		panic(msg);
 	} else
 		pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
+
+out:
+	instrumentation_end();
 }
 
 /* Support code for software error injection */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 08/12] x86/mce: Mark mce_end() noinstr
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (6 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 07/12] x86/mce: Mark mce_panic() noinstr Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 09/12] x86/mce: Mark mce_read_aux() noinstr Borislav Petkov
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

It is called by the #MC handler which is noinstr.

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0xbd6: call to memset() leaves .noinstr.text section

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4783b009f5aa..8b0dd4fdab75 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1060,10 +1060,12 @@ static int mce_start(int *no_way_out)
  * Synchronize between CPUs after main scanning loop.
  * This invokes the bulk of the Monarch processing.
  */
-static int mce_end(int order)
+static noinstr int mce_end(int order)
 {
-	int ret = -1;
 	u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
+	int ret = -1;
+
+	instrumentation_begin();
 
 	if (!timeout)
 		goto reset;
@@ -1104,7 +1106,8 @@ static int mce_end(int order)
 		/*
 		 * Don't reset anything. That's done by the Monarch.
 		 */
-		return 0;
+		ret = 0;
+		goto out;
 	}
 
 	/*
@@ -1120,6 +1123,10 @@ static int mce_end(int order)
 	 * Let others run again.
 	 */
 	atomic_set(&mce_executing, 0);
+
+out:
+	instrumentation_end();
+
 	return ret;
 }
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 09/12] x86/mce: Mark mce_read_aux() noinstr
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (7 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 08/12] x86/mce: Mark mce_end() noinstr Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 10/12] x86/mce: Move the tainting outside of the noinstr region Borislav Petkov
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0x681: call to mce_read_aux() leaves .noinstr.text section

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 8b0dd4fdab75..e31df79f87c7 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -643,7 +643,7 @@ static struct notifier_block mce_default_nb = {
 /*
  * Read ADDR and MISC registers.
  */
-static void mce_read_aux(struct mce *m, int i)
+static noinstr void mce_read_aux(struct mce *m, int i)
 {
 	if (m->status & MCI_STATUS_MISCV)
 		m->misc = mce_rdmsrl(mca_msr_reg(i, MCA_MISC));
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 10/12] x86/mce: Move the tainting outside of the noinstr region
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (8 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 09/12] x86/mce: Mark mce_read_aux() noinstr Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 11/12] x86/mce: Mark mce_timed_out() noinstr Borislav Petkov
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

add_taint() is yet another external facility which the #MC handler
calls. Move that tainting call into the instrumentation-allowed part of
the handler.

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0x617: call to add_taint() leaves .noinstr.text section

While at it, drop noinstr tracking around mce_log().

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0x690: call to mce_log() leaves .noinstr.text section

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index e31df79f87c7..961f80a4bd6f 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1175,13 +1175,14 @@ static noinstr bool mce_check_crashing_cpu(void)
 	return false;
 }
 
-static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *final,
-			    unsigned long *toclear, unsigned long *valid_banks,
-			    int no_way_out, int *worst)
+static __always_inline int
+__mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *final,
+		unsigned long *toclear, unsigned long *valid_banks, int no_way_out,
+		int *worst)
 {
 	struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
 	struct mca_config *cfg = &mca_cfg;
-	int severity, i;
+	int severity, i, taint = 0;
 
 	for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
 		__clear_bit(i, toclear);
@@ -1208,7 +1209,7 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
 			continue;
 
 		/* Set taint even when machine check was not enabled. */
-		add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+		taint++;
 
 		severity = mce_severity(m, regs, cfg->tolerant, NULL, true);
 
@@ -1231,7 +1232,9 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
 		/* assuming valid severity level != 0 */
 		m->severity = severity;
 
+		instrumentation_begin();
 		mce_log(m);
+		instrumentation_end();
 
 		if (severity > *worst) {
 			*final = *m;
@@ -1241,6 +1244,8 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
 
 	/* mce_clear_state will clear *final, save locally for use later */
 	*m = *final;
+
+	return taint;
 }
 
 static void kill_me_now(struct callback_head *ch)
@@ -1349,7 +1354,7 @@ static noinstr void unexpected_machine_check(struct pt_regs *regs)
  */
 noinstr void do_machine_check(struct pt_regs *regs)
 {
-	int worst = 0, order, no_way_out, kill_current_task, lmce;
+	int worst = 0, order, no_way_out, kill_current_task, lmce, taint = 0;
 	DECLARE_BITMAP(valid_banks, MAX_NR_BANKS) = { 0 };
 	DECLARE_BITMAP(toclear, MAX_NR_BANKS) = { 0 };
 	struct mca_config *cfg = &mca_cfg;
@@ -1428,7 +1433,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
 		order = mce_start(&no_way_out);
 	}
 
-	__mc_scan_banks(&m, regs, final, toclear, valid_banks, no_way_out, &worst);
+	taint = __mc_scan_banks(&m, regs, final, toclear, valid_banks, no_way_out, &worst);
 
 	if (!no_way_out)
 		mce_clear_state(toclear);
@@ -1460,11 +1465,14 @@ noinstr void do_machine_check(struct pt_regs *regs)
 		}
 	}
 
+	instrumentation_begin();
+
+	if (taint)
+		add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+
 	if (worst != MCE_AR_SEVERITY && !kill_current_task)
 		goto out;
 
-	instrumentation_begin();
-
 	/* Fault was in user mode and we need to take some action */
 	if ((m.cs & 3) == 3) {
 		/* If this triggers there is no way to recover. Die hard. */
@@ -1494,9 +1502,9 @@ noinstr void do_machine_check(struct pt_regs *regs)
 			queue_task_work(&m, msg, kill_me_never);
 	}
 
+out:
 	instrumentation_end();
 
-out:
 	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
 }
 EXPORT_SYMBOL_GPL(do_machine_check);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 11/12] x86/mce: Mark mce_timed_out() noinstr
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (9 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 10/12] x86/mce: Move the tainting outside of the noinstr region Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-04 14:40 ` [PATCH v0 12/12] x86/mce: Mark mce_start() noinstr Borislav Petkov
  2021-11-09 22:08 ` [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Peter Zijlstra
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0x482: call to mce_timed_out() leaves .noinstr.text section

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 961f80a4bd6f..bc97db90ac0a 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -878,8 +878,12 @@ static cpumask_t mce_missing_cpus = CPU_MASK_ALL;
 /*
  * Check if a timeout waiting for other CPUs happened.
  */
-static int mce_timed_out(u64 *t, const char *msg)
+static noinstr int mce_timed_out(u64 *t, const char *msg)
 {
+	int ret = 0;
+
+	instrumentation_begin();
+
 	/*
 	 * The others already did panic for some reason.
 	 * Bail out like in a timeout.
@@ -899,12 +903,17 @@ static int mce_timed_out(u64 *t, const char *msg)
 			mce_panic(msg, NULL, NULL);
 		}
 		cpu_missing = 1;
-		return 1;
+		ret = 1;
+		goto out;
 	}
 	*t -= SPINUNIT;
+
 out:
 	touch_nmi_watchdog();
-	return 0;
+
+	instrumentation_end();
+
+	return ret;
 }
 
 /*
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v0 12/12] x86/mce: Mark mce_start() noinstr
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (10 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 11/12] x86/mce: Mark mce_timed_out() noinstr Borislav Petkov
@ 2021-11-04 14:40 ` Borislav Petkov
  2021-11-09 22:08 ` [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Peter Zijlstra
  12 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-04 14:40 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0x4ae: call to __const_udelay() leaves .noinstr.text section

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/core.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index bc97db90ac0a..4bec85a07259 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1002,13 +1002,13 @@ static atomic_t global_nwo;
  * in the entry order.
  * TBD double check parallel CPU hotunplug
  */
-static int mce_start(int *no_way_out)
+static noinstr int mce_start(int *no_way_out)
 {
-	int order;
 	u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
+	int order, ret = -1;
 
 	if (!timeout)
-		return -1;
+		return ret;
 
 	atomic_add(*no_way_out, &global_nwo);
 	/*
@@ -1018,6 +1018,8 @@ static int mce_start(int *no_way_out)
 	order = atomic_inc_return(&mce_callin);
 	cpumask_clear_cpu(smp_processor_id(), &mce_missing_cpus);
 
+	instrumentation_begin();
+
 	/*
 	 * Wait for everyone.
 	 */
@@ -1025,7 +1027,7 @@ static int mce_start(int *no_way_out)
 		if (mce_timed_out(&timeout,
 				  "Timeout: Not all CPUs entered broadcast exception handler")) {
 			atomic_set(&global_nwo, 0);
-			return -1;
+			goto out;
 		}
 		ndelay(SPINUNIT);
 	}
@@ -1051,7 +1053,7 @@ static int mce_start(int *no_way_out)
 			if (mce_timed_out(&timeout,
 					  "Timeout: Subject CPUs unable to finish machine check processing")) {
 				atomic_set(&global_nwo, 0);
-				return -1;
+				goto out;
 			}
 			ndelay(SPINUNIT);
 		}
@@ -1062,7 +1064,12 @@ static int mce_start(int *no_way_out)
 	 */
 	*no_way_out = atomic_read(&global_nwo);
 
-	return order;
+	ret = order;
+
+out:
+	instrumentation_end();
+
+	return ret;
 }
 
 /*
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v0 05/12] x86/mce: Allow instrumentation during task work queueing
  2021-11-04 14:40 ` [PATCH v0 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
@ 2021-11-09 22:05   ` Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2021-11-09 22:05 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, LKML

On Thu, Nov 04, 2021 at 03:40:28PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> Fixes
> 
>   vmlinux.o: warning: objtool: do_machine_check()+0xdb1: call to queue_task_work() leaves .noinstr.text section
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  arch/x86/kernel/cpu/mce/core.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index c7181e8e2845..65a8be420aff 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1451,6 +1451,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
>  	if (worst != MCE_AR_SEVERITY && !kill_current_task)
>  		goto out;
>  
> +	instrumentation_begin();

<peterz> so it might be nice to have a comment with each sandwich to
say why sh*t won't go sideways when instrumentation happens in the #MC
at that point
<peterz> like this shuts things up, but really we don't want
instrumentation here either, except blah

note to self: will need to go through the whole set.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v0 00/12] x86/mce: Correct the noinstr annotation
  2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
                   ` (11 preceding siblings ...)
  2021-11-04 14:40 ` [PATCH v0 12/12] x86/mce: Mark mce_start() noinstr Borislav Petkov
@ 2021-11-09 22:08 ` Peter Zijlstra
  12 siblings, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2021-11-09 22:08 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Tony Luck, X86 ML, LKML

On Thu, Nov 04, 2021 at 03:40:23PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> Hi,
> 
> here's a first preliminary (it is based on some random 5.16-rc0 commit
> and is tested only in qemu) of the series which correct all the noinstr
> annotation of the #MC handler.
> 
> Since it calls a bunch of external facilities, the strategy is to mark
> mce-specific functions called by the #MC handler as noinstr and when
> they "call out" so to speak, to do a begin/end sandwich around that
> call.

Some things are obviously fine, like annotating away mce_panic(), I mean
we're going to panic, so who cares if instrumentation is going to make
it explode earlier.

But other things are non-obvious to me; in principle I'm thinking much
of #MC really doesn't want instrumentation because things are fragile
and the more 'crap' runs the more chance we'll trigger a second fail and
blow up the system, right?

Now, MCE code hasn't been 'architected' much, and as such seems to call
out to lots of code, so perhaps put in an explicit comment with
instrumentation_begin()'s you *know* are wrong, but are the best we can
do to shut things up -- for now.

That way we don't forget things are broken and need more work :-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-11-09 22:08 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-04 14:40 [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 02/12] x86/mce: Remove function-local cpus variables Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 03/12] x86/mce: Use mce_rdmsrl() in severity checking code Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 04/12] x86/mce: Remove noinstr annotation from mce_setup() Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
2021-11-09 22:05   ` Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 06/12] x86/mce: Prevent severity computation from being instrumented Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 07/12] x86/mce: Mark mce_panic() noinstr Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 08/12] x86/mce: Mark mce_end() noinstr Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 09/12] x86/mce: Mark mce_read_aux() noinstr Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 10/12] x86/mce: Move the tainting outside of the noinstr region Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 11/12] x86/mce: Mark mce_timed_out() noinstr Borislav Petkov
2021-11-04 14:40 ` [PATCH v0 12/12] x86/mce: Mark mce_start() noinstr Borislav Petkov
2021-11-09 22:08 ` [PATCH v0 00/12] x86/mce: Correct the noinstr annotation Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.