* [PATCH v1 00/12] x86/mce: Correct the noinstr annotation
@ 2021-12-08 11:13 Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
` (12 more replies)
0 siblings, 13 replies; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
Hi folks,
here's v1 with the instrumentation "sandwiching" documented, as
requested by peterz.
Tony, I'd appreciate making sure nothing in them breaks your injection
workflows before I queue them. And it shouldn't but... :-)
Thx.
Changelog:
==========
here's a first preliminary (it is based on some random 5.16-rc0 commit
and is tested only in qemu) of the series which correct all the noinstr
annotation of the #MC handler.
Since it calls a bunch of external facilities, the strategy is to mark
mce-specific functions called by the #MC handler as noinstr and when
they "call out" so to speak, to do a begin/end sandwich around that
call.
Please have a look and let me know if it looks ok-ish.
Further testing will happen after -rc1 releases, thus the "v0" version
here.
Thx.
Borislav Petkov (12):
x86/mce: Do not use memset to clear the banks bitmaps
x86/mce: Remove function-local cpus variables
x86/mce: Use mce_rdmsrl() in severity checking code
x86/mce: Remove noinstr annotation from mce_setup()
x86/mce: Allow instrumentation during task work queueing
x86/mce: Prevent severity computation from being instrumented
x86/mce: Mark mce_panic() noinstr
x86/mce: Mark mce_end() noinstr
x86/mce: Mark mce_read_aux() noinstr
x86/mce: Move the tainting outside of the noinstr region
x86/mce: Mark mce_timed_out() noinstr
x86/mce: Mark mce_start() noinstr
arch/x86/kernel/cpu/mce/core.c | 128 +++++++++++++++++++++--------
arch/x86/kernel/cpu/mce/internal.h | 2 +
arch/x86/kernel/cpu/mce/severity.c | 38 +++++----
3 files changed, 119 insertions(+), 49 deletions(-)
--
2.29.2
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v1 01/12] x86/mce: Do not use memset to clear the banks bitmaps
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 02/12] x86/mce: Remove function-local cpus variables Borislav Petkov
` (11 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
The bitmap is a single unsigned long so no need for the function call.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 30de00fe0d7a..7c264ee18c95 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1336,8 +1336,8 @@ static noinstr void unexpected_machine_check(struct pt_regs *regs)
noinstr void do_machine_check(struct pt_regs *regs)
{
int worst = 0, order, no_way_out, kill_current_task, lmce;
- DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
- DECLARE_BITMAP(toclear, MAX_NR_BANKS);
+ DECLARE_BITMAP(valid_banks, MAX_NR_BANKS) = { 0 };
+ DECLARE_BITMAP(toclear, MAX_NR_BANKS) = { 0 };
struct mca_config *cfg = &mca_cfg;
struct mce m, *final;
char *msg = NULL;
@@ -1381,7 +1381,6 @@ noinstr void do_machine_check(struct pt_regs *regs)
final = this_cpu_ptr(&mces_seen);
*final = m;
- memset(valid_banks, 0, sizeof(valid_banks));
no_way_out = mce_no_way_out(&m, &msg, valid_banks, regs);
barrier();
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 02/12] x86/mce: Remove function-local cpus variables
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 03/12] x86/mce: Use mce_rdmsrl() in severity checking code Borislav Petkov
` (10 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
Use num_online_cpus() directly.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7c264ee18c95..f15efa242f44 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -985,7 +985,6 @@ static atomic_t global_nwo;
static int mce_start(int *no_way_out)
{
int order;
- int cpus = num_online_cpus();
u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
if (!timeout)
@@ -1002,7 +1001,7 @@ static int mce_start(int *no_way_out)
/*
* Wait for everyone.
*/
- while (atomic_read(&mce_callin) != cpus) {
+ while (atomic_read(&mce_callin) != num_online_cpus()) {
if (mce_timed_out(&timeout,
"Timeout: Not all CPUs entered broadcast exception handler")) {
atomic_set(&global_nwo, 0);
@@ -1066,14 +1065,11 @@ static int mce_end(int order)
atomic_inc(&mce_executing);
if (order == 1) {
- /* CHECKME: Can this race with a parallel hotplug? */
- int cpus = num_online_cpus();
-
/*
* Monarch: Wait for everyone to go through their scanning
* loops.
*/
- while (atomic_read(&mce_executing) <= cpus) {
+ while (atomic_read(&mce_executing) <= num_online_cpus()) {
if (mce_timed_out(&timeout,
"Timeout: Monarch CPU unable to finish machine check processing"))
goto reset;
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 03/12] x86/mce: Use mce_rdmsrl() in severity checking code
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 02/12] x86/mce: Remove function-local cpus variables Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 04/12] x86/mce: Remove noinstr annotation from mce_setup() Borislav Petkov
` (9 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
MCA has its own special MSR accessors. Use them.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 2 +-
arch/x86/kernel/cpu/mce/internal.h | 2 ++
arch/x86/kernel/cpu/mce/severity.c | 8 +++-----
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f15efa242f44..87a277fc80b4 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -362,7 +362,7 @@ void ex_handler_msr_mce(struct pt_regs *regs, bool wrmsr)
}
/* MSR access wrappers used for error injection */
-static noinstr u64 mce_rdmsrl(u32 msr)
+noinstr u64 mce_rdmsrl(u32 msr)
{
DECLARE_ARGS(val, low, high);
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index acd61c41846c..52c633950b38 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -207,4 +207,6 @@ static inline void pentium_machine_check(struct pt_regs *regs) {}
static inline void winchip_machine_check(struct pt_regs *regs) {}
#endif
+noinstr u64 mce_rdmsrl(u32 msr);
+
#endif /* __X86_MCE_INTERNAL_H__ */
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index bb019a594a2c..00e97ebbd94a 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -288,8 +288,7 @@ static int error_context(struct mce *m, struct pt_regs *regs)
static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
{
- u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
- u32 low, high;
+ u64 mcx_cfg;
/*
* We need to look at the following bits:
@@ -300,11 +299,10 @@ static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
if (!mce_flags.succor)
return MCE_PANIC_SEVERITY;
- if (rdmsr_safe(addr, &low, &high))
- return MCE_PANIC_SEVERITY;
+ mcx_cfg = mce_rdmsrl(MSR_AMD64_SMCA_MCx_CONFIG(m->bank));
/* TCC (Task context corrupt). If set and if IN_KERNEL, panic. */
- if ((low & MCI_CONFIG_MCAX) &&
+ if ((mcx_cfg & MCI_CONFIG_MCAX) &&
(m->status & MCI_STATUS_TCC) &&
(err_ctx == IN_KERNEL))
return MCE_PANIC_SEVERITY;
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 04/12] x86/mce: Remove noinstr annotation from mce_setup()
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (2 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 03/12] x86/mce: Use mce_rdmsrl() in severity checking code Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-09 12:16 ` Peter Zijlstra
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
` (8 subsequent siblings)
12 siblings, 2 replies; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
Instead, sandwitch around the call which is done in noinstr context and
mark the caller - mce_gather_info() - as noinstr.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 87a277fc80b4..f61f14faa532 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -127,7 +127,7 @@ static struct irq_work mce_irq_work;
BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
/* Do initial initialization of a struct mce */
-noinstr void mce_setup(struct mce *m)
+void mce_setup(struct mce *m)
{
memset(m, 0, sizeof(struct mce));
m->cpu = m->extcpu = smp_processor_id();
@@ -430,9 +430,15 @@ static noinstr void mce_wrmsrl(u32 msr, u64 v)
* check into our "mce" struct so that we can use it later to assess
* the severity of the problem as we read per-bank specific details.
*/
-static inline void mce_gather_info(struct mce *m, struct pt_regs *regs)
+static noinstr void mce_gather_info(struct mce *m, struct pt_regs *regs)
{
+ /*
+ * Enable instrumentation around mce_setup() which calls external
+ * facilities.
+ */
+ instrumentation_begin();
mce_setup(m);
+ instrumentation_end();
m->mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
if (regs) {
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 05/12] x86/mce: Allow instrumentation during task work queueing
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (3 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 04/12] x86/mce: Remove noinstr annotation from mce_setup() Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-09 12:17 ` Peter Zijlstra
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 06/12] x86/mce: Prevent severity computation from being instrumented Borislav Petkov
` (7 subsequent siblings)
12 siblings, 2 replies; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0xdb1: call to queue_task_work() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f61f14faa532..d7fa444d6282 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1451,6 +1451,14 @@ noinstr void do_machine_check(struct pt_regs *regs)
if (worst != MCE_AR_SEVERITY && !kill_current_task)
goto out;
+ /*
+ * Enable instrumentation around the external facilities like
+ * task_work_add() (via queue_task_work()), fixup_exception() etc.
+ * For now, that is. Fixing this properly would need a lot more involved
+ * reorganization.
+ */
+ instrumentation_begin();
+
/* Fault was in user mode and we need to take some action */
if ((m.cs & 3) == 3) {
/* If this triggers there is no way to recover. Die hard. */
@@ -1479,6 +1487,9 @@ noinstr void do_machine_check(struct pt_regs *regs)
if (m.kflags & MCE_IN_KERNEL_COPYIN)
queue_task_work(&m, msg, kill_me_never);
}
+
+ instrumentation_end();
+
out:
mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
}
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 06/12] x86/mce: Prevent severity computation from being instrumented
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (4 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 07/12] x86/mce: Mark mce_panic() noinstr Borislav Petkov
` (6 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
Mark all the MCE severity computation logic noinstr and allow
instrumentation when it "calls out".
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0xc5d: call to mce_severity() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/severity.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index 00e97ebbd94a..a32646769705 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -263,24 +263,36 @@ static bool is_copy_from_user(struct pt_regs *regs)
* distinguish an exception taken in user from from one
* taken in the kernel.
*/
-static int error_context(struct mce *m, struct pt_regs *regs)
+static noinstr int error_context(struct mce *m, struct pt_regs *regs)
{
+ int fixup_type;
+ bool copy_user;
+
if ((m->cs & 3) == 3)
return IN_USER;
+
if (!mc_recoverable(m->mcgstatus))
return IN_KERNEL;
- switch (ex_get_fixup_type(m->ip)) {
+ /* Allow instrumentation around external facilities usage. */
+ instrumentation_begin();
+ fixup_type = ex_get_fixup_type(m->ip);
+ copy_user = is_copy_from_user(regs);
+ instrumentation_end();
+
+ switch (fixup_type) {
case EX_TYPE_UACCESS:
case EX_TYPE_COPY:
- if (!regs || !is_copy_from_user(regs))
+ if (!regs || !copy_user)
return IN_KERNEL;
m->kflags |= MCE_IN_KERNEL_COPYIN;
fallthrough;
+
case EX_TYPE_FAULT_MCE_SAFE:
case EX_TYPE_DEFAULT_MCE_SAFE:
m->kflags |= MCE_IN_KERNEL_RECOV;
return IN_KERNEL_RECOV;
+
default:
return IN_KERNEL;
}
@@ -315,8 +327,8 @@ static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
* See AMD Error Scope Hierarchy table in a newer BKDG. For example
* 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"
*/
-static int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
- char **msg, bool is_excp)
+static noinstr int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
+ char **msg, bool is_excp)
{
enum context ctx = error_context(m, regs);
@@ -368,8 +380,8 @@ static int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
return MCE_KEEP_SEVERITY;
}
-static int mce_severity_intel(struct mce *m, struct pt_regs *regs,
- int tolerant, char **msg, bool is_excp)
+static noinstr int mce_severity_intel(struct mce *m, struct pt_regs *regs,
+ int tolerant, char **msg, bool is_excp)
{
enum exception excp = (is_excp ? EXCP_CONTEXT : NO_EXCP);
enum context ctx = error_context(m, regs);
@@ -405,8 +417,8 @@ static int mce_severity_intel(struct mce *m, struct pt_regs *regs,
}
}
-int mce_severity(struct mce *m, struct pt_regs *regs, int tolerant, char **msg,
- bool is_excp)
+int noinstr mce_severity(struct mce *m, struct pt_regs *regs, int tolerant, char **msg,
+ bool is_excp)
{
if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 07/12] x86/mce: Mark mce_panic() noinstr
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (5 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 06/12] x86/mce: Prevent severity computation from being instrumented Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 08/12] x86/mce: Mark mce_end() noinstr Borislav Petkov
` (5 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
And allow instrumentation inside it because it does calls to other
facilities which will not be tagged noinstr.
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0xc73: call to mce_panic() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d7fa444d6282..d81266885de3 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -266,11 +266,17 @@ static void wait_for_panic(void)
panic("Panicing machine check CPU died");
}
-static void mce_panic(const char *msg, struct mce *final, char *exp)
+static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
{
- int apei_err = 0;
struct llist_node *pending;
struct mce_evt_llist *l;
+ int apei_err = 0;
+
+ /*
+ * Allow instrumentation around external facilities usage. Not that it
+ * matters a whole lot since the machine is going to panic anyway.
+ */
+ instrumentation_begin();
if (!fake_panic) {
/*
@@ -285,7 +291,7 @@ static void mce_panic(const char *msg, struct mce *final, char *exp)
} else {
/* Don't log too much for fake panic */
if (atomic_inc_return(&mce_fake_panicked) > 1)
- return;
+ goto out;
}
pending = mce_gen_pool_prepare_records();
/* First print corrected ones that are still unlogged */
@@ -321,6 +327,9 @@ static void mce_panic(const char *msg, struct mce *final, char *exp)
panic(msg);
} else
pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
+
+out:
+ instrumentation_end();
}
/* Support code for software error injection */
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 08/12] x86/mce: Mark mce_end() noinstr
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (6 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 07/12] x86/mce: Mark mce_panic() noinstr Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 09/12] x86/mce: Mark mce_read_aux() noinstr Borislav Petkov
` (4 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
It is called by the #MC handler which is noinstr.
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0xbd6: call to memset() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d81266885de3..e86334925797 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1064,10 +1064,13 @@ static int mce_start(int *no_way_out)
* Synchronize between CPUs after main scanning loop.
* This invokes the bulk of the Monarch processing.
*/
-static int mce_end(int order)
+static noinstr int mce_end(int order)
{
- int ret = -1;
u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
+ int ret = -1;
+
+ /* Allow instrumentation around external facilities. */
+ instrumentation_begin();
if (!timeout)
goto reset;
@@ -1108,7 +1111,8 @@ static int mce_end(int order)
/*
* Don't reset anything. That's done by the Monarch.
*/
- return 0;
+ ret = 0;
+ goto out;
}
/*
@@ -1124,6 +1128,10 @@ static int mce_end(int order)
* Let others run again.
*/
atomic_set(&mce_executing, 0);
+
+out:
+ instrumentation_end();
+
return ret;
}
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 09/12] x86/mce: Mark mce_read_aux() noinstr
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (7 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 08/12] x86/mce: Mark mce_end() noinstr Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 10/12] x86/mce: Move the tainting outside of the noinstr region Borislav Petkov
` (3 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x681: call to mce_read_aux() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index e86334925797..eafa7f350fa2 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -648,7 +648,7 @@ static struct notifier_block mce_default_nb = {
/*
* Read ADDR and MISC registers.
*/
-static void mce_read_aux(struct mce *m, int i)
+static noinstr void mce_read_aux(struct mce *m, int i)
{
if (m->status & MCI_STATUS_MISCV)
m->misc = mce_rdmsrl(mca_msr_reg(i, MCA_MISC));
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 10/12] x86/mce: Move the tainting outside of the noinstr region
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (8 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 09/12] x86/mce: Mark mce_read_aux() noinstr Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 11/12] x86/mce: Mark mce_timed_out() noinstr Borislav Petkov
` (2 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
add_taint() is yet another external facility which the #MC handler
calls. Move that tainting call into the instrumentation-allowed part of
the handler.
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x617: call to add_taint() leaves .noinstr.text section
While at it, allow instrumentation around the mce_log() call.
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x690: call to mce_log() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 41 +++++++++++++++++++++-------------
1 file changed, 26 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index eafa7f350fa2..782724b1204f 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1180,13 +1180,14 @@ static noinstr bool mce_check_crashing_cpu(void)
return false;
}
-static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *final,
- unsigned long *toclear, unsigned long *valid_banks,
- int no_way_out, int *worst)
+static __always_inline int
+__mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *final,
+ unsigned long *toclear, unsigned long *valid_banks, int no_way_out,
+ int *worst)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
struct mca_config *cfg = &mca_cfg;
- int severity, i;
+ int severity, i, taint = 0;
for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
__clear_bit(i, toclear);
@@ -1213,7 +1214,7 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
continue;
/* Set taint even when machine check was not enabled. */
- add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+ taint++;
severity = mce_severity(m, regs, cfg->tolerant, NULL, true);
@@ -1236,7 +1237,13 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
/* assuming valid severity level != 0 */
m->severity = severity;
+ /*
+ * Enable instrumentation around the mce_log() call which is
+ * done in #MC context, where instrumentation is disabled.
+ */
+ instrumentation_begin();
mce_log(m);
+ instrumentation_end();
if (severity > *worst) {
*final = *m;
@@ -1246,6 +1253,8 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
/* mce_clear_state will clear *final, save locally for use later */
*m = *final;
+
+ return taint;
}
static void kill_me_now(struct callback_head *ch)
@@ -1354,7 +1363,7 @@ static noinstr void unexpected_machine_check(struct pt_regs *regs)
*/
noinstr void do_machine_check(struct pt_regs *regs)
{
- int worst = 0, order, no_way_out, kill_current_task, lmce;
+ int worst = 0, order, no_way_out, kill_current_task, lmce, taint = 0;
DECLARE_BITMAP(valid_banks, MAX_NR_BANKS) = { 0 };
DECLARE_BITMAP(toclear, MAX_NR_BANKS) = { 0 };
struct mca_config *cfg = &mca_cfg;
@@ -1433,7 +1442,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
order = mce_start(&no_way_out);
}
- __mc_scan_banks(&m, regs, final, toclear, valid_banks, no_way_out, &worst);
+ taint = __mc_scan_banks(&m, regs, final, toclear, valid_banks, no_way_out, &worst);
if (!no_way_out)
mce_clear_state(toclear);
@@ -1465,17 +1474,19 @@ noinstr void do_machine_check(struct pt_regs *regs)
}
}
- if (worst != MCE_AR_SEVERITY && !kill_current_task)
- goto out;
-
/*
- * Enable instrumentation around the external facilities like
- * task_work_add() (via queue_task_work()), fixup_exception() etc.
- * For now, that is. Fixing this properly would need a lot more involved
- * reorganization.
+ * Enable instrumentation around the external facilities like task_work_add()
+ * (via queue_task_work()), fixup_exception() etc. For now, that is. Fixing this
+ * properly would need a lot more involved reorganization.
*/
instrumentation_begin();
+ if (taint)
+ add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+
+ if (worst != MCE_AR_SEVERITY && !kill_current_task)
+ goto out;
+
/* Fault was in user mode and we need to take some action */
if ((m.cs & 3) == 3) {
/* If this triggers there is no way to recover. Die hard. */
@@ -1505,9 +1516,9 @@ noinstr void do_machine_check(struct pt_regs *regs)
queue_task_work(&m, msg, kill_me_never);
}
+out:
instrumentation_end();
-out:
mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
}
EXPORT_SYMBOL_GPL(do_machine_check);
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 11/12] x86/mce: Mark mce_timed_out() noinstr
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (9 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 10/12] x86/mce: Move the tainting outside of the noinstr region Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 12/12] x86/mce: Mark mce_start() noinstr Borislav Petkov
2021-12-09 12:19 ` [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Peter Zijlstra
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x482: call to mce_timed_out() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 782724b1204f..3fe1d6614c22 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -883,8 +883,13 @@ static cpumask_t mce_missing_cpus = CPU_MASK_ALL;
/*
* Check if a timeout waiting for other CPUs happened.
*/
-static int mce_timed_out(u64 *t, const char *msg)
+static noinstr int mce_timed_out(u64 *t, const char *msg)
{
+ int ret = 0;
+
+ /* Enable instrumentation around calls to external facilities */
+ instrumentation_begin();
+
/*
* The others already did panic for some reason.
* Bail out like in a timeout.
@@ -903,12 +908,17 @@ static int mce_timed_out(u64 *t, const char *msg)
cpumask_pr_args(&mce_missing_cpus));
mce_panic(msg, NULL, NULL);
}
- return 1;
+ ret = 1;
+ goto out;
}
*t -= SPINUNIT;
+
out:
touch_nmi_watchdog();
- return 0;
+
+ instrumentation_end();
+
+ return ret;
}
/*
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v1 12/12] x86/mce: Mark mce_start() noinstr
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (10 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 11/12] x86/mce: Mark mce_timed_out() noinstr Borislav Petkov
@ 2021-12-08 11:13 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-09 12:19 ` [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Peter Zijlstra
12 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-12-08 11:13 UTC (permalink / raw)
To: Tony Luck; +Cc: X86 ML, LKML
From: Borislav Petkov <bp@suse.de>
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x4ae: call to __const_udelay() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mce/core.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 3fe1d6614c22..aa47ff12100f 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1007,13 +1007,13 @@ static atomic_t global_nwo;
* in the entry order.
* TBD double check parallel CPU hotunplug
*/
-static int mce_start(int *no_way_out)
+static noinstr int mce_start(int *no_way_out)
{
- int order;
u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
+ int order, ret = -1;
if (!timeout)
- return -1;
+ return ret;
atomic_add(*no_way_out, &global_nwo);
/*
@@ -1023,6 +1023,9 @@ static int mce_start(int *no_way_out)
order = atomic_inc_return(&mce_callin);
cpumask_clear_cpu(smp_processor_id(), &mce_missing_cpus);
+ /* Enable instrumentation around calls to external facilities */
+ instrumentation_begin();
+
/*
* Wait for everyone.
*/
@@ -1030,7 +1033,7 @@ static int mce_start(int *no_way_out)
if (mce_timed_out(&timeout,
"Timeout: Not all CPUs entered broadcast exception handler")) {
atomic_set(&global_nwo, 0);
- return -1;
+ goto out;
}
ndelay(SPINUNIT);
}
@@ -1056,7 +1059,7 @@ static int mce_start(int *no_way_out)
if (mce_timed_out(&timeout,
"Timeout: Subject CPUs unable to finish machine check processing")) {
atomic_set(&global_nwo, 0);
- return -1;
+ goto out;
}
ndelay(SPINUNIT);
}
@@ -1067,7 +1070,12 @@ static int mce_start(int *no_way_out)
*/
*no_way_out = atomic_read(&global_nwo);
- return order;
+ ret = order;
+
+out:
+ instrumentation_end();
+
+ return ret;
}
/*
--
2.29.2
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH v1 04/12] x86/mce: Remove noinstr annotation from mce_setup()
2021-12-08 11:13 ` [PATCH v1 04/12] x86/mce: Remove noinstr annotation from mce_setup() Borislav Petkov
@ 2021-12-09 12:16 ` Peter Zijlstra
2021-12-10 14:18 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
1 sibling, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2021-12-09 12:16 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Tony Luck, X86 ML, LKML
On Wed, Dec 08, 2021 at 12:13:35PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
>
> Instead, sandwitch around the call which is done in noinstr context and
> mark the caller - mce_gather_info() - as noinstr.
>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
> arch/x86/kernel/cpu/mce/core.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 87a277fc80b4..f61f14faa532 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -127,7 +127,7 @@ static struct irq_work mce_irq_work;
> BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
>
> /* Do initial initialization of a struct mce */
> -noinstr void mce_setup(struct mce *m)
> +void mce_setup(struct mce *m)
> {
> memset(m, 0, sizeof(struct mce));
> m->cpu = m->extcpu = smp_processor_id();
> @@ -430,9 +430,15 @@ static noinstr void mce_wrmsrl(u32 msr, u64 v)
> * check into our "mce" struct so that we can use it later to assess
> * the severity of the problem as we read per-bank specific details.
> */
> -static inline void mce_gather_info(struct mce *m, struct pt_regs *regs)
> +static noinstr void mce_gather_info(struct mce *m, struct pt_regs *regs)
> {
> + /*
> + * Enable instrumentation around mce_setup() which calls external
> + * facilities.
Yeah, that's what it does; but *why* is that correct? I'm thinking we're
well past the exception entry code and are only using noinstr as a means
to limit the amount of code in the MCE handler?
> + */
> + instrumentation_begin();
> mce_setup(m);
> + instrumentation_end();
>
> m->mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
> if (regs) {
> --
> 2.29.2
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v1 05/12] x86/mce: Allow instrumentation during task work queueing
2021-12-08 11:13 ` [PATCH v1 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
@ 2021-12-09 12:17 ` Peter Zijlstra
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
1 sibling, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2021-12-09 12:17 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Tony Luck, X86 ML, LKML
On Wed, Dec 08, 2021 at 12:13:36PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
>
> Fixes
>
> vmlinux.o: warning: objtool: do_machine_check()+0xdb1: call to queue_task_work() leaves .noinstr.text section
>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
> arch/x86/kernel/cpu/mce/core.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index f61f14faa532..d7fa444d6282 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1451,6 +1451,14 @@ noinstr void do_machine_check(struct pt_regs *regs)
> if (worst != MCE_AR_SEVERITY && !kill_current_task)
> goto out;
>
> + /*
> + * Enable instrumentation around the external facilities like
> + * task_work_add() (via queue_task_work()), fixup_exception() etc.
> + * For now, that is. Fixing this properly would need a lot more involved
> + * reorganization.
The only reason we want to fix that is to limit the amount of code MCE
runs, or is there additional concerns, like perhaps getting #DB traps
from !noinstr code?
> + */
> + instrumentation_begin();
> +
> /* Fault was in user mode and we need to take some action */
> if ((m.cs & 3) == 3) {
> /* If this triggers there is no way to recover. Die hard. */
> @@ -1479,6 +1487,9 @@ noinstr void do_machine_check(struct pt_regs *regs)
> if (m.kflags & MCE_IN_KERNEL_COPYIN)
> queue_task_work(&m, msg, kill_me_never);
> }
> +
> + instrumentation_end();
> +
> out:
> mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
> }
> --
> 2.29.2
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v1 00/12] x86/mce: Correct the noinstr annotation
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
` (11 preceding siblings ...)
2021-12-08 11:13 ` [PATCH v1 12/12] x86/mce: Mark mce_start() noinstr Borislav Petkov
@ 2021-12-09 12:19 ` Peter Zijlstra
12 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2021-12-09 12:19 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Tony Luck, X86 ML, LKML
On Wed, Dec 08, 2021 at 12:13:31PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
>
> Hi folks,
>
> here's v1 with the instrumentation "sandwiching" documented, as
> requested by peterz.
>
> Tony, I'd appreciate making sure nothing in them breaks your injection
> workflows before I queue them. And it shouldn't but... :-)
>
> Thx.
>
> Changelog:
> ==========
>
> here's a first preliminary (it is based on some random 5.16-rc0 commit
> and is tested only in qemu) of the series which correct all the noinstr
> annotation of the #MC handler.
>
> Since it calls a bunch of external facilities, the strategy is to mark
> mce-specific functions called by the #MC handler as noinstr and when
> they "call out" so to speak, to do a begin/end sandwich around that
> call.
>
Looked ok. I think some of the comments can be improved to clarify the
noinstr usage in MCE in general, but that's about it.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v1 04/12] x86/mce: Remove noinstr annotation from mce_setup()
2021-12-09 12:16 ` Peter Zijlstra
@ 2021-12-10 14:18 ` Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: Borislav Petkov @ 2021-12-10 14:18 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Tony Luck, X86 ML, LKML
On Thu, Dec 09, 2021 at 01:16:56PM +0100, Peter Zijlstra wrote:
> Yeah, that's what it does; but *why* is that correct?
Correct, shmorect - it is the #MC handler nasty.
> I'm thinking we're well past the exception entry code and are only
> using noinstr as a means to limit the amount of code in the MCE
> handler?
Well, one of the calls to mce_gather_info() happen in #MC context. That
one calls mce_setup() and that thing calls out to
# ./arch/x86/include/asm/paravirt.h:116: PVOP_VCALL4(cpu.cpuid, eax, ebx, ecx, edx);
cmpq $0, pv_ops+176(%rip) #, pv_ops.cpu.cpuid
and you get this:
vmlinux.o: warning: objtool: pv_ops[22]: xen_cpuid
vmlinux.o: warning: objtool: pv_ops[22]: native_cpuid
vmlinux.o: warning: objtool: mce_setup()+0xa0: call to pv_ops[22]() leaves .noinstr.text section
I think this is too much and too specific text to stick in the code as a
comment.
I can stick it in the commit message if you prefer that but frankly,
seeing those instrumentation_begin/_end() sandwiches are already hints
enough in my head to read "TODO" there...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Mark mce_start() noinstr
2021-12-08 11:13 ` [PATCH v1 12/12] x86/mce: Mark mce_start() noinstr Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: e3d72e8eee53c55835ab4664920d83400ff5d6c2
Gitweb: https://git.kernel.org/tip/e3d72e8eee53c55835ab4664920d83400ff5d6c2
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Wed, 03 Nov 2021 10:30:29 +01:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:14:05 +01:00
x86/mce: Mark mce_start() noinstr
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x4ae: call to __const_udelay() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-13-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7023d65..5818b83 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1007,13 +1007,13 @@ static atomic_t global_nwo;
* in the entry order.
* TBD double check parallel CPU hotunplug
*/
-static int mce_start(int *no_way_out)
+static noinstr int mce_start(int *no_way_out)
{
- int order;
u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
+ int order, ret = -1;
if (!timeout)
- return -1;
+ return ret;
atomic_add(*no_way_out, &global_nwo);
/*
@@ -1023,6 +1023,9 @@ static int mce_start(int *no_way_out)
order = atomic_inc_return(&mce_callin);
cpumask_clear_cpu(smp_processor_id(), &mce_missing_cpus);
+ /* Enable instrumentation around calls to external facilities */
+ instrumentation_begin();
+
/*
* Wait for everyone.
*/
@@ -1030,7 +1033,7 @@ static int mce_start(int *no_way_out)
if (mce_timed_out(&timeout,
"Timeout: Not all CPUs entered broadcast exception handler")) {
atomic_set(&global_nwo, 0);
- return -1;
+ goto out;
}
ndelay(SPINUNIT);
}
@@ -1056,7 +1059,7 @@ static int mce_start(int *no_way_out)
if (mce_timed_out(&timeout,
"Timeout: Subject CPUs unable to finish machine check processing")) {
atomic_set(&global_nwo, 0);
- return -1;
+ goto out;
}
ndelay(SPINUNIT);
}
@@ -1067,7 +1070,12 @@ static int mce_start(int *no_way_out)
*/
*no_way_out = atomic_read(&global_nwo);
- return order;
+ ret = order;
+
+out:
+ instrumentation_end();
+
+ return ret;
}
/*
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Mark mce_timed_out() noinstr
2021-12-08 11:13 ` [PATCH v1 11/12] x86/mce: Mark mce_timed_out() noinstr Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: edb3d07e2403abd13fc664e8b6f23ea7efb52747
Gitweb: https://git.kernel.org/tip/edb3d07e2403abd13fc664e8b6f23ea7efb52747
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 02 Nov 2021 22:25:12 +01:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:13:54 +01:00
x86/mce: Mark mce_timed_out() noinstr
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x482: call to mce_timed_out() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-12-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 044c94b..7023d65 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -883,8 +883,13 @@ static cpumask_t mce_missing_cpus = CPU_MASK_ALL;
/*
* Check if a timeout waiting for other CPUs happened.
*/
-static int mce_timed_out(u64 *t, const char *msg)
+static noinstr int mce_timed_out(u64 *t, const char *msg)
{
+ int ret = 0;
+
+ /* Enable instrumentation around calls to external facilities */
+ instrumentation_begin();
+
/*
* The others already did panic for some reason.
* Bail out like in a timeout.
@@ -903,12 +908,17 @@ static int mce_timed_out(u64 *t, const char *msg)
cpumask_pr_args(&mce_missing_cpus));
mce_panic(msg, NULL, NULL);
}
- return 1;
+ ret = 1;
+ goto out;
}
*t -= SPINUNIT;
+
out:
touch_nmi_watchdog();
- return 0;
+
+ instrumentation_end();
+
+ return ret;
}
/*
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Move the tainting outside of the noinstr region
2021-12-08 11:13 ` [PATCH v1 10/12] x86/mce: Move the tainting outside of the noinstr region Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 75581a203e63210aabb1336c8c9cb65a7858b596
Gitweb: https://git.kernel.org/tip/75581a203e63210aabb1336c8c9cb65a7858b596
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 02 Nov 2021 08:14:44 +01:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:13:35 +01:00
x86/mce: Move the tainting outside of the noinstr region
add_taint() is yet another external facility which the #MC handler
calls. Move that tainting call into the instrumentation-allowed part of
the handler.
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x617: call to add_taint() leaves .noinstr.text section
While at it, allow instrumentation around the mce_log() call.
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x690: call to mce_log() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-11-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 41 ++++++++++++++++++++-------------
1 file changed, 26 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 53b4cfc..044c94b 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1180,13 +1180,14 @@ static noinstr bool mce_check_crashing_cpu(void)
return false;
}
-static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *final,
- unsigned long *toclear, unsigned long *valid_banks,
- int no_way_out, int *worst)
+static __always_inline int
+__mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *final,
+ unsigned long *toclear, unsigned long *valid_banks, int no_way_out,
+ int *worst)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
struct mca_config *cfg = &mca_cfg;
- int severity, i;
+ int severity, i, taint = 0;
for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
__clear_bit(i, toclear);
@@ -1213,7 +1214,7 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
continue;
/* Set taint even when machine check was not enabled. */
- add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+ taint++;
severity = mce_severity(m, regs, cfg->tolerant, NULL, true);
@@ -1236,7 +1237,13 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
/* assuming valid severity level != 0 */
m->severity = severity;
+ /*
+ * Enable instrumentation around the mce_log() call which is
+ * done in #MC context, where instrumentation is disabled.
+ */
+ instrumentation_begin();
mce_log(m);
+ instrumentation_end();
if (severity > *worst) {
*final = *m;
@@ -1246,6 +1253,8 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
/* mce_clear_state will clear *final, save locally for use later */
*m = *final;
+
+ return taint;
}
static void kill_me_now(struct callback_head *ch)
@@ -1362,7 +1371,7 @@ static noinstr void unexpected_machine_check(struct pt_regs *regs)
*/
noinstr void do_machine_check(struct pt_regs *regs)
{
- int worst = 0, order, no_way_out, kill_current_task, lmce;
+ int worst = 0, order, no_way_out, kill_current_task, lmce, taint = 0;
DECLARE_BITMAP(valid_banks, MAX_NR_BANKS) = { 0 };
DECLARE_BITMAP(toclear, MAX_NR_BANKS) = { 0 };
struct mca_config *cfg = &mca_cfg;
@@ -1441,7 +1450,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
order = mce_start(&no_way_out);
}
- __mc_scan_banks(&m, regs, final, toclear, valid_banks, no_way_out, &worst);
+ taint = __mc_scan_banks(&m, regs, final, toclear, valid_banks, no_way_out, &worst);
if (!no_way_out)
mce_clear_state(toclear);
@@ -1473,17 +1482,19 @@ noinstr void do_machine_check(struct pt_regs *regs)
}
}
- if (worst != MCE_AR_SEVERITY && !kill_current_task)
- goto out;
-
/*
- * Enable instrumentation around the external facilities like
- * task_work_add() (via queue_task_work()), fixup_exception() etc.
- * For now, that is. Fixing this properly would need a lot more involved
- * reorganization.
+ * Enable instrumentation around the external facilities like task_work_add()
+ * (via queue_task_work()), fixup_exception() etc. For now, that is. Fixing this
+ * properly would need a lot more involved reorganization.
*/
instrumentation_begin();
+ if (taint)
+ add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+
+ if (worst != MCE_AR_SEVERITY && !kill_current_task)
+ goto out;
+
/* Fault was in user mode and we need to take some action */
if ((m.cs & 3) == 3) {
/* If this triggers there is no way to recover. Die hard. */
@@ -1513,9 +1524,9 @@ noinstr void do_machine_check(struct pt_regs *regs)
queue_task_work(&m, msg, kill_me_never);
}
+out:
instrumentation_end();
-out:
mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
}
EXPORT_SYMBOL_GPL(do_machine_check);
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Mark mce_read_aux() noinstr
2021-12-08 11:13 ` [PATCH v1 09/12] x86/mce: Mark mce_read_aux() noinstr Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: db6c996d6ce45dfb44891f0824a65ecec216f47a
Gitweb: https://git.kernel.org/tip/db6c996d6ce45dfb44891f0824a65ecec216f47a
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 02 Nov 2021 11:14:48 +01:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:13:23 +01:00
x86/mce: Mark mce_read_aux() noinstr
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0x681: call to mce_read_aux() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-10-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4bdcca8..53b4cfc 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -648,7 +648,7 @@ static struct notifier_block mce_default_nb = {
/*
* Read ADDR and MISC registers.
*/
-static void mce_read_aux(struct mce *m, int i)
+static noinstr void mce_read_aux(struct mce *m, int i)
{
if (m->status & MCI_STATUS_MISCV)
m->misc = mce_rdmsrl(mca_msr_reg(i, MCA_MISC));
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Mark mce_end() noinstr
2021-12-08 11:13 ` [PATCH v1 08/12] x86/mce: Mark mce_end() noinstr Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: b4813539d37fa31fed62cdfab7bd2dd8929c5b2e
Gitweb: https://git.kernel.org/tip/b4813539d37fa31fed62cdfab7bd2dd8929c5b2e
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 01 Nov 2021 16:43:33 +01:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:13:12 +01:00
x86/mce: Mark mce_end() noinstr
It is called by the #MC handler which is noinstr.
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0xbd6: call to memset() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-9-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index ec0f7bb..4bdcca8 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1064,10 +1064,13 @@ static int mce_start(int *no_way_out)
* Synchronize between CPUs after main scanning loop.
* This invokes the bulk of the Monarch processing.
*/
-static int mce_end(int order)
+static noinstr int mce_end(int order)
{
- int ret = -1;
u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
+ int ret = -1;
+
+ /* Allow instrumentation around external facilities. */
+ instrumentation_begin();
if (!timeout)
goto reset;
@@ -1108,7 +1111,8 @@ static int mce_end(int order)
/*
* Don't reset anything. That's done by the Monarch.
*/
- return 0;
+ ret = 0;
+ goto out;
}
/*
@@ -1124,6 +1128,10 @@ reset:
* Let others run again.
*/
atomic_set(&mce_executing, 0);
+
+out:
+ instrumentation_end();
+
return ret;
}
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Mark mce_panic() noinstr
2021-12-08 11:13 ` [PATCH v1 07/12] x86/mce: Mark mce_panic() noinstr Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 3c7ce80a818fa7950be123cac80cd078e5ac1013
Gitweb: https://git.kernel.org/tip/3c7ce80a818fa7950be123cac80cd078e5ac1013
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 01 Nov 2021 13:39:35 +01:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:13:01 +01:00
x86/mce: Mark mce_panic() noinstr
And allow instrumentation inside it because it does calls to other
facilities which will not be tagged noinstr.
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0xc73: call to mce_panic() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-8-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d788ccc..ec0f7bb 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -266,11 +266,17 @@ static void wait_for_panic(void)
panic("Panicing machine check CPU died");
}
-static void mce_panic(const char *msg, struct mce *final, char *exp)
+static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
{
- int apei_err = 0;
struct llist_node *pending;
struct mce_evt_llist *l;
+ int apei_err = 0;
+
+ /*
+ * Allow instrumentation around external facilities usage. Not that it
+ * matters a whole lot since the machine is going to panic anyway.
+ */
+ instrumentation_begin();
if (!fake_panic) {
/*
@@ -285,7 +291,7 @@ static void mce_panic(const char *msg, struct mce *final, char *exp)
} else {
/* Don't log too much for fake panic */
if (atomic_inc_return(&mce_fake_panicked) > 1)
- return;
+ goto out;
}
pending = mce_gen_pool_prepare_records();
/* First print corrected ones that are still unlogged */
@@ -321,6 +327,9 @@ static void mce_panic(const char *msg, struct mce *final, char *exp)
panic(msg);
} else
pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
+
+out:
+ instrumentation_end();
}
/* Support code for software error injection */
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Prevent severity computation from being instrumented
2021-12-08 11:13 ` [PATCH v1 06/12] x86/mce: Prevent severity computation from being instrumented Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 0a5b288e85bbef5227bb6397e31fcf1d7ba9142a
Gitweb: https://git.kernel.org/tip/0a5b288e85bbef5227bb6397e31fcf1d7ba9142a
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Wed, 13 Oct 2021 09:47:50 +02:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:12:48 +01:00
x86/mce: Prevent severity computation from being instrumented
Mark all the MCE severity computation logic noinstr and allow
instrumentation when it "calls out".
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0xc5d: call to mce_severity() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-7-bp@alien8.de
---
arch/x86/kernel/cpu/mce/severity.c | 30 ++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index 00e97eb..a326467 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -263,24 +263,36 @@ static bool is_copy_from_user(struct pt_regs *regs)
* distinguish an exception taken in user from from one
* taken in the kernel.
*/
-static int error_context(struct mce *m, struct pt_regs *regs)
+static noinstr int error_context(struct mce *m, struct pt_regs *regs)
{
+ int fixup_type;
+ bool copy_user;
+
if ((m->cs & 3) == 3)
return IN_USER;
+
if (!mc_recoverable(m->mcgstatus))
return IN_KERNEL;
- switch (ex_get_fixup_type(m->ip)) {
+ /* Allow instrumentation around external facilities usage. */
+ instrumentation_begin();
+ fixup_type = ex_get_fixup_type(m->ip);
+ copy_user = is_copy_from_user(regs);
+ instrumentation_end();
+
+ switch (fixup_type) {
case EX_TYPE_UACCESS:
case EX_TYPE_COPY:
- if (!regs || !is_copy_from_user(regs))
+ if (!regs || !copy_user)
return IN_KERNEL;
m->kflags |= MCE_IN_KERNEL_COPYIN;
fallthrough;
+
case EX_TYPE_FAULT_MCE_SAFE:
case EX_TYPE_DEFAULT_MCE_SAFE:
m->kflags |= MCE_IN_KERNEL_RECOV;
return IN_KERNEL_RECOV;
+
default:
return IN_KERNEL;
}
@@ -315,8 +327,8 @@ static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
* See AMD Error Scope Hierarchy table in a newer BKDG. For example
* 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"
*/
-static int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
- char **msg, bool is_excp)
+static noinstr int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
+ char **msg, bool is_excp)
{
enum context ctx = error_context(m, regs);
@@ -368,8 +380,8 @@ static int mce_severity_amd(struct mce *m, struct pt_regs *regs, int tolerant,
return MCE_KEEP_SEVERITY;
}
-static int mce_severity_intel(struct mce *m, struct pt_regs *regs,
- int tolerant, char **msg, bool is_excp)
+static noinstr int mce_severity_intel(struct mce *m, struct pt_regs *regs,
+ int tolerant, char **msg, bool is_excp)
{
enum exception excp = (is_excp ? EXCP_CONTEXT : NO_EXCP);
enum context ctx = error_context(m, regs);
@@ -405,8 +417,8 @@ static int mce_severity_intel(struct mce *m, struct pt_regs *regs,
}
}
-int mce_severity(struct mce *m, struct pt_regs *regs, int tolerant, char **msg,
- bool is_excp)
+int noinstr mce_severity(struct mce *m, struct pt_regs *regs, int tolerant, char **msg,
+ bool is_excp)
{
if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Allow instrumentation during task work queueing
2021-12-08 11:13 ` [PATCH v1 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
2021-12-09 12:17 ` Peter Zijlstra
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
1 sibling, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 4fbce464db81a42f9a57ee242d6150ec7f996415
Gitweb: https://git.kernel.org/tip/4fbce464db81a42f9a57ee242d6150ec7f996415
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Wed, 13 Oct 2021 09:07:19 +02:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:12:35 +01:00
x86/mce: Allow instrumentation during task work queueing
Fixes
vmlinux.o: warning: objtool: do_machine_check()+0xdb1: call to queue_task_work() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-6-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index c11fc8d..d788ccc 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1459,6 +1459,14 @@ noinstr void do_machine_check(struct pt_regs *regs)
if (worst != MCE_AR_SEVERITY && !kill_current_task)
goto out;
+ /*
+ * Enable instrumentation around the external facilities like
+ * task_work_add() (via queue_task_work()), fixup_exception() etc.
+ * For now, that is. Fixing this properly would need a lot more involved
+ * reorganization.
+ */
+ instrumentation_begin();
+
/* Fault was in user mode and we need to take some action */
if ((m.cs & 3) == 3) {
/* If this triggers there is no way to recover. Die hard. */
@@ -1487,6 +1495,9 @@ noinstr void do_machine_check(struct pt_regs *regs)
if (m.kflags & MCE_IN_KERNEL_COPYIN)
queue_task_work(&m, msg, kill_me_never);
}
+
+ instrumentation_end();
+
out:
mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
}
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Remove noinstr annotation from mce_setup()
2021-12-08 11:13 ` [PATCH v1 04/12] x86/mce: Remove noinstr annotation from mce_setup() Borislav Petkov
2021-12-09 12:16 ` Peter Zijlstra
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
1 sibling, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 487d654db3edacc31dee86b10258cc740640fad8
Gitweb: https://git.kernel.org/tip/487d654db3edacc31dee86b10258cc740640fad8
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 05 Oct 2021 19:54:47 +02:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:12:21 +01:00
x86/mce: Remove noinstr annotation from mce_setup()
Instead, sandwitch around the call which is done in noinstr context and
mark the caller - mce_gather_info() - as noinstr.
Also, document what the whole instrumentation strategy with #MC is going
to be in the future and where it all is supposed to be going to.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-5-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 87a277f..c11fc8d 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -127,7 +127,7 @@ static struct irq_work mce_irq_work;
BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
/* Do initial initialization of a struct mce */
-noinstr void mce_setup(struct mce *m)
+void mce_setup(struct mce *m)
{
memset(m, 0, sizeof(struct mce));
m->cpu = m->extcpu = smp_processor_id();
@@ -430,9 +430,15 @@ static noinstr void mce_wrmsrl(u32 msr, u64 v)
* check into our "mce" struct so that we can use it later to assess
* the severity of the problem as we read per-bank specific details.
*/
-static inline void mce_gather_info(struct mce *m, struct pt_regs *regs)
+static noinstr void mce_gather_info(struct mce *m, struct pt_regs *regs)
{
+ /*
+ * Enable instrumentation around mce_setup() which calls external
+ * facilities.
+ */
+ instrumentation_begin();
mce_setup(m);
+ instrumentation_end();
m->mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
if (regs) {
@@ -1312,11 +1318,11 @@ static noinstr void unexpected_machine_check(struct pt_regs *regs)
}
/*
- * The actual machine check handler. This only handles real
- * exceptions when something got corrupted coming in through int 18.
+ * The actual machine check handler. This only handles real exceptions when
+ * something got corrupted coming in through int 18.
*
- * This is executed in NMI context not subject to normal locking rules. This
- * implies that most kernel services cannot be safely used. Don't even
+ * This is executed in #MC context not subject to normal locking rules.
+ * This implies that most kernel services cannot be safely used. Don't even
* think about putting a printk in there!
*
* On Intel systems this is entered on all CPUs in parallel through
@@ -1328,6 +1334,14 @@ static noinstr void unexpected_machine_check(struct pt_regs *regs)
* issues: if the machine check was due to a failure of the memory
* backing the user stack, tracing that reads the user stack will cause
* potentially infinite recursion.
+ *
+ * Currently, the #MC handler calls out to a number of external facilities
+ * and, therefore, allows instrumentation around them. The optimal thing to
+ * have would be to do the absolutely minimal work required in #MC context
+ * and have instrumentation disabled only around that. Further processing can
+ * then happen in process context where instrumentation is allowed. Achieving
+ * that requires careful auditing and modifications. Until then, the code
+ * allows instrumentation temporarily, where required. *
*/
noinstr void do_machine_check(struct pt_regs *regs)
{
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Remove function-local cpus variables
2021-12-08 11:13 ` [PATCH v1 02/12] x86/mce: Remove function-local cpus variables Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: ad669ec16afeb0edb7d5bc4f92fdd059565281d2
Gitweb: https://git.kernel.org/tip/ad669ec16afeb0edb7d5bc4f92fdd059565281d2
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 05 Oct 2021 20:06:57 +02:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:11:53 +01:00
x86/mce: Remove function-local cpus variables
Use num_online_cpus() directly.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-3-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7c264ee..f15efa2 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -985,7 +985,6 @@ static atomic_t global_nwo;
static int mce_start(int *no_way_out)
{
int order;
- int cpus = num_online_cpus();
u64 timeout = (u64)mca_cfg.monarch_timeout * NSEC_PER_USEC;
if (!timeout)
@@ -1002,7 +1001,7 @@ static int mce_start(int *no_way_out)
/*
* Wait for everyone.
*/
- while (atomic_read(&mce_callin) != cpus) {
+ while (atomic_read(&mce_callin) != num_online_cpus()) {
if (mce_timed_out(&timeout,
"Timeout: Not all CPUs entered broadcast exception handler")) {
atomic_set(&global_nwo, 0);
@@ -1066,14 +1065,11 @@ static int mce_end(int order)
atomic_inc(&mce_executing);
if (order == 1) {
- /* CHECKME: Can this race with a parallel hotplug? */
- int cpus = num_online_cpus();
-
/*
* Monarch: Wait for everyone to go through their scanning
* loops.
*/
- while (atomic_read(&mce_executing) <= cpus) {
+ while (atomic_read(&mce_executing) <= num_online_cpus()) {
if (mce_timed_out(&timeout,
"Timeout: Monarch CPU unable to finish machine check processing"))
goto reset;
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Use mce_rdmsrl() in severity checking code
2021-12-08 11:13 ` [PATCH v1 03/12] x86/mce: Use mce_rdmsrl() in severity checking code Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 88f66a42353717e7fac31caf04f0acd2d7fbbf54
Gitweb: https://git.kernel.org/tip/88f66a42353717e7fac31caf04f0acd2d7fbbf54
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Wed, 06 Oct 2021 00:55:38 +02:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:12:08 +01:00
x86/mce: Use mce_rdmsrl() in severity checking code
MCA has its own special MSR accessors. Use them.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-4-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 2 +-
arch/x86/kernel/cpu/mce/internal.h | 2 ++
arch/x86/kernel/cpu/mce/severity.c | 8 +++-----
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f15efa2..87a277f 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -362,7 +362,7 @@ void ex_handler_msr_mce(struct pt_regs *regs, bool wrmsr)
}
/* MSR access wrappers used for error injection */
-static noinstr u64 mce_rdmsrl(u32 msr)
+noinstr u64 mce_rdmsrl(u32 msr)
{
DECLARE_ARGS(val, low, high);
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index acd61c4..52c6339 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -207,4 +207,6 @@ static inline void pentium_machine_check(struct pt_regs *regs) {}
static inline void winchip_machine_check(struct pt_regs *regs) {}
#endif
+noinstr u64 mce_rdmsrl(u32 msr);
+
#endif /* __X86_MCE_INTERNAL_H__ */
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index bb019a5..00e97eb 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -288,8 +288,7 @@ static int error_context(struct mce *m, struct pt_regs *regs)
static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
{
- u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
- u32 low, high;
+ u64 mcx_cfg;
/*
* We need to look at the following bits:
@@ -300,11 +299,10 @@ static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
if (!mce_flags.succor)
return MCE_PANIC_SEVERITY;
- if (rdmsr_safe(addr, &low, &high))
- return MCE_PANIC_SEVERITY;
+ mcx_cfg = mce_rdmsrl(MSR_AMD64_SMCA_MCx_CONFIG(m->bank));
/* TCC (Task context corrupt). If set and if IN_KERNEL, panic. */
- if ((low & MCI_CONFIG_MCAX) &&
+ if ((mcx_cfg & MCI_CONFIG_MCAX) &&
(m->status & MCI_STATUS_TCC) &&
(err_ctx == IN_KERNEL))
return MCE_PANIC_SEVERITY;
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [tip: ras/core] x86/mce: Do not use memset to clear the banks bitmaps
2021-12-08 11:13 ` [PATCH v1 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
@ 2021-12-13 17:35 ` tip-bot2 for Borislav Petkov
0 siblings, 0 replies; 29+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2021-12-13 17:35 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov, x86, linux-kernel
The following commit has been merged into the ras/core branch of tip:
Commit-ID: cd5e0d1fc93a2218fbfb2b614f06e04e218ca948
Gitweb: https://git.kernel.org/tip/cd5e0d1fc93a2218fbfb2b614f06e04e218ca948
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 01 Nov 2021 15:34:48 +01:00
Committer: Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 13 Dec 2021 14:11:22 +01:00
x86/mce: Do not use memset to clear the banks bitmaps
The bitmap is a single unsigned long so no need for the function call.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-2-bp@alien8.de
---
arch/x86/kernel/cpu/mce/core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 30de00f..7c264ee 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1336,8 +1336,8 @@ static noinstr void unexpected_machine_check(struct pt_regs *regs)
noinstr void do_machine_check(struct pt_regs *regs)
{
int worst = 0, order, no_way_out, kill_current_task, lmce;
- DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
- DECLARE_BITMAP(toclear, MAX_NR_BANKS);
+ DECLARE_BITMAP(valid_banks, MAX_NR_BANKS) = { 0 };
+ DECLARE_BITMAP(toclear, MAX_NR_BANKS) = { 0 };
struct mca_config *cfg = &mca_cfg;
struct mce m, *final;
char *msg = NULL;
@@ -1381,7 +1381,6 @@ noinstr void do_machine_check(struct pt_regs *regs)
final = this_cpu_ptr(&mces_seen);
*final = m;
- memset(valid_banks, 0, sizeof(valid_banks));
no_way_out = mce_no_way_out(&m, &msg, valid_banks, regs);
barrier();
^ permalink raw reply related [flat|nested] 29+ messages in thread
end of thread, other threads:[~2021-12-13 17:35 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-08 11:13 [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 01/12] x86/mce: Do not use memset to clear the banks bitmaps Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 02/12] x86/mce: Remove function-local cpus variables Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 03/12] x86/mce: Use mce_rdmsrl() in severity checking code Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 04/12] x86/mce: Remove noinstr annotation from mce_setup() Borislav Petkov
2021-12-09 12:16 ` Peter Zijlstra
2021-12-10 14:18 ` Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 05/12] x86/mce: Allow instrumentation during task work queueing Borislav Petkov
2021-12-09 12:17 ` Peter Zijlstra
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 06/12] x86/mce: Prevent severity computation from being instrumented Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 07/12] x86/mce: Mark mce_panic() noinstr Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 08/12] x86/mce: Mark mce_end() noinstr Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 09/12] x86/mce: Mark mce_read_aux() noinstr Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 10/12] x86/mce: Move the tainting outside of the noinstr region Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 11/12] x86/mce: Mark mce_timed_out() noinstr Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-08 11:13 ` [PATCH v1 12/12] x86/mce: Mark mce_start() noinstr Borislav Petkov
2021-12-13 17:35 ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2021-12-09 12:19 ` [PATCH v1 00/12] x86/mce: Correct the noinstr annotation Peter Zijlstra
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).