* [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only @ 2020-09-06 21:21 Borislav Petkov 2020-09-07 20:06 ` Luck, Tony ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Borislav Petkov @ 2020-09-06 21:21 UTC (permalink / raw) To: x86-ml, Tony Luck; +Cc: lkml Hi, Ingo and I talked about this thing this morning and tglx has had it on his to-fix list too so here's a first attempt at it. Below is just a brain dump of what we talked about so let's start with it and see where it would take us. Thx. --- From: Borislav Petkov <bp@suse.de> ... without any exception handling and tracing. If an exception needs to be handled while reading an MSR - which is in most of the cases caused by a #GP on a non-existent MSR - then this is most likely the incarnation of a BIOS or a hardware bug. Such bug violates the architectural guarantee that MSR banks are present with all MSRs belonging to them. The proper fix belongs in the hardware/firmware - not in the kernel. Handling exceptions while in #MC and while an NMI is being handled would cause the nasty NMI nesting issue because of the shortcoming of IRET of reenabling NMIs when executed. And the machine is in an #MC context already so <Deity> be at its side. Tracing MSR accesses while in #MC is another no-no due to tracing being inherently a bad idea in atomic context: vmlinux.o: warning: objtool: do_machine_check()+0x4a: call to mce_rdmsrl() leaves .noinstr.text section so remove all that "additional" functionality from mce_rdmsrl() and concentrate on solely reading the MSRs. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Ingo Molnar <mingo@kernel.org> --- arch/x86/kernel/cpu/mce/core.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 0ba24dfffdb2..14ebdf3e22f3 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -376,7 +376,7 @@ static int msr_to_offset(u32 msr) /* MSR access wrappers used for error injection */ static u64 mce_rdmsrl(u32 msr) { - u64 v; + DECLARE_ARGS(val, low, high); if (__this_cpu_read(injectm.finished)) { int offset = msr_to_offset(msr); @@ -386,17 +386,13 @@ static u64 mce_rdmsrl(u32 msr) return *(u64 *)((char *)this_cpu_ptr(&injectm) + offset); } - if (rdmsrl_safe(msr, &v)) { - WARN_ONCE(1, "mce: Unable to read MSR 0x%x!\n", msr); - /* - * Return zero in case the access faulted. This should - * not happen normally but can happen if the CPU does - * something weird, or if the code is buggy. - */ - v = 0; - } + /* + * RDMSR on MCA MSRs should not fault. If they do, this is very much an + * architectural violation and needs to be reported to hw vendor. + */ + asm volatile("rdmsr" : EAX_EDX_RET(val, low, high) : "c" (msr)); - return v; + return EAX_EDX_VAL(val, low, high); } static void mce_wrmsrl(u32 msr, u64 v) -- 2.21.0 -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-06 21:21 [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only Borislav Petkov @ 2020-09-07 20:06 ` Luck, Tony 2020-09-08 9:46 ` Borislav Petkov 2020-09-07 20:16 ` Andy Lutomirski 2020-09-11 9:47 ` [tip: ras/core] x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR tip-bot2 for Borislav Petkov 2 siblings, 1 reply; 17+ messages in thread From: Luck, Tony @ 2020-09-07 20:06 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86-ml, lkml On Sun, Sep 06, 2020 at 11:21:30PM +0200, Borislav Petkov wrote: > Hi, > > Ingo and I talked about this thing this morning and tglx has had it on > his to-fix list too so here's a first attempt at it. > > Below is just a brain dump of what we talked about so let's start with > it and see where it would take us. This very much looks to be the right thing to do. Returning "0" from a failed RDMSR in MCE handling might be a much worse thing to do than crashing. It papers over a serious error and the system might keep running using corrupted data. Digging into the history it seems that this rdmsrl_safe() was added for a possible bug on a pentiumIII back in 2009 that was eventually closed as "unreproducible". Do we have three distinct classes of calls to RDMSR now? 1) This case. Architecture checks have been made. This call can't fail. We are calling from a tricky state (on IST) so no tracing allowed. 2) Normal case ... architecture checks have been done so shouldn't fail. Safe state for tracing etc. Use rdmsrl(). 3) Probe case. Architecture didn't provide definitive enumeration, so we might take a #GP. Use the rdmsrl_safe(). > + /* > + * RDMSR on MCA MSRs should not fault. If they do, this is very much an > + * architectural violation and needs to be reported to hw vendor. > + */ > + asm volatile("rdmsr" : EAX_EDX_RET(val, low, high) : "c" (msr)); If mce subsystem is the only instance of case "1", then this inline is OK. If there are others then rather than inlining the asm here, we should have some new rdmsrl_notrace() inline function declared for all to use. Needs a: Fixes: 11868a2dc4f5 ("x86: mce: Use safer ways to access MCE registers") since this is undoing an earlier change. -Tony ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-07 20:06 ` Luck, Tony @ 2020-09-08 9:46 ` Borislav Petkov 2020-09-08 10:08 ` Borislav Petkov 0 siblings, 1 reply; 17+ messages in thread From: Borislav Petkov @ 2020-09-08 9:46 UTC (permalink / raw) To: Luck, Tony; +Cc: x86-ml, lkml On Mon, Sep 07, 2020 at 01:06:22PM -0700, Luck, Tony wrote: > Digging into the history it seems that this rdmsrl_safe() was added for > a possible bug on a pentiumIII back in 2009 that was eventually closed > as "unreproducible". So this is the $ 10^6 question so far: if I can assume that those two boxes we found in bug reports which triggered this warn because they failed the MSR read even though those MSRs are MCA architectural, if I can assume that those are a one-off thing and that we won't support such and we'd let machines like that crash and burn in testing, hopefully before they get released to the public, then, we can do the plain MSR read there. No, tracing, no exception handling. HOWEVER! *If* that non-existent MSR read happens after all, then the machine would #GP. Our #GP handler, however, doesn't panic unless panic_on_oops, as Andy pointed out on IRC yesterday and this makes a nasty situation even worse. Now, even if it did panic, reporting this would not be very user-friendly because people would have to look at the Code: output of the splat to see which insn rIP pointed to and then figure out what happened. Now, we can look at the code bytes around rIP ourselves, in the #GP handler, and decide to panic if 0F 32, i.e., RDMSR opcode. But that is ewww ugly to say the least. So, Andy suggested we do a simple .fixup so that when the RDMSR fails, in the fixup we panic directly. The problem there is: NMI -> #MC -> #GP ... IRET Another #MC won't happen after we IRET because MSR_IA32_MCG_STATUS is not cleared yet but the state is mightily confused and you don't want to run code on that box anymore but panic and not move. > Do we have three distinct classes of calls to RDMSR now? > > > 1) This case. Architecture checks have been made. This call can't > fail. We are calling from a tricky state (on IST) so no tracing > allowed. *If* we can hold hw people to the architectural guarantee, then this would be the cleanest fix. > 2) Normal case ... architecture checks have been done so shouldn't > fail. Safe state for tracing etc. Use rdmsrl(). Yeah, but see above. > 3) Probe case. Architecture didn't provide definitive enumeration, > so we might take a #GP. Use the rdmsrl_safe(). ... and that is bad because above *and* tracing can't handle atomic context properly. > If mce subsystem is the only instance of case "1", then this > inline is OK. If there are others then rather than inlining > the asm here, we should have some new rdmsrl_notrace() inline > function declared for all to use. Yeah, Ingo suggested the same thing but let's solve the bigger issue first. > Needs a: > > Fixes: 11868a2dc4f5 ("x86: mce: Use safer ways to access MCE registers") > > since this is undoing an earlier change. Ok. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-08 9:46 ` Borislav Petkov @ 2020-09-08 10:08 ` Borislav Petkov 2020-09-08 15:07 ` Luck, Tony 0 siblings, 1 reply; 17+ messages in thread From: Borislav Petkov @ 2020-09-08 10:08 UTC (permalink / raw) To: Luck, Tony; +Cc: x86-ml, lkml On Tue, Sep 08, 2020 at 11:46:50AM +0200, Borislav Petkov wrote: > So, Andy suggested we do a simple .fixup so that when the RDMSR fails, > in the fixup we panic directly. Ok, so I think this is what Andy meant last night and PeterZ just suggested it too: We do a: _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_panic) which panics straight in the #GP handler and avoids the IRET. I mean, we want to panic there because architectural violation and we won't have to deal with the IRET... Hohumm, sounds like the right thing to do in the face of the brokenness of this magnificent architecture.</maxed out sarcasm>. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-08 10:08 ` Borislav Petkov @ 2020-09-08 15:07 ` Luck, Tony 2020-09-08 15:25 ` Borislav Petkov 0 siblings, 1 reply; 17+ messages in thread From: Luck, Tony @ 2020-09-08 15:07 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86-ml, lkml > Ok, so I think this is what Andy meant last night and PeterZ just > suggested it too: > > We do a: > > _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_panic) > > which panics straight in the #GP handler and avoids the IRET. We can even get a nice diagnostic message since the handler has access to "regs". It can print which MSR (regs->cx) and where it happened (regs->ip). Which sounds like you might want a specific ex_handler_rdmsr function rather than a generic ex_handler_panic. Maybe same deal for wrmsr() too? That would also print edx:eax so you could see what was being written. -Tony ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-08 15:07 ` Luck, Tony @ 2020-09-08 15:25 ` Borislav Petkov 2020-09-09 11:30 ` Borislav Petkov 0 siblings, 1 reply; 17+ messages in thread From: Borislav Petkov @ 2020-09-08 15:25 UTC (permalink / raw) To: Luck, Tony; +Cc: x86-ml, lkml On Tue, Sep 08, 2020 at 03:07:05PM +0000, Luck, Tony wrote: > We can even get a nice diagnostic message since the handler > has access to "regs". It can print which MSR (regs->cx) and > where it happened (regs->ip). > > Which sounds like you might want a specific ex_handler_rdmsr > function rather than a generic ex_handler_panic. > > Maybe same deal for wrmsr() too? That would also print edx:eax > so you could see what was being written. Sounds good. Lemme try to come up with something these days. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-08 15:25 ` Borislav Petkov @ 2020-09-09 11:30 ` Borislav Petkov 2020-09-09 18:20 ` Luck, Tony 0 siblings, 1 reply; 17+ messages in thread From: Borislav Petkov @ 2020-09-09 11:30 UTC (permalink / raw) To: Luck, Tony; +Cc: x86-ml, lkml I guess something as straightforward as this: --- diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 0ba24dfffdb2..9893caaf2696 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -373,10 +373,27 @@ static int msr_to_offset(u32 msr) return -1; } +__visible bool ex_handler_rdmsr_fault(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) +{ + if (pr_warn_once("MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n", + (unsigned int)regs->cx, regs->ip, (void *)regs->ip)) + show_stack_regs(regs); + + panic("MCA Architectural violation!\n"); + + while (true) + cpu_relax(); + + return true; +} + /* MSR access wrappers used for error injection */ static u64 mce_rdmsrl(u32 msr) { - u64 v; + DECLARE_ARGS(val, low, high); if (__this_cpu_read(injectm.finished)) { int offset = msr_to_offset(msr); @@ -386,21 +403,42 @@ static u64 mce_rdmsrl(u32 msr) return *(u64 *)((char *)this_cpu_ptr(&injectm) + offset); } - if (rdmsrl_safe(msr, &v)) { - WARN_ONCE(1, "mce: Unable to read MSR 0x%x!\n", msr); - /* - * Return zero in case the access faulted. This should - * not happen normally but can happen if the CPU does - * something weird, or if the code is buggy. - */ - v = 0; - } + /* + * RDMSR on MCA MSRs should not fault. If they do, this is very much an + * architectural violation and needs to be reported to hw vendor. Panic + * the box to not allow any further progress. + */ + asm volatile("1: rdmsr\n" + "2:\n" + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_rdmsr_fault) + : EAX_EDX_RET(val, low, high) : "c" (msr)); - return v; + + return EAX_EDX_VAL(val, low, high); +} + +__visible bool ex_handler_wrmsr_fault(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) +{ + if (pr_warn_once("MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n", + (unsigned int)regs->cx, (unsigned int)regs->dx, (unsigned int)regs->ax, + regs->ip, (void *)regs->ip)) + show_stack_regs(regs); + + panic("MCA Architectural violation!\n"); + + while (true) + cpu_relax(); + + return true; } static void mce_wrmsrl(u32 msr, u64 v) { + u32 low, high; + if (__this_cpu_read(injectm.finished)) { int offset = msr_to_offset(msr); @@ -408,7 +446,15 @@ static void mce_wrmsrl(u32 msr, u64 v) *(u64 *)((char *)this_cpu_ptr(&injectm) + offset) = v; return; } - wrmsrl(msr, v); + + low = (u32)v; + high = (u32)(v >> 32); + + /* See comment in mce_rdmsrl() */ + asm volatile("1: wrmsr\n" + "2:\n" + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_wrmsr_fault) + : : "c" (msr), "a"(low), "d" (high) : "memory"); } /* -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-09 11:30 ` Borislav Petkov @ 2020-09-09 18:20 ` Luck, Tony 2020-09-09 20:03 ` Borislav Petkov 0 siblings, 1 reply; 17+ messages in thread From: Luck, Tony @ 2020-09-09 18:20 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86-ml, lkml On Wed, Sep 09, 2020 at 01:30:22PM +0200, Borislav Petkov wrote: > I guess something as straightforward as this: Do we think there will be other places where we want this MSR-or-die behaviour? If there are, then most of this belongs elsewhere from arch/x86/kernel/cpu/mce/core.c > --- > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c > index 0ba24dfffdb2..9893caaf2696 100644 > --- a/arch/x86/kernel/cpu/mce/core.c > +++ b/arch/x86/kernel/cpu/mce/core.c > @@ -373,10 +373,27 @@ static int msr_to_offset(u32 msr) > return -1; > } > > +__visible bool ex_handler_rdmsr_fault(const struct exception_table_entry *fixup, > + struct pt_regs *regs, int trapnr, > + unsigned long error_code, > + unsigned long fault_addr) > +{ > + if (pr_warn_once("MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n", The "_once" version seems a little pointless when the next statement in the function is "panic()". "warn" seems understated for an error that is going to crash the system. Just go for "pr_emerg()". There seems no consistency on using "rIP" or "RIP" ... but I think "RIP" is slightly ahead. > + (unsigned int)regs->cx, regs->ip, (void *)regs->ip)) > + show_stack_regs(regs); > + > + panic("MCA Architectural violation!\n"); nitpick: I don't thing Architectural needs to be capitalized. > + > + while (true) > + cpu_relax(); Ugh. Is this why you have warn_once() ... because panic might return? Above comments also apply to the wrmsr path. -Tony ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-09 18:20 ` Luck, Tony @ 2020-09-09 20:03 ` Borislav Petkov 2020-09-10 18:29 ` Borislav Petkov 0 siblings, 1 reply; 17+ messages in thread From: Borislav Petkov @ 2020-09-09 20:03 UTC (permalink / raw) To: Luck, Tony; +Cc: x86-ml, lkml On Wed, Sep 09, 2020 at 11:20:51AM -0700, Luck, Tony wrote: > Do we think there will be other places where we want this > MSR-or-die behaviour? MSR-or-die - I like that. That belongs on a T-shirt. :-) > If there are, then most of this belongs elsewhere from > arch/x86/kernel/cpu/mce/core.c Yeah, I can't think of any other users needing this ATM. But if they do, lifting those and moving them somewhere else is trivial. > The "_once" version seems a little pointless when the next statement > in the function is "panic()". True, removed. > "warn" seems understated for an error that is going to crash the system. > Just go for "pr_emerg()". Done. > There seems no consistency on using "rIP" or "RIP" ... but I think "RIP" > is slightly ahead. The "r" in rIP means that it can be RIP or EIP and that code builds on 32-bit too. > nitpick: I don't thing Architectural needs to be capitalized. Done. > Ugh. Is this why you have warn_once() ... because panic might return? People tend to put all kinds of things in panic() and I want to make sure that even if it returns for whatever reason, at some point in the future, we don't make any further progress here. Think of it as a paranoid precation of sorts... Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-09 20:03 ` Borislav Petkov @ 2020-09-10 18:29 ` Borislav Petkov 2020-09-10 18:38 ` [PATCH -v2] x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR Borislav Petkov 2020-09-10 18:42 ` [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only Luck, Tony 0 siblings, 2 replies; 17+ messages in thread From: Borislav Petkov @ 2020-09-10 18:29 UTC (permalink / raw) To: Luck, Tony; +Cc: x86-ml, lkml Ok, with all those changes, I don't think the following nice and juicy message can be overlooked: [ 32.267830] mce: MSR access error: RDMSR from 0x1234 at rIP: 0xffffffff8102ed62 (mce_rdmsrl+0x12/0x50) [ 32.267838] Call Trace: [ 32.267838] <#MC> [ 32.267838] do_machine_check+0xbd/0x9f0 [ 32.267839] ? trigger_mce+0x7/0x10 [ 32.267839] exc_machine_check+0x77/0xd0 [ 32.267840] asm_exc_machine_check+0x19/0x30 [ 32.267840] RIP: 0010:trigger_mce+0x7/0x10 [ 32.267841] Code: c0 c3 90 0f 1f 44 00 00 48 8b 47 58 48 89 06 31 c0 c3 90 0f 1f 44 00 00 8b 47 44 48 89 06 31 c0 c3 66 90 0f 1f 44 00 00 cd 12 <c3> 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 cd f4 c3 0f 1f 84 00 00 [ 32.267841] RSP: 0018:ffffc90000003fc0 EFLAGS: 00000002 [ 32.267842] RAX: ffffffff81032940 RBX: 0000000000000000 RCX: ffff88807dce9e80 [ 32.267843] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 32.267843] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 32.267844] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 32.267844] R13: ffff88807dce9e80 R14: 0000000000000000 R15: 0000000000000000 [ 32.267845] ? inj_extcpu_get+0x10/0x10 [ 32.267845] </#MC> [ 32.267845] <IRQ> [ 32.267846] flush_smp_call_function_queue+0xce/0x1b0 [ 32.267846] __sysvec_call_function_single+0x2c/0xd0 [ 32.267846] asm_call_on_stack+0x12/0x20 [ 32.267847] </IRQ> [ 32.267847] sysvec_call_function_single+0x6e/0x80 [ 32.267847] asm_sysvec_call_function_single+0x12/0x20 [ 32.267848] RIP: 0010:default_idle+0x13/0x20 [ 32.267849] Code: 0f ae 38 0f ae f0 eb b9 fb eb e3 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 e9 07 00 00 00 0f 00 2d c1 9c 5e 00 fb f4 <c3> cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 65 8b 15 04 [ 32.267849] RSP: 0018:ffffffff82203ec0 EFLAGS: 00000212 [ 32.267850] RAX: ffffffff81819730 RBX: 0000000000000000 RCX: 0000000000029d00 [ 32.267851] RDX: 000000000007de7e RSI: 7ffffff880b4b40b RDI: ffffffff8220f880 [ 32.267851] RBP: ffffffff8220f880 R08: 0000000473333333 R09: 00000049ae089af4 [ 32.267852] R10: 000000000006fecd R11: 0000000000006f96 R12: ffffffff8220f880 [ 32.267852] R13: ffffffff8220f880 R14: 0000000000000000 R15: 0000000000000000 [ 32.267852] ? mwait_idle+0x80/0x80 [ 32.267853] default_idle_call+0x34/0xf0 [ 32.267853] do_idle+0x1ba/0x210 [ 32.267853] cpu_startup_entry+0x19/0x20 [ 32.267854] start_kernel+0x4f5/0x502 [ 32.267854] secondary_startup_64+0xa4/0xb0 [ 32.267855] Kernel panic - not syncing: MCA architectural violation! [ 32.267855] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc4+ #4 [ 32.267856] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 [ 32.267856] Call Trace: [ 32.267856] <#MC> [ 32.267856] dump_stack+0x57/0x6a [ 32.267857] panic+0xf2/0x2c2 [ 32.267857] ? secondary_startup_64+0xa3/0xb0 [ 32.267857] ex_handler_rdmsr_fault+0x36/0x36 [ 32.267858] fixup_exception+0x56/0x80 [ 32.267858] exc_general_protection+0xb6/0x2d0 [ 32.267858] asm_exc_general_protection+0x1e/0x30 [ 32.267859] RIP: 0010:mce_rdmsrl+0x12/0x50 [ 32.267860] Code: 54 b9 15 82 48 c7 c1 20 e4 02 81 48 c7 c2 00 e4 02 81 e9 01 17 20 00 90 0f 1f 44 00 00 65 8a 05 07 81 fe 7e 84 c0 75 10 89 f9 <0f> 32 48 89 d7 48 c1 e7 20 48 09 f8 f3 c3 e8 7b f5 ff ff 48 63 d0 [ 32.267860] RSP: 0018:fffffe000000fe30 EFLAGS: 00010046 [ 32.267861] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000001234 [ 32.267861] RDX: 0000001d39aa4ebc RSI: fffffe000000ff08 RDI: 0000000000001234 [ 32.267862] RBP: fffffe000000fe88 R08: 0000000000000000 R09: 0000000000000000 [ 32.267862] R10: 0000000000000000 R11: 0000000000000000 R12: fffffe000000ff58 [ 32.267863] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 32.267863] do_machine_check+0xbd/0x9f0 [ 32.267864] ? trigger_mce+0x7/0x10 [ 32.267864] exc_machine_check+0x77/0xd0 [ 32.267864] asm_exc_machine_check+0x19/0x30 [ 32.267865] RIP: 0010:trigger_mce+0x7/0x10 [ 32.267866] Code: c0 c3 90 0f 1f 44 00 00 48 8b 47 58 48 89 06 31 c0 c3 90 0f 1f 44 00 00 8b 47 44 48 89 06 31 c0 c3 66 90 0f 1f 44 00 00 cd 12 <c3> 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 cd f4 c3 0f 1f 84 00 00 [ 32.267866] RSP: 0018:ffffc90000003fc0 EFLAGS: 00000002 [ 32.267867] RAX: ffffffff81032940 RBX: 0000000000000000 RCX: ffff88807dce9e80 [ 32.267867] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 32.267868] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 32.267868] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 32.267869] R13: ffff88807dce9e80 R14: 0000000000000000 R15: 0000000000000000 [ 32.267869] ? inj_extcpu_get+0x10/0x10 [ 32.267869] </#MC> [ 32.267870] <IRQ> [ 32.267870] flush_smp_call_function_queue+0xce/0x1b0 [ 32.267870] __sysvec_call_function_single+0x2c/0xd0 [ 32.267871] asm_call_on_stack+0x12/0x20 [ 32.267871] </IRQ> [ 32.267871] sysvec_call_function_single+0x6e/0x80 [ 32.267872] asm_sysvec_call_function_single+0x12/0x20 [ 32.267872] RIP: 0010:default_idle+0x13/0x20 [ 32.267873] Code: 0f ae 38 0f ae f0 eb b9 fb eb e3 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 e9 07 00 00 00 0f 00 2d c1 9c 5e 00 fb f4 <c3> cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 65 8b 15 04 [ 32.267874] RSP: 0018:ffffffff82203ec0 EFLAGS: 00000212 [ 32.267874] RAX: ffffffff81819730 RBX: 0000000000000000 RCX: 0000000000029d00 [ 32.267875] RDX: 000000000007de7e RSI: 7ffffff880b4b40b RDI: ffffffff8220f880 [ 32.267875] RBP: ffffffff8220f880 R08: 0000000473333333 R09: 00000049ae089af4 [ 32.267876] R10: 000000000006fecd R11: 0000000000006f96 R12: ffffffff8220f880 [ 32.267876] R13: ffffffff8220f880 R14: 0000000000000000 R15: 0000000000000000 [ 32.267877] ? mwait_idle+0x80/0x80 [ 32.267877] default_idle_call+0x34/0xf0 [ 32.267877] do_idle+0x1ba/0x210 [ 32.267878] cpu_startup_entry+0x19/0x20 [ 32.267878] start_kernel+0x4f5/0x502 [ 32.267878] secondary_startup_64+0xa4/0xb0 [ 32.268204] Kernel Offset: disabled -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH -v2] x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR 2020-09-10 18:29 ` Borislav Petkov @ 2020-09-10 18:38 ` Borislav Petkov 2020-09-10 18:42 ` [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only Luck, Tony 1 sibling, 0 replies; 17+ messages in thread From: Borislav Petkov @ 2020-09-10 18:38 UTC (permalink / raw) To: Luck, Tony; +Cc: x86-ml, lkml From: Borislav Petkov <bp@suse.de> If an exception needs to be handled while reading an MSR - which is in most of the cases caused by a #GP on a non-existent MSR - then this is most likely the incarnation of a BIOS or a hardware bug. Such bug violates the architectural guarantee that MSR banks are present with all MSRs belonging to them. The proper fix belongs in the hardware/firmware - not in the kernel. Handling an #MC exception which is raised while an NMI is being handled would cause the nasty NMI nesting issue because of the shortcoming of IRET of reenabling NMIs when executed. And the machine is in an #MC context already so <Deity> be at its side. Tracing MSR accesses while in #MC is another no-no due to tracing being inherently a bad idea in atomic context: vmlinux.o: warning: objtool: do_machine_check()+0x4a: call to mce_rdmsrl() leaves .noinstr.text section so remove all that "additional" functionality from mce_rdmsrl() and provide it with a special exception handler which panics the machine when that MSR is not accessible. The exception handler prints a human-readable message explaining what the panic reason is but, what is more, it panics while in the #GP handler and latter won't have executed an IRET, thus opening the NMI nesting issue in the case when the #MC has happened while handling an NMI. (#MC itself won't be reenabled until MCG_STATUS hasn't been cleared). Suggested-by: Andy Lutomirski <luto@kernel.org> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> --- arch/x86/kernel/cpu/mce/core.c | 72 ++++++++++++++++++++++++++++------ 1 file changed, 60 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 0ba24dfffdb2..a697baee9aa0 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -373,10 +373,28 @@ static int msr_to_offset(u32 msr) return -1; } +__visible bool ex_handler_rdmsr_fault(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) +{ + pr_emerg("MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n", + (unsigned int)regs->cx, regs->ip, (void *)regs->ip); + + show_stack_regs(regs); + + panic("MCA architectural violation!\n"); + + while (true) + cpu_relax(); + + return true; +} + /* MSR access wrappers used for error injection */ static u64 mce_rdmsrl(u32 msr) { - u64 v; + DECLARE_ARGS(val, low, high); if (__this_cpu_read(injectm.finished)) { int offset = msr_to_offset(msr); @@ -386,21 +404,43 @@ static u64 mce_rdmsrl(u32 msr) return *(u64 *)((char *)this_cpu_ptr(&injectm) + offset); } - if (rdmsrl_safe(msr, &v)) { - WARN_ONCE(1, "mce: Unable to read MSR 0x%x!\n", msr); - /* - * Return zero in case the access faulted. This should - * not happen normally but can happen if the CPU does - * something weird, or if the code is buggy. - */ - v = 0; - } + /* + * RDMSR on MCA MSRs should not fault. If they do, this is very much an + * architectural violation and needs to be reported to hw vendor. Panic + * the box to not allow any further progress. + */ + asm volatile("1: rdmsr\n" + "2:\n" + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_rdmsr_fault) + : EAX_EDX_RET(val, low, high) : "c" (msr)); - return v; + + return EAX_EDX_VAL(val, low, high); +} + +__visible bool ex_handler_wrmsr_fault(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) +{ + pr_emerg("MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n", + (unsigned int)regs->cx, (unsigned int)regs->dx, (unsigned int)regs->ax, + regs->ip, (void *)regs->ip); + + show_stack_regs(regs); + + panic("MCA architectural violation!\n"); + + while (true) + cpu_relax(); + + return true; } static void mce_wrmsrl(u32 msr, u64 v) { + u32 low, high; + if (__this_cpu_read(injectm.finished)) { int offset = msr_to_offset(msr); @@ -408,7 +448,15 @@ static void mce_wrmsrl(u32 msr, u64 v) *(u64 *)((char *)this_cpu_ptr(&injectm) + offset) = v; return; } - wrmsrl(msr, v); + + low = (u32)v; + high = (u32)(v >> 32); + + /* See comment in mce_rdmsrl() */ + asm volatile("1: wrmsr\n" + "2:\n" + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_wrmsr_fault) + : : "c" (msr), "a"(low), "d" (high) : "memory"); } /* -- 2.21.0 -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-10 18:29 ` Borislav Petkov 2020-09-10 18:38 ` [PATCH -v2] x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR Borislav Petkov @ 2020-09-10 18:42 ` Luck, Tony 2020-09-10 18:54 ` Borislav Petkov 1 sibling, 1 reply; 17+ messages in thread From: Luck, Tony @ 2020-09-10 18:42 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86-ml, lkml On Thu, Sep 10, 2020 at 08:29:19PM +0200, Borislav Petkov wrote: > Ok, with all those changes, I don't think the following nice and juicy > message can be overlooked: > > [ 32.267830] mce: MSR access error: RDMSR from 0x1234 at rIP: 0xffffffff8102ed62 (mce_rdmsrl+0x12/0x50) With only one call site the rIP isn't super helpful at the moment. But once you start selling those "MSR or die" T-shirts everyone will want to use this :-) > [ 32.267838] Call Trace: > [ 32.267838] <#MC> > [ 32.267838] do_machine_check+0xbd/0x9f0 > [ 32.267839] ? trigger_mce+0x7/0x10 > [ 32.267839] exc_machine_check+0x77/0xd0 > [ 32.267840] asm_exc_machine_check+0x19/0x30 Do we need the stack trace twice? Once from your fixup function, second from panic()? -Tony ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-10 18:42 ` [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only Luck, Tony @ 2020-09-10 18:54 ` Borislav Petkov 2020-09-10 19:43 ` Luck, Tony 0 siblings, 1 reply; 17+ messages in thread From: Borislav Petkov @ 2020-09-10 18:54 UTC (permalink / raw) To: Luck, Tony; +Cc: x86-ml, lkml On Thu, Sep 10, 2020 at 11:42:06AM -0700, Luck, Tony wrote: > With only one call site the rIP isn't super helpful at the moment. But > once you start selling those "MSR or die" T-shirts everyone will want > to use this :-) :-))) > Do we need the stack trace twice? Once from your fixup > function, second from panic()? The second panic comes from: if (lmce) { if (no_way_out) mce_panic("Fatal local machine check", &m, msg); in do_machine_check() and we panic earlier in mce_no_way_out() where I stuck mce_rdmsrl(0x1234); because that one does reads MCi_STATUS. mce_gather_info() could cause it too because it gathers all the other MSRs. Now, if we want to avoid the second panic(), we'd have to tell that site above that we have panicked already but that would require some flag variable or even more uglification. Considering how this situation is supposed to almost never happen and how we're actually interested in the first line of the whole splat I pasted, how much output comes after it, doesn't really matter. All it matters is that the machine stops any further progress (as much as we can do that with a panic, ofc). Or do you see a more important issue with a second panic to warrant such not-really-needed change? Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-10 18:54 ` Borislav Petkov @ 2020-09-10 19:43 ` Luck, Tony 0 siblings, 0 replies; 17+ messages in thread From: Luck, Tony @ 2020-09-10 19:43 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86-ml, lkml > Considering how this situation is supposed to almost never happen and > how we're actually interested in the first line of the whole splat I > pasted, how much output comes after it, doesn't really matter. All it > matters is that the machine stops any further progress (as much as we > can do that with a panic, ofc). It would be a "nice to have" ... because all that extra noise pushed the useful line off the top of the screen. For anyone with a serial console it’s a non-issue. But since this is a "should never happen" case it isn't worth creating a complex bunch of infrastructure. You can slap a: Reviewed-by: Tony Luck <tony.luck@intel.com> on it when you push into GIT. -Tony ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-06 21:21 [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only Borislav Petkov 2020-09-07 20:06 ` Luck, Tony @ 2020-09-07 20:16 ` Andy Lutomirski 2020-09-07 20:27 ` Borislav Petkov 2020-09-11 9:47 ` [tip: ras/core] x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR tip-bot2 for Borislav Petkov 2 siblings, 1 reply; 17+ messages in thread From: Andy Lutomirski @ 2020-09-07 20:16 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86-ml, Tony Luck, lkml On Sun, Sep 6, 2020 at 2:21 PM Borislav Petkov <bp@alien8.de> wrote: > > Hi, > > Ingo and I talked about this thing this morning and tglx has had it on > his to-fix list too so here's a first attempt at it. > > Below is just a brain dump of what we talked about so let's start with > it and see where it would take us. > > Thx. > > --- > > From: Borislav Petkov <bp@suse.de> > > ... without any exception handling and tracing. > > If an exception needs to be handled while reading an MSR - which is in > most of the cases caused by a #GP on a non-existent MSR - then this > is most likely the incarnation of a BIOS or a hardware bug. Such bug > violates the architectural guarantee that MSR banks are present with all > MSRs belonging to them. > > The proper fix belongs in the hardware/firmware - not in the kernel. > > Handling exceptions while in #MC and while an NMI is being handled would > cause the nasty NMI nesting issue because of the shortcoming of IRET > of reenabling NMIs when executed. And the machine is in an #MC context > already so <Deity> be at its side. > > Tracing MSR accesses while in #MC is another no-no due to tracing being > inherently a bad idea in atomic context: > > vmlinux.o: warning: objtool: do_machine_check()+0x4a: call to mce_rdmsrl() leaves .noinstr.text section > > so remove all that "additional" functionality from mce_rdmsrl() and > concentrate on solely reading the MSRs. > > Signed-off-by: Borislav Petkov <bp@suse.de> > Cc: Ingo Molnar <mingo@kernel.org> > --- > arch/x86/kernel/cpu/mce/core.c | 18 +++++++----------- > 1 file changed, 7 insertions(+), 11 deletions(-) > > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c > index 0ba24dfffdb2..14ebdf3e22f3 100644 > --- a/arch/x86/kernel/cpu/mce/core.c > +++ b/arch/x86/kernel/cpu/mce/core.c > @@ -376,7 +376,7 @@ static int msr_to_offset(u32 msr) > /* MSR access wrappers used for error injection */ > static u64 mce_rdmsrl(u32 msr) > { > - u64 v; > + DECLARE_ARGS(val, low, high); > > if (__this_cpu_read(injectm.finished)) { > int offset = msr_to_offset(msr); > @@ -386,17 +386,13 @@ static u64 mce_rdmsrl(u32 msr) > return *(u64 *)((char *)this_cpu_ptr(&injectm) + offset); > } > > - if (rdmsrl_safe(msr, &v)) { > - WARN_ONCE(1, "mce: Unable to read MSR 0x%x!\n", msr); > - /* > - * Return zero in case the access faulted. This should > - * not happen normally but can happen if the CPU does > - * something weird, or if the code is buggy. > - */ > - v = 0; > - } > + /* > + * RDMSR on MCA MSRs should not fault. If they do, this is very much an > + * architectural violation and needs to be reported to hw vendor. > + */ > + asm volatile("rdmsr" : EAX_EDX_RET(val, low, high) : "c" (msr)); I don't like this. Plain rdmsrl() will at least print a nice error if it fails. Perhaps we should add a read_msr_panic() variant that panics on failure? Or, if there is just this one case, then we can use rdmsrl_safe() and print a nice error and panic on failure. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only 2020-09-07 20:16 ` Andy Lutomirski @ 2020-09-07 20:27 ` Borislav Petkov 0 siblings, 0 replies; 17+ messages in thread From: Borislav Petkov @ 2020-09-07 20:27 UTC (permalink / raw) To: Andy Lutomirski; +Cc: x86-ml, Tony Luck, lkml On Mon, Sep 07, 2020 at 01:16:43PM -0700, Andy Lutomirski wrote: > > + asm volatile("rdmsr" : EAX_EDX_RET(val, low, high) : "c" (msr)); > > I don't like this. Plain rdmsrl() will at least print a nice error if it fails. I think you read my commit message too quickly :) The point is to not print anything and not have *any* exception handling or whatever - just plain RDMSR. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 17+ messages in thread
* [tip: ras/core] x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR 2020-09-06 21:21 [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only Borislav Petkov 2020-09-07 20:06 ` Luck, Tony 2020-09-07 20:16 ` Andy Lutomirski @ 2020-09-11 9:47 ` tip-bot2 for Borislav Petkov 2 siblings, 0 replies; 17+ messages in thread From: tip-bot2 for Borislav Petkov @ 2020-09-11 9:47 UTC (permalink / raw) To: linux-tip-commits Cc: Andy Lutomirski, Peter Zijlstra, kernel test robot, Borislav Petkov, Tony Luck, x86, LKML The following commit has been merged into the ras/core branch of tip: Commit-ID: e2def7d49d0812ea40a224161b2001b2e815dce2 Gitweb: https://git.kernel.org/tip/e2def7d49d0812ea40a224161b2001b2e815dce2 Author: Borislav Petkov <bp@suse.de> AuthorDate: Sun, 06 Sep 2020 23:02:52 +02:00 Committer: Borislav Petkov <bp@suse.de> CommitterDate: Fri, 11 Sep 2020 08:25:43 +02:00 x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR If an exception needs to be handled while reading an MSR - which is in most of the cases caused by a #GP on a non-existent MSR - then this is most likely the incarnation of a BIOS or a hardware bug. Such bug violates the architectural guarantee that MCA banks are present with all MSRs belonging to them. The proper fix belongs in the hardware/firmware - not in the kernel. Handling an #MC exception which is raised while an NMI is being handled would cause the nasty NMI nesting issue because of the shortcoming of IRET of reenabling NMIs when executed. And the machine is in an #MC context already so <Deity> be at its side. Tracing MSR accesses while in #MC is another no-no due to tracing being inherently a bad idea in atomic context: vmlinux.o: warning: objtool: do_machine_check()+0x4a: call to mce_rdmsrl() leaves .noinstr.text section so remove all that "additional" functionality from mce_rdmsrl() and provide it with a special exception handler which panics the machine when that MSR is not accessible. The exception handler prints a human-readable message explaining what the panic reason is but, what is more, it panics while in the #GP handler and latter won't have executed an IRET, thus opening the NMI nesting issue in the case when the #MC has happened while handling an NMI. (#MC itself won't be reenabled until MCG_STATUS hasn't been cleared). Suggested-by: Andy Lutomirski <luto@kernel.org> Suggested-by: Peter Zijlstra <peterz@infradead.org> [ Add missing prototypes for ex_handler_* ] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200906212130.GA28456@zn.tnic --- arch/x86/kernel/cpu/mce/core.c | 72 ++++++++++++++++++++++++----- arch/x86/kernel/cpu/mce/internal.h | 10 ++++- 2 files changed, 70 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 0ba24df..a697bae 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -373,10 +373,28 @@ static int msr_to_offset(u32 msr) return -1; } +__visible bool ex_handler_rdmsr_fault(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) +{ + pr_emerg("MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n", + (unsigned int)regs->cx, regs->ip, (void *)regs->ip); + + show_stack_regs(regs); + + panic("MCA architectural violation!\n"); + + while (true) + cpu_relax(); + + return true; +} + /* MSR access wrappers used for error injection */ static u64 mce_rdmsrl(u32 msr) { - u64 v; + DECLARE_ARGS(val, low, high); if (__this_cpu_read(injectm.finished)) { int offset = msr_to_offset(msr); @@ -386,21 +404,43 @@ static u64 mce_rdmsrl(u32 msr) return *(u64 *)((char *)this_cpu_ptr(&injectm) + offset); } - if (rdmsrl_safe(msr, &v)) { - WARN_ONCE(1, "mce: Unable to read MSR 0x%x!\n", msr); - /* - * Return zero in case the access faulted. This should - * not happen normally but can happen if the CPU does - * something weird, or if the code is buggy. - */ - v = 0; - } + /* + * RDMSR on MCA MSRs should not fault. If they do, this is very much an + * architectural violation and needs to be reported to hw vendor. Panic + * the box to not allow any further progress. + */ + asm volatile("1: rdmsr\n" + "2:\n" + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_rdmsr_fault) + : EAX_EDX_RET(val, low, high) : "c" (msr)); - return v; + + return EAX_EDX_VAL(val, low, high); +} + +__visible bool ex_handler_wrmsr_fault(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr) +{ + pr_emerg("MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n", + (unsigned int)regs->cx, (unsigned int)regs->dx, (unsigned int)regs->ax, + regs->ip, (void *)regs->ip); + + show_stack_regs(regs); + + panic("MCA architectural violation!\n"); + + while (true) + cpu_relax(); + + return true; } static void mce_wrmsrl(u32 msr, u64 v) { + u32 low, high; + if (__this_cpu_read(injectm.finished)) { int offset = msr_to_offset(msr); @@ -408,7 +448,15 @@ static void mce_wrmsrl(u32 msr, u64 v) *(u64 *)((char *)this_cpu_ptr(&injectm) + offset) = v; return; } - wrmsrl(msr, v); + + low = (u32)v; + high = (u32)(v >> 32); + + /* See comment in mce_rdmsrl() */ + asm volatile("1: wrmsr\n" + "2:\n" + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_wrmsr_fault) + : : "c" (msr), "a"(low), "d" (high) : "memory"); } /* diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h index 6473070..b122610 100644 --- a/arch/x86/kernel/cpu/mce/internal.h +++ b/arch/x86/kernel/cpu/mce/internal.h @@ -185,4 +185,14 @@ extern bool amd_filter_mce(struct mce *m); static inline bool amd_filter_mce(struct mce *m) { return false; }; #endif +__visible bool ex_handler_rdmsr_fault(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr); + +__visible bool ex_handler_wrmsr_fault(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, + unsigned long fault_addr); + #endif /* __X86_MCE_INTERNAL_H__ */ ^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2020-09-11 9:47 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-09-06 21:21 [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only Borislav Petkov 2020-09-07 20:06 ` Luck, Tony 2020-09-08 9:46 ` Borislav Petkov 2020-09-08 10:08 ` Borislav Petkov 2020-09-08 15:07 ` Luck, Tony 2020-09-08 15:25 ` Borislav Petkov 2020-09-09 11:30 ` Borislav Petkov 2020-09-09 18:20 ` Luck, Tony 2020-09-09 20:03 ` Borislav Petkov 2020-09-10 18:29 ` Borislav Petkov 2020-09-10 18:38 ` [PATCH -v2] x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR Borislav Petkov 2020-09-10 18:42 ` [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only Luck, Tony 2020-09-10 18:54 ` Borislav Petkov 2020-09-10 19:43 ` Luck, Tony 2020-09-07 20:16 ` Andy Lutomirski 2020-09-07 20:27 ` Borislav Petkov 2020-09-11 9:47 ` [tip: ras/core] x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR tip-bot2 for Borislav Petkov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).