From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753527AbcCNRFu (ORCPT <rfc822;w@1wt.eu>);
	Mon, 14 Mar 2016 13:05:50 -0400
Received: from mail-oi0-f45.google.com ([209.85.218.45]:33746 "EHLO
	mail-oi0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751941AbcCNRFq (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Mon, 14 Mar 2016 13:05:46 -0400
MIME-Version: 1.0
In-Reply-To: <20160314120202.GD15800@pd.tnic>
References: <cover.1457805972.git.luto@kernel.org> <a3b871a4eb533340d04255409dfecc94f88c647d.1457805972.git.luto@kernel.org>
 <20160314120202.GD15800@pd.tnic>
From: Andy Lutomirski <luto@amacapital.net>
Date: Mon, 14 Mar 2016 10:05:25 -0700
Message-ID: <CALCETrW6E0Nz6gSmRKTvHbQDhnHVpuhzmgZB1nZ3m-DL-Bt=tQ@mail.gmail.com>
Subject: Re: [PATCH v4 2/5] x86/msr: Carry on after a non-"safe" MSR access
 fails without !panic_on_oops
To: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>, X86 ML <x86@kernel.org>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Peter Zijlstra <peterz@infradead.org>, KVM list <kvm@vger.kernel.org>,
        Arjan van de Ven <arjan@linux.intel.com>,
        xen-devel <Xen-devel@lists.xen.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Andrew Morton <akpm@linux-foundation.org>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 14, 2016 at 5:02 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Sat, Mar 12, 2016 at 10:08:49AM -0800, Andy Lutomirski wrote:
>> This demotes an OOPS and likely panic due to a failed non-"safe" MSR
>> access to a WARN_ONCE and, for RDMSR, a return value of zero.  If
>> panic_on_oops is set, then failed unsafe MSR accesses will still
>> oops and panic.
>>
>> To be clear, this type of failure should *not* happen.  This patch
>> exists to minimize the chance of nasty undebuggable failures on
>> systems that used to work due to a now-fixed CONFIG_PARAVIRT=y bug.
>>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> ---
>>  arch/x86/include/asm/msr.h | 10 ++++++++--
>>  arch/x86/mm/extable.c      | 33 +++++++++++++++++++++++++++++++++
>>  2 files changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
>> index 93fb7c1cffda..1487054a1a70 100644
>> --- a/arch/x86/include/asm/msr.h
>> +++ b/arch/x86/include/asm/msr.h
>> @@ -92,7 +92,10 @@ static inline unsigned long long native_read_msr(unsigned int msr)
>>  {
>>       DECLARE_ARGS(val, low, high);
>>
>> -     asm volatile("rdmsr" : EAX_EDX_RET(val, low, high) : "c" (msr));
>> +     asm volatile("1: rdmsr\n"
>> +                  "2:\n"
>> +                  _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_rdmsr_unsafe)
>> +                  : EAX_EDX_RET(val, low, high) : "c" (msr));
>>       if (msr_tracepoint_active(__tracepoint_read_msr))
>>               do_trace_read_msr(msr, EAX_EDX_VAL(val, low, high), 0);
>>       return EAX_EDX_VAL(val, low, high);
>> @@ -119,7 +122,10 @@ static inline unsigned long long native_read_msr_safe(unsigned int msr,
>>  static inline void native_write_msr(unsigned int msr,
>>                                   unsigned low, unsigned high)
>>  {
>> -     asm volatile("wrmsr" : : "c" (msr), "a"(low), "d" (high) : "memory");
>> +     asm volatile("1: wrmsr\n"
>> +                  "2:\n"
>> +                  _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_wrmsr_unsafe)
>
> This might be a good idea:
>
> [    0.220066] cpuidle: using governor menu
> [    0.224000] ------------[ cut here ]------------
> [    0.224000] WARNING: CPU: 0 PID: 1 at arch/x86/mm/extable.c:74 ex_handler_wrmsr_unsafe+0x73/0x80()
> [    0.224000] unchecked MSR access error: WRMSR to 0xdeadbeef (tried to write 0x000000000000caca)
> [    0.224000] Modules linked in:
> [    0.224000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc7+ #7
> [    0.224000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> [    0.224000]  0000000000000000 ffff88007c0d7c08 ffffffff812f13a3 ffff88007c0d7c50
> [    0.224000]  ffffffff81a40ffe ffff88007c0d7c40 ffffffff8105c3b1 ffffffff81717710
> [    0.224000]  ffff88007c0d7d18 0000000000000000 ffffffff816207d0 0000000000000000
> [    0.224000] Call Trace:
> [    0.224000]  [<ffffffff812f13a3>] dump_stack+0x67/0x94
> [    0.224000]  [<ffffffff8105c3b1>] warn_slowpath_common+0x91/0xd0
> [    0.224000]  [<ffffffff816207d0>] ? amd_cpu_notify+0x40/0x40
> [    0.224000]  [<ffffffff8105c43c>] warn_slowpath_fmt+0x4c/0x50
> [    0.224000]  [<ffffffff816207d0>] ? amd_cpu_notify+0x40/0x40
> [    0.224000]  [<ffffffff8131de53>] ? __this_cpu_preempt_check+0x13/0x20
> [    0.224000]  [<ffffffff8104efe3>] ex_handler_wrmsr_unsafe+0x73/0x80
>
> and it looks helpful and all but when you do it pretty early - for
> example I added a
>
>          wrmsrl(0xdeadbeef, 0xcafe);
>
> at the end of pat_bsp_init() and the machine explodes with an early
> panic. I'm wondering what is better - early panic or an early #GP from a
> missing MSR.

You're hitting this check in the early-boot fixup path:

    /* special handling not supported during early boot */
    if (handler != ex_handler_default)
        return 0;


which means that, this early in boot, the behavior with and without my
series applied is identical, for better or for worse.

>
> And more specifically, can we do better to handle the early case
> gracefully too?

We could probably remove that check and let custom fixups run early.
I don't see any compelling reason to keep them disabled.  That should
probably be a separate change, though.
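
To make the effect of that check concrete, here is a rough userspace
model of the decision.  It is only a sketch: every name in it
(model_regs, model_handler_*, model_early_fixup, allow_custom_early) is
made up for illustration and is not the kernel's actual code.  With the
flag clear it mirrors the check quoted above; with it set it shows what
removing the check would allow.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the saved register state; not pt_regs. */
struct model_regs {
	unsigned long ip;
	unsigned long ax;
	unsigned long dx;
};

typedef bool (*model_handler_t)(struct model_regs *regs,
				unsigned long fixup_ip);

/* Default fixup: just resume at the fixup address. */
static bool model_handler_default(struct model_regs *regs,
				  unsigned long fixup_ip)
{
	regs->ip = fixup_ip;
	return true;
}

/*
 * Custom fixup for a failed non-"safe" WRMSR: the real handler would
 * warn once here; the model only skips the faulting instruction.
 */
static bool model_handler_wrmsr_unsafe(struct model_regs *regs,
				       unsigned long fixup_ip)
{
	regs->ip = fixup_ip;
	return true;
}

/*
 * Custom fixup for a failed non-"safe" RDMSR: skip the instruction
 * and make the read return zero.
 */
static bool model_handler_rdmsr_unsafe(struct model_regs *regs,
				       unsigned long fixup_ip)
{
	regs->ip = fixup_ip;
	regs->ax = 0;
	regs->dx = 0;
	return true;
}

/*
 * Returns 1 if the fault was fixed up, 0 if not (the caller would
 * then panic).  allow_custom_early == false models the current check;
 * true models the check being removed.
 */
static int model_early_fixup(model_handler_t handler,
			     struct model_regs *regs,
			     unsigned long fixup_ip,
			     bool allow_custom_early)
{
	/* special handling not supported during early boot */
	if (!allow_custom_early && handler != model_handler_default)
		return 0;

	return handler(regs, fixup_ip) ? 1 : 0;
}
```

In this model, an early fault whose table entry points at
model_handler_wrmsr_unsafe is left unfixed while the flag is false, so
the caller panics as it did before the series.  With the flag true, the
same fault is fixed up and execution would continue after the WRMSR.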

--Andy