LKML Archive on lore.kernel.org
 help / color / Atom feed
From: "David P. Reed" <dpreed@deepplum.com>
To: "Sean Christopherson" <sean.j.christopherson@intel.com>
Cc: "Andy Lutomirski" <luto@amacapital.net>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>, "X86 ML" <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Allison Randal" <allison@lohutok.net>,
	"Enrico Weigelt" <info@metux.net>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Kate Stewart" <kstewart@linuxfoundation.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Martin Molnar" <martin.molnar.programming@gmail.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Alexandre Chartre" <alexandre.chartre@oracle.com>,
	"Jann Horn" <jannh@google.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 2/3] Fix undefined operation fault that can hang a cpu on crash or panic
Date: Tue, 7 Jul 2020 15:09:38 -0400 (EDT)
Message-ID: <1594148978.965916054@apps.rackspace.com> (raw)
In-Reply-To: <20200707050932.GF5208@linux.intel.com>



On Tuesday, July 7, 2020 1:09am, "Sean Christopherson" <sean.j.christopherson@intel.com> said:

> On Sat, Jul 04, 2020 at 04:38:08PM -0400, David P. Reed wrote:
>> Fix: Mask undefined operation fault during emergency VMXOFF that must be
>> attempted to force cpu exit from VMX root operation.
>> Explanation: When a cpu may be in VMX root operation (only possible when
>> CR4.VMXE is set), crash or panic reboot tries to exit VMX root operation
>> using VMXOFF. This is necessary, because any INIT will be masked while cpu
>> is in VMX root operation, but that state cannot be reliably
>> discerned by the state of the cpu.
>> VMXOFF faults if the cpu is not actually in VMX root operation, signalling
>> undefined operation.
>> Discovered while debugging an out-of-tree x-visor with a race. Can happen
>> due to certain kinds of bugs in KVM.
>>
>> Fixes: 208067 <https://bugzilla.kernel.org/show_bug.cgi?id=208067>
>> Reported-by: David P. Reed <dpreed@deepplum.com>
>> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> Suggested-by: Andy Lutomirski <luto@kernel.org>
>> Signed-off-by: David P. Reed <dpreed@deepplum.com>
>> ---
>>  arch/x86/include/asm/virtext.h | 20 ++++++++++++++------
>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
>> index 0ede8d04535a..0e0900eacb9c 100644
>> --- a/arch/x86/include/asm/virtext.h
>> +++ b/arch/x86/include/asm/virtext.h
>> @@ -30,11 +30,11 @@ static inline int cpu_has_vmx(void)
>>  }
>>
>>
>> -/* Disable VMX on the current CPU
>> +/* Exit VMX root mode and isable VMX on the current CPU.
>>   *
>>   * vmxoff causes a undefined-opcode exception if vmxon was not run
>> - * on the CPU previously. Only call this function if you know VMX
>> - * is enabled.
>> + * on the CPU previously. Only call this function if you know cpu
>> + * is in VMX root mode.
>>   */
>>  static inline void cpu_vmxoff(void)
>>  {
>> @@ -47,14 +47,22 @@ static inline int cpu_vmx_enabled(void)
>>  	return __read_cr4() & X86_CR4_VMXE;
>>  }
>>
>> -/* Disable VMX if it is enabled on the current CPU
>> +/* Safely exit VMX root mode and disable VMX if VMX enabled
>> + * on the current CPU. Handle undefined-opcode fault
>> + * that can occur if cpu is not in VMX root mode, due
>> + * to a race.
>>   *
>>   * You shouldn't call this if cpu_has_vmx() returns 0.
>>   */
>>  static inline void __cpu_emergency_vmxoff(void)
>>  {
>> -	if (cpu_vmx_enabled())
>> -		cpu_vmxoff();
>> +	if (!cpu_vmx_enabled())
>> +		return;
>> +	asm volatile ("1:vmxoff\n\t"
>> +		      "2:\n\t"
>> +		      _ASM_EXTABLE(1b, 2b)
>> +		      ::: "cc", "memory");
>> +	cr4_clear_bits(X86_CR4_VMXE);
> 
> Open coding vmxoff doesn't make sense, and IMO is flat out wrong as it fixes
> flows that use __cpu_emergency_vmxoff() but leaves the same bug hanging
> around in emergency_vmx_disable_all() until the next patch.
> 
> The reason I say it doesn't make sense is that there is no sane scenario
> where the generic vmxoff helper should _not_ eat the fault.  All other VMXOFF
> faults are mode related, i.e. any fault is guaranteed to be due to the
> !post-VMXON check unless we're magically in RM, VM86, compat mode, or at
> CPL>0.  Given that the whole point of this series is that it's impossible to
> determine whether or not the CPU if post-VMXON if CR4.VMXE=1 without taking a
> fault of some form, there's simply no way that anything except the hypervisor
> (in normal operation) can know the state of VMX.  And given that the only
> in-tree hypervisor (KVM) has its own version of vmxoff, that means there is
> no scenario in which cpu_vmxoff() can safely be used.  Case in point, after
> the next patch there are no users of cpu_vmxoff().
> 
> TL;DR: Just do fixup on cpu_vmxoff().

Personally, I don't care either way, since it fixes the bug either way (and it's inlined, so either way no additional code is generated. I was just being conservative since the cpu_vmxoff() is exported throughout the kernel source, so it might be expected to stay the same (when not in an "emergency"). I'll wait a day or two for any objections to just doing the fix in cpu_vmxoff() by other commenters. WIth no objection, I'll just do it that way.

Sean, are you the one who would get this particular fix pushed into Linus's tree, by the way? The "maintainership" is not clear to me. If you are, happy to take direction from you as the primary input.

> 
>>  }
>>
>>  /* Disable VMX if it is supported and enabled on the current CPU
>> --
>> 2.26.2
>>
> 



  reply index

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-25 14:45 [PATCH v2] Fix undefined operation VMXOFF during reboot and crash David P. Reed
2020-06-25 14:59 ` David P. Reed
2020-06-29 20:54   ` David P. Reed
2020-06-29 21:22     ` Andy Lutomirski
2020-06-29 21:49       ` Sean Christopherson
2020-06-29 22:46         ` David P. Reed
2020-07-04 20:38         ` [PATCH v3 0/3] " David P. Reed
2020-07-04 20:38           ` [PATCH v3 1/3] Correct asm VMXOFF side effects David P. Reed
2020-07-05  5:46             ` Randy Dunlap
2020-07-04 20:38           ` [PATCH v3 2/3] Fix undefined operation fault that can hang a cpu on crash or panic David P. Reed
2020-07-05 18:22             ` Andy Lutomirski
2020-07-05 19:52               ` David P. Reed
2020-07-05 20:55                 ` Andy Lutomirski
2020-07-05 22:07                   ` David P. Reed
2020-07-07  5:09             ` Sean Christopherson
2020-07-07 19:09               ` David P. Reed [this message]
2020-07-07 19:24                 ` Sean Christopherson
2020-07-07 19:52                   ` David P. Reed
2020-07-04 20:38           ` [PATCH v3 3/3] Force all cpus to exit VMX root operation on crash/panic reliably David P. Reed
2020-07-05 18:26             ` Andy Lutomirski
2020-07-05 20:00               ` David P. Reed
2020-07-05 20:53                 ` Andy Lutomirski
2020-07-07  5:29                   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1594148978.965916054@apps.rackspace.com \
    --to=dpreed@deepplum.com \
    --cc=alexandre.chartre@oracle.com \
    --cc=allison@lohutok.net \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=info@metux.net \
    --cc=jannh@google.com \
    --cc=kstewart@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=martin.molnar.programming@gmail.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=sean.j.christopherson@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git