From: Andy Lutomirski <luto@amacapital.net>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>,
	kvm list <kvm@vger.kernel.org>, stable <stable@vger.kernel.org>
Subject: Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS
Date: Thu, 9 Apr 2020 08:03:09 -0700
Message-ID: <4EB5D96F-F322-45BB-9169-6BF932D413D4@amacapital.net>
In-Reply-To: <c09dd91f-c280-85a6-c2a2-d44a0d378bbc@redhat.com>



> On Apr 9, 2020, at 7:32 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 09/04/20 16:13, Andrew Cooper wrote:
>>> On 09/04/2020 13:47, Paolo Bonzini wrote:
>>> On 09/04/20 06:50, Andy Lutomirski wrote:
>>>> The small
>>>> (or maybe small) one is that any fancy protocol where the guest
>>>> returns from an exception by doing, logically:
>>>> 
>>>> Hey I'm done;  /* MOV somewhere, hypercall, MOV to CR4, whatever */
>>>> IRET;
>>>> 
>>>> is fundamentally racy.  After we say we're done and before IRET, we
>>>> can be recursively reentered.  Hi, NMI!
>>> That's possible in theory.  In practice there would be only two levels
>>> of nesting, one for the original page being loaded and one for the tail
>>> of the #VE handler.  The nested #VE would see IF=0, resolve the EPT
>>> violation synchronously and both handlers would finish.  For the tail
>>> page to be swapped out again, leading to more nesting, the host's LRU
>>> must be seriously messed up.
>>> 
>>> With IST it would be much messier, and I haven't quite understood why
>>> you believe the #VE handler should have an IST.
>> 
>> Any interrupt/exception which can possibly occur between a SYSCALL and
>> re-establishing a kernel stack (several instructions), must be IST to
>> avoid taking said exception on a user stack and being a trivial
>> privilege escalation.
> 
> Doh, of course.  I always confuse SYSCALL and SYSENTER.
> 
>> Therefore, it doesn't really matter if KVM's paravirt use of #VE does
>> respect the interrupt flag.  It is not sensible to build a paravirt
>> interface using #VE whose safety depends on never turning on
>> hardware-induced #VE's.
> 
> No, I think we wouldn't use a paravirt #VE at this point, we would use
> the real thing if available.
> 
> It would still be possible to switch from the IST to the main kernel
> stack before writing 0 to the reentrancy word.
> 
> 

Almost, but not quite. We do this for NMI-from-usermode, and it’s ugly. But we can’t do this for NMI-from-kernel or #VE-from-kernel, because there might not be a kernel stack. Trying to hack around this won’t be pretty.
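
To make the window concrete, here is a toy C model of the "say we're done, then IRET" problem on an IST stack. It is only a sketch, not kernel code: `toy_cpu`, `toy_deliver` and `toy_handler_exit` are made-up names, the single `ist_frame_ip` field stands in for the fixed-address IST frame, and `busy` stands in for the reentrancy word. The assumption is that delivery is suppressed while the word is nonzero and that every delivery lands on the same frame.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU with a single IST slot. */
struct toy_cpu {
    unsigned int busy;          /* reentrancy word; nonzero blocks delivery */
    unsigned long ist_frame_ip; /* return IP in the one fixed IST frame */
};

/* Try to deliver the exception. Returns false if the word is set, in
 * which case the host would have to resolve the fault synchronously. */
static bool toy_deliver(struct toy_cpu *cpu, unsigned long interrupted_ip)
{
    if (cpu->busy)
        return false;
    cpu->busy = 1;
    cpu->ist_frame_ip = interrupted_ip; /* IST: always the same slot */
    return true;
}

/* Handler tail: clear the word, then "IRET". A nonzero nested_ip
 * simulates a second delivery landing in the window between the two.
 * Returns the IP the outer IRET would actually use. */
static unsigned long toy_handler_exit(struct toy_cpu *cpu,
                                      unsigned long nested_ip)
{
    cpu->busy = 0;                      /* "Hey I'm done" */
    if (nested_ip && toy_deliver(cpu, nested_ip))
        cpu->busy = 0;                  /* nested handler ran and finished */
    return cpu->ist_frame_ip;           /* IRET reads the (shared) frame */
}
```

In this model, a handler entered from IP 0x1000 returns to 0x1000 if nothing intervenes, but if a nested delivery from 0x2000 lands in the window, the outer exit returns to 0x2000 because the frame was overwritten.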

Frankly, I think that we shouldn’t even try to report memory failure to the guest if it happens with interrupts off. Just kill the guest cleanly and keep it simple. Or inject an intentionally unrecoverable IST exception.
