All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Gabriel L. Somlo" <gsomlo@gmail.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, afaerber@suse.de,
	agraf@suse.de
Subject: Re: [PATCH v2] kvm: x86: emulate monitor and mwait instructions as nop
Date: Mon, 2 Jun 2014 23:24:52 +0300	[thread overview]
Message-ID: <20140602202452.GA5701@redhat.com> (raw)
In-Reply-To: <20140602192530.GC1653@ERROL.INI.CMU.EDU>

On Mon, Jun 02, 2014 at 03:25:30PM -0400, Gabriel L. Somlo wrote:
> On Wed, May 07, 2014 at 04:52:13PM -0400, Gabriel L. Somlo wrote:
> > Treat monitor and mwait instructions as nop, which is architecturally
> > correct (but inefficient) behavior. We do this to prevent misbehaving
> > guests (e.g. OS X <= 10.7) from crashing after they fail to check for
> > monitor/mwait availability via cpuid.
> > 
> > Since mwait-based idle loops relying on these nop-emulated instructions
> > would keep the host CPU pegged at 100%, do NOT advertise their presence
> > via cpuid, to prevent compliant guests from using them inadvertently.
> > 
> > Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu>
> > ---
> > 
> > New in v2: remove invalid_op handler functions which were only used to
> >            handle exits caused by monitor and mwait
> > 
> > On Wed, May 07, 2014 at 08:31:27PM +0200, Alexander Graf wrote:
> > > On 05/07/2014 08:15 PM, Michael S. Tsirkin wrote:
> > > >If we really want to be paranoid and worry about guests
> > > >that use this strange way to trigger invalid opcode,
> > > >we can make it possible for userspace to enable/disable
> > > >this hack, and teach qemu to set it.
> > > >
> > > >That would make it even safer than it was.
> > > >
> > > >Not sure it's worth it, just a thought.
> > > 
> > > Since we don't trap on non-exposed other instructions (new SSE and
> > > whatdoiknow) I don't think it's really bad to just expose
> > > MONITOR/MWAIT as nops.
> 
> Would it make sense to make this a module parameter,
> (e.g., "int emulate_mwait") ?
> 
> Default would be 0 (no emulation). 1 would mean "emulate as nop", and
> if anyone ever figures out how to do proper page-locking based
> emulation we could use 2 to enable that, etc. ?

If it's a module parameter, default should be one that's
safe for all guests, that would be 1 in the list above.

> Not sure we'd want qemu to enable/disable it automatically, though...
> 
> What do you all think ?
> 
> Thanks,
> --Gabriel
> 
> > 
> > So AFAICT, linux prefers to use mwait for idling if cpuid tells it that
> > it's available. If we keep telling everyone that we do NOT have monitor
> > and mwait available, compliant guests will never end up using them, and
> > this hack would remain completely invisible to them, which is good
> > (better to use hlt-based idle loops when you're a vm guest, that would
> > actually allow the host to relax while you're halted :)
> > 
> > So the only time anyone would be able to tell we have this hack would be
> > when they're about to receive an invalid opcode for using monitor/mwait
> > in violation of what CPUID (would have) told them. That's what happens
> > to OS X prior to 10.8, which is when I'm hypothesizing the Apple devs
> > begain to seriously think about their OS running as a vm guest (on fusion
> > and parallels)...
> > 
> > Instead of killing the misbehaving guest with an invalid opcode, we'd
> > allow them to peg the host CPU with their monitor == mwait == nop idle
> > loop instead, which, at least on OS X, should be tolerable long enough
> > to run 'rm -rf System/Library/Extensions/AppleIntelCPUPowerManagement.kext'
> > and reboot the guest, after which things would settle down by reverting
> > the guest to a hlt-based idle loop.
> > 
> > The only reason I can think of to add functionality for enabling/disabling
> > this hack would be to protect against a malicious guest which would use
> > mwait *on purpose* to peg the host CPU. But a malicious guest could just
> > run "for(;;);" in ring 0 and accomplish the same goal, so we wouldn't
> > really gain anything in exchange for the added complexity...
> > 
> > Thanks,
> >   Gabriel
> > 
> >  arch/x86/kvm/cpuid.c |  2 ++
> >  arch/x86/kvm/svm.c   | 28 ++++++++++++++++++++--------
> >  arch/x86/kvm/vmx.c   | 20 ++++++++++++++++----
> >  3 files changed, 38 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index f47a104..d094fc6 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -283,6 +283,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >  		0 /* Reserved */ | f_lm | F(3DNOWEXT) | F(3DNOW);
> >  	/* cpuid 1.ecx */
> >  	const u32 kvm_supported_word4_x86_features =
> > +		/* NOTE: MONITOR (and MWAIT) are emulated as NOP,
> > +		 * but *not* advertised to guests via CPUID ! */
> >  		F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64, MONITOR */ |
> >  		0 /* DS-CPL, VMX, SMX, EST */ |
> >  		0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index 7f4f9c2..0b7d58d 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -2770,12 +2770,6 @@ static int xsetbv_interception(struct vcpu_svm *svm)
> >  	return 1;
> >  }
> >  
> > -static int invalid_op_interception(struct vcpu_svm *svm)
> > -{
> > -	kvm_queue_exception(&svm->vcpu, UD_VECTOR);
> > -	return 1;
> > -}
> > -
> >  static int task_switch_interception(struct vcpu_svm *svm)
> >  {
> >  	u16 tss_selector;
> > @@ -3287,6 +3281,24 @@ static int pause_interception(struct vcpu_svm *svm)
> >  	return 1;
> >  }
> >  
> > +static int nop_interception(struct vcpu_svm *svm)
> > +{
> > +	skip_emulated_instruction(&(svm->vcpu));
> > +	return 1;
> > +}
> > +
> > +static int monitor_interception(struct vcpu_svm *svm)
> > +{
> > +	printk_once(KERN_WARNING "kvm: MONITOR instruction emulated as NOP!\n");
> > +	return nop_interception(svm);
> > +}
> > +
> > +static int mwait_interception(struct vcpu_svm *svm)
> > +{
> > +	printk_once(KERN_WARNING "kvm: MWAIT instruction emulated as NOP!\n");
> > +	return nop_interception(svm);
> > +}
> > +
> >  static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
> >  	[SVM_EXIT_READ_CR0]			= cr_interception,
> >  	[SVM_EXIT_READ_CR3]			= cr_interception,
> > @@ -3344,8 +3356,8 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
> >  	[SVM_EXIT_CLGI]				= clgi_interception,
> >  	[SVM_EXIT_SKINIT]			= skinit_interception,
> >  	[SVM_EXIT_WBINVD]                       = emulate_on_interception,
> > -	[SVM_EXIT_MONITOR]			= invalid_op_interception,
> > -	[SVM_EXIT_MWAIT]			= invalid_op_interception,
> > +	[SVM_EXIT_MONITOR]			= monitor_interception,
> > +	[SVM_EXIT_MWAIT]			= mwait_interception,
> >  	[SVM_EXIT_XSETBV]			= xsetbv_interception,
> >  	[SVM_EXIT_NPF]				= pf_interception,
> >  };
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 33e8c02..3ccbcb1 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -5669,12 +5669,24 @@ static int handle_pause(struct kvm_vcpu *vcpu)
> >  	return 1;
> >  }
> >  
> > -static int handle_invalid_op(struct kvm_vcpu *vcpu)
> > +static int handle_nop(struct kvm_vcpu *vcpu)
> >  {
> > -	kvm_queue_exception(vcpu, UD_VECTOR);
> > +	skip_emulated_instruction(vcpu);
> >  	return 1;
> >  }
> >  
> > +static int handle_mwait(struct kvm_vcpu *vcpu)
> > +{
> > +	printk_once(KERN_WARNING "kvm: MWAIT instruction emulated as NOP!\n");
> > +	return handle_nop(vcpu);
> > +}
> > +
> > +static int handle_monitor(struct kvm_vcpu *vcpu)
> > +{
> > +	printk_once(KERN_WARNING "kvm: MONITOR instruction emulated as NOP!\n");
> > +	return handle_nop(vcpu);
> > +}
> > +
> >  /*
> >   * To run an L2 guest, we need a vmcs02 based on the L1-specified vmcs12.
> >   * We could reuse a single VMCS for all the L2 guests, but we also want the
> > @@ -6571,8 +6583,8 @@ static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
> >  	[EXIT_REASON_EPT_VIOLATION]	      = handle_ept_violation,
> >  	[EXIT_REASON_EPT_MISCONFIG]           = handle_ept_misconfig,
> >  	[EXIT_REASON_PAUSE_INSTRUCTION]       = handle_pause,
> > -	[EXIT_REASON_MWAIT_INSTRUCTION]	      = handle_invalid_op,
> > -	[EXIT_REASON_MONITOR_INSTRUCTION]     = handle_invalid_op,
> > +	[EXIT_REASON_MWAIT_INSTRUCTION]	      = handle_mwait,
> > +	[EXIT_REASON_MONITOR_INSTRUCTION]     = handle_monitor,
> >  	[EXIT_REASON_INVEPT]                  = handle_invept,
> >  };
> >  
> > -- 
> > 1.9.0
> > 

  parent reply	other threads:[~2014-06-02 20:24 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-07 20:52 [PATCH v2] kvm: x86: emulate monitor and mwait instructions as nop Gabriel L. Somlo
2014-06-02 19:25 ` Gabriel L. Somlo
2014-06-02 19:48   ` Alexander Graf
2014-06-02 20:20     ` Michael S. Tsirkin
2014-06-02 20:35       ` Alexander Graf
2014-06-02 20:41         ` Michael S. Tsirkin
2014-06-02 21:01           ` Alexander Graf
2014-06-03  1:55             ` Gabriel L. Somlo
2014-06-02 20:24   ` Michael S. Tsirkin [this message]
2014-06-03  9:17   ` Paolo Bonzini
2014-06-03 14:21     ` Gabriel L. Somlo
2014-06-03 15:37       ` Alexander Graf
2014-06-03 19:07         ` Gabriel L. Somlo
2014-06-10 10:16       ` Michael S. Tsirkin
2014-06-04 14:39     ` Gabriel L. Somlo
2014-06-04 14:44       ` Alexander Graf
2014-06-04 15:05         ` Gabriel L. Somlo
2014-06-04 15:09           ` Alexander Graf
2014-06-04 17:07             ` Gabriel L. Somlo
2014-06-04 19:06               ` Michael S. Tsirkin
2014-06-04 19:24                 ` Gabriel L. Somlo
2014-06-04 19:37                   ` Michael S. Tsirkin
2014-06-04 16:34         ` Paolo Bonzini
2014-06-04 19:08           ` Michael S. Tsirkin
2014-06-04 19:33             ` Gabriel L. Somlo
2014-06-04 19:40               ` Michael S. Tsirkin
2014-06-04 19:12           ` Nadav Amit
2014-06-04 19:43             ` Gabriel L. Somlo
2014-06-04 20:44           ` Borislav Petkov
2014-06-05 14:40             ` Eduardo Habkost
2014-06-05 20:59 ` Eric Northup
2014-06-05 21:19   ` Gabriel L. Somlo
     [not found] <46EF8587-E226-44C5-930A-49E4F7FBBC82@gmail.com>
2014-06-04 20:01 ` Nadav Amit
2014-06-04 20:11   ` Gabriel L. Somlo
2014-06-04 20:55     ` Nadav Amit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140602202452.GA5701@redhat.com \
    --to=mst@redhat.com \
    --cc=afaerber@suse.de \
    --cc=agraf@suse.de \
    --cc=gsomlo@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.