All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] SVM NMI fixes
@ 2011-02-03 15:02 Avi Kivity
  2011-02-03 15:02 ` [PATCH 1/2] KVM: Fix race between nmi injection and enabling nmi window Avi Kivity
  2011-02-03 15:02 ` [PATCH 2/2] KVM: SVM: check for progress after IRET interception Avi Kivity
  0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2011-02-03 15:02 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm; +Cc: Jan Kiszka, Joerg Roedel

There are a couple of fairly severe problems with NMI on AMD, both triggered
with nmi_watchdog=1 in the guest and kvm ftrace in the host.  One of the bug
leads to guest userspace crashes via spurious setting of EFLAGS.TF, while the
other leads to guest kernel hangs looping on the NMI handler's IRET
instruction.  I believe ftrace only affects timing here, and is not a real
requirement to reproduce the bug.

See https://bugzilla.redhat.com/show_bug.cgi?id=612436 for the original report.

I will try to write unit tests for both issues.

Avi Kivity (2):
  KVM: Fix race between nmi injection and enabling nmi window
  KVM: SVM: check for progress after IRET interception

 arch/x86/kvm/svm.c       |   10 +++++++++-
 arch/x86/kvm/x86.c       |    4 +++-
 include/linux/kvm_host.h |    1 +
 3 files changed, 13 insertions(+), 2 deletions(-)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] KVM: Fix race between nmi injection and enabling nmi window
  2011-02-03 15:02 [PATCH 0/2] SVM NMI fixes Avi Kivity
@ 2011-02-03 15:02 ` Avi Kivity
  2011-02-03 15:11   ` Jan Kiszka
  2011-02-03 15:02 ` [PATCH 2/2] KVM: SVM: check for progress after IRET interception Avi Kivity
  1 sibling, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2011-02-03 15:02 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm; +Cc: Jan Kiszka, Joerg Roedel

The interrupt injection logic looks something like

  if an nmi is pending, and nmi injection allowed
    inject nmi
  if an nmi is pending
    request exit on nmi window

the problem is that "nmi is pending" can be set asynchronously by
the PIT; if it happens to fire between the two if statements, we
will request an nmi window even though nmi injection is allowed.  On
SVM, this has disasterous results, since it causes eflags.TF to be
set in random guest code.

The fix is simple; make nmi_pending asynchronous using the standard
vcpu->requests mechanism; this ensures the code above is completely
synchronous wrt nmi_pending.

Signed-off-by: Avi Kivity <avi@redhat.com>
---
 arch/x86/kvm/x86.c       |    4 +++-
 include/linux/kvm_host.h |    1 +
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a7f65aa..abe76c0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -360,8 +360,8 @@ void kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
 
 void kvm_inject_nmi(struct kvm_vcpu *vcpu)
 {
+	kvm_make_request(KVM_REQ_NMI, vcpu);
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
-	vcpu->arch.nmi_pending = 1;
 }
 EXPORT_SYMBOL_GPL(kvm_inject_nmi);
 
@@ -5182,6 +5182,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			r = 1;
 			goto out;
 		}
+		if (kvm_check_request(KVM_REQ_NMI, vcpu))
+			vcpu->arch.nmi_pending = true;
 	}
 
 	r = kvm_mmu_reload(vcpu);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c8dee22..7581090 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -43,6 +43,7 @@
 #define KVM_REQ_DEACTIVATE_FPU    10
 #define KVM_REQ_EVENT             11
 #define KVM_REQ_APF_HALT          12
+#define KVM_REQ_NMI               13
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID	0
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 15:02 [PATCH 0/2] SVM NMI fixes Avi Kivity
  2011-02-03 15:02 ` [PATCH 1/2] KVM: Fix race between nmi injection and enabling nmi window Avi Kivity
@ 2011-02-03 15:02 ` Avi Kivity
  2011-02-03 15:07   ` Avi Kivity
  1 sibling, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2011-02-03 15:02 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm; +Cc: Jan Kiszka, Joerg Roedel

When we enable an NMI window, we ask for an IRET intercept, since
the IRET re-enables NMIs.  However, the IRET intercept happens before
the instruction executes, while the NMI window architecturally opens
afterwards.

To compensate for this mismatch, we only open the NMI window in the
following exit, assuming that the IRET has by then executed; however,
this assumption is not always correct; we may exit due to a host interrupt
or page fault, without having executed the instruction.

Fix by checking for forward progress by recording and comparing the IRET's
rip.  This is somewhat of a hack, since an unchaging rip does not mean that
no forward progress has been made, but is the simplest fix for now.

Signed-off-by: Avi Kivity <avi@redhat.com>
---
 arch/x86/kvm/svm.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 73a8f1d..53c5d8a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -135,6 +135,8 @@ struct vcpu_svm {
 
 	u32 *msrpm;
 
+	ulong nmi_iret_rip;
+
 	struct nested_state nested;
 
 	bool nmi_singlestep;
@@ -2653,6 +2655,7 @@ static int iret_interception(struct vcpu_svm *svm)
 	++svm->vcpu.stat.nmi_window_exits;
 	clr_intercept(svm, INTERCEPT_IRET);
 	svm->vcpu.arch.hflags |= HF_IRET_MASK;
+	svm->nmi_iret_rip = kvm_rip_read(&svm->vcpu);
 	return 1;
 }
 
@@ -3472,7 +3475,12 @@ static void svm_complete_interrupts(struct vcpu_svm *svm)
 
 	svm->int3_injected = 0;
 
-	if (svm->vcpu.arch.hflags & HF_IRET_MASK) {
+	/*
+	 * If we've made progress since setting HF_IRET_MASK, we've
+	 * executed an IRET and can allow NMI injection.
+	 */
+	if ((svm->vcpu.arch.hflags & HF_IRET_MASK)
+	    && kvm_rip_read(&svm->vcpu) != svm->nmi_iret_rip) {
 		svm->vcpu.arch.hflags &= ~(HF_NMI_MASK | HF_IRET_MASK);
 		kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 	}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 15:02 ` [PATCH 2/2] KVM: SVM: check for progress after IRET interception Avi Kivity
@ 2011-02-03 15:07   ` Avi Kivity
  2011-02-03 15:21     ` Jan Kiszka
  2011-02-08 13:49     ` Marcelo Tosatti
  0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2011-02-03 15:07 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm; +Cc: Jan Kiszka, Joerg Roedel

On 02/03/2011 05:02 PM, Avi Kivity wrote:
> When we enable an NMI window, we ask for an IRET intercept, since
> the IRET re-enables NMIs.  However, the IRET intercept happens before
> the instruction executes, while the NMI window architecturally opens
> afterwards.
>
> To compensate for this mismatch, we only open the NMI window in the
> following exit, assuming that the IRET has by then executed; however,
> this assumption is not always correct; we may exit due to a host interrupt
> or page fault, without having executed the instruction.
>
> Fix by checking for forward progress by recording and comparing the IRET's
> rip.  This is somewhat of a hack, since an unchaging rip does not mean that
> no forward progress has been made, but is the simplest fix for now.
>

So what would be a better fix?  We could unconditionally single step on 
iret_interception() which would fix the problem at the cost of making 
NMIs less efficient (three exits instead of two).  We could emulate the 
IRET (doubling kvm's code and likely slower, and certainly buggier, than 
the first option).  Alternatively, can anyone think of a reliable way to 
make sure forward progress has been made?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] KVM: Fix race between nmi injection and enabling nmi window
  2011-02-03 15:02 ` [PATCH 1/2] KVM: Fix race between nmi injection and enabling nmi window Avi Kivity
@ 2011-02-03 15:11   ` Jan Kiszka
  2011-02-03 15:15     ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2011-02-03 15:11 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 2011-02-03 16:02, Avi Kivity wrote:
> The interrupt injection logic looks something like
> 
>   if an nmi is pending, and nmi injection allowed
>     inject nmi
>   if an nmi is pending
>     request exit on nmi window
> 
> the problem is that "nmi is pending" can be set asynchronously by
> the PIT; if it happens to fire between the two if statements, we
> will request an nmi window even though nmi injection is allowed.  On
> SVM, this has disasterous results, since it causes eflags.TF to be
> set in random guest code.

Good point. Fortunately never seen on production machines so far here
(we have very moderate NMI rates).

> 
> The fix is simple; make nmi_pending asynchronous using the standard

You mean synchronous, no?

> vcpu->requests mechanism; this ensures the code above is completely
> synchronous wrt nmi_pending.
> 
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
>  arch/x86/kvm/x86.c       |    4 +++-
>  include/linux/kvm_host.h |    1 +
>  2 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a7f65aa..abe76c0 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -360,8 +360,8 @@ void kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
>  
>  void kvm_inject_nmi(struct kvm_vcpu *vcpu)
>  {
> +	kvm_make_request(KVM_REQ_NMI, vcpu);
>  	kvm_make_request(KVM_REQ_EVENT, vcpu);
> -	vcpu->arch.nmi_pending = 1;
>  }
>  EXPORT_SYMBOL_GPL(kvm_inject_nmi);
>  
> @@ -5182,6 +5182,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  			r = 1;
>  			goto out;
>  		}
> +		if (kvm_check_request(KVM_REQ_NMI, vcpu))
> +			vcpu->arch.nmi_pending = true;
>  	}
>  
>  	r = kvm_mmu_reload(vcpu);
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index c8dee22..7581090 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -43,6 +43,7 @@
>  #define KVM_REQ_DEACTIVATE_FPU    10
>  #define KVM_REQ_EVENT             11
>  #define KVM_REQ_APF_HALT          12
> +#define KVM_REQ_NMI               13
>  
>  #define KVM_USERSPACE_IRQ_SOURCE_ID	0
>  

Looks good.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] KVM: Fix race between nmi injection and enabling nmi window
  2011-02-03 15:11   ` Jan Kiszka
@ 2011-02-03 15:15     ` Avi Kivity
  0 siblings, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2011-02-03 15:15 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 02/03/2011 05:11 PM, Jan Kiszka wrote:
> On 2011-02-03 16:02, Avi Kivity wrote:
> >  The interrupt injection logic looks something like
> >
> >    if an nmi is pending, and nmi injection allowed
> >      inject nmi
> >    if an nmi is pending
> >      request exit on nmi window
> >
> >  the problem is that "nmi is pending" can be set asynchronously by
> >  the PIT; if it happens to fire between the two if statements, we
> >  will request an nmi window even though nmi injection is allowed.  On
> >  SVM, this has disasterous results, since it causes eflags.TF to be
> >  set in random guest code.
>
> Good point. Fortunately never seen on production machines so far here
> (we have very moderate NMI rates).

I've never seen it either, except with ftrace enabled.  I wonder what 
the connection is.

> >
> >  The fix is simple; make nmi_pending asynchronous using the standard
>
> You mean synchronous, no?
>

Yes.

> >  vcpu->requests mechanism; this ensures the code above is completely
> >  synchronous wrt nmi_pending.
> >

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 15:07   ` Avi Kivity
@ 2011-02-03 15:21     ` Jan Kiszka
  2011-02-03 15:30       ` Avi Kivity
  2011-02-08 13:49     ` Marcelo Tosatti
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2011-02-03 15:21 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 2011-02-03 16:07, Avi Kivity wrote:
> On 02/03/2011 05:02 PM, Avi Kivity wrote:
>> When we enable an NMI window, we ask for an IRET intercept, since
>> the IRET re-enables NMIs.  However, the IRET intercept happens before
>> the instruction executes, while the NMI window architecturally opens
>> afterwards.
>>
>> To compensate for this mismatch, we only open the NMI window in the
>> following exit, assuming that the IRET has by then executed; however,
>> this assumption is not always correct; we may exit due to a host interrupt
>> or page fault, without having executed the instruction.
>>
>> Fix by checking for forward progress by recording and comparing the IRET's
>> rip.  This is somewhat of a hack, since an unchaging rip does not mean that
>> no forward progress has been made, but is the simplest fix for now.
>>
> 
> So what would be a better fix?  We could unconditionally single step on 
> iret_interception() which would fix the problem at the cost of making 
> NMIs less efficient (three exits instead of two).  We could emulate the 
> IRET (doubling kvm's code and likely slower, and certainly buggier, than 
> the first option).  Alternatively, can anyone think of a reliable way to 
> make sure forward progress has been made?

Joerg and I discussed this a few times, I think last on the KVM forum.
It's really tricky and we found no option without limitations.
Single-stepping, e.g., already pollutes the guest state (if an exception
is taken without prior vmexit).

I don't recall all alternatives, but a vmexit-saving one was (IIRC) to
fall back to an interrupt window without IRET interception, likely
augmented with some break-out timer like we do for oldish, vnmi-lacking
Intels.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 15:21     ` Jan Kiszka
@ 2011-02-03 15:30       ` Avi Kivity
  2011-02-03 15:55         ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2011-02-03 15:30 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 02/03/2011 05:21 PM, Jan Kiszka wrote:
> >
> >  So what would be a better fix?  We could unconditionally single step on
> >  iret_interception() which would fix the problem at the cost of making
> >  NMIs less efficient (three exits instead of two).  We could emulate the
> >  IRET (doubling kvm's code and likely slower, and certainly buggier, than
> >  the first option).  Alternatively, can anyone think of a reliable way to
> >  make sure forward progress has been made?
>
> Joerg and I discussed this a few times, I think last on the KVM forum.
> It's really tricky and we found no option without limitations.
> Single-stepping, e.g., already pollutes the guest state (if an exception
> is taken without prior vmexit).

We could enable all intercepts when single stepping.  if we get a 
single-step #DB, we clear nmi blocking and are happy.  If we get another 
exception, or a non-single-step #DB, we re-enable the IRET intercept, 
clear single-step, re-inject the exception, and let the guest continue.

> I don't recall all alternatives, but a vmexit-saving one was (IIRC) to
> fall back to an interrupt window without IRET interception, likely
> augmented with some break-out timer like we do for oldish, vnmi-lacking
> Intels.

What's an interrupt window without IRET interception?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 15:30       ` Avi Kivity
@ 2011-02-03 15:55         ` Jan Kiszka
  2011-02-03 15:58           ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2011-02-03 15:55 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 2011-02-03 16:30, Avi Kivity wrote:
> On 02/03/2011 05:21 PM, Jan Kiszka wrote:
>>>
>>>  So what would be a better fix?  We could unconditionally single step on
>>>  iret_interception() which would fix the problem at the cost of making
>>>  NMIs less efficient (three exits instead of two).  We could emulate the
>>>  IRET (doubling kvm's code and likely slower, and certainly buggier, than
>>>  the first option).  Alternatively, can anyone think of a reliable way to
>>>  make sure forward progress has been made?
>>
>> Joerg and I discussed this a few times, I think last on the KVM forum.
>> It's really tricky and we found no option without limitations.
>> Single-stepping, e.g., already pollutes the guest state (if an exception
>> is taken without prior vmexit).
> 
> We could enable all intercepts when single stepping.  if we get a 
> single-step #DB, we clear nmi blocking and are happy.  If we get another 
> exception, or a non-single-step #DB, we re-enable the IRET intercept, 
> clear single-step, re-inject the exception, and let the guest continue.

Yes, I once explored this direction. It should be feasible (though
complex), need to check again why I suspended the effort.

> 
>> I don't recall all alternatives, but a vmexit-saving one was (IIRC) to
>> fall back to an interrupt window without IRET interception, likely
>> augmented with some break-out timer like we do for oldish, vnmi-lacking
>> Intels.
> 
> What's an interrupt window without IRET interception?

I don't the details, but I thought you could get something like an
interrupt-window-open interception by (fake-)injecting an IRQ and
intercepting on VIRQ acceptance. That will not work if returning to and
staying in irq-disabled guest code, therefore the timeout, but it should
be most efficient (specifically if the guest uses NMIs for things like
perf).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 15:55         ` Jan Kiszka
@ 2011-02-03 15:58           ` Avi Kivity
  2011-02-03 16:14             ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2011-02-03 15:58 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 02/03/2011 05:55 PM, Jan Kiszka wrote:
> >
> >  What's an interrupt window without IRET interception?
>
> I don't the details, but I thought you could get something like an
> interrupt-window-open interception by (fake-)injecting an IRQ and
> intercepting on VIRQ acceptance. That will not work if returning to and
> staying in irq-disabled guest code, therefore the timeout, but it should
> be most efficient (specifically if the guest uses NMIs for things like
> perf).
>

Since NMIs are used to break out of irq-disabled regions (watchdog, NMI 
IPIs during reboots) I'm wary of such a solution.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 15:58           ` Avi Kivity
@ 2011-02-03 16:14             ` Jan Kiszka
  2011-02-03 16:20               ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2011-02-03 16:14 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 2011-02-03 16:58, Avi Kivity wrote:
> On 02/03/2011 05:55 PM, Jan Kiszka wrote:
>>>
>>>  What's an interrupt window without IRET interception?
>>
>> I don't the details, but I thought you could get something like an
>> interrupt-window-open interception by (fake-)injecting an IRQ and
>> intercepting on VIRQ acceptance. That will not work if returning to and
>> staying in irq-disabled guest code, therefore the timeout, but it should
>> be most efficient (specifically if the guest uses NMIs for things like
>> perf).
>>
> 
> Since NMIs are used to break out of irq-disabled regions (watchdog, NMI 
> IPIs during reboots) I'm wary of such a solution.

Right, but we already use it for Intel. The timeout ensures that you
can't get stuck forever. I think Xen works this way as well (minus the
timeout - last time I checked).

I hope AMD would finally realize what the left behind and improve it so
that we can declare whatever "nice" solution just a temporary
workaround. Will still take a few years, but we had the same situation
on Intel.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 16:14             ` Jan Kiszka
@ 2011-02-03 16:20               ` Avi Kivity
  2011-02-03 16:30                 ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2011-02-03 16:20 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 02/03/2011 06:14 PM, Jan Kiszka wrote:
> On 2011-02-03 16:58, Avi Kivity wrote:
> >  On 02/03/2011 05:55 PM, Jan Kiszka wrote:
> >>>
> >>>   What's an interrupt window without IRET interception?
> >>
> >>  I don't the details, but I thought you could get something like an
> >>  interrupt-window-open interception by (fake-)injecting an IRQ and
> >>  intercepting on VIRQ acceptance. That will not work if returning to and
> >>  staying in irq-disabled guest code, therefore the timeout, but it should
> >>  be most efficient (specifically if the guest uses NMIs for things like
> >>  perf).
> >>
> >
> >  Since NMIs are used to break out of irq-disabled regions (watchdog, NMI
> >  IPIs during reboots) I'm wary of such a solution.
>
> Right, but we already use it for Intel. The timeout ensures that you
> can't get stuck forever. I think Xen works this way as well (minus the
> timeout - last time I checked).

Only without vnmi support, yes?  In that case, we can't do any better.  
In this case, we can, and we should, even at the expense of performance 
or ridiculous complexity.

> I hope AMD would finally realize what the left behind and improve it so
> that we can declare whatever "nice" solution just a temporary
> workaround. Will still take a few years, but we had the same situation
> on Intel.

Me, too, except that I'd like a correct implementation on the existing 
ISA.  As time goes by, it becomes more and more difficult to declare 
that all previous processors are an unimportant minority.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 16:20               ` Avi Kivity
@ 2011-02-03 16:30                 ` Jan Kiszka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kiszka @ 2011-02-03 16:30 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, Joerg Roedel

On 2011-02-03 17:20, Avi Kivity wrote:
> On 02/03/2011 06:14 PM, Jan Kiszka wrote:
>> On 2011-02-03 16:58, Avi Kivity wrote:
>>>  On 02/03/2011 05:55 PM, Jan Kiszka wrote:
>>>>>
>>>>>   What's an interrupt window without IRET interception?
>>>>
>>>>  I don't the details, but I thought you could get something like an
>>>>  interrupt-window-open interception by (fake-)injecting an IRQ and
>>>>  intercepting on VIRQ acceptance. That will not work if returning to and
>>>>  staying in irq-disabled guest code, therefore the timeout, but it should
>>>>  be most efficient (specifically if the guest uses NMIs for things like
>>>>  perf).
>>>>
>>>
>>>  Since NMIs are used to break out of irq-disabled regions (watchdog, NMI
>>>  IPIs during reboots) I'm wary of such a solution.
>>
>> Right, but we already use it for Intel. The timeout ensures that you
>> can't get stuck forever. I think Xen works this way as well (minus the
>> timeout - last time I checked).
> 
> Only without vnmi support, yes?  In that case, we can't do any better.  
> In this case, we can, and we should, even at the expense of performance 
> or ridiculous complexity.

OK, then I guess we should explore the single-step approach and make it
waterproof. It's likely still much simpler than iret emulation.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-03 15:07   ` Avi Kivity
  2011-02-03 15:21     ` Jan Kiszka
@ 2011-02-08 13:49     ` Marcelo Tosatti
  2011-02-08 14:05       ` Avi Kivity
  1 sibling, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2011-02-08 13:49 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Jan Kiszka, Joerg Roedel

On Thu, Feb 03, 2011 at 05:07:33PM +0200, Avi Kivity wrote:
> On 02/03/2011 05:02 PM, Avi Kivity wrote:
> >When we enable an NMI window, we ask for an IRET intercept, since
> >the IRET re-enables NMIs.  However, the IRET intercept happens before
> >the instruction executes, while the NMI window architecturally opens
> >afterwards.
> >
> >To compensate for this mismatch, we only open the NMI window in the
> >following exit, assuming that the IRET has by then executed; however,
> >this assumption is not always correct; we may exit due to a host interrupt
> >or page fault, without having executed the instruction.
> >
> >Fix by checking for forward progress by recording and comparing the IRET's
> >rip.  This is somewhat of a hack, since an unchaging rip does not mean that
> >no forward progress has been made, but is the simplest fix for now.
> >

Looks good.

> So what would be a better fix?  We could unconditionally single step
> on iret_interception() which would fix the problem at the cost of
> making NMIs less efficient (three exits instead of two).  We could
> emulate the IRET (doubling kvm's code and likely slower, and
> certainly buggier, than the first option).  Alternatively, can
> anyone think of a reliable way to make sure forward progress has
> been made?

Is there other negative impact of the RIP hack than NMI being delayed?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] KVM: SVM: check for progress after IRET interception
  2011-02-08 13:49     ` Marcelo Tosatti
@ 2011-02-08 14:05       ` Avi Kivity
  0 siblings, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2011-02-08 14:05 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, Jan Kiszka, Joerg Roedel

On 02/08/2011 03:49 PM, Marcelo Tosatti wrote:
> >  >Fix by checking for forward progress by recording and comparing the IRET's
> >  >rip.  This is somewhat of a hack, since an unchaging rip does not mean that
> >  >no forward progress has been made, but is the simplest fix for now.
> >  >
>
> Looks good.
>
> >  So what would be a better fix?  We could unconditionally single step
> >  on iret_interception() which would fix the problem at the cost of
> >  making NMIs less efficient (three exits instead of two).  We could
> >  emulate the IRET (doubling kvm's code and likely slower, and
> >  certainly buggier, than the first option).  Alternatively, can
> >  anyone think of a reliable way to make sure forward progress has
> >  been made?
>
> Is there other negative impact of the RIP hack than NMI being delayed?

Worst case scenario is that we always exit before the IRET for some 
reason, so the NMI is delayed forever.  It's incredibly unlikely though, 
I can't think of a sensible way to construct such a case (an insensible 
way: an IRET that returns to itself - on 64-bit, IRET pops RSP, so it's 
easy to arrange such a loop).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2011-02-08 14:06 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-03 15:02 [PATCH 0/2] SVM NMI fixes Avi Kivity
2011-02-03 15:02 ` [PATCH 1/2] KVM: Fix race between nmi injection and enabling nmi window Avi Kivity
2011-02-03 15:11   ` Jan Kiszka
2011-02-03 15:15     ` Avi Kivity
2011-02-03 15:02 ` [PATCH 2/2] KVM: SVM: check for progress after IRET interception Avi Kivity
2011-02-03 15:07   ` Avi Kivity
2011-02-03 15:21     ` Jan Kiszka
2011-02-03 15:30       ` Avi Kivity
2011-02-03 15:55         ` Jan Kiszka
2011-02-03 15:58           ` Avi Kivity
2011-02-03 16:14             ` Jan Kiszka
2011-02-03 16:20               ` Avi Kivity
2011-02-03 16:30                 ` Jan Kiszka
2011-02-08 13:49     ` Marcelo Tosatti
2011-02-08 14:05       ` Avi Kivity

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.