* [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
@ 2014-11-25 10:08 Andrew Cooper
  2014-11-25 10:42 ` Jan Beulich
  2014-11-25 10:47 ` Tim Deegan
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Cooper @ 2014-11-25 10:08 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Keir Fraser, Jan Beulich

A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS state.
As a result, injecting a fault and retrying the vmentry is likely to fail
in the same way.

While crashing a guest because userspace tickled a hypervisor bug to set up
invalid VMCS state is bad (and usually warrants an XSA), it is better than the
infinite loop caused by this change, and the non-ratelimited console output it
would cause.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

---

Konrad: A hypervisor infinite loop is quite bad, so I am requesting a release
ack for this.  An alternative would be to revert 28b4baacd5 wholesale, but
most of it is good.

One other alternative, which I would pursue if we were not already in -rc2,
would be to add some extra logic to detect repeated vmentry failure and allow
one attempt to shoot userspace before giving up and crashing the domain.
---
 xen/arch/x86/hvm/vmx/vmx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2907afa..9b98595 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2520,7 +2520,7 @@ static void vmx_failed_vmentry(unsigned int exit_reason,
     vmcs_dump_vcpu(curr);
     printk("**************************************\n");
 
-    vmx_crash_or_fault(curr);
+    domain_crash(curr->domain);
 }
 
 void vmx_enter_realmode(struct cpu_user_regs *regs)
-- 
1.7.10.4


* Re: [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
  2014-11-25 10:08 [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5 Andrew Cooper
@ 2014-11-25 10:42 ` Jan Beulich
  2014-11-25 10:46   ` Andrew Cooper
  2014-11-25 10:47 ` Tim Deegan
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2014-11-25 10:42 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Keir Fraser, Xen-devel

>>> On 25.11.14 at 11:08, <andrew.cooper3@citrix.com> wrote:
> A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS state.
> As a result, injecting a fault and retrying the the vmentry is likely to 
> fail
> in the same way.

That's not all that unlikely - remember that the change was prompted
by the XSA-110 fix. There, CS pieces being in a bad state would get
corrected by the exception injection.

> One other alternative, which I would pursue if we were not already in -rc2
> would be to add some extra logic to detect repeated vmentry failure and 
> allow
> one attempt to shoot userspace before giving up and crashing the domain.

That's not even needed afaict (and if it really is, it can't be all that
difficult/intrusive): Did you observe what you attempt to fix here in
practice, or is this just from theoretical considerations? I ask because
I don't think it can actually happen, as the second time we get here
the guest ought to be in kernel mode (due to the exception injection)
and hence would get crashed anyway.

Jan


* Re: [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
  2014-11-25 10:42 ` Jan Beulich
@ 2014-11-25 10:46   ` Andrew Cooper
  2014-11-25 10:58     ` Andrew Cooper
  2014-11-25 11:03     ` Jan Beulich
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Cooper @ 2014-11-25 10:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir Fraser, Xen-devel

On 25/11/14 10:42, Jan Beulich wrote:
>>>> On 25.11.14 at 11:08, <andrew.cooper3@citrix.com> wrote:
>> A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS state.
>> As a result, injecting a fault and retrying the the vmentry is likely to 
>> fail
>> in the same way.
> That's not all that unlikely - remember that the change was prompted
> by the XSA-110 fix. There CS pieces being in a bad state would get
> corrected by the exception injection.
>
>> One other alternative, which I would pursue if we were not already in -rc2
>> would be to add some extra logic to detect repeated vmentry failure and 
>> allow
>> one attempt to shoot userspace before giving up and crashing the domain.
> That's not even needed afaict (and if it really is, it can't be all that
> difficult/intrusive): Did you observe what you attempt to fix here in
> practice, or is this just from theoretical considerations? I ask because
> I don't think it can actually happen, as the second time we get here
> the guest ought to be in kernel mode (due to the exception injection)
> and hence would get crashed anyway.

Only from theoretical considerations.  A bad CS (and possibly SS) would
be fixed by this, but there are many others which wouldn't.

~Andrew


* Re: [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
  2014-11-25 10:08 [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5 Andrew Cooper
  2014-11-25 10:42 ` Jan Beulich
@ 2014-11-25 10:47 ` Tim Deegan
  1 sibling, 0 replies; 9+ messages in thread
From: Tim Deegan @ 2014-11-25 10:47 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Keir Fraser, Jan Beulich, Xen-devel

At 10:08 +0000 on 25 Nov (1416906538), Andrew Cooper wrote:
> A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS state.
> As a result, injecting a fault and retrying the the vmentry is likely to fail
> in the same way.

In particular, the guest's privilege level won't change until _after_
the next vm entry has succeeded.

> While crashing a guest because userspace tickled a hypervisor bug to get up
> invalid VMCS state is bad (and usually warrants an XSA), it is better than the
> infinite loop caused by this change, and the non-ratelimited console output it
> would cause.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Tim Deegan <tim@xen.org>


* Re: [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
  2014-11-25 10:46   ` Andrew Cooper
@ 2014-11-25 10:58     ` Andrew Cooper
  2014-11-25 11:31       ` Jan Beulich
  2014-11-25 11:03     ` Jan Beulich
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2014-11-25 10:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir Fraser, Xen-devel

On 25/11/14 10:46, Andrew Cooper wrote:
> On 25/11/14 10:42, Jan Beulich wrote:
>>>>> On 25.11.14 at 11:08, <andrew.cooper3@citrix.com> wrote:
>>> A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS state.
>>> As a result, injecting a fault and retrying the the vmentry is likely to 
>>> fail
>>> in the same way.
>> That's not all that unlikely - remember that the change was prompted
>> by the XSA-110 fix. There CS pieces being in a bad state would get
>> corrected by the exception injection.
>>
>>> One other alternative, which I would pursue if we were not already in -rc2
>>> would be to add some extra logic to detect repeated vmentry failure and 
>>> allow
>>> one attempt to shoot userspace before giving up and crashing the domain.
>> That's not even needed afaict (and if it really is, it can't be all that
>> difficult/intrusive): Did you observe what you attempt to fix here in
>> practice, or is this just from theoretical considerations? I ask because
>> I don't think it can actually happen, as the second time we get here
>> the guest ought to be in kernel mode (due to the exception injection)
>> and hence would get crashed anyway.
> Only from theoretical considerations.  A bad CS (and possibly SS) would
> be fixed by this, but there are many others which wouldn't

Actually, as Tim correctly points out, a bad CS/SS won't be fixed by
this without emulating the event injection.  Per the XSA-106 followup,
we only ever emulate enough of event injection to cover the dpl checks
on software events for older generation SVM.  We never actually emulate
the context switch itself.

~Andrew
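
[Editorial sketch, not part of the original message.] The argument above can
be illustrated with a toy model. It assumes that a corrupt VMCS makes every
vmentry fail, and that an injected event is only delivered (and the privilege
level only drops to 0) once a vmentry succeeds. All names below (toy_vcpu,
toy_vmentry, toy_retry_loop) are hypothetical and are not Xen code.

```c
#include <stdbool.h>

/* Toy stand-in for a vcpu; not Xen's struct vcpu. */
struct toy_vcpu {
    int cpl;            /* current privilege level: 0 = kernel, 3 = user */
    bool vmcs_corrupt;  /* assumption: corrupt state persists across retries */
};

enum toy_result { TOY_RESUMED, TOY_CRASHED, TOY_STILL_LOOPING };

/*
 * Model of a vmentry.  A pending event is only delivered, and the CPL only
 * changes, if the entry itself succeeds.
 */
static bool toy_vmentry(struct toy_vcpu *v, bool event_pending)
{
    if (v->vmcs_corrupt)
        return false;
    if (event_pending)
        v->cpl = 0;
    return true;
}

/*
 * Policy being modelled: on a failed vmentry, crash if the guest is in
 * kernel mode, otherwise inject a fault and retry.  max_attempts only
 * bounds the simulation; nothing bounds the loop in the policy itself.
 */
static enum toy_result toy_retry_loop(struct toy_vcpu *v, int max_attempts)
{
    bool event_pending = false;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (toy_vmentry(v, event_pending))
            return TOY_RESUMED;
        if (v->cpl == 0)
            return TOY_CRASHED;
        event_pending = true;   /* inject a fault and try again */
    }
    return TOY_STILL_LOOPING;
}
```

Under these assumptions a guest that was in user mode when the entry first
failed never reaches the "kernel mode" branch, which is the loop the patch
is concerned about.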


* Re: [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
  2014-11-25 10:46   ` Andrew Cooper
  2014-11-25 10:58     ` Andrew Cooper
@ 2014-11-25 11:03     ` Jan Beulich
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2014-11-25 11:03 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Keir Fraser, Xen-devel

>>> On 25.11.14 at 11:46, <andrew.cooper3@citrix.com> wrote:
> On 25/11/14 10:42, Jan Beulich wrote:
>>>>> On 25.11.14 at 11:08, <andrew.cooper3@citrix.com> wrote:
>>> A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
> state.
>>> As a result, injecting a fault and retrying the the vmentry is likely to 
>>> fail
>>> in the same way.
>> That's not all that unlikely - remember that the change was prompted
>> by the XSA-110 fix. There CS pieces being in a bad state would get
>> corrected by the exception injection.
>>
>>> One other alternative, which I would pursue if we were not already in -rc2
>>> would be to add some extra logic to detect repeated vmentry failure and 
>>> allow
>>> one attempt to shoot userspace before giving up and crashing the domain.
>> That's not even needed afaict (and if it really is, it can't be all that
>> difficult/intrusive): Did you observe what you attempt to fix here in
>> practice, or is this just from theoretical considerations? I ask because
>> I don't think it can actually happen, as the second time we get here
>> the guest ought to be in kernel mode (due to the exception injection)
>> and hence would get crashed anyway.
> 
> Only from theoretical considerations.  A bad CS (and possibly SS) would
> be fixed by this, but there are many others which wouldn't

But that doesn't eliminate the fact that in the second pass we'd find
the guest in kernel mode, and hence crash it. Yet your reply sounds
as if you still think your patch is needed.

Jan


* Re: [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
  2014-11-25 10:58     ` Andrew Cooper
@ 2014-11-25 11:31       ` Jan Beulich
  2014-11-25 12:10         ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2014-11-25 11:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Keir Fraser, Xen-devel

>>> On 25.11.14 at 11:58, <andrew.cooper3@citrix.com> wrote:
> On 25/11/14 10:46, Andrew Cooper wrote:
>> On 25/11/14 10:42, Jan Beulich wrote:
>>>>>> On 25.11.14 at 11:08, <andrew.cooper3@citrix.com> wrote:
>>>> A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
> state.
>>>> As a result, injecting a fault and retrying the the vmentry is likely to 
>>>> fail
>>>> in the same way.
>>> That's not all that unlikely - remember that the change was prompted
>>> by the XSA-110 fix. There CS pieces being in a bad state would get
>>> corrected by the exception injection.
>>>
>>>> One other alternative, which I would pursue if we were not already in -rc2
>>>> would be to add some extra logic to detect repeated vmentry failure and 
>>>> allow
>>>> one attempt to shoot userspace before giving up and crashing the domain.
>>> That's not even needed afaict (and if it really is, it can't be all that
>>> difficult/intrusive): Did you observe what you attempt to fix here in
>>> practice, or is this just from theoretical considerations? I ask because
>>> I don't think it can actually happen, as the second time we get here
>>> the guest ought to be in kernel mode (due to the exception injection)
>>> and hence would get crashed anyway.
>> Only from theoretical considerations.  A bad CS (and possibly SS) would
>> be fixed by this, but there are many others which wouldn't
> 
> Actually, as Tim correctly points out, a bad CS/SS won't be fixed by
> this without emulating the event injection.  Per the XSA-106 followup,
> we only ever emulate enough of event injection to cover the dpl checks
> on software events for older generation SVM.  We never actually emulate
> the context switch itself.

Which suggests that rather than doing the partial revert as you
propose we might better extend the check to become "kernel mode
or event injection pending".

Jan
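
[Editorial sketch, not part of the original message.] One possible reading of
the "kernel mode or event injection pending" check, written as a
self-contained decision function. The struct, enum and function names are
hypothetical; the thread does not show how vmx_crash_or_fault() is actually
implemented, so this only illustrates the shape of the proposed condition.

```c
#include <stdbool.h>

enum outcome { OUTCOME_INJECT_FAULT, OUTCOME_CRASH_DOMAIN };

/* Hypothetical summary of the guest state at the time of the failure. */
struct guest_state {
    bool kernel_mode;    /* guest was executing at CPL 0 */
    bool event_pending;  /* an event injection is already queued */
};

/*
 * Crash if the guest is in kernel mode, or if an event is already pending
 * (which would mean an earlier injection did not get the guest running).
 * Otherwise give userspace one fault.
 */
static enum outcome crash_or_fault_extended(const struct guest_state *g)
{
    if (g->kernel_mode || g->event_pending)
        return OUTCOME_CRASH_DOMAIN;
    return OUTCOME_INJECT_FAULT;
}
```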


* Re: [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
  2014-11-25 11:31       ` Jan Beulich
@ 2014-11-25 12:10         ` Andrew Cooper
  2014-11-25 12:24           ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2014-11-25 12:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir Fraser, Xen-devel

On 25/11/14 11:31, Jan Beulich wrote:
>>>> On 25.11.14 at 11:58, <andrew.cooper3@citrix.com> wrote:
>> On 25/11/14 10:46, Andrew Cooper wrote:
>>> On 25/11/14 10:42, Jan Beulich wrote:
>>>>>>> On 25.11.14 at 11:08, <andrew.cooper3@citrix.com> wrote:
>>>>> A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
>> state.
>>>>> As a result, injecting a fault and retrying the the vmentry is likely to 
>>>>> fail
>>>>> in the same way.
>>>> That's not all that unlikely - remember that the change was prompted
>>>> by the XSA-110 fix. There CS pieces being in a bad state would get
>>>> corrected by the exception injection.
>>>>
>>>>> One other alternative, which I would pursue if we were not already in -rc2
>>>>> would be to add some extra logic to detect repeated vmentry failure and 
>>>>> allow
>>>>> one attempt to shoot userspace before giving up and crashing the domain.
>>>> That's not even needed afaict (and if it really is, it can't be all that
>>>> difficult/intrusive): Did you observe what you attempt to fix here in
>>>> practice, or is this just from theoretical considerations? I ask because
>>>> I don't think it can actually happen, as the second time we get here
>>>> the guest ought to be in kernel mode (due to the exception injection)
>>>> and hence would get crashed anyway.
>>> Only from theoretical considerations.  A bad CS (and possibly SS) would
>>> be fixed by this, but there are many others which wouldn't
>> Actually, as Tim correctly points out, a bad CS/SS won't be fixed by
>> this without emulating the event injection.  Per the XSA-106 followup,
>> we only ever emulate enough of event injection to cover the dpl checks
>> on software events for older generation SVM.  We never actually emulate
>> the context switch itself.
> Which suggests that rather than doing the partial revert as you
> propose we might better extend the check to become "kernel mode
> or event injection pending".

At that point, it is safer just to unconditionally crash on a repeated
vmentry failure, rather than gain a list of conditions which we hope
won't leave us spinning in a loop.

~Andrew


* Re: [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5
  2014-11-25 12:10         ` Andrew Cooper
@ 2014-11-25 12:24           ` Jan Beulich
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2014-11-25 12:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Keir Fraser, Xen-devel

>>> On 25.11.14 at 13:10, <andrew.cooper3@citrix.com> wrote:
> On 25/11/14 11:31, Jan Beulich wrote:
>>>>> On 25.11.14 at 11:58, <andrew.cooper3@citrix.com> wrote:
>>> On 25/11/14 10:46, Andrew Cooper wrote:
>>>> On 25/11/14 10:42, Jan Beulich wrote:
>>>>>>>> On 25.11.14 at 11:08, <andrew.cooper3@citrix.com> wrote:
>>>>>> A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
>>> state.
>>>>>> As a result, injecting a fault and retrying the the vmentry is likely to 
>>>>>> fail
>>>>>> in the same way.
>>>>> That's not all that unlikely - remember that the change was prompted
>>>>> by the XSA-110 fix. There CS pieces being in a bad state would get
>>>>> corrected by the exception injection.
>>>>>
>>>>>> One other alternative, which I would pursue if we were not already in -rc2
>>>>>> would be to add some extra logic to detect repeated vmentry failure and 
>>>>>> allow
>>>>>> one attempt to shoot userspace before giving up and crashing the domain.
>>>>> That's not even needed afaict (and if it really is, it can't be all that
>>>>> difficult/intrusive): Did you observe what you attempt to fix here in
>>>>> practice, or is this just from theoretical considerations? I ask because
>>>>> I don't think it can actually happen, as the second time we get here
>>>>> the guest ought to be in kernel mode (due to the exception injection)
>>>>> and hence would get crashed anyway.
>>>> Only from theoretical considerations.  A bad CS (and possibly SS) would
>>>> be fixed by this, but there are many others which wouldn't
>>> Actually, as Tim correctly points out, a bad CS/SS won't be fixed by
>>> this without emulating the event injection.  Per the XSA-106 followup,
>>> we only ever emulate enough of event injection to cover the dpl checks
>>> on software events for older generation SVM.  We never actually emulate
>>> the context switch itself.
>> Which suggests that rather than doing the partial revert as you
>> propose we might better extend the check to become "kernel mode
>> or event injection pending".
> 
> At that point, it is safer just to unconditionally crash on a repeated
> vmentry failure, rather than gain a list of conditions which we hope
> wont leave us spinning in a loop.

Crashing on _repeated_ VM entry failure is certainly fine with me.

Jan
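
[Editorial sketch, not part of the original message.] A minimal illustration
of "crash on repeated vmentry failure", assuming a per-vcpu flag that records
an earlier failure and is cleared once an entry succeeds. The flag, the types
and the function names are hypothetical; no such patch appears in this
thread.

```c
#include <stdbool.h>

enum action { ACTION_INJECT_FAULT, ACTION_CRASH_DOMAIN };

/* Hypothetical per-vcpu bookkeeping; not a real Xen structure. */
struct vcpu_model {
    bool in_kernel_mode;       /* guest CPL was 0 when the entry failed */
    bool entry_failed_before;  /* a previous vmentry already failed */
};

/*
 * First failure from user mode: inject a fault and remember the failure.
 * Any failure in kernel mode, or any second consecutive failure: crash.
 */
static enum action on_failed_vmentry(struct vcpu_model *v)
{
    if (v->in_kernel_mode || v->entry_failed_before)
        return ACTION_CRASH_DOMAIN;

    v->entry_failed_before = true;
    return ACTION_INJECT_FAULT;
}

/* A successful entry ends the streak, so a later failure gets one retry. */
static void on_successful_vmentry(struct vcpu_model *v)
{
    v->entry_failed_before = false;
}
```

The point of keeping a single flag rather than a list of conditions is the
one made above: the retry count is bounded at one regardless of which part
of the guest state is bad.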


