[PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
@ 2020-10-09 15:09 Andrew Cooper
  2020-10-13 15:58 ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2020-10-09 15:09 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian

At the time of XSA-170, the x86 instruction emulator really was broken, and
would allow arbitrary non-canonical values to be loaded into %rip.  This was
fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
targets".

However, in a demonstration that off-by-one errors really are one of the
hardest programming issues we face, everyone involved with XSA-170, myself
included, mistook the statement in the SDM which says:

  If the processor supports N < 64 linear-address bits, bits 63:N must be identical

to mean "must be canonical".  A real canonical check is bits 63:N-1.

VMEntries really do tolerate a not-quite-canonical %rip, specifically to cater
to the boundary condition at 0x0000800000000000.

Now that the emulator has been fixed, revert the XSA-170 change to fix
architectural behaviour at the boundary case.  The XTF test case for XSA-170
exercises this corner case, and still passes.

Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 34 +---------------------------------
 1 file changed, 1 insertion(+), 33 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 86b8916a5d..28d09c1ca0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3650,7 +3650,7 @@ static int vmx_handle_apic_write(void)
 void vmx_vmexit_handler(struct cpu_user_regs *regs)
 {
     unsigned long exit_qualification, exit_reason, idtv_info, intr_info = 0;
-    unsigned int vector = 0, mode;
+    unsigned int vector = 0;
     struct vcpu *v = current;
     struct domain *currd = v->domain;
 
@@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 out:
     if ( nestedhvm_vcpu_in_guestmode(v) )
         nvmx_idtv_handling();
-
-    /*
-     * VM entry will fail (causing the guest to get crashed) if rIP (and
-     * rFLAGS, but we don't have an issue there) doesn't meet certain
-     * criteria. As we must not allow less than fully privileged mode to have
-     * such an effect on the domain, we correct rIP in that case (accepting
-     * this not being architecturally correct behavior, as the injected #GP
-     * fault will then not see the correct [invalid] return address).
-     * And since we know the guest will crash, we crash it right away if it
-     * already is in most privileged mode.
-     */
-    mode = vmx_guest_x86_mode(v);
-    if ( mode == 8 ? !is_canonical_address(regs->rip)
-                   : regs->rip != regs->eip )
-    {
-        gprintk(XENLOG_WARNING, "Bad rIP %lx for mode %u\n", regs->rip, mode);
-
-        if ( vmx_get_cpl() )
-        {
-            __vmread(VM_ENTRY_INTR_INFO, &intr_info);
-            if ( !(intr_info & INTR_INFO_VALID_MASK) )
-                hvm_inject_hw_exception(TRAP_gp_fault, 0);
-            /* Need to fix rIP nevertheless. */
-            if ( mode == 8 )
-                regs->rip = (long)(regs->rip << (64 - VADDR_BITS)) >>
-                            (64 - VADDR_BITS);
-            else
-                regs->rip = regs->eip;
-        }
-        else
-            domain_crash(v->domain);
-    }
 }
 
 static void lbr_tsx_fixup(void)
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2020-10-09 15:09 [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest" Andrew Cooper
@ 2020-10-13 15:58 ` Jan Beulich
  2020-10-14 13:57   ` Andrew Cooper
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-10-13 15:58 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian

On 09.10.2020 17:09, Andrew Cooper wrote:
> At the time of XSA-170, the x86 instruction emulator really was broken, and
> would allow arbitrary non-canonical values to be loaded into %rip.  This was
> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
> targets".
> 
> However, in a demonstration that off-by-one errors really are one of the
> hardest programming issues we face, everyone involved with XSA-170, myself
> included, mistook the statement in the SDM which says:
> 
>   If the processor supports N < 64 linear-address bits, bits 63:N must be identical
> 
> to mean "must be canonical".  A real canonical check is bits 63:N-1.
> 
> VMEntries really do tolerate a not-quite-canonical %rip, specifically to cater
> to the boundary condition at 0x0000800000000000.
> 
> Now that the emulator has been fixed, revert the XSA-170 change to fix
> architectural behaviour at the boundary case.  The XTF test case for XSA-170
> exercises this corner case, and still passes.
> 
> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

But why revert the change rather than fix ...

> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>  out:
>      if ( nestedhvm_vcpu_in_guestmode(v) )
>          nvmx_idtv_handling();
> -
> -    /*
> -     * VM entry will fail (causing the guest to get crashed) if rIP (and
> -     * rFLAGS, but we don't have an issue there) doesn't meet certain
> -     * criteria. As we must not allow less than fully privileged mode to have
> -     * such an effect on the domain, we correct rIP in that case (accepting
> -     * this not being architecturally correct behavior, as the injected #GP
> -     * fault will then not see the correct [invalid] return address).
> -     * And since we know the guest will crash, we crash it right away if it
> -     * already is in most privileged mode.
> -     */
> -    mode = vmx_guest_x86_mode(v);
> -    if ( mode == 8 ? !is_canonical_address(regs->rip)

... the wrong use of is_canonical_address() here? By reverting
you open up avenues for XSAs in case we get things wrong elsewhere,
including ...

> -                   : regs->rip != regs->eip )

... for 32-bit guests.

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2020-10-13 15:58 ` Jan Beulich
@ 2020-10-14 13:57   ` Andrew Cooper
  2020-10-15  8:01     ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2020-10-14 13:57 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian

On 13/10/2020 16:58, Jan Beulich wrote:
> On 09.10.2020 17:09, Andrew Cooper wrote:
>> At the time of XSA-170, the x86 instruction emulator really was broken, and
>> would allow arbitrary non-canonical values to be loaded into %rip.  This was
>> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
>> targets".
>>
>> However, in a demonstration that off-by-one errors really are one of the
>> hardest programming issues we face, everyone involved with XSA-170, myself
>> included, mistook the statement in the SDM which says:
>>
>>   If the processor supports N < 64 linear-address bits, bits 63:N must be identical
>>
>> to mean "must be canonical".  A real canonical check is bits 63:N-1.
>>
>> VMEntries really do tolerate a not-quite-canonical %rip, specifically to cater
>> to the boundary condition at 0x0000800000000000.
>>
>> Now that the emulator has been fixed, revert the XSA-170 change to fix
>> architectural behaviour at the boundary case.  The XTF test case for XSA-170
>> exercises this corner case, and still passes.
>>
>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> But why revert the change rather than fix ...
>
>> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>  out:
>>      if ( nestedhvm_vcpu_in_guestmode(v) )
>>          nvmx_idtv_handling();
>> -
>> -    /*
>> -     * VM entry will fail (causing the guest to get crashed) if rIP (and
>> -     * rFLAGS, but we don't have an issue there) doesn't meet certain
>> -     * criteria. As we must not allow less than fully privileged mode to have
>> -     * such an effect on the domain, we correct rIP in that case (accepting
>> -     * this not being architecturally correct behavior, as the injected #GP
>> -     * fault will then not see the correct [invalid] return address).
>> -     * And since we know the guest will crash, we crash it right away if it
>> -     * already is in most privileged mode.
>> -     */
>> -    mode = vmx_guest_x86_mode(v);
>> -    if ( mode == 8 ? !is_canonical_address(regs->rip)
> ... the wrong use of is_canonical_address() here? By reverting
> you open up avenues for XSAs in case we get things wrong elsewhere,
> including ...
>
>> -                   : regs->rip != regs->eip )
> ... for 32-bit guests.

Because the only appropriate alternative would be ASSERT_UNREACHABLE()
and domain crash.

This logic corrupts guest state.

Running with corrupt state is every bit an XSA as hitting a VMEntry
failure if it can be triggered by userspace, but the latter safer and
much more obvious.

It was the appropriate security fix (give or take the functional bug in
it) at the time, given the complexity of retrofitting zero length
instruction fetches to the emulator.

However, it is one of a very long list of guest-state-induced VMEntry
failures, with non-trivial logic which we assert will pass, on a
fastpath, where hardware also performs the same checks and we already
have a runtime safe way of dealing with errors.  (Hence not actually
using ASSERT_UNREACHABLE() here.)

It isn't appropriate for this check to exist on its own (i.e. without
other guest state checks), and it isn't appropriate to live in a
fastpath.  In principle, some logic like this could live on the vmentry
failure path to try a second time, but then you're still creating an XSA
situation which is less obvious, and therefore isn't a clever move IMO.

~Andrew


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2020-10-14 13:57   ` Andrew Cooper
@ 2020-10-15  8:01     ` Jan Beulich
  2020-10-16 15:38       ` Andrew Cooper
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-10-15  8:01 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian

On 14.10.2020 15:57, Andrew Cooper wrote:
> On 13/10/2020 16:58, Jan Beulich wrote:
>> On 09.10.2020 17:09, Andrew Cooper wrote:
>>> At the time of XSA-170, the x86 instruction emulator really was broken, and
>>> would allow arbitrary non-canonical values to be loaded into %rip.  This was
>>> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
>>> targets".
>>>
>>> However, in a demonstration that off-by-one errors really are one of the
>>> hardest programming issues we face, everyone involved with XSA-170, myself
>>> included, mistook the statement in the SDM which says:
>>>
>>>   If the processor supports N < 64 linear-address bits, bits 63:N must be identical
>>>
>>> to mean "must be canonical".  A real canonical check is bits 63:N-1.
>>>
>>> VMEntries really do tolerate a not-quite-canonical %rip, specifically to cater
>>> to the boundary condition at 0x0000800000000000.
>>>
>>> Now that the emulator has been fixed, revert the XSA-170 change to fix
>>> architectural behaviour at the boundary case.  The XTF test case for XSA-170
>>> exercises this corner case, and still passes.
>>>
>>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> But why revert the change rather than fix ...
>>
>>> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>  out:
>>>      if ( nestedhvm_vcpu_in_guestmode(v) )
>>>          nvmx_idtv_handling();
>>> -
>>> -    /*
>>> -     * VM entry will fail (causing the guest to get crashed) if rIP (and
>>> -     * rFLAGS, but we don't have an issue there) doesn't meet certain
>>> -     * criteria. As we must not allow less than fully privileged mode to have
>>> -     * such an effect on the domain, we correct rIP in that case (accepting
>>> -     * this not being architecturally correct behavior, as the injected #GP
>>> -     * fault will then not see the correct [invalid] return address).
>>> -     * And since we know the guest will crash, we crash it right away if it
>>> -     * already is in most privileged mode.
>>> -     */
>>> -    mode = vmx_guest_x86_mode(v);
>>> -    if ( mode == 8 ? !is_canonical_address(regs->rip)
>> ... the wrong use of is_canonical_address() here? By reverting
>> you open up avenues for XSAs in case we get things wrong elsewhere,
>> including ...
>>
>>> -                   : regs->rip != regs->eip )
>> ... for 32-bit guests.
> 
> Because the only appropriate alternative would be ASSERT_UNREACHABLE()
> and domain crash.
> 
> This logic corrupts guest state.
> 
> Running with corrupt state is every bit an XSA as hitting a VMEntry
> failure if it can be triggered by userspace, but the latter safer and
> much more obvious.

I disagree. For CPL > 0 we don't "corrupt" guest state any more
than reporting a #GP fault when one is going to be reported
anyway (as long as the VM entry doesn't fail, and hence the
guest won't get crashed). IOW this raising of #GP actually is a
precautionary measure to _avoid_ XSAs.

Nor do I agree with the "much more obvious" aspect: A VM entry
failure requires quite a bit of analysis to recognize what has
caused it; whether a non-pseudo-canonical RIP is what catches your
eye right away is simply unknown. The gprintk() that you delete,
otoh, says very clearly what we have found to be wrong.

> It was the appropriate security fix (give or take the functional bug in
> it) at the time, given the complexity of retrofitting zero length
> instruction fetches to the emulator.
> 
> However, it is one of a very long list of guest-state-induced VMEntry
> failures, with non-trivial logic which we assert will pass, on a
> fastpath, where hardware also performs the same checks and we already
> have a runtime safe way of dealing with errors.  (Hence not actually
> using ASSERT_UNREACHABLE() here.)

"Runtime safe" as far as Xen is concerned, I take it. This isn't safe
for the guest at all, as vmx_failed_vmentry() results in an
unconditional domain_crash().

I certainly buy the fast path aspect of your comment, and if you were
moving the guest state adjustment into vmx_failed_vmentry(), I'd be
fine with the deletion here.

> It isn't appropriate for this check to exist on its own (i.e. without
> other guest state checks),

Well, if we run into cases where we get things wrong, more checks
and adjustments may want adding. Sadly each one of those has a fair
chance of needing an XSA.

As an aside, nvmx_n2_vmexit_handler()'s handling of
VMX_EXIT_REASONS_FAILED_VMENTRY looks pretty bogus - this is a flag,
not a separate exit reason. I guess I'll make a patch ...

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2020-10-15  8:01     ` Jan Beulich
@ 2020-10-16 15:38       ` Andrew Cooper
  2020-10-19  9:09         ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2020-10-16 15:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian

On 15/10/2020 09:01, Jan Beulich wrote:
> On 14.10.2020 15:57, Andrew Cooper wrote:
>> On 13/10/2020 16:58, Jan Beulich wrote:
>>> On 09.10.2020 17:09, Andrew Cooper wrote:
>>>> At the time of XSA-170, the x86 instruction emulator really was broken, and
>>>> would allow arbitrary non-canonical values to be loaded into %rip.  This was
>>>> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
>>>> targets".
>>>>
>>>> However, in a demonstration that off-by-one errors really are one of the
>>>> hardest programming issues we face, everyone involved with XSA-170, myself
>>>> included, mistook the statement in the SDM which says:
>>>>
>>>>   If the processor supports N < 64 linear-address bits, bits 63:N must be identical
>>>>
>>>> to mean "must be canonical".  A real canonical check is bits 63:N-1.
>>>>
>>>> VMEntries really do tolerate a not-quite-canonical %rip, specifically to cater
>>>> to the boundary condition at 0x0000800000000000.
>>>>
>>>> Now that the emulator has been fixed, revert the XSA-170 change to fix
>>>> architectural behaviour at the boundary case.  The XTF test case for XSA-170
>>>> exercises this corner case, and still passes.
>>>>
>>>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> But why revert the change rather than fix ...
>>>
>>>> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>>  out:
>>>>      if ( nestedhvm_vcpu_in_guestmode(v) )
>>>>          nvmx_idtv_handling();
>>>> -
>>>> -    /*
>>>> -     * VM entry will fail (causing the guest to get crashed) if rIP (and
>>>> -     * rFLAGS, but we don't have an issue there) doesn't meet certain
>>>> -     * criteria. As we must not allow less than fully privileged mode to have
>>>> -     * such an effect on the domain, we correct rIP in that case (accepting
>>>> -     * this not being architecturally correct behavior, as the injected #GP
>>>> -     * fault will then not see the correct [invalid] return address).
>>>> -     * And since we know the guest will crash, we crash it right away if it
>>>> -     * already is in most privileged mode.
>>>> -     */
>>>> -    mode = vmx_guest_x86_mode(v);
>>>> -    if ( mode == 8 ? !is_canonical_address(regs->rip)
>>> ... the wrong use of is_canonical_address() here? By reverting
>>> you open up avenues for XSAs in case we get things wrong elsewhere,
>>> including ...
>>>
>>>> -                   : regs->rip != regs->eip )
>>> ... for 32-bit guests.
>> Because the only appropriate alternative would be ASSERT_UNREACHABLE()
>> and domain crash.
>>
>> This logic corrupts guest state.
>>
>> Running with corrupt state is every bit an XSA as hitting a VMEntry
>> failure if it can be triggered by userspace, but the latter safer and
>> much more obvious.
> I disagree. For CPL > 0 we don't "corrupt" guest state any more
> than reporting a #GP fault when one is going to be reported
> anyway (as long as the VM entry doesn't fail, and hence the
> guest won't get crashed). IOW this raising of #GP actually is a
> precautionary measure to _avoid_ XSAs.

It does not remove any XSAs.  It merely hides them.

There are legal states where RIP is 0x0000800000000000 and #GP is the
wrong thing to do.  Any async VMExit (Processor Trace Prefetch in
particular), or with debug traps pending.

> Nor do I agree with the "much more obvious" aspect:

A domain crash is far more likely to be reported to xen-devel/security
than something which bodges state in an almost-silent way.

> A VM entry
> failure requires quite a bit of analysis to recognize what has
> caused it; whether a non-pseudo-canonical RIP is what catches your
> eye right away is simply unknown. The gprintk() that you delete,
> otoh, says very clearly what we have found to be wrong.

Non-canonical values are easier to spot than most of the other rules, IMO.

>> It was the appropriate security fix (give or take the functional bug in
>> it) at the time, given the complexity of retrofitting zero length
>> instruction fetches to the emulator.
>>
>> However, it is one of a very long list of guest-state-induced VMEntry
>> failures, with non-trivial logic which we assert will pass, on a
>> fastpath, where hardware also performs the same checks and we already
>> have a runtime safe way of dealing with errors.  (Hence not actually
>> using ASSERT_UNREACHABLE() here.)
> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe
> for the guest at all, as vmx_failed_vmentry() results in an
> unconditional domain_crash().

Any VMEntry failure is a bug in Xen.  If userspace can trigger it, it is
an XSA, *irrespective* of whether we crash the domain then and there, or
whether we let it try and limp on with corrupted state.

The different is purely in how obviously the bug manifests.

> I certainly buy the fast path aspect of your comment, and if you were
> moving the guest state adjustment into vmx_failed_vmentry(), I'd be
> fine with the deletion here.
>
>> It isn't appropriate for this check to exist on its own (i.e. without
>> other guest state checks),
> Well, if we run into cases where we get things wrong, more checks
> and adjustments may want adding. Sadly each one of those has a fair
> chance of needing an XSA.

We've had plenty of VMEntry failure XSAs, and we will no doubt have
plenty more in the future.  A non-canonical RIP is not special as far as
these go.


We absolutely should not be doing any fixup in debug builds.  I don't
see any security benefit for doing it in release builds, and an
important downside in terms of getting the bug noticed, and therefore fixed.


> As an aside, nvmx_n2_vmexit_handler()'s handling of
> VMX_EXIT_REASONS_FAILED_VMENTRY looks pretty bogus - this is a flag,
> not a separate exit reason. I guess I'll make a patch ...

This is far from the only problem.  I'm not intending to fix any issues
I find, until I can start getting some proper nested virt functionality
tests in place.

~Andrew


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2020-10-16 15:38       ` Andrew Cooper
@ 2020-10-19  9:09         ` Jan Beulich
  2020-10-19 16:12           ` Andrew Cooper
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-10-19  9:09 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian

On 16.10.2020 17:38, Andrew Cooper wrote:
> On 15/10/2020 09:01, Jan Beulich wrote:
>> On 14.10.2020 15:57, Andrew Cooper wrote:
>>> On 13/10/2020 16:58, Jan Beulich wrote:
>>>> On 09.10.2020 17:09, Andrew Cooper wrote:
>>>>> At the time of XSA-170, the x86 instruction emulator really was broken, and
>>>>> would allow arbitrary non-canonical values to be loaded into %rip.  This was
>>>>> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
>>>>> targets".
>>>>>
>>>>> However, in a demonstration that off-by-one errors really are one of the
>>>>> hardest programming issues we face, everyone involved with XSA-170, myself
>>>>> included, mistook the statement in the SDM which says:
>>>>>
>>>>>   If the processor supports N < 64 linear-address bits, bits 63:N must be identical
>>>>>
>>>>> to mean "must be canonical".  A real canonical check is bits 63:N-1.
>>>>>
>>>>> VMEntries really do tolerate a not-quite-canonical %rip, specifically to cater
>>>>> to the boundary condition at 0x0000800000000000.
>>>>>
>>>>> Now that the emulator has been fixed, revert the XSA-170 change to fix
>>>>> architectural behaviour at the boundary case.  The XTF test case for XSA-170
>>>>> exercises this corner case, and still passes.
>>>>>
>>>>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
>>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> But why revert the change rather than fix ...
>>>>
>>>>> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>>>  out:
>>>>>      if ( nestedhvm_vcpu_in_guestmode(v) )
>>>>>          nvmx_idtv_handling();
>>>>> -
>>>>> -    /*
>>>>> -     * VM entry will fail (causing the guest to get crashed) if rIP (and
>>>>> -     * rFLAGS, but we don't have an issue there) doesn't meet certain
>>>>> -     * criteria. As we must not allow less than fully privileged mode to have
>>>>> -     * such an effect on the domain, we correct rIP in that case (accepting
>>>>> -     * this not being architecturally correct behavior, as the injected #GP
>>>>> -     * fault will then not see the correct [invalid] return address).
>>>>> -     * And since we know the guest will crash, we crash it right away if it
>>>>> -     * already is in most privileged mode.
>>>>> -     */
>>>>> -    mode = vmx_guest_x86_mode(v);
>>>>> -    if ( mode == 8 ? !is_canonical_address(regs->rip)
>>>> ... the wrong use of is_canonical_address() here? By reverting
>>>> you open up avenues for XSAs in case we get things wrong elsewhere,
>>>> including ...
>>>>
>>>>> -                   : regs->rip != regs->eip )
>>>> ... for 32-bit guests.
>>> Because the only appropriate alternative would be ASSERT_UNREACHABLE()
>>> and domain crash.
>>>
>>> This logic corrupts guest state.
>>>
>>> Running with corrupt state is every bit an XSA as hitting a VMEntry
>>> failure if it can be triggered by userspace, but the latter safer and
>>> much more obvious.
>> I disagree. For CPL > 0 we don't "corrupt" guest state any more
>> than reporting a #GP fault when one is going to be reported
>> anyway (as long as the VM entry doesn't fail, and hence the
>> guest won't get crashed). IOW this raising of #GP actually is a
>> precautionary measure to _avoid_ XSAs.
> 
> It does not remove any XSAs.  It merely hides them.

How that? If we convert the ability of guest user mode to crash
the guest into deliver of #GP(0), how is there a hidden XSA then?

> There are legal states where RIP is 0x0000800000000000 and #GP is the
> wrong thing to do.  Any async VMExit (Processor Trace Prefetch in
> particular), or with debug traps pending.

You realize we're in agreement about this pseudo-canonical check
needing fixing?

>> Nor do I agree with the "much more obvious" aspect:
> 
> A domain crash is far more likely to be reported to xen-devel/security
> than something which bodges state in an almost-silent way.
> 
>> A VM entry
>> failure requires quite a bit of analysis to recognize what has
>> caused it; whether a non-pseudo-canonical RIP is what catches your
>> eye right away is simply unknown. The gprintk() that you delete,
>> otoh, says very clearly what we have found to be wrong.
> 
> Non-canonical values are easier to spot than most of the other rules, IMO.

Which will get less obvious with 5-level paging capable hardware
in mind.

>>> It was the appropriate security fix (give or take the functional bug in
>>> it) at the time, given the complexity of retrofitting zero length
>>> instruction fetches to the emulator.
>>>
>>> However, it is one of a very long list of guest-state-induced VMEntry
>>> failures, with non-trivial logic which we assert will pass, on a
>>> fastpath, where hardware also performs the same checks and we already
>>> have a runtime safe way of dealing with errors.  (Hence not actually
>>> using ASSERT_UNREACHABLE() here.)
>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe
>> for the guest at all, as vmx_failed_vmentry() results in an
>> unconditional domain_crash().
> 
> Any VMEntry failure is a bug in Xen.  If userspace can trigger it, it is
> an XSA, *irrespective* of whether we crash the domain then and there, or
> whether we let it try and limp on with corrupted state.

Allowing the guest to continue with corrupted state is not a
useful thing to do, I agree. However, what falls under
"corrupted" seems to be different for you and me. I'd not call
delivery of #GP "corruption" in any way. The primary goal ought
to be that we don't corrupt the guest kernel view of the world.
It may then have the opportunity to kill the offending user
mode process.

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2020-10-19  9:09         ` Jan Beulich
@ 2020-10-19 16:12           ` Andrew Cooper
  2020-10-20  8:09             ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2020-10-19 16:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian

On 19/10/2020 10:09, Jan Beulich wrote:
> On 16.10.2020 17:38, Andrew Cooper wrote:
>> On 15/10/2020 09:01, Jan Beulich wrote:
>>> On 14.10.2020 15:57, Andrew Cooper wrote:
>>>> On 13/10/2020 16:58, Jan Beulich wrote:
>>>>> On 09.10.2020 17:09, Andrew Cooper wrote:
>>>>>> At the time of XSA-170, the x86 instruction emulator really was broken, and
>>>>>> would allow arbitrary non-canonical values to be loaded into %rip.  This was
>>>>>> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
>>>>>> targets".
>>>>>>
>>>>>> However, in a demonstration that off-by-one errors really are one of the
>>>>>> hardest programming issues we face, everyone involved with XSA-170, myself
>>>>>> included, mistook the statement in the SDM which says:
>>>>>>
>>>>>>   If the processor supports N < 64 linear-address bits, bits 63:N must be identical
>>>>>>
>>>>>> to mean "must be canonical".  A real canonical check is bits 63:N-1.
>>>>>>
>>>>>> VMEntries really do tolerate a not-quite-canonical %rip, specifically to cater
>>>>>> to the boundary condition at 0x0000800000000000.
>>>>>>
>>>>>> Now that the emulator has been fixed, revert the XSA-170 change to fix
>>>>>> architectural behaviour at the boundary case.  The XTF test case for XSA-170
>>>>>> exercises this corner case, and still passes.
>>>>>>
>>>>>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
>>>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> But why revert the change rather than fix ...
>>>>>
>>>>>> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>>>>  out:
>>>>>>      if ( nestedhvm_vcpu_in_guestmode(v) )
>>>>>>          nvmx_idtv_handling();
>>>>>> -
>>>>>> -    /*
>>>>>> -     * VM entry will fail (causing the guest to get crashed) if rIP (and
>>>>>> -     * rFLAGS, but we don't have an issue there) doesn't meet certain
>>>>>> -     * criteria. As we must not allow less than fully privileged mode to have
>>>>>> -     * such an effect on the domain, we correct rIP in that case (accepting
>>>>>> -     * this not being architecturally correct behavior, as the injected #GP
>>>>>> -     * fault will then not see the correct [invalid] return address).
>>>>>> -     * And since we know the guest will crash, we crash it right away if it
>>>>>> -     * already is in most privileged mode.
>>>>>> -     */
>>>>>> -    mode = vmx_guest_x86_mode(v);
>>>>>> -    if ( mode == 8 ? !is_canonical_address(regs->rip)
>>>>> ... the wrong use of is_canonical_address() here? By reverting
>>>>> you open up avenues for XSAs in case we get things wrong elsewhere,
>>>>> including ...
>>>>>
>>>>>> -                   : regs->rip != regs->eip )
>>>>> ... for 32-bit guests.
>>>> Because the only appropriate alternative would be ASSERT_UNREACHABLE()
>>>> and domain crash.
>>>>
>>>> This logic corrupts guest state.
>>>>
>>>> Running with corrupt state is every bit an XSA as hitting a VMEntry
>>>> failure if it can be triggered by userspace, but the latter safer and
>>>> much more obvious.
>>> I disagree. For CPL > 0 we don't "corrupt" guest state any more
>>> than reporting a #GP fault when one is going to be reported
>>> anyway (as long as the VM entry doesn't fail, and hence the
>>> guest won't get crashed). IOW this raising of #GP actually is a
>>> precautionary measure to _avoid_ XSAs.
>> It does not remove any XSAs.  It merely hides them.
> How that? If we convert the ability of guest user mode to crash
> the guest into deliver of #GP(0), how is there a hidden XSA then?

Because userspace being able to triggering this fixup is still an XSA.

>> There are legal states where RIP is 0x0000800000000000 and #GP is the
>> wrong thing to do.  Any async VMExit (Processor Trace Prefetch in
>> particular), or with debug traps pending.
> You realize we're in agreement about this pseudo-canonical check
> needing fixing?

Anything other than deleting this clause does not fix the bugs above.

>>>> It was the appropriate security fix (give or take the functional bug in
>>>> it) at the time, given the complexity of retrofitting zero length
>>>> instruction fetches to the emulator.
>>>>
>>>> However, it is one of a very long list of guest-state-induced VMEntry
>>>> failures, with non-trivial logic which we assert will pass, on a
>>>> fastpath, where hardware also performs the same checks and we already
>>>> have a runtime safe way of dealing with errors.  (Hence not actually
>>>> using ASSERT_UNREACHABLE() here.)
>>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe
>>> for the guest at all, as vmx_failed_vmentry() results in an
>>> unconditional domain_crash().
>> Any VMEntry failure is a bug in Xen.  If userspace can trigger it, it is
>> an XSA, *irrespective* of whether we crash the domain then and there, or
>> whether we let it try and limp on with corrupted state.
> Allowing the guest to continue with corrupted state is not a
> useful thing to do, I agree. However, what falls under
> "corrupted" seems to be different for you and me. I'd not call
> delivery of #GP "corruption" in any way.

I can only repeat my previous statement:

> There are legal states where RIP is 0x0000800000000000 and #GP is the
> wrong thing to do.

Blindly raising #GP in is not always the right thing to do.

>  The primary goal ought
> to be that we don't corrupt the guest kernel view of the world.
> It may then have the opportunity to kill the offending user
> mode process.

By the time we have hit this case, all bets are off, because Xen *is*
malfunctioning.  We have no idea if kernel context is still intact.  You
don't even know that current user context is the correct offending
context to clobber, and might be creating a user=>user DoS vulnerability.

We definitely have an XSA to find and fix, and we can either make it
very obvious and likely to be reported, or hidden and liable to go
unnoticed for a long period of time.

Every rational argument is on the side of killing the domain in an
obvious way.

~Andrew


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2020-10-19 16:12           ` Andrew Cooper
@ 2020-10-20  8:09             ` Jan Beulich
  2020-10-23  6:14               ` Tian, Kevin
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-10-20  8:09 UTC (permalink / raw)
  To: Andrew Cooper, Jun Nakajima, Kevin Tian
  Cc: Xen-devel, Roger Pau Monné, Wei Liu

On 19.10.2020 18:12, Andrew Cooper wrote:
> On 19/10/2020 10:09, Jan Beulich wrote:
>> On 16.10.2020 17:38, Andrew Cooper wrote:
>>> On 15/10/2020 09:01, Jan Beulich wrote:
>>>> On 14.10.2020 15:57, Andrew Cooper wrote:
>>>>> Running with corrupt state is every bit an XSA as hitting a VMEntry
>>>>> failure if it can be triggered by userspace, but the latter safer and
>>>>> much more obvious.
>>>> I disagree. For CPL > 0 we don't "corrupt" guest state any more
>>>> than reporting a #GP fault when one is going to be reported
>>>> anyway (as long as the VM entry doesn't fail, and hence the
>>>> guest won't get crashed). IOW this raising of #GP actually is a
>>>> precautionary measure to _avoid_ XSAs.
>>> It does not remove any XSAs.  It merely hides them.
>> How that? If we convert the ability of guest user mode to crash
>> the guest into deliver of #GP(0), how is there a hidden XSA then?
> 
> Because userspace being able to triggering this fixup is still an XSA.

How do you know without a specific case at hand? It may well be
that all that's impacted is guest user space, in which case I
don't see why there would need to be an XSA. It's still a bug
then, sure.

>>>>> It was the appropriate security fix (give or take the functional bug in
>>>>> it) at the time, given the complexity of retrofitting zero length
>>>>> instruction fetches to the emulator.
>>>>>
>>>>> However, it is one of a very long list of guest-state-induced VMEntry
>>>>> failures, with non-trivial logic which we assert will pass, on a
>>>>> fastpath, where hardware also performs the same checks and we already
>>>>> have a runtime safe way of dealing with errors.  (Hence not actually
>>>>> using ASSERT_UNREACHABLE() here.)
>>>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe
>>>> for the guest at all, as vmx_failed_vmentry() results in an
>>>> unconditional domain_crash().
>>> Any VMEntry failure is a bug in Xen.  If userspace can trigger it, it is
>>> an XSA, *irrespective* of whether we crash the domain then and there, or
>>> whether we let it try and limp on with corrupted state.
>> Allowing the guest to continue with corrupted state is not a
>> useful thing to do, I agree. However, what falls under
>> "corrupted" seems to be different for you and me. I'd not call
>> delivery of #GP "corruption" in any way.
> 
> I can only repeat my previous statement:
> 
>> There are legal states where RIP is 0x0000800000000000 and #GP is the
>> wrong thing to do.
> 
> Blindly raising #GP in is not always the right thing to do.

Again - we're in agreement about "blindly". Let's be less blind.

>>  The primary goal ought
>> to be that we don't corrupt the guest kernel view of the world.
>> It may then have the opportunity to kill the offending user
>> mode process.
> 
> By the time we have hit this case, all bets are off, because Xen *is*
> malfunctioning.  We have no idea if kernel context is still intact.  You
> don't even know that current user context is the correct offending
> context to clobber, and might be creating a user=>user DoS vulnerability.
> 
> We definitely have an XSA to find and fix, and we can either make it
> very obvious and likely to be reported, or hidden and liable to go
> unnoticed for a long period of time.

Why would it go unnoticed when we log the incident? I very much
hope people inspect their logs at least every once in a while ...

And as per above - I disagree with your use of "definitely" here.
We have a bug to analyze and fix, yes. Whether it's an XSA-worthy
one isn't known ahead of time, unless we crash the guest in such
a case.

In any event I think it's about time that the VMX maintainers
voice their views here, as they're the ones to approve of
whichever change we end up with. Kevin, Jun?

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2020-10-20  8:09             ` Jan Beulich
@ 2020-10-23  6:14               ` Tian, Kevin
  0 siblings, 0 replies; 16+ messages in thread
From: Tian, Kevin @ 2020-10-23  6:14 UTC (permalink / raw)
  To: Jan Beulich, Cooper, Andrew, Nakajima, Jun
  Cc: Xen-devel, Roger Pau Monné, Wei Liu

> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, October 20, 2020 4:10 PM
> 
> On 19.10.2020 18:12, Andrew Cooper wrote:
> > On 19/10/2020 10:09, Jan Beulich wrote:
> >> On 16.10.2020 17:38, Andrew Cooper wrote:
> >>> On 15/10/2020 09:01, Jan Beulich wrote:
> >>>> On 14.10.2020 15:57, Andrew Cooper wrote:
> >>>>> Running with corrupt state is every bit an XSA as hitting a VMEntry
> >>>>> failure if it can be triggered by userspace, but the latter safer and
> >>>>> much more obvious.
> >>>> I disagree. For CPL > 0 we don't "corrupt" guest state any more
> >>>> than reporting a #GP fault when one is going to be reported
> >>>> anyway (as long as the VM entry doesn't fail, and hence the
> >>>> guest won't get crashed). IOW this raising of #GP actually is a
> >>>> precautionary measure to _avoid_ XSAs.
> >>> It does not remove any XSAs.  It merely hides them.
> >> How that? If we convert the ability of guest user mode to crash
> >> the guest into deliver of #GP(0), how is there a hidden XSA then?
> >
> > Because userspace being able to triggering this fixup is still an XSA.
> 
> How do you know without a specific case at hand? It may well be
> that all that's impacted is guest user space, in which case I
> don't see why there would need to be an XSA. It's still a bug
> then, sure.
> 
> >>>>> It was the appropriate security fix (give or take the functional bug in
> >>>>> it) at the time, given the complexity of retrofitting zero length
> >>>>> instruction fetches to the emulator.
> >>>>>
> >>>>> However, it is one of a very long list of guest-state-induced VMEntry
> >>>>> failures, with non-trivial logic which we assert will pass, on a
> >>>>> fastpath, where hardware also performs the same checks and we
> already
> >>>>> have a runtime safe way of dealing with errors.  (Hence not actually
> >>>>> using ASSERT_UNREACHABLE() here.)
> >>>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe
> >>>> for the guest at all, as vmx_failed_vmentry() results in an
> >>>> unconditional domain_crash().
> >>> Any VMEntry failure is a bug in Xen.  If userspace can trigger it, it is
> >>> an XSA, *irrespective* of whether we crash the domain then and there,
> or
> >>> whether we let it try and limp on with corrupted state.
> >> Allowing the guest to continue with corrupted state is not a
> >> useful thing to do, I agree. However, what falls under
> >> "corrupted" seems to be different for you and me. I'd not call
> >> delivery of #GP "corruption" in any way.
> >
> > I can only repeat my previous statement:
> >
> >> There are legal states where RIP is 0x0000800000000000 and #GP is the
> >> wrong thing to do.
> >
> > Blindly raising #GP in is not always the right thing to do.
> 
> Again - we're in agreement about "blindly". Let's be less blind.
> 
> >>  The primary goal ought
> >> to be that we don't corrupt the guest kernel view of the world.
> >> It may then have the opportunity to kill the offending user
> >> mode process.
> >
> > By the time we have hit this case, all bets are off, because Xen *is*
> > malfunctioning.  We have no idea if kernel context is still intact.  You
> > don't even know that current user context is the correct offending
> > context to clobber, and might be creating a user=>user DoS vulnerability.
> >
> > We definitely have an XSA to find and fix, and we can either make it
> > very obvious and likely to be reported, or hidden and liable to go
> > unnoticed for a long period of time.
> 
> Why would it go unnoticed when we log the incident? I very much
> hope people inspect their logs at least every once in a while ...
> 
> And as per above - I disagree with your use of "definitely" here.
> We have a bug to analyze and fix, yes. Whether it's an XSA-worthy
> one isn't known ahead of time, unless we crash the guest in such
> a case.
> 
> In any event I think it's about time that the VMX maintainers
> voice their views here, as they're the ones to approve of
> whichever change we end up with. Kevin, Jun?
> 

Honestly speaking both of your options make some sense. and
I don't think there is a perfect answer here. Personally I'm more
aligned with Jan's point on preventing guest user from crashing
the whole domain. But let's also hear from others' opinions as 
I believe this dilemma might be seen in other places too thus 
may need a general consensus in Xen community.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2023-04-05 21:52 Andrew Cooper
  2023-04-06  7:10 ` Jan Beulich
  2023-08-23 11:15 ` Roger Pau Monné
@ 2023-08-24  4:26 ` Tian, Kevin
  2 siblings, 0 replies; 16+ messages in thread
From: Tian, Kevin @ 2023-08-24  4:26 UTC (permalink / raw)
  To: andrew.cooper3, Xen-devel
  Cc: andrew.cooper3, Beulich, Jan, Pau Monné,
	Roger, Wei Liu, Nakajima, Jun

> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: Thursday, April 6, 2023 5:53 AM
> 
> At the time of XSA-170, the x86 instruction emulator was genuinely broken.
> It
> would load arbitrary values into %rip and putting a check here probably was
> the best stopgap security fix.  It should have been reverted following c/s
> 81d3a0b26c1 "x86emul: limit-check branch targets" which corrected the
> emulator
> behaviour.
> 
> However, everyone involved in XSA-170, myself included, failed to read the
> SDM
> correctly.  On the subject of %rip consistency checks, the SDM stated:
> 
>   If the processor supports N < 64 linear-address bits, bits 63:N must be
>   identical
> 
> A non-canonical %rip (and SSP more recently) is an explicitly legal state in
> x86, and the VMEntry consistency checks are intentionally off-by-one from a
> regular canonical check.
> 
> The consequence of this bug is that Xen will currently take a legal x86 state
> which would successfully VMEnter, and corrupt it into having non-
> architectural
> behaviour.
> 
> Furthermore, in the time this bugfix has been pending in public, I
> successfully persuaded Intel to clarify the SDM, adding the following
> clarification:
> 
>   The guest RIP value is not required to be canonical; the value of bit N-1
>   may differ from that of bit N.
> 
> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2023-08-23 13:31     ` Roger Pau Monné
@ 2023-08-23 14:09       ` Andrew Cooper
  0 siblings, 0 replies; 16+ messages in thread
From: Andrew Cooper @ 2023-08-23 14:09 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Xen-devel, Jan Beulich, Wei Liu, Jun Nakajima, Kevin Tian

On 23/08/2023 2:31 pm, Roger Pau Monné wrote:
> On Wed, Aug 23, 2023 at 12:56:48PM +0100, Andrew Cooper wrote:
>> On 23/08/2023 12:15 pm, Roger Pau Monné wrote:
>>> On Wed, Apr 05, 2023 at 10:52:45PM +0100, Andrew Cooper wrote:
>>>> At the time of XSA-170, the x86 instruction emulator was genuinely broken.  It
>>>> would load arbitrary values into %rip and putting a check here probably was
>>>> the best stopgap security fix.  It should have been reverted following c/s
>>>> 81d3a0b26c1 "x86emul: limit-check branch targets" which corrected the emulator
>>>> behaviour.
>>>>
>>>> However, everyone involved in XSA-170, myself included, failed to read the SDM
>>>> correctly.  On the subject of %rip consistency checks, the SDM stated:
>>>>
>>>>   If the processor supports N < 64 linear-address bits, bits 63:N must be
>>>>   identical
>>>>
>>>> A non-canonical %rip (and SSP more recently) is an explicitly legal state in
>>>> x86, and the VMEntry consistency checks are intentionally off-by-one from a
>>>> regular canonical check.
>>>>
>>>> The consequence of this bug is that Xen will currently take a legal x86 state
>>>> which would successfully VMEnter, and corrupt it into having non-architectural
>>>> behaviour.
>>>>
>>>> Furthermore, in the time this bugfix has been pending in public, I
>>>> successfully persuaded Intel to clarify the SDM, adding the following
>>>> clarification:
>>>>
>>>>   The guest RIP value is not required to be canonical; the value of bit N-1
>>>>   may differ from that of bit N.
>>>>
>>>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
>>> I think the fixes tag should likely be "x86emul: limit-check branch
>>> targets", since it's that commit that missed the revert done here?
>> Well, not really.  ffbbfda377 really does have a bug, irrespective of
>> the changes in the emulator.
>>
>> The presence of 81d3a0b26c1 is why this bugfix is a full revert of
>> ffbbfda377, and not just an off-by-1 adjustment.
> Right, but taking this patch without also having 81d3a0b26c1 will lead
> to a vulnerable system, hence why I think the dependency would better
> be on 81d3a0b26c1.
>
> Anyway, I don't think it's worth arguing over, so if you want to leave
> it as-is I won't object.

We don't really have a depends-on tag, or only-safe-with or whatever,
and a change like that is stretching the definition of Fixes IMO.

81d3a0b26c1 is more than 7 years old now, so there's not going to be a
practical problem backporting.  For everything else, I expect people to
read the commit message and apply common sense.

~Andrew


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2023-08-23 11:56   ` Andrew Cooper
@ 2023-08-23 13:31     ` Roger Pau Monné
  2023-08-23 14:09       ` Andrew Cooper
  0 siblings, 1 reply; 16+ messages in thread
From: Roger Pau Monné @ 2023-08-23 13:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Jan Beulich, Wei Liu, Jun Nakajima, Kevin Tian

On Wed, Aug 23, 2023 at 12:56:48PM +0100, Andrew Cooper wrote:
> On 23/08/2023 12:15 pm, Roger Pau Monné wrote:
> > On Wed, Apr 05, 2023 at 10:52:45PM +0100, Andrew Cooper wrote:
> >> At the time of XSA-170, the x86 instruction emulator was genuinely broken.  It
> >> would load arbitrary values into %rip and putting a check here probably was
> >> the best stopgap security fix.  It should have been reverted following c/s
> >> 81d3a0b26c1 "x86emul: limit-check branch targets" which corrected the emulator
> >> behaviour.
> >>
> >> However, everyone involved in XSA-170, myself included, failed to read the SDM
> >> correctly.  On the subject of %rip consistency checks, the SDM stated:
> >>
> >>   If the processor supports N < 64 linear-address bits, bits 63:N must be
> >>   identical
> >>
> >> A non-canonical %rip (and SSP more recently) is an explicitly legal state in
> >> x86, and the VMEntry consistency checks are intentionally off-by-one from a
> >> regular canonical check.
> >>
> >> The consequence of this bug is that Xen will currently take a legal x86 state
> >> which would successfully VMEnter, and corrupt it into having non-architectural
> >> behaviour.
> >>
> >> Furthermore, in the time this bugfix has been pending in public, I
> >> successfully persuaded Intel to clarify the SDM, adding the following
> >> clarification:
> >>
> >>   The guest RIP value is not required to be canonical; the value of bit N-1
> >>   may differ from that of bit N.
> >>
> >> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
> > I think the fixes tag should likely be "x86emul: limit-check branch
> > targets", since it's that commit that missed the revert done here?
> 
> Well, not really.  ffbbfda377 really does have a bug, irrespective of
> the changes in the emulator.
> 
> The presence of 81d3a0b26c1 is why this bugfix is a full revert of
> ffbbfda377, and not just an off-by-1 adjustment.

Right, but taking this patch without also having 81d3a0b26c1 will lead
to a vulnerable system, hence why I think the dependency would better
be on 81d3a0b26c1.

Anyway, I don't think it's worth arguing over, so if you want to leave
it as-is I won't object.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2023-08-23 11:15 ` Roger Pau Monné
@ 2023-08-23 11:56   ` Andrew Cooper
  2023-08-23 13:31     ` Roger Pau Monné
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2023-08-23 11:56 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Xen-devel, Jan Beulich, Wei Liu, Jun Nakajima, Kevin Tian

On 23/08/2023 12:15 pm, Roger Pau Monné wrote:
> On Wed, Apr 05, 2023 at 10:52:45PM +0100, Andrew Cooper wrote:
>> At the time of XSA-170, the x86 instruction emulator was genuinely broken.  It
>> would load arbitrary values into %rip and putting a check here probably was
>> the best stopgap security fix.  It should have been reverted following c/s
>> 81d3a0b26c1 "x86emul: limit-check branch targets" which corrected the emulator
>> behaviour.
>>
>> However, everyone involved in XSA-170, myself included, failed to read the SDM
>> correctly.  On the subject of %rip consistency checks, the SDM stated:
>>
>>   If the processor supports N < 64 linear-address bits, bits 63:N must be
>>   identical
>>
>> A non-canonical %rip (and SSP more recently) is an explicitly legal state in
>> x86, and the VMEntry consistency checks are intentionally off-by-one from a
>> regular canonical check.
>>
>> The consequence of this bug is that Xen will currently take a legal x86 state
>> which would successfully VMEnter, and corrupt it into having non-architectural
>> behaviour.
>>
>> Furthermore, in the time this bugfix has been pending in public, I
>> successfully persuaded Intel to clarify the SDM, adding the following
>> clarification:
>>
>>   The guest RIP value is not required to be canonical; the value of bit N-1
>>   may differ from that of bit N.
>>
>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
> I think the fixes tag should likely be "x86emul: limit-check branch
> targets", since it's that commit that missed the revert done here?

Well, not really.  ffbbfda377 really does have a bug, irrespective of
the changes in the emulator.

The presence of 81d3a0b26c1 is why this bugfix is a full revert of
ffbbfda377, and not just an off-by-1 adjustment.

>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

~Andrew


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2023-04-05 21:52 Andrew Cooper
  2023-04-06  7:10 ` Jan Beulich
@ 2023-08-23 11:15 ` Roger Pau Monné
  2023-08-23 11:56   ` Andrew Cooper
  2023-08-24  4:26 ` Tian, Kevin
  2 siblings, 1 reply; 16+ messages in thread
From: Roger Pau Monné @ 2023-08-23 11:15 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Jan Beulich, Wei Liu, Jun Nakajima, Kevin Tian

On Wed, Apr 05, 2023 at 10:52:45PM +0100, Andrew Cooper wrote:
> At the time of XSA-170, the x86 instruction emulator was genuinely broken.  It
> would load arbitrary values into %rip and putting a check here probably was
> the best stopgap security fix.  It should have been reverted following c/s
> 81d3a0b26c1 "x86emul: limit-check branch targets" which corrected the emulator
> behaviour.
> 
> However, everyone involved in XSA-170, myself included, failed to read the SDM
> correctly.  On the subject of %rip consistency checks, the SDM stated:
> 
>   If the processor supports N < 64 linear-address bits, bits 63:N must be
>   identical
> 
> A non-canonical %rip (and SSP more recently) is an explicitly legal state in
> x86, and the VMEntry consistency checks are intentionally off-by-one from a
> regular canonical check.
> 
> The consequence of this bug is that Xen will currently take a legal x86 state
> which would successfully VMEnter, and corrupt it into having non-architectural
> behaviour.
> 
> Furthermore, in the time this bugfix has been pending in public, I
> successfully persuaded Intel to clarify the SDM, adding the following
> clarification:
> 
>   The guest RIP value is not required to be canonical; the value of bit N-1
>   may differ from that of bit N.
> 
> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")

I think the fixes tag should likely be "x86emul: limit-check branch
targets", since it's that commit that missed the revert done here?

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
  2023-04-05 21:52 Andrew Cooper
@ 2023-04-06  7:10 ` Jan Beulich
  2023-08-23 11:15 ` Roger Pau Monné
  2023-08-24  4:26 ` Tian, Kevin
  2 siblings, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2023-04-06  7:10 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Roger Pau Monné, Wei Liu, Jun Nakajima, Kevin Tian, Xen-devel

On 05.04.2023 23:52, Andrew Cooper wrote:
> At the time of XSA-170, the x86 instruction emulator was genuinely broken.  It
> would load arbitrary values into %rip and putting a check here probably was
> the best stopgap security fix.  It should have been reverted following c/s
> 81d3a0b26c1 "x86emul: limit-check branch targets" which corrected the emulator
> behaviour.
> 
> However, everyone involved in XSA-170, myself included, failed to read the SDM
> correctly.  On the subject of %rip consistency checks, the SDM stated:
> 
>   If the processor supports N < 64 linear-address bits, bits 63:N must be
>   identical
> 
> A non-canonical %rip (and SSP more recently) is an explicitly legal state in
> x86, and the VMEntry consistency checks are intentionally off-by-one from a
> regular canonical check.
> 
> The consequence of this bug is that Xen will currently take a legal x86 state
> which would successfully VMEnter, and corrupt it into having non-architectural
> behaviour.
> 
> Furthermore, in the time this bugfix has been pending in public, I
> successfully persuaded Intel to clarify the SDM, adding the following
> clarification:
> 
>   The guest RIP value is not required to be canonical; the value of bit N-1
>   may differ from that of bit N.
> 
> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

I am kind of okay with such a full revert, but I'd consider it quite helpful
if the description made clear why the alternative of instead doing the spec
mandated check isn't necessary / useful. The emulator having gained respective
checking is only part of the reason for this, aiui. Plus bugs may be
introduced into the emulator again, where the checking here could be a guard
against needing to issue an XSA in such a case.

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
@ 2023-04-05 21:52 Andrew Cooper
  2023-04-06  7:10 ` Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Andrew Cooper @ 2023-04-05 21:52 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian

At the time of XSA-170, the x86 instruction emulator was genuinely broken.  It
would load arbitrary values into %rip and putting a check here probably was
the best stopgap security fix.  It should have been reverted following c/s
81d3a0b26c1 "x86emul: limit-check branch targets" which corrected the emulator
behaviour.

However, everyone involved in XSA-170, myself included, failed to read the SDM
correctly.  On the subject of %rip consistency checks, the SDM stated:

  If the processor supports N < 64 linear-address bits, bits 63:N must be
  identical

A non-canonical %rip (and SSP more recently) is an explicitly legal state in
x86, and the VMEntry consistency checks are intentionally off-by-one from a
regular canonical check.

The consequence of this bug is that Xen will currently take a legal x86 state
which would successfully VMEnter, and corrupt it into having non-architectural
behaviour.

Furthermore, in the time this bugfix has been pending in public, I
successfully persuaded Intel to clarify the SDM, adding the following
clarification:

  The guest RIP value is not required to be canonical; the value of bit N-1
  may differ from that of bit N.

Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>

Rewrite the commit message, but no change to the fix.

Fun fact... LAM almost required this to be reverted in order to support, but
the penultimate draft of the spec backed down and made LAM only apply to data
accesses, not instruction fetches.
---
 xen/arch/x86/hvm/vmx/vmx.c | 34 +---------------------------------
 1 file changed, 1 insertion(+), 33 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index bfc9693f7e43..79ff59b11c6b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -4035,7 +4035,7 @@ static void undo_nmis_unblocked_by_iret(void)
 void vmx_vmexit_handler(struct cpu_user_regs *regs)
 {
     unsigned long exit_qualification, exit_reason, idtv_info, intr_info = 0;
-    unsigned int vector = 0, mode;
+    unsigned int vector = 0;
     struct vcpu *v = current;
     struct domain *currd = v->domain;
 
@@ -4730,38 +4730,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 out:
     if ( nestedhvm_vcpu_in_guestmode(v) )
         nvmx_idtv_handling();
-
-    /*
-     * VM entry will fail (causing the guest to get crashed) if rIP (and
-     * rFLAGS, but we don't have an issue there) doesn't meet certain
-     * criteria. As we must not allow less than fully privileged mode to have
-     * such an effect on the domain, we correct rIP in that case (accepting
-     * this not being architecturally correct behavior, as the injected #GP
-     * fault will then not see the correct [invalid] return address).
-     * And since we know the guest will crash, we crash it right away if it
-     * already is in most privileged mode.
-     */
-    mode = vmx_guest_x86_mode(v);
-    if ( mode == 8 ? !is_canonical_address(regs->rip)
-                   : regs->rip != regs->eip )
-    {
-        gprintk(XENLOG_WARNING, "Bad rIP %lx for mode %u\n", regs->rip, mode);
-
-        if ( vmx_get_cpl() )
-        {
-            __vmread(VM_ENTRY_INTR_INFO, &intr_info);
-            if ( !(intr_info & INTR_INFO_VALID_MASK) )
-                hvm_inject_hw_exception(X86_EXC_GP, 0);
-            /* Need to fix rIP nevertheless. */
-            if ( mode == 8 )
-                regs->rip = (long)(regs->rip << (64 - VADDR_BITS)) >>
-                            (64 - VADDR_BITS);
-            else
-                regs->rip = regs->eip;
-        }
-        else
-            domain_crash(v->domain);
-    }
 }
 
 static void lbr_tsx_fixup(void)
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-08-24  4:26 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-09 15:09 [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest" Andrew Cooper
2020-10-13 15:58 ` Jan Beulich
2020-10-14 13:57   ` Andrew Cooper
2020-10-15  8:01     ` Jan Beulich
2020-10-16 15:38       ` Andrew Cooper
2020-10-19  9:09         ` Jan Beulich
2020-10-19 16:12           ` Andrew Cooper
2020-10-20  8:09             ` Jan Beulich
2020-10-23  6:14               ` Tian, Kevin
2023-04-05 21:52 Andrew Cooper
2023-04-06  7:10 ` Jan Beulich
2023-08-23 11:15 ` Roger Pau Monné
2023-08-23 11:56   ` Andrew Cooper
2023-08-23 13:31     ` Roger Pau Monné
2023-08-23 14:09       ` Andrew Cooper
2023-08-24  4:26 ` Tian, Kevin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).