From: Jan Beulich <jbeulich@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Xen-devel <xen-devel@lists.xenproject.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>, "Wei Liu" <wl@xen.org>,
	"Jun Nakajima" <jun.nakajima@intel.com>,
	"Kevin Tian" <kevin.tian@intel.com>
Subject: Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
Date: Thu, 15 Oct 2020 10:01:28 +0200
Message-ID: <1786f728-15c2-3877-c01a-035b11bd8504@suse.com>
In-Reply-To: <01bb2f27-4e0b-3637-e456-09eb7b9b233e@citrix.com>

On 14.10.2020 15:57, Andrew Cooper wrote:
> On 13/10/2020 16:58, Jan Beulich wrote:
>> On 09.10.2020 17:09, Andrew Cooper wrote:
>>> At the time of XSA-170, the x86 instruction emulator really was broken, and
>>> would allow arbitrary non-canonical values to be loaded into %rip.  This was
>>> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
>>> targets".
>>>
>>> However, in a demonstration that off-by-one errors really are one of the
>>> hardest programming issues we face, everyone involved with XSA-170, myself
>>> included, mistook the statement in the SDM which says:
>>>
>>>   If the processor supports N < 64 linear-address bits, bits 63:N must be identical
>>>
>>> to mean "must be canonical".  A real canonical check is bits 63:N-1.
>>>
>>> VMEntries really do tolerate a not-quite-canonical %rip, specifically to cater
>>> to the boundary condition at 0x0000800000000000.
>>>
>>> Now that the emulator has been fixed, revert the XSA-170 change to fix
>>> architectural behaviour at the boundary case.  The XTF test case for XSA-170
>>> exercises this corner case, and still passes.
>>>
>>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> But why revert the change rather than fix ...
>>
>>> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>  out:
>>>      if ( nestedhvm_vcpu_in_guestmode(v) )
>>>          nvmx_idtv_handling();
>>> -
>>> -    /*
>>> -     * VM entry will fail (causing the guest to get crashed) if rIP (and
>>> -     * rFLAGS, but we don't have an issue there) doesn't meet certain
>>> -     * criteria. As we must not allow less than fully privileged mode to have
>>> -     * such an effect on the domain, we correct rIP in that case (accepting
>>> -     * this not being architecturally correct behavior, as the injected #GP
>>> -     * fault will then not see the correct [invalid] return address).
>>> -     * And since we know the guest will crash, we crash it right away if it
>>> -     * already is in most privileged mode.
>>> -     */
>>> -    mode = vmx_guest_x86_mode(v);
>>> -    if ( mode == 8 ? !is_canonical_address(regs->rip)
>> ... the wrong use of is_canonical_address() here? By reverting
>> you open up avenues for XSAs in case we get things wrong elsewhere,
>> including ...
>>
>>> -                   : regs->rip != regs->eip )
>> ... for 32-bit guests.
> 
> Because the only appropriate alternative would be ASSERT_UNREACHABLE()
> and domain crash.
> 
> This logic corrupts guest state.
> 
> Running with corrupt state is every bit as much an XSA as hitting a
> VMEntry failure if it can be triggered by userspace, but the latter is
> safer and much more obvious.

I disagree. For CPL > 0 we don't "corrupt" guest state any more
than reporting a #GP fault when one is going to be reported
anyway (as long as the VM entry doesn't fail, and hence the
guest won't get crashed). IOW this raising of #GP actually is a
precautionary measure to _avoid_ XSAs.

Nor do I agree with the "much more obvious" aspect: A VM entry
failure requires quite a bit of analysis to recognize what has
caused it; whether a non-pseudo-canonical RIP is what catches your
eye right away is simply unknown. The gprintk() that you delete,
OTOH, says very clearly what we have found to be wrong.
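
To make the off-by-one concrete, here is a minimal standalone sketch (not
Xen code) contrasting a true canonical check (bits 63:N-1 identical) with
the looser "bits 63:N identical" condition from the SDM quote. N = 48 is
an assumption, and the helper names are hypothetical, chosen only for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define VADDR_BITS 48 /* assumed linear-address width N */

/* True canonical check: bits 63:N-1 must all be equal. */
static bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> (VADDR_BITS - 1);

    return top == 0 || top == (UINT64_MAX >> (VADDR_BITS - 1));
}

/* Looser condition quoted from the SDM: bits 63:N must all be equal. */
static bool is_pseudo_canonical(uint64_t addr)
{
    uint64_t top = addr >> VADDR_BITS;

    return top == 0 || top == (UINT64_MAX >> VADDR_BITS);
}
```

With these definitions, 0x0000800000000000 fails is_canonical() but passes
is_pseudo_canonical(), which is exactly the boundary case under discussion.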

> It was the appropriate security fix (give or take the functional bug in
> it) at the time, given the complexity of retrofitting zero length
> instruction fetches to the emulator.
> 
> However, it is one of a very long list of guest-state-induced VMEntry
> failures, with non-trivial logic which we assert will pass, on a
> fastpath, where hardware also performs the same checks and we already
> have a runtime safe way of dealing with errors.  (Hence not actually
> using ASSERT_UNREACHABLE() here.)

"Runtime safe" as far as Xen is concerned, I take it. This isn't safe
for the guest at all, as vmx_failed_vmentry() results in an
unconditional domain_crash().

I certainly buy the fast path aspect of your comment, and if you were
moving the guest state adjustment into vmx_failed_vmentry(), I'd be
fine with the deletion here.
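
As a rough illustration of what moving the adjustment off the fast path
could look like, here is a toy model, assuming N = 48. Nothing below is
Xen code: struct toy_vcpu, toy_failed_vmentry() and the outcome enum are
hypothetical stand-ins, and the particular rIP correction (sign-extending
from bit 47, or truncating to 32 bits) is an assumption made for the
sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define VADDR_BITS 48 /* assumed linear-address width */

enum outcome { INJECT_GP, CRASH_DOMAIN };

/* Hypothetical stand-in for the relevant guest state. */
struct toy_vcpu {
    uint64_t rip;
    unsigned int cpl;
    bool long_mode;   /* true: 64-bit mode, false: 32-bit guest */
};

/* Would hardware accept this rIP? Bits 63:N identical, or fits in 32 bits. */
static bool rip_acceptable(const struct toy_vcpu *v)
{
    uint64_t top = v->rip >> VADDR_BITS;

    if ( !v->long_mode )
        return v->rip == (uint32_t)v->rip;

    return top == 0 || top == (UINT64_MAX >> VADDR_BITS);
}

/* Modelled slow path: runs only after a VM entry has already failed. */
static enum outcome toy_failed_vmentry(struct toy_vcpu *v)
{
    const uint64_t low_mask = (UINT64_C(1) << VADDR_BITS) - 1;

    /* Failure not caused by rIP, or guest already fully privileged. */
    if ( rip_acceptable(v) || v->cpl == 0 )
        return CRASH_DOMAIN;

    /* CPL > 0: repair rIP so re-entry can succeed, then report #GP. */
    if ( !v->long_mode )
        v->rip = (uint32_t)v->rip;
    else if ( v->rip & (UINT64_C(1) << (VADDR_BITS - 1)) )
        v->rip |= ~low_mask;
    else
        v->rip &= low_mask;

    return INJECT_GP;
}
```

The point of the sketch is only the placement: the check costs nothing on
successful entries, yet a CPL > 0 trigger still ends in #GP rather than in
a crashed domain.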

> It isn't appropriate for this check to exist on its own (i.e. without
> other guest state checks),

Well, if we run into cases where we get things wrong, more checks
and adjustments may want adding. Sadly each one of those has a fair
chance of needing an XSA.

As an aside, nvmx_n2_vmexit_handler()'s handling of
VMX_EXIT_REASONS_FAILED_VMENTRY looks pretty bogus - this is a flag,
not a separate exit reason. I guess I'll make a patch ...
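
For illustration, a minimal sketch of the distinction, using hypothetical
names rather than Xen's definitions: the exit reason field carries the
basic reason in bits 15:0, and VM-entry failure is signalled by bit 31 on
top of it, so an equality comparison against the flag value alone can
never match a real failed-entry exit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the bit layout follows the SDM's exit reason field. */
#define TOY_EXIT_FAILED_VMENTRY      UINT32_C(0x80000000) /* bit 31 */
#define TOY_EXIT_INVALID_GUEST_STATE UINT32_C(33)         /* basic reason */

/* Correct: test the flag bit. */
static bool vmentry_failed(uint32_t exit_reason)
{
    return (exit_reason & TOY_EXIT_FAILED_VMENTRY) != 0;
}

/* Correct: the basic exit reason lives in the low 16 bits. */
static uint16_t basic_exit_reason(uint32_t exit_reason)
{
    return (uint16_t)(exit_reason & 0xffff);
}

/* Bogus: treats the flag as if it were an exit reason of its own. */
static bool naive_is_failed_vmentry(uint32_t exit_reason)
{
    return exit_reason == TOY_EXIT_FAILED_VMENTRY;
}
```

For a failed entry due to invalid guest state the field reads 0x80000021:
vmentry_failed() is true and basic_exit_reason() yields 33, while the
naive comparison is false.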

Jan


Thread overview: 16+ messages
2020-10-09 15:09 [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest" Andrew Cooper
2020-10-13 15:58 ` Jan Beulich
2020-10-14 13:57   ` Andrew Cooper
2020-10-15  8:01     ` Jan Beulich [this message]
2020-10-16 15:38       ` Andrew Cooper
2020-10-19  9:09         ` Jan Beulich
2020-10-19 16:12           ` Andrew Cooper
2020-10-20  8:09             ` Jan Beulich
2020-10-23  6:14               ` Tian, Kevin
2023-04-05 21:52 Andrew Cooper
2023-04-06  7:10 ` Jan Beulich
2023-08-23 11:15 ` Roger Pau Monné
2023-08-23 11:56   ` Andrew Cooper
2023-08-23 13:31     ` Roger Pau Monné
2023-08-23 14:09       ` Andrew Cooper
2023-08-24  4:26 ` Tian, Kevin
