xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Xen-devel <xen-devel@lists.xenproject.org>
Cc: "Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Jan Beulich" <JBeulich@suse.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>, "Wei Liu" <wl@xen.org>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Manuel Bouyer" <bouyer@antioche.eu.org>
Subject: [PATCH 3/3] x86/pv: Inject #UD for missing SYSCALL callbacks
Date: Wed, 23 Sep 2020 11:18:48 +0100	[thread overview]
Message-ID: <20200923101848.29049-4-andrew.cooper3@citrix.com> (raw)
In-Reply-To: <20200923101848.29049-1-andrew.cooper3@citrix.com>

Despite appearing to be a deliberate design choice of early PV64, the
resulting behaviour for unregistered SYSCALL callbacks creates an untenable
testability problem for Xen.  Furthermore, the behaviour is undocumented,
bizarre, and inconsistent with related behaviour in Xen, and very liable
introduce a security vulnerability into a PV guest if the author hasn't
studied Xen's assembly code in detail.

There are two different bugs here.

1) The current logic confuses the registered entrypoints, and may deliver a
   SYSCALL from 32bit userspace to the 64bit entry, when only a 64bit
   entrypoint is registered.

   This has been the case ever since 2007 (c/s cd75d47348b) but up until
   2018 (c/s dba899de14) the wrong selectors would be handed to the guest for
   a 32bit SYSCALL entry, making it appear as if it a 64bit entry all along.

   Xen would malfunction under these circumstances, if it were a PV guest.
   Linux would as well, but PVOps has always registered both entrypoints and
   discarded the Xen-provided selectors.  NetBSD really does malfunction as a
   consequence (benignly now, but a VM DoS before the 2018 Xen selector fix).

2) In the case that neither SYSCALL callbacks are registered, the guest will
   be crashed when userspace executes a SYSCALL instruction, which is a
   userspace => kernel DoS.

   This has been the case ever since the introduction of 64bit PV support, but
   behaves unlike all other SYSCALL/SYSENTER callbacks in Xen, which yield
   #GP/#UD in userspace before the callback is registered, and are therefore
   safe by default.

This change does constitute a change in the PV ABI, for corner cases of a PV
guest kernel registering neither callback, or not registering the 32bit
callback when running on AMD/Hygon hardware.

It brings the behaviour in line with PV32 SYSCALL/SYSENTER, and PV64
SYSENTER (safe by default, until explicitly enabled), as well as native
hardware (always delivered to the single applicable callback).

Most importantly however, and the primary reason for the change, is that it
lets us actually test the PV entrypoints to prove correct behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Andy Lutomirski <luto@kernel.org>
CC: Manuel Bouyer <bouyer@antioche.eu.org>

Manuel: This will result in a corner case change for NetBSD.

At the moment on native, 32bit userspace on 64bit NetBSD will get #UD (Intel,
etc), or an explicit -ENOSYS (AMD, etc) when trying to execute a 32bit SYSCALL
instruction.

After this change, a 64bit PV VM will consistently see #UD (like on Intel, etc
hardware) even when running on AMD/Hygon hardware (as Xsyscall32 isn't
registered with Xen), rather than following Xsyscall into the proper system
call path.
---
 xen/arch/x86/x86_64/entry.S | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 000eb9722b..baab6d8755 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -26,18 +26,30 @@
 /* %rbx: struct vcpu */
 ENTRY(switch_to_kernel)
         leaq  VCPU_trap_bounce(%rbx),%rdx
-        /* TB_eip = (32-bit syscall && syscall32_addr) ?
-         *          syscall32_addr : syscall_addr */
-        xor   %eax,%eax
+
+        /* TB_eip = 32-bit syscall ? syscall32_addr : syscall_addr */
+        mov   VCPU_syscall32_addr(%rbx), %ecx
+        mov   VCPU_syscall_addr(%rbx), %rax
         cmpw  $FLAT_USER_CS32,UREGS_cs(%rsp)
-        cmoveq VCPU_syscall32_addr(%rbx),%rax
-        testq %rax,%rax
-        cmovzq VCPU_syscall_addr(%rbx),%rax
-        movq  %rax,TRAPBOUNCE_eip(%rdx)
+        cmove %rcx, %rax
+
         /* TB_flags = VGCF_syscall_disables_events ? TBF_INTERRUPT : 0 */
         btl   $_VGCF_syscall_disables_events,VCPU_guest_context_flags(%rbx)
         setc  %cl
         leal  (,%rcx,TBF_INTERRUPT),%ecx
+
+        test  %rax, %rax
+UNLIKELY_START(z, syscall_no_callback) /* TB_eip == 0 => #UD */
+        movq  VCPU_trap_ctxt(%rbx), %rdi
+        movl  $X86_EXC_UD, UREGS_entry_vector(%rsp)
+        subl  $2, UREGS_rip(%rsp)
+        movl  X86_EXC_UD * TRAPINFO_sizeof + TRAPINFO_eip(%rdi), %eax
+        testb $4, X86_EXC_UD * TRAPINFO_sizeof + TRAPINFO_flags(%rdi)
+        setnz %cl
+        leal  TBF_EXCEPTION(, %rcx, TBF_INTERRUPT), %ecx
+UNLIKELY_END(syscall_no_callback)
+
+        movq  %rax,TRAPBOUNCE_eip(%rdx)
         movb  %cl,TRAPBOUNCE_flags(%rdx)
         call  create_bounce_frame
         andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
-- 
2.11.0



  parent reply	other threads:[~2020-09-23 10:19 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-23 10:18 [PATCH 0/3] x86/pv: Multiple fixes to SYSCALL/SYSENTER handling (XSA-339 followup) Andrew Cooper
2020-09-23 10:18 ` [PATCH 1/3] x86/pv: Don't deliver #GP for a SYSENTER with NT set Andrew Cooper
2020-09-24 13:55   ` Jan Beulich
2020-09-23 10:18 ` [PATCH 2/3] x86/pv: Don't clobber NT on return-to-guest Andrew Cooper
2020-09-24 13:57   ` Jan Beulich
2020-09-23 10:18 ` Andrew Cooper [this message]
2020-09-24 14:56   ` [PATCH 3/3] x86/pv: Inject #UD for missing SYSCALL callbacks Jan Beulich
2020-09-28 13:05     ` Andrew Cooper
2020-09-28 15:35       ` Jan Beulich
2020-10-09 11:53   ` [PATCH v2] " Andrew Cooper
2020-10-09 12:40     ` Manuel Bouyer
2020-10-09 12:50       ` Andrew Cooper
2020-10-14 14:16     ` Roger Pau Monné
2020-10-14 14:20       ` Manuel Bouyer
2020-10-14 14:26         ` Andrew Cooper
2020-10-14 15:17       ` Andrew Cooper
2020-10-14 16:28     ` Roger Pau Monné
2020-10-14 17:41       ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200923101848.29049-4-andrew.cooper3@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=bouyer@antioche.eu.org \
    --cc=luto@kernel.org \
    --cc=roger.pau@citrix.com \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).