* [PATCH tentatively for-4.12 0/4] x86/altp2m: Fix multiple security issues
@ 2019-02-19 22:18 Andrew Cooper
  2019-02-19 22:18 ` [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill() Andrew Cooper
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Andrew Cooper @ 2019-02-19 22:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu,
	Jun Nakajima, Razvan Cojocaru, Andrew Cooper, Jan Beulich,
	Roger Pau Monné

There are no XSAs because altp2m isn't security supported.  However, it would
be very nice to have it in a less broken state for 4.12.

Overall the risk of regression to other parts of Xen is minimal, as most of
these changes are only in altp2m-enabled paths.

Andrew Cooper (4):
  xen/common: Break domain_unmap_resources() out of domain_kill()
  x86/altp2m: Rework #VE enable/disable paths
  x86/vmx: Fix security issue when a guest balloons out the #VE info page
  x86/vmx: Properly flush the TLB when an altp2m is modified

 xen/arch/x86/domain.c          |  7 ++++
 xen/arch/x86/hvm/hvm.c         | 19 ++--------
 xen/arch/x86/hvm/vmx/vmx.c     | 69 ++++++++++++++++++++++++------------
 xen/arch/x86/mm/altp2m.c       | 80 +++++++++++++++++++++++++++++++++++-------
 xen/common/domain.c            | 16 +++++++--
 xen/include/asm-x86/altp2m.h   |  4 ++-
 xen/include/asm-x86/domain.h   |  3 ++
 xen/include/asm-x86/hvm/vcpu.h |  7 +++-
 xen/include/xen/domain.h       |  4 +++
 9 files changed, 153 insertions(+), 56 deletions(-)

-- 
2.1.4



* [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill()
  2019-02-19 22:18 [PATCH tentatively for-4.12 0/4] x86/altp2m: Fix multiple security issues Andrew Cooper
@ 2019-02-19 22:18 ` Andrew Cooper
  2019-02-19 22:39   ` Razvan Cojocaru
  2019-02-19 22:18 ` [PATCH 2/4] x86/altp2m: Rework #VE enable/disable paths Andrew Cooper
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Andrew Cooper @ 2019-02-19 22:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Tamas K Lengyel, Wei Liu, Razvan Cojocaru,
	Andrew Cooper, Jan Beulich, Roger Pau Monné

A subsequent change is going to need an x86-specific unmapping step, so take
the opportunity to split the current vcpu unmapping out into a dedicated path.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Razvan Cojocaru <rcojocaru@bitdefender.com>
CC: Tamas K Lengyel <tamas@tklengyel.com>
CC: Juergen Gross <jgross@suse.com>
---
 xen/common/domain.c      | 16 +++++++++++++---
 xen/include/xen/domain.h |  4 ++++
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 32bca8d..e66f7ea 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -700,10 +700,21 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d)
     return 0;
 }
 
+static void domain_unmap_resources(struct domain *d)
+{
+    struct vcpu *v;
+
+    for_each_vcpu ( d, v )
+    {
+        unmap_vcpu_info(v);
+
+        arch_vcpu_unmap_resources(v);
+    }
+}
+
 int domain_kill(struct domain *d)
 {
     int rc = 0;
-    struct vcpu *v;
 
     if ( d == current->domain )
         return -EINVAL;
@@ -732,13 +743,12 @@ int domain_kill(struct domain *d)
         d->tmem_client = NULL;
         /* fallthrough */
     case DOMDYING_dying:
+        domain_unmap_resources(d);
         rc = domain_relinquish_resources(d);
         if ( rc != 0 )
             break;
         if ( cpupool_move_domain(d, cpupool0) )
             return -ERESTART;
-        for_each_vcpu ( d, v )
-            unmap_vcpu_info(v);
         d->is_dying = DOMDYING_dead;
         /* Mem event cleanup has to go here because the rings 
          * have to be put before we call put_domain. */
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index d1bfc82..f53c3a9 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -90,6 +90,10 @@ void arch_dump_domain_info(struct domain *d);
 
 int arch_vcpu_reset(struct vcpu *);
 
+#ifndef arch_vcpu_unmap_resources
+static inline void arch_vcpu_unmap_resources(struct vcpu *v) {}
+#endif
+
 bool_t domctl_lock_acquire(void);
 void domctl_lock_release(void);
 
-- 
2.1.4



* [PATCH 2/4] x86/altp2m: Rework #VE enable/disable paths
  2019-02-19 22:18 [PATCH tentitively for-4.12 0/4] x86/altp2m: Fix multiple security issues Andrew Cooper
  2019-02-19 22:18 ` [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill() Andrew Cooper
@ 2019-02-19 22:18 ` Andrew Cooper
  2019-02-19 23:10   ` Razvan Cojocaru
  2019-02-19 22:18 ` [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page Andrew Cooper
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Andrew Cooper @ 2019-02-19 22:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Tamas K Lengyel, Wei Liu, Razvan Cojocaru,
	Andrew Cooper, Jan Beulich, Roger Pau Monné

Split altp2m_vcpu_{enable,disable}_ve() out of the
HVMOP_altp2m_vcpu_{enable,disable}_notify marshalling logic.  A future change
is going to need to call altp2m_vcpu_disable_ve() from the domain_kill() path.

While at it, clean up the logic in altp2m_vcpu_{initialise,destroy}().
altp2m_vcpu_reset() has no external callers, so fold it into its two
callsites.  This in turn allows for altp2m_vcpu_destroy() to reuse
altp2m_vcpu_disable_ve() rather than opencoding it.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Razvan Cojocaru <rcojocaru@bitdefender.com>
CC: Tamas K Lengyel <tamas@tklengyel.com>
CC: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/hvm/hvm.c       | 19 ++-----------------
 xen/arch/x86/mm/altp2m.c     | 39 +++++++++++++++++++++++++++------------
 xen/include/asm-x86/altp2m.h |  4 +++-
 3 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 79c7d81..a80ddcf 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4568,7 +4568,6 @@ static int do_altp2m_op(
     case HVMOP_altp2m_vcpu_enable_notify:
     {
         struct vcpu *v;
-        p2m_type_t p2mt;
 
         if ( a.u.enable_notify.pad ||
              a.u.enable_notify.vcpu_id >= d->max_vcpus )
@@ -4585,16 +4584,7 @@ static int do_altp2m_op(
 
         v = d->vcpu[a.u.enable_notify.vcpu_id];
 
-        if ( !gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) ||
-             mfn_eq(get_gfn_query_unlocked(v->domain,
-                    a.u.enable_notify.gfn, &p2mt), INVALID_MFN) )
-        {
-            rc = -EINVAL;
-            break;
-        }
-
-        vcpu_altp2m(v).veinfo_gfn = _gfn(a.u.enable_notify.gfn);
-        altp2m_vcpu_update_vmfunc_ve(v);
+        rc = altp2m_vcpu_enable_ve(v, _gfn(a.u.enable_notify.gfn));
         break;
     }
 
@@ -4616,12 +4606,7 @@ static int do_altp2m_op(
 
         v = d->vcpu[a.u.enable_notify.vcpu_id];
 
-        /* Already disabled, nothing to do. */
-        if ( gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) )
-            break;
-
-        vcpu_altp2m(v).veinfo_gfn = INVALID_GFN;
-        altp2m_vcpu_update_vmfunc_ve(v);
+        altp2m_vcpu_disable_ve(v);
         break;
     }
 
diff --git a/xen/arch/x86/mm/altp2m.c b/xen/arch/x86/mm/altp2m.c
index 930bdc2..8bdefb0 100644
--- a/xen/arch/x86/mm/altp2m.c
+++ b/xen/arch/x86/mm/altp2m.c
@@ -21,22 +21,13 @@
 #include <asm/altp2m.h>
 
 void
-altp2m_vcpu_reset(struct vcpu *v)
-{
-    struct altp2mvcpu *av = &vcpu_altp2m(v);
-
-    av->p2midx = INVALID_ALTP2M;
-    av->veinfo_gfn = INVALID_GFN;
-}
-
-void
 altp2m_vcpu_initialise(struct vcpu *v)
 {
     if ( v != current )
         vcpu_pause(v);
 
-    altp2m_vcpu_reset(v);
     vcpu_altp2m(v).p2midx = 0;
+    vcpu_altp2m(v).veinfo_gfn = INVALID_GFN;
     atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
 
     altp2m_vcpu_update_p2m(v);
@@ -56,15 +47,39 @@ altp2m_vcpu_destroy(struct vcpu *v)
     if ( (p2m = p2m_get_altp2m(v)) )
         atomic_dec(&p2m->active_vcpus);
 
-    altp2m_vcpu_reset(v);
+    altp2m_vcpu_disable_ve(v);
 
+    vcpu_altp2m(v).p2midx = INVALID_ALTP2M;
     altp2m_vcpu_update_p2m(v);
-    altp2m_vcpu_update_vmfunc_ve(v);
 
     if ( v != current )
         vcpu_unpause(v);
 }
 
+int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn)
+{
+    p2m_type_t p2mt;
+
+    if ( !gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) ||
+         mfn_eq(get_gfn_query_unlocked(v->domain, gfn_x(gfn), &p2mt),
+                INVALID_MFN) )
+        return -EINVAL;
+
+    vcpu_altp2m(v).veinfo_gfn = gfn;
+    altp2m_vcpu_update_vmfunc_ve(v);
+
+    return 0;
+}
+
+void altp2m_vcpu_disable_ve(struct vcpu *v)
+{
+    if ( !gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) )
+    {
+        vcpu_altp2m(v).veinfo_gfn = INVALID_GFN;
+        altp2m_vcpu_update_vmfunc_ve(v);
+    }
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/altp2m.h b/xen/include/asm-x86/altp2m.h
index 3befcf6..8139bf8 100644
--- a/xen/include/asm-x86/altp2m.h
+++ b/xen/include/asm-x86/altp2m.h
@@ -33,7 +33,9 @@ static inline bool altp2m_active(const struct domain *d)
 /* Alternate p2m VCPU */
 void altp2m_vcpu_initialise(struct vcpu *v);
 void altp2m_vcpu_destroy(struct vcpu *v);
-void altp2m_vcpu_reset(struct vcpu *v);
+
+int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn);
+void altp2m_vcpu_disable_ve(struct vcpu *v);
 
 static inline uint16_t altp2m_vcpu_idx(const struct vcpu *v)
 {
-- 
2.1.4



* [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-19 22:18 [PATCH tentatively for-4.12 0/4] x86/altp2m: Fix multiple security issues Andrew Cooper
  2019-02-19 22:18 ` [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill() Andrew Cooper
  2019-02-19 22:18 ` [PATCH 2/4] x86/altp2m: Rework #VE enable/disable paths Andrew Cooper
@ 2019-02-19 22:18 ` Andrew Cooper
  2019-02-20  9:45   ` Razvan Cojocaru
                     ` (2 more replies)
  2019-02-19 22:18 ` [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified Andrew Cooper
  2019-02-21 13:13 ` [PATCH tentatively for-4.12 0/4] x86/altp2m: Fix multiple security issues Juergen Gross
  4 siblings, 3 replies; 23+ messages in thread
From: Andrew Cooper @ 2019-02-19 22:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu,
	Jun Nakajima, Razvan Cojocaru, Andrew Cooper, Jan Beulich,
	Roger Pau Monné

The logic in altp2m_vcpu_{en,dis}able_ve() and vmx_vcpu_update_vmfunc_ve() is
dangerous.  After #VE has been set up, the guest can balloon out and free the
nominated GFN, after which the processor may write to it.  Also, the unlocked
GFN query means the MFN is stale by the time it is used.  Alternatively, a
guest can race two disable calls to cause one VMCS to still reference the
nominated GFN after the tracking information was dropped.

Rework the logic from scratch to make it safe.

Hold an extra page reference on the underlying frame, to account for the
VMCS's reference.  This means that if the GFN gets ballooned out, it isn't
freed back to Xen until #VE is disabled, and the VMCS no longer refers to the
page.

A consequence of this is that arch_vcpu_unmap_resources() now needs to call
altp2m_vcpu_disable_ve() to drop the reference during domain_kill(), to allow
all of the memory to be freed.

For domains using altp2m, we expect a single enable call and no disable for
the remaining lifetime of the domain.  However, to avoid problems with
concurrent calls, use cmpxchg() to locklessly maintain safety.
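
Distilled, the enable/disable pairing is a lockless publish/withdraw
pattern on the veinfo_pg pointer.  A minimal sketch, assuming Xen's
cmpxchg()/xchg() helpers and eliding the page reference and VMCS
updates (ve_publish/ve_withdraw are illustrative names, not the
patch's functions):

    static int ve_publish(struct altp2mvcpu *a, struct page_info *pg)
    {
        /* Whoever flips NULL -> pg owns programming the VMCS. */
        if ( cmpxchg(&a->veinfo_pg, NULL, pg) != NULL )
            return -EEXIST;    /* Lost the race - back off. */

        return 0;
    }

    static void ve_withdraw(struct altp2mvcpu *a)
    {
        /* Whoever flips pg -> NULL owns the VMCS teardown. */
        struct page_info *pg = xchg(&a->veinfo_pg, NULL);

        if ( pg )
        {
            /* ... clear VIRT_EXCEPTION_INFO, then put_page(pg) ... */
        }
    }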

This doesn't have an XSA because altp2m is not yet a security-supported
feature.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Razvan Cojocaru <rcojocaru@bitdefender.com>
CC: Tamas K Lengyel <tamas@tklengyel.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/domain.c          |  7 +++++
 xen/arch/x86/hvm/vmx/vmx.c     | 33 ++++++++++++-----------
 xen/arch/x86/mm/altp2m.c       | 59 +++++++++++++++++++++++++++++++++++-------
 xen/include/asm-x86/domain.h   |  3 +++
 xen/include/asm-x86/hvm/vcpu.h |  7 ++++-
 5 files changed, 82 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253..198fa14 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -38,6 +38,7 @@
 #include <xen/livepatch.h>
 #include <public/sysctl.h>
 #include <public/hvm/hvm_vcpu.h>
+#include <asm/altp2m.h>
 #include <asm/regs.h>
 #include <asm/mc146818rtc.h>
 #include <asm/system.h>
@@ -2013,6 +2014,12 @@ static int relinquish_memory(
     return ret;
 }
 
+void arch_vcpu_unmap_resources(struct vcpu *v)
+{
+    if ( altp2m_active(v->domain) )
+        altp2m_vcpu_disable_ve(v);
+}
+
 int domain_relinquish_resources(struct domain *d)
 {
     int ret;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 24def93..395bccd 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2196,14 +2196,11 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
 
         if ( cpu_has_vmx_virt_exceptions )
         {
-            p2m_type_t t;
-            mfn_t mfn;
+            const struct page_info *pg = vcpu_altp2m(v).veinfo_pg;
 
-            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
-
-            if ( !mfn_eq(mfn, INVALID_MFN) )
+            if ( pg )
             {
-                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
+                __vmwrite(VIRT_EXCEPTION_INFO, page_to_maddr(pg));
                 /*
                  * Make sure we have an up-to-date EPTP_INDEX when
                  * setting SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS.
@@ -2237,21 +2234,19 @@ static int vmx_vcpu_emulate_vmfunc(const struct cpu_user_regs *regs)
 
 static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 {
-    bool_t rc = 0, writable;
-    gfn_t gfn = vcpu_altp2m(v).veinfo_gfn;
+    const struct page_info *pg = vcpu_altp2m(v).veinfo_pg;
     ve_info_t *veinfo;
+    bool rc = false;
 
-    if ( gfn_eq(gfn, INVALID_GFN) )
-        return 0;
+    if ( !pg )
+        return rc;
 
-    veinfo = hvm_map_guest_frame_rw(gfn_x(gfn), 0, &writable);
-    if ( !veinfo )
-        return 0;
-    if ( !writable || veinfo->semaphore != 0 )
-        goto out;
+    veinfo = __map_domain_page(pg);
 
-    rc = 1;
+    if ( veinfo->semaphore != 0 )
+        goto out;
 
+    rc = true;
     veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
     veinfo->semaphore = ~0;
     veinfo->eptp_index = vcpu_altp2m(v).p2midx;
@@ -2266,7 +2261,11 @@ static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
                             X86_EVENT_NO_EC);
 
  out:
-    hvm_unmap_guest_frame(veinfo, 0);
+    unmap_domain_page(veinfo);
+
+    if ( rc )
+        paging_mark_dirty(v->domain, page_to_mfn(pg));
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/altp2m.c b/xen/arch/x86/mm/altp2m.c
index 8bdefb0..7412635 100644
--- a/xen/arch/x86/mm/altp2m.c
+++ b/xen/arch/x86/mm/altp2m.c
@@ -27,7 +27,6 @@ altp2m_vcpu_initialise(struct vcpu *v)
         vcpu_pause(v);
 
     vcpu_altp2m(v).p2midx = 0;
-    vcpu_altp2m(v).veinfo_gfn = INVALID_GFN;
     atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
 
     altp2m_vcpu_update_p2m(v);
@@ -58,25 +57,67 @@ altp2m_vcpu_destroy(struct vcpu *v)
 
 int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn)
 {
+    struct domain *d = v->domain;
+    struct altp2mvcpu *a = &vcpu_altp2m(v);
     p2m_type_t p2mt;
+    mfn_t mfn;
+    struct page_info *pg;
+    int rc;
+
+    /* Early exit path if #VE is already configured. */
+    if ( a->veinfo_pg )
+        return -EEXIST;
+
+    mfn = get_gfn(d, gfn_x(gfn), &p2mt);
+
+    /*
+     * Looking for a plain piece of guest writeable RAM.  Take an extra page
+     * reference to reflect our intent to point the VMCS at it.
+     */
+    if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_ram(p2mt) ||
+         p2m_is_readonly(p2mt) || !get_page(pg = mfn_to_page(mfn), d) )
+    {
+        rc = -EINVAL;
+        goto out;
+    }
 
-    if ( !gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) ||
-         mfn_eq(get_gfn_query_unlocked(v->domain, gfn_x(gfn), &p2mt),
-                INVALID_MFN) )
-        return -EINVAL;
+    /*
+     * Update veinfo_pg, making sure to be safe with concurrent hypercalls.
+     * The first caller to make veinfo_pg become non-NULL will program its MFN
+     * into the VMCS, so must not be clobbered.  Callers which lose the race
+     * back off with -EEXIST.
+     */
+    if ( cmpxchg(&a->veinfo_pg, NULL, pg) != NULL )
+    {
+        put_page(pg);
+        rc = -EEXIST;
+        goto out;
+    }
 
-    vcpu_altp2m(v).veinfo_gfn = gfn;
+    rc = 0;
     altp2m_vcpu_update_vmfunc_ve(v);
 
-    return 0;
+ out:
+    put_gfn(d, gfn_x(gfn));
+
+    return rc;
 }
 
 void altp2m_vcpu_disable_ve(struct vcpu *v)
 {
-    if ( !gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) )
+    struct altp2mvcpu *a = &vcpu_altp2m(v);
+    struct page_info *pg;
+
+    /*
+     * Update veinfo_pg, making sure to be safe with concurrent hypercalls.
+     * The winner of this race is responsible to update the VMCS to no longer
+     * point at the page, then drop the associated ref.
+     */
+    if ( (pg = xchg(&a->veinfo_pg, NULL)) )
     {
-        vcpu_altp2m(v).veinfo_gfn = INVALID_GFN;
         altp2m_vcpu_update_vmfunc_ve(v);
+
+        put_page(pg);
     }
 }
 
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 277f99f..5f742d1 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -653,6 +653,9 @@ bool update_secondary_system_time(struct vcpu *,
 void vcpu_show_execution_state(struct vcpu *);
 void vcpu_show_registers(const struct vcpu *);
 
+#define arch_vcpu_unmap_resources arch_vcpu_unmap_resources
+void arch_vcpu_unmap_resources(struct vcpu *v);
+
 /* Clean up CR4 bits that are not under guest control. */
 unsigned long pv_guest_cr4_fixup(const struct vcpu *, unsigned long guest_cr4);
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index c8a40f6..6c84d5a 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -137,8 +137,13 @@ struct nestedvcpu {
 #define vcpu_nestedhvm(v) ((v)->arch.hvm.nvcpu)
 
 struct altp2mvcpu {
+    /*
+     * #VE information page.  This pointer being non-NULL indicates that a
+     * VMCS's VIRT_EXCEPTION_INFO field is pointing to the page, and an extra
+     * page reference is held.
+     */
+    struct page_info *veinfo_pg;
     uint16_t    p2midx;         /* alternate p2m index */
-    gfn_t       veinfo_gfn;     /* #VE information page gfn */
 };
 
 #define vcpu_altp2m(v) ((v)->arch.hvm.avcpu)
-- 
2.1.4



* [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified
  2019-02-19 22:18 [PATCH tentatively for-4.12 0/4] x86/altp2m: Fix multiple security issues Andrew Cooper
                   ` (2 preceding siblings ...)
  2019-02-19 22:18 ` [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page Andrew Cooper
@ 2019-02-19 22:18 ` Andrew Cooper
  2019-02-19 23:20   ` Razvan Cojocaru
                     ` (2 more replies)
  2019-02-21 13:13 ` [PATCH tentatively for-4.12 0/4] x86/altp2m: Fix multiple security issues Juergen Gross
  4 siblings, 3 replies; 23+ messages in thread
From: Andrew Cooper @ 2019-02-19 22:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu,
	Jun Nakajima, Razvan Cojocaru, Andrew Cooper, Jan Beulich,
	Roger Pau Monné

Modifications to an altp2m mark the p2m as needing flushing, but this was
never wired up in the return-to-guest path.  As a result, stale TLB entries
can remain after resuming the guest.

In practice, this manifests as a missing EPT_VIOLATION or #VE exception when
the guest subsequently accesses a page which has had its permissions reduced.

vmx_vmenter_helper() now has 11 p2ms to potentially invalidate, but issuing 11
INVEPT instructions isn't clever.  Instead, count how many contexts need
invalidating, and use INVEPT_ALL_CONTEXT if two or more are in need of
flushing.
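
Condensed, the new return-to-guest logic becomes (a sketch of the hunk
below; for_each_dirty_ept() is hypothetical shorthand for the host-p2m
check plus the loop over active altp2m EPTs):

    unsigned int inv = 0;          /* 0 => none, 1 => single, >1 => all */
    struct ept_data *single = NULL;

    for_each_dirty_ept ( ept )     /* hypothetical iterator */
    {
        inv++;
        single = ept;
    }

    if ( inv )
        __invept(inv == 1 ? INVEPT_SINGLE_CONTEXT : INVEPT_ALL_CONTEXT,
                 inv == 1 ? single->eptp          : 0);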

This doesn't have an XSA because altp2m is not yet a security-supported
feature.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Razvan Cojocaru <rcojocaru@bitdefender.com>
CC: Tamas K Lengyel <tamas@tklengyel.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 36 +++++++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 395bccd..290175b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -4273,6 +4273,7 @@ static void lbr_fixup(void)
 bool vmx_vmenter_helper(const struct cpu_user_regs *regs)
 {
     struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
     u32 new_asid, old_asid;
     struct hvm_vcpu_asid *p_asid;
     bool_t need_flush;
@@ -4319,17 +4320,42 @@ bool vmx_vmenter_helper(const struct cpu_user_regs *regs)
 
     if ( paging_mode_hap(curr->domain) )
     {
-        struct ept_data *ept = &p2m_get_hostp2m(curr->domain)->ept;
+        struct ept_data *ept = &p2m_get_hostp2m(currd)->ept;
         unsigned int cpu = smp_processor_id();
+        unsigned int inv = 0; /* None => Single => All */
+        struct ept_data *single = NULL; /* Single eptp, iff inv == 1 */
 
         if ( cpumask_test_cpu(cpu, ept->invalidate) )
         {
             cpumask_clear_cpu(cpu, ept->invalidate);
-            if ( nestedhvm_enabled(curr->domain) )
-                __invept(INVEPT_ALL_CONTEXT, 0);
-            else
-                __invept(INVEPT_SINGLE_CONTEXT, ept->eptp);
+
+            /* Automatically invalidate all contexts if nested. */
+            inv += 1 + nestedhvm_enabled(currd);
+            single = ept;
+        }
+
+        if ( altp2m_active(curr->domain) )
+        {
+            unsigned int i;
+
+            for ( i = 0; i < MAX_ALTP2M; ++i )
+            {
+                if ( currd->arch.altp2m_eptp[i] == mfn_x(INVALID_MFN) )
+                    continue;
+
+                ept = &currd->arch.altp2m_p2m[i]->ept;
+                if ( cpumask_test_cpu(cpu, ept->invalidate) )
+                {
+                    cpumask_clear_cpu(cpu, ept->invalidate);
+                    inv++;
+                    single = ept;
+                }
+            }
         }
+
+        if ( inv )
+            __invept(inv == 1 ? INVEPT_SINGLE_CONTEXT : INVEPT_ALL_CONTEXT,
+                     inv == 1 ? single->eptp          : 0);
     }
 
  out:
-- 
2.1.4



* Re: [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill()
  2019-02-19 22:18 ` [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill() Andrew Cooper
@ 2019-02-19 22:39   ` Razvan Cojocaru
  2019-02-19 22:46     ` Andrew Cooper
  0 siblings, 1 reply; 23+ messages in thread
From: Razvan Cojocaru @ 2019-02-19 22:39 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Juergen Gross, Tamas K Lengyel, Wei Liu, Jan Beulich,
	Roger Pau Monné

On 2/20/19 12:18 AM, Andrew Cooper wrote:
> A subsequent change is going to need an x86-specific unmapping step, so take
> the opportunity to split the current vcpu unmapping out into a dedicated path.
> 
> No practical change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Razvan Cojocaru <rcojocaru@bitdefender.com>
> CC: Tamas K Lengyel <tamas@tklengyel.com>
> CC: Juergen Gross <jgross@suse.com>
> ---
>  xen/common/domain.c      | 16 +++++++++++++---
>  xen/include/xen/domain.h |  4 ++++
>  2 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 32bca8d..e66f7ea 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -700,10 +700,21 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d)
>      return 0;
>  }
>  
> +static void domain_unmap_resources(struct domain *d)
> +{
> +    struct vcpu *v;
> +
> +    for_each_vcpu ( d, v )
> +    {
> +        unmap_vcpu_info(v);
> +
> +        arch_vcpu_unmap_resources(v);
> +    }
> +}
> +
>  int domain_kill(struct domain *d)
>  {
>      int rc = 0;
> -    struct vcpu *v;
>  
>      if ( d == current->domain )
>          return -EINVAL;
> @@ -732,13 +743,12 @@ int domain_kill(struct domain *d)
>          d->tmem_client = NULL;
>          /* fallthrough */
>      case DOMDYING_dying:
> +        domain_unmap_resources(d);
>          rc = domain_relinquish_resources(d);
>          if ( rc != 0 )
>              break;
>          if ( cpupool_move_domain(d, cpupool0) )
>              return -ERESTART;
> -        for_each_vcpu ( d, v )
> -            unmap_vcpu_info(v);

So before this change there was a leak of some sort here? I see that
before this patch unmap_vcpu_info() was called only if rc == 0, but it
is now called unconditionally _before_ domain_relinquish_resources().

This does appear to change the behaviour of the code in the error case.

If that's intended:
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan


* Re: [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill()
  2019-02-19 22:39   ` Razvan Cojocaru
@ 2019-02-19 22:46     ` Andrew Cooper
  2019-02-20 14:11       ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Andrew Cooper @ 2019-02-19 22:46 UTC (permalink / raw)
  To: Razvan Cojocaru, Xen-devel
  Cc: Juergen Gross, Tamas K Lengyel, Wei Liu, Jan Beulich,
	Roger Pau Monné

On 19/02/2019 22:39, Razvan Cojocaru wrote:
> On 2/20/19 12:18 AM, Andrew Cooper wrote:
>> A subsequent change is going to need an x86-specific unmapping step, so take
>> the opportunity to split the current vcpu unmapping out into a dedicated path.
>>
>> No practical change.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> CC: Tamas K Lengyel <tamas@tklengyel.com>
>> CC: Juergen Gross <jgross@suse.com>
>> ---
>>  xen/common/domain.c      | 16 +++++++++++++---
>>  xen/include/xen/domain.h |  4 ++++
>>  2 files changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index 32bca8d..e66f7ea 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -700,10 +700,21 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d)
>>      return 0;
>>  }
>>  
>> +static void domain_unmap_resources(struct domain *d)
>> +{
>> +    struct vcpu *v;
>> +
>> +    for_each_vcpu ( d, v )
>> +    {
>> +        unmap_vcpu_info(v);
>> +
>> +        arch_vcpu_unmap_resources(v);
>> +    }
>> +}
>> +
>>  int domain_kill(struct domain *d)
>>  {
>>      int rc = 0;
>> -    struct vcpu *v;
>>  
>>      if ( d == current->domain )
>>          return -EINVAL;
>> @@ -732,13 +743,12 @@ int domain_kill(struct domain *d)
>>          d->tmem_client = NULL;
>>          /* fallthrough */
>>      case DOMDYING_dying:
>> +        domain_unmap_resources(d);
>>          rc = domain_relinquish_resources(d);
>>          if ( rc != 0 )
>>              break;
>>          if ( cpupool_move_domain(d, cpupool0) )
>>              return -ERESTART;
>> -        for_each_vcpu ( d, v )
>> -            unmap_vcpu_info(v);
> So before this change there was a leak of some sort here? I see that
> before this patch unmap_vcpu_info() was called only if rc == 0, but it
> is now called unconditionally _before_ domain_relinquish_resources().
>
> This does appear to change the behaviour of the code in the error case.
>
> If that's intended:
> Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

domain_kill() is a very long running hypercall, and takes about 15
minutes of wallclock time for domains with 1T of RAM (before the idle
scrubbing changes went in).

It will frequently break out with -ERESTART for continuation purposes,
but doesn't fail outright.
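
Schematically, the continuation convention amounts to the caller
re-issuing the operation until it completes.  An illustrative loop
only - real callers re-enter via the hypercall path so other work can
be scheduled in between:

    int rc;

    do {
        /* Each call makes some progress, then may yield with -ERESTART. */
        rc = domain_kill(d);
    } while ( rc == -ERESTART );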

However, with hindsight it would probably be better just to put the
altp2m case in domain_relinquish_resources() and do away with this patch
entirely.

~Andrew


* Re: [PATCH 2/4] x86/altp2m: Rework #VE enable/disable paths
  2019-02-19 22:18 ` [PATCH 2/4] x86/altp2m: Rework #VE enable/disable paths Andrew Cooper
@ 2019-02-19 23:10   ` Razvan Cojocaru
  2019-02-20 14:14     ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Razvan Cojocaru @ 2019-02-19 23:10 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Juergen Gross, Tamas K Lengyel, Wei Liu, Jan Beulich,
	Roger Pau Monné

On 2/20/19 12:18 AM, Andrew Cooper wrote:
> Split altp2m_vcpu_{enable,disable}_ve() out of the
> HVMOP_altp2m_vcpu_{enable,disable}_notify marshalling logic.  A future change
> is going to need to call altp2m_vcpu_disable_ve() from the domain_kill() path.
> 
> While at it, clean up the logic in altp2m_vcpu_{initialise,destroy}().
> altp2m_vcpu_reset() has no external callers, so fold it into its two
> callsites.  This in turn allows for altp2m_vcpu_destroy() to reuse
> altp2m_vcpu_disable_ve() rather than opencoding it.
> 
> No practical change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan


* Re: [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified
  2019-02-19 22:18 ` [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified Andrew Cooper
@ 2019-02-19 23:20   ` Razvan Cojocaru
  2019-02-20 14:47   ` Jan Beulich
  2019-02-28  5:50   ` Tian, Kevin
  2 siblings, 0 replies; 23+ messages in thread
From: Razvan Cojocaru @ 2019-02-19 23:20 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu, Jan Beulich,
	Jun Nakajima, Roger Pau Monné

On 2/20/19 12:18 AM, Andrew Cooper wrote:
> Modifications to an altp2m mark the p2m as needing flushing, but this was
> never wired up in the return-to-guest path.  As a result, stale TLB entries
> can remain after resuming the guest.
> 
> In practice, this manifests as a missing EPT_VIOLATION or #VE exception when
> the guest subsequently accesses a page which has had its permissions reduced.
> 
> vmx_vmenter_helper() now has 11 p2ms to potentially invalidate, but issuing 11
> INVEPT instructions isn't clever.  Instead, count how many contexts need
> invalidating, and use INVEPT_ALL_CONTEXT if two or more are in need of
> flushing.
> 
> This doesn't have an XSA because altp2m is not yet a security-supported
> feature.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan


* Re: [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-19 22:18 ` [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page Andrew Cooper
@ 2019-02-20  9:45   ` Razvan Cojocaru
  2019-02-20 14:37   ` Jan Beulich
  2019-02-21 20:18   ` [PATCH v2 " Andrew Cooper
  2 siblings, 0 replies; 23+ messages in thread
From: Razvan Cojocaru @ 2019-02-20  9:45 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu, Jan Beulich,
	Jun Nakajima, Roger Pau Monné

On 2/20/19 12:18 AM, Andrew Cooper wrote:
> The logic in altp2m_vcpu_{en,dis}able_ve() and vmx_vcpu_update_vmfunc_ve() is
> dangerous.  After #VE has been set up, the guest can balloon out and free the
> nominated GFN, after which the processor may write to it.  Also, the unlocked
> GFN query means the MFN is stale by the time it is used.  Alternatively, a
> guest can race two disable calls to cause one VMCS to still reference the
> nominated GFN after the tracking information was dropped.
> 
> Rework the logic from scratch to make it safe.
> 
> Hold an extra page reference on the underlying frame, to account for the
> VMCS's reference.  This means that if the GFN gets ballooned out, it isn't
> freed back to Xen until #VE is disabled, and the VMCS no longer refers to the
> page.
> 
> A consequence of this is that arch_vcpu_unmap_resources() now needs to call
> altp2m_vcpu_disable_ve() to drop the reference during domain_kill(), to allow
> all of the memory to be freed.
> 
> For domains using altp2m, we expect a single enable call and no disable for
> the remaining lifetime of the domain.  However, to avoid problems with
> concurrent calls, use cmpxchg() to locklessly maintain safety.
> 
> This doesn't have an XSA because altp2m is not yet a security-supported
> feature.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Tested-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan


* Re: [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill()
  2019-02-19 22:46     ` Andrew Cooper
@ 2019-02-20 14:11       ` Jan Beulich
  0 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2019-02-20 14:11 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, Tamas K Lengyel, Wei Liu, Razvan Cojocaru,
	Xen-devel, Roger Pau Monne

>>> On 19.02.19 at 23:46, <andrew.cooper3@citrix.com> wrote:
> On 19/02/2019 22:39, Razvan Cojocaru wrote:
>> On 2/20/19 12:18 AM, Andrew Cooper wrote:
>>> A subsequent change is going to need an x86-specific unmapping step, so take
>>> the opportunity to split the current vcpu unmapping out into a dedicated 
> path.
>>>
>>> No practical change.
>>>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>> CC: Jan Beulich <JBeulich@suse.com>
>>> CC: Wei Liu <wei.liu2@citrix.com>
>>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>> CC: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>> CC: Tamas K Lengyel <tamas@tklengyel.com>
>>> CC: Juergen Gross <jgross@suse.com>
>>> ---
>>>  xen/common/domain.c      | 16 +++++++++++++---
>>>  xen/include/xen/domain.h |  4 ++++
>>>  2 files changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>>> index 32bca8d..e66f7ea 100644
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -700,10 +700,21 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, 
> struct domain **d)
>>>      return 0;
>>>  }
>>>  
>>> +static void domain_unmap_resources(struct domain *d)
>>> +{
>>> +    struct vcpu *v;
>>> +
>>> +    for_each_vcpu ( d, v )
>>> +    {
>>> +        unmap_vcpu_info(v);
>>> +
>>> +        arch_vcpu_unmap_resources(v);
>>> +    }
>>> +}
>>> +
>>>  int domain_kill(struct domain *d)
>>>  {
>>>      int rc = 0;
>>> -    struct vcpu *v;
>>>  
>>>      if ( d == current->domain )
>>>          return -EINVAL;
>>> @@ -732,13 +743,12 @@ int domain_kill(struct domain *d)
>>>          d->tmem_client = NULL;
>>>          /* fallthrough */
>>>      case DOMDYING_dying:
>>> +        domain_unmap_resources(d);
>>>          rc = domain_relinquish_resources(d);
>>>          if ( rc != 0 )
>>>              break;
>>>          if ( cpupool_move_domain(d, cpupool0) )
>>>              return -ERESTART;a
>>> -        for_each_vcpu ( d, v )
>>> -            unmap_vcpu_info(v);
>> So before this change there was a leak of some sort here? I see that
>> before this patch unmap_vcpu_info() was called only if rc == 0, but it
>> is now called unconditionally _before_ domain_relinquish_resources().
>>
>> This does appear to change the behaviour of the code in the error case.
>>
>> If that's intended:
>> Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
> 
> domain_kill() is a very long running hypercall, and takes about 15
> minutes of wallclock time for domains with 1T of RAM (before the idle
> scrubbing changes went in).
> 
> It will frequently break out with -ERESTART for continuation purposes,
> but doesn't fail outright.
> 
> However, with hindsight it would probably be better just to put the
> altp2m case in domain_relinquish_resources() and do away with this patch
> entirely.

While this patch could have my ack, I'd clearly prefer you going the
described alternative route.

Jan



* Re: [PATCH 2/4] x86/altp2m: Rework #VE enable/disable paths
  2019-02-19 23:10   ` Razvan Cojocaru
@ 2019-02-20 14:14     ` Jan Beulich
  0 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2019-02-20 14:14 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, Tamas K Lengyel, Wei Liu, Razvan Cojocaru,
	Xen-devel, Roger Pau Monne

>>> On 20.02.19 at 00:10, <rcojocaru@bitdefender.com> wrote:
> On 2/20/19 12:18 AM, Andrew Cooper wrote:
>> Split altp2m_vcpu_{enable,disable}_ve() out of the
>> HVMOP_altp2m_vcpu_{enable,disable}_notify marshalling logic.  A future 
> change
>> is going to need to call altp2m_vcpu_disable_ve() from the domain_kill() 
> path.
>> 
>> While at it, clean up the logic in altp2m_vcpu_{initialise,destroy}().
>> altp2m_vcpu_reset() has no external callers, so fold it into its two
>> callsites.  This in turn allows for altp2m_vcpu_destroy() to reuse
>> altp2m_vcpu_disable_ve() rather than opencoding it.
>> 
>> No practical change.
>> 
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-19 22:18 ` [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page Andrew Cooper
  2019-02-20  9:45   ` Razvan Cojocaru
@ 2019-02-20 14:37   ` Jan Beulich
  2019-02-21 17:03     ` Andrew Cooper
  2019-02-21 20:18   ` [PATCH v2 " Andrew Cooper
  2 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2019-02-20 14:37 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu,
	Razvan Cojocaru, Xen-devel, Jun Nakajima, Roger Pau Monne

>>> On 19.02.19 at 23:18, <andrew.cooper3@citrix.com> wrote:
> @@ -58,25 +57,67 @@ altp2m_vcpu_destroy(struct vcpu *v)
>  
>  int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn)
>  {
> +    struct domain *d = v->domain;
> +    struct altp2mvcpu *a = &vcpu_altp2m(v);
>      p2m_type_t p2mt;
> +    mfn_t mfn;
> +    struct page_info *pg;
> +    int rc;
> +
> +    /* Early exit path if #VE is already configured. */
> +    if ( a->veinfo_pg )
> +        return -EEXIST;
> +
> +    mfn = get_gfn(d, gfn_x(gfn), &p2mt);
> +
> +    /*
> +     * Looking for a plain piece of guest writeable RAM.  Take an extra page
> +     * reference to reflect our intent to point the VMCS at it.
> +     */
> +    if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_ram(p2mt) ||
> +         p2m_is_readonly(p2mt) || !get_page(pg = mfn_to_page(mfn), d) )

p2m_is_ram() is a pretty broad class, and p2m_is_readonly() usefully
removes only p2m_ram_ro and p2m_ram_shared from the set. In
particular p2m_ioreq_server is thus permitted, as are all p2m_paging_*
which don't produce INVALID_MFN. I don't think that's what you want.
I also don't think you really want to exclude p2m_ram_logdirty - if
anything you might need to convert that type to p2m_ram_rw in case
you find the page in that state.
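
A sketch of the narrower check being suggested, using real p2m type
names but glossing over the exact placement and locking (the logdirty
conversion step in particular is hypothetical):

    /* Accept only plain read/write RAM, folding logdirty back to rw
     * rather than rejecting it outright. */
    if ( p2mt == p2m_ram_logdirty )
        p2m_change_type_one(d, gfn_x(gfn), p2m_ram_logdirty, p2m_ram_rw);
    else if ( p2mt != p2m_ram_rw )
        return -EINVAL;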

Can't you simply call check_get_page_from_gfn() here anyway?

Furthermore I'm uncertain if get_page() is quite enough: Don't you
also want a writable type reference? It may not strictly be needed at
this point, but I think we're trying to make the distinction in new code,
as was e.g. done in recent Viridian changes.
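
For reference, the writable-type variant alluded to would pair the
existing refcounting helpers roughly as follows (illustrative usage,
not what this patch does):

    /* Take a general ref plus a writable type ref together ... */
    if ( !get_page_and_type(pg, d, PGT_writable_page) )
        return -EINVAL;

    /* ... and drop both on teardown. */
    put_page_and_type(pg);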

> +    {
> +        rc = -EINVAL;
> +        goto out;
> +    }
>  
> -    if ( !gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) ||
> -         mfn_eq(get_gfn_query_unlocked(v->domain, gfn_x(gfn), &p2mt),
> -                INVALID_MFN) )
> -        return -EINVAL;
> +    /*
> +     * Update veinfo_pg, making sure to be safe with concurrent hypercalls.
> +     * The first caller to make veinfo_pg become non-NULL will program its MFN
> +     * into the VMCS, so must not be clobbered.  Callers which lose the race
> +     * back off with -EEXIST.
> +     */
> +    if ( cmpxchg(&a->veinfo_pg, NULL, pg) != NULL )
> +    {
> +        put_page(pg);
> +        rc = -EEXIST;
> +        goto out;
> +    }
>  
> -    vcpu_altp2m(v).veinfo_gfn = gfn;
> +    rc = 0;
>      altp2m_vcpu_update_vmfunc_ve(v);
>  
> -    return 0;
> + out:
> +    put_gfn(d, gfn_x(gfn));

Is there any reason why you need to hold the GFN ref until here? It would
seem to me that it can be dropped the moment you've obtained the general
page ref.

Jan




* Re: [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified
  2019-02-19 22:18 ` [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified Andrew Cooper
  2019-02-19 23:20   ` Razvan Cojocaru
@ 2019-02-20 14:47   ` Jan Beulich
  2019-02-28  5:50   ` Tian, Kevin
  2 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2019-02-20 14:47 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu,
	Razvan Cojocaru, Xen-devel, Jun Nakajima, Roger Pau Monne

>>> On 19.02.19 at 23:18, <andrew.cooper3@citrix.com> wrote:
> @@ -4319,17 +4320,42 @@ bool vmx_vmenter_helper(const struct cpu_user_regs *regs)
>  
>      if ( paging_mode_hap(curr->domain) )
>      {
> -        struct ept_data *ept = &p2m_get_hostp2m(curr->domain)->ept;
> +        struct ept_data *ept = &p2m_get_hostp2m(currd)->ept;
>          unsigned int cpu = smp_processor_id();
> +        unsigned int inv = 0; /* None => Single => All */
> +        struct ept_data *single = NULL; /* Single eptp, iff inv == 1 */
>  
>          if ( cpumask_test_cpu(cpu, ept->invalidate) )
>          {
>              cpumask_clear_cpu(cpu, ept->invalidate);
> -            if ( nestedhvm_enabled(curr->domain) )
> -                __invept(INVEPT_ALL_CONTEXT, 0);
> -            else
> -                __invept(INVEPT_SINGLE_CONTEXT, ept->eptp);
> +
> +            /* Automatically invalidate all contexts if nested. */
> +            inv += 1 + nestedhvm_enabled(currd);
> +            single = ept;
> +        }
> +
> +        if ( altp2m_active(curr->domain) )

Please use currd here as well.

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan




* Re: [PATCH tentatively for-4.12 0/4] x86/altp2m: Fix multiple security issues
  2019-02-19 22:18 [PATCH tentatively for-4.12 0/4] x86/altp2m: Fix multiple security issues Andrew Cooper
                   ` (3 preceding siblings ...)
  2019-02-19 22:18 ` [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified Andrew Cooper
@ 2019-02-21 13:13 ` Juergen Gross
  4 siblings, 0 replies; 23+ messages in thread
From: Juergen Gross @ 2019-02-21 13:13 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Kevin Tian, Tamas K Lengyel, Wei Liu, Jan Beulich,
	Razvan Cojocaru, Jun Nakajima, Roger Pau Monné

On 19/02/2019 23:18, Andrew Cooper wrote:
> There are no XSAs because altp2m isn't security supported.  However, it would
> be very nice to have it in a less broken state for 4.12.
> 
> Overall the risk of regression to other parts of Xen is minimal, as most of
> these changes are only in altp2m-enabled paths.
> 
> Andrew Cooper (4):
>   xen/common: Break domain_unmap_resources() out of domain_kill()
>   x86/altp2m: Rework #VE enable/disable paths
>   x86/vmx: Fix security issue when a guest balloons out the #VE info page
>   x86/vmx: Properly flush the TLB when an altp2m is modified
> 
>  xen/arch/x86/domain.c          |  7 ++++
>  xen/arch/x86/hvm/hvm.c         | 19 ++--------
>  xen/arch/x86/hvm/vmx/vmx.c     | 69 ++++++++++++++++++++++++------------
>  xen/arch/x86/mm/altp2m.c       | 80 +++++++++++++++++++++++++++++++++++-------
>  xen/common/domain.c            | 16 +++++++--
>  xen/include/asm-x86/altp2m.h   |  4 ++-
>  xen/include/asm-x86/domain.h   |  3 ++
>  xen/include/asm-x86/hvm/vcpu.h |  7 +++-
>  xen/include/xen/domain.h       |  4 +++
>  9 files changed, 153 insertions(+), 56 deletions(-)
> 

For the series:

Release-acked-by: Juergen Gross <jgross@suse.com>


Juergen


* Re: [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-20 14:37   ` Jan Beulich
@ 2019-02-21 17:03     ` Andrew Cooper
  2019-02-22  8:49       ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Andrew Cooper @ 2019-02-21 17:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu,
	Razvan Cojocaru, Xen-devel, Jun Nakajima, Roger Pau Monne

On 20/02/2019 14:37, Jan Beulich wrote:
>>>> On 19.02.19 at 23:18, <andrew.cooper3@citrix.com> wrote:
>> @@ -58,25 +57,67 @@ altp2m_vcpu_destroy(struct vcpu *v)
>>  
>>  int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn)
>>  {
>> +    struct domain *d = v->domain;
>> +    struct altp2mvcpu *a = &vcpu_altp2m(v);
>>      p2m_type_t p2mt;
>> +    mfn_t mfn;
>> +    struct page_info *pg;
>> +    int rc;
>> +
>> +    /* Early exit path if #VE is already configured. */
>> +    if ( a->veinfo_pg )
>> +        return -EEXIST;
>> +
>> +    mfn = get_gfn(d, gfn_x(gfn), &p2mt);
>> +
>> +    /*
>> +     * Looking for a plain piece of guest writeable RAM.  Take an extra page
>> +     * reference to reflect our intent to point the VMCS at it.
>> +     */
>> +    if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_ram(p2mt) ||
>> +         p2m_is_readonly(p2mt) || !get_page(pg = mfn_to_page(mfn), d) )
> p2m_is_ram() is a pretty broad class, and p2m_is_readonly() usefully
> removes only p2m_ram_ro and p2m_ram_shared from the set. In
> particular p2m_ioreq_server is thus permitted, as are all p2m_paging_*
> which don't produce INVALID_MFN. I don't think that's what you want.
> I also don't think you really want to exclude p2m_ram_logdirty - if
> anything you might need to convert that type to p2m_ram_rw in case
> you find the page in that state.

Flipping logdirty back to ram cannot reasonably be a caller requirement,
but you are right that logdirty pages ought to be eligible.  Holding
extra references doesn't interact with logdirty operations.

> Can't you simply call check_get_page_from_gfn() here anyway?
>
> Furthermore I'm uncertain if get_page() is quite enough: Don't you
> also want a writable type reference? It may not strictly be needed at
> this point, but I think we're trying to make the distinction in new code,
> like was e.g. done in recent Viridian changes.

No - I don't want to take a writeable reference.

Not because it is necessarily the wrong thing to do longterm, but
because it is not the current behaviour, and is therefore substantially
more risky in terms of subtle unexpected side effects (especially for
4.12 at this point).

We have years and years of XSAs demonstrating that things tend to go
wrong.  It is irresponsible to start introducing a new reference
counting scheme without doing the work to first justify the safety of
the change, and to have a reasonable demonstration that the result is safe,
e.g. with an IOMMU in the mix.

~Andrew


* [PATCH v2 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-19 22:18 ` [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page Andrew Cooper
  2019-02-20  9:45   ` Razvan Cojocaru
  2019-02-20 14:37   ` Jan Beulich
@ 2019-02-21 20:18   ` Andrew Cooper
  2019-02-21 21:28     ` Razvan Cojocaru
                       ` (2 more replies)
  2 siblings, 3 replies; 23+ messages in thread
From: Andrew Cooper @ 2019-02-21 20:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Tamas K Lengyel, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, Andrew Cooper, Jan Beulich, Roger Pau Monné

The logic in altp2m_vcpu_{en,dis}able_ve() and vmx_vcpu_update_vmfunc_ve() is
dangerous.  After #VE has been set up, the guest can balloon out and free the
nominated GFN, after which the processor may write to it.  Also, the unlocked
GFN query means the MFN is stale by the time it is used.  Alternatively, a
guest can race two disable calls to cause one VMCS to still reference the
nominated GFN after the tracking information was dropped.

Rework the logic from scratch to make it safe.

Hold an extra page reference on the underlying frame, to account for the
VMCS's reference.  This means that if the GFN gets ballooned out, it isn't
freed back to Xen until #VE is disabled, and the VMCS no longer refers to the
page.

A consequence of this is that altp2m_vcpu_disable_ve() needs to be called
during the domain_kill() path, to drop the reference for domains which shut
down with #VE still enabled.

For domains using altp2m, we expect a single enable call and no disable for
the remaining lifetime of the domain.  However, to avoid problems with
concurrent calls, use cmpxchg() to locklessly maintain safety.

This doesn't have an XSA because altp2m is not yet a security-supported
feature.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Razvan Cojocaru <rcojocaru@bitdefender.com>
CC: Tamas K Lengyel <tamas@tklengyel.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>

v2:
 * Reuse domain_relinquish_resources() rather than introducing
   domain_unmap_resources().
 * Use check_get_page_from_gfn() rather than opencoding half of it.
 * Disallow registering the #VE info page over anything which isn't plain RAM
   in the guest.
---
 xen/arch/x86/domain.c          |  7 +++++
 xen/arch/x86/hvm/vmx/vmx.c     | 33 ++++++++++++-----------
 xen/arch/x86/mm/altp2m.c       | 59 ++++++++++++++++++++++++++++++++++++------
 xen/include/asm-x86/hvm/vcpu.h |  7 ++++-
 4 files changed, 80 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7a29435..b5febd6 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -38,6 +38,7 @@
 #include <xen/livepatch.h>
 #include <public/sysctl.h>
 #include <public/hvm/hvm_vcpu.h>
+#include <asm/altp2m.h>
 #include <asm/regs.h>
 #include <asm/mc146818rtc.h>
 #include <asm/system.h>
@@ -2071,6 +2072,12 @@ int domain_relinquish_resources(struct domain *d)
                 return ret;
         }
 
+        if ( altp2m_active(d) )
+        {
+            for_each_vcpu ( d, v )
+                altp2m_vcpu_disable_ve(v);
+        }
+
         if ( is_pv_domain(d) )
         {
             for_each_vcpu ( d, v )
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 24def93..395bccd 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2196,14 +2196,11 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
 
         if ( cpu_has_vmx_virt_exceptions )
         {
-            p2m_type_t t;
-            mfn_t mfn;
+            const struct page_info *pg = vcpu_altp2m(v).veinfo_pg;
 
-            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
-
-            if ( !mfn_eq(mfn, INVALID_MFN) )
+            if ( pg )
             {
-                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
+                __vmwrite(VIRT_EXCEPTION_INFO, page_to_maddr(pg));
                 /*
                  * Make sure we have an up-to-date EPTP_INDEX when
                  * setting SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS.
@@ -2237,21 +2234,19 @@ static int vmx_vcpu_emulate_vmfunc(const struct cpu_user_regs *regs)
 
 static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 {
-    bool_t rc = 0, writable;
-    gfn_t gfn = vcpu_altp2m(v).veinfo_gfn;
+    const struct page_info *pg = vcpu_altp2m(v).veinfo_pg;
     ve_info_t *veinfo;
+    bool rc = false;
 
-    if ( gfn_eq(gfn, INVALID_GFN) )
-        return 0;
+    if ( !pg )
+        return rc;
 
-    veinfo = hvm_map_guest_frame_rw(gfn_x(gfn), 0, &writable);
-    if ( !veinfo )
-        return 0;
-    if ( !writable || veinfo->semaphore != 0 )
-        goto out;
+    veinfo = __map_domain_page(pg);
 
-    rc = 1;
+    if ( veinfo->semaphore != 0 )
+        goto out;
 
+    rc = true;
     veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
     veinfo->semaphore = ~0;
     veinfo->eptp_index = vcpu_altp2m(v).p2midx;
@@ -2266,7 +2261,11 @@ static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
                             X86_EVENT_NO_EC);
 
  out:
-    hvm_unmap_guest_frame(veinfo, 0);
+    unmap_domain_page(veinfo);
+
+    if ( rc )
+        paging_mark_dirty(v->domain, page_to_mfn(pg));
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/altp2m.c b/xen/arch/x86/mm/altp2m.c
index 8bdefb0..50768f2 100644
--- a/xen/arch/x86/mm/altp2m.c
+++ b/xen/arch/x86/mm/altp2m.c
@@ -27,7 +27,6 @@ altp2m_vcpu_initialise(struct vcpu *v)
         vcpu_pause(v);
 
     vcpu_altp2m(v).p2midx = 0;
-    vcpu_altp2m(v).veinfo_gfn = INVALID_GFN;
     atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
 
     altp2m_vcpu_update_p2m(v);
@@ -58,25 +57,69 @@ altp2m_vcpu_destroy(struct vcpu *v)
 
 int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn)
 {
+    struct domain *d = v->domain;
+    struct altp2mvcpu *a = &vcpu_altp2m(v);
     p2m_type_t p2mt;
+    struct page_info *pg;
+    int rc;
+
+    /* Early exit path if #VE is already configured. */
+    if ( a->veinfo_pg )
+        return -EEXIST;
+
+    rc = check_get_page_from_gfn(d, gfn, false, &p2mt, &pg);
+    if ( rc )
+        return rc;
+
+    /*
+     * Looking for a plain piece of guest writeable RAM which isn't a magic
+     * frame such as a grant/ioreq/shared_info/etc mapping.  We (ab)use the
+     * pageable() predicate for this, due to it having the same properties
+     * that we want.
+     */
+    if ( !p2m_is_pageable(p2mt) || is_xen_heap_page(pg) )
+    {
+        rc = -EINVAL;
+        goto err;
+    }
 
-    if ( !gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) ||
-         mfn_eq(get_gfn_query_unlocked(v->domain, gfn_x(gfn), &p2mt),
-                INVALID_MFN) )
-        return -EINVAL;
+    /*
+     * Update veinfo_pg, making sure to be safe with concurrent hypercalls.
+     * The first caller to make veinfo_pg non-NULL will program its MFN into
+     * the VMCS, so it must not be clobbered.  Callers which lose the race
+     * back off with -EEXIST.
+     */
+    if ( cmpxchg(&a->veinfo_pg, NULL, pg) != NULL )
+    {
+        rc = -EEXIST;
+        goto err;
+    }
 
-    vcpu_altp2m(v).veinfo_gfn = gfn;
     altp2m_vcpu_update_vmfunc_ve(v);
 
     return 0;
+
+ err:
+    put_page(pg);
+
+    return rc;
 }
 
 void altp2m_vcpu_disable_ve(struct vcpu *v)
 {
-    if ( !gfn_eq(vcpu_altp2m(v).veinfo_gfn, INVALID_GFN) )
+    struct altp2mvcpu *a = &vcpu_altp2m(v);
+    struct page_info *pg;
+
+    /*
+     * Update veinfo_pg, making sure to be safe with concurrent hypercalls.
+     * The winner of this race is responsible for updating the VMCS to no
+     * longer point at the page, then dropping the associated ref.
+     */
+    if ( (pg = xchg(&a->veinfo_pg, NULL)) )
     {
-        vcpu_altp2m(v).veinfo_gfn = INVALID_GFN;
         altp2m_vcpu_update_vmfunc_ve(v);
+
+        put_page(pg);
     }
 }
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index c8a40f6..6c84d5a 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -137,8 +137,13 @@ struct nestedvcpu {
 #define vcpu_nestedhvm(v) ((v)->arch.hvm.nvcpu)
 
 struct altp2mvcpu {
+    /*
+     * #VE information page.  This pointer being non-NULL indicates that a
+     * VMCS's VIRT_EXCEPTION_INFO field is pointing to the page, and an extra
+     * page reference is held.
+     */
+    struct page_info *veinfo_pg;
     uint16_t    p2midx;         /* alternate p2m index */
-    gfn_t       veinfo_gfn;     /* #VE information page gfn */
 };
 
 #define vcpu_altp2m(v) ((v)->arch.hvm.avcpu)
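
(For readers less familiar with the lockless pattern the enable/disable hunks
above rely on, here is a minimal standalone sketch in plain C11.  The names
are hypothetical stand-ins, not Xen code: Xen uses its own cmpxchg()/xchg()
primitives rather than <stdatomic.h>.)

    #include <errno.h>
    #include <stdatomic.h>
    #include <stddef.h>

    struct page;                           /* stand-in for struct page_info */
    static _Atomic(struct page *) veinfo;  /* stand-in for a->veinfo_pg     */

    /* Enable: only the caller who moves NULL -> pg wins; a loser backs
     * off with -EEXIST and must drop whatever reference it took on pg.  */
    static int claim(struct page *pg)
    {
        struct page *expected = NULL;

        return atomic_compare_exchange_strong(&veinfo, &expected, pg)
               ? 0 : -EEXIST;
    }

    /* Disable: swap in NULL atomically, so exactly one of any number of
     * racing callers sees the old non-NULL value and becomes responsible
     * for clearing the VMCS field and dropping the page reference.      */
    static struct page *release(void)
    {
        return atomic_exchange(&veinfo, NULL);
    }

The key property is that each transition (NULL -> pg, pg -> NULL) can be won
by exactly one caller, so the VMCS update and the put_page() each happen
exactly once.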
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-21 20:18   ` [PATCH v2 " Andrew Cooper
@ 2019-02-21 21:28     ` Razvan Cojocaru
  2019-02-22 12:24     ` Jan Beulich
  2019-02-28  5:48     ` Tian, Kevin
  2 siblings, 0 replies; 23+ messages in thread
From: Razvan Cojocaru @ 2019-02-21 21:28 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Kevin Tian, Tamas K Lengyel, Wei Liu, Jan Beulich, Jun Nakajima,
	Roger Pau Monné

On 2/21/19 10:18 PM, Andrew Cooper wrote:
> The logic in altp2m_vcpu_{en,dis}able_ve() and vmx_vcpu_update_vmfunc_ve() is
> dangerous.  After #VE has been set up, the guest can balloon out and free the
> nominated GFN, after which the processor may write to it.  Also, the unlocked
> GFN query means the MFN is stale by the time it is used.  Alternatively, a
> guest can race two disable calls to cause one VMCS to still reference the
> nominated GFN after the tracking information was dropped.
> 
> Rework the logic from scratch to make it safe.
> 
> Hold an extra page reference on the underlying frame, to account for the
> VMCS's reference.  This means that if the GFN gets ballooned out, it isn't
> freed back to Xen until #VE is disabled, and the VMCS no longer refers to the
> page.
> 
> A consequence of this is that altp2m_vcpu_disable_ve() needs to be called
> during the domain_kill() path, to drop the reference for domains which shut
> down with #VE still enabled.
> 
> For domains using altp2m, we expect a single enable call and no disable for
> the remaining lifetime of the domain.  However, to avoid problems with
> concurrent calls, use cmpxchg() to locklessly maintain safety.
> 
> This doesn't have an XSA because altp2m is not yet a security-supported
> feature.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Juergen Gross <jgross@suse.com>

Tested-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-21 17:03     ` Andrew Cooper
@ 2019-02-22  8:49       ` Jan Beulich
  0 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2019-02-22  8:49 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, Kevin Tian, Tamas K Lengyel, Wei Liu,
	Razvan Cojocaru, Xen-devel, Jun Nakajima, Roger Pau Monne

>>> On 21.02.19 at 18:03, <andrew.cooper3@citrix.com> wrote:
> On 20/02/2019 14:37, Jan Beulich wrote:
>>>>> On 19.02.19 at 23:18, <andrew.cooper3@citrix.com> wrote:
>>> @@ -58,25 +57,67 @@ altp2m_vcpu_destroy(struct vcpu *v)
>>>  
>>>  int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn)
>>>  {
>>> +    struct domain *d = v->domain;
>>> +    struct altp2mvcpu *a = &vcpu_altp2m(v);
>>>      p2m_type_t p2mt;
>>> +    mfn_t mfn;
>>> +    struct page_info *pg;
>>> +    int rc;
>>> +
>>> +    /* Early exit path if #VE is already configured. */
>>> +    if ( a->veinfo_pg )
>>> +        return -EEXIST;
>>> +
>>> +    mfn = get_gfn(d, gfn_x(gfn), &p2mt);
>>> +
>>> +    /*
>>> +     * Looking for a plain piece of guest writeable RAM.  Take an extra page
>>> +     * reference to reflect our intent to point the VMCS at it.
>>> +     */
>>> +    if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_ram(p2mt) ||
>>> +         p2m_is_readonly(p2mt) || !get_page(pg = mfn_to_page(mfn), d) )
>> p2m_is_ram() is pretty broad a class, and p2m_is_readonly() removes
>> only usefully p2m_ram_ro and p2m_ram_shared from the set. In
>> particular p2m_ioreq_server is thus permitted, as are all p2m_paging_*
>> which don't produce INVALID_MFN. I don't think that's what you want.
>> I also don't think you really want to exclude p2m_ram_logdirty - if
>> anything you might need to convert that type to p2m_ram_rw in case
>> you find the page in that state.
> 
> Flipping logdirty back to ram cannot reasonably be a caller requirement,

Sure - hence my "if anything".

> but you are right that logdirty pages ought to be eligible.  Holding
> extra references doesn't interact with logdirty operations.
> 
>> Can't you simply call check_get_page_from_gfn() here anyway?
>>
>> Furthermore I'm uncertain if get_page() is quite enough: Don't you
>> also want a writable type reference? It may not strictly be needed at
>> this point, but I think we're trying to make the distinction in new code,
>> like was e.g. done in recent Viridian changes.
> 
> No - I don't want to take a writeable reference.
> 
> Not because it is necessarily the wrong thing to do longterm, but
> because it is not the current behaviour, and is therefore substantially
> more risky in terms of subtle unexpected side effects (especially for
> 4.12 at this point).
> 
> We have years and years of XSAs demonstrating that things tend to go
> wrong.  It is irresponsible to start introducing a new reference
counting scheme without doing the work to first justify the safety of the
change, and have a reasonable demonstration that the result is safe,
> e.g. with an IOMMU in the mix.

Well, afaict it has been a mix of both models basically from the
beginning. Whenever new code gets added to properly take
references, you have to decide anyway which of the two
possible variants to follow (as long as there is such an
inconsistency). I don't see it as particularly more or less risky to go
one route vs the other. I do agree though that we should settle
on one approach rather sooner than later, and then make all HVM
code behave that way.

Anyway - as long as we're convinced the writable ref is not strictly
needed, I'm not going to object to just acquiring a general ref
here. But as said, I'm not really certain there is no place where we
actually expect pages written to and owned by HVM guests to have
a writable ref.

Jan
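
The pageable() predicate which the v2 patch settles on resolves this
sub-thread: it is a mask test over per-type bits, so it admits logdirty RAM
(Jan's point) while rejecting read-only RAM and magic frames.  A simplified
sketch of the scheme below, with names modelled loosely on Xen's p2m headers
rather than the real definitions:

    /* Each p2m type owns one bit; predicates become single mask tests. */
    enum simple_p2m_type {
        ram_rw,        /* plain writeable RAM                           */
        ram_logdirty,  /* writeable RAM being tracked for migration     */
        ram_ro,        /* read-only RAM                                 */
        ioreq_server,  /* a "magic" frame claimed by an ioreq server    */
    };

    #define type_to_mask(t)  (1u << (t))
    #define PAGEABLE_TYPES   (type_to_mask(ram_rw) | \
                              type_to_mask(ram_logdirty))
    #define is_pageable(t)   (type_to_mask(t) & PAGEABLE_TYPES)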



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-21 20:18   ` [PATCH v2 " Andrew Cooper
  2019-02-21 21:28     ` Razvan Cojocaru
@ 2019-02-22 12:24     ` Jan Beulich
  2019-02-22 14:03       ` Andrew Cooper
  2019-02-28  5:48     ` Tian, Kevin
  2 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2019-02-22 12:24 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Tamas K Lengyel, Wei Liu, Razvan Cojocaru, Xen-devel,
	Jun Nakajima, Roger Pau Monne

>>> On 21.02.19 at 21:18, <andrew.cooper3@citrix.com> wrote:
> The logic in altp2m_vcpu_{en,dis}able_ve() and vmx_vcpu_update_vmfunc_ve() is
> dangerous.  After #VE has been set up, the guest can balloon out and free the
> nominated GFN, after which the processor may write to it.  Also, the unlocked
> GFN query means the MFN is stale by the time it is used.  Alternatively, a
> guest can race two disable calls to cause one VMCS to still reference the
> nominated GFN after the tracking information was dropped.
> 
> Rework the logic from scratch to make it safe.
> 
> Hold an extra page reference on the underlying frame, to account for the
> VMCS's reference.  This means that if the GFN gets ballooned out, it isn't
> freed back to Xen until #VE is disabled, and the VMCS no longer refers to the
> page.
> 
> A consequence of this is that altp2m_vcpu_disable_ve() needs to be called
> during the domain_kill() path, to drop the reference for domains which shut
> down with #VE still enabled.
> 
> For domains using altp2m, we expect a single enable call and no disable for
> the remaining lifetime of the domain.  However, to avoid problems with
> concurrent calls, use cmpxchg() to locklessly maintain safety.
> 
> This doesn't have an XSA because altp2m is not yet a security-supported
> feature.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

I would appreciate it though if you could add half a sentence clarifying
that and why you decided against obtaining a writable type-ref.

Also - I take it that altp2m and migration don't work together?
Otherwise wouldn't you need to mark the page dirty in more
cases?

Jan
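
For context, the two flavours of reference under discussion differ as in the
sketch below.  These are real Xen APIs but the usage is illustrative only;
the v2 patch deliberately takes just the general reference, via
check_get_page_from_gfn():

    /* A general reference: the frame cannot be freed back to Xen while
     * the reference is held, which is all the VMCS pointer needs.      */
    if ( !get_page(pg, d) )
        return -EINVAL;

    /* A writable type reference (not taken by the patch) would, in
     * addition, pin the page's type as writable RAM for the duration,
     * excluding other type uses of the frame while the VMCS owns it.   */
    if ( !get_page_and_type(pg, d, PGT_writable_page) )
        return -EINVAL;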



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-22 12:24     ` Jan Beulich
@ 2019-02-22 14:03       ` Andrew Cooper
  0 siblings, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2019-02-22 14:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Tamas K Lengyel, Wei Liu, Razvan Cojocaru, Xen-devel,
	Jun Nakajima, Roger Pau Monne

On 22/02/2019 12:24, Jan Beulich wrote:
>>>> On 21.02.19 at 21:18, <andrew.cooper3@citrix.com> wrote:
>> The logic in altp2m_vcpu_{en,dis}able_ve() and vmx_vcpu_update_vmfunc_ve() is
>> dangerous.  After #VE has been set up, the guest can balloon out and free the
>> nominated GFN, after which the processor may write to it.  Also, the unlocked
>> GFN query means the MFN is stale by the time it is used.  Alternatively, a
>> guest can race two disable calls to cause one VMCS to still reference the
>> nominated GFN after the tracking information was dropped.
>>
>> Rework the logic from scratch to make it safe.
>>
>> Hold an extra page reference on the underlying frame, to account for the
>> VMCS's reference.  This means that if the GFN gets ballooned out, it isn't
>> freed back to Xen until #VE is disabled, and the VMCS no longer refers to the
>> page.
>>
>> A consequence of this is that altp2m_vcpu_disable_ve() needs to be called
>> during the domain_kill() path, to drop the reference for domains which shut
>> down with #VE still enabled.
>>
>> For domains using altp2m, we expect a single enable call and no disable for
>> the remaining lifetime of the domain.  However, to avoid problems with
>> concurrent calls, use cmpxchg() to locklessly maintain safety.
>>
>> This doesn't have an XSA because altp2m is not yet a security-supported
>> feature.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Release-acked-by: Juergen Gross <jgross@suse.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> I would appreciate though if you could add half a sentence clarifying
> that and why you decided against obtaining a writable type-ref.
>
> Also - I take it that altp2m and migration don't work together?
> Otherwise wouldn't you need to mark the page dirty in more
> cases?

No one ever considers migration, and it definitely doesn't work as a
self-contained action.  For one, nothing preserves the already
established mem_access / alternatives information.

For general logdirty, you will recall that Razvan spent a long time
fixing that in Xen 4.12.

The #VE-info page itself can in principle be tracked with Page
Modification Logging, but PML isn't available because EPT A/D is
mutually exclusive with introspection in current processors.  Enabling
EPT A/D causes all guest pagetable accesses to be treated as writes even
when they aren't setting guest A/D bits, meaning that if you try and
write protect a pagetable, you livelock taking EPT_VIOLATIONs when
trying to perform a TLB Fill.

However, what happens in practice is that you can't safely migrate VMs
with an attached vmevent ring as part of the p2m (as the guest kernel
can interfere), and you might notice that there is ongoing effort to use
resource mapping instead.  IIRC Introspection detaches in the post-pause
phase of migrate, and (by design) is capable of re-initialising all
protections from first principles at the destination (so individual
introspection agents on each server don't need to communicate state).

So overall, migration does work at least in the altp2m_external case,
but only due to a fair amount of heavy lifting in the toolstack

~Andrew
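
The dirty-marking half of the question is visible in the patch's #VE
injection path.  The general pattern, sketched here with the same names the
patch uses, is that any hypervisor-side write into guest RAM must be
reported to the logdirty machinery:

    /* Map the frame, perform the hypervisor-side writes, unmap, then
     * mark dirty so a concurrent migration round re-sends the page.    */
    veinfo = __map_domain_page(pg);
    veinfo->semaphore = ~0;                 /* example write            */
    unmap_domain_page(veinfo);
    paging_mark_dirty(d, page_to_mfn(pg));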

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page
  2019-02-21 20:18   ` [PATCH v2 " Andrew Cooper
  2019-02-21 21:28     ` Razvan Cojocaru
  2019-02-22 12:24     ` Jan Beulich
@ 2019-02-28  5:48     ` Tian, Kevin
  2 siblings, 0 replies; 23+ messages in thread
From: Tian, Kevin @ 2019-02-28  5:48 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Tamas K Lengyel, Wei Liu, Jan Beulich, Razvan Cojocaru, Nakajima,
	Jun, Roger Pau Monné

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Friday, February 22, 2019 4:19 AM
> 
> The logic in altp2m_vcpu_{en,dis}able_ve() and vmx_vcpu_update_vmfunc_ve() is
> dangerous.  After #VE has been set up, the guest can balloon out and free the
> nominated GFN, after which the processor may write to it.  Also, the unlocked
> GFN query means the MFN is stale by the time it is used.  Alternatively, a
> guest can race two disable calls to cause one VMCS to still reference the
> nominated GFN after the tracking information was dropped.
> 
> Rework the logic from scratch to make it safe.
> 
> Hold an extra page reference on the underlying frame, to account for the
> VMCS's reference.  This means that if the GFN gets ballooned out, it isn't
> freed back to Xen until #VE is disabled, and the VMCS no longer refers to the
> page.
> 
> A consequence of this is that altp2m_vcpu_disable_ve() needs to be called
> during the domain_kill() path, to drop the reference for domains which shut
> down with #VE still enabled.
> 
> For domains using altp2m, we expect a single enable call and no disable for
> the remaining lifetime of the domain.  However, to avoid problems with
> concurrent calls, use cmpxchg() to locklessly maintain safety.
> 
> This doesn't have an XSA because altp2m is not yet a security-supported
> feature.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified
  2019-02-19 22:18 ` [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified Andrew Cooper
  2019-02-19 23:20   ` Razvan Cojocaru
  2019-02-20 14:47   ` Jan Beulich
@ 2019-02-28  5:50   ` Tian, Kevin
  2 siblings, 0 replies; 23+ messages in thread
From: Tian, Kevin @ 2019-02-28  5:50 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Juergen Gross, Tamas K Lengyel, Wei Liu, Jan Beulich,
	Razvan Cojocaru, Nakajima, Jun, Roger Pau Monné

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, February 20, 2019 6:19 AM
> 
> Modificaitons to an altp2m mark the p2m as needing flushing, but this was

Modifications

> never wired up in the return-to-guest path.  As a result, stale TLB entries
> can remain after resuming the guest.
> 
> In practice, this manifests as a missing EPT_VIOLATION or #VE exception when
> the guest subsequently accesses a page which has had its permissions reduced.
> 
> vmx_vmenter_helper() now has 11 p2ms to potentially invalidate, but issuing 11
> INVEPT instructions isn't clever.  Instead, count how many contexts need
> invalidating, and use INVEPT_ALL_CONTEXT if two or more are in need of
> flushing.
> 
> This doesn't have an XSA because altp2m is not yet a security-supported
> feature.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>
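
In outline, the counting scheme the commit message describes reduces to the
following on the VM-entry path.  This is a sketch: the loop bound and the two
helpers are assumptions made for illustration, while __invept() and the
INVEPT_* constants are the real VMX primitives:

    unsigned int i, inv = 0;   /* how many EPT contexts need flushing    */
    uint64_t single_eptp = 0;  /* remembered in case exactly one does    */

    for ( i = 0; i < 1 + MAX_ALTP2M; i++ )  /* host p2m + the altp2ms    */
    {
        if ( !p2m_needs_flush(d, i) )       /* assumed helper            */
            continue;

        inv++;
        single_eptp = eptp_for_index(d, i); /* assumed helper            */
    }

    if ( inv == 1 )
        __invept(INVEPT_SINGLE_CONTEXT, single_eptp);
    else if ( inv > 1 )
        __invept(INVEPT_ALL_CONTEXT, 0);
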
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-02-28  5:50 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-19 22:18 [PATCH tentitively for-4.12 0/4] x86/altp2m: Fix multiple security issues Andrew Cooper
2019-02-19 22:18 ` [PATCH 1/4] xen/common: Break domain_unmap_resources() out of domain_kill() Andrew Cooper
2019-02-19 22:39   ` Razvan Cojocaru
2019-02-19 22:46     ` Andrew Cooper
2019-02-20 14:11       ` Jan Beulich
2019-02-19 22:18 ` [PATCH 2/4] x86/altp2m: Rework #VE enable/disable paths Andrew Cooper
2019-02-19 23:10   ` Razvan Cojocaru
2019-02-20 14:14     ` Jan Beulich
2019-02-19 22:18 ` [PATCH 3/4] x86/vmx: Fix security issue when a guest balloons out the #VE info page Andrew Cooper
2019-02-20  9:45   ` Razvan Cojocaru
2019-02-20 14:37   ` Jan Beulich
2019-02-21 17:03     ` Andrew Cooper
2019-02-22  8:49       ` Jan Beulich
2019-02-21 20:18   ` [PATCH v2 " Andrew Cooper
2019-02-21 21:28     ` Razvan Cojocaru
2019-02-22 12:24     ` Jan Beulich
2019-02-22 14:03       ` Andrew Cooper
2019-02-28  5:48     ` Tian, Kevin
2019-02-19 22:18 ` [PATCH 4/4] x86/vmx: Properly flush the TLB when an altp2m is modified Andrew Cooper
2019-02-19 23:20   ` Razvan Cojocaru
2019-02-20 14:47   ` Jan Beulich
2019-02-28  5:50   ` Tian, Kevin
2019-02-21 13:13 ` [PATCH tentitively for-4.12 0/4] x86/altp2m: Fix multiple security issues Juergen Gross
