Xen Security Advisory 173 (CVE-2016-3960) - x86 shadow pagetables: address width overflow
From: Xen.org security team @ 2016-04-18 13:31 UTC
  To: xen-announce, xen-devel, xen-users, oss-security; +Cc: Xen.org security team

[-- Attachment #1: Type: text/plain, Size: 5123 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

            Xen Security Advisory CVE-2016-3960 / XSA-173
                              version 3

             x86 shadow pagetables: address width overflow

UPDATES IN VERSION 3
====================

Public release.

ISSUE DESCRIPTION
=================

In the x86 shadow pagetable code, the guest frame number of a
superpage mapping is stored in a 32-bit field.  If a shadowed guest
can cause a superpage mapping of a guest-physical address at or above
2^44 to be shadowed, the top bits of the address will be lost, causing
an assertion failure or NULL dereference later on, in code that
removes the shadow.
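
The truncation is easy to see in isolation.  The following minimal C
sketch (plain C, not Xen code; the variable names are invented for
illustration) reproduces the arithmetic:

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
      /* The shadow backpointer field is 32 bits wide (non-BIGMEM builds). */
      uint32_t backpointer;

      /* A guest-physical address of 2^44 is GFN 2^32 with 4k pages
       * (PAGE_SHIFT == 12). */
      uint64_t gfn = 1ULL << (44 - 12);

      backpointer = (uint32_t)gfn;   /* the top bits are silently lost */
      printf("gfn=%#llx stored as %#x\n",
             (unsigned long long)gfn, backpointer);
      /* Prints "gfn=0x100000000 stored as 0": a later hash-table lookup
       * of the shadow by its GFN misses, hence the assertion failure or
       * NULL dereference when the shadow is removed. */
      return 0;
  }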


IMPACT
======

An HVM guest using shadow pagetables can cause the host to crash.

A PV guest using shadow pagetables (i.e. being migrated) with PV
superpages enabled (which is not the default) can crash the host, or
corrupt hypervisor memory, and so a privilege escalation cannot be
ruled out.


VULNERABLE SYSTEMS
==================

Xen versions from 3.4 onwards are affected.

Only x86 variants of Xen are susceptible.  ARM variants are not
affected.

HVM guests using shadow mode paging can expose this vulnerability.  HVM
guests using Hardware Assisted Paging (HAP) are unaffected.

Systems running PV guests with PV superpages enabled are vulnerable if
those guests undergo live migration.  PV superpages are disabled by
default, so systems are not vulnerable in this way unless
"allowsuperpage" is on the Xen command line.

To discover whether your HVM guests are using HAP or shadow page
tables, request debug key `q' (from the Xen console, or with
`xl debug-keys q').  This will print debug information for every
domain to the console (also visible in `xl dmesg'), containing
something like this:

  (XEN) General information for domain 2:
  (XEN)     refcnt=1 dying=2 pause_count=2
  (XEN)     nr_pages=2 xenheap_pages=0 shared_pages=0 paged_pages=0 dirty_cpus={} max_pages=262400
  (XEN)     handle=ef58ef1a-784d-4e59-8079-42bdee87f219 vm_assist=00000000
  (XEN)     paging assistance: hap refcounts translate external
                               ^^^
The presence of `hap' here indicates that the host is not
vulnerable to this domain.  For an HVM domain the presence of `shadow'
indicates that the domain can exploit the vulnerability.
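
For example, a quick way to check all running domains at once from
dom0 (assuming the output format shown above) is:

  xl debug-keys q
  xl dmesg | grep 'paging assistance'

Any HVM domain whose line shows `shadow' rather than `hap' can
exploit the vulnerability.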


MITIGATION
==========

Running only PV guests will avoid this vulnerability, unless PV
superpage support is enabled (see above).

Running HVM guests with Hardware Assisted Paging (HAP) enabled will
also avoid this vulnerability.  This is the default mode on hardware
supporting HAP, but can be overridden by a hypervisor command line
option or a guest configuration setting.  Such overrides ("hap=0" in
either case, with variants like "no-hap" possible on the hypervisor
command line) would need to be removed to avoid this vulnerability.
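
As a hypothetical example of removing such an override, a guest
configuration containing:

  hap = 0    # forces shadow paging; delete this line, or set hap = 1

would be edited so that the guest again defaults to HAP, and any
"hap=0" or "no-hap" token would likewise be dropped from the Xen boot
command line.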


CREDITS
=======

This issue was discovered by Ling Liu and Yihan Lian of the Cloud
Security Team, Qihoo 360.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

xsa173-unstable.patch  xen-unstable
xsa173-4.6.patch       Xen 4.6.x
xsa173-4.5.patch       Xen 4.5.x
xsa173-4.4.patch       Xen 4.4.x
xsa173-4.3.patch       Xen 4.3.x

$ sha256sum xsa173*
bd4619334351afc9f71bb529e8ac102c63415bb4d13197e3bd24a58de03726cb  xsa173-unstable.patch
089c07f0c8237da674796f155ee7e3c0305efd11a59df30ef2c3d5f6b423bfea  xsa173-4.3.patch
35e02b8d4c2841ad951dd967b4f11aa7911fe5d52be2cb605b174e8c2e9214ca  xsa173-4.4.patch
8cd255416975b5589b85911142b385cc1ed78b8ea5e16ebe9d6c60e2679b23aa  xsa173-4.5.patch
6dbc34e3e2d4415967c4406e0f8392a9395bff74da115ae20f26bd112b19017c  xsa173-4.6.patch
$

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations should contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBAgAGBQJXFOGhAAoJEIP+FMlX6CvZXUYH/A1ekMpU71/JUK1c53qHmaTY
ZCsJj5hArL9poTYss/AfyumZRATalPrbX/Wt6JaVMutMefgPjphP8OKTzywr/aDJ
vRIim4piOABS15cWtYlfTans6X4yyk1NxmMt182osRW1JSW+OrjXORs6719zoEL7
3hzuf7g6pYiaVqtUmLEx9/U3T246ZaQ98V93YVxGGUyUYRBmFJxEAtA2yf4SlqNX
G3XNDc4DZpXnp8yABFEu+atfWef/Mn/gbNdJPxUXpu25WAOGEf0/0mnEF1b+KmCZ
nBXM3UwMIwN+OR0xMC447iQxvKQe7WD/6/JNMI6Bl76SSctCaLV0LU6PtyVCfJI=
=FOR7
-----END PGP SIGNATURE-----

[-- Attachment #2: xsa173-unstable.patch --]
[-- Type: application/octet-stream, Size: 9021 bytes --]

commit 52a15bdc4238d7ace836c097902222d3e12a19fc
Author: Tim Deegan <tim@xen.org>
Date:   Mon Mar 14 11:05:48 2016 +0000

    x86: limit GFNs to 32 bits for shadowed superpages.
    
    Superpage shadows store the shadowed GFN in the backpointer field,
    which for non-BIGMEM builds is 32 bits wide.  Shadowing a superpage
    mapping of a guest-physical address above 2^44 would lead to the GFN
    being truncated there, and a crash when we come to remove the shadow
    from the hash table.
    
    Track the valid width of a GFN for each guest, including reporting it
    through CPUID, and enforce it in the shadow pagetables.  Set the
    maximum width to 32 for guests where this truncation could occur.
    
    This is XSA-173.
    
    Signed-off-by: Tim Deegan <tim@xen.org>
    Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reported-by: Ling Liu <liuling-it@360.cn>
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 9d364d4..cccc3cc 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -42,6 +42,7 @@ integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
 const struct cpu_dev *__read_mostly cpu_devs[X86_VENDOR_NUM] = {};
 
 unsigned int paddr_bits __read_mostly = 36;
+unsigned int hap_paddr_bits __read_mostly = 36;
 
 /*
  * Default host IA32_CR_PAT value to cover all memory types.
@@ -212,8 +213,11 @@ static void __init early_cpu_detect(void)
 	c->x86_capability[cpufeat_word(X86_FEATURE_FPU)] = edx;
 	c->x86_capability[cpufeat_word(X86_FEATURE_SSE3)] = ecx;
 
-	if ( cpuid_eax(0x80000000) >= 0x80000008 )
-		paddr_bits = cpuid_eax(0x80000008) & 0xff;
+	if ( cpuid_eax(0x80000000) >= 0x80000008 ) {
+		eax = cpuid_eax(0x80000008);
+		paddr_bits = eax & 0xff;
+		hap_paddr_bits = ((eax >> 16) & 0xff) ?: paddr_bits;
+	}
 }
 
 static void generic_identify(struct cpuinfo_x86 *c)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5bc2812..e309e9a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4776,8 +4776,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case 0x80000008:
-        count = cpuid_eax(0x80000008);
-        count = (count >> 16) & 0xff ?: count & 0xff;
+        count = d->arch.paging.gfn_bits + PAGE_SHIFT;
         if ( (*eax & 0xff) > count )
             *eax = (*eax & ~0xff) | count;
 
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 21cabba..625ac4d 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -342,20 +342,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
             flags &= ~_PAGE_PAT;
 
         if ( gfn_x(start) & GUEST_L2_GFN_MASK & ~0x1 )
-        {
-#if GUEST_PAGING_LEVELS == 2
-            /*
-             * Note that _PAGE_INVALID_BITS is zero in this case, yielding a
-             * no-op here.
-             *
-             * Architecturally, the walk should fail if bit 21 is set (others
-             * aren't being checked at least in PSE36 mode), but we'll ignore
-             * this here in order to avoid specifying a non-natural, non-zero
-             * _PAGE_INVALID_BITS value just for that case.
-             */
-#endif
             rc |= _PAGE_INVALID_BITS;
-        }
+
         /* Increment the pfn by the right number of 4k pages.  
          * Mask out PAT and invalid bits. */
         start = _gfn((gfn_x(start) & ~GUEST_L2_GFN_MASK) +
@@ -441,5 +429,11 @@ set_ad:
         put_page(mfn_to_page(mfn_x(gw->l1mfn)));
     }
 
+    /* If this guest has a restricted physical address space then the
+     * target GFN must fit within it. */
+    if ( !(rc & _PAGE_PRESENT)
+         && gfn_x(guest_l1e_get_gfn(gw->l1e)) >> d->arch.paging.gfn_bits )
+        rc |= _PAGE_INVALID_BITS;
+
     return rc;
 }
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index f173539..4ab99bb 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -448,6 +448,8 @@ void hap_domain_init(struct domain *d)
 {
     INIT_PAGE_LIST_HEAD(&d->arch.paging.hap.freelist);
 
+    d->arch.paging.gfn_bits = hap_paddr_bits - PAGE_SHIFT;
+
     /* Use HAP logdirty mechanism. */
     paging_log_dirty_init(d, hap_enable_log_dirty,
                           hap_disable_log_dirty,
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index b3fce1b..9e82256 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2081,6 +2081,12 @@ void *map_domain_gfn(struct p2m_domain *p2m, gfn_t gfn, mfn_t *mfn,
 {
     struct page_info *page;
 
+    if ( gfn_x(gfn) >> p2m->domain->arch.paging.gfn_bits )
+    {
+        *rc = _PAGE_INVALID_BIT;
+        return NULL;
+    }
+
     /* Translate the gfn, unsharing if shared. */
     page = get_page_from_gfn_p2m(p2m->domain, p2m, gfn_x(gfn), p2mt, NULL, q);
     if ( p2m_is_paging(*p2mt) )
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index cdad572..da3023c 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -51,6 +51,16 @@ int shadow_domain_init(struct domain *d, unsigned int domcr_flags)
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.freelist);
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.pinned_shadows);
 
+    d->arch.paging.gfn_bits = paddr_bits - PAGE_SHIFT;
+#ifndef CONFIG_BIGMEM
+    /*
+     * Shadowed superpages store GFNs in 32-bit page_info fields.
+     * Note that we cannot use guest_supports_superpages() here.
+     */
+    if ( !is_pv_domain(d) || opt_allow_superpage )
+        d->arch.paging.gfn_bits = 32;
+#endif
+
     /* Use shadow pagetables for log-dirty support */
     paging_log_dirty_init(d, sh_enable_log_dirty,
                           sh_disable_log_dirty, sh_clean_dirty_bitmap);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 70f2aa5..5f2622d 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -540,7 +540,8 @@ _sh_propagate(struct vcpu *v,
     ASSERT(GUEST_PAGING_LEVELS > 3 || level != 3);
 
     /* Check there's something for the shadows to map to */
-    if ( !p2m_is_valid(p2mt) && !p2m_is_grant(p2mt) )
+    if ( (!p2m_is_valid(p2mt) && !p2m_is_grant(p2mt))
+         || gfn_x(target_gfn) >> d->arch.paging.gfn_bits )
     {
         *sp = shadow_l1e_empty();
         goto done;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index de60def..37ded91 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -194,6 +194,9 @@ struct paging_domain {
     /* log dirty support */
     struct log_dirty_domain log_dirty;
 
+    /* Number of valid bits in a gfn. */
+    unsigned int gfn_bits;
+
     /* preemption handling */
     struct {
         const struct domain *dom;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index eb29e62..1f6c2ae 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -222,15 +222,17 @@ guest_supports_nx(struct vcpu *v)
 }
 
 
-/* Some bits are invalid in any pagetable entry. */
-#if GUEST_PAGING_LEVELS == 2
-#define _PAGE_INVALID_BITS (0)
-#elif GUEST_PAGING_LEVELS == 3
-#define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<63) - 1) & ~((1ull<<paddr_bits) - 1))
-#else /* GUEST_PAGING_LEVELS == 4 */
+/*
+ * Some bits are invalid in any pagetable entry.
+ * Normal flags values get represented in 24-bit values (see
+ * get_pte_flags() and put_pte_flags()), so set bit 24 in
+ * addition to be able to flag out of range frame numbers.
+ */
+#if GUEST_PAGING_LEVELS == 3
 #define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<52) - 1) & ~((1ull<<paddr_bits) - 1))
+    (_PAGE_INVALID_BIT | get_pte_flags(((1ull << 63) - 1) & ~(PAGE_SIZE - 1)))
+#else /* 2-level and 4-level */
+#define _PAGE_INVALID_BITS _PAGE_INVALID_BIT
 #endif
 
 
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index ba907ee..966e9c7 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -218,6 +218,8 @@ extern u64 trampoline_misc_enable_off;
 
 /* Maximum width of physical addresses supported by the hardware */
 extern unsigned int paddr_bits;
+/* Max physical address width supported within HAP guests */
+extern unsigned int hap_paddr_bits;
 
 extern const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id table[]);
 
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 86abb94..589f225 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -153,6 +153,12 @@ typedef l4_pgentry_t root_pgentry_t;
 #define _PAGE_GNTTAB (1U<<22)
 
 /*
+ * Bit 24 of a 24-bit flag mask!  This is not any bit of a real pte,
+ * and is only used for signalling in variables that contain flags.
+ */
+#define _PAGE_INVALID_BIT (1U<<24)
+
+/*
  * Bit 12 of a 24-bit flag mask. This corresponds to bit 52 of a pte.
  * This is needed to distinguish between user and kernel PTEs since _PAGE_USER
  * is asserted for both.

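The fix, common to all branches above, introduces a per-domain limit
on the number of valid GFN bits and enforces it wherever a guest GFN
enters the paging code.  As a simplified standalone C sketch (Xen
types and helpers reduced to plain integers for illustration; the
real patch applies the 32-bit cap only on non-BIGMEM builds, and only
to HVM domains or when PV superpages are enabled):

  #include <stdbool.h>
  #include <stdint.h>

  #define PAGE_SHIFT 12

  /* Stand-in for d->arch.paging.gfn_bits. */
  struct paging_domain {
      unsigned int gfn_bits;   /* number of valid bits in a GFN */
  };

  /* Shadow-mode setup: 32-bit backpointers cap gfn_bits at 32,
   * i.e. guest-physical addresses below 2^44. */
  static void shadow_domain_init(struct paging_domain *d,
                                 unsigned int paddr_bits)
  {
      d->gfn_bits = paddr_bits - PAGE_SHIFT;
      if (d->gfn_bits > 32)
          d->gfn_bits = 32;
  }

  /* Enforcement, mirroring the checks added to map_domain_gfn(),
   * guest_walk_tables() and _sh_propagate(): any GFN with bits set
   * at or above gfn_bits is rejected as invalid. */
  static bool gfn_in_range(const struct paging_domain *d, uint64_t gfn)
  {
      return (gfn >> d->gfn_bits) == 0;
  }

The same limit is reported to the guest via CPUID leaf 0x80000008:
the advertised physical address width is clamped to
gfn_bits + PAGE_SHIFT, e.g. 32 + 12 = 44 bits for a shadowed HVM
guest, so a well-behaved guest never creates mappings beyond the
enforced range.
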
[-- Attachment #3: xsa173-4.3.patch --]
[-- Type: application/octet-stream, Size: 9227 bytes --]

commit 95dd1b6e87b61222fc856724a5d828c9bdc30c80
Author: Tim Deegan <tim@xen.org>
Date:   Wed Mar 16 17:07:18 2016 +0000

    x86: limit GFNs to 32 bits for shadowed superpages.
    
    Superpage shadows store the shadowed GFN in the backpointer field,
    which for non-BIGMEM builds is 32 bits wide.  Shadowing a superpage
    mapping of a guest-physical address above 2^44 would lead to the GFN
    being truncated there, and a crash when we come to remove the shadow
    from the hash table.
    
    Track the valid width of a GFN for each guest, including reporting it
    through CPUID, and enforce it in the shadow pagetables.  Set the
    maximum width to 32 for guests where this truncation could occur.
    
    This is XSA-173.
    
    Signed-off-by: Tim Deegan <tim@xen.org>
    Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reported-by: Ling Liu <liuling-it@360.cn>
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index f449a8f..533558c 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -34,6 +34,7 @@ integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
 struct cpu_dev * cpu_devs[X86_VENDOR_NUM] = {};
 
 unsigned int paddr_bits __read_mostly = 36;
+unsigned int hap_paddr_bits __read_mostly = 36;
 
 /*
  * Default host IA32_CR_PAT value to cover all memory types.
@@ -192,7 +193,7 @@ static void __init early_cpu_detect(void)
 
 static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 {
-	u32 tfms, xlvl, capability, excap, ebx;
+	u32 tfms, xlvl, capability, excap, eax, ebx;
 
 	/* Get vendor name */
 	cpuid(0x00000000, &c->cpuid_level,
@@ -227,8 +228,11 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 		}
 		if ( xlvl >= 0x80000004 )
 			get_model_name(c); /* Default name */
-		if ( xlvl >= 0x80000008 )
-			paddr_bits = cpuid_eax(0x80000008) & 0xff;
+		if ( xlvl >= 0x80000008 ) {
+			eax = cpuid_eax(0x80000008);
+			paddr_bits = eax & 0xff;
+			hap_paddr_bits = ((eax >> 16) & 0xff) ?: paddr_bits;
+		}
 	}
 
 	/* Might lift BIOS max_leaf=3 limit. */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 9dafcca..aee4aa1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2888,8 +2888,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case 0x80000008:
-        count = cpuid_eax(0x80000008);
-        count = (count >> 16) & 0xff ?: count & 0xff;
+        count = d->arch.paging.gfn_bits + PAGE_SHIFT;
         if ( (*eax & 0xff) > count )
             *eax = (*eax & ~0xff) | count;
 
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 70460b6..09511f0 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -94,6 +94,12 @@ void *map_domain_gfn(struct p2m_domain *p2m, gfn_t gfn, mfn_t *mfn,
     struct page_info *page;
     void *map;
 
+    if ( gfn_x(gfn) >> p2m->domain->arch.paging.gfn_bits )
+    {
+        *rc = _PAGE_INVALID_BIT;
+        return NULL;
+    }
+
     /* Translate the gfn, unsharing if shared */
     page = get_page_from_gfn_p2m(p2m->domain, p2m, gfn_x(gfn), p2mt, NULL,
                                  q);
@@ -294,20 +300,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
             flags &= ~_PAGE_PAT;
 
         if ( gfn_x(start) & GUEST_L2_GFN_MASK & ~0x1 )
-        {
-#if GUEST_PAGING_LEVELS == 2
-            /*
-             * Note that _PAGE_INVALID_BITS is zero in this case, yielding a
-             * no-op here.
-             *
-             * Architecturally, the walk should fail if bit 21 is set (others
-             * aren't being checked at least in PSE36 mode), but we'll ignore
-             * this here in order to avoid specifying a non-natural, non-zero
-             * _PAGE_INVALID_BITS value just for that case.
-             */
-#endif
             rc |= _PAGE_INVALID_BITS;
-        }
+
         /* Increment the pfn by the right number of 4k pages.  
          * Mask out PAT and invalid bits. */
         start = _gfn((gfn_x(start) & ~GUEST_L2_GFN_MASK) +
@@ -390,5 +384,11 @@ set_ad:
         put_page(mfn_to_page(mfn_x(gw->l1mfn)));
     }
 
+    /* If this guest has a restricted physical address space then the
+     * target GFN must fit within it. */
+    if ( !(rc & _PAGE_PRESENT)
+         && gfn_x(guest_l1e_get_gfn(gw->l1e)) >> d->arch.paging.gfn_bits )
+        rc |= _PAGE_INVALID_BITS;
+
     return rc;
 }
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 66239b7..10c2951 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -421,6 +421,7 @@ static void hap_destroy_monitor_table(struct vcpu* v, mfn_t mmfn)
 void hap_domain_init(struct domain *d)
 {
     INIT_PAGE_LIST_HEAD(&d->arch.paging.hap.freelist);
+    d->arch.paging.gfn_bits = hap_paddr_bits - PAGE_SHIFT;
 }
 
 /* return 0 for success, -errno for failure */
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 49d9e06..edbb544 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -48,6 +48,16 @@ void shadow_domain_init(struct domain *d, unsigned int domcr_flags)
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.freelist);
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.pinned_shadows);
 
+    d->arch.paging.gfn_bits = paddr_bits - PAGE_SHIFT;
+#ifndef CONFIG_BIGMEM
+    /*
+     * Shadowed superpages store GFNs in 32-bit page_info fields.
+     * Note that we cannot use guest_supports_superpages() here.
+     */
+    if ( is_hvm_domain(d) || opt_allow_superpage )
+        d->arch.paging.gfn_bits = 32;
+#endif
+
     /* Use shadow pagetables for log-dirty support */
     paging_log_dirty_init(d, shadow_enable_log_dirty, 
                           shadow_disable_log_dirty, shadow_clean_dirty_bitmap);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 96ba5f2..fa5aad4 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -527,7 +527,8 @@ _sh_propagate(struct vcpu *v,
     ASSERT(GUEST_PAGING_LEVELS > 3 || level != 3);
 
     /* Check there's something for the shadows to map to */
-    if ( !p2m_is_valid(p2mt) && !p2m_is_grant(p2mt) )
+    if ( (!p2m_is_valid(p2mt) && !p2m_is_grant(p2mt))
+         || gfn_x(target_gfn) >> d->arch.paging.gfn_bits )
     {
         *sp = shadow_l1e_empty();
         goto done;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index b477646..24ddabd 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -187,6 +187,9 @@ struct paging_domain {
     /* log dirty support */
     struct log_dirty_domain log_dirty;
 
+    /* Number of valid bits in a gfn. */
+    unsigned int gfn_bits;
+
     /* preemption handling */
     struct {
         const struct domain *dom;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index b62bc6a..e06826f 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -220,15 +220,17 @@ guest_supports_nx(struct vcpu *v)
 }
 
 
-/* Some bits are invalid in any pagetable entry. */
-#if GUEST_PAGING_LEVELS == 2
-#define _PAGE_INVALID_BITS (0)
-#elif GUEST_PAGING_LEVELS == 3
-#define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<63) - 1) & ~((1ull<<paddr_bits) - 1))
-#else /* GUEST_PAGING_LEVELS == 4 */
+/*
+ * Some bits are invalid in any pagetable entry.
+ * Normal flags values get represented in 24-bit values (see
+ * get_pte_flags() and put_pte_flags()), so set bit 24 in
+ * addition to be able to flag out of range frame numbers.
+ */
+#if GUEST_PAGING_LEVELS == 3
 #define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<52) - 1) & ~((1ull<<paddr_bits) - 1))
+    (_PAGE_INVALID_BIT | get_pte_flags(((1ull << 63) - 1) & ~(PAGE_SIZE - 1)))
+#else /* 2-level and 4-level */
+#define _PAGE_INVALID_BITS _PAGE_INVALID_BIT
 #endif
 
 
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 4fa4a61..3bcc5b8 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -193,6 +193,8 @@ extern bool_t opt_cpu_info;
 
 /* Maximum width of physical addresses supported by the hardware */
 extern unsigned int paddr_bits;
+/* Max physical address width supported within HAP guests */
+extern unsigned int hap_paddr_bits;
 
 extern void identify_cpu(struct cpuinfo_x86 *);
 extern void setup_clear_cpu_cap(unsigned int);
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index c193c88..a48c650 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -166,6 +166,7 @@ typedef l4_pgentry_t root_pgentry_t;
 
 #define USER_MAPPINGS_ARE_GLOBAL
 #ifdef USER_MAPPINGS_ARE_GLOBAL
+
 /*
  * Bit 12 of a 24-bit flag mask. This corresponds to bit 52 of a pte.
  * This is needed to distinguish between user and kernel PTEs since _PAGE_USER
@@ -176,6 +177,12 @@ typedef l4_pgentry_t root_pgentry_t;
 #define _PAGE_GUEST_KERNEL 0
 #endif
 
+/*
+ * Bit 24 of a 24-bit flag mask!  This is not any bit of a real pte,
+ * and is only used for signalling in variables that contain flags.
+ */
+#define _PAGE_INVALID_BIT (1U<<24)
+
 #endif /* __X86_64_PAGE_H__ */
 
 /*

[-- Attachment #4: xsa173-4.4.patch --]
[-- Type: application/octet-stream, Size: 9227 bytes --]

commit 5893f9ea852f428e7120a4f3184a20deeb145d91
Author: Tim Deegan <tim@xen.org>
Date:   Wed Mar 16 17:05:25 2016 +0000

    x86: limit GFNs to 32 bits for shadowed superpages.
    
    Superpage shadows store the shadowed GFN in the backpointer field,
    which for non-BIGMEM builds is 32 bits wide.  Shadowing a superpage
    mapping of a guest-physical address above 2^44 would lead to the GFN
    being truncated there, and a crash when we come to remove the shadow
    from the hash table.
    
    Track the valid width of a GFN for each guest, including reporting it
    through CPUID, and enforce it in the shadow pagetables.  Set the
    maximum width to 32 for guests where this truncation could occur.
    
    This is XSA-173.
    
    Signed-off-by: Tim Deegan <tim@xen.org>
    Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reported-by: Ling Liu <liuling-it@360.cn>
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 4221826..f436f91 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -37,6 +37,7 @@ integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
 struct cpu_dev * cpu_devs[X86_VENDOR_NUM] = {};
 
 unsigned int paddr_bits __read_mostly = 36;
+unsigned int hap_paddr_bits __read_mostly = 36;
 
 /*
  * Default host IA32_CR_PAT value to cover all memory types.
@@ -195,7 +196,7 @@ static void __init early_cpu_detect(void)
 
 static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 {
-	u32 tfms, xlvl, capability, excap, ebx;
+	u32 tfms, xlvl, capability, excap, eax, ebx;
 
 	/* Get vendor name */
 	cpuid(0x00000000, &c->cpuid_level,
@@ -230,8 +231,11 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 		}
 		if ( xlvl >= 0x80000004 )
 			get_model_name(c); /* Default name */
-		if ( xlvl >= 0x80000008 )
-			paddr_bits = cpuid_eax(0x80000008) & 0xff;
+		if ( xlvl >= 0x80000008 ) {
+			eax = cpuid_eax(0x80000008);
+			paddr_bits = eax & 0xff;
+			hap_paddr_bits = ((eax >> 16) & 0xff) ?: paddr_bits;
+		}
 	}
 
 	/* Might lift BIOS max_leaf=3 limit. */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f3f6c61..a4bfb90 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2966,8 +2966,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case 0x80000008:
-        count = cpuid_eax(0x80000008);
-        count = (count >> 16) & 0xff ?: count & 0xff;
+        count = d->arch.paging.gfn_bits + PAGE_SHIFT;
         if ( (*eax & 0xff) > count )
             *eax = (*eax & ~0xff) | count;
 
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 70460b6..09511f0 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -94,6 +94,12 @@ void *map_domain_gfn(struct p2m_domain *p2m, gfn_t gfn, mfn_t *mfn,
     struct page_info *page;
     void *map;
 
+    if ( gfn_x(gfn) >> p2m->domain->arch.paging.gfn_bits )
+    {
+        *rc = _PAGE_INVALID_BIT;
+        return NULL;
+    }
+
     /* Translate the gfn, unsharing if shared */
     page = get_page_from_gfn_p2m(p2m->domain, p2m, gfn_x(gfn), p2mt, NULL,
                                  q);
@@ -294,20 +300,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
             flags &= ~_PAGE_PAT;
 
         if ( gfn_x(start) & GUEST_L2_GFN_MASK & ~0x1 )
-        {
-#if GUEST_PAGING_LEVELS == 2
-            /*
-             * Note that _PAGE_INVALID_BITS is zero in this case, yielding a
-             * no-op here.
-             *
-             * Architecturally, the walk should fail if bit 21 is set (others
-             * aren't being checked at least in PSE36 mode), but we'll ignore
-             * this here in order to avoid specifying a non-natural, non-zero
-             * _PAGE_INVALID_BITS value just for that case.
-             */
-#endif
             rc |= _PAGE_INVALID_BITS;
-        }
+
         /* Increment the pfn by the right number of 4k pages.  
          * Mask out PAT and invalid bits. */
         start = _gfn((gfn_x(start) & ~GUEST_L2_GFN_MASK) +
@@ -390,5 +384,11 @@ set_ad:
         put_page(mfn_to_page(mfn_x(gw->l1mfn)));
     }
 
+    /* If this guest has a restricted physical address space then the
+     * target GFN must fit within it. */
+    if ( !(rc & _PAGE_PRESENT)
+         && gfn_x(guest_l1e_get_gfn(gw->l1e)) >> d->arch.paging.gfn_bits )
+        rc |= _PAGE_INVALID_BITS;
+
     return rc;
 }
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index c06369b..ccc4174 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -428,6 +428,7 @@ static void hap_destroy_monitor_table(struct vcpu* v, mfn_t mmfn)
 void hap_domain_init(struct domain *d)
 {
     INIT_PAGE_LIST_HEAD(&d->arch.paging.hap.freelist);
+    d->arch.paging.gfn_bits = hap_paddr_bits - PAGE_SHIFT;
 }
 
 /* return 0 for success, -errno for failure */
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 90ba4d6..06a04ad 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -48,6 +48,16 @@ void shadow_domain_init(struct domain *d, unsigned int domcr_flags)
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.freelist);
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.pinned_shadows);
 
+    d->arch.paging.gfn_bits = paddr_bits - PAGE_SHIFT;
+#ifndef CONFIG_BIGMEM
+    /*
+     * Shadowed superpages store GFNs in 32-bit page_info fields.
+     * Note that we cannot use guest_supports_superpages() here.
+     */
+    if ( !is_pv_domain(d) || opt_allow_superpage )
+        d->arch.paging.gfn_bits = 32;
+#endif
+
     /* Use shadow pagetables for log-dirty support */
     paging_log_dirty_init(d, shadow_enable_log_dirty, 
                           shadow_disable_log_dirty, shadow_clean_dirty_bitmap);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 0f499bf..6c88d4e 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -527,7 +527,8 @@ _sh_propagate(struct vcpu *v,
     ASSERT(GUEST_PAGING_LEVELS > 3 || level != 3);
 
     /* Check there's something for the shadows to map to */
-    if ( !p2m_is_valid(p2mt) && !p2m_is_grant(p2mt) )
+    if ( (!p2m_is_valid(p2mt) && !p2m_is_grant(p2mt))
+         || gfn_x(target_gfn) >> d->arch.paging.gfn_bits )
     {
         *sp = shadow_l1e_empty();
         goto done;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 7dfbbcb..a03fc2e 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -187,6 +187,9 @@ struct paging_domain {
     /* log dirty support */
     struct log_dirty_domain log_dirty;
 
+    /* Number of valid bits in a gfn. */
+    unsigned int gfn_bits;
+
     /* preemption handling */
     struct {
         const struct domain *dom;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index d2a8250..d95f835 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -220,15 +220,17 @@ guest_supports_nx(struct vcpu *v)
 }
 
 
-/* Some bits are invalid in any pagetable entry. */
-#if GUEST_PAGING_LEVELS == 2
-#define _PAGE_INVALID_BITS (0)
-#elif GUEST_PAGING_LEVELS == 3
-#define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<63) - 1) & ~((1ull<<paddr_bits) - 1))
-#else /* GUEST_PAGING_LEVELS == 4 */
+/*
+ * Some bits are invalid in any pagetable entry.
+ * Normal flags values get represented in 24-bit values (see
+ * get_pte_flags() and put_pte_flags()), so set bit 24 in
+ * addition to be able to flag out of range frame numbers.
+ */
+#if GUEST_PAGING_LEVELS == 3
 #define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<52) - 1) & ~((1ull<<paddr_bits) - 1))
+    (_PAGE_INVALID_BIT | get_pte_flags(((1ull << 63) - 1) & ~(PAGE_SIZE - 1)))
+#else /* 2-level and 4-level */
+#define _PAGE_INVALID_BITS _PAGE_INVALID_BIT
 #endif
 
 
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index ec3da9b..8182afd 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -194,6 +194,8 @@ extern bool_t opt_cpu_info;
 
 /* Maximum width of physical addresses supported by the hardware */
 extern unsigned int paddr_bits;
+/* Max physical address width supported within HAP guests */
+extern unsigned int hap_paddr_bits;
 
 extern void identify_cpu(struct cpuinfo_x86 *);
 extern void setup_clear_cpu_cap(unsigned int);
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index c193c88..a48c650 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -166,6 +166,7 @@ typedef l4_pgentry_t root_pgentry_t;
 
 #define USER_MAPPINGS_ARE_GLOBAL
 #ifdef USER_MAPPINGS_ARE_GLOBAL
+
 /*
  * Bit 12 of a 24-bit flag mask. This corresponds to bit 52 of a pte.
  * This is needed to distinguish between user and kernel PTEs since _PAGE_USER
@@ -176,6 +177,12 @@ typedef l4_pgentry_t root_pgentry_t;
 #define _PAGE_GUEST_KERNEL 0
 #endif
 
+/*
+ * Bit 24 of a 24-bit flag mask!  This is not any bit of a real pte,
+ * and is only used for signalling in variables that contain flags.
+ */
+#define _PAGE_INVALID_BIT (1U<<24)
+
 #endif /* __X86_64_PAGE_H__ */
 
 /*

[-- Attachment #5: xsa173-4.5.patch --]
[-- Type: application/octet-stream, Size: 9153 bytes --]

commit 9d7687d60ae2e09ad2a77b05bd820e7850709375
Author: Tim Deegan <tim@xen.org>
Date:   Wed Mar 16 16:56:04 2016 +0000

    x86: limit GFNs to 32 bits for shadowed superpages.
    
    Superpage shadows store the shadowed GFN in the backpointer field,
    which for non-BIGMEM builds is 32 bits wide.  Shadowing a superpage
    mapping of a guest-physical address above 2^44 would lead to the GFN
    being truncated there, and a crash when we come to remove the shadow
    from the hash table.
    
    Track the valid width of a GFN for each guest, including reporting it
    through CPUID, and enforce it in the shadow pagetables.  Set the
    maximum width to 32 for guests where this truncation could occur.
    
    This is XSA-173.
    
    Signed-off-by: Tim Deegan <tim@xen.org>
    Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reported-by: Ling Liu <liuling-it@360.cn>
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 5c8d3c2..7dc8220 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -37,6 +37,7 @@ integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
 struct cpu_dev * cpu_devs[X86_VENDOR_NUM] = {};
 
 unsigned int paddr_bits __read_mostly = 36;
+unsigned int hap_paddr_bits __read_mostly = 36;
 
 /*
  * Default host IA32_CR_PAT value to cover all memory types.
@@ -209,7 +210,7 @@ static void __init early_cpu_detect(void)
 
 static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 {
-	u32 tfms, capability, excap, ebx;
+	u32 tfms, capability, excap, ebx, eax;
 
 	/* Get vendor name */
 	cpuid(0x00000000, &c->cpuid_level,
@@ -246,8 +247,11 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 		}
 		if ( c->extended_cpuid_level >= 0x80000004 )
 			get_model_name(c); /* Default name */
-		if ( c->extended_cpuid_level >= 0x80000008 )
-			paddr_bits = cpuid_eax(0x80000008) & 0xff;
+		if ( c->extended_cpuid_level >= 0x80000008 ) {
+			eax = cpuid_eax(0x80000008);
+			paddr_bits = eax & 0xff;
+			hap_paddr_bits = ((eax >> 16) & 0xff) ?: paddr_bits;
+		}
 	}
 
 	/* Might lift BIOS max_leaf=3 limit. */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 41fb10a..cac458a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4327,8 +4327,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case 0x80000008:
-        count = cpuid_eax(0x80000008);
-        count = (count >> 16) & 0xff ?: count & 0xff;
+        count = d->arch.paging.gfn_bits + PAGE_SHIFT;
         if ( (*eax & 0xff) > count )
             *eax = (*eax & ~0xff) | count;
 
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 1b26175..50ba7d5 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -94,6 +94,12 @@ void *map_domain_gfn(struct p2m_domain *p2m, gfn_t gfn, mfn_t *mfn,
     struct page_info *page;
     void *map;
 
+    if ( gfn_x(gfn) >> p2m->domain->arch.paging.gfn_bits )
+    {
+        *rc = _PAGE_INVALID_BIT;
+        return NULL;
+    }
+
     /* Translate the gfn, unsharing if shared */
     page = get_page_from_gfn_p2m(p2m->domain, p2m, gfn_x(gfn), p2mt, NULL,
                                  q);
@@ -327,20 +333,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
             flags &= ~_PAGE_PAT;
 
         if ( gfn_x(start) & GUEST_L2_GFN_MASK & ~0x1 )
-        {
-#if GUEST_PAGING_LEVELS == 2
-            /*
-             * Note that _PAGE_INVALID_BITS is zero in this case, yielding a
-             * no-op here.
-             *
-             * Architecturally, the walk should fail if bit 21 is set (others
-             * aren't being checked at least in PSE36 mode), but we'll ignore
-             * this here in order to avoid specifying a non-natural, non-zero
-             * _PAGE_INVALID_BITS value just for that case.
-             */
-#endif
             rc |= _PAGE_INVALID_BITS;
-        }
+
         /* Increment the pfn by the right number of 4k pages.  
          * Mask out PAT and invalid bits. */
         start = _gfn((gfn_x(start) & ~GUEST_L2_GFN_MASK) +
@@ -423,5 +417,11 @@ set_ad:
         put_page(mfn_to_page(mfn_x(gw->l1mfn)));
     }
 
+    /* If this guest has a restricted physical address space then the
+     * target GFN must fit within it. */
+    if ( !(rc & _PAGE_PRESENT)
+         && gfn_x(guest_l1e_get_gfn(gw->l1e)) >> d->arch.paging.gfn_bits )
+        rc |= _PAGE_INVALID_BITS;
+
     return rc;
 }
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 0c80012..84531b1 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -429,6 +429,8 @@ void hap_domain_init(struct domain *d)
 {
     INIT_PAGE_LIST_HEAD(&d->arch.paging.hap.freelist);
 
+    d->arch.paging.gfn_bits = hap_paddr_bits - PAGE_SHIFT;
+
     /* Use HAP logdirty mechanism. */
     paging_log_dirty_init(d, hap_enable_log_dirty,
                           hap_disable_log_dirty,
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 18026fe..9028d82 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -48,6 +48,16 @@ void shadow_domain_init(struct domain *d, unsigned int domcr_flags)
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.freelist);
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.pinned_shadows);
 
+    d->arch.paging.gfn_bits = paddr_bits - PAGE_SHIFT;
+#ifndef CONFIG_BIGMEM
+    /*
+     * Shadowed superpages store GFNs in 32-bit page_info fields.
+     * Note that we cannot use guest_supports_superpages() here.
+     */
+    if ( !is_pv_domain(d) || opt_allow_superpage )
+        d->arch.paging.gfn_bits = 32;
+#endif
+
     /* Use shadow pagetables for log-dirty support */
     paging_log_dirty_init(d, shadow_enable_log_dirty, 
                           shadow_disable_log_dirty, shadow_clean_dirty_bitmap);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index d6802ff..7589d23 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -527,7 +527,8 @@ _sh_propagate(struct vcpu *v,
     ASSERT(GUEST_PAGING_LEVELS > 3 || level != 3);
 
     /* Check there's something for the shadows to map to */
-    if ( !p2m_is_valid(p2mt) && !p2m_is_grant(p2mt) )
+    if ( (!p2m_is_valid(p2mt) && !p2m_is_grant(p2mt))
+         || gfn_x(target_gfn) >> d->arch.paging.gfn_bits )
     {
         *sp = shadow_l1e_empty();
         goto done;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 6a77a93..e8df4a9 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -188,6 +188,9 @@ struct paging_domain {
     /* log dirty support */
     struct log_dirty_domain log_dirty;
 
+    /* Number of valid bits in a gfn. */
+    unsigned int gfn_bits;
+
     /* preemption handling */
     struct {
         const struct domain *dom;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index d2a8250..d95f835 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -220,15 +220,17 @@ guest_supports_nx(struct vcpu *v)
 }
 
 
-/* Some bits are invalid in any pagetable entry. */
-#if GUEST_PAGING_LEVELS == 2
-#define _PAGE_INVALID_BITS (0)
-#elif GUEST_PAGING_LEVELS == 3
-#define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<63) - 1) & ~((1ull<<paddr_bits) - 1))
-#else /* GUEST_PAGING_LEVELS == 4 */
+/*
+ * Some bits are invalid in any pagetable entry.
+ * Normal flags values get represented in 24-bit values (see
+ * get_pte_flags() and put_pte_flags()), so set bit 24 in
+ * addition to be able to flag out of range frame numbers.
+ */
+#if GUEST_PAGING_LEVELS == 3
 #define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<52) - 1) & ~((1ull<<paddr_bits) - 1))
+    (_PAGE_INVALID_BIT | get_pte_flags(((1ull << 63) - 1) & ~(PAGE_SIZE - 1)))
+#else /* 2-level and 4-level */
+#define _PAGE_INVALID_BITS _PAGE_INVALID_BIT
 #endif
 
 
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index b4e4731..56fc5a2 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -203,6 +203,8 @@ extern u32 cpuid_ext_features;
 
 /* Maximum width of physical addresses supported by the hardware */
 extern unsigned int paddr_bits;
+/* Max physical address width supported within HAP guests */
+extern unsigned int hap_paddr_bits;
 
 extern void identify_cpu(struct cpuinfo_x86 *);
 extern void setup_clear_cpu_cap(unsigned int);
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 1d54587..f1d1b6c 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -141,6 +141,12 @@ typedef l4_pgentry_t root_pgentry_t;
 #define _PAGE_GNTTAB (1U<<22)
 
 /*
+ * Bit 24 of a 24-bit flag mask!  This is not any bit of a real pte,
+ * and is only used for signalling in variables that contain flags.
+ */
+#define _PAGE_INVALID_BIT (1U<<24)
+
+/*
  * Bit 12 of a 24-bit flag mask. This corresponds to bit 52 of a pte.
  * This is needed to distinguish between user and kernel PTEs since _PAGE_USER
  * is asserted for both.

[-- Attachment #6: xsa173-4.6.patch --]
[-- Type: application/octet-stream, Size: 9144 bytes --]

commit 54a4651cb4e744960fb375ed99909d7dfb943caf
Author: Tim Deegan <tim@xen.org>
Date:   Wed Mar 16 16:51:27 2016 +0000

    x86: limit GFNs to 32 bits for shadowed superpages.
    
    Superpage shadows store the shadowed GFN in the backpointer field,
    which for non-BIGMEM builds is 32 bits wide.  Shadowing a superpage
    mapping of a guest-physical address above 2^44 would lead to the GFN
    being truncated there, and a crash when we come to remove the shadow
    from the hash table.
    
    Track the valid width of a GFN for each guest, including reporting it
    through CPUID, and enforce it in the shadow pagetables.  Set the
    maximum width to 32 for guests where this truncation could occur.
    
    This is XSA-173.
    
    Signed-off-by: Tim Deegan <tim@xen.org>
    Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reported-by: Ling Liu <liuling-it@360.cn>
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 35ef21b..528c283 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -38,6 +38,7 @@ integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
 const struct cpu_dev *__read_mostly cpu_devs[X86_VENDOR_NUM] = {};
 
 unsigned int paddr_bits __read_mostly = 36;
+unsigned int hap_paddr_bits __read_mostly = 36;
 
 /*
  * Default host IA32_CR_PAT value to cover all memory types.
@@ -211,7 +212,7 @@ static void __init early_cpu_detect(void)
 
 static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 {
-	u32 tfms, capability, excap, ebx;
+	u32 tfms, capability, excap, ebx, eax;
 
 	/* Get vendor name */
 	cpuid(0x00000000, &c->cpuid_level,
@@ -248,8 +249,11 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 		}
 		if ( c->extended_cpuid_level >= 0x80000004 )
 			get_model_name(c); /* Default name */
-		if ( c->extended_cpuid_level >= 0x80000008 )
-			paddr_bits = cpuid_eax(0x80000008) & 0xff;
+		if ( c->extended_cpuid_level >= 0x80000008 ) {
+			eax = cpuid_eax(0x80000008);
+			paddr_bits = eax & 0xff;
+			hap_paddr_bits = ((eax >> 16) & 0xff) ?: paddr_bits;
+		}
 	}
 
 	/* Might lift BIOS max_leaf=3 limit. */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e200aab..0b4d9f0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4567,8 +4567,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         break;
 
     case 0x80000008:
-        count = cpuid_eax(0x80000008);
-        count = (count >> 16) & 0xff ?: count & 0xff;
+        count = d->arch.paging.gfn_bits + PAGE_SHIFT;
         if ( (*eax & 0xff) > count )
             *eax = (*eax & ~0xff) | count;
 
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 773454d..06543d3 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -93,6 +93,12 @@ void *map_domain_gfn(struct p2m_domain *p2m, gfn_t gfn, mfn_t *mfn,
     struct page_info *page;
     void *map;
 
+    if ( gfn_x(gfn) >> p2m->domain->arch.paging.gfn_bits )
+    {
+        *rc = _PAGE_INVALID_BIT;
+        return NULL;
+    }
+
     /* Translate the gfn, unsharing if shared */
     page = get_page_from_gfn_p2m(p2m->domain, p2m, gfn_x(gfn), p2mt, NULL,
                                  q);
@@ -326,20 +332,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
             flags &= ~_PAGE_PAT;
 
         if ( gfn_x(start) & GUEST_L2_GFN_MASK & ~0x1 )
-        {
-#if GUEST_PAGING_LEVELS == 2
-            /*
-             * Note that _PAGE_INVALID_BITS is zero in this case, yielding a
-             * no-op here.
-             *
-             * Architecturally, the walk should fail if bit 21 is set (others
-             * aren't being checked at least in PSE36 mode), but we'll ignore
-             * this here in order to avoid specifying a non-natural, non-zero
-             * _PAGE_INVALID_BITS value just for that case.
-             */
-#endif
             rc |= _PAGE_INVALID_BITS;
-        }
+
         /* Increment the pfn by the right number of 4k pages.  
          * Mask out PAT and invalid bits. */
         start = _gfn((gfn_x(start) & ~GUEST_L2_GFN_MASK) +
@@ -422,5 +416,11 @@ set_ad:
         put_page(mfn_to_page(mfn_x(gw->l1mfn)));
     }
 
+    /* If this guest has a restricted physical address space then the
+     * target GFN must fit within it. */
+    if ( !(rc & _PAGE_PRESENT)
+         && gfn_x(guest_l1e_get_gfn(gw->l1e)) >> d->arch.paging.gfn_bits )
+        rc |= _PAGE_INVALID_BITS;
+
     return rc;
 }
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 6eb2167..f3475c6 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -448,6 +448,8 @@ void hap_domain_init(struct domain *d)
 {
     INIT_PAGE_LIST_HEAD(&d->arch.paging.hap.freelist);
 
+    d->arch.paging.gfn_bits = hap_paddr_bits - PAGE_SHIFT;
+
     /* Use HAP logdirty mechanism. */
     paging_log_dirty_init(d, hap_enable_log_dirty,
                           hap_disable_log_dirty,
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index bad8360..98d0d2c 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -51,6 +51,16 @@ int shadow_domain_init(struct domain *d, unsigned int domcr_flags)
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.freelist);
     INIT_PAGE_LIST_HEAD(&d->arch.paging.shadow.pinned_shadows);
 
+    d->arch.paging.gfn_bits = paddr_bits - PAGE_SHIFT;
+#ifndef CONFIG_BIGMEM
+    /*
+     * Shadowed superpages store GFNs in 32-bit page_info fields.
+     * Note that we cannot use guest_supports_superpages() here.
+     */
+    if ( !is_pv_domain(d) || opt_allow_superpage )
+        d->arch.paging.gfn_bits = 32;
+#endif
+
     /* Use shadow pagetables for log-dirty support */
     paging_log_dirty_init(d, sh_enable_log_dirty,
                           sh_disable_log_dirty, sh_clean_dirty_bitmap);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 43c9488..71477fe 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -525,7 +525,8 @@ _sh_propagate(struct vcpu *v,
     ASSERT(GUEST_PAGING_LEVELS > 3 || level != 3);
 
     /* Check there's something for the shadows to map to */
-    if ( !p2m_is_valid(p2mt) && !p2m_is_grant(p2mt) )
+    if ( (!p2m_is_valid(p2mt) && !p2m_is_grant(p2mt))
+         || gfn_x(target_gfn) >> d->arch.paging.gfn_bits )
     {
         *sp = shadow_l1e_empty();
         goto done;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index c6c6e71..74c3a52 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -193,6 +193,9 @@ struct paging_domain {
     /* log dirty support */
     struct log_dirty_domain log_dirty;
 
+    /* Number of valid bits in a gfn. */
+    unsigned int gfn_bits;
+
     /* preemption handling */
     struct {
         const struct domain *dom;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index f8a0d76..b5db401 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -210,15 +210,17 @@ guest_supports_nx(struct vcpu *v)
 }
 
 
-/* Some bits are invalid in any pagetable entry. */
-#if GUEST_PAGING_LEVELS == 2
-#define _PAGE_INVALID_BITS (0)
-#elif GUEST_PAGING_LEVELS == 3
-#define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<63) - 1) & ~((1ull<<paddr_bits) - 1))
-#else /* GUEST_PAGING_LEVELS == 4 */
+/*
+ * Some bits are invalid in any pagetable entry.
+ * Normal flags values get represented in 24-bit values (see
+ * get_pte_flags() and put_pte_flags()), so set bit 24 in
+ * addition to be able to flag out of range frame numbers.
+ */
+#if GUEST_PAGING_LEVELS == 3
 #define _PAGE_INVALID_BITS \
-    get_pte_flags(((1ull<<52) - 1) & ~((1ull<<paddr_bits) - 1))
+    (_PAGE_INVALID_BIT | get_pte_flags(((1ull << 63) - 1) & ~(PAGE_SIZE - 1)))
+#else /* 2-level and 4-level */
+#define _PAGE_INVALID_BITS _PAGE_INVALID_BIT
 #endif
 
 
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index f507f5e..a200470 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -212,6 +212,8 @@ extern u32 cpuid_ext_features;
 
 /* Maximum width of physical addresses supported by the hardware */
 extern unsigned int paddr_bits;
+/* Max physical address width supported within HAP guests */
+extern unsigned int hap_paddr_bits;
 
 extern const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id table[]);
 
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 19ab4d0..eb5e2fd 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -141,6 +141,12 @@ typedef l4_pgentry_t root_pgentry_t;
 #define _PAGE_GNTTAB (1U<<22)
 
 /*
+ * Bit 24 of a 24-bit flag mask!  This is not any bit of a real pte,
+ * and is only used for signalling in variables that contain flags.
+ */
+#define _PAGE_INVALID_BIT (1U<<24)
+
+/*
  * Bit 12 of a 24-bit flag mask. This corresponds to bit 52 of a pte.
  * This is needed to distinguish between user and kernel PTEs since _PAGE_USER
  * is asserted for both.


Re: Xen Security Advisory 173 (CVE-2016-3960) - x86 shadow pagetables: address width overflow
From: Philipp Hahn @ 2016-05-13 10:55 UTC
  To: Tim Deegan, Jan Beulich; +Cc: Stefan Bader, Xen-devel

Hi,


On 18.04.2016 at 15:31, the Xen.org security team wrote:
>             Xen Security Advisory CVE-2016-3960 / XSA-173
>                               version 3
> 
>              x86 shadow pagetables: address width overflow
...
> ISSUE DESCRIPTION
> =================
> In the x86 shadow pagetable code, the guest frame number of a
> superpage mapping is stored in a 32-bit field.  If a shadowed guest
> can cause a superpage mapping of a guest-physical address at or above
> 2^44 to be shadowed, the top bits of the address will be lost, causing
> an assertion failure or NULL dereference later on, in code that
> removes the shadow.
...
> VULNERABLE SYSTEMS
> ==================
> Xen versions from 3.4 onwards are affected.
> 
> Only x86 variants of Xen are susceptible.  ARM variants are not
> affected.
...
> RESOLUTION
> ==========
> Applying the appropriate attached patch resolves this issue.
...
> xsa173-4.3.patch       Xen 4.3.x

As Xen 4.2 and Xen 4.1 are also vulnerable, I'm trying to backport this.
The 4.3 patch mostly applies, but compilation fails because x86-32
support was only dropped with Xen 4.3, and _PAGE_INVALID_BIT remains
undefined for x86-32:
> guest_walk.c: In function 'mandatory_flags':
> guest_walk.c:66:40: error: '_PAGE_INVALID_BIT' undeclared (first use in this function)
> guest_walk.c:66:40: note: each undeclared identifier is reported only once for each function it appears in
> guest_walk.c: In function 'guest_walk_tables_2_levels':
> guest_walk.c:146:30: error: '_PAGE_INVALID_BIT' undeclared (first use in this function)
> guest_walk.c: In function 'mandatory_flags':
> guest_walk.c:67:1: error: control reaches end of non-void function [-Werror=return-type]

It's only defined for x86-64:
> --- a/xen/include/asm-x86/x86_64/page.h
> +++ b/xen/include/asm-x86/x86_64/page.h
...
> +/*
> + * Bit 24 of a 24-bit flag mask!  This is not any bit of a real pte,
> + * and is only used for signalling in variables that contain flags.
> + */
> +#define _PAGE_INVALID_BIT (1U<<24)
> +
>  #endif /* __X86_64_PAGE_H__ */

I guess using bit 24 is okay for 32 bit, too.

Can someone confirm that please?

Philipp


Re: Xen Security Advisory 173 (CVE-2016-3960) - x86 shadow pagetables: address width overflow
From: Jan Beulich @ 2016-05-13 11:28 UTC
  To: Philipp Hahn; +Cc: Tim Deegan, Stefan Bader, Xen-devel

>>> On 13.05.16 at 12:55, <hahn@univention.de> wrote:
> Am 18.04.2016 um 15:31 schrieb Xen.org security team:
>> +/*
>> + * Bit 24 of a 24-bit flag mask!  This is not any bit of a real pte,
>> + * and is only used for signalling in variables that contain flags.
>> + */
>> +#define _PAGE_INVALID_BIT (1U<<24)
>> +
>>  #endif /* __X86_64_PAGE_H__ */
> 
> I guess using bit 24 is okay for 32 bit, too.
> 
> Can someone confirm that please?

Yes, that should be fine.

Jan

