All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-4.17 0/4] XSA-409 fixes
@ 2022-11-17  1:08 Andrew Cooper
  2022-11-17  1:08 ` [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size Andrew Cooper
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Andrew Cooper @ 2022-11-17  1:08 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

For 4.17, and backport to all stable releases.

Patch 2 doesn't technically need backporting, but it's safe and I'm
dis-enclined to forgo testing in older releases.

See patches for changelogs.

Andrew Cooper (4):
  xen: Introduce non-broken hypercalls for the paging mempool size
  tools/tests: Unit test for paging mempool size
  xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls
  xen/arm: Correct the p2m pool size calculations

 tools/include/xenctrl.h                          |   3 +
 tools/libs/ctrl/xc_domain.c                      |  29 ++++
 tools/libs/light/libxl_arm.c                     |  14 +-
 tools/libs/light/libxl_dom.c                     |  19 +++
 tools/libs/light/libxl_internal.h                |   3 +
 tools/libs/light/libxl_x86.c                     |  15 +-
 tools/tests/Makefile                             |   1 +
 tools/tests/paging-mempool/.gitignore            |   1 +
 tools/tests/paging-mempool/Makefile              |  42 ++++++
 tools/tests/paging-mempool/test-paging-mempool.c | 181 +++++++++++++++++++++++
 xen/arch/arm/domctl.c                            |  53 -------
 xen/arch/arm/include/asm/p2m.h                   |   1 -
 xen/arch/arm/p2m.c                               |  32 ++--
 xen/arch/x86/include/asm/hap.h                   |   1 +
 xen/arch/x86/include/asm/shadow.h                |   4 +
 xen/arch/x86/mm/hap/hap.c                        |  11 ++
 xen/arch/x86/mm/paging.c                         |  43 ++++++
 xen/arch/x86/mm/shadow/common.c                  |  11 ++
 xen/common/domctl.c                              |  14 ++
 xen/include/public/domctl.h                      |  24 ++-
 xen/include/xen/domain.h                         |   3 +
 21 files changed, 415 insertions(+), 90 deletions(-)
 create mode 100644 tools/tests/paging-mempool/.gitignore
 create mode 100644 tools/tests/paging-mempool/Makefile
 create mode 100644 tools/tests/paging-mempool/test-paging-mempool.c

-- 
2.11.0



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size
  2022-11-17  1:08 [PATCH for-4.17 0/4] XSA-409 fixes Andrew Cooper
@ 2022-11-17  1:08 ` Andrew Cooper
  2022-11-17  2:08   ` Stefano Stabellini
                     ` (2 more replies)
  2022-11-17  1:08 ` [PATCH 2/4] tools/tests: Unit test for " Andrew Cooper
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 15+ messages in thread
From: Andrew Cooper @ 2022-11-17  1:08 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

The existing XEN_DOMCTL_SHADOW_OP_{GET,SET}_ALLOCATION have problems:

 * All set_allocation() flavours have an overflow-before-widen bug when
   calculating "sc->mb << (20 - PAGE_SHIFT)".
 * All flavours have a granularity of 1M.  This was tolerable when the size of
   the pool could only be set at the same granularity, but is broken now that
   ARM has a 16-page stopgap allocation in use.
 * All get_allocation() flavours round up, and in particular turn 0 into 1,
   meaning the get op returns junk before a successful set op.
 * The x86 flavours reject the hypercalls before the VM has vCPUs allocated,
   despite the pool size being a domain property.
 * Even the hypercall names are long-obsolete.

Implement a better interface, which can be first used to unit test the
behaviour, and subsequently correct a broken implementation.  The old
interface will be retired in due course.

The unit of bytes (as opposed pages) is a deliberate API/ABI improvement to
more easily support multiple page granularities.

This is part of XSA-409 / CVE-2022-33747.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Henry Wang <Henry.Wang@arm.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Henry Wang <Henry.Wang@arm.com>
CC: Anthony PERARD <anthony.perard@citrix.com>

v2:
 * s/p2m/paging/
 * Fix overflow-before-widen in ARM's arch_get_p2m_mempool_size()
 * Fix overflow-before-widen in both {hap,shadow}_get_allocation_bytes()
 * Leave a TODO about x86/PV, drop assertion.
 * Check for long->int truncation in x86's arch_set_paging_mempool_size()

Future TODOs:
 * x86 shadow still rounds up.  This is buggy as it's a simultaneous equation
   with tot_pages which varies over time with ballooning.
 * x86 PV is weird.  There are no toolstack interact with the shadow pool
   size, but the "shadow" pool it does come into existence when logdirty (or
   pv-l1tf) when first enabled.
 * The shadow+hap logic is in desperate need of deduping.
---
 tools/include/xenctrl.h           |  3 +++
 tools/libs/ctrl/xc_domain.c       | 29 ++++++++++++++++++++++++++
 xen/arch/arm/p2m.c                | 26 +++++++++++++++++++++++
 xen/arch/x86/include/asm/hap.h    |  1 +
 xen/arch/x86/include/asm/shadow.h |  4 ++++
 xen/arch/x86/mm/hap/hap.c         | 11 ++++++++++
 xen/arch/x86/mm/paging.c          | 43 +++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm/shadow/common.c   | 11 ++++++++++
 xen/common/domctl.c               | 14 +++++++++++++
 xen/include/public/domctl.h       | 24 +++++++++++++++++++++-
 xen/include/xen/domain.h          |  3 +++
 11 files changed, 168 insertions(+), 1 deletion(-)

diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 0c8b4c3aa7a5..23037874d3d5 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -893,6 +893,9 @@ long long xc_logdirty_control(xc_interface *xch,
                               unsigned int mode,
                               xc_shadow_op_stats_t *stats);
 
+int xc_get_paging_mempool_size(xc_interface *xch, uint32_t domid, uint64_t *size);
+int xc_set_paging_mempool_size(xc_interface *xch, uint32_t domid, uint64_t size);
+
 int xc_sched_credit_domain_set(xc_interface *xch,
                                uint32_t domid,
                                struct xen_domctl_sched_credit *sdom);
diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index 14c0420c35be..e939d0715739 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -706,6 +706,35 @@ long long xc_logdirty_control(xc_interface *xch,
     return (rc == 0) ? domctl.u.shadow_op.pages : rc;
 }
 
+int xc_get_paging_mempool_size(xc_interface *xch, uint32_t domid, uint64_t *size)
+{
+    int rc;
+    struct xen_domctl domctl = {
+        .cmd         = XEN_DOMCTL_get_paging_mempool_size,
+        .domain      = domid,
+    };
+
+    rc = do_domctl(xch, &domctl);
+    if ( rc )
+        return rc;
+
+    *size = domctl.u.paging_mempool.size;
+    return 0;
+}
+
+int xc_set_paging_mempool_size(xc_interface *xch, uint32_t domid, uint64_t size)
+{
+    struct xen_domctl domctl = {
+        .cmd         = XEN_DOMCTL_set_paging_mempool_size,
+        .domain      = domid,
+        .u.paging_mempool = {
+            .size = size,
+        },
+    };
+
+    return do_domctl(xch, &domctl);
+}
+
 int xc_domain_setmaxmem(xc_interface *xch,
                         uint32_t domid,
                         uint64_t max_memkb)
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 94d3b60b1387..8c1972e58227 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -100,6 +100,13 @@ unsigned int p2m_get_allocation(struct domain *d)
     return ROUNDUP(nr_pages, 1 << (20 - PAGE_SHIFT)) >> (20 - PAGE_SHIFT);
 }
 
+/* Return the size of the pool, in bytes. */
+int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
+{
+    *size = (uint64_t)ACCESS_ONCE(d->arch.paging.p2m_total_pages) << PAGE_SHIFT;
+    return 0;
+}
+
 /*
  * Set the pool of pages to the required number of pages.
  * Returns 0 for success, non-zero for failure.
@@ -157,6 +164,25 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
     return 0;
 }
 
+int arch_set_paging_mempool_size(struct domain *d, uint64_t size)
+{
+    unsigned long pages = size >> PAGE_SHIFT;
+    bool preempted = false;
+    int rc;
+
+    if ( (size & ~PAGE_MASK) ||          /* Non page-sized request? */
+         pages != (size >> PAGE_SHIFT) ) /* 32-bit overflow? */
+        return -EINVAL;
+
+    spin_lock(&d->arch.paging.lock);
+    rc = p2m_set_allocation(d, pages, &preempted);
+    spin_unlock(&d->arch.paging.lock);
+
+    ASSERT(preempted == (rc == -ERESTART));
+
+    return rc;
+}
+
 int p2m_teardown_allocation(struct domain *d)
 {
     int ret = 0;
diff --git a/xen/arch/x86/include/asm/hap.h b/xen/arch/x86/include/asm/hap.h
index 90dece29deca..14d2f212dab9 100644
--- a/xen/arch/x86/include/asm/hap.h
+++ b/xen/arch/x86/include/asm/hap.h
@@ -47,6 +47,7 @@ int   hap_track_dirty_vram(struct domain *d,
 extern const struct paging_mode *hap_paging_get_mode(struct vcpu *);
 int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted);
 unsigned int hap_get_allocation(struct domain *d);
+int hap_get_allocation_bytes(struct domain *d, uint64_t *size);
 
 #endif /* XEN_HAP_H */
 
diff --git a/xen/arch/x86/include/asm/shadow.h b/xen/arch/x86/include/asm/shadow.h
index 1365fe480518..dad876d29499 100644
--- a/xen/arch/x86/include/asm/shadow.h
+++ b/xen/arch/x86/include/asm/shadow.h
@@ -97,6 +97,8 @@ void shadow_blow_tables_per_domain(struct domain *d);
 int shadow_set_allocation(struct domain *d, unsigned int pages,
                           bool *preempted);
 
+int shadow_get_allocation_bytes(struct domain *d, uint64_t *size);
+
 #else /* !CONFIG_SHADOW_PAGING */
 
 #define shadow_vcpu_teardown(v) ASSERT(is_pv_vcpu(v))
@@ -108,6 +110,8 @@ int shadow_set_allocation(struct domain *d, unsigned int pages,
     ({ ASSERT_UNREACHABLE(); -EOPNOTSUPP; })
 #define shadow_set_allocation(d, pages, preempted) \
     ({ ASSERT_UNREACHABLE(); -EOPNOTSUPP; })
+#define shadow_get_allocation_bytes(d, size) \
+    ({ ASSERT_UNREACHABLE(); -EOPNOTSUPP; })
 
 static inline void sh_remove_shadows(struct domain *d, mfn_t gmfn,
                                      int fast, int all) {}
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index f809ea9aa6ae..0fc1b1d9aced 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -345,6 +345,17 @@ unsigned int hap_get_allocation(struct domain *d)
             + ((pg & ((1 << (20 - PAGE_SHIFT)) - 1)) ? 1 : 0));
 }
 
+int hap_get_allocation_bytes(struct domain *d, uint64_t *size)
+{
+    unsigned long pages = d->arch.paging.hap.total_pages;
+
+    pages += d->arch.paging.hap.p2m_pages;
+
+    *size = pages << PAGE_SHIFT;
+
+    return 0;
+}
+
 /* Set the pool of pages to the required number of pages.
  * Returns 0 for success, non-zero for failure. */
 int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted)
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 3a355eee9ca3..8d579fa9a3e8 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -977,6 +977,49 @@ int __init paging_set_allocation(struct domain *d, unsigned int pages,
 }
 #endif
 
+int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
+{
+    int rc;
+
+    if ( is_pv_domain(d) )                 /* TODO: Relax in due course */
+        return -EOPNOTSUPP;
+
+    if ( hap_enabled(d) )
+        rc = hap_get_allocation_bytes(d, size);
+    else
+        rc = shadow_get_allocation_bytes(d, size);
+
+    return rc;
+}
+
+int arch_set_paging_mempool_size(struct domain *d, uint64_t size)
+{
+    unsigned long pages = size >> PAGE_SHIFT;
+    bool preempted = false;
+    int rc;
+
+    if ( is_pv_domain(d) )                 /* TODO: Relax in due course */
+        return -EOPNOTSUPP;
+
+    if ( size & ~PAGE_MASK ||              /* Non page-sized request? */
+         pages != (unsigned int)pages )    /* Overflow $X_set_allocation()? */
+        return -EINVAL;
+
+    paging_lock(d);
+    if ( hap_enabled(d) )
+        rc = hap_set_allocation(d, pages, &preempted);
+    else
+        rc = shadow_set_allocation(d, pages, &preempted);
+    paging_unlock(d);
+
+    /*
+     * TODO: Adjust $X_set_allocation() so this is true.
+    ASSERT(preempted == (rc == -ERESTART));
+     */
+
+    return preempted ? -ERESTART : rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index badfd53c6b23..a8404f97f668 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1427,6 +1427,17 @@ static unsigned int shadow_get_allocation(struct domain *d)
             + ((pg & ((1 << (20 - PAGE_SHIFT)) - 1)) ? 1 : 0));
 }
 
+int shadow_get_allocation_bytes(struct domain *d, uint64_t *size)
+{
+    unsigned long pages = d->arch.paging.shadow.total_pages;
+
+    pages += d->arch.paging.shadow.p2m_pages;
+
+    *size = pages << PAGE_SHIFT;
+
+    return 0;
+}
+
 /**************************************************************************/
 /* Hash table for storing the guest->shadow mappings.
  * The table itself is an array of pointers to shadows; the shadows are then
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 69fb9abd346f..ad71ad8a4cc5 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -874,6 +874,20 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         ret = iommu_do_domctl(op, d, u_domctl);
         break;
 
+    case XEN_DOMCTL_get_paging_mempool_size:
+        ret = arch_get_paging_mempool_size(d, &op->u.paging_mempool.size);
+        if ( !ret )
+            copyback = 1;
+        break;
+
+    case XEN_DOMCTL_set_paging_mempool_size:
+        ret = arch_set_paging_mempool_size(d, op->u.paging_mempool.size);
+
+        if ( ret == -ERESTART )
+            ret = hypercall_create_continuation(
+                __HYPERVISOR_domctl, "h", u_domctl);
+        break;
+
     default:
         ret = arch_do_domctl(op, d, u_domctl);
         break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index b2ae839c3632..d4072761791a 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -214,7 +214,10 @@ struct xen_domctl_getpageframeinfo3 {
  /* Return the bitmap but do not modify internal copy. */
 #define XEN_DOMCTL_SHADOW_OP_PEEK        12
 
-/* Memory allocation accessors. */
+/*
+ * Memory allocation accessors.  These APIs are broken and will be removed.
+ * Use XEN_DOMCTL_{get,set}_paging_mempool_size instead.
+ */
 #define XEN_DOMCTL_SHADOW_OP_GET_ALLOCATION   30
 #define XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION   31
 
@@ -946,6 +949,22 @@ struct xen_domctl_cacheflush {
     xen_pfn_t start_pfn, nr_pfns;
 };
 
+/*
+ * XEN_DOMCTL_get_paging_mempool_size / XEN_DOMCTL_set_paging_mempool_size.
+ *
+ * Get or set the paging memory pool size.  The size is in bytes.
+ *
+ * This is a dedicated pool of memory for Xen to use while managing the guest,
+ * typically containing pagetables.  As such, there is an implementation
+ * specific minimum granularity.
+ *
+ * The set operation can fail mid-way through the request (e.g. Xen running
+ * out of memory, no free memory to reclaim from the pool, etc.).
+ */
+struct xen_domctl_paging_mempool {
+    uint64_aligned_t size; /* IN/OUT.  Size in bytes. */
+};
+
 #if defined(__i386__) || defined(__x86_64__)
 struct xen_domctl_vcpu_msr {
     uint32_t         index;
@@ -1274,6 +1293,8 @@ struct xen_domctl {
 #define XEN_DOMCTL_get_cpu_policy                82
 #define XEN_DOMCTL_set_cpu_policy                83
 #define XEN_DOMCTL_vmtrace_op                    84
+#define XEN_DOMCTL_get_paging_mempool_size       85
+#define XEN_DOMCTL_set_paging_mempool_size       86
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1335,6 +1356,7 @@ struct xen_domctl {
         struct xen_domctl_psr_alloc         psr_alloc;
         struct xen_domctl_vuart_op          vuart_op;
         struct xen_domctl_vmtrace_op        vmtrace_op;
+        struct xen_domctl_paging_mempool    paging_mempool;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 2c8116afba27..0de9cbc1696d 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -98,6 +98,9 @@ void arch_get_info_guest(struct vcpu *, vcpu_guest_context_u);
 int arch_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg);
 int default_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg);
 
+int arch_get_paging_mempool_size(struct domain *d, uint64_t *size /* bytes */);
+int arch_set_paging_mempool_size(struct domain *d, uint64_t size /* bytes */);
+
 int domain_relinquish_resources(struct domain *d);
 
 void dump_pageframe_info(struct domain *d);
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/4] tools/tests: Unit test for paging mempool size
  2022-11-17  1:08 [PATCH for-4.17 0/4] XSA-409 fixes Andrew Cooper
  2022-11-17  1:08 ` [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size Andrew Cooper
@ 2022-11-17  1:08 ` Andrew Cooper
  2022-11-17 10:39   ` Jan Beulich
  2022-11-17 14:20   ` Anthony PERARD
  2022-11-17  1:08 ` [PATCH 3/4] xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls Andrew Cooper
  2022-11-17  1:08 ` [PATCH 4/4] xen/arm: Correct the p2m pool size calculations Andrew Cooper
  3 siblings, 2 replies; 15+ messages in thread
From: Andrew Cooper @ 2022-11-17  1:08 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

Exercise some basic functionality of the new
xc_{get,set}_paging_mempool_size() hypercalls.

This passes on x86, but fails currently on ARM.  ARM will be fixed up in
future patches.

This is part of XSA-409 / CVE-2022-33747.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Henry Wang <Henry.Wang@arm.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Henry Wang <Henry.Wang@arm.com>
CC: Anthony PERARD <anthony.perard@citrix.com>

x86 Shadow is complicated because of how it behaves for PV guests, and because
of how it forms a simultaneous equation with tot_pages.  This will require
more work to untangle.

v2:
 * s/p2m/paging/
 * Fix CFLAGS_libxenforeginmemory typo
---
 tools/tests/Makefile                             |   1 +
 tools/tests/paging-mempool/.gitignore            |   1 +
 tools/tests/paging-mempool/Makefile              |  42 ++++++
 tools/tests/paging-mempool/test-paging-mempool.c | 181 +++++++++++++++++++++++
 4 files changed, 225 insertions(+)
 create mode 100644 tools/tests/paging-mempool/.gitignore
 create mode 100644 tools/tests/paging-mempool/Makefile
 create mode 100644 tools/tests/paging-mempool/test-paging-mempool.c

diff --git a/tools/tests/Makefile b/tools/tests/Makefile
index d99146d56a64..1319c3a9d88c 100644
--- a/tools/tests/Makefile
+++ b/tools/tests/Makefile
@@ -11,6 +11,7 @@ endif
 SUBDIRS-y += xenstore
 SUBDIRS-y += depriv
 SUBDIRS-y += vpci
+SUBDIRS-y += paging-mempool
 
 .PHONY: all clean install distclean uninstall
 all clean distclean install uninstall: %: subdirs-%
diff --git a/tools/tests/paging-mempool/.gitignore b/tools/tests/paging-mempool/.gitignore
new file mode 100644
index 000000000000..2f9305b7cc07
--- /dev/null
+++ b/tools/tests/paging-mempool/.gitignore
@@ -0,0 +1 @@
+test-paging-mempool
diff --git a/tools/tests/paging-mempool/Makefile b/tools/tests/paging-mempool/Makefile
new file mode 100644
index 000000000000..5d49497710e0
--- /dev/null
+++ b/tools/tests/paging-mempool/Makefile
@@ -0,0 +1,42 @@
+XEN_ROOT = $(CURDIR)/../../..
+include $(XEN_ROOT)/tools/Rules.mk
+
+TARGET := test-paging-mempool
+
+.PHONY: all
+all: $(TARGET)
+
+.PHONY: clean
+clean:
+	$(RM) -- *.o $(TARGET) $(DEPS_RM)
+
+.PHONY: distclean
+distclean: clean
+	$(RM) -- *~
+
+.PHONY: install
+install: all
+	$(INSTALL_DIR) $(DESTDIR)$(LIBEXEC_BIN)
+	$(INSTALL_PROG) $(TARGET) $(DESTDIR)$(LIBEXEC_BIN)
+
+.PHONY: uninstall
+uninstall:
+	$(RM) -- $(DESTDIR)$(LIBEXEC_BIN)/$(TARGET)
+
+CFLAGS += $(CFLAGS_xeninclude)
+CFLAGS += $(CFLAGS_libxenctrl)
+CFLAGS += $(CFLAGS_libxenforeignmemory)
+CFLAGS += $(CFLAGS_libxengnttab)
+CFLAGS += $(APPEND_CFLAGS)
+
+LDFLAGS += $(LDLIBS_libxenctrl)
+LDFLAGS += $(LDLIBS_libxenforeignmemory)
+LDFLAGS += $(LDLIBS_libxengnttab)
+LDFLAGS += $(APPEND_LDFLAGS)
+
+%.o: Makefile
+
+$(TARGET): test-paging-mempool.o
+	$(CC) -o $@ $< $(LDFLAGS)
+
+-include $(DEPS_INCLUDE)
diff --git a/tools/tests/paging-mempool/test-paging-mempool.c b/tools/tests/paging-mempool/test-paging-mempool.c
new file mode 100644
index 000000000000..942a2fde19c7
--- /dev/null
+++ b/tools/tests/paging-mempool/test-paging-mempool.c
@@ -0,0 +1,181 @@
+#include <err.h>
+#include <errno.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/mman.h>
+
+#include <xenctrl.h>
+#include <xenforeignmemory.h>
+#include <xengnttab.h>
+#include <xen-tools/libs.h>
+
+static unsigned int nr_failures;
+#define fail(fmt, ...)                          \
+({                                              \
+    nr_failures++;                              \
+    (void)printf(fmt, ##__VA_ARGS__);           \
+})
+
+static xc_interface *xch;
+static uint32_t domid;
+
+static struct xen_domctl_createdomain create = {
+    .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
+    .max_vcpus = 1,
+    .max_grant_frames = 1,
+    .grant_opts = XEN_DOMCTL_GRANT_version(1),
+
+    .arch = {
+#if defined(__x86_64__) || defined(__i386__)
+        .emulation_flags = XEN_X86_EMU_LAPIC,
+#endif
+    },
+};
+
+static uint64_t default_mempool_size_bytes =
+#if defined(__x86_64__) || defined(__i386__)
+    256 << 12; /* Only x86 HAP for now.  x86 Shadow needs more work. */
+#elif defined (__arm__) || defined(__aarch64__)
+    16 << 12;
+#endif
+
+static void run_tests(void)
+{
+    xen_pfn_t physmap[] = { 0 };
+    uint64_t size_bytes, old_size_bytes;
+    int rc;
+
+    printf("Test default mempool size\n");
+
+    rc = xc_get_paging_mempool_size(xch, domid, &size_bytes);
+    if ( rc )
+        return fail("  Fail: get mempool size: %d - %s\n",
+                    errno, strerror(errno));
+
+    printf("mempool size %"PRIu64" bytes (%"PRIu64"kB, %"PRIu64"MB)\n",
+           size_bytes, size_bytes >> 10, size_bytes >> 20);
+
+
+    /*
+     * Check that the domain has the expected default allocation size.  This
+     * will fail if the logic in Xen is altered without an equivelent
+     * adjustment here.
+     */
+    if ( size_bytes != default_mempool_size_bytes )
+        return fail("  Fail: size %"PRIu64" != expected size %"PRIu64"\n",
+                    size_bytes, default_mempool_size_bytes);
+
+
+    printf("Test that allocate doesn't alter pool size\n");
+
+    /*
+     * Populate the domain with some RAM.  This will cause more of the mempool
+     * to be used.
+     */
+    old_size_bytes = size_bytes;
+
+    rc = xc_domain_setmaxmem(xch, domid, -1);
+    if ( rc )
+        return fail("  Fail: setmaxmem: : %d - %s\n",
+                    errno, strerror(errno));
+
+    rc = xc_domain_populate_physmap_exact(xch, domid, 1, 0, 0, physmap);
+    if ( rc )
+        return fail("  Fail: populate physmap: %d - %s\n",
+                    errno, strerror(errno));
+
+    /*
+     * Re-get the p2m size.  Should not have changed as a consequence of
+     * populate physmap.
+     */
+    rc = xc_get_paging_mempool_size(xch, domid, &size_bytes);
+    if ( rc )
+        return fail("  Fail: get mempool size: %d - %s\n",
+                    errno, strerror(errno));
+
+    if ( old_size_bytes != size_bytes )
+        return fail("  Fail: mempool size changed %"PRIu64" => %"PRIu64"\n",
+                    old_size_bytes, size_bytes);
+
+
+
+    printf("Test bad set size\n");
+
+    /*
+     * Check that setting a non-page size results in failure.
+     */
+    rc = xc_set_paging_mempool_size(xch, domid, size_bytes + 1);
+    if ( rc != -1 || errno != EINVAL )
+        return fail("  Fail: Bad set size: expected -1/EINVAL, got %d/%d - %s\n",
+                    rc, errno, strerror(errno));
+
+
+    printf("Test very large set size\n");
+
+    /*
+     * Check that setting a large P2M size succeeds.  This is expecting to
+     * trigger continuations.
+     */
+    rc = xc_set_paging_mempool_size(xch, domid, 64 << 20);
+    if ( rc )
+        return fail("  Fail: Set size 64MB: %d - %s\n",
+                    errno, strerror(errno));
+
+
+    /*
+     * Check that the reported size matches what set consumed.
+     */
+    rc = xc_get_paging_mempool_size(xch, domid, &size_bytes);
+    if ( rc )
+        return fail("  Fail: get p2m mempool size: %d - %s\n",
+                    errno, strerror(errno));
+
+    if ( size_bytes != 64 << 20 )
+        return fail("  Fail: expected mempool size %u, got %"PRIu64"\n",
+                    64 << 20, size_bytes);
+}
+
+int main(int argc, char **argv)
+{
+    int rc;
+
+    printf("Paging mempool tests\n");
+
+    xch = xc_interface_open(NULL, NULL, 0);
+
+    if ( !xch )
+        err(1, "xc_interface_open");
+
+    rc = xc_domain_create(xch, &domid, &create);
+    if ( rc )
+    {
+        if ( errno == EINVAL || errno == EOPNOTSUPP )
+            printf("  Skip: %d - %s\n", errno, strerror(errno));
+        else
+            fail("  Domain create failure: %d - %s\n",
+                 errno, strerror(errno));
+        goto out;
+    }
+
+    printf("  Created d%u\n", domid);
+
+    run_tests();
+
+    rc = xc_domain_destroy(xch, domid);
+    if ( rc )
+        fail("  Failed to destroy domain: %d - %s\n",
+             errno, strerror(errno));
+ out:
+    return !!nr_failures;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/4] xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls
  2022-11-17  1:08 [PATCH for-4.17 0/4] XSA-409 fixes Andrew Cooper
  2022-11-17  1:08 ` [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size Andrew Cooper
  2022-11-17  1:08 ` [PATCH 2/4] tools/tests: Unit test for " Andrew Cooper
@ 2022-11-17  1:08 ` Andrew Cooper
  2022-11-17  2:12   ` Stefano Stabellini
  2022-11-17 14:07   ` Anthony PERARD
  2022-11-17  1:08 ` [PATCH 4/4] xen/arm: Correct the p2m pool size calculations Andrew Cooper
  3 siblings, 2 replies; 15+ messages in thread
From: Andrew Cooper @ 2022-11-17  1:08 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

This reverts most of commit cf2a68d2ffbc3ce95e01449d46180bddb10d24a0, and bits
of cbea5a1149ca7fd4b7cdbfa3ec2e4f109b601ff7.

First of all, with ARM borrowing x86's implementation, the logic to set the
pool size should have been common, not duplicated.  Introduce
libxl__domain_set_p2m_pool_size() as a shared implementation, and use it from
the ARM and x86 paths.  It is left as an exercise to the reader to judge how
libxl/xl can reasonably function without the ability to query the pool size...

Remove ARM's p2m_domctl() infrastructure now the functioanlity has been
replaced with a working and unit tested interface.

This is part of XSA-409 / CVE-2022-33747.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Henry Wang <Henry.Wang@arm.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Henry Wang <Henry.Wang@arm.com>
CC: Anthony PERARD <anthony.perard@citrix.com>

v2:
 * s/p2m/paging/
 * Fix get/set typo in libxl__domain_set_p2m_pool_size()
---
 tools/libs/light/libxl_arm.c      | 14 +----------
 tools/libs/light/libxl_dom.c      | 19 ++++++++++++++
 tools/libs/light/libxl_internal.h |  3 +++
 tools/libs/light/libxl_x86.c      | 15 ++---------
 xen/arch/arm/domctl.c             | 53 ---------------------------------------
 xen/arch/arm/include/asm/p2m.h    |  1 -
 xen/arch/arm/p2m.c                |  8 ------
 7 files changed, 25 insertions(+), 88 deletions(-)

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 2a5e93c28403..2f5615263543 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -209,19 +209,7 @@ int libxl__arch_domain_create(libxl__gc *gc,
                               libxl__domain_build_state *state,
                               uint32_t domid)
 {
-    libxl_ctx *ctx = libxl__gc_owner(gc);
-    unsigned int shadow_mb = DIV_ROUNDUP(d_config->b_info.shadow_memkb, 1024);
-
-    int r = xc_shadow_control(ctx->xch, domid,
-                              XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION,
-                              &shadow_mb, 0);
-    if (r) {
-        LOGED(ERROR, domid,
-              "Failed to set %u MiB shadow allocation", shadow_mb);
-        return ERROR_FAIL;
-    }
-
-    return 0;
+    return libxl__domain_set_p2m_pool_size(gc, d_config, domid);
 }
 
 int libxl__arch_extra_memory(libxl__gc *gc,
diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
index 2abaab439c4f..f8f7b7e81837 100644
--- a/tools/libs/light/libxl_dom.c
+++ b/tools/libs/light/libxl_dom.c
@@ -1448,6 +1448,25 @@ int libxl_userdata_unlink(libxl_ctx *ctx, uint32_t domid,
     return rc;
 }
 
+int libxl__domain_set_p2m_pool_size(
+    libxl__gc *gc, libxl_domain_config *d_config, uint32_t domid)
+{
+    libxl_ctx *ctx = libxl__gc_owner(gc);
+    uint64_t shadow_mem;
+
+    shadow_mem = d_config->b_info.shadow_memkb;
+    shadow_mem <<= 10;
+
+    int r = xc_set_paging_mempool_size(ctx->xch, domid, shadow_mem);
+    if (r) {
+        LOGED(ERROR, domid,
+              "Failed to set paging mempool size to %"PRIu64"kB", shadow_mem);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index cb9e8b3b8b5a..f31164bc6c0d 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -4864,6 +4864,9 @@ int libxl__is_domid_recent(libxl__gc *gc, uint32_t domid, bool *recent);
 /* os-specific implementation of setresuid() */
 int libxl__setresuid(uid_t ruid, uid_t euid, uid_t suid);
 
+_hidden int libxl__domain_set_p2m_pool_size(
+    libxl__gc *gc, libxl_domain_config *d_config, uint32_t domid);
+
 #endif
 
 /*
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 7c5ee74443e5..99aba51d05df 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -538,20 +538,9 @@ int libxl__arch_domain_create(libxl__gc *gc,
         xc_domain_set_time_offset(ctx->xch, domid, rtc_timeoffset);
 
     if (d_config->b_info.type != LIBXL_DOMAIN_TYPE_PV) {
-        unsigned int shadow_mb = DIV_ROUNDUP(d_config->b_info.shadow_memkb,
-                                             1024);
-        int r = xc_shadow_control(ctx->xch, domid,
-                                  XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION,
-                                  &shadow_mb, 0);
-
-        if (r) {
-            LOGED(ERROR, domid,
-                  "Failed to set %u MiB %s allocation",
-                  shadow_mb,
-                  libxl_defbool_val(d_config->c_info.hap) ? "HAP" : "shadow");
-            ret = ERROR_FAIL;
+        ret = libxl__domain_set_p2m_pool_size(gc, d_config, domid);
+        if (ret)
             goto out;
-        }
     }
 
     if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV &&
diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index c8fdeb124084..1baf25c3d98b 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -47,64 +47,11 @@ static int handle_vuart_init(struct domain *d,
     return rc;
 }
 
-static long p2m_domctl(struct domain *d, struct xen_domctl_shadow_op *sc,
-                       XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
-{
-    long rc;
-    bool preempted = false;
-
-    if ( unlikely(d == current->domain) )
-    {
-        printk(XENLOG_ERR "Tried to do a p2m domctl op on itself.\n");
-        return -EINVAL;
-    }
-
-    if ( unlikely(d->is_dying) )
-    {
-        printk(XENLOG_ERR "Tried to do a p2m domctl op on dying domain %u\n",
-               d->domain_id);
-        return -EINVAL;
-    }
-
-    switch ( sc->op )
-    {
-    case XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION:
-    {
-        /* Allow and handle preemption */
-        spin_lock(&d->arch.paging.lock);
-        rc = p2m_set_allocation(d, sc->mb << (20 - PAGE_SHIFT), &preempted);
-        spin_unlock(&d->arch.paging.lock);
-
-        if ( preempted )
-            /* Not finished. Set up to re-run the call. */
-            rc = hypercall_create_continuation(__HYPERVISOR_domctl, "h",
-                                               u_domctl);
-        else
-            /* Finished. Return the new allocation. */
-            sc->mb = p2m_get_allocation(d);
-
-        return rc;
-    }
-    case XEN_DOMCTL_SHADOW_OP_GET_ALLOCATION:
-    {
-        sc->mb = p2m_get_allocation(d);
-        return 0;
-    }
-    default:
-    {
-        printk(XENLOG_ERR "Bad p2m domctl op %u\n", sc->op);
-        return -EINVAL;
-    }
-    }
-}
-
 long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 {
     switch ( domctl->cmd )
     {
-    case XEN_DOMCTL_shadow_op:
-        return p2m_domctl(d, &domctl->u.shadow_op, u_domctl);
     case XEN_DOMCTL_cacheflush:
     {
         gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
diff --git a/xen/arch/arm/include/asm/p2m.h b/xen/arch/arm/include/asm/p2m.h
index c8f14d13c2c5..91df922e1c9f 100644
--- a/xen/arch/arm/include/asm/p2m.h
+++ b/xen/arch/arm/include/asm/p2m.h
@@ -222,7 +222,6 @@ void p2m_restore_state(struct vcpu *n);
 /* Print debugging/statistial info about a domain's p2m */
 void p2m_dump_info(struct domain *d);
 
-unsigned int p2m_get_allocation(struct domain *d);
 int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted);
 int p2m_teardown_allocation(struct domain *d);
 
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 8c1972e58227..b2f7e8d804aa 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -92,14 +92,6 @@ static void p2m_free_page(struct domain *d, struct page_info *pg)
     spin_unlock(&d->arch.paging.lock);
 }
 
-/* Return the size of the pool, rounded up to the nearest MB */
-unsigned int p2m_get_allocation(struct domain *d)
-{
-    unsigned long nr_pages = ACCESS_ONCE(d->arch.paging.p2m_total_pages);
-
-    return ROUNDUP(nr_pages, 1 << (20 - PAGE_SHIFT)) >> (20 - PAGE_SHIFT);
-}
-
 /* Return the size of the pool, in bytes. */
 int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
 {
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/4] xen/arm: Correct the p2m pool size calculations
  2022-11-17  1:08 [PATCH for-4.17 0/4] XSA-409 fixes Andrew Cooper
                   ` (2 preceding siblings ...)
  2022-11-17  1:08 ` [PATCH 3/4] xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls Andrew Cooper
@ 2022-11-17  1:08 ` Andrew Cooper
  3 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2022-11-17  1:08 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

Allocating or freeing p2m pages doesn't alter the size of the mempool; only
the split between free and used pages.

Right now, the hypercalls operate on the free subset of the pool, meaning that
XEN_DOMCTL_get_paging_mempool_size varies with time as the guest shuffles its
physmap, and XEN_DOMCTL_set_paging_mempool_size ignores the used subset of the
pool and lets the guest grow unbounded.

This fixes test-pagign-mempool on ARM so that the behaviour matches x86.

This is part of XSA-409 / CVE-2022-33747.

Fixes: cbea5a1149ca ("xen/arm: Allocate and free P2M pages from the P2M pool")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Release-acked-by: Henry Wang <Henry.Wang@arm.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Henry Wang <Henry.Wang@arm.com>
CC: Anthony PERARD <anthony.perard@citrix.com>
---
 xen/arch/arm/p2m.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index b2f7e8d804aa..9bc5443d9e8a 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -72,7 +72,6 @@ static struct page_info *p2m_alloc_page(struct domain *d)
             spin_unlock(&d->arch.paging.lock);
             return NULL;
         }
-        d->arch.paging.p2m_total_pages--;
     }
     spin_unlock(&d->arch.paging.lock);
 
@@ -85,10 +84,7 @@ static void p2m_free_page(struct domain *d, struct page_info *pg)
     if ( is_hardware_domain(d) )
         free_domheap_page(pg);
     else
-    {
-        d->arch.paging.p2m_total_pages++;
         page_list_add_tail(pg, &d->arch.paging.p2m_freelist);
-    }
     spin_unlock(&d->arch.paging.lock);
 }
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size
  2022-11-17  1:08 ` [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size Andrew Cooper
@ 2022-11-17  2:08   ` Stefano Stabellini
  2022-11-17 10:18   ` Jan Beulich
  2022-11-17 14:10   ` Anthony PERARD
  2 siblings, 0 replies; 15+ messages in thread
From: Stefano Stabellini @ 2022-11-17  2:08 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

[-- Attachment #1: Type: text/plain, Size: 15239 bytes --]

On Thu, 17 Nov 2022, Andrew Cooper wrote:
> The existing XEN_DOMCTL_SHADOW_OP_{GET,SET}_ALLOCATION have problems:
> 
>  * All set_allocation() flavours have an overflow-before-widen bug when
>    calculating "sc->mb << (20 - PAGE_SHIFT)".
>  * All flavours have a granularity of 1M.  This was tolerable when the size of
>    the pool could only be set at the same granularity, but is broken now that
>    ARM has a 16-page stopgap allocation in use.
>  * All get_allocation() flavours round up, and in particular turn 0 into 1,
>    meaning the get op returns junk before a successful set op.
>  * The x86 flavours reject the hypercalls before the VM has vCPUs allocated,
>    despite the pool size being a domain property.
>  * Even the hypercall names are long-obsolete.
> 
> Implement a better interface, which can be first used to unit test the
> behaviour, and subsequently correct a broken implementation.  The old
> interface will be retired in due course.
> 
> The unit of bytes (as opposed pages) is a deliberate API/ABI improvement to
> more easily support multiple page granularities.
> 
> This is part of XSA-409 / CVE-2022-33747.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Henry Wang <Henry.Wang@arm.com>

ARM side:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien@xen.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Henry Wang <Henry.Wang@arm.com>
> CC: Anthony PERARD <anthony.perard@citrix.com>
> 
> v2:
>  * s/p2m/paging/
>  * Fix overflow-before-widen in ARM's arch_get_p2m_mempool_size()
>  * Fix overflow-before-widen in both {hap,shadow}_get_allocation_bytes()
>  * Leave a TODO about x86/PV, drop assertion.
>  * Check for long->int truncation in x86's arch_set_paging_mempool_size()
> 
> Future TODOs:
>  * x86 shadow still rounds up.  This is buggy as it's a simultaneous equation
>    with tot_pages which varies over time with ballooning.
>  * x86 PV is weird.  There are no toolstack interact with the shadow pool
>    size, but the "shadow" pool it does come into existence when logdirty (or
>    pv-l1tf) when first enabled.
>  * The shadow+hap logic is in desperate need of deduping.
> ---
>  tools/include/xenctrl.h           |  3 +++
>  tools/libs/ctrl/xc_domain.c       | 29 ++++++++++++++++++++++++++
>  xen/arch/arm/p2m.c                | 26 +++++++++++++++++++++++
>  xen/arch/x86/include/asm/hap.h    |  1 +
>  xen/arch/x86/include/asm/shadow.h |  4 ++++
>  xen/arch/x86/mm/hap/hap.c         | 11 ++++++++++
>  xen/arch/x86/mm/paging.c          | 43 +++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/mm/shadow/common.c   | 11 ++++++++++
>  xen/common/domctl.c               | 14 +++++++++++++
>  xen/include/public/domctl.h       | 24 +++++++++++++++++++++-
>  xen/include/xen/domain.h          |  3 +++
>  11 files changed, 168 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
> index 0c8b4c3aa7a5..23037874d3d5 100644
> --- a/tools/include/xenctrl.h
> +++ b/tools/include/xenctrl.h
> @@ -893,6 +893,9 @@ long long xc_logdirty_control(xc_interface *xch,
>                                unsigned int mode,
>                                xc_shadow_op_stats_t *stats);
>  
> +int xc_get_paging_mempool_size(xc_interface *xch, uint32_t domid, uint64_t *size);
> +int xc_set_paging_mempool_size(xc_interface *xch, uint32_t domid, uint64_t size);
> +
>  int xc_sched_credit_domain_set(xc_interface *xch,
>                                 uint32_t domid,
>                                 struct xen_domctl_sched_credit *sdom);
> diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
> index 14c0420c35be..e939d0715739 100644
> --- a/tools/libs/ctrl/xc_domain.c
> +++ b/tools/libs/ctrl/xc_domain.c
> @@ -706,6 +706,35 @@ long long xc_logdirty_control(xc_interface *xch,
>      return (rc == 0) ? domctl.u.shadow_op.pages : rc;
>  }
>  
> +int xc_get_paging_mempool_size(xc_interface *xch, uint32_t domid, uint64_t *size)
> +{
> +    int rc;
> +    struct xen_domctl domctl = {
> +        .cmd         = XEN_DOMCTL_get_paging_mempool_size,
> +        .domain      = domid,
> +    };
> +
> +    rc = do_domctl(xch, &domctl);
> +    if ( rc )
> +        return rc;
> +
> +    *size = domctl.u.paging_mempool.size;
> +    return 0;
> +}
> +
> +int xc_set_paging_mempool_size(xc_interface *xch, uint32_t domid, uint64_t size)
> +{
> +    struct xen_domctl domctl = {
> +        .cmd         = XEN_DOMCTL_set_paging_mempool_size,
> +        .domain      = domid,
> +        .u.paging_mempool = {
> +            .size = size,
> +        },
> +    };
> +
> +    return do_domctl(xch, &domctl);
> +}
> +
>  int xc_domain_setmaxmem(xc_interface *xch,
>                          uint32_t domid,
>                          uint64_t max_memkb)
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 94d3b60b1387..8c1972e58227 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -100,6 +100,13 @@ unsigned int p2m_get_allocation(struct domain *d)
>      return ROUNDUP(nr_pages, 1 << (20 - PAGE_SHIFT)) >> (20 - PAGE_SHIFT);
>  }
>  
> +/* Return the size of the pool, in bytes. */
> +int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
> +{
> +    *size = (uint64_t)ACCESS_ONCE(d->arch.paging.p2m_total_pages) << PAGE_SHIFT;
> +    return 0;
> +}
> +
>  /*
>   * Set the pool of pages to the required number of pages.
>   * Returns 0 for success, non-zero for failure.
> @@ -157,6 +164,25 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>      return 0;
>  }
>  
> +int arch_set_paging_mempool_size(struct domain *d, uint64_t size)
> +{
> +    unsigned long pages = size >> PAGE_SHIFT;
> +    bool preempted = false;
> +    int rc;
> +
> +    if ( (size & ~PAGE_MASK) ||          /* Non page-sized request? */
> +         pages != (size >> PAGE_SHIFT) ) /* 32-bit overflow? */
> +        return -EINVAL;
> +
> +    spin_lock(&d->arch.paging.lock);
> +    rc = p2m_set_allocation(d, pages, &preempted);
> +    spin_unlock(&d->arch.paging.lock);
> +
> +    ASSERT(preempted == (rc == -ERESTART));
> +
> +    return rc;
> +}
> +
>  int p2m_teardown_allocation(struct domain *d)
>  {
>      int ret = 0;
> diff --git a/xen/arch/x86/include/asm/hap.h b/xen/arch/x86/include/asm/hap.h
> index 90dece29deca..14d2f212dab9 100644
> --- a/xen/arch/x86/include/asm/hap.h
> +++ b/xen/arch/x86/include/asm/hap.h
> @@ -47,6 +47,7 @@ int   hap_track_dirty_vram(struct domain *d,
>  extern const struct paging_mode *hap_paging_get_mode(struct vcpu *);
>  int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted);
>  unsigned int hap_get_allocation(struct domain *d);
> +int hap_get_allocation_bytes(struct domain *d, uint64_t *size);
>  
>  #endif /* XEN_HAP_H */
>  
> diff --git a/xen/arch/x86/include/asm/shadow.h b/xen/arch/x86/include/asm/shadow.h
> index 1365fe480518..dad876d29499 100644
> --- a/xen/arch/x86/include/asm/shadow.h
> +++ b/xen/arch/x86/include/asm/shadow.h
> @@ -97,6 +97,8 @@ void shadow_blow_tables_per_domain(struct domain *d);
>  int shadow_set_allocation(struct domain *d, unsigned int pages,
>                            bool *preempted);
>  
> +int shadow_get_allocation_bytes(struct domain *d, uint64_t *size);
> +
>  #else /* !CONFIG_SHADOW_PAGING */
>  
>  #define shadow_vcpu_teardown(v) ASSERT(is_pv_vcpu(v))
> @@ -108,6 +110,8 @@ int shadow_set_allocation(struct domain *d, unsigned int pages,
>      ({ ASSERT_UNREACHABLE(); -EOPNOTSUPP; })
>  #define shadow_set_allocation(d, pages, preempted) \
>      ({ ASSERT_UNREACHABLE(); -EOPNOTSUPP; })
> +#define shadow_get_allocation_bytes(d, size) \
> +    ({ ASSERT_UNREACHABLE(); -EOPNOTSUPP; })
>  
>  static inline void sh_remove_shadows(struct domain *d, mfn_t gmfn,
>                                       int fast, int all) {}
> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> index f809ea9aa6ae..0fc1b1d9aced 100644
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -345,6 +345,17 @@ unsigned int hap_get_allocation(struct domain *d)
>              + ((pg & ((1 << (20 - PAGE_SHIFT)) - 1)) ? 1 : 0));
>  }
>  
> +int hap_get_allocation_bytes(struct domain *d, uint64_t *size)
> +{
> +    unsigned long pages = d->arch.paging.hap.total_pages;
> +
> +    pages += d->arch.paging.hap.p2m_pages;
> +
> +    *size = pages << PAGE_SHIFT;
> +
> +    return 0;
> +}
> +
>  /* Set the pool of pages to the required number of pages.
>   * Returns 0 for success, non-zero for failure. */
>  int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted)
> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
> index 3a355eee9ca3..8d579fa9a3e8 100644
> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -977,6 +977,49 @@ int __init paging_set_allocation(struct domain *d, unsigned int pages,
>  }
>  #endif
>  
> +int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
> +{
> +    int rc;
> +
> +    if ( is_pv_domain(d) )                 /* TODO: Relax in due course */
> +        return -EOPNOTSUPP;
> +
> +    if ( hap_enabled(d) )
> +        rc = hap_get_allocation_bytes(d, size);
> +    else
> +        rc = shadow_get_allocation_bytes(d, size);
> +
> +    return rc;
> +}
> +
> +int arch_set_paging_mempool_size(struct domain *d, uint64_t size)
> +{
> +    unsigned long pages = size >> PAGE_SHIFT;
> +    bool preempted = false;
> +    int rc;
> +
> +    if ( is_pv_domain(d) )                 /* TODO: Relax in due course */
> +        return -EOPNOTSUPP;
> +
> +    if ( size & ~PAGE_MASK ||              /* Non page-sized request? */
> +         pages != (unsigned int)pages )    /* Overflow $X_set_allocation()? */
> +        return -EINVAL;
> +
> +    paging_lock(d);
> +    if ( hap_enabled(d) )
> +        rc = hap_set_allocation(d, pages, &preempted);
> +    else
> +        rc = shadow_set_allocation(d, pages, &preempted);
> +    paging_unlock(d);
> +
> +    /*
> +     * TODO: Adjust $X_set_allocation() so this is true.
> +    ASSERT(preempted == (rc == -ERESTART));
> +     */
> +
> +    return preempted ? -ERESTART : rc;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index badfd53c6b23..a8404f97f668 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -1427,6 +1427,17 @@ static unsigned int shadow_get_allocation(struct domain *d)
>              + ((pg & ((1 << (20 - PAGE_SHIFT)) - 1)) ? 1 : 0));
>  }
>  
> +int shadow_get_allocation_bytes(struct domain *d, uint64_t *size)
> +{
> +    unsigned long pages = d->arch.paging.shadow.total_pages;
> +
> +    pages += d->arch.paging.shadow.p2m_pages;
> +
> +    *size = pages << PAGE_SHIFT;
> +
> +    return 0;
> +}
> +
>  /**************************************************************************/
>  /* Hash table for storing the guest->shadow mappings.
>   * The table itself is an array of pointers to shadows; the shadows are then
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 69fb9abd346f..ad71ad8a4cc5 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -874,6 +874,20 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>          ret = iommu_do_domctl(op, d, u_domctl);
>          break;
>  
> +    case XEN_DOMCTL_get_paging_mempool_size:
> +        ret = arch_get_paging_mempool_size(d, &op->u.paging_mempool.size);
> +        if ( !ret )
> +            copyback = 1;
> +        break;
> +
> +    case XEN_DOMCTL_set_paging_mempool_size:
> +        ret = arch_set_paging_mempool_size(d, op->u.paging_mempool.size);
> +
> +        if ( ret == -ERESTART )
> +            ret = hypercall_create_continuation(
> +                __HYPERVISOR_domctl, "h", u_domctl);
> +        break;
> +
>      default:
>          ret = arch_do_domctl(op, d, u_domctl);
>          break;
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index b2ae839c3632..d4072761791a 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -214,7 +214,10 @@ struct xen_domctl_getpageframeinfo3 {
>   /* Return the bitmap but do not modify internal copy. */
>  #define XEN_DOMCTL_SHADOW_OP_PEEK        12
>  
> -/* Memory allocation accessors. */
> +/*
> + * Memory allocation accessors.  These APIs are broken and will be removed.
> + * Use XEN_DOMCTL_{get,set}_paging_mempool_size instead.
> + */
>  #define XEN_DOMCTL_SHADOW_OP_GET_ALLOCATION   30
>  #define XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION   31
>  
> @@ -946,6 +949,22 @@ struct xen_domctl_cacheflush {
>      xen_pfn_t start_pfn, nr_pfns;
>  };
>  
> +/*
> + * XEN_DOMCTL_get_paging_mempool_size / XEN_DOMCTL_set_paging_mempool_size.
> + *
> + * Get or set the paging memory pool size.  The size is in bytes.
> + *
> + * This is a dedicated pool of memory for Xen to use while managing the guest,
> + * typically containing pagetables.  As such, there is an implementation
> + * specific minimum granularity.
> + *
> + * The set operation can fail mid-way through the request (e.g. Xen running
> + * out of memory, no free memory to reclaim from the pool, etc.).
> + */
> +struct xen_domctl_paging_mempool {
> +    uint64_aligned_t size; /* IN/OUT.  Size in bytes. */
> +};
> +
>  #if defined(__i386__) || defined(__x86_64__)
>  struct xen_domctl_vcpu_msr {
>      uint32_t         index;
> @@ -1274,6 +1293,8 @@ struct xen_domctl {
>  #define XEN_DOMCTL_get_cpu_policy                82
>  #define XEN_DOMCTL_set_cpu_policy                83
>  #define XEN_DOMCTL_vmtrace_op                    84
> +#define XEN_DOMCTL_get_paging_mempool_size       85
> +#define XEN_DOMCTL_set_paging_mempool_size       86
>  #define XEN_DOMCTL_gdbsx_guestmemio            1000
>  #define XEN_DOMCTL_gdbsx_pausevcpu             1001
>  #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
> @@ -1335,6 +1356,7 @@ struct xen_domctl {
>          struct xen_domctl_psr_alloc         psr_alloc;
>          struct xen_domctl_vuart_op          vuart_op;
>          struct xen_domctl_vmtrace_op        vmtrace_op;
> +        struct xen_domctl_paging_mempool    paging_mempool;
>          uint8_t                             pad[128];
>      } u;
>  };
> diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
> index 2c8116afba27..0de9cbc1696d 100644
> --- a/xen/include/xen/domain.h
> +++ b/xen/include/xen/domain.h
> @@ -98,6 +98,9 @@ void arch_get_info_guest(struct vcpu *, vcpu_guest_context_u);
>  int arch_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg);
>  int default_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg);
>  
> +int arch_get_paging_mempool_size(struct domain *d, uint64_t *size /* bytes */);
> +int arch_set_paging_mempool_size(struct domain *d, uint64_t size /* bytes */);
> +
>  int domain_relinquish_resources(struct domain *d);
>  
>  void dump_pageframe_info(struct domain *d);
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls
  2022-11-17  1:08 ` [PATCH 3/4] xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls Andrew Cooper
@ 2022-11-17  2:12   ` Stefano Stabellini
  2022-11-17 14:07   ` Anthony PERARD
  1 sibling, 0 replies; 15+ messages in thread
From: Stefano Stabellini @ 2022-11-17  2:12 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

[-- Attachment #1: Type: text/plain, Size: 8910 bytes --]

On Thu, 17 Nov 2022, Andrew Cooper wrote:
> This reverts most of commit cf2a68d2ffbc3ce95e01449d46180bddb10d24a0, and bits
> of cbea5a1149ca7fd4b7cdbfa3ec2e4f109b601ff7.
> 
> First of all, with ARM borrowing x86's implementation, the logic to set the
> pool size should have been common, not duplicated.  Introduce
> libxl__domain_set_p2m_pool_size() as a shared implementation, and use it from
> the ARM and x86 paths.  It is left as an exercise to the reader to judge how
> libxl/xl can reasonably function without the ability to query the pool size...
> 
> Remove ARM's p2m_domctl() infrastructure now the functioanlity has been
> replaced with a working and unit tested interface.
> 
> This is part of XSA-409 / CVE-2022-33747.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien@xen.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Henry Wang <Henry.Wang@arm.com>
> CC: Anthony PERARD <anthony.perard@citrix.com>
> 
> v2:
>  * s/p2m/paging/
>  * Fix get/set typo in libxl__domain_set_p2m_pool_size()
> ---
>  tools/libs/light/libxl_arm.c      | 14 +----------
>  tools/libs/light/libxl_dom.c      | 19 ++++++++++++++
>  tools/libs/light/libxl_internal.h |  3 +++
>  tools/libs/light/libxl_x86.c      | 15 ++---------
>  xen/arch/arm/domctl.c             | 53 ---------------------------------------
>  xen/arch/arm/include/asm/p2m.h    |  1 -
>  xen/arch/arm/p2m.c                |  8 ------
>  7 files changed, 25 insertions(+), 88 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index 2a5e93c28403..2f5615263543 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -209,19 +209,7 @@ int libxl__arch_domain_create(libxl__gc *gc,
>                                libxl__domain_build_state *state,
>                                uint32_t domid)
>  {
> -    libxl_ctx *ctx = libxl__gc_owner(gc);
> -    unsigned int shadow_mb = DIV_ROUNDUP(d_config->b_info.shadow_memkb, 1024);
> -
> -    int r = xc_shadow_control(ctx->xch, domid,
> -                              XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION,
> -                              &shadow_mb, 0);
> -    if (r) {
> -        LOGED(ERROR, domid,
> -              "Failed to set %u MiB shadow allocation", shadow_mb);
> -        return ERROR_FAIL;
> -    }
> -
> -    return 0;
> +    return libxl__domain_set_p2m_pool_size(gc, d_config, domid);
>  }
>  
>  int libxl__arch_extra_memory(libxl__gc *gc,
> diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
> index 2abaab439c4f..f8f7b7e81837 100644
> --- a/tools/libs/light/libxl_dom.c
> +++ b/tools/libs/light/libxl_dom.c
> @@ -1448,6 +1448,25 @@ int libxl_userdata_unlink(libxl_ctx *ctx, uint32_t domid,
>      return rc;
>  }
>  
> +int libxl__domain_set_p2m_pool_size(
> +    libxl__gc *gc, libxl_domain_config *d_config, uint32_t domid)
> +{
> +    libxl_ctx *ctx = libxl__gc_owner(gc);
> +    uint64_t shadow_mem;
> +
> +    shadow_mem = d_config->b_info.shadow_memkb;
> +    shadow_mem <<= 10;
> +
> +    int r = xc_set_paging_mempool_size(ctx->xch, domid, shadow_mem);
> +    if (r) {
> +        LOGED(ERROR, domid,
> +              "Failed to set paging mempool size to %"PRIu64"kB", shadow_mem);
> +        return ERROR_FAIL;
> +    }
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
> index cb9e8b3b8b5a..f31164bc6c0d 100644
> --- a/tools/libs/light/libxl_internal.h
> +++ b/tools/libs/light/libxl_internal.h
> @@ -4864,6 +4864,9 @@ int libxl__is_domid_recent(libxl__gc *gc, uint32_t domid, bool *recent);
>  /* os-specific implementation of setresuid() */
>  int libxl__setresuid(uid_t ruid, uid_t euid, uid_t suid);
>  
> +_hidden int libxl__domain_set_p2m_pool_size(
> +    libxl__gc *gc, libxl_domain_config *d_config, uint32_t domid);
> +
>  #endif
>  
>  /*
> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index 7c5ee74443e5..99aba51d05df 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> @@ -538,20 +538,9 @@ int libxl__arch_domain_create(libxl__gc *gc,
>          xc_domain_set_time_offset(ctx->xch, domid, rtc_timeoffset);
>  
>      if (d_config->b_info.type != LIBXL_DOMAIN_TYPE_PV) {
> -        unsigned int shadow_mb = DIV_ROUNDUP(d_config->b_info.shadow_memkb,
> -                                             1024);
> -        int r = xc_shadow_control(ctx->xch, domid,
> -                                  XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION,
> -                                  &shadow_mb, 0);
> -
> -        if (r) {
> -            LOGED(ERROR, domid,
> -                  "Failed to set %u MiB %s allocation",
> -                  shadow_mb,
> -                  libxl_defbool_val(d_config->c_info.hap) ? "HAP" : "shadow");
> -            ret = ERROR_FAIL;
> +        ret = libxl__domain_set_p2m_pool_size(gc, d_config, domid);
> +        if (ret)
>              goto out;
> -        }
>      }
>  
>      if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV &&
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index c8fdeb124084..1baf25c3d98b 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -47,64 +47,11 @@ static int handle_vuart_init(struct domain *d,
>      return rc;
>  }
>  
> -static long p2m_domctl(struct domain *d, struct xen_domctl_shadow_op *sc,
> -                       XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> -{
> -    long rc;
> -    bool preempted = false;
> -
> -    if ( unlikely(d == current->domain) )
> -    {
> -        printk(XENLOG_ERR "Tried to do a p2m domctl op on itself.\n");
> -        return -EINVAL;
> -    }
> -
> -    if ( unlikely(d->is_dying) )
> -    {
> -        printk(XENLOG_ERR "Tried to do a p2m domctl op on dying domain %u\n",
> -               d->domain_id);
> -        return -EINVAL;
> -    }
> -
> -    switch ( sc->op )
> -    {
> -    case XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION:
> -    {
> -        /* Allow and handle preemption */
> -        spin_lock(&d->arch.paging.lock);
> -        rc = p2m_set_allocation(d, sc->mb << (20 - PAGE_SHIFT), &preempted);
> -        spin_unlock(&d->arch.paging.lock);
> -
> -        if ( preempted )
> -            /* Not finished. Set up to re-run the call. */
> -            rc = hypercall_create_continuation(__HYPERVISOR_domctl, "h",
> -                                               u_domctl);
> -        else
> -            /* Finished. Return the new allocation. */
> -            sc->mb = p2m_get_allocation(d);
> -
> -        return rc;
> -    }
> -    case XEN_DOMCTL_SHADOW_OP_GET_ALLOCATION:
> -    {
> -        sc->mb = p2m_get_allocation(d);
> -        return 0;
> -    }
> -    default:
> -    {
> -        printk(XENLOG_ERR "Bad p2m domctl op %u\n", sc->op);
> -        return -EINVAL;
> -    }
> -    }
> -}
> -
>  long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>                      XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>  {
>      switch ( domctl->cmd )
>      {
> -    case XEN_DOMCTL_shadow_op:
> -        return p2m_domctl(d, &domctl->u.shadow_op, u_domctl);
>      case XEN_DOMCTL_cacheflush:
>      {
>          gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
> diff --git a/xen/arch/arm/include/asm/p2m.h b/xen/arch/arm/include/asm/p2m.h
> index c8f14d13c2c5..91df922e1c9f 100644
> --- a/xen/arch/arm/include/asm/p2m.h
> +++ b/xen/arch/arm/include/asm/p2m.h
> @@ -222,7 +222,6 @@ void p2m_restore_state(struct vcpu *n);
>  /* Print debugging/statistial info about a domain's p2m */
>  void p2m_dump_info(struct domain *d);
>  
> -unsigned int p2m_get_allocation(struct domain *d);
>  int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted);
>  int p2m_teardown_allocation(struct domain *d);
>  
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 8c1972e58227..b2f7e8d804aa 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -92,14 +92,6 @@ static void p2m_free_page(struct domain *d, struct page_info *pg)
>      spin_unlock(&d->arch.paging.lock);
>  }
>  
> -/* Return the size of the pool, rounded up to the nearest MB */
> -unsigned int p2m_get_allocation(struct domain *d)
> -{
> -    unsigned long nr_pages = ACCESS_ONCE(d->arch.paging.p2m_total_pages);
> -
> -    return ROUNDUP(nr_pages, 1 << (20 - PAGE_SHIFT)) >> (20 - PAGE_SHIFT);
> -}
> -
>  /* Return the size of the pool, in bytes. */
>  int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
>  {
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size
  2022-11-17  1:08 ` [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size Andrew Cooper
  2022-11-17  2:08   ` Stefano Stabellini
@ 2022-11-17 10:18   ` Jan Beulich
  2022-11-17 15:51     ` Andrew Cooper
  2022-11-17 14:10   ` Anthony PERARD
  2 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2022-11-17 10:18 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

On 17.11.2022 02:08, Andrew Cooper wrote:
> The existing XEN_DOMCTL_SHADOW_OP_{GET,SET}_ALLOCATION have problems:
> 
>  * All set_allocation() flavours have an overflow-before-widen bug when
>    calculating "sc->mb << (20 - PAGE_SHIFT)".
>  * All flavours have a granularity of 1M.  This was tolerable when the size of
>    the pool could only be set at the same granularity, but is broken now that
>    ARM has a 16-page stopgap allocation in use.
>  * All get_allocation() flavours round up, and in particular turn 0 into 1,
>    meaning the get op returns junk before a successful set op.
>  * The x86 flavours reject the hypercalls before the VM has vCPUs allocated,
>    despite the pool size being a domain property.
>  * Even the hypercall names are long-obsolete.
> 
> Implement a better interface, which can be first used to unit test the
> behaviour, and subsequently correct a broken implementation.  The old
> interface will be retired in due course.
> 
> The unit of bytes (as opposed pages) is a deliberate API/ABI improvement to
> more easily support multiple page granularities.
> 
> This is part of XSA-409 / CVE-2022-33747.

While I'm not convinced of this attribution, ...

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com> # hypervisor
albeit with remarks:

> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -977,6 +977,49 @@ int __init paging_set_allocation(struct domain *d, unsigned int pages,
>  }
>  #endif
>  
> +int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
> +{
> +    int rc;
> +
> +    if ( is_pv_domain(d) )                 /* TODO: Relax in due course */
> +        return -EOPNOTSUPP;

I guess this is merely for symmetry with ...

> +int arch_set_paging_mempool_size(struct domain *d, uint64_t size)
> +{
> +    unsigned long pages = size >> PAGE_SHIFT;
> +    bool preempted = false;
> +    int rc;
> +
> +    if ( is_pv_domain(d) )                 /* TODO: Relax in due course */
> +        return -EOPNOTSUPP;

... this, since otherwise "get" ought to be fine for PV?

> @@ -946,6 +949,22 @@ struct xen_domctl_cacheflush {
>      xen_pfn_t start_pfn, nr_pfns;
>  };
>  
> +/*
> + * XEN_DOMCTL_get_paging_mempool_size / XEN_DOMCTL_set_paging_mempool_size.
> + *
> + * Get or set the paging memory pool size.  The size is in bytes.
> + *
> + * This is a dedicated pool of memory for Xen to use while managing the guest,
> + * typically containing pagetables.  As such, there is an implementation
> + * specific minimum granularity.
> + *
> + * The set operation can fail mid-way through the request (e.g. Xen running
> + * out of memory, no free memory to reclaim from the pool, etc.).
> + */
> +struct xen_domctl_paging_mempool {
> +    uint64_aligned_t size; /* IN/OUT.  Size in bytes. */

While likely people will correctly infer what is meant, strictly speaking
this is wrong: The field is IN for "set" and OUT for "get".

Jan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] tools/tests: Unit test for paging mempool size
  2022-11-17  1:08 ` [PATCH 2/4] tools/tests: Unit test for " Andrew Cooper
@ 2022-11-17 10:39   ` Jan Beulich
  2022-11-17 16:27     ` Andrew Cooper
  2022-11-17 14:20   ` Anthony PERARD
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2022-11-17 10:39 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD, Xen-devel

On 17.11.2022 02:08, Andrew Cooper wrote:
> Exercise some basic functionality of the new
> xc_{get,set}_paging_mempool_size() hypercalls.
> 
> This passes on x86, but fails currently on ARM.  ARM will be fixed up in
> future patches.
> 
> This is part of XSA-409 / CVE-2022-33747.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
(if this counts anything, since as it stands the new stuff all falls
under tool stack maintainership)

> --- /dev/null
> +++ b/tools/tests/paging-mempool/test-paging-mempool.c
> @@ -0,0 +1,181 @@
> +#include <err.h>
> +#include <errno.h>
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <sys/mman.h>
> +
> +#include <xenctrl.h>
> +#include <xenforeignmemory.h>
> +#include <xengnttab.h>
> +#include <xen-tools/libs.h>
> +
> +static unsigned int nr_failures;
> +#define fail(fmt, ...)                          \
> +({                                              \
> +    nr_failures++;                              \
> +    (void)printf(fmt, ##__VA_ARGS__);           \
> +})
> +
> +static xc_interface *xch;
> +static uint32_t domid;
> +
> +static struct xen_domctl_createdomain create = {
> +    .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,

I understand that it is accepted that this test will thus fail when run
on HAP-incapable hardware (including when run with Xen itself running on
top of another hypervisor not surfacing HAP capabilities)? Oh, I notice
you're actually translating EINVAL and EOPNOTSUPP failures into "skip".
That'll probably do, albeit personally I consider skipping when EINVAL
(which we use all over the place) as a overly relaxed.

> +static void run_tests(void)
> +{
> +    xen_pfn_t physmap[] = { 0 };

I have to admit that I'm uncertain whether Arm (or other architectures
that Xen is being planned to be ported to) have constraints which may
cause populating of GFN 0 to fail.

> +    uint64_t size_bytes, old_size_bytes;
> +    int rc;
> +
> +    printf("Test default mempool size\n");
> +
> +    rc = xc_get_paging_mempool_size(xch, domid, &size_bytes);
> +    if ( rc )
> +        return fail("  Fail: get mempool size: %d - %s\n",
> +                    errno, strerror(errno));
> +
> +    printf("mempool size %"PRIu64" bytes (%"PRIu64"kB, %"PRIu64"MB)\n",
> +           size_bytes, size_bytes >> 10, size_bytes >> 20);
> +
> +
> +    /*
> +     * Check that the domain has the expected default allocation size.  This
> +     * will fail if the logic in Xen is altered without an equivelent

Nit: equivalent

> +     * adjustment here.
> +     */
> +    if ( size_bytes != default_mempool_size_bytes )
> +        return fail("  Fail: size %"PRIu64" != expected size %"PRIu64"\n",
> +                    size_bytes, default_mempool_size_bytes);
> +
> +
> +    printf("Test that allocate doesn't alter pool size\n");
> +
> +    /*
> +     * Populate the domain with some RAM.  This will cause more of the mempool
> +     * to be used.
> +     */
> +    old_size_bytes = size_bytes;
> +
> +    rc = xc_domain_setmaxmem(xch, domid, -1);
> +    if ( rc )
> +        return fail("  Fail: setmaxmem: : %d - %s\n",
> +                    errno, strerror(errno));
> +
> +    rc = xc_domain_populate_physmap_exact(xch, domid, 1, 0, 0, physmap);
> +    if ( rc )
> +        return fail("  Fail: populate physmap: %d - %s\n",
> +                    errno, strerror(errno));
> +
> +    /*
> +     * Re-get the p2m size.  Should not have changed as a consequence of
> +     * populate physmap.
> +     */
> +    rc = xc_get_paging_mempool_size(xch, domid, &size_bytes);
> +    if ( rc )
> +        return fail("  Fail: get mempool size: %d - %s\n",
> +                    errno, strerror(errno));
> +
> +    if ( old_size_bytes != size_bytes )
> +        return fail("  Fail: mempool size changed %"PRIu64" => %"PRIu64"\n",
> +                    old_size_bytes, size_bytes);
> +
> +
> +
> +    printf("Test bad set size\n");
> +
> +    /*
> +     * Check that setting a non-page size results in failure.
> +     */
> +    rc = xc_set_paging_mempool_size(xch, domid, size_bytes + 1);
> +    if ( rc != -1 || errno != EINVAL )
> +        return fail("  Fail: Bad set size: expected -1/EINVAL, got %d/%d - %s\n",
> +                    rc, errno, strerror(errno));
> +
> +
> +    printf("Test very large set size\n");

Maybe drop "very", as 64M isn't all that much (and would, in particular,
not expose any 32-bit truncation issues)?

Jan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls
  2022-11-17  1:08 ` [PATCH 3/4] xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls Andrew Cooper
  2022-11-17  2:12   ` Stefano Stabellini
@ 2022-11-17 14:07   ` Anthony PERARD
  1 sibling, 0 replies; 15+ messages in thread
From: Anthony PERARD @ 2022-11-17 14:07 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang

On Thu, Nov 17, 2022 at 01:08:03AM +0000, Andrew Cooper wrote:
> diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
> index 2abaab439c4f..f8f7b7e81837 100644
> --- a/tools/libs/light/libxl_dom.c
> +++ b/tools/libs/light/libxl_dom.c
> @@ -1448,6 +1448,25 @@ int libxl_userdata_unlink(libxl_ctx *ctx, uint32_t domid,
>      return rc;
>  }
>  
> +int libxl__domain_set_p2m_pool_size(
> +    libxl__gc *gc, libxl_domain_config *d_config, uint32_t domid)
> +{
> +    libxl_ctx *ctx = libxl__gc_owner(gc);
> +    uint64_t shadow_mem;
> +
> +    shadow_mem = d_config->b_info.shadow_memkb;
> +    shadow_mem <<= 10;
> +
> +    int r = xc_set_paging_mempool_size(ctx->xch, domid, shadow_mem);

Nit: 'ctx' could be replace by using the macro 'CTX', without having to
declare a local ctx variable.


With or without this changed, patch looks fine:
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size
  2022-11-17  1:08 ` [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size Andrew Cooper
  2022-11-17  2:08   ` Stefano Stabellini
  2022-11-17 10:18   ` Jan Beulich
@ 2022-11-17 14:10   ` Anthony PERARD
  2 siblings, 0 replies; 15+ messages in thread
From: Anthony PERARD @ 2022-11-17 14:10 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang

On Thu, Nov 17, 2022 at 01:08:01AM +0000, Andrew Cooper wrote:
> The existing XEN_DOMCTL_SHADOW_OP_{GET,SET}_ALLOCATION have problems:
> 
>  * All set_allocation() flavours have an overflow-before-widen bug when
>    calculating "sc->mb << (20 - PAGE_SHIFT)".
>  * All flavours have a granularity of 1M.  This was tolerable when the size of
>    the pool could only be set at the same granularity, but is broken now that
>    ARM has a 16-page stopgap allocation in use.
>  * All get_allocation() flavours round up, and in particular turn 0 into 1,
>    meaning the get op returns junk before a successful set op.
>  * The x86 flavours reject the hypercalls before the VM has vCPUs allocated,
>    despite the pool size being a domain property.
>  * Even the hypercall names are long-obsolete.
> 
> Implement a better interface, which can be first used to unit test the
> behaviour, and subsequently correct a broken implementation.  The old
> interface will be retired in due course.
> 
> The unit of bytes (as opposed pages) is a deliberate API/ABI improvement to
> more easily support multiple page granularities.
> 
> This is part of XSA-409 / CVE-2022-33747.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Acked-by: Anthony PERARD <anthony.perard@citrix.com>

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] tools/tests: Unit test for paging mempool size
  2022-11-17  1:08 ` [PATCH 2/4] tools/tests: Unit test for " Andrew Cooper
  2022-11-17 10:39   ` Jan Beulich
@ 2022-11-17 14:20   ` Anthony PERARD
  1 sibling, 0 replies; 15+ messages in thread
From: Anthony PERARD @ 2022-11-17 14:20 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Jan Beulich, Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang

On Thu, Nov 17, 2022 at 01:08:02AM +0000, Andrew Cooper wrote:
> Exercise some basic functionality of the new
> xc_{get,set}_paging_mempool_size() hypercalls.
> 
> This passes on x86, but fails currently on ARM.  ARM will be fixed up in
> future patches.
> 
> This is part of XSA-409 / CVE-2022-33747.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Acked-by: Anthony PERARD <anthony.perard@citrix.com>

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size
  2022-11-17 10:18   ` Jan Beulich
@ 2022-11-17 15:51     ` Andrew Cooper
  0 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2022-11-17 15:51 UTC (permalink / raw)
  To: Jan Beulich, Xen-devel
  Cc: Roger Pau Monne, Wei Liu, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Bertrand Marquis, Henry Wang, Anthony Perard

On 17/11/2022 10:18, Jan Beulich wrote:
> On 17.11.2022 02:08, Andrew Cooper wrote:
>> The existing XEN_DOMCTL_SHADOW_OP_{GET,SET}_ALLOCATION have problems:
>>
>>  * All set_allocation() flavours have an overflow-before-widen bug when
>>    calculating "sc->mb << (20 - PAGE_SHIFT)".
>>  * All flavours have a granularity of 1M.  This was tolerable when the size of
>>    the pool could only be set at the same granularity, but is broken now that
>>    ARM has a 16-page stopgap allocation in use.
>>  * All get_allocation() flavours round up, and in particular turn 0 into 1,
>>    meaning the get op returns junk before a successful set op.
>>  * The x86 flavours reject the hypercalls before the VM has vCPUs allocated,
>>    despite the pool size being a domain property.
>>  * Even the hypercall names are long-obsolete.
>>
>> Implement a better interface, which can be first used to unit test the
>> behaviour, and subsequently correct a broken implementation.  The old
>> interface will be retired in due course.
>>
>> The unit of bytes (as opposed pages) is a deliberate API/ABI improvement to
>> more easily support multiple page granularities.
>>
>> This is part of XSA-409 / CVE-2022-33747.
> While I'm not convinced of this attribution, ...

I think this very much depends on how critical the unit test is deemed.

If this was done the first time around, it would all have had
attribution.  We're on the 3rd set of fixes, and the unit test is a key
justification of the safety of the fix.

>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Release-acked-by: Henry Wang <Henry.Wang@arm.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com> # hypervisor

Thanks.

> albeit with remarks:
>
>> --- a/xen/arch/x86/mm/paging.c
>> +++ b/xen/arch/x86/mm/paging.c
>> @@ -977,6 +977,49 @@ int __init paging_set_allocation(struct domain *d, unsigned int pages,
>>  }
>>  #endif
>>  
>> +int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
>> +{
>> +    int rc;
>> +
>> +    if ( is_pv_domain(d) )                 /* TODO: Relax in due course */
>> +        return -EOPNOTSUPP;
> I guess this is merely for symmetry with ...
>
>> +int arch_set_paging_mempool_size(struct domain *d, uint64_t size)
>> +{
>> +    unsigned long pages = size >> PAGE_SHIFT;
>> +    bool preempted = false;
>> +    int rc;
>> +
>> +    if ( is_pv_domain(d) )                 /* TODO: Relax in due course */
>> +        return -EOPNOTSUPP;
> ... this, since otherwise "get" ought to be fine for PV?

Its the safest course of action, given other known issues with PV. 
There's no need for a working get without a working set.

>
>> @@ -946,6 +949,22 @@ struct xen_domctl_cacheflush {
>>      xen_pfn_t start_pfn, nr_pfns;
>>  };
>>  
>> +/*
>> + * XEN_DOMCTL_get_paging_mempool_size / XEN_DOMCTL_set_paging_mempool_size.
>> + *
>> + * Get or set the paging memory pool size.  The size is in bytes.
>> + *
>> + * This is a dedicated pool of memory for Xen to use while managing the guest,
>> + * typically containing pagetables.  As such, there is an implementation
>> + * specific minimum granularity.
>> + *
>> + * The set operation can fail mid-way through the request (e.g. Xen running
>> + * out of memory, no free memory to reclaim from the pool, etc.).
>> + */
>> +struct xen_domctl_paging_mempool {
>> +    uint64_aligned_t size; /* IN/OUT.  Size in bytes. */
> While likely people will correctly infer what is meant, strictly speaking
> this is wrong: The field is IN for "set" and OUT for "get".

I'll drop them, to reduce any possible confusion.  As you say, the
meaning is entirely clear.

~Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] tools/tests: Unit test for paging mempool size
  2022-11-17 10:39   ` Jan Beulich
@ 2022-11-17 16:27     ` Andrew Cooper
  0 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2022-11-17 16:27 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Roger Pau Monne, Wei Liu, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Bertrand Marquis, Henry Wang, Anthony Perard,
	Xen-devel

On 17/11/2022 10:39, Jan Beulich wrote:
> On 17.11.2022 02:08, Andrew Cooper wrote:
>> Exercise some basic functionality of the new
>> xc_{get,set}_paging_mempool_size() hypercalls.
>>
>> This passes on x86, but fails currently on ARM.  ARM will be fixed up in
>> future patches.
>>
>> This is part of XSA-409 / CVE-2022-33747.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Release-acked-by: Henry Wang <Henry.Wang@arm.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Thanks.

> (if this counts anything, since as it stands the new stuff all falls
> under tool stack maintainership)

I do intend to give it its own section in due course, but this falls
very much into "The Rest" at the moment.

>> --- /dev/null
>> +++ b/tools/tests/paging-mempool/test-paging-mempool.c
>> @@ -0,0 +1,181 @@
>> +#include <err.h>
>> +#include <errno.h>
>> +#include <inttypes.h>
>> +#include <stdio.h>
>> +#include <string.h>
>> +#include <sys/mman.h>
>> +
>> +#include <xenctrl.h>
>> +#include <xenforeignmemory.h>
>> +#include <xengnttab.h>
>> +#include <xen-tools/libs.h>
>> +
>> +static unsigned int nr_failures;
>> +#define fail(fmt, ...)                          \
>> +({                                              \
>> +    nr_failures++;                              \
>> +    (void)printf(fmt, ##__VA_ARGS__);           \
>> +})
>> +
>> +static xc_interface *xch;
>> +static uint32_t domid;
>> +
>> +static struct xen_domctl_createdomain create = {
>> +    .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> I understand that it is accepted that this test will thus fail when run
> on HAP-incapable hardware (including when run with Xen itself running on
> top of another hypervisor not surfacing HAP capabilities)? Oh, I notice
> you're actually translating EINVAL and EOPNOTSUPP failures into "skip".
> That'll probably do, albeit personally I consider skipping when EINVAL
> (which we use all over the place) as a overly relaxed.

Checking capabilities needs to happen anyway to get PV and HVM Shadow
support working.

But this will do for now.

>> +static void run_tests(void)
>> +{
>> +    xen_pfn_t physmap[] = { 0 };
> I have to admit that I'm uncertain whether Arm (or other architectures
> that Xen is being planned to be ported to) have constraints which may
> cause populating of GFN 0 to fail.

Mechanically, no.  x86 PV is an entirely arbitrary mapping, while all
HVM modes (x86 and ARM) are just filling in a guest physical -> "some
RAM" mapping in a pagetable structure.

Logically, I hope that CDF_directmap on ARM checks for an alias to a
real RAM block, but this capability isn't even exposed to the toolstack.


And now I've looked at this, ewwww.  We've got:

d->options which holds config->flags, DOMCTL_CDF_*
d->cdf which holds the local 'flags' parameter, CFD_*
d->is_privileged which holds 'flags & CDF_privileged' and was never
cleaned up when d->cdf was introduced.

This is unnecessarily confusing to follow.  Yet more for the cleanup pile.

>> +    uint64_t size_bytes, old_size_bytes;
>> +    int rc;
>> +
>> +    printf("Test default mempool size\n");
>> +
>> +    rc = xc_get_paging_mempool_size(xch, domid, &size_bytes);
>> +    if ( rc )
>> +        return fail("  Fail: get mempool size: %d - %s\n",
>> +                    errno, strerror(errno));
>> +
>> +    printf("mempool size %"PRIu64" bytes (%"PRIu64"kB, %"PRIu64"MB)\n",
>> +           size_bytes, size_bytes >> 10, size_bytes >> 20);
>> +
>> +
>> +    /*
>> +     * Check that the domain has the expected default allocation size.  This
>> +     * will fail if the logic in Xen is altered without an equivelent
> Nit: equivalent

Fixed.

>
>> +     * adjustment here.
>> +     */
>> +    if ( size_bytes != default_mempool_size_bytes )
>> +        return fail("  Fail: size %"PRIu64" != expected size %"PRIu64"\n",
>> +                    size_bytes, default_mempool_size_bytes);
>> +
>> +
>> +    printf("Test that allocate doesn't alter pool size\n");
>> +
>> +    /*
>> +     * Populate the domain with some RAM.  This will cause more of the mempool
>> +     * to be used.
>> +     */
>> +    old_size_bytes = size_bytes;
>> +
>> +    rc = xc_domain_setmaxmem(xch, domid, -1);
>> +    if ( rc )
>> +        return fail("  Fail: setmaxmem: : %d - %s\n",
>> +                    errno, strerror(errno));
>> +
>> +    rc = xc_domain_populate_physmap_exact(xch, domid, 1, 0, 0, physmap);
>> +    if ( rc )
>> +        return fail("  Fail: populate physmap: %d - %s\n",
>> +                    errno, strerror(errno));
>> +
>> +    /*
>> +     * Re-get the p2m size.  Should not have changed as a consequence of
>> +     * populate physmap.
>> +     */
>> +    rc = xc_get_paging_mempool_size(xch, domid, &size_bytes);
>> +    if ( rc )
>> +        return fail("  Fail: get mempool size: %d - %s\n",
>> +                    errno, strerror(errno));
>> +
>> +    if ( old_size_bytes != size_bytes )
>> +        return fail("  Fail: mempool size changed %"PRIu64" => %"PRIu64"\n",
>> +                    old_size_bytes, size_bytes);
>> +
>> +
>> +
>> +    printf("Test bad set size\n");
>> +
>> +    /*
>> +     * Check that setting a non-page size results in failure.
>> +     */
>> +    rc = xc_set_paging_mempool_size(xch, domid, size_bytes + 1);
>> +    if ( rc != -1 || errno != EINVAL )
>> +        return fail("  Fail: Bad set size: expected -1/EINVAL, got %d/%d - %s\n",
>> +                    rc, errno, strerror(errno));
>> +
>> +
>> +    printf("Test very large set size\n");
> Maybe drop "very", as 64M isn't all that much (and would, in particular,
> not expose any 32-bit truncation issues)?

Hmm yeah.  That was rather stale.

I've switched to "Test set continuation\n" seeing as that's the purpose
of the test.  64M is an internal detail which ought to be small enough
to work reliably everywhere, but large enough to cause a continuation.

~Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH for-4.17 0/4] XSA-409 fixes
@ 2022-10-26 10:20 Andrew Cooper
  0 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2022-10-26 10:20 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Xen Security Team, Jan Beulich,
	Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Bertrand Marquis, Henry Wang, Anthony PERARD

For 4.17, and backport to all stable releases.

Patch 2 doesn't technically need backporting, but it's safe and I'm
dis-enclined to forgo testing in older releases.

Andrew Cooper (4):
  xen: Introduce non-broken hypercalls for the p2m pool size
  tools/tests: Unit test for p2m pool size
  xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls
  xen/arm: Correct the p2m pool size calculations

 tools/include/xenctrl.h              |   3 +
 tools/libs/ctrl/xc_domain.c          |  29 ++++++
 tools/libs/light/libxl_arm.c         |  14 +--
 tools/libs/light/libxl_dom.c         |  19 ++++
 tools/libs/light/libxl_internal.h    |   3 +
 tools/libs/light/libxl_x86.c         |  15 +--
 tools/tests/Makefile                 |   1 +
 tools/tests/p2m-pool/.gitignore      |   1 +
 tools/tests/p2m-pool/Makefile        |  42 ++++++++
 tools/tests/p2m-pool/test-p2m-pool.c | 181 +++++++++++++++++++++++++++++++++++
 xen/arch/arm/domctl.c                |  53 ----------
 xen/arch/arm/include/asm/p2m.h       |   1 -
 xen/arch/arm/p2m.c                   |  31 ++++--
 xen/arch/x86/include/asm/hap.h       |   1 +
 xen/arch/x86/include/asm/shadow.h    |   4 +
 xen/arch/x86/mm/hap/hap.c            |  10 ++
 xen/arch/x86/mm/paging.c             |  39 ++++++++
 xen/arch/x86/mm/shadow/common.c      |  10 ++
 xen/common/domctl.c                  |  14 +++
 xen/include/public/domctl.h          |  26 ++++-
 xen/include/xen/domain.h             |   3 +
 21 files changed, 411 insertions(+), 89 deletions(-)
 create mode 100644 tools/tests/p2m-pool/.gitignore
 create mode 100644 tools/tests/p2m-pool/Makefile
 create mode 100644 tools/tests/p2m-pool/test-p2m-pool.c

-- 
2.11.0



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-11-17 16:27 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-17  1:08 [PATCH for-4.17 0/4] XSA-409 fixes Andrew Cooper
2022-11-17  1:08 ` [PATCH 1/4] xen: Introduce non-broken hypercalls for the paging mempool size Andrew Cooper
2022-11-17  2:08   ` Stefano Stabellini
2022-11-17 10:18   ` Jan Beulich
2022-11-17 15:51     ` Andrew Cooper
2022-11-17 14:10   ` Anthony PERARD
2022-11-17  1:08 ` [PATCH 2/4] tools/tests: Unit test for " Andrew Cooper
2022-11-17 10:39   ` Jan Beulich
2022-11-17 16:27     ` Andrew Cooper
2022-11-17 14:20   ` Anthony PERARD
2022-11-17  1:08 ` [PATCH 3/4] xen/arm, libxl: Revert XEN_DOMCTL_shadow_op; use p2m mempool hypercalls Andrew Cooper
2022-11-17  2:12   ` Stefano Stabellini
2022-11-17 14:07   ` Anthony PERARD
2022-11-17  1:08 ` [PATCH 4/4] xen/arm: Correct the p2m pool size calculations Andrew Cooper
  -- strict thread matches above, loose matches on Subject: below --
2022-10-26 10:20 [PATCH for-4.17 0/4] XSA-409 fixes Andrew Cooper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.