xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource
@ 2020-09-22 18:24 Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 01/11] xen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE Andrew Cooper
                   ` (10 more replies)
  0 siblings, 11 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Julien Grall, Roger Pau Monné,
	Stefano Stabellini, Volodymyr Babchuk, Wei Liu,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

I thought this was going to be a very simple small bugfix for Michał's
Processor Trace series.  Serves me right for expecting it not to be full of
bear traps...

The sole implementation of acquire_resource never asks for size, so its little
surprise that Xen is broken for compat callers, and returns incorrect
information for regular callers.

v2 was delayed substantially due to the discovery of XSA-334, but is complete
now, permitting the mapping of arbitrary sized resouces, along with fixes to
the compat XLAT logic.

The final two patches are only local testing, and not intended for committing.

A branch can be obtained from:

  https://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/xen-acquire-resource

Andrew Cooper (11):
  xen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE
  xen/gnttab: Rework resource acquisition
  xen/memory: Fix compat XENMEM_acquire_resource for size requests
  xen/memory: Fix acquire_resource size semantics
  tools/foreignmem: Support querying the size of a resource
  xen/memory: Clarify the XENMEM_acquire_resource ABI description
  xen/memory: Improve compat XENMEM_acquire_resource handling
  xen/memory: Indent part of acquire_resource()
  xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
  TESTING dom0
  TESTING XTF

 tools/libs/foreignmemory/Makefile                  |   2 +-
 tools/libs/foreignmemory/core.c                    |  14 ++
 .../libs/foreignmemory/include/xenforeignmemory.h  |  15 ++
 tools/libs/foreignmemory/libxenforeignmemory.map   |   4 +
 tools/libs/foreignmemory/linux.c                   |  36 +++
 tools/libs/foreignmemory/private.h                 |  14 ++
 tools/libs/uselibs.mk                              |   2 +-
 tools/misc/Makefile                                |   4 +
 tools/misc/xen-resource.c                          | 106 +++++++++
 xen/arch/x86/Kconfig                               |   1 +
 xen/arch/x86/mm.c                                  |  24 +-
 xen/common/Kconfig                                 |   3 +
 xen/common/compat/memory.c                         | 151 +++++++++----
 xen/common/grant_table.c                           | 128 ++++++++---
 xen/common/memory.c                                | 244 ++++++++++++++-------
 xen/include/asm-arm/mm.h                           |   8 -
 xen/include/asm-x86/mm.h                           |   3 +
 xen/include/public/memory.h                        |  23 +-
 xen/include/xen/grant_table.h                      |  21 +-
 xen/include/xen/mm.h                               |  15 ++
 20 files changed, 647 insertions(+), 171 deletions(-)
 create mode 100644 tools/misc/xen-resource.c

-- 
2.11.0



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 01/11] xen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 02/11] xen/gnttab: Rework resource acquisition Andrew Cooper
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné,
	Stefano Stabellini, Volodymyr Babchuk,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

New architectures shouldn't be forced to implement no-op stubs for unused
functionality.

Introduce CONFIG_ARCH_ACQUIRE_RESOURCE which can be opted in to, and provide
compatibility logic in xen/mm.h

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>
---
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/arch/x86/Kconfig     | 1 +
 xen/common/Kconfig       | 3 +++
 xen/include/asm-arm/mm.h | 8 --------
 xen/include/xen/mm.h     | 9 +++++++++
 4 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index a636a4bb1e..e7644a0a9d 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -6,6 +6,7 @@ config X86
 	select ACPI
 	select ACPI_LEGACY_TABLES_LOOKUP
 	select ARCH_SUPPORTS_INT128
+	select ARCH_ACQUIRE_RESOURCE
 	select COMPAT
 	select CORE_PARKING
 	select HAS_ALTERNATIVE
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 15e3b79ff5..593459ea6e 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -22,6 +22,9 @@ config GRANT_TABLE
 
 	  If unsure, say Y.
 
+config ARCH_ACQUIRE_RESOURCE
+	bool
+
 config HAS_ALTERNATIVE
 	bool
 
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index f8ba49b118..0b7de3102e 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -358,14 +358,6 @@ static inline void put_page_and_type(struct page_info *page)
 
 void clear_and_clean_page(struct page_info *page);
 
-static inline
-int arch_acquire_resource(struct domain *d, unsigned int type, unsigned int id,
-                          unsigned long frame, unsigned int nr_frames,
-                          xen_pfn_t mfn_list[])
-{
-    return -EOPNOTSUPP;
-}
-
 unsigned int arch_get_dma_bitsize(void);
 
 #endif /*  __ARCH_ARM_MM__ */
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 4536a62940..26a4a3d350 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -685,4 +685,13 @@ static inline void put_page_alloc_ref(struct page_info *page)
     }
 }
 
+#ifndef CONFIG_ARCH_ACQUIRE_RESOURCE
+static inline int arch_acquire_resource(
+    struct domain *d, unsigned int type, unsigned int id, unsigned long frame,
+    unsigned int nr_frames, xen_pfn_t mfn_list[])
+{
+    return -EOPNOTSUPP;
+}
+#endif /* !CONFIG_ARCH_ACQUIRE_RESOURCE */
+
 #endif /* __XEN_MM_H__ */
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 01/11] xen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-24  9:51   ` Paul Durrant
  2020-09-25 13:17   ` Jan Beulich
  2020-09-22 18:24 ` [PATCH v2 03/11] xen/memory: Fix compat XENMEM_acquire_resource for size requests Andrew Cooper
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

The existing logic doesn't function in the general case for mapping a guests
grant table, due to arbitrary 32 frame limit, and the default grant table
limit being 64.

In order to start addressing this, rework the existing grant table logic by
implementing a single gnttab_acquire_resource().  This is far more efficient
than the previous acquire_grant_table() in memory.c because it doesn't take
the grant table write lock, and attempt to grow the table, for every single
frame.

The new gnttab_acquire_resource() function subsumes the previous two
gnttab_get_{shared,status}_frame() helpers.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v2:
 * Fix CentOS 6 build by initialising vaddrs to NULL
 * Add ASSERT_UNREACHABLE() and gt->status typecheck
---
 xen/common/grant_table.c      | 102 +++++++++++++++++++++++++++++++-----------
 xen/common/memory.c           |  42 +----------------
 xen/include/xen/grant_table.h |  19 +++-----
 3 files changed, 84 insertions(+), 79 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index a5d3ed8bda..912f07be47 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
     return 0;
 }
 
+int gnttab_acquire_resource(
+    struct domain *d, unsigned int id, unsigned long frame,
+    unsigned int nr_frames, xen_pfn_t mfn_list[])
+{
+    struct grant_table *gt = d->grant_table;
+    unsigned int i = nr_frames, tot_frames;
+    mfn_t tmp;
+    void **vaddrs = NULL;
+    int rc;
+
+    /* Input sanity. */
+    if ( !nr_frames )
+        return -EINVAL;
+
+    /* Overflow checks */
+    if ( frame + nr_frames < frame )
+        return -EINVAL;
+
+    tot_frames = frame + nr_frames;
+    if ( tot_frames != frame + nr_frames )
+        return -EINVAL;
+
+    /* Grow table if necessary. */
+    grant_write_lock(gt);
+    switch ( id )
+    {
+    case XENMEM_resource_grant_table_id_shared:
+        rc = gnttab_get_shared_frame_mfn(d, tot_frames - 1, &tmp);
+        break;
+
+    case XENMEM_resource_grant_table_id_status:
+        if ( gt->gt_version != 2 )
+        {
+    default:
+            rc = -EINVAL;
+            break;
+        }
+        rc = gnttab_get_status_frame_mfn(d, tot_frames - 1, &tmp);
+        break;
+    }
+
+    /* Any errors from growing the table? */
+    if ( rc )
+        goto out;
+
+    switch ( id )
+    {
+    case XENMEM_resource_grant_table_id_shared:
+        vaddrs = gt->shared_raw;
+        break;
+
+    case XENMEM_resource_grant_table_id_status:
+        /* Check that void ** is a suitable representation for gt->status. */
+        BUILD_BUG_ON(!__builtin_types_compatible_p(
+                         typeof(gt->status), grant_status_t **));
+        vaddrs = (void **)gt->status;
+        break;
+    }
+
+    if ( !vaddrs )
+    {
+        ASSERT_UNREACHABLE();
+        rc = -EINVAL;
+        goto out;
+    }
+
+    for ( i = 0; i < nr_frames; ++i )
+        mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
+
+ out:
+    grant_write_unlock(gt);
+
+    return rc;
+}
+
 int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn, mfn_t *mfn)
 {
     int rc = 0;
@@ -4047,33 +4122,6 @@ int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn, mfn_t *mfn)
     return rc;
 }
 
-int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
-                            mfn_t *mfn)
-{
-    struct grant_table *gt = d->grant_table;
-    int rc;
-
-    grant_write_lock(gt);
-    rc = gnttab_get_shared_frame_mfn(d, idx, mfn);
-    grant_write_unlock(gt);
-
-    return rc;
-}
-
-int gnttab_get_status_frame(struct domain *d, unsigned long idx,
-                            mfn_t *mfn)
-{
-    struct grant_table *gt = d->grant_table;
-    int rc;
-
-    grant_write_lock(gt);
-    rc = (gt->gt_version == 2) ?
-        gnttab_get_status_frame_mfn(d, idx, mfn) : -EINVAL;
-    grant_write_unlock(gt);
-
-    return rc;
-}
-
 static void gnttab_usage_print(struct domain *rd)
 {
     int first = 1;
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 1bab0e80c2..177fc378d9 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1007,44 +1007,6 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
     return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
 }
 
-static int acquire_grant_table(struct domain *d, unsigned int id,
-                               unsigned long frame,
-                               unsigned int nr_frames,
-                               xen_pfn_t mfn_list[])
-{
-    unsigned int i = nr_frames;
-
-    /* Iterate backwards in case table needs to grow */
-    while ( i-- != 0 )
-    {
-        mfn_t mfn = INVALID_MFN;
-        int rc;
-
-        switch ( id )
-        {
-        case XENMEM_resource_grant_table_id_shared:
-            rc = gnttab_get_shared_frame(d, frame + i, &mfn);
-            break;
-
-        case XENMEM_resource_grant_table_id_status:
-            rc = gnttab_get_status_frame(d, frame + i, &mfn);
-            break;
-
-        default:
-            rc = -EINVAL;
-            break;
-        }
-
-        if ( rc )
-            return rc;
-
-        ASSERT(!mfn_eq(mfn, INVALID_MFN));
-        mfn_list[i] = mfn_x(mfn);
-    }
-
-    return 0;
-}
-
 static int acquire_resource(
     XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
 {
@@ -1099,8 +1061,8 @@ static int acquire_resource(
     switch ( xmar.type )
     {
     case XENMEM_resource_grant_table:
-        rc = acquire_grant_table(d, xmar.id, xmar.frame, xmar.nr_frames,
-                                 mfn_list);
+        rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
+                                     mfn_list);
         break;
 
     default:
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 98603604b8..5a2c75b880 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -56,10 +56,10 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
 
 int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
                      mfn_t *mfn);
-int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
-                            mfn_t *mfn);
-int gnttab_get_status_frame(struct domain *d, unsigned long idx,
-                            mfn_t *mfn);
+
+int gnttab_acquire_resource(
+    struct domain *d, unsigned int id, unsigned long frame,
+    unsigned int nr_frames, xen_pfn_t mfn_list[]);
 
 #else
 
@@ -93,14 +93,9 @@ static inline int gnttab_map_frame(struct domain *d, unsigned long idx,
     return -EINVAL;
 }
 
-static inline int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
-                                          mfn_t *mfn)
-{
-    return -EINVAL;
-}
-
-static inline int gnttab_get_status_frame(struct domain *d, unsigned long idx,
-                                          mfn_t *mfn)
+static inline int gnttab_acquire_resource(
+    struct domain *d, unsigned int id, unsigned long frame,
+    unsigned int nr_frames, xen_pfn_t mfn_list[])
 {
     return -EINVAL;
 }
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 03/11] xen/memory: Fix compat XENMEM_acquire_resource for size requests
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 01/11] xen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 02/11] xen/gnttab: Rework resource acquisition Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics Andrew Cooper
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu, Julien Grall,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

Copy the nr_frames from the structure which actually has the correct value, so
the caller doesn't unconditionally receive 0.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/common/compat/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 3851f756c7..ed92e05b08 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -599,7 +599,7 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                 if ( __copy_field_to_guest(
                          guest_handle_cast(compat,
                                            compat_mem_acquire_resource_t),
-                         &cmp.mar, nr_frames) )
+                         nat.mar, nr_frames) )
                     return -EFAULT;
             }
             else
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
                   ` (2 preceding siblings ...)
  2020-09-22 18:24 ` [PATCH v2 03/11] xen/memory: Fix compat XENMEM_acquire_resource for size requests Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-24 10:06   ` Paul Durrant
  2020-09-25 15:56   ` Jan Beulich
  2020-09-22 18:24 ` [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource Andrew Cooper
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

Calling XENMEM_acquire_resource with a NULL frame_list is a request for the
size of the resource, but the returned 32 is bogus.

If someone tries to follow it for XENMEM_resource_ioreq_server, the acquire
call will fail as IOREQ servers currently top out at 2 frames, and it is only
half the size of the default grant table limit for guests.

Also, no users actually request a resource size, because it was never wired up
in the sole implementaion of resource acquisition in Linux.

Introduce a new resource_max_frames() to calculate the size of a resource, and
implement it the IOREQ and grant subsystems.

It is impossible to guarantee that a mapping call following a successful size
call will succeed (e.g. The target IOREQ server gets destroyed, or the domain
switches from grant v2 to v1).  Document the restriction, and use the
flexibility to simplify the paths to be lockless.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v2:
 * Spelling fixes
 * Add more local variables.
 * Don't return any status frames on ARM where v2 support is compiled out.
---
 xen/arch/x86/mm.c             | 20 ++++++++++++++++
 xen/common/grant_table.c      | 23 ++++++++++++++++++
 xen/common/memory.c           | 55 +++++++++++++++++++++++++++++++++----------
 xen/include/asm-x86/mm.h      |  3 +++
 xen/include/public/memory.h   | 16 +++++++++----
 xen/include/xen/grant_table.h |  8 +++++++
 xen/include/xen/mm.h          |  6 +++++
 7 files changed, 114 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d1cfc8fb4a..e82307bdae 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4591,6 +4591,26 @@ int xenmem_add_to_physmap_one(
     return rc;
 }
 
+unsigned int arch_resource_max_frames(
+    struct domain *d, unsigned int type, unsigned int id)
+{
+    unsigned int nr = 0;
+
+    switch ( type )
+    {
+#ifdef CONFIG_HVM
+    case XENMEM_resource_ioreq_server:
+        if ( !is_hvm_domain(d) )
+            break;
+        /* One frame for the buf-ioreq ring, and one frame per 128 vcpus. */
+        nr = 1 + DIV_ROUND_UP(d->max_vcpus * sizeof(struct ioreq), PAGE_SIZE);
+        break;
+#endif
+    }
+
+    return nr;
+}
+
 int arch_acquire_resource(struct domain *d, unsigned int type,
                           unsigned int id, unsigned long frame,
                           unsigned int nr_frames, xen_pfn_t mfn_list[])
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 912f07be47..8c401a5540 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -4013,6 +4013,29 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
     return 0;
 }
 
+unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id)
+{
+    const struct grant_table *gt = d->grant_table;
+    unsigned int nr = 0;
+
+    /* Don't need the grant lock.  This limit is fixed at domain create time. */
+    switch ( id )
+    {
+    case XENMEM_resource_grant_table_id_shared:
+        nr = gt->max_grant_frames;
+        break;
+
+    case XENMEM_resource_grant_table_id_status:
+        if ( GNTTAB_MAX_VERSION < 2 )
+            break;
+
+        nr = grant_to_status_frames(gt->max_grant_frames);
+        break;
+    }
+
+    return nr;
+}
+
 int gnttab_acquire_resource(
     struct domain *d, unsigned int id, unsigned long frame,
     unsigned int nr_frames, xen_pfn_t mfn_list[])
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 177fc378d9..c559935732 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1007,6 +1007,26 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
     return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
 }
 
+/*
+ * Return 0 on any kind of error.  Caller converts to -EINVAL.
+ *
+ * All nonzero values should be repeatable (i.e. derived from some fixed
+ * property of the domain), and describe the full resource (i.e. mapping the
+ * result of this call will be the entire resource).
+ */
+static unsigned int resource_max_frames(struct domain *d,
+                                        unsigned int type, unsigned int id)
+{
+    switch ( type )
+    {
+    case XENMEM_resource_grant_table:
+        return gnttab_resource_max_frames(d, id);
+
+    default:
+        return arch_resource_max_frames(d, type, id);
+    }
+}
+
 static int acquire_resource(
     XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
 {
@@ -1018,6 +1038,7 @@ static int acquire_resource(
      * use-cases then per-CPU arrays or heap allocations may be required.
      */
     xen_pfn_t mfn_list[32];
+    unsigned int max_frames;
     int rc;
 
     /*
@@ -1034,19 +1055,6 @@ static int acquire_resource(
     if ( xmar.pad != 0 )
         return -EINVAL;
 
-    if ( guest_handle_is_null(xmar.frame_list) )
-    {
-        if ( xmar.nr_frames )
-            return -EINVAL;
-
-        xmar.nr_frames = ARRAY_SIZE(mfn_list);
-
-        if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
-            return -EFAULT;
-
-        return 0;
-    }
-
     if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
         return -E2BIG;
 
@@ -1058,6 +1066,27 @@ static int acquire_resource(
     if ( rc )
         goto out;
 
+    max_frames = resource_max_frames(d, xmar.type, xmar.id);
+
+    rc = -EINVAL;
+    if ( !max_frames )
+        goto out;
+
+    if ( guest_handle_is_null(xmar.frame_list) )
+    {
+        if ( xmar.nr_frames )
+            goto out;
+
+        xmar.nr_frames = max_frames;
+
+        rc = -EFAULT;
+        if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
+            goto out;
+
+        rc = 0;
+        goto out;
+    }
+
     switch ( xmar.type )
     {
     case XENMEM_resource_grant_table:
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index deeba75a1c..13977652a8 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -639,6 +639,9 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn)
     return mfn <= (virt_to_mfn(eva - 1) + 1);
 }
 
+unsigned int arch_resource_max_frames(struct domain *d,
+                                      unsigned int type, unsigned int id);
+
 int arch_acquire_resource(struct domain *d, unsigned int type,
                           unsigned int id, unsigned long frame,
                           unsigned int nr_frames, xen_pfn_t mfn_list[]);
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 21d483298e..d7eb34f167 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -639,10 +639,18 @@ struct xen_mem_acquire_resource {
 #define XENMEM_resource_grant_table_id_status 1
 
     /*
-     * IN/OUT - As an IN parameter number of frames of the resource
-     *          to be mapped. However, if the specified value is 0 and
-     *          frame_list is NULL then this field will be set to the
-     *          maximum value supported by the implementation on return.
+     * IN/OUT
+     *
+     * As an IN parameter number of frames of the resource to be mapped.
+     *
+     * When frame_list is NULL and nr_frames is 0, this is interpreted as a
+     * request for the size of the resource, which shall be returned in the
+     * nr_frames field.
+     *
+     * The size of a resource will never be zero, but a nonzero result doesn't
+     * guarantee that a subsequent mapping request will be successful.  There
+     * are further type/id specific constraints which may change between the
+     * two calls.
      */
     uint32_t nr_frames;
     uint32_t pad;
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 5a2c75b880..bae4d79623 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -57,6 +57,8 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
 int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
                      mfn_t *mfn);
 
+unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id);
+
 int gnttab_acquire_resource(
     struct domain *d, unsigned int id, unsigned long frame,
     unsigned int nr_frames, xen_pfn_t mfn_list[]);
@@ -93,6 +95,12 @@ static inline int gnttab_map_frame(struct domain *d, unsigned long idx,
     return -EINVAL;
 }
 
+static inline unsigned int gnttab_resource_max_frames(
+    struct domain *d, unsigned int id)
+{
+    return 0;
+}
+
 static inline int gnttab_acquire_resource(
     struct domain *d, unsigned int id, unsigned long frame,
     unsigned int nr_frames, xen_pfn_t mfn_list[])
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 26a4a3d350..d686876b0e 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -686,6 +686,12 @@ static inline void put_page_alloc_ref(struct page_info *page)
 }
 
 #ifndef CONFIG_ARCH_ACQUIRE_RESOURCE
+static inline unsigned int arch_resource_max_frames(
+    struct domain *d, unsigned int type, unsigned int id)
+{
+    return 0;
+}
+
 static inline int arch_acquire_resource(
     struct domain *d, unsigned int type, unsigned int id, unsigned long frame,
     unsigned int nr_frames, xen_pfn_t mfn_list[])
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
                   ` (3 preceding siblings ...)
  2020-09-22 18:24 ` [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2021-01-08 17:52   ` Andrew Cooper
  2021-01-11 15:26   ` [PATCH v3 " Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 06/11] xen/memory: Clarify the XENMEM_acquire_resource ABI description Andrew Cooper
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Ian Jackson, Michał Leszczyński,
	Hubert Jasudowicz, Tamas K Lengyel

With the Xen side of this interface fixed to return real sizes, userspace
needs to be able to make the query.

Introduce xenforeignmemory_resource_size() for the purpose, bumping the
library minor version and providing compatibility for the non-Linux builds.

Its not possible to reuse the IOCTL_PRIVCMD_MMAP_RESOURCE infrastructure,
because it depends on having already mmap()'d a suitably sized region before
it will make an XENMEM_acquire_resource hypercall to Xen.

Instead, open a xencall handle and make an XENMEM_acquire_resource hypercall
directly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Paul Durrant <paul@xen.org>
---
CC: Ian Jackson <iwj@xenproject.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v2:
 * Fix build in some environments.  Extend the rubric for making foreignmem
   depend on call
 * Rebase over Juergens library changes.
 * Spelling fixes.
---
 tools/libs/foreignmemory/Makefile                  |  2 +-
 tools/libs/foreignmemory/core.c                    | 14 +++++++++
 .../libs/foreignmemory/include/xenforeignmemory.h  | 15 +++++++++
 tools/libs/foreignmemory/libxenforeignmemory.map   |  4 +++
 tools/libs/foreignmemory/linux.c                   | 36 ++++++++++++++++++++++
 tools/libs/foreignmemory/private.h                 | 14 +++++++++
 tools/libs/uselibs.mk                              |  2 +-
 7 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/tools/libs/foreignmemory/Makefile b/tools/libs/foreignmemory/Makefile
index cf444d3c1a..0ed93edac2 100644
--- a/tools/libs/foreignmemory/Makefile
+++ b/tools/libs/foreignmemory/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR    = 1
-MINOR    = 3
+MINOR    = 4
 
 SRCS-y                 += core.c
 SRCS-$(CONFIG_Linux)   += linux.c
diff --git a/tools/libs/foreignmemory/core.c b/tools/libs/foreignmemory/core.c
index 63f12e2450..5d95c59c48 100644
--- a/tools/libs/foreignmemory/core.c
+++ b/tools/libs/foreignmemory/core.c
@@ -53,6 +53,10 @@ xenforeignmemory_handle *xenforeignmemory_open(xentoollog_logger *logger,
         if (!fmem->logger) goto err;
     }
 
+    fmem->xcall = xencall_open(fmem->logger, 0);
+    if ( !fmem->xcall )
+        goto err;
+
     rc = osdep_xenforeignmemory_open(fmem);
     if ( rc  < 0 ) goto err;
 
@@ -61,6 +65,7 @@ xenforeignmemory_handle *xenforeignmemory_open(xentoollog_logger *logger,
 err:
     xentoolcore__deregister_active_handle(&fmem->tc_ah);
     osdep_xenforeignmemory_close(fmem);
+    xencall_close(fmem->xcall);
     xtl_logger_destroy(fmem->logger_tofree);
     free(fmem);
     return NULL;
@@ -75,6 +80,7 @@ int xenforeignmemory_close(xenforeignmemory_handle *fmem)
 
     xentoolcore__deregister_active_handle(&fmem->tc_ah);
     rc = osdep_xenforeignmemory_close(fmem);
+    xencall_close(fmem->xcall);
     xtl_logger_destroy(fmem->logger_tofree);
     free(fmem);
     return rc;
@@ -188,6 +194,14 @@ int xenforeignmemory_unmap_resource(
     return rc;
 }
 
+int xenforeignmemory_resource_size(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long *nr_frames)
+{
+    return osdep_xenforeignmemory_resource_size(fmem, domid, type,
+                                                id, nr_frames);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/include/xenforeignmemory.h b/tools/libs/foreignmemory/include/xenforeignmemory.h
index d594be8df0..1ba2f5316b 100644
--- a/tools/libs/foreignmemory/include/xenforeignmemory.h
+++ b/tools/libs/foreignmemory/include/xenforeignmemory.h
@@ -179,6 +179,21 @@ xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
 int xenforeignmemory_unmap_resource(
     xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
 
+/**
+ * Determine the maximum size of a specific resource.
+ *
+ * @parm fmem handle to the open foreignmemory interface
+ * @parm domid the domain id
+ * @parm type the resource type
+ * @parm id the type-specific resource identifier
+ *
+ * Return 0 on success and fills in *nr_frames.  Sets errno and return -1 on
+ * error.
+ */
+int xenforeignmemory_resource_size(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long *nr_frames);
+
 #endif
 
 /*
diff --git a/tools/libs/foreignmemory/libxenforeignmemory.map b/tools/libs/foreignmemory/libxenforeignmemory.map
index d5323c87d9..8aca341b99 100644
--- a/tools/libs/foreignmemory/libxenforeignmemory.map
+++ b/tools/libs/foreignmemory/libxenforeignmemory.map
@@ -19,3 +19,7 @@ VERS_1.3 {
 		xenforeignmemory_map_resource;
 		xenforeignmemory_unmap_resource;
 } VERS_1.2;
+VERS_1.4 {
+	global:
+		xenforeignmemory_resource_size;
+} VERS_1.3;
diff --git a/tools/libs/foreignmemory/linux.c b/tools/libs/foreignmemory/linux.c
index fe73d5ab72..eec089e232 100644
--- a/tools/libs/foreignmemory/linux.c
+++ b/tools/libs/foreignmemory/linux.c
@@ -21,12 +21,15 @@
 #include <errno.h>
 #include <fcntl.h>
 #include <unistd.h>
+#include <stdint.h>
 #include <string.h>
 
 #include <sys/mman.h>
 #include <sys/ioctl.h>
 #include <xen-tools/libs.h>
 
+#include <xen/memory.h>
+
 #include "private.h"
 
 #ifndef O_CLOEXEC
@@ -339,6 +342,39 @@ int osdep_xenforeignmemory_map_resource(
     return 0;
 }
 
+int osdep_xenforeignmemory_resource_size(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long *nr_frames)
+{
+    int rc;
+    struct xen_mem_acquire_resource *xmar =
+        xencall_alloc_buffer(fmem->xcall, sizeof(*xmar));
+
+    if ( !xmar )
+    {
+        PERROR("Could not bounce memory for acquire_resource hypercall");
+        return -1;
+    }
+
+    *xmar = (struct xen_mem_acquire_resource){
+        .domid = domid,
+        .type = type,
+        .id = id,
+    };
+
+    rc = xencall2(fmem->xcall, __HYPERVISOR_memory_op,
+                  XENMEM_acquire_resource, (uintptr_t)xmar);
+    if ( rc )
+        goto out;
+
+    *nr_frames = xmar->nr_frames;
+
+ out:
+    xencall_free_buffer(fmem->xcall, xmar);
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/private.h b/tools/libs/foreignmemory/private.h
index 8f1bf081ed..1a6b685f45 100644
--- a/tools/libs/foreignmemory/private.h
+++ b/tools/libs/foreignmemory/private.h
@@ -4,6 +4,7 @@
 #include <xentoollog.h>
 
 #include <xenforeignmemory.h>
+#include <xencall.h>
 
 #include <xentoolcore_internal.h>
 
@@ -20,6 +21,7 @@
 
 struct xenforeignmemory_handle {
     xentoollog_logger *logger, *logger_tofree;
+    xencall_handle *xcall;
     unsigned flags;
     int fd;
     Xentoolcore__Active_Handle tc_ah;
@@ -74,6 +76,15 @@ static inline int osdep_xenforeignmemory_unmap_resource(
 {
     return 0;
 }
+
+static inline int osdep_xenforeignmemory_resource_size(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long *nr_frames)
+{
+    errno = EOPNOTSUPP;
+    return -1;
+}
+
 #else
 int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
                                     domid_t domid);
@@ -81,6 +92,9 @@ int osdep_xenforeignmemory_map_resource(
     xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
 int osdep_xenforeignmemory_unmap_resource(
     xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
+int osdep_xenforeignmemory_resource_size(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long *nr_frames);
 #endif
 
 #define PERROR(_f...) \
diff --git a/tools/libs/uselibs.mk b/tools/libs/uselibs.mk
index a9dc2ce994..cb09e54de5 100644
--- a/tools/libs/uselibs.mk
+++ b/tools/libs/uselibs.mk
@@ -11,7 +11,7 @@ USELIBS_gnttab := toollog toolcore
 LIBS_LIBS += call
 USELIBS_call := toollog toolcore
 LIBS_LIBS += foreignmemory
-USELIBS_foreignmemory := toollog toolcore
+USELIBS_foreignmemory := toollog toolcore call
 LIBS_LIBS += devicemodel
 USELIBS_devicemodel := toollog toolcore call
 LIBS_LIBS += hypfs
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 06/11] xen/memory: Clarify the XENMEM_acquire_resource ABI description
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
                   ` (4 preceding siblings ...)
  2020-09-22 18:24 ` [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-24 10:08   ` Paul Durrant
  2020-09-22 18:24 ` [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling Andrew Cooper
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

This is how similar operations already operate, compatible with the sole
implementation (in Linux), and explicitly gives us some flexibility.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/include/public/memory.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index d7eb34f167..c4c47a0b38 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -642,6 +642,7 @@ struct xen_mem_acquire_resource {
      * IN/OUT
      *
      * As an IN parameter number of frames of the resource to be mapped.
+     * This value may be updated during the course of the operation.
      *
      * When frame_list is NULL and nr_frames is 0, this is interpreted as a
      * request for the size of the resource, which shall be returned in the
@@ -656,7 +657,8 @@ struct xen_mem_acquire_resource {
     uint32_t pad;
     /*
      * IN - the index of the initial frame to be mapped. This parameter
-     *      is ignored if nr_frames is 0.
+     *      is ignored if nr_frames is 0.  This value may be updated
+     *      during the course of the operation.
      */
     uint64_t frame;
 
@@ -672,7 +674,8 @@ struct xen_mem_acquire_resource {
      *          If -EIO is returned then the frame_list has only been
      *          partially mapped and it is up to the caller to unmap all
      *          the GFNs.
-     *          This parameter may be NULL if nr_frames is 0.
+     *          This parameter may be NULL if nr_frames is 0.  This
+     *          value may be updated during the course of the operation.
      */
     XEN_GUEST_HANDLE(xen_pfn_t) frame_list;
 };
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
                   ` (5 preceding siblings ...)
  2020-09-22 18:24 ` [PATCH v2 06/11] xen/memory: Clarify the XENMEM_acquire_resource ABI description Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-24 10:16   ` Paul Durrant
  2020-09-28  9:09   ` Jan Beulich
  2020-09-22 18:24 ` [PATCH v2 08/11] xen/memory: Indent part of acquire_resource() Andrew Cooper
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

The frame_list is an input, or an output, depending on whether the calling
domain is translated or not.  The array does not need marshalling in both
directions.

Furthermore, the copy-in loop was very inefficient, copying 4 bytes at at
time.  Rewrite it to copy in all nr_frames at once, and then expand
compat_pfn_t to xen_pfn_t in place.

Re-position the copy-in loop to simplify continuation support in a future
patch, and reduce the scope of certain variables.

No change in guest observed behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/common/compat/memory.c | 65 ++++++++++++++++++++++++++++------------------
 1 file changed, 40 insertions(+), 25 deletions(-)

diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index ed92e05b08..834c5e19d1 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -55,6 +55,8 @@ static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
 
 int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 {
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
     int split, op = cmd & MEMOP_CMD_MASK;
     long rc;
     unsigned int start_extent = cmd >> MEMOP_EXTENT_SHIFT;
@@ -399,7 +401,7 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 
         case XENMEM_acquire_resource:
         {
-            xen_pfn_t *xen_frame_list;
+            xen_pfn_t *xen_frame_list = NULL;
             unsigned int max_nr_frames;
 
             if ( copy_from_guest(&cmp.mar, compat, 1) )
@@ -417,28 +419,10 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
             if ( cmp.mar.nr_frames > max_nr_frames )
                 return -E2BIG;
 
-            if ( compat_handle_is_null(cmp.mar.frame_list) )
-                xen_frame_list = NULL;
-            else
-            {
+            /* Marshal the frame list in the remainder of the xlat space. */
+            if ( !compat_handle_is_null(cmp.mar.frame_list) )
                 xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
 
-                if ( !compat_handle_okay(cmp.mar.frame_list,
-                                         cmp.mar.nr_frames) )
-                    return -EFAULT;
-
-                for ( i = 0; i < cmp.mar.nr_frames; i++ )
-                {
-                    compat_pfn_t frame;
-
-                    if ( __copy_from_compat_offset(
-                             &frame, cmp.mar.frame_list, i, 1) )
-                        return -EFAULT;
-
-                    xen_frame_list[i] = frame;
-                }
-            }
-
 #define XLAT_mem_acquire_resource_HNDL_frame_list(_d_, _s_) \
             set_xen_guest_handle((_d_)->frame_list, xen_frame_list)
 
@@ -446,6 +430,31 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 
 #undef XLAT_mem_acquire_resource_HNDL_frame_list
 
+            if ( xen_frame_list && cmp.mar.nr_frames )
+            {
+                /*
+                 * frame_list is an input for translated guests, and an output
+                 * for untranslated guests.  Only copy in for translated guests.
+                 */
+                if ( paging_mode_translate(currd) )
+                {
+                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
+
+                    if ( !compat_handle_okay(cmp.mar.frame_list,
+                                             cmp.mar.nr_frames) ||
+                         __copy_from_compat_offset(
+                             compat_frame_list, cmp.mar.frame_list,
+                             0, cmp.mar.nr_frames) )
+                        return -EFAULT;
+
+                    /*
+                     * Iterate backwards over compat_frame_list[] expanding
+                     * compat_pfn_t to xen_pfn_t in place.
+                     */
+                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
+                        xen_frame_list[x] = compat_frame_list[x];
+                }
+            }
             break;
         }
         default:
@@ -590,8 +599,6 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 
         case XENMEM_acquire_resource:
         {
-            const xen_pfn_t *xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
-            compat_pfn_t *compat_frame_list = (compat_pfn_t *)(nat.mar + 1);
             DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
 
             if ( compat_handle_is_null(cmp.mar.frame_list) )
@@ -601,9 +608,18 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                                            compat_mem_acquire_resource_t),
                          nat.mar, nr_frames) )
                     return -EFAULT;
+                break;
             }
-            else
+
+            /*
+             * frame_list is an input for translated guests, and an output for
+             * untranslated guests.  Only copy out for untranslated guests.
+             */
+            if ( !paging_mode_translate(currd) )
             {
+                const xen_pfn_t *xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
+                compat_pfn_t *compat_frame_list = (compat_pfn_t *)(nat.mar + 1);
+
                 /*
                  * NOTE: the smaller compat array overwrites the native
                  *       array.
@@ -625,7 +641,6 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                                              cmp.mar.nr_frames) )
                     return -EFAULT;
             }
-
             break;
         }
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 08/11] xen/memory: Indent part of acquire_resource()
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
                   ` (6 preceding siblings ...)
  2020-09-22 18:24 ` [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-24 10:36   ` Paul Durrant
  2020-09-22 18:24 ` [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource Andrew Cooper
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

Indent the middle of acquire_resource() inside a do {} while ( 0 ) loop.  This
is broken out specifically to make the following change readable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/common/memory.c | 66 +++++++++++++++++++++++++++--------------------------
 1 file changed, 34 insertions(+), 32 deletions(-)

diff --git a/xen/common/memory.c b/xen/common/memory.c
index c559935732..369154b7c0 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1087,44 +1087,46 @@ static int acquire_resource(
         goto out;
     }
 
-    switch ( xmar.type )
-    {
-    case XENMEM_resource_grant_table:
-        rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
-                                     mfn_list);
-        break;
+    do {
+        switch ( xmar.type )
+        {
+        case XENMEM_resource_grant_table:
+            rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
+                                         mfn_list);
+            break;
 
-    default:
-        rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
-                                   xmar.nr_frames, mfn_list);
-        break;
-    }
+        default:
+            rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
+                                       xmar.nr_frames, mfn_list);
+            break;
+        }
 
-    if ( rc )
-        goto out;
+        if ( rc )
+            goto out;
 
-    if ( !paging_mode_translate(currd) )
-    {
-        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
-            rc = -EFAULT;
-    }
-    else
-    {
-        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
-        unsigned int i;
+        if ( !paging_mode_translate(currd) )
+        {
+            if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
+                rc = -EFAULT;
+        }
+        else
+        {
+            xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
+            unsigned int i;
 
-        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
-            rc = -EFAULT;
+            if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
+                rc = -EFAULT;
 
-        for ( i = 0; !rc && i < xmar.nr_frames; i++ )
-        {
-            rc = set_foreign_p2m_entry(currd, gfn_list[i],
-                                       _mfn(mfn_list[i]));
-            /* rc should be -EIO for any iteration other than the first */
-            if ( rc && i )
-                rc = -EIO;
+            for ( i = 0; !rc && i < xmar.nr_frames; i++ )
+            {
+                rc = set_foreign_p2m_entry(currd, gfn_list[i],
+                                           _mfn(mfn_list[i]));
+                /* rc should be -EIO for any iteration other than the first */
+                if ( rc && i )
+                    rc = -EIO;
+            }
         }
-    }
+    } while ( 0 );
 
  out:
     rcu_unlock_domain(d);
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
                   ` (7 preceding siblings ...)
  2020-09-22 18:24 ` [PATCH v2 08/11] xen/memory: Indent part of acquire_resource() Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-24 10:47   ` Paul Durrant
  2020-09-28  9:37   ` Jan Beulich
  2020-09-22 18:24 ` [PATCH v2 10/11] TESTING dom0 Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 11/11] TESTING XTF Andrew Cooper
  10 siblings, 2 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

A guests default number of grant frames is 64, and XENMEM_acquire_resource
will reject an attempt to map more than 32 frames.  This limit is caused by
the size of mfn_list[] on the stack.

Fix mapping of arbitrary size requests by looping over batches of 32 in
acquire_resource(), and using hypercall continuations when necessary.

To start with, break _acquire_resource() out of acquire_resource() to cope
with type-specific dispatching, and update the return semantics to indicate
the number of mfns returned.  Update gnttab_acquire_resource() and x86's
arch_acquire_resource() to match these new semantics.

Have do_memory_op() pass start_extent into acquire_resource() so it can pick
up where it left off after a continuation, and loop over batches of 32 until
all the work is done, or a continuation needs to occur.

compat_memory_op() is a bit more complicated, because it also has to marshal
frame_list in the XLAT buffer.  Have it account for continuation information
itself and hide details from the upper layer, so it can marshal the buffer in
chunks if necessary.

With these fixes in place, it is now possible to map the whole grant table for
a guest.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/arch/x86/mm.c          |   4 +-
 xen/common/compat/memory.c |  96 ++++++++++++++++++++++++++++++--------
 xen/common/grant_table.c   |   3 ++
 xen/common/memory.c        | 114 ++++++++++++++++++++++++++++++++++-----------
 4 files changed, 168 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index e82307bdae..8628f51402 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4632,7 +4632,6 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
         if ( id != (unsigned int)ioservid )
             break;
 
-        rc = 0;
         for ( i = 0; i < nr_frames; i++ )
         {
             mfn_t mfn;
@@ -4643,6 +4642,9 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
 
             mfn_list[i] = mfn_x(mfn);
         }
+        if ( i == nr_frames )
+            /* Success.  Passed nr_frames back to the caller. */
+            rc = nr_frames;
         break;
     }
 #endif
diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 834c5e19d1..17619f26ed 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -402,23 +402,10 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
         case XENMEM_acquire_resource:
         {
             xen_pfn_t *xen_frame_list = NULL;
-            unsigned int max_nr_frames;
 
             if ( copy_from_guest(&cmp.mar, compat, 1) )
                 return -EFAULT;
 
-            /*
-             * The number of frames handled is currently limited to a
-             * small number by the underlying implementation, so the
-             * scratch space should be sufficient for bouncing the
-             * frame addresses.
-             */
-            max_nr_frames = (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
-                sizeof(*xen_frame_list);
-
-            if ( cmp.mar.nr_frames > max_nr_frames )
-                return -E2BIG;
-
             /* Marshal the frame list in the remainder of the xlat space. */
             if ( !compat_handle_is_null(cmp.mar.frame_list) )
                 xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
@@ -432,6 +419,28 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 
             if ( xen_frame_list && cmp.mar.nr_frames )
             {
+                unsigned int xlat_max_frames =
+                    (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
+                    sizeof(*xen_frame_list);
+
+                if ( start_extent >= nat.mar->nr_frames )
+                    return -EINVAL;
+
+                /*
+                 * Adjust nat to account for work done on previous
+                 * continuations, leaving cmp pristine.  Hide the continaution
+                 * from the native code to prevent double accounting.
+                 */
+                nat.mar->nr_frames -= start_extent;
+                nat.mar->frame += start_extent;
+                cmd &= MEMOP_CMD_MASK;
+
+                /*
+                 * If there are two many frames to fit within the xlat buffer,
+                 * we'll need to loop to marshal them all.
+                 */
+                nat.mar->nr_frames = min(nat.mar->nr_frames, xlat_max_frames);
+
                 /*
                  * frame_list is an input for translated guests, and an output
                  * for untranslated guests.  Only copy in for translated guests.
@@ -444,14 +453,14 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                                              cmp.mar.nr_frames) ||
                          __copy_from_compat_offset(
                              compat_frame_list, cmp.mar.frame_list,
-                             0, cmp.mar.nr_frames) )
+                             start_extent, nat.mar->nr_frames) )
                         return -EFAULT;
 
                     /*
                      * Iterate backwards over compat_frame_list[] expanding
                      * compat_pfn_t to xen_pfn_t in place.
                      */
-                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
+                    for ( int x = nat.mar->nr_frames - 1; x >= 0; --x )
                         xen_frame_list[x] = compat_frame_list[x];
                 }
             }
@@ -600,9 +609,11 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
         case XENMEM_acquire_resource:
         {
             DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
+            unsigned int done;
 
             if ( compat_handle_is_null(cmp.mar.frame_list) )
             {
+                ASSERT(split == 0 && rc == 0);
                 if ( __copy_field_to_guest(
                          guest_handle_cast(compat,
                                            compat_mem_acquire_resource_t),
@@ -611,6 +622,21 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                 break;
             }
 
+            if ( split < 0 )
+            {
+                /* Contintuation occured. */
+                ASSERT(rc != XENMEM_acquire_resource);
+                done = cmd >> MEMOP_EXTENT_SHIFT;
+            }
+            else
+            {
+                /* No continuation. */
+                ASSERT(rc == 0);
+                done = nat.mar->nr_frames;
+            }
+
+            ASSERT(done <= nat.mar->nr_frames);
+
             /*
              * frame_list is an input for translated guests, and an output for
              * untranslated guests.  Only copy out for untranslated guests.
@@ -626,7 +652,7 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                  */
                 BUILD_BUG_ON(sizeof(compat_pfn_t) > sizeof(xen_pfn_t));
 
-                for ( i = 0; i < cmp.mar.nr_frames; i++ )
+                for ( i = 0; i < done; i++ )
                 {
                     compat_pfn_t frame = xen_frame_list[i];
 
@@ -636,15 +662,45 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                     compat_frame_list[i] = frame;
                 }
 
-                if ( __copy_to_compat_offset(cmp.mar.frame_list, 0,
-                                             compat_frame_list,
-                                             cmp.mar.nr_frames) )
+                if ( __copy_to_compat_offset(
+                         cmp.mar.frame_list, start_extent,
+                         compat_frame_list, done) )
                     return -EFAULT;
             }
-            break;
+
+            start_extent += done;
+
+            /* Completely done. */
+            if ( start_extent == cmp.mar.nr_frames )
+                break;
+
+            /*
+             * Done a "full" batch, but we were limited by space in the xlat
+             * area.  Go around the loop again without necesserily returning
+             * to guest context.
+             */
+            if ( done == nat.mar->nr_frames )
+            {
+                split = 1;
+                break;
+            }
+
+            /* Explicit continuation request from a higher level. */
+            if ( done < nat.mar->nr_frames )
+                return hypercall_create_continuation(
+                    __HYPERVISOR_memory_op, "ih",
+                    op | (start_extent << MEMOP_EXTENT_SHIFT), compat);
+
+            /*
+             * Well... Somethings gone wrong with the two levels of chunking.
+             * My condolences to whomever next has to debug this mess.
+             */
+            ASSERT_UNREACHABLE();
+            goto crash;
         }
 
         default:
+        crash:
             domain_crash(current->domain);
             split = 0;
             break;
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 8c401a5540..81db47f8fd 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -4105,6 +4105,9 @@ int gnttab_acquire_resource(
     for ( i = 0; i < nr_frames; ++i )
         mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
 
+    /* Success.  Passed nr_frames back to the caller. */
+    rc = nr_frames;
+
  out:
     grant_write_unlock(gt);
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 369154b7c0..ec276cb9b1 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1027,17 +1027,31 @@ static unsigned int resource_max_frames(struct domain *d,
     }
 }
 
+/*
+ * Returns -errno on error, or positive in the range [1, nr_frames] on
+ * success.  Returning less than nr_frames contitutes a request for a
+ * continuation.
+ */
+static int _acquire_resource(
+    struct domain *d, unsigned int type, unsigned int id, unsigned long frame,
+    unsigned int nr_frames, xen_pfn_t mfn_list[])
+{
+    switch ( type )
+    {
+    case XENMEM_resource_grant_table:
+        return gnttab_acquire_resource(d, id, frame, nr_frames, mfn_list);
+
+    default:
+        return arch_acquire_resource(d, type, id, frame, nr_frames, mfn_list);
+    }
+}
+
 static int acquire_resource(
-    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
+    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg,
+    unsigned long start_extent)
 {
     struct domain *d, *currd = current->domain;
     xen_mem_acquire_resource_t xmar;
-    /*
-     * The mfn_list and gfn_list (below) arrays are ok on stack for the
-     * moment since they are small, but if they need to grow in future
-     * use-cases then per-CPU arrays or heap allocations may be required.
-     */
-    xen_pfn_t mfn_list[32];
     unsigned int max_frames;
     int rc;
 
@@ -1055,9 +1069,6 @@ static int acquire_resource(
     if ( xmar.pad != 0 )
         return -EINVAL;
 
-    if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
-        return -E2BIG;
-
     rc = rcu_lock_remote_domain_by_id(xmar.domid, &d);
     if ( rc )
         return rc;
@@ -1074,7 +1085,7 @@ static int acquire_resource(
 
     if ( guest_handle_is_null(xmar.frame_list) )
     {
-        if ( xmar.nr_frames )
+        if ( xmar.nr_frames || start_extent )
             goto out;
 
         xmar.nr_frames = max_frames;
@@ -1087,26 +1098,47 @@ static int acquire_resource(
         goto out;
     }
 
+    /*
+     * Limiting nr_frames at (UINT_MAX >> MEMOP_EXTENT_SHIFT) isn't ideal.  If
+     * it ever becomes a practical problem, we can switch to mutating
+     * xmar.{frame,nr_frames,frame_list} in guest memory.
+     */
+    rc = -EINVAL;
+    if ( start_extent >= xmar.nr_frames ||
+         xmar.nr_frames > (UINT_MAX >> MEMOP_EXTENT_SHIFT) )
+        goto out;
+
+    /* Adjust for work done on previous continuations. */
+    xmar.nr_frames -= start_extent;
+    xmar.frame += start_extent;
+    guest_handle_add_offset(xmar.frame_list, start_extent);
+
     do {
-        switch ( xmar.type )
-        {
-        case XENMEM_resource_grant_table:
-            rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
-                                         mfn_list);
-            break;
+        /*
+         * Arbitrary size.  Not too much stack space, and a reasonable stride
+         * for continutation checks.
+         */
+        xen_pfn_t mfn_list[32];
+        unsigned int todo = MIN(ARRAY_SIZE(mfn_list), xmar.nr_frames), done;
 
-        default:
-            rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
-                                       xmar.nr_frames, mfn_list);
-            break;
-        }
+        rc = _acquire_resource(d, xmar.type, xmar.id, xmar.frame,
+                               todo, mfn_list);
+        if ( rc < 0 )
+            goto out;
 
-        if ( rc )
+        done = rc;
+        rc = 0;
+        if ( done == 0 || done > todo )
+        {
+            ASSERT_UNREACHABLE();
+            rc = -EINVAL;
             goto out;
+        }
 
+        /* Adjust guest frame_list appropriately. */
         if ( !paging_mode_translate(currd) )
         {
-            if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
+            if ( copy_to_guest(xmar.frame_list, mfn_list, done) )
                 rc = -EFAULT;
         }
         else
@@ -1114,10 +1146,10 @@ static int acquire_resource(
             xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
             unsigned int i;
 
-            if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
+            if ( copy_from_guest(gfn_list, xmar.frame_list, done) )
                 rc = -EFAULT;
 
-            for ( i = 0; !rc && i < xmar.nr_frames; i++ )
+            for ( i = 0; !rc && i < done; i++ )
             {
                 rc = set_foreign_p2m_entry(currd, gfn_list[i],
                                            _mfn(mfn_list[i]));
@@ -1126,7 +1158,32 @@ static int acquire_resource(
                     rc = -EIO;
             }
         }
-    } while ( 0 );
+
+        if ( rc )
+            goto out;
+
+        xmar.nr_frames -= done;
+        xmar.frame += done;
+        guest_handle_add_offset(xmar.frame_list, done);
+        start_extent += done;
+
+        /*
+         * Explicit contination request from _acquire_resource(), or we've
+         * still got work to do other work is pending.
+         */
+        if ( done < todo ||
+             (xmar.nr_frames && hypercall_preempt_check()) )
+        {
+            rc = hypercall_create_continuation(
+                __HYPERVISOR_memory_op, "lh",
+                XENMEM_acquire_resource | (start_extent << MEMOP_EXTENT_SHIFT),
+                arg);
+            goto out;
+        }
+
+    } while ( xmar.nr_frames );
+
+    rc = 0;
 
  out:
     rcu_unlock_domain(d);
@@ -1593,7 +1650,8 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case XENMEM_acquire_resource:
         rc = acquire_resource(
-            guest_handle_cast(arg, xen_mem_acquire_resource_t));
+            guest_handle_cast(arg, xen_mem_acquire_resource_t),
+            start_extent);
         break;
 
     default:
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 10/11] TESTING dom0
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
                   ` (8 preceding siblings ...)
  2020-09-22 18:24 ` [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  2020-09-22 18:24 ` [PATCH v2 11/11] TESTING XTF Andrew Cooper
  10 siblings, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

Poke xenforeignmemory_map_resource() from userspace.  Confirm that all 40
grant frames can be mapped.

Do not apply.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/misc/Makefile       |   4 ++
 tools/misc/xen-resource.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 110 insertions(+)
 create mode 100644 tools/misc/xen-resource.c

diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 7d37f297a9..c1d262c329 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -76,6 +76,10 @@ distclean: clean
 xen-cpuid: xen-cpuid.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
 
+xen-resource.o: APPEND_CFLAGS += -Wno-declaration-after-statement
+xen-resource: xen-resource.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenforeignmemory) $(LDLIBS_libxendevicemodel) $(APPEND_LDFLAGS)
+
 xen-hvmctx: xen-hvmctx.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-resource.c b/tools/misc/xen-resource.c
new file mode 100644
index 0000000000..3c29d43a35
--- /dev/null
+++ b/tools/misc/xen-resource.c
@@ -0,0 +1,106 @@
+#include <err.h>
+#include <error.h>
+#include <errno.h>
+#include <stdio.h>
+#include <sys/mman.h>
+
+#include <xenctrl.h>
+#include <xenforeignmemory.h>
+#include <xendevicemodel.h>
+#include <xen-tools/libs.h>
+
+static xc_interface *xch;
+static xenforeignmemory_handle *fh;
+static xendevicemodel_handle *dh;
+
+static void test(unsigned int domid, unsigned int type, unsigned int id)
+{
+    unsigned long nr = ~0;
+
+    printf("Testing d%u, type %u, id %u\n", domid, type, id);
+
+    int rc = xenforeignmemory_resource_size(fh, domid, type, id, &nr);
+    if ( rc )
+    {
+        printf("  failed %d\n", -errno);
+        return;
+    }
+    else
+        printf("  %lu frames\n", nr);
+
+    printf("  Trying to map\n");
+    void *addr = NULL;
+    xenforeignmemory_resource_handle *res =
+        xenforeignmemory_map_resource(fh, domid, type, id, 0, nr,
+                                      &addr, PROT_READ | PROT_WRITE, 0);
+    if ( !res )
+    {
+        perror("  failed");
+        return;
+    }
+
+    printf("  Success\n");
+    xenforeignmemory_unmap_resource(fh, res);
+}
+
+int main(int argc, char **argv)
+{
+    int rc;
+
+    xch = xc_interface_open(NULL, NULL, 0);
+    fh = xenforeignmemory_open(NULL, 0);
+    dh = xendevicemodel_open(NULL, 0);
+
+    if ( !xch )
+        err(1, "xc_interface_open");
+    if ( !fh )
+        err(1, "xenforeignmemory_open");
+    if ( !dh )
+        err(1, "xendevicemodel_open");
+
+    uint32_t domid = 0;
+    struct xen_domctl_createdomain dom = {
+        .flags = XEN_DOMCTL_CDF_hvm,
+        .max_vcpus = 8,
+        .max_grant_frames = 40,
+
+        .arch = {
+            .emulation_flags = XEN_X86_EMU_LAPIC,
+        },
+    };
+
+    rc = xc_domain_create(xch, &domid, &dom);
+    if ( rc )
+    {
+        perror("xc_domain_create()");
+        goto out;
+    }
+    printf("Created d%u\n", domid);
+
+    ioservid_t id = -1;
+    rc = xendevicemodel_create_ioreq_server(dh, domid, 1, &id);
+    if ( rc )
+    {
+        perror("xendevicemodel_create_ioreq_server()");
+        goto out;
+    }
+    printf("Created ioreq server %u\n", id);
+
+
+    test(domid, XENMEM_resource_ioreq_server, id);
+
+    test(domid, XENMEM_resource_grant_table,
+         XENMEM_resource_grant_table_id_shared);
+    test(domid, XENMEM_resource_grant_table,
+         XENMEM_resource_grant_table_id_status);
+
+    test(domid, 2, 0);
+
+out:
+    if ( id >= 0 )
+        xendevicemodel_destroy_ioreq_server(dh, domid, id);
+    if ( domid )
+        xc_domain_destroy(xch, domid);
+
+    return 0;
+}
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 11/11] TESTING XTF
  2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
                   ` (9 preceding siblings ...)
  2020-09-22 18:24 ` [PATCH v2 10/11] TESTING dom0 Andrew Cooper
@ 2020-09-22 18:24 ` Andrew Cooper
  10 siblings, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2020-09-22 18:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

Add an arbitrary "resource type 2" which uses "frames" of a fixed format so
both Xen (for a PVH dom0) and XTF (for a PV dom0) can check the integrity of
the marshalled buffer.

Skip the hypercall preempt check to allow the compat PVH logic a chance to hit
the 1020 limit in the XLAT buffer.

Do not apply.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/common/memory.c | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/xen/common/memory.c b/xen/common/memory.c
index ec276cb9b1..15a8ed253e 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1022,11 +1022,33 @@ static unsigned int resource_max_frames(struct domain *d,
     case XENMEM_resource_grant_table:
         return gnttab_resource_max_frames(d, id);
 
+    case 2:
+        return 2900;
+
     default:
         return arch_resource_max_frames(d, type, id);
     }
 }
 
+static int _acquire_2(unsigned int id, unsigned long frame,
+                      unsigned int nr_frames, xen_pfn_t mfn_list[])
+{
+    unsigned int i;
+
+    for ( i = 0; i < nr_frames; ++i )
+    {
+        mfn_list[i] = 0xdead0000 + frame + i;
+
+        /* Simulate some -ERESTARTs */
+        if ( i && ((frame + i) == 22 ||
+                   (frame + i) == 37 ||
+                   (frame + i) == 1040) )
+            break;
+    }
+
+    return i;
+}
+
 /*
  * Returns -errno on error, or positive in the range [1, nr_frames] on
  * success.  Returning less than nr_frames contitutes a request for a
@@ -1041,6 +1063,9 @@ static int _acquire_resource(
     case XENMEM_resource_grant_table:
         return gnttab_acquire_resource(d, id, frame, nr_frames, mfn_list);
 
+    case 2:
+        return _acquire_2(id, frame, nr_frames, mfn_list);
+
     default:
         return arch_acquire_resource(d, type, id, frame, nr_frames, mfn_list);
     }
@@ -1151,6 +1176,18 @@ static int acquire_resource(
 
             for ( i = 0; !rc && i < done; i++ )
             {
+                /*
+                 * For debug type 2, check that the marshalled-in frames are
+                 * correct, rather than actually inserting them into the P2M.
+                 */
+                if ( xmar.type == 2 )
+                {
+                    if ( gfn_list[i] != mfn_list[i] )
+                        panic("gfn %#lx != mfn %#lx, i %lu\n",
+                              gfn_list[i], mfn_list[i], i + xmar.frame);
+                    continue;
+                }
+
                 rc = set_foreign_p2m_entry(currd, gfn_list[i],
                                            _mfn(mfn_list[i]));
                 /* rc should be -EIO for any iteration other than the first */
@@ -1172,7 +1209,7 @@ static int acquire_resource(
          * still got work to do other work is pending.
          */
         if ( done < todo ||
-             (xmar.nr_frames && hypercall_preempt_check()) )
+             (0 && xmar.nr_frames && hypercall_preempt_check()) )
         {
             rc = hypercall_create_continuation(
                 __HYPERVISOR_memory_op, "lh",
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* RE: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2020-09-22 18:24 ` [PATCH v2 02/11] xen/gnttab: Rework resource acquisition Andrew Cooper
@ 2020-09-24  9:51   ` Paul Durrant
  2021-01-11 21:22     ` Andrew Cooper
  2020-09-25 13:17   ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2020-09-24  9:51 UTC (permalink / raw)
  To: 'Andrew Cooper', 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

> -----Original Message-----
> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: 22 September 2020 19:25
> To: Xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap <George.Dunlap@eu.citrix.com>; Ian
> Jackson <iwj@xenproject.org>; Jan Beulich <JBeulich@suse.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Paul Durrant
> <paul@xen.org>; Michał Leszczyński <michal.leszczynski@cert.pl>; Hubert Jasudowicz
> <hubert.jasudowicz@cert.pl>; Tamas K Lengyel <tamas@tklengyel.com>
> Subject: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
> 
> The existing logic doesn't function in the general case for mapping a guests
> grant table, due to arbitrary 32 frame limit, and the default grant table
> limit being 64.
> 
> In order to start addressing this, rework the existing grant table logic by
> implementing a single gnttab_acquire_resource().  This is far more efficient
> than the previous acquire_grant_table() in memory.c because it doesn't take
> the grant table write lock, and attempt to grow the table, for every single
> frame.
> 
> The new gnttab_acquire_resource() function subsumes the previous two
> gnttab_get_{shared,status}_frame() helpers.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Ian Jackson <iwj@xenproject.org>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wl@xen.org>
> CC: Julien Grall <julien@xen.org>
> CC: Paul Durrant <paul@xen.org>
> CC: Michał Leszczyński <michal.leszczynski@cert.pl>
> CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
> CC: Tamas K Lengyel <tamas@tklengyel.com>
> 
> v2:
>  * Fix CentOS 6 build by initialising vaddrs to NULL
>  * Add ASSERT_UNREACHABLE() and gt->status typecheck
> ---
>  xen/common/grant_table.c      | 102 +++++++++++++++++++++++++++++++-----------
>  xen/common/memory.c           |  42 +----------------
>  xen/include/xen/grant_table.h |  19 +++-----
>  3 files changed, 84 insertions(+), 79 deletions(-)
> 
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index a5d3ed8bda..912f07be47 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>      return 0;
>  }
> 
> +int gnttab_acquire_resource(
> +    struct domain *d, unsigned int id, unsigned long frame,
> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
> +{
> +    struct grant_table *gt = d->grant_table;
> +    unsigned int i = nr_frames, tot_frames;
> +    mfn_t tmp;
> +    void **vaddrs = NULL;
> +    int rc;
> +
> +    /* Input sanity. */

Nit: inconsistency with full stops on single line comments.

> +    if ( !nr_frames )
> +        return -EINVAL;
> +
> +    /* Overflow checks */
> +    if ( frame + nr_frames < frame )
> +        return -EINVAL;
> +
> +    tot_frames = frame + nr_frames;
> +    if ( tot_frames != frame + nr_frames )
> +        return -EINVAL;
> +
> +    /* Grow table if necessary. */
> +    grant_write_lock(gt);
> +    switch ( id )
> +    {
> +    case XENMEM_resource_grant_table_id_shared:
> +        rc = gnttab_get_shared_frame_mfn(d, tot_frames - 1, &tmp);
> +        break;
> +
> +    case XENMEM_resource_grant_table_id_status:
> +        if ( gt->gt_version != 2 )
> +        {
> +    default:

Personally I dislike this kind of thing. IMO it would be neater to initialise rc to -EINVAL at the top, then this could just be a 'break' (so no braces) and you could have a simple 'default: break;' case instead.

> +            rc = -EINVAL;
> +            break;
> +        }
> +        rc = gnttab_get_status_frame_mfn(d, tot_frames - 1, &tmp);
> +        break;
> +    }
> +

I think you could drop the write lock here...

> +    /* Any errors from growing the table? */
> +    if ( rc )
> +        goto out;
> +

...and acquire it read here, since we know the table cannot shrink. You'd need to re-check the gt_version for safety though. 

  Paul

> +    switch ( id )
> +    {
> +    case XENMEM_resource_grant_table_id_shared:
> +        vaddrs = gt->shared_raw;
> +        break;
> +
> +    case XENMEM_resource_grant_table_id_status:
> +        /* Check that void ** is a suitable representation for gt->status. */
> +        BUILD_BUG_ON(!__builtin_types_compatible_p(
> +                         typeof(gt->status), grant_status_t **));
> +        vaddrs = (void **)gt->status;
> +        break;
> +    }
> +
> +    if ( !vaddrs )
> +    {
> +        ASSERT_UNREACHABLE();
> +        rc = -EINVAL;
> +        goto out;
> +    }
> +
> +    for ( i = 0; i < nr_frames; ++i )
> +        mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
> +
> + out:
> +    grant_write_unlock(gt);
> +
> +    return rc;
> +}
> +
>  int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn, mfn_t *mfn)
>  {
>      int rc = 0;
> @@ -4047,33 +4122,6 @@ int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn, mfn_t
> *mfn)
>      return rc;
>  }
> 
> -int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
> -                            mfn_t *mfn)
> -{
> -    struct grant_table *gt = d->grant_table;
> -    int rc;
> -
> -    grant_write_lock(gt);
> -    rc = gnttab_get_shared_frame_mfn(d, idx, mfn);
> -    grant_write_unlock(gt);
> -
> -    return rc;
> -}
> -
> -int gnttab_get_status_frame(struct domain *d, unsigned long idx,
> -                            mfn_t *mfn)
> -{
> -    struct grant_table *gt = d->grant_table;
> -    int rc;
> -
> -    grant_write_lock(gt);
> -    rc = (gt->gt_version == 2) ?
> -        gnttab_get_status_frame_mfn(d, idx, mfn) : -EINVAL;
> -    grant_write_unlock(gt);
> -
> -    return rc;
> -}
> -
>  static void gnttab_usage_print(struct domain *rd)
>  {
>      int first = 1;
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 1bab0e80c2..177fc378d9 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1007,44 +1007,6 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
>      return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
>  }
> 
> -static int acquire_grant_table(struct domain *d, unsigned int id,
> -                               unsigned long frame,
> -                               unsigned int nr_frames,
> -                               xen_pfn_t mfn_list[])
> -{
> -    unsigned int i = nr_frames;
> -
> -    /* Iterate backwards in case table needs to grow */
> -    while ( i-- != 0 )
> -    {
> -        mfn_t mfn = INVALID_MFN;
> -        int rc;
> -
> -        switch ( id )
> -        {
> -        case XENMEM_resource_grant_table_id_shared:
> -            rc = gnttab_get_shared_frame(d, frame + i, &mfn);
> -            break;
> -
> -        case XENMEM_resource_grant_table_id_status:
> -            rc = gnttab_get_status_frame(d, frame + i, &mfn);
> -            break;
> -
> -        default:
> -            rc = -EINVAL;
> -            break;
> -        }
> -
> -        if ( rc )
> -            return rc;
> -
> -        ASSERT(!mfn_eq(mfn, INVALID_MFN));
> -        mfn_list[i] = mfn_x(mfn);
> -    }
> -
> -    return 0;
> -}
> -
>  static int acquire_resource(
>      XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
>  {
> @@ -1099,8 +1061,8 @@ static int acquire_resource(
>      switch ( xmar.type )
>      {
>      case XENMEM_resource_grant_table:
> -        rc = acquire_grant_table(d, xmar.id, xmar.frame, xmar.nr_frames,
> -                                 mfn_list);
> +        rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
> +                                     mfn_list);
>          break;
> 
>      default:
> diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
> index 98603604b8..5a2c75b880 100644
> --- a/xen/include/xen/grant_table.h
> +++ b/xen/include/xen/grant_table.h
> @@ -56,10 +56,10 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
> 
>  int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
>                       mfn_t *mfn);
> -int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
> -                            mfn_t *mfn);
> -int gnttab_get_status_frame(struct domain *d, unsigned long idx,
> -                            mfn_t *mfn);
> +
> +int gnttab_acquire_resource(
> +    struct domain *d, unsigned int id, unsigned long frame,
> +    unsigned int nr_frames, xen_pfn_t mfn_list[]);
> 
>  #else
> 
> @@ -93,14 +93,9 @@ static inline int gnttab_map_frame(struct domain *d, unsigned long idx,
>      return -EINVAL;
>  }
> 
> -static inline int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
> -                                          mfn_t *mfn)
> -{
> -    return -EINVAL;
> -}
> -
> -static inline int gnttab_get_status_frame(struct domain *d, unsigned long idx,
> -                                          mfn_t *mfn)
> +static inline int gnttab_acquire_resource(
> +    struct domain *d, unsigned int id, unsigned long frame,
> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>  {
>      return -EINVAL;
>  }
> --
> 2.11.0




^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics
  2020-09-22 18:24 ` [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics Andrew Cooper
@ 2020-09-24 10:06   ` Paul Durrant
  2020-09-24 10:57     ` Andrew Cooper
  2020-09-25 15:56   ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2020-09-24 10:06 UTC (permalink / raw)
  To: 'Andrew Cooper', 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

> -----Original Message-----
> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: 22 September 2020 19:25
> To: Xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap <George.Dunlap@eu.citrix.com>; Ian
> Jackson <iwj@xenproject.org>; Jan Beulich <JBeulich@suse.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Paul Durrant
> <paul@xen.org>; Michał Leszczyński <michal.leszczynski@cert.pl>; Hubert Jasudowicz
> <hubert.jasudowicz@cert.pl>; Tamas K Lengyel <tamas@tklengyel.com>
> Subject: [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics
> 
> Calling XENMEM_acquire_resource with a NULL frame_list is a request for the
> size of the resource, but the returned 32 is bogus.
> 
> If someone tries to follow it for XENMEM_resource_ioreq_server, the acquire
> call will fail as IOREQ servers currently top out at 2 frames, and it is only
> half the size of the default grant table limit for guests.
> 
> Also, no users actually request a resource size, because it was never wired up
> in the sole implementaion of resource acquisition in Linux.
> 
> Introduce a new resource_max_frames() to calculate the size of a resource, and
> implement it the IOREQ and grant subsystems.
> 
> It is impossible to guarantee that a mapping call following a successful size
> call will succeed (e.g. The target IOREQ server gets destroyed, or the domain
> switches from grant v2 to v1).  Document the restriction, and use the
> flexibility to simplify the paths to be lockless.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Ian Jackson <iwj@xenproject.org>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wl@xen.org>
> CC: Julien Grall <julien@xen.org>
> CC: Paul Durrant <paul@xen.org>
> CC: Michał Leszczyński <michal.leszczynski@cert.pl>
> CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
> CC: Tamas K Lengyel <tamas@tklengyel.com>
> 
> v2:
>  * Spelling fixes
>  * Add more local variables.
>  * Don't return any status frames on ARM where v2 support is compiled out.
> ---
>  xen/arch/x86/mm.c             | 20 ++++++++++++++++
>  xen/common/grant_table.c      | 23 ++++++++++++++++++
>  xen/common/memory.c           | 55 +++++++++++++++++++++++++++++++++----------
>  xen/include/asm-x86/mm.h      |  3 +++
>  xen/include/public/memory.h   | 16 +++++++++----
>  xen/include/xen/grant_table.h |  8 +++++++
>  xen/include/xen/mm.h          |  6 +++++
>  7 files changed, 114 insertions(+), 17 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index d1cfc8fb4a..e82307bdae 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4591,6 +4591,26 @@ int xenmem_add_to_physmap_one(
>      return rc;
>  }
> 
> +unsigned int arch_resource_max_frames(
> +    struct domain *d, unsigned int type, unsigned int id)
> +{
> +    unsigned int nr = 0;
> +
> +    switch ( type )
> +    {
> +#ifdef CONFIG_HVM
> +    case XENMEM_resource_ioreq_server:
> +        if ( !is_hvm_domain(d) )
> +            break;
> +        /* One frame for the buf-ioreq ring, and one frame per 128 vcpus. */
> +        nr = 1 + DIV_ROUND_UP(d->max_vcpus * sizeof(struct ioreq), PAGE_SIZE);

The buf-ioreq ring is optional so a caller using this value may still get a resource acquisition failure unless the id is used to actually look up and check the ioreq server in question for the actual maximum. So this needs to call into a new function in ioreq.c.

  Paul

> +        break;
> +#endif
> +    }
> +
> +    return nr;
> +}
> +
>  int arch_acquire_resource(struct domain *d, unsigned int type,
>                            unsigned int id, unsigned long frame,
>                            unsigned int nr_frames, xen_pfn_t mfn_list[])
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index 912f07be47..8c401a5540 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -4013,6 +4013,29 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>      return 0;
>  }
> 
> +unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id)
> +{
> +    const struct grant_table *gt = d->grant_table;
> +    unsigned int nr = 0;
> +
> +    /* Don't need the grant lock.  This limit is fixed at domain create time. */
> +    switch ( id )
> +    {
> +    case XENMEM_resource_grant_table_id_shared:
> +        nr = gt->max_grant_frames;
> +        break;
> +
> +    case XENMEM_resource_grant_table_id_status:
> +        if ( GNTTAB_MAX_VERSION < 2 )
> +            break;
> +
> +        nr = grant_to_status_frames(gt->max_grant_frames);
> +        break;
> +    }
> +
> +    return nr;
> +}
> +
>  int gnttab_acquire_resource(
>      struct domain *d, unsigned int id, unsigned long frame,
>      unsigned int nr_frames, xen_pfn_t mfn_list[])
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 177fc378d9..c559935732 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1007,6 +1007,26 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
>      return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
>  }
> 
> +/*
> + * Return 0 on any kind of error.  Caller converts to -EINVAL.
> + *
> + * All nonzero values should be repeatable (i.e. derived from some fixed
> + * property of the domain), and describe the full resource (i.e. mapping the
> + * result of this call will be the entire resource).
> + */
> +static unsigned int resource_max_frames(struct domain *d,
> +                                        unsigned int type, unsigned int id)
> +{
> +    switch ( type )
> +    {
> +    case XENMEM_resource_grant_table:
> +        return gnttab_resource_max_frames(d, id);
> +
> +    default:
> +        return arch_resource_max_frames(d, type, id);
> +    }
> +}
> +
>  static int acquire_resource(
>      XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
>  {
> @@ -1018,6 +1038,7 @@ static int acquire_resource(
>       * use-cases then per-CPU arrays or heap allocations may be required.
>       */
>      xen_pfn_t mfn_list[32];
> +    unsigned int max_frames;
>      int rc;
> 
>      /*
> @@ -1034,19 +1055,6 @@ static int acquire_resource(
>      if ( xmar.pad != 0 )
>          return -EINVAL;
> 
> -    if ( guest_handle_is_null(xmar.frame_list) )
> -    {
> -        if ( xmar.nr_frames )
> -            return -EINVAL;
> -
> -        xmar.nr_frames = ARRAY_SIZE(mfn_list);
> -
> -        if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
> -            return -EFAULT;
> -
> -        return 0;
> -    }
> -
>      if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
>          return -E2BIG;
> 
> @@ -1058,6 +1066,27 @@ static int acquire_resource(
>      if ( rc )
>          goto out;
> 
> +    max_frames = resource_max_frames(d, xmar.type, xmar.id);
> +
> +    rc = -EINVAL;
> +    if ( !max_frames )
> +        goto out;
> +
> +    if ( guest_handle_is_null(xmar.frame_list) )
> +    {
> +        if ( xmar.nr_frames )
> +            goto out;
> +
> +        xmar.nr_frames = max_frames;
> +
> +        rc = -EFAULT;
> +        if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
> +            goto out;
> +
> +        rc = 0;
> +        goto out;
> +    }
> +
>      switch ( xmar.type )
>      {
>      case XENMEM_resource_grant_table:
> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
> index deeba75a1c..13977652a8 100644
> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -639,6 +639,9 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn)
>      return mfn <= (virt_to_mfn(eva - 1) + 1);
>  }
> 
> +unsigned int arch_resource_max_frames(struct domain *d,
> +                                      unsigned int type, unsigned int id);
> +
>  int arch_acquire_resource(struct domain *d, unsigned int type,
>                            unsigned int id, unsigned long frame,
>                            unsigned int nr_frames, xen_pfn_t mfn_list[]);
> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index 21d483298e..d7eb34f167 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -639,10 +639,18 @@ struct xen_mem_acquire_resource {
>  #define XENMEM_resource_grant_table_id_status 1
> 
>      /*
> -     * IN/OUT - As an IN parameter number of frames of the resource
> -     *          to be mapped. However, if the specified value is 0 and
> -     *          frame_list is NULL then this field will be set to the
> -     *          maximum value supported by the implementation on return.
> +     * IN/OUT
> +     *
> +     * As an IN parameter number of frames of the resource to be mapped.
> +     *
> +     * When frame_list is NULL and nr_frames is 0, this is interpreted as a
> +     * request for the size of the resource, which shall be returned in the
> +     * nr_frames field.
> +     *
> +     * The size of a resource will never be zero, but a nonzero result doesn't
> +     * guarantee that a subsequent mapping request will be successful.  There
> +     * are further type/id specific constraints which may change between the
> +     * two calls.
>       */
>      uint32_t nr_frames;
>      uint32_t pad;
> diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
> index 5a2c75b880..bae4d79623 100644
> --- a/xen/include/xen/grant_table.h
> +++ b/xen/include/xen/grant_table.h
> @@ -57,6 +57,8 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
>  int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
>                       mfn_t *mfn);
> 
> +unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id);
> +
>  int gnttab_acquire_resource(
>      struct domain *d, unsigned int id, unsigned long frame,
>      unsigned int nr_frames, xen_pfn_t mfn_list[]);
> @@ -93,6 +95,12 @@ static inline int gnttab_map_frame(struct domain *d, unsigned long idx,
>      return -EINVAL;
>  }
> 
> +static inline unsigned int gnttab_resource_max_frames(
> +    struct domain *d, unsigned int id)
> +{
> +    return 0;
> +}
> +
>  static inline int gnttab_acquire_resource(
>      struct domain *d, unsigned int id, unsigned long frame,
>      unsigned int nr_frames, xen_pfn_t mfn_list[])
> diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
> index 26a4a3d350..d686876b0e 100644
> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -686,6 +686,12 @@ static inline void put_page_alloc_ref(struct page_info *page)
>  }
> 
>  #ifndef CONFIG_ARCH_ACQUIRE_RESOURCE
> +static inline unsigned int arch_resource_max_frames(
> +    struct domain *d, unsigned int type, unsigned int id)
> +{
> +    return 0;
> +}
> +
>  static inline int arch_acquire_resource(
>      struct domain *d, unsigned int type, unsigned int id, unsigned long frame,
>      unsigned int nr_frames, xen_pfn_t mfn_list[])
> --
> 2.11.0




^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v2 06/11] xen/memory: Clarify the XENMEM_acquire_resource ABI description
  2020-09-22 18:24 ` [PATCH v2 06/11] xen/memory: Clarify the XENMEM_acquire_resource ABI description Andrew Cooper
@ 2020-09-24 10:08   ` Paul Durrant
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2020-09-24 10:08 UTC (permalink / raw)
  To: 'Andrew Cooper', 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

> -----Original Message-----
> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: 22 September 2020 19:25
> To: Xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap <George.Dunlap@eu.citrix.com>; Ian
> Jackson <iwj@xenproject.org>; Jan Beulich <JBeulich@suse.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Paul Durrant
> <paul@xen.org>; Michał Leszczyński <michal.leszczynski@cert.pl>; Hubert Jasudowicz
> <hubert.jasudowicz@cert.pl>; Tamas K Lengyel <tamas@tklengyel.com>
> Subject: [PATCH v2 06/11] xen/memory: Clarify the XENMEM_acquire_resource ABI description
> 
> This is how similar operations already operate, compatible with the sole
> implementation (in Linux), and explicitly gives us some flexibility.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Paul Durrant <paul@xen.org>

> ---
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Ian Jackson <iwj@xenproject.org>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wl@xen.org>
> CC: Julien Grall <julien@xen.org>
> CC: Paul Durrant <paul@xen.org>
> CC: Michał Leszczyński <michal.leszczynski@cert.pl>
> CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
> CC: Tamas K Lengyel <tamas@tklengyel.com>
> ---
>  xen/include/public/memory.h | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index d7eb34f167..c4c47a0b38 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -642,6 +642,7 @@ struct xen_mem_acquire_resource {
>       * IN/OUT
>       *
>       * As an IN parameter number of frames of the resource to be mapped.
> +     * This value may be updated during the course of the operation.
>       *
>       * When frame_list is NULL and nr_frames is 0, this is interpreted as a
>       * request for the size of the resource, which shall be returned in the
> @@ -656,7 +657,8 @@ struct xen_mem_acquire_resource {
>      uint32_t pad;
>      /*
>       * IN - the index of the initial frame to be mapped. This parameter
> -     *      is ignored if nr_frames is 0.
> +     *      is ignored if nr_frames is 0.  This value may be updated
> +     *      during the course of the operation.
>       */
>      uint64_t frame;
> 
> @@ -672,7 +674,8 @@ struct xen_mem_acquire_resource {
>       *          If -EIO is returned then the frame_list has only been
>       *          partially mapped and it is up to the caller to unmap all
>       *          the GFNs.
> -     *          This parameter may be NULL if nr_frames is 0.
> +     *          This parameter may be NULL if nr_frames is 0.  This
> +     *          value may be updated during the course of the operation.
>       */
>      XEN_GUEST_HANDLE(xen_pfn_t) frame_list;
>  };
> --
> 2.11.0




^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling
  2020-09-22 18:24 ` [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling Andrew Cooper
@ 2020-09-24 10:16   ` Paul Durrant
  2020-09-28  9:09   ` Jan Beulich
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2020-09-24 10:16 UTC (permalink / raw)
  To: 'Andrew Cooper', 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

> -----Original Message-----
> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: 22 September 2020 19:25
> To: Xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap <George.Dunlap@eu.citrix.com>; Ian
> Jackson <iwj@xenproject.org>; Jan Beulich <JBeulich@suse.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Paul Durrant
> <paul@xen.org>; Michał Leszczyński <michal.leszczynski@cert.pl>; Hubert Jasudowicz
> <hubert.jasudowicz@cert.pl>; Tamas K Lengyel <tamas@tklengyel.com>
> Subject: [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling
> 
> The frame_list is an input, or an output, depending on whether the calling
> domain is translated or not.  The array does not need marshalling in both
> directions.
> 
> Furthermore, the copy-in loop was very inefficient, copying 4 bytes at at
> time.  Rewrite it to copy in all nr_frames at once, and then expand
> compat_pfn_t to xen_pfn_t in place.
> 
> Re-position the copy-in loop to simplify continuation support in a future
> patch, and reduce the scope of certain variables.
> 
> No change in guest observed behaviour.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Ian Jackson <iwj@xenproject.org>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wl@xen.org>
> CC: Julien Grall <julien@xen.org>
> CC: Paul Durrant <paul@xen.org>
> CC: Michał Leszczyński <michal.leszczynski@cert.pl>
> CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
> CC: Tamas K Lengyel <tamas@tklengyel.com>
> ---
>  xen/common/compat/memory.c | 65 ++++++++++++++++++++++++++++------------------
>  1 file changed, 40 insertions(+), 25 deletions(-)
> 
> diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
> index ed92e05b08..834c5e19d1 100644
> --- a/xen/common/compat/memory.c
> +++ b/xen/common/compat/memory.c
> @@ -55,6 +55,8 @@ static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
> 
>  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>  {
> +    struct vcpu *curr = current;
> +    struct domain *currd = curr->domain;
>      int split, op = cmd & MEMOP_CMD_MASK;
>      long rc;
>      unsigned int start_extent = cmd >> MEMOP_EXTENT_SHIFT;
> @@ -399,7 +401,7 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
> 
>          case XENMEM_acquire_resource:
>          {
> -            xen_pfn_t *xen_frame_list;
> +            xen_pfn_t *xen_frame_list = NULL;
>              unsigned int max_nr_frames;
> 
>              if ( copy_from_guest(&cmp.mar, compat, 1) )
> @@ -417,28 +419,10 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>              if ( cmp.mar.nr_frames > max_nr_frames )
>                  return -E2BIG;
> 
> -            if ( compat_handle_is_null(cmp.mar.frame_list) )
> -                xen_frame_list = NULL;
> -            else
> -            {
> +            /* Marshal the frame list in the remainder of the xlat space. */
> +            if ( !compat_handle_is_null(cmp.mar.frame_list) )
>                  xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
> 
> -                if ( !compat_handle_okay(cmp.mar.frame_list,
> -                                         cmp.mar.nr_frames) )
> -                    return -EFAULT;
> -
> -                for ( i = 0; i < cmp.mar.nr_frames; i++ )
> -                {
> -                    compat_pfn_t frame;
> -
> -                    if ( __copy_from_compat_offset(
> -                             &frame, cmp.mar.frame_list, i, 1) )
> -                        return -EFAULT;
> -
> -                    xen_frame_list[i] = frame;
> -                }
> -            }
> -
>  #define XLAT_mem_acquire_resource_HNDL_frame_list(_d_, _s_) \
>              set_xen_guest_handle((_d_)->frame_list, xen_frame_list)
> 
> @@ -446,6 +430,31 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
> 
>  #undef XLAT_mem_acquire_resource_HNDL_frame_list
> 
> +            if ( xen_frame_list && cmp.mar.nr_frames )
> +            {
> +                /*
> +                 * frame_list is an input for translated guests, and an output
> +                 * for untranslated guests.  Only copy in for translated guests.
> +                 */
> +                if ( paging_mode_translate(currd) )
> +                {
> +                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
> +
> +                    if ( !compat_handle_okay(cmp.mar.frame_list,
> +                                             cmp.mar.nr_frames) ||
> +                         __copy_from_compat_offset(
> +                             compat_frame_list, cmp.mar.frame_list,
> +                             0, cmp.mar.nr_frames) )
> +                        return -EFAULT;
> +
> +                    /*
> +                     * Iterate backwards over compat_frame_list[] expanding
> +                     * compat_pfn_t to xen_pfn_t in place.
> +                     */
> +                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )

I know it is legal c99 but personally I still dislike declarations like this, and I've not seen one elsewhere in the Xen code. But I'm not the maintainer, so...

Reviewed-by: Paul Durrant <paul@xen.org>

> +                        xen_frame_list[x] = compat_frame_list[x];
> +                }
> +            }
>              break;
>          }
>          default:
> @@ -590,8 +599,6 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
> 
>          case XENMEM_acquire_resource:
>          {
> -            const xen_pfn_t *xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
> -            compat_pfn_t *compat_frame_list = (compat_pfn_t *)(nat.mar + 1);
>              DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
> 
>              if ( compat_handle_is_null(cmp.mar.frame_list) )
> @@ -601,9 +608,18 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                                             compat_mem_acquire_resource_t),
>                           nat.mar, nr_frames) )
>                      return -EFAULT;
> +                break;
>              }
> -            else
> +
> +            /*
> +             * frame_list is an input for translated guests, and an output for
> +             * untranslated guests.  Only copy out for untranslated guests.
> +             */
> +            if ( !paging_mode_translate(currd) )
>              {
> +                const xen_pfn_t *xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
> +                compat_pfn_t *compat_frame_list = (compat_pfn_t *)(nat.mar + 1);
> +
>                  /*
>                   * NOTE: the smaller compat array overwrites the native
>                   *       array.
> @@ -625,7 +641,6 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                                               cmp.mar.nr_frames) )
>                      return -EFAULT;
>              }
> -
>              break;
>          }
> 
> --
> 2.11.0




^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v2 08/11] xen/memory: Indent part of acquire_resource()
  2020-09-22 18:24 ` [PATCH v2 08/11] xen/memory: Indent part of acquire_resource() Andrew Cooper
@ 2020-09-24 10:36   ` Paul Durrant
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2020-09-24 10:36 UTC (permalink / raw)
  To: 'Andrew Cooper', 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

> -----Original Message-----
> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: 22 September 2020 19:25
> To: Xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap <George.Dunlap@eu.citrix.com>; Ian
> Jackson <iwj@xenproject.org>; Jan Beulich <JBeulich@suse.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Paul Durrant
> <paul@xen.org>; Michał Leszczyński <michal.leszczynski@cert.pl>; Hubert Jasudowicz
> <hubert.jasudowicz@cert.pl>; Tamas K Lengyel <tamas@tklengyel.com>
> Subject: [PATCH v2 08/11] xen/memory: Indent part of acquire_resource()
> 
> Indent the middle of acquire_resource() inside a do {} while ( 0 ) loop.  This
> is broken out specifically to make the following change readable.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Paul Durrant <paul@xen.org>

> ---
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Ian Jackson <iwj@xenproject.org>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wl@xen.org>
> CC: Julien Grall <julien@xen.org>
> CC: Paul Durrant <paul@xen.org>
> CC: Michał Leszczyński <michal.leszczynski@cert.pl>
> CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
> CC: Tamas K Lengyel <tamas@tklengyel.com>
> ---
>  xen/common/memory.c | 66 +++++++++++++++++++++++++++--------------------------
>  1 file changed, 34 insertions(+), 32 deletions(-)
> 
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index c559935732..369154b7c0 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1087,44 +1087,46 @@ static int acquire_resource(
>          goto out;
>      }
> 
> -    switch ( xmar.type )
> -    {
> -    case XENMEM_resource_grant_table:
> -        rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
> -                                     mfn_list);
> -        break;
> +    do {
> +        switch ( xmar.type )
> +        {
> +        case XENMEM_resource_grant_table:
> +            rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
> +                                         mfn_list);
> +            break;
> 
> -    default:
> -        rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
> -                                   xmar.nr_frames, mfn_list);
> -        break;
> -    }
> +        default:
> +            rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
> +                                       xmar.nr_frames, mfn_list);
> +            break;
> +        }
> 
> -    if ( rc )
> -        goto out;
> +        if ( rc )
> +            goto out;
> 
> -    if ( !paging_mode_translate(currd) )
> -    {
> -        if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
> -            rc = -EFAULT;
> -    }
> -    else
> -    {
> -        xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
> -        unsigned int i;
> +        if ( !paging_mode_translate(currd) )
> +        {
> +            if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
> +                rc = -EFAULT;
> +        }
> +        else
> +        {
> +            xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
> +            unsigned int i;
> 
> -        if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
> -            rc = -EFAULT;
> +            if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
> +                rc = -EFAULT;
> 
> -        for ( i = 0; !rc && i < xmar.nr_frames; i++ )
> -        {
> -            rc = set_foreign_p2m_entry(currd, gfn_list[i],
> -                                       _mfn(mfn_list[i]));
> -            /* rc should be -EIO for any iteration other than the first */
> -            if ( rc && i )
> -                rc = -EIO;
> +            for ( i = 0; !rc && i < xmar.nr_frames; i++ )
> +            {
> +                rc = set_foreign_p2m_entry(currd, gfn_list[i],
> +                                           _mfn(mfn_list[i]));
> +                /* rc should be -EIO for any iteration other than the first */
> +                if ( rc && i )
> +                    rc = -EIO;
> +            }
>          }
> -    }
> +    } while ( 0 );
> 
>   out:
>      rcu_unlock_domain(d);
> --
> 2.11.0




^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
  2020-09-22 18:24 ` [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource Andrew Cooper
@ 2020-09-24 10:47   ` Paul Durrant
  2021-01-08 19:36     ` Andrew Cooper
  2020-09-28  9:37   ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2020-09-24 10:47 UTC (permalink / raw)
  To: 'Andrew Cooper', 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

> -----Original Message-----
> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: 22 September 2020 19:25
> To: Xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap <George.Dunlap@eu.citrix.com>; Ian
> Jackson <iwj@xenproject.org>; Jan Beulich <JBeulich@suse.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Paul Durrant
> <paul@xen.org>; Michał Leszczyński <michal.leszczynski@cert.pl>; Hubert Jasudowicz
> <hubert.jasudowicz@cert.pl>; Tamas K Lengyel <tamas@tklengyel.com>
> Subject: [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
> 
> A guests default number of grant frames is 64, and XENMEM_acquire_resource

Nit: s/guests/guest's

> will reject an attempt to map more than 32 frames.  This limit is caused by
> the size of mfn_list[] on the stack.
> 
> Fix mapping of arbitrary size requests by looping over batches of 32 in
> acquire_resource(), and using hypercall continuations when necessary.
> 
> To start with, break _acquire_resource() out of acquire_resource() to cope
> with type-specific dispatching, and update the return semantics to indicate
> the number of mfns returned.  Update gnttab_acquire_resource() and x86's
> arch_acquire_resource() to match these new semantics.
> 
> Have do_memory_op() pass start_extent into acquire_resource() so it can pick
> up where it left off after a continuation, and loop over batches of 32 until
> all the work is done, or a continuation needs to occur.
> 
> compat_memory_op() is a bit more complicated, because it also has to marshal
> frame_list in the XLAT buffer.  Have it account for continuation information
> itself and hide details from the upper layer, so it can marshal the buffer in
> chunks if necessary.
> 
> With these fixes in place, it is now possible to map the whole grant table for
> a guest.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Ian Jackson <iwj@xenproject.org>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wl@xen.org>
> CC: Julien Grall <julien@xen.org>
> CC: Paul Durrant <paul@xen.org>
> CC: Michał Leszczyński <michal.leszczynski@cert.pl>
> CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
> CC: Tamas K Lengyel <tamas@tklengyel.com>
> ---
>  xen/arch/x86/mm.c          |   4 +-
>  xen/common/compat/memory.c |  96 ++++++++++++++++++++++++++++++--------
>  xen/common/grant_table.c   |   3 ++
>  xen/common/memory.c        | 114 ++++++++++++++++++++++++++++++++++-----------
>  4 files changed, 168 insertions(+), 49 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index e82307bdae..8628f51402 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4632,7 +4632,6 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
>          if ( id != (unsigned int)ioservid )
>              break;
> 
> -        rc = 0;
>          for ( i = 0; i < nr_frames; i++ )
>          {
>              mfn_t mfn;
> @@ -4643,6 +4642,9 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
> 
>              mfn_list[i] = mfn_x(mfn);
>          }
> +        if ( i == nr_frames )
> +            /* Success.  Passed nr_frames back to the caller. */

Nit: s/passed/pass

> +            rc = nr_frames;
>          break;
>      }
>  #endif
> diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
> index 834c5e19d1..17619f26ed 100644
> --- a/xen/common/compat/memory.c
> +++ b/xen/common/compat/memory.c
> @@ -402,23 +402,10 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>          case XENMEM_acquire_resource:
>          {
>              xen_pfn_t *xen_frame_list = NULL;
> -            unsigned int max_nr_frames;
> 
>              if ( copy_from_guest(&cmp.mar, compat, 1) )
>                  return -EFAULT;
> 
> -            /*
> -             * The number of frames handled is currently limited to a
> -             * small number by the underlying implementation, so the
> -             * scratch space should be sufficient for bouncing the
> -             * frame addresses.
> -             */
> -            max_nr_frames = (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
> -                sizeof(*xen_frame_list);
> -
> -            if ( cmp.mar.nr_frames > max_nr_frames )
> -                return -E2BIG;
> -
>              /* Marshal the frame list in the remainder of the xlat space. */
>              if ( !compat_handle_is_null(cmp.mar.frame_list) )
>                  xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
> @@ -432,6 +419,28 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
> 
>              if ( xen_frame_list && cmp.mar.nr_frames )
>              {
> +                unsigned int xlat_max_frames =
> +                    (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
> +                    sizeof(*xen_frame_list);
> +
> +                if ( start_extent >= nat.mar->nr_frames )
> +                    return -EINVAL;
> +
> +                /*
> +                 * Adjust nat to account for work done on previous
> +                 * continuations, leaving cmp pristine.  Hide the continaution
> +                 * from the native code to prevent double accounting.
> +                 */
> +                nat.mar->nr_frames -= start_extent;
> +                nat.mar->frame += start_extent;
> +                cmd &= MEMOP_CMD_MASK;
> +
> +                /*
> +                 * If there are two many frames to fit within the xlat buffer,
> +                 * we'll need to loop to marshal them all.
> +                 */
> +                nat.mar->nr_frames = min(nat.mar->nr_frames, xlat_max_frames);
> +
>                  /*
>                   * frame_list is an input for translated guests, and an output
>                   * for untranslated guests.  Only copy in for translated guests.
> @@ -444,14 +453,14 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                                               cmp.mar.nr_frames) ||
>                           __copy_from_compat_offset(
>                               compat_frame_list, cmp.mar.frame_list,
> -                             0, cmp.mar.nr_frames) )
> +                             start_extent, nat.mar->nr_frames) )
>                          return -EFAULT;
> 
>                      /*
>                       * Iterate backwards over compat_frame_list[] expanding
>                       * compat_pfn_t to xen_pfn_t in place.
>                       */
> -                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
> +                    for ( int x = nat.mar->nr_frames - 1; x >= 0; --x )
>                          xen_frame_list[x] = compat_frame_list[x];
>                  }
>              }
> @@ -600,9 +609,11 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>          case XENMEM_acquire_resource:
>          {
>              DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
> +            unsigned int done;
> 
>              if ( compat_handle_is_null(cmp.mar.frame_list) )
>              {
> +                ASSERT(split == 0 && rc == 0);
>                  if ( __copy_field_to_guest(
>                           guest_handle_cast(compat,
>                                             compat_mem_acquire_resource_t),
> @@ -611,6 +622,21 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                  break;
>              }
> 
> +            if ( split < 0 )
> +            {
> +                /* Contintuation occured. */
> +                ASSERT(rc != XENMEM_acquire_resource);

Do you mean "op !=" here?

> +                done = cmd >> MEMOP_EXTENT_SHIFT;
> +            }
> +            else
> +            {
> +                /* No continuation. */
> +                ASSERT(rc == 0);

"!rc" for consistency with other code perhaps?

> +                done = nat.mar->nr_frames;
> +            }
> +
> +            ASSERT(done <= nat.mar->nr_frames);
> +
>              /*
>               * frame_list is an input for translated guests, and an output for
>               * untranslated guests.  Only copy out for untranslated guests.
> @@ -626,7 +652,7 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                   */
>                  BUILD_BUG_ON(sizeof(compat_pfn_t) > sizeof(xen_pfn_t));
> 
> -                for ( i = 0; i < cmp.mar.nr_frames; i++ )
> +                for ( i = 0; i < done; i++ )
>                  {
>                      compat_pfn_t frame = xen_frame_list[i];
> 
> @@ -636,15 +662,45 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                      compat_frame_list[i] = frame;
>                  }
> 
> -                if ( __copy_to_compat_offset(cmp.mar.frame_list, 0,
> -                                             compat_frame_list,
> -                                             cmp.mar.nr_frames) )
> +                if ( __copy_to_compat_offset(
> +                         cmp.mar.frame_list, start_extent,
> +                         compat_frame_list, done) )
>                      return -EFAULT;
>              }
> -            break;
> +
> +            start_extent += done;
> +
> +            /* Completely done. */
> +            if ( start_extent == cmp.mar.nr_frames )
> +                break;
> +
> +            /*
> +             * Done a "full" batch, but we were limited by space in the xlat
> +             * area.  Go around the loop again without necesserily returning
> +             * to guest context.
> +             */
> +            if ( done == nat.mar->nr_frames )
> +            {
> +                split = 1;
> +                break;
> +            }
> +
> +            /* Explicit continuation request from a higher level. */
> +            if ( done < nat.mar->nr_frames )
> +                return hypercall_create_continuation(
> +                    __HYPERVISOR_memory_op, "ih",
> +                    op | (start_extent << MEMOP_EXTENT_SHIFT), compat);
> +
> +            /*
> +             * Well... Somethings gone wrong with the two levels of chunking.
> +             * My condolences to whomever next has to debug this mess.
> +             */
> +            ASSERT_UNREACHABLE();
> +            goto crash;
>          }
> 
>          default:
> +        crash:
>              domain_crash(current->domain);
>              split = 0;
>              break;
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index 8c401a5540..81db47f8fd 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -4105,6 +4105,9 @@ int gnttab_acquire_resource(
>      for ( i = 0; i < nr_frames; ++i )
>          mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
> 
> +    /* Success.  Passed nr_frames back to the caller. */
> +    rc = nr_frames;
> +
>   out:
>      grant_write_unlock(gt);
> 
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 369154b7c0..ec276cb9b1 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1027,17 +1027,31 @@ static unsigned int resource_max_frames(struct domain *d,
>      }
>  }
> 
> +/*
> + * Returns -errno on error, or positive in the range [1, nr_frames] on
> + * success.  Returning less than nr_frames contitutes a request for a
> + * continuation.
> + */
> +static int _acquire_resource(
> +    struct domain *d, unsigned int type, unsigned int id, unsigned long frame,
> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
> +{
> +    switch ( type )
> +    {
> +    case XENMEM_resource_grant_table:
> +        return gnttab_acquire_resource(d, id, frame, nr_frames, mfn_list);
> +
> +    default:
> +        return arch_acquire_resource(d, type, id, frame, nr_frames, mfn_list);
> +    }
> +}
> +
>  static int acquire_resource(
> -    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
> +    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg,
> +    unsigned long start_extent)
>  {
>      struct domain *d, *currd = current->domain;
>      xen_mem_acquire_resource_t xmar;
> -    /*
> -     * The mfn_list and gfn_list (below) arrays are ok on stack for the
> -     * moment since they are small, but if they need to grow in future
> -     * use-cases then per-CPU arrays or heap allocations may be required.
> -     */
> -    xen_pfn_t mfn_list[32];
>      unsigned int max_frames;
>      int rc;
> 
> @@ -1055,9 +1069,6 @@ static int acquire_resource(
>      if ( xmar.pad != 0 )
>          return -EINVAL;
> 
> -    if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
> -        return -E2BIG;
> -
>      rc = rcu_lock_remote_domain_by_id(xmar.domid, &d);
>      if ( rc )
>          return rc;
> @@ -1074,7 +1085,7 @@ static int acquire_resource(
> 
>      if ( guest_handle_is_null(xmar.frame_list) )
>      {
> -        if ( xmar.nr_frames )
> +        if ( xmar.nr_frames || start_extent )
>              goto out;
> 
>          xmar.nr_frames = max_frames;
> @@ -1087,26 +1098,47 @@ static int acquire_resource(
>          goto out;
>      }
> 
> +    /*
> +     * Limiting nr_frames at (UINT_MAX >> MEMOP_EXTENT_SHIFT) isn't ideal.  If
> +     * it ever becomes a practical problem, we can switch to mutating
> +     * xmar.{frame,nr_frames,frame_list} in guest memory.
> +     */
> +    rc = -EINVAL;
> +    if ( start_extent >= xmar.nr_frames ||
> +         xmar.nr_frames > (UINT_MAX >> MEMOP_EXTENT_SHIFT) )
> +        goto out;
> +
> +    /* Adjust for work done on previous continuations. */
> +    xmar.nr_frames -= start_extent;
> +    xmar.frame += start_extent;
> +    guest_handle_add_offset(xmar.frame_list, start_extent);
> +
>      do {
> -        switch ( xmar.type )
> -        {
> -        case XENMEM_resource_grant_table:
> -            rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
> -                                         mfn_list);
> -            break;
> +        /*
> +         * Arbitrary size.  Not too much stack space, and a reasonable stride
> +         * for continutation checks.
> +         */
> +        xen_pfn_t mfn_list[32];
> +        unsigned int todo = MIN(ARRAY_SIZE(mfn_list), xmar.nr_frames), done;
> 
> -        default:
> -            rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
> -                                       xmar.nr_frames, mfn_list);
> -            break;
> -        }
> +        rc = _acquire_resource(d, xmar.type, xmar.id, xmar.frame,
> +                               todo, mfn_list);
> +        if ( rc < 0 )
> +            goto out;
> 
> -        if ( rc )
> +        done = rc;
> +        rc = 0;
> +        if ( done == 0 || done > todo )
> +        {
> +            ASSERT_UNREACHABLE();
> +            rc = -EINVAL;
>              goto out;
> +        }
> 
> +        /* Adjust guest frame_list appropriately. */
>          if ( !paging_mode_translate(currd) )
>          {
> -            if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
> +            if ( copy_to_guest(xmar.frame_list, mfn_list, done) )
>                  rc = -EFAULT;
>          }
>          else
> @@ -1114,10 +1146,10 @@ static int acquire_resource(
>              xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
>              unsigned int i;
> 
> -            if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
> +            if ( copy_from_guest(gfn_list, xmar.frame_list, done) )
>                  rc = -EFAULT;
> 
> -            for ( i = 0; !rc && i < xmar.nr_frames; i++ )
> +            for ( i = 0; !rc && i < done; i++ )
>              {
>                  rc = set_foreign_p2m_entry(currd, gfn_list[i],
>                                             _mfn(mfn_list[i]));
> @@ -1126,7 +1158,32 @@ static int acquire_resource(
>                      rc = -EIO;
>              }
>          }
> -    } while ( 0 );
> +
> +        if ( rc )
> +            goto out;
> +
> +        xmar.nr_frames -= done;
> +        xmar.frame += done;
> +        guest_handle_add_offset(xmar.frame_list, done);
> +        start_extent += done;
> +
> +        /*
> +         * Explicit contination request from _acquire_resource(), or we've
> +         * still got work to do other work is pending.

I think "still got work to do" or " other work is pending", not both?

  Paul

> +         */
> +        if ( done < todo ||
> +             (xmar.nr_frames && hypercall_preempt_check()) )
> +        {
> +            rc = hypercall_create_continuation(
> +                __HYPERVISOR_memory_op, "lh",
> +                XENMEM_acquire_resource | (start_extent << MEMOP_EXTENT_SHIFT),
> +                arg);
> +            goto out;
> +        }
> +
> +    } while ( xmar.nr_frames );
> +
> +    rc = 0;
> 
>   out:
>      rcu_unlock_domain(d);
> @@ -1593,7 +1650,8 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> 
>      case XENMEM_acquire_resource:
>          rc = acquire_resource(
> -            guest_handle_cast(arg, xen_mem_acquire_resource_t));
> +            guest_handle_cast(arg, xen_mem_acquire_resource_t),
> +            start_extent);
>          break;
> 
>      default:
> --
> 2.11.0




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics
  2020-09-24 10:06   ` Paul Durrant
@ 2020-09-24 10:57     ` Andrew Cooper
  2020-09-24 11:04       ` Paul Durrant
  0 siblings, 1 reply; 42+ messages in thread
From: Andrew Cooper @ 2020-09-24 10:57 UTC (permalink / raw)
  To: paul, 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

On 24/09/2020 11:06, Paul Durrant wrote:
>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>> index d1cfc8fb4a..e82307bdae 100644
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -4591,6 +4591,26 @@ int xenmem_add_to_physmap_one(
>>      return rc;
>>  }
>>
>> +unsigned int arch_resource_max_frames(
>> +    struct domain *d, unsigned int type, unsigned int id)
>> +{
>> +    unsigned int nr = 0;
>> +
>> +    switch ( type )
>> +    {
>> +#ifdef CONFIG_HVM
>> +    case XENMEM_resource_ioreq_server:
>> +        if ( !is_hvm_domain(d) )
>> +            break;
>> +        /* One frame for the buf-ioreq ring, and one frame per 128 vcpus. */
>> +        nr = 1 + DIV_ROUND_UP(d->max_vcpus * sizeof(struct ioreq), PAGE_SIZE);
> The buf-ioreq ring is optional

Yes, but it's position within the resource, and effect on the position
of the ioreq page(s), is not.

>  so a caller using this value may still get a resource acquisition failure unless the id is used to actually look up and check the ioreq server in question for the actual maximum.

Yes, but that is potentially true of *any* acquisition attempt, even if
the id matches, because maybe someone else has destroyed the ioreq
server, or the domain, in the meantime.


What we have is an mmap() where the caller needs to know to not try and
map page 0 for an ioreq server where buf-ioreq doesn't exist.

This is a direct consequence of:

#define XENMEM_resource_ioreq_server_frame_bufioreq 0
#define XENMEM_resource_ioreq_server_frame_ioreq(n) (1 + (n))

and in practice, what a qemu/demu/other needs to do to map just the
ioreq frames (in a manner compatible with >127 vcpu HVM domains) is to
query the resource size and map size-1 pages from offset 1. 

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics
  2020-09-24 10:57     ` Andrew Cooper
@ 2020-09-24 11:04       ` Paul Durrant
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2020-09-24 11:04 UTC (permalink / raw)
  To: 'Andrew Cooper', 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

> -----Original Message-----
> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: 24 September 2020 11:58
> To: paul@xen.org; 'Xen-devel' <xen-devel@lists.xenproject.org>
> Cc: 'George Dunlap' <George.Dunlap@eu.citrix.com>; 'Ian Jackson' <iwj@xenproject.org>; 'Jan Beulich'
> <JBeulich@suse.com>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>; 'Julien
> Grall' <julien@xen.org>; 'Michał Leszczyński' <michal.leszczynski@cert.pl>; 'Hubert Jasudowicz'
> <hubert.jasudowicz@cert.pl>; 'Tamas K Lengyel' <tamas@tklengyel.com>
> Subject: Re: [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics
> 
> On 24/09/2020 11:06, Paul Durrant wrote:
> >> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> >> index d1cfc8fb4a..e82307bdae 100644
> >> --- a/xen/arch/x86/mm.c
> >> +++ b/xen/arch/x86/mm.c
> >> @@ -4591,6 +4591,26 @@ int xenmem_add_to_physmap_one(
> >>      return rc;
> >>  }
> >>
> >> +unsigned int arch_resource_max_frames(
> >> +    struct domain *d, unsigned int type, unsigned int id)
> >> +{
> >> +    unsigned int nr = 0;
> >> +
> >> +    switch ( type )
> >> +    {
> >> +#ifdef CONFIG_HVM
> >> +    case XENMEM_resource_ioreq_server:
> >> +        if ( !is_hvm_domain(d) )
> >> +            break;
> >> +        /* One frame for the buf-ioreq ring, and one frame per 128 vcpus. */
> >> +        nr = 1 + DIV_ROUND_UP(d->max_vcpus * sizeof(struct ioreq), PAGE_SIZE);
> > The buf-ioreq ring is optional
> 
> Yes, but it's position within the resource, and effect on the position
> of the ioreq page(s), is not.

Oh yes, I was forgetting that this is fixed so...

Reviewed-by: Paul Durrant <paul@xen.org>

> 
> >  so a caller using this value may still get a resource acquisition failure unless the id is used to
> actually look up and check the ioreq server in question for the actual maximum.
> 
> Yes, but that is potentially true of *any* acquisition attempt, even if
> the id matches, because maybe someone else has destroyed the ioreq
> server, or the domain, in the meantime.
> 
> 
> What we have is an mmap() where the caller needs to know to not try and
> map page 0 for an ioreq server where buf-ioreq doesn't exist.
> 
> This is a direct consequence of:
> 
> #define XENMEM_resource_ioreq_server_frame_bufioreq 0
> #define XENMEM_resource_ioreq_server_frame_ioreq(n) (1 + (n))
> 
> and in practice, what a qemu/demu/other needs to do to map just the
> ioreq frames (in a manner compatible with >127 vcpu HVM domains) is to
> query the resource size and map size-1 pages from offset 1.

Yes.

  Paul

> 
> ~Andrew



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2020-09-22 18:24 ` [PATCH v2 02/11] xen/gnttab: Rework resource acquisition Andrew Cooper
  2020-09-24  9:51   ` Paul Durrant
@ 2020-09-25 13:17   ` Jan Beulich
  2021-01-11 21:22     ` Andrew Cooper
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2020-09-25 13:17 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 22.09.2020 20:24, Andrew Cooper wrote:
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>      return 0;
>  }
>  
> +int gnttab_acquire_resource(
> +    struct domain *d, unsigned int id, unsigned long frame,
> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
> +{
> +    struct grant_table *gt = d->grant_table;
> +    unsigned int i = nr_frames, tot_frames;
> +    mfn_t tmp;
> +    void **vaddrs = NULL;
> +    int rc;
> +
> +    /* Input sanity. */
> +    if ( !nr_frames )
> +        return -EINVAL;

I continue to object to this becoming an error. It wasn't before,
and it wouldn't be in line with various other operations, not the
least XENMEM_resource_ioreq_server handling, but also various of
the other mem-op functions (just to give concrete examples) or
basically all of the grant-table ones.

And if it really was to be an error, it could be had by folding
into ...

> +    /* Overflow checks */
> +    if ( frame + nr_frames < frame )
> +        return -EINVAL;

this:

    if ( frame + nr_frames <= frame )
        return -EINVAL;

> +    tot_frames = frame + nr_frames;
> +    if ( tot_frames != frame + nr_frames )
> +        return -EINVAL;
> +
> +    /* Grow table if necessary. */
> +    grant_write_lock(gt);
> +    switch ( id )
> +    {
> +    case XENMEM_resource_grant_table_id_shared:
> +        rc = gnttab_get_shared_frame_mfn(d, tot_frames - 1, &tmp);
> +        break;
> +
> +    case XENMEM_resource_grant_table_id_status:
> +        if ( gt->gt_version != 2 )
> +        {
> +    default:
> +            rc = -EINVAL;
> +            break;
> +        }
> +        rc = gnttab_get_status_frame_mfn(d, tot_frames - 1, &tmp);
> +        break;
> +    }
> +
> +    /* Any errors from growing the table? */
> +    if ( rc )
> +        goto out;
> +
> +    switch ( id )
> +    {
> +    case XENMEM_resource_grant_table_id_shared:
> +        vaddrs = gt->shared_raw;
> +        break;
> +
> +    case XENMEM_resource_grant_table_id_status:
> +        /* Check that void ** is a suitable representation for gt->status. */
> +        BUILD_BUG_ON(!__builtin_types_compatible_p(
> +                         typeof(gt->status), grant_status_t **));
> +        vaddrs = (void **)gt->status;
> +        break;
> +    }

Why the 2nd switch()? As long as you don't de-reference vaddrs
prior to checking rc, I don't see anything that could go wrong.
And once both are folded, maybe vaddr's initializer could also
be dropped again?

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics
  2020-09-22 18:24 ` [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics Andrew Cooper
  2020-09-24 10:06   ` Paul Durrant
@ 2020-09-25 15:56   ` Jan Beulich
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2020-09-25 15:56 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 22.09.2020 20:24, Andrew Cooper wrote:
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1007,6 +1007,26 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
>      return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
>  }
>  
> +/*
> + * Return 0 on any kind of error.  Caller converts to -EINVAL.
> + *
> + * All nonzero values should be repeatable (i.e. derived from some fixed
> + * property of the domain), and describe the full resource (i.e. mapping the
> + * result of this call will be the entire resource).
> + */
> +static unsigned int resource_max_frames(struct domain *d,

With the lockless intentions I think this could be const from
here on through all the descendants. With this
Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit I have one more minor remark:

> @@ -1058,6 +1066,27 @@ static int acquire_resource(
>      if ( rc )
>          goto out;
>  
> +    max_frames = resource_max_frames(d, xmar.type, xmar.id);
> +
> +    rc = -EINVAL;
> +    if ( !max_frames )
> +        goto out;
> +
> +    if ( guest_handle_is_null(xmar.frame_list) )
> +    {
> +        if ( xmar.nr_frames )
> +            goto out;
> +
> +        xmar.nr_frames = max_frames;
> +
> +        rc = -EFAULT;
> +        if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
> +            goto out;
> +
> +        rc = 0;
> +        goto out;
> +    }

That's a lot of "goto out" here. I don't suppose I could talk you
into reducing their amount some, since at least the last two look
to be easy to fold?

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling
  2020-09-22 18:24 ` [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling Andrew Cooper
  2020-09-24 10:16   ` Paul Durrant
@ 2020-09-28  9:09   ` Jan Beulich
  2021-01-08 18:57     ` Andrew Cooper
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2020-09-28  9:09 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 22.09.2020 20:24, Andrew Cooper wrote:
> @@ -446,6 +430,31 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>  
>  #undef XLAT_mem_acquire_resource_HNDL_frame_list
>  
> +            if ( xen_frame_list && cmp.mar.nr_frames )
> +            {
> +                /*
> +                 * frame_list is an input for translated guests, and an output
> +                 * for untranslated guests.  Only copy in for translated guests.
> +                 */
> +                if ( paging_mode_translate(currd) )
> +                {
> +                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
> +
> +                    if ( !compat_handle_okay(cmp.mar.frame_list,
> +                                             cmp.mar.nr_frames) ||
> +                         __copy_from_compat_offset(
> +                             compat_frame_list, cmp.mar.frame_list,
> +                             0, cmp.mar.nr_frames) )
> +                        return -EFAULT;
> +
> +                    /*
> +                     * Iterate backwards over compat_frame_list[] expanding
> +                     * compat_pfn_t to xen_pfn_t in place.
> +                     */
> +                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
> +                        xen_frame_list[x] = compat_frame_list[x];

In addition to what Paul has said, I also don't see why you resort
to a signed type here. Using the available local variable i ought to
be quite easy:

                    for ( i = cmp.mar.nr_frames; i--; )
                        xen_frame_list[i] = compat_frame_list[i];

As an aside, considering the controversy we're having on patch 2, I
find it quite interesting how you carefully allow for nr_frames being
zero throughout your changes here (which, as I think is obvious, I
agree you want to do).

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
  2020-09-22 18:24 ` [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource Andrew Cooper
  2020-09-24 10:47   ` Paul Durrant
@ 2020-09-28  9:37   ` Jan Beulich
  2021-01-11 20:05     ` Andrew Cooper
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2020-09-28  9:37 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 22.09.2020 20:24, Andrew Cooper wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4632,7 +4632,6 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
>          if ( id != (unsigned int)ioservid )
>              break;
>  
> -        rc = 0;
>          for ( i = 0; i < nr_frames; i++ )
>          {
>              mfn_t mfn;
> @@ -4643,6 +4642,9 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
>  
>              mfn_list[i] = mfn_x(mfn);
>          }
> +        if ( i == nr_frames )
> +            /* Success.  Passed nr_frames back to the caller. */
> +            rc = nr_frames;

With this, shouldn't the return type of the function be changed to
"long"? I realize that's no an issue with XENMEM_resource_ioreq_server
specifically, but I mean the general case.

> --- a/xen/common/compat/memory.c
> +++ b/xen/common/compat/memory.c
> @@ -402,23 +402,10 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>          case XENMEM_acquire_resource:
>          {
>              xen_pfn_t *xen_frame_list = NULL;
> -            unsigned int max_nr_frames;
>  
>              if ( copy_from_guest(&cmp.mar, compat, 1) )
>                  return -EFAULT;
>  
> -            /*
> -             * The number of frames handled is currently limited to a
> -             * small number by the underlying implementation, so the
> -             * scratch space should be sufficient for bouncing the
> -             * frame addresses.
> -             */
> -            max_nr_frames = (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
> -                sizeof(*xen_frame_list);
> -
> -            if ( cmp.mar.nr_frames > max_nr_frames )
> -                return -E2BIG;
> -
>              /* Marshal the frame list in the remainder of the xlat space. */
>              if ( !compat_handle_is_null(cmp.mar.frame_list) )
>                  xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
> @@ -432,6 +419,28 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>  
>              if ( xen_frame_list && cmp.mar.nr_frames )
>              {
> +                unsigned int xlat_max_frames =
> +                    (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
> +                    sizeof(*xen_frame_list);
> +
> +                if ( start_extent >= nat.mar->nr_frames )
> +                    return -EINVAL;

Like for patch 2, I don't see why the == case should result in an
error, at the very least when start_extent is zero.

> @@ -611,6 +622,21 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                  break;
>              }
>  
> +            if ( split < 0 )
> +            {
> +                /* Contintuation occured. */

Nit: Stray 't'. And missing 'r'?

> @@ -636,15 +662,45 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>                      compat_frame_list[i] = frame;
>                  }
>  
> -                if ( __copy_to_compat_offset(cmp.mar.frame_list, 0,
> -                                             compat_frame_list,
> -                                             cmp.mar.nr_frames) )
> +                if ( __copy_to_compat_offset(
> +                         cmp.mar.frame_list, start_extent,
> +                         compat_frame_list, done) )
>                      return -EFAULT;
>              }
> -            break;
> +
> +            start_extent += done;
> +
> +            /* Completely done. */
> +            if ( start_extent == cmp.mar.nr_frames )
> +                break;
> +
> +            /*
> +             * Done a "full" batch, but we were limited by space in the xlat
> +             * area.  Go around the loop again without necesserily returning
> +             * to guest context.
> +             */
> +            if ( done == nat.mar->nr_frames )
> +            {
> +                split = 1;
> +                break;
> +            }
> +
> +            /* Explicit continuation request from a higher level. */
> +            if ( done < nat.mar->nr_frames )
> +                return hypercall_create_continuation(
> +                    __HYPERVISOR_memory_op, "ih",
> +                    op | (start_extent << MEMOP_EXTENT_SHIFT), compat);
> +
> +            /*
> +             * Well... Somethings gone wrong with the two levels of chunking.
> +             * My condolences to whomever next has to debug this mess.
> +             */

Any suggestion how to overcome this "mess"?

> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -4105,6 +4105,9 @@ int gnttab_acquire_resource(
>      for ( i = 0; i < nr_frames; ++i )
>          mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
>  
> +    /* Success.  Passed nr_frames back to the caller. */

Nit: "Pass"?

> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1027,17 +1027,31 @@ static unsigned int resource_max_frames(struct domain *d,
>      }
>  }
>  
> +/*
> + * Returns -errno on error, or positive in the range [1, nr_frames] on
> + * success.  Returning less than nr_frames contitutes a request for a
> + * continuation.
> + */
> +static int _acquire_resource(
> +    struct domain *d, unsigned int type, unsigned int id, unsigned long frame,
> +    unsigned int nr_frames, xen_pfn_t mfn_list[])

As per the comment the return type may again want to be "long" here.
Albeit I realize the restriction to (UINT_MAX >> MEMOP_EXTENT_SHIFT)
makes this (and the other place above) only a latent issue for now,
so it may well be fine to be left as is.

> @@ -1087,26 +1098,47 @@ static int acquire_resource(
>          goto out;
>      }
>  
> +    /*
> +     * Limiting nr_frames at (UINT_MAX >> MEMOP_EXTENT_SHIFT) isn't ideal.  If
> +     * it ever becomes a practical problem, we can switch to mutating
> +     * xmar.{frame,nr_frames,frame_list} in guest memory.
> +     */

For 64-bit, extending the limit to ULONG_MAX >> MEMOP_EXTENT_SHIFT
may also be an option.

> +    rc = -EINVAL;
> +    if ( start_extent >= xmar.nr_frames ||

Again, at least when start_extent is zero, == should not result in an
error.

> +         xmar.nr_frames > (UINT_MAX >> MEMOP_EXTENT_SHIFT) )
> +        goto out;
> +
> +    /* Adjust for work done on previous continuations. */
> +    xmar.nr_frames -= start_extent;
> +    xmar.frame += start_extent;
> +    guest_handle_add_offset(xmar.frame_list, start_extent);
> +
>      do {
> -        switch ( xmar.type )
> -        {
> -        case XENMEM_resource_grant_table:
> -            rc = gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr_frames,
> -                                         mfn_list);
> -            break;
> +        /*
> +         * Arbitrary size.  Not too much stack space, and a reasonable stride
> +         * for continutation checks.

Nit: Stray 't' again.

> @@ -1126,7 +1158,32 @@ static int acquire_resource(
>                      rc = -EIO;
>              }
>          }
> -    } while ( 0 );
> +
> +        if ( rc )
> +            goto out;
> +
> +        xmar.nr_frames -= done;
> +        xmar.frame += done;
> +        guest_handle_add_offset(xmar.frame_list, done);
> +        start_extent += done;
> +
> +        /*
> +         * Explicit contination request from _acquire_resource(), or we've

Nit: Missing 'u' this time round.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource
  2020-09-22 18:24 ` [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource Andrew Cooper
@ 2021-01-08 17:52   ` Andrew Cooper
  2021-01-11 10:50     ` Roger Pau Monné
  2021-01-11 15:26   ` [PATCH v3 " Andrew Cooper
  1 sibling, 1 reply; 42+ messages in thread
From: Andrew Cooper @ 2021-01-08 17:52 UTC (permalink / raw)
  To: Xen-devel
  Cc: Ian Jackson, Michał Leszczyński, Hubert Jasudowicz,
	Tamas K Lengyel, Roger Pau Monné,
	Andrew Cooper

On 22/09/2020 19:24, Andrew Cooper wrote:
> diff --git a/tools/libs/foreignmemory/linux.c b/tools/libs/foreignmemory/linux.c
> index fe73d5ab72..eec089e232 100644
> --- a/tools/libs/foreignmemory/linux.c
> +++ b/tools/libs/foreignmemory/linux.c
> @@ -339,6 +342,39 @@ int osdep_xenforeignmemory_map_resource(
>      return 0;
>  }
>  
> +int osdep_xenforeignmemory_resource_size(
> +    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
> +    unsigned int id, unsigned long *nr_frames)
> +{
> +    int rc;
> +    struct xen_mem_acquire_resource *xmar =
> +        xencall_alloc_buffer(fmem->xcall, sizeof(*xmar));
> +
> +    if ( !xmar )
> +    {
> +        PERROR("Could not bounce memory for acquire_resource hypercall");
> +        return -1;
> +    }
> +
> +    *xmar = (struct xen_mem_acquire_resource){
> +        .domid = domid,
> +        .type = type,
> +        .id = id,
> +    };
> +
> +    rc = xencall2(fmem->xcall, __HYPERVISOR_memory_op,
> +                  XENMEM_acquire_resource, (uintptr_t)xmar);
> +    if ( rc )
> +        goto out;
> +
> +    *nr_frames = xmar->nr_frames;
> +
> + out:
> +    xencall_free_buffer(fmem->xcall, xmar);
> +
> +    return rc;
> +}

Having talked this through with Roger, it's broken.

In the meantime, foreignmem has gained acquire_resource on FreeBSD.
Nothing in this osdep function is linux-specific, so it oughtn't to be
osdep.

However, its also not permitted to make hypercalls like this in
restricted mode, and that isn't something we should be breaking. 
Amongst other things, it will prevent us from supporting >128 cpus, as
Qemu needs updating to use this interface in due course.

The only solution (which keeps restricted mode working) is to fix
Linux's ioctl() to be able to understand size requests.  This also
avoids foreignmem needing to open a xencall handle which was fugly in
the first place.

This patch (well - a subset of it) will be needed to remain, to make a
usable interface in userspace.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling
  2020-09-28  9:09   ` Jan Beulich
@ 2021-01-08 18:57     ` Andrew Cooper
  2021-01-11 14:25       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Andrew Cooper @ 2021-01-08 18:57 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 28/09/2020 10:09, Jan Beulich wrote:
> On 22.09.2020 20:24, Andrew Cooper wrote:
>> @@ -446,6 +430,31 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>>  
>>  #undef XLAT_mem_acquire_resource_HNDL_frame_list
>>  
>> +            if ( xen_frame_list && cmp.mar.nr_frames )
>> +            {
>> +                /*
>> +                 * frame_list is an input for translated guests, and an output
>> +                 * for untranslated guests.  Only copy in for translated guests.
>> +                 */
>> +                if ( paging_mode_translate(currd) )
>> +                {
>> +                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
>> +
>> +                    if ( !compat_handle_okay(cmp.mar.frame_list,
>> +                                             cmp.mar.nr_frames) ||
>> +                         __copy_from_compat_offset(
>> +                             compat_frame_list, cmp.mar.frame_list,
>> +                             0, cmp.mar.nr_frames) )
>> +                        return -EFAULT;
>> +
>> +                    /*
>> +                     * Iterate backwards over compat_frame_list[] expanding
>> +                     * compat_pfn_t to xen_pfn_t in place.
>> +                     */
>> +                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
>> +                        xen_frame_list[x] = compat_frame_list[x];
> In addition to what Paul has said, I also don't see why you resort
> to a signed type here. Using the available local variable i ought to
> be quite easy:
>
>                     for ( i = cmp.mar.nr_frames; i--; )
>                         xen_frame_list[i] = compat_frame_list[i];

My goal is to make this code able to be followed, not to obfuscate it
further.  In particular, my version doesn't take several minutes to
figure out why it doesn't die with a fatal #PF.

Also (because I thought it would be full of irony to try, and it turns
out I was right), your version is 9 bytes larger once compiled.  This
has everything to do with the scope of the induction variable.  I'm
surprised that, in your effort to irradiate overly large scopes, you
haven't pushed for this form further.

> As an aside, considering the controversy we're having on patch 2, I
> find it quite interesting how you carefully allow for nr_frames being
> zero throughout your changes here (which, as I think is obvious, I
> agree you want to do).

I thought you of all people would appreciate that there *is* a
separation of responsibilities between this parameter-marshalling one,
and the native one.

Also, this code doesn't livelock in the hypervisor when handed 0.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
  2020-09-24 10:47   ` Paul Durrant
@ 2021-01-08 19:36     ` Andrew Cooper
  0 siblings, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2021-01-08 19:36 UTC (permalink / raw)
  To: paul, 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

On 24/09/2020 11:47, Paul Durrant wrote:
>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>> index e82307bdae..8628f51402 100644
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -4632,7 +4632,6 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
>>          if ( id != (unsigned int)ioservid )
>>              break;
>>
>> -        rc = 0;
>>          for ( i = 0; i < nr_frames; i++ )
>>          {
>>              mfn_t mfn;
>> @@ -4643,6 +4642,9 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
>>
>>              mfn_list[i] = mfn_x(mfn);
>>          }
>> +        if ( i == nr_frames )
>> +            /* Success.  Passed nr_frames back to the caller. */
> Nit: s/passed/pass

That would change the meaning of the comment, and it would no longer be
accurate.

>
>> +            rc = nr_frames;
>>          break;
>>      }
>>  #endif
>> diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
>> index 834c5e19d1..17619f26ed 100644
>> --- a/xen/common/compat/memory.c
>> +++ b/xen/common/compat/memory.c
>> @@ -611,6 +622,21 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>>                  break;
>>              }
>>
>> +            if ( split < 0 )
>> +            {
>> +                /* Contintuation occured. */
>> +                ASSERT(rc != XENMEM_acquire_resource);
> Do you mean "op !=" here?

No.  Such an assertion had better trigger every time we hit this path.

rc is the continuation information that we're in the middle of
processing.  See also the comment about this being a giant undebuggable
mess.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource
  2021-01-08 17:52   ` Andrew Cooper
@ 2021-01-11 10:50     ` Roger Pau Monné
  2021-01-11 15:00       ` Andrew Cooper
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2021-01-11 10:50 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Ian Jackson, Michał Leszczyński,
	Hubert Jasudowicz, Tamas K Lengyel

On Fri, Jan 08, 2021 at 05:52:36PM +0000, Andrew Cooper wrote:
> On 22/09/2020 19:24, Andrew Cooper wrote:
> > diff --git a/tools/libs/foreignmemory/linux.c b/tools/libs/foreignmemory/linux.c
> > index fe73d5ab72..eec089e232 100644
> > --- a/tools/libs/foreignmemory/linux.c
> > +++ b/tools/libs/foreignmemory/linux.c
> > @@ -339,6 +342,39 @@ int osdep_xenforeignmemory_map_resource(
> >      return 0;
> >  }
> >  
> > +int osdep_xenforeignmemory_resource_size(
> > +    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
> > +    unsigned int id, unsigned long *nr_frames)
> > +{
> > +    int rc;
> > +    struct xen_mem_acquire_resource *xmar =
> > +        xencall_alloc_buffer(fmem->xcall, sizeof(*xmar));
> > +
> > +    if ( !xmar )
> > +    {
> > +        PERROR("Could not bounce memory for acquire_resource hypercall");
> > +        return -1;
> > +    }
> > +
> > +    *xmar = (struct xen_mem_acquire_resource){
> > +        .domid = domid,
> > +        .type = type,
> > +        .id = id,
> > +    };
> > +
> > +    rc = xencall2(fmem->xcall, __HYPERVISOR_memory_op,
> > +                  XENMEM_acquire_resource, (uintptr_t)xmar);
> > +    if ( rc )
> > +        goto out;
> > +
> > +    *nr_frames = xmar->nr_frames;
> > +
> > + out:
> > +    xencall_free_buffer(fmem->xcall, xmar);
> > +
> > +    return rc;
> > +}
> 
> Having talked this through with Roger, it's broken.
> 
> In the meantime, foreignmem has gained acquire_resource on FreeBSD.
> Nothing in this osdep function is linux-specific, so it oughtn't to be
> osdep.
> 
> However, its also not permitted to make hypercalls like this in
> restricted mode, and that isn't something we should be breaking. 
> Amongst other things, it will prevent us from supporting >128 cpus, as
> Qemu needs updating to use this interface in due course.
> 
> The only solution (which keeps restricted mode working) is to fix
> Linux's ioctl() to be able to understand size requests.  This also
> avoids foreignmem needing to open a xencall handle which was fugly in
> the first place.

I think the following patch should allow you to fetch the resource
size from Linux privcmd driver by doing an ioctl with addr = 0 and num
= 0. I've just build tested it, but I haven't tried exercising the
code.

Roger.
---8<---
From 5d717c7b9ad3561ed0b17e7c5cf76b7c9fb536db Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau@citrix.com>
Date: Mon, 11 Jan 2021 10:38:59 +0100
Subject: [PATCH] xen/privcmd: allow fetching resource sizes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Allow issuing an IOCTL_PRIVCMD_MMAP_RESOURCE ioctl with num = 0 and
addr = 0 in order to fetch the size of a specific resource.

Add a shortcut to the default map resource path, since fetching the
size requires no address to be passed in, and thus no VMA to setup.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
NB: fetching the size of a resource shouldn't trigger an hypercall
preemption, and hence I've dropped the preempt indications.
---
 drivers/xen/privcmd.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index b0c73c58f987..3278f93eb3da 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -717,7 +717,8 @@ static long privcmd_ioctl_restrict(struct file *file, void __user *udata)
 	return 0;
 }
 
-static long privcmd_ioctl_mmap_resource(struct file *file, void __user *udata)
+static long privcmd_ioctl_mmap_resource(struct file *file,
+				struct privcmd_mmap_resource __user *udata)
 {
 	struct privcmd_data *data = file->private_data;
 	struct mm_struct *mm = current->mm;
@@ -734,6 +735,19 @@ static long privcmd_ioctl_mmap_resource(struct file *file, void __user *udata)
 	if (data->domid != DOMID_INVALID && data->domid != kdata.dom)
 		return -EPERM;
 
+	memset(&xdata, 0, sizeof(xdata));
+
+	if (!kdata.addr && !kdata.num) {
+		/* Query the size of the resource. */
+		xdata.domid = kdata.dom;
+		xdata.type = kdata.type;
+		xdata.id = kdata.id;
+		rc = HYPERVISOR_memory_op(XENMEM_acquire_resource, &xdata);
+		if (rc)
+			return rc;
+		return __put_user(xdata.nr_frames, &udata->num);
+	}
+
 	mmap_write_lock(mm);
 
 	vma = find_vma(mm, kdata.addr);
@@ -768,7 +782,6 @@ static long privcmd_ioctl_mmap_resource(struct file *file, void __user *udata)
 	} else
 		vma->vm_private_data = PRIV_VMA_LOCKED;
 
-	memset(&xdata, 0, sizeof(xdata));
 	xdata.domid = kdata.dom;
 	xdata.type = kdata.type;
 	xdata.id = kdata.id;
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling
  2021-01-08 18:57     ` Andrew Cooper
@ 2021-01-11 14:25       ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2021-01-11 14:25 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 08.01.2021 19:57, Andrew Cooper wrote:
> On 28/09/2020 10:09, Jan Beulich wrote:
>> On 22.09.2020 20:24, Andrew Cooper wrote:
>>> @@ -446,6 +430,31 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>>>  
>>>  #undef XLAT_mem_acquire_resource_HNDL_frame_list
>>>  
>>> +            if ( xen_frame_list && cmp.mar.nr_frames )
>>> +            {
>>> +                /*
>>> +                 * frame_list is an input for translated guests, and an output
>>> +                 * for untranslated guests.  Only copy in for translated guests.
>>> +                 */
>>> +                if ( paging_mode_translate(currd) )
>>> +                {
>>> +                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
>>> +
>>> +                    if ( !compat_handle_okay(cmp.mar.frame_list,
>>> +                                             cmp.mar.nr_frames) ||
>>> +                         __copy_from_compat_offset(
>>> +                             compat_frame_list, cmp.mar.frame_list,
>>> +                             0, cmp.mar.nr_frames) )
>>> +                        return -EFAULT;
>>> +
>>> +                    /*
>>> +                     * Iterate backwards over compat_frame_list[] expanding
>>> +                     * compat_pfn_t to xen_pfn_t in place.
>>> +                     */
>>> +                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
>>> +                        xen_frame_list[x] = compat_frame_list[x];
>> In addition to what Paul has said, I also don't see why you resort
>> to a signed type here. Using the available local variable i ought to
>> be quite easy:
>>
>>                     for ( i = cmp.mar.nr_frames; i--; )
>>                         xen_frame_list[i] = compat_frame_list[i];
> 
> My goal is to make this code able to be followed, not to obfuscate it
> further.  In particular, my version doesn't take several minutes to
> figure out why it doesn't die with a fatal #PF.
> 
> Also (because I thought it would be full of irony to try, and it turns
> out I was right), your version is 9 bytes larger once compiled.  This
> has everything to do with the scope of the induction variable.  I'm
> surprised that, in your effort to irradiate overly large scopes, you
> haven't pushed for this form further.

Let me reply in reverse order: When asking for scope reduction, I
don't think I would typically ask to introduce multiple identical
variables across many smaller scopes. So in general I view helper
variables like induction ones okay to have wide scope, as long as
they're used only for similar purposes (e.g. not again after their
loops).

Additionally it wasn't clear to me whether this originally C++
style of declaring induction variables was something we actually
consider acceptable. Personally I've avoided using such constructs
so far, for this reason. And again personally I'd be happy to see
us formally allow for their use.

Finally, the main aspect of my previous reply was left unaddressed:
I'm not so much concerned about the extra variable, but about it
being signed when (this being used as an array index) one can do
without.

>> As an aside, considering the controversy we're having on patch 2, I
>> find it quite interesting how you carefully allow for nr_frames being
>> zero throughout your changes here (which, as I think is obvious, I
>> agree you want to do).
> 
> I thought you of all people would appreciate that there *is* a
> separation of responsibilities between this parameter-marshalling one,
> and the native one.

Sure. But the two would better agree in their interpretation of
what a count of zero means.

> Also, this code doesn't livelock in the hypervisor when handed 0.

Would you mind explaining where there's a livelock? If indeed
code structure results in such, special casing count-is-zero
early on (and returning success without having done anything) is
an easy solution. Nevertheless I'd generally prefer to achieve
such behavior without additional code, e.g. by loops "naturally"
degenerating to no-ops in such a case.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource
  2021-01-11 10:50     ` Roger Pau Monné
@ 2021-01-11 15:00       ` Andrew Cooper
  0 siblings, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2021-01-11 15:00 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Xen-devel, Ian Jackson, Michał Leszczyński,
	Hubert Jasudowicz, Tamas K Lengyel, Andrew Cooper

On 11/01/2021 10:50, Roger Pau Monné wrote:
> On Fri, Jan 08, 2021 at 05:52:36PM +0000, Andrew Cooper wrote:
>> On 22/09/2020 19:24, Andrew Cooper wrote:
>>> diff --git a/tools/libs/foreignmemory/linux.c b/tools/libs/foreignmemory/linux.c
>>> index fe73d5ab72..eec089e232 100644
>>> --- a/tools/libs/foreignmemory/linux.c
>>> +++ b/tools/libs/foreignmemory/linux.c
>>> @@ -339,6 +342,39 @@ int osdep_xenforeignmemory_map_resource(
>>>      return 0;
>>>  }
>>>  
>>> +int osdep_xenforeignmemory_resource_size(
>>> +    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
>>> +    unsigned int id, unsigned long *nr_frames)
>>> +{
>>> +    int rc;
>>> +    struct xen_mem_acquire_resource *xmar =
>>> +        xencall_alloc_buffer(fmem->xcall, sizeof(*xmar));
>>> +
>>> +    if ( !xmar )
>>> +    {
>>> +        PERROR("Could not bounce memory for acquire_resource hypercall");
>>> +        return -1;
>>> +    }
>>> +
>>> +    *xmar = (struct xen_mem_acquire_resource){
>>> +        .domid = domid,
>>> +        .type = type,
>>> +        .id = id,
>>> +    };
>>> +
>>> +    rc = xencall2(fmem->xcall, __HYPERVISOR_memory_op,
>>> +                  XENMEM_acquire_resource, (uintptr_t)xmar);
>>> +    if ( rc )
>>> +        goto out;
>>> +
>>> +    *nr_frames = xmar->nr_frames;
>>> +
>>> + out:
>>> +    xencall_free_buffer(fmem->xcall, xmar);
>>> +
>>> +    return rc;
>>> +}
>> Having talked this through with Roger, it's broken.
>>
>> In the meantime, foreignmem has gained acquire_resource on FreeBSD.
>> Nothing in this osdep function is linux-specific, so it oughtn't to be
>> osdep.
>>
>> However, its also not permitted to make hypercalls like this in
>> restricted mode, and that isn't something we should be breaking. 
>> Amongst other things, it will prevent us from supporting >128 cpus, as
>> Qemu needs updating to use this interface in due course.
>>
>> The only solution (which keeps restricted mode working) is to fix
>> Linux's ioctl() to be able to understand size requests.  This also
>> avoids foreignmem needing to open a xencall handle which was fugly in
>> the first place.
> I think the following patch should allow you to fetch the resource
> size from Linux privcmd driver by doing an ioctl with addr = 0 and num
> = 0. I've just build tested it, but I haven't tried exercising the
> code.
>
> Roger.
> ---8<---
> From 5d717c7b9ad3561ed0b17e7c5cf76b7c9fb536db Mon Sep 17 00:00:00 2001
> From: Roger Pau Monne <roger.pau@citrix.com>
> Date: Mon, 11 Jan 2021 10:38:59 +0100
> Subject: [PATCH] xen/privcmd: allow fetching resource sizes
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Allow issuing an IOCTL_PRIVCMD_MMAP_RESOURCE ioctl with num = 0 and
> addr = 0 in order to fetch the size of a specific resource.
>
> Add a shortcut to the default map resource path, since fetching the
> size requires no address to be passed in, and thus no VMA to setup.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>

> ---
> NB: fetching the size of a resource shouldn't trigger an hypercall
> preemption, and hence I've dropped the preempt indications.

Yeah - that's fine.  Querying the size isn't ever going to turn into a
long running operation from the guest's point of view.

I'll submit the matching patch for libxenforeignmem.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 05/11] tools/foreignmem: Support querying the size of a resource
  2020-09-22 18:24 ` [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource Andrew Cooper
  2021-01-08 17:52   ` Andrew Cooper
@ 2021-01-11 15:26   ` Andrew Cooper
  2021-01-11 15:54     ` Roger Pau Monné
  1 sibling, 1 reply; 42+ messages in thread
From: Andrew Cooper @ 2021-01-11 15:26 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Andrew Cooper, Wei Liu, Paul Durrant,
	Roger Pau Monné,
	Ian Jackson, Michał Leszczyński, Hubert Jasudowicz,
	Tamas K Lengyel

With the Xen side of this interface fixed to return real sizes, userspace
needs to be able to make the query.

Introduce xenforeignmemory_resource_size() for the purpose, bumping the
library minor version.

Update both Linux and FreeBSD's osdep_xenforeignmemory_map_resource() to
understand size requests, skip the mmap() operation, and copy back the
nr_frames field.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Wei Liu <wl@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

This depends on a bugfix to the Linux IOCTL to understand size requests and
pass them on to Xen.

v3:
 * Rewrite from scratch, to avoid breaking restricted domid situations.  In
   particular, we cannot open a xencall interface and issue blind hypercalls.
---
 tools/include/xenforeignmemory.h                 | 15 +++++++++++++++
 tools/libs/foreignmemory/Makefile                |  2 +-
 tools/libs/foreignmemory/core.c                  | 18 ++++++++++++++++++
 tools/libs/foreignmemory/freebsd.c               | 18 +++++++++++++++---
 tools/libs/foreignmemory/libxenforeignmemory.map |  4 ++++
 tools/libs/foreignmemory/linux.c                 | 18 +++++++++++++++---
 6 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/tools/include/xenforeignmemory.h b/tools/include/xenforeignmemory.h
index d594be8df0..1ba2f5316b 100644
--- a/tools/include/xenforeignmemory.h
+++ b/tools/include/xenforeignmemory.h
@@ -179,6 +179,21 @@ xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
 int xenforeignmemory_unmap_resource(
     xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
 
+/**
+ * Determine the maximum size of a specific resource.
+ *
+ * @parm fmem handle to the open foreignmemory interface
+ * @parm domid the domain id
+ * @parm type the resource type
+ * @parm id the type-specific resource identifier
+ *
+ * Return 0 on success and fills in *nr_frames.  Sets errno and return -1 on
+ * error.
+ */
+int xenforeignmemory_resource_size(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long *nr_frames);
+
 #endif
 
 /*
diff --git a/tools/libs/foreignmemory/Makefile b/tools/libs/foreignmemory/Makefile
index 13850f7988..90d80a49ae 100644
--- a/tools/libs/foreignmemory/Makefile
+++ b/tools/libs/foreignmemory/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR    = 1
-MINOR    = 3
+MINOR    = 4
 
 SRCS-y                 += core.c
 SRCS-$(CONFIG_Linux)   += linux.c
diff --git a/tools/libs/foreignmemory/core.c b/tools/libs/foreignmemory/core.c
index 63f12e2450..1e92c567e1 100644
--- a/tools/libs/foreignmemory/core.c
+++ b/tools/libs/foreignmemory/core.c
@@ -188,6 +188,24 @@ int xenforeignmemory_unmap_resource(
     return rc;
 }
 
+int xenforeignmemory_resource_size(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long *nr_frames)
+{
+    xenforeignmemory_resource_handle fres = {
+        .domid = domid,
+        .type  = type,
+        .id    = id,
+    };
+    int rc = osdep_xenforeignmemory_map_resource(fmem, &fres);
+
+    if ( rc )
+        return rc;
+
+    *nr_frames = fres.nr_frames;
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/freebsd.c b/tools/libs/foreignmemory/freebsd.c
index 3d403a7cd0..9a2796f0b7 100644
--- a/tools/libs/foreignmemory/freebsd.c
+++ b/tools/libs/foreignmemory/freebsd.c
@@ -119,6 +119,10 @@ int osdep_xenforeignmemory_map_resource(xenforeignmemory_handle *fmem,
     };
     int rc;
 
+    if ( !fres->addr && !fres->nr_frames )
+        /* Request for resource size.  Skip mmap(). */
+        goto skip_mmap;
+
     fres->addr = mmap(fres->addr, fres->nr_frames << PAGE_SHIFT,
                       fres->prot, fres->flags | MAP_SHARED, fmem->fd, 0);
     if ( fres->addr == MAP_FAILED )
@@ -126,6 +130,7 @@ int osdep_xenforeignmemory_map_resource(xenforeignmemory_handle *fmem,
 
     mr.addr = (uintptr_t)fres->addr;
 
+ skip_mmap:
     rc = ioctl(fmem->fd, IOCTL_PRIVCMD_MMAP_RESOURCE, &mr);
     if ( rc )
     {
@@ -136,13 +141,20 @@ int osdep_xenforeignmemory_map_resource(xenforeignmemory_handle *fmem,
         else
             errno = EOPNOTSUPP;
 
-        saved_errno = errno;
-        osdep_xenforeignmemory_unmap_resource(fmem, fres);
-        errno = saved_errno;
+        if ( fres->addr )
+        {
+            saved_errno = errno;
+            osdep_xenforeignmemory_unmap_resource(fmem, fres);
+            errno = saved_errno;
+        }
 
         return -1;
     }
 
+    /* If requesting size, copy back. */
+    if ( !fres->addr )
+        fres->nr_frames = mr.num;
+
     return 0;
 }
 
diff --git a/tools/libs/foreignmemory/libxenforeignmemory.map b/tools/libs/foreignmemory/libxenforeignmemory.map
index d5323c87d9..8aca341b99 100644
--- a/tools/libs/foreignmemory/libxenforeignmemory.map
+++ b/tools/libs/foreignmemory/libxenforeignmemory.map
@@ -19,3 +19,7 @@ VERS_1.3 {
 		xenforeignmemory_map_resource;
 		xenforeignmemory_unmap_resource;
 } VERS_1.2;
+VERS_1.4 {
+	global:
+		xenforeignmemory_resource_size;
+} VERS_1.3;
diff --git a/tools/libs/foreignmemory/linux.c b/tools/libs/foreignmemory/linux.c
index fe73d5ab72..d0eead1196 100644
--- a/tools/libs/foreignmemory/linux.c
+++ b/tools/libs/foreignmemory/linux.c
@@ -312,6 +312,10 @@ int osdep_xenforeignmemory_map_resource(
     };
     int rc;
 
+    if ( !fres->addr && !fres->nr_frames )
+        /* Request for resource size.  Skip mmap(). */
+        goto skip_mmap;
+
     fres->addr = mmap(fres->addr, fres->nr_frames << PAGE_SHIFT,
                       fres->prot, fres->flags | MAP_SHARED, fmem->fd, 0);
     if ( fres->addr == MAP_FAILED )
@@ -319,6 +323,7 @@ int osdep_xenforeignmemory_map_resource(
 
     mr.addr = (uintptr_t)fres->addr;
 
+ skip_mmap:
     rc = ioctl(fmem->fd, IOCTL_PRIVCMD_MMAP_RESOURCE, &mr);
     if ( rc )
     {
@@ -329,13 +334,20 @@ int osdep_xenforeignmemory_map_resource(
         else
             errno = EOPNOTSUPP;
 
-        saved_errno = errno;
-        (void)osdep_xenforeignmemory_unmap_resource(fmem, fres);
-        errno = saved_errno;
+        if ( fres->addr )
+        {
+            saved_errno = errno;
+            osdep_xenforeignmemory_unmap_resource(fmem, fres);
+            errno = saved_errno;
+        }
 
         return -1;
     }
 
+    /* If requesting size, copy back. */
+    if ( !fres->addr )
+        fres->nr_frames = mr.num;
+
     return 0;
 }
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 05/11] tools/foreignmem: Support querying the size of a resource
  2021-01-11 15:26   ` [PATCH v3 " Andrew Cooper
@ 2021-01-11 15:54     ` Roger Pau Monné
  0 siblings, 0 replies; 42+ messages in thread
From: Roger Pau Monné @ 2021-01-11 15:54 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Andrew Cooper, Wei Liu, Paul Durrant, Ian Jackson,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On Mon, Jan 11, 2021 at 03:26:57PM +0000, Andrew Cooper wrote:
> With the Xen side of this interface fixed to return real sizes, userspace
> needs to be able to make the query.
> 
> Introduce xenforeignmemory_resource_size() for the purpose, bumping the
> library minor version.
> 
> Update both Linux and FreeBSD's osdep_xenforeignmemory_map_resource() to
> understand size requests, skip the mmap() operation, and copy back the
> nr_frames field.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
  2020-09-28  9:37   ` Jan Beulich
@ 2021-01-11 20:05     ` Andrew Cooper
  2021-01-11 22:36       ` Andrew Cooper
  2021-01-12  8:39       ` Jan Beulich
  0 siblings, 2 replies; 42+ messages in thread
From: Andrew Cooper @ 2021-01-11 20:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel,
	Andrew Cooper

On 28/09/2020 10:37, Jan Beulich wrote:
> On 22.09.2020 20:24, Andrew Cooper wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -4632,7 +4632,6 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
>>          if ( id != (unsigned int)ioservid )
>>              break;
>>  
>> -        rc = 0;
>>          for ( i = 0; i < nr_frames; i++ )
>>          {
>>              mfn_t mfn;
>> @@ -4643,6 +4642,9 @@ int arch_acquire_resource(struct domain *d, unsigned int type,
>>  
>>              mfn_list[i] = mfn_x(mfn);
>>          }
>> +        if ( i == nr_frames )
>> +            /* Success.  Passed nr_frames back to the caller. */
>> +            rc = nr_frames;
> With this, shouldn't the return type of the function be changed to
> "long"? I realize that's no an issue with XENMEM_resource_ioreq_server
> specifically, but I mean the general case.

That would require going back in time and making a more sane ABI for
struct xen_mem_acquire_resource

We really do have a 32bit nr_frames field, and a 64bit "where to
continue from" field.

>> --- a/xen/common/compat/memory.c
>> +++ b/xen/common/compat/memory.c
>> @@ -636,15 +662,45 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>>                      compat_frame_list[i] = frame;
>>                  }
>>  
>> -                if ( __copy_to_compat_offset(cmp.mar.frame_list, 0,
>> -                                             compat_frame_list,
>> -                                             cmp.mar.nr_frames) )
>> +                if ( __copy_to_compat_offset(
>> +                         cmp.mar.frame_list, start_extent,
>> +                         compat_frame_list, done) )
>>                      return -EFAULT;
>>              }
>> -            break;
>> +
>> +            start_extent += done;
>> +
>> +            /* Completely done. */
>> +            if ( start_extent == cmp.mar.nr_frames )
>> +                break;
>> +
>> +            /*
>> +             * Done a "full" batch, but we were limited by space in the xlat
>> +             * area.  Go around the loop again without necesserily returning
>> +             * to guest context.
>> +             */
>> +            if ( done == nat.mar->nr_frames )
>> +            {
>> +                split = 1;
>> +                break;
>> +            }
>> +
>> +            /* Explicit continuation request from a higher level. */
>> +            if ( done < nat.mar->nr_frames )
>> +                return hypercall_create_continuation(
>> +                    __HYPERVISOR_memory_op, "ih",
>> +                    op | (start_extent << MEMOP_EXTENT_SHIFT), compat);
>> +
>> +            /*
>> +             * Well... Somethings gone wrong with the two levels of chunking.
>> +             * My condolences to whomever next has to debug this mess.
>> +             */
> Any suggestion how to overcome this "mess"?

The double level of array handling is what makes it so complicated. 
There are enough cases in compat_memory_op() alone which can't

We've got two cases in practice.  A singleton object needing conversion,
or a large array of them.  I'm quite certain we'd have less code and
less complexity by having copy_$OJBECT_{to,from}_guest() helpers which
dealt with compat internally as appropriate.

We don't care about the performance of 32bit hypercalls, but not doing
batch conversions of 1020/etc objects in the compat layer will probably
result in better performance overall, as we don't throw away the work as
we batch things at smaller increments higher up the stack.

>
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -4105,6 +4105,9 @@ int gnttab_acquire_resource(
>>      for ( i = 0; i < nr_frames; ++i )
>>          mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
>>  
>> +    /* Success.  Passed nr_frames back to the caller. */
> Nit: "Pass"?

We have already passed them back to the caller.  "Pass" is the wrong
tense to use.

>
>> --- a/xen/common/memory.c
>> +++ b/xen/common/memory.c
>> @@ -1027,17 +1027,31 @@ static unsigned int resource_max_frames(struct domain *d,
>>      }
>>  }
>>  
>> +/*
>> + * Returns -errno on error, or positive in the range [1, nr_frames] on
>> + * success.  Returning less than nr_frames contitutes a request for a
>> + * continuation.
>> + */
>> +static int _acquire_resource(
>> +    struct domain *d, unsigned int type, unsigned int id, unsigned long frame,
>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
> As per the comment the return type may again want to be "long" here.
> Albeit I realize the restriction to (UINT_MAX >> MEMOP_EXTENT_SHIFT)
> makes this (and the other place above) only a latent issue for now,
> so it may well be fine to be left as is.

Hmm yes - it should be long, because per the ABI we still should be able
to return 0xffffffff to a caller in the success case.

I'll update.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2020-09-24  9:51   ` Paul Durrant
@ 2021-01-11 21:22     ` Andrew Cooper
  2021-01-12  8:23       ` Jan Beulich
  2021-01-12  8:29       ` Paul Durrant
  0 siblings, 2 replies; 42+ messages in thread
From: Andrew Cooper @ 2021-01-11 21:22 UTC (permalink / raw)
  To: paul, 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel',
	Andrew Cooper

On 24/09/2020 10:51, Paul Durrant wrote:
>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>> index a5d3ed8bda..912f07be47 100644
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>>      return 0;
>>  }
>>
>> +int gnttab_acquire_resource(
>> +    struct domain *d, unsigned int id, unsigned long frame,
>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>> +{
>> +    struct grant_table *gt = d->grant_table;
>> +    unsigned int i = nr_frames, tot_frames;
>> +    mfn_t tmp;
>> +    void **vaddrs = NULL;
>> +    int rc;
>> +
>> +    /* Input sanity. */
> Nit: inconsistency with full stops on single line comments.

The whole point of relaxing the style was because feedback over minutia
such as this was deemed detrimental to the community.

If I ever see feedback like this, I will commit commit the patch there
and then.  This is the only way upstream Xen is going to turn into a
less toxic environment for contributors.

>> +            rc = -EINVAL;
>> +            break;
>> +        }
>> +        rc = gnttab_get_status_frame_mfn(d, tot_frames - 1, &tmp);
>> +        break;
>> +    }
>> +
> I think you could drop the write lock here...
>
>> +    /* Any errors from growing the table? */
>> +    if ( rc )
>> +        goto out;
>> +
> ...and acquire it read here, since we know the table cannot shrink. You'd need to re-check the gt_version for safety though.

And you've correctly identified why I didn't.  If we had a
relax-write-to-read lock operation, that would also be fine, but we don't.

Fundamentally, this is an operation made once during VM construction, to
map one single frame.  It is not a hotpath in need of microptimising its
locking pattern, and absolutely not something worth introducing a safety
hazard for.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2020-09-25 13:17   ` Jan Beulich
@ 2021-01-11 21:22     ` Andrew Cooper
  2021-01-12  8:15       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Andrew Cooper @ 2021-01-11 21:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel,
	Andrew Cooper

On 25/09/2020 14:17, Jan Beulich wrote:
> On 22.09.2020 20:24, Andrew Cooper wrote:
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>>      return 0;
>>  }
>>  
>> +int gnttab_acquire_resource(
>> +    struct domain *d, unsigned int id, unsigned long frame,
>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>> +{
>> +    struct grant_table *gt = d->grant_table;
>> +    unsigned int i = nr_frames, tot_frames;
>> +    mfn_t tmp;
>> +    void **vaddrs = NULL;
>> +    int rc;
>> +
>> +    /* Input sanity. */
>> +    if ( !nr_frames )
>> +        return -EINVAL;
> I continue to object to this becoming an error.

It's not a path any legitimate caller will ever exercise.  POSIX defines
any mmap() of zero length to be an error, and I completely agree.

The problem isn't, per say, with accepting bogus arguments.  It is the
quantity of additional complexity in the hypervisor required to support
accepting the bogus input cleanly.

There are exactly 2 cases where 0 might be found here.  Either the
caller passed it in directly (and bypassed the POSIX check which would
reject the attempt), or some part of multi-layer continuation handling
went wrong on the previous iteration.

For this hypercall (by the end of the series), _acquire_resource()
returning 0 is specifically treated as an error so we don't livelock in
32-chunking loop until some other preemption kicks in.

In this case, the check isn't actually necessary because it is (will be)
guarded higher up the call chain in a more general way, but I'm not
interested in adding unnecessary extra complexity (to area I've had to
rewrite from scratch to remove the bugs) simply to support a
non-existent usecase.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
  2021-01-11 20:05     ` Andrew Cooper
@ 2021-01-11 22:36       ` Andrew Cooper
  2021-01-12  8:39       ` Jan Beulich
  1 sibling, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2021-01-11 22:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel,
	Andrew Cooper

On 11/01/2021 20:05, Andrew Cooper wrote:
>>> --- a/xen/common/memory.c
>>> +++ b/xen/common/memory.c
>>> @@ -1027,17 +1027,31 @@ static unsigned int resource_max_frames(struct domain *d,
>>>      }
>>>  }
>>>  
>>> +/*
>>> + * Returns -errno on error, or positive in the range [1, nr_frames] on
>>> + * success.  Returning less than nr_frames contitutes a request for a
>>> + * continuation.
>>> + */
>>> +static int _acquire_resource(
>>> +    struct domain *d, unsigned int type, unsigned int id, unsigned long frame,
>>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>> As per the comment the return type may again want to be "long" here.
>> Albeit I realize the restriction to (UINT_MAX >> MEMOP_EXTENT_SHIFT)
>> makes this (and the other place above) only a latent issue for now,
>> so it may well be fine to be left as is.
> Hmm yes - it should be long, because per the ABI we still should be able
> to return 0xffffffff to a caller in the success case.
>
> I'll update.

Actually, no.  Wrong half the hypercall.

For _acquire_resource(), the return value is bound by nr_frames which is
a maximum of 32, and is unlikely to grow substantially from this.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2021-01-11 21:22     ` Andrew Cooper
@ 2021-01-12  8:15       ` Jan Beulich
  2021-01-12 18:11         ` Andrew Cooper
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2021-01-12  8:15 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 11.01.2021 22:22, Andrew Cooper wrote:
> On 25/09/2020 14:17, Jan Beulich wrote:
>> On 22.09.2020 20:24, Andrew Cooper wrote:
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>>>      return 0;
>>>  }
>>>  
>>> +int gnttab_acquire_resource(
>>> +    struct domain *d, unsigned int id, unsigned long frame,
>>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>>> +{
>>> +    struct grant_table *gt = d->grant_table;
>>> +    unsigned int i = nr_frames, tot_frames;
>>> +    mfn_t tmp;
>>> +    void **vaddrs = NULL;
>>> +    int rc;
>>> +
>>> +    /* Input sanity. */
>>> +    if ( !nr_frames )
>>> +        return -EINVAL;
>> I continue to object to this becoming an error.
> 
> It's not a path any legitimate caller will ever exercise.  POSIX defines
> any mmap() of zero length to be an error, and I completely agree.
> 
> The problem isn't, per say, with accepting bogus arguments.  It is the
> quantity of additional complexity in the hypervisor required to support
> accepting the bogus input cleanly.

Is there any, besides s/-EINVAL/0/ for the line above?

> There are exactly 2 cases where 0 might be found here.  Either the
> caller passed it in directly (and bypassed the POSIX check which would
> reject the attempt), or some part of multi-layer continuation handling
> went wrong on the previous iteration.
> 
> For this hypercall (by the end of the series), _acquire_resource()
> returning 0 is specifically treated as an error so we don't livelock in
> 32-chunking loop until some other preemption kicks in.
> 
> In this case, the check isn't actually necessary because it is (will be)
> guarded higher up the call chain in a more general way, but I'm not
> interested in adding unnecessary extra complexity (to area I've had to
> rewrite from scratch to remove the bugs) simply to support a
> non-existent usecase.

In a couple of cases you've been calling out the principle of least
surprise. This holds for the entirety of batched (in whatever form)
hypercalls then as well. Either zero elements means "no-op"
everywhere, or it gets treated as an error everywhere. Anything
else is inconsistent and hence possibly surprising.

To be clear - if others agree with your view, I'm not meaning to
NAK the change in this form. But I'm also not going to knowingly
ack introduction of inconsistencies.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2021-01-11 21:22     ` Andrew Cooper
@ 2021-01-12  8:23       ` Jan Beulich
  2021-01-12 20:06         ` Andrew Cooper
  2021-01-12  8:29       ` Paul Durrant
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2021-01-12  8:23 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: 'George Dunlap', 'Ian Jackson',
	'Stefano Stabellini', 'Wei Liu',
	'Julien Grall', 'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel',
	'Xen-devel',
	paul

On 11.01.2021 22:22, Andrew Cooper wrote:
> On 24/09/2020 10:51, Paul Durrant wrote:
>>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>>> index a5d3ed8bda..912f07be47 100644
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>>>      return 0;
>>>  }
>>>
>>> +int gnttab_acquire_resource(
>>> +    struct domain *d, unsigned int id, unsigned long frame,
>>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>>> +{
>>> +    struct grant_table *gt = d->grant_table;
>>> +    unsigned int i = nr_frames, tot_frames;
>>> +    mfn_t tmp;
>>> +    void **vaddrs = NULL;
>>> +    int rc;
>>> +
>>> +    /* Input sanity. */
>> Nit: inconsistency with full stops on single line comments.
> 
> The whole point of relaxing the style was because feedback over minutia
> such as this was deemed detrimental to the community.
> 
> If I ever see feedback like this, I will commit commit the patch there
> and then.  This is the only way upstream Xen is going to turn into a
> less toxic environment for contributors.

Paul had clearly marked this as "nit". ./CODING_STYLE explicitly
allows omission as well as presence of a full stop here. So while
I agree with you that you'd be okay ignoring such a remark, I
also agree with Paul giving the remark if he's found a nearby
comment using the opposite variant. Besides relaxing requirements
where sensible, at least local consistency also helps avoid such
comments (now or for future contributors).

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2021-01-11 21:22     ` Andrew Cooper
  2021-01-12  8:23       ` Jan Beulich
@ 2021-01-12  8:29       ` Paul Durrant
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2021-01-12  8:29 UTC (permalink / raw)
  To: 'Andrew Cooper', 'Xen-devel'
  Cc: 'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Stefano Stabellini',
	'Wei Liu', 'Julien Grall',
	'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel'

> -----Original Message-----
> From: Andrew Cooper <amc96@cam.ac.uk>
> Sent: 11 January 2021 21:23
> To: paul@xen.org; 'Xen-devel' <xen-devel@lists.xenproject.org>
> Cc: 'George Dunlap' <George.Dunlap@eu.citrix.com>; 'Ian Jackson' <iwj@xenproject.org>; 'Jan Beulich'
> <JBeulich@suse.com>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>; 'Julien
> Grall' <julien@xen.org>; 'Michał Leszczyński' <michal.leszczynski@cert.pl>; 'Hubert Jasudowicz'
> <hubert.jasudowicz@cert.pl>; 'Tamas K Lengyel' <tamas@tklengyel.com>; Andrew Cooper <amc96@cam.ac.uk>
> Subject: Re: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
> 
> On 24/09/2020 10:51, Paul Durrant wrote:
> >> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> >> index a5d3ed8bda..912f07be47 100644
> >> --- a/xen/common/grant_table.c
> >> +++ b/xen/common/grant_table.c
> >> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
> >>      return 0;
> >>  }
> >>
> >> +int gnttab_acquire_resource(
> >> +    struct domain *d, unsigned int id, unsigned long frame,
> >> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
> >> +{
> >> +    struct grant_table *gt = d->grant_table;
> >> +    unsigned int i = nr_frames, tot_frames;
> >> +    mfn_t tmp;
> >> +    void **vaddrs = NULL;
> >> +    int rc;
> >> +
> >> +    /* Input sanity. */
> > Nit: inconsistency with full stops on single line comments.
> 
> The whole point of relaxing the style was because feedback over minutia
> such as this was deemed detrimental to the community.
> 
> If I ever see feedback like this, I will commit commit the patch there
> and then.  This is the only way upstream Xen is going to turn into a
> less toxic environment for contributors.

The tone of your response rather proves that Xen is already a toxic environment, I'm afraid.

  Paul



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
  2021-01-11 20:05     ` Andrew Cooper
  2021-01-11 22:36       ` Andrew Cooper
@ 2021-01-12  8:39       ` Jan Beulich
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2021-01-12  8:39 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 11.01.2021 21:05, Andrew Cooper wrote:
> On 28/09/2020 10:37, Jan Beulich wrote:
>> On 22.09.2020 20:24, Andrew Cooper wrote:
>>> --- a/xen/common/compat/memory.c
>>> +++ b/xen/common/compat/memory.c
>>> @@ -636,15 +662,45 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>>>                      compat_frame_list[i] = frame;
>>>                  }
>>>  
>>> -                if ( __copy_to_compat_offset(cmp.mar.frame_list, 0,
>>> -                                             compat_frame_list,
>>> -                                             cmp.mar.nr_frames) )
>>> +                if ( __copy_to_compat_offset(
>>> +                         cmp.mar.frame_list, start_extent,
>>> +                         compat_frame_list, done) )
>>>                      return -EFAULT;
>>>              }
>>> -            break;
>>> +
>>> +            start_extent += done;
>>> +
>>> +            /* Completely done. */
>>> +            if ( start_extent == cmp.mar.nr_frames )
>>> +                break;
>>> +
>>> +            /*
>>> +             * Done a "full" batch, but we were limited by space in the xlat
>>> +             * area.  Go around the loop again without necesserily returning
>>> +             * to guest context.
>>> +             */
>>> +            if ( done == nat.mar->nr_frames )
>>> +            {
>>> +                split = 1;
>>> +                break;
>>> +            }
>>> +
>>> +            /* Explicit continuation request from a higher level. */
>>> +            if ( done < nat.mar->nr_frames )
>>> +                return hypercall_create_continuation(
>>> +                    __HYPERVISOR_memory_op, "ih",
>>> +                    op | (start_extent << MEMOP_EXTENT_SHIFT), compat);
>>> +
>>> +            /*
>>> +             * Well... Somethings gone wrong with the two levels of chunking.
>>> +             * My condolences to whomever next has to debug this mess.
>>> +             */
>> Any suggestion how to overcome this "mess"?
> 
> The double level of array handling is what makes it so complicated. 
> There are enough cases in compat_memory_op() alone which can't
> 
> We've got two cases in practice.  A singleton object needing conversion,
> or a large array of them.  I'm quite certain we'd have less code and
> less complexity by having copy_$OJBECT_{to,from}_guest() helpers which
> dealt with compat internally as appropriate.
> 
> We don't care about the performance of 32bit hypercalls, but not doing
> batch conversions of 1020/etc objects in the compat layer will probably
> result in better performance overall, as we don't throw away the work as
> we batch things at smaller increments higher up the stack.

I've put this on my todo list to investigate. Maybe nowadays we
really don't care all this much about 32-bit hypercall performance.
The picture was surely different when the compat layer was
introduced.

>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -4105,6 +4105,9 @@ int gnttab_acquire_resource(
>>>      for ( i = 0; i < nr_frames; ++i )
>>>          mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
>>>  
>>> +    /* Success.  Passed nr_frames back to the caller. */
>> Nit: "Pass"?
> 
> We have already passed them back to the caller.  "Pass" is the wrong
> tense to use.

Hmm, interesting view. I personally wouldn't consider the
passing back to have completed before we've returned.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2021-01-12  8:15       ` Jan Beulich
@ 2021-01-12 18:11         ` Andrew Cooper
  0 siblings, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2021-01-12 18:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xen-devel, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, Julien Grall, Paul Durrant,
	Michał Leszczyński, Hubert Jasudowicz, Tamas K Lengyel

On 12/01/2021 08:15, Jan Beulich wrote:
> On 11.01.2021 22:22, Andrew Cooper wrote:
>> On 25/09/2020 14:17, Jan Beulich wrote:
>>> On 22.09.2020 20:24, Andrew Cooper wrote:
>>>> --- a/xen/common/grant_table.c
>>>> +++ b/xen/common/grant_table.c
>>>> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>>>>      return 0;
>>>>  }
>>>>  
>>>> +int gnttab_acquire_resource(
>>>> +    struct domain *d, unsigned int id, unsigned long frame,
>>>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>>>> +{
>>>> +    struct grant_table *gt = d->grant_table;
>>>> +    unsigned int i = nr_frames, tot_frames;
>>>> +    mfn_t tmp;
>>>> +    void **vaddrs = NULL;
>>>> +    int rc;
>>>> +
>>>> +    /* Input sanity. */
>>>> +    if ( !nr_frames )
>>>> +        return -EINVAL;
>>> I continue to object to this becoming an error.
>> It's not a path any legitimate caller will ever exercise.  POSIX defines
>> any mmap() of zero length to be an error, and I completely agree.
>>
>> The problem isn't, per say, with accepting bogus arguments.  It is the
>> quantity of additional complexity in the hypervisor required to support
>> accepting the bogus input cleanly.
> Is there any, besides s/-EINVAL/0/ for the line above?

It's really not that simple.

I've dropped this particular hunk from v3 (as it is not necessary for
safety at this point in the series), but retained the equivalent logical
change in later patches where it does become necessary.

>
>> There are exactly 2 cases where 0 might be found here.  Either the
>> caller passed it in directly (and bypassed the POSIX check which would
>> reject the attempt), or some part of multi-layer continuation handling
>> went wrong on the previous iteration.
>>
>> For this hypercall (by the end of the series), _acquire_resource()
>> returning 0 is specifically treated as an error so we don't livelock in
>> 32-chunking loop until some other preemption kicks in.
>>
>> In this case, the check isn't actually necessary because it is (will be)
>> guarded higher up the call chain in a more general way, but I'm not
>> interested in adding unnecessary extra complexity (to area I've had to
>> rewrite from scratch to remove the bugs) simply to support a
>> non-existent usecase.
> In a couple of cases you've been calling out the principle of least
> surprise. This holds for the entirety of batched (in whatever form)
> hypercalls then as well. Either zero elements means "no-op"
> everywhere, or it gets treated as an error everywhere. Anything
> else is inconsistent and hence possibly surprising.

If you want to talk consistency, I'm fixing an inconsistency with this
change, not introducing one.

The existing logic will already yield EINVAL for addr==0, num !=0,
rather than writing the size back.

addr!=0, num==0 is the odd-combination-out.  Its not a useful input, and
won't be seen from legitimate callers, but can crop up from bugs in the
continuation logic.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/11] xen/gnttab: Rework resource acquisition
  2021-01-12  8:23       ` Jan Beulich
@ 2021-01-12 20:06         ` Andrew Cooper
  0 siblings, 0 replies; 42+ messages in thread
From: Andrew Cooper @ 2021-01-12 20:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: 'George Dunlap', 'Ian Jackson',
	'Stefano Stabellini', 'Wei Liu',
	'Julien Grall', 'Michał Leszczyński',
	'Hubert Jasudowicz', 'Tamas K Lengyel',
	'Xen-devel',
	paul

On 12/01/2021 08:23, Jan Beulich wrote:
> On 11.01.2021 22:22, Andrew Cooper wrote:
>> On 24/09/2020 10:51, Paul Durrant wrote:
>>>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>>>> index a5d3ed8bda..912f07be47 100644
>>>> --- a/xen/common/grant_table.c
>>>> +++ b/xen/common/grant_table.c
>>>> @@ -4013,6 +4013,81 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>>>>      return 0;
>>>>  }
>>>>
>>>> +int gnttab_acquire_resource(
>>>> +    struct domain *d, unsigned int id, unsigned long frame,
>>>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>>>> +{
>>>> +    struct grant_table *gt = d->grant_table;
>>>> +    unsigned int i = nr_frames, tot_frames;
>>>> +    mfn_t tmp;
>>>> +    void **vaddrs = NULL;
>>>> +    int rc;
>>>> +
>>>> +    /* Input sanity. */
>>> Nit: inconsistency with full stops on single line comments.
>> The whole point of relaxing the style was because feedback over minutia
>> such as this was deemed detrimental to the community.
>>
>> If I ever see feedback like this, I will commit commit the patch there
>> and then.  This is the only way upstream Xen is going to turn into a
>> less toxic environment for contributors.
> Paul had clearly marked this as "nit". ./CODING_STYLE explicitly
> allows omission as well as presence of a full stop here. So while
> I agree with you that you'd be okay ignoring such a remark, I
> also agree with Paul giving the remark if he's found a nearby
> comment using the opposite variant.

A reply like that means "this needs changing".  This is how it is read
by people, even if it is not how it was intended.

Contributors do not know CODING_STYLE off by heart, and even if they
did, it is the responsibility of a reviewer not to mislead the
contributor, not the contributors responsibility to challenge incorrect
feedback.

As it stands, the reply requires change of something which is explicitly
permitted under CODING_STYLE.  The change to CODING_STYLE was made
specially to prevent feedback of this form to begin with.

This is not an ok use of anyone's time, nor is repeating the argument
which lead to CODING_STYLE being changed to begin with.

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2021-01-12 20:07 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-22 18:24 [PATCH v2 00/11] Multiple fixes to XENMEM_acquire_resource Andrew Cooper
2020-09-22 18:24 ` [PATCH v2 01/11] xen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE Andrew Cooper
2020-09-22 18:24 ` [PATCH v2 02/11] xen/gnttab: Rework resource acquisition Andrew Cooper
2020-09-24  9:51   ` Paul Durrant
2021-01-11 21:22     ` Andrew Cooper
2021-01-12  8:23       ` Jan Beulich
2021-01-12 20:06         ` Andrew Cooper
2021-01-12  8:29       ` Paul Durrant
2020-09-25 13:17   ` Jan Beulich
2021-01-11 21:22     ` Andrew Cooper
2021-01-12  8:15       ` Jan Beulich
2021-01-12 18:11         ` Andrew Cooper
2020-09-22 18:24 ` [PATCH v2 03/11] xen/memory: Fix compat XENMEM_acquire_resource for size requests Andrew Cooper
2020-09-22 18:24 ` [PATCH v2 04/11] xen/memory: Fix acquire_resource size semantics Andrew Cooper
2020-09-24 10:06   ` Paul Durrant
2020-09-24 10:57     ` Andrew Cooper
2020-09-24 11:04       ` Paul Durrant
2020-09-25 15:56   ` Jan Beulich
2020-09-22 18:24 ` [PATCH v2 05/11] tools/foreignmem: Support querying the size of a resource Andrew Cooper
2021-01-08 17:52   ` Andrew Cooper
2021-01-11 10:50     ` Roger Pau Monné
2021-01-11 15:00       ` Andrew Cooper
2021-01-11 15:26   ` [PATCH v3 " Andrew Cooper
2021-01-11 15:54     ` Roger Pau Monné
2020-09-22 18:24 ` [PATCH v2 06/11] xen/memory: Clarify the XENMEM_acquire_resource ABI description Andrew Cooper
2020-09-24 10:08   ` Paul Durrant
2020-09-22 18:24 ` [PATCH v2 07/11] xen/memory: Improve compat XENMEM_acquire_resource handling Andrew Cooper
2020-09-24 10:16   ` Paul Durrant
2020-09-28  9:09   ` Jan Beulich
2021-01-08 18:57     ` Andrew Cooper
2021-01-11 14:25       ` Jan Beulich
2020-09-22 18:24 ` [PATCH v2 08/11] xen/memory: Indent part of acquire_resource() Andrew Cooper
2020-09-24 10:36   ` Paul Durrant
2020-09-22 18:24 ` [PATCH v2 09/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource Andrew Cooper
2020-09-24 10:47   ` Paul Durrant
2021-01-08 19:36     ` Andrew Cooper
2020-09-28  9:37   ` Jan Beulich
2021-01-11 20:05     ` Andrew Cooper
2021-01-11 22:36       ` Andrew Cooper
2021-01-12  8:39       ` Jan Beulich
2020-09-22 18:24 ` [PATCH v2 10/11] TESTING dom0 Andrew Cooper
2020-09-22 18:24 ` [PATCH v2 11/11] TESTING XTF Andrew Cooper

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).