All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] gnttab: further work from XSA-380 / -382 context
@ 2021-08-26 10:06 Jan Beulich
  2021-08-26 10:09 ` [PATCH 1/9] gnttab: defer allocation of maptrack frames table Jan Beulich
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

The first four patches can be attributed to the former, the last four
patches to the latter. The middle patch had been submitted standalone
before, has a suitable Reviewed-by tag, but also has an objection by
Andrew pending, which unfortunately has lead to this patch now being
stuck. Short of Andrew being willing to settle the disagreement more
with Julien than with me (although I'm on Julien's side), I have no
idea what to do here.

There's probably not much interrelation between the patches, so they
can perhaps go in about any order.

1: defer allocation of maptrack frames table
2: drop a redundant expression from gnttab_release_mappings()
3: fold recurring is_iomem_page()
4: drop GNTMAP_can_fail
5: defer allocation of status frame tracking array
6: check handle early in gnttab_get_status_frames()
7: no need to translate handle for gnttab_get_status_frames()
8: bail from GFN-storing loops early in case of error
9: don't silently truncate GFNs in compat setup-table handling

Jan



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/9] gnttab: defer allocation of maptrack frames table
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
@ 2021-08-26 10:09 ` Jan Beulich
  2021-09-06 13:13   ` Julien Grall
  2021-09-06 14:05   ` Andrew Cooper
  2021-08-26 10:11 ` [PATCH 2/9] gnttab: drop a redundant expression from gnttab_release_mappings() Jan Beulich
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

By default all guests are permitted to have up to 1024 maptrack frames,
which on 64-bit means an 8k frame table. Yet except for driver domains
guests normally don't make use of grant mappings. Defer allocating the
table until a map track handle is first requested.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I continue to be unconvinced that it is a good idea to allow all DomU-s
1024 maptrack frames by default. While I'm still of the opinion that a
hypervisor enforced upper bound is okay, I question this upper bound
also getting used as the default value - this is perhaps okay for Dom0,
but not elsewhere.

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -633,6 +633,34 @@ get_maptrack_handle(
     if ( likely(handle != INVALID_MAPTRACK_HANDLE) )
         return handle;
 
+    if ( unlikely(!read_atomic(&lgt->maptrack)) )
+    {
+        struct grant_mapping **maptrack = NULL;
+
+        if ( lgt->max_maptrack_frames )
+            maptrack = vzalloc(lgt->max_maptrack_frames * sizeof(*maptrack));
+
+        spin_lock(&lgt->maptrack_lock);
+
+        if ( !lgt->maptrack )
+        {
+            if ( !maptrack )
+            {
+                spin_unlock(&lgt->maptrack_lock);
+                return INVALID_MAPTRACK_HANDLE;
+            }
+
+            write_atomic(&lgt->maptrack, maptrack);
+            maptrack = NULL;
+
+            radix_tree_init(&lgt->maptrack_tree);
+        }
+
+        spin_unlock(&lgt->maptrack_lock);
+
+        vfree(maptrack);
+    }
+
     spin_lock(&lgt->maptrack_lock);
 
     /*
@@ -1955,16 +1983,6 @@ int grant_table_init(struct domain *d, i
     if ( gt->active == NULL )
         goto out;
 
-    /* Tracking of mapped foreign frames table */
-    if ( gt->max_maptrack_frames )
-    {
-        gt->maptrack = vzalloc(gt->max_maptrack_frames * sizeof(*gt->maptrack));
-        if ( gt->maptrack == NULL )
-            goto out;
-
-        radix_tree_init(&gt->maptrack_tree);
-    }
-
     /* Shared grant table. */
     gt->shared_raw = xzalloc_array(void *, gt->max_grant_frames);
     if ( gt->shared_raw == NULL )



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 2/9] gnttab: drop a redundant expression from gnttab_release_mappings()
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
  2021-08-26 10:09 ` [PATCH 1/9] gnttab: defer allocation of maptrack frames table Jan Beulich
@ 2021-08-26 10:11 ` Jan Beulich
  2021-09-06 13:15   ` Julien Grall
  2021-08-26 10:12 ` [PATCH 3/9] gnttab: fold recurring is_iomem_page() Jan Beulich
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

This gnttab_host_mapping_get_page_type() invocation sits in the "else"
path of a conditional controlled by "map->flags & GNTMAP_readonly".

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3816,9 +3816,7 @@ int gnttab_release_mappings(struct domai
                 if ( gnttab_release_host_mappings(d) &&
                      !is_iomem_page(act->mfn) )
                 {
-                    if ( gnttab_host_mapping_get_page_type((map->flags &
-                                                            GNTMAP_readonly),
-                                                           d, rd) )
+                    if ( gnttab_host_mapping_get_page_type(false, d, rd) )
                         put_page_type(pg);
                     put_page(pg);
                 }



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 3/9] gnttab: fold recurring is_iomem_page()
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
  2021-08-26 10:09 ` [PATCH 1/9] gnttab: defer allocation of maptrack frames table Jan Beulich
  2021-08-26 10:11 ` [PATCH 2/9] gnttab: drop a redundant expression from gnttab_release_mappings() Jan Beulich
@ 2021-08-26 10:12 ` Jan Beulich
  2021-09-06 13:35   ` Julien Grall
  2021-08-26 10:13 ` [PATCH 4/9] gnttab: drop GNTMAP_can_fail Jan Beulich
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:12 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

In all cases call the function just once instead of up to four times, at
the same time avoiding to store a dangling pointer in a local variable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1587,11 +1587,11 @@ unmap_common_complete(struct gnttab_unma
     else
         status = &status_entry(rgt, op->ref);
 
-    pg = mfn_to_page(op->mfn);
+    pg = !is_iomem_page(act->mfn) ? mfn_to_page(op->mfn) : NULL;
 
     if ( op->done & GNTMAP_device_map )
     {
-        if ( !is_iomem_page(act->mfn) )
+        if ( pg )
         {
             if ( op->done & GNTMAP_readonly )
                 put_page(pg);
@@ -1608,7 +1608,7 @@ unmap_common_complete(struct gnttab_unma
 
     if ( op->done & GNTMAP_host_map )
     {
-        if ( !is_iomem_page(op->mfn) )
+        if ( pg )
         {
             if ( gnttab_host_mapping_get_page_type(op->done & GNTMAP_readonly,
                                                    ld, rd) )
@@ -3778,7 +3778,7 @@ int gnttab_release_mappings(struct domai
         else
             status = &status_entry(rgt, ref);
 
-        pg = mfn_to_page(act->mfn);
+        pg = !is_iomem_page(act->mfn) ? mfn_to_page(act->mfn) : NULL;
 
         if ( map->flags & GNTMAP_readonly )
         {
@@ -3786,7 +3786,7 @@ int gnttab_release_mappings(struct domai
             {
                 BUG_ON(!(act->pin & GNTPIN_devr_mask));
                 act->pin -= GNTPIN_devr_inc;
-                if ( !is_iomem_page(act->mfn) )
+                if ( pg )
                     put_page(pg);
             }
 
@@ -3794,8 +3794,7 @@ int gnttab_release_mappings(struct domai
             {
                 BUG_ON(!(act->pin & GNTPIN_hstr_mask));
                 act->pin -= GNTPIN_hstr_inc;
-                if ( gnttab_release_host_mappings(d) &&
-                     !is_iomem_page(act->mfn) )
+                if ( pg && gnttab_release_host_mappings(d) )
                     put_page(pg);
             }
         }
@@ -3805,7 +3804,7 @@ int gnttab_release_mappings(struct domai
             {
                 BUG_ON(!(act->pin & GNTPIN_devw_mask));
                 act->pin -= GNTPIN_devw_inc;
-                if ( !is_iomem_page(act->mfn) )
+                if ( pg )
                     put_page_and_type(pg);
             }
 
@@ -3813,8 +3812,7 @@ int gnttab_release_mappings(struct domai
             {
                 BUG_ON(!(act->pin & GNTPIN_hstw_mask));
                 act->pin -= GNTPIN_hstw_inc;
-                if ( gnttab_release_host_mappings(d) &&
-                     !is_iomem_page(act->mfn) )
+                if ( pg && gnttab_release_host_mappings(d) )
                 {
                     if ( gnttab_host_mapping_get_page_type(false, d, rd) )
                         put_page_type(pg);
In all cases call the function just once instead of up to four times, at
the same time avoiding to store a dangling pointer in a local variable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1587,11 +1587,11 @@ unmap_common_complete(struct gnttab_unma
     else
         status = &status_entry(rgt, op->ref);
 
-    pg = mfn_to_page(op->mfn);
+    pg = !is_iomem_page(act->mfn) ? mfn_to_page(op->mfn) : NULL;
 
     if ( op->done & GNTMAP_device_map )
     {
-        if ( !is_iomem_page(act->mfn) )
+        if ( pg )
         {
             if ( op->done & GNTMAP_readonly )
                 put_page(pg);
@@ -1608,7 +1608,7 @@ unmap_common_complete(struct gnttab_unma
 
     if ( op->done & GNTMAP_host_map )
     {
-        if ( !is_iomem_page(op->mfn) )
+        if ( pg )
         {
             if ( gnttab_host_mapping_get_page_type(op->done & GNTMAP_readonly,
                                                    ld, rd) )
@@ -3778,7 +3778,7 @@ int gnttab_release_mappings(struct domai
         else
             status = &status_entry(rgt, ref);
 
-        pg = mfn_to_page(act->mfn);
+        pg = !is_iomem_page(act->mfn) ? mfn_to_page(act->mfn) : NULL;
 
         if ( map->flags & GNTMAP_readonly )
         {
@@ -3786,7 +3786,7 @@ int gnttab_release_mappings(struct domai
             {
                 BUG_ON(!(act->pin & GNTPIN_devr_mask));
                 act->pin -= GNTPIN_devr_inc;
-                if ( !is_iomem_page(act->mfn) )
+                if ( pg )
                     put_page(pg);
             }
 
@@ -3794,8 +3794,7 @@ int gnttab_release_mappings(struct domai
             {
                 BUG_ON(!(act->pin & GNTPIN_hstr_mask));
                 act->pin -= GNTPIN_hstr_inc;
-                if ( gnttab_release_host_mappings(d) &&
-                     !is_iomem_page(act->mfn) )
+                if ( pg && gnttab_release_host_mappings(d) )
                     put_page(pg);
             }
         }
@@ -3805,7 +3804,7 @@ int gnttab_release_mappings(struct domai
             {
                 BUG_ON(!(act->pin & GNTPIN_devw_mask));
                 act->pin -= GNTPIN_devw_inc;
-                if ( !is_iomem_page(act->mfn) )
+                if ( pg )
                     put_page_and_type(pg);
             }
 
@@ -3813,8 +3812,7 @@ int gnttab_release_mappings(struct domai
             {
                 BUG_ON(!(act->pin & GNTPIN_hstw_mask));
                 act->pin -= GNTPIN_hstw_inc;
-                if ( gnttab_release_host_mappings(d) &&
-                     !is_iomem_page(act->mfn) )
+                if ( pg && gnttab_release_host_mappings(d) )
                 {
                     if ( gnttab_host_mapping_get_page_type(false, d, rd) )
                         put_page_type(pg);



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 4/9] gnttab: drop GNTMAP_can_fail
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
                   ` (2 preceding siblings ...)
  2021-08-26 10:12 ` [PATCH 3/9] gnttab: fold recurring is_iomem_page() Jan Beulich
@ 2021-08-26 10:13 ` Jan Beulich
  2021-08-26 11:45   ` Andrew Cooper
  2021-08-26 10:13 ` [PATCH 5/9] gnttab: defer allocation of status frame tracking array Jan Beulich
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

There's neither documentation of what this flag is supposed to mean, nor
any implementation. With this, don't even bother enclosing the #define-s
in a __XEN_INTERFACE_VERSION__ conditional, but drop them altogether.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/include/public/grant_table.h
+++ b/xen/include/public/grant_table.h
@@ -628,9 +628,6 @@ DEFINE_XEN_GUEST_HANDLE(gnttab_cache_flu
 #define _GNTMAP_contains_pte    (4)
 #define GNTMAP_contains_pte     (1<<_GNTMAP_contains_pte)
 
-#define _GNTMAP_can_fail        (5)
-#define GNTMAP_can_fail         (1<<_GNTMAP_can_fail)
-
 /*
  * Bits to be placed in guest kernel available PTE bits (architecture
  * dependent; only supported when XENFEAT_gnttab_map_avail_bits is set).



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 5/9] gnttab: defer allocation of status frame tracking array
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
                   ` (3 preceding siblings ...)
  2021-08-26 10:13 ` [PATCH 4/9] gnttab: drop GNTMAP_can_fail Jan Beulich
@ 2021-08-26 10:13 ` Jan Beulich
  2021-08-26 10:13 ` [PATCH 6/9] gnttab: check handle early in gnttab_get_status_frames() Jan Beulich
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

This array can be large when many grant frames are permitted; avoid
allocating it when it's not going to be used anyway, by doing this only
in gnttab_populate_status_frames(). While the delaying of the respective
memory allocation adds possible reasons for failure of the respective
enclosing operations, there are other memory allocations there already,
so callers can't expect these operations to always succeed anyway.

As to the re-ordering at the end of gnttab_unpopulate_status_frames(),
this is merely to represent intended order of actions (shrink array
bound, then free higher array entries).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
v1: Fold into series.
[standalone history]
v4: Add a comment. Add a few blank lines. Extend description.
v3: Drop smp_wmb(). Re-base.
v2: Defer allocation to when a domain actually switches to the v2 grant
    API.

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1774,6 +1774,17 @@ gnttab_populate_status_frames(struct dom
     /* Make sure, prior version checks are architectural visible */
     block_speculation();
 
+    if ( gt->status == ZERO_BLOCK_PTR )
+    {
+        gt->status = xzalloc_array(grant_status_t *,
+                                   grant_to_status_frames(gt->max_grant_frames));
+        if ( !gt->status )
+        {
+            gt->status = ZERO_BLOCK_PTR;
+            return -ENOMEM;
+        }
+    }
+
     for ( i = nr_status_frames(gt); i < req_status_frames; i++ )
     {
         if ( (gt->status[i] = alloc_xenheap_page()) == NULL )
@@ -1794,18 +1805,25 @@ status_alloc_failed:
         free_xenheap_page(gt->status[i]);
         gt->status[i] = NULL;
     }
+
+    if ( !nr_status_frames(gt) )
+    {
+        xfree(gt->status);
+        gt->status = ZERO_BLOCK_PTR;
+    }
+
     return -ENOMEM;
 }
 
 static int
 gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
 {
-    unsigned int i;
+    unsigned int i, n = nr_status_frames(gt);
 
     /* Make sure, prior version checks are architectural visible */
     block_speculation();
 
-    for ( i = 0; i < nr_status_frames(gt); i++ )
+    for ( i = 0; i < n; i++ )
     {
         struct page_info *pg = virt_to_page(gt->status[i]);
         gfn_t gfn = gnttab_get_frame_gfn(gt, true, i);
@@ -1860,12 +1878,11 @@ gnttab_unpopulate_status_frames(struct d
         page_set_owner(pg, NULL);
     }
 
-    for ( i = 0; i < nr_status_frames(gt); i++ )
-    {
-        free_xenheap_page(gt->status[i]);
-        gt->status[i] = NULL;
-    }
     gt->nr_status_frames = 0;
+    for ( i = 0; i < n; i++ )
+        free_xenheap_page(gt->status[i]);
+    xfree(gt->status);
+    gt->status = ZERO_BLOCK_PTR;
 
     return 0;
 }
@@ -1988,11 +2005,11 @@ int grant_table_init(struct domain *d, i
     if ( gt->shared_raw == NULL )
         goto out;
 
-    /* Status pages for grant table - for version 2 */
-    gt->status = xzalloc_array(grant_status_t *,
-                               grant_to_status_frames(gt->max_grant_frames));
-    if ( gt->status == NULL )
-        goto out;
+    /*
+     * Status page tracking array for v2 gets allocated on demand. But don't
+     * leave a NULL pointer there.
+     */
+    gt->status = ZERO_BLOCK_PTR;
 
     grant_write_lock(gt);
 
@@ -4103,11 +4120,13 @@ int gnttab_acquire_resource(
         if ( gt->gt_version != 2 )
             break;
 
+        /* This may change gt->status, so has to happen before setting vaddrs. */ 
+        rc = gnttab_get_status_frame_mfn(d, final_frame, &tmp);
+
         /* Check that void ** is a suitable representation for gt->status. */
         BUILD_BUG_ON(!__builtin_types_compatible_p(
                          typeof(gt->status), grant_status_t **));
         vaddrs = (void **)gt->status;
-        rc = gnttab_get_status_frame_mfn(d, final_frame, &tmp);
         break;
     }
 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 6/9] gnttab: check handle early in gnttab_get_status_frames()
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
                   ` (4 preceding siblings ...)
  2021-08-26 10:13 ` [PATCH 5/9] gnttab: defer allocation of status frame tracking array Jan Beulich
@ 2021-08-26 10:13 ` Jan Beulich
  2021-09-06 13:41   ` Julien Grall
  2021-08-26 10:14 ` [PATCH 7/9] gnttab: no need to translate handle for gnttab_get_status_frames() Jan Beulich
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

Like done in gnttab_setup_table(), check the handle once early in the
function and use the lighter-weight (for PV) copying function in the
loop.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3261,6 +3261,9 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
         return -EFAULT;
     }
 
+    if ( !guest_handle_okay(op.frame_list, op.nr_frames) )
+        return -EFAULT;
+
     d = rcu_lock_domain_by_any_id(op.dom);
     if ( d == NULL )
     {
@@ -3301,7 +3304,7 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
     for ( i = 0; i < op.nr_frames; i++ )
     {
         gmfn = gfn_x(gnttab_status_gfn(d, gt, i));
-        if ( copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
+        if ( __copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
             op.status = GNTST_bad_virt_addr;
     }
 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 7/9] gnttab: no need to translate handle for gnttab_get_status_frames()
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
                   ` (5 preceding siblings ...)
  2021-08-26 10:13 ` [PATCH 6/9] gnttab: check handle early in gnttab_get_status_frames() Jan Beulich
@ 2021-08-26 10:14 ` Jan Beulich
  2022-10-07 18:24   ` Andrew Cooper
  2021-08-26 10:14 ` [PATCH 8/9] gnttab: bail from GFN-storing loops early in case of error Jan Beulich
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

Unlike for GNTTABOP_setup_table native and compat frame lists are arrays
of the same type (uint64_t). Hence there's no need to translate the frame
values. This then also renders unnecessary the limit_max parameter of
gnttab_get_status_frames().

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/compat/grant_table.c
+++ b/xen/common/compat/grant_table.c
@@ -271,10 +271,7 @@ int compat_grant_table_op(unsigned int c
             }
             break;
 
-        case GNTTABOP_get_status_frames: {
-            unsigned int max_frame_list_size_in_pages =
-                (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.get_status)) /
-                sizeof(*nat.get_status->frame_list.p);
+        case GNTTABOP_get_status_frames:
             if ( count != 1)
             {
                 rc = -EINVAL;
@@ -289,38 +286,25 @@ int compat_grant_table_op(unsigned int c
             }
 
 #define XLAT_gnttab_get_status_frames_HNDL_frame_list(_d_, _s_) \
-            set_xen_guest_handle((_d_)->frame_list, (uint64_t *)(nat.get_status + 1))
+            guest_from_compat_handle((_d_)->frame_list, (_s_)->frame_list)
             XLAT_gnttab_get_status_frames(nat.get_status, &cmp.get_status);
 #undef XLAT_gnttab_get_status_frames_HNDL_frame_list
 
             rc = gnttab_get_status_frames(
-                guest_handle_cast(nat.uop, gnttab_get_status_frames_t),
-                count, max_frame_list_size_in_pages);
+                guest_handle_cast(nat.uop, gnttab_get_status_frames_t), count);
             if ( rc >= 0 )
             {
-#define XLAT_gnttab_get_status_frames_HNDL_frame_list(_d_, _s_) \
-                do \
-                { \
-                    if ( (_s_)->status == GNTST_okay ) \
-                    { \
-                        for ( i = 0; i < (_s_)->nr_frames; ++i ) \
-                        { \
-                            uint64_t frame = (_s_)->frame_list.p[i]; \
-                            if ( __copy_to_compat_offset((_d_)->frame_list, \
-                                                         i, &frame, 1) ) \
-                                (_s_)->status = GNTST_bad_virt_addr; \
-                        } \
-                    } \
-                } while (0)
-                XLAT_gnttab_get_status_frames(&cmp.get_status, nat.get_status);
-#undef XLAT_gnttab_get_status_frames_HNDL_frame_list
-                if ( unlikely(__copy_to_guest(cmp_uop, &cmp.get_status, 1)) )
+                XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_compat_t) get =
+                    guest_handle_cast(cmp_uop,
+                                      gnttab_get_status_frames_compat_t);
+
+                if ( unlikely(__copy_field_to_guest(get, nat.get_status,
+                                                    status)) )
                     rc = -EFAULT;
                 else
                     i = 1;
             }
             break;
-        }
 
         default:
             domain_crash(current->domain);
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3242,7 +3242,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARA
 
 static long
 gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) uop,
-                         unsigned int count, unsigned int limit_max)
+                         unsigned int count)
 {
     gnttab_get_status_frames_t op;
     struct domain *d;
@@ -3292,15 +3292,6 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
         goto unlock;
     }
 
-    if ( unlikely(limit_max < op.nr_frames) )
-    {
-        gdprintk(XENLOG_WARNING,
-                 "nr_status_frames for %pd is too large (%u,%u)\n",
-                 d, op.nr_frames, limit_max);
-        op.status = GNTST_general_error;
-        goto unlock;
-    }
-
     for ( i = 0; i < op.nr_frames; i++ )
     {
         gmfn = gfn_x(gnttab_status_gfn(d, gt, i));
@@ -3664,8 +3655,7 @@ do_grant_table_op(
 
     case GNTTABOP_get_status_frames:
         rc = gnttab_get_status_frames(
-            guest_handle_cast(uop, gnttab_get_status_frames_t), count,
-                              UINT_MAX);
+            guest_handle_cast(uop, gnttab_get_status_frames_t), count);
         break;
 
     case GNTTABOP_get_version:



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 8/9] gnttab: bail from GFN-storing loops early in case of error
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
                   ` (6 preceding siblings ...)
  2021-08-26 10:14 ` [PATCH 7/9] gnttab: no need to translate handle for gnttab_get_status_frames() Jan Beulich
@ 2021-08-26 10:14 ` Jan Beulich
  2022-10-07 19:36   ` Andrew Cooper
  2021-08-26 10:15 ` [PATCH 9/9] gnttab: don't silently truncate GFNs in compat setup-table handling Jan Beulich
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

The contents of the output arrays are undefined in both cases anyway
when the operation itself gets marked as failed. There's no value in
trying to continue after a guest memory access failure.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
There's also a curious difference between the two sub-ops wrt the use of
SHARED_M2P().

--- a/xen/common/compat/grant_table.c
+++ b/xen/common/compat/grant_table.c
@@ -170,17 +170,14 @@ int compat_grant_table_op(unsigned int c
             if ( rc == 0 )
             {
 #define XLAT_gnttab_setup_table_HNDL_frame_list(_d_, _s_) \
-                do \
-                { \
-                    if ( (_s_)->status == GNTST_okay ) \
+                do { \
+                    for ( i = 0; (_s_)->status == GNTST_okay && \
+                                 i < (_s_)->nr_frames; ++i ) \
                     { \
-                        for ( i = 0; i < (_s_)->nr_frames; ++i ) \
-                        { \
-                            unsigned int frame = (_s_)->frame_list.p[i]; \
-                            if ( __copy_to_compat_offset((_d_)->frame_list, \
-                                                         i, &frame, 1) ) \
-                                (_s_)->status = GNTST_bad_virt_addr; \
-                        } \
+                        compat_pfn_t frame = (_s_)->frame_list.p[i]; \
+                        if ( __copy_to_compat_offset((_d_)->frame_list, \
+                                                     i, &frame, 1) ) \
+                            (_s_)->status = GNTST_bad_virt_addr; \
                     } \
                 } while (0)
                 XLAT_gnttab_setup_table(&cmp.setup, nat.setup);
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -2103,7 +2103,10 @@ gnttab_setup_table(
         BUG_ON(SHARED_M2P(gmfn));
 
         if ( __copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
+        {
             op.status = GNTST_bad_virt_addr;
+            break;
+        }
     }
 
  unlock:
@@ -3289,17 +3292,15 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
                  "status frames, but has only %u\n",
                  d->domain_id, op.nr_frames, nr_status_frames(gt));
         op.status = GNTST_general_error;
-        goto unlock;
     }
 
-    for ( i = 0; i < op.nr_frames; i++ )
+    for ( i = 0; op.status == GNTST_okay && i < op.nr_frames; i++ )
     {
         gmfn = gfn_x(gnttab_status_gfn(d, gt, i));
         if ( __copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
             op.status = GNTST_bad_virt_addr;
     }
 
- unlock:
     grant_read_unlock(gt);
  out2:
     rcu_unlock_domain(d);



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 9/9] gnttab: don't silently truncate GFNs in compat setup-table handling
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
                   ` (7 preceding siblings ...)
  2021-08-26 10:14 ` [PATCH 8/9] gnttab: bail from GFN-storing loops early in case of error Jan Beulich
@ 2021-08-26 10:15 ` Jan Beulich
  2022-10-07 19:57   ` Andrew Cooper
  2022-01-05  7:42 ` Ping: [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
  2022-10-07 13:49 ` Jan Beulich
  10 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 10:15 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

Returning back truncated frame numbers is unhelpful: Quite likely
they're not owned by the domain (if it's PV), or we may misguide the
guest into writing grant entries into a page that it actually uses for
other purposes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
RFC: Arguably in the 32-bit PV case it may be necessary to instead put
     in place an explicit address restriction when allocating
     ->shared_raw[N]. This is currently implicit by alloc_xenheap_page()
     only returning memory covered by the direct-map.

--- a/xen/common/compat/grant_table.c
+++ b/xen/common/compat/grant_table.c
@@ -175,8 +175,15 @@ int compat_grant_table_op(unsigned int c
                                  i < (_s_)->nr_frames; ++i ) \
                     { \
                         compat_pfn_t frame = (_s_)->frame_list.p[i]; \
-                        if ( __copy_to_compat_offset((_d_)->frame_list, \
-                                                     i, &frame, 1) ) \
+                        if ( frame != (_s_)->frame_list.p[i] ) \
+                        { \
+                            if ( VALID_M2P((_s_)->frame_list.p[i]) ) \
+                                (_s_)->status = GNTST_address_too_big; \
+                            else \
+                                frame |= 0x80000000U;\
+                        } \
+                        else if ( __copy_to_compat_offset((_d_)->frame_list, \
+                                                          i, &frame, 1) ) \
                             (_s_)->status = GNTST_bad_virt_addr; \
                     } \
                 } while (0)



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9] gnttab: drop GNTMAP_can_fail
  2021-08-26 10:13 ` [PATCH 4/9] gnttab: drop GNTMAP_can_fail Jan Beulich
@ 2021-08-26 11:45   ` Andrew Cooper
  2021-08-26 13:03     ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2021-08-26 11:45 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini, Wei Liu

On 26/08/2021 11:13, Jan Beulich wrote:
> There's neither documentation of what this flag is supposed to mean, nor
> any implementation. With this, don't even bother enclosing the #define-s
> in a __XEN_INTERFACE_VERSION__ conditional, but drop them altogether.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

It was introduced in 4d45702cf0398 along with GNTST_eagain, and the
commit message hints that it is for xen-paging

Furthermore, the first use of GNTST_eagain was in ecb35ecb79e0 for
trying to map/copy a paged-out frame.

Therefore I think it is reasonable to conclude that there was a planned
interaction between GNTMAP_can_fail and paging, which presumably would
have been "don't pull this in from disk if it is paged out".


I think it is fine to drop, as no implementation has ever existed in
Xen, but it would be helpful to have the history summarised in the
commit message.

~Andrew



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9] gnttab: drop GNTMAP_can_fail
  2021-08-26 11:45   ` Andrew Cooper
@ 2021-08-26 13:03     ` Jan Beulich
  2021-08-26 13:13       ` Andrew Cooper
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-08-26 13:03 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 26.08.2021 13:45, Andrew Cooper wrote:
> On 26/08/2021 11:13, Jan Beulich wrote:
>> There's neither documentation of what this flag is supposed to mean, nor
>> any implementation. With this, don't even bother enclosing the #define-s
>> in a __XEN_INTERFACE_VERSION__ conditional, but drop them altogether.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> It was introduced in 4d45702cf0398 along with GNTST_eagain, and the
> commit message hints that it is for xen-paging
> 
> Furthermore, the first use of GNTST_eagain was in ecb35ecb79e0 for
> trying to map/copy a paged-out frame.
> 
> Therefore I think it is reasonable to conclude that there was a planned
> interaction between GNTMAP_can_fail and paging, which presumably would
> have been "don't pull this in from disk if it is paged out".

I suppose that's (far fetched?) guesswork.

> I think it is fine to drop, as no implementation has ever existed in
> Xen, but it would be helpful to have the history summarised in the
> commit message.

I've extended it to

"There's neither documentation of what this flag is supposed to mean, nor
 any implementation. Commit 4d45702cf0398 ("paging: Updates to public
 grant table header file") suggests there might have been plans to use it
 for interaction with mem-paging, but no such functionality has ever
 materialized. With this, don't even bother enclosing the #define-s in a
 __XEN_INTERFACE_VERSION__ conditional, but drop them altogether."

Jan



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9] gnttab: drop GNTMAP_can_fail
  2021-08-26 13:03     ` Jan Beulich
@ 2021-08-26 13:13       ` Andrew Cooper
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Cooper @ 2021-08-26 13:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 26/08/2021 14:03, Jan Beulich wrote:
> On 26.08.2021 13:45, Andrew Cooper wrote:
>> On 26/08/2021 11:13, Jan Beulich wrote:
>>> There's neither documentation of what this flag is supposed to mean, nor
>>> any implementation. With this, don't even bother enclosing the #define-s
>>> in a __XEN_INTERFACE_VERSION__ conditional, but drop them altogether.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> It was introduced in 4d45702cf0398 along with GNTST_eagain, and the
>> commit message hints that it is for xen-paging
>>
>> Furthermore, the first use of GNTST_eagain was in ecb35ecb79e0 for
>> trying to map/copy a paged-out frame.
>>
>> Therefore I think it is reasonable to conclude that there was a planned
>> interaction between GNTMAP_can_fail and paging, which presumably would
>> have been "don't pull this in from disk if it is paged out".
> I suppose that's (far fetched?) guesswork.

Not really - the phrase "far fetched" loosely translates as implausible
or unlikely, and I wouldn't characterise the suggestion as implausible.

It is guesswork, and the most plausible guess I can think of.  It is
clear that GNTMAP_can_fail is related to paging somehow, which means
there is some interaction with the gref not being mappable right now.

>
>> I think it is fine to drop, as no implementation has ever existed in
>> Xen, but it would be helpful to have the history summarised in the
>> commit message.
> I've extended it to
>
> "There's neither documentation of what this flag is supposed to mean, nor
>  any implementation. Commit 4d45702cf0398 ("paging: Updates to public
>  grant table header file") suggests there might have been plans to use it
>  for interaction with mem-paging, but no such functionality has ever
>  materialized. With this, don't even bother enclosing the #define-s in a
>  __XEN_INTERFACE_VERSION__ conditional, but drop them altogether."

LGTM.  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

>
> Jan
>




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] gnttab: defer allocation of maptrack frames table
  2021-08-26 10:09 ` [PATCH 1/9] gnttab: defer allocation of maptrack frames table Jan Beulich
@ 2021-09-06 13:13   ` Julien Grall
  2021-09-06 13:29     ` Jan Beulich
  2021-09-06 14:05   ` Andrew Cooper
  1 sibling, 1 reply; 31+ messages in thread
From: Julien Grall @ 2021-09-06 13:13 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini, Wei Liu

Hi Jan,

On 26/08/2021 11:09, Jan Beulich wrote:
> By default all guests are permitted to have up to 1024 maptrack frames,
> which on 64-bit means an 8k frame table. Yet except for driver domains
> guests normally don't make use of grant mappings. Defer allocating the
> table until a map track handle is first requested.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> I continue to be unconvinced that it is a good idea to allow all DomU-s
> 1024 maptrack frames by default. While I'm still of the opinion that a
> hypervisor enforced upper bound is okay, I question this upper bound
> also getting used as the default value - this is perhaps okay for Dom0,
> but not elsewhere.

I agree here. I think having a per-domain limit maptrack is a good idea.

> 
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -633,6 +633,34 @@ get_maptrack_handle(
>       if ( likely(handle != INVALID_MAPTRACK_HANDLE) )
>           return handle;
>   
> +    if ( unlikely(!read_atomic(&lgt->maptrack)) )
> +    {
> +        struct grant_mapping **maptrack = NULL;
> +
> +        if ( lgt->max_maptrack_frames )
> +            maptrack = vzalloc(lgt->max_maptrack_frames * sizeof(*maptrack));

While I understand that allocating with a lock is bad idea, I don't like 
the fact that multiple vCPUs racing each other would result to 
temporarily allocate more memory.

If moving the call within the lock is not a solution, would using a loop 
with a cmpxchg() a solution to block the other vCPU?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/9] gnttab: drop a redundant expression from gnttab_release_mappings()
  2021-08-26 10:11 ` [PATCH 2/9] gnttab: drop a redundant expression from gnttab_release_mappings() Jan Beulich
@ 2021-09-06 13:15   ` Julien Grall
  0 siblings, 0 replies; 31+ messages in thread
From: Julien Grall @ 2021-09-06 13:15 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini, Wei Liu

Hi Jan,

On 26/08/2021 11:11, Jan Beulich wrote:
> This gnttab_host_mapping_get_page_type() invocation sits in the "else"
> path of a conditional controlled by "map->flags & GNTMAP_readonly".
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Julien Grall <jgrall@amazon.com>

Cheers,

> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3816,9 +3816,7 @@ int gnttab_release_mappings(struct domai
>                   if ( gnttab_release_host_mappings(d) &&
>                        !is_iomem_page(act->mfn) )
>                   {
> -                    if ( gnttab_host_mapping_get_page_type((map->flags &
> -                                                            GNTMAP_readonly),
> -                                                           d, rd) )
> +                    if ( gnttab_host_mapping_get_page_type(false, d, rd) )
>                           put_page_type(pg);
>                       put_page(pg);
>                   }
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] gnttab: defer allocation of maptrack frames table
  2021-09-06 13:13   ` Julien Grall
@ 2021-09-06 13:29     ` Jan Beulich
  2021-09-06 13:33       ` Julien Grall
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2021-09-06 13:29 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, xen-devel

On 06.09.2021 15:13, Julien Grall wrote:
> On 26/08/2021 11:09, Jan Beulich wrote:
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -633,6 +633,34 @@ get_maptrack_handle(
>>       if ( likely(handle != INVALID_MAPTRACK_HANDLE) )
>>           return handle;
>>   
>> +    if ( unlikely(!read_atomic(&lgt->maptrack)) )
>> +    {
>> +        struct grant_mapping **maptrack = NULL;
>> +
>> +        if ( lgt->max_maptrack_frames )
>> +            maptrack = vzalloc(lgt->max_maptrack_frames * sizeof(*maptrack));
> 
> While I understand that allocating with a lock is bad idea, I don't like 
> the fact that multiple vCPUs racing each other would result to 
> temporarily allocate more memory.
> 
> If moving the call within the lock is not a solution, would using a loop 
> with a cmpxchg() a solution to block the other vCPU?

As with any such loop the question then is for how long to retry. No matter
what (static) loop bound you choose, if you exceed it you would return an
error to the caller for no reason.

As to the value to store by cmpxchg() - were you thinking of some "alloc in
progress" marker? You obviously can't store the result of the allocation
before actually doing the allocation, yet it is unnecessary allocations
that you want to avoid (i.e. to achieve your goal the allocation would need
to come after the cmpxchg()). A marker would further complicate the other
code here, even if (maybe) just slightly ...

Jan



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] gnttab: defer allocation of maptrack frames table
  2021-09-06 13:29     ` Jan Beulich
@ 2021-09-06 13:33       ` Julien Grall
  2021-09-06 13:58         ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Julien Grall @ 2021-09-06 13:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, xen-devel

Hi Jan,

On 06/09/2021 14:29, Jan Beulich wrote:
> On 06.09.2021 15:13, Julien Grall wrote:
>> On 26/08/2021 11:09, Jan Beulich wrote:
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -633,6 +633,34 @@ get_maptrack_handle(
>>>        if ( likely(handle != INVALID_MAPTRACK_HANDLE) )
>>>            return handle;
>>>    
>>> +    if ( unlikely(!read_atomic(&lgt->maptrack)) )
>>> +    {
>>> +        struct grant_mapping **maptrack = NULL;
>>> +
>>> +        if ( lgt->max_maptrack_frames )
>>> +            maptrack = vzalloc(lgt->max_maptrack_frames * sizeof(*maptrack));
>>
>> While I understand that allocating with a lock is bad idea, I don't like
>> the fact that multiple vCPUs racing each other would result to
>> temporarily allocate more memory.
>>
>> If moving the call within the lock is not a solution, would using a loop
>> with a cmpxchg() a solution to block the other vCPU?
> 
> As with any such loop the question then is for how long to retry. No matter
> what (static) loop bound you choose, if you exceed it you would return an
> error to the caller for no reason.

I was thinking to have an unbound loop. This would be no better no worth 
than the current implementation because of the existing lock.

> 
> As to the value to store by cmpxchg() - were you thinking of some "alloc in
> progress" marker?

Yes.

> You obviously can't store the result of the allocation
> before actually doing the allocation, yet it is unnecessary allocations
> that you want to avoid (i.e. to achieve your goal the allocation would need
> to come after the cmpxchg()). A marker would further complicate the other
> code here, even if (maybe) just slightly ...

Right, the code will be a bit more complicated (although it would not be 
that bad if moved in a separate function...) but I feel it is better 
than the multiple vzalloc().

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/9] gnttab: fold recurring is_iomem_page()
  2021-08-26 10:12 ` [PATCH 3/9] gnttab: fold recurring is_iomem_page() Jan Beulich
@ 2021-09-06 13:35   ` Julien Grall
  2021-09-06 14:02     ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Julien Grall @ 2021-09-06 13:35 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini, Wei Liu

Hi Jan,

On 26/08/2021 11:12, Jan Beulich wrote:
> In all cases call the function just once instead of up to four times, at

extra NIT: It is more like two because there is a else in 
gnttab_release_mappings() :)

> the same time avoiding to store a dangling pointer in a local variable.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

> 
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c

[...]

Everything below looks a duplicate. Might be an issue in your tools?

> In all cases call the function just once instead of up to four times, at
> the same time avoiding to store a dangling pointer in a local variable.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -1587,11 +1587,11 @@ unmap_common_complete(struct gnttab_unma
>       else
>           status = &status_entry(rgt, op->ref);
>   
> -    pg = mfn_to_page(op->mfn);
> +    pg = !is_iomem_page(act->mfn) ? mfn_to_page(op->mfn) : NULL;
>   
>       if ( op->done & GNTMAP_device_map )
>       {
> -        if ( !is_iomem_page(act->mfn) )
> +        if ( pg )
>           {
>               if ( op->done & GNTMAP_readonly )
>                   put_page(pg);
> @@ -1608,7 +1608,7 @@ unmap_common_complete(struct gnttab_unma
>   
>       if ( op->done & GNTMAP_host_map )
>       {
> -        if ( !is_iomem_page(op->mfn) )
> +        if ( pg )
>           {
>               if ( gnttab_host_mapping_get_page_type(op->done & GNTMAP_readonly,
>                                                      ld, rd) )
> @@ -3778,7 +3778,7 @@ int gnttab_release_mappings(struct domai
>           else
>               status = &status_entry(rgt, ref);
>   
> -        pg = mfn_to_page(act->mfn);
> +        pg = !is_iomem_page(act->mfn) ? mfn_to_page(act->mfn) : NULL;
>   
>           if ( map->flags & GNTMAP_readonly )
>           {
> @@ -3786,7 +3786,7 @@ int gnttab_release_mappings(struct domai
>               {
>                   BUG_ON(!(act->pin & GNTPIN_devr_mask));
>                   act->pin -= GNTPIN_devr_inc;
> -                if ( !is_iomem_page(act->mfn) )
> +                if ( pg )
>                       put_page(pg);
>               }
>   
> @@ -3794,8 +3794,7 @@ int gnttab_release_mappings(struct domai
>               {
>                   BUG_ON(!(act->pin & GNTPIN_hstr_mask));
>                   act->pin -= GNTPIN_hstr_inc;
> -                if ( gnttab_release_host_mappings(d) &&
> -                     !is_iomem_page(act->mfn) )
> +                if ( pg && gnttab_release_host_mappings(d) )
>                       put_page(pg);
>               }
>           }
> @@ -3805,7 +3804,7 @@ int gnttab_release_mappings(struct domai
>               {
>                   BUG_ON(!(act->pin & GNTPIN_devw_mask));
>                   act->pin -= GNTPIN_devw_inc;
> -                if ( !is_iomem_page(act->mfn) )
> +                if ( pg )
>                       put_page_and_type(pg);
>               }
>   
> @@ -3813,8 +3812,7 @@ int gnttab_release_mappings(struct domai
>               {
>                   BUG_ON(!(act->pin & GNTPIN_hstw_mask));
>                   act->pin -= GNTPIN_hstw_inc;
> -                if ( gnttab_release_host_mappings(d) &&
> -                     !is_iomem_page(act->mfn) )
> +                if ( pg && gnttab_release_host_mappings(d) )
>                   {
>                       if ( gnttab_host_mapping_get_page_type(false, d, rd) )
>                           put_page_type(pg);
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/9] gnttab: check handle early in gnttab_get_status_frames()
  2021-08-26 10:13 ` [PATCH 6/9] gnttab: check handle early in gnttab_get_status_frames() Jan Beulich
@ 2021-09-06 13:41   ` Julien Grall
  0 siblings, 0 replies; 31+ messages in thread
From: Julien Grall @ 2021-09-06 13:41 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini, Wei Liu

Hi Jan,

On 26/08/2021 11:13, Jan Beulich wrote:
> Like done in gnttab_setup_table(), check the handle once early in the
> function and use the lighter-weight (for PV) copying function in the
> loop.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

> 
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3261,6 +3261,9 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
>           return -EFAULT;
>       }
>   
> +    if ( !guest_handle_okay(op.frame_list, op.nr_frames) )
> +        return -EFAULT;
> +
>       d = rcu_lock_domain_by_any_id(op.dom);
>       if ( d == NULL )
>       {
> @@ -3301,7 +3304,7 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
>       for ( i = 0; i < op.nr_frames; i++ )
>       {
>           gmfn = gfn_x(gnttab_status_gfn(d, gt, i));
> -        if ( copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
> +        if ( __copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
>               op.status = GNTST_bad_virt_addr;
>       }
>   
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] gnttab: defer allocation of maptrack frames table
  2021-09-06 13:33       ` Julien Grall
@ 2021-09-06 13:58         ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2021-09-06 13:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, xen-devel

On 06.09.2021 15:33, Julien Grall wrote:
> On 06/09/2021 14:29, Jan Beulich wrote:
>> On 06.09.2021 15:13, Julien Grall wrote:
>>> On 26/08/2021 11:09, Jan Beulich wrote:
>>>> --- a/xen/common/grant_table.c
>>>> +++ b/xen/common/grant_table.c
>>>> @@ -633,6 +633,34 @@ get_maptrack_handle(
>>>>        if ( likely(handle != INVALID_MAPTRACK_HANDLE) )
>>>>            return handle;
>>>>    
>>>> +    if ( unlikely(!read_atomic(&lgt->maptrack)) )
>>>> +    {
>>>> +        struct grant_mapping **maptrack = NULL;
>>>> +
>>>> +        if ( lgt->max_maptrack_frames )
>>>> +            maptrack = vzalloc(lgt->max_maptrack_frames * sizeof(*maptrack));
>>>
>>> While I understand that allocating with a lock is bad idea, I don't like
>>> the fact that multiple vCPUs racing each other would result to
>>> temporarily allocate more memory.
>>>
>>> If moving the call within the lock is not a solution, would using a loop
>>> with a cmpxchg() a solution to block the other vCPU?
>>
>> As with any such loop the question then is for how long to retry. No matter
>> what (static) loop bound you choose, if you exceed it you would return an
>> error to the caller for no reason.
> 
> I was thinking to have an unbound loop. This would be no better no worth 
> than the current implementation because of the existing lock.

Not quite: Ticket locks grant access to the locked region in FIFO manner.
Such an open-coded loop wouldn't, i.e. there would be a risk of a loop
becoming (close to) infinite. Granted this is largely a theoretical risk,
but still ...

>> As to the value to store by cmpxchg() - were you thinking of some "alloc in
>> progress" marker?
> 
> Yes.
> 
>> You obviously can't store the result of the allocation
>> before actually doing the allocation, yet it is unnecessary allocations
>> that you want to avoid (i.e. to achieve your goal the allocation would need
>> to come after the cmpxchg()). A marker would further complicate the other
>> code here, even if (maybe) just slightly ...
> 
> Right, the code will be a bit more complicated (although it would not be 
> that bad if moved in a separate function...) but I feel it is better 
> than the multiple vzalloc().

It's the other way around here - to me it feels better the way I've coded
it. I don't think the risk of an actual race is overly high, the more that
this is a one time event for every domain.

Jan



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/9] gnttab: fold recurring is_iomem_page()
  2021-09-06 13:35   ` Julien Grall
@ 2021-09-06 14:02     ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2021-09-06 14:02 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Stefano Stabellini,
	Wei Liu, xen-devel

On 06.09.2021 15:35, Julien Grall wrote:
> On 26/08/2021 11:12, Jan Beulich wrote:
>> In all cases call the function just once instead of up to four times, at
> 
> extra NIT: It is more like two because there is a else in 
> gnttab_release_mappings() :)

Well, I was viewing things from code gen pov, not so much execution
paths. And of course "two" is not wrongly covered by "up to four" ;-)

>> the same time avoiding to store a dangling pointer in a local variable.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Julien Grall <jgrall@amazon.com>

Thanks.

>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
> 
> [...]
> 
> Everything below looks a duplicate. Might be an issue in your tools?

Oops, indeed - some kind of glitch. Odd.

Jan



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] gnttab: defer allocation of maptrack frames table
  2021-08-26 10:09 ` [PATCH 1/9] gnttab: defer allocation of maptrack frames table Jan Beulich
  2021-09-06 13:13   ` Julien Grall
@ 2021-09-06 14:05   ` Andrew Cooper
  2021-09-06 14:43     ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2021-09-06 14:05 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini, Wei Liu

On 26/08/2021 11:09, Jan Beulich wrote:
> By default all guests are permitted to have up to 1024 maptrack frames,
> which on 64-bit means an 8k frame table. Yet except for driver domains
> guests normally don't make use of grant mappings. Defer allocating the
> table until a map track handle is first requested.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Nack.  This creates new dynamic failures outside the VM's control,
therefore regressing Xen's usability.

The maptrack array (and frames for that matter) should be allocated at
domain creation time, like we do for most other structures in the
hypervisor.

Furthermore, seeing as this has come up on multiple community calls, I
will remind you that it is not just Citrix as a downstream which is
firmly of this opinion.

This entire patch should be replaced with one that...

> ---
> I continue to be unconvinced that it is a good idea to allow all DomU-s
> 1024 maptrack frames by default.

... defaults to something smaller for plain domUs, because this improves
the general case without leaving VMs more liable to crash at runtime.

~Andrew



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] gnttab: defer allocation of maptrack frames table
  2021-09-06 14:05   ` Andrew Cooper
@ 2021-09-06 14:43     ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2021-09-06 14:43 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 06.09.2021 16:05, Andrew Cooper wrote:
> On 26/08/2021 11:09, Jan Beulich wrote:
>> By default all guests are permitted to have up to 1024 maptrack frames,
>> which on 64-bit means an 8k frame table. Yet except for driver domains
>> guests normally don't make use of grant mappings. Defer allocating the
>> table until a map track handle is first requested.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Nack.  This creates new dynamic failures outside the VM's control,
> therefore regressing Xen's usability.

As with the v2 status frames tracking array, the memory demand of
the actual table will typically be quite a bit higher than that of
the frame tracking table. Therefore guests already are at risk of
observing failures from these paths; the increase of that risk from
the change here is typically quite small.

> The maptrack array (and frames for that matter) should be allocated at
> domain creation time, like we do for most other structures in the
> hypervisor.

Structures we allocate at domain creation time are typically ones
which we know will get used (to at least some degree), or where
allocation can't be done later because paths this would be needed on
have no way to indicate respective failure. Neither of this is true
for the maptrack frame table (or the v2 status frame table, for that
matter).

This said, I could see us _switch_ to a policy like the one you
describe, or even allow either behavior depending on some kind of
setting. But then this needs to be done consistently for all forms
of resources (e.g. also the gnttab v2 frames, not just the frame
table).

> Furthermore, seeing as this has come up on multiple community calls, I
> will remind you that it is not just Citrix as a downstream which is
> firmly of this opinion.

No-one but you has voiced such an opinion, so far.

> This entire patch should be replaced with one that...
> 
>> ---
>> I continue to be unconvinced that it is a good idea to allow all DomU-s
>> 1024 maptrack frames by default.
> 
> ... defaults to something smaller for plain domUs, because this improves
> the general case without leaving VMs more liable to crash at runtime.

Just to continue to waste memory? Or to force the admin to know very
precisely how many maptrack entries a guest may need to create, such
that they will neither cause large amounts of unused memory nor risk
the guest running out of handles?

Jan



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Ping: [PATCH 0/9] gnttab: further work from XSA-380 / -382 context
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
                   ` (8 preceding siblings ...)
  2021-08-26 10:15 ` [PATCH 9/9] gnttab: don't silently truncate GFNs in compat setup-table handling Jan Beulich
@ 2022-01-05  7:42 ` Jan Beulich
  2022-10-07 13:49 ` Jan Beulich
  10 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2022-01-05  7:42 UTC (permalink / raw)
  To: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu
  Cc: xen-devel

On 26.08.2021 12:06, Jan Beulich wrote:
> The first four patches can be attributed to the former, the last four
> patches to the latter. The middle patch had been submitted standalone
> before, has a suitable Reviewed-by tag, but also has an objection by
> Andrew pending, which unfortunately has lead to this patch now being
> stuck. Short of Andrew being willing to settle the disagreement more
> with Julien than with me (although I'm on Julien's side), I have no
> idea what to do here.
> 
> There's probably not much interrelation between the patches, so they
> can perhaps go in about any order.
> 
> 1: defer allocation of maptrack frames table
> 2: drop a redundant expression from gnttab_release_mappings()
> 3: fold recurring is_iomem_page()
> 4: drop GNTMAP_can_fail
> 5: defer allocation of status frame tracking array
> 6: check handle early in gnttab_get_status_frames()
> 7: no need to translate handle for gnttab_get_status_frames()
> 8: bail from GFN-storing loops early in case of error
> 9: don't silently truncate GFNs in compat setup-table handling

While some of the patches have gone in, and while I realize patches
1 and 5 are controversial (with no clear route towards resolution),
may I ask for acks or otherwise on patches 7, 8, and 9 please?

Thanks, Jan



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/9] gnttab: further work from XSA-380 / -382 context
  2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
                   ` (9 preceding siblings ...)
  2022-01-05  7:42 ` Ping: [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
@ 2022-10-07 13:49 ` Jan Beulich
  2022-10-07 18:09   ` Andrew Cooper
  10 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2022-10-07 13:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu

On 26.08.2021 12:06, Jan Beulich wrote:
> The first four patches can be attributed to the former, the last four
> patches to the latter. The middle patch had been submitted standalone
> before, has a suitable Reviewed-by tag, but also has an objection by
> Andrew pending, which unfortunately has lead to this patch now being
> stuck. Short of Andrew being willing to settle the disagreement more
> with Julien than with me (although I'm on Julien's side), I have no
> idea what to do here.
> 
> There's probably not much interrelation between the patches, so they
> can perhaps go in about any order.
> 
> 1: defer allocation of maptrack frames table
> 2: drop a redundant expression from gnttab_release_mappings()
> 3: fold recurring is_iomem_page()
> 4: drop GNTMAP_can_fail
> 5: defer allocation of status frame tracking array

Just to make "official" what I've said in the course of the resource
management discussion at the event in Cambridge: I'm withdrawing 1
and 5, in the expectation that eager/lazy allocation of resources
will become a property to be honored uniformly for a guest. With 2,
3, 4, and 6 already having gone in, it would still be nice to
(finally) have feedback on ...

> 6: check handle early in gnttab_get_status_frames()
> 7: no need to translate handle for gnttab_get_status_frames()
> 8: bail from GFN-storing loops early in case of error
> 9: don't silently truncate GFNs in compat setup-table handling

... the remaining three - even if these aren't really 4.17 candidates
anymore at this point.

Thanks, Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/9] gnttab: further work from XSA-380 / -382 context
  2022-10-07 13:49 ` Jan Beulich
@ 2022-10-07 18:09   ` Andrew Cooper
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Cooper @ 2022-10-07 18:09 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini, Wei Liu

On 07/10/2022 14:49, Jan Beulich wrote:
> On 26.08.2021 12:06, Jan Beulich wrote:
>> The first four patches can be attributed to the former, the last four
>> patches to the latter. The middle patch had been submitted standalone
>> before, has a suitable Reviewed-by tag, but also has an objection by
>> Andrew pending, which unfortunately has lead to this patch now being
>> stuck. Short of Andrew being willing to settle the disagreement more
>> with Julien than with me (although I'm on Julien's side), I have no
>> idea what to do here.
>>
>> There's probably not much interrelation between the patches, so they
>> can perhaps go in about any order.
>>
>> 1: defer allocation of maptrack frames table
>> 2: drop a redundant expression from gnttab_release_mappings()
>> 3: fold recurring is_iomem_page()
>> 4: drop GNTMAP_can_fail
>> 5: defer allocation of status frame tracking array
> Just to make "official" what I've said in the course of the resource
> management discussion at the event in Cambridge: I'm withdrawing 1
> and 5, in the expectation that eager/lazy allocation of resources
> will become a property to be honored uniformly for a guest. With 2,
> 3, 4, and 6 already having gone in, it would still be nice to
> (finally) have feedback on ...

To these issues in particular, there was specific work done in 4.16 to
address the impasse in a way which got the savings in most cases, but
without increasing the risk of hitting known buggy paths in guest drivers.


For patch 5, the ABI max version is now known to domain_create(), so the
status array allocation can safely be skipped when gnttab v2 isn't
available.

This gets the saving Julien wants on ARM, and on x86 we ought to
encourage people to restricting to v1 where possible, so they get the
savings too.


For patch 1, I agree with the observation that 1024 maptrack frames is a
silly default to have.  Adjusting this appropriately will result in the
kind of savings wanted in the patch, without modifying the hypervisor
directly.

We should have a patch to xl (or is it libxl?) to make a better choice
than blindly picking 1024.

Having written this out, something does strike me as odd.  These limits
are specified in frames, and for the gnttab limit, this equates into a
known number of grants.  But struct maptrack is internal Xen, so the
toolstack selecting 1024 frames can't know how many grant maps that
actually equates to in Xen.  Right now, 1024 maptrack frames equates to
2^18 mappings (I think).


For traditional server-virt VMs, they can have 0 because the frontends
should never be mapping grants.  We actually already do this for the
xenstored stubdom.  The same is almost certainly true for qemu stubdoms.

For VMs in a bit of a more interesting configuration, e.g. with virtio,
then they need "some".

For device driver VMs, they do need a dom0-like mapping capabilities.

Either way, there is definitely room to improve the status quo without
impacting runtime safety.  If patches were to appear promptly, I think
there's probably wiggle room to consider them for 4.17 at this point.

~Andrew

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/9] gnttab: no need to translate handle for gnttab_get_status_frames()
  2021-08-26 10:14 ` [PATCH 7/9] gnttab: no need to translate handle for gnttab_get_status_frames() Jan Beulich
@ 2022-10-07 18:24   ` Andrew Cooper
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Cooper @ 2022-10-07 18:24 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini, Wei Liu

On 26/08/2021 11:14, Jan Beulich wrote:
> Unlike for GNTTABOP_setup_table native and compat frame lists are arrays

"GNTTABOP_setup_table, native"

But I think it would also be clearer to follow with

"frame lists for GNTTABOP_get_status_frames are of".

> of the same type (uint64_t). Hence there's no need to translate the frame
> values. This then also renders unnecessary the limit_max parameter of
> gnttab_get_status_frames().
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/9] gnttab: bail from GFN-storing loops early in case of error
  2021-08-26 10:14 ` [PATCH 8/9] gnttab: bail from GFN-storing loops early in case of error Jan Beulich
@ 2022-10-07 19:36   ` Andrew Cooper
  2022-10-10  9:30     ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2022-10-07 19:36 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Wei Liu, Juergen Gross

On 26/08/2021 11:14, Jan Beulich wrote:
> The contents of the output arrays are undefined in both cases anyway
> when the operation itself gets marked as failed. There's no value in
> trying to continue after a guest memory access failure.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Not really Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

This is an example of a bad loop adjustment.  Taking just one example to
demonstrate with,

> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3289,17 +3292,15 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
>                   "status frames, but has only %u\n",
>                   d->domain_id, op.nr_frames, nr_status_frames(gt));
>          op.status = GNTST_general_error;
> -        goto unlock;
>      }
>  
> -    for ( i = 0; i < op.nr_frames; i++ )
> +    for ( i = 0; op.status == GNTST_okay && i < op.nr_frames; i++ )
>      {
>          gmfn = gfn_x(gnttab_status_gfn(d, gt, i));
>          if ( __copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
>              op.status = GNTST_bad_virt_addr;
>      }
>  
> - unlock:
>      grant_read_unlock(gt);
>   out2:
>      rcu_unlock_domain(d);
>


If instead, this were written

    for ( i = 0; i < op.nr_frames; i++ )
    {
        gmfn = gfn_x(gnttab_status_gfn(d, gt, i));
        if ( __copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
        {
            op.status = GNTST_bad_virt_addr;
            goto unlock;
        }
    }

then the delta vs your version is -36 bytes, and faster to run because
the loop doesn't need a memory read and compare on every iteration when
you can exit based purely on existing control flow.

Furthermore, the version with a goto is clearer to follow, because the
exit condition is much more obvious.  The compat change can do the same
with breaks rather than gotos, for a slightly more modest -11 saving.

A form with the op.status changes adjustments *not* added to the loop
condition, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


In reference to the hypercall ABI adjustments, it occurs to me that
loops like this (which we have loads of, even in hypercall hotpaths) are
horrifying for performance.  For HVM, we're redoing the nested pagewalk
for every uint64_t element of an array. 

A "copy array to guest" primitive would more efficient still than a
plain virt->phys translation, because we'd be able to drop the p2m walks
too.

Obviously, we don't want every instance like this to be doing its own
manual bounce buffering, so perhaps we should invest in some buffered
copy helpers as part of trying to improve hypercall performance.

~Andrew

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 9/9] gnttab: don't silently truncate GFNs in compat setup-table handling
  2021-08-26 10:15 ` [PATCH 9/9] gnttab: don't silently truncate GFNs in compat setup-table handling Jan Beulich
@ 2022-10-07 19:57   ` Andrew Cooper
  2022-10-10  8:46     ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2022-10-07 19:57 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini, Wei Liu

On 26/08/2021 11:15, Jan Beulich wrote:
> Returning back truncated frame numbers is unhelpful: Quite likely
> they're not owned by the domain (if it's PV), or we may misguide the
> guest into writing grant entries into a page that it actually uses for
> other purposes.
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> RFC: Arguably in the 32-bit PV case it may be necessary to instead put
>      in place an explicit address restriction when allocating
>      ->shared_raw[N]. This is currently implicit by alloc_xenheap_page()
>      only returning memory covered by the direct-map.

Yet another reason why having the grant table be Xen memory, rather than
guest memory, was a terrible idea.  Changing this is in consideration
for the encrypted vm work.

Its fine for now, but is fragile and liable to break for e.g. an
xmalloc() -> vmalloc() conversion, or when we get 5-level paging and the
directmap boundary moves above 16T.



> --- a/xen/common/compat/grant_table.c
> +++ b/xen/common/compat/grant_table.c
> @@ -175,8 +175,15 @@ int compat_grant_table_op(unsigned int c
>                                   i < (_s_)->nr_frames; ++i ) \
>                      { \
>                          compat_pfn_t frame = (_s_)->frame_list.p[i]; \
> -                        if ( __copy_to_compat_offset((_d_)->frame_list, \
> -                                                     i, &frame, 1) ) \
> +                        if ( frame != (_s_)->frame_list.p[i] ) \
> +                        { \
> +                            if ( VALID_M2P((_s_)->frame_list.p[i]) ) \
> +                                (_s_)->status = GNTST_address_too_big; \
> +                            else \
> +                                frame |= 0x80000000U;\

Space before the \.  (This is one reason why I hate this style.  The
borderline illegibility makes it almost impossible to spot style problems.)

With the adjustment from the previous patch, you'll need a break in here.

But for !valid case, shouldn't we saturate to ~0u ?  I recall from the
migration work that various kernels disagree on what constitutes an
invalid MFN.

Then again, I can't see what legitimate case we'd have for a truncation
and an invalid M2P entry needing translating.

~Andrew

> +                        } \
> +                        else if ( __copy_to_compat_offset((_d_)->frame_list, \
> +                                                          i, &frame, 1) ) \
>                              (_s_)->status = GNTST_bad_virt_addr; \
>                      } \
>                  } while (0)
>
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 9/9] gnttab: don't silently truncate GFNs in compat setup-table handling
  2022-10-07 19:57   ` Andrew Cooper
@ 2022-10-10  8:46     ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2022-10-10  8:46 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 07.10.2022 21:57, Andrew Cooper wrote:
> On 26/08/2021 11:15, Jan Beulich wrote:
>> --- a/xen/common/compat/grant_table.c
>> +++ b/xen/common/compat/grant_table.c
>> @@ -175,8 +175,15 @@ int compat_grant_table_op(unsigned int c
>>                                   i < (_s_)->nr_frames; ++i ) \
>>                      { \
>>                          compat_pfn_t frame = (_s_)->frame_list.p[i]; \
>> -                        if ( __copy_to_compat_offset((_d_)->frame_list, \
>> -                                                     i, &frame, 1) ) \
>> +                        if ( frame != (_s_)->frame_list.p[i] ) \
>> +                        { \
>> +                            if ( VALID_M2P((_s_)->frame_list.p[i]) ) \
>> +                                (_s_)->status = GNTST_address_too_big; \
>> +                            else \
>> +                                frame |= 0x80000000U;\
> 
> Space before the \.  (This is one reason why I hate this style.  The
> borderline illegibility makes it almost impossible to spot style problems.)

There is a (imo severe) downsides to backslashes on the far right as well:
It's easier to miss adding one on a newly added line, which may or may not
result in a build failure.

> With the adjustment from the previous patch, you'll need a break in here.

Can do. Question then is whether to go further right here and adjust
the loop header and the other setting of (_s_)->status at the same time.

> But for !valid case, shouldn't we saturate to ~0u ?  I recall from the
> migration work that various kernels disagree on what constitutes an
> invalid MFN.
> 
> Then again, I can't see what legitimate case we'd have for a truncation
> and an invalid M2P entry needing translating.

I've dropped the use of VALID_M2P(). It's a bogus check anyway (I don't
actually recall how I came to think of doing this sort of check), and
there's indeed no reason to report back an (overflow) error in this way.
Furthermore I've noticed that the updating of "frame" was actually dead
code - the updated variable wasn't really used for anything; we would
have left both ->status and the array slot untouched, misguiding the
caller.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/9] gnttab: bail from GFN-storing loops early in case of error
  2022-10-07 19:36   ` Andrew Cooper
@ 2022-10-10  9:30     ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2022-10-10  9:30 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	Juergen Gross, xen-devel

On 07.10.2022 21:36, Andrew Cooper wrote:
> On 26/08/2021 11:14, Jan Beulich wrote:
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -3289,17 +3292,15 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
>>                   "status frames, but has only %u\n",
>>                   d->domain_id, op.nr_frames, nr_status_frames(gt));
>>          op.status = GNTST_general_error;
>> -        goto unlock;
>>      }
>>  
>> -    for ( i = 0; i < op.nr_frames; i++ )
>> +    for ( i = 0; op.status == GNTST_okay && i < op.nr_frames; i++ )
>>      {
>>          gmfn = gfn_x(gnttab_status_gfn(d, gt, i));
>>          if ( __copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
>>              op.status = GNTST_bad_virt_addr;
>>      }
>>  
>> - unlock:
>>      grant_read_unlock(gt);
>>   out2:
>>      rcu_unlock_domain(d);
>>
> 
> 
> If instead, this were written
> 
>     for ( i = 0; i < op.nr_frames; i++ )
>     {
>         gmfn = gfn_x(gnttab_status_gfn(d, gt, i));
>         if ( __copy_to_guest_offset(op.frame_list, i, &gmfn, 1) )
>         {
>             op.status = GNTST_bad_virt_addr;
>             goto unlock;
>         }
>     }
> 
> then the delta vs your version is -36 bytes, and faster to run because
> the loop doesn't need a memory read and compare on every iteration when
> you can exit based purely on existing control flow.
> 
> Furthermore, the version with a goto is clearer to follow, because the
> exit condition is much more obvious.

I know you and others deem "goto" okay to use; where possible (and where
the resulting code remains readable/maintainable) I'm aiming at avoiding
them. Nevertheless I'll follow the request here.

>  The compat change can do the same
> with breaks rather than gotos, for a slightly more modest -11 saving.

There of course the original if() around the the loop then also needs
retaining; on the positive side this means a smaller diff.

> A form with the op.status changes adjustments *not* added to the loop
> condition, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks.

> In reference to the hypercall ABI adjustments, it occurs to me that
> loops like this (which we have loads of, even in hypercall hotpaths) are
> horrifying for performance.  For HVM, we're redoing the nested pagewalk
> for every uint64_t element of an array. 
> 
> A "copy array to guest" primitive would more efficient still than a
> plain virt->phys translation, because we'd be able to drop the p2m walks
> too.

Generally we can copy arrays (the last parameter of the copying primitives
is a count, after all) but ...

> Obviously, we don't want every instance like this to be doing its own
> manual bounce buffering, so perhaps we should invest in some buffered
> copy helpers as part of trying to improve hypercall performance.

... avoiding bounce buffering is possible only where the data to copy out
is available in ready-to-copy form. Here in the native cases we need to
retrieve the GFN (from e.g. gnttab_status_gfn()), and in the compat case
we need to translate from 64 to 32 bits. Neither really lends itself to
the use of a generic helper, I think.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2022-10-10  9:30 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-26 10:06 [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
2021-08-26 10:09 ` [PATCH 1/9] gnttab: defer allocation of maptrack frames table Jan Beulich
2021-09-06 13:13   ` Julien Grall
2021-09-06 13:29     ` Jan Beulich
2021-09-06 13:33       ` Julien Grall
2021-09-06 13:58         ` Jan Beulich
2021-09-06 14:05   ` Andrew Cooper
2021-09-06 14:43     ` Jan Beulich
2021-08-26 10:11 ` [PATCH 2/9] gnttab: drop a redundant expression from gnttab_release_mappings() Jan Beulich
2021-09-06 13:15   ` Julien Grall
2021-08-26 10:12 ` [PATCH 3/9] gnttab: fold recurring is_iomem_page() Jan Beulich
2021-09-06 13:35   ` Julien Grall
2021-09-06 14:02     ` Jan Beulich
2021-08-26 10:13 ` [PATCH 4/9] gnttab: drop GNTMAP_can_fail Jan Beulich
2021-08-26 11:45   ` Andrew Cooper
2021-08-26 13:03     ` Jan Beulich
2021-08-26 13:13       ` Andrew Cooper
2021-08-26 10:13 ` [PATCH 5/9] gnttab: defer allocation of status frame tracking array Jan Beulich
2021-08-26 10:13 ` [PATCH 6/9] gnttab: check handle early in gnttab_get_status_frames() Jan Beulich
2021-09-06 13:41   ` Julien Grall
2021-08-26 10:14 ` [PATCH 7/9] gnttab: no need to translate handle for gnttab_get_status_frames() Jan Beulich
2022-10-07 18:24   ` Andrew Cooper
2021-08-26 10:14 ` [PATCH 8/9] gnttab: bail from GFN-storing loops early in case of error Jan Beulich
2022-10-07 19:36   ` Andrew Cooper
2022-10-10  9:30     ` Jan Beulich
2021-08-26 10:15 ` [PATCH 9/9] gnttab: don't silently truncate GFNs in compat setup-table handling Jan Beulich
2022-10-07 19:57   ` Andrew Cooper
2022-10-10  8:46     ` Jan Beulich
2022-01-05  7:42 ` Ping: [PATCH 0/9] gnttab: further work from XSA-380 / -382 context Jan Beulich
2022-10-07 13:49 ` Jan Beulich
2022-10-07 18:09   ` Andrew Cooper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.