On 12.09.22 06:00, M. Vefa Bicakci wrote:
> Prior to this commit, if a grant mapping operation failed partially,
> some of the entries in the map_ops array would be invalid, whereas all
> of the entries in the kmap_ops array would be valid. This in turn would
> cause the following logic in gntdev_map_grant_pages to become invalid:
>
>   for (i = 0; i < map->count; i++) {
>     if (map->map_ops[i].status == GNTST_okay) {
>       map->unmap_ops[i].handle = map->map_ops[i].handle;
>       if (!use_ptemod)
>         alloced++;
>     }
>     if (use_ptemod) {
>       if (map->kmap_ops[i].status == GNTST_okay) {
>         if (map->map_ops[i].status == GNTST_okay)
>           alloced++;
>         map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
>       }
>     }
>   }
>   ...
>   atomic_add(alloced, &map->live_grants);
>
> Assume that use_ptemod is true (i.e., the domain mapping the granted
> pages is a paravirtualized domain). In the code excerpt above, note that
> the "alloced" variable is only incremented when both kmap_ops[i].status
> and map_ops[i].status are set to GNTST_okay (i.e., both mapping
> operations are successful). However, as also noted above, there are
> cases where a grant mapping operation fails partially, breaking the
> assumption of the code excerpt above.
>
> The aforementioned causes map->live_grants to be incorrectly set. In
> some cases, all of the map_ops mappings fail, but all of the kmap_ops
> mappings succeed, meaning that live_grants may remain zero. This in turn
> makes it impossible to unmap the successfully grant-mapped pages pointed
> to by kmap_ops, because unmap_grant_pages has the following snippet of
> code at its beginning:
>
>   if (atomic_read(&map->live_grants) == 0)
>     return; /* Nothing to do */
>
> In other cases where only some of the map_ops mappings fail but all
> kmap_ops mappings succeed, live_grants is made positive, but when the
> user requests unmapping the grant-mapped pages, __unmap_grant_pages_done
> will then make map->live_grants negative, because the latter function
> does not check if all of the pages that were requested to be unmapped
> were actually unmapped, and the same function unconditionally subtracts
> "data->count" (i.e., a value that can be greater than map->live_grants)
> from map->live_grants. The side effects of a negative live_grants value
> have not been studied.
>
> The net effect of all of this is that grant references are leaked in one
> of the above conditions. In Qubes OS v4.1 (which uses Xen's grant
> mechanism extensively for X11 GUI isolation), this issue manifests
> itself with warning messages like the following to be printed out by the
> Linux kernel in the VM that had granted pages (that contain X11 GUI
> window data) to dom0: "g.e. 0x1234 still pending", especially after the
> user rapidly resizes GUI VM windows (causing some grant-mapping
> operations to partially or completely fail, due to the fact that the VM
> unshares some of the pages as part of the window resizing, making the
> pages impossible to grant-map from dom0).
>
> The fix for this issue involves counting all successful map_ops and
> kmap_ops mappings separately, and then adding the sum to live_grants.
> During unmapping, only the number of successfully unmapped grants is
> subtracted from live_grants. To determine which grants were successfully
> unmapped, their status fields are set to an arbitrary positive number
> (1), as was done in commit ebee0eab0859 ("Xen/gntdev: correct error
> checking in gntdev_map_grant_pages()"). The code is also modified to
> check for negative live_grants values after the subtraction and warn the
> user.
>
> Link: https://github.com/QubesOS/qubes-issues/issues/7631
> Fixes: dbe97cff7dd9 ("xen/gntdev: Avoid blocking in unmap_grant_pages()")
> Cc: stable@vger.kernel.org
> Signed-off-by: M. Vefa Bicakci
> ---
>  drivers/xen/gntdev.c | 32 +++++++++++++++++++++++++++-----
>  1 file changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> index 84b143eef395..485fa9c630aa 100644
> --- a/drivers/xen/gntdev.c
> +++ b/drivers/xen/gntdev.c
> @@ -367,8 +367,7 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
>  	for (i = 0; i < map->count; i++) {
>  		if (map->map_ops[i].status == GNTST_okay) {
>  			map->unmap_ops[i].handle = map->map_ops[i].handle;
> -			if (!use_ptemod)
> -				alloced++;
> +			alloced++;
>  		} else if (!err)
>  			err = -EINVAL;
>
> @@ -377,8 +376,7 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
>
>  		if (use_ptemod) {
>  			if (map->kmap_ops[i].status == GNTST_okay) {
> -				if (map->map_ops[i].status == GNTST_okay)
> -					alloced++;
> +				alloced++;
>  				map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
>  			} else if (!err)
>  				err = -EINVAL;
> @@ -394,8 +392,13 @@ static void __unmap_grant_pages_done(int result,
>  	unsigned int i;
>  	struct gntdev_grant_map *map = data->data;
>  	unsigned int offset = data->unmap_ops - map->unmap_ops;
> +	int successful_unmaps = 0;
> +	int live_grants;
>
>  	for (i = 0; i < data->count; i++) {
> +		if (map->unmap_ops[offset + i].status == GNTST_okay)
> +			successful_unmaps++;

Shouldn't this test include "&& handle != INVALID_GRANT_HANDLE" ?

This should enable you to drop setting status to 1 below.

> +
>  		WARN_ON(map->unmap_ops[offset + i].status != GNTST_okay &&
>  			map->unmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
>  		pr_debug("unmap handle=%d st=%d\n",
> @@ -403,6 +406,9 @@ static void __unmap_grant_pages_done(int result,
>  			map->unmap_ops[offset+i].status);
>  		map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
>  		if (use_ptemod) {
> +			if (map->kunmap_ops[offset + i].status == GNTST_okay)
> +				successful_unmaps++;
> +
>  			WARN_ON(map->kunmap_ops[offset + i].status != GNTST_okay &&
>  				map->kunmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
>  			pr_debug("kunmap handle=%u st=%d\n",
> @@ -411,11 +417,15 @@ static void __unmap_grant_pages_done(int result,
>  			map->kunmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
>  		}
>  	}
> +
>  	/*
>  	 * Decrease the live-grant counter. This must happen after the loop to
>  	 * prevent premature reuse of the grants by gnttab_mmap().
>  	 */
> -	atomic_sub(data->count, &map->live_grants);
> +	live_grants = atomic_sub_return(successful_unmaps, &map->live_grants);
> +	if (WARN_ON(live_grants < 0))
> +		pr_err("%s: live_grants became negative (%d) after unmapping %d pages!\n",
> +		       __func__, live_grants, successful_unmaps);
>
>  	/* Release reference taken by __unmap_grant_pages */
>  	gntdev_put_map(NULL, map);
> @@ -424,6 +434,8 @@ static void __unmap_grant_pages_done(int result,
>  static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
>  				int pages)
>  {
> +	int idx;
> +
>  	if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
>  		int pgno = (map->notify.addr >> PAGE_SHIFT);
>
> @@ -436,6 +448,16 @@ static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
>  		}
>  	}
>
> +	/* Set all unmap/kunmap status fields to an arbitrary positive value,
> +	 * so that it is possible to determine which grants were successfully
> +	 * unmapped by inspecting the status fields.
> +	 */
> +	for (idx = offset; idx < offset + pages; idx++) {
> +		map->unmap_ops[idx].status = 1;
> +		if (use_ptemod)
> +			map->kunmap_ops[idx].status = 1;
> +	}
> +
>  	map->unmap_data.unmap_ops = map->unmap_ops + offset;
>  	map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
>  	map->unmap_data.pages = map->pages + offset;

Juergen
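For reference, a rough sketch of the alternative suggested in the review above: count an entry as successfully unmapped only when its status is GNTST_okay and its handle is not INVALID_GRANT_HANDLE, which would make the loop in __unmap_grant_pages that pre-sets the status fields to 1 unnecessary. This is illustrative only, not the submitted patch; it reuses the identifiers from the quoted diff and elides the unchanged WARN_ON/pr_debug/handle-reset lines:

	/* Sketch only (reviewer's suggestion, not the posted patch):
	 * count an unmap entry as successful only if the hypercall
	 * reported GNTST_okay *and* the entry actually referred to a
	 * mapped grant (handle != INVALID_GRANT_HANDLE).  The handle is
	 * checked before it is reset later in the same iteration.
	 */
	for (i = 0; i < data->count; i++) {
		if (map->unmap_ops[offset + i].status == GNTST_okay &&
		    map->unmap_ops[offset + i].handle != INVALID_GRANT_HANDLE)
			successful_unmaps++;

		/* ... existing WARN_ON/pr_debug/handle reset ... */

		if (use_ptemod) {
			if (map->kunmap_ops[offset + i].status == GNTST_okay &&
			    map->kunmap_ops[offset + i].handle != INVALID_GRANT_HANDLE)
				successful_unmaps++;

			/* ... existing WARN_ON/pr_debug/handle reset ... */
		}
	}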