kvm.vger.kernel.org archive mirror
* [PATCH V3 0/9] vfio virtual address update
@ 2021-01-29 16:54 Steve Sistare
  2021-01-29 16:54 ` [PATCH V3 1/9] vfio: option to unmap all Steve Sistare
                   ` (10 more replies)
  0 siblings, 11 replies; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Add interfaces that allow the underlying memory object of an iova range
to be mapped to a new virtual address in the host process:

  - VFIO_DMA_UNMAP_FLAG_VADDR for VFIO_IOMMU_UNMAP_DMA
  - VFIO_DMA_MAP_FLAG_VADDR flag for VFIO_IOMMU_MAP_DMA
  - VFIO_UPDATE_VADDR for VFIO_CHECK_EXTENSION
  - VFIO_DMA_UNMAP_FLAG_ALL for VFIO_IOMMU_UNMAP_DMA
  - VFIO_UNMAP_ALL for VFIO_CHECK_EXTENSION

Unmap-vaddr invalidates the host virtual address in an iova range and blocks
vfio translation of host virtual addresses, but DMA to already-mapped pages
continues.  Map-vaddr updates the base VA and resumes translation.  The
implementation supports iommu type1 and mediated devices.  Unmap-all allows
all ranges to be unmapped or invalidated in a single ioctl, which simplifies
userland code.

This functionality is necessary for live update, in which a host process
such as qemu exec's an updated version of itself, while preserving its
guest and vfio devices.  The process blocks vfio VA translation, exec's
its new self, mmap's the memory object(s) underlying the vfio mappings,
updates
the VA, and unblocks translation.  For a working example that uses these
new interfaces, see the QEMU patch series "[PATCH V2] Live Update" at
https://lore.kernel.org/qemu-devel/1609861330-129855-1-git-send-email-steven.sistare@oracle.com

Patches 1-3 define and implement the flag to unmap all ranges.
Patches 4-6 define and implement the flags to update vaddr.
Patches 7-9 add blocking to complete the implementation.

Changes in V2:
 - define a flag for unmap all instead of special range values
 - define the VFIO_UNMAP_ALL extension
 - forbid the combination of unmap-all and get-dirty-bitmap
 - unwind in unmap on vaddr error
 - add a new function to find first dma in a range instead of modifying
   an existing function
 - change names of update flags
 - fix concurrency bugs due to iommu lock being dropped
 - call down from vfio to a new backend interface instead of up from
   driver to detect container close
 - use wait/wake instead of sleep and polling
 - refine the uapi specification
 - split patches into vfio vs type1

Changes in V3:
 - add vaddr_invalid_count to fix pin_pages race with unmap
 - refactor the wait helper functions
 - traverse dma entries more efficiently in unmap
 - check unmap flag conflicts more explicitly
 - rename some local variables and functions

Steve Sistare (9):
  vfio: option to unmap all
  vfio/type1: unmap cleanup
  vfio/type1: implement unmap all
  vfio: interfaces to update vaddr
  vfio/type1: massage unmap iteration
  vfio/type1: implement interfaces to update vaddr
  vfio: iommu driver notify callback
  vfio/type1: implement notify callback
  vfio/type1: block on invalid vaddr

 drivers/vfio/vfio.c             |   5 +
 drivers/vfio/vfio_iommu_type1.c | 251 +++++++++++++++++++++++++++++++++++-----
 include/linux/vfio.h            |   5 +
 include/uapi/linux/vfio.h       |  27 +++++
 4 files changed, 256 insertions(+), 32 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH V3 1/9] vfio: option to unmap all
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-02-01 11:42   ` Cornelia Huck
  2021-01-29 16:54 ` [PATCH V3 2/9] vfio/type1: unmap cleanup Steve Sistare
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

For the UNMAP_DMA ioctl, delete all mappings if VFIO_DMA_UNMAP_FLAG_ALL
is set.  Define the corresponding VFIO_UNMAP_ALL extension.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/uapi/linux/vfio.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 9204705..55578b6 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -46,6 +46,9 @@
  */
 #define VFIO_NOIOMMU_IOMMU		8
 
+/* Supports VFIO_DMA_UNMAP_FLAG_ALL */
+#define VFIO_UNMAP_ALL			9
+
 /*
  * The IOCTL interface is designed for extensibility by embedding the
  * structure length (argsz) and flags into structures passed between
@@ -1074,6 +1077,7 @@ struct vfio_bitmap {
  * field.  No guarantee is made to the user that arbitrary unmaps of iova
  * or size different from those used in the original mapping call will
  * succeed.
+ *
  * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP should be set to get the dirty bitmap
  * before unmapping IO virtual addresses. When this flag is set, the user must
  * provide a struct vfio_bitmap in data[]. User must provide zero-allocated
@@ -1083,11 +1087,15 @@ struct vfio_bitmap {
  * indicates that the page at that offset from iova is dirty. A Bitmap of the
  * pages in the range of unmapped size is returned in the user-provided
  * vfio_bitmap.data.
+ *
+ * If flags & VFIO_DMA_UNMAP_FLAG_ALL, unmap all addresses.  iova and size
+ * must be 0.  This may not be combined with the get-dirty-bitmap flag.
  */
 struct vfio_iommu_type1_dma_unmap {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
+#define VFIO_DMA_UNMAP_FLAG_ALL		     (1 << 1)
 	__u64	iova;				/* IO virtual address */
 	__u64	size;				/* Size of mapping (bytes) */
 	__u8    data[];
-- 
1.8.3.1



* [PATCH V3 2/9] vfio/type1: unmap cleanup
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
  2021-01-29 16:54 ` [PATCH V3 1/9] vfio: option to unmap all Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-02-01 11:50   ` Cornelia Huck
  2021-01-29 16:54 ` [PATCH V3 3/9] vfio/type1: implement unmap all Steve Sistare
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Minor changes in vfio_dma_do_unmap to improve readability, which also
simplify the subsequent unmap-all patch.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/vfio/vfio_iommu_type1.c | 38 +++++++++++++++-----------------------
 1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5fbf0c1..6bf33c2 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1073,34 +1073,28 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 {
 	struct vfio_dma *dma, *dma_last = NULL;
 	size_t unmapped = 0, pgsize;
-	int ret = 0, retries = 0;
+	int ret = -EINVAL, retries = 0;
 	unsigned long pgshift;
+	dma_addr_t iova = unmap->iova;
+	unsigned long size = unmap->size;
 
 	mutex_lock(&iommu->lock);
 
 	pgshift = __ffs(iommu->pgsize_bitmap);
 	pgsize = (size_t)1 << pgshift;
 
-	if (unmap->iova & (pgsize - 1)) {
-		ret = -EINVAL;
+	if (iova & (pgsize - 1))
 		goto unlock;
-	}
 
-	if (!unmap->size || unmap->size & (pgsize - 1)) {
-		ret = -EINVAL;
+	if (!size || size & (pgsize - 1))
 		goto unlock;
-	}
 
-	if (unmap->iova + unmap->size - 1 < unmap->iova ||
-	    unmap->size > SIZE_MAX) {
-		ret = -EINVAL;
+	if (iova + size - 1 < iova || size > SIZE_MAX)
 		goto unlock;
-	}
 
 	/* When dirty tracking is enabled, allow only min supported pgsize */
 	if ((unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) &&
 	    (!iommu->dirty_page_tracking || (bitmap->pgsize != pgsize))) {
-		ret = -EINVAL;
 		goto unlock;
 	}
 
@@ -1138,20 +1132,18 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	 * mappings within the range.
 	 */
 	if (iommu->v2) {
-		dma = vfio_find_dma(iommu, unmap->iova, 1);
-		if (dma && dma->iova != unmap->iova) {
-			ret = -EINVAL;
+		dma = vfio_find_dma(iommu, iova, 1);
+		if (dma && dma->iova != iova)
 			goto unlock;
-		}
-		dma = vfio_find_dma(iommu, unmap->iova + unmap->size - 1, 0);
-		if (dma && dma->iova + dma->size != unmap->iova + unmap->size) {
-			ret = -EINVAL;
+
+		dma = vfio_find_dma(iommu, iova + size - 1, 0);
+		if (dma && dma->iova + dma->size != iova + size)
 			goto unlock;
-		}
 	}
 
-	while ((dma = vfio_find_dma(iommu, unmap->iova, unmap->size))) {
-		if (!iommu->v2 && unmap->iova > dma->iova)
+	ret = 0;
+	while ((dma = vfio_find_dma(iommu, iova, size))) {
+		if (!iommu->v2 && iova > dma->iova)
 			break;
 		/*
 		 * Task with same address space who mapped this iova range is
@@ -1189,7 +1181,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 
 		if (unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
 			ret = update_user_bitmap(bitmap->data, iommu, dma,
-						 unmap->iova, pgsize);
+						 iova, pgsize);
 			if (ret)
 				break;
 		}
-- 
1.8.3.1



* [PATCH V3 3/9] vfio/type1: implement unmap all
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
  2021-01-29 16:54 ` [PATCH V3 1/9] vfio: option to unmap all Steve Sistare
  2021-01-29 16:54 ` [PATCH V3 2/9] vfio/type1: unmap cleanup Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-02-01 11:59   ` Cornelia Huck
  2021-01-29 16:54 ` [PATCH V3 4/9] vfio: interfaces to update vaddr Steve Sistare
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Implement VFIO_DMA_UNMAP_FLAG_ALL.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/vfio/vfio_iommu_type1.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 6bf33c2..407f0f7 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1077,6 +1077,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	unsigned long pgshift;
 	dma_addr_t iova = unmap->iova;
 	unsigned long size = unmap->size;
+	bool unmap_all = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_ALL);
 
 	mutex_lock(&iommu->lock);
 
@@ -1086,8 +1087,13 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	if (iova & (pgsize - 1))
 		goto unlock;
 
-	if (!size || size & (pgsize - 1))
+	if (unmap_all) {
+		if (iova || size)
+			goto unlock;
+		size = SIZE_MAX;
+	} else if (!size || size & (pgsize - 1)) {
 		goto unlock;
+	}
 
 	if (iova + size - 1 < iova || size > SIZE_MAX)
 		goto unlock;
@@ -1131,7 +1137,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	 * will only return success and a size of zero if there were no
 	 * mappings within the range.
 	 */
-	if (iommu->v2) {
+	if (iommu->v2 && !unmap_all) {
 		dma = vfio_find_dma(iommu, iova, 1);
 		if (dma && dma->iova != iova)
 			goto unlock;
@@ -2515,6 +2521,7 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu,
 	case VFIO_TYPE1_IOMMU:
 	case VFIO_TYPE1v2_IOMMU:
 	case VFIO_TYPE1_NESTING_IOMMU:
+	case VFIO_UNMAP_ALL:
 		return 1;
 	case VFIO_DMA_CC_IOMMU:
 		if (!iommu)
@@ -2687,6 +2694,8 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
 {
 	struct vfio_iommu_type1_dma_unmap unmap;
 	struct vfio_bitmap bitmap = { 0 };
+	uint32_t mask = VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP |
+			VFIO_DMA_UNMAP_FLAG_ALL;
 	unsigned long minsz;
 	int ret;
 
@@ -2695,8 +2704,11 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
 	if (copy_from_user(&unmap, (void __user *)arg, minsz))
 		return -EFAULT;
 
-	if (unmap.argsz < minsz ||
-	    unmap.flags & ~VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP)
+	if (unmap.argsz < minsz || unmap.flags & ~mask)
+		return -EINVAL;
+
+	if ((unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) &&
+	    (unmap.flags & VFIO_DMA_UNMAP_FLAG_ALL))
 		return -EINVAL;
 
 	if (unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
-- 
1.8.3.1



* [PATCH V3 4/9] vfio: interfaces to update vaddr
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
                   ` (2 preceding siblings ...)
  2021-01-29 16:54 ` [PATCH V3 3/9] vfio/type1: implement unmap all Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-02-01 13:13   ` Cornelia Huck
  2021-01-29 16:54 ` [PATCH V3 5/9] vfio/type1: massage unmap iteration Steve Sistare
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Define interfaces that allow the underlying memory object of an iova
range to be mapped to a new host virtual address in the host process:

  - VFIO_DMA_UNMAP_FLAG_VADDR for VFIO_IOMMU_UNMAP_DMA
  - VFIO_DMA_MAP_FLAG_VADDR flag for VFIO_IOMMU_MAP_DMA
  - VFIO_UPDATE_VADDR extension for VFIO_CHECK_EXTENSION

Unmap vaddr invalidates the host virtual address in an iova range, and
blocks vfio translation of host virtual addresses.  DMA to already-mapped
pages continues.  Map vaddr updates the base VA and resumes translation.
See comments in uapi/linux/vfio.h for more details.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/uapi/linux/vfio.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 55578b6..85bb89a 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -49,6 +49,9 @@
 /* Supports VFIO_DMA_UNMAP_FLAG_ALL */
 #define VFIO_UNMAP_ALL			9
 
+/* Supports the vaddr flag for DMA map and unmap */
+#define VFIO_UPDATE_VADDR		10
+
 /*
  * The IOCTL interface is designed for extensibility by embedding the
  * structure length (argsz) and flags into structures passed between
@@ -1049,12 +1052,22 @@ struct vfio_iommu_type1_info_cap_migration {
  *
  * Map process virtual addresses to IO virtual addresses using the
  * provided struct vfio_dma_map. Caller sets argsz. READ &/ WRITE required.
+ *
+ * If flags & VFIO_DMA_MAP_FLAG_VADDR, update the base vaddr for iova, and
+ * unblock translation of host virtual addresses in the iova range.  The vaddr
+ * must have previously been invalidated with VFIO_DMA_UNMAP_FLAG_VADDR.  To
+ * maintain memory consistency within the user application, the updated vaddr
+ * must address the same memory object as originally mapped.  Failure to do so
+ * will result in user memory corruption and/or device misbehavior.  iova and
+ * size must match those in the original MAP_DMA call.  Protection is not
+ * changed, and the READ & WRITE flags must be 0.
  */
 struct vfio_iommu_type1_dma_map {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_DMA_MAP_FLAG_READ (1 << 0)		/* readable from device */
 #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)	/* writable from device */
+#define VFIO_DMA_MAP_FLAG_VADDR (1 << 2)
 	__u64	vaddr;				/* Process virtual address */
 	__u64	iova;				/* IO virtual address */
 	__u64	size;				/* Size of mapping (bytes) */
@@ -1090,12 +1103,18 @@ struct vfio_bitmap {
  *
  * If flags & VFIO_DMA_UNMAP_FLAG_ALL, unmap all addresses.  iova and size
  * must be 0.  This may not be combined with the get-dirty-bitmap flag.
+ *
+ * If flags & VFIO_DMA_UNMAP_FLAG_VADDR, do not unmap, but invalidate host
+ * virtual addresses in the iova range.  Tasks that attempt to translate an
+ * iova's vaddr will block.  DMA to already-mapped pages continues.  This may
+ * not be combined with the get-dirty-bitmap flag.
  */
 struct vfio_iommu_type1_dma_unmap {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
 #define VFIO_DMA_UNMAP_FLAG_ALL		     (1 << 1)
+#define VFIO_DMA_UNMAP_FLAG_VADDR	     (1 << 2)
 	__u64	iova;				/* IO virtual address */
 	__u64	size;				/* Size of mapping (bytes) */
 	__u8    data[];
-- 
1.8.3.1



* [PATCH V3 5/9] vfio/type1: massage unmap iteration
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
                   ` (3 preceding siblings ...)
  2021-01-29 16:54 ` [PATCH V3 4/9] vfio: interfaces to update vaddr Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-01-29 21:56   ` Alex Williamson
  2021-02-01 12:11   ` Cornelia Huck
  2021-01-29 16:54 ` [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr Steve Sistare
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Modify the iteration in vfio_dma_do_unmap so it does not depend on deletion
of each dma entry.  Add a variant of vfio_find_dma that returns the entry
with the lowest iova in the search range to initialize the iteration.  No
externally visible change, but this behavior is needed in the subsequent
update-vaddr patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/vfio/vfio_iommu_type1.c | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 407f0f7..5823607 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -173,6 +173,31 @@ static struct vfio_dma *vfio_find_dma(struct vfio_iommu *iommu,
 	return NULL;
 }
 
+static struct rb_node *vfio_find_dma_first(struct vfio_iommu *iommu,
+					    dma_addr_t start, size_t size)
+{
+	struct rb_node *res = NULL;
+	struct rb_node *node = iommu->dma_list.rb_node;
+	struct vfio_dma *dma_res = NULL;
+
+	while (node) {
+		struct vfio_dma *dma = rb_entry(node, struct vfio_dma, node);
+
+		if (start < dma->iova + dma->size) {
+			res = node;
+			dma_res = dma;
+			if (start >= dma->iova)
+				break;
+			node = node->rb_left;
+		} else {
+			node = node->rb_right;
+		}
+	}
+	if (res && size && dma_res->iova >= start + size)
+		res = NULL;
+	return res;
+}
+
 static void vfio_link_dma(struct vfio_iommu *iommu, struct vfio_dma *new)
 {
 	struct rb_node **link = &iommu->dma_list.rb_node, *parent = NULL;
@@ -1078,6 +1103,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	dma_addr_t iova = unmap->iova;
 	unsigned long size = unmap->size;
 	bool unmap_all = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_ALL);
+	struct rb_node *n;
 
 	mutex_lock(&iommu->lock);
 
@@ -1148,7 +1174,13 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	}
 
 	ret = 0;
-	while ((dma = vfio_find_dma(iommu, iova, size))) {
+	n = vfio_find_dma_first(iommu, iova, size);
+
+	while (n) {
+		dma = rb_entry(n, struct vfio_dma, node);
+		if (dma->iova >= iova + size)
+			break;
+
 		if (!iommu->v2 && iova > dma->iova)
 			break;
 		/*
@@ -1193,6 +1225,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 		}
 
 		unmapped += dma->size;
+		n = rb_next(n);
 		vfio_remove_dma(iommu, dma);
 	}
 
-- 
1.8.3.1



* [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
                   ` (4 preceding siblings ...)
  2021-01-29 16:54 ` [PATCH V3 5/9] vfio/type1: massage unmap iteration Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-01-29 21:56   ` Alex Williamson
  2021-02-01 12:27   ` Cornelia Huck
  2021-01-29 16:54 ` [PATCH V3 7/9] vfio: iommu driver notify callback Steve Sistare
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Implement VFIO_DMA_UNMAP_FLAG_VADDR, VFIO_DMA_MAP_FLAG_VADDR, and
VFIO_UPDATE_VADDR.  This is a partial implementation.  Blocking is
added in a subsequent patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/vfio/vfio_iommu_type1.c | 62 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5823607..cbe3c65 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -74,6 +74,7 @@ struct vfio_iommu {
 	bool			nesting;
 	bool			dirty_page_tracking;
 	bool			pinned_page_dirty_scope;
+	int			vaddr_invalid_count;
 };
 
 struct vfio_domain {
@@ -92,6 +93,7 @@ struct vfio_dma {
 	int			prot;		/* IOMMU_READ/WRITE */
 	bool			iommu_mapped;
 	bool			lock_cap;	/* capable(CAP_IPC_LOCK) */
+	bool			vaddr_invalid;
 	struct task_struct	*task;
 	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
 	unsigned long		*bitmap;
@@ -973,6 +975,8 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
 	vfio_unlink_dma(iommu, dma);
 	put_task_struct(dma->task);
 	vfio_dma_bitmap_free(dma);
+	if (dma->vaddr_invalid)
+		--iommu->vaddr_invalid_count;
 	kfree(dma);
 	iommu->dma_avail++;
 }
@@ -1103,7 +1107,8 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	dma_addr_t iova = unmap->iova;
 	unsigned long size = unmap->size;
 	bool unmap_all = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_ALL);
-	struct rb_node *n;
+	bool invalidate_vaddr = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_VADDR);
+	struct rb_node *n, *first_n, *last_n;
 
 	mutex_lock(&iommu->lock);
 
@@ -1174,7 +1179,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	}
 
 	ret = 0;
-	n = vfio_find_dma_first(iommu, iova, size);
+	n = first_n = vfio_find_dma_first(iommu, iova, size);
 
 	while (n) {
 		dma = rb_entry(n, struct vfio_dma, node);
@@ -1190,6 +1195,16 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 		if (dma->task->mm != current->mm)
 			break;
 
+		if (invalidate_vaddr) {
+			if (dma->vaddr_invalid)
+				goto unwind;
+			dma->vaddr_invalid = true;
+			iommu->vaddr_invalid_count++;
+			unmapped += dma->size;
+			n = rb_next(n);
+			continue;
+		}
+
 		if (!RB_EMPTY_ROOT(&dma->pfn_list)) {
 			struct vfio_iommu_type1_dma_unmap nb_unmap;
 
@@ -1228,6 +1243,18 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 		n = rb_next(n);
 		vfio_remove_dma(iommu, dma);
 	}
+	goto unlock;
+
+unwind:
+	last_n = n;
+
+	for (n = first_n; n != last_n; n = rb_next(n)) {
+		dma = rb_entry(n, struct vfio_dma, node);
+		dma->vaddr_invalid = false;
+		--iommu->vaddr_invalid_count;
+	}
+	ret = -EINVAL;
+	unmapped = 0;
 
 unlock:
 	mutex_unlock(&iommu->lock);
@@ -1329,6 +1356,7 @@ static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
 static int vfio_dma_do_map(struct vfio_iommu *iommu,
 			   struct vfio_iommu_type1_dma_map *map)
 {
+	bool set_vaddr = map->flags & VFIO_DMA_MAP_FLAG_VADDR;
 	dma_addr_t iova = map->iova;
 	unsigned long vaddr = map->vaddr;
 	size_t size = map->size;
@@ -1346,13 +1374,16 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 	if (map->flags & VFIO_DMA_MAP_FLAG_READ)
 		prot |= IOMMU_READ;
 
+	if ((prot && set_vaddr) || (!prot && !set_vaddr))
+		return -EINVAL;
+
 	mutex_lock(&iommu->lock);
 
 	pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
 
 	WARN_ON((pgsize - 1) & PAGE_MASK);
 
-	if (!prot || !size || (size | iova | vaddr) & (pgsize - 1)) {
+	if (!size || (size | iova | vaddr) & (pgsize - 1)) {
 		ret = -EINVAL;
 		goto out_unlock;
 	}
@@ -1363,7 +1394,20 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 		goto out_unlock;
 	}
 
-	if (vfio_find_dma(iommu, iova, size)) {
+	dma = vfio_find_dma(iommu, iova, size);
+	if (set_vaddr) {
+		if (!dma) {
+			ret = -ENOENT;
+		} else if (!dma->vaddr_invalid || dma->iova != iova ||
+			   dma->size != size) {
+			ret = -EINVAL;
+		} else {
+			dma->vaddr = vaddr;
+			dma->vaddr_invalid = false;
+			--iommu->vaddr_invalid_count;
+		}
+		goto out_unlock;
+	} else if (dma) {
 		ret = -EEXIST;
 		goto out_unlock;
 	}
@@ -1387,6 +1431,7 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 	iommu->dma_avail--;
 	dma->iova = iova;
 	dma->vaddr = vaddr;
+	dma->vaddr_invalid = false;
 	dma->prot = prot;
 
 	/*
@@ -2483,6 +2528,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	INIT_LIST_HEAD(&iommu->iova_list);
 	iommu->dma_list = RB_ROOT;
 	iommu->dma_avail = dma_entry_limit;
+	iommu->vaddr_invalid_count = 0;
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 
@@ -2555,6 +2601,7 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu,
 	case VFIO_TYPE1v2_IOMMU:
 	case VFIO_TYPE1_NESTING_IOMMU:
 	case VFIO_UNMAP_ALL:
+	case VFIO_UPDATE_VADDR:
 		return 1;
 	case VFIO_DMA_CC_IOMMU:
 		if (!iommu)
@@ -2709,7 +2756,8 @@ static int vfio_iommu_type1_map_dma(struct vfio_iommu *iommu,
 {
 	struct vfio_iommu_type1_dma_map map;
 	unsigned long minsz;
-	uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+	uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE |
+			VFIO_DMA_MAP_FLAG_VADDR;
 
 	minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
 
@@ -2728,6 +2776,7 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
 	struct vfio_iommu_type1_dma_unmap unmap;
 	struct vfio_bitmap bitmap = { 0 };
 	uint32_t mask = VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP |
+			VFIO_DMA_UNMAP_FLAG_VADDR |
 			VFIO_DMA_UNMAP_FLAG_ALL;
 	unsigned long minsz;
 	int ret;
@@ -2741,7 +2790,8 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
 		return -EINVAL;
 
 	if ((unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) &&
-	    (unmap.flags & VFIO_DMA_UNMAP_FLAG_ALL))
+	    (unmap.flags & (VFIO_DMA_UNMAP_FLAG_ALL |
+			    VFIO_DMA_UNMAP_FLAG_VADDR)))
 		return -EINVAL;
 
 	if (unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
-- 
1.8.3.1



* [PATCH V3 7/9] vfio: iommu driver notify callback
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
                   ` (5 preceding siblings ...)
  2021-01-29 16:54 ` [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-01-29 21:57   ` Alex Williamson
  2021-02-01 13:17   ` Cornelia Huck
  2021-01-29 16:54 ` [PATCH V3 8/9] vfio/type1: implement " Steve Sistare
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Define a vfio_iommu_driver_ops notify callback, for sending events to
the driver.  Drivers are not required to provide the callback, and
may ignore any events.  The handling of events is driver specific.

Define the CONTAINER_CLOSE event, called when the container's file
descriptor is closed.  This event signifies that no further state changes
will occur via container ioctls.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/vfio/vfio.c  | 5 +++++
 include/linux/vfio.h | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 262ab0e..99458fc 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1220,6 +1220,11 @@ static int vfio_fops_open(struct inode *inode, struct file *filep)
 static int vfio_fops_release(struct inode *inode, struct file *filep)
 {
 	struct vfio_container *container = filep->private_data;
+	struct vfio_iommu_driver *driver = container->iommu_driver;
+
+	if (driver && driver->ops->notify)
+		driver->ops->notify(container->iommu_data,
+				    VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE, NULL);
 
 	filep->private_data = NULL;
 
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 38d3c6a..9642579 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -57,6 +57,9 @@ extern int vfio_add_group_dev(struct device *dev,
 extern void vfio_device_put(struct vfio_device *device);
 extern void *vfio_device_data(struct vfio_device *device);
 
+/* events for the backend driver notify callback */
+#define VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE	1
+
 /**
  * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
  */
@@ -90,6 +93,8 @@ struct vfio_iommu_driver_ops {
 					       struct notifier_block *nb);
 	int		(*dma_rw)(void *iommu_data, dma_addr_t user_iova,
 				  void *data, size_t count, bool write);
+	void		(*notify)(void *iommu_data, unsigned int event,
+				  void *data);
 };
 
 extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
-- 
1.8.3.1



* [PATCH V3 8/9] vfio/type1: implement notify callback
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
                   ` (6 preceding siblings ...)
  2021-01-29 16:54 ` [PATCH V3 7/9] vfio: iommu driver notify callback Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-02-01 12:40   ` Cornelia Huck
  2021-01-29 16:54 ` [PATCH V3 9/9] vfio/type1: block on invalid vaddr Steve Sistare
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Implement a notify callback that remembers if the container's file
descriptor has been closed.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/vfio/vfio_iommu_type1.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index cbe3c65..b74a5f3 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -74,6 +74,7 @@ struct vfio_iommu {
 	bool			nesting;
 	bool			dirty_page_tracking;
 	bool			pinned_page_dirty_scope;
+	bool			container_open;
 	int			vaddr_invalid_count;
 };
 
@@ -2529,6 +2530,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	iommu->dma_list = RB_ROOT;
 	iommu->dma_avail = dma_entry_limit;
 	iommu->vaddr_invalid_count = 0;
+	iommu->container_open = true;
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 
@@ -3055,6 +3057,18 @@ static int vfio_iommu_type1_dma_rw(void *iommu_data, dma_addr_t user_iova,
 	return ret;
 }
 
+static void vfio_iommu_type1_notify(void *iommu_data, unsigned int event,
+				    void *data)
+{
+	struct vfio_iommu *iommu = iommu_data;
+
+	if (event != VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE)
+		return;
+	mutex_lock(&iommu->lock);
+	iommu->container_open = false;
+	mutex_unlock(&iommu->lock);
+}
+
 static const struct vfio_iommu_driver_ops vfio_iommu_driver_ops_type1 = {
 	.name			= "vfio-iommu-type1",
 	.owner			= THIS_MODULE,
@@ -3068,6 +3082,7 @@ static int vfio_iommu_type1_dma_rw(void *iommu_data, dma_addr_t user_iova,
 	.register_notifier	= vfio_iommu_type1_register_notifier,
 	.unregister_notifier	= vfio_iommu_type1_unregister_notifier,
 	.dma_rw			= vfio_iommu_type1_dma_rw,
+	.notify			= vfio_iommu_type1_notify,
 };
 
 static int __init vfio_iommu_type1_init(void)
-- 
1.8.3.1



* [PATCH V3 9/9] vfio/type1: block on invalid vaddr
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
                   ` (7 preceding siblings ...)
  2021-01-29 16:54 ` [PATCH V3 8/9] vfio/type1: implement " Steve Sistare
@ 2021-01-29 16:54 ` Steve Sistare
  2021-02-01 12:47   ` Cornelia Huck
  2021-01-29 21:55 ` [PATCH V3 0/9] vfio virtual address update Alex Williamson
  2021-02-02 17:21 ` Alex Williamson
  10 siblings, 1 reply; 37+ messages in thread
From: Steve Sistare @ 2021-01-29 16:54 UTC (permalink / raw)
  To: kvm; +Cc: Alex Williamson, Cornelia Huck, Kirti Wankhede, Steve Sistare

Block translation of host virtual addresses while an iova range has an
invalid vaddr.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/vfio/vfio_iommu_type1.c | 95 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 90 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index b74a5f3..e3fc450 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -31,6 +31,7 @@
 #include <linux/rbtree.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/mm.h>
+#include <linux/kthread.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/vfio.h>
@@ -76,6 +77,7 @@ struct vfio_iommu {
 	bool			pinned_page_dirty_scope;
 	bool			container_open;
 	int			vaddr_invalid_count;
+	wait_queue_head_t	vaddr_wait;
 };
 
 struct vfio_domain {
@@ -146,6 +148,8 @@ struct vfio_regions {
 #define DIRTY_BITMAP_PAGES_MAX	 ((u64)INT_MAX)
 #define DIRTY_BITMAP_SIZE_MAX	 DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX)
 
+#define WAITED 1
+
 static int put_pfn(unsigned long pfn, int prot);
 
 static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
@@ -507,6 +511,61 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 	return ret;
 }
 
+static int vfio_wait(struct vfio_iommu *iommu)
+{
+	DEFINE_WAIT(wait);
+
+	prepare_to_wait(&iommu->vaddr_wait, &wait, TASK_KILLABLE);
+	mutex_unlock(&iommu->lock);
+	schedule();
+	mutex_lock(&iommu->lock);
+	finish_wait(&iommu->vaddr_wait, &wait);
+	if (kthread_should_stop() || !iommu->container_open ||
+	    fatal_signal_pending(current)) {
+		return -EFAULT;
+	}
+	return WAITED;
+}
+
+/*
+ * Find dma struct and wait for its vaddr to be valid.  iommu lock is dropped
+ * if the task waits, but is re-locked on return.  Return result in *dma_p.
+ * Return 0 on success with no waiting, WAITED on success if waited, and -errno
+ * on error.
+ */
+static int vfio_find_dma_valid(struct vfio_iommu *iommu, dma_addr_t start,
+			       size_t size, struct vfio_dma **dma_p)
+{
+	int ret;
+
+	do {
+		*dma_p = vfio_find_dma(iommu, start, size);
+		if (!*dma_p)
+			ret = -EINVAL;
+		else if (!(*dma_p)->vaddr_invalid)
+			ret = 0;
+		else
+			ret = vfio_wait(iommu);
+	} while (ret > 0);
+
+	return ret;
+}
+
+/*
+ * Wait for all vaddr in the dma_list to become valid.  iommu lock is dropped
+ * if the task waits, but is re-locked on return.  Return 0 on success with no
+ * waiting, WAITED on success if waited, and -errno on error.
+ */
+static int vfio_wait_all_valid(struct vfio_iommu *iommu)
+{
+	int ret = 0;
+
+	while (iommu->vaddr_invalid_count && ret >= 0)
+		ret = vfio_wait(iommu);
+
+	return ret;
+}
+
 /*
  * Attempt to pin pages.  We really don't want to track all the pfns and
  * the iommu can only map chunks of consecutive pfns anyway, so get the
@@ -668,6 +727,7 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	unsigned long remote_vaddr;
 	struct vfio_dma *dma;
 	bool do_accounting;
+	dma_addr_t iova;
 
 	if (!iommu || !user_pfn || !phys_pfn)
 		return -EINVAL;
@@ -678,6 +738,22 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 
 	mutex_lock(&iommu->lock);
 
+	/*
+	 * Wait for all necessary vaddr's to be valid so they can be used in
+	 * the main loop without dropping the lock, to avoid racing vs unmap.
+	 */
+again:
+	if (iommu->vaddr_invalid_count) {
+		for (i = 0; i < npage; i++) {
+			iova = user_pfn[i] << PAGE_SHIFT;
+			ret = vfio_find_dma_valid(iommu, iova, PAGE_SIZE, &dma);
+			if (ret < 0)
+				goto pin_done;
+			if (ret == WAITED)
+				goto again;
+		}
+	}
+
 	/* Fail if notifier list is empty */
 	if (!iommu->notifier.head) {
 		ret = -EINVAL;
@@ -692,7 +768,6 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
 
 	for (i = 0; i < npage; i++) {
-		dma_addr_t iova;
 		struct vfio_pfn *vpfn;
 
 		iova = user_pfn[i] << PAGE_SHIFT;
@@ -976,8 +1051,10 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
 	vfio_unlink_dma(iommu, dma);
 	put_task_struct(dma->task);
 	vfio_dma_bitmap_free(dma);
-	if (dma->vaddr_invalid)
+	if (dma->vaddr_invalid) {
 		--iommu->vaddr_invalid_count;
+		wake_up_all(&iommu->vaddr_wait);
+	}
 	kfree(dma);
 	iommu->dma_avail++;
 }
@@ -1406,6 +1483,7 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 			dma->vaddr = vaddr;
 			dma->vaddr_invalid = false;
 			--iommu->vaddr_invalid_count;
+			wake_up_all(&iommu->vaddr_wait);
 		}
 		goto out_unlock;
 	} else if (dma) {
@@ -1506,6 +1584,10 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 	unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 	int ret;
 
+	ret = vfio_wait_all_valid(iommu);
+	if (ret < 0)
+		return ret;
+
 	/* Arbitrarily pick the first domain in the list for lookups */
 	if (!list_empty(&iommu->domain_list))
 		d = list_first_entry(&iommu->domain_list,
@@ -2533,6 +2615,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	iommu->container_open = true;
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
+	init_waitqueue_head(&iommu->vaddr_wait);
 
 	return iommu;
 }
@@ -2984,12 +3067,13 @@ static int vfio_iommu_type1_dma_rw_chunk(struct vfio_iommu *iommu,
 	struct vfio_dma *dma;
 	bool kthread = current->mm == NULL;
 	size_t offset;
+	int ret;
 
 	*copied = 0;
 
-	dma = vfio_find_dma(iommu, user_iova, 1);
-	if (!dma)
-		return -EINVAL;
+	ret = vfio_find_dma_valid(iommu, user_iova, 1, &dma);
+	if (ret < 0)
+		return ret;
 
 	if ((write && !(dma->prot & IOMMU_WRITE)) ||
 			!(dma->prot & IOMMU_READ))
@@ -3067,6 +3151,7 @@ static void vfio_iommu_type1_notify(void *iommu_data, unsigned int event,
 	mutex_lock(&iommu->lock);
 	iommu->container_open = false;
 	mutex_unlock(&iommu->lock);
+	wake_up_all(&iommu->vaddr_wait);
 }
 
 static const struct vfio_iommu_driver_ops vfio_iommu_driver_ops_type1 = {
-- 
1.8.3.1


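The wait/retry contract of vfio_wait() and vfio_find_dma_valid() above (drop the lock while sleeping, re-take it on wakeup, re-check both the validity and the shutdown conditions before proceeding) can be sketched with a userspace pthread analogue. All names here are illustrative stand-ins, not kernel or VFIO API:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the iommu lock, waitqueue and state. */
struct range_state {
	pthread_mutex_t	lock;
	pthread_cond_t	vaddr_wait;	/* analogue of iommu->vaddr_wait */
	bool		vaddr_invalid;
	bool		container_open;
};

#define WAITED 1

/*
 * Analogue of vfio_find_dma_valid(): called with `lock` held, sleeps
 * until the vaddr is valid.  pthread_cond_wait() atomically drops and
 * re-takes the lock, mirroring the mutex_unlock/schedule/mutex_lock
 * sequence in vfio_wait(), and shutdown is re-checked after every
 * wakeup.  Returns 0 if no wait was needed, WAITED after a successful
 * wait, or -1 if the container closed while we slept.
 */
static int wait_vaddr_valid(struct range_state *s)
{
	int ret = 0;

	while (s->vaddr_invalid) {
		pthread_cond_wait(&s->vaddr_wait, &s->lock);
		if (!s->container_open)
			return -1;
		ret = WAITED;
	}
	return ret;
}
```

The while loop matches the kernel's do/while in vfio_find_dma_valid(): state may have changed arbitrarily while the lock was dropped, so every condition must be re-evaluated after each wakeup rather than assumed from before the sleep.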

* Re: [PATCH V3 0/9] vfio virtual address update
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
                   ` (8 preceding siblings ...)
  2021-01-29 16:54 ` [PATCH V3 9/9] vfio/type1: block on invalid vaddr Steve Sistare
@ 2021-01-29 21:55 ` Alex Williamson
  2021-01-29 22:14   ` Steven Sistare
  2021-01-30 16:54   ` Steven Sistare
  2021-02-02 17:21 ` Alex Williamson
  10 siblings, 2 replies; 37+ messages in thread
From: Alex Williamson @ 2021-01-29 21:55 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:03 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Add interfaces that allow the underlying memory object of an iova range
> to be mapped to a new virtual address in the host process:
> 
>   - VFIO_DMA_UNMAP_FLAG_VADDR for VFIO_IOMMU_UNMAP_DMA
>   - VFIO_DMA_MAP_FLAG_VADDR flag for VFIO_IOMMU_MAP_DMA
>   - VFIO_UPDATE_VADDR for VFIO_CHECK_EXTENSION
>   - VFIO_DMA_UNMAP_FLAG_ALL for VFIO_IOMMU_UNMAP_DMA
>   - VFIO_UNMAP_ALL for VFIO_CHECK_EXTENSION
> 
> Unmap-vaddr invalidates the host virtual address in an iova range and blocks
> vfio translation of host virtual addresses, but DMA to already-mapped pages
> continues.  Map-vaddr updates the base VA and resumes translation.  The
> implementation supports iommu type1 and mediated devices.  Unmap-all allows
> all ranges to be unmapped or invalidated in a single ioctl, which simplifies
> userland code.
> 
> This functionality is necessary for live update, in which a host process
> such as qemu exec's an updated version of itself, while preserving its
> guest and vfio devices.  The process blocks vfio VA translation, exec's
> its new self, mmap's the memory object(s) underlying vfio object, updates
> the VA, and unblocks translation.  For a working example that uses these
> new interfaces, see the QEMU patch series "[PATCH V2] Live Update" at
> https://lore.kernel.org/qemu-devel/1609861330-129855-1-git-send-email-steven.sistare@oracle.com
> 
> Patches 1-3 define and implement the flag to unmap all ranges.
> Patches 4-6 define and implement the flags to update vaddr.
> Patches 7-9 add blocking to complete the implementation.

Hi Steve,

It looks pretty good to me, but I have some nit-picky comments that
I'll follow-up with on the individual patches.  However, I've made the
changes I suggest in a branch that you can find here:

git://github.com/awilliam/linux-vfio.git vaddr-v3

If the changes look ok, just send me an ack, I don't want to attribute
something to you that you don't approve of.  Thanks,

Alex
 
> Changes in V2:
>  - define a flag for unmap all instead of special range values
>  - define the VFIO_UNMAP_ALL extension
>  - forbid the combination of unmap-all and get-dirty-bitmap
>  - unwind in unmap on vaddr error
>  - add a new function to find first dma in a range instead of modifying
>    an existing function
>  - change names of update flags
>  - fix concurrency bugs due to iommu lock being dropped
>  - call down from vfio to a new backend interface instead of up from
>    driver to detect container close
>  - use wait/wake instead of sleep and polling
>  - refine the uapi specification
>  - split patches into vfio vs type1
> 
> Changes in V3:
>  - add vaddr_invalid_count to fix pin_pages race with unmap
>  - refactor the wait helper functions
>  - traverse dma entries more efficiently in unmap
>  - check unmap flag conflicts more explicitly
>  - rename some local variables and functions
> 
> Steve Sistare (9):
>   vfio: option to unmap all
>   vfio/type1: unmap cleanup
>   vfio/type1: implement unmap all
>   vfio: interfaces to update vaddr
>   vfio/type1: massage unmap iteration
>   vfio/type1: implement interfaces to update vaddr
>   vfio: iommu driver notify callback
>   vfio/type1: implement notify callback
>   vfio/type1: block on invalid vaddr
> 
>  drivers/vfio/vfio.c             |   5 +
>  drivers/vfio/vfio_iommu_type1.c | 251 +++++++++++++++++++++++++++++++++++-----
>  include/linux/vfio.h            |   5 +
>  include/uapi/linux/vfio.h       |  27 +++++
>  4 files changed, 256 insertions(+), 32 deletions(-)
> 



* Re: [PATCH V3 5/9] vfio/type1: massage unmap iteration
  2021-01-29 16:54 ` [PATCH V3 5/9] vfio/type1: massage unmap iteration Steve Sistare
@ 2021-01-29 21:56   ` Alex Williamson
  2021-01-30 16:51     ` Steven Sistare
  2021-02-01 12:11   ` Cornelia Huck
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2021-01-29 21:56 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:08 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Modify the iteration in vfio_dma_do_unmap so it does not depend on deletion
> of each dma entry.  Add a variant of vfio_find_dma that returns the entry
> with the lowest iova in the search range to initialize the iteration.  No
> externally visible change, but this behavior is needed in the subsequent
> update-vaddr patch.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 35 ++++++++++++++++++++++++++++++++++-
>  1 file changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 407f0f7..5823607 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -173,6 +173,31 @@ static struct vfio_dma *vfio_find_dma(struct vfio_iommu *iommu,
>  	return NULL;
>  }
>  
> +static struct rb_node *vfio_find_dma_first(struct vfio_iommu *iommu,
> +					    dma_addr_t start, size_t size)

Nit, we return an rb_node rather than a vfio_dma now, but the naming is
still pretty similar to vfio_find_dma().  Can I change it to
vfio_find_dma_first_node() (yes, getting wordy)?  Thanks,

Alex

> +{
> +	struct rb_node *res = NULL;
> +	struct rb_node *node = iommu->dma_list.rb_node;
> +	struct vfio_dma *dma_res = NULL;
> +
> +	while (node) {
> +		struct vfio_dma *dma = rb_entry(node, struct vfio_dma, node);
> +
> +		if (start < dma->iova + dma->size) {
> +			res = node;
> +			dma_res = dma;
> +			if (start >= dma->iova)
> +				break;
> +			node = node->rb_left;
> +		} else {
> +			node = node->rb_right;
> +		}
> +	}
> +	if (res && size && dma_res->iova >= start + size)
> +		res = NULL;
> +	return res;
> +}
> +
>  static void vfio_link_dma(struct vfio_iommu *iommu, struct vfio_dma *new)
>  {
>  	struct rb_node **link = &iommu->dma_list.rb_node, *parent = NULL;
> @@ -1078,6 +1103,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  	dma_addr_t iova = unmap->iova;
>  	unsigned long size = unmap->size;
>  	bool unmap_all = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_ALL);
> +	struct rb_node *n;
>  
>  	mutex_lock(&iommu->lock);
>  
> @@ -1148,7 +1174,13 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  	}
>  
>  	ret = 0;
> -	while ((dma = vfio_find_dma(iommu, iova, size))) {
> +	n = vfio_find_dma_first(iommu, iova, size);
> +
> +	while (n) {
> +		dma = rb_entry(n, struct vfio_dma, node);
> +		if (dma->iova >= iova + size)
> +			break;
> +
>  		if (!iommu->v2 && iova > dma->iova)
>  			break;
>  		/*
> @@ -1193,6 +1225,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  		}
>  
>  		unmapped += dma->size;
> +		n = rb_next(n);
>  		vfio_remove_dma(iommu, dma);
>  	}
>  


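The contract of vfio_find_dma_first() discussed above is: return the entry with the lowest iova that overlaps [start, start + size), or nothing if no entry does. A minimal sketch of that overlap search, using a sorted array in place of the kernel's rb-tree (the tree descent gives O(log n), but the overlap test is the same; all names are illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat stand-in for the rb-tree of vfio_dma entries,
 * kept sorted by iova and non-overlapping, as type1 guarantees. */
struct dma_range {
	uint64_t iova;
	uint64_t size;
};

/*
 * Return the index of the entry with the lowest iova that overlaps
 * [start, start + size), or -1 if none - the contract of
 * vfio_find_dma_first().
 */
static long find_dma_first(const struct dma_range *r, size_t n,
			   uint64_t start, uint64_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (start < r[i].iova + r[i].size &&	/* entry ends after start */
		    r[i].iova < start + size)		/* entry begins before end */
			return (long)i;
	}
	return -1;
}
```

Because the entries are sorted and non-overlapping, the first match in ascending iova order is necessarily the lowest-iova match, which is what lets the unmap loop resume iteration with rb_next() instead of re-searching after each removal.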

* Re: [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr
  2021-01-29 16:54 ` [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr Steve Sistare
@ 2021-01-29 21:56   ` Alex Williamson
  2021-01-30 16:51     ` Steven Sistare
  2021-02-01 12:27   ` Cornelia Huck
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2021-01-29 21:56 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:09 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Implement VFIO_DMA_UNMAP_FLAG_VADDR, VFIO_DMA_MAP_FLAG_VADDR, and
> VFIO_UPDATE_VADDR.  This is a partial implementation.  Blocking is
> added in a subsequent patch.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 62 +++++++++++++++++++++++++++++++++++++----
>  1 file changed, 56 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 5823607..cbe3c65 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -74,6 +74,7 @@ struct vfio_iommu {
>  	bool			nesting;
>  	bool			dirty_page_tracking;
>  	bool			pinned_page_dirty_scope;
> +	int			vaddr_invalid_count;

Nit, I'll move this up in the struct for alignment/hole purposes.  Also
I think this should be an unsigned, we never use the negative value for
sanity checking and our number of vfio_dma entries is governed by an
unsigned int.

>  };
>  
>  struct vfio_domain {
> @@ -92,6 +93,7 @@ struct vfio_dma {
>  	int			prot;		/* IOMMU_READ/WRITE */
>  	bool			iommu_mapped;
>  	bool			lock_cap;	/* capable(CAP_IPC_LOCK) */
> +	bool			vaddr_invalid;
>  	struct task_struct	*task;
>  	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
>  	unsigned long		*bitmap;
> @@ -973,6 +975,8 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>  	vfio_unlink_dma(iommu, dma);
>  	put_task_struct(dma->task);
>  	vfio_dma_bitmap_free(dma);
> +	if (dma->vaddr_invalid)
> +		--iommu->vaddr_invalid_count;

I'm not a fan of the -- on the left, nor the mixing with ++ on the
right.  Can I move the -- to the right?


>  	kfree(dma);
>  	iommu->dma_avail++;
>  }
> @@ -1103,7 +1107,8 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  	dma_addr_t iova = unmap->iova;
>  	unsigned long size = unmap->size;
>  	bool unmap_all = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_ALL);
> -	struct rb_node *n;
> +	bool invalidate_vaddr = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_VADDR);

Nit, !! is not necessary and inconsistent with the MAP side below, will
remove here and for previous unmap_all use.

> +	struct rb_node *n, *first_n, *last_n;
>  
>  	mutex_lock(&iommu->lock);
>  
> @@ -1174,7 +1179,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  	}
>  
>  	ret = 0;
> -	n = vfio_find_dma_first(iommu, iova, size);
> +	n = first_n = vfio_find_dma_first(iommu, iova, size);
>  
>  	while (n) {
>  		dma = rb_entry(n, struct vfio_dma, node);
> @@ -1190,6 +1195,16 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  		if (dma->task->mm != current->mm)
>  			break;
>  
> +		if (invalidate_vaddr) {
> +			if (dma->vaddr_invalid)
> +				goto unwind;

> The unwind actually fits pretty well inline here and avoids the
> leapfrogging gotos below.  last_n can also be scoped within the unwind more
easily here.

> +			dma->vaddr_invalid = true;
> +			iommu->vaddr_invalid_count++;
> +			unmapped += dma->size;
> +			n = rb_next(n);
> +			continue;
> +		}
> +
>  		if (!RB_EMPTY_ROOT(&dma->pfn_list)) {
>  			struct vfio_iommu_type1_dma_unmap nb_unmap;
>  
> @@ -1228,6 +1243,18 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  		n = rb_next(n);
>  		vfio_remove_dma(iommu, dma);
>  	}
> +	goto unlock;
> +
> +unwind:
> +	last_n = n;
> +
> +	for (n = first_n; n != last_n; n = rb_next(n)) {
> +		dma = rb_entry(n, struct vfio_dma, node);
> +		dma->vaddr_invalid = false;
> +		--iommu->vaddr_invalid_count;
> +	}
> +	ret = -EINVAL;
> +	unmapped = 0;
>  
>  unlock:
>  	mutex_unlock(&iommu->lock);
> @@ -1329,6 +1356,7 @@ static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
>  static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  			   struct vfio_iommu_type1_dma_map *map)
>  {
> +	bool set_vaddr = map->flags & VFIO_DMA_MAP_FLAG_VADDR;
>  	dma_addr_t iova = map->iova;
>  	unsigned long vaddr = map->vaddr;
>  	size_t size = map->size;
> @@ -1346,13 +1374,16 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  	if (map->flags & VFIO_DMA_MAP_FLAG_READ)
>  		prot |= IOMMU_READ;
>  
> +	if ((prot && set_vaddr) || (!prot && !set_vaddr))
> +		return -EINVAL;
> +
>  	mutex_lock(&iommu->lock);
>  
>  	pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
>  
>  	WARN_ON((pgsize - 1) & PAGE_MASK);
>  
> -	if (!prot || !size || (size | iova | vaddr) & (pgsize - 1)) {
> +	if (!size || (size | iova | vaddr) & (pgsize - 1)) {
>  		ret = -EINVAL;
>  		goto out_unlock;
>  	}
> @@ -1363,7 +1394,20 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  		goto out_unlock;
>  	}
>  
> -	if (vfio_find_dma(iommu, iova, size)) {
> +	dma = vfio_find_dma(iommu, iova, size);
> +	if (set_vaddr) {
> +		if (!dma) {
> +			ret = -ENOENT;
> +		} else if (!dma->vaddr_invalid || dma->iova != iova ||
> +			   dma->size != size) {
> +			ret = -EINVAL;
> +		} else {
> +			dma->vaddr = vaddr;
> +			dma->vaddr_invalid = false;
> +			--iommu->vaddr_invalid_count;
> +		}
> +		goto out_unlock;
> +	} else if (dma) {
>  		ret = -EEXIST;
>  		goto out_unlock;
>  	}
> @@ -1387,6 +1431,7 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  	iommu->dma_avail--;
>  	dma->iova = iova;
>  	dma->vaddr = vaddr;
> +	dma->vaddr_invalid = false;
>  	dma->prot = prot;
>  
>  	/*
> @@ -2483,6 +2528,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
>  	INIT_LIST_HEAD(&iommu->iova_list);
>  	iommu->dma_list = RB_ROOT;
>  	iommu->dma_avail = dma_entry_limit;
> +	iommu->vaddr_invalid_count = 0;

Nit, we can drop this.

>  	mutex_init(&iommu->lock);
>  	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
>  
> @@ -2555,6 +2601,7 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu,
>  	case VFIO_TYPE1v2_IOMMU:
>  	case VFIO_TYPE1_NESTING_IOMMU:
>  	case VFIO_UNMAP_ALL:
> +	case VFIO_UPDATE_VADDR:
>  		return 1;
>  	case VFIO_DMA_CC_IOMMU:
>  		if (!iommu)
> @@ -2709,7 +2756,8 @@ static int vfio_iommu_type1_map_dma(struct vfio_iommu *iommu,
>  {
>  	struct vfio_iommu_type1_dma_map map;
>  	unsigned long minsz;
> -	uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
> +	uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE |
> +			VFIO_DMA_MAP_FLAG_VADDR;
>  
>  	minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
>  
> @@ -2728,6 +2776,7 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
>  	struct vfio_iommu_type1_dma_unmap unmap;
>  	struct vfio_bitmap bitmap = { 0 };
>  	uint32_t mask = VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP |
> +			VFIO_DMA_UNMAP_FLAG_VADDR |
>  			VFIO_DMA_UNMAP_FLAG_ALL;
>  	unsigned long minsz;
>  	int ret;
> @@ -2741,7 +2790,8 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
>  		return -EINVAL;
>  
>  	if ((unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) &&
> -	    (unmap.flags & VFIO_DMA_UNMAP_FLAG_ALL))
> +	    (unmap.flags & (VFIO_DMA_UNMAP_FLAG_ALL |
> +			    VFIO_DMA_UNMAP_FLAG_VADDR)))
>  		return -EINVAL;
>  
>  	if (unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {



* Re: [PATCH V3 7/9] vfio: iommu driver notify callback
  2021-01-29 16:54 ` [PATCH V3 7/9] vfio: iommu driver notify callback Steve Sistare
@ 2021-01-29 21:57   ` Alex Williamson
  2021-01-30 16:51     ` Steven Sistare
  2021-02-01 13:17   ` Cornelia Huck
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2021-01-29 21:57 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:10 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Define a vfio_iommu_driver_ops notify callback, for sending events to
> the driver.  Drivers are not required to provide the callback, and
> may ignore any events.  The handling of events is driver specific.
> 
> Define the CONTAINER_CLOSE event, called when the container's file
> descriptor is closed.  This event signifies that no further state changes
> will occur via container ioctl's.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio.c  | 5 +++++
>  include/linux/vfio.h | 5 +++++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 262ab0e..99458fc 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -1220,6 +1220,11 @@ static int vfio_fops_open(struct inode *inode, struct file *filep)
>  static int vfio_fops_release(struct inode *inode, struct file *filep)
>  {
>  	struct vfio_container *container = filep->private_data;
> +	struct vfio_iommu_driver *driver = container->iommu_driver;
> +
> +	if (driver && driver->ops->notify)
> +		driver->ops->notify(container->iommu_data,
> +				    VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE, NULL);
>  
>  	filep->private_data = NULL;
>  
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 38d3c6a..9642579 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -57,6 +57,9 @@ extern int vfio_add_group_dev(struct device *dev,
>  extern void vfio_device_put(struct vfio_device *device);
>  extern void *vfio_device_data(struct vfio_device *device);
>  
> +/* events for the backend driver notify callback */
> +#define VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE	1

We should use an enum for type checking.

> +
>  /**
>   * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
>   */
> @@ -90,6 +93,8 @@ struct vfio_iommu_driver_ops {
>  					       struct notifier_block *nb);
>  	int		(*dma_rw)(void *iommu_data, dma_addr_t user_iova,
>  				  void *data, size_t count, bool write);
> +	void		(*notify)(void *iommu_data, unsigned int event,
> +				  void *data);

I'm not sure why we're pre-enabling this 3rd arg, do you have some
short term usage in mind that we can't easily add on demand later?
This is an internal API, not one we need to keep stable.  Thanks,

Alex

>  };
>  
>  extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);


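The optional-callback pattern in patch 7, including the enum event type suggested in the review, can be sketched as follows. This is a simplified illustration, not the kernel structures; all names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum for driver events, per the review suggestion to
 * prefer an enum over a bare #define for type checking. */
enum driver_notify_event {
	DRIVER_NOTIFY_CONTAINER_CLOSE = 1,
};

struct iommu_driver_ops {
	/* Optional: drivers may leave this NULL and ignore all events. */
	void (*notify)(void *iommu_data, enum driver_notify_event event);
};

struct container {
	const struct iommu_driver_ops *ops;
	void *iommu_data;
};

/* Mirror of the vfio_fops_release() call site: invoke the callback
 * only if a driver is bound and it provided one. */
static void container_release(struct container *c)
{
	if (c->ops && c->ops->notify)
		c->ops->notify(c->iommu_data, DRIVER_NOTIFY_CONTAINER_CLOSE);
}

/* A type1-like driver marks itself closed on the close event and
 * ignores anything else, as the handling is driver specific. */
struct iommu_state {
	bool container_open;
};

static void type1_notify(void *iommu_data, enum driver_notify_event event)
{
	struct iommu_state *s = iommu_data;

	if (event == DRIVER_NOTIFY_CONTAINER_CLOSE)
		s->container_open = false;
}
```

The NULL checks at the call site are what make the callback optional: a driver that does not care about container lifetime events simply leaves the slot unset.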

* Re: [PATCH V3 0/9] vfio virtual address update
  2021-01-29 21:55 ` [PATCH V3 0/9] vfio virtual address update Alex Williamson
@ 2021-01-29 22:14   ` Steven Sistare
  2021-01-30 16:54   ` Steven Sistare
  1 sibling, 0 replies; 37+ messages in thread
From: Steven Sistare @ 2021-01-29 22:14 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On 1/29/2021 4:55 PM, Alex Williamson wrote:
> On Fri, 29 Jan 2021 08:54:03 -0800
> Steve Sistare <steven.sistare@oracle.com> wrote:
> 
>> Add interfaces that allow the underlying memory object of an iova range
>> to be mapped to a new virtual address in the host process:
>>
>>   - VFIO_DMA_UNMAP_FLAG_VADDR for VFIO_IOMMU_UNMAP_DMA
>>   - VFIO_DMA_MAP_FLAG_VADDR flag for VFIO_IOMMU_MAP_DMA
>>   - VFIO_UPDATE_VADDR for VFIO_CHECK_EXTENSION
>>   - VFIO_DMA_UNMAP_FLAG_ALL for VFIO_IOMMU_UNMAP_DMA
>>   - VFIO_UNMAP_ALL for VFIO_CHECK_EXTENSION
>>
>> Unmap-vaddr invalidates the host virtual address in an iova range and blocks
>> vfio translation of host virtual addresses, but DMA to already-mapped pages
>> continues.  Map-vaddr updates the base VA and resumes translation.  The
>> implementation supports iommu type1 and mediated devices.  Unmap-all allows
>> all ranges to be unmapped or invalidated in a single ioctl, which simplifies
>> userland code.
>>
>> This functionality is necessary for live update, in which a host process
>> such as qemu exec's an updated version of itself, while preserving its
>> guest and vfio devices.  The process blocks vfio VA translation, exec's
>> its new self, mmap's the memory object(s) underlying vfio object, updates
>> the VA, and unblocks translation.  For a working example that uses these
>> new interfaces, see the QEMU patch series "[PATCH V2] Live Update" at
>> https://lore.kernel.org/qemu-devel/1609861330-129855-1-git-send-email-steven.sistare@oracle.com
>>
>> Patches 1-3 define and implement the flag to unmap all ranges.
>> Patches 4-6 define and implement the flags to update vaddr.
>> Patches 7-9 add blocking to complete the implementation.
> 
> Hi Steve,
> 
> It looks pretty good to me, but I have some nit-picky comments that
> I'll follow-up with on the individual patches.  However, I've made the
> changes I suggest in a branch that you can find here:
> 
> git://github.com/awilliam/linux-vfio.git vaddr-v3
> 
> If the changes look ok, just send me an ack, I don't want to attribute
> something to you that you don't approve of.  Thanks,

Thanks Alex, it's great to be down to nits!  I am done for today but I will
look at the vaddr-v3 branch and respond to all your comments tomorrow.

- Steve


* Re: [PATCH V3 5/9] vfio/type1: massage unmap iteration
  2021-01-29 21:56   ` Alex Williamson
@ 2021-01-30 16:51     ` Steven Sistare
  0 siblings, 0 replies; 37+ messages in thread
From: Steven Sistare @ 2021-01-30 16:51 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On 1/29/2021 4:56 PM, Alex Williamson wrote:
> On Fri, 29 Jan 2021 08:54:08 -0800
> Steve Sistare <steven.sistare@oracle.com> wrote:
> 
>> Modify the iteration in vfio_dma_do_unmap so it does not depend on deletion
>> of each dma entry.  Add a variant of vfio_find_dma that returns the entry
>> with the lowest iova in the search range to initialize the iteration.  No
>> externally visible change, but this behavior is needed in the subsequent
>> update-vaddr patch.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 35 ++++++++++++++++++++++++++++++++++-
>>  1 file changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 407f0f7..5823607 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -173,6 +173,31 @@ static struct vfio_dma *vfio_find_dma(struct vfio_iommu *iommu,
>>  	return NULL;
>>  }
>>  
>> +static struct rb_node *vfio_find_dma_first(struct vfio_iommu *iommu,
>> +					    dma_addr_t start, size_t size)
> 
> Nit, we return an rb_node rather than a vfio_dma now, but the naming is
> still pretty similar to vfio_find_dma().  Can I change it to
> vfio_find_dma_first_node() (yes, getting wordy)?  Thanks,

Sure - steve

>> +{
>> +	struct rb_node *res = NULL;
>> +	struct rb_node *node = iommu->dma_list.rb_node;
>> +	struct vfio_dma *dma_res = NULL;
>> +
>> +	while (node) {
>> +		struct vfio_dma *dma = rb_entry(node, struct vfio_dma, node);
>> +
>> +		if (start < dma->iova + dma->size) {
>> +			res = node;
>> +			dma_res = dma;
>> +			if (start >= dma->iova)
>> +				break;
>> +			node = node->rb_left;
>> +		} else {
>> +			node = node->rb_right;
>> +		}
>> +	}
>> +	if (res && size && dma_res->iova >= start + size)
>> +		res = NULL;
>> +	return res;
>> +}
>> +
>>  static void vfio_link_dma(struct vfio_iommu *iommu, struct vfio_dma *new)
>>  {
>>  	struct rb_node **link = &iommu->dma_list.rb_node, *parent = NULL;
>> @@ -1078,6 +1103,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  	dma_addr_t iova = unmap->iova;
>>  	unsigned long size = unmap->size;
>>  	bool unmap_all = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_ALL);
>> +	struct rb_node *n;
>>  
>>  	mutex_lock(&iommu->lock);
>>  
>> @@ -1148,7 +1174,13 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  	}
>>  
>>  	ret = 0;
>> -	while ((dma = vfio_find_dma(iommu, iova, size))) {
>> +	n = vfio_find_dma_first(iommu, iova, size);
>> +
>> +	while (n) {
>> +		dma = rb_entry(n, struct vfio_dma, node);
>> +		if (dma->iova >= iova + size)
>> +			break;
>> +
>>  		if (!iommu->v2 && iova > dma->iova)
>>  			break;
>>  		/*
>> @@ -1193,6 +1225,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  		}
>>  
>>  		unmapped += dma->size;
>> +		n = rb_next(n);
>>  		vfio_remove_dma(iommu, dma);
>>  	}
>>  
> 


* Re: [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr
  2021-01-29 21:56   ` Alex Williamson
@ 2021-01-30 16:51     ` Steven Sistare
  0 siblings, 0 replies; 37+ messages in thread
From: Steven Sistare @ 2021-01-30 16:51 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On 1/29/2021 4:56 PM, Alex Williamson wrote:
> On Fri, 29 Jan 2021 08:54:09 -0800
> Steve Sistare <steven.sistare@oracle.com> wrote:
> 
>> Implement VFIO_DMA_UNMAP_FLAG_VADDR, VFIO_DMA_MAP_FLAG_VADDR, and
>> VFIO_UPDATE_VADDR.  This is a partial implementation.  Blocking is
>> added in a subsequent patch.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 62 +++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 56 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 5823607..cbe3c65 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -74,6 +74,7 @@ struct vfio_iommu {
>>  	bool			nesting;
>>  	bool			dirty_page_tracking;
>>  	bool			pinned_page_dirty_scope;
>> +	int			vaddr_invalid_count;
> 
> Nit, I'll move this up in the struct for alignment/hole purposes.  

OK.
I noticed the hole, but I also noticed that the booleans were not declared in
the hole, so I thought perhaps you cared about maintaining binary compatibility.

> Also
> I think this should be an unsigned, we never use the negative value for
> sanity checking and our number of vfio_dma entries is governed by an
> unsigned int.

Agreed.

>>  };
>>  
>>  struct vfio_domain {
>> @@ -92,6 +93,7 @@ struct vfio_dma {
>>  	int			prot;		/* IOMMU_READ/WRITE */
>>  	bool			iommu_mapped;
>>  	bool			lock_cap;	/* capable(CAP_IPC_LOCK) */
>> +	bool			vaddr_invalid;
>>  	struct task_struct	*task;
>>  	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
>>  	unsigned long		*bitmap;
>> @@ -973,6 +975,8 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>>  	vfio_unlink_dma(iommu, dma);
>>  	put_task_struct(dma->task);
>>  	vfio_dma_bitmap_free(dma);
>> +	if (dma->vaddr_invalid)
>> +		--iommu->vaddr_invalid_count;
> 
> I'm not a fan of the -- on the left, nor the mixing with ++ on the
> right.  Can I move the -- to the right?

Yes please, same here!  I saw a lot of --prefix usage in the kernel, so I guessed
it was the preferred style.

>>  	kfree(dma);
>>  	iommu->dma_avail++;
>>  }
>> @@ -1103,7 +1107,8 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  	dma_addr_t iova = unmap->iova;
>>  	unsigned long size = unmap->size;
>>  	bool unmap_all = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_ALL);
>> -	struct rb_node *n;
>> +	bool invalidate_vaddr = !!(unmap->flags & VFIO_DMA_UNMAP_FLAG_VADDR);
> 
> Nit, !! is not necessary and inconsistent with the MAP side below, will
> remove here and for previous unmap_all use.

Yes please.  I was not sure if the community was OK with variables declared bool
taking values other than 0 or 1.  And I missed the !! in map.

>> +	struct rb_node *n, *first_n, *last_n;
>>  
>>  	mutex_lock(&iommu->lock);
>>  
>> @@ -1174,7 +1179,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  	}
>>  
>>  	ret = 0;
>> -	n = vfio_find_dma_first(iommu, iova, size);
>> +	n = first_n = vfio_find_dma_first(iommu, iova, size);
>>  
>>  	while (n) {
>>  		dma = rb_entry(n, struct vfio_dma, node);
>> @@ -1190,6 +1195,16 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  		if (dma->task->mm != current->mm)
>>  			break;
>>  
>> +		if (invalidate_vaddr) {
>> +			if (dma->vaddr_invalid)
>> +				goto unwind;
> 
> The unwind actually fits pretty well inline here and avoids the leap
> frogging gotos below.  last_n can also be scoped within the unwind more
> easily here.

Sure, that keeps all the vaddr code in one place.

>> +			dma->vaddr_invalid = true;
>> +			iommu->vaddr_invalid_count++;
>> +			unmapped += dma->size;
>> +			n = rb_next(n);
>> +			continue;
>> +		}
>> +
>>  		if (!RB_EMPTY_ROOT(&dma->pfn_list)) {
>>  			struct vfio_iommu_type1_dma_unmap nb_unmap;
>>  
>> @@ -1228,6 +1243,18 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  		n = rb_next(n);
>>  		vfio_remove_dma(iommu, dma);
>>  	}
>> +	goto unlock;
>> +
>> +unwind:
>> +	last_n = n;
>> +
>> +	for (n = first_n; n != last_n; n = rb_next(n)) {
>> +		dma = rb_entry(n, struct vfio_dma, node);
>> +		dma->vaddr_invalid = false;
>> +		--iommu->vaddr_invalid_count;
>> +	}
>> +	ret = -EINVAL;
>> +	unmapped = 0;
>>  
>>  unlock:
>>  	mutex_unlock(&iommu->lock);
>> @@ -1329,6 +1356,7 @@ static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
>>  static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  			   struct vfio_iommu_type1_dma_map *map)
>>  {
>> +	bool set_vaddr = map->flags & VFIO_DMA_MAP_FLAG_VADDR;
>>  	dma_addr_t iova = map->iova;
>>  	unsigned long vaddr = map->vaddr;
>>  	size_t size = map->size;
>> @@ -1346,13 +1374,16 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  	if (map->flags & VFIO_DMA_MAP_FLAG_READ)
>>  		prot |= IOMMU_READ;
>>  
>> +	if ((prot && set_vaddr) || (!prot && !set_vaddr))
>> +		return -EINVAL;
>> +
>>  	mutex_lock(&iommu->lock);
>>  
>>  	pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
>>  
>>  	WARN_ON((pgsize - 1) & PAGE_MASK);
>>  
>> -	if (!prot || !size || (size | iova | vaddr) & (pgsize - 1)) {
>> +	if (!size || (size | iova | vaddr) & (pgsize - 1)) {
>>  		ret = -EINVAL;
>>  		goto out_unlock;
>>  	}
>> @@ -1363,7 +1394,20 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  		goto out_unlock;
>>  	}
>>  
>> -	if (vfio_find_dma(iommu, iova, size)) {
>> +	dma = vfio_find_dma(iommu, iova, size);
>> +	if (set_vaddr) {
>> +		if (!dma) {
>> +			ret = -ENOENT;
>> +		} else if (!dma->vaddr_invalid || dma->iova != iova ||
>> +			   dma->size != size) {
>> +			ret = -EINVAL;
>> +		} else {
>> +			dma->vaddr = vaddr;
>> +			dma->vaddr_invalid = false;
>> +			--iommu->vaddr_invalid_count;
>> +		}
>> +		goto out_unlock;
>> +	} else if (dma) {
>>  		ret = -EEXIST;
>>  		goto out_unlock;
>>  	}
>> @@ -1387,6 +1431,7 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  	iommu->dma_avail--;
>>  	dma->iova = iova;
>>  	dma->vaddr = vaddr;
>> +	dma->vaddr_invalid = false;
>>  	dma->prot = prot;
>>  
>>  	/*
>> @@ -2483,6 +2528,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
>>  	INIT_LIST_HEAD(&iommu->iova_list);
>>  	iommu->dma_list = RB_ROOT;
>>  	iommu->dma_avail = dma_entry_limit;
>> +	iommu->vaddr_invalid_count = 0;
> 
> Nit, we can drop this.

OK.

- Steve

>>  	mutex_init(&iommu->lock);
>>  	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
>>  
>> @@ -2555,6 +2601,7 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu,
>>  	case VFIO_TYPE1v2_IOMMU:
>>  	case VFIO_TYPE1_NESTING_IOMMU:
>>  	case VFIO_UNMAP_ALL:
>> +	case VFIO_UPDATE_VADDR:
>>  		return 1;
>>  	case VFIO_DMA_CC_IOMMU:
>>  		if (!iommu)
>> @@ -2709,7 +2756,8 @@ static int vfio_iommu_type1_map_dma(struct vfio_iommu *iommu,
>>  {
>>  	struct vfio_iommu_type1_dma_map map;
>>  	unsigned long minsz;
>> -	uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
>> +	uint32_t mask = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE |
>> +			VFIO_DMA_MAP_FLAG_VADDR;
>>  
>>  	minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
>>  
>> @@ -2728,6 +2776,7 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
>>  	struct vfio_iommu_type1_dma_unmap unmap;
>>  	struct vfio_bitmap bitmap = { 0 };
>>  	uint32_t mask = VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP |
>> +			VFIO_DMA_UNMAP_FLAG_VADDR |
>>  			VFIO_DMA_UNMAP_FLAG_ALL;
>>  	unsigned long minsz;
>>  	int ret;
>> @@ -2741,7 +2790,8 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
>>  		return -EINVAL;
>>  
>>  	if ((unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) &&
>> -	    (unmap.flags & VFIO_DMA_UNMAP_FLAG_ALL))
>> +	    (unmap.flags & (VFIO_DMA_UNMAP_FLAG_ALL |
>> +			    VFIO_DMA_UNMAP_FLAG_VADDR)))
>>  		return -EINVAL;
>>  
>>  	if (unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
> 

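For readers following the unwind discussion above, the invalidate-then-unwind behavior can be modeled outside the kernel. This is a sketch only: the kernel walks an rb-tree of vfio_dma entries, a plain array stands in for it here, and the structs are pared down to the fields involved.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for struct vfio_dma / struct vfio_iommu. */
struct dma {
	bool vaddr_invalid;
	unsigned long size;
};

struct iommu {
	struct dma *dmas;
	size_t ndmas;
	unsigned int vaddr_invalid_count;
};

/* Model of the VFIO_DMA_UNMAP_FLAG_VADDR loop: mark each entry's vaddr
 * invalid; if an entry is already invalid, unwind everything marked by
 * this call and fail with -EINVAL, as in the patch. */
static int invalidate_vaddrs(struct iommu *iommu, unsigned long *unmapped)
{
	size_t i, j;

	*unmapped = 0;
	for (i = 0; i < iommu->ndmas; i++) {
		struct dma *dma = &iommu->dmas[i];

		if (dma->vaddr_invalid) {
			/* unwind: entries [0, i) were marked by this call */
			for (j = 0; j < i; j++) {
				iommu->dmas[j].vaddr_invalid = false;
				iommu->vaddr_invalid_count--;
			}
			*unmapped = 0;
			return -EINVAL;
		}
		dma->vaddr_invalid = true;
		iommu->vaddr_invalid_count++;
		*unmapped += dma->size;
	}
	return 0;
}
```

A second unmap-vaddr call on the same range fails without disturbing the state left by the first call, which is the property the unwind preserves.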

* Re: [PATCH V3 7/9] vfio: iommu driver notify callback
  2021-01-29 21:57   ` Alex Williamson
@ 2021-01-30 16:51     ` Steven Sistare
  2021-02-01 12:34       ` Cornelia Huck
  0 siblings, 1 reply; 37+ messages in thread
From: Steven Sistare @ 2021-01-30 16:51 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On 1/29/2021 4:57 PM, Alex Williamson wrote:
> On Fri, 29 Jan 2021 08:54:10 -0800
> Steve Sistare <steven.sistare@oracle.com> wrote:
> 
>> Define a vfio_iommu_driver_ops notify callback, for sending events to
>> the driver.  Drivers are not required to provide the callback, and
>> may ignore any events.  The handling of events is driver specific.
>>
>> Define the CONTAINER_CLOSE event, called when the container's file
>> descriptor is closed.  This event signifies that no further state changes
>> will occur via container ioctls.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  drivers/vfio/vfio.c  | 5 +++++
>>  include/linux/vfio.h | 5 +++++
>>  2 files changed, 10 insertions(+)
>>
>> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
>> index 262ab0e..99458fc 100644
>> --- a/drivers/vfio/vfio.c
>> +++ b/drivers/vfio/vfio.c
>> @@ -1220,6 +1220,11 @@ static int vfio_fops_open(struct inode *inode, struct file *filep)
>>  static int vfio_fops_release(struct inode *inode, struct file *filep)
>>  {
>>  	struct vfio_container *container = filep->private_data;
>> +	struct vfio_iommu_driver *driver = container->iommu_driver;
>> +
>> +	if (driver && driver->ops->notify)
>> +		driver->ops->notify(container->iommu_data,
>> +				    VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE, NULL);
>>  
>>  	filep->private_data = NULL;
>>  
>> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
>> index 38d3c6a..9642579 100644
>> --- a/include/linux/vfio.h
>> +++ b/include/linux/vfio.h
>> @@ -57,6 +57,9 @@ extern int vfio_add_group_dev(struct device *dev,
>>  extern void vfio_device_put(struct vfio_device *device);
>>  extern void *vfio_device_data(struct vfio_device *device);
>>  
>> +/* events for the backend driver notify callback */
>> +#define VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE	1
> 
> We should use an enum for type checking.

Agreed.
I see you changed the value to 0.  Do you want to reserve 0 for invalid-event?
(I know, this is internal and can be changed).  Your call.

>> +
>>  /**
>>   * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
>>   */
>> @@ -90,6 +93,8 @@ struct vfio_iommu_driver_ops {
>>  					       struct notifier_block *nb);
>>  	int		(*dma_rw)(void *iommu_data, dma_addr_t user_iova,
>>  				  void *data, size_t count, bool write);
>> +	void		(*notify)(void *iommu_data, unsigned int event,
>> +				  void *data);
> 
> I'm not sure why we're pre-enabling this 3rd arg, do you have some
> short term usage in mind that we can't easily add on demand later?
> This is an internal API, not one we need to keep stable.  Thanks,

No short-term need, it just seems sensible for an event callback.  I was mimicking
the signature of the callback for vfio_register_notifier.  Your call.

- Steve

> 
> Alex
> 
>>  };
>>  
>>  extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
> 

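The optional-callback pattern under review above can be sketched in miniature. This is a simplified model, not the kernel code: the structs are pared down, the enum follows Alex's suggestion from the review, and the exact names are illustrative.

```c
#include <stddef.h>

/* Pared-down model of the internal API from patches 7-8: a driver may
 * supply an optional notify op, and the core invokes it, if present,
 * when the container fd is released. */
enum vfio_iommu_notify_event {
	VFIO_IOMMU_CONTAINER_CLOSE = 0,
};

struct vfio_iommu_driver_ops {
	void (*notify)(void *iommu_data, enum vfio_iommu_notify_event event);
};

struct vfio_container {
	const struct vfio_iommu_driver_ops *ops;
	void *iommu_data;
};

/* Core side: drivers without a notify callback are tolerated. */
static void container_release(struct vfio_container *c)
{
	if (c->ops && c->ops->notify)
		c->ops->notify(c->iommu_data, VFIO_IOMMU_CONTAINER_CLOSE);
}

/* Driver side, as in patch 8: remember that the container closed, so
 * later patches can release waiters blocked on an invalid vaddr. */
static void type1_notify(void *iommu_data, enum vfio_iommu_notify_event event)
{
	if (event == VFIO_IOMMU_CONTAINER_CLOSE)
		*(int *)iommu_data = 1;
}
```

The NULL-check in the core is what makes the callback optional, matching the commit message's "drivers are not required to provide the callback".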

* Re: [PATCH V3 0/9] vfio virtual address update
  2021-01-29 21:55 ` [PATCH V3 0/9] vfio virtual address update Alex Williamson
  2021-01-29 22:14   ` Steven Sistare
@ 2021-01-30 16:54   ` Steven Sistare
  2021-02-01 20:23     ` Alex Williamson
  1 sibling, 1 reply; 37+ messages in thread
From: Steven Sistare @ 2021-01-30 16:54 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On 1/29/2021 4:55 PM, Alex Williamson wrote:
> On Fri, 29 Jan 2021 08:54:03 -0800
> Steve Sistare <steven.sistare@oracle.com> wrote:
> 
>> Add interfaces that allow the underlying memory object of an iova range
>> to be mapped to a new virtual address in the host process:
>>
>>   - VFIO_DMA_UNMAP_FLAG_VADDR for VFIO_IOMMU_UNMAP_DMA
>>   - VFIO_DMA_MAP_FLAG_VADDR flag for VFIO_IOMMU_MAP_DMA
>>   - VFIO_UPDATE_VADDR for VFIO_CHECK_EXTENSION
>>   - VFIO_DMA_UNMAP_FLAG_ALL for VFIO_IOMMU_UNMAP_DMA
>>   - VFIO_UNMAP_ALL for VFIO_CHECK_EXTENSION
>>
>> Unmap-vaddr invalidates the host virtual address in an iova range and blocks
>> vfio translation of host virtual addresses, but DMA to already-mapped pages
>> continues.  Map-vaddr updates the base VA and resumes translation.  The
>> implementation supports iommu type1 and mediated devices.  Unmap-all allows
>> all ranges to be unmapped or invalidated in a single ioctl, which simplifies
>> userland code.
>>
>> This functionality is necessary for live update, in which a host process
>> such as qemu exec's an updated version of itself, while preserving its
>> guest and vfio devices.  The process blocks vfio VA translation, exec's
>> its new self, mmap's the memory object(s) underlying vfio object, updates
>> the VA, and unblocks translation.  For a working example that uses these
>> new interfaces, see the QEMU patch series "[PATCH V2] Live Update" at
>> https://lore.kernel.org/qemu-devel/1609861330-129855-1-git-send-email-steven.sistare@oracle.com
>>
>> Patches 1-3 define and implement the flag to unmap all ranges.
>> Patches 4-6 define and implement the flags to update vaddr.
>> Patches 7-9 add blocking to complete the implementation.
> 
> Hi Steve,
> 
> It looks pretty good to me, but I have some nit-picky comments that
> I'll follow-up with on the individual patches.  However, I've made the
> changes I suggest in a branch that you can find here:
> 
> git://github.com/awilliam/linux-vfio.git vaddr-v3
> 
> If the changes look ok, just send me an ack, I don't want to attribute
> something to you that you don't approve of.  Thanks,

All changes look good, thanks!  
Do you need anything more from me on this patch series?

- Steve

>> Changes in V2:
>>  - define a flag for unmap all instead of special range values
>>  - define the VFIO_UNMAP_ALL extension
>>  - forbid the combination of unmap-all and get-dirty-bitmap
>>  - unwind in unmap on vaddr error
>>  - add a new function to find first dma in a range instead of modifying
>>    an existing function
>>  - change names of update flags
>>  - fix concurrency bugs due to iommu lock being dropped
>>  - call down from from vfio to a new backend interface instead of up from
>>    driver to detect container close
>>  - use wait/wake instead of sleep and polling
>>  - refine the uapi specification
>>  - split patches into vfio vs type1
>>
>> Changes in V3:
>>  - add vaddr_invalid_count to fix pin_pages race with unmap
>>  - refactor the wait helper functions
>>  - traverse dma entries more efficiently in unmap
>>  - check unmap flag conflicts more explicitly
>>  - rename some local variables and functions
>>
>> Steve Sistare (9):
>>   vfio: option to unmap all
>>   vfio/type1: unmap cleanup
>>   vfio/type1: implement unmap all
>>   vfio: interfaces to update vaddr
>>   vfio/type1: massage unmap iteration
>>   vfio/type1: implement interfaces to update vaddr
>>   vfio: iommu driver notify callback
>>   vfio/type1: implement notify callback
>>   vfio/type1: block on invalid vaddr
>>
>>  drivers/vfio/vfio.c             |   5 +
>>  drivers/vfio/vfio_iommu_type1.c | 251 +++++++++++++++++++++++++++++++++++-----
>>  include/linux/vfio.h            |   5 +
>>  include/uapi/linux/vfio.h       |  27 +++++
>>  4 files changed, 256 insertions(+), 32 deletions(-)
>>
> 

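To make the cover letter's live-update sequence concrete, here is a pseudocode sketch of the userland side. Error handling, per-range iteration, and checking the VFIO_UPDATE_VADDR/VFIO_UNMAP_ALL extensions are elided; `container_fd`, `new_qemu_path`, `guest_memfd`, `iova`, and `len` are assumed to exist; struct and flag names follow this series' uapi.

```c
/* 1. Block vfio translation: invalidate all vaddrs in one ioctl. */
struct vfio_iommu_type1_dma_unmap unmap = {
	.argsz = sizeof(unmap),
	.flags = VFIO_DMA_UNMAP_FLAG_VADDR | VFIO_DMA_UNMAP_FLAG_ALL,
	/* iova and size must be 0 when FLAG_ALL is set */
};
ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap);

/* 2. Exec the updated binary; container_fd must not be O_CLOEXEC. */
execv(new_qemu_path, argv);

/* 3. In the new process: re-mmap the memory object backing the guest. */
void *new_vaddr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED, guest_memfd, 0);

/* 4. Update the vaddr for each iova range; translation resumes. */
struct vfio_iommu_type1_dma_map map = {
	.argsz = sizeof(map),
	.flags = VFIO_DMA_MAP_FLAG_VADDR,
	.iova  = iova,
	.size  = len,
	.vaddr = (__u64)(uintptr_t)new_vaddr,
};
ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
```

Note that step 4 is per-range: map-vaddr requires iova and size to match an existing mapping, even though step 1 invalidated all ranges at once.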

* Re: [PATCH V3 1/9] vfio: option to unmap all
  2021-01-29 16:54 ` [PATCH V3 1/9] vfio: option to unmap all Steve Sistare
@ 2021-02-01 11:42   ` Cornelia Huck
  2021-02-01 12:52     ` Steven Sistare
  0 siblings, 1 reply; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 11:42 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:04 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> For the UNMAP_DMA ioctl, delete all mappings if VFIO_DMA_UNMAP_FLAG_ALL
> is set.  Define the corresponding VFIO_UNMAP_ALL extension.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  include/uapi/linux/vfio.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 9204705..55578b6 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -46,6 +46,9 @@
>   */
>  #define VFIO_NOIOMMU_IOMMU		8
>  
> +/* Supports VFIO_DMA_UNMAP_FLAG_ALL */
> +#define VFIO_UNMAP_ALL			9
> +
>  /*
>   * The IOCTL interface is designed for extensibility by embedding the
>   * structure length (argsz) and flags into structures passed between
> @@ -1074,6 +1077,7 @@ struct vfio_bitmap {
>   * field.  No guarantee is made to the user that arbitrary unmaps of iova
>   * or size different from those used in the original mapping call will
>   * succeed.
> + *
>   * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP should be set to get the dirty bitmap
>   * before unmapping IO virtual addresses. When this flag is set, the user must
>   * provide a struct vfio_bitmap in data[]. User must provide zero-allocated
> @@ -1083,11 +1087,15 @@ struct vfio_bitmap {
>   * indicates that the page at that offset from iova is dirty. A Bitmap of the
>   * pages in the range of unmapped size is returned in the user-provided
>   * vfio_bitmap.data.
> + *
> + * If flags & VFIO_DMA_UNMAP_FLAG_ALL, unmap all addresses.  iova and size
> + * must be 0.  This may not be combined with the get-dirty-bitmap flag.

wording nit: s/may not/cannot/

Or maybe

"VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP and VFIO_DMA_UNMAP_FLAG_ALL are
mutually exclusive."

?

>   */
>  struct vfio_iommu_type1_dma_unmap {
>  	__u32	argsz;
>  	__u32	flags;
>  #define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
> +#define VFIO_DMA_UNMAP_FLAG_ALL		     (1 << 1)
>  	__u64	iova;				/* IO virtual address */
>  	__u64	size;				/* Size of mapping (bytes) */
>  	__u8    data[];


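The mutual-exclusivity rule discussed above is easy to model in user code. The sketch below mirrors the flag check quoted from vfio_iommu_type1_unmap_dma; VFIO_DMA_UNMAP_FLAG_VADDR's value comes from patch 4, which is not quoted in this thread, so treat it as illustrative.

```c
/* Unmap flags as defined by this series (FLAG_VADDR's value assumed
 * from patch 4, not quoted here). */
#define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP	(1 << 0)
#define VFIO_DMA_UNMAP_FLAG_ALL			(1 << 1)
#define VFIO_DMA_UNMAP_FLAG_VADDR		(1 << 2)

/* Mirrors the kernel check: unknown flags are rejected, and
 * get-dirty-bitmap is mutually exclusive with both ALL and VADDR. */
static int unmap_flags_valid(unsigned int flags)
{
	unsigned int mask = VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP |
			    VFIO_DMA_UNMAP_FLAG_ALL |
			    VFIO_DMA_UNMAP_FLAG_VADDR;

	if (flags & ~mask)
		return 0;
	if ((flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) &&
	    (flags & (VFIO_DMA_UNMAP_FLAG_ALL | VFIO_DMA_UNMAP_FLAG_VADDR)))
		return 0;
	return 1;
}
```

Combining ALL and VADDR is allowed, which is exactly what the live-update flow relies on; only the dirty-bitmap flag is excluded.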

* Re: [PATCH V3 2/9] vfio/type1: unmap cleanup
  2021-01-29 16:54 ` [PATCH V3 2/9] vfio/type1: unmap cleanup Steve Sistare
@ 2021-02-01 11:50   ` Cornelia Huck
  0 siblings, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 11:50 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:05 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Minor changes in vfio_dma_do_unmap to improve readability, which also
> simplify the subsequent unmap-all patch.  No functional change.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 38 +++++++++++++++-----------------------
>  1 file changed, 15 insertions(+), 23 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH V3 3/9] vfio/type1: implement unmap all
  2021-01-29 16:54 ` [PATCH V3 3/9] vfio/type1: implement unmap all Steve Sistare
@ 2021-02-01 11:59   ` Cornelia Huck
  0 siblings, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 11:59 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:06 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Implement VFIO_DMA_UNMAP_FLAG_ALL.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH V3 5/9] vfio/type1: massage unmap iteration
  2021-01-29 16:54 ` [PATCH V3 5/9] vfio/type1: massage unmap iteration Steve Sistare
  2021-01-29 21:56   ` Alex Williamson
@ 2021-02-01 12:11   ` Cornelia Huck
  1 sibling, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 12:11 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:08 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Modify the iteration in vfio_dma_do_unmap so it does not depend on deletion
> of each dma entry.  Add a variant of vfio_find_dma that returns the entry
> with the lowest iova in the search range to initialize the iteration.  No
> externally visible change, but this behavior is needed in the subsequent
> update-vaddr patch.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 35 ++++++++++++++++++++++++++++++++++-
>  1 file changed, 34 insertions(+), 1 deletion(-)

With the function name tweak:

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr
  2021-01-29 16:54 ` [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr Steve Sistare
  2021-01-29 21:56   ` Alex Williamson
@ 2021-02-01 12:27   ` Cornelia Huck
  1 sibling, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 12:27 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:09 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Implement VFIO_DMA_UNMAP_FLAG_VADDR, VFIO_DMA_MAP_FLAG_VADDR, and
> VFIO_UPDATE_VADDR.  This is a partial implementation.  Blocking is
> added in a subsequent patch.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 62 +++++++++++++++++++++++++++++++++++++----
>  1 file changed, 56 insertions(+), 6 deletions(-)

With the changes in Alex's branch:

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH V3 7/9] vfio: iommu driver notify callback
  2021-01-30 16:51     ` Steven Sistare
@ 2021-02-01 12:34       ` Cornelia Huck
  2021-02-01 12:52         ` Steven Sistare
  0 siblings, 1 reply; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 12:34 UTC (permalink / raw)
  To: Steven Sistare; +Cc: Alex Williamson, kvm, Kirti Wankhede

On Sat, 30 Jan 2021 11:51:41 -0500
Steven Sistare <steven.sistare@oracle.com> wrote:

> On 1/29/2021 4:57 PM, Alex Williamson wrote:
> > On Fri, 29 Jan 2021 08:54:10 -0800
> > Steve Sistare <steven.sistare@oracle.com> wrote:
> >   
> >> Define a vfio_iommu_driver_ops notify callback, for sending events to
> >> the driver.  Drivers are not required to provide the callback, and
> >> may ignore any events.  The handling of events is driver specific.
> >>
> >> Define the CONTAINER_CLOSE event, called when the container's file
> >> descriptor is closed.  This event signifies that no further state changes
> >> will occur via container ioctls.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> ---
> >>  drivers/vfio/vfio.c  | 5 +++++
> >>  include/linux/vfio.h | 5 +++++
> >>  2 files changed, 10 insertions(+)
> >>
> >> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> >> index 262ab0e..99458fc 100644
> >> --- a/drivers/vfio/vfio.c
> >> +++ b/drivers/vfio/vfio.c
> >> @@ -1220,6 +1220,11 @@ static int vfio_fops_open(struct inode *inode, struct file *filep)
> >>  static int vfio_fops_release(struct inode *inode, struct file *filep)
> >>  {
> >>  	struct vfio_container *container = filep->private_data;
> >> +	struct vfio_iommu_driver *driver = container->iommu_driver;
> >> +
> >> +	if (driver && driver->ops->notify)
> >> +		driver->ops->notify(container->iommu_data,
> >> +				    VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE, NULL);
> >>  
> >>  	filep->private_data = NULL;
> >>  
> >> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> >> index 38d3c6a..9642579 100644
> >> --- a/include/linux/vfio.h
> >> +++ b/include/linux/vfio.h
> >> @@ -57,6 +57,9 @@ extern int vfio_add_group_dev(struct device *dev,
> >>  extern void vfio_device_put(struct vfio_device *device);
> >>  extern void *vfio_device_data(struct vfio_device *device);
> >>  
> >> +/* events for the backend driver notify callback */
> >> +#define VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE	1  
> > 
> > We should use an enum for type checking.  
> 
> Agreed.
> I see you changed the value to 0.  Do you want to reserve 0 for invalid-event?
> (I know, this is internal and can be changed).  Your call.

I'm not sure what we would use an invalid-event event for... the type
checking provided by the enum should be enough?

> 
> >> +
> >>  /**
> >>   * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
> >>   */
> >> @@ -90,6 +93,8 @@ struct vfio_iommu_driver_ops {
> >>  					       struct notifier_block *nb);
> >>  	int		(*dma_rw)(void *iommu_data, dma_addr_t user_iova,
> >>  				  void *data, size_t count, bool write);
> >> +	void		(*notify)(void *iommu_data, unsigned int event,
> >> +				  void *data);  
> > 
> > I'm not sure why we're pre-enabling this 3rd arg, do you have some
> > short term usage in mind that we can't easily add on demand later?
> > This is an internal API, not one we need to keep stable.  Thanks,  
> 
>> No short-term need, it just seems sensible for an event callback.  I was mimicking
>> the signature of the callback for vfio_register_notifier.  Your call.

I'd drop *data for now, if we don't have a concrete use case.

> 
> - Steve
> 
> > 
> > Alex
> >   
> >>  };
> >>  
> >>  extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);  
> >   
> 



* Re: [PATCH V3 8/9] vfio/type1: implement notify callback
  2021-01-29 16:54 ` [PATCH V3 8/9] vfio/type1: implement " Steve Sistare
@ 2021-02-01 12:40   ` Cornelia Huck
  0 siblings, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 12:40 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:11 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Implement a notify callback that remembers if the container's file
> descriptor has been closed.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 15 +++++++++++++++

For the version in Alex's git branch:

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH V3 9/9] vfio/type1: block on invalid vaddr
  2021-01-29 16:54 ` [PATCH V3 9/9] vfio/type1: block on invalid vaddr Steve Sistare
@ 2021-02-01 12:47   ` Cornelia Huck
  0 siblings, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 12:47 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:12 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Block translation of host virtual address while an iova range has an
> invalid vaddr.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 95 ++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 90 insertions(+), 5 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH V3 1/9] vfio: option to unmap all
  2021-02-01 11:42   ` Cornelia Huck
@ 2021-02-01 12:52     ` Steven Sistare
  2021-02-01 13:13       ` Cornelia Huck
  0 siblings, 1 reply; 37+ messages in thread
From: Steven Sistare @ 2021-02-01 12:52 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: kvm, Alex Williamson, Kirti Wankhede

On 2/1/2021 6:42 AM, Cornelia Huck wrote:
> On Fri, 29 Jan 2021 08:54:04 -0800
> Steve Sistare <steven.sistare@oracle.com> wrote:
> 
>> For the UNMAP_DMA ioctl, delete all mappings if VFIO_DMA_UNMAP_FLAG_ALL
>> is set.  Define the corresponding VFIO_UNMAP_ALL extension.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  include/uapi/linux/vfio.h | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>> index 9204705..55578b6 100644
>> --- a/include/uapi/linux/vfio.h
>> +++ b/include/uapi/linux/vfio.h
>> @@ -46,6 +46,9 @@
>>   */
>>  #define VFIO_NOIOMMU_IOMMU		8
>>  
>> +/* Supports VFIO_DMA_UNMAP_FLAG_ALL */
>> +#define VFIO_UNMAP_ALL			9
>> +
>>  /*
>>   * The IOCTL interface is designed for extensibility by embedding the
>>   * structure length (argsz) and flags into structures passed between
>> @@ -1074,6 +1077,7 @@ struct vfio_bitmap {
>>   * field.  No guarantee is made to the user that arbitrary unmaps of iova
>>   * or size different from those used in the original mapping call will
>>   * succeed.
>> + *
>>   * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP should be set to get the dirty bitmap
>>   * before unmapping IO virtual addresses. When this flag is set, the user must
>>   * provide a struct vfio_bitmap in data[]. User must provide zero-allocated
>> @@ -1083,11 +1087,15 @@ struct vfio_bitmap {
>>   * indicates that the page at that offset from iova is dirty. A Bitmap of the
>>   * pages in the range of unmapped size is returned in the user-provided
>>   * vfio_bitmap.data.
>> + *
>> + * If flags & VFIO_DMA_UNMAP_FLAG_ALL, unmap all addresses.  iova and size
>> + * must be 0.  This may not be combined with the get-dirty-bitmap flag.
> 
> wording nit: s/may not/cannot/

OK. So, the same edit should be applied to this from patch 4:

+ * iova's vaddr will block.  DMA to already-mapped pages continues.  This may
+ * not be combined with the get-dirty-bitmap flag.

Alex, can you make these edits please?

- Steve

> Or maybe
> 
> "VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP and VFIO_DMA_UNMAP_FLAG_ALL are
> mutually exclusive."
> 
> ?
> 
>>   */
>>  struct vfio_iommu_type1_dma_unmap {
>>  	__u32	argsz;
>>  	__u32	flags;
>>  #define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
>> +#define VFIO_DMA_UNMAP_FLAG_ALL		     (1 << 1)
>>  	__u64	iova;				/* IO virtual address */
>>  	__u64	size;				/* Size of mapping (bytes) */
>>  	__u8    data[];
> 


* Re: [PATCH V3 7/9] vfio: iommu driver notify callback
  2021-02-01 12:34       ` Cornelia Huck
@ 2021-02-01 12:52         ` Steven Sistare
  2021-02-01 13:17           ` Cornelia Huck
  0 siblings, 1 reply; 37+ messages in thread
From: Steven Sistare @ 2021-02-01 12:52 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Alex Williamson, kvm, Kirti Wankhede

On 2/1/2021 7:34 AM, Cornelia Huck wrote:
> On Sat, 30 Jan 2021 11:51:41 -0500
> Steven Sistare <steven.sistare@oracle.com> wrote:
> 
>> On 1/29/2021 4:57 PM, Alex Williamson wrote:
>>> On Fri, 29 Jan 2021 08:54:10 -0800
>>> Steve Sistare <steven.sistare@oracle.com> wrote:
>>>   
>>>> Define a vfio_iommu_driver_ops notify callback, for sending events to
>>>> the driver.  Drivers are not required to provide the callback, and
>>>> may ignore any events.  The handling of events is driver specific.
>>>>
>>>> Define the CONTAINER_CLOSE event, called when the container's file
>>>> descriptor is closed.  This event signifies that no further state changes
>>>> will occur via container ioctls.
>>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> ---
>>>>  drivers/vfio/vfio.c  | 5 +++++
>>>>  include/linux/vfio.h | 5 +++++
>>>>  2 files changed, 10 insertions(+)
>>>>
>>>> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
>>>> index 262ab0e..99458fc 100644
>>>> --- a/drivers/vfio/vfio.c
>>>> +++ b/drivers/vfio/vfio.c
>>>> @@ -1220,6 +1220,11 @@ static int vfio_fops_open(struct inode *inode, struct file *filep)
>>>>  static int vfio_fops_release(struct inode *inode, struct file *filep)
>>>>  {
>>>>  	struct vfio_container *container = filep->private_data;
>>>> +	struct vfio_iommu_driver *driver = container->iommu_driver;
>>>> +
>>>> +	if (driver && driver->ops->notify)
>>>> +		driver->ops->notify(container->iommu_data,
>>>> +				    VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE, NULL);
>>>>  
>>>>  	filep->private_data = NULL;
>>>>  
>>>> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
>>>> index 38d3c6a..9642579 100644
>>>> --- a/include/linux/vfio.h
>>>> +++ b/include/linux/vfio.h
>>>> @@ -57,6 +57,9 @@ extern int vfio_add_group_dev(struct device *dev,
>>>>  extern void vfio_device_put(struct vfio_device *device);
>>>>  extern void *vfio_device_data(struct vfio_device *device);
>>>>  
>>>> +/* events for the backend driver notify callback */
>>>> +#define VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE	1  
>>>
>>> We should use an enum for type checking.  
>>
>> Agreed.
>> I see you changed the value to 0.  Do you want to reserve 0 for invalid-event?
>> (I know, this is internal and can be changed).  Your call.
> 
> I'm not sure what we would use an invalid-event event for... the type
> checking provided by the enum should be enough?

I should have described it as no-event or null-event.  It can be useful when
initializing a struct member that stores an event, e.g., last-event-received.

- Steve

>>>> +
>>>>  /**
>>>>   * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
>>>>   */
>>>> @@ -90,6 +93,8 @@ struct vfio_iommu_driver_ops {
>>>>  					       struct notifier_block *nb);
>>>>  	int		(*dma_rw)(void *iommu_data, dma_addr_t user_iova,
>>>>  				  void *data, size_t count, bool write);
>>>> +	void		(*notify)(void *iommu_data, unsigned int event,
>>>> +				  void *data);  
>>>
>>> I'm not sure why we're pre-enabling this 3rd arg, do you have some
>>> short term usage in mind that we can't easily add on demand later?
>>> This is an internal API, not one we need to keep stable.  Thanks,  
>>
>> No short-term need, it just seems sensible for an event callback.  I was mimicking
>> the signature of the callback for vfio_register_notifier.  Your call.
> 
> I'd drop *data for now, if we don't have a concrete use case.
> 
>>
>> - Steve
>>
>>>
>>> Alex
>>>   
>>>>  };
>>>>  
>>>>  extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);  
>>>   
>>
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH V3 1/9] vfio: option to unmap all
  2021-02-01 12:52     ` Steven Sistare
@ 2021-02-01 13:13       ` Cornelia Huck
  0 siblings, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 13:13 UTC (permalink / raw)
  To: Steven Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Mon, 1 Feb 2021 07:52:04 -0500
Steven Sistare <steven.sistare@oracle.com> wrote:

> On 2/1/2021 6:42 AM, Cornelia Huck wrote:
> > On Fri, 29 Jan 2021 08:54:04 -0800
> > Steve Sistare <steven.sistare@oracle.com> wrote:
> >   
> >> For the UNMAP_DMA ioctl, delete all mappings if VFIO_DMA_UNMAP_FLAG_ALL
> >> is set.  Define the corresponding VFIO_UNMAP_ALL extension.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> ---
> >>  include/uapi/linux/vfio.h | 8 ++++++++
> >>  1 file changed, 8 insertions(+)
> >>
> >> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >> index 9204705..55578b6 100644
> >> --- a/include/uapi/linux/vfio.h
> >> +++ b/include/uapi/linux/vfio.h
> >> @@ -46,6 +46,9 @@
> >>   */
> >>  #define VFIO_NOIOMMU_IOMMU		8
> >>  
> >> +/* Supports VFIO_DMA_UNMAP_FLAG_ALL */
> >> +#define VFIO_UNMAP_ALL			9
> >> +
> >>  /*
> >>   * The IOCTL interface is designed for extensibility by embedding the
> >>   * structure length (argsz) and flags into structures passed between
> >> @@ -1074,6 +1077,7 @@ struct vfio_bitmap {
> >>   * field.  No guarantee is made to the user that arbitrary unmaps of iova
> >>   * or size different from those used in the original mapping call will
> >>   * succeed.
> >> + *
> >>   * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP should be set to get the dirty bitmap
> >>   * before unmapping IO virtual addresses. When this flag is set, the user must
> >>   * provide a struct vfio_bitmap in data[]. User must provide zero-allocated
> >> @@ -1083,11 +1087,15 @@ struct vfio_bitmap {
> >>   * indicates that the page at that offset from iova is dirty. A Bitmap of the
> >>   * pages in the range of unmapped size is returned in the user-provided
> >>   * vfio_bitmap.data.
> >> + *
> >> + * If flags & VFIO_DMA_UNMAP_FLAG_ALL, unmap all addresses.  iova and size
> >> + * must be 0.  This may not be combined with the get-dirty-bitmap flag.  
> > 
> > wording nit: s/may not/cannot/  
> 
> OK. So, the same edit should be applied to this from patch 4:
> 
> + * iova's vaddr will block.  DMA to already-mapped pages continues.  This may
> + * not be combined with the get-dirty-bitmap flag.
> 
> Alex, can you make these edits please?

With that,

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

> 
> - Steve
> 
> > Or maybe
> > 
> > "VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP and VFIO_DMA_UNMAP_FLAG_ALL are
> > mutually exclusive."
> > 
> > ?
> >   
> >>   */
> >>  struct vfio_iommu_type1_dma_unmap {
> >>  	__u32	argsz;
> >>  	__u32	flags;
> >>  #define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
> >> +#define VFIO_DMA_UNMAP_FLAG_ALL		     (1 << 1)
> >>  	__u64	iova;				/* IO virtual address */
> >>  	__u64	size;				/* Size of mapping (bytes) */
> >>  	__u8    data[];  
> >   
> 



* Re: [PATCH V3 4/9] vfio: interfaces to update vaddr
  2021-01-29 16:54 ` [PATCH V3 4/9] vfio: interfaces to update vaddr Steve Sistare
@ 2021-02-01 13:13   ` Cornelia Huck
  0 siblings, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 13:13 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:07 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Define interfaces that allow the underlying memory object of an iova
> range to be mapped to a new host virtual address in the host process:
> 
>   - VFIO_DMA_UNMAP_FLAG_VADDR for VFIO_IOMMU_UNMAP_DMA
>   - VFIO_DMA_MAP_FLAG_VADDR flag for VFIO_IOMMU_MAP_DMA
>   - VFIO_UPDATE_VADDR extension for VFIO_CHECK_EXTENSION
> 
> Unmap vaddr invalidates the host virtual address in an iova range, and
> blocks vfio translation of host virtual addresses.  DMA to already-mapped
> pages continues.  Map vaddr updates the base VA and resumes translation.
> See comments in uapi/linux/vfio.h for more details.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  include/uapi/linux/vfio.h | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH V3 7/9] vfio: iommu driver notify callback
  2021-02-01 12:52         ` Steven Sistare
@ 2021-02-01 13:17           ` Cornelia Huck
  0 siblings, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 13:17 UTC (permalink / raw)
  To: Steven Sistare; +Cc: Alex Williamson, kvm, Kirti Wankhede

On Mon, 1 Feb 2021 07:52:16 -0500
Steven Sistare <steven.sistare@oracle.com> wrote:

> On 2/1/2021 7:34 AM, Cornelia Huck wrote:
> > On Sat, 30 Jan 2021 11:51:41 -0500
> > Steven Sistare <steven.sistare@oracle.com> wrote:
> >   
> >> On 1/29/2021 4:57 PM, Alex Williamson wrote:  
> >>> On Fri, 29 Jan 2021 08:54:10 -0800
> >>> Steve Sistare <steven.sistare@oracle.com> wrote:
> >>>     
> >>>> Define a vfio_iommu_driver_ops notify callback, for sending events to
> >>>> the driver.  Drivers are not required to provide the callback, and
> >>>> may ignore any events.  The handling of events is driver specific.
> >>>>
> >>>> Define the CONTAINER_CLOSE event, called when the container's file
> >>>> descriptor is closed.  This event signifies that no further state changes
> >>>> will occur via container ioctl's.
> >>>>
> >>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >>>> ---
> >>>>  drivers/vfio/vfio.c  | 5 +++++
> >>>>  include/linux/vfio.h | 5 +++++
> >>>>  2 files changed, 10 insertions(+)

> >>>> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> >>>> index 38d3c6a..9642579 100644
> >>>> --- a/include/linux/vfio.h
> >>>> +++ b/include/linux/vfio.h
> >>>> @@ -57,6 +57,9 @@ extern int vfio_add_group_dev(struct device *dev,
> >>>>  extern void vfio_device_put(struct vfio_device *device);
> >>>>  extern void *vfio_device_data(struct vfio_device *device);
> >>>>  
> >>>> +/* events for the backend driver notify callback */
> >>>> +#define VFIO_DRIVER_NOTIFY_CONTAINER_CLOSE	1    
> >>>
> >>> We should use an enum for type checking.    
> >>
> >> Agreed.
> >> I see you changed the value to 0.  Do you want to reserve 0 for invalid-event?
> >> (I know, this is internal and can be changed).  Your call.  
> > 
> > I'm not sure what we would use an invalid-event event for... the type
> > checking provided by the enum should be enough?  
> 
> I should have described it as no-event or null-event.  It can be useful when
> initializing a struct member that stores an event, eg, last-event-received.

I think we could just use -1 for that. Anyway, easy to change if a need
comes up.



* Re: [PATCH V3 7/9] vfio: iommu driver notify callback
  2021-01-29 16:54 ` [PATCH V3 7/9] vfio: iommu driver notify callback Steve Sistare
  2021-01-29 21:57   ` Alex Williamson
@ 2021-02-01 13:17   ` Cornelia Huck
  1 sibling, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-01 13:17 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Alex Williamson, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:10 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> Define a vfio_iommu_driver_ops notify callback, for sending events to
> the driver.  Drivers are not required to provide the callback, and
> may ignore any events.  The handling of events is driver specific.
> 
> Define the CONTAINER_CLOSE event, called when the container's file
> descriptor is closed.  This event signifies that no further state changes
> will occur via container ioctl's.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vfio/vfio.c  | 5 +++++
>  include/linux/vfio.h | 5 +++++
>  2 files changed, 10 insertions(+)

For the version in Alex' git branch:

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH V3 0/9] vfio virtual address update
  2021-01-30 16:54   ` Steven Sistare
@ 2021-02-01 20:23     ` Alex Williamson
  2021-02-01 21:13       ` Steven Sistare
  2021-02-02  7:54       ` Cornelia Huck
  0 siblings, 2 replies; 37+ messages in thread
From: Alex Williamson @ 2021-02-01 20:23 UTC (permalink / raw)
  To: Steven Sistare; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On Sat, 30 Jan 2021 11:54:03 -0500
Steven Sistare <steven.sistare@oracle.com> wrote:

> On 1/29/2021 4:55 PM, Alex Williamson wrote:
> > On Fri, 29 Jan 2021 08:54:03 -0800
> > Steve Sistare <steven.sistare@oracle.com> wrote:
> >   
> >> Add interfaces that allow the underlying memory object of an iova range
> >> to be mapped to a new virtual address in the host process:
> >>
> >>   - VFIO_DMA_UNMAP_FLAG_VADDR for VFIO_IOMMU_UNMAP_DMA
> >>   - VFIO_DMA_MAP_FLAG_VADDR flag for VFIO_IOMMU_MAP_DMA
> >>   - VFIO_UPDATE_VADDR for VFIO_CHECK_EXTENSION
> >>   - VFIO_DMA_UNMAP_FLAG_ALL for VFIO_IOMMU_UNMAP_DMA
> >>   - VFIO_UNMAP_ALL for VFIO_CHECK_EXTENSION
> >>
> >> Unmap-vaddr invalidates the host virtual address in an iova range and blocks
> >> vfio translation of host virtual addresses, but DMA to already-mapped pages
> >> continues.  Map-vaddr updates the base VA and resumes translation.  The
> >> implementation supports iommu type1 and mediated devices.  Unmap-all allows
> >> all ranges to be unmapped or invalidated in a single ioctl, which simplifies
> >> userland code.
> >>
> >> This functionality is necessary for live update, in which a host process
> >> such as qemu exec's an updated version of itself, while preserving its
> >> guest and vfio devices.  The process blocks vfio VA translation, exec's
> >> its new self, mmap's the memory object(s) underlying vfio object, updates
> >> the VA, and unblocks translation.  For a working example that uses these
> >> new interfaces, see the QEMU patch series "[PATCH V2] Live Update" at
> >> https://lore.kernel.org/qemu-devel/1609861330-129855-1-git-send-email-steven.sistare@oracle.com
> >>
> >> Patches 1-3 define and implement the flag to unmap all ranges.
> >> Patches 4-6 define and implement the flags to update vaddr.
> >> Patches 7-9 add blocking to complete the implementation.  
> > 
> > Hi Steve,
> > 
> > It looks pretty good to me, but I have some nit-picky comments that
> > I'll follow-up with on the individual patches.  However, I've made the
> > changes I suggest in a branch that you can find here:
> > 
> > git://github.com/awilliam/linux-vfio.git vaddr-v3
> > 
> > If the changes look ok, just send me an ack, I don't want to attribute
> > something to you that you don't approve of.  Thanks,  
> 
> All changes look good, thanks!  
> Do you need anything more from me on this patch series?

Here's a new branch:

git://github.com/awilliam/linux-vfio.git vaddr-v3.1

The extent of the changes is s/may not/cannot/ on patches 1 & 4 and the
addition of Connie's R-b for all (rebased to rc6).  If there are any
final comments, speak now.  Thanks,

Alex



* Re: [PATCH V3 0/9] vfio virtual address update
  2021-02-01 20:23     ` Alex Williamson
@ 2021-02-01 21:13       ` Steven Sistare
  2021-02-02  7:54       ` Cornelia Huck
  1 sibling, 0 replies; 37+ messages in thread
From: Steven Sistare @ 2021-02-01 21:13 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On 2/1/2021 3:23 PM, Alex Williamson wrote:
> On Sat, 30 Jan 2021 11:54:03 -0500
> Steven Sistare <steven.sistare@oracle.com> wrote:
> 
>> On 1/29/2021 4:55 PM, Alex Williamson wrote:
>>> On Fri, 29 Jan 2021 08:54:03 -0800
>>> Steve Sistare <steven.sistare@oracle.com> wrote:
>>>   
>>>> [...]
>>>
>>> Hi Steve,
>>>
>>> It looks pretty good to me, but I have some nit-picky comments that
>>> I'll follow-up with on the individual patches.  However, I've made the
>>> changes I suggest in a branch that you can find here:
>>>
>>> git://github.com/awilliam/linux-vfio.git vaddr-v3
>>>
>>> If the changes look ok, just send me an ack, I don't want to attribute
>>> something to you that you don't approve of.  Thanks,  
>>
>> All changes look good, thanks!  
>> Do you need anything more from me on this patch series?
> 
> Here's a new branch:
> 
> git://github.com/awilliam/linux-vfio.git vaddr-v3.1
> 
> The extent of the changes is s/may not/cannot/ on patches 1 & 4 and the
> addition of Connie's R-b for all (rebased to rc6).  If there are any
> final comments, speak now.  Thanks,

Looks good - Steve


* Re: [PATCH V3 0/9] vfio virtual address update
  2021-02-01 20:23     ` Alex Williamson
  2021-02-01 21:13       ` Steven Sistare
@ 2021-02-02  7:54       ` Cornelia Huck
  1 sibling, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2021-02-02  7:54 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Steven Sistare, kvm, Kirti Wankhede

On Mon, 1 Feb 2021 13:23:18 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Sat, 30 Jan 2021 11:54:03 -0500
> Steven Sistare <steven.sistare@oracle.com> wrote:
> 
> > On 1/29/2021 4:55 PM, Alex Williamson wrote:  
> > > On Fri, 29 Jan 2021 08:54:03 -0800
> > > Steve Sistare <steven.sistare@oracle.com> wrote:
> > >     
> > >> [...]
> > > 
> > > Hi Steve,
> > > 
> > > It looks pretty good to me, but I have some nit-picky comments that
> > > I'll follow-up with on the individual patches.  However, I've made the
> > > changes I suggest in a branch that you can find here:
> > > 
> > > git://github.com/awilliam/linux-vfio.git vaddr-v3
> > > 
> > > If the changes look ok, just send me an ack, I don't want to attribute
> > > something to you that you don't approve of.  Thanks,    
> > 
> > All changes look good, thanks!  
> > Do you need anything more from me on this patch series?  
> 
> Here's a new branch:
> 
> git://github.com/awilliam/linux-vfio.git vaddr-v3.1
> 
> The extent of the changes is s/may not/cannot/ on patches 1 & 4 and the
> addition of Connie's R-b for all (rebased to rc6).  If there are any
> final comments, speak now.  Thanks,
> 
> Alex

LGTM.



* Re: [PATCH V3 0/9] vfio virtual address update
  2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
                   ` (9 preceding siblings ...)
  2021-01-29 21:55 ` [PATCH V3 0/9] vfio virtual address update Alex Williamson
@ 2021-02-02 17:21 ` Alex Williamson
  10 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2021-02-02 17:21 UTC (permalink / raw)
  To: Steve Sistare; +Cc: kvm, Cornelia Huck, Kirti Wankhede

On Fri, 29 Jan 2021 08:54:03 -0800
Steve Sistare <steven.sistare@oracle.com> wrote:

> [...]
> 
> Changes in V2:
>  - define a flag for unmap all instead of special range values
>  - define the VFIO_UNMAP_ALL extension
>  - forbid the combination of unmap-all and get-dirty-bitmap
>  - unwind in unmap on vaddr error
>  - add a new function to find first dma in a range instead of modifying
>    an existing function
>  - change names of update flags
>  - fix concurrency bugs due to iommu lock being dropped
>  - call down from from vfio to a new backend interface instead of up from
>    driver to detect container close
>  - use wait/wake instead of sleep and polling
>  - refine the uapi specification
>  - split patches into vfio vs type1
> 
> Changes in V3:
>  - add vaddr_invalid_count to fix pin_pages race with unmap
>  - refactor the wait helper functions
>  - traverse dma entries more efficiently in unmap
>  - check unmap flag conflicts more explicitly
>  - rename some local variables and functions
> 
> Steve Sistare (9):
>   vfio: option to unmap all
>   vfio/type1: unmap cleanup
>   vfio/type1: implement unmap all
>   vfio: interfaces to update vaddr
>   vfio/type1: massage unmap iteration
>   vfio/type1: implement interfaces to update vaddr
>   vfio: iommu driver notify callback
>   vfio/type1: implement notify callback
>   vfio/type1: block on invalid vaddr
> 
>  drivers/vfio/vfio.c             |   5 +
>  drivers/vfio/vfio_iommu_type1.c | 251 +++++++++++++++++++++++++++++++++++-----
>  include/linux/vfio.h            |   5 +
>  include/uapi/linux/vfio.h       |  27 +++++
>  4 files changed, 256 insertions(+), 32 deletions(-)
> 

Applied to vfio next branch for v5.12 with discussed changes and
Connie's R-b.  Thanks,

Alex



Thread overview: 37+ messages
2021-01-29 16:54 [PATCH V3 0/9] vfio virtual address update Steve Sistare
2021-01-29 16:54 ` [PATCH V3 1/9] vfio: option to unmap all Steve Sistare
2021-02-01 11:42   ` Cornelia Huck
2021-02-01 12:52     ` Steven Sistare
2021-02-01 13:13       ` Cornelia Huck
2021-01-29 16:54 ` [PATCH V3 2/9] vfio/type1: unmap cleanup Steve Sistare
2021-02-01 11:50   ` Cornelia Huck
2021-01-29 16:54 ` [PATCH V3 3/9] vfio/type1: implement unmap all Steve Sistare
2021-02-01 11:59   ` Cornelia Huck
2021-01-29 16:54 ` [PATCH V3 4/9] vfio: interfaces to update vaddr Steve Sistare
2021-02-01 13:13   ` Cornelia Huck
2021-01-29 16:54 ` [PATCH V3 5/9] vfio/type1: massage unmap iteration Steve Sistare
2021-01-29 21:56   ` Alex Williamson
2021-01-30 16:51     ` Steven Sistare
2021-02-01 12:11   ` Cornelia Huck
2021-01-29 16:54 ` [PATCH V3 6/9] vfio/type1: implement interfaces to update vaddr Steve Sistare
2021-01-29 21:56   ` Alex Williamson
2021-01-30 16:51     ` Steven Sistare
2021-02-01 12:27   ` Cornelia Huck
2021-01-29 16:54 ` [PATCH V3 7/9] vfio: iommu driver notify callback Steve Sistare
2021-01-29 21:57   ` Alex Williamson
2021-01-30 16:51     ` Steven Sistare
2021-02-01 12:34       ` Cornelia Huck
2021-02-01 12:52         ` Steven Sistare
2021-02-01 13:17           ` Cornelia Huck
2021-02-01 13:17   ` Cornelia Huck
2021-01-29 16:54 ` [PATCH V3 8/9] vfio/type1: implement " Steve Sistare
2021-02-01 12:40   ` Cornelia Huck
2021-01-29 16:54 ` [PATCH V3 9/9] vfio/type1: block on invalid vaddr Steve Sistare
2021-02-01 12:47   ` Cornelia Huck
2021-01-29 21:55 ` [PATCH V3 0/9] vfio virtual address update Alex Williamson
2021-01-29 22:14   ` Steven Sistare
2021-01-30 16:54   ` Steven Sistare
2021-02-01 20:23     ` Alex Williamson
2021-02-01 21:13       ` Steven Sistare
2021-02-02  7:54       ` Cornelia Huck
2021-02-02 17:21 ` Alex Williamson
