* [PATCH 0/7] vfio: iommu_type1: Some fixes and optimization
@ 2020-12-10  7:34 Keqian Zhu
  2020-12-10  7:34 ` [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin Keqian Zhu
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Keqian Zhu @ 2020-12-10  7:34 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Alex Williamson, Cornelia Huck, Marc Zyngier, Will Deacon,
	Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun,
	Keqian Zhu

Hi folks,

This patch series aims to fix and optimize some of the vfio dirty
log tracking code.

patch   1: Optimize dirty log handling when unwinding pinned pages.
patch 2-3: Optimize promoting pinned_page_dirty_scope.
patch   4: Fix missing dirty log when promoting pinned_page_dirty_scope.
patch 5-7: Drop superfluous parameter "pgsize" of some functions.

Hopefully these improve the robustness of vfio dirty log tracking.

Thanks,
Keqian

Keqian Zhu (7):
  vfio: iommu_type1: Clear added dirty bit when unwind pin
  vfio: iommu_type1: Initially set the pinned_page_dirty_scope
  vfio: iommu_type1: Make an explicit "promote" semantic
  vfio: iommu_type1: Fix missing dirty page when promote pinned_scope
  vfio: iommu_type1: Drop parameter "pgsize" of
    vfio_dma_bitmap_alloc_all
  vfio: iommu_type1: Drop parameter "pgsize" of vfio_iova_dirty_bitmap
  vfio: iommu_type1: Drop parameter "pgsize" of update_user_bitmap

 drivers/vfio/vfio_iommu_type1.c | 108 +++++++++++++++++++-------------
 1 file changed, 65 insertions(+), 43 deletions(-)

-- 
2.23.0



* [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin
  2020-12-10  7:34 [PATCH 0/7] vfio: iommu_type1: Some fixes and optimization Keqian Zhu
@ 2020-12-10  7:34 ` Keqian Zhu
  2020-12-10 19:16   ` Alex Williamson
  2020-12-16  7:22   ` [kbuild] " Dan Carpenter
  2020-12-10  7:34 ` [PATCH 2/7] vfio: iommu_type1: Initially set the pinned_page_dirty_scope Keqian Zhu
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 16+ messages in thread
From: Keqian Zhu @ 2020-12-10  7:34 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Alex Williamson, Cornelia Huck, Marc Zyngier, Will Deacon,
	Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun,
	Keqian Zhu

Currently we do not clear the dirty bits added to the bitmap when we
unwind a pin, so if pinning fails halfway through, unnecessary dirty
bits are left set in the bitmap. By clearing the added dirty bits
during unwind, userspace sees fewer dirty pages, which saves it
considerable time handling them.

Note that we must distinguish the bits added by this pin from the
bits already set before the pin, so introduce bitmap_added to record
this.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 67e827638995..f129d24a6ec3 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -637,7 +637,11 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_group *group;
 	int i, j, ret;
+	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
 	unsigned long remote_vaddr;
+	unsigned long bitmap_offset;
+	unsigned long *bitmap_added;
+	dma_addr_t iova;
 	struct vfio_dma *dma;
 	bool do_accounting;
 
@@ -650,6 +654,12 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 
 	mutex_lock(&iommu->lock);
 
+	bitmap_added = bitmap_zalloc(npage, GFP_KERNEL);
+	if (!bitmap_added) {
+		ret = -ENOMEM;
+		goto pin_done;
+	}
+
 	/* Fail if notifier list is empty */
 	if (!iommu->notifier.head) {
 		ret = -EINVAL;
@@ -664,7 +674,6 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
 
 	for (i = 0; i < npage; i++) {
-		dma_addr_t iova;
 		struct vfio_pfn *vpfn;
 
 		iova = user_pfn[i] << PAGE_SHIFT;
@@ -699,14 +708,10 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 		}
 
 		if (iommu->dirty_page_tracking) {
-			unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
-
-			/*
-			 * Bitmap populated with the smallest supported page
-			 * size
-			 */
-			bitmap_set(dma->bitmap,
-				   (iova - dma->iova) >> pgshift, 1);
+			/* Populated with the smallest supported page size */
+			bitmap_offset = (iova - dma->iova) >> pgshift;
+			if (!test_and_set_bit(bitmap_offset, dma->bitmap))
+				set_bit(i, bitmap_added);
 		}
 	}
 	ret = i;
@@ -722,14 +727,20 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 pin_unwind:
 	phys_pfn[i] = 0;
 	for (j = 0; j < i; j++) {
-		dma_addr_t iova;
-
 		iova = user_pfn[j] << PAGE_SHIFT;
 		dma = vfio_find_dma(iommu, iova, PAGE_SIZE);
 		vfio_unpin_page_external(dma, iova, do_accounting);
 		phys_pfn[j] = 0;
+
+		if (test_bit(j, bitmap_added)) {
+			bitmap_offset = (iova - dma->iova) >> pgshift;
+			clear_bit(bitmap_offset, dma->bitmap);
+		}
 	}
 pin_done:
+	if (bitmap_added)
+		bitmap_free(bitmap_added);
+
 	mutex_unlock(&iommu->lock);
 	return ret;
 }
-- 
2.23.0


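For readers skimming the diff above, the core pattern is small: remember
which bits this call newly set, and clear exactly those bits on failure.
A minimal standalone sketch of the pattern (hypothetical helper names,
not the posted code):

#include <linux/bitmap.h>
#include <linux/bitops.h>

/* Set @bit in @dst, recording in @added whether this call set it. */
static void set_and_record(unsigned long *dst, unsigned long *added,
			   unsigned long bit)
{
	/* test_and_set_bit() returns the old value; 0 means we set it */
	if (!test_and_set_bit(bit, dst))
		set_bit(bit, added);
}

/* Unwind: clear only the bits this caller added, up to @nbits. */
static void unwind_recorded(unsigned long *dst, unsigned long *added,
			    unsigned long nbits)
{
	unsigned long bit;

	for_each_set_bit(bit, added, nbits)
		clear_bit(bit, dst);
}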

* [PATCH 2/7] vfio: iommu_type1: Initially set the pinned_page_dirty_scope
  2020-12-10  7:34 [PATCH 0/7] vfio: iommu_type1: Some fixes and optimization Keqian Zhu
  2020-12-10  7:34 ` [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin Keqian Zhu
@ 2020-12-10  7:34 ` Keqian Zhu
  2020-12-10  7:34 ` [PATCH 3/7] vfio: iommu_type1: Make an explicit "promote" semantic Keqian Zhu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Keqian Zhu @ 2020-12-10  7:34 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Alex Williamson, Cornelia Huck, Marc Zyngier, Will Deacon,
	Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun,
	Keqian Zhu

Currently there are 3 ways to promote the pinned_page_dirty_scope
status of vfio_iommu:

1. Through pin interface.
2. Detach a group without dirty tracking.
3. Attach a group with dirty tracking.

For point 3, the only case in which the pinned status can change is
when the vfio_iommu is newly created.

Since we can safely set the pinned status when creating a new
vfio_iommu, doing so lets us remove point 3 and save some operations.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index f129d24a6ec3..c52bcefba96b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2064,12 +2064,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 			 * Non-iommu backed group cannot dirty memory directly,
 			 * it can only use interfaces that provide dirty
 			 * tracking.
-			 * The iommu scope can only be promoted with the
-			 * addition of a dirty tracking group.
 			 */
 			group->pinned_page_dirty_scope = true;
-			if (!iommu->pinned_page_dirty_scope)
-				update_pinned_page_dirty_scope(iommu);
 			mutex_unlock(&iommu->lock);
 
 			return 0;
@@ -2457,6 +2453,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	INIT_LIST_HEAD(&iommu->iova_list);
 	iommu->dma_list = RB_ROOT;
 	iommu->dma_avail = dma_entry_limit;
+	iommu->pinned_page_dirty_scope = true;
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 
-- 
2.23.0



* [PATCH 3/7] vfio: iommu_type1: Make an explicit "promote" semantic
  2020-12-10  7:34 [PATCH 0/7] vfio: iommu_type1: Some fixes and optimization Keqian Zhu
  2020-12-10  7:34 ` [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin Keqian Zhu
  2020-12-10  7:34 ` [PATCH 2/7] vfio: iommu_type1: Initially set the pinned_page_dirty_scope Keqian Zhu
@ 2020-12-10  7:34 ` Keqian Zhu
  2020-12-10  7:34 ` [PATCH 4/7] vfio: iommu_type1: Fix missing dirty page when promote pinned_scope Keqian Zhu
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Keqian Zhu @ 2020-12-10  7:34 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Alex Williamson, Cornelia Huck, Marc Zyngier, Will Deacon,
	Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun,
	Keqian Zhu

When we want to promote the pinned_page_dirty_scope of a vfio_iommu,
we must call the "update" function to visit all vfio_groups, but when
we want to downgrade it, we can set the flag directly.

Given the above, we can give that function an explicit "promote"
semantic. As a bonus, if the vfio_iommu has already been promoted,
the function can return early.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c52bcefba96b..bd9a94590ebc 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -148,7 +148,7 @@ static int put_pfn(unsigned long pfn, int prot);
 static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 					       struct iommu_group *iommu_group);
 
-static void update_pinned_page_dirty_scope(struct vfio_iommu *iommu);
+static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu);
 /*
  * This code handles mapping and unmapping of user data buffers
  * into DMA'ble space using the IOMMU
@@ -719,7 +719,7 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	group = vfio_iommu_find_iommu_group(iommu, iommu_group);
 	if (!group->pinned_page_dirty_scope) {
 		group->pinned_page_dirty_scope = true;
-		update_pinned_page_dirty_scope(iommu);
+		promote_pinned_page_dirty_scope(iommu);
 	}
 
 	goto pin_done;
@@ -1633,27 +1633,26 @@ static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 	return group;
 }
 
-static void update_pinned_page_dirty_scope(struct vfio_iommu *iommu)
+static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
 {
 	struct vfio_domain *domain;
 	struct vfio_group *group;
 
+	if (iommu->pinned_page_dirty_scope)
+		return;
+
 	list_for_each_entry(domain, &iommu->domain_list, next) {
 		list_for_each_entry(group, &domain->group_list, next) {
-			if (!group->pinned_page_dirty_scope) {
-				iommu->pinned_page_dirty_scope = false;
+			if (!group->pinned_page_dirty_scope)
 				return;
-			}
 		}
 	}
 
 	if (iommu->external_domain) {
 		domain = iommu->external_domain;
 		list_for_each_entry(group, &domain->group_list, next) {
-			if (!group->pinned_page_dirty_scope) {
-				iommu->pinned_page_dirty_scope = false;
+			if (!group->pinned_page_dirty_scope)
 				return;
-			}
 		}
 	}
 
@@ -2348,7 +2347,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain;
 	struct vfio_group *group;
-	bool update_dirty_scope = false;
+	bool promote_dirty_scope = false;
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
@@ -2356,7 +2355,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	if (iommu->external_domain) {
 		group = find_iommu_group(iommu->external_domain, iommu_group);
 		if (group) {
-			update_dirty_scope = !group->pinned_page_dirty_scope;
+			promote_dirty_scope = !group->pinned_page_dirty_scope;
 			list_del(&group->next);
 			kfree(group);
 
@@ -2386,7 +2385,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			continue;
 
 		vfio_iommu_detach_group(domain, group);
-		update_dirty_scope = !group->pinned_page_dirty_scope;
+		promote_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
 		kfree(group);
 		/*
@@ -2422,8 +2421,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	 * Removal of a group without dirty tracking may allow the iommu scope
 	 * to be promoted.
 	 */
-	if (update_dirty_scope)
-		update_pinned_page_dirty_scope(iommu);
+	if (promote_dirty_scope)
+		promote_pinned_page_dirty_scope(iommu);
 	mutex_unlock(&iommu->lock);
 }
 
-- 
2.23.0



* [PATCH 4/7] vfio: iommu_type1: Fix missing dirty page when promote pinned_scope
  2020-12-10  7:34 [PATCH 0/7] vfio: iommu_type1: Some fixes and optimization Keqian Zhu
                   ` (2 preceding siblings ...)
  2020-12-10  7:34 ` [PATCH 3/7] vfio: iommu_type1: Make an explicit "promote" semantic Keqian Zhu
@ 2020-12-10  7:34 ` Keqian Zhu
  2020-12-15  0:04   ` Alex Williamson
  2020-12-10  7:34 ` [PATCH 5/7] vfio: iommu_type1: Drop parameter "pgsize" of vfio_dma_bitmap_alloc_all Keqian Zhu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Keqian Zhu @ 2020-12-10  7:34 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Alex Williamson, Cornelia Huck, Marc Zyngier, Will Deacon,
	Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun,
	Keqian Zhu

When we pin or detach a group which is not dirty tracking capable,
we will try to promote the pinned_scope of the vfio_iommu.

If we succeed, vfio reports only the pinned scope as dirty to
userspace from then on, but memory written before the pin or detach
is missed.

The solution is to mark all dma ranges as dirty before promoting
the pinned_scope of the vfio_iommu.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index bd9a94590ebc..00684597b098 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1633,6 +1633,20 @@ static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 	return group;
 }
 
+static void vfio_populate_bitmap_all(struct vfio_iommu *iommu)
+{
+	struct rb_node *n;
+	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
+
+	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
+		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
+		unsigned long nbits = dma->size >> pgshift;
+
+		if (dma->iommu_mapped)
+			bitmap_set(dma->bitmap, 0, nbits);
+	}
+}
+
 static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
 {
 	struct vfio_domain *domain;
@@ -1657,6 +1671,10 @@ static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
 	}
 
 	iommu->pinned_page_dirty_scope = true;
+
+	/* Set all bitmap to avoid missing dirty page */
+	if (iommu->dirty_page_tracking)
+		vfio_populate_bitmap_all(iommu);
 }
 
 static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
-- 
2.23.0


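To make the missed-dirty window the patch describes concrete, an
illustrative timeline (not part of the posted series):

/*
 * t0: dirty tracking starts; a non-pinned-scope group is attached,
 *     so vfio conservatively reports all iommu-mapped memory dirty.
 * t1: the device writes page P through the IOMMU; nothing records
 *     this, because "fully dirty" scope needs no per-page record.
 * t2: the last non-pinned-scope group detaches; the vfio_iommu is
 *     promoted to pinned_page_dirty_scope.
 * t3: userspace reads the dirty bitmap; only pinned pages are
 *     reported, so the write to P at t1 would be lost.
 *
 * The fix marks every dma range dirty once at t2, so the read at t3
 * still covers P; later reads then use the narrower pinned scope.
 */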

* [PATCH 5/7] vfio: iommu_type1: Drop parameter "pgsize" of vfio_dma_bitmap_alloc_all
  2020-12-10  7:34 [PATCH 0/7] vfio: iommu_type1: Some fixes and optimization Keqian Zhu
                   ` (3 preceding siblings ...)
  2020-12-10  7:34 ` [PATCH 4/7] vfio: iommu_type1: Fix missing dirty page when promote pinned_scope Keqian Zhu
@ 2020-12-10  7:34 ` Keqian Zhu
  2020-12-10  7:34 ` [PATCH 6/7] vfio: iommu_type1: Drop parameter "pgsize" of vfio_iova_dirty_bitmap Keqian Zhu
  2020-12-10  7:34 ` [PATCH 7/7] vfio: iommu_type1: Drop parameter "pgsize" of update_user_bitmap Keqian Zhu
  6 siblings, 0 replies; 16+ messages in thread
From: Keqian Zhu @ 2020-12-10  7:34 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Alex Williamson, Cornelia Huck, Marc Zyngier, Will Deacon,
	Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun,
	Keqian Zhu

We always use the smallest supported page size of the vfio_iommu as
pgsize, so drop the "pgsize" parameter of vfio_dma_bitmap_alloc_all.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 00684597b098..32ab889c8193 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -236,9 +236,10 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
 	}
 }
 
-static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu, size_t pgsize)
+static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
 {
 	struct rb_node *n;
+	size_t pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
 
 	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
 		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
@@ -2798,12 +2799,9 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 		return -EINVAL;
 
 	if (dirty.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_START) {
-		size_t pgsize;
-
 		mutex_lock(&iommu->lock);
-		pgsize = 1 << __ffs(iommu->pgsize_bitmap);
 		if (!iommu->dirty_page_tracking) {
-			ret = vfio_dma_bitmap_alloc_all(iommu, pgsize);
+			ret = vfio_dma_bitmap_alloc_all(iommu);
 			if (!ret)
 				iommu->dirty_page_tracking = true;
 		}
-- 
2.23.0


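The derivation the patch leans on: the smallest supported page size is
the lowest set bit of pgsize_bitmap, so it can always be recomputed
locally instead of being passed around. For example (bitmap value
illustrative):

	/*
	 * __ffs() returns the index of the least significant set bit,
	 * so 1 << __ffs(pgsize_bitmap) is the smallest supported page
	 * size. E.g. pgsize_bitmap = 0x40201000 (4K | 2M | 1G) gives
	 * pgshift = 12 and pgsize = 4K.
	 */
	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
	size_t pgsize = (size_t)1 << pgshift;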

* [PATCH 6/7] vfio: iommu_type1: Drop parameter "pgsize" of vfio_iova_dirty_bitmap
  2020-12-10  7:34 [PATCH 0/7] vfio: iommu_type1: Some fixes and optimization Keqian Zhu
                   ` (4 preceding siblings ...)
  2020-12-10  7:34 ` [PATCH 5/7] vfio: iommu_type1: Drop parameter "pgsize" of vfio_dma_bitmap_alloc_all Keqian Zhu
@ 2020-12-10  7:34 ` Keqian Zhu
  2020-12-10  7:34 ` [PATCH 7/7] vfio: iommu_type1: Drop parameter "pgsize" of update_user_bitmap Keqian Zhu
  6 siblings, 0 replies; 16+ messages in thread
From: Keqian Zhu @ 2020-12-10  7:34 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Alex Williamson, Cornelia Huck, Marc Zyngier, Will Deacon,
	Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun,
	Keqian Zhu

We always use the smallest supported page size of the vfio_iommu as
pgsize, so drop the "pgsize" parameter of vfio_iova_dirty_bitmap.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 32ab889c8193..2d7a5cd9b916 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1026,11 +1026,12 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 }
 
 static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
-				  dma_addr_t iova, size_t size, size_t pgsize)
+				  dma_addr_t iova, size_t size)
 {
 	struct vfio_dma *dma;
 	struct rb_node *n;
-	unsigned long pgshift = __ffs(pgsize);
+	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
+	size_t pgsize = (size_t)1 << pgshift;
 	int ret;
 
 	/*
@@ -2861,8 +2862,7 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 		if (iommu->dirty_page_tracking)
 			ret = vfio_iova_dirty_bitmap(range.bitmap.data,
 						     iommu, range.iova,
-						     range.size,
-						     range.bitmap.pgsize);
+						     range.size);
 		else
 			ret = -EINVAL;
 out_unlock:
-- 
2.23.0



* [PATCH 7/7] vfio: iommu_type1: Drop parameter "pgsize" of update_user_bitmap
  2020-12-10  7:34 [PATCH 0/7] vfio: iommu_type1: Some fixes and optimization Keqian Zhu
                   ` (5 preceding siblings ...)
  2020-12-10  7:34 ` [PATCH 6/7] vfio: iommu_type1: Drop parameter "pgsize" of vfio_iova_dirty_bitmap Keqian Zhu
@ 2020-12-10  7:34 ` Keqian Zhu
  6 siblings, 0 replies; 16+ messages in thread
From: Keqian Zhu @ 2020-12-10  7:34 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Alex Williamson, Cornelia Huck, Marc Zyngier, Will Deacon,
	Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun,
	Keqian Zhu

We always use the smallest supported page size of the vfio_iommu as
pgsize, so drop the "pgsize" parameter of update_user_bitmap.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 drivers/vfio/vfio_iommu_type1.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2d7a5cd9b916..edb0a6468e8d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -989,10 +989,9 @@ static void vfio_update_pgsize_bitmap(struct vfio_iommu *iommu)
 }
 
 static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
-			      struct vfio_dma *dma, dma_addr_t base_iova,
-			      size_t pgsize)
+			      struct vfio_dma *dma, dma_addr_t base_iova)
 {
-	unsigned long pgshift = __ffs(pgsize);
+	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
 	unsigned long nbits = dma->size >> pgshift;
 	unsigned long bit_offset = (dma->iova - base_iova) >> pgshift;
 	unsigned long copy_offset = bit_offset / BITS_PER_LONG;
@@ -1057,7 +1056,7 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 		if (dma->iova > iova + size - 1)
 			break;
 
-		ret = update_user_bitmap(bitmap, iommu, dma, iova, pgsize);
+		ret = update_user_bitmap(bitmap, iommu, dma, iova);
 		if (ret)
 			return ret;
 
@@ -1203,7 +1202,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 
 		if (unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
 			ret = update_user_bitmap(bitmap->data, iommu, dma,
-						 unmap->iova, pgsize);
+						 unmap->iova);
 			if (ret)
 				break;
 		}
-- 
2.23.0



* Re: [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin
  2020-12-10  7:34 ` [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin Keqian Zhu
@ 2020-12-10 19:16   ` Alex Williamson
  2020-12-11  6:51     ` zhukeqian
  2020-12-16  7:22   ` [kbuild] " Dan Carpenter
  1 sibling, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2020-12-10 19:16 UTC (permalink / raw)
  To: Keqian Zhu
  Cc: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Cornelia Huck, Marc Zyngier, Will Deacon, Robin Murphy,
	Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun

On Thu, 10 Dec 2020 15:34:19 +0800
Keqian Zhu <zhukeqian1@huawei.com> wrote:

> Currently we do not clear the dirty bits added to the bitmap when we
> unwind a pin, so if pinning fails halfway through, unnecessary dirty
> bits are left set in the bitmap. By clearing the added dirty bits
> during unwind, userspace sees fewer dirty pages, which saves it
> considerable time handling them.
> 
> Note that we must distinguish the bits added by this pin from the
> bits already set before the pin, so introduce bitmap_added to record
> this.
> 
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
>  1 file changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 67e827638995..f129d24a6ec3 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -637,7 +637,11 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>  	struct vfio_iommu *iommu = iommu_data;
>  	struct vfio_group *group;
>  	int i, j, ret;
> +	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
>  	unsigned long remote_vaddr;
> +	unsigned long bitmap_offset;
> +	unsigned long *bitmap_added;
> +	dma_addr_t iova;
>  	struct vfio_dma *dma;
>  	bool do_accounting;
>  
> @@ -650,6 +654,12 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>  
>  	mutex_lock(&iommu->lock);
>  
> +	bitmap_added = bitmap_zalloc(npage, GFP_KERNEL);
> +	if (!bitmap_added) {
> +		ret = -ENOMEM;
> +		goto pin_done;
> +	}


This is allocated regardless of whether dirty tracking is enabled, so
this adds overhead to the common case in order to reduce user overhead
(not correctness) in the rare condition that dirty tracking is enabled,
and the even rarer condition that there's a fault during that case.
This is not a good trade-off.  Thanks,

Alex


> +
>  	/* Fail if notifier list is empty */
>  	if (!iommu->notifier.head) {
>  		ret = -EINVAL;
> @@ -664,7 +674,6 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>  	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
>  
>  	for (i = 0; i < npage; i++) {
> -		dma_addr_t iova;
>  		struct vfio_pfn *vpfn;
>  
>  		iova = user_pfn[i] << PAGE_SHIFT;
> @@ -699,14 +708,10 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>  		}
>  
>  		if (iommu->dirty_page_tracking) {
> -			unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> -
> -			/*
> -			 * Bitmap populated with the smallest supported page
> -			 * size
> -			 */
> -			bitmap_set(dma->bitmap,
> -				   (iova - dma->iova) >> pgshift, 1);
> +			/* Populated with the smallest supported page size */
> +			bitmap_offset = (iova - dma->iova) >> pgshift;
> +			if (!test_and_set_bit(bitmap_offset, dma->bitmap))
> +				set_bit(i, bitmap_added);
>  		}
>  	}
>  	ret = i;
> @@ -722,14 +727,20 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>  pin_unwind:
>  	phys_pfn[i] = 0;
>  	for (j = 0; j < i; j++) {
> -		dma_addr_t iova;
> -
>  		iova = user_pfn[j] << PAGE_SHIFT;
>  		dma = vfio_find_dma(iommu, iova, PAGE_SIZE);
>  		vfio_unpin_page_external(dma, iova, do_accounting);
>  		phys_pfn[j] = 0;
> +
> +		if (test_bit(j, bitmap_added)) {
> +			bitmap_offset = (iova - dma->iova) >> pgshift;
> +			clear_bit(bitmap_offset, dma->bitmap);
> +		}
>  	}
>  pin_done:
> +	if (bitmap_added)
> +		bitmap_free(bitmap_added);
> +
>  	mutex_unlock(&iommu->lock);
>  	return ret;
>  }



* Re: [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin
  2020-12-10 19:16   ` Alex Williamson
@ 2020-12-11  6:51     ` zhukeqian
  2020-12-15  0:16       ` Alex Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: zhukeqian @ 2020-12-11  6:51 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Cornelia Huck, Marc Zyngier, Will Deacon, Robin Murphy,
	Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun



On 2020/12/11 3:16, Alex Williamson wrote:
> On Thu, 10 Dec 2020 15:34:19 +0800
> Keqian Zhu <zhukeqian1@huawei.com> wrote:
> 
>> Currently we do not clear the dirty bits added to the bitmap when we
>> unwind a pin, so if pinning fails halfway through, unnecessary dirty
>> bits are left set in the bitmap. By clearing the added dirty bits
>> during unwind, userspace sees fewer dirty pages, which saves it
>> considerable time handling them.
>>
>> Note that we must distinguish the bits added by this pin from the
>> bits already set before the pin, so introduce bitmap_added to record
>> this.
>>
>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
>>  1 file changed, 22 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 67e827638995..f129d24a6ec3 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -637,7 +637,11 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  	struct vfio_iommu *iommu = iommu_data;
>>  	struct vfio_group *group;
>>  	int i, j, ret;
>> +	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
>>  	unsigned long remote_vaddr;
>> +	unsigned long bitmap_offset;
>> +	unsigned long *bitmap_added;
>> +	dma_addr_t iova;
>>  	struct vfio_dma *dma;
>>  	bool do_accounting;
>>  
>> @@ -650,6 +654,12 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  
>>  	mutex_lock(&iommu->lock);
>>  
>> +	bitmap_added = bitmap_zalloc(npage, GFP_KERNEL);
>> +	if (!bitmap_added) {
>> +		ret = -ENOMEM;
>> +		goto pin_done;
>> +	}
> 
> 
> This is allocated regardless of whether dirty tracking is enabled, so
> this adds overhead to the common case in order to reduce user overhead
> (not correctness) in the rare condition that dirty tracking is enabled,
> and the even rarer condition that there's a fault during that case.
> This is not a good trade-off.  Thanks,

Hi Alex,

We could allocate the bitmap only when dirty tracking is active; would
you accept that? Alternatively, we could set the dirty bitmap only after
all pins succeed, but that costs CPU time to locate each vfio_dma by
iova.

Thanks,
Keqian

> 
> Alex
> 
> 
>> +
>>  	/* Fail if notifier list is empty */
>>  	if (!iommu->notifier.head) {
>>  		ret = -EINVAL;
>> @@ -664,7 +674,6 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
>>  
>>  	for (i = 0; i < npage; i++) {
>> -		dma_addr_t iova;
>>  		struct vfio_pfn *vpfn;
>>  
>>  		iova = user_pfn[i] << PAGE_SHIFT;
>> @@ -699,14 +708,10 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  		}
>>  
>>  		if (iommu->dirty_page_tracking) {
>> -			unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
>> -
>> -			/*
>> -			 * Bitmap populated with the smallest supported page
>> -			 * size
>> -			 */
>> -			bitmap_set(dma->bitmap,
>> -				   (iova - dma->iova) >> pgshift, 1);
>> +			/* Populated with the smallest supported page size */
>> +			bitmap_offset = (iova - dma->iova) >> pgshift;
>> +			if (!test_and_set_bit(bitmap_offset, dma->bitmap))
>> +				set_bit(i, bitmap_added);
>>  		}
>>  	}
>>  	ret = i;
>> @@ -722,14 +727,20 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>>  pin_unwind:
>>  	phys_pfn[i] = 0;
>>  	for (j = 0; j < i; j++) {
>> -		dma_addr_t iova;
>> -
>>  		iova = user_pfn[j] << PAGE_SHIFT;
>>  		dma = vfio_find_dma(iommu, iova, PAGE_SIZE);
>>  		vfio_unpin_page_external(dma, iova, do_accounting);
>>  		phys_pfn[j] = 0;
>> +
>> +		if (test_bit(j, bitmap_added)) {
>> +			bitmap_offset = (iova - dma->iova) >> pgshift;
>> +			clear_bit(bitmap_offset, dma->bitmap);
>> +		}
>>  	}
>>  pin_done:
>> +	if (bitmap_added)
>> +		bitmap_free(bitmap_added);
>> +
>>  	mutex_unlock(&iommu->lock);
>>  	return ret;
>>  }
> 
> .
> 

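A minimal sketch of the first alternative mentioned above (allocate the
unwind bitmap only while dirty tracking is active), reusing the variable
names from the patch context; illustrative, not a posted revision:

	unsigned long *bitmap_added = NULL;

	mutex_lock(&iommu->lock);

	/* The common (tracking-off) path stays allocation-free */
	if (iommu->dirty_page_tracking) {
		bitmap_added = bitmap_zalloc(npage, GFP_KERNEL);
		if (!bitmap_added) {
			ret = -ENOMEM;
			goto pin_done;
		}
	}

	/* ... pin loop unchanged ... */

pin_unwind:
	/* bitmap_added is only consulted when it was allocated */
	if (bitmap_added && test_bit(j, bitmap_added)) {
		bitmap_offset = (iova - dma->iova) >> pgshift;
		clear_bit(bitmap_offset, dma->bitmap);
	}

pin_done:
	bitmap_free(bitmap_added);	/* kfree-based, NULL-safe */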

* Re: [PATCH 4/7] vfio: iommu_type1: Fix missing dirty page when promote pinned_scope
  2020-12-10  7:34 ` [PATCH 4/7] vfio: iommu_type1: Fix missing dirty page when promote pinned_scope Keqian Zhu
@ 2020-12-15  0:04   ` Alex Williamson
  2020-12-15  9:37     ` zhukeqian
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2020-12-15  0:04 UTC (permalink / raw)
  To: Keqian Zhu
  Cc: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Cornelia Huck, Marc Zyngier, Will Deacon, Robin Murphy,
	Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun

On Thu, 10 Dec 2020 15:34:22 +0800
Keqian Zhu <zhukeqian1@huawei.com> wrote:

> When we pin or detach a group which is not dirty tracking capable,
> we will try to promote the pinned_scope of the vfio_iommu.
> 
> If we succeed, vfio reports only the pinned scope as dirty to
> userspace from then on, but memory written before the pin or detach
> is missed.
> 
> The solution is to mark all dma ranges as dirty before promoting
> the pinned_scope of the vfio_iommu.

Please don't bury fixes patches into a series with other optimizations
and semantic changes.  Send it separately.

> 
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index bd9a94590ebc..00684597b098 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1633,6 +1633,20 @@ static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
>  	return group;
>  }
>  
> +static void vfio_populate_bitmap_all(struct vfio_iommu *iommu)
> +{
> +	struct rb_node *n;
> +	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> +
> +	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
> +		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
> +		unsigned long nbits = dma->size >> pgshift;
> +
> +		if (dma->iommu_mapped)
> +			bitmap_set(dma->bitmap, 0, nbits);
> +	}
> +}


If we detach a group which results in only non-IOMMU backed mdevs,
don't we also clear dma->iommu_mapped as part of vfio_unmap_unpin()
such that this test is invalid?  Thanks,

Alex

> +
>  static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
>  {
>  	struct vfio_domain *domain;
> @@ -1657,6 +1671,10 @@ static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
>  	}
>  
>  	iommu->pinned_page_dirty_scope = true;
> +
> +	/* Set all bitmap to avoid missing dirty page */
> +	if (iommu->dirty_page_tracking)
> +		vfio_populate_bitmap_all(iommu);
>  }
>  
>  static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,



* Re: [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin
  2020-12-11  6:51     ` zhukeqian
@ 2020-12-15  0:16       ` Alex Williamson
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Williamson @ 2020-12-15  0:16 UTC (permalink / raw)
  To: zhukeqian
  Cc: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Cornelia Huck, Marc Zyngier, Will Deacon, Robin Murphy,
	Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun

On Fri, 11 Dec 2020 14:51:47 +0800
zhukeqian <zhukeqian1@huawei.com> wrote:

> On 2020/12/11 3:16, Alex Williamson wrote:
> > On Thu, 10 Dec 2020 15:34:19 +0800
> > Keqian Zhu <zhukeqian1@huawei.com> wrote:
> >   
> >> Currently we do not clear the dirty bits added to the bitmap when we
> >> unwind a pin, so if pinning fails halfway through, unnecessary dirty
> >> bits are left set in the bitmap. By clearing the added dirty bits
> >> during unwind, userspace sees fewer dirty pages, which saves it
> >> considerable time handling them.
> >>
> >> Note that we must distinguish the bits added by this pin from the
> >> bits already set before the pin, so introduce bitmap_added to record
> >> this.
> >>
> >> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> >> ---
> >>  drivers/vfio/vfio_iommu_type1.c | 33 ++++++++++++++++++++++-----------
> >>  1 file changed, 22 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index 67e827638995..f129d24a6ec3 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -637,7 +637,11 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
> >>  	struct vfio_iommu *iommu = iommu_data;
> >>  	struct vfio_group *group;
> >>  	int i, j, ret;
> >> +	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> >>  	unsigned long remote_vaddr;
> >> +	unsigned long bitmap_offset;
> >> +	unsigned long *bitmap_added;
> >> +	dma_addr_t iova;
> >>  	struct vfio_dma *dma;
> >>  	bool do_accounting;
> >>  
> >> @@ -650,6 +654,12 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
> >>  
> >>  	mutex_lock(&iommu->lock);
> >>  
> >> +	bitmap_added = bitmap_zalloc(npage, GFP_KERNEL);
> >> +	if (!bitmap_added) {
> >> +		ret = -ENOMEM;
> >> +		goto pin_done;
> >> +	}  
> > 
> > 
> > This is allocated regardless of whether dirty tracking is enabled, so
> > this adds overhead to the common case in order to reduce user overhead
> > (not correctness) in the rare condition that dirty tracking is enabled,
> > and the even rarer condition that there's a fault during that case.
> > This is not a good trade-off.  Thanks,  
> 
> Hi Alex,
> 
> We could allocate the bitmap only when dirty tracking is active; would
> you accept that? Alternatively, we could set the dirty bitmap only after
> all pins succeed, but that costs CPU time to locate each vfio_dma by
> iova.

TBH I don't see this as a terribly significant problem, in the rare
event of an error with dirty tracking enabled, the user might see some
pages marked dirty that were not successfully pinned by the mdev vendor
driver.  The solution shouldn't impose more overhead than the original
issue.  Thanks,

Alex

> >> +
> >>  	/* Fail if notifier list is empty */
> >>  	if (!iommu->notifier.head) {
> >>  		ret = -EINVAL;
> >> @@ -664,7 +674,6 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
> >>  	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
> >>  
> >>  	for (i = 0; i < npage; i++) {
> >> -		dma_addr_t iova;
> >>  		struct vfio_pfn *vpfn;
> >>  
> >>  		iova = user_pfn[i] << PAGE_SHIFT;
> >> @@ -699,14 +708,10 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
> >>  		}
> >>  
> >>  		if (iommu->dirty_page_tracking) {
> >> -			unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> >> -
> >> -			/*
> >> -			 * Bitmap populated with the smallest supported page
> >> -			 * size
> >> -			 */
> >> -			bitmap_set(dma->bitmap,
> >> -				   (iova - dma->iova) >> pgshift, 1);
> >> +			/* Populated with the smallest supported page size */
> >> +			bitmap_offset = (iova - dma->iova) >> pgshift;
> >> +			if (!test_and_set_bit(bitmap_offset, dma->bitmap))
> >> +				set_bit(i, bitmap_added);
> >>  		}
> >>  	}
> >>  	ret = i;
> >> @@ -722,14 +727,20 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
> >>  pin_unwind:
> >>  	phys_pfn[i] = 0;
> >>  	for (j = 0; j < i; j++) {
> >> -		dma_addr_t iova;
> >> -
> >>  		iova = user_pfn[j] << PAGE_SHIFT;
> >>  		dma = vfio_find_dma(iommu, iova, PAGE_SIZE);
> >>  		vfio_unpin_page_external(dma, iova, do_accounting);
> >>  		phys_pfn[j] = 0;
> >> +
> >> +		if (test_bit(j, bitmap_added)) {
> >> +			bitmap_offset = (iova - dma->iova) >> pgshift;
> >> +			clear_bit(bitmap_offset, dma->bitmap);
> >> +		}
> >>  	}
> >>  pin_done:
> >> +	if (bitmap_added)
> >> +		bitmap_free(bitmap_added);
> >> +
> >>  	mutex_unlock(&iommu->lock);
> >>  	return ret;
> >>  }  
> > 
> > .
> >   
> 



* Re: [PATCH 4/7] vfio: iommu_type1: Fix missing dirty page when promote pinned_scope
  2020-12-15  0:04   ` Alex Williamson
@ 2020-12-15  9:37     ` zhukeqian
  2020-12-15 15:53       ` Alex Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: zhukeqian @ 2020-12-15  9:37 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Cornelia Huck, Marc Zyngier, Will Deacon, Robin Murphy,
	Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun

Hi Alex,

On 2020/12/15 8:04, Alex Williamson wrote:
> On Thu, 10 Dec 2020 15:34:22 +0800
> Keqian Zhu <zhukeqian1@huawei.com> wrote:
> 
>> When we pin or detach a group which is not dirty tracking capable,
>> we will try to promote the pinned_scope of the vfio_iommu.
>>
>> If we succeed, vfio reports only the pinned scope as dirty to
>> userspace from then on, but memory written before the pin or detach
>> is missed.
>>
>> The solution is to mark all dma ranges as dirty before promoting
>> the pinned_scope of the vfio_iommu.
> 
> Please don't bury fixes patches into a series with other optimizations
> and semantic changes.  Send it separately.
> 
OK, I will.

>>
>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 18 ++++++++++++++++++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index bd9a94590ebc..00684597b098 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -1633,6 +1633,20 @@ static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
>>  	return group;
>>  }
>>  
>> +static void vfio_populate_bitmap_all(struct vfio_iommu *iommu)
>> +{
>> +	struct rb_node *n;
>> +	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
>> +
>> +	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
>> +		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
>> +		unsigned long nbits = dma->size >> pgshift;
>> +
>> +		if (dma->iommu_mapped)
>> +			bitmap_set(dma->bitmap, 0, nbits);
>> +	}
>> +}
> 
> 
> If we detach a group which results in only non-IOMMU backed mdevs,
> don't we also clear dma->iommu_mapped as part of vfio_unmap_unpin()
> such that this test is invalid?  Thanks,

Good spot :-). The code will skip bitmap_set in this situation.

We should set the bitmap unconditionally when the vfio_iommu is promoted,
as we must have had an IOMMU-backed domain before promoting the vfio_iommu.

Besides, I think we should also mark pages dirty in vfio_remove_dma if
dirty tracking is active. Right?

Thanks,
Keqian

> 
> Alex
> 
>> +
>>  static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
>>  {
>>  	struct vfio_domain *domain;
>> @@ -1657,6 +1671,10 @@ static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
>>  	}
>>  
>>  	iommu->pinned_page_dirty_scope = true;
>> +
>> +	/* Set all bitmap to avoid missing dirty page */
>> +	if (iommu->dirty_page_tracking)
>> +		vfio_populate_bitmap_all(iommu);
>>  }
>>  
>>  static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
> 
> .
> 

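A sketch of the unconditional variant described above, with the
dma->iommu_mapped check dropped; illustrative only, not a posted v2:

static void vfio_populate_bitmap_all(struct vfio_iommu *iommu)
{
	struct rb_node *n;
	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);

	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);

		/*
		 * Promotion implies an IOMMU-backed domain existed, so
		 * conservatively mark every range dirty, whether or not
		 * dma->iommu_mapped is still set at this point.
		 */
		bitmap_set(dma->bitmap, 0, dma->size >> pgshift);
	}
}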

* Re: [PATCH 4/7] vfio: iommu_type1: Fix missing dirty page when promote pinned_scope
  2020-12-15  9:37     ` zhukeqian
@ 2020-12-15 15:53       ` Alex Williamson
  2020-12-18  8:21         ` Keqian Zhu
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2020-12-15 15:53 UTC (permalink / raw)
  To: zhukeqian
  Cc: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Cornelia Huck, Marc Zyngier, Will Deacon, Robin Murphy,
	Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun

On Tue, 15 Dec 2020 17:37:11 +0800
zhukeqian <zhukeqian1@huawei.com> wrote:

> Hi Alex,
> 
> On 2020/12/15 8:04, Alex Williamson wrote:
> > On Thu, 10 Dec 2020 15:34:22 +0800
> > Keqian Zhu <zhukeqian1@huawei.com> wrote:
> >   
> >> When we pin or detach a group which is not dirty tracking capable,
> >> we will try to promote the pinned_scope of the vfio_iommu.
> >>
> >> If we succeed, vfio reports only the pinned scope as dirty to
> >> userspace from then on, but memory written before the pin or detach
> >> is missed.
> >>
> >> The solution is to mark all dma ranges as dirty before promoting
> >> the pinned_scope of the vfio_iommu.
> > 
> > Please don't bury fixes patches into a series with other optimizations
> > and semantic changes.  Send it separately.
> >   
> OK, I will.
> 
> >>
> >> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> >> ---
> >>  drivers/vfio/vfio_iommu_type1.c | 18 ++++++++++++++++++
> >>  1 file changed, 18 insertions(+)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index bd9a94590ebc..00684597b098 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -1633,6 +1633,20 @@ static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
> >>  	return group;
> >>  }
> >>  
> >> +static void vfio_populate_bitmap_all(struct vfio_iommu *iommu)
> >> +{
> >> +	struct rb_node *n;
> >> +	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> >> +
> >> +	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
> >> +		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
> >> +		unsigned long nbits = dma->size >> pgshift;
> >> +
> >> +		if (dma->iommu_mapped)
> >> +			bitmap_set(dma->bitmap, 0, nbits);
> >> +	}
> >> +}  
> > 
> > 
> > If we detach a group which results in only non-IOMMU backed mdevs,
> > don't we also clear dma->iommu_mapped as part of vfio_unmap_unpin()
> > such that this test is invalid?  Thanks,  
> 
> Good spot :-). The code will skip bitmap_set in this situation.
> 
> We should set the bitmap unconditionally when the vfio_iommu is promoted,
> as we must have had an IOMMU-backed domain before promoting the vfio_iommu.
> 
> Besides, I think we should also mark pages dirty in vfio_remove_dma if
> dirty tracking is active. Right?

There's no remaining bitmap to mark dirty if the vfio_dma is removed.
In this case it's the user's responsibility to collect remaining dirty
pages using the VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP support in the
VFIO_IOMMU_UNMAP_DMA ioctl.  Thanks,

Alex

> >> +
> >>  static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
> >>  {
> >>  	struct vfio_domain *domain;
> >> @@ -1657,6 +1671,10 @@ static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
> >>  	}
> >>  
> >>  	iommu->pinned_page_dirty_scope = true;
> >> +
> >> +	/* Set all bitmap to avoid missing dirty page */
> >> +	if (iommu->dirty_page_tracking)
> >> +		vfio_populate_bitmap_all(iommu);
> >>  }
> >>  
> >>  static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,  
> > 
> > .
> >   
> 


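For context, the userspace flow Alex refers to looks roughly like this
(a sketch against the linux/vfio.h uapi of this era; iova, size, pgsize,
dirty_bits and container_fd are assumed to be set up by the caller, and
error handling is omitted):

	struct vfio_iommu_type1_dma_unmap *unmap;
	struct vfio_bitmap *bitmap;
	__u64 npages = size / pgsize;

	unmap = calloc(1, sizeof(*unmap) + sizeof(*bitmap));
	unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
	unmap->flags = VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP;
	unmap->iova = iova;
	unmap->size = size;

	/* vfio_bitmap describes where the kernel writes the dirty bits */
	bitmap = (struct vfio_bitmap *)unmap->data;
	bitmap->pgsize = pgsize;			/* smallest supported page size */
	bitmap->size = ((npages + 63) / 64) * 8;	/* bitmap bytes, u64-aligned */
	bitmap->data = dirty_bits;			/* __u64 * user buffer */

	/* Unmap the range and collect its remaining dirty pages */
	ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, unmap);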

* [kbuild] Re: [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin
  2020-12-10  7:34 ` [PATCH 1/7] vfio: iommu_type1: Clear added dirty bit when unwind pin Keqian Zhu
  2020-12-10 19:16   ` Alex Williamson
@ 2020-12-16  7:22   ` Dan Carpenter
  1 sibling, 0 replies; 16+ messages in thread
From: Dan Carpenter @ 2020-12-16  7:22 UTC (permalink / raw)
  To: kbuild, Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, kvm,
	kvmarm, Alex Williamson, Cornelia Huck, Marc Zyngier,
	Will Deacon, Robin Murphy
  Cc: lkp, kbuild-all


Hi Keqian,

url:    https://github.com/0day-ci/linux/commits/Keqian-Zhu/vfio-iommu_type1-Some-fixes-and-optimization/20201210-154322 
base:   https://github.com/awilliam/linux-vfio.git  next
config: x86_64-randconfig-m001-20201215 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
drivers/vfio/vfio_iommu_type1.c:648 vfio_iommu_type1_pin_pages() warn: variable dereferenced before check 'iommu' (see line 640)

vim +/iommu +648 drivers/vfio/vfio_iommu_type1.c

a54eb55045ae9b3 Kirti Wankhede  2016-11-17  631  static int vfio_iommu_type1_pin_pages(void *iommu_data,
95fc87b44104d9a Kirti Wankhede  2020-05-29  632  				      struct iommu_group *iommu_group,
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  633  				      unsigned long *user_pfn,
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  634  				      int npage, int prot,
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  635  				      unsigned long *phys_pfn)
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  636  {
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  637  	struct vfio_iommu *iommu = iommu_data;
95fc87b44104d9a Kirti Wankhede  2020-05-29  638  	struct vfio_group *group;
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  639  	int i, j, ret;
2b172c0ea2a6daf Keqian Zhu      2020-12-10 @640  	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
                                                                                      ^^^^^^^^^^^^^^^^^^^^
The patch introduces a new dereference.

a54eb55045ae9b3 Kirti Wankhede  2016-11-17  641  	unsigned long remote_vaddr;
2b172c0ea2a6daf Keqian Zhu      2020-12-10  642  	unsigned long bitmap_offset;
2b172c0ea2a6daf Keqian Zhu      2020-12-10  643  	unsigned long *bitmap_added;
2b172c0ea2a6daf Keqian Zhu      2020-12-10  644  	dma_addr_t iova;
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  645  	struct vfio_dma *dma;
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  646  	bool do_accounting;
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  647  
a54eb55045ae9b3 Kirti Wankhede  2016-11-17 @648  	if (!iommu || !user_pfn || !phys_pfn)
                                                            ^^^^^^
Checked too late.

a54eb55045ae9b3 Kirti Wankhede  2016-11-17  649  		return -EINVAL;
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  650  
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  651  	/* Supported for v2 version only */
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  652  	if (!iommu->v2)
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  653  		return -EACCES;
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  654  
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  655  	mutex_lock(&iommu->lock);
a54eb55045ae9b3 Kirti Wankhede  2016-11-17  656  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org 


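The straightforward fix for the warning above is to defer the
dereference until after the existing NULL check, e.g. (a sketch, not a
posted fixup):

	struct vfio_iommu *iommu = iommu_data;
	unsigned long pgshift;

	if (!iommu || !user_pfn || !phys_pfn)
		return -EINVAL;

	/* Safe: iommu is known to be non-NULL here */
	pgshift = __ffs(iommu->pgsize_bitmap);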

* Re: [PATCH 4/7] vfio: iommu_type1: Fix missing dirty page when promote pinned_scope
  2020-12-15 15:53       ` Alex Williamson
@ 2020-12-18  8:21         ` Keqian Zhu
  0 siblings, 0 replies; 16+ messages in thread
From: Keqian Zhu @ 2020-12-18  8:21 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-kernel, linux-arm-kernel, iommu, kvm, kvmarm,
	Cornelia Huck, Marc Zyngier, Will Deacon, Robin Murphy,
	Joerg Roedel, Catalin Marinas, James Morse, Suzuki K Poulose,
	Sean Christopherson, Julien Thierry, Mark Brown, Thomas Gleixner,
	Andrew Morton, Alexios Zavras, wanghaibin.wang, jiangkunkun


On 2020/12/15 23:53, Alex Williamson wrote:
> On Tue, 15 Dec 2020 17:37:11 +0800
> zhukeqian <zhukeqian1@huawei.com> wrote:
> 
>> Hi Alex,
>>
>> On 2020/12/15 8:04, Alex Williamson wrote:
[...]

>>>>  
>>>> +static void vfio_populate_bitmap_all(struct vfio_iommu *iommu)
>>>> +{
>>>> +	struct rb_node *n;
>>>> +	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
>>>> +
>>>> +	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
>>>> +		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
>>>> +		unsigned long nbits = dma->size >> pgshift;
>>>> +
>>>> +		if (dma->iommu_mapped)
>>>> +			bitmap_set(dma->bitmap, 0, nbits);
>>>> +	}
>>>> +}  
>>>
>>>
>>> If we detach a group which results in only non-IOMMU backed mdevs,
>>> don't we also clear dma->iommu_mapped as part of vfio_unmap_unpin()
>>> such that this test is invalid?  Thanks,  
>>
>> Good spot :-). The code will skip bitmap_set in this situation.
>>
>> We should set the bitmap unconditionally when the vfio_iommu is promoted,
>> as we must have had an IOMMU-backed domain before promoting the vfio_iommu.
>>
>> Besides, I think we should also mark pages dirty in vfio_remove_dma if
>> dirty tracking is active. Right?
> 
> There's no remaining bitmap to mark dirty if the vfio_dma is removed.
> In this case it's the user's responsibility to collect remaining dirty
> pages using the VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP support in the
> VFIO_IOMMU_UNMAP_DMA ioctl.  Thanks,
> 
Hi Alex,

Thanks for pointing it out. I also notice that vfio_iommu_type1_detach_group
removes all dma ranges (in vfio_iommu_unmap_unpin_all). If this happens
during dirty tracking, we have no chance to report the dirty log to userspace.

Besides, we will certainly add more dirty log tracking methods to VFIO, but
we have no framework to support this, which makes it inconvenient to extend
and easy to lose dirty log.

Given the above, I plan to refactor our dirty tracking code. One core idea is
that we should distinguish Dirty Range Limit (such as pin, fully dirty) from
Real Dirty Track (such as iopf, smmu httu).

Thanks,
Keqian

