linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt
@ 2023-02-25  0:27 Jason Gunthorpe
  2023-02-25  0:27 ` [PATCH 01/14] iommufd: Move isolated msi enforcement to iommufd_device_bind() Jason Gunthorpe
                   ` (14 more replies)
  0 siblings, 15 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

This is the basic functionality for iommufd to support
iommufd_device_replace() and IOMMU_HWPT_ALLOC for physical devices.

iommufd_device_replace() allows changing the HWPT associated with the
device to a new IOAS or HWPT. Replace does this in way that failure leaves
things unchanged, and utilizes the iommu iommu_group_replace_domain() API
to allow the iommu driver to perform an optional non-disruptive change.

IOMMU_HWPT_ALLOC allows HWPTs to be explicitly allocated by the user and
used by attach or replace. At this point it isn't very useful since the
HWPT is the same as the automatically managed HWPT from the IOAS. However
a following series will allow userspace to customize the created HWPT.

The implementation is complicated because we have to introduce some
per-iommu_group memory in iommufd and redo how we think about multi-device
groups to be more explicit. This solves all the locking problems in the
prior attempts.

This series is infrastructure work for the following series which:
 - Add replace for attach
 - Expose replace through VFIO APIs
 - Implement driver parameters for HWPT creation (nesting)

Once review of this is complete I will keep it on a side branch and
accumulate the following series when they are ready so we can have a
stable base and make more incremental progress. When we have all the parts
together to get a full implementation it can go to Linus.

I have this on github:

https://github.com/jgunthorpe/linux/commits/iommufd_hwpt

Jason Gunthorpe (12):
  iommufd: Move isolated msi enforcement to iommufd_device_bind()
  iommufd: Add iommufd_group
  iommufd: Replace the hwpt->devices list with iommufd_group
  iommufd: Use the iommufd_group to avoid duplicate reserved groups and
    msi setup
  iommufd: Make sw_msi_start a group global
  iommufd: Move putting a hwpt to a helper function
  iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc()
  iommufd: Add iommufd_device_replace()
  iommufd: Make destroy_rwsem use a lock class per object type
  iommufd: Add IOMMU_HWPT_ALLOC
  iommufd/selftest: Return the real idev id from selftest mock_domain
  iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC

Nicolin Chen (2):
  iommu: Introduce a new iommu_group_replace_domain() API
  iommufd/selftest: Test iommufd_device_replace()

 drivers/iommu/iommu-priv.h                    |  10 +
 drivers/iommu/iommu.c                         |  30 ++
 drivers/iommu/iommufd/device.c                | 482 +++++++++++++-----
 drivers/iommu/iommufd/hw_pagetable.c          |  96 +++-
 drivers/iommu/iommufd/io_pagetable.c          |   5 +-
 drivers/iommu/iommufd/iommufd_private.h       |  44 +-
 drivers/iommu/iommufd/iommufd_test.h          |   7 +
 drivers/iommu/iommufd/main.c                  |  17 +-
 drivers/iommu/iommufd/selftest.c              |  40 ++
 include/linux/iommufd.h                       |   1 +
 include/uapi/linux/iommufd.h                  |  26 +
 tools/testing/selftests/iommu/iommufd.c       |  64 ++-
 .../selftests/iommu/iommufd_fail_nth.c        |  57 ++-
 tools/testing/selftests/iommu/iommufd_utils.h |  59 ++-
 14 files changed, 758 insertions(+), 180 deletions(-)
 create mode 100644 drivers/iommu/iommu-priv.h


base-commit: ac395958f9156733246b5bc3a481c6d38c321a7a
-- 
2.39.1


^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 01/14] iommufd: Move isolated msi enforcement to iommufd_device_bind()
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-02  7:45   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 02/14] iommufd: Add iommufd_group Jason Gunthorpe
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

With the recent rework this no longer needs to be done at domain
attachment time, we know if the device is usable by iommufd when we bind
it.

The value of msi_device_has_isolated_msi() is not allowed to change while
a driver is bound.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c | 38 ++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index c6f4852a8a0c08..63b65cdfe97f29 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -60,6 +60,26 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 	if (!group)
 		return ERR_PTR(-ENODEV);
 
+	/*
+	 * For historical compat with VFIO the insecure interrupt path is
+	 * allowed if the module parameter is set. Insecure means that a MemWr
+	 * operation from the device (eg a simple DMA) cannot trigger an
+	 * interrupt outside this iommufd context.
+	 */
+	if (!iommufd_selftest_is_mock_dev(dev) &&
+	    !iommu_group_has_isolated_msi(group)) {
+		if (!allow_unsafe_interrupts) {
+			rc = -EPERM;
+			goto out_group_put;
+		}
+
+		dev_warn(
+			dev,
+			"MSI interrupts are not secure, they cannot be isolated by the platform. "
+			"Check that platform features like interrupt remapping are enabled. "
+			"Use the \"allow_unsafe_interrupts\" module parameter to override\n");
+	}
+
 	rc = iommu_device_claim_dma_owner(dev, ictx);
 	if (rc)
 		goto out_group_put;
@@ -146,24 +166,6 @@ static int iommufd_device_setup_msi(struct iommufd_device *idev,
 		 */
 		hwpt->msi_cookie = true;
 	}
-
-	/*
-	 * For historical compat with VFIO the insecure interrupt path is
-	 * allowed if the module parameter is set. Insecure means that a MemWr
-	 * operation from the device (eg a simple DMA) cannot trigger an
-	 * interrupt outside this iommufd context.
-	 */
-	if (!iommufd_selftest_is_mock_dev(idev->dev) &&
-	    !iommu_group_has_isolated_msi(idev->group)) {
-		if (!allow_unsafe_interrupts)
-			return -EPERM;
-
-		dev_warn(
-			idev->dev,
-			"MSI interrupts are not secure, they cannot be isolated by the platform. "
-			"Check that platform features like interrupt remapping are enabled. "
-			"Use the \"allow_unsafe_interrupts\" module parameter to override\n");
-	}
 	return 0;
 }
 
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 02/14] iommufd: Add iommufd_group
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
  2023-02-25  0:27 ` [PATCH 01/14] iommufd: Move isolated msi enforcement to iommufd_device_bind() Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-02  7:55   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group Jason Gunthorpe
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

When the hwpt to device attachment is fairly static we could get
away with the simple approach of keeping track of the groups via a
device list. But with replace this is infeasible.

Add an automatically managed struct that is 1:1 with the iommu_group
per-ictx so we can store the necessary tracking information there.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c          | 128 +++++++++++++++++++++---
 drivers/iommu/iommufd/iommufd_private.h |   9 +-
 drivers/iommu/iommufd/main.c            |   2 +
 3 files changed, 123 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 63b65cdfe97f29..d1e227f310e823 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -15,13 +15,110 @@ MODULE_PARM_DESC(
 	"Allow IOMMUFD to bind to devices even if the platform cannot isolate "
 	"the MSI interrupt window. Enabling this is a security weakness.");
 
+static void iommufd_group_release(struct kref *kref)
+{
+	struct iommufd_group *igroup =
+		container_of(kref, struct iommufd_group, ref);
+
+	xa_cmpxchg(&igroup->ictx->groups, iommu_group_id(igroup->group), igroup,
+		   NULL, GFP_KERNEL);
+	iommu_group_put(igroup->group);
+	kfree(igroup);
+}
+
+static void iommufd_put_group(struct iommufd_group *group)
+{
+	kref_put(&group->ref, iommufd_group_release);
+}
+
+/*
+ * iommufd needs to store some more data for each iommu_group, we keep a
+ * parallel xarray indexed by iommu_group id to hold this instead of putting it
+ * in the core structure. To keep things simple the iommufd_group memory is
+ * unique within the iommufd_ctx. This makes it easy to check there are no
+ * memory leaks.
+ */
+static struct iommufd_group *iommufd_get_group(struct iommufd_ctx *ictx,
+					       struct device *dev)
+{
+	struct iommufd_group *new_igroup;
+	struct iommufd_group *cur_igroup;
+	struct iommufd_group *igroup;
+	struct iommu_group *group;
+	unsigned int id;
+
+	group = iommu_group_get(dev);
+	if (!group)
+		return ERR_PTR(-ENODEV);
+
+	id = iommu_group_id(group);
+
+	xa_lock(&ictx->groups);
+	igroup = xa_load(&ictx->groups, id);
+	if (igroup && igroup->group == group &&
+	    kref_get_unless_zero(&igroup->ref)) {
+		xa_unlock(&ictx->groups);
+		iommu_group_put(group);
+		return igroup;
+	}
+	xa_unlock(&ictx->groups);
+
+	new_igroup = kzalloc(sizeof(*new_igroup), GFP_KERNEL);
+	if (!new_igroup) {
+		iommu_group_put(group);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	kref_init(&new_igroup->ref);
+	/* group reference moves into new_igroup */
+	new_igroup->group = group;
+
+	/*
+	 * All objects using a group reference must put it before their destroy
+	 * completes
+	 */
+	new_igroup->ictx = ictx;
+
+	/*
+	 * We dropped the lock so igroup is invalid. Assume that the
+	 * xa had NULL in it, if this guess is wrong then we will obtain
+	 * the actual value under lock and try again once.
+	 */
+	cur_igroup = NULL;
+	xa_lock(&ictx->groups);
+	while (true) {
+		igroup = __xa_cmpxchg(&ictx->groups, id, cur_igroup, new_igroup,
+				      GFP_KERNEL);
+		if (xa_is_err(igroup)) {
+			xa_unlock(&ictx->groups);
+			iommufd_put_group(new_igroup);
+			return ERR_PTR(xa_err(igroup));
+		}
+
+		/* new_group was successfully installed */
+		if (cur_igroup == igroup) {
+			xa_unlock(&ictx->groups);
+			return new_igroup;
+		}
+
+		/* Check again if the current group is any good */
+		if (igroup && igroup->group == group &&
+		    kref_get_unless_zero(&igroup->ref)) {
+			xa_unlock(&ictx->groups);
+			iommufd_put_group(new_igroup);
+			return igroup;
+		}
+		cur_igroup = igroup;
+	}
+}
+
 void iommufd_device_destroy(struct iommufd_object *obj)
 {
 	struct iommufd_device *idev =
 		container_of(obj, struct iommufd_device, obj);
 
 	iommu_device_release_dma_owner(idev->dev);
-	iommu_group_put(idev->group);
+	iommufd_put_group(idev->igroup);
 	if (!iommufd_selftest_is_mock_dev(idev->dev))
 		iommufd_ctx_put(idev->ictx);
 }
@@ -46,7 +143,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 					   struct device *dev, u32 *id)
 {
 	struct iommufd_device *idev;
-	struct iommu_group *group;
+	struct iommufd_group *igroup;
 	int rc;
 
 	/*
@@ -56,9 +153,9 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 	if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY))
 		return ERR_PTR(-EINVAL);
 
-	group = iommu_group_get(dev);
-	if (!group)
-		return ERR_PTR(-ENODEV);
+	igroup = iommufd_get_group(ictx, dev);
+	if (IS_ERR(igroup))
+		return ERR_CAST(igroup);
 
 	/*
 	 * For historical compat with VFIO the insecure interrupt path is
@@ -67,7 +164,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 	 * interrupt outside this iommufd context.
 	 */
 	if (!iommufd_selftest_is_mock_dev(dev) &&
-	    !iommu_group_has_isolated_msi(group)) {
+	    !iommu_group_has_isolated_msi(igroup->group)) {
 		if (!allow_unsafe_interrupts) {
 			rc = -EPERM;
 			goto out_group_put;
@@ -98,7 +195,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 	/* The calling driver is a user until iommufd_device_unbind() */
 	refcount_inc(&idev->obj.users);
 	/* group refcount moves into iommufd_device */
-	idev->group = group;
+	idev->igroup = igroup;
 
 	/*
 	 * If the caller fails after this success it must call
@@ -113,7 +210,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 out_release_owner:
 	iommu_device_release_dma_owner(dev);
 out_group_put:
-	iommu_group_put(group);
+	iommufd_put_group(igroup);
 	return ERR_PTR(rc);
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_bind, IOMMUFD);
@@ -170,14 +267,14 @@ static int iommufd_device_setup_msi(struct iommufd_device *idev,
 }
 
 static bool iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt,
-					   struct iommu_group *group)
+					   struct iommufd_group *igroup)
 {
 	struct iommufd_device *cur_dev;
 
 	lockdep_assert_held(&hwpt->devices_lock);
 
 	list_for_each_entry(cur_dev, &hwpt->devices, devices_item)
-		if (cur_dev->group == group)
+		if (cur_dev->igroup->group == igroup->group)
 			return true;
 	return false;
 }
@@ -211,7 +308,8 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 	}
 
 	rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt, idev->dev,
-						   idev->group, &sw_msi_start);
+						   idev->igroup->group,
+						   &sw_msi_start);
 	if (rc)
 		return rc;
 
@@ -223,8 +321,8 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 	 * FIXME: Hack around missing a device-centric iommu api, only attach to
 	 * the group once for the first device that is in the group.
 	 */
-	if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
-		rc = iommu_attach_group(hwpt->domain, idev->group);
+	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup)) {
+		rc = iommu_attach_group(hwpt->domain, idev->igroup->group);
 		if (rc)
 			goto err_unresv;
 	}
@@ -237,8 +335,8 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 void iommufd_hw_pagetable_detach(struct iommufd_hw_pagetable *hwpt,
 				 struct iommufd_device *idev)
 {
-	if (!iommufd_hw_pagetable_has_group(hwpt, idev->group))
-		iommu_detach_group(hwpt->domain, idev->group);
+	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup))
+		iommu_detach_group(hwpt->domain, idev->igroup->group);
 	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
 }
 
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index d523ef12890e1e..2544f10dae9aef 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -17,6 +17,7 @@ struct iommufd_device;
 struct iommufd_ctx {
 	struct file *file;
 	struct xarray objects;
+	struct xarray groups;
 
 	u8 account_mode;
 	/* Compatibility with VFIO no iommu */
@@ -262,6 +263,12 @@ void iommufd_hw_pagetable_detach(struct iommufd_hw_pagetable *hwpt,
 				 struct iommufd_device *idev);
 void iommufd_hw_pagetable_destroy(struct iommufd_object *obj);
 
+struct iommufd_group {
+	struct kref ref;
+	struct iommufd_ctx *ictx;
+	struct iommu_group *group;
+};
+
 /*
  * A iommufd_device object represents the binding relationship between a
  * consuming driver and the iommufd. These objects are created/destroyed by
@@ -270,12 +277,12 @@ void iommufd_hw_pagetable_destroy(struct iommufd_object *obj);
 struct iommufd_device {
 	struct iommufd_object obj;
 	struct iommufd_ctx *ictx;
+	struct iommufd_group *igroup;
 	struct iommufd_hw_pagetable *hwpt;
 	/* Head at iommufd_hw_pagetable::devices */
 	struct list_head devices_item;
 	/* always the physical device */
 	struct device *dev;
-	struct iommu_group *group;
 	bool enforce_cache_coherency;
 };
 
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 3fbe636c3d8a69..e5ed5dfa91a0b5 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -183,6 +183,7 @@ static int iommufd_fops_open(struct inode *inode, struct file *filp)
 	}
 
 	xa_init_flags(&ictx->objects, XA_FLAGS_ALLOC1 | XA_FLAGS_ACCOUNT);
+	xa_init(&ictx->groups);
 	ictx->file = filp;
 	filp->private_data = ictx;
 	return 0;
@@ -218,6 +219,7 @@ static int iommufd_fops_release(struct inode *inode, struct file *filp)
 		if (WARN_ON(!destroyed))
 			break;
 	}
+	WARN_ON(!xa_empty(&ictx->groups));
 	kfree(ictx);
 	return 0;
 }
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
  2023-02-25  0:27 ` [PATCH 01/14] iommufd: Move isolated msi enforcement to iommufd_device_bind() Jason Gunthorpe
  2023-02-25  0:27 ` [PATCH 02/14] iommufd: Add iommufd_group Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-02  8:01   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 04/14] iommufd: Use the iommufd_group to avoid duplicate reserved groups and msi setup Jason Gunthorpe
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

The devices list was used as a simple way to avoid having per-group
information. Now that this seems to be unavoidable, just commit to
per-group information fully and remove the devices list.

The iommufd_group stores the currently assigned hwpt for the entire group
and we can manage the per-device attach/detach with a simple counter.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c          | 75 ++++++++++++-------------
 drivers/iommu/iommufd/hw_pagetable.c    | 23 +++-----
 drivers/iommu/iommufd/iommufd_private.h | 12 ++--
 3 files changed, 47 insertions(+), 63 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index d1e227f310e823..264bfa2212481f 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -20,9 +20,12 @@ static void iommufd_group_release(struct kref *kref)
 	struct iommufd_group *igroup =
 		container_of(kref, struct iommufd_group, ref);
 
+	WARN_ON(igroup->hwpt || igroup->devices);
+
 	xa_cmpxchg(&igroup->ictx->groups, iommu_group_id(igroup->group), igroup,
 		   NULL, GFP_KERNEL);
 	iommu_group_put(igroup->group);
+	mutex_destroy(&igroup->lock);
 	kfree(igroup);
 }
 
@@ -70,6 +73,7 @@ static struct iommufd_group *iommufd_get_group(struct iommufd_ctx *ictx,
 	}
 
 	kref_init(&new_igroup->ref);
+	mutex_init(&new_igroup->lock);
 	/* group reference moves into new_igroup */
 	new_igroup->group = group;
 
@@ -266,28 +270,15 @@ static int iommufd_device_setup_msi(struct iommufd_device *idev,
 	return 0;
 }
 
-static bool iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt,
-					   struct iommufd_group *igroup)
-{
-	struct iommufd_device *cur_dev;
-
-	lockdep_assert_held(&hwpt->devices_lock);
-
-	list_for_each_entry(cur_dev, &hwpt->devices, devices_item)
-		if (cur_dev->igroup->group == igroup->group)
-			return true;
-	return false;
-}
-
 int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 				struct iommufd_device *idev)
 {
 	phys_addr_t sw_msi_start = PHYS_ADDR_MAX;
 	int rc;
 
-	lockdep_assert_held(&hwpt->devices_lock);
+	lockdep_assert_held(&idev->igroup->lock);
 
-	if (WARN_ON(idev->hwpt))
+	if (idev->igroup->hwpt != NULL && idev->igroup->hwpt != hwpt)
 		return -EINVAL;
 
 	/*
@@ -302,7 +293,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 				hwpt->domain->ops->enforce_cache_coherency(
 					hwpt->domain);
 		if (!hwpt->enforce_cache_coherency) {
-			WARN_ON(list_empty(&hwpt->devices));
+			WARN_ON(!idev->igroup->devices);
 			return -EINVAL;
 		}
 	}
@@ -318,26 +309,38 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 		goto err_unresv;
 
 	/*
-	 * FIXME: Hack around missing a device-centric iommu api, only attach to
-	 * the group once for the first device that is in the group.
+	 * Only attach to the group once for the first device that is in the
+	 * group. All the other devices will follow this attachment.
+	 * The user can attach every device individually as well.
 	 */
-	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup)) {
+	if (!idev->igroup->devices) {
 		rc = iommu_attach_group(hwpt->domain, idev->igroup->group);
 		if (rc)
 			goto err_unresv;
+		idev->igroup->hwpt = hwpt;
+		refcount_inc(&hwpt->obj.users);
 	}
+	idev->igroup->devices++;
 	return 0;
 err_unresv:
 	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
 	return rc;
 }
 
-void iommufd_hw_pagetable_detach(struct iommufd_hw_pagetable *hwpt,
-				 struct iommufd_device *idev)
+struct iommufd_hw_pagetable *
+iommufd_hw_pagetable_detach(struct iommufd_device *idev)
 {
-	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup))
+	struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
+
+	lockdep_assert_held(&idev->igroup->lock);
+
+	idev->igroup->devices--;
+	if (!idev->igroup->devices) {
 		iommu_detach_group(hwpt->domain, idev->igroup->group);
+		idev->igroup->hwpt = NULL;
+	}
 	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
+	return hwpt;
 }
 
 static int iommufd_device_do_attach(struct iommufd_device *idev,
@@ -345,16 +348,9 @@ static int iommufd_device_do_attach(struct iommufd_device *idev,
 {
 	int rc;
 
-	mutex_lock(&hwpt->devices_lock);
+	mutex_lock(&idev->igroup->lock);
 	rc = iommufd_hw_pagetable_attach(hwpt, idev);
-	if (rc)
-		goto out_unlock;
-
-	idev->hwpt = hwpt;
-	refcount_inc(&hwpt->obj.users);
-	list_add(&idev->devices_item, &hwpt->devices);
-out_unlock:
-	mutex_unlock(&hwpt->devices_lock);
+	mutex_unlock(&idev->igroup->lock);
 	return rc;
 }
 
@@ -364,7 +360,7 @@ static int iommufd_device_do_attach(struct iommufd_device *idev,
  * Automatic domain selection will never pick a manually created domain.
  */
 static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
-					  struct iommufd_ioas *ioas)
+					  struct iommufd_ioas *ioas, u32 *pt_id)
 {
 	struct iommufd_hw_pagetable *hwpt;
 	int rc;
@@ -391,6 +387,7 @@ static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
 		 */
 		if (rc == -EINVAL)
 			continue;
+		*pt_id = hwpt->obj.id;
 		goto out_unlock;
 	}
 
@@ -400,6 +397,7 @@ static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
 		goto out_unlock;
 	}
 	hwpt->auto_domain = true;
+	*pt_id = hwpt->obj.id;
 
 	mutex_unlock(&ioas->mutex);
 	iommufd_object_finalize(idev->ictx, &hwpt->obj);
@@ -444,7 +442,7 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
 		struct iommufd_ioas *ioas =
 			container_of(pt_obj, struct iommufd_ioas, obj);
 
-		rc = iommufd_device_auto_get_domain(idev, ioas);
+		rc = iommufd_device_auto_get_domain(idev, ioas, pt_id);
 		if (rc)
 			goto out_put_pt_obj;
 		break;
@@ -455,7 +453,6 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
 	}
 
 	refcount_inc(&idev->obj.users);
-	*pt_id = idev->hwpt->obj.id;
 	rc = 0;
 
 out_put_pt_obj:
@@ -473,13 +470,11 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD);
  */
 void iommufd_device_detach(struct iommufd_device *idev)
 {
-	struct iommufd_hw_pagetable *hwpt = idev->hwpt;
+	struct iommufd_hw_pagetable *hwpt;
 
-	mutex_lock(&hwpt->devices_lock);
-	list_del(&idev->devices_item);
-	idev->hwpt = NULL;
-	iommufd_hw_pagetable_detach(hwpt, idev);
-	mutex_unlock(&hwpt->devices_lock);
+	mutex_lock(&idev->igroup->lock);
+	hwpt = iommufd_hw_pagetable_detach(idev);
+	mutex_unlock(&idev->igroup->lock);
 
 	if (hwpt->auto_domain)
 		iommufd_object_destroy_user(idev->ictx, &hwpt->obj);
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 6cdb6749d359f3..566eba0cd9b917 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -11,8 +11,6 @@ void iommufd_hw_pagetable_destroy(struct iommufd_object *obj)
 	struct iommufd_hw_pagetable *hwpt =
 		container_of(obj, struct iommufd_hw_pagetable, obj);
 
-	WARN_ON(!list_empty(&hwpt->devices));
-
 	if (!list_empty(&hwpt->hwpt_item)) {
 		mutex_lock(&hwpt->ioas->mutex);
 		list_del(&hwpt->hwpt_item);
@@ -25,7 +23,6 @@ void iommufd_hw_pagetable_destroy(struct iommufd_object *obj)
 		iommu_domain_free(hwpt->domain);
 
 	refcount_dec(&hwpt->ioas->obj.users);
-	mutex_destroy(&hwpt->devices_lock);
 }
 
 /**
@@ -52,9 +49,7 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
 	if (IS_ERR(hwpt))
 		return hwpt;
 
-	INIT_LIST_HEAD(&hwpt->devices);
 	INIT_LIST_HEAD(&hwpt->hwpt_item);
-	mutex_init(&hwpt->devices_lock);
 	/* Pairs with iommufd_hw_pagetable_destroy() */
 	refcount_inc(&ioas->obj.users);
 	hwpt->ioas = ioas;
@@ -65,13 +60,16 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
 		goto out_abort;
 	}
 
-	mutex_lock(&hwpt->devices_lock);
+	mutex_lock(&idev->igroup->lock);
 
 	/*
 	 * immediate_attach exists only to accommodate iommu drivers that cannot
 	 * directly allocate a domain. These drivers do not finish creating the
 	 * domain until attach is completed. Thus we must have this call
 	 * sequence. Once those drivers are fixed this should be removed.
+	 *
+	 * Note we hold the igroup->lock here which prevents any other thread
+	 * from observing igroup->hwpt until we finish setting it up.
 	 */
 	if (immediate_attach) {
 		rc = iommufd_hw_pagetable_attach(hwpt, idev);
@@ -84,21 +82,14 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
 		goto out_detach;
 	list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);
 
-	if (immediate_attach) {
-		/* See iommufd_device_do_attach() */
-		refcount_inc(&hwpt->obj.users);
-		idev->hwpt = hwpt;
-		list_add(&idev->devices_item, &hwpt->devices);
-	}
-
-	mutex_unlock(&hwpt->devices_lock);
+	mutex_unlock(&idev->igroup->lock);
 	return hwpt;
 
 out_detach:
 	if (immediate_attach)
-		iommufd_hw_pagetable_detach(hwpt, idev);
+		iommufd_hw_pagetable_detach(idev);
 out_unlock:
-	mutex_unlock(&hwpt->devices_lock);
+	mutex_unlock(&idev->igroup->lock);
 out_abort:
 	iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
 	return ERR_PTR(rc);
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 2544f10dae9aef..5f3ad16da819e7 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -250,8 +250,6 @@ struct iommufd_hw_pagetable {
 	bool msi_cookie : 1;
 	/* Head at iommufd_ioas::hwpt_list */
 	struct list_head hwpt_item;
-	struct mutex devices_lock;
-	struct list_head devices;
 };
 
 struct iommufd_hw_pagetable *
@@ -259,14 +257,17 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
 			   struct iommufd_device *idev, bool immediate_attach);
 int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 				struct iommufd_device *idev);
-void iommufd_hw_pagetable_detach(struct iommufd_hw_pagetable *hwpt,
-				 struct iommufd_device *idev);
+struct iommufd_hw_pagetable *
+iommufd_hw_pagetable_detach(struct iommufd_device *idev);
 void iommufd_hw_pagetable_destroy(struct iommufd_object *obj);
 
 struct iommufd_group {
 	struct kref ref;
+	struct mutex lock;
 	struct iommufd_ctx *ictx;
 	struct iommu_group *group;
+	struct iommufd_hw_pagetable *hwpt;
+	unsigned int devices;
 };
 
 /*
@@ -278,9 +279,6 @@ struct iommufd_device {
 	struct iommufd_object obj;
 	struct iommufd_ctx *ictx;
 	struct iommufd_group *igroup;
-	struct iommufd_hw_pagetable *hwpt;
-	/* Head at iommufd_hw_pagetable::devices */
-	struct list_head devices_item;
 	/* always the physical device */
 	struct device *dev;
 	bool enforce_cache_coherency;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 04/14] iommufd: Use the iommufd_group to avoid duplicate reserved groups and msi setup
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-02  8:06   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 05/14] iommufd: Make sw_msi_start a group global Jason Gunthorpe
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

These items only needs to be done once per group, not once per device. The
once per device was a way to make the device list work. Since we are
abandoning this we can optimize things a bit.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c          | 38 ++++++++++++++-----------
 drivers/iommu/iommufd/io_pagetable.c    |  5 ++--
 drivers/iommu/iommufd/iommufd_private.h |  1 -
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 264bfa2212481f..75e8d79678736f 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -298,9 +298,20 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 		}
 	}
 
-	rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt, idev->dev,
-						   idev->igroup->group,
-						   &sw_msi_start);
+	/*
+	 * The first device in the group to be attached will do all the work
+	 * to setup the hwpt and ioas. Every other device re-uses it through
+	 * the shared group attachment. Users are allowed/expected to attach
+	 * every device in the group to the same hwpt, that just turns into
+	 * a NOP.
+	 */
+	if (idev->igroup->devices) {
+		idev->igroup->devices++;
+		return 0;
+	}
+
+	rc = iopt_table_enforce_group_resv_regions(
+		&hwpt->ioas->iopt, idev->igroup->group, &sw_msi_start);
 	if (rc)
 		return rc;
 
@@ -308,22 +319,15 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 	if (rc)
 		goto err_unresv;
 
-	/*
-	 * Only attach to the group once for the first device that is in the
-	 * group. All the other devices will follow this attachment.
-	 * The user can attach every device individually as well.
-	 */
-	if (!idev->igroup->devices) {
-		rc = iommu_attach_group(hwpt->domain, idev->igroup->group);
-		if (rc)
-			goto err_unresv;
-		idev->igroup->hwpt = hwpt;
-		refcount_inc(&hwpt->obj.users);
-	}
+	rc = iommu_attach_group(hwpt->domain, idev->igroup->group);
+	if (rc)
+		goto err_unresv;
+	idev->igroup->hwpt = hwpt;
+	refcount_inc(&hwpt->obj.users);
 	idev->igroup->devices++;
 	return 0;
 err_unresv:
-	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
+	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->igroup->group);
 	return rc;
 }
 
@@ -339,7 +343,7 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev)
 		iommu_detach_group(hwpt->domain, idev->igroup->group);
 		idev->igroup->hwpt = NULL;
 	}
-	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
+	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->igroup->group);
 	return hwpt;
 }
 
diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index e0ae72b9e67f86..096491bbb5acf5 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -1163,7 +1163,6 @@ void iopt_remove_access(struct io_pagetable *iopt,
 
 /* Narrow the valid_iova_itree to include reserved ranges from a group. */
 int iopt_table_enforce_group_resv_regions(struct io_pagetable *iopt,
-					  struct device *device,
 					  struct iommu_group *group,
 					  phys_addr_t *sw_msi_start)
 {
@@ -1191,7 +1190,7 @@ int iopt_table_enforce_group_resv_regions(struct io_pagetable *iopt,
 		}
 
 		rc = iopt_reserve_iova(iopt, resv->start,
-				       resv->length - 1 + resv->start, device);
+				       resv->length - 1 + resv->start, group);
 		if (rc)
 			goto out_reserved;
 	}
@@ -1206,7 +1205,7 @@ int iopt_table_enforce_group_resv_regions(struct io_pagetable *iopt,
 	goto out_free_resv;
 
 out_reserved:
-	__iopt_remove_reserved_iova(iopt, device);
+	__iopt_remove_reserved_iova(iopt, group);
 out_free_resv:
 	list_for_each_entry_safe(resv, tmp, &group_resv_regions, list)
 		kfree(resv);
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 5f3ad16da819e7..dbecdff013d082 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -77,7 +77,6 @@ int iopt_table_add_domain(struct io_pagetable *iopt,
 void iopt_table_remove_domain(struct io_pagetable *iopt,
 			      struct iommu_domain *domain);
 int iopt_table_enforce_group_resv_regions(struct io_pagetable *iopt,
-					  struct device *device,
 					  struct iommu_group *group,
 					  phys_addr_t *sw_msi_start);
 int iopt_set_allow_iova(struct io_pagetable *iopt,
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 05/14] iommufd: Make sw_msi_start a group global
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 04/14] iommufd: Use the iommufd_group to avoid duplicate reserved groups and msi setup Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-02  8:09   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 06/14] iommufd: Move putting a hwpt to a helper function Jason Gunthorpe
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

The sw_msi_start is only set by the ARM drivers and it is always constant.
Due to the way vfio/iommufd allow domains to be re-used between
devices we have a built in assumption that there is only one value
for sw_msi_start and it is global to the system.

To make replace simpler where we may not reparse the
iommu_get_group_resv_regions() move the sw_msi_start to the iommufd_group
so it is always available once any HWPT has been attached.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c          | 13 +++++++------
 drivers/iommu/iommufd/iommufd_private.h |  1 +
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 75e8d79678736f..ea34dc5b0fd1e4 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -74,6 +74,7 @@ static struct iommufd_group *iommufd_get_group(struct iommufd_ctx *ictx,
 
 	kref_init(&new_igroup->ref);
 	mutex_init(&new_igroup->lock);
+	new_igroup->sw_msi_start = PHYS_ADDR_MAX;
 	/* group reference moves into new_igroup */
 	new_igroup->group = group;
 
@@ -238,9 +239,9 @@ void iommufd_device_unbind(struct iommufd_device *idev)
 EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD);
 
 static int iommufd_device_setup_msi(struct iommufd_device *idev,
-				    struct iommufd_hw_pagetable *hwpt,
-				    phys_addr_t sw_msi_start)
+				    struct iommufd_hw_pagetable *hwpt)
 {
+	phys_addr_t sw_msi_start = idev->igroup->sw_msi_start;
 	int rc;
 
 	/*
@@ -273,7 +274,6 @@ static int iommufd_device_setup_msi(struct iommufd_device *idev,
 int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 				struct iommufd_device *idev)
 {
-	phys_addr_t sw_msi_start = PHYS_ADDR_MAX;
 	int rc;
 
 	lockdep_assert_held(&idev->igroup->lock);
@@ -310,12 +310,13 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 		return 0;
 	}
 
-	rc = iopt_table_enforce_group_resv_regions(
-		&hwpt->ioas->iopt, idev->igroup->group, &sw_msi_start);
+	rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt,
+						   idev->igroup->group,
+						   &idev->igroup->sw_msi_start);
 	if (rc)
 		return rc;
 
-	rc = iommufd_device_setup_msi(idev, hwpt, sw_msi_start);
+	rc = iommufd_device_setup_msi(idev, hwpt);
 	if (rc)
 		goto err_unresv;
 
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index dbecdff013d082..0497d6190fd1b3 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -267,6 +267,7 @@ struct iommufd_group {
 	struct iommu_group *group;
 	struct iommufd_hw_pagetable *hwpt;
 	unsigned int devices;
+	phys_addr_t sw_msi_start;
 };
 
 /*
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 06/14] iommufd: Move putting a hwpt to a helper function
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 05/14] iommufd: Make sw_msi_start a group global Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-02  8:12   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 07/14] iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc() Jason Gunthorpe
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

Next patch will need to call this from two places.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c          |  6 +-----
 drivers/iommu/iommufd/iommufd_private.h | 10 ++++++++++
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index ea34dc5b0fd1e4..590f84afa06503 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -481,11 +481,7 @@ void iommufd_device_detach(struct iommufd_device *idev)
 	hwpt = iommufd_hw_pagetable_detach(idev);
 	mutex_unlock(&idev->igroup->lock);
 
-	if (hwpt->auto_domain)
-		iommufd_object_destroy_user(idev->ictx, &hwpt->obj);
-	else
-		refcount_dec(&hwpt->obj.users);
-
+	iommufd_hw_pagetable_put(idev->ictx, hwpt);
 	refcount_dec(&idev->obj.users);
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_detach, IOMMUFD);
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 0497d6190fd1b3..c0c80181d05ed1 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -260,6 +260,16 @@ struct iommufd_hw_pagetable *
 iommufd_hw_pagetable_detach(struct iommufd_device *idev);
 void iommufd_hw_pagetable_destroy(struct iommufd_object *obj);
 
+static inline void iommufd_hw_pagetable_put(struct iommufd_ctx *ictx,
+					    struct iommufd_hw_pagetable *hwpt)
+{
+	lockdep_assert_not_held(&hwpt->ioas->mutex);
+	if (hwpt->auto_domain)
+		iommufd_object_destroy_user(ictx, &hwpt->obj);
+	else
+		refcount_dec(&hwpt->obj.users);
+}
+
 struct iommufd_group {
 	struct kref ref;
 	struct mutex lock;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 07/14] iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc()
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 06/14] iommufd: Move putting a hwpt to a helper function Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-02  8:14   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 08/14] iommu: Introduce a new iommu_group_replace_domain() API Jason Gunthorpe
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

Logically the HWPT should have the coherency set properly for the device
that it is being created for when it is created.

This was happening implicitly if the immediate_attach was set because
iommufd_hw_pagetable_attach() does it as the first thing.

Do it unconditionally so !immediate_attach works properly.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c          | 20 +++++-------------
 drivers/iommu/iommufd/hw_pagetable.c    | 27 +++++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h |  1 +
 3 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 590f84afa06503..7f95876d142d7f 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -281,21 +281,11 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 	if (idev->igroup->hwpt != NULL && idev->igroup->hwpt != hwpt)
 		return -EINVAL;
 
-	/*
-	 * Try to upgrade the domain we have, it is an iommu driver bug to
-	 * report IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail
-	 * enforce_cache_coherency when there are no devices attached to the
-	 * domain.
-	 */
-	if (idev->enforce_cache_coherency && !hwpt->enforce_cache_coherency) {
-		if (hwpt->domain->ops->enforce_cache_coherency)
-			hwpt->enforce_cache_coherency =
-				hwpt->domain->ops->enforce_cache_coherency(
-					hwpt->domain);
-		if (!hwpt->enforce_cache_coherency) {
-			WARN_ON(!idev->igroup->devices);
-			return -EINVAL;
-		}
+	/* Try to upgrade the domain we have */
+	if (idev->enforce_cache_coherency) {
+		rc = iommufd_hw_pagetable_enforce_cc(hwpt);
+		if (rc)
+			return rc;
 	}
 
 	/*
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 566eba0cd9b917..2584f9038b29a2 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -25,6 +25,20 @@ void iommufd_hw_pagetable_destroy(struct iommufd_object *obj)
 	refcount_dec(&hwpt->ioas->obj.users);
 }
 
+int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt)
+{
+	if (hwpt->enforce_cache_coherency)
+		return 0;
+
+	if (hwpt->domain->ops->enforce_cache_coherency)
+		hwpt->enforce_cache_coherency =
+			hwpt->domain->ops->enforce_cache_coherency(
+				hwpt->domain);
+	if (!hwpt->enforce_cache_coherency)
+		return -EINVAL;
+	return 0;
+}
+
 /**
  * iommufd_hw_pagetable_alloc() - Get an iommu_domain for a device
  * @ictx: iommufd context
@@ -60,6 +74,19 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
 		goto out_abort;
 	}
 
+	/*
+	 * Set the coherency mode before we do iopt_table_add_domain() as some
+	 * iommus have a per-PTE bit that controls it and need to decide before
+	 * doing any maps. It is an iommu driver bug to report
+	 * IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail enforce_cache_coherency on
+	 * a new domain.
+	 */
+	if (idev->enforce_cache_coherency) {
+		rc = iommufd_hw_pagetable_enforce_cc(hwpt);
+		if (WARN_ON(rc))
+			goto out_abort;
+	}
+
 	mutex_lock(&idev->igroup->lock);
 
 	/*
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index c0c80181d05ed1..8921761ff98980 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -254,6 +254,7 @@ struct iommufd_hw_pagetable {
 struct iommufd_hw_pagetable *
 iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
 			   struct iommufd_device *idev, bool immediate_attach);
+int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt);
 int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 				struct iommufd_device *idev);
 struct iommufd_hw_pagetable *
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 08/14] iommu: Introduce a new iommu_group_replace_domain() API
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 07/14] iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc() Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-02  8:16   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 09/14] iommufd: Add iommufd_device_replace() Jason Gunthorpe
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

From: Nicolin Chen <nicolinc@nvidia.com>

qemu has a need to replace the translations associated with a domain
when the guest does large-scale operations like switching between an
IDENTITY domain and, say, dma-iommu.c.

Currently, it does this by replacing all the mappings in a single
domain, but this is very inefficient and means that domains have to be
per-device rather than per-translation.

Provide a high-level API to allow replacements of one domain with
another. This is similar to a detach/attach cycle except it doesn't
force the group to go to the blocking domain in-between.

By removing this forced blocking domain the iommu driver has the
opportunity to implement a non-disruptive replacement of the domain to the
greatest extent its hardware allows. This allows the qemu emulation of the
vIOMMU to be more complete, as real hardware often has a non-distruptive
replacement capability.

It could be possible to adderss this by simply removing the protection
from the iommu_attach_group(), but it is not so clear if that is safe for
the few users. Thus, add a new API to serve this new purpose.

All drivers are already required to support changing between active
UNMANAGED domains when using their attach_dev ops.

This API is expected to be used only by IOMMUFD, so add to the iommu-priv
header and mark it as IOMMUFD_INTERNAL.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommu-priv.h | 10 ++++++++++
 drivers/iommu/iommu.c      | 30 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)
 create mode 100644 drivers/iommu/iommu-priv.h

diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h
new file mode 100644
index 00000000000000..7c8011bfd15374
--- /dev/null
+++ b/drivers/iommu/iommu-priv.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __LINUX_IOMMU_PRIV_H
+#define __LINUX_IOMMU_PRIV_H
+
+#include <linux/iommu.h>
+
+int iommu_group_replace_domain(struct iommu_group *group,
+			       struct iommu_domain *new_domain);
+
+#endif /* __LINUX_IOMMU_PRIV_H */
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 10db680acaed5a..4c91d9bdd114ab 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -33,6 +33,7 @@
 #include <linux/msi.h>
 
 #include "dma-iommu.h"
+#include "iommu-priv.h"
 
 #include "iommu-sva.h"
 
@@ -2191,6 +2192,35 @@ int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
 }
 EXPORT_SYMBOL_GPL(iommu_attach_group);
 
+/**
+ * iommu_group_replace_domain - replace the domain that a group is attached to
+ * @new_domain: new IOMMU domain to replace with
+ * @group: IOMMU group that will be attached to the new domain
+ *
+ * This API allows the group to switch domains without being forced to go to
+ * the blocking domain in-between.
+ *
+ * If the currently attached domain is a core domain (e.g. a default_domain),
+ * it will act just like the iommu_attach_group().
+ */
+int iommu_group_replace_domain(struct iommu_group *group,
+			       struct iommu_domain *new_domain)
+{
+	int ret;
+
+	if (!new_domain)
+		return -EINVAL;
+
+	mutex_lock(&group->mutex);
+	ret = __iommu_group_set_domain(group, new_domain);
+	if (ret)
+		__iommu_group_for_each_dev(group, group->domain,
+					   iommu_group_do_attach_device);
+	mutex_unlock(&group->mutex);
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(iommu_group_replace_domain, IOMMUFD_INTERNAL);
+
 static int iommu_group_do_set_platform_dma(struct device *dev, void *data)
 {
 	const struct iommu_ops *ops = dev_iommu_ops(dev);
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 08/14] iommu: Introduce a new iommu_group_replace_domain() API Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-02-26  3:01   ` Baolu Lu
                     ` (2 more replies)
  2023-02-25  0:27 ` [PATCH 10/14] iommufd: Make destroy_rwsem use a lock class per object type Jason Gunthorpe
                   ` (5 subsequent siblings)
  14 siblings, 3 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

Replace allows all the devices in a group to move in one step to a new
HWPT. Further, the HWPT move is done without going through a blocking
domain so that the IOMMU driver can implement some level of
non-distruption to ongoing DMA if that has meaning for it (eg for future
special driver domains)

Replace uses a lot of the same logic as normal attach, except the actual
domain change over has different restrictions, and we are careful to
sequence things so that failure is going to leave everything the way it
was, and not get trapped in a blocking domain or something if there is
ENOMEM.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/device.c | 216 ++++++++++++++++++++++++++-------
 drivers/iommu/iommufd/main.c   |   1 +
 2 files changed, 175 insertions(+), 42 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 7f95876d142d7f..913de911361115 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -4,6 +4,7 @@
 #include <linux/iommufd.h>
 #include <linux/slab.h>
 #include <linux/iommu.h>
+#include "../iommu-priv.h"
 
 #include "io_pagetable.h"
 #include "iommufd_private.h"
@@ -338,27 +339,101 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev)
 	return hwpt;
 }
 
-static int iommufd_device_do_attach(struct iommufd_device *idev,
-				    struct iommufd_hw_pagetable *hwpt)
+static struct iommufd_hw_pagetable *
+iommufd_device_do_attach(struct iommufd_device *idev,
+			 struct iommufd_hw_pagetable *hwpt)
 {
 	int rc;
 
 	mutex_lock(&idev->igroup->lock);
 	rc = iommufd_hw_pagetable_attach(hwpt, idev);
 	mutex_unlock(&idev->igroup->lock);
-	return rc;
+	if (rc)
+		return ERR_PTR(rc);
+	return NULL;
+}
+
+static struct iommufd_hw_pagetable *
+iommufd_device_do_replace_locked(struct iommufd_device *idev,
+				 struct iommufd_hw_pagetable *hwpt)
+{
+	struct iommufd_hw_pagetable *old_hwpt;
+	int rc;
+
+	lockdep_assert_held(&idev->igroup->lock);
+
+	/* Try to upgrade the domain we have */
+	if (idev->enforce_cache_coherency) {
+		rc = iommufd_hw_pagetable_enforce_cc(hwpt);
+		if (rc)
+			return ERR_PTR(-EINVAL);
+	}
+
+	rc = iommufd_device_setup_msi(idev, hwpt);
+	if (rc)
+		return ERR_PTR(rc);
+
+	old_hwpt = idev->igroup->hwpt;
+	if (hwpt->ioas != old_hwpt->ioas) {
+		rc = iopt_table_enforce_group_resv_regions(
+			&hwpt->ioas->iopt, idev->igroup->group, NULL);
+		if (rc)
+			return ERR_PTR(rc);
+	}
+
+	rc = iommu_group_replace_domain(idev->igroup->group, hwpt->domain);
+	if (rc)
+		goto err_unresv;
+
+	if (hwpt->ioas != old_hwpt->ioas)
+		iopt_remove_reserved_iova(&old_hwpt->ioas->iopt,
+					  idev->igroup->group);
+
+	idev->igroup->hwpt = hwpt;
+	refcount_inc(&hwpt->obj.users);
+	return old_hwpt;
+err_unresv:
+	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->igroup->group);
+	return ERR_PTR(rc);
+}
+
+static struct iommufd_hw_pagetable *
+iommufd_device_do_replace(struct iommufd_device *idev,
+			  struct iommufd_hw_pagetable *hwpt)
+{
+	struct iommufd_hw_pagetable *destroy_hwpt = NULL;
+	int rc;
+
+	mutex_lock(&idev->igroup->lock);
+	destroy_hwpt = iommufd_device_do_replace_locked(idev, hwpt);
+	if (IS_ERR(destroy_hwpt)) {
+		rc = PTR_ERR(destroy_hwpt);
+		goto out_unlock;
+	}
+	mutex_unlock(&idev->igroup->lock);
+	return destroy_hwpt;
+
+out_unlock:
+	mutex_unlock(&idev->igroup->lock);
+	return ERR_PTR(rc);
 }
 
+typedef struct iommufd_hw_pagetable *(*attach_fn)(
+	struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt);
+
 /*
  * When automatically managing the domains we search for a compatible domain in
  * the iopt and if one is found use it, otherwise create a new domain.
  * Automatic domain selection will never pick a manually created domain.
  */
-static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
-					  struct iommufd_ioas *ioas, u32 *pt_id)
+static struct iommufd_hw_pagetable *
+iommufd_device_auto_get_domain(struct iommufd_device *idev,
+			       struct iommufd_ioas *ioas, u32 *pt_id,
+			       attach_fn do_attach)
 {
+	bool immediate_attach = do_attach == iommufd_device_do_attach;
+	struct iommufd_hw_pagetable *destroy_hwpt;
 	struct iommufd_hw_pagetable *hwpt;
-	int rc;
 
 	/*
 	 * There is no differentiation when domains are allocated, so any domain
@@ -372,52 +447,57 @@ static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
 
 		if (!iommufd_lock_obj(&hwpt->obj))
 			continue;
-		rc = iommufd_device_do_attach(idev, hwpt);
-		iommufd_put_object(&hwpt->obj);
-
-		/*
-		 * -EINVAL means the domain is incompatible with the device.
-		 * Other error codes should propagate to userspace as failure.
-		 * Success means the domain is attached.
-		 */
-		if (rc == -EINVAL)
-			continue;
+		destroy_hwpt = (*do_attach)(idev, hwpt);
 		*pt_id = hwpt->obj.id;
+		iommufd_put_object(&hwpt->obj);
+		if (IS_ERR(destroy_hwpt)) {
+			/*
+			 * -EINVAL means the domain is incompatible with the
+			 * device. Other error codes should propagate to
+			 * userspace as failure. Success means the domain is
+			 * attached.
+			 */
+			if (PTR_ERR(destroy_hwpt) == -EINVAL)
+				continue;
+			goto out_unlock;
+		}
 		goto out_unlock;
 	}
 
-	hwpt = iommufd_hw_pagetable_alloc(idev->ictx, ioas, idev, true);
+	hwpt = iommufd_hw_pagetable_alloc(idev->ictx, ioas, idev,
+					  immediate_attach);
 	if (IS_ERR(hwpt)) {
-		rc = PTR_ERR(hwpt);
+		destroy_hwpt = ERR_CAST(hwpt);
 		goto out_unlock;
 	}
+
+	if (!immediate_attach) {
+		destroy_hwpt = (*do_attach)(idev, hwpt);
+		if (IS_ERR(destroy_hwpt))
+			goto out_abort;
+	} else {
+		destroy_hwpt = NULL;
+	}
+
 	hwpt->auto_domain = true;
 	*pt_id = hwpt->obj.id;
 
 	mutex_unlock(&ioas->mutex);
 	iommufd_object_finalize(idev->ictx, &hwpt->obj);
-	return 0;
+	return destroy_hwpt;
+
+out_abort:
+	iommufd_object_abort_and_destroy(idev->ictx, &hwpt->obj);
 out_unlock:
 	mutex_unlock(&ioas->mutex);
-	return rc;
+	return destroy_hwpt;
 }
 
-/**
- * iommufd_device_attach - Connect a device from an iommu_domain
- * @idev: device to attach
- * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
- *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
- *
- * This connects the device to an iommu_domain, either automatically or manually
- * selected. Once this completes the device could do DMA.
- *
- * The caller should return the resulting pt_id back to userspace.
- * This function is undone by calling iommufd_device_detach().
- */
-int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
+static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id,
+				    attach_fn do_attach)
 {
+	struct iommufd_hw_pagetable *destroy_hwpt;
 	struct iommufd_object *pt_obj;
-	int rc;
 
 	pt_obj = iommufd_get_object(idev->ictx, *pt_id, IOMMUFD_OBJ_ANY);
 	if (IS_ERR(pt_obj))
@@ -428,8 +508,8 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
 		struct iommufd_hw_pagetable *hwpt =
 			container_of(pt_obj, struct iommufd_hw_pagetable, obj);
 
-		rc = iommufd_device_do_attach(idev, hwpt);
-		if (rc)
+		destroy_hwpt = (*do_attach)(idev, hwpt);
+		if (IS_ERR(destroy_hwpt))
 			goto out_put_pt_obj;
 		break;
 	}
@@ -437,25 +517,77 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
 		struct iommufd_ioas *ioas =
 			container_of(pt_obj, struct iommufd_ioas, obj);
 
-		rc = iommufd_device_auto_get_domain(idev, ioas, pt_id);
-		if (rc)
+		destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id,
+							      do_attach);
+		if (IS_ERR(destroy_hwpt))
 			goto out_put_pt_obj;
 		break;
 	}
 	default:
-		rc = -EINVAL;
+		destroy_hwpt = ERR_PTR(-EINVAL);
 		goto out_put_pt_obj;
 	}
+	iommufd_put_object(pt_obj);
 
-	refcount_inc(&idev->obj.users);
-	rc = 0;
+	/* This destruction has to be after we unlock everything */
+	if (destroy_hwpt)
+		iommufd_hw_pagetable_put(idev->ictx, destroy_hwpt);
+	return 0;
 
 out_put_pt_obj:
 	iommufd_put_object(pt_obj);
-	return rc;
+	return PTR_ERR(destroy_hwpt);
+}
+
+/**
+ * iommufd_device_attach - Connect a device to an iommu_domain
+ * @idev: device to attach
+ * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
+ *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
+ *
+ * This connects the device to an iommu_domain, either automatically or manually
+ * selected. Once this completes the device could do DMA.
+ *
+ * The caller should return the resulting pt_id back to userspace.
+ * This function is undone by calling iommufd_device_detach().
+ */
+int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
+{
+	int rc;
+
+	rc = iommufd_device_change_pt(idev, pt_id, &iommufd_device_do_attach);
+	if (rc)
+		return rc;
+
+	/*
+	 * Pairs with iommufd_device_detach() - catches caller bugs attempting
+	 * to destroy a device with an attachment.
+	 */
+	refcount_inc(&idev->obj.users);
+	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD);
 
+/**
+ * iommufd_device_replace - Change the device's iommu_domain
+ * @idev: device to change
+ * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
+ *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
+ *
+ * This is the same as:
+ *   iommufd_device_detach();
+ *   iommufd_device_attach()
+ * If it fails then no change is made to the attachment. The iommu driver may
+ * implement this so there is no disruption in translation. This can only be
+ * called if iommufd_device_attach() has already succeeded.
+ */
+int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id)
+{
+	return iommufd_device_change_pt(idev, pt_id,
+					&iommufd_device_do_replace);
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_device_replace, IOMMUFD);
+
 /**
  * iommufd_device_detach - Disconnect a device to an iommu_domain
  * @idev: device to detach
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index e5ed5dfa91a0b5..8597f2fb88da3a 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -461,5 +461,6 @@ module_exit(iommufd_exit);
 MODULE_ALIAS_MISCDEV(VFIO_MINOR);
 MODULE_ALIAS("devname:vfio/vfio");
 #endif
+MODULE_IMPORT_NS(IOMMUFD_INTERNAL);
 MODULE_DESCRIPTION("I/O Address Space Management for passthrough devices");
 MODULE_LICENSE("GPL");
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 10/14] iommufd: Make destroy_rwsem use a lock class per object type
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 09/14] iommufd: Add iommufd_device_replace() Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-02-25  0:27 ` [PATCH 11/14] iommufd/selftest: Test iommufd_device_replace() Jason Gunthorpe
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

The selftest invokes things like replace under the object lock of its
idev which protects the idev in a similar way to a real user.
Unfortunately this triggers lockdep. A lock class per type will solve the
problem.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/iommufd_private.h |  1 +
 drivers/iommu/iommufd/main.c            | 11 ++++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 8921761ff98980..cfcda73942b533 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -119,6 +119,7 @@ enum iommufd_object_type {
 #ifdef CONFIG_IOMMUFD_TEST
 	IOMMUFD_OBJ_SELFTEST,
 #endif
+	IOMMUFD_OBJ_MAX,
 };
 
 /* Base struct for all objects with a userspace ID handle. */
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 8597f2fb88da3a..9cba592d0482e7 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -32,6 +32,7 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
 					     size_t size,
 					     enum iommufd_object_type type)
 {
+	static struct lock_class_key obj_keys[IOMMUFD_OBJ_MAX];
 	struct iommufd_object *obj;
 	int rc;
 
@@ -39,7 +40,15 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
 	if (!obj)
 		return ERR_PTR(-ENOMEM);
 	obj->type = type;
-	init_rwsem(&obj->destroy_rwsem);
+	/*
+	 * In most cases the destroy_rwsem is obtained with try so it doesn't
+	 * interact with lockdep, however on destroy we have to sleep. This
+	 * means if we have to destroy an object while holding a get on another
+	 * object it triggers lockdep. Using one locking class per object type
+	 * is a simple and reasonable way to avoid this.
+	 */
+	__init_rwsem(&obj->destroy_rwsem, "iommufd_object::destroy_rwsem",
+		     &obj_keys[type]);
 	refcount_set(&obj->users, 1);
 
 	/*
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 11/14] iommufd/selftest: Test iommufd_device_replace()
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (9 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 10/14] iommufd: Make destroy_rwsem use a lock class per object type Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-02-25  0:27 ` [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC Jason Gunthorpe
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

From: Nicolin Chen <nicolinc@nvidia.com>

Allow the selftest to call the function on the mock idev, add some tests
to exercise it.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/iommufd_test.h          |  4 ++
 drivers/iommu/iommufd/selftest.c              | 39 +++++++++++++++++++
 include/linux/iommufd.h                       |  1 +
 tools/testing/selftests/iommu/iommufd.c       | 34 +++++++++++++++-
 .../selftests/iommu/iommufd_fail_nth.c        | 27 +++++++++++++
 tools/testing/selftests/iommu/iommufd_utils.h | 28 +++++++++++++
 6 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index 1d96a8f466fd29..c60c501775e9bb 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -17,6 +17,7 @@ enum {
 	IOMMU_TEST_OP_ACCESS_PAGES,
 	IOMMU_TEST_OP_ACCESS_RW,
 	IOMMU_TEST_OP_SET_TEMP_MEMORY_LIMIT,
+	IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE,
 };
 
 enum {
@@ -52,6 +53,9 @@ struct iommu_test_cmd {
 			__u32 out_device_id;
 			__u32 out_hwpt_id;
 		} mock_domain;
+		struct {
+			__u32 pt_id;
+		} mock_domain_replace;
 		struct {
 			__aligned_u64 iova;
 			__aligned_u64 length;
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 8540adcb68f1f0..701d6d6be3f06d 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -455,6 +455,42 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd,
 	return rc;
 }
 
+/* Replace the mock domain with a manually allocated hw_pagetable */
+static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd,
+					    unsigned int device_id, u32 pt_id,
+					    struct iommu_test_cmd *cmd)
+{
+	struct iommufd_object *dev_obj;
+	struct selftest_obj *sobj;
+	int rc;
+
+	/*
+	 * Prefer to use the OBJ_SELFTEST because the destroy_rwsem will ensure
+	 * it doesn't race with detach, which is not allowed.
+	 */
+	dev_obj =
+		iommufd_get_object(ucmd->ictx, device_id, IOMMUFD_OBJ_SELFTEST);
+	if (IS_ERR(dev_obj))
+		return PTR_ERR(dev_obj);
+
+	sobj = container_of(dev_obj, struct selftest_obj, obj);
+	if (sobj->type != TYPE_IDEV) {
+		rc = -EINVAL;
+		goto out_dev_obj;
+	}
+
+	rc = iommufd_device_replace(sobj->idev.idev, &pt_id);
+	if (rc)
+		goto out_dev_obj;
+
+	cmd->mock_domain_replace.pt_id = pt_id;
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+
+out_dev_obj:
+	iommufd_put_object(dev_obj);
+	return rc;
+}
+
 /* Add an additional reserved IOVA to the IOAS */
 static int iommufd_test_add_reserved(struct iommufd_ucmd *ucmd,
 				     unsigned int mockpt_id,
@@ -939,6 +975,9 @@ int iommufd_test(struct iommufd_ucmd *ucmd)
 						 cmd->add_reserved.length);
 	case IOMMU_TEST_OP_MOCK_DOMAIN:
 		return iommufd_test_mock_domain(ucmd, cmd);
+	case IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE:
+		return iommufd_test_mock_domain_replace(
+			ucmd, cmd->id, cmd->mock_domain_replace.pt_id, cmd);
 	case IOMMU_TEST_OP_MD_CHECK_MAP:
 		return iommufd_test_md_check_pa(
 			ucmd, cmd->id, cmd->check_map.iova,
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index c0b5b3ac34f1e0..3044a432a83e22 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -22,6 +22,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 void iommufd_device_unbind(struct iommufd_device *idev);
 
 int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id);
+int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id);
 void iommufd_device_detach(struct iommufd_device *idev);
 
 struct iommufd_access_ops {
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 91f6bcd5b8f5e9..e1252a874f0b32 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -1001,6 +1001,7 @@ FIXTURE(iommufd_mock_domain)
 	uint32_t ioas_id;
 	uint32_t hwpt_id;
 	uint32_t hwpt_ids[2];
+	uint32_t device_ids[2];
 	int mmap_flags;
 	size_t mmap_buf_size;
 };
@@ -1022,7 +1023,8 @@ FIXTURE_SETUP(iommufd_mock_domain)
 	ASSERT_GE(ARRAY_SIZE(self->hwpt_ids), variant->mock_domains);
 
 	for (i = 0; i != variant->mock_domains; i++)
-		test_cmd_mock_domain(self->ioas_id, NULL, &self->hwpt_ids[i]);
+		test_cmd_mock_domain(self->ioas_id, &self->device_ids[i],
+				     &self->hwpt_ids[i]);
 	self->hwpt_id = self->hwpt_ids[0];
 
 	self->mmap_flags = MAP_SHARED | MAP_ANONYMOUS;
@@ -1274,6 +1276,36 @@ TEST_F(iommufd_mock_domain, user_copy)
 	test_ioctl_destroy(ioas_id);
 }
 
+TEST_F(iommufd_mock_domain, replace)
+{
+	uint32_t ioas_id;
+
+	test_ioctl_ioas_alloc(&ioas_id);
+
+	test_cmd_mock_domain_replace(self->device_ids[0], ioas_id);
+
+	/*
+	 * Replacing the IOAS causes the prior HWPT to be deallocated, thus we
+	 * should get enoent when we try to use it.
+	 */
+	if (variant->mock_domains == 1)
+		test_err_mock_domain_replace(ENOENT, self->device_ids[0],
+					     self->hwpt_ids[0]);
+
+	test_cmd_mock_domain_replace(self->device_ids[0], ioas_id);
+	if (variant->mock_domains >= 2) {
+		test_cmd_mock_domain_replace(self->device_ids[0],
+					     self->hwpt_ids[1]);
+		test_cmd_mock_domain_replace(self->device_ids[0],
+					     self->hwpt_ids[1]);
+		test_cmd_mock_domain_replace(self->device_ids[0],
+					     self->hwpt_ids[0]);
+	}
+
+	test_cmd_mock_domain_replace(self->device_ids[0], self->ioas_id);
+	test_ioctl_destroy(ioas_id);
+}
+
 /* VFIO compatibility IOCTLs */
 
 TEST_F(iommufd, simple_ioctls)
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index dda8f0187cd88b..352835cfe84de0 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -569,4 +569,31 @@ TEST_FAIL_NTH(basic_fail_nth, access_pin_domain)
 	return 0;
 }
 
+/* device.c */
+TEST_FAIL_NTH(basic_fail_nth, device)
+{
+	uint32_t ioas_id;
+	uint32_t ioas_id2;
+	uint32_t device_id;
+
+	self->fd = open("/dev/iommu", O_RDWR);
+	if (self->fd == -1)
+		return -1;
+
+	if (_test_ioctl_ioas_alloc(self->fd, &ioas_id))
+		return -1;
+
+	if (_test_ioctl_ioas_alloc(self->fd, &ioas_id2))
+		return -1;
+
+	fail_nth_enable();
+
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, NULL))
+		return -1;
+
+	if (_test_cmd_mock_domain_replace(self->fd, device_id, ioas_id2, NULL))
+		return -1;
+	return 0;
+}
+
 TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 0d1f46369c2a30..68dbb59f9a515a 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -66,6 +66,34 @@ static int _test_cmd_mock_domain(int fd, unsigned int ioas_id, __u32 *device_id,
 	EXPECT_ERRNO(_errno, _test_cmd_mock_domain(self->fd, ioas_id, \
 						   device_id, hwpt_id))
 
+static int _test_cmd_mock_domain_replace(int fd, __u32 device_id, __u32 pt_id,
+					 __u32 *hwpt_id)
+{
+	struct iommu_test_cmd cmd = {
+		.size = sizeof(cmd),
+		.op = IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE,
+		.id = device_id,
+		.mock_domain_replace = {
+			.pt_id = pt_id,
+		},
+	};
+	int ret;
+
+	ret = ioctl(fd, IOMMU_TEST_CMD, &cmd);
+	if (ret)
+		return ret;
+	if (hwpt_id)
+		*hwpt_id = cmd.mock_domain_replace.pt_id;
+	return 0;
+}
+
+#define test_cmd_mock_domain_replace(device_id, pt_id)                         \
+	ASSERT_EQ(0, _test_cmd_mock_domain_replace(self->fd, device_id, pt_id, \
+						   NULL))
+#define test_err_mock_domain_replace(_errno, device_id, pt_id) \
+	EXPECT_ERRNO(_errno, _test_cmd_mock_domain_replace(    \
+				     self->fd, device_id, pt_id, NULL))
+
 static int _test_cmd_create_access(int fd, unsigned int ioas_id,
 				   __u32 *access_id, unsigned int flags)
 {
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (10 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 11/14] iommufd/selftest: Test iommufd_device_replace() Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-03-06  1:42   ` Nicolin Chen
  2023-03-17  3:02   ` Tian, Kevin
  2023-02-25  0:27 ` [PATCH 13/14] iommufd/selftest: Return the real idev id from selftest mock_domain Jason Gunthorpe
                   ` (2 subsequent siblings)
  14 siblings, 2 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

This allows userspace to manually create HWPTs on IOAS's and then use
those HWPTs as inputs to iommufd_device_attach/replace().

Following series will extend this to allow creating iommu_domains with
driver specific parameters.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/hw_pagetable.c    | 46 +++++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h |  9 +++++
 drivers/iommu/iommufd/main.c            |  3 ++
 include/uapi/linux/iommufd.h            | 26 ++++++++++++++
 4 files changed, 84 insertions(+)

diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 2584f9038b29a2..a1f87193057385 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
  */
 #include <linux/iommu.h>
+#include <uapi/linux/iommufd.h>
 
 #include "iommufd_private.h"
 
@@ -121,3 +122,48 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
 	iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
 	return ERR_PTR(rc);
 }
+
+int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
+{
+	struct iommu_hwpt_alloc *cmd = ucmd->cmd;
+	struct iommufd_hw_pagetable *hwpt;
+	struct iommufd_device *idev;
+	struct iommufd_ioas *ioas;
+	int rc;
+
+	if (cmd->flags)
+		return -EOPNOTSUPP;
+
+	idev = iommufd_get_device(ucmd, cmd->dev_id);
+	if (IS_ERR(idev))
+		return PTR_ERR(idev);
+
+	ioas = iommufd_get_ioas(ucmd, cmd->pt_id);
+	if (IS_ERR(ioas)) {
+		rc = PTR_ERR(idev);
+		goto out_put_idev;
+	}
+
+	mutex_lock(&ioas->mutex);
+	hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, idev, false);
+	mutex_unlock(&ioas->mutex);
+	if (IS_ERR(hwpt)) {
+		rc = PTR_ERR(idev);
+		goto out_put_ioas;
+	}
+
+	cmd->out_hwpt_id = hwpt->obj.id;
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+	if (rc)
+		goto out_hwpt;
+	iommufd_object_finalize(ucmd->ictx, &hwpt->obj);
+	goto out_put_ioas;
+
+out_hwpt:
+	iommufd_object_abort_and_destroy(ucmd->ictx, &hwpt->obj);
+out_put_ioas:
+	iommufd_put_object(&ioas->obj);
+out_put_idev:
+	iommufd_put_object(&idev->obj);
+	return rc;
+}
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index cfcda73942b533..c9acc70d84f794 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -261,6 +261,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
 struct iommufd_hw_pagetable *
 iommufd_hw_pagetable_detach(struct iommufd_device *idev);
 void iommufd_hw_pagetable_destroy(struct iommufd_object *obj);
+int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd);
 
 static inline void iommufd_hw_pagetable_put(struct iommufd_ctx *ictx,
 					    struct iommufd_hw_pagetable *hwpt)
@@ -296,6 +297,14 @@ struct iommufd_device {
 	bool enforce_cache_coherency;
 };
 
+static inline struct iommufd_device *
+iommufd_get_device(struct iommufd_ucmd *ucmd, u32 id)
+{
+	return container_of(iommufd_get_object(ucmd->ictx, id,
+					       IOMMUFD_OBJ_DEVICE),
+			    struct iommufd_device, obj);
+}
+
 void iommufd_device_destroy(struct iommufd_object *obj);
 
 struct iommufd_access {
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 9cba592d0482e7..694da191e4b155 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -261,6 +261,7 @@ static int iommufd_option(struct iommufd_ucmd *ucmd)
 
 union ucmd_buffer {
 	struct iommu_destroy destroy;
+	struct iommu_hwpt_alloc hwpt;
 	struct iommu_ioas_alloc alloc;
 	struct iommu_ioas_allow_iovas allow_iovas;
 	struct iommu_ioas_copy ioas_copy;
@@ -292,6 +293,8 @@ struct iommufd_ioctl_op {
 	}
 static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
 	IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct iommu_destroy, id),
+	IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc,
+		 __reserved),
 	IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl,
 		 struct iommu_ioas_alloc, out_ioas_id),
 	IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas,
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 98ebba80cfa1fc..ccd36acad36a3f 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -45,6 +45,7 @@ enum {
 	IOMMUFD_CMD_IOAS_UNMAP,
 	IOMMUFD_CMD_OPTION,
 	IOMMUFD_CMD_VFIO_IOAS,
+	IOMMUFD_CMD_HWPT_ALLOC,
 };
 
 /**
@@ -344,4 +345,29 @@ struct iommu_vfio_ioas {
 	__u16 __reserved;
 };
 #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
+
+/**
+ * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC)
+ * @size: sizeof(struct iommu_hwpt_alloc)
+ * @flags: Must be 0
+ * @dev_id: The device to allocate this HWPT for
+ * @pt_id: The IOAS to connect this HWPT to
+ * @out_hwpt_id: The ID of the new HWPT
+ * @__reserved: Must be 0
+ *
+ * Explicitly allocate a hardware page table object. This is the same object
+ * type that is returned by iommufd_device_attach() and represents the
+ * underlying iommu driver's iommu_domain kernel object.
+ *
+ * A normal HWPT will be created with the mappings from the given IOAS.
+ */
+struct iommu_hwpt_alloc {
+	__u32 size;
+	__u32 flags;
+	__u32 dev_id;
+	__u32 pt_id;
+	__u32 out_hwpt_id;
+	__u32 __reserved;
+};
+#define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC)
 #endif
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 13/14] iommufd/selftest: Return the real idev id from selftest mock_domain
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (11 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-02-25  0:27 ` [PATCH 14/14] iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC Jason Gunthorpe
  2023-03-07  8:42 ` [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Tian, Kevin
  14 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

Now that we actually call iommufd_device_bind() we can return the
idev_id from that function to userspace for use in other APIs.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/iommufd_test.h          |  3 +++
 drivers/iommu/iommufd/selftest.c              |  1 +
 tools/testing/selftests/iommu/iommufd.c       | 17 +++++++-------
 .../selftests/iommu/iommufd_fail_nth.c        | 23 ++++++++++++-------
 tools/testing/selftests/iommu/iommufd_utils.h | 10 ++++----
 5 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index c60c501775e9bb..d35620158c8b15 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -50,8 +50,11 @@ struct iommu_test_cmd {
 			__aligned_u64 length;
 		} add_reserved;
 		struct {
+			/* out_device_id is the selftest object */
 			__u32 out_device_id;
 			__u32 out_hwpt_id;
+			/* out_idev_id is the standard iommufd_bind object */
+			__u32 out_idev_id;
 		} mock_domain;
 		struct {
 			__u32 pt_id;
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 701d6d6be3f06d..9cb206b31374fd 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -443,6 +443,7 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd,
 	/* Userspace must destroy the device_id to destroy the object */
 	cmd->mock_domain.out_hwpt_id = pt_id;
 	cmd->mock_domain.out_device_id = sobj->obj.id;
+	cmd->mock_domain.out_idev_id = idev_id;
 	iommufd_object_finalize(ucmd->ictx, &sobj->obj);
 	return iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index e1252a874f0b32..7e6fe263e1b62e 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -214,7 +214,7 @@ FIXTURE_SETUP(iommufd_ioas)
 
 	for (i = 0; i != variant->mock_domains; i++) {
 		test_cmd_mock_domain(self->ioas_id, &self->device_id,
-				     &self->hwpt_id);
+				     &self->hwpt_id, NULL);
 		self->base_iova = MOCK_APERTURE_START;
 	}
 }
@@ -265,7 +265,7 @@ TEST_F(iommufd_ioas, hwpt_attach)
 {
 	/* Create a device attached directly to a hwpt */
 	if (self->device_id) {
-		test_cmd_mock_domain(self->hwpt_id, NULL, NULL);
+		test_cmd_mock_domain(self->hwpt_id, NULL, NULL, NULL);
 	} else {
 		test_err_mock_domain(ENOENT, self->hwpt_id, NULL, NULL);
 	}
@@ -650,7 +650,7 @@ TEST_F(iommufd_ioas, access_pin)
 				   _IOMMU_TEST_CMD(IOMMU_TEST_OP_ACCESS_PAGES),
 				   &access_cmd));
 		test_cmd_mock_domain(self->ioas_id, &mock_device_id,
-				     &mock_hwpt_id);
+				     &mock_hwpt_id, NULL);
 		check_map_cmd.id = mock_hwpt_id;
 		ASSERT_EQ(0, ioctl(self->fd,
 				   _IOMMU_TEST_CMD(IOMMU_TEST_OP_MD_CHECK_MAP),
@@ -805,7 +805,7 @@ TEST_F(iommufd_ioas, fork_gone)
 		 * If a domain already existed then everything was pinned within
 		 * the fork, so this copies from one domain to another.
 		 */
-		test_cmd_mock_domain(self->ioas_id, NULL, NULL);
+		test_cmd_mock_domain(self->ioas_id, NULL, NULL, NULL);
 		check_access_rw(_metadata, self->fd, access_id,
 				MOCK_APERTURE_START, 0);
 
@@ -854,7 +854,7 @@ TEST_F(iommufd_ioas, fork_present)
 	ASSERT_EQ(8, read(efd, &tmp, sizeof(tmp)));
 
 	/* Read pages from the remote process */
-	test_cmd_mock_domain(self->ioas_id, NULL, NULL);
+	test_cmd_mock_domain(self->ioas_id, NULL, NULL, NULL);
 	check_access_rw(_metadata, self->fd, access_id, MOCK_APERTURE_START, 0);
 
 	ASSERT_EQ(0, close(pipefds[1]));
@@ -1002,6 +1002,7 @@ FIXTURE(iommufd_mock_domain)
 	uint32_t hwpt_id;
 	uint32_t hwpt_ids[2];
 	uint32_t device_ids[2];
+	uint32_t idev_ids[2];
 	int mmap_flags;
 	size_t mmap_buf_size;
 };
@@ -1024,7 +1025,7 @@ FIXTURE_SETUP(iommufd_mock_domain)
 
 	for (i = 0; i != variant->mock_domains; i++)
 		test_cmd_mock_domain(self->ioas_id, &self->device_ids[i],
-				     &self->hwpt_ids[i]);
+				     &self->hwpt_ids[i], &self->idev_ids[i]);
 	self->hwpt_id = self->hwpt_ids[0];
 
 	self->mmap_flags = MAP_SHARED | MAP_ANONYMOUS;
@@ -1218,7 +1219,7 @@ TEST_F(iommufd_mock_domain, all_aligns_copy)
 			/* Add and destroy a domain while the area exists */
 			old_id = self->hwpt_ids[1];
 			test_cmd_mock_domain(self->ioas_id, &mock_device_id,
-					     &self->hwpt_ids[1]);
+					     &self->hwpt_ids[1], NULL);
 
 			check_mock_iova(buf + start, iova, length);
 			check_refs(buf + start / PAGE_SIZE * PAGE_SIZE,
@@ -1427,7 +1428,7 @@ FIXTURE_SETUP(vfio_compat_mock_domain)
 
 	/* Create what VFIO would consider a group */
 	test_ioctl_ioas_alloc(&self->ioas_id);
-	test_cmd_mock_domain(self->ioas_id, NULL, NULL);
+	test_cmd_mock_domain(self->ioas_id, NULL, NULL, NULL);
 
 	/* Attach it to the vfio compat */
 	vfio_ioas_cmd.ioas_id = self->ioas_id;
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index 352835cfe84de0..f2012db43fbc16 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -313,7 +313,8 @@ TEST_FAIL_NTH(basic_fail_nth, map_domain)
 
 	fail_nth_enable();
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id,
+				  NULL))
 		return -1;
 
 	if (_test_ioctl_ioas_map(self->fd, ioas_id, buffer, 262144, &iova,
@@ -324,7 +325,8 @@ TEST_FAIL_NTH(basic_fail_nth, map_domain)
 	if (_test_ioctl_destroy(self->fd, device_id))
 		return -1;
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id,
+				  NULL))
 		return -1;
 	return 0;
 }
@@ -348,12 +350,14 @@ TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
 	if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
 		return -1;
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id,
+				  NULL))
 		return -1;
 
 	fail_nth_enable();
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id2, &hwpt_id2))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id2, &hwpt_id2,
+				  NULL))
 		return -1;
 
 	if (_test_ioctl_ioas_map(self->fd, ioas_id, buffer, 262144, &iova,
@@ -367,9 +371,11 @@ TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
 	if (_test_ioctl_destroy(self->fd, device_id2))
 		return -1;
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id,
+				  NULL))
 		return -1;
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id2, &hwpt_id2))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id2, &hwpt_id2,
+				  NULL))
 		return -1;
 	return 0;
 }
@@ -526,7 +532,8 @@ TEST_FAIL_NTH(basic_fail_nth, access_pin_domain)
 	if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
 		return -1;
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, &hwpt_id,
+				  NULL))
 		return -1;
 
 	if (_test_ioctl_ioas_map(self->fd, ioas_id, buffer, BUFFER_SIZE, &iova,
@@ -588,7 +595,7 @@ TEST_FAIL_NTH(basic_fail_nth, device)
 
 	fail_nth_enable();
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, NULL))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, NULL, NULL))
 		return -1;
 
 	if (_test_cmd_mock_domain_replace(self->fd, device_id, ioas_id2, NULL))
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 68dbb59f9a515a..807a2421121b51 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -39,7 +39,7 @@ static unsigned long BUFFER_SIZE;
 	})
 
 static int _test_cmd_mock_domain(int fd, unsigned int ioas_id, __u32 *device_id,
-				 __u32 *hwpt_id)
+				 __u32 *hwpt_id, __u32 *idev_id)
 {
 	struct iommu_test_cmd cmd = {
 		.size = sizeof(cmd),
@@ -57,14 +57,16 @@ static int _test_cmd_mock_domain(int fd, unsigned int ioas_id, __u32 *device_id,
 	assert(cmd.id != 0);
 	if (hwpt_id)
 		*hwpt_id = cmd.mock_domain.out_hwpt_id;
+	if (idev_id)
+		*idev_id = cmd.mock_domain.out_idev_id;
 	return 0;
 }
-#define test_cmd_mock_domain(ioas_id, device_id, hwpt_id)                \
+#define test_cmd_mock_domain(ioas_id, device_id, hwpt_id, idev_id)       \
 	ASSERT_EQ(0, _test_cmd_mock_domain(self->fd, ioas_id, device_id, \
-					   hwpt_id))
+					   hwpt_id, idev_id))
 #define test_err_mock_domain(_errno, ioas_id, device_id, hwpt_id)     \
 	EXPECT_ERRNO(_errno, _test_cmd_mock_domain(self->fd, ioas_id, \
-						   device_id, hwpt_id))
+						   device_id, hwpt_id, NULL))
 
 static int _test_cmd_mock_domain_replace(int fd, __u32 device_id, __u32 pt_id,
 					 __u32 *hwpt_id)
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 14/14] iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (12 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 13/14] iommufd/selftest: Return the real idev id from selftest mock_domain Jason Gunthorpe
@ 2023-02-25  0:27 ` Jason Gunthorpe
  2023-02-26 19:29   ` Nicolin Chen
  2023-03-07  8:42 ` [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Tian, Kevin
  14 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-25  0:27 UTC (permalink / raw)
  To: iommu, Kevin Tian, linux-kselftest; +Cc: kvm, Nicolin Chen, Yi Liu

Test the basic flow.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 tools/testing/selftests/iommu/iommufd.c       | 15 +++++++++++++
 .../selftests/iommu/iommufd_fail_nth.c        | 11 +++++++++-
 tools/testing/selftests/iommu/iommufd_utils.h | 21 +++++++++++++++++++
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 7e6fe263e1b62e..65f1847de5a542 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -1307,6 +1307,21 @@ TEST_F(iommufd_mock_domain, replace)
 	test_ioctl_destroy(ioas_id);
 }
 
+TEST_F(iommufd_mock_domain, alloc_hwpt)
+{
+	int i;
+
+	for (i = 0; i != variant->mock_domains; i++) {
+		uint32_t hwpt_id;
+		uint32_t device_id;
+
+		test_cmd_hwpt_alloc(self->idev_ids[0], self->ioas_id, &hwpt_id);
+		test_cmd_mock_domain(hwpt_id, &device_id, NULL, NULL);
+		test_ioctl_destroy(device_id);
+		test_ioctl_destroy(hwpt_id);
+	}
+}
+
 /* VFIO compatibility IOCTLs */
 
 TEST_F(iommufd, simple_ioctls)
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index f2012db43fbc16..5f293825fc37a0 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -582,6 +582,8 @@ TEST_FAIL_NTH(basic_fail_nth, device)
 	uint32_t ioas_id;
 	uint32_t ioas_id2;
 	uint32_t device_id;
+	uint32_t idev_id;
+	uint32_t hwpt_id;
 
 	self->fd = open("/dev/iommu", O_RDWR);
 	if (self->fd == -1)
@@ -595,11 +597,18 @@ TEST_FAIL_NTH(basic_fail_nth, device)
 
 	fail_nth_enable();
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, NULL, NULL))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &device_id, NULL,
+				  &idev_id))
+		return -1;
+
+	if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, &hwpt_id))
 		return -1;
 
 	if (_test_cmd_mock_domain_replace(self->fd, device_id, ioas_id2, NULL))
 		return -1;
+
+	if (_test_cmd_mock_domain_replace(self->fd, device_id, hwpt_id, NULL))
+		return -1;
 	return 0;
 }
 
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 807a2421121b51..bc0ca8973e7951 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -96,6 +96,27 @@ static int _test_cmd_mock_domain_replace(int fd, __u32 device_id, __u32 pt_id,
 	EXPECT_ERRNO(_errno, _test_cmd_mock_domain_replace(    \
 				     self->fd, device_id, pt_id, NULL))
 
+static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id,
+					 __u32 *hwpt_id)
+{
+	struct iommu_hwpt_alloc cmd = {
+		.size = sizeof(cmd),
+		.dev_id = device_id,
+		.pt_id = pt_id,
+	};
+	int ret;
+
+	ret = ioctl(fd, IOMMU_HWPT_ALLOC, &cmd);
+	if (ret)
+		return ret;
+	if (hwpt_id)
+		*hwpt_id = cmd.out_hwpt_id;
+	return 0;
+}
+
+#define test_cmd_hwpt_alloc(device_id, pt_id, hwpt_id) \
+	ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, hwpt_id))
+
 static int _test_cmd_create_access(int fd, unsigned int ioas_id,
 				   __u32 *access_id, unsigned int flags)
 {
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-25  0:27 ` [PATCH 09/14] iommufd: Add iommufd_device_replace() Jason Gunthorpe
@ 2023-02-26  3:01   ` Baolu Lu
  2023-02-27 13:58     ` Jason Gunthorpe
  2023-02-26  3:13   ` Baolu Lu
  2023-03-02  8:20   ` Tian, Kevin
  2 siblings, 1 reply; 64+ messages in thread
From: Baolu Lu @ 2023-02-26  3:01 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Kevin Tian, linux-kselftest
  Cc: baolu.lu, kvm, Nicolin Chen, Yi Liu

On 2/25/23 8:27 AM, Jason Gunthorpe wrote:
> @@ -437,25 +517,77 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
>   		struct iommufd_ioas *ioas =
>   			container_of(pt_obj, struct iommufd_ioas, obj);
>   
> -		rc = iommufd_device_auto_get_domain(idev, ioas, pt_id);
> -		if (rc)
> +		destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id,
> +							      do_attach);
> +		if (IS_ERR(destroy_hwpt))
>   			goto out_put_pt_obj;
>   		break;
>   	}
>   	default:
> -		rc = -EINVAL;
> +		destroy_hwpt = ERR_PTR(-EINVAL);
>   		goto out_put_pt_obj;
>   	}
> +	iommufd_put_object(pt_obj);
>   
> -	refcount_inc(&idev->obj.users);
> -	rc = 0;
> +	/* This destruction has to be after we unlock everything */
> +	if (destroy_hwpt)

Should this be

	if (!IS_ERR_OR_NULL(destroy_hwpt))

?

> +		iommufd_hw_pagetable_put(idev->ictx, destroy_hwpt);
> +	return 0;
>   
>   out_put_pt_obj:
>   	iommufd_put_object(pt_obj);
> -	return rc;
> +	return PTR_ERR(destroy_hwpt);
> +}

Best regards,
baolu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-25  0:27 ` [PATCH 09/14] iommufd: Add iommufd_device_replace() Jason Gunthorpe
  2023-02-26  3:01   ` Baolu Lu
@ 2023-02-26  3:13   ` Baolu Lu
  2023-02-27 14:00     ` Jason Gunthorpe
  2023-03-02  8:20   ` Tian, Kevin
  2 siblings, 1 reply; 64+ messages in thread
From: Baolu Lu @ 2023-02-26  3:13 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Kevin Tian, linux-kselftest
  Cc: baolu.lu, kvm, Nicolin Chen, Yi Liu

On 2/25/23 8:27 AM, Jason Gunthorpe wrote:
> +/**
> + * iommufd_device_attach - Connect a device to an iommu_domain
> + * @idev: device to attach
> + * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
> + *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID

"Output the hwpt ID" only happens when the caller input an IOAS object
and an auto domain was selected or created for the device.

Do I understand it right?

> + *
> + * This connects the device to an iommu_domain, either automatically or manually
> + * selected. Once this completes the device could do DMA.
> + *
> + * The caller should return the resulting pt_id back to userspace.
> + * This function is undone by calling iommufd_device_detach().
> + */
> +int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
> +{
> +	int rc;
> +
> +	rc = iommufd_device_change_pt(idev, pt_id, &iommufd_device_do_attach);
> +	if (rc)
> +		return rc;
> +
> +	/*
> +	 * Pairs with iommufd_device_detach() - catches caller bugs attempting
> +	 * to destroy a device with an attachment.
> +	 */
> +	refcount_inc(&idev->obj.users);
> +	return 0;
>   }
>   EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD);
>   
> +/**
> + * iommufd_device_replace - Change the device's iommu_domain
> + * @idev: device to change
> + * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
> + *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID

If my above understanding is correct, then replace will never output a
hwpt id as it only happens after a successful attach.

> + *
> + * This is the same as:
> + *   iommufd_device_detach();
> + *   iommufd_device_attach()
> + * If it fails then no change is made to the attachment. The iommu driver may
> + * implement this so there is no disruption in translation. This can only be
> + * called if iommufd_device_attach() has already succeeded.
> + */
> +int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id)
> +{
> +	return iommufd_device_change_pt(idev, pt_id,
> +					&iommufd_device_do_replace);
> +}
> +EXPORT_SYMBOL_NS_GPL(iommufd_device_replace, IOMMUFD);

Best regards,
baolu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 14/14] iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
  2023-02-25  0:27 ` [PATCH 14/14] iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC Jason Gunthorpe
@ 2023-02-26 19:29   ` Nicolin Chen
  2023-02-27 15:02     ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Nicolin Chen @ 2023-02-26 19:29 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Yi Liu

On Fri, Feb 24, 2023 at 08:27:59PM -0400, Jason Gunthorpe wrote:
  
> +static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id,
> +					 __u32 *hwpt_id)
> +{
> +	struct iommu_hwpt_alloc cmd = {
> +		.size = sizeof(cmd),
> +		.dev_id = device_id,
> +		.pt_id = pt_id,
> +	};
> +	int ret;

Can we do "s/device_id/idev_id" to differentiate it from the
"device_id" being used for a selftest device object?

Thanks
Nic

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-26  3:01   ` Baolu Lu
@ 2023-02-27 13:58     ` Jason Gunthorpe
  2023-02-28  1:50       ` Baolu Lu
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-27 13:58 UTC (permalink / raw)
  To: Baolu Lu; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Nicolin Chen, Yi Liu

On Sun, Feb 26, 2023 at 11:01:59AM +0800, Baolu Lu wrote:
> On 2/25/23 8:27 AM, Jason Gunthorpe wrote:
> > @@ -437,25 +517,77 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
> >   		struct iommufd_ioas *ioas =
> >   			container_of(pt_obj, struct iommufd_ioas, obj);
> > -		rc = iommufd_device_auto_get_domain(idev, ioas, pt_id);
> > -		if (rc)
> > +		destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id,
> > +							      do_attach);
> > +		if (IS_ERR(destroy_hwpt))
> >   			goto out_put_pt_obj;
> >   		break;
> >   	}
> >   	default:
> > -		rc = -EINVAL;
> > +		destroy_hwpt = ERR_PTR(-EINVAL);
> >   		goto out_put_pt_obj;
> >   	}
> > +	iommufd_put_object(pt_obj);
> > -	refcount_inc(&idev->obj.users);
> > -	rc = 0;
> > +	/* This destruction has to be after we unlock everything */
> > +	if (destroy_hwpt)
> 
> Should this be
> 
> 	if (!IS_ERR_OR_NULL(destroy_hwpt))
> 
> ?

Never use IS_ERR_OR_NULL ..

What am I missing? all the flows that could possibly have err_ptr here
do goto_out_put_pt_obj ?

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-26  3:13   ` Baolu Lu
@ 2023-02-27 14:00     ` Jason Gunthorpe
  2023-02-28  2:10       ` Baolu Lu
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-27 14:00 UTC (permalink / raw)
  To: Baolu Lu; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Nicolin Chen, Yi Liu

On Sun, Feb 26, 2023 at 11:13:16AM +0800, Baolu Lu wrote:
> On 2/25/23 8:27 AM, Jason Gunthorpe wrote:
> > +/**
> > + * iommufd_device_attach - Connect a device to an iommu_domain
> > + * @idev: device to attach
> > + * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
> > + *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
> 
> "Output the hwpt ID" only happens when the caller input an IOAS object
> and an auto domain was selected or created for the device.
> 
> Do I understand it right?

Technically it always outputs the hwpt, if a hwpt is in put then the
same hwpt is output.

> >   EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD);
> > +/**
> > + * iommufd_device_replace - Change the device's iommu_domain
> > + * @idev: device to change
> > + * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
> > + *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
> 
> If my above understanding is correct, then replace will never output a
> hwpt id as it only happens after a successful attach.

Replace calls iommufd_device_auto_get_domain() which always sets pt_id
on success?

If a HWPT was passed in then it just leaves it unchanged which is also
correct.

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 14/14] iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
  2023-02-26 19:29   ` Nicolin Chen
@ 2023-02-27 15:02     ` Jason Gunthorpe
  2023-02-28  0:17       ` Nicolin Chen
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-27 15:02 UTC (permalink / raw)
  To: Nicolin Chen; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Yi Liu

On Sun, Feb 26, 2023 at 11:29:55AM -0800, Nicolin Chen wrote:
> On Fri, Feb 24, 2023 at 08:27:59PM -0400, Jason Gunthorpe wrote:
>   
> > +static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id,
> > +					 __u32 *hwpt_id)
> > +{
> > +	struct iommu_hwpt_alloc cmd = {
> > +		.size = sizeof(cmd),
> > +		.dev_id = device_id,
> > +		.pt_id = pt_id,
> > +	};
> > +	int ret;
> 
> Can we do "s/device_id/idev_id" to differentiate it from the
> "device_id" being used for a selftest device object?

I renamed the selftest device object to 'stdev_id' instead

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 14/14] iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
  2023-02-27 15:02     ` Jason Gunthorpe
@ 2023-02-28  0:17       ` Nicolin Chen
  0 siblings, 0 replies; 64+ messages in thread
From: Nicolin Chen @ 2023-02-28  0:17 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Yi Liu

On Mon, Feb 27, 2023 at 11:02:32AM -0400, Jason Gunthorpe wrote:
> On Sun, Feb 26, 2023 at 11:29:55AM -0800, Nicolin Chen wrote:
> > On Fri, Feb 24, 2023 at 08:27:59PM -0400, Jason Gunthorpe wrote:
> >   
> > > +static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id,
> > > +					 __u32 *hwpt_id)
> > > +{
> > > +	struct iommu_hwpt_alloc cmd = {
> > > +		.size = sizeof(cmd),
> > > +		.dev_id = device_id,
> > > +		.pt_id = pt_id,
> > > +	};
> > > +	int ret;
> > 
> > Can we do "s/device_id/idev_id" to differentiate it from the
> > "device_id" being used for a selftest device object?
> 
> I renamed the selftest device object to 'stdev_id' instead

Cool. I will pull-rebase.

Thanks
Nic

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-27 13:58     ` Jason Gunthorpe
@ 2023-02-28  1:50       ` Baolu Lu
  2023-02-28 13:51         ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Baolu Lu @ 2023-02-28  1:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, iommu, Kevin Tian, linux-kselftest, kvm, Nicolin Chen, Yi Liu

On 2/27/23 9:58 PM, Jason Gunthorpe wrote:
> On Sun, Feb 26, 2023 at 11:01:59AM +0800, Baolu Lu wrote:
>> On 2/25/23 8:27 AM, Jason Gunthorpe wrote:
>>> @@ -437,25 +517,77 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
>>>    		struct iommufd_ioas *ioas =
>>>    			container_of(pt_obj, struct iommufd_ioas, obj);
>>> -		rc = iommufd_device_auto_get_domain(idev, ioas, pt_id);
>>> -		if (rc)
>>> +		destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id,
>>> +							      do_attach);
>>> +		if (IS_ERR(destroy_hwpt))
>>>    			goto out_put_pt_obj;
>>>    		break;
>>>    	}
>>>    	default:
>>> -		rc = -EINVAL;
>>> +		destroy_hwpt = ERR_PTR(-EINVAL);
>>>    		goto out_put_pt_obj;
>>>    	}
>>> +	iommufd_put_object(pt_obj);
>>> -	refcount_inc(&idev->obj.users);
>>> -	rc = 0;
>>> +	/* This destruction has to be after we unlock everything */
>>> +	if (destroy_hwpt)
>> Should this be
>>
>> 	if (!IS_ERR_OR_NULL(destroy_hwpt))
>>
>> ?
> Never use IS_ERR_OR_NULL ..

Can you please elaborate a bit on this? I can still see a lot of use of
it in the tree.

> 
> What am I missing? all the flows that could possibly have err_ptr here
> do goto_out_put_pt_obj ?

Oh yes! You are right. All err_ptr's have gone to an error handling
path.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-27 14:00     ` Jason Gunthorpe
@ 2023-02-28  2:10       ` Baolu Lu
  2023-02-28 13:52         ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Baolu Lu @ 2023-02-28  2:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, iommu, Kevin Tian, linux-kselftest, kvm, Nicolin Chen, Yi Liu

On 2/27/23 10:00 PM, Jason Gunthorpe wrote:
> On Sun, Feb 26, 2023 at 11:13:16AM +0800, Baolu Lu wrote:
>> On 2/25/23 8:27 AM, Jason Gunthorpe wrote:
>>> +/**
>>> + * iommufd_device_attach - Connect a device to an iommu_domain
>>> + * @idev: device to attach
>>> + * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
>>> + *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
>>
>> "Output the hwpt ID" only happens when the caller input an IOAS object
>> and an auto domain was selected or created for the device.
>>
>> Do I understand it right?
> 
> Technically it always outputs the hwpt, if a hwpt is in put then the
> same hwpt is output.

 From the code point of view, the pt_id is set only when an auto domain
is selected. Otherwise, it is untouched. Hence, probably we could
describe it more accurately in the comments. That is, if auto domain is
selected, its hwpt id will be returned in pt_id and the caller could
return it to userspace.

> 
>>>    EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD);
>>> +/**
>>> + * iommufd_device_replace - Change the device's iommu_domain
>>> + * @idev: device to change
>>> + * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
>>> + *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
>>
>> If my above understanding is correct, then replace will never output a
>> hwpt id as it only happens after a successful attach.
> 
> Replace calls iommufd_device_auto_get_domain() which always sets pt_id
> on success?

Yes. replace also calls iommufd_device_auto_get_domain().

> 
> If a HWPT was passed in then it just leaves it unchanged which is also
> correct.

Functionally right. Above I just want to make the comment matches what
the real code does.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-28  1:50       ` Baolu Lu
@ 2023-02-28 13:51         ` Jason Gunthorpe
  2023-03-01  1:55           ` Baolu Lu
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-28 13:51 UTC (permalink / raw)
  To: Baolu Lu; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Nicolin Chen, Yi Liu

On Tue, Feb 28, 2023 at 09:50:52AM +0800, Baolu Lu wrote:
> On 2/27/23 9:58 PM, Jason Gunthorpe wrote:
> > On Sun, Feb 26, 2023 at 11:01:59AM +0800, Baolu Lu wrote:
> > > On 2/25/23 8:27 AM, Jason Gunthorpe wrote:
> > > > @@ -437,25 +517,77 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
> > > >    		struct iommufd_ioas *ioas =
> > > >    			container_of(pt_obj, struct iommufd_ioas, obj);
> > > > -		rc = iommufd_device_auto_get_domain(idev, ioas, pt_id);
> > > > -		if (rc)
> > > > +		destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id,
> > > > +							      do_attach);
> > > > +		if (IS_ERR(destroy_hwpt))
> > > >    			goto out_put_pt_obj;
> > > >    		break;
> > > >    	}
> > > >    	default:
> > > > -		rc = -EINVAL;
> > > > +		destroy_hwpt = ERR_PTR(-EINVAL);
> > > >    		goto out_put_pt_obj;
> > > >    	}
> > > > +	iommufd_put_object(pt_obj);
> > > > -	refcount_inc(&idev->obj.users);
> > > > -	rc = 0;
> > > > +	/* This destruction has to be after we unlock everything */
> > > > +	if (destroy_hwpt)
> > > Should this be
> > > 
> > > 	if (!IS_ERR_OR_NULL(destroy_hwpt))
> > > 
> > > ?
> > Never use IS_ERR_OR_NULL ..
> 
> Can you please elaborate a bit on this? I can still see a lot of use of
> it in the tree.

Yes, sadly. It is usually some signal of toxic confusion about what
things mean.

A function that returns an ERR_PTR should very rarely return NULL, and
if it does return NULL then NULL wasn't an error.

Further you should never store an ERR_PTR in some structure and then
later try to test it, that is madness.

So with properly structured code the need should not exist.

https://lore.kernel.org/all/20130109150427.GL3931@n2100.arm.linux.org.uk/

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-28  2:10       ` Baolu Lu
@ 2023-02-28 13:52         ` Jason Gunthorpe
  2023-03-01  2:23           ` Baolu Lu
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-02-28 13:52 UTC (permalink / raw)
  To: Baolu Lu; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Nicolin Chen, Yi Liu

On Tue, Feb 28, 2023 at 10:10:41AM +0800, Baolu Lu wrote:

> > If a HWPT was passed in then it just leaves it unchanged which is also
> > correct.
> 
> Functionally right. Above I just want to make the comment matches what
> the real code does.

It does, on output the pt_id is always the IOMMUFD_OBJ_HW_PAGETABLE ID
of the attached HWPT.

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-28 13:51         ` Jason Gunthorpe
@ 2023-03-01  1:55           ` Baolu Lu
  0 siblings, 0 replies; 64+ messages in thread
From: Baolu Lu @ 2023-03-01  1:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, iommu, Kevin Tian, linux-kselftest, kvm, Nicolin Chen, Yi Liu

On 2/28/23 9:51 PM, Jason Gunthorpe wrote:
> On Tue, Feb 28, 2023 at 09:50:52AM +0800, Baolu Lu wrote:
>> On 2/27/23 9:58 PM, Jason Gunthorpe wrote:
>>> On Sun, Feb 26, 2023 at 11:01:59AM +0800, Baolu Lu wrote:
>>>> On 2/25/23 8:27 AM, Jason Gunthorpe wrote:
>>>>> @@ -437,25 +517,77 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
>>>>>     		struct iommufd_ioas *ioas =
>>>>>     			container_of(pt_obj, struct iommufd_ioas, obj);
>>>>> -		rc = iommufd_device_auto_get_domain(idev, ioas, pt_id);
>>>>> -		if (rc)
>>>>> +		destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id,
>>>>> +							      do_attach);
>>>>> +		if (IS_ERR(destroy_hwpt))
>>>>>     			goto out_put_pt_obj;
>>>>>     		break;
>>>>>     	}
>>>>>     	default:
>>>>> -		rc = -EINVAL;
>>>>> +		destroy_hwpt = ERR_PTR(-EINVAL);
>>>>>     		goto out_put_pt_obj;
>>>>>     	}
>>>>> +	iommufd_put_object(pt_obj);
>>>>> -	refcount_inc(&idev->obj.users);
>>>>> -	rc = 0;
>>>>> +	/* This destruction has to be after we unlock everything */
>>>>> +	if (destroy_hwpt)
>>>> Should this be
>>>>
>>>> 	if (!IS_ERR_OR_NULL(destroy_hwpt))
>>>>
>>>> ?
>>> Never use IS_ERR_OR_NULL ..
>> Can you please elaborate a bit on this? I can still see a lot of use of
>> it in the tree.
> Yes, sadly. It is usually some signal of toxic confusion about what
> things mean.
> 
> A function that returns an ERR_PTR should very rarely return NULL, and
> if it does return NULL then NULL wasn't an error.

That's true.

> Further you should never store an ERR_PTR in some structure and then
> later try to test it, that is madness.
> 
> So with properly structured code the need should not exist.
> 
> https://lore.kernel.org/all/20130109150427.GL3931@n2100.arm.linux.org.uk/

It's clear to me now. Thanks a lot for the explanation.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-28 13:52         ` Jason Gunthorpe
@ 2023-03-01  2:23           ` Baolu Lu
  0 siblings, 0 replies; 64+ messages in thread
From: Baolu Lu @ 2023-03-01  2:23 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, iommu, Kevin Tian, linux-kselftest, kvm, Nicolin Chen, Yi Liu

On 2/28/23 9:52 PM, Jason Gunthorpe wrote:
> On Tue, Feb 28, 2023 at 10:10:41AM +0800, Baolu Lu wrote:
> 
>>> If a HWPT was passed in then it just leaves it unchanged which is also
>>> correct.
>> Functionally right. Above I just want to make the comment matches what
>> the real code does.
> It does, on output the pt_id is always the IOMMUFD_OBJ_HW_PAGETABLE ID
> of the attached HWPT.

OK. That's fine.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 01/14] iommufd: Move isolated msi enforcement to iommufd_device_bind()
  2023-02-25  0:27 ` [PATCH 01/14] iommufd: Move isolated msi enforcement to iommufd_device_bind() Jason Gunthorpe
@ 2023-03-02  7:45   ` Tian, Kevin
  0 siblings, 0 replies; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  7:45 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> 
> With the recent rework this no longer needs to be done at domain
> attachment time, we know if the device is usable by iommufd when we bind
> it.
> 
> The value of msi_device_has_isolated_msi() is not allowed to change while
> a driver is bound.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 02/14] iommufd: Add iommufd_group
  2023-02-25  0:27 ` [PATCH 02/14] iommufd: Add iommufd_group Jason Gunthorpe
@ 2023-03-02  7:55   ` Tian, Kevin
  2023-03-02 12:51     ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  7:55 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> 
> +
> +	/*
> +	 * All objects using a group reference must put it before their destroy
> +	 * completes
> +	 */
> +	new_igroup->ictx = ictx;

Looks the comment is not related to the code. Probably put it in the
destroy function?

> +
> +	/*
> +	 * We dropped the lock so igroup is invalid. Assume that the
> +	 * xa had NULL in it, if this guess is wrong then we will obtain
> +	 * the actual value under lock and try again once.

this reads like "if this guess is wrong" is checked outside of the lock.
I'd just remove "under lock".

> +	 */
> +	cur_igroup = NULL;
> +	xa_lock(&ictx->groups);
> +	while (true) {
> +		igroup = __xa_cmpxchg(&ictx->groups, id, cur_igroup,
> new_igroup,
> +				      GFP_KERNEL);
> +		if (xa_is_err(igroup)) {
> +			xa_unlock(&ictx->groups);
> +			iommufd_put_group(new_igroup);
> +			return ERR_PTR(xa_err(igroup));
> +		}
> +
> +		/* new_group was successfully installed */
> +		if (cur_igroup == igroup) {
> +			xa_unlock(&ictx->groups);
> +			return new_igroup;
> +		}
> +
> +		/* Check again if the current group is any good */
> +		if (igroup && igroup->group == group &&
> +		    kref_get_unless_zero(&igroup->ref)) {
> +			xa_unlock(&ictx->groups);
> +			iommufd_put_group(new_igroup);
> +			return igroup;
> +		}
> +		cur_igroup = igroup;
> +	}

Add a WARN_ON(igroup->group != group). The only valid race should
be when an existing group which is created by another device in the
same iommu group is being destroyed then we want another try
hoping that object will be removed from xarray soon. But there should
not be a case with the same slot pointing to a different iommu group.

> @@ -98,7 +195,7 @@ struct iommufd_device *iommufd_device_bind(struct
> iommufd_ctx *ictx,
>  	/* The calling driver is a user until iommufd_device_unbind() */
>  	refcount_inc(&idev->obj.users);
>  	/* group refcount moves into iommufd_device */
> -	idev->group = group;
> +	idev->igroup = igroup;

the comment about group refcount is stale now.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group
  2023-02-25  0:27 ` [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group Jason Gunthorpe
@ 2023-03-02  8:01   ` Tian, Kevin
  2023-03-06 20:22     ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  8:01 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> 
> +struct iommufd_hw_pagetable *
> +iommufd_hw_pagetable_detach(struct iommufd_device *idev)
>  {
> -	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup))
> +	struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
> +
> +	lockdep_assert_held(&idev->igroup->lock);
> +
> +	idev->igroup->devices--;
> +	if (!idev->igroup->devices) {
>  		iommu_detach_group(hwpt->domain, idev->igroup->group);
> +		idev->igroup->hwpt = NULL;

hwpt->obj.users should be decremented here instead of leaving it
in iommufd_device_detach().

>  void iommufd_device_detach(struct iommufd_device *idev)
>  {
> -	struct iommufd_hw_pagetable *hwpt = idev->hwpt;
> +	struct iommufd_hw_pagetable *hwpt;
> 
> -	mutex_lock(&hwpt->devices_lock);
> -	list_del(&idev->devices_item);
> -	idev->hwpt = NULL;
> -	iommufd_hw_pagetable_detach(hwpt, idev);
> -	mutex_unlock(&hwpt->devices_lock);
> +	mutex_lock(&idev->igroup->lock);
> +	hwpt = iommufd_hw_pagetable_detach(idev);

the only parameter is idev while the name is called hw_pagetable_xxx.

is it cleaner to get hwpt here and then pass into the detach function?


^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 04/14] iommufd: Use the iommufd_group to avoid duplicate reserved groups and msi setup
  2023-02-25  0:27 ` [PATCH 04/14] iommufd: Use the iommufd_group to avoid duplicate reserved groups and msi setup Jason Gunthorpe
@ 2023-03-02  8:06   ` Tian, Kevin
  2023-03-02 12:55     ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  8:06 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> 
> These items only needs to be done once per group, not once per device. The
> once per device was a way to make the device list work. Since we are
> abandoning this we can optimize things a bit.
> 

Are we sure that a device hot-plugged into the group won't increase the
list of reserved IOVA ranges?

Currently iommu_get_group_resv_regions() queries reserved regions per
device and then merge them into the specified list.

Of course VFIO cannot cope with it since its group_attach() is done only
once. But if such scenario does exist then the current per-device
reservation in iommufd looks an improvement.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 05/14] iommufd: Make sw_msi_start a group global
  2023-02-25  0:27 ` [PATCH 05/14] iommufd: Make sw_msi_start a group global Jason Gunthorpe
@ 2023-03-02  8:09   ` Tian, Kevin
  2023-03-06 20:27     ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  8:09 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> 
>  static int iommufd_device_setup_msi(struct iommufd_device *idev,
> -				    struct iommufd_hw_pagetable *hwpt,
> -				    phys_addr_t sw_msi_start)
> +				    struct iommufd_hw_pagetable *hwpt)

Assuming ARM folks are OK with this change, better rename this function
to be iommufd_group related.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 06/14] iommufd: Move putting a hwpt to a helper function
  2023-02-25  0:27 ` [PATCH 06/14] iommufd: Move putting a hwpt to a helper function Jason Gunthorpe
@ 2023-03-02  8:12   ` Tian, Kevin
  2023-03-06 20:29     ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  8:12 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
>
> @@ -481,11 +481,7 @@ void iommufd_device_detach(struct
> iommufd_device *idev)
>  	hwpt = iommufd_hw_pagetable_detach(idev);
>  	mutex_unlock(&idev->igroup->lock);
> 
> -	if (hwpt->auto_domain)
> -		iommufd_object_destroy_user(idev->ictx, &hwpt->obj);
> -	else
> -		refcount_dec(&hwpt->obj.users);
> -
> +	iommufd_hw_pagetable_put(idev->ictx, hwpt);
>  	refcount_dec(&idev->obj.users);
>  }

As commented in patch3 this should be called in
iommufd_hw_pagetable_detach() when idev->igroup->hwpt is cleared.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 07/14] iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc()
  2023-02-25  0:27 ` [PATCH 07/14] iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc() Jason Gunthorpe
@ 2023-03-02  8:14   ` Tian, Kevin
  0 siblings, 0 replies; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  8:14 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> 
> Logically the HWPT should have the coherency set properly for the device
> that it is being created for when it is created.
> 
> This was happening implicitly if the immediate_attach was set because
> iommufd_hw_pagetable_attach() does it as the first thing.
> 
> Do it unconditionally so !immediate_attach works properly.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 08/14] iommu: Introduce a new iommu_group_replace_domain() API
  2023-02-25  0:27 ` [PATCH 08/14] iommu: Introduce a new iommu_group_replace_domain() API Jason Gunthorpe
@ 2023-03-02  8:16   ` Tian, Kevin
  0 siblings, 0 replies; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  8:16 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> 
> From: Nicolin Chen <nicolinc@nvidia.com>
> 
> qemu has a need to replace the translations associated with a domain
> when the guest does large-scale operations like switching between an
> IDENTITY domain and, say, dma-iommu.c.
> 
> Currently, it does this by replacing all the mappings in a single
> domain, but this is very inefficient and means that domains have to be
> per-device rather than per-translation.
> 
> Provide a high-level API to allow replacements of one domain with
> another. This is similar to a detach/attach cycle except it doesn't
> force the group to go to the blocking domain in-between.
> 
> By removing this forced blocking domain the iommu driver has the
> opportunity to implement a non-disruptive replacement of the domain to the
> greatest extent its hardware allows. This allows the qemu emulation of the
> vIOMMU to be more complete, as real hardware often has a non-distruptive
> replacement capability.
> 
> It could be possible to adderss this by simply removing the protection
> from the iommu_attach_group(), but it is not so clear if that is safe for
> the few users. Thus, add a new API to serve this new purpose.
> 
> All drivers are already required to support changing between active
> UNMANAGED domains when using their attach_dev ops.
> 
> This API is expected to be used only by IOMMUFD, so add to the iommu-priv
> header and mark it as IOMMUFD_INTERNAL.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-02-25  0:27 ` [PATCH 09/14] iommufd: Add iommufd_device_replace() Jason Gunthorpe
  2023-02-26  3:01   ` Baolu Lu
  2023-02-26  3:13   ` Baolu Lu
@ 2023-03-02  8:20   ` Tian, Kevin
  2023-03-06 20:44     ` Jason Gunthorpe
  2 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-02  8:20 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
>
> +static struct iommufd_hw_pagetable *
> +iommufd_device_do_replace_locked(struct iommufd_device *idev,
> +				 struct iommufd_hw_pagetable *hwpt)
> +{
> +	struct iommufd_hw_pagetable *old_hwpt;
> +	int rc;
> +
> +	lockdep_assert_held(&idev->igroup->lock);
> +
> +	/* Try to upgrade the domain we have */
> +	if (idev->enforce_cache_coherency) {
> +		rc = iommufd_hw_pagetable_enforce_cc(hwpt);
> +		if (rc)
> +			return ERR_PTR(-EINVAL);
> +	}
> +
> +	rc = iommufd_device_setup_msi(idev, hwpt);
> +	if (rc)
> +		return ERR_PTR(rc);
> +
> +	old_hwpt = idev->igroup->hwpt;
> +	if (hwpt->ioas != old_hwpt->ioas) {
> +		rc = iopt_table_enforce_group_resv_regions(
> +			&hwpt->ioas->iopt, idev->igroup->group, NULL);
> +		if (rc)
> +			return ERR_PTR(rc);
> +	}

This is inconsistent with the earlier cleanup in the attach path
where setup_msi/enforce_group_resv_region are done only
once per group (if that is the right thing to do).

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/14] iommufd: Add iommufd_group
  2023-03-02  7:55   ` Tian, Kevin
@ 2023-03-02 12:51     ` Jason Gunthorpe
  2023-03-03  2:13       ` Tian, Kevin
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-02 12:51 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Thu, Mar 02, 2023 at 07:55:12AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, February 25, 2023 8:28 AM
> > 
> > +
> > +	/*
> > +	 * All objects using a group reference must put it before their destroy
> > +	 * completes
> > +	 */
> > +	new_igroup->ictx = ictx;
> 
> Looks the comment is not related to the code. Probably put it in the
> destroy function?

It explains why we don't take a reference on ictx here.

> > +	cur_igroup = NULL;
> > +	xa_lock(&ictx->groups);
> > +	while (true) {
> > +		igroup = __xa_cmpxchg(&ictx->groups, id, cur_igroup,
> > new_igroup,
> > +				      GFP_KERNEL);
> > +		if (xa_is_err(igroup)) {
> > +			xa_unlock(&ictx->groups);
> > +			iommufd_put_group(new_igroup);
> > +			return ERR_PTR(xa_err(igroup));
> > +		}
> > +
> > +		/* new_group was successfully installed */
> > +		if (cur_igroup == igroup) {
> > +			xa_unlock(&ictx->groups);
> > +			return new_igroup;
> > +		}
> > +
> > +		/* Check again if the current group is any good */
> > +		if (igroup && igroup->group == group &&
> > +		    kref_get_unless_zero(&igroup->ref)) {
> > +			xa_unlock(&ictx->groups);
> > +			iommufd_put_group(new_igroup);
> > +			return igroup;
> > +		}
> > +		cur_igroup = igroup;
> > +	}
> 
> Add a WARN_ON(igroup->group != group). The only valid race should
> be when an existing group which is created by another device in the
> same iommu group is being destroyed then we want another try
> hoping that object will be removed from xarray soon. But there should
> not be a case with the same slot pointing to a different iommu group.

Yeah, that is the case, both the group checks are WARN_ON's
because we hold an iommu_group reference as long as the xarray entry
is popoulated so it should be impossible for another iommu_group
pointer to have the same ID.

> > @@ -98,7 +195,7 @@ struct iommufd_device *iommufd_device_bind(struct
> > iommufd_ctx *ictx,
> >  	/* The calling driver is a user until iommufd_device_unbind() */
> >  	refcount_inc(&idev->obj.users);
> >  	/* group refcount moves into iommufd_device */
> > -	idev->group = group;
> > +	idev->igroup = igroup;
> 
> the comment about group refcount is stale now.

You mean it should say 'igroup refcount' ?

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 04/14] iommufd: Use the iommufd_group to avoid duplicate reserved groups and msi setup
  2023-03-02  8:06   ` Tian, Kevin
@ 2023-03-02 12:55     ` Jason Gunthorpe
  2023-03-03  2:16       ` Tian, Kevin
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-02 12:55 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Thu, Mar 02, 2023 at 08:06:27AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, February 25, 2023 8:28 AM
> > 
> > These items only needs to be done once per group, not once per device. The
> > once per device was a way to make the device list work. Since we are
> > abandoning this we can optimize things a bit.
> > 
> 
> Are we sure that a device hot-plugged into the group won't increase the
> list of reserved IOVA ranges?
> 
> Currently iommu_get_group_resv_regions() queries reserved regions per
> device and then merge them into the specified list.

So, maybe we should export the device API from the core code and use
it here instead of this group stuff. We don't need the core code to
merge each device together for us anyhow.

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 02/14] iommufd: Add iommufd_group
  2023-03-02 12:51     ` Jason Gunthorpe
@ 2023-03-03  2:13       ` Tian, Kevin
  2023-03-06 19:16         ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-03  2:13 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, March 2, 2023 8:52 PM
> 
> 
> > > @@ -98,7 +195,7 @@ struct iommufd_device
> *iommufd_device_bind(struct
> > > iommufd_ctx *ictx,
> > >  	/* The calling driver is a user until iommufd_device_unbind() */
> > >  	refcount_inc(&idev->obj.users);
> > >  	/* group refcount moves into iommufd_device */
> > > -	idev->group = group;
> > > +	idev->igroup = igroup;
> >
> > the comment about group refcount is stale now.
> 
> You mean it should say 'igroup refcount' ?
> 

The original comment refers to the refcnt of iommu group.

Now that refcnt is held by iommufd_group and no movement per se.

I'd just remove this comment.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 04/14] iommufd: Use the iommufd_group to avoid duplicate reserved groups and msi setup
  2023-03-02 12:55     ` Jason Gunthorpe
@ 2023-03-03  2:16       ` Tian, Kevin
  0 siblings, 0 replies; 64+ messages in thread
From: Tian, Kevin @ 2023-03-03  2:16 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, March 2, 2023 8:55 PM
> 
> On Thu, Mar 02, 2023 at 08:06:27AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Saturday, February 25, 2023 8:28 AM
> > >
> > > These items only needs to be done once per group, not once per device.
> The
> > > once per device was a way to make the device list work. Since we are
> > > abandoning this we can optimize things a bit.
> > >
> >
> > Are we sure that a device hot-plugged into the group won't increase the
> > list of reserved IOVA ranges?
> >
> > Currently iommu_get_group_resv_regions() queries reserved regions per
> > device and then merge them into the specified list.
> 
> So, maybe we should export the device API from the core code and use
> it here instead of this group stuff. We don't need the core code to
> merge each device together for us anyhow.
> 

Yes, that'd be clearer in this context.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC
  2023-02-25  0:27 ` [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC Jason Gunthorpe
@ 2023-03-06  1:42   ` Nicolin Chen
  2023-03-06 20:31     ` Jason Gunthorpe
  2023-03-17  3:02   ` Tian, Kevin
  1 sibling, 1 reply; 64+ messages in thread
From: Nicolin Chen @ 2023-03-06  1:42 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Yi Liu

On Fri, Feb 24, 2023 at 08:27:57PM -0400, Jason Gunthorpe wrote:

> +int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_hwpt_alloc *cmd = ucmd->cmd;
> +	struct iommufd_hw_pagetable *hwpt;
> +	struct iommufd_device *idev;
> +	struct iommufd_ioas *ioas;
> +	int rc;
> +
> +	if (cmd->flags)
> +		return -EOPNOTSUPP;
> +
> +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> +	if (IS_ERR(idev))
> +		return PTR_ERR(idev);
> +
> +	ioas = iommufd_get_ioas(ucmd, cmd->pt_id);
> +	if (IS_ERR(ioas)) {
> +		rc = PTR_ERR(idev);

PTR_ERR(ioas)

> +		goto out_put_idev;
> +	}
> +
> +	mutex_lock(&ioas->mutex);
> +	hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, idev, false);
> +	mutex_unlock(&ioas->mutex);
> +	if (IS_ERR(hwpt)) {
> +		rc = PTR_ERR(idev);
 
PTR_ERR(hwpt)

Thanks
Nic

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/14] iommufd: Add iommufd_group
  2023-03-03  2:13       ` Tian, Kevin
@ 2023-03-06 19:16         ` Jason Gunthorpe
  2023-03-07  2:32           ` Tian, Kevin
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-06 19:16 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Fri, Mar 03, 2023 at 02:13:30AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Thursday, March 2, 2023 8:52 PM
> > 
> > 
> > > > @@ -98,7 +195,7 @@ struct iommufd_device
> > *iommufd_device_bind(struct
> > > > iommufd_ctx *ictx,
> > > >  	/* The calling driver is a user until iommufd_device_unbind() */
> > > >  	refcount_inc(&idev->obj.users);
> > > >  	/* group refcount moves into iommufd_device */
> > > > -	idev->group = group;
> > > > +	idev->igroup = igroup;
> > >
> > > the comment about group refcount is stale now.
> > 
> > You mean it should say 'igroup refcount' ?
> > 
> 
> The original comment refers to the refcnt of iommu group.
> 
> Now that refcnt is held by iommufd_group and no movement per se.

igroup has a reference as well and it is this reference that is now
being moved

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group
  2023-03-02  8:01   ` Tian, Kevin
@ 2023-03-06 20:22     ` Jason Gunthorpe
  2023-03-07  2:38       ` Tian, Kevin
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-06 20:22 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Thu, Mar 02, 2023 at 08:01:00AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, February 25, 2023 8:28 AM
> > 
> > +struct iommufd_hw_pagetable *
> > +iommufd_hw_pagetable_detach(struct iommufd_device *idev)
> >  {
> > -	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup))
> > +	struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
> > +
> > +	lockdep_assert_held(&idev->igroup->lock);
> > +
> > +	idev->igroup->devices--;
> > +	if (!idev->igroup->devices) {
> >  		iommu_detach_group(hwpt->domain, idev->igroup->group);
> > +		idev->igroup->hwpt = NULL;
> 
> hwpt->obj.users should be decremented here instead of leaving it
> in iommufd_device_detach().

It is like this because eventually we can't call
iommufd_object_destroy_user() while holding the locks.

So the lowest function returns the hwpt up the call chain and once
everything is unlocked then it calls iommufd_hw_pagetable_put()

> >  void iommufd_device_detach(struct iommufd_device *idev)
> >  {
> > -	struct iommufd_hw_pagetable *hwpt = idev->hwpt;
> > +	struct iommufd_hw_pagetable *hwpt;
> > 
> > -	mutex_lock(&hwpt->devices_lock);
> > -	list_del(&idev->devices_item);
> > -	idev->hwpt = NULL;
> > -	iommufd_hw_pagetable_detach(hwpt, idev);
> > -	mutex_unlock(&hwpt->devices_lock);
> > +	mutex_lock(&idev->igroup->lock);
> > +	hwpt = iommufd_hw_pagetable_detach(idev);
> 
> the only parameter is idev while the name is called hw_pagetable_xxx.
>
> is it cleaner to get hwpt here and then pass into the detach function?

Not really, the function needs three members of the idev to work.

The pair'd attach function is:

int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
				struct iommufd_device *idev)

So I think it would be very unclear to change the name, and this is
more a hw_pagetable operation than a device operation.

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] iommufd: Make sw_msi_start a group global
  2023-03-02  8:09   ` Tian, Kevin
@ 2023-03-06 20:27     ` Jason Gunthorpe
  0 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-06 20:27 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Thu, Mar 02, 2023 at 08:09:46AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, February 25, 2023 8:28 AM
> > 
> >  static int iommufd_device_setup_msi(struct iommufd_device *idev,
> > -				    struct iommufd_hw_pagetable *hwpt,
> > -				    phys_addr_t sw_msi_start)
> > +				    struct iommufd_hw_pagetable *hwpt)
> 
> Assuming ARM folks are OK with this change, better rename this function
> to be iommufd_group related.

OK

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 06/14] iommufd: Move putting a hwpt to a helper function
  2023-03-02  8:12   ` Tian, Kevin
@ 2023-03-06 20:29     ` Jason Gunthorpe
  0 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-06 20:29 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Thu, Mar 02, 2023 at 08:12:28AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, February 25, 2023 8:28 AM
> >
> > @@ -481,11 +481,7 @@ void iommufd_device_detach(struct
> > iommufd_device *idev)
> >  	hwpt = iommufd_hw_pagetable_detach(idev);
> >  	mutex_unlock(&idev->igroup->lock);
> > 
> > -	if (hwpt->auto_domain)
> > -		iommufd_object_destroy_user(idev->ictx, &hwpt->obj);
> > -	else
> > -		refcount_dec(&hwpt->obj.users);
> > -
> > +	iommufd_hw_pagetable_put(idev->ictx, hwpt);
> >  	refcount_dec(&idev->obj.users);
> >  }
> 
> As commented in patch3 this should be called in
> iommufd_hw_pagetable_detach() when idev->igroup->hwpt is cleared.

Same answer, has to be called after we unlock everything. The issue is
not device_detach which has simple locking but
iommufd_device_change_pt() in a later patch.

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC
  2023-03-06  1:42   ` Nicolin Chen
@ 2023-03-06 20:31     ` Jason Gunthorpe
  0 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-06 20:31 UTC (permalink / raw)
  To: Nicolin Chen; +Cc: iommu, Kevin Tian, linux-kselftest, kvm, Yi Liu

On Sun, Mar 05, 2023 at 05:42:56PM -0800, Nicolin Chen wrote:
> On Fri, Feb 24, 2023 at 08:27:57PM -0400, Jason Gunthorpe wrote:
> 
> > +int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_hwpt_alloc *cmd = ucmd->cmd;
> > +	struct iommufd_hw_pagetable *hwpt;
> > +	struct iommufd_device *idev;
> > +	struct iommufd_ioas *ioas;
> > +	int rc;
> > +
> > +	if (cmd->flags)
> > +		return -EOPNOTSUPP;
> > +
> > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > +	if (IS_ERR(idev))
> > +		return PTR_ERR(idev);
> > +
> > +	ioas = iommufd_get_ioas(ucmd, cmd->pt_id);
> > +	if (IS_ERR(ioas)) {
> > +		rc = PTR_ERR(idev);
> 
> PTR_ERR(ioas)
> 
> > +		goto out_put_idev;
> > +	}
> > +
> > +	mutex_lock(&ioas->mutex);
> > +	hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, idev, false);
> > +	mutex_unlock(&ioas->mutex);
> > +	if (IS_ERR(hwpt)) {
> > +		rc = PTR_ERR(idev);
>  
> PTR_ERR(hwpt)

Oops, yep

Thanks,
Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-03-02  8:20   ` Tian, Kevin
@ 2023-03-06 20:44     ` Jason Gunthorpe
  2023-03-07  2:42       ` Tian, Kevin
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-06 20:44 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Thu, Mar 02, 2023 at 08:20:05AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, February 25, 2023 8:28 AM
> >
> > +static struct iommufd_hw_pagetable *
> > +iommufd_device_do_replace_locked(struct iommufd_device *idev,
> > +				 struct iommufd_hw_pagetable *hwpt)
> > +{
> > +	struct iommufd_hw_pagetable *old_hwpt;
> > +	int rc;
> > +
> > +	lockdep_assert_held(&idev->igroup->lock);
> > +
> > +	/* Try to upgrade the domain we have */
> > +	if (idev->enforce_cache_coherency) {
> > +		rc = iommufd_hw_pagetable_enforce_cc(hwpt);
> > +		if (rc)
> > +			return ERR_PTR(-EINVAL);
> > +	}
> > +
> > +	rc = iommufd_device_setup_msi(idev, hwpt);
> > +	if (rc)
> > +		return ERR_PTR(rc);
> > +
> > +	old_hwpt = idev->igroup->hwpt;
> > +	if (hwpt->ioas != old_hwpt->ioas) {
> > +		rc = iopt_table_enforce_group_resv_regions(
> > +			&hwpt->ioas->iopt, idev->igroup->group, NULL);
> > +		if (rc)
> > +			return ERR_PTR(rc);
> > +	}
> 
> This is inconsistent with the earlier cleanup in the attach path
> where setup_msi/enforce_group_resv_region are done only
> once per group (if that is the right thing to do).

Logically replace is 'detach every device in the group' - which makes
devices 0 - then 'reattach them all' to the new ioas.

So at this point it is still being done only once per group.

The 2nd idevs to call this function will see hwpt->ioas ==
old_hwpt->ioas

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 02/14] iommufd: Add iommufd_group
  2023-03-06 19:16         ` Jason Gunthorpe
@ 2023-03-07  2:32           ` Tian, Kevin
  0 siblings, 0 replies; 64+ messages in thread
From: Tian, Kevin @ 2023-03-07  2:32 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 7, 2023 3:17 AM
> 
> On Fri, Mar 03, 2023 at 02:13:30AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Thursday, March 2, 2023 8:52 PM
> > >
> > >
> > > > > @@ -98,7 +195,7 @@ struct iommufd_device
> > > *iommufd_device_bind(struct
> > > > > iommufd_ctx *ictx,
> > > > >  	/* The calling driver is a user until iommufd_device_unbind() */
> > > > >  	refcount_inc(&idev->obj.users);
> > > > >  	/* group refcount moves into iommufd_device */
> > > > > -	idev->group = group;
> > > > > +	idev->igroup = igroup;
> > > >
> > > > the comment about group refcount is stale now.
> > >
> > > You mean it should say 'igroup refcount' ?
> > >
> >
> > The original comment refers to the refcnt of iommu group.
> >
> > Now that refcnt is held by iommufd_group and no movement per se.
> 
> igroup has a reference as well and it is this reference that is now
> being moved
> 

oh, yes.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group
  2023-03-06 20:22     ` Jason Gunthorpe
@ 2023-03-07  2:38       ` Tian, Kevin
  2023-03-07 13:53         ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-07  2:38 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 7, 2023 4:22 AM
> 
> On Thu, Mar 02, 2023 at 08:01:00AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Saturday, February 25, 2023 8:28 AM
> > >
> > > +struct iommufd_hw_pagetable *
> > > +iommufd_hw_pagetable_detach(struct iommufd_device *idev)
> > >  {
> > > -	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup))
> > > +	struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
> > > +
> > > +	lockdep_assert_held(&idev->igroup->lock);
> > > +
> > > +	idev->igroup->devices--;
> > > +	if (!idev->igroup->devices) {
> > >  		iommu_detach_group(hwpt->domain, idev->igroup->group);
> > > +		idev->igroup->hwpt = NULL;
> >
> > hwpt->obj.users should be decremented here instead of leaving it
> > in iommufd_device_detach().
> 
> It is like this because eventually we can't call
> iommufd_object_destroy_user() while holding the locks.
> 
> So the lowest function returns the hwpt up the call chain and once
> everything is unlocked then it calls iommufd_hw_pagetable_put()

but don't we have unbalanced refcnt poke?

device_attach:

	if (!idev->igroup->devices) {
 		rc = iommu_attach_group(hwpt->domain, idev->igroup->group);
 		if (rc)
 			goto err_unresv;
		idev->igroup->hwpt = hwpt;
		refcount_inc(&hwpt->obj.users); //first device increases refcnt
 	}

device_detach:

	idev->igroup->devices--;
	if (!idev->igroup->devices) {
 		iommu_detach_group(hwpt->domain, idev->igroup->group);
		idev->igroup->hwpt = NULL;
	}

	...

	if (hwpt->auto_domain)
		iommufd_object_destroy_user(idev->ictx, &hwpt->obj);
	else
		refcount_dec(&hwpt->obj.users); //every device decrease refcnt


^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-03-06 20:44     ` Jason Gunthorpe
@ 2023-03-07  2:42       ` Tian, Kevin
  2023-03-07 13:54         ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-07  2:42 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 7, 2023 4:45 AM
> 
> On Thu, Mar 02, 2023 at 08:20:05AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Saturday, February 25, 2023 8:28 AM
> > >
> > > +static struct iommufd_hw_pagetable *
> > > +iommufd_device_do_replace_locked(struct iommufd_device *idev,
> > > +				 struct iommufd_hw_pagetable *hwpt)
> > > +{
> > > +	struct iommufd_hw_pagetable *old_hwpt;
> > > +	int rc;
> > > +
> > > +	lockdep_assert_held(&idev->igroup->lock);
> > > +
> > > +	/* Try to upgrade the domain we have */
> > > +	if (idev->enforce_cache_coherency) {
> > > +		rc = iommufd_hw_pagetable_enforce_cc(hwpt);
> > > +		if (rc)
> > > +			return ERR_PTR(-EINVAL);
> > > +	}
> > > +
> > > +	rc = iommufd_device_setup_msi(idev, hwpt);
> > > +	if (rc)
> > > +		return ERR_PTR(rc);
> > > +
> > > +	old_hwpt = idev->igroup->hwpt;
> > > +	if (hwpt->ioas != old_hwpt->ioas) {
> > > +		rc = iopt_table_enforce_group_resv_regions(
> > > +			&hwpt->ioas->iopt, idev->igroup->group, NULL);
> > > +		if (rc)
> > > +			return ERR_PTR(rc);
> > > +	}
> >
> > This is inconsistent with the earlier cleanup in the attach path
> > where setup_msi/enforce_group_resv_region are done only
> > once per group (if that is the right thing to do).
> 
> Logically replace is 'detach every device in the group' - which makes
> devices 0 - then 'reattach them all' to the new ioas.
> 
> So at this point it is still being done only once per group.
> 
> The 2nd idevs to call this function will see hwpt->ioas ==
> old_hwpt->ioas
> 

but setup_msi() is still done for every device which is inconsistent
with what patch5 does.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt
  2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
                   ` (13 preceding siblings ...)
  2023-02-25  0:27 ` [PATCH 14/14] iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC Jason Gunthorpe
@ 2023-03-07  8:42 ` Tian, Kevin
  2023-03-07 12:46   ` Jason Gunthorpe
  14 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-07  8:42 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> 
[...]
> The implementation is complicated because we have to introduce some
> per-iommu_group memory in iommufd and redo how we think about multi-
> device
> groups to be more explicit. This solves all the locking problems in the
> prior attempts.
> 

Now think about the pasid case.

pasid attach is managed as a device operation today:
	iommu_attach_device_pasid()

Following it naturally we'll have a pasid array per iommufd_device to
track attached HWPT per pasid.

But internally there is only one pasid table per iommu group. i.e. same
story as RID attach that once dev1 replaces hwpt on pasid1 then it takes
effect on all other devices in the same group.

Then confusion comes. If we must have explicit group object in iommufd
to manage domain replacement per rid, then do we need the same
explicit mechanism e.g. tracking pasid attached hwpt in iommufd_group
instead of in iommufd_device? and have a iommu_attach_group_pasid() API.

In contrast if it's fine to continue managing pasid attach/replace per device,
does it imply there is another way of solving it cleanly w/o explicit group
object in the rid case?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt
  2023-03-07  8:42 ` [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Tian, Kevin
@ 2023-03-07 12:46   ` Jason Gunthorpe
  2023-03-08  2:08     ` Baolu Lu
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-07 12:46 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Tue, Mar 07, 2023 at 08:42:06AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, February 25, 2023 8:28 AM
> > 
> [...]
> > The implementation is complicated because we have to introduce some
> > per-iommu_group memory in iommufd and redo how we think about multi-
> > device
> > groups to be more explicit. This solves all the locking problems in the
> > prior attempts.
> > 
> 
> Now think about the pasid case.
> 
> pasid attach is managed as a device operation today:
> 	iommu_attach_device_pasid()
> 
> Following it naturally we'll have a pasid array per iommufd_device to
> track attached HWPT per pasid.
> 
> But internally there is only one pasid table per iommu group. i.e. same
> story as RID attach that once dev1 replaces hwpt on pasid1 then it takes
> effect on all other devices in the same group.

IMHO I can't belive that any actual systems that support PASID have a
RID aliasing problem too.

I think we should fix the iommu core to make PASID per-device and
require systems that have a RID aliasing problem to block PASID.

This is a bigger picture, if drivers have to optionally share their
PASID tables with other drivers then we can't have per-driver PASID
allocators at all either.

> Then confusion comes. If we must have explicit group object in iommufd
> to manage domain replacement per rid, then do we need the same
> explicit mechanism e.g. tracking pasid attached hwpt in iommufd_group
> instead of in iommufd_device? and have a iommu_attach_group_pasid() API.

If we make PASID per-group then yes

But no actual HW models per-group PASID to the VM so this is a
complete API disaster for vIOMMU. Better not to do it.

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group
  2023-03-07  2:38       ` Tian, Kevin
@ 2023-03-07 13:53         ` Jason Gunthorpe
  2023-03-08  7:29           ` Tian, Kevin
  0 siblings, 1 reply; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-07 13:53 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Tue, Mar 07, 2023 at 02:38:47AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, March 7, 2023 4:22 AM
> > 
> > On Thu, Mar 02, 2023 at 08:01:00AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Saturday, February 25, 2023 8:28 AM
> > > >
> > > > +struct iommufd_hw_pagetable *
> > > > +iommufd_hw_pagetable_detach(struct iommufd_device *idev)
> > > >  {
> > > > -	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup))
> > > > +	struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
> > > > +
> > > > +	lockdep_assert_held(&idev->igroup->lock);
> > > > +
> > > > +	idev->igroup->devices--;
> > > > +	if (!idev->igroup->devices) {
> > > >  		iommu_detach_group(hwpt->domain, idev->igroup->group);
> > > > +		idev->igroup->hwpt = NULL;
> > >
> > > hwpt->obj.users should be decremented here instead of leaving it
> > > in iommufd_device_detach().
> > 
> > It is like this because eventually we can't call
> > iommufd_object_destroy_user() while holding the locks.
> > 
> > So the lowest function returns the hwpt up the call chain and once
> > everything is unlocked then it calls iommufd_hw_pagetable_put()
> 
> but don't we have unbalanced refcnt poke?

Yes, the refcount should be incremented for every attached device

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/14] iommufd: Add iommufd_device_replace()
  2023-03-07  2:42       ` Tian, Kevin
@ 2023-03-07 13:54         ` Jason Gunthorpe
  0 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-07 13:54 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Tue, Mar 07, 2023 at 02:42:23AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, March 7, 2023 4:45 AM
> > 
> > On Thu, Mar 02, 2023 at 08:20:05AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Saturday, February 25, 2023 8:28 AM
> > > >
> > > > +static struct iommufd_hw_pagetable *
> > > > +iommufd_device_do_replace_locked(struct iommufd_device *idev,
> > > > +				 struct iommufd_hw_pagetable *hwpt)
> > > > +{
> > > > +	struct iommufd_hw_pagetable *old_hwpt;
> > > > +	int rc;
> > > > +
> > > > +	lockdep_assert_held(&idev->igroup->lock);
> > > > +
> > > > +	/* Try to upgrade the domain we have */
> > > > +	if (idev->enforce_cache_coherency) {
> > > > +		rc = iommufd_hw_pagetable_enforce_cc(hwpt);
> > > > +		if (rc)
> > > > +			return ERR_PTR(-EINVAL);
> > > > +	}
> > > > +
> > > > +	rc = iommufd_device_setup_msi(idev, hwpt);
> > > > +	if (rc)
> > > > +		return ERR_PTR(rc);
> > > > +
> > > > +	old_hwpt = idev->igroup->hwpt;
> > > > +	if (hwpt->ioas != old_hwpt->ioas) {
> > > > +		rc = iopt_table_enforce_group_resv_regions(
> > > > +			&hwpt->ioas->iopt, idev->igroup->group, NULL);
> > > > +		if (rc)
> > > > +			return ERR_PTR(rc);
> > > > +	}
> > >
> > > This is inconsistent with the earlier cleanup in the attach path
> > > where setup_msi/enforce_group_resv_region are done only
> > > once per group (if that is the right thing to do).
> > 
> > Logically replace is 'detach every device in the group' - which makes
> > devices 0 - then 'reattach them all' to the new ioas.
> > 
> > So at this point it is still being done only once per group.
> > 
> > The 2nd idevs to call this function will see hwpt->ioas ==
> > old_hwpt->ioas
> > 
> 
> but setup_msi() is still done for every device which is inconsistent
> with what patch5 does.

There is a missing check to do nothing if the hwpt is already set

If the hwpt is not set and the ioas is the same then we still have to
do the setup_msi

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt
  2023-03-07 12:46   ` Jason Gunthorpe
@ 2023-03-08  2:08     ` Baolu Lu
  2023-03-08  7:38       ` Tian, Kevin
  0 siblings, 1 reply; 64+ messages in thread
From: Baolu Lu @ 2023-03-08  2:08 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin
  Cc: baolu.lu, iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On 3/7/23 8:46 PM, Jason Gunthorpe wrote:
> On Tue, Mar 07, 2023 at 08:42:06AM +0000, Tian, Kevin wrote:
>>> From: Jason Gunthorpe<jgg@nvidia.com>
>>> Sent: Saturday, February 25, 2023 8:28 AM
>>>
>> [...]
>>> The implementation is complicated because we have to introduce some
>>> per-iommu_group memory in iommufd and redo how we think about multi-
>>> device
>>> groups to be more explicit. This solves all the locking problems in the
>>> prior attempts.
>>>
>> Now think about the pasid case.
>>
>> pasid attach is managed as a device operation today:
>> 	iommu_attach_device_pasid()
>>
>> Following it naturally we'll have a pasid array per iommufd_device to
>> track attached HWPT per pasid.
>>
>> But internally there is only one pasid table per iommu group. i.e. same
>> story as RID attach that once dev1 replaces hwpt on pasid1 then it takes
>> effect on all other devices in the same group.
> IMHO I can't belive that any actual systems that support PASID have a
> RID aliasing problem too.
> 
> I think we should fix the iommu core to make PASID per-device and
> require systems that have a RID aliasing problem to block PASID.
> 
> This is a bigger picture, if drivers have to optionally share their
> PASID tables with other drivers then we can't have per-driver PASID
> allocators at all either.

This is actually required in PCI and IOMMU core. pci_enable_pasid()
requires full ACS support on device's upstream path:

        if (!pci_acs_path_enabled(pdev, NULL, PCI_ACS_RR | PCI_ACS_UF))
                 return -EINVAL;

and, for such PCI topology, iommu core always allocates an exclusive
iommu group.

The only place where seems to be a little messy is,

static int __iommu_set_group_pasid(struct iommu_domain *domain,
                                    struct iommu_group *group, ioasid_t 
pasid)
{
         struct group_device *device;
         int ret = 0;

         list_for_each_entry(device, &group->devices, list) {
                 ret = domain->ops->set_dev_pasid(domain, device->dev, 
pasid);
                 if (ret)
                         break;
         }

         return ret;
}

Perhaps we need a check on singleton group?

Best regards,
baolu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group
  2023-03-07 13:53         ` Jason Gunthorpe
@ 2023-03-08  7:29           ` Tian, Kevin
  2023-03-08 19:00             ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-08  7:29 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 7, 2023 9:53 PM
> 
> On Tue, Mar 07, 2023 at 02:38:47AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Tuesday, March 7, 2023 4:22 AM
> > >
> > > On Thu, Mar 02, 2023 at 08:01:00AM +0000, Tian, Kevin wrote:
> > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > Sent: Saturday, February 25, 2023 8:28 AM
> > > > >
> > > > > +struct iommufd_hw_pagetable *
> > > > > +iommufd_hw_pagetable_detach(struct iommufd_device *idev)
> > > > >  {
> > > > > -	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup))
> > > > > +	struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
> > > > > +
> > > > > +	lockdep_assert_held(&idev->igroup->lock);
> > > > > +
> > > > > +	idev->igroup->devices--;
> > > > > +	if (!idev->igroup->devices) {
> > > > >  		iommu_detach_group(hwpt->domain, idev->igroup->group);
> > > > > +		idev->igroup->hwpt = NULL;
> > > >
> > > > hwpt->obj.users should be decremented here instead of leaving it
> > > > in iommufd_device_detach().
> > >
> > > It is like this because eventually we can't call
> > > iommufd_object_destroy_user() while holding the locks.
> > >
> > > So the lowest function returns the hwpt up the call chain and once
> > > everything is unlocked then it calls iommufd_hw_pagetable_put()
> >
> > but don't we have unbalanced refcnt poke?
> 
> Yes, the refcount should be incremented for every attached device
> 

per device or per group?

Now it's igroup tracking attached hwpt and each device holds a reference
on the igroup. Then why do we want to further poke the refcnt per attached
device?

sounds it's still clearer to increment it at first device attach and decrement
at last device detach.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt
  2023-03-08  2:08     ` Baolu Lu
@ 2023-03-08  7:38       ` Tian, Kevin
  2023-03-08 18:59         ` Jason Gunthorpe
  0 siblings, 1 reply; 64+ messages in thread
From: Tian, Kevin @ 2023-03-08  7:38 UTC (permalink / raw)
  To: Baolu Lu, Jason Gunthorpe
  Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Wednesday, March 8, 2023 10:08 AM
> 
> On 3/7/23 8:46 PM, Jason Gunthorpe wrote:
> > On Tue, Mar 07, 2023 at 08:42:06AM +0000, Tian, Kevin wrote:
> >>> From: Jason Gunthorpe<jgg@nvidia.com>
> >>> Sent: Saturday, February 25, 2023 8:28 AM
> >>>
> >> [...]
> >>> The implementation is complicated because we have to introduce some
> >>> per-iommu_group memory in iommufd and redo how we think about
> multi-
> >>> device
> >>> groups to be more explicit. This solves all the locking problems in the
> >>> prior attempts.
> >>>
> >> Now think about the pasid case.
> >>
> >> pasid attach is managed as a device operation today:
> >> 	iommu_attach_device_pasid()
> >>
> >> Following it naturally we'll have a pasid array per iommufd_device to
> >> track attached HWPT per pasid.
> >>
> >> But internally there is only one pasid table per iommu group. i.e. same
> >> story as RID attach that once dev1 replaces hwpt on pasid1 then it takes
> >> effect on all other devices in the same group.
> > IMHO I can't belive that any actual systems that support PASID have a
> > RID aliasing problem too.
> >
> > I think we should fix the iommu core to make PASID per-device and
> > require systems that have a RID aliasing problem to block PASID.
> >
> > This is a bigger picture, if drivers have to optionally share their
> > PASID tables with other drivers then we can't have per-driver PASID
> > allocators at all either.
> 
> This is actually required in PCI and IOMMU core. pci_enable_pasid()
> requires full ACS support on device's upstream path:
> 
>         if (!pci_acs_path_enabled(pdev, NULL, PCI_ACS_RR | PCI_ACS_UF))
>                  return -EINVAL;
> 
> and, for such PCI topology, iommu core always allocates an exclusive
> iommu group.
> 
> The only place where seems to be a little messy is,
> 
> static int __iommu_set_group_pasid(struct iommu_domain *domain,
>                                     struct iommu_group *group, ioasid_t
> pasid)
> {
>          struct group_device *device;
>          int ret = 0;
> 
>          list_for_each_entry(device, &group->devices, list) {
>                  ret = domain->ops->set_dev_pasid(domain, device->dev,
> pasid);
>                  if (ret)
>                          break;
>          }
> 
>          return ret;
> }
>

yes, this is exactly where my confusion comes. I thought we ever agreed
that PASID usage won't be supported on multi-devices group. Then I saw
about code (but didn't catch the implication of pci acs) which led me to
think about the group mess.

sticking to per-device pasid would certainly simplify the logic in iommufd. 😊
 
> Perhaps we need a check on singleton group?
> 

or move the xarray to device? Storing it in group while adding a singleton
restriction only causes confusion.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt
  2023-03-08  7:38       ` Tian, Kevin
@ 2023-03-08 18:59         ` Jason Gunthorpe
  0 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-08 18:59 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Baolu Lu, iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Wed, Mar 08, 2023 at 07:38:38AM +0000, Tian, Kevin wrote:
  
> > Perhaps we need a check on singleton group?
> > 
> 
> or move the xarray to device? Storing it in group while adding a singleton
> restriction only causes confusion.

Makes sense

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group
  2023-03-08  7:29           ` Tian, Kevin
@ 2023-03-08 19:00             ` Jason Gunthorpe
  0 siblings, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-08 19:00 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Wed, Mar 08, 2023 at 07:29:36AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, March 7, 2023 9:53 PM
> > 
> > On Tue, Mar 07, 2023 at 02:38:47AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Tuesday, March 7, 2023 4:22 AM
> > > >
> > > > On Thu, Mar 02, 2023 at 08:01:00AM +0000, Tian, Kevin wrote:
> > > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > > Sent: Saturday, February 25, 2023 8:28 AM
> > > > > >
> > > > > > +struct iommufd_hw_pagetable *
> > > > > > +iommufd_hw_pagetable_detach(struct iommufd_device *idev)
> > > > > >  {
> > > > > > -	if (!iommufd_hw_pagetable_has_group(hwpt, idev->igroup))
> > > > > > +	struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
> > > > > > +
> > > > > > +	lockdep_assert_held(&idev->igroup->lock);
> > > > > > +
> > > > > > +	idev->igroup->devices--;
> > > > > > +	if (!idev->igroup->devices) {
> > > > > >  		iommu_detach_group(hwpt->domain, idev->igroup->group);
> > > > > > +		idev->igroup->hwpt = NULL;
> > > > >
> > > > > hwpt->obj.users should be decremented here instead of leaving it
> > > > > in iommufd_device_detach().
> > > >
> > > > It is like this because eventually we can't call
> > > > iommufd_object_destroy_user() while holding the locks.
> > > >
> > > > So the lowest function returns the hwpt up the call chain and once
> > > > everything is unlocked then it calls iommufd_hw_pagetable_put()
> > >
> > > but don't we have unbalanced refcnt poke?
> > 
> > Yes, the refcount should be incremented for every attached device
> > 
> 
> per device or per group?

per device

> Now it's igroup tracking attached hwpt and each device holds a reference
> on the igroup. Then why do we want to further poke the refcnt per attached
> device?

It simplified some things, so this is how I made v2..

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC
  2023-02-25  0:27 ` [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC Jason Gunthorpe
  2023-03-06  1:42   ` Nicolin Chen
@ 2023-03-17  3:02   ` Tian, Kevin
  2023-03-17  4:02     ` Nicolin Chen
  2023-03-21 17:16     ` Jason Gunthorpe
  1 sibling, 2 replies; 64+ messages in thread
From: Tian, Kevin @ 2023-03-17  3:02 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, linux-kselftest; +Cc: kvm, Nicolin Chen, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, February 25, 2023 8:28 AM
> +
> +int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_hwpt_alloc *cmd = ucmd->cmd;
> +	struct iommufd_hw_pagetable *hwpt;
> +	struct iommufd_device *idev;
> +	struct iommufd_ioas *ioas;
> +	int rc;
> +
> +	if (cmd->flags)
> +		return -EOPNOTSUPP;

miss a check on the __reserved field.

> +/**
> + * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC)
> + * @size: sizeof(struct iommu_hwpt_alloc)
> + * @flags: Must be 0
> + * @dev_id: The device to allocate this HWPT for
> + * @pt_id: The IOAS to connect this HWPT to
> + * @out_hwpt_id: The ID of the new HWPT
> + * @__reserved: Must be 0
> + *
> + * Explicitly allocate a hardware page table object. This is the same object
> + * type that is returned by iommufd_device_attach() and represents the
> + * underlying iommu driver's iommu_domain kernel object.
> + *
> + * A normal HWPT will be created with the mappings from the given IOAS.
> + */

'normal' is a confusing word in this context.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC
  2023-03-17  3:02   ` Tian, Kevin
@ 2023-03-17  4:02     ` Nicolin Chen
  2023-03-17 10:20       ` Tian, Kevin
  2023-03-21 17:16     ` Jason Gunthorpe
  1 sibling, 1 reply; 64+ messages in thread
From: Nicolin Chen @ 2023-03-17  4:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Liu, Yi L

On Fri, Mar 17, 2023 at 03:02:26AM +0000, Tian, Kevin wrote:

> > +/**
> > + * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC)
> > + * @size: sizeof(struct iommu_hwpt_alloc)
> > + * @flags: Must be 0
> > + * @dev_id: The device to allocate this HWPT for
> > + * @pt_id: The IOAS to connect this HWPT to
> > + * @out_hwpt_id: The ID of the new HWPT
> > + * @__reserved: Must be 0
> > + *
> > + * Explicitly allocate a hardware page table object. This is the same object
> > + * type that is returned by iommufd_device_attach() and represents the
> > + * underlying iommu driver's iommu_domain kernel object.
> > + *
> > + * A normal HWPT will be created with the mappings from the given IOAS.
> > + */
> 
> 'normal' is a confusing word in this context.

Yea, Eric was asking about a related question in another thread,
because he couldn't get this part. I think we could replace this
"normal" with just "kernel-managed", so eventually it would look
like the following narrative after adding user_data and nesting:

 * A kernel-managed HWPT will be created with the mappings from the given IOAS.
 * The @data_type for its allocation can be set to IOMMU_HWPT_TYPE_DEFAULT, or
 * another type (being listed below) to customize the allocation.
 *
 * A user-managed HWPT will be created from a given parent HWPT via @pt_id, in
 * which the parent HWPT must be allocated previously via the same ioctl from a
 * given IOAS. The @data_type must not be set to IOMMU_HWPT_TYPE_DEFAULT but a
 * pre-defined type corresponding to the underlying IOMMU hardware.

Thanks
Nic

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC
  2023-03-17  4:02     ` Nicolin Chen
@ 2023-03-17 10:20       ` Tian, Kevin
  0 siblings, 0 replies; 64+ messages in thread
From: Tian, Kevin @ 2023-03-17 10:20 UTC (permalink / raw)
  To: Nicolin Chen, Jason Gunthorpe; +Cc: iommu, linux-kselftest, kvm, Liu, Yi L

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Friday, March 17, 2023 12:02 PM
> 
> On Fri, Mar 17, 2023 at 03:02:26AM +0000, Tian, Kevin wrote:
> 
> > > +/**
> > > + * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC)
> > > + * @size: sizeof(struct iommu_hwpt_alloc)
> > > + * @flags: Must be 0
> > > + * @dev_id: The device to allocate this HWPT for
> > > + * @pt_id: The IOAS to connect this HWPT to
> > > + * @out_hwpt_id: The ID of the new HWPT
> > > + * @__reserved: Must be 0
> > > + *
> > > + * Explicitly allocate a hardware page table object. This is the same
> object
> > > + * type that is returned by iommufd_device_attach() and represents the
> > > + * underlying iommu driver's iommu_domain kernel object.
> > > + *
> > > + * A normal HWPT will be created with the mappings from the given
> IOAS.
> > > + */
> >
> > 'normal' is a confusing word in this context.
> 
> Yea, Eric was asking about a related question in another thread,
> because he couldn't get this part. I think we could replace this
> "normal" with just "kernel-managed", so eventually it would look
> like the following narrative after adding user_data and nesting:
> 
>  * A kernel-managed HWPT will be created with the mappings from the given
> IOAS.
>  * The @data_type for its allocation can be set to
> IOMMU_HWPT_TYPE_DEFAULT, or
>  * another type (being listed below) to customize the allocation.
>  *
>  * A user-managed HWPT will be created from a given parent HWPT via
> @pt_id, in
>  * which the parent HWPT must be allocated previously via the same ioctl
> from a
>  * given IOAS. The @data_type must not be set to
> IOMMU_HWPT_TYPE_DEFAULT but a
>  * pre-defined type corresponding to the underlying IOMMU hardware.

that is fine. But at this point it's clearer to simply remove 'normal'. 😊

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC
  2023-03-17  3:02   ` Tian, Kevin
  2023-03-17  4:02     ` Nicolin Chen
@ 2023-03-21 17:16     ` Jason Gunthorpe
  1 sibling, 0 replies; 64+ messages in thread
From: Jason Gunthorpe @ 2023-03-21 17:16 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: iommu, linux-kselftest, kvm, Nicolin Chen, Liu, Yi L

On Fri, Mar 17, 2023 at 03:02:26AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, February 25, 2023 8:28 AM
> > +
> > +int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_hwpt_alloc *cmd = ucmd->cmd;
> > +	struct iommufd_hw_pagetable *hwpt;
> > +	struct iommufd_device *idev;
> > +	struct iommufd_ioas *ioas;
> > +	int rc;
> > +
> > +	if (cmd->flags)
> > +		return -EOPNOTSUPP;
> 
> miss a check on the __reserved field.
> 
> > +/**
> > + * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC)
> > + * @size: sizeof(struct iommu_hwpt_alloc)
> > + * @flags: Must be 0
> > + * @dev_id: The device to allocate this HWPT for
> > + * @pt_id: The IOAS to connect this HWPT to
> > + * @out_hwpt_id: The ID of the new HWPT
> > + * @__reserved: Must be 0
> > + *
> > + * Explicitly allocate a hardware page table object. This is the same object
> > + * type that is returned by iommufd_device_attach() and represents the
> > + * underlying iommu driver's iommu_domain kernel object.
> > + *
> > + * A normal HWPT will be created with the mappings from the given IOAS.
> > + */
> 
> 'normal' is a confusing word in this context.

Got it, thanks

Jason

^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2023-03-21 17:16 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-25  0:27 [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 01/14] iommufd: Move isolated msi enforcement to iommufd_device_bind() Jason Gunthorpe
2023-03-02  7:45   ` Tian, Kevin
2023-02-25  0:27 ` [PATCH 02/14] iommufd: Add iommufd_group Jason Gunthorpe
2023-03-02  7:55   ` Tian, Kevin
2023-03-02 12:51     ` Jason Gunthorpe
2023-03-03  2:13       ` Tian, Kevin
2023-03-06 19:16         ` Jason Gunthorpe
2023-03-07  2:32           ` Tian, Kevin
2023-02-25  0:27 ` [PATCH 03/14] iommufd: Replace the hwpt->devices list with iommufd_group Jason Gunthorpe
2023-03-02  8:01   ` Tian, Kevin
2023-03-06 20:22     ` Jason Gunthorpe
2023-03-07  2:38       ` Tian, Kevin
2023-03-07 13:53         ` Jason Gunthorpe
2023-03-08  7:29           ` Tian, Kevin
2023-03-08 19:00             ` Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 04/14] iommufd: Use the iommufd_group to avoid duplicate reserved groups and msi setup Jason Gunthorpe
2023-03-02  8:06   ` Tian, Kevin
2023-03-02 12:55     ` Jason Gunthorpe
2023-03-03  2:16       ` Tian, Kevin
2023-02-25  0:27 ` [PATCH 05/14] iommufd: Make sw_msi_start a group global Jason Gunthorpe
2023-03-02  8:09   ` Tian, Kevin
2023-03-06 20:27     ` Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 06/14] iommufd: Move putting a hwpt to a helper function Jason Gunthorpe
2023-03-02  8:12   ` Tian, Kevin
2023-03-06 20:29     ` Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 07/14] iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc() Jason Gunthorpe
2023-03-02  8:14   ` Tian, Kevin
2023-02-25  0:27 ` [PATCH 08/14] iommu: Introduce a new iommu_group_replace_domain() API Jason Gunthorpe
2023-03-02  8:16   ` Tian, Kevin
2023-02-25  0:27 ` [PATCH 09/14] iommufd: Add iommufd_device_replace() Jason Gunthorpe
2023-02-26  3:01   ` Baolu Lu
2023-02-27 13:58     ` Jason Gunthorpe
2023-02-28  1:50       ` Baolu Lu
2023-02-28 13:51         ` Jason Gunthorpe
2023-03-01  1:55           ` Baolu Lu
2023-02-26  3:13   ` Baolu Lu
2023-02-27 14:00     ` Jason Gunthorpe
2023-02-28  2:10       ` Baolu Lu
2023-02-28 13:52         ` Jason Gunthorpe
2023-03-01  2:23           ` Baolu Lu
2023-03-02  8:20   ` Tian, Kevin
2023-03-06 20:44     ` Jason Gunthorpe
2023-03-07  2:42       ` Tian, Kevin
2023-03-07 13:54         ` Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 10/14] iommufd: Make destroy_rwsem use a lock class per object type Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 11/14] iommufd/selftest: Test iommufd_device_replace() Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 12/14] iommufd: Add IOMMU_HWPT_ALLOC Jason Gunthorpe
2023-03-06  1:42   ` Nicolin Chen
2023-03-06 20:31     ` Jason Gunthorpe
2023-03-17  3:02   ` Tian, Kevin
2023-03-17  4:02     ` Nicolin Chen
2023-03-17 10:20       ` Tian, Kevin
2023-03-21 17:16     ` Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 13/14] iommufd/selftest: Return the real idev id from selftest mock_domain Jason Gunthorpe
2023-02-25  0:27 ` [PATCH 14/14] iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC Jason Gunthorpe
2023-02-26 19:29   ` Nicolin Chen
2023-02-27 15:02     ` Jason Gunthorpe
2023-02-28  0:17       ` Nicolin Chen
2023-03-07  8:42 ` [PATCH 00/14] Add iommufd physical device operations for replace and alloc hwpt Tian, Kevin
2023-03-07 12:46   ` Jason Gunthorpe
2023-03-08  2:08     ` Baolu Lu
2023-03-08  7:38       ` Tian, Kevin
2023-03-08 18:59         ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).