All of lore.kernel.org
 help / color / mirror / Atom feed
* [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-04-18  0:49 ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

Hi Joerg,

This is a resend version of v8 posted here:
https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu.lu@linux.intel.com/
as we discussed in this thread:
https://lore.kernel.org/linux-iommu/Yk%2Fq1BGN8pC5HVZp@8bytes.org/

All patches can be applied perfectly except this one:
 - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
It conflicts with below refactoring commit:
 - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
The conflict has been fixed in this post.

No functional changes in this series. I suppress cc-ing this series to
all v8 reviewers in order to avoid spam.

Please consider it for your iommu tree.

Best regards,
baolu

Change log:
- v8 and before:
  - Please refer to v8 post for all the change history.

- v8-resend
  - Rebase the series on top of v5.18-rc3.
  - Add Reviewed-by's granted by Robin.
  - Add a Tested-by granted by Eric.

Jason Gunthorpe (1):
  vfio: Delete the unbound_list

Lu Baolu (10):
  iommu: Add DMA ownership management interfaces
  driver core: Add dma_cleanup callback in bus_type
  amba: Stop sharing platform_dma_configure()
  bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
  PCI: pci_stub: Set driver_managed_dma
  PCI: portdrv: Set driver_managed_dma
  vfio: Set DMA ownership for VFIO devices
  vfio: Remove use of vfio_group_viable()
  vfio: Remove iommu group notifier
  iommu: Remove iommu group changes notifier

 include/linux/amba/bus.h              |   8 +
 include/linux/device/bus.h            |   3 +
 include/linux/fsl/mc.h                |   8 +
 include/linux/iommu.h                 |  54 +++---
 include/linux/pci.h                   |   8 +
 include/linux/platform_device.h       |  10 +-
 drivers/amba/bus.c                    |  37 +++-
 drivers/base/dd.c                     |   5 +
 drivers/base/platform.c               |  21 ++-
 drivers/bus/fsl-mc/fsl-mc-bus.c       |  24 ++-
 drivers/iommu/iommu.c                 | 228 ++++++++++++++++--------
 drivers/pci/pci-driver.c              |  18 ++
 drivers/pci/pci-stub.c                |   1 +
 drivers/pci/pcie/portdrv_pci.c        |   2 +
 drivers/vfio/fsl-mc/vfio_fsl_mc.c     |   1 +
 drivers/vfio/pci/vfio_pci.c           |   1 +
 drivers/vfio/platform/vfio_amba.c     |   1 +
 drivers/vfio/platform/vfio_platform.c |   1 +
 drivers/vfio/vfio.c                   | 245 ++------------------------
 19 files changed, 338 insertions(+), 338 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-04-18  0:49 ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

Hi Joerg,

This is a resend version of v8 posted here:
https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu.lu@linux.intel.com/
as we discussed in this thread:
https://lore.kernel.org/linux-iommu/Yk%2Fq1BGN8pC5HVZp@8bytes.org/

All patches can be applied perfectly except this one:
 - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
It conflicts with below refactoring commit:
 - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
The conflict has been fixed in this post.

No functional changes in this series. I suppress cc-ing this series to
all v8 reviewers in order to avoid spam.

Please consider it for your iommu tree.

Best regards,
baolu

Change log:
- v8 and before:
  - Please refer to v8 post for all the change history.

- v8-resend
  - Rebase the series on top of v5.18-rc3.
  - Add Reviewed-by's granted by Robin.
  - Add a Tested-by granted by Eric.

Jason Gunthorpe (1):
  vfio: Delete the unbound_list

Lu Baolu (10):
  iommu: Add DMA ownership management interfaces
  driver core: Add dma_cleanup callback in bus_type
  amba: Stop sharing platform_dma_configure()
  bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
  PCI: pci_stub: Set driver_managed_dma
  PCI: portdrv: Set driver_managed_dma
  vfio: Set DMA ownership for VFIO devices
  vfio: Remove use of vfio_group_viable()
  vfio: Remove iommu group notifier
  iommu: Remove iommu group changes notifier

 include/linux/amba/bus.h              |   8 +
 include/linux/device/bus.h            |   3 +
 include/linux/fsl/mc.h                |   8 +
 include/linux/iommu.h                 |  54 +++---
 include/linux/pci.h                   |   8 +
 include/linux/platform_device.h       |  10 +-
 drivers/amba/bus.c                    |  37 +++-
 drivers/base/dd.c                     |   5 +
 drivers/base/platform.c               |  21 ++-
 drivers/bus/fsl-mc/fsl-mc-bus.c       |  24 ++-
 drivers/iommu/iommu.c                 | 228 ++++++++++++++++--------
 drivers/pci/pci-driver.c              |  18 ++
 drivers/pci/pci-stub.c                |   1 +
 drivers/pci/pcie/portdrv_pci.c        |   2 +
 drivers/vfio/fsl-mc/vfio_fsl_mc.c     |   1 +
 drivers/vfio/pci/vfio_pci.c           |   1 +
 drivers/vfio/platform/vfio_amba.c     |   1 +
 drivers/vfio/platform/vfio_platform.c |   1 +
 drivers/vfio/vfio.c                   | 245 ++------------------------
 19 files changed, 338 insertions(+), 338 deletions(-)

-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

Multiple devices may be placed in the same IOMMU group because they
cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture.

This adds dma ownership management in iommu core and exposes several
interfaces for the device drivers and the device userspace assignment
framework (i.e. VFIO), so that any conflict between user and kernel
controlled dma could be detected at the beginning.

The device driver oriented interfaces are,

	int iommu_device_use_default_domain(struct device *dev);
	void iommu_device_unuse_default_domain(struct device *dev);

By calling iommu_device_use_default_domain(), the device driver tells
the iommu layer that the device dma is handled through the kernel DMA
APIs. The iommu layer will manage the IOVA and use the default domain
for DMA address translation.

The device user-space assignment framework oriented interfaces are,

	int iommu_group_claim_dma_owner(struct iommu_group *group,
					void *owner);
	void iommu_group_release_dma_owner(struct iommu_group *group);
	bool iommu_group_dma_owner_claimed(struct iommu_group *group);

The device userspace assignment must be disallowed if the DMA owner
claiming interface returns failure.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 include/linux/iommu.h |  31 +++++++++
 drivers/iommu/iommu.c | 153 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 181 insertions(+), 3 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1..77972ef978b5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -675,6 +675,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 
+int iommu_device_use_default_domain(struct device *dev);
+void iommu_device_unuse_default_domain(struct device *dev);
+
+int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
+void iommu_group_release_dma_owner(struct iommu_group *group);
+bool iommu_group_dma_owner_claimed(struct iommu_group *group);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1031,6 +1038,30 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
 	return NULL;
 }
+
+static inline int iommu_device_use_default_domain(struct device *dev)
+{
+	return 0;
+}
+
+static inline void iommu_device_unuse_default_domain(struct device *dev)
+{
+}
+
+static inline int
+iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
+{
+	return -ENODEV;
+}
+
+static inline void iommu_group_release_dma_owner(struct iommu_group *group)
+{
+}
+
+static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)
+{
+	return false;
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..eba8e8ccf19d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,8 @@ struct iommu_group {
 	struct iommu_domain *default_domain;
 	struct iommu_domain *domain;
 	struct list_head entry;
+	unsigned int owner_cnt;
+	void *owner;
 };
 
 struct group_device {
@@ -294,7 +296,11 @@ int iommu_probe_device(struct device *dev)
 	mutex_lock(&group->mutex);
 	iommu_alloc_default_domain(group, dev);
 
-	if (group->default_domain) {
+	/*
+	 * If device joined an existing group which has been claimed, don't
+	 * attach the default domain.
+	 */
+	if (group->default_domain && !group->owner) {
 		ret = __iommu_attach_device(group->default_domain, dev);
 		if (ret) {
 			mutex_unlock(&group->mutex);
@@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (group->default_domain && group->domain != group->default_domain)
+	if (group->domain && group->domain != group->default_domain)
 		return -EBUSY;
 
 	ret = __iommu_group_for_each_dev(group, domain,
@@ -2146,7 +2152,11 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (!group->default_domain) {
+	/*
+	 * If the group has been claimed already, do not re-attach the default
+	 * domain.
+	 */
+	if (!group->default_domain || group->owner) {
 		__iommu_group_for_each_dev(group, domain,
 					   iommu_group_do_detach_device);
 		group->domain = NULL;
@@ -3095,3 +3105,140 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 
 	return ret;
 }
+
+/**
+ * iommu_device_use_default_domain() - Device driver wants to handle device
+ *                                     DMA through the kernel DMA API.
+ * @dev: The device.
+ *
+ * The device driver about to bind @dev wants to do DMA through the kernel
+ * DMA API. Return 0 if it is allowed, otherwise an error.
+ */
+int iommu_device_use_default_domain(struct device *dev)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+	int ret = 0;
+
+	if (!group)
+		return 0;
+
+	mutex_lock(&group->mutex);
+	if (group->owner_cnt) {
+		if (group->domain != group->default_domain ||
+		    group->owner) {
+			ret = -EBUSY;
+			goto unlock_out;
+		}
+	}
+
+	group->owner_cnt++;
+
+unlock_out:
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return ret;
+}
+
+/**
+ * iommu_device_unuse_default_domain() - Device driver stops handling device
+ *                                       DMA through the kernel DMA API.
+ * @dev: The device.
+ *
+ * The device driver doesn't want to do DMA through kernel DMA API anymore.
+ * It must be called after iommu_device_use_default_domain().
+ */
+void iommu_device_unuse_default_domain(struct device *dev)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+
+	if (!group)
+		return;
+
+	mutex_lock(&group->mutex);
+	if (!WARN_ON(!group->owner_cnt))
+		group->owner_cnt--;
+
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+}
+
+/**
+ * iommu_group_claim_dma_owner() - Set DMA ownership of a group
+ * @group: The group.
+ * @owner: Caller specified pointer. Used for exclusive ownership.
+ *
+ * This is to support backward compatibility for vfio which manages
+ * the dma ownership in iommu_group level. New invocations on this
+ * interface should be prohibited.
+ */
+int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
+{
+	int ret = 0;
+
+	mutex_lock(&group->mutex);
+	if (group->owner_cnt) {
+		ret = -EPERM;
+		goto unlock_out;
+	} else {
+		if (group->domain && group->domain != group->default_domain) {
+			ret = -EBUSY;
+			goto unlock_out;
+		}
+
+		group->owner = owner;
+		if (group->domain)
+			__iommu_detach_group(group->domain, group);
+	}
+
+	group->owner_cnt++;
+unlock_out:
+	mutex_unlock(&group->mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_group_claim_dma_owner);
+
+/**
+ * iommu_group_release_dma_owner() - Release DMA ownership of a group
+ * @group: The group.
+ *
+ * Release the DMA ownership claimed by iommu_group_claim_dma_owner().
+ */
+void iommu_group_release_dma_owner(struct iommu_group *group)
+{
+	mutex_lock(&group->mutex);
+	if (WARN_ON(!group->owner_cnt || !group->owner))
+		goto unlock_out;
+
+	group->owner_cnt = 0;
+	/*
+	 * The UNMANAGED domain should be detached before all USER
+	 * owners have been released.
+	 */
+	if (!WARN_ON(group->domain) && group->default_domain)
+		__iommu_attach_group(group->default_domain, group);
+	group->owner = NULL;
+unlock_out:
+	mutex_unlock(&group->mutex);
+}
+EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
+
+/**
+ * iommu_group_dma_owner_claimed() - Query group dma ownership status
+ * @group: The group.
+ *
+ * This provides status query on a given group. It is racy and only for
+ * non-binding status reporting.
+ */
+bool iommu_group_dma_owner_claimed(struct iommu_group *group)
+{
+	unsigned int user;
+
+	mutex_lock(&group->mutex);
+	user = group->owner_cnt;
+	mutex_unlock(&group->mutex);
+
+	return user;
+}
+EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

Multiple devices may be placed in the same IOMMU group because they
cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture.

This adds dma ownership management in iommu core and exposes several
interfaces for the device drivers and the device userspace assignment
framework (i.e. VFIO), so that any conflict between user and kernel
controlled dma could be detected at the beginning.

The device driver oriented interfaces are,

	int iommu_device_use_default_domain(struct device *dev);
	void iommu_device_unuse_default_domain(struct device *dev);

By calling iommu_device_use_default_domain(), the device driver tells
the iommu layer that the device dma is handled through the kernel DMA
APIs. The iommu layer will manage the IOVA and use the default domain
for DMA address translation.

The device user-space assignment framework oriented interfaces are,

	int iommu_group_claim_dma_owner(struct iommu_group *group,
					void *owner);
	void iommu_group_release_dma_owner(struct iommu_group *group);
	bool iommu_group_dma_owner_claimed(struct iommu_group *group);

The device userspace assignment must be disallowed if the DMA owner
claiming interface returns failure.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 include/linux/iommu.h |  31 +++++++++
 drivers/iommu/iommu.c | 153 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 181 insertions(+), 3 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1..77972ef978b5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -675,6 +675,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 
+int iommu_device_use_default_domain(struct device *dev);
+void iommu_device_unuse_default_domain(struct device *dev);
+
+int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
+void iommu_group_release_dma_owner(struct iommu_group *group);
+bool iommu_group_dma_owner_claimed(struct iommu_group *group);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1031,6 +1038,30 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
 	return NULL;
 }
+
+static inline int iommu_device_use_default_domain(struct device *dev)
+{
+	return 0;
+}
+
+static inline void iommu_device_unuse_default_domain(struct device *dev)
+{
+}
+
+static inline int
+iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
+{
+	return -ENODEV;
+}
+
+static inline void iommu_group_release_dma_owner(struct iommu_group *group)
+{
+}
+
+static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)
+{
+	return false;
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..eba8e8ccf19d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,8 @@ struct iommu_group {
 	struct iommu_domain *default_domain;
 	struct iommu_domain *domain;
 	struct list_head entry;
+	unsigned int owner_cnt;
+	void *owner;
 };
 
 struct group_device {
@@ -294,7 +296,11 @@ int iommu_probe_device(struct device *dev)
 	mutex_lock(&group->mutex);
 	iommu_alloc_default_domain(group, dev);
 
-	if (group->default_domain) {
+	/*
+	 * If device joined an existing group which has been claimed, don't
+	 * attach the default domain.
+	 */
+	if (group->default_domain && !group->owner) {
 		ret = __iommu_attach_device(group->default_domain, dev);
 		if (ret) {
 			mutex_unlock(&group->mutex);
@@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (group->default_domain && group->domain != group->default_domain)
+	if (group->domain && group->domain != group->default_domain)
 		return -EBUSY;
 
 	ret = __iommu_group_for_each_dev(group, domain,
@@ -2146,7 +2152,11 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (!group->default_domain) {
+	/*
+	 * If the group has been claimed already, do not re-attach the default
+	 * domain.
+	 */
+	if (!group->default_domain || group->owner) {
 		__iommu_group_for_each_dev(group, domain,
 					   iommu_group_do_detach_device);
 		group->domain = NULL;
@@ -3095,3 +3105,140 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 
 	return ret;
 }
+
+/**
+ * iommu_device_use_default_domain() - Device driver wants to handle device
+ *                                     DMA through the kernel DMA API.
+ * @dev: The device.
+ *
+ * The device driver about to bind @dev wants to do DMA through the kernel
+ * DMA API. Return 0 if it is allowed, otherwise an error.
+ */
+int iommu_device_use_default_domain(struct device *dev)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+	int ret = 0;
+
+	if (!group)
+		return 0;
+
+	mutex_lock(&group->mutex);
+	if (group->owner_cnt) {
+		if (group->domain != group->default_domain ||
+		    group->owner) {
+			ret = -EBUSY;
+			goto unlock_out;
+		}
+	}
+
+	group->owner_cnt++;
+
+unlock_out:
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return ret;
+}
+
+/**
+ * iommu_device_unuse_default_domain() - Device driver stops handling device
+ *                                       DMA through the kernel DMA API.
+ * @dev: The device.
+ *
+ * The device driver doesn't want to do DMA through kernel DMA API anymore.
+ * It must be called after iommu_device_use_default_domain().
+ */
+void iommu_device_unuse_default_domain(struct device *dev)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+
+	if (!group)
+		return;
+
+	mutex_lock(&group->mutex);
+	if (!WARN_ON(!group->owner_cnt))
+		group->owner_cnt--;
+
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+}
+
+/**
+ * iommu_group_claim_dma_owner() - Set DMA ownership of a group
+ * @group: The group.
+ * @owner: Caller specified pointer. Used for exclusive ownership.
+ *
+ * This is to support backward compatibility for vfio which manages
+ * the dma ownership in iommu_group level. New invocations on this
+ * interface should be prohibited.
+ */
+int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
+{
+	int ret = 0;
+
+	mutex_lock(&group->mutex);
+	if (group->owner_cnt) {
+		ret = -EPERM;
+		goto unlock_out;
+	} else {
+		if (group->domain && group->domain != group->default_domain) {
+			ret = -EBUSY;
+			goto unlock_out;
+		}
+
+		group->owner = owner;
+		if (group->domain)
+			__iommu_detach_group(group->domain, group);
+	}
+
+	group->owner_cnt++;
+unlock_out:
+	mutex_unlock(&group->mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_group_claim_dma_owner);
+
+/**
+ * iommu_group_release_dma_owner() - Release DMA ownership of a group
+ * @group: The group.
+ *
+ * Release the DMA ownership claimed by iommu_group_claim_dma_owner().
+ */
+void iommu_group_release_dma_owner(struct iommu_group *group)
+{
+	mutex_lock(&group->mutex);
+	if (WARN_ON(!group->owner_cnt || !group->owner))
+		goto unlock_out;
+
+	group->owner_cnt = 0;
+	/*
+	 * The UNMANAGED domain should be detached before all USER
+	 * owners have been released.
+	 */
+	if (!WARN_ON(group->domain) && group->default_domain)
+		__iommu_attach_group(group->default_domain, group);
+	group->owner = NULL;
+unlock_out:
+	mutex_unlock(&group->mutex);
+}
+EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
+
+/**
+ * iommu_group_dma_owner_claimed() - Query group dma ownership status
+ * @group: The group.
+ *
+ * This provides status query on a given group. It is racy and only for
+ * non-binding status reporting.
+ */
+bool iommu_group_dma_owner_claimed(struct iommu_group *group)
+{
+	unsigned int user;
+
+	mutex_lock(&group->mutex);
+	user = group->owner_cnt;
+	mutex_unlock(&group->mutex);
+
+	return user;
+}
+EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

The bus_type structure defines dma_configure() callback for bus drivers
to configure DMA on the devices. This adds the paired dma_cleanup()
callback and calls it during driver unbinding so that bus drivers can do
some cleanup work.

One use case for this paired DMA callbacks is for the bus driver to check
for DMA ownership conflicts during driver binding, where multiple devices
belonging to a same IOMMU group (the minimum granularity of isolation and
protection) may be assigned to kernel drivers or user space respectively.

Without this change, for example, the vfio driver has to listen to a bus
BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict.
This leads to bad user experience since careless driver binding operation
may crash the system if the admin overlooks the group restriction. Aside
from bad design, this leads to a security problem as a root user, even with
lockdown=integrity, can force the kernel to BUG.

With this change, the bus driver could check and set the DMA ownership in
driver binding process and fail on ownership conflicts. The DMA ownership
should be released during driver unbinding.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/device/bus.h | 3 +++
 drivers/base/dd.c          | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index a039ab809753..d8b29ccd07e5 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -59,6 +59,8 @@ struct fwnode_handle;
  *		bus supports.
  * @dma_configure:	Called to setup DMA configuration on a device on
  *			this bus.
+ * @dma_cleanup:	Called to cleanup DMA configuration on a device on
+ *			this bus.
  * @pm:		Power management operations of this bus, callback the specific
  *		device driver's pm-ops.
  * @iommu_ops:  IOMMU specific operations for this bus, used to attach IOMMU
@@ -103,6 +105,7 @@ struct bus_type {
 	int (*num_vf)(struct device *dev);
 
 	int (*dma_configure)(struct device *dev);
+	void (*dma_cleanup)(struct device *dev);
 
 	const struct dev_pm_ops *pm;
 
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 3fc3b5940bb3..94b7ac9bf459 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -671,6 +671,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 	if (dev->bus)
 		blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 					     BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
+	if (dev->bus && dev->bus->dma_cleanup)
+		dev->bus->dma_cleanup(dev);
 pinctrl_bind_failed:
 	device_links_no_driver(dev);
 	device_unbind_cleanup(dev);
@@ -1199,6 +1201,9 @@ static void __device_release_driver(struct device *dev, struct device *parent)
 
 		device_remove(dev);
 
+		if (dev->bus && dev->bus->dma_cleanup)
+			dev->bus->dma_cleanup(dev);
+
 		device_links_driver_cleanup(dev);
 		device_unbind_cleanup(dev);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

The bus_type structure defines dma_configure() callback for bus drivers
to configure DMA on the devices. This adds the paired dma_cleanup()
callback and calls it during driver unbinding so that bus drivers can do
some cleanup work.

One use case for this paired DMA callbacks is for the bus driver to check
for DMA ownership conflicts during driver binding, where multiple devices
belonging to a same IOMMU group (the minimum granularity of isolation and
protection) may be assigned to kernel drivers or user space respectively.

Without this change, for example, the vfio driver has to listen to a bus
BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict.
This leads to bad user experience since careless driver binding operation
may crash the system if the admin overlooks the group restriction. Aside
from bad design, this leads to a security problem as a root user, even with
lockdown=integrity, can force the kernel to BUG.

With this change, the bus driver could check and set the DMA ownership in
driver binding process and fail on ownership conflicts. The DMA ownership
should be released during driver unbinding.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/device/bus.h | 3 +++
 drivers/base/dd.c          | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index a039ab809753..d8b29ccd07e5 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -59,6 +59,8 @@ struct fwnode_handle;
  *		bus supports.
  * @dma_configure:	Called to setup DMA configuration on a device on
  *			this bus.
+ * @dma_cleanup:	Called to cleanup DMA configuration on a device on
+ *			this bus.
  * @pm:		Power management operations of this bus, callback the specific
  *		device driver's pm-ops.
  * @iommu_ops:  IOMMU specific operations for this bus, used to attach IOMMU
@@ -103,6 +105,7 @@ struct bus_type {
 	int (*num_vf)(struct device *dev);
 
 	int (*dma_configure)(struct device *dev);
+	void (*dma_cleanup)(struct device *dev);
 
 	const struct dev_pm_ops *pm;
 
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 3fc3b5940bb3..94b7ac9bf459 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -671,6 +671,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 	if (dev->bus)
 		blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 					     BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
+	if (dev->bus && dev->bus->dma_cleanup)
+		dev->bus->dma_cleanup(dev);
 pinctrl_bind_failed:
 	device_links_no_driver(dev);
 	device_unbind_cleanup(dev);
@@ -1199,6 +1201,9 @@ static void __device_release_driver(struct device *dev, struct device *parent)
 
 		device_remove(dev);
 
+		if (dev->bus && dev->bus->dma_cleanup)
+			dev->bus->dma_cleanup(dev);
+
 		device_links_driver_cleanup(dev);
 		device_unbind_cleanup(dev);
 
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 03/11] amba: Stop sharing platform_dma_configure()
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

Stop sharing platform_dma_configure() helper as they are about to have
their own bus dma_configure callbacks.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/platform_device.h |  2 --
 drivers/amba/bus.c              | 19 ++++++++++++++++++-
 drivers/base/platform.c         |  3 +--
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 7c96f169d274..17fde717df68 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -328,8 +328,6 @@ extern int platform_pm_restore(struct device *dev);
 #define platform_pm_restore		NULL
 #endif
 
-extern int platform_dma_configure(struct device *dev);
-
 #ifdef CONFIG_PM_SLEEP
 #define USE_PLATFORM_PM_SLEEP_OPS \
 	.suspend = platform_pm_suspend, \
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index d3bd14aaabf6..76b52bd2c2a4 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -20,6 +20,8 @@
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 #include <linux/of_irq.h>
+#include <linux/of_device.h>
+#include <linux/acpi.h>
 
 #define to_amba_driver(d)	container_of(d, struct amba_driver, drv)
 
@@ -273,6 +275,21 @@ static void amba_shutdown(struct device *dev)
 		drv->shutdown(to_amba_device(dev));
 }
 
+static int amba_dma_configure(struct device *dev)
+{
+	enum dev_dma_attr attr;
+	int ret = 0;
+
+	if (dev->of_node) {
+		ret = of_dma_configure(dev, dev->of_node, true);
+	} else if (has_acpi_companion(dev)) {
+		attr = acpi_get_dma_attr(to_acpi_device_node(dev->fwnode));
+		ret = acpi_dma_configure(dev, attr);
+	}
+
+	return ret;
+}
+
 #ifdef CONFIG_PM
 /*
  * Hooks to provide runtime PM of the pclk (bus clock).  It is safe to
@@ -341,7 +358,7 @@ struct bus_type amba_bustype = {
 	.probe		= amba_probe,
 	.remove		= amba_remove,
 	.shutdown	= amba_shutdown,
-	.dma_configure	= platform_dma_configure,
+	.dma_configure	= amba_dma_configure,
 	.pm		= &amba_pm,
 };
 EXPORT_SYMBOL_GPL(amba_bustype);
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 8cc272fd5c99..d7915734d931 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1454,8 +1454,7 @@ static void platform_shutdown(struct device *_dev)
 		drv->shutdown(dev);
 }
 
-
-int platform_dma_configure(struct device *dev)
+static int platform_dma_configure(struct device *dev)
 {
 	enum dev_dma_attr attr;
 	int ret = 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 03/11] amba: Stop sharing platform_dma_configure()
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

Stop sharing platform_dma_configure() helper as they are about to have
their own bus dma_configure callbacks.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/platform_device.h |  2 --
 drivers/amba/bus.c              | 19 ++++++++++++++++++-
 drivers/base/platform.c         |  3 +--
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 7c96f169d274..17fde717df68 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -328,8 +328,6 @@ extern int platform_pm_restore(struct device *dev);
 #define platform_pm_restore		NULL
 #endif
 
-extern int platform_dma_configure(struct device *dev);
-
 #ifdef CONFIG_PM_SLEEP
 #define USE_PLATFORM_PM_SLEEP_OPS \
 	.suspend = platform_pm_suspend, \
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index d3bd14aaabf6..76b52bd2c2a4 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -20,6 +20,8 @@
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 #include <linux/of_irq.h>
+#include <linux/of_device.h>
+#include <linux/acpi.h>
 
 #define to_amba_driver(d)	container_of(d, struct amba_driver, drv)
 
@@ -273,6 +275,21 @@ static void amba_shutdown(struct device *dev)
 		drv->shutdown(to_amba_device(dev));
 }
 
+static int amba_dma_configure(struct device *dev)
+{
+	enum dev_dma_attr attr;
+	int ret = 0;
+
+	if (dev->of_node) {
+		ret = of_dma_configure(dev, dev->of_node, true);
+	} else if (has_acpi_companion(dev)) {
+		attr = acpi_get_dma_attr(to_acpi_device_node(dev->fwnode));
+		ret = acpi_dma_configure(dev, attr);
+	}
+
+	return ret;
+}
+
 #ifdef CONFIG_PM
 /*
  * Hooks to provide runtime PM of the pclk (bus clock).  It is safe to
@@ -341,7 +358,7 @@ struct bus_type amba_bustype = {
 	.probe		= amba_probe,
 	.remove		= amba_remove,
 	.shutdown	= amba_shutdown,
-	.dma_configure	= platform_dma_configure,
+	.dma_configure	= amba_dma_configure,
 	.pm		= &amba_pm,
 };
 EXPORT_SYMBOL_GPL(amba_bustype);
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 8cc272fd5c99..d7915734d931 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1454,8 +1454,7 @@ static void platform_shutdown(struct device *_dev)
 		drv->shutdown(dev);
 }
 
-
-int platform_dma_configure(struct device *dev)
+static int platform_dma_configure(struct device *dev)
 {
 	enum dev_dma_attr attr;
 	int ret = 0;
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 04/11] bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

The devices on platform/amba/fsl-mc/PCI buses could be bound to drivers
with the device DMA managed by kernel drivers or user-space applications.
Unfortunately, multiple devices may be placed in the same IOMMU group
because they cannot be isolated from each other. The DMA on these devices
must either be entirely under kernel control or userspace control, never
a mixture. Otherwise the driver integrity is not guaranteed because they
could access each other through the peer-to-peer accesses which by-pass
the IOMMU protection.

This checks and sets the default DMA mode during driver binding, and
cleanups during driver unbinding. In the default mode, the device DMA is
managed by the device driver which handles DMA operations through the
kernel DMA APIs (see Documentation/core-api/dma-api.rst).

For cases where the devices are assigned for userspace control through the
userspace driver framework(i.e. VFIO), the drivers(for example, vfio_pci/
vfio_platfrom etc.) may set a new flag (driver_managed_dma) to skip this
default setting in the assumption that the drivers know what they are
doing with the device DMA.

Calling iommu_device_use_default_domain() before {of,acpi}_dma_configure
is currently a problem. As things stand, the IOMMU driver ignored the
initial iommu_probe_device() call when the device was added, since at
that point it had no fwspec yet. In this situation,
{of,acpi}_iommu_configure() are retriggering iommu_probe_device() after
the IOMMU driver has seen the firmware data via .of_xlate to learn that
it actually responsible for the given device. As the result, before
that gets fixed, iommu_use_default_domain() goes at the end, and calls
arch_teardown_dma_ops() if it fails.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
 include/linux/amba/bus.h        |  8 ++++++++
 include/linux/fsl/mc.h          |  8 ++++++++
 include/linux/pci.h             |  8 ++++++++
 include/linux/platform_device.h |  8 ++++++++
 drivers/amba/bus.c              | 18 ++++++++++++++++++
 drivers/base/platform.c         | 18 ++++++++++++++++++
 drivers/bus/fsl-mc/fsl-mc-bus.c | 24 ++++++++++++++++++++++--
 drivers/pci/pci-driver.c        | 18 ++++++++++++++++++
 8 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/include/linux/amba/bus.h b/include/linux/amba/bus.h
index 6562f543c3e0..2ddce9bcd00e 100644
--- a/include/linux/amba/bus.h
+++ b/include/linux/amba/bus.h
@@ -79,6 +79,14 @@ struct amba_driver {
 	void			(*remove)(struct amba_device *);
 	void			(*shutdown)(struct amba_device *);
 	const struct amba_id	*id_table;
+	/*
+	 * For most device drivers, no need to care about this flag as long as
+	 * all DMAs are handled through the kernel DMA API. For some special
+	 * ones, for example VFIO drivers, they know how to manage the DMA
+	 * themselves and set this flag so that the IOMMU layer will allow them
+	 * to setup and manage their own I/O address space.
+	 */
+	bool driver_managed_dma;
 };
 
 /*
diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h
index 7b6c42bfb660..27efef8affb1 100644
--- a/include/linux/fsl/mc.h
+++ b/include/linux/fsl/mc.h
@@ -32,6 +32,13 @@ struct fsl_mc_io;
  * @shutdown: Function called at shutdown time to quiesce the device
  * @suspend: Function called when a device is stopped
  * @resume: Function called when a device is resumed
+ * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
+ *		For most device drivers, no need to care about this flag
+ *		as long as all DMAs are handled through the kernel DMA API.
+ *		For some special ones, for example VFIO drivers, they know
+ *		how to manage the DMA themselves and set this flag so that
+ *		the IOMMU layer will allow them to setup and manage their
+ *		own I/O address space.
  *
  * Generic DPAA device driver object for device drivers that are registered
  * with a DPRC bus. This structure is to be embedded in each device-specific
@@ -45,6 +52,7 @@ struct fsl_mc_driver {
 	void (*shutdown)(struct fsl_mc_device *dev);
 	int (*suspend)(struct fsl_mc_device *dev, pm_message_t state);
 	int (*resume)(struct fsl_mc_device *dev);
+	bool driver_managed_dma;
 };
 
 #define to_fsl_mc_driver(_drv) \
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 60adf42460ab..b933d2b08d4d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -895,6 +895,13 @@ struct module;
  *              created once it is bound to the driver.
  * @driver:	Driver model structure.
  * @dynids:	List of dynamically added device IDs.
+ * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
+ *		For most device drivers, no need to care about this flag
+ *		as long as all DMAs are handled through the kernel DMA API.
+ *		For some special ones, for example VFIO drivers, they know
+ *		how to manage the DMA themselves and set this flag so that
+ *		the IOMMU layer will allow them to setup and manage their
+ *		own I/O address space.
  */
 struct pci_driver {
 	struct list_head	node;
@@ -913,6 +920,7 @@ struct pci_driver {
 	const struct attribute_group **dev_groups;
 	struct device_driver	driver;
 	struct pci_dynids	dynids;
+	bool driver_managed_dma;
 };
 
 static inline struct pci_driver *to_pci_driver(struct device_driver *drv)
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 17fde717df68..b3d9c744f1e5 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -210,6 +210,14 @@ struct platform_driver {
 	struct device_driver driver;
 	const struct platform_device_id *id_table;
 	bool prevent_deferred_probe;
+	/*
+	 * For most device drivers, no need to care about this flag as long as
+	 * all DMAs are handled through the kernel DMA API. For some special
+	 * ones, for example VFIO drivers, they know how to manage the DMA
+	 * themselves and set this flag so that the IOMMU layer will allow them
+	 * to setup and manage their own I/O address space.
+	 */
+	bool driver_managed_dma;
 };
 
 #define to_platform_driver(drv)	(container_of((drv), struct platform_driver, \
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index 76b52bd2c2a4..a0ec61232b6c 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -22,6 +22,8 @@
 #include <linux/of_irq.h>
 #include <linux/of_device.h>
 #include <linux/acpi.h>
+#include <linux/iommu.h>
+#include <linux/dma-map-ops.h>
 
 #define to_amba_driver(d)	container_of(d, struct amba_driver, drv)
 
@@ -277,6 +279,7 @@ static void amba_shutdown(struct device *dev)
 
 static int amba_dma_configure(struct device *dev)
 {
+	struct amba_driver *drv = to_amba_driver(dev->driver);
 	enum dev_dma_attr attr;
 	int ret = 0;
 
@@ -287,9 +290,23 @@ static int amba_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, attr);
 	}
 
+	if (!ret && !drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			arch_teardown_dma_ops(dev);
+	}
+
 	return ret;
 }
 
+static void amba_dma_cleanup(struct device *dev)
+{
+	struct amba_driver *drv = to_amba_driver(dev->driver);
+
+	if (!drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 #ifdef CONFIG_PM
 /*
  * Hooks to provide runtime PM of the pclk (bus clock).  It is safe to
@@ -359,6 +376,7 @@ struct bus_type amba_bustype = {
 	.remove		= amba_remove,
 	.shutdown	= amba_shutdown,
 	.dma_configure	= amba_dma_configure,
+	.dma_cleanup	= amba_dma_cleanup,
 	.pm		= &amba_pm,
 };
 EXPORT_SYMBOL_GPL(amba_bustype);
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index d7915734d931..70bc30cf575c 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -30,6 +30,8 @@
 #include <linux/property.h>
 #include <linux/kmemleak.h>
 #include <linux/types.h>
+#include <linux/iommu.h>
+#include <linux/dma-map-ops.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -1456,6 +1458,7 @@ static void platform_shutdown(struct device *_dev)
 
 static int platform_dma_configure(struct device *dev)
 {
+	struct platform_driver *drv = to_platform_driver(dev->driver);
 	enum dev_dma_attr attr;
 	int ret = 0;
 
@@ -1466,9 +1469,23 @@ static int platform_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, attr);
 	}
 
+	if (!ret && !drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			arch_teardown_dma_ops(dev);
+	}
+
 	return ret;
 }
 
+static void platform_dma_cleanup(struct device *dev)
+{
+	struct platform_driver *drv = to_platform_driver(dev->driver);
+
+	if (!drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 static const struct dev_pm_ops platform_dev_pm_ops = {
 	SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend, pm_generic_runtime_resume, NULL)
 	USE_PLATFORM_PM_SLEEP_OPS
@@ -1483,6 +1500,7 @@ struct bus_type platform_bus_type = {
 	.remove		= platform_remove,
 	.shutdown	= platform_shutdown,
 	.dma_configure	= platform_dma_configure,
+	.dma_cleanup	= platform_dma_cleanup,
 	.pm		= &platform_dev_pm_ops,
 };
 EXPORT_SYMBOL_GPL(platform_bus_type);
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8fd4a356a86e..76648c4fdaf4 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -21,6 +21,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/acpi.h>
 #include <linux/iommu.h>
+#include <linux/dma-map-ops.h>
 
 #include "fsl-mc-private.h"
 
@@ -140,15 +141,33 @@ static int fsl_mc_dma_configure(struct device *dev)
 {
 	struct device *dma_dev = dev;
 	struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 	u32 input_id = mc_dev->icid;
+	int ret;
 
 	while (dev_is_fsl_mc(dma_dev))
 		dma_dev = dma_dev->parent;
 
 	if (dev_of_node(dma_dev))
-		return of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+		ret = of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+	else
+		ret = acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+
+	if (!ret && !mc_drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			arch_teardown_dma_ops(dev);
+	}
+
+	return ret;
+}
+
+static void fsl_mc_dma_cleanup(struct device *dev)
+{
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 
-	return acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+	if (!mc_drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
 }
 
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
@@ -312,6 +331,7 @@ struct bus_type fsl_mc_bus_type = {
 	.match = fsl_mc_bus_match,
 	.uevent = fsl_mc_bus_uevent,
 	.dma_configure  = fsl_mc_dma_configure,
+	.dma_cleanup = fsl_mc_dma_cleanup,
 	.dev_groups = fsl_mc_dev_groups,
 	.bus_groups = fsl_mc_bus_groups,
 };
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 4ceeb75fc899..f83f7fbac68f 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -20,6 +20,7 @@
 #include <linux/of_device.h>
 #include <linux/acpi.h>
 #include <linux/dma-map-ops.h>
+#include <linux/iommu.h>
 #include "pci.h"
 #include "pcie/portdrv.h"
 
@@ -1601,6 +1602,7 @@ static int pci_bus_num_vf(struct device *dev)
  */
 static int pci_dma_configure(struct device *dev)
 {
+	struct pci_driver *driver = to_pci_driver(dev->driver);
 	struct device *bridge;
 	int ret = 0;
 
@@ -1616,9 +1618,24 @@ static int pci_dma_configure(struct device *dev)
 	}
 
 	pci_put_host_bridge_device(bridge);
+
+	if (!ret && !driver->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			arch_teardown_dma_ops(dev);
+	}
+
 	return ret;
 }
 
+static void pci_dma_cleanup(struct device *dev)
+{
+	struct pci_driver *driver = to_pci_driver(dev->driver);
+
+	if (!driver->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 struct bus_type pci_bus_type = {
 	.name		= "pci",
 	.match		= pci_bus_match,
@@ -1632,6 +1649,7 @@ struct bus_type pci_bus_type = {
 	.pm		= PCI_PM_OPS_PTR,
 	.num_vf		= pci_bus_num_vf,
 	.dma_configure	= pci_dma_configure,
+	.dma_cleanup	= pci_dma_cleanup,
 };
 EXPORT_SYMBOL(pci_bus_type);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 04/11] bus: platform, amba, fsl-mc, PCI: Add device DMA ownership management
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

The devices on platform/amba/fsl-mc/PCI buses could be bound to drivers
with the device DMA managed by kernel drivers or user-space applications.
Unfortunately, multiple devices may be placed in the same IOMMU group
because they cannot be isolated from each other. The DMA on these devices
must either be entirely under kernel control or userspace control, never
a mixture. Otherwise the driver integrity is not guaranteed because they
could access each other through the peer-to-peer accesses which by-pass
the IOMMU protection.

This checks and sets the default DMA mode during driver binding, and
cleanups during driver unbinding. In the default mode, the device DMA is
managed by the device driver which handles DMA operations through the
kernel DMA APIs (see Documentation/core-api/dma-api.rst).

For cases where the devices are assigned for userspace control through the
userspace driver framework(i.e. VFIO), the drivers(for example, vfio_pci/
vfio_platfrom etc.) may set a new flag (driver_managed_dma) to skip this
default setting in the assumption that the drivers know what they are
doing with the device DMA.

Calling iommu_device_use_default_domain() before {of,acpi}_dma_configure
is currently a problem. As things stand, the IOMMU driver ignored the
initial iommu_probe_device() call when the device was added, since at
that point it had no fwspec yet. In this situation,
{of,acpi}_iommu_configure() are retriggering iommu_probe_device() after
the IOMMU driver has seen the firmware data via .of_xlate to learn that
it actually responsible for the given device. As the result, before
that gets fixed, iommu_use_default_domain() goes at the end, and calls
arch_teardown_dma_ops() if it fails.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
 include/linux/amba/bus.h        |  8 ++++++++
 include/linux/fsl/mc.h          |  8 ++++++++
 include/linux/pci.h             |  8 ++++++++
 include/linux/platform_device.h |  8 ++++++++
 drivers/amba/bus.c              | 18 ++++++++++++++++++
 drivers/base/platform.c         | 18 ++++++++++++++++++
 drivers/bus/fsl-mc/fsl-mc-bus.c | 24 ++++++++++++++++++++++--
 drivers/pci/pci-driver.c        | 18 ++++++++++++++++++
 8 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/include/linux/amba/bus.h b/include/linux/amba/bus.h
index 6562f543c3e0..2ddce9bcd00e 100644
--- a/include/linux/amba/bus.h
+++ b/include/linux/amba/bus.h
@@ -79,6 +79,14 @@ struct amba_driver {
 	void			(*remove)(struct amba_device *);
 	void			(*shutdown)(struct amba_device *);
 	const struct amba_id	*id_table;
+	/*
+	 * For most device drivers, no need to care about this flag as long as
+	 * all DMAs are handled through the kernel DMA API. For some special
+	 * ones, for example VFIO drivers, they know how to manage the DMA
+	 * themselves and set this flag so that the IOMMU layer will allow them
+	 * to setup and manage their own I/O address space.
+	 */
+	bool driver_managed_dma;
 };
 
 /*
diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h
index 7b6c42bfb660..27efef8affb1 100644
--- a/include/linux/fsl/mc.h
+++ b/include/linux/fsl/mc.h
@@ -32,6 +32,13 @@ struct fsl_mc_io;
  * @shutdown: Function called at shutdown time to quiesce the device
  * @suspend: Function called when a device is stopped
  * @resume: Function called when a device is resumed
+ * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
+ *		For most device drivers, no need to care about this flag
+ *		as long as all DMAs are handled through the kernel DMA API.
+ *		For some special ones, for example VFIO drivers, they know
+ *		how to manage the DMA themselves and set this flag so that
+ *		the IOMMU layer will allow them to setup and manage their
+ *		own I/O address space.
  *
  * Generic DPAA device driver object for device drivers that are registered
  * with a DPRC bus. This structure is to be embedded in each device-specific
@@ -45,6 +52,7 @@ struct fsl_mc_driver {
 	void (*shutdown)(struct fsl_mc_device *dev);
 	int (*suspend)(struct fsl_mc_device *dev, pm_message_t state);
 	int (*resume)(struct fsl_mc_device *dev);
+	bool driver_managed_dma;
 };
 
 #define to_fsl_mc_driver(_drv) \
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 60adf42460ab..b933d2b08d4d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -895,6 +895,13 @@ struct module;
  *              created once it is bound to the driver.
  * @driver:	Driver model structure.
  * @dynids:	List of dynamically added device IDs.
+ * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
+ *		For most device drivers, no need to care about this flag
+ *		as long as all DMAs are handled through the kernel DMA API.
+ *		For some special ones, for example VFIO drivers, they know
+ *		how to manage the DMA themselves and set this flag so that
+ *		the IOMMU layer will allow them to setup and manage their
+ *		own I/O address space.
  */
 struct pci_driver {
 	struct list_head	node;
@@ -913,6 +920,7 @@ struct pci_driver {
 	const struct attribute_group **dev_groups;
 	struct device_driver	driver;
 	struct pci_dynids	dynids;
+	bool driver_managed_dma;
 };
 
 static inline struct pci_driver *to_pci_driver(struct device_driver *drv)
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 17fde717df68..b3d9c744f1e5 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -210,6 +210,14 @@ struct platform_driver {
 	struct device_driver driver;
 	const struct platform_device_id *id_table;
 	bool prevent_deferred_probe;
+	/*
+	 * For most device drivers, no need to care about this flag as long as
+	 * all DMAs are handled through the kernel DMA API. For some special
+	 * ones, for example VFIO drivers, they know how to manage the DMA
+	 * themselves and set this flag so that the IOMMU layer will allow them
+	 * to setup and manage their own I/O address space.
+	 */
+	bool driver_managed_dma;
 };
 
 #define to_platform_driver(drv)	(container_of((drv), struct platform_driver, \
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index 76b52bd2c2a4..a0ec61232b6c 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -22,6 +22,8 @@
 #include <linux/of_irq.h>
 #include <linux/of_device.h>
 #include <linux/acpi.h>
+#include <linux/iommu.h>
+#include <linux/dma-map-ops.h>
 
 #define to_amba_driver(d)	container_of(d, struct amba_driver, drv)
 
@@ -277,6 +279,7 @@ static void amba_shutdown(struct device *dev)
 
 static int amba_dma_configure(struct device *dev)
 {
+	struct amba_driver *drv = to_amba_driver(dev->driver);
 	enum dev_dma_attr attr;
 	int ret = 0;
 
@@ -287,9 +290,23 @@ static int amba_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, attr);
 	}
 
+	if (!ret && !drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			arch_teardown_dma_ops(dev);
+	}
+
 	return ret;
 }
 
+static void amba_dma_cleanup(struct device *dev)
+{
+	struct amba_driver *drv = to_amba_driver(dev->driver);
+
+	if (!drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 #ifdef CONFIG_PM
 /*
  * Hooks to provide runtime PM of the pclk (bus clock).  It is safe to
@@ -359,6 +376,7 @@ struct bus_type amba_bustype = {
 	.remove		= amba_remove,
 	.shutdown	= amba_shutdown,
 	.dma_configure	= amba_dma_configure,
+	.dma_cleanup	= amba_dma_cleanup,
 	.pm		= &amba_pm,
 };
 EXPORT_SYMBOL_GPL(amba_bustype);
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index d7915734d931..70bc30cf575c 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -30,6 +30,8 @@
 #include <linux/property.h>
 #include <linux/kmemleak.h>
 #include <linux/types.h>
+#include <linux/iommu.h>
+#include <linux/dma-map-ops.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -1456,6 +1458,7 @@ static void platform_shutdown(struct device *_dev)
 
 static int platform_dma_configure(struct device *dev)
 {
+	struct platform_driver *drv = to_platform_driver(dev->driver);
 	enum dev_dma_attr attr;
 	int ret = 0;
 
@@ -1466,9 +1469,23 @@ static int platform_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, attr);
 	}
 
+	if (!ret && !drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			arch_teardown_dma_ops(dev);
+	}
+
 	return ret;
 }
 
+static void platform_dma_cleanup(struct device *dev)
+{
+	struct platform_driver *drv = to_platform_driver(dev->driver);
+
+	if (!drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 static const struct dev_pm_ops platform_dev_pm_ops = {
 	SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend, pm_generic_runtime_resume, NULL)
 	USE_PLATFORM_PM_SLEEP_OPS
@@ -1483,6 +1500,7 @@ struct bus_type platform_bus_type = {
 	.remove		= platform_remove,
 	.shutdown	= platform_shutdown,
 	.dma_configure	= platform_dma_configure,
+	.dma_cleanup	= platform_dma_cleanup,
 	.pm		= &platform_dev_pm_ops,
 };
 EXPORT_SYMBOL_GPL(platform_bus_type);
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8fd4a356a86e..76648c4fdaf4 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -21,6 +21,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/acpi.h>
 #include <linux/iommu.h>
+#include <linux/dma-map-ops.h>
 
 #include "fsl-mc-private.h"
 
@@ -140,15 +141,33 @@ static int fsl_mc_dma_configure(struct device *dev)
 {
 	struct device *dma_dev = dev;
 	struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 	u32 input_id = mc_dev->icid;
+	int ret;
 
 	while (dev_is_fsl_mc(dma_dev))
 		dma_dev = dma_dev->parent;
 
 	if (dev_of_node(dma_dev))
-		return of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+		ret = of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+	else
+		ret = acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+
+	if (!ret && !mc_drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			arch_teardown_dma_ops(dev);
+	}
+
+	return ret;
+}
+
+static void fsl_mc_dma_cleanup(struct device *dev)
+{
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 
-	return acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+	if (!mc_drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
 }
 
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
@@ -312,6 +331,7 @@ struct bus_type fsl_mc_bus_type = {
 	.match = fsl_mc_bus_match,
 	.uevent = fsl_mc_bus_uevent,
 	.dma_configure  = fsl_mc_dma_configure,
+	.dma_cleanup = fsl_mc_dma_cleanup,
 	.dev_groups = fsl_mc_dev_groups,
 	.bus_groups = fsl_mc_bus_groups,
 };
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 4ceeb75fc899..f83f7fbac68f 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -20,6 +20,7 @@
 #include <linux/of_device.h>
 #include <linux/acpi.h>
 #include <linux/dma-map-ops.h>
+#include <linux/iommu.h>
 #include "pci.h"
 #include "pcie/portdrv.h"
 
@@ -1601,6 +1602,7 @@ static int pci_bus_num_vf(struct device *dev)
  */
 static int pci_dma_configure(struct device *dev)
 {
+	struct pci_driver *driver = to_pci_driver(dev->driver);
 	struct device *bridge;
 	int ret = 0;
 
@@ -1616,9 +1618,24 @@ static int pci_dma_configure(struct device *dev)
 	}
 
 	pci_put_host_bridge_device(bridge);
+
+	if (!ret && !driver->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			arch_teardown_dma_ops(dev);
+	}
+
 	return ret;
 }
 
+static void pci_dma_cleanup(struct device *dev)
+{
+	struct pci_driver *driver = to_pci_driver(dev->driver);
+
+	if (!driver->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 struct bus_type pci_bus_type = {
 	.name		= "pci",
 	.match		= pci_bus_match,
@@ -1632,6 +1649,7 @@ struct bus_type pci_bus_type = {
 	.pm		= PCI_PM_OPS_PTR,
 	.num_vf		= pci_bus_num_vf,
 	.dma_configure	= pci_dma_configure,
+	.dma_cleanup	= pci_dma_cleanup,
 };
 EXPORT_SYMBOL(pci_bus_type);
 
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 05/11] PCI: pci_stub: Set driver_managed_dma
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

The current VFIO implementation allows pci-stub driver to be bound to
a PCI device with other devices in the same IOMMU group being assigned
to userspace. The pci-stub driver has no dependencies on DMA or the
IOVA mapping of the device, but it does prevent the user from having
direct access to the device, which is useful in some circumstances.

The pci_dma_configure() marks the iommu_group as containing only devices
with kernel drivers that manage DMA. For compatibility with the VFIO
usage, avoid this default behavior for the pci_stub. This allows the
pci_stub still able to be used by the admin to block driver binding after
applying the DMA ownership to VFIO.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci-stub.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/pci-stub.c b/drivers/pci/pci-stub.c
index e408099fea52..d1f4c1ce7bd1 100644
--- a/drivers/pci/pci-stub.c
+++ b/drivers/pci/pci-stub.c
@@ -36,6 +36,7 @@ static struct pci_driver stub_driver = {
 	.name		= "pci-stub",
 	.id_table	= NULL,	/* only dynamic id's */
 	.probe		= pci_stub_probe,
+	.driver_managed_dma = true,
 };
 
 static int __init pci_stub_init(void)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 05/11] PCI: pci_stub: Set driver_managed_dma
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

The current VFIO implementation allows pci-stub driver to be bound to
a PCI device with other devices in the same IOMMU group being assigned
to userspace. The pci-stub driver has no dependencies on DMA or the
IOVA mapping of the device, but it does prevent the user from having
direct access to the device, which is useful in some circumstances.

The pci_dma_configure() marks the iommu_group as containing only devices
with kernel drivers that manage DMA. For compatibility with the VFIO
usage, avoid this default behavior for the pci_stub. This allows the
pci_stub still able to be used by the admin to block driver binding after
applying the DMA ownership to VFIO.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci-stub.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/pci-stub.c b/drivers/pci/pci-stub.c
index e408099fea52..d1f4c1ce7bd1 100644
--- a/drivers/pci/pci-stub.c
+++ b/drivers/pci/pci-stub.c
@@ -36,6 +36,7 @@ static struct pci_driver stub_driver = {
 	.name		= "pci-stub",
 	.id_table	= NULL,	/* only dynamic id's */
 	.probe		= pci_stub_probe,
+	.driver_managed_dma = true,
 };
 
 static int __init pci_stub_init(void)
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 06/11] PCI: portdrv: Set driver_managed_dma
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

If a switch lacks ACS P2P Request Redirect, a device below the switch can
bypass the IOMMU and DMA directly to other devices below the switch, so
all the downstream devices must be in the same IOMMU group as the switch
itself.

The existing VFIO framework allows the portdrv driver to be bound to the
bridge while its downstream devices are assigned to user space. The
pci_dma_configure() marks the IOMMU group as containing only devices
with kernel drivers that manage DMA. Avoid this default behavior for the
portdrv driver in order for compatibility with the current VFIO usage.

We achieve this by setting ".driver_managed_dma = true" in pci_driver
structure. It is safe because the portdrv driver meets below criteria:

- This driver doesn't use DMA, as you can't find any related calls like
  pci_set_master() or any kernel DMA API (dma_map_*() and etc.).
- It doesn't use MMIO as you can't find ioremap() or similar calls. It's
  tolerant to userspace possibly also touching the same MMIO registers
  via P2P DMA access.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pcie/portdrv_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 4b8801656ffb..7f8788a970ae 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -202,6 +202,8 @@ static struct pci_driver pcie_portdriver = {
 
 	.err_handler	= &pcie_portdrv_err_handler,
 
+	.driver_managed_dma = true,
+
 	.driver.pm	= PCIE_PORTDRV_PM_OPS,
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 06/11] PCI: portdrv: Set driver_managed_dma
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

If a switch lacks ACS P2P Request Redirect, a device below the switch can
bypass the IOMMU and DMA directly to other devices below the switch, so
all the downstream devices must be in the same IOMMU group as the switch
itself.

The existing VFIO framework allows the portdrv driver to be bound to the
bridge while its downstream devices are assigned to user space. The
pci_dma_configure() marks the IOMMU group as containing only devices
with kernel drivers that manage DMA. Avoid this default behavior for the
portdrv driver in order for compatibility with the current VFIO usage.

We achieve this by setting ".driver_managed_dma = true" in pci_driver
structure. It is safe because the portdrv driver meets below criteria:

- This driver doesn't use DMA, as you can't find any related calls like
  pci_set_master() or any kernel DMA API (dma_map_*() and etc.).
- It doesn't use MMIO as you can't find ioremap() or similar calls. It's
  tolerant to userspace possibly also touching the same MMIO registers
  via P2P DMA access.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pcie/portdrv_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 4b8801656ffb..7f8788a970ae 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -202,6 +202,8 @@ static struct pci_driver pcie_portdriver = {
 
 	.err_handler	= &pcie_portdrv_err_handler,
 
+	.driver_managed_dma = true,
+
 	.driver.pm	= PCIE_PORTDRV_PM_OPS,
 };
 
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 07/11] vfio: Set DMA ownership for VFIO devices
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

Claim group dma ownership when an IOMMU group is set to a container,
and release the dma ownership once the iommu group is unset from the
container.

This change disallows some unsafe bridge drivers to bind to non-ACS
bridges while devices under them are assigned to user space. This is an
intentional enhancement and possibly breaks some existing
configurations. The recommendation to such an affected user would be
that the previously allowed host bridge driver was unsafe for this use
case and to continue to enable assignment of devices within that group,
the driver should be unbound from the bridge device or replaced with the
pci-stub driver.

For any bridge driver, we consider it unsafe if it satisfies any of the
following conditions:

  1) The bridge driver uses DMA. Calling pci_set_master() or calling any
     kernel DMA API (dma_map_*() and etc.) is an indicate that the
     driver is doing DMA.

  2) If the bridge driver uses MMIO, it should be tolerant to hostile
     userspace also touching the same MMIO registers via P2P DMA
     attacks.

If the bridge driver turns out to be a safe one, it could be used as
before by setting the driver's .driver_managed_dma field, just like what
we have done in the pcieport driver.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/fsl-mc/vfio_fsl_mc.c     |  1 +
 drivers/vfio/pci/vfio_pci.c           |  1 +
 drivers/vfio/platform/vfio_amba.c     |  1 +
 drivers/vfio/platform/vfio_platform.c |  1 +
 drivers/vfio/vfio.c                   | 10 +++++++++-
 5 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 6e2e62c6f47a..3feff729f3ce 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -588,6 +588,7 @@ static struct fsl_mc_driver vfio_fsl_mc_driver = {
 		.name	= "vfio-fsl-mc",
 		.owner	= THIS_MODULE,
 	},
+	.driver_managed_dma = true,
 };
 
 static int __init vfio_fsl_mc_driver_init(void)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 2b047469e02f..58839206d1ca 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -194,6 +194,7 @@ static struct pci_driver vfio_pci_driver = {
 	.remove			= vfio_pci_remove,
 	.sriov_configure	= vfio_pci_sriov_configure,
 	.err_handler		= &vfio_pci_core_err_handlers,
+	.driver_managed_dma	= true,
 };
 
 static void __init vfio_pci_fill_ids(void)
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index badfffea14fb..1aaa4f721bd2 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -95,6 +95,7 @@ static struct amba_driver vfio_amba_driver = {
 		.name = "vfio-amba",
 		.owner = THIS_MODULE,
 	},
+	.driver_managed_dma = true,
 };
 
 module_amba_driver(vfio_amba_driver);
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 68a1c87066d7..04f40c5acfd6 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = {
 	.driver	= {
 		.name	= "vfio-platform",
 	},
+	.driver_managed_dma = true,
 };
 
 module_platform_driver(vfio_platform_driver);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index a4555014bd1e..56e741cbccce 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1198,6 +1198,8 @@ static void __vfio_group_unset_container(struct vfio_group *group)
 		driver->ops->detach_group(container->iommu_data,
 					  group->iommu_group);
 
+	iommu_group_release_dma_owner(group->iommu_group);
+
 	group->container = NULL;
 	wake_up(&group->container_q);
 	list_del(&group->container_next);
@@ -1282,13 +1284,19 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 		goto unlock_out;
 	}
 
+	ret = iommu_group_claim_dma_owner(group->iommu_group, f.file);
+	if (ret)
+		goto unlock_out;
+
 	driver = container->iommu_driver;
 	if (driver) {
 		ret = driver->ops->attach_group(container->iommu_data,
 						group->iommu_group,
 						group->type);
-		if (ret)
+		if (ret) {
+			iommu_group_release_dma_owner(group->iommu_group);
 			goto unlock_out;
+		}
 	}
 
 	group->container = container;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 07/11] vfio: Set DMA ownership for VFIO devices
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

Claim group dma ownership when an IOMMU group is set to a container,
and release the dma ownership once the iommu group is unset from the
container.

This change disallows some unsafe bridge drivers to bind to non-ACS
bridges while devices under them are assigned to user space. This is an
intentional enhancement and possibly breaks some existing
configurations. The recommendation to such an affected user would be
that the previously allowed host bridge driver was unsafe for this use
case and to continue to enable assignment of devices within that group,
the driver should be unbound from the bridge device or replaced with the
pci-stub driver.

For any bridge driver, we consider it unsafe if it satisfies any of the
following conditions:

  1) The bridge driver uses DMA. Calling pci_set_master() or calling any
     kernel DMA API (dma_map_*() and etc.) is an indicate that the
     driver is doing DMA.

  2) If the bridge driver uses MMIO, it should be tolerant to hostile
     userspace also touching the same MMIO registers via P2P DMA
     attacks.

If the bridge driver turns out to be a safe one, it could be used as
before by setting the driver's .driver_managed_dma field, just like what
we have done in the pcieport driver.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/fsl-mc/vfio_fsl_mc.c     |  1 +
 drivers/vfio/pci/vfio_pci.c           |  1 +
 drivers/vfio/platform/vfio_amba.c     |  1 +
 drivers/vfio/platform/vfio_platform.c |  1 +
 drivers/vfio/vfio.c                   | 10 +++++++++-
 5 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 6e2e62c6f47a..3feff729f3ce 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -588,6 +588,7 @@ static struct fsl_mc_driver vfio_fsl_mc_driver = {
 		.name	= "vfio-fsl-mc",
 		.owner	= THIS_MODULE,
 	},
+	.driver_managed_dma = true,
 };
 
 static int __init vfio_fsl_mc_driver_init(void)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 2b047469e02f..58839206d1ca 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -194,6 +194,7 @@ static struct pci_driver vfio_pci_driver = {
 	.remove			= vfio_pci_remove,
 	.sriov_configure	= vfio_pci_sriov_configure,
 	.err_handler		= &vfio_pci_core_err_handlers,
+	.driver_managed_dma	= true,
 };
 
 static void __init vfio_pci_fill_ids(void)
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index badfffea14fb..1aaa4f721bd2 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -95,6 +95,7 @@ static struct amba_driver vfio_amba_driver = {
 		.name = "vfio-amba",
 		.owner = THIS_MODULE,
 	},
+	.driver_managed_dma = true,
 };
 
 module_amba_driver(vfio_amba_driver);
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 68a1c87066d7..04f40c5acfd6 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = {
 	.driver	= {
 		.name	= "vfio-platform",
 	},
+	.driver_managed_dma = true,
 };
 
 module_platform_driver(vfio_platform_driver);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index a4555014bd1e..56e741cbccce 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1198,6 +1198,8 @@ static void __vfio_group_unset_container(struct vfio_group *group)
 		driver->ops->detach_group(container->iommu_data,
 					  group->iommu_group);
 
+	iommu_group_release_dma_owner(group->iommu_group);
+
 	group->container = NULL;
 	wake_up(&group->container_q);
 	list_del(&group->container_next);
@@ -1282,13 +1284,19 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 		goto unlock_out;
 	}
 
+	ret = iommu_group_claim_dma_owner(group->iommu_group, f.file);
+	if (ret)
+		goto unlock_out;
+
 	driver = container->iommu_driver;
 	if (driver) {
 		ret = driver->ops->attach_group(container->iommu_data,
 						group->iommu_group,
 						group->type);
-		if (ret)
+		if (ret) {
+			iommu_group_release_dma_owner(group->iommu_group);
 			goto unlock_out;
+		}
 	}
 
 	group->container = container;
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 08/11] vfio: Remove use of vfio_group_viable()
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

As DMA ownership is claimed for the iommu group when a VFIO group is
added to a VFIO container, the VFIO group viability is guaranteed as long
as group->container_users > 0. Remove those unnecessary group viability
checks which are only hit when group->container_users is not zero.

The only remaining reference is in GROUP_GET_STATUS, which could be called
at any time when group fd is valid. Here we just replace the
vfio_group_viable() by directly calling IOMMU core to get viability status.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/vfio.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 56e741cbccce..8a9347f732a5 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1313,12 +1313,6 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	return ret;
 }
 
-static bool vfio_group_viable(struct vfio_group *group)
-{
-	return (iommu_group_for_each_dev(group->iommu_group,
-					 group, vfio_dev_viable) == 0);
-}
-
 static int vfio_group_add_container_user(struct vfio_group *group)
 {
 	if (!atomic_inc_not_zero(&group->container_users))
@@ -1328,7 +1322,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
-	if (!group->container->iommu_driver || !vfio_group_viable(group)) {
+	if (!group->container->iommu_driver) {
 		atomic_dec(&group->container_users);
 		return -EINVAL;
 	}
@@ -1346,7 +1340,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	int ret = 0;
 
 	if (0 == atomic_read(&group->container_users) ||
-	    !group->container->iommu_driver || !vfio_group_viable(group))
+	    !group->container->iommu_driver)
 		return -EINVAL;
 
 	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
@@ -1438,11 +1432,11 @@ static long vfio_group_fops_unl_ioctl(struct file *filep,
 
 		status.flags = 0;
 
-		if (vfio_group_viable(group))
-			status.flags |= VFIO_GROUP_FLAGS_VIABLE;
-
 		if (group->container)
-			status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET;
+			status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
+					VFIO_GROUP_FLAGS_VIABLE;
+		else if (!iommu_group_dma_owner_claimed(group->iommu_group))
+			status.flags |= VFIO_GROUP_FLAGS_VIABLE;
 
 		if (copy_to_user((void __user *)arg, &status, minsz))
 			return -EFAULT;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 08/11] vfio: Remove use of vfio_group_viable()
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

As DMA ownership is claimed for the iommu group when a VFIO group is
added to a VFIO container, the VFIO group viability is guaranteed as long
as group->container_users > 0. Remove those unnecessary group viability
checks which are only hit when group->container_users is not zero.

The only remaining reference is in GROUP_GET_STATUS, which could be called
at any time when group fd is valid. Here we just replace the
vfio_group_viable() by directly calling IOMMU core to get viability status.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/vfio.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 56e741cbccce..8a9347f732a5 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1313,12 +1313,6 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	return ret;
 }
 
-static bool vfio_group_viable(struct vfio_group *group)
-{
-	return (iommu_group_for_each_dev(group->iommu_group,
-					 group, vfio_dev_viable) == 0);
-}
-
 static int vfio_group_add_container_user(struct vfio_group *group)
 {
 	if (!atomic_inc_not_zero(&group->container_users))
@@ -1328,7 +1322,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
-	if (!group->container->iommu_driver || !vfio_group_viable(group)) {
+	if (!group->container->iommu_driver) {
 		atomic_dec(&group->container_users);
 		return -EINVAL;
 	}
@@ -1346,7 +1340,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	int ret = 0;
 
 	if (0 == atomic_read(&group->container_users) ||
-	    !group->container->iommu_driver || !vfio_group_viable(group))
+	    !group->container->iommu_driver)
 		return -EINVAL;
 
 	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
@@ -1438,11 +1432,11 @@ static long vfio_group_fops_unl_ioctl(struct file *filep,
 
 		status.flags = 0;
 
-		if (vfio_group_viable(group))
-			status.flags |= VFIO_GROUP_FLAGS_VIABLE;
-
 		if (group->container)
-			status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET;
+			status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
+					VFIO_GROUP_FLAGS_VIABLE;
+		else if (!iommu_group_dma_owner_claimed(group->iommu_group))
+			status.flags |= VFIO_GROUP_FLAGS_VIABLE;
 
 		if (copy_to_user((void __user *)arg, &status, minsz))
 			return -EFAULT;
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 09/11] vfio: Delete the unbound_list
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

From: Jason Gunthorpe <jgg@nvidia.com>

commit 60720a0fc646 ("vfio: Add device tracking during unbind") added the
unbound list to plug a problem with KVM where KVM_DEV_VFIO_GROUP_DEL
relied on vfio_group_get_external_user() succeeding to return the
vfio_group from a group file descriptor. The unbound list allowed
vfio_group_get_external_user() to continue to succeed in edge cases.

However commit 5d6dee80a1e9 ("vfio: New external user group/file match")
deleted the call to vfio_group_get_external_user() during
KVM_DEV_VFIO_GROUP_DEL. Instead vfio_external_group_match_file() is used
to directly match the file descriptor to the group pointer.

This in turn avoids the call down to vfio_dev_viable() during
KVM_DEV_VFIO_GROUP_DEL and also avoids the trouble the first commit was
trying to fix.

There are no other users of vfio_dev_viable() that care about the time
after vfio_unregister_group_dev() returns, so simply delete the
unbound_list entirely.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/vfio.c | 74 ++-------------------------------------------
 1 file changed, 2 insertions(+), 72 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 8a9347f732a5..b2f19d17d0c3 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -62,11 +62,6 @@ struct vfio_container {
 	bool				noiommu;
 };
 
-struct vfio_unbound_dev {
-	struct device			*dev;
-	struct list_head		unbound_next;
-};
-
 struct vfio_group {
 	struct device 			dev;
 	struct cdev			cdev;
@@ -79,8 +74,6 @@ struct vfio_group {
 	struct notifier_block		nb;
 	struct list_head		vfio_next;
 	struct list_head		container_next;
-	struct list_head		unbound_list;
-	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
 	enum vfio_group_type		type;
@@ -340,16 +333,8 @@ vfio_group_get_from_iommu(struct iommu_group *iommu_group)
 static void vfio_group_release(struct device *dev)
 {
 	struct vfio_group *group = container_of(dev, struct vfio_group, dev);
-	struct vfio_unbound_dev *unbound, *tmp;
-
-	list_for_each_entry_safe(unbound, tmp,
-				 &group->unbound_list, unbound_next) {
-		list_del(&unbound->unbound_next);
-		kfree(unbound);
-	}
 
 	mutex_destroy(&group->device_lock);
-	mutex_destroy(&group->unbound_lock);
 	iommu_group_put(group->iommu_group);
 	ida_free(&vfio.group_ida, MINOR(group->dev.devt));
 	kfree(group);
@@ -381,8 +366,6 @@ static struct vfio_group *vfio_group_alloc(struct iommu_group *iommu_group,
 	refcount_set(&group->users, 1);
 	INIT_LIST_HEAD(&group->device_list);
 	mutex_init(&group->device_lock);
-	INIT_LIST_HEAD(&group->unbound_list);
-	mutex_init(&group->unbound_lock);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
 	/* put in vfio_group_release() */
@@ -571,19 +554,8 @@ static int vfio_dev_viable(struct device *dev, void *data)
 	struct vfio_group *group = data;
 	struct vfio_device *device;
 	struct device_driver *drv = READ_ONCE(dev->driver);
-	struct vfio_unbound_dev *unbound;
-	int ret = -EINVAL;
 
-	mutex_lock(&group->unbound_lock);
-	list_for_each_entry(unbound, &group->unbound_list, unbound_next) {
-		if (dev == unbound->dev) {
-			ret = 0;
-			break;
-		}
-	}
-	mutex_unlock(&group->unbound_lock);
-
-	if (!ret || !drv || vfio_dev_driver_allowed(dev, drv))
+	if (!drv || vfio_dev_driver_allowed(dev, drv))
 		return 0;
 
 	device = vfio_group_get_device(group, dev);
@@ -592,7 +564,7 @@ static int vfio_dev_viable(struct device *dev, void *data)
 		return 0;
 	}
 
-	return ret;
+	return -EINVAL;
 }
 
 /*
@@ -634,7 +606,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 {
 	struct vfio_group *group = container_of(nb, struct vfio_group, nb);
 	struct device *dev = data;
-	struct vfio_unbound_dev *unbound;
 
 	switch (action) {
 	case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
@@ -663,28 +634,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 			__func__, iommu_group_id(group->iommu_group),
 			dev->driver->name);
 		break;
-	case IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER:
-		dev_dbg(dev, "%s: group %d unbound from driver\n", __func__,
-			iommu_group_id(group->iommu_group));
-		/*
-		 * XXX An unbound device in a live group is ok, but we'd
-		 * really like to avoid the above BUG_ON by preventing other
-		 * drivers from binding to it.  Once that occurs, we have to
-		 * stop the system to maintain isolation.  At a minimum, we'd
-		 * want a toggle to disable driver auto probe for this device.
-		 */
-
-		mutex_lock(&group->unbound_lock);
-		list_for_each_entry(unbound,
-				    &group->unbound_list, unbound_next) {
-			if (dev == unbound->dev) {
-				list_del(&unbound->unbound_next);
-				kfree(unbound);
-				break;
-			}
-		}
-		mutex_unlock(&group->unbound_lock);
-		break;
 	}
 	return NOTIFY_OK;
 }
@@ -889,29 +838,10 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 void vfio_unregister_group_dev(struct vfio_device *device)
 {
 	struct vfio_group *group = device->group;
-	struct vfio_unbound_dev *unbound;
 	unsigned int i = 0;
 	bool interrupted = false;
 	long rc;
 
-	/*
-	 * When the device is removed from the group, the group suddenly
-	 * becomes non-viable; the device has a driver (until the unbind
-	 * completes), but it's not present in the group.  This is bad news
-	 * for any external users that need to re-acquire a group reference
-	 * in order to match and release their existing reference.  To
-	 * solve this, we track such devices on the unbound_list to bridge
-	 * the gap until they're fully unbound.
-	 */
-	unbound = kzalloc(sizeof(*unbound), GFP_KERNEL);
-	if (unbound) {
-		unbound->dev = device->dev;
-		mutex_lock(&group->unbound_lock);
-		list_add(&unbound->unbound_next, &group->unbound_list);
-		mutex_unlock(&group->unbound_lock);
-	}
-	WARN_ON(!unbound);
-
 	vfio_device_put(device);
 	rc = try_wait_for_completion(&device->comp);
 	while (rc <= 0) {
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 09/11] vfio: Delete the unbound_list
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

From: Jason Gunthorpe <jgg@nvidia.com>

commit 60720a0fc646 ("vfio: Add device tracking during unbind") added the
unbound list to plug a problem with KVM where KVM_DEV_VFIO_GROUP_DEL
relied on vfio_group_get_external_user() succeeding to return the
vfio_group from a group file descriptor. The unbound list allowed
vfio_group_get_external_user() to continue to succeed in edge cases.

However commit 5d6dee80a1e9 ("vfio: New external user group/file match")
deleted the call to vfio_group_get_external_user() during
KVM_DEV_VFIO_GROUP_DEL. Instead vfio_external_group_match_file() is used
to directly match the file descriptor to the group pointer.

This in turn avoids the call down to vfio_dev_viable() during
KVM_DEV_VFIO_GROUP_DEL and also avoids the trouble the first commit was
trying to fix.

There are no other users of vfio_dev_viable() that care about the time
after vfio_unregister_group_dev() returns, so simply delete the
unbound_list entirely.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/vfio.c | 74 ++-------------------------------------------
 1 file changed, 2 insertions(+), 72 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 8a9347f732a5..b2f19d17d0c3 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -62,11 +62,6 @@ struct vfio_container {
 	bool				noiommu;
 };
 
-struct vfio_unbound_dev {
-	struct device			*dev;
-	struct list_head		unbound_next;
-};
-
 struct vfio_group {
 	struct device 			dev;
 	struct cdev			cdev;
@@ -79,8 +74,6 @@ struct vfio_group {
 	struct notifier_block		nb;
 	struct list_head		vfio_next;
 	struct list_head		container_next;
-	struct list_head		unbound_list;
-	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
 	enum vfio_group_type		type;
@@ -340,16 +333,8 @@ vfio_group_get_from_iommu(struct iommu_group *iommu_group)
 static void vfio_group_release(struct device *dev)
 {
 	struct vfio_group *group = container_of(dev, struct vfio_group, dev);
-	struct vfio_unbound_dev *unbound, *tmp;
-
-	list_for_each_entry_safe(unbound, tmp,
-				 &group->unbound_list, unbound_next) {
-		list_del(&unbound->unbound_next);
-		kfree(unbound);
-	}
 
 	mutex_destroy(&group->device_lock);
-	mutex_destroy(&group->unbound_lock);
 	iommu_group_put(group->iommu_group);
 	ida_free(&vfio.group_ida, MINOR(group->dev.devt));
 	kfree(group);
@@ -381,8 +366,6 @@ static struct vfio_group *vfio_group_alloc(struct iommu_group *iommu_group,
 	refcount_set(&group->users, 1);
 	INIT_LIST_HEAD(&group->device_list);
 	mutex_init(&group->device_lock);
-	INIT_LIST_HEAD(&group->unbound_list);
-	mutex_init(&group->unbound_lock);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
 	/* put in vfio_group_release() */
@@ -571,19 +554,8 @@ static int vfio_dev_viable(struct device *dev, void *data)
 	struct vfio_group *group = data;
 	struct vfio_device *device;
 	struct device_driver *drv = READ_ONCE(dev->driver);
-	struct vfio_unbound_dev *unbound;
-	int ret = -EINVAL;
 
-	mutex_lock(&group->unbound_lock);
-	list_for_each_entry(unbound, &group->unbound_list, unbound_next) {
-		if (dev == unbound->dev) {
-			ret = 0;
-			break;
-		}
-	}
-	mutex_unlock(&group->unbound_lock);
-
-	if (!ret || !drv || vfio_dev_driver_allowed(dev, drv))
+	if (!drv || vfio_dev_driver_allowed(dev, drv))
 		return 0;
 
 	device = vfio_group_get_device(group, dev);
@@ -592,7 +564,7 @@ static int vfio_dev_viable(struct device *dev, void *data)
 		return 0;
 	}
 
-	return ret;
+	return -EINVAL;
 }
 
 /*
@@ -634,7 +606,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 {
 	struct vfio_group *group = container_of(nb, struct vfio_group, nb);
 	struct device *dev = data;
-	struct vfio_unbound_dev *unbound;
 
 	switch (action) {
 	case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
@@ -663,28 +634,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 			__func__, iommu_group_id(group->iommu_group),
 			dev->driver->name);
 		break;
-	case IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER:
-		dev_dbg(dev, "%s: group %d unbound from driver\n", __func__,
-			iommu_group_id(group->iommu_group));
-		/*
-		 * XXX An unbound device in a live group is ok, but we'd
-		 * really like to avoid the above BUG_ON by preventing other
-		 * drivers from binding to it.  Once that occurs, we have to
-		 * stop the system to maintain isolation.  At a minimum, we'd
-		 * want a toggle to disable driver auto probe for this device.
-		 */
-
-		mutex_lock(&group->unbound_lock);
-		list_for_each_entry(unbound,
-				    &group->unbound_list, unbound_next) {
-			if (dev == unbound->dev) {
-				list_del(&unbound->unbound_next);
-				kfree(unbound);
-				break;
-			}
-		}
-		mutex_unlock(&group->unbound_lock);
-		break;
 	}
 	return NOTIFY_OK;
 }
@@ -889,29 +838,10 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 void vfio_unregister_group_dev(struct vfio_device *device)
 {
 	struct vfio_group *group = device->group;
-	struct vfio_unbound_dev *unbound;
 	unsigned int i = 0;
 	bool interrupted = false;
 	long rc;
 
-	/*
-	 * When the device is removed from the group, the group suddenly
-	 * becomes non-viable; the device has a driver (until the unbind
-	 * completes), but it's not present in the group.  This is bad news
-	 * for any external users that need to re-acquire a group reference
-	 * in order to match and release their existing reference.  To
-	 * solve this, we track such devices on the unbound_list to bridge
-	 * the gap until they're fully unbound.
-	 */
-	unbound = kzalloc(sizeof(*unbound), GFP_KERNEL);
-	if (unbound) {
-		unbound->dev = device->dev;
-		mutex_lock(&group->unbound_lock);
-		list_add(&unbound->unbound_next, &group->unbound_list);
-		mutex_unlock(&group->unbound_lock);
-	}
-	WARN_ON(!unbound);
-
 	vfio_device_put(device);
 	rc = try_wait_for_completion(&device->comp);
 	while (rc <= 0) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 10/11] vfio: Remove iommu group notifier
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:49   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

The iommu core and driver core have been enhanced to avoid unsafe driver
binding to a live group after iommu_group_set_dma_owner(PRIVATE_USER)
has been called. There's no need to register iommu group notifier. This
removes the iommu group notifer which contains BUG_ON() and WARN().

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/vfio.c | 147 --------------------------------------------
 1 file changed, 147 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index b2f19d17d0c3..0c766384cee0 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -71,7 +71,6 @@ struct vfio_group {
 	struct vfio_container		*container;
 	struct list_head		device_list;
 	struct mutex			device_lock;
-	struct notifier_block		nb;
 	struct list_head		vfio_next;
 	struct list_head		container_next;
 	atomic_t			opened;
@@ -274,8 +273,6 @@ void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops)
 }
 EXPORT_SYMBOL_GPL(vfio_unregister_iommu_driver);
 
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
-				     unsigned long action, void *data);
 static void vfio_group_get(struct vfio_group *group);
 
 /*
@@ -395,13 +392,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 		goto err_put;
 	}
 
-	group->nb.notifier_call = vfio_iommu_group_notifier;
-	err = iommu_group_register_notifier(iommu_group, &group->nb);
-	if (err) {
-		ret = ERR_PTR(err);
-		goto err_put;
-	}
-
 	mutex_lock(&vfio.group_lock);
 
 	/* Did we race creating this group? */
@@ -422,7 +412,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 
 err_unlock:
 	mutex_unlock(&vfio.group_lock);
-	iommu_group_unregister_notifier(group->iommu_group, &group->nb);
 err_put:
 	put_device(&group->dev);
 	return ret;
@@ -447,7 +436,6 @@ static void vfio_group_put(struct vfio_group *group)
 	cdev_device_del(&group->cdev, &group->dev);
 	mutex_unlock(&vfio.group_lock);
 
-	iommu_group_unregister_notifier(group->iommu_group, &group->nb);
 	put_device(&group->dev);
 }
 
@@ -503,141 +491,6 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
 	return NULL;
 }
 
-/*
- * Some drivers, like pci-stub, are only used to prevent other drivers from
- * claiming a device and are therefore perfectly legitimate for a user owned
- * group.  The pci-stub driver has no dependencies on DMA or the IOVA mapping
- * of the device, but it does prevent the user from having direct access to
- * the device, which is useful in some circumstances.
- *
- * We also assume that we can include PCI interconnect devices, ie. bridges.
- * IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
- * then all of the downstream devices will be part of the same IOMMU group as
- * the bridge.  Thus, if placing the bridge into the user owned IOVA space
- * breaks anything, it only does so for user owned devices downstream.  Note
- * that error notification via MSI can be affected for platforms that handle
- * MSI within the same IOVA space as DMA.
- */
-static const char * const vfio_driver_allowed[] = { "pci-stub" };
-
-static bool vfio_dev_driver_allowed(struct device *dev,
-				    struct device_driver *drv)
-{
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
-			return true;
-	}
-
-	return match_string(vfio_driver_allowed,
-			    ARRAY_SIZE(vfio_driver_allowed),
-			    drv->name) >= 0;
-}
-
-/*
- * A vfio group is viable for use by userspace if all devices are in
- * one of the following states:
- *  - driver-less
- *  - bound to a vfio driver
- *  - bound to an otherwise allowed driver
- *  - a PCI interconnect device
- *
- * We use two methods to determine whether a device is bound to a vfio
- * driver.  The first is to test whether the device exists in the vfio
- * group.  The second is to test if the device exists on the group
- * unbound_list, indicating it's in the middle of transitioning from
- * a vfio driver to driver-less.
- */
-static int vfio_dev_viable(struct device *dev, void *data)
-{
-	struct vfio_group *group = data;
-	struct vfio_device *device;
-	struct device_driver *drv = READ_ONCE(dev->driver);
-
-	if (!drv || vfio_dev_driver_allowed(dev, drv))
-		return 0;
-
-	device = vfio_group_get_device(group, dev);
-	if (device) {
-		vfio_device_put(device);
-		return 0;
-	}
-
-	return -EINVAL;
-}
-
-/*
- * Async device support
- */
-static int vfio_group_nb_add_dev(struct vfio_group *group, struct device *dev)
-{
-	struct vfio_device *device;
-
-	/* Do we already know about it?  We shouldn't */
-	device = vfio_group_get_device(group, dev);
-	if (WARN_ON_ONCE(device)) {
-		vfio_device_put(device);
-		return 0;
-	}
-
-	/* Nothing to do for idle groups */
-	if (!atomic_read(&group->container_users))
-		return 0;
-
-	/* TODO Prevent device auto probing */
-	dev_WARN(dev, "Device added to live group %d!\n",
-		 iommu_group_id(group->iommu_group));
-
-	return 0;
-}
-
-static int vfio_group_nb_verify(struct vfio_group *group, struct device *dev)
-{
-	/* We don't care what happens when the group isn't in use */
-	if (!atomic_read(&group->container_users))
-		return 0;
-
-	return vfio_dev_viable(dev, group);
-}
-
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
-				     unsigned long action, void *data)
-{
-	struct vfio_group *group = container_of(nb, struct vfio_group, nb);
-	struct device *dev = data;
-
-	switch (action) {
-	case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
-		vfio_group_nb_add_dev(group, dev);
-		break;
-	case IOMMU_GROUP_NOTIFY_DEL_DEVICE:
-		/*
-		 * Nothing to do here.  If the device is in use, then the
-		 * vfio sub-driver should block the remove callback until
-		 * it is unused.  If the device is unused or attached to a
-		 * stub driver, then it should be released and we don't
-		 * care that it will be going away.
-		 */
-		break;
-	case IOMMU_GROUP_NOTIFY_BIND_DRIVER:
-		dev_dbg(dev, "%s: group %d binding to driver\n", __func__,
-			iommu_group_id(group->iommu_group));
-		break;
-	case IOMMU_GROUP_NOTIFY_BOUND_DRIVER:
-		dev_dbg(dev, "%s: group %d bound to driver %s\n", __func__,
-			iommu_group_id(group->iommu_group), dev->driver->name);
-		BUG_ON(vfio_group_nb_verify(group, dev));
-		break;
-	case IOMMU_GROUP_NOTIFY_UNBIND_DRIVER:
-		dev_dbg(dev, "%s: group %d unbinding from driver %s\n",
-			__func__, iommu_group_id(group->iommu_group),
-			dev->driver->name);
-		break;
-	}
-	return NOTIFY_OK;
-}
-
 /*
  * VFIO driver API
  */
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 10/11] vfio: Remove iommu group notifier
@ 2022-04-18  0:49   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

The iommu core and driver core have been enhanced to avoid unsafe driver
binding to a live group after iommu_group_set_dma_owner(PRIVATE_USER)
has been called. There's no need to register iommu group notifier. This
removes the iommu group notifer which contains BUG_ON() and WARN().

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/vfio.c | 147 --------------------------------------------
 1 file changed, 147 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index b2f19d17d0c3..0c766384cee0 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -71,7 +71,6 @@ struct vfio_group {
 	struct vfio_container		*container;
 	struct list_head		device_list;
 	struct mutex			device_lock;
-	struct notifier_block		nb;
 	struct list_head		vfio_next;
 	struct list_head		container_next;
 	atomic_t			opened;
@@ -274,8 +273,6 @@ void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops)
 }
 EXPORT_SYMBOL_GPL(vfio_unregister_iommu_driver);
 
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
-				     unsigned long action, void *data);
 static void vfio_group_get(struct vfio_group *group);
 
 /*
@@ -395,13 +392,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 		goto err_put;
 	}
 
-	group->nb.notifier_call = vfio_iommu_group_notifier;
-	err = iommu_group_register_notifier(iommu_group, &group->nb);
-	if (err) {
-		ret = ERR_PTR(err);
-		goto err_put;
-	}
-
 	mutex_lock(&vfio.group_lock);
 
 	/* Did we race creating this group? */
@@ -422,7 +412,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 
 err_unlock:
 	mutex_unlock(&vfio.group_lock);
-	iommu_group_unregister_notifier(group->iommu_group, &group->nb);
 err_put:
 	put_device(&group->dev);
 	return ret;
@@ -447,7 +436,6 @@ static void vfio_group_put(struct vfio_group *group)
 	cdev_device_del(&group->cdev, &group->dev);
 	mutex_unlock(&vfio.group_lock);
 
-	iommu_group_unregister_notifier(group->iommu_group, &group->nb);
 	put_device(&group->dev);
 }
 
@@ -503,141 +491,6 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
 	return NULL;
 }
 
-/*
- * Some drivers, like pci-stub, are only used to prevent other drivers from
- * claiming a device and are therefore perfectly legitimate for a user owned
- * group.  The pci-stub driver has no dependencies on DMA or the IOVA mapping
- * of the device, but it does prevent the user from having direct access to
- * the device, which is useful in some circumstances.
- *
- * We also assume that we can include PCI interconnect devices, ie. bridges.
- * IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
- * then all of the downstream devices will be part of the same IOMMU group as
- * the bridge.  Thus, if placing the bridge into the user owned IOVA space
- * breaks anything, it only does so for user owned devices downstream.  Note
- * that error notification via MSI can be affected for platforms that handle
- * MSI within the same IOVA space as DMA.
- */
-static const char * const vfio_driver_allowed[] = { "pci-stub" };
-
-static bool vfio_dev_driver_allowed(struct device *dev,
-				    struct device_driver *drv)
-{
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
-			return true;
-	}
-
-	return match_string(vfio_driver_allowed,
-			    ARRAY_SIZE(vfio_driver_allowed),
-			    drv->name) >= 0;
-}
-
-/*
- * A vfio group is viable for use by userspace if all devices are in
- * one of the following states:
- *  - driver-less
- *  - bound to a vfio driver
- *  - bound to an otherwise allowed driver
- *  - a PCI interconnect device
- *
- * We use two methods to determine whether a device is bound to a vfio
- * driver.  The first is to test whether the device exists in the vfio
- * group.  The second is to test if the device exists on the group
- * unbound_list, indicating it's in the middle of transitioning from
- * a vfio driver to driver-less.
- */
-static int vfio_dev_viable(struct device *dev, void *data)
-{
-	struct vfio_group *group = data;
-	struct vfio_device *device;
-	struct device_driver *drv = READ_ONCE(dev->driver);
-
-	if (!drv || vfio_dev_driver_allowed(dev, drv))
-		return 0;
-
-	device = vfio_group_get_device(group, dev);
-	if (device) {
-		vfio_device_put(device);
-		return 0;
-	}
-
-	return -EINVAL;
-}
-
-/*
- * Async device support
- */
-static int vfio_group_nb_add_dev(struct vfio_group *group, struct device *dev)
-{
-	struct vfio_device *device;
-
-	/* Do we already know about it?  We shouldn't */
-	device = vfio_group_get_device(group, dev);
-	if (WARN_ON_ONCE(device)) {
-		vfio_device_put(device);
-		return 0;
-	}
-
-	/* Nothing to do for idle groups */
-	if (!atomic_read(&group->container_users))
-		return 0;
-
-	/* TODO Prevent device auto probing */
-	dev_WARN(dev, "Device added to live group %d!\n",
-		 iommu_group_id(group->iommu_group));
-
-	return 0;
-}
-
-static int vfio_group_nb_verify(struct vfio_group *group, struct device *dev)
-{
-	/* We don't care what happens when the group isn't in use */
-	if (!atomic_read(&group->container_users))
-		return 0;
-
-	return vfio_dev_viable(dev, group);
-}
-
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
-				     unsigned long action, void *data)
-{
-	struct vfio_group *group = container_of(nb, struct vfio_group, nb);
-	struct device *dev = data;
-
-	switch (action) {
-	case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
-		vfio_group_nb_add_dev(group, dev);
-		break;
-	case IOMMU_GROUP_NOTIFY_DEL_DEVICE:
-		/*
-		 * Nothing to do here.  If the device is in use, then the
-		 * vfio sub-driver should block the remove callback until
-		 * it is unused.  If the device is unused or attached to a
-		 * stub driver, then it should be released and we don't
-		 * care that it will be going away.
-		 */
-		break;
-	case IOMMU_GROUP_NOTIFY_BIND_DRIVER:
-		dev_dbg(dev, "%s: group %d binding to driver\n", __func__,
-			iommu_group_id(group->iommu_group));
-		break;
-	case IOMMU_GROUP_NOTIFY_BOUND_DRIVER:
-		dev_dbg(dev, "%s: group %d bound to driver %s\n", __func__,
-			iommu_group_id(group->iommu_group), dev->driver->name);
-		BUG_ON(vfio_group_nb_verify(group, dev));
-		break;
-	case IOMMU_GROUP_NOTIFY_UNBIND_DRIVER:
-		dev_dbg(dev, "%s: group %d unbinding from driver %s\n",
-			__func__, iommu_group_id(group->iommu_group),
-			dev->driver->name);
-		break;
-	}
-	return NOTIFY_OK;
-}
-
 /*
  * VFIO driver API
  */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 11/11] iommu: Remove iommu group changes notifier
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-18  0:50   ` Lu Baolu
  -1 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:50 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

The iommu group changes notifer is not referenced in the tree. Remove it
to avoid dead code.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/iommu.h | 23 -------------
 drivers/iommu/iommu.c | 75 -------------------------------------------
 2 files changed, 98 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 77972ef978b5..6ef2df258673 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -407,13 +407,6 @@ static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
 	return dev->iommu->iommu_dev->ops;
 }
 
-#define IOMMU_GROUP_NOTIFY_ADD_DEVICE		1 /* Device added */
-#define IOMMU_GROUP_NOTIFY_DEL_DEVICE		2 /* Pre Device removed */
-#define IOMMU_GROUP_NOTIFY_BIND_DRIVER		3 /* Pre Driver bind */
-#define IOMMU_GROUP_NOTIFY_BOUND_DRIVER		4 /* Post Driver bind */
-#define IOMMU_GROUP_NOTIFY_UNBIND_DRIVER	5 /* Pre Driver unbind */
-#define IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER	6 /* Post Driver unbind */
-
 extern int bus_set_iommu(struct bus_type *bus, const struct iommu_ops *ops);
 extern int bus_iommu_probe(struct bus_type *bus);
 extern bool iommu_present(struct bus_type *bus);
@@ -478,10 +471,6 @@ extern int iommu_group_for_each_dev(struct iommu_group *group, void *data,
 extern struct iommu_group *iommu_group_get(struct device *dev);
 extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group);
 extern void iommu_group_put(struct iommu_group *group);
-extern int iommu_group_register_notifier(struct iommu_group *group,
-					 struct notifier_block *nb);
-extern int iommu_group_unregister_notifier(struct iommu_group *group,
-					   struct notifier_block *nb);
 extern int iommu_register_device_fault_handler(struct device *dev,
 					iommu_dev_fault_handler_t handler,
 					void *data);
@@ -878,18 +867,6 @@ static inline void iommu_group_put(struct iommu_group *group)
 {
 }
 
-static inline int iommu_group_register_notifier(struct iommu_group *group,
-						struct notifier_block *nb)
-{
-	return -ENODEV;
-}
-
-static inline int iommu_group_unregister_notifier(struct iommu_group *group,
-						  struct notifier_block *nb)
-{
-	return 0;
-}
-
 static inline
 int iommu_register_device_fault_handler(struct device *dev,
 					iommu_dev_fault_handler_t handler,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index eba8e8ccf19d..0c42ece25854 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -18,7 +18,6 @@
 #include <linux/errno.h>
 #include <linux/iommu.h>
 #include <linux/idr.h>
-#include <linux/notifier.h>
 #include <linux/err.h>
 #include <linux/pci.h>
 #include <linux/bitops.h>
@@ -40,7 +39,6 @@ struct iommu_group {
 	struct kobject *devices_kobj;
 	struct list_head devices;
 	struct mutex mutex;
-	struct blocking_notifier_head notifier;
 	void *iommu_data;
 	void (*iommu_data_release)(void *iommu_data);
 	char *name;
@@ -632,7 +630,6 @@ struct iommu_group *iommu_group_alloc(void)
 	mutex_init(&group->mutex);
 	INIT_LIST_HEAD(&group->devices);
 	INIT_LIST_HEAD(&group->entry);
-	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
 	if (ret < 0) {
@@ -905,10 +902,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	if (ret)
 		goto err_put_group;
 
-	/* Notify any listeners about change to group. */
-	blocking_notifier_call_chain(&group->notifier,
-				     IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-
 	trace_add_device_to_group(group->id, dev);
 
 	dev_info(dev, "Adding to iommu group %d\n", group->id);
@@ -950,10 +943,6 @@ void iommu_group_remove_device(struct device *dev)
 
 	dev_info(dev, "Removing from iommu group %d\n", group->id);
 
-	/* Pre-notify listeners that a device is being removed. */
-	blocking_notifier_call_chain(&group->notifier,
-				     IOMMU_GROUP_NOTIFY_DEL_DEVICE, dev);
-
 	mutex_lock(&group->mutex);
 	list_for_each_entry(tmp_device, &group->devices, list) {
 		if (tmp_device->dev == dev) {
@@ -1075,36 +1064,6 @@ void iommu_group_put(struct iommu_group *group)
 }
 EXPORT_SYMBOL_GPL(iommu_group_put);
 
-/**
- * iommu_group_register_notifier - Register a notifier for group changes
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * This function allows iommu group users to track changes in a group.
- * See include/linux/iommu.h for actions sent via this notifier.  Caller
- * should hold a reference to the group throughout notifier registration.
- */
-int iommu_group_register_notifier(struct iommu_group *group,
-				  struct notifier_block *nb)
-{
-	return blocking_notifier_chain_register(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_register_notifier);
-
-/**
- * iommu_group_unregister_notifier - Unregister a notifier
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * Unregister a previously registered group notifier block.
- */
-int iommu_group_unregister_notifier(struct iommu_group *group,
-				    struct notifier_block *nb)
-{
-	return blocking_notifier_chain_unregister(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier);
-
 /**
  * iommu_register_device_fault_handler() - Register a device fault handler
  * @dev: the device
@@ -1650,14 +1609,8 @@ static int remove_iommu_group(struct device *dev, void *data)
 static int iommu_bus_notifier(struct notifier_block *nb,
 			      unsigned long action, void *data)
 {
-	unsigned long group_action = 0;
 	struct device *dev = data;
-	struct iommu_group *group;
 
-	/*
-	 * ADD/DEL call into iommu driver ops if provided, which may
-	 * result in ADD/DEL notifiers to group->notifier
-	 */
 	if (action == BUS_NOTIFY_ADD_DEVICE) {
 		int ret;
 
@@ -1668,34 +1621,6 @@ static int iommu_bus_notifier(struct notifier_block *nb,
 		return NOTIFY_OK;
 	}
 
-	/*
-	 * Remaining BUS_NOTIFYs get filtered and republished to the
-	 * group, if anyone is listening
-	 */
-	group = iommu_group_get(dev);
-	if (!group)
-		return 0;
-
-	switch (action) {
-	case BUS_NOTIFY_BIND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_BIND_DRIVER;
-		break;
-	case BUS_NOTIFY_BOUND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_BOUND_DRIVER;
-		break;
-	case BUS_NOTIFY_UNBIND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_UNBIND_DRIVER;
-		break;
-	case BUS_NOTIFY_UNBOUND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER;
-		break;
-	}
-
-	if (group_action)
-		blocking_notifier_call_chain(&group->notifier,
-					     group_action, dev);
-
-	iommu_group_put(group);
 	return 0;
 }
 
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [RESEND PATCH v8 11/11] iommu: Remove iommu group changes notifier
@ 2022-04-18  0:50   ` Lu Baolu
  0 siblings, 0 replies; 67+ messages in thread
From: Lu Baolu @ 2022-04-18  0:50 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

The iommu group changes notifer is not referenced in the tree. Remove it
to avoid dead code.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/iommu.h | 23 -------------
 drivers/iommu/iommu.c | 75 -------------------------------------------
 2 files changed, 98 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 77972ef978b5..6ef2df258673 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -407,13 +407,6 @@ static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
 	return dev->iommu->iommu_dev->ops;
 }
 
-#define IOMMU_GROUP_NOTIFY_ADD_DEVICE		1 /* Device added */
-#define IOMMU_GROUP_NOTIFY_DEL_DEVICE		2 /* Pre Device removed */
-#define IOMMU_GROUP_NOTIFY_BIND_DRIVER		3 /* Pre Driver bind */
-#define IOMMU_GROUP_NOTIFY_BOUND_DRIVER		4 /* Post Driver bind */
-#define IOMMU_GROUP_NOTIFY_UNBIND_DRIVER	5 /* Pre Driver unbind */
-#define IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER	6 /* Post Driver unbind */
-
 extern int bus_set_iommu(struct bus_type *bus, const struct iommu_ops *ops);
 extern int bus_iommu_probe(struct bus_type *bus);
 extern bool iommu_present(struct bus_type *bus);
@@ -478,10 +471,6 @@ extern int iommu_group_for_each_dev(struct iommu_group *group, void *data,
 extern struct iommu_group *iommu_group_get(struct device *dev);
 extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group);
 extern void iommu_group_put(struct iommu_group *group);
-extern int iommu_group_register_notifier(struct iommu_group *group,
-					 struct notifier_block *nb);
-extern int iommu_group_unregister_notifier(struct iommu_group *group,
-					   struct notifier_block *nb);
 extern int iommu_register_device_fault_handler(struct device *dev,
 					iommu_dev_fault_handler_t handler,
 					void *data);
@@ -878,18 +867,6 @@ static inline void iommu_group_put(struct iommu_group *group)
 {
 }
 
-static inline int iommu_group_register_notifier(struct iommu_group *group,
-						struct notifier_block *nb)
-{
-	return -ENODEV;
-}
-
-static inline int iommu_group_unregister_notifier(struct iommu_group *group,
-						  struct notifier_block *nb)
-{
-	return 0;
-}
-
 static inline
 int iommu_register_device_fault_handler(struct device *dev,
 					iommu_dev_fault_handler_t handler,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index eba8e8ccf19d..0c42ece25854 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -18,7 +18,6 @@
 #include <linux/errno.h>
 #include <linux/iommu.h>
 #include <linux/idr.h>
-#include <linux/notifier.h>
 #include <linux/err.h>
 #include <linux/pci.h>
 #include <linux/bitops.h>
@@ -40,7 +39,6 @@ struct iommu_group {
 	struct kobject *devices_kobj;
 	struct list_head devices;
 	struct mutex mutex;
-	struct blocking_notifier_head notifier;
 	void *iommu_data;
 	void (*iommu_data_release)(void *iommu_data);
 	char *name;
@@ -632,7 +630,6 @@ struct iommu_group *iommu_group_alloc(void)
 	mutex_init(&group->mutex);
 	INIT_LIST_HEAD(&group->devices);
 	INIT_LIST_HEAD(&group->entry);
-	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
 	if (ret < 0) {
@@ -905,10 +902,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	if (ret)
 		goto err_put_group;
 
-	/* Notify any listeners about change to group. */
-	blocking_notifier_call_chain(&group->notifier,
-				     IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-
 	trace_add_device_to_group(group->id, dev);
 
 	dev_info(dev, "Adding to iommu group %d\n", group->id);
@@ -950,10 +943,6 @@ void iommu_group_remove_device(struct device *dev)
 
 	dev_info(dev, "Removing from iommu group %d\n", group->id);
 
-	/* Pre-notify listeners that a device is being removed. */
-	blocking_notifier_call_chain(&group->notifier,
-				     IOMMU_GROUP_NOTIFY_DEL_DEVICE, dev);
-
 	mutex_lock(&group->mutex);
 	list_for_each_entry(tmp_device, &group->devices, list) {
 		if (tmp_device->dev == dev) {
@@ -1075,36 +1064,6 @@ void iommu_group_put(struct iommu_group *group)
 }
 EXPORT_SYMBOL_GPL(iommu_group_put);
 
-/**
- * iommu_group_register_notifier - Register a notifier for group changes
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * This function allows iommu group users to track changes in a group.
- * See include/linux/iommu.h for actions sent via this notifier.  Caller
- * should hold a reference to the group throughout notifier registration.
- */
-int iommu_group_register_notifier(struct iommu_group *group,
-				  struct notifier_block *nb)
-{
-	return blocking_notifier_chain_register(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_register_notifier);
-
-/**
- * iommu_group_unregister_notifier - Unregister a notifier
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * Unregister a previously registered group notifier block.
- */
-int iommu_group_unregister_notifier(struct iommu_group *group,
-				    struct notifier_block *nb)
-{
-	return blocking_notifier_chain_unregister(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier);
-
 /**
  * iommu_register_device_fault_handler() - Register a device fault handler
  * @dev: the device
@@ -1650,14 +1609,8 @@ static int remove_iommu_group(struct device *dev, void *data)
 static int iommu_bus_notifier(struct notifier_block *nb,
 			      unsigned long action, void *data)
 {
-	unsigned long group_action = 0;
 	struct device *dev = data;
-	struct iommu_group *group;
 
-	/*
-	 * ADD/DEL call into iommu driver ops if provided, which may
-	 * result in ADD/DEL notifiers to group->notifier
-	 */
 	if (action == BUS_NOTIFY_ADD_DEVICE) {
 		int ret;
 
@@ -1668,34 +1621,6 @@ static int iommu_bus_notifier(struct notifier_block *nb,
 		return NOTIFY_OK;
 	}
 
-	/*
-	 * Remaining BUS_NOTIFYs get filtered and republished to the
-	 * group, if anyone is listening
-	 */
-	group = iommu_group_get(dev);
-	if (!group)
-		return 0;
-
-	switch (action) {
-	case BUS_NOTIFY_BIND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_BIND_DRIVER;
-		break;
-	case BUS_NOTIFY_BOUND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_BOUND_DRIVER;
-		break;
-	case BUS_NOTIFY_UNBIND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_UNBIND_DRIVER;
-		break;
-	case BUS_NOTIFY_UNBOUND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER;
-		break;
-	}
-
-	if (group_action)
-		blocking_notifier_call_chain(&group->notifier,
-					     group_action, dev);
-
-	iommu_group_put(group);
 	return 0;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-04-18  0:49 ` Lu Baolu
@ 2022-04-28  9:32   ` Joerg Roedel
  -1 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-04-28  9:32 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> Lu Baolu (10):
>   iommu: Add DMA ownership management interfaces
>   driver core: Add dma_cleanup callback in bus_type
>   amba: Stop sharing platform_dma_configure()
>   bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
>   PCI: pci_stub: Set driver_managed_dma
>   PCI: portdrv: Set driver_managed_dma
>   vfio: Set DMA ownership for VFIO devices
>   vfio: Remove use of vfio_group_viable()
>   vfio: Remove iommu group notifier
>   iommu: Remove iommu group changes notifier

Applied to core branch, thanks Baolu.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-04-28  9:32   ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-04-28  9:32 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> Lu Baolu (10):
>   iommu: Add DMA ownership management interfaces
>   driver core: Add dma_cleanup callback in bus_type
>   amba: Stop sharing platform_dma_configure()
>   bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
>   PCI: pci_stub: Set driver_managed_dma
>   PCI: portdrv: Set driver_managed_dma
>   vfio: Set DMA ownership for VFIO devices
>   vfio: Remove use of vfio_group_viable()
>   vfio: Remove iommu group notifier
>   iommu: Remove iommu group changes notifier

Applied to core branch, thanks Baolu.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-04-28  9:32   ` Joerg Roedel
@ 2022-04-28 11:54     ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-04-28 11:54 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Lu Baolu, Kevin Tian, Liu Yi L, iommu, linux-kernel

On Thu, Apr 28, 2022 at 11:32:04AM +0200, Joerg Roedel wrote:
> On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> > Lu Baolu (10):
> >   iommu: Add DMA ownership management interfaces
> >   driver core: Add dma_cleanup callback in bus_type
> >   amba: Stop sharing platform_dma_configure()
> >   bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
> >   PCI: pci_stub: Set driver_managed_dma
> >   PCI: portdrv: Set driver_managed_dma
> >   vfio: Set DMA ownership for VFIO devices
> >   vfio: Remove use of vfio_group_viable()
> >   vfio: Remove iommu group notifier
> >   iommu: Remove iommu group changes notifier
> 
> Applied to core branch, thanks Baolu.

Can we get this on a topic branch so Alex can pull it? There are
conflicts with other VFIO patches

Thanks!
Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-04-28 11:54     ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-28 11:54 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: linux-kernel, Kevin Tian, iommu

On Thu, Apr 28, 2022 at 11:32:04AM +0200, Joerg Roedel wrote:
> On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> > Lu Baolu (10):
> >   iommu: Add DMA ownership management interfaces
> >   driver core: Add dma_cleanup callback in bus_type
> >   amba: Stop sharing platform_dma_configure()
> >   bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
> >   PCI: pci_stub: Set driver_managed_dma
> >   PCI: portdrv: Set driver_managed_dma
> >   vfio: Set DMA ownership for VFIO devices
> >   vfio: Remove use of vfio_group_viable()
> >   vfio: Remove iommu group notifier
> >   iommu: Remove iommu group changes notifier
> 
> Applied to core branch, thanks Baolu.

Can we get this on a topic branch so Alex can pull it? There are
conflicts with other VFIO patches

Thanks!
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-04-28 11:54     ` Jason Gunthorpe via iommu
@ 2022-04-28 13:34       ` Joerg Roedel
  -1 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-04-28 13:34 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Lu Baolu, Kevin Tian, Liu Yi L, iommu, linux-kernel

On Thu, Apr 28, 2022 at 08:54:11AM -0300, Jason Gunthorpe wrote:
> Can we get this on a topic branch so Alex can pull it? There are
> conflicts with other VFIO patches

Right, we already discussed this. Moved the patches to a separate topic
branch. It will appear as 'vfio-notifier-fix' once I pushed the changes.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-04-28 13:34       ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-04-28 13:34 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-kernel, Kevin Tian, iommu

On Thu, Apr 28, 2022 at 08:54:11AM -0300, Jason Gunthorpe wrote:
> Can we get this on a topic branch so Alex can pull it? There are
> conflicts with other VFIO patches

Right, we already discussed this. Moved the patches to a separate topic
branch. It will appear as 'vfio-notifier-fix' once I pushed the changes.

Regards,

	Joerg
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-04-18  0:49 ` Lu Baolu
@ 2022-05-02 16:12   ` Qian Cai
  -1 siblings, 0 replies; 67+ messages in thread
From: Qian Cai @ 2022-05-02 16:12 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> Hi Joerg,
> 
> This is a resend version of v8 posted here:
> https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu.lu@linux.intel.com/
> as we discussed in this thread:
> https://lore.kernel.org/linux-iommu/Yk%2Fq1BGN8pC5HVZp@8bytes.org/
> 
> All patches can be applied perfectly except this one:
>  - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
> It conflicts with below refactoring commit:
>  - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
> The conflict has been fixed in this post.
> 
> No functional changes in this series. I suppress cc-ing this series to
> all v8 reviewers in order to avoid spam.
> 
> Please consider it for your iommu tree.

Reverting this series fixed an user-after-free while doing SR-IOV.

 BUG: KASAN: use-after-free in __lock_acquire
 Read of size 8 at addr ffff080279825d78 by task qemu-system-aar/22429
 CPU: 24 PID: 22429 Comm: qemu-system-aar Not tainted 5.18.0-rc5-next-20220502 #69
 Call trace:
  dump_backtrace
  show_stack
  dump_stack_lvl
  print_address_description.constprop.0
  print_report
  kasan_report
  __asan_report_load8_noabort
  __lock_acquire
  lock_acquire.part.0
  lock_acquire
  _raw_spin_lock_irqsave
  arm_smmu_detach_dev
  arm_smmu_detach_dev at drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2377
  arm_smmu_attach_dev
  __iommu_attach_group
  __iommu_attach_device at drivers/iommu/iommu.c:1942
  (inlined by) iommu_group_do_attach_device at drivers/iommu/iommu.c:2058
  (inlined by) __iommu_group_for_each_dev at drivers/iommu/iommu.c:989
  (inlined by) __iommu_attach_group at drivers/iommu/iommu.c:2069
  iommu_group_release_dma_owner
  __vfio_group_unset_container
  vfio_group_try_dissolve_container
  vfio_group_put_external_user
  kvm_vfio_destroy
  kvm_destroy_vm
  kvm_vm_release
  __fput
  ____fput
  task_work_run
  do_exit
  do_group_exit
  get_signal
  do_signal
  do_notify_resume
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 Allocated by task 22427:
  kasan_save_stack
  __kasan_kmalloc
  kmem_cache_alloc_trace
  arm_smmu_domain_alloc
  iommu_domain_alloc
  vfio_iommu_type1_attach_group
  vfio_ioctl_set_iommu
  vfio_fops_unl_ioctl
  __arm64_sys_ioctl
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 Freed by task 22429:
  kasan_save_stack
  kasan_set_track
  kasan_set_free_info
  ____kasan_slab_free
  __kasan_slab_free
  slab_free_freelist_hook
  kfree
  arm_smmu_domain_free
  arm_smmu_domain_free at iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2067
  iommu_domain_free
  vfio_iommu_type1_detach_group
  __vfio_group_unset_container
  vfio_group_try_dissolve_container
  vfio_group_put_external_user
  kvm_vfio_destroy
  kvm_destroy_vm
  kvm_vm_release
  __fput
  ____fput
  task_work_run
  do_exit
  do_group_exit
  get_signal
  do_signal
  do_notify_resume
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-02 16:12   ` Qian Cai
  0 siblings, 0 replies; 67+ messages in thread
From: Qian Cai @ 2022-05-02 16:12 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Kevin Tian, linux-kernel, iommu, Jason Gunthorpe

On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> Hi Joerg,
> 
> This is a resend version of v8 posted here:
> https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu.lu@linux.intel.com/
> as we discussed in this thread:
> https://lore.kernel.org/linux-iommu/Yk%2Fq1BGN8pC5HVZp@8bytes.org/
> 
> All patches can be applied perfectly except this one:
>  - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
> It conflicts with below refactoring commit:
>  - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
> The conflict has been fixed in this post.
> 
> No functional changes in this series. I suppress cc-ing this series to
> all v8 reviewers in order to avoid spam.
> 
> Please consider it for your iommu tree.

Reverting this series fixed an user-after-free while doing SR-IOV.

 BUG: KASAN: use-after-free in __lock_acquire
 Read of size 8 at addr ffff080279825d78 by task qemu-system-aar/22429
 CPU: 24 PID: 22429 Comm: qemu-system-aar Not tainted 5.18.0-rc5-next-20220502 #69
 Call trace:
  dump_backtrace
  show_stack
  dump_stack_lvl
  print_address_description.constprop.0
  print_report
  kasan_report
  __asan_report_load8_noabort
  __lock_acquire
  lock_acquire.part.0
  lock_acquire
  _raw_spin_lock_irqsave
  arm_smmu_detach_dev
  arm_smmu_detach_dev at drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2377
  arm_smmu_attach_dev
  __iommu_attach_group
  __iommu_attach_device at drivers/iommu/iommu.c:1942
  (inlined by) iommu_group_do_attach_device at drivers/iommu/iommu.c:2058
  (inlined by) __iommu_group_for_each_dev at drivers/iommu/iommu.c:989
  (inlined by) __iommu_attach_group at drivers/iommu/iommu.c:2069
  iommu_group_release_dma_owner
  __vfio_group_unset_container
  vfio_group_try_dissolve_container
  vfio_group_put_external_user
  kvm_vfio_destroy
  kvm_destroy_vm
  kvm_vm_release
  __fput
  ____fput
  task_work_run
  do_exit
  do_group_exit
  get_signal
  do_signal
  do_notify_resume
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 Allocated by task 22427:
  kasan_save_stack
  __kasan_kmalloc
  kmem_cache_alloc_trace
  arm_smmu_domain_alloc
  iommu_domain_alloc
  vfio_iommu_type1_attach_group
  vfio_ioctl_set_iommu
  vfio_fops_unl_ioctl
  __arm64_sys_ioctl
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

 Freed by task 22429:
  kasan_save_stack
  kasan_set_track
  kasan_set_free_info
  ____kasan_slab_free
  __kasan_slab_free
  slab_free_freelist_hook
  kfree
  arm_smmu_domain_free
  arm_smmu_domain_free at iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2067
  iommu_domain_free
  vfio_iommu_type1_detach_group
  __vfio_group_unset_container
  vfio_group_try_dissolve_container
  vfio_group_put_external_user
  kvm_vfio_destroy
  kvm_destroy_vm
  kvm_vm_release
  __fput
  ____fput
  task_work_run
  do_exit
  do_group_exit
  get_signal
  do_signal
  do_notify_resume
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-02 16:12   ` Qian Cai
@ 2022-05-02 16:42     ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-05-02 16:42 UTC (permalink / raw)
  To: Qian Cai, Robin Murphy
  Cc: Lu Baolu, Joerg Roedel, Kevin Tian, Liu Yi L, iommu, linux-kernel

On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
> On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> > Hi Joerg,
> > 
> > This is a resend version of v8 posted here:
> > https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu.lu@linux.intel.com/
> > as we discussed in this thread:
> > https://lore.kernel.org/linux-iommu/Yk%2Fq1BGN8pC5HVZp@8bytes.org/
> > 
> > All patches can be applied perfectly except this one:
> >  - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
> > It conflicts with below refactoring commit:
> >  - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
> > The conflict has been fixed in this post.
> > 
> > No functional changes in this series. I suppress cc-ing this series to
> > all v8 reviewers in order to avoid spam.
> > 
> > Please consider it for your iommu tree.
> 
> Reverting this series fixed an user-after-free while doing SR-IOV.
> 
>  BUG: KASAN: use-after-free in __lock_acquire
>  Read of size 8 at addr ffff080279825d78 by task qemu-system-aar/22429
>  CPU: 24 PID: 22429 Comm: qemu-system-aar Not tainted 5.18.0-rc5-next-20220502 #69
>  Call trace:
>   dump_backtrace
>   show_stack
>   dump_stack_lvl
>   print_address_description.constprop.0
>   print_report
>   kasan_report
>   __asan_report_load8_noabort
>   __lock_acquire
>   lock_acquire.part.0
>   lock_acquire
>   _raw_spin_lock_irqsave
>   arm_smmu_detach_dev
>   arm_smmu_detach_dev at drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2377
>   arm_smmu_attach_dev

Hum.

So what has happened is that VFIO does this sequence:

 iommu_detach_group()
 iommu_domain_free()
 iommu_group_release_dma_owner()

Which, I think should be valid, API wise.

From what I can see reading the code SMMUv3 blows up above because it
doesn't have a detach_dev op:

	.default_domain_ops = &(const struct iommu_domain_ops) {
		.attach_dev		= arm_smmu_attach_dev,
		.map_pages		= arm_smmu_map_pages,
		.unmap_pages		= arm_smmu_unmap_pages,
		.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
		.iotlb_sync		= arm_smmu_iotlb_sync,
		.iova_to_phys		= arm_smmu_iova_to_phys,
		.enable_nesting		= arm_smmu_enable_nesting,
		.free			= arm_smmu_domain_free,
	}

But it is internally tracking the domain inside the master - so when
the next domain is attached it does this:

static void arm_smmu_detach_dev(struct arm_smmu_master *master)
{
	struct arm_smmu_domain *smmu_domain = master->domain;

	spin_lock_irqsave(&smmu_domain->devices_lock, flags);

And explodes as the domain has been freed but master->domain was not
NULL'd.

It worked before because iommu_detach_group() used to attach the
default group and that was before the domain was freed in the above
sequence.

I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
the detach_dev op and null it's cached copy of the domain, but I don't
know this driver.. Robin?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-02 16:42     ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-05-02 16:42 UTC (permalink / raw)
  To: Qian Cai, Robin Murphy; +Cc: Kevin Tian, linux-kernel, iommu

On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
> On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> > Hi Joerg,
> > 
> > This is a resend version of v8 posted here:
> > https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu.lu@linux.intel.com/
> > as we discussed in this thread:
> > https://lore.kernel.org/linux-iommu/Yk%2Fq1BGN8pC5HVZp@8bytes.org/
> > 
> > All patches can be applied perfectly except this one:
> >  - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
> > It conflicts with below refactoring commit:
> >  - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
> > The conflict has been fixed in this post.
> > 
> > No functional changes in this series. I suppress cc-ing this series to
> > all v8 reviewers in order to avoid spam.
> > 
> > Please consider it for your iommu tree.
> 
> Reverting this series fixed an user-after-free while doing SR-IOV.
> 
>  BUG: KASAN: use-after-free in __lock_acquire
>  Read of size 8 at addr ffff080279825d78 by task qemu-system-aar/22429
>  CPU: 24 PID: 22429 Comm: qemu-system-aar Not tainted 5.18.0-rc5-next-20220502 #69
>  Call trace:
>   dump_backtrace
>   show_stack
>   dump_stack_lvl
>   print_address_description.constprop.0
>   print_report
>   kasan_report
>   __asan_report_load8_noabort
>   __lock_acquire
>   lock_acquire.part.0
>   lock_acquire
>   _raw_spin_lock_irqsave
>   arm_smmu_detach_dev
>   arm_smmu_detach_dev at drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2377
>   arm_smmu_attach_dev

Hum.

So what has happened is that VFIO does this sequence:

 iommu_detach_group()
 iommu_domain_free()
 iommu_group_release_dma_owner()

Which, I think should be valid, API wise.

From what I can see reading the code SMMUv3 blows up above because it
doesn't have a detach_dev op:

	.default_domain_ops = &(const struct iommu_domain_ops) {
		.attach_dev		= arm_smmu_attach_dev,
		.map_pages		= arm_smmu_map_pages,
		.unmap_pages		= arm_smmu_unmap_pages,
		.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
		.iotlb_sync		= arm_smmu_iotlb_sync,
		.iova_to_phys		= arm_smmu_iova_to_phys,
		.enable_nesting		= arm_smmu_enable_nesting,
		.free			= arm_smmu_domain_free,
	}

But it is internally tracking the domain inside the master - so when
the next domain is attached it does this:

static void arm_smmu_detach_dev(struct arm_smmu_master *master)
{
	struct arm_smmu_domain *smmu_domain = master->domain;

	spin_lock_irqsave(&smmu_domain->devices_lock, flags);

And explodes as the domain has been freed but master->domain was not
NULL'd.

It worked before because iommu_detach_group() used to attach the
default group and that was before the domain was freed in the above
sequence.

I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
the detach_dev op and null it's cached copy of the domain, but I don't
know this driver.. Robin?

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-02 16:42     ` Jason Gunthorpe via iommu
@ 2022-05-03 13:04       ` Robin Murphy
  -1 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-05-03 13:04 UTC (permalink / raw)
  To: Jason Gunthorpe, Qian Cai
  Cc: Lu Baolu, Joerg Roedel, Kevin Tian, Liu Yi L, iommu,
	linux-kernel, Jean-Philippe Brucker

On 2022-05-02 17:42, Jason Gunthorpe wrote:
> On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
>> On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
>>> Hi Joerg,
>>>
>>> This is a resend version of v8 posted here:
>>> https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu.lu@linux.intel.com/
>>> as we discussed in this thread:
>>> https://lore.kernel.org/linux-iommu/Yk%2Fq1BGN8pC5HVZp@8bytes.org/
>>>
>>> All patches can be applied perfectly except this one:
>>>   - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
>>> It conflicts with below refactoring commit:
>>>   - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
>>> The conflict has been fixed in this post.
>>>
>>> No functional changes in this series. I suppress cc-ing this series to
>>> all v8 reviewers in order to avoid spam.
>>>
>>> Please consider it for your iommu tree.
>>
>> Reverting this series fixed an user-after-free while doing SR-IOV.
>>
>>   BUG: KASAN: use-after-free in __lock_acquire
>>   Read of size 8 at addr ffff080279825d78 by task qemu-system-aar/22429
>>   CPU: 24 PID: 22429 Comm: qemu-system-aar Not tainted 5.18.0-rc5-next-20220502 #69
>>   Call trace:
>>    dump_backtrace
>>    show_stack
>>    dump_stack_lvl
>>    print_address_description.constprop.0
>>    print_report
>>    kasan_report
>>    __asan_report_load8_noabort
>>    __lock_acquire
>>    lock_acquire.part.0
>>    lock_acquire
>>    _raw_spin_lock_irqsave
>>    arm_smmu_detach_dev
>>    arm_smmu_detach_dev at drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2377
>>    arm_smmu_attach_dev
> 
> Hum.
> 
> So what has happened is that VFIO does this sequence:
> 
>   iommu_detach_group()
>   iommu_domain_free()
>   iommu_group_release_dma_owner()
> 
> Which, I think should be valid, API wise.
> 
>  From what I can see reading the code SMMUv3 blows up above because it
> doesn't have a detach_dev op:
> 
> 	.default_domain_ops = &(const struct iommu_domain_ops) {
> 		.attach_dev		= arm_smmu_attach_dev,
> 		.map_pages		= arm_smmu_map_pages,
> 		.unmap_pages		= arm_smmu_unmap_pages,
> 		.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
> 		.iotlb_sync		= arm_smmu_iotlb_sync,
> 		.iova_to_phys		= arm_smmu_iova_to_phys,
> 		.enable_nesting		= arm_smmu_enable_nesting,
> 		.free			= arm_smmu_domain_free,
> 	}
> 
> But it is internally tracking the domain inside the master - so when
> the next domain is attached it does this:
> 
> static void arm_smmu_detach_dev(struct arm_smmu_master *master)
> {
> 	struct arm_smmu_domain *smmu_domain = master->domain;
> 
> 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> 
> And explodes as the domain has been freed but master->domain was not
> NULL'd.
> 
> It worked before because iommu_detach_group() used to attach the
> default group and that was before the domain was freed in the above
> sequence.

Oof, I totally overlooked the significance of that little subtlety in 
review :(

> I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
> the detach_dev op and null it's cached copy of the domain, but I don't
> know this driver.. Robin?

The original intent was that .detach_dev is deprecated in favour of 
default domains, and when the latter are in use, a device is always 
attached *somewhere* once probed (i.e. group->domain is never NULL). At 
face value, the neatest fix IMO would probably be for SMMUv3's 
.domain_free to handle smmu_domain->devices being non-empty and detach 
them at that point. However that wouldn't be viable for virtio-iommu or 
anyone else keeping an internal one-way association of devices to their 
current domains.

If we're giving up entirely on that notion of .detach_dev going away 
then all default-domain-supporting drivers probably want checking to 
make sure that path hasn't bitrotted; both Arm SMMU drivers had it 
proactively removed 6 years ago; virtio-iommu never had it at all; newer 
drivers like apple-dart have some code there, but it won't have ever run 
until now.

We *could* stay true to the original paradigm by introducing some real 
usage of IOMMU_DOMAIN_BLOCKED, such that we could keep one or more of 
those around to actively attach to instead of having groups in this 
unattached limbo state, but that's a bigger job involving adding support 
to drivers as well; too much for a quick fix now...

Robin.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-03 13:04       ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-05-03 13:04 UTC (permalink / raw)
  To: Jason Gunthorpe, Qian Cai
  Cc: Jean-Philippe Brucker, Kevin Tian, linux-kernel, iommu

On 2022-05-02 17:42, Jason Gunthorpe wrote:
> On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
>> On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
>>> Hi Joerg,
>>>
>>> This is a resend version of v8 posted here:
>>> https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu.lu@linux.intel.com/
>>> as we discussed in this thread:
>>> https://lore.kernel.org/linux-iommu/Yk%2Fq1BGN8pC5HVZp@8bytes.org/
>>>
>>> All patches can be applied perfectly except this one:
>>>   - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
>>> It conflicts with below refactoring commit:
>>>   - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
>>> The conflict has been fixed in this post.
>>>
>>> No functional changes in this series. I suppress cc-ing this series to
>>> all v8 reviewers in order to avoid spam.
>>>
>>> Please consider it for your iommu tree.
>>
>> Reverting this series fixed an user-after-free while doing SR-IOV.
>>
>>   BUG: KASAN: use-after-free in __lock_acquire
>>   Read of size 8 at addr ffff080279825d78 by task qemu-system-aar/22429
>>   CPU: 24 PID: 22429 Comm: qemu-system-aar Not tainted 5.18.0-rc5-next-20220502 #69
>>   Call trace:
>>    dump_backtrace
>>    show_stack
>>    dump_stack_lvl
>>    print_address_description.constprop.0
>>    print_report
>>    kasan_report
>>    __asan_report_load8_noabort
>>    __lock_acquire
>>    lock_acquire.part.0
>>    lock_acquire
>>    _raw_spin_lock_irqsave
>>    arm_smmu_detach_dev
>>    arm_smmu_detach_dev at drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2377
>>    arm_smmu_attach_dev
> 
> Hum.
> 
> So what has happened is that VFIO does this sequence:
> 
>   iommu_detach_group()
>   iommu_domain_free()
>   iommu_group_release_dma_owner()
> 
> Which, I think should be valid, API wise.
> 
>  From what I can see reading the code SMMUv3 blows up above because it
> doesn't have a detach_dev op:
> 
> 	.default_domain_ops = &(const struct iommu_domain_ops) {
> 		.attach_dev		= arm_smmu_attach_dev,
> 		.map_pages		= arm_smmu_map_pages,
> 		.unmap_pages		= arm_smmu_unmap_pages,
> 		.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
> 		.iotlb_sync		= arm_smmu_iotlb_sync,
> 		.iova_to_phys		= arm_smmu_iova_to_phys,
> 		.enable_nesting		= arm_smmu_enable_nesting,
> 		.free			= arm_smmu_domain_free,
> 	}
> 
> But it is internally tracking the domain inside the master - so when
> the next domain is attached it does this:
> 
> static void arm_smmu_detach_dev(struct arm_smmu_master *master)
> {
> 	struct arm_smmu_domain *smmu_domain = master->domain;
> 
> 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> 
> And explodes as the domain has been freed but master->domain was not
> NULL'd.
> 
> It worked before because iommu_detach_group() used to attach the
> default group and that was before the domain was freed in the above
> sequence.

Oof, I totally overlooked the significance of that little subtlety in 
review :(

> I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
> the detach_dev op and null it's cached copy of the domain, but I don't
> know this driver.. Robin?

The original intent was that .detach_dev is deprecated in favour of 
default domains, and when the latter are in use, a device is always 
attached *somewhere* once probed (i.e. group->domain is never NULL). At 
face value, the neatest fix IMO would probably be for SMMUv3's 
.domain_free to handle smmu_domain->devices being non-empty and detach 
them at that point. However that wouldn't be viable for virtio-iommu or 
anyone else keeping an internal one-way association of devices to their 
current domains.

If we're giving up entirely on that notion of .detach_dev going away 
then all default-domain-supporting drivers probably want checking to 
make sure that path hasn't bitrotted; both Arm SMMU drivers had it 
proactively removed 6 years ago; virtio-iommu never had it at all; newer 
drivers like apple-dart have some code there, but it won't have ever run 
until now.

We *could* stay true to the original paradigm by introducing some real 
usage of IOMMU_DOMAIN_BLOCKED, such that we could keep one or more of 
those around to actively attach to instead of having groups in this 
unattached limbo state, but that's a bigger job involving adding support 
to drivers as well; too much for a quick fix now...

Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-03 13:04       ` Robin Murphy
@ 2022-05-03 15:23         ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-05-03 15:23 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Qian Cai, Lu Baolu, Joerg Roedel, Kevin Tian, Liu Yi L, iommu,
	linux-kernel, Jean-Philippe Brucker

On Tue, May 03, 2022 at 02:04:37PM +0100, Robin Murphy wrote:

> > I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
> > the detach_dev op and null it's cached copy of the domain, but I don't
> > know this driver.. Robin?
> 
> The original intent was that .detach_dev is deprecated in favour of default
> domains, and when the latter are in use, a device is always attached
> *somewhere* once probed (i.e. group->domain is never NULL). At face value,
> the neatest fix IMO would probably be for SMMUv3's .domain_free to handle
> smmu_domain->devices being non-empty and detach them at that point. However
> that wouldn't be viable for virtio-iommu or anyone else keeping an internal
> one-way association of devices to their current domains.

Oh wow that is not obvious

Actually, I think it is much worse than this because
iommu_group_claim_dma_owner() does a __iommu_detach_group() with the
expecation that this would actually result in DMA being blocked,
immediately. The idea that __iomuu_detatch_group() is a NOP is kind of
scary.

Leaving the group attached to the kernel DMA domain will allow
userspace to DMA to all kernel memory :\

So one approach could be to block use of iommu_group_claim_dma_owner()
if no detatch_dev op is present and then go through and put them back
or do something else. This could be short-term OK if we add an op to
SMMUv3, but long term everything would have to be fixed

Or we can allocate a dummy empty/blocked domain during
iommu_group_claim_dma_owner() and attach it whenever.

The really ugly trick is that detatch cannot fail, so attach to this
blocking domain must also not fail - IMHO this is a very complicated
API to expect for the driver to implement correctly... I see there is
already a WARN_ON that attaching to the default domain cannot
fail. Maybe this warrants an actual no-fail attach op so the driver
can be more aware of this..

And some of these internal APIs could stand some adjusting if we
really never want a true "detatch" it is always some kind of
replace/swap type operation, either to the default domain or to the
blocking domain.

> We *could* stay true to the original paradigm by introducing some real usage
> of IOMMU_DOMAIN_BLOCKED, such that we could keep one or more of those around
> to actively attach to instead of having groups in this unattached limbo
> state, but that's a bigger job involving adding support to drivers as well;
> too much for a quick fix now...

I suspect for the short term we can get by with an empty mapping
domain - using DOMAIN_BLOCKED is a bit of a refinement.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-03 15:23         ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-05-03 15:23 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jean-Philippe Brucker, Kevin Tian, Qian Cai, linux-kernel, iommu

On Tue, May 03, 2022 at 02:04:37PM +0100, Robin Murphy wrote:

> > I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
> > the detach_dev op and null it's cached copy of the domain, but I don't
> > know this driver.. Robin?
> 
> The original intent was that .detach_dev is deprecated in favour of default
> domains, and when the latter are in use, a device is always attached
> *somewhere* once probed (i.e. group->domain is never NULL). At face value,
> the neatest fix IMO would probably be for SMMUv3's .domain_free to handle
> smmu_domain->devices being non-empty and detach them at that point. However
> that wouldn't be viable for virtio-iommu or anyone else keeping an internal
> one-way association of devices to their current domains.

Oh wow that is not obvious

Actually, I think it is much worse than this because
iommu_group_claim_dma_owner() does a __iommu_detach_group() with the
expecation that this would actually result in DMA being blocked,
immediately. The idea that __iomuu_detatch_group() is a NOP is kind of
scary.

Leaving the group attached to the kernel DMA domain will allow
userspace to DMA to all kernel memory :\

So one approach could be to block use of iommu_group_claim_dma_owner()
if no detatch_dev op is present and then go through and put them back
or do something else. This could be short-term OK if we add an op to
SMMUv3, but long term everything would have to be fixed

Or we can allocate a dummy empty/blocked domain during
iommu_group_claim_dma_owner() and attach it whenever.

The really ugly trick is that detatch cannot fail, so attach to this
blocking domain must also not fail - IMHO this is a very complicated
API to expect for the driver to implement correctly... I see there is
already a WARN_ON that attaching to the default domain cannot
fail. Maybe this warrants an actual no-fail attach op so the driver
can be more aware of this..

And some of these internal APIs could stand some adjusting if we
really never want a true "detatch" it is always some kind of
replace/swap type operation, either to the default domain or to the
blocking domain.

> We *could* stay true to the original paradigm by introducing some real usage
> of IOMMU_DOMAIN_BLOCKED, such that we could keep one or more of those around
> to actively attach to instead of having groups in this unattached limbo
> state, but that's a bigger job involving adding support to drivers as well;
> too much for a quick fix now...

I suspect for the short term we can get by with an empty mapping
domain - using DOMAIN_BLOCKED is a bit of a refinement.

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-03 15:23         ` Jason Gunthorpe via iommu
@ 2022-05-03 17:22           ` Robin Murphy
  -1 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-05-03 17:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Qian Cai, Lu Baolu, Joerg Roedel, Kevin Tian, Liu Yi L, iommu,
	linux-kernel, Jean-Philippe Brucker

On 2022-05-03 16:23, Jason Gunthorpe wrote:
> On Tue, May 03, 2022 at 02:04:37PM +0100, Robin Murphy wrote:
> 
>>> I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
>>> the detach_dev op and null it's cached copy of the domain, but I don't
>>> know this driver.. Robin?
>>
>> The original intent was that .detach_dev is deprecated in favour of default
>> domains, and when the latter are in use, a device is always attached
>> *somewhere* once probed (i.e. group->domain is never NULL). At face value,
>> the neatest fix IMO would probably be for SMMUv3's .domain_free to handle
>> smmu_domain->devices being non-empty and detach them at that point. However
>> that wouldn't be viable for virtio-iommu or anyone else keeping an internal
>> one-way association of devices to their current domains.
> 
> Oh wow that is not obvious
> 
> Actually, I think it is much worse than this because
> iommu_group_claim_dma_owner() does a __iommu_detach_group() with the
> expecation that this would actually result in DMA being blocked,
> immediately. The idea that __iomuu_detatch_group() is a NOP is kind of
> scary.

Scarier than the fact that even where it *is* implemented, .detach_dev
has never had a well-defined behaviour either, and plenty of drivers
treat it as a "remove the IOMMU from the picture altogether" operation
which ends up with the device in bypass rather than blocked?

> Leaving the group attached to the kernel DMA domain will allow
> userspace to DMA to all kernel memory :\

Note that a fair amount of IOMMU hardware only has two states, thus
could only actually achieve a blocking behaviour by enabling translation
with an empty pagetable anyway. (Trivia: and technically some of them
aren't even capable of blocking invalid accesses *when* translating -
they can only apply a "default" translation targeting some scratch page)

> So one approach could be to block use of iommu_group_claim_dma_owner()
> if no detatch_dev op is present and then go through and put them back
> or do something else. This could be short-term OK if we add an op to
> SMMUv3, but long term everything would have to be fixed
> 
> Or we can allocate a dummy empty/blocked domain during
> iommu_group_claim_dma_owner() and attach it whenever.

How does the compile-tested diff below seem? There's a fair chance it's
still broken, but I don't have the bandwidth to give it much more
thought right now.

Cheers,
Robin.

----->8-----
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 29906bc16371..597d70ed7007 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -45,6 +45,7 @@ struct iommu_group {
  	int id;
  	struct iommu_domain *default_domain;
  	struct iommu_domain *domain;
+	struct iommu_domain *purgatory;
  	struct list_head entry;
  	unsigned int owner_cnt;
  	void *owner;
@@ -596,6 +597,8 @@ static void iommu_group_release(struct kobject *kobj)
  
  	if (group->default_domain)
  		iommu_domain_free(group->default_domain);
+	if (group->purgatory)
+		iommu_domain_free(group->purgatory);
  
  	kfree(group->name);
  	kfree(group);
@@ -2041,6 +2044,12 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
  	return dev->iommu_group->default_domain;
  }
  
+static bool iommu_group_user_attached(struct iommu_group *group)
+{
+	return group->domain && group->domain != group->default_domain &&
+	       group->domain != group->purgatory;
+}
+
  /*
   * IOMMU groups are really the natural working unit of the IOMMU, but
   * the IOMMU API works on domains and devices.  Bridge that gap by
@@ -2063,7 +2072,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
  {
  	int ret;
  
-	if (group->domain && group->domain != group->default_domain)
+	if (iommu_group_user_attached(group))
  		return -EBUSY;
  
  	ret = __iommu_group_for_each_dev(group, domain,
@@ -2104,7 +2113,12 @@ static void __iommu_detach_group(struct iommu_domain *domain,
  	 * If the group has been claimed already, do not re-attach the default
  	 * domain.
  	 */
-	if (!group->default_domain || group->owner) {
+	if (group->owner) {
+		WARN_ON(__iommu_attach_group(group->purgatory, group));
+		return;
+	}
+
+	if (!group->default_domain) {
  		__iommu_group_for_each_dev(group, domain,
  					   iommu_group_do_detach_device);
  		group->domain = NULL;
@@ -3111,6 +3125,25 @@ void iommu_device_unuse_default_domain(struct device *dev)
  	iommu_group_put(group);
  }
  
+static struct iommu_domain *iommu_group_get_purgatory(struct iommu_group *group)
+{
+	struct group_device *dev;
+
+	mutex_lock(&group->mutex);
+	if (group->purgatory)
+		goto out;
+
+	dev = list_first_entry(&group->devices, struct group_device, list);
+	group->purgatory = __iommu_domain_alloc(dev->dev->bus,
+						IOMMU_DOMAIN_BLOCKED);
+	if (!group->purgatory)
+		group->purgatory = __iommu_domain_alloc(dev->dev->bus,
+							IOMMU_DOMAIN_UNMANAGED);
+out:
+	mutex_unlock(&group->mutex);
+	return group->purgatory;
+}
+
  /**
   * iommu_group_claim_dma_owner() - Set DMA ownership of a group
   * @group: The group.
@@ -3122,6 +3155,7 @@ void iommu_device_unuse_default_domain(struct device *dev)
   */
  int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
  {
+	struct iommu_domain *pd;
  	int ret = 0;
  
  	mutex_lock(&group->mutex);
@@ -3133,10 +3167,13 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
  			ret = -EBUSY;
  			goto unlock_out;
  		}
+		pd = iommu_group_get_purgatory(group);
+		if (!pd)
+			return -ENOMEM;
  
  		group->owner = owner;
-		if (group->domain)
-			__iommu_detach_group(group->domain, group);
+		if (group->domain && group->domain != pd)
+			__iommu_attach_group(pd, group);
  	}
  
  	group->owner_cnt++;
@@ -3164,7 +3201,7 @@ void iommu_group_release_dma_owner(struct iommu_group *group)
  	 * The UNMANAGED domain should be detached before all USER
  	 * owners have been released.
  	 */
-	if (!WARN_ON(group->domain) && group->default_domain)
+	if (!WARN_ON(iommu_group_user_attached(group) && group->default_domain))
  		__iommu_attach_group(group->default_domain, group);
  	group->owner = NULL;
  unlock_out:

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-03 17:22           ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-05-03 17:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jean-Philippe Brucker, Kevin Tian, Qian Cai, linux-kernel, iommu

On 2022-05-03 16:23, Jason Gunthorpe wrote:
> On Tue, May 03, 2022 at 02:04:37PM +0100, Robin Murphy wrote:
> 
>>> I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
>>> the detach_dev op and null it's cached copy of the domain, but I don't
>>> know this driver.. Robin?
>>
>> The original intent was that .detach_dev is deprecated in favour of default
>> domains, and when the latter are in use, a device is always attached
>> *somewhere* once probed (i.e. group->domain is never NULL). At face value,
>> the neatest fix IMO would probably be for SMMUv3's .domain_free to handle
>> smmu_domain->devices being non-empty and detach them at that point. However
>> that wouldn't be viable for virtio-iommu or anyone else keeping an internal
>> one-way association of devices to their current domains.
> 
> Oh wow that is not obvious
> 
> Actually, I think it is much worse than this because
> iommu_group_claim_dma_owner() does a __iommu_detach_group() with the
> expecation that this would actually result in DMA being blocked,
> immediately. The idea that __iomuu_detatch_group() is a NOP is kind of
> scary.

Scarier than the fact that even where it *is* implemented, .detach_dev
has never had a well-defined behaviour either, and plenty of drivers
treat it as a "remove the IOMMU from the picture altogether" operation
which ends up with the device in bypass rather than blocked?

> Leaving the group attached to the kernel DMA domain will allow
> userspace to DMA to all kernel memory :\

Note that a fair amount of IOMMU hardware only has two states, thus
could only actually achieve a blocking behaviour by enabling translation
with an empty pagetable anyway. (Trivia: and technically some of them
aren't even capable of blocking invalid accesses *when* translating -
they can only apply a "default" translation targeting some scratch page)

> So one approach could be to block use of iommu_group_claim_dma_owner()
> if no detatch_dev op is present and then go through and put them back
> or do something else. This could be short-term OK if we add an op to
> SMMUv3, but long term everything would have to be fixed
> 
> Or we can allocate a dummy empty/blocked domain during
> iommu_group_claim_dma_owner() and attach it whenever.

How does the compile-tested diff below seem? There's a fair chance it's
still broken, but I don't have the bandwidth to give it much more
thought right now.

Cheers,
Robin.

----->8-----
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 29906bc16371..597d70ed7007 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -45,6 +45,7 @@ struct iommu_group {
  	int id;
  	struct iommu_domain *default_domain;
  	struct iommu_domain *domain;
+	struct iommu_domain *purgatory;
  	struct list_head entry;
  	unsigned int owner_cnt;
  	void *owner;
@@ -596,6 +597,8 @@ static void iommu_group_release(struct kobject *kobj)
  
  	if (group->default_domain)
  		iommu_domain_free(group->default_domain);
+	if (group->purgatory)
+		iommu_domain_free(group->purgatory);
  
  	kfree(group->name);
  	kfree(group);
@@ -2041,6 +2044,12 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
  	return dev->iommu_group->default_domain;
  }
  
+static bool iommu_group_user_attached(struct iommu_group *group)
+{
+	return group->domain && group->domain != group->default_domain &&
+	       group->domain != group->purgatory;
+}
+
  /*
   * IOMMU groups are really the natural working unit of the IOMMU, but
   * the IOMMU API works on domains and devices.  Bridge that gap by
@@ -2063,7 +2072,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
  {
  	int ret;
  
-	if (group->domain && group->domain != group->default_domain)
+	if (iommu_group_user_attached(group))
  		return -EBUSY;
  
  	ret = __iommu_group_for_each_dev(group, domain,
@@ -2104,7 +2113,12 @@ static void __iommu_detach_group(struct iommu_domain *domain,
  	 * If the group has been claimed already, do not re-attach the default
  	 * domain.
  	 */
-	if (!group->default_domain || group->owner) {
+	if (group->owner) {
+		WARN_ON(__iommu_attach_group(group->purgatory, group));
+		return;
+	}
+
+	if (!group->default_domain) {
  		__iommu_group_for_each_dev(group, domain,
  					   iommu_group_do_detach_device);
  		group->domain = NULL;
@@ -3111,6 +3125,25 @@ void iommu_device_unuse_default_domain(struct device *dev)
  	iommu_group_put(group);
  }
  
+static struct iommu_domain *iommu_group_get_purgatory(struct iommu_group *group)
+{
+	struct group_device *dev;
+
+	mutex_lock(&group->mutex);
+	if (group->purgatory)
+		goto out;
+
+	dev = list_first_entry(&group->devices, struct group_device, list);
+	group->purgatory = __iommu_domain_alloc(dev->dev->bus,
+						IOMMU_DOMAIN_BLOCKED);
+	if (!group->purgatory)
+		group->purgatory = __iommu_domain_alloc(dev->dev->bus,
+							IOMMU_DOMAIN_UNMANAGED);
+out:
+	mutex_unlock(&group->mutex);
+	return group->purgatory;
+}
+
  /**
   * iommu_group_claim_dma_owner() - Set DMA ownership of a group
   * @group: The group.
@@ -3122,6 +3155,7 @@ void iommu_device_unuse_default_domain(struct device *dev)
   */
  int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
  {
+	struct iommu_domain *pd;
  	int ret = 0;
  
  	mutex_lock(&group->mutex);
@@ -3133,10 +3167,13 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
  			ret = -EBUSY;
  			goto unlock_out;
  		}
+		pd = iommu_group_get_purgatory(group);
+		if (!pd)
+			return -ENOMEM;
  
  		group->owner = owner;
-		if (group->domain)
-			__iommu_detach_group(group->domain, group);
+		if (group->domain && group->domain != pd)
+			__iommu_attach_group(pd, group);
  	}
  
  	group->owner_cnt++;
@@ -3164,7 +3201,7 @@ void iommu_group_release_dma_owner(struct iommu_group *group)
  	 * The UNMANAGED domain should be detached before all USER
  	 * owners have been released.
  	 */
-	if (!WARN_ON(group->domain) && group->default_domain)
+	if (!WARN_ON(iommu_group_user_attached(group) && group->default_domain))
  		__iommu_attach_group(group->default_domain, group);
  	group->owner = NULL;
  unlock_out:
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-02 16:12   ` Qian Cai
@ 2022-05-04  8:42     ` Joerg Roedel
  -1 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-05-04  8:42 UTC (permalink / raw)
  To: Qian Cai, Alex Williamson
  Cc: Lu Baolu, Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
> Reverting this series fixed an user-after-free while doing SR-IOV.
> 
>  BUG: KASAN: use-after-free in __lock_acquire

Hrm, okay. I am going exclude this series from my next branch for now
until this has been sorted out.

Alex, I suggest you do the same.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-04  8:42     ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-05-04  8:42 UTC (permalink / raw)
  To: Qian Cai, Alex Williamson
  Cc: Kevin Tian, linux-kernel, iommu, Jason Gunthorpe

On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
> Reverting this series fixed an user-after-free while doing SR-IOV.
> 
>  BUG: KASAN: use-after-free in __lock_acquire

Hrm, okay. I am going exclude this series from my next branch for now
until this has been sorted out.

Alex, I suggest you do the same.

Regards,

	Joerg
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-04  8:42     ` Joerg Roedel
@ 2022-05-04 11:51       ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-05-04 11:51 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Qian Cai, Alex Williamson, Lu Baolu, Kevin Tian, Liu Yi L, iommu,
	linux-kernel

On Wed, May 04, 2022 at 10:42:07AM +0200, Joerg Roedel wrote:
> On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
> > Reverting this series fixed an user-after-free while doing SR-IOV.
> > 
> >  BUG: KASAN: use-after-free in __lock_acquire
>
> Hrm, okay. I am going exclude this series from my next branch for now
> until this has been sorted out.

This is going to blow up everything going on in vfio right now, let's
not do something so drastic please.

There is already a patch to fix it, lets wait for it to get sorted
out.

Nicolin and Eric have been testing with this series on ARM for a long
time now, it is not like it is completely broken.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-04 11:51       ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-05-04 11:51 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, Qian Cai, iommu, linux-kernel, Alex Williamson

On Wed, May 04, 2022 at 10:42:07AM +0200, Joerg Roedel wrote:
> On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
> > Reverting this series fixed an user-after-free while doing SR-IOV.
> > 
> >  BUG: KASAN: use-after-free in __lock_acquire
>
> Hrm, okay. I am going exclude this series from my next branch for now
> until this has been sorted out.

This is going to blow up everything going on in vfio right now, let's
not do something so drastic please.

There is already a patch to fix it, lets wait for it to get sorted
out.

Nicolin and Eric have been testing with this series on ARM for a long
time now, it is not like it is completely broken.

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-04 11:51       ` Jason Gunthorpe via iommu
@ 2022-05-04 11:57         ` Joerg Roedel
  -1 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-05-04 11:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Qian Cai, Alex Williamson, Lu Baolu, Kevin Tian, Liu Yi L, iommu,
	linux-kernel

On Wed, May 04, 2022 at 08:51:35AM -0300, Jason Gunthorpe wrote:
> Nicolin and Eric have been testing with this series on ARM for a long
> time now, it is not like it is completely broken.

Yeah, I am also optimistic this can be fixed soon. But the rule is that
the next branch should only contain patches which I would send to Linus.
And with a known issue in it I wouldn't, so it is excluded at least from
my next branch for now. The topic branch is still alive and I will merge
it again when the fix is in.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-04 11:57         ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-05-04 11:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Kevin Tian, Qian Cai, iommu, linux-kernel, Alex Williamson

On Wed, May 04, 2022 at 08:51:35AM -0300, Jason Gunthorpe wrote:
> Nicolin and Eric have been testing with this series on ARM for a long
> time now, it is not like it is completely broken.

Yeah, I am also optimistic this can be fixed soon. But the rule is that
the next branch should only contain patches which I would send to Linus.
And with a known issue in it I wouldn't, so it is excluded at least from
my next branch for now. The topic branch is still alive and I will merge
it again when the fix is in.

Regards,

	Joerg
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-04  8:42     ` Joerg Roedel
@ 2022-05-04 16:29       ` Alex Williamson
  -1 siblings, 0 replies; 67+ messages in thread
From: Alex Williamson @ 2022-05-04 16:29 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Qian Cai, Kevin Tian, linux-kernel, iommu, Jason Gunthorpe

On Wed, 4 May 2022 10:42:07 +0200
Joerg Roedel <joro@8bytes.org> wrote:

> On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
> > Reverting this series fixed an user-after-free while doing SR-IOV.
> > 
> >  BUG: KASAN: use-after-free in __lock_acquire  
> 
> Hrm, okay. I am going exclude this series from my next branch for now
> until this has been sorted out.
> 
> Alex, I suggest you do the same.

Done, and thanks for the heads-up.  Please try to cc me when the
vfio-notifier-fix branch is merged back into your next branch.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-04 16:29       ` Alex Williamson
  0 siblings, 0 replies; 67+ messages in thread
From: Alex Williamson @ 2022-05-04 16:29 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, Qian Cai, linux-kernel, iommu, Jason Gunthorpe

On Wed, 4 May 2022 10:42:07 +0200
Joerg Roedel <joro@8bytes.org> wrote:

> On Mon, May 02, 2022 at 12:12:04PM -0400, Qian Cai wrote:
> > Reverting this series fixed an user-after-free while doing SR-IOV.
> > 
> >  BUG: KASAN: use-after-free in __lock_acquire  
> 
> Hrm, okay. I am going exclude this series from my next branch for now
> until this has been sorted out.
> 
> Alex, I suggest you do the same.

Done, and thanks for the heads-up.  Please try to cc me when the
vfio-notifier-fix branch is merged back into your next branch.  Thanks,

Alex

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-04 11:57         ` Joerg Roedel
@ 2022-05-09 18:33           ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-05-09 18:33 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Qian Cai, Alex Williamson, Lu Baolu, Kevin Tian, Liu Yi L, iommu,
	linux-kernel

On Wed, May 04, 2022 at 01:57:05PM +0200, Joerg Roedel wrote:
> On Wed, May 04, 2022 at 08:51:35AM -0300, Jason Gunthorpe wrote:
> > Nicolin and Eric have been testing with this series on ARM for a long
> > time now, it is not like it is completely broken.
> 
> Yeah, I am also optimistic this can be fixed soon. But the rule is that
> the next branch should only contain patches which I would send to Linus.
> And with a known issue in it I wouldn't, so it is excluded at least from
> my next branch for now. The topic branch is still alive and I will merge
> it again when the fix is in.

The fix is out, lets merge it back in so we can have some more time to
discover any additional issues. People seem to test when it is in your
branch.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-09 18:33           ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-05-09 18:33 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kevin Tian, Qian Cai, iommu, linux-kernel, Alex Williamson

On Wed, May 04, 2022 at 01:57:05PM +0200, Joerg Roedel wrote:
> On Wed, May 04, 2022 at 08:51:35AM -0300, Jason Gunthorpe wrote:
> > Nicolin and Eric have been testing with this series on ARM for a long
> > time now, it is not like it is completely broken.
> 
> Yeah, I am also optimistic this can be fixed soon. But the rule is that
> the next branch should only contain patches which I would send to Linus.
> And with a known issue in it I wouldn't, so it is excluded at least from
> my next branch for now. The topic branch is still alive and I will merge
> it again when the fix is in.

The fix is out, lets merge it back in so we can have some more time to
discover any additional issues. People seem to test when it is in your
branch.

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-09 18:33           ` Jason Gunthorpe via iommu
@ 2022-05-13  8:13             ` Tian, Kevin
  -1 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-05-13  8:13 UTC (permalink / raw)
  To: Jason Gunthorpe, Joerg Roedel
  Cc: Qian Cai, iommu, linux-kernel, Alex Williamson

> From: Jason Gunthorpe
> Sent: Tuesday, May 10, 2022 2:33 AM
> 
> On Wed, May 04, 2022 at 01:57:05PM +0200, Joerg Roedel wrote:
> > On Wed, May 04, 2022 at 08:51:35AM -0300, Jason Gunthorpe wrote:
> > > Nicolin and Eric have been testing with this series on ARM for a long
> > > time now, it is not like it is completely broken.
> >
> > Yeah, I am also optimistic this can be fixed soon. But the rule is that
> > the next branch should only contain patches which I would send to Linus.
> > And with a known issue in it I wouldn't, so it is excluded at least from
> > my next branch for now. The topic branch is still alive and I will merge
> > it again when the fix is in.
> 
> The fix is out, lets merge it back in so we can have some more time to
> discover any additional issues. People seem to test when it is in your
> branch.
> 

Joerg, any chance you may give it a priority? This is the first step of
a long refactoring effort and it has been gating quite a few
well-reviewed improvements down the road. having it tested earlier
in your branch is definitely appreciated. 😊

Thanks
Kevin


^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-13  8:13             ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-05-13  8:13 UTC (permalink / raw)
  To: Jason Gunthorpe, Joerg Roedel
  Cc: Alex Williamson, iommu, Qian Cai, linux-kernel

> From: Jason Gunthorpe
> Sent: Tuesday, May 10, 2022 2:33 AM
> 
> On Wed, May 04, 2022 at 01:57:05PM +0200, Joerg Roedel wrote:
> > On Wed, May 04, 2022 at 08:51:35AM -0300, Jason Gunthorpe wrote:
> > > Nicolin and Eric have been testing with this series on ARM for a long
> > > time now, it is not like it is completely broken.
> >
> > Yeah, I am also optimistic this can be fixed soon. But the rule is that
> > the next branch should only contain patches which I would send to Linus.
> > And with a known issue in it I wouldn't, so it is excluded at least from
> > my next branch for now. The topic branch is still alive and I will merge
> > it again when the fix is in.
> 
> The fix is out, lets merge it back in so we can have some more time to
> discover any additional issues. People seem to test when it is in your
> branch.
> 

Joerg, any chance you may give it a priority? This is the first step of
a long refactoring effort and it has been gating quite a few
well-reviewed improvements down the road. having it tested earlier
in your branch is definitely appreciated. 😊

Thanks
Kevin

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-04 16:29       ` Alex Williamson
@ 2022-05-13 15:49         ` Joerg Roedel
  -1 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-05-13 15:49 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Qian Cai, Kevin Tian, linux-kernel, iommu, Jason Gunthorpe

Hi Alex,

On Wed, May 04, 2022 at 10:29:56AM -0600, Alex Williamson wrote:
> Done, and thanks for the heads-up.  Please try to cc me when the
> vfio-notifier-fix branch is merged back into your next branch.  Thanks,

This has happened now, the vfio-notifier-fix branch got the fix and is
merged back into my next branch.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-13 15:49         ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-05-13 15:49 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Kevin Tian, Qian Cai, linux-kernel, iommu, Jason Gunthorpe

Hi Alex,

On Wed, May 04, 2022 at 10:29:56AM -0600, Alex Williamson wrote:
> Done, and thanks for the heads-up.  Please try to cc me when the
> vfio-notifier-fix branch is merged back into your next branch.  Thanks,

This has happened now, the vfio-notifier-fix branch got the fix and is
merged back into my next branch.

Regards,

	Joerg
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-13 15:49         ` Joerg Roedel
@ 2022-05-13 16:25           ` Alex Williamson
  -1 siblings, 0 replies; 67+ messages in thread
From: Alex Williamson @ 2022-05-13 16:25 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe; +Cc: Qian Cai, Kevin Tian, linux-kernel, iommu

On Fri, 13 May 2022 17:49:44 +0200
Joerg Roedel <joro@8bytes.org> wrote:

> Hi Alex,
> 
> On Wed, May 04, 2022 at 10:29:56AM -0600, Alex Williamson wrote:
> > Done, and thanks for the heads-up.  Please try to cc me when the
> > vfio-notifier-fix branch is merged back into your next branch.  Thanks,  
> 
> This has happened now, the vfio-notifier-fix branch got the fix and is
> merged back into my next branch.

Thanks, Joerg!

Jason, I'll push a merge of this with

Subject: [PATCH] vfio: Delete container_q
0-v1-a1e8791d795b+6b-vfio_container_q_jgg@nvidia.com

and

Subject: [PATCH v3 0/8] Remove vfio_group from the struct file facing VFIO API
0-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com

as soon as my sanity build finishes.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-13 16:25           ` Alex Williamson
  0 siblings, 0 replies; 67+ messages in thread
From: Alex Williamson @ 2022-05-13 16:25 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe; +Cc: Kevin Tian, Qian Cai, linux-kernel, iommu

On Fri, 13 May 2022 17:49:44 +0200
Joerg Roedel <joro@8bytes.org> wrote:

> Hi Alex,
> 
> On Wed, May 04, 2022 at 10:29:56AM -0600, Alex Williamson wrote:
> > Done, and thanks for the heads-up.  Please try to cc me when the
> > vfio-notifier-fix branch is merged back into your next branch.  Thanks,  
> 
> This has happened now, the vfio-notifier-fix branch got the fix and is
> merged back into my next branch.

Thanks, Joerg!

Jason, I'll push a merge of this with

Subject: [PATCH] vfio: Delete container_q
0-v1-a1e8791d795b+6b-vfio_container_q_jgg@nvidia.com

and

Subject: [PATCH v3 0/8] Remove vfio_group from the struct file facing VFIO API
0-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com

as soon as my sanity build finishes.  Thanks,

Alex

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-05-13 16:25           ` Alex Williamson
@ 2022-05-13 19:06             ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-05-13 19:06 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Joerg Roedel, Qian Cai, Kevin Tian, linux-kernel, iommu

On Fri, May 13, 2022 at 10:25:48AM -0600, Alex Williamson wrote:
> On Fri, 13 May 2022 17:49:44 +0200
> Joerg Roedel <joro@8bytes.org> wrote:
> 
> > Hi Alex,
> > 
> > On Wed, May 04, 2022 at 10:29:56AM -0600, Alex Williamson wrote:
> > > Done, and thanks for the heads-up.  Please try to cc me when the
> > > vfio-notifier-fix branch is merged back into your next branch.  Thanks,  
> > 
> > This has happened now, the vfio-notifier-fix branch got the fix and is
> > merged back into my next branch.
> 
> Thanks, Joerg!
> 
> Jason, I'll push a merge of this with
> 
> Subject: [PATCH] vfio: Delete container_q
> 0-v1-a1e8791d795b+6b-vfio_container_q_jgg@nvidia.com
> 
> and
> 
> Subject: [PATCH v3 0/8] Remove vfio_group from the struct file facing VFIO API
> 0-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com
> 
> as soon as my sanity build finishes.  Thanks,

Thanks, I'll rebase and repost the remaining vfio series.

Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-05-13 19:06             ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-05-13 19:06 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Qian Cai, Kevin Tian, iommu, linux-kernel

On Fri, May 13, 2022 at 10:25:48AM -0600, Alex Williamson wrote:
> On Fri, 13 May 2022 17:49:44 +0200
> Joerg Roedel <joro@8bytes.org> wrote:
> 
> > Hi Alex,
> > 
> > On Wed, May 04, 2022 at 10:29:56AM -0600, Alex Williamson wrote:
> > > Done, and thanks for the heads-up.  Please try to cc me when the
> > > vfio-notifier-fix branch is merged back into your next branch.  Thanks,  
> > 
> > This has happened now, the vfio-notifier-fix branch got the fix and is
> > merged back into my next branch.
> 
> Thanks, Joerg!
> 
> Jason, I'll push a merge of this with
> 
> Subject: [PATCH] vfio: Delete container_q
> 0-v1-a1e8791d795b+6b-vfio_container_q_jgg@nvidia.com
> 
> and
> 
> Subject: [PATCH v3 0/8] Remove vfio_group from the struct file facing VFIO API
> 0-v3-f7729924a7ea+25e33-vfio_kvm_no_group_jgg@nvidia.com
> 
> as soon as my sanity build finishes.  Thanks,

Thanks, I'll rebase and repost the remaining vfio series.

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces
  2022-04-18  0:49   ` Lu Baolu
@ 2022-06-15  9:53     ` Steven Price
  -1 siblings, 0 replies; 67+ messages in thread
From: Steven Price @ 2022-06-15  9:53 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel
  Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel, Robin Murphy

On 18/04/2022 01:49, Lu Baolu wrote:
> Multiple devices may be placed in the same IOMMU group because they
> cannot be isolated from each other. These devices must either be
> entirely under kernel control or userspace control, never a mixture.
> 
> This adds dma ownership management in iommu core and exposes several
> interfaces for the device drivers and the device userspace assignment
> framework (i.e. VFIO), so that any conflict between user and kernel
> controlled dma could be detected at the beginning.
> 
> The device driver oriented interfaces are,
> 
> 	int iommu_device_use_default_domain(struct device *dev);
> 	void iommu_device_unuse_default_domain(struct device *dev);
> 
> By calling iommu_device_use_default_domain(), the device driver tells
> the iommu layer that the device dma is handled through the kernel DMA
> APIs. The iommu layer will manage the IOVA and use the default domain
> for DMA address translation.
> 
> The device user-space assignment framework oriented interfaces are,
> 
> 	int iommu_group_claim_dma_owner(struct iommu_group *group,
> 					void *owner);
> 	void iommu_group_release_dma_owner(struct iommu_group *group);
> 	bool iommu_group_dma_owner_claimed(struct iommu_group *group);
> 
> The device userspace assignment must be disallowed if the DMA owner
> claiming interface returns failure.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>

I'm seeing a regression that I've bisected to this commit on a Firefly
RK3288 board. The display driver fails to probe properly because
__iommu_attach_group() returns -EBUSY. This causes long hangs and splats
as the display flips timeout.

The call stack to __iommu_attach_group() is:

 __iommu_attach_group from iommu_attach_device+0x64/0xb4
 iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
 rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
 vop_crtc_atomic_enable from drm_atomic_helper_commit_modeset_enables+0xa8/0x290
 drm_atomic_helper_commit_modeset_enables from drm_atomic_helper_commit_tail_rpm+0x44/0x8c
 drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
 commit_tail from drm_atomic_helper_commit+0x164/0x18c
 drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
 drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
 drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1c8
 drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
 drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
 drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
 drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224

> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
>  {
>  	int ret;
>  
> -	if (group->default_domain && group->domain != group->default_domain)
> +	if (group->domain && group->domain != group->default_domain)
>  		return -EBUSY;
>  
>  	ret = __iommu_group_for_each_dev(group, domain,

Reverting this 'fixes' the problem for me. The follow up 0286300e6045
("iommu: iommu_group_claim_dma_owner() must always assign a domain")
doesn't help.

Adding some debug printks I can see that domain is a valid pointer, but
both default_domain and blocking_domain are NULL.

I'm using the DTB from the kernel tree (rk3288-firefly.dtb).

Any ideas?

Thanks,

Steve

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces
@ 2022-06-15  9:53     ` Steven Price
  0 siblings, 0 replies; 67+ messages in thread
From: Steven Price @ 2022-06-15  9:53 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel
  Cc: Kevin Tian, linux-kernel, iommu, Jason Gunthorpe, Robin Murphy

On 18/04/2022 01:49, Lu Baolu wrote:
> Multiple devices may be placed in the same IOMMU group because they
> cannot be isolated from each other. These devices must either be
> entirely under kernel control or userspace control, never a mixture.
> 
> This adds dma ownership management in iommu core and exposes several
> interfaces for the device drivers and the device userspace assignment
> framework (i.e. VFIO), so that any conflict between user and kernel
> controlled dma could be detected at the beginning.
> 
> The device driver oriented interfaces are,
> 
> 	int iommu_device_use_default_domain(struct device *dev);
> 	void iommu_device_unuse_default_domain(struct device *dev);
> 
> By calling iommu_device_use_default_domain(), the device driver tells
> the iommu layer that the device dma is handled through the kernel DMA
> APIs. The iommu layer will manage the IOVA and use the default domain
> for DMA address translation.
> 
> The device user-space assignment framework oriented interfaces are,
> 
> 	int iommu_group_claim_dma_owner(struct iommu_group *group,
> 					void *owner);
> 	void iommu_group_release_dma_owner(struct iommu_group *group);
> 	bool iommu_group_dma_owner_claimed(struct iommu_group *group);
> 
> The device userspace assignment must be disallowed if the DMA owner
> claiming interface returns failure.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>

I'm seeing a regression that I've bisected to this commit on a Firefly
RK3288 board. The display driver fails to probe properly because
__iommu_attach_group() returns -EBUSY. This causes long hangs and splats
as the display flips timeout.

The call stack to __iommu_attach_group() is:

 __iommu_attach_group from iommu_attach_device+0x64/0xb4
 iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
 rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
 vop_crtc_atomic_enable from drm_atomic_helper_commit_modeset_enables+0xa8/0x290
 drm_atomic_helper_commit_modeset_enables from drm_atomic_helper_commit_tail_rpm+0x44/0x8c
 drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
 commit_tail from drm_atomic_helper_commit+0x164/0x18c
 drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
 drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
 drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1c8
 drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
 drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
 drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
 drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224

> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
>  {
>  	int ret;
>  
> -	if (group->default_domain && group->domain != group->default_domain)
> +	if (group->domain && group->domain != group->default_domain)
>  		return -EBUSY;
>  
>  	ret = __iommu_group_for_each_dev(group, domain,

Reverting this 'fixes' the problem for me. The follow up 0286300e6045
("iommu: iommu_group_claim_dma_owner() must always assign a domain")
doesn't help.

Adding some debug printks I can see that domain is a valid pointer, but
both default_domain and blocking_domain are NULL.

I'm using the DTB from the kernel tree (rk3288-firefly.dtb).

Any ideas?

Thanks,

Steve
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces
  2022-06-15  9:53     ` Steven Price
@ 2022-06-15 10:57       ` Robin Murphy
  -1 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-06-15 10:57 UTC (permalink / raw)
  To: Steven Price, Lu Baolu, Joerg Roedel
  Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

On 2022-06-15 10:53, Steven Price wrote:
> On 18/04/2022 01:49, Lu Baolu wrote:
>> Multiple devices may be placed in the same IOMMU group because they
>> cannot be isolated from each other. These devices must either be
>> entirely under kernel control or userspace control, never a mixture.
>>
>> This adds dma ownership management in iommu core and exposes several
>> interfaces for the device drivers and the device userspace assignment
>> framework (i.e. VFIO), so that any conflict between user and kernel
>> controlled dma could be detected at the beginning.
>>
>> The device driver oriented interfaces are,
>>
>> 	int iommu_device_use_default_domain(struct device *dev);
>> 	void iommu_device_unuse_default_domain(struct device *dev);
>>
>> By calling iommu_device_use_default_domain(), the device driver tells
>> the iommu layer that the device dma is handled through the kernel DMA
>> APIs. The iommu layer will manage the IOVA and use the default domain
>> for DMA address translation.
>>
>> The device user-space assignment framework oriented interfaces are,
>>
>> 	int iommu_group_claim_dma_owner(struct iommu_group *group,
>> 					void *owner);
>> 	void iommu_group_release_dma_owner(struct iommu_group *group);
>> 	bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>>
>> The device userspace assignment must be disallowed if the DMA owner
>> claiming interface returns failure.
>>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Kevin Tian <kevin.tian@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> 
> I'm seeing a regression that I've bisected to this commit on a Firefly
> RK3288 board. The display driver fails to probe properly because
> __iommu_attach_group() returns -EBUSY. This causes long hangs and splats
> as the display flips timeout.
> 
> The call stack to __iommu_attach_group() is:
> 
>   __iommu_attach_group from iommu_attach_device+0x64/0xb4
>   iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
>   rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
>   vop_crtc_atomic_enable from drm_atomic_helper_commit_modeset_enables+0xa8/0x290
>   drm_atomic_helper_commit_modeset_enables from drm_atomic_helper_commit_tail_rpm+0x44/0x8c
>   drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
>   commit_tail from drm_atomic_helper_commit+0x164/0x18c
>   drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
>   drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
>   drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1c8
>   drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
>   drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
>   drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
>   drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224
> 
>> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
>>   {
>>   	int ret;
>>   
>> -	if (group->default_domain && group->domain != group->default_domain)
>> +	if (group->domain && group->domain != group->default_domain)
>>   		return -EBUSY;
>>   
>>   	ret = __iommu_group_for_each_dev(group, domain,
> 
> Reverting this 'fixes' the problem for me. The follow up 0286300e6045
> ("iommu: iommu_group_claim_dma_owner() must always assign a domain")
> doesn't help.
> 
> Adding some debug printks I can see that domain is a valid pointer, but
> both default_domain and blocking_domain are NULL.
> 
> I'm using the DTB from the kernel tree (rk3288-firefly.dtb).
> 
> Any ideas?

Hmm, TBH I'm not sure how that worked previously... it'll be complaining 
because the ARM DMA domain is still attached, but even when the attach 
goes ahead and replaces the ARM domain with the driver's new one, it's 
not using the special arm_iommu_detach_device() interface anywhere so 
the device would still be left with the wrong DMA ops :/

I guess the most pragmatic option is probably to give rockchip-drm a 
similar bodge to exynos and tegra, to explicitly remove the ARM domain 
before attaching its own.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces
@ 2022-06-15 10:57       ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-06-15 10:57 UTC (permalink / raw)
  To: Steven Price, Lu Baolu, Joerg Roedel
  Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

On 2022-06-15 10:53, Steven Price wrote:
> On 18/04/2022 01:49, Lu Baolu wrote:
>> Multiple devices may be placed in the same IOMMU group because they
>> cannot be isolated from each other. These devices must either be
>> entirely under kernel control or userspace control, never a mixture.
>>
>> This adds dma ownership management in iommu core and exposes several
>> interfaces for the device drivers and the device userspace assignment
>> framework (i.e. VFIO), so that any conflict between user and kernel
>> controlled dma could be detected at the beginning.
>>
>> The device driver oriented interfaces are,
>>
>> 	int iommu_device_use_default_domain(struct device *dev);
>> 	void iommu_device_unuse_default_domain(struct device *dev);
>>
>> By calling iommu_device_use_default_domain(), the device driver tells
>> the iommu layer that the device dma is handled through the kernel DMA
>> APIs. The iommu layer will manage the IOVA and use the default domain
>> for DMA address translation.
>>
>> The device user-space assignment framework oriented interfaces are,
>>
>> 	int iommu_group_claim_dma_owner(struct iommu_group *group,
>> 					void *owner);
>> 	void iommu_group_release_dma_owner(struct iommu_group *group);
>> 	bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>>
>> The device userspace assignment must be disallowed if the DMA owner
>> claiming interface returns failure.
>>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Kevin Tian <kevin.tian@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> 
> I'm seeing a regression that I've bisected to this commit on a Firefly
> RK3288 board. The display driver fails to probe properly because
> __iommu_attach_group() returns -EBUSY. This causes long hangs and splats
> as the display flips timeout.
> 
> The call stack to __iommu_attach_group() is:
> 
>   __iommu_attach_group from iommu_attach_device+0x64/0xb4
>   iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
>   rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
>   vop_crtc_atomic_enable from drm_atomic_helper_commit_modeset_enables+0xa8/0x290
>   drm_atomic_helper_commit_modeset_enables from drm_atomic_helper_commit_tail_rpm+0x44/0x8c
>   drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
>   commit_tail from drm_atomic_helper_commit+0x164/0x18c
>   drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
>   drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
>   drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1c8
>   drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
>   drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
>   drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
>   drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224
> 
>> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
>>   {
>>   	int ret;
>>   
>> -	if (group->default_domain && group->domain != group->default_domain)
>> +	if (group->domain && group->domain != group->default_domain)
>>   		return -EBUSY;
>>   
>>   	ret = __iommu_group_for_each_dev(group, domain,
> 
> Reverting this 'fixes' the problem for me. The follow up 0286300e6045
> ("iommu: iommu_group_claim_dma_owner() must always assign a domain")
> doesn't help.
> 
> Adding some debug printks I can see that domain is a valid pointer, but
> both default_domain and blocking_domain are NULL.
> 
> I'm using the DTB from the kernel tree (rk3288-firefly.dtb).
> 
> Any ideas?

Hmm, TBH I'm not sure how that worked previously... it'll be complaining 
because the ARM DMA domain is still attached, but even when the attach 
goes ahead and replaces the ARM domain with the driver's new one, it's 
not using the special arm_iommu_detach_device() interface anywhere so 
the device would still be left with the wrong DMA ops :/

I guess the most pragmatic option is probably to give rockchip-drm a 
similar bodge to exynos and tegra, to explicitly remove the ARM domain 
before attaching its own.

Thanks,
Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces
  2022-06-15 10:57       ` Robin Murphy
@ 2022-06-15 15:09         ` Steven Price
  -1 siblings, 0 replies; 67+ messages in thread
From: Steven Price @ 2022-06-15 15:09 UTC (permalink / raw)
  To: Robin Murphy, Lu Baolu, Joerg Roedel
  Cc: Jason Gunthorpe, Kevin Tian, Liu Yi L, iommu, linux-kernel

On 15/06/2022 11:57, Robin Murphy wrote:
> On 2022-06-15 10:53, Steven Price wrote:
>> On 18/04/2022 01:49, Lu Baolu wrote:
>>> Multiple devices may be placed in the same IOMMU group because they
>>> cannot be isolated from each other. These devices must either be
>>> entirely under kernel control or userspace control, never a mixture.
>>>
>>> This adds dma ownership management in iommu core and exposes several
>>> interfaces for the device drivers and the device userspace assignment
>>> framework (i.e. VFIO), so that any conflict between user and kernel
>>> controlled dma could be detected at the beginning.
>>>
>>> The device driver oriented interfaces are,
>>>
>>>     int iommu_device_use_default_domain(struct device *dev);
>>>     void iommu_device_unuse_default_domain(struct device *dev);
>>>
>>> By calling iommu_device_use_default_domain(), the device driver tells
>>> the iommu layer that the device dma is handled through the kernel DMA
>>> APIs. The iommu layer will manage the IOVA and use the default domain
>>> for DMA address translation.
>>>
>>> The device user-space assignment framework oriented interfaces are,
>>>
>>>     int iommu_group_claim_dma_owner(struct iommu_group *group,
>>>                     void *owner);
>>>     void iommu_group_release_dma_owner(struct iommu_group *group);
>>>     bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>>>
>>> The device userspace assignment must be disallowed if the DMA owner
>>> claiming interface returns failure.
>>>
>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Kevin Tian <kevin.tian@intel.com>
>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
>>
>> I'm seeing a regression that I've bisected to this commit on a Firefly
>> RK3288 board. The display driver fails to probe properly because
>> __iommu_attach_group() returns -EBUSY. This causes long hangs and splats
>> as the display flips timeout.
>>
>> The call stack to __iommu_attach_group() is:
>>
>>   __iommu_attach_group from iommu_attach_device+0x64/0xb4
>>   iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
>>   rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
>>   vop_crtc_atomic_enable from
>> drm_atomic_helper_commit_modeset_enables+0xa8/0x290
>>   drm_atomic_helper_commit_modeset_enables from
>> drm_atomic_helper_commit_tail_rpm+0x44/0x8c
>>   drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
>>   commit_tail from drm_atomic_helper_commit+0x164/0x18c
>>   drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
>>   drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
>>   drm_client_modeset_commit_atomic from
>> drm_client_modeset_commit_locked+0x60/0x1c8
>>   drm_client_modeset_commit_locked from
>> drm_client_modeset_commit+0x24/0x40
>>   drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
>>   drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
>>   drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224
>>
>>> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct
>>> iommu_domain *domain,
>>>   {
>>>       int ret;
>>>   -    if (group->default_domain && group->domain !=
>>> group->default_domain)
>>> +    if (group->domain && group->domain != group->default_domain)
>>>           return -EBUSY;
>>>         ret = __iommu_group_for_each_dev(group, domain,
>>
>> Reverting this 'fixes' the problem for me. The follow up 0286300e6045
>> ("iommu: iommu_group_claim_dma_owner() must always assign a domain")
>> doesn't help.
>>
>> Adding some debug printks I can see that domain is a valid pointer, but
>> both default_domain and blocking_domain are NULL.
>>
>> I'm using the DTB from the kernel tree (rk3288-firefly.dtb).
>>
>> Any ideas?
> 
> Hmm, TBH I'm not sure how that worked previously... it'll be complaining
> because the ARM DMA domain is still attached, but even when the attach
> goes ahead and replaces the ARM domain with the driver's new one, it's
> not using the special arm_iommu_detach_device() interface anywhere so
> the device would still be left with the wrong DMA ops :/
> 
> I guess the most pragmatic option is probably to give rockchip-drm a
> similar bodge to exynos and tegra, to explicitly remove the ARM domain
> before attaching its own.

A bodge like below indeed 'fixes' the problem:

---8<---
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
index 67d38f53d3e5..cbc6a5121296 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
@@ -23,6 +23,14 @@
 #include <drm/drm_probe_helper.h>
 #include <drm/drm_vblank.h>
 
+#if defined(CONFIG_ARM_DMA_USE_IOMMU)
+#include <asm/dma-iommu.h>
+#else
+#define arm_iommu_detach_device(...)	({ })
+#define arm_iommu_release_mapping(...)	({ })
+#define to_dma_iommu_mapping(dev) NULL
+#endif
+
 #include "rockchip_drm_drv.h"
 #include "rockchip_drm_fb.h"
 #include "rockchip_drm_gem.h"
@@ -49,6 +57,14 @@ int rockchip_drm_dma_attach_device(struct drm_device *drm_dev,
 	if (!private->domain)
 		return 0;
 
+	if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) {
+		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
+		if (mapping) {
+			arm_iommu_detach_device(dev);
+			arm_iommu_release_mapping(mapping);
+		}
+	}
+
 	ret = iommu_attach_device(private->domain, dev);
 	if (ret) {
 		DRM_DEV_ERROR(dev, "Failed to attach iommu device\n");
---8<---

I'll type up a proper commit message and see what the DRM maintainers think.

Thanks,

Steve

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces
@ 2022-06-15 15:09         ` Steven Price
  0 siblings, 0 replies; 67+ messages in thread
From: Steven Price @ 2022-06-15 15:09 UTC (permalink / raw)
  To: Robin Murphy, Lu Baolu, Joerg Roedel
  Cc: Kevin Tian, iommu, Jason Gunthorpe, linux-kernel

On 15/06/2022 11:57, Robin Murphy wrote:
> On 2022-06-15 10:53, Steven Price wrote:
>> On 18/04/2022 01:49, Lu Baolu wrote:
>>> Multiple devices may be placed in the same IOMMU group because they
>>> cannot be isolated from each other. These devices must either be
>>> entirely under kernel control or userspace control, never a mixture.
>>>
>>> This adds dma ownership management in iommu core and exposes several
>>> interfaces for the device drivers and the device userspace assignment
>>> framework (i.e. VFIO), so that any conflict between user and kernel
>>> controlled dma could be detected at the beginning.
>>>
>>> The device driver oriented interfaces are,
>>>
>>>     int iommu_device_use_default_domain(struct device *dev);
>>>     void iommu_device_unuse_default_domain(struct device *dev);
>>>
>>> By calling iommu_device_use_default_domain(), the device driver tells
>>> the iommu layer that the device dma is handled through the kernel DMA
>>> APIs. The iommu layer will manage the IOVA and use the default domain
>>> for DMA address translation.
>>>
>>> The device user-space assignment framework oriented interfaces are,
>>>
>>>     int iommu_group_claim_dma_owner(struct iommu_group *group,
>>>                     void *owner);
>>>     void iommu_group_release_dma_owner(struct iommu_group *group);
>>>     bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>>>
>>> The device userspace assignment must be disallowed if the DMA owner
>>> claiming interface returns failure.
>>>
>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Kevin Tian <kevin.tian@intel.com>
>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
>>
>> I'm seeing a regression that I've bisected to this commit on a Firefly
>> RK3288 board. The display driver fails to probe properly because
>> __iommu_attach_group() returns -EBUSY. This causes long hangs and splats
>> as the display flips timeout.
>>
>> The call stack to __iommu_attach_group() is:
>>
>>   __iommu_attach_group from iommu_attach_device+0x64/0xb4
>>   iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
>>   rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
>>   vop_crtc_atomic_enable from
>> drm_atomic_helper_commit_modeset_enables+0xa8/0x290
>>   drm_atomic_helper_commit_modeset_enables from
>> drm_atomic_helper_commit_tail_rpm+0x44/0x8c
>>   drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
>>   commit_tail from drm_atomic_helper_commit+0x164/0x18c
>>   drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
>>   drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
>>   drm_client_modeset_commit_atomic from
>> drm_client_modeset_commit_locked+0x60/0x1c8
>>   drm_client_modeset_commit_locked from
>> drm_client_modeset_commit+0x24/0x40
>>   drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
>>   drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
>>   drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224
>>
>>> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct
>>> iommu_domain *domain,
>>>   {
>>>       int ret;
>>>   -    if (group->default_domain && group->domain !=
>>> group->default_domain)
>>> +    if (group->domain && group->domain != group->default_domain)
>>>           return -EBUSY;
>>>         ret = __iommu_group_for_each_dev(group, domain,
>>
>> Reverting this 'fixes' the problem for me. The follow up 0286300e6045
>> ("iommu: iommu_group_claim_dma_owner() must always assign a domain")
>> doesn't help.
>>
>> Adding some debug printks I can see that domain is a valid pointer, but
>> both default_domain and blocking_domain are NULL.
>>
>> I'm using the DTB from the kernel tree (rk3288-firefly.dtb).
>>
>> Any ideas?
> 
> Hmm, TBH I'm not sure how that worked previously... it'll be complaining
> because the ARM DMA domain is still attached, but even when the attach
> goes ahead and replaces the ARM domain with the driver's new one, it's
> not using the special arm_iommu_detach_device() interface anywhere so
> the device would still be left with the wrong DMA ops :/
> 
> I guess the most pragmatic option is probably to give rockchip-drm a
> similar bodge to exynos and tegra, to explicitly remove the ARM domain
> before attaching its own.

A bodge like below indeed 'fixes' the problem:

---8<---
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
index 67d38f53d3e5..cbc6a5121296 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
@@ -23,6 +23,14 @@
 #include <drm/drm_probe_helper.h>
 #include <drm/drm_vblank.h>
 
+#if defined(CONFIG_ARM_DMA_USE_IOMMU)
+#include <asm/dma-iommu.h>
+#else
+#define arm_iommu_detach_device(...)	({ })
+#define arm_iommu_release_mapping(...)	({ })
+#define to_dma_iommu_mapping(dev) NULL
+#endif
+
 #include "rockchip_drm_drv.h"
 #include "rockchip_drm_fb.h"
 #include "rockchip_drm_gem.h"
@@ -49,6 +57,14 @@ int rockchip_drm_dma_attach_device(struct drm_device *drm_dev,
 	if (!private->domain)
 		return 0;
 
+	if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) {
+		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
+		if (mapping) {
+			arm_iommu_detach_device(dev);
+			arm_iommu_release_mapping(mapping);
+		}
+	}
+
 	ret = iommu_attach_device(private->domain, dev);
 	if (ret) {
 		DRM_DEV_ERROR(dev, "Failed to attach iommu device\n");
---8<---

I'll type up a proper commit message and see what the DRM maintainers think.

Thanks,

Steve
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 04/11] bus: platform, amba, fsl-mc, PCI: Add device DMA ownership management
  2022-04-18  0:49   ` [RESEND PATCH v8 04/11] bus: platform, amba, fsl-mc, PCI: " Lu Baolu
  (?)
@ 2023-06-26 13:02   ` Zenghui Yu
  2023-06-28 14:36     ` Jason Gunthorpe
  -1 siblings, 1 reply; 67+ messages in thread
From: Zenghui Yu @ 2023-06-26 13:02 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Joerg Roedel, Kevin Tian, Jason Gunthorpe, linux-kernel, iommu

On 2022/4/18 8:49, Lu Baolu wrote:
> The devices on platform/amba/fsl-mc/PCI buses could be bound to drivers
> with the device DMA managed by kernel drivers or user-space applications.
> Unfortunately, multiple devices may be placed in the same IOMMU group
> because they cannot be isolated from each other. The DMA on these devices
> must either be entirely under kernel control or userspace control, never
> a mixture. Otherwise the driver integrity is not guaranteed because they
> could access each other through the peer-to-peer accesses which by-pass
> the IOMMU protection.
> 
> This checks and sets the default DMA mode during driver binding, and
> cleanups during driver unbinding. In the default mode, the device DMA is
> managed by the device driver which handles DMA operations through the
> kernel DMA APIs (see Documentation/core-api/dma-api.rst).
> 
> For cases where the devices are assigned for userspace control through the
> userspace driver framework(i.e. VFIO), the drivers(for example, vfio_pci/
> vfio_platfrom etc.) may set a new flag (driver_managed_dma) to skip this
> default setting in the assumption that the drivers know what they are
> doing with the device DMA.
> 
> Calling iommu_device_use_default_domain() before {of,acpi}_dma_configure
> is currently a problem. As things stand, the IOMMU driver ignored the
> initial iommu_probe_device() call when the device was added, since at
> that point it had no fwspec yet. In this situation,
> {of,acpi}_iommu_configure() are retriggering iommu_probe_device() after
> the IOMMU driver has seen the firmware data via .of_xlate to learn that
> it actually responsible for the given device. As the result, before
> that gets fixed, iommu_use_default_domain() goes at the end, and calls
> arch_teardown_dma_ops() if it fails.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Stuart Yoder <stuyoder@gmail.com>
> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> Tested-by: Eric Auger <eric.auger@redhat.com>

[...]

> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 60adf42460ab..b933d2b08d4d 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -895,6 +895,13 @@ struct module;
>   *              created once it is bound to the driver.
>   * @driver:	Driver model structure.
>   * @dynids:	List of dynamically added device IDs.
> + * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
> + *		For most device drivers, no need to care about this flag
> + *		as long as all DMAs are handled through the kernel DMA API.
> + *		For some special ones, for example VFIO drivers, they know
> + *		how to manage the DMA themselves and set this flag so that
> + *		the IOMMU layer will allow them to setup and manage their
> + *		own I/O address space.
>   */
>  struct pci_driver {
>  	struct list_head	node;
> @@ -913,6 +920,7 @@ struct pci_driver {
>  	const struct attribute_group **dev_groups;
>  	struct device_driver	driver;
>  	struct pci_dynids	dynids;
> +	bool driver_managed_dma;
>  };
>  
>  static inline struct pci_driver *to_pci_driver(struct device_driver *drv)

[...]

> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 4ceeb75fc899..f83f7fbac68f 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -20,6 +20,7 @@
>  #include <linux/of_device.h>
>  #include <linux/acpi.h>
>  #include <linux/dma-map-ops.h>
> +#include <linux/iommu.h>
>  #include "pci.h"
>  #include "pcie/portdrv.h"
>  
> @@ -1601,6 +1602,7 @@ static int pci_bus_num_vf(struct device *dev)
>   */
>  static int pci_dma_configure(struct device *dev)
>  {
> +	struct pci_driver *driver = to_pci_driver(dev->driver);
>  	struct device *bridge;
>  	int ret = 0;
>  
> @@ -1616,9 +1618,24 @@ static int pci_dma_configure(struct device *dev)
>  	}
>  
>  	pci_put_host_bridge_device(bridge);
> +
> +	if (!ret && !driver->driver_managed_dma) {
> +		ret = iommu_device_use_default_domain(dev);
> +		if (ret)
> +			arch_teardown_dma_ops(dev);
> +	}
> +
>  	return ret;
>  }
>  
> +static void pci_dma_cleanup(struct device *dev)
> +{
> +	struct pci_driver *driver = to_pci_driver(dev->driver);
> +
> +	if (!driver->driver_managed_dma)
> +		iommu_device_unuse_default_domain(dev);
> +}
> +
>  struct bus_type pci_bus_type = {
>  	.name		= "pci",
>  	.match		= pci_bus_match,
> @@ -1632,6 +1649,7 @@ struct bus_type pci_bus_type = {
>  	.pm		= PCI_PM_OPS_PTR,
>  	.num_vf		= pci_bus_num_vf,
>  	.dma_configure	= pci_dma_configure,
> +	.dma_cleanup	= pci_dma_cleanup,
>  };
>  EXPORT_SYMBOL(pci_bus_type);

I (somehow forgot to delete DEBUG_TEST_DRIVER_REMOVE in my .config, and)
failed to start the guest with an assigned PCI device, with something
like:

| qemu-system-aarch64: -device 
vfio-pci,host=0000:03:00.1,id=hostdev0,bus=pci.8,addr=0x0: vfio 
0000:03:00.1: group 45 is not viable
| Please ensure all devices within the iommu_group are bound to their 
vfio bus driver.

It looks like on device probe, with DEBUG_TEST_DRIVER_REMOVE,
.dma_configure() will be executed *twice* via the
really_probe()/re_probe path, and *no* .dma_cleanup() will be executed.
The resulting dev::iommu_group::owner_cnt is 2, which will confuse the
later iommu_group_dma_owner_claimed() call from VFIO on guest startup.

I can locally workaround the problem by deleting the DEBUG option or
performing a .dma_cleanup() during test_remove'ing, but it'd be great if
you can take a look.

Thanks,
Zenghui

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 04/11] bus: platform, amba, fsl-mc, PCI: Add device DMA ownership management
  2023-06-26 13:02   ` Zenghui Yu
@ 2023-06-28 14:36     ` Jason Gunthorpe
  2023-06-29  2:55       ` Zenghui Yu
  0 siblings, 1 reply; 67+ messages in thread
From: Jason Gunthorpe @ 2023-06-28 14:36 UTC (permalink / raw)
  To: Zenghui Yu; +Cc: Lu Baolu, Joerg Roedel, Kevin Tian, linux-kernel, iommu

On Mon, Jun 26, 2023 at 09:02:40PM +0800, Zenghui Yu wrote:

> It looks like on device probe, with DEBUG_TEST_DRIVER_REMOVE,
> .dma_configure() will be executed *twice* via the
> really_probe()/re_probe path, and *no* .dma_cleanup() will be executed.
> The resulting dev::iommu_group::owner_cnt is 2, which will confuse the
> later iommu_group_dma_owner_claimed() call from VFIO on guest startup.

Does this work for you?

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 9c09ca5c4ab68e..7145d9b940b14b 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -693,6 +693,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 
 		device_remove(dev);
 		driver_sysfs_remove(dev);
+		if (dev->bus && dev->bus->dma_cleanup)
+			dev->bus->dma_cleanup(dev);
 		device_unbind_cleanup(dev);
 
 		goto re_probe;
 

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [RESEND PATCH v8 04/11] bus: platform, amba, fsl-mc, PCI: Add device DMA ownership management
  2023-06-28 14:36     ` Jason Gunthorpe
@ 2023-06-29  2:55       ` Zenghui Yu
  0 siblings, 0 replies; 67+ messages in thread
From: Zenghui Yu @ 2023-06-29  2:55 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Lu Baolu, Joerg Roedel, Kevin Tian, linux-kernel, iommu

On 2023/6/28 22:36, Jason Gunthorpe wrote:
> On Mon, Jun 26, 2023 at 09:02:40PM +0800, Zenghui Yu wrote:
> 
>> It looks like on device probe, with DEBUG_TEST_DRIVER_REMOVE,
>> .dma_configure() will be executed *twice* via the
>> really_probe()/re_probe path, and *no* .dma_cleanup() will be executed.
>> The resulting dev::iommu_group::owner_cnt is 2, which will confuse the
>> later iommu_group_dma_owner_claimed() call from VFIO on guest startup.
> 
> Does this work for you?

It works. Please feel free to add my Tested-by if you send it as a
formal patch. Thanks.

> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 9c09ca5c4ab68e..7145d9b940b14b 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -693,6 +693,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
>  
>  		device_remove(dev);
>  		driver_sysfs_remove(dev);
> +		if (dev->bus && dev->bus->dma_cleanup)
> +			dev->bus->dma_cleanup(dev);
>  		device_unbind_cleanup(dev);
>  
>  		goto re_probe;

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2023-06-29  2:55 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-18  0:49 [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
2022-04-18  0:49 ` Lu Baolu
2022-04-18  0:49 ` [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-06-15  9:53   ` Steven Price
2022-06-15  9:53     ` Steven Price
2022-06-15 10:57     ` Robin Murphy
2022-06-15 10:57       ` Robin Murphy
2022-06-15 15:09       ` Steven Price
2022-06-15 15:09         ` Steven Price
2022-04-18  0:49 ` [RESEND PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-04-18  0:49 ` [RESEND PATCH v8 03/11] amba: Stop sharing platform_dma_configure() Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-04-18  0:49 ` [RESEND PATCH v8 04/11] bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management Lu Baolu
2022-04-18  0:49   ` [RESEND PATCH v8 04/11] bus: platform, amba, fsl-mc, PCI: " Lu Baolu
2023-06-26 13:02   ` Zenghui Yu
2023-06-28 14:36     ` Jason Gunthorpe
2023-06-29  2:55       ` Zenghui Yu
2022-04-18  0:49 ` [RESEND PATCH v8 05/11] PCI: pci_stub: Set driver_managed_dma Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-04-18  0:49 ` [RESEND PATCH v8 06/11] PCI: portdrv: " Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-04-18  0:49 ` [RESEND PATCH v8 07/11] vfio: Set DMA ownership for VFIO devices Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-04-18  0:49 ` [RESEND PATCH v8 08/11] vfio: Remove use of vfio_group_viable() Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-04-18  0:49 ` [RESEND PATCH v8 09/11] vfio: Delete the unbound_list Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-04-18  0:49 ` [RESEND PATCH v8 10/11] vfio: Remove iommu group notifier Lu Baolu
2022-04-18  0:49   ` Lu Baolu
2022-04-18  0:50 ` [RESEND PATCH v8 11/11] iommu: Remove iommu group changes notifier Lu Baolu
2022-04-18  0:50   ` Lu Baolu
2022-04-28  9:32 ` [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier() Joerg Roedel
2022-04-28  9:32   ` Joerg Roedel
2022-04-28 11:54   ` Jason Gunthorpe
2022-04-28 11:54     ` Jason Gunthorpe via iommu
2022-04-28 13:34     ` Joerg Roedel
2022-04-28 13:34       ` Joerg Roedel
2022-05-02 16:12 ` Qian Cai
2022-05-02 16:12   ` Qian Cai
2022-05-02 16:42   ` Jason Gunthorpe
2022-05-02 16:42     ` Jason Gunthorpe via iommu
2022-05-03 13:04     ` Robin Murphy
2022-05-03 13:04       ` Robin Murphy
2022-05-03 15:23       ` Jason Gunthorpe
2022-05-03 15:23         ` Jason Gunthorpe via iommu
2022-05-03 17:22         ` Robin Murphy
2022-05-03 17:22           ` Robin Murphy
2022-05-04  8:42   ` Joerg Roedel
2022-05-04  8:42     ` Joerg Roedel
2022-05-04 11:51     ` Jason Gunthorpe
2022-05-04 11:51       ` Jason Gunthorpe via iommu
2022-05-04 11:57       ` Joerg Roedel
2022-05-04 11:57         ` Joerg Roedel
2022-05-09 18:33         ` Jason Gunthorpe
2022-05-09 18:33           ` Jason Gunthorpe via iommu
2022-05-13  8:13           ` Tian, Kevin
2022-05-13  8:13             ` Tian, Kevin
2022-05-04 16:29     ` Alex Williamson
2022-05-04 16:29       ` Alex Williamson
2022-05-13 15:49       ` Joerg Roedel
2022-05-13 15:49         ` Joerg Roedel
2022-05-13 16:25         ` Alex Williamson
2022-05-13 16:25           ` Alex Williamson
2022-05-13 19:06           ` Jason Gunthorpe
2022-05-13 19:06             ` Jason Gunthorpe via iommu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.