* [PATCH v6 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2022-02-18  0:55 ` Lu Baolu
  0 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

Hi folks,

The iommu group is the minimal isolation boundary for DMA. Devices in
a group can access each other's MMIO registers via peer-to-peer DMA
and also need to share the same I/O address space.

First, once the I/O address space is assigned to user control it is no
longer available to the dma_map* API, which effectively makes the DMA
API non-working.

Second, userspace can use DMA initiated by a device that it controls
to access the MMIO spaces of other devices in the group. This allows
userspace to indirectly attack any kernel owned device and its driver.

Therefore groups must either be entirely under kernel control or
userspace control, never a mixture. Unfortunately some systems have
problems with the granularity of groups and there are a couple of
important exceptions:

 - pci_stub allows the admin to block driver binding on a device and
   make it permanently shared with userspace. Since pci_stub does not
   do DMA, this is safe; however, the admin must understand that using
   pci_stub allows userspace to attack whatever device it was bound
   to.

 - PCI bridges are sometimes included in groups. Typically PCI bridges
   do not use DMA, and generally do not have MMIO regions.

Generally any device that does not have any MMIO registers is a
possible candidate for an exception.

Currently vfio adopts a workaround to detect violations of the above
restrictions by monitoring the driver core BOUND event, and hardwiring
the above exceptions. Since there is no way for vfio to reject driver
binding at this point, BUG_ON() is triggered if a violation is
captured (kernel driver BOUND event on a group which already has some
devices assigned to userspace). Aside from the bad user experience
this opens a way for root userspace to crash the kernel, even in high
integrity configurations, by manipulating the module binding and
triggering the BUG_ON.

This series solves this problem by making the user/kernel ownership a
core concept at the IOMMU layer. The driver core enforces kernel
ownership while drivers are bound, and violations now result in error
codes during probe, not BUG_ON failures.
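
As an illustration only (this is not code from the series), the
bind-time behaviour described above can be modelled in userspace C.
Every struct and function name below is hypothetical, and the sketch
assumes that a driver which sets driver_managed_dma skips the kernel
ownership claim, while any other driver is refused with -EBUSY once
the group has been handed to userspace:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's struct iommu_group. */
struct model_group {
	unsigned int owner_cnt;	/* number of active ownership claims */
	void *owner;		/* non-NULL while userspace owns the group */
};

/* Hypothetical stand-in for a device driver. */
struct model_driver {
	const char *name;
	bool driver_managed_dma;	/* assumed set by e.g. pci_stub, portdrv */
};

/*
 * Model of the DMA configuration step that runs before a driver's probe.
 * Returns 0 if binding may proceed, -EBUSY on an ownership conflict.
 */
int model_dma_configure(struct model_group *grp, const struct model_driver *drv)
{
	if (drv->driver_managed_dma)
		return 0;	/* driver does not claim kernel DMA ownership */
	if (grp->owner)
		return -EBUSY;	/* group is already assigned to userspace */
	grp->owner_cnt++;
	return 0;
}

/* Model of the matching cleanup step when the driver is unbound. */
void model_dma_cleanup(struct model_group *grp, const struct model_driver *drv)
{
	if (!drv->driver_managed_dma && grp->owner_cnt)
		grp->owner_cnt--;
}
```

In this model the conflicting bind simply fails with an error code the
caller can report, which is the behavioural change the series aims for.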

Patch partitions:
  [PATCH 1-4]: Detect DMA ownership conflicts during driver binding;
  [PATCH 5-7]: Add security context management for assigned devices;
  [PATCH 8-11]: Various cleanups.

This is also part one of three initial series for IOMMUFD:
 * Move IOMMU Group security into the iommu layer
 - Generic IOMMUFD implementation
 - VFIO ability to consume IOMMUFD

Change log:
v1: initial post
  - https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/

v2:
  - https://lore.kernel.org/linux-iommu/20211128025051.355578-1-baolu.lu@linux.intel.com/

  - Move kernel dma ownership auto-claiming from driver core to bus
    callback. [Greg/Christoph/Robin/Jason]
    https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#m153706912b770682cb12e3c28f57e171aa1f9d0c

  - Code and interface refactoring for iommu_set/release_dma_owner()
    interfaces. [Jason]
    https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#mea70ed8e4e3665aedf32a5a0a7db095bf680325e

  - [NEW]Add new iommu_attach/detach_device_shared() interfaces for
    multiple devices group. [Robin/Jason]
    https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#mea70ed8e4e3665aedf32a5a0a7db095bf680325e

  - [NEW]Use iommu_attach/detach_device_shared() in drm/tegra drivers.

  - Refactoring and description refinement.

v3:
  - https://lore.kernel.org/linux-iommu/20211206015903.88687-1-baolu.lu@linux.intel.com/

  - Rename bus_type::dma_unconfigure to bus_type::dma_cleanup. [Greg]
    https://lore.kernel.org/linux-iommu/c3230ace-c878-39db-1663-2b752ff5384e@linux.intel.com/T/#m6711e041e47cb0cbe3964fad0a3466f5ae4b3b9b

  - Avoid _platform_dma_configure for platform_bus_type::dma_configure.
    [Greg]
    https://lore.kernel.org/linux-iommu/c3230ace-c878-39db-1663-2b752ff5384e@linux.intel.com/T/#m43fc46286611aa56a5c0eeaad99d539e5519f3f6

  - Patch "0012-iommu-Add-iommu_at-de-tach_device_shared-for-mult.patch"
    and "0018-drm-tegra-Use-the-iommu-dma_owner-mechanism.patch" have
    been tested by Dmitry Osipenko <digetx@gmail.com>.

v4:
  - https://lore.kernel.org/linux-iommu/20211217063708.1740334-1-baolu.lu@linux.intel.com/
  - Remove unnecessary tegra->domain check in the tegra patch. (Jason)
  - Remove DMA_OWNER_NONE. (Joerg)
  - Change refcount to unsigned int. (Christoph)
  - Move mutex lock into group set_dma_owner functions. (Christoph)
  - Add kernel doc for iommu_attach/detach_domain_shared(). (Christoph)
  - Move dma auto-claim into driver core. (Jason/Christoph)

v5:
  - https://lore.kernel.org/linux-iommu/20220104015644.2294354-1-baolu.lu@linux.intel.com/
  - Move kernel dma ownership auto-claiming from driver core to bus
    callback. (Greg)
  - Refactor the iommu interfaces to make them more specific.
    (Jason/Robin)
  - Simplify the dma ownership implementation by removing the owner
    type. (Jason)
  - Commit message refactoring for PCI drivers. (Bjorn)
  - Move iommu_attach/detach_device() improvement patches into another
    series as there is a lot of code refactoring and cleanup stuff
    in various device drivers.

v6:
  - Refine comments and commit messages.
  - Rename iommu_group_set_dma_owner() to iommu_group_claim_dma_owner().
  - Rename iommu_device_use/unuse_kernel_dma() to
    iommu_device_use/unuse_default_domain().
  - Remove unnecessary EXPORT_SYMBOL_GPL.
  - Change flag name from no_kernel_api_dma to driver_managed_dma.
  - Merge 4 "Add driver dma ownership management" patches into single
    one.

This is based on the next branch of the linux-iommu tree:
https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
and is also available on GitHub:
https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v6

Best regards,
baolu

Jason Gunthorpe (1):
  vfio: Delete the unbound_list

Lu Baolu (10):
  iommu: Add dma ownership management interfaces
  driver core: Add dma_cleanup callback in bus_type
  amba: Stop sharing platform_dma_configure()
  bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
  PCI: pci_stub: Set driver_managed_dma
  PCI: portdrv: Set driver_managed_dma
  vfio: Set DMA ownership for VFIO devices
  vfio: Remove use of vfio_group_viable()
  vfio: Remove iommu group notifier
  iommu: Remove iommu group changes notifier

 include/linux/amba/bus.h              |   8 +
 include/linux/device/bus.h            |   3 +
 include/linux/fsl/mc.h                |   8 +
 include/linux/iommu.h                 |  54 +++---
 include/linux/pci.h                   |   8 +
 include/linux/platform_device.h       |  10 +-
 drivers/amba/bus.c                    |  39 +++-
 drivers/base/dd.c                     |   5 +
 drivers/base/platform.c               |  23 ++-
 drivers/bus/fsl-mc/fsl-mc-bus.c       |  26 ++-
 drivers/iommu/iommu.c                 | 233 ++++++++++++++++--------
 drivers/pci/pci-driver.c              |  21 +++
 drivers/pci/pci-stub.c                |   1 +
 drivers/pci/pcie/portdrv_pci.c        |   2 +
 drivers/vfio/fsl-mc/vfio_fsl_mc.c     |   1 +
 drivers/vfio/pci/vfio_pci.c           |   1 +
 drivers/vfio/platform/vfio_amba.c     |   1 +
 drivers/vfio/platform/vfio_platform.c |   1 +
 drivers/vfio/vfio.c                   | 245 ++------------------------
 19 files changed, 352 insertions(+), 338 deletions(-)

-- 
2.25.1



* [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

Multiple devices may be placed in the same IOMMU group because they
cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture.

This adds dma ownership management in iommu core and exposes several
interfaces for the device drivers and the device userspace assignment
framework (i.e. VFIO), so that any conflict between user and kernel
controlled dma could be detected at the beginning.

The device driver oriented interfaces are,

	int iommu_device_use_default_domain(struct device *dev);
	void iommu_device_unuse_default_domain(struct device *dev);

By calling iommu_device_use_default_domain(), the device driver tells
the iommu layer that the device dma is handled through the kernel DMA
APIs. The iommu layer will manage the IOVA and use the default domain
for DMA address translation.
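
For readers who want to experiment with the counting rules outside the
kernel, here is a minimal userspace sketch of the two driver-oriented
calls. It mirrors the checks in the diff below but drops the locking
and group reference counting; the model_* names and structs are
hypothetical stand-ins, not kernel types:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_domain; only identity matters. */
struct model_domain { int id; };

/* Hypothetical userspace model of the group fields the patch relies on. */
struct model_group {
	struct model_domain *default_domain;
	struct model_domain *domain;	/* currently attached domain */
	unsigned int owner_cnt;		/* active ownership claims */
	void *owner;			/* non-NULL when user-owned */
};

/* Model of iommu_device_use_default_domain(), without locking. */
int model_use_default_domain(struct model_group *grp)
{
	if (!grp)
		return 0;	/* no group: nothing to arbitrate */

	/*
	 * An existing claim is compatible only if it is another kernel
	 * claim on the default domain.
	 */
	if (grp->owner_cnt &&
	    (grp->domain != grp->default_domain || grp->owner))
		return -EBUSY;

	grp->owner_cnt++;
	return 0;
}

/* Model of iommu_device_unuse_default_domain(), without locking. */
void model_unuse_default_domain(struct model_group *grp)
{
	/* The kernel version warns on underflow; the model just ignores it. */
	if (grp && grp->owner_cnt)
		grp->owner_cnt--;
}
```

Several kernel drivers in the same group can hold the claim at once,
so the model counts claims rather than tracking a single holder.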

The device user-space assignment framework oriented interfaces are,

	int iommu_group_claim_dma_owner(struct iommu_group *group,
					void *owner);
	void iommu_group_release_dma_owner(struct iommu_group *group);
	bool iommu_group_dma_owner_claimed(struct iommu_group *group);

The device userspace assignment must be disallowed if the DMA owner
claiming interface returns failure.
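
The rule above can be sketched in the same userspace style. The code
below models the claim and release logic from the diff (again without
locking) and adds a hypothetical caller, model_assign_to_user(), which
stands in for an assignment framework and refuses the assignment when
the claim fails. All model_* names are assumptions made for this
illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_domain; only identity matters. */
struct model_domain { int id; };

/* Hypothetical userspace model of the group fields the patch relies on. */
struct model_group {
	struct model_domain *default_domain;
	struct model_domain *domain;	/* currently attached domain */
	unsigned int owner_cnt;
	void *owner;
};

/* Model of iommu_group_claim_dma_owner(), without locking. */
int model_claim_dma_owner(struct model_group *grp, void *owner)
{
	if (grp->owner_cnt) {
		/* Already claimed: only the same owner may claim again. */
		if (grp->owner != owner)
			return -EPERM;
	} else {
		if (grp->domain && grp->domain != grp->default_domain)
			return -EBUSY;
		grp->owner = owner;
		grp->domain = NULL;	/* models detaching the default domain */
	}
	grp->owner_cnt++;
	return 0;
}

/* Model of iommu_group_release_dma_owner(), without locking. */
void model_release_dma_owner(struct model_group *grp)
{
	if (!grp->owner_cnt || !grp->owner)
		return;		/* the kernel version warns here */
	if (--grp->owner_cnt > 0)
		return;
	if (!grp->domain)
		grp->domain = grp->default_domain;	/* models re-attach */
	grp->owner = NULL;
}

/* Hypothetical caller: assignment is disallowed if the claim fails. */
int model_assign_to_user(struct model_group *grp, void *user_ctx)
{
	int ret = model_claim_dma_owner(grp, user_ctx);

	if (ret)
		return ret;
	/* A real framework would attach a user-managed domain here. */
	return 0;
}
```

Note that a group already in use by a kernel driver has a non-zero
owner_cnt with a NULL owner, so a userspace claim on it fails with
-EPERM in this model.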

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h |  31 +++++++++
 drivers/iommu/iommu.c | 158 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 186 insertions(+), 3 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1..77972ef978b5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -675,6 +675,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 
+int iommu_device_use_default_domain(struct device *dev);
+void iommu_device_unuse_default_domain(struct device *dev);
+
+int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
+void iommu_group_release_dma_owner(struct iommu_group *group);
+bool iommu_group_dma_owner_claimed(struct iommu_group *group);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1031,6 +1038,30 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
 	return NULL;
 }
+
+static inline int iommu_device_use_default_domain(struct device *dev)
+{
+	return 0;
+}
+
+static inline void iommu_device_unuse_default_domain(struct device *dev)
+{
+}
+
+static inline int
+iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
+{
+	return -ENODEV;
+}
+
+static inline void iommu_group_release_dma_owner(struct iommu_group *group)
+{
+}
+
+static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)
+{
+	return false;
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..4e2ad7124780 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,8 @@ struct iommu_group {
 	struct iommu_domain *default_domain;
 	struct iommu_domain *domain;
 	struct list_head entry;
+	unsigned int owner_cnt;
+	void *owner;
 };
 
 struct group_device {
@@ -294,7 +296,11 @@ int iommu_probe_device(struct device *dev)
 	mutex_lock(&group->mutex);
 	iommu_alloc_default_domain(group, dev);
 
-	if (group->default_domain) {
+	/*
+	 * If device joined an existing group which has been claimed, don't
+	 * attach the default domain.
+	 */
+	if (group->default_domain && !group->owner) {
 		ret = __iommu_attach_device(group->default_domain, dev);
 		if (ret) {
 			mutex_unlock(&group->mutex);
@@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (group->default_domain && group->domain != group->default_domain)
+	if (group->domain && group->domain != group->default_domain)
 		return -EBUSY;
 
 	ret = __iommu_group_for_each_dev(group, domain,
@@ -2146,7 +2152,11 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (!group->default_domain) {
+	/*
+	 * If the group has been claimed already, do not re-attach the default
+	 * domain.
+	 */
+	if (!group->default_domain || group->owner) {
 		__iommu_group_for_each_dev(group, domain,
 					   iommu_group_do_detach_device);
 		group->domain = NULL;
@@ -3095,3 +3105,145 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 
 	return ret;
 }
+
+/**
+ * iommu_device_use_default_domain() - Device driver wants to handle device
+ *                                     DMA through the kernel DMA API.
+ * @dev: The device.
+ *
+ * The device driver about to bind @dev wants to do DMA through the kernel
+ * DMA API. Return 0 if it is allowed, otherwise an error.
+ */
+int iommu_device_use_default_domain(struct device *dev)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+	int ret = 0;
+
+	if (!group)
+		return 0;
+
+	mutex_lock(&group->mutex);
+	if (group->owner_cnt) {
+		if (group->domain != group->default_domain ||
+		    group->owner) {
+			ret = -EBUSY;
+			goto unlock_out;
+		}
+	}
+
+	group->owner_cnt++;
+
+unlock_out:
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return ret;
+}
+
+/**
+ * iommu_device_unuse_default_domain() - Device driver stops handling device
+ *                                       DMA through the kernel DMA API.
+ * @dev: The device.
+ *
+ * The device driver doesn't want to do DMA through kernel DMA API anymore.
+ * It must be called after iommu_device_use_default_domain().
+ */
+void iommu_device_unuse_default_domain(struct device *dev)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+
+	if (!group)
+		return;
+
+	mutex_lock(&group->mutex);
+	if (!WARN_ON(!group->owner_cnt))
+		group->owner_cnt--;
+
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+}
+
+/**
+ * iommu_group_claim_dma_owner() - Set DMA ownership of a group
+ * @group: The group.
+ * @owner: Caller specified pointer. Used for exclusive ownership.
+ *
+ * This is to support backward compatibility for vfio which manages
+ * the dma ownership in iommu_group level. New invocations on this
+ * interface should be prohibited.
+ */
+int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
+{
+	int ret = 0;
+
+	mutex_lock(&group->mutex);
+	if (group->owner_cnt) {
+		if (group->owner != owner) {
+			ret = -EPERM;
+			goto unlock_out;
+		}
+	} else {
+		if (group->domain && group->domain != group->default_domain) {
+			ret = -EBUSY;
+			goto unlock_out;
+		}
+
+		group->owner = owner;
+		if (group->domain)
+			__iommu_detach_group(group->domain, group);
+	}
+
+	group->owner_cnt++;
+unlock_out:
+	mutex_unlock(&group->mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_group_claim_dma_owner);
+
+/**
+ * iommu_group_release_dma_owner() - Release DMA ownership of a group
+ * @group: The group.
+ *
+ * Release the DMA ownership claimed by iommu_group_claim_dma_owner().
+ */
+void iommu_group_release_dma_owner(struct iommu_group *group)
+{
+	mutex_lock(&group->mutex);
+	if (WARN_ON(!group->owner_cnt || !group->owner))
+		goto unlock_out;
+
+	if (--group->owner_cnt > 0)
+		goto unlock_out;
+
+	/*
+	 * The UNMANAGED domain should be detached before all USER
+	 * owners have been released.
+	 */
+	if (!WARN_ON(group->domain) && group->default_domain)
+		__iommu_attach_group(group->default_domain, group);
+	group->owner = NULL;
+
+unlock_out:
+	mutex_unlock(&group->mutex);
+}
+EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
+
+/**
+ * iommu_group_dma_owner_claimed() - Query group dma ownership status
+ * @group: The group.
+ *
+ * This provides status query on a given group. It is racey and only for
+ * non-binding status reporting.
+ */
+bool iommu_group_dma_owner_claimed(struct iommu_group *group)
+{
+	unsigned int user;
+
+	mutex_lock(&group->mutex);
+	user = group->owner_cnt;
+	mutex_unlock(&group->mutex);
+
+	return user;
+}
+EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 01/11] iommu: Add dma ownership management interfaces
@ 2022-02-18  0:55   ` Lu Baolu
  0 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: kvm, rafael, David Airlie, linux-pci, Thierry Reding,
	Diana Craciun, Dmitry Osipenko, Will Deacon, Stuart Yoder,
	Jonathan Hunter, Chaitanya Kulkarni, Dan Williams, Cornelia Huck,
	linux-kernel, Li Yang, iommu, Jacob jun Pan, Daniel Vetter,
	Robin Murphy

Multiple devices may be placed in the same IOMMU group because they
cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture.

This adds dma ownership management in iommu core and exposes several
interfaces for the device drivers and the device userspace assignment
framework (i.e. VFIO), so that any conflict between user and kernel
controlled dma could be detected at the beginning.

The device driver oriented interfaces are,

	int iommu_device_use_default_domain(struct device *dev);
	void iommu_device_unuse_default_domain(struct device *dev);

By calling iommu_device_use_default_domain(), the device driver tells
the iommu layer that the device dma is handled through the kernel DMA
APIs. The iommu layer will manage the IOVA and use the default domain
for DMA address translation.

The device user-space assignment framework oriented interfaces are,

	int iommu_group_claim_dma_owner(struct iommu_group *group,
					void *owner);
	void iommu_group_release_dma_owner(struct iommu_group *group);
	bool iommu_group_dma_owner_claimed(struct iommu_group *group);

The device userspace assignment must be disallowed if the DMA owner
claiming interface returns failure.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h |  31 +++++++++
 drivers/iommu/iommu.c | 158 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 186 insertions(+), 3 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1..77972ef978b5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -675,6 +675,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 
+int iommu_device_use_default_domain(struct device *dev);
+void iommu_device_unuse_default_domain(struct device *dev);
+
+int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
+void iommu_group_release_dma_owner(struct iommu_group *group);
+bool iommu_group_dma_owner_claimed(struct iommu_group *group);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1031,6 +1038,30 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
 	return NULL;
 }
+
+static inline int iommu_device_use_default_domain(struct device *dev)
+{
+	return 0;
+}
+
+static inline void iommu_device_unuse_default_domain(struct device *dev)
+{
+}
+
+static inline int
+iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
+{
+	return -ENODEV;
+}
+
+static inline void iommu_group_release_dma_owner(struct iommu_group *group)
+{
+}
+
+static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)
+{
+	return false;
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..4e2ad7124780 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,8 @@ struct iommu_group {
 	struct iommu_domain *default_domain;
 	struct iommu_domain *domain;
 	struct list_head entry;
+	unsigned int owner_cnt;
+	void *owner;
 };
 
 struct group_device {
@@ -294,7 +296,11 @@ int iommu_probe_device(struct device *dev)
 	mutex_lock(&group->mutex);
 	iommu_alloc_default_domain(group, dev);
 
-	if (group->default_domain) {
+	/*
+	 * If device joined an existing group which has been claimed, don't
+	 * attach the default domain.
+	 */
+	if (group->default_domain && !group->owner) {
 		ret = __iommu_attach_device(group->default_domain, dev);
 		if (ret) {
 			mutex_unlock(&group->mutex);
@@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (group->default_domain && group->domain != group->default_domain)
+	if (group->domain && group->domain != group->default_domain)
 		return -EBUSY;
 
 	ret = __iommu_group_for_each_dev(group, domain,
@@ -2146,7 +2152,11 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (!group->default_domain) {
+	/*
+	 * If the group has been claimed already, do not re-attach the default
+	 * domain.
+	 */
+	if (!group->default_domain || group->owner) {
 		__iommu_group_for_each_dev(group, domain,
 					   iommu_group_do_detach_device);
 		group->domain = NULL;
@@ -3095,3 +3105,145 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 
 	return ret;
 }
+
+/**
+ * iommu_device_use_default_domain() - Device driver wants to handle device
+ *                                     DMA through the kernel DMA API.
+ * @dev: The device.
+ *
+ * The device driver about to bind @dev wants to do DMA through the kernel
+ * DMA API. Return 0 if it is allowed, otherwise an error.
+ */
+int iommu_device_use_default_domain(struct device *dev)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+	int ret = 0;
+
+	if (!group)
+		return 0;
+
+	mutex_lock(&group->mutex);
+	if (group->owner_cnt) {
+		if (group->domain != group->default_domain ||
+		    group->owner) {
+			ret = -EBUSY;
+			goto unlock_out;
+		}
+	}
+
+	group->owner_cnt++;
+
+unlock_out:
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return ret;
+}
+
+/**
+ * iommu_device_unuse_default_domain() - Device driver stops handling device
+ *                                       DMA through the kernel DMA API.
+ * @dev: The device.
+ *
+ * The device driver doesn't want to do DMA through kernel DMA API anymore.
+ * It must be called after iommu_device_use_default_domain().
+ */
+void iommu_device_unuse_default_domain(struct device *dev)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+
+	if (!group)
+		return;
+
+	mutex_lock(&group->mutex);
+	if (!WARN_ON(!group->owner_cnt))
+		group->owner_cnt--;
+
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+}
+
+/**
+ * iommu_group_claim_dma_owner() - Set DMA ownership of a group
+ * @group: The group.
+ * @owner: Caller specified pointer. Used for exclusive ownership.
+ *
+ * This is to support backward compatibility for vfio, which manages
+ * the DMA ownership at the iommu_group level. New invocations of this
+ * interface should be prohibited.
+ */
+int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
+{
+	int ret = 0;
+
+	mutex_lock(&group->mutex);
+	if (group->owner_cnt) {
+		if (group->owner != owner) {
+			ret = -EPERM;
+			goto unlock_out;
+		}
+	} else {
+		if (group->domain && group->domain != group->default_domain) {
+			ret = -EBUSY;
+			goto unlock_out;
+		}
+
+		group->owner = owner;
+		if (group->domain)
+			__iommu_detach_group(group->domain, group);
+	}
+
+	group->owner_cnt++;
+unlock_out:
+	mutex_unlock(&group->mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_group_claim_dma_owner);
+
+/**
+ * iommu_group_release_dma_owner() - Release DMA ownership of a group
+ * @group: The group.
+ *
+ * Release the DMA ownership claimed by iommu_group_claim_dma_owner().
+ */
+void iommu_group_release_dma_owner(struct iommu_group *group)
+{
+	mutex_lock(&group->mutex);
+	if (WARN_ON(!group->owner_cnt || !group->owner))
+		goto unlock_out;
+
+	if (--group->owner_cnt > 0)
+		goto unlock_out;
+
+	/*
+	 * The UNMANAGED domain should be detached before all USER
+	 * owners have been released.
+	 */
+	if (!WARN_ON(group->domain) && group->default_domain)
+		__iommu_attach_group(group->default_domain, group);
+	group->owner = NULL;
+
+unlock_out:
+	mutex_unlock(&group->mutex);
+}
+EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
+
+/**
+ * iommu_group_dma_owner_claimed() - Query group dma ownership status
+ * @group: The group.
+ *
+ * This provides a status query on a given group. It is racy and only for
+ * non-binding status reporting.
+ */
+bool iommu_group_dma_owner_claimed(struct iommu_group *group)
+{
+	unsigned int user;
+
+	mutex_lock(&group->mutex);
+	user = group->owner_cnt;
+	mutex_unlock(&group->mutex);
+
+	return user;
+}
+EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

The bus_type structure defines the dma_configure() callback for bus drivers
to configure DMA on the devices. This adds the paired dma_cleanup()
callback and calls it during driver unbinding so that bus drivers can do
some cleanup work.

One use case for these paired DMA callbacks is for the bus driver to check
for DMA ownership conflicts during driver binding, where multiple devices
belonging to the same IOMMU group (the minimum granularity of isolation and
protection) may be assigned to kernel drivers or user space respectively.

Without this change, for example, the vfio driver has to listen to a bus
BOUND_DRIVER event and then BUG_ON() in case of a DMA ownership conflict.
This leads to a bad user experience, since a careless driver binding
operation may crash the system if the admin overlooks the group
restriction. Aside from bad design, this leads to a security problem, as a
root user, even with lockdown=integrity, can force the kernel to BUG.

With this change, the bus driver can check and set the DMA ownership
during the driver binding process and fail on ownership conflicts. The DMA
ownership should be released during driver unbinding.
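
The configure/cleanup pairing described above can be illustrated with a
small user-space model. This is only a sketch, not the driver core code:
every model_* name is hypothetical, the counter merely stands in for the
DMA ownership taken at bind time, and it assumes that a dma_configure()
which fails has already undone its own work.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_device;

/* Hypothetical stand-in for the two bus_type DMA callbacks. */
struct model_bus {
	int (*dma_configure)(struct model_device *dev);
	void (*dma_cleanup)(struct model_device *dev);
};

struct model_device {
	const struct model_bus *bus;
	int dma_users;	/* stands in for ownership taken at bind time */
	int bound;	/* nonzero while a driver is bound */
};

/* Bind: configure DMA first, and undo it if the driver's probe fails. */
int model_bind(struct model_device *dev,
	       int (*probe)(struct model_device *dev))
{
	int ret;

	if (dev->bus->dma_configure) {
		ret = dev->bus->dma_configure(dev);
		if (ret)
			return ret;	/* assumed to have undone its own work */
	}

	ret = probe(dev);
	if (ret) {
		if (dev->bus->dma_cleanup)
			dev->bus->dma_cleanup(dev);
		return ret;
	}

	dev->bound = 1;
	return 0;
}

/* Unbind: release whatever dma_configure() took at bind time. */
void model_unbind(struct model_device *dev)
{
	if (!dev->bound)
		return;

	dev->bound = 0;
	if (dev->bus->dma_cleanup)
		dev->bus->dma_cleanup(dev);
}

/* Example callbacks that only count, so the pairing can be observed. */
int model_dma_configure(struct model_device *dev)
{
	dev->dma_users++;
	return 0;
}

void model_dma_cleanup(struct model_device *dev)
{
	assert(dev->dma_users > 0);	/* cleanup must follow a configure */
	dev->dma_users--;
}

int model_probe_ok(struct model_device *dev)
{
	(void)dev;
	return 0;
}

int model_probe_fail(struct model_device *dev)
{
	(void)dev;
	return -ENODEV;
}
```

In this model a failed probe and a later unbind both leave dma_users at
zero, which is the property the paired callback is meant to provide.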

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/device/bus.h | 3 +++
 drivers/base/dd.c          | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index a039ab809753..d8b29ccd07e5 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -59,6 +59,8 @@ struct fwnode_handle;
  *		bus supports.
  * @dma_configure:	Called to setup DMA configuration on a device on
  *			this bus.
+ * @dma_cleanup:	Called to cleanup DMA configuration on a device on
+ *			this bus.
  * @pm:		Power management operations of this bus, callback the specific
  *		device driver's pm-ops.
  * @iommu_ops:  IOMMU specific operations for this bus, used to attach IOMMU
@@ -103,6 +105,7 @@ struct bus_type {
 	int (*num_vf)(struct device *dev);
 
 	int (*dma_configure)(struct device *dev);
+	void (*dma_cleanup)(struct device *dev);
 
 	const struct dev_pm_ops *pm;
 
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 9eaaff2f556c..de05c5c60c6b 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -662,6 +662,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 	if (dev->bus)
 		blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 					     BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
+	if (dev->bus && dev->bus->dma_cleanup)
+		dev->bus->dma_cleanup(dev);
 pinctrl_bind_failed:
 	device_links_no_driver(dev);
 	devres_release_all(dev);
@@ -1205,6 +1207,9 @@ static void __device_release_driver(struct device *dev, struct device *parent)
 		else if (drv->remove)
 			drv->remove(dev);
 
+		if (dev->bus && dev->bus->dma_cleanup)
+			dev->bus->dma_cleanup(dev);
+
 		device_links_driver_cleanup(dev);
 
 		devres_release_all(dev);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 03/11] amba: Stop sharing platform_dma_configure()
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

Stop sharing the platform_dma_configure() helper between the platform and
amba buses, as each bus is about to have its own dma_configure callback.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/platform_device.h |  2 --
 drivers/amba/bus.c              | 19 ++++++++++++++++++-
 drivers/base/platform.c         |  3 +--
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 7c96f169d274..17fde717df68 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -328,8 +328,6 @@ extern int platform_pm_restore(struct device *dev);
 #define platform_pm_restore		NULL
 #endif
 
-extern int platform_dma_configure(struct device *dev);
-
 #ifdef CONFIG_PM_SLEEP
 #define USE_PLATFORM_PM_SLEEP_OPS \
 	.suspend = platform_pm_suspend, \
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index e1a5eca3ae3c..8392f4aa251b 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -20,6 +20,8 @@
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 #include <linux/of_irq.h>
+#include <linux/of_device.h>
+#include <linux/acpi.h>
 
 #define to_amba_driver(d)	container_of(d, struct amba_driver, drv)
 
@@ -273,6 +275,21 @@ static void amba_shutdown(struct device *dev)
 		drv->shutdown(to_amba_device(dev));
 }
 
+static int amba_dma_configure(struct device *dev)
+{
+	enum dev_dma_attr attr;
+	int ret = 0;
+
+	if (dev->of_node) {
+		ret = of_dma_configure(dev, dev->of_node, true);
+	} else if (has_acpi_companion(dev)) {
+		attr = acpi_get_dma_attr(to_acpi_device_node(dev->fwnode));
+		ret = acpi_dma_configure(dev, attr);
+	}
+
+	return ret;
+}
+
 #ifdef CONFIG_PM
 /*
  * Hooks to provide runtime PM of the pclk (bus clock).  It is safe to
@@ -341,7 +358,7 @@ struct bus_type amba_bustype = {
 	.probe		= amba_probe,
 	.remove		= amba_remove,
 	.shutdown	= amba_shutdown,
-	.dma_configure	= platform_dma_configure,
+	.dma_configure	= amba_dma_configure,
 	.pm		= &amba_pm,
 };
 EXPORT_SYMBOL_GPL(amba_bustype);
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 6cb04ac48bf0..acbc6eae37b8 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1454,8 +1454,7 @@ static void platform_shutdown(struct device *_dev)
 		drv->shutdown(dev);
 }
 
-
-int platform_dma_configure(struct device *dev)
+static int platform_dma_configure(struct device *dev)
 {
 	enum dev_dma_attr attr;
 	int ret = 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 04/11] bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

The devices on platform/amba/fsl-mc/PCI buses can be bound to drivers
with the device DMA managed by kernel drivers or user-space applications.
Unfortunately, multiple devices may be placed in the same IOMMU group
because they cannot be isolated from each other. The DMA on these devices
must either be entirely under kernel control or userspace control, never
a mixture. Otherwise driver integrity is not guaranteed, because the
devices could access each other through peer-to-peer accesses that bypass
the IOMMU protection.

This checks and sets the default DMA mode during driver binding, and
cleans it up during driver unbinding. In the default mode, the device DMA
is managed by the device driver, which handles DMA operations through the
kernel DMA APIs (see Documentation/core-api/dma-api.rst).

For cases where the devices are assigned for userspace control through the
userspace driver framework (i.e. VFIO), the drivers (for example, vfio_pci,
vfio_platform, etc.) may set a new flag (driver_managed_dma) to skip this
default setting, on the assumption that the drivers know what they are
doing with the device DMA.

With the IOMMU layer knowing the DMA ownership of each device, the above
problem can be solved.
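
The ownership rule that the bus callbacks rely on can be sketched as a
user-space model. This is a simplified illustration under stated
assumptions, not the IOMMU core code: the model_* names are hypothetical,
the attached domain is not modelled at all, and only the owner_cnt/owner
bookkeeping and the driver_managed_dma check are mirrored.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the per-group ownership state; domains are not modelled. */
struct model_group {
	unsigned int owner_cnt;	/* number of current DMA owners */
	void *owner;		/* non-NULL while user space owns the group */
};

/* Kernel driver binding: refused while the group is user-owned. */
int model_use_default_domain(struct model_group *g)
{
	if (g->owner_cnt && g->owner)
		return -EBUSY;

	g->owner_cnt++;
	return 0;
}

void model_unuse_default_domain(struct model_group *g)
{
	if (g->owner_cnt)
		g->owner_cnt--;
}

/* User-space claim: allowed if unowned or already held by @owner. */
int model_claim_dma_owner(struct model_group *g, void *owner)
{
	if (g->owner_cnt) {
		if (g->owner != owner)
			return -EPERM;
	} else {
		g->owner = owner;
	}

	g->owner_cnt++;
	return 0;
}

void model_release_dma_owner(struct model_group *g)
{
	if (!g->owner_cnt || !g->owner)
		return;

	if (--g->owner_cnt == 0)
		g->owner = NULL;
}

/* Bus dma_configure() step: drivers managing DMA themselves skip it. */
int model_bus_dma_configure(struct model_group *g, bool driver_managed_dma)
{
	if (driver_managed_dma)
		return 0;

	return model_use_default_domain(g);
}

/* Usage: one kernel-owned phase followed by one user-owned phase. */
int model_scenario(void)
{
	struct model_group g = { 0, NULL };
	int token;	/* any unique address can serve as the owner cookie */

	assert(model_bus_dma_configure(&g, false) == 0);
	assert(model_claim_dma_owner(&g, &token) == -EPERM);
	model_unuse_default_domain(&g);

	assert(model_claim_dma_owner(&g, &token) == 0);
	assert(model_bus_dma_configure(&g, false) == -EBUSY);
	assert(model_bus_dma_configure(&g, true) == 0);
	model_release_dma_owner(&g);

	return g.owner_cnt == 0 && g.owner == NULL ? 0 : -1;
}
```

The point of the model is that a group is either counted for kernel
drivers (owner is NULL) or for one user-space owner, never both at once.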

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/amba/bus.h        |  8 ++++++++
 include/linux/fsl/mc.h          |  8 ++++++++
 include/linux/pci.h             |  8 ++++++++
 include/linux/platform_device.h |  8 ++++++++
 drivers/amba/bus.c              | 20 ++++++++++++++++++++
 drivers/base/platform.c         | 20 ++++++++++++++++++++
 drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++++++++++++++--
 drivers/pci/pci-driver.c        | 21 +++++++++++++++++++++
 8 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/include/linux/amba/bus.h b/include/linux/amba/bus.h
index 6c7f47846971..e9cd981be94e 100644
--- a/include/linux/amba/bus.h
+++ b/include/linux/amba/bus.h
@@ -79,6 +79,14 @@ struct amba_driver {
 	void			(*remove)(struct amba_device *);
 	void			(*shutdown)(struct amba_device *);
 	const struct amba_id	*id_table;
+	/*
+	 * For most device drivers, no need to care about this flag as long as
+	 * all DMAs are handled through the kernel DMA API. For some special
+	 * ones, for example VFIO drivers, they know how to manage the DMA
+	 * themselves and set this flag so that the IOMMU layer will allow them
+	 * to setup and manage their own I/O address space.
+	 */
+	bool driver_managed_dma;
 };
 
 /*
diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h
index 7b6c42bfb660..27efef8affb1 100644
--- a/include/linux/fsl/mc.h
+++ b/include/linux/fsl/mc.h
@@ -32,6 +32,13 @@ struct fsl_mc_io;
  * @shutdown: Function called at shutdown time to quiesce the device
  * @suspend: Function called when a device is stopped
  * @resume: Function called when a device is resumed
+ * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
+ *		For most device drivers, no need to care about this flag
+ *		as long as all DMAs are handled through the kernel DMA API.
+ *		For some special ones, for example VFIO drivers, they know
+ *		how to manage the DMA themselves and set this flag so that
+ *		the IOMMU layer will allow them to setup and manage their
+ *		own I/O address space.
  *
  * Generic DPAA device driver object for device drivers that are registered
  * with a DPRC bus. This structure is to be embedded in each device-specific
@@ -45,6 +52,7 @@ struct fsl_mc_driver {
 	void (*shutdown)(struct fsl_mc_device *dev);
 	int (*suspend)(struct fsl_mc_device *dev, pm_message_t state);
 	int (*resume)(struct fsl_mc_device *dev);
+	bool driver_managed_dma;
 };
 
 #define to_fsl_mc_driver(_drv) \
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8253a5413d7c..b94bce839b83 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -894,6 +894,13 @@ struct module;
  *              created once it is bound to the driver.
  * @driver:	Driver model structure.
  * @dynids:	List of dynamically added device IDs.
+ * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
+ *		For most device drivers, no need to care about this flag
+ *		as long as all DMAs are handled through the kernel DMA API.
+ *		For some special ones, for example VFIO drivers, they know
+ *		how to manage the DMA themselves and set this flag so that
+ *		the IOMMU layer will allow them to setup and manage their
+ *		own I/O address space.
  */
 struct pci_driver {
 	struct list_head	node;
@@ -912,6 +919,7 @@ struct pci_driver {
 	const struct attribute_group **dev_groups;
 	struct device_driver	driver;
 	struct pci_dynids	dynids;
+	bool driver_managed_dma;
 };
 
 static inline struct pci_driver *to_pci_driver(struct device_driver *drv)
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 17fde717df68..b3d9c744f1e5 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -210,6 +210,14 @@ struct platform_driver {
 	struct device_driver driver;
 	const struct platform_device_id *id_table;
 	bool prevent_deferred_probe;
+	/*
+	 * For most device drivers, no need to care about this flag as long as
+	 * all DMAs are handled through the kernel DMA API. For some special
+	 * ones, for example VFIO drivers, they know how to manage the DMA
+	 * themselves and set this flag so that the IOMMU layer will allow them
+	 * to setup and manage their own I/O address space.
+	 */
+	bool driver_managed_dma;
 };
 
 #define to_platform_driver(drv)	(container_of((drv), struct platform_driver, \
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index 8392f4aa251b..cebf03522524 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -22,6 +22,7 @@
 #include <linux/of_irq.h>
 #include <linux/of_device.h>
 #include <linux/acpi.h>
+#include <linux/iommu.h>
 
 #define to_amba_driver(d)	container_of(d, struct amba_driver, drv)
 
@@ -277,9 +278,16 @@ static void amba_shutdown(struct device *dev)
 
 static int amba_dma_configure(struct device *dev)
 {
+	struct amba_driver *drv = to_amba_driver(dev->driver);
 	enum dev_dma_attr attr;
 	int ret = 0;
 
+	if (!drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			return ret;
+	}
+
 	if (dev->of_node) {
 		ret = of_dma_configure(dev, dev->of_node, true);
 	} else if (has_acpi_companion(dev)) {
@@ -287,9 +295,20 @@ static int amba_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, attr);
 	}
 
+	if (ret && !drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+
 	return ret;
 }
 
+static void amba_dma_cleanup(struct device *dev)
+{
+	struct amba_driver *drv = to_amba_driver(dev->driver);
+
+	if (!drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 #ifdef CONFIG_PM
 /*
  * Hooks to provide runtime PM of the pclk (bus clock).  It is safe to
@@ -359,6 +378,7 @@ struct bus_type amba_bustype = {
 	.remove		= amba_remove,
 	.shutdown	= amba_shutdown,
 	.dma_configure	= amba_dma_configure,
+	.dma_cleanup	= amba_dma_cleanup,
 	.pm		= &amba_pm,
 };
 EXPORT_SYMBOL_GPL(amba_bustype);
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index acbc6eae37b8..ad8ea9453cdb 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -30,6 +30,7 @@
 #include <linux/property.h>
 #include <linux/kmemleak.h>
 #include <linux/types.h>
+#include <linux/iommu.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -1456,9 +1457,16 @@ static void platform_shutdown(struct device *_dev)
 
 static int platform_dma_configure(struct device *dev)
 {
+	struct platform_driver *drv = to_platform_driver(dev->driver);
 	enum dev_dma_attr attr;
 	int ret = 0;
 
+	if (!drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			return ret;
+	}
+
 	if (dev->of_node) {
 		ret = of_dma_configure(dev, dev->of_node, true);
 	} else if (has_acpi_companion(dev)) {
@@ -1466,9 +1474,20 @@ static int platform_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, attr);
 	}
 
+	if (ret && !drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+
 	return ret;
 }
 
+static void platform_dma_cleanup(struct device *dev)
+{
+	struct platform_driver *drv = to_platform_driver(dev->driver);
+
+	if (!drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 static const struct dev_pm_ops platform_dev_pm_ops = {
 	SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend, pm_generic_runtime_resume, NULL)
 	USE_PLATFORM_PM_SLEEP_OPS
@@ -1483,6 +1502,7 @@ struct bus_type platform_bus_type = {
 	.remove		= platform_remove,
 	.shutdown	= platform_shutdown,
 	.dma_configure	= platform_dma_configure,
+	.dma_cleanup	= platform_dma_cleanup,
 	.pm		= &platform_dev_pm_ops,
 };
 EXPORT_SYMBOL_GPL(platform_bus_type);
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8fd4a356a86e..eca3406a14ce 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -140,15 +140,36 @@ static int fsl_mc_dma_configure(struct device *dev)
 {
 	struct device *dma_dev = dev;
 	struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 	u32 input_id = mc_dev->icid;
+	int ret;
+
+	if (!mc_drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			return ret;
+	}
 
 	while (dev_is_fsl_mc(dma_dev))
 		dma_dev = dma_dev->parent;
 
 	if (dev_of_node(dma_dev))
-		return of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+		ret = of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+	else
+		ret = acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+
+	if (ret && !mc_drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+
+	return ret;
+}
+
+static void fsl_mc_dma_cleanup(struct device *dev)
+{
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 
-	return acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+	if (!mc_drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
 }
 
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
@@ -312,6 +333,7 @@ struct bus_type fsl_mc_bus_type = {
 	.match = fsl_mc_bus_match,
 	.uevent = fsl_mc_bus_uevent,
 	.dma_configure  = fsl_mc_dma_configure,
+	.dma_cleanup = fsl_mc_dma_cleanup,
 	.dev_groups = fsl_mc_dev_groups,
 	.bus_groups = fsl_mc_bus_groups,
 };
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 588588cfda48..893a8707c179 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -20,6 +20,7 @@
 #include <linux/of_device.h>
 #include <linux/acpi.h>
 #include <linux/dma-map-ops.h>
+#include <linux/iommu.h>
 #include "pci.h"
 #include "pcie/portdrv.h"
 
@@ -1590,9 +1591,16 @@ static int pci_bus_num_vf(struct device *dev)
  */
 static int pci_dma_configure(struct device *dev)
 {
+	struct pci_driver *driver = to_pci_driver(dev->driver);
 	struct device *bridge;
 	int ret = 0;
 
+	if (!driver->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			return ret;
+	}
+
 	bridge = pci_get_host_bridge_device(to_pci_dev(dev));
 
 	if (IS_ENABLED(CONFIG_OF) && bridge->parent &&
@@ -1605,9 +1613,21 @@ static int pci_dma_configure(struct device *dev)
 	}
 
 	pci_put_host_bridge_device(bridge);
+
+	if (ret && !driver->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+
 	return ret;
 }
 
+static void pci_dma_cleanup(struct device *dev)
+{
+	struct pci_driver *driver = to_pci_driver(dev->driver);
+
+	if (!driver->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 struct bus_type pci_bus_type = {
 	.name		= "pci",
 	.match		= pci_bus_match,
@@ -1621,6 +1641,7 @@ struct bus_type pci_bus_type = {
 	.pm		= PCI_PM_OPS_PTR,
 	.num_vf		= pci_bus_num_vf,
 	.dma_configure	= pci_dma_configure,
+	.dma_cleanup	= pci_dma_cleanup,
 };
 EXPORT_SYMBOL(pci_bus_type);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 04/11] bus: platform, amba, fsl-mc, PCI: Add device DMA ownership management
@ 2022-02-18  0:55   ` Lu Baolu
  0 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: kvm, rafael, David Airlie, linux-pci, Thierry Reding,
	Diana Craciun, Dmitry Osipenko, Will Deacon, Stuart Yoder,
	Jonathan Hunter, Chaitanya Kulkarni, Dan Williams, Cornelia Huck,
	linux-kernel, Li Yang, iommu, Jacob jun Pan, Daniel Vetter,
	Robin Murphy

The devices on platform/amba/fsl-mc/PCI buses can be bound to drivers
with the device DMA managed by kernel drivers or user-space applications.
Unfortunately, multiple devices may be placed in the same IOMMU group
because they cannot be isolated from each other. The DMA on these devices
must either be entirely under kernel control or userspace control, never
a mixture. Otherwise driver integrity is not guaranteed, because the
devices could access each other through peer-to-peer accesses that bypass
the IOMMU protection.

This checks and sets the default DMA mode during driver binding, and
cleans it up during driver unbinding. In the default mode, the device DMA
is managed by the device driver, which handles DMA operations through the
kernel DMA APIs (see Documentation/core-api/dma-api.rst).

For cases where the devices are assigned for userspace control through the
userspace driver framework (i.e. VFIO), the drivers (for example, vfio_pci,
vfio_platform, etc.) may set a new flag (driver_managed_dma) to skip this
default setting, on the assumption that the drivers know what they are
doing with the device DMA.

With the IOMMU layer knowing the DMA ownership of each device, the above
problem can be solved.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/amba/bus.h        |  8 ++++++++
 include/linux/fsl/mc.h          |  8 ++++++++
 include/linux/pci.h             |  8 ++++++++
 include/linux/platform_device.h |  8 ++++++++
 drivers/amba/bus.c              | 20 ++++++++++++++++++++
 drivers/base/platform.c         | 20 ++++++++++++++++++++
 drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++++++++++++++--
 drivers/pci/pci-driver.c        | 21 +++++++++++++++++++++
 8 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/include/linux/amba/bus.h b/include/linux/amba/bus.h
index 6c7f47846971..e9cd981be94e 100644
--- a/include/linux/amba/bus.h
+++ b/include/linux/amba/bus.h
@@ -79,6 +79,14 @@ struct amba_driver {
 	void			(*remove)(struct amba_device *);
 	void			(*shutdown)(struct amba_device *);
 	const struct amba_id	*id_table;
+	/*
+	 * For most device drivers, no need to care about this flag as long as
+	 * all DMAs are handled through the kernel DMA API. For some special
+	 * ones, for example VFIO drivers, they know how to manage the DMA
+	 * themselves and set this flag so that the IOMMU layer will allow them
+	 * to setup and manage their own I/O address space.
+	 */
+	bool driver_managed_dma;
 };
 
 /*
diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h
index 7b6c42bfb660..27efef8affb1 100644
--- a/include/linux/fsl/mc.h
+++ b/include/linux/fsl/mc.h
@@ -32,6 +32,13 @@ struct fsl_mc_io;
  * @shutdown: Function called at shutdown time to quiesce the device
  * @suspend: Function called when a device is stopped
  * @resume: Function called when a device is resumed
+ * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
+ *		For most device drivers, no need to care about this flag
+ *		as long as all DMAs are handled through the kernel DMA API.
+ *		For some special ones, for example VFIO drivers, they know
+ *		how to manage the DMA themselves and set this flag so that
+ *		the IOMMU layer will allow them to setup and manage their
+ *		own I/O address space.
  *
  * Generic DPAA device driver object for device drivers that are registered
  * with a DPRC bus. This structure is to be embedded in each device-specific
@@ -45,6 +52,7 @@ struct fsl_mc_driver {
 	void (*shutdown)(struct fsl_mc_device *dev);
 	int (*suspend)(struct fsl_mc_device *dev, pm_message_t state);
 	int (*resume)(struct fsl_mc_device *dev);
+	bool driver_managed_dma;
 };
 
 #define to_fsl_mc_driver(_drv) \
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8253a5413d7c..b94bce839b83 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -894,6 +894,13 @@ struct module;
  *              created once it is bound to the driver.
  * @driver:	Driver model structure.
  * @dynids:	List of dynamically added device IDs.
+ * @driver_managed_dma: Device driver doesn't use kernel DMA API for DMA.
+ *		For most device drivers, no need to care about this flag
+ *		as long as all DMAs are handled through the kernel DMA API.
+ *		For some special ones, for example VFIO drivers, they know
+ *		how to manage the DMA themselves and set this flag so that
+ *		the IOMMU layer will allow them to setup and manage their
+ *		own I/O address space.
  */
 struct pci_driver {
 	struct list_head	node;
@@ -912,6 +919,7 @@ struct pci_driver {
 	const struct attribute_group **dev_groups;
 	struct device_driver	driver;
 	struct pci_dynids	dynids;
+	bool driver_managed_dma;
 };
 
 static inline struct pci_driver *to_pci_driver(struct device_driver *drv)
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 17fde717df68..b3d9c744f1e5 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -210,6 +210,14 @@ struct platform_driver {
 	struct device_driver driver;
 	const struct platform_device_id *id_table;
 	bool prevent_deferred_probe;
+	/*
+	 * For most device drivers, no need to care about this flag as long as
+	 * all DMAs are handled through the kernel DMA API. For some special
+	 * ones, for example VFIO drivers, they know how to manage the DMA
+	 * themselves and set this flag so that the IOMMU layer will allow them
+	 * to setup and manage their own I/O address space.
+	 */
+	bool driver_managed_dma;
 };
 
 #define to_platform_driver(drv)	(container_of((drv), struct platform_driver, \
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index 8392f4aa251b..cebf03522524 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -22,6 +22,7 @@
 #include <linux/of_irq.h>
 #include <linux/of_device.h>
 #include <linux/acpi.h>
+#include <linux/iommu.h>
 
 #define to_amba_driver(d)	container_of(d, struct amba_driver, drv)
 
@@ -277,9 +278,16 @@ static void amba_shutdown(struct device *dev)
 
 static int amba_dma_configure(struct device *dev)
 {
+	struct amba_driver *drv = to_amba_driver(dev->driver);
 	enum dev_dma_attr attr;
 	int ret = 0;
 
+	if (!drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			return ret;
+	}
+
 	if (dev->of_node) {
 		ret = of_dma_configure(dev, dev->of_node, true);
 	} else if (has_acpi_companion(dev)) {
@@ -287,9 +295,20 @@ static int amba_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, attr);
 	}
 
+	if (ret && !drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+
 	return ret;
 }
 
+static void amba_dma_cleanup(struct device *dev)
+{
+	struct amba_driver *drv = to_amba_driver(dev->driver);
+
+	if (!drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 #ifdef CONFIG_PM
 /*
  * Hooks to provide runtime PM of the pclk (bus clock).  It is safe to
@@ -359,6 +378,7 @@ struct bus_type amba_bustype = {
 	.remove		= amba_remove,
 	.shutdown	= amba_shutdown,
 	.dma_configure	= amba_dma_configure,
+	.dma_cleanup	= amba_dma_cleanup,
 	.pm		= &amba_pm,
 };
 EXPORT_SYMBOL_GPL(amba_bustype);
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index acbc6eae37b8..ad8ea9453cdb 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -30,6 +30,7 @@
 #include <linux/property.h>
 #include <linux/kmemleak.h>
 #include <linux/types.h>
+#include <linux/iommu.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -1456,9 +1457,16 @@ static void platform_shutdown(struct device *_dev)
 
 static int platform_dma_configure(struct device *dev)
 {
+	struct platform_driver *drv = to_platform_driver(dev->driver);
 	enum dev_dma_attr attr;
 	int ret = 0;
 
+	if (!drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			return ret;
+	}
+
 	if (dev->of_node) {
 		ret = of_dma_configure(dev, dev->of_node, true);
 	} else if (has_acpi_companion(dev)) {
@@ -1466,9 +1474,20 @@ static int platform_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, attr);
 	}
 
+	if (ret && !drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+
 	return ret;
 }
 
+static void platform_dma_cleanup(struct device *dev)
+{
+	struct platform_driver *drv = to_platform_driver(dev->driver);
+
+	if (!drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 static const struct dev_pm_ops platform_dev_pm_ops = {
 	SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend, pm_generic_runtime_resume, NULL)
 	USE_PLATFORM_PM_SLEEP_OPS
@@ -1483,6 +1502,7 @@ struct bus_type platform_bus_type = {
 	.remove		= platform_remove,
 	.shutdown	= platform_shutdown,
 	.dma_configure	= platform_dma_configure,
+	.dma_cleanup	= platform_dma_cleanup,
 	.pm		= &platform_dev_pm_ops,
 };
 EXPORT_SYMBOL_GPL(platform_bus_type);
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8fd4a356a86e..eca3406a14ce 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -140,15 +140,36 @@ static int fsl_mc_dma_configure(struct device *dev)
 {
 	struct device *dma_dev = dev;
 	struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 	u32 input_id = mc_dev->icid;
+	int ret;
+
+	if (!mc_drv->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			return ret;
+	}
 
 	while (dev_is_fsl_mc(dma_dev))
 		dma_dev = dma_dev->parent;
 
 	if (dev_of_node(dma_dev))
-		return of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+		ret = of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+	else
+		ret = acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+
+	if (ret && !mc_drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+
+	return ret;
+}
+
+static void fsl_mc_dma_cleanup(struct device *dev)
+{
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 
-	return acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+	if (!mc_drv->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
 }
 
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
@@ -312,6 +333,7 @@ struct bus_type fsl_mc_bus_type = {
 	.match = fsl_mc_bus_match,
 	.uevent = fsl_mc_bus_uevent,
 	.dma_configure  = fsl_mc_dma_configure,
+	.dma_cleanup = fsl_mc_dma_cleanup,
 	.dev_groups = fsl_mc_dev_groups,
 	.bus_groups = fsl_mc_bus_groups,
 };
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 588588cfda48..893a8707c179 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -20,6 +20,7 @@
 #include <linux/of_device.h>
 #include <linux/acpi.h>
 #include <linux/dma-map-ops.h>
+#include <linux/iommu.h>
 #include "pci.h"
 #include "pcie/portdrv.h"
 
@@ -1590,9 +1591,16 @@ static int pci_bus_num_vf(struct device *dev)
  */
 static int pci_dma_configure(struct device *dev)
 {
+	struct pci_driver *driver = to_pci_driver(dev->driver);
 	struct device *bridge;
 	int ret = 0;
 
+	if (!driver->driver_managed_dma) {
+		ret = iommu_device_use_default_domain(dev);
+		if (ret)
+			return ret;
+	}
+
 	bridge = pci_get_host_bridge_device(to_pci_dev(dev));
 
 	if (IS_ENABLED(CONFIG_OF) && bridge->parent &&
@@ -1605,9 +1613,21 @@ static int pci_dma_configure(struct device *dev)
 	}
 
 	pci_put_host_bridge_device(bridge);
+
+	if (ret && !driver->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+
 	return ret;
 }
 
+static void pci_dma_cleanup(struct device *dev)
+{
+	struct pci_driver *driver = to_pci_driver(dev->driver);
+
+	if (!driver->driver_managed_dma)
+		iommu_device_unuse_default_domain(dev);
+}
+
 struct bus_type pci_bus_type = {
 	.name		= "pci",
 	.match		= pci_bus_match,
@@ -1621,6 +1641,7 @@ struct bus_type pci_bus_type = {
 	.pm		= PCI_PM_OPS_PTR,
 	.num_vf		= pci_bus_num_vf,
 	.dma_configure	= pci_dma_configure,
+	.dma_cleanup	= pci_dma_cleanup,
 };
 EXPORT_SYMBOL(pci_bus_type);
 
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 05/11] PCI: pci_stub: Set driver_managed_dma
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

The current VFIO implementation allows the pci-stub driver to be bound to
a PCI device while other devices in the same IOMMU group are assigned
to userspace. The pci-stub driver has no dependencies on DMA or the
IOVA mapping of the device, but it does prevent the user from having
direct access to the device, which is useful in some circumstances.

pci_dma_configure() marks the iommu_group as containing only devices
with kernel drivers that manage DMA. For compatibility with the VFIO
usage, avoid this default behavior for pci_stub. This allows the admin
to keep using pci_stub to block driver binding after DMA ownership is
applied to VFIO.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
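A small userspace model may help show why the flag matters for pci-stub. It
is only a sketch under assumptions: struct model_group, model_bind() and
model_claim() are hypothetical names, model_claim() loosely stands in for
iommu_group_claim_dma_owner(), and the error codes are illustrative rather
than taken from the IOMMU core.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-group ownership state. */
struct model_group {
	unsigned int kernel_users;	/* drivers bound in the default DMA mode */
	const void *user_owner;		/* set once userspace claims the group */
};

/* Driver binding: drivers with driver_managed_dma skip the default claim. */
static int model_bind(struct model_group *g, bool driver_managed_dma)
{
	if (driver_managed_dma)
		return 0;
	if (g->user_owner)
		return -EBUSY;	/* assumed: kernel DMA refused after user claim */
	g->kernel_users++;
	return 0;
}

/* Userspace claim: refused while any kernel DMA user is bound in the group. */
static int model_claim(struct model_group *g, const void *owner)
{
	assert(owner != NULL);
	if (g->kernel_users || g->user_owner)
		return -EPERM;	/* assumed error code */
	g->user_owner = owner;
	return 0;
}
```

With pci-stub bound as model_bind(g, true), a later model_claim() on the same
group still succeeds, whereas a regular driver bound with model_bind(g, false)
makes the claim fail.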
 drivers/pci/pci-stub.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/pci-stub.c b/drivers/pci/pci-stub.c
index e408099fea52..d1f4c1ce7bd1 100644
--- a/drivers/pci/pci-stub.c
+++ b/drivers/pci/pci-stub.c
@@ -36,6 +36,7 @@ static struct pci_driver stub_driver = {
 	.name		= "pci-stub",
 	.id_table	= NULL,	/* only dynamic id's */
 	.probe		= pci_stub_probe,
+	.driver_managed_dma = true,
 };
 
 static int __init pci_stub_init(void)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 06/11] PCI: portdrv: Set driver_managed_dma
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

If a switch lacks ACS P2P Request Redirect, a device below the switch can
bypass the IOMMU and DMA directly to other devices below the switch, so
all the downstream devices must be in the same IOMMU group as the switch
itself.

The existing VFIO framework allows the portdrv driver to be bound to the
bridge while its downstream devices are assigned to user space.
pci_dma_configure() marks the IOMMU group as containing only devices
with kernel drivers that manage DMA. Avoid this default behavior for the
portdrv driver to stay compatible with the current VFIO usage.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/pci/pcie/portdrv_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 35eca6277a96..6b2adb678c21 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -202,6 +202,8 @@ static struct pci_driver pcie_portdriver = {
 
 	.err_handler	= &pcie_portdrv_err_handler,
 
+	.driver_managed_dma = true,
+
 	.driver.pm	= PCIE_PORTDRV_PM_OPS,
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 07/11] vfio: Set DMA ownership for VFIO devices
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

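Claim group DMA ownership when an IOMMU group is set to a container,
and release the DMA ownership once the IOMMU group is unset from the
container. Also set driver_managed_dma in the vfio-pci, vfio-platform,
vfio-amba and vfio-fsl-mc drivers so that binding them skips the default
kernel DMA setup.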

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
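The claim/attach/release ordering added to vfio_group_set_container() can be
sketched in plain C as follows. This is a simplified model, not the VFIO
code: struct model_group and the model_* helpers are hypothetical, attach_ret
stands in for the return value of driver->ops->attach_group(), and the
-EPERM returned on a double claim is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical group state: who owns DMA, and which container is set. */
struct model_group {
	const void *dma_owner;
	const void *container;
};

/* Stands in for iommu_group_claim_dma_owner(); the error code is assumed. */
static int model_claim(struct model_group *g, const void *owner)
{
	if (g->dma_owner)
		return -EPERM;
	g->dma_owner = owner;
	return 0;
}

/* Stands in for iommu_group_release_dma_owner(). */
static void model_release(struct model_group *g)
{
	assert(g->dma_owner != NULL);	/* releasing an unowned group is a bug */
	g->dma_owner = NULL;
}

/* Claim first, then attach; release again if the attach step fails. */
static int model_set_container(struct model_group *g, const void *container,
			       int attach_ret)
{
	int ret;

	ret = model_claim(g, container);
	if (ret)
		return ret;

	if (attach_ret) {
		model_release(g);
		return attach_ret;
	}

	g->container = container;
	return 0;
}

/* Mirrors __vfio_group_unset_container(): ownership is dropped on unset. */
static void model_unset_container(struct model_group *g)
{
	model_release(g);
	g->container = NULL;
}
```

The point of the ordering is that a failed attach leaves the group unowned,
so a later attempt (or a kernel driver bind) is not blocked by a stale claim.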
 drivers/vfio/fsl-mc/vfio_fsl_mc.c     |  1 +
 drivers/vfio/pci/vfio_pci.c           |  1 +
 drivers/vfio/platform/vfio_amba.c     |  1 +
 drivers/vfio/platform/vfio_platform.c |  1 +
 drivers/vfio/vfio.c                   | 10 +++++++++-
 5 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 6e2e62c6f47a..3feff729f3ce 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -588,6 +588,7 @@ static struct fsl_mc_driver vfio_fsl_mc_driver = {
 		.name	= "vfio-fsl-mc",
 		.owner	= THIS_MODULE,
 	},
+	.driver_managed_dma = true,
 };
 
 static int __init vfio_fsl_mc_driver_init(void)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index a5ce92beb655..941909d3918b 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -193,6 +193,7 @@ static struct pci_driver vfio_pci_driver = {
 	.remove			= vfio_pci_remove,
 	.sriov_configure	= vfio_pci_sriov_configure,
 	.err_handler		= &vfio_pci_core_err_handlers,
+	.driver_managed_dma	= true,
 };
 
 static void __init vfio_pci_fill_ids(void)
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index badfffea14fb..1aaa4f721bd2 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -95,6 +95,7 @@ static struct amba_driver vfio_amba_driver = {
 		.name = "vfio-amba",
 		.owner = THIS_MODULE,
 	},
+	.driver_managed_dma = true,
 };
 
 module_amba_driver(vfio_amba_driver);
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 68a1c87066d7..04f40c5acfd6 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = {
 	.driver	= {
 		.name	= "vfio-platform",
 	},
+	.driver_managed_dma = true,
 };
 
 module_platform_driver(vfio_platform_driver);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 735d1d344af9..df9d4b60e5ae 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1198,6 +1198,8 @@ static void __vfio_group_unset_container(struct vfio_group *group)
 		driver->ops->detach_group(container->iommu_data,
 					  group->iommu_group);
 
+	iommu_group_release_dma_owner(group->iommu_group);
+
 	group->container = NULL;
 	wake_up(&group->container_q);
 	list_del(&group->container_next);
@@ -1282,13 +1284,19 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 		goto unlock_out;
 	}
 
+	ret = iommu_group_claim_dma_owner(group->iommu_group, f.file);
+	if (ret)
+		goto unlock_out;
+
 	driver = container->iommu_driver;
 	if (driver) {
 		ret = driver->ops->attach_group(container->iommu_data,
 						group->iommu_group,
 						group->type);
-		if (ret)
+		if (ret) {
+			iommu_group_release_dma_owner(group->iommu_group);
 			goto unlock_out;
+		}
 	}
 
 	group->container = container;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 08/11] vfio: Remove use of vfio_group_viable()
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

As DMA ownership is claimed for the IOMMU group when a VFIO group is
added to a VFIO container, VFIO group viability is guaranteed as long
as group->container_users > 0. Remove the unnecessary group viability
checks, which are only hit when group->container_users is not zero.

The only remaining reference is in GROUP_GET_STATUS, which can be called
at any time while the group fd is valid. There, replace
vfio_group_viable() with a direct call into the IOMMU core to get the
viability status.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
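The new GROUP_GET_STATUS logic reduces to a small truth table, sketched here
as a standalone function. model_group_status() and its two boolean
parameters are hypothetical: container_set stands in for group->container
being non-NULL and dma_owner_claimed for the result of
iommu_group_dma_owner_claimed(). The two flag values are the ones defined in
include/uapi/linux/vfio.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/vfio.h. */
#define VFIO_GROUP_FLAGS_VIABLE		(1 << 0)
#define VFIO_GROUP_FLAGS_CONTAINER_SET	(1 << 1)

/* Computes status.flags the way the reworked ioctl branch does. */
static uint32_t model_group_status(bool container_set, bool dma_owner_claimed)
{
	uint32_t flags = 0;

	if (container_set)
		/* A set container implies this group holds DMA ownership. */
		flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
			 VFIO_GROUP_FLAGS_VIABLE;
	else if (!dma_owner_claimed)
		/* Nobody has claimed the group for userspace, so it is still viable. */
		flags |= VFIO_GROUP_FLAGS_VIABLE;

	return flags;
}
```

The only non-viable case in this model is a group with no container whose
DMA ownership has already been claimed by someone else.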
 drivers/vfio/vfio.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index df9d4b60e5ae..73034446e03f 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1313,12 +1313,6 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	return ret;
 }
 
-static bool vfio_group_viable(struct vfio_group *group)
-{
-	return (iommu_group_for_each_dev(group->iommu_group,
-					 group, vfio_dev_viable) == 0);
-}
-
 static int vfio_group_add_container_user(struct vfio_group *group)
 {
 	if (!atomic_inc_not_zero(&group->container_users))
@@ -1328,7 +1322,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
-	if (!group->container->iommu_driver || !vfio_group_viable(group)) {
+	if (!group->container->iommu_driver) {
 		atomic_dec(&group->container_users);
 		return -EINVAL;
 	}
@@ -1346,7 +1340,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	int ret = 0;
 
 	if (0 == atomic_read(&group->container_users) ||
-	    !group->container->iommu_driver || !vfio_group_viable(group))
+	    !group->container->iommu_driver)
 		return -EINVAL;
 
 	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
@@ -1438,11 +1432,11 @@ static long vfio_group_fops_unl_ioctl(struct file *filep,
 
 		status.flags = 0;
 
-		if (vfio_group_viable(group))
-			status.flags |= VFIO_GROUP_FLAGS_VIABLE;
-
 		if (group->container)
-			status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET;
+			status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
+					VFIO_GROUP_FLAGS_VIABLE;
+		else if (!iommu_group_dma_owner_claimed(group->iommu_group))
+			status.flags |= VFIO_GROUP_FLAGS_VIABLE;
 
 		if (copy_to_user((void __user *)arg, &status, minsz))
 			return -EFAULT;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 09/11] vfio: Delete the unbound_list
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

From: Jason Gunthorpe <jgg@nvidia.com>

Commit 60720a0fc646 ("vfio: Add device tracking during unbind") added the
unbound list to plug a problem with KVM where KVM_DEV_VFIO_GROUP_DEL
relied on vfio_group_get_external_user() succeeding to return the
vfio_group from a group file descriptor. The unbound list allowed
vfio_group_get_external_user() to continue to succeed in edge cases.

However, commit 5d6dee80a1e9 ("vfio: New external user group/file match")
deleted the call to vfio_group_get_external_user() during
KVM_DEV_VFIO_GROUP_DEL. Instead, vfio_external_group_match_file() is used
to directly match the file descriptor to the group pointer.

This in turn avoids the call down to vfio_dev_viable() during
KVM_DEV_VFIO_GROUP_DEL and also avoids the trouble the first commit was
trying to fix.

There are no other users of vfio_dev_viable() that care about the time
after vfio_unregister_group_dev() returns, so simply delete the
unbound_list entirely.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
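Note for readers (not part of the commit message): with the unbound_list
gone, the viability check in the diff below reduces to a plain predicate
over the device's current state. The following is a minimal userspace
sketch of that predicate, assuming a hypothetical struct model_dev whose
fields stand in for dev->driver, the PCI header type test in
vfio_dev_driver_allowed(), and the vfio_group_get_device() lookup. It is
an illustration, not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the state vfio_dev_viable() looks at. */
struct model_dev {
	const char *driver_name; /* NULL models a driver-less device */
	bool is_pci_bridge;      /* models hdr_type != PCI_HEADER_TYPE_NORMAL */
	bool in_vfio_group;      /* models vfio_group_get_device() succeeding */
};

/* Returns 0 if the device does not block userspace use of its group. */
int model_dev_viable(const struct model_dev *dev)
{
	assert(dev != NULL);

	/* Driver-less devices are fine. */
	if (!dev->driver_name)
		return 0;

	/* PCI interconnect devices and the pci-stub driver are allowed. */
	if (dev->is_pci_bridge || strcmp(dev->driver_name, "pci-stub") == 0)
		return 0;

	/* Devices bound to a vfio driver are tracked in the vfio group. */
	if (dev->in_vfio_group)
		return 0;

	return -EINVAL;
}
```

There is no longer a transitional "being unbound" state to special-case,
which is what the unbound_list used to provide.
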
 drivers/vfio/vfio.c | 74 ++-------------------------------------------
 1 file changed, 2 insertions(+), 72 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 73034446e03f..e0df2bc692b2 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -62,11 +62,6 @@ struct vfio_container {
 	bool				noiommu;
 };
 
-struct vfio_unbound_dev {
-	struct device			*dev;
-	struct list_head		unbound_next;
-};
-
 struct vfio_group {
 	struct device 			dev;
 	struct cdev			cdev;
@@ -79,8 +74,6 @@ struct vfio_group {
 	struct notifier_block		nb;
 	struct list_head		vfio_next;
 	struct list_head		container_next;
-	struct list_head		unbound_list;
-	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
 	enum vfio_group_type		type;
@@ -340,16 +333,8 @@ vfio_group_get_from_iommu(struct iommu_group *iommu_group)
 static void vfio_group_release(struct device *dev)
 {
 	struct vfio_group *group = container_of(dev, struct vfio_group, dev);
-	struct vfio_unbound_dev *unbound, *tmp;
-
-	list_for_each_entry_safe(unbound, tmp,
-				 &group->unbound_list, unbound_next) {
-		list_del(&unbound->unbound_next);
-		kfree(unbound);
-	}
 
 	mutex_destroy(&group->device_lock);
-	mutex_destroy(&group->unbound_lock);
 	iommu_group_put(group->iommu_group);
 	ida_free(&vfio.group_ida, MINOR(group->dev.devt));
 	kfree(group);
@@ -381,8 +366,6 @@ static struct vfio_group *vfio_group_alloc(struct iommu_group *iommu_group,
 	refcount_set(&group->users, 1);
 	INIT_LIST_HEAD(&group->device_list);
 	mutex_init(&group->device_lock);
-	INIT_LIST_HEAD(&group->unbound_list);
-	mutex_init(&group->unbound_lock);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
 	/* put in vfio_group_release() */
@@ -571,19 +554,8 @@ static int vfio_dev_viable(struct device *dev, void *data)
 	struct vfio_group *group = data;
 	struct vfio_device *device;
 	struct device_driver *drv = READ_ONCE(dev->driver);
-	struct vfio_unbound_dev *unbound;
-	int ret = -EINVAL;
 
-	mutex_lock(&group->unbound_lock);
-	list_for_each_entry(unbound, &group->unbound_list, unbound_next) {
-		if (dev == unbound->dev) {
-			ret = 0;
-			break;
-		}
-	}
-	mutex_unlock(&group->unbound_lock);
-
-	if (!ret || !drv || vfio_dev_driver_allowed(dev, drv))
+	if (!drv || vfio_dev_driver_allowed(dev, drv))
 		return 0;
 
 	device = vfio_group_get_device(group, dev);
@@ -592,7 +564,7 @@ static int vfio_dev_viable(struct device *dev, void *data)
 		return 0;
 	}
 
-	return ret;
+	return -EINVAL;
 }
 
 /*
@@ -634,7 +606,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 {
 	struct vfio_group *group = container_of(nb, struct vfio_group, nb);
 	struct device *dev = data;
-	struct vfio_unbound_dev *unbound;
 
 	switch (action) {
 	case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
@@ -663,28 +634,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 			__func__, iommu_group_id(group->iommu_group),
 			dev->driver->name);
 		break;
-	case IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER:
-		dev_dbg(dev, "%s: group %d unbound from driver\n", __func__,
-			iommu_group_id(group->iommu_group));
-		/*
-		 * XXX An unbound device in a live group is ok, but we'd
-		 * really like to avoid the above BUG_ON by preventing other
-		 * drivers from binding to it.  Once that occurs, we have to
-		 * stop the system to maintain isolation.  At a minimum, we'd
-		 * want a toggle to disable driver auto probe for this device.
-		 */
-
-		mutex_lock(&group->unbound_lock);
-		list_for_each_entry(unbound,
-				    &group->unbound_list, unbound_next) {
-			if (dev == unbound->dev) {
-				list_del(&unbound->unbound_next);
-				kfree(unbound);
-				break;
-			}
-		}
-		mutex_unlock(&group->unbound_lock);
-		break;
 	}
 	return NOTIFY_OK;
 }
@@ -889,29 +838,10 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 void vfio_unregister_group_dev(struct vfio_device *device)
 {
 	struct vfio_group *group = device->group;
-	struct vfio_unbound_dev *unbound;
 	unsigned int i = 0;
 	bool interrupted = false;
 	long rc;
 
-	/*
-	 * When the device is removed from the group, the group suddenly
-	 * becomes non-viable; the device has a driver (until the unbind
-	 * completes), but it's not present in the group.  This is bad news
-	 * for any external users that need to re-acquire a group reference
-	 * in order to match and release their existing reference.  To
-	 * solve this, we track such devices on the unbound_list to bridge
-	 * the gap until they're fully unbound.
-	 */
-	unbound = kzalloc(sizeof(*unbound), GFP_KERNEL);
-	if (unbound) {
-		unbound->dev = device->dev;
-		mutex_lock(&group->unbound_lock);
-		list_add(&unbound->unbound_next, &group->unbound_list);
-		mutex_unlock(&group->unbound_lock);
-	}
-	WARN_ON(!unbound);
-
 	vfio_device_put(device);
 	rc = try_wait_for_completion(&device->comp);
 	while (rc <= 0) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 10/11] vfio: Remove iommu group notifier
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel,
	Lu Baolu

The iommu core and driver core have been enhanced to avoid unsafe driver
binding to a live group after iommu_group_set_dma_owner(PRIVATE_USER)
has been called. There's no need to register an iommu group notifier.
This removes the iommu group notifier, which contains BUG_ON() and WARN().

Commit 5f096b14d421b ("vfio: Whitelist PCI bridges") allowed all
pcieport drivers to be bound to devices while the group is assigned to
user space. This is not always safe. For example, the shpchp_core driver
relies on PCI MMIO access for its controller functionality. With its
downstream devices assigned to userspace, the MMIO might be changed
through user-initiated P2P accesses without any notification. This might
break the kernel driver's integrity and lead to unpredictable
consequences. As a result, currently we only allow the portdrv driver.

For any bridge driver, in order to avoid the default kernel DMA ownership
claim, we should consider:

 1) Does the bridge driver use DMA? Calling pci_set_master() or
    a dma_map_* API is a sure indication that the driver is doing DMA.

 2) If the bridge driver uses MMIO, is it tolerant of hostile
    userspace also touching the same MMIO registers via P2P DMA
    attacks?

Conservatively, if the driver maps an MMIO region at all, we can say that
it fails the test.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
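Note for readers (not part of the commit message): the two questions in
the commit message above can be read as a conservative predicate over what
an audit of a bridge driver finds. The sketch below is a userspace model
only; struct model_bridge_driver and model_may_skip_dma_claim() are
hypothetical names, and the booleans are filled in by hand rather than
derived from any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of an audit of one bridge driver. */
struct model_bridge_driver {
	bool uses_dma;  /* e.g. calls pci_set_master() or a dma_map_* API */
	bool maps_mmio; /* maps any MMIO region of the bridge */
};

/*
 * Returns true only if the driver passes the conservative test:
 * it does no DMA and maps no MMIO region at all.
 */
bool model_may_skip_dma_claim(const struct model_bridge_driver *drv)
{
	assert(drv != NULL);

	return !drv->uses_dma && !drv->maps_mmio;
}
```

A driver that maps MMIO, as the commit message describes for shpchp_core,
fails this test even if it never does DMA, because hostile userspace could
reach the same registers through P2P DMA.
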
 drivers/vfio/vfio.c | 147 --------------------------------------------
 1 file changed, 147 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index e0df2bc692b2..dd3fac0d6bc9 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -71,7 +71,6 @@ struct vfio_group {
 	struct vfio_container		*container;
 	struct list_head		device_list;
 	struct mutex			device_lock;
-	struct notifier_block		nb;
 	struct list_head		vfio_next;
 	struct list_head		container_next;
 	atomic_t			opened;
@@ -274,8 +273,6 @@ void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops)
 }
 EXPORT_SYMBOL_GPL(vfio_unregister_iommu_driver);
 
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
-				     unsigned long action, void *data);
 static void vfio_group_get(struct vfio_group *group);
 
 /*
@@ -395,13 +392,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 		goto err_put;
 	}
 
-	group->nb.notifier_call = vfio_iommu_group_notifier;
-	err = iommu_group_register_notifier(iommu_group, &group->nb);
-	if (err) {
-		ret = ERR_PTR(err);
-		goto err_put;
-	}
-
 	mutex_lock(&vfio.group_lock);
 
 	/* Did we race creating this group? */
@@ -422,7 +412,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 
 err_unlock:
 	mutex_unlock(&vfio.group_lock);
-	iommu_group_unregister_notifier(group->iommu_group, &group->nb);
 err_put:
 	put_device(&group->dev);
 	return ret;
@@ -447,7 +436,6 @@ static void vfio_group_put(struct vfio_group *group)
 	cdev_device_del(&group->cdev, &group->dev);
 	mutex_unlock(&vfio.group_lock);
 
-	iommu_group_unregister_notifier(group->iommu_group, &group->nb);
 	put_device(&group->dev);
 }
 
@@ -503,141 +491,6 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
 	return NULL;
 }
 
-/*
- * Some drivers, like pci-stub, are only used to prevent other drivers from
- * claiming a device and are therefore perfectly legitimate for a user owned
- * group.  The pci-stub driver has no dependencies on DMA or the IOVA mapping
- * of the device, but it does prevent the user from having direct access to
- * the device, which is useful in some circumstances.
- *
- * We also assume that we can include PCI interconnect devices, ie. bridges.
- * IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
- * then all of the downstream devices will be part of the same IOMMU group as
- * the bridge.  Thus, if placing the bridge into the user owned IOVA space
- * breaks anything, it only does so for user owned devices downstream.  Note
- * that error notification via MSI can be affected for platforms that handle
- * MSI within the same IOVA space as DMA.
- */
-static const char * const vfio_driver_allowed[] = { "pci-stub" };
-
-static bool vfio_dev_driver_allowed(struct device *dev,
-				    struct device_driver *drv)
-{
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
-			return true;
-	}
-
-	return match_string(vfio_driver_allowed,
-			    ARRAY_SIZE(vfio_driver_allowed),
-			    drv->name) >= 0;
-}
-
-/*
- * A vfio group is viable for use by userspace if all devices are in
- * one of the following states:
- *  - driver-less
- *  - bound to a vfio driver
- *  - bound to an otherwise allowed driver
- *  - a PCI interconnect device
- *
- * We use two methods to determine whether a device is bound to a vfio
- * driver.  The first is to test whether the device exists in the vfio
- * group.  The second is to test if the device exists on the group
- * unbound_list, indicating it's in the middle of transitioning from
- * a vfio driver to driver-less.
- */
-static int vfio_dev_viable(struct device *dev, void *data)
-{
-	struct vfio_group *group = data;
-	struct vfio_device *device;
-	struct device_driver *drv = READ_ONCE(dev->driver);
-
-	if (!drv || vfio_dev_driver_allowed(dev, drv))
-		return 0;
-
-	device = vfio_group_get_device(group, dev);
-	if (device) {
-		vfio_device_put(device);
-		return 0;
-	}
-
-	return -EINVAL;
-}
-
-/*
- * Async device support
- */
-static int vfio_group_nb_add_dev(struct vfio_group *group, struct device *dev)
-{
-	struct vfio_device *device;
-
-	/* Do we already know about it?  We shouldn't */
-	device = vfio_group_get_device(group, dev);
-	if (WARN_ON_ONCE(device)) {
-		vfio_device_put(device);
-		return 0;
-	}
-
-	/* Nothing to do for idle groups */
-	if (!atomic_read(&group->container_users))
-		return 0;
-
-	/* TODO Prevent device auto probing */
-	dev_WARN(dev, "Device added to live group %d!\n",
-		 iommu_group_id(group->iommu_group));
-
-	return 0;
-}
-
-static int vfio_group_nb_verify(struct vfio_group *group, struct device *dev)
-{
-	/* We don't care what happens when the group isn't in use */
-	if (!atomic_read(&group->container_users))
-		return 0;
-
-	return vfio_dev_viable(dev, group);
-}
-
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
-				     unsigned long action, void *data)
-{
-	struct vfio_group *group = container_of(nb, struct vfio_group, nb);
-	struct device *dev = data;
-
-	switch (action) {
-	case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
-		vfio_group_nb_add_dev(group, dev);
-		break;
-	case IOMMU_GROUP_NOTIFY_DEL_DEVICE:
-		/*
-		 * Nothing to do here.  If the device is in use, then the
-		 * vfio sub-driver should block the remove callback until
-		 * it is unused.  If the device is unused or attached to a
-		 * stub driver, then it should be released and we don't
-		 * care that it will be going away.
-		 */
-		break;
-	case IOMMU_GROUP_NOTIFY_BIND_DRIVER:
-		dev_dbg(dev, "%s: group %d binding to driver\n", __func__,
-			iommu_group_id(group->iommu_group));
-		break;
-	case IOMMU_GROUP_NOTIFY_BOUND_DRIVER:
-		dev_dbg(dev, "%s: group %d bound to driver %s\n", __func__,
-			iommu_group_id(group->iommu_group), dev->driver->name);
-		BUG_ON(vfio_group_nb_verify(group, dev));
-		break;
-	case IOMMU_GROUP_NOTIFY_UNBIND_DRIVER:
-		dev_dbg(dev, "%s: group %d unbinding from driver %s\n",
-			__func__, iommu_group_id(group->iommu_group),
-			dev->driver->name);
-		break;
-	}
-	return NOTIFY_OK;
-}
-
 /*
  * VFIO driver API
  */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 11/11] iommu: Remove iommu group changes notifier
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18  0:55   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-18  0:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: kvm, rafael, David Airlie, linux-pci, Thierry Reding,
	Diana Craciun, Dmitry Osipenko, Will Deacon, Christoph Hellwig,
	Stuart Yoder, Jonathan Hunter, Chaitanya Kulkarni, Dan Williams,
	Cornelia Huck, linux-kernel, Li Yang, iommu, Jacob jun Pan,
	Daniel Vetter, Robin Murphy

The iommu group changes notifier is not referenced in the tree. Remove it
to avoid dead code.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h | 23 -------------
 drivers/iommu/iommu.c | 75 -------------------------------------------
 2 files changed, 98 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 77972ef978b5..6ef2df258673 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -407,13 +407,6 @@ static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
 	return dev->iommu->iommu_dev->ops;
 }
 
-#define IOMMU_GROUP_NOTIFY_ADD_DEVICE		1 /* Device added */
-#define IOMMU_GROUP_NOTIFY_DEL_DEVICE		2 /* Pre Device removed */
-#define IOMMU_GROUP_NOTIFY_BIND_DRIVER		3 /* Pre Driver bind */
-#define IOMMU_GROUP_NOTIFY_BOUND_DRIVER		4 /* Post Driver bind */
-#define IOMMU_GROUP_NOTIFY_UNBIND_DRIVER	5 /* Pre Driver unbind */
-#define IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER	6 /* Post Driver unbind */
-
 extern int bus_set_iommu(struct bus_type *bus, const struct iommu_ops *ops);
 extern int bus_iommu_probe(struct bus_type *bus);
 extern bool iommu_present(struct bus_type *bus);
@@ -478,10 +471,6 @@ extern int iommu_group_for_each_dev(struct iommu_group *group, void *data,
 extern struct iommu_group *iommu_group_get(struct device *dev);
 extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group);
 extern void iommu_group_put(struct iommu_group *group);
-extern int iommu_group_register_notifier(struct iommu_group *group,
-					 struct notifier_block *nb);
-extern int iommu_group_unregister_notifier(struct iommu_group *group,
-					   struct notifier_block *nb);
 extern int iommu_register_device_fault_handler(struct device *dev,
 					iommu_dev_fault_handler_t handler,
 					void *data);
@@ -878,18 +867,6 @@ static inline void iommu_group_put(struct iommu_group *group)
 {
 }
 
-static inline int iommu_group_register_notifier(struct iommu_group *group,
-						struct notifier_block *nb)
-{
-	return -ENODEV;
-}
-
-static inline int iommu_group_unregister_notifier(struct iommu_group *group,
-						  struct notifier_block *nb)
-{
-	return 0;
-}
-
 static inline
 int iommu_register_device_fault_handler(struct device *dev,
 					iommu_dev_fault_handler_t handler,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4e2ad7124780..196358ed8c3d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -18,7 +18,6 @@
 #include <linux/errno.h>
 #include <linux/iommu.h>
 #include <linux/idr.h>
-#include <linux/notifier.h>
 #include <linux/err.h>
 #include <linux/pci.h>
 #include <linux/bitops.h>
@@ -40,7 +39,6 @@ struct iommu_group {
 	struct kobject *devices_kobj;
 	struct list_head devices;
 	struct mutex mutex;
-	struct blocking_notifier_head notifier;
 	void *iommu_data;
 	void (*iommu_data_release)(void *iommu_data);
 	char *name;
@@ -632,7 +630,6 @@ struct iommu_group *iommu_group_alloc(void)
 	mutex_init(&group->mutex);
 	INIT_LIST_HEAD(&group->devices);
 	INIT_LIST_HEAD(&group->entry);
-	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
 	if (ret < 0) {
@@ -905,10 +902,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	if (ret)
 		goto err_put_group;
 
-	/* Notify any listeners about change to group. */
-	blocking_notifier_call_chain(&group->notifier,
-				     IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-
 	trace_add_device_to_group(group->id, dev);
 
 	dev_info(dev, "Adding to iommu group %d\n", group->id);
@@ -950,10 +943,6 @@ void iommu_group_remove_device(struct device *dev)
 
 	dev_info(dev, "Removing from iommu group %d\n", group->id);
 
-	/* Pre-notify listeners that a device is being removed. */
-	blocking_notifier_call_chain(&group->notifier,
-				     IOMMU_GROUP_NOTIFY_DEL_DEVICE, dev);
-
 	mutex_lock(&group->mutex);
 	list_for_each_entry(tmp_device, &group->devices, list) {
 		if (tmp_device->dev == dev) {
@@ -1075,36 +1064,6 @@ void iommu_group_put(struct iommu_group *group)
 }
 EXPORT_SYMBOL_GPL(iommu_group_put);
 
-/**
- * iommu_group_register_notifier - Register a notifier for group changes
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * This function allows iommu group users to track changes in a group.
- * See include/linux/iommu.h for actions sent via this notifier.  Caller
- * should hold a reference to the group throughout notifier registration.
- */
-int iommu_group_register_notifier(struct iommu_group *group,
-				  struct notifier_block *nb)
-{
-	return blocking_notifier_chain_register(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_register_notifier);
-
-/**
- * iommu_group_unregister_notifier - Unregister a notifier
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * Unregister a previously registered group notifier block.
- */
-int iommu_group_unregister_notifier(struct iommu_group *group,
-				    struct notifier_block *nb)
-{
-	return blocking_notifier_chain_unregister(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier);
-
 /**
  * iommu_register_device_fault_handler() - Register a device fault handler
  * @dev: the device
@@ -1650,14 +1609,8 @@ static int remove_iommu_group(struct device *dev, void *data)
 static int iommu_bus_notifier(struct notifier_block *nb,
 			      unsigned long action, void *data)
 {
-	unsigned long group_action = 0;
 	struct device *dev = data;
-	struct iommu_group *group;
 
-	/*
-	 * ADD/DEL call into iommu driver ops if provided, which may
-	 * result in ADD/DEL notifiers to group->notifier
-	 */
 	if (action == BUS_NOTIFY_ADD_DEVICE) {
 		int ret;
 
@@ -1668,34 +1621,6 @@ static int iommu_bus_notifier(struct notifier_block *nb,
 		return NOTIFY_OK;
 	}
 
-	/*
-	 * Remaining BUS_NOTIFYs get filtered and republished to the
-	 * group, if anyone is listening
-	 */
-	group = iommu_group_get(dev);
-	if (!group)
-		return 0;
-
-	switch (action) {
-	case BUS_NOTIFY_BIND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_BIND_DRIVER;
-		break;
-	case BUS_NOTIFY_BOUND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_BOUND_DRIVER;
-		break;
-	case BUS_NOTIFY_UNBIND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_UNBIND_DRIVER;
-		break;
-	case BUS_NOTIFY_UNBOUND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER;
-		break;
-	}
-
-	if (group_action)
-		blocking_notifier_call_chain(&group->notifier,
-					     group_action, dev);
-
-	iommu_group_put(group);
 	return 0;
 }
 
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 04/11] bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
  2022-02-18  0:55   ` [PATCH v6 04/11] bus: platform, amba, fsl-mc, PCI: " Lu Baolu
@ 2022-02-18  7:55     ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 90+ messages in thread
From: Greg Kroah-Hartman @ 2022-02-18  7:55 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Alex Williamson, Bjorn Helgaas, Jason Gunthorpe,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Robin Murphy, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, Dmitry Osipenko, iommu,
	linux-pci, kvm, linux-kernel

On Fri, Feb 18, 2022 at 08:55:14AM +0800, Lu Baolu wrote:
> The devices on platform/amba/fsl-mc/PCI buses could be bound to drivers
> with the device DMA managed by kernel drivers or user-space applications.
> Unfortunately, multiple devices may be placed in the same IOMMU group
> because they cannot be isolated from each other. The DMA on these devices
> must either be entirely under kernel control or userspace control, never
> a mixture. Otherwise the driver integrity is not guaranteed because they
> could access each other through the peer-to-peer accesses which by-pass
> the IOMMU protection.
> 
> This checks and sets the default DMA mode during driver binding, and
> cleans up during driver unbinding. In the default mode, the device DMA is
> managed by the device driver which handles DMA operations through the
> kernel DMA APIs (see Documentation/core-api/dma-api.rst).
> 
> For cases where the devices are assigned for userspace control through the
> userspace driver framework (i.e. VFIO), the drivers (for example, vfio_pci,
> vfio_platform, etc.) may set a new flag (driver_managed_dma) to skip this
> default setting on the assumption that the drivers know what they are
> doing with the device DMA.
> 
> With the IOMMU layer knowing the DMA ownership of each device, the above
> problem can be solved.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Stuart Yoder <stuyoder@gmail.com>
> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  include/linux/amba/bus.h        |  8 ++++++++
>  include/linux/fsl/mc.h          |  8 ++++++++
>  include/linux/pci.h             |  8 ++++++++
>  include/linux/platform_device.h |  8 ++++++++
>  drivers/amba/bus.c              | 20 ++++++++++++++++++++
>  drivers/base/platform.c         | 20 ++++++++++++++++++++
>  drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++++++++++++++--
>  drivers/pci/pci-driver.c        | 21 +++++++++++++++++++++
>  8 files changed, 117 insertions(+), 2 deletions(-)

For the platform.c stuff:

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


thanks for renaming this.

greg k-h

^ permalink raw reply	[flat|nested] 90+ messages in thread
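
The ownership rule quoted in the message above (the DMA of all devices in
a group is either entirely kernel-managed or entirely userspace-managed,
never a mixture) can be modelled in a few lines of userspace C. This is
only an illustrative sketch: every toy_* name is hypothetical, the error
codes are placeholders, and it is not the code from the patch. Only the
driver_managed_dma flag name is taken from the quoted description.

```c
#include <assert.h>	/* for callers that want to check the return codes */
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's struct iommu_group. */
struct toy_group {
	unsigned int owner_cnt;	/* number of active claims on the group */
	void *owner;		/* NULL: kernel DMA API; else userspace owner cookie */
};

/* A kernel driver binds and wants the default (kernel DMA API) mode. */
int toy_use_default_domain(struct toy_group *g)
{
	if (g->owner_cnt && g->owner)
		return -EBUSY;	/* group was already handed to userspace */
	g->owner_cnt++;
	return 0;
}

/* A kernel driver unbinds; drop its claim on the group. */
void toy_unuse_default_domain(struct toy_group *g)
{
	if (g->owner_cnt && !g->owner)
		g->owner_cnt--;
}

/* A userspace framework claims the whole group, identified by a cookie. */
int toy_claim_dma_owner(struct toy_group *g, void *owner)
{
	if (!owner)
		return -EINVAL;
	if (g->owner_cnt)
		return -EBUSY;	/* a kernel driver or another owner holds it */
	g->owner = owner;
	g->owner_cnt = 1;
	return 0;
}

/* The userspace framework gives the group back. */
void toy_release_dma_owner(struct toy_group *g)
{
	if (g->owner) {
		g->owner = NULL;
		g->owner_cnt = 0;
	}
}

/* What a bus's DMA setup step at driver bind might do in this model. */
int toy_bus_dma_configure(struct toy_group *g, int driver_managed_dma)
{
	if (driver_managed_dma)
		return 0;	/* the driver takes care of ownership itself */
	return toy_use_default_domain(g);
}
```

In this model, binding an ordinary driver makes a later userspace claim
fail until the driver unbinds, and a successful userspace claim makes
ordinary driver binding fail, while a driver that sets driver_managed_dma
skips the default claim altogether.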

* Re: [PATCH v6 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-18 15:51   ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-18 15:51 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Robin Murphy, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, Dmitry Osipenko, iommu,
	linux-pci, kvm, linux-kernel

On Fri, Feb 18, 2022 at 08:55:10AM +0800, Lu Baolu wrote:
> Hi folks,
> 
> The iommu group is the minimal isolation boundary for DMA. Devices in
> a group can access each other's MMIO registers via peer to peer DMA
> and also need to share the same I/O address space.
> 
> Once the I/O address space is assigned to user control it is no longer
> available to the dma_map* API, which effectively makes the DMA API
> non-working.
> 
> Second, userspace can use DMA initiated by a device that it controls
> to access the MMIO spaces of other devices in the group. This allows
> userspace to indirectly attack any kernel owned device and its driver.

This series has changed quite a lot since v1 - but I couldn't spot
anything wrong with this. It is a small incremental step and I think
it is fine now, so 

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

I hope you continue to work on the "Scrap iommu_attach/detach_group()
interfaces" series and try to minimize all the special places testing
against the default domain.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-18  0:55   ` Lu Baolu
@ 2022-02-19  7:31     ` Christoph Hellwig
  -1 siblings, 0 replies; 90+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:31 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter, Robin Murphy

The overall API and patch look fine, but:

> + * iommu_group_dma_owner_claimed() - Query group dma ownership status
> + * @group: The group.
> + *
> + * This provides status query on a given group. It is racey and only for
> + * non-binding status reporting.

s/racey/racy/

> + */
> +bool iommu_group_dma_owner_claimed(struct iommu_group *group)
> +{
> +	unsigned int user;
> +
> +	mutex_lock(&group->mutex);
> +	user = group->owner_cnt;
> +	mutex_unlock(&group->mutex);
> +
> +	return user;
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);

Still no need for the lock here.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-18  0:55   ` Lu Baolu
@ 2022-02-19  7:32     ` Christoph Hellwig
  -1 siblings, 0 replies; 90+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:32 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter, Robin Murphy

So we are back to the callback madness instead of the nice and simple
flag?  Sigh.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-02-18 15:51   ` Jason Gunthorpe via iommu
@ 2022-02-21  3:38     ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-21  3:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, Greg Kroah-Hartman, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Christoph Hellwig, Kevin Tian, Ashok Raj,
	Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel

On 2/18/22 11:51 PM, Jason Gunthorpe wrote:
> On Fri, Feb 18, 2022 at 08:55:10AM +0800, Lu Baolu wrote:
>> Hi folks,
>>
>> The iommu group is the minimal isolation boundary for DMA. Devices in
>> a group can access each other's MMIO registers via peer to peer DMA
>> and also need to share the same I/O address space.
>>
>> Once the I/O address space is assigned to user control it is no longer
>> available to the dma_map* API, which effectively makes the DMA API
>> non-working.
>>
>> Second, userspace can use DMA initiated by a device that it controls
>> to access the MMIO spaces of other devices in the group. This allows
>> userspace to indirectly attack any kernel owned device and its driver.
> 
> This series has changed quite a lot since v1 - but I couldn't spot
> anything wrong with this. It is a small incremental step and I think
> it is fine now, so
> 
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> I hope you continue to work on the "Scrap iommu_attach/detach_group()
> interfaces" series and try to minimize all the special places testing
> against the default domain.

Sure.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-19  7:31     ` Christoph Hellwig
@ 2022-02-21  4:02       ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-21  4:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: baolu.lu, Greg Kroah-Hartman, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Jason Gunthorpe, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter, Robin Murphy

On 2/19/22 3:31 PM, Christoph Hellwig wrote:
> The overall API and patch look fine, but:
> 
>> + * iommu_group_dma_owner_claimed() - Query group dma ownership status
>> + * @group: The group.
>> + *
>> + * This provides status query on a given group. It is racey and only for
>> + * non-binding status reporting.
> 
> s/racey/racy/

Yes.

> 
>> + */
>> +bool iommu_group_dma_owner_claimed(struct iommu_group *group)
>> +{
>> +	unsigned int user;
>> +
>> +	mutex_lock(&group->mutex);
>> +	user = group->owner_cnt;
>> +	mutex_unlock(&group->mutex);
>> +
>> +	return user;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
> 
> Still no need for the lock here.

We've discussed this before. I tend to think that is right.

We don't lose anything with this lock held and it also follows the rule
that all accesses to the internal group structure must be done with the
group->mutex held.
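
For illustration only, the rule above can be modelled in userspace C. This
is a rough sketch, not the kernel code: struct group_model and
group_dma_owner_claimed() are hypothetical stand-ins for struct iommu_group
and the helper quoted above, and a pthread mutex plays the role of
group->mutex. The returned value is only a snapshot and may be stale as soon
as the lock is dropped, which is why the kernel-doc describes the query as
racy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct iommu_group. */
struct group_model {
	pthread_mutex_t mutex;   /* plays the role of group->mutex */
	unsigned int owner_cnt;  /* number of outstanding ownership claims */
};

/*
 * Report whether DMA ownership is currently claimed. The field is read
 * under the mutex to follow the "all accesses under group->mutex" rule;
 * the answer can still be out of date once the lock is released.
 */
bool group_dma_owner_claimed(struct group_model *group)
{
	unsigned int user;

	assert(group != NULL);

	pthread_mutex_lock(&group->mutex);
	user = group->owner_cnt;
	pthread_mutex_unlock(&group->mutex);

	return user != 0;
}
```

The lock costs little here because the query is not on a fast path; whether
it is strictly needed for a single read is the point still being debated.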

Best regards,
baolu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-19  7:32     ` Christoph Hellwig
@ 2022-02-21 20:43       ` Robin Murphy
  -1 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-21 20:43 UTC (permalink / raw)
  To: Christoph Hellwig, Lu Baolu
  Cc: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Kevin Tian, Ashok Raj, kvm, rafael,
	David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On 2022-02-19 07:32, Christoph Hellwig wrote:
> So we are back to the callback madness instead of the nice and simple
> flag?  Sigh.

TBH, I *think* this part could be a fair bit simpler. It looks like this 
whole callback mess is effectively just to decrement group->owner_cnt, 
but since we should only care about ownership at probe, hotplug, and 
other places well outside critical fast-paths, I'm not sure we really 
need to keep track of that anyway - it can always be recalculated by 
walking the group->devices list, and some of the relevant places have to 
do that anyway. It should be pretty straightforward for 
iommu_bus_notifier to clear group->owner automatically upon an unbind of 
the matching driver when it's no longer bound to any other devices in 
the group either. And if we still want to entertain the notion of VFIO 
being able to release ownership without unbinding (I'm not entirely 
convinced that's a realistically necessary use-case) then it should be 
up to VFIO to decide when it's finally finished with the whole group, 
rather than pretending we can keep track of nested ownership claims from 
inside the API.
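
A rough userspace sketch of that idea, under loud assumptions: the types
below (driver_model, device_model, group_model) and both functions are
hypothetical, the group's owner is assumed to be comparable against the
driver pointer, and locking is deliberately left out. It only shows that
"is anything in the group still bound to this driver?" can be recomputed
from the device list rather than cached in a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct driver_model {
	const char *name;
};

struct device_model {
	const struct driver_model *driver;  /* NULL when unbound */
	struct device_model *next;          /* next device in the same group */
};

struct group_model {
	struct device_model *devices;       /* stands in for group->devices */
	const void *owner;                  /* stands in for group->owner */
};

/* True if any device in the group is still bound to @drv. */
bool group_has_device_bound_to(const struct group_model *group,
			       const struct driver_model *drv)
{
	for (const struct device_model *dev = group->devices; dev; dev = dev->next)
		if (dev->driver == drv)
			return true;
	return false;
}

/* Model of an unbind event: clear the owner once the last user is gone. */
void group_unbind_notify(struct group_model *group, struct device_model *dev,
			 const struct driver_model *drv)
{
	assert(dev->driver == drv);
	dev->driver = NULL;

	if (group->owner == drv && !group_has_device_bound_to(group, drv))
		group->owner = NULL;
}
```

With two devices bound to the same driver, the first unbind leaves the owner
in place and the second one clears it.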

Furthermore, if Greg was willing to compromise just far enough to let us
put driver_managed_dma in the 3-byte hole in the generic struct
device_driver, we wouldn't have to have quite so much boilerplate
repeated across the various bus implementations (I'm not suggesting
moving any actual calls back into the driver core, just the storage of
the flag itself). FWIW I have some ideas for re-converging .dma_configure
in future which I think should probably be able to subsume this into a
completely generic common path, given a common flag.
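
To make the "hole" argument concrete, here is a small sketch with a
simplified, hypothetical layout (drv_before and drv_after are not the real
struct device_driver; they only assume a bool immediately followed by a
4-byte enum, and a 1-byte bool). On common ABIs the compiler pads the bool
out to the enum's alignment, so a second bool placed there should not grow
the structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static_assert(sizeof(bool) == 1, "sketch assumes a 1-byte bool");

enum probe_type_model { PROBE_DEFAULT, PROBE_PREFER_ASYNC, PROBE_FORCE_SYNC };

/* Simplified layout before the change: a bool followed by an enum. */
struct drv_before {
	const char *name;
	bool suppress_bind_attrs;
	enum probe_type_model probe_type;
};

/* Same layout with the extra flag placed right after the existing bool. */
struct drv_after {
	const char *name;
	bool suppress_bind_attrs;
	bool driver_managed_dma;
	enum probe_type_model probe_type;
};

/* Padding bytes between the bool and the enum in the original layout. */
size_t hole_bytes_before(void)
{
	return offsetof(struct drv_before, probe_type) -
	       (offsetof(struct drv_before, suppress_bind_attrs) + sizeof(bool));
}

/* True if adding the flag left the structure size unchanged. */
bool flag_fits_in_hole(void)
{
	return sizeof(struct drv_after) == sizeof(struct drv_before);
}
```

The exact numbers depend on the ABI; pahole on a real build is the
authoritative way to check the actual structure.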

Robin.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-21 20:43       ` Robin Murphy
@ 2022-02-21 23:48         ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-21 23:48 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Lu Baolu, Greg Kroah-Hartman, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Mon, Feb 21, 2022 at 08:43:33PM +0000, Robin Murphy wrote:
> On 2022-02-19 07:32, Christoph Hellwig wrote:
> > So we are back to the callback madness instead of the nice and simple
> > flag?  Sigh.
> 
> TBH, I *think* this part could be a fair bit simpler. It looks like this
> whole callback mess is effectively just to decrement
> group->owner_cnt, but

Right, the new callback is because of Greg's push to put all the work
into the existing bus callback. Having symmetrical callbacks is
cleaner.

> since we should only care about ownership at probe, hotplug, and other
> places well outside critical fast-paths, I'm not sure we really need to keep
> track of that anyway - it can always be recalculated by walking the
> group->devices list, 

It has to be locked against concurrent probe, and there isn't
currently any locking scheme that can support this. The owner_cnt is
effectively a new lock for this purpose. It is the same issue we
talked about with that VFIO patch you showed me.

So, using the group->device_list would require adding something else
somewhere - which I think should happen when someone has
justification for another use of whatever that something else is.

Also, Greg did have an objection to the first version, with code
living in dd.c, that was basically probe time performance. I'm not
sure making this slower would really be welcomed..

> and some of the relevant places have to do that anyway.

???

> It should be pretty straightforward for
> iommu_bus_notifier to clear group->owner automatically upon an
> unbind of the matching driver when it's no longer bound to any other
> devices in the group either.

That not_bound/unbind notifier isn't currently triggered during
necessary failure paths of really_probe().

Even if this was patched up, it looks like spaghetti to me..

> use-case) then it should be up to VFIO to decide when it's finally
> finished with the whole group, rather than pretending we can keep
> track of nested ownership claims from inside the API.

What nesting?
 
> Furthermore, if Greg was willing to compromise just far enough to let us put
> driver_managed_dma in the 3-byte hole in the generic struct
> device_driver,

Space was not an issue, the earlier version of this switched an
existing bool to a bitfield.

> we wouldn't have to have quite so much boilerplate repeated across the
> various bus implementations (I'm not suggesting to move any actual calls
> back into the driver core, just the storage of flag itself). 

Not sure that makes sense.. But I don't understand why we need to copy
and paste this code into every bus's dma_configure *shrug*

> FWIW I have some ideas for re-converging .dma_configure in future
> which I think should probably be able to subsume this into a
> completely generic common path, given a common flag.

This would be great!

Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-21 23:48         ` Jason Gunthorpe via iommu
@ 2022-02-22  4:48           ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-22  4:48 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy
  Cc: kvm, rafael, David Airlie, linux-pci, Thierry Reding,
	Diana Craciun, Dmitry Osipenko, Will Deacon, Ashok Raj,
	Jonathan Hunter, Christoph Hellwig, Stuart Yoder, Kevin Tian,
	Chaitanya Kulkarni, Alex Williamson, Bjorn Helgaas, Dan Williams,
	Greg Kroah-Hartman, Cornelia Huck, linux-kernel, Li Yang, iommu,
	Jacob jun Pan, Daniel Vetter

On 2/22/22 7:48 AM, Jason Gunthorpe wrote:
>> since we should only care about ownership at probe, hotplug, and other
>> places well outside critical fast-paths, I'm not sure we really need to keep
>> track of that anyway - it can always be recalculated by walking the
>> group->devices list,
> It has to be locked against concurrent probe, and there isn't
> currently any locking scheme that can support this. The owner_cnt is
> effectively a new lock for this purpose. It is the same issue we
> talked about with that VFIO patch you showed me.
> 
> So, using the group->device_list would require adding something else
> somewhere - which I think should happen when someone has
> justification for another use of whatever that something else is.

This series originated from a similar idea: adding some fields to the
driver structure and intercepting them in the iommu core. We stopped doing
that due to the lack of a locking mechanism between the iommu and driver
cores. It then evolved into what it is today.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-21 23:48         ` Jason Gunthorpe via iommu
@ 2022-02-22 10:58           ` Robin Murphy
  -1 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-22 10:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Lu Baolu, Greg Kroah-Hartman, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On 2022-02-21 23:48, Jason Gunthorpe wrote:
> On Mon, Feb 21, 2022 at 08:43:33PM +0000, Robin Murphy wrote:
>> On 2022-02-19 07:32, Christoph Hellwig wrote:
>>> So we are back to the callback madness instead of the nice and simple
>>> flag?  Sigh.
>>
>> TBH, I *think* this part could be a fair bit simpler. It looks like this
>> whole callback mess is effectively just to decrement
>> group->owner_cnt, but
> 
> Right, the new callback is because of Greg's push to put all the work
> into the existing bus callback. Having symmetrical callbacks is
> cleaner.

I'll continue to disagree that having tons more code purely for the sake 
of it is cleaner. The high-level requirements are fundamentally 
asymmetrical - ownership has to be actively claimed by the bus code at a 
point during probe where it can block probing if necessary, but it can 
be released anywhere at all during remove since that cannot fail. I 
don't personally see the value in a bunch of code bloat for no reason 
other than trying to pretend that an asymmetrical thing isn't.

We already have other concepts in the IOMMU API, like the domain ops 
lifecycle, which are almost self-contained but for needing an external 
prod to get started, so I'm naturally viewing this one the same way.

>> since we should only care about ownership at probe, hotplug, and other
>> places well outside critical fast-paths, I'm not sure we really need to keep
>> track of that anyway - it can always be recalculated by walking the
>> group->devices list,
> 
> It has to be locked against concurrent probe, and there isn't
> currently any locking scheme that can support this. The owner_cnt is
> effectively a new lock for this purpose. It is the same issue we
> talked about with that VFIO patch you showed me.

Huh? How hard is it to hold group->mutex when reading or writing 
group->owner? Walking the list would only have to be done for 
*releasing* ownership and I'm pretty sure all the races there are benign 
- only probe/remove of the driver (or DMA API token) matching a current 
non-NULL owner matter; if two removes race, the first might end up 
releasing ownership "early", but the second is waiting to do that anyway 
so it's OK; if a remove races with a probe, the remove may end up 
leaving the owner set, but the probe is waiting to do that anyway so 
it's OK.

> So, using the group->device_list would require adding something else
> somewhere - which I think should happen when someone has
> justification for another use of whatever that something else is.
> 
> Also, Greg did have an objection to the first version, with code
> living in dd.c, that was basically probe time performance. I'm not
> sure making this slower would really be welcomed..

Again, this does not affect probe at all, only remove, and TBH I'd 
expect the performance impact to be negligible. On any sensible system, 
IOMMU groups are not large. Heck, in the typical case I'd guess it's no 
worse than the time we currently spend on group notifiers. I was just 
making the point that there should not be a significant performance 
argument for needing to cache a count value.

>> and some of the relevant places have to do that anyway.
> 
> ???

I was looking at iommu_group_remove_device() at the time, but of course 
we should always have seen an unbind before we get there - that one's on 
me, sorry for the confusion.

>> It should be pretty straightforward for
>> iommu_bus_notifier to clear group->owner automatically upon an
>> unbind of the matching driver when it's no longer bound to any other
>> devices in the group either.
> 
> That not_bound/unbind notifier isn't currently triggered during
> necessary failure paths of really_probe().

Eh? Just look at the context of patch #2, let alone the rest of the 
function, and tell me how, if we can't rely on 
BUS_NOTIFY_DRIVER_NOT_BOUND, calling .dma_cleanup *from the exact same 
place* is somehow more reliable?

AFAICS, a notifier handling both BUS_NOTIFY_UNBOUND_DRIVER and 
BUS_NOTIFY_DRIVER_NOT_BOUND would be directly equivalent to the callers 
of .dma_cleanup here.
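
As a sketch of what such a notifier would look like (a userspace model
only: the event enum, dev_state_model and bus_notifier_model() are made-up
names, and the enum values are placeholders rather than the kernel's
BUS_NOTIFY_* constants), both "driver was unbound" and "probe failed,
driver not bound" funnel into the same cleanup step:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder event codes; the kernel defines its own values. */
enum bus_event_model {
	EV_BOUND_DRIVER,
	EV_UNBOUND_DRIVER,    /* models BUS_NOTIFY_UNBOUND_DRIVER */
	EV_DRIVER_NOT_BOUND,  /* models BUS_NOTIFY_DRIVER_NOT_BOUND */
};

/* Hypothetical per-device state tracked by the model. */
struct dev_state_model {
	bool dma_claimed;
};

/* Returns true if the event triggered DMA ownership cleanup. */
bool bus_notifier_model(struct dev_state_model *dev, enum bus_event_model ev)
{
	assert(dev != NULL);

	switch (ev) {
	case EV_UNBOUND_DRIVER:    /* normal remove path */
	case EV_DRIVER_NOT_BOUND:  /* probe failure path */
		dev->dma_claimed = false;
		return true;
	default:
		return false;      /* other events are ignored here */
	}
}
```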

> Even if this was patched up, it looks like spaghetti to me..
> 
>> use-case) then it should be up to VFIO to decide when it's finally
>> finished with the whole group, rather than pretending we can keep
>> track of nested ownership claims from inside the API.
> 
> What nesting?

The current implementation of iommu_group_claim_dma_owner() allows 
owner_cnt to increase beyond 1, and correspondingly requires 
iommu_group_release_dma_owner() to be called the same number of times. 
It doesn't appear that VFIO needs that, and I'm not sure I'd trust any 
other potential users to get it right either.
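
The counting behaviour in question can be modelled like this. It is a
simplified userspace sketch, not the patch itself: group_model and the two
functions are hypothetical, locking and domain handling are omitted, and
the -EBUSY return value is an assumption made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the ownership fields of struct iommu_group. */
struct group_model {
	void *owner;             /* opaque cookie identifying the owner */
	unsigned int owner_cnt;  /* outstanding claims by that owner */
};

/* Claim ownership; repeated claims by the same owner just bump the count. */
int group_claim_dma_owner(struct group_model *group, void *owner)
{
	if (group->owner_cnt && group->owner != owner)
		return -EBUSY;   /* already held by a different owner */

	group->owner = owner;
	group->owner_cnt++;
	return 0;
}

/* Release one claim; the owner is cleared only when the count hits zero. */
void group_release_dma_owner(struct group_model *group)
{
	assert(group->owner_cnt > 0);  /* unbalanced release is a caller bug */

	if (--group->owner_cnt == 0)
		group->owner = NULL;
}
```

Two claims by the same owner need two releases before the group is free
again, which is the balancing burden on callers described above.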

>> Furthermore, if Greg was willing to compromise just far enough to let us put
>> driver_managed_dma in the 3-byte hole in the generic struct
>> device_driver,
> 
> Space was not an issue, the earlier version of this switched an
> existing bool to a bitfield.
> 
>> we wouldn't have to have quite so much boilerplate repeated across the
>> various bus implementations (I'm not suggesting to move any actual calls
>> back into the driver core, just the storage of flag itself).
> 
> Not sure that makes sense.. But I don't understand why we need to copy
> and paste this code into every bus's dma_configure *shrug*

That's what I'm saying - right now every bus *has* to have a specific 
.dma_configure implementation if only to retrieve a 
semantically-identical flag from each bus-specific structure; there is 
zero possible code-sharing. With a generically-defined flag, there is 
some possibility for code-sharing now (e.g. patch #3 wouldn't be 
needed), and the potential for more in future.
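
A minimal sketch of the code-sharing point, with hypothetical types
throughout (device_driver_model, pci_driver_model, platform_driver_model
and driver_uses_default_domain() are illustrative names only): once the
flag lives in the generic structure, one bus-agnostic helper can read it
for any bus-specific driver type that embeds that structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generic driver structure carrying the shared flag. */
struct device_driver_model {
	const char *name;
	bool driver_managed_dma;
};

/* Two bus-specific driver types embedding the generic one. */
struct pci_driver_model {
	struct device_driver_model driver;
	unsigned int num_ids;
};

struct platform_driver_model {
	struct device_driver_model driver;
	bool prevent_deferred_probe;
};

/*
 * One helper for every bus: a driver that does not manage DMA itself is
 * assumed to rely on the kernel's default domain.
 */
bool driver_uses_default_domain(const struct device_driver_model *drv)
{
	assert(drv != NULL);
	return !drv->driver_managed_dma;
}
```

With a per-bus flag, each bus would instead need its own accessor just to
fetch the same bit from its own structure.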

>> FWIW I have some ideas for re-converging .dma_configure in future
>> which I think should probably be able to subsume this into a
>> completely generic common path, given a common flag.
> 
> This would be great!

Indeed, so if we're enthusiastic about future cleanup that necessitates 
a generic flag, why not make the flag generic to start with?

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-22 10:58           ` Robin Murphy
@ 2022-02-22 15:16             ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-22 15:16 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Lu Baolu, Greg Kroah-Hartman, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Tue, Feb 22, 2022 at 10:58:37AM +0000, Robin Murphy wrote:
> On 2022-02-21 23:48, Jason Gunthorpe wrote:
> > On Mon, Feb 21, 2022 at 08:43:33PM +0000, Robin Murphy wrote:
> > > On 2022-02-19 07:32, Christoph Hellwig wrote:
> > > > So we are back to the callback madness instead of the nice and simple
> > > > flag?  Sigh.
> > > 
> > > TBH, I *think* this part could be a fair bit simpler. It looks like this
> > > whole callback mess is effectively just to decrement
> > > group->owner_cnt, but
> > 
> > Right, the new callback is because of Greg's push to put all the work
> > into the existing bus callback. Having symmetrical callbacks is
> > cleaner.
> 
> I'll continue to disagree that having tons more code purely for the sake of
> it is cleaner. The high-level requirements are fundamentally asymmetrical -
> ownership has to be actively claimed by the bus code at a point during probe
> where it can block probing if necessary, but it can be released anywhere at
> all during remove since that cannot fail. I don't personally see the value
> in a bunch of code bloat for no reason other than trying to pretend that an
> asymmetrical thing isn't.

Then we should put this in the shared core code like most of us want.

If we are doing this distorted thing then it may as well make some
kind of self-consistent sense with a configure/unconfigure op pair.

> group->owner?  Walking the list would only have to be done for *releasing*
> ownership and I'm pretty sure all the races there are benign - only
> probe/remove of the driver (or DMA API token) matching a current non-NULL
> owner matter; if two removes race, the first might end up releasing
> ownership "early", but the second is waiting to do that anyway so it's OK;
> if a remove races with a probe, the remove may end up leaving the owner set,
> but the probe is waiting to do that anyway so it's OK.

With a lockless algorithm the race is probably wrongly releasing an
ownership that probe just set in the multi-device group case.

Still not sure I see what you are thinking though..

How did we get from adding a few simple lines to dd.c into building
some complex lockless algorithm and hoping we did it right?

> > > It should be pretty straightforward for
> > > iommu_bus_notifier to clear group->owner automatically upon an
> > > unbind of the matching driver when it's no longer bound to any other
> > > devices in the group either.
> > 
> > That not_bound/unbind notifier isn't currently triggered during
> > necessary failure paths of really_probe().
> 
> Eh? Just look at the context of patch #2, let alone the rest of the
> function, and tell me how, if we can't rely on BUS_NOTIFY_DRIVER_NOT_BOUND,
> calling .dma_cleanup *from the exact same place* is somehow more reliable?

Yeah, OK

> AFAICS, a notifier handling both BUS_NOTIFY_UNBOUND_DRIVER and
> BUS_NOTIFY_DRIVER_NOT_BOUND would be directly equivalent to the callers of
> .dma_cleanup here.

Yes, but why hide this in a notifier? It is still spaghetti.

> > > use-case) then it should be up to VFIO to decide when it's finally
> > > finished with the whole group, rather than pretending we can keep
> > > track of nested ownership claims from inside the API.
> > 
> > What nesting?
> 
> The current implementation of iommu_group_claim_dma_owner() allows owner_cnt
> to increase beyond 1, and correspondingly requires
> iommu_group_release_dma_owner() to be called the same number of times. It
> doesn't appear that VFIO needs that, and I'm not sure I'd trust any other
> potential users to get it right either.

That isn't for "nesting"; it is keeping track of multi-device
groups. Each count represents a device, not a nest.

> > > FWIW I have some ideas for re-converging .dma_configure in future
> > > which I think should probably be able to subsume this into a
> > > completely generic common path, given a common flag.
> > 
> > This would be great!
> 
> Indeed, so if we're enthusiastic about future cleanup that necessitates a
> generic flag, why not make the flag generic to start with?

Maybe when someone has patches to delete the bus ops completely they
can convince Greg. The good news is that it isn't much work to flip
the flag, Lu has already done it 3 times in the previous versions..

It has already been 8 weeks on this point, let's just move on please.

Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-22 15:16             ` Jason Gunthorpe via iommu
@ 2022-02-22 21:18               ` Robin Murphy
  -1 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-22 21:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Lu Baolu, Greg Kroah-Hartman, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On 2022-02-22 15:16, Jason Gunthorpe wrote:
> On Tue, Feb 22, 2022 at 10:58:37AM +0000, Robin Murphy wrote:
>> On 2022-02-21 23:48, Jason Gunthorpe wrote:
>>> On Mon, Feb 21, 2022 at 08:43:33PM +0000, Robin Murphy wrote:
>>>> On 2022-02-19 07:32, Christoph Hellwig wrote:
>>>>> So we are back to the callback madness instead of the nice and simple
>>>>> flag?  Sigh.
>>>>
>>>> TBH, I *think* this part could be a fair bit simpler. It looks like this
>>>> whole callback mess is effectively just to decrement
>>>> group->owner_cnt, but
>>>
>>> Right, the new callback is because of Greg's push to put all the work
>>> into the existing bus callback. Having symmetrical callbacks is
>>> cleaner.
>>
>> I'll continue to disagree that having tons more code purely for the sake of
>> it is cleaner. The high-level requirements are fundamentally asymmetrical -
>> ownership has to be actively claimed by the bus code at a point during probe
>> where it can block probing if necessary, but it can be released anywhere at
>> all during remove since that cannot fail. I don't personally see the value
>> in a bunch of code bloat for no reason other than trying to pretend that an
>> asymmetrical thing isn't.
> 
> Then we should put this in the shared core code like most of us want.
> 
> If we are doing this distorted thing then it may as well make some
> kind of self-consistent sense with a configure/unconfigure op pair.
> 
>> group->owner?  Walking the list would only have to be done for *releasing*
>> ownership and I'm pretty sure all the races there are benign - only
>> probe/remove of the driver (or DMA API token) matching a current non-NULL
>> owner matter; if two removes race, the first might end up releasing
>> ownership "early", but the second is waiting to do that anyway so it's OK;
>> if a remove races with a probe, the remove may end up leaving the owner set,
>> but the probe is waiting to do that anyway so it's OK.
> 
> With a lockless algorithm the race is probably wrongly releasing an
> ownership that probe just set in the multi-device group case.
> 
> Still not sure I see what you are thinking though..

What part of "How hard is it to hold group->mutex when reading or 
writing group->owner?" sounded like "complex lockless algorithm", exactly?

To spell it out, the scheme I'm proposing looks like this:

probe/claim:
	void *owner = driver_or_DMA_API_token(dev);//oversimplification!
	int ret = 0;

	if (owner) {
		mutex_lock(&group->mutex);
		if (!group->owner)
			group->owner = owner;	/* first claimant takes the group */
		else if (group->owner != owner)
			ret = -EBUSY;		/* held by a different owner */
		mutex_unlock(&group->mutex);
	}

remove:
	bool still_owned = false;

	mutex_lock(&group->mutex);
	list_for_each_entry(tmp, &group->devices, list) {
		void *owner = driver_or_DMA_API_token(tmp);

		/* skip the departing device and anything not tied to the owner */
		if (tmp == dev || !owner || owner != group->owner)
			continue;
		still_owned = true;
		break;
	}
	if (!still_owned)
		group->owner = NULL;
	mutex_unlock(&group->mutex);

Of course now that I've made it more concrete I realise that the remove 
hook does need to run *after* dev->driver is cleared, so not quite 
"anywhere at all", but the main point remains: as long as actual changes 
of ownership are always serialised, even if the list walk in the remove 
hook sees "future" information WRT other devices' drivers, at worst it 
should merely short-cut to a corresponding pending reclaim of ownership.
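
For readers following along, here is a minimal userspace sketch of the
claim/remove scheme described above. It is only a model under stated
assumptions: model_dev, model_group, model_owner_token(), model_claim() and
model_release() are hypothetical names, a pthread mutex stands in for
group->mutex, a fixed array stands in for the group->devices list, and the
bound driver pointer is used as the ownership token (the same
oversimplification as in the pseudocode).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEVS 8

/* Hypothetical stand-in for struct device: only the bound driver matters. */
struct model_dev {
	const void *driver;	/* NULL when no driver is bound */
};

/* Hypothetical stand-in for struct iommu_group. */
struct model_group {
	pthread_mutex_t mutex;
	const void *owner;	/* NULL when nobody owns the group */
	struct model_dev *devs[MODEL_MAX_DEVS];
	size_t ndevs;
};

/* Oversimplified on purpose: the bound driver is the ownership token. */
static const void *model_owner_token(const struct model_dev *dev)
{
	return dev->driver;
}

/* probe/claim: returns -EBUSY if a different owner already holds the group. */
static int model_claim(struct model_group *group, struct model_dev *dev)
{
	const void *owner = model_owner_token(dev);
	int ret = 0;

	if (!owner)
		return 0;

	pthread_mutex_lock(&group->mutex);
	if (!group->owner)
		group->owner = owner;
	else if (group->owner != owner)
		ret = -EBUSY;
	pthread_mutex_unlock(&group->mutex);
	return ret;
}

/* remove: expected to run after dev->driver has been cleared. */
static void model_release(struct model_group *group, struct model_dev *dev)
{
	bool still_owned = false;

	pthread_mutex_lock(&group->mutex);
	for (size_t i = 0; i < group->ndevs; i++) {
		struct model_dev *tmp = group->devs[i];
		const void *owner = model_owner_token(tmp);

		/* Ignore the departing device and devices unrelated to the owner. */
		if (tmp == dev || !owner || owner != group->owner)
			continue;
		still_owned = true;
		break;
	}
	if (!still_owned)
		group->owner = NULL;
	pthread_mutex_unlock(&group->mutex);
}
```

The model keeps no counter: ownership is recomputed from the device list on
every release, which is why the release path has to see the departing
device's driver already cleared (or skip that device explicitly, as done
here).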

> How did we get from adding a few simple lines to dd.c into building
> some complex lockless algorithm and hoping we did it right?

Because the current alternative to adding a few simple lines to dd.c is 
adding loads of lines all over the place to end up calling back into 
common IOMMU code, to do something I'm 99% certain the common IOMMU code 
could do for itself in private. That said, having worked through the 
above, it does start looking like a bit of a big change for this series 
at this point, so I'm happy to keep it on the back burner for when I 
have to rip .dma_configure to pieces anyway.

According to lockdep, I think I've solved the VFIO locking issue 
provided vfio_group_viable() goes away, so I'm certainly keen not to 
delay that for another cycle!

>>>> It should be pretty straightforward for
>>>> iommu_bus_notifier to clear group->owner automatically upon an
>>>> unbind of the matching driver when it's no longer bound to any other
>>>> devices in the group either.
>>>
>>> That not_bound/unbind notifier isn't currently triggered during
>>> necessary failure paths of really_probe().
>>
>> Eh? Just look at the context of patch #2, let alone the rest of the
>> function, and tell me how, if we can't rely on BUS_NOTIFY_DRIVER_NOT_BOUND,
>> calling .dma_cleanup *from the exact same place* is somehow more reliable?
> 
> Yeah, OK
> 
>> AFAICS, a notifier handling both BUS_NOTIFY_UNBOUND_DRIVER and
>> BUS_NOTIFY_DRIVER_NOT_BOUND would be directly equivalent to the callers of
>> .dma_cleanup here.
> 
> Yes, but why hide this in a notifier? It is still spaghetti.

Quick quiz!

1: The existing IOMMU group management has spent the last 10 years being 
driven from:

   A - All over random bits of bus code and the driver core
   B - A private bus notifier


2: The functionality that this series replaces and improves upon was 
split between VFIO and...

   A - Random bits of bus code and the driver core
   B - The same private bus notifier

>>>> use-case) then it should be up to VFIO to decide when it's finally
>>>> finished with the whole group, rather than pretending we can keep
>>>> track of nested ownership claims from inside the API.
>>>
>>> What nesting?
>>
>> The current implementation of iommu_group_claim_dma_owner() allows owner_cnt
>> to increase beyond 1, and correspondingly requires
>> iommu_group_release_dma_owner() to be called the same number of times. It
>> doesn't appear that VFIO needs that, and I'm not sure I'd trust any other
>> potential users to get it right either.
> 
> That isn't for "nesting"; it is keeping track of multi-device
> groups. Each count represents a device, not a nest.

I was originally going to say "recursion", but then thought that might 
carry too much risk of misinterpretation, oh well. Hold your favourite 
word for "taking a mutual-exclusion token that you already hold" in mind 
and read my paragraph quoted above again. I'm not talking about 
automatic DMA API claiming, that clearly happens per-device; I'm talking 
about explicit callers of iommu_group_claim_dma_owner(). Does VFIO call 
that multiple times for individual devices? No. Should it? No. Is it 
reasonable that any other future callers should need to? I don't think 
so. Would things be easier to reason about if we just disallowed it 
outright? For sure.

>>>> FWIW I have some ideas for re-converging .dma_configure in future
>>>> which I think should probably be able to subsume this into a
>>>> completely generic common path, given a common flag.
>>>
>>> This would be great!
>>
>> Indeed, so if we're enthusiastic about future cleanup that necessitates a
>> generic flag, why not make the flag generic to start with?
> 
> Maybe when someone has patches to delete the bus ops completely they
> can convince Greg. The good news is that it isn't much work to flip
> the flag, Lu has already done it 3 times in the previous versions..
> 
> It has already been 8 weeks on this point, let's just move on please.

Sure, if it was rc7 with the merge window looming I'd be saying "this is 
close enough, let's get it in now and fix the small stuff next cycle". 
However while there's still potentially time to get things right first 
time, I for one am going to continue to point them out because I'm not a 
fan of avoidable churn. I'm sorry I haven't had a chance to look 
properly at this series between v1 and v6, but that's just how things 
have been.

Robin.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-22 21:18               ` Robin Murphy
@ 2022-02-22 23:53                 ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-22 23:53 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Lu Baolu, Greg Kroah-Hartman, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Tue, Feb 22, 2022 at 09:18:23PM +0000, Robin Murphy wrote:

> > Still not sure I see what you are thinking though..
> 
> What part of "How hard is it to hold group->mutex when reading or writing
> group->owner?" sounded like "complex lockless algorithm", exactly?

group->owner is not the issue; this series already uses the group
lock to protect owner_cnt and owner.

The problem is how you inspect each struct device that the walk
iterates over to decide whether it is still using the DMA API. Hint: this
is why I keep mentioning device_lock(), as it is the only locking we
have for this today.

> To spell it out, the scheme I'm proposing looks like this:

Well, I already got this; what matters is what goes into
driver_or_DMA_API_token().

I think you are suggesting to do something like:

   if (!READ_ONCE(dev->driver) ||  ???)
       return NULL;
   return group;  // A DMA_API 'token'

Which is locklessly reading dev->driver, and why you are talking about
races, I guess.

> remove:
>        bool still_owned = false;
>        mutex_lock(&group->mutex);
>        list_for_each_entry(tmp, &group->devices, list) {
>                void *owner = driver_or_DMA_API_token(tmp);
>                if (tmp == dev || !owner || owner != group->owner)

And here you expect this will never be called if a group is owned by
VFIO? Which bakes in that weird behavior of really_probe() that only
some errors deserve to get a notifier?

How does the next series work? The iommu_attach_device() work relies
on the owner_cnt too. I don't think this list can replace it there.

> always serialised, even if the list walk in the remove hook sees "future"
> information WRT other devices' drivers, at worst it should merely short-cut
> to a corresponding pending reclaim of ownership.

Depending on what the actual logic is I could accept this argument,
assuming it came with a WRITE_ONCE on the store side and we all
thought carefully about how all this is ordered.
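
To make the READ_ONCE()/WRITE_ONCE() pairing concrete, here is a small
userspace sketch. It is an illustration only: relaxed C11 atomics are used as
the closest portable analogue of the kernel macros (an assumption, not an
exact equivalent), lockless_dev, lockless_set_driver() and
lockless_dma_token() are hypothetical names, and the second condition left
open as "???" above is deliberately not filled in.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device: only dev->driver is modelled. */
struct lockless_dev {
	_Atomic(const void *) driver;
};

/*
 * Store side: the analogue of WRITE_ONCE(dev->driver, drv). A relaxed store
 * cannot tear, but it orders nothing else around it.
 */
static void lockless_set_driver(struct lockless_dev *dev, const void *drv)
{
	atomic_store_explicit(&dev->driver, drv, memory_order_relaxed);
}

/*
 * Load side: the analogue of READ_ONCE(dev->driver). Returns the group
 * pointer as the DMA API "token" when a driver is bound, NULL otherwise.
 * Any further condition from the discussion is intentionally left out.
 */
static const void *lockless_dma_token(struct lockless_dev *dev,
				      const void *group)
{
	if (!atomic_load_explicit(&dev->driver, memory_order_relaxed))
		return NULL;
	return group;
}
```

The sketch only shows that both sides have to be annotated together. Whether
a stale or "future" value of dev->driver is harmless still depends on the
ordering argument made in the thread, which this model does not prove.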

> Because the current alternative to adding a few simple lines to dd.c is
> adding loads of lines all over the place to end up calling back into common
> IOMMU code, to do something I'm 99% certain the common IOMMU code
> could do

*shrug* both Christoph and I tried to convince Greg. He never really
explained why, but currently he thinks this is the right way to design
it, and so here we are.

> for itself in private. That said, having worked through the above, it does
> start looking like a bit of a big change for this series at this point, so
> I'm happy to keep it on the back burner for when I have to rip
> .dma_configure to pieces anyway.

OK, thanks.

> According to lockdep, I think I've solved the VFIO locking issue provided
> vfio_group_viable() goes away, so I'm certainly keen not to delay that for
> another cycle!

Indeed.

Keep in mind that lockdep is disabled on the device_lock()..

> paragraph quoted above again. I'm not talking about automatic DMA API
> claiming, that clearly happens per-device; I'm talking about explicit
> callers of iommu_group_claim_dma_owner(). Does VFIO call that multiple times
> for individual devices? No. Should it? No. Is it reasonable that any other
> future callers should need to? I don't think so. Would things be easier to
> reason about if we just disallowed it outright? For sure.

iommufd is device centric and the current draft does call
iommu_group_claim_dma_owner() once for each device. It doesn't have
any reason to track groups, so it has no way to know if it is
"nesting" or not.

I hope the iommu_attach_device() work will progress and iommufd can
eventually call a cleaner device API; it is set up to work this way at
least.

So, yes, future callers currently need the owner_cnt to work right.
(and we are doing all this to allow iommufd and vfio to share the
ownership logic - adding VFIO-like group tracking complexity to
iommufd to save a few bus callbacks is not a win, IMHO)
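
As an illustration of the per-device counting described here, a minimal
userspace sketch follows. It is a model, not the code from the series:
counted_group, counted_claim() and counted_release() are hypothetical names,
a pthread mutex stands in for the group lock, and the -EBUSY return value is
an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the ownership state kept in a group. */
struct counted_group {
	pthread_mutex_t mutex;
	const void *owner;	/* opaque owner cookie, NULL when unowned */
	unsigned int owner_cnt;	/* one count per device claimed by the owner */
};

/*
 * Claim on behalf of one device. A second claim with the same cookie
 * succeeds and bumps the count, so a device-centric caller does not need
 * to know whether two of its devices share a group.
 */
static int counted_claim(struct counted_group *g, const void *owner)
{
	int ret = 0;

	pthread_mutex_lock(&g->mutex);
	if (g->owner_cnt && g->owner != owner) {
		ret = -EBUSY;
	} else {
		g->owner = owner;
		g->owner_cnt++;
	}
	pthread_mutex_unlock(&g->mutex);
	return ret;
}

/* Release one device's claim; the group is free once the count hits zero. */
static void counted_release(struct counted_group *g)
{
	pthread_mutex_lock(&g->mutex);
	assert(g->owner_cnt > 0);
	if (--g->owner_cnt == 0)
		g->owner = NULL;
	pthread_mutex_unlock(&g->mutex);
}
```

The trade-off under discussion is visible in the model: the counter lets
callers stay unaware of group membership, at the cost of requiring every
claim to be balanced by exactly one release.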

> > It has already been 8 weeks on this point, let's just move on please.
> 
> Sure, if it was rc7 with the merge window looming I'd be saying "this is
> close enough, let's get it in now and fix the small stuff next cycle".
> However while there's still potentially time to get things right first time,
> I for one am going to continue to point them out because I'm not a fan of
> avoidable churn. I'm sorry I haven't had a chance to look properly at this
> series between v1 and v6, but that's just how things have been.

Well, it is understandable, but this was supposed to be a smallish
cleanup series. All the improvements from the discussion are welcome
and certainly did improve it, but this started in November and is
dragging on..

Sometimes we need churn to bring everyone along the journey.

Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-22 23:53                 ` Jason Gunthorpe via iommu
@ 2022-02-23  5:01                   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-23  5:01 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy
  Cc: baolu.lu, Christoph Hellwig, Greg Kroah-Hartman, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On 2/23/22 7:53 AM, Jason Gunthorpe wrote:
>> To spell it out, the scheme I'm proposing looks like this:
> Well, I already got this, it is what is in driver_or_DMA_API_token()
> that matters
> 
> I think you are suggesting to do something like:
> 
>     if (!READ_ONCE(dev->driver) ||  ???)
>         return NULL;
>     return group;  // A DMA_API 'token'
> 
> Which is locklessly reading dev->driver, and why you are talking about
> races, I guess.
> 

I am afraid that we are not able to implement a race-free
driver_or_DMA_API_token() helper. The lock problem between the IOMMU
core and driver core always exists.

For example, when we implemented iommu_group_store_type() to change the
default domain type of a device through sysfs, we could only compromise
and limit this functionality to singleton groups to avoid the lock
issue.

Unfortunately, that compromise cannot simply be applied to the problem
to be solved by this series, because the iommu core cannot abort the
driver binding when the conflict is detected in the bus notifier.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-23  5:01                   ` Lu Baolu
@ 2022-02-23 13:04                     ` Robin Murphy
  -1 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-23 13:04 UTC (permalink / raw)
  To: Lu Baolu, Jason Gunthorpe
  Cc: Christoph Hellwig, Greg Kroah-Hartman, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On 2022-02-23 05:01, Lu Baolu wrote:
> On 2/23/22 7:53 AM, Jason Gunthorpe wrote:
>>> To spell it out, the scheme I'm proposing looks like this:
>> Well, I already got this, it is what is in driver_or_DMA_API_token()
>> that matters
>>
>> I think you are suggesting to do something like:
>>
>>     if (!READ_ONCE(dev->driver) ||  ???)
>>         return NULL;
>>     return group;  // A DMA_API 'token'
>>
>> Which is locklessly reading dev->driver, and why you are talking about
>> races, I guess.
>>
> 
> I am afraid that we are not able to implement a race-free
> driver_or_DMA_API_token() helper. The lock problem between the IOMMU
> core and driver core always exists.

It's not race-free. My point is that the races aren't harmful because 
what we might infer from the "wrong" information still leads to the 
right action. dev->driver is obviously always valid and constant for 
*claiming* ownership, since that either happens for the DMA API in the 
middle of really_probe() binding driver to dev, or while driver is 
actively using dev and calling iommu_group_claim_dma_owner(). The races 
exist during remove, but since both probe and remove are serialised on
the group mutex after respectively setting/clearing dev->driver, there
are only 4 possibilities for the state of any other group sibling "tmp"
during the time that dev holds that mutex in its remove path:

1 - tmp->driver is non-NULL because tmp is already bound.
   1.a - If tmp->driver->driver_managed_dma == 0, the group must 
currently be DMA-API-owned as a whole. Regardless of what driver dev has 
unbound from, its removal does not release someone else's DMA API 
(co-)ownership.
   1.b - If tmp->driver->driver_managed_dma == 1 and tmp->driver == 
group->owner, then dev must have unbound from the same driver, but 
either way that driver has not yet released ownership so dev's removal 
does not change anything.
   1.c - If tmp->driver->driver_managed_dma == 1 and tmp->driver != 
group->owner, it doesn't matter. Even if tmp->driver is currently 
waiting to attempt to claim ownership it can't do so until we release 
the mutex.

2 - tmp->driver is non-NULL because tmp is in the process of binding.
   2.a - If tmp->driver->driver_managed_dma == 0, tmp can be assumed to 
be waiting on the group mutex to claim DMA API ownership.
     2.a.i - If the group is DMA API owned, this race is simply a 
short-cut to case 1.a - dev's ownership is effectively handed off 
directly to tmp, rather than potentially being released and immediately 
reclaimed. Once tmp gets its turn, it finds the group already 
DMA-API-owned as it wanted and all is well. This may be "unfair" if an 
explicit claim was also waiting, but not incorrect.
     2.a.ii - If the group is driver-owned, it doesn't matter. Removing 
dev does not change the current ownership, and tmp's probe will 
eventually get its turn and find whatever it finds at that point in future.
   2.b - If tmp->driver->driver_managed_dma == 1, it doesn't matter. 
Either that driver already owns the group, or it might try to claim it 
after we've resolved dev's removal and released the mutex, in which case 
it will find whatever it finds.

3 - tmp->driver is NULL because tmp is unbound. Obviously no impact.

4 - tmp->driver is NULL because tmp is in the process of unbinding.
   4.a - If the group is DMA-API-owned, either way tmp has no further 
influence.
     4.a.i - If tmp has unbound from a driver_managed_dma=0 driver, it 
must be waiting to release its DMA API ownership, thus if tmp would 
otherwise be the only remaining DMA API owner, the race is that dev's 
removal releases ownership on behalf of both devices. When tmp's own 
removal subsequently gets the mutex, it will either see that the group 
is already unowned, or maybe that someone else has re-claimed it in the 
interim, and either way do nothing, which is fine.
     4.a.ii - If tmp has unbound from a driver_managed_dma=1 driver, it 
doesn't matter, as in case 1.c.
   4.b - If the group is driver-owned, it doesn't matter. That ownership 
can only change if that driver releases it, which isn't happening while 
we hold the mutex.
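
The sibling walk that the cases above reason about can be modelled in
userspace roughly as follows. This is a sketch under assumptions, not
the proposed patch: the struct names are invented, and the token rule
(the driver pointer for a driver_managed_dma driver, the group itself as
the "DMA API token" otherwise) is one reading of the thread rather than
a confirmed design. What to do with the result, e.g. whether to release
ownership, is deliberately left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace models; none of these are the kernel structs. */
struct drv_model {
	bool driver_managed_dma;
};

struct group_model;

struct dev_model {
	struct drv_model *driver;	/* NULL while unbound or unbinding */
	struct group_model *group;
	struct dev_model *next;		/* stand-in for the group's device list */
};

struct group_model {
	struct dev_model *devices;
	void *owner;			/* token of the current owner, or NULL */
};

/*
 * Assumed token rule: a driver that manages its own DMA is identified by
 * its driver pointer, any other bound driver by the group itself (the
 * "DMA API token"). In the kernel the load would be a READ_ONCE().
 */
static void *driver_or_dma_api_token(struct dev_model *dev)
{
	struct drv_model *drv = dev->driver;

	if (!drv)
		return NULL;
	return drv->driver_managed_dma ? (void *)drv : (void *)dev->group;
}

/*
 * Walk the siblings of @leaving and report whether any of them still
 * justifies the group's current ownership. Caller holds the group mutex.
 */
static bool group_still_owned(struct group_model *group,
			      struct dev_model *leaving)
{
	struct dev_model *tmp;

	assert(group && leaving);
	for (tmp = group->devices; tmp; tmp = tmp->next) {
		void *owner = driver_or_dma_api_token(tmp);

		if (tmp == leaving || !owner || owner != group->owner)
			continue;
		return true;
	}
	return false;
}
```

In this model a sibling whose driver pointer is NULL (cases 3 and 4)
never keeps the group owned, while a sibling whose driver pointer is
already set (cases 1 and 2) does so only if its token matches the
current owner.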

As I said yesterday, I'm really just airing out an idea here; I might 
write up some proper patches as part of the bus ops work, and we can 
give it proper scrutiny then.

> For example, when we implemented iommu_group_store_type() to change the
> default domain type of a device through sysfs, we could only comprised
> and limited this functionality to singleton groups to avoid the lock
> issue.

Indeed, but once the probe and remove paths for grouped devices have to 
serialise on the group mutex, as we're introducing here, the story 
changes and we gain a lot more power. In fact that's a good point I 
hadn't considered yet - that sysfs constraint is functionally equivalent 
to the one in iommu_attach_device(), so once we land this ownership 
concept we should be free to relax it from "singleton" to "unowned" in 
much the same way as your other series is doing for attach.

> Unfortunately, that compromise cannot simply applied to the problem to
> be solved by this series, because the iommu core cannot abort the driver
> binding when the conflict is detected in the bus notifier.

No, I've never proposed that probe-time DMA ownership can be claimed 
from a notifier, we all know why that doesn't work. It's only the 
dma_cleanup() step that *could* be punted back to iommu_bus_notifier vs. 
the driver core having to know about it. Either way we're still 
serialising remove/failure against probe/remove of other devices in a 
group, and that's the critical aspect.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-23 13:04                     ` Robin Murphy
@ 2022-02-23 13:46                       ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-23 13:46 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Lu Baolu, Christoph Hellwig, Greg Kroah-Hartman, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:

> 1 - tmp->driver is non-NULL because tmp is already bound.
>   1.a - If tmp->driver->driver_managed_dma == 0, the group must currently be
> DMA-API-owned as a whole. Regardless of what driver dev has unbound from,
> its removal does not release someone else's DMA API (co-)ownership.

This is an uncommon locking pattern, but it does work. It relies on
the mutex being an effective synchronization barrier for an unlocked
store:

				      WRITE_ONCE(dev->driver, NULL)

 mutex_lock(&group->lock)
 READ_ONCE(dev->driver) != NULL and no UAF
 mutex_unlock(&group->lock)

				      mutex_lock(&group->lock)
				      tmp = READ_ONCE(dev1->driver);
				      if (tmp && tmp->blah) [..]
				      mutex_unlock(&group->lock)
 mutex_lock(&group->lock)
 READ_ONCE(dev->driver) == NULL
 mutex_unlock(&group->lock)

				      /* No other CPU can UAF dev->driver */
				      kfree(driver)

I.e. the CPU clearing dev->driver cannot proceed to the next step until
all other CPUs observe the new value, because of the release/acquire
ordering built into mutex_lock()/mutex_unlock().

It is tricky, and can work in this instance, but the pattern's unlocked
design relies on ordering between the WRITE_ONCE and the locks - and
that ordering in dd.c isn't like that today.
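
The pattern in the diagram can be modelled in userspace with C11 atomics
and a pthread mutex. This is only a sketch of the technique under
assumptions: the names are invented, relaxed atomic accesses stand in
for READ_ONCE()/WRITE_ONCE(), free() stands in for kfree(), and a single
writer is assumed. It says nothing about how dd.c orders things today.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical models of the driver and device structures. */
struct drv_model {
	bool driver_managed_dma;
};

struct dev_model {
	_Atomic(struct drv_model *) driver;
};

static pthread_mutex_t group_lock = PTHREAD_MUTEX_INITIALIZER;

/* Reader: only dereferences the pointer while holding group_lock. */
static bool reader_sees_managed_dma(struct dev_model *dev)
{
	struct drv_model *drv;
	bool ret = false;

	assert(dev);
	pthread_mutex_lock(&group_lock);
	drv = atomic_load_explicit(&dev->driver, memory_order_relaxed);
	if (drv)
		ret = drv->driver_managed_dma;
	pthread_mutex_unlock(&group_lock);
	return ret;
}

/*
 * Writer: clear the pointer without the lock, then pass through the lock
 * once before freeing. A reader that saw the old pointer finished its
 * critical section before we got the lock; a reader that takes the lock
 * after our unlock is guaranteed to see NULL.
 */
static void writer_unbind_and_free(struct dev_model *dev)
{
	struct drv_model *old;

	assert(dev);
	old = atomic_exchange_explicit(&dev->driver, NULL,
				       memory_order_relaxed);
	pthread_mutex_lock(&group_lock);
	pthread_mutex_unlock(&group_lock);
	free(old);
}
```

The empty lock/unlock pair in the writer is the whole point: it is the
only thing ordering the unlocked store against the readers' critical
sections, which is why the store has to come before it.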

Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-23 13:46                       ` Jason Gunthorpe via iommu
@ 2022-02-23 14:06                         ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 90+ messages in thread
From: Greg Kroah-Hartman @ 2022-02-23 14:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, Lu Baolu, Christoph Hellwig, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Wed, Feb 23, 2022 at 09:46:27AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:
> 
> > 1 - tmp->driver is non-NULL because tmp is already bound.
> >   1.a - If tmp->driver->driver_managed_dma == 0, the group must currently be
> > DMA-API-owned as a whole. Regardless of what driver dev has unbound from,
> > its removal does not release someone else's DMA API (co-)ownership.
> 
> This is an uncommon locking pattern, but it does work. It relies on
> the mutex being an effective synchronization barrier for an unlocked
> store:
> 
> 				      WRITE_ONCE(dev->driver, NULL)

Only the driver core should be messing with the dev->driver pointer as
when it does so, it already has the proper locks held.  Do I need to
move that to a "private" location so that nothing outside of the driver
core can mess with it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-23 14:06                         ` Greg Kroah-Hartman
@ 2022-02-23 14:09                           ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-23 14:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Robin Murphy, Lu Baolu, Christoph Hellwig, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Wed, Feb 23, 2022 at 03:06:35PM +0100, Greg Kroah-Hartman wrote:
> On Wed, Feb 23, 2022 at 09:46:27AM -0400, Jason Gunthorpe wrote:
> > On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:
> > 
> > > 1 - tmp->driver is non-NULL because tmp is already bound.
> > >   1.a - If tmp->driver->driver_managed_dma == 0, the group must currently be
> > > DMA-API-owned as a whole. Regardless of what driver dev has unbound from,
> > > its removal does not release someone else's DMA API (co-)ownership.
> > 
> > This is an uncommon locking pattern, but it does work. It relies on
> > the mutex being an effective synchronization barrier for an unlocked
> > store:
> > 
> > 				      WRITE_ONCE(dev->driver, NULL)
> 
> Only the driver core should be messing with the dev->driver pointer as
> when it does so, it already has the proper locks held.  Do I need to
> move that to a "private" location so that nothing outside of the driver
> core can mess with it?

It would be nice, I've seen abuse and mislocking of it in drivers.

Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-23 14:09                           ` Jason Gunthorpe via iommu
@ 2022-02-23 14:30                             ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-23 14:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Robin Murphy, Lu Baolu, Christoph Hellwig, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Wed, Feb 23, 2022 at 10:09:01AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 23, 2022 at 03:06:35PM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Feb 23, 2022 at 09:46:27AM -0400, Jason Gunthorpe wrote:
> > > On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:
> > > 
> > > > 1 - tmp->driver is non-NULL because tmp is already bound.
> > > >   1.a - If tmp->driver->driver_managed_dma == 0, the group must currently be
> > > > DMA-API-owned as a whole. Regardless of what driver dev has unbound from,
> > > > its removal does not release someone else's DMA API (co-)ownership.
> > > 
> > > This is an uncommon locking pattern, but it does work. It relies on
> > > the mutex being an effective synchronization barrier for an unlocked
> > > store:
> > > 
> > > 				      WRITE_ONCE(dev->driver, NULL)
> > 
> > Only the driver core should be messing with the dev->driver pointer as
> > when it does so, it already has the proper locks held.  Do I need to
> > move that to a "private" location so that nothing outside of the driver
> > core can mess with it?
> 
> It would be nice, I've seen abuse and mislocking of it in drivers

Though to be clear, what Robin is describing is still keeping the
dev->driver stores in dd.c, just reading it in a lockless way from
other modules.

Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-23 14:30                             ` Jason Gunthorpe via iommu
@ 2022-02-23 16:03                               ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 90+ messages in thread
From: Greg Kroah-Hartman @ 2022-02-23 16:03 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, Lu Baolu, Christoph Hellwig, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Wed, Feb 23, 2022 at 10:30:11AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 23, 2022 at 10:09:01AM -0400, Jason Gunthorpe wrote:
> > On Wed, Feb 23, 2022 at 03:06:35PM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Feb 23, 2022 at 09:46:27AM -0400, Jason Gunthorpe wrote:
> > > > On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:
> > > > 
> > > > > 1 - tmp->driver is non-NULL because tmp is already bound.
> > > > >   1.a - If tmp->driver->driver_managed_dma == 0, the group must currently be
> > > > > DMA-API-owned as a whole. Regardless of what driver dev has unbound from,
> > > > > its removal does not release someone else's DMA API (co-)ownership.
> > > > 
> > > > This is an uncommon locking pattern, but it does work. It relies on
> > > > the mutex being an effective synchronization barrier for an unlocked
> > > > store:
> > > > 
> > > > 				      WRITE_ONCE(dev->driver, NULL)
> > > 
> > > Only the driver core should be messing with the dev->driver pointer as
> > > when it does so, it already has the proper locks held.  Do I need to
> > > move that to a "private" location so that nothing outside of the driver
> > > core can mess with it?
> > 
> > It would be nice, I've seen abuse and mislocking of it in drivers
> 
> Though to be clear, what Robin is describing is still keeping the
> dev->driver stores in dd.c, just reading it in a lockless way from
> other modules.

"other modules" should never care if a device has a driver bound to it
because instantly after the check happens, it can change, so whatever
logic it wanted to do with that knowledge is gone.

Unless the lock for the bus that the device is on is held, but that
should only be accessible from within the driver core as it controls
that type of stuff, not any random other part of the kernel.

And in looking at this, ick, there are loads of places in the kernel
that are thinking that this pointer being set to something actually
means something.  Sometimes it does, but in lots of places it doesn't,
as it can change.
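
To make the check-then-act problem concrete, here is a rough userspace
sketch, not kernel code. Every toy_* name is made up for illustration,
and the pthread mutex merely stands in for whatever lock the driver core
holds while it changes the binding. The unlocked helper can only ever
return a hint; the locked helper runs the caller's logic while the
binding cannot change underneath it.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_driver and struct device. */
struct toy_driver {
	bool driver_managed_dma;
};

struct toy_device {
	pthread_mutex_t lock;			/* models the lock held by the core */
	_Atomic(struct toy_driver *) driver;	/* NULL when nothing is bound */
};

/* Only the "core" stores to ->driver, and always with the lock held. */
static void toy_bind(struct toy_device *dev, struct toy_driver *drv)
{
	pthread_mutex_lock(&dev->lock);
	atomic_store_explicit(&dev->driver, drv, memory_order_relaxed);
	pthread_mutex_unlock(&dev->lock);
}

static void toy_unbind(struct toy_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	atomic_store_explicit(&dev->driver, NULL, memory_order_relaxed);
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Racy: the result can be stale as soon as this returns, so a caller
 * must not base any decision on it.
 */
static bool toy_is_bound_racy(struct toy_device *dev)
{
	return atomic_load_explicit(&dev->driver, memory_order_relaxed) != NULL;
}

/*
 * Stable: fn() runs while the lock is held, so the driver it sees stays
 * bound for the whole call. Returns -1 if no driver is bound.
 */
static int toy_with_driver_locked(struct toy_device *dev,
				  int (*fn)(const struct toy_driver *drv))
{
	struct toy_driver *drv;
	int ret = -1;

	pthread_mutex_lock(&dev->lock);
	drv = atomic_load_explicit(&dev->driver, memory_order_relaxed);
	if (drv)
		ret = fn(drv);
	pthread_mutex_unlock(&dev->lock);

	return ret;
}

/* Example callback: report whether the bound driver manages its own DMA. */
static int toy_uses_managed_dma(const struct toy_driver *drv)
{
	return drv->driver_managed_dma ? 1 : 0;
}
```

In this toy model a caller would pass toy_uses_managed_dma to
toy_with_driver_locked() rather than testing toy_is_bound_racy() first
and acting on the answer afterwards.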

In a semi-related incident right now, we currently have a syzbot failure
in the usb gadget code where it was manipulating the ->driver pointer
directly and other parts of the kernel are crashing.  See
https://lore.kernel.org/r/PH0PR11MB58805E3C4CF7D4C41D49BFCFDA3C9@PH0PR11MB5880.namprd11.prod.outlook.com
for the thread.

I'll poke at this as a background task to try to clean up over time.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-23 16:03                               ` Greg Kroah-Hartman
@ 2022-02-23 17:05                                 ` Robin Murphy
  -1 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-23 17:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jason Gunthorpe
  Cc: Lu Baolu, Christoph Hellwig, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm, rafael, David Airlie,
	linux-pci, Thierry Reding, Diana Craciun, Dmitry Osipenko,
	Will Deacon, Stuart Yoder, Jonathan Hunter, Chaitanya Kulkarni,
	Dan Williams, Cornelia Huck, linux-kernel, Li Yang, iommu,
	Jacob jun Pan, Daniel Vetter

On 2022-02-23 16:03, Greg Kroah-Hartman wrote:
> On Wed, Feb 23, 2022 at 10:30:11AM -0400, Jason Gunthorpe wrote:
>> On Wed, Feb 23, 2022 at 10:09:01AM -0400, Jason Gunthorpe wrote:
>>> On Wed, Feb 23, 2022 at 03:06:35PM +0100, Greg Kroah-Hartman wrote:
>>>> On Wed, Feb 23, 2022 at 09:46:27AM -0400, Jason Gunthorpe wrote:
>>>>> On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:
>>>>>
>>>>>> 1 - tmp->driver is non-NULL because tmp is already bound.
>>>>>>    1.a - If tmp->driver->driver_managed_dma == 0, the group must currently be
>>>>>> DMA-API-owned as a whole. Regardless of what driver dev has unbound from,
>>>>>> its removal does not release someone else's DMA API (co-)ownership.
>>>>>
>>>>> This is an uncommon locking pattern, but it does work. It relies on
>>>>> the mutex being an effective synchronization barrier for an unlocked
>>>>> store:
>>>>>
>>>>> 				      WRITE_ONCE(dev->driver, NULL)
>>>>
>>>> Only the driver core should be messing with the dev->driver pointer as
>>>> when it does so, it already has the proper locks held.  Do I need to
>>>> move that to a "private" location so that nothing outside of the driver
>>>> core can mess with it?
>>>
>>> It would be nice, I've seen abuse and mislocking of it in drivers
>>
>> Though to be clear, what Robin is describing is still keeping the
>> dev->driver stores in dd.c, just reading it in a lockless way from
>> other modules.
> 
> "other modules" should never care if a device has a driver bound to it
> because instantly after the check happens, it can change, so whatever
> logic it wanted to do with that knowledge is gone.
> 
> Unless the lock for the bus that the device is on is held, but that
> should only be accessible from within the driver core as it controls
> that type of stuff, not any random other part of the kernel.
> 
> And in looking at this, ick, there are loads of places in the kernel
> that are thinking that this pointer being set to something actually
> means something.  Sometimes it does, but in lots of places it doesn't,
> as it can change.

That's fine. In this case we're only talking about the low-level IOMMU 
code which has to be in cahoots with the driver core to some degree (via 
these new callbacks) anyway, but if you're uncomfortable about relying 
on dev->driver even there, I can live with that. There are several 
potential places to capture the relevant information in IOMMU API 
private data, from the point in really_probe() where it *is* stable, and 
then never look at dev->driver ever again - even from .dma_cleanup() or 
future equivalent, which is the aspect from whence this whole 
proof-of-concept tangent spun out.
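
For illustration only, a userspace sketch of what "capture it at probe
time and never look at dev->driver again" could look like. All toy_*
names and the dma_api_user field are hypothetical; this is not the
series' code, just the shape of the idea: the configure step takes a
snapshot while the driver is known to be stable, and the cleanup step
relies on that snapshot alone.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real kernel structures. */
struct toy_driver {
	bool driver_managed_dma;
};

struct toy_group {
	unsigned int dma_api_users;	/* devices bound to DMA-API drivers */
};

struct toy_device {
	struct toy_driver *driver;	/* owned by the "driver core" */
	struct toy_group *group;
	bool dma_api_user;		/* private snapshot taken at probe */
};

/*
 * Called at the point in probing where the driver is stable. The driver
 * is passed in explicitly, so dev->driver is never dereferenced here.
 */
static void toy_dma_configure(struct toy_device *dev,
			      const struct toy_driver *drv)
{
	dev->dma_api_user = !drv->driver_managed_dma;
	if (dev->dma_api_user)
		dev->group->dma_api_users++;
}

/*
 * Called on unbind. Relies only on the snapshot, so it gives the right
 * answer even if dev->driver has already been cleared.
 */
static void toy_dma_cleanup(struct toy_device *dev)
{
	if (dev->dma_api_user) {
		dev->group->dma_api_users--;
		dev->dma_api_user = false;
	}
}
```

The trade-off in this sketch is one extra bit of per-device state in
exchange for not depending on when the core clears the driver pointer.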

Cheers,
Robin.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type
  2022-02-23 17:05                                 ` Robin Murphy
@ 2022-02-23 17:47                                   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 90+ messages in thread
From: Greg Kroah-Hartman @ 2022-02-23 17:47 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jason Gunthorpe, Lu Baolu, Christoph Hellwig, Joerg Roedel,
	Alex Williamson, Bjorn Helgaas, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Wed, Feb 23, 2022 at 05:05:23PM +0000, Robin Murphy wrote:
> On 2022-02-23 16:03, Greg Kroah-Hartman wrote:
> > On Wed, Feb 23, 2022 at 10:30:11AM -0400, Jason Gunthorpe wrote:
> > > On Wed, Feb 23, 2022 at 10:09:01AM -0400, Jason Gunthorpe wrote:
> > > > On Wed, Feb 23, 2022 at 03:06:35PM +0100, Greg Kroah-Hartman wrote:
> > > > > On Wed, Feb 23, 2022 at 09:46:27AM -0400, Jason Gunthorpe wrote:
> > > > > > On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:
> > > > > > 
> > > > > > > 1 - tmp->driver is non-NULL because tmp is already bound.
> > > > > > >    1.a - If tmp->driver->driver_managed_dma == 0, the group must currently be
> > > > > > > DMA-API-owned as a whole. Regardless of what driver dev has unbound from,
> > > > > > > its removal does not release someone else's DMA API (co-)ownership.
> > > > > > 
> > > > > > This is an uncommon locking pattern, but it does work. It relies on
> > > > > > the mutex being an effective synchronization barrier for an unlocked
> > > > > > store:
> > > > > > 
> > > > > > 				      WRITE_ONCE(dev->driver, NULL)
> > > > > 
> > > > > Only the driver core should be messing with the dev->driver pointer as
> > > > > when it does so, it already has the proper locks held.  Do I need to
> > > > > move that to a "private" location so that nothing outside of the driver
> > > > > core can mess with it?
> > > > 
> > > > It would be nice, I've seen abuse and mislocking of it in drivers
> > > 
> > > Though to be clear, what Robin is describing is still keeping the
> > > dev->driver stores in dd.c, just reading it in a lockless way from
> > > other modules.
> > 
> > "other modules" should never care if a device has a driver bound to it
> > > because instantly after the check happens, it can change, so whatever
> > > logic it wanted to do with that knowledge is gone.
> > > 
> > > Unless the lock for the bus that the device is on is held, but that
> > > should only be accessible from within the driver core as it controls
> > > that type of stuff, not any random other part of the kernel.
> > > 
> > > And in looking at this, ick, there are loads of places in the kernel
> > > that are thinking that this pointer being set to something actually
> > > means something.  Sometimes it does, but in lots of places it doesn't,
> > > as it can change.
> 
> That's fine. In this case we're only talking about the low-level IOMMU code
> which has to be in cahoots with the driver core to some degree (via these
> new callbacks) anyway, but if you're uncomfortable about relying on
> dev->driver even there, I can live with that. There are several potential
> places to capture the relevant information in IOMMU API private data, from
> the point in really_probe() where it *is* stable, and then never look at
> dev->driver ever again - even from .dma_cleanup() or future equivalent,
> which is the aspect from whence this whole proof-of-concept tangent spun
> out.

For a specific driver core callback, like dma_cleanup(), all is fine,
but you shouldn't be caring about a driver pointer in your bus callback
for stuff like this as you "know" what happened by virtue of the
callback being called.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-18  0:55   ` Lu Baolu
@ 2022-02-23 18:00     ` Robin Murphy
  -1 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-23 18:00 UTC (permalink / raw)
  To: Lu Baolu, Greg Kroah-Hartman, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj
  Cc: Will Deacon, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, Dmitry Osipenko, iommu,
	linux-pci, kvm, linux-kernel

On 2022-02-18 00:55, Lu Baolu wrote:
[...]
> +/**
> + * iommu_group_claim_dma_owner() - Set DMA ownership of a group
> + * @group: The group.
> + * @owner: Caller specified pointer. Used for exclusive ownership.
> + *
> + * This is to support backward compatibility for vfio which manages
> + * the dma ownership in iommu_group level. New invocations on this
> + * interface should be prohibited.
> + */
> +int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
> +{
> +	int ret = 0;
> +
> +	mutex_lock(&group->mutex);
> +	if (group->owner_cnt) {

To clarify the comment buried in the other thread, I really think we 
should just unconditionally flag the error here...

> +		if (group->owner != owner) {
> +			ret = -EPERM;
> +			goto unlock_out;
> +		}
> +	} else {
> +		if (group->domain && group->domain != group->default_domain) {
> +			ret = -EBUSY;
> +			goto unlock_out;
> +		}
> +
> +		group->owner = owner;
> +		if (group->domain)
> +			__iommu_detach_group(group->domain, group);
> +	}
> +
> +	group->owner_cnt++;
> +unlock_out:
> +	mutex_unlock(&group->mutex);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_claim_dma_owner);
> +
> +/**
> + * iommu_group_release_dma_owner() - Release DMA ownership of a group
> + * @group: The group.
> + *
> + * Release the DMA ownership claimed by iommu_group_claim_dma_owner().
> + */
> +void iommu_group_release_dma_owner(struct iommu_group *group)
> +{
> +	mutex_lock(&group->mutex);
> +	if (WARN_ON(!group->owner_cnt || !group->owner))
> +		goto unlock_out;
> +
> +	if (--group->owner_cnt > 0)
> +		goto unlock_out;

...and equivalently just set owner_cnt directly to 0 here. I don't see a 
realistic use-case for any driver to claim the same group more than 
once, and allowing it in the API just feels like opening up various 
potential corners for things to get out of sync.

I think that's the only significant concern I have left with the series 
as a whole - you can consider my other grumbles non-blocking :)
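
To spell out the single-claim semantics I have in mind, here is a rough
userspace model. It is a sketch under loud assumptions: the toy_* names
are invented, a pthread mutex stands in for group->mutex, and plain
pointer assignments stand in for __iommu_detach_group() and
__iommu_attach_group(). The release path also skips the WARN_ON checks
of the real patch.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical model of the group fields used by the quoted patch. */
struct toy_group {
	pthread_mutex_t mutex;
	void *owner;
	unsigned int owner_cnt;
	const void *domain;		/* currently attached domain, or NULL */
	const void *default_domain;
};

static int toy_group_claim_dma_owner(struct toy_group *group, void *owner)
{
	int ret = 0;

	pthread_mutex_lock(&group->mutex);
	if (group->owner_cnt) {
		/* Any second claim fails, even one from the same owner. */
		ret = -EPERM;
		goto unlock_out;
	}
	if (group->domain && group->domain != group->default_domain) {
		ret = -EBUSY;
		goto unlock_out;
	}

	group->owner = owner;
	group->domain = NULL;	/* models detaching the default domain */
	group->owner_cnt = 1;
unlock_out:
	pthread_mutex_unlock(&group->mutex);

	return ret;
}

static void toy_group_release_dma_owner(struct toy_group *group)
{
	pthread_mutex_lock(&group->mutex);
	if (group->owner_cnt && group->owner) {
		/* No refcount to unwind: one claim, one release. */
		group->owner_cnt = 0;
		group->owner = NULL;
		/* models re-attaching the default domain */
		group->domain = group->default_domain;
	}
	pthread_mutex_unlock(&group->mutex);
}
```

With this shape owner_cnt is effectively a boolean, so there is no count
that can drift out of step with the owner pointer.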

Thanks,
Robin.

> +
> +	/*
> +	 * The UNMANAGED domain should be detached before all USER
> +	 * owners have been released.
> +	 */
> +	if (!WARN_ON(group->domain) && group->default_domain)
> +		__iommu_attach_group(group->default_domain, group);
> +	group->owner = NULL;
> +
> +unlock_out:
> +	mutex_unlock(&group->mutex);
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
> +
> +/**
> + * iommu_group_dma_owner_claimed() - Query group dma ownership status
> + * @group: The group.
> + *
> + * This provides status query on a given group. It is racey and only for
> + * non-binding status reporting.
> + */
> +bool iommu_group_dma_owner_claimed(struct iommu_group *group)
> +{
> +	unsigned int user;
> +
> +	mutex_lock(&group->mutex);
> +	user = group->owner_cnt;
> +	mutex_unlock(&group->mutex);
> +
> +	return user;
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
@ 2022-02-23 18:00     ` Robin Murphy
  0 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-23 18:00 UTC (permalink / raw)
  To: Lu Baolu, Greg Kroah-Hartman, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj
  Cc: Chaitanya Kulkarni, kvm, Stuart Yoder, rafael, David Airlie,
	linux-pci, Cornelia Huck, linux-kernel, Jonathan Hunter, iommu,
	Thierry Reding, Jacob jun Pan, Daniel Vetter, Diana Craciun,
	Dan Williams, Li Yang, Will Deacon, Dmitry Osipenko

On 2022-02-18 00:55, Lu Baolu wrote:
[...]
> +/**
> + * iommu_group_claim_dma_owner() - Set DMA ownership of a group
> + * @group: The group.
> + * @owner: Caller specified pointer. Used for exclusive ownership.
> + *
> + * This is to support backward compatibility for vfio which manages
> + * the dma ownership in iommu_group level. New invocations on this
> + * interface should be prohibited.
> + */
> +int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
> +{
> +	int ret = 0;
> +
> +	mutex_lock(&group->mutex);
> +	if (group->owner_cnt) {

To clarify the comment buried in the other thread, I really think we 
should just unconditionally flag the error here...

> +		if (group->owner != owner) {
> +			ret = -EPERM;
> +			goto unlock_out;
> +		}
> +	} else {
> +		if (group->domain && group->domain != group->default_domain) {
> +			ret = -EBUSY;
> +			goto unlock_out;
> +		}
> +
> +		group->owner = owner;
> +		if (group->domain)
> +			__iommu_detach_group(group->domain, group);
> +	}
> +
> +	group->owner_cnt++;
> +unlock_out:
> +	mutex_unlock(&group->mutex);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_claim_dma_owner);
> +
> +/**
> + * iommu_group_release_dma_owner() - Release DMA ownership of a group
> + * @group: The group.
> + *
> + * Release the DMA ownership claimed by iommu_group_claim_dma_owner().
> + */
> +void iommu_group_release_dma_owner(struct iommu_group *group)
> +{
> +	mutex_lock(&group->mutex);
> +	if (WARN_ON(!group->owner_cnt || !group->owner))
> +		goto unlock_out;
> +
> +	if (--group->owner_cnt > 0)
> +		goto unlock_out;

...and equivalently just set owner_cnt directly to 0 here. I don't see a 
realistic use-case for any driver to claim the same group more than 
once, and allowing it in the API just feels like opening up various 
potential corners for things to get out of sync.

I think that's the only significant concern I have left with the series 
as a whole - you can consider my other grumbles non-blocking :)

Thanks,
Robin.

> +
> +	/*
> +	 * The UNMANAGED domain should be detached before all USER
> +	 * owners have been released.
> +	 */
> +	if (!WARN_ON(group->domain) && group->default_domain)
> +		__iommu_attach_group(group->default_domain, group);
> +	group->owner = NULL;
> +
> +unlock_out:
> +	mutex_unlock(&group->mutex);
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
> +
> +/**
> + * iommu_group_dma_owner_claimed() - Query group dma ownership status
> + * @group: The group.
> + *
> + * This provides status query on a given group. It is racey and only for
> + * non-binding status reporting.
> + */
> +bool iommu_group_dma_owner_claimed(struct iommu_group *group)
> +{
> +	unsigned int user;
> +
> +	mutex_lock(&group->mutex);
> +	user = group->owner_cnt;
> +	mutex_unlock(&group->mutex);
> +
> +	return user;
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
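
For readers following the review, the single-claim semantics suggested above can be sketched roughly as follows. This is a userspace model, not the kernel code: struct group_model, its has_foreign_domain flag and the model_* function names are hypothetical stand-ins for struct iommu_group and the functions in the quoted patch, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for struct iommu_group. Only the
 * fields discussed in the review are modelled; group->mutex is omitted.
 */
struct group_model {
	unsigned int owner_cnt;  /* 0 = unowned, 1 = claimed */
	void *owner;             /* caller-supplied cookie */
	int has_foreign_domain;  /* models domain != default_domain */
};

/* Claim: fail unconditionally if the group is already owned. */
int model_claim_dma_owner(struct group_model *g, void *owner)
{
	assert(g != NULL);

	if (g->owner_cnt)
		return -EPERM;   /* no re-claim, even by the same owner */
	if (g->has_foreign_domain)
		return -EBUSY;   /* a non-default domain is still attached */

	g->owner = owner;
	g->owner_cnt = 1;
	return 0;
}

/* Release: nothing to count down, ownership simply drops to zero. */
void model_release_dma_owner(struct group_model *g)
{
	assert(g != NULL);

	if (!g->owner_cnt)
		return;          /* the quoted kernel code WARN_ON()s here */

	g->owner_cnt = 0;
	g->owner = NULL;
}
```

With this shape a second claim on the same group returns -EPERM regardless of the cookie passed, so the count can only ever be 0 or 1 and a single release always returns the group to the unowned state.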

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-23 18:00     ` Robin Murphy
@ 2022-02-23 18:02       ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-23 18:02 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Lu Baolu, Greg Kroah-Hartman, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Christoph Hellwig, Kevin Tian, Ashok Raj,
	Will Deacon, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, Dmitry Osipenko, iommu,
	linux-pci, kvm, linux-kernel

On Wed, Feb 23, 2022 at 06:00:06PM +0000, Robin Murphy wrote:

> ...and equivalently just set owner_cnt directly to 0 here. I don't see a
> realistic use-case for any driver to claim the same group more than once,
> and allowing it in the API just feels like opening up various potential
> corners for things to get out of sync.

I am Ok if we toss it out to get this merged, as there is no in-kernel
user right now.

Something will have to come back for iommufd, but we can look at what
is best suited then.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-23 18:02       ` Jason Gunthorpe via iommu
@ 2022-02-23 18:20         ` Robin Murphy
  -1 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-23 18:20 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Stuart Yoder, rafael, David Airlie, linux-pci, Thierry Reding,
	Diana Craciun, Dmitry Osipenko, Will Deacon, Ashok Raj,
	Jonathan Hunter, Christoph Hellwig, Kevin Tian,
	Chaitanya Kulkarni, Alex Williamson, kvm, Bjorn Helgaas,
	Dan Williams, Greg Kroah-Hartman, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On 2022-02-23 18:02, Jason Gunthorpe via iommu wrote:
> On Wed, Feb 23, 2022 at 06:00:06PM +0000, Robin Murphy wrote:
> 
>> ...and equivalently just set owner_cnt directly to 0 here. I don't see a
>> realistic use-case for any driver to claim the same group more than once,
>> and allowing it in the API just feels like opening up various potential
>> corners for things to get out of sync.
> 
> I am Ok if we toss it out to get this merged, as there is no in-kernel
> user right now.
> 
> Something will have to come back for iommufd, but we can look at what
> is best suited then.

If iommufd plans to be too dumb to keep track of whether it already owns 
a given group or not, I can't see it dealing with attaching that group 
to a single domain no more than once, either ;)

Robin.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-23 18:20         ` Robin Murphy
@ 2022-02-23 18:32           ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 90+ messages in thread
From: Jason Gunthorpe @ 2022-02-23 18:32 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Stuart Yoder, rafael, David Airlie, linux-pci, Thierry Reding,
	Diana Craciun, Dmitry Osipenko, Will Deacon, Ashok Raj,
	Jonathan Hunter, Christoph Hellwig, Kevin Tian,
	Chaitanya Kulkarni, Alex Williamson, kvm, Bjorn Helgaas,
	Dan Williams, Greg Kroah-Hartman, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter

On Wed, Feb 23, 2022 at 06:20:36PM +0000, Robin Murphy wrote:
> On 2022-02-23 18:02, Jason Gunthorpe via iommu wrote:
> > On Wed, Feb 23, 2022 at 06:00:06PM +0000, Robin Murphy wrote:
> > 
> > > ...and equivalently just set owner_cnt directly to 0 here. I don't see a
> > > realistic use-case for any driver to claim the same group more than once,
> > > and allowing it in the API just feels like opening up various potential
> > > corners for things to get out of sync.
> > 
> > I am Ok if we toss it out to get this merged, as there is no in-kernel
> > user right now.
> > 
> > Something will have to come back for iommufd, but we can look at what
> > is best suited then.
> 
> If iommufd plans to be too dumb to keep track of whether it already owns a
> given group or not, I can't see it dealing with attaching that group to a
> single domain no more than once, either ;)

Indeed, this is why I'd like to use the device API :)

Jason

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 10/11] vfio: Remove iommu group notifier
  2022-02-18  0:55   ` Lu Baolu
@ 2022-02-23 21:53     ` Alex Williamson
  -1 siblings, 0 replies; 90+ messages in thread
From: Alex Williamson @ 2022-02-23 21:53 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Greg Kroah-Hartman, Joerg Roedel, Bjorn Helgaas, Jason Gunthorpe,
	Christoph Hellwig, Kevin Tian, Ashok Raj, kvm, rafael,
	David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter, Robin Murphy

On Fri, 18 Feb 2022 08:55:20 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> The iommu core and driver core have been enhanced to avoid unsafe driver
> binding to a live group after iommu_group_set_dma_owner(PRIVATE_USER)
> has been called. There's no need to register iommu group notifier. This
> removes the iommu group notifer which contains BUG_ON() and WARN().
> 
> The commit 5f096b14d421b ("vfio: Whitelist PCI bridges") allowed all
> pcieport drivers to be bound with devices while the group is assigned to
> user space. This is not always safe. For example, The shpchp_core driver
> relies on the PCI MMIO access for the controller functionality. With its
> downstream devices assigned to the userspace, the MMIO might be changed
> through user initiated P2P accesses without any notification. This might
> break the kernel driver integrity and lead to some unpredictable
> consequences. As the result, currently we only allow the portdrv driver.
> 
> For any bridge driver, in order to avoiding default kernel DMA ownership
> claiming, we should consider:
> 
>  1) Does the bridge driver use DMA? Calling pci_set_master() or
>     a dma_map_* API is a sure indicate the driver is doing DMA
> 
>  2) If the bridge driver uses MMIO, is it tolerant to hostile
>     userspace also touching the same MMIO registers via P2P DMA
>     attacks?
> 
> Conservatively if the driver maps an MMIO region at all, we can say that
> it fails the test.

IIUC, there's a chance we're going to break user configurations if
they're assigning devices from a group containing a bridge that uses a
driver other than pcieport.  The recommendation to such an affected user
would be that the previously allowed host bridge driver was unsafe for
this use case and to continue to enable assignment of devices within
that group, the driver should be unbound from the bridge device or
replaced with the pci-stub driver.  Is that right?

Unfortunately I also think a bisect of such a breakage wouldn't land
here, I think it was actually broken in "vfio: Set DMA ownership for
VFIO" since that's where vfio starts to make use of
iommu_group_claim_dma_owner() which should fail due to
pci_dma_configure() calling iommu_device_use_default_domain() for
any driver not identifying itself as driver_managed_dma.

If that's correct, can we leave a breadcrumb in the correct commit log
indicating why this potential breakage is intentional and how the
bridge driver might be reconfigured to continue to allow assignment from
within the group more safely?  Thanks,

Alex
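
The failure mode described above can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the kernel implementation: struct grp_state, struct drv_model, bind_driver_model() and claim_for_user_model() are hypothetical names, and the -EBUSY / -EPERM return values are assumed for illustration. The bind step stands in for pci_dma_configure() calling iommu_device_use_default_domain(), and the claim step for iommu_group_claim_dma_owner().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the ownership state kept per iommu group. */
struct grp_state {
	unsigned int owner_cnt;  /* kernel DMA users, or 1 when user-owned */
	void *owner;             /* non-NULL only when claimed for userspace */
};

/* Hypothetical model of a driver and its opt-out flag. */
struct drv_model {
	const char *name;
	bool driver_managed_dma;
};

/* Models the probe-time step: take default kernel DMA ownership. */
int bind_driver_model(struct grp_state *g, const struct drv_model *drv)
{
	assert(g != NULL && drv != NULL);

	if (drv->driver_managed_dma)
		return 0;        /* driver opts out, group state untouched */
	if (g->owner)
		return -EBUSY;   /* group already handed to userspace */

	g->owner_cnt++;          /* group is now in use by the kernel DMA API */
	return 0;
}

/* Models the vfio side: claim the whole group for userspace. */
int claim_for_user_model(struct grp_state *g, void *owner)
{
	assert(g != NULL && owner != NULL);

	if (g->owner_cnt)
		return -EPERM;   /* a kernel driver (or another user) holds it */

	g->owner = owner;
	g->owner_cnt = 1;
	return 0;
}
```

In this model a bridge driver that does not set driver_managed_dma leaves owner_cnt non-zero after binding, so the later userspace claim fails; unbinding that driver, or using one that sets the flag, is what lets the claim succeed.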


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 10/11] vfio: Remove iommu group notifier
  2022-02-23 21:53     ` Alex Williamson
@ 2022-02-24  2:49       ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-24  2:49 UTC (permalink / raw)
  To: Alex Williamson
  Cc: baolu.lu, Greg Kroah-Hartman, Joerg Roedel, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj, kvm,
	rafael, David Airlie, linux-pci, Thierry Reding, Diana Craciun,
	Dmitry Osipenko, Will Deacon, Stuart Yoder, Jonathan Hunter,
	Chaitanya Kulkarni, Dan Williams, Cornelia Huck, linux-kernel,
	Li Yang, iommu, Jacob jun Pan, Daniel Vetter, Robin Murphy

Hi Alex,

On 2/24/22 5:53 AM, Alex Williamson wrote:
> On Fri, 18 Feb 2022 08:55:20 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
>> The iommu core and driver core have been enhanced to avoid unsafe driver
>> binding to a live group after iommu_group_set_dma_owner(PRIVATE_USER)
>> has been called. There's no need to register iommu group notifier. This
>> removes the iommu group notifer which contains BUG_ON() and WARN().
>>
>> The commit 5f096b14d421b ("vfio: Whitelist PCI bridges") allowed all
>> pcieport drivers to be bound with devices while the group is assigned to
>> user space. This is not always safe. For example, The shpchp_core driver
>> relies on the PCI MMIO access for the controller functionality. With its
>> downstream devices assigned to the userspace, the MMIO might be changed
>> through user initiated P2P accesses without any notification. This might
>> break the kernel driver integrity and lead to some unpredictable
>> consequences. As the result, currently we only allow the portdrv driver.
>>
>> For any bridge driver, in order to avoiding default kernel DMA ownership
>> claiming, we should consider:
>>
>>   1) Does the bridge driver use DMA? Calling pci_set_master() or
>>      a dma_map_* API is a sure indicate the driver is doing DMA
>>
>>   2) If the bridge driver uses MMIO, is it tolerant to hostile
>>      userspace also touching the same MMIO registers via P2P DMA
>>      attacks?
>>
>> Conservatively if the driver maps an MMIO region at all, we can say that
>> it fails the test.
> 
> IIUC, there's a chance we're going to break user configurations if
> they're assigning devices from a group containing a bridge that uses a
> driver other than pcieport.  The recommendation to such an affected user
> would be that the previously allowed host bridge driver was unsafe for
> this use case and to continue to enable assignment of devices within
> that group, the driver should be unbound from the bridge device or
> replaced with the pci-stub driver.  Is that right?

Yes. You are right.

Another possible long-term solution is to re-audit the bridge driver
code and set the .driver_managed_dma field, provided that the driver
avoids the potential hazards listed above.

> 
> Unfortunately I also think a bisect of such a breakage wouldn't land
> here, I think it was actually broken in "vfio: Set DMA ownership for
> VFIO" since that's where vfio starts to make use of
> iommu_group_claim_dma_owner() which should fail due to
> pci_dma_configure() calling iommu_device_use_default_domain() for
> any driver not identifying itself as driver_managed_dma.

Yes. Great point. Thank you!

> 
> If that's correct, can we leave a breadcrumb in the correct commit log
> indicating why this potential breakage is intentional and how the
> bridge driver might be reconfigured to continue to allow assignment from
> within the group more safely?  Thanks,

Sure. I will add the text below to the commit message of "vfio: Set DMA
ownership for VFIO":

"
This change prevents some unsafe bridge drivers from binding to non-ACS
bridges while devices under them are assigned to user space. This is an
intentional enhancement and may break some existing configurations. The
recommendation to an affected user is that the previously allowed host
bridge driver was unsafe for this use case; to continue assigning
devices within that group, the driver should be unbound from the bridge
device or replaced with the pci-stub driver.

For any bridge driver, we consider it unsafe if it satisfies any of the
following conditions:

   1) The bridge driver uses DMA. Calling pci_set_master() or any kernel
      DMA API (dma_map_*(), etc.) is an indication that the driver is
      doing DMA.

   2) The bridge driver uses MMIO and is not tolerant of hostile
      userspace also touching the same MMIO registers via P2P DMA
      attacks.

If the bridge driver turns out to be a safe one, it can be used as
before by setting the driver's .driver_managed_dma field, just as we
have done in the pcieport driver.
"

Best regards,
baolu
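
The audit criteria in the proposed commit text can be written down as a small predicate. This is only an illustrative userspace sketch: struct bridge_audit and both function names are hypothetical, and the answers to each question would come from reading the driver source, not from any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of what a code audit found for one bridge driver. */
struct bridge_audit {
	bool uses_dma;            /* calls pci_set_master() or dma_map_*() */
	bool maps_mmio;           /* maps any MMIO region */
	bool mmio_tolerates_p2p;  /* only meaningful when maps_mmio is true */
};

/* True if the driver could reasonably set .driver_managed_dma. */
bool bridge_audit_is_safe(const struct bridge_audit *a)
{
	assert(a != NULL);

	if (a->uses_dma)
		return false;   /* condition 1: driver does DMA */
	if (a->maps_mmio && !a->mmio_tolerates_p2p)
		return false;   /* condition 2: MMIO exposed to P2P attacks */
	return true;
}

/* Conservative variant: any MMIO mapping at all fails the test. */
bool bridge_audit_is_safe_conservative(const struct bridge_audit *a)
{
	assert(a != NULL);

	return !a->uses_dma && !a->maps_mmio;
}
```

A driver that neither does DMA nor maps MMIO passes both variants; one that maps MMIO but has been shown to tolerate hostile P2P writes passes only the first.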

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-23 18:02       ` Jason Gunthorpe via iommu
@ 2022-02-24  5:16         ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-24  5:16 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy
  Cc: baolu.lu, Greg Kroah-Hartman, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Christoph Hellwig, Kevin Tian, Ashok Raj,
	Will Deacon, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, Dmitry Osipenko, iommu,
	linux-pci, kvm, linux-kernel

Hi Robin and Jason,

On 2/24/22 2:02 AM, Jason Gunthorpe wrote:
> On Wed, Feb 23, 2022 at 06:00:06PM +0000, Robin Murphy wrote:
> 
>> ...and equivalently just set owner_cnt directly to 0 here. I don't see a
>> realistic use-case for any driver to claim the same group more than once,
>> and allowing it in the API just feels like opening up various potential
>> corners for things to get out of sync.
> I am Ok if we toss it out to get this merged, as there is no in-kernel
> user right now.

So we don't need the owner pointer in the API anymore, right? Since the
claim interface can now be called only once per group, this token is
unnecessary.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-23 18:00     ` Robin Murphy
@ 2022-02-24  5:21       ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-24  5:21 UTC (permalink / raw)
  To: Robin Murphy, Greg Kroah-Hartman, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj
  Cc: baolu.lu, Will Deacon, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel

On 2/24/22 2:00 AM, Robin Murphy wrote:
> On 2022-02-18 00:55, Lu Baolu wrote:
> [...]
>> +/**
>> + * iommu_group_claim_dma_owner() - Set DMA ownership of a group
>> + * @group: The group.
>> + * @owner: Caller specified pointer. Used for exclusive ownership.
>> + *
>> + * This is to support backward compatibility for vfio which manages
>> + * the dma ownership in iommu_group level. New invocations on this
>> + * interface should be prohibited.
>> + */
>> +int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
>> +{
>> +    int ret = 0;
>> +
>> +    mutex_lock(&group->mutex);
>> +    if (group->owner_cnt) {
> 
> To clarify the comment buried in the other thread, I really think we 
> should just unconditionally flag the error here...
> 
>> +        if (group->owner != owner) {
>> +            ret = -EPERM;
>> +            goto unlock_out;
>> +        }
>> +    } else {
>> +        if (group->domain && group->domain != group->default_domain) {
>> +            ret = -EBUSY;
>> +            goto unlock_out;
>> +        }
>> +
>> +        group->owner = owner;
>> +        if (group->domain)
>> +            __iommu_detach_group(group->domain, group);
>> +    }
>> +
>> +    group->owner_cnt++;
>> +unlock_out:
>> +    mutex_unlock(&group->mutex);
>> +
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_group_claim_dma_owner);
>> +
>> +/**
>> + * iommu_group_release_dma_owner() - Release DMA ownership of a group
>> + * @group: The group.
>> + *
>> + * Release the DMA ownership claimed by iommu_group_claim_dma_owner().
>> + */
>> +void iommu_group_release_dma_owner(struct iommu_group *group)
>> +{
>> +    mutex_lock(&group->mutex);
>> +    if (WARN_ON(!group->owner_cnt || !group->owner))
>> +        goto unlock_out;
>> +
>> +    if (--group->owner_cnt > 0)
>> +        goto unlock_out;
> 
> ...and equivalently just set owner_cnt directly to 0 here. I don't see a 
> realistic use-case for any driver to claim the same group more than 
> once, and allowing it in the API just feels like opening up various 
> potential corners for things to get out of sync.

Yeah! Both make sense to me. I will also drop the owner token in the API
as it is no longer necessary after the change.
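
For illustration, the two review suggestions above (fail unconditionally
when the group is already owned, and reset owner_cnt to 0 on release)
might look roughly like the userspace model below. This is only a sketch,
not the posted patch or its follow-up: struct group_model and the model_*
functions are hypothetical names, locking (group->mutex in the quoted
code) and the domain checks are left out, and reusing -EPERM for the
"already owned" case is an assumption carried over from the quoted hunk.
The owner field is kept only to mirror the quoted code.

```c
/* Userspace model for illustration only; not kernel code. */
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the two iommu_group fields discussed here. */
struct group_model {
	unsigned int owner_cnt;	/* non-zero while the group is owned */
	void *owner;		/* opaque token supplied by the claimer */
};

/*
 * Claim exclusive ownership. Any existing ownership is an error, even
 * if the caller presents the same token again. The quoted code holds
 * group->mutex here and also checks group->domain; both are omitted.
 */
int model_claim_dma_owner(struct group_model *g, void *owner)
{
	if (g->owner_cnt)
		return -EPERM;	/* assumed errno, see note above */

	g->owner = owner;
	g->owner_cnt = 1;
	return 0;
}

/* Release ownership. With a single claimer there is nothing to count down. */
void model_release_dma_owner(struct group_model *g)
{
	if (!g->owner_cnt || !g->owner)
		return;		/* the quoted code uses WARN_ON() here */

	g->owner_cnt = 0;
	g->owner = NULL;
}
```

In this model a second claim always fails until the first claimer
releases the group, so the claim and release calls cannot get out of
step with each other.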

> I think that's the only significant concern I have left with the series 
> as a whole - you can consider my other grumbles non-blocking :)

Thank you; your time is very much appreciated!

Best regards,
baolu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-24  5:16         ` Lu Baolu
@ 2022-02-24  5:29           ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-24  5:29 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy
  Cc: baolu.lu, Greg Kroah-Hartman, Joerg Roedel, Alex Williamson,
	Bjorn Helgaas, Christoph Hellwig, Kevin Tian, Ashok Raj,
	Will Deacon, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, Dmitry Osipenko, iommu,
	linux-pci, kvm, linux-kernel

On 2/24/22 1:16 PM, Lu Baolu wrote:
> Hi Robin and Jason,
> 
> On 2/24/22 2:02 AM, Jason Gunthorpe wrote:
>> On Wed, Feb 23, 2022 at 06:00:06PM +0000, Robin Murphy wrote:
>>
>>> ...and equivalently just set owner_cnt directly to 0 here. I don't see a
>>> realistic use-case for any driver to claim the same group more than 
>>> once,
>>> and allowing it in the API just feels like opening up various potential
>>> corners for things to get out of sync.
>> I am Ok if we toss it out to get this merged, as there is no in-kernel
>> user right now.
> 
> So we don't need the owner pointer in the API anymore, right?

Oh, NO.

The owner token represents that the group has been claimed for user
space access. And the default domain auto-attach policy will be changed
accordingly.

So we still need this. Sorry for the noise.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/11] iommu: Add dma ownership management interfaces
  2022-02-24  5:29           ` Lu Baolu
@ 2022-02-24  8:58             ` Robin Murphy
  -1 siblings, 0 replies; 90+ messages in thread
From: Robin Murphy @ 2022-02-24  8:58 UTC (permalink / raw)
  To: Lu Baolu, Jason Gunthorpe
  Cc: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Dan Williams, rafael, Diana Craciun, Cornelia Huck, Eric Auger,
	Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni, Stuart Yoder,
	Laurentiu Tudor, Thierry Reding, David Airlie, Daniel Vetter,
	Jonathan Hunter, Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm,
	linux-kernel

On 2022-02-24 05:29, Lu Baolu wrote:
> On 2/24/22 1:16 PM, Lu Baolu wrote:
>> Hi Robin and Jason,
>>
>> On 2/24/22 2:02 AM, Jason Gunthorpe wrote:
>>> On Wed, Feb 23, 2022 at 06:00:06PM +0000, Robin Murphy wrote:
>>>
>>>> ...and equivalently just set owner_cnt directly to 0 here. I don't 
>>>> see a
>>>> realistic use-case for any driver to claim the same group more than 
>>>> once,
>>>> and allowing it in the API just feels like opening up various potential
>>>> corners for things to get out of sync.
>>> I am Ok if we toss it out to get this merged, as there is no in-kernel
>>> user right now.
>>
>> So we don't need the owner pointer in the API anymore, right?
> 
> Oh, NO.
> 
> The owner token represents that the group has been claimed for user
> space access. And the default domain auto-attach policy will be changed
> accordingly.
> 
> So we still need this. Sorry for the noise.

Exactly. In fact we could almost go the other way, and rename owner_cnt 
to dma_api_users and make it mutually exclusive with owner being set, 
but that's really just cosmetic. It's understandable enough as-is that 
owner_cnt > 0 with owner == NULL represents implicit DMA API ownership.
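
As a rough illustration of that encoding, the three states can be told
apart from the two fields alone. The userspace sketch below is not code
from the series: struct group_state, enum dma_ownership, classify() and
model_use_default_domain() are hypothetical names, and returning -EBUSY
when a user claim is already in place is an assumption made for the
example.

```c
/* Userspace model for illustration only; not kernel code. */
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the two iommu_group fields discussed here. */
struct group_state {
	unsigned int owner_cnt;
	void *owner;
};

enum dma_ownership {
	DMA_UNOWNED,			/* owner_cnt == 0 */
	DMA_OWNED_BY_KERNEL_DMA_API,	/* owner_cnt > 0, owner == NULL */
	DMA_OWNED_BY_USER_CLAIM,	/* owner_cnt > 0, owner != NULL */
};

/* Decode the ownership state from owner_cnt and owner. */
enum dma_ownership classify(const struct group_state *g)
{
	if (!g->owner_cnt)
		return DMA_UNOWNED;
	return g->owner ? DMA_OWNED_BY_USER_CLAIM
			: DMA_OWNED_BY_KERNEL_DMA_API;
}

/*
 * Model of a kernel driver starting to use the DMA API on a device in
 * the group: it counts as one more implicit user, unless the group has
 * already been claimed with an owner token.
 */
int model_use_default_domain(struct group_state *g)
{
	if (g->owner_cnt && g->owner)
		return -EBUSY;	/* assumed errno for this sketch */

	g->owner_cnt++;
	return 0;
}
```

Seen this way, owner_cnt acts as a count of DMA API users while owner
stays NULL, which is the reading behind the dma_api_users name
mentioned above.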

Cheers,
Robin.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
  2022-02-18  0:55 ` Lu Baolu
@ 2022-02-28  0:58   ` Lu Baolu
  -1 siblings, 0 replies; 90+ messages in thread
From: Lu Baolu @ 2022-02-28  0:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: baolu.lu, Will Deacon, Robin Murphy, Dan Williams, rafael,
	Diana Craciun, Cornelia Huck, Eric Auger, Liu Yi L,
	Jacob jun Pan, Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, Dmitry Osipenko, iommu, linux-pci, kvm, linux-kernel

On 2/18/22 8:55 AM, Lu Baolu wrote:
> v6:
>    - Refine comments and commit mesages.
>    - Rename iommu_group_set_dma_owner() to iommu_group_claim_dma_owner().
>    - Rename iommu_device_use/unuse_kernel_dma() to
>      iommu_device_use/unuse_default_domain().
>    - Remove unnecessary EXPORT_SYMBOL_GPL.
>    - Change flag name from no_kernel_api_dma to driver_managed_dma.
>    - Merge 4 "Add driver dma ownership management" patches into single
>      one.

Thank you very much for the review and comments. A new version (v7) has
been posted.

https://lore.kernel.org/linux-iommu/20220228005056.599595-1-baolu.lu@linux.intel.com/

If I missed anything there, please let me know.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 90+ messages in thread

end of thread, other threads:[~2022-02-28  1:00 UTC | newest]

Thread overview: 90+ messages (download: mbox.gz / follow: Atom feed)
2022-02-18  0:55 [PATCH v6 00/11] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
2022-02-18  0:55 ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 01/11] iommu: Add dma ownership management interfaces Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-19  7:31   ` Christoph Hellwig
2022-02-19  7:31     ` Christoph Hellwig
2022-02-21  4:02     ` Lu Baolu
2022-02-21  4:02       ` Lu Baolu
2022-02-23 18:00   ` Robin Murphy
2022-02-23 18:00     ` Robin Murphy
2022-02-23 18:02     ` Jason Gunthorpe
2022-02-23 18:02       ` Jason Gunthorpe via iommu
2022-02-23 18:20       ` Robin Murphy
2022-02-23 18:20         ` Robin Murphy
2022-02-23 18:32         ` Jason Gunthorpe
2022-02-23 18:32           ` Jason Gunthorpe via iommu
2022-02-24  5:16       ` Lu Baolu
2022-02-24  5:16         ` Lu Baolu
2022-02-24  5:29         ` Lu Baolu
2022-02-24  5:29           ` Lu Baolu
2022-02-24  8:58           ` Robin Murphy
2022-02-24  8:58             ` Robin Murphy
2022-02-24  5:21     ` Lu Baolu
2022-02-24  5:21       ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 02/11] driver core: Add dma_cleanup callback in bus_type Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-19  7:32   ` Christoph Hellwig
2022-02-19  7:32     ` Christoph Hellwig
2022-02-21 20:43     ` Robin Murphy
2022-02-21 20:43       ` Robin Murphy
2022-02-21 23:48       ` Jason Gunthorpe
2022-02-21 23:48         ` Jason Gunthorpe via iommu
2022-02-22  4:48         ` Lu Baolu
2022-02-22  4:48           ` Lu Baolu
2022-02-22 10:58         ` Robin Murphy
2022-02-22 10:58           ` Robin Murphy
2022-02-22 15:16           ` Jason Gunthorpe
2022-02-22 15:16             ` Jason Gunthorpe via iommu
2022-02-22 21:18             ` Robin Murphy
2022-02-22 21:18               ` Robin Murphy
2022-02-22 23:53               ` Jason Gunthorpe
2022-02-22 23:53                 ` Jason Gunthorpe via iommu
2022-02-23  5:01                 ` Lu Baolu
2022-02-23  5:01                   ` Lu Baolu
2022-02-23 13:04                   ` Robin Murphy
2022-02-23 13:04                     ` Robin Murphy
2022-02-23 13:46                     ` Jason Gunthorpe
2022-02-23 13:46                       ` Jason Gunthorpe via iommu
2022-02-23 14:06                       ` Greg Kroah-Hartman
2022-02-23 14:06                         ` Greg Kroah-Hartman
2022-02-23 14:09                         ` Jason Gunthorpe
2022-02-23 14:09                           ` Jason Gunthorpe via iommu
2022-02-23 14:30                           ` Jason Gunthorpe
2022-02-23 14:30                             ` Jason Gunthorpe via iommu
2022-02-23 16:03                             ` Greg Kroah-Hartman
2022-02-23 16:03                               ` Greg Kroah-Hartman
2022-02-23 17:05                               ` Robin Murphy
2022-02-23 17:05                                 ` Robin Murphy
2022-02-23 17:47                                 ` Greg Kroah-Hartman
2022-02-23 17:47                                   ` Greg Kroah-Hartman
2022-02-18  0:55 ` [PATCH v6 03/11] amba: Stop sharing platform_dma_configure() Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 04/11] bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management Lu Baolu
2022-02-18  0:55   ` [PATCH v6 04/11] bus: platform, amba, fsl-mc, PCI: " Lu Baolu
2022-02-18  7:55   ` [PATCH v6 04/11] bus: platform,amba,fsl-mc,PCI: " Greg Kroah-Hartman
2022-02-18  7:55     ` Greg Kroah-Hartman
2022-02-18  0:55 ` [PATCH v6 05/11] PCI: pci_stub: Set driver_managed_dma Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 06/11] PCI: portdrv: " Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 07/11] vfio: Set DMA ownership for VFIO devices Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 08/11] vfio: Remove use of vfio_group_viable() Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 09/11] vfio: Delete the unbound_list Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 10/11] vfio: Remove iommu group notifier Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-23 21:53   ` Alex Williamson
2022-02-23 21:53     ` Alex Williamson
2022-02-24  2:49     ` Lu Baolu
2022-02-24  2:49       ` Lu Baolu
2022-02-18  0:55 ` [PATCH v6 11/11] iommu: Remove iommu group changes notifier Lu Baolu
2022-02-18  0:55   ` Lu Baolu
2022-02-18 15:51 ` [PATCH v6 00/11] Fix BUG_ON in vfio_iommu_group_notifier() Jason Gunthorpe
2022-02-18 15:51   ` Jason Gunthorpe via iommu
2022-02-21  3:38   ` Lu Baolu
2022-02-21  3:38     ` Lu Baolu
2022-02-28  0:58 ` Lu Baolu
2022-02-28  0:58   ` Lu Baolu
