linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v11 00/13] iommu: SVA and IOPF refactoring
@ 2022-08-17  1:20 Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 01/13] iommu: Add max_pasids field in struct iommu_device Lu Baolu
                   ` (14 more replies)
  0 siblings, 15 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu

Hi folks,

The former part of this series introduces the IOMMU interfaces to attach
or detach an iommu domain to/from a pasid of a device, and refactors the
exsiting IOMMU SVA implementation by assigning an SVA type of iommu
domain to a shared virtual address and replacing sva_bind/unbind iommu
ops with a set_dev_pasid domain ops.

The latter part changes the existing I/O page fault handling framework
from only serving SVA to a generic one. Any driver or component could
handle the I/O page faults for its domain in its own way by installing
an I/O page fault handler.

This series has been functionally tested on an x86 machine and compile
tested for all architectures.

This series is also available on github:
[2] https://github.com/LuBaolu/intel-iommu/commits/iommu-sva-refactoring-v11

Please review and suggest.

Best regards,
baolu

Change log:
v11:
 - [PATCH 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
   - new patch
 - [PATCH 05/13] iommu: Add attach/detach_dev_pasid iommu interface
   - Remove block_dev_pasid domain ops and use setting group blocking
     domain instead.
   - Remove iommu_group_immutable_singleton(). Move the PCI/ACS
     requirement into the pci_enable_pasid(). All devices in an iommu
     group share a same iommu domain for each pasid.
 - [PATCH 06/13] iommu: Add IOMMU SVA domain support
   - Add a refcount for SVA multiple bindings.
 - [PATCH 07/13] iommu/vt-d: Add SVA domain support
   - Use set_dev_pasid for both domain attaching and detaching.
 - [PATCH 08/13] arm-smmu-v3/sva: Add SVA domain support
   - Use set_dev_pasid for both domain attaching and detaching.
 - [PATCH 09/13] iommu/sva: Refactoring iommu_sva_bind/unbind_device()
   - Remove the refcount of iommu_sva::users.
   - Add iommu_sva::domain.
 - [PATCH 11/13] iommu: Prepare IOMMU domain for IOPF
   - Remove unnecessary check of IS_ERR_OR_NULL(mm).
 - [Overall]
   - Rebase to v6.0-rc1.
   - Remove previous Test-by's as some APIs are changed.
   - Polishing of various codes and comments.

v10:
 - https://lore.kernel.org/linux-iommu/20220705050710.2887204-1-baolu.lu@linux.intel.com/
 - Rebase on next branch of iommu tree.
 - Split attach/detach_device_pasid interfaces and SVA domain extensions
   to different patches.
 - Handle the return error of xa_cmpxchg() gracefully.
 - Directly pass mm in as the SVA fault data.
 - Rename iopf_handle_group() to iopf_handler().
 - Some commit message and code comment refinement.
 - Add Tested-by's from Zhangfei and Tony.

v9:
 - https://lore.kernel.org/linux-iommu/20220621144353.17547-1-baolu.lu@linux.intel.com/
 - Some minor changes on comments and function names.
 - Simplify dev_iommu_get_max_pasids().

v8:
 - https://lore.kernel.org/linux-iommu/20220607014942.3954894-1-baolu.lu@linux.intel.com/
 - Add support for calculating the max pasids that a device could
   consume.
 - Replace container_of_safe() with container_of.
 - Remove iommu_ops->sva_domain_ops and make sva support through the
   generic domain_alloc/free() interfaces.
 - [Robin] It would be logical to pass IOMMU_DOMAIN_SVA to the normal
   domain_alloc call, so that driver-internal stuff like context
   descriptors can be still be hung off the domain as usual (rather than
   all drivers having to implement some extra internal lookup mechanism
   to handle all the SVA domain ops).
 - [Robin] I'd just stick the mm pointer in struct iommu_domain, in a
   union with the fault handler stuff those are mutually exclusive with
   SVA.
 - https://lore.kernel.org/linux-iommu/f3170016-4d7f-e78e-db48-68305f683349@arm.com/

v7:
 - https://lore.kernel.org/linux-iommu/20220519072047.2996983-1-baolu.lu@linux.intel.com/
 - Remove duplicate array for sva domain.
 - Rename detach_dev_pasid to block_dev_pasid.
 - Add raw device driver interfaces for iommufd.
 - Other misc refinements and patch reorganization.
 - Drop "dmaengine: idxd: Separate user and kernel pasid enabling" which
   has been picked for dmaengine tree.

v6:
 - https://lore.kernel.org/linux-iommu/20220510061738.2761430-1-baolu.lu@linux.intel.com/
 - Refine the SVA basic data structures.
   Link: https://lore.kernel.org/linux-iommu/YnFv0ps0Ad8v+7uH@myrica/
 - Refine arm smmuv3 sva domain allocation.
 - Fix a possible lock issue.
   Link: https://lore.kernel.org/linux-iommu/YnFydE8j8l7Q4m+b@myrica/

v5:
 - https://lore.kernel.org/linux-iommu/20220502014842.991097-1-baolu.lu@linux.intel.com/
 - Address review comments from Jean-Philippe Brucker. Very appreciated!
 - Remove redundant pci aliases check in
   device_group_immutable_singleton().
 - Treat all buses except PCI as static in immutable singleton check.
 - As the sva_bind/unbind() have already guaranteed sva domain free only
   after iopf_queue_flush_dev(), remove the unnecessary domain refcount.
 - Move domain get() out of the list iteration in iopf_handle_group().

v4:
 - https://lore.kernel.org/linux-iommu/20220421052121.3464100-1-baolu.lu@linux.intel.com/
 - Solve the overlap with another series and make this series
   self-contained.
 - No objection to the abstraction of data structure during v3 review.
   Hence remove the RFC subject prefix.
 - Refine the immutable singleton group code according to Kevin's
   comments.

v3:
 - https://lore.kernel.org/linux-iommu/20220410102443.294128-1-baolu.lu@linux.intel.com/
 - Rework iommu_group_singleton_lockdown() by adding a flag to the group
   that positively indicates the group can never have more than one
   member, even after hot plug.
 - Abstract the data structs used for iommu sva in a separated patches to
   make it easier for review.
 - I still keep the RFC prefix in this series as above two significant
   changes need at least another round review to be finalized.
 - Several misc refinements.

v2:
 - https://lore.kernel.org/linux-iommu/20220329053800.3049561-1-baolu.lu@linux.intel.com/
 - Add sva domain life cycle management to avoid race between unbind and
   page fault handling.
 - Use a single domain for each mm.
 - Return a single sva handler for the same binding.
 - Add a new helper to meet singleton group requirement.
 - Rework the SVA domain allocation for arm smmu v3 driver and move the
   pasid_bit initialization to device probe.
 - Drop the patch "iommu: Handle IO page faults directly".
 - Add mmget_not_zero(mm) in SVA page fault handler.

v1:
 - https://lore.kernel.org/linux-iommu/20220320064030.2936936-1-baolu.lu@linux.intel.com/
 - Initial post.

Lu Baolu (13):
  iommu: Add max_pasids field in struct iommu_device
  iommu: Add max_pasids field in struct dev_iommu
  iommu: Remove SVM_FLAG_SUPERVISOR_MODE support
  PCI: Allow PASID only when ACS enforced on upstreaming path
  iommu: Add attach/detach_dev_pasid iommu interface
  iommu: Add IOMMU SVA domain support
  iommu/vt-d: Add SVA domain support
  arm-smmu-v3/sva: Add SVA domain support
  iommu/sva: Refactoring iommu_sva_bind/unbind_device()
  iommu: Remove SVA related callbacks from iommu ops
  iommu: Prepare IOMMU domain for IOPF
  iommu: Per-domain I/O page fault handling
  iommu: Rename iommu-sva-lib.{c,h}

 include/linux/intel-svm.h                     |  13 -
 include/linux/iommu.h                         | 110 +++++---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  19 +-
 drivers/iommu/intel/iommu.h                   |  13 +-
 .../iommu/{iommu-sva-lib.h => iommu-sva.h}    |  14 +-
 drivers/dma/idxd/cdev.c                       |   3 +-
 drivers/dma/idxd/init.c                       |  25 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 119 +++++---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |   9 +-
 drivers/iommu/intel/dmar.c                    |   7 +
 drivers/iommu/intel/iommu.c                   |   7 +-
 drivers/iommu/intel/svm.c                     | 150 +++++-----
 drivers/iommu/io-pgfault.c                    |  77 ++----
 drivers/iommu/iommu-sva-lib.c                 |  71 -----
 drivers/iommu/iommu-sva.c                     | 233 ++++++++++++++++
 drivers/iommu/iommu.c                         | 259 +++++++++++-------
 drivers/misc/uacce/uacce.c                    |   2 +-
 drivers/pci/ats.c                             |   5 +
 drivers/iommu/Makefile                        |   2 +-
 19 files changed, 680 insertions(+), 458 deletions(-)
 rename drivers/iommu/{iommu-sva-lib.h => iommu-sva.h} (83%)
 delete mode 100644 drivers/iommu/iommu-sva-lib.c
 create mode 100644 drivers/iommu/iommu-sva.c

-- 
2.25.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v11 01/13] iommu: Add max_pasids field in struct iommu_device
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 02/13] iommu: Add max_pasids field in struct dev_iommu Lu Baolu
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

Use this field to keep the number of supported PASIDs that an IOMMU
hardware is able to support. This is a generic attribute of an IOMMU
and lifting it into the per-IOMMU device structure makes it possible
to allocate a PASID for device without calls into the IOMMU drivers.
Any iommu driver that supports PASID related features should set this
field before enabling them on the devices.

In the Intel IOMMU driver, intel_iommu_sm is moved to CONFIG_INTEL_IOMMU
enclave so that the pasid_supported() helper could be used in dmar.c
without compilation errors.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h                       | 2 ++
 drivers/iommu/intel/iommu.h                 | 4 ++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
 drivers/iommu/intel/dmar.c                  | 7 +++++++
 4 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ea30f00dc145..ed172cbdabf2 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -322,12 +322,14 @@ struct iommu_domain_ops {
  * @list: Used by the iommu-core to keep a list of registered iommus
  * @ops: iommu-ops for talking to this iommu
  * @dev: struct device for sysfs handling
+ * @max_pasids: number of supported PASIDs
  */
 struct iommu_device {
 	struct list_head list;
 	const struct iommu_ops *ops;
 	struct fwnode_handle *fwnode;
 	struct device *dev;
+	u32 max_pasids;
 };
 
 /**
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index fae45bbb0c7f..4bebe0547c8f 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -480,8 +480,6 @@ enum {
 #define VTD_FLAG_IRQ_REMAP_PRE_ENABLED	(1 << 1)
 #define VTD_FLAG_SVM_CAPABLE		(1 << 2)
 
-extern int intel_iommu_sm;
-
 #define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))
 #define pasid_supported(iommu)	(sm_supported(iommu) &&			\
 				 ecap_pasid((iommu)->ecap))
@@ -791,6 +789,7 @@ struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
 extern const struct iommu_ops intel_iommu_ops;
 
 #ifdef CONFIG_INTEL_IOMMU
+extern int intel_iommu_sm;
 extern int iommu_calculate_agaw(struct intel_iommu *iommu);
 extern int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
 extern int dmar_disabled;
@@ -806,6 +805,7 @@ static inline int iommu_calculate_max_sagaw(struct intel_iommu *iommu)
 }
 #define dmar_disabled	(1)
 #define intel_iommu_enabled (0)
+#define intel_iommu_sm (0)
 #endif
 
 static inline const char *decode_prq_descriptor(char *str, size_t size,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d32b02336411..f88541be8213 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3521,6 +3521,7 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 	/* SID/SSID sizes */
 	smmu->ssid_bits = FIELD_GET(IDR1_SSIDSIZE, reg);
 	smmu->sid_bits = FIELD_GET(IDR1_SIDSIZE, reg);
+	smmu->iommu.max_pasids = 1UL << smmu->ssid_bits;
 
 	/*
 	 * If the SMMU supports fewer bits than would fill a single L2 stream
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 5a8f780e7ffd..3528058d253e 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1104,6 +1104,13 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
 
 	raw_spin_lock_init(&iommu->register_lock);
 
+	/*
+	 * A value of N in PSS field of eCap register indicates hardware
+	 * supports PASID field of N+1 bits.
+	 */
+	if (pasid_supported(iommu))
+		iommu->iommu.max_pasids = 2UL << ecap_pss(iommu->ecap);
+
 	/*
 	 * This is only for hotplug; at boot time intel_iommu_enabled won't
 	 * be set yet. When intel_iommu_init() runs, it registers the units
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 02/13] iommu: Add max_pasids field in struct dev_iommu
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 01/13] iommu: Add max_pasids field in struct iommu_device Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 03/13] iommu: Remove SVM_FLAG_SUPERVISOR_MODE support Lu Baolu
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu

Use this field to save the number of PASIDs that a device is able to
consume. It is a generic attribute of a device and lifting it into the
per-device dev_iommu struct could help to avoid the boilerplate code
in various IOMMU drivers.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h |  2 ++
 drivers/iommu/iommu.c | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ed172cbdabf2..0d9ce209a501 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -368,6 +368,7 @@ struct iommu_fault_param {
  * @fwspec:	 IOMMU fwspec data
  * @iommu_dev:	 IOMMU device this device is linked to
  * @priv:	 IOMMU Driver private data
+ * @max_pasids:  number of PASIDs this device can consume
  *
  * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
  *	struct iommu_group	*iommu_group;
@@ -379,6 +380,7 @@ struct dev_iommu {
 	struct iommu_fwspec		*fwspec;
 	struct iommu_device		*iommu_dev;
 	void				*priv;
+	u32				max_pasids;
 };
 
 int iommu_device_register(struct iommu_device *iommu,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 780fb7071577..e9f6a8d33b58 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -20,6 +20,7 @@
 #include <linux/idr.h>
 #include <linux/err.h>
 #include <linux/pci.h>
+#include <linux/pci-ats.h>
 #include <linux/bitops.h>
 #include <linux/property.h>
 #include <linux/fsl/mc.h>
@@ -218,6 +219,24 @@ static void dev_iommu_free(struct device *dev)
 	kfree(param);
 }
 
+static u32 dev_iommu_get_max_pasids(struct device *dev)
+{
+	u32 max_pasids = 0, bits = 0;
+	int ret;
+
+	if (dev_is_pci(dev)) {
+		ret = pci_max_pasids(to_pci_dev(dev));
+		if (ret > 0)
+			max_pasids = ret;
+	} else {
+		ret = device_property_read_u32(dev, "pasid-num-bits", &bits);
+		if (!ret)
+			max_pasids = 1UL << bits;
+	}
+
+	return min_t(u32, max_pasids, dev->iommu->iommu_dev->max_pasids);
+}
+
 static int __iommu_probe_device(struct device *dev, struct list_head *group_list)
 {
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
@@ -243,6 +262,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
 	}
 
 	dev->iommu->iommu_dev = iommu_dev;
+	dev->iommu->max_pasids = dev_iommu_get_max_pasids(dev);
 
 	group = iommu_group_get_for_dev(dev);
 	if (IS_ERR(group)) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 03/13] iommu: Remove SVM_FLAG_SUPERVISOR_MODE support
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 01/13] iommu: Add max_pasids field in struct iommu_device Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 02/13] iommu: Add max_pasids field in struct dev_iommu Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path Lu Baolu
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jacob Pan,
	Jean-Philippe Brucker

The current kernel DMA with PASID support is based on the SVA with a flag
SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address
space to a PASID of the device. The device driver programs the device with
kernel virtual address (KVA) for DMA access. There have been security and
functional issues with this approach:

- The lack of IOTLB synchronization upon kernel page table updates.
  (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.)
- Other than slight more protection, using kernel virtual address (KVA)
  has little advantage over physical address. There are also no use
  cases yet where DMA engines need kernel virtual addresses for in-kernel
  DMA.

This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface.
The device drivers are suggested to handle kernel DMA with PASID through
the kernel DMA APIs.

The drvdata parameter in iommu_sva_bind_device() and all callbacks is not
needed anymore. Cleanup them as well.

Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 include/linux/intel-svm.h                     | 13 -----
 include/linux/iommu.h                         |  8 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  5 +-
 drivers/iommu/intel/iommu.h                   |  3 +-
 drivers/dma/idxd/cdev.c                       |  3 +-
 drivers/dma/idxd/init.c                       | 25 +-------
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  3 +-
 drivers/iommu/intel/svm.c                     | 57 +++++--------------
 drivers/iommu/iommu.c                         |  5 +-
 drivers/misc/uacce/uacce.c                    |  2 +-
 10 files changed, 26 insertions(+), 98 deletions(-)

diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index 207ef06ba3e1..f9a0d44f6fdb 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -13,17 +13,4 @@
 #define PRQ_RING_MASK	((0x1000 << PRQ_ORDER) - 0x20)
 #define PRQ_DEPTH	((0x1000 << PRQ_ORDER) >> 5)
 
-/*
- * The SVM_FLAG_SUPERVISOR_MODE flag requests a PASID which can be used only
- * for access to kernel addresses. No IOTLB flushes are automatically done
- * for kernel mappings; it is valid only for access to the kernel's static
- * 1:1 mapping of physical memory — not to vmalloc or even module mappings.
- * A future API addition may permit the use of such ranges, by means of an
- * explicit IOTLB flush call (akin to the DMA API's unmap method).
- *
- * It is unlikely that we will ever hook into flush_tlb_kernel_range() to
- * do such IOTLB flushes automatically.
- */
-#define SVM_FLAG_SUPERVISOR_MODE	BIT(0)
-
 #endif /* __INTEL_SVM_H__ */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0d9ce209a501..2f237c3cd680 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -247,8 +247,7 @@ struct iommu_ops {
 	int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
 	int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
 
-	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
-				      void *drvdata);
+	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm);
 	void (*sva_unbind)(struct iommu_sva *handle);
 	u32 (*sva_get_pasid)(struct iommu_sva *handle);
 
@@ -670,8 +669,7 @@ int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features f);
 int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
 
 struct iommu_sva *iommu_sva_bind_device(struct device *dev,
-					struct mm_struct *mm,
-					void *drvdata);
+					struct mm_struct *mm);
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 
@@ -1007,7 +1005,7 @@ iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
 }
 
 static inline struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
+iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
 {
 	return NULL;
 }
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index cd48590ada30..d2ba86470c42 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -754,8 +754,7 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master);
 int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
 int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
 bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
-struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
-				    void *drvdata);
+struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm);
 void arm_smmu_sva_unbind(struct iommu_sva *handle);
 u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
 void arm_smmu_sva_notifier_synchronize(void);
@@ -791,7 +790,7 @@ static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master
 }
 
 static inline struct iommu_sva *
-arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
+arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 {
 	return ERR_PTR(-ENODEV);
 }
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 4bebe0547c8f..a9b8367c9361 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -742,8 +742,7 @@ struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn);
 extern void intel_svm_check(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
-				 void *drvdata);
+struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm);
 void intel_svm_unbind(struct iommu_sva *handle);
 u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
index c2808fd081d6..66720001ba1c 100644
--- a/drivers/dma/idxd/cdev.c
+++ b/drivers/dma/idxd/cdev.c
@@ -6,7 +6,6 @@
 #include <linux/pci.h>
 #include <linux/device.h>
 #include <linux/sched/task.h>
-#include <linux/intel-svm.h>
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/cdev.h>
 #include <linux/fs.h>
@@ -100,7 +99,7 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp)
 	filp->private_data = ctx;
 
 	if (device_user_pasid_enabled(idxd)) {
-		sva = iommu_sva_bind_device(dev, current->mm, NULL);
+		sva = iommu_sva_bind_device(dev, current->mm);
 		if (IS_ERR(sva)) {
 			rc = PTR_ERR(sva);
 			dev_err(dev, "pasid allocation failed: %d\n", rc);
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index aa3478257ddb..25ba73243c6f 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -14,7 +14,6 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/device.h>
 #include <linux/idr.h>
-#include <linux/intel-svm.h>
 #include <linux/iommu.h>
 #include <uapi/linux/idxd.h>
 #include <linux/dmaengine.h>
@@ -466,29 +465,7 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
 
 static int idxd_enable_system_pasid(struct idxd_device *idxd)
 {
-	int flags;
-	unsigned int pasid;
-	struct iommu_sva *sva;
-
-	flags = SVM_FLAG_SUPERVISOR_MODE;
-
-	sva = iommu_sva_bind_device(&idxd->pdev->dev, NULL, &flags);
-	if (IS_ERR(sva)) {
-		dev_warn(&idxd->pdev->dev,
-			 "iommu sva bind failed: %ld\n", PTR_ERR(sva));
-		return PTR_ERR(sva);
-	}
-
-	pasid = iommu_sva_get_pasid(sva);
-	if (pasid == IOMMU_PASID_INVALID) {
-		iommu_sva_unbind_device(sva);
-		return -ENODEV;
-	}
-
-	idxd->sva = sva;
-	idxd->pasid = pasid;
-	dev_dbg(&idxd->pdev->dev, "system pasid: %u\n", pasid);
-	return 0;
+	return -EOPNOTSUPP;
 }
 
 static void idxd_disable_system_pasid(struct idxd_device *idxd)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 1ef7bbb4acf3..f155d406c5d5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -367,8 +367,7 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 	return ERR_PTR(ret);
 }
 
-struct iommu_sva *
-arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
+struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 {
 	struct iommu_sva *handle;
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 8bcfb93dda56..2420fa5c2360 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -313,8 +313,7 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
 	return 0;
 }
 
-static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
-				 unsigned int flags)
+static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm)
 {
 	ioasid_t max_pasid = dev_is_pci(dev) ?
 			pci_max_pasids(to_pci_dev(dev)) : intel_pasid_max_id;
@@ -324,8 +323,7 @@ static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
 
 static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 					   struct device *dev,
-					   struct mm_struct *mm,
-					   unsigned int flags)
+					   struct mm_struct *mm)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	struct intel_svm_dev *sdev;
@@ -341,22 +339,18 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 
 		svm->pasid = mm->pasid;
 		svm->mm = mm;
-		svm->flags = flags;
 		INIT_LIST_HEAD_RCU(&svm->devs);
 
-		if (!(flags & SVM_FLAG_SUPERVISOR_MODE)) {
-			svm->notifier.ops = &intel_mmuops;
-			ret = mmu_notifier_register(&svm->notifier, mm);
-			if (ret) {
-				kfree(svm);
-				return ERR_PTR(ret);
-			}
+		svm->notifier.ops = &intel_mmuops;
+		ret = mmu_notifier_register(&svm->notifier, mm);
+		if (ret) {
+			kfree(svm);
+			return ERR_PTR(ret);
 		}
 
 		ret = pasid_private_add(svm->pasid, svm);
 		if (ret) {
-			if (svm->notifier.ops)
-				mmu_notifier_unregister(&svm->notifier, mm);
+			mmu_notifier_unregister(&svm->notifier, mm);
 			kfree(svm);
 			return ERR_PTR(ret);
 		}
@@ -391,9 +385,7 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 	}
 
 	/* Setup the pasid table: */
-	sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ?
-			PASID_FLAG_SUPERVISOR_MODE : 0;
-	sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
+	sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
 	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm->pasid,
 					    FLPT_DEFAULT_DID, sflags);
 	if (ret)
@@ -407,8 +399,7 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 	kfree(sdev);
 free_svm:
 	if (list_empty(&svm->devs)) {
-		if (svm->notifier.ops)
-			mmu_notifier_unregister(&svm->notifier, mm);
+		mmu_notifier_unregister(&svm->notifier, mm);
 		pasid_private_remove(mm->pasid);
 		kfree(svm);
 	}
@@ -764,7 +755,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 			 * to unbind the mm while any page faults are outstanding.
 			 */
 			svm = pasid_private_find(req->pasid);
-			if (IS_ERR_OR_NULL(svm) || (svm->flags & SVM_FLAG_SUPERVISOR_MODE))
+			if (IS_ERR_OR_NULL(svm))
 				goto bad_req;
 		}
 
@@ -815,40 +806,20 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 	return IRQ_RETVAL(handled);
 }
 
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
+struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm)
 {
 	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
-	unsigned int flags = 0;
 	struct iommu_sva *sva;
 	int ret;
 
-	if (drvdata)
-		flags = *(unsigned int *)drvdata;
-
-	if (flags & SVM_FLAG_SUPERVISOR_MODE) {
-		if (!ecap_srs(iommu->ecap)) {
-			dev_err(dev, "%s: Supervisor PASID not supported\n",
-				iommu->name);
-			return ERR_PTR(-EOPNOTSUPP);
-		}
-
-		if (mm) {
-			dev_err(dev, "%s: Supervisor PASID with user provided mm\n",
-				iommu->name);
-			return ERR_PTR(-EINVAL);
-		}
-
-		mm = &init_mm;
-	}
-
 	mutex_lock(&pasid_mutex);
-	ret = intel_svm_alloc_pasid(dev, mm, flags);
+	ret = intel_svm_alloc_pasid(dev, mm);
 	if (ret) {
 		mutex_unlock(&pasid_mutex);
 		return ERR_PTR(ret);
 	}
 
-	sva = intel_svm_bind_mm(iommu, dev, mm, flags);
+	sva = intel_svm_bind_mm(iommu, dev, mm);
 	mutex_unlock(&pasid_mutex);
 
 	return sva;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index e9f6a8d33b58..1d28a74a0511 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2775,7 +2775,6 @@ EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
  * iommu_sva_bind_device() - Bind a process address space to a device
  * @dev: the device
  * @mm: the mm to bind, caller must hold a reference to it
- * @drvdata: opaque data pointer to pass to bind callback
  *
  * Create a bond between device and address space, allowing the device to access
  * the mm using the returned PASID. If a bond already exists between @device and
@@ -2788,7 +2787,7 @@ EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
  * On error, returns an ERR_PTR value.
  */
 struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
+iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
 {
 	struct iommu_group *group;
 	struct iommu_sva *handle = ERR_PTR(-EINVAL);
@@ -2813,7 +2812,7 @@ iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
 	if (iommu_group_device_count(group) != 1)
 		goto out_unlock;
 
-	handle = ops->sva_bind(dev, mm, drvdata);
+	handle = ops->sva_bind(dev, mm);
 
 out_unlock:
 	mutex_unlock(&group->mutex);
diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index b70a013139c7..905eff1f840e 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -108,7 +108,7 @@ static int uacce_bind_queue(struct uacce_device *uacce, struct uacce_queue *q)
 	if (!(uacce->flags & UACCE_DEV_SVA))
 		return 0;
 
-	handle = iommu_sva_bind_device(uacce->parent, current->mm, NULL);
+	handle = iommu_sva_bind_device(uacce->parent, current->mm);
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (2 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 03/13] iommu: Remove SVM_FLAG_SUPERVISOR_MODE support Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17 21:17   ` Bjorn Helgaas
  2022-08-18 13:04   ` Jason Gunthorpe
  2022-08-17  1:20 ` [PATCH v11 05/13] iommu: Add attach/detach_dev_pasid iommu interface Lu Baolu
                   ` (10 subsequent siblings)
  14 siblings, 2 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu

Some configurations of the PCI fabric will route device originated TLP
packets based on the memory addresses. These configurations are
incompatible with PASID as the PASID packets form a distinct address
space. For instance, any configuration where switches are present
without ACS enabled is incompatible.

This enhances the pci_enable_pasid() interface by requiring the ACS to
support Source Validation, Request Redirection, Completer Redirection,
and Upstream Forwarding. This effectively means that devices cannot
spoof their requester ID, requests and completions cannot be redirected,
and all transactions are forwarded upstream, even as it passes through a
bridge where the target device is downstream.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/pci/ats.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index c967ad6e2626..0715e48e7973 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -382,6 +382,11 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
 	if (!pasid)
 		return -EINVAL;
 
+	if (!pci_acs_path_enabled(pdev, NULL,
+				  PCI_ACS_SV | PCI_ACS_RR |
+				  PCI_ACS_CR | PCI_ACS_UF))
+		return -EINVAL;
+
 	pci_read_config_word(pdev, pasid + PCI_PASID_CAP, &supported);
 	supported &= PCI_PASID_CAP_EXEC | PCI_PASID_CAP_PRIV;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 05/13] iommu: Add attach/detach_dev_pasid iommu interface
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (3 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-18 13:33   ` Jason Gunthorpe
  2022-08-17  1:20 ` [PATCH v11 06/13] iommu: Add IOMMU SVA domain support Lu Baolu
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

Attaching an IOMMU domain to a PASID of a device is a generic operation
for modern IOMMU drivers which support PASID-granular DMA address
translation. Currently visible usage scenarios include (but not limited):

 - SVA (Shared Virtual Address)
 - kernel DMA with PASID
 - hardware-assist mediated device

This adds set_dev_pasid domain ops for this purpose and also adds some
interfaces for device drivers to attach/detach/retrieve a domain for a
PASID of a device.

If multiple devices share a single group, it's fine as long the fabric
always routes every TLP marked with a PASID to the host bridge and only
the host bridge. For example, ACS achieves this universally and has been
checked when pci_enable_pasid() is called. As we can't reliably tell the
source apart in a group, all the devices in a group have to be considered
as the same source, and mapped to the same PASID table.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h |  26 +++++++++
 drivers/iommu/iommu.c | 123 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 149 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2f237c3cd680..f1e8953b1e2e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -266,6 +266,7 @@ struct iommu_ops {
  * struct iommu_domain_ops - domain specific operations
  * @attach_dev: attach an iommu domain to a device
  * @detach_dev: detach an iommu domain from a device
+ * @set_dev_pasid: set an iommu domain to a pasid of device
  * @map: map a physically contiguous memory region to an iommu domain
  * @map_pages: map a physically contiguous set of pages of the same size to
  *             an iommu domain.
@@ -286,6 +287,8 @@ struct iommu_ops {
 struct iommu_domain_ops {
 	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
 	void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
+	int (*set_dev_pasid)(struct iommu_domain *domain, struct device *dev,
+			     ioasid_t pasid);
 
 	int (*map)(struct iommu_domain *domain, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
@@ -680,6 +683,12 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
 void iommu_group_release_dma_owner(struct iommu_group *group);
 bool iommu_group_dma_owner_claimed(struct iommu_group *group);
 
+int iommu_attach_device_pasid(struct iommu_domain *domain,
+			      struct device *dev, ioasid_t pasid);
+void iommu_detach_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid);
+struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid);
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1047,6 +1056,23 @@ static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)
 {
 	return false;
 }
+
+static inline int iommu_attach_device_pasid(struct iommu_domain *domain,
+					    struct device *dev, ioasid_t pasid)
+{
+	return -ENODEV;
+}
+
+static inline void iommu_detach_device_pasid(struct iommu_domain *domain,
+					     struct device *dev, ioasid_t pasid)
+{
+}
+
+static inline struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+	return NULL;
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1d28a74a0511..6f2cbccc0570 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -39,6 +39,7 @@ struct iommu_group {
 	struct kobject kobj;
 	struct kobject *devices_kobj;
 	struct list_head devices;
+	struct xarray pasid_array;
 	struct mutex mutex;
 	void *iommu_data;
 	void (*iommu_data_release)(void *iommu_data);
@@ -663,6 +664,7 @@ struct iommu_group *iommu_group_alloc(void)
 	mutex_init(&group->mutex);
 	INIT_LIST_HEAD(&group->devices);
 	INIT_LIST_HEAD(&group->entry);
+	xa_init(&group->pasid_array);
 
 	ret = ida_alloc(&iommu_group_ida, GFP_KERNEL);
 	if (ret < 0) {
@@ -3258,3 +3260,124 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group)
 	return user;
 }
 EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
+
+static int __iommu_set_group_pasid(struct iommu_domain *domain,
+				   struct iommu_group *group, ioasid_t pasid)
+{
+	struct iommu_domain *ops_domain;
+	struct group_device *device;
+	int ret = 0;
+
+	if (domain == group->blocking_domain)
+		ops_domain = xa_load(&group->pasid_array, pasid);
+	else
+		ops_domain = domain;
+
+	list_for_each_entry(device, &group->devices, list) {
+		ret = ops_domain->ops->set_dev_pasid(domain, device->dev, pasid);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+/*
+ * iommu_attach_device_pasid() - Attach a domain to pasid of device
+ * @domain: the iommu domain.
+ * @dev: the attached device.
+ * @pasid: the pasid of the device.
+ *
+ * Return: 0 on success, or an error.
+ */
+int iommu_attach_device_pasid(struct iommu_domain *domain,
+			      struct device *dev, ioasid_t pasid)
+{
+	struct iommu_group *group;
+	void *curr;
+	int ret;
+
+	if (!domain->ops->set_dev_pasid)
+		return -EOPNOTSUPP;
+
+	group = iommu_group_get(dev);
+	if (!group)
+		return -ENODEV;
+
+	mutex_lock(&group->mutex);
+	curr = xa_cmpxchg(&group->pasid_array, pasid, NULL, domain, GFP_KERNEL);
+	if (curr) {
+		ret = xa_err(curr) ? : -EBUSY;
+		goto out_unlock;
+	}
+
+	ret = __iommu_set_group_pasid(domain, group, pasid);
+	if (ret) {
+		__iommu_set_group_pasid(group->blocking_domain, group, pasid);
+		xa_erase(&group->pasid_array, pasid);
+	}
+out_unlock:
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_attach_device_pasid);
+
+/*
+ * iommu_detach_device_pasid() - Detach the domain from pasid of device
+ * @domain: the iommu domain.
+ * @dev: the attached device.
+ * @pasid: the pasid of the device.
+ *
+ * The @domain must have been attached to @pasid of the @dev with
+ * iommu_attach_device_pasid().
+ */
+void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev,
+			       ioasid_t pasid)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+
+	mutex_lock(&group->mutex);
+	__iommu_set_group_pasid(group->blocking_domain, group, pasid);
+	WARN_ON(xa_erase(&group->pasid_array, pasid) != domain);
+	mutex_unlock(&group->mutex);
+
+	iommu_group_put(group);
+}
+EXPORT_SYMBOL_GPL(iommu_detach_device_pasid);
+
+/*
+ * iommu_get_domain_for_dev_pasid() - Retrieve domain for @pasid of @dev
+ * @dev: the queried device
+ * @pasid: the pasid of the device
+ *
+ * This is a variant of iommu_get_domain_for_dev(). It returns the existing
+ * domain attached to pasid of a device. It's only for internal use of the
+ * IOMMU subsystem. The caller must take care to avoid any possible
+ * use-after-free case.
+ *
+ * Return: attached domain on success, NULL otherwise.
+ */
+struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+	struct iommu_domain *domain;
+	struct iommu_group *group;
+
+	if (!pasid_valid(pasid))
+		return NULL;
+
+	group = iommu_group_get(dev);
+	if (!group)
+		return NULL;
+	/*
+	 * The xarray protects its internal state with RCU. Hence the domain
+	 * obtained is either NULL or fully formed.
+	 */
+	domain = xa_load(&group->pasid_array, pasid);
+	iommu_group_put(group);
+
+	return domain;
+}
+EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev_pasid);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 06/13] iommu: Add IOMMU SVA domain support
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (4 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 05/13] iommu: Add attach/detach_dev_pasid iommu interface Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 07/13] iommu/vt-d: Add " Lu Baolu
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

The sva iommu_domain represents a hardware pagetable that the IOMMU
hardware could use for SVA translation. This adds some infrastructure
to support SVA domain in the iommu common layer. It includes:

- Extend the iommu_domain to support a new IOMMU_DOMAIN_SVA domain
  type. The IOMMU drivers that support allocation of the SVA domain
  should provide its own sva domain specific iommu_domain_ops.
- Add a helper to allocate an SVA domain. The iommu_domain_free()
  is still used to free an SVA domain.

The report_iommu_fault() should be replaced by the new
iommu_report_device_fault(). Leave the existing fault handler with the
existing users and the newly added SVA members excludes it.

Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h | 25 +++++++++++++++++++++++--
 drivers/iommu/iommu.c | 20 ++++++++++++++++++++
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index f1e8953b1e2e..d0b32a289835 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -64,6 +64,8 @@ struct iommu_domain_geometry {
 #define __IOMMU_DOMAIN_PT	(1U << 2)  /* Domain is identity mapped   */
 #define __IOMMU_DOMAIN_DMA_FQ	(1U << 3)  /* DMA-API uses flush queue    */
 
+#define __IOMMU_DOMAIN_SVA	(1U << 4)  /* Shared process address space */
+
 /*
  * This are the possible domain-types
  *
@@ -77,6 +79,8 @@ struct iommu_domain_geometry {
  *				  certain optimizations for these domains
  *	IOMMU_DOMAIN_DMA_FQ	- As above, but definitely using batched TLB
  *				  invalidation.
+ *	IOMMU_DOMAIN_SVA	- DMA addresses are shared process addresses
+ *				  represented by mm_struct's.
  */
 #define IOMMU_DOMAIN_BLOCKED	(0U)
 #define IOMMU_DOMAIN_IDENTITY	(__IOMMU_DOMAIN_PT)
@@ -86,15 +90,24 @@ struct iommu_domain_geometry {
 #define IOMMU_DOMAIN_DMA_FQ	(__IOMMU_DOMAIN_PAGING |	\
 				 __IOMMU_DOMAIN_DMA_API |	\
 				 __IOMMU_DOMAIN_DMA_FQ)
+#define IOMMU_DOMAIN_SVA	(__IOMMU_DOMAIN_SVA)
 
 struct iommu_domain {
 	unsigned type;
 	const struct iommu_domain_ops *ops;
 	unsigned long pgsize_bitmap;	/* Bitmap of page sizes in use */
-	iommu_fault_handler_t handler;
-	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	struct iommu_dma_cookie *iova_cookie;
+	union {
+		struct {
+			iommu_fault_handler_t handler;
+			void *handler_token;
+		};
+		struct {	/* IOMMU_DOMAIN_SVA */
+			struct mm_struct *mm;
+			refcount_t users;
+		};
+	};
 };
 
 static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
@@ -683,6 +696,8 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
 void iommu_group_release_dma_owner(struct iommu_group *group);
 bool iommu_group_dma_owner_claimed(struct iommu_group *group);
 
+struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
+					    struct mm_struct *mm);
 int iommu_attach_device_pasid(struct iommu_domain *domain,
 			      struct device *dev, ioasid_t pasid);
 void iommu_detach_device_pasid(struct iommu_domain *domain,
@@ -1057,6 +1072,12 @@ static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)
 	return false;
 }
 
+static inline struct iommu_domain *
+iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm)
+{
+	return NULL;
+}
+
 static inline int iommu_attach_device_pasid(struct iommu_domain *domain,
 					    struct device *dev, ioasid_t pasid)
 {
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 6f2cbccc0570..ac5a1f51a9a1 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -27,6 +27,7 @@
 #include <linux/module.h>
 #include <linux/cc_platform.h>
 #include <trace/events/iommu.h>
+#include <linux/sched/mm.h>
 
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
@@ -1960,6 +1961,8 @@ EXPORT_SYMBOL_GPL(iommu_domain_alloc);
 
 void iommu_domain_free(struct iommu_domain *domain)
 {
+	if (domain->type == IOMMU_DOMAIN_SVA)
+		mmdrop(domain->mm);
 	iommu_put_dma_cookie(domain);
 	domain->ops->free(domain);
 }
@@ -3381,3 +3384,20 @@ iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
 	return domain;
 }
 EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev_pasid);
+
+struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
+					    struct mm_struct *mm)
+{
+	const struct iommu_ops *ops = dev_iommu_ops(dev);
+	struct iommu_domain *domain;
+
+	domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
+	if (!domain)
+		return NULL;
+
+	domain->type = IOMMU_DOMAIN_SVA;
+	mmgrab(mm);
+	domain->mm = mm;
+
+	return domain;
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 07/13] iommu/vt-d: Add SVA domain support
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (5 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 06/13] iommu: Add IOMMU SVA domain support Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-18 13:36   ` Jason Gunthorpe
  2022-08-17  1:20 ` [PATCH v11 08/13] arm-smmu-v3/sva: " Lu Baolu
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu

Add support for SVA domain allocation and provide an SVA-specific
iommu_domain_ops. This implementation is based on the existing SVA
code. Possible cleanup and refactoring are left for incremental
changes later.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/intel/iommu.h |  5 ++++
 drivers/iommu/intel/iommu.c |  2 ++
 drivers/iommu/intel/svm.c   | 50 +++++++++++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index a9b8367c9361..4875c9974abd 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -747,6 +747,7 @@ void intel_svm_unbind(struct iommu_sva *handle);
 u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
 			    struct iommu_page_response *msg);
+struct iommu_domain *intel_svm_domain_alloc(void);
 
 struct intel_svm_dev {
 	struct list_head list;
@@ -772,6 +773,10 @@ struct intel_svm {
 };
 #else
 static inline void intel_svm_check(struct intel_iommu *iommu) {}
+static inline struct iommu_domain *intel_svm_domain_alloc(void)
+{
+	return NULL;
+}
 #endif
 
 #ifdef CONFIG_INTEL_IOMMU_DEBUGFS
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7cca030a508e..27c9fd6139a8 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4149,6 +4149,8 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 		return domain;
 	case IOMMU_DOMAIN_IDENTITY:
 		return &si_domain->domain;
+	case IOMMU_DOMAIN_SVA:
+		return intel_svm_domain_alloc();
 	default:
 		return NULL;
 	}
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 2420fa5c2360..16a4d413fce4 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -928,3 +928,53 @@ int intel_svm_page_response(struct device *dev,
 	mutex_unlock(&pasid_mutex);
 	return ret;
 }
+
+static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
+				   struct device *dev, ioasid_t pasid)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct intel_iommu *iommu = info->iommu;
+	struct iommu_sva *sva;
+	int ret = 0;
+
+	mutex_lock(&pasid_mutex);
+	/*
+	 * Detach the domain if a blocking domain is set. Check the
+	 * right domain type once the IOMMU driver supports a real
+	 * blocking domain.
+	 */
+	if (!domain || domain->type == IOMMU_DOMAIN_UNMANAGED) {
+		intel_svm_unbind_mm(dev, pasid);
+	} else {
+		struct mm_struct *mm = domain->mm;
+
+		sva = intel_svm_bind_mm(iommu, dev, mm);
+		if (IS_ERR(sva))
+			ret = PTR_ERR(sva);
+	}
+	mutex_unlock(&pasid_mutex);
+
+	return ret;
+}
+
+static void intel_svm_domain_free(struct iommu_domain *domain)
+{
+	kfree(to_dmar_domain(domain));
+}
+
+static const struct iommu_domain_ops intel_svm_domain_ops = {
+	.set_dev_pasid		= intel_svm_set_dev_pasid,
+	.free			= intel_svm_domain_free,
+};
+
+struct iommu_domain *intel_svm_domain_alloc(void)
+{
+	struct dmar_domain *domain;
+
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		return NULL;
+	domain->domain.ops = &intel_svm_domain_ops;
+
+	return &domain->domain;
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 08/13] arm-smmu-v3/sva: Add SVA domain support
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (6 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 07/13] iommu/vt-d: Add " Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 09/13] iommu/sva: Refactoring iommu_sva_bind/unbind_device() Lu Baolu
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

Add support for SVA domain allocation and provide an SVA-specific
iommu_domain_ops. This implementation is based on the existing SVA
code. Possible cleanup and refactoring are left for incremental
changes later.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  6 ++
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 76 +++++++++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  3 +
 3 files changed, 85 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d2ba86470c42..96399dd3a67a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -758,6 +758,7 @@ struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm);
 void arm_smmu_sva_unbind(struct iommu_sva *handle);
 u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
 void arm_smmu_sva_notifier_synchronize(void);
+struct iommu_domain *arm_smmu_sva_domain_alloc(void);
 #else /* CONFIG_ARM_SMMU_V3_SVA */
 static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
 {
@@ -803,5 +804,10 @@ static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
 }
 
 static inline void arm_smmu_sva_notifier_synchronize(void) {}
+
+static inline struct iommu_domain *arm_smmu_sva_domain_alloc(void)
+{
+	return NULL;
+}
 #endif /* CONFIG_ARM_SMMU_V3_SVA */
 #endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index f155d406c5d5..43564b61c726 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -549,3 +549,79 @@ void arm_smmu_sva_notifier_synchronize(void)
 	 */
 	mmu_notifier_synchronize();
 }
+
+static void arm_smmu_sva_block_dev_pasid(struct iommu_domain *domain,
+					 struct device *dev, ioasid_t id)
+{
+	struct mm_struct *mm = domain->mm;
+	struct arm_smmu_bond *bond = NULL, *t;
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+	mutex_lock(&sva_lock);
+	list_for_each_entry(t, &master->bonds, list) {
+		if (t->mm == mm) {
+			bond = t;
+			break;
+		}
+	}
+
+	if (!WARN_ON(!bond) && refcount_dec_and_test(&bond->refs)) {
+		list_del(&bond->list);
+		arm_smmu_mmu_notifier_put(bond->smmu_mn);
+		kfree(bond);
+	}
+	mutex_unlock(&sva_lock);
+}
+
+static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
+				      struct device *dev, ioasid_t id)
+{
+	int ret = 0;
+	struct mm_struct *mm;
+	struct iommu_sva *handle;
+
+	/*
+	 * Detach the domain if a blocking domain is set. Check the
+	 * right domain type once the IOMMU driver supports a real
+	 * blocking domain.
+	 */
+	if (!domain || domain->type == IOMMU_DOMAIN_UNMANAGED) {
+		domain = iommu_get_domain_for_dev_pasid(dev, id);
+		if (!domain)
+			return -EINVAL;
+		arm_smmu_sva_block_dev_pasid(domain, dev, id);
+
+		return 0;
+	}
+
+	mm = domain->mm;
+	mutex_lock(&sva_lock);
+	handle = __arm_smmu_sva_bind(dev, mm);
+	if (IS_ERR(handle))
+		ret = PTR_ERR(handle);
+	mutex_unlock(&sva_lock);
+
+	return ret;
+}
+
+static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
+{
+	kfree(domain);
+}
+
+static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
+	.set_dev_pasid		= arm_smmu_sva_set_dev_pasid,
+	.free			= arm_smmu_sva_domain_free,
+};
+
+struct iommu_domain *arm_smmu_sva_domain_alloc(void)
+{
+	struct iommu_domain *domain;
+
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		return NULL;
+	domain->ops = &arm_smmu_sva_domain_ops;
+
+	return domain;
+}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f88541be8213..057f7c8824d7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2008,6 +2008,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 {
 	struct arm_smmu_domain *smmu_domain;
 
+	if (type == IOMMU_DOMAIN_SVA)
+		return arm_smmu_sva_domain_alloc();
+
 	if (type != IOMMU_DOMAIN_UNMANAGED &&
 	    type != IOMMU_DOMAIN_DMA &&
 	    type != IOMMU_DOMAIN_DMA_FQ &&
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 09/13] iommu/sva: Refactoring iommu_sva_bind/unbind_device()
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (7 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 08/13] arm-smmu-v3/sva: " Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-18 13:41   ` Jason Gunthorpe
  2022-08-17  1:20 ` [PATCH v11 10/13] iommu: Remove SVA related callbacks from iommu ops Lu Baolu
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

The existing iommu SVA interfaces are implemented by calling the SVA
specific iommu ops provided by the IOMMU drivers. There's no need for
any SVA specific ops in iommu_ops vector anymore as we can achieve
this through the generic attach/detach_dev_pasid domain ops.

This refactors the IOMMU SVA interfaces implementation by using the
iommu_detach/detach_device_pasid interfaces and align them with the
concept of the SVA iommu domain. Put the new SVA code in the SVA
related file in order to make it self-contained.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 include/linux/iommu.h         |  43 +++++++-------
 drivers/iommu/iommu-sva-lib.c | 104 ++++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c         |  91 -----------------------------
 3 files changed, 127 insertions(+), 111 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d0b32a289835..86b6870f9697 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -643,6 +643,7 @@ struct iommu_fwspec {
  */
 struct iommu_sva {
 	struct device			*dev;
+	struct iommu_domain		*domain;
 };
 
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
@@ -684,11 +685,6 @@ void iommu_release_device(struct device *dev);
 int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features f);
 int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
 
-struct iommu_sva *iommu_sva_bind_device(struct device *dev,
-					struct mm_struct *mm);
-void iommu_sva_unbind_device(struct iommu_sva *handle);
-u32 iommu_sva_get_pasid(struct iommu_sva *handle);
-
 int iommu_device_use_default_domain(struct device *dev);
 void iommu_device_unuse_default_domain(struct device *dev);
 
@@ -1028,21 +1024,6 @@ iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
 	return -ENODEV;
 }
 
-static inline struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
-{
-	return NULL;
-}
-
-static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
-{
-}
-
-static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
-{
-	return IOMMU_PASID_INVALID;
-}
-
 static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
 	return NULL;
@@ -1119,4 +1100,26 @@ void iommu_debugfs_setup(void);
 static inline void iommu_debugfs_setup(void) {}
 #endif
 
+#ifdef CONFIG_IOMMU_SVA
+struct iommu_sva *iommu_sva_bind_device(struct device *dev,
+					struct mm_struct *mm);
+void iommu_sva_unbind_device(struct iommu_sva *handle);
+u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+#else
+static inline struct iommu_sva *
+iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
+{
+	return NULL;
+}
+
+static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
+{
+}
+
+static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
+{
+	return IOMMU_PASID_INVALID;
+}
+#endif /* CONFIG_IOMMU_SVA */
+
 #endif /* __LINUX_IOMMU_H */
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index 106506143896..f5a9adde7491 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -4,6 +4,7 @@
  */
 #include <linux/mutex.h>
 #include <linux/sched/mm.h>
+#include <linux/iommu.h>
 
 #include "iommu-sva-lib.h"
 
@@ -69,3 +70,106 @@ struct mm_struct *iommu_sva_find(ioasid_t pasid)
 	return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_find);
+
+/**
+ * iommu_sva_bind_device() - Bind a process address space to a device
+ * @dev: the device
+ * @mm: the mm to bind, caller must hold a reference to mm_users
+ *
+ * Create a bond between device and address space, allowing the device to access
+ * the mm using the returned PASID. If a bond already exists between @device and
+ * @mm, it is returned and an additional reference is taken. Caller must call
+ * iommu_sva_unbind_device() to release each reference.
+ *
+ * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
+ * initialize the required SVA features.
+ *
+ * On error, returns an ERR_PTR value.
+ */
+struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
+{
+	struct iommu_domain *domain;
+	struct iommu_sva *bond;
+	ioasid_t max_pasids;
+	int ret;
+
+	max_pasids = dev->iommu->max_pasids;
+	if (!max_pasids)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	/* Allocate mm->pasid if necessary. */
+	ret = iommu_sva_alloc_pasid(mm, 1, max_pasids - 1);
+	if (ret)
+		return ERR_PTR(ret);
+
+	bond = kzalloc(sizeof(*bond), GFP_KERNEL);
+	if (!bond)
+		return ERR_PTR(-ENOMEM);
+
+	mutex_lock(&iommu_sva_lock);
+	/* Search for an existing domain. */
+	domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid);
+	if (domain) {
+		refcount_inc(&domain->users);
+		goto out;
+	}
+
+	/* Allocate a new domain and set it on device pasid. */
+	domain = iommu_sva_domain_alloc(dev, mm);
+	if (!domain) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	ret = iommu_attach_device_pasid(domain, dev, mm->pasid);
+	if (ret)
+		goto out_free_domain;
+	refcount_set(&domain->users, 1);
+out:
+	mutex_unlock(&iommu_sva_lock);
+	bond->dev = dev;
+	bond->domain = domain;
+
+	return bond;
+
+out_free_domain:
+	iommu_domain_free(domain);
+out_unlock:
+	mutex_unlock(&iommu_sva_lock);
+	kfree(bond);
+
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
+
+/**
+ * iommu_sva_unbind_device() - Remove a bond created with iommu_sva_bind_device
+ * @handle: the handle returned by iommu_sva_bind_device()
+ *
+ * Put reference to a bond between device and address space. The device should
+ * not be issuing any more transaction for this PASID. All outstanding page
+ * requests for this PASID must have been flushed to the IOMMU.
+ */
+void iommu_sva_unbind_device(struct iommu_sva *handle)
+{
+	struct iommu_domain *domain = handle->domain;
+	ioasid_t pasid = domain->mm->pasid;
+	struct device *dev = handle->dev;
+
+	mutex_lock(&iommu_sva_lock);
+	if (refcount_dec_and_test(&domain->users)) {
+		iommu_detach_device_pasid(domain, dev, pasid);
+		iommu_domain_free(domain);
+	}
+	mutex_unlock(&iommu_sva_lock);
+	kfree(handle);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
+
+u32 iommu_sva_get_pasid(struct iommu_sva *handle)
+{
+	struct iommu_domain *domain = handle->domain;
+
+	return domain->mm->pasid;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ac5a1f51a9a1..c77860695de3 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2776,97 +2776,6 @@ int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
 }
 EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
 
-/**
- * iommu_sva_bind_device() - Bind a process address space to a device
- * @dev: the device
- * @mm: the mm to bind, caller must hold a reference to it
- *
- * Create a bond between device and address space, allowing the device to access
- * the mm using the returned PASID. If a bond already exists between @device and
- * @mm, it is returned and an additional reference is taken. Caller must call
- * iommu_sva_unbind_device() to release each reference.
- *
- * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
- * initialize the required SVA features.
- *
- * On error, returns an ERR_PTR value.
- */
-struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
-{
-	struct iommu_group *group;
-	struct iommu_sva *handle = ERR_PTR(-EINVAL);
-	const struct iommu_ops *ops = dev_iommu_ops(dev);
-
-	if (!ops->sva_bind)
-		return ERR_PTR(-ENODEV);
-
-	group = iommu_group_get(dev);
-	if (!group)
-		return ERR_PTR(-ENODEV);
-
-	/* Ensure device count and domain don't change while we're binding */
-	mutex_lock(&group->mutex);
-
-	/*
-	 * To keep things simple, SVA currently doesn't support IOMMU groups
-	 * with more than one device. Existing SVA-capable systems are not
-	 * affected by the problems that required IOMMU groups (lack of ACS
-	 * isolation, device ID aliasing and other hardware issues).
-	 */
-	if (iommu_group_device_count(group) != 1)
-		goto out_unlock;
-
-	handle = ops->sva_bind(dev, mm);
-
-out_unlock:
-	mutex_unlock(&group->mutex);
-	iommu_group_put(group);
-
-	return handle;
-}
-EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
-
-/**
- * iommu_sva_unbind_device() - Remove a bond created with iommu_sva_bind_device
- * @handle: the handle returned by iommu_sva_bind_device()
- *
- * Put reference to a bond between device and address space. The device should
- * not be issuing any more transaction for this PASID. All outstanding page
- * requests for this PASID must have been flushed to the IOMMU.
- */
-void iommu_sva_unbind_device(struct iommu_sva *handle)
-{
-	struct iommu_group *group;
-	struct device *dev = handle->dev;
-	const struct iommu_ops *ops = dev_iommu_ops(dev);
-
-	if (!ops->sva_unbind)
-		return;
-
-	group = iommu_group_get(dev);
-	if (!group)
-		return;
-
-	mutex_lock(&group->mutex);
-	ops->sva_unbind(handle);
-	mutex_unlock(&group->mutex);
-
-	iommu_group_put(group);
-}
-EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
-
-u32 iommu_sva_get_pasid(struct iommu_sva *handle)
-{
-	const struct iommu_ops *ops = dev_iommu_ops(handle->dev);
-
-	if (!ops->sva_get_pasid)
-		return IOMMU_PASID_INVALID;
-
-	return ops->sva_get_pasid(handle);
-}
-EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
-
 /*
  * Changes the default domain of an iommu group that has *only* one device
  *
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 10/13] iommu: Remove SVA related callbacks from iommu ops
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (8 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 09/13] iommu/sva: Refactoring iommu_sva_bind/unbind_device() Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 11/13] iommu: Prepare IOMMU domain for IOPF Lu Baolu
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

These ops'es have been deprecated. There's no need for them anymore.
Remove them to avoid dead code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h                         |  7 ---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 16 ------
 drivers/iommu/intel/iommu.h                   |  3 --
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 40 ---------------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  3 --
 drivers/iommu/intel/iommu.c                   |  3 --
 drivers/iommu/intel/svm.c                     | 49 -------------------
 7 files changed, 121 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 86b6870f9697..790bdc007f54 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -227,9 +227,6 @@ struct iommu_iotlb_gather {
  *                      driver init to device driver init (default no)
  * @dev_has/enable/disable_feat: per device entries to check/enable/disable
  *                               iommu specific features.
- * @sva_bind: Bind process address space to device
- * @sva_unbind: Unbind process address space from device
- * @sva_get_pasid: Get PASID associated to a SVA handle
  * @page_response: handle page request response
  * @def_domain_type: device default domain type, return value:
  *		- IOMMU_DOMAIN_IDENTITY: must use an identity domain
@@ -260,10 +257,6 @@ struct iommu_ops {
 	int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
 	int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
 
-	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm);
-	void (*sva_unbind)(struct iommu_sva *handle);
-	u32 (*sva_get_pasid)(struct iommu_sva *handle);
-
 	int (*page_response)(struct device *dev,
 			     struct iommu_fault_event *evt,
 			     struct iommu_page_response *msg);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 96399dd3a67a..15dd4c7e6d3a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -754,9 +754,6 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master);
 int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
 int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
 bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
-struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm);
-void arm_smmu_sva_unbind(struct iommu_sva *handle);
-u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
 void arm_smmu_sva_notifier_synchronize(void);
 struct iommu_domain *arm_smmu_sva_domain_alloc(void);
 #else /* CONFIG_ARM_SMMU_V3_SVA */
@@ -790,19 +787,6 @@ static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master
 	return false;
 }
 
-static inline struct iommu_sva *
-arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline void arm_smmu_sva_unbind(struct iommu_sva *handle) {}
-
-static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
-{
-	return IOMMU_PASID_INVALID;
-}
-
 static inline void arm_smmu_sva_notifier_synchronize(void) {}
 
 static inline struct iommu_domain *arm_smmu_sva_domain_alloc(void)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 4875c9974abd..18d4128d58ea 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -742,9 +742,6 @@ struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn);
 extern void intel_svm_check(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm);
-void intel_svm_unbind(struct iommu_sva *handle);
-u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
 			    struct iommu_page_response *msg);
 struct iommu_domain *intel_svm_domain_alloc(void);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 43564b61c726..27a56ae6fff8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -344,11 +344,6 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 	if (!bond)
 		return ERR_PTR(-ENOMEM);
 
-	/* Allocate a PASID for this mm if necessary */
-	ret = iommu_sva_alloc_pasid(mm, 1, (1U << master->ssid_bits) - 1);
-	if (ret)
-		goto err_free_bond;
-
 	bond->mm = mm;
 	bond->sva.dev = dev;
 	refcount_set(&bond->refs, 1);
@@ -367,41 +362,6 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 	return ERR_PTR(ret);
 }
 
-struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
-{
-	struct iommu_sva *handle;
-	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-
-	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
-		return ERR_PTR(-EINVAL);
-
-	mutex_lock(&sva_lock);
-	handle = __arm_smmu_sva_bind(dev, mm);
-	mutex_unlock(&sva_lock);
-	return handle;
-}
-
-void arm_smmu_sva_unbind(struct iommu_sva *handle)
-{
-	struct arm_smmu_bond *bond = sva_to_bond(handle);
-
-	mutex_lock(&sva_lock);
-	if (refcount_dec_and_test(&bond->refs)) {
-		list_del(&bond->list);
-		arm_smmu_mmu_notifier_put(bond->smmu_mn);
-		kfree(bond);
-	}
-	mutex_unlock(&sva_lock);
-}
-
-u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
-{
-	struct arm_smmu_bond *bond = sva_to_bond(handle);
-
-	return bond->mm->pasid;
-}
-
 bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
 {
 	unsigned long reg, fld;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 057f7c8824d7..0dd31466401d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2830,9 +2830,6 @@ static struct iommu_ops arm_smmu_ops = {
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.dev_enable_feat	= arm_smmu_dev_enable_feature,
 	.dev_disable_feat	= arm_smmu_dev_disable_feature,
-	.sva_bind		= arm_smmu_sva_bind,
-	.sva_unbind		= arm_smmu_sva_unbind,
-	.sva_get_pasid		= arm_smmu_sva_get_pasid,
 	.page_response		= arm_smmu_page_response,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 	.owner			= THIS_MODULE,
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 27c9fd6139a8..804bfae54d89 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4754,9 +4754,6 @@ const struct iommu_ops intel_iommu_ops = {
 	.def_domain_type	= device_def_domain_type,
 	.pgsize_bitmap		= SZ_4K,
 #ifdef CONFIG_INTEL_IOMMU_SVM
-	.sva_bind		= intel_svm_bind,
-	.sva_unbind		= intel_svm_unbind,
-	.sva_get_pasid		= intel_svm_get_pasid,
 	.page_response		= intel_svm_page_response,
 #endif
 	.default_domain_ops = &(const struct iommu_domain_ops) {
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 16a4d413fce4..af4dbcdbe380 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -313,14 +313,6 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
 	return 0;
 }
 
-static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm)
-{
-	ioasid_t max_pasid = dev_is_pci(dev) ?
-			pci_max_pasids(to_pci_dev(dev)) : intel_pasid_max_id;
-
-	return iommu_sva_alloc_pasid(mm, PASID_MIN, max_pasid - 1);
-}
-
 static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 					   struct device *dev,
 					   struct mm_struct *mm)
@@ -806,47 +798,6 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 	return IRQ_RETVAL(handled);
 }
 
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm)
-{
-	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
-	struct iommu_sva *sva;
-	int ret;
-
-	mutex_lock(&pasid_mutex);
-	ret = intel_svm_alloc_pasid(dev, mm);
-	if (ret) {
-		mutex_unlock(&pasid_mutex);
-		return ERR_PTR(ret);
-	}
-
-	sva = intel_svm_bind_mm(iommu, dev, mm);
-	mutex_unlock(&pasid_mutex);
-
-	return sva;
-}
-
-void intel_svm_unbind(struct iommu_sva *sva)
-{
-	struct intel_svm_dev *sdev = to_intel_svm_dev(sva);
-
-	mutex_lock(&pasid_mutex);
-	intel_svm_unbind_mm(sdev->dev, sdev->pasid);
-	mutex_unlock(&pasid_mutex);
-}
-
-u32 intel_svm_get_pasid(struct iommu_sva *sva)
-{
-	struct intel_svm_dev *sdev;
-	u32 pasid;
-
-	mutex_lock(&pasid_mutex);
-	sdev = to_intel_svm_dev(sva);
-	pasid = sdev->pasid;
-	mutex_unlock(&pasid_mutex);
-
-	return pasid;
-}
-
 int intel_svm_page_response(struct device *dev,
 			    struct iommu_fault_event *evt,
 			    struct iommu_page_response *msg)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 11/13] iommu: Prepare IOMMU domain for IOPF
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (9 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 10/13] iommu: Remove SVA related callbacks from iommu ops Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 12/13] iommu: Per-domain I/O page fault handling Lu Baolu
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

This adds some mechanisms around the iommu_domain so that the I/O page
fault handling framework could route a page fault to the domain and
call the fault handler from it.

Add pointers to the page fault handler and its private data in struct
iommu_domain. The fault handler will be called with the private data
as a parameter once a page fault is routed to the domain. Any kernel
component which owns an iommu domain could install handler and its
private parameter so that the page fault could be further routed and
handled.

This also prepares the SVA implementation to be the first consumer of
the per-domain page fault handling model. The I/O page fault handler
for SVA is copied to the SVA file with mmget_not_zero() added before
mmap_read_lock().

Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 include/linux/iommu.h         |  3 ++
 drivers/iommu/iommu-sva-lib.h |  8 +++++
 drivers/iommu/io-pgfault.c    |  7 +++++
 drivers/iommu/iommu-sva-lib.c | 58 +++++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c         |  4 +++
 5 files changed, 80 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 790bdc007f54..bd6331a4dbae 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -98,6 +98,9 @@ struct iommu_domain {
 	unsigned long pgsize_bitmap;	/* Bitmap of page sizes in use */
 	struct iommu_domain_geometry geometry;
 	struct iommu_dma_cookie *iova_cookie;
+	enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault,
+						      void *data);
+	void *fault_data;
 	union {
 		struct {
 			iommu_fault_handler_t handler;
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index 8909ea1094e3..1b3ace4b5863 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -26,6 +26,8 @@ int iopf_queue_flush_dev(struct device *dev);
 struct iopf_queue *iopf_queue_alloc(const char *name);
 void iopf_queue_free(struct iopf_queue *queue);
 int iopf_queue_discard_partial(struct iopf_queue *queue);
+enum iommu_page_response_code
+iommu_sva_handle_iopf(struct iommu_fault *fault, void *data);
 
 #else /* CONFIG_IOMMU_SVA */
 static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
@@ -63,5 +65,11 @@ static inline int iopf_queue_discard_partial(struct iopf_queue *queue)
 {
 	return -ENODEV;
 }
+
+static inline enum iommu_page_response_code
+iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
+{
+	return IOMMU_PAGE_RESP_INVALID;
+}
 #endif /* CONFIG_IOMMU_SVA */
 #endif /* _IOMMU_SVA_LIB_H */
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index 1df8c1dcae77..aee9e033012f 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -181,6 +181,13 @@ static void iopf_handle_group(struct work_struct *work)
  * request completes, outstanding faults will have been dealt with by the time
  * the PASID is freed.
  *
+ * Any valid page fault will be eventually routed to an iommu domain and the
+ * page fault handler installed there will get called. The users of this
+ * handling framework should guarantee that the iommu domain could only be
+ * freed after the device has stopped generating page faults (or the iommu
+ * hardware has been set to block the page faults) and the pending page faults
+ * have been flushed.
+ *
  * Return: 0 on success and <0 on error.
  */
 int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index f5a9adde7491..79a9d43bdfe1 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -173,3 +173,61 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle)
 	return domain->mm->pasid;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * I/O page fault handler for SVA
+ */
+enum iommu_page_response_code
+iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
+{
+	vm_fault_t ret;
+	struct vm_area_struct *vma;
+	struct mm_struct *mm = data;
+	unsigned int access_flags = 0;
+	unsigned int fault_flags = FAULT_FLAG_REMOTE;
+	struct iommu_fault_page_request *prm = &fault->prm;
+	enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
+
+	if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
+		return status;
+
+	if (!mmget_not_zero(mm))
+		return status;
+
+	mmap_read_lock(mm);
+
+	vma = find_extend_vma(mm, prm->addr);
+	if (!vma)
+		/* Unmapped area */
+		goto out_put_mm;
+
+	if (prm->perm & IOMMU_FAULT_PERM_READ)
+		access_flags |= VM_READ;
+
+	if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
+		access_flags |= VM_WRITE;
+		fault_flags |= FAULT_FLAG_WRITE;
+	}
+
+	if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
+		access_flags |= VM_EXEC;
+		fault_flags |= FAULT_FLAG_INSTRUCTION;
+	}
+
+	if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
+		fault_flags |= FAULT_FLAG_USER;
+
+	if (access_flags & ~vma->vm_flags)
+		/* Access fault */
+		goto out_put_mm;
+
+	ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
+	status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :
+		IOMMU_PAGE_RESP_SUCCESS;
+
+out_put_mm:
+	mmap_read_unlock(mm);
+	mmput(mm);
+
+	return status;
+}
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index c77860695de3..3ad32aa7523a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -29,6 +29,8 @@
 #include <trace/events/iommu.h>
 #include <linux/sched/mm.h>
 
+#include "iommu-sva-lib.h"
+
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
 
@@ -3307,6 +3309,8 @@ struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
 	domain->type = IOMMU_DOMAIN_SVA;
 	mmgrab(mm);
 	domain->mm = mm;
+	domain->iopf_handler = iommu_sva_handle_iopf;
+	domain->fault_data = mm;
 
 	return domain;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 12/13] iommu: Per-domain I/O page fault handling
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (10 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 11/13] iommu: Prepare IOMMU domain for IOPF Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-17  1:20 ` [PATCH v11 13/13] iommu: Rename iommu-sva-lib.{c,h} Lu Baolu
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

Tweak the I/O page fault handling framework to route the page faults to
the domain and call the page fault handler retrieved from the domain.
This makes the I/O page fault handling framework possible to serve more
usage scenarios as long as they have an IOMMU domain and install a page
fault handler in it. Some unused functions are also removed to avoid
dead code.

The iommu_get_domain_for_dev_pasid() which retrieves attached domain
for a {device, PASID} pair is used. It will be used by the page fault
handling framework which knows {device, PASID} reported from the iommu
driver. We have a guarantee that the SVA domain doesn't go away during
IOPF handling, because unbind() won't free the domain until all the
pending page requests have been flushed from the pipeline. The drivers
either call iopf_queue_flush_dev() explicitly, or in stall case, the
device driver is required to flush all DMAs including stalled
transactions before calling unbind().

This also renames iopf_handle_group() to iopf_handler() to avoid
confusing.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/io-pgfault.c | 68 +++++---------------------------------
 1 file changed, 9 insertions(+), 59 deletions(-)

diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index aee9e033012f..d1c522f4ab34 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -69,69 +69,18 @@ static int iopf_complete_group(struct device *dev, struct iopf_fault *iopf,
 	return iommu_page_response(dev, &resp);
 }
 
-static enum iommu_page_response_code
-iopf_handle_single(struct iopf_fault *iopf)
-{
-	vm_fault_t ret;
-	struct mm_struct *mm;
-	struct vm_area_struct *vma;
-	unsigned int access_flags = 0;
-	unsigned int fault_flags = FAULT_FLAG_REMOTE;
-	struct iommu_fault_page_request *prm = &iopf->fault.prm;
-	enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
-
-	if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
-		return status;
-
-	mm = iommu_sva_find(prm->pasid);
-	if (IS_ERR_OR_NULL(mm))
-		return status;
-
-	mmap_read_lock(mm);
-
-	vma = find_extend_vma(mm, prm->addr);
-	if (!vma)
-		/* Unmapped area */
-		goto out_put_mm;
-
-	if (prm->perm & IOMMU_FAULT_PERM_READ)
-		access_flags |= VM_READ;
-
-	if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
-		access_flags |= VM_WRITE;
-		fault_flags |= FAULT_FLAG_WRITE;
-	}
-
-	if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
-		access_flags |= VM_EXEC;
-		fault_flags |= FAULT_FLAG_INSTRUCTION;
-	}
-
-	if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
-		fault_flags |= FAULT_FLAG_USER;
-
-	if (access_flags & ~vma->vm_flags)
-		/* Access fault */
-		goto out_put_mm;
-
-	ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
-	status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :
-		IOMMU_PAGE_RESP_SUCCESS;
-
-out_put_mm:
-	mmap_read_unlock(mm);
-	mmput(mm);
-
-	return status;
-}
-
-static void iopf_handle_group(struct work_struct *work)
+static void iopf_handler(struct work_struct *work)
 {
 	struct iopf_group *group;
+	struct iommu_domain *domain;
 	struct iopf_fault *iopf, *next;
 	enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;
 
 	group = container_of(work, struct iopf_group, work);
+	domain = iommu_get_domain_for_dev_pasid(group->dev,
+				group->last_fault.fault.prm.pasid);
+	if (!domain || !domain->iopf_handler)
+		status = IOMMU_PAGE_RESP_INVALID;
 
 	list_for_each_entry_safe(iopf, next, &group->faults, list) {
 		/*
@@ -139,7 +88,8 @@ static void iopf_handle_group(struct work_struct *work)
 		 * faults in the group if there is an error.
 		 */
 		if (status == IOMMU_PAGE_RESP_SUCCESS)
-			status = iopf_handle_single(iopf);
+			status = domain->iopf_handler(&iopf->fault,
+						      domain->fault_data);
 
 		if (!(iopf->fault.prm.flags &
 		      IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
@@ -242,7 +192,7 @@ int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
 	group->last_fault.fault = *fault;
 	INIT_LIST_HEAD(&group->faults);
 	list_add(&group->last_fault.list, &group->faults);
-	INIT_WORK(&group->work, iopf_handle_group);
+	INIT_WORK(&group->work, iopf_handler);
 
 	/* See if we have partial faults for this group */
 	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v11 13/13] iommu: Rename iommu-sva-lib.{c,h}
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (11 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 12/13] iommu: Per-domain I/O page fault handling Lu Baolu
@ 2022-08-17  1:20 ` Lu Baolu
  2022-08-22  4:49 ` [PATCH v11 00/13] iommu: SVA and IOPF refactoring Zhangfei Gao
  2022-08-26  4:17 ` Baolu Lu
  14 siblings, 0 replies; 33+ messages in thread
From: Lu Baolu @ 2022-08-17  1:20 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel, Lu Baolu, Jean-Philippe Brucker

Rename iommu-sva-lib.c[h] to iommu-sva.c[h] as it contains all code
for SVA implementation in iommu core.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/{iommu-sva-lib.h => iommu-sva.h}  | 6 +++---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c     | 2 +-
 drivers/iommu/intel/iommu.c                     | 2 +-
 drivers/iommu/intel/svm.c                       | 2 +-
 drivers/iommu/io-pgfault.c                      | 2 +-
 drivers/iommu/{iommu-sva-lib.c => iommu-sva.c}  | 2 +-
 drivers/iommu/iommu.c                           | 2 +-
 drivers/iommu/Makefile                          | 2 +-
 9 files changed, 11 insertions(+), 11 deletions(-)
 rename drivers/iommu/{iommu-sva-lib.h => iommu-sva.h} (95%)
 rename drivers/iommu/{iommu-sva-lib.c => iommu-sva.c} (99%)

diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva.h
similarity index 95%
rename from drivers/iommu/iommu-sva-lib.h
rename to drivers/iommu/iommu-sva.h
index 1b3ace4b5863..7215a761b962 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva.h
@@ -2,8 +2,8 @@
 /*
  * SVA library for IOMMU drivers
  */
-#ifndef _IOMMU_SVA_LIB_H
-#define _IOMMU_SVA_LIB_H
+#ifndef _IOMMU_SVA_H
+#define _IOMMU_SVA_H
 
 #include <linux/ioasid.h>
 #include <linux/mm_types.h>
@@ -72,4 +72,4 @@ iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
 	return IOMMU_PAGE_RESP_INVALID;
 }
 #endif /* CONFIG_IOMMU_SVA */
-#endif /* _IOMMU_SVA_LIB_H */
+#endif /* _IOMMU_SVA_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 27a56ae6fff8..553cd44773e1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -10,7 +10,7 @@
 #include <linux/slab.h>
 
 #include "arm-smmu-v3.h"
-#include "../../iommu-sva-lib.h"
+#include "../../iommu-sva.h"
 #include "../../io-pgtable-arm.h"
 
 struct arm_smmu_mmu_notifier {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0dd31466401d..6dd57b6f9050 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -31,7 +31,7 @@
 #include <linux/amba/bus.h>
 
 #include "arm-smmu-v3.h"
-#include "../../iommu-sva-lib.h"
+#include "../../iommu-sva.h"
 
 static bool disable_bypass = true;
 module_param(disable_bypass, bool, 0444);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 804bfae54d89..17aa07e9799c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -27,7 +27,7 @@
 
 #include "iommu.h"
 #include "../irq_remapping.h"
-#include "../iommu-sva-lib.h"
+#include "../iommu-sva.h"
 #include "pasid.h"
 #include "cap_audit.h"
 
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index af4dbcdbe380..d07b16918e4c 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -24,7 +24,7 @@
 #include "iommu.h"
 #include "pasid.h"
 #include "perf.h"
-#include "../iommu-sva-lib.h"
+#include "../iommu-sva.h"
 #include "trace.h"
 
 static irqreturn_t prq_event_thread(int irq, void *d);
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index d1c522f4ab34..7a60b123e6b9 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -11,7 +11,7 @@
 #include <linux/slab.h>
 #include <linux/workqueue.h>
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 /**
  * struct iopf_queue - IO Page Fault queue
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva.c
similarity index 99%
rename from drivers/iommu/iommu-sva-lib.c
rename to drivers/iommu/iommu-sva.c
index 79a9d43bdfe1..d9a3a6336e51 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva.c
@@ -6,7 +6,7 @@
 #include <linux/sched/mm.h>
 #include <linux/iommu.h>
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 static DEFINE_MUTEX(iommu_sva_lock);
 static DECLARE_IOASID_SET(iommu_sva_pasid);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ad32aa7523a..be4c652871d8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -29,7 +29,7 @@
 #include <trace/events/iommu.h>
 #include <linux/sched/mm.h>
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 44475a9b3eea..c1763476162b 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,6 +27,6 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
-obj-$(CONFIG_IOMMU_SVA) += iommu-sva-lib.o io-pgfault.o
+obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o io-pgfault.o
 obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
 obj-$(CONFIG_APPLE_DART) += apple-dart.o
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-17  1:20 ` [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path Lu Baolu
@ 2022-08-17 21:17   ` Bjorn Helgaas
  2022-08-17 22:48     ` Jason Gunthorpe
  2022-08-18 11:53     ` Baolu Lu
  2022-08-18 13:04   ` Jason Gunthorpe
  1 sibling, 2 replies; 33+ messages in thread
From: Bjorn Helgaas @ 2022-08-17 21:17 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

On Wed, Aug 17, 2022 at 09:20:15AM +0800, Lu Baolu wrote:
> Some configurations of the PCI fabric will route device originated TLP
> packets based on the memory addresses.

This makes it sound like a few unusual configurations will route TLPs
based on memory addresses, but address routing is the default for all
PCIe Memory Requests, and ACS provides a way to override that default.

> These configurations are incompatible with PASID as the PASID
> packets form a distinct address space.

I would say "the Requester ID/PASID combination forms a distinct
address space."

> For instance, any configuration where switches are present
> without ACS enabled is incompatible.
> 
> This enhances the pci_enable_pasid() interface by requiring the ACS to
> support Source Validation, Request Redirection, Completer Redirection,
> and Upstream Forwarding. This effectively means that devices cannot
> spoof their requester ID, requests and completions cannot be redirected,
> and all transactions are forwarded upstream, even as it passes through a
> bridge where the target device is downstream.

I think your patch actually requires all those features to be not just
"supported" but actually *enabled* for the entire path leading to the
device.

To use the terms from the spec:

  "P2P Request Redirect"
  "P2P Completion Redirect"
  "Requester ID, Requests, and Completions"

and maybe something like:

  ... even if the TLP looks like a P2P Request because its memory
  address (ignoring the PASID) would fall in a bridge window and would
  normally be routed downstream.

Does the PCIe spec really allow TLPs with PASID to be routed anywhere
except upstream?  It seems nonsensical to route them downstream, and
hardware should be able to check that easily.  But I took a quick look
through the spec and didn't see anything about PASID by itself
influencing routing.

> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/pci/ats.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index c967ad6e2626..0715e48e7973 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -382,6 +382,11 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
>  	if (!pasid)
>  		return -EINVAL;
>  
> +	if (!pci_acs_path_enabled(pdev, NULL,
> +				  PCI_ACS_SV | PCI_ACS_RR |
> +				  PCI_ACS_CR | PCI_ACS_UF))
> +		return -EINVAL;
> +
>  	pci_read_config_word(pdev, pasid + PCI_PASID_CAP, &supported);
>  	supported &= PCI_PASID_CAP_EXEC | PCI_PASID_CAP_PRIV;
>  
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-17 21:17   ` Bjorn Helgaas
@ 2022-08-17 22:48     ` Jason Gunthorpe
  2022-08-18 11:55       ` Baolu Lu
  2022-08-18 11:53     ` Baolu Lu
  1 sibling, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2022-08-17 22:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Lu Baolu, Joerg Roedel, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

On Wed, Aug 17, 2022 at 04:17:43PM -0500, Bjorn Helgaas wrote:

> Does the PCIe spec really allow TLPs with PASID to be routed anywhere
> except upstream?

I think yes:

 2.2.10.2 End-End TLP Prefix Processing:

 The presence of an End-End TLP Prefix does not alter the routing of a
 TLP. TLPs are routed based on the routing rules covered in Section
 2.2.4 .

Which I read as saying that routing is done after stripping all the
prefixes. PASID is a prefix.

Lu, you may want to quote the spec in the commit message to make it
clear.

A strange choice in my opinion, but there it is..

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-17 21:17   ` Bjorn Helgaas
  2022-08-17 22:48     ` Jason Gunthorpe
@ 2022-08-18 11:53     ` Baolu Lu
  2022-08-18 23:00       ` Bjorn Helgaas
  1 sibling, 1 reply; 33+ messages in thread
From: Baolu Lu @ 2022-08-18 11:53 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: baolu.lu, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Bjorn Helgaas, Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

Hi Bjorn,

On 2022/8/18 05:17, Bjorn Helgaas wrote:
> On Wed, Aug 17, 2022 at 09:20:15AM +0800, Lu Baolu wrote:
>> Some configurations of the PCI fabric will route device originated TLP
>> packets based on the memory addresses.
> This makes it sound like a few unusual configurations will route TLPs
> based on memory addresses, but address routing is the default for all
> PCIe Memory Requests, and ACS provides a way to override that default.
> 
>> These configurations are incompatible with PASID as the PASID
>> packets form a distinct address space.
> I would say "the Requester ID/PASID combination forms a distinct
> address space."
> 
>> For instance, any configuration where switches are present
>> without ACS enabled is incompatible.
>>
>> This enhances the pci_enable_pasid() interface by requiring the ACS to
>> support Source Validation, Request Redirection, Completer Redirection,
>> and Upstream Forwarding. This effectively means that devices cannot
>> spoof their requester ID, requests and completions cannot be redirected,
>> and all transactions are forwarded upstream, even as it passes through a
>> bridge where the target device is downstream.
> I think your patch actually requires all those features to be not just
> "supported" but actually*enabled*  for the entire path leading to the
> device.
> 
> To use the terms from the spec:
> 
>    "P2P Request Redirect"
>    "P2P Completion Redirect"
>    "Requester ID, Requests, and Completions"
> 
> and maybe something like:
> 
>    ... even if the TLP looks like a P2P Request because its memory
>    address (ignoring the PASID) would fall in a bridge window and would
>    normally be routed downstream.

Thank you for the suggestions. I will rephrase the commit message
accordingly like this:


PCI: Allow PASID only when ACS enforced on upstreaming path

The PCIe fabric routes TLPs based on memory addresses for all PCIe Memory
Requests regardless of whether TLPs have PASID prefixes. This is stated in
section "2.2.10.2 End-End TLP Prefix Processing" of the specification:

   The presence of an End-End TLP Prefix does not alter the routing of a
   TLP. TLPs are routed based on the routing rules covered in Section
   2.2.4 .

As the Requester ID/PASID combination forms a distinct address space. The
memory address based routing is not compatible for PASID TLPs anymore.
Therefore we have to rely on ACS to override that default.

This enhances pci_enable_pasid() interface by requiring the ACS features
to be enabled for the entire path leading to the device. So that even if
the TLP looks like a P2P Request because its memory address (ignoring the
PASID) would fall in a bridge window and would normally be routed
downstream.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-17 22:48     ` Jason Gunthorpe
@ 2022-08-18 11:55       ` Baolu Lu
  0 siblings, 0 replies; 33+ messages in thread
From: Baolu Lu @ 2022-08-18 11:55 UTC (permalink / raw)
  To: Jason Gunthorpe, Bjorn Helgaas
  Cc: baolu.lu, Joerg Roedel, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

On 2022/8/18 06:48, Jason Gunthorpe wrote:
> On Wed, Aug 17, 2022 at 04:17:43PM -0500, Bjorn Helgaas wrote:
> 
>> Does the PCIe spec really allow TLPs with PASID to be routed anywhere
>> except upstream?
> I think yes:
> 
>   2.2.10.2 End-End TLP Prefix Processing:
> 
>   The presence of an End-End TLP Prefix does not alter the routing of a
>   TLP. TLPs are routed based on the routing rules covered in Section
>   2.2.4 .
> 
> Which I read as saying that routing is done after stripping all the
> prefixes. PASID is a prefix.
> 
> Lu, you may want to quote the spec in the commit message to make it
> clear.

Yes. Sure thing. Thank you!

Best regards,
baolu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-17  1:20 ` [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path Lu Baolu
  2022-08-17 21:17   ` Bjorn Helgaas
@ 2022-08-18 13:04   ` Jason Gunthorpe
  2022-08-23  7:10     ` Baolu Lu
  1 sibling, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2022-08-18 13:04 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Christoph Hellwig, Bjorn Helgaas, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker,
	Dave Jiang, Vinod Koul, Eric Auger, Liu Yi L, Jacob jun Pan,
	Zhangfei Gao, Zhu Tony, iommu, linux-pci, linux-kernel

On Wed, Aug 17, 2022 at 09:20:15AM +0800, Lu Baolu wrote:
> Some configurations of the PCI fabric will route device originated TLP
> packets based on the memory addresses. These configurations are
> incompatible with PASID as the PASID packets form a distinct address
> space. For instance, any configuration where switches are present
> without ACS enabled is incompatible.
> 
> This enhances the pci_enable_pasid() interface by requiring the ACS to
> support Source Validation, Request Redirection, Completer Redirection,
> and Upstream Forwarding. This effectively means that devices cannot
> spoof their requester ID, requests and completions cannot be redirected,
> and all transactions are forwarded upstream, even as it passes through a
> bridge where the target device is downstream.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/pci/ats.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index c967ad6e2626..0715e48e7973 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -382,6 +382,11 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
>  	if (!pasid)
>  		return -EINVAL;
>  
> +	if (!pci_acs_path_enabled(pdev, NULL,
> +				  PCI_ACS_SV | PCI_ACS_RR |
> +				  PCI_ACS_CR | PCI_ACS_UF))

I think we only need RR and UF here?

Source Validation causes the switch to validate the requestor RID in
each TLP which has nothing to do with address based routing

Completion Redirect changes how RID routing works, and has nothing to
do with address based routing.

Yes, both of those are usually set for virtualization scenarios but we
shouldn't check it here as a basic requirement to enable PASID.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 05/13] iommu: Add attach/detach_dev_pasid iommu interface
  2022-08-17  1:20 ` [PATCH v11 05/13] iommu: Add attach/detach_dev_pasid iommu interface Lu Baolu
@ 2022-08-18 13:33   ` Jason Gunthorpe
  2022-08-23  7:30     ` Baolu Lu
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2022-08-18 13:33 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Christoph Hellwig, Bjorn Helgaas, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker,
	Dave Jiang, Vinod Koul, Eric Auger, Liu Yi L, Jacob jun Pan,
	Zhangfei Gao, Zhu Tony, iommu, linux-pci, linux-kernel,
	Jean-Philippe Brucker

On Wed, Aug 17, 2022 at 09:20:16AM +0800, Lu Baolu wrote:

> +static int __iommu_set_group_pasid(struct iommu_domain *domain,
> +				   struct iommu_group *group, ioasid_t pasid)
> +{
> +	struct iommu_domain *ops_domain;
> +	struct group_device *device;
> +	int ret = 0;
> +
> +	if (domain == group->blocking_domain)
> +		ops_domain = xa_load(&group->pasid_array, pasid);
> +	else
> +		ops_domain = domain;

This seems weird, why isn't this just always

domain->ops->set_dev_pasid()?

> +	if (curr) {
> +		ret = xa_err(curr) ? : -EBUSY;
> +		goto out_unlock;
> +	}
> +
> +	ret = __iommu_set_group_pasid(domain, group, pasid);
> +	if (ret) {
> +		__iommu_set_group_pasid(group->blocking_domain, group, pasid);
> +		xa_erase(&group->pasid_array, pasid);

I was looking at this trying to figure out why we are having
attach/detach semantics vs set and this error handling seems to be the
reason

Lets add a comment because it is subtle thing:

  Setting a PASID to a blocking domain cannot fail, so we can always
  safely error unwind a failure to attach a domain back to the original
  group configuration of the PASID being unused.

> +/*
> + * iommu_detach_device_pasid() - Detach the domain from pasid of device
> + * @domain: the iommu domain.
> + * @dev: the attached device.
> + * @pasid: the pasid of the device.
> + *
> + * The @domain must have been attached to @pasid of the @dev with
> + * iommu_attach_device_pasid().
> + */
> +void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev,
> +			       ioasid_t pasid)

Don't pass domain here?

> +/*
> + * iommu_get_domain_for_dev_pasid() - Retrieve domain for @pasid of @dev
> + * @dev: the queried device
> + * @pasid: the pasid of the device
> + *
> + * This is a variant of iommu_get_domain_for_dev(). It returns the existing
> + * domain attached to pasid of a device. It's only for internal use of the
> + * IOMMU subsystem. The caller must take care to avoid any possible
> + * use-after-free case.

How exactly does the caller manage that?

> + *
> + * Return: attached domain on success, NULL otherwise.
> + */
> +struct iommu_domain *
> +iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
> +{
> +	struct iommu_domain *domain;
> +	struct iommu_group *group;
> +
> +	if (!pasid_valid(pasid))
> +		return NULL;

Why bother? If the pasid is not valid then it definitely won't be in the xarray.

But otherwise this overall thing seems fine to me

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 07/13] iommu/vt-d: Add SVA domain support
  2022-08-17  1:20 ` [PATCH v11 07/13] iommu/vt-d: Add " Lu Baolu
@ 2022-08-18 13:36   ` Jason Gunthorpe
  2022-08-23  7:33     ` Baolu Lu
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2022-08-18 13:36 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Christoph Hellwig, Bjorn Helgaas, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker,
	Dave Jiang, Vinod Koul, Eric Auger, Liu Yi L, Jacob jun Pan,
	Zhangfei Gao, Zhu Tony, iommu, linux-pci, linux-kernel

On Wed, Aug 17, 2022 at 09:20:18AM +0800, Lu Baolu wrote:

> +static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
> +				   struct device *dev, ioasid_t pasid)
> +{
> +	struct device_domain_info *info = dev_iommu_priv_get(dev);
> +	struct intel_iommu *iommu = info->iommu;
> +	struct iommu_sva *sva;
> +	int ret = 0;
> +
> +	mutex_lock(&pasid_mutex);
> +	/*
> +	 * Detach the domain if a blocking domain is set. Check the
> +	 * right domain type once the IOMMU driver supports a real
> +	 * blocking domain.
> +	 */
> +	if (!domain || domain->type == IOMMU_DOMAIN_UNMANAGED) {
> +		intel_svm_unbind_mm(dev, pasid);

See, I think this is exactly the wrong way to use the ops

The blockin domain ops should have its own function that just
unconditionally calls intel_svm_unbind_mm()

> +	} else {
> +		struct mm_struct *mm = domain->mm;
> +
> +		sva = intel_svm_bind_mm(iommu, dev, mm);
> +		if (IS_ERR(sva))
> +			ret = PTR_ERR(sva);

And similarly the SVA domain should have its own op that does this SVM
call.

Muxing the ops with tests on the domain is an anti-pattern. In fact I
would say any time you see an op testing the domain->type it is very
suspicious.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 09/13] iommu/sva: Refactoring iommu_sva_bind/unbind_device()
  2022-08-17  1:20 ` [PATCH v11 09/13] iommu/sva: Refactoring iommu_sva_bind/unbind_device() Lu Baolu
@ 2022-08-18 13:41   ` Jason Gunthorpe
  2022-08-23 10:12     ` Baolu Lu
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2022-08-18 13:41 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Christoph Hellwig, Bjorn Helgaas, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker,
	Dave Jiang, Vinod Koul, Eric Auger, Liu Yi L, Jacob jun Pan,
	Zhangfei Gao, Zhu Tony, iommu, linux-pci, linux-kernel,
	Jean-Philippe Brucker

On Wed, Aug 17, 2022 at 09:20:20AM +0800, Lu Baolu wrote:
> +
> +/**
> + * iommu_sva_bind_device() - Bind a process address space to a device
> + * @dev: the device
> + * @mm: the mm to bind, caller must hold a reference to mm_users
> + *
> + * Create a bond between device and address space, allowing the device to access
> + * the mm using the returned PASID. If a bond already exists between @device and
> + * @mm, it is returned and an additional reference is taken. Caller must call
> + * iommu_sva_unbind_device() to release each reference.
> + *
> + * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
> + * initialize the required SVA features.
> + *
> + * On error, returns an ERR_PTR value.
> + */
> +struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
> +{
> +	struct iommu_domain *domain;
> +	struct iommu_sva *bond;

This is called handle below, pick one name please

> +	ioasid_t max_pasids;
> +	int ret;
> +
> +	max_pasids = dev->iommu->max_pasids;
> +	if (!max_pasids)
> +		return ERR_PTR(-EOPNOTSUPP);
> +
> +	/* Allocate mm->pasid if necessary. */
> +	ret = iommu_sva_alloc_pasid(mm, 1, max_pasids - 1);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	bond = kzalloc(sizeof(*bond), GFP_KERNEL);
> +	if (!bond)
> +		return ERR_PTR(-ENOMEM);
> +
> +	mutex_lock(&iommu_sva_lock);
> +	/* Search for an existing domain. */
> +	domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid);
> +	if (domain) {

This isn't safe, or sane. A driver could have attached something to
this PASID that is not a SVA domain and thus not protected by the
iommu_sva_lock.

At a minimum you should add a type match to
iommu_get_domain_for_dev_pasid(), eg to confirm it is a SVA domain and
do that check under the xa_lock of the pasid xarray.

And then the general idea is that SVA domain attach/detach must hold
this janky global lock. 

> +		refcount_inc(&domain->users);

This atomic is always processed under the iommu_sva_lock, so it
doesn't need to be an atomic anymore.

Otherwise this design looks OK to me too

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-18 11:53     ` Baolu Lu
@ 2022-08-18 23:00       ` Bjorn Helgaas
  2022-08-22  7:43         ` Ethan Zhao
  2022-08-23  7:05         ` Baolu Lu
  0 siblings, 2 replies; 33+ messages in thread
From: Bjorn Helgaas @ 2022-08-18 23:00 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

On Thu, Aug 18, 2022 at 07:53:15PM +0800, Baolu Lu wrote:
> On 2022/8/18 05:17, Bjorn Helgaas wrote:
> > On Wed, Aug 17, 2022 at 09:20:15AM +0800, Lu Baolu wrote:
> > > Some configurations of the PCI fabric will route device originated TLP
> > > packets based on the memory addresses.
> > This makes it sound like a few unusual configurations will route TLPs
> > based on memory addresses, but address routing is the default for all
> > PCIe Memory Requests, and ACS provides a way to override that default.
> > 
> > > These configurations are incompatible with PASID as the PASID
> > > packets form a distinct address space.
> > I would say "the Requester ID/PASID combination forms a distinct
> > address space."
> > 
> > > For instance, any configuration where switches are present
> > > without ACS enabled is incompatible.
> > > 
> > > This enhances the pci_enable_pasid() interface by requiring the ACS to
> > > support Source Validation, Request Redirection, Completer Redirection,
> > > and Upstream Forwarding. This effectively means that devices cannot
> > > spoof their requester ID, requests and completions cannot be redirected,
> > > and all transactions are forwarded upstream, even as it passes through a
> > > bridge where the target device is downstream.
> >
> > I think your patch actually requires all those features to be not just
> > "supported" but actually*enabled*  for the entire path leading to the
> > device.
> > 
> > To use the terms from the spec:
> > 
> >    "P2P Request Redirect"
> >    "P2P Completion Redirect"
> >    "Requester ID, Requests, and Completions"
> > 
> > and maybe something like:
> > 
> >    ... even if the TLP looks like a P2P Request because its memory
> >    address (ignoring the PASID) would fall in a bridge window and would
> >    normally be routed downstream.
> 
> Thank you for the suggestions. I will rephrase the commit message
> accordingly like this:
> 
> 
> PCI: Allow PASID only when ACS enforced on upstreaming path

PCI: Enable PASID only when ACS RR & UF enabled on upstream path

The Requester ID/Process Address Space ID (PASID) combination
identifies an address space distinct from the PCI bus address space,
e.g., an address space defined by an IOMMU.

But the PCIe fabric routes Memory Requests based on the TLP address,
ignoring any PASID (PCIe r6.0, sec 2.2.10.4), so a TLP with PASID that
*should* go upstream to the IOMMU may instead be routed as a P2P
Request if its address falls in a bridge window.

To ensure that all Memory Requests with PASID are routed upstream,
only enable PASID if ACS P2P Request Redirect and Upstream Forwarding
are enabled for the path leading to the device.

> The PCIe fabric routes TLPs based on memory addresses for all PCIe Memory
> Requests regardless of whether TLPs have PASID prefixes. This is stated in
> section "2.2.10.2 End-End TLP Prefix Processing" of the specification:
> 
>   The presence of an End-End TLP Prefix does not alter the routing of a
>   TLP. TLPs are routed based on the routing rules covered in Section
>   2.2.4 .
> 
> As the Requester ID/PASID combination forms a distinct address space. The
> memory address based routing is not compatible for PASID TLPs anymore.
> Therefore we have to rely on ACS to override that default.
> 
> This enhances pci_enable_pasid() interface by requiring the ACS features
> to be enabled for the entire path leading to the device. So that even if
> the TLP looks like a P2P Request because its memory address (ignoring the
> PASID) would fall in a bridge window and would normally be routed
> downstream.
> 
> Best regards,
> baolu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 00/13] iommu: SVA and IOPF refactoring
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (12 preceding siblings ...)
  2022-08-17  1:20 ` [PATCH v11 13/13] iommu: Rename iommu-sva-lib.{c,h} Lu Baolu
@ 2022-08-22  4:49 ` Zhangfei Gao
  2022-08-23  7:00   ` Baolu Lu
  2022-08-26  4:17 ` Baolu Lu
  14 siblings, 1 reply; 33+ messages in thread
From: Zhangfei Gao @ 2022-08-22  4:49 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Bjorn Helgaas, Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony,
	iommu, linux-pci, linux-kernel



On 2022/8/17 上午9:20, Lu Baolu wrote:
> Hi folks,
>
> The former part of this series introduces the IOMMU interfaces to attach
> or detach an iommu domain to/from a pasid of a device, and refactors the
> exsiting IOMMU SVA implementation by assigning an SVA type of iommu
> domain to a shared virtual address and replacing sva_bind/unbind iommu
> ops with a set_dev_pasid domain ops.
>
> The latter part changes the existing I/O page fault handling framework
> from only serving SVA to a generic one. Any driver or component could
> handle the I/O page faults for its domain in its own way by installing
> an I/O page fault handler.
>
> This series has been functionally tested on an x86 machine and compile
> tested for all architectures.
>
> This series is also available on github:
> [2] https://github.com/LuBaolu/intel-iommu/commits/iommu-sva-refactoring-v11
>
> Please review and suggest.
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
On arm64 (Kunpeng920) with uacce.

Thanks

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-18 23:00       ` Bjorn Helgaas
@ 2022-08-22  7:43         ` Ethan Zhao
  2022-08-23  7:05         ` Baolu Lu
  1 sibling, 0 replies; 33+ messages in thread
From: Ethan Zhao @ 2022-08-22  7:43 UTC (permalink / raw)
  To: Bjorn Helgaas, Baolu Lu
  Cc: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel


在 2022/8/19 7:00, Bjorn Helgaas 写道:
> On Thu, Aug 18, 2022 at 07:53:15PM +0800, Baolu Lu wrote:
>> On 2022/8/18 05:17, Bjorn Helgaas wrote:
>>> On Wed, Aug 17, 2022 at 09:20:15AM +0800, Lu Baolu wrote:
>>>> Some configurations of the PCI fabric will route device originated TLP
>>>> packets based on the memory addresses.
>>> This makes it sound like a few unusual configurations will route TLPs
>>> based on memory addresses, but address routing is the default for all
>>> PCIe Memory Requests, and ACS provides a way to override that default.
>>>
>>>> These configurations are incompatible with PASID as the PASID
>>>> packets form a distinct address space.
>>> I would say "the Requester ID/PASID combination forms a distinct
>>> address space."
>>>
>>>> For instance, any configuration where switches are present
>>>> without ACS enabled is incompatible.
>>>>
>>>> This enhances the pci_enable_pasid() interface by requiring the ACS to
>>>> support Source Validation, Request Redirection, Completer Redirection,
>>>> and Upstream Forwarding. This effectively means that devices cannot
>>>> spoof their requester ID, requests and completions cannot be redirected,
>>>> and all transactions are forwarded upstream, even as it passes through a
>>>> bridge where the target device is downstream.
>>> I think your patch actually requires all those features to be not just
>>> "supported" but actually*enabled*  for the entire path leading to the
>>> device.
>>>
>>> To use the terms from the spec:
>>>
>>>     "P2P Request Redirect"
>>>     "P2P Completion Redirect"
>>>     "Requester ID, Requests, and Completions"
>>>
>>> and maybe something like:
>>>
>>>     ... even if the TLP looks like a P2P Request because its memory
>>>     address (ignoring the PASID) would fall in a bridge window and would
>>>     normally be routed downstream.
>> Thank you for the suggestions. I will rephrase the commit message
>> accordingly like this:
>>
>>
>> PCI: Allow PASID only when ACS enforced on upstreaming path
> PCI: Enable PASID only when ACS RR & UF enabled on upstream path
>
> The Requester ID/Process Address Space ID (PASID) combination
> identifies an address space distinct from the PCI bus address space,
> e.g., an address space defined by an IOMMU.
>
> But the PCIe fabric routes Memory Requests based on the TLP address,
> ignoring any PASID (PCIe r6.0, sec 2.2.10.4), so a TLP with PASID that
> *should* go upstream to the IOMMU may instead be routed as a P2P
> Request if its address falls in a bridge window.
>
> To ensure that all Memory Requests with PASID are routed upstream,
> only enable PASID if ACS P2P Request Redirect and Upstream Forwarding
> are enabled for the path leading to the device.

Seeing these comments, my questions gone.

Thanks Bjorn !

>> The PCIe fabric routes TLPs based on memory addresses for all PCIe Memory
>> Requests regardless of whether TLPs have PASID prefixes. This is stated in
>> section "2.2.10.2 End-End TLP Prefix Processing" of the specification:
>>
>>    The presence of an End-End TLP Prefix does not alter the routing of a
>>    TLP. TLPs are routed based on the routing rules covered in Section
>>    2.2.4 .
>>
>> As the Requester ID/PASID combination forms a distinct address space. The
>> memory address based routing is not compatible for PASID TLPs anymore.
>> Therefore we have to rely on ACS to override that default.
>>
>> This enhances pci_enable_pasid() interface by requiring the ACS features
>> to be enabled for the entire path leading to the device. So that even if
>> the TLP looks like a P2P Request because its memory address (ignoring the
>> PASID) would fall in a bridge window and would normally be routed
>> downstream.
>>
>> Best regards,
>> baolu

-- 
"firm, enduring, strong, and long-lived"


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 00/13] iommu: SVA and IOPF refactoring
  2022-08-22  4:49 ` [PATCH v11 00/13] iommu: SVA and IOPF refactoring Zhangfei Gao
@ 2022-08-23  7:00   ` Baolu Lu
  0 siblings, 0 replies; 33+ messages in thread
From: Baolu Lu @ 2022-08-23  7:00 UTC (permalink / raw)
  To: Zhangfei Gao, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Bjorn Helgaas, Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: baolu.lu, Eric Auger, Liu Yi L, Jacob jun Pan, Zhu Tony, iommu,
	linux-pci, linux-kernel

On 2022/8/22 12:49, Zhangfei Gao wrote:
> On 2022/8/17 上午9:20, Lu Baolu wrote:
>> Hi folks,
>>
>> The former part of this series introduces the IOMMU interfaces to attach
>> or detach an iommu domain to/from a pasid of a device, and refactors the
>> exsiting IOMMU SVA implementation by assigning an SVA type of iommu
>> domain to a shared virtual address and replacing sva_bind/unbind iommu
>> ops with a set_dev_pasid domain ops.
>>
>> The latter part changes the existing I/O page fault handling framework
>> from only serving SVA to a generic one. Any driver or component could
>> handle the I/O page faults for its domain in its own way by installing
>> an I/O page fault handler.
>>
>> This series has been functionally tested on an x86 machine and compile
>> tested for all architectures.
>>
>> This series is also available on github:
>> [2] 
>> https://github.com/LuBaolu/intel-iommu/commits/iommu-sva-refactoring-v11
>>
>> Please review and suggest.
> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> On arm64 (Kunpeng920) with uacce.

Thank you very much!

Best regards,
baolu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-18 23:00       ` Bjorn Helgaas
  2022-08-22  7:43         ` Ethan Zhao
@ 2022-08-23  7:05         ` Baolu Lu
  2022-08-24 16:23           ` Bjorn Helgaas
  1 sibling, 1 reply; 33+ messages in thread
From: Baolu Lu @ 2022-08-23  7:05 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: baolu.lu, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Bjorn Helgaas, Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

On 2022/8/19 07:00, Bjorn Helgaas wrote:
> On Thu, Aug 18, 2022 at 07:53:15PM +0800, Baolu Lu wrote:
>> On 2022/8/18 05:17, Bjorn Helgaas wrote:
>>> On Wed, Aug 17, 2022 at 09:20:15AM +0800, Lu Baolu wrote:
>>>> Some configurations of the PCI fabric will route device originated TLP
>>>> packets based on the memory addresses.
>>> This makes it sound like a few unusual configurations will route TLPs
>>> based on memory addresses, but address routing is the default for all
>>> PCIe Memory Requests, and ACS provides a way to override that default.
>>>
>>>> These configurations are incompatible with PASID as the PASID
>>>> packets form a distinct address space.
>>> I would say "the Requester ID/PASID combination forms a distinct
>>> address space."
>>>
>>>> For instance, any configuration where switches are present
>>>> without ACS enabled is incompatible.
>>>>
>>>> This enhances the pci_enable_pasid() interface by requiring the ACS to
>>>> support Source Validation, Request Redirection, Completer Redirection,
>>>> and Upstream Forwarding. This effectively means that devices cannot
>>>> spoof their requester ID, requests and completions cannot be redirected,
>>>> and all transactions are forwarded upstream, even as it passes through a
>>>> bridge where the target device is downstream.
>>> I think your patch actually requires all those features to be not just
>>> "supported" but actually*enabled*  for the entire path leading to the
>>> device.
>>>
>>> To use the terms from the spec:
>>>
>>>     "P2P Request Redirect"
>>>     "P2P Completion Redirect"
>>>     "Requester ID, Requests, and Completions"
>>>
>>> and maybe something like:
>>>
>>>     ... even if the TLP looks like a P2P Request because its memory
>>>     address (ignoring the PASID) would fall in a bridge window and would
>>>     normally be routed downstream.
>> Thank you for the suggestions. I will rephrase the commit message
>> accordingly like this:
>>
>>
>> PCI: Allow PASID only when ACS enforced on upstreaming path
> PCI: Enable PASID only when ACS RR & UF enabled on upstream path
> 
> The Requester ID/Process Address Space ID (PASID) combination
> identifies an address space distinct from the PCI bus address space,
> e.g., an address space defined by an IOMMU.
> 
> But the PCIe fabric routes Memory Requests based on the TLP address,
> ignoring any PASID (PCIe r6.0, sec 2.2.10.4), so a TLP with PASID that
> *should*  go upstream to the IOMMU may instead be routed as a P2P
> Request if its address falls in a bridge window.
> 
> To ensure that all Memory Requests with PASID are routed upstream,
> only enable PASID if ACS P2P Request Redirect and Upstream Forwarding
> are enabled for the path leading to the device.

Yours is clear and straight-forward. I will update the patch with above.
Thank you and very appreciated!

Best regards,
baolu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-18 13:04   ` Jason Gunthorpe
@ 2022-08-23  7:10     ` Baolu Lu
  0 siblings, 0 replies; 33+ messages in thread
From: Baolu Lu @ 2022-08-23  7:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, Joerg Roedel, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

On 2022/8/18 21:04, Jason Gunthorpe wrote:
> On Wed, Aug 17, 2022 at 09:20:15AM +0800, Lu Baolu wrote:
>> Some configurations of the PCI fabric will route device originated TLP
>> packets based on the memory addresses. These configurations are
>> incompatible with PASID as the PASID packets form a distinct address
>> space. For instance, any configuration where switches are present
>> without ACS enabled is incompatible.
>>
>> This enhances the pci_enable_pasid() interface by requiring the ACS to
>> support Source Validation, Request Redirection, Completer Redirection,
>> and Upstream Forwarding. This effectively means that devices cannot
>> spoof their requester ID, requests and completions cannot be redirected,
>> and all transactions are forwarded upstream, even as it passes through a
>> bridge where the target device is downstream.
>>
>> Suggested-by: Jason Gunthorpe<jgg@nvidia.com>
>> Suggested-by: Kevin Tian<kevin.tian@intel.com>
>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
>> ---
>>   drivers/pci/ats.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
>> index c967ad6e2626..0715e48e7973 100644
>> --- a/drivers/pci/ats.c
>> +++ b/drivers/pci/ats.c
>> @@ -382,6 +382,11 @@ int pci_enable_pasid(struct pci_dev *pdev, int features)
>>   	if (!pasid)
>>   		return -EINVAL;
>>   
>> +	if (!pci_acs_path_enabled(pdev, NULL,
>> +				  PCI_ACS_SV | PCI_ACS_RR |
>> +				  PCI_ACS_CR | PCI_ACS_UF))
> I think we only need RR and UF here?
> 
> Source Validation causes the switch to validate the requestor RID in
> each TLP which has nothing to do with address based routing
> 
> Completion Redirect changes how RID routing works, and has nothing to
> do with address based routing.
> 
> Yes, both of those are usually set for virtualization scenarios but we
> shouldn't check it here as a basic requirement to enable PASID.

Yes. Here only requires RR and UF.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 05/13] iommu: Add attach/detach_dev_pasid iommu interface
  2022-08-18 13:33   ` Jason Gunthorpe
@ 2022-08-23  7:30     ` Baolu Lu
  0 siblings, 0 replies; 33+ messages in thread
From: Baolu Lu @ 2022-08-23  7:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, Joerg Roedel, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel, Jean-Philippe Brucker

On 2022/8/18 21:33, Jason Gunthorpe wrote:
> On Wed, Aug 17, 2022 at 09:20:16AM +0800, Lu Baolu wrote:
> 
>> +static int __iommu_set_group_pasid(struct iommu_domain *domain,
>> +				   struct iommu_group *group, ioasid_t pasid)
>> +{
>> +	struct iommu_domain *ops_domain;
>> +	struct group_device *device;
>> +	int ret = 0;
>> +
>> +	if (domain == group->blocking_domain)
>> +		ops_domain = xa_load(&group->pasid_array, pasid);
>> +	else
>> +		ops_domain = domain;
> 
> This seems weird, why isn't this just always
> 
> domain->ops->set_dev_pasid()?

Sure. I will fix this in the next version.

> 
>> +	if (curr) {
>> +		ret = xa_err(curr) ? : -EBUSY;
>> +		goto out_unlock;
>> +	}
>> +
>> +	ret = __iommu_set_group_pasid(domain, group, pasid);
>> +	if (ret) {
>> +		__iommu_set_group_pasid(group->blocking_domain, group, pasid);
>> +		xa_erase(&group->pasid_array, pasid);
> 
> I was looking at this trying to figure out why we are having
> attach/detach semantics vs set and this error handling seems to be the
> reason
> 
> Lets add a comment because it is subtle thing:
> 
>    Setting a PASID to a blocking domain cannot fail, so we can always
>    safely error unwind a failure to attach a domain back to the original
>    group configuration of the PASID being unused.

Updated.

> 
>> +/*
>> + * iommu_detach_device_pasid() - Detach the domain from pasid of device
>> + * @domain: the iommu domain.
>> + * @dev: the attached device.
>> + * @pasid: the pasid of the device.
>> + *
>> + * The @domain must have been attached to @pasid of the @dev with
>> + * iommu_attach_device_pasid().
>> + */
>> +void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev,
>> +			       ioasid_t pasid)
> 
> Don't pass domain here?

It is checked in the function to make sure that the detached domain is
the same one as the previous attached one.

> 
>> +/*
>> + * iommu_get_domain_for_dev_pasid() - Retrieve domain for @pasid of @dev
>> + * @dev: the queried device
>> + * @pasid: the pasid of the device
>> + *
>> + * This is a variant of iommu_get_domain_for_dev(). It returns the existing
>> + * domain attached to pasid of a device. It's only for internal use of the
>> + * IOMMU subsystem. The caller must take care to avoid any possible
>> + * use-after-free case.
> 
> How exactly does the caller manage that?

"... the returned domain pointer could only be used before detaching
from the device PASID."

> 
>> + *
>> + * Return: attached domain on success, NULL otherwise.
>> + */
>> +struct iommu_domain *
>> +iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
>> +{
>> +	struct iommu_domain *domain;
>> +	struct iommu_group *group;
>> +
>> +	if (!pasid_valid(pasid))
>> +		return NULL;
> 
> Why bother? If the pasid is not valid then it definitely won't be in the xarray.

Removed.

> But otherwise this overall thing seems fine to me

Thank you!

Best regards,
baolu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 07/13] iommu/vt-d: Add SVA domain support
  2022-08-18 13:36   ` Jason Gunthorpe
@ 2022-08-23  7:33     ` Baolu Lu
  0 siblings, 0 replies; 33+ messages in thread
From: Baolu Lu @ 2022-08-23  7:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, Joerg Roedel, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

On 2022/8/18 21:36, Jason Gunthorpe wrote:
> On Wed, Aug 17, 2022 at 09:20:18AM +0800, Lu Baolu wrote:
> 
>> +static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
>> +				   struct device *dev, ioasid_t pasid)
>> +{
>> +	struct device_domain_info *info = dev_iommu_priv_get(dev);
>> +	struct intel_iommu *iommu = info->iommu;
>> +	struct iommu_sva *sva;
>> +	int ret = 0;
>> +
>> +	mutex_lock(&pasid_mutex);
>> +	/*
>> +	 * Detach the domain if a blocking domain is set. Check the
>> +	 * right domain type once the IOMMU driver supports a real
>> +	 * blocking domain.
>> +	 */
>> +	if (!domain || domain->type == IOMMU_DOMAIN_UNMANAGED) {
>> +		intel_svm_unbind_mm(dev, pasid);
> 
> See, I think this is exactly the wrong way to use the ops
> 
> The blockin domain ops should have its own function that just
> unconditionally calls intel_svm_unbind_mm()
> 
>> +	} else {
>> +		struct mm_struct *mm = domain->mm;
>> +
>> +		sva = intel_svm_bind_mm(iommu, dev, mm);
>> +		if (IS_ERR(sva))
>> +			ret = PTR_ERR(sva);
> 
> And similarly the SVA domain should have its own op that does this SVM
> call.
> 
> Muxing the ops with tests on the domain is an anti-pattern. In fact I
> would say any time you see an op testing the domain->type it is very
> suspicious.

Both agreed. Will fix them in the next version.

Best regards,
baolu


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 09/13] iommu/sva: Refactoring iommu_sva_bind/unbind_device()
  2022-08-18 13:41   ` Jason Gunthorpe
@ 2022-08-23 10:12     ` Baolu Lu
  0 siblings, 0 replies; 33+ messages in thread
From: Baolu Lu @ 2022-08-23 10:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: baolu.lu, Joerg Roedel, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel, Jean-Philippe Brucker

On 2022/8/18 21:41, Jason Gunthorpe wrote:
> On Wed, Aug 17, 2022 at 09:20:20AM +0800, Lu Baolu wrote:
>> +
>> +/**
>> + * iommu_sva_bind_device() - Bind a process address space to a device
>> + * @dev: the device
>> + * @mm: the mm to bind, caller must hold a reference to mm_users
>> + *
>> + * Create a bond between device and address space, allowing the device to access
>> + * the mm using the returned PASID. If a bond already exists between @device and
>> + * @mm, it is returned and an additional reference is taken. Caller must call
>> + * iommu_sva_unbind_device() to release each reference.
>> + *
>> + * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
>> + * initialize the required SVA features.
>> + *
>> + * On error, returns an ERR_PTR value.
>> + */
>> +struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
>> +{
>> +	struct iommu_domain *domain;
>> +	struct iommu_sva *bond;
> 
> This is called handle below, pick one name please

Updated.

> 
>> +	ioasid_t max_pasids;
>> +	int ret;
>> +
>> +	max_pasids = dev->iommu->max_pasids;
>> +	if (!max_pasids)
>> +		return ERR_PTR(-EOPNOTSUPP);
>> +
>> +	/* Allocate mm->pasid if necessary. */
>> +	ret = iommu_sva_alloc_pasid(mm, 1, max_pasids - 1);
>> +	if (ret)
>> +		return ERR_PTR(ret);
>> +
>> +	bond = kzalloc(sizeof(*bond), GFP_KERNEL);
>> +	if (!bond)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	mutex_lock(&iommu_sva_lock);
>> +	/* Search for an existing domain. */
>> +	domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid);
>> +	if (domain) {
> 
> This isn't safe, or sane. A driver could have attached something to
> this PASID that is not a SVA domain and thus not protected by the
> iommu_sva_lock.
> 
> At a minimum you should add a type match to
> iommu_get_domain_for_dev_pasid(), eg to confirm it is a SVA domain and
> do that check under the xa_lock of the pasid xarray.
> 
> And then the general idea is that SVA domain attach/detach must hold
> this janky global lock.

Make sense. I will add this logic.

> 
>> +		refcount_inc(&domain->users);
> 
> This atomic is always processed under the iommu_sva_lock, so it
> doesn't need to be an atomic anymore.

Will change it to an integer.

> 
> Otherwise this design looks OK to me too

Thank you very much for your suggestions.

Best regards,
baolu


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path
  2022-08-23  7:05         ` Baolu Lu
@ 2022-08-24 16:23           ` Bjorn Helgaas
  0 siblings, 0 replies; 33+ messages in thread
From: Bjorn Helgaas @ 2022-08-24 16:23 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul, Eric Auger,
	Liu Yi L, Jacob jun Pan, Zhangfei Gao, Zhu Tony, iommu,
	linux-pci, linux-kernel

On Tue, Aug 23, 2022 at 03:05:53PM +0800, Baolu Lu wrote:
> On 2022/8/19 07:00, Bjorn Helgaas wrote:

> > PCI: Enable PASID only when ACS RR & UF enabled on upstream path
> > 
> > The Requester ID/Process Address Space ID (PASID) combination
> > identifies an address space distinct from the PCI bus address space,
> > e.g., an address space defined by an IOMMU.
> > 
> > But the PCIe fabric routes Memory Requests based on the TLP address,
> > ignoring any PASID (PCIe r6.0, sec 2.2.10.4), so a TLP with PASID that
> > *should*  go upstream to the IOMMU may instead be routed as a P2P
> > Request if its address falls in a bridge window.
> > 
> > To ensure that all Memory Requests with PASID are routed upstream,
> > only enable PASID if ACS P2P Request Redirect and Upstream Forwarding
> > are enabled for the path leading to the device.
> 
> Yours is clear and straight-forward. I will update the patch with above.
> Thank you and very appreciated!

With the update to only require RR and UF and the commit log update,

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Thanks!

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v11 00/13] iommu: SVA and IOPF refactoring
  2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (13 preceding siblings ...)
  2022-08-22  4:49 ` [PATCH v11 00/13] iommu: SVA and IOPF refactoring Zhangfei Gao
@ 2022-08-26  4:17 ` Baolu Lu
  14 siblings, 0 replies; 33+ messages in thread
From: Baolu Lu @ 2022-08-26  4:17 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Bjorn Helgaas,
	Kevin Tian, Ashok Raj, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker, Dave Jiang, Vinod Koul
  Cc: baolu.lu, Eric Auger, Liu Yi L, Jacob jun Pan, Zhangfei Gao,
	Zhu Tony, iommu, linux-pci, linux-kernel

On 2022/8/17 9:20, Lu Baolu wrote:
> Hi folks,
> 
> The former part of this series introduces the IOMMU interfaces to attach
> or detach an iommu domain to/from a pasid of a device, and refactors the
> exsiting IOMMU SVA implementation by assigning an SVA type of iommu
> domain to a shared virtual address and replacing sva_bind/unbind iommu
> ops with a set_dev_pasid domain ops.
> 
> The latter part changes the existing I/O page fault handling framework
> from only serving SVA to a generic one. Any driver or component could
> handle the I/O page faults for its domain in its own way by installing
> an I/O page fault handler.
> 
> This series has been functionally tested on an x86 machine and compile
> tested for all architectures.
> 
> This series is also available on github:
> [2]https://github.com/LuBaolu/intel-iommu/commits/iommu-sva-refactoring-v11
> 
> Please review and suggest.

Thank you all for review and test. I have updated this series and
uploaded a new version at

https://github.com/LuBaolu/intel-iommu/commits/iommu-sva-refactoring-v12

Zhangfei and Tony have tested it on real Intel and arm64 hardware.

I will soon post it for further review.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2022-08-26  4:18 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-17  1:20 [PATCH v11 00/13] iommu: SVA and IOPF refactoring Lu Baolu
2022-08-17  1:20 ` [PATCH v11 01/13] iommu: Add max_pasids field in struct iommu_device Lu Baolu
2022-08-17  1:20 ` [PATCH v11 02/13] iommu: Add max_pasids field in struct dev_iommu Lu Baolu
2022-08-17  1:20 ` [PATCH v11 03/13] iommu: Remove SVM_FLAG_SUPERVISOR_MODE support Lu Baolu
2022-08-17  1:20 ` [PATCH v11 04/13] PCI: Allow PASID only when ACS enforced on upstreaming path Lu Baolu
2022-08-17 21:17   ` Bjorn Helgaas
2022-08-17 22:48     ` Jason Gunthorpe
2022-08-18 11:55       ` Baolu Lu
2022-08-18 11:53     ` Baolu Lu
2022-08-18 23:00       ` Bjorn Helgaas
2022-08-22  7:43         ` Ethan Zhao
2022-08-23  7:05         ` Baolu Lu
2022-08-24 16:23           ` Bjorn Helgaas
2022-08-18 13:04   ` Jason Gunthorpe
2022-08-23  7:10     ` Baolu Lu
2022-08-17  1:20 ` [PATCH v11 05/13] iommu: Add attach/detach_dev_pasid iommu interface Lu Baolu
2022-08-18 13:33   ` Jason Gunthorpe
2022-08-23  7:30     ` Baolu Lu
2022-08-17  1:20 ` [PATCH v11 06/13] iommu: Add IOMMU SVA domain support Lu Baolu
2022-08-17  1:20 ` [PATCH v11 07/13] iommu/vt-d: Add " Lu Baolu
2022-08-18 13:36   ` Jason Gunthorpe
2022-08-23  7:33     ` Baolu Lu
2022-08-17  1:20 ` [PATCH v11 08/13] arm-smmu-v3/sva: " Lu Baolu
2022-08-17  1:20 ` [PATCH v11 09/13] iommu/sva: Refactoring iommu_sva_bind/unbind_device() Lu Baolu
2022-08-18 13:41   ` Jason Gunthorpe
2022-08-23 10:12     ` Baolu Lu
2022-08-17  1:20 ` [PATCH v11 10/13] iommu: Remove SVA related callbacks from iommu ops Lu Baolu
2022-08-17  1:20 ` [PATCH v11 11/13] iommu: Prepare IOMMU domain for IOPF Lu Baolu
2022-08-17  1:20 ` [PATCH v11 12/13] iommu: Per-domain I/O page fault handling Lu Baolu
2022-08-17  1:20 ` [PATCH v11 13/13] iommu: Rename iommu-sva-lib.{c,h} Lu Baolu
2022-08-22  4:49 ` [PATCH v11 00/13] iommu: SVA and IOPF refactoring Zhangfei Gao
2022-08-23  7:00   ` Baolu Lu
2022-08-26  4:17 ` Baolu Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).