iommu.lists.linux-foundation.org archive mirror
* [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring
@ 2022-03-29  5:37 Lu Baolu
  2022-03-29  5:37 ` [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu Lu Baolu
                   ` (10 more replies)
  0 siblings, 11 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Hi,

The first part of this series refactors the IOMMU SVA code by assigning
an SVA type of iommu_domain to a shared virtual address space and by
replacing the sva_bind/unbind iommu ops with attach/detach_dev_pasid
domain ops.

The second part generalizes the existing I/O page fault handling
framework, which so far serves only SVA. Any driver or component can
then handle the I/O page faults for its domain in its own way by
installing an I/O page fault handler.

This series overlaps with another series posted here [1]. For the
convenience of review, I have included all relevant patches in this
series. We will resolve the overlap later.

This series is also available on github here [2].

[1] https://lore.kernel.org/lkml/20220315050713.2000518-1-jacob.jun.pan@linux.intel.com/
[2] https://github.com/LuBaolu/intel-iommu/commits/iommu-sva-refactoring-v2

Please help review and suggest improvements.

Best regards,
baolu

Change log:
v1:
 - https://lore.kernel.org/linux-iommu/20220320064030.2936936-1-baolu.lu@linux.intel.com/
 - Initial post.

v2:
 - Add sva domain life cycle management to avoid race between unbind and
   page fault handling.
 - Use a single domain for each mm.
 - Return a single sva handle for the same binding.
 - Add a new helper to meet singleton group requirement.
 - Rework the SVA domain allocation for the arm-smmu-v3 driver and move
   the pasid_bits initialization to device probe.
 - Drop the patch "iommu: Handle IO page faults directly".
 - Add mmget_not_zero(mm) in SVA page fault handler.

Lu Baolu (11):
  iommu: Add pasid_bits field in struct dev_iommu
  iommu: Add iommu_group_singleton_lockdown()
  iommu/sva: Add iommu_domain type for SVA
  iommu: Add attach/detach_dev_pasid domain ops
  iommu/vt-d: Remove SVM_FLAG_SUPERVISOR_MODE support
  iommu/vt-d: Add SVA domain support
  arm-smmu-v3/sva: Add SVA domain support
  iommu/sva: Use attach/detach_pasid_dev in SVA interfaces
  iommu: Remove SVA related callbacks from iommu ops
  iommu: Per-domain I/O page fault handling
  iommu: Rename iommu-sva-lib.{c,h}

 include/linux/intel-iommu.h                   |   5 +-
 include/linux/iommu.h                         | 107 ++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  25 +-
 .../iommu/{iommu-sva-lib.h => iommu-sva.h}    |  12 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  85 ++---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  28 +-
 drivers/iommu/intel/iommu.c                   |  20 +-
 drivers/iommu/intel/svm.c                     | 135 +++-----
 drivers/iommu/io-pgfault.c                    |  72 +---
 drivers/iommu/iommu-sva-lib.c                 |  71 ----
 drivers/iommu/iommu-sva.c                     | 307 ++++++++++++++++++
 drivers/iommu/iommu.c                         | 208 ++++++------
 drivers/iommu/Makefile                        |   2 +-
 13 files changed, 655 insertions(+), 422 deletions(-)
 rename drivers/iommu/{iommu-sva-lib.h => iommu-sva.h} (80%)
 delete mode 100644 drivers/iommu/iommu-sva-lib.c
 create mode 100644 drivers/iommu/iommu-sva.c

-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-29 21:00   ` Jacob Pan
  2022-03-30  7:05   ` Tian, Kevin
  2022-03-29  5:37 ` [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown() Lu Baolu
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Use this field to save the PASID/SSID bits that a device is able to
support with its IOMMU hardware. This is a generic attribute of a
device, and lifting it into the per-device dev_iommu struct makes it
possible to allocate a PASID for a device without calls into the IOMMU
drivers. Any IOMMU driver that supports PASID-related features should
set this field before the features are enabled on the device.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h                       | 1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 ++
 drivers/iommu/intel/iommu.c                 | 5 ++++-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 6ef2df258673..36f43af0af53 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -368,6 +368,7 @@ struct dev_iommu {
 	struct iommu_fwspec		*fwspec;
 	struct iommu_device		*iommu_dev;
 	void				*priv;
+	unsigned int			pasid_bits;
 };
 
 int iommu_device_register(struct iommu_device *iommu,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 627a3ed5ee8f..afc63fce6107 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2681,6 +2681,8 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 	    smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
 		master->stall_enabled = true;
 
+	dev->iommu->pasid_bits = master->ssid_bits;
+
 	return &smmu->iommu;
 
 err_free_master:
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 6f7485c44a4b..c1b91bce1530 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4587,8 +4587,11 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 			if (pasid_supported(iommu)) {
 				int features = pci_pasid_features(pdev);
 
-				if (features >= 0)
+				if (features >= 0) {
 					info->pasid_supported = features | 1;
+					dev->iommu->pasid_bits =
+						fls(pci_max_pasids(pdev)) - 1;
+				}
 			}
 
 			if (info->ats_supported && ecap_prs(iommu->ecap) &&
-- 
2.25.1


* [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
  2022-03-29  5:37 ` [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-29  8:42   ` Tian, Kevin
  2022-03-29  5:37 ` [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA Lu Baolu
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Some of the interfaces in the IOMMU core require that only a single
kernel device driver controls the device in the IOMMU group. The
existing method is to check the device count in the IOMMU group in
those interfaces. This is unreliable because any device added to the
IOMMU group later breaks the assumption without notifying the driver
that uses the interface. Add iommu_group_singleton_lockdown(), which
checks the requirement and locks down the IOMMU group so that only a
single device driver remains bound.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/iommu.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0c42ece25854..9fb8a5b4491e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,7 @@ struct iommu_group {
 	struct list_head entry;
 	unsigned int owner_cnt;
 	void *owner;
+	bool singleton_lockdown;
 };
 
 struct group_device {
@@ -968,15 +969,16 @@ void iommu_group_remove_device(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_group_remove_device);
 
-static int iommu_group_device_count(struct iommu_group *group)
+/* Callers should hold the group->mutex. */
+static bool iommu_group_singleton_lockdown(struct iommu_group *group)
 {
-	struct group_device *entry;
-	int ret = 0;
-
-	list_for_each_entry(entry, &group->devices, list)
-		ret++;
+	if (group->owner_cnt != 1 ||
+	    group->domain != group->default_domain ||
+	    group->owner)
+		return false;
+	group->singleton_lockdown = true;
 
-	return ret;
+	return true;
 }
 
 static int __iommu_group_for_each_dev(struct iommu_group *group, void *data,
@@ -1936,7 +1938,7 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
 	 */
 	mutex_lock(&group->mutex);
 	ret = -EINVAL;
-	if (iommu_group_device_count(group) != 1)
+	if (!iommu_group_singleton_lockdown(group))
 		goto out_unlock;
 
 	ret = __iommu_attach_group(domain, group);
@@ -1979,7 +1981,7 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
 		return;
 
 	mutex_lock(&group->mutex);
-	if (iommu_group_device_count(group) != 1) {
+	if (!iommu_group_singleton_lockdown(group)) {
 		WARN_ON(1);
 		goto out_unlock;
 	}
@@ -2745,7 +2747,7 @@ iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
 	 * affected by the problems that required IOMMU groups (lack of ACS
 	 * isolation, device ID aliasing and other hardware issues).
 	 */
-	if (iommu_group_device_count(group) != 1)
+	if (!iommu_group_singleton_lockdown(group))
 		goto out_unlock;
 
 	handle = ops->sva_bind(dev, mm, drvdata);
@@ -2842,7 +2844,7 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 	 * case) that has already acquired some of the device locks and might be
 	 * waiting for T1 to release other device locks.
 	 */
-	if (iommu_group_device_count(group) != 1) {
+	if (!iommu_group_singleton_lockdown(group)) {
 		dev_err_ratelimited(prev_dev, "Cannot change default domain: Group has more than one device\n");
 		ret = -EINVAL;
 		goto out;
@@ -2975,7 +2977,7 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 	 * 2. Get struct *dev which is needed to lock device
 	 */
 	mutex_lock(&group->mutex);
-	if (iommu_group_device_count(group) != 1) {
+	if (!iommu_group_singleton_lockdown(group)) {
 		mutex_unlock(&group->mutex);
 		pr_err_ratelimited("Cannot change default domain: Group has more than one device\n");
 		return -EINVAL;
@@ -3050,6 +3052,7 @@ int iommu_device_use_default_domain(struct device *dev)
 	mutex_lock(&group->mutex);
 	if (group->owner_cnt) {
 		if (group->domain != group->default_domain ||
+		    group->singleton_lockdown ||
 		    group->owner) {
 			ret = -EBUSY;
 			goto unlock_out;
@@ -3084,6 +3087,9 @@ void iommu_device_unuse_default_domain(struct device *dev)
 	if (!WARN_ON(!group->owner_cnt))
 		group->owner_cnt--;
 
+	if (!group->owner_cnt)
+		group->singleton_lockdown = false;
+
 	mutex_unlock(&group->mutex);
 	iommu_group_put(group);
 }
-- 
2.25.1


* [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
  2022-03-29  5:37 ` [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu Lu Baolu
  2022-03-29  5:37 ` [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown() Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-29 21:38   ` Jacob Pan
  2022-03-30 19:02   ` Jason Gunthorpe via iommu
  2022-03-29  5:37 ` [PATCH RFC v2 04/11] iommu: Add attach/detach_dev_pasid domain ops Lu Baolu
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Add a new iommu domain type, IOMMU_DOMAIN_SVA, to represent an I/O
page table that is shared with the CPU host virtual address space. Add
helpers to get and put an SVA domain and implement SVA domain life
cycle management.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h         |  7 +++
 drivers/iommu/iommu-sva-lib.h | 10 ++++
 drivers/iommu/iommu-sva-lib.c | 89 +++++++++++++++++++++++++++++++++++
 3 files changed, 106 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 36f43af0af53..29c4c2edd706 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -42,6 +42,7 @@ struct notifier_block;
 struct iommu_sva;
 struct iommu_fault_event;
 struct iommu_dma_cookie;
+struct iommu_sva_cookie;
 
 /* iommu fault flags */
 #define IOMMU_FAULT_READ	0x0
@@ -64,6 +65,9 @@ struct iommu_domain_geometry {
 #define __IOMMU_DOMAIN_PT	(1U << 2)  /* Domain is identity mapped   */
 #define __IOMMU_DOMAIN_DMA_FQ	(1U << 3)  /* DMA-API uses flush queue    */
 
+#define __IOMMU_DOMAIN_SHARED	(1U << 4)  /* Page table shared from CPU  */
+#define __IOMMU_DOMAIN_HOST_VA	(1U << 5)  /* Host CPU virtual address */
+
 /*
  * This are the possible domain-types
  *
@@ -86,6 +90,8 @@ struct iommu_domain_geometry {
 #define IOMMU_DOMAIN_DMA_FQ	(__IOMMU_DOMAIN_PAGING |	\
 				 __IOMMU_DOMAIN_DMA_API |	\
 				 __IOMMU_DOMAIN_DMA_FQ)
+#define IOMMU_DOMAIN_SVA	(__IOMMU_DOMAIN_SHARED |	\
+				 __IOMMU_DOMAIN_HOST_VA)
 
 struct iommu_domain {
 	unsigned type;
@@ -95,6 +101,7 @@ struct iommu_domain {
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	struct iommu_dma_cookie *iova_cookie;
+	struct iommu_sva_cookie *sva_cookie;
 };
 
 static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index 8909ea1094e3..1a71218b07f5 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -10,6 +10,7 @@
 
 int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
 struct mm_struct *iommu_sva_find(ioasid_t pasid);
+struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain);
 
 /* I/O Page fault */
 struct device;
@@ -26,6 +27,8 @@ int iopf_queue_flush_dev(struct device *dev);
 struct iopf_queue *iopf_queue_alloc(const char *name);
 void iopf_queue_free(struct iopf_queue *queue);
 int iopf_queue_discard_partial(struct iopf_queue *queue);
+bool iommu_sva_domain_get_user(struct iommu_domain *domain);
+void iommu_sva_domain_put_user(struct iommu_domain *domain);
 
 #else /* CONFIG_IOMMU_SVA */
 static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
@@ -63,5 +66,12 @@ static inline int iopf_queue_discard_partial(struct iopf_queue *queue)
 {
 	return -ENODEV;
 }
+
+static inline bool iommu_sva_domain_get_user(struct iommu_domain *domain)
+{
+	return false;
+}
+
+static inline void iommu_sva_domain_put_user(struct iommu_domain *domain) { }
 #endif /* CONFIG_IOMMU_SVA */
 #endif /* _IOMMU_SVA_LIB_H */
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index 106506143896..78820be23f15 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -3,12 +3,21 @@
  * Helpers for IOMMU drivers implementing SVA
  */
 #include <linux/mutex.h>
+#include <linux/iommu.h>
+#include <linux/slab.h>
 #include <linux/sched/mm.h>
 
 #include "iommu-sva-lib.h"
 
 static DEFINE_MUTEX(iommu_sva_lock);
 static DECLARE_IOASID_SET(iommu_sva_pasid);
+static DEFINE_XARRAY_ALLOC(sva_domain_array);
+
+struct iommu_sva_cookie {
+	struct mm_struct *mm;
+	ioasid_t pasid;
+	refcount_t users;
+};
 
 /**
  * iommu_sva_alloc_pasid - Allocate a PASID for the mm
@@ -69,3 +78,83 @@ struct mm_struct *iommu_sva_find(ioasid_t pasid)
 	return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_find);
+
+static struct iommu_domain *
+iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
+{
+	struct bus_type *bus = dev->bus;
+	struct iommu_sva_cookie *cookie;
+	struct iommu_domain *domain;
+	void *curr;
+
+	if (!bus || !bus->iommu_ops)
+		return NULL;
+
+	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+	if (!cookie)
+		return NULL;
+
+	domain = bus->iommu_ops->domain_alloc(IOMMU_DOMAIN_SVA);
+	if (!domain)
+		goto err_domain_alloc;
+
+	cookie->mm = mm;
+	cookie->pasid = mm->pasid;
+	refcount_set(&cookie->users, 1);
+	domain->type = IOMMU_DOMAIN_SVA;
+	domain->sva_cookie = cookie;
+	curr = xa_store(&sva_domain_array, mm->pasid, domain, GFP_KERNEL);
+	if (xa_err(curr))
+		goto err_xa_store;
+
+	return domain;
+err_xa_store:
+	domain->ops->free(domain);
+err_domain_alloc:
+	kfree(cookie);
+	return NULL;
+}
+
+static void iommu_sva_free_domain(struct iommu_domain *domain)
+{
+	xa_erase(&sva_domain_array, domain->sva_cookie->pasid);
+	kfree(domain->sva_cookie);
+	domain->ops->free(domain);
+}
+
+bool iommu_sva_domain_get_user(struct iommu_domain *domain)
+{
+	struct iommu_sva_cookie *cookie = domain->sva_cookie;
+
+	return refcount_inc_not_zero(&cookie->users);
+}
+
+void iommu_sva_domain_put_user(struct iommu_domain *domain)
+{
+	struct iommu_sva_cookie *cookie = domain->sva_cookie;
+
+	if (refcount_dec_and_test(&cookie->users))
+		iommu_sva_free_domain(domain);
+}
+
+static __maybe_unused struct iommu_domain *
+iommu_sva_get_domain(struct device *dev, struct mm_struct *mm)
+{
+	struct iommu_domain *domain;
+	ioasid_t pasid = mm->pasid;
+
+	if (pasid == INVALID_IOASID)
+		return NULL;
+
+	domain = xa_load(&sva_domain_array, pasid);
+	if (!domain)
+		return iommu_sva_alloc_domain(dev, mm);
+	iommu_sva_domain_get_user(domain);
+
+	return domain;
+}
+
+struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain)
+{
+	return domain->sva_cookie->mm;
+}
-- 
2.25.1


* [PATCH RFC v2 04/11] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (2 preceding siblings ...)
  2022-03-29  5:37 ` [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-30 19:08   ` Jason Gunthorpe via iommu
  2022-03-29  5:37 ` [PATCH RFC v2 05/11] iommu/vt-d: Remove SVM_FLAG_SUPERVISOR_MODE support Lu Baolu
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Attaching an IOMMU domain to a PASID of a device is a generic
operation for modern IOMMU drivers that support PASID-granular DMA
address translation. Currently visible usage scenarios include (but
are not limited to):

 - SVA (Shared Virtual Address)
 - kernel DMA with PASID
 - hardware-assisted mediated devices

Add a pair of common domain ops for this purpose and add common
helpers to attach/detach a domain to/from a {device, PASID} pair and
to get/put the domain attached to a {device, PASID} pair.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h | 36 ++++++++++++++++++
 drivers/iommu/iommu.c | 88 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 124 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 29c4c2edd706..a46285488a57 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -269,6 +269,8 @@ struct iommu_ops {
  * struct iommu_domain_ops - domain specific operations
  * @attach_dev: attach an iommu domain to a device
  * @detach_dev: detach an iommu domain from a device
+ * @attach_dev_pasid: attach an iommu domain to a pasid of device
+ * @detach_dev_pasid: detach an iommu domain from a pasid of device
  * @map: map a physically contiguous memory region to an iommu domain
  * @map_pages: map a physically contiguous set of pages of the same size to
  *             an iommu domain.
@@ -286,6 +288,10 @@ struct iommu_ops {
 struct iommu_domain_ops {
 	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
 	void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
+	int (*attach_dev_pasid)(struct iommu_domain *domain,
+				struct device *dev, ioasid_t id);
+	void (*detach_dev_pasid)(struct iommu_domain *domain,
+				 struct device *dev, ioasid_t id);
 
 	int (*map)(struct iommu_domain *domain, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
@@ -679,6 +685,14 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
 void iommu_group_release_dma_owner(struct iommu_group *group);
 bool iommu_group_dma_owner_claimed(struct iommu_group *group);
 
+int iommu_attach_device_pasid(struct iommu_domain *domain,
+			      struct device *dev, ioasid_t pasid);
+void iommu_detach_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid);
+struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid);
+void iommu_put_domain_for_dev_pasid(struct iommu_domain *domain);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1047,6 +1061,28 @@ static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)
 {
 	return false;
 }
+
+static inline int iommu_attach_device_pasid(struct iommu_domain *domain,
+					    struct device *dev, ioasid_t pasid)
+{
+	return -ENODEV;
+}
+
+static inline void iommu_detach_device_pasid(struct iommu_domain *domain,
+					     struct device *dev, ioasid_t pasid)
+{
+}
+
+static inline struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+	return NULL;
+}
+
+static inline void
+iommu_put_domain_for_dev_pasid(struct iommu_domain *domain)
+{
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 9fb8a5b4491e..8163ad7f6902 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -27,6 +27,8 @@
 #include <linux/cc_platform.h>
 #include <trace/events/iommu.h>
 
+#include "iommu-sva-lib.h"
+
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
 
@@ -38,6 +40,7 @@ struct iommu_group {
 	struct kobject kobj;
 	struct kobject *devices_kobj;
 	struct list_head devices;
+	struct xarray pasid_array;
 	struct mutex mutex;
 	void *iommu_data;
 	void (*iommu_data_release)(void *iommu_data);
@@ -631,6 +634,7 @@ struct iommu_group *iommu_group_alloc(void)
 	mutex_init(&group->mutex);
 	INIT_LIST_HEAD(&group->devices);
 	INIT_LIST_HEAD(&group->entry);
+	xa_init(&group->pasid_array);
 
 	ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
 	if (ret < 0) {
@@ -3173,3 +3177,87 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group)
 	return user;
 }
 EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
+
+int iommu_attach_device_pasid(struct iommu_domain *domain,
+			      struct device *dev, ioasid_t pasid)
+{
+	struct iommu_group *group;
+	int ret = -EINVAL;
+	void *curr;
+
+	if (!domain->ops->attach_dev_pasid)
+		return -EINVAL;
+
+	group = iommu_group_get(dev);
+	if (!group)
+		return -ENODEV;
+
+	mutex_lock(&group->mutex);
+	/*
+	 * To keep things simple, we currently don't support IOMMU groups
+	 * with more than one device. Existing SVA-capable systems are not
+	 * affected by the problems that required IOMMU groups (lack of ACS
+	 * isolation, device ID aliasing and other hardware issues).
+	 */
+	if (!iommu_group_singleton_lockdown(group))
+		goto out_unlock;
+
+	xa_lock(&group->pasid_array);
+	curr = __xa_cmpxchg(&group->pasid_array, pasid, NULL,
+			    domain, GFP_KERNEL);
+	xa_unlock(&group->pasid_array);
+	if (curr)
+		goto out_unlock;
+
+	ret = domain->ops->attach_dev_pasid(domain, dev, pasid);
+	if (ret)
+		xa_erase(&group->pasid_array, pasid);
+
+out_unlock:
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return ret;
+}
+
+void iommu_detach_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid)
+{
+	struct iommu_group *group;
+
+	group = iommu_group_get(dev);
+	if (WARN_ON(!group))
+		return;
+
+	mutex_lock(&group->mutex);
+	domain->ops->detach_dev_pasid(domain, dev, pasid);
+	xa_erase(&group->pasid_array, pasid);
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+}
+
+struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+	struct iommu_domain *domain;
+	struct iommu_group *group;
+
+	group = iommu_group_get(dev);
+	if (!group)
+		return NULL;
+
+	mutex_lock(&group->mutex);
+	domain = xa_load(&group->pasid_array, pasid);
+	if (domain && domain->type == IOMMU_DOMAIN_SVA)
+		iommu_sva_domain_get_user(domain);
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return domain;
+}
+
+void iommu_put_domain_for_dev_pasid(struct iommu_domain *domain)
+{
+	if (domain->type == IOMMU_DOMAIN_SVA)
+		iommu_sva_domain_put_user(domain);
+}
-- 
2.25.1


* [PATCH RFC v2 05/11] iommu/vt-d: Remove SVM_FLAG_SUPERVISOR_MODE support
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (3 preceding siblings ...)
  2022-03-29  5:37 ` [PATCH RFC v2 04/11] iommu: Add attach/detach_dev_pasid domain ops Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-29  5:37 ` [PATCH RFC v2 06/11] iommu/vt-d: Add SVA domain support Lu Baolu
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

The current in-kernel supervisor PASID support is based on the SVA
machinery in the SVA lib. The binding between a kernel PASID and the
kernel mapping has many flaws. Remove SVM_FLAG_SUPERVISOR_MODE
support.

Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/svm.c | 53 +++++++++------------------------------
 1 file changed, 12 insertions(+), 41 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index d1c42dfae6ca..ee5ecde5b318 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -313,8 +313,7 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
 	return 0;
 }
 
-static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
-				 unsigned int flags)
+static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm)
 {
 	ioasid_t max_pasid = dev_is_pci(dev) ?
 			pci_max_pasids(to_pci_dev(dev)) : intel_pasid_max_id;
@@ -324,8 +323,7 @@ static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
 
 static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 					   struct device *dev,
-					   struct mm_struct *mm,
-					   unsigned int flags)
+					   struct mm_struct *mm)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	unsigned long iflags, sflags;
@@ -341,22 +339,18 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 
 		svm->pasid = mm->pasid;
 		svm->mm = mm;
-		svm->flags = flags;
 		INIT_LIST_HEAD_RCU(&svm->devs);
 
-		if (!(flags & SVM_FLAG_SUPERVISOR_MODE)) {
-			svm->notifier.ops = &intel_mmuops;
-			ret = mmu_notifier_register(&svm->notifier, mm);
-			if (ret) {
-				kfree(svm);
-				return ERR_PTR(ret);
-			}
+		svm->notifier.ops = &intel_mmuops;
+		ret = mmu_notifier_register(&svm->notifier, mm);
+		if (ret) {
+			kfree(svm);
+			return ERR_PTR(ret);
 		}
 
 		ret = pasid_private_add(svm->pasid, svm);
 		if (ret) {
-			if (svm->notifier.ops)
-				mmu_notifier_unregister(&svm->notifier, mm);
+			mmu_notifier_unregister(&svm->notifier, mm);
 			kfree(svm);
 			return ERR_PTR(ret);
 		}
@@ -391,9 +385,7 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 	}
 
 	/* Setup the pasid table: */
-	sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ?
-			PASID_FLAG_SUPERVISOR_MODE : 0;
-	sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
+	sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
 	sflags |= PASID_FLAG_PAGE_SNOOP;
 	spin_lock_irqsave(&iommu->lock, iflags);
 	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm->pasid,
@@ -411,8 +403,7 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 	kfree(sdev);
 free_svm:
 	if (list_empty(&svm->devs)) {
-		if (svm->notifier.ops)
-			mmu_notifier_unregister(&svm->notifier, mm);
+		mmu_notifier_unregister(&svm->notifier, mm);
 		pasid_private_remove(mm->pasid);
 		kfree(svm);
 	}
@@ -822,37 +813,17 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
 {
 	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
-	unsigned int flags = 0;
 	struct iommu_sva *sva;
 	int ret;
 
-	if (drvdata)
-		flags = *(unsigned int *)drvdata;
-
-	if (flags & SVM_FLAG_SUPERVISOR_MODE) {
-		if (!ecap_srs(iommu->ecap)) {
-			dev_err(dev, "%s: Supervisor PASID not supported\n",
-				iommu->name);
-			return ERR_PTR(-EOPNOTSUPP);
-		}
-
-		if (mm) {
-			dev_err(dev, "%s: Supervisor PASID with user provided mm\n",
-				iommu->name);
-			return ERR_PTR(-EINVAL);
-		}
-
-		mm = &init_mm;
-	}
-
 	mutex_lock(&pasid_mutex);
-	ret = intel_svm_alloc_pasid(dev, mm, flags);
+	ret = intel_svm_alloc_pasid(dev, mm);
 	if (ret) {
 		mutex_unlock(&pasid_mutex);
 		return ERR_PTR(ret);
 	}
 
-	sva = intel_svm_bind_mm(iommu, dev, mm, flags);
+	sva = intel_svm_bind_mm(iommu, dev, mm);
 	mutex_unlock(&pasid_mutex);
 
 	return sva;
-- 
2.25.1


* [PATCH RFC v2 06/11] iommu/vt-d: Add SVA domain support
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (4 preceding siblings ...)
  2022-03-29  5:37 ` [PATCH RFC v2 05/11] iommu/vt-d: Remove SVM_FLAG_SUPERVISOR_MODE support Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-30 19:09   ` Jason Gunthorpe via iommu
  2022-03-29  5:37 ` [PATCH RFC v2 07/11] arm-smmu-v3/sva: " Lu Baolu
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Add support for SVA domain allocation and provide an SVA-specific
iommu_domain_ops.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/intel-iommu.h |  1 +
 drivers/iommu/intel/iommu.c | 10 ++++++++++
 drivers/iommu/intel/svm.c   | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2f9891cb3d00..c14283137fb5 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -744,6 +744,7 @@ void intel_svm_unbind(struct iommu_sva *handle);
 u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
 			    struct iommu_page_response *msg);
+extern const struct iommu_domain_ops intel_svm_domain_ops;
 
 struct intel_svm_dev {
 	struct list_head list;
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index c1b91bce1530..5eae7cf9bee5 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4318,6 +4318,16 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 		return domain;
 	case IOMMU_DOMAIN_IDENTITY:
 		return &si_domain->domain;
+#ifdef CONFIG_INTEL_IOMMU_SVM
+	case IOMMU_DOMAIN_SVA:
+		dmar_domain = alloc_domain(type);
+		if (!dmar_domain)
+			return NULL;
+		domain = &dmar_domain->domain;
+		domain->ops = &intel_svm_domain_ops;
+
+		return domain;
+#endif /* CONFIG_INTEL_IOMMU_SVM */
 	default:
 		return NULL;
 	}
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index ee5ecde5b318..8f59dd641b2d 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -932,3 +932,40 @@ int intel_svm_page_response(struct device *dev,
 	mutex_unlock(&pasid_mutex);
 	return ret;
 }
+
+static int intel_svm_attach_dev_pasid(struct iommu_domain *domain,
+				      struct device *dev, ioasid_t pasid)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct mm_struct *mm = iommu_sva_domain_mm(domain);
+	struct intel_iommu *iommu = info->iommu;
+	struct iommu_sva *sva;
+	int ret = 0;
+
+	mutex_lock(&pasid_mutex);
+	sva = intel_svm_bind_mm(iommu, dev, mm);
+	if (IS_ERR(sva))
+		ret = PTR_ERR(sva);
+	mutex_unlock(&pasid_mutex);
+
+	return ret;
+}
+
+static void intel_svm_detach_dev_pasid(struct iommu_domain *domain,
+				       struct device *dev, ioasid_t pasid)
+{
+	mutex_lock(&pasid_mutex);
+	intel_svm_unbind_mm(dev, pasid);
+	mutex_unlock(&pasid_mutex);
+}
+
+static void intel_svm_domain_free(struct iommu_domain *domain)
+{
+	kfree(domain);
+}
+
+const struct iommu_domain_ops intel_svm_domain_ops = {
+	.attach_dev_pasid	= intel_svm_attach_dev_pasid,
+	.detach_dev_pasid	= intel_svm_detach_dev_pasid,
+	.free			= intel_svm_domain_free,
+};
-- 
2.25.1


* [PATCH RFC v2 07/11] arm-smmu-v3/sva: Add SVA domain support
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (5 preceding siblings ...)
  2022-03-29  5:37 ` [PATCH RFC v2 06/11] iommu/vt-d: Add SVA domain support Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-29  5:37 ` [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces Lu Baolu
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Add support for SVA domain allocation and provide an SVA-specific
iommu_domain_ops.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 14 +++++++
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 42 +++++++++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 21 ++++++++++
 3 files changed, 77 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index cd48590ada30..7631c00fdcbd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -759,6 +759,10 @@ struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
 void arm_smmu_sva_unbind(struct iommu_sva *handle);
 u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
 void arm_smmu_sva_notifier_synchronize(void);
+int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
+				  struct device *dev, ioasid_t id);
+void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
+				   struct device *dev, ioasid_t id);
 #else /* CONFIG_ARM_SMMU_V3_SVA */
 static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
 {
@@ -804,5 +808,15 @@ static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
 }
 
 static inline void arm_smmu_sva_notifier_synchronize(void) {}
+
+static inline int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
+						struct device *dev, ioasid_t id)
+{
+	return -ENODEV;
+}
+
+static inline void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
+						 struct device *dev,
+						 ioasid_t id) {}
 #endif /* CONFIG_ARM_SMMU_V3_SVA */
 #endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 22ddd05bbdcd..ce229820086d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -534,3 +534,45 @@ void arm_smmu_sva_notifier_synchronize(void)
 	 */
 	mmu_notifier_synchronize();
 }
+
+int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
+				  struct device *dev, ioasid_t id)
+{
+	int ret = 0;
+	struct iommu_sva *handle;
+	struct mm_struct *mm = iommu_sva_domain_mm(domain);
+
+	if (domain->type != IOMMU_DOMAIN_SVA || !mm)
+		return -EINVAL;
+
+	mutex_lock(&sva_lock);
+	handle = __arm_smmu_sva_bind(dev, mm);
+	if (IS_ERR(handle))
+		ret = PTR_ERR(handle);
+	mutex_unlock(&sva_lock);
+
+	return ret;
+}
+
+void arm_smmu_sva_detach_dev_pasid(struct iommu_domain *domain,
+				   struct device *dev, ioasid_t id)
+{
+	struct arm_smmu_bond *bond = NULL, *t;
+	struct mm_struct *mm = iommu_sva_domain_mm(domain);
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+	mutex_lock(&sva_lock);
+	list_for_each_entry(t, &master->bonds, list) {
+		if (t->mm == mm) {
+			bond = t;
+			break;
+		}
+	}
+
+	if (!WARN_ON(!bond) && refcount_dec_and_test(&bond->refs)) {
+		list_del(&bond->list);
+		arm_smmu_mmu_notifier_put(bond->smmu_mn);
+		kfree(bond);
+	}
+	mutex_unlock(&sva_lock);
+}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index afc63fce6107..bd80de0bad98 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1995,10 +1995,31 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 	}
 }
 
+static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
+{
+	kfree(domain);
+}
+
+static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
+	.attach_dev_pasid	= arm_smmu_sva_attach_dev_pasid,
+	.detach_dev_pasid	= arm_smmu_sva_detach_dev_pasid,
+	.free			= arm_smmu_sva_domain_free,
+};
+
 static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 {
 	struct arm_smmu_domain *smmu_domain;
 
+	if (type == IOMMU_DOMAIN_SVA) {
+		struct iommu_domain *domain;
+
+		domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+		if (domain)
+			domain->ops = &arm_smmu_sva_domain_ops;
+
+		return domain;
+	}
+
 	if (type != IOMMU_DOMAIN_UNMANAGED &&
 	    type != IOMMU_DOMAIN_DMA &&
 	    type != IOMMU_DOMAIN_DMA_FQ &&
-- 
2.25.1


* [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (6 preceding siblings ...)
  2022-03-29  5:37 ` [PATCH RFC v2 07/11] arm-smmu-v3/sva: " Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-31 20:59   ` Jacob Pan
  2022-03-29  5:37 ` [PATCH RFC v2 09/11] iommu: Remove SVA related callbacks from iommu ops Lu Baolu
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

The existing IOMMU SVA interfaces are implemented by calling SVA-specific
iommu ops provided by the IOMMU drivers. There's no need for any
SVA-specific ops in the iommu_ops vector anymore, as the same can be
achieved through the generic attach/detach_dev_pasid domain ops.

Refactor the IOMMU SVA interface implementation to use the
attach/detach_dev_pasid ops and align it with the concept of the
iommu domain. Put the new SVA code in the SVA-related file in order
to make it self-contained.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h         |  51 +++++++++-------
 drivers/iommu/iommu-sva-lib.c | 110 +++++++++++++++++++++++++++++++++-
 drivers/iommu/iommu.c         |  92 ----------------------------
 3 files changed, 138 insertions(+), 115 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a46285488a57..11c4d99e122d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -629,7 +629,12 @@ struct iommu_fwspec {
  * struct iommu_sva - handle to a device-mm bond
  */
 struct iommu_sva {
-	struct device			*dev;
+	struct device		*dev;
+	ioasid_t		pasid;
+	struct iommu_domain	*domain;
+	/* Link to sva domain's bonds list */
+	struct list_head	node;
+	refcount_t		users;
 };
 
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
@@ -672,12 +677,6 @@ int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features f);
 int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
 bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features f);
 
-struct iommu_sva *iommu_sva_bind_device(struct device *dev,
-					struct mm_struct *mm,
-					void *drvdata);
-void iommu_sva_unbind_device(struct iommu_sva *handle);
-u32 iommu_sva_get_pasid(struct iommu_sva *handle);
-
 int iommu_device_use_default_domain(struct device *dev);
 void iommu_device_unuse_default_domain(struct device *dev);
 
@@ -1018,21 +1017,6 @@ iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
 	return -ENODEV;
 }
 
-static inline struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
-{
-	return NULL;
-}
-
-static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
-{
-}
-
-static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
-{
-	return IOMMU_PASID_INVALID;
-}
-
 static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
 	return NULL;
@@ -1085,6 +1069,29 @@ iommu_put_domain_for_dev_pasid(struct iommu_domain *domain)
 }
 #endif /* CONFIG_IOMMU_API */
 
+#ifdef CONFIG_IOMMU_SVA
+struct iommu_sva *iommu_sva_bind_device(struct device *dev,
+					struct mm_struct *mm,
+					void *drvdata);
+void iommu_sva_unbind_device(struct iommu_sva *handle);
+u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+#else /* CONFIG_IOMMU_SVA */
+static inline struct iommu_sva *
+iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
+{
+	return NULL;
+}
+
+static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
+{
+}
+
+static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
+{
+	return IOMMU_PASID_INVALID;
+}
+#endif /* CONFIG_IOMMU_SVA */
+
 /**
  * iommu_map_sgtable - Map the given buffer to the IOMMU domain
  * @domain:	The IOMMU domain to perform the mapping
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index 78820be23f15..1b45b7d01836 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -17,6 +17,7 @@ struct iommu_sva_cookie {
 	struct mm_struct *mm;
 	ioasid_t pasid;
 	refcount_t users;
+	struct list_head bonds;
 };
 
 /**
@@ -101,6 +102,7 @@ iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
 	cookie->mm = mm;
 	cookie->pasid = mm->pasid;
 	refcount_set(&cookie->users, 1);
+	INIT_LIST_HEAD(&cookie->bonds);
 	domain->type = IOMMU_DOMAIN_SVA;
 	domain->sva_cookie = cookie;
 	curr = xa_store(&sva_domain_array, mm->pasid, domain, GFP_KERNEL);
@@ -118,6 +120,7 @@ iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
 static void iommu_sva_free_domain(struct iommu_domain *domain)
 {
 	xa_erase(&sva_domain_array, domain->sva_cookie->pasid);
+	WARN_ON(!list_empty(&domain->sva_cookie->bonds));
 	kfree(domain->sva_cookie);
 	domain->ops->free(domain);
 }
@@ -137,7 +140,7 @@ void iommu_sva_domain_put_user(struct iommu_domain *domain)
 		iommu_sva_free_domain(domain);
 }
 
-static __maybe_unused struct iommu_domain *
+static struct iommu_domain *
 iommu_sva_get_domain(struct device *dev, struct mm_struct *mm)
 {
 	struct iommu_domain *domain;
@@ -158,3 +161,108 @@ struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain)
 {
 	return domain->sva_cookie->mm;
 }
+
+/**
+ * iommu_sva_bind_device() - Bind a process address space to a device
+ * @dev: the device
+ * @mm: the mm to bind, caller must hold a reference to it
+ * @drvdata: opaque data pointer to pass to bind callback
+ *
+ * Create a bond between device and address space, allowing the device to access
+ * the mm using the returned PASID. If a bond already exists between @device and
+ * @mm, it is returned and an additional reference is taken. Caller must call
+ * iommu_sva_unbind_device() to release each reference.
+ *
+ * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
+ * initialize the required SVA features.
+ *
+ * On error, returns an ERR_PTR value.
+ */
+struct iommu_sva *
+iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
+{
+	int ret = -EINVAL;
+	struct iommu_sva *handle;
+	struct iommu_domain *domain;
+
+	ret = iommu_sva_alloc_pasid(mm, 1, (1U << dev->iommu->pasid_bits) - 1);
+	if (ret)
+		return ERR_PTR(ret);
+
+	mutex_lock(&iommu_sva_lock);
+	domain = iommu_sva_get_domain(dev, mm);
+	if (!domain) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	/* Search for an existing bond. */
+	list_for_each_entry(handle, &domain->sva_cookie->bonds, node) {
+		if (handle->dev == dev && handle->pasid == mm->pasid) {
+			refcount_inc(&handle->users);
+			mutex_unlock(&iommu_sva_lock);
+
+			return handle;
+		}
+	}
+
+	handle = kzalloc(sizeof(*handle), GFP_KERNEL);
+	if (!handle) {
+		ret = -ENOMEM;
+		goto out_put_domain;
+	}
+
+	ret = iommu_attach_device_pasid(domain, dev, mm->pasid);
+	if (ret)
+		goto out_free_handle;
+
+	handle->dev = dev;
+	handle->domain = domain;
+	handle->pasid = mm->pasid;
+	refcount_set(&handle->users, 1);
+	list_add_tail(&handle->node, &domain->sva_cookie->bonds);
+
+	mutex_unlock(&iommu_sva_lock);
+	return handle;
+
+out_free_handle:
+	kfree(handle);
+out_put_domain:
+	iommu_sva_domain_put_user(domain);
+out_unlock:
+	mutex_unlock(&iommu_sva_lock);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
+
+/**
+ * iommu_sva_unbind_device() - Remove a bond created with iommu_sva_bind_device
+ * @handle: the handle returned by iommu_sva_bind_device()
+ *
+ * Put reference to a bond between device and address space. The device should
+ * not be issuing any more transaction for this PASID. All outstanding page
+ * requests for this PASID must have been flushed to the IOMMU.
+ */
+void iommu_sva_unbind_device(struct iommu_sva *handle)
+{
+	struct device *dev = handle->dev;
+	struct iommu_domain *domain = handle->domain;
+	struct mm_struct *mm = iommu_sva_domain_mm(domain);
+
+	mutex_lock(&iommu_sva_lock);
+	if (refcount_dec_and_test(&handle->users)) {
+		list_del(&handle->node);
+		iommu_detach_device_pasid(domain, dev, mm->pasid);
+		kfree(handle);
+	}
+
+	iommu_sva_domain_put_user(domain);
+	mutex_unlock(&iommu_sva_lock);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
+
+u32 iommu_sva_get_pasid(struct iommu_sva *handle)
+{
+	return handle->pasid;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8163ad7f6902..6b51ead9d63b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2712,98 +2712,6 @@ bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features feat)
 }
 EXPORT_SYMBOL_GPL(iommu_dev_feature_enabled);
 
-/**
- * iommu_sva_bind_device() - Bind a process address space to a device
- * @dev: the device
- * @mm: the mm to bind, caller must hold a reference to it
- * @drvdata: opaque data pointer to pass to bind callback
- *
- * Create a bond between device and address space, allowing the device to access
- * the mm using the returned PASID. If a bond already exists between @device and
- * @mm, it is returned and an additional reference is taken. Caller must call
- * iommu_sva_unbind_device() to release each reference.
- *
- * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
- * initialize the required SVA features.
- *
- * On error, returns an ERR_PTR value.
- */
-struct iommu_sva *
-iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
-{
-	struct iommu_group *group;
-	struct iommu_sva *handle = ERR_PTR(-EINVAL);
-	const struct iommu_ops *ops = dev_iommu_ops(dev);
-
-	if (!ops->sva_bind)
-		return ERR_PTR(-ENODEV);
-
-	group = iommu_group_get(dev);
-	if (!group)
-		return ERR_PTR(-ENODEV);
-
-	/* Ensure device count and domain don't change while we're binding */
-	mutex_lock(&group->mutex);
-
-	/*
-	 * To keep things simple, SVA currently doesn't support IOMMU groups
-	 * with more than one device. Existing SVA-capable systems are not
-	 * affected by the problems that required IOMMU groups (lack of ACS
-	 * isolation, device ID aliasing and other hardware issues).
-	 */
-	if (!iommu_group_singleton_lockdown(group))
-		goto out_unlock;
-
-	handle = ops->sva_bind(dev, mm, drvdata);
-
-out_unlock:
-	mutex_unlock(&group->mutex);
-	iommu_group_put(group);
-
-	return handle;
-}
-EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
-
-/**
- * iommu_sva_unbind_device() - Remove a bond created with iommu_sva_bind_device
- * @handle: the handle returned by iommu_sva_bind_device()
- *
- * Put reference to a bond between device and address space. The device should
- * not be issuing any more transaction for this PASID. All outstanding page
- * requests for this PASID must have been flushed to the IOMMU.
- */
-void iommu_sva_unbind_device(struct iommu_sva *handle)
-{
-	struct iommu_group *group;
-	struct device *dev = handle->dev;
-	const struct iommu_ops *ops = dev_iommu_ops(dev);
-
-	if (!ops->sva_unbind)
-		return;
-
-	group = iommu_group_get(dev);
-	if (!group)
-		return;
-
-	mutex_lock(&group->mutex);
-	ops->sva_unbind(handle);
-	mutex_unlock(&group->mutex);
-
-	iommu_group_put(group);
-}
-EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
-
-u32 iommu_sva_get_pasid(struct iommu_sva *handle)
-{
-	const struct iommu_ops *ops = dev_iommu_ops(handle->dev);
-
-	if (!ops->sva_get_pasid)
-		return IOMMU_PASID_INVALID;
-
-	return ops->sva_get_pasid(handle);
-}
-EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
-
 /*
  * Changes the default domain of an iommu group that has *only* one device
  *
-- 
2.25.1


* [PATCH RFC v2 09/11] iommu: Remove SVA related callbacks from iommu ops
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (7 preceding siblings ...)
  2022-03-29  5:37 ` [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-29  5:37 ` [PATCH RFC v2 10/11] iommu: Per-domain I/O page fault handling Lu Baolu
  2022-03-29  5:38 ` [PATCH RFC v2 11/11] iommu: Rename iommu-sva-lib.{c,h} Lu Baolu
  10 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

These callbacks have been replaced with the generic
attach/detach_dev_pasid domain ops and are no longer needed. Remove
them to avoid dead code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/intel-iommu.h                   |  4 --
 include/linux/iommu.h                         |  8 ---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 17 -------
 drivers/iommu/iommu-sva-lib.h                 |  1 -
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 41 ----------------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  3 --
 drivers/iommu/intel/iommu.c                   |  3 --
 drivers/iommu/intel/svm.c                     | 49 -------------------
 drivers/iommu/iommu-sva-lib.c                 |  4 +-
 9 files changed, 2 insertions(+), 128 deletions(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index c14283137fb5..7ce12590ce0f 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -738,10 +738,6 @@ struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn);
 extern void intel_svm_check(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
-				 void *drvdata);
-void intel_svm_unbind(struct iommu_sva *handle);
-u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
 			    struct iommu_page_response *msg);
 extern const struct iommu_domain_ops intel_svm_domain_ops;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 11c4d99e122d..7e30b88d7bef 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -213,9 +213,6 @@ struct iommu_iotlb_gather {
  * @dev_has/enable/disable_feat: per device entries to check/enable/disable
  *                               iommu specific features.
  * @dev_feat_enabled: check enabled feature
- * @sva_bind: Bind process address space to device
- * @sva_unbind: Unbind process address space from device
- * @sva_get_pasid: Get PASID associated to a SVA handle
  * @page_response: handle page request response
  * @def_domain_type: device default domain type, return value:
  *		- IOMMU_DOMAIN_IDENTITY: must use an identity domain
@@ -249,11 +246,6 @@ struct iommu_ops {
 	int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
 	int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
 
-	struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,
-				      void *drvdata);
-	void (*sva_unbind)(struct iommu_sva *handle);
-	u32 (*sva_get_pasid)(struct iommu_sva *handle);
-
 	int (*page_response)(struct device *dev,
 			     struct iommu_fault_event *evt,
 			     struct iommu_page_response *msg);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 7631c00fdcbd..2513309ec0db 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -754,10 +754,6 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master);
 int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
 int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
 bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
-struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
-				    void *drvdata);
-void arm_smmu_sva_unbind(struct iommu_sva *handle);
-u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle);
 void arm_smmu_sva_notifier_synchronize(void);
 int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
 				  struct device *dev, ioasid_t id);
@@ -794,19 +790,6 @@ static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master
 	return false;
 }
 
-static inline struct iommu_sva *
-arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline void arm_smmu_sva_unbind(struct iommu_sva *handle) {}
-
-static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
-{
-	return IOMMU_PASID_INVALID;
-}
-
 static inline void arm_smmu_sva_notifier_synchronize(void) {}
 
 static inline int arm_smmu_sva_attach_dev_pasid(struct iommu_domain *domain,
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index 1a71218b07f5..95d51f748dff 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -8,7 +8,6 @@
 #include <linux/ioasid.h>
 #include <linux/mm_types.h>
 
-int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
 struct mm_struct *iommu_sva_find(ioasid_t pasid);
 struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain);
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index ce229820086d..5591321f9e11 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -328,11 +328,6 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 	if (!bond)
 		return ERR_PTR(-ENOMEM);
 
-	/* Allocate a PASID for this mm if necessary */
-	ret = iommu_sva_alloc_pasid(mm, 1, (1U << master->ssid_bits) - 1);
-	if (ret)
-		goto err_free_bond;
-
 	bond->mm = mm;
 	bond->sva.dev = dev;
 	refcount_set(&bond->refs, 1);
@@ -351,42 +346,6 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 	return ERR_PTR(ret);
 }
 
-struct iommu_sva *
-arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
-{
-	struct iommu_sva *handle;
-	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-
-	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
-		return ERR_PTR(-EINVAL);
-
-	mutex_lock(&sva_lock);
-	handle = __arm_smmu_sva_bind(dev, mm);
-	mutex_unlock(&sva_lock);
-	return handle;
-}
-
-void arm_smmu_sva_unbind(struct iommu_sva *handle)
-{
-	struct arm_smmu_bond *bond = sva_to_bond(handle);
-
-	mutex_lock(&sva_lock);
-	if (refcount_dec_and_test(&bond->refs)) {
-		list_del(&bond->list);
-		arm_smmu_mmu_notifier_put(bond->smmu_mn);
-		kfree(bond);
-	}
-	mutex_unlock(&sva_lock);
-}
-
-u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
-{
-	struct arm_smmu_bond *bond = sva_to_bond(handle);
-
-	return bond->mm->pasid;
-}
-
 bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
 {
 	unsigned long reg, fld;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bd80de0bad98..543d3ef1c102 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2875,9 +2875,6 @@ static struct iommu_ops arm_smmu_ops = {
 	.dev_feat_enabled	= arm_smmu_dev_feature_enabled,
 	.dev_enable_feat	= arm_smmu_dev_enable_feature,
 	.dev_disable_feat	= arm_smmu_dev_disable_feature,
-	.sva_bind		= arm_smmu_sva_bind,
-	.sva_unbind		= arm_smmu_sva_unbind,
-	.sva_get_pasid		= arm_smmu_sva_get_pasid,
 	.page_response		= arm_smmu_page_response,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 	.owner			= THIS_MODULE,
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 5eae7cf9bee5..1d6a62512bca 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4890,9 +4890,6 @@ const struct iommu_ops intel_iommu_ops = {
 	.def_domain_type	= device_def_domain_type,
 	.pgsize_bitmap		= SZ_4K,
 #ifdef CONFIG_INTEL_IOMMU_SVM
-	.sva_bind		= intel_svm_bind,
-	.sva_unbind		= intel_svm_unbind,
-	.sva_get_pasid		= intel_svm_get_pasid,
 	.page_response		= intel_svm_page_response,
 #endif
 	.default_domain_ops = &(const struct iommu_domain_ops) {
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 8f59dd641b2d..c76b99b46626 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -313,14 +313,6 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
 	return 0;
 }
 
-static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm)
-{
-	ioasid_t max_pasid = dev_is_pci(dev) ?
-			pci_max_pasids(to_pci_dev(dev)) : intel_pasid_max_id;
-
-	return iommu_sva_alloc_pasid(mm, PASID_MIN, max_pasid - 1);
-}
-
 static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 					   struct device *dev,
 					   struct mm_struct *mm)
@@ -810,47 +802,6 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 	return IRQ_RETVAL(handled);
 }
 
-struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
-{
-	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
-	struct iommu_sva *sva;
-	int ret;
-
-	mutex_lock(&pasid_mutex);
-	ret = intel_svm_alloc_pasid(dev, mm);
-	if (ret) {
-		mutex_unlock(&pasid_mutex);
-		return ERR_PTR(ret);
-	}
-
-	sva = intel_svm_bind_mm(iommu, dev, mm);
-	mutex_unlock(&pasid_mutex);
-
-	return sva;
-}
-
-void intel_svm_unbind(struct iommu_sva *sva)
-{
-	struct intel_svm_dev *sdev = to_intel_svm_dev(sva);
-
-	mutex_lock(&pasid_mutex);
-	intel_svm_unbind_mm(sdev->dev, sdev->pasid);
-	mutex_unlock(&pasid_mutex);
-}
-
-u32 intel_svm_get_pasid(struct iommu_sva *sva)
-{
-	struct intel_svm_dev *sdev;
-	u32 pasid;
-
-	mutex_lock(&pasid_mutex);
-	sdev = to_intel_svm_dev(sva);
-	pasid = sdev->pasid;
-	mutex_unlock(&pasid_mutex);
-
-	return pasid;
-}
-
 int intel_svm_page_response(struct device *dev,
 			    struct iommu_fault_event *evt,
 			    struct iommu_page_response *msg)
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index 1b45b7d01836..96e967e58aa9 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -32,7 +32,8 @@ struct iommu_sva_cookie {
  *
  * Returns 0 on success and < 0 on error.
  */
-int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
+static int iommu_sva_alloc_pasid(struct mm_struct *mm,
+				 ioasid_t min, ioasid_t max)
 {
 	int ret = 0;
 	ioasid_t pasid;
@@ -58,7 +59,6 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
 	mutex_unlock(&iommu_sva_lock);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
 
 /* ioasid_find getter() requires a void * argument */
 static bool __mmget_not_zero(void *mm)
-- 
2.25.1


* [PATCH RFC v2 10/11] iommu: Per-domain I/O page fault handling
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (8 preceding siblings ...)
  2022-03-29  5:37 ` [PATCH RFC v2 09/11] iommu: Remove SVA related callbacks from iommu ops Lu Baolu
@ 2022-03-29  5:37 ` Lu Baolu
  2022-03-29  5:38 ` [PATCH RFC v2 11/11] iommu: Rename iommu-sva-lib.{c,h} Lu Baolu
  10 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:37 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Tweak the I/O page fault handling framework to route page faults to the
domain and call the page fault handler retrieved from the domain. This
enables the I/O page fault handling framework to serve more usage
scenarios: any component that has an IOMMU domain and installs a page
fault handler in it can use the framework. This also refactors the SVA
implementation to be the first consumer of the per-domain page fault
handling model.
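The dispatch model described above — look up the domain for the faulting PASID and invoke the handler installed in it, answering an invalid response when either the domain or the handler is missing — can be sketched in plain C. This is a simplified userspace model, not the kernel code: `struct domain`, `dispatch_fault()` and `demo_handler()` are illustrative stand-ins for `struct iommu_domain`, the dispatch loop in `iopf_handle_group()`, and a driver-installed handler.

```c
#include <stddef.h>

/* Possible page responses, mirroring enum iommu_page_response_code. */
enum resp { RESP_SUCCESS, RESP_INVALID };

struct fault {
	unsigned long addr;
	int pasid;
};

/* Illustrative stand-in for struct iommu_domain with the two fields
 * the patch adds: an installed handler and its opaque data pointer. */
struct domain {
	enum resp (*fault_handler)(struct fault *f, void *data);
	void *fault_data;
};

/* Model of the dispatch in iopf_handle_group(): a fault whose PASID
 * resolves to no domain, or to a domain without a handler, cannot be
 * handled and gets an invalid response. */
static enum resp dispatch_fault(struct domain *d, struct fault *f)
{
	if (!d || !d->fault_handler)
		return RESP_INVALID;
	return d->fault_handler(f, d->fault_data);
}

/* Toy handler: pretend only addresses below 1 MiB are resolvable. */
static enum resp demo_handler(struct fault *f, void *data)
{
	(void)data;
	return f->addr < (1UL << 20) ? RESP_SUCCESS : RESP_INVALID;
}
```

The point of the design is visible in the null checks: the framework itself no longer knows anything about SVA, it only forwards faults to whatever handler the domain owner installed.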

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h         |  4 ++
 drivers/iommu/iommu-sva-lib.h |  1 -
 drivers/iommu/io-pgfault.c    | 70 +++++++---------------------------
 drivers/iommu/iommu-sva-lib.c | 71 +++++++++++++++++++++++++++--------
 4 files changed, 72 insertions(+), 74 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7e30b88d7bef..729d602fcba5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -51,6 +51,8 @@ struct iommu_sva_cookie;
 typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
 			struct device *, unsigned long, int, void *);
 typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault *, void *);
+typedef enum iommu_page_response_code (*iommu_domain_iopf_handler_t)
+			(struct iommu_fault *, void *);
 
 struct iommu_domain_geometry {
 	dma_addr_t aperture_start; /* First address that can be mapped    */
@@ -102,6 +104,8 @@ struct iommu_domain {
 	struct iommu_domain_geometry geometry;
 	struct iommu_dma_cookie *iova_cookie;
 	struct iommu_sva_cookie *sva_cookie;
+	iommu_domain_iopf_handler_t fault_handler;
+	void *fault_data;
 };
 
 static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index 95d51f748dff..abef83ac1b2d 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -8,7 +8,6 @@
 #include <linux/ioasid.h>
 #include <linux/mm_types.h>
 
-struct mm_struct *iommu_sva_find(ioasid_t pasid);
 struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain);
 
 /* I/O Page fault */
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index 1df8c1dcae77..159e4f107fe3 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -69,62 +69,6 @@ static int iopf_complete_group(struct device *dev, struct iopf_fault *iopf,
 	return iommu_page_response(dev, &resp);
 }
 
-static enum iommu_page_response_code
-iopf_handle_single(struct iopf_fault *iopf)
-{
-	vm_fault_t ret;
-	struct mm_struct *mm;
-	struct vm_area_struct *vma;
-	unsigned int access_flags = 0;
-	unsigned int fault_flags = FAULT_FLAG_REMOTE;
-	struct iommu_fault_page_request *prm = &iopf->fault.prm;
-	enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
-
-	if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
-		return status;
-
-	mm = iommu_sva_find(prm->pasid);
-	if (IS_ERR_OR_NULL(mm))
-		return status;
-
-	mmap_read_lock(mm);
-
-	vma = find_extend_vma(mm, prm->addr);
-	if (!vma)
-		/* Unmapped area */
-		goto out_put_mm;
-
-	if (prm->perm & IOMMU_FAULT_PERM_READ)
-		access_flags |= VM_READ;
-
-	if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
-		access_flags |= VM_WRITE;
-		fault_flags |= FAULT_FLAG_WRITE;
-	}
-
-	if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
-		access_flags |= VM_EXEC;
-		fault_flags |= FAULT_FLAG_INSTRUCTION;
-	}
-
-	if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
-		fault_flags |= FAULT_FLAG_USER;
-
-	if (access_flags & ~vma->vm_flags)
-		/* Access fault */
-		goto out_put_mm;
-
-	ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
-	status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :
-		IOMMU_PAGE_RESP_SUCCESS;
-
-out_put_mm:
-	mmap_read_unlock(mm);
-	mmput(mm);
-
-	return status;
-}
-
 static void iopf_handle_group(struct work_struct *work)
 {
 	struct iopf_group *group;
@@ -134,12 +78,24 @@ static void iopf_handle_group(struct work_struct *work)
 	group = container_of(work, struct iopf_group, work);
 
 	list_for_each_entry_safe(iopf, next, &group->faults, list) {
+		struct iommu_domain *domain;
+
+		domain = iommu_get_domain_for_dev_pasid(group->dev,
+							iopf->fault.prm.pasid);
+
+		if (!domain || !domain->fault_handler)
+			status = IOMMU_PAGE_RESP_INVALID;
+
 		/*
 		 * For the moment, errors are sticky: don't handle subsequent
 		 * faults in the group if there is an error.
 		 */
 		if (status == IOMMU_PAGE_RESP_SUCCESS)
-			status = iopf_handle_single(iopf);
+			status = domain->fault_handler(&iopf->fault,
+						       domain->fault_data);
+
+		if (domain)
+			iommu_put_domain_for_dev_pasid(domain);
 
 		if (!(iopf->fault.prm.flags &
 		      IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index 96e967e58aa9..a53574f9559a 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -60,25 +60,62 @@ static int iommu_sva_alloc_pasid(struct mm_struct *mm,
 	return ret;
 }
 
-/* ioasid_find getter() requires a void * argument */
-static bool __mmget_not_zero(void *mm)
+static enum iommu_page_response_code
+iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
 {
-	return mmget_not_zero(mm);
-}
+	vm_fault_t ret;
+	struct mm_struct *mm;
+	struct vm_area_struct *vma;
+	unsigned int access_flags = 0;
+	struct iommu_domain *domain = data;
+	unsigned int fault_flags = FAULT_FLAG_REMOTE;
+	struct iommu_fault_page_request *prm = &fault->prm;
+	enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
 
-/**
- * iommu_sva_find() - Find mm associated to the given PASID
- * @pasid: Process Address Space ID assigned to the mm
- *
- * On success a reference to the mm is taken, and must be released with mmput().
- *
- * Returns the mm corresponding to this PASID, or an error if not found.
- */
-struct mm_struct *iommu_sva_find(ioasid_t pasid)
-{
-	return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);
+	if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
+		return status;
+
+	mm = iommu_sva_domain_mm(domain);
+	if (IS_ERR_OR_NULL(mm) || !mmget_not_zero(mm))
+		return status;
+
+	mmap_read_lock(mm);
+
+	vma = find_extend_vma(mm, prm->addr);
+	if (!vma)
+		/* Unmapped area */
+		goto out_put_mm;
+
+	if (prm->perm & IOMMU_FAULT_PERM_READ)
+		access_flags |= VM_READ;
+
+	if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
+		access_flags |= VM_WRITE;
+		fault_flags |= FAULT_FLAG_WRITE;
+	}
+
+	if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
+		access_flags |= VM_EXEC;
+		fault_flags |= FAULT_FLAG_INSTRUCTION;
+	}
+
+	if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
+		fault_flags |= FAULT_FLAG_USER;
+
+	if (access_flags & ~vma->vm_flags)
+		/* Access fault */
+		goto out_put_mm;
+
+	ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
+	status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :
+		IOMMU_PAGE_RESP_SUCCESS;
+
+out_put_mm:
+	mmap_read_unlock(mm);
+	mmput(mm);
+
+	return status;
 }
-EXPORT_SYMBOL_GPL(iommu_sva_find);
 
 static struct iommu_domain *
 iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
@@ -105,6 +142,8 @@ iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
 	INIT_LIST_HEAD(&cookie->bonds);
 	domain->type = IOMMU_DOMAIN_SVA;
 	domain->sva_cookie = cookie;
+	domain->fault_handler = iommu_sva_handle_iopf;
+	domain->fault_data = domain;
 	curr = xa_store(&sva_domain_array, mm->pasid, domain, GFP_KERNEL);
 	if (xa_err(curr))
 		goto err_xa_store;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH RFC v2 11/11] iommu: Rename iommu-sva-lib.{c,h}
  2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
                   ` (9 preceding siblings ...)
  2022-03-29  5:37 ` [PATCH RFC v2 10/11] iommu: Per-domain I/O page fault handling Lu Baolu
@ 2022-03-29  5:38 ` Lu Baolu
  10 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-29  5:38 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Kevin Tian,
	Ashok Raj, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Jacob jun Pan

Rename iommu-sva-lib.c[h] to iommu-sva.c[h] as it contains all the code
for the SVA implementation in the iommu core.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/{iommu-sva-lib.h => iommu-sva.h}  | 0
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c     | 2 +-
 drivers/iommu/intel/iommu.c                     | 2 +-
 drivers/iommu/intel/svm.c                       | 2 +-
 drivers/iommu/io-pgfault.c                      | 2 +-
 drivers/iommu/{iommu-sva-lib.c => iommu-sva.c}  | 2 +-
 drivers/iommu/iommu.c                           | 2 +-
 drivers/iommu/Makefile                          | 2 +-
 9 files changed, 8 insertions(+), 8 deletions(-)
 rename drivers/iommu/{iommu-sva-lib.h => iommu-sva.h} (100%)
 rename drivers/iommu/{iommu-sva-lib.c => iommu-sva.c} (99%)

diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva.h
similarity index 100%
rename from drivers/iommu/iommu-sva-lib.h
rename to drivers/iommu/iommu-sva.h
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 5591321f9e11..a13543aa9df4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -9,7 +9,7 @@
 #include <linux/slab.h>
 
 #include "arm-smmu-v3.h"
-#include "../../iommu-sva-lib.h"
+#include "../../iommu-sva.h"
 #include "../../io-pgtable-arm.h"
 
 struct arm_smmu_mmu_notifier {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 543d3ef1c102..ca2bd17eec41 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -31,7 +31,7 @@
 #include <linux/amba/bus.h>
 
 #include "arm-smmu-v3.h"
-#include "../../iommu-sva-lib.h"
+#include "../../iommu-sva.h"
 
 static bool disable_bypass = true;
 module_param(disable_bypass, bool, 0444);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 1d6a62512bca..57a3d54c414b 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -27,7 +27,7 @@
 #include <linux/tboot.h>
 
 #include "../irq_remapping.h"
-#include "../iommu-sva-lib.h"
+#include "../iommu-sva.h"
 #include "pasid.h"
 #include "cap_audit.h"
 
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index c76b99b46626..b6b83ca705d5 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -25,7 +25,7 @@
 
 #include "pasid.h"
 #include "perf.h"
-#include "../iommu-sva-lib.h"
+#include "../iommu-sva.h"
 
 static irqreturn_t prq_event_thread(int irq, void *d);
 static void intel_svm_drain_prq(struct device *dev, u32 pasid);
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index 159e4f107fe3..5468993c8c6c 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -11,7 +11,7 @@
 #include <linux/slab.h>
 #include <linux/workqueue.h>
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 /**
  * struct iopf_queue - IO Page Fault queue
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva.c
similarity index 99%
rename from drivers/iommu/iommu-sva-lib.c
rename to drivers/iommu/iommu-sva.c
index a53574f9559a..dbe4b9958e48 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva.c
@@ -7,7 +7,7 @@
 #include <linux/slab.h>
 #include <linux/sched/mm.h>
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 static DEFINE_MUTEX(iommu_sva_lock);
 static DECLARE_IOASID_SET(iommu_sva_pasid);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 6b51ead9d63b..f4a5d6351421 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -27,7 +27,7 @@
 #include <linux/cc_platform.h>
 #include <trace/events/iommu.h>
 
-#include "iommu-sva-lib.h"
+#include "iommu-sva.h"
 
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 44475a9b3eea..c1763476162b 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,6 +27,6 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
-obj-$(CONFIG_IOMMU_SVA) += iommu-sva-lib.o io-pgfault.o
+obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o io-pgfault.o
 obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
 obj-$(CONFIG_APPLE_DART) += apple-dart.o
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-29  5:37 ` [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown() Lu Baolu
@ 2022-03-29  8:42   ` Tian, Kevin
  2022-03-29 11:42     ` Jason Gunthorpe via iommu
                       ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: Tian, Kevin @ 2022-03-29  8:42 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Raj,
	Ashok, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: iommu, Pan, Jacob jun, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, March 29, 2022 1:38 PM
> 
> Some of the interfaces in the IOMMU core require that only a single
> kernel device driver controls the device in the IOMMU group. The
> existing method is to check the device count in the IOMMU group in
> the interfaces. This is unreliable because any device added to the
> IOMMU group later breaks this assumption without notifying the driver
> using the interface. This adds iommu_group_singleton_lockdown() that
> checks the requirement and locks down the IOMMU group with only a single
> device driver bound.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/iommu.c | 30 ++++++++++++++++++------------
>  1 file changed, 18 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 0c42ece25854..9fb8a5b4491e 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -48,6 +48,7 @@ struct iommu_group {
>  	struct list_head entry;
>  	unsigned int owner_cnt;
>  	void *owner;
> +	bool singleton_lockdown;
>  };
> 
>  struct group_device {
> @@ -968,15 +969,16 @@ void iommu_group_remove_device(struct device
> *dev)
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_remove_device);
> 
> -static int iommu_group_device_count(struct iommu_group *group)
> +/* Callers should hold the group->mutex. */
> +static bool iommu_group_singleton_lockdown(struct iommu_group *group)
>  {
> -	struct group_device *entry;
> -	int ret = 0;
> -
> -	list_for_each_entry(entry, &group->devices, list)
> -		ret++;
> +	if (group->owner_cnt != 1 ||
> +	    group->domain != group->default_domain ||
> +	    group->owner)
> +		return false;

Curious why there would be a case where a group uses the default_domain
while still having an owner? I have the impression that the owner is used
for userspace DMA, where a different domain is used.

> +	group->singleton_lockdown = true;
> 
> -	return ret;
> +	return true;
>  }

btw I'm not sure whether this is what SVA requires. IIRC the problem with
SVA is that the PASID TLP prefix is not counted in PCI packet routing, thus
a DMA target address with a PASID might be treated as P2P if the address
falls into the MMIO BAR of another device in the group. This is why the
original code needs to strictly apply SVA in a group containing a single
device, instead of a group attached by a single driver, unless we want to
reserve those MMIO ranges in the CPU VA space.

Jean can correct me if my memory is wrong.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-29  8:42   ` Tian, Kevin
@ 2022-03-29 11:42     ` Jason Gunthorpe via iommu
  2022-03-30  6:50       ` Tian, Kevin
  2022-04-01  6:20       ` Yi Liu
  2022-03-30  4:59     ` Lu Baolu
  2022-04-01  5:49     ` Yi Liu
  2 siblings, 2 replies; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-03-29 11:42 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

On Tue, Mar 29, 2022 at 08:42:13AM +0000, Tian, Kevin wrote:

> btw I'm not sure whether this is what SVA requires. IIRC the problem with
> SVA is because PASID TLP prefix is not counted in PCI packet routing thus
> a DMA target address with PASID might be treated as P2P if the address
> falls into the MMIO BAR of other devices in the group. This is why the
> original code needs to strictly apply SVA in a group containing a single
> device, instead of a group attached by a single driver, unless we want to
> reserve those MMIO ranges in CPU VA space.

I think it is not such a good idea to mix up group with this test

Here you want to say that all TLPs from the RID route to the host
bridge - ie ACS is on/etc. This is subtly different from a group with
a single device. Specifically it is an immutable property of the
fabric and doesn't change after hot plug events.

ie if we have a singleton group that doesn't have ACS and someone
hotplugs in another device on a bridge, then our SVA is completely
broken and we get data corruption.

Testing the group size is inherently the wrong test to make.

Jason

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu
  2022-03-29  5:37 ` [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu Lu Baolu
@ 2022-03-29 21:00   ` Jacob Pan
  2022-03-30  4:30     ` Lu Baolu
  2022-03-30  7:05   ` Tian, Kevin
  1 sibling, 1 reply; 62+ messages in thread
From: Jacob Pan @ 2022-03-29 21:00 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, jacob.jun.pan,
	Jason Gunthorpe, Will Deacon

Hi BaoLu,

On Tue, 29 Mar 2022 13:37:50 +0800, Lu Baolu <baolu.lu@linux.intel.com>
wrote:

> Use this field to save the pasid/ssid bits that a device is able to
> support with its IOMMU hardware. It is a generic attribute of a device
> and lifting it into the per-device dev_iommu struct makes it possible
> to allocate a PASID for a device without calls into the IOMMU drivers.
> Any iommu driver which supports PASID-related features should set this
> field before features are enabled on the devices.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  include/linux/iommu.h                       | 1 +
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 ++
>  drivers/iommu/intel/iommu.c                 | 5 ++++-
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 6ef2df258673..36f43af0af53 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -368,6 +368,7 @@ struct dev_iommu {
>  	struct iommu_fwspec		*fwspec;
>  	struct iommu_device		*iommu_dev;
>  	void				*priv;
> +	unsigned int			pasid_bits;
pasid_width?
PCI spec uses "Max PASID Width"

>  };
>  
>  int iommu_device_register(struct iommu_device *iommu,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 627a3ed5ee8f..afc63fce6107 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2681,6 +2681,8 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
>  	    smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
>  		master->stall_enabled = true;
>  
> +	dev->iommu->pasid_bits = master->ssid_bits;
> +
>  	return &smmu->iommu;
>  
>  err_free_master:
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 6f7485c44a4b..c1b91bce1530 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4587,8 +4587,11 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
>  			if (pasid_supported(iommu)) {
>  				int features = pci_pasid_features(pdev);
>  
> -				if (features >= 0)
> +				if (features >= 0) {
>  					info->pasid_supported = features | 1;
> +					dev->iommu->pasid_bits =
> +						fls(pci_max_pasids(pdev)) - 1;
> +				}
>  			}
>  
>  			if (info->ats_supported && ecap_prs(iommu->ecap) &&


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-03-29  5:37 ` [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA Lu Baolu
@ 2022-03-29 21:38   ` Jacob Pan
  2022-03-30  4:35     ` Lu Baolu
  2022-03-30 19:02   ` Jason Gunthorpe via iommu
  1 sibling, 1 reply; 62+ messages in thread
From: Jacob Pan @ 2022-03-29 21:38 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, jacob.jun.pan,
	Jason Gunthorpe, Will Deacon

Hi BaoLu,

On Tue, 29 Mar 2022 13:37:52 +0800, Lu Baolu <baolu.lu@linux.intel.com>
wrote:

> Add a new iommu domain type IOMMU_DOMAIN_SVA to represent an I/O page
> table which is shared from CPU host VA. Add some helpers to get and
> put an SVA domain and implement SVA domain life cycle management.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  include/linux/iommu.h         |  7 +++
>  drivers/iommu/iommu-sva-lib.h | 10 ++++
>  drivers/iommu/iommu-sva-lib.c | 89 +++++++++++++++++++++++++++++++++++
>  3 files changed, 106 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 36f43af0af53..29c4c2edd706 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -42,6 +42,7 @@ struct notifier_block;
>  struct iommu_sva;
>  struct iommu_fault_event;
>  struct iommu_dma_cookie;
> +struct iommu_sva_cookie;
>  
>  /* iommu fault flags */
>  #define IOMMU_FAULT_READ	0x0
> @@ -64,6 +65,9 @@ struct iommu_domain_geometry {
>  #define __IOMMU_DOMAIN_PT	(1U << 2)  /* Domain is identity mapped   */
>  #define __IOMMU_DOMAIN_DMA_FQ	(1U << 3)  /* DMA-API uses flush queue    */
>  
> +#define __IOMMU_DOMAIN_SHARED	(1U << 4)  /* Page table shared from CPU  */
> +#define __IOMMU_DOMAIN_HOST_VA	(1U << 5)  /* Host CPU virtual address */
> +
>  /*
>   * This are the possible domain-types
>   *
> @@ -86,6 +90,8 @@ struct iommu_domain_geometry {
>  #define IOMMU_DOMAIN_DMA_FQ	(__IOMMU_DOMAIN_PAGING |	\
>  				 __IOMMU_DOMAIN_DMA_API |	\
>  				 __IOMMU_DOMAIN_DMA_FQ)
> +#define IOMMU_DOMAIN_SVA	(__IOMMU_DOMAIN_SHARED |	\
> +				 __IOMMU_DOMAIN_HOST_VA)
>  
>  struct iommu_domain {
>  	unsigned type;
> @@ -95,6 +101,7 @@ struct iommu_domain {
>  	void *handler_token;
>  	struct iommu_domain_geometry geometry;
>  	struct iommu_dma_cookie *iova_cookie;
> +	struct iommu_sva_cookie *sva_cookie;
>  };
>  
>  static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
> diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
> index 8909ea1094e3..1a71218b07f5 100644
> --- a/drivers/iommu/iommu-sva-lib.h
> +++ b/drivers/iommu/iommu-sva-lib.h
> @@ -10,6 +10,7 @@
>  
>  int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
>  struct mm_struct *iommu_sva_find(ioasid_t pasid);
> +struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain);
>  
>  /* I/O Page fault */
>  struct device;
> @@ -26,6 +27,8 @@ int iopf_queue_flush_dev(struct device *dev);
>  struct iopf_queue *iopf_queue_alloc(const char *name);
>  void iopf_queue_free(struct iopf_queue *queue);
>  int iopf_queue_discard_partial(struct iopf_queue *queue);
> +bool iommu_sva_domain_get_user(struct iommu_domain *domain);
> +void iommu_sva_domain_put_user(struct iommu_domain *domain);
>  
>  #else /* CONFIG_IOMMU_SVA */
>  static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
> @@ -63,5 +66,12 @@ static inline int iopf_queue_discard_partial(struct iopf_queue *queue)
>  {
>  	return -ENODEV;
>  }
> +
> +static inline bool iommu_sva_domain_get_user(struct iommu_domain *domain)
> +{
> +	return false;
> +}
> +
> +static inline void iommu_sva_domain_put_user(struct iommu_domain *domain) { }
>  #endif /* CONFIG_IOMMU_SVA */
>  #endif /* _IOMMU_SVA_LIB_H */
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index 106506143896..78820be23f15 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -3,12 +3,21 @@
>   * Helpers for IOMMU drivers implementing SVA
>   */
>  #include <linux/mutex.h>
> +#include <linux/iommu.h>
> +#include <linux/slab.h>
>  #include <linux/sched/mm.h>
>  
>  #include "iommu-sva-lib.h"
>  
>  static DEFINE_MUTEX(iommu_sva_lock);
>  static DECLARE_IOASID_SET(iommu_sva_pasid);
> +static DEFINE_XARRAY_ALLOC(sva_domain_array);
> +
> +struct iommu_sva_cookie {
> +	struct mm_struct *mm;
> +	ioasid_t pasid;
> +	refcount_t users;
> +};
>  
>  /**
>   * iommu_sva_alloc_pasid - Allocate a PASID for the mm
> @@ -69,3 +78,83 @@ struct mm_struct *iommu_sva_find(ioasid_t pasid)
>  	return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_find);
> +
> +static struct iommu_domain *
> +iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
> +{
> +	struct bus_type *bus = dev->bus;
> +	struct iommu_sva_cookie *cookie;
> +	struct iommu_domain *domain;
> +	void *curr;
> +
> +	if (!bus || !bus->iommu_ops)
> +		return NULL;
> +
> +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
> +	if (!cookie)
> +		return NULL;
> +
> +	domain = bus->iommu_ops->domain_alloc(IOMMU_DOMAIN_SVA);
> +	if (!domain)
> +		goto err_domain_alloc;
> +
> +	cookie->mm = mm;
> +	cookie->pasid = mm->pasid;
How do you manage the mm life cycle? Do you require the caller to take an
mm reference? Or should this be limited to the current mm?

> +	refcount_set(&cookie->users, 1);
> +	domain->type = IOMMU_DOMAIN_SVA;
> +	domain->sva_cookie = cookie;
> +	curr = xa_store(&sva_domain_array, mm->pasid, domain, GFP_KERNEL);
> +	if (xa_err(curr))
> +		goto err_xa_store;
> +
> +	return domain;
> +err_xa_store:
> +	domain->ops->free(domain);
> +err_domain_alloc:
> +	kfree(cookie);
> +	return NULL;
> +}
> +
> +static void iommu_sva_free_domain(struct iommu_domain *domain)
> +{
> +	xa_erase(&sva_domain_array, domain->sva_cookie->pasid);
> +	kfree(domain->sva_cookie);
> +	domain->ops->free(domain);
> +}
> +
> +bool iommu_sva_domain_get_user(struct iommu_domain *domain)
> +{
> +	struct iommu_sva_cookie *cookie = domain->sva_cookie;
> +
> +	return refcount_inc_not_zero(&cookie->users);
> +}
> +
> +void iommu_sva_domain_put_user(struct iommu_domain *domain)
> +{
> +	struct iommu_sva_cookie *cookie = domain->sva_cookie;
> +
> +	if (refcount_dec_and_test(&cookie->users))
> +		iommu_sva_free_domain(domain);
> +}
> +
> +static __maybe_unused struct iommu_domain *
> +iommu_sva_get_domain(struct device *dev, struct mm_struct *mm)
> +{
> +	struct iommu_domain *domain;
> +	ioasid_t pasid = mm->pasid;
> +
> +	if (pasid == INVALID_IOASID)
> +		return NULL;
> +
> +	domain = xa_load(&sva_domain_array, pasid);
> +	if (!domain)
> +		return iommu_sva_alloc_domain(dev, mm);
> +	iommu_sva_domain_get_user(domain);
> +
> +	return domain;
> +}
> +
> +struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain)
> +{
> +	return domain->sva_cookie->mm;
> +}


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu
  2022-03-29 21:00   ` Jacob Pan
@ 2022-03-30  4:30     ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-30  4:30 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Jason Gunthorpe,
	Will Deacon

Hi Jacob,

On 2022/3/30 5:00, Jacob Pan wrote:
> Hi BaoLu,
> 
> On Tue, 29 Mar 2022 13:37:50 +0800, Lu Baolu<baolu.lu@linux.intel.com>
> wrote:
> 
>> Use this field to save the pasid/ssid bits that a device is able to
>> support with its IOMMU hardware. It is a generic attribute of a device
>> and lifting it into the per-device dev_iommu struct makes it possible
>> to allocate a PASID for a device without calls into the IOMMU drivers.
>> Any iommu driver which supports PASID-related features should set this
>> field before features are enabled on the devices.
>>
>> Signed-off-by: Lu Baolu<baolu.lu@linux.intel.com>
>> ---
>>   include/linux/iommu.h                       | 1 +
>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 ++
>>   drivers/iommu/intel/iommu.c                 | 5 ++++-
>>   3 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 6ef2df258673..36f43af0af53 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -368,6 +368,7 @@ struct dev_iommu {
>>   	struct iommu_fwspec		*fwspec;
>>   	struct iommu_device		*iommu_dev;
>>   	void				*priv;
>> +	unsigned int			pasid_bits;
> pasid_width?
> PCI spec uses "Max PASID Width"
> 

My understanding is that this field represents "the pasid bits that the
device is able to use with its IOMMU". This field considers the
capabilities of both device and IOMMU. This is the reason why I put it
in the per-device iommu object and initialize it in the iommu
probe_device() callback.
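The computation behind this can be sketched in plain C: the Intel hunk derives the device-side width as `fls(pci_max_pasids(pdev)) - 1`, and the per-device value is conceptually capped by what the IOMMU supports. This is an illustrative userspace model; only `fls()` and `pci_max_pasids()` are real kernel helpers, while `fls_demo()` and `usable_pasid_bits()` are stand-ins written for this sketch.

```c
/* Find-last-set, like the kernel's fls(): 1-based position of the most
 * significant set bit, or 0 for an input of 0. */
static int fls_demo(unsigned int x)
{
	int r = 0;

	while (x) {
		x >>= 1;
		r++;
	}
	return r;
}

/* pci_max_pasids() reports the number of PASIDs, a power of two, so
 * fls(max_pasids) - 1 recovers the width in bits.  The usable width is
 * the smaller of the device's width and the IOMMU's width. */
static unsigned int usable_pasid_bits(unsigned int dev_max_pasids,
				      unsigned int iommu_pasid_bits)
{
	unsigned int dev_bits = fls_demo(dev_max_pasids) - 1;

	return dev_bits < iommu_pasid_bits ? dev_bits : iommu_pasid_bits;
}
```

For example, a device advertising the PCIe maximum of 2^20 PASIDs behind an IOMMU supporting only 16 PASID bits would end up with a usable width of 16 bits.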

Best regards,
baolu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-03-29 21:38   ` Jacob Pan
@ 2022-03-30  4:35     ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-30  4:35 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Jason Gunthorpe,
	Will Deacon

On 2022/3/30 5:38, Jacob Pan wrote:
>> +static struct iommu_domain *
>> +iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
>> +{
>> +	struct bus_type *bus = dev->bus;
>> +	struct iommu_sva_cookie *cookie;
>> +	struct iommu_domain *domain;
>> +	void *curr;
>> +
>> +	if (!bus || !bus->iommu_ops)
>> +		return NULL;
>> +
>> +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
>> +	if (!cookie)
>> +		return NULL;
>> +
>> +	domain = bus->iommu_ops->domain_alloc(IOMMU_DOMAIN_SVA);
>> +	if (!domain)
>> +		goto err_domain_alloc;
>> +
>> +	cookie->mm = mm;
>> +	cookie->pasid = mm->pasid;
> How do you manage the mm life cycle? do you require caller take mm reference?
> Or this should be limited to the current mm?
> 

The existing sva_bind() interface requires the caller to take the mm
reference before calling it. So the mm is safe during sva_bind() and
sva_unbind().

drivers/iommu/iommu.c:
/**
  * iommu_sva_bind_device() - Bind a process address space to a device
  * @dev: the device
  * @mm: the mm to bind, caller must hold a reference to it
  * @drvdata: opaque data pointer to pass to bind callback
  *
  * Create a bond between device and address space, allowing the device to
  * access the mm using the returned PASID. If a bond already exists between
  * @device and @mm, it is returned and an additional reference is taken.
  * Caller must call iommu_sva_unbind_device() to release each reference.
  *
  * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first,
  * to initialize the required SVA features.
  *
  * On error, returns an ERR_PTR value.
  */
struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)

Best regards,
baolu
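[Editor's aside: the contract in that kernel-doc — the same (dev, mm) bond is returned with an extra reference, and the caller must unbind once per bind — can be modeled outside the kernel. A minimal sketch in plain C; the struct and function names here are illustrative, not the kernel's:]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative stand-in for the (device, mm) bond that
 * iommu_sva_bind_device() returns; not the kernel's struct. */
struct sva_bond {
	const void *dev;
	const void *mm;
	int refs;		/* one reference per successful bind */
};

static struct sva_bond *the_bond;	/* single slot, enough for the demo */

/* Return the existing bond with an extra reference, or create one. */
struct sva_bond *demo_sva_bind(const void *dev, const void *mm)
{
	if (the_bond && the_bond->dev == dev && the_bond->mm == mm) {
		the_bond->refs++;
		return the_bond;
	}
	the_bond = calloc(1, sizeof(*the_bond));
	if (!the_bond)
		return NULL;
	the_bond->dev = dev;
	the_bond->mm = mm;
	the_bond->refs = 1;
	return the_bond;
}

/* Drop one reference; the bond goes away on the last unbind. */
void demo_sva_unbind(struct sva_bond *bond)
{
	if (--bond->refs == 0) {
		free(bond);
		the_bond = NULL;
	}
}
```

Binding the same (dev, mm) pair twice hands back the same bond with two references, matching the "additional reference is taken" wording above.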

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-29  8:42   ` Tian, Kevin
  2022-03-29 11:42     ` Jason Gunthorpe via iommu
@ 2022-03-30  4:59     ` Lu Baolu
  2022-03-30  6:55       ` Tian, Kevin
  2022-04-01  5:49     ` Yi Liu
  2 siblings, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-03-30  4:59 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Raj, Ashok, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Pan, Jacob jun

Hi Kevin,

On 2022/3/29 16:42, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, March 29, 2022 1:38 PM
>>
>> Some of the interfaces in the IOMMU core require that only a single
>> kernel device driver controls the device in the IOMMU group. The
>> existing method is to check the device count in the IOMMU group in
>> the interfaces. This is unreliable because any device added to the
>> IOMMU group later breaks this assumption without notifying the driver
>> using the interface. This adds iommu_group_singleton_lockdown() that
>> checks the requirement and locks down the IOMMU group with only single
>> device driver bound.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/iommu.c | 30 ++++++++++++++++++------------
>>   1 file changed, 18 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 0c42ece25854..9fb8a5b4491e 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -48,6 +48,7 @@ struct iommu_group {
>>   	struct list_head entry;
>>   	unsigned int owner_cnt;
>>   	void *owner;
>> +	bool singleton_lockdown;
>>   };
>>
>>   struct group_device {
>> @@ -968,15 +969,16 @@ void iommu_group_remove_device(struct device
>> *dev)
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_group_remove_device);
>>
>> -static int iommu_group_device_count(struct iommu_group *group)
>> +/* Callers should hold the group->mutex. */
>> +static bool iommu_group_singleton_lockdown(struct iommu_group *group)
>>   {
>> -	struct group_device *entry;
>> -	int ret = 0;
>> -
>> -	list_for_each_entry(entry, &group->devices, list)
>> -		ret++;
>> +	if (group->owner_cnt != 1 ||
>> +	    group->domain != group->default_domain ||
>> +	    group->owner)
>> +		return false;
> 
> Curious why there will be a case where group uses default_domain
> while still having a owner? I have the impression that owner is used
> for userspace DMA where a different domain is used.

You are right. The default domain is automatically detached once a user
claims ownership of the group. As long as the group is claimed by a
user, it can only use an empty or user-specified domain.

> 
>> +	group->singleton_lockdown = true;
>>
>> -	return ret;
>> +	return true;
>>   }
> 
> btw I'm not sure whether this is what SVA requires. IIRC the problem with
> SVA is because PASID TLP prefix is not counted in PCI packet routing thus
> a DMA target address with PASID might be treated as P2P if the address
> falls into the MMIO BAR of other devices in the group. This is why the
> original code needs to strictly apply SVA in a group containing a single
> device, instead of a group attached by a single driver, unless we want to
> reserve those MMIO ranges in CPU VA space.

You are right. But I don't think the IOMMU core is able to guarantee
the above in a platform/device-agnostic way. Or any suggestions?

I guess this should be somewhat off-loaded to the device driver, which
knows the details of the device. The device driver should know this and
guarantee it before calling
iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA).

This patch itself just replaces the existing
"iommu_group_device_count(group) != 1" check with a new one based on
the group ownership logic. The former is obviously not friendly to
devices hot-joined afterward.

> 
> Jean can correct me if my memory is wrong.
> 
> Thanks
> Kevin

Best regards,
baolu
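[Editor's aside: the check the patch introduces — lock the group down only when exactly one owner holds it, it still uses its default domain, and no userspace owner has claimed it — can be sketched in plain C. The field names mirror the patch hunk above; everything else is illustrative:]

```c
#include <assert.h>
#include <stdbool.h>

/* Cut-down model of the fields the patch tests; not the kernel struct. */
struct demo_group {
	unsigned int owner_cnt;
	void *domain;
	void *default_domain;
	void *owner;		/* non-NULL once userspace claims the group */
	bool singleton_lockdown;
};

/* Mirrors the logic of iommu_group_singleton_lockdown() in the patch:
 * refuse unless a single owner uses the default domain, then latch. */
static bool demo_singleton_lockdown(struct demo_group *group)
{
	if (group->owner_cnt != 1 ||
	    group->domain != group->default_domain ||
	    group->owner)
		return false;
	group->singleton_lockdown = true;
	return true;
}
```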

* RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-29 11:42     ` Jason Gunthorpe via iommu
@ 2022-03-30  6:50       ` Tian, Kevin
  2022-03-30 11:57         ` Lu Baolu
  2022-03-30 11:58         ` Jason Gunthorpe via iommu
  2022-04-01  6:20       ` Yi Liu
  1 sibling, 2 replies; 62+ messages in thread
From: Tian, Kevin @ 2022-03-30  6:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 29, 2022 7:43 PM
> 
> On Tue, Mar 29, 2022 at 08:42:13AM +0000, Tian, Kevin wrote:
> 
> > btw I'm not sure whether this is what SVA requires. IIRC the problem with
> > SVA is because PASID TLP prefix is not counted in PCI packet routing thus
> > a DMA target address with PASID might be treated as P2P if the address
> > falls into the MMIO BAR of other devices in the group. This is why the
> > original code needs to strictly apply SVA in a group containing a single
> > device, instead of a group attached by a single driver, unless we want to
> > reserve those MMIO ranges in CPU VA space.
> 
> I think it is not such a good idea to mix up group with this test
> 
> Here you want to say that all TLPs from the RID route to the host
> bridge - ie ACS is on/etc. This is subtly different from a group with
> a single device. Specifically it is an immutable property of the
> fabric and doesn't change after hot plug events.

Yes, in concept your description is more accurate.

However according to pci_device_group() a group is created mainly
for lacking of ACS. Without ACS there is no guarantee that all TLPs
from the RID in a multi-devices group are routed to the host bridge.
In this case only singleton group can meet this requirement.

One thing that I'm not very sure about is DMA aliases. Even when
physically there is only a single device within the group, the aliasing
could lead to multiple RIDs in the group, making it non-singleton. But
probably we don't need to support SVA on such devices until a real
demand comes?

> 
> ie if we have a singleton group that doesn't have ACS and someone
> hotplugs in another device on a bridge, then our SVA is completely
> broken and we get data corruption.

Can we capture that in iommu_probe_device(), by detecting that the
group which the probed device would be added to has already been
locked down for SVA? i.e. make iommu_group_singleton_lockdown()
in this patch lock down the fact of a singleton group instead of
the fact of a singleton driver...

> 
> Testing the group size is inherently the wrong test to make.
> 

What is your suggestion then?

Thanks
Kevin

* RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30  4:59     ` Lu Baolu
@ 2022-03-30  6:55       ` Tian, Kevin
  0 siblings, 0 replies; 62+ messages in thread
From: Tian, Kevin @ 2022-03-30  6:55 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Raj,
	Ashok, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: iommu, Pan, Jacob jun, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Wednesday, March 30, 2022 1:00 PM
> >
> > btw I'm not sure whether this is what SVA requires. IIRC the problem with
> > SVA is because PASID TLP prefix is not counted in PCI packet routing thus
> > a DMA target address with PASID might be treated as P2P if the address
> > falls into the MMIO BAR of other devices in the group. This is why the
> > original code needs to strictly apply SVA in a group containing a single
> > device, instead of a group attached by a single driver, unless we want to
> > reserve those MMIO ranges in CPU VA space.
> 
> You are right. But I don't think the IOMMU core is able to guarantee
> above in a platform/device-agnostic way. Or any suggestions?
> 
> I guess this should be somewhat off-loaded to the device driver which
> knows details of the device. The device driver should know this and
> guarantee it before calling
> iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA).

How would the device driver know whether SVA requests from a
device might be misinterpreted as P2P by upstream ports?

> 
> This patch itself just replaces the existing
> "iommu_group_device_count(group) != 1" logic with a new one based on the
> group ownership logistics. The former is obviously not friendly to
> device hot joined afterward.
> 

IMHO this replacement changes the semantics and device hotplug is
something that we must deal with...

Thanks
Kevin

* RE: [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu
  2022-03-29  5:37 ` [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu Lu Baolu
  2022-03-29 21:00   ` Jacob Pan
@ 2022-03-30  7:05   ` Tian, Kevin
  2022-03-30 11:58     ` Lu Baolu
  1 sibling, 1 reply; 62+ messages in thread
From: Tian, Kevin @ 2022-03-30  7:05 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig, Raj,
	Ashok, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: iommu, Pan, Jacob jun, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, March 29, 2022 1:38 PM
> 
> Use this field to save the pasid/ssid bits that a device is able to
> support with its IOMMU hardware. It is a generic attribute of a device
> and lifting it into the per-device dev_iommu struct makes it possible
> to allocate a PASID for device without calls into the IOMMU drivers.
> Any iommu driver which supports PASID related features should set this
> field before features are enabled on the devices.

Miss a clarification for non-PCI SVA as discussed here:

https://lore.kernel.org/all/85d61ad6-0cf0-ac65-3312-32d0cdeb1b27@linux.intel.com/


> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  include/linux/iommu.h                       | 1 +
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 ++
>  drivers/iommu/intel/iommu.c                 | 5 ++++-
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 6ef2df258673..36f43af0af53 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -368,6 +368,7 @@ struct dev_iommu {
>  	struct iommu_fwspec		*fwspec;
>  	struct iommu_device		*iommu_dev;
>  	void				*priv;
> +	unsigned int			pasid_bits;
>  };
> 
>  int iommu_device_register(struct iommu_device *iommu,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 627a3ed5ee8f..afc63fce6107 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2681,6 +2681,8 @@ static struct iommu_device
> *arm_smmu_probe_device(struct device *dev)
>  	    smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
>  		master->stall_enabled = true;
> 
> +	dev->iommu->pasid_bits = master->ssid_bits;
> +
>  	return &smmu->iommu;
> 
>  err_free_master:
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 6f7485c44a4b..c1b91bce1530 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4587,8 +4587,11 @@ static struct iommu_device
> *intel_iommu_probe_device(struct device *dev)
>  			if (pasid_supported(iommu)) {
>  				int features = pci_pasid_features(pdev);
> 
> -				if (features >= 0)
> +				if (features >= 0) {
>  					info->pasid_supported = features | 1;
> +					dev->iommu->pasid_bits =
> +						fls(pci_max_pasids(pdev)) - 1;
> +				}
>  			}
> 
>  			if (info->ats_supported && ecap_prs(iommu->ecap)
> &&
> --
> 2.25.1
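[Editor's aside: the derivation in the Intel hunk above, pasid_bits = fls(pci_max_pasids(pdev)) - 1, assumes the PASID count is a power of two. It can be checked in plain C with a userspace stand-in for the kernel's fls() (find last set, 1-based):]

```c
#include <assert.h>

/* Userspace stand-in for the kernel's fls(): position of the most
 * significant set bit, 1-based; fls(0) == 0.  __builtin_clz is
 * gcc/clang-specific and undefined for 0, hence the guard. */
static int demo_fls(unsigned int x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}

/* pci_max_pasids() returns the number of PASIDs, e.g. 1 << 20 for a
 * 20-bit PASID space, so fls(max_pasids) - 1 recovers the bit width. */
static int demo_pasid_bits(unsigned int max_pasids)
{
	return demo_fls(max_pasids) - 1;
}
```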


* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30  6:50       ` Tian, Kevin
@ 2022-03-30 11:57         ` Lu Baolu
  2022-03-30 11:58         ` Jason Gunthorpe via iommu
  1 sibling, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-30 11:57 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

Hi Kevin,

On 2022/3/30 14:50, Tian, Kevin wrote:
>> ie if we have a singleton group that doesn't have ACS and someone
>> hotplugs in another device on a bridge, then our SVA is completely
>> broken and we get data corruption.
> Can we capture that in iommu_probe_device() when identifying
> the group which the probed device will be added to has already been
> locked down for SVA? i.e. make iommu_group_singleton_lockdown()
> in this patch to lock down the fact of singleton group instead of
> the fact of singleton driver...
> 

The iommu_probe_device() is called in the bus notifier callback. It has
no way to stop the probe of the device. Unless we could add a direct
call in the device probe path of the driver core?

Best regards,
baolu

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30  6:50       ` Tian, Kevin
  2022-03-30 11:57         ` Lu Baolu
@ 2022-03-30 11:58         ` Jason Gunthorpe via iommu
  2022-03-30 14:12           ` Tian, Kevin
                             ` (2 more replies)
  1 sibling, 3 replies; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-03-30 11:58 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

On Wed, Mar 30, 2022 at 06:50:11AM +0000, Tian, Kevin wrote:

> One thing that I'm not very sure is about DMA alias. Even when physically
> there is only a single device within the group the aliasing could lead
> to multiple RIDs in the group making it non-singleton. But probably we
> don't need support SVA on such device until a real demand comes?

How can we have multiple RIDs in the same group and have only one
device in the group?
 
> > ie if we have a singleton group that doesn't have ACS and someone
> > hotplugs in another device on a bridge, then our SVA is completely
> > broken and we get data corruption.
> 
> Can we capture that in iommu_probe_device() when identifying
> the group which the probed device will be added to has already been
> locked down for SVA? i.e. make iommu_group_singleton_lockdown()
> in this patch to lock down the fact of singleton group instead of
> the fact of singleton driver...

No, that is backwards

> > Testing the group size is inherently the wrong test to make.
> 
> What is your suggestion then?

Add a flag to the group that positively indicates the group can never
have more than one member, even after hot plug. eg because it is
impossible due to ACS, or lack of bridges, and so on.

Jason
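[Editor's aside: a minimal model of the flag Jason is proposing — computed once from immutable fabric properties when the group is created, rather than inferred from the current device count. All names here are illustrative:]

```c
#include <assert.h>
#include <stdbool.h>

/* Fabric facts known at group creation; illustrative only. */
struct demo_topology {
	bool acs_isolated;	/* all TLPs from the RID route to the host bridge */
	bool behind_bridge;	/* another device could be hot-plugged in */
};

struct demo_group {
	int ndevices;
	/* Set once at creation; never changes on hot plug events. */
	bool immutable_singleton;
};

static void demo_group_create(struct demo_group *g,
			      const struct demo_topology *topo)
{
	g->ndevices = 1;
	g->immutable_singleton = topo->acs_isolated && !topo->behind_bridge;
}

/* SVA tests the immutable flag, not the current group size. */
static bool demo_group_sva_capable(const struct demo_group *g)
{
	return g->immutable_singleton;
}
```

A group behind a bridge without ACS has size 1 today but is not immutably singleton, so the model rejects it for SVA even though a plain device count would accept it.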

* Re: [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu
  2022-03-30  7:05   ` Tian, Kevin
@ 2022-03-30 11:58     ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-03-30 11:58 UTC (permalink / raw)
  To: Tian, Kevin, Joerg Roedel, Jason Gunthorpe, Christoph Hellwig,
	Raj, Ashok, Will Deacon, Robin Murphy, Jean-Philippe Brucker
  Cc: linux-kernel, iommu, Pan, Jacob jun

On 2022/3/30 15:05, Tian, Kevin wrote:
>> From: Lu Baolu<baolu.lu@linux.intel.com>
>> Sent: Tuesday, March 29, 2022 1:38 PM
>>
>> Use this field to save the pasid/ssid bits that a device is able to
>> support with its IOMMU hardware. It is a generic attribute of a device
>> and lifting it into the per-device dev_iommu struct makes it possible
>> to allocate a PASID for device without calls into the IOMMU drivers.
>> Any iommu driver which supports PASID related features should set this
>> field before features are enabled on the devices.
> Miss a clarification for non-PCI SVA as discussed here:
> 
> https://lore.kernel.org/all/85d61ad6-0cf0-ac65-3312-32d0cdeb1b27@linux.intel.com/

Yes. Thanks for the reminding.

Best regards,
baolu

* RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30 11:58         ` Jason Gunthorpe via iommu
@ 2022-03-30 14:12           ` Tian, Kevin
  2022-03-30 14:30             ` Jason Gunthorpe via iommu
  2022-03-30 14:18           ` Tian, Kevin
  2022-04-04  5:43           ` Lu Baolu
  2 siblings, 1 reply; 62+ messages in thread
From: Tian, Kevin @ 2022-03-30 14:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig, iommu,
	Pan, Jacob jun, Jean-Philippe Brucker, Robin Murphy

> From: Jason Gunthorpe
> Sent: Wednesday, March 30, 2022 7:58 PM
> 
> On Wed, Mar 30, 2022 at 06:50:11AM +0000, Tian, Kevin wrote:
> 
> > One thing that I'm not very sure is about DMA alias. Even when physically
> > there is only a single device within the group the aliasing could lead
> > to multiple RIDs in the group making it non-singleton. But probably we
> > don't need support SVA on such device until a real demand comes?
> 
> How can we have multiple RIDs in the same group and have only one
> device in the group?

Alex may help throw some insight here. From what I read in the code,
it looks like certain devices can generate traffic with multiple RIDs.

> 
> > > ie if we have a singleton group that doesn't have ACS and someone
> > > hotplugs in another device on a bridge, then our SVA is completely
> > > broken and we get data corruption.
> >
> > Can we capture that in iommu_probe_device() when identifying
> > the group which the probed device will be added to has already been
> > locked down for SVA? i.e. make iommu_group_singleton_lockdown()
> > in this patch to lock down the fact of singleton group instead of
> > the fact of singleton driver...
> 
> No, that is backwards
> 
> > > Testing the group size is inherently the wrong test to make.
> >
> > What is your suggestion then?
> 
> Add a flag to the group that positively indicates the group can never
> have more than one member, even after hot plug. eg because it is
> impossible due to ACS, or lack of bridges, and so on.
> 

OK, I see your point. It essentially refers to a singleton group which
is immutable to hotplug.

Thanks
Kevin 

* RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30 11:58         ` Jason Gunthorpe via iommu
  2022-03-30 14:12           ` Tian, Kevin
@ 2022-03-30 14:18           ` Tian, Kevin
  2022-03-30 15:04             ` Alex Williamson
  2022-04-04  5:43           ` Lu Baolu
  2 siblings, 1 reply; 62+ messages in thread
From: Tian, Kevin @ 2022-03-30 14:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson (alex.williamson@redhat.com),
	Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig, iommu,
	Pan, Jacob jun, Jean-Philippe Brucker, Robin Murphy

+Alex

> From: Tian, Kevin
> Sent: Wednesday, March 30, 2022 10:13 PM
> 
> > From: Jason Gunthorpe
> > Sent: Wednesday, March 30, 2022 7:58 PM
> >
> > On Wed, Mar 30, 2022 at 06:50:11AM +0000, Tian, Kevin wrote:
> >
> > > One thing that I'm not very sure is about DMA alias. Even when physically
> > > there is only a single device within the group the aliasing could lead
> > > to multiple RIDs in the group making it non-singleton. But probably we
> > > don't need support SVA on such device until a real demand comes?
> >
> > How can we have multiple RIDs in the same group and have only one
> > device in the group?
> 
> Alex may help throw some insight here. Per what I read from the code
> looks like certain device can generate traffic with multiple RIDs.
> 
> >
> > > > ie if we have a singleton group that doesn't have ACS and someone
> > > > hotplugs in another device on a bridge, then our SVA is completely
> > > > broken and we get data corruption.
> > >
> > > Can we capture that in iommu_probe_device() when identifying
> > > the group which the probed device will be added to has already been
> > > locked down for SVA? i.e. make iommu_group_singleton_lockdown()
> > > in this patch to lock down the fact of singleton group instead of
> > > the fact of singleton driver...
> >
> > No, that is backwards
> >
> > > > Testing the group size is inherently the wrong test to make.
> > >
> > > What is your suggestion then?
> >
> > Add a flag to the group that positively indicates the group can never
> > have more than one member, even after hot plug. eg because it is
> > impossible due to ACS, or lack of bridges, and so on.
> >
> 
> OK, I see your point. It essentially refers to a singleton group which
> is immutable to hotplug.
> 
> Thanks
> Kevin

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30 14:12           ` Tian, Kevin
@ 2022-03-30 14:30             ` Jason Gunthorpe via iommu
  2022-04-02  7:12               ` Tian, Kevin
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-03-30 14:30 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig, iommu,
	Pan, Jacob jun, Jean-Philippe Brucker, Robin Murphy

On Wed, Mar 30, 2022 at 02:12:57PM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe
> > Sent: Wednesday, March 30, 2022 7:58 PM
> > 
> > On Wed, Mar 30, 2022 at 06:50:11AM +0000, Tian, Kevin wrote:
> > 
> > > One thing that I'm not very sure is about DMA alias. Even when physically
> > > there is only a single device within the group the aliasing could lead
> > > to multiple RIDs in the group making it non-singleton. But probably we
> > > don't need support SVA on such device until a real demand comes?
> > 
> > How can we have multiple RIDs in the same group and have only one
> > device in the group?
> 
> Alex may help throw some insight here. Per what I read from the code
> looks like certain device can generate traffic with multiple RIDs.

IIRC "dma alias" refers to things like legacy PCI to PCIe bridges:
there can still be multiple PCI IDs behind the bridge used in
configuration cycles, but the PCI to PCIe bridge will tag all PCIe
TLPs with its own RID, because classic PCI has no way for the requestor
to convey a RID to the bridge.

So, from a Linux perspective the group would have multiple struct
devices (the devices behind the bridge and the bridge itself), and the
RID the IOMMU HW matches on is only the RID of the PCI bridge.

But we know this because we know there is classic PCI stuff in the
hierarchy, so we can just mark that group as incompatible.

> > Add a flag to the group that positively indicates the group can never
> > have more than one member, even after hot plug. eg because it is
> > impossible due to ACS, or lack of bridges, and so on.
> 
> OK, I see your point. It essentially refers to a singleton group which
> is immutable to hotplug.

Yes, known at creation time, not retroactively enforced because
someone used SVA

Jason

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30 14:18           ` Tian, Kevin
@ 2022-03-30 15:04             ` Alex Williamson
  0 siblings, 0 replies; 62+ messages in thread
From: Alex Williamson @ 2022-03-30 15:04 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig, iommu,
	Pan, Jacob jun, Jason Gunthorpe, Jean-Philippe Brucker,
	Robin Murphy

On Wed, 30 Mar 2022 14:18:47 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> +Alex
> 
> > From: Tian, Kevin
> > Sent: Wednesday, March 30, 2022 10:13 PM
> >   
> > > From: Jason Gunthorpe
> > > Sent: Wednesday, March 30, 2022 7:58 PM
> > >
> > > On Wed, Mar 30, 2022 at 06:50:11AM +0000, Tian, Kevin wrote:
> > >  
> > > > One thing that I'm not very sure is about DMA alias. Even when physically
> > > > there is only a single device within the group the aliasing could lead
> > > > to multiple RIDs in the group making it non-singleton. But probably we
> > > > don't need support SVA on such device until a real demand comes?  
> > >
> > > How can we have multiple RIDs in the same group and have only one
> > > device in the group?  
> > 
> > Alex may help throw some insight here. Per what I read from the code
> > looks like certain device can generate traffic with multiple RIDs.

You only need to look so far as the dma alias quirks to find devices
that use the wrong RID for DMA.  In general I don't think we have
enough confidence to say that for all these devices the wrong RID is
exclusively used versus some combination of both RIDs.  Also, the
aliased RID is not always a physical device.  Thanks,

Alex


* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-03-29  5:37 ` [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA Lu Baolu
  2022-03-29 21:38   ` Jacob Pan
@ 2022-03-30 19:02   ` Jason Gunthorpe via iommu
  2022-04-02  8:43     ` Tian, Kevin
  1 sibling, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-03-30 19:02 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Kevin Tian, Ashok Raj, Will Deacon, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Jacob jun Pan,
	Robin Murphy

On Tue, Mar 29, 2022 at 01:37:52PM +0800, Lu Baolu wrote:
> @@ -95,6 +101,7 @@ struct iommu_domain {
>  	void *handler_token;
>  	struct iommu_domain_geometry geometry;
>  	struct iommu_dma_cookie *iova_cookie;
> +	struct iommu_sva_cookie *sva_cookie;

Cookie is still the wrong word to use here

> +struct iommu_sva_cookie {
> +	struct mm_struct *mm;
> +	ioasid_t pasid;
> +	refcount_t users;

Really surprised to see a refcount buried inside the iommu_domain..

This design seems inside out, the SVA struct should 'enclose' the domain, not
be a pointer inside it.

struct iommu_sva_domain {
       struct kref kref;
       struct mm_struct *mm;
       ioasid_t pasid;

       /* All the domains that are linked to this */
       struct xarray domain_list;
};

And then you could have a pointer to that inside the mm_struct instead
of just the naked pasid.

> +static __maybe_unused struct iommu_domain *

Why maybe unused?

> +iommu_sva_get_domain(struct device *dev, struct mm_struct *mm)
> +{
> +	struct iommu_domain *domain;
> +	ioasid_t pasid = mm->pasid;
> +
> +	if (pasid == INVALID_IOASID)
> +		return NULL;
> +
> +	domain = xa_load(&sva_domain_array, pasid);
> +	if (!domain)
> +		return iommu_sva_alloc_domain(dev, mm);
> +	iommu_sva_domain_get_user(domain);

This assumes any domain is interchangeable with any device, which is
not the iommu model. We need a domain op to check if a device is
compatible with the domain for vfio and iommufd; this should do the
same.

It means each mm can have a list of domains associated with it and a
new domain is auto-created if the device doesn't work with any of the
existing domains.

Jason
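[Editor's aside: the "enclose, don't point" layout Jason sketches above is the usual container_of pattern — the SVA state embeds the domain, and code holding the inner domain pointer recovers the outer struct. A userspace sketch with stub types; the real kernel structs and names differ:]

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stub stand-ins for the kernel types. */
struct demo_domain { int type; };
struct demo_mm { int pasid; };

/* The suggested shape: the SVA state encloses the domain
 * instead of the domain carrying a pointer to SVA state. */
struct demo_sva_domain {
	struct demo_domain domain;	/* embedded, not pointed to */
	int kref;
	struct demo_mm *mm;
};

/* Recover the enclosing SVA state from the embedded domain. */
static struct demo_sva_domain *to_sva_domain(struct demo_domain *dom)
{
	return demo_container_of(dom, struct demo_sva_domain, domain);
}
```

With this layout, the domain's lifetime is tied to the SVA state's refcount, and no back-pointer field inside the generic domain is needed.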

* Re: [PATCH RFC v2 04/11] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-29  5:37 ` [PATCH RFC v2 04/11] iommu: Add attach/detach_dev_pasid domain ops Lu Baolu
@ 2022-03-30 19:08   ` Jason Gunthorpe via iommu
  2022-04-04  6:47     ` Lu Baolu
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-03-30 19:08 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Kevin Tian, Ashok Raj, Will Deacon, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Jacob jun Pan,
	Robin Murphy

On Tue, Mar 29, 2022 at 01:37:53PM +0800, Lu Baolu wrote:
> Attaching an IOMMU domain to a PASID of a device is a generic operation
> for modern IOMMU drivers which support PASID-granular DMA address
> translation. Currently visible usage scenarios include (but are not
> limited to):
> 
>  - SVA (Shared Virtual Address)
>  - kernel DMA with PASID
>  - hardware-assist mediated device
> 
> This adds a pair of common domain ops for this purpose and adds some
> common helpers to attach/detach a domain to/from a {device, PASID} and
> get/put the domain attached to {device, PASID}.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>  include/linux/iommu.h | 36 ++++++++++++++++++
>  drivers/iommu/iommu.c | 88 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 124 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 29c4c2edd706..a46285488a57 100644
> +++ b/include/linux/iommu.h
> @@ -269,6 +269,8 @@ struct iommu_ops {
>   * struct iommu_domain_ops - domain specific operations
>   * @attach_dev: attach an iommu domain to a device
>   * @detach_dev: detach an iommu domain from a device
> + * @attach_dev_pasid: attach an iommu domain to a pasid of device
> + * @detach_dev_pasid: detach an iommu domain from a pasid of device
>   * @map: map a physically contiguous memory region to an iommu domain
>   * @map_pages: map a physically contiguous set of pages of the same size to
>   *             an iommu domain.
> @@ -286,6 +288,10 @@ struct iommu_ops {
>  struct iommu_domain_ops {
>  	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
>  	void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
> +	int (*attach_dev_pasid)(struct iommu_domain *domain,
> +				struct device *dev, ioasid_t id);
> +	void (*detach_dev_pasid)(struct iommu_domain *domain,
> +				 struct device *dev, ioasid_t id);

ID should be pasid for consistency

> +int iommu_attach_device_pasid(struct iommu_domain *domain,
> +			      struct device *dev, ioasid_t pasid)
> +{
> +	struct iommu_group *group;
> +	int ret = -EINVAL;
> +	void *curr;
> +
> +	if (!domain->ops->attach_dev_pasid)
> +		return -EINVAL;
> +
> +	group = iommu_group_get(dev);
> +	if (!group)
> +		return -ENODEV;
> +
> +	mutex_lock(&group->mutex);
> +	/*
> +	 * To keep things simple, we currently don't support IOMMU groups
> +	 * with more than one device. Existing SVA-capable systems are not
> +	 * affected by the problems that required IOMMU groups (lack of ACS
> +	 * isolation, device ID aliasing and other hardware issues).
> +	 */
> +	if (!iommu_group_singleton_lockdown(group))
> +		goto out_unlock;
> +
> +	xa_lock(&group->pasid_array);
> +	curr = __xa_cmpxchg(&group->pasid_array, pasid, NULL,
> +			    domain, GFP_KERNEL);
> +	xa_unlock(&group->pasid_array);

Why the xa_lock/unlock? Just call the normal xa_cmpxchg?
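For illustration, the simplification being suggested might read roughly like this. This is only a sketch against the quoted patch, not a compilable fragment on its own:

```c
/*
 * Sketch: xa_cmpxchg() takes the xarray lock internally, so the explicit
 * xa_lock()/__xa_cmpxchg()/xa_unlock() sequence collapses to one call.
 */
curr = xa_cmpxchg(&group->pasid_array, pasid, NULL, domain, GFP_KERNEL);
if (curr) {
	ret = xa_err(curr) ? : -EBUSY;
	goto out_unlock;
}
```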

> +void iommu_detach_device_pasid(struct iommu_domain *domain,
> +			       struct device *dev, ioasid_t pasid)
> +{
> +	struct iommu_group *group;
> +
> +	group = iommu_group_get(dev);
> +	if (WARN_ON(!group))
> +		return;

This group_get stuff really needs some cleaning, this makes no sense
at all.

If the kref to group can go to zero within this function then the
initial access of the kref is already buggy:

	if (group)
		kobject_get(group->devices_kobj);

Because it will crash or WARN_ON.

We don't hit this because it is required that a group cannot be
destroyed while a struct device has a driver bound, and all these
paths are driver bound paths.

So none of these group_get/put patterns have any purpose at all, and
now we are adding impossible WARN_ONs to..

> +struct iommu_domain *
> +iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
> +{
> +	struct iommu_domain *domain;
> +	struct iommu_group *group;
> +
> +	group = iommu_group_get(dev);
> +	if (!group)
> +		return NULL;

And now we are doing useless things on a performance path!

> +	mutex_lock(&group->mutex);
> +	domain = xa_load(&group->pasid_array, pasid);
> +	if (domain && domain->type == IOMMU_DOMAIN_SVA)
> +		iommu_sva_domain_get_user(domain);
> +	mutex_unlock(&group->mutex);
> +	iommu_group_put(group);

Why do we need so much locking on a performance path? RCU out of the
xarray..
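As a sketch of the direction hinted at here (the tryget helper is hypothetical; the posted patch does not provide an RCU-safe refcount):

```c
/*
 * Sketch only: xarray lookups are safe under rcu_read_lock(), so the
 * fast path could avoid group->mutex entirely, provided the domain
 * reference can be taken with a refcount_inc_not_zero()-style helper.
 */
rcu_read_lock();
domain = xa_load(&group->pasid_array, pasid);
if (domain && !iommu_sva_domain_tryget(domain))	/* hypothetical helper */
	domain = NULL;
rcu_read_unlock();
```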

Not sure I see how this get_user refcounting can work ?

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

* Re: [PATCH RFC v2 06/11] iommu/vt-d: Add SVA domain support
  2022-03-29  5:37 ` [PATCH RFC v2 06/11] iommu/vt-d: Add SVA domain support Lu Baolu
@ 2022-03-30 19:09   ` Jason Gunthorpe via iommu
  2022-04-04  6:52     ` Lu Baolu
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-03-30 19:09 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Kevin Tian, Ashok Raj, Will Deacon, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Jacob jun Pan,
	Robin Murphy

On Tue, Mar 29, 2022 at 01:37:55PM +0800, Lu Baolu wrote:
> Add support for SVA domain allocation and provide an SVA-specific
> iommu_domain_ops.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>  include/linux/intel-iommu.h |  1 +
>  drivers/iommu/intel/iommu.c | 10 ++++++++++
>  drivers/iommu/intel/svm.c   | 37 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 48 insertions(+)
> 
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d00..c14283137fb5 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -744,6 +744,7 @@ void intel_svm_unbind(struct iommu_sva *handle);
>  u32 intel_svm_get_pasid(struct iommu_sva *handle);
>  int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
>  			    struct iommu_page_response *msg);
> +extern const struct iommu_domain_ops intel_svm_domain_ops;
>  
>  struct intel_svm_dev {
>  	struct list_head list;
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index c1b91bce1530..5eae7cf9bee5 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4318,6 +4318,16 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>  		return domain;
>  	case IOMMU_DOMAIN_IDENTITY:
>  		return &si_domain->domain;
> +#ifdef CONFIG_INTEL_IOMMU_SVM
> +	case IOMMU_DOMAIN_SVA:
> +		dmar_domain = alloc_domain(type);
> +		if (!dmar_domain)
> +			return NULL;
> +		domain = &dmar_domain->domain;
> +		domain->ops = &intel_svm_domain_ops;
> +
> +		return domain;
> +#endif /* CONFIG_INTEL_IOMMU_SVM */

If this is the usual pattern for drivers I would prefer to see an
alloc_sva op instead of more and more types.

Multiplexing functions is often not a great idea...
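For illustration, such a dedicated op might be shaped like this (a hypothetical member, not an existing kernel interface):

```c
/*
 * Hypothetical sketch: a dedicated allocation op in struct iommu_ops,
 * called by the core instead of domain_alloc(IOMMU_DOMAIN_SVA), so the
 * type multiplexing in intel_iommu_domain_alloc() goes away.
 */
struct iommu_domain *(*domain_alloc_sva)(struct device *dev,
					 struct mm_struct *mm);
```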

Jason

* Re: [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces
  2022-03-29  5:37 ` [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces Lu Baolu
@ 2022-03-31 20:59   ` Jacob Pan
  2022-03-31 22:26     ` Jason Gunthorpe via iommu
  0 siblings, 1 reply; 62+ messages in thread
From: Jacob Pan @ 2022-03-31 20:59 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, jacob.jun.pan,
	Jason Gunthorpe, Will Deacon

Hi Lu,

On Tue, 29 Mar 2022 13:37:57 +0800, Lu Baolu <baolu.lu@linux.intel.com>
wrote:

> The existing iommu SVA interfaces are implemented by calling the SVA
> specific iommu ops provided by the IOMMU drivers. There's no need for
> any SVA specific ops in iommu_ops vector anymore as we can achieve
> this through the generic attach/detach_dev_pasid domain ops.
> 
> This refactors the IOMMU SVA interfaces implementation by using the
> attach/detach_pasid_dev ops and aligns them with the concept of the
> iommu domain. Put the new SVA code in the sva related file in order
> to make it self-contained.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  include/linux/iommu.h         |  51 +++++++++-------
>  drivers/iommu/iommu-sva-lib.c | 110 +++++++++++++++++++++++++++++++++-
>  drivers/iommu/iommu.c         |  92 ----------------------------
>  3 files changed, 138 insertions(+), 115 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index a46285488a57..11c4d99e122d 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -629,7 +629,12 @@ struct iommu_fwspec {
>   * struct iommu_sva - handle to a device-mm bond
>   */
>  struct iommu_sva {
> -	struct device			*dev;
> +	struct device		*dev;
> +	ioasid_t		pasid;
> +	struct iommu_domain	*domain;
> +	/* Link to sva domain's bonds list */
> +	struct list_head	node;
> +	refcount_t		users;
>  };
>  
>  int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
> @@ -672,12 +677,6 @@ int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features f);
>  int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
>  bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features f);
> -struct iommu_sva *iommu_sva_bind_device(struct device *dev,
> -					struct mm_struct *mm,
> -					void *drvdata);
> -void iommu_sva_unbind_device(struct iommu_sva *handle);
> -u32 iommu_sva_get_pasid(struct iommu_sva *handle);
> -
>  int iommu_device_use_default_domain(struct device *dev);
>  void iommu_device_unuse_default_domain(struct device *dev);
>  
> @@ -1018,21 +1017,6 @@ iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
>  	return -ENODEV;
>  }
>  
> -static inline struct iommu_sva *
> -iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
> -{
> -	return NULL;
> -}
> -
> -static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
> -{
> -}
> -
> -static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> -{
> -	return IOMMU_PASID_INVALID;
> -}
> -
>  static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
>  {
>  	return NULL;
> @@ -1085,6 +1069,29 @@ iommu_put_domain_for_dev_pasid(struct iommu_domain *domain)
>  }
>  #endif /* CONFIG_IOMMU_API */
>  
> +#ifdef CONFIG_IOMMU_SVA
> +struct iommu_sva *iommu_sva_bind_device(struct device *dev,
> +					struct mm_struct *mm,
> +					void *drvdata);
> +void iommu_sva_unbind_device(struct iommu_sva *handle);
> +u32 iommu_sva_get_pasid(struct iommu_sva *handle);
> +#else /* CONFIG_IOMMU_SVA */
> +static inline struct iommu_sva *
> +iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
> +{
> +	return NULL;
> +}
> +
> +static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
> +{
> +}
> +
> +static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> +{
> +	return IOMMU_PASID_INVALID;
> +}
> +#endif /* CONFIG_IOMMU_SVA */
> +
>  /**
>   * iommu_map_sgtable - Map the given buffer to the IOMMU domain
>   * @domain:	The IOMMU domain to perform the mapping
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index 78820be23f15..1b45b7d01836 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -17,6 +17,7 @@ struct iommu_sva_cookie {
>  	struct mm_struct *mm;
>  	ioasid_t pasid;
>  	refcount_t users;
> +	struct list_head bonds;
>  };
>  
>  /**
> @@ -101,6 +102,7 @@ iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
>  	cookie->mm = mm;
>  	cookie->pasid = mm->pasid;
>  	refcount_set(&cookie->users, 1);
> +	INIT_LIST_HEAD(&cookie->bonds);
>  	domain->type = IOMMU_DOMAIN_SVA;
>  	domain->sva_cookie = cookie;
>  	curr = xa_store(&sva_domain_array, mm->pasid, domain, GFP_KERNEL);
> @@ -118,6 +120,7 @@ iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
>  static void iommu_sva_free_domain(struct iommu_domain *domain)
>  {
>  	xa_erase(&sva_domain_array, domain->sva_cookie->pasid);
> +	WARN_ON(!list_empty(&domain->sva_cookie->bonds));
>  	kfree(domain->sva_cookie);
>  	domain->ops->free(domain);
>  }
> @@ -137,7 +140,7 @@ void iommu_sva_domain_put_user(struct iommu_domain *domain)
>  		iommu_sva_free_domain(domain);
>  }
>  
> -static __maybe_unused struct iommu_domain *
> +static struct iommu_domain *
>  iommu_sva_get_domain(struct device *dev, struct mm_struct *mm)
>  {
>  	struct iommu_domain *domain;
> @@ -158,3 +161,108 @@ struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain)
>  {
>  	return domain->sva_cookie->mm;
>  }
> +
> +/**
> + * iommu_sva_bind_device() - Bind a process address space to a device
> + * @dev: the device
> + * @mm: the mm to bind, caller must hold a reference to it
> + * @drvdata: opaque data pointer to pass to bind callback
> + *
> + * Create a bond between device and address space, allowing the device to access
> + * the mm using the returned PASID. If a bond already exists between @device and
> + * @mm, it is returned and an additional reference is taken. Caller must call
> + * iommu_sva_unbind_device() to release each reference.
> + *
> + * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
> + * initialize the required SVA features.
> + *
> + * On error, returns an ERR_PTR value.
> + */
> +struct iommu_sva *
> +iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
> +{
> +	int ret = -EINVAL;
> +	struct iommu_sva *handle;
> +	struct iommu_domain *domain;
> +
> +	ret = iommu_sva_alloc_pasid(mm, 1, (1U << dev->iommu->pasid_bits) - 1);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	mutex_lock(&iommu_sva_lock);
> +	domain = iommu_sva_get_domain(dev, mm);
> +	if (!domain) {
> +		ret = -ENOMEM;
> +		goto out_unlock;
> +	}
> +
> +	/* Search for an existing bond. */
> +	list_for_each_entry(handle, &domain->sva_cookie->bonds, node) {
> +		if (handle->dev == dev && handle->pasid == mm->pasid) {
> +			refcount_inc(&handle->users);
> +			mutex_unlock(&iommu_sva_lock);
> +
> +			return handle;
> +		}
> +	}
> +
> +	handle = kzalloc(sizeof(*handle), GFP_KERNEL);
> +	if (!handle) {
> +		ret = -ENOMEM;
> +		goto out_put_domain;
> +	}
> +
> +	ret = iommu_attach_device_pasid(domain, dev, mm->pasid);
> +	if (ret)
> +		goto out_free_handle;
> +
> +	handle->dev = dev;
> +	handle->domain = domain;
> +	handle->pasid = mm->pasid;
why do we need to store pasid here? Conceptually, pasid is per sva domain
not per bind. You can get it from handle->domain->sva_cookie.

> +	refcount_set(&handle->users, 1);
> +	list_add_tail(&handle->node, &domain->sva_cookie->bonds);
> +
> +	mutex_unlock(&iommu_sva_lock);
> +	return handle;
> +
> +out_free_handle:
> +	kfree(handle);
> +out_put_domain:
> +	iommu_sva_domain_put_user(domain);
> +out_unlock:
> +	mutex_unlock(&iommu_sva_lock);
> +	return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
> +
> +/**
> + * iommu_sva_unbind_device() - Remove a bond created with iommu_sva_bind_device
> + * @handle: the handle returned by iommu_sva_bind_device()
> + *
> + * Put reference to a bond between device and address space. The device should
> + * not be issuing any more transaction for this PASID. All outstanding page
> + * requests for this PASID must have been flushed to the IOMMU.
> + */
> +void iommu_sva_unbind_device(struct iommu_sva *handle)
> +{
> +	struct device *dev = handle->dev;
> +	struct iommu_domain *domain = handle->domain;
> +	struct mm_struct *mm = iommu_sva_domain_mm(domain);
> +
> +	mutex_lock(&iommu_sva_lock);
> +	if (refcount_dec_and_test(&handle->users)) {
> +		list_del(&handle->node);
> +		iommu_detach_device_pasid(domain, dev, mm->pasid);
> +		kfree(handle);
> +	}
> +
> +	iommu_sva_domain_put_user(domain);
> +	mutex_unlock(&iommu_sva_lock);
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
> +
> +u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> +{
> +	return handle->pasid;
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 8163ad7f6902..6b51ead9d63b 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2712,98 +2712,6 @@ bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features feat)
>  }
>  EXPORT_SYMBOL_GPL(iommu_dev_feature_enabled);
>  
> -/**
> - * iommu_sva_bind_device() - Bind a process address space to a device
> - * @dev: the device
> - * @mm: the mm to bind, caller must hold a reference to it
> - * @drvdata: opaque data pointer to pass to bind callback
> - *
> - * Create a bond between device and address space, allowing the device to access
> - * the mm using the returned PASID. If a bond already exists between @device and
> - * @mm, it is returned and an additional reference is taken. Caller must call
> - * iommu_sva_unbind_device() to release each reference.
> - *
> - * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
> - * initialize the required SVA features.
> - *
> - * On error, returns an ERR_PTR value.
> - */
> -struct iommu_sva *
> -iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
> -{
> -	struct iommu_group *group;
> -	struct iommu_sva *handle = ERR_PTR(-EINVAL);
> -	const struct iommu_ops *ops = dev_iommu_ops(dev);
> -
> -	if (!ops->sva_bind)
> -		return ERR_PTR(-ENODEV);
> -
> -	group = iommu_group_get(dev);
> -	if (!group)
> -		return ERR_PTR(-ENODEV);
> -
> -	/* Ensure device count and domain don't change while we're binding */
> -	mutex_lock(&group->mutex);
> -
> -	/*
> -	 * To keep things simple, SVA currently doesn't support IOMMU groups
> -	 * with more than one device. Existing SVA-capable systems are not
> -	 * affected by the problems that required IOMMU groups (lack of ACS
> -	 * isolation, device ID aliasing and other hardware issues).
> -	 */
> -	if (!iommu_group_singleton_lockdown(group))
> -		goto out_unlock;
> -
> -	handle = ops->sva_bind(dev, mm, drvdata);
> -
> -out_unlock:
> -	mutex_unlock(&group->mutex);
> -	iommu_group_put(group);
> -
> -	return handle;
> -}
> -EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
> -
> -/**
> - * iommu_sva_unbind_device() - Remove a bond created with iommu_sva_bind_device
> - * @handle: the handle returned by iommu_sva_bind_device()
> - *
> - * Put reference to a bond between device and address space. The device should
> - * not be issuing any more transaction for this PASID. All outstanding page
> - * requests for this PASID must have been flushed to the IOMMU.
> - */
> -void iommu_sva_unbind_device(struct iommu_sva *handle)
> -{
> -	struct iommu_group *group;
> -	struct device *dev = handle->dev;
> -	const struct iommu_ops *ops = dev_iommu_ops(dev);
> -
> -	if (!ops->sva_unbind)
> -		return;
> -
> -	group = iommu_group_get(dev);
> -	if (!group)
> -		return;
> -
> -	mutex_lock(&group->mutex);
> -	ops->sva_unbind(handle);
> -	mutex_unlock(&group->mutex);
> -
> -	iommu_group_put(group);
> -}
> -EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
> -
> -u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> -{
> -	const struct iommu_ops *ops = dev_iommu_ops(handle->dev);
> -
> -	if (!ops->sva_get_pasid)
> -		return IOMMU_PASID_INVALID;
> -
> -	return ops->sva_get_pasid(handle);
> -}
> -EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> -
>  /*
>   * Changes the default domain of an iommu group that has *only* one device
>   *


Thanks,

Jacob

* Re: [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces
  2022-03-31 20:59   ` Jacob Pan
@ 2022-03-31 22:26     ` Jason Gunthorpe via iommu
  2022-04-04  5:55       ` Lu Baolu
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-03-31 22:26 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Will Deacon

On Thu, Mar 31, 2022 at 01:59:22PM -0700, Jacob Pan wrote:

> > +	handle->dev = dev;
> > +	handle->domain = domain;
> > +	handle->pasid = mm->pasid;

> why do we need to store pasid here? Conceptually, pasid is per sva domain
> not per bind. You can get it from handle->domain->sva_cookie.

That is a mistake - SVA needs to follow the general PASID design - the
domain does not encode the PASID, the PASID comes from the device
attachment only.

Jason

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-29  8:42   ` Tian, Kevin
  2022-03-29 11:42     ` Jason Gunthorpe via iommu
  2022-03-30  4:59     ` Lu Baolu
@ 2022-04-01  5:49     ` Yi Liu
  2 siblings, 0 replies; 62+ messages in thread
From: Yi Liu @ 2022-04-01  5:49 UTC (permalink / raw)
  To: Tian, Kevin, Lu Baolu, Joerg Roedel, Jason Gunthorpe,
	Christoph Hellwig, Raj, Ashok, Will Deacon, Robin Murphy,
	Jean-Philippe Brucker
  Cc: iommu, Pan, Jacob jun, linux-kernel


On 2022/3/29 16:42, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, March 29, 2022 1:38 PM
>>
>> Some of the interfaces in the IOMMU core require that only a single
>> kernel device driver controls the device in the IOMMU group. The
>> existing method is to check the device count in the IOMMU group in
>> the interfaces. This is unreliable because any device added to the
>> IOMMU group later breaks this assumption without notifying the driver
>> using the interface. This adds iommu_group_singleton_lockdown() that
>> checks the requirement and locks down the IOMMU group with only single
>> device driver bound.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/iommu.c | 30 ++++++++++++++++++------------
>>   1 file changed, 18 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 0c42ece25854..9fb8a5b4491e 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -48,6 +48,7 @@ struct iommu_group {
>>   	struct list_head entry;
>>   	unsigned int owner_cnt;
>>   	void *owner;
>> +	bool singleton_lockdown;
>>   };
>>
>>   struct group_device {
>> @@ -968,15 +969,16 @@ void iommu_group_remove_device(struct device *dev)
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_group_remove_device);
>>
>> -static int iommu_group_device_count(struct iommu_group *group)
>> +/* Callers should hold the group->mutex. */
>> +static bool iommu_group_singleton_lockdown(struct iommu_group *group)
>>   {
>> -	struct group_device *entry;
>> -	int ret = 0;
>> -
>> -	list_for_each_entry(entry, &group->devices, list)
>> -		ret++;
>> +	if (group->owner_cnt != 1 ||
>> +	    group->domain != group->default_domain ||
>> +	    group->owner)
>> +		return false;
> 
> Curious why there will be a case where group uses default_domain
> while still having an owner? I have the impression that owner is used
> for userspace DMA where a different domain is used.
> 
>> +	group->singleton_lockdown = true;
>>
>> -	return ret;
>> +	return true;
>>   }
> 
> btw I'm not sure whether this is what SVA requires. IIRC the problem with
> SVA is because PASID TLP prefix is not counted in PCI packet routing thus
> a DMA target address with PASID might be treated as P2P if the address
> falls into the MMIO BAR of other devices in the group. This is why the
> original code needs to strictly apply SVA in a group containing a single
> device, instead of a group attached by a single driver, unless we want to
> reserve those MMIO ranges in CPU VA space.
> 
> Jean can correct me if my memory is wrong.

I think so. I remember the major gap is that PASID info is not used in PCI's
address-based TLP routing, so P2P may happen if the address happens to fall
into another device's MMIO. Per the commit message from Jean below, even
singleton groups are not easy to handle, so non-singleton groups have not
been considered so far.

"
IOMMU groups with more than one device aren't supported for SVA at the
moment. There may be P2P traffic between devices within a group, which
cannot be seen by an IOMMU (note that supporting PASID doesn't add any
form of isolation with regard to P2P). Supporting groups would require
calling bind() for all bound processes every time a device is added to a
group, to perform sanity checks (e.g. ensure that new devices support
PASIDs at least as big as those already allocated in the group). It also
means making sure that reserved ranges (IOMMU_RESV_*) of all devices are
carved out of processes. This is already tricky with single devices, but
becomes very difficult with groups. Since SVA-capable devices are expected
to be cleanly isolated, and since we don't have any way to test groups or
hot-plug, we only allow singular groups for now.
"

https://lore.kernel.org/kvm/20180511190641.23008-3-jean-philippe.brucker@arm.com/

-- 
Regards,
Yi Liu

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-29 11:42     ` Jason Gunthorpe via iommu
  2022-03-30  6:50       ` Tian, Kevin
@ 2022-04-01  6:20       ` Yi Liu
  2022-04-01 11:52         ` Jason Gunthorpe via iommu
  1 sibling, 1 reply; 62+ messages in thread
From: Yi Liu @ 2022-04-01  6:20 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy



On 2022/3/29 19:42, Jason Gunthorpe wrote:
> On Tue, Mar 29, 2022 at 08:42:13AM +0000, Tian, Kevin wrote:
> 
>> btw I'm not sure whether this is what SVA requires. IIRC the problem with
>> SVA is because PASID TLP prefix is not counted in PCI packet routing thus
>> a DMA target address with PASID might be treated as P2P if the address
>> falls into the MMIO BAR of other devices in the group. This is why the
>> original code needs to strictly apply SVA in a group containing a single
>> device, instead of a group attached by a single driver, unless we want to
>> reserve those MMIO ranges in CPU VA space.
> 
> I think it is not such a good idea to mix up group with this test
> 
> Here you want to say that all TLPs from the RID route to the host
> bridge - ie ACS is on/etc. This is subtly different from a group with
> a single device. Specifically it is an immutable property of the
> fabric and doesn't change after hot plug events.

So the group size can be immutable for a specific topology, right? I think a
non-multi-function device plugged behind a PCIe bridge with ACS enabled should
have its own group, and under such a topology the group size stays 1
constantly. We may just enable SVA for such devices.

> ie if we have a singleton group that doesn't have ACS and someone
> hotplugs in another device on a bridge, then our SVA is completely
> broken and we get data corruption.

I think this may be the case of a device plugged behind a PCIe-to-PCI bridge,
where another device is later hotplugged to the same bridge, so the group size
is variable, right? Per my understanding, such a bridge may not support the
PASID TLP prefix at all, hence no SVA support for such devices.

-- 
Regards,
Yi Liu

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-01  6:20       ` Yi Liu
@ 2022-04-01 11:52         ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-01 11:52 UTC (permalink / raw)
  To: Yi Liu
  Cc: Tian, Kevin, Raj, Ashok, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Pan, Jacob jun,
	Will Deacon

On Fri, Apr 01, 2022 at 02:20:23PM +0800, Yi Liu wrote:
> 
> 
> On 2022/3/29 19:42, Jason Gunthorpe wrote:
> > On Tue, Mar 29, 2022 at 08:42:13AM +0000, Tian, Kevin wrote:
> > 
> > > btw I'm not sure whether this is what SVA requires. IIRC the problem with
> > > SVA is because PASID TLP prefix is not counted in PCI packet routing thus
> > > a DMA target address with PASID might be treated as P2P if the address
> > > falls into the MMIO BAR of other devices in the group. This is why the
> > > original code needs to strictly apply SVA in a group containing a single
> > > device, instead of a group attached by a single driver, unless we want to
> > > reserve those MMIO ranges in CPU VA space.
> > 
> > I think it is not such a good idea to mix up group with this test
> > 
> > Here you want to say that all TLPs from the RID route to the host
> > bridge - ie ACS is on/etc. This is subtly different from a group with
> > a single device. Specifically it is an immutable property of the
> > fabric and doesn't change after hot plug events.
> 
> So the group size can be immutable for a specific topology, right? I think a
> non-multi-function device plugged behind a PCIe bridge with ACS enabled
> should have its own group, and under such a topology the group size stays 1
> constantly. We may just enable SVA for such devices.

Like I said, you should stop thinking about group size.

You need to know that 100% of TLPs translate through the IOMMU to
enable SVA, nothing less will do, and that property has nothing to do
with group size.

> > ie if we have a singleton group that doesn't have ACS and someone
> > hotplugs in another device on a bridge, then our SVA is completely
> > broken and we get data corruption.
> 
> I think this may be the case of a device plugged behind a PCIe-to-PCI
> bridge, where another device is later hotplugged to the same bridge, so the
> group size is variable, right? Per my understanding, such a bridge may not
> support the PASID TLP prefix at all, hence no SVA support for such devices.

Any PCIe-to-PCIe bridge will do; we don't need to involve legacy PCI here
to have hotplug problems.

Jason

* RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30 14:30             ` Jason Gunthorpe via iommu
@ 2022-04-02  7:12               ` Tian, Kevin
  2022-04-02 23:29                 ` Jason Gunthorpe via iommu
  2022-04-06 10:02                 ` Lu Baolu
  0 siblings, 2 replies; 62+ messages in thread
From: Tian, Kevin @ 2022-04-02  7:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson (alex.williamson@redhat.com),
	Raj, Ashok, Jean-Philippe Brucker, linux-kernel,
	Christoph Hellwig, iommu, Pan, Jacob jun, Will Deacon,
	Robin Murphy

> From: Jason Gunthorpe
> Sent: Wednesday, March 30, 2022 10:30 PM
> 
> On Wed, Mar 30, 2022 at 02:12:57PM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe
> > > Sent: Wednesday, March 30, 2022 7:58 PM
> > >
> > > On Wed, Mar 30, 2022 at 06:50:11AM +0000, Tian, Kevin wrote:
> > >
> > > > One thing that I'm not very sure is about DMA alias. Even when
> physically
> > > > there is only a single device within the group the aliasing could lead
> > > > to multiple RIDs in the group making it non-singleton. But probably we
> > > > don't need support SVA on such device until a real demand comes?
> > >
> > > How can we have multiple RIDs in the same group and have only one
> > > device in the group?
> >
> > Alex may help throw some insight here. Per what I read from the code
> > looks like certain device can generate traffic with multiple RIDs.
> 
> IIRC "dma alias" refers to things like legacy PCI to PCIe bridges that
> do still have multiple PCI IDs behind the bridge used in configuration
> cycles; however, the PCI-to-PCIe bridge will tag all PCIe TLPs with its
> own RID, because classic PCI has no way for the requestor to convey a
> RID to the bridge.

That is one scenario of DMA aliasing. Another, as Alex replied, is where one
device has an alias requester ID due to PCI quirks. The alias RID may or may
not map to a real device, but regarding P2P what we really care about here
are the struct devices listed in the group.

> 
> So, from a Linux perspective the group should have multiple
> struct devices behind the bridge, the bridge itself, and the RID the
> IOMMU HW matches on is only the RID of the PCI bridge.
> 
> But we know this because we know there is classic PCI stuff in the
> hierarchy, so we can just mark that group as incompatible.

Yes.

> 
> > > Add a flag to the group that positively indicates the group can never
> > > have more than one member, even after hot plug. eg because it is
> > > impossible due to ACS, or lack of bridges, and so on.
> >
> > OK, I see your point. It essentially refers to a singleton group which
> > is immutable to hotplug.
> 
> Yes, known at creation time, not retroactively enforced because
> someone used SVA
> 

We may check the following conditions to set the immutable flag when
a new group is created for a device in pci_device_group():

1) ACS is enabled in the upstream path of the device;
2) the device is single function or ACS is enabled on a multi-function device;
3) the device type is PCI_EXP_TYPE_ENDPOINT (thus no hotplug);
4) no 'dma aliasing' on this device;

The last one is a bit conservative, as it also precludes a device that aliases
DMA due to quirks from being treated as a singleton group. But doing so saves
the effort of trying to separate the different aliasing scenarios as defined
in pci_for_each_dma_alias(). Probably we can go this way as a first step.

Once the flag is set on a group, no other event can change it. If a newly
identified device hits an existing singleton group in pci_device_group(),
then it's a bug.
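For illustration, the four conditions above can be folded into one predicate
evaluated at group-creation time. This is a userspace sketch of the logic
only, not kernel code; the struct and field names are hypothetical stand-ins
for the ACS and alias queries pci_device_group() would actually perform:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the PCI facts queried at group creation. */
struct pci_dev_info {
	bool acs_enabled_upstream;	/* 1) ACS on the whole upstream path */
	bool multifunction;
	bool acs_enabled_on_dev;	/* 2) device-level ACS if multi-function */
	bool is_pcie_endpoint;		/* 3) PCI_EXP_TYPE_ENDPOINT, no hotplug */
	bool has_dma_alias;		/* 4) any DMA aliasing, quirks included */
};

/* True if the newly created group may be marked immutably singleton. */
static bool group_is_immutable_singleton(const struct pci_dev_info *info)
{
	if (!info->acs_enabled_upstream)
		return false;
	if (info->multifunction && !info->acs_enabled_on_dev)
		return false;
	if (!info->is_pcie_endpoint)
		return false;
	if (info->has_dma_alias)	/* conservative: alias quirks excluded */
		return false;
	return true;
}
```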

Thanks
Kevin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 62+ messages in thread

* RE: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-03-30 19:02   ` Jason Gunthorpe via iommu
@ 2022-04-02  8:43     ` Tian, Kevin
  2022-04-02 23:32       ` Jason Gunthorpe via iommu
  0 siblings, 1 reply; 62+ messages in thread
From: Tian, Kevin @ 2022-04-02  8:43 UTC (permalink / raw)
  To: Jason Gunthorpe, Lu Baolu
  Cc: Raj, Ashok, Will Deacon, Jean-Philippe Brucker, linux-kernel,
	Christoph Hellwig, iommu, Pan, Jacob jun, Robin Murphy

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, March 31, 2022 3:02 AM
> 
> On Tue, Mar 29, 2022 at 01:37:52PM +0800, Lu Baolu wrote:
> > @@ -95,6 +101,7 @@ struct iommu_domain {
> >  	void *handler_token;
> >  	struct iommu_domain_geometry geometry;
> >  	struct iommu_dma_cookie *iova_cookie;
> > +	struct iommu_sva_cookie *sva_cookie;
> 
> Cookie is still the wrong word to use here
> 
> > +struct iommu_sva_cookie {
> > +	struct mm_struct *mm;
> > +	ioasid_t pasid;
> > +	refcount_t users;
> 
> Really surprised to see a refcount buried inside the iommu_domain..
> 
> This design seems inside out, the SVA struct should 'enclose' the domain, not
> be a pointer inside it.
> 
> struct iommu_sva_domain {
>        struct kref_t kref;
>        struct mm_struct *mm;
>        ioasid_t pasid;
> 
>        /* All the domains that are linked to this */
>        struct xarray domain_list;
> };
> 
> And then you could have a pointer to that inside the mm_struct instead
> of just the naked pasid.
> 
> > +static __maybe_unused struct iommu_domain *
> 
> Why maybe unused?
> 
> > +iommu_sva_get_domain(struct device *dev, struct mm_struct *mm)
> > +{
> > +	struct iommu_domain *domain;
> > +	ioasid_t pasid = mm->pasid;
> > +
> > +	if (pasid == INVALID_IOASID)
> > +		return NULL;
> > +
> > +	domain = xa_load(&sva_domain_array, pasid);
> > +	if (!domain)
> > +		return iommu_sva_alloc_domain(dev, mm);
> > +	iommu_sva_domain_get_user(domain);
> 
> This assumes any domain is interchangeable with any device, which is
> not the iommu model. We need a domain op to check if a device is
> compatible with the domain for vfio and iommufd, this should do the
> same.

This suggests that mm_struct needs to include the format information
of the CPU page table so the format can be checked by the domain op?

> 
> It means each mm can have a list of domains associated with it and a
> new domain is auto-created if the device doesn't work with any of the
> existing domains.
> 

mm has only one page table and one format. If a device is incompatible
with an existing domain wrapping that page table, how could creating
another domain make it compatible?

Thanks
Kevin

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-02  7:12               ` Tian, Kevin
@ 2022-04-02 23:29                 ` Jason Gunthorpe via iommu
  2022-04-06 10:02                 ` Lu Baolu
  1 sibling, 0 replies; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-02 23:29 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson (alex.williamson@redhat.com),
	Raj, Ashok, Jean-Philippe Brucker, linux-kernel,
	Christoph Hellwig, iommu, Pan, Jacob jun, Will Deacon,
	Robin Murphy

On Sat, Apr 02, 2022 at 07:12:12AM +0000, Tian, Kevin wrote:

> That is one scenario of DMA aliasing. Another is, as Alex replied, where
> one device has an alias requestor ID due to PCI quirks. The alias RID
> may or may not map to a real device, but what we really care about here
> regarding p2p are the struct devices listed in the group.

Those devices are just broken and not spec compliant. IMHO we can
treat them as disabling ACS for their segment as well.

> We may check following conditions to set the immutable flag when
> a new group is created for a device in pci_device_group():
> 
> 1) ACS is enabled in the upstream path of the device;
> 2) the device is single function or ACS is enabled on a multi-function device;
> 3) the device type is PCI_EXP_TYPE_ENDPOINT (thus no hotplug);
> 4) no 'dma aliasing' on this device;

It makes sense to me

Jason

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-02  8:43     ` Tian, Kevin
@ 2022-04-02 23:32       ` Jason Gunthorpe via iommu
  2022-04-04  6:09         ` Lu Baolu
  2022-04-06  1:00         ` Tian, Kevin
  0 siblings, 2 replies; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-02 23:32 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

On Sat, Apr 02, 2022 at 08:43:16AM +0000, Tian, Kevin wrote:

> > This assumes any domain is interchangeable with any device, which is
> > not the iommu model. We need a domain op to check if a device is
> > compatible with the domain for vfio and iommufd, this should do the
> > same.
> 
> This suggests that mm_struct needs to include the format information
> of the CPU page table so the format can be checked by the domain op?

No, Linux does not support multiple formats for CPU page tables,
AFAICT, and creating the SVA domain in the first place should check
this.

> > It means each mm can have a list of domains associated with it and a
> > new domain is auto-created if the device doesn't work with any of the
> > existing domains.
> 
> mm has only one page table and one format. If a device is incompatible
> with an existing domain wrapping that page table, how could creating
> another domain make it compatible?

Because domains wrap more than just the IOPTE format, they have
additional data related to the IOMMU HW block itself. Imagine a SOC
with two IOMMU HW blocks that can both process the CPU IOPTE format,
but have different configuration.

So if device A uses IOMMU A it needs an iommu_domain from driver A and
same for another device B, even if both iommu_domains are thin
wrappers around the same mm_struct.
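This scenario can be modeled in a few lines: one mm keeps a list of its SVA
domains, and a domain is reused only when the device sits behind the same
IOMMU instance. All of the types and names below are illustrative mocks, not
the proposed kernel API:

```c
#include <assert.h>
#include <stdlib.h>

struct mock_iommu { int id; };		/* stands in for one IOMMU HW block */

struct mock_sva_domain {
	const struct mock_iommu *iommu;	/* HW instance this domain serves */
	struct mock_sva_domain *next;
};

struct mock_mm {
	struct mock_sva_domain *domains; /* all SVA domains wrapping this mm */
};

/* Reuse a domain matching the device's IOMMU, else allocate a new one. */
static struct mock_sva_domain *
sva_get_domain(struct mock_mm *mm, const struct mock_iommu *iommu)
{
	struct mock_sva_domain *d;

	for (d = mm->domains; d; d = d->next)
		if (d->iommu == iommu)	/* compatible: same IOMMU instance */
			return d;

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	d->iommu = iommu;
	d->next = mm->domains;
	mm->domains = d;
	return d;
}
```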

Jason

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-03-30 11:58         ` Jason Gunthorpe via iommu
  2022-03-30 14:12           ` Tian, Kevin
  2022-03-30 14:18           ` Tian, Kevin
@ 2022-04-04  5:43           ` Lu Baolu
  2022-04-04 17:24             ` Jason Gunthorpe via iommu
  2 siblings, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-04-04  5:43 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

On 2022/3/30 19:58, Jason Gunthorpe wrote:
>>> Testing the group size is inherently the wrong test to make.
>> What is your suggestion then?
> Add a flag to the group that positively indicates the group can never
> have more than one member, even after hot plug. eg because it is
> impossible due to ACS, or lack of bridges, and so on.

The check method seems to be bus specific. For platform devices, perhaps
this kind of information should be retrieved from firmware interfaces
like ACPI or DT.

 From this point of view, would it be simpler and more reasonable for the
device driver to do such a check? After all, it is the device driver that
decides whether to provide SVA services to the application via uacce.

Best regards,
baolu

* Re: [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces
  2022-03-31 22:26     ` Jason Gunthorpe via iommu
@ 2022-04-04  5:55       ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-04-04  5:55 UTC (permalink / raw)
  To: Jason Gunthorpe, Jacob Pan
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Will Deacon

On 2022/4/1 6:26, Jason Gunthorpe wrote:
> On Thu, Mar 31, 2022 at 01:59:22PM -0700, Jacob Pan wrote:
> 
>>> +	handle->dev = dev;
>>> +	handle->domain = domain;
>>> +	handle->pasid = mm->pasid;
>> why do we need to store pasid here? Conceptually, pasid is per sva domain
>> not per bind. You can get it from handle->domain->sva_cookie.
> That is a mistake - SVA needs to follow the general PASID design - the
> domain does not encode the PASID, the PASID comes from the device

You are right. We should not store any pasid information in the domain.

Best regards,
baolu

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-02 23:32       ` Jason Gunthorpe via iommu
@ 2022-04-04  6:09         ` Lu Baolu
  2022-04-06  1:00         ` Tian, Kevin
  1 sibling, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-04-04  6:09 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

Hi Jason and Kevin,

On 2022/4/3 7:32, Jason Gunthorpe wrote:
> On Sat, Apr 02, 2022 at 08:43:16AM +0000, Tian, Kevin wrote:
> 
>>> This assumes any domain is interchangeable with any device, which is
>>> not the iommu model. We need a domain op to check if a device is
>>> compatible with the domain for vfio and iommufd, this should do the
>>> same.
>>
>> This suggests that mm_struct needs to include the format information
>> of the CPU page table so the format can be checked by the domain op?
> 
> No, Linux does not support multiple formats for CPU page tables,
> AFAICT, and creating the SVA domain in the first place should check
> this.
> 
>>> It means each mm can have a list of domains associated with it and a
>>> new domain is auto-created if the device doesn't work with any of the
>>> existing domains.
>>
>> mm has only one page table and one format. If a device is incompatible
>> with an existing domain wrapping that page table, how could creating
>> another domain make it compatible?
> 
> Because domains wrap more than just the IOPTE format, they have
> additional data related to the IOMMU HW block itself. Imagine a SOC
> with two IOMMU HW blocks that can both process the CPU IOPTE format,
> but have different configuration.
> 
> So if device A uses IOMMU A it needs an iommu_domain from driver A and
> same for another device B, even if both iommu_domains are thin
> wrappers around the same mm_struct.

How about below data structure design?

- [New] struct iommu_sva_ioas
  Represents the I/O address space shared with an application CPU address
  space. This structure has a 1:1 relationship with an mm_struct. It
  grabs a "mm->mm_count" refcount during creation and drops it on release.

struct iommu_sva_ioas {
         struct mm_struct *mm;
         ioasid_t pasid;

         /* Counter of domains attached to this ioas. */
         refcount_t users;

         /* All bindings are linked here. */
         struct list_head bonds;
};

- [Enhance existing] struct iommu_domain (IOMMU_DOMAIN_SVA type)
  Represents a hardware pagetable that the IOMMU hardware could use for
  SVA translation. Multiple iommu domains could be bound to an SVA ioas,
  and each grabs a refcount on the ioas to make sure the ioas can only
  be freed after all domains have been unbound.

@@ -95,6 +101,7 @@ struct iommu_domain {
         void *handler_token;
         struct iommu_domain_geometry geometry;
         struct iommu_dma_cookie *iova_cookie;
+       struct iommu_sva_ioas *sva_ioas;
  };


- [Enhance existing] struct iommu_sva
   Represents a bond relationship between an SVA ioas and an iommu domain.
   If a bond already exists, it's reused and a reference is taken.

/**
  * struct iommu_sva - handle to a device-mm bond
  */
struct iommu_sva {
         struct device           *dev;
         struct iommu_sva_ioas   *sva_ioas;
         struct iommu_domain     *domain;
         /* Link to sva ioas's bonds list */
         struct list_head        node;
         refcount_t              users;
};
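The lifetime rule in the proposal — the ioas holds the mm reference, and each
attached domain holds a reference on the ioas — can be sketched as a tiny
userspace model. Only the refcount flow is modeled; the function names are
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

struct mock_sva_ioas {
	int users;		/* counter of domains attached to this ioas */
	bool released;		/* set when the mm reference would be dropped */
};

/* A domain attaching to the ioas takes a reference. */
static void mock_ioas_get(struct mock_sva_ioas *ioas)
{
	ioas->users++;
}

/* Returns true when the last domain detached and the ioas is released. */
static bool mock_ioas_put(struct mock_sva_ioas *ioas)
{
	if (--ioas->users == 0) {
		ioas->released = true;	/* would call mmdrop(ioas->mm) here */
		return true;
	}
	return false;
}
```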

Best regards,
baolu

* Re: [PATCH RFC v2 04/11] iommu: Add attach/detach_dev_pasid domain ops
  2022-03-30 19:08   ` Jason Gunthorpe via iommu
@ 2022-04-04  6:47     ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-04-04  6:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Jacob jun Pan,
	Will Deacon

Hi Jason,

On 2022/3/31 3:08, Jason Gunthorpe wrote:
> On Tue, Mar 29, 2022 at 01:37:53PM +0800, Lu Baolu wrote:
>> Attaching an IOMMU domain to a PASID of a device is a generic operation
>> for modern IOMMU drivers which support PASID-granular DMA address
>> translation. Currently visible usage scenarios include (but not limited):
>>
>>   - SVA (Shared Virtual Address)
>>   - kernel DMA with PASID
>>   - hardware-assist mediated device
>>
>> This adds a pair of common domain ops for this purpose and adds some
>> common helpers to attach/detach a domain to/from a {device, PASID} and
>> get/put the domain attached to {device, PASID}.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>   include/linux/iommu.h | 36 ++++++++++++++++++
>>   drivers/iommu/iommu.c | 88 +++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 124 insertions(+)
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 29c4c2edd706..a46285488a57 100644
>> +++ b/include/linux/iommu.h
>> @@ -269,6 +269,8 @@ struct iommu_ops {
>>    * struct iommu_domain_ops - domain specific operations
>>    * @attach_dev: attach an iommu domain to a device
>>    * @detach_dev: detach an iommu domain from a device
>> + * @attach_dev_pasid: attach an iommu domain to a pasid of device
>> + * @detach_dev_pasid: detach an iommu domain from a pasid of device
>>    * @map: map a physically contiguous memory region to an iommu domain
>>    * @map_pages: map a physically contiguous set of pages of the same size to
>>    *             an iommu domain.
>> @@ -286,6 +288,10 @@ struct iommu_ops {
>>   struct iommu_domain_ops {
>>   	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
>>   	void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
>> +	int (*attach_dev_pasid)(struct iommu_domain *domain,
>> +				struct device *dev, ioasid_t id);
>> +	void (*detach_dev_pasid)(struct iommu_domain *domain,
>> +				 struct device *dev, ioasid_t id);
> 
> ID should be pasid for consistency

Sure.

> 
>> +int iommu_attach_device_pasid(struct iommu_domain *domain,
>> +			      struct device *dev, ioasid_t pasid)
>> +{
>> +	struct iommu_group *group;
>> +	int ret = -EINVAL;
>> +	void *curr;
>> +
>> +	if (!domain->ops->attach_dev_pasid)
>> +		return -EINVAL;
>> +
>> +	group = iommu_group_get(dev);
>> +	if (!group)
>> +		return -ENODEV;
>> +
>> +	mutex_lock(&group->mutex);
>> +	/*
>> +	 * To keep things simple, we currently don't support IOMMU groups
>> +	 * with more than one device. Existing SVA-capable systems are not
>> +	 * affected by the problems that required IOMMU groups (lack of ACS
>> +	 * isolation, device ID aliasing and other hardware issues).
>> +	 */
>> +	if (!iommu_group_singleton_lockdown(group))
>> +		goto out_unlock;
>> +
>> +	xa_lock(&group->pasid_array);
>> +	curr = __xa_cmpxchg(&group->pasid_array, pasid, NULL,
>> +			    domain, GFP_KERNEL);
>> +	xa_unlock(&group->pasid_array);
> 
> Why the xa_lock/unlock? Just call the normal xa_cmpxchg?

I should use xa_cmpxchg() instead.
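For context, the reason xa_cmpxchg() fits here is that it atomically installs
the domain only if the PASID slot is still empty, so a concurrent attach to
the same PASID fails cleanly. A toy model of that semantic, using a plain
array in place of a real xarray:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PASID 8

/* Toy stand-in for group->pasid_array, indexed by PASID. */
static void *pasid_array[MAX_PASID];

/*
 * Mimics xa_cmpxchg(&group->pasid_array, pasid, NULL, domain, GFP_KERNEL):
 * store only if the slot currently holds @old, and return the old value so
 * the caller can tell whether the slot was already taken.
 */
static void *pasid_cmpxchg(unsigned int pasid, void *old, void *new)
{
	void *curr = pasid_array[pasid];

	if (curr == old)
		pasid_array[pasid] = new;
	return curr;
}
```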


> 
>> +void iommu_detach_device_pasid(struct iommu_domain *domain,
>> +			       struct device *dev, ioasid_t pasid)
>> +{
>> +	struct iommu_group *group;
>> +
>> +	group = iommu_group_get(dev);
>> +	if (WARN_ON(!group))
>> +		return;
> 
> This group_get stuff really needs some cleaning, this makes no sense
> at all.
> 
> If the kref to group can go to zero within this function then the
> initial access of the kref is already buggy:
> 
> 	if (group)
> 		kobject_get(group->devices_kobj);
> 
> Because it will crash or WARN_ON.
> 
> We don't hit this because it is required that a group cannot be
> destroyed while a struct device has a driver bound, and all these
> paths are driver bound paths.
> 
> So none of these group_get/put patterns have any purpose at all, and
> now we are adding impossible WARN_ONs to..

The original intention of this check was to make sure the helper is
called on the correct device. I agree that the WARN_ON() is unnecessary
because a NULL pointer dereference will be caught automatically.

> 
>> +struct iommu_domain *
>> +iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
>> +{
>> +	struct iommu_domain *domain;
>> +	struct iommu_group *group;
>> +
>> +	group = iommu_group_get(dev);
>> +	if (!group)
>> +		return NULL;
> 
> And now we are doing useless things on a performance path!

Agreed.

> 
>> +	mutex_lock(&group->mutex);
>> +	domain = xa_load(&group->pasid_array, pasid);
>> +	if (domain && domain->type == IOMMU_DOMAIN_SVA)
>> +		iommu_sva_domain_get_user(domain);
>> +	mutex_unlock(&group->mutex);
>> +	iommu_group_put(group);
> 
> Why do we need so much locking on a performance path? RCU out of the
> xarray..
> 
> Not sure I see how this get_user refcounting can work ?

I should move the refcounting things to iommu_domain and make the
change easier to review. Will improve this in the new version.

> 
> Jason

Best regards,
baolu

* Re: [PATCH RFC v2 06/11] iommu/vt-d: Add SVA domain support
  2022-03-30 19:09   ` Jason Gunthorpe via iommu
@ 2022-04-04  6:52     ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-04-04  6:52 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Kevin Tian, Ashok Raj, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Jacob jun Pan,
	Will Deacon

On 2022/3/31 3:09, Jason Gunthorpe wrote:
> On Tue, Mar 29, 2022 at 01:37:55PM +0800, Lu Baolu wrote:
>> Add support for SVA domain allocation and provide an SVA-specific
>> iommu_domain_ops.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>   include/linux/intel-iommu.h |  1 +
>>   drivers/iommu/intel/iommu.c | 10 ++++++++++
>>   drivers/iommu/intel/svm.c   | 37 +++++++++++++++++++++++++++++++++++++
>>   3 files changed, 48 insertions(+)
>>
>> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
>> index 2f9891cb3d00..c14283137fb5 100644
>> +++ b/include/linux/intel-iommu.h
>> @@ -744,6 +744,7 @@ void intel_svm_unbind(struct iommu_sva *handle);
>>   u32 intel_svm_get_pasid(struct iommu_sva *handle);
>>   int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
>>   			    struct iommu_page_response *msg);
>> +extern const struct iommu_domain_ops intel_svm_domain_ops;
>>   
>>   struct intel_svm_dev {
>>   	struct list_head list;
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index c1b91bce1530..5eae7cf9bee5 100644
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -4318,6 +4318,16 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>>   		return domain;
>>   	case IOMMU_DOMAIN_IDENTITY:
>>   		return &si_domain->domain;
>> +#ifdef CONFIG_INTEL_IOMMU_SVM
>> +	case IOMMU_DOMAIN_SVA:
>> +		dmar_domain = alloc_domain(type);
>> +		if (!dmar_domain)
>> +			return NULL;
>> +		domain = &dmar_domain->domain;
>> +		domain->ops = &intel_svm_domain_ops;
>> +
>> +		return domain;
>> +#endif /* CONFIG_INTEL_IOMMU_SVM */
> 
> If this is the usual pattern for drivers I would prefer to see an
> alloc_sva op instead of more and more types.
> 
> Multiplexing functions is often not a great idea...

Robin mentioned that the iommu domain alloc/free interfaces are being
reworked. These cleanups need to wait until we see what the final code
looks like.

Best regards,
baolu

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-04  5:43           ` Lu Baolu
@ 2022-04-04 17:24             ` Jason Gunthorpe via iommu
  2022-04-05  6:12               ` Lu Baolu
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-04 17:24 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, Raj, Ashok, Will Deacon, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Pan, Jacob jun,
	Robin Murphy

On Mon, Apr 04, 2022 at 01:43:49PM +0800, Lu Baolu wrote:
> On 2022/3/30 19:58, Jason Gunthorpe wrote:
> > > > Testing the group size is inherently the wrong test to make.
> > > What is your suggestion then?
> > Add a flag to the group that positively indicates the group can never
> > have more than one member, even after hot plug. eg because it is
> > impossible due to ACS, or lack of bridges, and so on.
> 
> The check method seems to be bus specific. For platform devices, perhaps
> this kind of information should be retrieved from firmware interfaces
> like ACPI or DT.
> 
> From this point of view, would it be simpler and more reasonable for the
> device driver to do such check? After all, it is the device driver that
> decides whether to provide SVA services to the application via uacce.

The check has to do with the interconnect, not the device - I don't
see how a device driver would know any better.

Why do you bring up uacce? Nothing should need uacce to access SVA.

Jason

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-04 17:24             ` Jason Gunthorpe via iommu
@ 2022-04-05  6:12               ` Lu Baolu
  2022-04-05 14:10                 ` Jason Gunthorpe via iommu
  0 siblings, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-04-05  6:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Tian, Kevin, Raj, Ashok, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Pan, Jacob jun,
	Will Deacon

On 2022/4/5 1:24, Jason Gunthorpe wrote:
> On Mon, Apr 04, 2022 at 01:43:49PM +0800, Lu Baolu wrote:
>> On 2022/3/30 19:58, Jason Gunthorpe wrote:
>>>>> Testing the group size is inherently the wrong test to make.
>>>> What is your suggestion then?
>>> Add a flag to the group that positively indicates the group can never
>>> have more than one member, even after hot plug. eg because it is
>>> impossible due to ACS, or lack of bridges, and so on.
>>
>> The check method seems to be bus specific. For platform devices, perhaps
>> this kind of information should be retrieved from firmware interfaces
>> like ACPI or DT.
>>
>>  From this point of view, would it be simpler and more reasonable for the
>> device driver to do such check? After all, it is the device driver that
>> decides whether to provide SVA services to the application via uacce.
> 
> The check has to do with the interconnect, not the device - I don't
> see how a device driver would know any better.

I'm worried about how to support this group flag for devices that are
not connected to the system through PCI buses. If the IOMMU core can
support sva_bind() only when this flag is set, SVA on many devices
cannot be supported. Or should this flag always be set for non-PCI
devices by default?

> 
> Why do you bring up uacce? Nothing should need uacce to access SVA.

The uacce is irrelevant here.

Best regards,
baolu

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-05  6:12               ` Lu Baolu
@ 2022-04-05 14:10                 ` Jason Gunthorpe via iommu
  2022-04-06  9:51                   ` Lu Baolu
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 14:10 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, Raj, Ashok, Will Deacon, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Pan, Jacob jun,
	Robin Murphy

On Tue, Apr 05, 2022 at 02:12:42PM +0800, Lu Baolu wrote:
> On 2022/4/5 1:24, Jason Gunthorpe wrote:
> > On Mon, Apr 04, 2022 at 01:43:49PM +0800, Lu Baolu wrote:
> > > On 2022/3/30 19:58, Jason Gunthorpe wrote:
> > > > > > Testing the group size is inherently the wrong test to make.
> > > > > What is your suggestion then?
> > > > Add a flag to the group that positively indicates the group can never
> > > > have more than one member, even after hot plug. eg because it is
> > > > impossible due to ACS, or lack of bridges, and so on.
> > > 
> > > The check method seems to be bus specific. For platform devices, perhaps
> > > this kind of information should be retrieved from firmware interfaces
> > > like ACPI or DT.
> > > 
> > >  From this point of view, would it be simpler and more reasonable for the
> > > device driver to do such check? After all, it is the device driver that
> > > decides whether to provide SVA services to the application via uacce.
> > 
> > The check has to do with the interconnect, not the device - I don't
> > see how a device driver would know any better.
> 
> I'm worried about how to support this group flag for devices that are
> not connected to the system through PCI buses. If the IOMMU core can
> support sva_bind() only when this flag is set, SVA on many devices
> cannot be supported. Or should this flag always be set for non-PCI
> devices by default?

IMHO it is not so different from how we determine if ACS-like
functionality is supported on non-PCI. It is really just a more narrow
application of the existing ACS idea.

For instance it may be that if the iommu_group came from DT we can
assume it is static and then singleton can know ACS is reliable.

Jason

* RE: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-02 23:32       ` Jason Gunthorpe via iommu
  2022-04-04  6:09         ` Lu Baolu
@ 2022-04-06  1:00         ` Tian, Kevin
  2022-04-06  1:23           ` Jason Gunthorpe via iommu
  1 sibling, 1 reply; 62+ messages in thread
From: Tian, Kevin @ 2022-04-06  1:00 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Sunday, April 3, 2022 7:32 AM
> 
> On Sat, Apr 02, 2022 at 08:43:16AM +0000, Tian, Kevin wrote:
> 
> > > This assumes any domain is interchangeable with any device, which is
> > > not the iommu model. We need a domain op to check if a device is
> > > compatible with the domain for vfio and iommufd, this should do the
> > > same.
> >
> > This suggests that mm_struct needs to include the format information
> > of the CPU page table so the format can be checked by the domain op?
> 
> No, Linux does not support multiple formats for CPU page tables,
> AFAICT, and creating the SVA domain in the first place should check
> this.

One interesting usage is when virtio-iommu supports vSVA one day. At
that time there needs to be a way to know the format of the CPU page
table, and the virtio-iommu driver needs to check whether it is
compatible with what the host iommu driver supports. But possibly this
can wait until that usage comes...

> 
> > > It means each mm can have a list of domains associated with it and a
> > > new domain is auto-created if the device doesn't work with any of the
> > > existing domains.
> >
> > mm has only one page table and one format. If a device is incompatible
> > with an existing domain wrapping that page table, how could creating
> > another domain make it compatible?
> 
> Because domains wrap more than just the IOPTE format, they have
> additional data related to the IOMMU HW block itself. Imagine a SOC
> with two IOMMU HW blocks that can both process the CPU IOPTE format,
> but have different configuration.

Curious. Is this a hypothesis or real? If real, can you help give a
concrete example?

> 
> So if device A users IOMMU A it needs an iommu_domain from driver A and
> same for another device B, even if both iommu_domains are thin
> wrappers around the same mm_struct.
> 

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-06  1:00         ` Tian, Kevin
@ 2022-04-06  1:23           ` Jason Gunthorpe via iommu
  2022-04-06  5:58             ` Tian, Kevin
  0 siblings, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06  1:23 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

On Wed, Apr 06, 2022 at 01:00:13AM +0000, Tian, Kevin wrote:

> > Because domains wrap more than just the IOPTE format, they have
> > additional data related to the IOMMU HW block itself. Imagine a SOC
> > with two IOMMU HW blocks that can both process the CPU IOPTE format,
> > but have different configuration.
> 
> Curious. Is it hypothesis or real? If real can you help give a concrete
> example?

Look at arm_smmu_attach_dev() - the domain has exactly one smmu
pointer which contains the base address for the SMMU IP block. If the
domain doesn't match the smmu pointer from the struct device it won't
allow attaching.

I know of ARM SOCs with many copies of the SMMU IP block.

So at least with current drivers ARM seems to have this limitation.
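Reduced to its essence, the limitation described above is a pointer
comparison in the attach path: a domain is bound to the SMMU instance it was
first attached through, and a device behind a different instance is refused.
An illustrative userspace sketch with mock types, not the driver's real
structures:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_smmu { int base; };		/* one SMMU IP block instance */

struct mock_domain {
	struct mock_smmu *smmu;		/* bound on first attach */
};

struct mock_master {
	struct mock_smmu *smmu;		/* the SMMU this device sits behind */
};

/* Mirrors the instance check in arm_smmu_attach_dev(). */
static int mock_attach(struct mock_domain *dom, struct mock_master *master)
{
	if (!dom->smmu) {
		dom->smmu = master->smmu; /* first attach binds the domain */
		return 0;
	}
	if (dom->smmu != master->smmu)
		return -EINVAL;		/* different SMMU instance: refuse */
	return 0;
}
```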

Jason

* RE: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-06  1:23           ` Jason Gunthorpe via iommu
@ 2022-04-06  5:58             ` Tian, Kevin
  2022-04-06 12:32               ` Robin Murphy
  0 siblings, 1 reply; 62+ messages in thread
From: Tian, Kevin @ 2022-04-06  5:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig,
	Jean-Philippe Brucker, iommu, Pan, Jacob jun, Robin Murphy

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 9:24 AM
> 
> On Wed, Apr 06, 2022 at 01:00:13AM +0000, Tian, Kevin wrote:
> 
> > > Because domains wrap more than just the IOPTE format, they have
> > > additional data related to the IOMMU HW block itself. Imagine a SOC
> > > with two IOMMU HW blocks that can both process the CPU IOPTE format,
> > > but have different configuration.
> >
> > Curious. Is it hypothesis or real? If real can you help give a concrete
> > example?
> 
> Look at arm_smmu_attach_dev() - the domain has exactly one smmu
> pointer which contains the base address for the SMMU IP block. If the
> domain doesn't match the smmu pointer from the struct device it won't
> allow attaching.
> 
> I know of ARM SOCs with many copies of the SMMU IP block.
> 
> So at least with current drivers ARM seems to have this limitation.
> 

I saw that code, but before this series it is used only for stage-2 instead
of SVA. and I didn't see similar check in the old sva related paths (though
it doesn't use domain):

arm_smmu_master_sva_enable_iopf()
arm_smmu_master_enable_sva()
__arm_smmu_sva_bind()

If I didn't overlook some trick hiding in the call chain of those functions,
is there a bug in the existing SMMU sva logic or is it conceptually correct
to not have such check for SVA?

If the former then yes we have to take SMMU IP block into consideration
thus could have multiple domains per CPU page table. If the latter then
this is not a valid example for that configuration.

Which one is correct?

Thanks
Kevin

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-05 14:10                 ` Jason Gunthorpe via iommu
@ 2022-04-06  9:51                   ` Lu Baolu
  0 siblings, 0 replies; 62+ messages in thread
From: Lu Baolu @ 2022-04-06  9:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Tian, Kevin, Raj, Ashok, Robin Murphy, linux-kernel,
	Christoph Hellwig, Jean-Philippe Brucker, iommu, Pan, Jacob jun,
	Will Deacon

On 2022/4/5 22:10, Jason Gunthorpe wrote:
> On Tue, Apr 05, 2022 at 02:12:42PM +0800, Lu Baolu wrote:
>> On 2022/4/5 1:24, Jason Gunthorpe wrote:
>>> On Mon, Apr 04, 2022 at 01:43:49PM +0800, Lu Baolu wrote:
>>>> On 2022/3/30 19:58, Jason Gunthorpe wrote:
>>>>>>> Testing the group size is inherently the wrong test to make.
>>>>>> What is your suggestion then?
>>>>> Add a flag to the group that positively indicates the group can never
>>>>> have more than one member, even after hot plug. eg because it is
>>>>> impossible due to ACS, or lack of bridges, and so on.
>>>>
>>>> The check method seems to be bus specific. For platform devices, perhaps
>>>> this kind of information should be retrieved from firmware interfaces
>>>> like APCI or DT.
>>>>
>>>>   From this point of view, would it be simpler and more reasonable for the
>>>> device driver to do such check? After all, it is the device driver that
>>>> decides whether to provide SVA services to the application via uacce.
>>>
>>> The check has to do with the interconnect, not the device - I don't
>>> see how a device driver would know any better.
>>
>> I'm worried about how to support this group flag for devices that are
>> not connected to the system through PCI buses. If IOMMU can support
>> sva_bind() only when this flag is set, the SVA on many devices cannot
>> be supported. Or this flag is always set for non PCI devices by
>> default?
> 
> IHMO it is not so different from how we determine if ACS like
> functionality is supported on non-PCI. It is really just a more narrow
> application of the existing ACS idea.
> 
> For instance it may be that if the iommu_group came from DT we can
> assume it is static and then singleton can know ACS is reliable.

Okay, let me head in this direction.

Best regards,
baolu

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-02  7:12               ` Tian, Kevin
  2022-04-02 23:29                 ` Jason Gunthorpe via iommu
@ 2022-04-06 10:02                 ` Lu Baolu
  2022-04-06 10:44                   ` Tian, Kevin
  1 sibling, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-04-06 10:02 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, iommu, Christoph Hellwig,
	Alex Williamson (alex.williamson@redhat.com),
	Pan, Jacob jun, Robin Murphy, Jean-Philippe Brucker

Hi Kevin,

On 2022/4/2 15:12, Tian, Kevin wrote:
>>>> Add a flag to the group that positively indicates the group can never
>>>> have more than one member, even after hot plug. eg because it is
>>>> impossible due to ACS, or lack of bridges, and so on.
>>> OK, I see your point. It essentially refers to a singleton group which
>>> is immutable to hotplug.
>> Yes, known at creation time, not retroactively enforced because
>> someone used SVA
>>
> We may check following conditions to set the immutable flag when
> a new group is created for a device in pci_device_group():
> 
> 1) ACS is enabled in the upstream path of the device;
> 2) the device is single function or ACS is enabled on a multi-function device;
> 3) the device type is PCI_EXP_TYPE_ENDPOINT (thus no hotplug);
> 4) no 'dma aliasing' on this device;
> 
> The last one is a bit conservative as it also precludes a device which aliasing
> dma due to quirks from being treated as a singleton group. But doing so
> saves the effort on trying to separate different aliasing scenarios as defined
> in pci_for_each_dma_alias(). Probably we can go this way as the first step.
> 
> Once the flag is set on a group no other event can change it. If a new
> identified device hits an existing singleton group in pci_device_group()
> then it's a bug.

How about below implementation?

/* callback for pci_for_each_dma_alias() */
static int has_pci_alias(struct pci_dev *pdev, u16 alias, void *opaque)
{
	return -EEXIST;
}

static bool pci_dev_is_immutably_isolated(struct pci_dev *pdev)
{
	/* Skip bridges. */
	if (pci_is_bridge(pdev))
		return false;

	/* Either connect to root bridge or the ACS-enabled bridge. */
	if (!pci_is_root_bus(pdev->bus) &&
	    !pci_acs_enabled(pdev->bus->self, REQ_ACS_FLAGS))
		return false;

	/* ACS is required for MFD. */
	if (pdev->multifunction && !pci_acs_enabled(pdev, REQ_ACS_FLAGS))
		return false;

	/* Make sure no PCI alias. */
	if (pci_for_each_dma_alias(pdev, has_pci_alias, NULL))
		return false;

	return true;
}

I didn't get why do we need to check the PCI_EXP_TYPE_ENDPOINT device
type. Can you please elaborate a bit more?

Best regards,
baolu

* RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-06 10:02                 ` Lu Baolu
@ 2022-04-06 10:44                   ` Tian, Kevin
  2022-04-06 11:03                     ` Lu Baolu
  0 siblings, 1 reply; 62+ messages in thread
From: Tian, Kevin @ 2022-04-06 10:44 UTC (permalink / raw)
  To: Lu Baolu, Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, iommu, linux-kernel, Christoph Hellwig,
	Alex Williamson (alex.williamson@redhat.com),
	Pan, Jacob jun, Jean-Philippe Brucker, Robin Murphy

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Wednesday, April 6, 2022 6:02 PM
> 
> Hi Kevin,
> 
> On 2022/4/2 15:12, Tian, Kevin wrote:
> >>>> Add a flag to the group that positively indicates the group can never
> >>>> have more than one member, even after hot plug. eg because it is
> >>>> impossible due to ACS, or lack of bridges, and so on.
> >>> OK, I see your point. It essentially refers to a singleton group which
> >>> is immutable to hotplug.
> >> Yes, known at creation time, not retroactively enforced because
> >> someone used SVA
> >>
> > We may check following conditions to set the immutable flag when
> > a new group is created for a device in pci_device_group():
> >
> > 1) ACS is enabled in the upstream path of the device;
> > 2) the device is single function or ACS is enabled on a multi-function device;
> > 3) the device type is PCI_EXP_TYPE_ENDPOINT (thus no hotplug);
> > 4) no 'dma aliasing' on this device;
> >
> > The last one is a bit conservative as it also precludes a device which aliasing
> > dma due to quirks from being treated as a singleton group. But doing so
> > saves the effort on trying to separate different aliasing scenarios as defined
> > in pci_for_each_dma_alias(). Probably we can go this way as the first step.
> >
> > Once the flag is set on a group no other event can change it. If a new
> > identified device hits an existing singleton group in pci_device_group()
> > then it's a bug.
> 
> How about below implementation?
> 
> /* callback for pci_for_each_dma_alias() */
> static int has_pci_alias(struct pci_dev *pdev, u16 alias, void *opaque)
> {
> 	return -EEXIST;
> }
> 
> static bool pci_dev_is_immutably_isolated(struct pci_dev *pdev)
> {
> 	/* Skip bridges. */
> 	if (pci_is_bridge(pdev))
> 		return false;
> 
> 	/* Either connect to root bridge or the ACS-enabled bridge. */
> 	if (!pci_is_root_bus(pdev->bus) &&
> 	    !pci_acs_enabled(pdev->bus->self, REQ_ACS_FLAGS))
> 		return false;

it's not sufficient to just check the non-root bridge itself. This needs to
cover the entire path from the bridge to the root port, as pci_device_group()
does.

> 
> 	/* ACS is required for MFD. */
> 	if (pdev->multifunction && !pci_acs_enabled(pdev, REQ_ACS_FLAGS))
> 		return false;

Above two checks be replaced by a simple check as below:

	if (!pci_acs_path_enabled(pdev, NULL, REQ_ACS_FLAGS))
		return false;

> 
> 	/* Make sure no PCI alias. */
> 	if (pci_for_each_dma_alias(pdev, has_pci_alias, NULL))
> 		return false;
> 
> 	return true;
> }
> 
> I didn't get why do we need to check the PCI_EXP_TYPE_ENDPOINT device
> type. Can you please elaborate a bit more?
> 

I didn't know there is a pci_is_bridge() facility thus be conservative
to restrict it to only endpoint device. If checking pci_is_bridge() alone
excludes any hotplug possibility, then it's definitely better.

Thanks
Kevin

* Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-06 10:44                   ` Tian, Kevin
@ 2022-04-06 11:03                     ` Lu Baolu
  2022-04-06 23:56                       ` Tian, Kevin
  0 siblings, 1 reply; 62+ messages in thread
From: Lu Baolu @ 2022-04-06 11:03 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, iommu, Christoph Hellwig,
	Alex Williamson (alex.williamson@redhat.com),
	Pan, Jacob jun, Robin Murphy, Jean-Philippe Brucker

On 2022/4/6 18:44, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Wednesday, April 6, 2022 6:02 PM
>>
>> Hi Kevin,
>>
>> On 2022/4/2 15:12, Tian, Kevin wrote:
>>>>>> Add a flag to the group that positively indicates the group can never
>>>>>> have more than one member, even after hot plug. eg because it is
>>>>>> impossible due to ACS, or lack of bridges, and so on.
>>>>> OK, I see your point. It essentially refers to a singleton group which
>>>>> is immutable to hotplug.
>>>> Yes, known at creation time, not retroactively enforced because
>>>> someone used SVA
>>>>
>>> We may check following conditions to set the immutable flag when
>>> a new group is created for a device in pci_device_group():
>>>
>>> 1) ACS is enabled in the upstream path of the device;
>>> 2) the device is single function or ACS is enabled on a multi-function device;
>>> 3) the device type is PCI_EXP_TYPE_ENDPOINT (thus no hotplug);
>>> 4) no 'dma aliasing' on this device;
>>>
>>> The last one is a bit conservative as it also precludes a device which aliasing
>>> dma due to quirks from being treated as a singleton group. But doing so
>>> saves the effort on trying to separate different aliasing scenarios as defined
>>> in pci_for_each_dma_alias(). Probably we can go this way as the first step.
>>>
>>> Once the flag is set on a group no other event can change it. If a new
>>> identified device hits an existing singleton group in pci_device_group()
>>> then it's a bug.
>>
>> How about below implementation?
>>
>> /* callback for pci_for_each_dma_alias() */
>> static int has_pci_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>> {
>> 	return -EEXIST;
>> }
>>
>> static bool pci_dev_is_immutably_isolated(struct pci_dev *pdev)
>> {
>> 	/* Skip bridges. */
>> 	if (pci_is_bridge(pdev))
>> 		return false;
>>
>> 	/* Either connect to root bridge or the ACS-enabled bridge. */
>> 	if (!pci_is_root_bus(pdev->bus) &&
>> 	    !pci_acs_enabled(pdev->bus->self, REQ_ACS_FLAGS))
>> 		return false;
> 
> it's not sufficient to just check the non-root bridge itself. This needs to
> cover the entire path from the bridge to the root port, as pci_device_group()
> does.

Yes! You are right.

> 
>>
>> 	/* ACS is required for MFD. */
>> 	if (pdev->multifunction && !pci_acs_enabled(pdev, REQ_ACS_FLAGS))
>> 		return false;
> 
> Above two checks be replaced by a simple check as below:
> 
> 	if (!pci_acs_path_enabled(pdev, NULL, REQ_ACS_FLAGS))
> 		return false;

If !pdev->multifunction, do we still need to start from the device
itself? ACS is only for MFDs and bridges, do I understand it right?
Do we need to consider the SRIOV case?

> 
>>
>> 	/* Make sure no PCI alias. */
>> 	if (pci_for_each_dma_alias(pdev, has_pci_alias, NULL))
>> 		return false;
>>
>> 	return true;
>> }
>>
>> I didn't get why do we need to check the PCI_EXP_TYPE_ENDPOINT device
>> type. Can you please elaborate a bit more?
>>
> 
> I didn't know there is a pci_is_bridge() facility thus be conservative
> to restrict it to only endpoint device. If checking pci_is_bridge() alone
> excludes any hotplug possibility, then it's definitely better.

Okay! Thanks!

> Thanks
> Kevin

Best regards,
baolu

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-06  5:58             ` Tian, Kevin
@ 2022-04-06 12:32               ` Robin Murphy
  2022-04-06 13:06                 ` Jason Gunthorpe via iommu
  2022-04-07  0:11                 ` Tian, Kevin
  0 siblings, 2 replies; 62+ messages in thread
From: Robin Murphy @ 2022-04-06 12:32 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig, iommu,
	Pan, Jacob jun, Jean-Philippe Brucker

On 2022-04-06 06:58, Tian, Kevin wrote:
>> From: Jason Gunthorpe <jgg@nvidia.com>
>> Sent: Wednesday, April 6, 2022 9:24 AM
>>
>> On Wed, Apr 06, 2022 at 01:00:13AM +0000, Tian, Kevin wrote:
>>
>>>> Because domains wrap more than just the IOPTE format, they have
>>>> additional data related to the IOMMU HW block itself. Imagine a SOC
>>>> with two IOMMU HW blocks that can both process the CPU IOPTE format,
>>>> but have different configuration.
>>>
>>> Curious. Is it hypothesis or real? If real can you help give a concrete
>>> example?
>>
>> Look at arm_smmu_attach_dev() - the domain has exactly one smmu
>> pointer which contains the base address for the SMMU IP block. If the
>> domain doesn't match the smmu pointer from the struct device it won't
>> allow attaching.
>>
>> I know of ARM SOCs with many copies of the SMMU IP block.
>>
>> So at least with current drivers ARM seems to have this limitation.
>>
> 
> I saw that code, but before this series it is used only for stage-2 instead
> of SVA. and I didn't see similar check in the old sva related paths (though
> it doesn't use domain):
> 
> arm_smmu_master_sva_enable_iopf()
> arm_smmu_master_enable_sva()
> __arm_smmu_sva_bind()
> 
> If I didn't overlook some trick hiding in the call chain of those functions,
> is there a bug in the existing SMMU sva logic or is it conceptually correct
> to not have such check for SVA?

The current SVA APIs are all device-based, so implicitly reflect 
whichever SMMU instance serves the given device. Once domains come into 
the picture, callers are going to have to be more aware that a domain 
may be specific to a particular IOMMU instance, and potentially allocate 
separate domains for separate devices to represent the same address 
space, much like vfio_iommu_type1_attach_group() does.

It's not really worth IOMMU drivers trying to support a domain spanning 
potentially-heterogeneous instances internally, since they can't 
reasonably know what matters in any particular situation. That's 
primarily why we've never tried to do it in the SMMU drivers. It's a lot 
easier for relevant callers to look at what they get and figure out 
whether any mismatch in capabilities is tolerable or not.

Robin.

> If the former then yes we have to take SMMU IP block into consideration
> thus could have multiple domains per CPU page table. If the latter then
> this is not a valid example for that configuration.
> 
> Which one is correct?
> 
> Thanks
> Kevin

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-06 12:32               ` Robin Murphy
@ 2022-04-06 13:06                 ` Jason Gunthorpe via iommu
  2022-04-06 13:37                   ` Robin Murphy
  2022-04-07  0:11                 ` Tian, Kevin
  1 sibling, 1 reply; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 13:06 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tian, Kevin, Raj, Ashok, Jean-Philippe Brucker, linux-kernel,
	Christoph Hellwig, iommu, Pan, Jacob jun, Will Deacon

On Wed, Apr 06, 2022 at 01:32:07PM +0100, Robin Murphy wrote:
> a particular IOMMU instance, and potentially allocate separate domains for
> separate devices to represent the same address space, much like
> vfio_iommu_type1_attach_group() does.

I think this VFIO code also needs some more work, it currently assumes
that if the domain ops are the same then the domains are compatible,
and that is not true for ARM SMMU drivers.

I've been thinking of adding a domain callback 'can device attach' and
replacing the ops compare with that instead.

> It's not really worth IOMMU drivers trying to support a domain spanning
> potentially-heterogeneous instances internally, since they can't reasonably
> know what matters in any particular situation. 

In the long run I think it will be worth optimizing. If the SMMU
instances can share IOPTE memory then we get two wins - memory
reduction and reduced work to read dirty bits.

The dirty read in particular is very performance sensitive so if real
work loads have many SMMUs per VM it will become a pain point.

Jason

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-06 13:06                 ` Jason Gunthorpe via iommu
@ 2022-04-06 13:37                   ` Robin Murphy
  2022-04-06 14:01                     ` Jason Gunthorpe via iommu
  0 siblings, 1 reply; 62+ messages in thread
From: Robin Murphy @ 2022-04-06 13:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Tian, Kevin, Raj, Ashok, Jean-Philippe Brucker, linux-kernel,
	Christoph Hellwig, iommu, Pan, Jacob jun, Will Deacon

On 2022-04-06 14:06, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 01:32:07PM +0100, Robin Murphy wrote:
>> a particular IOMMU instance, and potentially allocate separate domains for
>> separate devices to represent the same address space, much like
>> vfio_iommu_type1_attach_group() does.
> 
> I think this VFIO code also needs some more work, it currently assumes
> that if the domain ops are the same then the domains are compatible,
> and that is not true for ARM SMMU drivers.

Well, strictly it uses the ops as a "negative" heuristic that the 
domains are not definitely incompatible, and only then consolidates 
domains if cross-attaching actually succeeds. So I don't think it's 
really so bad.

> I've been thinking of adding a domain callback 'can device attach' and
> replacing the ops compare with that instead.

Previous comment notwithstanding, much as I do tend to prefer "just try 
the operation and see what happens" APIs, that might be more reliable in 
the long run than trying to encode specific "sorry, you'll need to 
allocate a separate domain for this device" vs. "this should have worked 
but something went wrong" semantics in the return value from attach.

>> It's not really worth IOMMU drivers trying to support a domain spanning
>> potentially-heterogeneous instances internally, since they can't reasonably
>> know what matters in any particular situation.
> 
> In the long run I think it will be worth optimizing. If the SMMU
> instances can share IOPTE memory then we get two wins - memory
> reduction and reduced work to read dirty bits.
> 
> The dirty read in particular is very performance sensitive so if real
> work loads have many SMMUs per VM it will become a pain point.

In the ideal case though, the SVA domains are only there to logically 
bridge between an existing process pagetable and IOMMU instances at the 
API level, right? Surely if we're shadowing physical pagetables for SVA 
we've basically already lost the performance game?

Robin.

* Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-06 13:37                   ` Robin Murphy
@ 2022-04-06 14:01                     ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 62+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 14:01 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tian, Kevin, Raj, Ashok, Jean-Philippe Brucker, linux-kernel,
	Christoph Hellwig, iommu, Pan, Jacob jun, Will Deacon

On Wed, Apr 06, 2022 at 02:37:40PM +0100, Robin Murphy wrote:
> On 2022-04-06 14:06, Jason Gunthorpe wrote:
> > On Wed, Apr 06, 2022 at 01:32:07PM +0100, Robin Murphy wrote:
> > > a particular IOMMU instance, and potentially allocate separate domains for
> > > separate devices to represent the same address space, much like
> > > vfio_iommu_type1_attach_group() does.
> > 
> > I think this VFIO code also needs some more work, it currently assumes
> > that if the domain ops are the same then the domains are compatible,
> > and that is not true for ARM SMMU drivers.
> 
> Well, strictly it uses the ops as a "negative" heuristic that the domains
> are not definitely incompatible, and only then consolidates domains if
> cross-attaching actually succeeds. So I don't think it's really so bad.

Oh that is sneaky, I didn't appreciate that bit of logic..

> > I've been thinking of adding a domain callback 'can device attach' and
> > replacing the ops compare with that instead.
> 
> Previous comment notwithstanding, much as I do tend to prefer "just try the
> operation and see what happens" APIs, that might be more reliable in the
> long run than trying to encode specific "sorry, you'll need to allocate a
> separate domain for this device" vs. "this should have worked but something
> went wrong" semantics in the return value from attach.

Overall the way VFIO is doing this has a lot of overhead. Not only does
it create a domain it may not use, it also does this:

			iommu_detach_group(domain->domain, group->iommu_group);
			if (!iommu_attach_group(d->domain,
						group->iommu_group)) {
				list_add(&group->next, &d->group_list);
				iommu_domain_free(domain->domain);
				kfree(domain);
				goto done;
			}

			ret = iommu_attach_group(domain->domain,
						 group->iommu_group);

So, if we already have a compatible domain, VFIO does an extra domain
alloc/dealloc and 3 extra attach/detaches per domain it tests.

It is not very elegant, at least.

> > > It's not really worth IOMMU drivers trying to support a domain spanning
> > > potentially-heterogeneous instances internally, since they can't reasonably
> > > know what matters in any particular situation.
> > 
> > In the long run I think it will be worth optimizing. If the SMMU
> > instances can share IOPTE memory then we get two wins - memory
> > reduction and reduced work to read dirty bits.
> > 
> > The dirty read in particular is very performance sensitive so if real
> > work loads have many SMMUs per VM it will become a pain point.
> 
> In the ideal case though, the SVA domains are only there to logically bridge
> between an existing process pagetable and IOMMU instances at the API level,
> right? Surely if we're shadowing physical pagetables for SVA we've basically
> already lost the performance game?

Sorry, I was not talking about SVA with that remark, speaking
generally about normal iommu_domains and sharing between SMMU
instances.

For SVA I see no issue with duplicating it per instance since it is
very small/etc - the only drawback is that the common code has to do
the 'can device attach' dance and keep a domain list per
mm-struct. Which seems OK to me.

Jason

* RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
  2022-04-06 11:03                     ` Lu Baolu
@ 2022-04-06 23:56                       ` Tian, Kevin
  0 siblings, 0 replies; 62+ messages in thread
From: Tian, Kevin @ 2022-04-06 23:56 UTC (permalink / raw)
  To: Lu Baolu, Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, iommu, linux-kernel, Christoph Hellwig,
	Alex Williamson (alex.williamson@redhat.com),
	Pan, Jacob jun, Jean-Philippe Brucker, Robin Murphy

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Wednesday, April 6, 2022 7:03 PM
> 
> On 2022/4/6 18:44, Tian, Kevin wrote:
> >> From: Lu Baolu <baolu.lu@linux.intel.com>
> >> Sent: Wednesday, April 6, 2022 6:02 PM
> >>
> >> Hi Kevin,
> >>
> >> On 2022/4/2 15:12, Tian, Kevin wrote:
> >>>>>> Add a flag to the group that positively indicates the group can never
> >>>>>> have more than one member, even after hot plug. eg because it is
> >>>>>> impossible due to ACS, or lack of bridges, and so on.
> >>>>> OK, I see your point. It essentially refers to a singleton group which
> >>>>> is immutable to hotplug.
> >>>> Yes, known at creation time, not retroactively enforced because
> >>>> someone used SVA
> >>>>
> >>> We may check following conditions to set the immutable flag when
> >>> a new group is created for a device in pci_device_group():
> >>>
> >>> 1) ACS is enabled in the upstream path of the device;
> >>> 2) the device is single function or ACS is enabled on a multi-function
> device;
> >>> 3) the device type is PCI_EXP_TYPE_ENDPOINT (thus no hotplug);
> >>> 4) no 'dma aliasing' on this device;
> >>>
> >>> The last one is a bit conservative as it also precludes a device which
> aliasing
> >>> dma due to quirks from being treated as a singleton group. But doing so
> >>> saves the effort on trying to separate different aliasing scenarios as
> defined
> >>> in pci_for_each_dma_alias(). Probably we can go this way as the first
> step.
> >>>
> >>> Once the flag is set on a group no other event can change it. If a new
> >>> identified device hits an existing singleton group in pci_device_group()
> >>> then it's a bug.
> >>
> >> How about below implementation?
> >>
> >> /* callback for pci_for_each_dma_alias() */
> >> static int has_pci_alias(struct pci_dev *pdev, u16 alias, void *opaque)
> >> {
> >> 	return -EEXIST;
> >> }
> >>
> >> static bool pci_dev_is_immutably_isolated(struct pci_dev *pdev)
> >> {
> >> 	/* Skip bridges. */
> >> 	if (pci_is_bridge(pdev))
> >> 		return false;
> >>
> >> 	/* Either connect to root bridge or the ACS-enabled bridge. */
> >> 	if (!pci_is_root_bus(pdev->bus) &&
> >> 	    !pci_acs_enabled(pdev->bus->self, REQ_ACS_FLAGS))
> >> 		return false;
> >
> > it's not sufficient to just check the non-root bridge itself. This needs to
> > cover the entire path from the bridge to the root port, as pci_device_group()
> > does.
> 
> Yes! You are right.
> 
> >
> >>
> >> 	/* ACS is required for MFD. */
> >> 	if (pdev->multifunction && !pci_acs_enabled(pdev, REQ_ACS_FLAGS))
> >> 		return false;
> >
> > Above two checks be replaced by a simple check as below:
> >
> > 	if (!pci_acs_path_enabled(pdev, NULL, REQ_ACS_FLAGS))
> > 		return false;
> 
> If !pdev->multifunction, do we still need to start from the device
> itself? ACS is only for MFDs and bridges, do I understand it right?
> Do we need to consider the SRIOV case?

SR-IOV is the same as MFD, and all those tricks are already considered
properly in pci_acs_enabled().

> 
> >
> >>
> >> 	/* Make sure no PCI alias. */
> >> 	if (pci_for_each_dma_alias(pdev, has_pci_alias, NULL))
> >> 		return false;
> >>
> >> 	return true;
> >> }
> >>
> >> I didn't get why do we need to check the PCI_EXP_TYPE_ENDPOINT device
> >> type. Can you please elaborate a bit more?
> >>
> >
> > I didn't know there is a pci_is_bridge() facility thus be conservative
> > to restrict it to only endpoint device. If checking pci_is_bridge() alone
> > excludes any hotplug possibility, then it's definitely better.
> 
> Okay! Thanks!
> 
> > Thanks
> > Kevin
> 
> Best regards,
> baolu

* RE: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
  2022-04-06 12:32               ` Robin Murphy
  2022-04-06 13:06                 ` Jason Gunthorpe via iommu
@ 2022-04-07  0:11                 ` Tian, Kevin
  1 sibling, 0 replies; 62+ messages in thread
From: Tian, Kevin @ 2022-04-07  0:11 UTC (permalink / raw)
  To: Robin Murphy, Jason Gunthorpe
  Cc: Raj, Ashok, Will Deacon, linux-kernel, Christoph Hellwig, iommu,
	Pan, Jacob jun, Jean-Philippe Brucker

> From: Robin Murphy <robin.murphy@arm.com>
> Sent: Wednesday, April 6, 2022 8:32 PM
> 
> On 2022-04-06 06:58, Tian, Kevin wrote:
> >> From: Jason Gunthorpe <jgg@nvidia.com>
> >> Sent: Wednesday, April 6, 2022 9:24 AM
> >>
> >> On Wed, Apr 06, 2022 at 01:00:13AM +0000, Tian, Kevin wrote:
> >>
> >>>> Because domains wrap more than just the IOPTE format, they have
> >>>> additional data related to the IOMMU HW block itself. Imagine a SOC
> >>>> with two IOMMU HW blocks that can both process the CPU IOPTE
> format,
> >>>> but have different configuration.
> >>>
> >>> Curious. Is it hypothesis or real? If real can you help give a concrete
> >>> example?
> >>
> >> Look at arm_smmu_attach_dev() - the domain has exactly one smmu
> >> pointer which contains the base address for the SMMU IP block. If the
> >> domain doesn't match the smmu pointer from the struct device it won't
> >> allow attaching.
> >>
> >> I know of ARM SOCs with many copies of the SMMU IP block.
> >>
> >> So at least with current drivers ARM seems to have this limitation.
> >>
> >
> > I saw that code, but before this series it was used only for stage-2, not
> > SVA. And I didn't see a similar check in the old SVA-related paths (though
> > they don't use a domain):
> >
> > arm_smmu_master_sva_enable_iopf()
> > arm_smmu_master_enable_sva()
> > __arm_smmu_sva_bind()
> >
> > If I didn't overlook some trick hiding in the call chain of those functions,
> > is there a bug in the existing SMMU SVA logic, or is it conceptually correct
> > not to have such a check for SVA?
> 
> The current SVA APIs are all device-based, so implicitly reflect
> whichever SMMU instance serves the given device. Once domains come into
> the picture, callers are going to have to be more aware that a domain
> may be specific to a particular IOMMU instance, and potentially allocate
> separate domains for separate devices to represent the same address
> space, much like vfio_iommu_type1_attach_group() does.

The current definition of a domain is specific to stage-2. It is interesting
whether we want to carry the same stage-2 constraints over to stage-1 when
extending the domain concept to cover both. From what you explained, SVA is
conceptually device-based, so the SMMU instance doesn't matter. Then allowing
one domain to wrap a CPU page table across devices under different SMMU
instances is also a valid choice, as long as this domain is differentiated
from a stage-2 domain in a proper way. I get that you may not want to go this
route for the SMMU driver, as explained below, but let's align on the point
that it's just an implementation choice instead of a conceptual requirement
(I didn't see how IOMMU instances could have heterogeneous configurations
when walking the same CPU page table). 😊

> 
> It's not really worth IOMMU drivers trying to support a domain spanning
> potentially-heterogeneous instances internally, since they can't
> reasonably know what matters in any particular situation. That's
> primarily why we've never tried to do it in the SMMU drivers. It's a lot
> easier for relevant callers to look at what they get and figure out
> whether any mismatch in capabilities is tolerable or not.
> 
> Robin.
> 
> > If the former then yes we have to take SMMU IP block into consideration
> > thus could have multiple domains per CPU page table. If the latter then
> > this is not a valid example for that configuration.
> >
> > Which one is correct?
> >
> > Thanks
> > Kevin


end of thread, other threads:[~2022-04-07  0:11 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-29  5:37 [PATCH RFC v2 00/11] iommu: SVA and IOPF refactoring Lu Baolu
2022-03-29  5:37 ` [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu Lu Baolu
2022-03-29 21:00   ` Jacob Pan
2022-03-30  4:30     ` Lu Baolu
2022-03-30  7:05   ` Tian, Kevin
2022-03-30 11:58     ` Lu Baolu
2022-03-29  5:37 ` [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown() Lu Baolu
2022-03-29  8:42   ` Tian, Kevin
2022-03-29 11:42     ` Jason Gunthorpe via iommu
2022-03-30  6:50       ` Tian, Kevin
2022-03-30 11:57         ` Lu Baolu
2022-03-30 11:58         ` Jason Gunthorpe via iommu
2022-03-30 14:12           ` Tian, Kevin
2022-03-30 14:30             ` Jason Gunthorpe via iommu
2022-04-02  7:12               ` Tian, Kevin
2022-04-02 23:29                 ` Jason Gunthorpe via iommu
2022-04-06 10:02                 ` Lu Baolu
2022-04-06 10:44                   ` Tian, Kevin
2022-04-06 11:03                     ` Lu Baolu
2022-04-06 23:56                       ` Tian, Kevin
2022-03-30 14:18           ` Tian, Kevin
2022-03-30 15:04             ` Alex Williamson
2022-04-04  5:43           ` Lu Baolu
2022-04-04 17:24             ` Jason Gunthorpe via iommu
2022-04-05  6:12               ` Lu Baolu
2022-04-05 14:10                 ` Jason Gunthorpe via iommu
2022-04-06  9:51                   ` Lu Baolu
2022-04-01  6:20       ` Yi Liu
2022-04-01 11:52         ` Jason Gunthorpe via iommu
2022-03-30  4:59     ` Lu Baolu
2022-03-30  6:55       ` Tian, Kevin
2022-04-01  5:49     ` Yi Liu
2022-03-29  5:37 ` [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA Lu Baolu
2022-03-29 21:38   ` Jacob Pan
2022-03-30  4:35     ` Lu Baolu
2022-03-30 19:02   ` Jason Gunthorpe via iommu
2022-04-02  8:43     ` Tian, Kevin
2022-04-02 23:32       ` Jason Gunthorpe via iommu
2022-04-04  6:09         ` Lu Baolu
2022-04-06  1:00         ` Tian, Kevin
2022-04-06  1:23           ` Jason Gunthorpe via iommu
2022-04-06  5:58             ` Tian, Kevin
2022-04-06 12:32               ` Robin Murphy
2022-04-06 13:06                 ` Jason Gunthorpe via iommu
2022-04-06 13:37                   ` Robin Murphy
2022-04-06 14:01                     ` Jason Gunthorpe via iommu
2022-04-07  0:11                 ` Tian, Kevin
2022-03-29  5:37 ` [PATCH RFC v2 04/11] iommu: Add attach/detach_dev_pasid domain ops Lu Baolu
2022-03-30 19:08   ` Jason Gunthorpe via iommu
2022-04-04  6:47     ` Lu Baolu
2022-03-29  5:37 ` [PATCH RFC v2 05/11] iommu/vt-d: Remove SVM_FLAG_SUPERVISOR_MODE suport Lu Baolu
2022-03-29  5:37 ` [PATCH RFC v2 06/11] iommu/vt-d: Add SVA domain support Lu Baolu
2022-03-30 19:09   ` Jason Gunthorpe via iommu
2022-04-04  6:52     ` Lu Baolu
2022-03-29  5:37 ` [PATCH RFC v2 07/11] arm-smmu-v3/sva: " Lu Baolu
2022-03-29  5:37 ` [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces Lu Baolu
2022-03-31 20:59   ` Jacob Pan
2022-03-31 22:26     ` Jason Gunthorpe via iommu
2022-04-04  5:55       ` Lu Baolu
2022-03-29  5:37 ` [PATCH RFC v2 09/11] iommu: Remove SVA related callbacks from iommu ops Lu Baolu
2022-03-29  5:37 ` [PATCH RFC v2 10/11] iommu: Per-domain I/O page fault handling Lu Baolu
2022-03-29  5:38 ` [PATCH RFC v2 11/11] iommu: Rename iommu-sva-lib.{c,h} Lu Baolu
