linux-kernel.vger.kernel.org archive mirror
* [v6 PATCH 0/5] Share sva domains with all devices bound to a mm
From: Tina Zhang @ 2023-10-11  6:51 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Lu Baolu; +Cc: iommu, linux-kernel, Tina Zhang

This series shares sva (shared virtual addressing) domains among all
devices bound to one mm.

Problem
-------
In the current iommu core code, an sva domain is allocated per IOMMU group
when a device driver binds a process address space to a device (this is
handled in iommu_sva_bind_device()). If more than one device is bound to
the same process address space, there must be more than one sva domain
instance, with each device having its own. In other words, the sva domain
is not shared among the devices bound to the same process address space,
and that leads to two problems:
1) A device driver has to duplicate sva domains with enqcmd, as those sva
domains have the same PASID and refer to one virtual address space.
This makes sva domain handling complex in device drivers.
2) When handling mmu_notifier_ops callbacks, the IOMMU driver cannot get
sufficient information about the IOMMUs that have devices behind them
bound to the same virtual address space. As a result, IOMMU IOTLB
invalidation is performed per device instead of per IOMMU, which may lead
to superfluous IOTLB invalidations, especially in a virtualization
environment where all devices may be behind one virtual IOMMU.
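
To make problem 2) concrete, here is a minimal userspace sketch (not kernel
code) that counts how many invalidations a single address-space change
would trigger under each scheme. The struct bound_dev type, its fields, and
both counting functions are hypothetical names made up for this
illustration; they do not exist in the iommu core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a device bound to the mm and the IOMMU it sits behind. */
struct bound_dev {
	int dev_id;
	int iommu_id;
};

/* Per-device scheme: every bound device triggers its own invalidation. */
static size_t invalidations_per_device(const struct bound_dev *devs, size_t n)
{
	(void)devs;
	return n;
}

/* Per-IOMMU scheme: one invalidation for each distinct IOMMU. */
static size_t invalidations_per_iommu(const struct bound_dev *devs, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		bool seen = false;

		/* Has an earlier device already covered this IOMMU? */
		for (size_t j = 0; j < i; j++) {
			if (devs[j].iommu_id == devs[i].iommu_id) {
				seen = true;
				break;
			}
		}
		if (!seen)
			count++;
	}
	return count;
}
```

With three devices behind one virtual IOMMU, the per-device count is three
while the per-IOMMU count is one, which is the redundancy described above.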

Solution
--------
This patch set tries to fix those two problems by allowing sva domains to
be shared among all devices bound to a mm. To achieve this, a new structure
pointer is added to mm to replace the old PASID field; it keeps the PASID
as well as the corresponding shared sva domains.
In addition, iommu_sva_bind_device() is updated so that a new sva domain
is allocated only when none of the existing ones works for the IOMMU.
With these changes, a device driver can expect one sva domain to work
per PASID instance (e.g., an enqcmd PASID instance), and can therefore
drop its handling of sva domain duplication. Also, an IOMMU driver (e.g.,
the Intel VT-d driver) can get sufficient information (e.g., which IOMMUs
have devices bound to one virtual address space) when handling
mmu_notifier_ops callbacks, and so can remove the redundant IOTLB
invalidations.

Arguably there shouldn't be more than one sva domain with the same PASID,
and in any sane configuration there should be only one type of IOMMU
driver, which needs only one sva domain. In reality, however, the IOMMUs
on one platform may not be identical to each other. Thus, attaching an sva
domain that has been successfully bound to device A behind IOMMU A to
device B behind IOMMU B may fail due to the differences between IOMMU A
and IOMMU B. In this case, a new sva domain with the same PASID needs to
be allocated to work with IOMMU B. That's why we need a list to keep the
sva domains of one PASID. On a platform where the IOMMUs are compatible
with each other, there should be only one sva domain in the list.
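
The reuse-or-allocate behaviour described above can be sketched in plain
userspace C. This is only a model under stated assumptions, not the code
in drivers/iommu/iommu-sva.c: struct sva_domain, struct mm_data, the
"flavor" compatibility rule and all function names below are hypothetical,
and locking is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an sva domain; "flavor" models IOMMU compatibility. */
struct sva_domain {
	int flavor;
	int users;
	struct sva_domain *next;
};

/* Hypothetical stand-in for the per-mm data: one PASID, a list of domains. */
struct mm_data {
	unsigned int pasid;
	struct sva_domain *domains;
};

/* Assumed rule: attaching succeeds only when the flavors match. */
static int try_attach(const struct sva_domain *d, int dev_flavor)
{
	return d->flavor == dev_flavor ? 0 : -1;
}

/* Reuse the first domain that attaches; otherwise allocate a new one. */
static struct sva_domain *bind_device(struct mm_data *mm, int dev_flavor)
{
	struct sva_domain *d;

	for (d = mm->domains; d; d = d->next) {
		if (!try_attach(d, dev_flavor)) {
			d->users++;
			return d;
		}
	}

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	d->flavor = dev_flavor;
	d->users = 1;
	d->next = mm->domains;
	mm->domains = d;
	return d;
}

/* Drop one user; unlink and free the domain when the last user goes away. */
static void unbind_device(struct mm_data *mm, struct sva_domain *d)
{
	struct sva_domain **pp;

	if (--d->users)
		return;
	for (pp = &mm->domains; *pp; pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			break;
		}
	}
	free(d);
}

/* Number of domains currently kept for this mm. */
static size_t domain_count(const struct mm_data *mm)
{
	size_t n = 0;

	for (const struct sva_domain *d = mm->domains; d; d = d->next)
		n++;
	return n;
}
```

In this model, two devices with the same flavor end up sharing one domain
with a user count of two, while a device with a different flavor adds a
second domain under the same PASID.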

v6:
 - Rename iommu_sva_alloc_pasid() to iommu_alloc_mm_data().
 - Hold the iommu_sva_lock before invoking iommu_alloc_mm_data().
 - Remove "iommu: Introduce mm_get_pasid() helper function" patch, because
   SMMUv3 decides to use mm_get_enqcmd_pasid() instead and other users are
   using iommu_sva_get_pasid() to get the pasid value. Besides, the iommu
   core accesses iommu_mm_data in the critical section protected by
   iommu_sva_lock. So no need to add another helper to retrieve PASID
   atomically.

v5:
 - Order patch "iommu/vt-d: Remove mm->pasid in intel_sva_bind_mm()"
   first in this series.
 - Update commit message of patch "iommu: Introduce mm_get_pasid()
   helper function"
 - Use smp_store_release() & READ_ONCE() in storing and loading mm's
   pasid value.

v4:
 - Rebase to v6.6-rc1.

v3:
 - Add a comment describing domain->next.
 - Expand explanation of why PASID isn't released in
   iommu_sva_unbind_device().
 - Add a patch to remove mm->pasid in intel_sva_bind_mm()

v2:
 - Add mm_get_enqcmd_pasid().
 - Update commit message.

v1: https://lore.kernel.org/linux-iommu/20230808074944.7825-1-tina.zhang@intel.com/

Tina Zhang (5):
  iommu/vt-d: Remove mm->pasid in intel_sva_bind_mm()
  iommu: Add mm_get_enqcmd_pasid() helper function
  mm: Add structure to keep sva information
  iommu: Support mm PASID 1:n with sva domains
  mm: Deprecate pasid field

 arch/x86/kernel/traps.c                       |  2 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 12 +--
 drivers/iommu/intel/svm.c                     | 14 +--
 drivers/iommu/iommu-sva.c                     | 94 +++++++++++--------
 include/linux/iommu.h                         | 27 +++++-
 include/linux/mm_types.h                      |  3 +-
 kernel/fork.c                                 |  1 -
 mm/init-mm.c                                  |  3 -
 8 files changed, 93 insertions(+), 63 deletions(-)

-- 
2.39.3



* [v6 PATCH 1/5] iommu/vt-d: Remove mm->pasid in intel_sva_bind_mm()
From: Tina Zhang @ 2023-10-11  6:51 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Lu Baolu
  Cc: iommu, linux-kernel, Tina Zhang, Jason Gunthorpe

The pasid is passed in as a parameter through the .set_dev_pasid()
callback. Thus, intel_svm_bind_mm() can use it directly instead of
retrieving the pasid value from mm->pasid.

Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
---
 drivers/iommu/intel/svm.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 50a481c895b8..3c531af58658 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -290,21 +290,22 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
 }
 
 static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev,
-			     struct mm_struct *mm)
+			     struct iommu_domain *domain, ioasid_t pasid)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct mm_struct *mm = domain->mm;
 	struct intel_svm_dev *sdev;
 	struct intel_svm *svm;
 	unsigned long sflags;
 	int ret = 0;
 
-	svm = pasid_private_find(mm->pasid);
+	svm = pasid_private_find(pasid);
 	if (!svm) {
 		svm = kzalloc(sizeof(*svm), GFP_KERNEL);
 		if (!svm)
 			return -ENOMEM;
 
-		svm->pasid = mm->pasid;
+		svm->pasid = pasid;
 		svm->mm = mm;
 		INIT_LIST_HEAD_RCU(&svm->devs);
 
@@ -342,7 +343,7 @@ static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev,
 
 	/* Setup the pasid table: */
 	sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
-	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm->pasid,
+	ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, pasid,
 					    FLPT_DEFAULT_DID, sflags);
 	if (ret)
 		goto free_sdev;
@@ -356,7 +357,7 @@ static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev,
 free_svm:
 	if (list_empty(&svm->devs)) {
 		mmu_notifier_unregister(&svm->notifier, mm);
-		pasid_private_remove(mm->pasid);
+		pasid_private_remove(pasid);
 		kfree(svm);
 	}
 
@@ -796,9 +797,8 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	struct intel_iommu *iommu = info->iommu;
-	struct mm_struct *mm = domain->mm;
 
-	return intel_svm_bind_mm(iommu, dev, mm);
+	return intel_svm_bind_mm(iommu, dev, domain, pasid);
 }
 
 static void intel_svm_domain_free(struct iommu_domain *domain)
-- 
2.39.3



* [v6 PATCH 2/5] iommu: Add mm_get_enqcmd_pasid() helper function
From: Tina Zhang @ 2023-10-11  6:51 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Lu Baolu
  Cc: iommu, linux-kernel, Tina Zhang, Jason Gunthorpe

mm_get_enqcmd_pasid() is for getting enqcmd pasid value.

The motivation is to replace mm->pasid with an iommu private data
structure that is introduced in a later patch.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
---

Changes in v6:
- Let SMMUv3 call mm_get_enqcmd_pasid().
- Let iommu_sva_get_pasid() call mm_get_enqcmd_pasid().

Change in v2:
- Change mm_get_pasid() to mm_get_enqcmd_pasid()

 arch/x86/kernel/traps.c                         |  2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 12 ++++++------
 drivers/iommu/iommu-sva.c                       |  2 +-
 include/linux/iommu.h                           |  8 ++++++++
 4 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index c876f1d36a81..832f4413d96a 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -591,7 +591,7 @@ static bool try_fixup_enqcmd_gp(void)
 	if (!mm_valid_pasid(current->mm))
 		return false;
 
-	pasid = current->mm->pasid;
+	pasid = mm_get_enqcmd_pasid(current->mm);
 
 	/*
 	 * Did this thread already have its PASID activated?
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 8a16cd3ef487..49aaa7262ea1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -229,7 +229,7 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
 						    smmu_domain);
 	}
 
-	arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, start, size);
+	arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), start, size);
 }
 
 static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
@@ -247,10 +247,10 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	 * DMA may still be running. Keep the cd valid to avoid C_BAD_CD events,
 	 * but disable translation.
 	 */
-	arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, &quiet_cd);
+	arm_smmu_write_ctx_desc(smmu_domain, mm_get_enqcmd_pasid(mm), &quiet_cd);
 
 	arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
-	arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0);
+	arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
 
 	smmu_mn->cleared = true;
 	mutex_unlock(&sva_lock);
@@ -304,7 +304,7 @@ arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain,
 		goto err_free_cd;
 	}
 
-	ret = arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, cd);
+	ret = arm_smmu_write_ctx_desc(smmu_domain, mm_get_enqcmd_pasid(mm), cd);
 	if (ret)
 		goto err_put_notifier;
 
@@ -329,7 +329,7 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
 		return;
 
 	list_del(&smmu_mn->list);
-	arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, NULL);
+	arm_smmu_write_ctx_desc(smmu_domain, mm_get_enqcmd_pasid(mm), NULL);
 
 	/*
 	 * If we went through clear(), we've already invalidated, and no
@@ -337,7 +337,7 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
 	 */
 	if (!smmu_mn->cleared) {
 		arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
-		arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0);
+		arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
 	}
 
 	/* Frees smmu_mn */
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index b78671a8a914..4a2f5699747f 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -141,7 +141,7 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle)
 {
 	struct iommu_domain *domain = handle->domain;
 
-	return domain->mm->pasid;
+	return mm_get_enqcmd_pasid(domain->mm);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c50a769d569a..a4eab6697fe1 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1189,6 +1189,10 @@ static inline bool mm_valid_pasid(struct mm_struct *mm)
 {
 	return mm->pasid != IOMMU_PASID_INVALID;
 }
+static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
+{
+	return mm->pasid;
+}
 void mm_pasid_drop(struct mm_struct *mm);
 struct iommu_sva *iommu_sva_bind_device(struct device *dev,
 					struct mm_struct *mm);
@@ -1211,6 +1215,10 @@ static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
 }
 static inline void mm_pasid_init(struct mm_struct *mm) {}
 static inline bool mm_valid_pasid(struct mm_struct *mm) { return false; }
+static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
+{
+	return IOMMU_PASID_INVALID;
+}
 static inline void mm_pasid_drop(struct mm_struct *mm) {}
 #endif /* CONFIG_IOMMU_SVA */
 
-- 
2.39.3



* [v6 PATCH 3/5] mm: Add structure to keep sva information
From: Tina Zhang @ 2023-10-11  6:51 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Lu Baolu
  Cc: iommu, linux-kernel, Tina Zhang, Vasant Hegde, Jason Gunthorpe

Introduce iommu_mm_data structure to keep sva information (pasid and the
related sva domains). Add iommu_mm pointer, pointing to an instance of
iommu_mm_data structure, to mm.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
---
 include/linux/iommu.h    | 5 +++++
 include/linux/mm_types.h | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a4eab6697fe1..dc1f98e12f4b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -670,6 +670,11 @@ struct iommu_sva {
 	struct iommu_domain		*domain;
 };
 
+struct iommu_mm_data {
+	u32			pasid;
+	struct list_head	sva_domains;
+};
+
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
 		      const struct iommu_ops *ops);
 void iommu_fwspec_free(struct device *dev);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 36c5b43999e6..9f4efed85f74 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -670,6 +670,7 @@ struct mm_cid {
 #endif
 
 struct kioctx_table;
+struct iommu_mm_data;
 struct mm_struct {
 	struct {
 		/*
@@ -883,6 +884,7 @@ struct mm_struct {
 
 #ifdef CONFIG_IOMMU_SVA
 		u32 pasid;
+		struct iommu_mm_data *iommu_mm;
 #endif
 #ifdef CONFIG_KSM
 		/*
-- 
2.39.3



* [v6 PATCH 4/5] iommu: Support mm PASID 1:n with sva domains
From: Tina Zhang @ 2023-10-11  6:51 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Lu Baolu
  Cc: iommu, linux-kernel, Tina Zhang, Vasant Hegde, Jason Gunthorpe

Each mm bound to devices gets a PASID and the corresponding sva domains
allocated in iommu_sva_bind_device(), which are referenced by the iommu_mm
field of the mm. The PASID is released in __mmdrop(), while an sva domain
is released when no one is using it (the reference count is decremented
in iommu_sva_unbind_device()). Although sva domains and their PASID are
separate objects whose life cycles could be handled independently, an
enqcmd use case may require releasing the PASID when the mm is released
(i.e., once a PASID is allocated for a mm, it is used by the mm
permanently and is not released until the end of the mm), and the PASID
may be dropped only after the sva domains are released. To this end,
mmgrab() is called in iommu_sva_domain_alloc() to increment the mm
reference count and mmdrop() is invoked in iommu_domain_free() to
decrement the mm reference count.

Since the required info of PASID and sva domains is kept in struct
iommu_mm_data of a mm, use mm->iommu_mm field instead of the old pasid
field in mm struct. The sva domain list is protected by iommu_sva_lock.

Besides, this patch removes mm_pasid_init(), as with the introduced
iommu_mm structure, initializing mm pasid in mm_init() is unnecessary.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
---

Change in v6:
- Rename iommu_sva_alloc_pasid() to iommu_alloc_mm_data().
- Hold the iommu_sva_lock before invoking iommu_alloc_mm_data().

Change in v5:
- Use smp_store_release() & READ_ONCE() in storing and loading mm's
  pasid value.

Change in v4:
- Rebase to v6.6-rc1.
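
The smp_store_release() / READ_ONCE() pairing mentioned in the v5 note is
the usual "initialize, then publish the pointer" pattern. A rough userspace
analogue with C11 atomics is sketched below. It is an assumption-laden
model, not the kernel code: struct mm_model, struct mm_data, PASID_INVALID
and the function names are hypothetical, and the reader uses an acquire
load as a conservative portable stand-in for the address dependency that
READ_ONCE() relies on in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical placeholder for "no PASID assigned". */
#define PASID_INVALID ((unsigned int)-1)

/* Hypothetical stand-in for the per-mm sva data. */
struct mm_data {
	unsigned int pasid;
};

/* Hypothetical stand-in for the mm: only the published pointer is modelled. */
struct mm_model {
	_Atomic(struct mm_data *) iommu_mm;
};

static void mm_model_init(struct mm_model *mm)
{
	atomic_init(&mm->iommu_mm, NULL);
}

/* Writer: fill in every field first, then publish with release ordering. */
static struct mm_data *publish_mm_data(struct mm_model *mm, unsigned int pasid)
{
	struct mm_data *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->pasid = pasid;
	atomic_store_explicit(&mm->iommu_mm, d, memory_order_release);
	return d;
}

/* Reader: a non-NULL pointer is guaranteed to point at initialized data. */
static unsigned int get_pasid(struct mm_model *mm)
{
	struct mm_data *d = atomic_load_explicit(&mm->iommu_mm,
						 memory_order_acquire);

	return d ? d->pasid : PASID_INVALID;
}
```

The point of the ordering is that a lockless reader either sees NULL and
reports an invalid PASID, or sees the pointer together with the fields that
were written before it was published.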

 drivers/iommu/iommu-sva.c | 92 +++++++++++++++++++++++----------------
 include/linux/iommu.h     | 18 +++++---
 kernel/fork.c             |  1 -
 3 files changed, 65 insertions(+), 46 deletions(-)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 4a2f5699747f..5175e8d85247 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -12,32 +12,42 @@
 static DEFINE_MUTEX(iommu_sva_lock);
 
 /* Allocate a PASID for the mm within range (inclusive) */
-static int iommu_sva_alloc_pasid(struct mm_struct *mm, struct device *dev)
+static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct device *dev)
 {
+	struct iommu_mm_data *iommu_mm;
 	ioasid_t pasid;
-	int ret = 0;
+
+	lockdep_assert_held(&iommu_sva_lock);
 
 	if (!arch_pgtable_dma_compat(mm))
-		return -EBUSY;
+		return ERR_PTR(-EBUSY);
 
-	mutex_lock(&iommu_sva_lock);
+	iommu_mm = mm->iommu_mm;
 	/* Is a PASID already associated with this mm? */
-	if (mm_valid_pasid(mm)) {
-		if (mm->pasid >= dev->iommu->max_pasids)
-			ret = -EOVERFLOW;
-		goto out;
+	if (iommu_mm) {
+		if (iommu_mm->pasid >= dev->iommu->max_pasids)
+			return ERR_PTR(-EOVERFLOW);
+		return iommu_mm;
 	}
 
+	iommu_mm = kzalloc(sizeof(struct iommu_mm_data), GFP_KERNEL);
+	if (!iommu_mm)
+		return ERR_PTR(-ENOMEM);
+
 	pasid = iommu_alloc_global_pasid(dev);
 	if (pasid == IOMMU_PASID_INVALID) {
-		ret = -ENOSPC;
-		goto out;
+		kfree(iommu_mm);
+		return ERR_PTR(-ENOSPC);
 	}
-	mm->pasid = pasid;
-	ret = 0;
-out:
-	mutex_unlock(&iommu_sva_lock);
-	return ret;
+	iommu_mm->pasid = pasid;
+	INIT_LIST_HEAD(&iommu_mm->sva_domains);
+	/*
+	 * Make sure the write to mm->iommu_mm is not reordered in front of
+	 * initialization to iommu_mm fields. If it does, readers may see a
+	 * valid iommu_mm with uninitialized values.
+	 */
+	smp_store_release(&mm->iommu_mm, iommu_mm);
+	return iommu_mm;
 }
 
 /**
@@ -58,31 +68,33 @@ static int iommu_sva_alloc_pasid(struct mm_struct *mm, struct device *dev)
  */
 struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
 {
+	struct iommu_mm_data *iommu_mm;
 	struct iommu_domain *domain;
 	struct iommu_sva *handle;
 	int ret;
 
+	mutex_lock(&iommu_sva_lock);
+
 	/* Allocate mm->pasid if necessary. */
-	ret = iommu_sva_alloc_pasid(mm, dev);
-	if (ret)
-		return ERR_PTR(ret);
+	iommu_mm = iommu_alloc_mm_data(mm, dev);
+	if (IS_ERR(iommu_mm)) {
+		ret = PTR_ERR(iommu_mm);
+		goto out_unlock;
+	}
 
 	handle = kzalloc(sizeof(*handle), GFP_KERNEL);
-	if (!handle)
-		return ERR_PTR(-ENOMEM);
-
-	mutex_lock(&iommu_sva_lock);
-	/* Search for an existing domain. */
-	domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid,
-						IOMMU_DOMAIN_SVA);
-	if (IS_ERR(domain)) {
-		ret = PTR_ERR(domain);
+	if (!handle) {
+		ret = -ENOMEM;
 		goto out_unlock;
 	}
 
-	if (domain) {
-		domain->users++;
-		goto out;
+	/* Search for an existing domain. */
+	list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) {
+		ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid);
+		if (!ret) {
+			domain->users++;
+			goto out;
+		}
 	}
 
 	/* Allocate a new domain and set it on device pasid. */
@@ -92,23 +104,23 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
 		goto out_unlock;
 	}
 
-	ret = iommu_attach_device_pasid(domain, dev, mm->pasid);
+	ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid);
 	if (ret)
 		goto out_free_domain;
 	domain->users = 1;
+	list_add(&domain->next, &mm->iommu_mm->sva_domains);
+
 out:
 	mutex_unlock(&iommu_sva_lock);
 	handle->dev = dev;
 	handle->domain = domain;
-
 	return handle;
 
 out_free_domain:
 	iommu_domain_free(domain);
+	kfree(handle);
 out_unlock:
 	mutex_unlock(&iommu_sva_lock);
-	kfree(handle);
-
 	return ERR_PTR(ret);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
@@ -124,12 +136,13 @@ EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
 void iommu_sva_unbind_device(struct iommu_sva *handle)
 {
 	struct iommu_domain *domain = handle->domain;
-	ioasid_t pasid = domain->mm->pasid;
+	struct iommu_mm_data *iommu_mm = domain->mm->iommu_mm;
 	struct device *dev = handle->dev;
 
 	mutex_lock(&iommu_sva_lock);
+	iommu_detach_device_pasid(domain, dev, iommu_mm->pasid);
 	if (--domain->users == 0) {
-		iommu_detach_device_pasid(domain, dev, pasid);
+		list_del(&domain->next);
 		iommu_domain_free(domain);
 	}
 	mutex_unlock(&iommu_sva_lock);
@@ -205,8 +218,11 @@ iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
 
 void mm_pasid_drop(struct mm_struct *mm)
 {
-	if (likely(!mm_valid_pasid(mm)))
+	struct iommu_mm_data *iommu_mm = mm->iommu_mm;
+
+	if (!iommu_mm)
 		return;
 
-	iommu_free_global_pasid(mm->pasid);
+	iommu_free_global_pasid(iommu_mm->pasid);
+	kfree(iommu_mm);
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index dc1f98e12f4b..bd79d4e4af89 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -109,6 +109,11 @@ struct iommu_domain {
 		struct {	/* IOMMU_DOMAIN_SVA */
 			struct mm_struct *mm;
 			int users;
+			/*
+			 * Next iommu_domain in mm->iommu_mm->sva_domains list
+			 * protected by iommu_sva_lock.
+			 */
+			struct list_head next;
 		};
 	};
 };
@@ -1186,17 +1191,17 @@ static inline bool tegra_dev_iommu_get_stream_id(struct device *dev, u32 *stream
 }
 
 #ifdef CONFIG_IOMMU_SVA
-static inline void mm_pasid_init(struct mm_struct *mm)
-{
-	mm->pasid = IOMMU_PASID_INVALID;
-}
 static inline bool mm_valid_pasid(struct mm_struct *mm)
 {
-	return mm->pasid != IOMMU_PASID_INVALID;
+	return READ_ONCE(mm->iommu_mm);
 }
 static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
 {
-	return mm->pasid;
+	struct iommu_mm_data *iommu_mm = READ_ONCE(mm->iommu_mm);
+
+	if (!iommu_mm)
+		return IOMMU_PASID_INVALID;
+	return iommu_mm->pasid;
 }
 void mm_pasid_drop(struct mm_struct *mm);
 struct iommu_sva *iommu_sva_bind_device(struct device *dev,
@@ -1218,7 +1223,6 @@ static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
 {
 	return IOMMU_PASID_INVALID;
 }
-static inline void mm_pasid_init(struct mm_struct *mm) {}
 static inline bool mm_valid_pasid(struct mm_struct *mm) { return false; }
 static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
 {
diff --git a/kernel/fork.c b/kernel/fork.c
index 3b6d20dfb9a8..985403a7a747 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1277,7 +1277,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm_init_cpumask(mm);
 	mm_init_aio(mm);
 	mm_init_owner(mm, p);
-	mm_pasid_init(mm);
 	RCU_INIT_POINTER(mm->exe_file, NULL);
 	mmu_notifier_subscriptions_init(mm);
 	init_tlb_flush_pending(mm);
-- 
2.39.3



* [v6 PATCH 5/5] mm: Deprecate pasid field
From: Tina Zhang @ 2023-10-11  6:51 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Lu Baolu
  Cc: iommu, linux-kernel, Tina Zhang, Vasant Hegde, Jason Gunthorpe

Drop the pasid field, as all the information needed for sva domain
management has been moved to the newly added iommu_mm field.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
---
 include/linux/mm_types.h | 1 -
 mm/init-mm.c             | 3 ---
 2 files changed, 4 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 9f4efed85f74..37f049c4b059 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -883,7 +883,6 @@ struct mm_struct {
 		struct work_struct async_put_work;
 
 #ifdef CONFIG_IOMMU_SVA
-		u32 pasid;
 		struct iommu_mm_data *iommu_mm;
 #endif
 #ifdef CONFIG_KSM
diff --git a/mm/init-mm.c b/mm/init-mm.c
index cfd367822cdd..24c809379274 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -44,9 +44,6 @@ struct mm_struct init_mm = {
 #endif
 	.user_ns	= &init_user_ns,
 	.cpu_bitmap	= CPU_BITS_NONE,
-#ifdef CONFIG_IOMMU_SVA
-	.pasid		= IOMMU_PASID_INVALID,
-#endif
 	INIT_MM_CONTEXT(init_mm)
 };
 
-- 
2.39.3



* Re: [v6 PATCH 4/5] iommu: Support mm PASID 1:n with sva domains
From: Jason Gunthorpe @ 2023-10-11 12:39 UTC (permalink / raw)
  To: Tina Zhang, Nicolin Chen
  Cc: Kevin Tian, Lu Baolu, iommu, linux-kernel, Vasant Hegde

On Wed, Oct 11, 2023 at 02:51:31PM +0800, Tina Zhang wrote:

> diff --git a/kernel/fork.c b/kernel/fork.c
> index 3b6d20dfb9a8..985403a7a747 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1277,7 +1277,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
>  	mm_init_cpumask(mm);
>  	mm_init_aio(mm);
>  	mm_init_owner(mm, p);
> -	mm_pasid_init(mm);
>  	RCU_INIT_POINTER(mm->exe_file, NULL);
>  	mmu_notifier_subscriptions_init(mm);
>  	init_tlb_flush_pending(mm);

Nicolin debugged his crash report last night and sent me the details.

This hunk is the cause of the bug that Nicolin reported.

The dup_mm() flow does:

static struct mm_struct *dup_mm(struct task_struct *tsk,
				struct mm_struct *oldmm)
{
	struct mm_struct *mm;
	int err;

	mm = allocate_mm();
	if (!mm)
		goto fail_nomem;

	memcpy(mm, oldmm, sizeof(*mm));

	if (!mm_init(mm, tsk, mm->user_ns))
		goto fail_nomem;

It is essential that mm_pasid_init() zero the new pointer; otherwise,
due to the memcpy, after a fork two mm structs will point to the same
thing and one will UAF/double free.

Keep mm_pasid_init() and add zeroing the new pointer to it.

Jason
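
The aliasing described above can be reproduced in a small userspace model.
This is a sketch under stated assumptions rather than the kernel's
dup_mm()/mm_init() code: struct mm_model, struct mm_data and the model_*()
helpers are hypothetical names, and "reinit" stands in for whether
mm_pasid_init() is called after the copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-mm sva data. */
struct mm_data {
	unsigned int pasid;
};

/* Hypothetical stand-in for the mm, with one unrelated field for contrast. */
struct mm_model {
	int other_state;
	struct mm_data *iommu_mm;
};

/* Models the suggested fix: a new mm starts with no sva data. */
static void model_pasid_init(struct mm_model *mm)
{
	mm->iommu_mm = NULL;
}

/* Copy the parent wholesale, as the memcpy() in the quoted flow does. */
static void model_dup(struct mm_model *child, const struct mm_model *parent,
		      bool reinit)
{
	memcpy(child, parent, sizeof(*child));
	if (reinit)
		model_pasid_init(child);
}

/* Models the drop path: free the sva data if there is any. */
static void model_drop(struct mm_model *mm)
{
	free(mm->iommu_mm);
	mm->iommu_mm = NULL;
}
```

Without the re-initialization, parent and child hold the same pointer, so
dropping both would free it twice. With it, the child starts from NULL and
each mm frees only what it allocated.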


* Re: [v6 PATCH 4/5] iommu: Support mm PASID 1:n with sva domains
From: Tina Zhang @ 2023-10-11 13:26 UTC (permalink / raw)
  To: Jason Gunthorpe, Nicolin Chen
  Cc: Kevin Tian, Lu Baolu, iommu, linux-kernel, Vasant Hegde



On 10/11/23 20:39, Jason Gunthorpe wrote:
> On Wed, Oct 11, 2023 at 02:51:31PM +0800, Tina Zhang wrote:
> 
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index 3b6d20dfb9a8..985403a7a747 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -1277,7 +1277,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
>>   	mm_init_cpumask(mm);
>>   	mm_init_aio(mm);
>>   	mm_init_owner(mm, p);
>> -	mm_pasid_init(mm);
>>   	RCU_INIT_POINTER(mm->exe_file, NULL);
>>   	mmu_notifier_subscriptions_init(mm);
>>   	init_tlb_flush_pending(mm);
> 
> Nicolin debugged his crash report last night and sent me the details.
> 
> This hunk is the cause of the bug that Nicolin reported.
> 
> The dup_mm() flow does:
> 
> static struct mm_struct *dup_mm(struct task_struct *tsk,
> 				struct mm_struct *oldmm)
> {
> 	struct mm_struct *mm;
> 	int err;
> 
> 	mm = allocate_mm();
> 	if (!mm)
> 		goto fail_nomem;
> 
> 	memcpy(mm, oldmm, sizeof(*mm));
> 
> 	if (!mm_init(mm, tsk, mm->user_ns))
> 		goto fail_nomem;
> 
> It is essential that mm_pasid_init() zero the new pointer; otherwise,
> due to the memcpy, after a fork two mm structs will point to the same
> thing and one will UAF/double free.
Good catch.

Thanks,
-Tina
> 
> Keep mm_pasid_init() and add zeroing the new pointer to it.
> 
> Jason


* Re: [v6 PATCH 2/5] iommu: Add mm_get_enqcmd_pasid() helper function
From: Jason Gunthorpe @ 2023-10-11 15:44 UTC (permalink / raw)
  To: Tina Zhang; +Cc: Kevin Tian, Lu Baolu, iommu, linux-kernel

On Wed, Oct 11, 2023 at 02:51:29PM +0800, Tina Zhang wrote:
> mm_get_enqcmd_pasid() is for getting enqcmd pasid value.
> 
> The motivation is to replace mm->pasid with an iommu private data
> structure that is introduced in a later patch.

When you do v7 how about:

===

mm_get_enqcmd_pasid() should be used by architecture code and closely
related to learn the PASID value that the x86 ENQCMD operation should
use for the mm.

For the moment SMMUv3 uses this without any connection to ENQCMD, it
will be cleaned up similar to how the prior patch made VT-d use the
PASID argument of set_dev_pasid().

The motivation is to replace mm->pasid with an iommu private data
structure that is introduced in a later patch.

===

Jason


* Re: [v6 PATCH 4/5] iommu: Support mm PASID 1:n with sva domains
From: Nicolin Chen @ 2023-10-11 19:33 UTC (permalink / raw)
  To: Jason Gunthorpe, Tina Zhang
  Cc: Kevin Tian, Lu Baolu, iommu, linux-kernel, Vasant Hegde

On Wed, Oct 11, 2023 at 09:26:12PM +0800, Tina Zhang wrote:
> On 10/11/23 20:39, Jason Gunthorpe wrote:
> > On Wed, Oct 11, 2023 at 02:51:31PM +0800, Tina Zhang wrote:
> > 
> > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > index 3b6d20dfb9a8..985403a7a747 100644
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -1277,7 +1277,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
> > >      mm_init_cpumask(mm);
> > >      mm_init_aio(mm);
> > >      mm_init_owner(mm, p);
> > > -    mm_pasid_init(mm);
> > >      RCU_INIT_POINTER(mm->exe_file, NULL);
> > >      mmu_notifier_subscriptions_init(mm);
> > >      init_tlb_flush_pending(mm);
> > 
> > Nicolin debugged his crash report last night and sent me the details.
> > 
> > This hunk is the cause of the bug that Nicolin reported.
> > 
> > The dup_mm() flow does:
> > 
> > static struct mm_struct *dup_mm(struct task_struct *tsk,
> >                               struct mm_struct *oldmm)
> > {
> >       struct mm_struct *mm;
> >       int err;
> > 
> >       mm = allocate_mm();
> >       if (!mm)
> >               goto fail_nomem;
> > 
> >       memcpy(mm, oldmm, sizeof(*mm));
> > 
> >       if (!mm_init(mm, tsk, mm->user_ns))
> >               goto fail_nomem;
> > 
> > It is essential that mm_pasid_init() zero the new pointer; otherwise,
> > due to the memcpy, after a fork two mm structs will point to the same
> > thing and one will UAF/double free.
> Good catch.
> 
> Thanks,
> -Tina
> > 
> > Keep mm_pasid_init() and add zeroing the new pointer to it.

Yea, testing with this sees no more WARN_ON:

---------------------------------------------------------
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 3d782fd0f485..4bc3c49cdaf9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1208,2 +1208,6 @@ static inline bool tegra_dev_iommu_get_stream_id(struct device *dev, u32 *stream
 #ifdef CONFIG_IOMMU_SVA
+static inline void mm_pasid_init(struct mm_struct *mm)
+{
+	mm->iommu_mm = NULL;
+}
 static inline bool mm_valid_pasid(struct mm_struct *mm)
@@ -1240,2 +1244,3 @@ static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
 }
+static inline void mm_pasid_init(struct mm_struct *mm) {}
 static inline bool mm_valid_pasid(struct mm_struct *mm) { return false; }
diff --git a/kernel/fork.c b/kernel/fork.c
index f06392dd1ca8..d2e12b6d2b18 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1276,2 +1276,3 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm_init_owner(mm, p);
+	mm_pasid_init(mm);
 	RCU_INIT_POINTER(mm->exe_file, NULL);
---------------------------------------------------------

I'll confirm with v7 too.

Thanks
Nicolin

