linux-mm.kvack.org archive mirror
* [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)
@ 2020-06-18 15:51 Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 01/12] mm: Define pasid in mm Jean-Philippe Brucker
                   ` (13 more replies)
  0 siblings, 14 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

Since v7 [1], I split the series into three parts to ease review. This
first one adds page table sharing to the SMMUv3 driver. The second one
adds support for I/O page faults through PRI and Stall, and the last one
adds additional, optional features (DVM, VHE and HTTU). SVA needs all
three parts to work. Apart from the split there are no significant
changes; I just addressed the previous comments.

I'd rather everything went through the IOMMU tree but I'm assuming patch
1 will also go through the x86 tree as part of [2]. It is definitely
required by patch 3, which is in turn required by patch 11. I don't know
how this
kind of conflict is usually resolved, but if it's a problem I could
further shrink the series to only patches 4-10 this cycle.

[1] https://lore.kernel.org/linux-iommu/20200519175502.2504091-1-jean-philippe@linaro.org/
[2] https://lore.kernel.org/linux-iommu/1592418233-17762-1-git-send-email-fenghua.yu@intel.com/

Fenghua Yu (1):
  mm: Define pasid in mm

Jean-Philippe Brucker (11):
  iommu/ioasid: Add ioasid references
  iommu/sva: Add PASID helpers
  arm64: mm: Pin down ASIDs for sharing mm with devices
  iommu/io-pgtable-arm: Move some definitions to a header
  arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
  iommu/arm-smmu-v3: Share process page tables
  iommu/arm-smmu-v3: Seize private ASID
  iommu/arm-smmu-v3: Check for SVA features
  iommu/arm-smmu-v3: Add SVA device feature
  iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()
  iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops

 drivers/iommu/Kconfig                |   7 +
 drivers/iommu/Makefile               |   1 +
 arch/arm64/include/asm/mmu.h         |   1 +
 arch/arm64/include/asm/mmu_context.h |  11 +-
 drivers/iommu/io-pgtable-arm.h       |  30 ++
 drivers/iommu/iommu-sva-lib.h        |  15 +
 include/linux/ioasid.h               |  10 +-
 include/linux/mm_types.h             |   4 +
 arch/arm64/kernel/cpufeature.c       |   1 +
 arch/arm64/mm/context.c              |  95 +++-
 drivers/iommu/arm-smmu-v3.c          | 702 ++++++++++++++++++++++++++-
 drivers/iommu/intel/iommu.c          |   4 +-
 drivers/iommu/intel/svm.c            |   6 +-
 drivers/iommu/io-pgtable-arm.c       |  27 +-
 drivers/iommu/ioasid.c               |  38 +-
 drivers/iommu/iommu-sva-lib.c        |  85 ++++
 MAINTAINERS                          |   3 +-
 17 files changed, 977 insertions(+), 63 deletions(-)
 create mode 100644 drivers/iommu/io-pgtable-arm.h
 create mode 100644 drivers/iommu/iommu-sva-lib.h
 create mode 100644 drivers/iommu/iommu-sva-lib.c

-- 
2.27.0




* [PATCH v8 01/12] mm: Define pasid in mm
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 02/12] iommu/ioasid: Add ioasid references Jean-Philippe Brucker
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Christoph Hellwig, Tony Luck

From: Fenghua Yu <fenghua.yu@intel.com>

A PASID is shared by all threads in a process, so the logical place to
keep track of it is the "mm". Both Arm and x86 need to use the PASID in
the "mm".

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
https://lore.kernel.org/linux-iommu/1592418233-17762-9-git-send-email-fenghua.yu@intel.com/
---
 include/linux/mm_types.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 64ede5f150dc5..1ad0e54ebbbaa 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -538,6 +538,10 @@ struct mm_struct {
 		atomic_long_t hugetlb_usage;
 #endif
 		struct work_struct async_put_work;
+
+#ifdef CONFIG_IOMMU_SUPPORT
+		unsigned int pasid;
+#endif
 	} __randomize_layout;
 
 	/*
-- 
2.27.0




* [PATCH v8 02/12] iommu/ioasid: Add ioasid references
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 01/12] mm: Define pasid in mm Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 03/12] iommu/sva: Add PASID helpers Jean-Philippe Brucker
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

Let IOASID users take references to existing ioasids with ioasid_get().
ioasid_put() drops a reference and only frees the ioasid when its
reference count reaches zero. It returns true if the ioasid was freed.
For drivers that don't call ioasid_get(), ioasid_put() is the same as
ioasid_free().
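
As a rough illustration (not part of this patch; the my_* helpers are
hypothetical), a user of the refcounted API would look something like
this:

  #include <linux/ioasid.h>

  static DECLARE_IOASID_SET(my_ioasid_set);

  /* The first user allocates the IOASID and holds the initial reference */
  static ioasid_t my_alloc(void *priv)
  {
          return ioasid_alloc(&my_ioasid_set, 1, 0xffff, priv);
  }

  /* Other users of the same IOASID take additional references... */
  static void my_get(ioasid_t ioasid)
  {
          ioasid_get(ioasid);
  }

  /* ...and the IOASID is only freed once the last reference is put */
  static bool my_put(ioasid_t ioasid)
  {
          return ioasid_put(ioasid);
  }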

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 include/linux/ioasid.h      | 10 ++++++++--
 drivers/iommu/intel/iommu.c |  4 ++--
 drivers/iommu/intel/svm.c   |  6 +++---
 drivers/iommu/ioasid.c      | 38 +++++++++++++++++++++++++++++++++----
 4 files changed, 47 insertions(+), 11 deletions(-)

diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 6f000d7a0ddcd..e9dacd4b9f6bb 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -34,7 +34,8 @@ struct ioasid_allocator_ops {
 #if IS_ENABLED(CONFIG_IOASID)
 ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
 		      void *private);
-void ioasid_free(ioasid_t ioasid);
+void ioasid_get(ioasid_t ioasid);
+bool ioasid_put(ioasid_t ioasid);
 void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 		  bool (*getter)(void *));
 int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
@@ -48,10 +49,15 @@ static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
 	return INVALID_IOASID;
 }
 
-static inline void ioasid_free(ioasid_t ioasid)
+static inline void ioasid_get(ioasid_t ioasid)
 {
 }
 
+static inline bool ioasid_put(ioasid_t ioasid)
+{
+	return false;
+}
+
 static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 				bool (*getter)(void *))
 {
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9129663a7406b..a5840b8ef14b7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5131,7 +5131,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
 	domain->auxd_refcnt--;
 
 	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		ioasid_free(domain->default_pasid);
+		ioasid_put(domain->default_pasid);
 }
 
 static int aux_domain_add_dev(struct dmar_domain *domain,
@@ -5193,7 +5193,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 	spin_unlock(&iommu->lock);
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		ioasid_free(domain->default_pasid);
+		ioasid_put(domain->default_pasid);
 
 	return ret;
 }
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 6c87c807a0abb..b078a697c42e8 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -546,7 +546,7 @@ intel_svm_bind_mm(struct device *dev, int flags, struct svm_dev_ops *ops,
 		if (mm) {
 			ret = mmu_notifier_register(&svm->notifier, mm);
 			if (ret) {
-				ioasid_free(svm->pasid);
+				ioasid_put(svm->pasid);
 				kfree(svm);
 				kfree(sdev);
 				goto out;
@@ -564,7 +564,7 @@ intel_svm_bind_mm(struct device *dev, int flags, struct svm_dev_ops *ops,
 		if (ret) {
 			if (mm)
 				mmu_notifier_unregister(&svm->notifier, mm);
-			ioasid_free(svm->pasid);
+			ioasid_put(svm->pasid);
 			kfree(svm);
 			kfree(sdev);
 			goto out;
@@ -639,7 +639,7 @@ static int intel_svm_unbind_mm(struct device *dev, int pasid)
 			kfree_rcu(sdev, rcu);
 
 			if (list_empty(&svm->devs)) {
-				ioasid_free(svm->pasid);
+				ioasid_put(svm->pasid);
 				if (svm->mm)
 					mmu_notifier_unregister(&svm->notifier, svm->mm);
 				list_del(&svm->list);
diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 0f8dd377aada3..50ee27bbd04ec 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -2,7 +2,7 @@
 /*
  * I/O Address Space ID allocator. There is one global IOASID space, split into
  * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
- * free IOASIDs with ioasid_alloc and ioasid_free.
+ * free IOASIDs with ioasid_alloc and ioasid_put.
  */
 #include <linux/ioasid.h>
 #include <linux/module.h>
@@ -15,6 +15,7 @@ struct ioasid_data {
 	struct ioasid_set *set;
 	void *private;
 	struct rcu_head rcu;
+	refcount_t refs;
 };
 
 /*
@@ -314,6 +315,7 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
 
 	data->set = set;
 	data->private = private;
+	refcount_set(&data->refs, 1);
 
 	/*
 	 * Custom allocator needs allocator data to perform platform specific
@@ -346,11 +348,34 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
 EXPORT_SYMBOL_GPL(ioasid_alloc);
 
 /**
- * ioasid_free - Free an IOASID
+ * ioasid_get - obtain a reference to the IOASID
+ */
+void ioasid_get(ioasid_t ioasid)
+{
+	struct ioasid_data *ioasid_data;
+
+	spin_lock(&ioasid_allocator_lock);
+	ioasid_data = xa_load(&active_allocator->xa, ioasid);
+	if (ioasid_data)
+		refcount_inc(&ioasid_data->refs);
+	else
+		WARN_ON(1);
+	spin_unlock(&ioasid_allocator_lock);
+}
+EXPORT_SYMBOL_GPL(ioasid_get);
+
+/**
+ * ioasid_put - Release a reference to an ioasid
  * @ioasid: the ID to remove
+ *
+ * Put a reference to the IOASID, free it when the number of references drops to
+ * zero.
+ *
+ * Return: %true if the IOASID was freed, %false otherwise.
  */
-void ioasid_free(ioasid_t ioasid)
+bool ioasid_put(ioasid_t ioasid)
 {
+	bool free = false;
 	struct ioasid_data *ioasid_data;
 
 	spin_lock(&ioasid_allocator_lock);
@@ -360,6 +385,10 @@ void ioasid_free(ioasid_t ioasid)
 		goto exit_unlock;
 	}
 
+	free = refcount_dec_and_test(&ioasid_data->refs);
+	if (!free)
+		goto exit_unlock;
+
 	active_allocator->ops->free(ioasid, active_allocator->ops->pdata);
 	/* Custom allocator needs additional steps to free the xa element */
 	if (active_allocator->flags & IOASID_ALLOCATOR_CUSTOM) {
@@ -369,8 +398,9 @@ void ioasid_free(ioasid_t ioasid)
 
 exit_unlock:
 	spin_unlock(&ioasid_allocator_lock);
+	return free;
 }
-EXPORT_SYMBOL_GPL(ioasid_free);
+EXPORT_SYMBOL_GPL(ioasid_put);
 
 /**
  * ioasid_find - Find IOASID data
-- 
2.27.0




* [PATCH v8 03/12] iommu/sva: Add PASID helpers
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 01/12] mm: Define pasid in mm Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 02/12] iommu/ioasid: Add ioasid references Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-06-19  7:37   ` Lu Baolu
  2020-06-18 15:51 ` [PATCH v8 04/12] arm64: mm: Pin down ASIDs for sharing mm with devices Jean-Philippe Brucker
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

Let IOMMU drivers allocate a single PASID per mm. Store the mm in the
IOASID set to allow refcounting and searching the mm by PASID when
handling an I/O page fault.
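
For illustration, a driver's bind path might use these helpers roughly
like this (a sketch only; my_bind(), my_unbind() and my_handle_iopf()
are hypothetical, as is the PASID range):

  #include <linux/err.h>
  #include <linux/sched/mm.h>

  #include "iommu-sva-lib.h"

  static int my_bind(struct mm_struct *mm)
  {
          int ret;

          /* Allocate a PASID, or take a reference to the existing one */
          ret = iommu_sva_alloc_pasid(mm, 1, (1 << 20) - 1);
          if (ret)
                  return ret;

          /* ... install mm->pasid and the pgd in the IOMMU contexts ... */
          return 0;
  }

  static void my_unbind(struct mm_struct *mm)
  {
          /* ... clear the IOMMU contexts ... */
          iommu_sva_free_pasid(mm);
  }

  static void my_handle_iopf(ioasid_t pasid)
  {
          /* Takes a reference to the mm, if it is still alive */
          struct mm_struct *mm = iommu_sva_find(pasid);

          if (IS_ERR_OR_NULL(mm))
                  return;
          /* ... fix up the fault using the mm ... */
          mmput(mm);
  }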

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
v7->v8: rename to IOMMU_SVA_LIB (Lu Baolu)
---
 drivers/iommu/Kconfig         |  5 +++
 drivers/iommu/Makefile        |  1 +
 drivers/iommu/iommu-sva-lib.h | 15 +++++++
 drivers/iommu/iommu-sva-lib.c | 85 +++++++++++++++++++++++++++++++++++
 4 files changed, 106 insertions(+)
 create mode 100644 drivers/iommu/iommu-sva-lib.h
 create mode 100644 drivers/iommu/iommu-sva-lib.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b510f67dfa499..74a10e7a8d082 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -102,6 +102,11 @@ config IOMMU_DMA
 	select IRQ_MSI_IOMMU
 	select NEED_SG_DMA_LENGTH
 
+# Shared Virtual Addressing library
+config IOMMU_SVA_LIB
+	bool
+	select IOASID
+
 config FSL_PAMU
 	bool "Freescale IOMMU support"
 	depends on PCI
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 342190196dfb0..0fe5a7f9bc69c 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -38,3 +38,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
+obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
new file mode 100644
index 0000000000000..b40990aef3fde
--- /dev/null
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SVA library for IOMMU drivers
+ */
+#ifndef _IOMMU_SVA_LIB_H
+#define _IOMMU_SVA_LIB_H
+
+#include <linux/ioasid.h>
+#include <linux/mm_types.h>
+
+int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
+void iommu_sva_free_pasid(struct mm_struct *mm);
+struct mm_struct *iommu_sva_find(ioasid_t pasid);
+
+#endif /* _IOMMU_SVA_LIB_H */
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
new file mode 100644
index 0000000000000..db7e6c104d6b0
--- /dev/null
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for IOMMU drivers implementing SVA
+ */
+#include <linux/mutex.h>
+#include <linux/sched/mm.h>
+
+#include "iommu-sva-lib.h"
+
+static DEFINE_MUTEX(iommu_sva_lock);
+static DECLARE_IOASID_SET(iommu_sva_pasid);
+
+/**
+ * iommu_sva_alloc_pasid - Allocate a PASID for the mm
+ * @mm: the mm
+ * @min: minimum PASID value (inclusive)
+ * @max: maximum PASID value (inclusive)
+ *
+ * Try to allocate a PASID for this mm, or take a reference to the existing one
+ * provided it fits within the [min, max] range. On success the PASID is
+ * available in mm->pasid, and must be released with iommu_sva_free_pasid().
+ *
+ * Returns 0 on success and < 0 on error.
+ */
+int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
+{
+	int ret = 0;
+	ioasid_t pasid;
+
+	if (min == INVALID_IOASID || max == INVALID_IOASID ||
+	    min == 0 || max < min)
+		return -EINVAL;
+
+	mutex_lock(&iommu_sva_lock);
+	if (mm->pasid) {
+		if (mm->pasid >= min && mm->pasid <= max)
+			ioasid_get(mm->pasid);
+		else
+			ret = -EOVERFLOW;
+	} else {
+		pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
+		if (pasid == INVALID_IOASID)
+			ret = -ENOMEM;
+		else
+			mm->pasid = pasid;
+	}
+	mutex_unlock(&iommu_sva_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
+
+/**
+ * iommu_sva_free_pasid - Release the mm's PASID
+ * @mm: the mm.
+ *
+ * Drop one reference to a PASID allocated with iommu_sva_alloc_pasid()
+ */
+void iommu_sva_free_pasid(struct mm_struct *mm)
+{
+	mutex_lock(&iommu_sva_lock);
+	if (ioasid_put(mm->pasid))
+		mm->pasid = 0;
+	mutex_unlock(&iommu_sva_lock);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
+
+/* ioasid wants a void * argument */
+static bool __mmget_not_zero(void *mm)
+{
+	return mmget_not_zero(mm);
+}
+
+/**
+ * iommu_sva_find() - Find mm associated to the given PASID
+ * @pasid: Process Address Space ID assigned to the mm
+ *
+ * On success a reference to the mm is taken, and must be released with mmput().
+ *
+ * Returns the mm corresponding to this PASID, or an error if not found.
+ */
+struct mm_struct *iommu_sva_find(ioasid_t pasid)
+{
+	return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_find);
-- 
2.27.0




* [PATCH v8 04/12] arm64: mm: Pin down ASIDs for sharing mm with devices
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (2 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 03/12] iommu/sva: Add PASID helpers Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-07-13 15:46   ` Will Deacon
  2020-06-18 15:51 ` [PATCH v8 05/12] iommu/io-pgtable-arm: Move some definitions to a header Jean-Philippe Brucker
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

To enable address space sharing with the IOMMU, introduce mm_context_get()
and mm_context_put(), that pin down a context and ensure that it will keep
its ASID after a rollover. Export the symbols to let the modular SMMUv3
driver use them.

Pinning is necessary because a device constantly needs a valid ASID,
unlike tasks that only require one when running. Without pinning, we would
need to notify the IOMMU when we're about to use a new ASID for a task,
and it would get complicated when a new task is assigned a shared ASID.
Consider the following scenario with no ASID pinned:

1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1)
2. Task t2 is scheduled on CPUx, gets ASID (1, 2)
3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1)
   We would now have to immediately generate a new ASID for t1, notify
   the IOMMU, and finally enable task tn. We are holding the lock during
   all that time, since we can't afford having another CPU trigger a
   rollover. The IOMMU issues invalidation commands that can take tens of
   milliseconds.

It gets needlessly complicated. All we wanted to do was schedule task tn,
which has no business with the IOMMU. By letting the IOMMU pin tasks when
needed, we avoid stalling the slow path, and let the pinning fail when
we're out of shareable ASIDs.

After a rollover, the allocator expects at least one ASID to be available
in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS -
1) is the maximum number of ASIDs that can be shared with the IOMMU.
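
An IOMMU driver would use the new interface roughly as follows (a
minimal sketch; struct my_bond and the surrounding functions are
hypothetical):

  #include <linux/mmu_context.h>

  struct my_bond {
          struct mm_struct        *mm;
          unsigned long           asid;
  };

  static int my_bond_mm(struct my_bond *bond, struct mm_struct *mm)
  {
          /* Pin the ASID so it survives rollover while the device uses it */
          bond->asid = mm_context_get(mm);
          if (!bond->asid)
                  return -ENOSPC; /* no shareable ASID left */

          bond->mm = mm;
          /* ... program the ASID and mm->pgd into the IOMMU context ... */
          return 0;
  }

  static void my_unbond_mm(struct my_bond *bond)
  {
          /* ... tear down the IOMMU context and invalidate its TLBs ... */
          mm_context_put(bond->mm); /* the ASID can now be recycled */
  }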

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 arch/arm64/include/asm/mmu.h         |  1 +
 arch/arm64/include/asm/mmu_context.h | 11 +++-
 arch/arm64/mm/context.c              | 95 +++++++++++++++++++++++++++-
 3 files changed, 104 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 68140fdd89d6b..bbdd291e31d59 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -19,6 +19,7 @@
 
 typedef struct {
 	atomic64_t	id;
+	unsigned long	pinned;
 	void		*vdso;
 	unsigned long	flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index b0bd9b55594c5..7b0e0f6cb7e87 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -177,7 +177,13 @@ static inline void cpu_replace_ttbr1(pgd_t *pgdp)
 #define destroy_context(mm)		do { } while(0)
 void check_and_switch_context(struct mm_struct *mm, unsigned int cpu);
 
-#define init_new_context(tsk,mm)	({ atomic64_set(&(mm)->context.id, 0); 0; })
+static inline int
+init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+	atomic64_set(&mm->context.id, 0);
+	mm->context.pinned = 0;
+	return 0;
+}
 
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 static inline void update_saved_ttbr0(struct task_struct *tsk,
@@ -250,6 +256,9 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
 void verify_cpu_asid_bits(void);
 void post_ttbr_update_workaround(void);
 
+unsigned long mm_context_get(struct mm_struct *mm);
+void mm_context_put(struct mm_struct *mm);
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* !__ASM_MMU_CONTEXT_H */
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index d702d60e64dab..d0ddd413f5645 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -27,6 +27,10 @@ static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
 static cpumask_t tlb_flush_pending;
 
+static unsigned long max_pinned_asids;
+static unsigned long nr_pinned_asids;
+static unsigned long *pinned_asid_map;
+
 #define ASID_MASK		(~GENMASK(asid_bits - 1, 0))
 #define ASID_FIRST_VERSION	(1UL << asid_bits)
 
@@ -74,6 +78,9 @@ void verify_cpu_asid_bits(void)
 
 static void set_kpti_asid_bits(void)
 {
+	unsigned int k;
+	u8 *dst = (u8 *)asid_map;
+	u8 *src = (u8 *)pinned_asid_map;
 	unsigned int len = BITS_TO_LONGS(NUM_USER_ASIDS) * sizeof(unsigned long);
 	/*
 	 * In case of KPTI kernel/user ASIDs are allocated in
@@ -81,7 +88,8 @@ static void set_kpti_asid_bits(void)
 	 * is set, then the ASID will map only userspace. Thus
 	 * mark even as reserved for kernel.
 	 */
-	memset(asid_map, 0xaa, len);
+	for (k = 0; k < len; k++)
+		dst[k] = src[k] | 0xaa;
 }
 
 static void set_reserved_asid_bits(void)
@@ -89,7 +97,7 @@ static void set_reserved_asid_bits(void)
 	if (arm64_kernel_unmapped_at_el0())
 		set_kpti_asid_bits();
 	else
-		bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
+		bitmap_copy(asid_map, pinned_asid_map, NUM_USER_ASIDS);
 }
 
 #define asid_gen_match(asid) \
@@ -165,6 +173,14 @@ static u64 new_context(struct mm_struct *mm)
 		if (check_update_reserved_asid(asid, newasid))
 			return newasid;
 
+		/*
+		 * If it is pinned, we can keep using it. Note that reserved
+		 * takes priority, because even if it is also pinned, we need to
+		 * update the generation into the reserved_asids.
+		 */
+		if (mm->context.pinned)
+			return newasid;
+
 		/*
 		 * We had a valid ASID in a previous life, so try to re-use
 		 * it if possible.
@@ -254,6 +270,68 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
 		cpu_switch_mm(mm->pgd, mm);
 }
 
+unsigned long mm_context_get(struct mm_struct *mm)
+{
+	unsigned long flags;
+	u64 asid;
+
+	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
+
+	asid = atomic64_read(&mm->context.id);
+
+	if (mm->context.pinned) {
+		mm->context.pinned++;
+		asid &= ~ASID_MASK;
+		goto out_unlock;
+	}
+
+	if (nr_pinned_asids >= max_pinned_asids) {
+		asid = 0;
+		goto out_unlock;
+	}
+
+	if (!asid_gen_match(asid)) {
+		/*
+		 * We went through one or more rollover since that ASID was
+		 * used. Ensure that it is still valid, or generate a new one.
+		 */
+		asid = new_context(mm);
+		atomic64_set(&mm->context.id, asid);
+	}
+
+	asid &= ~ASID_MASK;
+
+	nr_pinned_asids++;
+	__set_bit(asid2idx(asid), pinned_asid_map);
+	mm->context.pinned++;
+
+out_unlock:
+	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
+
+	/* Set the equivalent of USER_ASID_BIT */
+	if (asid && IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0))
+		asid |= 1;
+
+	return asid;
+}
+EXPORT_SYMBOL_GPL(mm_context_get);
+
+void mm_context_put(struct mm_struct *mm)
+{
+	unsigned long flags;
+	u64 asid = atomic64_read(&mm->context.id) & ~ASID_MASK;
+
+	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
+
+	if (--mm->context.pinned == 0) {
+		__clear_bit(asid2idx(asid), pinned_asid_map);
+		nr_pinned_asids--;
+	}
+
+	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
+}
+EXPORT_SYMBOL_GPL(mm_context_put);
+
 /* Errata workaround post TTBRx_EL1 update. */
 asmlinkage void post_ttbr_update_workaround(void)
 {
@@ -303,6 +381,13 @@ static int asids_update_limit(void)
 	WARN_ON(num_available_asids - 1 <= num_possible_cpus());
 	pr_info("ASID allocator initialised with %lu entries\n",
 		num_available_asids);
+
+	/*
+	 * We assume that an ASID is always available after a rollover. This
+	 * means that even if all CPUs have a reserved ASID, there still is at
+	 * least one slot available in the asid map.
+	 */
+	max_pinned_asids = num_available_asids - num_possible_cpus() - 2;
 	return 0;
 }
 arch_initcall(asids_update_limit);
@@ -317,6 +402,12 @@ static int asids_init(void)
 		panic("Failed to allocate bitmap for %lu ASIDs\n",
 		      NUM_USER_ASIDS);
 
+	pinned_asid_map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS),
+				  sizeof(*pinned_asid_map), GFP_KERNEL);
+	if (!pinned_asid_map)
+		panic("Failed to allocate pinned ASID bitmap\n");
+	nr_pinned_asids = 0;
+
 	/*
 	 * We cannot call set_reserved_asid_bits() here because CPU
 	 * caps are not finalized yet, so it is safer to assume KPTI
-- 
2.27.0




* [PATCH v8 05/12] iommu/io-pgtable-arm: Move some definitions to a header
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (3 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 04/12] arm64: mm: Pin down ASIDs for sharing mm with devices Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 06/12] arm64: cpufeature: Export symbol read_sanitised_ftr_reg() Jean-Philippe Brucker
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

Extract some of the most generic TCR defines, so they can be reused by
the page table sharing code.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/io-pgtable-arm.h | 30 ++++++++++++++++++++++++++++++
 drivers/iommu/io-pgtable-arm.c | 27 ++-------------------------
 MAINTAINERS                    |  3 +--
 3 files changed, 33 insertions(+), 27 deletions(-)
 create mode 100644 drivers/iommu/io-pgtable-arm.h

diff --git a/drivers/iommu/io-pgtable-arm.h b/drivers/iommu/io-pgtable-arm.h
new file mode 100644
index 0000000000000..ba7cfdf7afa03
--- /dev/null
+++ b/drivers/iommu/io-pgtable-arm.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef IO_PGTABLE_ARM_H_
+#define IO_PGTABLE_ARM_H_
+
+#define ARM_LPAE_TCR_TG0_4K		0
+#define ARM_LPAE_TCR_TG0_64K		1
+#define ARM_LPAE_TCR_TG0_16K		2
+
+#define ARM_LPAE_TCR_TG1_16K		1
+#define ARM_LPAE_TCR_TG1_4K		2
+#define ARM_LPAE_TCR_TG1_64K		3
+
+#define ARM_LPAE_TCR_SH_NS		0
+#define ARM_LPAE_TCR_SH_OS		2
+#define ARM_LPAE_TCR_SH_IS		3
+
+#define ARM_LPAE_TCR_RGN_NC		0
+#define ARM_LPAE_TCR_RGN_WBWA		1
+#define ARM_LPAE_TCR_RGN_WT		2
+#define ARM_LPAE_TCR_RGN_WB		3
+
+#define ARM_LPAE_TCR_PS_32_BIT		0x0ULL
+#define ARM_LPAE_TCR_PS_36_BIT		0x1ULL
+#define ARM_LPAE_TCR_PS_40_BIT		0x2ULL
+#define ARM_LPAE_TCR_PS_42_BIT		0x3ULL
+#define ARM_LPAE_TCR_PS_44_BIT		0x4ULL
+#define ARM_LPAE_TCR_PS_48_BIT		0x5ULL
+#define ARM_LPAE_TCR_PS_52_BIT		0x6ULL
+
+#endif /* IO_PGTABLE_ARM_H_ */
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 04fbd4bf0ff9f..f71a2eade04ab 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -20,6 +20,8 @@
 
 #include <asm/barrier.h>
 
+#include "io-pgtable-arm.h"
+
 #define ARM_LPAE_MAX_ADDR_BITS		52
 #define ARM_LPAE_S2_MAX_CONCAT_PAGES	16
 #define ARM_LPAE_MAX_LEVELS		4
@@ -100,23 +102,6 @@
 #define ARM_LPAE_PTE_MEMATTR_DEV	(((arm_lpae_iopte)0x1) << 2)
 
 /* Register bits */
-#define ARM_LPAE_TCR_TG0_4K		0
-#define ARM_LPAE_TCR_TG0_64K		1
-#define ARM_LPAE_TCR_TG0_16K		2
-
-#define ARM_LPAE_TCR_TG1_16K		1
-#define ARM_LPAE_TCR_TG1_4K		2
-#define ARM_LPAE_TCR_TG1_64K		3
-
-#define ARM_LPAE_TCR_SH_NS		0
-#define ARM_LPAE_TCR_SH_OS		2
-#define ARM_LPAE_TCR_SH_IS		3
-
-#define ARM_LPAE_TCR_RGN_NC		0
-#define ARM_LPAE_TCR_RGN_WBWA		1
-#define ARM_LPAE_TCR_RGN_WT		2
-#define ARM_LPAE_TCR_RGN_WB		3
-
 #define ARM_LPAE_VTCR_SL0_MASK		0x3
 
 #define ARM_LPAE_TCR_T0SZ_SHIFT		0
@@ -124,14 +109,6 @@
 #define ARM_LPAE_VTCR_PS_SHIFT		16
 #define ARM_LPAE_VTCR_PS_MASK		0x7
 
-#define ARM_LPAE_TCR_PS_32_BIT		0x0ULL
-#define ARM_LPAE_TCR_PS_36_BIT		0x1ULL
-#define ARM_LPAE_TCR_PS_40_BIT		0x2ULL
-#define ARM_LPAE_TCR_PS_42_BIT		0x3ULL
-#define ARM_LPAE_TCR_PS_44_BIT		0x4ULL
-#define ARM_LPAE_TCR_PS_48_BIT		0x5ULL
-#define ARM_LPAE_TCR_PS_52_BIT		0x6ULL
-
 #define ARM_LPAE_MAIR_ATTR_SHIFT(n)	((n) << 3)
 #define ARM_LPAE_MAIR_ATTR_MASK		0xff
 #define ARM_LPAE_MAIR_ATTR_DEVICE	0x04
diff --git a/MAINTAINERS b/MAINTAINERS
index 68f21d46614c4..2e3220273e4d5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1499,8 +1499,7 @@ L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Maintained
 F:	Documentation/devicetree/bindings/iommu/arm,smmu*
 F:	drivers/iommu/arm-smmu*
-F:	drivers/iommu/io-pgtable-arm-v7s.c
-F:	drivers/iommu/io-pgtable-arm.c
+F:	drivers/iommu/io-pgtable-arm*
 
 ARM SUB-ARCHITECTURES
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
-- 
2.27.0




* [PATCH v8 06/12] arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (4 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 05/12] iommu/io-pgtable-arm: Move some definitions to a header Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 07/12] iommu/arm-smmu-v3: Share process page tables Jean-Philippe Brucker
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker,
	Suzuki K Poulose

The SMMUv3 driver would like to read the MMFR0 PARANGE field in order to
share CPU page tables with devices. Allow the driver to be built as a
module by exporting the read_sanitised_ftr_reg() cpufeature symbol.

Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 arch/arm64/kernel/cpufeature.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 4ae41670c2e6b..ba1e17ad17447 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1068,6 +1068,7 @@ u64 read_sanitised_ftr_reg(u32 id)
 		return 0;
 	return regp->sys_val;
 }
+EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg);
 
 #define read_sysreg_case(r)	\
 	case r:		return read_sysreg_s(r)
-- 
2.27.0




* [PATCH v8 07/12] iommu/arm-smmu-v3: Share process page tables
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (5 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 06/12] arm64: cpufeature: Export symbol read_sanitised_ftr_reg() Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-07-13 20:22   ` Will Deacon
  2020-06-18 15:51 ` [PATCH v8 08/12] iommu/arm-smmu-v3: Seize private ASID Jean-Philippe Brucker
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker,
	Suzuki K Poulose

With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR,
MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split
into two sets, shared and private. Shared ASIDs correspond to those
obtained from the arch ASID allocator, and private ASIDs are used for
"classic" map/unmap DMA.

A possible conflict happens when trying to use a shared ASID that has
already been allocated for private use by the SMMU driver. This will be
addressed in a later patch by replacing the private ASID. At the
moment we return -EBUSY.

Each mm_struct shared with the SMMU will have a single context
descriptor. Add a refcount to keep track of this. It will be protected
by the global SVA lock.

Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/arm-smmu-v3.c | 150 +++++++++++++++++++++++++++++++++++-
 1 file changed, 146 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 937aa1af428d5..cabd942e4cbf3 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -22,6 +22,7 @@
 #include <linux/iommu.h>
 #include <linux/iopoll.h>
 #include <linux/module.h>
+#include <linux/mmu_context.h>
 #include <linux/msi.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
@@ -33,6 +34,8 @@
 
 #include <linux/amba/bus.h>
 
+#include "io-pgtable-arm.h"
+
 /* MMIO registers */
 #define ARM_SMMU_IDR0			0x0
 #define IDR0_ST_LVL			GENMASK(28, 27)
@@ -589,6 +592,9 @@ struct arm_smmu_ctx_desc {
 	u64				ttbr;
 	u64				tcr;
 	u64				mair;
+
+	refcount_t			refs;
+	struct mm_struct		*mm;
 };
 
 struct arm_smmu_l1_ctx_desc {
@@ -727,6 +733,7 @@ struct arm_smmu_option_prop {
 };
 
 static DEFINE_XARRAY_ALLOC1(asid_xa);
+static DEFINE_MUTEX(sva_lock);
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
@@ -1662,7 +1669,8 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
 #ifdef __BIG_ENDIAN
 			CTXDESC_CD_0_ENDI |
 #endif
-			CTXDESC_CD_0_R | CTXDESC_CD_0_A | CTXDESC_CD_0_ASET |
+			CTXDESC_CD_0_R | CTXDESC_CD_0_A |
+			(cd->mm ? 0 : CTXDESC_CD_0_ASET) |
 			CTXDESC_CD_0_AA64 |
 			FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
 			CTXDESC_CD_0_V;
@@ -1766,12 +1774,144 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain)
 	cdcfg->cdtab = NULL;
 }
 
-static void arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
+static void arm_smmu_init_cd(struct arm_smmu_ctx_desc *cd)
 {
+	refcount_set(&cd->refs, 1);
+}
+
+static bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
+{
+	bool free;
+	struct arm_smmu_ctx_desc *old_cd;
+
 	if (!cd->asid)
-		return;
+		return false;
+
+	free = refcount_dec_and_test(&cd->refs);
+	if (free) {
+		old_cd = xa_erase(&asid_xa, cd->asid);
+		WARN_ON(old_cd != cd);
+	}
+	return free;
+}
+
+static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid)
+{
+	struct arm_smmu_ctx_desc *cd;
 
-	xa_erase(&asid_xa, cd->asid);
+	cd = xa_load(&asid_xa, asid);
+	if (!cd)
+		return NULL;
+
+	if (cd->mm) {
+		/* All devices bound to this mm use the same cd struct. */
+		refcount_inc(&cd->refs);
+		return cd;
+	}
+
+	/* Ouch, ASID is already in use for a private cd. */
+	return ERR_PTR(-EBUSY);
+}
+
+__maybe_unused
+static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
+{
+	u16 asid;
+	int ret = 0;
+	u64 tcr, par, reg;
+	struct arm_smmu_ctx_desc *cd;
+	struct arm_smmu_ctx_desc *old_cd = NULL;
+
+	lockdep_assert_held(&sva_lock);
+
+	asid = mm_context_get(mm);
+	if (!asid)
+		return ERR_PTR(-ESRCH);
+
+	cd = kzalloc(sizeof(*cd), GFP_KERNEL);
+	if (!cd) {
+		ret = -ENOMEM;
+		goto err_put_context;
+	}
+
+	arm_smmu_init_cd(cd);
+
+	old_cd = arm_smmu_share_asid(asid);
+	if (IS_ERR(old_cd)) {
+		ret = PTR_ERR(old_cd);
+		goto err_free_cd;
+	} else if (old_cd) {
+		if (WARN_ON(old_cd->mm != mm)) {
+			ret = -EINVAL;
+			goto err_free_cd;
+		}
+		kfree(cd);
+		mm_context_put(mm);
+		return old_cd;
+	}
+
+	/* Fails if a private ASID has been allocated since we last checked */
+	ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL);
+	if (ret)
+		goto err_free_cd;
+
+	tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) |
+	      FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) |
+	      FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) |
+	      FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) |
+	      CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
+
+	switch (PAGE_SIZE) {
+	case SZ_4K:
+		tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_4K);
+		break;
+	case SZ_16K:
+		tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_16K);
+		break;
+	case SZ_64K:
+		tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_64K);
+		break;
+	default:
+		WARN_ON(1);
+		ret = -EINVAL;
+		goto err_free_asid;
+	}
+
+	reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	par = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_PARANGE_SHIFT);
+	tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_IPS, par);
+
+	cd->ttbr = virt_to_phys(mm->pgd);
+	cd->tcr = tcr;
+	/*
+	 * MAIR value is pretty much constant and global, so we can just get it
+	 * from the current CPU register
+	 */
+	cd->mair = read_sysreg(mair_el1);
+	cd->asid = asid;
+	cd->mm = mm;
+
+	return cd;
+
+err_free_asid:
+	arm_smmu_free_asid(cd);
+err_free_cd:
+	kfree(cd);
+err_put_context:
+	mm_context_put(mm);
+	return ERR_PTR(ret);
+}
+
+__maybe_unused
+static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd)
+{
+	lockdep_assert_held(&sva_lock);
+
+	if (arm_smmu_free_asid(cd)) {
+		/* Unpin ASID */
+		mm_context_put(cd->mm);
+		kfree(cd);
+	}
 }
 
 /* Stream table manipulation functions */
@@ -2481,6 +2621,8 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
 	typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;
 
+	arm_smmu_init_cd(&cfg->cd);
+
 	ret = xa_alloc(&asid_xa, &asid, &cfg->cd,
 		       XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
 	if (ret)
-- 
2.27.0




* [PATCH v8 08/12] iommu/arm-smmu-v3: Seize private ASID
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (6 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 07/12] iommu/arm-smmu-v3: Share process page tables Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-07-06 12:40   ` Xiang Zheng
  2020-06-18 15:51 ` [PATCH v8 09/12] iommu/arm-smmu-v3: Check for SVA features Jean-Philippe Brucker
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

The SMMU has a single ASID space, the union of shared and private ASID
sets. This means that the SMMU driver competes with the arch allocator
for ASIDs. Shared ASIDs are those of Linux processes, allocated by the
arch, and participate in broadcast TLB maintenance. Private ASIDs are
allocated by the SMMU driver and used for "classic" map/unmap DMA. They
require command-queue TLB invalidations.

When we pin down an mm_context and get an ASID that is already in use by
the SMMU, it belongs to a private context. We used to simply abort the
bind, but this is unfair to users, who would be unable to bind a few
seemingly random processes. Try to allocate a new private ASID for the
context, and make the old ASID shared.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/arm-smmu-v3.c | 101 +++++++++++++++++++++++++++++-------
 1 file changed, 82 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index cabd942e4cbf3..5506add42c9c8 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -733,6 +733,7 @@ struct arm_smmu_option_prop {
 };
 
 static DEFINE_XARRAY_ALLOC1(asid_xa);
+static DEFINE_MUTEX(asid_lock);
 static DEFINE_MUTEX(sva_lock);
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
@@ -1537,6 +1538,17 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
 }
 
 /* Context descriptor manipulation functions */
+static void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
+{
+	struct arm_smmu_cmdq_ent cmd = {
+		.opcode = CMDQ_OP_TLBI_NH_ASID,
+		.tlbi.asid = asid,
+	};
+
+	arm_smmu_cmdq_issue_cmd(smmu, &cmd);
+	arm_smmu_cmdq_issue_sync(smmu);
+}
+
 static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain,
 			     int ssid, bool leaf)
 {
@@ -1795,9 +1807,18 @@ static bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
 	return free;
 }
 
+/*
+ * Try to reserve this ASID in the SMMU. If it is in use, try to steal it from
+ * the private entry. Careful here, we may be modifying the context tables of
+ * another SMMU!
+ */
 static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid)
 {
+	int ret;
+	u32 new_asid;
 	struct arm_smmu_ctx_desc *cd;
+	struct arm_smmu_device *smmu;
+	struct arm_smmu_domain *smmu_domain;
 
 	cd = xa_load(&asid_xa, asid);
 	if (!cd)
@@ -1809,8 +1830,31 @@ static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid)
 		return cd;
 	}
 
-	/* Ouch, ASID is already in use for a private cd. */
-	return ERR_PTR(-EBUSY);
+	smmu_domain = container_of(cd, struct arm_smmu_domain, s1_cfg.cd);
+	smmu = smmu_domain->smmu;
+
+	ret = xa_alloc(&asid_xa, &new_asid, cd,
+		       XA_LIMIT(1, 1 << smmu->asid_bits), GFP_KERNEL);
+	if (ret)
+		return ERR_PTR(-ENOSPC);
+	/*
+	 * Race with unmap: TLB invalidations will start targeting the new ASID,
+	 * which isn't assigned yet. We'll do an invalidate-all on the old ASID
+	 * later, so it doesn't matter.
+	 */
+	cd->asid = new_asid;
+
+	/*
+	 * Update ASID and invalidate CD in all associated masters. There will
+	 * be some overlap between use of both ASIDs, until we invalidate the
+	 * TLB.
+	 */
+	arm_smmu_write_ctx_desc(smmu_domain, 0, cd);
+
+	/* Invalidate TLB entries previously associated with that context */
+	arm_smmu_tlb_inv_asid(smmu, asid);
+
+	return NULL;
 }
 
 __maybe_unused
@@ -1836,7 +1880,20 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
 
 	arm_smmu_init_cd(cd);
 
+	/*
+	 * Serialize against arm_smmu_domain_finalise_s1() and
+	 * arm_smmu_domain_free() as we might need to replace the private ASID
+	 * from an existing CD.
+	 */
+	mutex_lock(&asid_lock);
 	old_cd = arm_smmu_share_asid(asid);
+	if (!old_cd) {
+		ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL);
+		if (ret)
+			old_cd = ERR_PTR(ret);
+	}
+	mutex_unlock(&asid_lock);
+
 	if (IS_ERR(old_cd)) {
 		ret = PTR_ERR(old_cd);
 		goto err_free_cd;
@@ -1850,11 +1907,6 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
 		return old_cd;
 	}
 
-	/* Fails if a private ASID has been allocated since we last checked */
-	ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL);
-	if (ret)
-		goto err_free_cd;
-
 	tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) |
 	      FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) |
 	      FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) |
@@ -2398,15 +2450,6 @@ static void arm_smmu_tlb_inv_context(void *cookie)
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cmdq_ent cmd;
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
-		cmd.opcode	= CMDQ_OP_TLBI_NH_ASID;
-		cmd.tlbi.asid	= smmu_domain->s1_cfg.cd.asid;
-		cmd.tlbi.vmid	= 0;
-	} else {
-		cmd.opcode	= CMDQ_OP_TLBI_S12_VMALL;
-		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
-	}
-
 	/*
 	 * NOTE: when io-pgtable is in non-strict mode, we may get here with
 	 * PTEs previously cleared by unmaps on the current CPU not yet visible
@@ -2414,8 +2457,14 @@ static void arm_smmu_tlb_inv_context(void *cookie)
 	 * insertion to guarantee those are observed before the TLBI. Do be
 	 * careful, 007.
 	 */
-	arm_smmu_cmdq_issue_cmd(smmu, &cmd);
-	arm_smmu_cmdq_issue_sync(smmu);
+	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+		arm_smmu_tlb_inv_asid(smmu, smmu_domain->s1_cfg.cd.asid);
+	} else {
+		cmd.opcode	= CMDQ_OP_TLBI_S12_VMALL;
+		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
+		arm_smmu_cmdq_issue_cmd(smmu, &cmd);
+		arm_smmu_cmdq_issue_sync(smmu);
+	}
 	arm_smmu_atc_inv_domain(smmu_domain, 0, 0, 0);
 }
 
@@ -2599,9 +2648,15 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
 		struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
 
+		/*
+		 * Prevent arm_smmu_share_asid() from rewriting CD#0 while we're
+		 * freeing it.
+		 */
+		mutex_lock(&asid_lock);
 		if (cfg->cdcfg.cdtab)
 			arm_smmu_free_cd_tables(smmu_domain);
 		arm_smmu_free_asid(&cfg->cd);
+		mutex_unlock(&asid_lock);
 	} else {
 		struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
 		if (cfg->vmid)
@@ -2623,10 +2678,15 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 
 	arm_smmu_init_cd(&cfg->cd);
 
+	/*
+	 * Prevent arm_smmu_share_asid() from seizing the private ASID we're
+	 * allocating here until it is written to the CD.
+	 */
+	mutex_lock(&asid_lock);
 	ret = xa_alloc(&asid_xa, &asid, &cfg->cd,
 		       XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
 	if (ret)
-		return ret;
+		goto out_unlock;
 
 	cfg->s1cdmax = master->ssid_bits;
 
@@ -2654,12 +2714,15 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	if (ret)
 		goto out_free_cd_tables;
 
+	mutex_unlock(&asid_lock);
 	return 0;
 
 out_free_cd_tables:
 	arm_smmu_free_cd_tables(smmu_domain);
 out_free_asid:
 	arm_smmu_free_asid(&cfg->cd);
+out_unlock:
+	mutex_unlock(&asid_lock);
 	return ret;
 }
 
-- 
2.27.0




* [PATCH v8 09/12] iommu/arm-smmu-v3: Check for SVA features
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (7 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 08/12] iommu/arm-smmu-v3: Seize private ASID Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 10/12] iommu/arm-smmu-v3: Add SVA device feature Jean-Philippe Brucker
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker,
	Suzuki K Poulose

Aggregate all sanity-checks for sharing CPU page tables with the SMMU
under a single ARM_SMMU_FEAT_SVA bit. For PCIe SVA, users also need to
check FEAT_ATS and FEAT_PRI. For platform SVA, they will have to check
FEAT_STALLS.

Introduce ARM_SMMU_FEAT_BTM (Broadcast TLB Maintenance), but don't
enable it at the moment. Since the entire VMID space is shared with the
CPU, enabling DVM (by clearing SMMU_CR2.PTM) could result in
over-invalidation and affect performance of stage-2 mappings.

Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
v7->v8: Use id_aa64mmfr0_parange_to_phys_shift()
---
 drivers/iommu/arm-smmu-v3.c | 48 +++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5506add42c9c8..e2d5171bfb7b9 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -654,6 +654,8 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_STALL_FORCE	(1 << 13)
 #define ARM_SMMU_FEAT_VAX		(1 << 14)
 #define ARM_SMMU_FEAT_RANGE_INV		(1 << 15)
+#define ARM_SMMU_FEAT_BTM		(1 << 16)
+#define ARM_SMMU_FEAT_SVA		(1 << 17)
 	u32				features;
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH	(1 << 0)
@@ -3894,6 +3896,49 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 	return 0;
 }
 
+static bool arm_smmu_supports_sva(struct arm_smmu_device *smmu)
+{
+	unsigned long reg, fld;
+	unsigned long oas;
+	unsigned long asid_bits;
+
+	u32 feat_mask = ARM_SMMU_FEAT_BTM | ARM_SMMU_FEAT_COHERENCY;
+
+	if ((smmu->features & feat_mask) != feat_mask)
+		return false;
+
+	if (!(smmu->pgsize_bitmap & PAGE_SIZE))
+		return false;
+
+	/*
+	 * Get the smallest PA size of all CPUs (sanitized by cpufeature). We're
+	 * not even pretending to support AArch32 here. Abort if the MMU outputs
+	 * addresses larger than what we support.
+	 */
+	reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	fld = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_PARANGE_SHIFT);
+	oas = id_aa64mmfr0_parange_to_phys_shift(fld);
+	if (smmu->oas < oas)
+		return false;
+
+	/* We can support bigger ASIDs than the CPU, but not smaller */
+	fld = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_ASID_SHIFT);
+	asid_bits = fld ? 16 : 8;
+	if (smmu->asid_bits < asid_bits)
+		return false;
+
+	/*
+	 * See max_pinned_asids in arch/arm64/mm/context.c. The following is
+	 * generally the maximum number of bindable processes.
+	 */
+	if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0))
+		asid_bits--;
+	dev_dbg(smmu->dev, "%d shared contexts\n", (1 << asid_bits) -
+		num_possible_cpus() - 2);
+
+	return true;
+}
+
 static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 {
 	u32 reg;
@@ -4093,6 +4138,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 
 	smmu->ias = max(smmu->ias, smmu->oas);
 
+	if (arm_smmu_supports_sva(smmu))
+		smmu->features |= ARM_SMMU_FEAT_SVA;
+
 	dev_info(smmu->dev, "ias %lu-bit, oas %lu-bit (features 0x%08x)\n",
 		 smmu->ias, smmu->oas, smmu->features);
 	return 0;
-- 
2.27.0




* [PATCH v8 10/12] iommu/arm-smmu-v3: Add SVA device feature
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (8 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 09/12] iommu/arm-smmu-v3: Check for SVA features Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 11/12] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind() Jean-Philippe Brucker
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

Implement the IOMMU device feature callbacks to support the SVA feature.
At the moment dev_has_feat() returns false since I/O Page Fault support
isn't implemented yet.
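
From a device driver's point of view, these callbacks sit behind the
generic IOMMU API and would be used roughly like this (sketch only;
my_probe() is hypothetical):

  #include <linux/iommu.h>

  static int my_probe(struct device *dev)
  {
          int ret;

          if (!iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_SVA))
                  return -ENODEV;

          /* Ends up in the new arm_smmu_dev_enable_feature() callback */
          ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
          if (ret)
                  return ret;

          /* The device may now bind address spaces (see the next patch) */
          return 0;
  }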

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/arm-smmu-v3.c | 124 ++++++++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index e2d5171bfb7b9..d357c9f77bf7f 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -698,6 +698,8 @@ struct arm_smmu_master {
 	u32				*sids;
 	unsigned int			num_sids;
 	bool				ats_enabled;
+	bool				sva_enabled;
+	struct list_head		bonds;
 	unsigned int			ssid_bits;
 };
 
@@ -3000,6 +3002,19 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	master = dev_iommu_priv_get(dev);
 	smmu = master->smmu;
 
+	/*
+	 * Checking that SVA is disabled ensures that this device isn't bound to
+	 * any mm, and can be safely detached from its old domain. Bonds cannot
+	 * be removed concurrently since we're holding the group mutex.
+	 */
+	mutex_lock(&sva_lock);
+	if (master->sva_enabled) {
+		mutex_unlock(&sva_lock);
+		dev_err(dev, "cannot attach - SVA enabled\n");
+		return -EBUSY;
+	}
+	mutex_unlock(&sva_lock);
+
 	arm_smmu_detach_dev(master);
 
 	mutex_lock(&smmu_domain->init_mutex);
@@ -3147,6 +3162,7 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 	master->smmu = smmu;
 	master->sids = fwspec->ids;
 	master->num_sids = fwspec->num_ids;
+	INIT_LIST_HEAD(&master->bonds);
 	dev_iommu_priv_set(dev, master);
 
 	/* Check the SIDs are in range of the SMMU and our stream table */
@@ -3199,6 +3215,7 @@ static void arm_smmu_release_device(struct device *dev)
 		return;
 
 	master = dev_iommu_priv_get(dev);
+	WARN_ON(master->sva_enabled);
 	arm_smmu_detach_dev(master);
 	arm_smmu_disable_pasid(master);
 	kfree(master);
@@ -3316,6 +3333,109 @@ static void arm_smmu_get_resv_regions(struct device *dev,
 	iommu_dma_get_resv_regions(dev, head);
 }
 
+static bool arm_smmu_iopf_supported(struct arm_smmu_master *master)
+{
+	return false;
+}
+
+static bool arm_smmu_dev_has_feature(struct device *dev,
+				     enum iommu_dev_features feat)
+{
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+	if (!master)
+		return false;
+
+	switch (feat) {
+	case IOMMU_DEV_FEAT_SVA:
+		if (!(master->smmu->features & ARM_SMMU_FEAT_SVA))
+			return false;
+
+		/* SSID and IOPF support are mandatory for the moment */
+		return master->ssid_bits && arm_smmu_iopf_supported(master);
+	default:
+		return false;
+	}
+}
+
+static bool arm_smmu_dev_feature_enabled(struct device *dev,
+					 enum iommu_dev_features feat)
+{
+	bool enabled = false;
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+	if (!master)
+		return false;
+
+	switch (feat) {
+	case IOMMU_DEV_FEAT_SVA:
+		mutex_lock(&sva_lock);
+		enabled = master->sva_enabled;
+		mutex_unlock(&sva_lock);
+		return enabled;
+	default:
+		return false;
+	}
+}
+
+static int arm_smmu_dev_enable_sva(struct device *dev)
+{
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+	mutex_lock(&sva_lock);
+	master->sva_enabled = true;
+	mutex_unlock(&sva_lock);
+
+	return 0;
+}
+
+static int arm_smmu_dev_disable_sva(struct device *dev)
+{
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+	mutex_lock(&sva_lock);
+	if (!list_empty(&master->bonds)) {
+		dev_err(dev, "cannot disable SVA, device is bound\n");
+		mutex_unlock(&sva_lock);
+		return -EBUSY;
+	}
+	master->sva_enabled = false;
+	mutex_unlock(&sva_lock);
+
+	return 0;
+}
+
+static int arm_smmu_dev_enable_feature(struct device *dev,
+				       enum iommu_dev_features feat)
+{
+	if (!arm_smmu_dev_has_feature(dev, feat))
+		return -ENODEV;
+
+	if (arm_smmu_dev_feature_enabled(dev, feat))
+		return -EBUSY;
+
+	switch (feat) {
+	case IOMMU_DEV_FEAT_SVA:
+		return arm_smmu_dev_enable_sva(dev);
+	default:
+		return -EINVAL;
+	}
+}
+
+static int arm_smmu_dev_disable_feature(struct device *dev,
+					enum iommu_dev_features feat)
+{
+	if (!arm_smmu_dev_feature_enabled(dev, feat))
+		return -EINVAL;
+
+	switch (feat) {
+	case IOMMU_DEV_FEAT_SVA:
+		return arm_smmu_dev_disable_sva(dev);
+	default:
+		return -EINVAL;
+	}
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
@@ -3334,6 +3454,10 @@ static struct iommu_ops arm_smmu_ops = {
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
+	.dev_has_feat		= arm_smmu_dev_has_feature,
+	.dev_feat_enabled	= arm_smmu_dev_feature_enabled,
+	.dev_enable_feat	= arm_smmu_dev_enable_feature,
+	.dev_disable_feat	= arm_smmu_dev_disable_feature,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v8 11/12] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (9 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 10/12] iommu/arm-smmu-v3: Add SVA device feature Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-06-18 15:51 ` [PATCH v8 12/12] iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops Jean-Philippe Brucker
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

The sva_bind() function allows devices to access process address spaces
using a PASID (aka SSID).
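
For reference, a rough sketch of the driver-side usage through the generic
IOMMU API (not part of this patch; the function name and error handling
below are illustrative only):

static int example_bind_current_mm(struct device *dev)
{
	struct iommu_sva *handle;
	int ret, pasid;

	/* SVA must first be enabled on the device (previous patch) */
	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
	if (ret)
		return ret;

	/* Share the current process address space; dispatches to arm_smmu_sva_bind() */
	handle = iommu_sva_bind_device(dev, current->mm, NULL);
	if (IS_ERR(handle)) {
		iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
		return PTR_ERR(handle);
	}

	/* Program this PASID into the device for its DMA transactions */
	pasid = iommu_sva_get_pasid(handle);
	dev_dbg(dev, "bound to %s with PASID %d\n", current->comm, pasid);

	/* ... issue DMA tagged with the PASID, then stop all DMA ... */

	iommu_sva_unbind_device(handle);
	iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
	return 0;
}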

(1) bind() allocates or gets an existing MMU notifier tied to the
    (domain, mm) pair. Each mm gets one PASID.

(2) Any change to the address space calls invalidate_range() which sends
    ATC invalidations (in a subsequent patch).

(3) When the process address space dies, the release() notifier disables
    the CD to allow reclaiming the page tables. Since release() has to
    be light we do not instruct device drivers to stop DMA here, we just
    ignore incoming page faults from this point onwards.

    To avoid any event 0x0a print (C_BAD_CD) we disable translation
    without clearing CD.V. PCIe Translation Requests and Page Requests
    are silently denied. Don't clear the R bit because the S bit can't
    be cleared when STALL_MODEL==0b10 (forced), and clearing R without
    clearing S is useless. Faulting transactions will stall and will be
    aborted by the IOPF handler.

(4) After stopping DMA, the device driver releases the bond by calling
    unbind(). We release the MMU notifier, free the PASID and the bond.

Three structures keep track of bonds:
* arm_smmu_bond: one per {device, mm} pair, the handle returned to the
  device driver for a bind() request.
* arm_smmu_mmu_notifier: one per {domain, mm} pair, deals with ATS/TLB
  invalidations and clearing the context descriptor on mm exit.
* arm_smmu_ctx_desc: one per mm, holds the pinned ASID and pgd.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/Kconfig       |   2 +
 drivers/iommu/arm-smmu-v3.c | 263 +++++++++++++++++++++++++++++++++++-
 2 files changed, 260 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 74a10e7a8d082..d1ad6f63d2d42 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -437,8 +437,10 @@ config ARM_SMMU_V3
 	tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
 	depends on ARM64
 	select IOMMU_API
+	select IOMMU_SVA_LIB
 	select IOMMU_IO_PGTABLE_LPAE
 	select GENERIC_MSI_IRQ_DOMAIN
+	select MMU_NOTIFIER
 	help
 	  Support for implementations of the ARM System MMU architecture
 	  version 3 providing translation support to a PCIe root complex.
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d357c9f77bf7f..af551f3c78a78 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -23,6 +23,7 @@
 #include <linux/iopoll.h>
 #include <linux/module.h>
 #include <linux/mmu_context.h>
+#include <linux/mmu_notifier.h>
 #include <linux/msi.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
@@ -35,6 +36,7 @@
 #include <linux/amba/bus.h>
 
 #include "io-pgtable-arm.h"
+#include "iommu-sva-lib.h"
 
 /* MMIO registers */
 #define ARM_SMMU_IDR0			0x0
@@ -729,8 +731,32 @@ struct arm_smmu_domain {
 
 	struct list_head		devices;
 	spinlock_t			devices_lock;
+
+	struct list_head		mmu_notifiers;
+};
+
+struct arm_smmu_mmu_notifier {
+	struct mmu_notifier		mn;
+	struct arm_smmu_ctx_desc	*cd;
+	bool				cleared;
+	refcount_t			refs;
+	struct list_head		list;
+	struct arm_smmu_domain		*domain;
 };
 
+#define mn_to_smmu(mn) container_of(mn, struct arm_smmu_mmu_notifier, mn)
+
+struct arm_smmu_bond {
+	struct iommu_sva		sva;
+	struct mm_struct		*mm;
+	struct arm_smmu_mmu_notifier	*smmu_mn;
+	struct list_head		list;
+	refcount_t			refs;
+};
+
+#define sva_to_bond(handle) \
+	container_of(handle, struct arm_smmu_bond, sva)
+
 struct arm_smmu_option_prop {
 	u32 opt;
 	const char *prop;
@@ -740,6 +766,13 @@ static DEFINE_XARRAY_ALLOC1(asid_xa);
 static DEFINE_MUTEX(asid_lock);
 static DEFINE_MUTEX(sva_lock);
 
+/*
+ * When a process dies, DMA is still running but we need to clear the pgd. If we
+ * simply cleared the valid bit from the context descriptor, we'd get events 0x0a
+ * (C_BAD_CD), which are not recoverable.
+ */
+static struct arm_smmu_ctx_desc invalid_cd = { 0 };
+
 static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
 	{ ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"},
@@ -1643,7 +1676,9 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
 	 * (2) Install a secondary CD, for SID+SSID traffic.
 	 * (3) Update ASID of a CD. Atomically write the first 64 bits of the
 	 *     CD, then invalidate the old entry and mappings.
-	 * (4) Remove a secondary CD.
+	 * (4) Quiesce the context without clearing the valid bit. Disable
+	 *     translation, and ignore any translation fault.
+	 * (5) Remove a secondary CD.
 	 */
 	u64 val;
 	bool cd_live;
@@ -1660,8 +1695,10 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
 	val = le64_to_cpu(cdptr[0]);
 	cd_live = !!(val & CTXDESC_CD_0_V);
 
-	if (!cd) { /* (4) */
+	if (!cd) { /* (5) */
 		val = 0;
+	} else if (cd == &invalid_cd) { /* (4) */
+		val |= CTXDESC_CD_0_TCR_EPD0;
 	} else if (cd_live) { /* (3) */
 		val &= ~CTXDESC_CD_0_ASID;
 		val |= FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid);
@@ -1861,7 +1898,6 @@ static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid)
 	return NULL;
 }
 
-__maybe_unused
 static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
 {
 	u16 asid;
@@ -1958,7 +1994,6 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
 	return ERR_PTR(ret);
 }
 
-__maybe_unused
 static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd)
 {
 	lockdep_assert_held(&sva_lock);
@@ -2591,6 +2626,8 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 	}
 }
 
+static struct mmu_notifier_ops arm_smmu_mmu_notifier_ops;
+
 static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 {
 	struct arm_smmu_domain *smmu_domain;
@@ -2618,6 +2655,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 	mutex_init(&smmu_domain->init_mutex);
 	INIT_LIST_HEAD(&smmu_domain->devices);
 	spin_lock_init(&smmu_domain->devices_lock);
+	INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
 
 	return &smmu_domain->domain;
 }
@@ -3114,6 +3152,207 @@ arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 	return ops->iova_to_phys(ops, iova);
 }
 
+static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
+{
+	struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
+	struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+
+	mutex_lock(&sva_lock);
+	if (smmu_mn->cleared) {
+		mutex_unlock(&sva_lock);
+		return;
+	}
+
+	/*
+	 * DMA may still be running. Keep the cd valid to avoid C_BAD_CD events,
+	 * but disable translation.
+	 */
+	arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, &invalid_cd);
+
+	arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
+
+	smmu_mn->cleared = true;
+	mutex_unlock(&sva_lock);
+}
+
+static void arm_smmu_mmu_notifier_free(struct mmu_notifier *mn)
+{
+	kfree(mn_to_smmu(mn));
+}
+
+static struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
+	.release		= arm_smmu_mm_release,
+	.free_notifier		= arm_smmu_mmu_notifier_free,
+};
+
+/* Allocate or get existing MMU notifier for this {domain, mm} pair */
+static struct arm_smmu_mmu_notifier *
+arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain,
+			  struct mm_struct *mm)
+{
+	int ret;
+	struct arm_smmu_ctx_desc *cd;
+	struct arm_smmu_mmu_notifier *smmu_mn;
+
+	lockdep_assert_held(&sva_lock);
+
+	list_for_each_entry(smmu_mn, &smmu_domain->mmu_notifiers, list) {
+		if (smmu_mn->mn.mm == mm) {
+			refcount_inc(&smmu_mn->refs);
+			return smmu_mn;
+		}
+	}
+
+	cd = arm_smmu_alloc_shared_cd(mm);
+	if (IS_ERR(cd))
+		return ERR_CAST(cd);
+
+	smmu_mn = kzalloc(sizeof(*smmu_mn), GFP_KERNEL);
+	if (!smmu_mn) {
+		ret = -ENOMEM;
+		goto err_free_cd;
+	}
+
+	refcount_set(&smmu_mn->refs, 1);
+	smmu_mn->cd = cd;
+	smmu_mn->domain = smmu_domain;
+	smmu_mn->mn.ops = &arm_smmu_mmu_notifier_ops;
+
+	ret = mmu_notifier_register(&smmu_mn->mn, mm);
+	if (ret) {
+		kfree(smmu_mn);
+		goto err_free_cd;
+	}
+
+	ret = arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, cd);
+	if (ret)
+		goto err_put_notifier;
+
+	list_add(&smmu_mn->list, &smmu_domain->mmu_notifiers);
+	return smmu_mn;
+
+err_put_notifier:
+	/* Frees smmu_mn */
+	mmu_notifier_put(&smmu_mn->mn);
+err_free_cd:
+	arm_smmu_free_shared_cd(cd);
+	return ERR_PTR(ret);
+}
+
+static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
+{
+	struct mm_struct *mm = smmu_mn->mn.mm;
+	struct arm_smmu_ctx_desc *cd = smmu_mn->cd;
+	struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+
+	lockdep_assert_held(&sva_lock);
+
+	if (!refcount_dec_and_test(&smmu_mn->refs))
+		return;
+
+	list_del(&smmu_mn->list);
+	arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, NULL);
+
+	/*
+	 * If we went through clear(), we've already invalidated, and no
+	 * new TLB entry can have been formed.
+	 */
+	if (!smmu_mn->cleared)
+		arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
+
+	/* Frees smmu_mn */
+	mmu_notifier_put(&smmu_mn->mn);
+	arm_smmu_free_shared_cd(cd);
+}
+
+static struct iommu_sva *
+__arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
+{
+	int ret;
+	struct arm_smmu_bond *bond;
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	lockdep_assert_held(&sva_lock);
+
+	if (!master || !master->sva_enabled)
+		return ERR_PTR(-ENODEV);
+
+	/* If bind() was already called for this {dev, mm} pair, reuse it. */
+	list_for_each_entry(bond, &master->bonds, list) {
+		if (bond->mm == mm) {
+			refcount_inc(&bond->refs);
+			return &bond->sva;
+		}
+	}
+
+	bond = kzalloc(sizeof(*bond), GFP_KERNEL);
+	if (!bond)
+		return ERR_PTR(-ENOMEM);
+
+	/* Allocate a PASID for this mm if necessary */
+	ret = iommu_sva_alloc_pasid(mm, 1, (1U << master->ssid_bits) - 1);
+	if (ret)
+		goto err_free_bond;
+
+	bond->mm = mm;
+	bond->sva.dev = dev;
+	refcount_set(&bond->refs, 1);
+
+	bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm);
+	if (IS_ERR(bond->smmu_mn)) {
+		ret = PTR_ERR(bond->smmu_mn);
+		goto err_free_pasid;
+	}
+
+	list_add(&bond->list, &master->bonds);
+	return &bond->sva;
+
+err_free_pasid:
+	iommu_sva_free_pasid(mm);
+err_free_bond:
+	kfree(bond);
+	return ERR_PTR(ret);
+}
+
+static struct iommu_sva *
+arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
+{
+	struct iommu_sva *handle;
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+		return ERR_PTR(-EINVAL);
+
+	mutex_lock(&sva_lock);
+	handle = __arm_smmu_sva_bind(dev, mm);
+	mutex_unlock(&sva_lock);
+	return handle;
+}
+
+static void arm_smmu_sva_unbind(struct iommu_sva *handle)
+{
+	struct arm_smmu_bond *bond = sva_to_bond(handle);
+
+	mutex_lock(&sva_lock);
+	if (refcount_dec_and_test(&bond->refs)) {
+		list_del(&bond->list);
+		arm_smmu_mmu_notifier_put(bond->smmu_mn);
+		iommu_sva_free_pasid(bond->mm);
+		kfree(bond);
+	}
+	mutex_unlock(&sva_lock);
+}
+
+static int arm_smmu_sva_get_pasid(struct iommu_sva *handle)
+{
+	struct arm_smmu_bond *bond = sva_to_bond(handle);
+
+	return bond->mm->pasid;
+}
+
 static struct platform_driver arm_smmu_driver;
 
 static
@@ -3458,6 +3697,9 @@ static struct iommu_ops arm_smmu_ops = {
 	.dev_feat_enabled	= arm_smmu_dev_feature_enabled,
 	.dev_enable_feat	= arm_smmu_dev_enable_feature,
 	.dev_disable_feat	= arm_smmu_dev_disable_feature,
+	.sva_bind		= arm_smmu_sva_bind,
+	.sva_unbind		= arm_smmu_sva_unbind,
+	.sva_get_pasid		= arm_smmu_sva_get_pasid,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
@@ -4520,6 +4762,16 @@ static const struct of_device_id arm_smmu_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
 
+static void arm_smmu_driver_unregister(struct platform_driver *drv)
+{
+	/*
+	 * Wait for all notifiers free() RCU callbacks, since they are still
+	 * using the arm_smmu_mmu_notifier_ops.
+	 */
+	mmu_notifier_synchronize();
+	platform_driver_unregister(drv);
+}
+
 static struct platform_driver arm_smmu_driver = {
 	.driver	= {
 		.name			= "arm-smmu-v3",
@@ -4530,7 +4782,8 @@ static struct platform_driver arm_smmu_driver = {
 	.remove	= arm_smmu_device_remove,
 	.shutdown = arm_smmu_device_shutdown,
 };
-module_platform_driver(arm_smmu_driver);
+module_driver(arm_smmu_driver, platform_driver_register,
+	      arm_smmu_driver_unregister);
 
 MODULE_DESCRIPTION("IOMMU API for ARM architected SMMUv3 implementations");
 MODULE_AUTHOR("Will Deacon <will@kernel.org>");
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v8 12/12] iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (10 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 11/12] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind() Jean-Philippe Brucker
@ 2020-06-18 15:51 ` Jean-Philippe Brucker
  2020-07-09  9:39 ` [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
  2020-07-20 11:11 ` Will Deacon
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-06-18 15:51 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch, Jean-Philippe Brucker

The invalidate_range() notifier is called for any change to the address
space. Perform the required ATC invalidations.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/arm-smmu-v3.c | 36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index af551f3c78a78..972c061399fc9 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2378,6 +2378,20 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
 	size_t inval_grain_shift = 12;
 	unsigned long page_start, page_end;
 
+	/*
+	 * ATS and PASID:
+	 *
+	 * If substream_valid is clear, the PCIe TLP is sent without a PASID
+	 * prefix. In that case all ATC entries within the address range are
+	 * invalidated, including those that were requested with a PASID! There
+	 * is no way to invalidate only entries without PASID.
+	 *
+	 * When using STRTAB_STE_1_S1DSS_SSID0 (reserving CD 0 for non-PASID
+	 * traffic), translation requests without PASID create ATC entries
+	 * without PASID, which must be invalidated with substream_valid clear.
+	 * This has the unpleasant side-effect of invalidating all PASID-tagged
+	 * ATC entries within the address range.
+	 */
 	*cmd = (struct arm_smmu_cmdq_ent) {
 		.opcode			= CMDQ_OP_ATC_INV,
 		.substream_valid	= !!ssid,
@@ -2421,12 +2435,12 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
 	cmd->atc.size	= log2_span;
 }
 
-static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
+static int arm_smmu_atc_inv_master(struct arm_smmu_master *master, int ssid)
 {
 	int i;
 	struct arm_smmu_cmdq_ent cmd;
 
-	arm_smmu_atc_inv_to_cmd(0, 0, 0, &cmd);
+	arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd);
 
 	for (i = 0; i < master->num_sids; i++) {
 		cmd.atc.sid = master->sids[i];
@@ -2955,7 +2969,7 @@ static void arm_smmu_disable_ats(struct arm_smmu_master *master)
 	 * ATC invalidation via the SMMU.
 	 */
 	wmb();
-	arm_smmu_atc_inv_master(master);
+	arm_smmu_atc_inv_master(master, 0);
 	atomic_dec(&smmu_domain->nr_ats_masters);
 }
 
@@ -3152,6 +3166,16 @@ arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 	return ops->iova_to_phys(ops, iova);
 }
 
+static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
+					 struct mm_struct *mm,
+					 unsigned long start, unsigned long end)
+{
+	struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
+
+	arm_smmu_atc_inv_domain(smmu_mn->domain, mm->pasid, start,
+				end - start + 1);
+}
+
 static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
@@ -3170,6 +3194,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, &invalid_cd);
 
 	arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
+	arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0);
 
 	smmu_mn->cleared = true;
 	mutex_unlock(&sva_lock);
@@ -3181,6 +3206,7 @@ static void arm_smmu_mmu_notifier_free(struct mmu_notifier *mn)
 }
 
 static struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
+	.invalidate_range	= arm_smmu_mm_invalidate_range,
 	.release		= arm_smmu_mm_release,
 	.free_notifier		= arm_smmu_mmu_notifier_free,
 };
@@ -3257,8 +3283,10 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
 	 * If we went through clear(), we've already invalidated, and no
 	 * new TLB entry can have been formed.
 	 */
-	if (!smmu_mn->cleared)
+	if (!smmu_mn->cleared) {
 		arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
+		arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0);
+	}
 
 	/* Frees smmu_mn */
 	mmu_notifier_put(&smmu_mn->mn);
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 03/12] iommu/sva: Add PASID helpers
  2020-06-18 15:51 ` [PATCH v8 03/12] iommu/sva: Add PASID helpers Jean-Philippe Brucker
@ 2020-06-19  7:37   ` Lu Baolu
  0 siblings, 0 replies; 23+ messages in thread
From: Lu Baolu @ 2020-06-19  7:37 UTC (permalink / raw)
  To: Jean-Philippe Brucker, iommu, linux-arm-kernel, linux-mm
  Cc: baolu.lu, joro, catalin.marinas, will, robin.murphy,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch

Hi Jean,

On 2020/6/18 23:51, Jean-Philippe Brucker wrote:
> Let IOMMU drivers allocate a single PASID per mm. Store the mm in the
> IOASID set to allow refcounting and searching mm by PASID, when handling
> an I/O page fault.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>

Best regards,
baolu

> ---
> v7->v8: rename to IOMMU_SVA_LIB (Lu Baolu)
> ---
>   drivers/iommu/Kconfig         |  5 +++
>   drivers/iommu/Makefile        |  1 +
>   drivers/iommu/iommu-sva-lib.h | 15 +++++++
>   drivers/iommu/iommu-sva-lib.c | 85 +++++++++++++++++++++++++++++++++++
>   4 files changed, 106 insertions(+)
>   create mode 100644 drivers/iommu/iommu-sva-lib.h
>   create mode 100644 drivers/iommu/iommu-sva-lib.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index b510f67dfa499..74a10e7a8d082 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -102,6 +102,11 @@ config IOMMU_DMA
>   	select IRQ_MSI_IOMMU
>   	select NEED_SG_DMA_LENGTH
>   
> +# Shared Virtual Addressing library
> +config IOMMU_SVA_LIB
> +	bool
> +	select IOASID
> +
>   config FSL_PAMU
>   	bool "Freescale IOMMU support"
>   	depends on PCI
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 342190196dfb0..0fe5a7f9bc69c 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -38,3 +38,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
>   obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
>   obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
>   obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> +obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o
> diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
> new file mode 100644
> index 0000000000000..b40990aef3fde
> --- /dev/null
> +++ b/drivers/iommu/iommu-sva-lib.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * SVA library for IOMMU drivers
> + */
> +#ifndef _IOMMU_SVA_LIB_H
> +#define _IOMMU_SVA_LIB_H
> +
> +#include <linux/ioasid.h>
> +#include <linux/mm_types.h>
> +
> +int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
> +void iommu_sva_free_pasid(struct mm_struct *mm);
> +struct mm_struct *iommu_sva_find(ioasid_t pasid);
> +
> +#endif /* _IOMMU_SVA_LIB_H */
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> new file mode 100644
> index 0000000000000..db7e6c104d6b0
> --- /dev/null
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -0,0 +1,85 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Helpers for IOMMU drivers implementing SVA
> + */
> +#include <linux/mutex.h>
> +#include <linux/sched/mm.h>
> +
> +#include "iommu-sva-lib.h"
> +
> +static DEFINE_MUTEX(iommu_sva_lock);
> +static DECLARE_IOASID_SET(iommu_sva_pasid);
> +
> +/**
> + * iommu_sva_alloc_pasid - Allocate a PASID for the mm
> + * @mm: the mm
> + * @min: minimum PASID value (inclusive)
> + * @max: maximum PASID value (inclusive)
> + *
> + * Try to allocate a PASID for this mm, or take a reference to the existing one
> + * provided it fits within the [min, max] range. On success the PASID is
> + * available in mm->pasid, and must be released with iommu_sva_free_pasid().
> + *
> + * Returns 0 on success and < 0 on error.
> + */
> +int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
> +{
> +	int ret = 0;
> +	ioasid_t pasid;
> +
> +	if (min == INVALID_IOASID || max == INVALID_IOASID ||
> +	    min == 0 || max < min)
> +		return -EINVAL;
> +
> +	mutex_lock(&iommu_sva_lock);
> +	if (mm->pasid) {
> +		if (mm->pasid >= min && mm->pasid <= max)
> +			ioasid_get(mm->pasid);
> +		else
> +			ret = -EOVERFLOW;
> +	} else {
> +		pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
> +		if (pasid == INVALID_IOASID)
> +			ret = -ENOMEM;
> +		else
> +			mm->pasid = pasid;
> +	}
> +	mutex_unlock(&iommu_sva_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
> +
> +/**
> + * iommu_sva_free_pasid - Release the mm's PASID
> + * @mm: the mm.
> + *
> + * Drop one reference to a PASID allocated with iommu_sva_alloc_pasid()
> + */
> +void iommu_sva_free_pasid(struct mm_struct *mm)
> +{
> +	mutex_lock(&iommu_sva_lock);
> +	if (ioasid_put(mm->pasid))
> +		mm->pasid = 0;
> +	mutex_unlock(&iommu_sva_lock);
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
> +
> +/* ioasid wants a void * argument */
> +static bool __mmget_not_zero(void *mm)
> +{
> +	return mmget_not_zero(mm);
> +}
> +
> +/**
> + * iommu_sva_find() - Find mm associated to the given PASID
> + * @pasid: Process Address Space ID assigned to the mm
> + *
> + * On success a reference to the mm is taken, and must be released with mmput().
> + *
> + * Returns the mm corresponding to this PASID, or an error if not found.
> + */
> +struct mm_struct *iommu_sva_find(ioasid_t pasid)
> +{
> +	return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_find);
> 
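
As an aside, a rough sketch of how an IOMMU driver's I/O page fault path
could use these helpers to resolve a PASID back to the faulting address
space (hypothetical code, not part of the patch; handle_iopf() and the
fault-handling details are placeholders):

static int handle_iopf(ioasid_t pasid, unsigned long iova)
{
	struct mm_struct *mm;

	/* Resolve the PASID reported by the device back to an mm */
	mm = iommu_sva_find(pasid);
	if (IS_ERR_OR_NULL(mm))
		return -ENODEV;	/* unknown PASID, or the mm already exited */

	/* ... fault in the page covering iova, e.g. with handle_mm_fault() ... */

	/* Drop the reference taken by iommu_sva_find() */
	mmput(mm);
	return 0;
}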


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 08/12] iommu/arm-smmu-v3: Seize private ASID
  2020-06-18 15:51 ` [PATCH v8 08/12] iommu/arm-smmu-v3: Seize private ASID Jean-Philippe Brucker
@ 2020-07-06 12:40   ` Xiang Zheng
  2020-07-06 16:07     ` Jean-Philippe Brucker
  0 siblings, 1 reply; 23+ messages in thread
From: Xiang Zheng @ 2020-07-06 12:40 UTC (permalink / raw)
  To: Jean-Philippe Brucker, iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	fenghua.yu, hch, Wang Haibin

Hi Jean,

On 2020/6/18 23:51, Jean-Philippe Brucker wrote:
> The SMMU has a single ASID space, the union of shared and private ASID
> sets. This means that the SMMU driver competes with the arch allocator
> for ASIDs. Shared ASIDs are those of Linux processes, allocated by the
> arch, and contribute to broadcast TLB maintenance. Private ASIDs are
> allocated by the SMMU driver and used for "classic" map/unmap DMA. They
> require command-queue TLB invalidations.
> 
> When we pin down an mm_context and get an ASID that is already in use by
> the SMMU, it belongs to a private context. We used to simply abort the
> bind, but this is unfair to users that would be unable to bind a few
> seemingly random processes. Try to allocate a new private ASID for the
> context, and make the old ASID shared.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>  drivers/iommu/arm-smmu-v3.c | 101 +++++++++++++++++++++++++++++-------
>  1 file changed, 82 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index cabd942e4cbf3..5506add42c9c8 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -733,6 +733,7 @@ struct arm_smmu_option_prop {
>  };
>  
>  static DEFINE_XARRAY_ALLOC1(asid_xa);
> +static DEFINE_MUTEX(asid_lock);
>  static DEFINE_MUTEX(sva_lock);
>  
>  static struct arm_smmu_option_prop arm_smmu_options[] = {
> @@ -1537,6 +1538,17 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
>  }
>  
>  /* Context descriptor manipulation functions */
> +static void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
> +{
> +	struct arm_smmu_cmdq_ent cmd = {
> +		.opcode = CMDQ_OP_TLBI_NH_ASID,
> +		.tlbi.asid = asid,
> +	};
> +
> +	arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +	arm_smmu_cmdq_issue_sync(smmu);
> +}
> +
>  static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain,
>  			     int ssid, bool leaf)
>  {
> @@ -1795,9 +1807,18 @@ static bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
>  	return free;
>  }
>  
> +/*
> + * Try to reserve this ASID in the SMMU. If it is in use, try to steal it from
> + * the private entry. Careful here, we may be modifying the context tables of
> + * another SMMU!
> + */
>  static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid)
>  {
> +	int ret;
> +	u32 new_asid;
>  	struct arm_smmu_ctx_desc *cd;
> +	struct arm_smmu_device *smmu;
> +	struct arm_smmu_domain *smmu_domain;
>  
>  	cd = xa_load(&asid_xa, asid);
>  	if (!cd)
> @@ -1809,8 +1830,31 @@ static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid)
>  		return cd;
>  	}
>  
> -	/* Ouch, ASID is already in use for a private cd. */
> -	return ERR_PTR(-EBUSY);
> +	smmu_domain = container_of(cd, struct arm_smmu_domain, s1_cfg.cd);
> +	smmu = smmu_domain->smmu;
> +
> +	ret = xa_alloc(&asid_xa, &new_asid, cd,
> +		       XA_LIMIT(1, 1 << smmu->asid_bits), GFP_KERNEL);
> +	if (ret)
> +		return ERR_PTR(-ENOSPC);
> +	/*
> +	 * Race with unmap: TLB invalidations will start targeting the new ASID,
> +	 * which isn't assigned yet. We'll do an invalidate-all on the old ASID
> +	 * later, so it doesn't matter.
> +	 */
> +	cd->asid = new_asid;
> +
> +	/*
> +	 * Update ASID and invalidate CD in all associated masters. There will
> +	 * be some overlap between use of both ASIDs, until we invalidate the
> +	 * TLB.
> +	 */
> +	arm_smmu_write_ctx_desc(smmu_domain, 0, cd);
> +
> +	/* Invalidate TLB entries previously associated with that context */
> +	arm_smmu_tlb_inv_asid(smmu, asid);
> +
> +	return NULL;
>  }
>  
>  __maybe_unused
> @@ -1836,7 +1880,20 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
>  
>  	arm_smmu_init_cd(cd);
>  
> +	/*
> +	 * Serialize against arm_smmu_domain_finalise_s1() and
> +	 * arm_smmu_domain_free() as we might need to replace the private ASID
> +	 * from an existing CD.
> +	 */
> +	mutex_lock(&asid_lock);
>  	old_cd = arm_smmu_share_asid(asid);
> +	if (!old_cd) {
> +		ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL);

Should we use "xa_store" here? If "asid" has already been used for private, old_cd would be NULL and
the entry indexed by "asid" in the asid_xa remains.

> +		if (ret)
> +			old_cd = ERR_PTR(ret);
> +	}
> +	mutex_unlock(&asid_lock);
> +
>  	if (IS_ERR(old_cd)) {
>  		ret = PTR_ERR(old_cd);
>  		goto err_free_cd;
> @@ -1850,11 +1907,6 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
>  		return old_cd;
>  	}
>  
> -	/* Fails if a private ASID has been allocated since we last checked */
> -	ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL);
> -	if (ret)
> -		goto err_free_cd;
> -
>  	tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) |
>  	      FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) |
>  	      FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) |
> @@ -2398,15 +2450,6 @@ static void arm_smmu_tlb_inv_context(void *cookie)
>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
>  	struct arm_smmu_cmdq_ent cmd;
>  
> -	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> -		cmd.opcode	= CMDQ_OP_TLBI_NH_ASID;
> -		cmd.tlbi.asid	= smmu_domain->s1_cfg.cd.asid;
> -		cmd.tlbi.vmid	= 0;
> -	} else {
> -		cmd.opcode	= CMDQ_OP_TLBI_S12_VMALL;
> -		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
> -	}
> -
>  	/*
>  	 * NOTE: when io-pgtable is in non-strict mode, we may get here with
>  	 * PTEs previously cleared by unmaps on the current CPU not yet visible
> @@ -2414,8 +2457,14 @@ static void arm_smmu_tlb_inv_context(void *cookie)
>  	 * insertion to guarantee those are observed before the TLBI. Do be
>  	 * careful, 007.
>  	 */
> -	arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> -	arm_smmu_cmdq_issue_sync(smmu);
> +	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> +		arm_smmu_tlb_inv_asid(smmu, smmu_domain->s1_cfg.cd.asid);
> +	} else {
> +		cmd.opcode	= CMDQ_OP_TLBI_S12_VMALL;
> +		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
> +		arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +		arm_smmu_cmdq_issue_sync(smmu);
> +	}
>  	arm_smmu_atc_inv_domain(smmu_domain, 0, 0, 0);
>  }
>  
> @@ -2599,9 +2648,15 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
>  	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
>  		struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
>  
> +		/*
> +		 * Prevent arm_smmu_share_asid() from rewriting CD#0 while we're
> +		 * freeing it.
> +		 */
> +		mutex_lock(&asid_lock);
>  		if (cfg->cdcfg.cdtab)
>  			arm_smmu_free_cd_tables(smmu_domain);
>  		arm_smmu_free_asid(&cfg->cd);
> +		mutex_unlock(&asid_lock);
>  	} else {
>  		struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
>  		if (cfg->vmid)
> @@ -2623,10 +2678,15 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
>  
>  	arm_smmu_init_cd(&cfg->cd);
>  
> +	/*
> +	 * Prevent arm_smmu_share_asid() from seizing the private ASID we're
> +	 * allocating here until it is written to the CD.
> +	 */
> +	mutex_lock(&asid_lock);
>  	ret = xa_alloc(&asid_xa, &asid, &cfg->cd,
>  		       XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
>  	if (ret)
> -		return ret;
> +		goto out_unlock;
>  
>  	cfg->s1cdmax = master->ssid_bits;
>  
> @@ -2654,12 +2714,15 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
>  	if (ret)
>  		goto out_free_cd_tables;
>  
> +	mutex_unlock(&asid_lock);
>  	return 0;
>  
>  out_free_cd_tables:
>  	arm_smmu_free_cd_tables(smmu_domain);
>  out_free_asid:
>  	arm_smmu_free_asid(&cfg->cd);
> +out_unlock:
> +	mutex_unlock(&asid_lock);
>  	return ret;
>  }
>  
> 

-- 
Thanks,
Xiang



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 08/12] iommu/arm-smmu-v3: Seize private ASID
  2020-07-06 12:40   ` Xiang Zheng
@ 2020-07-06 16:07     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-07-06 16:07 UTC (permalink / raw)
  To: Xiang Zheng
  Cc: iommu, linux-arm-kernel, linux-mm, joro, catalin.marinas, will,
	robin.murphy, baolu.lu, Jonathan.Cameron, jacob.jun.pan,
	zhangfei.gao, xuzaibo, fenghua.yu, hch, Wang Haibin

Hi Xiang,

On Mon, Jul 06, 2020 at 08:40:27PM +0800, Xiang Zheng wrote:
> > @@ -1836,7 +1880,20 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
> >  
> >  	arm_smmu_init_cd(cd);
> >  
> > +	/*
> > +	 * Serialize against arm_smmu_domain_finalise_s1() and
> > +	 * arm_smmu_domain_free() as we might need to replace the private ASID
> > +	 * from an existing CD.
> > +	 */
> > +	mutex_lock(&asid_lock);
> >  	old_cd = arm_smmu_share_asid(asid);
> > +	if (!old_cd) {
> > +		ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL);
> 
> Should we use "xa_store" here? If "asid" has already been used for a private context, old_cd would be
> NULL but the entry indexed by "asid" in the asid_xa remains.

Great catch, that's a bug introduced in v7: arm_smmu_share_asid() allocates
a new ASID for the private context but does not remove the old entry. For
the fix, I think it looks clearer if arm_smmu_share_asid() erases the old
entry before returning.

Thanks,
Jean



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (11 preceding siblings ...)
  2020-06-18 15:51 ` [PATCH v8 12/12] iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops Jean-Philippe Brucker
@ 2020-07-09  9:39 ` Jean-Philippe Brucker
  2020-07-20 11:11 ` Will Deacon
  13 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-07-09  9:39 UTC (permalink / raw)
  To: iommu, linux-arm-kernel, linux-mm
  Cc: joro, catalin.marinas, will, robin.murphy, baolu.lu,
	Jonathan.Cameron, jacob.jun.pan, zhangfei.gao, xuzaibo,
	zhengxiang9, fenghua.yu, hch

Hi Will,

On Thu, Jun 18, 2020 at 05:51:13PM +0200, Jean-Philippe Brucker wrote:
> Since v7 [1], I split the series into three parts to ease review. This
> first one adds page table sharing to the SMMUv3 driver. The second one
> adds support for I/O page faults through PRI and Stall, and the last one
> adds additional and optional features (DVM, VHE and HTTU). SVA needs the
> three parts to work. No significant change apart from that, I just
> addressed the previous comments.
> 
> I'd rather everything went through the IOMMU tree but I'm assuming patch
> 1 will also go through the x86 tree as part of [2]. It is definitely
> required by patch 3 which is required by patch 11. I don't know how this
> kind of conflict is usually resolved, but if it's a problem I could
> further shrink the series to only patches 4-10 this cycle.

I have one bugfix for this series but am planning to hold off resending
until you've found time to have a look. I can also reduce it to 7 patches
for v5.9; please let me know what you prefer.

Thanks,
Jean

> 
> [1] https://lore.kernel.org/linux-iommu/20200519175502.2504091-1-jean-philippe@linaro.org/
> [2] https://lore.kernel.org/linux-iommu/1592418233-17762-1-git-send-email-fenghua.yu@intel.com/
> 
> Fenghua Yu (1):
>   mm: Define pasid in mm
> 
> Jean-Philippe Brucker (11):
>   iommu/ioasid: Add ioasid references
>   iommu/sva: Add PASID helpers
>   arm64: mm: Pin down ASIDs for sharing mm with devices
>   iommu/io-pgtable-arm: Move some definitions to a header
>   arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
>   iommu/arm-smmu-v3: Share process page tables
>   iommu/arm-smmu-v3: Seize private ASID
>   iommu/arm-smmu-v3: Check for SVA features
>   iommu/arm-smmu-v3: Add SVA device feature
>   iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()
>   iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops
> 
>  drivers/iommu/Kconfig                |   7 +
>  drivers/iommu/Makefile               |   1 +
>  arch/arm64/include/asm/mmu.h         |   1 +
>  arch/arm64/include/asm/mmu_context.h |  11 +-
>  drivers/iommu/io-pgtable-arm.h       |  30 ++
>  drivers/iommu/iommu-sva-lib.h        |  15 +
>  include/linux/ioasid.h               |  10 +-
>  include/linux/mm_types.h             |   4 +
>  arch/arm64/kernel/cpufeature.c       |   1 +
>  arch/arm64/mm/context.c              |  95 +++-
>  drivers/iommu/arm-smmu-v3.c          | 702 ++++++++++++++++++++++++++-
>  drivers/iommu/intel/iommu.c          |   4 +-
>  drivers/iommu/intel/svm.c            |   6 +-
>  drivers/iommu/io-pgtable-arm.c       |  27 +-
>  drivers/iommu/ioasid.c               |  38 +-
>  drivers/iommu/iommu-sva-lib.c        |  85 ++++
>  MAINTAINERS                          |   3 +-
>  17 files changed, 977 insertions(+), 63 deletions(-)
>  create mode 100644 drivers/iommu/io-pgtable-arm.h
>  create mode 100644 drivers/iommu/iommu-sva-lib.h
>  create mode 100644 drivers/iommu/iommu-sva-lib.c
> 
> -- 
> 2.27.0
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 04/12] arm64: mm: Pin down ASIDs for sharing mm with devices
  2020-06-18 15:51 ` [PATCH v8 04/12] arm64: mm: Pin down ASIDs for sharing mm with devices Jean-Philippe Brucker
@ 2020-07-13 15:46   ` Will Deacon
  2020-07-16 15:44     ` Jean-Philippe Brucker
  0 siblings, 1 reply; 23+ messages in thread
From: Will Deacon @ 2020-07-13 15:46 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: iommu, linux-arm-kernel, linux-mm, joro, catalin.marinas,
	robin.murphy, baolu.lu, Jonathan.Cameron, jacob.jun.pan,
	zhangfei.gao, xuzaibo, zhengxiang9, fenghua.yu, hch

On Thu, Jun 18, 2020 at 05:51:17PM +0200, Jean-Philippe Brucker wrote:
> To enable address space sharing with the IOMMU, introduce mm_context_get()
> and mm_context_put(), that pin down a context and ensure that it will keep
> its ASID after a rollover. Export the symbols to let the modular SMMUv3
> driver use them.
> 
> Pinning is necessary because a device constantly needs a valid ASID,
> unlike tasks that only require one when running. Without pinning, we would
> need to notify the IOMMU when we're about to use a new ASID for a task,
> and it would get complicated when a new task is assigned a shared ASID.
> Consider the following scenario with no ASID pinned:
> 
> 1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1)
> 2. Task t2 is scheduled on CPUx, gets ASID (1, 2)
> 3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1)
>    We would now have to immediately generate a new ASID for t1, notify
>    the IOMMU, and finally enable task tn. We are holding the lock during
>    all that time, since we can't afford having another CPU trigger a
>    rollover. The IOMMU issues invalidation commands that can take tens of
>    milliseconds.
> 
> It gets needlessly complicated. All we wanted to do was schedule task tn,
> that has no business with the IOMMU. By letting the IOMMU pin tasks when
> needed, we avoid stalling the slow path, and let the pinning fail when
> we're out of shareable ASIDs.
> 
> After a rollover, the allocator expects at least one ASID to be available
> in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS -
> 1) is the maximum number of ASIDs that can be shared with the IOMMU.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>  arch/arm64/include/asm/mmu.h         |  1 +
>  arch/arm64/include/asm/mmu_context.h | 11 +++-
>  arch/arm64/mm/context.c              | 95 +++++++++++++++++++++++++++-
>  3 files changed, 104 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index 68140fdd89d6b..bbdd291e31d59 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -19,6 +19,7 @@
>  
>  typedef struct {
>  	atomic64_t	id;
> +	unsigned long	pinned;

bool?

>  	void		*vdso;
>  	unsigned long	flags;
>  } mm_context_t;
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index b0bd9b55594c5..7b0e0f6cb7e87 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -177,7 +177,13 @@ static inline void cpu_replace_ttbr1(pgd_t *pgdp)
>  #define destroy_context(mm)		do { } while(0)
>  void check_and_switch_context(struct mm_struct *mm, unsigned int cpu);
>  
> -#define init_new_context(tsk,mm)	({ atomic64_set(&(mm)->context.id, 0); 0; })
> +static inline int
> +init_new_context(struct task_struct *tsk, struct mm_struct *mm)
> +{
> +	atomic64_set(&mm->context.id, 0);
> +	mm->context.pinned = 0;
> +	return 0;
> +}
>  
>  #ifdef CONFIG_ARM64_SW_TTBR0_PAN
>  static inline void update_saved_ttbr0(struct task_struct *tsk,
> @@ -250,6 +256,9 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  void verify_cpu_asid_bits(void);
>  void post_ttbr_update_workaround(void);
>  
> +unsigned long mm_context_get(struct mm_struct *mm);
> +void mm_context_put(struct mm_struct *mm);
> +
>  #endif /* !__ASSEMBLY__ */
>  
>  #endif /* !__ASM_MMU_CONTEXT_H */
> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> index d702d60e64dab..d0ddd413f5645 100644
> --- a/arch/arm64/mm/context.c
> +++ b/arch/arm64/mm/context.c
> @@ -27,6 +27,10 @@ static DEFINE_PER_CPU(atomic64_t, active_asids);
>  static DEFINE_PER_CPU(u64, reserved_asids);
>  static cpumask_t tlb_flush_pending;
>  
> +static unsigned long max_pinned_asids;
> +static unsigned long nr_pinned_asids;
> +static unsigned long *pinned_asid_map;
> +
>  #define ASID_MASK		(~GENMASK(asid_bits - 1, 0))
>  #define ASID_FIRST_VERSION	(1UL << asid_bits)
>  
> @@ -74,6 +78,9 @@ void verify_cpu_asid_bits(void)
>  
>  static void set_kpti_asid_bits(void)
>  {
> +	unsigned int k;
> +	u8 *dst = (u8 *)asid_map;
> +	u8 *src = (u8 *)pinned_asid_map;
>  	unsigned int len = BITS_TO_LONGS(NUM_USER_ASIDS) * sizeof(unsigned long);
>  	/*
>  	 * In case of KPTI kernel/user ASIDs are allocated in
> @@ -81,7 +88,8 @@ static void set_kpti_asid_bits(void)
>  	 * is set, then the ASID will map only userspace. Thus
>  	 * mark even as reserved for kernel.
>  	 */
> -	memset(asid_map, 0xaa, len);
> +	for (k = 0; k < len; k++)
> +		dst[k] = src[k] | 0xaa;

Can you use __bitmap_replace() here? I think it would be clearer to use the
bitmap API wherever possible, since casting 'unsigned long *' to 'u8 *'
just makes me worry about endianness issues (although in this case I don't
think it's a problem).

>  }
>  
>  static void set_reserved_asid_bits(void)
> @@ -89,7 +97,7 @@ static void set_reserved_asid_bits(void)
>  	if (arm64_kernel_unmapped_at_el0())
>  		set_kpti_asid_bits();
>  	else
> -		bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
> +		bitmap_copy(asid_map, pinned_asid_map, NUM_USER_ASIDS);
>  }
>  
>  #define asid_gen_match(asid) \
> @@ -165,6 +173,14 @@ static u64 new_context(struct mm_struct *mm)
>  		if (check_update_reserved_asid(asid, newasid))
>  			return newasid;
>  
> +		/*
> +		 * If it is pinned, we can keep using it. Note that reserved
> +		 * takes priority, because even if it is also pinned, we need to
> +		 * update the generation into the reserved_asids.
> +		 */
> +		if (mm->context.pinned)
> +			return newasid;
> +
>  		/*
>  		 * We had a valid ASID in a previous life, so try to re-use
>  		 * it if possible.
> @@ -254,6 +270,68 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
>  		cpu_switch_mm(mm->pgd, mm);
>  }
>  
> +unsigned long mm_context_get(struct mm_struct *mm)
> +{
> +	unsigned long flags;
> +	u64 asid;
> +
> +	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
> +
> +	asid = atomic64_read(&mm->context.id);
> +
> +	if (mm->context.pinned) {
> +		mm->context.pinned++;
> +		asid &= ~ASID_MASK;
> +		goto out_unlock;
> +	}
> +
> +	if (nr_pinned_asids >= max_pinned_asids) {
> +		asid = 0;
> +		goto out_unlock;
> +	}
> +
> +	if (!asid_gen_match(asid)) {
> +		/*
> +		 * We went through one or more rollover since that ASID was
> +		 * used. Ensure that it is still valid, or generate a new one.
> +		 */
> +		asid = new_context(mm);
> +		atomic64_set(&mm->context.id, asid);
> +	}
> +
> +	asid &= ~ASID_MASK;
> +
> +	nr_pinned_asids++;
> +	__set_bit(asid2idx(asid), pinned_asid_map);
> +	mm->context.pinned++;
> +
> +out_unlock:
> +	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);

Maybe stick the & ~ASID_MASK here so it's easier to read?

> +	/* Set the equivalent of USER_ASID_BIT */
> +	if (asid && IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0))
> +		asid |= 1;
> +
> +	return asid;
> +}
> +EXPORT_SYMBOL_GPL(mm_context_get);

That's quite a generic symbol name to export... maybe throw 'arm64_' in
front of it?

> +
> +void mm_context_put(struct mm_struct *mm)
> +{
> +	unsigned long flags;
> +	u64 asid = atomic64_read(&mm->context.id) & ~ASID_MASK;

I don't think you need the masking here.

> +	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
> +
> +	if (--mm->context.pinned == 0) {
> +		__clear_bit(asid2idx(asid), pinned_asid_map);
> +		nr_pinned_asids--;
> +	}
> +
> +	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
> +}
> +EXPORT_SYMBOL_GPL(mm_context_put);

Same naming comment here.

>  /* Errata workaround post TTBRx_EL1 update. */
>  asmlinkage void post_ttbr_update_workaround(void)
>  {
> @@ -303,6 +381,13 @@ static int asids_update_limit(void)
>  	WARN_ON(num_available_asids - 1 <= num_possible_cpus());
>  	pr_info("ASID allocator initialised with %lu entries\n",
>  		num_available_asids);
> +
> +	/*
> +	 * We assume that an ASID is always available after a rollover. This
> +	 * means that even if all CPUs have a reserved ASID, there still is at
> +	 * least one slot available in the asid map.
> +	 */
> +	max_pinned_asids = num_available_asids - num_possible_cpus() - 2;

Is it worth checking that assumption, rather than setting max_pinned_asids
to a massive value?

>  	return 0;
>  }
>  arch_initcall(asids_update_limit);
> @@ -317,6 +402,12 @@ static int asids_init(void)
>  		panic("Failed to allocate bitmap for %lu ASIDs\n",
>  		      NUM_USER_ASIDS);
>  
> +	pinned_asid_map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS),
> +				  sizeof(*pinned_asid_map), GFP_KERNEL);
> +	if (!pinned_asid_map)
> +		panic("Failed to allocate pinned ASID bitmap\n");

Why can't we continue in this case without support for pinned ASIDs?

Will


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 07/12] iommu/arm-smmu-v3: Share process page tables
  2020-06-18 15:51 ` [PATCH v8 07/12] iommu/arm-smmu-v3: Share process page tables Jean-Philippe Brucker
@ 2020-07-13 20:22   ` Will Deacon
  2020-07-16 15:45     ` Jean-Philippe Brucker
  0 siblings, 1 reply; 23+ messages in thread
From: Will Deacon @ 2020-07-13 20:22 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: iommu, linux-arm-kernel, linux-mm, joro, catalin.marinas,
	robin.murphy, baolu.lu, Jonathan.Cameron, jacob.jun.pan,
	zhangfei.gao, xuzaibo, zhengxiang9, fenghua.yu, hch,
	Suzuki K Poulose

On Thu, Jun 18, 2020 at 05:51:20PM +0200, Jean-Philippe Brucker wrote:
> With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR,
> MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split
> into two sets, shared and private. Shared ASIDs correspond to those
> obtained from the arch ASID allocator, and private ASIDs are used for
> "classic" map/unmap DMA.
> 
> A possible conflict happens when trying to use a shared ASID that has
> already been allocated for private use by the SMMU driver. This will be
> addressed in a later patch by replacing the private ASID. At the
> moment we return -EBUSY.
> 
> Each mm_struct shared with the SMMU will have a single context
> descriptor. Add a refcount to keep track of this. It will be protected
> by the global SVA lock.
> 
> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>  drivers/iommu/arm-smmu-v3.c | 150 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 146 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 937aa1af428d5..cabd942e4cbf3 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -22,6 +22,7 @@
>  #include <linux/iommu.h>
>  #include <linux/iopoll.h>
>  #include <linux/module.h>
> +#include <linux/mmu_context.h>
>  #include <linux/msi.h>
>  #include <linux/of.h>
>  #include <linux/of_address.h>
> @@ -33,6 +34,8 @@
>  
>  #include <linux/amba/bus.h>
>  
> +#include "io-pgtable-arm.h"
> +
>  /* MMIO registers */
>  #define ARM_SMMU_IDR0			0x0
>  #define IDR0_ST_LVL			GENMASK(28, 27)
> @@ -589,6 +592,9 @@ struct arm_smmu_ctx_desc {
>  	u64				ttbr;
>  	u64				tcr;
>  	u64				mair;
> +
> +	refcount_t			refs;
> +	struct mm_struct		*mm;
>  };
>  
>  struct arm_smmu_l1_ctx_desc {
> @@ -727,6 +733,7 @@ struct arm_smmu_option_prop {
>  };
>  
>  static DEFINE_XARRAY_ALLOC1(asid_xa);
> +static DEFINE_MUTEX(sva_lock);
>  
>  static struct arm_smmu_option_prop arm_smmu_options[] = {
>  	{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
> @@ -1662,7 +1669,8 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
>  #ifdef __BIG_ENDIAN
>  			CTXDESC_CD_0_ENDI |
>  #endif
> -			CTXDESC_CD_0_R | CTXDESC_CD_0_A | CTXDESC_CD_0_ASET |
> +			CTXDESC_CD_0_R | CTXDESC_CD_0_A |
> +			(cd->mm ? 0 : CTXDESC_CD_0_ASET) |
>  			CTXDESC_CD_0_AA64 |
>  			FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
>  			CTXDESC_CD_0_V;
> @@ -1766,12 +1774,144 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain)
>  	cdcfg->cdtab = NULL;
>  }
>  
> -static void arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
> +static void arm_smmu_init_cd(struct arm_smmu_ctx_desc *cd)
>  {
> +	refcount_set(&cd->refs, 1);
> +}
> +
> +static bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
> +{
> +	bool free;
> +	struct arm_smmu_ctx_desc *old_cd;
> +
>  	if (!cd->asid)
> -		return;
> +		return false;
> +
> +	free = refcount_dec_and_test(&cd->refs);
> +	if (free) {
> +		old_cd = xa_erase(&asid_xa, cd->asid);
> +		WARN_ON(old_cd != cd);
> +	}
> +	return free;
> +}
> +
> +static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid)
> +{
> +	struct arm_smmu_ctx_desc *cd;
>  
> -	xa_erase(&asid_xa, cd->asid);
> +	cd = xa_load(&asid_xa, asid);
> +	if (!cd)
> +		return NULL;
> +
> +	if (cd->mm) {
> +		/* All devices bound to this mm use the same cd struct. */
> +		refcount_inc(&cd->refs);
> +		return cd;
> +	}

How do you handle racing against a concurrent arm_smmu_free_asid() here?

> +__maybe_unused
> +static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
> +{
> +	u16 asid;
> +	int ret = 0;
> +	u64 tcr, par, reg;
> +	struct arm_smmu_ctx_desc *cd;
> +	struct arm_smmu_ctx_desc *old_cd = NULL;
> +
> +	lockdep_assert_held(&sva_lock);

Please don't bother with these for static functions (but I can see the
value in having them for functions with external callers).

> +
> +	asid = mm_context_get(mm);
> +	if (!asid)
> +		return ERR_PTR(-ESRCH);
> +
> +	cd = kzalloc(sizeof(*cd), GFP_KERNEL);
> +	if (!cd) {
> +		ret = -ENOMEM;
> +		goto err_put_context;
> +	}
> +
> +	arm_smmu_init_cd(cd);
> +
> +	old_cd = arm_smmu_share_asid(asid);
> +	if (IS_ERR(old_cd)) {
> +		ret = PTR_ERR(old_cd);
> +		goto err_free_cd;
> +	} else if (old_cd) {

Don't need the 'else'

> +		if (WARN_ON(old_cd->mm != mm)) {
> +			ret = -EINVAL;
> +			goto err_free_cd;
> +		}
> +		kfree(cd);
> +		mm_context_put(mm);
> +		return old_cd;

This is a bit messy. Can you consolidate the return path so that ret is a
pointer and you have an 'int err', e.g.:

	return err < 0 ? ERR_PTR(err) : ret;

Will


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 04/12] arm64: mm: Pin down ASIDs for sharing mm with devices
  2020-07-13 15:46   ` Will Deacon
@ 2020-07-16 15:44     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-07-16 15:44 UTC (permalink / raw)
  To: Will Deacon
  Cc: iommu, linux-arm-kernel, linux-mm, joro, catalin.marinas,
	robin.murphy, baolu.lu, Jonathan.Cameron, jacob.jun.pan,
	zhangfei.gao, xuzaibo, zhengxiang9, fenghua.yu, hch

Hi,

Thanks for reviewing. I think these 3 or 4 patches are the trickiest in
the whole SVA series.

On Mon, Jul 13, 2020 at 04:46:23PM +0100, Will Deacon wrote:
> On Thu, Jun 18, 2020 at 05:51:17PM +0200, Jean-Philippe Brucker wrote:
> > diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> > index 68140fdd89d6b..bbdd291e31d59 100644
> > --- a/arch/arm64/include/asm/mmu.h
> > +++ b/arch/arm64/include/asm/mmu.h
> > @@ -19,6 +19,7 @@
> >  
> >  typedef struct {
> >  	atomic64_t	id;
> > +	unsigned long	pinned;
> 
> bool?

It's a refcount. I'll change it to refcount_t because it looks nicer
overall, even though it doesn't need to be atomic.

> >  static void set_kpti_asid_bits(void)
> >  {
> > +	unsigned int k;
> > +	u8 *dst = (u8 *)asid_map;
> > +	u8 *src = (u8 *)pinned_asid_map;
> >  	unsigned int len = BITS_TO_LONGS(NUM_USER_ASIDS) * sizeof(unsigned long);
> >  	/*
> >  	 * In case of KPTI kernel/user ASIDs are allocated in
> > @@ -81,7 +88,8 @@ static void set_kpti_asid_bits(void)
> >  	 * is set, then the ASID will map only userspace. Thus
> >  	 * mark even as reserved for kernel.
> >  	 */
> > -	memset(asid_map, 0xaa, len);
> > +	for (k = 0; k < len; k++)
> > +		dst[k] = src[k] | 0xaa;
> 
> Can you use __bitmap_replace() here? I think it would be clearer to use the
> bitmap API wherever possible, since casting 'unsigned long *' to 'u8 *'
> just makes me worry about endianness issues (although in this case I don't
> think it's a problem).

I can do better actually: initialize pinned_asid_map with the kernel ASID
bits at boot and just copy the bitmap on rollover.
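
Something along these lines (sketch only, reusing the names from the patch
above):

static void set_kpti_asid_bits(unsigned long *map)
{
	unsigned int len = BITS_TO_LONGS(NUM_USER_ASIDS) * sizeof(unsigned long);

	/* In case of KPTI, kernel/user ASIDs are allocated in pairs */
	memset(map, 0xaa, len);
}

static void set_reserved_asid_bits(void)
{
	if (pinned_asid_map)
		bitmap_copy(asid_map, pinned_asid_map, NUM_USER_ASIDS);
	else if (arm64_kernel_unmapped_at_el0())
		set_kpti_asid_bits(asid_map);
	else
		bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
}

with asids_init() calling set_kpti_asid_bits(pinned_asid_map) once when
KPTI is enabled, so a rollover only needs the bitmap_copy().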

> > +unsigned long mm_context_get(struct mm_struct *mm)
> > +{
> > +	unsigned long flags;
> > +	u64 asid;
> > +
> > +	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
> > +
> > +	asid = atomic64_read(&mm->context.id);
> > +
> > +	if (mm->context.pinned) {
> > +		mm->context.pinned++;
> > +		asid &= ~ASID_MASK;
> > +		goto out_unlock;
> > +	}
> > +
> > +	if (nr_pinned_asids >= max_pinned_asids) {
> > +		asid = 0;
> > +		goto out_unlock;
> > +	}
> > +
> > +	if (!asid_gen_match(asid)) {
> > +		/*
> > +		 * We went through one or more rollover since that ASID was
> > +		 * used. Ensure that it is still valid, or generate a new one.
> > +		 */
> > +		asid = new_context(mm);
> > +		atomic64_set(&mm->context.id, asid);
> > +	}
> > +
> > +	asid &= ~ASID_MASK;
> > +
> > +	nr_pinned_asids++;
> > +	__set_bit(asid2idx(asid), pinned_asid_map);
> > +	mm->context.pinned++;
> > +
> > +out_unlock:
> > +	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
> 
> Maybe stick the & ~ASID_MASK here so it's easier to read?
> 
> > +	/* Set the equivalent of USER_ASID_BIT */
> > +	if (asid && IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0))
> > +		asid |= 1;
> > +
> > +	return asid;
> > +}
> > +EXPORT_SYMBOL_GPL(mm_context_get);
> 
> That's quite a generic symbol name to export... maybe throw 'arm64_' in
> front of it?
> 
> > +
> > +void mm_context_put(struct mm_struct *mm)
> > +{
> > +	unsigned long flags;
> > +	u64 asid = atomic64_read(&mm->context.id) & ~ASID_MASK;
> 
> I don't think you need the masking here.

Right, asid2idx() takes care of it.

> 
> > +	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
> > +
> > +	if (--mm->context.pinned == 0) {
> > +		__clear_bit(asid2idx(asid), pinned_asid_map);
> > +		nr_pinned_asids--;
> > +	}
> > +
> > +	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
> > +}
> > +EXPORT_SYMBOL_GPL(mm_context_put);
> 
> Same naming comment here.
> 
> >  /* Errata workaround post TTBRx_EL1 update. */
> >  asmlinkage void post_ttbr_update_workaround(void)
> >  {
> > @@ -303,6 +381,13 @@ static int asids_update_limit(void)
> >  	WARN_ON(num_available_asids - 1 <= num_possible_cpus());
> >  	pr_info("ASID allocator initialised with %lu entries\n",
> >  		num_available_asids);
> > +
> > +	/*
> > +	 * We assume that an ASID is always available after a rollover. This
> > +	 * means that even if all CPUs have a reserved ASID, there still is at
> > +	 * least one slot available in the asid map.
> > +	 */
> > +	max_pinned_asids = num_available_asids - num_possible_cpus() - 2;
> 
> Is it worth checking that assumption, rather than setting max_pinned_asids
> to a massive value?

The comment isn't right, it should be something like:

	"There must always be an ASID available after a rollover. Ensure
	that, even if all CPUs have a reserved ASID and the maximum number
	of ASIDs are pinned, there still is at least one empty slot in the
	ASID map."

I do wonder if we should reduce max_pinned_asids, because the system will
probably become unusable if it gets close to a single available ASID.  But
I think we'll need to introduce higher-level controls for SVA such as
cgroups to prevent users from pinning too many ASIDs.
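If we do want to check the assumption rather than rely on the WARN_ON()
above, a small guard in asids_update_limit() would at least avoid the
underflow (sketch only):

        if (WARN_ON(num_available_asids <= num_possible_cpus() + 2))
                max_pinned_asids = 0;
        else
                max_pinned_asids = num_available_asids - num_possible_cpus() - 2;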

> 
> >  	return 0;
> >  }
> >  arch_initcall(asids_update_limit);
> > @@ -317,6 +402,12 @@ static int asids_init(void)
> >  		panic("Failed to allocate bitmap for %lu ASIDs\n",
> >  		      NUM_USER_ASIDS);
> >  
> > +	pinned_asid_map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS),
> > +				  sizeof(*pinned_asid_map), GFP_KERNEL);
> > +	if (!pinned_asid_map)
> > +		panic("Failed to allocate pinned ASID bitmap\n");
> 
> Why can't we continue in this case without support for pinned ASIDs?

Good idea, I'll change that.
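Untested, but it could be as simple as dropping the panic() and bailing out
of the pinning path when the bitmap is missing:

        pinned_asid_map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS),
                                  sizeof(*pinned_asid_map), GFP_KERNEL);
        if (!pinned_asid_map)
                pr_warn("Failed to allocate pinned ASID bitmap, pinning disabled\n");

with, at the top of mm_context_get():

        if (!pinned_asid_map)
                return 0;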

Thanks,
Jean


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 07/12] iommu/arm-smmu-v3: Share process page tables
  2020-07-13 20:22   ` Will Deacon
@ 2020-07-16 15:45     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-07-16 15:45 UTC (permalink / raw)
  To: Will Deacon
  Cc: iommu, linux-arm-kernel, linux-mm, joro, catalin.marinas,
	robin.murphy, baolu.lu, Jonathan.Cameron, jacob.jun.pan,
	zhangfei.gao, xuzaibo, zhengxiang9, fenghua.yu, hch,
	Suzuki K Poulose

On Mon, Jul 13, 2020 at 09:22:37PM +0100, Will Deacon wrote:
> > +static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid)
> > +{
> > +	struct arm_smmu_ctx_desc *cd;
> >  
> > -	xa_erase(&asid_xa, cd->asid);
> > +	cd = xa_load(&asid_xa, asid);
> > +	if (!cd)
> > +		return NULL;
> > +
> > +	if (cd->mm) {
> > +		/* All devices bound to this mm use the same cd struct. */
> > +		refcount_inc(&cd->refs);
> > +		return cd;
> > +	}
> 
> How do you handle racing against a concurrent arm_smmu_free_asid() here?

Patch 8 adds an asid_lock to deal with this, but it should be introduced
in this patch. There is a potential use-after-free here, if
arm_smmu_domain_free() runs concurrently.
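For this patch the fix could look roughly like this (sketch, reusing the
asid_lock name from patch 8): hold the lock across both the lookup/refcount
pair and the release, so the descriptor can't be freed between xa_load()
and refcount_inc().

static DEFINE_MUTEX(asid_lock);

static void arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
{
        mutex_lock(&asid_lock);
        if (refcount_dec_and_test(&cd->refs))
                xa_erase(&asid_xa, cd->asid);
        mutex_unlock(&asid_lock);
}

and in arm_smmu_share_asid():

        mutex_lock(&asid_lock);
        cd = xa_load(&asid_xa, asid);
        if (cd && cd->mm)
                refcount_inc(&cd->refs);
        mutex_unlock(&asid_lock);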

> 
> > +__maybe_unused
> > +static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
> > +{
> > +	u16 asid;
> > +	int ret = 0;
> > +	u64 tcr, par, reg;
> > +	struct arm_smmu_ctx_desc *cd;
> > +	struct arm_smmu_ctx_desc *old_cd = NULL;
> > +
> > +	lockdep_assert_held(&sva_lock);
> 
> Please don't bother with these for static functions (but I can see the
> value in having them for functions with external callers).
> 
> > +
> > +	asid = mm_context_get(mm);
> > +	if (!asid)
> > +		return ERR_PTR(-ESRCH);
> > +
> > +	cd = kzalloc(sizeof(*cd), GFP_KERNEL);
> > +	if (!cd) {
> > +		ret = -ENOMEM;
> > +		goto err_put_context;
> > +	}
> > +
> > +	arm_smmu_init_cd(cd);
> > +
> > +	old_cd = arm_smmu_share_asid(asid);
> > +	if (IS_ERR(old_cd)) {
> > +		ret = PTR_ERR(old_cd);
> > +		goto err_free_cd;
> > +	} else if (old_cd) {
> 
> Don't need the 'else'
> 
> > +		if (WARN_ON(old_cd->mm != mm)) {
> > +			ret = -EINVAL;
> > +			goto err_free_cd;
> > +		}
> > +		kfree(cd);
> > +		mm_context_put(mm);
> > +		return old_cd;
> 
> This is a bit messy. Can you consolidate the return path so that ret is a
> pointer and you have an 'int err', e.g.:
> 
> 	return err < 0 ? ERR_PTR(err) : ret;

Sure, I think it looks a little nicer this way.
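For the record, the tail would then look roughly like this ('ret' holds
old_cd or NULL, 'err' a negative errno or zero; the reused-cd case jumps to
err_free_cd with err == 0, so the duplicate cd and the extra ASID pin are
dropped before returning the existing descriptor):

        /* Success: the new cd keeps its ASID pin */
        cd->asid = asid;
        cd->mm = mm;
        return cd;

err_free_cd:
        kfree(cd);
err_put_context:
        mm_context_put(mm);
        return err < 0 ? ERR_PTR(err) : ret;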

Thanks,
Jean


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)
  2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
                   ` (12 preceding siblings ...)
  2020-07-09  9:39 ` [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
@ 2020-07-20 11:11 ` Will Deacon
  2020-07-20 15:39   ` Jean-Philippe Brucker
  13 siblings, 1 reply; 23+ messages in thread
From: Will Deacon @ 2020-07-20 11:11 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: iommu, linux-arm-kernel, linux-mm, joro, catalin.marinas,
	robin.murphy, baolu.lu, Jonathan.Cameron, jacob.jun.pan,
	zhangfei.gao, xuzaibo, zhengxiang9, fenghua.yu, hch

On Thu, Jun 18, 2020 at 05:51:13PM +0200, Jean-Philippe Brucker wrote:
> Since v7 [1], I split the series into three parts to ease review. This
> first one adds page table sharing to the SMMUv3 driver. The second one
> adds support for I/O page faults through PRI and Stall, and the last one
> adds additional and optional features (DVM, VHE and HTTU). SVA needs the
> three parts to work. No significant change apart from that, I just
> addressed the previous comments.
> 
> I'd rather everything went through the IOMMU tree but I'm assuming patch
> 1 will also go through the x86 tree as part of [2]. It is definitely
> required by patch 3 which is required by patch 11. I don't know how this
> kind of conflict is usually resolved, but if it's a problem I could
> further shrink the series to only patches 4-10 this cycle.

Modulo my review comments, I think most of this looks alright from the SMMU
side. However, I would really like it if the SVA driver parts could be moved
into a separate file (e.g. arm-smmu-v3-sva.c), with a separate config option
(dependent on the current one, so you can easily build a driver without SVA
support). Does that sound remotely feasible? If so, I think it would really
help in terms of maintainability, since the SVA model is really all about
the mm, whereas the driver model is all about the device. This makes it
really hard to read when you have to keep working out whether the current
'handle' is an mm_struct or an arm_smmu_device.

Will


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)
  2020-07-20 11:11 ` Will Deacon
@ 2020-07-20 15:39   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 23+ messages in thread
From: Jean-Philippe Brucker @ 2020-07-20 15:39 UTC (permalink / raw)
  To: Will Deacon
  Cc: iommu, linux-arm-kernel, linux-mm, joro, catalin.marinas,
	robin.murphy, baolu.lu, Jonathan.Cameron, jacob.jun.pan,
	zhangfei.gao, xuzaibo, zhengxiang9, fenghua.yu, hch

On Mon, Jul 20, 2020 at 12:11:17PM +0100, Will Deacon wrote:
> On Thu, Jun 18, 2020 at 05:51:13PM +0200, Jean-Philippe Brucker wrote:
> > Since v7 [1], I split the series into three parts to ease review. This
> > first one adds page table sharing to the SMMUv3 driver. The second one
> > adds support for I/O page faults through PRI and Stall, and the last one
> > adds additional and optional features (DVM, VHE and HTTU). SVA needs the
> > three parts to work. No significant change apart from that, I just
> > addressed the previous comments.
> > 
> > I'd rather everything went through the IOMMU tree but I'm assuming patch
> > 1 will also go through the x86 tree as part of [2]. It is definitely
> > required by patch 3 which is required by patch 11. I don't know how this
> > kind of conflict is usually resolved, but if it's a problem I could
> > further shrink the series to only patches 4-10 this cycle.
> 
> Modulo my review comments, I think most of this looks alright from the SMMU
> side. However, I would really like it if the SVA driver parts could be moved
> into a separate file (e.g. arm-smmu-v3-sva.c), with a separate config option
> (dependent on the current one, so you can easily build a driver without SVA
> support). Does that sound remotely feasible?

Yes, it makes sense. It requires moving some definitions from arm-smmu-v3.c
to a header, but that should be straightforward; I'll give it a try.
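For what it's worth, I picture it roughly like this (all names below are
illustrative, not the final patch): a CONFIG_ARM_SMMU_V3_SVA option
depending on CONFIG_ARM_SMMU_V3, the mm-facing code in arm-smmu-v3-sva.c,
and static inline stubs in the shared header so the core driver builds
unchanged without SVA:

#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_master_sva_supported(struct arm_smmu_master *master);
int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
#else /* CONFIG_ARM_SMMU_V3_SVA */
static inline bool arm_smmu_master_sva_supported(struct arm_smmu_master *master)
{
        return false;
}

static inline int arm_smmu_master_enable_sva(struct arm_smmu_master *master)
{
        return -ENODEV;
}

static inline int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
{
        return -ENODEV;
}
#endif /* CONFIG_ARM_SMMU_V3_SVA */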

Thanks,
Jean

> If so, I think it would really
> help in terms of maintainability, since the SVA model is really all about
> the mm, whereas the driver model is all about the device. This makes it
> really hard to read when you have to keep working out whether the current
> 'handle' is an mm_struct or an arm_smmu_device.
> 
> Will


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-07-20 15:41 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-18 15:51 [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 01/12] mm: Define pasid in mm Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 02/12] iommu/ioasid: Add ioasid references Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 03/12] iommu/sva: Add PASID helpers Jean-Philippe Brucker
2020-06-19  7:37   ` Lu Baolu
2020-06-18 15:51 ` [PATCH v8 04/12] arm64: mm: Pin down ASIDs for sharing mm with devices Jean-Philippe Brucker
2020-07-13 15:46   ` Will Deacon
2020-07-16 15:44     ` Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 05/12] iommu/io-pgtable-arm: Move some definitions to a header Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 06/12] arm64: cpufeature: Export symbol read_sanitised_ftr_reg() Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 07/12] iommu/arm-smmu-v3: Share process page tables Jean-Philippe Brucker
2020-07-13 20:22   ` Will Deacon
2020-07-16 15:45     ` Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 08/12] iommu/arm-smmu-v3: Seize private ASID Jean-Philippe Brucker
2020-07-06 12:40   ` Xiang Zheng
2020-07-06 16:07     ` Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 09/12] iommu/arm-smmu-v3: Check for SVA features Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 10/12] iommu/arm-smmu-v3: Add SVA device feature Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 11/12] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind() Jean-Philippe Brucker
2020-06-18 15:51 ` [PATCH v8 12/12] iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops Jean-Philippe Brucker
2020-07-09  9:39 ` [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) Jean-Philippe Brucker
2020-07-20 11:11 ` Will Deacon
2020-07-20 15:39   ` Jean-Philippe Brucker
