linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR
@ 2021-12-17 22:01 Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 01/11] iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA Fenghua Yu
                   ` (11 more replies)
  0 siblings, 12 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

Problems in the old code to manage SVM (Shared Virtual Memory) devices
and the PASID (Process Address Space ID) led to that code being
disabled.

Subsequent discussions resulted in a far simpler approach:

1) PASID life cycle is from first allocation by a process until that
   process exits.
2) All tasks begin with PASID disabled
3) The #GP fault handler tries to fix faulting ENQCMD instructions very
   early (thus avoiding complexities of the XSAVE infrastructure)

Change Log:
v2:
- Free PASID on mm exit instead of in exit(2) or unbind() (Thomas, AndyL,
  PeterZ)
- Directly write IA32_PASID MSR in fixup while local IRQ is still disabled
  (Thomas)
- Simplify handling ENQCMD in objtool (PeterZ and Josh)
- Define mm_pasid_get(), mm_pasid_drop(), and mm_pasid_init() in mm and
  call the functions from IOMMU (Dave Hansen).
- A few changes in the #GP fixup function (Dave Hansen, Tony Luck).
- Initial PASID value is changed to INVALID_PASID (Ashok Raj and
  Jacob Pan).
- Add mm_pasid_init(), mm_pasid_get(), and mm_pasid_drop() functions in mm.
  So the mm's PASID operations are generic for both X86 and ARM
  (Dave Hansen).
- Rename CONFIG_IOMMU_SVA_LIB to more useful and accurate
  CONFIG_IOMMU_SVA
- Use CONFIG_IOMMU_SVA for PASID processing condition (Jacob)
- The patch that cleans up old update_pasid() function is in upstream
  now (commit: 00ecd5401349 "iommu/vt-d: Clean up unused PASID updating
  functions") and therefore it's removed from this version.

v1 can be found at https://lore.kernel.org/lkml/20210920192349.2602141-1-fenghua.yu@intel.com/T/#md6d542091da1d1159eda0a44a16e57d0c0dfb209

Fenghua Yu (10):
  iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA
  mm: Change CONFIG option for mm->pasid field
  iommu/ioasid: Introduce a helper to check for valid PASIDs
  kernel/fork: Initialize mm's PASID
  iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm
    exit
  x86/fpu: Clear PASID when copying fpstate
  x86/traps: Demand-populate PASID MSR via #GP
  x86/cpufeatures: Re-enable ENQCMD
  tools/objtool: Check for use of the ENQCMD instruction in the kernel
  docs: x86: Change documentation for SVA (Shared Virtual Addressing)

Peter Zijlstra (1):
  sched: Define and initialize a flag to identify valid PASID in the
    task

 Documentation/x86/sva.rst                     | 58 +++++++++++++++----
 arch/x86/include/asm/disabled-features.h      |  7 ++-
 arch/x86/kernel/fpu/core.c                    |  7 +++
 arch/x86/kernel/traps.c                       | 55 ++++++++++++++++++
 drivers/iommu/Kconfig                         |  6 +-
 drivers/iommu/Makefile                        |  2 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  5 +-
 drivers/iommu/intel/Kconfig                   |  2 +-
 drivers/iommu/intel/svm.c                     |  9 ---
 drivers/iommu/iommu-sva-lib.c                 | 39 ++++---------
 drivers/iommu/iommu-sva-lib.h                 |  7 +--
 include/linux/ioasid.h                        |  9 +++
 include/linux/mm_types.h                      |  2 +-
 include/linux/sched.h                         |  3 +
 include/linux/sched/mm.h                      | 26 +++++++++
 kernel/fork.c                                 | 15 +++--
 mm/init-mm.c                                  |  4 ++
 tools/objtool/arch/x86/decode.c               | 11 +++-
 18 files changed, 194 insertions(+), 73 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 01/11] iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 02/11] mm: Change CONFIG option for mm->pasid field Fenghua Yu
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

This CONFIG option originally only referred to the Shared
Virtual Address (SVA) library. But it is now also used for
non-library portions of code.

Drop the "_LIB" suffix so that there is just one configuration
options for all code relating to SVA.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Add this patch for more meaningful name CONFIG_IOMMU_SVA

 drivers/iommu/Kconfig         | 6 +++---
 drivers/iommu/Makefile        | 2 +-
 drivers/iommu/intel/Kconfig   | 2 +-
 drivers/iommu/iommu-sva-lib.h | 6 +++---
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 3eb68fa1b8cc..c79a0df090c0 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -144,8 +144,8 @@ config IOMMU_DMA
 	select IRQ_MSI_IOMMU
 	select NEED_SG_DMA_LENGTH
 
-# Shared Virtual Addressing library
-config IOMMU_SVA_LIB
+# Shared Virtual Addressing
+config IOMMU_SVA
 	bool
 	select IOASID
 
@@ -379,7 +379,7 @@ config ARM_SMMU_V3
 config ARM_SMMU_V3_SVA
 	bool "Shared Virtual Addressing support for the ARM SMMUv3"
 	depends on ARM_SMMU_V3
-	select IOMMU_SVA_LIB
+	select IOMMU_SVA
 	select MMU_NOTIFIER
 	help
 	  Support for sharing process address spaces with devices using the
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index bc7f730edbb0..44475a9b3eea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,6 +27,6 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
-obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o io-pgfault.o
+obj-$(CONFIG_IOMMU_SVA) += iommu-sva-lib.o io-pgfault.o
 obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
 obj-$(CONFIG_APPLE_DART) += apple-dart.o
diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
index 247d0f2d5fdf..39a06d245f12 100644
--- a/drivers/iommu/intel/Kconfig
+++ b/drivers/iommu/intel/Kconfig
@@ -52,7 +52,7 @@ config INTEL_IOMMU_SVM
 	select PCI_PRI
 	select MMU_NOTIFIER
 	select IOASID
-	select IOMMU_SVA_LIB
+	select IOMMU_SVA
 	help
 	  Shared Virtual Memory (SVM) provides a facility for devices
 	  to access DMA resources through process address space by
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index 031155010ca8..95dc3ebc1928 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -17,7 +17,7 @@ struct device;
 struct iommu_fault;
 struct iopf_queue;
 
-#ifdef CONFIG_IOMMU_SVA_LIB
+#ifdef CONFIG_IOMMU_SVA
 int iommu_queue_iopf(struct iommu_fault *fault, void *cookie);
 
 int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev);
@@ -28,7 +28,7 @@ struct iopf_queue *iopf_queue_alloc(const char *name);
 void iopf_queue_free(struct iopf_queue *queue);
 int iopf_queue_discard_partial(struct iopf_queue *queue);
 
-#else /* CONFIG_IOMMU_SVA_LIB */
+#else /* CONFIG_IOMMU_SVA */
 static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
 {
 	return -ENODEV;
@@ -64,5 +64,5 @@ static inline int iopf_queue_discard_partial(struct iopf_queue *queue)
 {
 	return -ENODEV;
 }
-#endif /* CONFIG_IOMMU_SVA_LIB */
+#endif /* CONFIG_IOMMU_SVA */
 #endif /* _IOMMU_SVA_LIB_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 02/11] mm: Change CONFIG option for mm->pasid field
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 01/11] iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 03/11] iommu/ioasid: Introduce a helper to check for valid PASIDs Fenghua Yu
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

This currently depends on CONFIG_IOMMU_SUPPORT. But it is only
needed when CONFIG_IOMMU_SVA option is enabled.

Change the CONFIG guards around definition and initialization
of mm->pasid field.

Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Change condition to more accurate CONFIG_IOMMU_SVA (Jacob and Tony)

 include/linux/mm_types.h | 2 +-
 kernel/fork.c            | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c3a6e6209600..6d92121d5bd9 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -643,7 +643,7 @@ struct mm_struct {
 #endif
 		struct work_struct async_put_work;
 
-#ifdef CONFIG_IOMMU_SUPPORT
+#ifdef CONFIG_IOMMU_SVA
 		u32 pasid;
 #endif
 	} __randomize_layout;
diff --git a/kernel/fork.c b/kernel/fork.c
index 3244cc56b697..b33c52021d4e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1020,7 +1020,7 @@ static void mm_init_owner(struct mm_struct *mm, struct task_struct *p)
 
 static void mm_init_pasid(struct mm_struct *mm)
 {
-#ifdef CONFIG_IOMMU_SUPPORT
+#ifdef CONFIG_IOMMU_SVA
 	mm->pasid = INIT_PASID;
 #endif
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 03/11] iommu/ioasid: Introduce a helper to check for valid PASIDs
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 01/11] iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 02/11] mm: Change CONFIG option for mm->pasid field Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 04/11] kernel/fork: Initialize mm's PASID Fenghua Yu
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

pasid_valid() is defined to check if a given PASID is valid.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Define a helper pasid_valid() (Ashok, Jacob, Thomas, Tony)

 include/linux/ioasid.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index e9dacd4b9f6b..2237f64dbaae 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -41,6 +41,10 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
 void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator);
 int ioasid_set_data(ioasid_t ioasid, void *data);
+static inline bool pasid_valid(ioasid_t ioasid)
+{
+	return ioasid != INVALID_IOASID;
+}
 
 #else /* !CONFIG_IOASID */
 static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
@@ -78,5 +82,10 @@ static inline int ioasid_set_data(ioasid_t ioasid, void *data)
 	return -ENOTSUPP;
 }
 
+static inline bool pasid_valid(ioasid_t ioasid)
+{
+	return false;
+}
+
 #endif /* CONFIG_IOASID */
 #endif /* __LINUX_IOASID_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 04/11] kernel/fork: Initialize mm's PASID
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (2 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 03/11] iommu/ioasid: Introduce a helper to check for valid PASIDs Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit Fenghua Yu
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

A new mm doesn't have a PASID yet when it's created. Initialize
the mm's PASID on fork() or for init_mm to INVALID_IOASID (-1).

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Change condition to more accurate CONFIG_IOMMU_SVA (Jacob)

 include/linux/sched/mm.h | 10 ++++++++++
 kernel/fork.c            | 10 ++--------
 mm/init-mm.c             |  4 ++++
 3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index aca874d33fe6..394c4359c4e1 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -8,6 +8,7 @@
 #include <linux/mm_types.h>
 #include <linux/gfp.h>
 #include <linux/sync_core.h>
+#include <linux/ioasid.h>
 
 /*
  * Routines for handling mm_structs
@@ -407,4 +408,13 @@ static inline void membarrier_update_current_mm(struct mm_struct *next_mm)
 }
 #endif
 
+#ifdef CONFIG_IOMMU_SVA
+static inline void mm_pasid_init(struct mm_struct *mm)
+{
+	mm->pasid = INVALID_IOASID;
+}
+#else
+static inline void mm_pasid_init(struct mm_struct *mm) {}
+#endif
+
 #endif /* _LINUX_SCHED_MM_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index b33c52021d4e..4e799c9b13bb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -96,6 +96,7 @@
 #include <linux/scs.h>
 #include <linux/io_uring.h>
 #include <linux/bpf.h>
+#include <linux/sched/mm.h>
 
 #include <asm/pgalloc.h>
 #include <linux/uaccess.h>
@@ -1018,13 +1019,6 @@ static void mm_init_owner(struct mm_struct *mm, struct task_struct *p)
 #endif
 }
 
-static void mm_init_pasid(struct mm_struct *mm)
-{
-#ifdef CONFIG_IOMMU_SVA
-	mm->pasid = INIT_PASID;
-#endif
-}
-
 static void mm_init_uprobes_state(struct mm_struct *mm)
 {
 #ifdef CONFIG_UPROBES
@@ -1053,7 +1047,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm_init_cpumask(mm);
 	mm_init_aio(mm);
 	mm_init_owner(mm, p);
-	mm_init_pasid(mm);
+	mm_pasid_init(mm);
 	RCU_INIT_POINTER(mm->exe_file, NULL);
 	mmu_notifier_subscriptions_init(mm);
 	init_tlb_flush_pending(mm);
diff --git a/mm/init-mm.c b/mm/init-mm.c
index b4a6f38fb51d..fbe7844d0912 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -10,6 +10,7 @@
 
 #include <linux/atomic.h>
 #include <linux/user_namespace.h>
+#include <linux/ioasid.h>
 #include <asm/mmu.h>
 
 #ifndef INIT_MM_CONTEXT
@@ -38,6 +39,9 @@ struct mm_struct init_mm = {
 	.mmlist		= LIST_HEAD_INIT(init_mm.mmlist),
 	.user_ns	= &init_user_ns,
 	.cpu_bitmap	= CPU_BITS_NONE,
+#ifdef CONFIG_IOMMU_SVA
+	.pasid		= INVALID_IOASID,
+#endif
 	INIT_MM_CONTEXT(init_mm)
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (3 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 04/11] kernel/fork: Initialize mm's PASID Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2022-01-24 20:21   ` Thomas Gleixner
  2021-12-17 22:01 ` [PATCH v2 06/11] x86/fpu: Clear PASID when copying fpstate Fenghua Yu
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

To avoid complexity of updating each thread's PASID status (e.g. sending
IPI to update IA32_PASID MSR) on allocating and freeing PASID, once
allocated and assigned to an mm, the PASID stays with the mm for the
rest of the mm's lifetime. A reference to the PASID is taken on
allocating the PASID. Binding/unbinding the PASID won't change refcount.
The reference is dropped on mm exit and thus the PASID is freed.

Two helpers mm_pasid_get() and mm_pasid_drop() are defined in mm because
the PASID operations handle the pasid member in mm_struct and should be
part of mm operations.

20-bit PASID allows up to 1M processes bound to PASIDs at the same time.
With cgroups and other controls that might limit the number of process
creation, the limited number of PASIDs is not a realistic issue for
lazy PASID free.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Free PASID on mm exit instead of in exit(2) or unbind() (Thomas, AndyL,
  PeterZ)
- Add mm_pasid_init(), mm_pasid_get(), and mm_pasid_drop() functions in mm.
  So the mm's PASID operations are generic for both X86 and ARM
  (Dave Hansen)

 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  5 +--
 drivers/iommu/intel/svm.c                     |  9 -----
 drivers/iommu/iommu-sva-lib.c                 | 39 ++++++-------------
 drivers/iommu/iommu-sva-lib.h                 |  1 -
 include/linux/sched/mm.h                      | 16 ++++++++
 kernel/fork.c                                 |  1 +
 6 files changed, 30 insertions(+), 41 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index ee66d1f4cb81..c153ffae5462 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -340,14 +340,12 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 	bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm);
 	if (IS_ERR(bond->smmu_mn)) {
 		ret = PTR_ERR(bond->smmu_mn);
-		goto err_free_pasid;
+		goto err_free_bond;
 	}
 
 	list_add(&bond->list, &master->bonds);
 	return &bond->sva;
 
-err_free_pasid:
-	iommu_sva_free_pasid(mm);
 err_free_bond:
 	kfree(bond);
 	return ERR_PTR(ret);
@@ -377,7 +375,6 @@ void arm_smmu_sva_unbind(struct iommu_sva *handle)
 	if (refcount_dec_and_test(&bond->refs)) {
 		list_del(&bond->list);
 		arm_smmu_mmu_notifier_put(bond->smmu_mn);
-		iommu_sva_free_pasid(bond->mm);
 		kfree(bond);
 	}
 	mutex_unlock(&sva_lock);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 5b5d69b04fcc..51ac2096b3da 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -514,11 +514,6 @@ static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
 	return iommu_sva_alloc_pasid(mm, PASID_MIN, max_pasid - 1);
 }
 
-static void intel_svm_free_pasid(struct mm_struct *mm)
-{
-	iommu_sva_free_pasid(mm);
-}
-
 static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 					   struct device *dev,
 					   struct mm_struct *mm,
@@ -662,8 +657,6 @@ static int intel_svm_unbind_mm(struct device *dev, u32 pasid)
 				kfree(svm);
 			}
 		}
-		/* Drop a PASID reference and free it if no reference. */
-		intel_svm_free_pasid(mm);
 	}
 out:
 	return ret;
@@ -1047,8 +1040,6 @@ struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void
 	}
 
 	sva = intel_svm_bind_mm(iommu, dev, mm, flags);
-	if (IS_ERR_OR_NULL(sva))
-		intel_svm_free_pasid(mm);
 	mutex_unlock(&pasid_mutex);
 
 	return sva;
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index bd41405d34e9..ee2294e02716 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -18,8 +18,7 @@ static DECLARE_IOASID_SET(iommu_sva_pasid);
  *
  * Try to allocate a PASID for this mm, or take a reference to the existing one
  * provided it fits within the [@min, @max] range. On success the PASID is
- * available in mm->pasid, and must be released with iommu_sva_free_pasid().
- * @min must be greater than 0, because 0 indicates an unused mm->pasid.
+ * available in mm->pasid and will be available for the lifetime of the mm.
  *
  * Returns 0 on success and < 0 on error.
  */
@@ -33,38 +32,24 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
 		return -EINVAL;
 
 	mutex_lock(&iommu_sva_lock);
-	if (mm->pasid) {
-		if (mm->pasid >= min && mm->pasid <= max)
-			ioasid_get(mm->pasid);
-		else
+	/* Is a PASID already associated with this mm? */
+	if (pasid_valid(mm->pasid)) {
+		if (mm->pasid < min || mm->pasid >= max)
 			ret = -EOVERFLOW;
-	} else {
-		pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
-		if (pasid == INVALID_IOASID)
-			ret = -ENOMEM;
-		else
-			mm->pasid = pasid;
+		goto out;
 	}
+
+	pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
+	if (!pasid_valid(pasid))
+		ret = -ENOMEM;
+	else
+		mm_pasid_get(mm, pasid);
+out:
 	mutex_unlock(&iommu_sva_lock);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
 
-/**
- * iommu_sva_free_pasid - Release the mm's PASID
- * @mm: the mm
- *
- * Drop one reference to a PASID allocated with iommu_sva_alloc_pasid()
- */
-void iommu_sva_free_pasid(struct mm_struct *mm)
-{
-	mutex_lock(&iommu_sva_lock);
-	if (ioasid_put(mm->pasid))
-		mm->pasid = 0;
-	mutex_unlock(&iommu_sva_lock);
-}
-EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
-
 /* ioasid_find getter() requires a void * argument */
 static bool __mmget_not_zero(void *mm)
 {
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index 95dc3ebc1928..8909ea1094e3 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -9,7 +9,6 @@
 #include <linux/mm_types.h>
 
 int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
-void iommu_sva_free_pasid(struct mm_struct *mm);
 struct mm_struct *iommu_sva_find(ioasid_t pasid);
 
 /* I/O Page fault */
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 394c4359c4e1..21cd094283ad 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -413,8 +413,24 @@ static inline void mm_pasid_init(struct mm_struct *mm)
 {
 	mm->pasid = INVALID_IOASID;
 }
+
+/* Associate a PASID with an mm_struct: */
+static inline void mm_pasid_get(struct mm_struct *mm, u32 pasid)
+{
+	mm->pasid = pasid;
+}
+
+static inline void mm_pasid_drop(struct mm_struct *mm)
+{
+	if (pasid_valid(mm->pasid)) {
+		ioasid_put(mm->pasid);
+		mm->pasid = INVALID_IOASID;
+	}
+}
 #else
 static inline void mm_pasid_init(struct mm_struct *mm) {}
+static inline void mm_pasid_get(struct mm_struct *mm, u32 pasid) {}
+static inline void mm_pasid_drop(struct mm_struct *mm) {}
 #endif
 
 #endif /* _LINUX_SCHED_MM_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 4e799c9b13bb..3adad225cc09 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1114,6 +1114,7 @@ static inline void __mmput(struct mm_struct *mm)
 	}
 	if (mm->binfmt)
 		module_put(mm->binfmt->module);
+	mm_pasid_drop(mm);
 	mmdrop(mm);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 06/11] x86/fpu: Clear PASID when copying fpstate
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (4 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 07/11] sched: Define and initialize a flag to identify valid PASID in the task Fenghua Yu
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

The kernel must allocate a Process Address Space ID (PASID) on behalf of
each process which will use ENQCMD and program it into the new MSR to
communicate the process identity to platform hardware. ENQCMD uses the
PASID stored in this MSR to tag requests from this process.

The PASID state must be cleared on fork() since fork creates a
new address space.

For clone(), it would be functionally OK to copy the PASID. However,
clearing it is _also_ functionally OK since any PASID use will trigger
the #GP handler to populate the MSR.

Copying the PASID state has two main downsides:
 * It requires differentiating fork() and clone() in the code,
   both in the FPU code and keeping tsk->pasid_activated consistent.
 * It guarantees that the PASID is out of its init state, which
   incurs small but non-zero cost on every XSAVE/XRSTOR.

The main downside of clearing the PASID at fpstate copy is the future,
one-time #GP for the thread.

Use the simplest approach: clear the PASID state both on clone() and
fork().  Rely on the #GP handler for MSR population in children.

Also, just clear the PASID bit from xfeatures if XSAVE is supported.
This will have no effect on systems that do not have PASID support.  It
is virtually zero overhead because 'dst_fpu' was just written and
the whole thing is cache hot.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Rewrite changelog (Dave Hansen).
- Move xfeature tweaking into fpu_clone() and make it unconditional
  if XSAVE is supported (Dave Hansen).

 arch/x86/kernel/fpu/core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8ea306b1bf8e..13fc0ea52237 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -518,6 +518,13 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags)
 		fpu_inherit_perms(dst_fpu);
 	fpregs_unlock();
 
+	/*
+	 * Children never inherit PASID state.
+	 * Force it to have its init value:
+	 */
+	if (use_xsave())
+		dst_fpu->fpstate->regs.xsave.header.xfeatures &= ~XFEATURE_MASK_PASID;
+
 	trace_x86_fpu_copy_src(src_fpu);
 	trace_x86_fpu_copy_dst(dst_fpu);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 07/11] sched: Define and initialize a flag to identify valid PASID in the task
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (5 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 06/11] x86/fpu: Clear PASID when copying fpstate Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 08/11] x86/traps: Demand-populate PASID MSR via #GP Fenghua Yu
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

From: Peter Zijlstra <peterz@infradead.org>

Add a new single bit field to the task structure to track whether this task
has initialized the IA32_PASID MSR to the mm's PASID.

Initialize the field to zero when creating a new task with fork/clone.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Change condition to more accurate CONFIG_IOMMU_SVA (Jacob)

 include/linux/sched.h | 3 +++
 kernel/fork.c         | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 78c351e35fec..41a0b5703f94 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -933,6 +933,9 @@ struct task_struct {
 	/* Recursion prevention for eventfd_signal() */
 	unsigned			in_eventfd_signal:1;
 #endif
+#ifdef CONFIG_IOMMU_SVA
+	unsigned			pasid_activated:1;
+#endif
 
 	unsigned long			atomic_flags; /* Flags requiring atomic access. */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 3adad225cc09..cd297926b6f2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -967,6 +967,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	tsk->use_memdelay = 0;
 #endif
 
+#ifdef CONFIG_IOMMU_SVA
+	tsk->pasid_activated = 0;
+#endif
+
 #ifdef CONFIG_MEMCG
 	tsk->active_memcg = NULL;
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 08/11] x86/traps: Demand-populate PASID MSR via #GP
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (6 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 07/11] sched: Define and initialize a flag to identify valid PASID in the task Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 09/11] x86/cpufeatures: Re-enable ENQCMD Fenghua Yu
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

All tasks start with PASID state disabled. This means that the first
time they execute an ENQCMD instruction they will take a #GP fault.

Modify the #GP fault handler to check if the "mm" for the task has
already been allocated a PASID. If so, try to fix the #GP fault by
loading the IA32_PASID MSR.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Directly write IA32_PASID MSR in fixup while local IRQ is still disabled
  (Thomas)
- Move #ifdef over to CONFIG_IOMMU_SVA since it is what
  defines mm->pasid and ->pasid_activated (Dave Hansen).
- Rename try_fixup_pasid() -> try_fixup_enqcmd_gp(). This
  code really is highly specific to ENQCMD, not PASIDs (Dave Hansen).
- Add lockdep assert and comment about context (Dave Hansen).
- Re-flow the if() mess (Dave Hansen).

 arch/x86/kernel/traps.c | 55 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index c9d566dcf89a..7ef00dee35be 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -39,6 +39,7 @@
 #include <linux/io.h>
 #include <linux/hardirq.h>
 #include <linux/atomic.h>
+#include <linux/ioasid.h>
 
 #include <asm/stacktrace.h>
 #include <asm/processor.h>
@@ -559,6 +560,57 @@ static bool fixup_iopl_exception(struct pt_regs *regs)
 	return true;
 }
 
+/*
+ * The unprivileged ENQCMD instruction generates #GPs if the
+ * IA32_PASID MSR has not been populated.  If possible, populate
+ * the MSR from a PASID previously allocated to the mm.
+ */
+static bool try_fixup_enqcmd_gp(void)
+{
+#ifdef CONFIG_IOMMU_SVA
+	u32 pasid;
+
+	/*
+	 * MSR_IA32_PASID is managed using XSAVE.  Directly
+	 * writing to the MSR is only possible when fpregs
+	 * are valid and the fpstate is not.  This is
+	 * guaranteed when handling a userspace exception
+	 * in *before* interrupts are re-enabled.
+	 */
+	lockdep_assert_irqs_disabled();
+
+	/*
+	 * Hardware without ENQCMD will not generate
+	 * #GPs that can be fixed up here.
+	 */
+	if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
+		return false;
+
+	pasid = current->mm->pasid;
+
+	/*
+	 * If the mm has not been allocated a
+	 * PASID, the #GP can not be fixed up.
+	 */
+	if (!pasid_valid(pasid))
+		return false;
+
+	/*
+	 * Did this thread already have its PASID activated?
+	 * If so, the #GP must be from something else.
+	 */
+	if (current->pasid_activated)
+		return false;
+
+	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
+	current->pasid_activated = 1;
+
+	return true;
+#else
+	return false;
+#endif
+}
+
 DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
 {
 	char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
@@ -567,6 +619,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
 	unsigned long gp_addr;
 	int ret;
 
+	if (user_mode(regs) && try_fixup_enqcmd_gp())
+		return;
+
 	cond_local_irq_enable(regs);
 
 	if (static_cpu_has(X86_FEATURE_UMIP)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 09/11] x86/cpufeatures: Re-enable ENQCMD
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (7 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 08/11] x86/traps: Demand-populate PASID MSR via #GP Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:01 ` [PATCH v2 10/11] tools/objtool: Check for use of the ENQCMD instruction in the kernel Fenghua Yu
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

Since ENQCMD is handled by #GP fix up, it can be re-enabled.

The ENQCMD feature can only be used if CONFIG_INTEL_IOMMU_SVM is set. Add
X86_FEATURE_ENQCMD to the disabled features mask as appropriate so that
cpu_feature_enabled() can be used to check the feature.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Update the commit message (Tony).

 arch/x86/include/asm/disabled-features.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 8f28fafa98b3..1231d63f836d 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -56,8 +56,11 @@
 # define DISABLE_PTI		(1 << (X86_FEATURE_PTI & 31))
 #endif
 
-/* Force disable because it's broken beyond repair */
-#define DISABLE_ENQCMD		(1 << (X86_FEATURE_ENQCMD & 31))
+#ifdef CONFIG_INTEL_IOMMU_SVM
+# define DISABLE_ENQCMD		0
+#else
+# define DISABLE_ENQCMD		(1 << (X86_FEATURE_ENQCMD & 31))
+#endif
 
 #ifdef CONFIG_X86_SGX
 # define DISABLE_SGX	0
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 10/11] tools/objtool: Check for use of the ENQCMD instruction in the kernel
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (8 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 09/11] x86/cpufeatures: Re-enable ENQCMD Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-17 22:57   ` Josh Poimboeuf
  2021-12-17 22:01 ` [PATCH v2 11/11] docs: x86: Change documentation for SVA (Shared Virtual Addressing) Fenghua Yu
  2021-12-27 17:52 ` [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
  11 siblings, 1 reply; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

The ENQCMD implicitly accesses the PASID_MSR to fill in the pasid field
of the descriptor being submitted to an accelerator. But there is no
precise (and stable across kernel changes) point at which the PASID_MSR
is updated from the value for one task to the next.

Kernel code that uses accelerators must always use the ENQCMDS instruction
which does not access the PASID_MSR.

Check for use of the ENQCMD instruction in the kernel and warn on its
usage.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Simplify handling ENQCMD (PeterZ and Josh)

 tools/objtool/arch/x86/decode.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 4d6d7fc13255..11ffa0e53f84 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -112,7 +112,7 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec
 	const struct elf *elf = file->elf;
 	struct insn insn;
 	int x86_64, ret;
-	unsigned char op1, op2,
+	unsigned char op1, op2, op3,
 		      rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0,
 		      modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
 		      sib = 0, /* sib_scale = 0, */ sib_index = 0, sib_base = 0;
@@ -139,6 +139,7 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec
 
 	op1 = insn.opcode.bytes[0];
 	op2 = insn.opcode.bytes[1];
+	op3 = insn.opcode.bytes[2];
 
 	if (insn.rex_prefix.nbytes) {
 		rex = insn.rex_prefix.bytes[0];
@@ -491,6 +492,14 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec
 			/* nopl/nopw */
 			*type = INSN_NOP;
 
+		} else if (op2 == 0x38 && op3 == 0xf8) {
+			if (insn.prefixes.nbytes == 1 &&
+			    insn.prefixes.bytes[0] == 0xf2) {
+				/* ENQCMD cannot be used in the kernel. */
+				WARN("ENQCMD instruction at %s:%lx", sec->name,
+				     offset);
+			}
+
 		} else if (op2 == 0xa0 || op2 == 0xa8) {
 
 			/* push fs/gs */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 11/11] docs: x86: Change documentation for SVA (Shared Virtual Addressing)
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (9 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 10/11] tools/objtool: Check for use of the ENQCMD instruction in the kernel Fenghua Yu
@ 2021-12-17 22:01 ` Fenghua Yu
  2021-12-27 17:52 ` [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-17 22:01 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

Since allocating, freeing and fixing up PASID are changed, the
documentation is updated to reflect the changes.

Originally-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Update life cycle info (Tony and Thomas).
- Update initial PASID value to INVALID_IOASID on fork().

 Documentation/x86/sva.rst | 58 +++++++++++++++++++++++++++++++--------
 1 file changed, 46 insertions(+), 12 deletions(-)

diff --git a/Documentation/x86/sva.rst b/Documentation/x86/sva.rst
index 076efd51ef1f..92341f26e525 100644
--- a/Documentation/x86/sva.rst
+++ b/Documentation/x86/sva.rst
@@ -104,18 +104,52 @@ The MSR must be configured on each logical CPU before any application
 thread can interact with a device. Threads that belong to the same
 process share the same page tables, thus the same MSR value.
 
-PASID is cleared when a process is created. The PASID allocation and MSR
-programming may occur long after a process and its threads have been created.
-One thread must call iommu_sva_bind_device() to allocate the PASID for the
-process. If a thread uses ENQCMD without the MSR first being populated, a #GP
-will be raised. The kernel will update the PASID MSR with the PASID for all
-threads in the process. A single process PASID can be used simultaneously
-with multiple devices since they all share the same address space.
-
-One thread can call iommu_sva_unbind_device() to free the allocated PASID.
-The kernel will clear the PASID MSR for all threads belonging to the process.
-
-New threads inherit the MSR value from the parent.
+PASID Life Cycle Management
+==========================
+
+PASID is initialized as INVALID_IOASID (-1) when a process is created.
+
+Only processes that access SVA-capable devices need to have a PASID
+allocated. This allocation happens when a process opens/binds an SVA-capable
+device but finds no PASID for this process. Subsequent binds of the same, or
+other devices will share the same PASID.
+
+Although the PASID is allocated to the process by opening a device,
+it is not active in any of the threads of that process. It's loaded to the
+IA32_PASID MSR lazily when a thread tries to submit a work descriptor
+to a device using the ENQCMD.
+
+That first access will trigger a #GP fault because the IA32_PASID MSR
+has not been initialized with the PASID value assigned to the process
+when the device was opened. The Linux #GP handler notes that a PASID has
+been allocated for the process, and so initializes the IA32_PASID MSR
+and returns so that the ENQCMD instruction is re-executed.
+
+On fork(2) or exec(2) the PASID is removed from the process as it no
+longer has the same address space that it had when the device was opened.
+
+On clone(2) the new task shares the same address space, so will be
+able to use the PASID allocated to the process. The IA32_PASID is not
+preemptively initialized as the PASID value might not be allocated yet or
+the kernel does not know whether this thread is going to access the device
+and the cleared IA32_PASID MSR reduces context switch overhead by xstate
+init optimization. Since #GP faults have to be handled on any threads that
+were created before the PASID was assigned to the mm of the process, newly
+created threads might as well be treated in a consistent way.
+
+Due to complexity of freeing the PASID and clearing all IA32_PASID MSRs in
+all threads in unbind, free the PASID lazily only on mm exit. Track the
+PASID's reference count in the following way:
+
+- Initialize the PASID's reference to 1 when the PASID is first allocated
+  to the mm. The reference is held for the rest life time of the mm until
+  it's dropped to 0 and the PASID is freed on mm exit.
+
+If a process does a close(2) of the device file descriptor and munmap(2)
+of the device MMIO portal, then the driver will unbind the device. The
+PASID is still marked VALID in the PASID_MSR for any threads in the
+process that accessed the device. But this is harmless as without the
+MMIO portal they cannot submit new work to the device.
 
 Relationships
 =============
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 10/11] tools/objtool: Check for use of the ENQCMD instruction in the kernel
  2021-12-17 22:01 ` [PATCH v2 10/11] tools/objtool: Check for use of the ENQCMD instruction in the kernel Fenghua Yu
@ 2021-12-17 22:57   ` Josh Poimboeuf
  2021-12-27 17:50     ` Fenghua Yu
  0 siblings, 1 reply; 26+ messages in thread
From: Josh Poimboeuf @ 2021-12-17 22:57 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

On Fri, Dec 17, 2021 at 10:01:35PM +0000, Fenghua Yu wrote:
> The ENQCMD implicitly accesses the PASID_MSR to fill in the pasid field
> of the descriptor being submitted to an accelerator. But there is no
> precise (and stable across kernel changes) point at which the PASID_MSR
> is updated from the value for one task to the next.
> 
> Kernel code that uses accelerators must always use the ENQCMDS instruction
> which does not access the PASID_MSR.
> 
> Check for use of the ENQCMD instruction in the kernel and warn on its
> usage.
> 
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> v2:
> - Simplify handling ENQCMD (PeterZ and Josh)

Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>

-- 
Josh


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 10/11] tools/objtool: Check for use of the ENQCMD instruction in the kernel
  2021-12-17 22:57   ` Josh Poimboeuf
@ 2021-12-27 17:50     ` Fenghua Yu
  0 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-27 17:50 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

Hi, Josh,

On Fri, Dec 17, 2021 at 02:57:06PM -0800, Josh Poimboeuf wrote:
> On Fri, Dec 17, 2021 at 10:01:35PM +0000, Fenghua Yu wrote:
> > The ENQCMD implicitly accesses the PASID_MSR to fill in the pasid field
> > of the descriptor being submitted to an accelerator. But there is no
> > precise (and stable across kernel changes) point at which the PASID_MSR
> > is updated from the value for one task to the next.
> > 
> > Kernel code that uses accelerators must always use the ENQCMDS instruction
> > which does not access the PASID_MSR.
> > 
> > Check for use of the ENQCMD instruction in the kernel and warn on its
> > usage.
> > 
> > Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > ---
> > v2:
> > - Simplify handling ENQCMD (PeterZ and Josh)
> 
> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>

Thank you!

-Fenghua

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR
  2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
                   ` (10 preceding siblings ...)
  2021-12-17 22:01 ` [PATCH v2 11/11] docs: x86: Change documentation for SVA (Shared Virtual Addressing) Fenghua Yu
@ 2021-12-27 17:52 ` Fenghua Yu
  11 siblings, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2021-12-27 17:52 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel

Hi, Dear Maintainers,

On Fri, Dec 17, 2021 at 10:01:25PM +0000, Fenghua Yu wrote:
> Problems in the old code to manage SVM (Shared Virtual Memory) devices
> and the PASID (Process Address Space ID) led to that code being
> disabled.
> 
> Subsequent discussions resulted in a far simpler approach:
> 
> 1) PASID life cycle is from first allocation by a process until that
>    process exits.
> 2) All tasks begin with PASID disabled
> 3) The #GP fault handler tries to fix faulting ENQCMD instructions very
>    early (thus avoiding complexities of the XSAVE infrastructure)

Any comments on this series?

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2021-12-17 22:01 ` [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit Fenghua Yu
@ 2022-01-24 20:21   ` Thomas Gleixner
  2022-01-24 20:33     ` Fenghua Yu
  2022-01-24 20:36     ` Thomas Gleixner
  0 siblings, 2 replies; 26+ messages in thread
From: Thomas Gleixner @ 2022-01-24 20:21 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

On Fri, Dec 17 2021 at 22:01, Fenghua Yu wrote:
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index bd41405d34e9..ee2294e02716 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -18,8 +18,7 @@ static DECLARE_IOASID_SET(iommu_sva_pasid);
>   *
>   * Try to allocate a PASID for this mm, or take a reference to the existing one
>   * provided it fits within the [@min, @max] range. On success the PASID is
> - * available in mm->pasid, and must be released with iommu_sva_free_pasid().
> - * @min must be greater than 0, because 0 indicates an unused mm->pasid.
> + * available in mm->pasid and will be available for the lifetime of the mm.
>   *
>   * Returns 0 on success and < 0 on error.
>   */
> @@ -33,38 +32,24 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
>  		return -EINVAL;
>  
>  	mutex_lock(&iommu_sva_lock);
> -	if (mm->pasid) {
> -		if (mm->pasid >= min && mm->pasid <= max)
> -			ioasid_get(mm->pasid);
> -		else
> +	/* Is a PASID already associated with this mm? */
> +	if (pasid_valid(mm->pasid)) {
> +		if (mm->pasid < min || mm->pasid >= max)
>  			ret = -EOVERFLOW;
> -	} else {
> -		pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
> -		if (pasid == INVALID_IOASID)
> -			ret = -ENOMEM;
> -		else
> -			mm->pasid = pasid;
> +		goto out;
>  	}
> +
> +	pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
> +	if (!pasid_valid(pasid))
> +		ret = -ENOMEM;
> +	else
> +		mm_pasid_get(mm, pasid);

Hrm. This is odd.

> +/* Associate a PASID with an mm_struct: */
> +static inline void mm_pasid_get(struct mm_struct *mm, u32 pasid)
> +{
> +	mm->pasid = pasid;
> +}

This does not get anything. It sets the allocated PASID in the mm. The
refcount on the PASID was already taken by the allocation. So this
should be mm_pasid_set() or mm_pasid_install(), right?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-24 20:21   ` Thomas Gleixner
@ 2022-01-24 20:33     ` Fenghua Yu
  2022-01-24 20:36     ` Thomas Gleixner
  1 sibling, 0 replies; 26+ messages in thread
From: Fenghua Yu @ 2022-01-24 20:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

Hi, Thomas,

On Mon, Jan 24, 2022 at 09:21:24PM +0100, Thomas Gleixner wrote:
> On Fri, Dec 17 2021 at 22:01, Fenghua Yu wrote:
> > diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> > index bd41405d34e9..ee2294e02716 100644
> > --- a/drivers/iommu/iommu-sva-lib.c
> > +++ b/drivers/iommu/iommu-sva-lib.c
> > @@ -18,8 +18,7 @@ static DECLARE_IOASID_SET(iommu_sva_pasid);
> >   *
> >   * Try to allocate a PASID for this mm, or take a reference to the existing one
> >   * provided it fits within the [@min, @max] range. On success the PASID is
> > - * available in mm->pasid, and must be released with iommu_sva_free_pasid().
> > - * @min must be greater than 0, because 0 indicates an unused mm->pasid.
> > + * available in mm->pasid and will be available for the lifetime of the mm.
> >   *
> >   * Returns 0 on success and < 0 on error.
> >   */
> > @@ -33,38 +32,24 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
> >  		return -EINVAL;
> >  
> >  	mutex_lock(&iommu_sva_lock);
> > -	if (mm->pasid) {
> > -		if (mm->pasid >= min && mm->pasid <= max)
> > -			ioasid_get(mm->pasid);
> > -		else
> > +	/* Is a PASID already associated with this mm? */
> > +	if (pasid_valid(mm->pasid)) {
> > +		if (mm->pasid < min || mm->pasid >= max)
> >  			ret = -EOVERFLOW;
> > -	} else {
> > -		pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
> > -		if (pasid == INVALID_IOASID)
> > -			ret = -ENOMEM;
> > -		else
> > -			mm->pasid = pasid;
> > +		goto out;
> >  	}
> > +
> > +	pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
> > +	if (!pasid_valid(pasid))
> > +		ret = -ENOMEM;
> > +	else
> > +		mm_pasid_get(mm, pasid);
> 
> Hrm. This is odd.
> 
> > +/* Associate a PASID with an mm_struct: */
> > +static inline void mm_pasid_get(struct mm_struct *mm, u32 pasid)
> > +{
> > +	mm->pasid = pasid;
> > +}
> 
> This does not get anything. It sets the allocated PASID in the mm. The
> refcount on the PASID was already taken by the allocation. So this
> should be mm_pasid_set() or mm_pasid_install(), right?

Ok. Will change mm_pasid_get() to mm_pasid_set().

Thank you very much for your review!

-Fenghua

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-24 20:21   ` Thomas Gleixner
  2022-01-24 20:33     ` Fenghua Yu
@ 2022-01-24 20:36     ` Thomas Gleixner
  2022-01-24 20:52       ` Fenghua Yu
  1 sibling, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2022-01-24 20:36 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	Andy Lutomirski, Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel,
	Josh Poimboeuf, Jacob Pan, Ashok Raj, Ravi V Shankar
  Cc: iommu, x86, linux-kernel, Fenghua Yu

On Mon, Jan 24 2022 at 21:21, Thomas Gleixner wrote:
>
> Hrm. This is odd.
>
>> +/* Associate a PASID with an mm_struct: */
>> +static inline void mm_pasid_get(struct mm_struct *mm, u32 pasid)
>> +{
>> +	mm->pasid = pasid;
>> +}
>
> This does not get anything. It sets the allocated PASID in the mm. The
> refcount on the PASID was already taken by the allocation. So this
> should be mm_pasid_set() or mm_pasid_install(), right?

And as a result of all this ioasid_get() is now left without users...

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-24 20:36     ` Thomas Gleixner
@ 2022-01-24 20:52       ` Fenghua Yu
  2022-01-24 20:55         ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Fenghua Yu @ 2022-01-24 20:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

Hi, Thomas,

On Mon, Jan 24, 2022 at 09:36:00PM +0100, Thomas Gleixner wrote:
> On Mon, Jan 24 2022 at 21:21, Thomas Gleixner wrote:
> >
> > Hrm. This is odd.
> >
> >> +/* Associate a PASID with an mm_struct: */
> >> +static inline void mm_pasid_get(struct mm_struct *mm, u32 pasid)
> >> +{
> >> +	mm->pasid = pasid;
> >> +}
> >
> > This does not get anything. It sets the allocated PASID in the mm. The
> > refcount on the PASID was already taken by the allocation. So this
> > should be mm_pasid_set() or mm_pasid_install(), right?
> 
> And as a result of all this ioasid_get() is now left without users...
> 
> Thanks,

Ah. This patch should remove ioasid_get(). So I will change this patch
as follows:

1. Remove ioasid_get() because it's not used any more when IOASID's
   refcount is set to 1 once the IOASID is allocated and is freed on mm exit.
2. Change mm_pasid_get() to mm_pasid_set().

Will that work?

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-24 20:52       ` Fenghua Yu
@ 2022-01-24 20:55         ` Thomas Gleixner
  2022-01-25 15:18           ` Fenghua Yu
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2022-01-24 20:55 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

On Mon, Jan 24 2022 at 12:52, Fenghua Yu wrote:
> On Mon, Jan 24, 2022 at 09:36:00PM +0100, Thomas Gleixner wrote:
>> On Mon, Jan 24 2022 at 21:21, Thomas Gleixner wrote:
> Ah. This patch should remove ioasid_get(). So I will change this patch
> as follows:
>
> 1. Remove ioasid_get() because it's not used any more when IOASID's
>    refcount is set to 1 once the IOASID is allocated and is freed on mm exit.
> 2. Change mm_pasid_get() to mm_pasid_set().

Yes. Just resend this one. No need to post the full queue again.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-24 20:55         ` Thomas Gleixner
@ 2022-01-25 15:18           ` Fenghua Yu
  2022-01-26 14:23             ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Fenghua Yu @ 2022-01-25 15:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

Hi, Thomas,

On Mon, Jan 24, 2022 at 09:55:56PM +0100, Thomas Gleixner wrote:
> On Mon, Jan 24 2022 at 12:52, Fenghua Yu wrote:
> > On Mon, Jan 24, 2022 at 09:36:00PM +0100, Thomas Gleixner wrote:
> >> On Mon, Jan 24 2022 at 21:21, Thomas Gleixner wrote:
> > Ah. This patch should remove ioasid_get(). So I will change this patch
> > as follows:
> >
> > 1. Remove ioasid_get() because it's not used any more when IOASID's
> >    refcount is set to 1 once the IOASID is allocated and is freed on mm exit.
> > 2. Change mm_pasid_get() to mm_pasid_set().
> 
> Yes. Just resend this one. No need to post the full queue again.

Here is the updated patch #5. Thank you very much for your review!

From 895f2c15385dc93acf39700014458fe275d996a2 Mon Sep 17 00:00:00 2001
From: Fenghua Yu <fenghua.yu@intel.com>
Date: Thu, 18 Nov 2021 14:06:40 -0800
Subject: [PATCH v2.1 05/11] iommu/sva: Assign a PASID to mm on PASID
 allocation and free it on mm exit

To avoid complexity of updating each thread's PASID status (e.g. sending
IPI to update IA32_PASID MSR) on allocating and freeing PASID, once
allocated and assigned to an mm, the PASID stays with the mm for the
rest of the mm's lifetime. A reference to the PASID is taken on
allocating the PASID. Binding/unbinding the PASID won't change refcount.
The reference is dropped on mm exit and thus the PASID is freed.

Two helpers mm_pasid_set() and mm_pasid_drop() are defined in mm because
the PASID operations handle the pasid member in mm_struct and should be
part of mm operations. Unused ioasid_get() and iommu_sva_free_pasid()
are deleted.

20-bit PASID allows up to 1M processes bound to PASIDs at the same time.
With cgroups and other controls that might limit the number of process
creation, the limited number of PASIDs is not a realistic issue for
lazy PASID free.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2.1:
- Rename mm_pasid_get() to mm_pasid_set() (Thomas).
- Remove ioasid_get() because it's not used any more when IOASID's
  refcount is set to 1 once the IOASID is allocated and is freed
  on mm exit (Thomas).

v2:
- Free PASID on mm exit instead of in exit(2) or unbind() (Thomas, AndyL,
  PeterZ)
- Add mm_pasid_init(), mm_pasid_get(), and mm_pasid_drop() functions in mm.
  So the mm's PASID operations are generic for both X86 and ARM
  (Dave Hansen)

 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  5 +--
 drivers/iommu/intel/svm.c                     |  9 -----
 drivers/iommu/ioasid.c                        | 17 --------
 drivers/iommu/iommu-sva-lib.c                 | 39 ++++++-------------
 drivers/iommu/iommu-sva-lib.h                 |  1 -
 include/linux/ioasid.h                        |  5 ---
 include/linux/sched/mm.h                      | 16 ++++++++
 kernel/fork.c                                 |  1 +
 8 files changed, 30 insertions(+), 63 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index a737ba5f727e..22ddd05bbdcd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -340,14 +340,12 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 	bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm);
 	if (IS_ERR(bond->smmu_mn)) {
 		ret = PTR_ERR(bond->smmu_mn);
-		goto err_free_pasid;
+		goto err_free_bond;
 	}
 
 	list_add(&bond->list, &master->bonds);
 	return &bond->sva;
 
-err_free_pasid:
-	iommu_sva_free_pasid(mm);
 err_free_bond:
 	kfree(bond);
 	return ERR_PTR(ret);
@@ -377,7 +375,6 @@ void arm_smmu_sva_unbind(struct iommu_sva *handle)
 	if (refcount_dec_and_test(&bond->refs)) {
 		list_del(&bond->list);
 		arm_smmu_mmu_notifier_put(bond->smmu_mn);
-		iommu_sva_free_pasid(bond->mm);
 		kfree(bond);
 	}
 	mutex_unlock(&sva_lock);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 5b5d69b04fcc..51ac2096b3da 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -514,11 +514,6 @@ static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
 	return iommu_sva_alloc_pasid(mm, PASID_MIN, max_pasid - 1);
 }
 
-static void intel_svm_free_pasid(struct mm_struct *mm)
-{
-	iommu_sva_free_pasid(mm);
-}
-
 static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 					   struct device *dev,
 					   struct mm_struct *mm,
@@ -662,8 +657,6 @@ static int intel_svm_unbind_mm(struct device *dev, u32 pasid)
 				kfree(svm);
 			}
 		}
-		/* Drop a PASID reference and free it if no reference. */
-		intel_svm_free_pasid(mm);
 	}
 out:
 	return ret;
@@ -1047,8 +1040,6 @@ struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void
 	}
 
 	sva = intel_svm_bind_mm(iommu, dev, mm, flags);
-	if (IS_ERR_OR_NULL(sva))
-		intel_svm_free_pasid(mm);
 	mutex_unlock(&pasid_mutex);
 
 	return sva;
diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 50ee27bbd04e..cb8e74adb685 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -347,23 +347,6 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
 }
 EXPORT_SYMBOL_GPL(ioasid_alloc);
 
-/**
- * ioasid_get - obtain a reference to the IOASID
- */
-void ioasid_get(ioasid_t ioasid)
-{
-	struct ioasid_data *ioasid_data;
-
-	spin_lock(&ioasid_allocator_lock);
-	ioasid_data = xa_load(&active_allocator->xa, ioasid);
-	if (ioasid_data)
-		refcount_inc(&ioasid_data->refs);
-	else
-		WARN_ON(1);
-	spin_unlock(&ioasid_allocator_lock);
-}
-EXPORT_SYMBOL_GPL(ioasid_get);
-
 /**
  * ioasid_put - Release a reference to an ioasid
  * @ioasid: the ID to remove
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
index bd41405d34e9..106506143896 100644
--- a/drivers/iommu/iommu-sva-lib.c
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -18,8 +18,7 @@ static DECLARE_IOASID_SET(iommu_sva_pasid);
  *
  * Try to allocate a PASID for this mm, or take a reference to the existing one
  * provided it fits within the [@min, @max] range. On success the PASID is
- * available in mm->pasid, and must be released with iommu_sva_free_pasid().
- * @min must be greater than 0, because 0 indicates an unused mm->pasid.
+ * available in mm->pasid and will be available for the lifetime of the mm.
  *
  * Returns 0 on success and < 0 on error.
  */
@@ -33,38 +32,24 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
 		return -EINVAL;
 
 	mutex_lock(&iommu_sva_lock);
-	if (mm->pasid) {
-		if (mm->pasid >= min && mm->pasid <= max)
-			ioasid_get(mm->pasid);
-		else
+	/* Is a PASID already associated with this mm? */
+	if (pasid_valid(mm->pasid)) {
+		if (mm->pasid < min || mm->pasid >= max)
 			ret = -EOVERFLOW;
-	} else {
-		pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
-		if (pasid == INVALID_IOASID)
-			ret = -ENOMEM;
-		else
-			mm->pasid = pasid;
+		goto out;
 	}
+
+	pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
+	if (!pasid_valid(pasid))
+		ret = -ENOMEM;
+	else
+		mm_pasid_set(mm, pasid);
+out:
 	mutex_unlock(&iommu_sva_lock);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
 
-/**
- * iommu_sva_free_pasid - Release the mm's PASID
- * @mm: the mm
- *
- * Drop one reference to a PASID allocated with iommu_sva_alloc_pasid()
- */
-void iommu_sva_free_pasid(struct mm_struct *mm)
-{
-	mutex_lock(&iommu_sva_lock);
-	if (ioasid_put(mm->pasid))
-		mm->pasid = 0;
-	mutex_unlock(&iommu_sva_lock);
-}
-EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
-
 /* ioasid_find getter() requires a void * argument */
 static bool __mmget_not_zero(void *mm)
 {
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index 95dc3ebc1928..8909ea1094e3 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -9,7 +9,6 @@
 #include <linux/mm_types.h>
 
 int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
-void iommu_sva_free_pasid(struct mm_struct *mm);
 struct mm_struct *iommu_sva_find(ioasid_t pasid);
 
 /* I/O Page fault */
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 2237f64dbaae..46a41f61ad3e 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -34,7 +34,6 @@ struct ioasid_allocator_ops {
 #if IS_ENABLED(CONFIG_IOASID)
 ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
 		      void *private);
-void ioasid_get(ioasid_t ioasid);
 bool ioasid_put(ioasid_t ioasid);
 void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 		  bool (*getter)(void *));
@@ -53,10 +52,6 @@ static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
 	return INVALID_IOASID;
 }
 
-static inline void ioasid_get(ioasid_t ioasid)
-{
-}
-
 static inline bool ioasid_put(ioasid_t ioasid)
 {
 	return false;
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index c74d1edbac2f..1893ecfa93d6 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -439,8 +439,24 @@ static inline void mm_pasid_init(struct mm_struct *mm)
 {
 	mm->pasid = INVALID_IOASID;
 }
+
+/* Associate a PASID with an mm_struct: */
+static inline void mm_pasid_set(struct mm_struct *mm, u32 pasid)
+{
+	mm->pasid = pasid;
+}
+
+static inline void mm_pasid_drop(struct mm_struct *mm)
+{
+	if (pasid_valid(mm->pasid)) {
+		ioasid_put(mm->pasid);
+		mm->pasid = INVALID_IOASID;
+	}
+}
 #else
 static inline void mm_pasid_init(struct mm_struct *mm) {}
+static inline void mm_pasid_set(struct mm_struct *mm, u32 pasid) {}
+static inline void mm_pasid_drop(struct mm_struct *mm) {}
 #endif
 
 #endif /* _LINUX_SCHED_MM_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index deacd2c17a7f..c03c6682464c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1115,6 +1115,7 @@ static inline void __mmput(struct mm_struct *mm)
 	}
 	if (mm->binfmt)
 		module_put(mm->binfmt->module);
+	mm_pasid_drop(mm);
 	mmdrop(mm);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-25 15:18           ` Fenghua Yu
@ 2022-01-26 14:23             ` Thomas Gleixner
  2022-01-26 17:36               ` Fenghua Yu
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2022-01-26 14:23 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

On Tue, Jan 25 2022 at 07:18, Fenghua Yu wrote:
> On Mon, Jan 24, 2022 at 09:55:56PM +0100, Thomas Gleixner wrote:
>  /**
>   * ioasid_put - Release a reference to an ioasid
>   * @ioasid: the ID to remove

which in turn makes ioasid_put() a misnomer and the whole refcounting of
the ioasid a pointless exercise.

While looking at ioasid_put() usage I tripped over the following UAF
issue:

--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4817,8 +4817,10 @@ static int aux_domain_add_dev(struct dma
 	auxiliary_unlink_device(domain, dev);
 link_failed:
 	spin_unlock_irqrestore(&device_domain_lock, flags);
-	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
+	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
 		ioasid_put(domain->default_pasid);
+		domain->default_pasid = INVALID_IOASID;
+	}
 
 	return ret;
 }
@@ -4847,8 +4849,10 @@ static void aux_domain_remove_dev(struct
 
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 
-	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
+	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
 		ioasid_put(domain->default_pasid);
+		domain->default_pasid = INVALID_IOASID;
+	}
 }
 
 static int prepare_domain_attach_device(struct iommu_domain *domain,

Vs. ioasid_put() I think we should fold the following:

--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4818,7 +4818,7 @@ static int aux_domain_add_dev(struct dma
 link_failed:
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
-		ioasid_put(domain->default_pasid);
+		ioasid_free(domain->default_pasid);
 		domain->default_pasid = INVALID_IOASID;
 	}
 
@@ -4850,7 +4850,7 @@ static void aux_domain_remove_dev(struct
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 
 	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
-		ioasid_put(domain->default_pasid);
+		ioasid_free(domain->default_pasid);
 		domain->default_pasid = INVALID_IOASID;
 	}
 }
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -2,7 +2,7 @@
 /*
  * I/O Address Space ID allocator. There is one global IOASID space, split into
  * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
- * free IOASIDs with ioasid_alloc and ioasid_put.
+ * free IOASIDs with ioasid_alloc() and ioasid_free().
  */
 #include <linux/ioasid.h>
 #include <linux/module.h>
@@ -15,7 +15,6 @@ struct ioasid_data {
 	struct ioasid_set *set;
 	void *private;
 	struct rcu_head rcu;
-	refcount_t refs;
 };
 
 /*
@@ -315,7 +314,6 @@ ioasid_t ioasid_alloc(struct ioasid_set
 
 	data->set = set;
 	data->private = private;
-	refcount_set(&data->refs, 1);
 
 	/*
 	 * Custom allocator needs allocator data to perform platform specific
@@ -348,17 +346,11 @@ ioasid_t ioasid_alloc(struct ioasid_set
 EXPORT_SYMBOL_GPL(ioasid_alloc);
 
 /**
- * ioasid_put - Release a reference to an ioasid
+ * ioasid_free - Free an ioasid
  * @ioasid: the ID to remove
- *
- * Put a reference to the IOASID, free it when the number of references drops to
- * zero.
- *
- * Return: %true if the IOASID was freed, %false otherwise.
  */
-bool ioasid_put(ioasid_t ioasid)
+void ioasid_free(ioasid_t ioasid)
 {
-	bool free = false;
 	struct ioasid_data *ioasid_data;
 
 	spin_lock(&ioasid_allocator_lock);
@@ -368,10 +360,6 @@ bool ioasid_put(ioasid_t ioasid)
 		goto exit_unlock;
 	}
 
-	free = refcount_dec_and_test(&ioasid_data->refs);
-	if (!free)
-		goto exit_unlock;
-
 	active_allocator->ops->free(ioasid, active_allocator->ops->pdata);
 	/* Custom allocator needs additional steps to free the xa element */
 	if (active_allocator->flags & IOASID_ALLOCATOR_CUSTOM) {
@@ -381,9 +369,8 @@ bool ioasid_put(ioasid_t ioasid)
 
 exit_unlock:
 	spin_unlock(&ioasid_allocator_lock);
-	return free;
 }
-EXPORT_SYMBOL_GPL(ioasid_put);
+EXPORT_SYMBOL_GPL(ioasid_free);
 
 /**
  * ioasid_find - Find IOASID data
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -34,7 +34,7 @@ struct ioasid_allocator_ops {
 #if IS_ENABLED(CONFIG_IOASID)
 ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
 		      void *private);
-bool ioasid_put(ioasid_t ioasid);
+void ioasid_free(ioasid_t ioasid);
 void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 		  bool (*getter)(void *));
 int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
@@ -52,10 +52,7 @@ static inline ioasid_t ioasid_alloc(stru
 	return INVALID_IOASID;
 }
 
-static inline bool ioasid_put(ioasid_t ioasid)
-{
-	return false;
-}
+static inline void ioasid_free(ioasid_t ioasid) { }
 
 static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 				bool (*getter)(void *))
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -423,7 +423,7 @@ static inline void mm_pasid_set(struct m
 static inline void mm_pasid_drop(struct mm_struct *mm)
 {
 	if (pasid_valid(mm->pasid)) {
-		ioasid_put(mm->pasid);
+		ioasid_free(mm->pasid);
 		mm->pasid = INVALID_IOASID;
 	}
 }

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-26 14:23             ` Thomas Gleixner
@ 2022-01-26 17:36               ` Fenghua Yu
  2022-01-26 21:38                 ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Fenghua Yu @ 2022-01-26 17:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

Hi, Thomas,

On Wed, Jan 26, 2022 at 03:23:42PM +0100, Thomas Gleixner wrote:
> On Tue, Jan 25 2022 at 07:18, Fenghua Yu wrote:
> > On Mon, Jan 24, 2022 at 09:55:56PM +0100, Thomas Gleixner wrote:
> >  /**
> >   * ioasid_put - Release a reference to an ioasid
> >   * @ioasid: the ID to remove
> 
> which in turn makes ioasid_put() a misnomer and the whole refcounting of
> the ioasid a pointless exercise.
> 
> While looking at ioasid_put() usage I tripped over the following UAF
> issue:
> 
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4817,8 +4817,10 @@ static int aux_domain_add_dev(struct dma
>  	auxiliary_unlink_device(domain, dev);
>  link_failed:
>  	spin_unlock_irqrestore(&device_domain_lock, flags);
> -	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
> +	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
>  		ioasid_put(domain->default_pasid);
> +		domain->default_pasid = INVALID_IOASID;
> +	}
>  
>  	return ret;
>  }
> @@ -4847,8 +4849,10 @@ static void aux_domain_remove_dev(struct
>  
>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>  
> -	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
> +	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
>  		ioasid_put(domain->default_pasid);
> +		domain->default_pasid = INVALID_IOASID;
> +	}
>  }
>  
>  static int prepare_domain_attach_device(struct iommu_domain *domain,

The above patch fixes an existing issue. I will put it in a separate patch,
right?

It cannot be applied cleanly to the upstream tree. Do you want me to base
the above patch (and the whole patch set) to the upstream tree or a specific
tip branch?

I will fold the following patch into patch #5. The patch #11 (the doc patch)
also needs to remove one paragraph talking about refcount.

So I will send the whole patch set with the following changes:
1. One new bug fix patch (the above patch)
2. Updated patch #5 (with the following patch folded)
3. Updated patch #11 (removing refcount description)

Are the changes OK to you?
 
> 
> Vs. ioasid_put() I think we should fold the following:
> 
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4818,7 +4818,7 @@ static int aux_domain_add_dev(struct dma
>  link_failed:
>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>  	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
> -		ioasid_put(domain->default_pasid);
> +		ioasid_free(domain->default_pasid);
>  		domain->default_pasid = INVALID_IOASID;
>  	}
>  
> @@ -4850,7 +4850,7 @@ static void aux_domain_remove_dev(struct
>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>  
>  	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
> -		ioasid_put(domain->default_pasid);
> +		ioasid_free(domain->default_pasid);
>  		domain->default_pasid = INVALID_IOASID;
>  	}
>  }
> --- a/drivers/iommu/ioasid.c
> +++ b/drivers/iommu/ioasid.c
> @@ -2,7 +2,7 @@
>  /*
>   * I/O Address Space ID allocator. There is one global IOASID space, split into
>   * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
> - * free IOASIDs with ioasid_alloc and ioasid_put.
> + * free IOASIDs with ioasid_alloc() and ioasid_free().
>   */
>  #include <linux/ioasid.h>
>  #include <linux/module.h>
> @@ -15,7 +15,6 @@ struct ioasid_data {
>  	struct ioasid_set *set;
>  	void *private;
>  	struct rcu_head rcu;
> -	refcount_t refs;
>  };
>  
>  /*
> @@ -315,7 +314,6 @@ ioasid_t ioasid_alloc(struct ioasid_set
>  
>  	data->set = set;
>  	data->private = private;
> -	refcount_set(&data->refs, 1);
>  
>  	/*
>  	 * Custom allocator needs allocator data to perform platform specific
> @@ -348,17 +346,11 @@ ioasid_t ioasid_alloc(struct ioasid_set
>  EXPORT_SYMBOL_GPL(ioasid_alloc);
>  
>  /**
> - * ioasid_put - Release a reference to an ioasid
> + * ioasid_free - Free an ioasid
>   * @ioasid: the ID to remove
> - *
> - * Put a reference to the IOASID, free it when the number of references drops to
> - * zero.
> - *
> - * Return: %true if the IOASID was freed, %false otherwise.
>   */
> -bool ioasid_put(ioasid_t ioasid)
> +void ioasid_free(ioasid_t ioasid)
>  {
> -	bool free = false;
>  	struct ioasid_data *ioasid_data;
>  
>  	spin_lock(&ioasid_allocator_lock);
> @@ -368,10 +360,6 @@ bool ioasid_put(ioasid_t ioasid)
>  		goto exit_unlock;
>  	}
>  
> -	free = refcount_dec_and_test(&ioasid_data->refs);
> -	if (!free)
> -		goto exit_unlock;
> -
>  	active_allocator->ops->free(ioasid, active_allocator->ops->pdata);
>  	/* Custom allocator needs additional steps to free the xa element */
>  	if (active_allocator->flags & IOASID_ALLOCATOR_CUSTOM) {
> @@ -381,9 +369,8 @@ bool ioasid_put(ioasid_t ioasid)
>  
>  exit_unlock:
>  	spin_unlock(&ioasid_allocator_lock);
> -	return free;
>  }
> -EXPORT_SYMBOL_GPL(ioasid_put);
> +EXPORT_SYMBOL_GPL(ioasid_free);
>  
>  /**
>   * ioasid_find - Find IOASID data
> --- a/include/linux/ioasid.h
> +++ b/include/linux/ioasid.h
> @@ -34,7 +34,7 @@ struct ioasid_allocator_ops {
>  #if IS_ENABLED(CONFIG_IOASID)
>  ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
>  		      void *private);
> -bool ioasid_put(ioasid_t ioasid);
> +void ioasid_free(ioasid_t ioasid);
>  void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
>  		  bool (*getter)(void *));
>  int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
> @@ -52,10 +52,7 @@ static inline ioasid_t ioasid_alloc(stru
>  	return INVALID_IOASID;
>  }
>  
> -static inline bool ioasid_put(ioasid_t ioasid)
> -{
> -	return false;
> -}
> +static inline void ioasid_free(ioasid_t ioasid) { }
>  
>  static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
>  				bool (*getter)(void *))
> --- a/include/linux/sched/mm.h
> +++ b/include/linux/sched/mm.h
> @@ -423,7 +423,7 @@ static inline void mm_pasid_set(struct m
>  static inline void mm_pasid_drop(struct mm_struct *mm)
>  {
>  	if (pasid_valid(mm->pasid)) {
> -		ioasid_put(mm->pasid);
> +		ioasid_free(mm->pasid);
>  		mm->pasid = INVALID_IOASID;
>  	}
>  }

Thank you very much for your review!

-Fenghua

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-26 17:36               ` Fenghua Yu
@ 2022-01-26 21:38                 ` Thomas Gleixner
  2022-01-28  2:42                   ` Fenghua Yu
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2022-01-26 21:38 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

On Wed, Jan 26 2022 at 09:36, Fenghua Yu wrote:
> On Wed, Jan 26, 2022 at 03:23:42PM +0100, Thomas Gleixner wrote:
>> On Tue, Jan 25 2022 at 07:18, Fenghua Yu wrote:
>> While looking at ioasid_put() usage I tripped over the following UAF
>> issue:
>> 
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -4817,8 +4817,10 @@ static int aux_domain_add_dev(struct dma
>>  	auxiliary_unlink_device(domain, dev);
>>  link_failed:
>>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>> -	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
>> +	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
>>  		ioasid_put(domain->default_pasid);
>> +		domain->default_pasid = INVALID_IOASID;
>> +	}
>>  
>>  	return ret;
>>  }
>> @@ -4847,8 +4849,10 @@ static void aux_domain_remove_dev(struct
>>  
>>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>>  
>> -	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
>> +	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
>>  		ioasid_put(domain->default_pasid);
>> +		domain->default_pasid = INVALID_IOASID;
>> +	}
>>  }
>>  
>>  static int prepare_domain_attach_device(struct iommu_domain *domain,
>
> The above patch fixes an existing issue. I will put it in a separate patch,
> right?

Correct.

> It cannot be applied cleanly to the upstream tree. Do you want me to base
> the above patch (and the whole patch set) to the upstream tree or a specific
> tip branch?

Against Linus tree please so that the bugfix applies.

> I will fold the following patch into patch #5. The patch #11 (the doc patch)
> also needs to remove one paragraph talking about refcount.
>
> So I will send the whole patch set with the following changes:
> 1. One new bug fix patch (the above patch)
> 2. Updated patch #5 (with the following patch folded)
> 3. Updated patch #11 (removing refcount description)

Looks good.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-26 21:38                 ` Thomas Gleixner
@ 2022-01-28  2:42                   ` Fenghua Yu
  2022-01-28 14:53                     ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Fenghua Yu @ 2022-01-28  2:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

Hi, Thomas,

On Wed, Jan 26, 2022 at 10:38:04PM +0100, Thomas Gleixner wrote:
> On Wed, Jan 26 2022 at 09:36, Fenghua Yu wrote:
> > On Wed, Jan 26, 2022 at 03:23:42PM +0100, Thomas Gleixner wrote:
> >> On Tue, Jan 25 2022 at 07:18, Fenghua Yu wrote:
> >> While looking at ioasid_put() usage I tripped over the following UAF
> >> issue:
> >> 
> >> --- a/drivers/iommu/intel/iommu.c
> >> +++ b/drivers/iommu/intel/iommu.c
> >> @@ -4817,8 +4817,10 @@ static int aux_domain_add_dev(struct dma
> >>  	auxiliary_unlink_device(domain, dev);
> >>  link_failed:
> >>  	spin_unlock_irqrestore(&device_domain_lock, flags);
> >> -	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
> >> +	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
> >>  		ioasid_put(domain->default_pasid);
> >> +		domain->default_pasid = INVALID_IOASID;
> >> +	}
> >>  
> >>  	return ret;
> >>  }
> >> @@ -4847,8 +4849,10 @@ static void aux_domain_remove_dev(struct
> >>  
> >>  	spin_unlock_irqrestore(&device_domain_lock, flags);
> >>  
> >> -	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
> >> +	if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
> >>  		ioasid_put(domain->default_pasid);
> >> +		domain->default_pasid = INVALID_IOASID;
> >> +	}
> >>  }
> >>  
> >>  static int prepare_domain_attach_device(struct iommu_domain *domain,
> >
> > The above patch fixes an existing issue. I will put it in a separate patch,
> > right?
> 
> Correct.
> 
> > It cannot be applied cleanly to the upstream tree. Do you want me to base
> > the above patch (and the whole patch set) to the upstream tree or a specific
> > tip branch?
> 
> Against Linus tree please so that the bugfix applies.
> 
> > I will fold the following patch into patch #5. The patch #11 (the doc patch)
> > also needs to remove one paragraph talking about refcount.
> >
> > So I will send the whole patch set with the following changes:
> > 1. One new bug fix patch (the above patch)

When I study your above aux_domain bug fix path, I find more aux_domain bugs.
But then I find aux_domain will be completely removed because all aux_domain
related callbacks are not called and are dead code (no wonder there are
so many bugs in aux_domain). Please see this series: https://lore.kernel.org/linux-iommu/20220124071103.2097118-4-baolu.lu@linux.intel.com/
For the series, Baolu confirms that he is "pretty sure that should be part
of v5.18". And I don't find the series calls any IOASID function after
removing the aux_domain code.

So that means we don't need to fix those issues in the dead aux_domain
code any more because it will be completely removed in 5.18, right?

If you agree, I will not include the aux_domain fix patch or any other
aux_domain fix patches in the up-coming v3. Is that OK?

> > 2. Updated patch #5 (with the following patch folded)

I will still change ioasid_put() in the aux_domain callbacks to ioasid_free()
in patch #5. So compilation will pass. Baolu's series will remove
the entire aux_domain code in 5.18.

> > 3. Updated patch #11 (removing refcount description)
> 
> Looks good.
> 

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  2022-01-28  2:42                   ` Fenghua Yu
@ 2022-01-28 14:53                     ` Thomas Gleixner
  0 siblings, 0 replies; 26+ messages in thread
From: Thomas Gleixner @ 2022-01-28 14:53 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Ingo Molnar, Borislav Petkov, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Tony Luck, Lu Baolu, Joerg Roedel, Josh Poimboeuf,
	Jacob Pan, Ashok Raj, Ravi V Shankar, iommu, x86, linux-kernel

On Thu, Jan 27 2022 at 18:42, Fenghua Yu wrote:
> On Wed, Jan 26, 2022 at 10:38:04PM +0100, Thomas Gleixner wrote:
>> Against Linus tree please so that the bugfix applies.
>> 
>> > I will fold the following patch into patch #5. The patch #11 (the doc patch)
>> > also needs to remove one paragraph talking about refcount.
>> >
>> > So I will send the whole patch set with the following changes:
>> > 1. One new bug fix patch (the above patch)
>
> When I study your above aux_domain bug fix path, I find more aux_domain bugs.
> But then I find aux_domain will be completely removed because all aux_domain
> related callbacks are not called and are dead code (no wonder there are
> so many bugs in aux_domain). Please see this series: https://lore.kernel.org/linux-iommu/20220124071103.2097118-4-baolu.lu@linux.intel.com/
> For the series, Baolu confirms that he is "pretty sure that should be part
> of v5.18". And I don't find the series calls any IOASID function after
> removing the aux_domain code.
>
> So that means we don't need to fix those issues in the dead aux_domain
> code any more because it will be completely removed in 5.18, right?
>
> If you agree, I will not include the aux_domain fix patch or any other
> aux_domain fix patches in the up-coming v3. Is that OK?

Fair enough.


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2022-01-28 14:53 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-17 22:01 [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 01/11] iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 02/11] mm: Change CONFIG option for mm->pasid field Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 03/11] iommu/ioasid: Introduce a helper to check for valid PASIDs Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 04/11] kernel/fork: Initialize mm's PASID Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit Fenghua Yu
2022-01-24 20:21   ` Thomas Gleixner
2022-01-24 20:33     ` Fenghua Yu
2022-01-24 20:36     ` Thomas Gleixner
2022-01-24 20:52       ` Fenghua Yu
2022-01-24 20:55         ` Thomas Gleixner
2022-01-25 15:18           ` Fenghua Yu
2022-01-26 14:23             ` Thomas Gleixner
2022-01-26 17:36               ` Fenghua Yu
2022-01-26 21:38                 ` Thomas Gleixner
2022-01-28  2:42                   ` Fenghua Yu
2022-01-28 14:53                     ` Thomas Gleixner
2021-12-17 22:01 ` [PATCH v2 06/11] x86/fpu: Clear PASID when copying fpstate Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 07/11] sched: Define and initialize a flag to identify valid PASID in the task Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 08/11] x86/traps: Demand-populate PASID MSR via #GP Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 09/11] x86/cpufeatures: Re-enable ENQCMD Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 10/11] tools/objtool: Check for use of the ENQCMD instruction in the kernel Fenghua Yu
2021-12-17 22:57   ` Josh Poimboeuf
2021-12-27 17:50     ` Fenghua Yu
2021-12-17 22:01 ` [PATCH v2 11/11] docs: x86: Change documentation for SVA (Shared Virtual Addressing) Fenghua Yu
2021-12-27 17:52 ` [PATCH v2 00/11] Re-enable ENQCMD and PASID MSR Fenghua Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).